This repository was archived by the owner on Mar 11, 2021. It is now read-only.

[Help] runtime error on Google Colaboratory #969

@y-ich

Hi.

I tried to run Minigo on Google Colaboratory.

I managed to compile it, ran bazel-bin/cc/gtp, and got the following error.

  • command
    !bazel-bin/cc/gtp --device=$TPU_NAME --model=saved_models/000820-defence.minigo --seconds_per_move=60 --value_init_penalty=2.0

  • error
    resign_threshold:-0.999 resign_enabled:1 komi:7.5 value_init_penalty:2 policy_softmax_temp:0.98 soft_pick_enabled:0 soft_pick_cutoff:30 inject_noise:0 virtual_losses:8 num_readouts:100 seconds_per_move:60 time_limit:0 decay_factor:0.98 fastplay_frequency:0 fastplay_readouts:20 target_pruning:0 random_seed:0
    Will cache up to 704290 inferences, using roughly 1024MB.

    Initializing TPU grpc://10.110.208.194:8470
    Warming up...
    2020-02-02 14:14:27.900112: F cc/dual_net/tpu_dual_net.cc:167] Non-OK-status: session_->RunCallable(handle_, inputs_, &outputs_, nullptr) status: Invalid argument: From /job:tpu_worker/replica:0/task:0:
    Compilation failure: Matrix size-incompatible: In[0]: [1,722], In[1]: [256,128]
    [[{{node dense/MatMul}}]]
    TPU compilation failed
    [[tpu_compile_succeeded_assert/_17721758171533651890/_4]]
    *** SIGABRT received at time=1580652867 ***
    PC: @ 0x7fc747cc3e97 (unknown) (unknown)
    @ 0x5628351fecc2 64 absl::AbslFailureSignalHandler()
    @ 0x7fc748629890 631976416 (unknown)
    @ 0x56283524e57c 752 minigo::TpuDualNet::RunMany()
    @ 0x56283521253f 5504 minigo::GtpClient::Run()
    @ 0x5628351fe5ef 672 minigo::(anonymous namespace)::Gtp()
    @ 0x5628351fcc68 32 main
    @ 0x7fc747ca6b97 (unknown) (unknown)
    @ 0x82d6258d4c544155 (unknown) (unknown)

Does this mean that the weight file is wrong?
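For reference, the compilation failure reports a matrix product whose inner dimensions disagree: In[0] has 722 columns while In[1] has 256 rows. The sketch below only reproduces that shape rule in plain Python; `matmul_shape` is a hypothetical helper written for illustration, not part of Minigo or TensorFlow, and the reading of 722 as two flattened 19x19 planes is an assumption that the log does not confirm.

```python
def matmul_shape(lhs, rhs):
    """Return the shape of a 2-D matrix product, or raise if the inner dimensions differ."""
    if len(lhs) != 2 or len(rhs) != 2:
        raise ValueError("expected two 2-D shapes")
    if lhs[1] != rhs[0]:
        # Message format loosely mirrors the wording in the log above.
        raise ValueError(
            f"Matrix size-incompatible: In[0]: {list(lhs)}, In[1]: {list(rhs)}"
        )
    return (lhs[0], rhs[1])


# Shapes copied from the error message.
activations = (1, 722)
weights = (256, 128)

try:
    matmul_shape(activations, weights)
except ValueError as err:
    print(err)

# 722 == 2 * 19 * 19; this might be two flattened 19x19 planes (assumption).
print(722 == 2 * 19 * 19)
```

If this reading is right, the layer feeding dense/MatMul produces a different width than the weights expect, which would point to a mismatch between the graph that was built and the checkpoint that was loaded rather than to a corrupted file.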

I generated it with the following command, using a modified freeze_graph.py:
!python freeze_graph.py --model_path=gs://minigo-pub/v17-19x19/models/000820-defence --save_path=saved_models/000820-defence --use_tpu=true --tpu_name=$TPU_NAME --num_tpu_cores=8

The original freeze_graph.py saves the output graph to the same path as the input, so I modified it to accept a separate output path.
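The kind of change described here can be sketched as follows. This is not the actual freeze_graph.py and the real script may define its flags differently; `parse_paths` is a hypothetical helper that uses `argparse` purely for illustration. It reads `--model_path`, accepts an optional `--save_path`, and falls back to the model path when no save path is given, which matches the original behaviour of writing next to the input.

```python
import argparse


def parse_paths(argv):
    """Return (model_path, save_path); save_path defaults to model_path."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_path", required=True)
    # Hypothetical extra flag: where to write the frozen graph.
    parser.add_argument("--save_path", default=None)
    # parse_known_args ignores unrelated flags such as --use_tpu.
    args, _unknown = parser.parse_known_args(argv)
    save_path = args.save_path or args.model_path
    return args.model_path, save_path


model_path, save_path = parse_paths([
    "--model_path=gs://minigo-pub/v17-19x19/models/000820-defence",
    "--save_path=saved_models/000820-defence",
    "--use_tpu=true",
])
print(model_path)
print(save_path)
```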

I would appreciate any advice.
Thank you!
