Migrate GCS files to new (ideally public) locations #18518

Open
ScottTodd opened this issue Sep 12, 2024 · 5 comments

@ScottTodd
Member

We depend on a few files hosted in a GCP project using various buckets.

Most uses can be discovered in this repo with a regex search of https://storage\.googleapis\.com.*iree:

21 results - 13 files

.github\workflows\ci.yml:
  47    # attempt the setup step last ran in.
  48:   GCS_URL: https://storage.googleapis.com/iree-github-actions-${{ github.event_name == 'pull_request' && 'presubmit' || 'postsubmit' }}-artifacts/${{ github.run_id }}/${{ github.run_attempt }}
  49  

.github\workflows\pkgci_test_riscv64.yml:
  71          env:
  72:           IREE_ARTIFACT_URL: "https://storage.googleapis.com/iree-shared-files"
  73            RISCV_CLANG_TOOLCHAIN_FILE_NAME: "toolchain_iree_manylinux_2_28_20231012.tar.gz"

build_tools\docker\dockerfiles\base-arm64.Dockerfile:
  78  
  79: RUN wget --no-verbose "https://storage.googleapis.com/iree-shared-files/qemu-aarch64"
  80  RUN chmod +x ./qemu-aarch64 && cp ./qemu-aarch64 /usr/bin/qemu-aarch64 && rm -rf /install-qemu

build_tools\riscv\riscv_bootstrap.sh:
  14  PREBUILT_DIR="${HOME}/riscv"
  15: IREE_ARTIFACT_URL="https://storage.googleapis.com/iree-shared-files"
  16  

docs\website\docs\guides\ml-frameworks\tflite.md:
   98  WORKDIR="/tmp/workdir"
   99: TFLITE_URL="https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8.tflite"
  100  TFLITE_PATH=${WORKDIR}/model.tflite

  152  ``` python
  153: tfliteUrl = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8.tflite"
  154: jpgUrl = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8_input.jpg"
  155  

experimental\web\generate_web_metrics.sh:
  75  
  76: wget -nc https://storage.googleapis.com/iree-model-artifacts/mobile_ssd_v2_float_coco.tflite
  77: wget -nc https://storage.googleapis.com/iree-model-artifacts/deeplabv3.tflite
  78: wget -nc https://storage.googleapis.com/iree-model-artifacts/posenet.tflite
  79: wget -nc https://storage.googleapis.com/iree-model-artifacts/mobilebert-baseline-tf2-float.tflite
  80: wget -nc https://storage.googleapis.com/iree-model-artifacts/mobilenet_v2_1.0_224.tflite
  81: wget -nc https://storage.googleapis.com/iree-model-artifacts/MobileNetV3SmallStaticBatch.tflite
  82  

integrations\tensorflow\test\python\iree_tfl_tests\imagenet_test_data.py:
   9      # We use an image of apples since this is an easy example.
  10:     img_path = "https://storage.googleapis.com/iree-model-artifacts/ILSVRC2012_val_00000023.JPEG"
  11      local_path = "/".join([workdir, "ILSVRC2012_val_00000023.JPEG"])

integrations\tensorflow\test\python\iree_tfl_tests\mobilebert_tf2_quant_test.py:
  8  # Source https://tfhub.dev/iree/lite-model/mobilebert/int8/1
  9: model_path = "https://storage.googleapis.com/iree-model-artifacts/mobilebert-baseline-tf2-quant.tflite"
  10  

integrations\tensorflow\test\python\iree_tfl_tests\mobilenet_v1_test.py:
  10  
  11: model_path = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/mobilenet_v1.tflite"
  12  

integrations\tensorflow\test\python\iree_tfl_tests\mobilenet_v3-large_uint8_test.py:
  8  # Source https://tfhub.dev/iree/lite-model/mobilenet_v3_large_100_224/uint8/1
  9: model_path = "https://storage.googleapis.com/iree-model-artifacts/mobilenet_v3-large_224_1.0_uint8.tflite"
  10  

integrations\tensorflow\test\python\iree_tfl_tests\posenet_i8_test.py:
  13  
  14: model_path = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8.tflite"
  15: model_input = "https://storage.googleapis.com/iree-model-artifacts/tflite-integration-tests/posenet_i8_input.jpg"
  16  

tests\e2e\stablehlo_models\mnist_train_test\mnist_train_test.py:
  21  
  22: MODEL_ARTIFACTS_URL = "https://storage.googleapis.com/iree-model-artifacts/mnist_train.2bec0cb356ae7c059e04624a627eb3b15b0a556cbd781bbed9f8d32e80a4311d.tar"
  23  

tests\e2e\stablehlo_models\mnist_train_test\README.md:
  22  sed -i \
  23:   "s|MODEL_ARTIFACTS_URL =.*|MODEL_ARTIFACTS_URL = \"https://storage.googleapis.com/iree-model-artifacts/mnist_train.${DIGEST}.tar\"|" \
  24    mnist_train_test.py

As far as I can tell, those files are only read from. Nothing writes to them outside of very rare maintenance (none in the last year, IIRC). One bucket is read-write and is used for ccache: http://storage.googleapis.com/iree-sccache/ccache. We are in the process of migrating off of it in #18238.
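
For anyone who wants to rerun the search outside an editor, here is a minimal sketch using only the Python standard library. The regex is the one quoted above; the function name `find_gcs_references` and the choice to skip the `.git` directory are assumptions made for illustration, not part of the repo's tooling.

```python
import re
from pathlib import Path

# The pattern quoted in the issue text.
GCS_PATTERN = re.compile(r"https://storage\.googleapis\.com.*iree")


def find_gcs_references(root):
    """Return (relative path, line number, stripped line) for each match under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git metadata directory.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # Binary or unreadable file.
        for number, line in enumerate(text.splitlines(), start=1):
            if GCS_PATTERN.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits


# Example use from the repo root:
#   for rel_path, number, line in find_gcs_references("."):
#       print(f"{rel_path}:{number}: {line}")
```

The result count may differ from the listing above as files are updated.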

@ScottTodd ScottTodd added infrastructure Relating to build systems, CI, or testing cleanup 🧹 labels Sep 12, 2024
@ScottTodd
Member Author

Until we find a better location, let's at least download the riscv and arm files (qemu-aarch64, toolchain_iree_manylinux_2_28_20231012.tar.gz, and any others) and upload mirrors of them to Azure.
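
The download half of that mirroring step could look roughly like the sketch below, which also records a SHA-256 digest so the mirrored copy can be checked against the original later. The helper name `download_with_digest` is hypothetical, and the Azure upload is left out because this thread does not say which tool or container path would be used.

```python
import hashlib
import urllib.request
from pathlib import Path


def download_with_digest(url, dest_dir):
    """Download url into dest_dir and return (local path, SHA-256 hex digest)."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last path component of the URL.
    local_path = dest_dir / url.rsplit("/", 1)[-1]
    sha256 = hashlib.sha256()
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        # Stream in 1 MiB chunks so large archives are not held in memory.
        for chunk in iter(lambda: response.read(1 << 20), b""):
            out.write(chunk)
            sha256.update(chunk)
    return local_path, sha256.hexdigest()


# Example use with a URL that appears in this issue:
#   download_with_digest(
#       "https://storage.googleapis.com/iree-shared-files/qemu-aarch64", "mirror")
```

Keeping the digest next to the mirrored file gives a cheap way to confirm that the uploaded copy matches what was downloaded.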

@ScottTodd
Member Author

@Eliasj42 could you help mirror the https://storage.googleapis.com/iree-shared-files/qemu-aarch64 and toolchain_iree_manylinux_2_28_20231012.tar.gz files to the sharkpublic Azure storage account or some other public location? We can find a better long-term home for those files later.

I'm less concerned about the .tflite files. We can just disable any tests relying on those.

@ScottTodd
Member Author

Yep, then point the code to the new file locations.
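
Repointing can be done by hand, but a prefix-replacement helper is one way to keep the change mechanical. This is only a sketch: `repoint_urls` is a hypothetical name, and the `example.invalid` target is a placeholder because the real mirror URL is not stated in this thread.

```python
def repoint_urls(text, replacements):
    """Replace each old URL prefix in text with its mapped new prefix."""
    # Apply longer prefixes first so a more specific mapping wins.
    for old in sorted(replacements, key=len, reverse=True):
        text = text.replace(old, replacements[old])
    return text


# Placeholder mapping: substitute the real mirror location before use.
REPLACEMENTS = {
    "https://storage.googleapis.com/iree-shared-files":
        "https://example.invalid/iree-shared-files",
}

line = 'IREE_ARTIFACT_URL="https://storage.googleapis.com/iree-shared-files"'
updated = repoint_urls(line, REPLACEMENTS)
```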

Eliasj42 added a commit that referenced this issue Sep 16, 2024
Updated several links to point to files in azure shark-public container
instead of gcp. Progress on
#18518.

Signed-off-by: Elias Joseph <[email protected]>
Co-authored-by: Elias Joseph <[email protected]>
@ScottTodd
Member Author

There are still a few places to update. A user just noted that the links in https://github.com/iree-org/iree/blob/main/build_tools/riscv/riscv_bootstrap.sh are dead.

We should also still switch to easier-to-manage files (git lfs?) with reproducible steps for generating them, instead of just mirroring to a cloud bucket that only some project members have access to.
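
Wherever the files end up, a content digest in the file name (as in the `mnist_train.${DIGEST}.tar` pattern listed earlier) makes a regenerated or re-mirrored file checkable. The sketch below assumes that digest is the SHA-256 of the file contents, which this thread does not confirm; `matches_embedded_digest` is a hypothetical helper name.

```python
import hashlib
import re
from pathlib import Path

# Matches names such as mnist_train.<64 hex characters>.tar.
DIGEST_IN_NAME = re.compile(r"\.([0-9a-f]{64})\.")


def matches_embedded_digest(path):
    """Return True if the file's SHA-256 equals the digest embedded in its name."""
    path = Path(path)
    match = DIGEST_IN_NAME.search(path.name)
    if match is None:
        raise ValueError(f"no 64-character hex digest in {path.name!r}")
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks to keep memory use flat for large archives.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha256.update(chunk)
    return sha256.hexdigest() == match.group(1)
```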

ScottTodd added a commit that referenced this issue Nov 13, 2024
See #18518. These tests have
started failing since the GCS bucket is now returning 403 errors.
2 participants