Commit
* Consolidate .gitignore file to inference directory

  This commit consolidates all preexisting inference-related gitignore directives into the `.gitignore` file in the `inference` directory.

* Add eslint to project

  This commit adds an eslint config for the WASM tests. I think this overall adheres to Mozilla's code-format preferences and is good enough for a first pass. I have found the linting config to be a bit finicky, so my preference would be to improve the linting in a follow-up, if needed or desired at all.

* Add new lint tasks for eslint

  This commit adds new tasks to run the eslint linter on relevant JavaScript files in the project, and hooks the tasks up to a kind.yml file so that they run in CI.

* Rename bergamot-translator directive to bergamot-translator-source

  The name of the file that we use in the mozilla-unified source tree is `bergamot-translator.js`, but the name of the file generated here is `bergamot-translator-worker.js`. I wanted the names to match, so I am renaming the CMake directives that dictate the generated file's name such that the generated WASM code will be `bergamot-translator.js`. This is the first step of that process.

* Rename bergamot-translator-worker directive to bergamot-translator

  This is the second step of the previous commit, which renames the WASM-related directives to simply be `bergamot-translator`. This results in the generated JavaScript file being `bergamot-translator.js` instead of `bergamot-translator-worker.js`.

* Move thread-count default logic into build_wasm.py

  Building the WASM within the Docker container fails on multiple threads only if the host operating system is macOS, so I have moved that default logic into the script itself. The default can still be overridden by passing the `-j` flag, but rather than call sites having to know to do the "right" thing for macOS, I'm making it the default intrinsic behavior within the script.

* Prepare Bergamot module for mozilla-unified in build-bergamot.py

  This moves in the logic that currently lives in the mozilla-unified tree: adding the licensing and wrapping the generated WASM JavaScript module in a function. This will be paired with a downstream PR that removes this step on the mozilla-unified end.

* Add TypeScript bindings for Bergamot

  This commit adds some TypeScript bindings to the test directory that match the generated JS. I spent some time trying to get emscripten to generate these automatically, but I gave up at the end of my time-boxed effort.

* Add support for `git-lfs` to base docker image

  This commit adds support for pulling files via `git-lfs` to the Dockerfile for the base docker image. In order to pull the files, we need to install `git-lfs` from apt, and also add github.com to the list of known ssh hosts.

* Add a subset of models for testing using `git-lfs`

  This commit adds the gzipped artifacts for

  * `enes`
  * `enfr`
  * `esen`
  * `fren`

  These are used for testing for the moment, but I view this as a temporary solution that is good enough for this PR. In the future, we will need to merge the `firefox-translations-models` repository here.

* Add test-wasm.py script

  This commit adds a Python script for testing the WASM, which runs the WASM build script (if needed) and then invokes the test runner.

* Extract test models from archives in test-wasm.py

  This commit modifies the new `test-wasm.py` script to extract the model artifacts from their gzipped files in the `models` test directory. The non-gzip artifacts are ignored in the .gitignore and are also removed by the clean script.

* Copy WASM build artifacts to test directory in test-wasm.py

  This commit takes the WASM artifacts generated by the build script and copies them to a directory for use in tests.

* Produce hash of generated JS in test-wasm.py

  This commit computes a hash of the generated JavaScript, since the test runner adds it to the worker global scope using `eval`. This ensures that our test runner will only `eval` the intended script.

* Add Web Worker simulation infrastructure

  This commit adds a minimal API surface of the WorkerGlobalScope API functionality that we use for Translations within Firefox, wrapping the equivalent Node.js worker_threads behavior underneath. This allows us to test the generated code in a Node.js environment with the same API calls that we use in Firefox.

* Add Translations Engine and worker implementation

  This commit adds a simplified and minimal implementation of our Translations Engine from the mozilla-unified source tree, which is capable of starting a web-worker translator between a given language pair and translating a single message at a time.

* Add test cases for current translations WASM bindings

  Adds test cases that test the current translation functionality end-to-end, including plain-text translations and HTML translations.
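The hash check described in "Produce hash of generated JS" can be illustrated from the consumer's side: recompute the SHA-256 of the generated script and compare it with the recorded value before evaluating anything. The actual test runner is JavaScript running under Node.js and is not shown in this commit view, so the Python below is only a sketch of the idea. The names `sha256_of` and `verify_artifact` are hypothetical, and the sketch assumes the hash file holds a single `<hex digest> <file name>` line.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 4 KiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(4096), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_artifact(script_path: Path, hash_path: Path) -> bool:
    """Hypothetical helper: True only if the script matches its recorded hash.

    Assumes the hash file contains one line: "<hex digest> <file name>".
    """
    fields = hash_path.read_text().strip().split(maxsplit=1)
    if len(fields) != 2:
        return False  # Malformed hash file: refuse to trust the script.
    expected, recorded_name = fields
    if recorded_name != script_path.name:
        return False  # The hash was recorded for a different file.
    return hmac.compare_digest(expected, sha256_of(script_path))
```

A caller would evaluate the script only when `verify_artifact` returns `True`, and fail the test run otherwise.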
Showing 43 changed files with 2,796 additions and 110 deletions.
@@ -0,0 +1 @@
inference/**/*.gz filter=lfs diff=lfs merge=lfs -text
@@ -0,0 +1,95 @@
#!/usr/bin/env python3
import argparse
import hashlib
import os
import shutil
import subprocess
import sys

SCRIPTS_PATH = os.path.realpath(os.path.dirname(__file__))
INFERENCE_PATH = os.path.dirname(SCRIPTS_PATH)
BUILD_PATH = os.path.join(INFERENCE_PATH, "build-wasm")
WASM_PATH = os.path.join(INFERENCE_PATH, "wasm")
WASM_TESTS_PATH = os.path.join(WASM_PATH, "tests")
GENERATED_PATH = os.path.join(WASM_TESTS_PATH, "generated")
MODELS_PATH = os.path.join(WASM_TESTS_PATH, "models")
WASM_ARTIFACT = os.path.join(BUILD_PATH, "bergamot-translator.wasm")
JS_ARTIFACT = os.path.join(BUILD_PATH, "bergamot-translator.js")
JS_ARTIFACT_HASH = os.path.join(GENERATED_PATH, "bergamot-translator.js.sha256")


def calculate_sha256(file_path):
    sha256_hash = hashlib.sha256()
    with open(file_path, "rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            sha256_hash.update(byte_block)
    return sha256_hash.hexdigest()


def main():
    parser = argparse.ArgumentParser(
        description="Test WASM by building and handling artifacts.",
        formatter_class=argparse.RawTextHelpFormatter,
    )

    parser.add_argument("--clobber", action="store_true", help="Clobber the build artifacts")
    parser.add_argument(
        "--debug",
        action="store_true",
        help="Build with debug symbols, useful for profiling",
    )
    parser.add_argument(
        "-j",
        type=int,
        help="Number of cores to use for building (default: all available cores)",
    )
    args = parser.parse_args()

    build_wasm_script = os.path.join(SCRIPTS_PATH, "build-wasm.py")
    build_command = [sys.executable, build_wasm_script]
    if args.clobber:
        build_command.append("--clobber")
    if args.debug:
        build_command.append("--debug")
    if args.j:
        build_command.extend(["-j", str(args.j)])

    print("\n🚀 Starting build-wasm.py")
    subprocess.run(build_command, check=True)

    print("\n📥 Pulling translations model files with git lfs\n")
    subprocess.run(["git", "lfs", "pull"], cwd=MODELS_PATH, check=True)
    print(f" Pulled all files in {MODELS_PATH}")

    print("\n📁 Copying generated build artifacts to the WASM test directory\n")

    os.makedirs(GENERATED_PATH, exist_ok=True)
    shutil.copy2(WASM_ARTIFACT, GENERATED_PATH)
    shutil.copy2(JS_ARTIFACT, GENERATED_PATH)

    print(f" Copied the following artifacts to {GENERATED_PATH}:")
    print(f" - {JS_ARTIFACT}")
    print(f" - {WASM_ARTIFACT}")

    print(f"\n🔑 Calculating SHA-256 hash of {JS_ARTIFACT}\n")
    hash_value = calculate_sha256(JS_ARTIFACT)
    with open(JS_ARTIFACT_HASH, "w") as hash_file:
        hash_file.write(f"{hash_value} {os.path.basename(JS_ARTIFACT)}\n")
    print(f" Hash of {JS_ARTIFACT} written to")
    print(f" {JS_ARTIFACT_HASH}")

    print("\n📂 Decompressing model files required for WASM testing\n")
    subprocess.run(["gzip", "-dkrf", MODELS_PATH], check=True)
    print(f" Decompressed models in {MODELS_PATH}\n")

    print("\n🔧 Installing npm dependencies for WASM JS tests\n")
    subprocess.run(["npm", "install"], cwd=WASM_TESTS_PATH, check=True)

    print("\n📊 Running Translations WASM JS tests\n")
    subprocess.run(["npm", "run", "test"], cwd=WASM_TESTS_PATH, check=True)

    print("\n✅ test-wasm.py completed successfully.\n")


if __name__ == "__main__":
    main()
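
The decompression step in the script above shells out to `gzip -dkrf`, which decompresses (`-d`) every archive under the models directory recursively (`-r`), keeps the original `.gz` files (`-k`), and overwrites existing outputs (`-f`). For environments without a `gzip` binary, roughly the same effect could be achieved with the standard library alone. This is a sketch and not part of the commit: `decompress_models` is a hypothetical name, and unlike the command-line tool it only considers files whose names end in `.gz` and does not preserve timestamps.

```python
import gzip
import shutil
from pathlib import Path


def decompress_models(root: Path) -> list[Path]:
    """Hypothetical stand-in for `gzip -dkrf <root>`.

    Decompresses every *.gz file under root next to its archive,
    keeping the archive and overwriting any existing output file.
    """
    extracted = []
    for archive in sorted(root.rglob("*.gz")):
        target = archive.with_suffix("")  # "model.bin.gz" -> "model.bin"
        with gzip.open(archive, "rb") as src, target.open("wb") as dst:
            shutil.copyfileobj(src, dst)  # Stream so large models are not loaded whole.
        extracted.append(target)
    return extracted
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters because model files can be large.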