Missing check for caching availability when getting model files #1279


Closed
PrafulB opened this issue Apr 10, 2025 · 2 comments · Fixed by #1280

Comments

@PrafulB
Contributor

PrafulB commented Apr 10, 2025

When running in a Node.js environment, it looks like there's an edge case where the unavailability of a caching system causes an error when loading model files. The problematic code:

// https://github.com/huggingface/transformers.js/blob/10c09fb561857abf817d651a02898f590f5b7954/src/utils/hub.js#L650

const path = await cache.match(cacheKey);
if (path instanceof FileResponse) {
    return path.filePath;
}

Specifically, in a Node.js environment, getModelFile() attempts to download the respective model.onnx (or equivalent) file, store it in the cache, and return the path to it. When caching is disabled or unavailable and a remote model is to be loaded, cache stays undefined, so the cache.match() call throws an error.
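The failure mode can be reproduced in isolation: calling .match() on an undefined value throws exactly this kind of TypeError. A minimal standalone illustration (plain JavaScript, not the library code):

```javascript
// Standalone illustration of the failure mode (not transformers.js code):
// when caching is disabled, `cache` is never assigned, so the later
// `cache.match(...)` call dereferences undefined.
let cache; // stays undefined when caching is unavailable

let thrown;
try {
    cache.match("model.onnx");
} catch (e) {
    thrown = e;
}

console.log(thrown instanceof TypeError); // true
console.log(thrown.message); // "Cannot read properties of undefined (reading 'match')"
```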

Simple repro (just set useFSCache to false):

import * as transformers from '@huggingface/transformers';

transformers.env.useFSCache = false;

transformers.pipeline("sentiment-analysis").then(async pipe => {
    console.log( await pipe('Doesn\'t work when cache is unavailable!') )
})

raises
TypeError: Cannot read properties of undefined (reading 'match').

File system availability for caching might be an okay assumption to make here, but I feel like it should still be checked for. The function does make allowances for it in most places; it's just this last piece that seems to miss it.

As a possible suggestion: the response variable already contains the downloaded model file, so when caching is unavailable it might make sense to just pass the buffer back. It seems to be passed directly to ONNX.InferenceSession.create(), which accepts both buffers and paths. Happy to raise a PR if that's acceptable.
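That suggestion could take roughly this shape (a hypothetical sketch, not the actual hub.js code or the PR's patch; resolveModelData and its parameters are made-up names, and the duck-typed filePath check stands in for the real instanceof FileResponse check):

```javascript
// Hypothetical sketch of the suggested fallback. If no cache is available,
// return the downloaded bytes directly instead of a cached file path;
// ONNX Runtime's InferenceSession.create() accepts both buffers and paths.
async function resolveModelData(cache, cacheKey, response) {
    if (cache) {
        const match = await cache.match(cacheKey);
        // The real code checks `match instanceof FileResponse` here.
        if (match && match.filePath) {
            return match.filePath; // path to the cached model file
        }
    }
    // Caching disabled/unavailable (or cache miss): pass the buffer back.
    return new Uint8Array(await response.arrayBuffer());
}
```

With cache left undefined, this returns the raw model bytes rather than throwing.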

@PrafulB PrafulB changed the title [Transformers.js] Missing check for caching system availability when getting model files Missing check for caching system availability when getting model files Apr 10, 2025
@PrafulB PrafulB changed the title Missing check for caching system availability when getting model files Missing check for caching availability when getting model files Apr 10, 2025
@xenova
Collaborator

xenova commented Apr 10, 2025

Hi there 👋 Thanks for the report. Your solution sounds like a good idea, so please feel free to open a PR! 🤗

@PrafulB
Contributor Author

PrafulB commented Apr 11, 2025

Appreciate the quick confirmation @xenova ! Just raised PR #1280 for this.

Huge thanks for creating and maintaining this awesome library! ❤️
