
[S3 Driver] getKeys() doesn't respect prefix parameter and missing pagination for >1000 objects #728

@DemoMacro

Description


Environment

Reproduction

How to reproduce Bug 1 (prefix filtering doesn't work):

  1. Create an S3 bucket and organize some files using prefixes as folder names (for example: config/app.json, config/db.json, uploads/image1.jpg, uploads/image2.jpg)
  2. Set up unstorage with the S3 driver using your bucket credentials
  3. Try calling storage.getKeys("config") - you'd expect to only see the config files
  4. What you actually get: ALL files in the bucket, not just the config folder
  5. If you inspect the actual HTTP request being made to S3, you'll see there's no ?prefix= parameter in the URL
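To make the expectation in steps 3–4 concrete, here is the behavior `getKeys("config")` should have over the example keys from step 1. Plain string filtering stands in for the driver here (unstorage's own key normalization is ignored for brevity); this is an illustration, not the driver's code:

```typescript
// The objects created in step 1, and what getKeys("config") should return.
const bucketKeys = [
  "config/app.json",
  "config/db.json",
  "uploads/image1.jpg",
  "uploads/image2.jpg",
];

// Expected: only keys under the "config/" prefix.
const expected = bucketKeys.filter((key) => key.startsWith("config/"));
console.log(expected); // ["config/app.json", "config/db.json"]

// Actual (buggy) behavior: every key in the bucket comes back,
// because the request is sent without a ?prefix= parameter.
console.log(bucketKeys); // all four keys
```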

How to reproduce Bug 2 (pagination is missing):

  1. Create an S3 bucket with more than 1000 objects (you can do this quickly with a script or using the AWS CLI sync command)
  2. Set up unstorage with the S3 driver
  3. Call storage.getKeys() to get all keys
  4. You'll notice the returned array only has 1000 items, even though your bucket has more
  5. If you look at the S3 API response (or check the AWS CloudWatch logs), you'll see the response contains a NextContinuationToken element, indicating there's more data available
  6. The driver just stops after the first page and never tries to get the rest
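The steps above can be simulated without a real bucket. In this sketch, `fakeListPage` stands in for one ListObjectsV2 call: at most 1000 keys per response, plus a `NextContinuationToken` when the listing is truncated (the function and type names here are hypothetical, not the driver's):

```typescript
// Simulate a bucket of 2500 objects behind a ListObjectsV2-style API:
// each call returns at most 1000 keys, plus NextContinuationToken when
// more remain. fakeListPage is a stand-in for the real HTTP call.
interface ListPage {
  keys: string[];
  nextContinuationToken?: string;
}

const objects = Array.from({ length: 2500 }, (_, i) => `object-${i}`);

function fakeListPage(continuationToken?: string): ListPage {
  const start = continuationToken ? Number(continuationToken) : 0;
  const keys = objects.slice(start, start + 1000);
  const end = start + keys.length;
  return {
    keys,
    nextContinuationToken: end < objects.length ? String(end) : undefined,
  };
}

// What the buggy driver does: one request, then stop.
const firstPage = fakeListPage();
console.log(firstPage.keys.length);           // 1000 of 2500 keys
console.log(firstPage.nextContinuationToken); // "1000" -- more data exists
```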

What's happening in the code:

The listObjects function is supposed to handle both of these cases, but currently it just makes a single request without any prefix filtering, grabs the first page of results, and returns. There's no loop to handle pagination, and the prefix parameter isn't being added to the URL at all.
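For comparison, a `listObjects` that handles both cases would thread the prefix into every request and loop on the continuation token. The sketch below shows only that control flow, with the driver's signed S3 HTTP call replaced by an injected `fetchPage` function; all names are illustrative, not the driver's actual internals:

```typescript
interface Page {
  keys: string[];
  nextContinuationToken?: string;
}
// Stand-in for the driver's signed HTTP request to S3.
type FetchPage = (params: URLSearchParams) => Page;

function listObjects(fetchPage: FetchPage, prefix?: string): string[] {
  const keys: string[] = [];
  let token: string | undefined;
  do {
    const params = new URLSearchParams({ "list-type": "2" });
    if (prefix) params.set("prefix", prefix);           // Bug 1: forward the prefix
    if (token) params.set("continuation-token", token); // Bug 2: request the next page
    const page = fetchPage(params);
    keys.push(...page.keys);
    token = page.nextContinuationToken;
  } while (token); // keep going until S3 stops returning a token
  return keys;
}

// Demo source: 2500 keys (half under "config/"), paged 1000 at a time.
const allKeys = Array.from({ length: 2500 }, (_, i) =>
  i % 2 === 0 ? `config/file-${i}` : `uploads/file-${i}`,
);
const fake: FetchPage = (params) => {
  const matching = allKeys.filter((k) => k.startsWith(params.get("prefix") ?? ""));
  const start = Number(params.get("continuation-token") ?? "0");
  const keys = matching.slice(start, start + 1000);
  const end = start + keys.length;
  return { keys, nextContinuationToken: end < matching.length ? String(end) : undefined };
};

console.log(listObjects(fake).length);            // 2500 -- all pages fetched
console.log(listObjects(fake, "config/").length); // 1250 -- prefix respected
```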

Describe the bug

I found two critical bugs in the S3 driver that make it unusable for production use cases.

Bug 1: The prefix parameter is completely ignored

The listObjects function accepts a prefix parameter, but when it makes the API request to S3, it simply doesn't use this parameter at all. This breaks any feature that relies on prefix-based filtering:

  • When you call getKeys("some-folder/"), you expect to get only keys under that folder, but instead you get ALL keys in the bucket
  • When you call clear("some-folder/"), you expect to delete only files in that folder, but it would delete everything in your entire bucket
  • Basically, any time you try to work with a specific subdirectory in your bucket, it just doesn't work

Looking at the code, the function defines the parameter but then constructs the URL without ever adding it to the query string. The S3 API never receives the prefix information.
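As a hypothetical reconstruction of that defect (the helper names are mine, not the driver's), the difference comes down to a single query parameter that is never set:

```typescript
// Buggy shape: the parameter exists in the signature but never reaches the URL.
function buggyListUrl(endpoint: string, prefix?: string): string {
  const url = new URL(endpoint);
  url.searchParams.set("list-type", "2");
  // `prefix` is never added -- S3 falls back to listing the whole bucket.
  return url.toString();
}

// Fixed shape: forward the prefix when one is given.
function fixedListUrl(endpoint: string, prefix?: string): string {
  const url = new URL(endpoint);
  url.searchParams.set("list-type", "2");
  if (prefix) url.searchParams.set("prefix", prefix);
  return url.toString();
}

console.log(buggyListUrl("https://bucket.example.com/", "config/"));
// https://bucket.example.com/?list-type=2
console.log(fixedListUrl("https://bucket.example.com/", "config/"));
// https://bucket.example.com/?list-type=2&prefix=config%2F
```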

Bug 2: Missing pagination support causes silent data loss

The S3 ListObjectsV2 API only returns up to 1000 objects per request. When you have more than 1000 objects in your bucket, the driver only fetches the first 1000 and stops. There's no mechanism to handle the continuation token and fetch the remaining pages.

This means:

  • If your bucket has 5,000 objects, getKeys() will silently return only 1,000 of them
  • You have no way to know this is happening unless you manually count the objects
  • Any data beyond the first 1,000 objects is effectively invisible to your application

The S3 API response includes a NextContinuationToken when there are more results, but the current implementation completely ignores this and doesn't make follow-up requests.

Additional context

Submitted a fix in #729

Logs

```ts
// []
console.log(await storage.getKeys("test"));
```

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working)
