-
Notifications
You must be signed in to change notification settings - Fork 66
feat: add withIterator
#154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
) | ||
return; | ||
|
||
const files = this.getArray(this.state.paths); | ||
for (let i = 0; i < entries.length; ++i) { | ||
if (maxFiles && counts.directories + counts.files >= maxFiles) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should happen after the push so that you only check it when adding something new vs checking it for every entry. It might be faster that way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe the check should happen inside the push methods and they can call the abort controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i did originally have it in the push function but it means you either need:
- to give the pushFile and pushDirectory functions access to the
counts
andmaxFiles
- to do the check in the default
pushPath
callback, which means you need to remember to do it in any override ones too (such as the iterator)
the way it is now, we don't need to pass these things all the way down and can do it in one place
Doesn't work yet, but future James will sort that out.
Works almost but is an absolute trainwreck. Will write it better now that i know the problem.
Of course, we need a queue! Now things work because we're trying to async iterate a synchronous task within an asynchronous task.
Sometimes it is possible we complete before the next iteration has run, so we leave a promise dangling or some such thing.
We currently exceed `maxFiles` and slice the result later. This doesn't work in an iterator since there is no end "result". This change instead makes the walker stop walking when `maxFiles` is met, so we already have the correctly sized result set.
* refactor: return an `AsyncGenerator` instead * chore: update test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this is the best approach. It looks super hacky and error prone. I am sure we can find a better approach even if that requires rewriting/duplicating some parts of fdir instead of shoehorning a solution over the existing implementation.
can you expand on what parts you think are hacky? the outer part is just a generator, whatever way we implement this, that'll have to exist somewhere the way we hook into the what paths have been emitted is the way it is since i didn't want to change how we read directories under the hood. if we change the way reading directories works, we could trigger the next read only when the next iteration happens i imagine. |
The way the
It wouldn't be a bad idea to write separate directory walking logic for the iterator. We can reuse whatever parts we can. |
to be clear, it doesn't support
i was just trying to provide an iterator around what already exists. it can be written better if we don't need to use the existing implementation. worth noting it is a regular pattern for implementing an iterator of a mixture of async/sync though. i haven't done anything unusual here, its a fairly normal solution if we want to use the existing implementation. it seems clear that you'd rather just use a new implementation, so i will revisit this some day when i get time. |
Fixes #40
it isn't a true iterator of paths since we still read each directory as a batch under the hood. but it does mean we're only ever holding at most whats in one directory in memory
we could one day follow up on this by switching to
opendir
and reading entries one by oneif this implementation seems of any interest, let me know and i can review with you
@thecodrr @SuperchupuDev
Notable changes
Tests use an
execute
helperAn
execute
helper has been extracted so we can run the tests against all types without caring if its an iterator or not.Now in a test, you just
await execute(api, type);
and it'll handle normalising the result.onlyCounts
tests disabled forwithIterator
typeIt wouldn't throw if we tried, but
onlyCounts
doesn't make any sense since there's no result when usingwithIterator
. It'd be an empty iteration. So we just don't test this in that case (as there'd be type errors too i suspect).group-files
now exposes apushGroup
parameterThe file grouping mechanism decides if to group or not, then goes ahead and pushes a group if it needs to.
With an iterator, we need a way to hook into the underlying push, not the outer
groupFiles
function.For that reason, i've introduced a parameter,
pushGroup
which is(item, arr) => void
.This means we can pass a custom
pushGroup
in iterator mode which doesn't actually push toarr
, but feeds it into the iterable. In all other cases, we use a defaultpushGroup
which just pushes toarr
.pushFile
andpushDirectory
also have apushPath
parameterFor the same reason as
groupFiles
, bothpushFile
andpushDirectory
now take a `pushPath.This is used when we want to push a path, whether it be a directory or not, to the result set. In iterator mode, we can use this to feed it into the iterable. Otherwise, we push to
this.state.paths
(arr
).Counts#directories
takes filters into accountThe
directories
count used to be all directories we have visited, rather than directories we have matched. This means filters and such don't get taken into account (e.g. if we don't include dirs, we still count them right now. and if we filter dirs, we sometimes count the ones we filtered out).Meanwhile,
files
count was always the files we have matched.In iterator mode, we need a way to know how many paths (files or directories) we pushed so far as we don't have a set anywhere to check the length of. This would be as simple as
directories + files
if not for the inconsistency mentioned above.So i have fixed the inconsistency, only incrementing
directories
for matched directories (i.e. those that end up in the result).WalkerIterator
A basic iterator that essentially loops until the async callback of the existing
Walker
has been called.On each iteration, it waits until we either push a new path or we complete (so isn't just mindlessly looping).