-
-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sheets-] speed up loading check of max_rows #2636
Conversation
visidata/sheets.py
Outdated
@@ -303,8 +303,9 @@ def loader(self): | |||
self.rows = [] | |||
try: | |||
with vd.Progress(gerund='loading', total=0): | |||
m = self.options.max_rows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to name the local variable max_rows
instead.
Option lookup is slow. It should be cached, but still, accessing the cache in an inner loop called millions of times is unnecessary. So this is the reasonable fix. |
For a sheet with millions of rows, the inner loop max_rows check was delaying loading time by 60%.
4e3c3be
to
f6d178a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks for your diligence. I made it >= so we can have the right number of rows and also stop in case it goes over for some reason. (@anjakefala please squash)
Oh, good idea. I'll remember to consider |
Affects vd v3.1. After 7c24ffe, loading sheets became much slower. A 2 million row CSV sheet with multiple columns went from 10 seconds to 16 seconds.
I didn't look into the root cause of why
vd.options.max_rows
is so slow. Perhaps that's the proper fix instead?