WIP: Storm db manager #481
Conversation
# Create a DeferredLock that should be used by callers to schedule their call.
self.db_lock = DeferredLock()

self._version = 1
In the current code, the initial database version is 0. Are you changing this for any particular reason?
In sqlitecachedb they set it to 1 in the _open_connection function if the query fails. That's why I went for 1 here.
I was looking at the dispersy db manager.
You will have to be careful to keep it consistent when refactoring. I don't know if 0/1 are actually checked and acted upon in Tribler and/or dispersy.
If the version is below 17 in Tribler, the database is apparently too old to be migrated, so 0 or 1 will make no difference. In dispersy the database_version method (property) is used and checked for 0; if it's 0, the code reads "# setup new database with current database_version".
That's what I meant: if you refactor Tribler/dispersy to use the same db manager, and one of them used to give special meaning to 0, 1 or whatnot, you will need to make extra sure the meaning gets changed everywhere to mean the same thing. Just a friendly reminder.
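To make the 0-versus-1 distinction concrete, here is a minimal sketch of the pattern being discussed, paraphrased from the comments above rather than taken from the actual dispersy or Tribler code; LATEST_VERSION, SCHEMA and the upgrade call are placeholders.

```python
# Minimal sketch (not the actual dispersy code): a freshly created database
# reports version 0, which is the cue to install the current schema; any other
# value below the latest version goes through the upgrade path instead.
LATEST_VERSION = 17   # placeholder value for illustration
SCHEMA = "CREATE TABLE IF NOT EXISTS example (id INTEGER PRIMARY KEY);"

def check_database(db):
    version = db.database_version        # e.g. backed by PRAGMA user_version
    if version == 0:
        # "setup new database with current database_version"
        db.executescript(SCHEMA)
        db.database_version = LATEST_VERSION
    elif version < LATEST_VERSION:
        db.upgrade(version, LATEST_VERSION)   # hypothetical migration hook
```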
98fed43 to f3fc4ec
On Monday I will set up an AllChannel experiment runner for dispersy too (it will use the latest Tribler devel with the dispersy from the PR being tested). This way you will be able to compare it with the upstream results. @qstokkink This will be very useful for your PRs too.
Awesome, it's good to have more assurance that changes do not break the current behaviour.
@whirm 🎉 Nice, this leaves me more time to do other things.
30ea219 to 5da399c
retest this please
1 similar comment
retest this please
3b4fcb3 to d4792c3
@whirm Why did my job on Jenkins get aborted? https://jenkins.tribler.org/job/dispersy/job/GH/job/PR/job/Unit_tests_linux/1009/
The Allchannel experiment failed, so it aborted the whole stage. I moved the experiment to a second phase until I have time to deploy the automatic cluster selection stuff I made.
Quite the timing, I was about to ask whether the allchannel experiment was timing out and therefore the build as well. Thanks for your reply :D
d91843e to 09ee2c2
@whirm Why does a push trigger two PRs? https://gyazo.com/771ba958cf9dfc517da4095d3807ce69 https://jenkins.tribler.org/job/dispersy/job/GH/job/PR/job/Unit_tests_linux/1073/
It turns out both the master multijob and the Linux tester job had the GHPRB trigger enabled, so it has been running twice since the beginning.
(fixed now)
Thanks man!
Time to start activating the DeferredLock on one of the functions!
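As a rough illustration of what "activating" the lock means here (a minimal sketch with hypothetical function names, not the PR's actual code), a direct database call gets wrapped so it is scheduled through the DeferredLock and hands its result back as a Deferred:

```python
from twisted.internet.defer import DeferredLock

db_lock = DeferredLock()

def write_record(value):
    # Placeholder for the original, direct database call.
    print("writing", value)

def write_record_locked(value):
    # The "activated" version: scheduled through the lock so concurrent
    # callers are serialized; returns a Deferred firing with the result.
    return db_lock.run(write_record, value)
```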
@@ -237,7 +199,7 @@ def requests(self, node_count, expected_responses, *pairs):
other.send_identity(node)

messages = [other.create_sequence_text("Sequence message #%d" % i, i + 10, i) for i in range(1, 11)]
yield other.store(messages)
It can still be a yield, I guess.
b3b1b9f to 5ac48c5
5ac48c5 to 21acaa8
Allchannel succeeded. For those interested: after putting the commit-once-per-minute structure back, I got the following results (20 nodes, 1k clients, i.e. the real allchannel scenario): https://jenkins.tribler.org/job/pers/job/allchannel_laurens_v2/115/artifact/output/ which is an (almost) perfect match with https://jenkins.tribler.org/job/Tribler/job/Experiments/job/AllChannel+Channelcommunity_devel/975/artifact/output/ Some interesting and notable differences: you see clear peaks in the reads and writes of the regular one, while in my case it is more of a curve, but with peaks. The download, upload and more are very similar however, for example the upload.
So in conclusion: batching your SQL statements and flushing once in a while really gives a LOT of performance. This will be an interesting problem to tackle once IO is moved to the threadpool, because there you may not be able to batch that easily; perhaps have a dedicated thread for the IO that can batch and block, so the main Twisted thread can continue. Next up: gumby refactoring and Tribler.
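A minimal sketch of the batching idea described above, assuming a plain DB-API style connection and a 60-second flush interval; the class name, method names and interval are illustrative, not the PR's actual code.

```python
from twisted.internet.task import LoopingCall


class BatchingWriter(object):
    """Illustrative only: queue statements in memory and flush them to disk
    in one transaction, instead of committing after every single statement."""

    def __init__(self, connection, flush_interval=60.0):
        self._connection = connection          # assumed DB-API style connection
        self._pending = []
        self._flusher = LoopingCall(self.flush)
        self._flusher.start(flush_interval, now=False)

    def queue(self, statement, args=()):
        # Nothing touches the disk here; the statement is only remembered.
        self._pending.append((statement, args))

    def flush(self):
        # One transaction (and one expensive sync) for the whole batch.
        if not self._pending:
            return
        cursor = self._connection.cursor()
        for statement, args in self._pending:
            cursor.execute(statement, args)
        self._connection.commit()
        self._pending = []
```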
Also, it looks like my stuff is able to do (slightly) more IO than the current code, while yielding the same results. I also noticed that at the start of the experiment there are quite a few dropped messages, yet no dropped messages after that (like the current allchannel, but more drops). @whirm thoughts?
Sooooo... you reduced the number of commits by batching stuff up? Dramatic changes in the graphs.
A lot more IO in your stuff. Do you commit after creating each message?
@synctext Yes, I am now batching them (or well, SQLite is), and once every minute, or when you send messages created by yourself (so your global time advances), the changes are flushed to the actual disk. In all my previous attempts I would commit after every database change, which generated a lot more IO and was killing the performance. Basically, the owner of the channel in the allchannel experiment would go flat.
@lfdversluis good job! Indeed, most graphs look very similar. I was expecting the spiked pattern for the wchar graph, but instead it is a line, so you have IO operations ongoing all the time? Regarding the dropped messages, they peak around 100 seconds. If you take a look at https://jenkins.tribler.org/job/pers/job/allchannel_laurens_v2/116/artifact/output//wchars.png, you see that the IO also peaks at around 100 seconds, so I think you run into the same problem again: too much IO, causing dropped messages. I think this peak occurs the moment clients start to discover each other and start to send messages. Can you somehow investigate/reduce this peak?
Try to disable all commits and see what happens.
Closing in favour of #530
I opened this PR so people can view the progress on this work and provide feedback and/or suggestions.
This will become the database manager used in Tribler/Dispersy to do IO on a separate thread to gain performance.
Small performance indicator: the current synchronous, blocking tests take 635-640 seconds to complete. Now, with 18 more tests, it takes 375 seconds.
Adds the Storm DB manager (feature).
Adds 18 tests
Fixes #488 Fixes #484
Refactoring progress:
- dispersy.py
- community.py
- multichain DB
- DispersyDatabase
- statistics.py
- node.py
- test_classification.py
- test_sync.py
- test_undo.py
- tracker/community.py
- database.py, with the following methods (a rough sketch of how these could look follows this list):
  - execute
  - execute_many
  - insert
  - fetchone
  - fetchall
  - executescript
  - insert_many
  - delete
  - count
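To give an idea of the API shape, here is a rough sketch of how such deferred-returning methods could look. It is an assumption based on the method names listed above and Twisted's thread helpers, not the actual Storm DB manager implementation.

```python
from twisted.internet.threads import deferToThread


class AsyncDBManagerSketch(object):
    """Illustrative only: run blocking database work on the reactor's thread
    pool and hand results back to callers as Deferreds."""

    def __init__(self, connection):
        # Assumed DB-API style connection, e.g. sqlite3. A real implementation
        # must ensure the connection may be used from the thread pool (for
        # sqlite3: a dedicated database thread or check_same_thread=False).
        self._connection = connection

    def execute(self, query, args=()):
        # Returns a Deferred that fires once the statement has been committed.
        return deferToThread(self._execute_blocking, query, args)

    def fetchall(self, query, args=()):
        # Returns a Deferred that fires with all rows of the query.
        return deferToThread(self._fetchall_blocking, query, args)

    def _execute_blocking(self, query, args):
        self._connection.execute(query, args)
        self._connection.commit()

    def _fetchall_blocking(self, query, args):
        return self._connection.execute(query, args).fetchall()
```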
Impact changes for Tribler et al.:
`dispersy_auto_load` was previously a setter and getter using `@property`. I've left the getter untouched, but it now returns a deferred. The setter, however, is now a function `set_dispersy_auto_load`, which returns a deferred too. This setter was only used in a test, but is probably used more in Tribler.
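For illustration, a minimal sketch of how calling code would change under this description; the exact signatures are an assumption based on the text above.

```python
from twisted.internet import defer

# Before: plain synchronous property access.
#     enabled = dispersy.dispersy_auto_load
#     dispersy.dispersy_auto_load = True

# After: the getter returns a deferred and the setter is an explicit method.
@defer.inlineCallbacks
def toggle_auto_load(dispersy):
    enabled = yield dispersy.dispersy_auto_load          # getter now yields a deferred
    yield dispersy.set_dispersy_auto_load(not enabled)   # setter returns a deferred too
```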