-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
async: review and refactor interaction between main thread and worker threads #143
Open
doudou
wants to merge
17
commits into
master
Choose a base branch
from
fix_async
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Unlike zero! which zeroes even the enums, Type.zero creates a value that is properly initialized, but zero everywhere the "value" is not specified.
…cesses If the call happens before the connection is actually closed, OmniORB handles it as a timeout. Otherwise, it is a ComError. This is actually not-so-great in our case, as we really do not want the timeout to happen. Note that it does not apply to crashing components, so the general Syskit behavior should not be affected.
a846ec3692b5020c3ed920683c3393cfc828943d uses the setTaskStates hook in RTT to get properly ordered state changes (see the corresponding commit in RTT 07d7922ba1f0ced3f0288652edca8c6bbd69200f). This causes some state change notifications to be duplicated (notified once by orogen and once by RTT)
This partially reverts commit c7f516d. cf252a4 changed the name of the "core" read method. This method was overloaded in OutputReader to raise if the remote process was dead, and disconnect the read side. I think the boat has sailed on {#read} not raising, and I'm not willing to re-introduce that behavior. However, I do believe the disconnect_all is a good move: it cleans up, and makes sure that a caller that cares does get notified that the reader is dead.
By reading the IOR of a task context from a pipe, we are not limited by CORBA name service to resolve the tasks. The wait running method must be called to register the IORs for each task in the process. This is done by reading the message from the pipe, validating it and saving it as a Hash. Later, this Hash is used to resolve the tasks when the resolve_all_tasks method is called. Since this method might be called before wait_running is ran, a nonblocking wait_running is called in it to ensure that the IORs are registered. Overall, the expected pipeline for Syskit is: The process client calls the process server, which spawns the Orocos::Process. On spawn, the IOR pipe is created. Then, the execution engine triggers the deployment ready resolution, which calls for the process' wait_running, expecting the IOR mappings to be resolved. They are registered and then resolve_all_tasks is called. If the IORs are valid, the deployment resolve the remote task handles and hopefully is becomes ready.
This implements for ruby tasks what was done for Orocos::Process. Since the ruby tasks already know their IOR from the beginning, all this does is look for each task's IOR on the deployed tasks map. The pipeline is the same as the Orocos::Process
feat: read IOR of a task context from pipe
chore: remove obsolete log management code
… threads Turns out that the async code was actually doing a lot of sync stuff behind the scenes, which was causing very noticeable lag and blocking behaviors when on bad connections. The first change was to create main_thread_call helper. One marks a "main thread only" method like so ~~~ main_thread_call def some_method end ~~~ and method calls not in the main thread will cause an exception to be raised. This is rather obviously meant for debugging. From there, I've fixed calls that were coming from a worker thread but should not have. In most cases, the pattern I had to change was ~~~ def some_method @event_loop.defer ... do do some async call in sync form (i.e. without blocks) end end ~~~ This was causing a lot of headaches as the code that turns async calls into sync calls depends on a lot of synchronization (e.g. with @delegator_obj), which was solved by having huge @mutex.synchronize blocks, serializing major parts of the execution. These calls have been changed into the non-blocking block form (a.k.a. callback form), and are called from the main thread. One major change is the removal of sub-genres of Async::NameService. These were rather pointless, as turning a sync into an async object can be done generically through to_async. The "new" Async::NameService simply delegates to the non-async get in a worker thread and then turns the non-async object into an async object. Another major change is how Async::TaskContext creates itself. The very unfortunate design choice in the async and proxy version of TaskContext was to let them get a name as argument, and let them do the resolution. This brings a lot of complexity in the async case, as there is a zone where the async task context does not have an underlying task context. Proxies are meant for that, not async. Generally speaking, this interface should be changed to do the name resolution in name_service (duh).
There was previously no way to make a task context proxy completely stop working.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Turns out that the async code was actually doing a lot of sync
stuff behind the scenes, which was causing very noticeable lag
and blocking behaviors when on bad connections.
The first change was to create main_thread_call helper. One marks
a "main thread only" method like so
and method calls not in the main thread will cause an exception to
be raised. This is rather obviously meant for debugging.
From there, I've fixed calls that were coming from a worker thread
but should not have. In most cases, the pattern I had to change was
This was causing a lot of headaches as the code that turns async
calls into sync calls depends on a lot of synchronization (e.g.
with @delegator_obj), which was solved by having huge
@mutex.synchronize blocks, serializing major parts of the execution.
These calls have been changed into the non-blocking block form
(a.k.a. callback form), and are called from the main thread.
One major change is the removal of sub-genres of Async::NameService.
These were rather pointless, as turning a sync into an async object
can be done generically through to_async. The "new" Async::NameService
simply delegates to the non-async get in a worker thread and then
turns the non-async object into an async object.
Another major change is how Async::TaskContext creates itself. The
very unfortunate design choice in the async and proxy version of
TaskContext was to let them get a name as argument, and let them
do the resolution. This brings a lot of complexity in the async
case, as there is a zone where the async task context does not
have an underlying task context. Proxies are meant for that, not
async. Generally speaking, this interface should be changed to
do the name resolution in name_service (duh).