Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace task run() return result with custom enum #2429

Open
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

MitchTurner
Copy link
Member

@MitchTurner MitchTurner commented Nov 12, 2024

Linked Issues/PRs

closes: #2441

Description

Before any RunnableTask would return a Result<bool> where, implicitly, the bool was telling the service runner if it should re-run the task.

This is confusing from the customer perspective since you need to understand the behavior of ServiceRunner in order to choose the right response: i.e. Ok(true) meant you want to rerun. Instead this PR adds the Continue and Stop variants.

It was also confusing with Error since, unintuitively, an Error would result in the task being run again. This PR adds the ErrorContinue variant, so the implementer know explicitly what the behavior is.

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

After merging, notify other teams

[Add or remove entries as needed]

@@ -137,8 +138,7 @@ impl RunnableService for SyncTask {

#[async_trait::async_trait]
impl RunnableTask for SyncTask {
#[tracing::instrument(level = "debug", skip_all, err, ret)]
Copy link
Member Author

@MitchTurner MitchTurner Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to remove this from a couple locations since the instrument macro assumes the function returns Result

@@ -196,11 +197,15 @@ impl RunnableService for GraphqlService {

#[async_trait::async_trait]
impl RunnableTask for Task {
async fn run(&mut self, _: &mut StateWatcher) -> anyhow::Result<bool> {
self.server.as_mut().await?;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces a non-trivial amount of noise with the removal of the ? operators.

AFAICT, there isn't a way to implement Try for arbitrary types in Rust which is a real bummer.

@MitchTurner MitchTurner marked this pull request as ready for review November 12, 2024 11:26
}
}
}
should_continue = true;
}
}

Ok(should_continue)
TaskRunResult::should_continue(should_continue)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though the idea was to remove should_continue variable and replace with the usage of the enum=D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My goal was to change the interface.

I'm happy to do some more refactoring too.

let da_block_costs = self.source.request_da_block_cost().await?;
self.shared_state.0.send(da_block_costs)?;
continue_running = true;
match self.source.request_da_block_cost().await.and_then(|da_block_costs| self.shared_state.0.send(da_block_costs).map_err(|err| anyhow::anyhow!(err))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is very hard to read, maybe we could it inside of the function and process the result of the function?=)

Comment on lines 134 to 139
let block = match l2_block_res {
Ok(block) => block,
Err(err) => {
return anyhow!(err).into()
}
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this pattern in many places. Maybe it would be simpler to defined a helper trait with into_continue_task_result which either returns TaskRunResult::Continue or TaskRunResult::ErrorContinue

@MitchTurner MitchTurner self-assigned this Nov 14, 2024
tracing::debug!("stopping");
break;
}
TaskRunResult::ErrorContinue(e) => {
let e: &dyn std::error::Error = &*e;
tracing::error!(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to add a debug! statement to explicitly tell that the loop will continue (similar to the Continue case)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. Are you talking in the ErrorContinue case? I feel like we should definitely use error! there, especially since we don't exit on errors.

Comment on lines 247 to 248
let res = self.process_da_block_costs_res(da_block_costs).await;
TaskRunResult::Continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have error here, you will not propagate it to the caller of the run function and we will miss log for this error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching this.

@@ -916,7 +915,12 @@ where
}
Some(TaskRequest::RespondWithGossipsubMessageReport((message, acceptance))) => {
// report_message(&mut self.p2p_service, message, acceptance);
self.p2p_service.report_message(message, acceptance)?;
match self.p2p_service.report_message(message, acceptance) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] if let Err(err) = res fits better here.


impl TaskRunResult {
/// Creates a `TaskRunResult` from a `Result` where `Ok` means `Continue` and any error is reported
pub fn continue_if_ok<T, E: Into<anyhow::Error>>(res: Result<T, E>) -> TaskRunResult {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continue_if_ok is a bad name. I thought it returns Continue only If it is Ok.

What do you think about always_continue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it returns Continue only If it is Ok.

Yeah. It returns continue_ only if_ it is ok. I doesn't always_ return continue... I'm confused why you want me to change it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns Continue or ErrorContinue, which continues the task even if an error occurred.

From the function with name continue_if_ok I would expect return Continue on Ok and return Stop on Err

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now. That is confusing. I feel like "always continue" is confusing for the opposite reason though. "continue or error"? Haha

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just continue then?=) For me, the main point here is that we should still continue the task. What happens with error is not important for me as a reader of the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think you're right. I think that's what the caller is interested in. I'll go with one of those.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like TaskRunResult is the wrong name then. It's almost a state machine but not quite. But TaskState or TaskStateTransition might be a better name...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with always_continue and TaskNextAction (I asked ChatGPT for a good name and that was the one I liked the most)

xgreenx
xgreenx previously approved these changes Nov 18, 2024
@MitchTurner MitchTurner requested a review from a team November 18, 2024 12:41
rafal-ch
rafal-ch previously approved these changes Nov 19, 2024
Copy link
Contributor

@rafal-ch rafal-ch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍
with some nits/suggestions

Anyway, good piece of QoL code :)

crates/services/src/service.rs Show resolved Hide resolved
crates/services/consensus_module/poa/src/service.rs Outdated Show resolved Hide resolved
crates/services/p2p/src/service.rs Show resolved Hide resolved
@@ -568,16 +568,19 @@ where
// de-synchronization between on-chain and off-chain databases.
if let Err(e) = result {
tracing::error!("Error processing block: {:?}", e);
should_continue = self.continue_on_error;
if self.continue_on_error {
TaskNextAction::Continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not ErrorContinue here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Good idea. I was going to say that I want to maintain the old behavior, but that's essentially the same and cleaner.

acerone85
acerone85 previously approved these changes Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Have Task return some business-related type rather than Result<bool>
4 participants