Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add file streams for teeing Job stdout/err to disc #1456

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ag-eitilt
Copy link
Collaborator

For logs, etc., it's desirable to write Job outputs to known files in addition to the existing streams mechanisms -- for example, directing specific Jobs to different locations, and writing those logs even if Wake is killed partway through.

src/runtime/job.cpp Outdated Show resolved Hide resolved
src/runtime/job.cpp Outdated Show resolved Hide resolved
src/runtime/job.cpp Outdated Show resolved Hide resolved
src/runtime/job.cpp Outdated Show resolved Hide resolved
@mwachs5
Copy link
Contributor

mwachs5 commented Nov 3, 2023

does anything in the logging.md need to get updated for this change?

@JakeSiFive
Copy link
Contributor

JakeSiFive commented Nov 6, 2023

does anything in the logging.md need to get updated for this change?

It shouldn't invalidate anything but it will add a new feature. If Sam want's to take that, I'm he's welcome to, otherwise I'll do it in a follow up PR to reduce the load on this PR so we can get the feature out the door.

@ag-eitilt
Copy link
Collaborator Author

I can definitely update that as part of this, once I get the implementation side of things done to know what I'll be updating it with. Thanks for pointing it out!

Copy link
Contributor

@JakeSiFive JakeSiFive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is looking good!

Comment on lines +350 to +352
for (auto &entry : teefiles) {
entry.second->close();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These close on their own, no need to write this part. Currently there are no release order deps so we don't need to be explicit like I was above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know!

@@ -983,6 +1013,10 @@ bool JobTable::wait(Runtime &runtime) {
entry->job->db->save_output(entry->job->job, 1, buffer, got, entry->runtime(now));
if (!imp->batch) {
entry->stdout_linebuf->sputn(buffer, got);
for (auto &teefile : entry->stdout_teefiles) {
auto &tee_fd = imp->teefiles.at(teefile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We generally use the operator[] and not at() in this code base

}
delete[] stdout_teefiles;
delete[] stderr_teefiles;

std::shared_ptr<JobEntry> entry = std::make_shared<JobEntry>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to store an std::vector<std::ostream*> that acts as a list of weak pointers to the unique pointers stored by the job table that outlives these references. That way you don't have to do the job table string lookup every time, only at job launch time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    auto stdout_tee_names = std::vector<std::ostream*>();
    for (const auto& file : stdout_teefiles) {
      auto iter = jobtable...->teefiles.find(file);
      if (iter == ... .end()) {
          auto out = std::make_unique<std::ofstream>(stdout_teefiles[i], std::ios::out);
          iter = jobtable->imp->teefiles[stdout_teefiles[i]].emplace(file, std::move(out)).first
      }
      stdout_tee_names.push_back(iter->second.get());
    }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice little tweak that makes me glad I don't write C++ on the regular...

for (int i = 0; stdout_teefiles[i]; ++i) {
stdout_tee_names.push_back(stdout_teefiles[i]);
if (jobtable->imp->teefiles.find(stdout_teefiles[i]) == jobtable->imp->teefiles.end()) {
jobtable->imp->teefiles[stdout_teefiles[i]] = std::make_unique<std::ofstream>(stdout_teefiles[i], std::ios::out | std::ios::trunc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do an unlink after this so that we get a new inode instead of modifying the existing file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing that function in the C++ reference I'm using, though I do recognize what it means in context. Mind giving me a bit more guidance what that would look like in code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently it's a bit less complete a reference than I expected, thanks! (They could at least have included it and marked it as system dependent somehow...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unlink(file.c_str());

Is all you would need but its probably best to only ignore the error if ENOENT is returned. Everything else should be logged as a warning

if (unlink(file.c_str()) < 0 && errno != ENOENT) {
    wcl::log::warning("Could not unlink %s while creating a new file for tee-ing: %s", file.c_str(), strerror(errno))();
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants