-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add file streams for teeing Job stdout/err to disc #1456
base: master
Are you sure you want to change the base?
Conversation
does anything in the |
It shouldn't invalidate anything but it will add a new feature. If Sam want's to take that, I'm he's welcome to, otherwise I'll do it in a follow up PR to reduce the load on this PR so we can get the feature out the door. |
I can definitely update that as part of this, once I get the implementation side of things done to know what I'll be updating it with. Thanks for pointing it out! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything is looking good!
for (auto &entry : teefiles) { | ||
entry.second->close(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These close on their own, no need to write this part. Currently there are no release order deps so we don't need to be explicit like I was above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know!
@@ -983,6 +1013,10 @@ bool JobTable::wait(Runtime &runtime) { | |||
entry->job->db->save_output(entry->job->job, 1, buffer, got, entry->runtime(now)); | |||
if (!imp->batch) { | |||
entry->stdout_linebuf->sputn(buffer, got); | |||
for (auto &teefile : entry->stdout_teefiles) { | |||
auto &tee_fd = imp->teefiles.at(teefile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We generally use the operator[]
and not at()
in this code base
} | ||
delete[] stdout_teefiles; | ||
delete[] stderr_teefiles; | ||
|
||
std::shared_ptr<JobEntry> entry = std::make_shared<JobEntry>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to store an std::vector<std::ostream*>
that acts as a list of weak pointers to the unique pointers stored by the job table that outlives these references. That way you don't have to do the job table string lookup every time, only at job launch time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto stdout_tee_names = std::vector<std::ostream*>();
for (const auto& file : stdout_teefiles) {
auto iter = jobtable...->teefiles.find(file);
if (iter == ... .end()) {
auto out = std::make_unique<std::ofstream>(stdout_teefiles[i], std::ios::out);
iter = jobtable->imp->teefiles[stdout_teefiles[i]].emplace(file, std::move(out)).first
}
stdout_tee_names.push_back(iter->second.get());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a nice little tweak that makes me glad I don't write C++ on the regular...
src/runtime/job.cpp
Outdated
for (int i = 0; stdout_teefiles[i]; ++i) { | ||
stdout_tee_names.push_back(stdout_teefiles[i]); | ||
if (jobtable->imp->teefiles.find(stdout_teefiles[i]) == jobtable->imp->teefiles.end()) { | ||
jobtable->imp->teefiles[stdout_teefiles[i]] = std::make_unique<std::ofstream>(stdout_teefiles[i], std::ios::out | std::ios::trunc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should do an unlink after this so that we get a new inode instead of modifying the existing file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing that function in the C++ reference I'm using, though I do recognize what it means in context. Mind giving me a bit more guidance what that would look like in code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unlink is a posix function: https://man7.org/linux/man-pages/man2/unlink.2.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently it's a bit less complete a reference than I expected, thanks! (They could at least have included it and marked it as system dependent somehow...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unlink(file.c_str());
Is all you would need but its probably best to only ignore the error if ENOENT is returned. Everything else should be logged as a warning
if (unlink(file.c_str()) < 0 && errno != ENOENT) {
wcl::log::warning("Could not unlink %s while creating a new file for tee-ing: %s", file.c_str(), strerror(errno))();
}
For logs, etc., it's desirable to write Job outputs to known files in addition to the existing streams mechanisms -- for example, directing specific Jobs to different locations, and writing those logs even if Wake is killed partway through.