Skip to content

Attempt to more consistently flush logs #72

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jul 28, 2024
Merged

Conversation

RealOrangeOne
Copy link
Member

Might be a little overkill, but hopefully this will flush logs more reliably once code finished.

The decision to mount the USB drive sync is still one worth having, but has some more caveats to consider.

Copy link
Contributor

@PeterJCLaw PeterJCLaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable, though needs testing.

@WillB97
Copy link
Contributor

WillB97 commented Aug 10, 2023

In future we should look at using os.fsync() instead of mounting drives with sync (sourcebots/robot-image#16)

@RealOrangeOne
Copy link
Member Author

we should look at using os.fsync()

I agree. I've implemented this now. I'm a little concerned that all these sync calls will slow down the drive, but we're not writing much, and it's all in a single thread, so it should be ok.

Copy link
Contributor

@PeterJCLaw PeterJCLaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As before: looks good, presumably needs testing.

@WillB97
Copy link
Contributor

WillB97 commented Jul 6, 2024

The fsync call causes only the first line to be logged.

@RealOrangeOne
Copy link
Member Author

I'm confused (and a little concerned) as to why the fsync call results in only 1 log line being saved. It should be flushing after each line?

@WillB97
Copy link
Contributor

WillB97 commented Jul 10, 2024

Specifically it looks like we only print one line when the code has crashed early on because we are not keeping up with the stdout stream.

Jul 10 22:19:25 robot runusb[732]: INFO:runusb:Found new mountpoint: /media/robot/Keychain (USBType.ROBOT)
Jul 10 22:19:25 robot runusb[732]: INFO:runusb:Starting user code
Jul 10 22:19:25 robot runusb[732]: INFO:runusb:  -> launched handler
Jul 10 22:19:26 robot runusb[732]: USERCODE:usercode:Traceback (most recent call last):
Jul 10 22:19:27 robot runusb[732]: WARNING:usercode:Process exited with code 1
Jul 10 22:19:27 robot runusb[732]: Exception in thread Thread-2 (_log_output):
Jul 10 22:19:27 robot runusb[732]: Traceback (most recent call last):
Jul 10 22:19:27 robot runusb[732]:   File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
Jul 10 22:19:27 robot runusb[732]:     self.run()
Jul 10 22:19:27 robot runusb[732]:   File "/usr/lib/python3.11/threading.py", line 975, in run
Jul 10 22:19:27 robot runusb[732]:     self._target(*self._args, **self._kwargs)
Jul 10 22:19:27 robot runusb[732]:   File "/usr/local/lib/python3.11/dist-packages/runusb/__main__.py", line 329, in _log_output
Jul 10 22:19:27 robot runusb[732]:     for line in iter(pipe.readline, ''):
Jul 10 22:19:27 robot runusb[732]: ValueError: I/O operation on closed file.

The correct output (with the fsync removed)

Jul 10 22:21:51 robot runusb[1189]: INFO:runusb:Found new mountpoint: /media/robot/Keychain (USBType.ROBOT)
Jul 10 22:21:51 robot runusb[1189]: INFO:runusb:Starting user code
Jul 10 22:21:51 robot runusb[1189]: INFO:runusb:  -> launched handler
Jul 10 22:21:52 robot runusb[1189]: USERCODE:usercode:Traceback (most recent call last):
Jul 10 22:21:52 robot runusb[1189]: USERCODE:usercode:  File "/media/robot/Keychain/robot.py", line 5, in <module>
Jul 10 22:21:52 robot runusb[1189]: USERCODE:usercode:    raise RuntimeError()
Jul 10 22:21:52 robot runusb[1189]: USERCODE:usercode:RuntimeError
Jul 10 22:21:52 robot runusb[1189]: INFO:runusb:Process output finished
Jul 10 22:21:52 robot runusb[1189]: WARNING:usercode:Process exited with code 1

@WillB97
Copy link
Contributor

WillB97 commented Jul 10, 2024

Using write_through seems to mostly fix the buffering issue and doesn't seem to show the issue with not printing stacktraces.

@RealOrangeOne
Copy link
Member Author

Given we explicitly run Python unbuffered, I'm surprised write_through makes a difference - but it can't hurt to be explicit here (in case there are other optimisations Python can make by avoiding any buffering code here).

@WillB97 WillB97 force-pushed the feature/flush-logs branch from 0b6c460 to 2dc1b54 Compare July 14, 2024 11:38
Copy link
Member Author

@RealOrangeOne RealOrangeOne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️

@WillB97 WillB97 merged commit 054ed12 into master Jul 28, 2024
2 checks passed
@WillB97 WillB97 deleted the feature/flush-logs branch July 28, 2024 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants