-
Notifications
You must be signed in to change notification settings - Fork 3
Add checksum verification for all *basic* modules #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi, Marco! Thanks for the interest in contributing! Sorry for the delay on giving you any feedback on this. We're currently going through a accelerator commissioning after a two-month shutdown. In about two weeks, beamlines will be having beam for initial tests. In summary, things are a bit busy by now.
AFAIK, this is what different Linux distributions do, including ArchLinux, VoidLinux (possibly with a helper script).
Debian goes that direction, I'd say. Source packages contain SHA1 and SHA256 checksum, but the developer does not need to get into the specifics on how to generate and fill them, since packaging tools will do the job. (It does require you to have the tarball locally to be able to do so, nevertheless). I think getting this to be done by an automatic script should be easier than documenting the specifics on how to do so.
That's nice! I haven't looked your code thoroughly, but I did notice three things I think can be addressed beforehand:
These things might seem not important at first, but in the long run they make a huge difference, since the Git history will be full of meaningful rationale that can easily queried by For the changes themselves, I'll see if I can go through them soon. |
715f556 to
1725272
Compare
|
Greetings!
Done!
I did that in the titles for now. Will do better in the messages after I fix the next step:
For now I simply squashed the most obvious patch which was a fix to an obvious broken command. However, the commits are definitely not atomic since I had initially decided to add a new lnls-get-n-unpack from scratch and replace the old one with the new after it was all ready. I had decided to do that because although it's a simple bash script, at the time I didn't feel I could turn the old script into the new one in an small-increment series of commits. What I intend to do now is to reorganize the commits in such a way that they still create a new lnls-get-n-unpack from scratch, but replace it with the new one only after it is actually finished (what I should have done from the start, anyway). Please tell me if that is acceptable or if you require the original script to be modified in small increments for the PR to be accepted. Cheers! |
|
Hey Marco!
Talking with @henriquesimoes and @ericonr we agreed that it's better if the changes are made in the original script to preserve the commit history from previous patches. |
1725272 to
641ea36
Compare
As suggested by @gustavosr8 here: cnpem#80 (comment)
50e7ae8 to
dd820bc
Compare
|
Hi, @gustavosr8! I made the suggested changes, or at least as long as I can tell. Please feel free to review the current code. I tested both builds here and they seem to work. Please let me know if there is anything else that should be done. |
|
Hey Marco! I think that a rebase is missing. We added support for modbus in the last release, and the CI fails when building this module.
About that, WDYT on making a previous commit just adding the module's SHA256 and mentioning that it will be used in the future, separating it from the commit where you actually implement the SHA check and use it in the download and install functions? |
dd820bc to
acc6ecc
Compare
|
I apparently did a wrong force push after the rebase to add modbus and erased everything 🫠
Yup, seem fine to me. I'm a bit bothered by the thought that it won't be a "completely atomic" commit, but it won't break the build anyway. |
As suggested by @gustavosr8 here: cnpem#80 (comment)
75cc143 to
c513928
Compare
|
Managed to recover the commits after rebasing to add latest changes. |
|
Hey poor mortal! |
|
With #88 merged, we can avoid repeating all those variable names! Just move the checksum variables into the relevant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, Marco!
I hope this message finds you well.
After being late almost a year now, I have finally taken the opportunity to review your changes carefully. I've made several comments. I didn't review the hashes themselves (nor the way they are introduced), because several of them will be different now, and they should be placed in the version files given the latest directory structure.
Please, let us know if you are still interested on working on this. If not, we can rework the patches ourselves and keep the appropriate authorship information.
Thanks for first working on this.
base/lnls-get-n-unpack.sh
Outdated
| -l) dest=. ;; | ||
| *) >&2 echo "Invalid extraction mode: must be either root (-r) or local (-l)." | ||
| exit 1; | ||
| exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: about this change's commit message: I often see a trailer for specifying who suggested the change (when that's relevant). The message would look like this:
base: remove unneeded ';' from lnls-get-n-unpack.
Suggested-by: Gustavo de Souza dos Reis <[email protected]>
If there was a really important discussion in the provided link, it should also be summarized in the message itself. Otherwise, the project's Git history would depend on the availability of the external platform (in this case, GitHub) to be understandable.
In general, I prefer to place links in their own line at the end of the message (see 76c2b86 for an example), pretty much like bibliographies. That makes it easier to read the message itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in commit 8bce3ff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nitpicking a bit more. They are usually used as trailers, which means they must come last. I don't know a proper way to provide a URL to the reference your usage of a trailer. I think a better way is to include a very brief mention to the message body itself, which is linked instead. Something like this:
base: remove unneeded ';' from lnls-get-n-unpack.
Command separator is not needed if there are no commands afterwards.
Drop it as suggested during review [1].
[1] https://github.com/cnpem/epics-in-docker/pull/80#discussion_r1840862700
Suggested-by: Gustavo de Souza dos Reis <[email protected]>
Note that I've also rewritten the body as a single sentence.
I'm just nitpicking on this because I think it will be valuable for your future contributions. That's certainly an overkill for this typo change; although it is a quite didactic example.
|
Hi! I did lose traction with this, but it would sure be nice to finish. The reminder is very much welcome :) I will work on this no later than until the end of this weekend. |
Good to know you're still somewhat interested. ;)
No hurry. I just found some time to review it after a long period of time. It's not a priority for us at all right now. By the way, that's why it took us so long to give you further feedback. Take your time to iterate over the review comments carefully. If we ever consider this a priority in our end, we'll let you know. We do want your contributions to be integrated nevertheless. We are considering it part of the roadmap for the v1.0.0 release, which doesn't have a target date. |
Suggested-by: Gustavo de Souza dos Reis <[email protected]> [1]. Command separator. Not needed if there are no commands afterwards. [1] cnpem#80 (comment)
Suggested-by: Gustavo de Souza dos Reis <[email protected]> [1]. Command separator. Not needed if there are no commands afterwards. [1] cnpem#80 (comment)
c513928 to
5c591e3
Compare
5c591e3 to
0c7694d
Compare
Suggested-by: Gustavo de Souza dos Reis <[email protected]> [1]. Command separator. Not needed if there are no commands afterwards. [1] cnpem#80 (comment)
11b4c6b to
b53205b
Compare
|
So, about the extra blank lines as I said in previous comments the rebase+forcepush confused me. I removed every extra line immediately after function init and before funtion end, but kept blank lines between blocks that seem to execute different tasks. If you prefer to have every blank line removed or any other criteria, please feel free to open more discussions and I will make sure to not get lost again with the diffs. |
|
My pipeline seems to be failing although locally everything works... I'm wonderin if this could have something to do with access to ghcr? |
Hi @marcomontevechi1! Yeah, I had to allow it to run, but seems to be passing without any issues now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit on e289f22
s/identated/indented
For reference: https://www.merriam-webster.com/dictionary/indent
b53205b to
a7a501c
Compare
Okay, fixed in commit message 5301220 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is getting into good shape. I'm leaving some minor suggestions in the threads, mostly nitpicks.
I'm still planning on properly reviewing the hashes variables. I just skimmed through them as of now. But they should be good, as the image builds fine.
Besides, here goes some commit message comments:
commit 5301220
Author: marcofilho [email protected]
Date: Wed Nov 20 22:50:59 2024 +0100base: encapsulate lnls-get-n-unpack in functions. This script will grow a bit, so I organize it before making it grow.
"so I organize it [...]" -> "so organize it [...]". As with the title, the
commit message can give orders to the codebase. This way messages becomes less
verbose, as we don't need to say things like "this commit", "these changes",
etc. It also makes clear what is the intended effect of applying the commit.
That's an approach the Git developers use as a convention internally, for instance.
Function content is indented. Functions parse_arguments and download are used at the end of script. Help function is added to be called when script is misused. Format of arguments passed to script will change in future commits, so this might help clarify when wrong usage is made. Similar to common ArgParse usage in python, parse_arguments decides
Suggestion: "common" -> "usual", as it is not intended to mean "shared usage",
but actually "frequent usage".
Nit: "ArgParse" -> "argparse" (as it is the module name) or "ArgumentParser" as
the class name. I'm not sure which one you intended to refer to.
Nit: "python" -> "Python"
which arguments will be passed to download.
commit f8356a1
Author: marcofilho [email protected]
Date: Sat Oct 18 16:50:15 2025 +0200base: add sha256 expected hashes structure.
Suggestion: "base: add SHA-256 hashes for existing modules."
Manually downloaded and checked each sha256sum for each epics module that is downloaded with lnls-get-n-unpack. Added the expected sums to each *_version.sh file.
"Each EPICS module that is downloaded with lnls-get-n-unpack was manually
downloaded and had its sha256sum calculated. The expected sums are added to the
corresponding *_version.sh files."
Starting with "Manually downloaded and checked" seems to be missing a subject.
Reworking as a passive voice makes the problem disappear without having to
indicate one (which is unusual in these messages, I'd say).
The variables are not actually used in this commit. Next commit shall add the actual sha256sum checking.
base/lnls-get-n-unpack.sh
Outdated
| -l) dest=. ;; | ||
| *) >&2 echo "Invalid extraction mode: must be either root (-r) or local (-l)." | ||
| exit 1; | ||
| exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nitpicking a bit more. They are usually used as trailers, which means they must come last. I don't know a proper way to provide a URL to the reference your usage of a trailer. I think a better way is to include a very brief mention to the message body itself, which is linked instead. Something like this:
base: remove unneeded ';' from lnls-get-n-unpack.
Command separator is not needed if there are no commands afterwards.
Drop it as suggested during review [1].
[1] https://github.com/cnpem/epics-in-docker/pull/80#discussion_r1840862700
Suggested-by: Gustavo de Souza dos Reis <[email protected]>
Note that I've also rewritten the body as a single sentence.
I'm just nitpicking on this because I think it will be valuable for your future contributions. That's certainly an overkill for this typo change; although it is a quite didactic example.
| PVXS_SHA256=14936dda59e81a2252e1da3cf147a038cc420e4510df30c3aeb2bf1113641555 | ||
| P4P_SHA256=2b89d7605ae35dc7edad2a36cf6b1d395170698b6d7fdfc3811a8d3ed5183bab |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about having *_SHA256 variables interleaved with the *_VERSION variables? This way, all variables related to a module would be close together. This should be easier to edit them both when upgrading to a new module release.
As this is a tiresome thing to do (and undo), please wait for @gustavosr8's and @ericonr's opinions before applying this suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, having the same module's variables together seems more direct to tell if something is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downloading the packages and checking the hashes is more work than organizing them in the files... at least that's how it feels for me.
I added them as interleaved variables as suggested but it would be no trouble to put it back on the way it was before if you guys come to the conclusion that it's better that way.
I initially liked the ideas of having them in separate blocks by function (sha or version variables), but then there is the effort of checking if they are in the same order...
| SEQUENCER_SHA256=f5ebecdb231e106bb83db9a5fc877adb03bfd119e879a3668fdfc33d0aacb397 | ||
| CALC_SHA256=5cf1a7b3d444e763eb96ca5b9cdbcb9c29f5a6f9ac2b8d9cdb17a007d3fa8347 | ||
| ASYN_SHA256=47e993aeb300c597fcb0c3df6d3c88b9dd9e9fb90600da84cb2d2dc5b59a31aa | ||
| MODBUS_SHA256=c890feb00844c9bb15525307b6c3322832d10f504d288f0f650b6ea987483e7c | ||
| STREAMDEVICE_SHA256=e0640f00cd23ddd6015091d4b4e8e43a21d9e9e31d9639a5d0da6b187f42eb79 | ||
| BUSY_SHA256=1a09675bb69cdb09157b06d7276c4e4a9db8ca7e257108529e380c55452e4a53 | ||
| AUTOSAVE_SHA256=766dd7a8f71529f48d8122a3655f7986004b62ba589342153fa5a47f59119903 | ||
| SSCAN_SHA256=6911e114b07b3c200db781750035237d7dd494130f360e278df950a8358d6d78 | ||
| RECCASTER_SHA256=7108963bfa6c74d9571fe808ffa4312c1d562b30f575fb6b102fb20bc910aeb8 | ||
| IPAC_SHA256=4bd404eb9a205a32e6e25730115f6a1ab201dac69454136d4b570f880843f726 | ||
| CAPUTLOG_SHA256=6b85137906ed44a1f15358ab32fa9b4c450b37c495c65410606ea660ef4feb4c | ||
| RETOOLS_SHA256=676e2d325fe43126bf2762fa003450bc4c5ed33002ff42761a1d8c9a1f743bf7 | ||
| ETHER_IP_SHA256=ccf441ab842c8d24cea5b912b210852a4dc2fc005391a64f14de2ed586644155 | ||
| IOCSTATS_SHA256=13fbca066bdb34f1d84641a1b3f9fc505729866963f1e7d6d66dbc55b32a2cc8 | ||
| IPMICOMM_SHA256=7e526601461d7222834219a0f081a0550932b340eb945e515ffae94febe429f8 | ||
| PYDEVSUP_SHA256=3280cf3c4e9355f34841bc5ec38d0e5337783ea1678f31f7559b723ba8f4489b | ||
| SNMP_SHA256=f190b807aecd7d319e58263bca2ff883f891496793b68c871ada48a192d695a3 | ||
| SCALER_SHA256=faad6df4a71922ad6dcbd2d73020fad301ecd32e6e4e31ca0b0e4fd013c3bbce | ||
| MCA_SHA256=dddee716247e97e61f2e5a4bad07966ee00a7c107aed6f2d697b27337a44b9bc | ||
| LAKESHORE_SHA256=29b2a57583f69b02289368bcf76debe4a872380400fd4e309d1be320ec38d51e | ||
| LAKESHORE340_SHA256=3a5ac79abcce29d62d4eb6a16a63dc297594358512e3c90a96e2417ba40f7fd5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this tiresome work of (re)calculating all hashes for existing modules. This is pretty much appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problemo :)
I kinda chose this task because it was less complex and more of a manual thing anyway.
I thought for a minute about adding a helper script as discussed in previous comments, but I'm not so sure now...
A helper script that automatically downloads and checks everything seems to kinda break the whole purpose of checking the SHAs. On the other hand, an atomic script that only downloads one module using the specified version seems unnecessary code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda chose this task because it was less complex and more of a manual thing anyway.
Okay. Good to know. :)
A helper script that automatically downloads and checks everything seems to kinda break the whole purpose of checking the SHAs.
It depends. One way of approaching this is to inspect the tarball content before calculating its hash. But that's true when manually generating the hashes as well. Another approach is to simply trust the first seen tarball, and only ensure that is the same from then on. We still need to discuss which approach works the best for us, but both of them already solve one problem: attacks that attempt to change a tarball already in use without us noticing it.
On the other hand, an atomic script that only downloads one module using the specified version seems unnecessary code.
Agreed.
This script will grow a bit, so organize it before making it grow. Function content is indented. Functions parse_arguments and download are used at the end of script. Add help function to be called when script is misused. Format of arguments passed to script will change in future commits, so this might help clarify when wrong usage is made. Similar to usual ArgumentParser usage in Python, parse_arguments decides which arguments will be passed to download.
Each EPICS module that is downloaded with lnls-get-n-unpack was manually downloaded and had its sha256sum calculated. The expected sums are added to the corresponding *_version.sh files. The variables are not actually used in this commit. Next commit shall add the actual sha256sum checking.
Compare SHA-256 hash sum for each downloaded archive file against previously known SHA-256 values to ensure files are what they are supposed to be. This impacts how new modules are introduced, as they are required to define the expected SHA-256 along with the installation procedure, preferably on the adequade *_versions.sh files.
a7a501c to
fd3f451
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just two more minor comments in the threads.
In general, the only thing is missing is documentation now. I think the procedure on how to deal with checksum verification is valuable thing to be documented. In Érico's new PR, we open room in a document for writing such things.
We still need to discuss which approach is the best: recommending some kind of manual inspection of the tarballs upon upgrade/addition or using the Trust on First Use (TOFU) approach.
| shift | ||
| echo "$dest $@" # Throw extraction mode argument away |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this comment should be above the shift or be dropped. ;)
| install_from_github epics-modules pcas PCAS $PCAS_VERSION " | ||
| EPICS_BASE | ||
| " | ||
| " \ | ||
| $PCAS_SHA256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Something I hadn't thought before: placing it right after _VERSION variables is cleaner, isn't it?
It should just affect install_from_github, since download_from_github and others already take it that way.
| SEQUENCER_SHA256=f5ebecdb231e106bb83db9a5fc877adb03bfd119e879a3668fdfc33d0aacb397 | ||
| CALC_SHA256=5cf1a7b3d444e763eb96ca5b9cdbcb9c29f5a6f9ac2b8d9cdb17a007d3fa8347 | ||
| ASYN_SHA256=47e993aeb300c597fcb0c3df6d3c88b9dd9e9fb90600da84cb2d2dc5b59a31aa | ||
| MODBUS_SHA256=c890feb00844c9bb15525307b6c3322832d10f504d288f0f650b6ea987483e7c | ||
| STREAMDEVICE_SHA256=e0640f00cd23ddd6015091d4b4e8e43a21d9e9e31d9639a5d0da6b187f42eb79 | ||
| BUSY_SHA256=1a09675bb69cdb09157b06d7276c4e4a9db8ca7e257108529e380c55452e4a53 | ||
| AUTOSAVE_SHA256=766dd7a8f71529f48d8122a3655f7986004b62ba589342153fa5a47f59119903 | ||
| SSCAN_SHA256=6911e114b07b3c200db781750035237d7dd494130f360e278df950a8358d6d78 | ||
| RECCASTER_SHA256=7108963bfa6c74d9571fe808ffa4312c1d562b30f575fb6b102fb20bc910aeb8 | ||
| IPAC_SHA256=4bd404eb9a205a32e6e25730115f6a1ab201dac69454136d4b570f880843f726 | ||
| CAPUTLOG_SHA256=6b85137906ed44a1f15358ab32fa9b4c450b37c495c65410606ea660ef4feb4c | ||
| RETOOLS_SHA256=676e2d325fe43126bf2762fa003450bc4c5ed33002ff42761a1d8c9a1f743bf7 | ||
| ETHER_IP_SHA256=ccf441ab842c8d24cea5b912b210852a4dc2fc005391a64f14de2ed586644155 | ||
| IOCSTATS_SHA256=13fbca066bdb34f1d84641a1b3f9fc505729866963f1e7d6d66dbc55b32a2cc8 | ||
| IPMICOMM_SHA256=7e526601461d7222834219a0f081a0550932b340eb945e515ffae94febe429f8 | ||
| PYDEVSUP_SHA256=3280cf3c4e9355f34841bc5ec38d0e5337783ea1678f31f7559b723ba8f4489b | ||
| SNMP_SHA256=f190b807aecd7d319e58263bca2ff883f891496793b68c871ada48a192d695a3 | ||
| SCALER_SHA256=faad6df4a71922ad6dcbd2d73020fad301ecd32e6e4e31ca0b0e4fd013c3bbce | ||
| MCA_SHA256=dddee716247e97e61f2e5a4bad07966ee00a7c107aed6f2d697b27337a44b9bc | ||
| LAKESHORE_SHA256=29b2a57583f69b02289368bcf76debe4a872380400fd4e309d1be320ec38d51e | ||
| LAKESHORE340_SHA256=3a5ac79abcce29d62d4eb6a16a63dc297594358512e3c90a96e2417ba40f7fd5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda chose this task because it was less complex and more of a manual thing anyway.
Okay. Good to know. :)
A helper script that automatically downloads and checks everything seems to kinda break the whole purpose of checking the SHAs.
It depends. One way of approaching this is to inspect the tarball content before calculating its hash. But that's true when manually generating the hashes as well. Another approach is to simply trust the first seen tarball, and only ensure that is the same from then on. We still need to discuss which approach works the best for us, but both of them already solve one problem: attacks that attempt to change a tarball already in use without us noticing it.
On the other hand, an atomic script that only downloads one module using the specified version seems unnecessary code.
Agreed.
An initial maybe-naive attempt to solve #47
Add sha256sum -c to lnls-get-n-unpack. This covers almost everything but doesn't check areaDetector and motor modules yet.
In this approach, hashes are put manually by the developer, which I initially thought was a good idea...
However, it does feel a little bureaucratic now that I look at it. Is there a better way?
Or maybe should create a script that gets the hashes, but is just an developer auxiliary and is not run at build time?
Tested it and it builds both base and base/musl. Changing the hashes does make it fail like intended.