-
Notifications
You must be signed in to change notification settings - Fork 18
fw_update: check for existing candidate image before attempting download #743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Updates fw_update_post_download to return a status and updates platform port implementations to match. This is in service of subsequently moving logic currently in fw_update_validate in some ports to fw_update_post_download. Signed-off-by: Daniel Mangum <[email protected]>
Visit the preview URL for this PR (updated for commit e48d9d9): https://golioth-firmware-sdk-doxygen-dev--pr743-fw-update-chec-de1cpdoh.web.app (expires Mon, 17 Feb 2025 22:52:02 GMT) 🔥 via Firebase Hosting GitHub Action 🌎 Sign: a9993e61697a3983f3479e468bcb0b616f9a0578 |
1d7a337
to
0a23494
Compare
Codecov ReportAttention: Patch coverage is
|
f297bfd
to
b352821
Compare
/* | ||
* @note This is similar to Zephyr's flash_img_check() but uses the Golioth | ||
* port's sha256 API. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to being consistent with our sha256 implementation, this is also necessary because building with flash_img_check
enabled with mbedtls is currently broken on nordic platforms due to zephyrproject-rtos/zephyr#85238
9c6508e
to
1f0926e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @hasheddan!
port/linux/fw_update_linux.c
Outdated
{ | ||
return GOLIOTH_OK; | ||
// TODO(hasheddan): support validating candidate firmware. | ||
return GOLIOTH_ERR_FAIL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return GOLIOTH_ERR_FAIL; | |
return GOLIOTH_ERR_NOT_IMPLEMENTED; |
// Nothing to do | ||
return GOLIOTH_OK; | ||
// TODO(hasheddan): support validating candidate firmware. | ||
return GOLIOTH_ERR_FAIL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return GOLIOTH_ERR_FAIL; | |
return GOLIOTH_ERR_NOT_IMPLEMENTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nice! I didn't realize we already had this status. In theory this would allow us to also call fw_update_validate
post download and just ignore result if GOLIOTH_ERR_NOT_IMPLEMENTED
. However, I think I'd lean towards not doing that given our existing hash check post download (as mentioned in the PR body). Regardless, should definitely be using the status here! 👍🏻
|
||
golioth_sys_sha256_t hash_ctx = golioth_sys_sha256_create(); | ||
|
||
buf_size = sizeof(_flash_img_context.buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for other reviewers: This buf size defaults to CONFIG_IMG_BLOCK_BUF_SIZE
which defaults to 512.
port/esp_idf/fw_update_esp_idf.c
Outdated
{ | ||
return GOLIOTH_ERR_FAIL; | ||
} | ||
if (memcmp(hash, data.image_digest, sizeof(data.image_digest)) != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to verify the length of hash
is the same as the length of data.image_digest
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't know the length of hash
here -- it is somewhat implicit in the port API contract as implemented, so it may be worth making it more explicit that a 32 byte sha256 hash is being passed (at the very least in the documentation on fw_update_validate
). In practice, because data.image_digest
is hard-coded to 32 bytes and we always pass a GOLIOTH_OTA_COMPONENT_BIN_HASH_LEN
(32) then they should always be same length.
I am still in process of testing the esp-idf fw_update_validate
implementation though, so there may need to be some further changes made here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the background.
In practice, because data.image_digest is hard-coded to 32 bytes and we always pass a GOLIOTH_OTA_COMPONENT_BIN_HASH_LEN (32) then they should always be same length.
Would it make sense to change the function as below?
enum golioth_status fw_update_validate(const uint8_t hash[GOLIOTH_OTA_COMPONENT_BIN_HASH_LEN],
const size_t size)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I typically shy away from using arrays as function parameters in C because in practice they decay to a pointers. However, they can give the impression that, for example, sizeof(hash)
in a function implementation would return the size of the array, but in reality it would return the size of the pointer. Some folks certainly argue the other side of this, so I'm open to discussion 🙂
port/esp_idf/fw_update_esp_idf.c
Outdated
.offset = _update_partition->address, | ||
.size = _update_partition->size, | ||
}; | ||
if (esp_image_verify(ESP_IMAGE_VERIFY, &part_pos, &data) != ESP_OK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be changing this in a moment because esp_image_verify
will use the appended digest of the image, rather than the actual hash of the bytes in the partition, so we'll end up with a mismatch with the hash in the manifest. This is similar to what we are avoiding in the Zephyr port with mcuboot.
c9f0a69
to
bea64d7
Compare
I think removing Given the above, it could make sense to rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @hasheddan! Overall looks good, left a few comments
src/fw_update.c
Outdated
{ | ||
GLTH_LOGE(TAG, "Failed to perform post download operations"); | ||
fw_download_failed(GOLIOTH_OTA_REASON_FIRMWARE_UPDATE_FAILED); | ||
golioth_sys_sha256_destroy(download_ctx.sha); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're destroying this SHA in a number of places. I think we can move the call golioth_sys_sha256_finish()
to just after the block download finishes (i.e. before the call to back_increment()
), and destroy the SHA context there. Then we can just have the calculated SHA on the stack which will be easier to manage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack! was avoiding making that change in this PR, but would love to go ahead and do so -- I think it probably makes the most sense after the err
check block (i.e. don't need to finish if we are going to destroy), but happy to always do finish + destroy after back_increment
if you find it cleaner 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the err
check works for me!
|
||
if (is_last) | ||
{ | ||
fw_update_post_download(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot easier to follow 👍
include/golioth/fw_update.h
Outdated
/// | ||
/// @return GOLIOTH_OK - image validated | ||
/// @return Otherwise - error in validation, abort firmware update | ||
enum golioth_status fw_update_validate(void); | ||
enum golioth_status fw_update_validate(const uint8_t *hash, const size_t size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's clear in the function comment, but when only looking at the function signature it would be easy to mistake size
to mean the size of the array passed in hash
. Also const
isn't necessary for arguments passed by value.
enum golioth_status fw_update_validate(const uint8_t *hash, const size_t size); | |
enum golioth_status fw_update_validate(const uint8_t *hash, size_t img_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops copy pasta on that second const
-- happy to update to img_size
as well 👍🏻
/// | ||
/// @return GOLIOTH_OK - image validated | ||
/// @return Otherwise - error in validation, abort firmware update | ||
enum golioth_status fw_update_validate(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit (and the next four) won't build without the following five commits. We should preserve bisectability by either squashing these commits or just updating the function signatures in this commit (without the functional changes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, commit history needs to be adjusted prior to merge -- pushed small commits while iterating.
src/fw_update.c
Outdated
@@ -424,6 +453,18 @@ static void fw_update_thread(void *arg) | |||
continue; | |||
} | |||
|
|||
if (fw_update_validate(_component_ctx.target_component.hash, | |||
_component_ctx.target_component.size) | |||
== GOLIOTH_OK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure this is what clang-format
requires, but I'm not a fan haha. If you want to disable formatting for this line and replace it with something cleaner, I wouldn't be opposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack 😂
@@ -604,14 +630,11 @@ static void fw_update_thread(void *arg) | |||
/* Download successful. Reset backoff */ | |||
backoff_reset(&_component_ctx); | |||
|
|||
int countdown = 5; | |||
while (countdown > 0) | |||
if (fw_change_image_and_reboot() != GOLIOTH_OK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reports the DOWNLOADING
state to the cloud and changes the boot image, so we don't need to do that in lines 616 - 628 anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we still need the DOWNLOADED
report, but agree we can remove the UPDATING
and boot image swap 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! I mean to write UPDATING
- agreed we need to keep DOWNLOADED
.
@sam-golioth cool! At one point I had it as |
Sounds good to me! |
88b73eb
to
178808e
Compare
Moves the fw_update_post_download invocation out of the block download callback. This makes it more obvious when and if post download operations are invoked. Signed-off-by: Daniel Mangum <[email protected]>
Updates the esp_idf port by moving the OTA completion logic from validation into post download. While esp_idf OTA support does perform some validation when completing a download, the primary purpose is closing the OTA download context. This frees up fw_update_validate to be modified to allow for calling outside the context of a specific download. Signed-off-by: Daniel Mangum <[email protected]>
Update fw_update_validate to fw_update_check_candidate and pass an image hash and size such that image validateion is not coupled to any state maintained in ports. Ports should be able to validate an image in a candidate slot without needing to have downloaded since last reboot. Make a corresponding update in fw_update thread to check whether the requested firmware update image is already present in candidate slot. If so, download is skipped and we attempt to reboot into existing candidate slot image. Note that the call to fw_update_validate is now removed from the post download flow and not replaced in the same call site by fw_update_check_image. fw_update_validate was previously a no-op for most platform ports, and for those that did implement checks (esp_idf), the functionality was moved to fw_update_post_download, which is still called following an image download. Signed-off-by: Daniel Mangum <[email protected]>
Adds support for validating candidate firmware images by hash and size to the Zephyr firmware update port. Signed-off-by: Daniel Mangum <[email protected]>
Adds support for validating candidate firmware images by hash and size to the esp_idf firmware update port. Signed-off-by: Daniel Mangum <[email protected]>
178808e
to
e48d9d9
Compare
static enum golioth_status fw_verify_component_hash( | ||
struct download_progress_context *ctx, | ||
const uint8_t calc_sha256[GOLIOTH_OTA_COMPONENT_BIN_HASH_LEN], | ||
const uint8_t server_sha256[GOLIOTH_OTA_COMPONENT_BIN_HASH_LEN]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned earlier in review, I don't really love using arrays in function signatures, but given that is the existing pattern here, I opted to remain consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's gives a false sense of compiler security, but from a readability standpoint I think it's the clearest way in C to communicate to the user that their input must be this exact size, no more and no less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. Given that this function is only visible to this compilation unit I am less worried about it here anyway 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
Refactors
fw_update
logic to check whether the existing image in the candidate slot matches the desired update image in a received manifest. If so, we attempt to boot into the image without downloading it again.Implementing involved some cleanup of the interface to
fw_update
platform ports. Namely,fw_update_post_download
now returns a status, andfw_update_validate
receives an image hash and size. The former is to allow for ports to return an error if they are unable to "finalize" a download. This is particularly relevant foresp_idf
, as completing the download also implicitly performs image validation. The latter change is to allow for validation to be performed on a candidate image outside the context of the download of that image, which may have taken place prior to the most recent reboot.Because ports can perform custom validation in
fw_update_post_download
and we implement hash validation infw_update
for downloaded images, thefw_update_validate
implementation for a port is no longer invoked after download. Instead, it is now only invoked on receiving a new manifest to check whether the desired update image is already present. The primary reason for not also invokingfw_update_validate
post download is that it is not implemented for all ports at this time, though it is also somewhat redundant with our existing hash check. There are two options that would allow for callingfw_update_validate
post download and prior to download:(1) feels like the right path long term, and given that we do not lose any functionality today by removing the post download call (i.e. ports previously just had no-op implementations), moving forward with the
fw_update_post_download
+fw_update
's hash check seems like a reasonable course of action.Work items that should follow this implementation:
fw_update_validate
always fails.