-
Notifications
You must be signed in to change notification settings - Fork 12
[RFC] Fix password change required #74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[RFC] Fix password change required #74
Conversation
I also expect LAVA to break since, by definition, fixing this issue modifies the serial console login flow. |
25a9664
to
2754a93
Compare
Indeed, note that the LAVA test files are in the same repo (under ci/lava/), so we can fix them in the same PR, but the tests will run against the copy in the repo rather than from the PR, so we won't be able to tell if they pass immediately. The failing runs are linked from here: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks very clean both in the arrangement of commits and in the changes themselves, thanks!
I left minor review comments on the commits, for the general approach:
- I think the README is becoming too long, so we need to build a proper doc space, but I guess this can still be added as is for now
- we need to make sure the new script is undergoing Python static analysis like scripts/*.py
- I suspect we'll have more discussion on the QEMU testing strategy and the OS test framework; e.g. LAVA can do QEMU tests, and we will likely want to share tests between meta-qcom and qcom-deb-images somehow; I'm in favor of merging your simple but effective approach now; if the discussion takes too long, let's land the actual fix in a separate pull request, and then add the test framework
- as you note, we should use this in the github workflow; there are a lot of strategies to do this, but it looks like the run time is modest, so we can simply do it in the build workflow after the build; the downside is that it will stop the tests running on real hardware if it fails on QEMU, which is perhaps a good thing to save lab usage
@@ -183,6 +183,69 @@ NB: It's also possible to run qdl from the host while the baord is not connected | |||
|
|||
Want to join in the development? Changes welcome! See [CONTRIBUTING.md file](CONTRIBUTING.md) for step by step instructions. | |||
|
|||
### Test an image locally with qemu | |||
|
|||
You can boot an image locally with qemu as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the first step should be the list of dependencies that are needed, "apt install ", like qemu-system, qemu-efi-aarch64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks. Updated.
@@ -140,7 +140,7 @@ actions: | |||
- action: run | |||
description: Compress image file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's tweak the title too, perhaps "Add compressed image file".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
```bash | ||
# SCSI is required to present a device with a matching 4096 sector size | ||
# inside the VM | ||
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That made me wonder how we should define the QEMU model. We could pick it to be fast and manage it like a reference device that we maintain, or we could try to follow the most constrained environment we're trying to support. For the latter, the RB1 is probably the oldest board we're trying to enable, so 2GiB of RAM is fitting. I don't think QEMU has support for Kryo CPUs, so not sure what Cortex equivalent we should pick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since have dedicated hardware tests, I imagine the qemu tests to be primarily for hardware agnostic matters and mainly for the purposes of testing userspace matters to be correct.
Strict constraints like memory and CPU features would still be important though, as we'd want to know if any use case we test is memory constrained in order to make appropriate decisions, or indeed built for a too-high CPU target. We could mark tests differently depending on their requirements (eg. if we want a standard image to have functionality that we know won't work on the oldest supported hardware, but we still want to test it).
So then we would want the model to be fast and with the most limited memory, but with the option of marking tests that require more to automatically get more. And we'd keep hardware-specific tests out of this test suite.
# inside the VM | ||
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \ | ||
-device virtio-scsi-pci,id=scsi1 \ | ||
-device scsi-hd,bus=scsi1.0,drive=disk1,physical_block_size=4096,logical_block_size=4096 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's cool! Makes me think we should (separately) make a test to check the SD card/eMMC images boot too (with the normal 512 bytes sector size)!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only did what was necessary to get the image we already had booting :)
Would an additional test be valuable here? The risk would be that something later works in qemu but fails on real hardware - in which case - why not just rely on the test against the real hardware?
I'm imagining the qemu test to be solely for testing the userspace-specific bits, and the hardware-specific bits (that interact most with qemu) to be a necessary distraction for that purpose that we should minimise. Then we can rely on the hardware tests for the hardware-specific bits.
Very useful, thanks for documenting!
So I believe LAVA can run QEMU for us, but I'm not sure there's value in sending jobs to LAVA to run on QEMU. In fact, I think it would be useful to be able to cover a number of tests on the GitHub side before hitting our own infra, thus discovering a number of issues with limited infra needs (just our github setup essentially). The Internet bandwidth of the Foundries LAVA setup is also limited, so doing everything on the public cloud side is probably faster. What's quite important IMO is aligning on the framework we'll use for the OS tests. There are a million options to start from, I don't think debos comes with any particular recommended way of doing tests. The Linaro folks are working on a LAVA test templating system called tuxlava, but I don't think it's meant to carry OS tests itself. I like the dashboard that Apertis has built, I believe it's using LAVA as well underneath: I shared with @mwasilew how we could run checkbox tests without relying on creating and publishing a snap.
Indeed; Milosz, @mattface and @doanac are probably the most interested in what we pick there. :-) |
Also, these are some of the tests we want to run on real hardware and that will need to be run through the test runner we pick: |
2754a93
to
17dc094
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've addressed all actionable changes. What is remaining is an agreed direction on how to approach testing.
```bash | ||
# SCSI is required to present a device with a matching 4096 sector size | ||
# inside the VM | ||
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since have dedicated hardware tests, I imagine the qemu tests to be primarily for hardware agnostic matters and mainly for the purposes of testing userspace matters to be correct.
Strict constraints like memory and CPU features would still be important though, as we'd want to know if any use case we test is memory constrained in order to make appropriate decisions, or indeed built for a too-high CPU target. We could mark tests differently depending on their requirements (eg. if we want a standard image to have functionality that we know won't work on the oldest supported hardware, but we still want to test it).
So then we would want the model to be fast and with the most limited memory, but with the option of marking tests that require more to automatically get more. And we'd keep hardware-specific tests out of this test suite.
# inside the VM | ||
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \ | ||
-device virtio-scsi-pci,id=scsi1 \ | ||
-device scsi-hd,bus=scsi1.0,drive=disk1,physical_block_size=4096,logical_block_size=4096 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only did what was necessary to get the image we already had booting :)
Would an additional test be valuable here? The risk would be that something later works in qemu but fails on real hardware - in which case - why not just rely on the test against the real hardware?
I'm imagining the qemu test to be solely for testing the userspace-specific bits, and the hardware-specific bits (that interact most with qemu) to be a necessary distraction for that purpose that we should minimise. Then we can rely on the hardware tests for the hardware-specific bits.
@@ -183,6 +183,69 @@ NB: It's also possible to run qdl from the host while the baord is not connected | |||
|
|||
Want to join in the development? Changes welcome! See [CONTRIBUTING.md file](CONTRIBUTING.md) for step by step instructions. | |||
|
|||
### Test an image locally with qemu | |||
|
|||
You can boot an image locally with qemu as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks. Updated.
@@ -140,7 +140,7 @@ actions: | |||
- action: run | |||
description: Compress image file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
For test architecture, for the development workflow I would like to be able to Clearly this isn't possible for tests that are necessarily tied to hardware. But for tweaking userspace elements, I'd like the above to be possible, which would keep friction to a minimum and thus maximise development velocity. I appreciate that for the purposes of unifying with technologies and tests defined elsewhere, some compromise may be needed! |
@basak-qcom do you have an oss email? if so, better to use it instead of [email protected]. |
For the changes itself I would just expect the password change to be isolated in another pr (and possible changes to the lava testing logic), and have one just for the qemu testing side of it. |
I think it is good to have at least some level of smoke testing done at PR level (to make sure the image is good enough for lava submission), but I think it is better to rely on lava for the rest of the testing, even for qemu, as that way we can concentrate all testing on the same framework, otherwise we will have 2 sets of test frameworks in place. |
I would go with a tool that works in both worlds - local and LAVA. I would propose test-runner.py from test-definitions. This allows local execution and LAVA jobs using the same test plans. Test plans can than be composed from multiple test repositories. As for this login test it's a special case as the test is executed before we're in the shell. I'm OK with the current implementation. The change in the LAVA jobs must be submitted separately. I'll try to provide example. |
Update LAVA job templates for the case when password has to be changed at the first login attempt. This patch should only be merged after qualcomm-linux#74 Signed-off-by: Milosz Wasilewski <[email protected]>
LAVA jobs change: #75 |
Thanks for the discussions! To summarise the plan as I understand it:
|
We have a fairly complicated build process in qcom-deb-images, as evidenced by the current README.md. Giving developers a one-command build instead means: 1. Simplified instructions for developers, or in other words, moving instructions for manual entry from README.md into automated invocation via the build system. 2. Automated dependency/rebuild tracking, reducing the instances where a full rebuild is required, or the easiest option, during development iteration. 3. Building only what is necessary for a given target. 4. The opportunity to move towards running CI before submission in a PR, rather than CI being entirely defined inside .github/workflows/ and therefore inaccessible for running locally. This is too complex to attempt all at once. Instead I'm doing it in multiple small steps, and this is the first step. Overall, my plan is: 1. Introduce a basic Makefile that works. 2. Per item, implement in the Makefile what CI requires that is already implemented in .github/workflows/. 3. Per item, replace complex build invocation steps made directly from .github/workflows/ with calls to the Makefile instead. 4. Once the elements covered by README.md are all covered, I'll update the instructions to use the Makefile instead. Note that CI for the Makefile should therefore not be necessary to implement directly; it'll happen step by step as CI is moved into it through step 3 above. If there are gaps in CI coverage that the Makefile does cover, then we can implement those as separate CI items to call from .github/workflows/ as we wish. In time, this should make .github/workflows/ simpler, and encode our knowledge of how to invoke the various tools (including debos) into the source tree itself, rather than via manual instructions in README.md that are also duplicated in .github/workflows/ Signed-off-by: Robie Basak <[email protected]>
Signed-off-by: Robie Basak <[email protected]>
The uncompressed image is to be used in the qemu testing flow, so keep it around instead of implicitly deleting it. Signed-off-by: Robie Basak <[email protected]>
Add a mechanism to test behaviours don't relate to hardware. These can be effectively tested without need of the target platform using qemu. This uses Python and pytest. Specific tests to follow. For now, this is just the raw test that will need to be run manually, and depends on the developer already having built a disk-ufs.img by hand using the instructions in README.md. Automation and integration into GitHub Actions will follow. How-to documentation would be useful but would be superseded by that automation and integration, so this is omitted for now. Signed-off-by: Robie Basak <[email protected]>
Existing code arranges to set up an initial well-known password but with a requirement to reset it on first login. This wasn't working because the well-known password was being set after expiring the "old" password, meaning that it was being treated as having been reset immediately. Instead, ensure to expire the password _after_ setting it. Fixes: qualcomm-linux#69 Signed-off-by: Robie Basak <[email protected]>
e0aee32
to
0833b97
Compare
Add a "make test" target to the Makefile and to CI. This is intended to run all tests that can be run locally, with just one for now. For now, knowledge of the specific artifacts required is duplicated between the Makefile and .github/workflows/test.yml. I don't like this, but this is currently necessary because the workflows are split and save and restore artifacts between building and testing. Perhaps this can be improved, but I'm deferring this for a possible future overhaul. Signed-off-by: Robie Basak <[email protected]>
0833b97
to
205a618
Compare
I've prepared some build system and CI automation to run this test. To make that easier, I've built upon the Makefile from #91, so I'm rebasing this onto that, then adding to that Makefile in a commit on top. I expect I'll have to iterate a bit in this PR to get the GitHub workflow changes right. |
This is rebased onto #91 so needs to land after that. Also it doesn't look like the test I added actually ran. I need to look into this. |
For PR #74, I want to test if these workflow changes are working, so am pushing to the repository directly to get the tests to run in the expected context. Once done, I can throw this change and branch away.
I tested booting under qemu from macOS today using the latest qemu-efi-aarch64 deb from http://ftp.debian.org/debian/pool/main/e/edk2/ and the homebrew qemu package; it didn't work with the mainline based kernel (e.g. from https://quic-yocto-fileserver-1029608027416.us-central1.run.app/qualcomm-linux/qcom-deb-images/16164587779-1/) and was stuck looking for the root fs in initramfs (some missing virtio modules?), but it worked with the Debian config based kernel (e.g. from https://quic-yocto-fileserver-1029608027416.us-central1.run.app/qualcomm-linux/qcom-deb-images/16215750029-1/) |
With |
The fix for the issue is trivial itself, but I also looked into:
Before merging, we should discuss what is best for the project in order to align with technologies already used and what team members are familiar with.