Skip to content

[RFC] Fix password change required #74

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

basak-qcom
Copy link
Contributor

The fix for the issue is trivial itself, but I also looked into:

  1. How to boot the image using qemu for development purposes.
  2. How to test this effectively. Preferably this would be without requiring external hardware, so the use of qemu is relevant. There are some technology choices here, and more work is needed for full integration (ideally a build system and GitHub Actions integration).

Before merging, we should discuss what is best for the project in order to align with technologies already used and what team members are familiar with.

@basak-qcom basak-qcom marked this pull request as draft June 12, 2025 16:06
@basak-qcom
Copy link
Contributor Author

I also expect LAVA to break since, by definition, fixing this issue modifies the serial console login flow.

@basak-qcom basak-qcom force-pushed the fix-password-change-required branch from 25a9664 to 2754a93 Compare June 12, 2025 16:27
@lool
Copy link
Contributor

lool commented Jun 13, 2025

I also expect LAVA to break since, by definition, fixing this issue modifies the serial console login flow.

Indeed, note that the LAVA test files are in the same repo (under ci/lava/), so we can fix them in the same PR, but the tests will run against the copy in the repo rather than from the PR, so we won't be able to tell if they pass immediately.

The failing runs are linked from here:
https://github.com/qualcomm-linux/qcom-deb-images/actions/runs/15616982180

Copy link
Contributor

@lool lool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks very clean both in the arrangement of commits and in the changes themselves, thanks!

I left minor review comments on the commits, for the general approach:

  • I think the README is becoming too long, so we need to build a proper doc space, but I guess this can still be added as is for now
  • we need to make sure the new script is undergoing Python static analysis like scripts/*.py
  • I suspect we'll have more discussion on the QEMU testing strategy and the OS test framework; e.g. LAVA can do QEMU tests, and we will likely want to share tests between meta-qcom and qcom-deb-images somehow; I'm in favor of merging your simple but effective approach now; if the discussion takes too long, let's land the actual fix in a separate pull request, and then add the test framework
  • as you note, we should use this in the github workflow; there are a lot of strategies to do this, but it looks like the run time is modest, so we can simply do it in the build workflow after the build; the downside is that it will stop the tests running on real hardware if it fails on QEMU, which is perhaps a good thing to save lab usage

@@ -183,6 +183,69 @@ NB: It's also possible to run qdl from the host while the baord is not connected

Want to join in the development? Changes welcome! See [CONTRIBUTING.md file](CONTRIBUTING.md) for step by step instructions.

### Test an image locally with qemu

You can boot an image locally with qemu as follows:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the first step should be the list of dependencies that are needed, "apt install ", like qemu-system, qemu-efi-aarch64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks. Updated.

@@ -140,7 +140,7 @@ actions:
- action: run
description: Compress image file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's tweak the title too, perhaps "Add compressed image file".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

```bash
# SCSI is required to present a device with a matching 4096 sector size
# inside the VM
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That made me wonder how we should define the QEMU model. We could pick it to be fast and manage it like a reference device that we maintain, or we could try to follow the most constrained environment we're trying to support. For the latter, the RB1 is probably the oldest board we're trying to enable, so 2GiB of RAM is fitting. I don't think QEMU has support for Kryo CPUs, so not sure what Cortex equivalent we should pick.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since have dedicated hardware tests, I imagine the qemu tests to be primarily for hardware agnostic matters and mainly for the purposes of testing userspace matters to be correct.

Strict constraints like memory and CPU features would still be important though, as we'd want to know if any use case we test is memory constrained in order to make appropriate decisions, or indeed built for a too-high CPU target. We could mark tests differently depending on their requirements (eg. if we want a standard image to have functionality that we know won't work on the oldest supported hardware, but we still want to test it).

So then we would want the model to be fast and with the most limited memory, but with the option of marking tests that require more to automatically get more. And we'd keep hardware-specific tests out of this test suite.

# inside the VM
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \
-device virtio-scsi-pci,id=scsi1 \
-device scsi-hd,bus=scsi1.0,drive=disk1,physical_block_size=4096,logical_block_size=4096 \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's cool! Makes me think we should (separately) make a test to check the SD card/eMMC images boot too (with the normal 512 bytes sector size)!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only did what was necessary to get the image we already had booting :)

Would an additional test be valuable here? The risk would be that something later works in qemu but fails on real hardware - in which case - why not just rely on the test against the real hardware?

I'm imagining the qemu test to be solely for testing the userspace-specific bits, and the hardware-specific bits (that interact most with qemu) to be a necessary distraction for that purpose that we should minimise. Then we can rely on the hardware tests for the hardware-specific bits.

@lool
Copy link
Contributor

lool commented Jun 13, 2025

  1. How to boot the image using qemu for development purposes.

Very useful, thanks for documenting!

  1. How to test this effectively. Preferably this would be without requiring external hardware, so the use of qemu is relevant. There are some technology choices here, and more work is needed for full integration (ideally a build system and GitHub Actions integration).

So I believe LAVA can run QEMU for us, but I'm not sure there's value in sending jobs to LAVA to run on QEMU. In fact, I think it would be useful to be able to cover a number of tests on the GitHub side before hitting our own infra, thus discovering a number of issues with limited infra needs (just our github setup essentially). The Internet bandwidth of the Foundries LAVA setup is also limited, so doing everything on the public cloud side is probably faster.

What's quite important IMO is aligning on the framework we'll use for the OS tests. There are a million options to start from, I don't think debos comes with any particular recommended way of doing tests. The Linaro folks are working on a LAVA test templating system called tuxlava, but I don't think it's meant to carry OS tests itself.

I like the dashboard that Apertis has built, I believe it's using LAVA as well underneath:
https://qa.apertis.org/

I shared with @mwasilew how we could run checkbox tests without relying on creating and publishing a snap.

Before merging, we should discuss what is best for the project in order to align with technologies already used and what team members are familiar with.

Indeed; Milosz, @mattface and @doanac are probably the most interested in what we pick there. :-)

@lool
Copy link
Contributor

lool commented Jun 13, 2025

Also, these are some of the tests we want to run on real hardware and that will need to be run through the test runner we pick:
https://github.com/qualcomm-linux/qcom-linux-testkit

@basak-qcom basak-qcom force-pushed the fix-password-change-required branch from 2754a93 to 17dc094 Compare June 16, 2025 11:59
Copy link
Contributor Author

@basak-qcom basak-qcom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've addressed all actionable changes. What is remaining is an agreed direction on how to approach testing.

```bash
# SCSI is required to present a device with a matching 4096 sector size
# inside the VM
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since have dedicated hardware tests, I imagine the qemu tests to be primarily for hardware agnostic matters and mainly for the purposes of testing userspace matters to be correct.

Strict constraints like memory and CPU features would still be important though, as we'd want to know if any use case we test is memory constrained in order to make appropriate decisions, or indeed built for a too-high CPU target. We could mark tests differently depending on their requirements (eg. if we want a standard image to have functionality that we know won't work on the oldest supported hardware, but we still want to test it).

So then we would want the model to be fast and with the most limited memory, but with the option of marking tests that require more to automatically get more. And we'd keep hardware-specific tests out of this test suite.

# inside the VM
qemu-system-aarch64 -cpu cortex-a57 -m 2048 -M virt -nographic \
-device virtio-scsi-pci,id=scsi1 \
-device scsi-hd,bus=scsi1.0,drive=disk1,physical_block_size=4096,logical_block_size=4096 \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only did what was necessary to get the image we already had booting :)

Would an additional test be valuable here? The risk would be that something later works in qemu but fails on real hardware - in which case - why not just rely on the test against the real hardware?

I'm imagining the qemu test to be solely for testing the userspace-specific bits, and the hardware-specific bits (that interact most with qemu) to be a necessary distraction for that purpose that we should minimise. Then we can rely on the hardware tests for the hardware-specific bits.

@@ -183,6 +183,69 @@ NB: It's also possible to run qdl from the host while the baord is not connected

Want to join in the development? Changes welcome! See [CONTRIBUTING.md file](CONTRIBUTING.md) for step by step instructions.

### Test an image locally with qemu

You can boot an image locally with qemu as follows:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks. Updated.

@@ -140,7 +140,7 @@ actions:
- action: run
description: Compress image file
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@basak-qcom
Copy link
Contributor Author

basak-qcom commented Jun 16, 2025

For test architecture, for the development workflow I would like to be able to git clone this repository on a fresh Debian or Ubuntu machine, install only distribution packages (perhaps detailed in a README.md) with no further configuration, type make test or equivalent and get a test pass report.

Clearly this isn't possible for tests that are necessarily tied to hardware. But for tweaking userspace elements, I'd like the above to be possible, which would keep friction to a minimum and thus maximise development velocity.

I appreciate that for the purposes of unifying with technologies and tests defined elsewhere, some compromise may be needed!

@ricardosalveti
Copy link

@basak-qcom do you have an oss email? if so, better to use it instead of [email protected].

@ricardosalveti
Copy link

For the changes itself I would just expect the password change to be isolated in another pr (and possible changes to the lava testing logic), and have one just for the qemu testing side of it.

@ricardosalveti
Copy link

In fact, I think it would be useful to be able to cover a number of tests on the GitHub side before hitting our own infra, thus discovering a number of issues with limited infra needs (just our github setup essentially). The Internet bandwidth of the Foundries LAVA setup is also limited, so doing everything on the public cloud side is probably faster.

I think it is good to have at least some level of smoke testing done at PR level (to make sure the image is good enough for lava submission), but I think it is better to rely on lava for the rest of the testing, even for qemu, as that way we can concentrate all testing on the same framework, otherwise we will have 2 sets of test frameworks in place.

@mwasilew
Copy link
Contributor

I would go with a tool that works in both worlds - local and LAVA. I would propose test-runner.py from test-definitions. This allows local execution and LAVA jobs using the same test plans. Test plans can than be composed from multiple test repositories.

As for this login test it's a special case as the test is executed before we're in the shell. I'm OK with the current implementation. The change in the LAVA jobs must be submitted separately. I'll try to provide example.

mwasilew added a commit to mwasilew/qcom-deb-images that referenced this pull request Jun 16, 2025
Update LAVA job templates for the case when password has to be changed
at the first login attempt. This patch should only be merged after qualcomm-linux#74

Signed-off-by: Milosz Wasilewski <[email protected]>
@mwasilew
Copy link
Contributor

LAVA jobs change: #75
Example job: https://lava.infra.foundries.io/scheduler/job/79639#L1757

@basak
Copy link

basak commented Jul 1, 2025

Thanks for the discussions! To summarise the plan as I understand it:

  1. This test is exceptional (as it needs to run before a shell prompt), therefore unsuitable for LAVA and so we'll make it runnable directly here. I will implement a make test in this repository that does that, and then add it to CI. This PR can then land (after re-review).
  2. I will investigate test-definitions for general tests that are appropriate to run both locally and in LAVA - but not in this PR.

We have a fairly complicated build process in qcom-deb-images, as
evidenced by the current README.md. Giving developers a one-command
build instead means:

1. Simplified instructions for developers, or in other words, moving
instructions for manual entry from README.md into automated invocation
via the build system.

2. Automated dependency/rebuild tracking, reducing the instances where a
full rebuild is required, or the easiest option, during development
iteration.

3. Building only what is necessary for a given target.

4. The opportunity to move towards running CI before submission in a PR,
rather than CI being entirely defined inside .github/workflows/ and
therefore inaccessible for running locally.

This is too complex to attempt all at once. Instead I'm doing it in
multiple small steps, and this is the first step. Overall, my plan is:

1. Introduce a basic Makefile that works.

2. Per item, implement in the Makefile what CI requires that is already
implemented in .github/workflows/.

3. Per item, replace complex build invocation steps made directly from
.github/workflows/ with calls to the Makefile instead.

4. Once the elements covered by README.md are all covered, I'll update
the instructions to use the Makefile instead.

Note that CI for the Makefile should therefore not be necessary to
implement directly; it'll happen step by step as CI is moved into it
through step 3 above. If there are gaps in CI coverage that the Makefile
does cover, then we can implement those as separate CI items to call
from .github/workflows/ as we wish.

In time, this should make .github/workflows/ simpler, and encode our
knowledge of how to invoke the various tools (including debos) into the
source tree itself, rather than via manual instructions in README.md
that are also duplicated in .github/workflows/

Signed-off-by: Robie Basak <[email protected]>
The uncompressed image is to be used in the qemu testing flow, so keep
it around instead of implicitly deleting it.

Signed-off-by: Robie Basak <[email protected]>
Add a mechanism to test behaviours don't relate to hardware. These can
be effectively tested without need of the target platform using qemu.

This uses Python and pytest. Specific tests to follow.

For now, this is just the raw test that will need to be run manually,
and depends on the developer already having built a disk-ufs.img by
hand using the instructions in README.md. Automation and integration
into GitHub Actions will follow. How-to documentation would be useful
but would be superseded by that automation and integration, so this is
omitted for now.

Signed-off-by: Robie Basak <[email protected]>
Existing code arranges to set up an initial well-known password but with
a requirement to reset it on first login.

This wasn't working because the well-known password was being set after
expiring the "old" password, meaning that it was being treated as having
been reset immediately.

Instead, ensure to expire the password _after_ setting it.

Fixes: qualcomm-linux#69
Signed-off-by: Robie Basak <[email protected]>
@basak-qcom basak-qcom force-pushed the fix-password-change-required branch 2 times, most recently from e0aee32 to 0833b97 Compare July 2, 2025 14:47
Add a "make test" target to the Makefile and to CI. This is intended to
run all tests that can be run locally, with just one for now.

For now, knowledge of the specific artifacts required is duplicated
between the Makefile and .github/workflows/test.yml. I don't like this,
but this is currently necessary because the workflows are split and save
and restore artifacts between building and testing. Perhaps this can be
improved, but I'm deferring this for a possible future overhaul.

Signed-off-by: Robie Basak <[email protected]>
@basak-qcom basak-qcom force-pushed the fix-password-change-required branch from 0833b97 to 205a618 Compare July 2, 2025 14:52
@basak-qcom
Copy link
Contributor Author

I've prepared some build system and CI automation to run this test. To make that easier, I've built upon the Makefile from #91, so I'm rebasing this onto that, then adding to that Makefile in a commit on top.

I expect I'll have to iterate a bit in this PR to get the GitHub workflow changes right.

@basak-qcom
Copy link
Contributor Author

This is rebased onto #91 so needs to land after that.

Also it doesn't look like the test I added actually ran. I need to look into this.

basak-qcom added a commit that referenced this pull request Jul 3, 2025
For PR #74, I want to test if these workflow changes are working, so am
pushing to the repository directly to get the tests to run in the
expected context. Once done, I can throw this change and branch away.
@lool
Copy link
Contributor

lool commented Jul 11, 2025

I tested booting under qemu from macOS today using the latest qemu-efi-aarch64 deb from http://ftp.debian.org/debian/pool/main/e/edk2/ and the homebrew qemu package; it didn't work with the mainline based kernel (e.g. from https://quic-yocto-fileserver-1029608027416.us-central1.run.app/qualcomm-linux/qcom-deb-images/16164587779-1/) and was stuck looking for the root fs in initramfs (some missing virtio modules?), but it worked with the Debian config based kernel (e.g. from https://quic-yocto-fileserver-1029608027416.us-central1.run.app/qualcomm-linux/qcom-deb-images/16215750029-1/)

@lool
Copy link
Contributor

lool commented Jul 11, 2025

I tested booting under qemu from macOS today using the latest qemu-efi-aarch64 deb from http://ftp.debian.org/debian/pool/main/e/edk2/ and the homebrew qemu package; it didn't work with the mainline based kernel (e.g. from https://quic-yocto-fileserver-1029608027416.us-central1.run.app/qualcomm-linux/qcom-deb-images/16164587779-1/) and was stuck looking for the root fs in initramfs (some missing virtio modules?), but it worked with the Debian config based kernel (e.g. from https://quic-yocto-fileserver-1029608027416.us-central1.run.app/qualcomm-linux/qcom-deb-images/16215750029-1/)

With -device virtio-gpu-pci -display cocoa I even got to the lightdm prompt, and with -device usb-ehci,id=ehci -device usb-kbd -device usb-mouse I could use mouse and keyboard to login – yay!
image

@basak-qcom basak-qcom self-assigned this Jul 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants