Skip to content

Conversation

@schaefi
Copy link
Collaborator

@schaefi schaefi commented Sep 27, 2024

The bootstrap procedure based on apt only runs a manual collection of package scripts. This commit refactors the code that unpacks the bootstrap packages to a python implementation and adds a method to run the bootstrap scripts from all packages resolved by apt.

@schaefi schaefi requested a review from IsaacJT September 27, 2024 12:40
@schaefi schaefi self-assigned this Sep 27, 2024
# if hardcoding this for libc6 is something we can afford ?
# Help is appreciated
self.command_env['DPKG_MAINTSCRIPT_NAME'] = 'true'
self.command_env['DPKG_MAINTSCRIPT_PACKAGE'] = 'libc6'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IsaacJT As we discussed I created a refactor and implementation for the apt based bootstrap such that all package scripts from the resolved apt list are taken into account.

At this point however I needed two weird environment variables and I had to set libc6 as DPKG_MAINTSCRIPT_PACKAGE otherwise I got usr-merge complains. I really have no idea what these variables really do and my knowledge on the inner works of Debian/Ubuntu packaging is really bad.

I wanted to ask if you can give this PR a try if it would also fix the issue you found and maybe check if the above is ok as a permanent in code setup or if this is all wrong. I believe the way how I set the variables is not a good way to handle the issue that comes up if you disable them. Feel free to run the process without these variables so you can see the errors from apt.

Thanks much

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK DPKG_MAINTSCRIPT_NAME should be the name of the script (e.g. postinst, etc.) and DPKG_MAINTSCRIPT_PACKAGE should be the name of the package that the scripts belong to. I would have to double check this though...

Haven't had time to a have a proper look through yet.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we run the scripts manually the setting of just true would then be ok ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also note that libc6 was the only package in the collection of all packages that complains about the missing variables DPKG_MAINTSCRIPT_*

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I simply do not understand the concept behind them

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the comment for this variable settings. I could not find any bad or negative side effect of setting them during the bootstrap of a Debian based system. I tested Ubuntu and Debian and the integration tests pass and boots. Thus I leave it up to the experts if this needs to be changed or can harm the bootstrap procedure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe DPKG_MAINTSCRIPT_PACKAGE should be set in the for block below so that it represents the name of the package that the script is currently being run for? And DPKG_MAINTSCRIPT_NAME could be set
to the script name immediately before calling the script, e.g.

if os.path.exists(script_pre):
    self.command_env['DPKG_MAINTSCRIPT_PACKAGE'] = 'preinst'
    ...

I think this would more closely emulate the environment that the scripts would normally be run in

@schaefi schaefi force-pushed the apt_bootstrap_package_scripts branch 3 times, most recently from a682603 to 06759ed Compare September 30, 2024 10:43
Command.run(
[
'bash', '-c',
'dpkg-deb --fsys-tarfile {0} | tar -C {1} -x'.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason you're using tar here? You can use dpkg -e <.deb path> to extract the scripts from a .deb

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extraction is not for the scripts. This extraction "installs" the downloaded bootstrap .deb package into the new root-tree. From the man page of dpkg-deb

--fsys-tarfile archive
           Extracts the filesystem tree data from a binary package and
           sends it to standard output in tar format.  Together with
           tar(1) this can be used to extract a particular file from a
           package archive.

So this call is not related to the script part of things, which comes later.

Copy link
Collaborator Author

@schaefi schaefi Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the process to bootstrap a debian based system follows the following logic

  1. apt-get _with_call_options_to_resolve_packages_and_only_download_them
  2. dpkg-deb --fsys-tarfile to extract (install) the set of resolved and downloaded packages
  3. dpkg -e extract the scripts
  4. run the scripts (chrooted)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. You could instead just use dpkg -x <.deb> <dest> to skip the intermediate tar stage

@schaefi
Copy link
Collaborator Author

schaefi commented Oct 4, 2024

@IsaacJT With this change in place all Debian/Ubuntu based integration tests have passed and boots. The package scripts of all bootstrap packages gets called. We still need to clarify if that variable setting is somehow evil or if it can stay. As I'm not into Debian that much it would be great to get some approval from the experts.

To review the logs from an integration test checkout e.g

My proposal would be to merge this to make a step forward and in case this causes issues fix them when we know that there is an issue and what it will be.

Thoughts ?

@schaefi schaefi force-pushed the apt_bootstrap_package_scripts branch 2 times, most recently from 88113e7 to f9abb3c Compare October 4, 2024 09:19
The bootstrap procedure based on apt only runs a manual
collection of package scripts. This commit refactors the
code that unpacks the bootstrap packages to a python
implementation and adds a method to run the bootstrap
scripts from all packages resolved by apt.
@schaefi schaefi force-pushed the apt_bootstrap_package_scripts branch from f9abb3c to 15b4bbc Compare October 4, 2024 09:33
@IsaacJT
Copy link
Collaborator

IsaacJT commented Oct 4, 2024

Thanks @schaefi, this seems like a really good solution. I think if the variables are updated on the go as per my other comments it should be all good to go

@Conan-Kudo Conan-Kudo merged commit 0fdd558 into main Oct 4, 2024
11 checks passed
@Conan-Kudo Conan-Kudo deleted the apt_bootstrap_package_scripts branch October 4, 2024 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants