Commit 8f6b9e2

travisb-nexthop authored and meta-codesync[bot] committed
Distro Infrastructure container PXE-boot MVP (#711)
Summary:

**Pre-submission checklist**
- [x] I've run the linters locally and fixed lint errors related to the files I modified in this PR. You can install the linters by running `pip install -r requirements-dev.txt && pre-commit install`
- [x] `pre-commit run`

This adds the minimum viable Distro Infrastructure container needed to support IPv4 and IPv6 PXE boot. IPv4 expects a DHCP server to exist on the network to provide IPv4 addresses to the switch. IPv6 by default supplies its own DHCPv6 server on the L2 segment, but that can be disabled.

This is a self-contained, interactive Docker container which uses Proxy DHCP (IPv4) or DHCPv6 (IPv6) to direct PXE-booting devices to the container's TFTP server and web server. iPXE is used to support loading the relatively large initrd image over HTTP instead of TFTP and to support supplying changeable arguments to the installer initrd. Currently these are hardcoded into autoexec.ipxe, but future changes might autogenerate this file based on the needs of the particular PXE installer.

For usage details, see the included README.md. As this is an MVP, those instructions must be followed to the letter. Future work will integrate with the fboss-image tool to drive the Distro Infra container in a more user-friendly way.

Once PXE boot has completed, the MAC is made ineligible for PXE booting again until reconfigured. This is to support PXE installing, then booting off the internal drive for every subsequent boot until PXE-booting is explicitly requested again.

## IPv4 boot flow

Under IPv4, the boot flow with iPXE is simple because iPXE receives the next-server IP address.
The IPv4 boot flow looks like:

1. BIOS
2. iPXE
3. `tftp://next-server/autoexec.ipxe`
4. `http://next-server/FBOSS-Distro-Image.{kernel,initrd}`
5. `http://next-server/FBOSS-Distro-Image.xz`

## IPv6 boot flow

Unfortunately IPv6 is more complicated. iPXE does not receive next-server or anything like it under IPv6, so we cannot follow that simple flow. Further, iPXE by default tries to autoconfigure its network interface with IPv4 first, then IPv6. Thus if the network were configured to support both IPv4 PXE boot and IPv6 PXE boot (the Distro Infrastructure default), the BIOS would load iPXE over IPv6, but iPXE would then load the PXE installer over IPv4. Switching protocols mid-boot like this would not be a satisfactory test of IPv6.

To resolve these two problems, we separate iPXE into IPv4 and IPv6 versions. The IPv4 version operates as above. The IPv6 version uses two intermediate scripts to insert the server_ip configuration and maintain IPv6 throughout.

The boot flow for IPv6 is:

1. BIOS
2. iPXEv6
3. Script embedded inside iPXEv6 which forces IPv6 and 'sources' a generated script `-serverip`
4. `-serverip`, a generated script which sets the server_ip variable before passing control to `tftp://server-ip/autoexec.ipxe`, shared with IPv4
5. `tftp://server-ip/autoexec.ipxe`
6. `http://server-ip/FBOSS-Distro-Image.{kernel,initrd}`
7. `http://server-ip/FBOSS-Distro-Image.xz`

To support both paths with a common `autoexec.ipxe`, `next-server` is used as `server_ip` when executing under IPv4.

Pull Request resolved: #711

Test Plan:

Only the manual happy path is tested. This has been tested manually against fboss103.
Under IPv6 that test output is:

```
ds103:#s-image/distro_infra $ ./build.sh && ./distro_infra.sh --intf vlan1033 --persist-dir data
...
=> exporting to image 0.7s
=> => exporting layers 0.6s
=> => writing image sha256:27dec285715ddfc30a692a4fee1cb34f79a02e581df34801a8a0330e256cf0c9 0.0s
=> => naming to docker.io/library/fboss_distro_infra 0.0s
Listening on vlan1033 - 10.250.33.194 & fc00:33::89
dnsmasq: started, version 2.85 DNS disabled
dnsmasq: compile time options: IPv6 GNU-getopt DBus no-UBus no-i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile
dnsmasq-dhcp: DHCP, proxy on subnet 10.250.33.194
dnsmasq-dhcp: DHCPv6, IP range ::fb05:5000:1 -- ::fb05:50ff:ffff, lease time 5m, template for vlan1033
dnsmasq-dhcp: router advertisement on vlan1033
dnsmasq-dhcp: DHCPv6, IP range fc00:33::fb05:5000:1 -- fc00:33::fb05:50ff:ffff, lease time 5m, constructed for vlan1033
dnsmasq-dhcp: router advertisement on fc00:33::, constructed for vlan1033
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: IPv6 router advertisement enabled
dnsmasq-dhcp: DHCP, sockets bound exclusively to interface vlan1033
dnsmasq-tftp: TFTP root is /distro_infra/persistent secure mode
dnsmasq-dhcp: read /distro_infra/dnsmasq_conf.d/default_ignore
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
Enter MAC address (blank to exit): dc-da-4d-fc-ad-2d
dnsmasq: inotify, new or changed file /distro_infra/dnsmasq_conf.d/dc-da-4d-fc-ad-2d
dnsmasq-dhcp: read /distro_infra/dnsmasq_conf.d/dc-da-4d-fc-ad-2d
Enter MAC address (blank to exit):
```

Reboot fboss103 here

```
>>Checking Media Presence......
>>Media Present......
>>Start PXE over IPv6 on MAC: DC-DA-4D-FC-AD-2D. Press ESC key to abort PXE boot..
Station IP address is FC00:33:0:0:0:FB05:50DC:B9F7
Server IP address is FC00:33:0:0:0:0:0:89
NBP filename is ipxev6.efi
NBP filesize is 1052160 Bytes
>>Checking Media Presence......
>>Media Present......
Downloading NBP file...
NBP file downloaded successfully.
iPXE initialising devices...
iPXE 1.21.1+ (g9486) -- Open Source Network Boot Firmware -- https://ipxe.org
Features: DNS HTTP iSCSI TFTP VLAN SRP AoE EFI Menu
Configuring [ipv6] (net0 dc:da:4d:fc:ad:2d)... ok
tftp://[fc00:33::89]/ipxev6.efi-serverip... ok
autoexec.ipxe... ok
http://[fc00:33::89]:6969/dc-da-4d-fc-ad-2d/pxeboot.FBOSS-Distro-Image.x86_64-1.0.initrd... ok
http://[fc00:33::89]:6969/dc-da-4d-fc-ad-2d/pxeboot.FBOSS-Distro-Image.x86_64-1.0.kernel... ok
tftp://[fc00:33::89]/pxeboot_complete... ok
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Measured initrd data into PCR 9
[ 0.000000] Linux version 6.12.63-200.el9.x86_64...
```

Then the PXE installer runs. The Distro Infrastructure output during this period is:

```
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: DHCPSOLICIT(vlan1033) 00:02:00:00:ab:11:ea:34:3d:47:ca:ee:d2:07
dnsmasq-dhcp: DHCPREPLY(vlan1033) 00:02:00:00:ab:11:ea:34:3d:47:ca:ee:d2:07 no addresses available
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: DHCPSOLICIT(vlan1033) 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPADVERTISE(vlan1033) fc00:33::fb05:50dc:b9f7 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPREQUEST(vlan1033) 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPREPLY(vlan1033) fc00:33::fb05:50dc:b9f7 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: RTR-SOLICIT(vlan1033)
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: DHCPSOLICIT(vlan1033) 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPADVERTISE(vlan1033) fc00:33::fb05:50a2:9696 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPREQUEST(vlan1033) 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPREPLY(vlan1033) fc00:33::fb05:50a2:9696 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: DHCPRELEASE(vlan1033) 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-tftp: error 8 User aborted the transfer received from fc00:33::fb05:50dc:b9f7
dnsmasq-tftp: sent /distro_infra/persistent/dc-da-4d-fc-ad-2d/ipxev6.efi to fc00:33::fb05:50dc:b9f7
dnsmasq-tftp: sent /distro_infra/persistent/dc-da-4d-fc-ad-2d/ipxev6.efi to fc00:33::fb05:50dc:b9f7
dnsmasq-dhcp: DHCPRELEASE(vlan1033) 00:01:00:01:2e:30:1a:70:dc:da:4d:fc:ad:2d
dnsmasq-dhcp: RTR-SOLICIT(vlan1033) dc:da:4d:fc:ad:2d
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: DHCPSOLICIT(vlan1033) 00:04:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
dnsmasq-dhcp: DHCPADVERTISE(vlan1033) fc00:33::fb05:50f5:dfc9 00:04:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
dnsmasq-dhcp: DHCPREQUEST(vlan1033) 00:04:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
dnsmasq-dhcp: DHCPREPLY(vlan1033) fc00:33::fb05:50f5:dfc9 00:04:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
dnsmasq-tftp: sent /distro_infra/persistent/dc-da-4d-fc-ad-2d/ipxev6.efi-serverip to fc00:33::fb05:50f5:dfc9
dnsmasq-tftp: sent /distro_infra/persistent/dc-da-4d-fc-ad-2d/autoexec.ipxe to fc00:33::fb05:50f5:dfc9
dnsmasq-dhcp: RTR-SOLICIT(vlan1033)
dnsmasq-dhcp: RTR-ADVERT(vlan1033) fc00:33::
dnsmasq-dhcp: DHCPSOLICIT(vlan1033) 00:04:62:19:3e:08:1d:5a:56:77:93:71:a4:d7:25:6f:4c:de
dnsmasq-dhcp: DHCPREPLY(vlan1033) fc00:33::fb05:50b2:2cbb 00:04:62:19:3e:08:1d:5a:56:77:93:71:a4:d7:25:6f:4c:de
dnsmasq-tftp: sent /distro_infra/persistent/dc-da-4d-fc-ad-2d/pxeboot_complete to fc00:33::fb05:50f5:dfc9
dnsmasq-dhcp: read /distro_infra/dnsmasq_conf.d/default_ignore
dc-da-4d-fc-ad-2d PXE booted, disabling future PXE boot provisioning
```

The critical line is `dc-da-4d-fc-ad-2d PXE booted, disabling future PXE boot provisioning`, which indicates that PXE boot has been detected as complete and will not be offered to future boots. Subsequent reboots of fboss103 time out when attempting PXE boot and boot off the NVMe instead.

IPv4 works almost identically except for downloads of the additional autoipv6.ipxe script.

Reviewed By: srikrishnagopu

Differential Revision: D91169704

Pulled By: kevin645

fbshipit-source-id: 4b9e8f7bacfe80a1600bdc70f9e65ffba6b020b4
1 parent 6e18086 commit 8f6b9e2

12 files changed: +382 -0 lines changed

fboss-image/.gitignore

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
distro_infra/parts/ipxe/ipxe
distro_infra/parts/ipxe/ipxev4.efi
distro_infra/parts/ipxe/ipxev6.efi
image_builder/logs
image_builder/output

fboss-image/distro_infra/BUCK

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
oncall("fboss_distro")
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
FROM quay.io/centos/centos:stream9

RUN dnf install -y 'dnf-command(config-manager)' && \
    dnf config-manager --set-enabled crb && \
    dnf install -y epel-release epel-next-release && \
    dnf install -y --allowerasing \
    wget curl tcpdump zstd iputils which net-tools iproute \
    man dnsmasq vim nginx procps-ng && \
    dnf clean all && rm -rf /var/cache/dnf

RUN mkdir -p /distro_infra/dnsmasq_conf.d
COPY parts/run_distro_infra.sh /distro_infra
COPY parts/post_tftp.sh /distro_infra
COPY parts/ipxe/ipxev4.efi /distro_infra
COPY parts/ipxe/ipxev6.efi /distro_infra
COPY parts/autoexec.ipxe /distro_infra
COPY parts/nginx.conf /distro_infra

RUN mkdir -p /distro_infra/persistent
WORKDIR /distro_infra/persistent

fboss-image/distro_infra/README.md

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
Distro Infrastructure
=====================

This directory contains the FBOSS Distro Infrastructure container. Right now it provides only the necessary support for
IPv4 and IPv6 PXE boot. Under IPv4 the network must have a DHCP server providing IP addresses. Under IPv6 the network
must **NOT** have a DHCP server providing IP addresses.

Building
--------

Build the container with `./build.sh`.

Usage
-----

Start the container by running the `distro_infra.sh` script. This script takes two required options:

- `--intf <interface>`: The interface to attach to. This must have L2 adjacency with the management port of the FBOSS DUTs
- `--persist-dir <dir>`: The directory to use for persistent storage, primarily of images to load

For example, `mkdir images; ./distro_infra.sh --intf vlan1033 --persist-dir images`.

This will start an interactive tool to configure supplying PXE boot options to a given MAC address. Exiting the tool
terminates the container.

The first time a MAC address is given, a `<MAC>` directory under the persistent directory will be created. The image
files to boot must be manually extracted into this directory. For example:

```
$ ./distro_infra.sh --intf vlan1033 --persist-dir images
Listening on vlan1033 - 10.250.33.194
dnsmasq: started, version 2.85 DNS disabled
dnsmasq: compile time options: IPv6 GNU-getopt DBus no-UBus no-i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth
cryptohash DNSSEC loop-detect inotify dumpfile
dnsmasq-dhcp: DHCP, proxy on subnet 10.250.33.194
...
Enter MAC address (blank to exit): DC-DA-4D-FC-AD-2D
```

Now `images/dc-da-4d-fc-ad-2d` has been created. The following files must be placed in that directory, either by
copying them (or hardlinking them from the `images/cache` directory, which is also created) or by extracting them from
the PXE installer tarball, which contains these precise names:

```
$ cd images/dc-da-4d-fc-ad-2d
$ tar -xf fboss-distro-image_pxe.tar
$ ls -1
FBOSS-Distro-Image.x86_64-1.0.config.bootoptions
FBOSS-Distro-Image.x86_64-1.0.initrd
FBOSS-Distro-Image.x86_64-1.0.kernel
FBOSS-Distro-Image.x86_64-1.0.sha256
FBOSS-Distro-Image.x86_64-1.0.xz
pxeboot.FBOSS-Distro-Image.x86_64-1.0.kernel
pxeboot.FBOSS-Distro-Image.x86_64-1.0.initrd
```

Other files will be generated and populated automatically.

The DUT can then be rebooted and PXE boot will start. Once PXE boot has completed, the Distro Infrastructure container
will stop serving PXE boot to that MAC. To serve PXE boot again, re-enter the MAC address at the menu. It is not
necessary to recopy the image files.

To terminate the script and container, enter a blank MAC address at the prompt: `Enter MAC address (blank to exit):`.
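
The README session enters `DC-DA-4D-FC-AD-2D` at the prompt and ends up with a directory named `images/dc-da-4d-fc-ad-2d`, which is also the form `${net0/mac:hexhyp}` takes in the HTTP URLs of the test log. The code that performs this normalization is not part of the files shown here, so the following is only a minimal sketch of how it could be done; the function name `normalize_mac` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not from the commit): convert a user-supplied MAC address
# to the lower-case, hyphen-separated form used for the per-MAC directory name.
normalize_mac() {
  local mac
  local re='^([0-9a-f]{2}-){5}[0-9a-f]{2}$'
  # Lower-case the hex digits and turn ':' separators into '-'.
  mac="$(printf '%s' "$1" | tr 'A-Z:' 'a-z-')"
  if [[ "${mac}" =~ ${re} ]]; then
    printf '%s\n' "${mac}"
  else
    # Reject anything that is not six hyphen-separated hex octets.
    return 1
  fi
}
```

Calling `normalize_mac DC:DA:4D:FC:AD:2D` would print the directory name to create under the persistent directory; a non-zero return can be used to re-prompt.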

fboss-image/distro_infra/build.sh

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
#!/bin/bash
set -e

pushd parts/ipxe
./build.sh
popd

DOCKER_BUILDKIT=1 docker build . -t fboss_distro_infra
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
#!/bin/bash

INTERFACE=""
PERSIST_DIR=""

help() {
  echo "Usage: $0 --intf <interface> --persist-dir <persistent dir>"
  echo ""
  echo "Options:"
  echo ""
  echo " -i|--intf <interface> Network interface to attach to (must have L2 adjacency with FBOSS duts)"
  echo " -p|--persist-dir <dir> Directory for persistent storage, primarily of images to load"
  echo ""
  echo " -h|--help Print this help message"
  echo ""
  echo "Examples:"
  echo " $0 --intf vlan1033 --persist-dir persist"
  echo " $0 -i vlan1033 -p persist"
  echo ""
}

if [[ $# -eq 0 ]]; then
  echo "Error: No arguments provided"
  help
  exit 1
else
  while [[ $# -gt 0 ]]; do
    case "$1" in
    -i | --intf)
      INTERFACE="$2"
      # 'shift 2' fails without shifting when the value is missing, which would loop forever
      shift 2 || { echo "Error: '$1' requires a value"; help; exit 1; }
      ;;
    -p | --persist-dir)
      PERSIST_DIR="$2"
      # Same guard as above for a trailing option with no value
      shift 2 || { echo "Error: '$1' requires a value"; help; exit 1; }
      ;;
    -h | --help)
      help
      exit 0
      ;;
    *)
      echo "Error: Unrecognized command option: '${1}'"
      help
      exit 1
      ;;
    esac
  done
fi

if [[ -z $INTERFACE || -z $PERSIST_DIR ]]; then
  echo "Error: --intf and --persist-dir are required"
  help
  exit 1
fi

mkdir -p "${PERSIST_DIR}"

# Run the Docker container with the parsed arguments
docker run --rm -it --network host --cap-add=NET_ADMIN \
  --volume "$(realpath "${PERSIST_DIR}")":/distro_infra/persistent:rw \
  fboss_distro_infra /distro_infra/run_distro_infra.sh --intf "${INTERFACE}"
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
#!ipxe

# Under IPv6 we'll already have an IP address and will have set server_ip via the generated intermediate -serverip
# script. With IPv4, this script is the first thing we'll have executed and so need to get an IP and set server_ip here.

# IPv4-only
isset ${server_ip} || ifconf -c dhcp
isset ${server_ip} || set server_ip ${next-server}

# Common to both IPv6 and IPv4
set hbase http://${server_ip}:6969/${net0/mac:hexhyp}
set tbase tftp://${server_ip}

set stub_name FBOSS-Distro-Image.x86_64-1.0

initrd ${hbase}/pxeboot.${stub_name}.initrd

kernel ${hbase}/pxeboot.${stub_name}.kernel console=ttyS4,57600n8 nomodeset rd.neednet=1 rd.kiwi.install.pxe ip=eno1:dhcp rd.kiwi.install.image=${hbase}/${stub_name}.xz rd.kiwi.install.target=/dev/nvme0n1

# Download a marker file via TFTP so we can use it as a signal to remove the dnsmasq configuration
imgfetch ${tbase}/pxeboot_complete /pxeboot_complete

boot
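
The marker-file comment above relies on something server-side noticing the `pxeboot_complete` transfer. The commit ships `parts/post_tftp.sh` for this, but its contents are not shown here, so the following is only a sketch of one way it might work. It assumes the script is registered as dnsmasq's `--dhcp-script`, which dnsmasq invokes with the action `tftp`, the file size, the client address, and the full file path when a TFTP transfer completes. The function name `on_tftp_complete` and the `CONF_DIR` variable are hypothetical; the paths mirror those in the test log.

```shell
#!/bin/bash
# Hypothetical sketch, not the commit's post_tftp.sh.
# Per-MAC dnsmasq config files live here according to the test log.
CONF_DIR="${CONF_DIR:-/distro_infra/dnsmasq_conf.d}"

# Arguments as passed by dnsmasq for TFTP events: tftp <size> <client address> <file path>
on_tftp_complete() {
  local action="$1" path="$4" mac
  # Ignore DHCP lease events and any TFTP transfer other than the marker file.
  [ "${action}" = "tftp" ] || return 0
  [ "${path##*/}" = "pxeboot_complete" ] || return 0
  # The marker is served from <persistent>/<mac>/pxeboot_complete, so the
  # parent directory name identifies the host that just finished booting.
  mac="${path%/*}"
  mac="${mac##*/}"
  # Removing the per-MAC config stops dnsmasq from offering PXE boot again.
  rm -f "${CONF_DIR}/${mac}"
  echo "${mac} PXE booted, disabling future PXE boot provisioning"
}
```

Using a TFTP fetch as the signal keeps the client side trivial: iPXE needs no extra tooling, and the server learns about completion from an event dnsmasq already reports.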
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
#!ipxe

ifconf -c ipv6

# When booting over IPv6, iPXE only receives a fully-formed bootfile-url DHCPv6 option and it appears there is no way to
# give just iPXE other options. bootfile-url becomes the iPXE ${filename} setting, but is a full URL and iPXE scripting
# is not powerful enough to extract just the server IP from it so we can use HTTP downloading for the large artifacts.
# Thus we autogenerate this iPXE script simply to set the server IP to be used by autoexec.ipxe.
imgexec --replace ${filename}-serverip
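
The generator for the `-serverip` script is not among the files shown in this diff. As a rough illustration of what the comment above describes, the sketch below emits an iPXE script that sets `server_ip` and hands control to the shared `autoexec.ipxe`. The function name `gen_serverip_script` is hypothetical, and the exact commands in the real generated script may differ.

```shell
#!/bin/bash
# Hypothetical sketch of a generator for the <bootfile>-serverip iPXE script.
gen_serverip_script() {
  local v6ip="$1"
  printf '%s\n' '#!ipxe'
  # A literal IPv6 address must be bracketed to be usable inside a URL.
  printf 'set server_ip [%s]\n' "${v6ip}"
  # Single quotes keep ${server_ip} literal so iPXE, not bash, expands it.
  printf '%s\n' 'chain --replace tftp://${server_ip}/autoexec.ipxe'
}
```

For example, `gen_serverip_script fc00:33::89 > ipxev6.efi-serverip` would write the script into the current directory under the name that the embedded `imgexec --replace ${filename}-serverip` requests in the test log.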
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
#!/bin/bash

set -e

# We need to build a custom iPXE executable for IPv6 so we can include the autoipv6.ipxe script which works around
# the lack of a 'source server' during IPv6 boot.
if [ ! -d ipxe ]; then
  git clone https://github.com/ipxe/ipxe.git
fi

cd ipxe/src
make bin-x86_64-efi/ipxe.efi
cp bin-x86_64-efi/ipxe.efi ../../ipxev4.efi
make bin-x86_64-efi/ipxe.efi EMBED=../../autoipv6.ipxe
cp bin-x86_64-efi/ipxe.efi ../../ipxev6.efi
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# Most basic nginx config which statically serves /distro_infra/persistent on port 6969

daemon on;
worker_processes 2;
user nginx;

events {
    use epoll;
    worker_connections 128;
}

error_log nginx_error.log info;

http {
    default_type application/octet-stream;
    access_log nginx_access.log;
    sendfile on;
    keepalive_timeout 65;

    server {
        listen V4IP:6969 default_server;
        listen [V6IP]:6969 default_server;
        server_name localhost;
        root /distro_infra/persistent;

        location / {
            try_files $uri $uri/ =404;
            autoindex on;
        }
    }
}
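
`V4IP` and `V6IP` in the `listen` directives are not valid nginx addresses, so they are presumably placeholders filled in with the interface addresses at container start (the test log prints `Listening on vlan1033 - 10.250.33.194 & fc00:33::89`). That substitution is not shown in this diff; a minimal sketch of it, with the hypothetical function name `render_nginx_conf`, could look like this.

```shell
#!/bin/bash
# Hypothetical sketch: fill the V4IP / V6IP placeholders of the nginx template.
# Reads the template on stdin and writes the rendered config to stdout.
render_nginx_conf() {
  local v4ip="$1" v6ip="$2"
  # Neither address form contains '/', so it is safe as the sed delimiter.
  sed -e "s/V4IP/${v4ip}/g" -e "s/V6IP/${v6ip}/g"
}
```

A possible invocation would be `render_nginx_conf 10.250.33.194 fc00:33::89 < /distro_infra/nginx.conf > /tmp/nginx.conf`, after which nginx could be started with `-c /tmp/nginx.conf`; the output path here is illustrative only.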
