Commit ef1e5dc

binhuang00 authored and meta-codesync[bot] committed
Add FBOSS platform test requirements to OSS
Summary:
- Add the first version of the FBOSS platform test requirements
- This is the minimum set of tests that a vendor should perform before delivering HW/SW to Meta

Reviewed By: mikechoifb

Differential Revision: D84682080

fbshipit-source-id: 9d0868b82be900ab3142b318fa585f9ea10242d1
1 parent 2afc42a commit ef1e5dc

1 file changed

+183
-0
lines changed

@@ -0,0 +1,183 @@
1+
# FBOSS Platform Testing Requirements
2+
3+
## Introduction
4+
5+
This document defines the minimum set of FBOSS platform tests that
6+
vendors must perform before every software and hardware release. Vendors are
7+
responsible for adding additional tests needed based on the platform design
8+
and features, to ensure units shipped to Meta are ready for Meta internal
9+
platform development.
10+
11+
## x86
12+
13+
### Two months before shipping the Unit to Meta
14+
15+
* Represent the Platform with PlatformManager configuration and ensure it passes validation.
16+
17+
### Before every software/hardware release, run the following tests
18+
19+
#### Software Tests
20+
21+
* Run all the tests in the fboss/platform directory. These tests can be run on any host. They are especially important when there are code changes to the fboss/platform directory (in addition to configuration changes). Note: These tests also run on GitHub when the vendor raises a PR.
22+
23+
Test cases are listed in this file under “fboss_platform_services”; the software tests are those whose names do not end with “_hw_test”: [https://github.com/facebook/fboss/blob/main/CMakeLists.txt\#L968](https://github.com/facebook/fboss/blob/main/CMakeLists.txt#L968)
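
The naming rule above can be sketched as a small filter. This is only an illustration, assuming the test target names have already been collected into a list; the function name `split_platform_tests` is hypothetical and not part of FBOSS.

```python
def split_platform_tests(names):
    """Partition test target names into (software, hardware) lists.

    Names ending in "_hw_test" are treated as hardware tests; every
    other name is treated as a software test that can run on any host.
    """
    software = [n for n in names if not n.endswith("_hw_test")]
    hardware = [n for n in names if n.endswith("_hw_test")]
    return software, hardware


if __name__ == "__main__":
    # "example_sw_test" is a made-up name used only for illustration.
    sw, hw = split_platform_tests(
        ["example_sw_test", "sensor_service_hw_test", "weutil_hw_test"]
    )
    print("software:", sw)
    print("hardware:", hw)
```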
24+
25+
#### Hardware Tests
26+
27+
* These tests must run on the actual HW platform. Note: These tests do not run on GitHub PRs. Test cases are listed in this file under “fboss_platform_services”; the hardware tests are those whose names end with “_hw_test”: [https://github.com/facebook/fboss/blob/main/CMakeLists.txt\#L968](https://github.com/facebook/fboss/blob/main/CMakeLists.txt#L968)
28+
29+
* platform_manager_hw_test
30+
31+
* sensor_service_hw_test
32+
33+
* fan_service_hw_test
34+
35+
* data_corral_service_hw_test
36+
37+
* weutil_hw_test
38+
39+
* fw_util_hw_test
40+
41+
---
42+
43+
## BMC {#bmc}
44+
45+
| Category | Test Name | Description |
46+
| :---: | ----- | ----- |
47+
| **Basics** | fbobmc.bmc.bootup | Power on the switch and make sure OpenBMC can boot up to the login prompt successfully without user interactions. |
48+
| | fbobmc.uboot.version | Boot up OpenBMC and make sure u-boot version is v2019.04 |
49+
| | fbobmc.kernel.version | Boot up OpenBMC and make sure kernel version is Linux v6.6 |
50+
| | fbobmc.yocto.version | Boot up OpenBMC and make sure yocto version is lf-master |
51+
| | fbobmc.system.initializer | Boot up OpenBMC and make sure systemd is the system initializer (rather than sysV) |
52+
| | fbobmc.reboot.bmc.wont.affect.userver.oob | Reboot OpenBMC (by running "reboot") and make sure uServer is still reachable (pingable & sshable) while OpenBMC is booting. |
53+
| | fbobmc.reboot.bmc.wont.affect.userver.inband | Reboot OpenBMC (by running "reboot") and make sure uServer inband traffic is not affected while OpenBMC is booting. |
54+
| | fbobmc.reboot.userver.wont.affect.openbmc | Reboot uServer (by running "reboot" from uServer) and make sure OpenBMC is reachable (pingable & sshable) when and after uServer is booting up. |
55+
| | fbobmc.reboot.userver.bmc.in.parallel | Reboot uServer and OpenBMC at the same time (by running the "reboot" command) at least 10 times: make sure OpenBMC and uServer OS can always boot up independently. |
56+
| | fbobmc.no.kernel.panic | Boot up OpenBMC and do not reboot OpenBMC for at least 7 days: make sure there is no kernel panic during the period. |
57+
| | fbobmc.upgrade.primary.flash | Make sure "flash0" mtd partition is pointed to the primary flash, and OpenBMC's primary flash can be upgraded using "flashrom \<openbmc-image\> /dev/mtd\#" command successfully. |
58+
| | fbobmc.upgrade.2nd.flash | Make sure "flash1" mtd partition is pointing to the 2nd flash, and OpenBMC's secondary flash can be upgraded using "flashrom \<openbmc-image\> /dev/mtd\#" command successfully. |
59+
| | fbobmc.data0.partition | Boot up OpenBMC and make sure /mnt/data is mounted to UBIFS based flash data0 partition, and the data0 partition size is 64MB. |
60+
| **Serial Console** | fbobmc.bmc.console.access | Make sure OpenBMC u-boot, kernel and userspace console output can be accessed from the front-panel console port at baudrate 9600\. |
61+
| | fbobmc.userver.console.access | Run "sol.sh" from OpenBMC and make sure uServer UEFI and Kernel serial logs can be accessed properly, and "ctrl+x" can quit uServer console successfully. |
62+
| | fbobmc.mterm.service | Reboot uServer and make sure uServer UEFI and Kernel serial logs can be captured by OpenBMC and stored in /var/log/mTerm_wedge.log. |
63+
| **Ethernet** | fbobmc.eth0.consistent.mac.address | Reboot OpenBMC 10 times and make sure OpenBMC eth0's MAC address is consistent across reboots, and the MAC address matches the BMC MAC stored in the chassis EEPROM. |
64+
| | fbobmc.eth0.dhcp.ipv6.address | Boot up OpenBMC and make sure OpenBMC can obtain global ipv6 address automatically after bootup. |
65+
| | fbobmc.eth0.reachable.from.outside | Boot up OpenBMC and make sure OpenBMC eth0's global ipv6 address is pingable and sshable from outside the switch. |
66+
| | fbobmc.eth0.global.ipv6.from.userver | Make sure OpenBMC eth0's global ipv6 address is pingable and sshable from uServer OS. |
67+
| | fbobmc.eth0.link.local.ipv6.from.userver | Make sure OpenBMC eth0's link local ipv6 address is pingable and sshable from uServer OS. |
68+
| | fbobmc.eth0.4088.reachable | Make sure OpenBMC eth0.4088 interface is pingable and sshable from uServer OS. |
69+
| | fbobmc.eth0.dhcp.id | Make sure OpenBMC includes the pre-defined vendor options in its DHCPv6 request. Refer to [https://github.com/facebook/openbmc/blob/helium/common/recipes-core/systemd-networkd/files/10-eth0.network](https://github.com/facebook/openbmc/blob/helium/common/recipes-core/systemd-networkd/files/10-eth0.network) |
70+
| | fbobmc.eth0.lldp_util | Make sure lldp-util in OpenBMC can detect the uplink/management switch |
71+
| | fbobmc.mdio.oob.switch.access | |
72+
| **EEPROM Access** | fbobmc.weuti.chassis.eeprom | Run "weutil" from OpenBMC and make sure "weutil" can parse and print the Chassis EEPROM content properly. |
73+
| | fbobmc.weuti.chassis.eeprom.fields | From output of weutil, all mandatory fields are populated: 5 fields (Product Name, Product Production State, Product Version, Product Subversion, Product Serial Number) |
74+
| | fbobmc.weuti.chassis.eeprom.crc | From output of weutil, EEPROM checksum is calculated and matches EEPROM programmed checksum. The output should show that in the very last line e.g. CRC16: 0x33ce (CRC Matched) |
75+
| | fbobmc.weutil.scm.eeprom | Run "weutil --eeprom scm" or "weutil --path <path_for_scm>" from OpenBMC and make sure "weutil" can parse and print the scm EEPROM content properly. |
76+
| | fbobmc.userver.mac.address | |
77+
| **Watchdog** | fbobmc.primary.watchdog.enabled.by.default | Boot up OpenBMC and run "devmem 0x1e78500c": make sure both bit 0 and bit 1 are set. |
78+
| | fbobmc.dual.boot.enabled.by.default | Reboot OpenBMC and interrupt u-boot: run "otp info strap" in the u-boot command line and make sure 0x2b (OTPSTRAP\[43\] trap_en_bspiabr) is set to 1\. |
79+
| | fbobmc.dual.boot.wdt.disabled.after.boot | Boot up OpenBMC and run "devmem 0x1e620064": make sure bit 0 is 0 (means FMC_WDT2 watchdog is disabled). |
80+
| | fbobmc.force.boot.from.2nd.flash | Login OpenBMC and run "boot_info.sh bmc reset slave": make sure OpenBMC boots from the 2nd flash at next bootup. |
81+
| | fbobmc.primary.kernel.corrupted | Corrupt the primary flash's "fit" partition and then reboot OpenBMC: make sure OpenBMC can boot from the 2nd flash successfully. |
82+
| | fbobmc.primary.uboot.corrupted | Corrupt the primary flash's "u-boot" partition and then reboot OpenBMC: make sure OpenBMC can boot from the 2nd flash successfully. |
83+
| | fbobmc.restore.boot.order | Force OpenBMC to boot from the 2nd flash, and then run "boot_info.sh bmc reset master": make sure OpenBMC is booted from the primary flash at next bootup. |
84+
| **Power Control** | fbobmc.reset.userver | Login OpenBMC and run "wedge_power.sh reset": make sure userver is reset properly (uServer OS will reboot successfully after "wedge_power.sh reset"). |
85+
| | fbobmc.reset.userver.in.parallel | Login OpenBMC and run "wedge_power.sh reset" from different terminals at the same time: make sure uServer is reset properly (uServer OS will reboot successfully after "wedge_power.sh reset"). |
86+
| | fbobmc.reset.chassis | Login OpenBMC and run "wedge_power.sh reset -s": make sure both BMC and userver are reset successfully. |
87+
| **IPMI** | fbobmc.ipmid.service | Login OpenBMC and make sure ipmid.service is running properly (systemctl status ipmid) |
88+
| | fbobmc.kcsd.service | Login OpenBMC and make sure kcsd@\[0-2\].service is running properly (systemctl status kcsd@0\|1\|2) |
89+
| | fbobmc.ipmi.mc.info.from.userver | Login uServer and run "ipmitool mc info": make sure the command can return successfully. |
90+
| | fbobmc.ipmi.sel.list.from.userver | Login uServer and run "ipmitool sel list": make sure the command can return successfully. |
91+
| | fbobmc.ipmi.sel.injection.from.userver | Login uServer and make sure SEL entries can be added manually. For example, "ipmitool raw 0x0a 0x44 0x01 0x00 0x02 0xab 0xcd 0xef 0x00 0x01 0x00 0x04 0x01 0x17 0x00 0xa0 0x04 0x07 01 00" and then "ipmitool sel list". |
92+
| **Recover Path** | fbobmc.bios.deselected.by.default | Login OpenBMC and make sure the BIOS chip is not reachable (deselected) by default. |
93+
| | fbobmc.recover.bios | Login OpenBMC and make sure the BIOS chip can be upgraded by "bios_util" command successfully. |
94+
| | fbobmc.recover.bios.in.parallel | Login OpenBMC and run "bios_util" from multiple terminals at the same time: make sure the first instance can upgrade BIOS successfully, and all the following "bios_util" instances fail with "Error: another instance is running". |
95+
| **Misc** | fbobmc.ssh.to.userver.via.usb0 | Login OpenBMC and make sure people can ssh to the userver by running "ssh root@fe80::2%usb0". |
96+
| | fbobmc.ssh.to.bmc.via.usb0 | Login uServer and make sure people can ssh to the OpenBMC by running "ssh root@fe80::1%usb0" |
97+
| | fbobmc.tpm | Login OpenBMC and make sure "/dev/tpm0" device is created successfully. |
98+
| | fbobmc.restapi | Make sure OpenBMC is reachable via restapi. For example "curl [http://localhost:8080/api/sys/bmc](http://localhost:8080/api/sys/bmc)" |
99+
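
Several checks in the table above reduce to inspecting command output: the watchdog rows test individual bits of a register value read with "devmem", and the EEPROM CRC row inspects the last line printed by "weutil". The sketch below shows one way to automate those two checks. It assumes that "devmem" prints a single hexadecimal value such as `0x00000093` and that the output has already been captured as a string; the helper names `bits_set` and `crc_matched` are hypothetical.

```python
def bits_set(devmem_output, bits):
    """Return True if every bit position in `bits` is set in the value.

    `devmem_output` is assumed to be one hexadecimal number,
    for example "0x00000093".
    """
    value = int(devmem_output.strip(), 16)
    return all(value & (1 << b) for b in bits)


def crc_matched(weutil_output):
    """Return True if the last non-empty line reports a matching CRC,
    e.g. "CRC16: 0x33ce (CRC Matched)"."""
    lines = [line for line in weutil_output.splitlines() if line.strip()]
    return bool(lines) and "CRC Matched" in lines[-1]


if __name__ == "__main__":
    # Primary watchdog check: bit 0 and bit 1 must both be set.
    print(bits_set("0x00000093", (0, 1)))
    # FMC_WDT2 check: bit 0 must be clear after boot.
    print(not bits_set("0x00000010", (0,)))
```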
100+
---
101+
102+
## BSP
103+
104+
Please refer to: [https://facebook.github.io/fboss/docs/testing/bsp_tests/](https://facebook.github.io/fboss/docs/testing/bsp_tests/)
105+
106+
---
107+
108+
## Firmware {#firmware}
109+
110+
As we do not have an open-source firmware test repository to share with vendors,
110+
we expect them to perform the basic tests outlined below by writing their own
112+
scripts. Some of these tests will be manual in nature. This represents the
113+
minimum firmware testing qualification that Meta expects vendors to meet
114+
before delivering firmware binaries.
115+
116+
####
117+
118+
| Basic Tests | Details |
119+
| ----- | ----- |
120+
| Firmware Upgrade | Upgrade to the new binary |
121+
| Firmware Downgrade | Revert to the previous binary |
122+
| Power Cycle | Power cycle between tests to make sure the box comes back up after upgrade and downgrade |
123+
| BMC OOB Reachability | Verify that the OOB is ssh’able after upgrade |
124+
| x86 Reachability | Verify that x86 is ssh’able after upgrade |
125+
| ASIC Detection | Verify that ASIC can be detected on the PCI bus after upgrade |
126+
| Memory & CPU Consumption | Verify the Memory & CPU consumption is within acceptable threshold after FW Upgrade |
127+
| LED & Display Testing | Check that all LEDs, displays, and indicators function as expected (blinking patterns, color changes) |
128+
| Buttons | Test all buttons (push button, switch buttons), if any, to ensure they respond correctly and consistently |
129+
| FW version Readout | Verify that the version can be read after upgrade and downgrade and that it matches the expected version |
130+
| BIOS & CPU_CPLD Backup Path | Verify that BIOS & CPU_CPLD firmware upgrade through the OpenBMC backup path works as expected |
131+
| | |
132+
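
Because vendors are expected to write their own scripts for these tests, the following is a minimal sketch of how the upgrade, downgrade, power cycle, and version readout rows could be chained together. All three callables (`flash`, `power_cycle`, `read_version`) are hypothetical placeholders for vendor-specific tooling; nothing here reflects an actual Meta or FBOSS interface.

```python
def check_upgrade_downgrade(flash, power_cycle, read_version,
                            new_version, old_version):
    """Upgrade, then downgrade, power cycling and reading back the
    version after each step.

    Returns a list of failure messages; the list is empty when every
    version readout matches the version that was just programmed.
    """
    failures = []
    for target in (new_version, old_version):
        flash(target)        # hypothetical: program the given firmware version
        power_cycle()        # hypothetical: power cycle the unit
        actual = read_version()  # hypothetical: read the running version
        if actual != target:
            failures.append(f"expected {target}, read {actual}")
    return failures


if __name__ == "__main__":
    # Simulated device used only to exercise the control flow.
    device = {"version": "1.0"}
    result = check_upgrade_downgrade(
        flash=lambda v: device.update(version=v),
        power_cycle=lambda: None,
        read_version=lambda: device["version"],
        new_version="2.0",
        old_version="1.0",
    )
    print("failures:", result)
```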
133+
---
134+
135+
## Stress Test
136+
137+
The FBOSS platform software vendor stress test ensures the stability of
138+
FBOSS platform software on the underlying HW, so that the Meta team can be
139+
confident that platform software that passes the stress test requirements is
140+
ready to exit each development phase.
141+
142+
### Stress Test Prerequisites
143+
144+
#### Platform SW stack stress test components on x86
145+
146+
1. Linux Kernel ready
147+
2. Pass all BSP tests
148+
3. Pass all platform SW tests
149+
4. Pass all HW tests for all services and utils
150+
   1. Platform Manager
151+
   2. Fan Service
152+
   3. Sensor Service
153+
   4. Data Corral Service
154+
   5. FW util
155+
   6. Weutil
156+
5. Pass all firmware tests in the [Firmware](#firmware) section
157+
158+
#### SW stack stress test components on BMC
159+
160+
1. Pass all BMC tests in the [BMC](#bmc) section
161+
162+
### Stress test cases (all must pass without failure)
163+
164+
| Category | Test Name | Sub Name | Details |
165+
| ----- | ----- | :---: | ----- |
166+
| **x86 BSP** | Kernel module load/unload | | Run bsp kmod test for 1000 times ([https://github.com/facebook/fboss/tree/main/fboss/platform/bsp_tests](https://github.com/facebook/fboss/tree/main/fboss/platform/bsp_tests)) |
167+
| | SPI Access | | Read and write each SPI device for 1000 times |
168+
| | FPGA | I2C Transactions | Read and Write I2C bus for each I2C device for 1000 times |
169+
| | FPGA | MDIO | Read and write scratch pad register 100 times |
170+
| **x86 Platform Services and Utils** | Platform Services | | Run all platform services non-stop for 7 days without memory leak, crash and exception |
171+
| | Platform Utils | fw_util | Program (upgrade-\>reboot-\>downgrade-\>reboot) all firmware continually for 100 times without failure |
172+
| | Platform Utils | weutil | Use weutil to get all the EEPROM info for 1000 times without failure |
173+
| **BMC Platform** | Primary flash BMC Update | | Perform primary flash OpenBMC upgrade/downgrade 1000 times |
174+
| | Secondary flash BMC Update | | Perform secondary flash OpenBMC upgrade/downgrade 1000 times |
175+
| | I2C Transaction | | Read and Write I2C bus for each I2C device for 1000 times |
176+
| | Reboot | BMC Reboot | Reboot OpenBMC 1000 times from the primary flash without issue; it must not boot into the secondary flash. Reboot OpenBMC 1000 times from the secondary flash without issue; it must not boot into the primary flash. Also make sure that: (1) OpenBMC can always boot from the primary flash ("boot_info.sh bmc" reports the boot source); (2) OpenBMC bootup time is within 5 minutes, including when the data0 partition needs to be re-formatted. |
177+
| | Reboot | X86 Reboot | Login OpenBMC and run "wedge_power.sh reset" for at least 1,000 times: make sure uServer is always reset properly (uServer OS will reboot successfully after "wedge_power.sh reset"). |
178+
| | IPMI Stress Test | | Use ipmitool to verify BMC system info for 1000 times |
179+
| | Primary/Secondary OpenBMC Boot Swap | | Boot swap test for 1000 times. E.g. boot from primary \-\> secondary \-\> primary, so on and so forth |
180+
| | BMC Recovery Path BIOS | | Log into OpenBMC and try to upgrade/downgrade BIOS in OpenBMC for 1000 times |
181+
| **Whole System** | Power Cycle | | Log into OpenBMC and run "wedge_power.sh reset \-s" 1000 times, using the x86 and OpenBMC kernels that Meta requires vendors to use; the whole chassis (x86 and BMC) must boot successfully each time |
182+
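
Most rows in the table above repeat one operation a fixed number of times and require zero failures. A generic loop like the sketch below can drive such cases. It assumes the operation is a command that signals failure through a non-zero exit code; the helper name `run_repeatedly` is hypothetical, and a real stress script would also need to capture logs and wait for reboots to complete.

```python
import subprocess


def run_repeatedly(command, iterations):
    """Run `command` (a list of arguments) `iterations` times.

    Returns None if every run exits with status 0, otherwise the
    1-based index of the first failing iteration.
    """
    for i in range(1, iterations + 1):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return i
    return None


if __name__ == "__main__":
    import sys

    # Harmless stand-in command; on a real unit this might be ["weutil"]
    # with iterations=1000, as in the weutil row above.
    first_failure = run_repeatedly([sys.executable, "-c", "pass"], 3)
    print("first failing iteration:", first_failure)
```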
183+
####
