Commit 99499fd

Nirdesh S A authored and GitHub Enterprise committed
AIE/D/08-n-body-simulator: update README.md
1 parent 4db2277 commit 99499fd

5 files changed: +13 -21 lines changed

AI_Engine_Development/AIE/Design_Tutorials/08-n-body-simulator/Module_03_pl_kernels/README.md

Lines changed: 2 additions & 5 deletions
@@ -62,7 +62,7 @@ After coming up with 400 tile AI Engine design, the next step is the come up wit
 |`packet_receiver`|Packet switching kernel that evaluates packet headers from incoming streams and reroutes data to one of 4 AXI4-Streams|499.5 MHz|
 |`s2mm_mp`|Quad-channel data-mover that moves data from AXI4-Stream to DDR.|411 MHz|

-Using Vivado timing closure techniques, you can increase the FMax if needed. To showcase the example, integrate using the 300 MHz clock. There is also a 400 MHz timing-closed design in the [beamforming tutorial](https://github.com/Xilinx/Vitis-Tutorials/tree/master/AI_Engine_Development/Design_Tutorials/03-beamforming).
+Using Vivado timing closure techniques, you can increase the FMax if needed. To showcase the example, integrate using the 300 MHz clock. There is also a 400 MHz timing-closed design in the [beamforming tutorial](../../03-beamforming).

 ![alt text](images/pl_kernels_highlighted.PNG)

@@ -95,10 +95,7 @@ The `s2mm_mp` kernel is generated from the `kernel/spec.json` specification. Rev

 * [Vitis Utilities Library Documentation](https://docs.amd.com/r/en-US/Vitis_Libraries/utils/index.html)

-* [Generating PL Data-Mover Kernels](https://docs.amd.com/r/en-US/Vitis_Libraries/utils/datamover/kernel_gen_guide.html)
-
-* [Vitis Compiler Command](https://docs.amd.com/r/en-US/ug1393-vitis-application-acceleration/v-Command)
-
+* [Vitis Compiler Command](https://docs.amd.com/r/en-US/ug1399-vitis-hls/vitis-v-and-vitis-run-Commands)
 ## Next Steps

 After compiling the PL datamover kernels, you are ready to link the entire hardware design together in the next module, [Module 04 - Full System Design](../Module_04_full_system_design).

AI_Engine_Development/AIE/Design_Tutorials/08-n-body-simulator/Module_04_full_system_design/README.md

Lines changed: 2 additions & 2 deletions
@@ -50,9 +50,9 @@ The following image was taken from the Vivado project for the entire design. It

 ## References

-* [Beamforming Tutorial - Module_04 - AI Engine and PL Integration](https://github.com/Xilinx/Vitis-Tutorials/tree/master/AI_Engine_Development/Design_Tutorials/03-beamforming)
+* [Beamforming Tutorial - Module_04 - AI Engine and PL Integration](../../03-beamforming)

-* [Vitis Compiler Command](https://docs.amd.com/r/en-US/ug1393-vitis-application-acceleration/v-Command)
+* [Vitis Compiler Command](https://docs.amd.com/r/en-US/ug1399-vitis-hls/vitis-v-and-vitis-run-Commands)

 ## Next Steps

AI_Engine_Development/AIE/Design_Tutorials/08-n-body-simulator/Module_05_host_sw/README.md

Lines changed: 4 additions & 2 deletions
@@ -112,8 +112,10 @@ The following is the general execution flow for the host applications.

 * [XRT Github Repo](https://github.com/Xilinx/XRT)

-* [Vitis Developing Application Documentation](https://docs.amd.com/r/en-US/ug1393-vitis-application-acceleration/Developing-Applications)
-* [Vitis Building-and-Running-the-Application Documentation](https://docs.amd.com/r/en-US/ug1393-vitis-application-acceleration/Building-and-Running-the-Application)
+* [Vitis Developing Application Documentation](https://docs.amd.com/r/en-US/ug1701-vitis-accelerated-embedded/Developing-Vitis-Kernels-and-Applications)
+
+* [Vitis Building-and-Running-the-Application Documentation](https://docs.amd.com/r/en-US/ug1701-vitis-accelerated-embedded/Building-and-Running-the-System)
+

 ## Next Steps
 After compiling the host software, you are ready to create the sd_card.img and run the design on hardware in the next module, [Module 06 - SD Card and Hardware Run](../Module_06_sd_card_and_hw_run).

AI_Engine_Development/AIE/Design_Tutorials/08-n-body-simulator/Module_07_results/README.md

Lines changed: 2 additions & 2 deletions
@@ -33,8 +33,8 @@ Following is a table comparing the executions times to simulate 12,800 particles
 |Name|Hardware|Algorithm|Average Execution Time for 1 Timestep (seconds)|
 |---|---|--|---|
 |Python NBody Simulator|x86 Linux Machine|O(N)|14.96|
-|C++ NBody Simulator|A72 Embedded Arm Processor|O(N<sup>2</sup>)|120.487|
-|AI Engine NBody Simulator|Versal AI Engine IP|O(N)|0.0118|
+|C++ NBody Simulator|A72 Embedded Arm Processor|O(N<sup>2</sup>)|120.591|
+|AI Engine NBody Simulator|Versal AI Engine IP|O(N)|0.0074065|

As you can see, the N-Body Simulator implemented on the AI Engine offers a x2,800 improvement over the Python O(N) implementation and a x24,800 improvement over the C++ O(N<sup>2</sup>) implementation. A vectorized C++ NBody Simulator O(N) implementation can be created with pthreads, but is left as an exercise for the user.
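
The improvement factors quoted above can be checked against the tabulated execution times. The following is a minimal sketch in Python, not part of the tutorial: the function name `speedup` and the dictionary layout are illustrative assumptions, and the times are the post-commit values from the table in this hunk.

```python
# Execution times in seconds for one timestep of 12,800 particles,
# copied from the updated table in this hunk.
TIMES = {
    "python_x86": 14.96,
    "cpp_a72": 120.591,
    "ai_engine": 0.0074065,
}


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated time must be positive")
    return baseline_seconds / accelerated_seconds


if __name__ == "__main__":
    for name in ("python_x86", "cpp_a72"):
        ratio = speedup(TIMES[name], TIMES["ai_engine"])
        print(f"{name}: x{ratio:,.0f} faster on the AI Engine")
```

With the updated table values, the ratios come out near x2,000 and x16,000, so the x2,800 and x24,800 figures in the unchanged paragraph may derive from a different set of measurements.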

AI_Engine_Development/AIE/Design_Tutorials/08-n-body-simulator/README.md

Lines changed: 3 additions & 10 deletions
@@ -35,16 +35,13 @@ This tutorial can be run on the [VCK190 Board](https://www.xilinx.com/products/b

 * [AM009 AI Engine Architecture Manual](https://docs.amd.com/r/en-US/am009-versal-ai-engine/Revision-History)

-* [AI Engine Documentation](https://docs.amd.com/v/u/en-US/ug1416-vitis-documentation)
-
 ### *Tools*: Installing the Tools

 1. Obtain a license to enable beta devices in AMD tools (to use the VCK190 platform).
 2. Obtain licenses for AI Engine tools.
 3. Follow the instructions for the [Vitis Software Platform Installation](https://docs.amd.com/r/en-US/ug1393-vitis-application-acceleration/Vitis-Software-Platform-Installation) and ensure you have the following tools:

 * [Vitis™ Unified Software Development Platform 2024.2](https://docs.amd.com/v/u/en-US/ug1416-vitis-documentation)
-* [Xilinx® Runtime and Platforms (XRT)](https://docs.amd.com/r/en-US/ug1393-vitis-application-acceleration/Installing-Xilinx-Runtime-and-Platforms)
 * [Embedded Platform VCK190 Base or VCK190 Base](https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/embedded-platforms.html)

 ### *Environment*: Setting Up Your Shell Environment
@@ -83,17 +80,15 @@ which aiecompiler
 ### HPC Applications
 The goal of this tutorial is to create a general-purpose floating point accelerator for HPC applications. This tutorial demonstrates a x24,800 performance improvement using the AI Engine accelerator over the naive C++ implementation on the A72 embedded Arm® processor.

-#### A similar accelerator example was implemented on the AMD UltraScale+™-based Ultra96 device using only PL resources [here](https://www.hackster.io/rajeev-patwari-ultra96-2019/ultra96-fpga-accelerated-parallel-n-particle-gravity-sim-87f45e).
-

 |Name|Hardware|Algorithm Complexity|Average Execution Time to Simulate 12,800 Particles for 1 Timestep (seconds)|
 |---|---|--|---|
 |Python N-Body Simulator|x86 Linux Machine|O(N)|14.96|
-|C++ N-Body Simulator|A72 Embedded Arm Processor|O(N<sup>2</sup>)|120.487|
-|AI Engine N-Body SImulator|Versal AI Engine IP|O(N)|0.0118|
+|C++ N-Body Simulator|A72 Embedded Arm Processor|O(N<sup>2</sup>)|120.591|
+|AI Engine N-Body SImulator|Versal AI Engine IP|O(N)|0.007405|

 ### PL Data-Mover Kernels
-Another goal of this tutorial is to showcase how to generate PL Data-Mover kernels from the [AMD Vitis Utility Library](https://docs.amd.com/r/en-US/Vitis_Libraries/utils/datamover/kernel_gen_guide.html). These kernels moves any amount of data from DDR buffers to AXI-Streams.
+Another goal of this tutorial is to showcase how to generate PL Data-Mover kernels These kernels moves any amount of data from DDR buffers to AXI-Streams.

 ## The N-Body Problem
The N-Body problem is the problem of predicting the motions of a group of N objects which each have a gravitational force on each other. For any particle `i` in the system, the summation of the gravitational forces from all the other particles results in the acceleration of particle `i`. From this acceleration, we can calculate a particle's velocity and position (`x y z vx vy vz`) will be in the next timestep. Newtonian physics describes the behavior of very large bodies/particles within our universe. With certain assumptions, the laws can be applied to bodies/particles ranging from astronomical size to a golf ball (and even smaller).
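
The summation described in the paragraph above can be sketched as a direct O(N<sup>2</sup>) loop. This is a minimal illustration in plain Python, not the tutorial's implementation: the names `accelerations` and `step`, the unit choice `G = 1.0`, the softening term, and the simple Euler-style update are all assumptions made for the example.

```python
import math

G = 1.0           # gravitational constant in arbitrary units (assumption)
SOFTENING = 1e-3  # small term that avoids division by zero (assumption)


def accelerations(pos, mass):
    """Sum the gravitational pull of every other particle on each particle i."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from particle i to particle j.
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + SOFTENING ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc


def step(pos, vel, mass, dt):
    """Advance (x y z vx vy vz) by one timestep: velocity first, then position."""
    acc = accelerations(pos, mass)
    n = len(pos)
    new_vel = [[vel[i][k] + acc[i][k] * dt for k in range(3)] for i in range(n)]
    new_pos = [[pos[i][k] + new_vel[i][k] * dt for k in range(3)] for i in range(n)]
    return new_pos, new_vel
```

For two equal masses placed symmetrically about the origin, `accelerations` returns equal and opposite vectors pointing toward each other, which is a quick sanity check on the sign convention.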
@@ -272,8 +267,6 @@ By default, the Makefiles build the design for the VCK190 Production board (i.e.

 * [N-body problem wiki page](https://en.wikipedia.org/wiki/N-body_problem)

-* [Ultra96 FPGA-Accelerated Parallel N-Particle Gravity Sim](https://www.hackster.io/rajeev-patwari-ultra96-2019/ultra96-fpga-accelerated-parallel-n-particle-gravity-sim-87f45e)
-
 ## Next Steps

 Let's get started with running the python model of the N-Body simulator on an x86 machine in [Module 01 - Python Simulations on x86](Module_01_python_sims).
