Add Flops force SSE for direct comparison, add flops AVX512 option by FCLC · Pull Request #17 · TheRainDoodle/Phenom-II-Benchmark

FCLC · 2022-02-11T23:23:51Z

I've also added the option for the repo to be renamed to a "Phenominal" Benchmark, as a fun tongue-in-cheek reference to the baseline CPU.

Next would be a toggle that restricts the run to 4 physical cores. We'd have to account for SMT so that sibling threads don't contend for a core's L1i and L1d caches, among other issues, if possible.
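
A rough sketch of how such a toggle might choose its cores on Linux follows. This is not code from the PR: `first_cpu_in_list`, `is_primary_sibling` and `pick_physical_cpus` are hypothetical helper names, and the sketch assumes the kernel exposes SMT topology through the sysfs file `thread_siblings_list`. It keeps one logical CPU per physical core, so no two selected CPUs share an L1.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Return the first (lowest) CPU id in a sysfs CPU list such as "0-1" or
 * "2,10", or -1 if the string does not start with a number. */
static int first_cpu_in_list(const char *list)
{
    char *end;
    long id = strtol(list, &end, 10);
    return (end == list || id < 0) ? -1 : (int)id;
}

/* Return 1 if `cpu` is the lowest-numbered hardware thread of its physical
 * core, 0 if it is a secondary SMT sibling, -1 if the topology is unreadable.
 * ASSUMPTION: Linux sysfs layout; other systems need another source. */
static int is_primary_sibling(int cpu)
{
    char path[128], buf[64];
    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", cpu);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char *ok = fgets(buf, sizeof buf, f);
    fclose(f);
    if (!ok)
        return -1;
    int first = first_cpu_in_list(buf);
    return first < 0 ? -1 : (first == cpu);
}

/* Collect up to `wanted` logical CPU ids, one per physical core, scanning
 * ids 0 .. logical_cpus-1. Returns how many ids were written to out[]. */
static size_t pick_physical_cpus(size_t wanted, int logical_cpus, int *out)
{
    size_t n = 0;
    for (int cpu = 0; cpu < logical_cpus && n < wanted; cpu++)
        if (is_primary_sibling(cpu) == 1)
            out[n++] = cpu;
    return n;
}
```

Calling `pick_physical_cpus(4, logical_cpus, ids)` would yield at most four ids, which could then be handed to an affinity call such as `sched_setaffinity`. On a hybrid part the E-cores have no siblings and count as physical cores too, so a real toggle may also want to filter by core type.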

Some of the README need not be merged, as it's specific to my environment (Alder Lake and the like).

Beyond that, feel free to incorporate as you see fit.

Depending on time, I may add other CPUs/GPUs for laughs, but specifically for FLOPS I've added the single- and multi-core results to the README.

Performance metrics on AVX-512-enabled Alder Lake:

8 P-cores/16 threads of AVX-512:

Run 1/10 Time: 0.08902
Run 2/10 Time: 0.08652
Run 3/10 Time: 0.08655
Run 4/10 Time: 0.0866
Run 5/10 Time: 0.08652
Run 6/10 Time: 0.08653
Run 7/10 Time: 0.08604
Run 8/10 Time: 0.08657
Run 9/10 Time: 0.08641
Run 10/10 Time: 0.08656
Executed 1248 billion instructions/second
Score: 30.17 Phenom's II's worth's

Single P-core/single thread:

Run 1/10 Time: 0.6904
Run 2/10 Time: 0.6853
Run 3/10 Time: 0.6853
Run 4/10 Time: 0.6853
Run 5/10 Time: 0.6853
Run 6/10 Time: 0.6853
Run 7/10 Time: 0.6852
Run 8/10 Time: 0.6853
Run 9/10 Time: 0.6853
Run 10/10 Time: 0.6852
Executed 156.7 billion instructions/second
Score: 15.11 Phenom's II's worth's
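
To put the two sets of timings side by side, the arithmetic can be sketched as below. The run times are copied from the output above; `mean` and `speedup` are illustrative helper names, not functions from the benchmark.

```c
#include <stddef.h>

/* Run times in seconds, copied from the benchmark output above. */
static const double times_16_threads[10] = {
    0.08902, 0.08652, 0.08655, 0.0866,  0.08652,
    0.08653, 0.08604, 0.08657, 0.08641, 0.08656
};
static const double times_1_thread[10] = {
    0.6904, 0.6853, 0.6853, 0.6853, 0.6853,
    0.6853, 0.6852, 0.6853, 0.6853, 0.6852
};

/* Arithmetic mean of n samples; returns 0.0 for an empty array. */
static double mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return n ? sum / (double)n : 0.0;
}

/* How many times faster `seconds` is than `baseline_seconds`. */
static double speedup(double baseline_seconds, double seconds)
{
    return baseline_seconds / seconds;
}
```

With these numbers, `speedup(mean(times_1_thread, 10), mean(times_16_threads, 10))` comes out a little under 8x, in line with the ratio of the two reported throughputs (1248 vs 156.7 billion instructions/second).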

FCLC · 2022-02-17T02:42:05Z

Now added an AVX-512 version of test 3 (SHR REG, CL); see 40dbd26.

Performance using AVX-512 on a 12700K (8 cores with SMT) was 1259 billion instructions/sec.

≈ 42 Phenom IIs
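
For readers unfamiliar with the vector form, here is a minimal sketch of what "SHR by a register count" looks like across a ZMM register. It is written with C intrinsics rather than the repo's assembly, so it illustrates the idea and is not the code in 40dbd26. It assumes GCC or Clang on x86-64 and shift counts below 64; the function names are made up for the example. `_mm512_srl_epi64` compiles to VPSRLQ, which shifts all eight 64-bit lanes by one count held in a register.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <immintrin.h>

/* Shift every element right by `count` bits, eight lanes at a time.
 * The target attribute lets this one function use AVX-512 without
 * compiling the whole file with -mavx512f. */
__attribute__((target("avx512f")))
static void shr_avx512(uint64_t *v, size_t n, unsigned count)
{
    const __m128i c = _mm_cvtsi32_si128((int)count); /* count in an XMM reg */
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m512i x = _mm512_loadu_si512((const void *)(v + i));
        x = _mm512_srl_epi64(x, c);                  /* VPSRLQ zmm, zmm, xmm */
        _mm512_storeu_si512((void *)(v + i), x);
    }
    for (; i < n; i++)                               /* scalar tail */
        v[i] >>= count;
}
#endif

/* Scalar reference: one SHR per element. Requires count < 64. */
static void shr_scalar(uint64_t *v, size_t n, unsigned count)
{
    for (size_t i = 0; i < n; i++)
        v[i] >>= count;
}

/* Use the AVX-512 path only when the CPU and OS report support for it. */
static void shr_all(uint64_t *v, size_t n, unsigned count)
{
#if defined(__x86_64__)
    if (__builtin_cpu_supports("avx512f")) {
        shr_avx512(v, n, count);
        return;
    }
#endif
    shr_scalar(v, n, count);
}
```

The runtime check matters on Alder Lake in particular, where AVX-512 is only available in some configurations; a benchmark would more likely expose the choice as an explicit option than pick it silently.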

FCLC · 2022-06-22T19:48:03Z

You'll want to avoid merging 0870ce3 if the intention is to continue to support Windows.

The Windows tests were removed because they skew the results when multithreading is involved.

In one case, Windows slowed AVX2 performance on a 5800X3D from 0.69T instructions/second to ~0.5T.

On Windows, multi-core support should be removed and users directed to WSL2 in a pinch.
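
If Windows support is kept, one possible shape for that restriction is a compile-time guard around the thread count. This is only a sketch of the suggestion above; `effective_thread_count` is a hypothetical helper, not something in the repo.

```c
#include <stdio.h>

/* Clamp the requested thread count. On Windows builds, multi-threaded runs
 * are refused and the user is pointed to WSL2; elsewhere the request is
 * honoured. A request of 0 is treated as 1. */
static unsigned effective_thread_count(unsigned requested)
{
#ifdef _WIN32
    if (requested > 1) {
        fprintf(stderr,
                "Multi-core runs are disabled on Windows; "
                "use WSL2 or Linux for multi-threaded results.\n");
        return 1;
    }
#endif
    return requested == 0 ? 1 : requested;
}
```

Keeping the check in one function means the single-core tests stay comparable across platforms while the skewed multi-core numbers never get reported.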

FCLC · 2022-07-02T17:48:17Z

Development of these tests for AArch64 is ongoing; it's in a fork of my repo.

FCLC and others added 11 commits February 10, 2022 12:42

Update RM.md, add data and add reference to AVX512

f32b205

Merge pull request #1 from FCLC/FCLC-avx512-readme

54b416e

Update RM.md, add data and add reference to AVX512

Change tittle of repo to a pun

ba0f59f

Add new force SSE option to flops for direct instruction compare, add…

02adc74

… more documentation to asm and main

Merge branch 'master' of https://github.com/FCLC/Phenom-II-Benchmark

52c85a9

add options to enable matching number of threads on host to same as P…

333ecc5

…henom II 810 4 core.

added 4 core SSE vs single core AVX512 comparison

3329b53

Update README.md

936cd1a

create AVX512 version of bit shifting funtctions, lay groundwork for …

40dbd26

…teaching use of GPR, AVX, AVX2 and AVX512

Merge branch 'master' of https://github.com/FCLC/A-Phenominal-benchmark

fa5b4f8

FCLC and others added 10 commits March 24, 2022 13:29

change AVX to AVX2 in comment, using ymm registers

a7187ab

VMX and main

b2ad76e

Words are hard- lets improve them

1f53b51

change flops to use all ymm and zmm registers, change default run siz…

d23a6b9

…e to 50 instead of 10

latest

7ce5c44

Update to have TLDR and instructions

323bd18

remove the binary

efc672f

Merge branch 'master' of https://github.com/FCLC/A-Phenominal-benchmark

7b440f3

remove windows content to avoid confussion

0870ce3

Update README.md

a220679

dssgabriel added 7 commits June 23, 2022 00:51

Rewrote Makefile

6d26653

Added average time measurement over multiple runs

d533ffa

- Removed timing displays between each run;
- Added compute of mean time per run (with standard deviation to validate the measures);
- Added `printHelp` function to avoid re-printing the available 'commands' each time.
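
A mean and spread over the run times can be accumulated without storing every sample. The sketch below uses Welford's online update, which is an assumption for illustration; the commit may compute these values differently, and the `run_stats` names are made up here.

```c
#include <stddef.h>

/* Running statistics over run times, updated one sample at a time. */
typedef struct {
    size_t n;     /* number of samples seen so far          */
    double mean;  /* running mean                           */
    double m2;    /* sum of squared deviations from the mean */
} run_stats;

/* Fold one run time (in seconds) into the statistics (Welford's update). */
static void run_stats_add(run_stats *s, double seconds)
{
    s->n++;
    double delta = seconds - s->mean;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (seconds - s->mean);
}

/* Sample variance (n - 1 denominator); 0.0 with fewer than two samples.
 * The standard deviation is the square root of this value. */
static double run_stats_variance(const run_stats *s)
{
    return s->n > 1 ? s->m2 / (double)(s->n - 1) : 0.0;
}
```

Start from `run_stats s = {0}`, call `run_stats_add` after each run, then take `sqrt(run_stats_variance(&s))` from `<math.h>` for the standard deviation. A deviation that is small relative to the mean suggests the runs were not disturbed by other load.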

Cleanup of object files + old Makefile

474ae3f

Enhanced output + added option for standard AVX

508d2b7

Added gitignore

0531cf1

Fixed score compute of AVX512/best FLOPS benchmark

90dfeb0

Removed warnings: set currentFunction pointer to null

82e3191

FCLC and others added 3 commits June 23, 2022 16:13

Merge pull request #2 from dssgabriel/master

ff3ef52

Adding measure of average time per run, simplified output between runs and added clang-format

(ReadME) Words are hard, let's fix them

3b0fdb2

Merge pull request #3 from FCLC/dssgabriel-master

6e81d31

(ReadME) Words are hard, let's fix them

FCLC wants to merge 31 commits into TheRainDoodle:master from FCLC:master
