Update Readme: CUDA, CMake #25

Merged
merged 2 commits into from
Dec 6, 2019

@ax3l ax3l commented Dec 6, 2019

Update the readme with CMake notes, the CUDA focus of the fork, newer instructions, and authors.
Remove the old Makefile; we neither use nor support it.

Related to #22 #24


#if you are using AMD GPUs, uncomment the following line and set the install path correctly
#TARGET=ocl_memtest
AMD_INSTALL_PATH ?=/usr/local/ati-stream-sdk-v2.1-lnx64/
Member Author

wow, I am old enough to remember ATI stream.

Member

I did use it in the good old times xD

@ax3l ax3l force-pushed the doc-updateReadme branch 8 times, most recently from 3859027 to 64f3c0d Compare December 6, 2019 01:27

ax3l commented Dec 6, 2019

cc @grische @sbastrakov

@ax3l ax3l requested a review from sbastrakov December 6, 2019 01:29
@ax3l ax3l force-pushed the doc-updateReadme branch from 64f3c0d to 88a975f Compare December 6, 2019 01:30
@ax3l ax3l mentioned this pull request Dec 6, 2019

### Compile

Inside the source directory, run:
Member

Don't we normally recommend an out-of-source build? I guess an in-source build should be fine here when done as described.

Member Author

@ax3l ax3l Dec 6, 2019

Running in a quick-&-dirty but empty dir is not perfect but good enough. Simplifies instructions below.

Member

@sbastrakov sbastrakov left a comment

Great! Some language nitpicking below; if you agree with it, I can implement the changes myself.

the address wires.

Test 1 `[Own address test]`
Each Memory location is filled with its own address. The next kernel checks if the
Member

Why is Memory capitalized?
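
For readers following the review, the own-address test quoted above can be sketched on the host. This is a minimal CPU simulation in Python, not the CUDA kernels from the repository: a plain list stands in for GPU memory, the quoted sentence is truncated so the check step is assumed to compare each word against its own address, and the `corrupt` hook is a hypothetical helper used only to inject a fault.

```python
def own_address_test(memory, corrupt=None):
    """Simulate the write pass and the check pass of an own-address test.

    memory  -- list standing in for device memory (one entry per word)
    corrupt -- hypothetical hook, called between the passes to inject a fault
    Returns a list of (address, expected, actual) tuples, one per mismatch.
    """
    # Pass 1 (stands in for the write kernel): each word stores its own index.
    for addr in range(len(memory)):
        memory[addr] = addr

    if corrupt is not None:
        corrupt(memory)

    # Pass 2 (stands in for the check kernel): each word must still hold its index.
    return [(addr, addr, memory[addr])
            for addr in range(len(memory))
            if memory[addr] != addr]


def flip_bit_in_word_5(memory):
    # Illustrative fault: flip bit 2 of the word at address 5.
    memory[5] ^= 0b100
```

Calling `own_address_test([0] * 16)` reports no errors, while passing `flip_bit_in_word_5` as `corrupt` reports a single mismatch at address 5.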


### Known Issues

* If your machine is cuda 2.2, killing the program while it is running test 10 (the memory stress test) could result
Member

"machine is cuda 2.2" should perhaps be "machine runs cuda 2.2"?

### Known Issues

* If your machine is cuda 2.2, killing the program while it is running test 10 (the memory stress test) could result
in your GPUs in bad state. This is a bug from the nvidia driver. A detailed description can be found in
Member

I think there is a word missing between GPUs and in, like being or remaining.

Member

nvidia is also written as Nvidia, both versions are in several places.

Member Author

@ax3l ax3l Dec 6, 2019

Or retro NVidia or nVidia? Nobody knows

Then we exit the kernel so that the memory can be flushed. Then we start a new kernel to read
and check if the value matches the pattern. An error is recorded if it does not match for each
memory location. In the same kernel, the compliment of the pattern is written after the checking.
The third kernel is launched to read the value again and checks against the compliment of the pattern.
Member

checks -> check ?
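
The three-kernel sequence in the quoted paragraph (write the pattern, check it and write the complement, check the complement) can be sketched as follows. This is a CPU simulation under stated assumptions, not the repository's CUDA code: words are assumed to be 32 bits wide, each function stands in for one kernel launch, and the `fault` hook is a hypothetical helper for injecting an error.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit memory words


def write_pattern(memory, pattern):
    # Stands in for kernel 1: fill every word with the pattern.
    for i in range(len(memory)):
        memory[i] = pattern


def check_and_invert(memory, pattern):
    # Stands in for kernel 2: record mismatches, then write the complement.
    errors = []
    complement = ~pattern & MASK
    for i, value in enumerate(memory):
        if value != pattern:
            errors.append((i, pattern, value))
        memory[i] = complement
    return errors


def check_only(memory, expected):
    # Stands in for kernel 3: verify the complement written by kernel 2.
    return [(i, expected, value)
            for i, value in enumerate(memory)
            if value != expected]


def pattern_test(memory, pattern, fault=None):
    write_pattern(memory, pattern)
    if fault is not None:
        fault(memory)  # hypothetical fault injection between kernels
    errors = check_and_invert(memory, pattern)
    errors += check_only(memory, ~pattern & MASK)
    return errors


def zero_word_3(memory):
    # Illustrative fault: the word at address 3 loses its contents.
    memory[3] = 0
```

Separate passes mirror the point made in the quoted text: the data is only checked after the writing kernel has exited, so the check reads what actually landed in memory.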

### Detailed Description

Test 0 `[Walking 1 bit]`
This test changes one bit a time in memory address to see it
Member

to see it -> to see if it?
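
The quoted description of Test 0 is cut off, so the following is only one plausible reading of a walking-1-bit address test, sketched as a CPU simulation in Python. Assumptions: the test visits address 0 and every address with exactly one bit set, writes a unique tag to each, and then reads the tags back; the `decode` parameter is a hypothetical model of the address wires, used here to simulate a stuck address line.

```python
def walking_one_addresses(num_words):
    """Return the addresses 1, 2, 4, ... that are below num_words."""
    addresses = []
    bit = 1
    while bit < num_words:
        addresses.append(bit)
        bit <<= 1
    return addresses


def walking_bit_test(num_words, decode=lambda addr: addr):
    """Write a unique tag per single-bit address, then verify all tags.

    decode -- hypothetical model of the address wiring; the identity
              function represents healthy hardware.
    Returns the addresses whose tag did not read back correctly.
    """
    cells = [None] * num_words
    addresses = [0] + walking_one_addresses(num_words)

    # Write phase: every probed address receives its own tag.
    for tag, addr in enumerate(addresses):
        cells[decode(addr)] = tag

    # Check phase: an aliased address shows up as an overwritten tag.
    return [addr for tag, addr in enumerate(addresses)
            if cells[decode(addr)] != tag]


def bit2_stuck_low(addr):
    # Illustrative fault: address bit 2 is always read as 0.
    return addr & ~0b100
```

With `bit2_stuck_low`, address 4 aliases onto address 0, so the tag at address 0 is overwritten and the test flags it.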

are completed the data patterns are checked. Because the data is checked
only after the memory moves are completed it is not possible to know
where the error occurred. The addresses reported are only for where the
bad pattern was found.
Member

This paragraph has double spaces between sentences.


Test 8 `[Modulo 20, random pattern]`
A random pattern is generated. This pattern is used to set every 20th memory location
in memory. The rest of the memory location is set to the complimemnt of the pattern.
Member

"in memory" is redundant, given "memory location".

Member

The rest of the memory location is - shouldn't it be plural?
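
The modulo-20 layout described in the quoted text can be illustrated with a short CPU simulation. This is a sketch, not the repository's kernel: words are assumed to be 32 bits wide, the `offset` parameter (which of the 20 positions holds the pattern) is an assumption added for illustration, and the check covers only the pattern locations.

```python
import random

MASK = 0xFFFFFFFF  # assumption: 32-bit memory words
STRIDE = 20        # every 20th location holds the pattern


def fill_modulo(memory, pattern, offset=0):
    # Pattern at every 20th location, complement everywhere else.
    complement = ~pattern & MASK
    for i in range(len(memory)):
        memory[i] = pattern if i % STRIDE == offset else complement


def check_modulo(memory, pattern, offset=0):
    # Return the pattern locations that no longer hold the pattern.
    return [i for i in range(offset, len(memory), STRIDE)
            if memory[i] != pattern]


# Usage: a random 32-bit pattern, as in the test description.
pattern = random.getrandbits(32)
memory = [0] * 100
fill_modulo(memory, pattern)
```

Surrounding each pattern word with 19 complement words means every neighbouring write uses the opposite bit values, which is the point of the layout.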

The bit fade test initializes all of memory with a pattern and then
sleeps for 90 minutes. Then memory is examined to see if any memory bits
have changed. All ones and all zero patterns are used. This test takes
3 hours to complete. The Bit Fade test is disabled by default
Member

Here it says 3 hours, but the test name says 90 minutes.
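
One way to reconcile the two numbers: the quoted text names two patterns (all ones and all zeros), and if the 90-minute sleep happens once per pattern, the total is 2 × 90 = 180 minutes, i.e. 3 hours. The sketch below is a CPU simulation under that assumption, not the repository's code; the `sleep` hook is a hypothetical stand-in for the real wait so the example runs instantly.

```python
SLEEP_MINUTES = 90                   # wait per pattern, from the quoted text
PATTERNS = (0xFFFFFFFF, 0x00000000)  # all ones, then all zeros (32-bit words assumed)


def bit_fade(memory, sleep=lambda minutes: None):
    """Fill, wait, and check once per pattern.

    sleep -- hypothetical hook; the default does nothing so the
             simulation finishes immediately.
    Returns (errors, total_minutes_slept).
    """
    errors = []
    minutes_slept = 0
    for pattern in PATTERNS:
        for i in range(len(memory)):
            memory[i] = pattern
        sleep(SLEEP_MINUTES)
        minutes_slept += SLEEP_MINUTES
        # Any word that changed while idle counts as a faded bit.
        errors += [(i, pattern, value)
                   for i, value in enumerate(memory)
                   if value != pattern]
    return errors, minutes_slept
```

Under this reading, "90 minutes" describes each wait and "3 hours" describes the whole test.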

Test 10 `[memory stress test]`
Stress memory as much as we can. A random pattern is generated and a kernel of large grid size
and block size is launched to set all memory to the pattern. A new read and write kernel is launched
immediately after the previous write kernel to check if there is any errors in memory and set the
Member

"there is any errors" is a singular/plural mismatch.

immediately after the previous write kernel to check if there is any errors in memory and set the
memory to the compliment. This process is repeated for 1000 times for one pattern. The kernel is
written as to achieve the maximum bandwidth between the global memory and GPU.
This will increase the chance of catching software error. In practice, we found this test quite useful
Member

software error should either be plural or singular with a in front.


ax3l commented Dec 6, 2019

I did not change the original lingo, but you are welcome to just add a follow-up PR ;)

@ax3l ax3l merged commit edb66a4 into ComputationalRadiationPhysics:dev Dec 6, 2019
@ax3l ax3l deleted the doc-updateReadme branch December 6, 2019 19:21