Update Readme: CUDA, CMake #25
#if you are using AMD GPUs, uncomment the following line and set the install path correctly
#TARGET=ocl_memtest
AMD_INSTALL_PATH ?=/usr/local/ati-stream-sdk-v2.1-lnx64/
wow, I am old enough to remember ATI stream.
I did use it in the good old times xD
Update the readme with CMake notes, CUDA-focus of the fork, newer instructions and authors.
3859027 to 64f3c0d
64f3c0d to 88a975f
### Compile

Inside the source directory, run:
Don't we normally recommend an out-of-source build? Guess for this in-source should be fine when done like described here.
Running in a quick-&-dirty but empty dir is not perfect but good enough. Simplifies instructions below.
Great! Some language nitpicking, in case you agree to it I can implement myself.
the address wires.

Test 1 `[Own address test]`
Each Memory location is filled with its own address. The next kernel checks if the
Why is `Memory` capitalized?
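For readers of this thread, a minimal host-side sketch of the own-address test described in the hunk above, written in Python rather than CUDA so it runs anywhere. The names `fill_own_address` and `check_own_address` and the 4-byte `WORD_SIZE` are assumptions for illustration, not the fork's actual kernels. The quoted sentence is cut off, so the check step (stored value equals the address) is inferred.

```python
WORD_SIZE = 4  # assumed word size in bytes (hypothetical)


def fill_own_address(memory):
    """Stand-in for the write kernel: each word stores its own byte address."""
    for i in range(len(memory)):
        memory[i] = i * WORD_SIZE


def check_own_address(memory):
    """Stand-in for the check kernel: return addresses whose value differs."""
    return [i * WORD_SIZE for i, v in enumerate(memory) if v != i * WORD_SIZE]


mem = [0] * 8
fill_own_address(mem)
mem[3] ^= 0x10  # simulate a single flipped bit at byte address 12
bad_addresses = check_own_address(mem)
```

Here `bad_addresses` contains only the address of the corrupted word, which is what makes this test useful for spotting addressing faults.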
### Known Issues

* If your machine is cuda 2.2, killing the program while it is running test 10 (the memory stress test) could result
`machine is cuda 2.2` should perhaps be `machine runs cuda 2.2`?
### Known Issues

* If your machine is cuda 2.2, killing the program while it is running test 10 (the memory stress test) could result
in your GPUs in bad state. This is a bug from the nvidia driver. A detailed description can be found in
I think there is a word missing between `GPUs` and `in`, like `being` or `remaining`.
`nvidia` is also written as `Nvidia`, both versions are in several places.
Or retro NVidia or nVidia? Nobody knows
Then we exit the kernel so that the memory can be flushed. Then we start a new kernel to read
and check if the value matches the pattern. An error is recorded if it does not match for each
memory location. In the same kernel, the compliment of the pattern is written after the checking.
The third kernel is launched to read the value again and checks against the compliment of the pattern.
`checks` -> `check`?
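The three-kernel sequence quoted in this hunk (write the pattern, check it and write the complement, check the complement) can be sketched on the host as follows. This is a Python simulation under assumptions: the function names and the 32-bit `MASK` are hypothetical and do not come from the fork's sources.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit words (hypothetical)


def write_pattern(memory, pattern):
    """Kernel 1 stand-in: set every location to the pattern."""
    for i in range(len(memory)):
        memory[i] = pattern


def check_and_write_complement(memory, pattern):
    """Kernel 2 stand-in: record mismatches, then write the complement."""
    complement = ~pattern & MASK
    errors = []
    for i, value in enumerate(memory):
        if value != pattern:
            errors.append(i)
        memory[i] = complement
    return errors


def check_complement(memory, pattern):
    """Kernel 3 stand-in: record locations that do not hold the complement."""
    complement = ~pattern & MASK
    return [i for i, value in enumerate(memory) if value != complement]


pattern = 0xDEADBEEF
mem = [0] * 8
write_pattern(mem, pattern)
mem[5] ^= 0x1                      # simulated fault after the first write
first_errors = check_and_write_complement(mem, pattern)
mem[2] ^= 0x80                     # simulated fault after the complement write
second_errors = check_complement(mem, pattern)
```

Each simulated fault is reported by the pass that follows it, and the location that failed in the second pass still holds a stale value afterwards because nothing rewrites it.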
### Detailed Description

Test 0 `[Walking 1 bit]`
This test changes one bit a time in memory address to see it
`to see it` -> `to see if it`?
are completed the data patterns are checked. Because the data is checked
only after the memory moves are completed it is not possible to know
where the error occurred. The addresses reported are only for where the
bad pattern was found.
This paragraph has double spaces between sentences.
Test 8 `[Modulo 20, random pattern]`
A random pattern is generated. This pattern is used to set every 20th memory location
in memory. The rest of the memory location is set to the complimemnt of the pattern.
`in memory` is excessive, due to `memory location`.
`The rest of the memory location is` - shouldn't it be plural?
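To make the wording under discussion concrete, here is a small host-side sketch of the modulo-20 fill as the quoted text describes it: every 20th location gets the pattern, all other locations get its complement. The `offset` parameter, the function names, the 32-bit `MASK`, and the check that only looks at the pattern locations are assumptions, since the quoted hunk ends before the verification step.

```python
import random

MASK = 0xFFFFFFFF  # assumed 32-bit words (hypothetical)
STRIDE = 20        # "every 20th memory location" from the quoted text


def fill_modulo(memory, pattern, offset=0):
    """Write the pattern at offset, offset+20, ...; the complement elsewhere."""
    complement = ~pattern & MASK
    for i in range(len(memory)):
        memory[i] = pattern if i % STRIDE == offset else complement


def check_modulo(memory, pattern, offset=0):
    """Assumed check: return pattern locations that no longer hold the pattern."""
    return [i for i in range(offset, len(memory), STRIDE) if memory[i] != pattern]


rng = random.Random(0)          # seeded only to keep the sketch repeatable
pattern = rng.getrandbits(32)
mem = [0] * 100
fill_modulo(mem, pattern)
pattern_slots = [i for i, v in enumerate(mem) if v == pattern]
mem[40] ^= 0x1                  # simulated fault in one pattern location
errors = check_modulo(mem, pattern)
```

With 100 words and offset 0, the pattern lands on indices 0, 20, 40, 60 and 80, and the simulated fault at index 40 is the only reported error.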
The bit fade test initializes all of memory with a pattern and then
sleeps for 90 minutes. Then memory is examined to see if any memory bits
have changed. All ones and all zero patterns are used. This test takes
3 hours to complete. The Bit Fade test is disabled by default
Here it says `3 hours`, but in the name 90 minutes.
Test 10 `[memory stress test]`
Stress memory as much as we can. A random pattern is generated and a kernel of large grid size
and block size is launched to set all memory to the pattern. A new read and write kernel is launched
immediately after the previous write kernel to check if there is any errors in memory and set the
`there is any errors` is singular and plural mismatch.
immediately after the previous write kernel to check if there is any errors in memory and set the
memory to the compliment. This process is repeated for 1000 times for one pattern. The kernel is
written as to achieve the maximum bandwidth between the global memory and GPU.
This will increase the chance of catching software error. In practice, we found this test quite useful
`software error` should either be plural or singular with `a` in front.
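The stress loop quoted in this hunk (write the pattern once, then repeatedly check the expected value and overwrite it with its complement, 1000 times per pattern) could look roughly like this on the host. It is a Python sketch under assumptions: `stress`, `rounds` and the 32-bit `MASK` are hypothetical names, and a real kernel would do the inner loop in parallel across a large grid.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit words (hypothetical)


def stress(memory, pattern, rounds=1000):
    """Write the pattern, then alternate check-and-complement passes."""
    for i in range(len(memory)):
        memory[i] = pattern          # initial write kernel stand-in
    expected = pattern
    error_count = 0
    for _ in range(rounds):
        following = ~expected & MASK
        for i in range(len(memory)):
            if memory[i] != expected:  # read-and-write kernel stand-in: check...
                error_count += 1
            memory[i] = following      # ...then write the complement
        expected = following
    return error_count


mem = [0] * 32
errors = stress(mem, 0x12345678)
```

Because the number of rounds is even, fault-free memory ends up holding the original pattern again and the error count stays at zero.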
I did not change the original lingo, but you are welcome to just add a follow-up PR ;)
Update the readme with CMake notes, CUDA-focus of the fork, newer instructions and authors.
Remove old `Makefile`, we do not use nor support that.
Related to #22 #24