Added some wording for the init function. #74
base: main
The class template `barrier` may also be declared without initialization in the `cuda::` namespace; a
friend function `init` may be used to initialize the object.
- The class template `barrier` may also be declared without initialization in the `cuda::` namespace; a
- friend function `init` may be used to initialize the object.
+ Variable declarations whose type is an instantiation of the `barrier` class template may be initialized with the friend function `init`.
I think the class template `barrier` is already declared in the `::cuda` namespace and cannot be re-declared there. What can be declared without initialization are variables whose type is an instantiation of that template.
- The class template `barrier` may also be declared without initialization in the `cuda::` namespace; a
- friend function `init` may be used to initialize the object.
+ Variables declared with a type that is an instantiation of the `cuda::barrier` class template may be declared without initialization; in such case, they must be initialized by calling a friend function `init` prior to any use.
How about the above?
Looking at my suggested change now, I think I'll want to tweak further, but that's a thing for tomorrow.
@@ -23,26 +23,35 @@ cuda::std::barrier<> bb;
cuda::barrier<cuda::thread_scope_block> c;
What does this default/value/trivial initialization do for `barrier`? `std::barrier` only has the constructor `barrier::barrier(ptrdiff_t expected, CompletionFunction f = CompletionFunction());`. What is the `expected` count set to when using default/value/trivial initialization?
@@ -23,26 +23,35 @@ cuda::std::barrier<> bb;
cuda::barrier<cuda::thread_scope_block> c;
``` | |||
- The class template `barrier` may also be declared without initialization; a
+ The class template `barrier` may also be declared without initialization in the `cuda::` namespace; a
  friend function `init` may be used to initialize the object.
```c++ | |||
// Shared memory does not allow initialization.
__shared__ cuda::barrier<cuda::thread_scope_block> b;
I don't know of a specification of the exact semantics of `__shared__` in CUDA C++ and how they interact with default constructors, but I imagine they are very similar to obtaining a pointer to a raw allocation with kernel lifetime, e.g.,
    cuda::barrier<cuda::thread_scope_block>* b = ...points to uninitialized storage...;
where `init(b, 1)` acts like `new(b) cuda::barrier<cuda::thread_scope_block>{1}`.
Placement new guarantees that the object will be initialized in place. Why do we need `init`? Couldn't we implement the constructor as follows:
    cuda::barrier<cuda::thread_scope_block>::barrier(ptrdiff_t expected) {
        init(this, expected);
    }
and just use placement new for initializing barriers?
That is more or less what happens; but having non-experts write placement new expressions is probably not the greatest idea.
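To make the equivalence discussed above concrete, here is a minimal host-only sketch in standard C++. `toy_barrier` and `raw_slot` are hypothetical stand-ins invented for illustration; they are not `cuda::barrier` and do not reflect its actual implementation. The sketch only assumes that an `init`-style friend forwards to placement new, as the reply above suggests.

```c++
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for a barrier type; NOT cuda::barrier.
class toy_barrier {
public:
    explicit toy_barrier(std::ptrdiff_t expected) : expected_(expected) {}
    std::ptrdiff_t expected() const { return expected_; }

    // Assumed shape of an init-style helper: construct *bar in place.
    // Defined as a friend, so it is found through argument-dependent lookup.
    friend void init(toy_barrier* bar, std::ptrdiff_t expected) {
        ::new (static_cast<void*>(bar)) toy_barrier(expected);
    }

private:
    std::ptrdiff_t expected_;
};

// Hypothetical helper modelling storage whose object has not been
// constructed yet (loosely analogous to a __shared__ variable).
union raw_slot {
    toy_barrier bar;
    raw_slot() {}   // leaves `bar` unconstructed
    ~raw_slot() {}  // the user destroys `bar` explicitly
};

// Initialize through init() and report the stored count.
std::ptrdiff_t expected_via_init(std::ptrdiff_t expected) {
    raw_slot slot;
    init(&slot.bar, expected);
    std::ptrdiff_t result = slot.bar.expected();
    slot.bar.~toy_barrier();
    return result;
}

// Initialize through an explicit placement new expression instead.
std::ptrdiff_t expected_via_placement_new(std::ptrdiff_t expected) {
    raw_slot slot;
    ::new (static_cast<void*>(&slot.bar)) toy_barrier(expected);
    std::ptrdiff_t result = slot.bar.expected();
    slot.bar.~toy_barrier();
    return result;
}
```

Both helpers leave the object in the same state; the `init` form simply hides the placement new expression from the caller, which is the usability argument made in the reply above.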
``` | ||
- Expects: `*bar` is trivially initialized.
If the intent is to exhibit undefined behavior if this condition is not met:
- - Expects: `*bar` is trivially initialized.
+ - Preconditions: `*bar` is trivially initialized.
``` | ||
- Expects: `*bar` is trivially initialized.
- Effects: equivalent to initializing `*bar` with a constructor.
Which constructor? Arguably there is only one:
- - Effects: equivalent to initializing `*bar` with a constructor.
+ - Effects: equivalent to initializing `*bar` with `barrier::barrier(expected, completion)`.
template<thread_scope Sco, class CompletionF>
__host__ __device__ void init(barrier<Sco,CompletionF>* bar, std::ptrdiff_t expected);
template<thread_scope Sco, class CompletionF>
__host__ __device__ void init(barrier<Sco,CompletionF>* bar, std::ptrdiff_t expected, CompletionF completion);
Why do we add two overloads of `init` instead of one with a default `CompletionF`?
- template<thread_scope Sco, class CompletionF>
- __host__ __device__ void init(barrier<Sco,CompletionF>* bar, std::ptrdiff_t expected);
- template<thread_scope Sco, class CompletionF>
- __host__ __device__ void init(barrier<Sco,CompletionF>* bar, std::ptrdiff_t expected, CompletionF completion);
+ template<thread_scope Sco, class CompletionF>
+ __host__ __device__ void init(barrier<Sco,CompletionF>* bar, std::ptrdiff_t expected, CompletionF completion = CompletionF());
I agree with Gonzalo that we should document this with just a single overload with `completion` defaulted.
The reason this is implemented as two overloads is that the friend functions are instantiated when you instantiate the class template, so that they can be found by ADL; and if `CompletionF` does not support default construction, the default argument is ill-formed and the entire program fails (even when you don't actually call `init`, if I remember how this works and what I encountered here correctly).
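The two-overload arrangement described above can be sketched in host-only standard C++. Everything here (`toy_barrier`, `raw_slot`, `counting_completion`, `noop_completion`) is a hypothetical model written for this illustration, not the libcu++ implementation, and the single-threaded `arrive` is only there to make the completion observable. The sketch shows that a completion type without a default constructor works with the three-argument friend, while the two-argument friend is only usable when `CompletionF()` is valid; it makes no claim about how any particular compiler treats a defaulted argument.

```c++
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical completion type with no default constructor.
struct counting_completion {
    explicit counting_completion(int* counter) : counter_(counter) {}
    void operator()() const { ++*counter_; }
    int* counter_;
};

// Hypothetical default-constructible completion type.
struct noop_completion {
    void operator()() const {}
};

// Hypothetical single-threaded model of a barrier; NOT cuda::barrier.
template <class CompletionF>
class toy_barrier {
public:
    toy_barrier(std::ptrdiff_t expected, CompletionF completion)
        : expected_(expected), pending_(expected), completion_(completion) {}

    // Run the completion when the last expected arrival comes in.
    void arrive() {
        if (--pending_ == 0) {
            completion_();
            pending_ = expected_;
        }
    }
    std::ptrdiff_t pending() const { return pending_; }

    // Overload 1: only callable when CompletionF is default constructible.
    friend void init(toy_barrier* bar, std::ptrdiff_t expected) {
        ::new (static_cast<void*>(bar)) toy_barrier(expected, CompletionF());
    }
    // Overload 2: the caller supplies the completion explicitly.
    friend void init(toy_barrier* bar, std::ptrdiff_t expected,
                     CompletionF completion) {
        ::new (static_cast<void*>(bar)) toy_barrier(expected, completion);
    }

private:
    std::ptrdiff_t expected_;
    std::ptrdiff_t pending_;
    CompletionF completion_;
};

// Hypothetical helper modelling storage that holds no constructed object yet.
template <class T>
union raw_slot {
    T value;
    raw_slot() {}
    ~raw_slot() {}
};

// Uses overload 2 with a completion type that has no default constructor.
int completions_after_two_arrivals() {
    using barrier_t = toy_barrier<counting_completion>;
    int counter = 0;
    raw_slot<barrier_t> slot;
    init(&slot.value, 2, counting_completion(&counter));
    slot.value.arrive();
    slot.value.arrive();
    slot.value.~barrier_t();
    return counter;
}

// Uses overload 1, relying on noop_completion's default constructor.
std::ptrdiff_t pending_after_one_arrival() {
    using barrier_t = toy_barrier<noop_completion>;
    raw_slot<barrier_t> slot;
    init(&slot.value, 3);
    slot.value.arrive();
    std::ptrdiff_t result = slot.value.pending();
    slot.value.~barrier_t();
    return result;
}
```

In this model the body of the two-argument friend is never needed for `toy_barrier<counting_completion>`, so the expression `CompletionF()` is confined to a function body rather than appearing in a declaration as a default argument.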
That makes sense.
The standard requires the default type of `CompletionF` to be `DefaultConstructible` (http://eel.is/c++draft/thread.barrier#class-6), but it allows users to specify a non-`DefaultConstructible` one.
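The distinction drawn above can be checked at compile time with a standard type trait. The two completion types and the helper `supports_defaulted_completion` below are hypothetical names introduced only for this sketch.

```c++
#include <cassert>
#include <type_traits>

// Hypothetical completion type that can be default constructed.
struct noop_completion {
    void operator()() const noexcept {}
};

// Hypothetical user-supplied completion type without a default constructor.
struct logging_completion {
    explicit logging_completion(int* sink) noexcept : sink_(sink) {}
    void operator()() const noexcept { ++*sink_; }
    int* sink_;
};

// True when an expression such as `CompletionF()` is well-formed,
// which is what a defaulted `completion` argument would require.
template <class CompletionF>
constexpr bool supports_defaulted_completion() {
    return std::is_default_constructible<CompletionF>::value;
}

static_assert(supports_defaulted_completion<noop_completion>(),
              "noop_completion can be default constructed");
static_assert(!supports_defaulted_completion<logging_completion>(),
              "logging_completion cannot be default constructed");
```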
This will need a rebase.
@griwes would you mind pushing this over the finishing line?