Skip to content

Conversation

@gmlueck
Copy link
Owner

@gmlueck gmlueck commented Mar 27, 2020

Clarify that SYCL_EXTERNAL cannot be used for the "operator()" member
function of a kernel function object, or for any member function.

Also use consistent formatting for "operator()". Most occurences used
\codeinline{}, but a few were using \tf{}. Standardize on the former.

Clarify that SYCL_EXTERNAL cannot be used for the "operator()" member
function of a kernel function object, or for any member function.

Also use consistent formatting for "operator()".  Most occurences used
\codeinline{}, but a few were using \tf{}.  Standardize on the former.
@gmlueck gmlueck requested review from jbrodman and mkinsner March 27, 2020 19:25
@gmlueck
Copy link
Owner Author

gmlueck commented Mar 27, 2020

Please consider this carefully. Our compiler gives an error that SYCL_EXTERNAL cannot be used for member functions. Is that intent of the spec, or is it a compiler bug? Obviously, this PR assumes it is the intent of the spec.

@mkinsner
Copy link
Collaborator

I believe that the compiler behavior matches the spec via:

default behavior in SYCL applications is that all the definitions and declarations of the functions and methods are available to the SYCL compiler, in the same translation unit
...
SYCL_EXTERNAL can only be used on functions

That said:

  1. I'm not sure that C++ defines "method" as a member function, although I think that's the common understanding. So this may not be well defined right now.
  2. I'm not sure that allowing function but not allowing member functions makes sense. As long as function pointers aren't introduced the current restriction seems too strong.
  3. This may not matter much to generalize in SYCL 1.2.1, and is probably worth the clarification that you've created.

@mkinsner
Copy link
Collaborator

@rolandschulz

@gmlueck
Copy link
Owner Author

gmlueck commented Apr 1, 2020

I'd like to drive this PR to conclusion. If we are agreed that the intent of the 1.2.1 spec is to disallow SYCL_EXTERNAL on member functions, then I think we should update the wording as I suggest to make this more clear. I believe the C++ standard uses the term "member function", not "method", so using that term in SYCL would be more precise.

I'm certainly not opposed to expanding the SYCL spec to allow SYCL_EXTERNAL on member functions. If that is our desire, shall we treat that as a separate PR? It would probably make sense to prototype that support in our own compiler before proposing a change to the spec.

@mkinsner
Copy link
Collaborator

mkinsner commented Apr 1, 2020

I'd like to drive this PR to conclusion. If we are agreed that the intent of the 1.2.1 spec is to disallow SYCL_EXTERNAL on member functions, then I think we should update the wording as I suggest to make this more clear. I believe the C++ standard uses the term "member function", not "method", so using that term in SYCL would be more precise.

I currently believe that SYCL_EXTERNAL on a member function is a rare use case, so don't think it's worth effort to generalize now. @rolandschulz @Pennycook @jbrodman any disagreement?

So the action, unless anyone objects, is to clarify in the spec that SYCL_EXTERNAL is not allowed on member functions.

@Pennycook
Copy link
Collaborator

I think the restrictions proposed here are too strong, but I might have misunderstood something.

I agree that allowing SYCL_EXTERNAL on arbitrary member functions of arbitrary classes is a bad idea and could lead to problems. It's not clear to me what would be expected to happen if a member function accessed a member of the class it is defined in, for example.

However, the proposed wording forbids a use-case that I assumed we wanted to support. If we forbid kernels defined as functors in different translation units, doesn't that force library developers to ship their library as a header or as a precompiled kernel object?

Can we and should we also support applying SYCL_EXTERNAL to an entire class definition if that class is a kernel functor?

@mkinsner
Copy link
Collaborator

mkinsner commented Apr 1, 2020

I think the restrictions proposed here are too strong, but I might have misunderstood something.

I agree that allowing SYCL_EXTERNAL on arbitrary member functions of arbitrary classes is a bad idea and could lead to problems. It's not clear to me what would be expected to happen if a member function accessed a member of the class it is defined in, for example.

However, the proposed wording forbids a use-case that I assumed we wanted to support. If we forbid kernels defined as functors in different translation units, doesn't that force library developers to ship their library as a header or as a precompiled kernel object?

Can we and should we also support applying SYCL_EXTERNAL to an entire class definition if that class is a kernel functor?

Is your suggestion that operator() of a class in a different xlation unit should be legal to mark as SYCL_EXTERNAL, and an entire class should potentially be able to be marked that way as well if it has operator() defined?

@Pennycook
Copy link
Collaborator

Is your suggestion that operator() of a class in a different xlation unit should be legal to mark as SYCL_EXTERNAL, and an entire class should potentially be able to be marked that way as well if it has operator() defined?

Just the second part. Marking an entire class as SYCL_EXTERNAL could instruct the compiler to try and compile it as a kernel, rather than waiting until it sees that type used as a kernel function (which I think is what we implement today).

I'm a little uncertain because I'm not sure if applying it to the class is any different to applying it to all functions and operators inside of the class. But marking the class seems clearer, and the behavior is easier to specify.

I think I am suggesting that:

  • We disallow SYCL_EXTERNAL on member functions
  • We allow SYCL_EXTERNAL on entire classes if they meet the requirements of a SYCL kernel functor
  • We remove the proposed text "\codeinline{operator()} function must be defined in the same translation unit as the host code that invokes it. (For example, if a kernel is invoked via \codeinline{parallel_for}, the definition of \codeinline{operator()} must be in the same translation unit as the call to \codeinline{parallel_for}.)"

@rolandschulz
Copy link
Collaborator

I agree it is likely rare. But I don't see any good reason to disallow it.

I agree that allowing SYCL_EXTERNAL on arbitrary member functions of arbitrary classes is a bad idea and could lead to problems. It's not clear to me what would be expected to happen if a member function accessed a member of the class it is defined in, for example.

Why would it be a bad idea? I don't see any difference between a (non-virtual) member function and a non-member function. From a compiler perspective S::foo and foo(S*) is identical. They both take a pointer to S as the first argument. And because they take a (pointer to) S as an argument the compiler has to make sure the definition of that type is available to the device compiler.

Therefore I disagree. I don't see any good reason to disallow it. The only challenging case is virtual member functions (because it is problematic to build a vtable if some but not all functions are external) but given that they are so far disallowed that's not something we need to worry about until then.

Do we know why our compiler disallows member functions currently?

@Pennycook
Copy link
Collaborator

Why would it be a bad idea? I don't see any difference between a (non-virtual) member function and a non-member function. From a compiler perspective S::foo and foo(S*) is identical. They both take a pointer to S as the first argument. And because they take a (pointer to) S as an argument the compiler has to make sure the definition of that type is available to the device compiler.

The concern I mentioned about member data is invalid -- you have to have an instance of the class to call a member function, so the data has to be on the device already. Thinking about it as foo(S*) helped clarify that, thanks.

I still think there is more to specify if we allow member functions, though. Would a user have to put SYCL_EXTERNAL on every function in the class? Does every function in the class need to be compatible with device compilation, or only the ones marked SYCL_EXTERNAL? These questions may already have answers for classes that meet the definition of a SYCL functor, but it's not obvious to me what behavior is expected of general classes.

@rolandschulz
Copy link
Collaborator

I still think there is more to specify if we allow member functions, though. Would a user have to put SYCL_EXTERNAL on every function in the class? Does every function in the class need to be compatible with device compilation, or only the ones marked SYCL_EXTERNAL? These questions may already have answers for classes that meet the definition of a SYCL functor, but it's not obvious to me what behavior is expected of general classes.

I would expect that the user has to put it on every function to be able to call it as external function. Until/unless we also allow SYCL_EXTERNAL on the class not just on the individual member function. Of course as with any function if one external function calls another non-external function that other function needs to be compiled for the device too. Therefore all functions which are either marked as external or are called by another function which is marked as external needs to be compiled for the device.

@gmlueck
Copy link
Owner Author

gmlueck commented Apr 1, 2020

FWIW, my original expectation matches what @rolandschulz describes. I expected to be able to define operator() as SYCL_EXTERNAL, and I expected any other member function it called to be automatically compiled for the device.

If we were to allow SYCL_EXTERNAL to apply to the entire class definition, would this mean that all member functions would be compiled for the device? It will be common, though, for some member functions not to be needed on the device. Does that mean we will need a way for the programmer to disable device compilation for these member functions?

It seems like it would be clearer to just allow SYCL_EXTERNAL on specific member functions that are needed on the device and not called from the same translation unit.

@rolandschulz
Copy link
Collaborator

FWIW, my original expectation matches what @rolandschulz describes. I expected to be able to define operator() as SYCL_EXTERNAL, and I expected any other member function it called to be automatically compiled for the device.

Why did you change it? Did you ask the compiler team why it is disallowed?

It seems like it would be clearer to just allow SYCL_EXTERNAL on specific member functions that are needed on the device and not called from the same translation unit.

I think we should at the very least do this for now. If people think it's too cumbersome to have to put it on each method we can revisit that and try to answer the other questions which follow from that.

@Pennycook
Copy link
Collaborator

Would this mean that SYCL_EXTERNAL on an operator() inside a SYCL kernel functor would be okay? Does it address this issue: intel/llvm#640?

@gmlueck
Copy link
Owner Author

gmlueck commented Apr 1, 2020

Would this mean that SYCL_EXTERNAL on an operator() inside a SYCL kernel functor would be okay? Does it address this issue: intel/llvm#640?

Yes, it would mean that you could put SYCL_EXTERNAL on operator() of a kernel functor, but it would only be needed if operator() was defined in a different translation unit. Just to be clear, SYCL_EXTERNAL would be put on the definition of operator(), not on the declaration inside the class body.

I looked at intel/llvm#640, and it looks like a different issue to me. FWIW, I also found the SYCL spec confusing w.r.t. the program class and pre-compiled kernels. I was thinking of proposing clearer language for this too, but I don't think I understand the expected usage model well enough yet.

Why did you change it? Did you ask the compiler team why it is disallowed?

No, I did not ask the compiler team. @mkinsner says above (in the second comment) that he thinks a strict reading of the current SYCL 1.2.1 spec is that SYC_EXTERNAL is not allowed on member functions. Therefore, the compiler team might disallow this simply because that's what the spec says.

My original goal of this PR was just to clarify what I thought was the intent of the spec -- that SYCL_EXTERNAL is not allowed for member functions. It seems like there is not a consensus that this is the right direction, though. I can gather some more information from the compiler team to see if there is a reason for disallowing it for member functions. I'll likely make this a lower priority, though, after trying to get the extension API proposal in order.

@Pennycook
Copy link
Collaborator

Yes, it would mean that you could put SYCL_EXTERNAL on operator() of a kernel functor, but it would only be needed if operator() was defined in a different translation unit. Just to be clear, SYCL_EXTERNAL would be put on the definition of operator(), not on the declaration inside the class body.

Got it, thanks.

I looked at intel/llvm#640, and it looks like a different issue to me. FWIW, I also found the SYCL spec confusing w.r.t. the program class and pre-compiled kernels. I was thinking of proposing clearer language for this too, but I don't think I understand the expected usage model well enough yet.

I agree they're not exactly the same issue, but I think they're related. Both problems have the same root cause: "the compiler doesn't know that this function/kernel needs to be available on the device because it didn't appear in a parallel_for in this translation unit".

Adding SYCL_EXTERNAL to the code in intel/llvm#640 would force the compiler to make that operator() available on the device. Is that the same as registering KernelFunctor as a kernel? Would the code in intel/llvm#640 start working, and if so is that by accident or by design?

The current solution in SYCL 1.2.1 is very specific to one use-case, and I'm wondering whether it's too specific. I guess it really depends on what the expected behavior is for the program object case, as you pointed out. @mkinsner, do you know what the intent is here?

@mkinsner
Copy link
Collaborator

mkinsner commented Apr 2, 2020

The current solution in SYCL 1.2.1 is very specific to one use-case, and I'm wondering whether it's too specific. I guess it really depends on what the expected behavior is for the program object case, as you pointed out. @mkinsner, do you know what the intent is here?

Not sure if I interpret the question right, but the program in 1.2.1 was a weird hybrid with lots of problems. The modules proposal reworks it, and I think factors in device linking. Primary use cases for program are to compile with specific arguments and to control when the compile happens.

Can you please clarify the question if it's something else?

@Pennycook
Copy link
Collaborator

Can you please clarify the question if it's something else?

The reminder that program is going away is useful, thanks. I'll go and re-read the Codeplay proposal.

The question I'm really asking is demonstrated very well by intel/llvm#640: does a class actually have to appear in an instantiation of a parallel_for to be compiled as a kernel? Or should everything that "looks like" a kernel be compiled as one, just in case it gets linked into a program or module?

@gmlueck
Copy link
Owner Author

gmlueck commented Apr 2, 2020

I hadn't heard of the new module proposal before. Searching around, I found this. Is that the latest version?

1 similar comment
@gmlueck
Copy link
Owner Author

gmlueck commented Apr 2, 2020

I hadn't heard of the new module proposal before. Searching around, I found this. Is that the latest version?

@mkinsner
Copy link
Collaborator

mkinsner commented Apr 3, 2020

The question I'm really asking is demonstrated very well by intel/llvm#640: does a class actually have to appear in an instantiation of a parallel_for to be compiled as a kernel? Or should everything that "looks like" a kernel be compiled as one, just in case it gets linked into a program or module?

I think that anything that looks like a kernel or could be a device function needs to be compiled for every device. Is there another option?

Alternatives that I don't think are tractable are to whitelist or to blacklist functions within the xlation unit using a [[kernel]] or [[device]] attr, but I don't think that's where we want to go.

For the kernel side (not function called by a kernel), the signature of a legal kernel function is a pretty specific subset of possible signatures. So finding them should be tractable, correct?

@Pennycook
Copy link
Collaborator

I think that anything that looks like a kernel or could be a device function needs to be compiled for every device. Is there another option?

For the kernel side (not function called by a kernel), the signature of a legal kernel function is a pretty specific subset of possible signatures. So finding them should be tractable, correct?

I agree that this is the way to go.

I was worried we were in a situation where we might need SYCL_EXTERNAL even within a translation unit to handle program linking, but I'm satisfied now that isn't the case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants