Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should PyWPS automatically set minOccurs=0 when literal data default is provided? #625

Open
1 of 8 tasks
fmigneault opened this issue Sep 14, 2021 · 10 comments
Open
1 of 8 tasks

Comments

@fmigneault
Copy link
Collaborator

Description

Field min_occurs=1 is the default of inputs. When default parameter is provided to define a default literal value, that default 1 occurrence loses its meaning. I would imagine most people expect that adding this default parameter will make their process pick that value if it was omitted, but this will never happen due to min_occurs=1 check at execution time. The process description also displays ``minOccurs=1anddefaultValue` simultaneously, which makes the default behaviour hard to understand.

I am questioning whether PyWPS should be smart about it and automatically adjust the value to 0 when default is provided, since it doesn't make sense to have that default if it cannot be resolved by default when omitting the input value.

Although the fix is relatively simple to manually adjust any process I/O incorrectly defined, I find this is a quite common and recurring issue. See for example additional information which is just a list of cases I found within a week, over many WPS process implementations.

Environment

  • operating system: ubuntu 20.04
  • Python version: 3.8
  • PyWPS version: latest
  • source/distribution
  • git clone
  • Debian
  • PyPI
  • zip/tar.gz
  • other (please specify):
  • web server
  • Apache/mod_wsgi
  • CGI
  • other (please specify):

Additional Information

Following related issues/PR where similar problem was noticed:

fmigneault added a commit to crim-ca/weaver that referenced this issue Sep 14, 2021
@huard
Copy link
Collaborator

huard commented Sep 14, 2021

Instead of being smart and make an automatic adjustment, maybe PyWPS could raise a Warning when instantiating the Process. We'd catch these issues during development and resolve them explicitly.

@fmigneault
Copy link
Collaborator Author

@huard
I guess that would be a good compromise without directly impacting existing process definitions.

@cehbrecht
Copy link
Collaborator

I have added a new process in Emu to show the usage of default values:
bird-house/emu#114

Currently the default value is used when no value is provided both for min_occurs=0 and min_occurs=1.

Is this what is expected or should it be different?

@fmigneault
Copy link
Collaborator Author

@cehbrecht
Maybe my interpretation is wrong, but I believe that if an input has min_occurs=1, and you don't provide it during Execute request, the process should return an error indicating it is missing a required input. Therefore, the default would never be used, because you must provide one.

@fmigneault
Copy link
Collaborator Author

fmigneault commented Sep 16, 2021

I dug down in the specs.
OGC says that definitions of minOccurs, maxOccurs and default should behave as defined by XML schema specifications when parsing WPS-1.

Looking at https://www.w3.org/TR/xmlschema-0/#OccurrenceConstraints
It is mentioned (emphasis added by myself):

In general, an element is required to appear when the value of minOccurs is 1 or more.

and

When an attribute is declared with a default value, the value of the attribute is whatever value appears as the attribute's value in an instance document; if the attribute does not appear in the instance document, the schema processor provides the attribute with a value equal to that of the default attribute. Note that default values for attributes only make sense if the attributes themselves are optional, and so it is an error to specify both a default value and anything other than a value of optional for use.

There is a distinction between presence of the input and its contained value.
Looking at the example table, following case is provided:

Elements
(minOccurs, maxOccurs) fixed, default
Attributes
use, fixed, default
Notes
(0, 2) -, 37 n/a element may appear once, twice, or not at all; if the element does not appear it is not provided; if it does appear and it is empty, its value is 37; otherwise its value is that given; in general, minOccurs and maxOccurs values may be positive integers, and maxOccurs value may also be "unbounded"

Therefore, it looks like it is valid to have min_occurs=1 and default at the same time, but one must still provide the input with an empty value to employ that default. Omitting the input should raise a schema validation error of missing input.

So the test provided by @cehbrecht that defines input string_2 with min_occurs=1 and default='two' (https://github.com/bird-house/emu/pull/114/files#diff-489abd19af7c899a97437774b5d153eb55c7110d3d49d1f0ef23421fc44afd7fR29-R34) which resolves as 'two' when omitted (https://github.com/bird-house/emu/pull/114/files#diff-44189833c54dd811c26c176a0ad92fd2f40c78d15ef8ceb5ae7d23625ec6fb5dR27-R30) is invalid.
I believe PyWPS is not respecting the spec in this case.

The input should be provided explicitly in the execute XML body, but with empty value to respect the minOccurs=1 schema validation and use the default='two'. Something like <wps:Data></wps:Data> must be provided. The JSON equivalent would be {"string_2": null} instead of not having "string_2" in the body at all.

That being said, I think the intended use of default parameter in PyWPS is used to simply omit the input rather than explicitly providing it as empty, since the code currently picks that default value when the field is omitted. Therefore that input should be defined with min_occurs=0 to be compliant according to XML.

@cehbrecht
Copy link
Collaborator

cehbrecht commented Sep 20, 2021

The current pywps implementation seems to use the default value for initialization ... no matter the min_occurs value:

self._set_default_value(default, default_type)

inpt._set_default_value()

There is a previous discussion #226

@cehbrecht
Copy link
Collaborator

I still find it confusing :)

The spec says (?):

  • A default value is only used when min_occurs=0.
  • The default value is only set when a value is provided and it is empty.

I find it hard to set and recognize empty values. So we could adjust the implementation in this way:

  • A default value is only used when min_occurs=0.
  • The default value is only set when a value is not provided (or empty).

In case of min_occurs>=1 and a default value is configured we raise a warning?

Currently we have min_occurs=1 as default. Should it be min_occurs=0?

@fmigneault
Copy link
Collaborator Author

fmigneault commented Sep 20, 2021

I think the "conceptual" min_occurs=1 by default is valid. By this, I mean that user not explicitly providing a value for min_occurs should default to 1. Most users intend inputs to be required unless a default is added or min_occurs=0 is explicitly provided.

What we could do is define min_occurs=None as function parameter, and consider following cases:

  • If the user does provide min_occurs explicitly, we swap None to its value
  • If the user provided min_occurs=0 and no default, the internal value of default=None is used and that input is optional.
  • If the user provided min_occurs=0 and some default value, then internal default is set to that value. The input is optional, but omitting it during execute uses the default.
  • If the user provided only default (without any min_occurs), min_occurs=0 is used instead of 1 (resolve transparently).
  • Following those resolution, min_occurs>=1 and default is set, then raise a warning.
    The default could be ignored completely in this case (replace it by None).
    The only case when that would happen is when the user explicitly provided both parameters with erroneous/conflicting values.

I don't think we should worry about the case of "empty value provided", because the resulting value in Python code, whether parsed from XML or JSON, will result in None regardless if the input was omitted (i.e.: via using default=None) or explicitly passed empty/null (i.e.: via parsed value conversion to None).

@cehbrecht
Copy link
Collaborator

Instead of being smart and make an automatic adjustment, maybe PyWPS could raise a Warning when instantiating the Process. We'd catch these issues during development and resolve them explicitly.

@fmigneault I would agree with @huard and only raise a Warning when a default is given and min_occurs>0. The default will only be used when min_occurs=0 and no value is provided.

I think the main trouble is the "mental confusion" ... it will be easier to update the wps services and make it clear in the docs and examples.

Agree? Should we do this already for the 4.4.x branch or only on 4.5.x?

@fmigneault
Copy link
Collaborator Author

@cehbrecht
I am fine with a warning. It should offer sufficient information for users to manually updated it as needed.
I think it is worth having in 4.4.x (and any other patches) until 4.5 is completely passing the test suite and properly released (as 4.6? or 4.5.x+1 according to #590).

This was referenced Sep 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants