CWL definitions #3
… job params JSON/YAML
huard
left a comment
I'm missing a bit of information to get this to work, since I'm not familiar with CWL.
From the README, I understand that click2cwl can convert a CLI call to a CWL workflow.
I first generated a test input file by running
pytest src/goldfinch/processes/indicator/test.py
Now I have /tmp/pytest-of-david/pytest-0/test_hdd0/in.nc
I can create a CWL file with
click2cwl --process ./src/goldfinch/processes/indicator/hdd.py -- heating_degree_days > /tmp/hdd.cwl
but I was surprised that it didn't register the argument heating_degree_days anywhere. I tried with the -j argument
click2cwl --process ./src/goldfinch/processes/indicator/hdd.py --job /tmp/job.yml -- -i /tmp/pytest-of-david/pytest-0/test_hdd0/in.nc heating_degree_days
which saved the -i input, but not the process name.
In any case, I didn't find the combination of operations that would allow me to run a test computation.
Note that I'm confused by the -w vs --cwl options, and why the output of this is defined using -o, while for jobs we provide -j <filename> directly.
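For what it's worth, a job file matching the `-i` call above could plausibly be written by hand. This is only a sketch: it assumes the generated CWL exposes inputs named `input` (an array of strings) and `output` (a string), as in the CWL dumps later in this thread, and the `out.nc` value and the `write_job` helper are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_job(path, inputs, output=None):
    """Write a CWL job file; JSON is used since it is also valid YAML."""
    # Assumed input names: "input" (array of strings) and "output" (string).
    job = {"input": [str(p) for p in inputs]}
    if output is not None:
        job["output"] = output
    Path(path).write_text(json.dumps(job, indent=2))
    return job


# Hypothetical values, written to a temporary directory for illustration.
job_path = Path(tempfile.mkdtemp()) / "job.json"
job = write_job(
    job_path,
    ["/tmp/pytest-of-david/pytest-0/test_hdd0/in.nc"],
    output="out.nc",
)
print(job_path.read_text())
```

The resulting file would then be passed as the second argument, as in `cwltool /tmp/hdd.cwl job.json`.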
|
Is In other words, is the If so, it is possible
Option The |
|
Yes, exactly. I can change the CLI if it makes your life easier. Ok. One thing I want to point out is that we'll probably want to chain multiple processes "in-memory". Although I've never tried it, I understand click does support chaining commands and holding a "context" in memory between these commands. Ideally, we'd be able to use this as well. Not sure how complicated it would be on your end. |
|
For the in-memory aspect, I think this is OK on the I'll take a quick look at what is happening and report back. |
|
@huard Found a workaround: 6603311 There are some fixes to apply directly in Provided that |
|
PRs created for above issues:
Will wait a few days to integrate a newer |
|
Still having trouble getting this to work:
First it complained that there were many commands, even though I specified one. I think the logic below was flawed:
- click_functions = [
-     (name, member)
-     for name, member in
-     inspect.getmembers(cli_mod)
-     if (not name.startswith("_") or kwargs["command"] == name) and isinstance(member, click.Command)
- ]
+ click_functions = []
+ for name, member in inspect.getmembers(cli_mod):
+     if isinstance(member, click.Command) and not name.startswith("_"):
+         if kwargs["command"] and kwargs["command"] != name:
+             continue
+         click_functions.append((name, member))
Now it doesn't understand some of the inputs:
(goldfinch) david@it-282:~/src/goldfinch$ cwltool /tmp/pytest-of-david/package.cwl
INFO /home/david/.conda/envs/goldfinch/bin/cwltool 3.1.20250715140722
INFO Resolved '/tmp/pytest-of-david/package.cwl' to 'file:///tmp/pytest-of-david/package.cwl'
ERROR Tool definition failed validation:
../../../../tmp/pytest-of-david/package.cwl:5:1: checking field 'inputs'
../../../../tmp/pytest-of-david/package.cwl:16:3: checking object
'../../../../tmp/pytest-of-david/package.cwl#clt/dask_nthreads'
Field 'type' references unknown identifier
'None', tried
file:///tmp/pytest-of-david/package.cwl#None
../../../../tmp/pytest-of-david/package.cwl:39:3: checking object
'../../../../tmp/pytest-of-david/package.cwl#clt/verbose'
Field 'type' references unknown identifier
'None', tried
file:///tmp/pytest-of-david/package.cwl#None
Indeed, the
class: CommandLineTool
cwlVersion: v1.2
id: clt
inputs:
  chunks:
    inputBinding:
      position: 7
      prefix: --chunks
    type: string?
  dask_maxmem:
    inputBinding:
      position: 6
      prefix: --dask-maxmem
    type: string?
  dask_nthreads:
    inputBinding:
      position: 5
      prefix: --dask-nthreads
    type: None?
  engine:
    inputBinding:
      position: 8
      prefix: --engine
    type: string?
  input:
    type:
    - 'null'
    - inputBinding:
        position: 1
        prefix: -i
      items: string
      type: array
  output:
    inputBinding:
      position: 2
      prefix: -o
    type: string?
  verbose:
    inputBinding:
      position: 3
      prefix: -v
    type: None?
  version:
    inputBinding:
      position: 4
      prefix: -V
    type: boolean?
outputs:
  results:
    outputBinding:
      glob: .
    type: Directory
requirements:
  EnvVarRequirement:
    envDef: {}
  ResourceRequirement: {}
stderr: std.err
stdout: std.out
|
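To make the difference between the two versions of the command filter in the diff above concrete, here is a small dependency-free sketch. `FakeCommand` is a hypothetical stand-in for `click.Command`, and the member names are made up for illustration; only the two conditions are taken from the diff.

```python
from dataclasses import dataclass


@dataclass
class FakeCommand:
    """Hypothetical stand-in for click.Command so the sketch has no dependencies."""
    name: str


def select_old(members, command):
    # Original condition: every public command passes, requested or not.
    return [
        (name, member)
        for name, member in members
        if (not name.startswith("_") or command == name) and isinstance(member, FakeCommand)
    ]


def select_new(members, command):
    # Patched logic: keep public commands, and only the requested one if given.
    selected = []
    for name, member in members:
        if isinstance(member, FakeCommand) and not name.startswith("_"):
            if command and command != name:
                continue
            selected.append((name, member))
    return selected


members = [
    ("cooling_degree_days", FakeCommand("cooling_degree_days")),
    ("heating_degree_days", FakeCommand("heating_degree_days")),
    ("_private", FakeCommand("_private")),
    ("inspect", object()),
]

old_names = [name for name, _ in select_old(members, "heating_degree_days")]
new_names = [name for name, _ in select_new(members, "heating_degree_days")]
print(old_names)  # both public commands, hence the "many commands" complaint
print(new_names)  # only the requested command
```

With no command requested, the patched version still returns every public command, so the default behaviour should be unchanged.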
Yes. It seems I applied a
Will have to investigate where the |
|
Found the problem. They are not defined! Will do a follow-up PR with the maintainers to add the missing types supported by CWL.
With the above PR fix applied:
click2cwl \
    -p src/goldfinch/processes/indicator/hdd.py \
    -m id=heating_degree_days \
    --cwl-version v1.2 \
    --docker birdhouse/goldfinch:0.1.0 \
    -m author=fmigneault \
    -e TEST=VALUE \
    -- \
    heating_degree_days
$namespaces:
  s: https://schema.org/
$schemas:
- http://schema.org/version/9.0/schemaorg-current-http.rdf
baseCommand: python -m goldfinch.processes.indicator.hdd
class: CommandLineTool
cwlVersion: v1.2
hints:
  DockerRequirement:
    dockerPull: birdhouse/goldfinch:0.1.0
id: heating_degree_days
inputs:
  chunks:
    inputBinding:
      position: 8
      prefix: --chunks
    type: string?
  dask_maxmem:
    inputBinding:
      position: 7
      prefix: --dask-maxmem
    type: string?
  dask_nthreads:
    inputBinding:
      position: 6
      prefix: --dask-nthreads
    type: int?
  engine:
    inputBinding:
      position: 9
      prefix: --engine
    type: string?
  help:
    inputBinding:
      position: 2
      prefix: -h
    type: boolean?
  indicator:
    inputBinding:
      position: 1
      prefix: --indicator
    type:
    - symbols:
      - HUMIDEX
      - HEAT_INDEX
      - TG
      - WIND_SPEED_FROM_VECTOR
      - WIND_VECTOR_FROM_SPEED
      - WIND_POWER_POTENTIAL
      - WIND_PROFILE
      - E_SAT
      - HURS_FROMDEWPOINT
      - HURS
      - HUSS
      - HUSS_FROMDEWPOINT
      - VAPOR_PRESSURE_DEFICIT
      - PRSN
      - PRLP
      - WIND_CHILL
      - POTENTIAL_EVAPOTRANSPIRATION
      - WATER_BUDGET_FROM_TAS
      - WATER_BUDGET
      - CORN_HEAT_UNITS
      - UTCI
      - MEAN_RADIANT_TEMPERATURE
      - SHORTWAVE_UPWELLING_RADIATION_FROM_NET_DOWNWELLING
      - LONGWAVE_UPWELLING_RADIATION_FROM_NET_DOWNWELLING
      - CLEARNESS_INDEX
      - RAIN_FRZGR
      - RX1DAY
      - MAX_N_DAY_PRECIPITATION_AMOUNT
      - WETDAYS
      - WETDAYS_PROP
      - DRY_DAYS
      - DRYNESS_INDEX
      - CWD
      - CDD
      - SDII
      - MAX_PR_INTENSITY
      - PRCPTOT
      - PRCPAVG
      - WET_PRCPTOT
      - LIQUIDPRCPTOT
      - LIQUIDPRCPAVG
      - SOLIDPRCPTOT
      - SOLIDPRCPAVG
      - xclim.core.indicator.SPI
      - xclim.core.indicator.SPEI
      - DC
      - DMC
      - CFFWIS
      - KBDI
      - DF
      - FFDI
      - LAST_SNOWFALL
      - FIRST_SNOWFALL
      - DAYS_WITH_SNOW
      - SNOWFALL_FREQUENCY
      - SNOWFALL_INTENSITY
      - DAYS_OVER_PRECIP_THRESH
      - DAYS_OVER_PRECIP_DOY_THRESH
      - HIGH_PRECIP_LOW_TEMP
      - FRACTION_OVER_PRECIP_DOY_THRESH
      - FRACTION_OVER_PRECIP_THRESH
      - LIQUID_PRECIP_RATIO
      - DRY_SPELL_FREQUENCY
      - DRY_SPELL_TOTAL_LENGTH
      - DRY_SPELL_MAX_LENGTH
      - WET_SPELL_FREQUENCY
      - WET_SPELL_TOTAL_LENGTH
      - WET_SPELL_MAX_LENGTH
      - RPRCTOT
      - COLD_AND_DRY_DAYS
      - WARM_AND_DRY_DAYS
      - WARM_AND_WET_DAYS
      - COLD_AND_WET_DAYS
      - RAIN_SEASON
      - WATER_CYCLE_INTENSITY
      - JETSTREAM_METRIC_WOOLLINGS
      - TN_DAYS_ABOVE
      - TN_DAYS_BELOW
      - TG_DAYS_ABOVE
      - TG_DAYS_BELOW
      - TX_DAYS_ABOVE
      - TX_DAYS_BELOW
      - TX_TN_DAYS_ABOVE
      - HEAT_WAVE_FREQUENCY
      - HOT_SPELL_MAX_MAGNITUDE
      - HEAT_WAVE_MAX_LENGTH
      - HEAT_WAVE_TOTAL_LENGTH
      - HEAT_WAVE_INDEX
      - HEAT_SPELL_FREQUENCY
      - HEAT_SPELL_MAX_LENGTH
      - HEAT_SPELL_TOTAL_LENGTH
      - HOT_SPELL_FREQUENCY
      - HOT_SPELL_MAX_LENGTH
      - HOT_SPELL_TOTAL_LENGTH
      - TG_MEAN
      - TG_MAX
      - TG_MIN
      - TX_MEAN
      - TX_MAX
      - TX_MIN
      - TN_MEAN
      - TN_MAX
      - TN_MIN
      - DTR
      - DTRMAX
      - DTRVAR
      - ETR
      - COLD_SPELL_DURATION_INDEX
      - COLD_SPELL_DAYS
      - COLD_SPELL_FREQUENCY
      - COLD_SPELL_MAX_LENGTH
      - COLD_SPELL_TOTAL_LENGTH
      - COOL_NIGHT_INDEX
      - DLYFRZTHW
      - FREEZETHAW_SPELL_FREQUENCY
      - FREEZETHAW_SPELL_MEAN_LENGTH
      - FREEZETHAW_SPELL_MAX_LENGTH
      - COOLING_DEGREE_DAYS
      - COOLING_DEGREE_DAYS_APPROXIMATION
      - HEATING_DEGREE_DAYS
      - HEATING_DEGREE_DAYS_APPROXIMATION
      - GROWING_DEGREE_DAYS
      - FREEZING_DEGREE_DAYS
      - THAWING_DEGREE_DAYS
      - FRESHET_START
      - FROST_DAYS
      - FROST_SEASON_LENGTH
      - LAST_SPRING_FROST
      - FIRST_DAY_TN_BELOW
      - FIRST_DAY_TG_BELOW
      - FIRST_DAY_TX_BELOW
      - FIRST_DAY_TN_ABOVE
      - FIRST_DAY_TG_ABOVE
      - FIRST_DAY_TX_ABOVE
      - ICE_DAYS
      - CONSECUTIVE_FROST_DAYS
      - FROST_FREE_SEASON_LENGTH
      - FROST_FREE_SEASON_START
      - FROST_FREE_SEASON_END
      - FROST_FREE_SPELL_MAX_LENGTH
      - CONSECUTIVE_FROST_FREE_DAYS
      - GROWING_SEASON_START
      - GROWING_SEASON_LENGTH
      - GROWING_SEASON_END
      - TROPICAL_NIGHTS
      - TG90P
      - TG10P
      - TX90P
      - TX10P
      - TN90P
      - TN10P
      - DEGREE_DAYS_EXCEEDANCE_DATE
      - WARM_SPELL_DURATION_INDEX
      - MAXIMUM_CONSECUTIVE_WARM_DAYS
      - FIRE_SEASON
      - HUGLIN_INDEX
      - BIOLOGICALLY_EFFECTIVE_DEGREE_DAYS
      - EFFECTIVE_GROWING_DEGREE_DAYS
      - LATITUDE_TEMPERATURE_INDEX
      - LATE_FROST_DAYS
      - AUSTRALIAN_HARDINESS_ZONES
      - USDA_HARDINESS_ZONES
      - CP
      - CU
      - CALM_DAYS
      - WINDY_DAYS
      - SFCWIND_MAX
      - SFCWIND_MEAN
      - SFCWIND_MIN
      - SFCWINDMAX_MAX
      - SFCWINDMAX_MEAN
      - SFCWINDMAX_MIN
      - FIT
      - RETURN_LEVEL
      - STATS
      - SND_SEASON_LENGTH
      - SNW_SEASON_LENGTH
      - SND_SEASON_START
      - SNW_SEASON_START
      - SND_SEASON_END
      - SNW_SEASON_END
      - SND_MAX_DOY
      - SNOW_MELT_WE_MAX
      - SNW_MAX
      - SNW_MAX_DOY
      - MELT_AND_PRECIP_MAX
      - SND_STORM_DAYS
      - SNW_STORM_DAYS
      - BLOWING_SNOW
      - SNOW_DEPTH
      - SND_TO_SNW
      - SNW_TO_SND
      - SND_DAYS_ABOVE
      - SNW_DAYS_ABOVE
      - HOLIDAY_SNOW_DAYS
      - HOLIDAY_SNOW_AND_SNOWFALL_DAYS
      - BASE_FLOW_INDEX
      - RB_FLASHINESS_INDEX
      - DOY_QMAX
      - DOY_QMIN
      - xclim.core.indicator.FLOW_INDEX
      - HIGH_FLOW_FREQUENCY
      - LOW_FLOW_FREQUENCY
      - xclim.core.indicator.SSI
      - xclim.core.indicator.SGI
      - SEA_ICE_EXTENT
      - SEA_ICE_AREA
      - icclim.TG
      - icclim.TX
      - icclim.TN
      - icclim.TG90P
      - icclim.TG10P
      - icclim.TGX
      - icclim.TGN
      - icclim.TX90P
      - icclim.TX10P
      - icclim.TXX
      - icclim.TXN
      - icclim.TN90P
      - icclim.TN10P
      - icclim.TNX
      - icclim.TNN
      - icclim.HI
      - icclim.BEDD
      - icclim.CSDI
      - icclim.WSDI
      - icclim.SU
      - icclim.CSU
      - icclim.TR
      - icclim.GD4
      - icclim.FD
      - icclim.CFD
      - icclim.GSL
      - icclim.ID
      - icclim.HD17
      - icclim.CDD
      - icclim.CWD
      - icclim.RR
      - icclim.PRCPTOT
      - icclim.SDII
      - icclim.ETR
      - icclim.DTR
      - icclim.VDTR
      - icclim.RR1
      - icclim.R10MM
      - icclim.R20MM
      - icclim.RX1DAY
      - icclim.RX5DAY
      - icclim.R75P
      - icclim.R95P
      - icclim.R99P
      - icclim.R75PTOT
      - icclim.R95PTOT
      - icclim.R99PTOT
      - icclim.SD
      - icclim.SD1
      - icclim.SD5CM
      - icclim.SD50CM
      - icclim.CD
      - icclim.WD
      - icclim.WW
      - icclim.CW
      - anuclim.P10_MEANTEMPWARMESTQUARTER
      - anuclim.P11_MEANTEMPCOLDESTQUARTER
      - anuclim.P12_ANNUALPRECIP
      - anuclim.P13_PRECIPWETTESTPERIOD
      - anuclim.P14_PRECIPDRIESTPERIOD
      - anuclim.P15_PRECIPSEASONALITY
      - anuclim.P16_PRECIPWETTESTQUARTER
      - anuclim.P17_PRECIPDRIESTQUARTER
      - anuclim.P18_PRECIPWARMESTQUARTER
      - anuclim.P19_PRECIPCOLDESTQUARTER
      - anuclim.P1_ANNMEANTEMP
      - anuclim.P2_MEANDIURNALRANGE
      - anuclim.P3_ISOTHERMALITY
      - anuclim.P4_TEMPSEASONALITY
      - anuclim.P5_MAXTEMPWARMESTPERIOD
      - anuclim.P6_MINTEMPCOLDESTPERIOD
      - anuclim.P7_TEMPANNUALRANGE
      - anuclim.P8_MEANTEMPWETTESTQUARTER
      - anuclim.P9_MEANTEMPDRIESTQUARTER
      - cf.CDD
      - cf.CDDCOLDTT
      - cf.CFD
      - cf.CSU
      - cf.CTMGETT
      - cf.CTMGTTT
      - cf.CTMLETT
      - cf.CTMLTTT
      - cf.CTNGETT
      - cf.CTNGTTT
      - cf.CTNLETT
      - cf.CTNLTTT
      - cf.CTXGETT
      - cf.CTXGTTT
      - cf.CTXLETT
      - cf.CTXLTTT
      - cf.CWD
      - cf.DDGTTT
      - cf.DDLTTT
      - cf.DTR
      - cf.ETR
      - cf.FG
      - cf.FXX
      - cf.GD4
      - cf.GDDGROWTT
      - cf.HD17
      - cf.HDDHEATTT
      - cf.MAXDTR
      - cf.PP
      - cf.RH
      - cf.SD
      - cf.SDII
      - cf.SS
      - cf.TG
      - cf.TMM
      - cf.TMMAX
      - cf.TMMEAN
      - cf.TMMIN
      - cf.TMN
      - cf.TMX
      - cf.TN
      - cf.TNM
      - cf.TNMAX
      - cf.TNMEAN
      - cf.TNMIN
      - cf.TNN
      - cf.TNX
      - cf.TX
      - cf.TXM
      - cf.TXMAX
      - cf.TXMEAN
      - cf.TXMIN
      - cf.TXN
      - cf.TXX
      - cf.VDTR
      type: enum
  input:
    type:
    - 'null'
    - inputBinding:
        position: 3
        prefix: -i
      items: string
      type: array
  output:
    inputBinding:
      position: 4
      prefix: -o
    type: string?
  verbose:
    inputBinding:
      position: 5
      prefix: -v
    type: int?
outputs:
  results:
    outputBinding:
      glob: .
    type: Directory
requirements:
  EnvVarRequirement:
    envDef:
      TEST: VALUE
  ResourceRequirement: {}
s:author:
- class: s:Person
  s:name: fmigneault
stderr: std.err
stdout: std.out
|
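As an aside on the `type: None?` entries reported earlier in the thread: one way such a value can appear is when a type lookup silently returns `None` and the result is formatted into a string. The sketch below is an assumption about the failure mode, not click2cwl's actual code; `CWL_TYPES` and `to_cwl_type` are hypothetical names.

```python
# Hypothetical mapping from Python types to CWL primitive types.
CWL_TYPES = {
    str: "string",
    int: "int",
    float: "float",
    bool: "boolean",
}


def to_cwl_type(py_type, required=False):
    """Map a Python type to a CWL type string, failing loudly on unknown types."""
    try:
        cwl_type = CWL_TYPES[py_type]
    except KeyError:
        raise ValueError(f"no CWL type defined for {py_type!r}") from None
    return cwl_type if required else f"{cwl_type}?"


# A lenient lookup on an undefined type reproduces the invalid value.
naive = f"{CWL_TYPES.get(complex)}?"
print(naive)  # None?

# The strict helper yields valid optional types for defined entries.
print(to_cwl_type(int))  # int?
```

Raising on an undefined type surfaces the problem when the CWL is generated rather than later, when `cwltool` validates it.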

Description
Add utility script that parses the click definitions of relevant CLIs found under src/goldfinch/processes and generates their corresponding CWL.
Multiple options are provided to further extend the CWL metadata.
🚧 WIP 🚧
Find a better way to indicate outputs:
Currently, a single output is generated as:
outputs:
  results:
    outputBinding:
      glob: .
    type: Directory
While this might work for local execution, it is not sufficient for https://github.com/crim-ca/weaver. It is expected that a File reference would be provided (i.e.: the file path resolving to ${outputs.results}/${inputs.output} in example scripts).
Find a way to propagate media-types or other I/O-specific metadata
While not blocking deployment, that would help better describe the I/O and the contents they are allowed to receive/produce.
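One possible direction for the first item, sketched under assumptions: emit a `File` output whose `glob` is a CWL parameter reference to the `output` input, instead of a `Directory` globbing `.`. The `file_output` helper and its `media_type` parameter are hypothetical, and nothing here is verified against Weaver.

```python
import json


def file_output(input_name="output", media_type=None):
    """Build a CWL `outputs` mapping that exposes a single File.

    `input_name` is the CWL input holding the output file name; the glob uses
    a CWL parameter reference so the runner collects exactly that file.
    """
    result = {
        "type": "File",
        "outputBinding": {"glob": f"$(inputs.{input_name})"},
    }
    if media_type is not None:
        # CWL expects an IRI in `format`; the value is left to the caller here.
        result["format"] = media_type
    return {"results": result}


print(json.dumps(file_output(), indent=2))
```

Since `output` is currently optional (`string?`) in the generated CWL, this would presumably only resolve when the input is provided or given a default.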
References