Restructure validation intro and reduce overconfident claims
Addresses PR feedback about section structure and overconfident language.
Changes:
- Moved intro material (sections 1.1-1.3) to top-level overview before
numbered sections
- Moved configuration step (old 1.4) into section 1 as first practical
step before examining schema
- Renumbered all sections (old section 2 → 1, old section 3 → 2)
- Updated takeaways to be less presumptuous: "You've learned" instead
of "You now know", "seen in action" instead of "know how"
- Removed abrupt section ending - section 1 now flows naturally from
config to schema examination to adding parameters
The structure now feels more complete with each section having clear
practical outcomes rather than ending on pure exposition.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
File: docs/hello_nf-core/05_input_validation.md
Lines changed: 59 additions & 75 deletions (+59 -75)
@@ -32,49 +32,10 @@ Pipeline failed before execution - please fix the errors above
32
32
33
33
The pipeline fails immediately with clear, actionable error messages. This saves time, compute resources, and frustration.
34
34
35
-
## Two types of validation
35
+
## The nf-schema plugin
36
36
37
-
nf-core pipelines validate two different kinds of input:
38
-
39
-
### Parameter validation
40
-
41
-
This validates command-line parameters (flags like `--outdir`, `--batch`, `--input`):
42
-
43
-
- Checks parameter types, ranges, and formats
44
-
- Ensures required parameters are provided
45
-
- Validates file paths exist
46
-
- Defined in `nextflow_schema.json`
47
-
48
-
### Input data validation
49
-
50
-
This validates the contents of input files (like sample sheets or CSV files)
51
-
52
-
- Checks column structure and data types
53
-
- Validates file references within the input file
54
-
- Ensures required fields are present
55
-
- Defined in `assets/schema_input.json`
56
-
57
-
!!! note
58
-
59
-
This section assumes you have completed [Part 4: Make an nf-core module](./04_make_module.md) and have a working `core-hello` pipeline with nf-core-style modules.
60
-
61
-
If you didn't complete Part 4 or want to start fresh for this section, you can use the `core-hello-part4` solution as your starting point:
This gives you a fully functional nf-core pipeline with modules ready for adding input validation.
69
-
70
-
---
71
-
72
-
## 1. The nf-schema plugin
73
-
74
-
The [nf-schema plugin](https://nextflow-io.github.io/nf-schema/latest/) is a Nextflow plugin that provides comprehensive validation capabilities for any Nextflow pipeline.
75
-
While nf-schema is a standalone tool that can be used in any Nextflow workflow, it's heavily integrated into the nf-core ecosystem and is the standard validation solution for all nf-core pipelines.
76
-
77
-
### 1.1. Core functionality
37
+
The [nf-schema plugin](https://nextflow-io.github.io/nf-schema/latest/) is a Nextflow plugin that provides comprehensive validation capabilities for Nextflow pipelines.
38
+
While nf-schema works with any Nextflow workflow, it's the standard validation solution for all nf-core pipelines.
78
39
79
40
nf-schema provides several key functions:
80
41
@@ -86,7 +47,7 @@ nf-schema provides several key functions:
86
47
87
48
nf-schema is the successor to the deprecated nf-validation plugin and uses standard [JSON Schema Draft 2020-12](https://json-schema.org/) for validation.
88
49
89
-
### 1.2. The two schema files
50
+
## Two schema files
90
51
91
52
An nf-core pipeline uses two schema files for validation:
92
53
@@ -97,7 +58,25 @@ An nf-core pipeline uses two schema files for validation:
97
58
98
59
Both schemas use JSON Schema format, a widely-adopted standard for describing and validating data structures.
99
60
100
-
### 1.3. When validation occurs
61
+
### Two types of validation
62
+
63
+
nf-core pipelines validate two different kinds of input:
64
+
65
+
**Parameter validation** validates command-line parameters (flags like `--outdir`, `--batch`, `--input`):
66
+
67
+
- Checks parameter types, ranges, and formats
68
+
- Ensures required parameters are provided
69
+
- Validates that file paths exist
70
+
- Defined in `nextflow_schema.json`
71
+
72
+
**Input data validation** validates the contents of input files (like sample sheets or CSV files):
73
+
74
+
- Checks column structure and data types
75
+
- Validates file references within the input file
76
+
- Ensures required fields are present
77
+
- Defined in `assets/schema_input.json`
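
The two kinds of checks listed above can be sketched in plain Python. This is only an illustration of the distinction, not how nf-schema is implemented; the function names `check_params` and `check_rows` and the sample values are hypothetical.

```python
def check_params(params, required):
    """Parameter validation: report required parameters that are missing."""
    return [
        f"missing required parameter: --{name}"
        for name in required
        if name not in params
    ]


def check_rows(rows, required_fields):
    """Input data validation: report rows lacking a required field."""
    errors = []
    for i, row in enumerate(rows, start=1):
        for field in required_fields:
            if not row.get(field):
                errors.append(f"row {i}: missing value for '{field}'")
    return errors


# Hypothetical values, for illustration only.
param_errors = check_params({"input": "greetings.csv"}, ["input", "outdir"])
row_errors = check_rows([{"greeting": "Hello"}, {"greeting": ""}], ["greeting"])
print(param_errors)
print(row_errors)
```

The first check only looks at what was passed on the command line; the second only looks at the rows inside the file that `--input` points to.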
78
+
79
+
### When validation occurs
101
80
102
81
```mermaid
103
82
graph LR
@@ -110,7 +89,26 @@ graph LR
110
89
111
90
Validation happens **before** any pipeline processes run, providing fast feedback and preventing wasted compute time.
112
91
113
-
### 1.4. Configure validation to skip input file validation
92
+
!!! note
93
+
94
+
This section assumes you have completed [Part 4: Make an nf-core module](./04_make_module.md) and have a working `core-hello` pipeline with nf-core-style modules.
95
+
96
+
If you didn't complete Part 4 or want to start fresh for this section, you can use the `core-hello-part4` solution as your starting point:
This gives you a fully functional nf-core pipeline with modules ready for adding input validation.
104
+
105
+
---
106
+
107
+
## 1. Parameter validation (nextflow_schema.json)
108
+
109
+
Let's start by adding parameter validation to our pipeline. This validates command-line flags like `--input`, `--outdir`, and `--batch`.
110
+
111
+
### 1.1. Configure validation to skip input file validation
114
112
115
113
The nf-core pipeline template comes with nf-schema already installed and configured:
116
114
@@ -120,7 +118,7 @@ The nf-core pipeline template comes with nf-schema already installed and configu
120
118
121
119
The validation behavior is controlled through the `validation{}` scope in `nextflow.config`.
122
120
123
-
Since we'll be working on parameter validation first (section 2) and won't configure the input data schema until section 3, we need to temporarily tell nf-schema to skip validating the `input` parameter's file contents.
121
+
Since we'll be working on parameter validation first (this section) and won't configure the input data schema until section 2, we need to temporarily tell nf-schema to skip validating the `input` parameter's file contents.
124
122
125
123
Open `nextflow.config` and find the `validation` block (around line 246). Add `ignoreParams` to skip input file validation:
126
124
@@ -146,30 +144,16 @@ Open `nextflow.config` and find the `validation` block (around line 246). Add `i
146
144
This configuration tells nf-schema to:
147
145
148
146
- **`defaultIgnoreParams`**: Skip validation of complex parameters like `genomes` (set by template developers)
149
-
- **`ignoreParams`**: Skip validation of the `input` parameter's file contents (temporary - we'll remove this in section 3)
147
+
- **`ignoreParams`**: Skip validation of the `input` parameter's file contents (temporary - we'll remove this in section 2)
150
148
- **`monochromeLogs`**: Control colored output in validation messages
151
149
152
150
!!! note "Why ignore the input parameter?"
153
151
154
152
The `input` parameter in `nextflow_schema.json` has `"schema": "assets/schema_input.json"` which tells nf-schema to validate the *contents* of the input CSV file against that schema.
155
153
Since we haven't configured that schema yet, we temporarily ignore this validation.
156
-
We'll remove this setting in section 3 after configuring the input data schema.
157
-
158
-
### Takeaway
159
-
160
-
You now understand what nf-schema does, the two types of validation it provides, when validation occurs, and how to configure validation behavior. You've also temporarily disabled input file validation so we can focus on parameter validation first.
161
-
162
-
### What's next?
163
-
164
-
Start by implementing parameter validation for command-line flags.
165
-
166
-
---
167
-
168
-
## 2. Parameter validation (nextflow_schema.json)
169
-
170
-
Let's start by adding parameter validation to our pipeline. This validates command-line flags like `--input`, `--outdir`, and `--batch`.
154
+
We'll remove this setting in section 2 after configuring the input data schema.
171
155
172
-
### 2.1. Examine the parameter schema
156
+
### 1.2. Examine the parameter schema
173
157
174
158
Let's look at a section of the `nextflow_schema.json` file that came with our pipeline template:
175
159
@@ -225,7 +209,7 @@ Key validation features:
225
209
226
210
Notice the `batch` parameter we've been using isn't defined yet in the schema!
227
211
228
-
### 2.2. Add the batch parameter
212
+
### 1.3. Add the batch parameter
229
213
230
214
While the schema is a JSON file that can be edited manually, **manual editing is error-prone and not recommended**.
231
215
Instead, nf-core provides an interactive GUI tool that handles the JSON Schema syntax for you and validates your changes:
@@ -317,7 +301,7 @@ grep -A 25 '"input_output_options"' nextflow_schema.json
317
301
318
302
You should see that the `batch` parameter has been added to the schema with the "required" field now showing `["input", "outdir", "batch"]`.
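
If you prefer a programmatic check over `grep`, a minimal Python sketch along these lines could report the same list. It assumes the parameter groups live under a top-level `$defs` key (or `definitions` in older templates); the helper name `required_params` and the inline example schema are hypothetical stand-ins for your real `nextflow_schema.json`.

```python
import json


def required_params(schema, group):
    """Return the 'required' list for one parameter group of the schema."""
    # Assumption: groups are stored under "$defs" (or "definitions" in older templates).
    groups = schema.get("$defs") or schema.get("definitions") or {}
    return groups.get(group, {}).get("required", [])


# Inline stand-in for the real file; in practice you would load
# nextflow_schema.json with json.load() instead.
example_schema = json.loads("""
{
  "$defs": {
    "input_output_options": {
      "required": ["input", "outdir", "batch"]
    }
  }
}
""")

print(required_params(example_schema, "input_output_options"))
```

A missing group yields an empty list rather than an error, which keeps the check safe to run before the schema has been edited.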
319
303
320
-
### 2.3. Test parameter validation
304
+
### 1.4. Test parameter validation
321
305
322
306
Now let's test that parameter validation works correctly.
323
307
@@ -348,7 +332,7 @@ The pipeline should run successfully, and the `batch` parameter is now validated
348
332
349
333
### Takeaway
350
334
351
-
You now know how to use the interactive `nf-core pipelines schema build` tool to add parameters to `nextflow_schema.json` and test parameter validation.
335
+
You've learned how to use the interactive `nf-core pipelines schema build` tool to add parameters to `nextflow_schema.json` and seen parameter validation in action.
352
336
The web interface handles all the JSON Schema syntax for you, making it easy to manage complex parameter schemas without error-prone manual JSON editing.
353
337
354
338
### What's next?
@@ -357,11 +341,11 @@ Now that parameter validation is working, let's add validation for the input dat
357
341
358
342
---
359
343
360
-
## 3. Input data validation (schema_input.json)
344
+
## 2. Input data validation (schema_input.json)
361
345
362
346
Now let's add validation for the contents of our input CSV file. While parameter validation checks command-line flags, input data validation ensures the data inside the CSV file is structured correctly.
363
347
364
-
### 3.1. Understand the greetings.csv format
348
+
### 2.1. Understand the greetings.csv format
365
349
366
350
Let's remind ourselves what our input looks like:
367
351
@@ -381,7 +365,7 @@ This is a simple CSV with:
381
365
- One greeting per line
382
366
- Text strings with no special format requirements
383
367
384
-
### 3.2. Design the schema structure
368
+
### 2.2. Design the schema structure
385
369
386
370
For our use case, we want to:
387
371
@@ -392,7 +376,7 @@ For our use case, we want to:
392
376
393
377
We'll structure this as an array of objects, where each object has a `greeting` field.
394
378
395
-
### 3.3. Update the schema file
379
+
### 2.3. Update the schema file
396
380
397
381
The nf-core pipeline template includes a default `assets/schema_input.json` designed for paired-end sequencing data.
398
382
We need to replace it with a simpler schema for our greetings use case.
@@ -469,7 +453,7 @@ The key changes:
469
453
- **`errorMessage`**: Custom error message shown if validation fails
470
454
- **`required`**: Changed from `["sample", "fastq_1"]` to `["greeting"]`
471
455
472
-
### 3.4. Add a header to the greetings.csv file
456
+
### 2.4. Add a header to the greetings.csv file
473
457
474
458
When nf-schema reads a CSV file, it expects the first row to contain column headers that match the field names in the schema.
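
The header requirement can be illustrated with Python's standard `csv` module, which treats the first row in a similar way. This is only an analogy, not nf-schema's own parsing code, and the greeting values are made up for the example.

```python
import csv
import io

# Illustrative file contents; the first row is the header.
csv_text = "greeting\nHello\nBonjour\n"

# DictReader uses the first row as field names, so each data row
# becomes an object keyed by the column header.
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows)

# Collect any rows that lack a value for the "greeting" field.
missing = [row for row in rows if not row.get("greeting")]
print(missing)
```

Without the header line, the first greeting would be consumed as the column name, and no row would carry a `greeting` field for the schema to check.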
475
459
@@ -504,7 +488,7 @@ You've created a JSON schema for the greetings input file and added the required
504
488
505
489
Implement the validation in the pipeline code using `samplesheetToList`.
506
490
507
-
### 3.5. Implement samplesheetToList in the pipeline
491
+
### 2.5. Implement samplesheetToList in the pipeline
508
492
509
493
Now we need to replace our simple CSV parsing with nf-schema's `samplesheetToList` function, which validates and converts the sample sheet.
510
494
@@ -609,7 +593,7 @@ You've successfully implemented input data validation using `samplesheetToList`
609
593
610
594
Re-enable input validation in the config and test both parameter and input data validation to see them in action.
611
595
612
-
### 3.6. Re-enable input validation
596
+
### 2.6. Re-enable input validation
613
597
614
598
Now that we've configured the input data schema, we can remove the temporary ignore setting we added in section 1.1.
615
599
@@ -636,7 +620,7 @@ Open `nextflow.config` and remove the `ignoreParams` line from the `validation`
636
620
637
621
Now nf-schema will validate both parameter types AND the input file contents.
638
622
639
-
### 3.7. Test input validation
623
+
### 2.7. Test input validation
640
624
641
625
Let's verify that our validation works by testing both valid and invalid inputs.
642
626
@@ -713,7 +697,7 @@ The schema validation ensures that input files have the correct structure before
713
697
714
698
### Takeaway
715
699
716
-
You now know how to implement and test both parameter validation and input data validation. Your pipeline validates inputs before execution, providing fast feedback and clear error messages.
700
+
You've implemented and tested both parameter validation and input data validation. Your pipeline now validates inputs before execution, providing fast feedback and clear error messages.