Description
While we usually recommend putting all the data for a scenario inside the Gherkin document, there are valid use cases for pulling data in from an external source, such as an Excel file.
SpecFlow already supports this.
I would like to come up with a specification for a preprocessor API that supports Gherkin, and also Markdown, which we might add support for at some point.
What I have in mind is this:
Document -> Preprocessor -> Preprocessed Document -> Parser -> AST -> Compiler -> Pickles -> Cucumber -> Results
The flow is currently:
Document -> Parser -> AST -> Compiler -> Pickles -> Cucumber -> Results
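To make the extra stage concrete, here is a rough sketch of how a preprocessor could slot in ahead of the parser. The names (Preprocessor, preprocess) are hypothetical and not part of any existing Cucumber or Gherkin API; the only assumption is that a preprocessor maps document text to document text, so the parser and everything after it stay unchanged.

```python
from typing import Callable, Iterable

# Hypothetical contract: a preprocessor takes document text and returns
# document text. Nothing downstream (parser, compiler) needs to change.
Preprocessor = Callable[[str], str]


def preprocess(document: str, preprocessors: Iterable[Preprocessor]) -> str:
    """Run each preprocessor in order and return the preprocessed document."""
    for step in preprocessors:
        document = step(document)
    return document


# With no preprocessors registered, the output equals the input,
# which corresponds to the current flow.
unchanged = preprocess("Feature: Example\n", [])
```

Keeping the stage text-to-text means the same contract could apply to Markdown input, since the preprocessor never needs to see the AST.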
We could agree on a set of preprocessor directives for each supported input format (Gherkin, Markdown). Open questions:
Syntax
SpecFlow currently uses the format @source:excel-file-path[:sheet-name].
That works, but it wouldn't allow e.g. fetching an Excel document from a URL.
It also overloads the tag syntax, which seems a little confusing to me. I would prefer a dedicated preprocessor syntax (like the C preprocessor), ideally one that works for both Gherkin and Markdown.
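One way to see the URL problem: if the tag is read by splitting on the colon, a URL's scheme separator collides with the sheet-name separator. The sketch below is a naive reading of the format for illustration only. It is not SpecFlow's actual parser, and parse_source_tag is a hypothetical name.

```python
from typing import Optional, Tuple


def parse_source_tag(tag: str) -> Optional[Tuple[str, Optional[str]]]:
    """Naively read '@source:path[:sheet]' by splitting on the first ':'.

    Illustration only; this is an assumption about how such a tag could be
    parsed, not SpecFlow's implementation.
    """
    prefix = "@source:"
    if not tag.startswith(prefix):
        return None
    path, _, sheet = tag[len(prefix):].partition(":")
    return path, (sheet or None)


# A local file with a sheet name parses as intended.
local = parse_source_tag("@source:data.xlsx:Sheet1")

# A URL is split at the scheme separator, so the "path" is just "https".
remote = parse_source_tag("@source:https://example.com/data.xlsx")
```

A dedicated directive syntax could avoid this by delimiting the location and its options separately instead of sharing one separator.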
Format
What if we want to pull data in from a CSV or JSON source? It would be nice if users could easily plug in their own preprocessor plugins for parsing the data at the external source. We'd provide a default one for CSV and/or Excel. I think most languages have decent Excel parsers.
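As a starting point for discussion, a plugin could be as small as a function from raw source text to table rows, registered under a format name. Everything here (register_parser, load_rows, the Row type) is hypothetical. The sketch uses only the Python standard library, so it covers CSV and JSON; an Excel plugin would need a third-party parser and is left out.

```python
import csv
import io
import json
from typing import Callable, Dict, List

Row = Dict[str, str]
# Hypothetical plugin contract: raw text from the external source in, rows out.
DataParser = Callable[[str], List[Row]]

PARSERS: Dict[str, DataParser] = {}


def register_parser(fmt: str, parser: DataParser) -> None:
    """Register (or override) the parser used for a given format name."""
    PARSERS[fmt] = parser


def parse_csv(text: str) -> List[Row]:
    # The first line is treated as the header row.
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_json(text: str) -> List[Row]:
    # Assumes a JSON array of flat objects; values are converted to strings
    # so every format yields the same row shape.
    return [{key: str(value) for key, value in item.items()}
            for item in json.loads(text)]


register_parser("csv", parse_csv)
register_parser("json", parse_json)


def load_rows(fmt: str, text: str) -> List[Row]:
    """Look up the parser for fmt and apply it to the source text."""
    if fmt not in PARSERS:
        raise ValueError(f"no parser registered for format {fmt!r}")
    return PARSERS[fmt](text)
```

Users would add their own format by calling register_parser with a function of the same shape, which keeps the default parsers and custom ones on equal footing.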
Semantics
SpecFlow merges contents from the Gherkin document and the external Excel document. That seems useful, but we should specify exactly what happens when the columns of the two sources differ.
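To frame the column question, here is a sketch of two candidate policies. It assumes "merge" means appending the external rows after the inline ones, which may not match what SpecFlow does, and merge_rows and its strict flag are hypothetical. In strict mode differing columns are an error; otherwise the result uses the union of columns and fills missing cells with empty strings.

```python
from typing import Dict, List

Row = Dict[str, str]


def columns(rows: List[Row]) -> List[str]:
    """Collect column names in first-seen order."""
    seen: List[str] = []
    for row in rows:
        for name in row:
            if name not in seen:
                seen.append(name)
    return seen


def merge_rows(inline: List[Row], external: List[Row],
               strict: bool = True) -> List[Row]:
    """Append external rows to inline rows under one of two policies.

    strict=True:  raise if both sources have rows and their columns differ.
    strict=False: use the union of columns, filling gaps with "".
    """
    inline_cols, external_cols = columns(inline), columns(external)
    if strict and inline and external and set(inline_cols) != set(external_cols):
        raise ValueError(
            f"column mismatch: inline {inline_cols}, external {external_cols}")
    all_cols = inline_cols + [c for c in external_cols if c not in inline_cols]
    return [{c: row.get(c, "") for c in all_cols} for row in inline + external]
```

Other policies are possible (for example, letting external columns override inline ones by name), so the spec would need to pick one and state it explicitly.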
Let's discuss!