2: Sample Data Files
The following sample files can be used to test your software. They were created with a synthetic data generator library and do not contain values for all data items; see the library's project home page for the list of data items currently supported:
NAACCR 25
- Incidence 10 tumors (naaccr-xml-sample-v250-incidence-10.xml.gz)
- Incidence 100 tumors (naaccr-xml-sample-v250-incidence-100.xml.gz)
- Incidence 1000 tumors (naaccr-xml-sample-v250-incidence-1000.xml.gz)
- Abstract 10 tumors (naaccr-xml-sample-v250-abstract-10.xml.gz)
- Abstract 100 tumors (naaccr-xml-sample-v250-abstract-100.xml.gz)
- Abstract 1000 tumors (naaccr-xml-sample-v250-abstract-1000.xml.gz)
NAACCR 24
- Incidence 10 tumors (naaccr-xml-sample-v240-incidence-10.xml.gz)
- Incidence 100 tumors (naaccr-xml-sample-v240-incidence-100.xml.gz)
- Incidence 1000 tumors (naaccr-xml-sample-v240-incidence-1000.xml.gz)
- Abstract 10 tumors (naaccr-xml-sample-v240-abstract-10.xml.gz)
- Abstract 100 tumors (naaccr-xml-sample-v240-abstract-100.xml.gz)
- Abstract 1000 tumors (naaccr-xml-sample-v240-abstract-1000.xml.gz)
NAACCR 23
- Incidence 10 tumors (naaccr-xml-sample-v230-incidence-10.xml.gz)
- Incidence 100 tumors (naaccr-xml-sample-v230-incidence-100.xml.gz)
- Incidence 1000 tumors (naaccr-xml-sample-v230-incidence-1000.xml.gz)
- Abstract 10 tumors (naaccr-xml-sample-v230-abstract-10.xml.gz)
- Abstract 100 tumors (naaccr-xml-sample-v230-abstract-100.xml.gz)
- Abstract 1000 tumors (naaccr-xml-sample-v230-abstract-1000.xml.gz)
NAACCR 22
- Incidence 10 tumors (naaccr-xml-sample-v220-incidence-10.xml.gz)
- Incidence 100 tumors (naaccr-xml-sample-v220-incidence-100.xml.gz)
- Incidence 1000 tumors (naaccr-xml-sample-v220-incidence-1000.xml.gz)
- Abstract 10 tumors (naaccr-xml-sample-v220-abstract-10.xml.gz)
- Abstract 100 tumors (naaccr-xml-sample-v220-abstract-100.xml.gz)
- Abstract 1000 tumors (naaccr-xml-sample-v220-abstract-1000.xml.gz)
NAACCR 21
- Incidence 10 tumors (naaccr-xml-sample-v210-incidence-10.xml.gz)
- Incidence 100 tumors (naaccr-xml-sample-v210-incidence-100.xml.gz)
- Incidence 1000 tumors (naaccr-xml-sample-v210-incidence-1000.xml.gz)
- Abstract 10 tumors (naaccr-xml-sample-v210-abstract-10.xml.gz)
- Abstract 100 tumors (naaccr-xml-sample-v210-abstract-100.xml.gz)
- Abstract 1000 tumors (naaccr-xml-sample-v210-abstract-1000.xml.gz)
NAACCR 18
- Incidence 10 tumors (naaccr-xml-sample-v180-incidence-10.xml.gz)
- Incidence 100 tumors (naaccr-xml-sample-v180-incidence-100.xml.gz)
- Incidence 1000 tumors (naaccr-xml-sample-v180-incidence-1000.xml.gz)
- Abstract 10 tumors (naaccr-xml-sample-v180-abstract-10.xml.gz)
- Abstract 100 tumors (naaccr-xml-sample-v180-abstract-100.xml.gz)
- Abstract 1000 tumors (naaccr-xml-sample-v180-abstract-1000.xml.gz)
If you need a larger volume of data, you can create synthetic NAACCR XML data files yourself using the free File*Pro software (the files provided here were also generated using that software).
If you need more specific data, you are welcome to post something on the NAACCR XML Forum.
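As a quick sanity check after downloading one of the sample files listed above, the number of tumors it contains can be compared with the number in its file name. The following is a minimal sketch, not part of the NAACCR XML library: it assumes each tumor is represented by a `Tumor` element, as described in the NAACCR XML specification, and the function names `local_name` and `count_tumors` are illustrative only.

```python
import gzip
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def count_tumors(path):
    """Count Tumor elements in a gzip-compressed XML file, streaming the input."""
    count = 0
    with gzip.open(path, "rb") as stream:
        for _event, element in ET.iterparse(stream, events=("end",)):
            # Assumption: tumors are stored as <Tumor> elements.
            if local_name(element.tag) == "Tumor":
                count += 1
            element.clear()  # release the finished subtree to keep memory low
    return count
```

For example, `count_tumors("naaccr-xml-sample-v250-incidence-10.xml.gz")` would be expected to return 10 if the file name reflects the tumor count. Streaming with `iterparse` avoids loading the whole document, which matters more for the larger files.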
The Java library uses a set of valid and invalid sample files to make sure it implements the NAACCR XML standard properly.
Other software can use the same set to check that it also implements the standard correctly.
- naaccr-xml-samples-v250.zip
- naaccr-xml-samples-v240.zip
- naaccr-xml-samples-v230.zip
- naaccr-xml-samples-v220.zip
- naaccr-xml-samples-v210.zip
- naaccr-xml-samples-v180.zip
Each set is a zip file containing files whose names start with one of two prefixes:
- files starting with valid_ are supposed to be valid according to the latest specifications of the standard.
- files starting with invalid_ are supposed to fail validation (some of them have obvious XML syntax errors, others fail the standard requirements like using an undefined NAACCR Item ID).
Every invalid file has a comment at the top explaining why it is invalid.
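The prefix convention makes it straightforward to run the set against your own software. The sketch below walks a zip file and reports every entry whose validation result contradicts its prefix. It is an illustration under stated assumptions: `check_sample_set` and `is_well_formed` are hypothetical names, and `validator` stands for whatever function your software exposes to validate a NAACCR XML document. The stand-in `is_well_formed` only checks XML syntax, so it will accept invalid files that are well-formed but break the standard's rules.

```python
import zipfile
import xml.etree.ElementTree as ET


def is_well_formed(data):
    """Stand-in validator: checks XML syntax only, not the NAACCR XML rules."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def check_sample_set(zip_path, validator):
    """Return the entries whose validation result contradicts their prefix."""
    mismatches = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]  # ignore any folder inside the zip
            if base.startswith("valid_"):
                expected = True
            elif base.startswith("invalid_"):
                expected = False
            else:
                continue  # skip directories and files without a known prefix
            if validator(archive.read(name)) != expected:
                mismatches.append(name)
    return mismatches
```

Calling `check_sample_set("naaccr-xml-samples-v250.zip", my_validator)` with a complete validator should return an empty list; any name it returns points to a file your software classifies differently from the set.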