
Darwin Core Archive Event Core

temi edited this page Feb 21, 2025 · 6 revisions

Event Core

Since ecodata 5.2

BioCollect stores its data in an internal format. This wiki page describes how that data is transformed into the Darwin Core Archive (DwCA) event core format. The data is structured as follows: a citizen science project has multiple surveys (internally called project activities), and each survey has multiple site visits (internally called activities).

BioCollect generates one archive per project. Each archive contains the following files:

  • eml.xml - Project metadata.
  • meta.xml - Describes the content of the CSV files.
  • Event.csv - All site visits (activities). Each survey (project activity) is also recorded here, with eventType Survey.
  • MeasurementOrFact.csv - Measurements or facts from site visits.
  • Media.csv - Images from site visits.
  • Occurrence.csv - Species occurrences.
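
As a quick sanity check on a downloaded archive, the file list above can be compared against the contents of the zip. The following is a minimal sketch: the helper name `missing_archive_files` is hypothetical, and it assumes all six files are always present, which may not hold (for example, for a project without images).

```python
import io
import zipfile

# File names taken from the list above. Whether every file is always
# present in a generated archive is an assumption of this sketch.
EXPECTED_FILES = {
    "eml.xml",
    "meta.xml",
    "Event.csv",
    "MeasurementOrFact.csv",
    "Media.csv",
    "Occurrence.csv",
}


def missing_archive_files(archive):
    """Return the expected file names that are absent from a DwCA zip.

    `archive` is a file path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Compare base names so a top-level folder inside the zip is tolerated.
        present = {name.rsplit("/", 1)[-1] for name in zf.namelist()}
    return sorted(EXPECTED_FILES - present)


# Usage with an in-memory archive that holds only two of the files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("eml.xml", "<eml/>")
    zf.writestr("meta.xml", "<archive/>")
buffer.seek(0)
print(missing_archive_files(buffer))
```

An empty list means every expected file was found.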

How to configure BioCollect to generate DwCA?

The BioCollect DwCA creator does not infer mappings on its own; an admin has to configure them in the form template of a survey. Each dataModel entry that should appear in the DwCA must be annotated with a dwcAttribute property whose value is the DwC field it maps to. This reuses the existing annotations used for record creation. An annotated example of a dataModel entry is given below.

    {
      "dataType": "text",
      "name": "author",
      "dwcAttribute": "recordedBy",
      "description": "The name of the person submitting this record",
      "validate": "required"
    }

Here, the value entered in the author field is written to the DwC field recordedBy.
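
When reviewing a form template, it can help to list which fields are mapped to which DwC terms. The sketch below does that for a list of dataModel entries. It is an illustration only, not how ecodata reads the template; the function name `dwc_mappings` and the `notes` field are hypothetical.

```python
def dwc_mappings(data_model):
    """Yield (field name, DwC term) pairs from a list of dataModel entries."""
    for field in data_model:
        if "dwcAttribute" in field:
            yield field["name"], field["dwcAttribute"]
        # Table-style fields nest their own entries under "columns".
        yield from dwc_mappings(field.get("columns", []))


data_model = [
    {"dataType": "text", "name": "author", "dwcAttribute": "recordedBy"},
    {"dataType": "text", "name": "notes"},  # not annotated, so not exported
]
print(dict(dwc_mappings(data_model)))  # {'author': 'recordedBy'}
```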

Adding MeasurementOrFact

Similarly, a measurement or fact is added by setting "dwcAttribute": "measurementValue". All the values associated with the measurement or fact are added to the same dataModel entry, as the example below shows.

    {
      "dataType": "number",
      "name": "spiValue",
      "dwcAttribute": "measurementValue",
      "measurementUnit": "SPI",
      "measurementUnitID": "http://qudt.org/vocab/quantitykind/SPI",
      "measurementType": "number",
      "measurementTypeID": "http://qudt.org/vocab/quantitykind/Number",
      "measurementAccuracy": "0.1",
      "description": "Calculated stream pollution index (SPI)"
    }
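
To make the mapping concrete, the sketch below shows how an annotated entry and a submitted value could combine into one MeasurementOrFact row. This is an assumption about the shape of the output, not ecodata's implementation; the function name `to_mof_row` and the sample value 42.5 are made up for illustration.

```python
# DwC MeasurementOrFact terms that the dataModel entry above carries.
MOF_TERMS = (
    "measurementType",
    "measurementTypeID",
    "measurementUnit",
    "measurementUnitID",
    "measurementAccuracy",
)


def to_mof_row(field, value):
    """Build a MeasurementOrFact row from a dataModel entry and its value.

    Returns None when the entry is not annotated as a measurement value.
    """
    if field.get("dwcAttribute") != "measurementValue":
        return None
    row = {"measurementValue": value}
    # Copy across whichever measurement terms the entry defines.
    row.update({term: field[term] for term in MOF_TERMS if term in field})
    return row


spi_field = {
    "dataType": "number",
    "name": "spiValue",
    "dwcAttribute": "measurementValue",
    "measurementUnit": "SPI",
    "measurementType": "number",
    "measurementAccuracy": "0.1",
}
print(to_mof_row(spi_field, 42.5))
```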

Manipulating measurement type

In BioCollect, a table can hold measurement or fact values. Sometimes it is desirable to programmatically create a measurement type for each of these values. In that case, the measurementType property is written as a Spring Expression Language (SpEL) expression. Below is an example of one such case.

    {
      "dataType": "list",
      "name": "dominantPlantSpeciesPreIntervention",
      "columns": [
        {
          "dataType": "species",
          "description": "The dominant plant species on the site at the time of commencement of the intervention works. [LIST UP TO 4 SPECIES PER STRATUM]",
          "name": "dominantSpeciesPreIntervention",
          "dwcAttribute": "scientificName"
        },
        {
          "dataType": "text",
          "description": "The vegetation stratum occupied by the species in its mature state.",
          "name": "dominantSpeciesPreInterventionStratum",
          "constraints": [
            "Canopy",
            "Midstory",
            "Ground stratum"
          ],
          "dwcAttribute": "measurementValue",
          "measurementUnit": "unitless",
          "measurementType": "['dominantSpeciesPreIntervention']['scientificName'] + ' - Stratum'"
        }
      ]
    }

For a table with the values below:

| Dominant Species  | Stratum  |
|-------------------|----------|
| Acacia dealbata   | Canopy   |
| Eucalyptus tumida | Midstory |

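The MeasurementOrFact (MOF) table will look like:

| Measurement Type            | Measurement Value | ... |
|-----------------------------|-------------------|-----|
| Acacia dealbata - Stratum   | Canopy            | ... |
| Eucalyptus tumida - Stratum | Midstory          | ... |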
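
For readers unfamiliar with SpEL, the following Python stand-in reproduces what the measurementType expression computes for each table row. It is not SpEL and not ecodata code. The row shape (a species field holding a map with a scientificName key) is inferred from the expression itself and is an assumption.

```python
# Table rows as they might be stored; the nested species map is an
# assumption based on ['dominantSpeciesPreIntervention']['scientificName'].
rows = [
    {
        "dominantSpeciesPreIntervention": {"scientificName": "Acacia dealbata"},
        "dominantSpeciesPreInterventionStratum": "Canopy",
    },
    {
        "dominantSpeciesPreIntervention": {"scientificName": "Eucalyptus tumida"},
        "dominantSpeciesPreInterventionStratum": "Midstory",
    },
]


def measurement_type(row):
    # Python equivalent of the SpEL expression:
    # ['dominantSpeciesPreIntervention']['scientificName'] + ' - Stratum'
    return row["dominantSpeciesPreIntervention"]["scientificName"] + " - Stratum"


mof_rows = [
    {
        "measurementType": measurement_type(row),
        "measurementValue": row["dominantSpeciesPreInterventionStratum"],
        "measurementUnit": "unitless",
    }
    for row in rows
]
for mof in mof_rows:
    print(mof["measurementType"], "|", mof["measurementValue"])
```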

Manipulating measurement value

For use cases where the stored data requires some transformation, a dwcExpression attribute can be added to the data model. For example, to populate individualCount with the sum of multiple fields, you might do the following. The expression is again written in Spring Expression Language.

    {
      "dataType": "list",
      "name": "recruitment-sapling-and-seedling-count",
      "isObject": true,
      "columns": [
        {
          "dataType": "number",
          "name": "juvenile_count",
          "decimalPlaces": 0,
          "dwcExpression": "(['juvenile_count'] == null ? 0 : ['juvenile_count']) + (['seedling_count'] == null ? 0 : ['seedling_count']) + (['sapling_count'] == null ? 0 : ['sapling_count'])",
          "dwcAttribute": "individualCount"
        },
        {
          "dataType": "number",
          "name": "seedling_count",
          "decimalPlaces": 0
        },
        {
          "dataType": "species",
          "name": "species",
          "dwcAttribute": "scientificName"
        },
        {
          "dataType": "number",
          "name": "sapling_count",
          "decimalPlaces": 0
        }
      ]
    }

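For a table with the values below:

| Species           | Juvenile count | Seedling count | Sapling count |
|-------------------|----------------|----------------|---------------|
| Acacia dealbata   | 1              | 0              | 5             |
| Eucalyptus tumida | 7              | 8              | 10            |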

The Occurrence table will look like:

| Scientific name   | Individual count | ... |
|-------------------|------------------|-----|
| Acacia dealbata   | 6                | ... |
| Eucalyptus tumida | 25               | ... |
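
The arithmetic behind the dwcExpression can be checked with a small Python equivalent that treats a missing (null) count as zero. This is a stand-in for the SpEL expression, not ecodata code; the function name `individual_count` is hypothetical.

```python
COUNT_FIELDS = ("juvenile_count", "seedling_count", "sapling_count")


def individual_count(row):
    """Sum the three count fields, treating a missing or null value as 0."""
    return sum(row.get(field) or 0 for field in COUNT_FIELDS)


rows = [
    {"species": "Acacia dealbata", "juvenile_count": 1, "seedling_count": 0, "sapling_count": 5},
    {"species": "Eucalyptus tumida", "juvenile_count": 7, "seedling_count": 8, "sapling_count": 10},
]
for row in rows:
    print(row["species"], individual_count(row))
# Acacia dealbata 6
# Eucalyptus tumida 25
```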


How to access DwCA?

The DwCA file has to be accessed via ecodata, BioCollect’s back-end system, using the following APIs.

1. Get JWT token:

POST https://auth.ala.org.au/cas/oidc/oidcAccessToken
Content-Type: application/x-www-form-urlencoded
Accept: application/json

grant_type=client_credentials
    &scope=ecodata/read_prod
    &client_id=...
    &client_secret=...
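
The token request above could be issued from Python roughly as follows, using only the standard library. This is a sketch under assumptions: the function names are hypothetical, and the token is read from an `access_token` field, which is the standard OAuth 2.0 response field but is not confirmed on this page.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://auth.ala.org.au/cas/oidc/oidcAccessToken"


def build_token_request(client_id, client_secret, scope="ecodata/read_prod"):
    """Build the form-encoded POST request described in Step 1."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "scope": scope,
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def fetch_token(client_id, client_secret):
    """Send the request and return the token string."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumption: the response uses the standard OAuth 2.0 field name.
    return payload["access_token"]
```

Calling `fetch_token(client_id, client_secret)` with real credentials performs the network request; `build_token_request` can be inspected without contacting the server.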

2. Get list of data resources available for harvesting

URL : https://ecodata.ala.org.au/ws/record/listHarvestDataResource?max=10&offset=0&sort=asc
Header:
Authorization : Bearer <JWT token obtained from Step 1>

The response looks like the example below. Use the value of the archiveURL property to generate the archive file.

{
    "total": 71,
    "list": [
        {
            "projectId": "17a7871e-15cd-43a3-b349-1161778b0aed",
            "name": "Superb Parrot Monitoring project",
            "dataResourceId": "dr5017",
            "dataProviderId": "dp3534",
            "status": "active",
            "alaHarvest": true,
            "archiveURL": "https://ecodata.ala.org.au/ws/project/17a7871e-15cd-43a3-b349-1161778b0aed/archive"
        },
       ………
    ]
}
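
A harvester would typically build the listing request with the bearer token and then pull the archiveURL values out of the response. The sketch below shows both steps with the standard library. The function names are hypothetical, and the request is assumed to be a plain GET, since the page does not state the method.

```python
import json
import urllib.parse
import urllib.request

LIST_URL = "https://ecodata.ala.org.au/ws/record/listHarvestDataResource"


def build_list_request(token, max_results=10, offset=0):
    """Build the Step 2 request (assumed to be a GET) with the bearer token."""
    query = urllib.parse.urlencode(
        {"max": max_results, "offset": offset, "sort": "asc"}
    )
    return urllib.request.Request(
        f"{LIST_URL}?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def archive_urls(response):
    """Map each projectId in a parsed response to its archiveURL."""
    return {item["projectId"]: item["archiveURL"] for item in response["list"]}


# Usage with the first entry of the sample response shown above.
sample = json.loads(
    """
    {
      "total": 71,
      "list": [
        {
          "projectId": "17a7871e-15cd-43a3-b349-1161778b0aed",
          "name": "Superb Parrot Monitoring project",
          "archiveURL": "https://ecodata.ala.org.au/ws/project/17a7871e-15cd-43a3-b349-1161778b0aed/archive"
        }
      ]
    }
    """
)
for project_id, url in archive_urls(sample).items():
    print(project_id, url)
```

Since total can exceed max, a caller would presumably repeat the request with an increasing offset until all entries have been seen.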

3. Get DwC archive

URL : https://ecodata.ala.org.au/ws/project/17a7871e-15cd-43a3-b349-1161778b0aed/archive
Header: 
Authorization : Bearer <JWT token obtained from Step 1>

Note: creating the archive can take several minutes, depending on the number of activities in the project. The next phase of development is expected to make it faster.
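
Because archive creation can take minutes, a client should allow a generous timeout and stream the response to disk rather than hold it in memory. The following is a minimal sketch with the standard library. The function name, the 600-second timeout, and the injectable `opener` parameter (used here only so the function can be exercised without a network) are assumptions, not part of ecodata.

```python
import shutil
import urllib.request


def download_archive(archive_url, token, dest_path, timeout=600,
                     opener=urllib.request.urlopen):
    """Download a DwC archive to dest_path and return the path.

    `timeout` is the socket timeout in seconds; 600 is an arbitrary
    choice meant to cover the slow archive generation noted above.
    """
    request = urllib.request.Request(
        archive_url,
        headers={"Authorization": f"Bearer {token}"},
    )
    with opener(request, timeout=timeout) as response, open(dest_path, "wb") as out:
        # Copy in chunks so large archives are never fully held in memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

A call such as `download_archive(archive_url, token, "project.zip")` would use the archiveURL from Step 2 and the token from Step 1.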
