[DwC export]: Implement Darwin Core Archive (DwCA) file generation #7733

@grantfitzsimmons

Goal

Implement the backend process that generates a valid Darwin Core Archive (.zip) from an Export Package, including the core file, extension files, meta.xml, and eml.xml.

See https://ipt.gbif.org/manual/en/ipt/latest/dwca-guide for full structural information.
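As a rough illustration of the meta.xml descriptor, here is a minimal sketch in Python. It assumes column 0 of every file holds the record id, that core columns are plain DwC terms, and that extensions arrive as a mapping of file name to row type and term URIs. `build_meta_xml` and its parameters are hypothetical names, not existing Specify code.

```python
import xml.etree.ElementTree as ET

DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"
DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"


def _file_attrs(row_type):
    # Attributes describing how each CSV file is laid out.
    return {
        "encoding": "UTF-8",
        "fieldsTerminatedBy": ",",
        "linesTerminatedBy": "\\n",  # literal backslash-n, as meta.xml expects
        "fieldsEnclosedBy": '"',
        "ignoreHeaderLines": "1",
        "rowType": row_type,
    }


def build_meta_xml(core_terms, extensions=None):
    """Return meta.xml content as a string.

    core_terms: DwC term names for columns 1..n of occurrence.csv
                (assumption: column 0 holds the record id).
    extensions: hypothetical mapping {file_name: (row_type_uri, [term_uri, ...])}.
    """
    archive = ET.Element("archive", {"xmlns": DWC_TEXT_NS, "metadata": "eml.xml"})

    core = ET.SubElement(archive, "core", _file_attrs(DWC_TERMS + "Occurrence"))
    ET.SubElement(ET.SubElement(core, "files"), "location").text = "occurrence.csv"
    ET.SubElement(core, "id", {"index": "0"})
    for i, term in enumerate(core_terms, start=1):
        ET.SubElement(core, "field", {"index": str(i), "term": DWC_TERMS + term})

    for file_name, (row_type, term_uris) in (extensions or {}).items():
        ext = ET.SubElement(archive, "extension", _file_attrs(row_type))
        ET.SubElement(ET.SubElement(ext, "files"), "location").text = file_name
        # Extension rows point back to the core record through coreid.
        ET.SubElement(ext, "coreid", {"index": "0"})
        for i, uri in enumerate(term_uris, start=1):
            ET.SubElement(ext, "field", {"index": str(i), "term": uri})

    return ET.tostring(archive, encoding="unicode")
```

The real descriptor would be driven by the Export Package's field mappings; the fixed delimiter and header settings here are only one reasonable choice.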

Background

The DwCA format is the required output for GBIF, Fishnet2, GGBN, and other aggregators. The generated archive must pass the GBIF data validator without errors.

Implementation

This process must use the cache tables (see #7739) as its data source.
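A minimal sketch of the packaging step, assuming the cache-table rows have already been read into lists of values and that meta.xml and eml.xml are available as strings. `write_dwca` and the shape of its arguments are hypothetical, not an existing API.

```python
import csv
import io
import zipfile


def write_dwca(zip_target, core_header, core_rows, meta_xml, eml_xml, extensions=None):
    """Write a DwCA zip to zip_target (a path or a binary file-like object).

    extensions: hypothetical mapping {file_name: (header, rows)}.
    """

    def to_csv(header, rows):
        # Build the CSV in memory; "\n" line endings match the meta.xml sketch.
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
        return buf.getvalue()

    with zipfile.ZipFile(zip_target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.csv", to_csv(core_header, core_rows))
        for file_name, (header, rows) in (extensions or {}).items():
            archive.writestr(file_name, to_csv(header, rows))
        archive.writestr("meta.xml", meta_xml)
        archive.writestr("eml.xml", eml_xml)
```

For large collections the rows would more likely be streamed from the cache tables than held in memory; this version only shows the archive layout.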

Acceptance Criteria

  • Output is a .zip file containing: occurrence.csv (core file), one .csv per extension, meta.xml, and eml.xml.
  • All dates in output files use the ISO 8601 YYYY-MM-DD format (FR-12), regardless of the database date format.
  • Column headers in CSV files use DwC term names (not Specify field labels).
  • A properly created archive should pass the GBIF data validator with no errors.
  • The file is named per the FileName field of the Export Package.
  • Each extension is included in the archive as explained above.
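The date criterion could be met with a small normalisation helper like the sketch below. It assumes dates arrive either as Python `date`/`datetime` objects or as strings in a known format. `to_iso_date` and the `db_format` default are hypothetical and would need to match the actual database configuration.

```python
from datetime import date, datetime


def to_iso_date(value, db_format="%m/%d/%Y"):
    """Return value as a YYYY-MM-DD string, or "" when there is no date.

    db_format is an assumed strptime pattern for string inputs.
    """
    if value is None or value == "":
        return ""
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    return datetime.strptime(value, db_format).date().isoformat()
```

Partial dates (year only, or year and month) are not handled here and would need their own rule.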
