SynDB Connectomics Data Profile

The SynDB Connectomics Data Profile defines the metadata contract that datasets must satisfy to be findable, accessible, interoperable, and reusable (FAIR) in SynDB.

Required Dataset Metadata

Every dataset must include:

  • UUIDv7 dataset identifier.
  • DataCite dataset DOI for production F1=3 FAIR claims.
  • Human-readable dataset label.
  • SPDX-compatible data license.
  • Access policy: open, registered, or restricted.
  • Species resolved to an active ontology_term in the ncbi_taxon vocabulary.
  • Microscopy technique resolved to an active ontology_term in the microscopy vocabulary.
  • Brain regions resolved to active ontology terms in uberon, fbbt, or SynDB’s internal brain_region vocabulary.
  • Declared SynDB table list and uploaded table state.
  • Provenance, version, citation, lineage, and archive links.
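A client could run a few of the checks above locally before submitting a dataset. The following is a minimal sketch, not SynDB's own validator: the field names (`dataset_id`, `label`, `license`, and so on) are hypothetical, and it covers only presence, the UUIDv7 version, and the access policy. DOI, ontology, and provenance checks are left out.

```python
import uuid

# Access policies named by the profile.
ACCESS_POLICIES = {"open", "registered", "restricted"}

# Hypothetical field names; the real SynDB schema may differ.
REQUIRED_FIELDS = (
    "dataset_id", "label", "license", "access_policy",
    "species", "microscopy", "brain_regions", "tables",
)


def check_dataset_metadata(record: dict) -> list[str]:
    """Return a list of problems; an empty list means these basic checks passed."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)
    ]

    dataset_id = record.get("dataset_id")
    if dataset_id:
        try:
            # uuid.UUID exposes the version nibble; 7 marks a UUIDv7.
            if uuid.UUID(str(dataset_id)).version != 7:
                problems.append("dataset_id is not a UUIDv7")
        except ValueError:
            problems.append("dataset_id is not a valid UUID")

    policy = record.get("access_policy")
    if policy and policy not in ACCESS_POLICIES:
        problems.append(f"unknown access_policy: {policy}")

    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.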

Ontology Vocabularies

SynDB metadata uses these vocabularies:

| Vocabulary | Use |
|---|---|
| ncbi_taxon | Species and taxonomic identity |
| microscopy | Imaging and reconstruction modality |
| uberon | Vertebrate anatomical structures |
| fbbt | Drosophila anatomical structures |
| brain_region | SynDB terms for structures not yet mapped to external vocabularies |
| chebi | Neurotransmitter and chemical identity |

All required ontology terms must be active, have a URI, and carry a registry version.
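The three conditions on a required term can be expressed as a small predicate. This is a sketch under assumptions: the `OntologyTerm` shape and its attribute names are hypothetical and only mirror the wording of the rule, not the actual ontology_term storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyTerm:
    # Hypothetical record shape, not the actual ontology_term table.
    vocabulary: str
    term_id: str
    active: bool
    uri: Optional[str]
    registry_version: Optional[str]


def term_problems(term: OntologyTerm) -> list[str]:
    """List which of the three requirements the term fails, if any."""
    problems = []
    if not term.active:
        problems.append("term is not active")
    if not term.uri:
        problems.append("term has no URI")
    if not term.registry_version:
        problems.append("term has no registry version")
    return problems
```

A term is acceptable as required metadata only when `term_problems` returns an empty list.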

Relations

Dataset lineage and external references use DataCite relation types, including IsDerivedFrom, IsSourceOf, IsPartOf, HasPart, References, IsReferencedBy, IsVersionOf, and HasVersion.

Any relation string that is not a recognized relation type is rejected.
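One way to enforce this on the client side is to parse relation strings into an enumeration. The sketch below lists only the eight relation types named in this profile. Whether SynDB accepts further DataCite relation types is not stated here, so treat the set as an assumption. The names `RelationType` and `parse_relation` are illustrative.

```python
from enum import Enum


class RelationType(str, Enum):
    # Only the relation types named in this profile; the accepted set may be larger.
    IS_DERIVED_FROM = "IsDerivedFrom"
    IS_SOURCE_OF = "IsSourceOf"
    IS_PART_OF = "IsPartOf"
    HAS_PART = "HasPart"
    REFERENCES = "References"
    IS_REFERENCED_BY = "IsReferencedBy"
    IS_VERSION_OF = "IsVersionOf"
    HAS_VERSION = "HasVersion"


def parse_relation(value: str) -> RelationType:
    """Return the matching relation type or raise ValueError for anything else."""
    try:
        # Lookup by value is exact and case-sensitive.
        return RelationType(value)
    except ValueError:
        raise ValueError(f"invalid relation type: {value!r}") from None
```

Because the lookup is case-sensitive, a string such as `isPartOf` is rejected rather than silently normalized.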

Linked Data And Archive Contract

Each dataset exposes:

  • GET /v1/neurodata/datasets/{dataset_id}/metadata.jsonld
  • GET /v1/neurodata/datasets/{dataset_id}/archive.json
  • GET /v1/neurodata/datasets/{dataset_id}/doi
  • GET /v1/neurodata/datasets/{dataset_id}/provenance
  • GET /v1/neurodata/datasets/{dataset_id}/versions
  • GET /v1/neurodata/datasets/{dataset_id}/lineage
  • GET /v1/neurodata/datasets/{dataset_id}/citation
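The routes above share one prefix, so a client can derive all seven URLs from a base URL and a dataset identifier. This is a minimal sketch: the helper name `dataset_endpoints` and the base URL in the usage line are placeholders, and only the paths come from this profile.

```python
from urllib.parse import quote

# Resource names taken from the endpoint list in this profile.
DATASET_RESOURCES = (
    "metadata.jsonld", "archive.json", "doi",
    "provenance", "versions", "lineage", "citation",
)


def dataset_endpoints(base_url: str, dataset_id: str) -> dict[str, str]:
    """Map each resource name to its full URL for one dataset."""
    # Percent-encode the identifier so it cannot alter the path structure.
    root = f"{base_url.rstrip('/')}/v1/neurodata/datasets/{quote(dataset_id, safe='')}"
    return {name: f"{root}/{name}" for name in DATASET_RESOURCES}


# Usage with a placeholder host and an illustrative UUIDv7.
urls = dataset_endpoints("https://syndb.example", "018f4e2a-7b3c-7def-8abc-0123456789ab")
```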

The JSON-LD document uses Schema.org, DCAT, Dublin Core, DataCite, PROV, SPDX, and SynDB terms. It includes conformsTo pointing to this profile.

The archive bundle is the long-term metadata preservation surface. It includes dataset metadata, JSON-LD, DataCite metadata, citations, provenance, versions, lineage, external references, deletion status, and navigation links.

Dataset DOIs are minted through DataCite when the deployment has DATACITE_ENABLED set and has repository credentials and a DOI prefix configured. Local or development deployments must report DOI minting as unavailable rather than fabricating DOIs.
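The availability rule can be sketched as a pure function over the deployment's environment. Only `DATACITE_ENABLED` is named in this profile. The other variable names (`DATACITE_REPOSITORY_ID`, `DATACITE_PASSWORD`, `DATACITE_PREFIX`) and the set of values treated as "enabled" are assumptions made for illustration.

```python
from typing import Mapping, Optional

# Assumed spellings for an enabled flag; the real parsing may differ.
TRUTHY = {"1", "true", "yes", "on"}

# Hypothetical names for the credential and prefix settings.
REQUIRED_SETTINGS = ("DATACITE_REPOSITORY_ID", "DATACITE_PASSWORD", "DATACITE_PREFIX")


def doi_minting_status(env: Mapping[str, str]) -> dict[str, Optional[object]]:
    """Report whether DOI minting is available, and why not when it is not."""
    missing = []
    if env.get("DATACITE_ENABLED", "").strip().lower() not in TRUTHY:
        missing.append("DATACITE_ENABLED")
    missing.extend(name for name in REQUIRED_SETTINGS if not env.get(name))

    if missing:
        # Report unavailability; never invent a DOI as a fallback.
        return {"available": False, "reason": "not configured: " + ", ".join(missing)}
    return {"available": True, "reason": None}
```

Passing `os.environ` as `env` checks the running process, while passing a plain dictionary keeps the function easy to test.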

Validation Rules

Dataset creation fails if any required species, microscopy, or brain-region term cannot be resolved to a non-deprecated ontology term with a URI.

Startup validation fails if the ontology registry is incomplete for any neurometa enum variant or persisted dataset metadata term. Missing data must be fixed upstream by adding or repairing the relevant ontology term before SynDB starts.
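The startup rule amounts to a set difference between the terms the system needs and the terms the registry holds. The sketch below assumes terms are identified by `(vocabulary, term_id)` pairs. The function and exception names are hypothetical and do not describe SynDB's internals.

```python
from typing import Iterable, Tuple

TermKey = Tuple[str, str]  # (vocabulary, term_id), an assumed identifier shape


class RegistryIncompleteError(RuntimeError):
    """Raised when a required ontology term is absent from the registry."""


def validate_registry(
    registry: Iterable[TermKey],
    enum_terms: Iterable[TermKey],
    persisted_terms: Iterable[TermKey],
) -> None:
    """Raise if any enum-backed or persisted term is missing from the registry."""
    required = set(enum_terms) | set(persisted_terms)
    missing = sorted(required - set(registry))
    if missing:
        listed = ", ".join(f"{vocab}:{term}" for vocab, term in missing)
        raise RegistryIncompleteError(f"ontology registry is missing: {listed}")
```

Raising instead of skipping the missing term matches the requirement that gaps be repaired in the registry before the service starts.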

For production FAIR scoring, each published dataset must have a DataCite DOI record. Publication DOIs remain related identifiers and do not replace the dataset DOI.