SynDB Connectomics Data Profile

The SynDB Connectomics Data Profile defines the metadata contract required for datasets to be findable, accessible, interoperable, and reusable in SynDB.

Required Dataset Metadata

Every dataset must include:

UUIDv7 dataset identifier.
DataCite dataset DOI, required in production to claim F1=3 (FAIR principle F1: a globally unique and persistent identifier).
Human-readable dataset label.
SPDX-compatible data license.
Access policy: open, registered, or restricted.
Species resolved to an active ontology_term in the ncbi_taxon vocabulary.
Microscopy technique resolved to an active ontology_term in the microscopy vocabulary.
Brain regions resolved to active ontology terms in uberon, fbbt, or SynDB’s internal brain_region vocabulary.
Declared list of SynDB tables and the upload state of each table.
Provenance, version, citation, lineage, and archive links.
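
As an illustration of the list above, the following minimal sketch checks a metadata record for a few of the required fields. The key names (dataset_id, label, license, access_policy, species, microscopy, brain_regions) and the function names are assumptions made for this example; the profile does not define them, and the sketch covers neither the DOI requirement nor the table, provenance, and lineage fields.

```python
import uuid

# Access policies named by the profile.
ACCESS_POLICIES = {"open", "registered", "restricted"}

# Hypothetical key names: the profile lists requirements, not field names.
REQUIRED_FIELDS = (
    "dataset_id", "label", "license", "access_policy",
    "species", "microscopy", "brain_regions",
)


def is_uuid_v7(value) -> bool:
    """Return True if value parses as an RFC-variant UUID with version 7."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return parsed.variant == uuid.RFC_4122 and parsed.version == 7


def check_required_metadata(metadata: dict) -> list[str]:
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if not metadata.get(name)
    ]
    dataset_id = metadata.get("dataset_id")
    if dataset_id and not is_uuid_v7(dataset_id):
        problems.append("dataset_id is not a UUIDv7")
    policy = metadata.get("access_policy")
    if policy and policy not in ACCESS_POLICIES:
        problems.append(f"unknown access policy: {policy}")
    return problems


# Illustrative record; all values are made up for the example.
example = {
    "dataset_id": "018f4e2a-7b3c-7d9e-8a1b-2c3d4e5f6a7b",
    "label": "Example connectome",
    "license": "CC-BY-4.0",
    "access_policy": "open",
    "species": "7227",
    "microscopy": "example_technique",
    "brain_regions": ["example_region"],
}
print(check_required_metadata(example))
```

Returning a list of problems rather than raising on the first one lets a submitter fix every issue in a single pass.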

Ontology Vocabularies

SynDB metadata uses these vocabularies:

Vocabulary	Use
`ncbi_taxon`	Species and taxonomic identity
`microscopy`	Imaging and reconstruction modality
`uberon`	Vertebrate anatomical structures
`fbbt`	Drosophila anatomical structures
`brain_region`	SynDB terms for structures not yet mapped to external vocabularies
`chebi`	Neurotransmitter and chemical identity

All required ontology terms must be active, have a URI, and carry a registry version.
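
The three conditions above can be sketched as a small check over a term record. The OntologyTerm shape, its field names, and the registry version string are hypothetical; only the vocabulary names and the three requirements come from this profile.

```python
from dataclasses import dataclass
from typing import Optional

# Vocabulary names taken from the table in this section.
KNOWN_VOCABULARIES = {
    "ncbi_taxon", "microscopy", "uberon", "fbbt", "brain_region", "chebi",
}


@dataclass(frozen=True)
class OntologyTerm:
    """Hypothetical record shape for a registry entry."""
    vocabulary: str
    term_id: str
    active: bool
    uri: Optional[str]
    registry_version: Optional[str]


def term_problems(term: OntologyTerm) -> list[str]:
    """Return the reasons a term fails the profile's requirements."""
    problems = []
    if term.vocabulary not in KNOWN_VOCABULARIES:
        problems.append(f"unknown vocabulary: {term.vocabulary}")
    if not term.active:
        problems.append("term is not active")
    if not term.uri:
        problems.append("term has no URI")
    if not term.registry_version:
        problems.append("term has no registry version")
    return problems


# Drosophila melanogaster in NCBI Taxonomy; the registry version is made up.
fly = OntologyTerm(
    vocabulary="ncbi_taxon",
    term_id="7227",
    active=True,
    uri="http://purl.obolibrary.org/obo/NCBITaxon_7227",
    registry_version="example-1",
)
print(term_problems(fly))
```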

Relations

Dataset lineage and external references use DataCite relation types, including IsDerivedFrom, IsSourceOf, IsPartOf, HasPart, References, IsReferencedBy, IsVersionOf, and HasVersion.

Any relation string that is not a recognized DataCite relation type is rejected.
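
One way to enforce this rule is to parse relation strings into an enumeration. The sketch below includes only the relation types named in this section, not the full DataCite list, and assumes exact, case-sensitive matching; the RelationType and parse_relation names are illustrative.

```python
from enum import Enum


class RelationType(str, Enum):
    """Only the relation types named in this profile section."""
    IS_DERIVED_FROM = "IsDerivedFrom"
    IS_SOURCE_OF = "IsSourceOf"
    IS_PART_OF = "IsPartOf"
    HAS_PART = "HasPart"
    REFERENCES = "References"
    IS_REFERENCED_BY = "IsReferencedBy"
    IS_VERSION_OF = "IsVersionOf"
    HAS_VERSION = "HasVersion"


def parse_relation(value: str) -> RelationType:
    """Return the matching relation type or raise ValueError."""
    try:
        return RelationType(value)
    except ValueError:
        raise ValueError(f"invalid relation type: {value!r}") from None


print(parse_relation("IsDerivedFrom").value)
```

Parsing at the boundary means the rest of the code handles RelationType values and never sees an unchecked string.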

Linked Data And Archive Contract

Each dataset exposes the following endpoints:

GET /v1/neurodata/datasets/{dataset_id}/metadata.jsonld
GET /v1/neurodata/datasets/{dataset_id}/archive.json
GET /v1/neurodata/datasets/{dataset_id}/doi
GET /v1/neurodata/datasets/{dataset_id}/provenance
GET /v1/neurodata/datasets/{dataset_id}/versions
GET /v1/neurodata/datasets/{dataset_id}/lineage
GET /v1/neurodata/datasets/{dataset_id}/citation
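
A client can derive all seven URLs from a dataset identifier. The sketch below only builds the URLs and makes no requests; the base URL https://syndb.example and the dataset_urls helper are placeholders, not part of the profile.

```python
from urllib.parse import quote

# Final path segments of the endpoints listed above.
RESOURCES = (
    "metadata.jsonld", "archive.json", "doi",
    "provenance", "versions", "lineage", "citation",
)


def dataset_urls(base_url: str, dataset_id: str) -> dict[str, str]:
    """Map each resource name to its URL for one dataset."""
    root = (
        f"{base_url.rstrip('/')}/v1/neurodata/datasets/"
        f"{quote(dataset_id, safe='')}"
    )
    return {name: f"{root}/{name}" for name in RESOURCES}


urls = dataset_urls(
    "https://syndb.example/",  # placeholder host
    "018f4e2a-7b3c-7d9e-8a1b-2c3d4e5f6a7b",
)
for name, url in urls.items():
    print(name, url)
```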

The JSON-LD document uses Schema.org, DCAT, Dublin Core, DataCite, PROV, SPDX, and SynDB terms. It includes conformsTo pointing to this profile.

The archive bundle is the long-term metadata preservation surface. It includes dataset metadata, JSON-LD, DataCite metadata, citations, provenance, versions, lineage, external references, deletion status, and navigation links.

Dataset DOIs are minted through DataCite when the deployment has DATACITE_ENABLED, repository credentials, and a DOI prefix configured. Local or development deployments must report DOI minting as unavailable rather than fabricating DOI identifiers.
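
The availability rule can be sketched as a pure function over the deployment configuration. Only DATACITE_ENABLED is named by this profile; the DoiConfig fields and the shape of the returned status are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoiConfig:
    """Hypothetical configuration shape; defaults model a local deployment."""
    datacite_enabled: bool = False
    repository_id: str = ""
    repository_password: str = ""
    doi_prefix: str = ""


def doi_minting_status(config: DoiConfig) -> dict:
    """Report whether minting is available and, if not, what is missing."""
    missing = []
    if not config.datacite_enabled:
        missing.append("DATACITE_ENABLED")
    if not (config.repository_id and config.repository_password):
        missing.append("repository credentials")
    if not config.doi_prefix:
        missing.append("DOI prefix")
    return {"available": not missing, "missing": missing}


# A development deployment reports unavailability instead of inventing a DOI.
print(doi_minting_status(DoiConfig()))
```

Reporting the missing settings by name makes the unavailable state actionable for operators.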

Validation Rules

Dataset creation fails if any required species, microscopy, or brain-region term cannot be resolved to an active (non-deprecated) ontology term with a URI.

Startup validation fails if the ontology registry is incomplete for any neurometa enum variant or persisted dataset metadata term. Missing data must be fixed upstream by adding or repairing the relevant ontology term before SynDB starts.
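
The startup check amounts to a set difference between the terms SynDB needs and the terms the registry holds. In this sketch, terms are (vocabulary, term_id) pairs and the registry is a plain set; both representations, the term identifiers, and the names used are assumptions for illustration.

```python
class RegistryIncompleteError(RuntimeError):
    """Raised when required ontology terms are absent from the registry."""


def missing_registry_terms(registry, enum_variants, persisted_terms):
    """Return required (vocabulary, term_id) pairs not present in the registry."""
    required = set(enum_variants) | set(persisted_terms)
    return sorted(required - set(registry))


def validate_at_startup(registry, enum_variants, persisted_terms) -> None:
    """Refuse to start if any required term is missing."""
    missing = missing_registry_terms(registry, enum_variants, persisted_terms)
    if missing:
        listed = ", ".join(f"{vocab}:{term}" for vocab, term in missing)
        raise RegistryIncompleteError(f"ontology registry is missing: {listed}")


# Made-up identifiers, for illustration only.
registry = {("ncbi_taxon", "7227"), ("microscopy", "example_technique")}
enum_variants = [("ncbi_taxon", "7227"), ("microscopy", "example_technique")]
persisted = [("brain_region", "example_region")]

try:
    validate_at_startup(registry, enum_variants, persisted)
except RegistryIncompleteError as exc:
    print(exc)
```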

For production FAIR scoring, each published dataset must have a DataCite DOI record. Publication DOIs remain related identifiers and do not replace the dataset DOI.