SynDB Connectomics Data Profile
The SynDB Connectomics Data Profile defines the metadata contract required for datasets to be findable, accessible, interoperable, and reusable in SynDB.
Required Dataset Metadata
Every dataset must include:
- UUIDv7 dataset identifier.
- DataCite dataset DOI for production F1=3 FAIR claims.
- Human-readable dataset label.
- SPDX-compatible data license.
- Access policy: open, registered, or restricted.
- Species resolved to an active ontology_term in the ncbi_taxon vocabulary.
- Microscopy technique resolved to an active ontology_term in the microscopy vocabulary.
- Brain regions resolved to active ontology terms in uberon, fbbt, or SynDB’s internal brain_region vocabulary.
- Declared SynDB table list and uploaded table state.
- Provenance, version, citation, lineage, and archive links.
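As a rough illustration of the simpler items in this list, the sketch below checks the identifier, label, license, and access policy of a record. The `DatasetMetadata` class and its field names are hypothetical; the profile does not define a concrete schema, and the ontology-backed fields are left out here.

```python
import uuid
from dataclasses import dataclass
from typing import List, Optional

# Access policy values named in the profile.
ACCESS_POLICIES = {"open", "registered", "restricted"}


@dataclass
class DatasetMetadata:
    # Hypothetical field names; the profile does not prescribe them.
    dataset_id: str
    label: str
    license: str  # SPDX identifier such as "CC-BY-4.0"
    access_policy: str
    doi: Optional[str] = None  # needed only for production FAIR claims


def check_required(meta: DatasetMetadata) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    try:
        # UUID.version reads the version nibble, so parsed UUIDv7 strings report 7.
        if uuid.UUID(meta.dataset_id).version != 7:
            problems.append("dataset_id is not a UUIDv7")
    except ValueError:
        problems.append("dataset_id is not a valid UUID")
    if not meta.label.strip():
        problems.append("label is empty")
    if not meta.license.strip():
        problems.append("license is empty")
    if meta.access_policy not in ACCESS_POLICIES:
        problems.append("access_policy must be open, registered, or restricted")
    return problems
```

Returning a list of problems rather than raising on the first one lets a submitter fix every issue in a single pass.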
Ontology Vocabularies
SynDB metadata uses these vocabularies:
| Vocabulary | Use |
|---|---|
| ncbi_taxon | Species and taxonomic identity |
| microscopy | Imaging and reconstruction modality |
| uberon | Vertebrate anatomical structures |
| fbbt | Drosophila anatomical structures |
| brain_region | SynDB terms for structures not yet mapped to external vocabularies |
| chebi | Neurotransmitter and chemical identity |
All required ontology terms must be active, have a URI, and carry a registry version.
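The three conditions above can be expressed as a single predicate. This is a minimal sketch: the `OntologyTerm` class, its field names, and the example registry version are assumptions made for illustration, not SynDB's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyTerm:
    # Hypothetical shape of a registry entry.
    vocabulary: str
    term_id: str
    uri: Optional[str]
    registry_version: Optional[str]
    active: bool = True


def is_usable(term: OntologyTerm) -> bool:
    """True when the term is active, has a URI, and carries a registry version."""
    return term.active and bool(term.uri) and bool(term.registry_version)


# NCBI Taxonomy 7227 is Drosophila melanogaster; the version string is made up.
fly = OntologyTerm(
    vocabulary="ncbi_taxon",
    term_id="7227",
    uri="http://purl.obolibrary.org/obo/NCBITaxon_7227",
    registry_version="example-1",
)
```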
Relations
Dataset lineage and external references use DataCite relation types, including IsDerivedFrom, IsSourceOf, IsPartOf, HasPart, References, IsReferencedBy, IsVersionOf, and HasVersion.
Invalid relation strings are rejected.
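One way to enforce this is to parse relation strings into an enumeration. The sketch below covers only the eight relation types named above; the profile says "including", so the accepted set may be larger, and the `parse_relation` helper is hypothetical.

```python
from enum import Enum


class RelationType(str, Enum):
    # Only the relation types listed in this profile; DataCite defines more.
    IS_DERIVED_FROM = "IsDerivedFrom"
    IS_SOURCE_OF = "IsSourceOf"
    IS_PART_OF = "IsPartOf"
    HAS_PART = "HasPart"
    REFERENCES = "References"
    IS_REFERENCED_BY = "IsReferencedBy"
    IS_VERSION_OF = "IsVersionOf"
    HAS_VERSION = "HasVersion"


def parse_relation(value: str) -> RelationType:
    """Return the matching relation type or raise ValueError for unknown strings."""
    try:
        return RelationType(value)
    except ValueError:
        raise ValueError(f"invalid relation type: {value!r}") from None
```

Matching is case-sensitive in this sketch, so a string such as `isPartOf` would be rejected.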
Linked Data And Archive Contract
Each dataset exposes:
- GET /v1/neurodata/datasets/{dataset_id}/metadata.jsonld
- GET /v1/neurodata/datasets/{dataset_id}/archive.json
- GET /v1/neurodata/datasets/{dataset_id}/doi
- GET /v1/neurodata/datasets/{dataset_id}/provenance
- GET /v1/neurodata/datasets/{dataset_id}/versions
- GET /v1/neurodata/datasets/{dataset_id}/lineage
- GET /v1/neurodata/datasets/{dataset_id}/citation
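These routes share one prefix, so a client can derive all of them from a base URL and a dataset identifier. The helper below is a sketch; the function name and the example host are hypothetical.

```python
from typing import Dict

# Route suffixes taken from the endpoint list in this profile.
ENDPOINT_SUFFIXES = (
    "metadata.jsonld",
    "archive.json",
    "doi",
    "provenance",
    "versions",
    "lineage",
    "citation",
)


def dataset_urls(base_url: str, dataset_id: str) -> Dict[str, str]:
    """Map each route suffix to its full URL for one dataset."""
    root = f"{base_url.rstrip('/')}/v1/neurodata/datasets/{dataset_id}"
    return {suffix: f"{root}/{suffix}" for suffix in ENDPOINT_SUFFIXES}


# Hypothetical host; substitute the address of a real deployment.
urls = dataset_urls("https://syndb.example/", "018f3c5e-7b2a-7c3d-9e4f-1a2b3c4d5e6f")
```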
The JSON-LD document uses Schema.org, DCAT, Dublin Core, DataCite, PROV, SPDX, and SynDB terms. It includes a conformsTo property that points to this profile.
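To make the vocabulary mix concrete, the sketch below assembles a small JSON-LD document. The namespace URIs for Schema.org, DCAT, Dublin Core, the DataCite ontology, PROV, and SPDX are the published ones; the SynDB namespace, the profile URI, and the choice of properties are assumptions, since the profile does not spell out the document shape.

```python
import json
from typing import Any, Dict

# Hypothetical URIs; the real profile and SynDB term namespaces are not given here.
PROFILE_URI = "https://example.org/syndb/profiles/connectomics"
SYNDB_NS = "https://example.org/syndb/terms#"

CONTEXT = {
    "schema": "https://schema.org/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "datacite": "http://purl.org/spar/datacite/",
    "prov": "http://www.w3.org/ns/prov#",
    "spdx": "http://spdx.org/rdf/terms#",
    "syndb": SYNDB_NS,
}


def build_jsonld(dataset_id: str, label: str, license_id: str) -> Dict[str, Any]:
    """Build a minimal document that declares conformance to the profile."""
    return {
        "@context": CONTEXT,
        "@id": f"urn:uuid:{dataset_id}",
        "@type": ["schema:Dataset", "dcat:Dataset"],
        "schema:name": label,
        "dct:license": {"@id": f"https://spdx.org/licenses/{license_id}"},
        "dct:conformsTo": {"@id": PROFILE_URI},
    }


doc = build_jsonld("018f3c5e-7b2a-7c3d-9e4f-1a2b3c4d5e6f", "Sample dataset", "CC-BY-4.0")
serialized = json.dumps(doc, indent=2)
```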
The archive bundle is the long-term metadata preservation surface. It includes dataset metadata, JSON-LD, DataCite metadata, citations, provenance, versions, lineage, external references, deletion status, and navigation links.
Dataset DOIs are minted through DataCite when the deployment has DATACITE_ENABLED, repository credentials, and a DOI prefix configured. Local or development deployments must report DOI minting as unavailable rather than fabricating DOI identifiers.
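The availability rule can be sketched as a configuration check that reports what is missing instead of inventing an identifier. Only DATACITE_ENABLED appears in this profile; treating it as an environment variable, the accepted truthy values, and the other three setting names are assumptions.

```python
import os
from typing import Dict, List, Mapping, Optional

# Hypothetical setting names for repository credentials and the DOI prefix.
REQUIRED_SETTINGS = (
    "DATACITE_REPOSITORY_ID",
    "DATACITE_PASSWORD",
    "DATACITE_DOI_PREFIX",
)


def doi_minting_status(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Report whether DOI minting is available and which settings are missing."""
    env = os.environ if env is None else env
    missing: List[str] = []
    if env.get("DATACITE_ENABLED", "").strip().lower() not in {"1", "true", "yes"}:
        missing.append("DATACITE_ENABLED")
    missing.extend(name for name in REQUIRED_SETTINGS if not env.get(name))
    return {"available": not missing, "missing": missing}


def require_doi_minting(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise instead of letting a caller fall back to a fabricated DOI."""
    status = doi_minting_status(env)
    if not status["available"]:
        names = ", ".join(status["missing"])
        raise RuntimeError(f"DOI minting unavailable; missing settings: {names}")
```

A local deployment with no DataCite settings would then surface the unavailable status from its DOI endpoint rather than returning a placeholder identifier.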
Validation Rules
Dataset creation fails if any required species, microscopy, or brain-region term cannot be resolved to a non-deprecated ontology term with a URI.
Startup validation fails if the ontology registry is incomplete for any neurometa enum variant or persisted dataset metadata term. Missing data must be fixed upstream by adding or repairing the relevant ontology term before SynDB starts.
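A startup check along these lines might compare the terms the application needs against the registry and refuse to continue when any are absent. The enum below is a hypothetical stand-in for a neurometa enum, and terms are modeled as (vocabulary, term id) pairs purely for illustration.

```python
from enum import Enum
from typing import Iterable, List, Set, Tuple

Term = Tuple[str, str]  # (vocabulary, term id); an assumed representation


class MicroscopyTechnique(Enum):
    # Hypothetical variants; the real neurometa enums are not listed here.
    ELECTRON = "electron_microscopy"
    LIGHT = "light_microscopy"


def missing_terms(required: Iterable[Term], registry: Set[Term]) -> List[Term]:
    """Return required terms that the registry does not contain, sorted."""
    return sorted(set(required) - registry)


def validate_startup(registry: Set[Term], persisted: Iterable[Term]) -> None:
    """Raise if any enum variant or persisted metadata term is unregistered."""
    required = [("microscopy", variant.value) for variant in MicroscopyTechnique]
    required.extend(persisted)
    missing = missing_terms(required, registry)
    if missing:
        raise RuntimeError(f"ontology registry incomplete: {missing}")
```

Failing at startup, rather than on first use, keeps a deployment from serving datasets whose metadata can no longer be resolved.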
For production FAIR scoring, each published dataset must have a DataCite DOI record. Publication DOIs remain related identifiers and do not replace the dataset DOI.