Upload

Note

Prerequisites

This article assumes that you understand how data is stored on SynDB. If you are uncertain, we recommend reading the overview article first.

Uploading to SynDB is a multistep process that requires an understanding of the SynDB dataset model.

The process

Preparation

We recommend that you follow this guide in the exact order given, so that each step builds on the previous one.

Terms and conditions

You must accept the terms and conditions before uploading data. The terms include:

Statement that the data is not false or misleading
Redistribution rights
A data licensing agreement under the license of your choice (see the guide to picking a license); the current default is CC BY 4.0.

Data structuring

SynDB relies on a standardized data layout to facilitate uploads. Your imaging metrics must be in a tabular data format, for instance .xlsx, .csv, or .parquet. Read more about data structuring in the contributor’s guide.
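Before starting an upload, it can help to confirm that every file you intend to submit is in a tabular format. The following is a minimal sketch using only the Python standard library. The helper name `split_by_format` is hypothetical, and the set of suffixes contains only the examples named above; the contributor’s guide remains the authoritative list.

```python
from pathlib import Path

# Suffixes named in this guide as examples of tabular formats.
# Assumption: the contributor's guide may accept others.
TABULAR_SUFFIXES = {".xlsx", ".csv", ".parquet"}


def split_by_format(directory):
    """Return (tabular, other) lists of the files directly inside `directory`."""
    tabular, other = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        if path.suffix.lower() in TABULAR_SUFFIXES:
            tabular.append(path)
        else:
            other.append(path)
    return tabular, other
```

Any file that ends up in the second list would need to be converted or left out before you continue.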

When you open the upload page, you will be prompted to log in to your SynDB account if you are not already logged in. You must also verify your academic status by logging in to your institution’s account.

The upload

You can upload data using the CLI, the web UI, or a mix of both. The UI is usually the simplest path for a first upload, while the CLI is better suited to reproducible, scripted ingestion.

1. Assign IDs, and correlate relations

Each SynDB unit must be assigned a unique ID before it is uploaded to the platform. The web UI does this automatically; the CLI does not. When a dataset contains multiple SynDB tables, those tables are expected to be related to each other.
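If you need to assign IDs yourself, one way to do it is sketched below with the Python standard library. This is an illustration rather than the platform's own procedure: the column name `id` and the helper `assign_ids` are hypothetical, so use the column names that the data structuring guide specifies.

```python
import csv
import io
import uuid

ID_COLUMN = "id"  # hypothetical column name, not taken from the SynDB schema


def assign_ids(rows, id_column=ID_COLUMN):
    """Return copies of `rows`, adding a random UUID string where the ID is missing or empty."""
    out = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        if not row.get(id_column):
            row[id_column] = str(uuid.uuid4())
        out.append(row)
    return out


# Usage: read a small in-memory CSV and give every row an ID.
raw = io.StringIO("label,volume\nn1,1.5\nn2,2.0\n")
rows = assign_ids(csv.DictReader(raw))
assert len({row[ID_COLUMN] for row in rows}) == len(rows)  # IDs are distinct
```

Rows that already carry an ID are left unchanged, so the function can be run again on the same data without breaking existing relations.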

Warning

Dataset integrity

Uploading unrelated SynDB tables under the same dataset is not allowed, as it may lead to undefined behaviour.

For example, you cannot upload a table of neurons and a table of synapses under the same dataset unless each synapse is related to a neuron in that table of neurons.
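The neuron and synapse rule can be checked locally before upload. The sketch below assumes each table has been loaded as a list of dictionaries; the column names `id` and `parent_id` and the helper `find_orphans` are hypothetical and are not taken from the SynDB schema.

```python
def find_orphans(children, parents, parent_column="parent_id", id_column="id"):
    """Return the child rows whose parent reference matches no row in `parents`."""
    parent_ids = {row[id_column] for row in parents}
    return [row for row in children if row.get(parent_column) not in parent_ids]


# Usage with illustrative data: the second synapse points at an unknown neuron.
neurons = [{"id": "n-1"}, {"id": "n-2"}]
synapses = [
    {"id": "s-1", "parent_id": "n-1"},
    {"id": "s-2", "parent_id": "n-9"},
]
orphans = find_orphans(synapses, neurons)
assert [row["id"] for row in orphans] == ["s-2"]
```

An empty result means every child row refers to an existing parent row. It does not replace the validation that the platform itself performs.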

Web UI

The web UI will automatically assign UUIDs to each SynDB unit. Parent-child relations are checked against the current SynDB table hierarchy during validation; see the data structuring guide for the current dataset model and naming rules.

CLI

The CLI flow is explicit and reproducible:

Create the dataset metadata record and note the returned dataset ID.

syndb data new \
  --label "My connectome release" \
  --animal "Drosophila melanogaster" \
  --microscopy EM \
  --table 1 \
  --table 6 \
  --brain-structure "mushroom body" \
  --license CC_BY

Prepare raw tabular files into a validated parquet upload directory.

syndb data prepare \
  --input-dir raw_dataset \
  --output-dir prepared_dataset

Validate the prepared parquet files before upload.

syndb data validate --input-dir prepared_dataset

Upload the prepared dataset through Arrow Flight.

syndb data upload \
  --input-dir prepared_dataset \
  --dataset-id <syndb-dataset-id>

This CLI flow mirrors the current validator and upload path used by the rest of the platform.
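For scripted ingestion, the prepare, validate, and upload steps can be chained so that a failure stops the run. This is a minimal sketch that assumes the `syndb` executable is on your PATH and uses only the subcommands and flags shown above. The helper names `build_commands` and `run_pipeline` are hypothetical, and the dataset ID is the one returned by `syndb data new`.

```python
import subprocess


def build_commands(raw_dir, prepared_dir, dataset_id):
    """Return the argument lists for the three CLI steps, in the order they must run."""
    return [
        ["syndb", "data", "prepare", "--input-dir", raw_dir, "--output-dir", prepared_dir],
        ["syndb", "data", "validate", "--input-dir", prepared_dir],
        ["syndb", "data", "upload", "--input-dir", prepared_dir, "--dataset-id", dataset_id],
    ]


def run_pipeline(raw_dir, prepared_dir, dataset_id):
    """Run each step in order; check=True raises CalledProcessError on a non-zero exit."""
    for argv in build_commands(raw_dir, prepared_dir, dataset_id):
        subprocess.run(argv, check=True)
```

Because `check=True` raises on the first non-zero exit code, the upload step is never reached if preparation or validation fails.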
