Getting Started
Warning
Please be aware that we are in the process of modernising our models to facilitate a new release of Data Central. To support this we have paused the acceptance of new data until April 2026. Please don’t generate any metadata for your upcoming Data Central data, as there will be changes to the metadata format that will supersede your work. Our documentation will be updated when we are ready to receive your metadata, and we will notify you through our usual channels: the ASA mailing list, and the datacentral-announce Slack channel.
If you have any questions, please contact us.
Note
Please be aware that this is a living document. Although Data Central will make every effort to minimise changes to the required metadata format, we may from time to time change these requirements in order to deliver the best possible user experience.
Data Central can currently ingest, link, cross-match and serve spectra, images, catalogues and IFS data products. We have detailed descriptions of how to format your data depending on its type, but first, follow the instructions below to set up the basic file structure and metadata files for your data release.
Metadata files are provided alongside your data and allow us to populate the Schema Browser to best describe your data. These files cannot be edited once your data have been released via Data Central. Before the final sign-off from the survey team, we will go through an iterative process with you to verify that the data are correct and displayed as expected; during that stage you are free to make changes to the metadata files.
Note
Documentation intended for public use and/or subject to change (e.g., with detailed descriptions of analysis or version update information) should be maintained by the survey teams in Data Central’s Documentation portal: Documentation Central.
To get your team set up within Data Central, if you have not done so already, contact us by following this link and choosing to Add new data.
Whilst we can be flexible about how we receive the data and metadata, here’s what we see as the quickest and easiest way to get your data ingested:
Create a Data Central Account at https://accounts.datacentral.org.au/register/ (if you have not already done so).
Request a team at https://teams.datacentral.org.au/request-team/ (if you do not already have a team for the survey/data). The team will need to be approved by a Data Central admin.
Log in with your Data Central account at https://dev.aao.org.au/ (using the Data Central button), as this will give you git access to the repository which stores all the metadata. If multiple members of the team want to update the metadata, have them also create Data Central accounts and log in to https://dev.aao.org.au.
Contact us by following this link and choosing to Add new data. Let us know your data format and size so that we can identify the best way to transfer them.
Make a merge request in our dcmetadata repository with the metadata as documented below.
Check the data in the pre-release link we (the Data Central team) send you, and sign off on the release.
Directory Structure
To ingest data into Data Central, you will provide two folders: one containing the data products themselves, and one containing the metadata. The data/ and metadata/ directories will be further populated according to the types of data products you wish to release (catalogue, IFS, spectra, etc.; see section Data Types for specific requirements).
Data
A top-level <survey> directory should contain a single directory per <datarelease>.
data
└── <survey>
    └── <datarelease>
Attention
<survey> and <datarelease> should be replaced with the values you chose in Getting Started, e.g., gama and dr2.
Metadata
Adopt the file structure shown below: a top-level <survey> directory containing a single directory per <datarelease>.
The metadata files will populate the Schema Browser.
dcmetadata
└── surveys
    └── <survey>
        ├── <survey>_survey_meta.txt
        └── <datarelease>
            ├── <survey>_<datarelease>_data_release_meta.txt
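As a rough illustration, the two directory layouts above could be scaffolded with a short script. This is only a sketch: the function name make_skeleton and the root argument are hypothetical, and it assumes you are working in a local folder (in practice the dcmetadata tree lives in the git repository you submit a merge request against). The metadata files are created empty as placeholders.

```python
from pathlib import Path


def make_skeleton(root: Path, survey: str, datarelease: str) -> list[Path]:
    """Create the data/ and dcmetadata/ layouts and empty metadata files.

    `root` is a hypothetical local working folder; `survey` and
    `datarelease` are the machine-readable names, e.g. "gama" and "dr2".
    """
    data_dir = root / "data" / survey / datarelease
    survey_dir = root / "dcmetadata" / "surveys" / survey
    release_dir = survey_dir / datarelease

    # parents=True builds the intermediate <survey> directories as well.
    data_dir.mkdir(parents=True, exist_ok=True)
    release_dir.mkdir(parents=True, exist_ok=True)

    meta_files = [
        survey_dir / f"{survey}_survey_meta.txt",
        release_dir / f"{survey}_{datarelease}_data_release_meta.txt",
    ]
    for path in meta_files:
        path.touch(exist_ok=True)  # empty placeholder, to be filled in later
    return meta_files
```

Calling make_skeleton(Path("."), "gama", "dr2") would create data/gama/dr2 and the two placeholder metadata files under dcmetadata/surveys/gama.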
Metadata Files
Attention
Metadata files are always pipe-delimited, and have the extension .txt.
There are two files you will need to provide to initialize your survey and data release in Data Central, <survey>_survey_meta.txt and <survey>_<datarelease>_data_release_meta.txt.
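To check a metadata file locally before submitting it, a pipe-delimited file can be read with Python's standard csv module. The sketch below makes assumptions that this page does not state: that the first line is a header row of column names, that values are not quoted, and that whitespace around each value can be ignored. The helper name read_meta is hypothetical.

```python
import csv
from pathlib import Path


def read_meta(path: Path) -> list[dict[str, str]]:
    """Read a pipe-delimited metadata file into a list of {column: value} rows.

    Assumes a header row and no quoting (both are assumptions).
    """
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)
        # Skip blank lines and strip padding around each cell.
        rows = [[cell.strip() for cell in row] for row in reader if row]

    if not rows:
        raise ValueError(f"{path.name}: file is empty")

    header, *records = rows
    for number, record in enumerate(records, start=2):
        if len(record) != len(header):
            raise ValueError(
                f"{path.name}: line {number} has {len(record)} values, "
                f"expected {len(header)}"
            )
    return [dict(zip(header, record)) for record in records]
```

A mismatch between the number of values and the number of columns usually means a stray or missing pipe, so the function raises rather than guessing.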
<survey>_survey_meta.txt
This file describes your survey at a high level, and is used to populate the Schema Browser. It is uneditable once the data are released. Remember, documentation intended for public use and/or subject to change (e.g., with detailed descriptions of analysis or version update information) should be maintained by the survey teams in Data Central’s Documentation portal, Documentation Central, not the description field below.
Provide the following in a single pipe-delimited .txt file containing an entry (row) for your survey:
| name | pretty_name | title | description | pi | contact | website |
|---|---|---|---|---|---|---|
| gama | GAMA | Galaxy and Mass Assembly Survey | GAMA is a project to exploit the latest generation of ground-based and space-borne survey facilities to study cosmology and galaxy formation and evolution. At the heart of this project lies the GAMA spectroscopic survey of ~300,000 galaxies down to r < 19.8 mag over ~286 deg², carried out using the AAOmega multi-object spectrograph on the Anglo-Australian Telescope (AAT) by the GAMA team. This project was awarded 210 nights over 7 years (2008–2014) and the observations are now completed. This survey builds on, and is augmented by, previous spectroscopic surveys such as the Sloan Digital Sky Survey (SDSS), the 2dF Galaxy Redshift Survey (2dFGRS) and the Millennium Galaxy Catalogue (MGC). | Simon Driver | Simon Driver | |
Please name this file: <survey>_survey_meta.txt, replacing <survey> in the filename with the name in the first column.
See file content information below:
- <survey>_survey_meta.txt
This file should contain the following columns:
- name(required=True, type=string, max_limit=100)
A machine-readable, lowercase version of your survey name (e.g., GAMA would be entered as gama). Use only alphanumeric characters (no spaces). This must be unique within Data Central. This should match the string used in the file name.
- pretty_name(required=True, type=string, max_limit=100)
A human-readable version of the survey name. This can contain any characters (up to the character limit), and will be used for display.
- title(required=True, type=string, max_limit=100)
A longer version of the survey name (likely an expansion of the acronym!).
- description(required=True, type=string, max_limit=500)
A succinct paragraph describing your survey.
- pi(required=True, type=string, max_limit=100)
The name of the Principal Investigator.
- contact(required=True, type=string, max_limit=100)
Format as: John Smith <john.smith@institute.org>.
- website(required=True, type=string, max_limit=500)
The survey team’s website for public use (e.g., https://devilsurvey.org/).
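The column rules above can be checked before a merge request is opened. The following is a minimal sketch, not Data Central's own validator: it assumes max_limit is a character count, that values must not contain the pipe delimiter, and that the file is written as a header line followed by one row without padding. The names SURVEY_LIMITS, check_survey_row and format_survey_meta are hypothetical.

```python
import re

# Column name -> max_limit, as listed above (assumed to be a character count).
SURVEY_LIMITS = {
    "name": 100,
    "pretty_name": 100,
    "title": 100,
    "description": 500,
    "pi": 100,
    "contact": 100,
    "website": 500,
}

# Matches the documented form: John Smith <john.smith@institute.org>
CONTACT_PATTERN = re.compile(r".+ <[^<>@\s]+@[^<>@\s]+>")


def check_survey_row(row: dict[str, str]) -> list[str]:
    """Return a list of problems found in one survey row (empty if none)."""
    problems = []
    for column, limit in SURVEY_LIMITS.items():
        value = row.get(column, "")
        if not value:
            problems.append(f"{column}: required")
        elif len(value) > limit:
            problems.append(f"{column}: {len(value)} characters exceeds {limit}")
        if "|" in value:
            problems.append(f"{column}: contains the delimiter '|'")

    name = row.get("name", "")
    if name and not re.fullmatch(r"[a-z0-9]+", name):
        problems.append("name: use lowercase alphanumeric characters only")

    contact = row.get("contact", "")
    if contact and not CONTACT_PATTERN.fullmatch(contact):
        problems.append("contact: expected 'John Smith <john.smith@institute.org>'")
    return problems


def format_survey_meta(row: dict[str, str]) -> str:
    """Render a header line and one row, pipe-delimited (layout is an assumption)."""
    header = "|".join(SURVEY_LIMITS)
    record = "|".join(row[column] for column in SURVEY_LIMITS)
    return f"{header}\n{record}\n"
```

Running check_survey_row on your row first, and writing the output of format_survey_meta to <survey>_survey_meta.txt only when the list of problems is empty, keeps formatting mistakes out of the review cycle.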
Note
If you have previously hosted data with Data Central, you can skip providing the <survey>_survey_meta.txt file. We will already have this from your previous data release.
<survey>_<datarelease>_data_release_meta.txt
This file describes your data release at a high level, and is used to populate the Schema Browser. It is uneditable once the data are released. Remember, documentation intended for public use and/or subject to change (e.g., with detailed descriptions of analysis or version update information) should be maintained by the survey teams in Data Central’s Documentation portal, Documentation Central, not the description field below.
Provide the following in a single pipe-delimited .txt file containing an entry (row) for your data release:
| name | pretty_name | version | data_release_number | contact | public | group |
|---|---|---|---|---|---|---|
| dr2 | Data Release 2 | 1.0 | 2 | Simon Driver <simon.driver@uwa.edu.au> | False | gama |
Please name this file: <survey>_<datarelease>_data_release_meta.txt, replacing <survey> in the filename with the survey name, as above, and replacing <datarelease> in the filename with the name in the first column of this file.
See file content information below:
- <survey>_<datarelease>_data_release_meta.txt
This file should contain the following columns:
- name(required=True, type=string, max_limit=100)
A machine-readable, lowercase version of your data release name (e.g., Data Release 2 would be entered as dr2). Use only alphanumeric characters (no spaces). This must be unique within your survey (you cannot release dr2 twice!). This should match the string used for the data release in the file name.
- pretty_name(required=True, type=string, max_limit=100)
A human-readable version of the data release name. This can contain any characters (up to the character limit), and will be used for display.
- version(required=True, type=float, default=1.0)
The version of the data release data. This is meaningful to the team only, and should be described in the team-curated Documentation.
- data_release_number(required=True, type=int, default=1)
The number of the data release within Data Central. For example, if this data release is named “Final Data Release” and corresponds to the 5th data release, we need a numeric representation here so that the data releases are shown sequentially in the Schema Browser.
- contact(required=True, type=string, max_limit=100)
Format as: John Smith <john.smith@institute.org>
- public(required=False, type=bool, default=True)
A flag to indicate whether the data release should be public, or restricted to a particular group of users.
- group(required=False, type=string, max_limit=100)
If the public flag is False, this column is required; otherwise it should be omitted. Set it to the group of users to which the data release is restricted.
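The typed columns and the public/group rule can be expressed as a small parser. This is a sketch under assumptions: it takes a row as a dictionary of strings, accepts "True"/"False" in any letter case for the public column, and treats an empty or absent group as omitted. The names DataRelease, parse_bool and parse_data_release are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataRelease:
    name: str
    pretty_name: str
    version: float
    data_release_number: int
    contact: str
    public: bool = True          # documented default
    group: Optional[str] = None  # only set for restricted releases


def parse_bool(text: str) -> bool:
    """Convert 'True'/'False' (any case) to a bool; reject anything else."""
    lowered = text.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"public: expected True or False, got {text!r}")


def parse_data_release(row: dict[str, str]) -> DataRelease:
    """Coerce one data release row to typed values and apply the group rule."""
    public = parse_bool(row["public"]) if row.get("public") else True
    group = row.get("group") or None

    if not public and group is None:
        raise ValueError("group: required when public is False")
    if public and group is not None:
        raise ValueError("group: should be omitted when public is True")

    return DataRelease(
        name=row["name"],
        pretty_name=row["pretty_name"],
        version=float(row["version"]),  # raises ValueError if not numeric
        data_release_number=int(row["data_release_number"]),
        contact=row["contact"],
        public=public,
        group=group,
    )
```

With the example row above, this yields version 1.0, data_release_number 2, public False and group "gama"; the same row without a group is rejected.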
The basic models are now in place to begin ingesting data. Since your data release will likely contain catalogues, we recommend you start there.
Read Next: