Getting Started

Data Central can currently ingest, link, cross-match and serve spectra, images, catalogues and IFS data products. We provide detailed descriptions of how to format your data depending on its type, but first follow the instructions below to set up the basic file structure and metadata files for your data release.

Metadata files are provided alongside your data and allow us to populate the Schema Browser to best describe your data. These files cannot be edited once your data has been released via Data Central. Before the final sign-off from the survey team, we will go through an iterative process with you to verify that the data are correct and displayed as expected; during that stage you are free to make changes to the metadata files.

Documentation intended for public consumption that you will want to update frequently (e.g., with detailed descriptions of analysis or version update information) should be maintained by the survey teams themselves in Data Central’s Documentation portal: Documentation Central. Contact us to get your team set up within the system if you have not done so already.

Whilst we can be flexible about how we receive the data and metadata, here’s what we see as the quickest and easiest way to get your data ingested:

  1. Create a Data Central Account at https://accounts.datacentral.org.au/register/ (if you have not already done so).

  2. Request a team at https://teams.datacentral.org.au/request-team/ (if you do not already have a team for the survey/data). The team will need to be approved by a Data Central admin.

  3. Log in with your Data Central account at https://dev.aao.org.au/ (using the Data Central button), as this will give you git access to the repository that stores all the metadata. If multiple members of the team want to update the metadata, have them also create Data Central accounts and log in to https://dev.aao.org.au/, then include their usernames in a ticket at https://jira.aao.org.au/servicedesk/customer/portal/3/create/28.

  4. Upload your data in the correct layout (as noted below) to https://cloud.datacentral.org.au/.

  5. Make a merge request on https://dev.aao.org.au/ with the metadata as documented below.

  6. Check the data in the pre-release link we (the Data Central team) send you, and sign off on the release.

Directory Structure

To ingest data into Data Central, you will provide two folders: one containing the data products themselves and one containing the metadata. The data/ and metadata/ directories will be further populated according to the types of data products you wish to release (see the Data Types section for the specific requirements for catalogues, IFS, spectra, etc.).

Data

A top-level <survey> directory should contain a single directory per <datarelease>.

data
└── <survey>
    └── <datarelease>

Metadata

The following file structure should be adopted. A top-level <survey> directory should contain a single directory per <datarelease>. Both the <survey> and <datarelease> directories should contain the metadata files described below, which populate the Schema Browser.

metadata
└── <survey>
    ├── <survey>_survey_meta.txt
    └── <datarelease>
        └── <survey>_<datarelease>_data_release_meta.txt
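
The layout above can be scaffolded with a short script. The following is a minimal sketch, not an official Data Central tool: the function name create_skeleton and the choice to create empty placeholder metadata files are assumptions made for illustration.

```python
from pathlib import Path


def create_skeleton(root: Path, survey: str, datarelease: str) -> list[Path]:
    """Create the data/ and metadata/ trees and return the metadata file paths.

    Hypothetical helper: it only mirrors the directory layout described in
    this section and leaves the metadata files empty.
    """
    data_dir = root / "data" / survey / datarelease
    survey_meta_dir = root / "metadata" / survey
    release_meta_dir = survey_meta_dir / datarelease

    data_dir.mkdir(parents=True, exist_ok=True)
    release_meta_dir.mkdir(parents=True, exist_ok=True)

    meta_files = [
        survey_meta_dir / f"{survey}_survey_meta.txt",
        release_meta_dir / f"{survey}_{datarelease}_data_release_meta.txt",
    ]
    for path in meta_files:
        path.touch(exist_ok=True)  # empty placeholder, to be filled in later
    return meta_files
```

Calling create_skeleton(Path("."), "gama", "dr2") would create data/gama/dr2/ together with metadata/gama/gama_survey_meta.txt and metadata/gama/dr2/gama_dr2_data_release_meta.txt.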

Metadata Files

Attention

Metadata files are always pipe-delimited and have the extension .txt.

There are two files you’ll need to provide to initialize your survey and data release in Data Central, <survey>_survey_meta.txt and <survey>_<datarelease>_data_release_meta.txt.

<survey>_survey_meta.txt

This file describes your survey at a high level and is used to populate the Schema Browser. It cannot be edited once the data are released; detailed documentation that the team can edit should be written as a Documentation Central article (contact us if you do not yet have an account).

Provide a single pipe-delimited .txt file containing one entry (row) for your survey, with the following columns:

name | pretty_name | title | description | pi | contact | website
gama | GAMA | Galaxy and Mass Assembly Survey | GAMA is a project to exploit the latest generation of ground-based and space-borne survey facilities to study cosmology and galaxy formation and evolution. At the heart of this project lies the GAMA spectroscopic survey of ~300,000 galaxies down to r < 19.8 mag over ~286 deg², carried out using the AAOmega multi-object spectrograph on the Anglo-Australian Telescope (AAT) by the GAMA team. This project was awarded 210 nights over 7 years (2008–2014) and the observations are now completed. This survey builds on, and is augmented by, previous spectroscopic surveys such as the Sloan Digital Sky Survey (SDSS), the 2dF Galaxy Redshift Survey (2dFGRS) and the Millennium Galaxy Catalogue (MGC). | Simon Driver | Simon Driver <simon.driver@uwa.edu.au> | http://www.gama-survey.org/

Replace <survey> in the filename with the value for name in the first column.
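
As an illustration, a file of this shape could be generated with a few lines of Python. This is a sketch under stated assumptions rather than a required workflow: the helper write_survey_meta is hypothetical, the file is assumed to start with a header line of column names, and fields are assumed not to contain the "|" character or line breaks (the sketch rejects them instead of guessing at an escaping rule). The description is shortened here for brevity.

```python
from pathlib import Path

SURVEY_COLUMNS = ["name", "pretty_name", "title", "description", "pi", "contact", "website"]


def write_survey_meta(directory: Path, row: dict[str, str]) -> Path:
    """Write <name>_survey_meta.txt as a header line plus one data row."""
    values = [row[column] for column in SURVEY_COLUMNS]
    for column, value in zip(SURVEY_COLUMNS, values):
        # Assumption: no escaping mechanism exists, so refuse ambiguous values.
        if "|" in value or "\n" in value:
            raise ValueError(f"{column!r} must not contain '|' or a line break")
    # The file name takes its <survey> part from the "name" column.
    path = directory / f"{row['name']}_survey_meta.txt"
    lines = ["|".join(SURVEY_COLUMNS), "|".join(values)]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


gama_row = {
    "name": "gama",
    "pretty_name": "GAMA",
    "title": "Galaxy and Mass Assembly Survey",
    "description": "GAMA is a project to exploit the latest generation of "
                   "ground-based and space-borne survey facilities to study "
                   "cosmology and galaxy formation and evolution.",
    "pi": "Simon Driver",
    "contact": "Simon Driver <simon.driver@uwa.edu.au>",
    "website": "http://www.gama-survey.org/",
}
```

With the metadata/gama/ directory already created, write_survey_meta(Path("metadata/gama"), gama_row) would produce gama_survey_meta.txt.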

<survey>_survey_meta.txt

This file should contain the following columns:

name(required=True, type=char, max_limit=100)

A machine-readable, lowercase version of your survey name (e.g., GAMA would be entered as gama). Use only alphanumeric characters, with no spaces, and do not start the name with a digit. The name must be unique within Data Central (we cannot support multiple surveys named ‘DEVILS’, so you would need to choose a new name) and should match the string used in the file name.

pretty_name(required=True, type=char, max_limit=100)

A human-readable version of the survey name. This can contain any characters (up to the character limit).

title(required=True, type=char, max_limit=100)

A longer version of the survey name (likely an expansion of the acronym).

description(required=True, type=char, max_limit=1000)

A succinct paragraph describing your survey.

pi(required=True, type=char, max_limit=100)

The name of the Principal Investigator.

contact(required=True, type=char, max_limit=100)

Format as: John Smith <john.smith@institute.org>

website(required=True, type=char, max_limit=500)

The survey team’s website for public consumption (e.g., https://devilsurvey.org/).
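
The column rules above can be checked before submitting a merge request. The sketch below is illustrative only and is not the validation Data Central itself runs: check_survey_row and both regular expressions are assumptions, with "alphanumeric" interpreted as ASCII letters and digits and the contact format interpreted as a name followed by an address in angle brackets.

```python
import re

# Character limits taken from the column definitions above.
MAX_LENGTHS = {
    "name": 100,
    "pretty_name": 100,
    "title": 100,
    "description": 1000,
    "pi": 100,
    "contact": 100,
    "website": 500,
}
# Lowercase ASCII alphanumerics, not starting with a digit (assumed reading).
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*")
# "John Smith <john.smith@institute.org>" (loose, assumed reading).
CONTACT_PATTERN = re.compile(r"[^<>]+ <[^<>@\s]+@[^<>@\s]+>")


def check_survey_row(row: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for column, limit in MAX_LENGTHS.items():
        value = row.get(column, "")
        if not value:
            problems.append(f"{column}: required value is missing")
        elif len(value) > limit:
            problems.append(f"{column}: {len(value)} characters exceeds the limit of {limit}")
    if row.get("name") and not NAME_PATTERN.fullmatch(row["name"]):
        problems.append("name: use lowercase letters and digits only, not starting with a digit")
    if row.get("contact") and not CONTACT_PATTERN.fullmatch(row["contact"]):
        problems.append("contact: expected the form 'John Smith <john.smith@institute.org>'")
    return problems
```

Uniqueness of the name within Data Central cannot be checked locally, so that rule still needs confirming with the Data Central team.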

Note

If you have previously hosted data with Data Central, you can skip providing the <survey>_survey_meta.txt file. We’ll already have this from your previous data release.

<survey>_<datarelease>_data_release_meta.txt

This file describes your data release at a high level and is used to populate the Schema Browser. It cannot be edited once the data are released; detailed documentation that the team can edit should be written as a Documentation Central article (contact us if you do not yet have an account).

Provide a single pipe-delimited .txt file containing one entry (row) for your data release, with the following columns:

name | pretty_name | version | data_release_number | contact
dr2 | Data Release 2 | 1 | 2 | Simon Driver <simon.driver@uwa.edu.au>

Replace <datarelease> in the filename with the value for name in the first column.

<survey>_<datarelease>_data_release_meta.txt

This file should contain the following columns:

name(required=True, type=char, max_limit=100)

A machine-readable, lowercase version of your data release name (e.g., Data Release 2 would be entered as dr2). Use only alphanumeric characters (no spaces). This must be unique within your survey (you cannot release dr2 twice!). This should match the string used for the data release in the file name.

pretty_name(required=True, type=char, max_limit=100)

A human-readable version of the data release name. This can contain any characters (up to the character limit).

version(required=True, type=float, default=1.0)

The version of the data in this release. This is meaningful to the team only and should be described in the team-curated documentation.

data_release_number(required=True, type=int, default=1)

The number of the data release within Data Central. For example, if this data release is named “Final Data Release” and is the 5th data release, enter 5; we need a numeric value here so that the Schema Browser can show the data releases in sequence.

contact(required=True, type=char, max_limit=100)

Format as: John Smith <john.smith@institute.org>
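
Because version and data_release_number are typed columns, it can help to parse a draft file and convert those values before submitting it. The following is a minimal sketch under assumptions: parse_release_meta is a hypothetical helper, the file is assumed to consist of a header line followed by exactly one data row with no spaces around the pipes, and the name check interprets "alphanumeric" as lowercase ASCII letters and digits.

```python
import re

RELEASE_COLUMNS = ["name", "pretty_name", "version", "data_release_number", "contact"]


def parse_release_meta(text: str) -> dict:
    """Parse a data release metadata file and convert the typed columns."""
    lines = text.strip().splitlines()
    if len(lines) != 2:
        raise ValueError("expected a header line and exactly one data row")
    header, row = (line.split("|") for line in lines)
    if header != RELEASE_COLUMNS:
        raise ValueError(f"unexpected columns: {header}")
    if len(row) != len(header):
        raise ValueError(f"expected {len(header)} values, found {len(row)}")

    record = dict(zip(header, row))
    if not re.fullmatch(r"[a-z0-9]+", record["name"]):
        raise ValueError("name: use lowercase letters and digits only")
    # float() and int() raise ValueError themselves on non-numeric input.
    record["version"] = float(record["version"])
    record["data_release_number"] = int(record["data_release_number"])
    return record


example = (
    "name|pretty_name|version|data_release_number|contact\n"
    "dr2|Data Release 2|1|2|Simon Driver <simon.driver@uwa.edu.au>\n"
)
record = parse_release_meta(example)
```

A non-numeric data_release_number such as "Final" makes int() raise a ValueError, which surfaces the kind of problem the data_release_number column is meant to prevent.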

With these two files provided, the basic models are in place to begin ingesting data. Since your data release will likely contain catalogues, we recommend you start there.
