Catalogues

Ingesting catalogues into Data Central allows for SQL/ADQL querying, and broadcasts the table(s) through the Data Central TAP server. The Schema Browser will also include an entry for each catalogue.

If you provide an input catalogue as part of the catalogue ingestion (as described later in this article), additional functionality is provided:

  • sources will appear in the Name Resolver and can be automatically resolved by the image cutout.

  • sources will appear in the Cone Search.

  • sources will be available in the Single Object Viewer. Individual data products (IFS, Spectra) can be linked to a source, and custom SQL run to populate the SOV with particular rows from your catalogues.

Note

In Data Central, source is used interchangeably with AstroObject. It is an astronomical object, defined by the survey team, with positional information to which individual data product files can be linked.

Note

Remember that the documentation mentioned here is the static, paper-like documentation; the documentation on Document Central is entirely separate.

Data Model

Data Central’s ingestion process will map your data onto the Data Central data model format. Within Data Central, catalogue data are organised hierarchically, as per:

Survey
└── DataRelease
    └── Schema:Catalogues
        └── Group
            └── Table

There are dozens of tables from multiple surveys in the Data Central database. Groups collect scientifically related tables together, to help the user locate the correct table more quickly. To explore the data model further, visit the catalogue section of the Schema Browser, which shows the relationships between groups and tables.

Directory Structure

To ingest data into Data Central, you will provide two folders, one containing the data products themselves, and one containing the metadata.

Data

The catalogues directory should contain the catalogue files themselves.

data
└── <survey>
    └── <datarelease>
        └── catalogues
            ├── my_input_cat.fits
            └── my_output_table.csv

Attention

<survey> and <datarelease> should be replaced with the values you chose in Getting Started, e.g., gama and dr2.

Data Central supports catalogues/tables in .csv or .fits formats.

Danger

If your input table is > 2GB in size, please ensure the format is .csv (not fits).

Metadata

The following file structure should be adopted. A top-level <survey> directory should contain a single directory per <datarelease>. Both directories should have metadata files described below which will populate the Schema Browser.

metadata
└── <survey>
    ├── <survey>_survey_meta.txt
    └── <datarelease>
        ├── <survey>_<datarelease>_data_release_meta.txt
        └── catalogues/
            ├── <survey>_<datarelease>_column_meta.txt
            ├── <survey>_<datarelease>_coordinate_meta.txt  ** optional
            ├── <survey>_<datarelease>_group_meta.txt
            ├── <survey>_<datarelease>_sql_meta.txt         ** optional
            ├── <survey>_<datarelease>_table_meta.txt
            └── docs/

The metadata/catalogues/ directory will contain a minimum of 3 metadata files, plus a docs/ directory if you have supplied additional documentation for a particular catalogue. The two optional metadata files (coordinate_meta and sql_meta) are described later in this article.
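As a quick pre-submission check, the layout above can be expressed as a list of expected paths. This is only a sketch: the function names are hypothetical, and the root folder name metadata is taken from the tree above rather than from any Data Central tool.

```python
from pathlib import Path


def expected_metadata_files(survey, datarelease, root="metadata"):
    """Return (required, optional) metadata paths for one data release.

    The file names follow the directory tree shown above; this helper is
    illustrative and is not part of Data Central.
    """
    base = Path(root) / survey
    cat_dir = base / datarelease / "catalogues"
    prefix = f"{survey}_{datarelease}"
    required = [
        base / f"{survey}_survey_meta.txt",
        base / datarelease / f"{prefix}_data_release_meta.txt",
        cat_dir / f"{prefix}_column_meta.txt",
        cat_dir / f"{prefix}_group_meta.txt",
        cat_dir / f"{prefix}_table_meta.txt",
    ]
    optional = [
        cat_dir / f"{prefix}_coordinate_meta.txt",
        cat_dir / f"{prefix}_sql_meta.txt",
    ]
    return required, optional


def missing_required(survey, datarelease, root="metadata"):
    """List the required metadata files that do not exist on disk yet."""
    required, _ = expected_metadata_files(survey, datarelease, root)
    return [path for path in required if not path.is_file()]
```

Calling missing_required("sami", "dr2") from the folder that holds metadata/ returns an empty list once the three catalogue files and the two survey-level files are in place.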

Metadata Files

Attention

Metadata files are always pipe-delimited, and have the extension .txt

<survey>_<datarelease>_group_meta.txt

This file describes the groups you would like to register, and will be used to populate the Schema Browser. Provide a single pipe-delimited .txt file containing an entry (row) for each group:

name | pretty_name | description | documentation | contact | date | version
ApMatchedPhotom | ApMatchedPhotom | This group provides aperture matched ugrizYJHK photometry. | unique_group_documentation_filename.txt | name <email@institute.org> | 2012-04-23 | v02

Please name this file: <survey>_<datarelease>_group_meta.txt e.g., sami_dr2_group_meta.txt

<survey>_<datarelease>_group_meta.txt

This file should contain the following columns

name(required=True, type=char, max_limit=100)

Group name. Use only alphanumeric characters. This must be unique per data release.

pretty_name(required=True, type=char, max_limit=100)

A human-readable version of the group name. This can contain any characters (up to the character limit).

description(required=True, type=char, max_limit=1000)

A succinct paragraph describing the group.

documentation(required=True, type=char, max_limit=1000)

If you would like formatted text to appear in the schema browser, please supply the name of the file containing html-formatted text (see Formatting for more info). Note, this is typically for 2-3 paragraphs of information. Detailed documentation should be written into a Document Central article. If you do not wish to supply documentation for a particular row, leave this entry blank.

contact(required=True, type=char, max_limit=500)

Format as: John Smith <john.smith@institute.org>

date(required=True, type=char, max_limit=100)

Group creation/update date as defined by the team e.g., 2012-04-23

version(required=True, type=char, max_limit=100)

Group version as defined by the team e.g., v1.8
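The field limits listed above can be checked before submission. The sketch below is an illustration, not Data Central's own validator: the function names are hypothetical, the limits are copied from the column list above, documentation is allowed to be blank as described above, and the assumptions that a value may not itself contain a pipe and that fields are joined without padding are mine.

```python
import re

# Maximum lengths copied from the column descriptions above.
GROUP_FIELDS = {
    "name": 100,
    "pretty_name": 100,
    "description": 1000,
    "documentation": 1000,
    "contact": 500,
    "date": 100,
    "version": 100,
}


def check_group_row(row):
    """Return a list of human-readable problems found in one group entry."""
    problems = []
    for field, limit in GROUP_FIELDS.items():
        value = row.get(field, "")
        if len(value) > limit:
            problems.append(f"{field}: longer than {limit} characters")
        if "|" in value:
            # Assumption: the delimiter cannot be escaped inside a value.
            problems.append(f"{field}: contains the pipe delimiter")
        if not value and field != "documentation":
            problems.append(f"{field}: empty")
    if row.get("name") and not re.fullmatch(r"[A-Za-z0-9]+", row["name"]):
        problems.append("name: use only alphanumeric characters")
    return problems


def to_pipe_line(row):
    """Join one entry into a pipe-delimited line, in the column order above."""
    return "|".join(row.get(field, "") for field in GROUP_FIELDS)
```

An entry that passes check_group_row with an empty list can then be written out with to_pipe_line, one line per group.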

<survey>_<datarelease>_table_meta.txt

This file describes the tables you would like to register, and will be used to populate the Schema Browser and be available for public SQL/ADQL querying, as well as discoverable through the Data Central TAP server.

Please provide a single .txt file with an entry per table, containing the following meta information:

name | description | documentation | group | filename | contact | date | version
ApMatchedCat | This table contains r-band aperture matched photometry and other Source Extractor outputs for all GAMA DR2 objects. | unique_table_documentation_filename.txt | ApMatchedPhotom | ApMatchedCat.fits | name <email@institute.org> | 2012-04-23 | v02

Please name this file: <survey>_<datarelease>_table_meta.txt e.g., sami_dr2_table_meta.txt

<survey>_<datarelease>_table_meta.txt

This file should contain the following columns

name(required=True, type=char, max_limit=100)

Table name. Use only alphanumeric characters. This must be unique per data release.

description(required=True, type=char, max_limit=1000)

A succinct paragraph describing the table.

documentation(required=True, type=char, max_limit=1000)

If you would like formatted text to appear in the schema browser, please supply the name of the file containing html-formatted text (see Formatting for more info). Note, this is typically for 2-3 paragraphs of information. Detailed documentation should be written into a Document Central article. If you do not wish to supply documentation for a particular row, leave this entry blank.

group_name(required=True, type=char, max_limit=100)

The name of the group (must match a group name from the <survey>_<datarelease>_group_meta.txt file above)

filename(required=True, type=char, max_limit=1000)

The filename of the table you’ll be providing

contact(required=True, type=char, max_limit=500)

Format as: John Smith <john.smith@institute.org>

date(required=True, type=char, max_limit=100)

Table creation/update date as defined by the team e.g., 2012-04-23

version(required=True, type=char, max_limit=100)

Table version as defined by the team e.g., v1.8
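Because every table must point at a group registered in the group metadata file, a cross-check between the two files catches mismatches early. The sketch below assumes each file starts with a header row naming its columns, which this article does not state explicitly; the function names are hypothetical, and the group column name is a parameter because the example table above labels it group while the column list calls it group_name.

```python
import csv
import io


def read_pipe_table(text):
    """Parse pipe-delimited text (header row assumed) into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    rows = []
    for row in reader:
        # Strip padding around names and values; skip surplus unnamed cells.
        rows.append(
            {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        )
    return rows


def unknown_groups(group_meta_text, table_meta_text, group_col="group_name"):
    """Return (table, group) pairs whose group is not in the group metadata."""
    groups = {entry["name"] for entry in read_pipe_table(group_meta_text)}
    return [
        (table["name"], table[group_col])
        for table in read_pipe_table(table_meta_text)
        if table[group_col] not in groups
    ]
```

An empty result means every table refers to a registered group.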

<survey>_<datarelease>_column_meta.txt

This file describes the columns you would like to register for each table, and will be used to populate the Schema Browser, SQL/ADQL query service, and the TAP server. Please provide a single pipe-delimited .txt file containing an entry (row) for each column:

name | table_name | description | ucd | unit | data_type
ALPHA_J2000 | ApMatchedCat | RA (r band) | pos.eq.ra;em.opt.R | deg | double
CATAID | EnvironmentMeasures | Unique GAMA ID | meta.id | | double

Please name this file: <survey>_<datarelease>_column_meta.txt e.g., sami_dr2_column_meta.txt

<survey>_<datarelease>_column_meta.txt

This file should contain the following columns

name(required=True, type=char, max_limit=100)

Column name. Use only letters, numbers and underscores.

Attention

Column names must be SQL-queryable: use only letters, numbers and underscores. Column names cannot start with a number, but can include numbers afterwards. Forbidden characters include: %^&({}+-/ ][‘’’

description(required=True, type=char, max_limit=1000)

A succinct paragraph describing the column.

table_name(required=True, type=char, max_limit=100)

The name of the table (must match a table name from the <survey>_<datarelease>_table_meta.txt file above)

ucd(required=True, type=char, max_limit=100)

UCDs can be found here: http://cds.u-strasbg.fr/UCD/tree/js/ (more info: https://arxiv.org/pdf/1110.0525.pdf)

unit(required=True, type=char, max_limit=100)

Column unit

data_type(required=True, type=char, max_limit=100)

Data type of the column. Use the full name of the data type (e.g., integer) rather than a shortened form (e.g., int).
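The column naming rule above (letters, numbers and underscores, not starting with a number) can be written as a single regular expression. This is a sketch of that rule as stated here, with a hypothetical function name; Data Central may apply further checks of its own.

```python
import re

# Letters, digits and underscores only; the first character may not be a digit.
_COLUMN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_column_name(name):
    """Return True if the name satisfies the column naming rule above."""
    return _COLUMN_NAME.fullmatch(name) is not None


def invalid_column_names(names):
    """Return the names that break the rule, in their original order."""
    return [name for name in names if not is_valid_column_name(name)]
```

For example, invalid_column_names(["ALPHA_J2000", "2MASS_ID", "flux-r"]) returns the last two names, which start with a number and contain a hyphen respectively.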

Extra Requirements

Note: this section is optional. You do not have to provide a _coordinate_meta.txt or _sql_meta.txt file if your data release does not lend itself to individual source identification.

You cannot ingest individual data products associated with a single astronomical object without completing this step.

By providing a metadata file pointing to the input catalogue of your data release, Data Central is able to populate the database with Astronomical Objects from your survey. These objects are then accessible in the Name Resolver and the Cone Search (as well as the image cutout overplotting functionality).

Source Catalogue Identification

To identify a table as a source catalogue (i.e. an input catalogue that has one row per source in your data release), please provide a metadata file that contains the name of a single table that contains the resolver info (source name, coordinates, format), as per:

table_name | source_name_col | long_col | lat_col | long_format | lat_format | frame | equinox
InputCatA | CATAID | RA_deg | Dec_deg | deg | deg | icrs |

Please name this file: <survey>_<datarelease>_coordinate_meta.txt e.g., sami_dr2_coordinate_meta.txt

Danger

If your input table is > 2GB in size, please ensure the format is .csv (not fits).

Tip

It is advised to provide coordinates as RA, Dec (degrees, degrees). If your Long/Lat fields are not in an ICRS coordinate frame (degrees), Data Central will auto-generate these columns.

<survey>_<datarelease>_coordinate_meta.txt

This file should contain the following columns

table_name(required=True, type=char, max_limit=100)

The table name (not filename) to be used (must have an entry in the <survey>_<datarelease>_table_meta.txt file.)

source_name_col(required=True, type=char, max_limit=100)

The column name for source name (from the specified table)

long_col(required=True, type=char, max_limit=100)

The column name for longitude (from the specified table)

lat_col(required=True, type=char, max_limit=100)

The column name for latitude (from the specified table)

long_format(required=True, type=char, max_limit=100)

The longitude format. Depending on the formatting of your coordinate values (i.e., whether decimal, space-delimited or colon-delimited) and the value of long_format/lat_format (deg or hourangle), coordinate data are interpreted as:

value | format | interpretation
10.2345 | deg | Degrees
1 2 3 | deg | Degrees, arcmin, arcsecond
1:2:30.40 | deg | Sexagesimal degrees
1 2 0 | hourangle | Sexagesimal hours

lat_format(required=True, type=char, max_limit=100)

The latitude format. Depending on the formatting of your coordinate values (i.e., whether decimal, space-delimited or colon-delimited) and the value of long_format/lat_format (deg or hourangle), coordinate data are interpreted as:

value | format | interpretation
10.2345 | deg | Degrees
1 2 3 | deg | Degrees, arcmin, arcsecond
1:2:30.40 | deg | Sexagesimal degrees
1 2 0 | hourangle | Sexagesimal hours

frame(required=True, type=char, max_limit=100)

Coordinate frame. Accepted values are (fk5, fk4, icrs, galactic, supergalactic)

equinox(required=True, type=char, max_limit=100)

If appropriate (leave blank for icrs), the equinox of this frame. Accepted values are (j2000, j1950, b1950)

Data Central will auto-generate ICRS-frame RA(deg) Dec(deg) columns if that format has not been provided. Data Central is able to transform using the combinations of coordinate systems listed above. If you do not see the coordinate system your data are currently recorded in, it is advised to generate RA, Dec columns as ICRS for your catalogue to be included.
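To make the interpretation rules for long_format and lat_format concrete, the sketch below converts a coordinate string to decimal degrees using only the Python standard library. It is an illustration of the value/format table above, not Data Central's conversion code: the function name is hypothetical, and it does not transform between frames (for example fk5 to icrs), which needs a dedicated astronomy library.

```python
import re


def to_decimal_degrees(value, fmt):
    """Convert a coordinate string to decimal degrees.

    value: decimal ("10.2345"), space-delimited ("1 2 3") or
           colon-delimited ("1:2:30.40") text.
    fmt:   "deg" or "hourangle", as in long_format / lat_format.
    """
    parts = [p for p in re.split(r"[:\s]+", value.strip()) if p]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"cannot interpret coordinate: {value!r}")

    # Take the sign from the leading field so that "-0:30:00" stays negative.
    sign = -1.0 if parts[0].startswith("-") else 1.0
    numbers = [abs(float(p)) for p in parts]
    numbers += [0.0] * (3 - len(numbers))

    # Whole units, then sixtieths, then sixtieths of sixtieths.
    magnitude = numbers[0] + numbers[1] / 60.0 + numbers[2] / 3600.0

    if fmt == "hourangle":
        magnitude *= 15.0  # 24 hours correspond to 360 degrees
    elif fmt != "deg":
        raise ValueError(f"unsupported format: {fmt!r}")
    return sign * magnitude
```

With these rules, "1 2 0" read as hourangle and "15:30:00" read as deg describe the same angle, 15.5 degrees.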

Danger

The values in source_name_col must be unique across all of the tables included in the source catalogue. If none of your existing tables meet this requirement, then you will need to generate a new table, which need only include source name, RA and Dec. You do not need to include this table in the column or table metadata files, but it would be preferred.
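Uniqueness of the source name column is easy to check before submission. The sketch below reads one column from a comma-separated file with a header row and reports any repeated values; the function names are hypothetical, and a .fits table would need to be read with a FITS library instead.

```python
import csv
from collections import Counter


def source_names_from_csv(path, source_name_col):
    """Read the source name column from a .csv catalogue with a header row."""
    with open(path, newline="") as handle:
        return [row[source_name_col] for row in csv.DictReader(handle)]


def duplicated_source_names(names):
    """Return the source names that occur more than once, sorted."""
    counts = Counter(str(name).strip() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty list from duplicated_source_names means the column can serve as source_name_col.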

SOV SQL Functionality

If you would like your tables to be queried and the resulting rows displayed in the Single Object Viewer, please provide the following table:

table_name | sql
InputCatA | "SELECT * FROM gama_dr2.InputCatA WHERE CATAID = {objid}"

Please name this file: <survey>_<datarelease>_sql_meta.txt e.g., sami_dr2_sql_meta.txt

<survey>_<datarelease>_sql_meta.txt

This file should contain the following columns

table_name(required=True, type=char, max_limit=100)

The table name (not filename) to be used (must have an entry in the <survey>_<datarelease>_table_meta.txt file.)

sql(required=True, type=char, max_limit=100)

The SQL expression (following the DC syntax of survey_dr.table_name) to be run on SOV load. {objid} will be replaced by the AstroObject being requested.

Attention

Ensure your SQL runs before submitting this metadata file, e.g., check whether you need single quotes around the objid, as in '{objid}'.
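One way to test a template locally is to substitute an identifier for {objid} and run the result against a small stand-in database. The sketch below uses SQLite from the Python standard library purely as a stand-in: Data Central's database engine and its substitution mechanism are not described here, and render_sov_sql and dry_run are hypothetical helper names. Plain string substitution is acceptable for a local dry run, but should not be used with untrusted input.

```python
import sqlite3


def render_sov_sql(template, objid):
    """Substitute the object identifier for the {objid} placeholder."""
    return template.replace("{objid}", str(objid))


def dry_run(template, objid, setup_sql, schema="gama_dr2"):
    """Run a rendered template against a throwaway in-memory database.

    setup_sql creates and fills a small stand-in table; schema is attached
    so that the survey_dr.table_name syntax resolves.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")
        conn.executescript(setup_sql)
        return conn.execute(render_sov_sql(template, objid)).fetchall()
    finally:
        conn.close()


# A stand-in table whose identifier column holds text rather than numbers.
SETUP = """
CREATE TABLE gama_dr2.InputCatA (CATAID TEXT, RA_deg REAL);
INSERT INTO gama_dr2.InputCatA VALUES ('G6802', 174.0);
"""

QUOTED = "SELECT * FROM gama_dr2.InputCatA WHERE CATAID = '{objid}'"
UNQUOTED = "SELECT * FROM gama_dr2.InputCatA WHERE CATAID = {objid}"
```

With a text identifier such as G6802, dry_run(QUOTED, "G6802", SETUP) returns the matching row, whereas the unquoted template fails in SQLite because G6802 is parsed as a column name. A purely numeric identifier works without quotes.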