Data File Formats

All FITS files

FITS files can themselves contain significant amounts of metadata, via the header in each extension. Using the correct headers and including all the relevant metadata makes it easier for Data Central to ingest and manage your survey data, and reduces the number of issues that users of the survey will encounter. To this end, here are some things you can do to ensure that your FITS files are in the best condition:

  1. astropy contains tools for checking the correctness of FITS files and their metadata. Run these tools on your FITS files before uploading them to Data Central. At a minimum, we suggest running wcslint (for WCS) and verify() (for the FITS file itself).

  2. Sanity check the WCS metadata contained within the file. There may be keywords left over from the data reduction process (e.g. a 2D WCS for a 1D spectrum), or tools may have added WCS keywords which override earlier values (e.g. CRPIXj vs. CDi_j vs. PCi_j keywords). astropy may be able to pick up some of these issues, but depending on how the data is stored, this is not always possible (e.g. a 2D WCS would not be flagged for spectra stored as rows within a single HDU, even though logically each row should be treated as independent). A common cause is copying the FITS header from the unreduced data to the reduced data without checking the result. Make sure that you understand what each keyword means, and that it is still applicable. For example, 2dfdr included 2D WCS metadata describing both the wavelength-pixel mapping and the fibre-pixel mapping; if you copy the header across when you extract a single spectrum, the second WCS axis is logically invalid.

  3. Specify all the units within the metadata, via standard keywords such as BUNIT and CUNITn. This means Data Central does not need to track units separately, and that standard tools (such as astropy) can display, convert and manipulate data without users needing to manage units manually. See Units within Data Central for some unit-specific advice, which applies across all of the different data types and formats (including catalogues).

  4. Specify bibliographic and provenance information in the primary header, via standard keywords such as ORIGIN and REFERENC (please include as much as possible, even down to the specific data release the file is from). Whilst Data Central does not specifically use this information, tools such as specutils can use it to identify and load your files, and it means the files can be understood outside the context of Data Central.

  5. Don’t include images, spectra or other data from other surveys within the same FITS file. Doing so duplicates effort, increases the amount of code needed to manage these files (e.g. custom readers have to be written), and wastes storage and bandwidth. Instead, include references to other surveys via the survey object name (this is best done via catalogues, but it can additionally be stored in metadata within the headers of the relevant files). The VO (Virtual Observatory) tools are designed for this kind of work, and future iterations of Data Central will expose more and more of these inter-survey links, and hence increase the richness of the visualisations.

  6. Include metadata about the different HDUs within the FITS file. Use standard keywords (like EXTNAME) where possible, but there are other non-standard keywords such as ROWn or ARRAYn which have become unofficial conventions.
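As a rough illustration of points 1 and 2, the sketch below builds a small in-memory file, runs verify() on it, and then looks for per-axis WCS keywords that refer to an axis the data does not have. The helper leftover_wcs_keywords, the list of keyword prefixes it checks, and the example header values are assumptions made for this illustration; they are not part of astropy or of Data Central, and the check does not cover matrix keywords such as CDi_j or PCi_j.

```python
import re

import numpy as np
from astropy.io import fits

# Per-axis WCS keywords whose trailing number is the axis index.
# This prefix list is illustrative, not exhaustive.
AXIS_KEYWORD = re.compile(r"^(CRPIX|CRVAL|CDELT|CTYPE|CUNIT|CROTA)(\d+)$")


def leftover_wcs_keywords(header):
    """Return per-axis WCS keywords that refer to an axis beyond NAXIS."""
    naxis = header.get("NAXIS", 0)
    leftovers = []
    for key in header:
        match = AXIS_KEYWORD.match(key)
        if match and int(match.group(2)) > naxis:
            leftovers.append(key)
    return leftovers


# Hypothetical 1D spectrum whose header was copied from a 2D frame.
hdu = fits.PrimaryHDU(data=np.ones(100, dtype=np.float32))
hdu.header["CRPIX1"] = 1.0
hdu.header["CRVAL1"] = 3700.0
hdu.header["CDELT1"] = 1.0
hdu.header["CRPIX2"] = 1.0  # left over from the fibre axis
hdu.header["CRVAL2"] = 1.0  # left over from the fibre axis
hdul = fits.HDUList([hdu])

# Raises astropy.io.fits.VerifyError if the file violates the FITS standard.
hdul.verify("exception")

# The file is valid FITS, yet it still carries WCS keywords for a second axis.
leftovers = leftover_wcs_keywords(hdul[0].header)
print(leftovers)
```

Note that verify() passes here even though the second-axis keywords are meaningless for this spectrum, which is why a separate sanity check of the WCS metadata is still worthwhile.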

A reference containing all the standard FITS keywords can be found at https://heasarc.gsfc.nasa.gov/docs/fcg/standard_dict.html and further links to other sets of keywords and conventions can be found at https://fits.gsfc.nasa.gov/fits_standard.html.

Imaging

Coming soon.

IFS

cube_blue/cube_red

Coming soon.

Spectra

spectrum_1d

1D spectral FITS files must contain a single source spectrum, stored in a single row. Flux, signal-to-noise (S/N) and sky emission can be included in different FITS extensions.
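One possible layout is sketched below with astropy: the flux is stored as a 1D array in the primary HDU, with signal-to-noise and sky emission in separate image extensions. The array values, the extension names SN and SKY, and the choice of which HDU holds which quantity are all assumptions made for this illustration.

```python
import numpy as np
from astropy.io import fits

n_pixels = 200

# Placeholder data: each quantity is a single row (a 1D array).
flux = np.ones(n_pixels, dtype=np.float32)
sn = np.full(n_pixels, 10.0, dtype=np.float32)
sky = np.zeros(n_pixels, dtype=np.float32)

hdul = fits.HDUList([
    fits.PrimaryHDU(data=flux),         # HDU 0: source flux
    fits.ImageHDU(data=sn, name="SN"),  # HDU 1: signal-to-noise
    fits.ImageHDU(data=sky, name="SKY"),  # HDU 2: sky emission
])

# Summarise which extension holds what, and the shape of each array.
layout = [(index, hdu.name, hdu.data.shape) for index, hdu in enumerate(hdul)]
print(layout)
```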

If you are registering 1D spectra in your <survey>_<datarelease>_product_meta.txt file, you must also include the following columns in that file:

Column                  Description
----------------------  ------------------------------------------------------------
fluxHDU                 Number of extensions containing source flux
snHDU                   Number of extensions containing source signal to noise (can be blank if not provided)
skyHDU                  Number of extensions containing sky emission (can be blank if not provided)
fluxUnitKeyword         FITS header keyword containing flux units (can be ‘ergs/sec/cm^2/ang’, ‘ergs/sec/cm^2/Hz’, ‘jansky’, ‘counts’)
fluxScaleKeyword        FITS header keyword containing scaling for flux units (numeric)
wavelengthUnitKeyword   FITS header keyword containing wavelength units (can be ‘ang’, ‘nm’, ‘micron’, ‘m’)
CRPIXKeyword            FITS header keyword containing pixel reference point for wavelength scale.
CRVALKeyword            FITS header keyword containing pixel reference point value for wavelength scale.
CDELTKeyword            FITS header keyword containing pixel width for wavelength scale.
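To show how the three wavelength-scale keywords fit together, the sketch below reconstructs a wavelength axis from them. It assumes a linear wavelength scale and the usual FITS convention that the first pixel is pixel 1. The function name wavelength_axis, the keyword names CRPIX1, CRVAL1 and CDELT1, and the header values are illustrative only; a plain dictionary stands in for a FITS header.

```python
def wavelength_axis(header, n_pixels,
                    crpix_key="CRPIX1", crval_key="CRVAL1", cdelt_key="CDELT1"):
    """Return the wavelength of each pixel for a linear wavelength scale.

    The keyword names are passed in because each survey may store the
    reference pixel, reference value and pixel width under different keywords.
    """
    crpix = float(header[crpix_key])  # reference pixel (1-based)
    crval = float(header[crval_key])  # wavelength at the reference pixel
    cdelt = float(header[cdelt_key])  # wavelength step per pixel
    return [crval + (pixel - crpix) * cdelt for pixel in range(1, n_pixels + 1)]


# Hypothetical header values for a spectrum starting at 3700 with 0.5 per pixel.
header = {"CRPIX1": 1.0, "CRVAL1": 3700.0, "CDELT1": 0.5}
wavelengths = wavelength_axis(header, 5)
print(wavelengths)
```

The resulting values are in whatever units the keyword named by wavelengthUnitKeyword declares.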

spectrum_2d

Coming soon.