NRAO Data Archive

The NRAO archive can be accessed at https://data.nrao.edu

Introduction

The NRAO archive holds all raw data observed by the VLA (historical and Jansky), VLBA, and GMVA, some GBT data (2014-2020), and also serves ALMA data. Newly observed VLA data are transferred to the NRAO archive and become available for retrieval, to those with the appropriate privileges, shortly after the end of the observations. The archive content can be accessed through the Archive Access Tool (AAT). The AAT and its behind-the-scenes supporting software receive regular updates; see their release notes for details.

When users arrive at the AAT webpage, they will see a text search bar and a list of the most recent data that have been ingested into the archive.

archive-landing-page.png


Locating data

The AAT provides a data retrieval tool that can be used either as a simple search, using keywords in the main search box, or as an advanced search, using the "Show Search Inputs" option below the search box. The Search Inputs option enables searches based on a large number of user-specified criteria. Users who just want to access their own data and know their project code can enter it in the text bar at the top and may not need to refine the search further. Note that the text search bar searches through all text in a project and may return some unexpected matches. If there are too many matches, users should attempt a more specific search using the various input fields.

Users looking to see if a particular source has been observed, however, may want to specify more detailed search constraints like position and radius, receiver and/or frequency, and array configuration.

For sources with common names, it is best to search using the source resolver. Put the common name of the source in the 'Source Name' box and click resolve. This will populate the RA and Dec boxes with the returned coordinates. Specifying a search radius can be helpful as well; otherwise the default of 1 degree is used. The text name of a source is not searched by default and needs to be enabled by ticking the checkbox below the Source Name box.
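
To make the meaning of a position-plus-radius search concrete, the following minimal sketch computes the angular separation between a resolved source position and an observation's pointing position and compares it with the search radius. It only illustrates the geometry of a cone search; the function names are hypothetical, and how the archive itself matches positions is not described here.

```python
import math


def angular_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two sky positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(
        math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg)
    )
    # Haversine formula: numerically stable for small separations.
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def within_cone(centre, pointing, radius_deg=1.0):
    """True if `pointing` (ra, dec) lies within `radius_deg` of `centre`.

    The 1 degree default mirrors the default search radius quoted above.
    """
    return angular_separation_deg(*centre, *pointing) <= radius_deg
```

For example, `within_cone((83.633, 22.014), (83.8, 22.5))` checks whether a pointing about half a degree away from the search centre would fall inside the default radius.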

search-inputs.png

 


Archive Views

The default view of the archive and its search returns is the 'Project'-based view where observations (execution blocks) and images are organized underneath the top-level project that they are associated with. However, it is possible to view just the observations as individual executions by selecting 'View Observations', and it is also possible to just view images from the search returns by selecting 'View Images'.

Archive-views.png


Image View

Since the VLA imaging pipeline came online in 2022, continuum images are now available for most data that could be processed by the VLA calibration pipeline. The AAT also includes images from the VLASS project, the Arecibo ALFALFA project, and CHILES.

Users can also view images using the integrated CARTA viewer within the NRAO archive. Either select one or multiple images and click 'View in CARTA' at the top or select the CARTA icon next to the selection button.

The AAT will currently only allow selection of images from the same project for download or viewing in CARTA, so if your search return contains images from multiple projects, upon selection, images from projects other than the selected one will be grayed out.

Image-view.png


VLA Data Retrieval

Data can be retrieved from any of the project views. Once a dataset of interest has been identified, users can select it with the clipboard icon to its left. Multiple datasets can be selected at once before clicking 'Download'. Note that if a calibrated Measurement Set (MS) is requested, only one dataset may be selected at a time, and only datasets with an icon in the 'Cals' column can have a calibrated MS requested.

To see detailed information about the targets, frequency setup, and observations, users will need to look at the scan lists for a dataset (see Scan Lists), which are accessible by clicking on the blue numbers in the far right column. The scan lists can also be accessed using the 'link' icon on the far left, next to the data selection button. This also provides a direct link to the dataset that can be shared or bookmarked.

Project-EB-view.png

Once data of interest have been selected, click the 'Download' button, which brings up the Download dialog box. Users who are not logged in will need to enter their e-mail address there. Users who want the data delivered directly to their Lustre area at the DSOC will need to be logged in and enter their path. Lustre delivery for VLA data is only possible to /lustre/aoc.

A calibrated MS is selected by default if available. The AAT automatically selects the CASA version the data were calibrated with, if possible, but users can change the CASA version if desired.

The link and wget/wget2 commands to download the data will be sent in the e-mail notification when staging for download is complete.

Download-dialog-EVLA.png


ALMA Data Retrieval

ALMA data retrieval is very similar to VLA, but ALMA data are organized as Member Observation Unit Sets (MOUS), or more simply, groups of Execution Blocks (EBs) or observations that were calibrated together. All the EBs need to be restored in order to have a complete dataset. To do that, simply click the 'Download Calibrated MS' button located on the right side of the row. Clicking the 'Down Arrow' on the row of an ALMA MOUS will reveal more information about the targets, spectral setup, and EBs. The 'Re-Imaging' button will open the dialog box for the ALMA User-Defined Imaging service to make new images from ALMA data. Local delivery to a Lustre area is possible for ALMA data, but only to /lustre/cv.

The link and wget/wget2 commands to download the data will be sent in the e-mail notification when staging for download is complete. We do not recommend requesting tar files, because wget/wget2 can download collections of files and directories.

ALMA-project-view.png


VLBA Data Retrieval

VLBA data retrieval is also very similar to VLA and ALMA. The VLBA project view organizes the observations into Segments, which group multiple individual correlated data products together. Click the down arrow to see all the correlated data products within a segment. Each file in a Segment corresponds to one correlator pass. There could be datasets related to "zoom bands" (re-correlation at higher spectral resolution), multiple bands in an observation (e.g., alternating between C-band and X-band throughout an observation), or data that were re-correlated at an observer's request. Note that observations which use the S/X dual-band observing mode will only produce a single correlated dataset with both bands included. To see detailed information about the targets, frequency setup, and observations, users will need to look at the scan lists for a dataset (see Scan Lists).

VLBA-data-view.png

Once within a segment, a dataset can be selected for download.

VLBA-Segment.png

Because only the correlated FITS-IDI (*.idifits) files, mark4 (*.mark4.tar.z) tarballs, and legacy hardware correlator UVFITS (*.uvfits) files are available for download for VLBA, the download dialog box is much simpler than that of the VLA or ALMA. Note that Lustre delivery for VLBA data is only possible to /lustre/aoc. We do not recommend requesting tar files, because wget/wget2 can download collections of files and directories. The link and wget/wget2 commands to download the data will be sent in the e-mail notification when staging for download is complete.

Download-VLBA.png


GBT Data Retrieval

GBT data retrieval is similar to the other telescopes. Each observing session has its own archive file that would need to be selected for download. To see detailed information about the targets and observational sequence, users will need to look at the scan lists for a dataset, see Scan Lists. However, due to the way the GBT data were ingested, detailed frequency setup information is not currently available. Updates to the GBT metadata are planned for the future.

GBT-AAT-project-view.png

Multiple GBT datasets can be selected at a time; they come as individual tar files in the GBT raw data format. The download dialog box is simplified, like that of the VLBA, since there are no online processing options that can be performed on GBT data. Note that Lustre delivery for GBT data is only possible to /lustre/cv. The link and wget/wget2 commands to download the data will be sent in the e-mail notification when staging for download is complete.

GBT-download.png


Detailed Observation Information: Scan lists

ALMA data have some observational metadata related to targets, time on source, and frequency setups in their Project View. However, for VLA, VLBA, and GMVA data, the detailed observational information is currently available only in the scan list. The scan lists detail the frequency setup, targets, and overall execution of the observation, and they can be accessed by clicking on the blue number in the scans column (far right) for each observation. To see the scan list for ALMA or VLBA data, you must descend into the MOUS/Segment to see the individual EBs. Note that the scan list can also be accessed from the 'link' icon on the left, next to the select data button.

Project-EB-view.png

Within the scan list webpage, users will see metadata associated with the observation; they can also request the data associated with the observation from this page. Users can also see the frequency setups, which are collapsed by default because there are often many spectral windows for standard VLA data. When expanded, each frequency setup will show the targets observed with that setup. The pointing setup will also appear here for VLA data that took reference pointing observations. The min/max spectral resolution can help identify whether an observation had higher frequency resolution spectral windows defined for spectral line science.

Scanlist-overview.png

When expanded, each frequency setup will show the targets and time on source (calculated from the scan durations, including the slew overhead); further down, the individual spectral window configurations can be seen. The frequencies shown in the AAT for VLA and VLBA data are in the topocentric (TOPO) reference frame and will be shifted relative to frequencies expressed in other reference frames (for example LSRK or barycentric). This is important to keep in mind when determining whether an observation covered a spectral line of interest.
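
As a rough illustration of why the reference frame matters, the sketch below estimates where a spectral line lands in frequency for a given source velocity and checks whether it could fall inside a spectral window whose edges are quoted in the TOPO frame. It assumes the radio velocity convention, and the default margin of 50 km/s is only an assumed, conservative allowance for the difference between the topocentric frame and frames such as LSRK; the function names are hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s


def sky_frequency_mhz(rest_mhz, velocity_kms):
    """Observed frequency for a radial velocity in the radio convention."""
    return rest_mhz * (1.0 - velocity_kms / C_KMS)


def line_may_fall_in_window(rest_mhz, velocity_kms, win_lo_mhz, win_hi_mhz,
                            margin_kms=50.0):
    """True if the line could fall inside [win_lo_mhz, win_hi_mhz].

    The line frequency is evaluated at velocity +/- margin_kms to allow
    for the (assumed) maximum frame shift, and the resulting frequency
    range is tested for overlap with the window.
    """
    f_min = sky_frequency_mhz(rest_mhz, velocity_kms + margin_kms)
    f_max = sky_frequency_mhz(rest_mhz, velocity_kms - margin_kms)
    return f_max >= win_lo_mhz and f_min <= win_hi_mhz


# Example: the HI 21 cm line (rest frequency 1420.405752 MHz) for a
# source receding at 1000 km/s, tested against a 1414-1418 MHz window.
HI_REST_MHZ = 1420.405752
covered = line_may_fall_in_window(HI_REST_MHZ, 1000.0, 1414.0, 1418.0)
```

A proper frame conversion depends on the observing date, the source direction, and the telescope location, so this check is only a first filter before inspecting the data themselves.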

Scanlist-detail.png


Download Directory Organization

The download directories are set up to organize the downloads by type of data, anticipating downloads that include multiple types of data, e.g., calibrations, measurement sets, SDMs, and/or images.

The organization is typically as follows for ALMA and VLA data:

  • Project Code
    •  MOUS ID (ALMA only)
      • observation.MJD
        • SDMs
        • historical VLA export Files
      • pipeline.MJD
        • Restored Measurement Sets
      • calibration_pipeline.MJD
        • calibration tarballs
      • imaging_pipeline.MJD
        • VLA/ALMA Images
        • Ancillary images (flatnoise, PB, mask)
        • weblog from imaging pipeline run
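
After a download completes, a layout like the one above can be inspected programmatically. The following is a minimal sketch that walks a download directory and groups files by the kind of product directory they sit under. It assumes that the product directories are literally named with the prefixes shown above followed by a dot and a suffix (for example "observation.<MJD>"); the exact naming on disk may differ, and the function name is hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Directory-name prefixes taken from the layout listed above.
KNOWN_KINDS = (
    "observation",
    "pipeline",
    "calibration_pipeline",
    "imaging_pipeline",
)


def group_by_kind(download_root):
    """Map each product-directory kind to the files found beneath it."""
    groups = defaultdict(list)
    for path in sorted(Path(download_root).rglob("*")):
        if not path.is_file():
            continue
        # Walk upwards from the file to the nearest directory that is
        # named "<kind>.<suffix>" with a recognised kind.
        for parent in path.relative_to(download_root).parents:
            kind = parent.name.split(".", 1)[0]
            if "." in parent.name and kind in KNOWN_KINDS:
                groups[kind].append(path)
                break
    return dict(groups)
```

Calling `group_by_kind("path/to/download")` returns a dictionary such as `{"observation": [...], "imaging_pipeline": [...]}`; files that are not under any recognised product directory are ignored.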

Organization for VLBA data:

  • Project Code
    • Segment (may or may not be present)
      • idifits or mark4 files

All download requests will follow this format except Basic MS, which still uses an older delivery system that will be updated in the future. The VLA Basic MS delivery format is:

  • SDM ID
    • SDMID.ms.tgz
    • other files generated by the pipeline to convert from SDM to MS

Scripted Access to the NRAO Archive

The NRAO archive metadata can be searched using a virtual observatory (VO) Table Access Protocol (TAP) service. Detailed information and a demonstration notebook for scripted access are available on the Scripted Data Access page. Downloads are not currently available through scripted access, but are under consideration for the future. However, a direct link to each dataset is available within the scripted returns so that users do not need to replicate their queries in the web interface.
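
As a minimal sketch of what a scripted TAP query can look like, the code below builds an ADQL cone-search query and the corresponding synchronous TAP request URL using only the Python standard library. The base URL "https://example.org/tap" is a placeholder, and the table name "ivoa.obscore" and the columns s_ra and s_dec are assumptions taken from the generic IVOA ObsCore convention; the actual endpoint, table, and column names for the NRAO service should be taken from the Scripted Data Access page.

```python
from urllib.parse import urlencode


def cone_search_adql(ra_deg, dec_deg, radius_deg,
                     table="ivoa.obscore", limit=100):
    """Build an ADQL cone search (table and column names are assumed)."""
    return (
        f"SELECT TOP {int(limit)} * FROM {table} "
        f"WHERE CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {ra_deg:.6f}, {dec_deg:.6f}, {radius_deg:.6f})) = 1"
    )


def tap_sync_url(base_url, adql):
    """Build a synchronous TAP request URL for an ADQL query."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return f"{base_url.rstrip('/')}/sync?{urlencode(params)}"


# Placeholder endpoint: replace with the service URL from the
# Scripted Data Access page before use.
query = cone_search_adql(83.633, 22.014, 0.1)
url = tap_sync_url("https://example.org/tap", query)
```

The resulting URL can be fetched with any HTTP client, or the same ADQL string can be passed to a VO client library instead of building the URL by hand.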


Data Formats

Jansky VLA data formats

VLA data taken after January 2010, when the transition to the WIDAR correlator took place, are stored in the Science Data Model (SDM) format that is used by both the VLA and ALMA. VLA data are available through the AAT in the following formats:

  • In the native Science Data Model (SDM) format
    • All EVLA data
  • As an uncalibrated Measurement Set (MS) created from the SDM
    • All EVLA data
  • As a Calibrated Measurement Set
    • VLA data since the start of semester 16B (~August 2016)
    • ALMA data since cycle 5 (project codes starting with 2017)

Historical VLA data format

  • Export file format
    • Readable in CASA and AIPS.

VLBA/GMVA data formats

VLBA and GMVA data are typically provided in IDIFITS format, which can be read by AIPS or CASA. Some VLBA and GMVA data may also be available in Mark4 format, which is delivered as a tarball and, for recent observations, is available alongside the IDIFITS file.

Image data formats

The images provided by the archive are all in FITS format which can be read by a variety of FITS readers like astropy, ds9, CASAviewer, and CARTA.


Proprietary Data Access

Access to VLA/VLBA data associated with a proposal is currently restricted to the proposing team for a period of 12 months from the date of the last observation in a project. However, 2025B thesis projects will have a 24-month proprietary period and, starting in 2026A, all data will have a 24-month proprietary period. Projects submitted as Director's Discretionary Time proposals will have a proprietary period of 6 months. Please refer to the NRAO Data Archive Policy for more information regarding proprietary data.
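
To estimate when a project leaves its proprietary period, the date arithmetic can be sketched as follows. This is only an illustration: it assumes the period is counted in whole calendar months from the date of the last observation, with the day of the month clamped when the target month is shorter. The function names are hypothetical, and the NRAO Data Archive Policy remains the authoritative source for actual release dates.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day of the month is clamped to the last valid day of the
    target month (for example 31 August + 6 months -> 28 February).
    """
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(start.day, last_day))


def proprietary_end(last_observation, period_months=12):
    """Estimated end of the proprietary period (assumed counting rule)."""
    return add_months(last_observation, period_months)


# Example: last observation on 15 March 2024 with a 12-month period.
estimated_end = proprietary_end(date(2024, 3, 15))
```

Passing `period_months=24` or `period_months=6` covers the other period lengths mentioned above.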

You need to be PI or co-I on the proposal to be able to access proprietary data. If you were not listed on the original proposal, you must first obtain permission from the PI, who, in turn, must contact us through the NRAO Helpdesk to allow your access to those data.

Any member of the observing team can access proprietary data by signing in on the archive page with their my.nrao.edu user ID and password (Figure 1), and using the link "your_username's Data" in the upper right corner of the site. Users who attempt to access their proprietary data without logging in will be prompted to log in at the time of the request.

Access to ALMA data is governed by the proprietary policies of the Joint ALMA Observatory (see ALMA Users Policies). Typical proprietary periods are 12 months from the delivery of the data to the PI, and DDT proposals have a proprietary period of 6 months. Only the PI and delegees have access to proprietary ALMA data.


NVAS Survey

Historical VLA images and, in many cases, calibrated uv data from the NVAS survey can be obtained from https://www.vla.nrao.edu/astro/nvas/. Searches used to be available through the legacy archive tool, but are currently limited to the simpler interface on the NVAS survey website. We plan to make these data available from the main archive in the future.


Known Issues/Limitations

  • Requests for SDM-BDF and MS of the same observations have to be submitted separately.
  • Only a single calibrated MS can be requested at any one time.
  • If you request very large tar files, the system may time out because of how long the tar command takes to run. We do not recommend requesting tar files. Instead, download via wget as described in the next item.
  • Download requests without tarring are delivered as a directory structure. To download that directory structure, use one of the following wget or wget2 commands, replacing 'https://dl-dsoc.nrao.edu/anonymous/.....' with the link provided in the notification e-mail.
    wget -r --reject "index.html*" -np -nH --cut-dirs=3 https://dl-dsoc.nrao.edu/anonymous/.....

    wget2 -r -l 10 --reject-regex="index.html*" --progress=bar -np -nH --cut-dirs=3 \
      https://dl-dsoc.nrao.edu/anonymous/
  • wget will occasionally fail for large downloads; if that happens, use the wget2 command instead.
  • Some download requests (Basic MSes in particular) are returned in a nested directory, with a sub-directory named exactly the same as what you asked for; you will have to go into that sub-directory to get to the requested file, e.g. 20B-099.sb39274643.eb39345634.59269.54178597222/20B-099.sb39274643.eb39345634.59269.54178597222.ms.tgz
  • Basic MSes are always provided as a tarball. This is a limitation of the workflow and will be corrected in the future.
  • Searching for particular images by SDM ID will not return any values. Instead they can be searched for by using the project code field and putting the MJD string from the SDM in the filename field.
  • Delivery of VLA data in the UVFITS format was discontinued in 2013. See Instructions to create UVFITS from the SDM or CASA MS format.

  • The lock icons, denoting that data are proprietary, may be incorrect for ALMA data, but access rules are properly enforced by the Request Handler.
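
Several of the items above rely on pieces of the SDM ID, such as the project code and the MJD string needed for image searches. The following minimal sketch splits an SDM ID of the form shown in the nested-directory example into its parts and converts the MJD string to a UTC timestamp. It assumes the five dot-separated fields seen in that example, and that "sb" and "eb" denote the scheduling block and execution block; other datasets may not follow this pattern, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Modified Julian Date zero point: 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def parse_sdm_id(sdm_id):
    """Split an SDM ID like '20B-099.sb...eb....<MJD day>.<MJD fraction>'."""
    parts = sdm_id.split(".")
    if len(parts) != 5:
        raise ValueError(f"unexpected SDM ID layout: {sdm_id!r}")
    project, sched_block, exec_block, mjd_day, mjd_frac = parts
    mjd_string = f"{mjd_day}.{mjd_frac}"
    return {
        "project_code": project,
        "scheduling_block": sched_block,  # assumed meaning of 'sb'
        "execution_block": exec_block,    # assumed meaning of 'eb'
        "mjd_string": mjd_string,
        "utc": MJD_EPOCH + timedelta(days=float(mjd_string)),
    }


info = parse_sdm_id("20B-099.sb39274643.eb39345634.59269.54178597222")
```

Here `info["project_code"]` and `info["mjd_string"]` are the two values that the image-search workaround above asks for (project code field and filename field, respectively).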

Some archival data have known problems which, together with the possible fix, are listed at the archive issues page.


Help, Feedback, and Suggestions

We value feedback from our user community on improving the NRAO archive tool. Users with a feature request, or feedback on how a current feature/implementation could work better, can file a helpdesk ticket at help.nrao.edu under the department 'Archive Access Tool Feedback'. For general help using the archive or problems getting data from the archive, select the 'VLA/VLBA Archive and Data Retrieval' department.

      The NSF National Radio Astronomy Observatory and NSF Green Bank Observatory are facilities of the U.S. National Science Foundation operated under cooperative agreement by Associated Universities, Inc.