Scripted Access to the NRAO Archive

The web-based interface to the NRAO archive meets the needs of many users. However, the browser interface can be difficult (or unusable) for users who need to run more sophisticated queries or to collate and filter search results. We have therefore developed the ability to query the NRAO archive using Virtual Observatory (VO) protocols. This lets users interact with the archive metadata in ways that are inefficient or impossible with the web-based interface.

The scripted interface conforms to the VO Table Access Protocol (TAP) standard and can be queried with VO client tools. We have primarily tested the service using the Python pyVO package, but it should also work with other VO-enabled tools (e.g., TOPCAT). To access the service, users should use the server address listed below (this is the address of the VO service, not a webpage).

Jupyter Notebook Demonstrating TAP Access

TAP Server address: https://data-query.nrao.edu/tap

Downloads are not yet possible through this scripted interface. Users will therefore need to use the scripted interface to identify the data they are looking for and then download those products through the web interface.
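As a rough illustration of what a scripted query can look like, the sketch below builds an ADQL position search and sends it to the TAP server address given above using pyVO. The table name tap_schema.obscore and the helper names build_cone_query and run_query are assumptions made for this example and are not confirmed by this page. The query selects rows whose s_ra/s_dec position falls within the given radius, which may differ in detail from the selection used in the demonstration notebook.

```python
TAP_URL = "https://data-query.nrao.edu/tap"  # TAP server address listed above

# Assumption: the searchable table is exposed under this name.
OBSCORE_TABLE = "tap_schema.obscore"


def build_cone_query(ra_deg, dec_deg, radius_deg, table=OBSCORE_TABLE):
    """Return an ADQL query for rows whose (s_ra, s_dec) lies inside a circle.

    All three arguments are in degrees.
    """
    return (
        f"SELECT * FROM {table} "
        f"WHERE CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg})) = 1"
    )


def run_query(adql, url=TAP_URL):
    """Send an ADQL query to the TAP service (needs pyvo and network access)."""
    import pyvo  # imported here so the query builder works without pyvo

    service = pyvo.dal.TAPService(url)
    return service.search(adql)
```

A call such as run_query(build_cone_query(83.82, -5.39, 0.1)) would return a pyVO result set, and its to_table() method converts that to an Astropy table for further filtering. The coordinates here are arbitrary placeholders.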

 

Getting Started for Users New to VO

We have created a demonstration Jupyter Notebook that shows how to query the archive and how to filter and collect the results. Click here to download the demonstration notebook. The example queries by position and a radius around that position. The search results contain all 'scans' within the archive that contain the queried coordinates. The notebook then shows how to identify the unique Execution Blocks ([E]VLA, VLBA, ALMA) that contain the source and how to sum the on-source observing time within a given Execution Block. Users can obtain this information more efficiently this way than through the web interface.
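The grouping step described above can be sketched in plain Python. This is not the notebook's code, only a minimal illustration: it assumes each returned row can be indexed by column name (as rows of an Astropy table can), and it uses the obs_publisher_did and t_exptime columns to identify the Execution Block and the scan exposure time. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def total_time_per_execution_block(rows):
    """Sum t_exptime over all scans, grouped by obs_publisher_did."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["obs_publisher_did"]] += float(row["t_exptime"])
    return dict(totals)


# Made-up rows for illustration only (not real archive results).
example_rows = [
    {"obs_publisher_did": "EB_A", "t_exptime": 120.0},
    {"obs_publisher_did": "EB_B", "t_exptime": 60.0},
    {"obs_publisher_did": "EB_A", "t_exptime": 30.0},
]

totals = total_time_per_execution_block(example_rows)
unique_blocks = sorted(totals)  # the distinct Execution Blocks found
```

The totals are in whatever unit the service reports for t_exptime; the ObsCore standard defines that column in seconds, but this page does not state the unit.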

 

Columns Returned

The TAP interface returns many columns; each is described briefly below:

Most useful columns:
s_ra - right ascension
s_dec - declination
s_fov - field of view (may not be accurate for non-VLA data)
obs_publisher_did - file set ID, or execution block name (a unique name to locate in archive web interface)
target_name - name of target in NRAO archive
t_min - observation start time (MJD)
t_max - observation end time (MJD)
t_exptime - exposure time of scan
freq_min - minimum frequency (Hz)
freq_max - maximum frequency (Hz)

em_min - minimum wavelength (meters)
em_max - maximum wavelength (meters)
facility_name - name of the facility (NRAO)
instrument_name - name of the instrument (VLBA, VLA, ALMA, or GBT)
pol_states - polarization products available
dataproduct_type - visibility or image
calib_level - calibration level of data (currently all are 1)
access_estsize - size of data in archive
configuration - Array configuration (VLA and ALMA)
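Several of these columns use units that are not immediately readable, so a small conversion sketch may help. The helpers below are hypothetical names introduced for this example. The first turns an MJD value such as t_min or t_max into a calendar date (treating every day as exactly 86400 seconds, so leap seconds are ignored), and the second converts a frequency in Hz to a wavelength in meters. By that relation em_min would be expected to correspond to freq_max, although this page does not say so explicitly.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)
SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a timezone-aware UTC datetime."""
    return MJD_EPOCH + timedelta(days=mjd)


def frequency_to_wavelength(freq_hz):
    """Convert a frequency in Hz to a wavelength in meters."""
    return SPEED_OF_LIGHT / freq_hz
```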

 

Columns that are less useful at present:

obs_id - science product locator
em_res_power - spectral resolving power (unused)
t_resolution - time resolution
access_url - where to get the data
obs_collection - currently not used
em_xel - currently not used
o_ucd - currently not used
s_region - currently not used
s_resolution - currently not used
access_format - data format