VLA Calibration Pipeline 2024.1.0.8; CASA version 6.6.1
General Description
The VLA calibration pipeline performs calibration and automated flagging using CASA. It is currently designed to work for Stokes I data (except P-band and 4-band), and has recently been modified to work with spectral line data. Each Scheduling Block (SB) observed by the VLA is automatically processed through the pipeline. The VLA calibration pipeline performs the following steps:
- Loads the data in from the archival format (Science Data Model-Binary Data Format [SDM-BDF]) and converts to a CASA MeasurementSet (MS), applies Hanning smoothing (when the data were not frequency averaged by the VLA correlator or marked as a spectral line), and obtains information about the observing set-up from the MS;
- Applies online flags (generated during observing) and other deterministic flags (shadowed data, edge channels of sub-bands, etc.);
- Prepares models for primary flux density calibrators;
- Derives pre-determined calibrations (antenna position corrections, gain curves, atmospheric opacity corrections, requantizer gains, etc.);
- Calculates the level of gain compression that is present;
- Iteratively determines initial delay and bandpass calibrations, including flagging of RFI (Radio Frequency Interference) and some automated identification of system problems;
- Derives initial gain calibration, and derives the spectral index of the bandpass calibrator;
- Runs RFI flagging on the calibrator data with the initial calibration applied;
- Derives final delay, bandpass, and gain/phase calibrations, and applies them to the data;
- Runs the RFI flagging algorithm on the target data;
- Calculates data weights based on the inverse square RMS noise of the MS;
- Creates diagnostic images of calibrators.
For more information about pipeline heuristics, please see the pipeline presentations from the 2021 VLA Data Reduction workshop.
The scripted pipeline is no longer being updated, but is still available with older CASA versions; the most recent version is only guaranteed to work with CASA 5.3. If you are interested in obtaining the scripted pipeline, please see the Scripted Pipeline page.
Please see the Release Notes and Known issues below; the pipeline reference manual is also available. For questions, suggestions, and other issues encountered with the pipeline, please submit a ticket to the Pipeline Department of the NRAO Helpdesk.
VLA Imaging Pipeline
An imaging pipeline is also available for VLA continuum and spectral line data. The imaging pipeline will produce single continuum images of each target field per observed band and spectral cubes for each spectral line spw (if applicable, as detected or specified). For instructions on how to run the imaging pipeline and information about the products it creates, please see the VLA Imaging Pipeline webpage.
VLA Pipeline 2024.1 (CASA 6.6.1) Release Notes
- Spectral line processing is available in the VLA pipeline; by default the pipeline will only select spectral windows narrower than the default continuum windows (128 MHz, 64 channels for C-band and higher, 64 MHz and 64 channels for S-band and lower) as spectral line. This is the behavior when the new hifv_importdata parameter specline_spws is set to 'auto', which is the default. See more details in the spectral line section.
- The task hifv_mstransform has been introduced (replacing hif_mstransform for the VLA) in the imaging recipes. This additional task supports spectral line processing by creating an additional MS of just the spectral line windows. A call to this task is also added to the casa_piperestorescript.py files generated by the pipeline.
- The calibrated measurement sets of the science targets (mySDM_targets.ms), split off by hifv_mstransform, are different from previous pipeline releases. mySDM_targets.ms will only contain spectral line spectral windows. The task hifv_mstransform will also create mySDM_targets_cont.ms, which contains all the spectral windows and corresponds to the mySDM_targets.ms that was produced by previous versions of the pipeline.
- The hifv_hanning task will only apply Hanning smoothing to continuum spectral windows and line spectral windows that may contain a maser line. Setting maser_detection=False will turn off the Hanning smoothing for spectral windows potentially containing a maser line.
- Antenna position corrections are only added for updates that occur within 150 days (ant_pos_time_limit parameter for gencal and hifv_priorcals) after an observation was conducted; this prevents updates for antennas that remain on the same pads for sometimes years and have systematic drift. Setting ant_pos_time_limit=0 for hifv_priorcals will result in the previous behavior for antenna position corrections.
- The various a priori calibrations applied by hifv_priorcals can now be turned on and off individually. Also, the switched power data are now only plotted for a single spectral window in each baseband, and only when antennas are on-source, to provide a more intelligible plot.
- The task hifv_circfeedpolcal has a new option (run_setjy) to disable the internal setting of the Pol. Angle calibrator IQUV. If set to False, a user can insert their own setjy command prior to task execution to produce fully polarization calibrated data; if set to True, the task will use the internal hard-coded values for 3C286 and 3C138 that are only valid for S-band. A bug was also fixed that prevented proper polarization calibration on data with more than 2 basebands.
- The flux calibration task (hifv_fluxboot) will select fitorder=0, if a spectral setup spans less than 256 MHz in its spectral grasp. This fixes a long standing issue of closely spaced spectral windows that result in unphysical spectral indices being derived.
- The fractional bandwidth threshold for hifv_fluxboot to select fitorder=4 has been lowered to 1.5 so that multiband data spanning L- to C-band will be able to use the higher order fit to the calibrator spectrum.
- The imaging pipeline and hif_selfcal will no longer select the wproject gridder for S and L-bands due to prohibitive memory and performance limitations for A and B configuration data.
- The imaging pipeline will select a reference frequency determined by the weighted sensitivity of each spectral window to enable more robust imaging with nterms=2.
- The theoretical sensitivity is now reported in the hif_makeimages weblog and in hif_selfcal.
- More flagversions are backed up in the archived calibration tarballs, and restores will now look for and apply the flagversion 'statwt_1' when present instead of 'Pipeline_Final'. This is because hifv_statwt is run as part of the restore and this ensures the input data are identical to when statwt was run originally during calibration. Previous differences, however, were found to be inconsequential.
- The task hifv_checkflag will now operate better when multiple calibration intents are set for the phase calibrator (e.g., both bandpass and complex gain are set, but another source has bandpass already).
- The TEC data can now be shown again in the hifv_priorcals task, and the issues some users had encountered due to the lack of the uncompress program on their installations should now also be corrected.
- Numerous bugs were fixed in the hif_selfcal task; see the VLA Imaging Pipeline webpage for more detail.
- See also the ALMA pipeline documentation for general, non-VLA specific infrastructure changes.
Obtaining the Pipeline
The pipeline is released and included with CASA. The Obtaining CASA webpage has links to the CASA version with the most recent VLA pipeline and provides access to older versions. Follow these links to obtain CASA 6.6.1 with the pipeline for Linux systems and MacOS systems.
Please note that pipeline operation on MacOS is not officially supported, is mostly untested, and may not produce reliable results or results consistent with runs performed in Linux.
Pipeline Requirements
- The VLA calibration pipeline runs on each completed SB (typically a single SDM-BDF) separately; there is currently no provision for it running on collections of SBs.
- The pipeline relies on correctly set scan intents. We therefore recommend that every observer ensures that the scan intents are correctly specified in the Observation Preparation Tool (OPT) during the preparation of the SB (see OPT manual for details). Typical intents are:
- CALIBRATE_FLUX (required): flux density scale calibration scans (toward one of the standard VLA calibrators 3C48, 3C138, 3C147, or 3C286). If multiple fields are marked with this intent, all of them will be used for the transfer. If this intent is not present, the pipeline will fail.
- CALIBRATE_AMPLI and CALIBRATE_PHASE (required): temporal complex gain/phase calibration; if these intents are not present, the pipeline will fail
- CALIBRATE_BANDPASS (optional, but highly recommended): scan that is used to obtain the bandpass calibration (only the first instance of CALIBRATE_BANDPASS is used regardless of the band; multi-band scheduling blocks cannot use different bandpass calibrators for different bands); if not present, the first field with scan intent CALIBRATE_FLUX will be used for bandpass calibration
- CALIBRATE_DELAY (optional): delay calibrator scan; if not present the first scan with a CALIBRATE_BANDPASS intent is used for delay calibration, and if that one is not available delays are calculated using the first CALIBRATE_FLUX scan.
- The pipeline also currently requires a signal-to-noise of >~3 for each spectral window of a calibrator per integration (for each channel of the bandpass).
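The intent requirements above can be checked before a run. The following is a minimal sketch in plain Python, not part of the pipeline: the function name check_intents and the idea of passing the intents as a simple collection of strings are assumptions made for illustration (in practice the intents would be read from the listobs output or the OPT).

```python
# Hypothetical helper (not a pipeline task): compare a set of scan intents
# with the requirements listed in this section.
REQUIRED_INTENTS = ("CALIBRATE_FLUX", "CALIBRATE_AMPLI", "CALIBRATE_PHASE")


def check_intents(intents):
    """Return (missing_required, notes) for a collection of intent strings."""
    present = set(intents)
    # Required intents: the pipeline fails if any of these is absent.
    missing = [name for name in REQUIRED_INTENTS if name not in present]
    notes = []
    # Optional intents: record which fallback the text above describes.
    if "CALIBRATE_BANDPASS" not in present:
        notes.append("bandpass: first CALIBRATE_FLUX field will be used")
    if "CALIBRATE_DELAY" not in present:
        if "CALIBRATE_BANDPASS" in present:
            notes.append("delay: first CALIBRATE_BANDPASS scan will be used")
        else:
            notes.append("delay: first CALIBRATE_FLUX scan will be used")
    return missing, notes


if __name__ == "__main__":
    # Example input only; OBSERVE_TARGET stands in for the science scans.
    missing, notes = check_intents(
        {"CALIBRATE_FLUX", "CALIBRATE_PHASE", "OBSERVE_TARGET"}
    )
    print("missing required intents:", missing)
    for note in notes:
        print(note)
```

A non-empty list of missing intents means the pipeline is expected to fail on that SB; the notes only describe which fallback would apply.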
Automatic Processing for Science Observations
Starting with Semester 2013A D-configuration, a version of the calibration pipeline has been run automatically at the completion of all astronomical scheduling blocks, except for P-band and 4-band observations, with the resulting calibration tables and flags archived for future use. Through Semester 2024B, B-configuration, NRAO executed the standard pipeline optimized for Stokes I continuum processing, independent of the actual science goal. Starting in Semester 2025A, A-configuration, a pipeline with modifications to enable Stokes I continuum and spectral line calibration began to operate; imaging of spectral line data as part of operations, however, will not begin until 2025A, D-configuration. It may be necessary to re-run the pipeline with modifications depending on the science goal, and older data can be re-run through the latest pipeline to possibly achieve better results.
Investigators are notified when the calibrated data are ready for download; detailed quality assurance checks are performed on a subset of data via the SRDP project (see next section) or can be performed by NRAO staff upon request. Users will request their calibrated data from the NRAO archive, which will automatically produce the calibrated MS by applying the saved calibration and flag tables. See the section "Restore calibration from archived products" for details on the restoration procedures. Questions regarding the availability of pipeline-processed data should be submitted to the Pipeline Department of the NRAO Helpdesk.
Science Ready Data Products (SRDP) Processing
Since June 2019, a subset of VLA projects has undergone a more rigorous quality assurance process as part of the NRAO SRDP Initiative. As of September 2023, SRDP processing is being done on most continuum-only projects observing at C-band and higher frequencies (single and multi-band data). Spectral line projects and those that used 3C138 and 3C48 as flux density calibrators are excluded from SRDP processing. Such projects are still run through the pipeline, but do not receive detailed quality checks.
SRDP deliveries may be expanded to include S-band and possibly L-band now that gain compression correction is available in the standard pipeline. Users will be delivered calibrated data that may have additional flagging applied and are expected to be ready for imaging by the user. Also, imaging of Stokes I continuum data that successfully complete the pipeline has been conducted since ~2022, and those images are available in the NRAO archive.
Running the Pipeline
The pipeline can take a few hours to a few days to complete depending on the specifics of the observation; ensure that your process can run for the full duration without interruption. Also, make sure that there is enough space in your directory as the data volume will increase by about a factor of four. A display environment must be set as well: use xvfb or some other virtual display environment if you will run the pipeline as a batch job. There are several ways to run the pipeline and, in most cases, we recommend starting with the raw data (an SDM-BDF) that can be requested from the NRAO archive. Place your data (preferably the SDM-BDF, but an MS is also acceptable) in its own directory for processing. For example:
#In a Terminal
mkdir myVLAdata
mv mySDM myVLAdata/
cd myVLAdata
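Because the data volume grows by about a factor of four, it can help to compare the size of the raw data with the free space in the working directory before starting. The sketch below uses only the Python standard library; the function names and the default growth_factor of 4.0 are illustrative assumptions taken from the rough figure quoted above, not a pipeline requirement.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all regular files below `path` (an SDM-BDF or MS)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def has_enough_space(data_path, work_dir=".", growth_factor=4.0):
    """Return (ok, needed_bytes, free_bytes) for a planned pipeline run.

    growth_factor=4.0 is an assumption based on the approximate increase
    in data volume; adjust it for your own experience.
    """
    needed = growth_factor * directory_size_bytes(data_path)
    free = shutil.disk_usage(work_dir).free
    return free >= needed, needed, free


if __name__ == "__main__":
    # 'mySDM' is the placeholder name used throughout this page.
    if os.path.isdir("mySDM"):
        ok, needed, free = has_enough_space("mySDM")
        print(f"need ~{needed / 1e9:.1f} GB, free {free / 1e9:.1f} GB, ok={ok}")
```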
Next, start CASA from the same directory where you placed your SDM-BDF. Note: do not try to run the pipeline from a different directory by giving it the full path to a dataset, as some of the CASA tasks require the MS to be co-located with its associated gain tables. And do not try to run the pipeline from inside the SDM-BDF or MS directories themselves. While SDM-BDFs and MSs are directories, they should always be treated as single entities that are accessed from the outside. It is also important that a fresh instance of CASA is started from the directory that will contain the SDM-BDF or MS, rather than using an existing instance of CASA and using "cd" to move to a new directory from within CASA, as the output plots will then end up in the wrong place and potentially overwrite your previous pipeline results.
Starting CASA with Pipeline Tasks
To start CASA with the pipeline from your own installation type:
#In a Terminal
casa --pipeline
Note that starting CASA without the --pipeline option will start CASA without any of the pipeline-specific tasks. (In turn, if you try to use 'casa --pipeline' for manual data reduction, plots that use matplotlib, such as those from plotants, will not work; use 'casa' without --pipeline for manual data reduction.)
If you are at the Domenici Science Operations Center (DSOC) or using NRAO computing resources at the DSOC, we provide a shortcut to the latest CASA version that includes the pipeline. To start this version, type:
#In a Terminal
casa-pipe
To list other versions of CASA with the pipeline that are available at the DSOC, type:
#In a Terminal
casa-pipe -ls
To start a particular version, type:
#In a Terminal
casa-pipe -r <full text from version list>
Now that CASA is open, there are several ways to start the pipeline.
The CASA homepage has more information on using CASA at NRAO.
Method 1: casa_pipescript.py
The pipeline can be run using a 'casa_pipescript.py' file: either the provided example (see below) or a script from any previous pipeline run that used the same CASA version. Edit the casa_pipescript.py to insert the intended SDM-BDF (or MS) name, then execute it in CASA. For this to work, the 'casa_pipescript.py' must be in the same directory where you start CASA with the pipeline. Once CASA is started (same steps as above) type:
#In CASA
execfile('casa_pipescript.py')
Also see the Imaging pipeline section for running the Calibration pipeline with Target Imaging.
Method 2: Recipe Reducer
The pipeline comes with several predefined sets of instructions, called recipes, to accommodate for different needs: one for VLA data, one for ALMA data, one for VLASS, etc.
# In CASA
import pipeline.recipereducer
pipeline.recipereducer.reduce(vis=['mySDM'],procedure='procedure_hifv.xml',loglevel='summary')
The upside to this method is that an exact reference to your CASA path and pipeline directory is not needed. The downside is that this method stores the pipeline weblog in a directory called pipeline-procedure_hifv, unlike the other methods described above and below, which create a unique directory name based on the date and time. Thus, one has to be cautious about running this procedure multiple times in the same directory. This method also includes a call to hifv_exportdata, which will package the calibration products into a 'products' directory one level up; this may not be desired by all users.
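Since the recipe reducer always writes to the same weblog directory name, a simple guard before each run can prevent accidentally overwriting earlier results. This is a minimal sketch in plain Python; the helper name safe_to_run_recipe is hypothetical, and the directory name is the one quoted above.

```python
from pathlib import Path


def safe_to_run_recipe(work_dir=".", weblog_name="pipeline-procedure_hifv"):
    """Return True if no earlier recipe-reducer weblog exists in work_dir."""
    return not (Path(work_dir) / weblog_name).exists()


if __name__ == "__main__":
    if safe_to_run_recipe():
        print("No previous recipe-reducer weblog found in this directory.")
    else:
        print("pipeline-procedure_hifv already exists; use a fresh directory.")
```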
Method 3: One Stage at a Time
You may notice that the 'casa_pipescript.py' is a list of specific CASA pipeline tasks that are called in order to form the default pipeline. If desired, one can run these tasks one at a time in CASA, for example to inspect intermediate pipeline products.
If you need to exit CASA between stages, you can restart the pipeline where you left off. However, in order for this to work, none of the files can be moved to other directories.
First, use the CASA pipeline task h_resume after starting CASA again. This will set up the environment again for the pipeline to work. Type:
# In CASA
h_resume()
Now, you may start the next task in your list.
Note that the Python recipes as documented in previous versions are now considered deprecated because they are not being maintained relative to the XML recipes.
What you get: Pipeline Products
VLA pipeline output includes data products such as calibrated visibilities, a weblog, and all calibration tables. Note that the automated execution at NRAO will also run an additional data packaging step (hifv_exportdata) that moves most of the files to an upper level '../products' directory. This step is omitted in the manual execution and all products remain within the 'root' directory where the pipeline was executed.
The most important pipeline products include (mySDM is a placeholder for the SDM-BDF name):
- A MeasurementSet (MS) 'mySDM.ms' with applied flags and calibrated visibilities in the CORRECTED_DATA column that can be used for subsequent imaging (root directory).
- A weblog that is supplied as a compressed tarball weblog.tgz. When extracted, it has the form pipeline-YYYYMMDDTHHMMSSS/html/index.html, where the YYYYMMDDTHHMMSSS stands for the pipeline execution time stamp (multiple pipeline executions will result in multiple weblogs). The weblog contains information on the pipeline processing steps with diagnostic plots and statistics. An example is given in the VLA pipeline CASA guide. During execution of the pipeline, this same directory will be present within the working directory. If the pipeline was run using the recipe reducer method, the weblog directory will have the form pipeline-procedure_hifv/html/index.html.
- Calibrator images for each band (files start with 'oussid*' in the root directory).
- All calibration tables and the 'mySDM.ms.flagversions' directory that contains various flag backups made at various stages of the pipeline run (see section Calibration Tables).
- The casa-YYYYMMDD-HHMMSS.log CASA logger messages (in pipeline-YYYYMMDDTHHMMSSS/html/).
- 'casa_pipescript.py' (in pipeline-YYYYMMDDTHHMMSSS/html/), the script with the actual executed pipeline heuristic sequence and parameters. This file can be used to modify and re-execute the pipeline (see section The casa_pipescript.py file).
- 'casa_commands.log' (in pipeline-YYYYMMDDTHHMMSSS/html/), which contains the actual CASA commands that were generated by the pipeline heuristics (see section The casa_commands.log file).
- The output from CASA's task listobs is available at 'pipeline-YYYYMMDDTHHMMSSS/html/sessionSession_default/mySDM.ms/listobs.txt' and contains the characteristics of the observations (scans, source fields, spectral setup, antenna positions, and general information).
- As previously mentioned in Automatic Processing for Science Observations, calibrated MSs for data since late-2016 can be generated automatically using the NRAO archive. For older data, the archived calibration needs to be re-applied to the raw data that have to be downloaded in the SDM-BDF format from the archive (more ideally, older data should be recalibrated using the current pipeline). To prepare for this restoration procedure, the hifv_exportdata pipeline task is run as a last step. This task packages calibration tables in 'unknown.session_1.caltables.tgz' and flag backups in 'mySDM.ms.flagversions.tgz'. An additional text file, 'mySDM.ms.calapply.txt', is also produced by hifv_exportdata that CASA task applycal uses when restoring the calibration. The restoring process itself is performed by a script called 'casa_piperestorescript.py'. See the section Restore calibration from archived products for details on the restoration procedures.
The Pipeline Weblog
Information on the pipeline run can be inspected through a weblog that is launched by pointing a web browser to file:///<path to your working directory>/pipeline-YYYYMMDDTHHMMSSS/html/index.html. The weblog contains statistics and diagnostic plots for the SDM-BDF as a whole and for each stage of the pipeline. The weblog is the first place to check if a pipeline run was successful and to assess the quality of the calibration.
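To avoid typing the timestamped path by hand, the weblog location can be found programmatically. The sketch below is plain Python, not a pipeline feature; the function name latest_weblog_url is hypothetical, and picking the most recently modified index.html is an assumption about which run you want to inspect.

```python
from pathlib import Path


def latest_weblog_url(work_dir="."):
    """Return a file:// URL for the newest pipeline-*/html/index.html, or None."""
    candidates = list(Path(work_dir).glob("pipeline-*/html/index.html"))
    if not candidates:
        return None
    # Choose by modification time so that both timestamped directories and
    # pipeline-procedure_hifv (recipe reducer) are handled the same way.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return newest.resolve().as_uri()


if __name__ == "__main__":
    url = latest_weblog_url()
    print(url if url else "No weblog found in this directory.")
```

The returned URL can be pasted into a browser, subject to the browser security settings noted in this section.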
An example walkthrough of a pipeline weblog is provided in the VLA Pipeline CASA guide.
Note that viewing weblogs requires either disabling some browser security settings or running a local webserver. Current instructions are provided within the weblog if a page cannot render due to browser settings. We most frequently view weblogs in Firefox; other browsers may not display all items correctly.
Calibration Tables
The final calibration tables of the pipeline are (where mySDM is a placeholder for the SDM-BDF name):
mySDM.ms.hifv_priorcals.s5_3.gc.tbl : Gaincurve
mySDM.ms.hifv_priorcals.s5_4.opac.tbl : Opacity
mySDM.ms.hifv_priorcals.s5_5.rq.tbl : Requantizer gains
mySDM.ms.hifv_priorcals.s5_6.ants.tbl : Antenna positions (if created)
mySDM.ms.hifv_finalcals.s13_2.finaldelay.tbl : Delay
mySDM.ms.hifv_finalcals.s13_4.finalBPcal.tbl : Bandpass
mySDM.ms.hifv_finalcals.s13_5.averagephasegain.tbl : Average Phase offsets
mySDM.ms.hifv_finalcals.s13_7.finalampgaincal.tbl : Flux calibrated Temporal Gains
mySDM.ms.hifv_finalcals.s13_8.finalphasegaincal.tbl : Temporal Phases
The casa_pipescript.py File
The sequence of pipeline heuristic steps is listed in the 'casa_pipescript.py' script, which is located in the pipeline-YYYYMMDDTHHMMSSS/html directory (where YYYYMMDDTHHMMSSS is the timestamp of the execution). A typical 'casa_pipescript.py' has the following structure (where mySDM is again a placeholder for the name of the SDM-BDF raw data file and will have the name of the one that was processed):
# This CASA pipescript is meant for use with CASA 6.5.4 and pipeline 2023.1.0.124
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['mySDM'])
    hifv_hanning()
    hifv_flagdata(hm_tbuff='1.5int', fracspw=0.01, intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, *SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*')
    hifv_vlasetjy()
    hifv_priorcals()
    hifv_syspower()
    hifv_testBPdcals()
    hifv_checkflag(checkflagmode='bpd-vla')
    hifv_semiFinalBPdcals()
    hifv_checkflag(checkflagmode='allcals-vla')
    hifv_solint()
    hifv_fluxboot()
    hifv_finalcals()
    hifv_applycals()
    hifv_checkflag(checkflagmode='target-vla')
    hifv_statwt()
    hifv_plotsummary()
    hif_makeimlist(intent='PHASE,BANDPASS', specmode='cont')
    hif_makeimages(hm_masking='centralregion')
    #hifv_exportdata()
finally:
    h_save()
(Note that executions at NRAO may show small differences, e.g., an additional final hifv_exportdata (commented out in example above) step that packages the products to be stored in the NRAO archive.)
The above is, in fact, a standard user 'casa_pipescript.py' file for the current CASA and pipeline version (download to edit and run yourself) that can be used for general pipeline processing after inserting the correct mySDM filename in hifv_importdata.
The pipeline run can be modified by adapting this script to comment out individual steps, or by providing different parameters (see the CASA help for the parameters of each task). The script can then be (re-)executed via:
# In CASA
execfile('casa_pipescript.py')
We will use this method later for an example where we modify the script for spectral line processing.
General modifications to the script include 'h_init(weblog=False)' for faster processing without any weblog or plotting.
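Commenting out individual steps can also be done with a small text-processing helper instead of a manual edit. The sketch below is plain Python and runs outside CASA; the function name comment_out_tasks is hypothetical, and it only handles task calls that fit on a single line (such as hifv_hanning()), which is an assumption about how the script is laid out.

```python
import re


def comment_out_tasks(script_text, task_names):
    """Return script_text with single-line calls to the given tasks commented out.

    Limitation (by design of this sketch): a call that continues over
    several lines would only have its first line commented, so use this
    for one-line calls only.
    """
    patterns = [re.compile(rf"{re.escape(name)}\s*\(") for name in task_names]
    out = []
    for line in script_text.splitlines():
        stripped = line.lstrip()
        if any(pattern.match(stripped) for pattern in patterns):
            indent = line[: len(line) - len(stripped)]
            out.append(f"{indent}#{stripped}")  # keep the original indentation
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "try:\n"
        "    hifv_importdata(vis=['mySDM'])\n"
        "    hifv_hanning()\n"
        "finally:\n"
        "    h_save()\n"
    )
    print(comment_out_tasks(example, ["hifv_hanning"]))
```

Writing the result to a new file name, rather than over the original script, keeps the executed pipeline sequence from the first run available for comparison.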
The casa_commands.log File
casa_commands.log is another useful file in pipeline-YYYYMMDDTHHMMSSS/html (where YYYYMMDDTHHMMSSS is the timestamp of the pipeline execution) that lists all the individual CASA commands that the pipeline heuristics (hifv) tasks produced. Note that 'casa_commands.log' is not executable itself, but contains all the CASA tasks and associated parameters to trace back the individual data reduction steps.
Restore calibration from archived products
To apply the calibration and flag tables produced by the pipeline, we recommend using the same version of CASA used by the pipeline as well as the same version of the pipeline. The version of both the pipeline and CASA used may be confirmed via the main "Observation Overview Page", the home page of the weblog. We recommend starting with a fresh SDM-BDF, although a fresh MS should work, too. There may be small differences in the final result or statistics if online flags were applied when requesting the MS.
In order for the pipeline to work properly, please follow the steps below to prepare your directory and calibration files for application. In addition to the raw SDM-BDF you will need the following pipeline products that are included in the calibration tarball from the NRAO archive: 'unknown.session_1.caltables.tgz', 'mySDM.ms.flagversions.tgz', 'mySDM.ms.calapply.txt', 'unknown.pipeline_manifest.xml', and 'casa_piperestorescript.py' (where mySDM is a placeholder for the SDM-BDF name). To ensure everything is correctly placed for the script to apply calibration, please follow these steps:
- Create a directory where you will work, call it something like "restoration".
mkdir restoration
- Go into your restoration directory and create three new directories named exactly as follows:
mkdir rawdata working products
- Place the raw SDM-BDF into the "rawdata" directory.
mv /path/to/fresh/data/mySDM rawdata/
- Place 'unknown.session_1.caltables.tgz', 'mySDM.ms.flagversions.tgz', 'unknown.pipeline_manifest.xml', and 'mySDM.ms.calapply.txt' into the "products" directory:
mv *.tgz products/
mv *.xml products/
mv *.txt products/
- Place the 'casa_piperestorescript.py' file into the "working" directory:
mv casa_piperestorescript.py working/
- The 'casa_piperestorescript.py' looks similar to the 'casa_pipescript.py', but runs a special hifv_restoredata pipeline task to apply flags and calibration tables, followed by hifv_statwt. Edit the hifv_restoredata call to include "../rawdata/" in front of the name of the SDM-BDF (mySDM), e.g.:
__rethrow_casa_exceptions = True
h_init()
try:
    hifv_restoredata(vis=['../rawdata/mySDM'], session=['session_1'],\
                     ocorr_mode='co',gainmap=False)
    hifv_statwt()
finally:
    h_save()
- From the "working" directory, start CASA:
casa --pipeline
or if you use computers at the DSOC:
casa-pipe
- Start the restore script from the CASA prompt:
# In CASA
execfile('casa_piperestorescript.py')
- Enjoy calibrated data once the process finishes.
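The directory preparation in the steps above can also be scripted. The following is a minimal sketch in plain Python that mirrors the mkdir and mv commands; the function name prepare_restore_tree and its arguments are hypothetical, and it assumes the calibration tarball has already been unpacked so that the individual product files are available.

```python
import shutil
from pathlib import Path


def prepare_restore_tree(top, sdm, products, restore_script):
    """Create rawdata/, working/, products/ under `top` and move files into place.

    sdm            : path to the raw SDM-BDF directory
    products       : iterable of paths (caltables .tgz, flagversions .tgz,
                     pipeline manifest .xml, calapply .txt)
    restore_script : path to casa_piperestorescript.py
    """
    top = Path(top)
    for sub in ("rawdata", "working", "products"):
        (top / sub).mkdir(parents=True, exist_ok=True)
    # The raw data go into rawdata/.
    shutil.move(str(sdm), str(top / "rawdata" / Path(sdm).name))
    # Calibration products go into products/.
    for item in products:
        shutil.move(str(item), str(top / "products" / Path(item).name))
    # The restore script goes into working/, where CASA is started.
    script = Path(restore_script)
    shutil.move(str(script), str(top / "working" / script.name))
    return top
```

After this, the hifv_restoredata call in the restore script still needs the "../rawdata/" prefix described above, and CASA must be started from the "working" directory.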
Note that an MMS (multiple measurement set) is no longer created by the pipeline if it is run in parallel mode (mpicasa); this has been the case for the past several pipeline versions. For calibration, there is currently no benefit to running the pipeline in parallel mode. Speedups with mpicasa will only be realized in the imaging portions of the pipeline.
Quality Assessment (QA) Scores
Each pipeline stage has a quality assessment (QA) score assigned to it. The values range from 0 to 1 where
0.9-1.0 Standard/Good (green color)
0.66-0.90 Below-Standard (blue color; also shown as a question mark symbol)
0.33-0.66 Warning (amber color; cross symbol)
0.00-0.33 Error (red color; cross symbol)
Currently the QA scores are not always reliable: a stage with a score of 1.0 might still contain bad data, and a stage with a low score could still be perfectly good. Thus, we currently recommend that all pipeline stages and the relevant products are checked. Below-standard and Warning scores should receive extra scrutiny. These scores often refer to large fractions of flagging that are applied to the data, but may also point out other issues where in fact additional flagging may be required. The QA section at the bottom of the weblog of each stage will provide more information about the particular origin of each score. Errors are usually very serious issues with the data or processing and should be resolved in any case.
Examples for QA scores are provided in the Pipeline CASAguide.
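The score ranges listed above can be expressed as a small lookup. This is an illustrative sketch in plain Python, not the pipeline's own implementation; the function name qa_category is hypothetical, and treating each lower bound as inclusive (for example, exactly 0.9 counts as Standard/Good) is an assumption, since the ranges quoted above share their boundary values.

```python
def qa_category(score):
    """Map a QA score in [0, 1] to the category names used in the weblog."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("QA scores range from 0 to 1")
    # Assumption: lower bounds are inclusive.
    if score >= 0.9:
        return "Standard/Good"
    if score >= 0.66:
        return "Below-Standard"
    if score >= 0.33:
        return "Warning"
    return "Error"


if __name__ == "__main__":
    for value in (1.0, 0.8, 0.5, 0.1):
        print(value, qa_category(value))
```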
Additional flagging
Using a flagging template
Although the pipeline attempts to remove most RFI, there are still many cases where additional flagging is required. The pipeline will then be re-started with the additional flags pre-applied.
The best way to do so is to inspect the data and to record all the required flags in a flagging template. See the CASA flagdata task help for the details on the format. Here is an example for a template:
mode='manual' scan='1~3' reason='bad_data' #flags scans 1 to 3
mode='clip' clipminmax=[0,10] reason='flag_outliers' #flags data outside an amplitude range
#here all amplitudes larger than 10 Jy
# this line will be ignored
mode='quack' scan='1~3,10~12' quackinterval=1.0 reason='remove_starting_integrations' #removes first second
#from scans 1-3 and 10-12
The most important modes are manual to flag given time ranges, antennas, spws, scans, fields, etc., and clip to flag data exceeding threshold amplitude levels.
Flagging templates can be saved in text files with any given name, e.g. 'myflags.txt'. In 'casa_pipescript.py', modify the parameters of the hifv_flagdata task: set template=True and add filetemplate='myflags.txt'. It is important to provide the reason field; otherwise the flags may not be applied. The reason field cannot contain spaces, but underscores are fine.
hifv_flagdata(hm_tbuff='1.5int', flagbackup=False, scan=True, fracspw=0.01, \
intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, \
*SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*',\
template=True, filetemplate='myflags.txt')
The default for filetemplate is 'mySDM.flagtemplate.txt'. Therefore, flag files with this name would not require filetemplate to be specified.
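Because a missing reason field, or one containing spaces, can cause flags to be skipped, it may be convenient to generate the template from a script that enforces this rule. The sketch below is plain Python that only writes a text file; the helper names format_flag_command and write_flag_template are hypothetical, and the output format simply follows the example template shown above.

```python
def format_flag_command(mode, reason, **params):
    """Build one template line such as: mode='manual' scan='1~3' reason='bad_data'."""
    if not reason or " " in reason:
        raise ValueError("reason must be non-empty and must not contain spaces")
    parts = [f"mode='{mode}'"]
    for key, value in params.items():
        if isinstance(value, str):
            parts.append(f"{key}='{value}'")
        else:
            # Remove spaces from e.g. [0, 10] so each key=value stays one token.
            parts.append(f"{key}={repr(value).replace(' ', '')}")
    parts.append(f"reason='{reason}'")
    return " ".join(parts)


def write_flag_template(path, commands):
    """Write one flag command per line to `path` (e.g. 'myflags.txt')."""
    with open(path, "w") as handle:
        for command in commands:
            handle.write(command + "\n")


if __name__ == "__main__":
    commands = [
        format_flag_command("manual", "bad_data", scan="1~3"),
        format_flag_command("clip", "flag_outliers", clipminmax=[0, 10]),
    ]
    for command in commands:
        print(command)
```

Calling write_flag_template('myflags.txt', commands) would then produce a file that can be passed to hifv_flagdata through filetemplate.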
Interactive Flagging
For some data it is not straightforward to derive flagging commands that can be placed in a template. In that case, one may use the interactive plotms or viewer/msview CASA GUIs to flag data directly in the MS. Re-execution of the pipeline (via the casa_pipescript.py file) will be possible, but a few steps require attention:
- Since Hanning smoothing was likely performed in the initial pipeline run, one should turn off Hanning smoothing for all re-executions (only if re-using the same MS). Otherwise the frequency resolution degrades more and more and flags will be extended to neighboring channels by smoothing an already flagged MS. To do so, comment out hifv_hanning in the 'casa_pipescript.py' file.
- By default, the pipeline will always revert all flags to their original state as saved in the 'mySDM.ms.flagversions' directory. It will thus ignore all modifications that were made afterwards. To avoid resetting all flags, one should manually flag the MS and place it in a new directory. Do NOT copy over the related 'mySDM.ms.flagversions' directory. Then run the pipeline with the flagged MS as input to hifv_importdata via the modified 'casa_pipescript.py' file (it is also possible to run with the XML recipe, but this will reapply Hanning smoothing). With this procedure the pipeline will not be able to recover the original flags and will proceed with the manual, interactive flags that the user has applied directly to the MS.
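Placing the manually flagged MS in a new directory without its flag backups can be done with a short script. This is a minimal sketch in plain Python using the standard library; the function name copy_ms_without_flagversions is hypothetical, and it assumes the MS is closed in CASA while it is being copied.

```python
import shutil
from pathlib import Path


def copy_ms_without_flagversions(ms_path, new_dir):
    """Copy an MS directory into new_dir, leaving <ms>.flagversions behind."""
    ms_path = Path(ms_path)
    target_dir = Path(new_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / ms_path.name
    # Only the MS itself is copied; the sibling 'mySDM.ms.flagversions'
    # directory is deliberately not touched or copied.
    shutil.copytree(ms_path, target)
    return target
```

For example, copy_ms_without_flagversions('mySDM.ms', '../rerun') would create '../rerun/mySDM.ms', which can then be given to hifv_importdata in the modified script.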
Flag the Flux Calibration Gain Table
Sometimes the gain solutions of the flux calibrator are not good for all antennas, even with the rejection performed by hifv_fluxboot. It is possible to flag the solutions directly in the 'fluxgaincal.g' calibration table. The CASA task plotms can be used for the flagging. The pipeline can use the flagged table, say 'fluxgaincal_edited.g' for the flux calibration by modifying the hifv_fluxboot call in the 'casa_pipescript.py' as follows:
hifv_fluxboot(caltable='fluxgaincal_edited.g')
Specifying or Avoiding a Specific Reference Antenna
In some cases, the pipeline may choose a reference antenna that is not ideal for calibration purposes. The pipeline algorithm picks an antenna (actually a ranked list of antennas) that is not heavily flagged and that is close to the center of the array. Other problems, e.g. phase jumps or bad deformatters that are not caught in the hifv_testBPdcals stage, may still be present on the reference antenna and will then be reflected in all solutions. When this happens, it is advisable to tell the pipeline not to use a specific antenna as a reference antenna, or to supply the ranked list for the pipeline to use.
This can be achieved by setting the parameters refant or refantignore that are available in the tasks hifv_testBPdcals, hifv_semiFinalBPdcals, hifv_solint, hifv_fluxboot, and hifv_finalcals.
For example, to specify the ranked list of reference antennas to use, the casa_pipescript.py would be modified as shown below:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['mySDM'])
    hifv_hanning()
    hifv_flagdata(hm_tbuff='1.5int', fracspw=0.01,
                  intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, '
                          '*SYSTEM_CONFIGURATION*,*UNSPECIFIED#UNSPECIFIED*')
    hifv_vlasetjy()
    hifv_priorcals()
    hifv_syspower()
    hifv_testBPdcals(refant='ea01,ea02,ea03')
    hifv_checkflag(checkflagmode='bpd-vla')
    hifv_semiFinalBPdcals(refant='ea01,ea02,ea03')
    hifv_checkflag(checkflagmode='allcals-vla')
    hifv_solint(refant='ea01,ea02,ea03')
    hifv_fluxboot(refant='ea01,ea02,ea03')
    hifv_finalcals(refant='ea01,ea02,ea03')
    hifv_applycals()
    hifv_checkflag(checkflagmode='target-vla')
    hifv_statwt()
    hifv_plotsummary()
    hif_makeimlist(intent='PHASE,BANDPASS', specmode='cont')
    hif_makeimages(hm_masking='centralregion')
finally:
    h_save()
If, on the other hand, it is preferable for the pipeline to choose the reference antenna, but with one or more antennas excluded from consideration, the casa_pipescript.py would be modified as shown below:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['mySDM'])
    hifv_hanning()
    hifv_flagdata(hm_tbuff='1.5int', fracspw=0.01,
                  intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, '
                          '*SYSTEM_CONFIGURATION*,*UNSPECIFIED#UNSPECIFIED*')
    hifv_vlasetjy()
    hifv_priorcals()
    hifv_syspower()
    hifv_testBPdcals(refantignore='ea28,ea07')
    hifv_checkflag(checkflagmode='bpd-vla')
    hifv_semiFinalBPdcals(refantignore='ea28,ea07')
    hifv_checkflag(checkflagmode='allcals-vla')
    hifv_solint(refantignore='ea28,ea07')
    hifv_fluxboot(refantignore='ea28,ea07')
    hifv_finalcals(refantignore='ea28,ea07')
    hifv_applycals()
    hifv_checkflag(checkflagmode='target-vla')
    hifv_statwt()
    hifv_plotsummary()
    hif_makeimlist(intent='PHASE,BANDPASS', specmode='cont')
    hif_makeimages(hm_masking='centralregion')
finally:
    h_save()
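Because the same refant or refantignore value has to be added to five task calls, the edit can be applied to the script text programmatically. The sketch below is only an illustration: add_refant_option is a hypothetical helper, and it assumes the five calls appear with empty argument lists (for example hifv_solint()) in the script being edited.

```python
import re

# Tasks that accept refant / refantignore, as listed in this guide.
REFANT_TASKS = (
    "hifv_testBPdcals",
    "hifv_semiFinalBPdcals",
    "hifv_solint",
    "hifv_fluxboot",
    "hifv_finalcals",
)


def add_refant_option(script_text, option, antennas):
    """Insert refant=... or refantignore=... into the five task calls.

    Only calls written with an empty argument list are rewritten;
    all other lines are returned unchanged.
    """
    if option not in ("refant", "refantignore"):
        raise ValueError("option must be 'refant' or 'refantignore'")
    value = ",".join(antennas)
    pattern = re.compile(r"\b(%s)\(\)" % "|".join(REFANT_TASKS))
    return pattern.sub(lambda m: f"{m.group(1)}({option}='{value}')", script_text)


original = "hifv_solint()\nhifv_applycals()\n"
patched = add_refant_option(original, "refantignore", ["ea28", "ea07"])
```

Here only the hifv_solint line is changed; hifv_applycals, which does not take these parameters, is left alone.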
Gain Compression Correction
The pipeline now allows for gain compression correction to be applied for VLA bands using the hifv_syspower task. This task was originally developed for the VLA Sky Survey and has been adapted to work with general VLA observations. The corrections should not be applied to data taken with the 3-bit samplers and the corrections are not recommended for K, Ka, and Q-band data regardless of whether the data were taken with the 3- or 8-bit samplers. The plots created in the pipeline for 3-bit data and K, Ka, and Q-bands are informational only, and can inform whether the flux scale may be significantly inaccurate (variable power differences (Pdiff) throughout observations or systematic differences between the flux calibrator and the complex gain calibrator). The Pdiff measurements are normalized relative to the measurements taken on the flux density calibrator.
Receiver gain compression occurs when the input power of an amplifier is increased to a level that reduces the gain of the amplifier and causes a nonlinear increase in output power. This can typically occur when strong RFI is present and the receiver will record the observed flux densities as being lower than they actually are. The task makes use of the switched power information available from the telescopes to determine if the receiver gains are being compressed. The gain compression correction will apply a scaling to the data in order to mitigate the effect of compression. By default the gain compression correction is not applied, but all the plots are made to assess whether the correction should be applied.
By default, the correction is performed for power differences (Pdiffs) between 0.7 and 1.2; Pdiff values outside that range (and subsequently the corresponding data) are flagged. Expanding the range of Pdiffs corrected is not expected to result in a more accurate correction.
To apply the correction, one needs to re-run the pipeline and set the argument apply=True for hifv_syspower.
hifv_syspower(apply=True)
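The default Pdiff range can be pictured with a small sketch that only separates values eligible for correction from values that would be flagged. It does not reproduce the scaling applied by hifv_syspower; classify_pdiff is a hypothetical helper, and treating the range limits as inclusive is an assumption.

```python
def classify_pdiff(pdiff_values, clip_range=(0.7, 1.2)):
    """Return True for Pdiff values inside the correctable range.

    Values marked False lie outside the range and would be flagged.
    Whether the limits themselves are inclusive is an assumption here.
    """
    low, high = clip_range
    return [low <= value <= high for value in pdiff_values]


in_range = classify_pdiff([0.65, 0.90, 1.00, 1.25])
```

In this toy input the first and last values fall outside the 0.7 to 1.2 range and would therefore be flagged rather than corrected.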
There are known instances where the switched power data, and hence the Pdiff data, are bad for a given antenna. This can be recognized by Pdiff plots that look relatively constant or vary slowly in time but have values > 1 or < 1, or that are extremely noisy relative to other antennas. When switched power data are bad, the antexclude parameter can be used in conjunction with the usemedian parameter. The format for antexclude is a Python dictionary of the form {'Band' : {'eaXX' : {'usemedian': True or False}}}. This enables usemedian to be specified on a per-antenna and per-band basis. If usemedian=True, the median Pdiff from all antennas will be used for the antenna specified in antexclude. If usemedian=False, then the Pdiff values will be set to 1.0 for the specified antenna.
If there are bands where the syspower corrections are not desired, the do_not_apply parameter is available; specify the bands where syspower corrections should not be applied in a comma-separated string. See example below:
# Single Band Example
hifv_syspower(clip_sp_template=[0.8,1.3],
              antexclude={'S':{'ea02':{'usemedian':True},
                               'ea03':{'usemedian':False}}},
              apply=True)
# Multi-band Example (S, C, X and K bands)
hifv_syspower(clip_sp_template=[0.7,1.2],
              antexclude={'S':{'ea02':{'usemedian':True},
                               'ea03':{'usemedian':False}},
                          'C':{'ea02':{'usemedian':True},
                               'ea12':{'usemedian':False}}},
              apply=True, do_not_apply='K,X')
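The effect of the usemedian setting can be illustrated with a toy calculation for one band. This is a simplified sketch, not the pipeline's implementation: resolve_pdiff is a hypothetical helper, each antenna is reduced to a single invented Pdiff value instead of a time series, and the median is assumed to be taken over all antennas including the excluded ones.

```python
from statistics import median


def resolve_pdiff(pdiff_by_antenna, antexclude_for_band):
    """Apply antexclude rules for one band to per-antenna Pdiff values.

    usemedian=True  -> replace the antenna's Pdiff with the array median
    usemedian=False -> set the antenna's Pdiff to 1.0
    """
    array_median = median(pdiff_by_antenna.values())
    resolved = dict(pdiff_by_antenna)
    for antenna, options in antexclude_for_band.items():
        resolved[antenna] = array_median if options["usemedian"] else 1.0
    return resolved


antexclude = {'S': {'ea02': {'usemedian': True}, 'ea03': {'usemedian': False}}}
# Invented values: ea02 and ea03 stand in for antennas with bad switched power.
pdiff = {'ea01': 0.95, 'ea02': 1.60, 'ea03': 0.40, 'ea04': 0.97, 'ea05': 0.93}
resolved = resolve_pdiff(pdiff, antexclude['S'])
```

With these numbers ea02 takes the array median and ea03 is set to 1.0, while the remaining antennas keep their own values.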
Total Electron Content (TEC) Application
This version of CASA and the pipeline restores the ability to retrieve the TEC information for VLA observations after August 7, 2023. The VLA pipeline plots the TEC values at the time of observation, if the TEC data are available when the pipeline is executed. This is controlled by the show_tec_maps parameter whose default is True. The TEC data are not applied by default, but can be applied using the parameter apply_tec_correction, whose default is False, but can be set to True in a pipeline re-run.
hifv_priorcals(apply_tec_correction=True)
The application of TEC corrections is experimental and should be used with caution. TEC corrections are likely to primarily benefit L-band and possibly some S-band data if the ionosphere is very active. TEC data generally become available within ~2 days after an observation.
Modifying the Pipeline for non-Stokes I Continuum Data
The pipeline is developed for the Stokes I continuum case. But it is possible to modify the 'casa_pipescript.py' and run the pipeline for other use cases:
Spectral Line
Optimizations for spectral line processing have been introduced with this release of the pipeline, with the caveat that it is only obvious that spectral line observations are intended when spectral windows have more channels than the default continuum setups. Thus, the pipeline can detect setups that are optimized for cases such as Galactic line science and some atomic hydrogen science, but line science that uses the bandwidth of WIDAR at the standard spectral resolution cannot be identified from the data properties alone. Regardless of the pipeline processing heuristics, it is also imperative that the data are taken with bandpass and complex gain calibrators that are bright enough that the bandpass solutions do not add noise to the data and that each spectral window can be calibrated individually. Future versions of the pipeline may include smoothing of the bandpass solutions and spectral window mapping to perform better in situations where the calibrators have low S/N.
Most of the spectral line functionality is handled behind the scenes by the pipeline, and there are no additional stages in the calibration portion of the pipeline. If narrow spectral windows are detected by the pipeline, they will go through slightly different processing than the spectral windows that are meant for continuum science. The spectral line windows are identified automatically by the pipeline in the hifv_importdata task. If additional specificity is needed, the spectral line spectral windows can be manually identified with the specline_spws parameter, where the line spectral windows are provided as a comma-separated string. The pipeline default is 'auto'; other valid options are 'none', 'all', or, for example, '2~3,6,7,9~12'. The 'auto' setting will treat windows narrower than a standard continuum setup as intended for spectral line science, 'none' will set all spectral windows to be processed as continuum, 'all' will process all spectral windows as spectral line, and the final example will process spectral windows 2, 3, 6, 7, 9, 10, 11, and 12 as spectral line windows.
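To double-check which windows a specline_spws string selects, the range syntax can be expanded with a few lines of Python. This is an illustrative sketch only; expand_spw_selection is a hypothetical helper that handles explicit lists such as '2~3,6,7,9~12' and not the keywords 'auto', 'all', or 'none'.

```python
def expand_spw_selection(selection):
    """Expand a string like '2~3,6,7,9~12' into a list of spw IDs.

    'a~b' denotes the inclusive range a..b; items are comma-separated.
    """
    spws = []
    for item in selection.split(","):
        item = item.strip()
        if "~" in item:
            start, end = (int(part) for part in item.split("~"))
            spws.extend(range(start, end + 1))
        else:
            spws.append(int(item))
    return spws


line_spws = expand_spw_selection('2~3,6,7,9~12')
```

The resulting list contains spws 2, 3, 6, 7, 9, 10, 11, and 12, matching the example in the text.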
Spectral line windows do not have Hanning smoothing applied to them in the hifv_hanning stage, unless they are likely to contain a well-known maser line, in which case Hanning smoothing is applied to avoid potential Gibbs ringing if the line is very bright. The data are then processed as normal, going through all the same stages as continuum calibration does. The pipeline weblog in hifv_testBPdcals, hifv_semiFinalBPdcals, and hifv_finalcals will show individual plots for each spectral line spectral window. In the first two tasks, these are shown for the bandpass calibrator data as well as for the bandpass amplitude and phase solutions. In hifv_finalcals, the same bandpass solution plots are shown, in addition to the Amplitude vs. Time and Phase vs. Time plots for the complex gain calibrator. These additional plots make it easier to see whether the calibration solutions are acceptable for the spectral line windows. Then, before hifv_checkflag (RFI flagging) is run on the target source(s), the pre-RFI flags are backed up, and RFI flagging and hifv_statwt are run on all spectral windows. This sets the weights properly for all the spectral windows, and the impact of bright spectral lines on the weights is mitigated because they are typically flagged out by the RFI flagging.
Following the completion of the calibration pipeline steps, the task hifv_mstransform is run to split off the science targets from the main MS. Rather than producing a single MS of science targets, this new task creates two MSes. First, it splits out the RFI-flagged data into a file named mySDM_targets_cont.ms, which implies that the data from this MS will be used for continuum imaging. Then, hifv_mstransform restores the flagging state of the main MS to the pre-RFI-flagged state and splits out only the spectral line windows to mySDM_targets.ms; these windows have their relative weights set properly by statwt, while any flagging applied by the RFI autoflaggers has been removed, leaving the line data uncorrupted. Note that this new mySDM_targets.ms is distinct from what was created in previous pipeline versions, since it only contains non-RFI-flagged windows identified as spectral line.
An obvious caveat with data that do not have RFI flagging applied is that there could be RFI contaminating a given spectral window. Users should be cautious about such impacts; RFI will typically appear as bright fringing patterns across the field of view in images. While it is not optimal to provide users with this kind of data, most Galactic line science at higher frequencies is not likely to have RFI contamination. We plan to make use of a new RFI flagging algorithm in CASA that uses gridded visibility data, called msuvbinflag, when available (EVLA Memo 198).
While our focus of flag handling is on the RFI flagging of the targets, users should also be aware that the RFI flagging of the bandpass and complex gain calibrators can also affect their line data. Regions of the spectrum flagged-out from the Bandpass calibrator will leave a gap by default as the pipeline does not use parameters like fillgaps at present.
The 'cont.dat' file
Another possibility is to inform the pipeline where the spectral lines are ahead of time using a file called 'cont.dat', which specifies the frequency ranges that contain only continuum (no spectral line). This will protect the ranges not specified in the file from being flagged for RFI when hifv_checkflag is run on the target data, and the statistical weights calculated by hifv_statwt are determined using only the data that do not contain the spectral line. The unspecified ranges are protected because only the ranges listed in the file are designated as continuum and line free.
This file is made by the task hif_findcont, but that task is run as part of the imaging pipeline after hifv_checkflag and hifv_statwt have run. The cont.dat approach could be useful for a pipeline re-run using an edited 'cont.dat' if, for example, there is RFI in the spectral window, away from the spectral line, that a user would like removed.
The 'cont.dat' file has the following format:
Field: FIELDNAME1
SpectralWindow: SPWID1
freqrange1 LSRK (in GHz, LSRK)
freqrange2 LSRK (in GHz, LSRK)
SpectralWindow: SPWID2
freqrange1 LSRK (in GHz, LSRK)
freqrange2 LSRK (in GHz, LSRK)
...
Field: FIELDNAME2
SpectralWindow: SPWID1
freqrange1 LSRK (in GHz, LSRK)
freqrange2 LSRK (in GHz, LSRK)
...
where FIELDNAMEx is the field name for each source. This provides the flexibility to define different continuum ranges for different targets. SPWIDn stands for the spw ID. Field names and spw ids can be found in the listobs output. An example with fields M82 and NGC3077 may look like:
Field: M82
SpectralWindow: 19
37.104~38.29GHz LSRK
38.30~39.104GHz LSRK
SpectralWindow: 37
31.360~32.123GHz LSRK
32.130~33.360GHz LSRK
Field: NGC3077
SpectralWindow: 37
31.360~32.123GHz LSRK
32.130~33.360GHz LSRK
For the field M82 this file defines spw 19 frequency ranges 37.104 – 38.290 GHz and 38.300 – 39.104 GHz as containing only continuum. This would be a setup where the line is found in the 10 MHz between 38.290 – 38.300 GHz. It also treats frequencies below 37.104 GHz and above 39.104 GHz the same as spectral lines. This can be used, for example, to exclude edge channels from being part of the autoflagging and weight calculations. Analogously, a spectral line falling in the 7 MHz range between 32.123 – 32.130GHz (between the two continuum ranges in spw 37) will be protected by the specification for spw 37 for both the M82 and NGC3077 fields.
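The protected line ranges are simply the gaps between consecutive continuum ranges, which can be verified numerically. The following sketch uses the M82 values quoted above; line_gaps_mhz is a hypothetical helper, and rounding to 0.001 MHz is an arbitrary choice to suppress floating-point noise.

```python
def line_gaps_mhz(continuum_ranges_ghz):
    """Return (gap_start_GHz, gap_end_GHz, width_MHz) between continuum ranges."""
    ordered = sorted(continuum_ranges_ghz)
    gaps = []
    for (_, prev_high), (next_low, _) in zip(ordered, ordered[1:]):
        if next_low > prev_high:
            width_mhz = round((next_low - prev_high) * 1000.0, 3)
            gaps.append((prev_high, next_low, width_mhz))
    return gaps


spw19_gaps = line_gaps_mhz([(37.104, 38.29), (38.30, 39.104)])
spw37_gaps = line_gaps_mhz([(31.360, 32.123), (32.130, 33.360)])
```

For spw 19 this gives a single 10 MHz gap starting at 38.29 GHz, and for spw 37 a single 7 MHz gap starting at 32.123 GHz, in agreement with the description.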
The 'cont.dat' must be placed in the root directory where the SDM-BDF resides and where the pipeline is executed. The pipeline will automatically pick up the file; there is no need to explicitly provide the file name in 'casa_pipescript.py' because the tasks that use the file already look for it.
Note that if you use the cont.dat file, only the fields and spws that appear in the cont.dat will have RFI flagging and statwt run on them. So, if you have 64 spws, and only one is specified in the cont.dat, then the remaining 63 spws will not have RFI flagging applied or their weights recalculated. So one must specify continuum ranges for all spectral windows to properly calibrate an entire dataset using the cont.dat.
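Since every spw must be listed for a complete calibration, it can help to generate the file from a dictionary and check coverage before running the pipeline. This is a minimal sketch: format_cont_dat and missing_spws are hypothetical helpers, and writing one entry per line without blank separator lines is an assumption about the accepted layout.

```python
def format_cont_dat(continuum):
    """Render {field: {spw_id: [(low_GHz, high_GHz), ...]}} as cont.dat text."""
    lines = []
    for field, spws in continuum.items():
        lines.append(f"Field: {field}")
        for spw_id, ranges in spws.items():
            lines.append(f"SpectralWindow: {spw_id}")
            for low, high in ranges:
                lines.append(f"{low}~{high}GHz LSRK")
    return "\n".join(lines)


def missing_spws(continuum, all_spw_ids):
    """List, per field, the spws that have no continuum range defined."""
    return {
        field: sorted(set(all_spw_ids) - set(spws))
        for field, spws in continuum.items()
    }


continuum = {
    "M82": {19: [(37.104, 38.29), (38.30, 39.104)]},
}
text = format_cont_dat(continuum)
uncovered = missing_spws(continuum, all_spw_ids=[19, 37])
```

In this example spw 37 is reported as uncovered for M82, so it would receive neither RFI flagging nor recalculated weights until continuum ranges are added for it.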
Previous Pipeline Modifications for Spectral Line Data
The following advice has been given previously for modifying the continuum pipeline to better process spectral lines. While the pipeline can handle these cases on its own for the most part now, we leave this information here for reference.
- hifv_hanning: Hanning smoothing lessens the Gibbs ringing from strong spectral features, usually strong, narrow RFI, or very strong spectral lines such as masers. Hanning smoothing, however, reduces the spectral resolution. Therefore, depending on the data and the science case, one may or may not choose to apply Hanning smoothing. To disable the application of Hanning-smoothing in the pipeline, simply comment out hifv_hanning or remove the step from 'casa_pipescript.py'. Also, a single edge-channel is always flagged by the hanning smoothing task procedure.
- hifv_flagdata: The pipeline, by default, flags 1% of the data on each spw edge as well as the first and last 20MHz of the lowest and highest frequency (assuming that those are the baseband edges). In some cases, for example spectral surveys, lines may fall right on such frequencies. The edgespw, fracspw, and baseband parameters in hifv_flagdata can be adjusted to flag different percentages of the edges. Note that if edgespw=True, it will always flag at least one edge channel, no matter how small fracspw is.
- hifv_checkflag(checkflagmode='bpd-vla'): While it is advisable to always run the RFI flaggers on the bandpass calibrators, users should be aware that a flag growth used in this mode will cause extra edge channels to be flagged if edge-channel flagging is used.
- hifv_checkflag(checkflagmode='target-vla'): Flagging prior to this step was only applied to the calibrator scans, which should be line-free. But hifv_checkflag attempts to auto-flag all fields including target fields. The rflag mode in CASA's flagdata is designed to remove outliers that deviate from a mean level. Strong spectral lines can fulfill this criterion and be flagged. The 'cont.dat' file will ensure that rflag will only be applied to the continuum frequency ranges specified in it. We recommend manual flagging for the spectral line frequency ranges after the pipeline has finished processing.
- hifv_statwt: A similar argument applies to the hifv_statwt step, where the visibilities are weighted by the square of the inverse of their RMS noise. Strong spectral lines will increase the RMS and will therefore be down-weighted. The cont.dat file will restrict statwt to only use the continuum frequency ranges for the rms and weight calculations and thus prevent the inclusion of spectral features. Alternatively, hifv_statwt can be excluded from the pipeline altogether and the CASA task statwt can be executed manually after the pipeline has finished, where statwt's parameter fitspw should be set to continuum channels only.
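The down-weighting effect described for hifv_statwt can be seen in a toy calculation in which the weight is the inverse square of the scatter of a set of samples. This is only an illustration with invented numbers; weight_from_rms is a hypothetical helper that uses the population standard deviation as the RMS estimate, and it does not reproduce how statwt bins or selects visibilities.

```python
from statistics import pstdev


def weight_from_rms(samples):
    """Toy weight: inverse square of the scatter of the samples."""
    rms = pstdev(samples)
    return 1.0 / rms**2


# Invented amplitudes: noise-like continuum channels, then a bright line.
continuum_only = [0.02, -0.01, 0.00, 0.01, -0.02, 0.01]
with_line = continuum_only + [0.50, 0.65, 0.55]

weight_continuum = weight_from_rms(continuum_only)
weight_with_line = weight_from_rms(with_line)
```

Including the line channels inflates the scatter and lowers the weight, which is why restricting the calculation to continuum ranges preserves the proper weights.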
Given the above, we recommend using the spectral line processing capability of the pipeline, adjusting specific parameters as needed, rather than relying on the cont.dat file or omitting pipeline stages.
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['mySDM'], specline_spws='auto')  # specline_spws can be set to auto, all, none, or specific spw IDs
    hifv_hanning()  # maser_detection can be set to False (default True)
    hifv_flagdata(fracspw=0.01,
                  intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, '
                          '*SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*',
                  hm_tbuff='1.5int')
    hifv_vlasetjy()
    hifv_priorcals()
    hifv_syspower()
    hifv_testBPdcals()
    hifv_checkflag(checkflagmode='bpd-vla')
    hifv_semiFinalBPdcals()
    hifv_checkflag(checkflagmode='allcals-vla')
    hifv_solint()
    hifv_fluxboot()
    hifv_finalcals()
    hifv_applycals()
    hifv_checkflag(checkflagmode='target-vla')
    hifv_statwt()
    hifv_plotsummary()
    hif_makeimlist(intent='PHASE,BANDPASS', specmode='cont')
    hif_makeimages(hm_masking='centralregion')
    hifv_mstransform()  # creates the mySDM_targets_cont.ms (continuum) and
                        # mySDM_targets.ms (non-RFI-flagged line MS)
finally:
    h_save()
If a spectral line happens to be close to edge channels, one can decide to turn off edge channel flagging by setting the parameter edgespw=False in hifv_flagdata (if the line falls on the edge of a baseband, one may also consider to set baseband=False to avoid flagging the outer 10 baseband edges):
hifv_flagdata(intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*,\
*UNKNOWN*,*SYSTEM_CONFIGURATION*,\
*UNSPECIFIED#UNSPECIFIED*', fracspw=0.01, baseband=False, edgespw=False,\
hm_tbuff='1.5int')
or one can choose to reduce the fraction of edge channels being flagged. In the example below, the fraction is set to 1% on each end of each spw (note that this is now the pipeline default; the default was previously 5%):
hifv_flagdata(intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*,\
*UNKNOWN*, *SYSTEM_CONFIGURATION*,\
*UNSPECIFIED#UNSPECIFIED*',\
fracspw=0.01, baseband=True, edgespw=True,\
hm_tbuff='1.5int',)
But note that including edge channels in the calibration may introduce uncertainties given that spw edges have low signal-to-noise and may contain correlator artifacts. Inspect the data to ensure the spw edges are usable.
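The number of channels affected by a given fracspw can be estimated before editing the script. The sketch below assumes the count per edge is the truncated product of fracspw and the channel count, with a minimum of one channel when edgespw=True; the exact rounding used by the pipeline is not stated here, and edge_channels_to_flag is a hypothetical helper.

```python
def edge_channels_to_flag(nchan, fracspw=0.01, edgespw=True):
    """Estimate channels flagged at EACH edge of a spectral window.

    Assumes truncation of nchan * fracspw, with at least one channel
    flagged whenever edgespw is True.
    """
    if not edgespw:
        return 0
    return max(1, int(nchan * fracspw))


per_edge_64 = edge_channels_to_flag(64)      # typical continuum spw
per_edge_2048 = edge_channels_to_flag(2048)  # a high-resolution line spw
```

Under these assumptions a 64-channel spw loses one channel per edge, while a 2048-channel spw loses 20 per edge.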
Once desired modifications are made, run the pipeline as:
# In CASA
execfile('casa_pipescript.py')
Polarization Calibration
The VLA pipeline does not currently derive or apply polarization calibration automatically. This is mainly due to the lack of spatial models for polarization calibrators, which are needed to ensure that the calibration of the polarization angles is accurate when the angle calibrator is resolved. If the scan intents include proper polarization intents (CALIBRATE_POL_LEAKAGE and CALIBRATE_POL_ANGLE), the pipeline will run with refantmode='strict', which is desirable for polarization calibration. The user may perform polarization calibration steps after the pipeline has run, using the pipeline calibration tables for pre-calibration as required.
Polarization calibration steps are explained in the 3C75 CASA guide (in particular, crosshand delay, the D-term (leakage), and polarization angle will be required). We also refer to the corresponding chapter in CASAdocs.
New functionality was introduced in the CASA 6.6.1 pipeline to better allow users to make use of the polarization calibration infrastructure that was created for VLASS, but which can generally be applied to most data with properly set polarization intents. If a user knows the polarization properties of their polarization angle calibrator (perhaps fitted from the tabulated calibrator polarization information), they can insert a setjy command into casa_pipescript.py after hifv_finalcals, and insert hifv_circfeedpolcal with the parameter run_setjy=False into their script before hifv_applycals, as in the following example.
...
hifv_finalcals()
# Reference Frequency for fit values
reffreq = '33.0GHz'
# Stokes I flux density
I = 1.4953
# Spectral Index
alpha = [-0.7512, 0.1885]
# Polarization Fraction (fractional polarization)
polfrac = [0.0412126, 0.02973067, -0.00598331]
# Polarization Angle (Radians)
polangle = [ 1.48599775, 0.37284829, -0.99768313]
setjy(vis='mySDM.ms', field='0542+498=3C147', spw='2~65',
standard="manual", fluxdensity=[I,0,0,0], spix=alpha, reffreq=reffreq,
polindex=polfrac, polangle=polangle, rotmeas=0, usescratch=True)
hifv_circfeedpolcal(run_setjy=False)
hifv_applycals()
...
This will omit the setting of the Stokes IQUV properties for the calibrator that is internal to hifv_circfeedpolcal and only valid for 3C138 and 3C286 at S-band. However, if observing in S-band using either of those sources as the angle calibrator, then hifv_circfeedpolcal may work without the additional setjy call prior to the task and without the supplied argument.
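Before inserting fitted coefficients into a setjy call, it can be useful to evaluate the model at a few frequencies as a sanity check. The sketch below assumes the conventions described in the CASA setjy documentation: the spectral index coefficients form a polynomial in log10(f/reffreq) that is used as the exponent of (f/reffreq), and polindex and polangle are polynomials in (f - reffreq)/reffreq. The helper evaluate_model is hypothetical, and rotation measure is ignored (rotmeas=0 in the example).

```python
import math


def evaluate_model(freq_ghz, reffreq_ghz, stokes_i, spix, polindex, polangle):
    """Evaluate Stokes I, Q, U of a polarized calibrator model at one frequency."""
    ratio = freq_ghz / reffreq_ghz
    x = (freq_ghz - reffreq_ghz) / reffreq_ghz

    # Spectral index polynomial in log10(f / reffreq), used as an exponent.
    exponent = sum(c * math.log10(ratio) ** k for k, c in enumerate(spix))
    flux_i = stokes_i * ratio ** exponent

    # Polarization fraction and angle (radians) as polynomials in x.
    frac = sum(c * x ** k for k, c in enumerate(polindex))
    angle = sum(c * x ** k for k, c in enumerate(polangle))

    flux_q = flux_i * frac * math.cos(2.0 * angle)
    flux_u = flux_i * frac * math.sin(2.0 * angle)
    return flux_i, flux_q, flux_u


# Coefficients taken from the setjy example in this guide.
i_ref, q_ref, u_ref = evaluate_model(
    33.0, 33.0, 1.4953,
    spix=[-0.7512, 0.1885],
    polindex=[0.0412126, 0.02973067, -0.00598331],
    polangle=[1.48599775, 0.37284829, -0.99768313],
)
```

At the reference frequency the model reduces to the zeroth-order coefficients, so the Stokes I value and the linearly polarized flux sqrt(Q^2 + U^2) should match I and I times the first polindex entry.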
Weak Calibrators
The VLA pipeline requires a minimum signal-to-noise of ~3 for each spw (each channel for the bandpass) and calibrator scan. Failure to integrate sufficiently on calibrator sources will lead to sub-optimal reductions with the pipeline. Users are advised to closely follow the Guide to Observing with the VLA to avoid issues due to faint calibrators whether using the pipeline or manual reduction.
Partially Frequency-Averaged Data
The pipeline invokes Hanning smoothing when frequency averaging at the VLA correlator is turned off (note that online frequency averaging is no longer offered). In that case, Hanning smoothing can fix the Gibbs ringing for strong narrow spectral features (e.g., RFI or maser lines). When the online frequency averaging is turned on, however, adjacent channels no longer exhibit the typical Gibbs zig-zag pattern and Hanning smoothing should not be applied for further processing. The pipeline currently detects whether any spectral window was pre-averaged in frequency on the VLA correlator. In that case, it turns the Hanning smoothing off for all data. If one has an SDM where some spws are frequency averaged and some are not, the pipeline should be stopped after hifv_importdata. The frequency averaged and non-averaged spws should then be separated into individual MSes using the CASA task 'split'. The pipeline can be started again on the individual MSes.
Incorrect scan intents
As mentioned in the "Pipeline Requirements", scan intents tell the pipeline which scans and fields are used for flux, delay, bandpass, gain and phase calibration. Scan intents should be set up correctly in the OPT before submitting the scheduling block for observation.
When incorrect scan intents are identified after observations, one can still change the SDM-BDF with updated scan intents, although some care will be required.
The SDM-BDF metadata is structured in the form of XML files that can be edited. We provide a small Scan Intent Editing Perl Script to do this. The script is pretty self-explanatory and can add and delete scan intents to any scan.
Alternatively, the SDM can also be manually edited. Great care, however, should be taken not to corrupt the structure of the SDM-BDF/xml. We therefore advise not to edit the SDM-BDF/xml manually but to use the Perl script instead.
However, to edit the xml manually, cd into the SDM and edit the file 'Scan.xml'. We strongly recommend creating a backup copy of the 'Scan.xml' file in case the edits corrupt the metadata.
'Scan.xml' is divided into individual <row></row> blocks that identify each scan.
An example of a scan with a single scan intent (here: OBSERVE_TARGET) may look like:
<row>
<scanNumber>1</scanNumber>
<startTime>4870732142800000000</startTime>
<endTime>4870732322300000256</endTime>
<numIntent>1</numIntent>
<numSubscan>1</numSubscan>
<scanIntent>1 1 OBSERVE_TARGET</scanIntent>
<calDataType>1 1 NONE</calDataType>
<calibrationOnLine>1 1 false</calibrationOnLine>
<sourceName>J1041+0610</sourceName>
<flagRow>false</flagRow>
<execBlockId>ExecBlock_0</execBlockId>
</row>
We can now change the scan intent, e.g., from OBSERVE_TARGET to CALIBRATE_AMPLI by simply updating the <scanIntent> tag:
<row>
<scanNumber>1</scanNumber>
<startTime>4870732142800000000</startTime>
<endTime>4870732322300000256</endTime>
<numIntent>1</numIntent>
<numSubscan>1</numSubscan>
<scanIntent>1 1 CALIBRATE_AMPLI</scanIntent>
<calDataType>1 1 NONE</calDataType>
<calibrationOnLine>1 1 false</calibrationOnLine>
<sourceName>J1041+0610</sourceName>
<flagRow>false</flagRow>
<execBlockId>ExecBlock_0</execBlockId>
</row>
If we want to add a second intent, we will have to make additional changes. Let's add CALIBRATE_PHASE:
<row>
<scanNumber>1</scanNumber>
<startTime>4870732142800000000</startTime>
<endTime>4870732322300000256</endTime>
<numIntent>2</numIntent>
<numSubscan>1</numSubscan>
<scanIntent>1 2 CALIBRATE_AMPLI CALIBRATE_PHASE</scanIntent>
<calDataType>1 2 NONE NONE</calDataType>
<calibrationOnLine>1 2 false false</calibrationOnLine>
<sourceName>J1041+0610</sourceName>
<flagRow>false</flagRow>
<execBlockId>ExecBlock_0</execBlockId>
</row>
Inside <scanIntent> we added the second intent, but also increased the second number from 1 to 2. In addition, we specified <numIntent> to be 2, and added a second entry to <calDataType> and <calibrationOnLine>. For the latter two, we also updated the second number from 1 to 2.
Analogously, if we now add a third intent, CALIBRATE_BANDPASS, to the same scan, the <row> will look like:
<row>
<scanNumber>1</scanNumber>
<startTime>4870732142800000000</startTime>
<endTime>4870732322300000256</endTime>
<numIntent>3</numIntent>
<numSubscan>1</numSubscan>
<scanIntent>1 3 CALIBRATE_AMPLI CALIBRATE_PHASE CALIBRATE_BANDPASS</scanIntent>
<calDataType>1 3 NONE NONE NONE</calDataType>
<calibrationOnLine>1 3 false false false</calibrationOnLine>
<sourceName>J1041+0610</sourceName>
<flagRow>false</flagRow>
<execBlockId>ExecBlock_0</execBlockId>
</row>
Check with CASA's listobs on the imported MS (after importing the data to an MS via importasdm) if the scan intents are now displayed as desired. Revert back to the original 'Scan.xml' if the above was not successful and contact the NRAO helpdesk for advice.
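The bookkeeping described above (updating numIntent and the counts and entries in scanIntent, calDataType, and calibrationOnLine together) can be expressed as a short Python sketch using the standard library. This is not the Perl script mentioned earlier: set_scan_intents is a hypothetical helper, it assumes every intent pairs with a calDataType of NONE and a calibrationOnLine value of false as in the examples, and it operates on a single <row> element only. Writing a complete 'Scan.xml' back to disk with ElementTree may alter the file header or formatting, so keep a backup copy.

```python
import xml.etree.ElementTree as ET


def set_scan_intents(row, intents):
    """Replace the intents of one <row> element and keep the counts consistent."""
    n = len(intents)
    row.find("numIntent").text = str(n)
    row.find("scanIntent").text = f"1 {n} " + " ".join(intents)
    # Assumption: one NONE / false entry per intent, as in the examples above.
    row.find("calDataType").text = f"1 {n} " + " ".join(["NONE"] * n)
    row.find("calibrationOnLine").text = f"1 {n} " + " ".join(["false"] * n)
    return row


SAMPLE_ROW = """<row>
<scanNumber>1</scanNumber>
<numIntent>1</numIntent>
<numSubscan>1</numSubscan>
<scanIntent>1 1 OBSERVE_TARGET</scanIntent>
<calDataType>1 1 NONE</calDataType>
<calibrationOnLine>1 1 false</calibrationOnLine>
<sourceName>J1041+0610</sourceName>
</row>"""

row = ET.fromstring(SAMPLE_ROW)
set_scan_intents(row, ["CALIBRATE_AMPLI", "CALIBRATE_PHASE"])
print(ET.tostring(row, encoding="unicode"))
```

The printed row can be compared against the two-intent example above before any change is made to the real metadata.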
Allowed Intents
CALIBRATE_AMPLI : Amplitude calibration scan
CALIBRATE_PHASE : Phase calibration scan
CALIBRATE_BANDPASS : Bandpass calibration scan
CALIBRATE_DELAY : Delay calibration scan
CALIBRATE_FLUX : Flux measurement scan
CALIBRATE_POINTING : Pointing calibration scan
CALIBRATE_POLARIZATION : Polarization calibration scan
CALIBRATE_POL_LEAKAGE : Polarization Leakage calibration scan
CALIBRATE_POL_ANGLE : Polarization Angle calibration scan
OBSERVE_TARGET : Target source scan
CALIBRATE_ATMOSPHERE : Atmosphere calibration scan
CALIBRATE_FOCUS : Focus calibration scan; Z coordinate to be derived
CALIBRATE_FOCUS_X : Focus calibration scan; X focus coordinate to be derived
CALIBRATE_FOCUS_Y : Focus calibration scan; Y focus coordinate to be derived
CALIBRATE_SIDEBAND_RATIO : Measure relative gains of sidebands
CALIBRATE_WVR : Data from the water vapor radiometers (and correlation data) are used to derive their calibration parameters
DO_SKYDIP : Skydip calibration scan
MAP_ANTENNA_SURFACE : Holography calibration scan
MAP_PRIMARY_BEAM : Data on a celestial calibration source are used to derive a map of the primary beam
TEST : Used for development
UNSPECIFIED : Unspecified scan intent
CALIBRATE_ANTENNA_POSITION : Requested by EVLA
CALIBRATE_ANTENNA_PHASE : Requested by EVLA
MEASURE_RFI : Requested by EVLA
CALIBRATE_ANTENNA_POINTING_MODEL : Requested by EVLA
SYSTEM_CONFIGURATION : Requested by EVLA
CALIBRATE_APPPHASE_ACTIVE : Calculate and apply phasing solutions. Applicable at ALMA
CALIBRATE_APPPHASE_PASSIVE : Apply previously obtained phasing solutions. Applicable at ALMA
OBSERVE_CHECK_SOURCE
Known Issues
This section will be updated as the pipeline continues to be developed. The comments below are general to most pipeline versions, including the current production pipeline used for VLA data. The current production version is CASA 6.6.1.
- The spatial models for the flux density calibrator 3C48 have problems at K-band and higher.
- Although the hifv_fluxboot task performs clipping of the gain solutions used for flux density boot strapping, in some cases (especially at high frequencies, for datasets that may have suffered from pointing or other problems) the default flux density scale may be uncertain if outlier rejection was not sufficient. For projects that may require very accurate flux density scales, a user may want to flag the flux gain table by hand.
- While the pipeline supports multi-band data, the same bandpass and flux density calibrators need to be used for all bands. If not, the data for one of the bands will be flagged.
- The pipeline effectively calibrates each spectral window separately. This means that the signal-to-noise ratio (S/N) on the calibrators needs to be sufficiently high that solutions can be obtained within the solution intervals specified in the pipeline. For the delay calibrator, this means an S/N>3 per integration time, t(int); for the bandpass calibrator, an S/N>5 is required for solution times up to 10*t(int); for the gain calibrator, an S/N>3 is required for solution times up to 10*t(int), and >5 for a scan average. The calibrator strength guidelines as a function of observing bandwidth in the Guide to Observing with the VLA are given for the high frequency end of Q-band; following these guidelines will be safe for all frequencies, although very narrow spectral channels may be problematic. In these cases the data may need to be calibrated by hand, using polynomial bandpass fitting instead.
- In some instances, the post-RFI flagging plots look aesthetically worse than the pre-RFI flagging plots. This is due to a poorly performing antenna (higher noise than others) that is getting heavily flagged in the RFI flagging. The post-RFI flagging plot then has less data to average together resulting in a worse looking plot. This is not a problem and the outcomes from imaging with and without the flagging of these poorly performing antennas are not scientifically different. In the case of these noisy data not getting flagged (as in the previous pipeline version), they are strongly downweighted by statwt so they do not contribute much to the final images anyway.
- Edge channel flagging reported in hifv_flagdata will appear as zero for most continuum modes. This is because the pipeline by default is only flagging 1% of edge channels, which results in 1 edge channel being selected for flagging in most continuum spws; however, Hanning smoothing has already resulted in the flagging of the first and last channels in a spectral window, so there are no more edge channels needing to be flagged.
- In spectral line processing, the VLA pipeline does not apply RFI flagging to spectral windows identified by the pipeline (or specified by the user) as being intended for spectral line science. As a result, some RFI may remain in the line spectral windows; this is planned to be addressed in a future release.
- The calibration pipeline supports calibration of OTF mosaic data for the VLA Sky Survey (VLASS) using a specific calibration recipe that is only validated for S-band.
- The pipeline currently fails when a CALIBRATE_PHASE or CALIBRATE_AMPLI intent is added to a scan that already has a CALIBRATE_FLUX or CALIBRATE_BANDPASS scan intent.
- CASA 6.X now prepends the paths listed in the system PYTHONPATH environment variable to its own PYTHONPATH. This is the opposite of the CASA 5.X behavior, which appended the system PYTHONPATH. This can cause CASA to import Python libraries from the system Python installation rather than its internal Python libraries.
- The pipeline results can change subtly depending on the environment variable OMP_NUM_THREADS. The NRAO processing uses OMP_NUM_THREADS=1; if this is not used, there may be minor (but not scientifically relevant) differences in the results obtained.
- The pipeline does not currently support partially frequency averaged data (a very rare observing mode); see the section on this subject.
- The pipeline has been neither tested nor validated for P- or 4-band observations and will fail on such data.
- For information about known issues affecting previous releases, please see the Known Issues page.
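The S/N thresholds quoted above for the delay, bandpass, and gain calibrators can be turned into a quick pre-check. The sketch below assumes thermal-noise-limited data, so that S/N grows as the square root of the number of integrations averaged; that scaling, the function names, and the idea of using a single per-integration, per-spectral-window S/N estimate are assumptions of this example and not the pipeline's own heuristics.

```python
import math


def snr_after(snr_per_int, n_int):
    """S/N after averaging n_int integrations (assumes pure thermal noise)."""
    return snr_per_int * math.sqrt(n_int)


def check_calibrator_snr(snr_per_int, n_int_per_scan, max_solint=10):
    """Compare a per-spw, per-integration S/N with the thresholds in the text.

    max_solint is the longest solution interval in units of t(int);
    the text quotes 10*t(int). Call once per calibrator, since the delay,
    bandpass, and gain calibrators may be different sources.
    """
    return {
        "delay": snr_per_int > 3,                                 # S/N>3 per t(int)
        "bandpass": snr_after(snr_per_int, max_solint) > 5,       # S/N>5 within 10*t(int)
        "gain_solint": snr_after(snr_per_int, max_solint) > 3,    # S/N>3 within 10*t(int)
        "gain_scan": snr_after(snr_per_int, n_int_per_scan) > 5,  # S/N>5 per scan average
    }


# Made-up numbers: S/N of 1.2 per integration, 30 integrations per scan.
result = check_calibrator_snr(snr_per_int=1.2, n_int_per_scan=30)
for name, ok in result.items():
    print(f"{name}: {'ok' if ok else 'too weak'}")
```

With these illustrative values the source would be adequate as a gain calibrator but too weak to serve as the delay or bandpass calibrator.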
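The two environment-related issues above (PYTHONPATH ordering and OMP_NUM_THREADS) can be inspected from a Python session. The following is a minimal sketch; both helper names are hypothetical. It only reports the current state: OMP_NUM_THREADS is normally set in the shell before the program starts, and this example does not try to change it.

```python
import os
import sys


def pythonpath_entries_on_sys_path(pythonpath, sys_path):
    """Return (index, entry) pairs for sys_path entries listed in pythonpath.

    Low indices mean the entry is searched early, before later entries.
    """
    entries = {p for p in pythonpath.split(os.pathsep) if p}
    return [(i, p) for i, p in enumerate(sys_path) if p in entries]


def uses_nrao_thread_setting(env):
    """True if OMP_NUM_THREADS is set to 1, the value used in NRAO processing."""
    return env.get("OMP_NUM_THREADS") == "1"


# Report where system PYTHONPATH entries sit on the current search path.
found = pythonpath_entries_on_sys_path(os.environ.get("PYTHONPATH", ""), sys.path)
for index, entry in found:
    print(f"sys.path[{index}] comes from PYTHONPATH: {entry}")

if not uses_nrao_thread_setting(os.environ):
    print("OMP_NUM_THREADS is not 1; results may differ slightly from NRAO processing.")
```

If PYTHONPATH entries appear near the front of the search path, modules of the same name in those directories would be imported ahead of later entries, which is the situation the PYTHONPATH issue above warns about.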
Previous Pipeline Releases: