VLA Imaging Pipeline 2024.1.0.8; CASA version 6.6.1
General Description
An imaging pipeline is available for VLA continuum and spectral line data. This imaging pipeline is built on the foundation provided by the ALMA imaging pipeline, but has been tailored to also support the VLA. The imaging pipeline parameters may not be optimal for all datasets, but the pipeline is expected to operate successfully on all the bands supported by the Science Ready Data Products processing. The spectral line imaging pipeline is still in an early state and problems may occur beyond the current list of known issues. Some current features are as follows:
- A single aggregate (all spws and bandwidth combined) continuum image is produced per observed band, per target present in the input data.
- Mosaics are not currently supported, thus each field of a mosaic will be imaged individually.
- Image sizes are limited to at most 16384x16384 pixels, meaning that images from A- and B-configuration at low frequencies will not completely encompass the primary beam.
- Cleaning is done using auto-masking as part of tclean, cleaning down to the 4-sigma level using the nsigma parameter of tclean.
- Briggs weighting with robust=0.5 is adopted as a default for continuum imaging and robust=2.0 is adopted as a default for spectral line imaging.
- The task hifv_importdata will automatically identify spectral windows with spectral line science intents if they are narrower than the default continuum windows (128 MHz, 64 channels for C-band and higher, 64 MHz and 64 channels for S-band and lower).
- To avoid flagging spectral lines as RFI, the flagging state of the data is restored to its pre-RFI-flagging state before the data are split off into the line MS by hifv_mstransform. The task hifv_mstransform can create two measurement sets: mySDM_targets_cont.ms, where the data are all RFI flagged and suitable for continuum science, and mySDM_targets.ms, which only includes spectral windows identified by the pipeline (or specified) as spectral line.
- Spectral line windows will have their line-free spectral regions detected using the hif_findcont pipeline task and the continuum will be subtracted in the uv-plane using hif_uvcontsub. Unflagged spectral channels will be used to create data cubes, which are restored with a common beam.
- Targets with significant extended continuum emission (as determined by a ratio of visibility amplitudes from short and mid-baselines by the pipeline) will have the inner 5% of the uvrange omitted from imaging to avoid deconvolution errors from poorly sampled large-scale structure.
- Pixel sizes are chosen to sample the synthesized beam with 5 pixels across the minor axis; this specification drops to 4 pixels when the image size is mitigated; spectral line images can be further mitigated to avoid creating overwhelmingly large cubes.
- Nterms=2 is used when the fractional continuum bandwidth is > 10% in a single band or, if following hif_selfcal, it is determined that not using nterms=2 will limit the achievable S/N of the image.
- Self-calibration solutions from hif_selfcal will be applied to both the spectral line and continuum data when the self-calibration is successful.
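Two of the heuristics in the list above (pixel size and image-size mitigation, and the nterms=2 bandwidth criterion) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline's implementation: the function names are hypothetical, the requested field of view is supplied by the caller, and the real pipeline also rounds image sizes to FFT-friendly values, which is not modeled here.

```python
import math

MAX_IMSIZE = 16384  # maximum image dimension in pixels, as quoted in the list above


def choose_cell_and_imsize(beam_minor_arcsec, fov_arcsec, max_imsize=MAX_IMSIZE):
    """Return (cell_arcsec, imsize, covers_fov) for a requested field of view.

    Assumed order of operations: try 5 pixels across the beam minor axis,
    fall back to 4 pixels if the image would be too large, and finally
    truncate the image to max_imsize.
    """
    for pixels_per_beam in (5, 4):
        cell = beam_minor_arcsec / pixels_per_beam
        imsize = math.ceil(fov_arcsec / cell)
        if imsize <= max_imsize:
            return cell, imsize, True
    # Even with 4 pixels per beam the field does not fit: truncate the image.
    return cell, max_imsize, False


def use_nterms2(bandwidth_hz, center_freq_hz, threshold=0.10):
    """True when the fractional bandwidth exceeds the 10% criterion."""
    return bandwidth_hz / center_freq_hz > threshold
```

For example, a 1 arcsec beam with a requested 8000 arcsec field cannot be covered even at 4 pixels per beam, so the sketch returns a 16384-pixel image and flags that the field of view is not fully covered.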
Please see the Known Issues below. For questions, suggestions, and other issues encountered with the pipeline, please submit a ticket to the Pipeline Department of the NRAO Helpdesk.
The pipeline reference manual is also available.
VLA Pipeline 2024.1 (CASA 6.6.1) Release Notes
- Numerous bug fixes have been made to the hif_selfcal task to improve performance and reliability of the task.
- When importing an MS, there are now datacolumn specifications that are beneficial if specified as a dictionary argument to hifv_importdata. The most relevant keywords are datacolumns={'data': 'raw','corrected': 'regcal_contline_all'}, then after hif_mstransform is run, the datatype for the _targets.ms is re-registered automatically as 'data': 'regcal_contline_all'. After hif_selfcal is run, an additional datatype is added 'corrected': 'selfcal_contline_all'.
- Spectral line calibration and imaging are now included in the pipeline when spectral line windows are either specified or auto-detected by the calibration pipeline.
- The specification of spectral line science windows is made with the hifv_importdata task. If run with the defaults, the parameter specline_spws will be set to 'auto', where spectral windows narrower than the default continuum windows will be considered spectral line. Other possibilities are 'none' (force no spectral line windows), 'all' (force all spectral windows to be line windows), and a manual specification of the line spectral windows, e.g., '2,3,5~7,12~14'.
- The task hifv_mstransform has been introduced (replacing hif_mstransform for the VLA) in the imaging recipes. This additional task supports spectral line processing by creating an additional MS of just the spectral line windows. A call to this task is also added to the casa_piperestorescript.py files generated by the pipeline.
- L and S-band data will no longer use gridder='wproject' in the hif_makeimages and hif_selfcal tasks; the use of that option was deemed too resource intensive.
- The imaging pipeline will produce both non-self-calibrated images and self-calibrated images of the science target. However, if self-calibration fails, only the non-self-calibrated images are generated.
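As an illustration of the manual specline_spws syntax described above, the following sketch expands a selection string such as '2,3,5~7,12~14' into explicit spectral window IDs. The helper parse_spw_selection is hypothetical (it is not a pipeline or CASA function) and only handles comma-separated IDs and inclusive '~' ranges; the keywords 'auto', 'none', and 'all' are not handled and raise a ValueError.

```python
def parse_spw_selection(selection):
    """Expand a selection such as '2,3,5~7,12~14' into a sorted list of spw IDs."""
    spws = set()
    for item in selection.split(','):
        item = item.strip()
        if not item:
            continue
        if '~' in item:
            # 'a~b' denotes the inclusive range a..b
            start, stop = (int(part) for part in item.split('~'))
            spws.update(range(start, stop + 1))
        else:
            spws.add(int(item))
    return sorted(spws)


print(parse_spw_selection('2,3,5~7,12~14'))
```

This can be handy for checking which windows a selection string covers before passing it to hifv_importdata.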
Pipeline Requirements
- The VLA imaging pipeline is currently designed to run on a single calibrated SB. However, it may successfully run on a collection of calibrated SBs. We recommend only attempting to image SBs together when they share the same targets, bands, and correlator setups. Multi-SB functionality is not fully validated and may not work properly with all data.
- The pipeline relies on correctly set scan intents. We therefore recommend that every observer ensures that the scan intents are correctly specified in the Observation Preparation Tool (OPT) during the preparation of the SB (see OPT manual for details). For the imaging pipeline to run, the OBSERVE_TARGET intent is required. Without this intent, there will be no science data to image. Other intents can be present in the MS, but in the second stage of the imaging pipeline, following import (or following the calibration portion of the calibration+imaging recipe), the science target data will be split off from the calibration sources.
- The imaging pipeline recipes expect that datasets will have the DATA and CORRECTED columns present. If an MS only has the data column (having previously been split from a calibrated measurement set), you should run the pipeline using a CASA pipescript with the hifv_mstransform task removed or commented out, and the datatype of the data column should be specified as regcal_contline_science or regcal_contline_all depending on how the data were split (see Data Types).
Obtaining and Running the Imaging Pipeline in CASA
The imaging pipeline can take a few hours to a few days to complete depending on the specifics of the data and whether parallel processing is used. We provide abbreviated instructions here for starting and running CASA; for full instructions, see the VLA Calibration pipeline.
You should start CASA in the same directory that contains the data you plan to work on. To start CASA with the pipeline from your own installation type (assuming that the executables are in your system PATH environment variable):
#In a Terminal
casa --pipeline
If you are running CASA on the Domenici Science Operations Center (DSOC) computers, you can start the latest CASA with pipeline using the command
#In a Terminal
casa-pipe
The imaging pipeline (unlike the calibration pipeline) can be sped up by running it in parallel with mpicasa. Note that mpicasa only works in Linux.
#In a Terminal
<path_to_casa>/mpicasa -n X <path_to_casa>/casa --pipeline
#At DSOC
export CASAPATH=/home/casa/packages/pipeline/casa-6.6.1-pipeline-2024.1.0.8/
$CASAPATH/bin/mpicasa -n X $CASAPATH/bin/casa --pipeline
where 'X' is the number of processing cores. Note that one core is always used for the management of the processes, so mpicasa -n 9 will use 9 cores, 8 of which are used for processing the data. However, memory usage increases when using mpicasa and, depending on the image size, number of threads, and amount of memory available on your computer (or the computing node), one could run out of memory and begin using swap space, which will slow the imaging process to a crawl.
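The relation between the -n value and the number of cores that actually process data can be made explicit with a small helper. This is a sketch only: mpicasa_command is a hypothetical function, the installation path in the usage line is a placeholder, and the requirement of at least two processes is a choice made here so that one worker remains after the management core is set aside.

```python
import posixpath  # mpicasa is Linux-only, so POSIX-style paths are assumed


def mpicasa_command(casa_root, n_processes):
    """Return (command_list, n_workers) for a parallel pipeline session.

    One process manages the others, so n_workers = n_processes - 1.
    """
    if n_processes < 2:
        raise ValueError("need at least 2 processes: 1 manager plus 1 worker")
    bin_dir = posixpath.join(casa_root, 'bin')
    command = [
        posixpath.join(bin_dir, 'mpicasa'), '-n', str(n_processes),
        posixpath.join(bin_dir, 'casa'), '--pipeline',
    ]
    return command, n_processes - 1


command, workers = mpicasa_command('/path/to/casa', 9)
print(' '.join(command), '->', workers, 'workers')
```

The returned list can be passed to subprocess.run if you prefer launching CASA from a wrapper script, keeping in mind the memory caveat above when choosing the number of processes.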
Ensure that your process can run for the full duration without interruption. Also, make sure that there is enough space in your directory as the data volume will increase by about a factor of two or more depending on the image sizes (depends on band and configuration). There are several ways to run the imaging pipeline; you can run it standalone on a previously calibrated dataset or you can run it as a combined calibration and imaging pipeline run.
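A quick check of the available disk space before starting can be sketched as follows. The helper names are hypothetical, and the growth factor of 2.0 is simply the rough figure quoted above; the sketch conservatively asks for free space equal to growth_factor times the current size of the measurement set, so adjust it upward for large images.

```python
import os
import shutil


def directory_size_bytes(path):
    """Total size of all regular files below path (an MS is a directory tree)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def has_room_for_imaging(ms_path, workdir='.', growth_factor=2.0):
    """Return (ok, needed_bytes, free_bytes) for a planned pipeline run."""
    needed = directory_size_bytes(ms_path) * growth_factor
    free = shutil.disk_usage(workdir).free
    return free >= needed, needed, free
```

For example, has_room_for_imaging('myCaldMS.ms') returns a boolean plus the two byte counts so that a wrapper script can print a warning before CASA is started.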
The CASA homepage has more information on using CASA at NRAO.
Now that CASA is open, the imaging pipeline can be started with one of several methods. In these first methods to run the imaging pipeline, it is assumed that a previously calibrated measurement set is available. This could either be from a calibration pipeline run by the user or a restored measurement set from the new NRAO archive.
Method 1: Pipeline Script
Continuum Imaging-only
You can use a pipeline script, for example 'casa_cont_imaging_pipescript.py' file. For this to work, the 'casa_cont_imaging_pipescript.py' must be in the same directory where you start CASA with the pipeline. Once CASA is started (same steps as above) type:
#In CASA
execfile('casa_cont_imaging_pipescript.py')
In the following script, a calibrated measurement set is a required input. Simply replace myCaldMS with the name of the measurement set desired for imaging.
Example Imaging pipeline script:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['myCaldMS.ms'], datacolumns={'data': 'raw','corrected': 'regcal_contline_all'})
    hifv_flagtargetsdata()
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont', datatype='regcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    #hifv_exportdata(imaging_products_only=True)
finally:
    h_save()
If one simply wants to add the imaging commands to an existing calibration script, then the commands beginning with hifv_mstransform to hif_makeimages should be inserted into the calibration script, after calibrator imaging. If users do not wish to perform self-calibration, hif_selfcal and the following hif_makeimlist and hif_makeimages tasks should be removed.
Continuum and Spectral Line Imaging
You can use a pipeline script, for example 'casa_cont_line_imaging_pipescript.py' file. For this to work, the 'casa_cont_line_imaging_pipescript.py' must be in the same directory where you start CASA with the pipeline. Once CASA is started (same steps as above) type:
#In CASA
execfile('casa_cont_line_imaging_pipescript.py')
In the following script, a calibrated measurement set is a required input. Simply replace myCaldMS with the name of the measurement set desired for imaging.
Example Imaging pipeline script:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['myCaldMS.ms'], datacolumns={'data': 'raw','corrected': 'regcal_contline_all'})
    hifv_flagtargetsdata()
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='mfs')
    hif_findcont()
    hif_uvcontsub()
    hif_makeimlist(specmode='cube')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maximsize=16384)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='cube', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    hifv_exportdata(imaging_products_only=True)
finally:
    h_save()
Due to current pipeline limitations, we make multiple calls to hif_checkproductsize in order to limit the size of continuum and spectral line products individually. If users wish to produce data volumes larger than what is currently allowed, they can omit the task or change the settings for hif_checkproductsize, with the caveat that they may generate a significant volume (i.e., TBs) of data during the pipeline run, depending on the parameters of the observation.
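Before changing the hif_checkproductsize settings, it can help to estimate how large a single cube would be. The sketch below is a rough estimate only and rests on assumptions not stated in this guide: that images are stored as 4-byte floats, that one image plane is written per channel, and that the maxcubesize limit is expressed in gigabytes. The function name is hypothetical, and auxiliary products (masks, residuals, primary beam cubes) are not counted.

```python
def cube_size_gb(imsize, nchan, bytes_per_pixel=4):
    """Approximate size in GB (1e9 bytes) of one imsize x imsize x nchan cube."""
    return imsize * imsize * nchan * bytes_per_pixel / 1e9


def exceeds_limit(imsize, nchan, maxcubesize_gb=20.0):
    """True if a single cube would be larger than the assumed limit in GB."""
    return cube_size_gb(imsize, nchan) > maxcubesize_gb


print(cube_size_gb(1000, 250))
print(exceeds_limit(16384, 64))
```

Because the pipeline writes several images per cube, the total disk footprint of a run will be a multiple of this single-cube estimate.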
Method 2: Recipes
The Recipe Reducer is an alternative method to running the imaging recipe on calibrated data.
Continuum imaging only
# In CASA
import pipeline.recipereducer
pipeline.recipereducer.reduce(vis=['myCaldMS.ms'],procedure='procedure_hifv_contimage_selfcal.xml',loglevel='summary')
Continuum and Spectral Line Imaging
# In CASA
import pipeline.recipereducer
pipeline.recipereducer.reduce(vis=['myCaldMS.ms'],procedure='procedure_hifv_cont_cube_image_selfcal.xml',loglevel='summary')
Users should be aware that the recipe reducer creates a non-unique weblog directory and this method also includes a call to hifv_exportdata, which will package the calibration products into a 'products' directory one level up. This may not be desired by all users.
The recipe specified in this command includes target self-calibration. If self-calibration is not desired, procedure_hifv_contimage.xml or procedure_hifv_cont_cube_image.xml should be specified instead.
Method 3: One Stage at a Time
You may notice that the 'casa_cont_line_imaging_pipescript.py' or 'casa_cont_imaging_pipescript.py' is a list of specific CASA pipeline tasks being called in order to form the default pipeline. If desired, one could run each of these tasks one at a time in CASA, for example to inspect intermediate pipeline products.
If you need to exit CASA between stages, you can restart the pipeline where you left off, but be sure to run the following command before exiting:
# In CASA
h_save()
However, for this to work, none of the files can be moved to other directories.
Then, to restart where you left off, use the CASA pipeline task h_resume after starting CASA again. This will set up the environment again for the pipeline to work. Type:
# In CASA
h_resume()
Now, you may start the next task in your list.
Running the Calibration Pipeline with the Imaging Pipeline
Method 1: Pipeline script
Continuum Calibration and Imaging-only
You can also run both the calibration pipeline and include the science target imaging. Below we show an example casa_calibration_and_imaging_pipescript.py that has the target imaging pipeline commands included at the end.
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['mySDM'], session=['default'],specline_spws='none')
    hifv_hanning()
    hifv_flagdata(hm_tbuff='1.5int', fracspw=0.01, intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, *SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*')
    hifv_vlasetjy()
    hifv_priorcals()
    hifv_syspower()
    hifv_testBPdcals()
    hifv_checkflag(checkflagmode='bpd-vla')
    hifv_semiFinalBPdcals()
    hifv_checkflag(checkflagmode='allcals-vla')
    hifv_solint()
    hifv_fluxboot()
    hifv_finalcals()
    hifv_applycals()
    hifv_checkflag(checkflagmode='target-vla')
    hifv_statwt()
    hifv_plotsummary()
    hif_makeimlist(intent='PHASE,BANDPASS', specmode='cont')
    hif_makeimages(hm_masking='centralregion')
    hifv_exportdata()
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont', datatype='regcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    #hifv_exportdata()
finally:
    h_save()
In the example above, we set specline_spws='none' in hifv_importdata explicitly to only perform continuum processing. In such a case, hifv_mstransform will only produce the mySDM_targets_cont.ms.
Continuum and Spectral Line Calibration and Imaging
Calibration with both continuum and spectral line imaging is also possible with a very similar 'casa_calibration_and_cont_line_imaging_pipescript.py'.
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['mySDM'], session=['default'],specline_spws='auto')
    hifv_hanning()
    hifv_flagdata(hm_tbuff='1.5int', fracspw=0.01, intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, *SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*')
    hifv_vlasetjy()
    hifv_priorcals()
    hifv_syspower()
    hifv_testBPdcals()
    hifv_checkflag(checkflagmode='bpd-vla')
    hifv_semiFinalBPdcals()
    hifv_checkflag(checkflagmode='allcals-vla')
    hifv_solint()
    hifv_fluxboot()
    hifv_finalcals()
    hifv_applycals()
    hifv_checkflag(checkflagmode='target-vla')
    hifv_statwt()
    hifv_plotsummary()
    hif_makeimlist(intent='PHASE,BANDPASS', specmode='cont')
    hif_makeimages(hm_masking='centralregion')
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='mfs')
    hif_findcont()
    hif_uvcontsub()
    hif_makeimlist(specmode='cube')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maximsize=16384)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='cube', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    #hifv_exportdata()
finally:
    h_save()
In the example above, we set specline_spws='auto' in hifv_importdata, but 'auto' is the default, so it is not necessary to specify this parameter. In this case, hifv_mstransform will produce the continuum MS: mySDM_targets_cont.ms and the spectral line MS mySDM_targets.ms. The hif_uvcontsub task will produce the continuum-subtracted spectral line MS, which will be named mySDM_targets_line.ms.
If this recipe were run on an MS with no narrow spectral windows, i.e., all continuum, the pipeline would not produce a spectral line MS and none of the tasks related to spectral line processing or imaging would run. It is therefore perfectly acceptable to run this single recipe on all types of data, but continuum users may opt for a recipe without all the extra, unused stages.
Method 2: Recipes
Continuum Calibration and Imaging-only
Similar to running just the imaging pipeline via the Recipe Reducer, there is a procedure to run calibration+imaging via this method as well.
# In CASA
import pipeline.recipereducer
pipeline.recipereducer.reduce(vis=['mySDM'],procedure='procedure_hifv_calimage_cont.xml',loglevel='summary')
The same caveats about the non-unique weblog directory name and the call to hifv_exportdata described previously also apply here.
Continuum and Spectral Line Calibration and Imaging-only
Running the spectral line calibration and imaging recipe is just like the continuum-only recipe, with only a change in the recipe filename.
# In CASA
import pipeline.recipereducer
pipeline.recipereducer.reduce(vis=['mySDM'],procedure='procedure_hifv_calimage_cont_cube_selfcal.xml',loglevel='summary')
Method 3: One Stage at a Time
As noted in the imaging-only pipeline and the VLA calibration pipeline, you may also run the pipeline one stage at a time, with the ability to resume if it's necessary to exit CASA.
Running the Imaging Pipeline on Multiple MSes
Users may wish to run the imaging pipeline on multiple measurement sets to create images with increased sensitivity due to the combination of all relevant data. The pipeline is able to accept multiple measurement sets, but users must ensure that they have identical spectral setups and include the same science targets; otherwise the pipeline is likely to fail.
Example Imaging pipeline script for multiple MSes:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['myCaldMS_1.ms','myCaldMS_2.ms'], datacolumns={'data': 'raw','corrected': 'regcal_contline_all'})
    hifv_flagtargetsdata()
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='mfs')
    hif_findcont()
    hif_uvcontsub()
    hif_makeimlist(specmode='cube')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maximsize=16384)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='cube', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    hifv_exportdata(imaging_products_only=True)
finally:
    h_save()
Note that it is simplest to import the original calibrated MSes into the pipeline and let it handle the splitting into the continuum and spectral line MSes, which will ensure that the datatypes get registered properly. It is possible to import multiple MSes with different assigned datatypes, but that can be more error-prone.
Data Types
The pipeline now includes the concept of data types where the data and corrected columns in a specific MS can be defined to contain a certain type of data. These datatypes are then used in the imaging and self-calibration process to keep track of what data are located in which column and which MS. Datatypes should be specified when importing a calibrated MS for imaging, otherwise the incorrect datacolumn might be used by the pipeline tasks, leading to undesirable results.
Typical usage has the datatypes specified in hifv_importdata like:
hifv_importdata(vis=['myCaldMS.ms'], datacolumns={'data': 'raw','corrected': 'regcal_contline_all'})
or if the calibrated data were already split to new MSes by hifv_mstransform:
hifv_importdata(vis=['myCaldMS_targets_cont.ms'], datacolumns={'data': 'regcal_cont_science'})
hifv_importdata(vis=['myCaldMS_targets.ms'],datacolumns={'data':'regcal_contline_science'})
There are other options for datatypes and we list them here for completeness.
Datatypes:
- raw - uncalibrated data.
- regcal_contline_all - standard calibrated data where both calibrators and science targets are present in the MS, contains both continuum and line data. This is the mySDM.ms after the calibration pipeline has run.
- regcal_contline_science - standard calibrated data where only science targets are present in the MS, contains both continuum and line data. This is the mySDM_targets.ms MS.
- regcal_cont_science - standard calibrated data where only science targets are present in the MS and contains the RFI-flagged data that are only suitable for continuum science - this is the mySDM_targets_cont.ms MS.
- selfcal_contline_science - self-calibrated data where only science targets are present in the MS, contains both continuum and line data. This is the mySDM_targets.ms MS with self-calibration applied.
- regcal_line_science - standard calibrated data of only the science targets where continuum subtraction has been applied. This is the mySDM_targets_line.ms MS.
- selfcal_line_science - self-calibrated data of only the science targets where continuum subtraction has been applied. This is the mySDM_targets_line.ms MS with self-calibration applied.
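The mapping between MS names and datatypes listed above can be captured in a small lookup helper for building the datacolumns argument. This is a sketch under assumptions: datacolumns_for is a hypothetical function, it relies on the MS still carrying the filename suffixes written by the pipeline, and it assumes self-calibration has not yet been applied to the data.

```python
def datacolumns_for(ms_name):
    """Suggest a datacolumns dictionary for hifv_importdata from an MS name."""
    name = ms_name.rstrip('/')
    if name.endswith('_targets_cont.ms'):
        # RFI-flagged, science targets only, continuum use
        return {'data': 'regcal_cont_science'}
    if name.endswith('_targets_line.ms'):
        # continuum-subtracted, science targets only
        return {'data': 'regcal_line_science'}
    if name.endswith('_targets.ms'):
        # science targets only, continuum and line data
        return {'data': 'regcal_contline_science'}
    # full calibrated MS: raw DATA column plus calibrated CORRECTED column
    return {'data': 'raw', 'corrected': 'regcal_contline_all'}


print(datacolumns_for('myCaldMS_targets_cont.ms'))
```

The returned dictionary would then be passed as hifv_importdata(vis=[...], datacolumns=...). An MS that was renamed or split by hand should have its datatype set explicitly rather than guessed from the name.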
Self-Calibration
The VLA imaging pipeline now includes the hif_selfcal task to perform automated self-calibration on the science targets using auto-masking and heuristics that have been developed for both the VLA and ALMA imaging pipelines. The heuristics attempt to conduct self-calibration as a user would interactively, starting with shallow cleaning and long solution intervals and then cleaning deeper with progressively shorter solution intervals. However, checks are conducted before and after each self-calibration solution interval to ensure that self-calibration is improving the data.
After each self-calibration solution interval the S/N is verified to ensure that there is a net gain from that self-calibration interval and it is ensured that the beam has not increased by more than 5% in beam area from the pre-self-calibration image. If S/N decreases or the beam changes by > 5%, that solution interval is regarded as 'failed' and the last successful solution interval is restored.
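The acceptance test described above can be sketched as follows. This is an illustration, not the hif_selfcal code: the function names are hypothetical, the beam area is taken to be that of an elliptical Gaussian, and an unchanged S/N is treated as acceptable because the text only names a decrease as a failure.

```python
import math


def beam_area(bmaj, bmin):
    """Area of an elliptical Gaussian beam with FWHM axes bmaj and bmin."""
    return math.pi * bmaj * bmin / (4.0 * math.log(2.0))


def solint_accepted(snr_pre, snr_post, beam_pre, beam_post, delta_beam_thresh=0.05):
    """Decide whether a self-calibration solution interval is kept.

    beam_pre and beam_post are (bmaj, bmin) tuples in the same units.
    The interval fails if the S/N decreases or the beam area grows by
    more than delta_beam_thresh (fractional).
    """
    area_pre = beam_area(*beam_pre)
    area_post = beam_area(*beam_post)
    fractional_change = (area_post - area_pre) / area_pre
    return snr_post >= snr_pre and fractional_change <= delta_beam_thresh


print(solint_accepted(50.0, 80.0, (1.0, 0.8), (1.02, 0.81)))
```

In the printed example the S/N improves and the beam area grows by about 3%, so the interval would be kept. A 10% growth in beam area would cause it to be rejected, as would a drop in S/N.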
The self-calibration solutions are applied to the continuum MS (mySDM_targets_cont.ms) and the line MS (mySDM_targets_line.ms).
Further information on self-calibration is available here.
Task Parameters
We list the most relevant parameters here that users may want to experiment with; the less relevant parameters are listed below the line.
- field (string) - field names to self-calibrate e.g., "HL_Tau"; default = "" which will self-calibrate all sources (numerical field IDs are not currently supported)
- apply_cal_mode_default (string) - Apply mode to use for applycal task during self-calibration; default = 'calflag'; options: 'calflag' ,'calonly', 'calflagstrict'
- amplitude_selfcal (boolean) - Attempt amplitude self-calibration following phase-only self-calibration; default = False
- gaincal_minsnr (float) - Minimum S/N for a solution to not be flagged by gaincal; default = 2.0
- minsnr_to_proceed (float) - Minimum estimated self-cal S/N computed on a per solution interval, per antenna basis, used to determine whether to attempt self-calibration for a source at a given solution interval; default = 3.0
- delta_beam_thresh (float) - Allowed fractional change in beam area for self-calibration to accept results of a solution interval; default = 0.05
Less relevant parameters to adjust:
- rel_thresh_scaling (string) - Scaling type to determine how clean thresholds per solution interval should be determined going from the starting clean threshold to 3.0 * RMS for the final solution interval; default='log10', options: 'linear', 'log10', or 'loge' (natural log)
- dividing_factor (float) - Scaling factor to determine clean threshold for first self-calibration solution interval. Equivalent to (Peak S/N / dividing_factor) *RMS = First clean threshold; however, if (Peak S/N / dividing_factor) *RMS is < 5.0; a value of 5.0 is used for the first clean threshold. default = 40.0 for < 8 GHz; 15.0 for > 8 GHz
- check_all_spws (boolean) - The S/N of mfs images created on a per-spectral-window basis will be compared; default=False
- inf_EB_gaincal_combine (boolean) - If True, the gaincal combine parameter will be set to 'scan,spw'; if False, the gaincal combine parameter will be set to 'scan'; default=False; this only applies to the first solution interval, which is computed over the length of an entire Execution Block
- spw (string) - spectral windows to self-calibrate; default = ""; all science spws will be self-calibrated
- apply (boolean) - apply final selfcal solutions back to the input Measurement Sets; default = True
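How dividing_factor and rel_thresh_scaling might combine into a sequence of clean thresholds can be sketched as below. This is a simplified reading of the parameter descriptions above, not the task's actual code: the function names are hypothetical, thresholds are expressed in units of the image RMS, a frequency of exactly 8 GHz is assigned the higher-frequency default, and the two logarithmic options are treated as the same geometric spacing.

```python
import math


def default_dividing_factor(freq_ghz):
    """Default dividing_factor: 40.0 below 8 GHz, 15.0 otherwise (assumed at 8 GHz)."""
    return 40.0 if freq_ghz < 8.0 else 15.0


def nsigma_schedule(peak_snr, n_solints, freq_ghz, scaling='log10',
                    final_nsigma=3.0, min_first_nsigma=5.0):
    """Clean thresholds (in units of RMS) for each solution interval.

    The first threshold is peak_snr / dividing_factor, but never below 5.0;
    the last is 3.0; intermediate values are spaced per the scaling option.
    """
    first = max(peak_snr / default_dividing_factor(freq_ghz), min_first_nsigma)
    if n_solints == 1:
        return [first]
    steps = [i / (n_solints - 1) for i in range(n_solints)]
    if scaling == 'linear':
        return [first + (final_nsigma - first) * s for s in steps]
    if scaling in ('log10', 'loge'):
        # Even steps in the logarithm, i.e. a geometric progression
        log_first, log_final = math.log10(first), math.log10(final_nsigma)
        return [10.0 ** (log_first + (log_final - log_first) * s) for s in steps]
    raise ValueError("scaling must be 'linear', 'log10', or 'loge'")


print(nsigma_schedule(400.0, 3, 3.0, scaling='linear'))
```

Multiplying each value by the measured RMS gives the clean threshold for that solution interval. The logarithmic option drops the threshold quickly at first and then more gently as it approaches 3.0.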
If the pipeline is run using mpicasa, there are two parallelization modes that the hif_selfcal task may use depending on the data being self-calibrated. If there are multiple sources in the dataset, hif_selfcal will execute self-calibration simultaneously on each source, with each source using a single thread. If there is only a single target, then all the threads will be used to conduct imaging of the lone target in parallel. Self-calibration solutions are applied using applycal(applymode='calflag') which means that flagged gain solutions will result in data being flagged.
The temporary working files for self-calibration (images and gain tables) are located in sc_workdir_TARGETNAME. Users can examine this directory if more detail than is provided in the weblogs is desired. The file naming convention for the images is Target_{Targetname}_{Telescope}_{Band}_{solution_interval}_{nth_selfcalibration_round}(_post). Images containing '_post' in their filenames are generated after the application of the self-calibration solutions for that solution interval and are cleaned to the same depth as the images created at the start of the solution interval to set the model.
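The naming convention above can be reproduced with a short formatter, which may help when searching the working directory for a particular image. The function is hypothetical, and the example values for telescope, band, and solution interval are illustrative placeholders rather than strings confirmed by this guide; check the actual directory contents for the exact spellings.

```python
def selfcal_image_name(target, telescope, band, solint, iteration, post=False):
    """Build an image name following the convention quoted above."""
    name = f"Target_{target}_{telescope}_{band}_{solint}_{iteration}"
    # '_post' marks images made after applying that interval's solutions
    return name + "_post" if post else name


# Placeholder values for illustration only
print(selfcal_image_name('HL_Tau', 'VLA', 'Ka', 'inf_EB', 0))
print(selfcal_image_name('HL_Tau', 'VLA', 'Ka', 'inf_EB', 0, post=True))
```

Comparing the image with and without the '_post' suffix for the same solution interval shows the effect of that interval's solutions at a fixed clean depth.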
If self-calibration was previously run on the target and the output is put into the products directory (*.hifv_calimage_cont_selfcal.auxproducts.tgz), the hif_selfcal task will look for this tar.gz file. If the file exists, the pipeline will reapply the pre-computed solutions and not run self-calibration. If a new run of self-calibration is desired, this file should be removed from the ../products directory.
What you get: Pipeline Products
VLA imaging pipeline output includes data products such as primary beam corrected images and spectral index images. Note that the automated execution at NRAO will also run an additional data packaging step (hifv_exportdata) which moves most of the files to an upper level '../products' directory. This step is omitted from the pipescript method, and all products remain within the 'root' directory where the pipeline was executed.
The most important pipeline products include:
- Science target images for each band and each target, and spectral line spws (if applicable) (files start with 'oussid*' in the root directory). These include the primary beam corrected tt0 image, tt1 (not primary beam corrected), pb (primary beam profile), clean mask, alpha (spectral index), and alpha.error (spectral index uncertainty) files. Note that when a very large number of pixels are used for the image (typically A and B configuration and/or high frequency data), images loaded in CASAviewer or CARTA may appear blank and simply need to be zoomed to find the source(s).
- If self-calibration was successful, a tar file (*.hifv_calimage_cont_selfcal.auxproducts.tgz) containing the gaintables and JSON files to enable reapplication of the self-calibration solutions to the data.
- A weblog that is supplied as a compressed tarball weblog.tgz. When extracted, it has the form pipeline-YYYYMMDDTHHMMSSS/html/index.html, where the YYYYMMDDTHHMMSSS stands for the pipeline execution time stamp (multiple pipeline executions will result in multiple weblogs). The weblog contains information on the pipeline processing steps with diagnostic plots and statistics. The images for each target field in the weblog will not likely show detail for your observed target fields given the size of the images that might be created and the limited size of the weblog images. A VLA imaging pipeline CASA guide is under construction.
- The casapy-YYYYMMDD-HHMMSS.log CASA logger messages (in pipeline-YYYYMMDDTHHMMSSS/html/).
- 'casa_pipescript.py' (in pipeline-YYYYMMDDTHHMMSSS/html/), the script with the actually executed pipeline heuristic sequence and parameters. This file can be used to modify and re-execute the pipeline (see section The casa_pipescript.py file). Note that we also refer to a casa_imaging_pipescript.py; this is simply to differentiate between a script that runs calibration pipeline commands (possibly along with imaging) and one that runs only imaging. This file is created by the pipeline and is always called casa_pipescript.py regardless of what the filename of your script is called.
- 'casa_commands.log' (in pipeline-YYYYMMDDTHHMMSSS/html/), which contains the actual CASA commands that were generated by the pipeline heuristics (see section The casa_commands.log file).
- The output from CASA's task listobs is available at 'pipeline-YYYYMMDDTHHMMSSS/html/sessionSession_default/mySDM.ms/listobs.txt' and contains the characteristics of the observations (scans, source fields, spectral setup, antenna positions, and general information).
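As a minimal sketch of unpacking the weblog tarball and locating the files listed above, the following standard-library Python may help. It assumes the tarball unpacks to pipeline-<timestamp>/html/ as described above; the function name extract_weblog and the returned dictionary keys are illustrative and not part of the pipeline.

```python
import tarfile
from pathlib import Path


def extract_weblog(tarball, dest="."):
    """Unpack a weblog tarball and return the expected paths of its key files.

    Assumes the layout described above: pipeline-<timestamp>/html/...
    The returned paths are where the files are expected; check .is_file()
    before relying on any of them.
    """
    dest = Path(dest)
    # "r:*" lets tarfile detect the compression of the .tgz file itself.
    with tarfile.open(tarball, "r:*") as tar:
        tar.extractall(dest)

    found = {}
    # One entry per pipeline execution found under dest.
    for html_dir in sorted(dest.glob("pipeline-*/html")):
        found[html_dir.parent.name] = {
            "index": html_dir / "index.html",
            "pipescript": html_dir / "casa_pipescript.py",
            "commands": html_dir / "casa_commands.log",
            "casa_logs": sorted(html_dir.glob("casapy-*.log")),
        }
    return found
```

Calling extract_weblog('weblog.tgz') in the directory that holds the tarball returns one entry per pipeline execution, keyed by the pipeline-<timestamp> directory name.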
The Imaging Pipeline casa_imaging_pipescript.py File
The sequence of pipeline heuristic steps is listed in the 'casa_pipescript.py' script that is located in the pipeline-YYYYMMDDTHHMMSSS/html directory (where YYYYMMDDTHHMMSSS is the timestamp of the execution). Note that no matter what you call your script, the pipeline will create a file called casa_pipescript.py in the aforementioned directory as a record of what pipeline functions were run.
A typical 'casa_cont_imaging_pipescript.py' or 'casa_cont_line_imaging_pipescript.py' has the following structure (where mySDM is again a placeholder for the name of the SDM-BDF raw data file and will have the name of the one that was processed). The continuum-only script looks like:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['myCaldMS.ms'], datacolumns={'data': 'raw','corrected': 'regcal_contline_all'})
    hifv_flagtargetsdata()
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont', datatype='regcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    #hifv_exportdata(imaging_products_only=True)
finally:
    h_save()
The continuum and line imaging pipescript looks like:
# This CASA pipescript is meant for use with CASA 6.6.1 and pipeline 2024.1.0.8
context = h_init()
context.set_state('ProjectSummary', 'observatory', 'Karl G. Jansky Very Large Array')
context.set_state('ProjectSummary', 'telescope', 'EVLA')
try:
    hifv_importdata(vis=['myCaldMS.ms'], datacolumns={'data': 'raw','corrected': 'regcal_contline_all'})
    hifv_flagtargetsdata()
    hifv_mstransform()
    hif_checkproductsize(maximsize=16384)
    hif_makeimlist(specmode='cont')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='mfs')
    hif_findcont()
    hif_uvcontsub()
    hif_makeimlist(specmode='cube')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maximsize=16384)
    hif_selfcal()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hif_checkproductsize(maxcubesize=20.0, maxcubelimit=40.0, maxproductsize=100.0)
    hif_makeimlist(specmode='cube', datatype='selfcal')
    hif_makeimages(hm_cyclefactor=3.0)
    hifv_pbcor()
    #hifv_exportdata(imaging_products_only=True)
finally:
    h_save()
Note that executions at NRAO may show small differences, e.g., an additional final hifv_exportdata step (commented out in the examples above) that packages the products to be stored in the NRAO archive.
The above is, in fact, a standard user 'casa_imaging_pipescript.py' file for the current CASA and pipeline version (download to edit and run yourself: continuum-only or continuum and spectral line) that can be used for general pipeline processing after inserting the correct myCaldMS filename in hifv_importdata.
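Inserting the filename can be done by hand in any text editor. As a minimal sketch for doing it programmatically, the helper below copies a template script and replaces the 'myCaldMS.ms' placeholder used in the listings above. The function name fill_in_ms_name and its arguments are hypothetical and not part of the pipeline.

```python
from pathlib import Path


def fill_in_ms_name(template_path, ms_name, output_path, placeholder="myCaldMS.ms"):
    """Copy a pipescript, replacing the placeholder MS name with a real one.

    The placeholder default matches the listings above; ms_name is the name
    of your own calibrated measurement set.
    """
    text = Path(template_path).read_text()
    if placeholder not in text:
        # Refuse to write a script that would still point at the wrong data.
        raise ValueError(f"placeholder {placeholder!r} not found in {template_path}")
    output = Path(output_path)
    output.write_text(text.replace(placeholder, ms_name))
    return output
```

For example, fill_in_ms_name('casa_cont_imaging_pipescript.py', 'my_data.ms', 'my_imaging_script.py') writes a copy of the script in which hifv_importdata points at my_data.ms (both filenames here are illustrative).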
The call to hifv_flagtargetsdata is there in case additional flagging needs to be added to the flagging template; all other flagging modes are turned off. This task currently does not apply flags to the spectral line MS; this limitation will be fixed in a future pipeline release.
The hifv_pbcor call after hif_makeimages will perform the primary beam correction. This is needed because tclean by default will not do the primary beam correction on wideband images. Note that this primary beam correction is approximate and uses the primary beam determined from the center of the band. If more accurate correction is required, please see the CASA task widebandpbcor.
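Conceptually, a primary beam correction divides each image pixel by the primary beam response at that position, so that sources away from the pointing center recover their true brightness. The pure-Python sketch below illustrates only that arithmetic on small nested lists; it is not the implementation used by hifv_pbcor, and the function name and the pb_cutoff value of 0.1 are assumptions chosen for illustration.

```python
import math


def pb_correct(image, pb, pb_cutoff=0.1):
    """Divide each pixel by the primary beam response at that pixel.

    image and pb are equally shaped nested lists (rows of pixel values).
    Pixels where the response is below pb_cutoff (an illustrative value)
    are blanked with NaN, since dividing by a tiny response mostly
    amplifies noise.
    """
    if len(image) != len(pb) or any(len(r) != len(p) for r, p in zip(image, pb)):
        raise ValueError("image and pb must have the same shape")
    return [
        [value / response if response >= pb_cutoff else math.nan
         for value, response in zip(image_row, pb_row)]
        for image_row, pb_row in zip(image, pb)
    ]
```

A pixel of 1.0 Jy/beam at the half-power point of the beam (response 0.5) becomes 2.0 Jy/beam after correction. Real images would normally be handled as arrays by CASA tasks rather than as Python lists.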
The imaging pipeline run can be modified by adapting this script. At present there are limited options that should be altered. The script can then be (re-)executed via:
# In CASA
execfile('casa_cont_imaging_pipescript.py')
or
execfile('casa_cont_cube_imaging_pipescript.py')
The casa_commands.log File
casa_commands.log is another useful file in pipeline-YYYYMMDDTHHMMSSS/html (where YYYYMMDDTHHMMSSS is the timestamp of the pipeline execution) that lists all the individual CASA commands that the pipeline heuristics (hifv) tasks produced. Note that 'casa_commands.log' is not executable itself, but contains all the CASA tasks and associated parameters to trace back the individual data reduction steps.
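To get a quick overview of which CASA tasks a run produced, the log can be summarized with a few lines of Python. This is a sketch under an assumed file layout: comment lines start with '#' and each command line begins with a task name followed by an opening parenthesis. The function name count_casa_tasks is illustrative.

```python
import re
from collections import Counter

# Assumed shape of a command line: a task name followed by "(".
_TASK_CALL = re.compile(r"^\s*([A-Za-z_]\w*)\s*\(")


def count_casa_tasks(log_path):
    """Count how often each task name starts a line in a casa_commands.log-style file."""
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            if line.lstrip().startswith("#"):
                continue  # skip comment lines
            match = _TASK_CALL.match(line)
            if match:
                counts[match.group(1)] += 1
    return counts
```

The result is a Counter, so count_casa_tasks(path).most_common() lists the tasks from most to least frequent.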
The Pipeline Weblog
Information on the pipeline run can be inspected through a weblog that is launched by pointing a web browser to file:///<path to your working directory>/pipeline-YYYYMMDDTHHMMSSS/html/index.html. The weblog contains statistics and diagnostic plots for the SDM-BDF as a whole and for each stage of the pipeline. The weblog is the first place to check if a pipeline run was successful and to assess the quality of the calibration.
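As a small convenience, the file:// address of each weblog in a working directory can be built with the Python standard library. This sketch assumes the pipeline-<timestamp>/html/index.html layout described above; the function name weblog_urls is illustrative.

```python
from pathlib import Path


def weblog_urls(working_dir="."):
    """Return file:// URLs for every weblog index.html found under working_dir."""
    # as_uri() requires an absolute path, hence resolve().
    root = Path(working_dir).resolve()
    return [page.as_uri() for page in sorted(root.glob("pipeline-*/html/index.html"))]
```

Each returned URL can be pasted into a browser or passed to the standard-library function webbrowser.open.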
An example walkthrough of a calibration pipeline weblog is provided in the VLA Pipeline CASA guide. A similar walkthrough for the imaging pipeline is under construction.
Note that we regularly test the weblog on Firefox. Other browsers may not display all items correctly.
Quality Assessment (QA) Scores
Each pipeline stage has a quality assessment (QA) score assigned to it. The values range from 0 to 1, where:
0.9-1.0 Standard/Good (green color)
0.66-0.90 Below-Standard (blue color; also shown as a question mark symbol)
0.33-0.66 Warning (amber color; cross symbol)
0.00-0.33 Error (red color; cross symbol)
We recommend that all pipeline stages and the relevant products are checked. Below-standard and Warning scores should receive extra scrutiny. The QA section at the bottom of the weblog of each stage will provide more information about the particular origin of each score. Errors are usually very serious issues with the data or processing and should be resolved in any case. The QA scores for the imaging pipeline are not currently in a mature state. Currently the most relevant QA scores will be associated with hif_makeimages() where the scores will be dictated by the S/N ratio in the image. Low S/N will get a low score, but that may be expected depending on the properties of your data.
Examples of QA scores are provided in the Pipeline CASA guide.
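The score ranges listed above can be expressed as a small lookup, which may be handy when scanning many stages. This is a sketch only: the function name qa_category is illustrative, and because the listed ranges share their end points, it assumes that a score exactly on a boundary (0.33, 0.66, or 0.9) belongs to the better category.

```python
def qa_category(score):
    """Map a QA score in [0, 1] to the category names used in the weblog.

    Assumption: boundary values (0.33, 0.66, 0.9) fall in the better category.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("QA scores range from 0 to 1")
    if score >= 0.9:
        return "Standard/Good"
    if score >= 0.66:
        return "Below-Standard"
    if score >= 0.33:
        return "Warning"
    return "Error"
```

For example, qa_category(0.5) returns "Warning", a score that should receive extra scrutiny as recommended above.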
Known Issues
This section will be updated as the pipeline continues to be developed. The comments below are general to most pipeline versions, including the current production pipeline used for VLA data. The current production version is CASA 6.6.1.
- Primary beam correction is done only on the tt0 image, and does not use the CASA task widebandpbcor which corrects for the spectral dependence of the primary beam and also corrects the .alpha images produced by tclean. Wide band primary beam correction is planned for inclusion in tclean in a future CASA release.
- The image size limitation will result in the primary beam (and side lobes) not being fully imaged for low-frequency data (primarily S-band and L-band) in A and B configurations. Therefore, strong sources outside the primary beam will not be deconvolved and may yield poor results.
- W-projection is not used for imaging L and S-bands because that option is very resource intensive. As such, sources away from the inner ~1/3 of the primary beam may have position offsets and deconvolution errors.
- The weblog for hif_checkproductsize will show '-1 GB' for many entries. These are currently unused modes for cube images and the maximum allowed size of products. Mitigation is currently only done to limit images to a maximum size of 16384x16384 pixels.
- If a cube includes channels for which the beam size differs considerably from the rest, deconvolution using a common restoring beam may result in a cube of poor quality. As such, additional flagging of some channels may be necessary prior to running the pipeline.
- hifv_flagtargetsdata will not apply the flagging template to the spectral line MS, because the flagging state of the spectral line data is restored to its state immediately before RFI flagging was run. This will be fixed in a future release.
- For imported calibrated data (typically imaging-only recipes), hifv_importdata will show a warning because there is a HISTORY table present from the calibration; this warning can be ignored.
- Ephemeris targets (i.e., Solar System objects) have not been validated with the imaging pipeline and may not work properly.
Previous Imaging Pipeline Releases: