Correlator

Introduction

The correlator is situated in the SOC, at the end of the data path.  Its role is to reproduce the signals recorded at the VLBA stations and any others involved in the observation, and to combine them in two-station baseline pairs, to yield the visibility function, which is the fundamental measurement produced by the VLBA.  VLBA observations are processed using the DiFX software correlator. DiFX was developed at Swinburne University in Melbourne, Australia (Deller et al. 2007), and adapted to the VLBA operational environment by NRAO staff (Brisken 2008).  Subsequent references to "DiFX" apply only to this VLBA implementation.

We encourage users to include the following text in the Acknowledgments section of any publication arising from VLBA observations made since December 2009:

This work made use of the Swinburne University of Technology software correlator, developed as part of the Australian Major National Research Facilities Programme and operated under licence.

... and to cite the following recent paper by the developers: Deller et al. 2011, PASP, 123, 275.

Software correlation has become feasible in recent years, and is especially well suited to applications like VLBI with bandwidth-limited data-transmission systems and non-realtime processing. Among its several advantageous aspects are: (1) flexible allocation of processing resources to support correlation of varying numbers of stations, frequency and time resolution, and various special processing modes, with no fundamental fixed limits other than the finite performance of the processing cluster; (2) optimization of resource usage to minimize processing time; (3) integration of control and processing functions; (4) continuously scalable, incremental upgrade paths; and (5) relatively straightforward implementation of special modes and tests. These and other virtues of software correlation are discussed in more detail by Deller et al. (2007).

Despite the absence of fixed limits cited in item (1) above, NRAO has established guidelines for the extremes of spectral resolution, integration period, and output rate, for routine DiFX processing, as specified in the appropriate sections below. Exceptions will be considered for proposals including a sufficiently compelling scientific justification.

DiFX does not process data from a single antenna, nor is a multi-station autocorrelation-only mode available.

Operation of DiFX is governed primarily by an observation description in VEX format (currently vex1.5).  This format is used for both station and correlator control functions in a number of VLBI arrays, and NRAO program SCHED (Walker 2011) has been producing it for many years.

DiFX only accepts data on Mark5 disk modules, as recorded by a Mark5A, Mark5B, Mark5B+ or Mark5C recorder.  It can process data in a variety of formats including VLBA, Mark4, and Mark5B.  Support for VDIF format is currently incomplete but includes those versions created by the VLBA RDBE and the VLA WIDAR correlator.

Correlator output is written according to the FITS Interferometry Data Interchange Convention (Greisen 2009).  In addition to the fundamental visibility function measurements and associated meta-data, the FITS files include amplitude and phase calibration measurements, weather data, and editing flags, derived from data logged at the VLBA stations.  A recent AIPS release is required to handle DiFX data properly;  31DEC12 is recommended to support all features of the current DiFX version.

Conversion of DiFX correlator output to the Mark 4 format that is used primarily in analysis of geodetic observations is also available.  To enable this additional output, a SCHED parameter CORDFMT=MARK4 should be specified.

Spectral Resolution

DiFX allows quite flexible selection of the desired number of "spectral points" spanning each individual data channel.  Any number that can be factored as \(2^n \cdot 5^m\) can be specified, subject to these limitations:

  • A maximum of 4096 points per channel, for routine DiFX processing.
  • A total of 132,096, summed over all channels and polarization products, for compatibility with AIPS.
  • A minimum spectral resolution of 2 Hz.

The number of spectral points must be the same for all data channels at any given time, although multiple passes are possible with different sets of channels. The actual spectral resolution obtained, and statistical independence of the spectral points, depends on subsequent smoothing and other processing.
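
As an illustration, the constraints above can be checked with a short script. This is only a sketch and is not part of DiFX or SCHED: the function names are hypothetical, the spectral resolution is assumed to be the data channel bandwidth divided by the number of spectral points, and the AIPS total is assumed to be the product of spectral points, data channels, and polarization products.

```python
def is_valid_point_count(n: int) -> bool:
    """Return True if n can be factored as 2**a * 5**b with a, b >= 0."""
    if n < 1:
        return False
    for prime in (2, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def check_spectral_setup(n_points: int, bandwidth_hz: float,
                         n_channels: int, n_pol_products: int) -> list:
    """Return a list of guideline violations (empty if none are found).

    Assumptions of this sketch: resolution = bandwidth / points, and the
    AIPS total = points * channels * polarization products.
    """
    problems = []
    if not is_valid_point_count(n_points):
        problems.append("number of points is not of the form 2^n * 5^m")
    if n_points > 4096:
        problems.append("more than 4096 points per channel")
    if n_points * n_channels * n_pol_products > 132096:
        problems.append("total spectral points exceed 132,096")
    if bandwidth_hz / n_points < 2.0:
        problems.append("spectral resolution finer than 2 Hz")
    return problems


# Example: 8 channels of 16 MHz, 4 polarization products, 4096 points each.
print(check_spectral_setup(4096, 16e6, 8, 4))
```

Any violation reported by such a check would correspond to a request needing the scientific justification described in the Introduction.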

DiFX also supports "spectral zooming", selection of a subset of correlated spectral points from any or all data channels.  Only the selected spectral points are included in the output dataset.  This capability is of value mainly in maser studies, where a recorded data channel may be much wider than the maser emission in two main categories of observations:  (1) Maser astrometry with in-beam continuum calibrators.  Wideband observing is required for maximum sensitivity on the calibrators, while zooming allows high spectral resolution at the frequencies where maser emission appears.  (2) Multiple maser transitions.  When wideband data channels are used to cover a large number of widely separated maser transitions, spectral zooming allows the empty portions of high-resolution spectrum to be discarded.

In proposing observations that will use spectral zooming, the required number of spectral points before zooming should be specified in the Proposal Submission Tool.  Currently, the location and width of the "zoom" bands must be communicated directly to VLBA operations before correlation.

Integration Period

DiFX accommodates a nearly continuous range of correlator integration periods over the range of practical interest. Individual integrations are quantized in multiples of the indivisible internal FFT interval, which is equal to the number of spectral points requested, divided by the data channel bandwidth.

For most cases, with low to moderate spectral resolution, and/or wide data channels, the FFT intervals are fairly short, and it is straightforward to find an integration period in any desired range that is an optimal integral multiple of the FFT interval. ("Optimal" refers here to the performance of DiFX.)  Extreme cases of very high spectral resolution (many spectral points across a narrow data channel, giving a resolution of less than about 100 Hz) imply FFT intervals long enough that only limited choices of integral multiples are available.

For flexibility in these situations (although the option exists in all cases), integration periods other than an integral multiple of the FFT interval can be approximated, in a long-term mean, by an appropriate sequence of nearby optimal integral multiples. In this case, output records are time-tagged as if correlated with exactly the requested period.

SCHED accepts an additional parameter so that users can indicate that the requested integration period is to be implemented exactly, as described above. Otherwise, the nearest optimal integral multiple of the FFT interval is passed to the correlator.
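
The quantization described in this section can be sketched numerically. The example below is illustrative only: the function names are hypothetical, and it simply picks the integral multiple of the FFT interval nearest to the requested period. It does not model the DiFX performance criteria that determine which multiples are "optimal".

```python
def fft_interval(n_points: int, bandwidth_hz: float) -> float:
    """FFT interval in seconds: spectral points divided by channel bandwidth."""
    return n_points / bandwidth_hz


def nearest_multiple(requested_s: float, n_points: int, bandwidth_hz: float):
    """Return (k, period) where period = k * FFT interval is nearest to requested_s.

    Simplification: any integer k >= 1 is allowed; DiFX's own notion of an
    optimal multiple is not modelled here.
    """
    t_fft = fft_interval(n_points, bandwidth_hz)
    k = max(1, round(requested_s / t_fft))
    return k, k * t_fft


# Moderate resolution: 256 points across an 8 MHz channel, 2 s requested.
print(nearest_multiple(2.0, 256, 8e6))

# Very high resolution: 4096 points across a 62.5 kHz channel (about 15 Hz).
print(nearest_multiple(2.0, 4096, 62.5e3))
```

In the first case the FFT interval is 32 microseconds and a 2-second period is an exact multiple. In the second the FFT interval is about 65.5 ms, so the nearest multiple (31 intervals, about 2.03 s) differs noticeably from the request, which is the situation the approximation by a sequence of nearby multiples is meant to address.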

Multiple Phase Centers

The field of view in VLBI observations is very small, around \(10^{-4}\) of the primary antenna beam area. This restricted interferometer beam arises in the correlation process, from smearing due to averaging in time (with, typically, a 2-second period) and/or across bandwidth ("chromatic aberration" over, typically, 0.5 MHz spectral resolution), at positions away from the correlation phase center. Thus, imaging of targets that are widely spaced in the primary beam requires multiple processing passes in typical correlator implementations. If the visibilities are maintained at high time and frequency resolution, it is possible to perform a u-v shift after correlation, essentially repointing the correlated dataset to a new phase center. However, this approach would require prohibitively large visibility datasets.

DiFX implements multiple u-v shifts inside the correlator, to generate as many phase centers as are necessary, in a single correlation pass. The output consists of one dataset of normal size for each phase center. This mode consumes around three times the correlator resources of a normal continuum correlation, due to the need for finer frequency resolution before the u-v shift, but the additional cost is only weakly dependent on the number of phase centers. For reasonable spectral and temporal resolution requirements (for example, adequate for smearing < 10% at the 50% contour of the VLBA primary beam), 200 phase centers require only 20% more correlator time than 2 phase centers. Extremely high spectral and/or temporal resolution (e.g. for shifts even closer to the edge of the primary beam) carries a higher overhead per additional phase center. This mode thus should be requested only for imaging of three or more sources within any single antenna pointing.  The correlator output rate expands proportionally to the number of phase centers.

Correlator memory limits the product of baselines, spectral points, and phase centers for one correlator pass.  The current limit is approximately 600 phase centers for the 10-element VLBA at 2 Gbps record rate (512 MHz polarization-summed bandwidth).  However, an unlimited number of phase centers can ultimately be achieved in multiple correlation passes.

Multiple phase-center correlation is requested in the NRAO Proposal Submission Tool by setting the "Number of Fields" item in the resource section to the maximum number of phase centers required for any antenna pointing specified in a given resource. The requested spectral resolution and integration time should correspond to the desired initial number of spectral points per data channel (required to minimize bandwidth smearing) and the desired integration between u-v shifts (to minimize time smearing).  An expanded output data rate that exceeds the current limit, as well as any required multiple passes, must be justified specifically in the proposal.

SCHED includes facilities to support specification of the actual phase center locations to be used in correlation.

For more details on wide-field imaging techniques, see Bridle & Schwab (1999), and Garrett et al. (1999).

Output Rate

Correlation parameters should result in an output rate less than 10 MBytes per second (of observing time) for routine DiFX processing; higher rates may be considered if required and adequately justified. Observers should ensure that their data-analysis facilities can handle the dataset volumes that will result from the correlation parameters they specify.

An approximate parametrization of the output rate is given by

\[R = 4 \cdot \frac{N_{\rm stn} \cdot (N_{\rm stn}+1) \cdot N_{\rm sbb} \cdot N_{\rm spc}}{T_{\rm int}}\cdot N_{\rm phc} \cdot p\]

where the rate \(R\) is in bytes per second;

\( N_{\rm stn} , \; N_{\rm sbb} , \; N_{\rm spc} \) are the numbers of observing stations, data channels, and spectral points per data channel, respectively;

\(T_{\rm int}\) is the correlator integration period, in seconds;  and

\(N_{\rm phc}\) is the number of phase centers.

The polarization factor \(p=1\) for single-polar, or dual-polar parallel-hand output;  or \(p=2\) for cross-polar, four-Stokes processing.

Output data rates are also estimated by SCHED.
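
For a quick estimate outside SCHED, the parametrization above can be evaluated directly. The following is a minimal sketch: the function name is hypothetical, the integration period is assumed to be in seconds, and the 10 MBytes per second guideline is assumed to mean \(10^7\) bytes per second.

```python
def output_rate(n_stn: int, n_sbb: int, n_spc: int, t_int_s: float,
                n_phc: int = 1, p: int = 1) -> float:
    """Approximate correlator output rate in bytes per second.

    Implements R = 4 * N_stn * (N_stn + 1) * N_sbb * N_spc / T_int * N_phc * p,
    with p = 1 for single-polar or dual-polar parallel-hand output and
    p = 2 for cross-polar, four-Stokes processing.
    """
    return 4 * n_stn * (n_stn + 1) * n_sbb * n_spc / t_int_s * n_phc * p


ROUTINE_LIMIT = 10e6  # assumed reading of "10 MBytes per second"

# Example: 10 stations, 8 data channels, 256 spectral points, 2 s integrations.
rate = output_rate(n_stn=10, n_sbb=8, n_spc=256, t_int_s=2.0)
print(rate, rate < ROUTINE_LIMIT)  # 450560.0 True
```

Because the rate scales linearly with the number of phase centers and inversely with the integration period, multiple phase-center or short-integration setups are the cases most likely to approach the guideline.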



The National Radio Astronomy Observatory and Green Bank Observatory are facilities of the U.S. National Science Foundation operated under cooperative agreement by Associated Universities, Inc.