Correlator
Introduction
The correlator is situated in the SOC, at the end of the data path. Its role is to reproduce the signals recorded at the VLBA stations and any others involved in the observation, and to combine them in two-station baseline pairs, to yield the visibility function which is the fundamental measurement produced by the VLBA. VLBA observations are processed using the DiFX software correlator. DiFX was developed at Swinburne University in Melbourne, Australia (Deller et al. 2007), and adapted to the VLBA operational environment by NRAO staff (Brisken 2008).
We encourage users to include the following text in the Acknowledgments section of any publication arising from VLBA observations made since December 2009:
This work made use of the Swinburne University of Technology software correlator, developed as part of the Australian Major National Research Facilities Programme and operated under licence.
... and to cite the following recent paper by the developers: Deller et al. 2011, PASP, 123, 275.
Software correlation has become feasible in recent years, and is especially well suited to applications like VLBI with bandwidth-limited data-transmission systems and non-realtime processing. Among its several advantageous aspects are: (1) flexible allocation of processing resources to support correlation of varying numbers of stations, frequency and time resolution, and various special processing modes, with no fundamental fixed limits other than the finite performance of the processing cluster; (2) optimization of resource usage to minimize processing time; (3) integration of control and processing functions; (4) continuously scalable, incremental upgrade paths; and (5) relatively straightforward implementation of special modes and tests. These and other virtues of software correlation are discussed in more detail by Deller et al. (2007).
Despite the absence of fixed limits cited in item (1) above, NRAO has established guidelines for the extremes of spectral resolution, integration period, and output rate, for routine DiFX processing, as specified in the appropriate sections below. Exceptions will be considered for proposals including a sufficiently compelling scientific justification.
DiFX processes 2-bit samples with substantially greater efficiency than 1-bit samples over double the bandwidth, basically because only half as many samples must be correlated. Since these two cases have nearly equivalent sensitivity, 1-bit sampling is no longer supported by the VLBA's data systems.
Operation of DiFX is governed primarily by an observation description in VEX format. This format is used for both station and correlator control functions in a number of VLBI arrays, and NRAO program SCHED (Walker 2011) has been producing it for many years.
DiFX processes input data recorded in Mark 5A, Mark 5B, and the new Mark 5C formats. Correlator output is written according to the FITS Interferometry Data Interchange Convention (Greisen 2009). In addition to the fundamental visibility function measurements and associated meta-data, the FITS files include amplitude and phase calibration measurements, weather data, and editing flags, derived from data logged at the VLBA stations (Ulvestad 1999). AIPS release 31DEC08 or later is required to handle DiFX data properly.
Conversion of DiFX correlator output to the Mark 4 format that is used primarily in analysis of geodetic observations is also available. To enable this additional output, a SCHED parameter CORDFMT=MARK4 should be specified.
Spectral Resolution
A recent change to DiFX allows a more flexible selection of the desired number of "spectral points" spanning each individual data channel. Any number that can be factored as \(2^n \cdot 5^m\) can now be specified, subject to a maximum of 4096 for routine DiFX processing. The total number of spectral points, summed over all data channels and polarization products, is now limited to 132,096 (for compatibility with AIPS). The number of spectral points must be the same for all data channels at any given time, although multiple passes are possible with different sets of channels. The actual spectral resolution obtained, and the statistical independence of the spectral points, depend on subsequent smoothing and other processing.
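The factorization rule can be checked mechanically. The following is a minimal sketch, not part of DiFX or SCHED: the function names are hypothetical, and it assumes that 132,096 is an upper bound on the product of spectral points, data channels, and polarization products.

```python
MAX_POINTS_PER_CHANNEL = 4096   # routine DiFX processing limit quoted above
MAX_TOTAL_POINTS = 132_096      # assumed to be an upper bound (AIPS compatibility)


def is_valid_spectral_points(n, max_points=MAX_POINTS_PER_CHANNEL):
    """Return True if n can be written as 2**a * 5**b and does not exceed max_points."""
    if n < 1 or n > max_points:
        return False
    # Strip out every factor of 2 and 5; anything left over is a forbidden prime.
    for prime in (2, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def fits_total_limit(n, n_channels, n_pol_products, max_total=MAX_TOTAL_POINTS):
    """Check the summed spectral-point count over all channels and polarization products."""
    return n * n_channels * n_pol_products <= max_total


print(is_valid_spectral_points(4000))   # 4000 = 2**5 * 5**3
print(is_valid_spectral_points(3000))   # 3000 contains a factor of 3
print(fits_total_limit(4096, 8, 4))     # 131,072 points in total
```

Under these assumptions, 4000 points per channel is acceptable while 3000 is not, and eight channels of 4096 points with four polarization products stay just inside the total.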
DiFX also supports "spectral zooming", selection of a subset of correlated spectral points from any or all data channels. Only the selected spectral points are included in the output dataset. This capability will be of value mainly in maser studies, where a recorded data channel may be much wider than the maser emission in two main categories of observations: (1) Maser astrometry with in-beam continuum calibrators. Wideband observing is required for maximum sensitivity on the calibrators, while zooming allows high spectral resolution at the frequencies where maser emission appears. (2) Multiple maser transitions. When wideband data channels are used to cover a large number of widely separated maser transitions, spectral zooming allows the empty portions of high-resolution spectrum to be discarded.
In proposing observations that will use spectral zooming, the required number of spectral points before zooming should be specified in the Proposal Submission Tool. Currently, the location and width of the "zoom" bands must be communicated directly to VLBA operations before correlation.
Integration Period
DiFX accommodates a nearly continuous range of correlator integration periods over the range of practical interest. Individual integrations are quantized in multiples of the indivisible internal FFT interval, which is equal to the number of spectral points requested, divided by the data channel bandwidth.
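As a rough illustration of this quantization, the sketch below computes the FFT interval and the closest whole multiple of it to a requested integration period. The function names and the example channel setups are hypothetical, and "closest" here ignores whatever performance considerations DiFX applies when it chooses a multiple.

```python
def fft_interval_s(n_spectral_points, bandwidth_hz):
    """FFT interval in seconds: spectral points divided by data channel bandwidth."""
    return n_spectral_points / bandwidth_hz


def nearest_multiple_s(requested_s, interval_s):
    """Closest whole multiple of the FFT interval to the requested period (at least one)."""
    return max(1, round(requested_s / interval_s)) * interval_s


# Illustrative values only: a wide channel at modest resolution,
# and a narrow channel at very high resolution.
wide = fft_interval_s(256, 8e6)
narrow = fft_interval_s(4096, 62.5e3)

print(wide, nearest_multiple_s(2.0, wide))
print(narrow, nearest_multiple_s(2.0, narrow))
```

In the wide-channel case the FFT interval is tens of microseconds, so a 2-second request is matched almost exactly. In the narrow-channel case the interval is tens of milliseconds, and the closest multiple differs from the request by a noticeable fraction of a second.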
For most cases, with low to moderate spectral resolution, and/or wide data channels, the FFT intervals are fairly short, and it is straightforward to find an integration period in any desired range that is an optimal integral multiple of the FFT interval. ("Optimal" refers here to the performance of DiFX.) Extreme cases of very high spectral resolution (many spectral points across a narrow data channel, giving a resolution finer than about 100 Hz) imply FFT intervals long enough that only limited choices of integral multiples are available.
For flexibility in these situations (although the option exists in all cases), integration periods other than an integral multiple of the FFT interval can be approximated, in a long-term mean, by an appropriate sequence of nearby optimal integral multiples. In this case, output records are time-tagged as if correlated with exactly the requested period.
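One way to picture such a sequence is an error-accumulating scheme that keeps the running total of FFT intervals as close as possible to the ideal elapsed time. This is only a sketch of the idea under that assumption, not the scheme DiFX actually uses; the function name is hypothetical, and the requested period is assumed to be at least one FFT interval.

```python
def integration_sequence(requested_s, fft_interval_s, count):
    """Return `count` integration lengths, each a whole number of FFT intervals,
    whose running sum tracks count * requested_s as closely as possible."""
    if requested_s < fft_interval_s:
        raise ValueError("requested period must be at least one FFT interval")
    lengths = []
    used = 0  # FFT intervals consumed by the integrations emitted so far
    for k in range(1, count + 1):
        # Ideal cumulative number of FFT intervals after k integrations.
        target = round(k * requested_s / fft_interval_s)
        lengths.append(target - used)
        used = target
    return lengths


# Illustrative values: a 0.25 s request with a 65.536 ms FFT interval.
seq = integration_sequence(0.25, 0.065536, 1000)
mean_period_s = sum(seq) * 0.065536 / len(seq)
print(sorted(set(seq)), mean_period_s)
```

Each integration in this example is either 3 or 4 FFT intervals long, and the mean over many integrations converges on the requested 0.25 s.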
SCHED now accepts an additional parameter so that users can indicate that the requested integration period is to be implemented exactly, as described above. Otherwise, the nearest optimal integral multiple of the FFT interval is passed to the correlator.
Multiple Phase Centers
The field of view in VLBI observations is very small, around \(10^{-4}\) of the primary antenna beam area. This restricted interferometer beam arises in the correlation process, from smearing due to averaging in time (with, typically, a 2-second period) and/or across bandwidth ("chromatic aberration" over, typically, 0.5 MHz spectral resolution), at positions away from the correlation phase center. Thus, imaging of targets that are widely spaced in the primary beam requires multiple processing passes in typical correlator implementations. If the visibilities are maintained at high time and frequency resolution, it is possible to perform a u-v shift after correlation, essentially repointing the correlated dataset to a new phase center. However, this approach would require prohibitively large visibility datasets.
DiFX implements multiple u-v shifts inside the correlator, to generate as many phase centers as are necessary, in a single correlation pass. The output consists of one dataset of normal size for each phase center. This mode consumes around three times the correlator resources of a normal continuum correlation, due to the need for finer frequency resolution before the u-v shift, but the additional cost is only weakly dependent on the number of phase centers. For reasonable spectral and temporal resolution requirements (for example, adequate for smearing below 10% at the 50% contour of the VLBA primary beam), 200 phase centers require only 20% more correlator time than 2 phase centers. Extremely high spectral and/or temporal resolution (e.g. for shifts even closer to the edge of the primary beam) carries a higher overhead per additional phase center. This mode should therefore be requested only for imaging of three or more sources within any single antenna pointing. The output data rate must be justified if it exceeds the current limit.
Multiple phase-center correlation is requested in the NRAO Proposal Submission Tool by setting the "Number of Fields" item in the resource section to the maximum number of phase centers required for any antenna pointing specified in a given resource. The requested spectral resolution and integration time should correspond to the desired initial number of spectral points per data channel (required to minimize bandwidth smearing) and the desired integration between u-v shifts (to minimize time smearing). SCHED includes facilities to support specification of the actual phase center locations.
For more details on wide-field imaging techniques, see Bridle & Schwab (1999), and Garrett et al. (1999).
Output Rate
Correlation parameters should result in an output rate less than 10 MBytes per second (of observing time) for routine DiFX processing; higher rates may be considered if required and adequately justified. Observers should ensure that their data-analysis facilities can handle the dataset volumes that will result from the correlation parameters they specify.
An approximate parametrization of the output rate is given by
\[R = 4 \cdot \frac{N_{\rm stn} \cdot (N_{\rm stn}+1) \cdot N_{\rm sbb} \cdot N_{\rm spc}}{T_{\rm int}}\cdot N_{\rm phc} \cdot p\]
where the rate \(R\) is in Byte/s;
\( N_{\rm stn} , \; N_{\rm sbb} , \; N_{\rm spc} \) are the numbers of observing stations, data channels, and spectral points per data channel, respectively;
\(T_{\rm int}\) is the correlator integration period, in seconds; and
\(N_{\rm phc}\) is the number of phase centers.
The polarization factor \(p=1\) for single-polar, or dual-polar parallel-hand output; or \(p=2\) for cross-polar, four-Stokes processing.
Output data rates are also estimated by SCHED.
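For a quick check before scheduling, the parametrization above can be evaluated directly. The sketch below is not SCHED's estimator: the function name and the example configuration are hypothetical, \(T_{\rm int}\) is taken in seconds, and 1 MByte is assumed to mean \(10^6\) bytes.

```python
ROUTINE_LIMIT_BYTES_PER_S = 10e6  # 10 MBytes/s, assuming 1 MByte = 1e6 bytes


def difx_output_rate(n_stn, n_sbb, n_spc, t_int_s, n_phc=1, p=1):
    """Approximate correlator output rate R in bytes per second of observing time.

    n_stn: stations; n_sbb: data channels; n_spc: spectral points per channel;
    t_int_s: integration period in seconds; n_phc: phase centers;
    p: 1 for single-polar or dual-polar parallel-hand output, 2 for cross-polar output.
    """
    return 4.0 * n_stn * (n_stn + 1) * n_sbb * n_spc / t_int_s * n_phc * p


# Illustrative setup: 10 stations, 8 data channels, 256 spectral points,
# 2 s integrations, one phase center, full cross-polar output.
rate = difx_output_rate(10, 8, 256, 2.0, n_phc=1, p=2)
print(rate, rate < ROUTINE_LIMIT_BYTES_PER_S)
```

This example configuration gives roughly 0.9 MBytes per second, well inside the routine limit. The rate scales linearly with the number of phase centers and inversely with the integration period.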