MY GUIDE TO GMRT DATA ANALYSIS


1. Loading and Calibration

Load the FITS data into AIPS using FITLD. The index table is not created by default, so run INDXR to index the data; use cparm 0 0 1 0 in INDXR. Run QUACK to remove the first one or two visibilities from each scan. Run SETJY on the primary calibrator(s) to set the flux. Check the integration time before loading the data into TVFLG and use this value in dparm(6).
Load the phase cal and flux cal together into TVFLG. Checking the flux cal alone has some disadvantages: since the flux cal is observed only for a short time, either at the beginning or at the end, it is not a good representation of the entire data. The phase cal is present throughout the data and hence gives a better picture of antenna problems, such as when they start and end and whether they are intermittent. Check PHASE-DIFF first in TVFLG, independently for RR and LL. This gives a better picture of antenna health. If antennas are not working for a certain time range, flag only those time ranges. The flagging time range includes all sources within the time range. A small number of "high" points can be ignored for now, as long as they are a very small fraction and scattered randomly. (E.g., in one antenna, only one baseline has high points for a few records only, but there are many such cases.)
Run CALIB on a single channel (e.g. ichans 100, 100, 1, 0), then GETJY and CLCAL. Check the flux obtained by GETJY. One indicator of the quality of the calibration is the error in the GETJY flux.
Useful tips while running CALIB: At 325 MHz, always use a lower UV cutoff of 1 for CALIB; this gets rid of correlated RFI. Also take the reference antenna from one of E02, S01, S02, W01, W02. This will greatly improve the solutions. If a small fraction of the solutions fail, ignore them; we will get back to the data in the next round.
Check the flux cal and phase cal separately using DOCAL=1 in TVFLG and UVPLT. If high/low baselines are seen in the phase cal throughout, flag them for all sources. If high/low baselines are seen only in the flux cal and not in the phase cal, flag them only for the flux cal.
Use SORT BY LENGTH so that shorter and longer baselines can be compared. If you are looking for very diffuse emission (e.g. an HII region of several arcminutes), NEVER delete short spacings at this stage, however bad they appear; you can delete them after a few rounds of imaging. You can also check the source with docal 1. It is a little too early for that, but nevertheless it is better to have one look at this early stage. In TVFLG, for the source, always use the 'SORT BY LENGTH' option to avoid being misled by the short spacings. Flag only clearly bad data; in doubtful cases, retain the data for the next round of flagging.
If you flagged something in this round, delete all SN tables and CL version 2 and re-run the CALIB, GETJY, CLCAL chain; run CLCAL for the source as well. Check the flux cal, phase cal and source with docal 1 using TVFLG and UVPLT. If you are looking for not-so-diffuse emission, you can delete a few of the brightest short baselines (maybe the three or four shortest, high-flux ones) if their flux is many times higher than that of the typical baselines after them. If some time ranges are bad, flag them using UVFLG. If there is RFI all over the data, at different times, you can use FLGIT or CLIP after looking at relatively cleaner channels. FLGIT is a very powerful tool that can flag based on statistical parameters; it can be tried if there is bad data in many places. But before running FLGIT, a reasonable bandpass should be available, along with a reasonable estimate of the properties of the data (such as highest flux and rms) in the cleaner part of the data.
After all iterations of flagging, if you are reasonably satisfied with the flux and phase cal, run BPASS. Now check using two options of POSSM.
Option 1: Use sourc ''; doband -1; bpver 0; aparm 0; aparm(8) 2; nplo 6; solin 0; bchan 5; echan 250 (for 256-channel data). This will show the individual antenna bandshapes. If some antenna shows spikes, zoom in and note down the exact channels. If many antennas show big and small spikes in the same place, delete that channel from the full data.
Option 2: Use source 'target'; doband 1; aparm 0; nplo 0; docal 1. This will apply the bandpass to the source data and will show RR and LL separately, with only a single plot per polarisation. Here you will most likely see some spikes. For these channels, you can load that channel in TVFLG (docal 1 and doband 1) and find out how much of it is bad. If only a small time range is bad, flag only that part and retain the clean data. If more than one-third is bad, it is not worth spending time on flagging portions; flag the entire channel. Do this for RR and LL separately.
Run BPASS again and repeat Option 2 of POSSM to ensure that no spikes are seen.
USING FLGIT: If a good fraction of the data is affected by RFI, you should use FLGIT on the target. If a small fraction is bad, use CLIP. How much to clip is decided at different UV ranges. To know this, run UVPLT with the bparm 11 0 option. To adjust the y-axis, use bparm(3)=1 and bparm(7)=ymax; this helps to zoom the y-range. Decide on the clip level depending on the levels in the good time ranges. Use xinc of 1, 10, 50 and 100 to get a feel for the density of the data. Don't create sharp edges by clipping; it will introduce ripples in the images. Decide how much to clip based on a feel from TVFLG, using CLIP INTERACTIVELY in TVFLG (but not actually deleting; if you delete by mistake in TVFLG, use the UNDO FLAGS option in the left panel). If you have time, the best thing is to load the source data with docal 1; doband 1; bchan 10 and echan 245 (for 256-channel data) and SORT BY LENGTH. Use the 'LOAD NEXT CHAN' option to understand the data and the clip levels at various good channels. Do this for RR and LL separately.
Run the CALIB, GETJY, CLCAL and BPASS chain once more, after removing all old SN, BP and CL-2 tables. Run POSSM with Option 1 to figure out the rough start and end channels. A good choice is where the gain starts rising above 0.5 at the start and just before it falls to 0.5 at the end. Note down these 'limiting channels'. Do some arithmetic on how many channels you want to average, and look for integral multiples of this number between the limiting channels.
EXAMPLE: Suppose at 610 MHz you need to average to 4 MHz. This means 32 channels are to be averaged in 256-channel mode. Suppose the limiting channels you noted are 11 and 240. These span 230 channels, and the largest integral multiple of 32 that fits is 224. So I use bchan 13 and echan 236. This gives me 7 channels of 4 MHz each. At 325 MHz, you need to average to only 1 MHz, which is 8 channels. Here you could go from bchan 11 to echan 242. If required, flag channels 241 and 242, depending on their quality.
Run SPLAT with the docal=-1 and doband=3 options.
DOCAL=-1 because at this stage only the bandpass needs to be applied and not the calibration. Use APARM 3 0 and channel=32, chinc 0; use the latest FG table. This ensures that the data are properly bandpass-corrected and averaged, but NOT calibrated. The present calibration is based on a single channel, which has poor SNR, on the weaker phase cal in particular. The resulting SPLAT file has fewer channels, each with a larger bandwidth, hence the calibration will be better. The resulting SPLAT file is uncalibrated. START from SETJY (because the frequency has changed slightly) and redo the full calibration chain. The SNR is much better here, and even lower levels of bad data, which were not visible in the full-channel data, show up. Since the number of channels is small here, you can inspect individual channels. If some channels have mild RFI, something is wrong in the channels that were averaged. You may have to go back to the main data and check the corresponding channels. Many times only a small fraction of the data is bad; flag only that time range and re-splat.
You can create a bandpass for the SPLAT data also. Normally only a few percent change is noticed here. Create the BP table and apply it while splitting. Split the source using SPLIT with docal=1; doband 3; bpver 0; bchan 0; echan 0; aparm 0; ichans 0. Rename the SPLIT file to the same name as the SPLAT and UVDATA files, but with the split extension. Check the split file using TVFLG and UVPLT. Flag obviously bad data and retain doubtful cases; we can flag these at a later stage. Always sort by length so as not to get confused by the diffuse emission in the field.
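The channel arithmetic done before running SPLAT can be sketched in a few lines of Python. This is only an illustrative helper and not part of AIPS: the function names are made up here, the 32 MHz total bandwidth is an assumption inferred from the "4 MHz = 32 channels in 256-channel mode" example, and the window is centred between the limiting channels. Any window of the same width inside the limits (such as bchan 13, echan 236 in the example) works equally well.

```python
def channels_to_average(target_mhz, bandwidth_mhz=32.0, nchan=256):
    """Number of raw channels that make up one averaged channel.

    bandwidth_mhz=32 is an assumption inferred from the guide's example.
    """
    chan_width_mhz = bandwidth_mhz / nchan
    return round(target_mhz / chan_width_mhz)


def averaging_window(first_good, last_good, navg):
    """Pick bchan/echan spanning a whole number of navg-channel blocks.

    first_good and last_good are the 'limiting channels' read off POSSM.
    Returns (bchan, echan, number_of_averaged_channels).
    """
    usable = last_good - first_good + 1
    n_out = usable // navg
    if n_out == 0:
        raise ValueError("fewer usable channels than one averaging block")
    used = n_out * navg
    spare = usable - used
    # Centre the window: drop roughly half of the spare channels at each end.
    bchan = first_good + spare // 2
    echan = bchan + used - 1
    return bchan, echan, n_out


if __name__ == "__main__":
    navg_610 = channels_to_average(4.0)   # 4 MHz at 610 MHz
    print(navg_610, averaging_window(11, 240, navg_610))
    navg_325 = channels_to_average(1.0)   # 1 MHz at 325 MHz
    print(navg_325, averaging_window(11, 242, navg_325))
```

For the 610 MHz case this gives 32 channels per block and a centred window of (14, 237, 7); for the 325 MHz case it gives 8 channels per block and (11, 242, 29), i.e. the full range between the limiting channels.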

2. Imaging and Self-Calibration

All imaging is to be done using facets (3-D imaging). The facets can be obtained with SETFC, which creates a BOXFILE. To run SETFC, estimate the diameter to be imaged; the rule of thumb is 1.6 times the HPBW. SETFC takes the RADIUS in DEGREES; don't forget to convert the diameter to a radius! In IMAGR, use do3d 1, overlap 2, minpatch 200 (instead of 51; it makes a small difference for sidelobes) and robust 0 or -1 (I use -1).
The first two or three rounds of IMAGR are to be run with full manual control. Put clean boxes only around genuine sources. This is important for building a reliable model for self-cal. Note down the 'Total Cleaned flux density' in the message window. If you have close to half a Jy at 610 MHz and close to 1 Jy at 325 MHz, you have enough signal to do phase self-cal. Don't clean too deep in the first round. Run UVPLT and check at what UVRANGE the flux matches the Total Cleaned flux density. Give this range to self-cal.
Run CALIB (SOLMO 'P' and SOLTY 'L1R') with the SPLIT file as input and all maps as the model. ncomp is 0 (all components, including negatives, are used in the model). Don't forget to give the number of maps to be used. Use a solint of 3 minutes in the first iteration. Set outcl to 'phscal'. Run SNPLT with pixra -180 to 180 for phase (the SN table will be in the SPLIT file). Look for regions with random scatter (zoom into the time range and confirm; don't get carried away by an apparently fast change). If random phases are there, flag that antenna and time range in the '.phscal' file.
Run IMAGR with .phscal as input and do controlled cleaning. If there is a notable increase in image quality and the Total Cleaned Flux is much higher, use a solint of 0.5 next time. If the improvement is slow, use solint 1. Check SNPLT and look for bad phases here also (the SN table will be in phscal.1). It is better to fix pixra to -180 to 180 (or -90 to 90) in SNPLT so that we don't get carried away by a relatively small y range. Repeat IMAGR, placing the clean boxes manually. Now you can go slightly deeper (you will automatically go!).
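The SETFC rule of thumb above (image a diameter of 1.6 times the HPBW, but give the radius in degrees) is easy to get wrong by a factor of two, so here is a minimal sketch of the conversion. The function name is made up for illustration, and the 43 arcmin HPBW in the usage line is only a placeholder input that does not come from this guide; substitute the HPBW for your own band.

```python
def setfc_radius_deg(hpbw_arcmin, factor=1.6):
    """Radius in degrees for a field of diameter factor * HPBW.

    hpbw_arcmin: half-power beam width of the primary beam, in arcminutes.
    factor: the rule-of-thumb multiplier from the guide (1.6).
    """
    diameter_arcmin = factor * hpbw_arcmin
    radius_arcmin = diameter_arcmin / 2.0   # diameter -> radius
    return radius_arcmin / 60.0             # arcmin -> degrees


if __name__ == "__main__":
    # Placeholder HPBW of 43 arcmin; replace with the value for your band.
    print(f"radius = {setfc_radius_deg(43.0):.3f} deg")
```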
Note down the Total Cleaned flux density and the corresponding UVRANGE to be used in the next round of self-cal. This time also check SNPLT, with pixrange -90 to 90, -60 to 60 or -30 to 30 as required. If SNPLT looks clean, you are ready to run IMAGR with the dotv=-1 option. Note down the rms in a low-noise region. Roughly 2.5 times this value is the flux cutoff for the next IMAGR round: set gain to 0.05, niter 1e9 and flux to 2.5 times the rms (in Jy). Run self-cal with a solint of 0.25 and check SNPLT. Check the rms of the map, and give 2.5 times the best rms as the flux cutoff in the next IMAGR run.
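Since image statistics are usually read off in mJy/beam while the IMAGR flux cutoff is given in Jy, the cutoff rule above can be sketched as follows. This is a hypothetical helper written for this guide, not an AIPS function; it assumes the rms you noted down is in mJy/beam.

```python
def imagr_flux_cutoff_jy(rms_mjy, factor=2.5):
    """Flux cutoff in Jy for the next IMAGR round.

    rms_mjy: rms measured in a low-noise region of the best image, in mJy/beam
             (assumed unit; convert first if your rms is already in Jy).
    factor: multiple of the rms to stop cleaning at (2.5 in the guide).
    """
    return factor * rms_mjy / 1000.0   # mJy -> Jy


if __name__ == "__main__":
    # Example input only: an rms of 0.2 mJy/beam.
    print(f"flux = {imagr_flux_cutoff_jy(0.2):.2e} Jy")
```

The same helper covers the later rounds that use 2 times the rms by passing factor=2.0.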

3. Further Flagging and Improvement

The image looks pretty good now? Wait!!! Run IMAGR to the same depth with robust 0. Subtract all the sources from the final phscal file using the task UVSUB. Give the number of maps so that the sources from all facets are removed. This creates the UVSUB file. Check the UVSUB file in TVFLG and UVPLT! SORT BY LENGTH in TVFLG to avoid confusion with extended emission that was not fully subtracted. A lot of bad data will be found. Flag it using your favourite method. Bad data such as bad baselines, bad time ranges, etc. will often be specific to some channels only and won't be present in all channels. All of this has to be flagged patiently; make sure that good data does not get flagged (NO overflagging). Very low-level bad data can be retained for now, as we will be doing a second round of UVSUB later. Keep track of the flag table version; many times a new flag table is created in TVFLG and CLIP.
After cleaning up the UVSUB file, the source visibilities should be added back. Run UVSUB with this cleaned .UVSUB file as input and factor -1; this will add the sources back. Give outclass as UVADD to avoid confusion. Run phase-only self-cal with this UVADD file and the latest image (the same one used for UVSUB). Check SNPLT. Make a deep image from this and run phase-only self-cal again with the highest time resolution (solint 0.25). Check SNPLT. Flag bad sections on some antennas, if any. Make a deep image with robust -1, with about 2 times the rms of the previous best image as the flux cutoff. Check the rms of this new deep image. This time, re-run IMAGR with robust 0 and flux 2 to 2.5 times the rms of the latest best image.
Do UVSUB again, with the latest phscal file and the latest robust 0 images as inputs. Most likely you will discover more bad data in the second round of UVSUB as well. Patiently clean it up, add the sources back and run the last round of phase self-cal. On this last version of phscal, run IMAGR with robust -1 and robust 0 to the same depth. We should do ONE (and only ONE) round of A&P cal.
Solint should be about 3 min (because the amplitude does not change rapidly). Use ncomp 0 and cparm(2) 1. This ensures that the mean gain of each antenna is the same before and after self-cal; only the variations are corrected in self-cal. If diffuse emission is present on a few short-spacing baselines, the A&P cal may not fully succeed. Otherwise A&P cal works well. After A&P cal, clean as deep as possible and as wide as possible. Use FLATN to combine the individual facets and PBCOR to do the primary beam correction. The FINAL image of publishable quality is ready.
If you have any comments (suggestions, errors, clarifications, etc.) regarding this guide, please contact me at ishwar@ncra.tifr.res.in