AverageSpectrum([ws, parfile]) | The function will calculate an average spectrum from a list of files and a series of antennas (all averaged into one spectrum). |
WorkSpace(**args) | |
averagespectrum_getfile(current_file_number, ...) | To produce an error message in case the file does not exist |
make_frequencies(spectrum[, offset, ...]) | Calculates the frequencies for the calculated spectrum (outdated) |
Module author: Heino Falcke <h.falcke@astro.ru.nl>
The function will calculate an average spectrum from a list of files and a series of antennas (all averaged into one spectrum).
The task will do a basic quality check of each time series data set and only average good data.
For very large FFTs, and to save memory, one can use a doublefft which means that one big FFT is split into two steps of smaller FFts. This allows almost arbitrary resolution.
The function will go through all files in the list and then loop, one-by-one, over all antennas in the files. Data will be read in in nchunks chunks (i.e. blocks) of size fullsizes. For a single FFT these chunks will be subdivided into nblock blocks to do quality checks (i.e., to see if certain blocks deviate from others). For a double FFT with stride>1 only a fraction of these blocks are read in at once.
The desired frequency resolution is provided with the parameter delta_nu, but this will be rounded off to the nearest value using channel numbers of a power of 2. For single FFTs you can also set the parameter chunksize.
Note: the spectrum has N/2 channels and not N/2+1 channels as is usually the case. This means that the last channel is missing!!
The resulting spectrum is stored in the array Task.power (only complete for stride=1) and written to disk as an cr.hArray with parameters stored in the header dict (use getHeader('AverageSpectrum') to retrieve this.)
Note for single FFTs one can decide whether to average all antennas into one spectrum (e.g., an incoherent ‘station spectrum’), or whether to keep the antennas separate and get an average spectrum per antenna (addantennas=False).
This spectrum can be read back and a baseline can be fitted with FitBaseline.
Incoherent Beam
With the option calc_incoherent_sum one can add the square of all time series data of all antennas into one long time series data (containing the power) for single FFTs. The resulting array Task.incoherent_sum has the length of the entire data set (of one antenna, of course) and has the dimensions [nchunks,chunksize=blocksize]. It will be saved on disk in sp.par.incoherent_sum.
Dynamic Spectrum
With the option calc_dynspec turned on, the task will calculate a dynamic spectrum for each file within the length of the data set. It will add all antennas of a station and then divide the data sets into sections of length t_int within which all chunks of data are intergated. The section is the one time-step in the dynamic spectrum. The result is provided in Task.dynspec and Task.cleanspec (where the dynamic spectrum is divided by an average spectrum, stored in Task.avspec). Avspec and cleanspec are also par-array of dynspec (i.e., Task.dynspsc.par.cleanspec). The dynamic spectrum will also be stored in the output directory (output_subdir=filename.dir) as a separate file.
If multiple files are provided, also a dynamic spectrum is made out of the average spectra of the files being worked on and an average spectrum of all files. (Note, that the files have to be sorted by timestamp in the Task.filenames, which is produced by Task.filefilter.
Quality Checking
To avoid the spectrum being influenced by spikes in the time series, those spikes can be replaced by random numbers, before the FFT is taken (see randomize_peaks).
The quality information (RMS, MEAN, flagged blocks per antennas) is stored in a data ‘quality database’ in text and python form and is also available as Task.quality. (See also: Task.antennacharacteristics, Task.mean, Task.mean_rms, Task.rms, Task.rms_rms, Task.npeaks, Task.npeaks_rms, Task.homogeneity_factor - the latter is printend and is a useful hint if something is wrong)
One can specify the maximum number of blocks in a chunk that are allowed to fail a quality criterium (maxblocksflagged), before the entrie chunk is discarded. The way quality checking (see module :func:qualitycheck) works, is that each chunk is divided into blocks. If a few of these blocks have a mean or rms that deviates too much from the others, they will be flagged. Also, a noise level will be determined where one expects roughly one peak per block above that noise for Gaussian noise. If the actual number of peaks is spikeexcess above that number, flag the block (and perhaps the entire chunk)
Flagged blocks can be easily inspeced using the task method Task.qplot (qplot for quality plot).
If you see an outputline like this:
# Start antenna = 92 (ID= 17011092) - 4 passes: 184 - Mean= 3.98, RMS= 6.35, Npeaks= 211, Nexpected=256.00 (Npeaks/Nexpected= 0.82), nsigma= 2.80, limits=( -2.80, 2.80) 185 - Mean= 3.97, RMS= 6.39, Npeaks= 200, Nexpected=256.00 (Npeaks/Nexpected= 0.78), nsigma= 2.80, limits=( -2.80, 2.80) 186 - Mean= 3.98, RMS= 6.40, Npeaks= 219, Nexpected=256.00 (Npeaks/Nexpected= 0.86), nsigma= 2.80, limits=( -2.80, 2.80) - Block 514: mean= 0.25, rel. rms= 2.6, npeaks= 16, spikyness= 15.00, spikeexcess= 16.00 ['rms', 'spikeexcess']this will tell you that Antenna 17011092 was worked on (the 92nd antenna in the data file) and the 186th chunk (block 514) contained some suspicious data (too many spikes). If you want to inspect this, you can call:
>>> Task.qplot(186)This will plot the chunk and highlight the first flagged block (#514) in that chunk:
>>> Task.qplot(186,1)would highlight the second flagged block (which does not exist here).
If the chunks are too long to be entirely plotted, use:
>>> Task.qplot(186,all=False).Sections, Chunks, Blocks
The terminology is as follows: the smallest unit is a block. If stride>1 only every nth block will be read in to do a double fft.
stride*blocks will be one chunk of data. The size of a chunk determines the resolution of an FFT.
Input parameters
Output parameters
Plot the dynamic spectrum. Provide the dynamic spectrum computed by the Task AverageSpectrum as input.
plot_cleanspec | None | If False, don’t plot the cleaned spectrum (provided as dynspec.par.cleanspec). |
dmin | Minimum z-value (intensity) in dynamic spectrum to plot. | |
dmax | Maximum z-value (intensity) in dynamic spectrum to plot. | |
cmin | Minimum z-value (intensity) in clean dynamic spectrum to plot. | |
cmax | Maximum z-value (intensity) in clean dynamic spectrum to plot. | |
y/xlabel | Label of y and x-axis. | |
extent | (x0,x1,y0,y1) = define extent/range of x and y axis (just for labelling) | |
zoom_slice | slice(n0,n1) - Zoom into this slice of the data | |
zoom_x | (nu0,nu1) - zoom_slice slice corresponds to these frequencies |
Example:
>>> tload "AverageSpectrum"; go
>>> dsp=hArrayRead("Saturn.h5.dynspec.pcr")
>>> Task.dynplot(dsp,cmin=2.2*10**-5,cmax=0.002,dmin=6.8,dmax=10)
Initialize the task
If you see an output line like this:
# Start antenna = 0 (ID= 2000000) - 375 passes:
# Flagged: #0 chunk= 0 , Block 14: mean= 121.68, rel. rms= 970.1, npeaks= 0, spikyness= -1.00, spikeexcess= 0.00 ['rms', 'mean']
# Flagged: #0 chunk= 0 , Block 15: mean= 139.70, rel. rms=1049.7, npeaks= 0, spikyness= -1.00, spikeexcess= 0.00 ['rms', 'mean']
# Flagged: #2 chunk= 2 , Block 15: mean= 139.70, rel. rms=1049.7, npeaks= 0, spikyness= -1.00, spikeexcess= 0.00 ['rms', 'mean']
# Data flagged!
# End antenna = 0 Time = 14.825337 s nspectraadded = 374 nspectraflagged = 1
this will tell you that Antenna 2000000 was worked on (the 1st antenna in the data file) and the chunk 0 (blocks 14 and 15 within that chunk) and chunk 2 contained some suspicious data (with and rms and mean which fluctuated too much). If you want to inspect this, you can call:
>>> Task.qplot(0)
This will plot the chunk and highlight the first flagged block (#14) in that chunk:
>>> Task.qplot(0,1)
would highlight the second flagged block (#15)
If the chunks are too long to be entirely plotted, use:
>>> Task.qplot(186,all=False).
Run the program.
To produce an error message in case the file does not exist
Calculates the frequencies for the calculated spectrum (outdated)