pygama.evt.modules package

This subpackage provides custom processors that convert hit-structured data into event-structured data.

Custom processors must adhere to the following signature:

def my_evt_processor(
    datainfo,
    tcm,
    table_names,
    *,  # all following arguments are keyword-only
    arg1,
    arg2,
    ...
) -> LGDO:
    # ...

The first three arguments are supplied automatically by build_evt() when the function is called from the build_evt() configuration.

  • datainfo: a DataInfo object that specifies tier names, file names, the HDF5 groups in which data is found and the pattern used by hit table names to encode the channel identifier (e.g. ch{}).

  • tcm: TCMData object that holds the TCM data, to be used for event reconstruction.

  • table_names: a list of hit table names to read the data from.

The remaining arguments are specific to the processor and can be supplied in the function call from the build_evt() configuration.

The function must return an LGDO object suitable for insertion in the final table with event data.

For examples, have a look at the existing processors provided by this subpackage.
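As a minimal sketch of the signature described above, the following toy processor takes the three positional arguments followed by one keyword-only argument. The name count_tables, the scale argument and the stand-in inputs are hypothetical and only illustrate the calling convention. A real processor would receive DataInfo and TCMData objects from build_evt() and return an LGDO object rather than a bare NumPy array.

```python
import numpy as np


def count_tables(
    datainfo,
    tcm,
    table_names,
    *,  # everything after this marker must be passed by keyword
    scale,
):
    """Hypothetical processor: one value per event, the scaled table count."""
    # Assumption for illustration only: `tcm` is a stand-in that exposes the
    # number of events through len(). The real TCMData layout is not shown.
    n_events = len(tcm)
    # A real processor would wrap this result in an LGDO object.
    return np.full(n_events, scale * len(table_names))


# Stand-ins for the objects that build_evt() would normally supply.
fake_tcm = [0, 1, 2]
out = count_tables(None, fake_tcm, ["ch1", "ch2"], scale=0.5)
```

Passing scale positionally raises a TypeError, which is the point of the bare * in the signature: processor-specific arguments cannot be confused with the three supplied by build_evt().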

Submodules

pygama.evt.modules.geds module

Event processors for HPGe data.

pygama.evt.modules.geds.apply_recovery_cut(datainfo, tcm, table_names, channel_mapping, *, timestamps, flag, time_window)
Return type:

Array

pygama.evt.modules.geds.apply_xtalk_correction(datainfo, tcm, table_names, channel_mapping, *, return_mode, uncal_energy_expr, cal_energy_expr, multiplicity_expr, xtalk_threshold=None, xtalk_matrix_filename='', xtalk_rawid_obj='xtc/rawid_index', xtalk_matrix_obj='xtc/xtalk_matrix_negative', positive_xtalk_matrix_obj='xtc/xtalk_matrix_positive')

Applies the cross-talk correction to the energy observable. Currently, xtalk_matrix_filename must be a path to an lh5 file.

The correction is applied using matrix algebra for all triggers above the threshold.

Parameters:
  • datainfo (DataInfo) – positional arguments automatically supplied by build_evt().

  • tcm (TCMData) – positional arguments automatically supplied by build_evt().

  • table_names (Sequence[str]) – positional arguments automatically supplied by build_evt().

  • return_mode (str) – string which can be either energy to return corrected energy or tcm_index

  • uncal_energy_expr (str) – expression for the pulse parameter to be gathered for the uncalibrated energy (used for correction), can be a combination of different fields.

  • cal_energy_expr (str) – expression for the pulse parameter to be gathered for the calibrated energy, used for the xtalk threshold, can be a combination of different fields.

  • xtalk_threshold (float | None) – threshold used for xtalk correction, hits below this energy will not be used to correct the other hits.

  • xtalk_matrix_filename (str) – name of the file containing the xtalk matrices.

  • xtalk_matrix_obj (str) – name of the lh5 object containing the xtalk matrix

  • positive_xtalk_matrix_obj (str) – name of the lh5 object containing the positive polarity xtalk matrix

  • xtalk_rawid_obj – name of the lh5 object containing the rawids.

Return type:

VectorOfVectors

pygama.evt.modules.geds.apply_xtalk_correction_and_calibrate(datainfo, tcm, table_names, channel_mapping, *, return_mode, uncal_energy_expr, cal_energy_expr, cal_par_files, multiplicity_expr, xtalk_matrix_filename, xtalk_threshold=None, xtalk_rawid_obj='xtc/rawid_index', xtalk_matrix_obj='xtc/xtalk_matrix_negative', positive_xtalk_matrix_obj='xtc/xtalk_matrix_positive', uncal_var='dsp.cuspEmax', recal_var='hit.cuspEmax_ctc_cal')

Applies the cross-talk correction to the energy observable.

The correction is applied using matrix algebra for all triggers above the xtalk threshold.

Parameters:
  • datainfo (DataInfo) – positional arguments automatically supplied by build_evt().

  • tcm (TCMData) – positional arguments automatically supplied by build_evt().

  • table_names (Sequence[str]) – positional arguments automatically supplied by build_evt().

  • return_mode (str) – string which can be either energy to return corrected energy or tcm_index.

  • uncal_energy_expr (str) – expression for the pulse parameter to be gathered for the uncalibrated energy (used for correction), can be a combination of different fields.

  • cal_energy_expr (str) – expression for the pulse parameter to be gathered for the calibrated energy, used for the xtalk threshold, can be a combination of different fields.

  • cal_par_files (str | Sequence[str]) – path to the generated hit tier par-files defining the calibration curves. Used to recalibrate the data after xtalk correction.

  • multiplicity_expr (str) – expression defining the logic used to compute the event multiplicity.

  • xtalk_threshold (float | None) – threshold used for xtalk correction, hits below this energy will not be used to correct the other hits.

  • xtalk_matrix_filename (str) – path to the file containing the xtalk matrices.

  • xtalk_matrix_obj (str) – name of the lh5 object containing the xtalk matrix.

  • positive_xtalk_matrix_obj (str) – name of the lh5 object containing the positive polarity xtalk matrix.

  • xtalk_rawid_obj – name of the lh5 object containing the rawids.

  • recal_var (str) – name of the energy variable to use for re-calibration.

Return type:

VectorOfVectors

pygama.evt.modules.larveto module

Routines to evaluate the correlation between HPGe and SiPM signals.

pygama.evt.modules.larveto._ak_l200_test_stat_time_term(layouts, ts_bkg_prob, **kwargs)

Awkward transform to compute the per-pulse terms of the test statistics.

The two arguments are the pulse times t0 relative to the HPGe trigger and their amplitude amp. The function has to be invoked as ak.transform(_ak_l200_test_stat_time_term, t0, amp, ...).

pygama.evt.modules.larveto.l200_combined_test_stat(t0, amp, geds_t0, ts_bkg_prob, rc_density)

Combined L200 LAr veto classifier.

Here, "combined" means that channel-specific parameters are taken into account.

t0 and amp must be in the format of a 3-dimensional Awkward array, where the innermost dimension labels the SiPM pulse, the second one labels the SiPM channel and the outermost one labels the event.

Parameters:
  • t0 (Array) – arrival times of pulses in ns, split by channel.

  • amp (Array) – amplitude of pulses in p.e., split by channel.

  • geds_t0 (Array) – t0 (ns) of the HPGe signal.

  • ts_bkg_prob (float) – probability for a pulse coming from some uncorrelated physics (uniform distribution). Needed for the LAr scintillation time pdf.

  • rc_density (Sequence[float]) – density array of the random coincidence LAr energy distribution (total energy summed over all channels, in p.e.). Derived from forced trigger data.

Return type:

Array

pygama.evt.modules.larveto.l200_rc_amp_logpdf(n_pes, rc_density=None, logexp_cont_slope=-0.1)

The L200 experimental random coincidence (RC) amplitude pdf.

Parameters:
  • n_pes – number of photoelectrons.

  • rc_density – density array of the random coincidence LAr energy distribution (total energy summed over all channels, in p.e.). Derived from forced trigger data.

  • logexp_cont_slope – slope for exponential analytical continuation.

pygama.evt.modules.larveto.l200_tc_time_pdf(t0, *, domain_ns=(-1000, 5000), tau_singlet_ns=6, tau_triplet_ns=1100, sing2tot_ratio=0.3333333333333333, t0_res_ns=35, t0_bias_ns=-80, bkg_prob=0.42)

The L200 experimental LAr scintillation pdf.

The theoretical scintillation pdf convolved with a Normal distribution (experimental effects) and summed with a uniform distribution (uncorrelated pulses).

This routine does not work with ak.Array, since SciPy functions are not universal. See _ak_l200_test_stat_time_term() for an example Awkward transform that does the job.

Parameters:
  • t0 (ArrayLike) – arrival times of the SiPM pulses in ns.

  • tau_singlet_ns (float) – The lifetime of the LAr singlet state in ns.

  • tau_triplet_ns (float) – The lifetime of the LAr triplet state in ns.

  • sing2tot_ratio (float) – The singlet-to-total excitation probability ratio.

  • t0_res_ns (float) – sigma (ns) of the normal distribution.

  • t0_bias_ns (float) – mean (ns) of the normal distribution.

  • bkg_prob (float) – probability for a pulse coming from some uncorrelated physics (uniform distribution).

Return type:

ArrayLike
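The shape described above can be sketched with the standard library alone. An exponential decay convolved with a Normal distribution has a closed form (the exponentially modified Gaussian), so the scintillation term is a singlet/triplet mixture of two such curves. The function names below are hypothetical. The way the uniform term is weighted, (1 - bkg_prob) for scintillation and bkg_prob for the uniform part, and the choice to return zero outside domain_ns are assumptions of this sketch, not a statement about the library's implementation.

```python
import math


def emg_pdf(x, mu, sigma, tau):
    """Exponential decay (lifetime tau) convolved with Normal(mu, sigma)."""
    lam = 1.0 / tau
    expo = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    z = (mu + lam * sigma**2 - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(expo) * math.erfc(z)


def tc_time_pdf_sketch(
    t0,
    *,
    domain_ns=(-1000, 5000),
    tau_singlet_ns=6,
    tau_triplet_ns=1100,
    sing2tot_ratio=1 / 3,
    t0_res_ns=35,
    t0_bias_ns=-80,
    bkg_prob=0.42,
):
    """Scalar-only sketch of a smeared scintillation pdf plus a uniform term."""
    lo, hi = domain_ns
    if not lo <= t0 <= hi:
        return 0.0  # assumption: the pdf is only defined inside the domain
    # Singlet and triplet components, each smeared by the same Normal.
    scint = sing2tot_ratio * emg_pdf(t0, t0_bias_ns, t0_res_ns, tau_singlet_ns)
    scint += (1.0 - sing2tot_ratio) * emg_pdf(
        t0, t0_bias_ns, t0_res_ns, tau_triplet_ns
    )
    # Uncorrelated pulses are flat over the domain (assumed weighting).
    uniform = 1.0 / (hi - lo)
    return (1.0 - bkg_prob) * scint + bkg_prob * uniform
```

With the default parameters, a small part of the triplet tail extends beyond 5000 ns, so this sketch integrates to slightly less than one over the default domain.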

pygama.evt.modules.larveto.l200_test_stat(relative_t0, amp, ts_bkg_prob, rc_density)

Compute the test statistics.

Parameters:
  • relative_t0 – t0 (ns) of the SiPM pulses relative to the HPGe t0.

  • amp – amplitude in p.e. of the SiPM pulses.

pygama.evt.modules.larveto.pulse_amp_round(amp)

Get the most likely (integer) number of photo-electrons.

pygama.evt.modules.legend module

Provides LEGEND-internal functions.

pygama.evt.modules.legend.convert_rawid(datainfo, tcm, table_names, channel_mapping, *, rawid_obj)

Convert rawid to channel number.

pygama.evt.modules.legend.metadata(params)
Return type:

list

pygama.evt.modules.spms module

Event processors for SiPM data.

pygama.evt.modules.spms.gather_pulse_data(datainfo, tcm, table_names, channel_mapping, *, observable, pulse_mask=None, a_thr_pe=None, t_loc_ns=None, dt_range_ns=None, t_loc_default_ns=None, drop_empty=True, energy_observable='hit.energy_in_pe', t0_observable='hit.trigger_pos')

Gathers SiPM pulse data into a 3D VectorOfVectors.

The returned data structure specifies the event in the first axis, the SiPM channel in the second and the pulse index in the last.

Pulse data can be optionally masked with pulse_mask or a mask can be built on the fly from the a_thr_pe, t_loc_ns, dt_range_ns, t_loc_default_ns arguments (see make_pulse_data_mask()).

If pulse_mask, a_thr_pe, t_loc_ns, dt_range_ns, t_loc_default_ns are all None, no masking is applied and the full data set is returned.

Parameters:
  • datainfo (DataInfo) – positional arguments automatically supplied by build_evt().

  • tcm (TCMData) – positional arguments automatically supplied by build_evt().

  • table_names (Sequence[str]) – positional arguments automatically supplied by build_evt().

  • observable (str) – name of the pulse parameter to be gathered, optionally prefixed by tier name (e.g. hit.energy_in_pe). If no tier is specified, it defaults to hit.

  • pulse_mask (VectorOfVectors | None) – 3D mask object used to filter out pulse data. See make_pulse_data_mask().

  • a_thr_pe (float | None) – amplitude threshold (in photoelectrons) used to build a pulse mask with make_pulse_data_mask(), if pulse_mask is None. The output pulse data will be such that the pulse amplitude is above this value.

  • t_loc_ns (float | None) – location of the time window in which pulses must sit. If a 1D array is provided, it is interpreted as a list of locations for each event (can be employed to e.g. provide the actual HPGe pulse position)

  • dt_range_ns (Sequence[float] | None) – tuple with dimension of the time window in which pulses must sit relative to t_loc_ns. If, for example, t_loc_ns is 48000 ns and dt_range_ns is (-1000, 5000) ns, the resulting window will be (47000, 53000) ns.

  • t_loc_default_ns (float | None) – default value for t_loc_ns, in case the supplied value is numpy.nan.

  • drop_empty (bool) – if True, drop empty arrays at the last axis (the pulse axis), i.e. drop channels with no pulse data. The filtering is applied after the application of the mask.

Return type:

VectorOfVectors
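To illustrate the event / channel / pulse layout and the effect of drop_empty, here is a sketch on plain nested lists. The helper apply_mask is hypothetical and stands in for the masking step only. It does not read any files, and it uses Python lists instead of VectorOfVectors.

```python
def apply_mask(data, mask, drop_empty=True):
    """Filter data[event][channel][pulse] with a boolean mask of equal shape."""
    out = []
    for evt_data, evt_mask in zip(data, mask):
        # Keep only the pulses whose mask entry is True.
        channels = [
            [x for x, keep in zip(ch_data, ch_mask) if keep]
            for ch_data, ch_mask in zip(evt_data, evt_mask)
        ]
        if drop_empty:
            # Channels left without pulses are removed after masking.
            channels = [ch for ch in channels if ch]
        out.append(channels)
    return out


# Two events: the first has two channels, the second has one.
amp = [[[1.2, 0.3], [0.1]], [[2.5]]]
mask = [[[True, False], [False]], [[True]]]

kept = apply_mask(amp, mask)
kept_with_empty = apply_mask(amp, mask, drop_empty=False)
```

Here kept contains one channel per event, while kept_with_empty retains the second channel of the first event as an empty list.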

pygama.evt.modules.spms.gather_tcm_data(datainfo, tcm, table_names, channel_mapping, *, tcm_field='id', pulse_mask=None, a_thr_pe=None, t_loc_ns=None, dt_range_ns=None, t_loc_default_ns=None, drop_empty=True)

Gather TCM data into a 2D VectorOfVectors.

The returned data structure specifies the event on the first axis and the TCM data (id or idx) on the second. Can be used to filter out data from gather_pulse_data() based on SiPM channel provenance (id) or to load hit data from lower tiers (with idx).

If drop_empty is True, channel ids with no pulse data associated are removed.

See gather_pulse_data() for documentation about the other function arguments.

Return type:

VectorOfVectors

pygama.evt.modules.spms.geds_coincidence_classifier(datainfo, tcm, table_names, channel_mapping, *, spms_t0, spms_amp, geds_t0_ns, ts_bkg_prob, rc_density=None)

Calculate the HPGe / SiPMs coincidence classifier.

The value represents the likelihood of a physical correlation between HPGe and SiPM signals.

Parameters:
  • datainfo (DataInfo) – positional arguments automatically supplied by build_evt().

  • tcm (TCMData) – positional arguments automatically supplied by build_evt().

  • table_names (Sequence[str]) – positional arguments automatically supplied by build_evt().

  • spms_t0 – arrival times of pulses in ns, split by channel.

  • spms_amp – amplitude of pulses in p.e., split by channel.

  • geds_t0_ns (Array) – t0 (ns) of the HPGe signal.

  • ts_bkg_prob (float) – probability for a pulse coming from some uncorrelated physics (uniform distribution). Needed for the LAr scintillation time pdf.

  • rc_density (Sequence[float] | None) – density array of the random coincidence LAr energy distribution (total energy summed over all channels, in p.e.). Derived from forced trigger data.

Return type:

Array

pygama.evt.modules.spms.make_pulse_data_mask(datainfo, tcm, table_names, channel_mapping, *, a_thr_pe=None, t_loc_ns=None, dt_range_ns=None, t_loc_default_ns=None, t0_observable='hit.trigger_pos', energy_observable='hit.energy_in_pe')

Calculate a 3D VectorOfVectors pulse data mask.

Useful to filter any pulse data based on pulse amplitude and start time.

Parameters:
  • datainfo (DataInfo) – positional arguments automatically supplied by build_evt().

  • tcm (TCMData) – positional arguments automatically supplied by build_evt().

  • table_names (Sequence[str]) – positional arguments automatically supplied by build_evt().

  • a_thr_pe – amplitude threshold (in photoelectrons). Pulses pass the mask only if their amplitude is above this value.

  • t_loc_ns – location of the time window in which pulses must sit. If a 1D array is provided, it is interpreted as a list of locations for each event (can be employed to e.g. provide the actual HPGe pulse position)

  • dt_range_ns – tuple with dimension of the time window in which pulses must sit relative to t_loc_ns. If, for example, t_loc_ns is 48000 ns and dt_range_ns is (-1000, 5000) ns, the resulting window will be (47000, 53000) ns.

  • t_loc_default_ns – default value for t_loc_ns, in case the supplied value is numpy.nan.

  • t0_observable – parameter to use for channels t0 in the form tier.param

  • energy_observable – parameter to use for channels energy in the form tier.param

Return type:

VectorOfVectors
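The window arithmetic and the amplitude cut can be sketched with NumPy on a flat list of pulses from one event. The function pulse_mask_sketch is hypothetical. Treating the window edges as exclusive is an assumption of this sketch, since the text above does not say whether they are included.

```python
import numpy as np


def pulse_mask_sketch(
    t0_ns, amp_pe, *, a_thr_pe, t_loc_ns, dt_range_ns, t_loc_default_ns
):
    """Boolean mask: amplitude above threshold and time inside the window."""
    # Fall back to the default location where the supplied one is NaN.
    t_loc = np.where(np.isnan(t_loc_ns), t_loc_default_ns, t_loc_ns)
    lo = t_loc + dt_range_ns[0]
    hi = t_loc + dt_range_ns[1]
    # Assumption: window edges are exclusive.
    return (amp_pe > a_thr_pe) & (t0_ns > lo) & (t0_ns < hi)


t0 = np.array([46000.0, 47500.0, 52000.0, 54000.0])
amp = np.array([2.0, 0.2, 1.5, 3.0])

# t_loc_ns = 48000 and dt_range_ns = (-1000, 5000) give the window (47000, 53000).
mask = pulse_mask_sketch(
    t0,
    amp,
    a_thr_pe=0.5,
    t_loc_ns=48000.0,
    dt_range_ns=(-1000, 5000),
    t_loc_default_ns=48000.0,
)

# A NaN location falls back to t_loc_default_ns and yields the same window.
mask_nan = pulse_mask_sketch(
    t0,
    amp,
    a_thr_pe=0.5,
    t_loc_ns=np.nan,
    dt_range_ns=(-1000, 5000),
    t_loc_default_ns=48000.0,
)
```

Only the third pulse passes: the first and last fall outside the window, and the second is below the amplitude threshold.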

pygama.evt.modules.xtalk module

Module for cross talk correction of energies.

pygama.evt.modules.xtalk.build_tcm_index_array(tcm, datainfo, rawids)

Builds a TCM index array for use in the event tier.

Parameters:
  • datainfo (DataInfo) – DataInfo object.

  • tcm (TCMData) – time-coincidence map object.

  • rawids (ndarray) – list of channel rawids from the cross talk matrix.

Return type:

ndarray

pygama.evt.modules.xtalk.calibrate_energy(datainfo, tcm, energy_corr, xtalk_matrix_rawids, par_files, uncal_energy_var=None, recal_energy_var=None, channel_mapping=None)

Function to recalibrate the energy after xtalk correction.

Parameters:
  • datainfo (DataInfo) – DataInfo object.

  • tcm (TCMData) – TCMData object.

  • energy_corr (ndarray) – cross talk corrected (uncal) energies to be recalibrated.

  • par_files (str | list[str]) – path to the parameter files.

  • uncal_energy_var (str | None) – name of the uncalibrated energy variable.

  • recal_energy_var (str | None) – variable to be used for recalibration.

pygama.evt.modules.xtalk.filter_hits(datainfo, tcm, filter_expr, xtalk_corr_energy, rawids)

Function that removes hits in an event below threshold.

Parameters:
  • datainfo, tcm – DataInfo and TCMData objects.

  • filter_expr – string containing the logic used to define which events are above threshold. This string can also refer to the corrected energy as xtalk_corr_energy.

  • xtalk_corr_energy – 2D numpy array of corrected energies; each row corresponds to an event and each column to a rawid.

  • rawids – 1D array of the rawids corresponding to each column.

Return type:

ArrayLike

pygama.evt.modules.xtalk.gather_energy(observable, tcm, datainfo, rawids)

Prepares the array of energies for the cross talk correction.

Return type:

ArrayLike

pygama.evt.modules.xtalk.get_xtalk_correction(tcm, datainfo, uncal_energy_expr, cal_energy_expr, xtalk_threshold=None, xtalk_matrix_filename='', xtalk_rawid_obj='xtc/rawid_index', xtalk_matrix_obj='xtc/xtalk_matrix_negative', positive_xtalk_matrix_obj='xtc/xtalk_matrix_positive')

pygama.evt.modules.xtalk.xtalk_correct_energy_impl(uncal_energy, cal_energy, xtalk_matrix, xtalk_threshold=None)

Function to perform the actual xtalk correction of energy.

  1. The energies are converted to a sparse format where each row corresponds to a rawid.

  2. All energies below the threshold are set to 0.

  3. The correction is computed as:

\[E_{\text{cor},i} = -\sum_{j} M_{i,j}E_{j}\]

where $M_{i,j}$ is the cross talk matrix element, $i$ is the response channel and $j$ is the trigger channel.

Parameters:
  • uncal_energy (ArrayLike) – 2D numpy array of the uncalibrated energies in each event; each row corresponds to an event and each column to a rawid.

  • cal_energy (ArrayLike) – 2D numpy array of the calibrated energies in each event; each row corresponds to an event and each column to a rawid.

  • xtalk_matrix (ArrayLike) – 2D numpy array of the cross talk correction matrix; the indices should correspond to rawids (with the same mapping as the energies).

  • xtalk_threshold (float | None) – threshold below which a hit is not used in xtalk correction.
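The thresholding and the matrix product can be sketched in a few lines of NumPy. This reads the correction as a matrix-vector product, with the correction for response channel i equal to minus the sum over trigger channels j of M[i, j] * E[j], which is an interpretation of the description rather than the library's code. The function name xtalk_correction_sketch is hypothetical, the sparse-format step is not reproduced, and the sketch returns only the correction term, not the corrected energy.

```python
import numpy as np


def xtalk_correction_sketch(
    uncal_energy, cal_energy, xtalk_matrix, xtalk_threshold=None
):
    """Per-event correction term, with rows as events and columns as rawids."""
    uncal = np.asarray(uncal_energy, dtype=float)
    cal = np.asarray(cal_energy, dtype=float)
    matrix = np.asarray(xtalk_matrix, dtype=float)

    if xtalk_threshold is None:
        trigger = uncal
    else:
        # Hits whose calibrated energy is below threshold do not contribute.
        trigger = np.where(cal < xtalk_threshold, 0.0, uncal)

    # correction[e, i] = -sum_j matrix[i, j] * trigger[e, j]
    return -(trigger @ matrix.T)


# One event, two channels: only the first channel is above threshold.
matrix = [[0.0, 0.01], [0.02, 0.0]]
uncal = [[1000.0, 10.0]]
cal = [[2000.0, 20.0]]
corr = xtalk_correction_sketch(uncal, cal, matrix, xtalk_threshold=100.0)
```

In this example the second channel receives a correction of -20 (0.02 times 1000), while the first receives none because the second channel's hit is below threshold.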