pygama.pargen package#

Subpackage description

Submodules#

pygama.pargen.AoE_cal module#

This module provides functions for correcting the A/E energy dependence, determining the cut level, and calculating survival fractions.

pygama.pargen.AoE_cal.AoEcorrection(energy: np.array, aoe: np.array, eres: list, pdf_path: str = None, display: int = 0)#

Calculates the corrections needed for the energy dependence of A/E. It fits the Compton continuum in energy slices, then fits the resulting centroids and variances.

Return type:

tuple(np.array, np.array)

pygama.pargen.AoE_cal.PDF_AoE(x: np.array, lambda_s: float, lambda_b: float, mu: float, sigma: float, tau: float, lower_range: float = inf, upper_range: float = inf, components: bool = False)#

PDF for A/E, consisting of a Gaussian signal with a Gaussian-tail background.

Return type:

tuple(float, np.array)

pygama.pargen.AoE_cal.apply_dtcorr(aoe: array, dt: array, alpha: float) array#

Aligns dt regions

Return type:

array

pygama.pargen.AoE_cal.cal_aoe(files: list, lh5_path, cal_dict: dict, energy_param: str, cal_energy_param: str, eres_pars: list, dt_corr: bool = False, plot_savepath: str = None)#

Main function for running the A/E correction and cut determination.

Return type:

tuple(dict, dict)

pygama.pargen.AoE_cal.compton_sf(energy: np.array, aoe: np.array, cut: float, peak: float, eres: list[float, float], display: int = 1)#

Determines the survival fraction for the Compton continuum by basic counting.

Return type:

tuple(float, np.array, list)
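As a rough illustration of a survival fraction obtained by basic counting, the sketch below counts events in an energy window before and after an A/E cut. The helper name `counting_sf`, the explicit window arguments and the binomial uncertainty are assumptions made for this example; they are not taken from the pygama implementation.

```python
import numpy as np

def counting_sf(energy, aoe, cut, e_low, e_high):
    """Hypothetical helper: fraction of events in [e_low, e_high) with aoe > cut."""
    in_window = (energy >= e_low) & (energy < e_high)
    n_total = np.count_nonzero(in_window)
    if n_total == 0:
        return np.nan, np.nan
    n_pass = np.count_nonzero(in_window & (aoe > cut))
    sf = n_pass / n_total
    # Binomial uncertainty on the fraction (an assumption of this sketch).
    err = np.sqrt(sf * (1 - sf) / n_total)
    return sf, err

# Toy data: four events inside the window, one outside.
energy = np.array([2000.0, 2010.0, 2020.0, 2030.0, 2500.0])
aoe = np.array([-3.0, 0.5, 1.0, -0.2, 0.0])
sf, sf_err = counting_sf(energy, aoe, cut=-1.0, e_low=1990.0, e_high=2040.0)
```

Three of the four in-window events pass the cut in this toy data set, so `sf` evaluates to 0.75.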

pygama.pargen.AoE_cal.compton_sf_no_sweep(energy: array, aoe: array, peak: float, eres: list[float, float], aoe_low_cut_val: float, aoe_high_cut_val: Optional[float] = None, display: int = 1) float#

Calculates the survival fraction for the Compton continuum without sweeping through cut values.

Return type:

float

pygama.pargen.AoE_cal.drift_time_correction(aoe: np.array, energy: np.array, dt: np.array, display: int = 0, pdf_path: str = None)#

Calculates the correction needed to align the two drift time regions for ICPC detectors

Return type:

tuple(np.array, float)

pygama.pargen.AoE_cal.energy_guess(hist, bins, var, func_i, peak, eres_pars, fit_range)#

Simple guess for peak fitting

pygama.pargen.AoE_cal.get_aoe_cut_fit(energy: np.array, aoe: np.array, peak: float, ranges: tuple(int, int), dep_acc: float, eres_pars: list, display: int = 1) float#

Determines A/E cut by sweeping through values and for each one fitting the DEP to determine how many events survive. Then interpolates to get cut value at desired DEP survival fraction (typically 90%)

Return type:

float
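The sweep-and-interpolate idea can be sketched with plain NumPy. This example is a simplification under stated assumptions: survival fractions come from counting rather than from fitting the DEP, and the helper name `cut_at_acceptance` and its `n_points` argument are hypothetical.

```python
import numpy as np

def cut_at_acceptance(aoe, dep_acc=0.9, n_points=101):
    """Hypothetical helper: A/E cut value that keeps a fraction dep_acc of events.

    Survival fractions are obtained here by plain counting; the documented
    routine instead fits the DEP at each trial cut value.
    """
    cut_vals = np.linspace(aoe.min(), aoe.max(), n_points)
    # Fraction of events above each trial cut (falls as the cut rises).
    sfs = np.array([np.mean(aoe > c) for c in cut_vals])
    # np.interp needs increasing x values, so reverse the decreasing sfs.
    return np.interp(dep_acc, sfs[::-1], cut_vals[::-1])

rng = np.random.default_rng(0)
aoe = rng.normal(0.0, 1.0, 20000)  # toy A/E classifier values
cut = cut_at_acceptance(aoe)
```

For a unit Gaussian the returned cut should sit near the 10% quantile, so roughly 90% of the toy events lie above it.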

pygama.pargen.AoE_cal.get_classifier(aoe: array, energy: array, mu_pars: list[float, float], sigma_pars: list[float, float]) array#

Applies correction to A/E energy dependence

Return type:

array

pygama.pargen.AoE_cal.get_dt_guess(hist: array, bins: array, var: array) list#

Guess for fitting dt spectrum

Return type:

list

pygama.pargen.AoE_cal.get_peak_label(peak: float) str#
Return type:

str

pygama.pargen.AoE_cal.get_sf(energy: np.array, aoe: np.array, peak: float, fit_width: tuple(int, int), aoe_cut_val: float, eres_pars: list, display: int = 0)#

Calculates survival fraction for gamma lines using fitting method as in cut determination

Return type:

tuple(np.array, np.array, np.array, float, float)

pygama.pargen.AoE_cal.get_sf_no_sweep(energy: np.array, aoe: np.array, peak: float, fit_width: tuple(int, int), eres_pars: list, aoe_low_cut_val: float, aoe_high_cut_val: float = None, display: int = 1)#

Calculates survival fraction for gamma line without sweeping through values

Return type:

tuple(float, float)

pygama.pargen.AoE_cal.get_survival_fraction(energy, aoe, cut_val, peak, eres_pars, high_cut=None, guess_pars_cut=None, guess_pars_surv=None)#
pygama.pargen.AoE_cal.load_aoe(files: list, lh5_path: str, cal_dict: dict, energy_param: str, cal_energy_param: str)#

Loads in the A/E parameters needed and applies calibration constants to energy

Return type:

tuple(np.array, np.array, np.array, np.array)

pygama.pargen.AoE_cal.plot_compt_bands_overlayed(aoe: array, energy: array, eranges: list[tuple], aoe_range: Optional[list[float]] = None) None#

Function to plot various compton bands to check energy dependence and corrections

pygama.pargen.AoE_cal.plot_dt_dep(aoe: array, energy: array, dt: array, erange: list[tuple], title: str) None#

Function to produce 2d histograms of A/E against drift time to check dependencies

pygama.pargen.AoE_cal.pol1(x: array, a: float, b: float) array#

Basic Polynomial for fitting A/E centroid against energy

Return type:

array

pygama.pargen.AoE_cal.sigma_fit(x: array, a: float, b: float) array#

Function definition for fitting A/E sigma against energy

Return type:

array

pygama.pargen.AoE_cal.unbinned_aoe_fit(aoe: np.array, display: int = 0, verbose: bool = False)#

Fitting function for A/E. First fits just a Gaussian, then fits with the full PDF. If the fit fails, NaN values are returned.

Return type:

tuple(np.array, np.array)

pygama.pargen.AoE_cal.unbinned_energy_fit(energy: np.array, peak: float, eres_pars: list = None, simplex=False, guess=None, verbose: bool = False)#

Fitting function for energy peaks used to calculate survival fractions

Return type:

tuple(np.array, np.array)

pygama.pargen.cuts module#

This module provides routines for calculating and applying quality cuts

pygama.pargen.cuts.cut_dict_to_hit_dict(cut_dict)#
pygama.pargen.cuts.find_pulser_properties(df, energy='daqenergy')#
pygama.pargen.cuts.generate_cuts(data: dict[str, numpy.ndarray], parameters: list[str]) dict#

Finds double sided cut boundaries for a file for the parameters specified

Parameters:
  • data (lh5 table or dictionary of arrays) – data to calculate cuts on

  • parameters (dict) – dictionary with the parameter to be cut and the number of sigmas to cut at

Return type:

dict

pygama.pargen.cuts.get_cut_indexes(all_data: dict[str, numpy.ndarray], cut_dict: dict, energy_param: str = 'trapTmax') list[int]#

Returns a mask of the data, for a single file, that passes cuts based on a dictionary of cuts in the form of cut boundaries, as above.

Parameters:
  • all_data (dict or lh5 table) – dictionary of parameters + array such as load_nda or lh5 table of params

  • cut_dict (dict) – dictionary with cuts

Return type:

list[int]
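A minimal sketch of double-sided cuts and the resulting mask is given below. It assumes the boundaries are simply mean -/+ n_sigma standard deviations; how pygama actually estimates the centre and width is not stated here, and the helper names `sigma_cut_bounds` and `passing_mask` are hypothetical.

```python
import numpy as np

def sigma_cut_bounds(data, parameters):
    """Hypothetical helper: {name: (lower, upper)} at mean -/+ n_sigma * std."""
    bounds = {}
    for name, n_sigma in parameters.items():
        values = np.asarray(data[name], dtype=float)
        centre, width = np.mean(values), np.std(values)
        bounds[name] = (centre - n_sigma * width, centre + n_sigma * width)
    return bounds

def passing_mask(data, bounds):
    """Boolean mask of events lying inside every pair of cut boundaries."""
    n_events = len(next(iter(data.values())))
    mask = np.ones(n_events, dtype=bool)
    for name, (lower, upper) in bounds.items():
        mask &= (data[name] > lower) & (data[name] < upper)
    return mask

# Toy data: the last event has an outlying baseline mean.
data = {
    "bl_mean": np.array([10.0, 10.2, 9.8, 10.1, 9.9, 30.0]),
    "bl_std": np.array([1.0, 1.1, 0.9, 1.0, 1.0, 1.0]),
}
bounds = sigma_cut_bounds(data, {"bl_mean": 2, "bl_std": 4})
mask = passing_mask(data, bounds)
```

With these toy values only the last event falls outside the `bl_mean` boundaries, so it is the only one rejected.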

pygama.pargen.cuts.tag_pulsers(df, chan_info, window=0.01)#

pygama.pargen.data_cleaning module#

Mainly pulser tagging:

  • gaussian_cut (fits data to a gaussian, returns mean +/- cut_sigma values)

  • xtalball_cut (fits data to a crystalball, returns mean +/- cut_sigma values)

  • find_pulser_properties (find pulser by looking for which peak has a constant time between events)

  • tag_pulsers

pygama.pargen.data_cleaning.find_pulser_properties(df, energy='trap_max')#
pygama.pargen.data_cleaning.gaussian_cut(data, cut_sigma=3, plotAxis=None)#

fits data to a gaussian, returns mean +/- cut_sigma values for a cut

pygama.pargen.data_cleaning.tag_pulsers(df, chan_info, window=250)#
pygama.pargen.data_cleaning.xtalball_cut(data, cut_sigma=3, plotFigure=None)#

fits data to a crystalball, returns mean +/- cut_sigma values for a cut

pygama.pargen.dsp_optimize module#

class pygama.pargen.dsp_optimize.ParGrid#

Bases: object

Parameter Grid class. Each ParGrid entry corresponds to a dsp parameter to be varied. The ntuples must follow the pattern (name, parameter, value_strs) : (str, str, list of str), where name and parameter are the same as ‘db.name.parameter’ in the processing chain and value_strs is the array of strings to set the argument to.

add_dimension(name, parameter, value_strs)#
get_data(i_dim, i_par)#
get_n_dimensions()#
get_n_grid_points()#
get_n_points_of_dim(i)#
get_par_meshgrid(copy=False, sparse=False)#

Return a meshgrid of parameter values. Always uses matrix indexing (natural for a parameter grid) so that mg[i1][i2][…] corresponds to the index order in self.dims. Note that copy is False by default, as opposed to the numpy default of True.
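The indexing convention can be seen directly with `np.meshgrid`. The two value arrays below are made-up examples standing in for the values of two grid dimensions.

```python
import numpy as np

# Hypothetical value lists for two grid dimensions.
rise_times = np.array([1.0, 2.0, 3.0])
flat_tops = np.array([0.5, 1.5])

# indexing="ij" is NumPy's matrix indexing; copy=False returns views
# instead of freshly allocated arrays.
mg = np.meshgrid(rise_times, flat_tops, indexing="ij", copy=False)

# mg[d][i1][i2] is the value of dimension d at grid point (i1, i2).
point = (mg[0][2][1], mg[1][2][1])
```

Here `point` holds the third rise time and the second flat top, and each array in `mg` has shape `(3, 2)`.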

get_shape()#
get_zero_indices()#
iterate_indices(indices)#

Iterate the given indices [i1, i2, …] by one, for easier iteration. The convention here is arbitrary, but it is the order in which the arrays would be traversed in a series of nested for loops in the order appearing in dims (first dimension is the first for loop, etc.). Returns False when the grid runs out of indices; otherwise returns True.
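The nested-loop ordering described above amounts to an odometer-style increment in which the last index varies fastest. The standalone function below is a sketch of that convention; its name `advance_indices` and the explicit `shape` argument are assumptions, since the ParGrid method reads the shape from the grid itself.

```python
def advance_indices(indices, shape):
    """Advance indices in place by one step in nested-loop order.

    Returns False once every grid point has been visited.
    """
    for dim in reversed(range(len(indices))):
        indices[dim] += 1
        if indices[dim] < shape[dim]:
            return True
        indices[dim] = 0  # this dimension wrapped; carry to the previous one
    return False

shape = (2, 3)
indices = [0, 0]
visited = [tuple(indices)]
while advance_indices(indices, shape):
    visited.append(tuple(indices))
```

The loop visits the six grid points in the same order as `for i in range(2): for j in range(3)`.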

print_data(indices)#
set_dsp_pars(db_dict, indices)#
class pygama.pargen.dsp_optimize.ParGridDimension(name, parameter, value_strs)#

Bases: tuple

Create new instance of ParGridDimension(name, parameter, value_strs)

_asdict()#

Return a new dict which maps field names to their values.

_field_defaults = {}#
_fields = ('name', 'parameter', 'value_strs')#
classmethod _make(iterable)#

Make a new ParGridDimension object from a sequence or iterable

_replace(**kwds)#

Return a new ParGridDimension object replacing specified fields with new values

name#

Alias for field number 0

parameter#

Alias for field number 1

value_strs#

Alias for field number 2

pygama.pargen.dsp_optimize.get_grid_points(grid)#

Generates a list of the indices of all possible grid points

pygama.pargen.dsp_optimize.run_grid(tb_data, dsp_config, grid, fom_function, db_dict=None, verbosity=1, **fom_kwargs)#

Extract a table of optimization values for a grid of DSP parameters. The grid argument defines a list of parameters and values over which to run the DSP defined in dsp_config on tb_data. At each point, a scalar figure-of-merit is extracted.

Returns a N-dimensional ndarray of figure-of-merit values, where the array axes are in the order they appear in grid.

Parameters:
  • tb_data (lh5 Table) – An input table of lh5 data. Typically a selection is made prior to sending tb_data to this function: optimization typically doesn’t have to run over all data

  • dsp_config (dict) – Specifies the DSP to be performed (see build_processing_chain()) and the list of output variables to appear in the output table for each grid point

  • grid (ParGrid) – See ParGrid class for format

  • fom_function (function) – When given the output lh5 table of this DSP iteration, the fom_function must return a scalar figure-of-merit. Should accept verbosity as a second keyword argument

  • db_dict (dict (optional)) – DSP parameters database. See build_processing_chain for formatting info

  • verbosity (int (optional)) – verbosity for the processing chain and fom_function calls

  • **fom_kwargs – Any keyword arguments for fom_function

Returns:

grid_values (ndarray of floats) – An N-dimensional numpy ndarray whose Mth axis corresponds to the Mth row of the grid argument
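The shape of the returned array can be illustrated without any DSP at all. In the sketch below a toy function stands in for the processing chain plus fom_function; `scan_grid` and its arguments are hypothetical names, not part of pygama.

```python
import itertools
import numpy as np

def scan_grid(axes, fom_function):
    """Hypothetical helper: evaluate fom_function at every grid point.

    axes is a list of 1D value arrays, one per varied parameter.
    """
    shape = tuple(len(ax) for ax in axes)
    grid_values = np.empty(shape, dtype=float)
    for idx in itertools.product(*(range(n) for n in shape)):
        point = [ax[i] for ax, i in zip(axes, idx)]
        grid_values[idx] = fom_function(*point)
    return grid_values

def toy_fom(rise, flat):
    """Toy figure of merit with its minimum at rise=2.0, flat=1.5."""
    return (rise - 2.0) ** 2 + (flat - 1.5) ** 2

axes = [np.array([1.0, 2.0, 3.0]), np.array([0.5, 1.5])]
grid_values = scan_grid(axes, toy_fom)
best = np.unravel_index(np.argmin(grid_values), grid_values.shape)
```

The Mth axis of `grid_values` follows the Mth entry of `axes`, and `best` gives the indices of the lowest figure-of-merit.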

pygama.pargen.dsp_optimize.run_grid_multiprocess_parallel(tb_data, dsp_config, grid, fom_function, db_dict=None, verbosity=1, processes=5, fom_kwargs=None)#

Run one iteration of DSP on tb_data with multiprocessing. Can handle multiple grids if they have the same dimensions.

Optionally returns a value for optimization

Parameters:
  • tb_data (lh5 Table) – An input table of lh5 data. Typically a selection is made prior to sending tb_data to this function: optimization typically doesn’t have to run over all data

  • dsp_config (dict) – Specifies the DSP to be performed for this iteration (see build_processing_chain()) and the list of output variables to appear in the output table

  • grid (pargrid, list of pargrids) – Grids to run optimization on

  • db_dict (dict (optional)) – DSP parameters database. See build_processing_chain for formatting info

  • fom_function (function or None (optional)) – When given the output lh5 table of this DSP iteration, the fom_function must return a scalar figure-of-merit value upon which the optimization will be based. Should accept verbosity as a second argument. If multiple grids are provided, either pass one fom to run it on every grid, or pass a list of foms to run a different fom on each grid.

  • verbosity (int (optional)) – verbosity for the processing chain and fom_function calls

  • processes (int) – DOCME

  • fom_kwargs – any keyword arguments to pass to the fom. If multiple grids are given, this must be a list of the fom_kwargs for each grid

Returns:

  • figure_of_merit (float) – If fom_function is not None, returns figure-of-merit value for the DSP iteration

  • tb_out (lh5 Table) – If fom_function is None, returns the output lh5 table for the DSP iteration

pygama.pargen.dsp_optimize.run_grid_point(tb_data, dsp_config, grids, fom_function, iii, db_dict=None, verbosity=1, fom_kwargs=None)#

Runs a single grid point for the index specified

pygama.pargen.dsp_optimize.run_one_dsp(tb_data, dsp_config, db_dict=None, fom_function=None, verbosity=0, fom_kwargs=None)#

run one iteration of DSP on tb_data

Optionally returns a value for optimization

Parameters:
  • tb_data (lh5 Table) – An input table of lh5 data. Typically a selection is made prior to sending tb_data to this function: optimization typically doesn’t have to run over all data

  • dsp_config (dict) – Specifies the DSP to be performed for this iteration (see build_processing_chain()) and the list of output variables to appear in the output table

  • db_dict (dict (optional)) – DSP parameters database. See build_processing_chain for formatting info

  • fom_function (function or None (optional)) – When given the output lh5 table of this DSP iteration, the fom_function must return a scalar figure-of-merit value upon which the optimization will be based. Should accept verbosity as a second argument

  • verbosity (int (optional)) – verbosity for the processing chain and fom_function calls

  • fom_kwargs – any keyword arguments to pass to the fom

Returns:

  • figure_of_merit (float) – If fom_function is not None, returns figure-of-merit value for the DSP iteration

  • tb_out (lh5 Table) – If fom_function is None, returns the output lh5 table for the DSP iteration

pygama.pargen.ecal_th module#

This module provides a routine for running the energy calibration on Th data

pygama.pargen.ecal_th.energy_cal_th(files: list[str], energy_params: list[str], hit_dict: dict = {}, save_path: str = None, plot_path: str = None, cut_parameters: dict[str, int] = {'bl_mean': 4, 'bl_std': 4, 'pz_std': 4}, lh5_path: str = 'dsp', guess_keV: float = None, threshold: int = 0, p_val: float = 0.05, n_events: int = 15000, deg: int = 1)#

This is an example script for calibrating Th data.

Return type:

tuple(dict, dict)

pygama.pargen.ecal_th.fwhm_slope(x: array, m0: float, m1: float, m2: Optional[float] = None) array#

Fit the energy resolution curve

Return type:

array

pygama.pargen.ecal_th.get_peak_label(peak: float) str#
Return type:

str

pygama.pargen.ecal_th.get_peak_labels(labels: list[str], pars: list[float])#
Return type:

tuple(list[float], list[float])

pygama.pargen.ecal_th.load_data(files: list[str], lh5_path: str, energy_params: list[str], hit_dict: dict = {}, cut_parameters: list[str] = ['bl_mean', 'bl_std', 'pz_std']) dict[str, numpy.ndarray]#
Return type:

dict[str, numpy.ndarray]

pygama.pargen.energy_cal module#

routines for automatic calibration.

  • hpge_find_E_peaks (Find uncalibrated E peaks whose E spacing matches the pattern in peaks_keV)

  • hpge_get_E_peaks (Get uncalibrated E peaks at the energies of peaks_keV)

  • hpge_fit_E_peaks (fits the energy peaks)

  • hpge_E_calibration (main routine – finds and fits peaks specified)

pygama.pargen.energy_cal.calibrate_tl208(energy_series, cal_peaks=None, plotFigure=None)#

energy_series: array of energies we want to calibrate

cal_peaks: array of peaks to fit

1.) we find the 2614 peak by looking for the tallest peak at >0.1 the max adc value

2.) fit that peak to get a rough guess at a calibration to find other peaks with

3.) fit each peak in peak_energies

4.) do a linear fit to the peak centroids to find a calibration

pygama.pargen.energy_cal.get_calibration_energies(cal_type)#
pygama.pargen.energy_cal.get_hpge_E_bounds(func)#
pygama.pargen.energy_cal.get_hpge_E_fixed(func)#

Returns: Sequence list of fixed indexes for fitting and mask for parameters

pygama.pargen.energy_cal.get_hpge_E_peak_par_guess(hist, bins, var, func)#

Get parameter guesses for func fit to peak in hist

Parameters:
  • hist (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). Should be windowed around the peak.

  • bins (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). Should be windowed around the peak.

  • var (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). Should be windowed around the peak.

  • func (function) – The function to be fit to the peak in the (windowed) hist

pygama.pargen.energy_cal.get_i_local_extrema(data, delta)#

Get lists of indices of the local maxima and minima of data

The “local” extrema are those maxima / minima that have heights / depths of at least delta. Converted from MATLAB script at: http://billauer.co.il/peakdet.html

Parameters:
  • data (array-like) – the array of data within which extrema will be found

  • delta (scalar) – the absolute level by which data must vary (in one direction) about an extremum in order for it to be tagged

Returns:

imaxes, imins (2-tuple ( array, array )) – A 2-tuple containing arrays of variable length that hold the indices of the identified local maxima (first tuple element) and minima (second tuple element)
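The delta criterion follows the peakdet approach referenced above: a maximum is only recorded once the data has dropped by more than delta below it, and likewise for minima. The function below is an independent sketch of that scheme, not the pygama source; the name `local_extrema` is hypothetical.

```python
import numpy as np

def local_extrema(data, delta):
    """Indices of maxima/minima that stand out by more than delta."""
    imaxes, imins = [], []
    i_max, i_min = 0, 0      # running candidates for the next extremum
    look_for_max = True
    for i, value in enumerate(data):
        if value > data[i_max]:
            i_max = i
        if value < data[i_min]:
            i_min = i
        if look_for_max and value < data[i_max] - delta:
            # Dropped far enough below the candidate: accept the maximum.
            imaxes.append(i_max)
            i_min = i
            look_for_max = False
        elif not look_for_max and value > data[i_min] + delta:
            # Rose far enough above the candidate: accept the minimum.
            imins.append(i_min)
            i_max = i
            look_for_max = True
    return np.array(imaxes, dtype=int), np.array(imins, dtype=int)

data = np.array([0.0, 1.0, 5.0, 1.0, 0.5, 4.0, 0.0, 0.2])
imaxes, imins = local_extrema(data, delta=2.0)
```

In this toy array the two peaks at indices 2 and 5 and the valley at index 4 are tagged, while the small wiggle at the end is ignored because it varies by less than delta.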

pygama.pargen.energy_cal.get_i_local_maxima(data, delta)#
pygama.pargen.energy_cal.get_i_local_minima(data, delta)#
pygama.pargen.energy_cal.get_most_prominent_peaks(energySeries, xlo, xhi, xpb, max_num_peaks=inf, test=False)#

Find the most prominent peaks in a spectrum by looking for spikes in the derivative of the spectrum.

energySeries: array of measured energies

max_num_peaks: maximum number of most prominent peaks to find

Returns a histogram around the most prominent peak in a spectrum of a given percentage of width.

pygama.pargen.energy_cal.hpge_E_calibration(E_uncal, peaks_keV, guess_keV, deg=0, uncal_is_int=False, range_keV=None, funcs=<bound method gauss_on_step_gen.get_cdf of <pygama.math.functions.gauss_on_step.gauss_on_step_gen object>>, gof_funcs=None, method='unbinned', gof_func=None, n_events=15000, simplex=False, allowed_p_val=0.05, verbose=True)#

Calibrate HPGe data to a set of known peaks

Parameters:
  • E_uncal (array) – unbinned energy data to be calibrated

  • peaks_keV (array) – list of peak energies to be fit to. Each must be in the data

  • guess_keV (float) – a rough initial guess at the conversion factor from E_uncal to keV. Must be positive

  • deg (non-negative int) – degree of the polynomial for the E_cal function E_keV = poly(E_uncal). deg = 0 corresponds to a simple scaling E_keV = scale * E_uncal. Otherwise follows the convention in np.polyfit

  • uncal_is_int (bool) – if True, attempts will be made to avoid picket-fencing when binning E_uncal

  • range_keV (float, tuple, array of floats, or array of tuples of floats) – ranges around which the peak fitting is performed. If tuple(s) are supplied, they provide the left and right ranges

  • funcs – DOCME

  • gof_funcs (function or array of functions) – functions to use for calculating goodness of fit. If unspecified, the same function as the fit is used

  • method (str) – default is unbinned fit. Can be set to use the binned fit method instead

  • gof_func – DOCME

  • n_events (int) – number of events to use for unbinned fit

  • simplex (bool) – whether to do a round of simplex minimisation before gradient minimisation (see hpge_fit_E_peaks)

  • allowed_p_val – lower limit on p_val of fit

  • verbose (bool) – print debug statements

Returns:

  • pars, cov (array, 2D array) – array of calibration function parameters and their covariances. The form of the function is E_keV = poly(E_uncal). Assumes poly() is overwhelmingly dominated by the linear term. pars follows convention in np.polyfit unless deg=0, in which case it is the (lone) scale factor

  • results (dict with the following elements) –

    ‘detected_peaks_locs’, ‘detected_peaks_keV’ (array, array)

    array of rough uncalibrated/calibrated energies at which the fit peaks were found in the initial peak search

    ‘pt_pars’, ‘pt_cov’ (list of (array), list of (2D array))

    arrays of gaussian parameters / covariances fit to the peak tops in the first refinement

    ‘pt_cal_pars’, ‘pt_cal_cov’ (array, 2D array)

    array of calibration parameters E_uncal = poly(E_keV) for fit to means of gaussians fit to tops of each peak

    ‘pk_pars’, ‘pk_cov’, ‘pk_binws’, ‘pk_ranges’ (list of (array), list of (2D array), list, list of (array))

    the best fit parameters, covariances, bin width and energy range for the local fit to each peak

    ‘pk_cal_pars’, ‘pk_cal_cov’ (array, 2D array)

    array of calibration parameters E_uncal = poly(E_keV) for fit to means from full peak fits

    ‘fwhms’, ‘dfwhms’ (array, array)

    the numeric fwhms and their uncertainties for each peak.

pygama.pargen.energy_cal.hpge_find_E_peaks(hist, bins, var, peaks_keV, n_sigma=5, deg=0, Etol_keV=None, var_zero=1, verbose=False)#

Find uncalibrated E peaks whose E spacing matches the pattern in peaks_keV. Note: the specialization here to units “keV” in peaks and Etol is unnecessary. However, it is kept so that the default value for Etol_keV has an unambiguous interpretation.

Parameters:
  • hist (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). var cannot contain any zero entries.

  • bins (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). var cannot contain any zero entries.

  • var (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). var cannot contain any zero entries.

  • peaks_keV (array) – Energies of peaks to search for (in keV)

  • n_sigma (float) – Threshold for detecting a peak in sigma (i.e. sqrt(var))

  • deg (int) – deg arg to pass to poly_match

  • Etol_keV (float) – absolute tolerance in energy for matching peaks

  • var_zero (float) – number used to replace zeros of var to avoid divide-by-zero in hist/sqrt(var). Default value is 1. Usually when var = 0 it is because hist = 0, and any value here is fine.

  • verbose (bool) – print debug messages

Returns:

  • detected_peak_locations (list) – list of uncalibrated energies of detected peaks

  • detected_peak_energies (list) – list of calibrated energies of detected peaks

  • pars (list of floats) – the parameters for poly(peaks_uncal) = peaks_keV (polyfit convention)

pygama.pargen.energy_cal.hpge_fit_E_cal_func(mus, mu_vars, Es_keV, E_scale_pars, deg=0)#

Find best fit of E = poly(mus +/- sqrt(mu_vars)). This is an inversion of hpge_fit_E_scale. E uncertainties are computed from mu_vars / dmu/dE, where mu = poly(E) is the E_scale function.

Parameters:
  • mus (array) – uncalibrated energies

  • mu_vars (array) – variances in the mus

  • Es_keV (array) – energies to fit to, in keV

  • E_scale_pars (array) –

    ???

  • deg (int) – degree for energy scale fit. deg=0 corresponds to a simple scaling mu = scale * E. Otherwise deg follows the definition in np.polyfit

Returns:

  • pars (array) – parameters of the best fit. Follows the convention in np.polyfit

  • cov (2D array) – covariance matrix for the best fit parameters.

pygama.pargen.energy_cal.hpge_fit_E_peak_tops(hist, bins, var, peak_locs, n_to_fit=7, cost_func='Least Squares', inflate_errors=False, gof_method='var')#

Fit gaussians to the tops of peaks

Parameters:
  • hist (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist()

  • bins (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist()

  • var (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist()

  • peak_locs (array) – locations of peaks in hist. Must be accurate to within +/- 2*n_to_fit

  • n_to_fit (int) – number of hist bins near the peak top to include in the gaussian fit

  • cost_func (str (optional)) – Flag passed to gauss_mode_width_max()

  • inflate_errors (bool (optional)) – Flag passed to gauss_mode_width_max()

  • gof_method (str (optional)) – method flag passed to gauss_mode_width_max()

Returns:

  • pars_list (list of array) – a list of best-fit parameters (mode, sigma, max) for each peak-top fit

  • cov_list (list of 2D arrays) – a list of covariance matrices for each pars

pygama.pargen.energy_cal.hpge_fit_E_peaks(E_uncal, mode_guesses, wwidths, n_bins=50, funcs=<bound method gauss_on_step_gen.get_cdf of <pygama.math.functions.gauss_on_step.gauss_on_step_gen object>>, method='unbinned', gof_funcs=None, n_events=15000, allowed_p_val=0.05, uncal_is_int=False, simplex=False)#

Fit the Energy peaks specified using the function given

Parameters:
  • E_uncal (array) – unbinned energy data to be fit

  • mode_guesses (array) – array of guesses for modes of each peak

  • wwidths (float or array of float) – array of widths to use for the fit windows (in units of E_uncal), typically on the order of 10 sigma where sigma is the peak width

  • n_bins (int or array of ints) – array of number of bins to use for the fit window histogramming

  • funcs (function or array of functions) – funcs to be used to fit each region

  • method (str) – default is unbinned fit. Can be set to use the binned fit method instead

  • gof_funcs (function or array of functions) – functions to use for calculating goodness of fit. If unspecified, the same function as the fit is used

  • uncal_is_int (bool) – if True, attempts will be made to avoid picket-fencing when binning E_uncal

  • simplex (bool) – determines whether to do a round of simplex minimisation before gradient minimisation

  • n_events (int) – number of events to use for unbinned fit

  • allowed_p_val – lower limit on p_val of fit

Returns:

  • pars (list of array) – a list of best-fit parameters for each peak fit

  • covs (list of 2D arrays) – a list of covariance matrices for each pars

  • binwidths (list) – a list of bin widths used for each peak fit

  • ranges (list of array) – a list of [Euc_min, Euc_max] used for each peak fit

pygama.pargen.energy_cal.hpge_fit_E_scale(mus, mu_vars, Es_keV, deg=0)#

Find best fit of poly(E) = mus +/- sqrt(mu_vars). Compare to hpge_fit_E_cal_func, which fits for E = poly(mu).

Parameters:
  • mus (array) – uncalibrated energies

  • mu_vars (array) – variances in the mus

  • Es_keV (array) – energies to fit to, in keV

  • deg (int) – degree for energy scale fit. deg=0 corresponds to a simple scaling mu = scale * E. Otherwise deg follows the definition in np.polyfit

Returns:

  • pars (array) – parameters of the best fit. Follows the convention in np.polyfit

  • cov (2D array) – covariance matrix for the best fit parameters.
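A weighted fit of poly(E) = mus can be sketched with NumPy alone. The helper `fit_scale` below is hypothetical and only illustrates the two cases described for deg: a closed-form scale factor for deg=0 and `np.polyfit` otherwise. The peak energies are illustrative values, and the inverse-variance weighting is an assumption of this sketch.

```python
import numpy as np

def fit_scale(mus, mu_vars, es_kev, deg=0):
    """Hypothetical helper: inverse-variance weighted fit of poly(E) = mus."""
    mus, mu_vars, es_kev = (np.asarray(a, dtype=float) for a in (mus, mu_vars, es_kev))
    weights = 1.0 / mu_vars
    if deg == 0:
        # Pure scaling mu = scale * E, solved in closed form.
        norm = np.sum(weights * es_kev**2)
        scale = np.sum(weights * es_kev * mus) / norm
        return np.array([scale]), np.array([[1.0 / norm]])
    # np.polyfit expects w = 1/sigma rather than 1/sigma**2.
    return np.polyfit(es_kev, mus, deg, w=np.sqrt(weights), cov="unscaled")

es_kev = np.array([583.2, 1592.5, 2614.5])  # illustrative peak energies in keV
mus = 2.0 * es_kev                          # toy uncalibrated peak positions
pars0, cov0 = fit_scale(mus, np.ones(3), es_kev)
pars1, cov1 = fit_scale(mus, np.ones(3), es_kev, deg=1)
```

Both fits recover a slope of 2 for this toy data; the deg=1 result follows the `np.polyfit` ordering with the highest power first.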

pygama.pargen.energy_cal.hpge_get_E_peaks(hist, bins, var, cal_pars, peaks_keV, n_sigma=3, Etol_keV=5, var_zero=1, verbose=False)#

Get uncalibrated E peaks at the energies of peaks_keV

Parameters:
  • hist (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). var cannot contain any zero entries.

  • bins (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). var cannot contain any zero entries.

  • var (array, array, array) – Histogram of uncalibrated energies, see pgh.get_hist(). var cannot contain any zero entries.

  • cal_pars (array) – Estimated energy calibration parameters used to search for peaks

  • peaks_keV (array) – Energies of peaks to search for (in keV)

  • n_sigma (float) – Threshold for detecting a peak in sigma (i.e. sqrt(var))

  • Etol_keV (float) – absolute tolerance in energy for matching peaks

  • var_zero (float) – number used to replace zeros of var to avoid divide-by-zero in hist/sqrt(var). Default value is 1. Usually when var = 0 it is because hist = 0, and any value here is fine.

  • verbose (bool) – print debug messages

Returns:

  • got_peak_locations (list) – list of uncalibrated energies of found peaks

  • got_peak_energies (list) – list of calibrated energies of found peaks

  • pars (list of floats) – the parameters for poly(peaks_uncal) = peaks_keV (polyfit convention)

pygama.pargen.energy_cal.match_peaks(data_pks, cal_pks)#

Match uncalibrated peaks with literature energy values.

pygama.pargen.energy_cal.poly_match(xx, yy, deg=-1, rtol=1e-05, atol=1e-08)#

Find the polynomial function best matching pol(xx) = yy

Finds the poly fit of xx to yy that obtains the most matches between pol(xx) and yy in the np.isclose() sense. If multiple fits give the same number of matches, the fit with the best gof is used, where gof is computed only among the matches. Assumes that the relationship between xx and yy is monotonic

Parameters:
  • xx (array-like) – domain data array. Must be sorted from least to largest. Must satisfy len(xx) >= len(yy)

  • yy (array-like) – range data array: the values to which pol(xx) will be compared. Must be sorted from least to largest. Must satisfy len(yy) > max(2, deg+2)

  • deg (int) – degree of the polynomial to be used. If deg = 0, will fit for a simple scaling: scale * xx = yy. If deg = -1, fits to a simple shift in the data: xx + shift = yy. Otherwise, deg is equivalent to the deg argument of np.polyfit()

  • rtol (float) – the relative tolerance to be sent to np.isclose()

  • atol (float) – the absolute tolerance to be sent to np.isclose(). Has the same units as yy.

Returns:

  • pars (None or array of floats) – The parameters of the best fit of poly(xx) = yy. Follows the convention used for the return value “p” of polyfit. Returns None when the inputs are bad.

  • i_matches (list of int) – list of indices in xx for the matched values in the best match
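To make the matching idea concrete, the sketch below handles only the deg = -1 case (a simple shift) by brute force: every (x, y) pairing proposes a shift, and the shift with the most `np.isclose` matches wins. The helper `best_shift_match` is hypothetical and, unlike the documented behaviour, does not break ties by goodness of fit.

```python
import numpy as np

def best_shift_match(xx, yy, rtol=1e-5, atol=1e-8):
    """Hypothetical helper for the deg = -1 case: xx + shift = yy."""
    xx = np.asarray(xx, dtype=float)
    yy = np.asarray(yy, dtype=float)
    best_shift, best_matches = None, []
    # Every (x, y) pairing proposes a candidate shift.
    for x in xx:
        for y in yy:
            shift = y - x
            # Compare each shifted xx entry against every yy entry.
            close = np.isclose((xx + shift)[:, None], yy[None, :], rtol=rtol, atol=atol)
            matches = np.flatnonzero(close.any(axis=1)).tolist()
            if len(matches) > len(best_matches):
                best_shift, best_matches = shift, matches
    return best_shift, best_matches

xx = [10.0, 13.0, 25.0, 40.0, 47.0]  # toy detected peak positions
yy = [15.0, 30.0, 45.0]              # toy reference values
shift, i_matches = best_shift_match(xx, yy)
```

A shift of 5 maps three of the five toy positions onto the reference values, and `i_matches` lists their indices in `xx`.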

pygama.pargen.energy_optimisation module#

This module contains the functions for performing the energy optimisation. This happens in two steps: first, a grid search is performed on each peak separately using the optimiser; then the resulting grids are interpolated to provide the best energy resolution at Qbb.

class pygama.pargen.energy_optimisation.BayesianOptimizer(acq_func, batch_size)#

Bases: object

_acquisition_function(x)#
_extend_prior_with_posterior_data(x, y)#
_get_expected_improvement(x_new)#
_get_lcb(x_new)#
_get_next_probable_point()#
_get_ucb(x_new)#
add_dimension(name, parameter, min_val, max_val, unit=None)#
add_initial_values(x_init, y_init)#
eta_param = 0#
get_best_vals()#
get_first_point()#
get_n_dimensions()#
iterate_values()#
kernel = None#
lambda_param = 0.01#
update(results)#
update_db_dict(db_dict)#
class pygama.pargen.energy_optimisation.OptimiserDimension(name, parameter, min_val, max_val, unit)#

Bases: tuple

Create new instance of OptimiserDimension(name, parameter, min_val, max_val, unit)

_asdict()#

Return a new dict which maps field names to their values.

_field_defaults = {}#
_fields = ('name', 'parameter', 'min_val', 'max_val', 'unit')#
classmethod _make(iterable)#

Make a new OptimiserDimension object from a sequence or iterable

_replace(**kwds)#

Return a new OptimiserDimension object replacing specified fields with new values

max_val#

Alias for field number 3

min_val#

Alias for field number 2

name#

Alias for field number 0

parameter#

Alias for field number 1

unit#

Alias for field number 4

pygama.pargen.energy_optimisation.event_selection(raw_files, lh5_path, dsp_config, db_dict, peaks_keV, peak_idxs, kev_widths, cut_parameters={'bl_mean': 4, 'bl_std': 4, 'pz_std': 4}, energy_parameter='trapTmax', n_events=10000, threshold=1000)#
pygama.pargen.energy_optimisation.find_lowest_grid_point_save(grid, err_grid, opt_dict)#

Finds the lowest grid point. If more than one point has the same value, returns the one with the shortest filter.

pygama.pargen.energy_optimisation.fom_FWHM(tb_in, kwarg_dict, ctc_parameter, alpha, idxs=None, display=0)#

FOM for sweeping over ctc values to find the best value, returns the best found fwhm

pygama.pargen.energy_optimisation.fom_FWHM_fit(tb_in, kwarg_dict)#

FOM with no ctc sweep, used for optimising ftp.

pygama.pargen.energy_optimisation.fom_FWHM_with_dt_corr_fit(tb_in, kwarg_dict, ctc_parameter, idxs=None, display=0)#

FOM for sweeping over ctc values to find the best value. Returns the best found fwhm with its error, the corresponding alpha value, the number of events in the fitted peak, and the reduced chi-square of the fit.

pygama.pargen.energy_optimisation.fom_all_fit(tb_in, kwarg_dict)#

FOM to run over different ctc parameters

pygama.pargen.energy_optimisation.fwhm_slope(x, m0, m1, m2)#

Fit the energy resolution curve
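The functional form of the resolution curve is not stated here. As a rough illustration only, the sketch below assumes a common parameterisation in which the squared FWHM is a quadratic in energy; the function name fwhm_model and the coefficient values are hypothetical and are not taken from pygama.

```python
import math


def fwhm_model(x, m0, m1, m2):
    """Assumed resolution model: FWHM(E) = sqrt(m0 + m1*E + m2*E**2)."""
    return math.sqrt(m0 + m1 * x + m2 * x**2)


# Hypothetical coefficients: a constant term plus a term linear in energy.
coeffs = (1.0, 1e-3, 0.0)

# Evaluate the assumed curve at a few energies in keV.
for energy in (583.2, 1592.5, 2614.5):
    print(energy, round(fwhm_model(energy, *coeffs), 3))
```

With three free coefficients, a model of this shape needs FWHM measurements at three or more peak energies to be constrained.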

pygama.pargen.energy_optimisation.get_best_vals(peak_grids, peak_energies, param, opt_dict, save_path=None)#

Finds best filter parameters

pygama.pargen.energy_optimisation.get_ctc_grid(grids, ctc_param)#

Reshapes the optimizer grids into a form that is easier to work with.

pygama.pargen.energy_optimisation.get_filter_params(grids, matched_configs, peak_energies, parameters, save_path=None)#

Finds best parameters for filter

pygama.pargen.energy_optimisation.get_peak_fwhm_with_dt_corr(Energies, alpha, dt, func, gof_func, peak, kev_width, guess=None, kev=False, display=0)#

Applies the drift time correction and fits the peak. Returns the FWHM, the FWHM/max and their associated errors, along with the number of signal events and the reduced chi-square of the fit. Results can be returned in ADC or keV.

pygama.pargen.energy_optimisation.get_wf_indexes(sorted_indexs, n_events)#
pygama.pargen.energy_optimisation.index_data(data, indexes)#
pygama.pargen.energy_optimisation.interpolate_energy(peak_energies, points, err_points, energy)#
pygama.pargen.energy_optimisation.interpolate_energy_old(peak_energies, grids, error_grids, energy, nevents_grids)#

Interpolates fwhm vs energy for every grid point

pygama.pargen.energy_optimisation.interpolate_grid(energies, grids, int_energy, deg, nevents_grids)#

Interpolates energy vs parameter for every grid point using polynomial.

pygama.pargen.energy_optimisation.new_fom(data, kwarg_dict)#
pygama.pargen.energy_optimisation.run_optimisation(tb_data, dsp_config, fom_function, optimisers, fom_kwargs=None, db_dict=None, nan_val=10, n_iter=10)#
pygama.pargen.energy_optimisation.run_optimisation_multiprocessed(file, opt_config, dsp_config, cuts, lh5_path, fom=None, db_dict=None, processes=5, n_events=8000, **fom_kwargs)#

Runs the optimisation on a .lh5 file. This version multiprocesses the grid points and can handle multiple grids being passed, as long as they have the same dimensions.

Parameters:
  • file (string) – path to raw .lh5 file

  • opt_config (str) – path to JSON dictionary to configure optimisation

  • dsp_config (str) – path to JSON dictionary specifying dsp configuration

  • fom (function) – When given the output lh5 table of a DSP iteration, this function must return a scalar figure-of-merit value on which the optimisation is based. It should accept verbosity as a second argument.

  • n_events (int) – Number of events to run over

  • db_dict (dict) – Dictionary specifying any values to put in processing chain e.g. pz constant

  • processes (int) – Number of separate processes to run for the multiprocessing
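The contract described for fom (take the DSP output table, return one scalar, accept verbosity as a second argument) can be sketched with a stand-in. The table is modelled here as a plain dict of lists instead of an lh5 table, and the column name trapEmax and the spread-based metric are hypothetical choices made only for illustration.

```python
import statistics


def example_fom(tb_out, verbosity=0):
    """Return a scalar figure of merit; smaller is better in this sketch."""
    energies = tb_out["trapEmax"]      # hypothetical column name
    fom = statistics.pstdev(energies)  # toy metric: spread of the values
    if verbosity > 0:
        print(f"n={len(energies)} fom={fom:.3f}")
    return fom


# Stand-in for the output table of one DSP iteration.
table = {"trapEmax": [100.0, 102.0, 98.0, 100.0]}
value = example_fom(table)
```

Any callable with this shape could be passed as fom, provided it reduces the table to a single number that the optimiser can compare between grid points.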

pygama.pargen.energy_optimisation.set_par_space(opt_config)#

Generates the optimizer grid from a dictionary of the form {param: {start: , end: , spacing: }}.
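To make the dictionary form concrete, the sketch below expands such a configuration into every combination of parameter values. It is not pygama's implementation: the helper name build_grid, the parameter names and the choice to include the end value are all assumptions.

```python
from itertools import product


def build_grid(opt_config):
    """Expand {param: {start, end, spacing}} into a list of parameter dicts."""
    axes = {}
    for name, spec in opt_config.items():
        n_steps = int(round((spec["end"] - spec["start"]) / spec["spacing"]))
        # Assumption: the end value is included in the grid.
        axes[name] = [spec["start"] + i * spec["spacing"] for i in range(n_steps + 1)]
    names = list(axes)
    # Cartesian product of all axes gives one dict per grid point.
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


# Hypothetical configuration with two parameters.
config = {
    "rise": {"start": 1, "end": 3, "spacing": 1},
    "flat": {"start": 0.5, "end": 1.5, "spacing": 0.5},
}
grid = build_grid(config)
print(len(grid), grid[0], grid[-1])
```

The number of grid points is the product of the axis lengths, so a small spacing on several parameters quickly multiplies the number of DSP runs.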

pygama.pargen.energy_optimisation.set_values(par_values)#

Finds values for grid

pygama.pargen.energy_optimisation.simple_guess(hist, bins, var, func_i, fit_range)#

Simple guess for peak fitting

pygama.pargen.energy_optimisation.single_peak_fom(data, kwarg_dict)#
pygama.pargen.energy_optimisation.unbinned_energy_fit(energy, func, gof_func, gof_range, fit_range=(inf, inf), guess=None, tol=None, verbose=False, display=0)#

Unbinned fit to energy. Unlike the default fitting, this tries several fitting methods and chooses the best one, which is necessary for lower statistics.

pygama.pargen.extract_tau module#

This module extracts a single pole-zero constant from the decay tail.

pygama.pargen.extract_tau.dsp_preprocess_decay_const(raw_files: list[str], dsp_config: dict, lh5_path: str, double_pz: bool = False, plot_path: Optional[str] = None, opt_dict: Optional[dict] = None, threshold: int = 5000, cut_parameters: dict = {'bl_mean': 4, 'bl_slope': 4, 'bl_std': 4}) dict#

This function calculates the pole zero constant for the input data

Parameters:
  • f_raw (str) – The raw file to run the macro on

  • dsp_config (str) – Path to the dsp config file, this is a stripped down version which just includes cuts and slope of decay tail

  • channel (str) – Name of channel to process, should be name of lh5 group in raw files

Returns:

tau_dict (dict)

Return type:

dict

pygama.pargen.extract_tau.fom_dpz(tb_data, verbosity=0, rand_arg=None)#
pygama.pargen.extract_tau.get_decay_constant(slopes: array, wfs: array, plot_path: Optional[str] = None) dict#

Finds the decay constant from the modal value of the tail slope after cuts and saves it to the specified json.

Parameters:
  • slopes (array) – tail slope array

  • dict_file (str) – path to the JSON file in which to save the decay constant value. It is saved as a dictionary of the form {'pz': {'tau': decay_constant}}

Returns:

tau_dict (dict)

Return type:

dict
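The idea of taking the modal tail slope and turning it into a decay constant can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper name modal_decay_constant and the bin width are hypothetical, and the slopes are assumed to be slopes of the logarithm of the tail, so that an exponential tail exp(-t/tau) has slope -1/tau in the same time units.

```python
from collections import Counter


def modal_decay_constant(slopes, bin_width=1e-5):
    """Estimate tau from the most populated histogram bin of the tail slopes."""
    # Assign each slope to a histogram bin and pick the most populated one.
    counts = Counter(round(s / bin_width) for s in slopes)
    modal_bin, _ = counts.most_common(1)[0]
    modal_slope = modal_bin * bin_width
    # Assumption: the log-slope of an exponential tail equals -1/tau.
    tau = -1.0 / modal_slope
    return {"pz": {"tau": tau}}


# Hypothetical slopes after cuts, with one outlier far from the mode.
slopes = [-0.00251, -0.00250, -0.00250, -0.00249, -0.00300]
tau_dict = modal_decay_constant(slopes)
print(tau_dict)
```

Using the mode rather than the mean keeps outliers, such as pile-up events that survive the cuts, from pulling the estimate.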

pygama.pargen.extract_tau.get_dpz_consts(grid_out, opt_dict)#
pygama.pargen.extract_tau.run_tau(raw_file: list[str], config: dict, lh5_path: str, n_events: int = 10000, threshold: int = 5000) Table#
Return type:

Table

pygama.pargen.mse_psd module#

  • get_avse_cut (does AvsE)

  • get_ae_cut (does A/E)

pygama.pargen.mse_psd.get_ae_cut(e_cal, current, plotFigure=None)#
pygama.pargen.mse_psd.get_avse_cut(e_cal, current, plotFigure=None)#