pygama.math package¶
Useful functions for performing quick computations of statistical distributions, as well as helper functions for histogramming data
Subpackages¶
- pygama.math.functions package
- Submodules
- pygama.math.functions.crystal_ball module
- pygama.math.functions.error_function module
- pygama.math.functions.exgauss module
- pygama.math.functions.exponential module
- pygama.math.functions.gauss module
- pygama.math.functions.gauss_on_exgauss module
- pygama.math.functions.gauss_on_exponential module
- pygama.math.functions.gauss_on_linear module
- pygama.math.functions.gauss_on_step module
- pygama.math.functions.gauss_on_uniform module
- pygama.math.functions.hpge_peak module
- pygama.math.functions.linear module
- pygama.math.functions.moyal module
- pygama.math.functions.poisson module
- pygama.math.functions.polynomial module
- pygama.math.functions.pygama_continuous module
- pygama.math.functions.step module
- pygama.math.functions.sum_dists module
SumDistsSumDists._argcheck()SumDists._cdf()SumDists._pdf()SumDists.cdf_ext()SumDists.cdf_norm()SumDists.draw_cdf()SumDists.draw_pdf()SumDists.get_cdf()SumDists.get_fwfm()SumDists.get_fwhm()SumDists.get_mode()SumDists.get_mu()SumDists.get_pdf()SumDists.get_total_events()SumDists.pdf_ext()SumDists.pdf_norm()SumDists.required_args()
copy_signature()get_areas_fracs()get_dists_and_par_idxs()get_parameter_names()
- pygama.math.functions.triple_gauss_on_double_step module
- pygama.math.functions.uniform module
Submodules¶
pygama.math.binned_fitting module¶
pygama convenience functions for fitting binned data
- pygama.math.binned_fitting.fit_binned(func, hist, bins, var=None, guess=None, cost_func='LL', extended=True, simplex=False, bounds=None, fixed=None)¶
Do a binned fit to a histogram.
Default is extended Log Likelihood fit, with option for either Least Squares or other cost function.
- Parameters:
func (Callable) – the function to fit, if using LL as method needs to be a cdf
hist (ndarray) – histogrammed data
bins (ndarray) – histogrammed data
var (ndarray) – histogrammed data
guess (ndarray) – initial guess parameters
cost_func (str) – cost function to use
extended (bool) – run extended or non extended fit
simplex (bool) – whether to include a round of simpson minimisation before main minimisation
bounds (tuple[tuple[float, float], ...] | None) – list of tuples with bounds can be None, e.g. [(0,None), (0,10)]
fixed (tuple[int, ...] | None) – list of parameter indices to fix
- Returns:
coeff – Returned fit parameters
error – Iminuit errors
cov_matrix – Covariance matrix
- Return type:
- pygama.math.binned_fitting.gauss_mode(hist, bins, **kwargs)¶
Alias for gauss_mode_max that just returns the mode (position) of a peak
See also
- pygama.math.binned_fitting.gauss_mode_max(hist, bins, **kwargs)¶
Alias for gauss_mode_width_max that just returns the max and mode
See also
- Returns:
pars (ndarray with the parameters (mode, maximum) of the gaussian fit) – mode : the estimated x-position of the maximum maximum : the estimated maximum value of the peak
cov (2x2 ndarray of floats) – The covariance matrix for the 2 parameters in pars
- Return type:
Examples
>>> import pygama.math.histograms as pgh >>> from numpy.random import normal >>> import pygama.math.peak_fitting as pgf >>> hist, bins, var = pgh.get_hist(normal(size=10000), bins=100, range=(-5,5)) >>> pgf.gauss_mode_max(hist, bins, var=var, n_bins=20)
- pygama.math.binned_fitting.gauss_mode_width_max(hist, bins, var=None, mode_guess=None, n_bins=5, cost_func='Least Squares', inflate_errors=False, gof_method='var', sigma_limit=None)¶
Get the max, mode, and width of a peak based on gauss fit near the max Returns the parameters of a gaussian fit over n_bins in the vicinity of the maximum of the hist (or the max near mode_guess, if provided). This is equivalent to a Taylor expansion around the peak maximum because near its maximum a Gaussian can be approximated by a 2nd-order polynomial in x:
\[A \exp[ -(x-\mu)^2 / 2 \sigma^2 ] \simeq A [ 1 - (x-\mu)^2 / 2 \sigma^2 ] = A - (1/2!) (A/\sigma^2) (x-\mu)^2\]The advantage of using a gaussian over a polynomial directly is that the gaussian parameters are the ones we care about most for a peak, whereas for a poly we would have to extract them after the fit, accounting for covariances. The gaussian also better approximates most peaks farther down the peak. However, the gauss fit is nonlinear and thus less stable.
- Parameters:
hist (ndarray) – The values of the histogram to be fit
bins (ndarray) – The bin edges of the histogram to be fit
var (ndarray | None) – The variances of the histogram values. If not provided, square-root variances are assumed.
mode_guess (float | None) – An x-value (not a bin index!) near which a peak is expected. The algorithm fits around the maximum within +/- n_bins of the guess. If not provided, the center of the max bin of the histogram is used.
n_bins (int | None) – The number of bins (including the max bin) to be used in the fit. Also used for searching for a max near mode_guess
cost_func (str) – Passed to fit_binned()
inflate_errors (bool | None) – If true, the parameter uncertainties are inflated by sqrt(chi2red) if it is greater than 1
gof_method (str) – method flag for goodness_of_fit
sigma_limit (float | None) – upper bound on sigma
- Returns:
pars (ndarray containing the parameters (mode, sigma, maximum) of the gaussian fit) –
mode : the estimated x-position of the maximum
sigma : the estimated width of the peak. Equivalent to a gaussian width (sigma), but based only on the curvature within n_bins of the peak. Note that the Taylor-approxiamted curvature of the underlying function in the vicinity of the max is given by max / sigma^2
maximum : the estimated maximum value of the peak
cov (3x3 ndarray of floats) – The covariance matrix for the 3 parameters in pars
- Return type:
- pygama.math.binned_fitting.goodness_of_fit(hist, bins, var, func, pars, method='var', scale_bins=False)¶
Compute chisq and dof of fit
- Parameters:
hist (ndarray) – histogram data. var can be None if hist is integer counts
bins (ndarray) – histogram data. var can be None if hist is integer counts
var (ndarray) – histogram data. var can be None if hist is integer counts
func (Callable) – the function that was fit to the hist
pars (ndarray) – the best-fit pars of func. Assumes all pars are free parameters
method (str) – Sets the choice of “denominator” in the chi2 sum ‘var’: user passes in the variances in var (must not have zeros) ‘Pearson’: use func (hist must contain integer counts) ‘Neyman’: use hist (hist must contain integer counts and no zeros)
- Returns:
chisq – the summed up value of chisquared
dof – the number of degrees of freedom
- Return type:
- pygama.math.binned_fitting.poisson_gof(pars, func, hist, bins, is_integral=False, **kwargs)¶
Calculate the goodness of fit for the Poisson likelihood.
- Parameters:
- Returns:
Poisson G.O.F.
- Return type:
Notes
The Poisson likelihood does not give a good GOF until the counts are very high and all the poisson stats are roughly gaussian and you don’t need it anyway. But the G.O.F. is calculable for the Poisson likelihood. So we do it here.
- pygama.math.binned_fitting.taylor_mode_max(hist, bins, var=None, mode_guess=None, n_bins=5)¶
Get the max and mode of a peak based on Taylor exp near the max Returns the amplitude and position of a peak based on a poly fit over n_bins in the vicinity of the maximum of the hist (or the max near mode_guess, if provided)
- Parameters:
hist (ndarray) – The values of the histogram to be fit. Often: send in a slice around a peak
bins (ndarray) – The bin edges of the histogram to be fit
var (ndarray | None) – The variances of the histogram values. If not provided, square-root variances are assumed.
mode_guess (float | None) – An x-value (not a bin index!) near which a peak is expected. The algorithm fits around the maximum within +/- n_bins of the guess. If not provided, the center of the max bin of the histogram is used.
n_bins (int) – The number of bins (including the max bin) to be used in the fit. Also used for searching for a max near mode_guess
- Returns:
pars (2-tuple with the parameters (mode, max) of the fit) – mode : the estimated x-position of the maximum maximum : the estimated maximum value of the peak
cov (2x2 matrix of floats) – The covariance matrix for the 2 parameters in pars
- Return type:
Examples
>>> import pygama.analysis.histograms as pgh >>> from numpy.random import normal >>> import pygama.analysis.peak_fitting as pgf >>> hist, bins, var = pgh.get_hist(normal(size=10000), bins=100, range=(-5,5)) >>> pgf.taylor_mode_max(hist, bins, var, n_bins=5)
pygama.math.distributions module¶
Contains a list of distribution functions, all implemented using Numba’s
numba.jit() to take advantage of the just-in-time speed boost.
pygama.math.histogram module¶
pygama convenience functions for 1D histograms.
1D hists in pygama require 3 things available from all implementations of 1D histograms of numerical data in python: hist, bins, and var: - hist: an array of histogram values - bins: an array of bin edges - var: an array of variances in each bin If var is not provided, pygama assuems that the hist contains “counts” with variance = counts (Poisson stats)
These are just convenience functions, provided for your convenience. Hopefully they will help you if you need to do something trickier than is provided (e.g. 2D hists).
- pygama.math.histogram.better_int_binning(x_lo=0, x_hi=None, dx=None, n_bins=None)¶
Get a good binning for integer data.
Guarantees an integer bin width.
At least two of x_hi, dx, or n_bins must be provided.
- Parameters:
- Returns:
x_lo – int values for best x_lo
x_hi – int values for best x_hi, returned if x_hi is not None
dx – best int bin width, returned if arg dx is not None
n_bins – best int n_bins, returned if arg n_bins is not None
- Return type:
- pygama.math.histogram.find_bin(x, bins)¶
Returns the index of the bin containing x Returns -1 for underflow, and len(bins) for overflow For uniform bins, jumps directly to the appropriate index. For non-uniform bins, binary search is used.
- pygama.math.histogram.get_bin_centers(bins)¶
Returns an array of bin centers from an input array of bin edges. Works for non-uniform binning. Note: a new array is allocated
- pygama.math.histogram.get_bin_estimates(pars, func, bins, is_integral=False, **kwargs)¶
Bin expected means are estimated by f(bin_center)*bin_width. Supply an integrating function to compute the integral over the bin instead. TODO: make default integrating function a numerical method that is off by default.
- Parameters:
- Returns:
f(bin_center)*bin_width – The expected mean of a bin
- Return type:
See also
get_bin_widthsReturns the width of the bins
get_bin_centersFinds the bin centers of the supplied bins
- pygama.math.histogram.get_bin_widths(bins)¶
Returns an array of bin widths from an input array of bin edges. Works for non-uniform binning.
- pygama.math.histogram.get_fwfm(fraction, hist, bins, var=None, mx=None, dmx=0, bl=0, dbl=0, method='bins_over_f', n_slope=3)¶
Estimate the full width at some fraction of the max of data in a histogram
Typically used by sending slices around a peak. Searches about argmax(hist) for the peak to fall by [fraction] from mx to bl
- Parameters:
fraction (float) – The fractional amplitude at which to evaluate the full width
hist (ndarray) – The histogram data array containing the peak
bins (ndarray) – An array of bin edges for the histogram
var (ndarray | None) – An array of histogram variances. Used with the ‘fit_slopes’ method
mx (float | tuple[float, float] | None) – The value to use for the max of the peak. If None, np.amax(hist) is used.
dmx (float | None) – The uncertainty in mx
bl (float | tuple[float, float] | None) – Used to specify an offset from which to estimate the FWFM.
dbl (float | None) – The uncertainty in the bl
method (str) –
- ‘bins_over_f’the simplest method: just take the difference in the bin
centers that are over [fraction] of max. Only works for high stats and FWFM/bin_width >> 1
- ’interpolate’interpolate between the bins that cross the [fraction]
line. Works well for high stats and a reasonable number of bins. Uncertainty incorporates var, if provided.
- ’fit_slopes’fit over n_slope bins in the vicinity of the FWFM and
interpolate to get the fractional crossing point. Works okay even when stats are moderate but requires enough bins that dx traversed by n_slope bins is approximately linear. Incorporates bin variances in fit and final uncertainties if provided.
n_slope (int) – Number of bins in the vicinity of the FWFM used to interpolate the fractional crossing point with the ‘fit_slopes’ method
- Returns:
fwfm, dfwfm – fwfm: the full width at [fraction] of the maximum above bl dfwfm: the uncertainty in fwfm
- Return type:
Examples
>>> import pygama.analysis.histograms as pgh >>> from numpy.random import normal >>> hist, bins, var = pgh.get_hist(normal(size=10000), bins=100, range=(-5,5)) >>> pgh.get_fwfm(0.5, hist, bins, var, method='bins_over_f') (2.2, 0.15919638684132664) # may vary
>>> pgh.get_fwfm(0.5, hist, bins, var, method='interpolate') (2.2041666666666666, 0.09790931254396479) # may vary
>>> pgh.get_fwfm(0.5, hist, bins, var, method='fit_slopes') (2.3083363869003466, 0.10939486522749278) # may vary
- pygama.math.histogram.get_fwhm(hist, bins, var=None, mx=None, dmx=0, bl=0, dbl=0, method='bins_over_f', n_slope=3)¶
Convenience function for the FWHM of data in a histogram
Typically used by sending slices around a peak. Searches about argmax(hist) for the peak to fall by [fraction] from mx to bl
- Parameters:
fraction – The fractional amplitude at which to evaluate the full width
hist (ndarray) – The histogram data array containing the peak
bins (ndarray) – An array of bin edges for the histogram
var (ndarray | None) – An array of histogram variances. Used with the ‘fit_slopes’ method
mx (float | tuple[float, float] | None) – The value to use for the max of the peak. If None, np.amax(hist) is used.
dmx (float | None) – The uncertainty in mx
bl (float | tuple[float, float] | None) – Used to specify an offset from which to estimate the FWFM.
dbl (float | None) – The uncertainty in the bl
method (str) –
- ‘bins_over_f’the simplest method: just take the difference in the bin
centers that are over [fraction] of max. Only works for high stats and FWFM/bin_width >> 1
- ’interpolate’interpolate between the bins that cross the [fraction]
line. Works well for high stats and a reasonable number of bins. Uncertainty incorporates var, if provided.
- ’fit_slopes’fit over n_slope bins in the vicinity of the FWFM and
interpolate to get the fractional crossing point. Works okay even when stats are moderate but requires enough bins that dx traversed by n_slope bins is approximately linear. Incorporates bin variances in fit and final uncertainties if provided.
n_slope (int) – Number of bins in the vicinity of the FWFM used to interpolate the fractional crossing point with the ‘fit_slopes’ method
- Returns:
fwhm, dfwhm – fwfm: the full width at half of the maximum above bl dfwfm: the uncertainty in fwhm
- Return type:
See also
get_fwfmFunction that computes the FWFM
- pygama.math.histogram.get_gaussian_guess(hist, bins)¶
given a hist, gives guesses for mu, sigma, and amplitude
- Parameters:
- Returns:
guess_mu – guess for the mu parameter of a Gaussian
guess_sigma – guess for the sigma parameter of a Gaussian
guess_area – guess for the A parameter of a Gaussian
- Return type:
- pygama.math.histogram.get_hist(data, bins=None, range=None, dx=None, wts=None)¶
return hist, bins, var after binning data
This is just a wrapper for humba.histogram, with optional weights for each element and proper computing of variances.
Note: there are no overflow / underflow bins.
Available binning methods:
Default (no binning arguments) : 100 bins over an auto-detected range
bins=n, range=(x_lo, x_hi) : n bins over the specified range (or leave range=None for auto-detected range)
bins=[str] : use one of np.histogram’s automatic binning algorithms
bins=bin_edges_array : array lower bin edges, supports non-uniform binning
dx=dx, range=(x_lo, x_hi): bins of width dx over the specified range. Note: dx overrides the bins argument!
- Parameters:
data (ndarray) – The array of data to be histogrammed
bins (int | ndarray | str | None) – int: the number of bins to be used in the histogram array: an array of bin edges to use str: the name of the np.histogram automatic binning algorithm to use If not provided, humba.histogram’s default auto-binning routine is used
range (tuple[float, float] | None) – (x_lo, x_high) is the tuple of low and high x values to uses for the very ends of the bin range. If not provided, np.histogram chooses the ends based on the data in data
dx (float | None) – Specifies the bin width. Overrides “bins” if both arguments are present
wts (float | ndarray | None) – Array of weights for each bin. For example, if you want to divide all bins by a time T to get the bin contents in count rate, set wts = 1/T. Variances will be computed for each bin that appropriately account for each data point’s weighting.
- Returns:
hist – the values in each bin of the histogram
bins – an array of bin edges: bins[i] is the lower edge of the ith bin. Note: it includes the upper edge of the last bin and does not include underflow or overflow bins. So len(bins) = len(hist) + 1
var – array of variances in each bin of the histogram
- Return type:
- pygama.math.histogram.get_i_local_extrema(data, delta)¶
Get lists of indices of the local maxima and minima of data
The “local” extrema are those maxima / minima that have heights / depths of at least delta. Converted from MATLAB script at: http://billauer.co.il/peakdet.html
- Parameters:
data (array-like) – the array of data within which extrema will be found
delta (scalar) – the absolute level by which data must vary (in one direction) about an extremum in order for it to be tagged
- Returns:
imaxes, imins (2-tuple ( array, array )) – A 2-tuple containing arrays of variable length that hold the indices of the identified local maxima (first tuple element) and minima (second tuple element)
- pygama.math.histogram.get_i_local_maxima(data, delta)¶
- pygama.math.histogram.get_i_local_minima(data, delta)¶
- pygama.math.histogram.plot_hist(hist, bins, var=None, show_stats=False, stats_hloc=0.75, stats_vloc=0.85, fill=False, fillcolor='r', fillalpha=0.2, **kwargs)¶
Plot a step histogram, with optional error bars
- Parameters:
hist (ndarray) – The histogram data array containing the peak
bins (ndarray) – An array of bin edges for the histogram
var (ndarray | None) – An array of histogram variances. Used with the ‘fit_slopes’ method
show_stats (bool) – If True, shows the mean, mean error, and standard deviation of the histogram on the plot
stats_hloc (float) – matplotlib.pyplot horizontal location to place the stats
stats_vloc (float) – matplotlib.pyplot vertical location to place the stats
fill (bool) – If True, fills in a step histogram when plotting
fill_color – Color to fill the step histogram if fill is True
fill_alpha – Alpha amount if fill is True
pygama.math.hpge_peak_fitting module¶
pygama convenience functions for fitting hpge peak shape data
- pygama.math.hpge_peak_fitting.hpge_peak_fwfm(sigma, htail, tau, frac_max=0.5, cov=None)¶
Return the FWHM of the radford_peak function, ignoring background and step components. If calculating error also need the normalisation for the step function.
- pygama.math.hpge_peak_fitting.hpge_peak_fwhm(sigma, htail, tau, cov=None)¶
Return the FWHM of the hpge_peak function, ignoring background and step components. If calculating error also need the normalisation for the step function.
- Parameters:
sigma (float) – The width of the hpge_peak
htail (float) – The height of the tail in the hpge_peak
tau (float) – The characteristic scale in the extended Gaussian in the hpge_peak
cov (float | None) – The covariant matrix of the previous parameters
Returns –
- FWHM, FWHM_uncertainty
The FWHM of the hpge_peak and its uncertainty
- Return type:
- pygama.math.hpge_peak_fitting.hpge_peak_mode(mu, sigma, htail, tau, cov=None)¶
- pygama.math.hpge_peak_fitting.hpge_peak_parameter_gradient(e, pars, step_norm)¶
Computes the gradient of the hpge_peak parameters
- Parameters:
- Returns:
gradient – gradient of the n_sig, mu, sigma, h_tail, tau, n_bkg, and hstep parameters of the HPGe peak
- Return type:
- pygama.math.hpge_peak_fitting.hpge_peak_peakshape_derivative(e, pars, step_norm)¶
Computes the derivative of the hpge_peak peak shape
pygama.math.least_squares module¶
pygama convenience functions for linearly fitting data
- pygama.math.least_squares.fit_simple_scaling(x, y, var=1)¶
Fast computation of weighted linear least squares fit to a simple scaling
I.e. y = scale * x. Returns the best fit scale parameter and its variance.
- pygama.math.least_squares.linear_fit_by_sums(x, y, var=1)¶
Fast computation of weighted linear least squares fit to a linear model
Note: doesn’t compute covariances. If you want covariances, just use polyfit
pygama.math.rebin module¶
Adaptive rebinning routines for 1D histograms.
Two adaptive strategies are offered:
Bayesian blocks (Scargle et al. 2013), backed by
hepstats.modeling.bayesian_blocks().Peak-aware: peaks are detected and each peak region is collapsed into a single bin. The continuum is either left at bblocks spacing or rebinned to a uniform width.
Functions taking a raw data array are named hist_*; those taking a
finely-binned hist.Hist are named rebin_*. All routines
return a Variable-axis Weight-storage hist.Hist.
Public functions¶
hist_bblocks(): histogram raw data using Bayesian-blocks edges.hist_bblocks_with_peaks(): histogram raw data with bblocks edges on the continuum and one merged bin per detected peak.hist_uniform_with_peaks(): histogram raw data with a uniform continuum binning and one merged bin per detected peak.rebin_bblocks(): rebin an existing fine histogram onto Bayesian-blocks edges.rebin_bblocks_with_peaks(): rebin an existing fine histogram onto bblocks edges and merge each detected peak into a single bin.rebin_uniform_with_peaks(): rebin an existing fine histogram onto a uniform continuum binning and merge each detected peak into a single bin.
Rebinning a histogram with the edges of another¶
To rebin a fine histogram h_fine onto the edges of another
histogram h_ref, use hist.rebin() directly (the edges of
h_ref must be a subset of those of h_fine):
>>> import hist as bh
>>> h_out = h_fine[:: bh.rebin(edges=h_ref.axes[0].edges.tolist())]
- pygama.math.rebin._bblocks_edges(data, **kwargs)¶
Bblocks edges; last edge nudged by 1 ULP for right-exclusive axes.
- Return type:
- pygama.math.rebin._bblocks_edges_from_hist(h, *, p0=0.05)¶
Bayesian-blocks edges from a fine histogram, snapped to its grid.
Empty bins are dropped to avoid the
log(0)warning hepstats raises on zero-weight events.- Return type:
- pygama.math.rebin._collapse_peaks(h, *, n_sigma=5.0, width_factor=5.0, max_bin_width=None)¶
Merge bins inside each detected peak region into a single bin.
- Return type:
Hist
- pygama.math.rebin._find_peak_regions(h, *, n_sigma=5.0, width_factor=5.0, max_bin_width=None)¶
Detect peaks and return
(n_regions, 2)merged region edges.Edges are in axis units; the array is empty if no peaks pass the significance threshold. Peaks are strict local maxima of the rate (counts/width), filtered by
prominence * w_peak / sqrt(n_peak)againstn_sigma. Whenmax_bin_widthis given, candidates sitting on a bin wider than the cap are also rejected (line-like features have resolution-bounded widths; wide bins flag Compton edges and smooth-continuum plateaus). Each surviving peak is grown outward via_walk_peak()and overlapping regions are merged.- Return type:
- pygama.math.rebin._prepare_data(data, prebin_low, prebin_high)¶
Flatten awkward inputs and apply the optional range mask.
- pygama.math.rebin._snap_indices(fine_edges, target_edges)¶
Nearest-edge indices into
fine_edgesfor eachtarget_edges.Endpoints are clamped to
0andlen(fine_edges) - 1and duplicates are removed, so the result is a strictly increasing integer array suitable as group boundaries forhist.rebin().- Return type:
- pygama.math.rebin._uniform_edges_with_peaks(low, high, *, bin_width, peak_regions)¶
Uniform grid over
[low, high]with peak regions as single bins.Any uniform edge strictly inside a peak region, OR within
bin_widthof one of its boundaries, is dropped: the peak region’s neighboring continuum bins absorb the would-be fragment, making them up to2 * bin_widthwide instead. Uniform edges exactly on a peak boundary are kept. Assumespeak_regionsis sorted and non-overlapping.- Return type:
- pygama.math.rebin._walk_peak(start, step, rate, widths, width_cap, n)¶
Walk outward from a peak bin and return the farthest index reached.
Steps in direction
step(-1 or +1), stopping when the rate stops descending, the next bin’s width reacheswidth_cap, or the next bin is a strict local minimum of rate. At the array boundary the missing neighbor is treated as +inf.- Return type:
- pygama.math.rebin.hist_bblocks(data, *, prebin_width=None, prebin_low=None, prebin_high=None, p0=0.05)¶
Histogram an unbinned data array using Bayesian-blocks edges.
For large samples, pass
prebin_widthto first bin the data on a uniform fine grid and then callrebin_bblocks()– this is much faster than running bblocks on the raw data directly. Awkward inputs are flattened before binning.- Parameters:
data (ndarray | Array) – Array of measurement values (numpy or awkward).
prebin_width (float | None) – Width of the uniform pre-binning grid. Should be a few times finer than the narrowest feature you want to resolve (e.g. ~0.5 keV for HPGe lines with ~3 keV FWHM).
prebin_low (float | None) – Histogram range. Default to
data.min()/data.max(). Data outside the range is discarded. Only valid together withprebin_width. The upper edge may overshootprebin_highby up toprebin_widthso the range is covered by an integer number of bins.prebin_high (float | None) – Histogram range. Default to
data.min()/data.max(). Data outside the range is discarded. Only valid together withprebin_width. The upper edge may overshootprebin_highby up toprebin_widthso the range is covered by an integer number of bins.p0 (float) – Bblocks sensitivity: lower values give coarser bins, higher values finer. Typical 0.01-0.1.
- Returns:
h – A
Variable-axisWeight-storage histogram filled withdata.- Return type:
Hist
Examples
>>> from pygama.math.rebin import hist_bblocks >>> h = hist_bblocks(energies, prebin_width=0.5, prebin_low=0, prebin_high=3000)
- pygama.math.rebin.hist_bblocks_with_peaks(data, *, prebin_width=None, prebin_low=None, prebin_high=None, p0=0.05, n_sigma=5.0, width_factor=5.0, max_bin_width=None)¶
Histogram raw data with Bayesian-blocks edges and collapsed peaks.
Equivalent to
_collapse_peaks(hist_bblocks(data, ...)).- Parameters:
data (ndarray | Array) – See
hist_bblocks().prebin_width (float | None) – See
hist_bblocks().prebin_low (float | None) – See
hist_bblocks().prebin_high (float | None) – See
hist_bblocks().p0 (float) – See
hist_bblocks().n_sigma (float) – See
rebin_bblocks_with_peaks().width_factor (float) – See
rebin_bblocks_with_peaks().max_bin_width (float | None) – See
rebin_bblocks_with_peaks().
- Returns:
h – A
Variable-axisWeight-storage histogram.- Return type:
Hist
Examples
Starting with a large data array, prebin it for performance and use a maximum peak width of 10.
>>> from pygama.math.rebin import hist_bblocks_with_peaks >>> data = [...] >>> h = hist_bblocks_with_peaks( ... data, prebin_width=0.5, prebin_low=0, prebin_high=3000, max_bin_width=10 ... )
- pygama.math.rebin.hist_uniform_with_peaks(data, *, bin_width, prebin_width=None, prebin_low=None, prebin_high=None, p0=0.05, n_sigma=5.0, width_factor=5.0, max_bin_width=None)¶
Histogram raw data with uniform continuum bins and collapsed peaks.
- Parameters:
data (ndarray | Array) – See
hist_bblocks().prebin_width (float | None) – See
hist_bblocks().p0 (float) – See
hist_bblocks().prebin_low (float | None) – See
hist_bblocks(); here they also define the output range.prebin_high (float | None) – See
hist_bblocks(); here they also define the output range.bin_width (float) – See
rebin_uniform_with_peaks().n_sigma (float) – See
rebin_bblocks_with_peaks().width_factor (float) – See
rebin_bblocks_with_peaks().max_bin_width (float | None) – See
rebin_bblocks_with_peaks().
- Returns:
h – A
Variable-axisWeight-storage histogram filled withdata.- Return type:
Hist
Examples
Starting with a large data array, prebin it for performance and use a maximum peak width of 10. In between the peaks use an uniform bin width of 2.
>>> from pygama.math.rebin import hist_uniform_with_peaks >>> data = [...] >>> h = hist_uniform_with_peaks( ... energies, ... bin_width=2, ... prebin_width=0.5, ... prebin_low=0, ... prebin_high=3000, ... max_bin_width=10, ... )
- pygama.math.rebin.rebin_bblocks(h, *, p0=0.05)¶
Rebin a fine 1D histogram using Bayesian-blocks adaptive binning.
Each fine bin is treated as a weighted event (weight = bin count); the resulting block edges are snapped to the input grid.
- Parameters:
h (Hist) – Input finely-binned 1D histogram.
p0 (float) – See
hist_bblocks().
- Returns:
out – A
Variable-axisWeight-storage histogram whose edges are a subset of the input edges.- Return type:
Hist
Examples
>>> from pygama.math.rebin import rebin_bblocks >>> import hist >>> h_fine = hist.new.Reg(1000, 0, 1000).Double().fill(...) >>> h_bb = rebin_bblocks(h_fine)
- pygama.math.rebin.rebin_bblocks_with_peaks(h, *, n_sigma=5.0, width_factor=5.0, max_bin_width=None, p0=0.05)¶
Rebin a fine histogram with bblocks edges and collapsed peaks.
First applies
rebin_bblocks()to get adaptive continuum bins, then merges each detected gamma peak into a single bin.- Parameters:
h (Hist) – Input finely-binned 1D histogram.
n_sigma (float) – How tall a peak must be above its local baseline to be flagged, in units of Poisson noise. Default
5.0. Lower values pick up weaker peaks but also more spurious detections.width_factor (float) – How far the merge can extend on each side of a peak, relative to the width of the peak’s own bblocks bin. The walk stops at the first bin wider than
width_factor * peak_bin_width. Default5.0; try2.0-3.0for tighter merges on HPGe spectra.max_bin_width (float | None) –
Maximum allowed width for a peak bin (in axis units). Two effects:
Detection: candidates whose bblocks bin is wider than this are rejected – useful for filtering out Compton edges, continuum plateaus, and other broad features that the peak finder cannot tell apart from real gamma lines based on shape alone (see Notes).
Walk: the merge walk never extends into a bin wider than this, regardless of
width_factor.
Default
None(no filter). For HPGe spectra set it to a few times FWHM at the highest energy of interest (5-20 keV is typical) to suppress non-line features.p0 (float) – See
hist_bblocks().
- Returns:
out – A
Variable-axisWeight-storage histogram whose edges are a subset of the input edges.- Return type:
Hist
Notes
The peak finder only looks at local maxima of the rate (counts/width) and how high they stand above their local baseline; it knows nothing about what a real gamma line looks like. It can therefore flag features that aren’t line-like:
resolution-limited gamma peaks (the intended target);
the upper end of a long Compton-edge ramp;
step transitions;
wide continuum plateaus that bblocks captured in a single wide bin.
Real HPGe peaks live on narrow bblocks bins (FWHM is a few keV even at multi-MeV energies); the other features end up on much wider bins.
max_bin_widthis the simplest filter to reject everything that isn’t a real line.Examples
>>> from pygama.math.rebin import rebin_bblocks_with_peaks >>> h_out = rebin_bblocks_with_peaks(h_fine, max_bin_width=10)
- pygama.math.rebin.rebin_uniform_with_peaks(h, *, bin_width, n_sigma=5.0, width_factor=5.0, max_bin_width=None, p0=0.05)¶
Rebin a fine histogram onto uniform continuum bins with collapsed peaks.
Runs an internal
rebin_bblocks()pass to find the peaks, then builds the output: uniform-bin_widthcontinuum bins everywhere except inside detected peak regions, which each become a single bin. Output edges are snapped toh.axes[0].edges, sobin_widthneed not align exactly with the input grid.- Parameters:
h (Hist) – Input finely-binned 1D histogram.
bin_width (float) – Continuum bin width (in axis units). To avoid tiny fragment bins right next to peaks, the continuum bin bordering each peak absorbs the leftover and can be up to
2 * bin_widthwide.n_sigma (float) – See
rebin_bblocks_with_peaks().width_factor (float) – See
rebin_bblocks_with_peaks().max_bin_width (float | None) – See
rebin_bblocks_with_peaks().p0 (float) – See
rebin_bblocks_with_peaks().
- Returns:
out – A
Variable-axisWeight-storage histogram whose edges are a subset of the input edges.- Return type:
Hist
Examples
>>> from pygama.math.rebin import rebin_uniform_with_peaks >>> import hist >>> h_fine = hist.new.Reg(1000, 0, 1000).Double().fill(...) >>> h_out = rebin_uniform_with_peaks(h_fine, bin_width=2, max_bin_width=10)
pygama.math.unbinned_fitting module¶
pygama convenience functions for fitting ubinned data
- pygama.math.unbinned_fitting.fit_unbinned(func, data, guess=None, extended=True, cost_func='LL', simplex=False, bounds=None, fixed=None)¶
Do a unbinned fit to data. Default is extended Log Likelihood fit, with option for other cost functions.
- Parameters:
func (Callable) – the function to fit
data (ndarray) – the data values to be fit
guess (ndarray) – initial guess parameters
extended (bool) – run extended or non extended fit
cost_func (str) – cost function to use. LL is ExtendedUnbinnedNLL, None is for just UnbinnedNLL
simplex (bool) – whether to include a round of simpson minimisation before main minimisation
bounds (list[tuple[float, float]] | None) – Each tuple is (min, max) for the corresponding parameters. Bounds can be None, e.g. [(0,None), (0,10)]
fixed (list[bool] | None) – list of parameter indices to fix
- Returns:
pars, errs, cov – the best-fit parameters and their errors / covariance
- Return type:
pygama.math.units module¶
pygama.math.utils module¶
pygama utility functions.
- pygama.math.utils.get_formatted_stats(mean, sigma, ndigs=2)¶
convenience function for formatting mean +/- sigma to the right number of significant figures.
- pygama.math.utils.get_par_names(func)¶
Return a list containing the names of the arguments of “func” other than the first argument. In pygamaland, those are the function’s “parameters.”
- pygama.math.utils.print_fit_results(pars, cov, func=None, title=None, pad=True)¶
Convenience function to write scipy.optimize.curve_fit results to the log
- Parameters:
- Returns:
None – Writes the curve_fit results to the log
- Return type:
None