pygama.math.functions package¶
Statistical distributions for the Pygama package.
Each distribution requires the definition of four vectorized numbafied functions that take an array as an input:
1. nb_dist_pdf(x, mu, sigma, shape)()
Returns the PDF normalized on the support
2. nb_dist_cdf(x, mu, sigma, shape)()
Returns the CDF derived from the PDF that is normalized on the support
3. nb_dist_scaled_pdf(x, area, mu, sigma, shape)()
Returns area*nb_dist_prd
4. nb_dist_scaled_cdf(x, area, mu, sigma, shape)()
Returns area*nb_dist_cdf
NOTE: The order of the arguments of these functions follows the ordering convention from left to right (for whichever are present): x, x_lo, x_hi, area, mu, sigma, shapes
Then these four functions are packaged into a class that subclasses our own PygamaContinuous. This distribution class then has 9 required methods.
1. _pdf(x, mu, sigma, shape)()
An overloading of the scipy rv_continuous base class’ method, this is slow, but it enables access to other scipy rv_continuous methods like _rvs
2. _cdf(x, mu, sigma, shape)()
An overloading of the scipy rv_continuous base class’ method, this is slow, but it enables access to other scipy rv_continuous methods like _logcdf
3. get_pdf(x, mu, sigma, shape)()
A direct call to nb_dist_pdf, it is very quick. This PDF is normalized on its support
4. get_cdf(x, mu, sigma, shape)()
A direct call to nb_dist_cdf, it is very quick. This CDF comes from the PDF that is normalized on the support
5. norm_pdf(x, x_lo, x_hi, mu, sigma, shape)()
Returns the get_pdf, normalized to unity on the fit range (normalized on [x_lo, x_hi]). Needed for unbinned fits
6. norm_cdf(x, x_lo, x_hi, mu, sigma, shape)()
Returns get_cdf, normalized on the fit range (normalized on [x_lo, x_hi]). Needed for binned fits
7. pdf_ext(x, x_lo, x_hi, area, shape, mu, sigma)()
Returns both the integral of nb_dist_scaled_pdf, as well as nb_dist_scaled_pdf itself. Needed for extended unbinned fits
8. cdf_ext(x, area, shape, mu, sigma)()
Returns a direct call to nb_dist_scaled_cdf itself. Needed for extended binned fits.
9. required_args()
A tuple of the required shape, mu, and sigma parameters
NOTE: the order of the arguments to these functions must follow the convention for whichever are present: x, x_lo, x_hi,, area, mu, sigma, shape
PygamaContinuous subclasses scipy’s rv_continuous and overloads the class instantiation so that is calls numba_frozen. numba_frozen is a subclass of rv_continuous, and acts exactly like scipy’s rv_continuous_frozen, except that this class also has methods for pygama specific get_pdf, get_cdf, pdf_ext, and cdf_ext. This subclassing is necessary so that we can instantiate frozen distributions that have access to our required pygama class methods.
In addition, the class SumDists subclasses rv_continuous. This is so that distributions created out of the sum of distributions also have access to scipy rv_continuous methods, such as random sampling. SumDists also has convenience functions such as draw_pdf, which plots the pdf, and required_args which returns a list of the required arguments for the sum of the distribution.
Submodules¶
pygama.math.functions.crystal_ball module¶
Crystal ball distributions for Pygama
- class pygama.math.functions.crystal_ball.CrystalBallGen(*args, **kwargs)¶
Bases:
PygamaContinuous
- pygama.math.functions.crystal_ball.nb_crystal_ball_cdf(x, mu, sigma, beta, m)¶
CDF for power-law tail plus gaussian. Its range of support is \(x\in\mathbb{R}, \beta>0, m>1\). It computes:
\[\begin{split}cdf(x, \beta, m, \mu, \sigma)= \begin{cases} NA\sigma\frac{(B-\frac{x-\mu}{\sigma})^{1-m}}{m-1} \quad , \frac{x-\mu}{\sigma} \leq -\beta \\ NA\sigma\frac{(B+\beta)^{1-m}}{m-1} + n\sigma \sqrt{\frac{\pi}{2}}\left(\text{erf}\left(\frac{x-\mu}{\sigma \sqrt{2}}\right)+\text{erf}\left(\frac{\beta}{\sqrt{2}}\right)\right) \quad , \frac{x-\mu}{\sigma} > -\beta \end{cases}\end{split}\]Where
\[\begin{split}A = \frac{m^m}{\beta^m}e^{-\beta^2/2} \\ B = \frac{m}{\beta}-\beta \\ n = \frac{1}{\sigma \frac{m e^{-\beta^2/2}}{\beta(m-1)} + \sigma \sqrt{\frac{\pi}{2}}\left(1+\text{erf}\left(\frac{\beta}{\sqrt{2}}\right)\right)}\end{split}\]As a Numba vectorized function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.crystal_ball.nb_crystal_ball_pdf(x, mu, sigma, beta, m)¶
PDF of a power-law tail plus gaussian. Its range of support is \(x\in\mathbb{R}, \beta>0, m>1\). It computes:
\[\begin{split}pdf(x, \beta, m, \mu, \sigma) = \begin{cases}NA(B-\frac{x-\mu}{\sigma})^{-m} \quad \frac{x-\mu}{\sigma}\leq -\beta \\ Ne^{-(\frac{x-\mu}{\sigma})^2/2} \quad \frac{x-\mu}{\sigma}>-\beta\end{cases}\end{split}\]Where
\[\begin{split}A = \frac{m^m}{\beta^m}e^{-\beta^2/2} \\ B = \frac{m}{\beta}-\beta \\ n = \frac{1}{\sigma \frac{m e^{-\beta^2/2}}{\beta(m-1)} + \sigma \sqrt{\frac{\pi}{2}}\left(1+\text{erf}\left(\frac{\beta}{\sqrt{2}}\right)\right)}\end{split}\]As a Numba vectorized function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.crystal_ball.nb_crystal_ball_scaled_cdf(x, area, mu, sigma, beta, m)¶
Scaled CDF for power-law tail plus gaussian. Used for extended binned fits. As a Numba vectorized function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
x (ndarray) – The input data
area (float) – The number of counts in the distribution
mu (float) – The amount to shift the distribution
sigma (float) – The amount to scale the distribution
beta (float) – The point where the cdf changes from power-law to Gaussian
m (float) – The power of the power-law tail
- Return type:
- pygama.math.functions.crystal_ball.nb_crystal_ball_scaled_pdf(x, area, mu, sigma, beta, m)¶
Scaled PDF of a power-law tail plus gaussian. As a Numba vectorized function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
x (ndarray) – The input data
area (float) – The number of counts in the distribution
mu (float) – The amount to shift the distribution
sigma (float) – The amount to scale the distribution
beta (float) – The point where the pdf changes from power-law to Gaussian
m (float) – The power of the power-law tail
- Return type:
pygama.math.functions.error_function module¶
pygama.math.functions.exgauss module¶
Exponentially modified Gaussian distributions for pygama
- class pygama.math.functions.exgauss.ExgaussGen(*args, **kwargs)¶
Bases:
PygamaContinuous
- pygama.math.functions.exgauss.nb_exgauss_cdf(x, mu, sigma, tau)¶
The CDF for a normalized exponentially modified Gaussian. Its range of support is \(x\in(-\infty,\infty)\), \(\tau\in(-\infty,\infty)\) It computes the following CDF:
\[cdf(x,\tau,\mu,\sigma) = \tau\,pdf(x,\tau,\mu,\sigma)+ \frac{\tau}{2|\tau|}\text{erf}\left(\frac{\tau}{|\tau|\sqrt{2}}(\frac{x-\mu}{\sigma})\right) + \frac{1}{2}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.exgauss.nb_exgauss_pdf(x, mu, sigma, tau)¶
Normalized PDF of an exponentially modified Gaussian distribution. Its range of support is \(x\in(-\infty,\infty)\), \(\tau\in(-\infty,\infty)\) Calls either
nb_gauss_tail_exact()ornb_gauss_tail_approx()depending on which is computationally cheaperAs a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
- Return type:
See also
- pygama.math.functions.exgauss.nb_exgauss_scaled_cdf(x, area, mu, sigma, tau)¶
Scaled CDF of an exponentially modified Gaussian distribution As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.exgauss.nb_exgauss_scaled_pdf(x, area, mu, sigma, tau)¶
Scaled PDF of an exponentially modified Gaussian distribution Can be used as a component of other fit functions w/args mu,sigma,tau As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.exgauss.nb_gauss_tail_approx(x, mu, sigma, tau)¶
Approximate form of a normalized exponentially modified Gaussian PDF As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
- Return type:
See also
- pygama.math.functions.exgauss.nb_gauss_tail_exact(x, mu, sigma, tau, tmp)¶
Exact form of a normalized exponentially modified Gaussian PDF. It computes the following PDF:
\[pdf(x, \tau,\mu,\sigma) = \frac{1}{2|\tau|}e^{\frac{x-\mu}{\tau}+\frac{\sigma^2}{2\tau^2}}\text{erfc}\left(\frac{\tau(\frac{x-\mu}{\sigma})+\sigma}{|\tau|\sqrt{2}}\right)\]Where \(tmp = \frac{x-\mu}{\tau}+\frac{\sigma^2}{2\tau^2}\) is precomputed in
nb_exgauss_pdf()to save computational time.As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
- Return type:
See also
pygama.math.functions.exponential module¶
Exponential distributions for pygama
- class pygama.math.functions.exponential.ExponentialGen(*args, **kwargs)¶
Bases:
PygamaContinuous
- pygama.math.functions.exponential.nb_exponential_cdf(x, mu, sigma, lamb)¶
Normalised exponential cumulative distribution, w/ args: mu, sigma, lamb. Its range of support is \(x\in[0,\infty), \lambda>0\). It computes:
\[\begin{split}cdf(x, \lambda, \mu, \sigma) = \begin{cases} 1-e^{-\lambda\frac{x-\mu}{\sigma}} \quad , \frac{x-\mu}{\sigma} > 0 \\ 0 \quad , \frac{x-\mu}{\sigma}\leq 0 \end{cases}\end{split}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.exponential.nb_exponential_pdf(x, mu, sigma, lamb)¶
Normalised exponential probability density distribution, w/ args: mu, sigma, lamb. Its range of support is \(x\in[0,\infty), \lambda>0\). It computes:
\[\begin{split}pdf(x, \lambda, \mu, \sigma) = \begin{cases} \lambda e^{-\lambda\frac{x-\mu}{\sigma}} \quad , \frac{x-\mu}{\sigma}\geq 0 \\ 0 \quad , \frac{x-\mu}{\sigma}<0 \end{cases}\end{split}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.exponential.nb_exponential_scaled_cdf(x, area, mu, sigma, lamb)¶
Exponential cdf scaled by the area, used for extended binned fits As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.exponential.nb_exponential_scaled_pdf(x, area, mu, sigma, lamb)¶
Scaled exponential probability distribution, w/ args: area, mu, sigma, lambd. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
pygama.math.functions.gauss module¶
Gaussian distributions for pygama
- class pygama.math.functions.gauss.GaussianGen(*args, **kwargs)¶
Bases:
PygamaContinuous
- pygama.math.functions.gauss.nb_gauss(x, mu, sigma)¶
Gaussian, unnormalised for use in building PDFs As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.gauss.nb_gauss_amp(x, mu, sigma, a)¶
Gaussian with height as a parameter for FWHM etc. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.gauss.nb_gauss_cdf(x, mu, sigma)¶
Gaussian CDF, w/ args: mu, sigma. The support is \((-\infty, \infty)\)
\[cdf(x, \mu,\sigma) = \frac{1}{2}\left[1+\text{erf}(\frac{x-\mu}{\sigma\sqrt{2}})\right]\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.gauss.nb_gauss_pdf(x, mu, sigma)¶
Normalised Gaussian PDF, w/ args: mu, sigma. The support is \((-\infty, \infty)\)
\[pdf(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi}}e^{(\frac{x-\mu}{\sigma}^2)/2}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.gauss.nb_gauss_scaled_cdf(x, area, mu, sigma)¶
Gaussian CDF scaled by the number of signal counts for extended binned fits As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.gauss.nb_gauss_scaled_pdf(x, area, mu, sigma)¶
Gaussian with height as a parameter for fwhm etc. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
pygama.math.functions.gauss_on_exgauss module¶
Provide a convenience function for a Gaussian on top of an extended Gaussian. The correct usage is gauss_on_exgauss.get_pdf(x, mu, sigma, htail, tau)()
- param mu:
The location and scale of the first Gaussian
- param htail:
The fraction of the Gaussian tail vs. the Gaussian peak
- param tau:
The characteristic scale of the extended Gaussian tail
- returns:
gauss_on_exgauss – A subclass of
SumDistsandrv_continuous, has methods ofpdf(),cdf(), etc.
Notes
The extended Gaussian distribution shares the mu, sigma with the Gaussian
pygama.math.functions.gauss_on_exponential module¶
Provide a convenience function for a Gaussian on top of an exponential. The correct usage is gauss_on_exponential.pdf_ext(x, n_sig, mu, sigma, lambda, n_bkg, mu_exp, sigma_exp)()
- param n_sig:
The area of the Gaussian
- param mu:
The location and scale of the first Gaussian
- param sigma:
The location and scale of the first Gaussian
- param n_bkg:
The area of the exponential
- param lambda:
The characteristic scale of the exponential
- param mu_exp:
The location and scale of the exponential
- param sigma_exp:
The location and scale of the exponential
- returns:
gauss_on_exponential – A subclass of
SumDistsandrv_continuous, has methods ofpdf(),cdf(), etc.
pygama.math.functions.gauss_on_linear module¶
Provide a convenience function for a Gaussian on top of a linear distribution.
- param area1:
The area of the Gaussian distribution
- param mu:
The location and scale of the first Gaussian
- param sigma:
The location and scale of the first Gaussian
- param area2:
The area of the linear distributions respectively
- param m:
The slope of the linear distribution
- param b:
The y intercept of the linear distribution
- param x_lo:
The lower bound on which to normalize the linear distribution
- param x_hi:
The upper bound on which to normalize the linear distribution
- returns:
gauss_on_linear – An instance of SumDists and rv_continuous, has methods of pdf, cdf, etc.
pygama.math.functions.gauss_on_step module¶
Provide a convenience function for a Gaussian on top of a step distribution.
- param x_lo:
The lower and upper bounds of the support of the step function
- param x_hi:
The lower and upper bounds of the support of the step function
- param area1:
The area of the Gaussian distribution
- param mu:
The location and scale of the first Gaussian, and step function
- param sigma:
The location and scale of the first Gaussian, and step function
- param area2:
The area of the step function
- param hstep:
The height of the step function
- returns:
gauss_on_step – An instance of SumDists and rv_continuous, has methods of pdf, cdf, etc.
Notes
The step function shares a mu and sigma with the step function The parameter array must have ordering (x_lo, x_hi, area1, mu, sigma, area2, hstep)
pygama.math.functions.gauss_on_uniform module¶
Provide a convenience function for a Gaussian on top of a uniform distribution.
- param x_lo:
The upper and lower bounds on which the uniform distribution is defined
- param x_hi:
The upper and lower bounds on which the uniform distribution is defined
- param n_sig:
The area of the Gaussian distribution
- param mu:
The location and scale of the first Gaussian
- param sigma:
The location and scale of the first Gaussian
- param n_bkg:
The area of the uniform distributions respectively
- returns:
gauss_on_uniform – An instance of SumDists and rv_continuous, has methods of pdf, cdf, etc.
pygama.math.functions.hpge_peak module¶
Provide a convenience function for the HPGe peak shape.
A HPGe peak consists of a Gaussian on an Exgauss on a step function.
Called with
hpge_peak.get_pdf(x, x_lo, x_hi, n_sig, mu, sigma, htail, tau, n_bkg, hstep)
- param x_lo:
Lower bound of the step function
- param x_hi:
Upper bound of the step function
- param n_sig:
The area of the gauss on exgauss
- param mu:
The centroid of the Gaussian
- param sigma:
The standard deviation of the Gaussian
- param htail:
The height of the Gaussian tail
- param tau:
The characteristic scale of the Gaussian tail
- param n_bkg:
The area of the step background
- param hstep:
The height of the step function background
- returns:
hpge_peak – A subclass of SumDists and rv_continuous, has methods of pdf, cdf, etc.
Notes
The extended Gaussian distribution and the step distribution share the mu, sigma with the Gaussian
- pygama.math.functions.hpge_peak.hpge_get_fwfm(self, pars, frac_max=0.5, cov=None)¶
Get the fwhm value from the output of a fit quickly Need to overload this to use hpge_peak_fwhm (to avoid a circular import) for when self is an hpge peak, and otherwise returns 2sqrt(-2log(frac_max))*sigma
- pygama.math.functions.hpge_peak.hpge_get_fwhm(self, pars, cov=None)¶
Get the fwhm value from the output of a fit quickly Need to overload this to use hpge_peak_fwhm (to avoid a circular import) for when self is an hpge peak, and otherwise returns 2sqrt(2log(2))*sigma
- pygama.math.functions.hpge_peak.hpge_get_mode(self, pars, cov=None)¶
Get the fwhm value from the output of a fit quickly Need to overload this to use hpge_peak_fwhm (to avoid a circular import) for when self is an hpge peak, and otherwise returns 2sqrt(2log(2))*sigma
pygama.math.functions.linear module¶
- class pygama.math.functions.linear.LinearGen(*args, **kwargs)¶
Bases:
PygamaContinuous- _argcheck(_x_lo, _x_hi, _m, _b)¶
Default check for correct values on args and keywords.
- Returns condition array of 1’s where arguments are correct and
0’s where they are not.
- pygama.math.functions.linear.nb_linear_cdf(x, x_lo, x_hi, m, b)¶
Normalised linear cumulative density function, w/ args: m, b. Its range of support is \(x\in(x_{lower},x_{upper})\). If \(x_{lower} = np.inf\) and \(x_{upper} = np.inf\), then the function takes \(x_{upper} = :func:`np.min(x)\) and \(x_{upper} = :func:`np.amax(x)\) It computes:
\[cdf(x, x_{lower}, x_{upper}, m, b) = \frac{\frac{m}{2}(x^2-x_{lower}^2)+b(x-x_{lower})}{\frac{m}{2}(x_{upper}^2-x_{lower}^2)+b(x_{upper}-x_{lower})}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.linear.nb_linear_pdf(x, x_lo, x_hi, m, b)¶
Normalised linear probability density function, w/ args: m, b. Its range of support is \(x\in(x_{lower},x_{upper})\). If \(x_{lower} = np.inf\) and \(x_{upper} = np.inf\), then the function takes \(x_{upper} = :func:`np.min(x)\) and \(x_{upper} = :func:`np.amax(x)\) It computes:
\[pdf(x, x_{lower}, x_{upper}, m, b) = \frac{mx+b}{\frac{m}{2}(x_{upper}^2-x_{lower}^2)+b(x_{upper}-x_{lower})}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.linear.nb_linear_scaled_cdf(x, x_lo, x_hi, area, m, b)¶
Linear cdf scaled by the area for extended binned fits As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
- Return type:
- pygama.math.functions.linear.nb_linear_scaled_pdf(x, x_lo, x_hi, area, m, b)¶
Scaled linear probability distribution, w/ args: m, b. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
- Return type:
pygama.math.functions.moyal module¶
Moyal distributions for pygama
- class pygama.math.functions.moyal.MoyalGen(*args, **kwargs)¶
Bases:
PygamaContinuous
- pygama.math.functions.moyal.nb_moyal_cdf(x, mu, sigma)¶
Normalised Moyal cumulative distribution, w/ args: mu, sigma. Its support is \(x\in\mathbb{R}\) It computes:
\[cdf(x, \mu, \sigma) = \text{erfc}\left(\exp\left(-\frac{x-\mu}{2\sigma}\right)/\sqrt{2}\right)\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.moyal.nb_moyal_pdf(x, mu, sigma)¶
Normalised Moyal probability distribution function, w/ args: mu, sigma. Its support is \(x\in\mathbb{R}\) It computes:
\[pdf(x, \mu, \sigma) = \frac{\exp\left(-\left(\frac{x-\mu}{\sigma}+\exp\left(-\frac{x-\mu}{\sigma}\right)\right)/2\right)}{\sigma\sqrt{2\pi}}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.moyal.nb_moyal_scaled_cdf(x, area, mu, sigma)¶
Moyal cdf scaled by the area, used for extended binned fits As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.moyal.nb_moyal_scaled_pdf(x, area, mu, sigma)¶
Scaled Moyal probability density function, w/ args: mu, sigma, area. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
pygama.math.functions.poisson module¶
Poisson distributions for pygama
- class pygama.math.functions.poisson.PoissonGen(a=0, b=inf, name=None, badvalue=None, moment_tol=1e-08, values=None, inc=1, longname=None, shapes=None, seed=None)¶
Bases:
rv_discrete- _cdf(x, mu, lamb)¶
- Return type:
array
- _pmf(x, mu, lamb)¶
- Return type:
array
- cdf_ext(x, area, mu, lamb)¶
- Return type:
array
- get_cdf(x, mu, lamb)¶
- Return type:
array
- get_pmf(x, mu, lamb)¶
- Return type:
array
- pmf_ext(x, x_lo, x_hi, area, mu, lamb)¶
- Return type:
array
- pygama.math.functions.poisson.factorial(nn)¶
- pygama.math.functions.poisson.nb_poisson_cdf(x, mu, lamb)¶
Normalised Poisson cumulative distribution, w/ args: mu, lamb. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
\[cdf(x, \lambda, \mu) = e^{-\lambda}\sum_{j=0}^{\lfloor x-\mu \rfloor}\frac{\lambda^j}{j!}\]
- pygama.math.functions.poisson.nb_poisson_pmf(x, mu, lamb)¶
Normalised Poisson distribution, w/ args: mu, lamb. The range of support is \(\mathbb{N}\), with \(lamb\) \(\in (0,\infty)\), \(\mu \in \mathbb{N}\) As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
\[pmf(x, \lambda, \mu) = \frac{\lambda^{x-\mu} e^{-\lambda}}{(x-\mu)!}\]
- pygama.math.functions.poisson.nb_poisson_scaled_cdf(x, area, mu, lamb)¶
Poisson cdf scaled by the number of signal counts for extended binned fits As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.poisson.nb_poisson_scaled_pmf(x, area, mu, lamb)¶
Scaled Poisson probability distribution, w/ args: lamb, mu. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
pygama.math.functions.polynomial module¶
- pygama.math.functions.polynomial.nb_poly(x, pars)¶
A polynomial function with pars following the numpy polynomial convention. It computes:
\[y = a_n x^n + ... + a_1 x + a_0\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
- Returns:
result – The polynomial defined by pars at the evaluated at x
- Return type:
Notes
This follows the
numpy.polynomial()convention of \(a_0 + a_1 x +.... + a_n x^n\)
pygama.math.functions.pygama_continuous module¶
Subclasses of scipy.stat’s rv_continuous, allows for freezing of pygama numba-fied distributions for faster functions Example:
>>> moyal = moyal_gen(name='moyal')
>>> moyal = moyal(1, 2)
>>> moyal.get_pdf([-1,2,3]) # a direct call to the faster numbafied method
>>> moyal.pdf([-1,2,3]) # a call to the slower scipy method, allows access to other scipy methods like .rvs
>>> moyal.rvs(100) # Can access scipy methods!
NOTE: the dist.pdf method is a slow scipy method, and the dist.get_pdf method is fast
- class pygama.math.functions.pygama_continuous.NumbaFrozen(dist, *args, **kwds)¶
Bases:
rv_frozenEssentially, an overloading of the definition of rv_continuous_frozen so that our pygama class instantiations have access to both the slower scipy methods, as well as the faster pygama methods (like get_pdf)
- cdf_ext(x, area)¶
Direct access to numba-fied extended pdfs, a fast function
- Returns:
scaled support normalized cdf
- Return type:
- cdf_norm(x, x_lo, x_hi)¶
Direct access to numba-fied cdfs, a fast function. Normalized on a fit range
- Returns:
fit-range normalized cdf
- get_cdf(x)¶
Direct access to numba-fied cdfs, a fast function
- Returns:
support normalized cdf
- Return type:
- get_pdf(x)¶
Direct access to numba-fied pdfs, a fast function
- Returns:
support normalized pdf
- Return type:
- pdf_ext(x, x_lo, x_hi, area)¶
Direct access to numba-fied extended pdfs, a fast function
- pdf_norm(x, x_lo, x_hi)¶
Direct access to numba-fied pdfs, a fast function. Normalized on a fit range
- Returns:
fit-range normalized pdf
- class pygama.math.functions.pygama_continuous.PygamaContinuous(momtype=1, a=None, b=None, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, seed=None)¶
Bases:
rv_continuousSubclass rv_continuous, and modify the instantiation so that we call an overloaded version of rv_continuous_frozen that has direct access to pygama numbafied functions
- _cdf_norm(x, x_lo, x_hi, *args, **_kwds)¶
Derive a cdf from a pdf that is normalized on a subset of its support, typically over a fit-range.
- Parameters:
x – The data
x_lo – The lower range to normalize on
x_hi – The upper range to normalize on
args – The shape and location and scale parameters of a specific distribution
- Returns:
pdf_norm – The pdf that is normalized on a smaller range
Notes
If upper_range and lower_range are both
np.inf(), then the function automatically takes x_lo =:func:np.amin(x), x_hi=:func:np.amax(x) We also need to overload this in every subclass because we want functions to be able to introspect the shape and location/scale names. For distributions that are only defined on a limited range, like with lower_range, upper_range, we don’t need to call these, instead just call the normal pdf.
- _pdf_norm(x, x_lo, x_hi, *args, **_kwds)¶
Normalize a pdf on a subset of its support, typically over a fit-range.
- Parameters:
x – The data
x_lo – The lower range to normalize on
x_hi – The upper range to normalize on
args – The shape and location and scale parameters of a specific distribution
- Returns:
pdf_norm – The pdf that is normalized on a smaller range
Notes
If upper_range and lower_range are both
np.inf(), then the function automatically takes x_lo =:func:np.amin(x), x_hi=:func:np.amax(x) We also need to overload this in every subclass because we want functions to be able to introspect the shape and location/scale names For distributions that are only defined on a limited range, like with lower_range, upper_range, we don’t need to call these, instead just call the normal pdf.
pygama.math.functions.step module¶
Step distributions for pygama
- class pygama.math.functions.step.StepGen(*args, **kwargs)¶
Bases:
PygamaContinuous- _argcheck(x_lo, x_hi, _mu, _sigma, _hstep)¶
Default check for correct values on args and keywords.
- Returns condition array of 1’s where arguments are correct and
0’s where they are not.
- pygama.math.functions.step.nb_step_cdf(x, x_lo, x_hi, mu, sigma, hstep)¶
Normalized CDF for step function w/args mu, sigma, hstep, x_lo, x_hi. Its range of support is \(x\in\) (x_lo, x_hi). It computes:
\[cdf(x, \mathrm{x}_\mathrm{lo}, \mathrm{x}_\mathrm{hi}, \mu, \sigma, hstep) = cdf(y=\frac{x-\mu}{\sigma}, hstep, \mathrm{x}_\mathrm{lo}, \mathrm{x}_\mathrm{hi}) = \frac{(y-y_{min}) +hstep\left(y\mathrm{erf}(\frac{y}{\sqrt{2}})+\sqrt{\frac{2}{\pi}}e^{-y^2/2}-y_{min}\mathrm{erf}(\frac{y_{min}}{\sqrt{2}})+\sqrt{\frac{2}{\pi}}e^{-y_{min}^2/2}\right)}{\sigma\left[(y_{max}-y_{min}) +hstep\left(y_{max}\mathrm{erf}(\frac{y_{max}}{\sqrt{2}})+\sqrt{\frac{2}{\pi}}e^{-y_{max}^2/2}-y_{min}\mathrm{erf}(\frac{y_{min}}{\sqrt{2}})+\sqrt{\frac{2}{\pi}}e^{-y_{min}^2/2}\right)\right] }\]Where \(y_{max} = \frac{\mathrm{x}_\mathrm{hi} - \mu}{\sigma}, y_{min} = \frac{\mathrm{x}_\mathrm{lo} - \mu}{\sigma}\). As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
x (ndarray) – The input data
x_lo (float) – The lower range on which to normalize the step PDF, default is to normalize from min to max x values
x_hi (float) – The upper range on which to normalize the step PDF
mu (float) – The location of the step
sigma (float) – The “width” of the step, because we are using an error function to define it
hstep (float) – The height of the step
- Return type:
- pygama.math.functions.step.nb_step_int(x, mu, sigma, hstep)¶
Integral of step function w/args mu, sigma, hstep. It computes:
\[\int cdf(x, hstep, \mu, \sigma)\, dx = \sigma\left(\frac{x-\mu}{\sigma} + hstep \left(\frac{x-\mu}{\sigma}\mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right) + \sqrt{\frac{2}{\pi}}\exp\left(-(\frac{x-\mu}{\sigma})^2/2\right) \right)\right)\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.step.nb_step_pdf(x, x_lo, x_hi, mu, sigma, hstep)¶
Normalised step function w/args mu, sigma, hstep, x_lo, x_hi. Its range of support is \(x\in\) (x_lo, x_hi). It computes:
\[pdf(x, \mathrm{x}_\mathrm{lo}, \mathrm{x}_\mathrm{hi}, \mu, \sigma, hstep) = pdf(y=\frac{x-\mu}{\sigma}, step, \mathrm{x}_\mathrm{lo}, \mathrm{x}_\mathrm{hi}) = \frac{1+hstep\mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)}{\sigma\left[(y-y_{min}) +hstep\left(y\mathrm{erf}(\frac{y}{\sqrt{2}})+\sqrt{\frac{2}{\pi}}e^{-y^2/2}-y_{min}\mathrm{erf}(\frac{y_{min}}{\sqrt{2}})+\sqrt{\frac{2}{\pi}}e^{-y_{min}^2/2}\right)\right]}\]Where \(y_{max} = \frac{\mathrm{x}_\mathrm{hi} - \mu}{\sigma}, y_{min} = \frac{\mathrm{x}_\mathrm{lo} - \mu}{\sigma}\). As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
x (ndarray) – The input data
x_lo (float) – The lower range on which to normalize the step PDF, default is to normalize from min to max x values
x_hi (float) – The upper range on which to normalize the step PDF
mu (float) – The location of the step
sigma (float) – The “width” of the step, because we are using an error function to define it
hstep (float) – The height of the step
- Return type:
- pygama.math.functions.step.nb_step_scaled_cdf(x, x_lo, x_hi, area, mu, sigma, hstep)¶
Scaled step function CDF w/args x_lo, x_hi, area, mu, sigma, hstep. Used for extended binned fits. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
x (ndarray) – The input data
x_lo (float) – The lower range on which to normalize the step PDF, default is to normalize from min to max x values
x_hi (float) – The upper range on which to normalize the step PDF
area (float) – The number to scale the distribution by
mu (float) – The location of the step
sigma (float) – The “width” of the step, because we are using an error function to define it
hstep (float) – The height of the step
- Return type:
- pygama.math.functions.step.nb_step_scaled_pdf(x, x_lo, x_hi, area, mu, sigma, hstep)¶
Scaled step function pdf w/args x_lo, x_hi, area, mu, sigma, hstep. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- Parameters:
x (ndarray) – The input data
x_lo (float) – The lower range on which to normalize the step PDF, default is to normalize from min to max x values
x_hi (float) – The upper range on which to normalize the step PDF
area (float) – The number to scale the distribution by
mu (float) – The location of the step
sigma (float) – The “width” of the step, because we are using an error function to define it
hstep (float) – The height of the step
- Return type:
- pygama.math.functions.step.nb_unnorm_step_pdf(x, mu, sigma, hstep)¶
Unnormalised step function for use in pdfs. It computes:
\[pdf(x, hstep, \mu, \sigma) = 1+ hstep\mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
pygama.math.functions.sum_dists module¶
A class that creates the sum of distributions, with methods for scipy computed pdfs() and cdfs(), as well as fast get_pdfs(), and pdf_ext()
mu1, mu2, sigma, frac = range(4)
moyal_add = SumDists(
[(moyal, [mu1, sigma]), (moyal, [mu2, sigma])], [frac], "fracs"
) # create two moyals that share a sigma and differ by a fraction,
x = np.arange(-10, 10)
pars = np.array([1, 2, 2, 0.1]) # corresponds to mu1 = 1, mu2 = 2, sigma = 2, frac=0.1
moyal_add.pdf(x, *pars)
moyal_add.draw_pdf(x, *pars)
moyal_add.required_args()
- class pygama.math.functions.sum_dists.SumDists(dists_and_pars_array, area_frac_idxs, flag, parameter_names=None, components=False, support_required=False, **kwds)¶
Bases:
rv_continuousInitialize an rv_continuous method so that we gain access to scipy computable methods. Precompute the support of the sum of the distributions.
The correct way to initialize is SumDists([(d1, [p1]), (d2, [p2])], [area_idx_1/frac_idx, area_idx_2/], frac_flag) Where d_i is a distribution and p_i is a parameter index array for that distribution.
Parameter index arrays contain indices that slice a single parameter array that is passed to method calls. For example, if the user will eventually pass parameters=[mu, sigma, tau, frac] to function.get_pdf(x, parameters) and the first distribution takes (mu, sigma) as its parameters, then p1=[0,1]. If the second distribution takes (mu, sigma, tau) then its parameter index array would be p2=[0, 1, 2] because tau is the index 2 entry in parameters.
Each par array can contain [x_lo, x_hi, mu, sigma, shape], and must be placed in that order.
The single parameter array passed to function calls must follow the ordering convention [x_lo, x_hi, frac/areas, shapes_1, frac/areas2, shapes_2] The single parameter array that is passed to
pdf_norm(),pdf_ext(), andcdf_norm()calls must have x_lo, x_hi as its first two arguments if none of the distributions require an explicit definition of their support.There are 4 flag options:
flag = “areas”, two areas passed as fit variables
flag = “fracs”, one fraction is passed a a fit variable, with the convention that SumDists performs f*dist_1 + (1-f)*dist_2
flag = “one_area”, one area is passed a a fit variable, with the convention that SumDists performs area*dist_1 + 1*dist_2
flag = None, no areas or fracs passed as fit variables, both are normalized to unity.
Notes
dists must be unfrozen pygama distributions of the type
PygamaContinuous()orsum_dist().- Parameters:
dists_and_pars_array – A list of two tuples, containing [(dist_1, [mu1, sigma1, shapes1]), (dist_2, [mu2, sigma2, shapes2])] to create a sum_dist from
area_frac_idxs – A list of the indices at which that either the areas or fraction will be placed in the eventual method calls
flag – One of three strings that initialize
sum_dist()in different modes. Either “fracs”, “areas” or “one_area”.parameter_names – An optional list of strings that contain the parameters names in the order they will appear in method calls
components – A boolean that if true will cause methods to return components instead of the sum of the distributions
support_required – A boolean that if true tells
sum_dist()that x_lo, x_hi will always be passed in method calls.
- _argcheck(*args)¶
Check that each distribution gets its own valid arguments.
Notes
This overloads the
scipy()definition so that the methodspdf()work.- Return type:
- _cdf(x, *params)¶
Overload
rv_continuous()definition of cdf in order to access other methods.
- _pdf(x, *params)¶
Overload
rv_continuous()definition of pdf in order to access other methods.
- cdf_ext(x, *params)¶
Returns the specified sum of all distributions’
get_cdf()methods, used in extended binned NLL fits.
- cdf_norm(x, *params)¶
Returns the specified sum of all distributions’
get_cdf()methods, but normalized on a range x_lo to x_hi. Used in binned NLL fits.NOTE: This assumes that x_lo, x_hi are the first two elements in the parameter array, unless x_lo and x_hi are required to define the support of the
SumDists()for all method calls. ForSumDists()created from a distribution where the support needs to be explicitly passed, this means thatcdf_norm()sets the fit range equal to the support range. It also means that summing distributions with two different explicit supports is not allowed, e.g. we cannot sum two step functions with different supports.
- draw_cdf(x, *params)¶
- draw_pdf(x, *params)¶
- get_fwfm(pars, cov=None, frac_max=0.5)¶
Get the fwfm value from the output of a fit quickly Need to overload this to use hpge_peak_fwfm (to avoid a circular import) for when self is an hpge peak, and otherwise returns 2sqrt(2log(2))*sigma
- get_fwhm(pars, cov=None)¶
Get the fwhm value from the output of a fit quickly Need to overload this to use hpge_peak_fwhm (to avoid a circular import) for when self is an hpge peak, and otherwise returns 2sqrt(2log(2))*sigma
- get_mode(pars, cov=None, errors=None)¶
Get the mode value from the output of a fit quickly Need to overload this to use hpge_peak_fwhm (to avoid a circular import) for when self is an hpge peak
- get_mu(pars, cov=None, errors=None)¶
Get the mu value from the output of a fit quickly
- get_total_events(pars, cov=None, errors=None)¶
Get the total events from the output of an extended fit quickly The total number of events come from the sum of the area of the signal and the area of the background components
- pdf_ext(x, *params)¶
Returns the specified sum of all distributions’
pdf_ext()methods, normalized on a range x_lo to x_hi. Used in extended unbinned NLL fits.NOTE: This assumes that x_lo, x_hi are the first two elements in the parameter array, unless x_lo and x_hi are required to define the support of the
SumDists()for all method calls. ForSumDists()created from a distribution where the support needs to be explicitly passed, this means thatpdf_ext()sets the fit range equal to the support range. It also means that summing distributions with two different explicit supports is not allowed, e.g. we cannot sum two step functions with different supports.
- pdf_norm(x, *params)¶
Returns the specified sum of all distributions’
get_pdf()methods, but normalized on a range x_lo to x_hi. Used in unbinned NLL fits.NOTE: This assumes that x_lo, x_hi are the first two elements in the parameter array, unless x_lo and x_hi are required to define the support of the
SumDists()for all method calls. ForSumDists()created from a distribution where the support needs to be explicitly passed, this means thatpdf_norm()sets the fit range equal to the support range. It also means that summing distributions with two different explicit supports is not allowed, e.g. we cannot sum two step functions with different supports.
- required_args()¶
- Returns:
req_args – A list of the required arguments for the
SumDists()instance, either passed by the user, or inferred.- Return type:
- pygama.math.functions.sum_dists.copy_signature(signature_to_copy, obj_to_copy_to)¶
Copy the signature provided in signature_to_copy into the signature for “obj_to_copy_to”. This is necessary so that we can override the signature for the various methods attached to different objects.
- pygama.math.functions.sum_dists.get_areas_fracs(params, area_frac_idxs, frac_flag, area_flag, one_area_flag)¶
Grab the value(s) of either the fraction or the areas passed in the params array from the
SumDists()call. IfSumDists()is in “fracs” mode, then this grabs f from the params array and returns fracs = [f, 1-f] and areas of unity. IfSumDists()is in “areas” mode, then this grabs s, b from the params array and returns unity fracs and areas = [s, b] IfSumDists()is in “one_area” mode, then this grabs s from the params array and returns unity fracs and areas = [s, 1]- Parameters:
params (array) – An array containing the shape values from a
SumDists()callarea_frac_idxs (array) – An array containing the indices of either the fracs or the areas present in the params array
frac_flag (bool) – A boolean telling if
SumDists()is in fracs mode or notarea_flag (bool) – A boolean telling if
SumDists()is in areas mode or notone_area_flag (bool) – A boolean telling if
SumDists()is to apply only area to one distribution
- Returns:
fracs, areas – Values of the fractions and the areas to post-multiply the sum of the distributions with
- Return type:
tuple[array, array]
- pygama.math.functions.sum_dists.get_dists_and_par_idxs(dists_and_pars_array)¶
Split the array of tuples passed to the
SumDists()constructor into separate arrays with one containing only the distributions and the other containing the parameter index arrays. Also performs some sanity checks.- Parameters:
dists_and_pars_array (array(<class 'tuple'>, dtype=object)) – An array of tuples, each tuple contains a distribution and the indices of the parameters in the
SumDists()call that correspond to that distribution- Returns:
dists, par_idxs – Returned only if every distribution has the correct number of required arguments
- Return type:
tuple[array, array]
- pygama.math.functions.sum_dists.get_parameter_names(dists, par_idxs, par_size)¶
Returns an array of the names of the required parameters for an instance of
SumDists()Works by callingrequired_args()for each distribution present in the sum. If a parameter is shared between distributions its name is only added once. If two parameters are required and share a name, then the second parameter gets an added index at the end.- Parameters:
dists (array) – An array containing the distributions in this instance of
SumDists()par_idxs (array) – An array of arrays, each array contains the indices of the parameters in the
SumDists()call that correspond to that distributionpar_size (int) – The size of the single parameter index array
- Returns:
param_names – An array containing the required parameter names
- Return type:
array
pygama.math.functions.triple_gauss_on_double_step module¶
Provide a convenience function for three gaussians on two steps
- param x_lo:
The lower range to compute the normalization of the step functions
- param x_hi:
The upper range to compute the normalization of the step functions
- param area1:
The area, location, and scale of the first Gaussian
- param mu_1:
The area, location, and scale of the first Gaussian
- param sigma_1:
The area, location, and scale of the first Gaussian
- param area2:
The area, location, and scale of the second Gaussian
- param mu_2:
The area, location, and scale of the second Gaussian
- param sigma_2:
The area, location, and scale of the second Gaussian
- param area3:
The area, location, and scale of the third Gaussian
- param mu_3:
The area, location, and scale of the third Gaussian
- param sigma_3:
The area, location, and scale of the third Gaussian
- param area4:
The area and height of the first step function
- param hstep_1:
The area and height of the first step function
- param area5:
The area and height of the second step function
- param hstep_2:
The area and height of the second step function
Example
triple_gauss_on_double_step.get_pdf(x, pars = [x_lo, x_hi, n_sig1, mu1, sigma1, n_sig2, mu2, sigma2, n_sig3, mu3, sigma3, n_bkg1, hstep1, n_bkg2, hstep2])
- returns:
triple_gauss_on_double_step – A subclass of SumDists and rv_continuous, has methods of pdf, cdf, etc.
Notes
The first step function shares the mu_1, sigma_1 with the first Gaussian, and the second step function shares the mu_2, sigma_2 with the second Gaussian
pygama.math.functions.uniform module¶
Uniform distributions for pygama
- class pygama.math.functions.uniform.UniformGen(*args, **kwargs)¶
Bases:
PygamaContinuous- _argcheck(a, b)¶
Default check for correct values on args and keywords.
- Returns condition array of 1’s where arguments are correct and
0’s where they are not.
- _get_support(a, b)¶
Return the support of the (unscaled, unshifted) distribution.
Must be overridden by distributions which have support dependent upon the shape parameters of the distribution. Any such override must not set or change any of the class members, as these members are shared amongst all instances of the distribution.
- Parameters:
arg1 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information).
arg2 (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information).
... (array_like) – The shape parameter(s) for the distribution (see docstring of the instance object for more information).
- Returns:
a, b (numeric (float, or int or +/-np.inf)) – end-points of the distribution’s support for the specified shape parameters.
- pygama.math.functions.uniform.nb_uniform_cdf(x, a, b)¶
Normalised uniform cumulative distribution, w/ args: a, b. Its range of support is \(x\in[a,b]\). The cdf is computed as:
\[\begin{split}cdf(x, a, b) = \begin{cases} 0 \quad , x<a \\ \frac{x-a}{b-a} \quad , a\leq x\leq b \\ 1 \quad , x>b \end{cases}\end{split}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.uniform.nb_uniform_pdf(x, a, b)¶
Normalised uniform probability density function, w/ args: a, b. Its range of support is \(x\in[a,b]\). The pdf is computed as:
\[\begin{split}pdf(x, a, b) = \begin{cases} \frac{1}{b-a} \quad , a\leq x\leq b \\ 0 \quad , \text{otherwise} \end{cases}\end{split}\]As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.uniform.nb_uniform_scaled_cdf(x, area, a, b)¶
Uniform cdf scaled by the area, used for extended binned fits As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.
- pygama.math.functions.uniform.nb_uniform_scaled_pdf(x, area, a, b)¶
Scaled uniform probability distribution, w/ args: a, b. As a Numba JIT function, it runs slightly faster than ‘out of the box’ functions.