Package overview

pygama is a Python package developed by the LEGEND collaboration for processing and analysing data from high-purity germanium (HPGe) and liquid-argon (LAr) detector systems. It sits at the higher levels of the LEGEND data processing chain and operates on data that has already been decoded by legend-daq2lh5 and signal-processed by dspeed, its two companion packages.

All data is stored and exchanged in the LEGEND HDF5 (LH5) format via the lgdo library.

Data tiers

The LEGEND processing chain is organised as a sequence of data tiers, each one adding progressively higher-level information:

Tier   Description
-----  -----------
raw    Raw digitiser output decoded to LH5 by legend-daq2lh5.
dsp    Digital-signal-processing (DSP) parameters extracted from waveforms by dspeed: trapezoidal-filter energies, current amplitudes, timestamps, etc.
hit    Per-hit derived quantities (calibrated energy, quality-cut flags, …) produced from the dsp tier by pygama.hit.
tcm    Time Coincidence Map: a lookup table that groups hit-tier rows from different channels into physics events, built by pygama.evt.
evt    Event-level quantities aggregated across all channels that contribute to a single physics event, built by pygama.evt.

Main modules

pygama exposes four main sub-packages, each covering a distinct stage of the processing chain.

pygama.hit — Hit-tier production

Applies user-defined columnar transformations to dsp-tier tables to produce the hit tier. Expressions are given as strings (evaluated via eval()) and configured through a JSON/YAML dictionary, making the tier highly configurable without writing Python code. See Hit-tier production — pygama.hit for a detailed description.
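The idea of string expressions evaluated column-wise can be sketched without pygama itself. The snippet below is a minimal illustration, not pygama's implementation: the column names (trapEmax, A_max), the configuration keys, and the helper build_columns are all assumptions made for this example, and the real configuration schema is described in the detailed pygama.hit documentation.

```python
import numpy as np

# Hypothetical dsp-tier table: one NumPy array per column (names are illustrative).
dsp = {
    "trapEmax": np.array([1000.0, 2000.0, 4000.0]),
    "A_max": np.array([50.0, 120.0, 180.0]),
}

# Illustrative configuration; the key names are assumptions, not pygama's schema.
config = {
    "cal_energy": {
        "expression": "a + b * trapEmax",
        "parameters": {"a": 0.5, "b": 0.25},
    },
    "aoe": {"expression": "A_max / cal_energy"},
    "is_high_energy": {"expression": "cal_energy > 600"},
}


def build_columns(table, config):
    """Evaluate each expression column-wise, in order, and return the new columns."""
    out = {}
    for name, spec in config.items():
        # Names visible to the expression: input columns, earlier outputs, parameters.
        namespace = {**table, **out, **spec.get("parameters", {})}
        # An empty __builtins__ limits what an expression string can reach.
        out[name] = eval(spec["expression"], {"__builtins__": {}}, namespace)
    return out


hit = build_columns(dsp, config)
```

Because later expressions can refer to earlier outputs, a derived column such as aoe can be defined in terms of cal_energy purely in configuration.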

pygama.evt — Event building

Groups hit-level data from multiple channels and multiple detector types into physics events using a Time Coincidence Map (TCM). The main entry points are build_tcm(), which builds the TCM from coincident hits, and build_evt(), which evaluates per-event quantities by aggregating across channels according to a JSON/YAML configuration. Detector-specific processors for HPGe, SiPM and LAr veto subsystems live in the pygama.evt.modules sub-package. See Event building — pygama.evt for a detailed description.
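To make the notion of a coincidence map concrete, here is a small pure-Python sketch. It does not call build_tcm() or build_evt() and makes no claim about how they work internally: the function build_coincidence_map, the chained-window rule, the channel names, and the timestamps are all hypothetical and chosen only to illustrate grouping rows into events and then aggregating per event.

```python
def build_coincidence_map(timestamps_by_channel, window):
    """Group rows from all channels into events.

    A new event starts whenever the gap to the previous hit exceeds `window`.
    Returns a time-ordered list of (event_index, channel, row) tuples.
    """
    # Flatten to (time, channel, row) and sort by time.
    hits = sorted(
        (t, ch, row)
        for ch, times in timestamps_by_channel.items()
        for row, t in enumerate(times)
    )
    tcm = []
    event = -1
    last_t = None
    for t, ch, row in hits:
        if last_t is None or t - last_t > window:
            event += 1
        tcm.append((event, ch, row))
        last_t = t
    return tcm


# Hypothetical per-channel timestamps in microseconds.
timestamps = {
    "ch1": [10.0, 500.0],
    "ch2": [12.0, 900.0],
    "ch3": [501.0],
}
tcm = build_coincidence_map(timestamps, window=5.0)

# Example event-level quantity: multiplicity (rows contributing to each event).
multiplicity = {}
for event, _, _ in tcm:
    multiplicity[event] = multiplicity.get(event, 0) + 1
```

Each tuple records which row of which channel belongs to which event, so event-level quantities can be computed by looking up those rows in the per-channel tables.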

pygama.math — Mathematical utilities

A collection of statistical distributions, histogram helpers, and fitting routines used throughout the package. All probability density and cumulative distribution functions are JIT-compiled with Numba for speed. Binned and unbinned maximum-likelihood fits are implemented on top of iminuit. See Mathematical utilities — pygama.math for a detailed description.
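As a rough illustration of what a binned maximum-likelihood fit minimises, the sketch below builds a Poisson negative log-likelihood for a Gaussian peak. It deliberately uses neither Numba nor iminuit nor any pygama.math function: the helper names are invented for this example, the "data" are noise-free expected counts, and a coarse grid scan stands in for a real minimiser.

```python
import math

import numpy as np


def gauss_cdf(x, mu, sigma):
    """Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def expected_counts(edges, n_total, mu, sigma):
    """Expected counts per bin: n_total times the CDF difference across each bin."""
    cdf = np.array([gauss_cdf(e, mu, sigma) for e in edges])
    return n_total * np.diff(cdf)


def binned_nll(counts, edges, n_total, mu, sigma):
    """Poisson negative log-likelihood without the parameter-independent log(n!) term."""
    lam = expected_counts(edges, n_total, mu, sigma)
    return float(np.sum(lam - counts * np.log(lam)))


# Synthetic "data": the exact expectation for mu=2.0, sigma=0.5 (no statistical noise).
edges = np.linspace(0.0, 4.0, 41)
counts = expected_counts(edges, 1000.0, 2.0, 0.5)

# Coarse grid scan over mu as a dependency-free stand-in for a proper minimiser.
mu_grid = np.linspace(1.5, 2.5, 101)
nll = [binned_nll(counts, edges, 1000.0, mu, 0.5) for mu in mu_grid]
best_mu = float(mu_grid[int(np.argmin(nll))])
```

Integrating the model over each bin through the CDF, rather than evaluating the PDF at bin centres, is one reason a package of this kind would provide both density and cumulative functions.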

pygama.pargen — Parameter generation

Routines for calibrating detector parameters from data: HPGe energy calibration, amplitude-over-energy (A/E) multi-site-event discrimination, late-charge (LQ) cut calibration, DSP-filter optimisation, and data-quality cuts. Results are typically stored as JSON/YAML parameter files that are then consumed by pygama.hit. See Parameter generation — pygama.pargen for a detailed description.
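The hand-off from parameter generation to the hit tier can be pictured with a toy energy calibration. This is only a sketch under stated assumptions: it uses a plain linear least-squares fit with NumPy rather than any pygama.pargen routine, the ADC peak positions are invented, the gamma-line energies are rounded, and the JSON layout and the column name trapEmax are hypothetical.

```python
import json

import numpy as np

# Invented uncalibrated peak positions (ADC) paired with approximate
# gamma-line energies in keV from a thorium calibration spectrum.
peaks_adc = np.array([2330.8, 6368.0, 10456.0])
energies_kev = np.array([583.2, 1592.5, 2614.5])

# Linear calibration: energy = offset + slope * adc.
slope, offset = np.polyfit(peaks_adc, energies_kev, deg=1)

# Parameter-file content; the layout and names are assumptions for this sketch.
params = {
    "expression": "a + b * trapEmax",
    "parameters": {
        "a": round(float(offset), 6),
        "b": round(float(slope), 6),
    },
}
params_json = json.dumps(params, indent=2)
```

Storing the fitted constants next to the expression that uses them keeps the calibration step and the hit-tier step decoupled: the latter only needs to read the file.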

Data flow summary

The diagram below shows how the four main sub-packages interact within the overall processing chain:

legend-daq2lh5          dspeed               pygama
┌─────────────┐       ┌──────────┐     ┌─────────────────────────────────┐
│  raw tier   │──────▶│ dsp tier │────▶│  hit tier   (pygama.hit)        │
│  (decoded)  │─┐     │ (wf      │     │  (calibrated quantities)        │
└─────────────┘ │     │  params) │     └────────────────┬────────────────┘
                │     └──────────┘                      │
                │ pygama.evt.build_tcm                   │ pygama.evt.build_evt
                ▼                                        ▼
        ┌───────────────┐                      ┌────────────────┐
        │   tcm tier    │─────────────────────▶│   evt tier     │
        │ (coincidences)│                      │  (event-level) │
        └───────────────┘                      └────────────────┘

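pygama.math and pygama.pargen do not produce a tier of their own. pygama.pargen generates the parameter files consumed by pygama.hit, and pygama.math provides the distributions, histogram helpers, and fitting routines used throughout the package.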