pygama.lgdo package#
Pygama works with “LEGEND Data Objects” (LGDO) defined in the LEGEND data
format specification.
This subpackage serves as the Python implementation of that specification. The
general strategy for the implementation is to dress standard Python and NumPy
objects with an attr dictionary holding LGDO metadata, plus some convenience
functions. The basic data object classes are:
Scalar: typed Python scalar. Access data via the value attribute.
Array: basic numpy.ndarray. Access data via the nda attribute.
FixedSizeArray: basic numpy.ndarray. Access data via the nda attribute.
ArrayOfEqualSizedArrays: multi-dimensional numpy.ndarray. Access data via the nda attribute.
VectorOfVectors: a variable-length array of variable-length arrays. Implemented as a pair of Array objects: flattened_data, holding the raw data, and cumulative_length, whose i-th element is the sum of the lengths of the vectors with index <= i.
Struct: a dictionary containing LGDO objects. Derives from dict.
Table: a Struct whose elements (“columns”) are all array types with the same length (number of rows).
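The VectorOfVectors layout can be illustrated with plain Python lists. This is only a sketch of the flattened_data / cumulative_length bookkeeping, not the pygama implementation; the helper name get_vector is hypothetical.

```python
# Three vectors of different lengths: [1, 2], [3], [4, 5, 6]
flattened_data = [1, 2, 3, 4, 5, 6]
# cumulative_length[i] is the summed length of vectors 0..i
cumulative_length = [2, 3, 6]


def get_vector(i):
    """Return the i-th vector by slicing flattened_data (hypothetical helper)."""
    start = cumulative_length[i - 1] if i > 0 else 0
    stop = cumulative_length[i]
    return flattened_data[start:stop]


vectors = [get_vector(i) for i in range(len(cumulative_length))]
print(vectors)
```

Storing only end offsets keeps the data contiguous, and the start of vector i is simply the end of vector i - 1.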
Currently the primary on-disk format for LGDO objects is LEGEND HDF5 (LH5) files. I/O
is done via the class lh5_store.LH5Store. LH5 files can also be
browsed easily in Python like any HDF5 file using
h5py.
Submodules#
pygama.lgdo.array module#
Implements a LEGEND Data Object representing an n-dimensional array and corresponding utilities.
- class pygama.lgdo.array.Array(nda: Optional[ndarray] = None, shape: tuple[int, ...] = (), dtype: Optional[dtype] = None, fill_val: Optional[Union[float, int]] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: object
Holds a numpy.ndarray and attributes.
Array (and the other various array types) holds an nda instead of deriving from numpy.ndarray for the following reasons:
It keeps management of the nda totally under the control of the user. The user can point it to another object’s buffer, grab the nda and toss the Array, etc.
It allows the management code to send just the ndas to the central routines for data manipulation. Keeping LGDOs out of that code allows for more standard, reusable, and (we expect) performant Python.
It allows the first axis of the nda to be treated as “special” for storage in Tables.
- Parameters:
nda (np.ndarray) – A numpy.ndarray to be used for this object’s internal array. Note: the array is used directly, not copied. If not supplied, internal memory is newly allocated based on the shape and dtype arguments.
shape (tuple[int, ...]) – A NumPy-format shape specification for the shape of the internal ndarray. Required if nda is None, otherwise unused.
dtype (np.dtype) – Specifies the type of the data in the array. Required if nda is None, otherwise unused.
fill_val (float | int) – If None, memory is allocated without initialization. Otherwise, the array is allocated with all elements set to the corresponding fill value. If nda is not None, this parameter is ignored.
attrs (dict[str, Any]) – A set of user attributes to be carried along with this LGDO.
pygama.lgdo.arrayofequalsizedarrays module#
Implements a LEGEND Data Object representing an array of equal-sized arrays and corresponding utilities.
- class pygama.lgdo.arrayofequalsizedarrays.ArrayOfEqualSizedArrays(dims: Optional[tuple[int, ...]] = None, nda: Optional[ndarray] = None, shape: tuple[int, ...] = (), dtype: Optional[dtype] = None, fill_val: Optional[Union[float, int]] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: Array
An array of equal-sized arrays.
The arrays are of equal size within a file, but the size could differ from application to application. Canonical example: an array of same-length waveforms.
- Parameters:
dims (tuple[int, ...]) – specifies the dimensions required for building the ArrayOfEqualSizedArrays’ datatype attribute.
nda (numpy.ndarray) – A numpy.ndarray to be used for this object’s internal array. Note: the array is used directly, not copied. If not supplied, internal memory is newly allocated based on the shape and dtype arguments.
shape (tuple[int, ...]) – A NumPy-format shape specification for the shape of the internal array. Required if nda is None, otherwise unused.
dtype (numpy.dtype) – Specifies the type of the data in the array. Required if nda is None, otherwise unused.
fill_val (int | float) – If None, memory is allocated without initialization. Otherwise, the array is allocated with all elements set to the corresponding fill value. If nda is not None, this parameter is ignored.
attrs (dict[str, Any]) – A set of user attributes to be carried along with this LGDO.
Notes
If shape is not “1D array of arrays of shape given by axes 1-N” (of nda) then specify the dimensionality split in the constructor.
See also
pygama.lgdo.fixedsizearray module#
Implements a LEGEND Data Object representing an n-dimensional array of fixed size and corresponding utilities.
- class pygama.lgdo.fixedsizearray.FixedSizeArray(nda: Optional[ndarray] = None, shape: tuple[int, ...] = (), dtype: Optional[dtype] = None, fill_val: Optional[Union[float, int]] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: Array
An array of fixed-size arrays.
Arrays with guaranteed shape along axes > 0: for example, an array of vectors will always have length 3 on axis 1, and it will never change from application to application. This data type is used for optimized memory handling on some platforms. We are not that sophisticated, so we just store this identification for LGDO validity, i.e. for now this class is just an alias for Array, but keeps track of the datatype name.
See also
pygama.lgdo.lgdo_utils module#
Implements utilities for LEGEND Data Objects.
- pygama.lgdo.lgdo_utils.expand_path(path: str, list: bool = False) str | list#
Expand environment variables and wildcards to return absolute path
- Parameters:
- Returns:
path or list of paths – Unique absolute path, or list of all absolute paths
- Return type:
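As a rough illustration of what expanding environment variables and wildcards involves, here is a sketch built only on the standard library. It is not the pygama implementation: the name expand_path_sketch, the list_all argument, and the choice to raise when a single path is requested but the match is not unique are all assumptions.

```python
import glob
import os


def expand_path_sketch(path, list_all=False):
    """Expand environment variables and wildcards (hypothetical stand-in)."""
    expanded = os.path.expandvars(os.path.expanduser(path))
    matches = sorted(os.path.abspath(p) for p in glob.glob(expanded))
    if list_all:
        return matches
    # Assumption: a single-path request must resolve to exactly one file.
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one match for {path!r}, found {len(matches)}"
        )
    return matches[0]
```

A call such as expand_path_sketch("$DATA_DIR/*.lh5", list_all=True) would return every matching absolute path, where DATA_DIR is a placeholder environment variable.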
- pygama.lgdo.lgdo_utils.get_element_type(obj: object) str#
Get the LGDO element type of a scalar or array.
For use in LGDO datatype attributes.
- Parameters:
obj (object) – if a str, will automatically return string; if the object has a numpy.dtype, that will be used for determining the element type; otherwise will attempt to cast the type of the object to a numpy.dtype.
- Returns:
element_type – A string stating the determined element type of the object.
- Return type:
- pygama.lgdo.lgdo_utils.parse_datatype(datatype: str) tuple[str, tuple[int, ...], str | list[str]]#
Parse datatype string and return type, dimensions and elements.
- Parameters:
datatype (str) – a LGDO-formatted datatype string.
- Returns:
element_type – the datatype name.
dims – if not None, a tuple of dimensions for the LGDO. Note this is not the same as the NumPy shape of the underlying data object. See the LGDO specification for more information. Also see ArrayOfEqualSizedArrays and lh5_store.LH5Store.read_object() for example code.
elements – for numeric objects, the element type; for struct-like objects, the list of fields in the struct.
- Return type:
pygama.lgdo.lh5_store module#
This module implements routines for reading and writing LEGEND Data Objects in HDF5 files.
- class pygama.lgdo.lh5_store.LH5Iterator(lh5_files: str | list[str], group: str, base_path: str = '', entry_list: Optional[Union[list[int], list[list[int]]]] = None, entry_mask: Optional[Union[list[bool], list[list[bool]]]] = None, field_mask: Optional[Union[dict[str, bool], list[str], tuple[str]]] = None, buffer_len: int = 3200)#
Bases: object
A class for iterating through one or more LH5 files, one block of entries at a time. This also accepts an entry list/mask to enable event selection, and a field mask.
This class can be used either for random access:
>>> lh5_obj, n_rows = lh5_it.read(entry)
to read the block of entries starting at entry. In case of multiple files or the use of an event selection, entry refers to a global event index across files and does not count events that are excluded by the selection.
This can also be used as an iterator:
>>> for lh5_obj, entry, n_rows in LH5Iterator(...):
...     pass  # do the thing!
This is intended for reading a large quantity of data while limiting memory usage (particularly when reading in waveforms!). The lh5_obj that is read by this class is reused in order to avoid reallocation of memory; this means that if you want to hold on to data between reads, you will have to copy it somewhere!
- Parameters:
lh5_files (str | list[str]) – file or files to read from. May include wildcards and environment variables.
group (str) – HDF5 group to read.
base_path (str) – HDF5 path to prepend.
entry_list (list[int] | list[list[int]]) – list of entry numbers to read. If a nested list is provided, expect one top-level list for each file, containing a list of local entries. If a list of ints is provided, use global entries.
entry_mask (list[bool] | list[list[bool]]) – mask of entries to read. If a list of arrays is provided, expect one for each file. Ignored if entry_list is provided.
field_mask (dict[str, bool] | list[str] | tuple[str]) – mask of which fields to read. See LH5Store.read_object() for more details.
buffer_len (int) – number of entries to read at a time while iterating through files.
- read(entry: int) tuple[Union[pygama.lgdo.array.Array, pygama.lgdo.scalar.Scalar, pygama.lgdo.struct.Struct, pygama.lgdo.vectorofvectors.VectorOfVectors], int]#
Read the next chunk of events, starting at entry. Return the LH5 buffer and number of rows read.
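The buffer-reuse behaviour described for LH5Iterator can be mimicked with a small generator over a plain list. This is a sketch of the pattern only: iterate_blocks is a hypothetical stand-in and reads no LH5 data.

```python
def iterate_blocks(data, buffer_len):
    """Yield (buffer, entry, n_rows), reusing one buffer (hypothetical sketch)."""
    buffer = [None] * buffer_len
    for entry in range(0, len(data), buffer_len):
        chunk = data[entry:entry + buffer_len]
        buffer[:len(chunk)] = chunk  # overwrite in place; no new allocation
        yield buffer, entry, len(chunk)


data = list(range(7))
kept = []
for buf, entry, n_rows in iterate_blocks(data, buffer_len=3):
    # Copy the valid rows; keeping `buf` itself would alias the reused buffer.
    kept.append(list(buf[:n_rows]))

print(kept)
```

Only the first n_rows entries of the buffer are valid on each pass, which is why the number of rows is returned alongside the buffer.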
- class pygama.lgdo.lh5_store.LH5Store(base_path: str = '', keep_open: bool = False)#
Bases: object
Class to represent a store of LEGEND HDF5 files. The two main methods implemented by the class are read_object() and write_object().
Examples
>>> from pygama.lgdo import LH5Store
>>> store = LH5Store()
>>> obj, _ = store.read_object("/geds/waveform", "file.lh5")
>>> type(obj)
pygama.lgdo.waveform_table.WaveformTable
- Parameters:
- get_buffer(name: str, lh5_file: str | h5py._hl.files.File | list[str | h5py._hl.files.File], size: Optional[int] = None, field_mask: Optional[Union[dict[str, bool], list[str], tuple[str]]] = None) Union[Array, Scalar, Struct, VectorOfVectors]#
Returns an LH5 object appropriate for use as a pre-allocated buffer in a read loop. Sets size to size if object has a size.
- Return type:
- gimme_file(lh5_file: str | h5py._hl.files.File, mode: str = 'r') File#
Returns an h5py file object from the store or creates a new one.
- gimme_group(group: str, base_group: Group, grp_attrs: Optional[dict[str, Any]] = None, overwrite: bool = False) Group#
Returns an existing h5py group from a base group or creates a new one. Can also set (or replace) group attributes.
- read_n_rows(name: str, lh5_file: str | h5py._hl.files.File) int | None#
Look up the number of rows in an Array-like object called name in lh5_file.
Return None if it is a Scalar or a Struct.
- Return type:
int | None
- read_object(name: str, lh5_file: str | h5py._hl.files.File | list[str | h5py._hl.files.File], start_row: int = 0, n_rows: int = 9223372036854775807, idx: Optional[Union[ndarray, list, tuple, list[numpy.ndarray | list | tuple]]] = None, field_mask: Optional[Union[dict[str, bool], list[str], tuple[str]]] = None, obj_buf: Optional[Union[Array, Scalar, Struct, VectorOfVectors]] = None, obj_buf_start: int = 0) tuple[Union[pygama.lgdo.array.Array, pygama.lgdo.scalar.Scalar, pygama.lgdo.struct.Struct, pygama.lgdo.vectorofvectors.VectorOfVectors], int]#
Read LH5 object data from a file.
- Parameters:
name (str) – Name of the LH5 object to be read (including its group path).
lh5_file (str | h5py._hl.files.File | list[str | h5py._hl.files.File]) – The file(s) containing the object to be read out. If a list of files, array-like object data will be concatenated into the output object.
start_row (int) – Starting entry for the object read (for array-like objects). For a list of files, only applies to the first file.
n_rows (int) – The maximum number of rows to read (for array-like objects). The actual number of rows read will be returned as one of the return values (see below).
idx (Optional[Union[ndarray, list, tuple, list[numpy.ndarray | list | tuple]]]) – For NumPy-style “fancy indexing” for the read. Used to read out rows that pass some selection criteria. Only selection along the first axis is supported, so tuple arguments must be one-tuples. If n_rows is not false, idx will be truncated to n_rows before reading. To use with a list of files, pass in a list of idx’s (one for each file) or use a long contiguous list (e.g. built from a previous identical read). If used in conjunction with start_row and n_rows, idx will be sliced to obey those constraints, where n_rows is interpreted as the (max) number of selected values (in idx) to be read out.
field_mask (Optional[Union[dict[str, bool], list[str], tuple[str]]]) – For tables and structs, determines which fields get read out. Only applies to immediate fields of the requested objects. If a dict is used, a default dict will be made with the default set to the opposite of the first element in the dict. This way if one specifies a few fields as False, all but those fields will be read out, while if one specifies just a few fields as True, only those fields will be read out. If a list is provided, the listed fields will be set to True, while the rest will default to False.
obj_buf (Optional[Union[Array, Scalar, Struct, VectorOfVectors]]) – Read directly into memory provided in obj_buf. Note: the buffer will be expanded to accommodate the data requested. To maintain the buffer length, send in n_rows = len(obj_buf).
obj_buf_start (int) – Start location in obj_buf for read. For concatenating data to array-like objects.
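The field_mask rule (a dict whose default is the opposite of its first value, or a list of fields to keep) can be sketched as follows. normalize_field_mask is a hypothetical helper written for illustration, and treating a missing or empty mask as "read everything" is an assumption rather than documented behaviour.

```python
from collections import defaultdict


def normalize_field_mask(field_mask):
    """Turn a dict/list/tuple mask into a defaultdict of booleans (hypothetical)."""
    if not field_mask:
        # Assumption: no mask (None or empty) means every field is read.
        return defaultdict(lambda: True)
    if isinstance(field_mask, dict):
        # Default is the opposite of the first value in the dict.
        default = not next(iter(field_mask.values()))
        return defaultdict(lambda: default, field_mask)
    # A list or tuple names the fields to read; everything else is skipped.
    return defaultdict(lambda: False, {name: True for name in field_mask})


fields = ["energy", "timestamp", "waveform"]
drop_wf = normalize_field_mask({"waveform": False})
only_energy = normalize_field_mask(["energy"])
print([f for f in fields if drop_wf[f]])
print([f for f in fields if only_energy[f]])
```

With this rule, {"waveform": False} reads everything except the waveform column, while ["energy"] reads only the energy column.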
- Returns:
(object, n_rows_read) – object is the read-out object; n_rows_read is the number of rows successfully read out. Essential for arrays when the amount of data is smaller than the object buffer. For scalars and structs n_rows_read will be 1. For tables it is redundant with table.loc.
- Return type:
tuple[Union[pygama.lgdo.array.Array, pygama.lgdo.scalar.Scalar, pygama.lgdo.struct.Struct, pygama.lgdo.vectorofvectors.VectorOfVectors], int]
- write_object(obj: Union[Array, Scalar, Struct, VectorOfVectors], name: str, lh5_file: str | h5py._hl.files.File, group: str | h5py._hl.group.Group = '/', start_row: int = 0, n_rows: Optional[int] = None, wo_mode: str = 'append', write_start: int = 0) None#
Write an LGDO into an LH5 file.
- Parameters:
obj (Union[Array, Scalar, Struct, VectorOfVectors]) – LH5 object. If the object is array-like, writes n_rows starting from start_row in obj.
name (str) – name of the object in the output HDF5 file.
lh5_file (str | h5py._hl.files.File) – HDF5 file name or h5py.File object.
group (str | h5py._hl.group.Group) – HDF5 group name or h5py.Group object in which obj should be written.
start_row (int) – first row in obj to be written.
n_rows (Optional[int]) – number of rows in obj to be written.
wo_mode (str) –
write_safe or w: only proceed with writing if the object does not already exist in the file.
append or a: append along axis 0 (the first dimension) of array-like objects and array-like subfields of structs. Scalar objects get overwritten.
overwrite or o: replace data in the file if present, starting from write_start. Note: overwriting with write_start = end of array is the same as append.
overwrite_file or of: delete file if present prior to writing to it. write_start should be 0 (it is ignored).
write_start (int) – row in the output file (if already existing) to start overwriting from.
- pygama.lgdo.lh5_store._make_fd_idx(starts, stops, idx)#
- pygama.lgdo.lh5_store.load_dfs(f_list: str | list[str], par_list: list[str], lh5_group: str = '', idx_list: Optional[list[numpy.ndarray | list | tuple]] = None) DataFrame#
Build a pandas.DataFrame from LH5 data.
Given a list of files (can use wildcards), a list of LH5 columns, and optionally the group path, return a pandas.DataFrame with all values for each parameter.
See also
- Returns:
dataframe – contains columns for each parameter in par_list, and rows containing all data for the associated parameters concatenated over all files in f_list.
- Return type:
- pygama.lgdo.lh5_store.load_nda(f_list: str | list[str], par_list: list[str], lh5_group: str = '', idx_list: Optional[list[numpy.ndarray | list | tuple]] = None) dict[str, numpy.ndarray]#
Build a dictionary of numpy.ndarrays from LH5 data.
Given a list of files, a list of LH5 table parameters, and an optional group path, return a dictionary of NumPy arrays with all values for each parameter.
- Parameters:
f_list (str | list[str]) – A list of files. Can contain wildcards.
par_list (list[str]) – A list of parameters to read from each file.
lh5_group (str) – group path within which to find the specified parameters.
idx_list (Optional[list[numpy.ndarray | list | tuple]]) – for fancy-indexed reads. Must be one index array for each file in f_list.
- Returns:
par_data – A dictionary of the parameter data keyed by the elements of par_list. Each entry contains the data for the specified parameter concatenated over all files in f_list.
- Return type:
- pygama.lgdo.lh5_store.ls(lh5_file: str | h5py._hl.group.Group, lh5_group: str = '') list[str]#
Return a list of LH5 groups in the input file and group, similar to ls or h5ls. Supports wildcards in group names.
- pygama.lgdo.lh5_store.show(lh5_file: str | h5py._hl.group.Group, lh5_group: str = '/', indent: str = '', header: bool = True) None#
Print a tree of LH5 file contents with LGDO datatype.
- Parameters:
Examples
>>> from pygama.lgdo import show
>>> show("file.lh5", "/geds/raw")
/geds/raw
├── channel · array<1>{real}
├── energy · array<1>{real}
├── timestamp · array<1>{real}
├── waveform · table{t0,dt,values}
│   ├── dt · array<1>{real}
│   ├── t0 · array<1>{real}
│   └── values · array_of_equalsized_arrays<1,1>{real}
└── wf_std · array<1>{real}
pygama.lgdo.scalar module#
Implements a LEGEND Data Object representing a scalar and corresponding utilities.
pygama.lgdo.struct module#
Implements a LEGEND Data Object representing a struct and corresponding utilities.
- class pygama.lgdo.struct.Struct(obj_dict: Optional[dict[str, Union[pygama.lgdo.scalar.Scalar, pygama.lgdo.array.Array, pygama.lgdo.vectorofvectors.VectorOfVectors, pygama.lgdo.struct.Struct]]] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: dict
A dictionary of LGDOs with an optional set of attributes.
After instantiation, add fields using add_field() to keep the datatype updated, or call update_datatype() after adding.
- Parameters:
- add_field(name: str, obj: Union[Scalar, Array, VectorOfVectors, Struct]) None#
Add a field to the struct.
pygama.lgdo.table module#
Implements a LEGEND Data Object representing a special struct of arrays of equal length and corresponding utilities.
- class pygama.lgdo.table.Table(size: Optional[int] = None, col_dict: Optional[dict[str, Union[pygama.lgdo.scalar.Scalar, pygama.lgdo.array.Array, pygama.lgdo.vectorofvectors.VectorOfVectors, pygama.lgdo.struct.Struct]]] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: Struct
A special struct of arrays or subtable columns of equal length.
Holds onto an internal read/write location loc that is useful in managing table I/O using functions like push_row(), is_full(), and clear().
Note
If you write to a table and don’t fill it up to its total size, be sure to resize it before passing to data processing functions, as they will call __len__() to access valid data, which returns the size attribute.
- Parameters:
size (int) – sets the number of rows in the table. Arrays in col_dict will be resized to match size if both are not None. If size is left as None, the number of table rows is determined from the length of the first array in col_dict. If neither is provided, a default length of 1024 is used.
col_dict (dict[str, LGDO]) – instantiate this table using the supplied named array-like LGDOs. Note 1: no copy is performed, the objects are used directly. Note 2: if size is not None, all arrays will be resized to match it. Note 3: if the arrays have different lengths, all will be resized to match the length of the first array.
attrs (dict[str, Any]) – A set of user attributes to be carried along with this LGDO.
Notes
The loc attribute is initialized to 0.
- add_column(name: str, obj: Union[Scalar, Struct, Array, VectorOfVectors], use_obj_size: bool = False, do_warn: bool = True) None#
Alias for add_field() using table terminology ‘column’.
- add_field(name: str, obj: Union[Scalar, Struct, Array, VectorOfVectors], use_obj_size: bool = False, do_warn=True) None#
Add a field (column) to the table.
Use the name “field” here to match the terminology used in Struct.
- Parameters:
name (str) – the name for the field in the table.
obj (Union[Scalar, Struct, Array, VectorOfVectors]) – the object to be added to the table.
use_obj_size (bool) – if True, resize the table to match the length of obj.
do_warn – print or don’t print useful info. Passed to resize() when use_obj_size is True.
- clear() None. Remove all items from D.#
- eval(expr_config: dict) Table#
Apply column operations to the table and return a new table holding the resulting columns.
Currently delegates all the work to numexpr.evaluate(). This might change in the future.
- Parameters:
expr_config (dict) –
dictionary that configures expressions according to the following specification:
{
  "O1": {
    "expression": "p1 + p2 * a**2",
    "parameters": {
      "p1": "2",
      "p2": "3"
    }
  },
  "O2": {
    "expression": "O1 - b"
  }
  // ...
}
where:
expression is an expression string supported by numexpr.evaluate() (see also the numexpr documentation). Note: because of internal limitations, reduction operations must appear last in the stack.
parameters is a dictionary of function parameters. Passed to numexpr.evaluate() as the local_dict argument.
- Return type:
Warning
Blocks in expr_config must be ordered according to mutual dependency.
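To show why the blocks must be ordered by dependency, here is a pure-Python sketch that evaluates an expr_config row by row. It uses the built-in eval as a stand-in for numexpr.evaluate(), so it is not how Table.eval() is implemented; eval_columns is a hypothetical name, and converting the string parameters with float() is an assumption.

```python
def eval_columns(columns, expr_config):
    """Evaluate each block in order, row by row (hypothetical stand-in)."""
    n_rows = len(next(iter(columns.values())))
    out = {}
    for name, spec in expr_config.items():
        # Assumption: parameter strings are numeric literals.
        params = {k: float(v) for k, v in spec.get("parameters", {}).items()}
        values = []
        for i in range(n_rows):
            # Input columns, earlier outputs, and parameters are all visible.
            scope = {k: col[i] for k, col in columns.items()}
            scope.update({k: col[i] for k, col in out.items()})
            scope.update(params)
            # Built-in eval is only acceptable here for trusted expressions.
            values.append(eval(spec["expression"], {"__builtins__": {}}, scope))
        out[name] = values
    return out


columns = {"a": [1.0, 2.0], "b": [10.0, 20.0]}
expr_config = {
    "O1": {"expression": "p1 + p2 * a**2", "parameters": {"p1": "2", "p2": "3"}},
    "O2": {"expression": "O1 - b"},
}
result = eval_columns(columns, expr_config)
print(result)
```

Because "O2" refers to "O1", swapping the two blocks would fail in this sketch with a NameError, which mirrors the ordering requirement stated in the warning.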
- get_dataframe(cols: Optional[list[str]] = None, copy: bool = False) DataFrame#
Get a pandas.DataFrame from the data in the table.
Notes
The requested data must be array-like, with the nda attribute.
- Parameters:
- Return type:
- join(other_table: Table, cols: Optional[list[str]] = None, do_warn: bool = True) None#
Add the columns of another table to this table.
Notes
Following the join, both tables have access to other_table’s fields (but other_table doesn’t have access to this table’s fields). No memory is allocated in this process. other_table can go out of scope and this table will retain access to the joined data.
- Parameters:
other_table (Table) – the table whose columns are to be joined into this table.
cols (Optional[list[str]]) – a list of names of columns from other_table to be joined into this table.
do_warn (bool) – set to False to turn off warnings associated with mismatched loc parameter or add_column() warnings.
- remove_column(name: str, delete: bool = False) None#
Alias for remove_field() using table terminology ‘column’.
pygama.lgdo.vectorofvectors module#
Implements a LEGEND Data Object representing a variable-length array of variable-length arrays and corresponding utilities.
- class pygama.lgdo.vectorofvectors.VectorOfVectors(flattened_data: Optional[Array] = None, cumulative_length: Optional[Array] = None, shape_guess: Optional[tuple[int, int]] = None, dtype: Optional[dtype] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: object
A variable-length array of variable-length arrays.
For now only a 1D vector of 1D vectors is supported. Internal representation is as two NumPy arrays, one to store the flattened data contiguously and one to store the cumulative sum of lengths of each vector.
- Parameters:
flattened_data (Array) – If not None, used as the internal memory array for flattened_data. Otherwise, an internal flattened_data is allocated based on shape_guess and dtype.
cumulative_length (Array) – If not None, used as the internal memory array for cumulative_length. Should be dtype numpy.uint32. If cumulative_length is None, an internal cumulative_length is allocated based on the first element of shape_guess.
shape_guess (tuple[int, int]) – A NumPy-format shape specification, required if either flattened_data or cumulative_length is not supplied. The first element should not be a guess and sets the number of vectors to be stored. The second element is a guess or approximation of the typical length of a stored vector, used to set the initial length of flattened_data if it was not supplied.
dtype (np.dtype) – Sets the type of data stored in flattened_data. Required if flattened_data is None.
attrs (dict[str, Any]) – A set of user attributes to be carried along with this LGDO.
- set_vector(i_vec: int, nda: ndarray) None#
Insert vector nda at location i_vec.
Notes
flattened_data is doubled in length until nda can be appended to it.
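The doubling strategy mentioned in the note can be sketched with plain lists. This is an illustration under simplifying assumptions (vectors are filled in order, and unused slots hold 0), not the pygama code; set_vector_sketch is a hypothetical name.

```python
def set_vector_sketch(flattened_data, cumulative_length, i_vec, vec):
    """Store vec as vector i_vec, growing flattened_data by doubling."""
    # Assumption: vectors 0..i_vec-1 have already been set.
    start = cumulative_length[i_vec - 1] if i_vec > 0 else 0
    stop = start + len(vec)
    while stop > len(flattened_data):
        # Double the capacity (add at least one slot) until vec fits.
        flattened_data.extend([0] * max(len(flattened_data), 1))
    flattened_data[start:stop] = vec
    cumulative_length[i_vec] = stop


flattened_data = [0] * 2
cumulative_length = [0] * 3
for i, vec in enumerate([[1, 2], [3], [4, 5, 6]]):
    set_vector_sketch(flattened_data, cumulative_length, i, vec)

print(cumulative_length, len(flattened_data))
```

Doubling keeps the number of reallocations logarithmic in the final size, at the cost of some unused capacity at the end of flattened_data.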
- to_aoesa() ArrayOfEqualSizedArrays#
Convert to ArrayOfEqualSizedArrays, padding with NaNs
- Return type:
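Conceptually, the conversion turns the ragged vectors into a rectangular block and fills the gaps with NaN. The sketch below does this with plain lists; to_padded_rows is a hypothetical name, and padding every row to the length of the longest vector is an assumption about the target shape.

```python
import math


def to_padded_rows(flattened_data, cumulative_length):
    """Return equal-length rows, padding short vectors with NaN (sketch)."""
    starts = [0] + list(cumulative_length[:-1])
    rows = [flattened_data[a:b] for a, b in zip(starts, cumulative_length)]
    width = max((len(r) for r in rows), default=0)
    return [list(r) + [math.nan] * (width - len(r)) for r in rows]


padded = to_padded_rows([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [2, 3, 6])
for row in padded:
    print(row)
```

NaN padding only works for floating-point data, so integer vectors would need to be converted to a float type first.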
- pygama.lgdo.vectorofvectors.build_cl(sorted_array_in: Array, cumulative_length_out: Optional[ndarray] = None) ndarray#
Build a cumulative_length array from an array of sorted data.
For example, if sorted_array_in contains [ 3, 3, 3, 4 ], this would return [ 3, 4 ].
For a sorted_array_in of indices, this is the inverse of explode_cl() below, in the sense that doing build_cl(explode_cl(cumulative_length)) would recover the original cumulative_length.
- Parameters:
sorted_array_in (Array) – Array of data already sorted; each N matching contiguous entries will be converted into a new row of cumulative_length_out
cumulative_length_out (Optional[ndarray]) – This is an optional pre-allocated array for the output cumulative_length. It will always have length <= sorted_array_in, so giving them the same length is safe if there is not a better guess.
- Returns:
cumulative_length_out – The output cumulative_length. If the user provides a cumulative_length_out that is too long, this return value is sliced to contain only the used portion of the allocated memory
- Return type:
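A pure-Python sketch of the idea, using the convention from the package overview that element i of cumulative_length is the summed length of rows 0 through i. The name build_cl_sketch is hypothetical, and the real function works on arrays with an optional pre-allocated output.

```python
def build_cl_sketch(sorted_array_in):
    """Build cumulative lengths from runs of equal values (hypothetical sketch)."""
    cumulative_length = []
    for i, val in enumerate(sorted_array_in):
        if i > 0 and val == sorted_array_in[i - 1]:
            # Same value as before: the current row grows by one entry.
            cumulative_length[-1] += 1
        else:
            # New value: start a new row on top of the running total.
            prev = cumulative_length[-1] if cumulative_length else 0
            cumulative_length.append(prev + 1)
    return cumulative_length


print(build_cl_sketch([3, 3, 3, 4]))
```

The output can never be longer than the input, since each input entry either extends a row or starts exactly one new row.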
- pygama.lgdo.vectorofvectors.explode(cumulative_length: Array, array_in: Array, array_out: Optional[ndarray] = None) ndarray#
Explode a data array using a cumulative_length array.
This is identical to explode_cl(), except array_in gets exploded instead of cumulative_length. For example, if array_in = [ 3, 4 ] and cumulative_length = [ 2, 3 ], array_out would be [ 3, 3, 4 ].
- Parameters:
- Return type:
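The operation can be sketched in pure Python as repeating each value of array_in once per entry of its row. This assumes that element i of cumulative_length is the summed length of rows 0 through i; explode_sketch is a hypothetical name.

```python
def explode_sketch(cumulative_length, array_in):
    """Repeat array_in[i] once for every entry of row i (hypothetical sketch)."""
    array_out = []
    start = 0
    for stop, value in zip(cumulative_length, array_in):
        array_out.extend([value] * (stop - start))
        start = stop
    return array_out


exploded = explode_sketch([2, 3], [3, 4])
print(exploded)
```

The result always has cumulative_length[-1] entries, which is the size a pre-allocated output array would need.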
- pygama.lgdo.vectorofvectors.explode_arrays(cumulative_length: Array, arrays: list, out_arrays: Optional[list] = None) list#
explode a set of arrays using a cumulative_length array
- Parameters:
cumulative_length (Array) – the cumulative_length array to use for exploding
arrays (list) – the data arrays to be exploded. Each array must have same length as cumulative_length
out_arrays (Optional[list]) – an optional list of pre-allocated arrays to hold the exploded data. The length of the list should be equal to the number of arrays, and each entry in out_arrays should have length cumulative_length[-1]. If not provided, output arrays are allocated for the user.
- Return type:
- pygama.lgdo.vectorofvectors.explode_cl(cumulative_length: Array, array_out: Optional[ndarray] = None) ndarray#
Explode a cumulative_length array.
For example, if cumulative_length is [ 2, 3 ], this would return [ 0, 0, 1 ].
This is the inverse of build_cl() above, in the sense that doing build_cl(explode_cl(cumulative_length)) would recover the original cumulative_length.
- Parameters:
- Returns:
array_out – the exploded cumulative_length array
- Return type:
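A pure-Python sketch of this operation and of the round trip back to cumulative lengths. It assumes that element i of cumulative_length is the summed length of rows 0 through i. explode_cl_sketch is a hypothetical name, and the rebuild step uses itertools rather than build_cl() itself.

```python
from itertools import accumulate, groupby


def explode_cl_sketch(cumulative_length):
    """Return the row index of every flattened entry (hypothetical sketch)."""
    array_out = []
    start = 0
    for row, stop in enumerate(cumulative_length):
        array_out.extend([row] * (stop - start))
        start = stop
    return array_out


indices = explode_cl_sketch([2, 3])
# Rebuild the cumulative lengths by counting runs of equal row indices.
rebuilt = list(accumulate(len(list(run)) for _, run in groupby(indices)))
print(indices, rebuilt)
```

The round trip only recovers the original when no row is empty, because an empty row leaves no index behind to count.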
- pygama.lgdo.vectorofvectors.nb_build_cl(sorted_array_in: np.ndarray, cumulative_length_out: np.ndarray) np.ndarray#
Numba-compiled inner loop for build_cl().
- Return type:
np.ndarray
- pygama.lgdo.vectorofvectors.nb_explode(cumulative_length: np.ndarray, array_in: np.ndarray, array_out: np.ndarray) np.ndarray#
Numba-compiled inner loop for explode().
- Return type:
np.ndarray
- pygama.lgdo.vectorofvectors.nb_explode_cl(cumulative_length: np.ndarray, array_out: np.ndarray) np.ndarray#
Numba-compiled inner loop for explode_cl().
- Return type:
np.ndarray
pygama.lgdo.waveform_table module#
Implements a LEGEND Data Object representing a special
Table to store blocks of one-dimensional time-series
data.
- class pygama.lgdo.waveform_table.WaveformTable(size: Optional[int] = None, t0: float | pygama.lgdo.array.Array | numpy.ndarray = 0, t0_units: Optional[str] = None, dt: float | pygama.lgdo.array.Array | numpy.ndarray = 1, dt_units: Optional[str] = None, values: Optional[Union[ArrayOfEqualSizedArrays, VectorOfVectors, ndarray]] = None, values_units: Optional[str] = None, values_adc_bit_depth: Optional[int] = None, wf_len: Optional[int] = None, dtype: Optional[dtype] = None, attrs: Optional[dict[str, Any]] = None)#
Bases: Table
An LGDO for storing blocks of (1D) time-series data.
A WaveformTable is an LGDO Table with the 3 columns t0, dt, and values:
t0[i] is a time offset (relative to a user-defined global reference) for the sample in values[i][0]. Implemented as an LGDO Array with optional attribute units.
dt[i] is the sampling period for the waveform at values[i]. Implemented as an LGDO Array with optional attribute units.
values[i] is the i’th waveform in the table. Internally, the waveform values may be either an LGDO ArrayOfEqualSizedArrays<1,1> or an LGDO VectorOfVectors that supports waveforms of unequal length. Can optionally be given a units attribute, as well as an adc_bit_depth attribute.
Note
On-disk and in-memory versions could be different e.g. if a compression routine is used.
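Given the column definitions, the time of sample j in waveform i follows as t0[i] + j * dt[i]. The sketch below works this out with plain lists; sample_times is a hypothetical helper and the numbers are made up for illustration.

```python
def sample_times(t0, dt, n_samples):
    """Time of every sample in one waveform: t0 + j * dt (hypothetical helper)."""
    return [t0 + j * dt for j in range(n_samples)]


# Two illustrative waveforms with a 16-unit sampling period.
t0 = [100.0, 250.0]
dt = [16.0, 16.0]
values = [[5, 7, 9], [4, 4, 8]]

times = [sample_times(t0[i], dt[i], len(values[i])) for i in range(len(values))]
print(times)
```

Storing only t0 and dt per waveform avoids keeping a full time axis next to every values row.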
- Parameters:
size (int) – sets the number of rows in the table. If None, the size will be determined from the first among t0, dt, or values to return a valid length. If not None, t0, dt, and values will be resized as necessary to match size. If size is None and t0, dt, and values are all non-array-like, a default size of 1024 is used.
t0 (float | Array | np.ndarray) – \(t_0\) values to be used (or broadcast) to the t0 column.
t0_units (str) – units for the \(t_0\) values. If not None and t0 is an LGDO Array, overrides what’s in t0.
dt (float | Array | np.ndarray) – \(\delta t\) values (sampling period) to be used (or broadcast) to the dt column.
dt_units (str) – units for the dt values. If not None and dt is an LGDO Array, overrides what’s in dt.
values (ArrayOfEqualSizedArrays | VectorOfVectors | np.ndarray) – The waveform data to be stored in the table. If None, a block of data is prepared based on the wf_len and dtype arguments.
values_units (str) – units for the waveform values. If not None and values is an LGDO Array, overrides what’s in values.
values_adc_bit_depth (int) – an integer for storing the ADC bit depth used to record this waveform.
wf_len (int) – The length of the waveforms in each entry of a table. If None (the default), unequal lengths are assumed and VectorOfVectors is used for the values column. Ignored if values is a 2D ndarray, in which case values.shape[1] is used.
dtype (np.dtype) – The NumPy numpy.dtype of the waveform data. If values is not None, this argument is ignored. If both values and dtype are None, numpy.float64 is used.
attrs (dict[str, Any]) – A set of user attributes to be carried along with this LGDO.
- resize_wf_len(new_len: int) None#
Alias for wf_len.setter, for when we want to make it clear in the code that memory is being reallocated.