postprocessor.chainer.Chainer

class Chainer(*args, **kwargs)[source]

Bases: Signal

Extend Signal by applying post-processes and allowing composite signals that combine basic signals. It “chains” multiple processes upon fetching a dataset to produce the desired datasets.

Instead of reading processes previously applied, it executes them when called.
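A minimal usage sketch. The constructor signature is only *args, **kwargs, so the h5-file argument and the dataset path below are assumptions based on the Signal base class; treat both names as placeholders:

>>> from postprocessor.chainer import Chainer
>>> chainer = Chainer("position001.h5")  # assumed: path to a position's h5 file, as for Signal
>>> df = chainer.get("extraction/general/None/volume")  # hypothetical dataset path
>>> chainer.close()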

Attributes
available

Get data sets available in h5 file.

cell_tree
channels

Get channels as an array of strings.

datasets

Print data sets available in h5 file.

max_span
merges

Get merges.

meta_h5

Return metadata, defining it if necessary.

n_merges

Get number of merges.

nstages
ntimepoints

Find the number of time points for one position, or one h5 file.

ntps

Get number of time points from the metadata.

p_available

Print data sets available in h5 file.

picks

Get picks.

stages

Get the contents of the pump with the highest flow rate at each stage.

stages_span

Get consecutive stages and their corresponding number of time points.

stages_span_tp
stem

Get name of h5 file.

switch_times
tinterval

Find the interval between time points (minutes).

Methods

add_name(df, name)

Add column of identical strings to a dataframe.

apply_chain(input_data, chain, **kwargs)

Apply a series of processes to a data set.

apply_prepost(data[, merges, picks])

Apply modifier operations (picker or merger) to a dataframe.

close()

Close the h5 file.

dataset_to_df(f, path)

Get data from h5 file as a dataframe.

get(dataset[, chain, in_minutes, stages, retain])

Load data from an h5 file.

get_consecutives(tree, nstepsback)

Receives a sorted tree and returns the keys of consecutive elements.

get_info_tree([fields])

Return traps, time points and labels for this position in the form of a tree in the hierarchy determined by the argument fields.

get_merged(dataset)

Run preprocessing for merges.

get_merges()

Get merge events going up to the first level.

get_picks([names, path])

Get the relevant picks based on names.

get_raw(dataset[, in_minutes, lineage])

Load data from an h5 file and return as a dataframe.

get_retained(df, cutoff)

Return a fraction of the dataframe, dropping later time points.

lineage([lineage_location, merged])

Get lineage data from a given location in the h5 file.

retained(signal[, cutoff])

Load data (via decorator) and reduce the resulting dataframe.

store_signal_path(fullname, node)

Store the name of a signal if it is a leaf node (a group with no more groups inside) and if it starts with extraction.

cols_in_mins

get_npairs

get_npairs_over_time

__init__(*args, **kwargs)[source]

Define index_names for dataframes, candidate fluorescence channels, and composite statistics.

Methods

__init__(*args, **kwargs)

Define index_names for dataframes, candidate fluorescence channels, and composite statistics.

add_name(df, name)

Add column of identical strings to a dataframe.

apply_chain(input_data, chain, **kwargs)

Apply a series of processes to a data set.

apply_prepost(data[, merges, picks])

Apply modifier operations (picker or merger) to a dataframe.

close()

Close the h5 file.

cols_in_mins(df)

dataset_to_df(f, path)

Get data from h5 file as a dataframe.

get(dataset[, chain, in_minutes, stages, retain])

Load data from an h5 file.

get_consecutives(tree, nstepsback)

Receives a sorted tree and returns the keys of consecutive elements.

get_info_tree([fields])

Return traps, time points and labels for this position in the form of a tree in the hierarchy determined by the argument fields.

get_merged(dataset)

Run preprocessing for merges.

get_merges()

Get merge events going up to the first level.

get_npairs([nstepsback, tree])

get_npairs_over_time([nstepsback])

get_picks([names, path])

Get the relevant picks based on names.

get_raw(dataset[, in_minutes, lineage])

Load data from an h5 file and return as a dataframe.

get_retained(df, cutoff)

Return a fraction of the dataframe, dropping later time points.

lineage([lineage_location, merged])

Get lineage data from a given location in the h5 file.

retained(signal[, cutoff])

Load data (via decorator) and reduce the resulting dataframe.

store_signal_path(fullname, node)

Store the name of a signal if it is a leaf node (a group with no more groups inside) and if it starts with extraction.

Attributes

available

Get data sets available in h5 file.

cell_tree

channels

Get channels as an array of strings.

datasets

Print data sets available in h5 file.

max_span

Return type: int

merges

Get merges.

meta_h5

Return metadata, defining it if necessary.

n_merges

Get number of merges.

nstages

Return type: int

ntimepoints

Find the number of time points for one position, or one h5 file.

ntps

Get number of time points from the metadata.

p_available

Print data sets available in h5 file.

picks

Get picks.

stages

Get the contents of the pump with the highest flow rate at each stage.

stages_span

Get consecutive stages and their corresponding number of time points.

stages_span_tp

Return type: Tuple[Tuple[str, int], ...]

stem

Get name of h5 file.

switch_times

Return type: List[int]

tinterval

Find the interval between time points (minutes).

static add_name(df, name)

Add column of identical strings to a dataframe.

apply_chain(input_data, chain, **kwargs)[source]

Apply a series of processes to a data set.

Like postprocessing, Chainer consecutively applies processes.

Parameters can be passed as kwargs.

Chainer does not support applying the same process multiple times with different parameters.

Parameters
input_data: pd.DataFrame

Input data to process.

chain: t.Tuple[str, ...]

Tuple of strings with the names of the processes.

**kwargs

Arguments passed on to the Process.as_function() method to modify the parameters.

Examples

FIXME: Add docs.
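In the meantime, a minimal sketch, given a Chainer instance chainer as in the class-level example above. The dataset path is hypothetical and the process names are taken from the default chain listed under get() below:

>>> raw = chainer.get_raw("extraction/general/None/volume")  # hypothetical dataset path
>>> processed = chainer.apply_chain(raw, ("interpolate", "savgol"))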

apply_prepost(data, merges=True, picks=True)

Apply modifier operations (picker or merger) to a dataframe.

Parameters
data: t.Union[str, pd.DataFrame]

DataFrame or path to one.

merges: t.Union[np.ndarray, bool]

(optional) 2-D array with three columns: the tile id, the mother label, and the daughter id. If True, fetch merges from file.

picks: t.Union[np.ndarray, bool]

(optional) 2-D array with two columns: the tiles and the cell labels. If True, fetch picks from file.

Examples

FIXME: Add docs.
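In the meantime, a minimal sketch, given a Chainer instance chainer as in the class-level example above. The dataset path is hypothetical; with the defaults, merges and picks are fetched from the h5 file:

>>> df = chainer.get_raw("extraction/general/None/volume")  # hypothetical dataset path
>>> cleaned = chainer.apply_prepost(df, merges=True, picks=True)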

property available

Get data sets available in h5 file.

property channels: Collection[str]

Get channels as an array of strings.

Return type

Collection[str]

close()

Close the h5 file.

dataset_to_df(f, path)

Get data from h5 file as a dataframe.

Return type

DataFrame

property datasets

Print data sets available in h5 file.

get(dataset, chain=('standard', 'interpolate', 'savgol'), in_minutes=True, stages=True, retain=None, **kwargs)[source]

Load data from an h5 file.
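A brief sketch of overriding the defaults, given a Chainer instance chainer as in the class-level example above. The dataset path and parameter choices are illustrative assumptions:

>>> df = chainer.get(
...     "extraction/general/None/volume",   # hypothetical dataset path
...     chain=("standard", "interpolate"),  # apply only part of the default chain
...     in_minutes=True,
...     retain=0.8,  # assumed to be used as the cutoff described for get_retained()
... )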

static get_consecutives(tree, nstepsback)

Receives a sorted tree and returns the keys of consecutive elements.

get_info_tree(fields=('trap', 'timepoint', 'cell_label'))

Return traps, time points and labels for this position in the form of a tree in the hierarchy determined by the argument fields.

Note that it is compressed to non-empty elements and time points.

Default hierarchy is:

- trap
- time point
- cell label

This function currently produces trees of depth 3, but it can easily be extended for deeper trees if needed (e.g. considering groups, chambers and/or positions).

Parameters
fields: list of strs

Fields to fetch from ‘cell_info’ inside the h5 file.

Returns
Nested dictionary where keys (or branches) are the upper levels and the leaves are the last element of fields.
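With the default fields, the result reads as a trap, then time point, then cell label mapping; a brief sketch, given a Chainer instance chainer as above (keys and leaves depend entirely on the data):

>>> tree = chainer.get_info_tree()
>>> # Nested dictionary: {trap: {time point: cell labels}}
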
get_merged(dataset)

Run preprocessing for merges.

get_merges()

Get merge events going up to the first level.

get_picks(names=('trap', 'cell_label'), path='modifiers/picks/')

Get the relevant picks based on names.

Return type

Set[Tuple[int, str]]

get_raw(dataset, in_minutes=True, lineage=False)

Load data from an h5 file and return as a dataframe.

Parameters
dataset: str or list of strs

The name of the h5 file or a list of h5 file names

in_minutes: boolean

If True, convert the time-point columns to minutes.

lineage: boolean

Return type

DataFrame
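A brief sketch, given a Chainer instance chainer as above (the dataset path is a hypothetical example):

>>> raw = chainer.get_raw("extraction/general/None/volume", in_minutes=False)
>>> # Returns the raw dataframe, with no post-processing chain applied.
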
static get_retained(df, cutoff)

Return a fraction of the dataframe, dropping later time points.

lineage(lineage_location=None, merged=False)

Get lineage data from a given location in the h5 file.

Returns an array with three columns: the tile id, the mother label, and the daughter label.

Return type

ndarray
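For example, using the column layout described above, given a Chainer instance chainer:

>>> lin = chainer.lineage()
>>> mother_labels = lin[:, 1]    # second column: mother labels
>>> daughter_labels = lin[:, 2]  # third column: daughter labels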

property merges: ndarray

Get merges.

property meta_h5: Dict[str, Any]

Return metadata, defining it if necessary.

Return type

Dict[str, Any]

property n_merges

Get number of merges.

property ntimepoints

Find the number of time points for one position, or one h5 file.

property ntps: int

Get number of time points from the metadata.

Return type

int

property p_available

Print data sets available in h5 file.

property picks: ndarray

Get picks.

retained(signal, cutoff=0.8)

Load data (via decorator) and reduce the resulting dataframe.

Load data for a signal or a list of signals and reduce the resulting dataframes to a fraction of their original size, losing late time points.
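A brief sketch, given a Chainer instance chainer as above (the signal path is a hypothetical example):

>>> df = chainer.retained("extraction/general/None/volume", cutoff=0.8)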

property stages: List[str]

Get the contents of the pump with the highest flow rate at each stage.

Return type

List[str]

property stages_span: Tuple[Tuple[str, int], ...]

Get consecutive stages and their corresponding number of time points.

Return type

Tuple[Tuple[str, int], ...]

property stem

Get name of h5 file.

store_signal_path(fullname, node)

Store the name of a signal if it is a leaf node (a group with no more groups inside) and if it starts with extraction.

property tinterval: int

Find the interval between time points (minutes).