treble_tsdk.collections.simulation_collection

Classes

SimulationCollection

class treble_tsdk.collections.simulation_collection.SimulationCollection

__init__(initial: list[Simulation], client: TSDKClient)

Initialize the collection base.

Parameters:

key_column (str) – The column to use as the key for the collection (dataframe). This column is used to identify the items in the collection. Rows with the same key are considered to reference the same item.
default_schema (dict[str, pl.DataType]) – The default schema for the collection.
required_columns (list[str]) – The required columns for the collection.
schema_mapper (dict[str, Callable[[T], Any]]) – The schema mapper for the collection. This function will be called for each item in the collection with the item as a parameter to get the value for the column.
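
As a rough illustration of how a key column and a schema mapper could work together, the plain-Python sketch below builds one row per item by calling each mapper, then keeps a single row per key value. The `Item` class, the `build_rows` helper, and the column names are hypothetical stand-ins, not part of treble_tsdk. The choice to let a later row replace an earlier one is also an assumption. The real collection stores its rows in a Polars DataFrame.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Item:
    # Hypothetical stand-in for a Simulation; not a treble_tsdk class.
    id: str
    name: str
    receiver_count: int


def build_rows(
    items: list[Item],
    key_column: str,
    schema_mapper: dict[str, Callable[[Item], Any]],
) -> list[dict[str, Any]]:
    """Call every mapper once per item and keep one row per key value."""
    rows_by_key: dict[Any, dict[str, Any]] = {}
    for item in items:
        row = {column: mapper(item) for column, mapper in schema_mapper.items()}
        # Assumption: rows with the same key describe the same item,
        # so a later row simply replaces the earlier one.
        rows_by_key[row[key_column]] = row
    return list(rows_by_key.values())


schema_mapper = {
    "id": lambda item: item.id,
    "name": lambda item: item.name,
    "receiver_count": lambda item: item.receiver_count,
}
items = [Item("a", "Hall", 4), Item("b", "Studio", 2), Item("a", "Hall", 4)]
rows = build_rows(items, key_column="id", schema_mapper=schema_mapper)
```

Three items go in, but only two rows come out, because two of the items share the key "a".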

add_column(name: str, mapper: Callable[[T], Any], dtype: polars.DataType | None = None)

Add a column to the collection using a mapper function.

Parameters:

name (str) – Name of the column.
mapper (Callable[[T], Any]) – Mapper function to get a value. Called for each row in the collection with the item (e.g. Simulation or IRInfo) as a parameter.
dtype (pl.DataType) – Data type of the column. If not provided, the values returned by the mapper function are used to determine the data type.
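
The per-row mapper call described above can be pictured with plain Python lists and dicts. This is only a sketch: the module-level `rows` and `items`, the local `add_column` function, and the column names are hypothetical. Casting with a Python type when `dtype` is given is an assumption standing in for a Polars data type.

```python
from typing import Any, Callable, Optional

# Hypothetical items and rows; the real collection keeps these in a Polars DataFrame.
items = [{"id": "a", "volume": 120.0}, {"id": "b", "volume": 45.5}]
rows = [{"id": item["id"]} for item in items]


def add_column(
    name: str,
    mapper: Callable[[dict], Any],
    dtype: Optional[type] = None,
) -> None:
    """Compute one value per item and store it under `name` in the matching row."""
    for row, item in zip(rows, items):
        value = mapper(item)
        # Assumption: an explicit dtype casts the value; otherwise the
        # mapper's result is stored unchanged and its type is kept.
        row[name] = dtype(value) if dtype is not None else value


add_column("is_large", lambda item: item["volume"] > 100.0)
add_column("volume_whole", lambda item: item["volume"], dtype=int)
```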

add_projects(projects: Project | list[Project]): Add projects to the collection.

add_simulations(sims: Simulation | list[Simulation]): Add simulations to the collection.

apply(func: Callable[[polars.DataFrame], polars.DataFrame]) → _Self

Apply a DataFrame transformation and return a new collection of the same type.

The returned collection shares cached objects with the original but has an independent DataFrame. The transformation must preserve the required columns.

Parameters:

func (Callable[[pl.DataFrame], pl.DataFrame]) – A function that receives the underlying Polars DataFrame and returns a transformed DataFrame.

Returns:

A new collection of the same type with the transformed DataFrame.

Example:

filtered = coll.apply(lambda df: df.filter(pl.col("x") > 5).sort("y"))

copy_custom_columns(other_collection: CollectionBase, key_columns: list[str] | None = None, columns_to_copy: list[str] | None = None)

filter_collection(*args, **kwargs) → _Self

Filter the collection and return a new collection with matching rows.

Accepts the same arguments as polars.DataFrame.filter().

Returns:

A new collection of the same type containing only the matching rows.

Example:

nearby = coll.filter_collection(pl.col("source_receiver_dist") < 5.0)

get_ir_collection(simulation_subset: polars.DataFrame | None = None, inherit_columns: list[dict[str, str]] | None = None) → IRCollection

Convert the collection to an IR collection.

Parameters:

simulation_subset – A subset of the simulation collection to convert to an IR collection. If not provided, all simulations will be converted.
inherit_columns – A list of column mappings to inherit from the simulation collection. Each dict maps source column name (key) to target column name (value) in the IR collection. For example: [{"tags": "simulation_tags"}] maps "tags" to "simulation_tags".
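
To make the shape of `inherit_columns` concrete, the sketch below applies a list of source-to-target mappings to plain dicts. The row contents and the `room_volume` target name are made up for illustration. How the library actually joins simulation rows to IR rows is not shown here and may differ.

```python
# Hypothetical simulation-level row and IR-level rows, as plain dicts.
simulation_row = {"simulation_id": "sim-1", "tags": ["hall", "v2"], "volume": 120.0}
ir_rows = [
    {"simulation_id": "sim-1", "receiver": "r1"},
    {"simulation_id": "sim-1", "receiver": "r2"},
]

# Same shape as the inherit_columns argument: source column -> target column.
inherit_columns = [{"tags": "simulation_tags"}, {"volume": "room_volume"}]

# Flatten the list of dicts into a single rename map.
rename_map = {
    source: target
    for mapping in inherit_columns
    for source, target in mapping.items()
}

# Copy each inherited value onto every IR row under its target name.
inherited_rows = [
    {**ir_row, **{target: simulation_row[source] for source, target in rename_map.items()}}
    for ir_row in ir_rows
]
```

Each IR row ends up with `simulation_tags` and `room_volume` columns, while the source names `tags` and `volume` do not appear on the IR side.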

head(n: int = 5) → _Self

Return a new collection with the first n rows.

Parameters:

n (int) – Number of rows, defaults to 5.

Returns:

A new collection of the same type.

plot(): Plot the collection using an interactive widget.

remove_column(name: str): Remove a column from the collection.

sample_with_distribution(column: str, n_samples: int, distribution: Distribution)

Sample rows where the specified column follows a target distribution.

Returns a new collection of the same type containing only the sampled rows.

Parameters:

column (str) – The column to sample from.
n_samples (int) – The number of rows to sample.
distribution (Distribution) – Distribution to sample from.

Returns:

A new collection containing the sampled rows.

sample_with_gaussian_distribution(column: str, n_samples: int, target_mean: float, target_std: float, seed: int = 42)

Sample rows where the specified column follows a target Gaussian distribution.

This is a convenience wrapper around sample_with_distribution.

Parameters:

column (str) – The column to sample from.
n_samples (int) – The number of rows to sample.
target_mean (float) – The target mean for the distribution.
target_std (float) – The target standard deviation for the distribution.
seed (int) – The random seed for reproducibility.
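
One way to picture distribution-targeted sampling is to weight each row by the Gaussian density at its column value, then draw rows without replacement using those weights. The sketch below does this with Efraimidis-Spirakis weighted sampling from the standard library. It is not the library's algorithm, which this page does not describe. The `sample_gaussian` function is hypothetical, and the sampled values only approximate the target shape when the source values are spread fairly evenly.

```python
import math
import random
import statistics


def sample_gaussian(values, n_samples, target_mean, target_std, seed=42):
    """Pick n_samples row indices without replacement, favouring values near target_mean."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    keyed = []
    for index, value in enumerate(values):
        # Unnormalised Gaussian density at this row's value.
        weight = math.exp(-0.5 * ((value - target_mean) / target_std) ** 2)
        # Efraimidis-Spirakis key: u ** (1 / w); larger weights tend to give larger keys.
        u = rng.random()
        key = u ** (1.0 / weight) if weight > 0.0 else 0.0
        keyed.append((key, index))
    # Keep the rows with the largest keys and return their indices in row order.
    keyed.sort(reverse=True)
    return sorted(index for _, index in keyed[:n_samples])


# Hypothetical column: 1001 evenly spaced distances between 0 and 10.
values = [i * 0.01 for i in range(1001)]
picked = sample_gaussian(values, n_samples=200, target_mean=5.0, target_std=1.5)
picked_values = [values[i] for i in picked]
sample_mean = statistics.mean(picked_values)
sample_std = statistics.stdev(picked_values)
```

Calling the function again with the same seed returns the same indices, which is the role the `seed` parameter plays in the method above.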

tail(n: int = 5) → _Self

Return a new collection with the last n rows.

Parameters:

n (int) – Number of rows, defaults to 5.

Returns:

A new collection of the same type.

write_parquet(path: str | Path, **kwargs)

Write the dataframe to a parquet file.

Parameters:

path (str | Path) – Path to the parquet file.
kwargs – Additional keyword arguments passed to polars.DataFrame.write_parquet.

property columns: list[str]: List of columns in the DataFrame

property dataframe: polars.DataFrame

Get the collection data as a Polars DataFrame.

Returns:

pl.DataFrame – DataFrame containing all collection items and computed columns.

property editable_dataframe: polars.DataFrame: DataFrame with only the user-modifiable (non-locked) columns. The key column is always included in the editable dataframe although it is locked.

property lazy_df: polars.LazyFrame: Lazy access to the underlying Polars DataFrame.

property locked_columns: list[str]: List of locked columns for the collection. Columns in this list are read-only and cannot be modified by the user.
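
The relationship between `columns`, `locked_columns`, and the columns exposed by `editable_dataframe` can be sketched with plain lists. The column names and the `key_column` value below are hypothetical; the real lists come from the collection itself.

```python
# Hypothetical column names for illustration only.
key_column = "id"
columns = ["id", "name", "status", "my_score", "notes"]
locked_columns = ["id", "name", "status"]

# Editable view: every non-locked column, plus the key column even though it is locked.
editable_columns = [
    column
    for column in columns
    if column == key_column or column not in locked_columns
]
```

Here the editable view keeps `id`, `my_score`, and `notes`. The key column stays so that edited rows can still be matched back to their items.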

property required_columns: list[str]: List of required columns for the collection.