Environments

class coba.environments.Environments

A friendly API for common environment functionality.

Static Constructors

static from_custom(*environments: Environment | Sequence[Environment])

Create Environments from one or more Environment objects.

Parameters:: *environments – The Environment objects (or sequences of them) to include in the new Environments.
Returns:: An Environments object.

static from_dataframe(dataframe) → Environments

Create Environments from a dataframe.

Parameters:: dataframe – The dataframe to create Environments from. There will be only one environment, whose interactions are the rows of the dataframe.
Returns:: An Environments object.

static from_feurer(drop_missing: bool = True) → Environments

Create Environments from the Feurer benchmark.

Parameters:: drop_missing – Exclude interactions with missing context features.

Remarks:: The description of the benchmark is provided at https://arxiv.org/abs/2007.04074. For task ids 232, 3044, 75105, and 211723 every row has a missing feature. These environments will be empty when drop_missing is True. Task id 189866 has been updated to 361282, a new version of the original dataset that fixes API issues with the old dataset.

Returns:: An Environments object.

static from_kernel_synthetic(n_interactions: int, n_actions: int = 5, n_context_features: int = 5, n_action_features: int = 5, n_exemplars: int = 5, kernel: Literal['linear', 'polynomial', 'exponential', 'gaussian'] = 'gaussian', degree: int = 3, gamma: float = 1, seed: int | Sequence[int] = 1) → Environments

Create Environments from kernel-based reward functions.

Parameters:

n_interactions – The number of interactions the simulation should have.
n_actions – The number of actions each interaction should have.
n_context_features – The number of features each context should have.
n_action_features – The number of features each action should have.
n_exemplars – The number of exemplar context-action pairs.
kernel – The family of the kernel basis functions.
degree – This argument is only relevant when using polynomial kernels.
gamma – This argument is only relevant when using exponential kernels.
seed – The seed used to generate all random values. If seed is a list then a separate environment will be created for each seed.

Returns:

An Environments object.

static from_linear_synthetic(n_interactions: int, n_actions: int = 5, n_context_features: int = 5, n_action_features: int = 5, n_coefficients: int | None = 5, reward_features: Sequence[str] = ['a', 'xa'], seed: int | Sequence[int] = 1) → Environments

Create Environments from linear reward functions.

Parameters:

n_interactions – The number of interactions the simulation should have.
n_actions – The number of actions each interaction should have.
n_context_features – The number of features each context should have.
n_action_features – The number of features each action should have.
n_coefficients – The number of non-zero weights in the final reward function.
reward_features – The features in the simulation’s linear reward function.
seed – The seed used to generate all random values. If seed is a list then a separate environment will be created for each seed.

Returns:

An Environments object.

static from_mlp_synthetic(n_interactions: int, n_actions: int = 5, n_context_features: int = 5, n_action_features: int = 5, seed: int | Sequence[int] = 1) → Environments

Create Environments from MLP-based reward functions.

Parameters:

n_interactions – The number of interactions the simulation should have.
n_actions – The number of actions each interaction should have.
n_context_features – The number of features each context should have.
n_action_features – The number of features each action should have.
seed – The seed used to generate all random values. If seed is a list then a separate environment will be created for each seed.

Remarks:: The MLP architecture has a single hidden layer with sigmoid activation and one output value calculated from a random linear combination of the hidden layer’s output.

Returns:: An Environments object.

static from_neighbors_synthetic(n_interactions: int, n_actions: int = 5, n_context_features: int = 5, n_action_features: int = 5, n_neighborhoods: int = 30, seed: int | Sequence[int] = 1) → Environments

Create Environments from nearest neighbors reward functions.

Parameters:

n_interactions – The number of interactions the simulation should have.
n_actions – The number of actions each interaction should have.
n_context_features – The number of features each context should have.
n_action_features – The number of features each action should have.
n_neighborhoods – The number of distinct reward value neighborhoods.
seed – The seed used to generate all random values. If seed is a list then a separate environment will be created for each seed.

Returns:

An Environments object.

static from_openml(data_id: int | Sequence[int], drop_missing: bool = True, take: int = None, *, target: str = None, label_type: Literal['c', 'r', 'm'] = None) → Environments

static from_openml(*, task_id: int | Sequence[int], drop_missing: bool = True, take: int = None, target: str = None, label_type: Literal['m', 'c', 'r'] = None) → Environments

Create Environments from openml datasets.

Parameters:

data_id – The data id for a dataset on openml (i.e., openml.org/d/{id}). If data_id is a list then an environment will be created for each id.
task_id – The task id for a task on openml (i.e., openml.org/t/{id}). If task_id is a list then an environment will be created for each id.
drop_missing – Drop data rows with missing values.
take – The interaction count for the simulation (selected at random).
target – The column that should be marked as the label in the source.
label_type – Whether the label is a classification, regression, or multilabel type.

Returns:

An Environments object.

static from_result(result: str | Result) → Environments

Create Environments from a given Result file.

Parameters:: result – The path to results or the results of an experiment. One environment will be created for every environment in the Experiment that produced the results.

Remarks:: We assume that ‘context’, ‘action’, ‘probability’, and ‘reward’ were recorded during the experiment.

Returns:: An Environments object.

static from_save(path: str) → Environments

Create Environments from an Environments save file.

Parameters:: path – Path to an Environments save file.
Returns:: An Environments object.

static from_supervised(source: Source[Iterable[Dense] | Iterable[Sparse] | Iterable[Tuple[Any, Any]]], label_col: int | str = None, label_type: Literal['c', 'r', 'm'] = None, take: int = None) → Environments

static from_supervised(X=Sequence[Any], Y=Sequence[Any], label_type: Literal['c', 'r', 'm'] = None) → Environments

Create Environments using a supervised dataset.

Parameters:

source – A source that reads the supervised data. Coba natively provides support for csv, arff, libsvm, and manik data sources.
label_col – The header or index of the label in each example. If None, the source must return an iterable of tuple pairs where the first item is the features and the second item is the label.
X – The features to use when creating contexts.
Y – The labels to use when creating actions and rewards.
label_type – Whether the label is a classification, regression, or multilabel type. If None, the label type will be inferred from the data source.
take – The interaction count for the simulation (selected at random).

Returns:

An Environments object.

static from_template(source: str | Source[Iterable[str]], **user_vars) → Environments

Create Environments from a template file.

Parameters:: **user_vars – Overridable template variables.
Returns:: An Environments object.

Methods

batch(batch_size: int, batch_type: Literal['list', 'torch'] = 'list') → Environments

Batch interactions.

Parameters:

batch_size – The number of interactions in a batched interaction.
batch_type – The type of batch for interaction values.

Returns:

An Environments object.

binary() → Environments

Transform reward values to 1 or 0.

Remarks:: Reward values are either 1 (if max reward) or 0 (not max reward).

Returns:: An Environments object.
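
The transformation described in the remarks can be sketched in plain Python. The helper name `binarize` is hypothetical and this is not coba's implementation; it assumes that ties for the maximum all map to 1.

```python
from typing import List

def binarize(rewards: List[float]) -> List[int]:
    """Map each reward to 1 if it equals the max reward, else 0 (illustrative sketch)."""
    best = max(rewards)
    # Assumption: every reward tied for the maximum becomes 1.
    return [1 if reward == best else 0 for reward in rewards]

binary_rewards = binarize([0.2, 0.9, 0.5])
print(binary_rewards)
```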

cache() → Environments

Add a caching pipe.

Remarks:: Results from earlier steps in the pipeline will be reused during later pipeline steps.

Returns:: An Environments object.

static cache_dir(path: str | Path = '~/.cache/coba') → Type[Environments]

Set the cache directory for openml sources.

Parameters:: path – A path to a directory to cache openml sources.
Returns:: The Environments class.

chunk(cache: bool = True) → Environments

Add a chunk pipe.

Parameters:: cache – Whether output before the chunk should be cached.

Remarks:: This is useful if an early part of an Environments pipeline takes considerable time to evaluate (e.g., logged). Placing a chunk after the long-running filter means that the filter will only be executed one time and the results will be reused.

Returns:: An Environments object.

cycle(after: int) → Environments

Cycle reward values.

Useful for testing a learner’s response to non-stationary noise.

Parameters:: after – The number of interactions to wait before cycling reward values.
Returns:: An Environments object.

dense(n_feats: int, method: Literal['lookup', 'hashing'], context: bool = True, action: bool = False) → Environments

Ensure that features are dense.

Parameters:

n_feats – The number of features the densified environment should have.
method – How sparse features are turned into dense features. The hashing trick is more memory efficient but may have collisions. The lookup method is less memory efficient but guaranteed to have no collisions.
context – Densify context features.
action – Densify action features.

Returns:

An Environments object.
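
The trade-off between the two methods can be illustrated with a small sketch. The function names and the use of `zlib.crc32` as the hash are assumptions made for illustration; coba's actual hashing and lookup logic may differ, including how features beyond n_feats are handled.

```python
import zlib
from typing import Dict, List

def densify_hashing(sparse: Dict[str, float], n_feats: int) -> List[float]:
    """Hashing trick: no lookup table is stored, but two names may share an index."""
    dense = [0.0] * n_feats
    for name, value in sparse.items():
        index = zlib.crc32(name.encode("utf-8")) % n_feats
        dense[index] += value  # colliding features are summed together
    return dense

def densify_lookup(sparse: Dict[str, float], lookup: Dict[str, int], n_feats: int) -> List[float]:
    """Lookup: every name gets its own index, at the cost of storing the table."""
    dense = [0.0] * n_feats
    for name, value in sparse.items():
        index = lookup.setdefault(name, len(lookup))  # next free index for new names
        if index < n_feats:  # assumption: names beyond n_feats are dropped
            dense[index] = value
    return dense

hashed = densify_hashing({"a": 1.0, "b": 2.0, "c": 3.0}, n_feats=4)
looked_up = densify_lookup({"a": 1.0, "b": 2.0}, lookup={}, n_feats=4)
print(hashed, looked_up)
```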

filter(filter: EnvironmentFilter | Sequence[EnvironmentFilter]) → Environments

Apply custom filter to Environments.

Parameters:: filter – The filters to apply to self. If a list of filters is provided then a new pipeline is created for each filter.
Returns:: An Environments object.

flatten() → Environments

Flatten contexts and actions.

Examples

An interaction {‘context’: [[1,2],3]} would become {‘context’:[1,2,3]}.

Returns:: An Environments object.
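
The example above can be reproduced with a short pure-Python sketch. The helper name `flatten_values` is hypothetical and this is not coba's implementation; it only shows one level of flattening for nested lists or tuples.

```python
from typing import Any, List, Sequence

def flatten_values(values: Sequence[Any]) -> List[Any]:
    """Flatten one level of nesting (illustrative sketch)."""
    flat: List[Any] = []
    for value in values:
        if isinstance(value, (list, tuple)):
            flat.extend(value)  # nested sequence: splice its items in
        else:
            flat.append(value)  # scalar: keep as-is
    return flat

interaction = {"context": [[1, 2], 3]}
flattened = {"context": flatten_values(interaction["context"])}
print(flattened)
```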

grounded(n_users: int, n_normal: int, n_words: int, n_good: int, seed: int = 1) → Environments

Transform simulated interactions to IGL interactions.

Parameters:

n_users – The number of users in the grounded environment.
n_normal – The number of users with normal grounded behavior.
n_words – The number of potential feedback words for users.
n_good – The number of words that mean good out of the n_words.
seed – Seed for all random values.

Remarks:: See here for more on interaction grounded learning.

Returns:: An Environments object.

impute(stats: Literal['mean', 'median', 'mode'] | Sequence[Literal['mean', 'median', 'mode']] = 'mean', indicator: bool = True, using: int | None = None) → Environments

Impute missing data.

Parameters:

stats – The statistic to use for imputation. If a sequence is provided they will be applied in order. This is useful if different statistics should be applied for different types.
indicator – Indicates whether a binary feature should be added for missingness.
using – The number of interactions to use to calculate imputation statistics.

Returns:

An Environments object.
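
As a rough illustration of mean imputation with a missingness indicator, consider the sketch below. It assumes missing values are represented as None and that the indicator is appended as an extra 0/1 feature; both are assumptions, and `impute_mean` is a hypothetical helper rather than coba's code.

```python
import statistics
from typing import List, Optional, Tuple

def impute_mean(column: List[Optional[float]], indicator: bool = True) -> List[Tuple[float, ...]]:
    """Replace None with the mean of observed values, optionally adding a 0/1 missing flag."""
    observed = [value for value in column if value is not None]
    fill = statistics.mean(observed)
    rows: List[Tuple[float, ...]] = []
    for value in column:
        missing = value is None
        filled = fill if missing else value
        rows.append((filled, int(missing)) if indicator else (filled,))
    return rows

imputed = impute_mean([1.0, None, 3.0])
print(imputed)
```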

logged(learners: Learner | Sequence[Learner], seed: float | None = 1.23) → Environments

Transform simulated interactions to logged interactions.

Parameters:

learners – The learners that will be used as the logging policy. An environment will be created for every learner provided.
seed – The seed used for all random number generation.

Remarks:: Adds ‘action’, ‘reward’, and ‘probability’ to interactions with ‘context’, ‘actions’, and ‘rewards’.

Returns:: An Environments object.
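
The effect described in the remarks can be sketched for a single interaction. Here a uniform-random policy stands in for a Learner, and `log_interaction` is a hypothetical helper; the real method uses the supplied learners as the logging policy.

```python
import random
from typing import Any, Dict

def log_interaction(interaction: Dict[str, Any], rng: random.Random) -> Dict[str, Any]:
    """Add 'action', 'reward', and 'probability' using a uniform-random logging policy."""
    actions = interaction["actions"]
    index = rng.randrange(len(actions))  # stand-in for a learner's choice
    logged = dict(interaction)           # keep 'context', 'actions', and 'rewards'
    logged["action"] = actions[index]
    logged["reward"] = interaction["rewards"][index]
    logged["probability"] = 1 / len(actions)
    return logged

simulated = {"context": [0.5, 1.0], "actions": ["a", "b", "c"], "rewards": [0, 1, 0]}
logged = log_interaction(simulated, random.Random(1))
print(logged)
```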

materialize() → Environments

Materialize and cache all environments.

Remarks:: Ideal for stateful environments such as Jupyter Notebook where environments can be saved in memory and re-used between experiments.

Returns:: An Environments object.

Add noise to values.

Parameters:

context – Shape parameters for a distribution or a callable that returns a noisy value.
action – Shape parameters for a distribution or a callable that returns a noisy value.
reward – Shape parameters for a distribution or a callable that returns a noisy value.
seed – The seed for all random values. If a sequence of seeds is given then multiple environments will be created with each using distinct noise filters from the seeds.

Remarks:

Supported distributions defined via tuple are:

random gaussian: (`mean`,`std`)
random integer : (‘i’,`inclusive min`,`inclusive max`)
random gaussian: (‘g’,`mean`,`std`)

Returns:: An Environments object.
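
The tuple forms listed in the remarks could be interpreted along the following lines. This is a sketch only: `sample_noise` is a hypothetical helper, and how the drawn value is combined with the original context, action, or reward value is not specified here.

```python
import random
from typing import Tuple

def sample_noise(spec: Tuple, rng: random.Random) -> float:
    """Draw one value from a distribution given in one of the supported tuple forms."""
    if len(spec) == 2:
        mean, std = spec             # (mean, std) is shorthand for a gaussian
        return rng.gauss(mean, std)
    kind, first, second = spec
    if kind == "g":                  # ('g', mean, std)
        return rng.gauss(first, second)
    if kind == "i":                  # ('i', inclusive min, inclusive max)
        return rng.randint(first, second)  # randint includes both endpoints
    raise ValueError(f"Unsupported distribution: {spec!r}")

integer_noise = sample_noise(("i", 1, 3), random.Random(1))
gaussian_noise = sample_noise(("g", 0, 1), random.Random(7))
print(integer_noise, gaussian_noise)
```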

ope_rewards(rewards_type: Literal['IPS', 'DM', 'DR'] | None = None)

Transform logged interactions to simulated interactions.

Parameters:

rewards_type – How to estimate the rewards function from the logged data (i.e., inverse propensity score, direct method, or doubly robust).

Remarks:: Adds ‘rewards’ to interactions with ‘context’, ‘action’, ‘reward’, and ‘probability’.

Returns:: An Environments object.
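
For intuition, the textbook inverse propensity score (IPS) estimate for one logged interaction can be written as below. This is the standard estimator, not necessarily coba's exact code, and `ips_rewards` is a hypothetical helper. The direct method and doubly robust options additionally need a fitted reward model and are not shown.

```python
from typing import Any, List, Sequence

def ips_rewards(actions: Sequence[Any], logged_action: Any, reward: float, probability: float) -> List[float]:
    """Estimate a reward for every action from one logged (action, reward, probability)."""
    # Only the logged action has an observed reward; it is up-weighted by 1/probability.
    return [reward / probability if action == logged_action else 0.0 for action in actions]

estimated = ips_rewards(["a", "b", "c"], logged_action="b", reward=1.0, probability=0.5)
print(estimated)
```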

params(params: Mapping[str, Any]) → Environments

Add params to environments.

Parameters:: params – Parameter values to add to each Environment in Environments.
Returns:: An Environments object.

repr(cat_context: Literal['onehot', 'onehot_tuple', 'string'] = 'onehot', cat_actions: Literal['onehot', 'onehot_tuple', 'string'] = 'onehot') → Environments

Change representation of categorical data.

Parameters:

cat_context – How to represent categorical data in contexts.
cat_actions – How to represent categorical data in actions.

Returns:

An Environments object.

reservoir(n_interactions: int, seeds: int | Sequence[int] = 1, strict: bool = False) → Environments

Take n random interactions.

Parameters:

n_interactions – The maximum number of interactions to sample.
seeds – The seed to use to randomly generate the sample. If a sequence of seeds is provided then multiple samples are drawn.
strict – Do not draw a sample if there are not at least n_interactions.

Returns:

An Environments object.
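
The name suggests reservoir sampling, which draws a uniform sample in one pass without knowing the stream length. The sketch below uses the classic Algorithm R with a hypothetical `reservoir_sample` helper; the order of the sample that coba returns may differ.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")

def reservoir_sample(items: Iterable[T], n: int, seed: int = 1, strict: bool = False) -> List[T]:
    """Uniformly sample up to n items from a stream in a single pass (Algorithm R)."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < n:
            reservoir.append(item)   # fill the reservoir with the first n items
        else:
            j = rng.randint(0, i)    # uniform over all items seen so far (inclusive)
            if j < n:
                reservoir[j] = item  # replace an earlier pick
    if strict and len(reservoir) < n:
        return []                    # strict: no sample when fewer than n items exist
    return reservoir

sample = reservoir_sample(range(100), n=5, seed=1)
print(sample)
```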

riffle(spacing: int, seed: int = 1) → Environments

Riffle shuffle interactions.

Parameters:

spacing – The number of interactions from the beginning between each interaction shuffled in from the end.
seed – The seed used to determine the location of each ending interaction when placed within its beginning space.

Returns:

An Environments object.

save(path: str, processes: int = 1, overwrite: bool = False) → Environments

Save Environments to disk.

Parameters:

path – The location to save Environments (the file will be a zip archive).
processes – The number of processes to use when generating environments.
overwrite – Indicate if an existing file at path should be overwritten.

Returns:

An Environments object.

scale(shift: float | Literal['min', 'mean', 'med'] = 'min', scale: float | Literal['minmax', 'std', 'iqr', 'maxabs'] = 'minmax', targets: Literal['context'] = 'context', using: int | None = None) → Environments

Scale and shift features.

Parameters:

shift – The statistic to use to shift each context feature.
scale – The statistic to use to scale each context feature.
targets – The target data we wish to scale in the environment.
using – The number of interactions to use when calculating statistics.

Remarks:: For example, scale(‘mean’, ‘std’) would standardize all context features while scale(‘med’, ‘iqr’) would apply what sklearn calls a RobustScaler to all context features.

Returns:: An Environments object.
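
The combinations mentioned in the remarks can be sketched for a single feature column, assuming each value is transformed as (x - shift) / scale. The helper names are hypothetical, the population standard deviation is an assumption, and constant columns (zero scale) are not handled here.

```python
import statistics
from typing import List

def standardize(column: List[float]) -> List[float]:
    """Sketch of shift='mean', scale='std': zero mean and unit standard deviation."""
    shift = statistics.mean(column)
    scale = statistics.pstdev(column)  # assumption: population standard deviation
    return [(x - shift) / scale for x in column]

def min_max(column: List[float]) -> List[float]:
    """Sketch of the defaults shift='min', scale='minmax': values mapped onto [0, 1]."""
    shift = min(column)
    scale = max(column) - min(column)
    return [(x - shift) / scale for x in column]

unit_range = min_max([2.0, 4.0, 6.0])
z_scores = standardize([1.0, 2.0, 3.0])
print(unit_range, z_scores)
```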

shuffle(seed: int = 1) → Environments

shuffle(seeds: Iterable[int]) → Environments

shuffle(*, n: int) → Environments

Shuffle interaction order.

Parameters:

seed – The seed determining shuffle order.
seeds – Sequence of seeds determining shuffle order. A new environment is made for every seed where the only difference is the order of interactions.
n – The number of shuffling orders to produce. Equivalent to shuffle(seeds=range(n)).

Returns:

An Environments object.
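
The relationship between seeds and n can be pictured with a sketch that builds one ordering per seed. `shuffled_orders` is a hypothetical helper built on Python's `random` module; coba's own random number generator will produce different orderings.

```python
import random
from typing import Iterable, List, Sequence, TypeVar

T = TypeVar("T")

def shuffled_orders(interactions: Sequence[T], seeds: Iterable[int]) -> List[List[T]]:
    """Return one shuffled copy of the interactions for every seed."""
    orders = []
    for seed in seeds:
        order = list(interactions)
        random.Random(seed).shuffle(order)  # each seed fixes a reproducible order
        orders.append(order)
    return orders

# Analogous to shuffle(n=3), which is described as shuffle(seeds=range(3)).
orders = shuffled_orders(range(10), seeds=range(3))
print(orders)
```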

slice(start: int | None, stop: int | None = None, step: int = 1) → Environments

Take a slice of interactions.

Parameters:

start – The starting index for the slice.
stop – The finishing index for the slice (exclusive).
step – The step size between each item in the slice.

Returns:

An Environments object.
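
The parameters mirror Python's own slicing conventions, so `itertools.islice` gives a reasonable mental model. This is an analogy rather than coba's implementation, and whether negative indices are supported is not stated here.

```python
from itertools import islice

interactions = list(range(10))

# start=2, stop=8 (exclusive), step=2
sliced = list(islice(interactions, 2, 8, 2))
print(sliced)
```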

sort(*keys: str | int | Sequence[str | int]) → Environments

Sort interactions by features.

Parameters:: *keys – The index or keys for context features.
Returns:: An Environments object.
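
Sorting by several context features behaves like a multi-key sort. In the sketch below contexts are assumed to be dicts keyed by feature name, and `sort_by_features` is a hypothetical helper; integer keys would work the same way for list contexts.

```python
from typing import Any, Dict, List

def sort_by_features(interactions: List[Dict[str, Any]], *keys: Any) -> List[Dict[str, Any]]:
    """Order interactions by the given context features, comparing keys left to right."""
    return sorted(interactions, key=lambda i: tuple(i["context"][k] for k in keys))

interactions = [
    {"context": {"age": 40, "visits": 1}},
    {"context": {"age": 25, "visits": 7}},
    {"context": {"age": 40, "visits": 0}},
]
ordered = sort_by_features(interactions, "age", "visits")
print(ordered)
```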

sparse(context: bool = True, action: bool = False) → Environments

Ensure that features are sparse.

Parameters:

context – Sparsify context features.
action – Sparsify action features.

Returns:

An Environments object.

take(n_interactions: int, strict: bool = False) → Environments

Take the first n interactions.

Parameters:

n_interactions – The maximum number of interactions to take.
strict – Do not take any interactions if there are not at least n_interactions.

Returns:

An Environments object.
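
The effect of strict can be seen in a short sketch. The function `take_first` is a hypothetical stand-in for the behavior described above, not coba's implementation.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")

def take_first(interactions: Iterable[T], n_interactions: int, strict: bool = False) -> List[T]:
    """Return the first n interactions, or nothing if strict and too few exist."""
    first = list(islice(interactions, n_interactions))
    if strict and len(first) < n_interactions:
        return []  # strict: all or nothing
    return first

print(take_first(range(10), 3))
print(take_first(range(2), 3))
print(take_first(range(2), 3, strict=True))
```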

unbatch()

Unbatch interactions.

Remarks:: The unbatch command is the inverse of batch.

Returns:: An Environments object.
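
The inverse relationship can be sketched by treating a batched interaction as one dict whose values are lists. That layout and both helper names are assumptions made for illustration, corresponding loosely to batch_type='list'.

```python
from typing import Any, Dict, List

def batch_interactions(interactions: List[Dict[str, Any]], batch_size: int) -> List[Dict[str, List[Any]]]:
    """Group interactions so each batched interaction holds lists of values."""
    batches = []
    for start in range(0, len(interactions), batch_size):
        chunk = interactions[start:start + batch_size]
        batches.append({key: [item[key] for item in chunk] for key in chunk[0]})
    return batches

def unbatch_interactions(batches: List[Dict[str, List[Any]]]) -> List[Dict[str, Any]]:
    """Undo batch_interactions by splitting each batch back into single interactions."""
    interactions = []
    for batched in batches:
        keys = list(batched)
        for values in zip(*(batched[key] for key in keys)):
            interactions.append(dict(zip(keys, values)))
    return interactions

original = [{"context": 1, "reward": 0}, {"context": 2, "reward": 1}, {"context": 3, "reward": 1}]
batched = batch_interactions(original, batch_size=2)
restored = unbatch_interactions(batched)
print(batched)
```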

Select for characteristics.

Parameters:

n_interactions – The min, max or exact number of interactions an Environment must have.
n_actions – The min, max or exact number of actions an interaction must have.
n_features – The minimum, maximum or exact number of features interactions must have.

Returns:

An Environments object.