Result

class coba.results.Result

Data produced by an Experiment.

Two variables recur across Result methods: l and p. The l variable is what to compare; it can also be thought of as the “label” in plots. By default l is ‘learner_id’. The p variable is what to “pair” on. A p value is only included in a Result if every l has been evaluated on it. By default p is ‘environment_id’. That means, by default, Result plots only include the p values (environment_ids) on which every l (learner_id) has been evaluated.
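The pairing rule can be illustrated without coba. The following is a plain-Python sketch of the idea only, not coba's implementation; the helper name `complete_pairs` and the sample data are hypothetical.

```python
from collections import defaultdict

def complete_pairs(evaluated):
    """Return the p values for which every l value has been evaluated.

    `evaluated` is an iterable of (l, p) tuples, e.g. (learner_id, environment_id).
    """
    evaluated = set(evaluated)
    all_l = {l for l, _ in evaluated}

    # Collect which l values were seen for each p.
    l_by_p = defaultdict(set)
    for l, p in evaluated:
        l_by_p[p].add(l)

    # Keep a p only when its set of l values is the full set.
    return sorted(p for p, ls in l_by_p.items() if ls == all_l)

# Hypothetical data: learner 1 ran on environments 0, 1 and 2; learner 2 only on 0 and 2.
evaluated = [(1, 0), (1, 1), (1, 2), (2, 0), (2, 2)]
kept = complete_pairs(evaluated)
print(kept)  # [0, 2]
```

Environment 1 is dropped because learner 2 was never evaluated on it, so a comparison there would not be like for like.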

Static

static from_save(filename: str | Source[str]) Result

Load Result from an Experiment log.

Parameters:

filename – The path to an experiment log.

Returns:

A Result object.

Methods

where_best(l: str | Sequence[str], p: str | Sequence[str] = 'environment_id', y: str = 'reward', n: int | None = None, full_l: str | Sequence[str] = 'learner_id', full_p: str | Sequence[str] = 'environment_id') Result

Select the best performing l over p.

Parameters:
  • l – The hyperparameter values we wish to optimize.

  • p – The grouping variable we wish to optimize over.

  • y – The variable we wish to optimize.

  • n – The number of interactions we wish to consider.

  • full_l – The true lowest level label (e.g., learner_id)

  • full_p – The true lowest level pair (e.g., environment_id)

Returns:

A Result with full l and p that is the optimal over full_l and full_p. For example, l could be ‘family’ while full_l is ‘learner_id’. This would pick the best performing learner within each family for each p.
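The ‘family’ example above can be sketched in plain Python. This is an illustration under assumptions, not coba's code: it assumes “best” means the highest y value, that each row already holds one summary y per (learner_id, environment_id), and the helper `best_per_group` and the data are hypothetical.

```python
def best_per_group(rows, l="family", p="environment_id", y="reward", full_l="learner_id"):
    """For every (l, p) group, return the full_l value with the highest y."""
    best = {}
    for row in rows:
        key = (row[l], row[p])
        # Assumption: larger y is better; the first row seen wins ties.
        if key not in best or row[y] > best[key][y]:
            best[key] = row
    return {key: row[full_l] for key, row in best.items()}

# Hypothetical summary rows: two 'epsilon' learners and one 'ucb' learner.
rows = [
    {"learner_id": 0, "family": "epsilon", "environment_id": 0, "reward": 0.50},
    {"learner_id": 1, "family": "epsilon", "environment_id": 0, "reward": 0.75},
    {"learner_id": 2, "family": "ucb",     "environment_id": 0, "reward": 0.25},
    {"learner_id": 0, "family": "epsilon", "environment_id": 1, "reward": 0.50},
    {"learner_id": 1, "family": "epsilon", "environment_id": 1, "reward": 0.25},
    {"learner_id": 2, "family": "ucb",     "environment_id": 1, "reward": 0.50},
]
winners = best_per_group(rows)
print(winners[("epsilon", 0)], winners[("epsilon", 1)])  # 1 0
```

Within the ‘epsilon’ family a different learner_id wins on each environment, which is the per-p selection the description refers to.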

where_fin(n: int | Literal['min'] | None = None, l: str | Sequence[str] | None = None, p: str | Sequence[str] | None = None) Result

Filter the results down to even outcomes so that plotted results will be meaningful.

Parameters:
  • n – The number of interactions a specific evaluation must have (None indicates no constraint).

  • l – The level at which we wish to compare evaluation outcomes (e.g., ‘learner_id’).

  • p – The pairs that must exist across all l in order to be included (e.g., ‘environment_id’).

Returns:

A Result where an l exists for every p and all p have ‘n’ interactions.
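A rough plain-Python sketch of this kind of filtering for an integer n follows. It is not coba's implementation. It assumes that evaluations with fewer than n interactions are dropped and that longer ones are cut to n, which the text above does not state explicitly; the helper `finish` and the data are hypothetical.

```python
def finish(evals, n):
    """Filter `evals`, a dict mapping (l, p) to a list of per-interaction y values."""
    # Assumption: too-short evaluations are dropped and longer ones are cut to n.
    long_enough = {key: ys[:n] for key, ys in evals.items() if len(ys) >= n}

    # Keep only the p values on which every remaining l is present.
    all_l = {l for l, _ in long_enough}
    keep_p = {p for _, p in long_enough
              if all((l, p) in long_enough for l in all_l)}
    return {key: ys for key, ys in long_enough.items() if key[1] in keep_p}

# Hypothetical data: learner 2 has only two interactions on environment 1.
evals = {
    (1, 0): [1, 0, 1],
    (2, 0): [0, 0, 1, 1],
    (1, 1): [1, 1, 1],
    (2, 1): [0, 1],
}
even = finish(evals, n=3)
print(sorted(even))  # [(1, 0), (2, 0)]
```

Environment 1 disappears entirely: once learner 2's short run is removed, learner 1 no longer has a partner there.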

where(**kwargs) Result

Select learners/environments/evaluators.

Parameters:

kwargs – Any column in environments, learners, evaluators, or interactions to filter on. By default the comparison operator is either ‘equal’ or ‘in’. For example, where(environment_id=1) returns a Result whose environments contain only environment_id 1, while where(environment_id=[1,2]) returns a Result whose environments contain environment_id 1 and 2. The supported comparison operators are ‘=’, ‘!=’, ‘<=’, ‘<’, ‘>’, ‘>=’, ‘match’, ‘in’, and ‘!in’. To indicate an explicit operator use a dictionary. For example, where(environment_id={'<=':100}) returns a Result with all environments whose environment_id <= 100. Including multiple kwargs in a single where applies an “or” conjunction. Chaining where calls is equivalent to an “and” conjunction.

Returns:

A Result whose environments, learners, evaluators, and interactions satisfy the where selectors.
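The selector rules described above can be mimicked on a plain list of dictionaries. This sketch only mirrors the documented behaviour and is not how Result.where is implemented; the names `OPS`, `matches`, `where`, and `env_ids` are hypothetical, the ‘match’ operator is left out because its exact semantics are not described here, and each dictionary is assumed to hold a single operator.

```python
import operator

# Operators named in the text above, except 'match'.
OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "in": lambda value, options: value in options,
    "!in": lambda value, options: value not in options,
}

def matches(value, selector):
    """Apply one selector: a dict names the operator, a list means 'in', else '='."""
    if isinstance(selector, dict):
        (op, arg), = selector.items()  # assumes exactly one operator per dict
        return OPS[op](value, arg)
    if isinstance(selector, (list, tuple, set)):
        return value in selector
    return value == selector

def where(rows, **kwargs):
    """Keep rows that satisfy any selector (the 'or' behaviour described above)."""
    return [r for r in rows if any(matches(r[k], s) for k, s in kwargs.items())]

def env_ids(rows):
    return [r["environment_id"] for r in rows]

# Hypothetical rows: five environments, alternating between two learners.
rows = [{"environment_id": i, "learner_id": i % 2} for i in range(5)]

print(env_ids(where(rows, environment_id=[1, 2])))        # [1, 2]
print(env_ids(where(rows, environment_id={"<=": 2})))     # [0, 1, 2]
print(env_ids(where(rows, environment_id=4, learner_id=1)))  # [1, 3, 4]

# Chaining two calls narrows the result, i.e. behaves like "and".
chained = where(where(rows, environment_id={"<=": 2}), learner_id=0)
print(env_ids(chained))  # [0, 2]
```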

plot_contrast(l1: Any, l2: Any, x: str | Sequence[str] = 'environment_id', y: str = 'reward', l: str | Sequence[str] = 'learner_id', p: str | Sequence[str] = 'environment_id', mode: Literal['diff', 'prob'] | Callable[[float, float], float] = 'diff', span: int | None = None, err: Literal['se', 'sd', 'bs', 'bi'] | PointAndInterval | None = None, errevery: int | None = None, labels: Sequence[str] | None = None, colors: Sequence[str] | None = None, title: str | None = None, xlabel: str | None = None, ylabel: str | None = None, xlim: Tuple[Number | None, Number | None] | None = None, ylim: Tuple[Number | None, Number | None] | None = None, xticks: bool = True, yticks: bool = True, legend: bool = True, alpha: float = 1, xorder: Literal['+', '-'] | None = None, boundary: bool = True, out: None | Literal['screen'] | str = 'screen', ax=None) None

Plot a direct contrast of the performance for two learners.

Parameters:
  • l1 – The first set of parameter values we want to contrast.

  • l2 – The second set of parameter values we want to contrast.

  • x – The value to plot on the x-axis. This can either be index or environment columns to group by.

  • y – The value to plot on the y-axis.

  • l – The level at which we want to contrast.

  • p – The pairs that must exist across all comparison levels in order to be included.

  • mode – If ‘diff’, plot the pairwise difference; if ‘prob’, plot the probability of l1 beating l2.

  • span – The number of y values to smooth together when reporting y. If None, the average of all y values up to the current one is shown. Otherwise a moving average with a window size of span is shown (the window will be smaller than span initially).

  • err – This determines what kind of error bars to plot (if any). If None then no bars are plotted, if ‘se’ the standard error is shown, and if ‘sd’ the standard deviation is shown.

  • errevery – This determines the frequency of errorbars. If None they appear 5% of the time.

  • labels – The legend labels to use in the plot. These should be in order of the actual legend labels.

  • title – The title to give the plot.

  • colors – The colors used to plot the learners.

  • xlabel – The label on the x-axis.

  • ylabel – The label on the y-axis.

  • legend – Whether the legend for the plot should be drawn.

  • alpha – The opacity of drawn data.

  • xlim – Define the x-axis limits to plot. If None the x-axis limits will be inferred.

  • ylim – Define the y-axis limits to plot. If None the y-axis limits will be inferred.

  • xticks – Whether the x-axis labels should be drawn.

  • yticks – Whether the y-axis labels should be drawn.

  • xorder – Indicates whether the x-axis should be in ascending (+) or descending (-) order.

  • boundary – Whether we want to plot the boundary line between which set is the best performing.

  • out – Indicate where the plot should be sent to after plotting is finished. Valid values are ‘screen’ to show it on screen, a path to save to disk, or None if the plot should not be output anywhere (i.e., kept in memory) in order to keep editing the plot after this call.

  • ax – Provide an optional axes that the plot will be drawn to. If not provided a new figure/axes is created.
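The mode parameter determines how each pair of outcomes is turned into a single contrast value. The sketch below is an illustration in plain Python, not coba's code. It assumes each pair is compared on the same p, that ‘prob’ scores a pair as 1.0 when l1 is strictly better (ties count against l1), and that a callable mode receives the two y values; the helper `contrast` and the data are hypothetical.

```python
import operator

def contrast(y1, y2, mode="diff"):
    """Contrast paired outcomes y1[i] and y2[i] taken from the same p."""
    if mode == "diff":
        fn = operator.sub              # pairwise difference, l1 minus l2
    elif mode == "prob":
        def fn(a, b):
            return float(a > b)        # assumption: ties count against l1
    else:
        fn = mode                      # user-supplied callable of two floats
    return [fn(a, b) for a, b in zip(y1, y2)]

# Hypothetical mean rewards for l1 and l2 on three shared environments.
y1 = [0.75, 0.5, 0.25]
y2 = [0.5, 0.5, 0.5]

diffs = contrast(y1, y2, "diff")
wins = contrast(y1, y2, "prob")
ratio = contrast(y1, y2, lambda a, b: a / b)

print(diffs)  # [0.25, 0.0, -0.25]
print(wins)   # [1.0, 0.0, 0.0]
```

Averaging the `wins` list gives an estimate of how often l1 beats l2 across the shared environments.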

plot_learners(x: str | Sequence[str] = 'index', y: str = 'reward', l: str | Sequence[str] = 'full_name', p: str | Sequence[str] = 'environment_id', span: int | None = None, err: Literal['se', 'sd', 'bs', 'bi'] | PointAndInterval | None = None, errevery: int | None = None, labels: Sequence[str] | None = None, colors: int | Sequence[str | int] | None = None, title: str | None = None, xlabel: str | None = None, ylabel: str | None = None, xlim: Tuple[Number | None, Number | None] | None = None, ylim: Tuple[Number | None, Number | None] | None = None, xticks: bool = True, yticks: bool = True, legend: bool = True, alpha: float = 1, xorder: Literal['+', '-'] | None = None, top_n: int | None = None, out: None | Literal['screen'] | str = 'screen', ax=None) None

Plot the performance of multiple learners on multiple environments. It gives a sense of the expected performance for different learners across independent environments. This plot is valuable in gaining insight into how various learners perform in comparison to one another.

Parameters:
  • x – The values to plot on the x-axis.

  • y – The value to plot on the y-axis.

  • l – The values to plot in the legend.

  • p – The pairs that must exist across all items in the legend in order to be included. If None no pairing checks are performed.

  • span – The number of y values to smooth together when reporting y. If None, the average of all y values up to the current one is shown. Otherwise a moving average with a window size of span is shown (the window will be smaller than span initially).

  • err – This determines what kind of error bars to plot (if any). If None then no bars are plotted, if ‘se’ the standard error is shown, and if ‘sd’ the standard deviation is shown.

  • errevery – This determines the frequency of errorbars. If None they appear 5% of the time.

  • labels – The legend labels to use in the plot. These should be in order of the actual legend labels.

  • colors – The colors used to plot the learners.

  • title – The title to give the plot.

  • xlabel – The label on the x-axis.

  • ylabel – The label on the y-axis.

  • xlim – Define the x-axis limits to plot. If None the x-axis limits will be inferred.

  • ylim – Define the y-axis limits to plot. If None the y-axis limits will be inferred.

  • xticks – Whether the x-axis labels should be drawn.

  • yticks – Whether the y-axis labels should be drawn.

  • legend – Whether the legend for the plot should be drawn.

  • alpha – The opacity of drawn data.

  • xorder – Indicates whether the x-axis should be in ascending (+) or descending (-) order.

  • top_n – Only plot the top_n learners. If None, all learners will be plotted. If negative, the bottom learners will be plotted instead.

  • out – Indicate where the plot should be sent to after plotting is finished. Valid values are ‘screen’ to show it on screen, a path to save to disk, or None if the plot should not be output anywhere (i.e., kept in memory) in order to keep editing the plot after this call.

  • ax – Provide an optional axes that the plot will be drawn to. If not provided a new figure/axes is created.
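The two smoothing behaviours of span can be reproduced in a few lines of plain Python. This is a sketch of the description above, not coba's implementation; the helper `smooth` and the sample values are hypothetical.

```python
def smooth(ys, span=None):
    """Progressive mean when span is None, else a trailing moving average."""
    out = []
    for i in range(len(ys)):
        # With a span the window starts at most `span` values back; early
        # windows are shorter because fewer values are available.
        start = 0 if span is None else max(0, i + 1 - span)
        window = ys[start:i + 1]
        out.append(sum(window) / len(window))
    return out

ys = [1.0, 0.0, 2.0, 1.0]
progressive = smooth(ys)
moving = smooth(ys, span=2)

print(progressive)  # [1.0, 0.5, 1.0, 1.0]
print(moving)       # [1.0, 0.5, 1.0, 1.5]
```

The two series agree until the window is full, after which the moving average reacts only to the most recent span values.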

raw_contrast(l1: Any, l2: Any, x: str | Sequence[str] = 'environment_id', y: str = 'reward', l: str | Sequence[str] = 'learner_id', p: str | Sequence[str] = 'environment_id', span: int | None = None) Table

A Table with the raw data used for plot_contrast.

Parameters:
  • l1 – The first l value to contrast.

  • l2 – The second l value to contrast.

  • x – The variables to plot on the x-axis.

  • y – The variable to plot on the y-axis.

  • l – The column names that l1 and l2 represent.

  • p – The pairings to require across all l1 and l2.

  • span – The size of the rolling average (None means progressive mean).

Examples

raw_contrast(1, 2, x='environment_id', y='reward', l='learner_id', p='environment_id') would contrast learner_id=1 and learner_id=2 in terms of reward on all environment_ids.

Returns:

A Table with the raw data used to construct plot_contrast.

raw_learners(x: str | Sequence[str] = 'index', y: str = 'reward', l: str | Sequence[str] = 'full_name', p: str | Sequence[str] = 'environment_id', span: int | None = None) Table

A Table with the raw data used for plot_learners.

Parameters:
  • x – The variables to plot on the x-axis.

  • y – The variable to plot on the y-axis.

  • l – The labels to use in the plot legend.

  • p – The pairings to require across all l. If None no pairing checks are performed.

  • span – The size of the rolling average (None means progressive mean).

Returns:

A Table with the raw data used to construct plot_learners.

Attributes

environments : Table

The environments in the Experiment.

The primary key of this table is environment_id.

evaluators : Table

The evaluators in the Experiment.

The primary key of this table is evaluator_id.

interactions : Table

The evaluated interactions in the Experiment.

The primary key of this Table is (environment_id, learner_id, evaluator_id, index).

learners : Table

The learners in the Experiment.

The primary key of this table is learner_id.