OpeRewards
- class coba.environments.OpeRewards
Transform logged interactions to simulated interactions.
Constructors
- __init__(rwd_type: Literal['IPS', 'DM', 'DR'] | None = None, target: str = 'rewards', features=[1, 'x', 'a', 'xa', 'xxa'])
Instantiate an OpeRewards filter.
- Parameters:
rwd_type – How to estimate the reward function from the logged
data: inverse propensity score ('IPS'), direct method ('DM'), or doubly robust ('DR').
- Remarks:
Adds ‘rewards’ to interactions with ‘context’, ‘action’, ‘reward’, and ‘probability’.
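The 'IPS' and 'DR' options name standard off-policy estimators. The sketch below is not coba's implementation; it only illustrates, under stated assumptions, how per-action reward estimates can be built from one logged interaction. The function names `ips_rewards` and `dr_rewards` and the `model` callable (a stand-in for a learned direct-method reward predictor) are hypothetical.

```python
from typing import Callable, Hashable, List, Sequence


def ips_rewards(actions: Sequence[Hashable], logged_action: Hashable,
                logged_reward: float, probability: float) -> List[float]:
    """Inverse propensity score: only the logged action gets a nonzero estimate."""
    return [logged_reward / probability if a == logged_action else 0.0
            for a in actions]


def dr_rewards(actions: Sequence[Hashable], logged_action: Hashable,
               logged_reward: float, probability: float,
               model: Callable[[Hashable], float]) -> List[float]:
    """Doubly robust: model prediction plus an IPS-weighted residual
    applied to the logged action only."""
    estimates = []
    for a in actions:
        estimate = model(a)  # direct-method prediction (hypothetical model)
        if a == logged_action:
            estimate += (logged_reward - model(a)) / probability
        estimates.append(estimate)
    return estimates


# One logged interaction: action "b" was taken with probability 0.25, reward 1.0.
actions = ["a", "b", "c"]
ips = ips_rewards(actions, "b", 1.0, 0.25)
# Assumption: a constant reward model that predicts 0.5 for every action.
dr = dr_rewards(actions, "b", 1.0, 0.25, model=lambda a: 0.5)
```

With a reward model that is exactly right, the residual term vanishes and the doubly robust estimate reduces to the direct method; with a model that always predicts zero, it reduces to IPS.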
Methods
- filter(interactions: Iterable[Interaction]) → Iterable[Interaction]
Apply filter to an Environment’s interactions.
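The shape of such a filter (an iterable of interactions in, an iterable out) can be sketched with a generator. This is a minimal illustration, not the library's code: plain dicts stand in for Interaction objects, `add_ips_rewards` is a hypothetical name, the list form of 'rewards' is an assumption, and passing through interactions that lack the required keys unchanged is also an assumption about the behavior.

```python
from typing import Hashable, Iterable, Iterator, Sequence

# Keys the Remarks above list for logged interactions.
REQUIRED_KEYS = ("context", "action", "reward", "probability")


def add_ips_rewards(interactions: Iterable[dict],
                    actions: Sequence[Hashable]) -> Iterator[dict]:
    """Lazily yield interactions, adding IPS 'rewards' where possible."""
    for interaction in interactions:
        if all(key in interaction for key in REQUIRED_KEYS):
            scaled = interaction["reward"] / interaction["probability"]
            rewards = [scaled if a == interaction["action"] else 0.0
                       for a in actions]
            # Build a new dict so the logged interaction is not mutated.
            yield {**interaction, "rewards": rewards}
        else:
            # Assumption: interactions missing a required key pass through as-is.
            yield interaction


logged = [
    {"context": 1, "action": "a", "reward": 1.0, "probability": 0.5},
    {"context": 2, "action": "b", "reward": 0.0},  # no probability logged
]
result = list(add_ips_rewards(logged, actions=["a", "b"]))
```

Because the function is a generator, nothing is computed until the result is iterated, which suits large logged datasets.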
Attributes
- params