FixedLearner
- class coba.learners.FixedLearner
Select actions from a fixed distribution and learn nothing.
Constructors
- __init__(pmf: Sequence[Prob], seed: int = 1) -> None
Instantiate a FixedLearner.
- Parameters:
pmf – A probability mass function (PMF) giving the probability of taking each action.
seed – The seed used to select actions in predict.
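Because pmf is described as a probability mass function, a caller may want to check it before constructing the learner. The following is a minimal sketch of such a check. The helper name check_pmf and its n_actions argument are hypothetical and not part of coba, and whether coba performs a similar validation itself is not stated here.

```python
import math
from typing import Sequence


def check_pmf(pmf: Sequence[float], n_actions: int) -> None:
    """Hypothetical helper (not part of coba): validate a PMF before use."""
    # One probability is expected for each available action.
    if len(pmf) != n_actions:
        raise ValueError("pmf must contain one probability per action")
    # Probabilities cannot be negative.
    if any(p < 0 for p in pmf):
        raise ValueError("pmf values must be non-negative")
    # Allow for floating-point rounding when checking the total.
    if not math.isclose(sum(pmf), 1.0, abs_tol=1e-9):
        raise ValueError("pmf values must sum to 1")


check_pmf([0.2, 0.5, 0.3], n_actions=3)  # a valid PMF passes silently
```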
Methods
- learn(context: Context, action: Action, reward: float, prob: float) -> None
Learn about the action taken in the context. Because a FixedLearner learns nothing, its action distribution is unchanged by this call.
- Parameters:
context – The context in which the action was taken.
action – The action that was taken.
reward – The reward for the given context and action (feedback for IGL problems).
prob – The probability the given action was taken.
**kwargs – Optional information returned during prediction.
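The "learn nothing" behaviour can be illustrated with a small stand-in class. This is a sketch only: FixedSketch is a hypothetical name and is not coba's implementation. It shows that feedback passed to learn leaves the fixed distribution untouched.

```python
from typing import Any, List, Sequence


class FixedSketch:
    """Hypothetical stand-in for illustration; not coba's FixedLearner."""

    def __init__(self, pmf: Sequence[float]) -> None:
        self.pmf: List[float] = list(pmf)

    def learn(self, context: Any, action: Any, reward: float, prob: float) -> None:
        # Feedback is deliberately ignored, so the distribution never changes.
        return None


learner = FixedSketch([0.2, 0.5, 0.3])
before = list(learner.pmf)
learner.learn(None, "b", reward=1.0, prob=0.5)
after = list(learner.pmf)  # identical to `before`
```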
- predict(context: None | str | Number | Sequence | Mapping, actions: None | Sequence[Action]) -> Tuple[Action, Prob]
Predict which action to take in the context.
- Parameters:
context – The current context. It will either be None (multi-armed bandit), a value (a single feature), a sequence of values (dense features), or a dictionary (sparse features).
actions – The current set of actions to choose from in the given context. Each action will either be a value (a single feature), a sequence of values (dense features), or a dictionary (sparse features).
- Returns:
A prediction. Per the type hint, this is a tuple of the selected action and the probability with which it was selected.
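Selecting an action from a fixed distribution can be sketched with the standard library alone. The function sample_fixed below is hypothetical and is not how coba implements predict. It assumes the i-th PMF value belongs to the i-th action and ignores the context entirely.

```python
import random
from typing import Any, Sequence, Tuple


def sample_fixed(
    pmf: Sequence[float], actions: Sequence[Any], rng: random.Random
) -> Tuple[Any, float]:
    """Hypothetical sketch: draw one action according to a fixed PMF."""
    # Draw an index with probability proportional to its PMF value.
    index = rng.choices(range(len(actions)), weights=pmf, k=1)[0]
    # Return the action together with the probability of having drawn it.
    return actions[index], pmf[index]


rng = random.Random(1)  # a seeded generator makes the draws reproducible
action, prob = sample_fixed([0.2, 0.5, 0.3], ["a", "b", "c"], rng)
```

Reusing the same seed reproduces the same sequence of draws, which is the role the seed parameter plays in the constructor.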
- score(context: None | str | Number | Sequence | Mapping, actions: None | Sequence[Action], action: str | Number | Sequence | Mapping) -> float
Compute the propensity score of an action.
- Parameters:
context – The current context.
actions – The current set of actions that can be chosen.
action – The action to propensity score.
- Returns:
The propensity score of the given action. That is, P(action|context,actions).
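For a fixed distribution, P(action|context,actions) does not depend on the context, so the propensity score reduces to a lookup. The sketch below assumes the i-th PMF value corresponds to the i-th entry of actions. The function score_fixed is hypothetical and not coba's implementation.

```python
from typing import Any, Sequence


def score_fixed(pmf: Sequence[float], actions: Sequence[Any], action: Any) -> float:
    """Hypothetical sketch: propensity of `action` under a fixed PMF."""
    # Find the action's position, then read off its fixed probability.
    return pmf[actions.index(action)]


pmf = [0.2, 0.5, 0.3]
actions = ["a", "b", "c"]
score_b = score_fixed(pmf, actions, "b")  # 0.5, whatever the context is
```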
Attributes
- params