cpm.models
cpm.models.activation
CompetitiveGating(input=None, values=None, salience=None, P=1, **kwargs)
A competitive attentional gating function (an attentional activation function) that incorporates stimulus salience in addition to the stimulus vector to modulate the weights. It formalises the hypothesis that each stimulus has an underlying salience that competes to capture attentional focus (Paskewitz & Jones, 2020; Kruschke, 2001).
Parameters:
Examples:
>>> input = np.array([1, 1, 0])
>>> values = np.array([[0.1, 0.9, 0.8], [0.6, 0.2, 0.1]])
>>> salience = np.array([0.1, 0.2, 0.3])
>>> att = CompetitiveGating(input, values, salience, P = 1)
>>> att.compute()
array([[0.03333333, 0.6 , 0. ],
[0.2 , 0.13333333, 0. ]])
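The gating in this example can be reproduced with plain NumPy. The sketch below assumes the computation raises the salience of each active stimulus to the power P, normalises these gains so they sum to 1, and then scales each column of values; it is an illustration, not the library's own code.

import numpy as np

# Hypothetical sketch of the gating computation (not cpm's implementation):
# active saliences are raised to the power P, normalised, and applied to values.
stimulus = np.array([1, 1, 0])
values = np.array([[0.1, 0.9, 0.8], [0.6, 0.2, 0.1]])
salience = np.array([0.1, 0.2, 0.3])
P = 1

gain = (salience * stimulus) ** P
gain = gain / gain.sum()   # [1/3, 2/3, 0]
gated = values * gain      # roughly [[0.0333, 0.6, 0.], [0.2, 0.1333, 0.]], as above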
References
Kruschke, J. K. (2001). Toward a unified model of attention in associative learning. Journal of Mathematical Psychology, 45(6), 812-863.
Paskewitz, S., & Jones, M. (2020). Dissecting EXIT. Journal of Mathematical Psychology, 97, 102371.
compute()
Compute the activations mediated by underlying salience.
Returns:
Offset(input=None, offset=0, index=0, **kwargs)
A class for adding a scalar to one element of an input array. In practice, this can be used to "shift" or "offset" the "value" of one particular stimulus, for example to represent a consistent bias for (or against) that stimulus.
Parameters:
Examples:
>>> vals = np.array([2.1, 1.1])
>>> offsetter = Offset(input = vals, offset = 1.33, index = 0)
>>> offsetter.compute()
array([3.43, 1.1])
compute()
Add the offset to the requested input element.
Returns:
ProspectUtility(magnitudes=None, probabilities=None, alpha_pos=1, alpha_neg=None, lambda_loss=1, beta=1, delta=1, weighting='tk', **kwargs)
A class for computing choice utilities based on prospect theory.
Parameters:
Notes
The different weighting functions currently implemented are:
- `tk`: Tversky & Kahneman (1992).
- `pd`: Prelec (1998).
- `gw`: Gonzalez & Wu (1999).
Following Tversky & Kahneman (1992), the expected utility U of a choice option is defined as:
U = sum(w(p) * u(x)),
where w is a weighting function of the probability p of a potential outcome, and u is the utility function of the magnitude x of a potential outcome. These functions are defined as follows (equations 6 and 5, respectively, in Tversky & Kahneman, 1992, p. 309):
w(p) = p^beta / (p^beta + (1 - p)^beta)^(1/beta),
u(x) = ifelse(x >= 0, x^alpha_pos, -lambda * (-x)^alpha_neg),
where beta is the discriminability parameter of the weighting function; alpha_pos and alpha_neg are the risk attitude parameters in the gain and loss domains respectively, and lambda is the loss aversion parameter.
Several other definitions of the weighting function have been proposed in the literature, most notably in Prelec (1998) and Gonzalez & Wu (1999). Prelec (1998, equation 3.2, p. 503) proposed the following definition:
w(p) = exp(-delta * (-log(p))^beta),
where delta and beta are the attractiveness and discriminability parameters of the weighting function. Gonzalez & Wu (1999, equation 3, p. 139) proposed the following definition:
w(p) = (delta * p^beta) / ((delta * p^beta) + (1-p)^beta).
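As an illustration of these formulas, the sketch below implements the 'tk' weighting and utility functions in plain NumPy and evaluates the first prospect from the example below; the parameter values mirror that example, and the code is not the library's implementation.

import numpy as np

# Plain-NumPy sketch of the 'tk' weighting and utility functions defined above;
# illustrative only, not cpm's implementation.
alpha_pos, alpha_neg, lambda_loss, beta = 0.85, 0.85, 1.0, 0.9

def w_tk(p, beta):
    # Tversky & Kahneman (1992) probability weighting function
    return p**beta / (p**beta + (1 - p)**beta) ** (1 / beta)

def utility(x, alpha_pos, alpha_neg, lambda_loss):
    # Gains are raised to alpha_pos; losses to alpha_neg, scaled by loss aversion.
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    gains = x >= 0
    out[gains] = x[gains] ** alpha_pos
    out[~gains] = -lambda_loss * (-x[~gains]) ** alpha_neg
    return out

magnitudes = np.array([1.0, 40.0])
probabilities = np.array([0.95, 0.05])
U = np.sum(w_tk(probabilities, beta) * utility(magnitudes, alpha_pos, alpha_neg, lambda_loss))
# U is approximately 2.45, matching the first option in the example below.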
Examples:
>>> vals = np.array([np.array([1, 40]), np.array([10])], dtype=object)
>>> probs = np.array([np.array([0.95, 0.05]), np.array([1])], dtype=object)
>>> prospect = ProspectUtility(
magnitudes=vals, probabilities=probs, alpha_pos = 0.85, beta = 0.9
)
>>> prospect.compute()
array([2.44583162, 7.07945784])
References
Gonzalez, R., & Wu, G. (1999). On the shape of the probability weighting function. Cognitive Psychology, 38(1), 129-166.
Prelec, D. (1998). The probability weighting function. Econometrica, 497-527.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297-323.
compute()
Compute the expected utility of each choice option.
Returns:
SigmoidActivation(input=None, weights=None, **kwargs)
Represents a sigmoid activation function.
Parameters:
compute()
Compute the activation value using the sigmoid function.
Returns:
cpm.models.decision
ChoiceKernel(temperature_activations=0.5, temperature_kernel=0.5, activations=None, kernel=None, **kwargs)
A class representing a choice kernel based on a softmax function that incorporates the frequency of choosing an action. It is based on Equation 7 in Wilson and Collins (2019).
Parameters:
Notes
In order to get Equation 6 from Wilson and Collins (2019), either set activations
to None (default) or set it to 0.
See Also
cpm.models.learning.KernelUpdate: A class representing a kernel update (Equation 5; Wilson and Collins, 2019) that updates the kernel values.
References
Wilson, R. C., & Collins, A. G. E. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, Article e49547.
Examples:
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> kernel = np.array([0.1, 0.9])
>>> choice_kernel = ChoiceKernel(temperature_activations=1, temperature_kernel=1, activations=activations, kernel=kernel)
>>> choice_kernel.compute()
array([0.44028635, 0.55971365])
GreedyRule(activations=None, epsilon=0, **kwargs)
A class representing an ε-greedy rule based on Daw et al. (2006).
Parameters:
Attributes:
References
Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), Article 7095. https://doi.org/10.1038/nature04766
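No usage example is given for this class, so the sketch below illustrates a generic epsilon-greedy choice in plain NumPy: with probability 1 - epsilon the option with the highest activation is chosen, otherwise an option is drawn uniformly at random. It shows the rule in general, not necessarily cpm's exact tie-breaking or output format.

import numpy as np

# Generic epsilon-greedy choice (illustrative only, not cpm's implementation).
rng = np.random.default_rng(2024)
epsilon = 0.1
activations = np.array([0.1, 0.0, 0.2])

if rng.random() < epsilon:
    choice = int(rng.integers(len(activations)))  # explore: any option at random
else:
    choice = int(np.argmax(activations))          # exploit: highest activation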
choice()
Chooses the action based on the greedy rule.
Returns:
compute()
Computes the greedy rule.
Returns:
config()
Returns the configuration of the greedy rule.
Returns:
Sigmoid(temperature=None, activations=None, beta=0, **kwargs)
A class representing a sigmoid function that takes an n by m array of activations and returns an array of n outputs, where n is the number of outputs and m is the number of inputs.
The sigmoid function is defined as: 1 / (1 + e^(-temperature * (x - beta))).
Parameters:
Examples:
>>> from cpm.models.decision import Sigmoid
>>> import numpy as np
>>> temperature = 7
>>> activations = np.array([[0.1, 0.2]])
>>> sigmoid = Sigmoid(temperature=temperature, activations=activations)
>>> sigmoid.compute()
array([[0.66818777, 0.80218389]])
choice()
Chooses the action based on the sigmoid function.
Returns:
Notes
The choice is based on the probabilities of the sigmoid function, but it is not guaranteed that the policy values will sum to 1. Therefore, the policies are normalised to sum to 1 when generating a discrete choice.
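A minimal sketch of that normalisation in plain NumPy (illustrative only, not the library's code): the sigmoid outputs are rescaled to sum to 1 before a discrete choice is drawn.

import numpy as np

# Normalise sigmoid outputs (which need not sum to 1) before sampling a choice.
rng = np.random.default_rng(0)
policies = np.array([0.66818777, 0.80218389])  # sigmoid outputs from the example above
probabilities = policies / policies.sum()      # now sums to 1
choice = rng.choice(len(probabilities), p=probabilities)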
compute()
Computes the Sigmoid function.
Returns:
Softmax(temperature=None, xi=None, activations=None, **kwargs)
Softmax class for computing policies based on activations and temperature.
The softmax function is defined as: e^(temperature * x) / sum(e^(temperature * x)).
Parameters:
Notes
The inverse temperature parameter beta represents the degree of randomness in the choice process. As beta approaches positive infinity, choices become more deterministic, such that the option with the greatest activation is most likely to be chosen (the function approximates a step function). By contrast, as beta approaches zero, choices become random (i.e., the probabilities of the choice options are approximately equal) and therefore independent of the options' activations.
activations must be a 2D array, where each row represents an outcome and each column represents a stimulus or other arbitrary features and variables. If multiple values are provided for each outcome, the softmax function sums them. Note that if you have one value for each outcome (i.e., a classical bandit-like problem) and you represent it as a 1D array, you must reshape it into the format specified for activations: if you have 3 stimuli that are all actionable, [0.1, 0.5, 0.22], you should pass a 2D array of shape (3, 1), [[0.1], [0.5], [0.22]]. You can see Example 2 for a demonstration, as well as the sketch after the examples below.
Examples:
>>> from cpm.models.decision import Softmax
>>> import numpy as np
>>> temperature = 5
>>> activations = np.array([0.1, 0, 0.2])
>>> softmax = Softmax(temperature=temperature, activations=activations)
>>> softmax.compute()
array([0.30719589, 0.18632372, 0.50648039])
>>> softmax.choice() # This will randomly choose one of the actions based on the computed probabilities.
2
>>> Softmax(temperature=temperature, activations=activations).compute()
array([0.30719589, 0.18632372, 0.50648039])
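The reshaping described in the Notes might look like the following; this is a sketch that assumes the (number of outcomes, number of values) layout described above, and the rough probabilities in the comment follow from the softmax formula rather than from running the library.

import numpy as np
from cpm.models.decision import Softmax

# Reshape a 1D array of three bandit-like action values into the 2D layout
# described in the Notes: one row per outcome.
action_values = np.array([0.1, 0.5, 0.22])
activations = action_values.reshape(3, 1)   # [[0.1], [0.5], [0.22]]
policy = Softmax(temperature=5, activations=activations).compute()
# With temperature 5 the formula gives roughly [0.10, 0.72, 0.18].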
choice()
Choose an action based on the computed policies.
Returns:
compute()
Compute the policies based on the activations and temperature.
Returns:
irreducible_noise()
Extended softmax computation of policies based on activations, with inverse temperature and irreducible noise parameters.
The softmax function with irreducible noise is defined as:
(e^(beta * x) / sum(e^(beta * x))) * (1 - xi) + (xi / length(x)),
where x is the input array of activations, beta is the inverse temperature parameter, and xi is the irreducible noise parameter.
Notes
The irreducible noise parameter xi accounts for attentional lapses in the choice process. Specifically, the terms (1 - xi) and xi/length(x) proportionally scale the choice probabilities towards 1/length(x). Relatively speaking, this increases the probability that an option is selected if its activation is exceptionally low. This may seem counterintuitive in theory, but in practice it enables the model to capture highly surprising responses that can occur during attentional lapses.
Returns:
Examples:
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> noisy_softmax = Softmax(temperature=1.5, xi=0.1, activations=activations)
>>> noisy_softmax.irreducible_noise()
array([0.4101454, 0.5898546])
cpm.models.learning
DeltaRule(alpha=None, zeta=None, weights=None, feedback=None, input=None, **kwargs)
DeltaRule class computes the prediction error for a given input and target value.
Parameters:
See Also
cpm.models.learning.SeparableRule : A class representing a learning rule based on the separable error-term of Bush and Mosteller (1951).
Notes
The delta rule uses a summed error term: the error is defined as the difference between the target value and the summed activation of all stimuli present on the current trial/state for a given output unit. For a separable error term, see the Bush and Mosteller (1951) rule.
The current implementation is based on Gluck and Bower's (1988) delta rule, an extension of the Rescorla and Wagner (1972) learning rule to multi-outcome learning.
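To make the summed error term concrete, the sketch below reproduces the first example that follows using plain NumPy; it illustrates the update and is not the library's code.

import numpy as np

# Summed-error (delta) rule: errors are computed against the summed prediction.
alpha = 0.1
weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])  # outcomes x stimuli
feedback = np.array([1, 0])                              # teaching signal per outcome
stimuli = np.array([1, 1, 0])                            # active stimuli on this trial

prediction = weights @ stimuli            # summed activation per outcome: [0.3, 0.9]
error = feedback - prediction             # summed error term: [0.7, -0.9]
delta = alpha * np.outer(error, stimuli)  # weight change, same shape as weights
# delta -> [[0.07, 0.07, 0.], [-0.09, -0.09, -0.]]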
Examples:
>>> import numpy as np
>>> from cpm.models.learning import DeltaRule
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([1, 0])
>>> input = np.array([1, 1, 0])
>>> delta_rule = DeltaRule(alpha=0.1, zeta=0.1, weights=weights, feedback=teacher, input=input)
>>> delta_rule.compute()
array([[ 0.07, 0.07, 0. ],
[-0.09, -0.09, -0. ]])
>>> delta_rule.noisy_learning_rule()
array([[ 0.05755793, 0.09214091, 0.],
[-0.08837513, -0.1304325 , 0.]])
This implementation generalises to n-dimensional matrices, which means that it can be applied to both single- and multi-outcome learning paradigms.
>>> weights = np.array([0.1, 0.6, 0., 0.3])
>>> teacher = np.array([1])
>>> input = np.array([1, 1, 0, 0])
>>> delta_rule = DeltaRule(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> delta_rule.compute()
array([[0.03, 0.03, 0. , 0. ]])
References
Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117(3), 227–247.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Widrow, B., & Hoff, M. E. (1960, August). Adaptive switching circuits. In IRE WESCON convention record (Vol. 4, No. 1, pp. 96-104).
compute()
Compute the prediction error using the delta learning rule. It is based on Gluck and Bower's (1988) delta rule, an extension of Rescorla and Wagner (1972), which is formally identical to the rule of Widrow and Hoff (1960).
Returns:
noisy_learning_rule()
Add random noise to the prediction error computed from the delta learning rule, as specified by Findling et al. (2019). It is inspired by Weber's law of intensity sensation.
Returns:
References
Findling, C., Skvortsova, V., Dromnelle, R., Palminteri, S., and Wyart, V. (2019). Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nature Neuroscience 22, 2066–2077
reset()
Reset the weights to zero.
HumbleTeacher(alpha=None, weights=None, feedback=None, input=None, **kwargs)
A humble teacher learning rule (Kruschke, 1992; Love, Gureckis, and Medin, 2004) for multi-dimensional outcome learning.
Attributes:
Parameters:
Notes
The humble teacher learning rule is based on the idea that output-node activations larger than the teaching signal should not be counted as errors but should instead be rewarded. The humble teacher therefore treats teaching signals as discrete (nominal) values: they do not indicate the degree of membership between stimuli and outcome label, the degree of causality between stimuli and outcome, or the degree of correctness of the output.
References
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Examples:
>>> import numpy as np
>>> from cpm.models.learning import HumbleTeacher
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([0, 1])
>>> input = np.array([1, 1, 1])
>>> humble_teacher = HumbleTeacher(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> humble_teacher.compute()
array([[-0.06, -0.06, -0.06],
[ 0. , 0. , 0. ]])
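The output above is consistent with a teaching signal that is clamped at the activation whenever the output already overshoots it in the appropriate direction. The plain-NumPy sketch below illustrates that reading; it is an assumption about the computation, not the library's code.

import numpy as np

# Hypothetical humble-teacher update: the target is clamped at the activation
# when the output already exceeds the nominal teaching signal.
alpha = 0.1
weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
feedback = np.array([0, 1])
stimuli = np.array([1, 1, 1])

activation = weights @ stimuli                       # [0.6, 1.5]
target = np.where(feedback > 0,
                  np.maximum(activation, feedback),  # present outcome: at least the signal
                  np.minimum(activation, feedback))  # absent outcome: at most the signal
delta = alpha * np.outer(target - activation, stimuli)
# delta -> [[-0.06, -0.06, -0.06], [0., 0., 0.]], matching the example above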
compute()
Compute the weight updates using the humble teacher learning rule.
Returns:
KernelUpdate(response, alpha, kernel, input, **kwargs)
A class representing a learning rule for updating the choice kernel as specified by Equation 5 in Wilson and Collins (2019).
Parameters:
Notes
The kernel update component is used to represent how likely a given response is to be chosen based on the frequency it was chosen in the past. This can then be integrated into a choice kernel decision policy.
See Also
cpm.models.decision.ChoiceKernel : A class representing a choice kernel decision policy.
References
Wilson, R. C., & Collins, A. G. E. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, Article e49547.
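Since no example is shown here, the sketch below illustrates the choice-kernel update of Equation 5 in Wilson and Collins (2019) in plain NumPy, assuming the response is coded as a one-hot vector; it is an illustration, not necessarily cpm's exact parameterisation.

import numpy as np

# Choice-kernel update: the kernel drifts towards the chosen response.
alpha = 0.1
kernel = np.array([0.1, 0.9])   # current kernel value per response option
response = np.array([1, 0])     # assumed one-hot coding of the chosen response

updated = kernel + alpha * (response - kernel)
# updated -> [0.19, 0.81]: the chosen option moves towards 1, the other towards 0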
compute()
Compute the change in the kernel based on the given response, rate, and kernel, and return the updated kernel.
Returns:
config()
Get the configuration of the kernel update component.
Returns:
QLearningRule(alpha=0.5, gamma=0.1, values=None, reward=None, maximum=None, *args, **kwargs)
Q-learning rule (Watkins, 1989) for a one-dimensional array of Q-values.
Parameters:
Notes
The Q-learning rule is a model-free reinforcement learning algorithm that is used to learn the value of an action in a given state. It is defined as
Q(s, a) = Q(s, a) + alpha * (r + gamma * max(Q(s', a')) - Q(s, a)),
where Q(s, a) is the value of action a in state s, r is the reward received in the current state, gamma is the discount factor, and max(Q(s', a')) is the maximum estimated reward for the next state.
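For instance, with alpha = 0.1, gamma = 0.8, Q(s, a) = 1, r = 1 and max(Q(s', a')) = 10 (the first entry of the example below), the update is 1 + 0.1 * (1 + 0.8 * 10 - 1) = 1.8.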
Examples:
>>> import numpy as np
>>> from cpm.models.learning import QLearningRule
>>> values = np.array([1, 0.5, 0.99])
>>> component = QLearningRule(alpha=0.1, gamma=0.8, values=values, reward=1, maximum=10)
>>> component.compute()
array([1.8 , 1.35 , 1.791])
References
Watkins, C. J. C. H. (1989). Learning from delayed rewards.
Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine learning, 8, 279-292.
compute()
Compute the change in values based on the given values, reward, and parameters, and return the updated values.
Returns:
SeparableRule(alpha=None, zeta=None, weights=None, feedback=None, input=None, **kwargs)
A class representing a learning rule based on the separable error-term of Bush and Mosteller (1951).
Parameters:
See Also
cpm.models.learning.DeltaRule : An extension of the Rescorla and Wagner (1972) learning rule by Gluck and Bower (1988) to allow multi-outcome learning.
Notes
This type of learning rule was among the earliest formal models of associative learning (Le Pelley, 2004), which were based on standard linear operators (Bush & Mosteller, 1951; Estes, 1950; Kendler, 1971).
References
Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58, 313–323
Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57, 94–107
Kendler, T. S. (1971). Continuity theory and cue dominance. In J. T. Spence (Ed.), Essays in neobehaviorism: A memorial volume to Kenneth W. Spence. New York: Appleton-Century-Crofts.
Le Pelley, M. E. (2004). The role of associative history in models of associative learning: A selective review and a hybrid model. Quarterly Journal of Experimental Psychology Section B, 57(3), 193-243.
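No example is given for this rule, so the sketch below contrasts a separable (per-association) error term with the summed error of the delta rule, in plain NumPy; it is an illustration, not necessarily cpm's exact parameterisation.

import numpy as np

# Separable error term in the spirit of Bush and Mosteller (1951): each weight is
# compared against its own outcome's teaching signal rather than a summed prediction.
alpha = 0.1
weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])  # outcomes x stimuli
feedback = np.array([1, 0])                              # teaching signal per outcome
stimuli = np.array([1, 1, 0])                            # active stimuli

error = feedback[:, None] - weights        # per-association error
delta = alpha * error * stimuli[None, :]   # weight change, same shape as weights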
compute()
Computes the prediction error using the learning rule.
Returns:
ndarray: The prediction error for each stimulus-outcome mapping. It has the same shape as the weights input argument.
noisy_learning_rule()
Add random noise to the prediction error computed from the learning rule, as specified by Findling et al. (2019). It is inspired by Weber's law of intensity sensation.
Returns:
References
Findling, C., Skvortsova, V., Dromnelle, R., Palminteri, S., and Wyart, V. (2019). Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nature Neuroscience 22, 2066–2077
reset()
Resets the weights to zero.