cpm.models
cpm.models.activation
CompetitiveGating(input=None, values=None, salience=None, P=1, **kwargs)
A competitive attentional gating function (an attentional activation function) that incorporates stimulus salience, in addition to the stimulus vector, to modulate the weights. It formalises the hypothesis that each stimulus has an underlying salience that competes to capture attentional focus (Paskewitz and Jones, 2020; Kruschke, 2001).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | array_like | The input value. The stimulus representation (vector). | None |
values | array_like | The values. A 2D array of values, where each row represents an outcome and each column represents a single stimulus. | None |
salience | array_like | The salience value. A 1D array of salience values, where each value represents the salience of a single stimulus. | None |
P | float | The power value, also called attentional normalisation or brutality, which influences the degree of attentional competition. | 1 |
Examples:
>>> input = np.array([1, 1, 0])
>>> values = np.array([[0.1, 0.9, 0.8], [0.6, 0.2, 0.1]])
>>> salience = np.array([0.1, 0.2, 0.3])
>>> att = CompetitiveGating(input, values, salience, P = 1)
>>> att.compute()
array([[0.03333333, 0.6 , 0. ],
[0.2 , 0.13333333, 0. ]])
References
Kruschke, J. K. (2001). Toward a unified model of attention in associative learning. Journal of Mathematical Psychology, 45(6), 812-863.
Paskewitz, S., & Jones, M. (2020). Dissecting EXIT. Journal of Mathematical Psychology, 97, 102371.
compute()
Compute the activations mediated by underlying salience.
Returns:
Type | Description |
---|---|
array_like | The values updated with the attentional gain and stimulus vector. |
Offset(input=None, offset=0, index=0, **kwargs)
A class for adding a scalar to one element of an input array. In practice, this can be used to "shift" or "offset" the "value" of one particular stimulus, for example to represent a consistent bias for (or against) that stimulus.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | array_like | The input value. The stimulus representation (vector). | None |
offset | float | The value to be added to one element of the input. | 0 |
index | int | The index of the element of the input vector to which the offset should be added. | 0 |
**kwargs | dict, optional | Additional keyword arguments. | {} |
Examples:
>>> vals = np.array([2.1, 1.1])
>>> offsetter = Offset(input = vals, offset = 1.33, index = 0)
>>> offsetter.compute()
array([3.43, 1.1])
compute()
Add the offset to the requested input element.
Returns:
Type | Description |
---|---|
numpy.ndarray | The stimulus representation (vector) with the offset added to the requested element. |
ProspectUtility(magnitudes=None, probabilities=None, alpha_pos=1, alpha_neg=None, lambda_loss=1, beta=1, delta=1, weighting='tk', **kwargs)
A class for computing choice utilities based on prospect theory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
magnitudes | numpy.ndarray | The magnitudes of potential outcomes for each choice option. Should be a nested array where the outer dimension represents trials, followed by options within each trial, followed by potential outcomes within each option. | None |
probabilities | numpy.ndarray | The probabilities of potential outcomes for each choice option. Should be a nested array where the outer dimension represents trials, followed by options within each trial, followed by potential outcomes within each option. | None |
alpha_pos | float | The risk attitude parameter for non-negative outcomes, which determines the curvature of the utility function in the gain domain. If alpha_neg is undefined, alpha_pos will be used for both the gain and loss domains. | 1 |
alpha_neg | float | The risk attitude parameter for negative outcomes, which determines the curvature of the utility function in the loss domain. | None |
lambda_loss | float | The loss aversion parameter, which scales the utility of negative outcomes relative to non-negative outcomes. | 1 |
beta | float | The discriminability parameter, which determines the curvature of the weighting function. | 1 |
delta | float | The attractiveness parameter, which determines the elevation of the weighting function. | 1 |
weighting | str | The definition of the weighting function. Should be one of 'tk', 'pd', or 'gw'. | 'tk' |
**kwargs | dict, optional | Additional keyword arguments. | {} |
Notes
The different weighting functions currently implemented are:
- `tk`: Tversky & Kahneman (1992).
- `pd`: Prelec (1998).
- `gw`: Gonzalez & Wu (1999).
Following Tversky & Kahneman (1992), the expected utility U of a choice option is defined as:
U = sum(w(p) * u(x)),
where w is a weighting function of the probability p of a potential outcome, and u is the utility function of the magnitude x of a potential outcome. These functions are defined as follows (equations 6 and 5 respectively in Tversky & Kahneman, 1992, pp. 309):
w(p) = p^beta / (p^beta + (1 - p)^beta)^(1/beta),
u(x) = ifelse(x >= 0, x^alpha_pos, -lambda * (-x)^alpha_neg),
where beta is the discriminability parameter of the weighting function; alpha_pos and alpha_neg are the risk attitude parameters in the gain and loss domains respectively, and lambda is the loss aversion parameter.
Several other definitions of the weighting function have been proposed in the literature, most notably in Prelec (1998) and Gonzalez & Wu (1999). Prelec (equation 3.2, 1998, pp. 503) proposed the following definition:
w(p) = exp(-delta * (-log(p))^beta),
where delta and beta are the attractiveness and discriminability parameters of the weighting function. Gonzalez & Wu (equation 3, 1999, pp. 139) proposed the following definition:
w(p) = (delta * p^beta) / ((delta * p^beta) + (1-p)^beta).
Examples:
>>> vals = np.array([np.array([1, 40]), np.array([10])], dtype=object)
>>> probs = np.array([np.array([0.95, 0.05]), np.array([1])], dtype=object)
>>> prospect = ProspectUtility(
...     magnitudes=vals, probabilities=probs, alpha_pos=0.85, beta=0.9
... )
>>> prospect.compute()
array([2.44583162, 7.07945784])
References
Gonzalez, R., & Wu, G. (1999). On the shape of the probability weighting function. Cognitive psychology, 38(1), 129-166.
Prelec, D. (1998). The probability weighting function. Econometrica, 497-527.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and uncertainty, 5, 297-323.
compute()
Compute the expected utility of each choice option.
Returns:
Type | Description |
---|---|
numpy.ndarray | The computed expected utility of each choice option. |
SigmoidActivation(input=None, weights=None, **kwargs)
Represents a sigmoid activation function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | array_like | The input value. The stimulus representation (vector). | None |
weights | array_like | The weights value. A 2D array of weights, where each row represents an outcome and each column represents a single stimulus. | None |
**kwargs | dict | Additional keyword arguments. | {} |
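For illustration, a plain-NumPy sketch of a sigmoid activation over an input-gated weight matrix; this assumes the logistic function is applied element-wise to the gated weights, whereas the exact cpm implementation may aggregate differently:
>>> import numpy as np
>>> input = np.array([1, 1, 0])
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> 1 / (1 + np.exp(-(weights * input)))  # logistic function of the input-gated weights
array([[0.52497919, 0.549834, 0.5],
[0.59868766, 0.62245933, 0.5]])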
compute()
Compute the activation value using the sigmoid function.
Returns:
Type | Description |
---|---|
numpy.ndarray | The computed activation value. |
cpm.models.decision
ChoiceKernel(temperature_activations=0.5, temperature_kernel=0.5, activations=None, kernel=None, **kwargs)
A class representing a choice kernel based on a softmax function that incorporates the frequency of choosing an action. It is based on Equation 7 in Wilson and Collins (2019).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
temperature_activations | float | The inverse temperature parameter for the softmax computation. | 0.5 |
temperature_kernel | float | The inverse temperature parameter for the kernel computation. | 0.5 |
activations | ndarray, optional | An array of activations for the softmax function. | None |
kernel | ndarray, optional | An array of kernel values for the softmax function. | None |
Notes
In order to get Equation 6 from Wilson and Collins (2019), either set activations to None (default) or set it to 0.
See Also
cpm.models.learning.KernelUpdate: A class representing a kernel update (Equation 5; Wilson and Collins, 2019) that updates the kernel values.
References
Wilson, R. C., & Collins, A. G. E. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, Article e49547.
Examples:
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> kernel = np.array([0.1, 0.9])
>>> choice_kernel = ChoiceKernel(temperature_activations=1, temperature_kernel=1, activations=activations, kernel=kernel)
>>> choice_kernel.compute()
array([0.44028635, 0.55971365])
GreedyRule(activations=None, epsilon=0, **kwargs)
A class representing an ε-greedy rule based on Daw et al. (2006).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
activations | ndarray | An array of activations for the greedy rule. | None |
epsilon | float | Exploration parameter. The probability of selecting a random action. | 0 |
Attributes:
Name | Type | Description |
---|---|---|
activations | ndarray | An array of activations for the greedy rule. |
epsilon | float | Exploration parameter. The probability of selecting a random action. |
policies | ndarray | An array of outputs computed using the greedy rule. |
shape | tuple | The shape of the activations array. |
References
Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), Article 7095. https://doi.org/10.1038/nature04766
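For illustration, a plain-NumPy sketch of the ε-greedy idea (not the cpm API itself): with probability epsilon a random action is taken, otherwise the action with the highest activation is chosen.
>>> import numpy as np
>>> rng = np.random.default_rng()
>>> activations = np.array([0.1, 0.7, 0.2])
>>> epsilon = 0.1
>>> greedy = int(np.argmax(activations))                           # action with the highest activation
>>> explore = rng.random() < epsilon                               # True with probability epsilon
>>> action = int(rng.integers(len(activations))) if explore else greedy
>>> greedy
1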
choice()
Chooses the action based on the greedy rule.
Returns:
Name | Type | Description |
---|---|---|
action | int | The chosen action based on the greedy rule. |
compute()
Computes the greedy rule.
Returns:
Name | Type | Description |
---|---|---|
output | ndarray | A 2D array of outputs computed using the greedy rule. |
config()
Returns the configuration of the greedy rule.
Returns:
Name | Type | Description |
---|---|---|
config | dict | A dictionary containing the configuration of the greedy rule. |
Sigmoid(temperature=None, activations=None, beta=0, **kwargs)
A class representing a sigmoid function that takes an n by m array of activations and returns an n array of outputs, where n is the number of outputs and m is the number of inputs.
The sigmoid function is defined as: 1 / (1 + e^(-temperature * (x - beta))).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
temperature | float | The inverse temperature parameter for the sigmoid function. | None |
beta | float | The value of the output activation that results in an output rating of P = 0.5. | 0 |
activations | ndarray | An array of activations for the sigmoid function. | None |
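For illustration, a plain-NumPy sketch of the formula above, assuming activations are summed per outcome before the sigmoid is applied; the exact cpm implementation may aggregate differently:
>>> import numpy as np
>>> activations = np.array([[0.2, 0.1], [-0.4, 0.6]])
>>> temperature, beta = 5, 0
>>> x = activations.sum(axis=1)  # summed activation per outcome
>>> 1 / (1 + np.exp(-temperature * (x - beta)))
array([0.81757448, 0.73105858])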
choice()
Chooses the action based on the sigmoid function.
Returns:
Name | Type | Description |
---|---|---|
action | int | The chosen action based on the sigmoid function. |
Notes
The choice is based on the probabilities of the sigmoid function, but it is not guaranteed that the policy values will sum to 1. Therefore, the policies are normalised to sum to 1 when generating a discrete choice.
compute()
Computes the Sigmoid function.
Returns:
Name | Type | Description |
---|---|---|
output | ndarray | A 2D array of outputs computed using the sigmoid function. |
Softmax(temperature=None, xi=None, activations=None, **kwargs)
Softmax class for computing policies based on activations and temperature.
The softmax function is defined as: e^(temperature * x) / sum(e^(temperature * x)).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
temperature | float | The inverse temperature parameter for the softmax computation. | None |
xi | float | The irreducible noise parameter for the softmax computation. | None |
activations | numpy.ndarray | Array of activations for each possible outcome/action. It should be a 2D ndarray, where each row represents an outcome and each column represents a single stimulus. | None |
Notes
The inverse temperature parameter beta represents the degree of randomness in the choice process. As beta approaches positive infinity, choices become more deterministic, such that the choice option with the greatest activation is more likely to be chosen - the function approximates a step function. By contrast, as beta approaches zero, choices become random (i.e., the probabilities of the choice options are approximately equal) and therefore independent of the options' activations.
activations must be a 2D array, where each row represents an outcome and each column represents a stimulus or other arbitrary features and variables. If multiple values are provided for each outcome, the softmax function will sum these values. Note that if you have one value for each outcome (i.e. a classical bandit-like problem) and you represent it as a 1D array, you must reshape it into the format specified for activations: if you have 3 stimuli which are all actionable, [0.1, 0.5, 0.22], you should use a 2D array of shape (3, 1), [[0.1], [0.5], [0.22]], as in the snippet below. You can see Example 2 for a demonstration.
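A minimal illustrative reshape in plain NumPy:
>>> import numpy as np
>>> np.array([0.1, 0.5, 0.22]).reshape(-1, 1)
array([[0.1 ],
       [0.5 ],
       [0.22]])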
Examples:
>>> from cpm.models.decision import Softmax
>>> import numpy as np
>>> temperature = 1
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> softmax = Softmax(temperature=temperature, activations=activations)
>>> softmax.compute()
array([0.45352133, 0.54647867])
>>> softmax.config()
{
"temperature" : 1,
"activations":
array([[ 0.1, 0. , 0.2],
[-0.6, 0. , 0.9]]),
"name" : "Softmax",
"type" : "decision",
}
>>> Softmax(temperature=temperature, activations=activations).compute()
array([0.45352133, 0.54647867])
choice()
Choose an action based on the computed policies.
Returns:
Type | Description |
---|---|
int | The chosen action based on the computed policies. |
compute()
Compute the policies based on the activations and temperature.
Returns:
Type | Description |
---|---|
numpy.ndarray | The computed policies (choice probabilities) for each action. |
irreducible_noise()
Extended softmax class for computing policies based on activations, with parameters inverse temperature and irreducible noise.
The softmax function with irreducible noise is defined as:
(e^(beta * x) / sum(e^(beta * x))) * (1 - xi) + (xi / length(x)),
where x is the input array of activations, beta is the inverse temperature parameter, and xi is the irreducible noise parameter.
Notes
The irreducible noise parameter xi accounts for attentional lapses in the choice process. Specifically, scaling the softmax probabilities by (1 - xi) and adding xi / length(x) pulls the choice probabilities proportionally towards 1/length(x). Relatively speaking, this increases the probability that an option is selected even if its activation is exceptionally low. This may seem counterintuitive in theory, but in practice it enables the model to capture highly surprising responses that can occur during attentional lapses.
Returns:
Type | Description |
---|---|
numpy.ndarray | The computed policies with irreducible noise. |
Examples:
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> noisy_softmax = Softmax(temperature=1.5, xi=0.1, activations=activations)
>>> noisy_softmax.irreducible_noise()
array([0.4101454, 0.5898546])
cpm.models.learning
DeltaRule(alpha=None, zeta=None, weights=None, feedback=None, input=None, **kwargs)
DeltaRule class computes the prediction error for a given input and target value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha | float | The learning rate. | None |
zeta | float, optional | The constant fraction of the magnitude of the prediction error, also called Weber's scaling. | None |
weights | array-like | The value matrix, where rows are outcomes and columns are stimuli or features. The values can be anything; for example belief values, association weights, connection weights, Q-values. | None |
feedback | array-like | The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict. | None |
input | array-like | The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 and 1. | None |
**kwargs | dict, optional | Additional keyword arguments. | {} |
See Also
cpm.models.learning.SeparableRule : A class representing a learning rule based on the separable error-term of Bush and Mosteller (1951).
Notes
The delta rule uses a summed error term: the error is defined as the difference between the target value for a given output unit and the summed activation of all values available for that unit on the current trial/state. For a separable error term, see the Bush and Mosteller (1951) rule.
The current implementation is based on Gluck and Bower's (1988) delta rule, an extension of the Rescorla and Wagner (1972) learning rule to multi-outcome learning.
Examples:
>>> import numpy as np
>>> from cpm.models.learning import DeltaRule
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([1, 0])
>>> input = np.array([1, 1, 0])
>>> delta_rule = DeltaRule(alpha=0.1, zeta=0.1, weights=weights, feedback=teacher, input=input)
>>> delta_rule.compute()
array([[ 0.07, 0.07, 0. ],
[-0.09, -0.09, -0. ]])
>>> delta_rule.noisy_learning_rule()
array([[ 0.05755793, 0.09214091, 0.],
[-0.08837513, -0.1304325 , 0.]])
This implementation generalises to n-dimensional matrices, which means that it can be applied to both single- and multi-outcome learning paradigms.
>>> weights = np.array([0.1, 0.6, 0., 0.3])
>>> teacher = np.array([1])
>>> input = np.array([1, 1, 0, 0])
>>> delta_rule = DeltaRule(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> delta_rule.compute()
array([[0.03, 0.03, 0. , 0. ]])
References
Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117(3), 227–247.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Widrow, B., & Hoff, M. E. (1960, August). Adaptive switching circuits. In IRE WESCON convention record (Vol. 4, No. 1, pp. 96-104).
compute()
Compute the prediction error using the delta learning rule. It is based on Gluck and Bower's (1988) delta rule, an extension of Rescorla and Wagner (1972), which was itself identical to the rule of Widrow and Hoff (1960).
Returns:
Type | Description |
---|---|
ndarray | The prediction error for each stimulus-outcome mapping. It has the same shape as the weights input argument. |
noisy_learning_rule()
Add random noise to the prediction error computed from the delta learning rule, as specified by Findling et al. (2019). It is inspired by Weber's law of intensity sensation.
Returns:
Type | Description |
---|---|
ndarray | The prediction error for each stimulus-outcome mapping with learning noise. It has the same shape as the weights input argument. |
References
Findling, C., Skvortsova, V., Dromnelle, R., Palminteri, S., and Wyart, V. (2019). Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nature Neuroscience 22, 2066–2077
reset()
Reset the weights to zero.
HumbleTeacher(alpha=None, weights=None, feedback=None, input=None, **kwargs)
A humble teacher learning rule (Kruschke, 1992; Love, Gureckis, and Medin, 2004) for multi-dimensional outcome learning.
Attributes:
Name | Type | Description |
---|---|---|
alpha | float | The learning rate. |
input | ndarray or array_like | The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 and 1. |
weights | ndarray | The weights value. A 2D array of weights, where each row represents an outcome and each column represents a single stimulus. |
teacher | ndarray | The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict. |
shape | tuple | The shape of the weight matrix. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha | float | The learning rate. | None |
weights | array-like | The weights value. A 2D array of weights, where each row represents an outcome and each column represents a single stimulus. | None |
feedback | array-like | The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict. | None |
input | array-like | The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 and 1. | None |
**kwargs | dict, optional | Additional keyword arguments. | {} |
Notes
The humble teacher learning rule is based on the idea that output node activations larger than the teaching signal should not be counted as errors, but should be rewarded. The humble teacher therefore treats teaching signals as discrete (nominal) values that do not indicate the degree of membership between stimuli and outcome label, the degree of causality between stimuli and outcome, or the degree of correctness of the output.
References
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Examples:
>>> import numpy as np
>>> from cpm.models.learning import HumbleTeacher
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([0, 1])
>>> input = np.array([1, 1, 1])
>>> humble_teacher = HumbleTeacher(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> humble_teacher.compute()
array([[-0.06, -0.06, -0.06],
[ 0. , 0. , 0. ]])
compute()
Compute the updated weights using the humble teacher learning rule.
Returns:
Name | Type | Description |
---|---|---|
weights | numpy.ndarray | The updated weights matrix. |
KernelUpdate(response, alpha, kernel, input, **kwargs)
A class representing a learning rule for updating the choice kernel as specified by Equation 5 in Wilson and Collins (2019).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
response | ndarray | The response vector. It must be a binary numpy.ndarray, so that each element corresponds to a response option. If there are 4 response options and the second was selected, it would be represented as [0, 1, 0, 0]. | required |
alpha | float | The kernel learning rate. | required |
kernel | ndarray | The kernel used for learning. It is a 1D array of kernel values, where each element corresponds to a response option. Each element must correspond to the same response option as in the response vector. | required |
Notes
The kernel update component is used to represent how likely a given response is to be chosen based on the frequency it was chosen in the past. This can then be integrated into a choice kernel decision policy.
See Also
cpm.models.decision.ChoiceKernel : A class representing a choice kernel decision policy.
References
Wilson, R. C., & Collins, A. G. E. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, Article e49547.
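For illustration, a plain-NumPy sketch of the update in Equation 5 of Wilson and Collins (2019), where the kernel moves towards a one-hot response vector at rate alpha; the exact cpm output may differ:
>>> import numpy as np
>>> kernel = np.array([0.1, 0.9])
>>> response = np.array([0, 1])  # the second option was chosen
>>> alpha = 0.1
>>> kernel + alpha * (response - kernel)  # kernel moves towards the chosen option
array([0.09, 0.91])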
compute()
Compute the change in the kernel based on the given response, rate, and kernel, and return the updated kernel.
Returns:
Name | Type | Description |
---|---|---|
output | numpy.ndarray | The computed change of the kernel. |
config()
Get the configuration of the kernel update component.
Returns:
Name | Type | Description |
---|---|---|
config | dict | A dictionary containing the configuration parameters of the kernel update component. |
QLearningRule(alpha=0.5, gamma=0.1, values=None, reward=None, maximum=None, *args, **kwargs)
Q-learning rule (Watkins, 1989) for a one-dimensional array of Q-values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha | float | The learning rate. Default is 0.5. | 0.5 |
gamma | float | The discount factor. Default is 0.1. | 0.1 |
values | ndarray | The values. A 1D array of Q-values active for the current state, where each element corresponds to an action. | None |
reward | float | The reward received on the current state. | None |
maximum | float | The maximum estimated reward for the next state. | None |
Notes
The Q-learning rule is a model-free reinforcement learning algorithm that is used to learn the value of an action in a given state. It is defined as
Q(s, a) = Q(s, a) + alpha * (r + gamma * max(Q(s', a')) - Q(s, a)),
where Q(s, a) is the value of action a in state s, r is the reward received on the current state, gamma is the discount factor, and max(Q(s', a')) is the maximum estimated reward for the next state.
Examples:
>>> import numpy as np
>>> from cpm.models.learning import QLearningRule
>>> values = np.array([1, 0.5, 0.99])
>>> component = QLearningRule(alpha=0.1, gamma=0.8, values=values, reward=1, maximum=10)
>>> component.compute()
array([1.8 , 1.35 , 1.791])
References
Watkins, C. J. C. H. (1989). Learning from delayed rewards.
Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine learning, 8, 279-292.
compute()
Compute the change in values based on the given values, reward, and parameters, and return the updated values.
Returns:
Name | Type | Description |
---|---|---|
output | numpy.ndarray | The computed output values. |
SeparableRule(alpha=None, zeta=None, weights=None, feedback=None, input=None, **kwargs)
A class representing a learning rule based on the separable error-term of Bush and Mosteller (1951).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha | float | The learning rate. | None |
zeta | float, optional | The constant fraction of the magnitude of the prediction error, also called Weber's scaling. | None |
weights | array-like | The value matrix, where rows are outcomes and columns are stimuli or features. The values can be anything; for example belief values, association weights, connection weights, Q-values. | None |
feedback | array-like, optional | The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict. | None |
input | array-like, optional | The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 and 1. | None |
**kwargs | dict, optional | Additional keyword arguments. | {} |
See Also
cpm.models.learning.DeltaRule : An extension of the Rescorla and Wagner (1972) learning rule by Gluck and Bower (1988) to allow multi-outcome learning.
Notes
This type of learning rule was among the earliest formal models of associative learning (Le Pelley, 2004), which were based on standard linear operators (Bush & Mosteller, 1951; Estes, 1950; Kendler, 1971).
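For illustration, a plain-NumPy sketch of a Bush and Mosteller (1951) style separable error term, where each weight moves towards the feedback for its own outcome, gated by the input; the exact cpm implementation may differ in detail:
>>> import numpy as np
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> feedback = np.array([1, 0])
>>> input = np.array([1, 1, 0])
>>> alpha = 0.1
>>> alpha * (feedback[:, None] - weights) * input  # separable (per-weight) error term
array([[ 0.09, 0.08, 0. ],
[-0.04, -0.05, -0. ]])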
References
Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58, 313–323
Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57, 94–107
Kendler, T. S. (1971). Continuity theory and cue dominance. In J. T. Spence (Ed.), Essays in neobehaviorism: A memorial volume to Kenneth W. Spence. New York: Appleton-Century-Crofts.
Le Pelley, M. E. (2004). The role of associative history in models of associative learning: A selective review and a hybrid model. Quarterly Journal of Experimental Psychology Section B, 57(3), 193-243.
compute()
Computes the prediction error using the learning rule.
Returns:
Type | Description |
---|---|
ndarray | The prediction error for each stimulus-outcome mapping. It has the same shape as the weights input argument. |
noisy_learning_rule()
Add random noise to the prediction error computed from the learning rule, as specified by Findling et al. (2019). It is inspired by Weber's law of intensity sensation.
Returns:
Type | Description |
---|---|
ndarray | The prediction error for each stimulus-outcome mapping with learning noise. It has the same shape as the weights input argument. |
References
Findling, C., Skvortsova, V., Dromnelle, R., Palminteri, S., and Wyart, V. (2019). Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nature Neuroscience 22, 2066–2077
reset()
Resets the weights to zero.