cpm.models

cpm.models.activation

CompetitiveGating(input=None, values=None, salience=None, P=1, **kwargs)

A competitive attentional gating function (an attentional activation function) that incorporates stimulus salience, in addition to the stimulus vector, to modulate the weights. It formalises the hypothesis that each stimulus has an underlying salience and that stimuli compete to capture attentional focus (Paskewitz and Jones, 2020; Kruschke, 2001).

Parameters:

Name Type Description Default
input array_like

The input value. The stimulus representation (vector).

None
values array_like

The values. A 2D array of values, where each row represents an outcome and each column represents a single stimulus.

None
salience array_like

The salience value. A 1D array of salience values, where each value represents the salience of a single stimulus.

None
P float

The power value, also called attentional normalisation or brutality, which influences the degree of attentional competition.

1

Examples:

>>> input = np.array([1, 1, 0])
>>> values = np.array([[0.1, 0.9, 0.8], [0.6, 0.2, 0.1]])
>>> salience = np.array([0.1, 0.2, 0.3])
>>> att = CompetitiveGating(input, values, salience, P = 1)
>>> att.compute()
array([[0.03333333, 0.6       , 0.        ],
       [0.2       , 0.13333333, 0.        ]])
References

Kruschke, J. K. (2001). Toward a unified model of attention in associative learning. Journal of Mathematical Psychology, 45(6), 812-863.

Paskewitz, S., & Jones, M. (2020). Dissecting EXIT. Journal of Mathematical Psychology, 97, 102371.

compute()

Compute the activations mediated by underlying salience.

Returns:

Type Description
array_like

The values updated with the attentional gain and stimulus vector.

Offset(input=None, offset=0, index=0, **kwargs)

A class for adding a scalar to one element of an input array. In practice, this can be used to "shift" or "offset" the "value" of one particular stimulus, for example to represent a consistent bias for (or against) that stimulus.

Parameters:

Name Type Description Default
input array_like

The input value. The stimulus representation (vector).

None
offset float

The value to be added to one element of the input.

0
index int

The index of the element of the input vector to which the offset should be added.

0
**kwargs dict, optional

Additional keyword arguments.

{}

Examples:

>>> vals = np.array([2.1, 1.1])
>>> offsetter = Offset(input = vals, offset = 1.33, index = 0)
>>> offsetter.compute()
array([3.43, 1.1])

compute()

Add the offset to the requested input element.

Returns:

Type Description
numpy.ndarray

The stimulus representation (vector) with offset added to the requested element.

ProspectUtility(magnitudes=None, probabilities=None, alpha_pos=1, alpha_neg=None, lambda_loss=1, beta=1, delta=1, weighting='tk', **kwargs)

A class for computing choice utilities based on prospect theory.

Parameters:

Name Type Description Default
magnitudes numpy.ndarray

The magnitudes of potential outcomes for each choice option. Should be a nested array where the outer dimension represents trials, followed by options within each trial, followed by potential outcomes within each option.

None
probabilities numpy.ndarray

The probabilities of potential outcomes for each choice option. Should be a nested array where the outer dimension represents trials, followed by options within each trial, followed by potential outcomes within each option.

None
alpha_pos float

The risk attitude parameter for non-negative outcomes, which determines the curvature of the utility function in the gain domain. If alpha_neg is undefined, alpha_pos will be used for both the gain and loss domains.

1
alpha_neg float

The risk attitude parameter for negative outcomes, which determines the curvature of the utility function in the loss domain.

None
lambda_loss float

The loss aversion parameter, which scales the utility of negative outcomes relative to non-negative outcomes.

1
beta float

The discriminability parameter, which determines the curvature of the weighting function.

1
delta float

The attractiveness parameter, which determines the elevation of the weighting function.

1
weighting str

The definition of the weighting function. Should be one of 'tk', 'pd', or 'gw'.

'tk'
**kwargs dict, optional

Additional keyword arguments.

{}
Notes

The different weighting functions currently implemented are:

- `tk`: Tversky & Kahneman (1992).
- `pd`: Prelec (1998).
- `gw`: Gonzalez & Wu (1999).

Following Tversky & Kahneman (1992), the expected utility U of a choice option is defined as:

U = sum(w(p) * u(x)),

where w is a weighting function of the probability p of a potential outcome, and u is the utility function of the magnitude x of a potential outcome. These functions are defined as follows (equations 6 and 5, respectively, in Tversky & Kahneman, 1992, p. 309):

w(p) = p^beta / (p^beta + (1 - p)^beta)^(1/beta),


u(x) = ifelse(x >= 0, x^alpha_pos, -lambda * (-x)^alpha_neg),

where beta is the discriminability parameter of the weighting function; alpha_pos and alpha_neg are the risk attitude parameters in the gain and loss domains respectively, and lambda is the loss aversion parameter.

Several other definitions of the weighting function have been proposed in the literature, most notably by Prelec (1998) and Gonzalez & Wu (1999). Prelec (1998, equation 3.2, p. 503) proposed the following definition:

w(p) = exp(-delta * (-log(p))^beta),

where delta and beta are the attractiveness and discriminability parameters of the weighting function. Gonzalez & Wu (1999, equation 3, p. 139) proposed the following definition:

w(p) = (delta * p^beta) / ((delta * p^beta) + (1-p)^beta).
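
To make the three weighting functions concrete, below is a minimal standalone sketch of the equations above. The function names (w_tk, w_pd, w_gw) are illustrative and are not part of the package API.

>>> import numpy as np
>>> def w_tk(p, beta):
...     # Tversky & Kahneman (1992), equation 6
...     return p**beta / (p**beta + (1 - p)**beta) ** (1 / beta)
>>> def w_pd(p, beta, delta):
...     # Prelec (1998), equation 3.2
...     return np.exp(-delta * (-np.log(p)) ** beta)
>>> def w_gw(p, beta, delta):
...     # Gonzalez & Wu (1999), equation 3
...     return (delta * p**beta) / (delta * p**beta + (1 - p) ** beta)
>>> round(w_tk(0.05, 0.9), 4)
0.0658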

Examples:

>>> vals = np.array([np.array([1, 40]), np.array([10])], dtype=object)
>>> probs = np.array([np.array([0.95, 0.05]), np.array([1])], dtype=object)
>>> prospect = ProspectUtility(
...     magnitudes=vals, probabilities=probs, alpha_pos=0.85, beta=0.9
... )
>>> prospect.compute()
array([2.44583162, 7.07945784])
References

Gonzalez, R., & Wu, G. (1999). On the shape of the probability weighting function. Cognitive Psychology, 38(1), 129-166.

Prelec, D. (1998). The probability weighting function. Econometrica, 497-527.

Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297-323.

compute()

Compute the expected utility of each choice option.

Returns:

Type Description
numpy.ndarray

The computed expected utility of each choice option.

SigmoidActivation(input=None, weights=None, **kwargs)

Represents a sigmoid activation function.

Parameters:

Name Type Description Default
input array_like

The input value. The stimulus representation (vector).

None
weights array_like

The weights value. A 2D array of weights, where each row represents an outcome and each column represents a single stimulus.

None
**kwargs dict

Additional keyword arguments.

{}

compute()

Compute the activation value using the sigmoid function.

Returns:

Type Description
numpy.ndarray

The computed activation value.

cpm.models.decision

ChoiceKernel(temperature_activations=0.5, temperature_kernel=0.5, activations=None, kernel=None, **kwargs)

A class representing a choice kernel based on a softmax function that incorporates the frequency of choosing an action. It is based on Equation 7 in Wilson and Collins (2019).

Parameters:

Name Type Description Default
temperature_activations float

The inverse temperature parameter for the softmax computation.

0.5
temperature_kernel float

The inverse temperature parameter for the kernel computation.

0.5
activations ndarray, optional

An array of activations for the softmax function.

None
kernel ndarray, optional

An array of kernel values for the softmax function.

None
Notes

To obtain Equation 6 from Wilson and Collins (2019), either leave activations as None (the default) or set it to 0.

See Also

cpm.models.learning.KernelUpdate: A class representing a kernel update (Equation 5; Wilson and Collins, 2019) that updates the kernel values.

References

Wilson, R. C., & Collins, A. G. E. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, Article e49547.

Examples:

>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> kernel = np.array([0.1, 0.9])
>>> choice_kernel = ChoiceKernel(temperature_activations=1, temperature_kernel=1, activations=activations, kernel=kernel)
>>> choice_kernel.compute()
array([0.44028635, 0.55971365])
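
As noted above, leaving activations at its default of None yields the kernel-only variant (Equation 6 in Wilson and Collins, 2019). A minimal sketch, with the output omitted because it depends on how the default is handled internally:

>>> kernel_only = ChoiceKernel(temperature_kernel=1, kernel=kernel)
>>> kernel_only.compute()  # probabilities driven by choice frequency alone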

GreedyRule(activations=None, epsilon=0, **kwargs)

A class representing an ε-greedy rule based on Daw et al. (2006).

Parameters:

Name Type Description Default
activations ndarray

An array of activations for the greedy rule.

None
epsilon float

Exploration parameter. The probability of selecting a random action.

0

Attributes:

Name Type Description
activations ndarray

An array of activations for the greedy rule.

epsilon float

Exploration parameter. The probability of selecting a random action.

policies ndarray

An array of outputs computed using the greedy rule.

shape tuple

The shape of the activations array.

References

Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), Article 7095. https://doi.org/10.1038/nature04766
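
Examples:

A minimal usage sketch. Outputs are omitted, since the exact policy values (and the sampled action) depend on the implementation and on epsilon.

>>> import numpy as np
>>> from cpm.models.decision import GreedyRule
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> rule = GreedyRule(activations=activations, epsilon=0.1)
>>> rule.compute()  # 2D array of policy values
>>> rule.choice()   # a single action; stochastic whenever epsilon > 0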

choice()

Chooses the action based on the greedy rule.

Returns:

Name Type Description
action int

The chosen action based on the greedy rule.

compute()

Computes the greedy rule.

Returns:

Name Type Description
output ndarray

A 2D array of outputs computed using the greedy rule.

config()

Returns the configuration of the greedy rule.

Returns:

Name Type Description
config dict

A dictionary containing the configuration of the greedy rule.

  • activations (ndarray): An array of activations for the greedy rule.
  • name (str): The name of the greedy rule.
  • type (str): The class of function it belongs to.

Sigmoid(temperature=None, activations=None, beta=0, **kwargs)

A class representing a sigmoid function that takes an n by m array of activations and returns an n-length array of outputs, where n is the number of outputs and m is the number of inputs.

The sigmoid function is defined as: 1 / (1 + e^(-temperature * (x - beta))).

Parameters:

Name Type Description Default
temperature float

The inverse temperature parameter for the sigmoid function.

None
beta float

It is the value of the output activation that results in an output rating of P = 0.5.

0
activations ndarray

An array of activations for the sigmoid function.

None
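
Examples:

A minimal usage sketch, assuming activations is an n by m array as described above. Outputs are omitted.

>>> import numpy as np
>>> from cpm.models.decision import Sigmoid
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> sigmoid = Sigmoid(temperature=1.5, beta=0, activations=activations)
>>> sigmoid.compute()  # outputs bounded between 0 and 1
>>> sigmoid.choice()   # a single action sampled from the normalised policies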

choice()

Chooses the action based on the sigmoid function.

Returns:

Name Type Description
action int

The chosen action based on the sigmoid function.

Notes

The choice is based on the probabilities of the sigmoid function, but it is not guaranteed that the policy values will sum to 1. Therefore, the policies are normalised to sum to 1 when generating a discrete choice.

compute()

Computes the Sigmoid function.

Returns:

Name Type Description
output ndarray

A 2D array of outputs computed using the sigmoid function.

Softmax(temperature=None, xi=None, activations=None, **kwargs)

Softmax class for computing policies based on activations and temperature.

The softmax function is defined as: e^(temperature * x) / sum(e^(temperature * x)).

Parameters:

Name Type Description Default
temperature float

The inverse temperature parameter for the softmax computation.

None
xi float

The irreducible noise parameter for the softmax computation.

None
activations numpy.ndarray

Array of activations for each possible outcome/action. It should be a 2D ndarray, where each row represents an outcome and each column represents a single stimulus.

None
Notes

The inverse temperature parameter beta represents the degree of randomness in the choice process. As beta approaches positive infinity, choices become more deterministic, such that the option with the greatest activation is most likely to be chosen - the function approximates a step function. By contrast, as beta approaches zero, choices become random (i.e., the probabilities of the choice options are approximately equal) and therefore independent of the options' activations.

activations must be a 2D array, where each row represents an outcome and each column represents a stimulus or other arbitrary features and variables. If multiple values are provided for each outcome, the softmax function will sum these values up.

Note that if you have one value for each outcome (i.e., a classical bandit-like problem) and you represent it as a 1D array, you must reshape it into the format specified for activations. For example, if you have 3 actionable stimuli with values [0.1, 0.5, 0.22], you should pass a 2D array of shape (3, 1): [[0.1], [0.5], [0.22]]. See the second example below for a demonstration.

Examples:

>>> from cpm.models.decision import Softmax
>>> import numpy as np
>>> temperature = 1
>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> softmax = Softmax(temperature=temperature, activations=activations)
>>> softmax.compute()
array([0.45352133, 0.54647867])
>>> softmax.config()
{
    "temperature"   : 1,
    "activations":
        array([[ 0.1,  0. ,  0.2],
        [-0.6,  0. ,  0.9]]),
    "name"  : "Softmax",
    "type"  : "decision",
}
>>> Softmax(temperature=temperature, activations=activations).compute()
array([0.45352133, 0.54647867])
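
A second, minimal sketch for the one-value-per-outcome case described in the Notes: the 1D values are reshaped to (3, 1) before being passed as activations. The probabilities noted in the comment are approximate and follow from the formula above.

>>> values = np.array([0.1, 0.5, 0.22])
>>> activations = values.reshape(3, 1)
>>> Softmax(temperature=1, activations=activations).compute()  # approx. [0.276, 0.412, 0.312]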

choice()

Choose an action based on the computed policies.

Returns:

Name Type Description
action int

The chosen action based on the computed policies.

compute()

Compute the policies based on the activations and temperature.

Returns:

Type Description
numpy.ndarray

The computed policies for each outcome.

irreducible_noise()

Extended softmax function for computing policies based on activations, parameterised by inverse temperature and irreducible noise.

The softmax function with irreducible noise is defined as:

(e^(beta * x) / sum(e^(beta * x))) * (1 - xi) + (xi / length(x)),

where x is the input array of activations, beta is the inverse temperature parameter, and xi is the irreducible noise parameter.

Notes

The irreducible noise parameter xi accounts for attentional lapses in the choice process. Specifically, the terms (1 - xi) and xi / length(x) proportionally scale the choice probabilities towards 1/length(x). Relatively speaking, this increases the probability that an option is selected when its activation is exceptionally low. This may seem counterintuitive in theory, but in practice it enables the model to capture highly surprising responses that can occur during attentional lapses.

Returns:

Type Description
numpy.ndarray

The computed policies with irreducible noise applied.

Examples:

>>> activations = np.array([[0.1, 0, 0.2], [-0.6, 0, 0.9]])
>>> noisy_softmax = Softmax(temperature=1.5, xi=0.1, activations=activations)
>>> noisy_softmax.irreducible_noise()
array([0.4101454, 0.5898546])

cpm.models.learning

DeltaRule(alpha=None, zeta=None, weights=None, feedback=None, input=None, **kwargs)

DeltaRule class computes the prediction error for a given input and target value.

Parameters:

Name Type Description Default
alpha float

The learning rate.

None
zeta float, optional

The constant fraction of the magnitude of the prediction error.

None
weights array-like

The value matrix, where rows are outcomes and columns are stimuli or features. The values can be anything; for example belief values, association weights, connection weights, Q-values.

None
feedback array-like

The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict.

None
input array-like

The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 or 1.

None
**kwargs dict, optional

Additional keyword arguments.

{}
See Also

cpm.models.learning.SeparableRule : A class representing a learning rule based on the separable error-term of Bush and Mosteller (1951).

Notes

The delta rule uses a summed error term, which means that the error is defined as the difference between the target value for a given output unit and the summed activation of all values for that output unit available on the current trial/state. For a separable error term, see the Bush and Mosteller (1951) rule.

The current implementation is based on Gluck and Bower's (1988) delta rule, an extension of the Rescorla and Wagner (1972) learning rule to multi-outcome learning.

Examples:

>>> import numpy as np
>>> from cpm.models.learning import DeltaRule
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([1, 0])
>>> input = np.array([1, 1, 0])
>>> delta_rule = DeltaRule(alpha=0.1, zeta=0.1, weights=weights, feedback=teacher, input=input)
>>> delta_rule.compute()
array([[ 0.07,  0.07,  0.  ],
       [-0.09, -0.09, -0.  ]])
>>> delta_rule.noisy_learning_rule()
array([[ 0.05755793,  0.09214091,  0.],
       [-0.08837513, -0.1304325 ,  0.]])

This implementation generalises to n-dimensional matrices, which means that it can be applied to both single- and multi-outcome learning paradigms.

>>> weights = np.array([0.1, 0.6, 0., 0.3])
>>> teacher = np.array([1])
>>> input = np.array([1, 1, 0, 0])
>>> delta_rule = DeltaRule(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> delta_rule.compute()
array([[0.03, 0.03, 0.  , 0.  ]])
References

Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117(3), 227–247.

Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.

Widrow, B., & Hoff, M. E. (1960, August). Adaptive switching circuits. In IRE WESCON convention record (Vol. 4, No. 1, pp. 96-104).

compute()

Compute the prediction error using the delta learning rule. It is based on Gluck and Bower's (1988) delta rule, an extension of Rescorla and Wagner (1972), which in turn is formally identical to the rule of Widrow and Hoff (1960).

Returns:

Type Description
ndarray

The prediction error for each stimulus-outcome mapping. It has the same shape as the weights input argument.

noisy_learning_rule()

Add random noise to the prediction error computed from the delta learning rule, as specified in Findling et al. (2019). It is inspired by Weber's law of intensity sensation.

Returns:

Type Description
ndarray

The prediction error for each stimulus-outcome mapping with learning noise. It has the same shape as the weights input argument.

References

Findling, C., Skvortsova, V., Dromnelle, R., Palminteri, S., and Wyart, V. (2019). Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nature Neuroscience 22, 2066–2077

reset()

Reset the weights to zero.

HumbleTeacher(alpha=None, weights=None, feedback=None, input=None, **kwargs)

A humble teacher learning rule (Kruschke, 1992; Love, Gureckis, and Medin, 2004) for multi-dimensional outcome learning.

Attributes:

Name Type Description
alpha float

The learning rate.

input ndarray or array_like

The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 or 1.

weights ndarray

The weights value. A 2D array of weights, where each row represents an outcome and each column represents a single stimulus.

teacher ndarray

The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict.

shape tuple

The shape of the weight matrix.

Parameters:

Name Type Description Default
alpha float

The learning rate.

None
weights array-like

The weights value. A 2D array of weights, where each row represents an outcome and each column represents a single stimulus.

None
feedback array-like

The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict.

None
input array-like

The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 or 1.

None
**kwargs dict, optional

Additional keyword arguments.

{}
Notes

The humble teacher learning rule is based on the idea that output-node activations larger than the teaching signal should not be counted as error, but should instead be rewarded. The humble teacher therefore treats teaching signals as discrete (nominal) values: they do not indicate the degree of membership between stimulus and outcome label, the degree of causality between stimulus and outcome, or the degree of correctness of the output.

References

Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.

Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: A network model of category learning. Psychological Review, 111(2), 309–332.

Examples:

>>> import numpy as np
>>> from cpm.models.learning import HumbleTeacher
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([0, 1])
>>> input = np.array([1, 1, 1])
>>> humble_teacher = HumbleTeacher(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> humble_teacher.compute()
array([[-0.06, -0.06, -0.06],
       [ 0.  ,  0.  ,  0.  ]])

compute()

Compute the updated weights using the humble teacher learning rule.

Returns:

Name Type Description
weights numpy.ndarray

The updated weights matrix.

KernelUpdate(response, alpha, kernel, input, **kwargs)

A class representing a learning rule for updating the choice kernel as specified by Equation 5 in Wilson and Collins (2019).

Parameters:

Name Type Description Default
response ndarray

The response vector. It must be a binary numpy.ndarray, so that each element corresponds to a response option. If there are 4 response options, and the second was selected, it would be represented as [0, 1, 0, 0].

required
alpha float

The kernel learning rate.

required
kernel ndarray

The kernel used for learning. It is a 1D array of kernel values, where each element corresponds to a response option. Each element must correspond to the same response option in the response vector.

required
Notes

The kernel update component is used to represent how likely a given response is to be chosen based on the frequency it was chosen in the past. This can then be integrated into a choice kernel decision policy.

See Also

cpm.models.decision.ChoiceKernel : A class representing a choice kernel decision policy.

References

Wilson, R. C., & Collins, A. G. E. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, Article e49547.
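
Examples:

A minimal usage sketch. The output is omitted, and the value passed to input is a placeholder, since its expected content is not specified in this reference.

>>> import numpy as np
>>> from cpm.models.learning import KernelUpdate
>>> response = np.array([0, 1, 0, 0])
>>> kernel = np.array([0.25, 0.25, 0.25, 0.25])
>>> update = KernelUpdate(response=response, alpha=0.1, kernel=kernel, input=response)
>>> update.compute()  # the change to apply to the kernel values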

compute()

Compute the change in the kernel based on the given response, rate, and kernel, and return the updated kernel.

Returns:

Name Type Description
output numpy.ndarray

The computed change of the kernel.

config()

Get the configuration of the kernel update component.

Returns:

Name Type Description
config dict

A dictionary containing the configuration parameters of the kernel update component.

  • response (float): The response of the system.
  • rate (float): The learning rate.
  • kernel (list): The kernel used for learning.
  • input (str): The name of the input.
  • name (str): The name of the kernel update component class.
  • type (str): The type of the kernel update component.

QLearningRule(alpha=0.5, gamma=0.1, values=None, reward=None, maximum=None, *args, **kwargs)

Q-learning rule (Watkins, 1989) for a one-dimensional array of Q-values.

Parameters:

Name Type Description Default
alpha float

The learning rate. Default is 0.5.

0.5
gamma float

The discount factor. Default is 0.1.

0.1
values ndarray

The values array. A 1D array of Q-values active for the current state, where each element corresponds to an action.

None
reward float

The reward received on the current state.

None
maximum float

The maximum estimated reward for the next state.

None
Notes

The Q-learning rule is a model-free reinforcement learning algorithm that is used to learn the value of an action in a given state. It is defined as

Q(s, a) = Q(s, a) + alpha * (r + gamma * max(Q(s', a')) - Q(s, a)),

where Q(s, a) is the value of action a in state s, r is the reward received on the current state, gamma is the discount factor, and max(Q(s', a')) is the maximum estimated reward for the next state.

Examples:

>>> import numpy as np
>>> from cpm.models.learning import QLearningRule
>>> values = np.array([1, 0.5, 0.99])
>>> component = QLearningRule(alpha=0.1, gamma=0.8, values=values, reward=1, maximum=10)
>>> component.compute()
array([1.8  , 1.35 , 1.791])
References

Watkins, C. J. C. H. (1989). Learning from delayed rewards (Doctoral dissertation, University of Cambridge).

Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine learning, 8, 279-292.

compute()

Compute the change in values based on the given values, reward, and parameters, and return the updated values.

Returns:

Name Type Description
output numpy.ndarray

The computed output values.

SeparableRule(alpha=None, zeta=None, weights=None, feedback=None, input=None, **kwargs)

A class representing a learning rule based on the separable error-term of Bush and Mosteller (1951).

Parameters:

Name Type Description Default
alpha float

The learning rate.

None
zeta float, optional

The constant fraction of the magnitude of the prediction error, also called Weber's scaling.

None
weights array-like

The value matrix, where rows are outcomes and columns are stimuli or features. The values can be anything; for example belief values, association weights, connection weights, Q-values.

None
feedback array-like, optional

The target values or feedback, sometimes referred to as teaching signals. These are the values that the algorithm should learn to predict.

None
input array-like, optional

The input value. The stimulus representation in the form of a 1D array, where each element can take a value of 0 or 1.

None
**kwargs dict, optional

Additional keyword arguments.

{}
See Also

cpm.models.learning.DeltaRule : An extension of the Rescorla and Wagner (1972) learning rule by Gluck and Bower (1988) to allow multi-outcome learning.

Notes

This type of learning rule was among the earliest formal models of associative learning (Le Pelley, 2004), which were based on standard linear operators (Bush & Mosteller, 1951; Estes, 1950; Kendler, 1971).

References

Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58, 313–323.

Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57, 94–107.

Kendler, T. S. (1971). Continuity theory and cue dominance. In J. T. Spence (Ed.), Essays in neobehaviorism: A memorial volume to Kenneth W. Spence. New York: Appleton-Century-Crofts.

Le Pelley, M. E. (2004). The role of associative history in models of associative learning: A selective review and a hybrid model. Quarterly Journal of Experimental Psychology Section B, 57(3), 193-243.
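
Examples:

A minimal usage sketch mirroring the DeltaRule example above. The output is omitted, since it depends on the exact form of the separable error term.

>>> import numpy as np
>>> from cpm.models.learning import SeparableRule
>>> weights = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> teacher = np.array([1, 0])
>>> input = np.array([1, 1, 0])
>>> rule = SeparableRule(alpha=0.1, weights=weights, feedback=teacher, input=input)
>>> rule.compute()  # prediction error with the same shape as weights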

compute()

Computes the prediction error using the learning rule.

Returns:

Type Description
ndarray

The prediction error for each stimulus-outcome mapping. It has the same shape as the weights input argument.

noisy_learning_rule()

Add random noise to the prediction error computed from the separable learning rule, as specified in Findling et al. (2019). It is inspired by Weber's law of intensity sensation.

Returns:

Type Description
ndarray

The prediction error for each stimulus-outcome mapping with learning noise. It has the same shape as the weights input argument.

References

Findling, C., Skvortsova, V., Dromnelle, R., Palminteri, S., and Wyart, V. (2019). Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nature Neuroscience 22, 2066–2077

reset()

Resets the weights to zero.