cpm.optimisation
cpm.optimisation.DifferentialEvolution(model=None, data=None, minimisation=minimise.LogLikelihood.bernoulli, prior=False, parallel=False, cl=None, libraries=['numpy', 'pandas'], ppt_identifier=None, display=False, **kwargs)
Class representing the Differential Evolution optimization algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | cpm.generators.Wrapper | The model to be optimized. | None |
data | pd.DataFrame, pd.DataFrameGroupBy, list | The data used for optimization. If a pd.DataFrame, it is grouped by the ppt_identifier. | None |
minimisation | function | The loss function for the objective minimization function. | minimise.LogLikelihood.bernoulli |
prior | bool | Whether to include priors in the optimisation. | False |
parallel | bool | Whether to use parallel processing. | False |
cl | int | The number of cores to use for parallel processing. | None |
libraries | list, optional | The libraries to import for parallel processing. | ['numpy', 'pandas'] |
ppt_identifier | str | The key in the participant data dictionary that contains the participant identifier. | None |
**kwargs | dict | Additional keyword arguments. | {} |
Notes
The data parameter must contain all input to the model, including the observed data. It can be a pandas DataFrame, a pandas DataFrameGroupBy object, or a list of dictionaries. If it is a pandas DataFrame, the data are grouped by the participant identifier, ppt_identifier. If it is a pandas DataFrameGroupBy object, the groups are assumed to be participants. If it is a list of dictionaries, each dictionary should contain the data for a single participant, including information about the experiment and the results. The observed data for each participant should be included under the key or column 'observed', which must correspond, both in format and shape, to the 'dependent' variable calculated by the model Wrapper.
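The sketch below illustrates this layout with a long-format DataFrame and the documented constructor arguments. The model wrapper (here called wrapper) and the column names other than 'observed' and the participant identifier are placeholders; a real fit needs a cpm.generators.Wrapper built for your model.

```python
import numpy as np
import pandas as pd

from cpm.optimisation import DifferentialEvolution, minimise

# Illustrative long-format data: one row per trial, with a binary
# 'observed' column matching the model's 'dependent' output in shape.
data = pd.DataFrame({
    "ppt": np.repeat([1, 2], 4),            # participant identifier
    "stimulus": [1, 2, 1, 2, 1, 2, 1, 2],   # placeholder model input
    "observed": [1, 0, 1, 1, 0, 1, 0, 0],   # observed binary responses
})

# `wrapper` is a placeholder for a cpm.generators.Wrapper around your model.
fit = DifferentialEvolution(
    model=wrapper,
    data=data,                # grouped by ppt_identifier internally
    minimisation=minimise.LogLikelihood.bernoulli,
    ppt_identifier="ppt",
)
fit.optimise()                # fit each participant
estimates = fit.export()      # fitted parameters as a pandas.DataFrame
```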
export(details=False)
Exports the optimization results and fitted parameters as a pandas.DataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
details | bool | Whether to include the various metrics related to the optimisation routine in the output. | False |
Returns:
Type | Description |
---|---|
pandas.DataFrame | A pandas DataFrame containing the optimization results and fitted parameters. If details is True, metrics from the optimisation routine are included as well. |
Notes
The DataFrame will not contain the population and population_energies keys from the optimization details. If you want to investigate them, please use the details attribute.
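A quick sketch of the difference, continuing the fit object from the example above:

```python
estimates = fit.export()              # fitted parameters only
detailed = fit.export(details=True)   # adds optimiser metrics, but still omits
                                      # population and population_energies
raw = fit.details                     # full optimiser output, including those keys
```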
optimise()
Performs the optimization process.
Returns:
Type | Description |
---|---|
None |  |
reset()
Resets the optimization results and fitted parameters.
Returns: - None
cpm.optimisation.Fmin(model=None, data=None, initial_guess=None, minimisation=None, cl=None, parallel=False, libraries=['numpy', 'pandas'], prior=False, number_of_starts=1, ppt_identifier=None, display=False, **kwargs)
Class representing the Fmin search (unbounded) optimization algorithm using a downhill simplex.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | cpm.generators.Wrapper | The model to be optimized. | None |
data | pd.DataFrame, pd.DataFrameGroupBy, list | The data used for optimization. If a pd.DataFrame, it is grouped by the ppt_identifier. | None |
minimisation | function | The loss function for the objective minimization function. | None |
prior | bool | Whether to include the prior in the optimization. | False |
number_of_starts | int | The number of random initialisations for the optimization. | 1 |
initial_guess | list or array-like | The initial guess for the optimization. | None |
parallel | bool | Whether to use parallel processing. | False |
cl | int | The number of cores to use for parallel processing. | None |
libraries | list, optional | The libraries to import for parallel processing. | ['numpy', 'pandas'] |
ppt_identifier | str | The key in the participant data dictionary that contains the participant identifier. | None |
**kwargs | dict | Additional keyword arguments. | {} |
Notes
The data parameter must contain all input to the model, including the observed data. It can be a pandas DataFrame, a pandas DataFrameGroupBy object, or a list of dictionaries. If it is a pandas DataFrame, the data are grouped by the participant identifier, ppt_identifier. If it is a pandas DataFrameGroupBy object, the groups are assumed to be participants. If it is a list of dictionaries, each dictionary should contain the data for a single participant, including information about the experiment and the results. The observed data for each participant should be included under the key or column 'observed', which must correspond, both in format and shape, to the 'dependent' variable calculated by the model Wrapper.
The optimization process is repeated number_of_starts times, and only the output of the best-fitting start is stored.
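A minimal multi-start sketch, under the same placeholder assumptions as the DifferentialEvolution example above (a wrapper built with cpm.generators.Wrapper and a long-format data frame with 'observed' and 'ppt' columns); the choice of loss function is illustrative:

```python
from cpm.optimisation import Fmin, minimise

fit = Fmin(
    model=wrapper,                                  # placeholder cpm.generators.Wrapper
    data=data,                                      # long-format data with 'observed' and 'ppt'
    minimisation=minimise.LogLikelihood.bernoulli,  # illustrative loss function
    number_of_starts=5,                             # five random initialisations per participant
    ppt_identifier="ppt",
)
fit.optimise()            # only the best-fitting start is kept
results = fit.export()
```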
export()
Exports the optimization results and fitted parameters as a pandas.DataFrame.
Returns:
Type | Description |
---|---|
pandas.DataFrame | A pandas DataFrame containing the optimization results and fitted parameters. |
optimise()
Performs the optimization process.
Returns: - None
reset(initial_guess=True)
Resets the optimization results and fitted parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
initial_guess | bool, optional | Whether to reset the initial guess (generates a new set of random numbers within parameter bounds). | True |
Returns:
Type | Description |
---|---|
None |  |
cpm.optimisation.FminBound(model=None, data=None, initial_guess=None, number_of_starts=1, minimisation=None, cl=None, parallel=False, libraries=['numpy', 'pandas'], prior=False, ppt_identifier=None, display=False, **kwargs)
Class representing the Fmin search (bounded) optimization algorithm using the L-BFGS-B method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | cpm.generators.Wrapper | The model to be optimized. | None |
data | pd.DataFrame, pd.DataFrameGroupBy, list | The data used for optimization. If a pd.DataFrame, it is grouped by the ppt_identifier. | None |
minimisation | function | The loss function for the objective minimization function. | None |
prior | bool | Whether to include the prior in the optimization. | False |
number_of_starts | int | The number of random initialisations for the optimization. | 1 |
initial_guess | list or array-like | The initial guess for the optimization. | None |
parallel | bool | Whether to use parallel processing. | False |
cl | int | The number of cores to use for parallel processing. | None |
libraries | list, optional | The libraries to import for parallel processing. | ['numpy', 'pandas'] |
ppt_identifier | str | The key in the participant data dictionary that contains the participant identifier. | None |
**kwargs | dict | Additional keyword arguments. | {} |
Notes
The data parameter must contain all input to the model, including the observed data. It can be a pandas DataFrame, a pandas DataFrameGroupBy object, or a list of dictionaries. If it is a pandas DataFrame, the data are grouped by the participant identifier, ppt_identifier. If it is a pandas DataFrameGroupBy object, the groups are assumed to be participants. If it is a list of dictionaries, each dictionary should contain the data for a single participant, including information about the experiment and the results. The observed data for each participant should be included under the key or column 'observed', which must correspond, both in format and shape, to the 'dependent' variable calculated by the model Wrapper.
The optimization process is repeated number_of_starts times, and only the output of the best-fitting start is stored.
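A sketch of a parallelised fit, under the same placeholder assumptions as above; the core count is arbitrary:

```python
from cpm.optimisation import FminBound, minimise

fit = FminBound(
    model=wrapper,                   # placeholder cpm.generators.Wrapper
    data=data,                       # long-format data with 'observed' and 'ppt'
    minimisation=minimise.LogLikelihood.bernoulli,
    number_of_starts=3,
    parallel=True,                   # fit participants in parallel
    cl=4,                            # number of cores to use
    libraries=["numpy", "pandas"],   # imported for the parallel workers
    ppt_identifier="ppt",
)
fit.optimise()
results = fit.export()
```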
export()
Exports the optimization results and fitted parameters as a pandas.DataFrame.
Returns:
Type | Description |
---|---|
pandas.DataFrame | A pandas DataFrame containing the optimization results and fitted parameters. |
optimise(display=True)
Performs the optimization process.
Returns: - None
reset(initial_guess=True)
Resets the optimization results and fitted parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
initial_guess | bool, optional | Whether to reset the initial guess (generates a new set of random numbers within parameter bounds). | True |
Returns:
Type | Description |
---|---|
None |  |
cpm.optimisation.Minimize(model=None, data=None, initial_guess=None, minimisation=None, method='Nelder-Mead', cl=None, parallel=False, libraries=['numpy', 'pandas'], prior=False, number_of_starts=1, ppt_identifier=None, display=False, **kwargs)
Class representing scipy's Minimize algorithm wrapped for subject-level parameter estimations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | cpm.generators.Wrapper | The model to be optimized. | None |
data | pd.DataFrame, pd.DataFrameGroupBy, list | The data used for optimization. If a pd.DataFrame, it is grouped by the ppt_identifier. | None |
minimisation | function | The loss function for the objective minimization function. | None |
number_of_starts | int | The number of random initialisations for the optimization. | 1 |
initial_guess | list or array-like | The initial guess for the optimization. | None |
parallel | bool | Whether to use parallel processing. | False |
cl | int | The number of cores to use for parallel processing. | None |
libraries | list, optional | The libraries to import for parallel processing. | ['numpy', 'pandas'] |
ppt_identifier | str | The key in the participant data dictionary that contains the participant identifier. | None |
**kwargs | dict | Additional keyword arguments. | {} |
Notes
The data parameter must contain all input to the model, including the observed data. It can be a pandas DataFrame, a pandas DataFrameGroupBy object, or a list of dictionaries. If it is a pandas DataFrame, the data are grouped by the participant identifier, ppt_identifier. If it is a pandas DataFrameGroupBy object, the groups are assumed to be participants. If it is a list of dictionaries, each dictionary should contain the data for a single participant, including information about the experiment and the results. The observed data for each participant should be included under the key or column 'observed', which must correspond, both in format and shape, to the 'dependent' variable calculated by the model Wrapper.
The optimization process is repeated number_of_starts times, and only the output of the best-fitting start is stored.
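Because this class wraps scipy's minimize routine, the method argument in the constructor (default 'Nelder-Mead') is presumably passed through to scipy. A sketch under the same placeholder assumptions as above:

```python
from cpm.optimisation import Minimize, minimise

fit = Minimize(
    model=wrapper,                   # placeholder cpm.generators.Wrapper
    data=data,                       # long-format data with 'observed' and 'ppt'
    minimisation=minimise.LogLikelihood.bernoulli,
    method="Nelder-Mead",            # presumably forwarded to scipy.optimize.minimize
    number_of_starts=2,
    ppt_identifier="ppt",
)
fit.optimise()
results = fit.export()
```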
export()
Exports the optimization results and fitted parameters as a pandas.DataFrame.
Returns:
Type | Description |
---|---|
pandas.DataFrame | A pandas DataFrame containing the optimization results and fitted parameters. |
optimise()
Performs the optimization process.
Returns: - None
reset(initial_guess=True)
Resets the optimization results and fitted parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
initial_guess | bool, optional | Whether to reset the initial guess (generates a new set of random numbers within parameter bounds). | True |
Returns:
Type | Description |
---|---|
None |  |
cpm.optimisation.Bads(model=None, data=None, minimisation=minimise.LogLikelihood.continuous, prior=False, number_of_starts=1, initial_guess=None, parallel=False, cl=None, libraries=['numpy', 'pandas'], ppt_identifier=None, **kwargs)
Class representing the Bayesian Adaptive Direct Search (BADS) optimization algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | cpm.generators.Wrapper | The model to be optimized. | None |
data | pd.DataFrame, pd.DataFrameGroupBy, list | The data used for optimization. If a pd.DataFrame, it is grouped by the ppt_identifier. | None |
minimisation | function | The loss function for the objective minimization function. | minimise.LogLikelihood.continuous |
prior | bool | Whether to include the prior in the optimization. | False |
number_of_starts | int | The number of random initialisations for the optimization. | 1 |
initial_guess | list or array-like | The initial guess for the optimization. | None |
parallel | bool | Whether to use parallel processing. | False |
cl | int | The number of cores to use for parallel processing. | None |
libraries | list, optional | The libraries required for parallel processing. | ['numpy', 'pandas'] |
ppt_identifier | str | The key in the participant data dictionary that contains the participant identifier. | None |
**kwargs | dict | Additional keyword arguments. | {} |
Notes
The data parameter must contain all input to the model, including the observed data. It can be a pandas DataFrame, a pandas DataFrameGroupBy object, or a list of dictionaries. If it is a pandas DataFrame, the data are grouped by the participant identifier, ppt_identifier. If it is a pandas DataFrameGroupBy object, the groups are assumed to be participants. If it is a list of dictionaries, each dictionary should contain the data for a single participant, including information about the experiment and the results. The observed data for each participant should be included under the key or column 'observed', which must correspond, both in format and shape, to the 'dependent' variable calculated by the model Wrapper.
The optimization process is repeated number_of_starts times, and only the output of the best-fitting start is stored.
The BADS algorithm is designed to handle both deterministic and noisy (stochastic) target functions. A deterministic target function returns exactly the same probability value for a given dataset and proposed set of parameter values; by contrast, a stochastic target function returns varying probability values for the same input (data and parameters). The vast majority of models use a deterministic target function. We recommend that users make this explicit to BADS by providing an options dictionary that includes the key uncertainty_handling set to False. Please see the BADS options documentation for more details.
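A sketch of passing that option, under the same placeholder assumptions as above and assuming that keyword arguments such as options are forwarded to the underlying BADS routine:

```python
from cpm.optimisation import Bads, minimise

fit = Bads(
    model=wrapper,                   # placeholder cpm.generators.Wrapper
    data=data,                       # long-format data with 'observed' and 'ppt'
    minimisation=minimise.LogLikelihood.continuous,
    ppt_identifier="ppt",
    # Declare the target function deterministic; assumes this options dict
    # is forwarded to BADS via **kwargs.
    options={"uncertainty_handling": False},
)
fit.optimise()
results = fit.export()
```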
export()
Exports the optimization results and fitted parameters as a pandas.DataFrame.
Returns:
Type | Description |
---|---|
pandas.DataFrame | A pandas DataFrame containing the optimization results and fitted parameters. |
optimise()
Performs the optimization process.
Returns: - None
reset(initial_guess=True)
Resets the optimization results and fitted parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
initial_guess | bool, optional | Whether to reset the initial guess (generates a new set of random numbers within parameter bounds). | True |
Returns:
Type | Description |
---|---|
None |  |
minimise
cpm.optimisation.minimise.LogLikelihood()
bernoulli(predicted=None, observed=None, negative=True, **kwargs)
Compute the log likelihood of the predicted values given the observed values for Bernoulli data.
Bernoulli(y|p) = p if y = 1 and 1 - p if y = 0
Parameters:
Name | Type | Description | Default |
---|---|---|---|
predicted | array-like | The predicted values. It must have the same shape as observed. | None |
observed | array-like | The observed values. It must have the same shape as predicted. | None |
negative | bool, optional | Flag indicating whether to return the negative log likelihood. | True |
Returns:
Type | Description |
---|---|
float | The summed log likelihood or negative log likelihood. |
Notes
predicted and observed must have the same shape. observed is a binary variable, so it can only take the values 0 or 1. predicted must contain values between 0 and 1. Values are clipped to avoid log(0) and log(1). If any non-finite values are encountered, the corresponding log likelihood is set to np.log(1e-100).
Examples:
>>> import numpy as np
>>> observed = np.array([1, 0, 1, 0])
>>> predicted = np.array([0.7, 0.3, 0.6, 0.4])
>>> LogLikelihood.bernoulli(predicted, observed)
1.7350011354094463
categorical(predicted=None, observed=None, negative=True, **kwargs)
Compute the log likelihood of the predicted values given the observed values for categorical data.
Categorical(y|p) = p_y
Parameters:
Name | Type | Description | Default |
---|---|---|---|
predicted | array-like | The predicted values. It must correspond to observed. | None |
observed | array-like | The observed values. It must correspond to predicted. | None |
negative | bool, optional | Flag indicating whether to return the negative log likelihood. | True |
Returns:
Type | Description |
---|---|
float | The log likelihood or negative log likelihood. |
Notes
predicted and observed must describe the same trials: observed is a vector of integers starting from 0 (the first possible response), where each integer indexes the response observed on that trial, and each row of predicted contains the probabilities of the possible responses for that trial. If there are two choice options, observed has a shape of (n,) and predicted has a shape of (n, 2); the likelihood of each trial is the predicted probability in the column corresponding to the observed value.
Examples:
>>> import numpy as np
>>> observed = np.array([0, 1, 0, 1])
>>> predicted = np.array([[0.7, 0.3], [0.3, 0.7], [0.6, 0.4], [0.4, 0.6]])
>>> LogLikelihood.categorical(predicted, observed)
1.7350011354094463
continuous(predicted, observed, negative=True, **kwargs)
Compute the log likelihood of the predicted values given the observed values for continuous data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
predicted | array-like | The predicted values. | required |
observed | array-like | The observed values. | required |
negative | bool, optional | Flag indicating whether to return the negative log likelihood. | True |
Returns:
Type | Description |
---|---|
float | The summed log likelihood or negative log likelihood. |
Examples:
>>> import numpy as np
>>> observed = np.array([1, 0, 1, 0])
>>> predicted = np.array([0.7, 0.3, 0.6, 0.4])
>>> LogLikelihood.continuous(predicted, observed)
1.7350011354094463
cpm.optimisation.minimise.Distance()
MSE(predicted, observed, **kwargs)
Compute the mean squared error (MSE).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
predicted | array-like | The predicted values. | required |
observed | array-like | The observed values. | required |
Returns:
Type | Description |
---|---|
float | The mean squared error. |
SSE(predicted, observed, **kwargs)
Compute the sum of squared errors (SSE).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
predicted |
array-like
|
The predicted values. |
required |
observed |
array-like
|
The observed values. |
required |
Returns:
Type | Description |
---|---|
float
|
The sum of squared errors. |
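A minimal illustration of both measures, following the calling style of the log-likelihood examples above; the hand computations use the standard definitions, which these functions are assumed to match:

```python
import numpy as np
from cpm.optimisation import minimise

observed = np.array([1.0, 0.0, 1.0, 0.0])
predicted = np.array([0.7, 0.3, 0.6, 0.4])

sse = minimise.Distance.SSE(predicted, observed)   # sum of squared errors
mse = minimise.Distance.MSE(predicted, observed)   # mean squared error

# Hand computation of the same quantities:
sse_by_hand = np.sum((predicted - observed) ** 2)   # 0.09 + 0.09 + 0.16 + 0.16 = 0.5
mse_by_hand = np.mean((predicted - observed) ** 2)  # 0.5 / 4 = 0.125
```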
cpm.optimisation.minimise.Bayesian()
AIC(likelihood, n, k, **kwargs)
Calculate the Akaike Information Criterion (AIC).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
likelihood | float | The log likelihood value. | required |
n | int | The number of data points. | required |
k | int | The number of parameters. | required |
Returns:
Type | Description |
---|---|
float | The AIC value. |
BIC(likelihood, n, k, **kwargs)
Calculate the Bayesian Information Criterion (BIC).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
likelihood | float | The log likelihood value. | required |
n | int | The number of data points. | required |
k | int | The number of parameters. | required |
Returns:
Type | Description |
---|---|
float | The BIC value. |
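Both criteria follow the standard definitions AIC = 2k - 2*log(L) and BIC = k*log(n) - 2*log(L). The sketch below computes them by hand; whether the likelihood argument of Bayesian.AIC and Bayesian.BIC expects the log likelihood or its negative is not stated here, so only the hand computation is shown:

```python
import numpy as np

# Worked example with the standard formulas
# AIC = 2*k - 2*log(L) and BIC = k*log(n) - 2*log(L).
log_likelihood = -1.735   # e.g. the Bernoulli example above (negative LL = 1.735)
n = 4                     # number of data points
k = 1                     # number of free parameters

aic = 2 * k - 2 * log_likelihood          # 2 + 3.47 = 5.47
bic = k * np.log(n) - 2 * log_likelihood  # 1.386 + 3.47 ≈ 4.86
```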