cpm.utils

cpm.utils.metad

cpm.utils.metad.count_trials(data=pd.DataFrame, stimuli='Stimuli', responses='Responses', accuracy='Accuracy', confidence='Confidence', nRatings=4, padding=False, padAmount=None)

Convert raw behavioral data to nR_S1 and nR_S2 response counts.

Given data from an experiment where an observer discriminates between two stimulus alternatives on every trial and provides confidence ratings, this function converts the trial-by-trial information for N trials into response counts.

Parameters:
  • data

Dataframe containing the stimuli, responses, accuracy, and confidence ratings.

  • stimuli (str, default: 'Stimuli' ) –

    Stimuli ID (0 or 1). If a dataframe is provided, should be the name of the column containing the stimuli ID. Default is 'Stimuli'.

  • responses (str, default: 'Responses' ) –

Response (0 or 1). If a dataframe is provided, should be the name of the column containing the responses. Default is 'Responses'.

  • accuracy (str, default: 'Accuracy' ) –

    Response accuracy (0 or 1). If a dataframe is provided, should be the name of the column containing the response accuracy. Default is 'Accuracy'.

  • confidence (str, default: 'Confidence' ) –

    Confidence ratings. If a dataframe is provided, should be the name of the column containing the confidence ratings. Default is 'Confidence'.

  • nRatings (int, default: 4 ) –

Total number of subjective ratings available to the subject, e.g. if the subject can rate confidence on a scale of 1-4, then nRatings = 4. Default is 4.

  • padding (bool, default: False ) –

If True, each response count in the output has the value of padAmount added to it. Padding cells is desirable if trial counts of 0 interfere with model fitting. If False, trial counts are not manipulated and 0s may be present in the response count output. Default is False.

  • padAmount (float, default: None ) –

The value to add to each response count if padding is True. Default value is 1/(2*nRatings).

Returns:
  • nR_S1, nR_S2 :

    Vectors containing the total number of responses in each accuracy category, conditional on presentation of S1 and S2.
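The effect of padding on the returned count vectors can be sketched as follows. This is an illustrative example only: the count values are made up, and only the padAmount default of 1/(2*nRatings) is taken from the documentation above.

```python
import numpy as np

nRatings = 4

# Hypothetical response counts for one stimulus class; with nRatings = 4
# the vector has 2 * nRatings = 8 cells, and one cell happens to be 0.
nR_S1 = np.array([100, 50, 20, 10, 5, 1, 0, 2], dtype=float)

# Documented default pad amount: 1 / (2 * nRatings)
padAmount = 1 / (2 * nRatings)
print(padAmount)  # 0.125

# Padding adds padAmount to every cell, so zero counts become non-zero,
# which can help when fitting models that cannot handle empty cells.
nR_S1_padded = nR_S1 + padAmount
print(nR_S1_padded.min())  # 0.125
```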

Notes

All trials where the stimulus is not 0 or 1, accuracy is not 0 or 1, or confidence is outside the range [1, nRatings] are automatically omitted.

The inputs can be responses, accuracy, or both. If both responses and accuracy are provided, the function will check them for consistency. If only accuracy is provided, the responses vector will be inferred automatically.

If nR_S1 = [100 50 20 10 5 1], then when stimulus S1 was presented, the subject had the following accuracy counts:

  • responded S1, confidence=3 : 100 times
  • responded S1, confidence=2 : 50 times
  • responded S1, confidence=1 : 20 times
  • responded S2, confidence=1 : 10 times
  • responded S2, confidence=2 : 5 times
  • responded S2, confidence=3 : 1 time

The ordering of accuracy / confidence counts for S2 should be the same as it is for S1. e.g. if nR_S2 = [3 7 8 12 27 89], then when stimulus S2 was presented, the subject had the following accuracy counts:

  • responded S1, confidence=3 : 3 times
  • responded S1, confidence=2 : 7 times
  • responded S1, confidence=1 : 8 times
  • responded S2, confidence=1 : 12 times
  • responded S2, confidence=2 : 27 times
  • responded S2, confidence=3 : 89 times
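The ordering above can be captured by a small helper that maps one trial to its position in an nR_S1/nR_S2-style vector. This is an illustrative sketch, not part of cpm.utils; the helper name count_index is ours.

```python
def count_index(response, confidence, nRatings):
    """Map one trial to its slot in an nR_S1/nR_S2-style count vector.

    Slots run from "responded S1, highest confidence" (index 0) down to
    "responded S1, confidence 1" (index nRatings - 1), then from
    "responded S2, confidence 1" up to "responded S2, highest confidence"
    (index 2 * nRatings - 1).
    """
    if response == 0:                  # responded S1: high confidence first
        return nRatings - confidence
    return nRatings - 1 + confidence   # responded S2: low confidence first

# With a 1-3 confidence scale (nRatings = 3), reproducing the ordering above:
print(count_index(0, 3, 3))  # 0 -> "responded S1, confidence=3"
print(count_index(0, 1, 3))  # 2 -> "responded S1, confidence=1"
print(count_index(1, 1, 3))  # 3 -> "responded S2, confidence=1"
print(count_index(1, 3, 3))  # 5 -> "responded S2, confidence=3"
```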

Examples:

>>> import pandas as pd
>>> from cpm.utils.metad import count_trials
>>> data = pd.DataFrame({
...     "Stimuli": [0, 1, 0, 0, 1, 1, 1, 1],
...     "Accuracy": [0, 1, 1, 1, 0, 0, 1, 1],
...     "Confidence": [1, 2, 3, 4, 4, 3, 2, 1],
... })
>>> nR_S1, nR_S2 = count_trials(data=data, nRatings=4)
>>> print(nR_S1, nR_S2)

Reference

This function is adapted from the Python version of trials2counts.m by Maniscalco & Lau [1] retrieved at: http://www.columbia.edu/~bsm2105/type2sdt/trials2counts.py

.. [1] Maniscalco, B., & Lau, H. (2012). A signal detection theoretic approach for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition, 21(1), 422–430. https://doi.org/10.1016/j.concog.2011.09.021

cpm.utils.metad.bin_ratings(ratings=None, nbins=4, verbose=True, ignore_invalid=False)

Convert from continuous to discrete ratings.

Ratings are resampled if quantiles are equal at the high or low end, to ensure proper assignment of binned confidence ratings.

Parameters:
  • ratings (list | ndarray, default: None ) –

    Ratings on a continuous scale.

  • nbins (int, default: 4 ) –

    The number of discrete ratings to resample. Default set to 4.

  • verbose (boolean, default: True ) –

If True, warnings will be raised.

  • ignore_invalid (bool, default: False ) –

If False (default), an error will be raised if the confidence ratings cannot be discretised. This mostly occurs when many ratings are identical, in which case SDT measures should not be extracted from the data. If True, the discretisation will proceed anyway. This option can be useful for plotting.

Returns:
  • discreteRatings( ndarray ) –

    New rating array only containing integers between 1 and nbins.

  • out( dict ) –

Dictionary containing logs of the discretisation process:

      • 'confBins': list or 1d array-like - If the ratings were resampled, a list containing the new ratings and the new low or high threshold, appended before or after the ratings, respectively. Otherwise, only the ratings.
      • 'rebin': boolean - If True, the ratings were resampled due to a larger number of high or low ratings.
      • 'binCount': int - Number of bins.

Warning

This function will automatically control for bias in high or low confidence ratings. If the first two or the last two quantiles have identical values, low or high confidence trials are excluded (respectively), and the function is run again on the remaining data.
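Quantile-based discretisation of this kind can be sketched with plain numpy. This is an illustrative sketch of the general technique, not the exact implementation used by bin_ratings; the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.uniform(0, 100, size=80)  # continuous confidence ratings
nbins = 4

# Quantile edges: nbins + 1 boundaries spanning the observed ratings,
# so each bin receives roughly the same number of trials.
edges = np.quantile(ratings, np.linspace(0, 1, nbins + 1))

# Assign each rating to a bin 1..nbins; clip so the minimum and maximum
# ratings land in the bottom and top bins respectively.
discrete = np.clip(np.searchsorted(edges, ratings, side="left"), 1, nbins)

print(np.unique(discrete))  # [1 2 3 4]
```

A real implementation must additionally handle ties at the quantile edges (many identical ratings), which is exactly the case the warning above describes.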

Raises:
  • ValueError:

If the confidence ratings contain many identical values and ignore_invalid is False.

Examples:

>>> import numpy as np
>>> from cpm.utils.metad import bin_ratings
>>> ratings = np.array([
...     96, 98, 95, 90, 32, 58, 77,  6, 78, 78, 62, 60, 38, 12,
...     63, 18, 15, 13, 49, 26,  2, 38, 60, 23, 25, 39, 22, 33,
...     32, 27, 40, 13, 35, 16, 35, 73, 50,  3, 40, 0, 34, 47,
...     52,  0,  0,  0, 25,  1, 16, 37, 59, 20, 25, 23, 45, 22,
...     28, 62, 61, 69, 20, 75, 10, 18, 61, 27, 63, 22, 54, 30,
...     36, 66, 14,  2, 53, 58, 88, 23, 77, 54])
>>> bin_ratings(ratings)
(array([4, 4, 4, 4, 2, 3, 4, 1, 4, 4, 4, 4, 3, 1, 4, 1, 1, 1, 3, 2, 1, 3,
    4, 2, 2, 3, 2, 2, 2, 2, 3, 1, 3, 1, 3, 4, 3, 1, 3, 1, 2, 3, 3, 1,
    1, 1, 2, 1, 1, 3, 3, 2, 2, 2, 3, 2, 2, 4, 4, 4, 2, 4, 1, 1, 4, 2,
    4, 2, 3, 2, 3, 4, 1, 1, 3, 3, 4, 2, 4, 3]),
{'confBins': array([ 0., 20., 35., 60., 98.]), 'rebin': 0, 'binCount': 21})

cpm.utils.data

cpm.utils.data.convert_to_RLRW(data, human_response, reward, stimulus, participant, **kwargs)

Convert a pandas DataFrame into a format compatible with the RLRW wrapper. Given a DataFrame and the column names for the human response, reward, stimulus, and participant identifier, this function returns a new DataFrame structured for use with the RLRW wrapper. It checks that the specified columns exist in the input data and raises informative error messages if any are missing. When multiple columns are provided for stimulus, human response, or reward, it creates new column names matching the number of columns provided.

Parameters:
  • data (DataFrame) –

    The pandas DataFrame to convert.

  • human_response (str or list) –

    The column name for the human response.

  • reward (str or list) –

    The column name(s) for the rewards in an order corresponding to the stimulus columns.

  • stimulus (str or list) –

    The column name(s) for the stimulus in an order corresponding to the reward columns.

  • participant (str) –

    The column name for the participant identifier.

  • kwargs (dict, default: {} ) –

    Any other keyword arguments to pass to the pandas DataFrame.

Returns:
  • DataFrame

    The pandas.DataFrame compatible with the RLRW wrapper, containing the stimulus, human response, and reward columns.

Examples:

>>> import pandas as pd
>>> from cpm.utils.data import convert_to_RLRW
>>> data = pd.DataFrame({
...     "participant_id": [1, 1, 2],
...     "stim_left": [0, 1, 0],
...     "stim_right": [1, 0, 1],
...     "choice": [1, 0, 1],
...     "reward_left": [1, 0, 1],
...     "reward_right": [0, 1, 0],
...     "block": [1, 1, 2],
        "condition": ["A", "A", "B"]
... })
>>> output = convert_to_RLRW(
...     data=data,
...     human_response="choice",
...     reward=["reward_left", "reward_right"],
...     stimulus=["stim_left", "stim_right"],
...     participant="participant_id",
...     block="block",
...     condition="condition"
... )
>>> print(output)
   arm_0  arm_1  response  reward_0  reward_1  participant  block condition
0      0      1         1         1         0            1      1         A
1      1      0         0         0         1            1      1         A
2      0      1         1         1         0            2      2         B

See also

cpm.applications.reinforcement_learning.RLRW: The RLRW wrapper that this function is designed to be compatible with.