CAC from Contingency Tables

A contingency table, also called a cross tabulation or a frequency table, presents data as a matrix showing the distribution of subjects by rater and category.

Datasets of this kind carry the table_ prefix in the datasets module. One example is table_cont3x3abstractors.

[1]:
from irrCAC.datasets import table_cont3x3abstractors

data = table_cont3x3abstractors()
data
[1]:
         Ectopic  AIU  NIU
Ectopic       13    0    0
AIU            0   20    7
NIU            0    4   56

Initialize CAC

To compute the various agreement coefficients from a contingency table, first initialize a CAC object. Once initialized, the object holds information about the subjects, categories, and weights.

[2]:
from irrCAC.table import CAC

cac_abstractors = CAC(data)
cac_abstractors
[2]:
<irrCAC.table.CAC Subjects: 100, Categories: ['Ectopic', 'AIU', 'NIU'], Weights: "identity">

Brennan-Prediger Coefficient

To calculate the Brennan-Prediger coefficient, call the bp() method.

[3]:
bp = cac_abstractors.bp()
bp
[3]:
{'est': {'coefficient_value': 0.835,
  'coefficient_name': 'Brennan-Prediger',
  'confidence_interval': (0.74187, 0.92813),
  'p_value': 0.0,
  'z': 17.79114,
  'se': 0.04693,
  'pa': 0.89,
  'pe': 0.33333},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
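
The pa and pe values in this output can be checked by hand. The following is a minimal sketch in plain Python, assuming the standard Brennan-Prediger definition (chance agreement of 1/q for q categories) and taking rows as one rater and columns as the other. It does not call irrCAC, and the variable names are illustrative.

```python
# Contingency table from the example above.
# Assumption: rows are one rater, columns are the other.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)  # total number of subjects
q = len(table)                      # number of categories

# Observed agreement: share of subjects on the diagonal.
pa = sum(table[k][k] for k in range(q)) / n

# Brennan-Prediger chance agreement: uniform over the q categories.
pe = 1 / q

bp_value = (pa - pe) / (1 - pe)
print(round(pa, 5), round(pe, 5), round(bp_value, 5))
```

Rounded to five decimals, the three numbers agree with the pa, pe, and coefficient_value entries printed by bp().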

Cohen’s kappa

To calculate Cohen’s kappa coefficient, call the cohen() method.

[4]:
cohen = cac_abstractors.cohen()
cohen
[4]:
{'est': {'coefficient_value': 0.79641,
  'coefficient_name': "Cohen's kappa",
  'confidence_interval': (0.67952, 0.9133),
  'p_value': 0.0,
  'z': 13.51892,
  'se': 0.05891,
  'pa': 0.89,
  'pe': 0.4597},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
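
Cohen’s kappa differs from Brennan-Prediger only in how chance agreement is estimated. As a rough sketch, assuming the textbook definition in which pe is the sum of the products of the two raters’ marginal proportions, the printed values can be reproduced without irrCAC (names are illustrative):

```python
# Contingency table from the example above.
# Assumption: rows are one rater, columns are the other.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

# Marginal proportions for each rater.
row_share = [sum(table[k]) / n for k in range(q)]
col_share = [sum(table[i][k] for i in range(q)) / n for k in range(q)]

pa = sum(table[k][k] for k in range(q)) / n

# Chance agreement: both raters pick category k independently.
pe = sum(r * c for r, c in zip(row_share, col_share))

kappa = (pa - pe) / (1 - pe)
print(round(pe, 5), round(kappa, 5))
```

For this table pe comes out at about 0.4597 and kappa at about 0.79641, in line with the output of cohen().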

Gwet’s AC1/AC2

To calculate Gwet’s coefficient, call the gwet() method. With identity weights, as here, the result is AC1; AC2 is its weighted generalization.

[5]:
gwet = cac_abstractors.gwet()
gwet
[5]:
{'est': {'coefficient_value': 0.84933,
  'coefficient_name': "Gwet's AC1",
  'confidence_interval': (0.76358, 0.93508),
  'p_value': 0.0,
  'z': 19.65248,
  'se': 0.04322,
  'pa': 0.89,
  'pe': 0.26992},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
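
For AC1, chance agreement is built from the average of the two raters’ marginal proportions. The sketch below assumes Gwet’s usual two-rater expression, pe = sum(pi_k * (1 - pi_k)) / (q - 1), and is written in plain Python rather than against the irrCAC API:

```python
# Contingency table from the example above.
# Assumption: rows are one rater, columns are the other.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_share = [sum(table[k]) / n for k in range(q)]
col_share = [sum(table[i][k] for i in range(q)) / n for k in range(q)]

# Average propensity of classifying a subject into category k.
pi = [(r + c) / 2 for r, c in zip(row_share, col_share)]

pa = sum(table[k][k] for k in range(q)) / n
pe = sum(p * (1 - p) for p in pi) / (q - 1)

ac1 = (pa - pe) / (1 - pe)
print(round(pe, 5), round(ac1, 5))
```

The resulting pe is close to 0.26992 and the coefficient close to 0.84933, matching the gwet() output.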

Krippendorff’s Alpha

To calculate Krippendorff’s Alpha coefficient, call the krippendorff() method.

[6]:
alpha = cac_abstractors.krippendorff()
alpha
[6]:
{'est': {'coefficient_value': 0.79726,
  'coefficient_name': "Krippendorff's Alpha",
  'confidence_interval': (0.68008, 0.91444),
  'p_value': 0.0,
  'z': 13.50033,
  'se': 0.05905,
  'pa': 0.89055,
  'pe': 0.46015},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
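
Note that pa here is 0.89055 rather than 0.89. One way to arrive at that figure, offered as an assumption rather than a description of the library’s internals, is a small-sample correction of the form pa = (1 - eps) * pa0 + eps with eps = 1 / (2n), combined with chance agreement based on the squared average marginals:

```python
# Contingency table from the example above.
# Assumption: rows are one rater, columns are the other.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_share = [sum(table[k]) / n for k in range(q)]
col_share = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
pi = [(r + c) / 2 for r, c in zip(row_share, col_share)]

# Uncorrected observed agreement.
pa0 = sum(table[k][k] for k in range(q)) / n

# Assumed small-sample correction for two raters.
eps = 1 / (2 * n)
pa = (1 - eps) * pa0 + eps

# Chance agreement from the averaged marginals.
pe = sum(p * p for p in pi)

alpha = (pa - pe) / (1 - pe)
print(round(pa, 5), round(pe, 5), round(alpha, 5))
```

Under these assumptions the sketch yields roughly 0.89055, 0.46015, and 0.79726, the same figures as the krippendorff() output.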

Percent Agreement

To calculate the Percent Agreement, call the pa2() method.

[7]:
pa2 = cac_abstractors.pa2()
pa2
[7]:
{'est': {'coefficient_value': 0.89,
  'coefficient_name': 'Percent Agreement',
  'confidence_interval': (0.82792, 0.95208),
  'p_value': 0.0,
  'z': 28.44452,
  'se': 0.03129,
  'pa': 0.89,
  'pe': 0},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
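
Percent agreement involves no chance correction, so it is simply the diagonal share of the table. The sketch below also tries a plain binomial standard error, sqrt(pa * (1 - pa) / n); that this is how irrCAC obtains se is an assumption, although the number agrees with the printed value for this table.

```python
import math

# Contingency table from the example above.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

# Share of subjects on which both raters agree exactly.
pa = sum(table[k][k] for k in range(q)) / n

# Assumed standard error: simple binomial form.
se = math.sqrt(pa * (1 - pa) / n)

# Test statistic as the ratio of the estimate to its standard error.
z = pa / se
print(round(pa, 5), round(se, 5), round(z, 3))
```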

Scott’s Pi

To calculate Scott’s Pi, call the scott() method.

[8]:
scott = cac_abstractors.scott()
scott
[8]:
{'est': {'coefficient_value': 0.79624,
  'coefficient_name': "Scott's Pi",
  'confidence_interval': (0.67906, 0.91342),
  'p_value': 0.0,
  'z': 13.48308,
  'se': 0.05905,
  'pa': 0.89,
  'pe': 0},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
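
Scott’s Pi uses the plain observed agreement together with chance agreement based on the squared average marginals. A minimal sketch under that textbook definition (plain Python, illustrative names) is shown below. The output above reports pe as 0, which this formula does not reproduce; the coefficient value itself does match.

```python
# Contingency table from the example above.
# Assumption: rows are one rater, columns are the other.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_share = [sum(table[k]) / n for k in range(q)]
col_share = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
pi = [(r + c) / 2 for r, c in zip(row_share, col_share)]

pa = sum(table[k][k] for k in range(q)) / n

# Textbook chance agreement for Scott's Pi.
pe = sum(p * p for p in pi)

scott_pi = (pa - pe) / (1 - pe)
print(round(pe, 5), round(scott_pi, 5))
```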

Weighted Analysis

We can use custom weights or predefined weight types when initializing a CAC object. For the available weight types, see the Weights module.

In the following example, we initialize a new object on the same data using linear weights.

[9]:
cac_abstractors_linear = CAC(data, weights='linear')
cac_abstractors_linear
[9]:
<irrCAC.table.CAC Subjects: 100, Categories: ['Ectopic', 'AIU', 'NIU'], Weights: "linear">

To see the weight matrix, we can print the weights_mat attribute of the object.

[10]:
cac_abstractors_linear.weights_mat
[10]:
array([[1. , 0.5, 0. ],
       [0.5, 1. , 0.5],
       [0. , 0.5, 1. ]])
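
The matrix above is consistent with the common definition of linear weights, w[i][j] = 1 - |i - j| / (q - 1), which treats the categories as ordered and equally spaced. Whether irrCAC builds it exactly this way is an assumption; the sketch below simply shows that the formula gives the same entries.

```python
q = 3  # number of categories

# Linear weights: full credit on the diagonal, decreasing with distance.
linear = [
    [1 - abs(i - j) / (q - 1) for j in range(q)]
    for i in range(q)
]

for row in linear:
    print(row)
```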

Next, we call the method for the coefficient we want to calculate. Here, for example, we show the weighted Brennan-Prediger coefficient.

[11]:
bp_linear = cac_abstractors_linear.bp()
bp_linear
[11]:
{'est': {'coefficient_value': 0.87625,
  'coefficient_name': 'Brennan-Prediger',
  'confidence_interval': (0.80641, 0.94609),
  'p_value': 0.0,
  'z': 24.8934,
  'se': 0.0352,
  'pa': 0.945,
  'pe': 0.55556},
 'weights': array([[1. , 0.5, 0. ],
        [0.5, 1. , 0.5],
        [0. , 0.5, 1. ]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
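
With weights, both pa and pe change: disagreements earn partial credit, and chance agreement becomes the mean of the weight matrix. The helper below, weighted_bp, is a hypothetical function written for illustration (it is not part of irrCAC) and assumes the usual weighted Brennan-Prediger definitions.

```python
def weighted_bp(table, weights):
    """Return (pa, pe, coefficient) for a weighted Brennan-Prediger sketch."""
    n = sum(sum(row) for row in table)
    q = len(table)
    # Weighted observed agreement: each cell earns credit w[i][j].
    pa = sum(
        weights[i][j] * table[i][j] for i in range(q) for j in range(q)
    ) / n
    # Weighted chance agreement: average weight over all q*q cells.
    pe = sum(sum(row) for row in weights) / q ** 2
    return pa, pe, (pa - pe) / (1 - pe)


table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]
linear = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]

pa, pe, coefficient = weighted_bp(table, linear)
print(round(pa, 5), round(pe, 5), round(coefficient, 5))
```

The 11 off-diagonal subjects each earn half credit, which lifts pa from 0.89 to 0.945, while pe rises from 1/3 to 5/9.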

To use custom weights, pass a list of values or an array.

[12]:
import numpy as np

# Custom 3x3 weight matrix (note that it is not symmetric).
weights = np.array(
    [[1.0, 0.75, 0.0],
     [0.75, 1.0, 0.0],
     [0.0, 0.75, 1.0]])
cac_abstractors_custom_weights = CAC(data, weights=weights)
cac_abstractors_custom_weights
[12]:
<irrCAC.table.CAC Subjects: 100, Categories: ['Ectopic', 'AIU', 'NIU'], Weights: "Custom Weights">

Verify the weights.

[13]:
cac_abstractors_custom_weights.weights_mat
[13]:
array([[1.  , 0.75, 0.  ],
       [0.75, 1.  , 0.  ],
       [0.  , 0.75, 1.  ]])

Calculate a coefficient using the custom weights.

[14]:
bp_custom_weights = cac_abstractors_custom_weights.bp()
bp_custom_weights
[14]:
{'est': {'coefficient_value': 0.808,
  'coefficient_name': 'Brennan-Prediger',
  'confidence_interval': (0.68557, 0.93043),
  'p_value': 0.0,
  'z': 13.09482,
  'se': 0.0617,
  'pa': 0.92,
  'pe': 0.58333},
 'weights': array([[1.  , 0.75, 0.  ],
        [0.75, 1.  , 0.  ],
        [0.  , 0.75, 1.  ]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
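
Because this custom matrix is not symmetric, the two non-empty off-diagonal cells are treated differently. The sketch below breaks the weighted agreement into its parts, assuming the same weighted Brennan-Prediger definitions as before and that weight entry [i][j] applies to table cell [i][j]. It is plain Python with illustrative names.

```python
categories = ["Ectopic", "AIU", "NIU"]
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]
weights = [
    [1.0, 0.75, 0.0],
    [0.75, 1.0, 0.0],
    [0.0, 0.75, 1.0],
]

n = sum(sum(row) for row in table)
q = len(table)

# Credit earned by each non-empty off-diagonal cell.
partial_credit = {
    (categories[i], categories[j]): weights[i][j] * table[i][j] / n
    for i in range(q)
    for j in range(q)
    if i != j and table[i][j] > 0
}

exact = sum(table[k][k] for k in range(q)) / n
pa = exact + sum(partial_credit.values())
pe = sum(sum(row) for row in weights) / q ** 2
bp_custom = (pa - pe) / (1 - pe)

print(partial_credit)
print(round(pa, 5), round(pe, 5), round(bp_custom, 5))
```

The 7 subjects in the (AIU, NIU) cell earn no credit, while the 4 subjects in the (NIU, AIU) cell earn 0.75 each, so pa only rises to 0.92. Since pe grows to about 0.58333, the coefficient ends up below the identity-weight value.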

To compare the results obtained with identity, linear, and custom weights, we display them side by side.

[15]:
import pandas as pd

df = pd.DataFrame(
    zip(bp['est'].items(),
        bp_linear['est'].items(),
        bp_custom_weights['est'].items()),
    columns=['Identity Weights', 'Linear Weights', 'Custom Weights'])
df
[15]:
   Identity Weights                           Linear Weights                             Custom Weights
0  (coefficient_value, 0.835)                 (coefficient_value, 0.87625)               (coefficient_value, 0.808)
1  (coefficient_name, Brennan-Prediger)       (coefficient_name, Brennan-Prediger)       (coefficient_name, Brennan-Prediger)
2  (confidence_interval, (0.74187, 0.92813))  (confidence_interval, (0.80641, 0.94609))  (confidence_interval, (0.68557, 0.93043))
3  (p_value, 0.0)                             (p_value, 0.0)                             (p_value, 0.0)
4  (z, 17.79114)                              (z, 24.8934)                               (z, 13.09482)
5  (se, 0.04693)                              (se, 0.0352)                               (se, 0.0617)
6  (pa, 0.89)                                 (pa, 0.945)                                (pa, 0.92)
7  (pe, 0.33333)                              (pe, 0.55556)                              (pe, 0.58333)