CAC from Contingency Tables

A contingency table, also called a cross tabulation or a frequency table, presents the data as a matrix showing the distribution of subjects by rater and category.

In the datasets module, these datasets carry the table_ prefix. One example is table_cont3x3abstractors.

[1]:

from irrCAC.datasets import table_cont3x3abstractors

data = table_cont3x3abstractors()
data

[1]:

	Ectopic	AIU	NIU
Ectopic	13	0	0
AIU	0	20	7
NIU	0	4	56
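To make the structure of the table concrete, the totals can be recomputed by hand. The sketch below copies the counts above into a plain list of lists (the variable names are illustrative and not part of irrCAC) and derives the number of subjects, the row and column totals, and the number of subjects on the diagonal, where both raters chose the same category.

```python
# Counts copied from the table above; rows and columns share the same category order.
categories = ["Ectopic", "AIU", "NIU"]
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)               # total number of subjects
row_totals = [sum(row) for row in table]         # one total per row category
col_totals = [sum(col) for col in zip(*table)]   # one total per column category
agreements = sum(table[k][k] for k in range(len(table)))  # diagonal cells

print(n, row_totals, col_totals, agreements)
```

The off-diagonal cells (7 and 4) are the only disagreements, and both involve the AIU and NIU categories.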

Initialize CAC

To compute the various agreement coefficients from a contingency table, first initialize a CAC object. Once initialized, the object holds information about the subjects, categories, and weights.

[2]:

from irrCAC.table import CAC

cac_abstractors = CAC(data)
cac_abstractors

[2]:

<irrCAC.table.CAC Subjects: 100, Categories: ['Ectopic', 'AIU', 'NIU'], Weights: "identity">

Brennan-Prediger Coefficient

To calculate the Brennan-Prediger coefficient, call the bp() method.

[3]:

bp = cac_abstractors.bp()
bp

[3]:

{'est': {'coefficient_value': 0.835,
  'coefficient_name': 'Brennan-Prediger',
  'confidence_interval': (0.74187, 0.92813),
  'p_value': 0.0,
  'z': 17.79114,
  'se': 0.04693,
  'pa': 0.89,
  'pe': 0.33333},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
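The point estimate above can be checked by hand. The following is a minimal sketch, assuming the usual unweighted Brennan-Prediger definition in which chance agreement is 1/q for q categories. It uses plain Python rather than irrCAC, and the variable names are illustrative.

```python
# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)   # number of subjects
q = len(table)                       # number of categories

pa = sum(table[k][k] for k in range(q)) / n   # observed agreement
pe = 1 / q                                    # chance agreement, uniform over categories
bp_value = (pa - pe) / (1 - pe)

print(round(pa, 5), round(pe, 5), round(bp_value, 5))
```

The three numbers agree with the 'pa', 'pe', and 'coefficient_value' entries in the output above.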

Cohen’s kappa

To calculate Cohen’s kappa coefficient, call the cohen() method.

[4]:

cohen = cac_abstractors.cohen()
cohen

[4]:

{'est': {'coefficient_value': 0.79641,
  'coefficient_name': "Cohen's kappa",
  'confidence_interval': (0.67952, 0.9133),
  'p_value': 0.0,
  'z': 13.51892,
  'se': 0.05891,
  'pa': 0.89,
  'pe': 0.4597},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
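As a cross-check on the output above, kappa can be recomputed from the table margins. This sketch assumes the standard unweighted definition, where chance agreement is the sum over categories of the product of the two raters' marginal proportions. It is plain Python with illustrative names, not irrCAC code.

```python
# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_p = [sum(row) / n for row in table]          # marginal proportions of the row rater
col_p = [sum(col) / n for col in zip(*table)]    # marginal proportions of the column rater

pa = sum(table[k][k] for k in range(q)) / n      # observed agreement
pe = sum(r * c for r, c in zip(row_p, col_p))    # chance agreement from the margins
kappa = (pa - pe) / (1 - pe)

print(round(pa, 5), round(pe, 5), round(kappa, 5))
```

Here pe works out to 0.13*0.13 + 0.27*0.24 + 0.60*0.63 = 0.4597, matching the 'pe' entry above.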

Gwet’s AC1/AC2

To calculate Gwet’s coefficient, call the gwet() method. With identity weights, as here, the result is reported as AC1.

[5]:

gwet = cac_abstractors.gwet()
gwet

[5]:

{'est': {'coefficient_value': 0.84933,
  'coefficient_name': "Gwet's AC1",
  'confidence_interval': (0.76358, 0.93508),
  'p_value': 0.0,
  'z': 19.65248,
  'se': 0.04322,
  'pa': 0.89,
  'pe': 0.26992},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
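The AC1 value above can be reproduced by hand. This is a minimal sketch, assuming the unweighted AC1 definition in which chance agreement is the sum of pi_k * (1 - pi_k) divided by q - 1, with pi_k the average of the two raters' marginal proportions for category k. The names are illustrative and the code does not call irrCAC.

```python
# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_p = [sum(row) / n for row in table]
col_p = [sum(col) / n for col in zip(*table)]
pi = [(r + c) / 2 for r, c in zip(row_p, col_p)]   # average category propensities

pa = sum(table[k][k] for k in range(q)) / n        # observed agreement
pe = sum(p * (1 - p) for p in pi) / (q - 1)        # AC1 chance agreement
ac1 = (pa - pe) / (1 - pe)

print(round(pa, 5), round(pe, 5), round(ac1, 5))
```

Because pe is much smaller here (about 0.27) than under Cohen's kappa (about 0.46), AC1 comes out higher than kappa on the same table.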

Krippendorff’s Alpha

To calculate Krippendorff’s Alpha coefficient, call the krippendorff() method.

[6]:

alpha = cac_abstractors.krippendorff()
alpha

[6]:

{'est': {'coefficient_value': 0.79726,
  'coefficient_name': "Krippendorff's Alpha",
  'confidence_interval': (0.68008, 0.91444),
  'p_value': 0.0,
  'z': 13.50033,
  'se': 0.05905,
  'pa': 0.89055,
  'pe': 0.46015},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
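Note that 'pa' in the output above is 0.89055 rather than 0.89. The sketch below reproduces the printed numbers under two assumptions that are inferred from the output, not taken from the irrCAC source: observed agreement is adjusted by a small-sample term eps = 1/(2n), and chance agreement is the sum of squared average marginal proportions.

```python
# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_p = [sum(row) / n for row in table]
col_p = [sum(col) / n for col in zip(*table)]
pi = [(r + c) / 2 for r, c in zip(row_p, col_p)]   # average category propensities

pa_raw = sum(table[k][k] for k in range(q)) / n    # plain observed agreement
eps = 1 / (2 * n)                                  # assumed small-sample correction (two raters)
pa = (1 - eps) * pa_raw + eps                      # adjusted observed agreement
pe = sum(p * p for p in pi)                        # chance agreement
alpha = (pa - pe) / (1 - pe)

print(round(pa, 5), round(pe, 5), round(alpha, 5))
```

With n = 100 the correction is tiny, which is why alpha lands so close to Scott's Pi and Cohen's kappa for this table.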

Percent Agreement

To calculate the Percent Agreement, call the pa2() method.

[7]:

pa2 = cac_abstractors.pa2()
pa2

[7]:

{'est': {'coefficient_value': 0.89,
  'coefficient_name': 'Percent Agreement',
  'confidence_interval': (0.82792, 0.95208),
  'p_value': 0.0,
  'z': 28.44452,
  'se': 0.03129,
  'pa': 0.89,
  'pe': 0},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
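Percent agreement is simply the share of subjects on the diagonal of the table. The sketch below recomputes it and also shows that the printed 'se' and 'z' are consistent with a binomial-style standard error, sqrt(pa * (1 - pa) / n). That formula is an observation about these numbers, not a statement about how irrCAC computes the variance internally.

```python
import math

# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
pa = sum(table[k][k] for k in range(len(table))) / n   # share of diagonal cells

se = math.sqrt(pa * (1 - pa) / n)   # assumed standard-error formula (see lead-in)
z = pa / se                         # test statistic against zero agreement

print(round(pa, 5), round(se, 5), round(z, 3))
```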

Scott’s Pi

To calculate Scott’s Pi, call the scott() method.

[8]:

scott = cac_abstractors.scott()
scott

[8]:

{'est': {'coefficient_value': 0.79624,
  'coefficient_name': "Scott's Pi",
  'confidence_interval': (0.67906, 0.91342),
  'p_value': 0.0,
  'z': 13.48308,
  'se': 0.05905,
  'pa': 0.89,
  'pe': 0},
 'weights': array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
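The output above prints 'pe': 0, yet a chance agreement of zero would make the coefficient equal to 'pa' (0.89), not 0.79624. The following sketch, assuming the standard definition of Scott's Pi with chance agreement equal to the sum of squared average marginal proportions, reproduces the printed coefficient with pe of about 0.46015. The names are illustrative and the code does not call irrCAC.

```python
# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]

n = sum(sum(row) for row in table)
q = len(table)

row_p = [sum(row) / n for row in table]
col_p = [sum(col) / n for col in zip(*table)]
pi = [(r + c) / 2 for r, c in zip(row_p, col_p)]   # average category propensities

pa = sum(table[k][k] for k in range(q)) / n        # observed agreement
pe = sum(p * p for p in pi)                        # chance agreement
scott_pi = (pa - pe) / (1 - pe)

print(round(pa, 5), round(pe, 5), round(scott_pi, 5))
```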

Weighted Analysis

We can pass custom weights or a predefined weight type when initializing the CAC object. For the available weight types, see the Weights module.

In the following example, we initialize a new object on the same data using linear weights.

[9]:

cac_abstractors_linear = CAC(data, weights='linear')
cac_abstractors_linear

[9]:

<irrCAC.table.CAC Subjects: 100, Categories: ['Ectopic', 'AIU', 'NIU'], Weights: "linear">

To see the weight matrix, inspect the weights_mat attribute of the object.

[10]:

cac_abstractors_linear.weights_mat

[10]:

array([[1. , 0.5, 0. ],
       [0.5, 1. , 0.5],
       [0. , 0.5, 1. ]])
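The matrix above follows the usual linear weighting rule, w[k][l] = 1 - |k - l| / (q - 1), so that adjacent categories earn partial credit. Below is a minimal sketch of that rule in plain Python. The helper linear_weights is hypothetical and written only for illustration; it is not the irrCAC implementation.

```python
def linear_weights(q):
    """Return a q x q matrix of linear agreement weights (hypothetical helper)."""
    return [
        [1 - abs(k - l) / (q - 1) for l in range(q)]
        for k in range(q)
    ]

weights = linear_weights(3)
for row in weights:
    print(row)
```

Linear weights treat the categories as ordered, so a disagreement between the first and last category receives no credit, while neighbouring categories receive 0.5.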

Next, we call the method of the coefficient we want to calculate. Here, for example, we compute the weighted Brennan-Prediger coefficient.

[11]:

bp_linear = cac_abstractors_linear.bp()
bp_linear

[11]:

{'est': {'coefficient_value': 0.87625,
  'coefficient_name': 'Brennan-Prediger',
  'confidence_interval': (0.80641, 0.94609),
  'p_value': 0.0,
  'z': 24.8934,
  'se': 0.0352,
  'pa': 0.945,
  'pe': 0.55556},
 'weights': array([[1. , 0.5, 0. ],
        [0.5, 1. , 0.5],
        [0. , 0.5, 1. ]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
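The weighted figures above can be verified by hand. This sketch assumes the weighted Brennan-Prediger definition in which observed agreement is the weight-averaged cell proportion and chance agreement is the sum of all weights divided by q squared. It is plain Python with illustrative names, not irrCAC code.

```python
# Counts copied from the contingency table used in this example.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]
# Linear weights as shown in the weights_mat output.
weights = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]

n = sum(sum(row) for row in table)
q = len(table)

# Weighted observed agreement: each cell counts in proportion to its weight.
pa = sum(weights[k][l] * table[k][l] for k in range(q) for l in range(q)) / n
# Weighted chance agreement under uniform category probabilities.
pe = sum(sum(row) for row in weights) / q**2
bp_value = (pa - pe) / (1 - pe)

print(round(pa, 5), round(pe, 5), round(bp_value, 5))
```

The 11 disagreements between AIU and NIU now earn half credit, raising pa from 0.89 to 0.945, while pe rises from 1/3 to 5/9.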

To use custom weights, pass a list of values or an array.

[12]:

import numpy as np

# 3x3 custom weight matrix, one row and one column per category
weights = np.array(
    [[1.0, 0.75, 0.0],
     [0.75, 1.0, 0.0],
     [0.0, 0.75, 1.0]])
cac_abstractors_custom_weights = CAC(data, weights=weights)
cac_abstractors_custom_weights

[12]:

<irrCAC.table.CAC Subjects: 100, Categories: ['Ectopic', 'AIU', 'NIU'], Weights: "Custom Weights">

Verify the weights.

[13]:

cac_abstractors_custom_weights.weights_mat

[13]:

array([[1.  , 0.75, 0.  ],
       [0.75, 1.  , 0.  ],
       [0.  , 0.75, 1.  ]])

Calculate a coefficient using the custom weights.

[14]:

bp_custom_weights = cac_abstractors_custom_weights.bp()
bp_custom_weights

[14]:

{'est': {'coefficient_value': 0.808,
  'coefficient_name': 'Brennan-Prediger',
  'confidence_interval': (0.68557, 0.93043),
  'p_value': 0.0,
  'z': 13.09482,
  'se': 0.0617,
  'pa': 0.92,
  'pe': 0.58333},
 'weights': array([[1.  , 0.75, 0.  ],
        [0.75, 1.  , 0.  ],
        [0.  , 0.75, 1.  ]]),
 'categories': ['Ectopic', 'AIU', 'NIU']}
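The custom matrix is not symmetric, so the credit given to a disagreement depends on which cell it falls in. The sketch below applies the same weighted Brennan-Prediger arithmetic to the custom weights, again assuming pa is the weight-averaged cell proportion and pe is the sum of weights divided by q squared. The function weighted_bp is a hypothetical helper, not part of irrCAC.

```python
def weighted_bp(table, weights):
    """Return (pa, pe, coefficient) for a weighted Brennan-Prediger sketch."""
    n = sum(sum(row) for row in table)
    q = len(table)
    pa = sum(weights[k][l] * table[k][l] for k in range(q) for l in range(q)) / n
    pe = sum(sum(row) for row in weights) / q**2
    return pa, pe, (pa - pe) / (1 - pe)

# Counts and custom weights copied from the example above.
table = [
    [13, 0, 0],
    [0, 20, 7],
    [0, 4, 56],
]
custom_weights = [
    [1.0, 0.75, 0.0],
    [0.75, 1.0, 0.0],
    [0.0, 0.75, 1.0],
]

pa, pe, coefficient = weighted_bp(table, custom_weights)
print(round(pa, 5), round(pe, 5), round(coefficient, 5))
```

Only the 4 subjects in the (NIU, AIU) cell earn partial credit (weight 0.75), while the 7 subjects in the (AIU, NIU) cell earn none. That gives pa = 0.89 + 0.75 * 0.04 = 0.92, matching the output above.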

To compare the results obtained with identity, linear, and custom weights, we display them side by side.

[15]:

import pandas as pd

df = pd.DataFrame(
    zip(bp['est'].items(),
        bp_linear['est'].items(),
        bp_custom_weights['est'].items()),
    columns=['Identity Weights', 'Linear Weights', 'Custom Weights'])
df

[15]:

	Identity Weights	Linear Weights	Custom Weights
0	(coefficient_value, 0.835)	(coefficient_value, 0.87625)	(coefficient_value, 0.808)
1	(coefficient_name, Brennan-Prediger)	(coefficient_name, Brennan-Prediger)	(coefficient_name, Brennan-Prediger)
2	(confidence_interval, (0.74187, 0.92813))	(confidence_interval, (0.80641, 0.94609))	(confidence_interval, (0.68557, 0.93043))
3	(p_value, 0.0)	(p_value, 0.0)	(p_value, 0.0)
4	(z, 17.79114)	(z, 24.8934)	(z, 13.09482)
5	(se, 0.04693)	(se, 0.0352)	(se, 0.0617)
6	(pa, 0.89)	(pa, 0.945)	(pa, 0.92)
7	(pe, 0.33333)	(pe, 0.55556)	(pe, 0.58333)
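Each cell in the table above holds a (name, value) pair. As an alternative layout, the statistic names can serve as row labels so that each cell holds only a value. The following sketch does this in plain Python with values copied from the three outputs above. The dictionary and variable names are illustrative, and only a subset of the statistics is included.

```python
# Values copied from the identity, linear, and custom weight outputs above.
results = {
    "Identity Weights": {"coefficient_value": 0.835, "se": 0.04693, "pa": 0.89, "pe": 0.33333},
    "Linear Weights": {"coefficient_value": 0.87625, "se": 0.0352, "pa": 0.945, "pe": 0.55556},
    "Custom Weights": {"coefficient_value": 0.808, "se": 0.0617, "pa": 0.92, "pe": 0.58333},
}

stats = ["coefficient_value", "se", "pa", "pe"]
# One row per statistic, one value per weighting scheme.
rows = {stat: [results[name][stat] for name in results] for stat in stats}

print(f"{'':<18}" + "".join(f"{name:>18}" for name in results))
for stat, values in rows.items():
    print(f"{stat:<18}" + "".join(f"{value:>18}" for value in values))
```

Read row by row, the comparison shows that both weighted schemes raise pa and pe relative to identity weights, but only the linear weights raise the coefficient itself.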