{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true, "pycharm": { "name": "#%% md\n" } }, "source": [ "# CAC from Contingency Tables\n", "\n", "The Contingency Table, also referred to as the cross tabulation,\n", "or the frequency table, presents data in the form of a matrix\n", "showing the distribution of subjects by rater and category.\n", "\n", "Such datasets are the ones with the `table_` prefix in the\n", "[datasets](../irrCAC.rst#module-irrCAC.datasets) module.\n", "One such example is the\n", "[table_cont3x3abstractors](../irrCAC.rst#irrCAC.datasets.table_cont3x3abstractors)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": "
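<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Ectopic</th>\n      <th>AIU</th>\n      <th>NIU</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>Ectopic</th>\n      <td>13</td>\n      <td>0</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>AIU</th>\n      <td>0</td>\n      <td>20</td>\n      <td>7</td>\n    </tr>\n    <tr>\n      <th>NIU</th>\n      <td>0</td>\n      <td>4</td>\n      <td>56</td>\n    </tr>\n  </tbody>\n</table>\n</div>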
", "text/plain": " Ectopic AIU NIU\nEctopic 13 0 0\nAIU 0 20 7\nNIU 0 4 56" }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from irrCAC.datasets import table_cont3x3abstractors\n", "\n", "data = table_cont3x3abstractors()\n", "data" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Initialize CAC\n", "\n", "To compute the various agreement coefficients using the contingency table,\n", "first initialize a [CAC](../irrCAC.rst#module-irrCAC.table) object.\n", "By initializing the object, it has information about the subjects, categories,\n", "and weights." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "" }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from irrCAC.table import CAC\n", "\n", "cac_abstractors = CAC(data)\n", "cac_abstractors" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Brennar-Prediger Coefficient\n", "\n", "To calculate the Brennar-Prediger coefficient, call the `bp()` method." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.835,\n 'coefficient_name': 'Brennan-Prediger',\n 'confidence_interval': (0.74187, 0.92813),\n 'p_value': 0.0,\n 'z': 17.79114,\n 'se': 0.04693,\n 'pa': 0.89,\n 'pe': 0.33333},\n 'weights': array([[1., 0., 0.],\n [0., 1., 0.],\n [0., 0., 1.]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bp = cac_abstractors.bp()\n", "bp" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Cohen's kappa\n", "\n", "To caclulate the Cohen's kappa coefficient, call the `cohen()` method." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.79641,\n 'coefficient_name': \"Cohen's kappa\",\n 'confidence_interval': (0.67952, 0.9133),\n 'p_value': 0.0,\n 'z': 13.51892,\n 'se': 0.05891,\n 'pa': 0.89,\n 'pe': 0.4597},\n 'weights': array([[1., 0., 0.],\n [0., 1., 0.],\n [0., 0., 1.]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cohen = cac_abstractors.cohen()\n", "cohen" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Gwet's AC1/AC2\n", "\n", "To calculate Gwet's AC1/AC2 coefficient, call the `gwet()` method." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.84933,\n 'coefficient_name': \"Gwet's AC1\",\n 'confidence_interval': (0.76358, 0.93508),\n 'p_value': 0.0,\n 'z': 19.65248,\n 'se': 0.04322,\n 'pa': 0.89,\n 'pe': 0.26992},\n 'weights': array([[1., 0., 0.],\n [0., 1., 0.],\n [0., 0., 1.]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gwet = cac_abstractors.gwet()\n", "gwet" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Krippendorff's Alpha\n", "\n", "To calculate Krippendorff's Alpha coefficient, call the `krippendorff()`\n", "method." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.79726,\n 'coefficient_name': \"Krippendorff's Alpha\",\n 'confidence_interval': (0.68008, 0.91444),\n 'p_value': 0.0,\n 'z': 13.50033,\n 'se': 0.05905,\n 'pa': 0.89055,\n 'pe': 0.46015},\n 'weights': array([[1., 0., 0.],\n [0., 1., 
0.],\n [0., 0., 1.]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alpha = cac_abstractors.krippendorff()\n", "alpha" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Percent Agreement\n", "\n", "To calculate the percent agreement, call the `pa2()` method." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.89,\n 'coefficient_name': 'Percent Agreement',\n 'confidence_interval': (0.82792, 0.95208),\n 'p_value': 0.0,\n 'z': 28.44452,\n 'se': 0.03129,\n 'pa': 0.89,\n 'pe': 0},\n 'weights': array([[1., 0., 0.],\n [0., 1., 0.],\n [0., 0., 1.]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pa2 = cac_abstractors.pa2()\n", "pa2" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Scott's Pi\n", "\n", "To calculate Scott's Pi coefficient, call the `scott()` method." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.79624,\n 'coefficient_name': \"Scott's Pi\",\n 'confidence_interval': (0.67906, 0.91342),\n 'p_value': 0.0,\n 'z': 13.48308,\n 'se': 0.05905,\n 'pa': 0.89,\n 'pe': 0},\n 'weights': array([[1., 0., 0.],\n [0., 1., 0.],\n [0., 0., 1.]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scott = cac_abstractors.scott()\n", "scott" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "## Weighted Analysis\n", "\n", "We can pass custom weights or a predefined weight type when initializing a\n", "[CAC](../irrCAC.rst#module-irrCAC.table) object. For the available weight\n", "types, see the [Weights](../irrCAC.rst#module-irrCAC.weights) module.\n", "\n", "In the following example, we initialize a new object on the same data using\n", "[linear](../irrCAC.rst#irrCAC.weights.Weights.linear) weights." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "" }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cac_abstractors_linear = CAC(data, weights='linear')\n", "cac_abstractors_linear" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "To see the weight matrix, inspect the `weights_mat` attribute of the\n", "object." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "array([[1. , 0.5, 0. ],\n [0.5, 1. , 0.5],\n [0. , 0.5, 1. 
]])" }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cac_abstractors_linear.weights_mat" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Next, call the method of the coefficient we want to calculate.\n", "Here, for example, we compute the weighted Brennan-Prediger coefficient." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.87625,\n 'coefficient_name': 'Brennan-Prediger',\n 'confidence_interval': (0.80641, 0.94609),\n 'p_value': 0.0,\n 'z': 24.8934,\n 'se': 0.0352,\n 'pa': 0.945,\n 'pe': 0.55556},\n 'weights': array([[1. , 0.5, 0. ],\n [0.5, 1. , 0.5],\n [0. , 0.5, 1. ]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bp_linear = cac_abstractors_linear.bp()\n", "bp_linear" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "To use custom weights, pass a list of values or an array." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "" }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "\n", "\n", "weights = np.array(\n", " [[1. , 0.75, 0. ],\n", " [0.75, 1. , 0.0],\n", " [0. , 0.75, 1. ]])\n", "cac_abstractors_custom_weights = CAC(data, weights=weights)\n", "cac_abstractors_custom_weights" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Verify the weights." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "array([[1. , 0.75, 0. ],\n [0.75, 1. , 0. ],\n [0. , 0.75, 1. 
]])" }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cac_abstractors_custom_weights.weights_mat" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Calculate a coefficient using the custom weights." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "{'est': {'coefficient_value': 0.808,\n 'coefficient_name': 'Brennan-Prediger',\n 'confidence_interval': (0.68557, 0.93043),\n 'p_value': 0.0,\n 'z': 13.09482,\n 'se': 0.0617,\n 'pa': 0.92,\n 'pe': 0.58333},\n 'weights': array([[1. , 0.75, 0. ],\n [0.75, 1. , 0. ],\n [0. , 0.75, 1. ]]),\n 'categories': ['Ectopic', 'AIU', 'NIU']}" }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bp_custom_weights = cac_abstractors_custom_weights.bp()\n", "bp_custom_weights" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "To compare the Brennan-Prediger results obtained with\n", "[identity](../irrCAC.rst#irrCAC.weights.Weights.identity) weights,\n", "[linear](../irrCAC.rst#irrCAC.weights.Weights.linear) weights,\n", "and the custom weights, we display them side by side." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": "
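<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Identity Weights</th>\n      <th>Linear Weights</th>\n      <th>Custom Weights</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>(coefficient_value, 0.835)</td>\n      <td>(coefficient_value, 0.87625)</td>\n      <td>(coefficient_value, 0.808)</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>(coefficient_name, Brennan-Prediger)</td>\n      <td>(coefficient_name, Brennan-Prediger)</td>\n      <td>(coefficient_name, Brennan-Prediger)</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>(confidence_interval, (0.74187, 0.92813))</td>\n      <td>(confidence_interval, (0.80641, 0.94609))</td>\n      <td>(confidence_interval, (0.68557, 0.93043))</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>(p_value, 0.0)</td>\n      <td>(p_value, 0.0)</td>\n      <td>(p_value, 0.0)</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>(z, 17.79114)</td>\n      <td>(z, 24.8934)</td>\n      <td>(z, 13.09482)</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>(se, 0.04693)</td>\n      <td>(se, 0.0352)</td>\n      <td>(se, 0.0617)</td>\n    </tr>\n    <tr>\n      <th>6</th>\n      <td>(pa, 0.89)</td>\n      <td>(pa, 0.945)</td>\n      <td>(pa, 0.92)</td>\n    </tr>\n    <tr>\n      <th>7</th>\n      <td>(pe, 0.33333)</td>\n      <td>(pe, 0.55556)</td>\n      <td>(pe, 0.58333)</td>\n    </tr>\n  </tbody>\n</table>\n</div>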
", "text/plain": " Identity Weights \\\n0 (coefficient_value, 0.835) \n1 (coefficient_name, Brennan-Prediger) \n2 (confidence_interval, (0.74187, 0.92813)) \n3 (p_value, 0.0) \n4 (z, 17.79114) \n5 (se, 0.04693) \n6 (pa, 0.89) \n7 (pe, 0.33333) \n\n Linear Weights \\\n0 (coefficient_value, 0.87625) \n1 (coefficient_name, Brennan-Prediger) \n2 (confidence_interval, (0.80641, 0.94609)) \n3 (p_value, 0.0) \n4 (z, 24.8934) \n5 (se, 0.0352) \n6 (pa, 0.945) \n7 (pe, 0.55556) \n\n Custom Weights \n0 (coefficient_value, 0.808) \n1 (coefficient_name, Brennan-Prediger) \n2 (confidence_interval, (0.68557, 0.93043)) \n3 (p_value, 0.0) \n4 (z, 13.09482) \n5 (se, 0.0617) \n6 (pa, 0.92) \n7 (pe, 0.58333) " }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "df = pd.DataFrame(\n", " zip(bp['est'].items(),\n", " bp_linear['est'].items(),\n", " bp_custom_weights['est'].items()),\n", " columns=['Identity Weights', 'Linear Weights', 'Custom Weights'])\n", "df" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 1 }