Subgroup analysis

In many real-world applications, we are interested not only in the calibration of the overall population but also in the calibration of subgroups within it. calzone provides a simple way to perform subgroup analysis, provided the input data follow a specific format. To perform subgroup analysis, the input CSV file should contain the following columns:

proba_0, proba_1, …, proba_n, subgroup_1, subgroup_2, …, subgroup_m, label

where n >= 1 and m >= 1.
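
As a rough illustration of this layout, the sketch below writes a tiny binary-classification file (n = 1, m = 1) using only the Python standard library. The helper name `write_toy_subgroup_csv`, the file name, and the random values are hypothetical and only meant to show the column order; they are not part of calzone.

```python
import csv
import os
import random
import tempfile


def write_toy_subgroup_csv(path, n_rows=6, seed=0):
    """Write a toy CSV with columns proba_0, proba_1, subgroup_1, label."""
    rng = random.Random(seed)
    header = ["proba_0", "proba_1", "subgroup_1", "label"]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for _ in range(n_rows):
            p1 = rng.random()                # predicted probability of class 1
            group = rng.choice(["A", "B"])   # value of the single subgroup field
            label = int(rng.random() < p1)   # label drawn consistently with p1
            writer.writerow([1.0 - p1, p1, group, label])
    return header


# Hypothetical output location; any writable path works.
toy_path = os.path.join(tempfile.mkdtemp(), "toy_subgroup.csv")
write_toy_subgroup_csv(toy_path)

with open(toy_path, newline="") as f:
    rows = list(csv.reader(f))
print(rows[0])  # the header row
```

The probability columns come first, then the subgroup columns, and the label is last, matching the column list above.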

In this example, we use the simulated dataset shipped with the calzone package, which has a single subgroup field containing two subgroups. See the quickstart for more details.

[ ]:
### import the packages and read the data
import numpy as np
from calzone.utils import data_loader
from calzone.metrics import CalibrationMetrics

dataset = data_loader('../../../example_data/simulated_data_subgroup.csv')
print(np.loadtxt('../../../example_data/simulated_data_subgroup.csv', dtype=str)[:5])  # first 5 lines of the CSV file (header + 4 rows)
print("Whether the dataset has subgroup:",dataset.have_subgroup)

### Create the CalibrationMetrics class
metrics_cal = CalibrationMetrics(class_to_calculate=1)
['proba_0,proba_1,subgroup_1,label'
 '0.1444156178040511,0.8555843821959489,A,0'
 '0.8552048445812981,0.1447951554187019,A,0'
 '0.2569696048872897,0.7430303951127103,A,0'
 '0.39931305655530125,0.6006869434446988,A,1']
Whether the dataset has subgroup: True
[2]:
### subgroup analysis for each group
### You can perform other analyses inside the loop (e.g., plotting the reliability diagram)
for i,subgroup_column in enumerate(dataset.subgroup_indices):
    print(f"subgroup {i+1}")
    for j,subgroup_class in enumerate(dataset.subgroups_class[i]):
        print(f"subgroup {i+1} class {subgroup_class}")
        proba = dataset.probs[dataset.subgroups_index[i][j],:]
        label = dataset.labels[dataset.subgroups_index[i][j]]
        result = metrics_cal.calculate_metrics(label, proba,metrics='all')
        for metric in result:
            print(f"{metric}: {result[metric]}")
subgroup 1
subgroup 1 class A
SpiegelhalterZ score: 0.3763269161877356
SpiegelhalterZ p-value: 0.7066738713391099
ECE-H topclass: 0.009608653731328977
ECE-H: 0.01208775955804901
MCE-H topclass: 0.03926468843081976
MCE-H: 0.04848338618970194
HL-H score: 8.884991559088098
HL-H p-value: 0.35209071874348785
ECE-C topclass: 0.009458033653818828
ECE-C: 0.008733966945443138
MCE-C topclass: 0.020515047600205505
MCE-C: 0.02324031223486256
HL-C score: 3.694947603203135
HL-C p-value: 0.8835446575708198
COX coef: 0.9942499557748269
COX intercept: -0.04497652296600376
COX coef lowerci: 0.9372902801721911
COX coef upperci: 1.0512096313774626
COX intercept lowerci: -0.12348577118577644
COX intercept upperci: 0.03353272525376893
COX ICI: 0.005610391483826338
Loess ICI: 0.00558856942568957
subgroup 1 class B
SpiegelhalterZ score: 27.93575342117766
SpiegelhalterZ p-value: 0.0
ECE-H topclass: 0.07658928982434714
ECE-H: 0.0765892898243467
MCE-H topclass: 0.1327565894838103
MCE-H: 0.16250572519432438
HL-H score: 910.4385762101924
HL-H p-value: 0.0
ECE-C topclass: 0.07429481165606829
ECE-C: 0.07479369479609524
MCE-C topclass: 0.14090872416947742
MCE-C: 0.14045600565696226
HL-C score: 2246.1714434139853
HL-C p-value: 0.0
COX coef: 0.5071793536874274
COX intercept: 0.00037947714112375366
COX coef lowerci: 0.47838663128188996
COX coef upperci: 0.5359720760929648
COX intercept lowerci: -0.07796623141885761
COX intercept upperci: 0.07872518570110512
COX ICI: 0.07746407648179383
Loess ICI: 0.06991428582761099
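
To make the idea behind the loop concrete, here is a minimal, library-free sketch that splits the rows by subgroup value and computes a binned expected calibration error (ECE) within each group. It assumes binary labels, positive-class probabilities, and equal-width bins; the function names are hypothetical, and this is not calzone's implementation, so its numbers need not match the metrics reported above.

```python
from collections import defaultdict


def ece_equal_width(probs, labels, num_bins=10):
    """ECE for the positive class using equal-width probability bins."""
    bins = defaultdict(list)
    for p, y in zip(probs, labels):
        # Clamp the index so that p == 1.0 falls into the last bin.
        idx = min(int(p * num_bins), num_bins - 1)
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for members in bins.values():
        mean_p = sum(p for p, _ in members) / len(members)    # mean predicted probability
        frac_pos = sum(y for _, y in members) / len(members)  # observed positive rate
        ece += len(members) / n * abs(mean_p - frac_pos)
    return ece


def ece_by_subgroup(probs, labels, groups, num_bins=10):
    """Group rows by subgroup value and compute the ECE within each group."""
    by_group = defaultdict(lambda: ([], []))
    for p, y, g in zip(probs, labels, groups):
        by_group[g][0].append(p)
        by_group[g][1].append(y)
    return {g: ece_equal_width(ps, ys, num_bins) for g, (ps, ys) in by_group.items()}


# Toy data: group A is well calibrated, group B is overconfident.
probs = [0.5, 0.5, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
labels = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
result = ece_by_subgroup(probs, labels, groups)
print(result)
```

In this toy data the error is zero for group A and about 0.4 for group B, which mirrors the qualitative pattern in the output above: a model can look acceptable in one subgroup while being poorly calibrated in another.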
[3]:
### An alternative way to do the same thing is through the command-line interface

%run ../../../cal_metrics.py \
--csv_file '../../../example_data/simulated_data_subgroup.csv' \
--metrics all \
--class_to_calculate 1 \
--num_bins 10 \
--verbose
Metrics:
SpiegelhalterZ score: 18.327
SpiegelhalterZ p-value: 0.
ECE-H topclass: 0.042
ECE-H: 0.042
MCE-H topclass: 0.055
MCE-H: 0.063
HL-H score: 429.732
HL-H p-value: 0.
ECE-C topclass: 0.042
ECE-C: 0.038
MCE-C topclass: 0.065
MCE-C: 0.064
HL-C score: 1138.842
HL-C p-value: 0.
COX coef: 0.668
COX intercept: -0.02
COX coef lowerci: 0.641
COX coef upperci: 0.696
COX intercept lowerci: -0.074
COX intercept upperci: 0.034
COX ICI: 0.049
Loess ICI: 0.037
Metrics for subgroup subgroup_1_group_A:
SpiegelhalterZ score: 0.376
SpiegelhalterZ p-value: 0.707
ECE-H topclass: 0.01
ECE-H: 0.012
MCE-H topclass: 0.039
MCE-H: 0.048
HL-H score: 8.885
HL-H p-value: 0.352
ECE-C topclass: 0.009
ECE-C: 0.009
MCE-C topclass: 0.021
MCE-C: 0.023
HL-C score: 3.695
HL-C p-value: 0.884
COX coef: 0.994
COX intercept: -0.045
COX coef lowerci: 0.937
COX coef upperci: 1.051
COX intercept lowerci: -0.123
COX intercept upperci: 0.034
COX ICI: 0.006
Loess ICI: 0.006
Metrics for subgroup subgroup_1_group_B:
SpiegelhalterZ score: 27.936
SpiegelhalterZ p-value: 0.
ECE-H topclass: 0.077
ECE-H: 0.077
MCE-H topclass: 0.133
MCE-H: 0.163
HL-H score: 910.439
HL-H p-value: 0.
ECE-C topclass: 0.074
ECE-C: 0.075
MCE-C topclass: 0.141
MCE-C: 0.140
HL-C score: 2246.171
HL-C p-value: 0.
COX coef: 0.507
COX intercept: 0.000
COX coef lowerci: 0.478
COX coef upperci: 0.536
COX intercept lowerci: -0.078
COX intercept upperci: 0.079
COX ICI: 0.077
Loess ICI: 0.07
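
One possible way to read the Cox results, sketched below: in Cox calibration regression, a slope (COX coef) of 1 and an intercept of 0 correspond to ideal calibration, so a simple check is whether the reported confidence intervals contain those values. The interval bounds are copied from the command-line output above; the helper names are hypothetical, and this check is an illustrative reading of the numbers, not a calzone feature.

```python
# Confidence-interval bounds copied from the command-line output above.
cox_results = {
    "overall": {"coef": (0.641, 0.696), "intercept": (-0.074, 0.034)},
    "subgroup_1_group_A": {"coef": (0.937, 1.051), "intercept": (-0.123, 0.034)},
    "subgroup_1_group_B": {"coef": (0.478, 0.536), "intercept": (-0.078, 0.079)},
}


def contains(interval, value):
    """Return True if value lies within the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


def consistent_with_ideal(results):
    """Flag groups whose slope CI contains 1 and whose intercept CI contains 0."""
    return {
        name: contains(ci["coef"], 1.0) and contains(ci["intercept"], 0.0)
        for name, ci in results.items()
    }


summary = consistent_with_ideal(cox_results)
for name, ok in summary.items():
    print(f"{name}: {'consistent with' if ok else 'deviates from'} ideal calibration")
```

With these numbers, only group A passes the check. The overall population fails because its slope interval lies below 1, which appears to be driven by group B.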