Overview

Dataset statistics

Number of variables: 14
Number of observations: 303
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 33.3 KiB
Average record size in memory: 112.4 B

Variable types

NUM: 6
CAT: 4
BOOL: 4

Reproduction

Analysis started: 2020-08-25 01:16:46.602618
Analysis finished: 2020-08-25 01:16:52.832561
Duration: 6.23 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 1 (0.3%) duplicate rows (Duplicates)
Oldpeak has 99 (32.7%) zeros (Zeros)
Number of vessels colored has 175 (57.8%) zeros (Zeros)

Variables

Age
Real number (ℝ≥0)

Distinct count: 41
Unique (%): 13.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.366336633663366
Minimum: 29.0
Maximum: 77.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 29
5-th percentile: 39.1
Q1: 47.5
median: 55
Q3: 61
95-th percentile: 68
Maximum: 77
Range: 48
Interquartile range (IQR): 13.5
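
The last two rows follow directly from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A minimal sketch using the Age values reported above (the function name is illustrative, not part of pandas-profiling):

```python
def spread_summary(minimum, q1, q3, maximum):
    """Return (range, interquartile range) from four quantile values."""
    return maximum - minimum, q3 - q1


# Values copied from the Age quantile statistics above.
value_range, iqr = spread_summary(minimum=29, q1=47.5, q3=61, maximum=77)
print(value_range, iqr)  # 48 13.5
```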

Descriptive statistics

Standard deviation: 9.08210099
Coefficient of variation (CV): 0.1670537607
Kurtosis: -0.542167141
Mean: 54.36633663
Median Absolute Deviation (MAD): 7
Skewness: -0.2024633655
Sum: 16473
Variance: 82.48455839
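
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short sketch that rechecks them from the reported Age values (variable names are illustrative):

```python
# Values copied from the Age statistics above.
n_observations = 303
total = 16473
std = 9.08210099

mean = total / n_observations  # sum divided by number of observations
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation

print(round(mean, 8), round(cv, 8), round(variance, 6))
```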
Histogram with fixed size bins (bins=10)
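
Fixed-size binning splits the observed range into equal-width intervals and counts the values in each. A minimal sketch in plain Python, assuming the common convention that the last bin also includes the maximum (as NumPy's histogram does); the function name and the sample data are illustrative only:

```python
def fixed_width_histogram(values, bins=10):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Assumes max(values) > min(values); the last bin is closed on the right.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would land at index `bins`, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = fixed_width_histogram(list(range(11)), bins=5)
print(edges)
print(counts)
```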
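Value | Count | Frequency (%)
58 | 19 | 6.3%
57 | 17 | 5.6%
54 | 16 | 5.3%
59 | 14 | 4.6%
52 | 13 | 4.3%
51 | 12 | 4.0%
56 | 11 | 3.6%
60 | 11 | 3.6%
44 | 11 | 3.6%
62 | 11 | 3.6%
64 | 10 | 3.3%
41 | 10 | 3.3%
63 | 9 | 3.0%
67 | 9 | 3.0%
53 | 8 | 2.6%
65 | 8 | 2.6%
45 | 8 | 2.6%
43 | 8 | 2.6%
55 | 8 | 2.6%
42 | 8 | 2.6%
61 | 8 | 2.6%
48 | 7 | 2.3%
50 | 7 | 2.3%
46 | 7 | 2.3%
66 | 7 | 2.3%
Other values (16) | 46 | 15.2%

Value | Count | Frequency (%)
29 | 1 | 0.3%
34 | 2 | 0.7%
35 | 4 | 1.3%
37 | 2 | 0.7%
38 | 3 | 1.0%
39 | 4 | 1.3%
40 | 3 | 1.0%
41 | 10 | 3.3%
42 | 8 | 2.6%
43 | 8 | 2.6%

Value | Count | Frequency (%)
77 | 1 | 0.3%
76 | 1 | 0.3%
74 | 1 | 0.3%
71 | 3 | 1.0%
70 | 4 | 1.3%
69 | 3 | 1.0%
68 | 4 | 1.3%
67 | 9 | 3.0%
66 | 7 | 2.3%
65 | 8 | 2.6%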

Sex
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
1 | 207 | 68.3%
0 | 96 | 31.7%

Chest pain type
Categorical

Distinct count: 4
Unique (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
2 | 143 | 47.2%
3 | 87 | 28.7%
0 | 50 | 16.5%
1 | 23 | 7.6%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 143 | 47.2%
3 | 87 | 28.7%
0 | 50 | 16.5%
1 | 23 | 7.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 303 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
2 | 143 | 47.2%
3 | 87 | 28.7%
0 | 50 | 16.5%
1 | 23 | 7.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 303 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
2 | 143 | 47.2%
3 | 87 | 28.7%
0 | 50 | 16.5%
1 | 23 | 7.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 303 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
2 | 143 | 47.2%
3 | 87 | 28.7%
0 | 50 | 16.5%
1 | 23 | 7.6%

Trestbps
Real number (ℝ≥0)

Distinct count: 49
Unique (%): 16.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 131.62376237623764
Minimum: 94.0
Maximum: 200.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 94
5-th percentile: 108
Q1: 120
median: 130
Q3: 140
95-th percentile: 160
Maximum: 200
Range: 106
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 17.53814281
Coefficient of variation (CV): 0.1332445031
Kurtosis: 0.9290540528
Mean: 131.6237624
Median Absolute Deviation (MAD): 10
Skewness: 0.7137684379
Sum: 39882
Variance: 307.5864533
Histogram with fixed size bins (bins=10)
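Value | Count | Frequency (%)
120 | 37 | 12.2%
130 | 36 | 11.9%
140 | 32 | 10.6%
110 | 19 | 6.3%
150 | 17 | 5.6%
138 | 13 | 4.3%
128 | 12 | 4.0%
160 | 11 | 3.6%
125 | 11 | 3.6%
112 | 9 | 3.0%
132 | 8 | 2.6%
118 | 7 | 2.3%
108 | 6 | 2.0%
135 | 6 | 2.0%
124 | 6 | 2.0%
134 | 5 | 1.7%
145 | 5 | 1.7%
152 | 5 | 1.7%
122 | 4 | 1.3%
100 | 4 | 1.3%
170 | 4 | 1.3%
115 | 3 | 1.0%
142 | 3 | 1.0%
126 | 3 | 1.0%
180 | 3 | 1.0%
Other values (24) | 34 | 11.2%

Value | Count | Frequency (%)
94 | 2 | 0.7%
100 | 4 | 1.3%
101 | 1 | 0.3%
102 | 2 | 0.7%
104 | 1 | 0.3%
105 | 3 | 1.0%
106 | 1 | 0.3%
108 | 6 | 2.0%
110 | 19 | 6.3%
112 | 9 | 3.0%

Value | Count | Frequency (%)
200 | 1 | 0.3%
192 | 1 | 0.3%
180 | 3 | 1.0%
178 | 2 | 0.7%
174 | 1 | 0.3%
172 | 1 | 0.3%
170 | 4 | 1.3%
165 | 1 | 0.3%
164 | 1 | 0.3%
160 | 11 | 3.6%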

Cholesterol
Real number (ℝ≥0)

Distinct count: 152
Unique (%): 50.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 246.26402640264027
Minimum: 126.0
Maximum: 564.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 126
5-th percentile: 175
Q1: 211
median: 240
Q3: 274.5
95-th percentile: 326.9
Maximum: 564
Range: 438
Interquartile range (IQR): 63.5

Descriptive statistics

Standard deviation: 51.83075099
Coefficient of variation (CV): 0.2104682188
Kurtosis: 4.505423168
Mean: 246.2640264
Median Absolute Deviation (MAD): 32
Skewness: 1.143400821
Sum: 74618
Variance: 2686.426748
Histogram with fixed size bins (bins=10)
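Value | Count | Frequency (%)
197 | 6 | 2.0%
234 | 6 | 2.0%
204 | 6 | 2.0%
269 | 5 | 1.7%
212 | 5 | 1.7%
254 | 5 | 1.7%
211 | 4 | 1.3%
240 | 4 | 1.3%
243 | 4 | 1.3%
282 | 4 | 1.3%
239 | 4 | 1.3%
233 | 4 | 1.3%
177 | 4 | 1.3%
226 | 4 | 1.3%
231 | 3 | 1.0%
250 | 3 | 1.0%
246 | 3 | 1.0%
230 | 3 | 1.0%
219 | 3 | 1.0%
249 | 3 | 1.0%
201 | 3 | 1.0%
175 | 3 | 1.0%
223 | 3 | 1.0%
220 | 3 | 1.0%
203 | 3 | 1.0%
Other values (127) | 205 | 67.7%

Value | Count | Frequency (%)
126 | 1 | 0.3%
131 | 1 | 0.3%
141 | 1 | 0.3%
149 | 2 | 0.7%
157 | 1 | 0.3%
160 | 1 | 0.3%
164 | 1 | 0.3%
166 | 1 | 0.3%
167 | 1 | 0.3%
168 | 1 | 0.3%

Value | Count | Frequency (%)
564 | 1 | 0.3%
417 | 1 | 0.3%
409 | 1 | 0.3%
407 | 1 | 0.3%
394 | 1 | 0.3%
360 | 1 | 0.3%
354 | 1 | 0.3%
353 | 1 | 0.3%
342 | 1 | 0.3%
341 | 1 | 0.3%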

Fasting blood sugar < 120
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
0 | 258 | 85.1%
1 | 45 | 14.9%

Resting ecg
Categorical

Distinct count: 3
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
2 | 152 | 50.2%
1 | 147 | 48.5%
0 | 4 | 1.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 152 | 50.2%
1 | 147 | 48.5%
0 | 4 | 1.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 303 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
2 | 152 | 50.2%
1 | 147 | 48.5%
0 | 4 | 1.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 303 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
2 | 152 | 50.2%
1 | 147 | 48.5%
0 | 4 | 1.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 303 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
2 | 152 | 50.2%
1 | 147 | 48.5%
0 | 4 | 1.3%

Max heart rate
Real number (ℝ≥0)

Distinct count: 91
Unique (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 149.64686468646866
Minimum: 71.0
Maximum: 202.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 71
5-th percentile: 108.1
Q1: 133.5
median: 153
Q3: 166
95-th percentile: 181.9
Maximum: 202
Range: 131
Interquartile range (IQR): 32.5

Descriptive statistics

Standard deviation: 22.90516111
Coefficient of variation (CV): 0.1530614167
Kurtosis: -0.06196993058
Mean: 149.6468647
Median Absolute Deviation (MAD): 15
Skewness: -0.5374096527
Sum: 45343
Variance: 524.6464057
Histogram with fixed size bins (bins=10)
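Value | Count | Frequency (%)
162 | 11 | 3.6%
163 | 9 | 3.0%
160 | 9 | 3.0%
173 | 8 | 2.6%
152 | 8 | 2.6%
144 | 7 | 2.3%
172 | 7 | 2.3%
150 | 7 | 2.3%
125 | 7 | 2.3%
143 | 7 | 2.3%
132 | 7 | 2.3%
142 | 6 | 2.0%
158 | 6 | 2.0%
169 | 6 | 2.0%
140 | 6 | 2.0%
156 | 6 | 2.0%
178 | 5 | 1.7%
174 | 5 | 1.7%
154 | 5 | 1.7%
165 | 5 | 1.7%
161 | 5 | 1.7%
182 | 5 | 1.7%
157 | 5 | 1.7%
170 | 5 | 1.7%
179 | 5 | 1.7%
Other values (66) | 141 | 46.5%

Value | Count | Frequency (%)
71 | 1 | 0.3%
88 | 1 | 0.3%
90 | 1 | 0.3%
95 | 1 | 0.3%
96 | 2 | 0.7%
97 | 1 | 0.3%
99 | 1 | 0.3%
103 | 2 | 0.7%
105 | 3 | 1.0%
106 | 1 | 0.3%

Value | Count | Frequency (%)
202 | 1 | 0.3%
195 | 1 | 0.3%
194 | 1 | 0.3%
192 | 1 | 0.3%
190 | 1 | 0.3%
188 | 1 | 0.3%
187 | 1 | 0.3%
186 | 2 | 0.7%
185 | 1 | 0.3%
184 | 1 | 0.3%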

Exercise induced angina
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
0 | 204 | 67.3%
1 | 99 | 32.7%

Oldpeak
Real number (ℝ≥0)

ZEROS

Distinct count: 40
Unique (%): 13.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0396039603960396
Minimum: 0.0
Maximum: 6.2
Zeros: 99
Zeros (%): 32.7%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.8
Q3: 1.6
95-th percentile: 3.4
Maximum: 6.2
Range: 6.2
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.161075022
Coefficient of variation (CV): 1.116843593
Kurtosis: 1.575813073
Mean: 1.03960396
Median Absolute Deviation (MAD): 0.8
Skewness: 1.269719931
Sum: 315
Variance: 1.348095207
Histogram with fixed size bins (bins=10)
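Value | Count | Frequency (%)
0 | 99 | 32.7%
1.2 | 17 | 5.6%
1 | 14 | 4.6%
0.6 | 14 | 4.6%
1.4 | 13 | 4.3%
0.8 | 13 | 4.3%
0.2 | 12 | 4.0%
1.6 | 11 | 3.6%
1.8 | 10 | 3.3%
0.4 | 9 | 3.0%
2 | 9 | 3.0%
0.1 | 7 | 2.3%
2.8 | 6 | 2.0%
2.6 | 6 | 2.0%
3 | 5 | 1.7%
1.9 | 5 | 1.7%
1.5 | 5 | 1.7%
0.5 | 5 | 1.7%
3.6 | 4 | 1.3%
2.2 | 4 | 1.3%
0.9 | 3 | 1.0%
3.4 | 3 | 1.0%
0.3 | 3 | 1.0%
2.4 | 3 | 1.0%
4 | 3 | 1.0%
Other values (15) | 20 | 6.6%

Value | Count | Frequency (%)
0 | 99 | 32.7%
0.1 | 7 | 2.3%
0.2 | 12 | 4.0%
0.3 | 3 | 1.0%
0.4 | 9 | 3.0%
0.5 | 5 | 1.7%
0.6 | 14 | 4.6%
0.7 | 1 | 0.3%
0.8 | 13 | 4.3%
0.9 | 3 | 1.0%

Value | Count | Frequency (%)
6.2 | 1 | 0.3%
5.6 | 1 | 0.3%
4.4 | 1 | 0.3%
4.2 | 2 | 0.7%
4 | 3 | 1.0%
3.8 | 1 | 0.3%
3.6 | 4 | 1.3%
3.5 | 1 | 0.3%
3.4 | 3 | 1.0%
3.2 | 2 | 0.7%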

Slope
Categorical

Distinct count: 3
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
2 | 142 | 46.9%
1 | 140 | 46.2%
0 | 21 | 6.9%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 142 | 46.9%
1 | 140 | 46.2%
0 | 21 | 6.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 303 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
2 | 142 | 46.9%
1 | 140 | 46.2%
0 | 21 | 6.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 303 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
2 | 142 | 46.9%
1 | 140 | 46.2%
0 | 21 | 6.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 303 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
2 | 142 | 46.9%
1 | 140 | 46.2%
0 | 21 | 6.9%

Number of vessels colored
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7293729372937293
Minimum: 0
Maximum: 4
Zeros: 175
Zeros (%): 57.8%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.022606365
Coefficient of variation (CV): 1.402034971
Kurtosis: 0.8392531872
Mean: 0.7293729373
Median Absolute Deviation (MAD): 0
Skewness: 1.310422135
Sum: 221
Variance: 1.045723778
Histogram with fixed size bins (bins=10)
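Value | Count | Frequency (%)
0 | 175 | 57.8%
1 | 65 | 21.5%
2 | 38 | 12.5%
3 | 20 | 6.6%
4 | 5 | 1.7%

Value | Count | Frequency (%)
0 | 175 | 57.8%
1 | 65 | 21.5%
2 | 38 | 12.5%
3 | 20 | 6.6%
4 | 5 | 1.7%

Value | Count | Frequency (%)
4 | 5 | 1.7%
3 | 20 | 6.6%
2 | 38 | 12.5%
1 | 65 | 21.5%
0 | 175 | 57.8%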

Thal
Categorical

Distinct count: 4
Unique (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
2 | 166 | 54.8%
3 | 117 | 38.6%
1 | 18 | 5.9%
0 | 2 | 0.7%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 166 | 54.8%
3 | 117 | 38.6%
1 | 18 | 5.9%
0 | 2 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 303 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
2 | 166 | 54.8%
3 | 117 | 38.6%
1 | 18 | 5.9%
0 | 2 | 0.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 303 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
2 | 166 | 54.8%
3 | 117 | 38.6%
1 | 18 | 5.9%
0 | 2 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 303 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
2 | 166 | 54.8%
3 | 117 | 38.6%
1 | 18 | 5.9%
0 | 2 | 0.7%

target
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value | Count | Frequency (%)
0 | 165 | 54.5%
1 | 138 | 45.5%

Interactions

[Figure: 36 pairwise interaction plots, one per pair of the 6 numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
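
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name is hypothetical, and the sample data are the Age and Max heart rate values from the first ten rows shown in the Sample section.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/(n-1) factors in the covariance and in the two standard
    # deviations cancel, so plain sums of products are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


age = [68, 76, 45, 41, 35, 62, 64, 54, 47, 51]
max_heart_rate = [151, 116, 175, 172, 182, 163, 122, 170, 156, 143]
print(round(pearson_r(age, max_heart_rate), 3))
```

Rescaling or shifting either input (for example, expressing age in months) leaves the result unchanged, which is the invariance property mentioned above.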

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
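
A minimal sketch of that procedure: convert each variable to ranks, then apply the Pearson formula to the ranks. The helper names are hypothetical, and assigning tied values the average of their ranks is an assumption (it is the usual convention, but the report text does not specify it).

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# A nonlinear but strictly increasing relationship still gives rho = 1.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
```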

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
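
The pair-counting definition above corresponds to the tau-a variant, sketched below in plain Python. Note the assumption: libraries such as SciPy default to tau-b, which adjusts the denominator for ties, so results on tied data can differ from this sketch. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs, with no adjustment for ties."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither.
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This brute-force version examines every pair, so it runs in quadratic time; that is fine for 303 observations but slow for large datasets.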

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
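
A minimal sketch of the bias-corrected estimator, written from Bergsma's published formula rather than from the pandas-profiling source. It assumes a contingency table of observed counts with no empty row or column and applies no continuity correction; the function name is hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent counts
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated counts
```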

Missing values

[Figure: two missing-value plots]

Sample

First rows

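Index | Age | Sex | Chest pain type | Trestbps | Cholesterol | Fasting blood sugar < 120 | Resting ecg | Max heart rate | Exercise induced angina | Oldpeak | Slope | Number of vessels colored | Thal | target
0 | 68.0 | 1 | 3 | 118.0 | 277.0 | 0 | 2 | 151.0 | 0 | 1.0 | 2 | 1 | 3 | 0
1 | 76.0 | 0 | 3 | 140.0 | 197.0 | 0 | 0 | 116.0 | 0 | 1.1 | 1 | 0 | 2 | 0
2 | 45.0 | 0 | 0 | 130.0 | 234.0 | 0 | 1 | 175.0 | 0 | 0.6 | 1 | 0 | 2 | 0
3 | 41.0 | 0 | 3 | 112.0 | 268.0 | 0 | 1 | 172.0 | 1 | 0.0 | 2 | 0 | 2 | 0
4 | 35.0 | 0 | 2 | 138.0 | 183.0 | 0 | 2 | 182.0 | 0 | 1.4 | 2 | 0 | 2 | 0
5 | 62.0 | 0 | 2 | 124.0 | 209.0 | 0 | 2 | 163.0 | 0 | 0.0 | 2 | 0 | 2 | 0
6 | 64.0 | 0 | 2 | 130.0 | 303.0 | 0 | 2 | 122.0 | 0 | 2.0 | 1 | 2 | 2 | 0
7 | 54.0 | 0 | 3 | 135.0 | 304.0 | 1 | 2 | 170.0 | 0 | 0.0 | 2 | 0 | 2 | 0
8 | 47.0 | 1 | 3 | 138.0 | 257.0 | 0 | 1 | 156.0 | 0 | 0.0 | 2 | 0 | 2 | 0
9 | 51.0 | 1 | 3 | 100.0 | 222.0 | 0 | 2 | 143.0 | 1 | 1.2 | 1 | 0 | 2 | 0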

Last rows

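Index | Age | Sex | Chest pain type | Trestbps | Cholesterol | Fasting blood sugar < 120 | Resting ecg | Max heart rate | Exercise induced angina | Oldpeak | Slope | Number of vessels colored | Thal | target
293 | 68.0 | 1 | 2 | 144.0 | 193.0 | 1 | 2 | 141.0 | 0 | 3.4 | 1 | 2 | 3 | 1
294 | 62.0 | 1 | 2 | 120.0 | 267.0 | 0 | 2 | 99.0 | 1 | 1.8 | 1 | 2 | 3 | 1
295 | 58.0 | 1 | 3 | 132.0 | 224.0 | 0 | 1 | 173.0 | 0 | 3.2 | 2 | 2 | 3 | 1
296 | 50.0 | 1 | 3 | 140.0 | 233.0 | 0 | 2 | 163.0 | 0 | 0.6 | 1 | 1 | 3 | 1
297 | 63.0 | 0 | 2 | 124.0 | 197.0 | 0 | 2 | 136.0 | 1 | 0.0 | 1 | 0 | 2 | 1
298 | 60.0 | 1 | 3 | 140.0 | 185.0 | 0 | 1 | 155.0 | 0 | 3.0 | 1 | 0 | 2 | 1
299 | 49.0 | 1 | 3 | 118.0 | 149.0 | 0 | 1 | 126.0 | 0 | 0.8 | 2 | 3 | 2 | 1
300 | 61.0 | 0 | 2 | 130.0 | 330.0 | 0 | 1 | 169.0 | 0 | 0.0 | 2 | 0 | 2 | 1
301 | 58.0 | 1 | 2 | 100.0 | 234.0 | 0 | 2 | 156.0 | 0 | 0.1 | 2 | 1 | 3 | 1
302 | 68.0 | 1 | 3 | 180.0 | 274.0 | 1 | 1 | 150.0 | 1 | 1.6 | 1 | 0 | 3 | 1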

Duplicate rows

Most frequent

Index | Age | Sex | Chest pain type | Trestbps | Cholesterol | Fasting blood sugar < 120 | Resting ecg | Max heart rate | Exercise induced angina | Oldpeak | Slope | Number of vessels colored | Thal | target | count
0 | 38.0 | 1 | 3 | 138.0 | 175.0 | 0 | 2 | 173.0 | 0 | 0.0 | 2 | 4 | 2 | 0 | 2
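
A duplicate-row table of this kind can be produced by counting how often each complete row occurs and keeping the rows seen more than once. A minimal standard-library sketch; the function name and the three-column example rows are hypothetical and much shorter than the 14-column records in this dataset:

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: count for row, count in counts.items() if count > 1}


# Hypothetical miniature dataset: the first and last rows are identical.
rows = [
    (38.0, 1, 138.0),
    (54.0, 0, 135.0),
    (38.0, 1, 138.0),
]
print(duplicate_rows(rows))
```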