Overview

Dataset statistics

Number of variables: 9
Number of observations: 768
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 54.1 KiB
Average record size in memory: 72.2 B

Variable types

NUM: 8
CAT: 1

Reproduction

Analysis started: 2020-08-25 01:21:34.177669
Analysis finished: 2020-08-25 01:21:44.974445
Duration: 10.8 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]

Warnings

A1 has 111 (14.5%) zeros
A3 has 35 (4.6%) zeros
A4 has 227 (29.6%) zeros
A5 has 374 (48.7%) zeros
A6 has 11 (1.4%) zeros

Variables

A1
Real number (ℝ≥0)

ZEROS

Distinct count: 17
Unique (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8450520833333335
Minimum: 0.0
Maximum: 17.0
Zeros: 111
Zeros (%): 14.5%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 6
95-th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.369578063
Coefficient of variation (CV): 0.8763413316
Kurtosis: 0.1592197775
Mean: 3.845052083
Median Absolute Deviation (MAD): 2
Skewness: 0.9016739792
Sum: 2953
Variance: 11.35405632
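
Several of the A1 figures are derived from others in the same tables, which gives a quick consistency check. The sketch below copies the reported values for A1 and recomputes the variance, coefficient of variation, IQR, and range from them. The variable names are illustrative and not taken from the report.

```python
# Values copied from the A1 quantile and descriptive statistics in this report.
std = 3.369578063
mean = 3.845052083
q1, q3 = 1, 6
minimum, maximum = 0, 17

variance = std ** 2            # variance is the squared standard deviation
cv = std / mean                # coefficient of variation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(round(variance, 4), round(cv, 4), iqr, value_range)
```

The recomputed values agree with the reported Variance, CV, IQR, and Range up to rounding of the inputs.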
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
1      135    17.6%
0      111    14.5%
2      103    13.4%
3      75     9.8%
4      68     8.9%
5      57     7.4%
6      50     6.5%
7      45     5.9%
8      38     4.9%
9      28     3.6%
10     24     3.1%
11     11     1.4%
13     10     1.3%
12     9      1.2%
14     2      0.3%
15     1      0.1%
17     1      0.1%

Minimum 10 values

Value  Count  Frequency (%)
0      111    14.5%
1      135    17.6%
2      103    13.4%
3      75     9.8%
4      68     8.9%
5      57     7.4%
6      50     6.5%
7      45     5.9%
8      38     4.9%
9      28     3.6%

Maximum 10 values

Value  Count  Frequency (%)
17     1      0.1%
15     1      0.1%
14     2      0.3%
13     10     1.3%
12     9      1.2%
11     11     1.4%
10     24     3.1%
9      28     3.6%
8      38     4.9%
7      45     5.9%

A2
Real number (ℝ≥0)

Distinct count: 136
Unique (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120.89453125
Minimum: 0.0
Maximum: 199.0
Zeros: 5
Zeros (%): 0.7%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 79
Q1: 99
median: 117
Q3: 140.25
95-th percentile: 181
Maximum: 199
Range: 199
Interquartile range (IQR): 41.25

Descriptive statistics

Standard deviation: 31.9726182
Coefficient of variation (CV): 0.2644670347
Kurtosis: 0.6407798204
Mean: 120.8945312
Median Absolute Deviation (MAD): 20
Skewness: 0.1737535018
Sum: 92847
Variance: 1022.248314
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
100                 17     2.2%
99                  17     2.2%
129                 14     1.8%
125                 14     1.8%
106                 14     1.8%
111                 14     1.8%
112                 13     1.7%
108                 13     1.7%
102                 13     1.7%
105                 13     1.7%
95                  13     1.7%
122                 12     1.6%
109                 12     1.6%
90                  11     1.4%
114                 11     1.4%
120                 11     1.4%
117                 11     1.4%
107                 11     1.4%
124                 11     1.4%
119                 11     1.4%
128                 11     1.4%
84                  10     1.3%
115                 10     1.3%
126                 9      1.2%
92                  9      1.2%
Other values (111)  463    60.3%

Minimum 10 values

Value  Count  Frequency (%)
0      5      0.7%
44     1      0.1%
56     1      0.1%
57     2      0.3%
61     1      0.1%
62     1      0.1%
65     1      0.1%
67     1      0.1%
68     3      0.4%
71     4      0.5%

Maximum 10 values

Value  Count  Frequency (%)
199    1      0.1%
198    1      0.1%
197    4      0.5%
196    3      0.4%
195    2      0.3%
194    3      0.4%
193    2      0.3%
191    1      0.1%
190    1      0.1%
189    4      0.5%

A3
Real number (ℝ≥0)

ZEROS

Distinct count: 47
Unique (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.10546875
Minimum: 0.0
Maximum: 122.0
Zeros: 35
Zeros (%): 4.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 38.7
Q1: 62
median: 72
Q3: 80
95-th percentile: 90
Maximum: 122
Range: 122
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 19.35580717
Coefficient of variation (CV): 0.2800908166
Kurtosis: 5.18015656
Mean: 69.10546875
Median Absolute Deviation (MAD): 8
Skewness: -1.843607983
Sum: 53073
Variance: 374.6472712
Histogram with fixed size bins (bins=10)
Common values

Value              Count  Frequency (%)
70                 57     7.4%
74                 52     6.8%
78                 45     5.9%
68                 45     5.9%
72                 44     5.7%
64                 43     5.6%
80                 40     5.2%
76                 39     5.1%
60                 37     4.8%
0                  35     4.6%
62                 34     4.4%
66                 30     3.9%
82                 30     3.9%
88                 25     3.3%
84                 23     3.0%
90                 22     2.9%
58                 21     2.7%
86                 21     2.7%
50                 13     1.7%
56                 12     1.6%
54                 11     1.4%
52                 11     1.4%
92                 8      1.0%
75                 8      1.0%
65                 7      0.9%
Other values (22)  55     7.2%

Minimum 10 values

Value  Count  Frequency (%)
0      35     4.6%
24     1      0.1%
30     2      0.3%
38     1      0.1%
40     1      0.1%
44     4      0.5%
46     2      0.3%
48     5      0.7%
50     13     1.7%
52     11     1.4%

Maximum 10 values

Value  Count  Frequency (%)
122    1      0.1%
114    1      0.1%
110    3      0.4%
108    2      0.3%
106    3      0.4%
104    2      0.3%
102    1      0.1%
100    3      0.4%
98     3      0.4%
96     4      0.5%

A4
Real number (ℝ≥0)

ZEROS

Distinct count: 51
Unique (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.536458333333332
Minimum: 0.0
Maximum: 99.0
Zeros: 227
Zeros (%): 29.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 23
Q3: 32
95-th percentile: 44
Maximum: 99
Range: 99
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 15.95221757
Coefficient of variation (CV): 0.776775494
Kurtosis: -0.5200718662
Mean: 20.53645833
Median Absolute Deviation (MAD): 12
Skewness: 0.1093724965
Sum: 15772
Variance: 254.4732453
Histogram with fixed size bins (bins=10)
Common values

Value              Count  Frequency (%)
0                  227    29.6%
32                 31     4.0%
30                 27     3.5%
27                 23     3.0%
23                 22     2.9%
18                 20     2.6%
28                 20     2.6%
33                 20     2.6%
31                 19     2.5%
39                 18     2.3%
19                 18     2.3%
29                 17     2.2%
22                 16     2.1%
25                 16     2.1%
37                 16     2.1%
40                 16     2.1%
26                 16     2.1%
41                 15     2.0%
35                 15     2.0%
15                 14     1.8%
17                 14     1.8%
36                 14     1.8%
20                 13     1.7%
24                 12     1.6%
42                 11     1.4%
Other values (26)  118    15.4%

Minimum 10 values

Value  Count  Frequency (%)
0      227    29.6%
7      2      0.3%
8      2      0.3%
10     5      0.7%
11     6      0.8%
12     7      0.9%
13     11     1.4%
14     6      0.8%
15     14     1.8%
16     6      0.8%

Maximum 10 values

Value  Count  Frequency (%)
99     1      0.1%
63     1      0.1%
60     1      0.1%
56     1      0.1%
54     2      0.3%
52     2      0.3%
51     1      0.1%
50     3      0.4%
49     3      0.4%
48     4      0.5%

A5
Real number (ℝ≥0)

ZEROS

Distinct count: 186
Unique (%): 24.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.79947916666667
Minimum: 0.0
Maximum: 846.0
Zeros: 374
Zeros (%): 48.7%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 30.5
Q3: 127.25
95-th percentile: 293
Maximum: 846
Range: 846
Interquartile range (IQR): 127.25

Descriptive statistics

Standard deviation: 115.2440024
Coefficient of variation (CV): 1.444169856
Kurtosis: 7.214259554
Mean: 79.79947917
Median Absolute Deviation (MAD): 30.5
Skewness: 2.272250858
Sum: 61286
Variance: 13281.18008
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                   374    48.7%
105                 11     1.4%
130                 9      1.2%
140                 9      1.2%
120                 8      1.0%
180                 7      0.9%
100                 7      0.9%
94                  7      0.9%
110                 6      0.8%
115                 6      0.8%
135                 6      0.8%
49                  5      0.7%
76                  5      0.7%
210                 5      0.7%
56                  5      0.7%
66                  5      0.7%
64                  4      0.5%
165                 4      0.5%
155                 4      0.5%
200                 4      0.5%
125                 4      0.5%
71                  4      0.5%
160                 4      0.5%
190                 4      0.5%
90                  4      0.5%
Other values (161)  257    33.5%

Minimum 10 values

Value  Count  Frequency (%)
0      374    48.7%
14     1      0.1%
15     1      0.1%
16     1      0.1%
18     2      0.3%
22     1      0.1%
23     2      0.3%
25     1      0.1%
29     1      0.1%
32     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
846    1      0.1%
744    1      0.1%
680    1      0.1%
600    1      0.1%
579    1      0.1%
545    1      0.1%
543    1      0.1%
540    1      0.1%
510    1      0.1%
495    2      0.3%

A6
Real number (ℝ≥0)

ZEROS

Distinct count: 248
Unique (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.992578124999998
Minimum: 0.0
Maximum: 67.1
Zeros: 11
Zeros (%): 1.4%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 21.8
Q1: 27.3
median: 32
Q3: 36.6
95-th percentile: 44.395
Maximum: 67.1
Range: 67.1
Interquartile range (IQR): 9.3

Descriptive statistics

Standard deviation: 7.88416032
Coefficient of variation (CV): 0.2464371671
Kurtosis: 3.290442901
Mean: 31.99257812
Median Absolute Deviation (MAD): 4.6
Skewness: -0.4289815885
Sum: 24570.3
Variance: 62.15998396
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
32                  13     1.7%
31.6                12     1.6%
31.2                12     1.6%
0                   11     1.4%
32.4                10     1.3%
33.3                10     1.3%
32.9                9      1.2%
30.1                9      1.2%
32.8                9      1.2%
30.8                9      1.2%
34.2                8      1.0%
33.6                8      1.0%
29.7                8      1.0%
39.4                7      0.9%
27.6                7      0.9%
30                  7      0.9%
25.9                7      0.9%
35.5                7      0.9%
30.5                7      0.9%
28.7                7      0.9%
27.8                7      0.9%
30.4                7      0.9%
33.2                7      0.9%
25.6                6      0.8%
24.2                6      0.8%
Other values (223)  558    72.7%

Minimum 10 values

Value  Count  Frequency (%)
0      11     1.4%
18.2   3      0.4%
18.4   1      0.1%
19.1   1      0.1%
19.3   1      0.1%
19.4   1      0.1%
19.5   2      0.3%
19.6   3      0.4%
19.9   1      0.1%
20     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
67.1   1      0.1%
59.4   1      0.1%
57.3   1      0.1%
55     1      0.1%
53.2   1      0.1%
52.9   1      0.1%
52.3   2      0.3%
50     1      0.1%
49.7   1      0.1%
49.6   1      0.1%

A7
Real number (ℝ≥0)

Distinct count: 517
Unique (%): 67.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.47187630208333325
Minimum: 0.078
Maximum: 2.42
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.078
5-th percentile: 0.14035
Q1: 0.24375
median: 0.3725
Q3: 0.62625
95-th percentile: 1.13285
Maximum: 2.42
Range: 2.342
Interquartile range (IQR): 0.3825

Descriptive statistics

Standard deviation: 0.331328595
Coefficient of variation (CV): 0.7021513764
Kurtosis: 5.594953528
Mean: 0.4718763021
Median Absolute Deviation (MAD): 0.1675
Skewness: 1.919911066
Sum: 362.401
Variance: 0.1097786379
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0.254               6      0.8%
0.258               6      0.8%
0.238               5      0.7%
0.207               5      0.7%
0.268               5      0.7%
0.259               5      0.7%
0.261               5      0.7%
0.299               4      0.5%
0.551               4      0.5%
0.27                4      0.5%
0.263               4      0.5%
0.245               4      0.5%
0.284               4      0.5%
0.687               4      0.5%
0.26                4      0.5%
0.237               4      0.5%
0.197               4      0.5%
0.304               4      0.5%
0.19                4      0.5%
0.167               4      0.5%
0.692               4      0.5%
0.165               3      0.4%
0.587               3      0.4%
0.422               3      0.4%
0.292               3      0.4%
Other values (492)  663    86.3%

Minimum 10 values

Value  Count  Frequency (%)
0.078  1      0.1%
0.084  1      0.1%
0.085  2      0.3%
0.088  2      0.3%
0.089  1      0.1%
0.092  1      0.1%
0.096  1      0.1%
0.1    1      0.1%
0.101  1      0.1%
0.102  1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
2.42   1      0.1%
2.329  1      0.1%
2.288  1      0.1%
2.137  1      0.1%
1.893  1      0.1%
1.781  1      0.1%
1.731  1      0.1%
1.699  1      0.1%
1.698  1      0.1%
1.6    1      0.1%

A8
Real number (ℝ≥0)

Distinct count: 52
Unique (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.240885416666664
Minimum: 21.0
Maximum: 81.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 21
5-th percentile: 21
Q1: 24
median: 29
Q3: 41
95-th percentile: 58
Maximum: 81
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.76023154
Coefficient of variation (CV): 0.3537881556
Kurtosis: 0.6431588885
Mean: 33.24088542
Median Absolute Deviation (MAD): 7
Skewness: 1.129596701
Sum: 25529
Variance: 138.3030459
Histogram with fixed size bins (bins=10)
Common values

Value              Count  Frequency (%)
22                 72     9.4%
21                 63     8.2%
25                 48     6.2%
24                 46     6.0%
23                 38     4.9%
28                 35     4.6%
26                 33     4.3%
27                 32     4.2%
29                 29     3.8%
31                 24     3.1%
41                 22     2.9%
30                 21     2.7%
37                 19     2.5%
42                 18     2.3%
33                 17     2.2%
32                 16     2.1%
36                 16     2.1%
38                 16     2.1%
45                 15     2.0%
34                 14     1.8%
40                 13     1.7%
46                 13     1.7%
43                 13     1.7%
39                 12     1.6%
35                 10     1.3%
Other values (27)  113    14.7%

Minimum 10 values

Value  Count  Frequency (%)
21     63     8.2%
22     72     9.4%
23     38     4.9%
24     46     6.0%
25     48     6.2%
26     33     4.3%
27     32     4.2%
28     35     4.6%
29     29     3.8%
30     21     2.7%

Maximum 10 values

Value  Count  Frequency (%)
81     1      0.1%
72     1      0.1%
70     1      0.1%
69     2      0.3%
68     1      0.1%
67     3      0.4%
66     4      0.5%
65     3      0.4%
64     1      0.1%
63     4      0.5%

target
Categorical

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
Value  Count  Frequency (%)
1      500    65.1%
2      268    34.9%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
1      500    65.1%
2      268    34.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  768    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
1      500    65.1%
2      268    34.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  768    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
1      500    65.1%
2      268    34.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  768    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
1      500    65.1%
2      268    34.9%

Interactions

[Figures: 64 interaction plots (Matplotlib SVG); images not reproduced.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
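
The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the code pandas-profiling runs; the function name `pearson_r` and the toy inputs are assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Undefined (division by zero) when either variable is constant.
    return cov / (std_x * std_y)


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 5]
r = pearson_r(xs, ys)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(r, r_rescaled)
```

The two printed values agree up to floating-point error, which illustrates the location and scale invariance of r.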

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
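
As a rough sketch of this recipe, the code below ranks each variable (tied values share the average of their ranks, a common convention assumed here) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative, not part of any library.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values receive the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but nonlinear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear toy relationship, ρ is 1 up to floating-point error while Pearson's r is noticeably lower.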

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
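
The pair-counting definition can be sketched directly. The version below is the simple form usually called tau-a, in which tied pairs count as neither concordant nor discordant; statistical libraries often report tie-adjusted variants such as tau-b instead, so results may differ when ties are present. The function name `kendall_tau_a` is an assumption made for this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts toward neither total
    return (concordant - discordant) / len(pairs)


# Six pairs in total: five concordant, one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This quadratic-time loop is fine for illustration; faster algorithms exist for large samples.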

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

[Figures: 2 missing-value plots (Matplotlib SVG); images not reproduced.]

Sample

First rows

      A1    A2     A3    A4    A5     A6    A7     A8    target
0     9.0   140.0  94.0  0.0   0.0    32.7  0.734  45.0  2
1     2.0   108.0  80.0  0.0   0.0    27.0  0.259  52.0  2
2     1.0   128.0  48.0  45.0  194.0  40.5  0.613  24.0  2
3     5.0   130.0  82.0  0.0   0.0    39.1  0.956  37.0  2
4     0.0   121.0  66.0  30.0  165.0  34.3  0.203  33.0  2
5     4.0   109.0  64.0  44.0  99.0   34.8  0.905  26.0  2
6     9.0   145.0  80.0  46.0  130.0  37.9  0.637  40.0  2
7     4.0   123.0  62.0  0.0   0.0    32.0  0.226  35.0  2
8     0.0   151.0  90.0  46.0  0.0    42.1  0.371  21.0  2
9     9.0   130.0  70.0  0.0   0.0    34.2  0.652  45.0  2

Last rows

      A1    A2     A3    A4    A5     A6    A7     A8    target
758   5.0   73.0   60.0  0.0   0.0    26.8  0.268  27.0  1
759   1.0   79.0   80.0  25.0  37.0   25.4  0.583  22.0  1
760   0.0   119.0  64.0  18.0  92.0   34.9  0.725  23.0  1
761   0.0   105.0  64.0  41.0  142.0  41.5  0.173  22.0  1
762   1.0   116.0  78.0  29.0  180.0  36.1  0.496  25.0  1
763   2.0   121.0  70.0  32.0  95.0   39.1  0.886  23.0  1
764   1.0   87.0   60.0  37.0  75.0   37.2  0.509  22.0  1
765   4.0   95.0   60.0  32.0  0.0    35.4  0.284  28.0  1
766   3.0   116.0  0.0   0.0   0.0    23.5  0.187  23.0  1
767   0.0   118.0  64.0  23.0  89.0   0.0   1.731  21.0  1