Overview

Dataset statistics

Number of variables: 15
Number of observations: 690
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 81.0 KiB
Average record size in memory: 120.2 B
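
As a quick consistency check, the average record size appears to be the total in-memory size divided by the number of observations. The sketch below reproduces that arithmetic from the figures above, assuming 1 KiB = 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 81.0
n_observations = 690

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```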

Variable types

NUM: 8
BOOL: 5
CAT: 2

Reproduction

Analysis started: 2020-08-25 01:07:41.417867
Analysis finished: 2020-08-25 01:07:52.342618
Duration: 10.92 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

A3 has 19 (2.8%) zeros
A7 has 70 (10.1%) zeros
A10 has 395 (57.2%) zeros
A13 has 132 (19.1%) zeros
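
The percentages in these warnings follow from dividing each zero count by the 690 observations. A minimal sketch of that calculation, using only the counts listed above (the dictionary names are hypothetical):

```python
# Zero counts copied from the warnings above; 690 is the number of observations.
n_observations = 690
zero_counts = {"A3": 19, "A7": 70, "A10": 395, "A13": 132}

# Share of zeros per variable, formatted to one decimal place.
zero_share = {
    name: f"{count / n_observations:.1%}" for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share} zeros")
```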

Variables

A1
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
1       468    67.8%
0       222    32.2%

A2
Real number (ℝ≥0)

Distinct count: 350
Unique (%): 50.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.568202898550727
Minimum: 13.75
Maximum: 80.25
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 13.75
5-th percentile: 17.956
Q1: 22.67
median: 28.625
Q3: 37.7075
95-th percentile: 56.231
Maximum: 80.25
Range: 66.5
Interquartile range (IQR): 15.0375

Descriptive statistics

Standard deviation: 11.85327277
Coefficient of variation (CV): 0.3754813922
Kurtosis: 1.192058599
Mean: 31.5682029
Median Absolute Deviation (MAD): 6.795
Skewness: 1.155935008
Sum: 21782.06
Variance: 140.5000754
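
Several of the statistics reported for A2 are derived from one another. The sketch below checks those relationships using the reported values as inputs; it assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum, mean = sum / count) and is not taken from the tool's source.

```python
import math

# Values copied from the A2 statistics above.
n_observations = 690
total = 21782.06
mean = 31.5682029
std = 11.85327277
variance = 140.5000754
minimum, maximum = 13.75, 80.25
q1, q3 = 22.67, 37.7075

# Derived quantities under the usual textbook definitions (an assumption).
cv = std / mean
value_range = maximum - minimum
iqr = q3 - q1

print(f"CV={cv:.6f}, range={value_range}, IQR={iqr:.4f}")
print(math.isclose(std ** 2, variance, rel_tol=1e-6))
print(math.isclose(total / n_observations, mean, rel_tol=1e-6))
```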
Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
31.57               12     1.7%
22.67               9      1.3%
20.42               7      1.0%
24.5                6      0.9%
22.5                6      0.9%
20.67               6      0.9%
23.58               6      0.9%
18.83               6      0.9%
19.17               6      0.9%
25                  6      0.9%
23                  5      0.7%
23.08               5      0.7%
23.25               5      0.7%
33.17               5      0.7%
27.83               5      0.7%
27.67               5      0.7%
32.33               4      0.6%
28.58               4      0.6%
41.17               4      0.6%
25.67               4      0.6%
25.17               4      0.6%
24.58               4      0.6%
24.75               4      0.6%
26.67               4      0.6%
20.75               4      0.6%
Other values (325)  554    80.3%

Value   Count  Frequency (%)
13.75   1      0.1%
15.17   1      0.1%
15.75   1      0.1%
15.83   2      0.3%
15.92   1      0.1%
16      2      0.3%
16.08   2      0.3%
16.17   1      0.1%
16.25   2      0.3%
16.33   3      0.4%

Value   Count  Frequency (%)
80.25   1      0.1%
76.75   1      0.1%
74.83   1      0.1%
73.42   1      0.1%
71.58   1      0.1%
69.5    1      0.1%
69.17   1      0.1%
68.67   1      0.1%
67.75   1      0.1%
65.42   1      0.1%

A3
Real number (ℝ≥0)

ZEROS

Distinct count: 215
Unique (%): 31.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.758724637681159
Minimum: 0.0
Maximum: 28.0
Zeros: 19
Zeros (%): 2.8%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.165
Q1: 1
median: 2.75
Q3: 7.2075
95-th percentile: 14
Maximum: 28
Range: 28
Interquartile range (IQR): 6.2075

Descriptive statistics

Standard deviation: 4.978163249
Coefficient of variation (CV): 1.046112904
Kurtosis: 2.274021887
Mean: 4.758724638
Median Absolute Deviation (MAD): 2.21
Skewness: 1.488813125
Sum: 3283.52
Variance: 24.78210933

Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
1.5                 21     3.0%
3                   19     2.8%
0                   19     2.8%
2.5                 19     2.8%
0.75                16     2.3%
1.25                16     2.3%
0.5                 15     2.2%
5                   14     2.0%
6.5                 12     1.7%
1.75                12     1.7%
4                   12     1.7%
1                   10     1.4%
10                  10     1.4%
0.585               10     1.4%
2                   10     1.4%
0.375               9      1.3%
11                  9      1.3%
0.835               8      1.2%
0.165               8      1.2%
12.5                8      1.2%
3.5                 8      1.2%
5.5                 8      1.2%
11.5                8      1.2%
0.54                8      1.2%
7                   8      1.2%
Other values (190)  393    57.0%

Value   Count  Frequency (%)
0       19     2.8%
0.04    5      0.7%
0.08    1      0.1%
0.085   1      0.1%
0.125   5      0.7%
0.165   8      1.2%
0.17    1      0.1%
0.205   3      0.4%
0.21    3      0.4%
0.25    6      0.9%

Value   Count  Frequency (%)
28      1      0.1%
26.335  1      0.1%
25.21   1      0.1%
25.125  1      0.1%
25.085  1      0.1%
22.29   1      0.1%
22      1      0.1%
21.5    1      0.1%
21      1      0.1%
20      1      0.1%

A4
Categorical

Distinct count: 3
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
2       525    76.1%
1       163    23.6%
3       2      0.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value   Count  Frequency (%)
2       525    76.1%
1       163    23.6%
3       2      0.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  690    100.0%

Most frequent Decimal Number characters

Value   Count  Frequency (%)
2       525    76.1%
1       163    23.6%
3       2      0.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  690    100.0%

Most frequent Common characters

Value   Count  Frequency (%)
2       525    76.1%
1       163    23.6%
3       2      0.3%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   690    100.0%

Most frequent ASCII characters

Value   Count  Frequency (%)
2       525    76.1%
1       163    23.6%
3       2      0.3%

A5
Real number (ℝ≥0)

Distinct count: 14
Unique (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.372463768115942
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 8
Q3: 10
95-th percentile: 14
Maximum: 14
Range: 13
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.683264787
Coefficient of variation (CV): 0.4995975434
Kurtosis: -0.8490425801
Mean: 7.372463768
Median Absolute Deviation (MAD): 3
Skewness: -0.06919047486
Sum: 5087
Variance: 13.56643949

Histogram with fixed size bins (bins=10)

Value   Count  Frequency (%)
8       146    21.2%
11      78     11.3%
9       64     9.3%
3       59     8.6%
6       54     7.8%
1       53     7.7%
4       51     7.4%
13      41     5.9%
14      38     5.5%
7       38     5.5%
2       30     4.3%
10      25     3.6%
5       10     1.4%
12      3      0.4%

Value   Count  Frequency (%)
1       53     7.7%
2       30     4.3%
3       59     8.6%
4       51     7.4%
5       10     1.4%
6       54     7.8%
7       38     5.5%
8       146    21.2%
9       64     9.3%
10      25     3.6%

Value   Count  Frequency (%)
14      38     5.5%
13      41     5.9%
12      3      0.4%
11      78     11.3%
10      25     3.6%
9       64     9.3%
8       146    21.2%
7       38     5.5%
6       54     7.8%
5       10     1.4%

A6
Real number (ℝ≥0)

Distinct count: 8
Unique (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6927536231884055
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 4
Q3: 5
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.99231607
Coefficient of variation (CV): 0.4245516022
Kurtosis: -0.1781323337
Mean: 4.692753623
Median Absolute Deviation (MAD): 0
Skewness: 0.4684118266
Sum: 3238
Variance: 3.969323321

Histogram with fixed size bins (bins=10)

Value   Count  Frequency (%)
4       408    59.1%
8       138    20.0%
5       59     8.6%
1       57     8.3%
9       8      1.2%
3       8      1.2%
7       6      0.9%
2       6      0.9%

Value   Count  Frequency (%)
1       57     8.3%
2       6      0.9%
3       8      1.2%
4       408    59.1%
5       59     8.6%
7       6      0.9%
8       138    20.0%
9       8      1.2%

Value   Count  Frequency (%)
9       8      1.2%
8       138    20.0%
7       6      0.9%
5       59     8.6%
4       408    59.1%
3       8      1.2%
2       6      0.9%
1       57     8.3%

A7
Real number (ℝ≥0)

ZEROS

Distinct count: 132
Unique (%): 19.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2234057971014494
Minimum: 0.0
Maximum: 28.5
Zeros: 70
Zeros (%): 10.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.165
median: 1
Q3: 2.625
95-th percentile: 8.56875
Maximum: 28.5
Range: 28.5
Interquartile range (IQR): 2.46

Descriptive statistics

Standard deviation: 3.346513359
Coefficient of variation (CV): 1.505129367
Kurtosis: 11.20019166
Mean: 2.223405797
Median Absolute Deviation (MAD): 0.915
Skewness: 2.891330424
Sum: 1534.15
Variance: 11.19915166

Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   70     10.1%
0.25                35     5.1%
0.04                33     4.8%
1                   31     4.5%
0.125               30     4.3%
0.5                 28     4.1%
0.085               26     3.8%
1.5                 25     3.6%
0.165               22     3.2%
2.5                 17     2.5%
2                   16     2.3%
1.75                15     2.2%
5                   13     1.9%
3.5                 12     1.7%
0.29                12     1.7%
0.75                12     1.7%
3                   11     1.6%
1.25                10     1.4%
2.25                10     1.4%
4                   8      1.2%
0.415               8      1.2%
0.375               7      1.0%
5.5                 7      1.0%
6.5                 6      0.9%
0.21                6      0.9%
Other values (107)  220    31.9%

Value   Count  Frequency (%)
0       70     10.1%
0.04    33     4.8%
0.085   26     3.8%
0.125   30     4.3%
0.165   22     3.2%
0.21    6      0.9%
0.25    35     5.1%
0.29    12     1.7%
0.335   5      0.7%
0.375   7      1.0%

Value   Count  Frequency (%)
28.5    1      0.1%
20      2      0.3%
18      1      0.1%
17.5    1      0.1%
16      1      0.1%
15.5    1      0.1%
15      3      0.4%
14.415  1      0.1%
14      3      0.4%
13.875  2      0.3%

A8
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
1       361    52.3%
0       329    47.7%

A9
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
0       395    57.2%
1       295    42.8%

A10
Real number (ℝ≥0)

ZEROS

Distinct count: 23
Unique (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4
Minimum: 0.0
Maximum: 67.0
Zeros: 395
Zeros (%): 57.2%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3
95-th percentile: 11
Maximum: 67
Range: 67
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 4.862940034
Coefficient of variation (CV): 2.026225014
Kurtosis: 50.82943127
Mean: 2.4
Median Absolute Deviation (MAD): 0
Skewness: 5.152519857
Sum: 1656
Variance: 23.64818578

Histogram with fixed size bins (bins=10)

Value   Count  Frequency (%)
0       395    57.2%
1       71     10.3%
2       45     6.5%
3       28     4.1%
6       23     3.3%
11      19     2.8%
5       18     2.6%
7       16     2.3%
4       15     2.2%
8       10     1.4%
9       10     1.4%
12      8      1.2%
14      8      1.2%
10      8      1.2%
15      4      0.6%
16      3      0.4%
17      2      0.3%
20      2      0.3%
23      1      0.1%
40      1      0.1%
67      1      0.1%
13      1      0.1%
19      1      0.1%

Value   Count  Frequency (%)
0       395    57.2%
1       71     10.3%
2       45     6.5%
3       28     4.1%
4       15     2.2%
5       18     2.6%
6       23     3.3%
7       16     2.3%
8       10     1.4%
9       10     1.4%

Value   Count  Frequency (%)
67      1      0.1%
40      1      0.1%
23      1      0.1%
20      2      0.3%
19      1      0.1%
17      2      0.3%
16      3      0.4%
15      4      0.6%
14      8      1.2%
13      1      0.1%

A11
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
0       374    54.2%
1       316    45.8%

A12
Categorical

Distinct count: 3
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
2       625    90.6%
1       57     8.3%
3       8      1.2%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value   Count  Frequency (%)
2       625    90.6%
1       57     8.3%
3       8      1.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  690    100.0%

Most frequent Decimal Number characters

Value   Count  Frequency (%)
2       625    90.6%
1       57     8.3%
3       8      1.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  690    100.0%

Most frequent Common characters

Value   Count  Frequency (%)
2       625    90.6%
1       57     8.3%
3       8      1.2%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   690    100.0%

Most frequent ASCII characters

Value   Count  Frequency (%)
2       625    90.6%
1       57     8.3%
3       8      1.2%

A13
Real number (ℝ≥0)

ZEROS

Distinct count: 171
Unique (%): 24.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 184.0144927536232
Minimum: 0.0
Maximum: 2000.0
Zeros: 132
Zeros (%): 19.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 80
median: 160
Q3: 272
95-th percentile: 460
Maximum: 2000
Range: 2000
Interquartile range (IQR): 192

Descriptive statistics

Standard deviation: 172.1592735
Coefficient of variation (CV): 0.9355745352
Kurtosis: 19.92669755
Mean: 184.0144928
Median Absolute Deviation (MAD): 100
Skewness: 2.749911749
Sum: 126970
Variance: 29638.81546

Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
0                   132    19.1%
120                 35     5.1%
200                 35     5.1%
160                 34     4.9%
80                  30     4.3%
100                 30     4.3%
280                 22     3.2%
180                 18     2.6%
140                 16     2.3%
320                 14     2.0%
240                 14     2.0%
300                 13     1.9%
184                 13     1.9%
260                 11     1.6%
220                 9      1.3%
60                  9      1.3%
400                 9      1.3%
360                 7      1.0%
340                 7      1.0%
380                 5      0.7%
108                 4      0.6%
40                  4      0.6%
144                 4      0.6%
420                 4      0.6%
132                 4      0.6%
Other values (146)  207    30.0%

Value   Count  Frequency (%)
0       132    19.1%
17      1      0.1%
20      2      0.3%
21      1      0.1%
22      1      0.1%
24      1      0.1%
28      1      0.1%
29      1      0.1%
30      1      0.1%
32      1      0.1%

Value   Count  Frequency (%)
2000    1      0.1%
1160    1      0.1%
980     1      0.1%
928     1      0.1%
840     1      0.1%
760     1      0.1%
720     2      0.3%
711     1      0.1%
680     1      0.1%
640     1      0.1%

A14
Real number (ℝ≥0)

Distinct count: 240
Unique (%): 34.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1018.3855072463768
Minimum: 1.0
Maximum: 100001.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 6
Q3: 396.5
95-th percentile: 4120.4
Maximum: 100001
Range: 100000
Interquartile range (IQR): 395.5

Descriptive statistics

Standard deviation: 5210.102598
Coefficient of variation (CV): 5.116041579
Kurtosis: 214.6699724
Mean: 1018.385507
Median Absolute Deviation (MAD): 5
Skewness: 13.14065501
Sum: 702686
Variance: 27145169.08

Histogram with fixed size bins (bins=10)

Value               Count  Frequency (%)
1                   295    42.8%
2                   29     4.2%
1001                10     1.4%
501                 10     1.4%
3                   9      1.3%
301                 8      1.2%
6                   8      1.2%
7                   8      1.2%
201                 6      0.9%
4                   6      0.9%
101                 6      0.9%
51                  5      0.7%
5                   5      0.7%
8                   4      0.6%
3001                4      0.6%
11                  4      0.6%
21                  4      0.6%
151                 4      0.6%
5001                3      0.4%
401                 3      0.4%
4001                3      0.4%
601                 3      0.4%
2001                3      0.4%
19                  3      0.4%
561                 3      0.4%
Other values (215)  244    35.4%

Value   Count  Frequency (%)
1       295    42.8%
2       29     4.2%
3       9      1.3%
4       6      0.9%
5       5      0.7%
6       8      1.2%
7       8      1.2%
8       4      0.6%
9       2      0.3%
10      1      0.1%

Value   Count  Frequency (%)
100001  1      0.1%
51101   1      0.1%
50001   1      0.1%
31286   1      0.1%
26727   1      0.1%
18028   1      0.1%
15109   1      0.1%
15001   1      0.1%
13213   1      0.1%
11203   1      0.1%

target
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value   Count  Frequency (%)
0       383    55.5%
1       307    44.5%

Interactions

[Figures: 64 interaction plots rendered with Matplotlib v3.3.1; images not preserved.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
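
The definition above can be sketched in plain Python. This is an illustrative implementation, not the code the report generator runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined for a constant input")
    return cov / (dev_x * dev_y)


# Hypothetical toy data, not taken from the profiled dataset.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [3.0, 1.0, 6.0, 8.0]
r = pearson_r(xs, ys)

# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 5 for x in xs], ys)
print(r, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale mentioned above.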

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
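
A minimal sketch of this rank-based calculation follows. It assumes ties receive the average of the ranks they span, which is a common convention but is not stated in the text; `average_ranks`, `spearman_rho` and the sample values are illustrative names and data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: a nonlinear but strictly increasing relationship.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear toy relationship, ρ reaches its maximum while Pearson's r stays below it.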

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
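
The pair-counting description corresponds to the variant usually called tau-a. The sketch below implements exactly that description; tied pairs count as neither concordant nor discordant. Libraries often report a tie-adjusted variant (tau-b), so results may differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, as described above."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 is a tie and contributes to neither count
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: two concordant pairs and one discordant pair out of three.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```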

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
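
For illustration, the sketch below computes a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013): the squared phi coefficient and the table dimensions are each corrected using the sample size before taking the ratio. It is written from that general formula, not from the report generator's source, and the function name and sample data are assumptions.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two sequences of nominal values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections (assumed form of the Bergsma correction).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return math.sqrt(phi2_corr / denom)


# Hypothetical data: identical labels (perfect association) and a balanced independent design.
perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```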

Missing values

[Figures: 2 missing-value plots rendered with Matplotlib v3.3.1; images not preserved.]

Sample

First rows

     A1  A2     A3      A4  A5  A6  A7     A8  A9  A10   A11  A12  A13    A14     target
0    1   27.58  2.040   1   6   4   2.000  1   1   3.0   1    2    370.0  561.0   1
1    1   47.33  6.500   2   8   4   1.000  0   0   0.0   1    2    0.0    229.0   0
2    0   68.67  15.000  2   10  9   0.000  1   1   14.0  0    2    0.0    3377.0  1
3    1   22.67  0.750   2   3   4   1.585  0   1   1.0   1    2    400.0  10.0    0
4    1   69.17  9.000   2   1   1   4.000  0   1   1.0   0    2    70.0   7.0     0
5    1   28.58  3.625   2   6   4   0.250  0   0   0.0   1    2    100.0  1.0     0
6    0   23.58  11.500  1   4   8   3.000  0   0   0.0   1    2    20.0   17.0    0
7    1   28.92  15.000  2   8   8   5.335  1   1   11.0  0    2    0.0    2284.0  1
8    1   31.42  15.500  2   8   4   0.500  1   0   0.0   0    2    120.0  1.0     0
9    1   17.25  3.000   2   4   4   0.040  0   0   0.0   1    2    160.0  41.0    0

Last rows

     A1  A2     A3      A4  A5  A6  A7     A8  A9  A10   A11  A12  A13    A14     target
680  1   51.33  10.000  2   3   5   0.000  1   1   11.0  0    2    0.0    1250.0  1
681  0   36.00  1.000   2   8   4   2.000  1   1   11.0  0    2    0.0    457.0   1
682  1   19.42  7.250   2   7   4   0.040  0   1   1.0   0    2    100.0  2.0     0
683  1   35.58  0.750   2   4   4   1.500  0   0   0.0   1    2    231.0  1.0     0
684  0   24.75  3.000   2   11  8   1.835  1   1   19.0  0    2    0.0    501.0   1
685  0   58.33  10.000  2   11  4   4.000  1   1   14.0  0    2    0.0    1603.0  1
686  1   26.67  4.250   2   13  4   4.290  1   1   1.0   1    2    120.0  1.0     1
687  1   29.67  0.750   1   8   4   0.040  0   0   0.0   0    2    240.0  1.0     0
688  1   26.00  1.000   2   11  4   1.750  1   0   0.0   1    2    280.0  1.0     1
689  0   28.17  0.585   2   6   4   0.040  0   0   0.0   0    2    260.0  1005.0  0