Overview

Dataset statistics

Number of variables: 16
Number of observations: 690
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 86.4 KiB
Average record size in memory: 128.2 B

Variable types

NUM: 8
BOOL: 4
CAT: 4

Reproduction

Analysis started: 2020-08-25 01:21:07.016273
Analysis finished: 2020-08-25 01:21:18.378021
Duration: 11.36 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

A5 is highly correlated with A4 (High correlation)
A4 is highly correlated with A5 (High correlation)
A3 has 19 (2.8%) zeros (Zeros)
A6 has 9 (1.3%) zeros (Zeros)
A7 has 9 (1.3%) zeros (Zeros)
A8 has 70 (10.1%) zeros (Zeros)
A11 has 395 (57.2%) zeros (Zeros)
A14 has 132 (19.1%) zeros (Zeros)
A15 has 295 (42.8%) zeros (Zeros)

Variables

A1
Categorical

Distinct count: 3
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
2      468    67.8%
1      210    30.4%
0      12     1.7%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
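As a rough illustration of the character properties mentioned above, the sketch below uses Python's standard `unicodedata` module to tally the general category of every character in a list of values. The helper name `category_counts` and the sample values are assumptions made for illustration; this is not necessarily how pandas-profiling builds its tables, and the standard library exposes categories only, not scripts or blocks.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in str(value):
            counts[unicodedata.category(ch)] += 1
    return counts

# Hypothetical single-character values, similar in shape to a digit-coded category.
sample = ["2", "1", "0", "2"]
print(category_counts(sample))  # 'Nd' is the general category "Decimal Number"
```

Every character in a digit-coded column falls in the `Nd` category, which is why the category, script and block tables for such a variable each contain a single row.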

Most occurring characters

Value  Count  Frequency (%)
2      468    67.8%
1      210    30.4%
0      12     1.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  690    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      468    67.8%
1      210    30.4%
0      12     1.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  690    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      468    67.8%
1      210    30.4%
0      12     1.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  690    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      468    67.8%
1      210    30.4%
0      12     1.7%

A2
Real number (ℝ≥0)

Distinct count: 350
Unique (%): 50.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.52898550724638
Minimum: 0
Maximum: 349
Zeros: 1
Zeros (%): 0.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 20.45
Q1: 71
median: 133.5
Q3: 226
95-th percentile: 326.55
Maximum: 349
Range: 349
Interquartile range (IQR): 155

Descriptive statistics

Standard deviation: 96.18894626
Coefficient of variation (CV): 0.6390061418
Kurtosis: -0.9452219534
Mean: 150.5289855
Median Absolute Deviation (MAD): 73
Skewness: 0.4088069518
Sum: 103865
Variance: 9252.313382
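Several of the quantities in the quantile and descriptive statistics above are simple functions of the others. The sketch below re-derives them for A2 from the reported values. The variable names are illustrative, and the identities used (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) are the standard definitions, assumed here to be the ones the report applies.

```python
import math

# Values copied from the A2 statistics above.
minimum, maximum = 0, 349
q1, q3 = 71, 226
mean = 150.5289855
std = 96.18894626

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # coefficient of variation (unitless)
variance = std ** 2               # variance is the squared standard deviation

print(value_range, iqr)
print(round(cv, 6), round(variance, 3))
```

The same checks apply to every numeric variable in the report and are a quick way to catch values mangled during extraction.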
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
349                 12     1.7%
71                  9      1.3%
46                  7      1.0%
31                  6      0.9%
89                  6      0.9%
82                  6      0.9%
69                  6      0.9%
48                  6      0.9%
34                  6      0.9%
94                  6      0.9%
78                  5      0.7%
75                  5      0.7%
125                 5      0.7%
123                 5      0.7%
76                  5      0.7%
179                 5      0.7%
171                 4      0.6%
64                  4      0.6%
74                  4      0.6%
81                  4      0.6%
90                  4      0.6%
91                  4      0.6%
96                  4      0.6%
49                  4      0.6%
102                 4      0.6%
Other values (325)  554    80.3%

Minimum 10 values

Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      2      0.3%
4      1      0.1%
5      2      0.3%
6      2      0.3%
7      1      0.1%
8      2      0.3%
9      3      0.4%

Maximum 10 values

Value  Count  Frequency (%)
349    12     1.7%
348    1      0.1%
347    1      0.1%
346    1      0.1%
345    1      0.1%
344    1      0.1%
343    1      0.1%
342    1      0.1%
341    1      0.1%
340    1      0.1%

A3
Real number (ℝ≥0)

ZEROS

Distinct count: 215
Unique (%): 31.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.758724637681159
Minimum: 0.0
Maximum: 28.0
Zeros: 19
Zeros (%): 2.8%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.165
Q1: 1
median: 2.75
Q3: 7.2075
95-th percentile: 14
Maximum: 28
Range: 28
Interquartile range (IQR): 6.2075

Descriptive statistics

Standard deviation: 4.978163249
Coefficient of variation (CV): 1.046112904
Kurtosis: 2.274021887
Mean: 4.758724638
Median Absolute Deviation (MAD): 2.21
Skewness: 1.488813125
Sum: 3283.52
Variance: 24.78210933
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
1.5                 21     3.0%
3                   19     2.8%
0                   19     2.8%
2.5                 19     2.8%
1.25                16     2.3%
0.75                16     2.3%
0.5                 15     2.2%
5                   14     2.0%
4                   12     1.7%
1.75                12     1.7%
6.5                 12     1.7%
0.585               10     1.4%
2                   10     1.4%
1                   10     1.4%
10                  10     1.4%
0.375               9      1.3%
11                  9      1.3%
11.5                8      1.2%
7                   8      1.2%
0.54                8      1.2%
0.165               8      1.2%
0.835               8      1.2%
12.5                8      1.2%
3.5                 8      1.2%
5.5                 8      1.2%
Other values (190)  393    57.0%

Minimum 10 values

Value  Count  Frequency (%)
0      19     2.8%
0.04   5      0.7%
0.08   1      0.1%
0.085  1      0.1%
0.125  5      0.7%
0.165  8      1.2%
0.17   1      0.1%
0.205  3      0.4%
0.21   3      0.4%
0.25   6      0.9%

Maximum 10 values

Value   Count  Frequency (%)
28      1      0.1%
26.335  1      0.1%
25.21   1      0.1%
25.125  1      0.1%
25.085  1      0.1%
22.29   1      0.1%
22      1      0.1%
21.5    1      0.1%
21      1      0.1%
20      1      0.1%

A4
Categorical

HIGH CORRELATION

Distinct count: 4
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
2      519    75.2%
3      163    23.6%
0      6      0.9%
1      2      0.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
2      519    75.2%
3      163    23.6%
0      6      0.9%
1      2      0.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  690    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      519    75.2%
3      163    23.6%
0      6      0.9%
1      2      0.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  690    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      519    75.2%
3      163    23.6%
0      6      0.9%
1      2      0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  690    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      519    75.2%
3      163    23.6%
0      6      0.9%
1      2      0.3%

A5
Categorical

HIGH CORRELATION

Distinct count: 4
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
1      519    75.2%
3      163    23.6%
0      6      0.9%
2      2      0.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
1      519    75.2%
3      163    23.6%
0      6      0.9%
2      2      0.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  690    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
1      519    75.2%
3      163    23.6%
0      6      0.9%
2      2      0.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  690    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
1      519    75.2%
3      163    23.6%
0      6      0.9%
2      2      0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  690    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
1      519    75.2%
3      163    23.6%
0      6      0.9%
2      2      0.3%

A6
Real number (ℝ≥0)

ZEROS

Distinct count: 15
Unique (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.398550724637682
Minimum: 0
Maximum: 14
Zeros: 9
Zeros (%): 1.3%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 6
Q3: 10
95-th percentile: 14
Maximum: 14
Range: 14
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.109777811
Coefficient of variation (CV): 0.642298231
Kurtosis: -1.241341393
Mean: 6.398550725
Median Absolute Deviation (MAD): 3
Skewness: 0.3088174203
Sum: 4415
Variance: 16.89027366
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
3      137    19.9%
12     78     11.3%
1      64     9.3%
8      59     8.6%
2      54     7.8%
7      53     7.7%
10     51     7.4%
4      41     5.9%
14     38     5.5%
11     38     5.5%
5      30     4.3%
6      25     3.6%
9      10     1.4%
0      9      1.3%
13     3      0.4%

Minimum 10 values

Value  Count  Frequency (%)
0      9      1.3%
1      64     9.3%
2      54     7.8%
3      137    19.9%
4      41     5.9%
5      30     4.3%
6      25     3.6%
7      53     7.7%
8      59     8.6%
9      10     1.4%

Maximum 10 values

Value  Count  Frequency (%)
14     38     5.5%
13     3      0.4%
12     78     11.3%
11     38     5.5%
10     51     7.4%
9      10     1.4%
8      59     8.6%
7      53     7.7%
6      25     3.6%
5      30     4.3%

A7
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.994202898550725
Minimum: 0
Maximum: 9
Zeros: 9
Zeros (%): 1.3%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
median: 8
Q3: 8
95-th percentile: 8
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.594505977
Coefficient of variation (CV): 0.4328358617
Kurtosis: -0.9537779728
Mean: 5.994202899
Median Absolute Deviation (MAD): 0
Skewness: -0.7387113704
Sum: 4136
Variance: 6.731461265
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
8      399    57.8%
4      138    20.0%
1      59     8.6%
3      57     8.3%
0      9      1.3%
9      8      1.2%
5      8      1.2%
2      6      0.9%
6      4      0.6%
7      2      0.3%

Minimum 10 values

Value  Count  Frequency (%)
0      9      1.3%
1      59     8.6%
2      6      0.9%
3      57     8.3%
4      138    20.0%
5      8      1.2%
6      4      0.6%
7      2      0.3%
8      399    57.8%
9      8      1.2%

Maximum 10 values

Value  Count  Frequency (%)
9      8      1.2%
8      399    57.8%
7      2      0.3%
6      4      0.6%
5      8      1.2%
4      138    20.0%
3      57     8.3%
2      6      0.9%
1      59     8.6%
0      9      1.3%

A8
Real number (ℝ≥0)

ZEROS

Distinct count: 132
Unique (%): 19.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2234057971014494
Minimum: 0.0
Maximum: 28.5
Zeros: 70
Zeros (%): 10.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.165
median: 1
Q3: 2.625
95-th percentile: 8.56875
Maximum: 28.5
Range: 28.5
Interquartile range (IQR): 2.46

Descriptive statistics

Standard deviation: 3.346513359
Coefficient of variation (CV): 1.505129367
Kurtosis: 11.20019166
Mean: 2.223405797
Median Absolute Deviation (MAD): 0.915
Skewness: 2.891330424
Sum: 1534.15
Variance: 11.19915166
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                   70     10.1%
0.25                35     5.1%
0.04                33     4.8%
1                   31     4.5%
0.125               30     4.3%
0.5                 28     4.1%
0.085               26     3.8%
1.5                 25     3.6%
0.165               22     3.2%
2.5                 17     2.5%
2                   16     2.3%
1.75                15     2.2%
5                   13     1.9%
0.29                12     1.7%
0.75                12     1.7%
3.5                 12     1.7%
3                   11     1.6%
2.25                10     1.4%
1.25                10     1.4%
4                   8      1.2%
0.415               8      1.2%
5.5                 7      1.0%
0.375               7      1.0%
1.085               6      0.9%
6.5                 6      0.9%
Other values (107)  220    31.9%

Minimum 10 values

Value  Count  Frequency (%)
0      70     10.1%
0.04   33     4.8%
0.085  26     3.8%
0.125  30     4.3%
0.165  22     3.2%
0.21   6      0.9%
0.25   35     5.1%
0.29   12     1.7%
0.335  5      0.7%
0.375  7      1.0%

Maximum 10 values

Value   Count  Frequency (%)
28.5    1      0.1%
20      2      0.3%
18      1      0.1%
17.5    1      0.1%
16      1      0.1%
15.5    1      0.1%
15      3      0.4%
14.415  1      0.1%
14      3      0.4%
13.875  2      0.3%

A9
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
1      361    52.3%
0      329    47.7%

A10
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
0      395    57.2%
1      295    42.8%

A11
Real number (ℝ≥0)

ZEROS

Distinct count: 23
Unique (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4
Minimum: 0.0
Maximum: 67.0
Zeros: 395
Zeros (%): 57.2%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3
95-th percentile: 11
Maximum: 67
Range: 67
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 4.862940034
Coefficient of variation (CV): 2.026225014
Kurtosis: 50.82943127
Mean: 2.4
Median Absolute Deviation (MAD): 0
Skewness: 5.152519857
Sum: 1656
Variance: 23.64818578
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
0      395    57.2%
1      71     10.3%
2      45     6.5%
3      28     4.1%
6      23     3.3%
11     19     2.8%
5      18     2.6%
7      16     2.3%
4      15     2.2%
9      10     1.4%
8      10     1.4%
14     8      1.2%
10     8      1.2%
12     8      1.2%
15     4      0.6%
16     3      0.4%
20     2      0.3%
17     2      0.3%
40     1      0.1%
19     1      0.1%
23     1      0.1%
13     1      0.1%
67     1      0.1%

Minimum 10 values

Value  Count  Frequency (%)
0      395    57.2%
1      71     10.3%
2      45     6.5%
3      28     4.1%
4      15     2.2%
5      18     2.6%
6      23     3.3%
7      16     2.3%
8      10     1.4%
9      10     1.4%

Maximum 10 values

Value  Count  Frequency (%)
67     1      0.1%
40     1      0.1%
23     1      0.1%
20     2      0.3%
19     1      0.1%
17     2      0.3%
16     3      0.4%
15     4      0.6%
14     8      1.2%
13     1      0.1%

A12
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
0      374    54.2%
1      316    45.8%

A13
Categorical

Distinct count: 3
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
0      625    90.6%
2      57     8.3%
1      8      1.2%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
0      625    90.6%
2      57     8.3%
1      8      1.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  690    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
0      625    90.6%
2      57     8.3%
1      8      1.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  690    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
0      625    90.6%
2      57     8.3%
1      8      1.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  690    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
0      625    90.6%
2      57     8.3%
1      8      1.2%

A14
Real number (ℝ≥0)

ZEROS

Distinct count: 171
Unique (%): 24.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.927536231884055
Minimum: 0
Maximum: 170
Zeros: 132
Zeros (%): 19.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7.25
median: 40
Q3: 95.75
95-th percentile: 159
Maximum: 170
Range: 170
Interquartile range (IQR): 88.5

Descriptive statistics

Standard deviation: 54.81326496
Coefficient of variation (CV): 0.9628603061
Kurtosis: -0.8705687147
Mean: 56.92753623
Median Absolute Deviation (MAD): 40
Skewness: 0.6586872057
Sum: 39280
Variance: 3004.494016
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                   132    19.1%
8                   35     5.1%
40                  35     5.1%
23                  34     4.9%
1                   30     4.3%
159                 30     4.3%
74                  22     3.2%
34                  18     2.6%
15                  16     2.3%
86                  14     2.0%
61                  14     2.0%
80                  13     1.9%
170                 13     1.9%
67                  11     1.6%
111                 9      1.3%
51                  9      1.3%
147                 9      1.3%
96                  7      1.0%
90                  7      1.0%
103                 5      0.7%
110                 4      0.6%
122                 4      0.6%
17                  4      0.6%
116                 4      0.6%
141                 4      0.6%
Other values (146)  207    30.0%

Minimum 10 values

Value  Count  Frequency (%)
0      132    19.1%
1      30     4.3%
2      1      0.1%
3      4      0.6%
4      2      0.3%
5      2      0.3%
6      1      0.1%
7      1      0.1%
8      35     5.1%
9      1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
170    13     1.9%
169    1      0.1%
168    1      0.1%
167    3      0.4%
166    1      0.1%
165    1      0.1%
164    1      0.1%
163    2      0.3%
162    2      0.3%
161    1      0.1%

A15
Real number (ℝ≥0)

ZEROS

Distinct count: 240
Unique (%): 34.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1017.3855072463768
Minimum: 0.0
Maximum: 100000.0
Zeros: 295
Zeros (%): 42.8%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 5
Q3: 395.5
95-th percentile: 4119.4
Maximum: 100000
Range: 100000
Interquartile range (IQR): 395.5

Descriptive statistics

Standard deviation: 5210.102598
Coefficient of variation (CV): 5.121070195
Kurtosis: 214.6699724
Mean: 1017.385507
Median Absolute Deviation (MAD): 5
Skewness: 13.14065501
Sum: 701996
Variance: 27145169.08
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                   295    42.8%
1                   29     4.2%
500                 10     1.4%
1000                10     1.4%
2                   9      1.3%
5                   8      1.2%
300                 8      1.2%
6                   8      1.2%
100                 6      0.9%
3                   6      0.9%
200                 6      0.9%
4                   5      0.7%
50                  5      0.7%
20                  4      0.6%
150                 4      0.6%
10                  4      0.6%
3000                4      0.6%
7                   4      0.6%
400                 3      0.4%
560                 3      0.4%
2000                3      0.4%
600                 3      0.4%
18                  3      0.4%
4000                3      0.4%
5000                3      0.4%
Other values (215)  244    35.4%

Minimum 10 values

Value  Count  Frequency (%)
0      295    42.8%
1      29     4.2%
2      9      1.3%
3      6      0.9%
4      5      0.7%
5      8      1.2%
6      8      1.2%
7      4      0.6%
8      2      0.3%
9      1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
100000  1      0.1%
51100   1      0.1%
50000   1      0.1%
31285   1      0.1%
26726   1      0.1%
18027   1      0.1%
15108   1      0.1%
15000   1      0.1%
13212   1      0.1%
11202   1      0.1%

target
Boolean

Distinct count: 2
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value  Count  Frequency (%)
0      383    55.5%
1      307    44.5%

Interactions

[Figures: 64 pairwise interaction plots; images not included]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
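The calculation described above can be sketched in plain Python as follows. The function name `pearson_r` and the sample data are illustrative assumptions, not part of the report; the common 1/n factors in the covariance and standard deviations cancel, so they are omitted.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either variable is constant (r is undefined there).
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson_r(x, [2 * v + 1 for v in x]))      # perfectly linear, increasing
print(pearson_r(x, [-0.5 * v + 10 for v in x]))  # perfectly linear, decreasing
```

Both example relationships are exactly linear, so the magnitude of r is 1 regardless of slope or offset; only the sign follows the direction of the relationship.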

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
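A minimal sketch of this rank-based calculation is given below. The helper names `average_ranks` and `spearman_rho` are assumptions for illustration; ties are given the average of the positions they occupy, which is a common convention but is assumed here rather than taken from the report.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear
print(spearman_rho(x, y))
```

Because only the ordering of the values matters, the cubic relationship in the example is treated the same as a straight line.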

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
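The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. The function name `kendall_tau_a` and the sample data are illustrative assumptions; statistical libraries often default to a tie-adjusted variant (tau-b), so results can differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1   # both variables order the pair the same way
        elif s < 0:
            discordant += 1   # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)

# One swapped neighbour pair out of six pairs: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct version compares every pair and so scales quadratically with the number of observations, which is fine for a dataset of this size.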

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma (2013).
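A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013), is shown below. The function name `cramers_v_corrected` and the example contingency tables are assumptions for illustration and are not taken from this dataset; the report's own implementation may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows of counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, and shrink r and k.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated categories
print(cramers_v_corrected([[25, 25], [25, 25]]))  # independent categories
```

The `max(0.0, ...)` clamp keeps the corrected statistic non-negative, so weak sample associations that are consistent with independence are reported as 0.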

Missing values

[Figures: 2 missing-value plots; images not included]

Sample

First rows

A1  A2  A3  A4  A5  A6  A7  A8  A9  A10  A11  A12  A13  A14  A15  target
01200.54021381.750011.0101595.00
12110.335331080.290000.002400.00
221591.12521730.000011.00016719.00
312831.33521870.335000.0000120.00
421713.500211080.500000.010580.00
511951.25033840.500000.010230.00
611082.00021950.000000.010721.00
72530.87533340.250000.00074204.00
821360.37521380.290000.00051140.00
922310.25021341.085000.0008613.00

Last rows

A1  A2  A3  A4  A5  A6  A7  A8  A9  A10  A11  A12  A13  A14  A15  target
680120012.0002112414.000118.00006590.01
6812557.50021281.415111.0001599800.01
68222405.00021385.000117.00003065.01
68312451.00021812.250100.0100300.01
68421240.58533480.250112.00067500.01
68527611.50021883.500119.000144742.01
686132119.50021385.500117.00003000.01
68721560.00021181.250111.000420.01
6882390.375211282.000112.0101590.01
68921486.50021483.125118.000881200.01