Overview

Dataset statistics

Number of variables: 7
Number of observations: 209
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.6 KiB
Average record size in memory: 56.6 B

Variable types

NUM: 7

Reproduction

Analysis started: 2020-08-24 23:56:30.736005
Analysis finished: 2020-08-24 23:56:37.938870
Duration: 7.2 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

CACH has 69 (33.0%) zeros
CHMIN has 5 (2.4%) zeros
CHMAX has 5 (2.4%) zeros
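
The percentages in these warnings can be checked by hand: each is the zero count divided by the 209 observations reported in the overview. The sketch below only reproduces that arithmetic; the helper name `zero_share` is hypothetical and is not part of pandas-profiling.

```python
def zero_share(n_zeros: int, n_rows: int) -> float:
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100.0 * n_zeros / n_rows, 1)


N_ROWS = 209  # number of observations reported in the overview
zero_counts = {"CACH": 69, "CHMIN": 5, "CHMAX": 5}  # counts from the warnings

# Percentage of zeros per flagged variable.
shares = {name: zero_share(count, N_ROWS) for name, count in zero_counts.items()}
print(shares)
```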

Variables

MYCT
Real number (ℝ≥0)

Distinct count: 60
Unique (%): 28.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 203.82296650717703
Minimum: 17.0
Maximum: 1500.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 17
5th percentile: 26
Q1: 50
Median: 110
Q3: 225
95th percentile: 806
Maximum: 1500
Range: 1483
Interquartile range (IQR): 175

Descriptive statistics

Standard deviation: 260.2629259
Coefficient of variation (CV): 1.276906771
Kurtosis: 7.062767112
Mean: 203.8229665
Median Absolute Deviation (MAD): 70
Skewness: 2.544153006
Sum: 42599
Variance: 67736.79062
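
Several of the MYCT figures above are simple functions of the others, so they can be cross-checked without the raw data. The following is a small sketch of that arithmetic using only numbers copied from this section; the variable names are illustrative.

```python
# Values copied from the MYCT summary in this report.
n = 209                 # number of observations
total = 42599           # Sum
minimum, maximum = 17, 1500
q1, q3 = 50, 225
std = 260.2629259       # Standard deviation

mean = total / n                 # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(mean, value_range, iqr, cv, variance)
```

Up to rounding, the results agree with the reported Mean, Range, IQR, CV and Variance.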
Histogram with fixed size bins (bins=10)
Common values

Value              Count  Frequency (%)
50                 25     12.0%
140                9      4.3%
26                 8      3.8%
300                8      3.8%
180                7      3.3%
320                7      3.3%
56                 7      3.3%
38                 7      3.3%
105                6      2.9%
75                 6      2.9%
800                6      2.9%
200                6      2.9%
160                5      2.4%
900                5      2.4%
143                5      2.4%
25                 4      1.9%
110                4      1.9%
60                 4      1.9%
29                 4      1.9%
23                 4      1.9%
400                4      1.9%
330                3      1.4%
480                3      1.4%
115                3      1.4%
250                3      1.4%
Other values (35)  56     26.8%

Minimum 10 values

Value  Count  Frequency (%)
17     2      1.0%
23     4      1.9%
25     4      1.9%
26     8      3.8%
29     4      1.9%
30     3      1.4%
35     1      0.5%
38     7      3.3%
40     2      1.0%
48     1      0.5%

Maximum 10 values

Value  Count  Frequency (%)
1500   2      1.0%
1100   2      1.0%
900    5      2.4%
810    2      1.0%
800    6      2.9%
700    2      1.0%
600    1      0.5%
480    3      1.4%
400    4      1.9%
350    1      0.5%

MMIN
Real number (ℝ≥0)

Distinct count: 25
Unique (%): 12.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2867.9808612440193
Minimum: 64.0
Maximum: 32000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 64
5th percentile: 256
Q1: 768
Median: 2000
Q3: 4000
95th percentile: 8000
Maximum: 32000
Range: 31936
Interquartile range (IQR): 3232

Descriptive statistics

Standard deviation: 3878.742758
Coefficient of variation (CV): 1.352429791
Kurtosis: 17.6132748
Mean: 2867.980861
Median Absolute Deviation (MAD): 1232
Skewness: 3.515933448
Sum: 599408
Variance: 15044645.38
Histogram with fixed size bins (bins=10)
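
A histogram with fixed size bins splits the range from the minimum to the maximum into equally wide intervals and counts the values in each. The sketch below is a plain-Python illustration of that idea, not the plotting code the report used; `fixed_width_histogram` is a hypothetical helper, and the input is only the MMIN column of the ten "First rows" from the Sample section, so the counts do not match the full-dataset histogram.

```python
def fixed_width_histogram(values, bins=10):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: every value is identical, so use a single bin.
        return [lo, hi], [len(values)]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would land on index `bins`; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# MMIN values of the ten "First rows" shown in the Sample section.
mmin_head = [256, 8000, 8000, 8000, 8000, 8000, 16000, 16000, 16000, 32000]
edges, counts = fixed_width_histogram(mmin_head, bins=10)
print(counts)
```

Every bin is half-open except the last, which also includes the maximum so that no value is dropped.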
Common values

Value  Count  Frequency (%)
2000   54     25.8%
1000   38     18.2%
512    22     10.5%
4000   22     10.5%
8000   20     9.6%
256    13     6.2%
768    10     4.8%
16000  7      3.3%
384    2      1.0%
2620   2      1.0%
5240   2      1.0%
262    2      1.0%
3100   2      1.0%
1310   2      1.0%
3000   1      0.5%
192    1      0.5%
128    1      0.5%
1500   1      0.5%
524    1      0.5%
64     1      0.5%
96     1      0.5%
5000   1      0.5%
32000  1      0.5%
2300   1      0.5%
500    1      0.5%

Minimum 10 values

Value  Count  Frequency (%)
64     1      0.5%
96     1      0.5%
128    1      0.5%
192    1      0.5%
256    13     6.2%
262    2      1.0%
384    2      1.0%
500    1      0.5%
512    22     10.5%
524    1      0.5%

Maximum 10 values

Value  Count  Frequency (%)
32000  1      0.5%
16000  7      3.3%
8000   20     9.6%
5240   2      1.0%
5000   1      0.5%
4000   22     10.5%
3100   2      1.0%
3000   1      0.5%
2620   2      1.0%
2300   1      0.5%

MMAX
Real number (ℝ≥0)

Distinct count: 23
Unique (%): 11.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11796.153110047846
Minimum: 64.0
Maximum: 64000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 64
5th percentile: 1200
Q1: 4000
Median: 8000
Q3: 16000
95th percentile: 32000
Maximum: 64000
Range: 63936
Interquartile range (IQR): 12000

Descriptive statistics

Standard deviation: 11726.56438
Coefficient of variation (CV): 0.9941007265
Kurtosis: 5.902470634
Mean: 11796.15311
Median Absolute Deviation (MAD): 4500
Skewness: 2.140662637
Sum: 2465396
Variance: 137512312.1
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
8000   43     20.6%
16000  35     16.7%
4000   33     15.8%
32000  23     11.0%
2000   17     8.1%
12000  10     4.8%
1000   7      3.3%
6000   6      2.9%
5000   5      2.4%
3000   5      2.4%
64000  4      1.9%
24000  4      1.9%
6200   3      1.4%
10480  2      1.0%
512    2      1.0%
2620   2      1.0%
20970  2      1.0%
64     1      0.5%
768    1      0.5%
1500   1      0.5%
4500   1      0.5%
3500   1      0.5%
6300   1      0.5%

Minimum 10 values

Value  Count  Frequency (%)
64     1      0.5%
512    2      1.0%
768    1      0.5%
1000   7      3.3%
1500   1      0.5%
2000   17     8.1%
2620   2      1.0%
3000   5      2.4%
3500   1      0.5%
4000   33     15.8%

Maximum 10 values

Value  Count  Frequency (%)
64000  4      1.9%
32000  23     11.0%
24000  4      1.9%
20970  2      1.0%
16000  35     16.7%
12000  10     4.8%
10480  2      1.0%
8000   43     20.6%
6300   1      0.5%
6200   3      1.4%

CACH
Real number (ℝ≥0)

ZEROS

Distinct count: 22
Unique (%): 10.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.205741626794257
Minimum: 0.0
Maximum: 256.0
Zeros: 69
Zeros (%): 33.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 8
Q3: 32
95th percentile: 128
Maximum: 256
Range: 256
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 40.62872191
Coefficient of variation (CV): 1.611883614
Kurtosis: 10.27838402
Mean: 25.20574163
Median Absolute Deviation (MAD): 8
Skewness: 2.824777332
Sum: 5268
Variance: 1650.693044
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
0      69     33.0%
8      31     14.8%
32     23     11.0%
64     20     9.6%
16     14     6.7%
4      8      3.8%
24     7      3.3%
128    6      2.9%
6      5      2.4%
30     4      1.9%
2      4      1.9%
65     2      1.0%
256    2      1.0%
131    2      1.0%
112    2      1.0%
9      2      1.0%
1      2      1.0%
48     2      1.0%
142    1      0.5%
12     1      0.5%
160    1      0.5%
96     1      0.5%

Minimum 10 values

Value  Count  Frequency (%)
0      69     33.0%
1      2      1.0%
2      4      1.9%
4      8      3.8%
6      5      2.4%
8      31     14.8%
9      2      1.0%
12     1      0.5%
16     14     6.7%
24     7      3.3%

Maximum 10 values

Value  Count  Frequency (%)
256    2      1.0%
160    1      0.5%
142    1      0.5%
131    2      1.0%
128    6      2.9%
112    2      1.0%
96     1      0.5%
65     2      1.0%
64     20     9.6%
48     2      1.0%
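
Each row of these tables is a distinct value, the number of times it occurs, and that count as a percentage of all observations. The sketch below shows one way to build such a table with the standard library; `frequency_table` is a hypothetical helper, and the input is only the CACH column of the twenty rows listed in the Sample section, so the resulting percentages differ from the full-dataset table above.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows, most frequent value first."""
    n = len(values)
    return [
        (value, count, round(100.0 * count / n, 1))
        for value, count in Counter(values).most_common()
    ]


# CACH values of the ten first rows and ten last rows in the Sample section.
cach_sample = [256, 32, 32, 32, 32, 64, 64, 64, 64, 128,
               128, 0, 0, 0, 0, 0, 32, 0, 32, 0]

table = frequency_table(cach_sample)
for value, count, percent in table:
    print(f"{value:<6}{count:<6}{percent}%")
```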
 

CHMIN
Real number (ℝ≥0)

ZEROS

Distinct count: 15
Unique (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.698564593301436
Minimum: 0.0
Maximum: 52.0
Zeros: 5
Zeros (%): 2.4%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 1
Median: 2
Q3: 6
95th percentile: 16
Maximum: 52
Range: 52
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.816273503
Coefficient of variation (CV): 1.450714014
Kurtosis: 22.46889907
Mean: 4.698564593
Median Absolute Deviation (MAD): 1
Skewness: 4.027332866
Sum: 982
Variance: 46.46158447
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
1      94     45.0%
3      28     13.4%
8      18     8.6%
6      16     7.7%
12     11     5.3%
16     10     4.8%
4      8      3.8%
5      7      3.3%
2      6      2.9%
0      5      2.4%
52     2      1.0%
26     1      0.5%
24     1      0.5%
7      1      0.5%
32     1      0.5%

Minimum 10 values

Value  Count  Frequency (%)
0      5      2.4%
1      94     45.0%
2      6      2.9%
3      28     13.4%
4      8      3.8%
5      7      3.3%
6      16     7.7%
7      1      0.5%
8      18     8.6%
12     11     5.3%

Maximum 10 values

Value  Count  Frequency (%)
52     2      1.0%
32     1      0.5%
26     1      0.5%
24     1      0.5%
16     10     4.8%
12     11     5.3%
8      18     8.6%
7      1      0.5%
6      16     7.7%
5      7      3.3%

CHMAX
Real number (ℝ≥0)

ZEROS

Distinct count: 31
Unique (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.267942583732058
Minimum: 0.0
Maximum: 176.0
Zeros: 5
Zeros (%): 2.4%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 5
Median: 8
Q3: 24
95th percentile: 64
Maximum: 176
Range: 176
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 25.99731821
Coefficient of variation (CV): 1.423111447
Kurtosis: 15.88808138
Mean: 18.26794258
Median Absolute Deviation (MAD): 6
Skewness: 3.59590538
Sum: 3818
Variance: 675.8605539
Histogram with fixed size bins (bins=10)
Common values

Value             Count  Frequency (%)
6                 30     14.4%
24                24     11.5%
8                 20     9.6%
32                15     7.2%
5                 13     6.2%
16                12     5.7%
2                 11     5.3%
4                 11     5.3%
1                 10     4.8%
3                 9      4.3%
12                8      3.8%
64                5      2.4%
0                 5      2.4%
20                5      2.4%
14                4      1.9%
10                4      1.9%
38                3      1.4%
54                3      1.4%
176               2      1.0%
104               2      1.0%
7                 2      1.0%
128               2      1.0%
19                1      0.5%
48                1      0.5%
15                1      0.5%
Other values (6)  6      2.9%

Minimum 10 values

Value  Count  Frequency (%)
0      5      2.4%
1      10     4.8%
2      11     5.3%
3      9      4.3%
4      11     5.3%
5      13     6.2%
6      30     14.4%
7      2      1.0%
8      20     9.6%
10     4      1.9%

Maximum 10 values

Value  Count  Frequency (%)
176    2      1.0%
128    2      1.0%
112    1      0.5%
104    2      1.0%
64     5      2.4%
54     3      1.4%
52     1      0.5%
48     1      0.5%
38     3      1.4%
32     15     7.2%

target
Real number (ℝ≥0)

Distinct count: 116
Unique (%): 55.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 105.622009569378
Minimum: 6.0
Maximum: 1150.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 6
5th percentile: 12
Q1: 27
Median: 50
Q3: 113
95th percentile: 386.2
Maximum: 1150
Range: 1144
Interquartile range (IQR): 86

Descriptive statistics

Standard deviation: 160.8307331
Coefficient of variation (CV): 1.522700938
Kurtosis: 19.25218675
Mean: 105.6220096
Median Absolute Deviation (MAD): 27
Skewness: 3.892814292
Sum: 22075
Variance: 25866.52471
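
The large positive skewness and kurtosis reflect a long right tail: most target values are small, while a few exceed 1000. The sketch below shows how such figures can be computed, assuming the bias-adjusted sample estimators that pandas uses by default for `Series.skew()` and `Series.kurt()`; whether the report used exactly these estimators is an assumption. The function names are illustrative, and the input is only the target column of the ten "First rows" from the Sample section, so the results will not equal the full-dataset values.

```python
import math


def sample_skewness(xs):
    """Bias-adjusted skewness (adjusted Fisher-Pearson); needs n >= 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def sample_excess_kurtosis(xs):
    """Bias-adjusted excess kurtosis (0 for a normal law); needs n >= 4."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    g2 = m4 / m2 ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


# target values of the ten "First rows" shown in the Sample section.
target_head = [198, 269, 220, 172, 132, 318, 367, 489, 636, 1144]
print(sample_skewness(target_head), sample_excess_kurtosis(target_head))
```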
Histogram with fixed size bins (bins=10)
Common values

Value              Count  Frequency (%)
32                 7      3.3%
22                 6      2.9%
50                 6      2.9%
40                 6      2.9%
16                 5      2.4%
45                 5      2.4%
24                 5      2.4%
36                 5      2.4%
38                 4      1.9%
60                 4      1.9%
66                 4      1.9%
30                 4      1.9%
18                 4      1.9%
12                 4      1.9%
20                 4      1.9%
26                 4      1.9%
11                 3      1.4%
33                 3      1.4%
27                 3      1.4%
62                 3      1.4%
277                2      1.0%
6                  2      1.0%
465                2      1.0%
71                 2      1.0%
72                 2      1.0%
Other values (91)  110    52.6%

Minimum 10 values

Value  Count  Frequency (%)
6      2      1.0%
7      1      0.5%
8      1      0.5%
10     1      0.5%
11     3      1.4%
12     4      1.9%
13     1      0.5%
14     2      1.0%
16     5      2.4%
17     2      1.0%

Maximum 10 values

Value  Count  Frequency (%)
1150   1      0.5%
1144   1      0.5%
915    1      0.5%
636    1      0.5%
510    2      1.0%
489    1      0.5%
465    2      1.0%
405    1      0.5%
397    1      0.5%
370    1      0.5%

Interactions

[Figures: 49 pairwise interaction plots, one per ordered pair of the 7 variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
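
The calculation described above can be sketched in plain Python. This is an illustration of the formula rather than the code the report ran; `pearson_r` is a hypothetical helper, and the inputs are only the MMAX and target columns of the ten "First rows" in the Sample section, so the result is not the value shown in the report's correlation matrix.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their std devs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# MMAX and target from the ten "First rows" shown in the Sample section.
mmax_head = [6000, 32000, 32000, 32000, 16000, 32000, 32000, 32000, 64000, 64000]
target_head = [198, 269, 220, 172, 132, 318, 367, 489, 636, 1144]

r = pearson_r(mmax_head, target_head)
# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * x + 5 for x in mmax_head], target_head)
print(r, r_rescaled)
```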

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
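
As a sketch of this recipe, the code below converts each variable to ranks and then applies the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and is an assumption here; the helper names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: rho is 1 while Pearson's r is not.
xs = [1, 2, 3, 4]
ys = [1, 8, 27, 64]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```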

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
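
The pair counting can be sketched directly. The version below implements the formula exactly as stated, which is often called tau-a; libraries such as SciPy default to a tie-adjusted variant (tau-b), so values in the report may differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Three pairs in total: two concordant and one discordant, so tau = 1/3.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```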

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: two missing-value plots; the dataset has no missing cells]

Sample

First rows

      MYCT     MMIN     MMAX   CACH  CHMIN  CHMAX  target
0    125.0    256.0   6000.0  256.0   16.0  128.0   198.0
1     29.0   8000.0  32000.0   32.0    8.0   32.0   269.0
2     29.0   8000.0  32000.0   32.0    8.0   32.0   220.0
3     29.0   8000.0  32000.0   32.0    8.0   32.0   172.0
4     29.0   8000.0  16000.0   32.0    8.0   16.0   132.0
5     26.0   8000.0  32000.0   64.0    8.0   32.0   318.0
6     23.0  16000.0  32000.0   64.0   16.0   32.0   367.0
7     23.0  16000.0  32000.0   64.0   16.0   32.0   489.0
8     23.0  16000.0  64000.0   64.0   16.0   32.0   636.0
9     23.0  32000.0  64000.0  128.0   32.0   64.0  1144.0

Last rows

      MYCT     MMIN     MMAX   CACH  CHMIN  CHMAX  target
199   30.0   8000.0  64000.0  128.0   12.0  176.0  1150.0
200  180.0    262.0   4000.0    0.0    1.0    3.0    12.0
201  180.0    512.0   4000.0    0.0    1.0    3.0    14.0
202  180.0    262.0   4000.0    0.0    1.0    3.0    18.0
203  180.0    512.0   4000.0    0.0    1.0    3.0    21.0
204  124.0   1000.0   8000.0    0.0    1.0    8.0    42.0
205   98.0   1000.0   8000.0   32.0    2.0    8.0    46.0
206  125.0   2000.0   8000.0    0.0    2.0   14.0    52.0
207  480.0    512.0   8000.0   32.0    0.0    0.0    67.0
208  480.0   1000.0   4000.0    0.0    0.0    0.0    45.0