Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 5 |
| Number of observations | 3848 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 150.4 KiB |
| Average record size in memory | 40.0 B |
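The memory figures can be reproduced with simple arithmetic. The sketch below assumes that each of the five columns is stored as a 64-bit float (8 bytes per value) and that the row index adds a fixed 128 bytes, which is typical for a default pandas range index. Neither assumption is stated in the report itself, and the constant names are illustrative only.

```python
# Counts taken from the "Dataset statistics" table.
n_rows = 3848
n_cols = 5

BYTES_PER_VALUE = 8   # assumption: every column is float64
INDEX_BYTES = 128     # assumption: fixed overhead of a default range index

# Whole frame: all values plus the index, counted once.
total_bytes = n_rows * n_cols * BYTES_PER_VALUE + INDEX_BYTES
total_kib = total_bytes / 1024
avg_record_bytes = total_bytes / n_rows

# Single column: its values plus the index.
column_kib = (n_rows * BYTES_PER_VALUE + INDEX_BYTES) / 1024

print(f"Total size in memory:   {total_kib:.1f} KiB")
print(f"Average record size:    {avg_record_bytes:.1f} B")
print(f"Memory size per column: {column_kib:.1f} KiB")
```

Under these assumptions the results round to the 150.4 KiB, 40.0 B and 30.2 KiB reported in the tables.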
Variable types
| Type | Count |
|---|---|
| NUM | 5 |
Reproduction
| Item | Value |
|---|---|
| Analysis started | 2020-08-25 00:02:34.049641 |
| Analysis finished | 2020-08-25 00:02:38.267732 |
| Duration | 4.22 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
RIDGE
Real number (ℝ)
| Statistic | Value |
|---|---|
| Distinct count | 3809 |
| Unique (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.003636643747891553 |
| Minimum | -23.283899307250977 |
| Maximum | 21.40660095214844 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.2 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | -23.28389931 |
| 5-th percentile | -11.17349524 |
| Q1 | -3.983725011 |
| median | -0.163850002 |
| Q3 | 4.64715004 |
| 95-th percentile | 10.14166489 |
| Maximum | 21.40660095 |
| Range | 44.69050026 |
| Interquartile range (IQR) | 8.630875051 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 6.398236563 |
| Coefficient of variation (CV) | -1759.379529 |
| Kurtosis | -0.05367512239 |
| Mean | -0.003636643748 |
| Median Absolute Deviation (MAD) | 4.33465004 |
| Skewness | -0.1305804539 |
| Sum | -13.99380514 |
| Variance | 40.93743111 |
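Several of the derived figures can be recomputed from the primary ones, which is a quick way to sanity-check the report. The sketch below uses only the RIDGE values printed in the tables above; the variable names are illustrative and not part of pandas-profiling. It also shows why the coefficient of variation is so large: the mean is almost zero, so dividing the standard deviation by it inflates the ratio.

```python
# Values copied from the RIDGE tables above.
n = 3848
mean = -0.003636643748
std = 6.398236563
minimum, maximum = -23.28389931, 21.40660095
q1, q3 = -3.983725011, 4.64715004

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all observations
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV       = {cv:.3f}")
print(f"Variance = {variance:.5f}")
print(f"Sum      = {total:.5f}")
print(f"Range    = {value_range:.5f}")
print(f"IQR      = {iqr:.5f}")
```

Because the mean sits so close to zero, the CV here says little about relative spread; the standard deviation and IQR are the more informative figures for this variable.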
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -5.058400154 | 2 | 0.1% | |
| -0.7645000219 | 2 | 0.1% | |
| -6.882999897 | 2 | 0.1% | |
| -1.662400007 | 2 | 0.1% | |
| -0.7056999803 | 2 | 0.1% | |
| 3.00819993 | 2 | 0.1% | |
| -3.240999937 | 2 | 0.1% | |
| -0.8740000129 | 2 | 0.1% | |
| -1.11559999 | 2 | 0.1% | |
| 6.177400112 | 2 | 0.1% | |
| -0.5633000135 | 2 | 0.1% | |
| 5.752299786 | 2 | 0.1% | |
| 4.326399803 | 2 | 0.1% | |
| -1.052600026 | 2 | 0.1% | |
| 0.04720000178 | 2 | 0.1% | |
| -6.354599953 | 2 | 0.1% | |
| 4.098199844 | 2 | 0.1% | |
| 0.7972999811 | 2 | 0.1% | |
| 7.572000027 | 2 | 0.1% | |
| 0.6024000049 | 2 | 0.1% | |
| 4.872900009 | 2 | 0.1% | |
| -1.329399943 | 2 | 0.1% | |
| 4.663099766 | 2 | 0.1% | |
| 4.238100052 | 2 | 0.1% | |
| 6.319200039 | 2 | 0.1% | |
| Other values (3784) | 3798 | 98.7% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -23.28389931 | 1 | < 0.1% | |
| -22.80050087 | 1 | < 0.1% | |
| -21.75200081 | 1 | < 0.1% | |
| -21.74710083 | 1 | < 0.1% | |
| -20.86540031 | 1 | < 0.1% | |
| -19.73810005 | 1 | < 0.1% | |
| -19.46870041 | 1 | < 0.1% | |
| -18.93370056 | 1 | < 0.1% | |
| -18.7663002 | 1 | < 0.1% | |
| -18.62100029 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 21.40660095 | 1 | < 0.1% | |
| 20.63330078 | 1 | < 0.1% | |
| 18.53009987 | 1 | < 0.1% | |
| 18.25749969 | 1 | < 0.1% | |
| 18.05599976 | 1 | < 0.1% | |
| 17.49290085 | 1 | < 0.1% | |
| 16.95219994 | 1 | < 0.1% | |
| 16.44709969 | 1 | < 0.1% | |
| 16.31100082 | 1 | < 0.1% | |
| 16.22249985 | 1 | < 0.1% |
NUB
Real number (ℝ)
| Statistic | Value |
|---|---|
| Distinct count | 3811 |
| Unique (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0001596685136153349 |
| Minimum | -16.393499374389652 |
| Maximum | 17.25830078125 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.2 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | -16.39349937 |
| 5-th percentile | -8.222910118 |
| Q1 | -3.757625043 |
| median | -0.2316999957 |
| Q3 | 3.750525057 |
| 95-th percentile | 8.419920206 |
| Maximum | 17.25830078 |
| Range | 33.65180016 |
| Interquartile range (IQR) | 7.508150101 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 5.186310551 |
| Coefficient of variation (CV) | 32481.73627 |
| Kurtosis | -0.3087882455 |
| Mean | 0.0001596685136 |
| Median Absolute Deviation (MAD) | 3.712400079 |
| Skewness | 0.07219002509 |
| Sum | 0.6144044404 |
| Variance | 26.89781713 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3.125 | 2 | 0.1% | |
| -5.540299892 | 2 | 0.1% | |
| -5.589600086 | 2 | 0.1% | |
| -3.763200045 | 2 | 0.1% | |
| 1.90989995 | 2 | 0.1% | |
| -1.357200027 | 2 | 0.1% | |
| 7.782400131 | 2 | 0.1% | |
| 0.3251000047 | 2 | 0.1% | |
| -1.265799999 | 2 | 0.1% | |
| -5.517700195 | 2 | 0.1% | |
| -2.288599968 | 2 | 0.1% | |
| 2.644000053 | 2 | 0.1% | |
| 1.354500055 | 2 | 0.1% | |
| 9.808300018 | 2 | 0.1% | |
| 7.772600174 | 2 | 0.1% | |
| 0.8011999726 | 2 | 0.1% | |
| -3.010299921 | 2 | 0.1% | |
| 1.626899958 | 2 | 0.1% | |
| 0.01750000007 | 2 | 0.1% | |
| -3.985699892 | 2 | 0.1% | |
| 0.04949999973 | 2 | 0.1% | |
| -1.088600039 | 2 | 0.1% | |
| -2.857800007 | 2 | 0.1% | |
| 4.943099976 | 2 | 0.1% | |
| 2.057399988 | 2 | 0.1% | |
| Other values (3786) | 3798 | 98.7% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -16.39349937 | 1 | < 0.1% | |
| -16.31049919 | 1 | < 0.1% | |
| -15.87110043 | 1 | < 0.1% | |
| -15.18190002 | 1 | < 0.1% | |
| -14.59840012 | 1 | < 0.1% | |
| -14.30830002 | 1 | < 0.1% | |
| -14.14210033 | 1 | < 0.1% | |
| -13.85719967 | 1 | < 0.1% | |
| -13.6111002 | 1 | < 0.1% | |
| -13.53219986 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 17.25830078 | 1 | < 0.1% | |
| 17.18639946 | 1 | < 0.1% | |
| 16.43759918 | 1 | < 0.1% | |
| 15.35270023 | 1 | < 0.1% | |
| 15.04249954 | 1 | < 0.1% | |
| 14.39369965 | 1 | < 0.1% | |
| 14.27009964 | 1 | < 0.1% | |
| 14.21500015 | 1 | < 0.1% | |
| 14.16590023 | 1 | < 0.1% | |
| 14.1079998 | 1 | < 0.1% |
CRACK
Real number (ℝ)
| Statistic | Value |
|---|---|
| Distinct count | 3816 |
| Unique (%) | 99.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0031031201140502016 |
| Minimum | -31.413000106811523 |
| Maximum | 30.317800521850586 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.2 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | -31.41300011 |
| 5-th percentile | -12.8170599 |
| Q1 | -5.453274846 |
| median | -0.05614999868 |
| Q3 | 5.661125183 |
| 95-th percentile | 12.49110994 |
| Maximum | 30.31780052 |
| Range | 61.73080063 |
| Interquartile range (IQR) | 11.11440003 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 7.875198832 |
| Coefficient of variation (CV) | 2537.832421 |
| Kurtosis | -0.1554786953 |
| Mean | 0.003103120114 |
| Median Absolute Deviation (MAD) | 5.568200072 |
| Skewness | -0.05705019029 |
| Sum | 11.9408062 |
| Variance | 62.01875664 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4.329299927 | 2 | 0.1% | |
| -3.96510005 | 2 | 0.1% | |
| 3.8756001 | 2 | 0.1% | |
| 6.566500187 | 2 | 0.1% | |
| -0.4408000112 | 2 | 0.1% | |
| 1.237800002 | 2 | 0.1% | |
| 2.533900023 | 2 | 0.1% | |
| 5.887300014 | 2 | 0.1% | |
| 2.395900011 | 2 | 0.1% | |
| 7.825399876 | 2 | 0.1% | |
| -4.321899891 | 2 | 0.1% | |
| -2.240099907 | 2 | 0.1% | |
| -1.434499979 | 2 | 0.1% | |
| 1.192700028 | 2 | 0.1% | |
| -0.01049999986 | 2 | 0.1% | |
| -1.16989994 | 2 | 0.1% | |
| -2.367899895 | 2 | 0.1% | |
| 8.70759964 | 2 | 0.1% | |
| 3.678200006 | 2 | 0.1% | |
| 2.906300068 | 2 | 0.1% | |
| 10.18789959 | 2 | 0.1% | |
| 0.02080000006 | 2 | 0.1% | |
| -7.734399796 | 2 | 0.1% | |
| -10.39799976 | 2 | 0.1% | |
| 0.6998000145 | 2 | 0.1% | |
| Other values (3791) | 3798 | 98.7% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -31.41300011 | 1 | < 0.1% | |
| -26.46520042 | 1 | < 0.1% | |
| -25.54310036 | 1 | < 0.1% | |
| -23.54290009 | 1 | < 0.1% | |
| -22.73690033 | 1 | < 0.1% | |
| -22.70610046 | 1 | < 0.1% | |
| -22.46310043 | 1 | < 0.1% | |
| -22.42289925 | 1 | < 0.1% | |
| -21.98570061 | 1 | < 0.1% | |
| -21.97030067 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 30.31780052 | 1 | < 0.1% | |
| 25.64340019 | 1 | < 0.1% | |
| 25.16469955 | 1 | < 0.1% | |
| 23.82040024 | 1 | < 0.1% | |
| 22.52599907 | 1 | < 0.1% | |
| 22.26129913 | 1 | < 0.1% | |
| 21.66869926 | 1 | < 0.1% | |
| 21.62319946 | 1 | < 0.1% | |
| 21.4633007 | 1 | < 0.1% | |
| 21.27829933 | 1 | < 0.1% |
WEIGHT
Real number (ℝ)
| Statistic | Value |
|---|---|
| Distinct count | 3826 |
| Unique (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.004237028622148818 |
| Minimum | -34.03519821166992 |
| Maximum | 35.80279922485352 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.2 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | -34.03519821 |
| 5-th percentile | -16.10761023 |
| Q1 | -7.018650055 |
| median | -0.1493500024 |
| Q3 | 6.7997998 |
| 95-th percentile | 17.09936495 |
| Maximum | 35.80279922 |
| Range | 69.83799744 |
| Interquartile range (IQR) | 13.81844985 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 10.04309165 |
| Coefficient of variation (CV) | 2370.314801 |
| Kurtosis | -0.1602870554 |
| Mean | 0.004237028622 |
| Median Absolute Deviation (MAD) | 6.893500015 |
| Skewness | 0.1087342671 |
| Sum | 16.30408614 |
| Variance | 100.86369 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1.578199983 | 2 | 0.1% | |
| -7.30700016 | 2 | 0.1% | |
| -1.281900048 | 2 | 0.1% | |
| -11.93109989 | 2 | 0.1% | |
| 5.632800102 | 2 | 0.1% | |
| -0.5529999733 | 2 | 0.1% | |
| -8.25949955 | 2 | 0.1% | |
| 11.43150043 | 2 | 0.1% | |
| 1.087000012 | 2 | 0.1% | |
| -5.807700157 | 2 | 0.1% | |
| 4.518599987 | 2 | 0.1% | |
| -13.88860035 | 2 | 0.1% | |
| 2.58949995 | 2 | 0.1% | |
| -1.894299984 | 2 | 0.1% | |
| 0.8550999761 | 2 | 0.1% | |
| -8.920399666 | 2 | 0.1% | |
| 13.87629986 | 2 | 0.1% | |
| -11.55000019 | 2 | 0.1% | |
| 2.619600058 | 2 | 0.1% | |
| -1.225800037 | 2 | 0.1% | |
| 14.86410046 | 2 | 0.1% | |
| -8.173800468 | 2 | 0.1% | |
| 12.88829994 | 1 | < 0.1% | |
| -0.2344000041 | 1 | < 0.1% | |
| -3.730799913 | 1 | < 0.1% | |
| Other values (3801) | 3801 | 98.8% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -34.03519821 | 1 | < 0.1% | |
| -32.2765007 | 1 | < 0.1% | |
| -30.96940041 | 1 | < 0.1% | |
| -30.90139961 | 1 | < 0.1% | |
| -30.5734005 | 1 | < 0.1% | |
| -28.51819992 | 1 | < 0.1% | |
| -28.20120049 | 1 | < 0.1% | |
| -27.52059937 | 1 | < 0.1% | |
| -27.19820023 | 1 | < 0.1% | |
| -26.46730042 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 35.80279922 | 1 | < 0.1% | |
| 33.45940018 | 1 | < 0.1% | |
| 30.13699913 | 1 | < 0.1% | |
| 28.79669952 | 1 | < 0.1% | |
| 28.30419922 | 1 | < 0.1% | |
| 27.59659958 | 1 | < 0.1% | |
| 27.56870079 | 1 | < 0.1% | |
| 27.46229935 | 1 | < 0.1% | |
| 27.36070061 | 1 | < 0.1% | |
| 27.15279961 | 1 | < 0.1% |
target
Real number (ℝ)
| Statistic | Value |
|---|---|
| Distinct count | 3784 |
| Unique (%) | 98.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.00016629382793204362 |
| Minimum | -12.03909969329834 |
| Maximum | 10.867300033569336 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.2 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | -12.03909969 |
| 5-th percentile | -5.079669857 |
| Q1 | -2.132449985 |
| median | -0.0304499995 |
| Q3 | 2.028625011 |
| 95-th percentile | 5.374345088 |
| Maximum | 10.86730003 |
| Range | 22.90639973 |
| Interquartile range (IQR) | 4.161074996 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 3.144394589 |
| Coefficient of variation (CV) | 18908.66684 |
| Kurtosis | 0.1951873519 |
| Mean | 0.0001662938279 |
| Median Absolute Deviation (MAD) | 2.080800056 |
| Skewness | 0.1097937519 |
| Sum | 0.6398986499 |
| Variance | 9.887217333 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -2.382400036 | 2 | 0.1% | |
| 2.054800034 | 2 | 0.1% | |
| 0.1541000009 | 2 | 0.1% | |
| -0.8809999824 | 2 | 0.1% | |
| -2.684700012 | 2 | 0.1% | |
| -3.549499989 | 2 | 0.1% | |
| 1.015100002 | 2 | 0.1% | |
| -1.573300004 | 2 | 0.1% | |
| -2.105799913 | 2 | 0.1% | |
| -3.509500027 | 2 | 0.1% | |
| -1.770799994 | 2 | 0.1% | |
| -0.4722000062 | 2 | 0.1% | |
| 3.422199965 | 2 | 0.1% | |
| -0.5922999978 | 2 | 0.1% | |
| 2.336400032 | 2 | 0.1% | |
| -0.3285000026 | 2 | 0.1% | |
| -2.823800087 | 2 | 0.1% | |
| -1.725800037 | 2 | 0.1% | |
| 1.246600032 | 2 | 0.1% | |
| -3.605200052 | 2 | 0.1% | |
| -0.3892999887 | 2 | 0.1% | |
| 0.8540999889 | 2 | 0.1% | |
| -3.504499912 | 2 | 0.1% | |
| 0.3398000002 | 2 | 0.1% | |
| 0.4550999999 | 2 | 0.1% | |
| Other values (3759) | 3798 | 98.7% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -12.03909969 | 1 | < 0.1% | |
| -11.87720013 | 1 | < 0.1% | |
| -10.32610035 | 1 | < 0.1% | |
| -9.708399773 | 1 | < 0.1% | |
| -9.394900322 | 1 | < 0.1% | |
| -9.276200294 | 1 | < 0.1% | |
| -9.003499985 | 1 | < 0.1% | |
| -8.875 | 1 | < 0.1% | |
| -8.84679985 | 1 | < 0.1% | |
| -8.762399673 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10.86730003 | 1 | < 0.1% | |
| 10.49339962 | 1 | < 0.1% | |
| 10.44169998 | 1 | < 0.1% | |
| 10.26500034 | 1 | < 0.1% | |
| 10.15480042 | 1 | < 0.1% | |
| 10.13580036 | 1 | < 0.1% | |
| 9.756699562 | 1 | < 0.1% | |
| 9.680999756 | 1 | < 0.1% | |
| 9.436200142 | 1 | < 0.1% | |
| 9.419799805 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
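The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name `pearson_r` and the sample data are made up for the example. The second and third calls demonstrate the invariance under shifts and positive rescaling, and the sign flip under a negative rescaling.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up, nearly linear sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.9]

r = pearson_r(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
# Negating one variable flips the sign of r but not its magnitude.
r_flipped = pearson_r(x, [-v for v in y])

print(r, r_shifted, r_flipped)
```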
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
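As a rough sketch of that recipe, the snippet below ranks each variable (ties receive the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative and not taken from any library.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: y = x cubed.
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this made-up data ρ reaches its maximum because the relationship is perfectly monotonic, while Pearson's r stays below it because the relationship is not linear.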
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
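The pair-counting definition above translates directly into code. The sketch below implements exactly that formula (often called tau-a); it makes no correction for ties, whereas statistical libraries commonly apply a tie-corrected variant, so results can differ on data with repeated values. The function name `kendall_tau` and the sample data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in X or Y; it counts toward neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up data: one swapped pair out of six gives 5 concordant, 1 discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The double loop over pairs is quadratic in the number of observations, which is fine for a small illustration but slow for large datasets.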
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | RIDGE | NUB | CRACK | WEIGHT | target |
|---|---|---|---|---|---|
| 0 | -2.3482 | 3.6314 | 5.0289 | 10.8721 | -1.3852 |
| 1 | -1.1520 | 1.4805 | 3.2375 | -0.5939 | 2.1235 |
| 2 | -2.5245 | -6.8633 | -2.8037 | 8.4631 | -3.4126 |
| 3 | 5.7523 | -6.5091 | -5.1510 | 4.3480 | -10.3261 |
| 4 | 8.7494 | -3.8978 | -1.3834 | -14.8776 | -2.4153 |
| 5 | 10.4303 | -3.1628 | 12.7885 | -14.8519 | -6.4942 |
| 6 | -3.6049 | 4.6081 | 6.5540 | 5.9773 | 4.0404 |
| 7 | -5.6383 | -0.8158 | -3.8120 | 1.1674 | 7.0468 |
| 8 | 9.5434 | 4.0865 | 2.7542 | -18.9002 | -0.0672 |
| 9 | -9.0292 | 2.9723 | 3.6759 | 13.8820 | 4.2106 |
Last rows
| | RIDGE | NUB | CRACK | WEIGHT | target |
|---|---|---|---|---|---|
| 3838 | 7.1823 | -2.6548 | 0.244700 | -10.8065 | -4.7775 |
| 3839 | 3.4640 | -8.2061 | -1.421900 | -3.3024 | -3.8587 |
| 3840 | -12.0811 | -1.3975 | 4.744400 | 23.9576 | 2.3006 |
| 3841 | 1.7635 | 5.4823 | -7.332600 | -0.2084 | -2.4527 |
| 3842 | 10.0841 | 4.1937 | -4.093000 | -12.4840 | -2.9099 |
| 3843 | -11.1764 | -3.1833 | -0.194100 | 6.8507 | 8.5044 |
| 3844 | 4.8725 | -1.5653 | -1.354000 | -13.8886 | 2.1865 |
| 3845 | 6.3814 | 4.3648 | -22.422899 | -19.1334 | 1.8819 |
| 3846 | 2.7014 | -3.8759 | -7.262700 | -6.2986 | -0.4284 |
| 3847 | 6.6282 | -0.7684 | -10.631300 | -5.9356 | -3.4739 |