Overview

Dataset statistics

Number of variables: 8
Number of observations: 303
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 116
Duplicate rows (%): 38.3%
Total size in memory: 19.1 KiB
Average record size in memory: 64.4 B
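
The percentage rows above are each a count divided by a total. As a quick check, here is a minimal sketch. The helper name `percent` is hypothetical, and the sketch assumes that duplicate rows are measured against the number of observations and missing cells against observations times variables.

```python
def percent(part: int, total: int) -> str:
    """Format part/total as a percentage with one decimal place."""
    return f"{100 * part / total:.1f}%"


n_observations = 303
n_variables = 8

# Duplicate rows relative to all rows.
print(percent(116, n_observations))
# Missing cells relative to all cells (rows x columns).
print(percent(0, n_observations * n_variables))
```

Under these assumptions the two calls reproduce the 38.3% and 0.0% shown in the table.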

Variable types

CAT: 4
BOOL: 3
NUM: 1

Reproduction

Analysis started: 2020-08-25 01:17:09.669493
Analysis finished: 2020-08-25 01:17:10.582116
Duration: 0.91 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 116 (38.3%) duplicate rows (Duplicates)
target has 164 (54.1%) zeros (Zeros)

Variables

sex
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
1      206    68.0%
0      97     32.0%

cp
Categorical

Distinct count: 4
Unique (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
4      144    47.5%
3      86     28.4%
2      50     16.5%
1      23     7.6%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
4      144    47.5%
3      86     28.4%
2      50     16.5%
1      23     7.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  303    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
4      144    47.5%
3      86     28.4%
2      50     16.5%
1      23     7.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  303    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
4      144    47.5%
3      86     28.4%
2      50     16.5%
1      23     7.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  303    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
4      144    47.5%
3      86     28.4%
2      50     16.5%
1      23     7.6%

fbs
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
0      258    85.1%
1      45     14.9%

restecg
Categorical

Distinct count: 3
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
0      151    49.8%
2      148    48.8%
1      4      1.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
0      151    49.8%
2      148    48.8%
1      4      1.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  303    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
0      151    49.8%
2      148    48.8%
1      4      1.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  303    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
0      151    49.8%
2      148    48.8%
1      4      1.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  303    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
0      151    49.8%
2      148    48.8%
1      4      1.3%

exang
Boolean

Distinct count: 2
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
0      204    67.3%
1      99     32.7%

slope
Categorical

Distinct count: 3
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
1      142    46.9%
2      140    46.2%
3      21     6.9%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
1      142    46.9%
2      140    46.2%
3      21     6.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  303    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
1      142    46.9%
2      140    46.2%
3      21     6.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  303    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
1      142    46.9%
2      140    46.2%
3      21     6.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  303    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
1      142    46.9%
2      140    46.2%
3      21     6.9%

thal
Categorical

Distinct count: 4
Unique (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value  Count  Frequency (%)
0      166    54.8%
2      117    38.6%
1      18     5.9%
3      2      0.7%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
0      166    54.8%
2      117    38.6%
1      18     5.9%
3      2      0.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  303    100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
0      166    54.8%
2      117    38.6%
1      18     5.9%
3      2      0.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  303    100.0%

Most frequent Common characters

Value  Count  Frequency (%)
0      166    54.8%
2      117    38.6%
1      18     5.9%
3      2      0.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  303    100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
0      166    54.8%
2      117    38.6%
1      18     5.9%
3      2      0.7%

target
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9372937293729373
Minimum: 0
Maximum: 4
Zeros: 164
Zeros (%): 54.1%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.228535688
Coefficient of variation (CV): 1.310726456
Kurtosis: -0.1387539879
Mean: 0.9372937294
Median Absolute Deviation (MAD): 0
Skewness: 1.058495607
Sum: 284
Variance: 1.509299937
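
Several of these figures can be recomputed from the value counts of target reported in this section (0: 164, 1: 55, 2: 36, 3: 35, 4: 13). The following is a minimal sketch using only the Python standard library. It assumes the variance is the sample variance (divisor n - 1), which is the convention that reproduces the reported value. It is not the code the report generator itself runs.

```python
import math

# Value counts of `target`, copied from the report.
counts = {0: 164, 1: 55, 2: 36, 3: 35, 4: 13}

n = sum(counts.values())                                  # number of observations
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (divisor n - 1); this divisor is an assumption.
variance = sum(
    count * (value - mean) ** 2 for value, count in counts.items()
) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                           # coefficient of variation

print(n, total)
print(round(mean, 10), round(variance, 9), round(std, 9), round(cv, 9))
```

The sum (284), mean, variance, standard deviation and coefficient of variation agree with the table to the precision shown there.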

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
0      164    54.1%
1      55     18.2%
2      36     11.9%
3      35     11.6%
4      13     4.3%

Value  Count  Frequency (%)
0      164    54.1%
1      55     18.2%
2      36     11.9%
3      35     11.6%
4      13     4.3%

Value  Count  Frequency (%)
4      13     4.3%
3      35     11.6%
2      36     11.9%
1      55     18.2%
0      164    54.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
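
The calculation just described can be sketched as follows, using only the standard library. The function name `pearson_r` and the sample data are illustrative assumptions. The sketch uses raw sums of deviations because the divisor (n or n - 1) cancels between the covariance and the standard deviations.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (cross-)deviations; the common divisor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(pearson_r(x, y))
# Shifting and rescaling y (by a positive factor) leaves r unchanged.
print(pearson_r(x, [3 * v + 7 for v in y]))
```

The function is undefined when either variable is constant, since its standard deviation is then zero.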

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
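
In other words, ρ is Pearson's r applied to ranks. The following is a minimal standard-library sketch. The function names are illustrative, and giving tied values the average of their positions is an assumed (though common) tie-handling convention that the text above does not specify.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j (0-based) share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

Because only the ordering matters, the squared values in the example are ranked identically to the originals.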

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
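
The pair-counting definition above can be sketched directly. This is the plain form of the formula (often called tau-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries frequently use a tie-adjusted variant instead, so results on data with many ties may differ from the report's. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y; it adds to neither count.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(kendall_tau_a(x, y))
```

In the example, 8 of the 10 pairs are concordant and 2 are discordant, giving (8 - 2) / 10.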

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
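
As an illustration, the bias correction is commonly written as follows. This is a minimal sketch of that formulation and not necessarily the exact code the report generator runs. The function name and the input format (a contingency table as a list of rows) are assumptions, and the sketch requires every row and column total to be non-zero.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Subtract the expected bias and clip at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


print(cramers_v_corrected([[10, 0], [0, 10]]))  # perfectly associated table
print(cramers_v_corrected([[5, 5], [5, 5]]))    # independent table
```

The first table yields a value of 1 (up to floating-point rounding) and the second yields 0, matching the two ends of the range described above.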

Missing values


Sample

First rows

     sex  cp  fbs  restecg  exang  slope  thal  target
0    1    1   1    2        0      3      1     0
1    1    4   0    2        1      2      0     2
2    1    4   0    2        1      2      2     1
3    1    3   0    0        0      3      0     0
4    0    2   0    2        0      1      0     0
5    1    2   0    0        0      1      0     0
6    0    4   0    2        0      3      0     3
7    0    4   0    0        1      1      0     0
8    1    4   0    2        0      2      2     2
9    1    4   1    2        1      3      2     1

Last rows

     sex  cp  fbs  restecg  exang  slope  thal  target
293  1    4   0    2        1      1      2     2
294  0    4   0    0        1      2      0     1
295  1    2   0    0        0      1      0     0
296  1    4   1    2        0      2      1     3
297  0    4   0    0        1      2      2     1
298  1    1   0    0        0      2      2     1
299  1    4   1    0        0      2      2     2
300  1    4   0    0        1      2      2     3
301  0    2   0    2        0      2      0     1
302  1    3   0    0        0      1      0     0

Duplicate rows

Most frequent

     sex  cp  fbs  restecg  exang  slope  thal  target  count
21   1    2   0    0        0      1      0     0       10
24   1    3   0    0        0      1      0     0       10
1    0    2   0    0        0      1      0     0       8
4    0    3   0    0        0      1      0     0       8
5    0    3   0    0        0      2      0     0       7
6    0    3   0    2        0      1      0     0       6
37   1    4   0    0        1      2      2     3       5
46   1    4   0    2        1      2      2     3       5
10   0    4   0    0        0      1      0     0       4
13   0    4   0    0        1      2      0     1       4