Correlation - Statistics

 


This lesson introduced several correlation statistics, focusing on the Pearson Correlation method. Here's a summary of the key points covered:


Pearson Correlation Method:


Pearson correlation is a statistical method used to measure the strength and direction of the linear relationship between two continuous numerical variables.

It provides two values: the correlation coefficient and the p-value.

The correlation coefficient ranges from -1 to 1, where:

Close to 1 implies a strong positive correlation.

Close to -1 implies a strong negative correlation.

Close to 0 implies no linear correlation between the variables (they may still be related in a non-linear way).
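To make the range from -1 to 1 concrete, here is a minimal sketch that computes the coefficient directly from its standard definition (the sum of co-deviations from the means, divided by the product of the deviation magnitudes). The function name pearson_r and the sample values are made up for illustration and are not taken from the lesson.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Deviation of each observation from its variable's mean
    dx = x - x.mean()
    dy = y - y.mean()
    # Sum of co-deviations, scaled by the size of the deviations
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Made-up values, for illustration only
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
r_increasing = pearson_r(x, [1.0, 3.0, 5.0, 7.0, 9.0])   # points on a rising line
r_decreasing = pearson_r(x, [9.0, 7.0, 5.0, 3.0, 1.0])   # points on a falling line
r_parabola = pearson_r(x, [4.0, 1.0, 0.0, 1.0, 4.0])     # y = x**2, not linear

print(r_increasing, r_decreasing, r_parabola)
```

The third case is worth noting: y depends entirely on x, yet the coefficient comes out at 0 because the relationship is not linear.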

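The p-value indicates how certain we can be about the calculated correlation coefficient. Formally, it is the probability of seeing a coefficient at least this far from 0 if the two variables were actually uncorrelated, so smaller values mean more certainty. A common rule of thumb: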

A p-value less than 0.001 suggests strong certainty.

A p-value between 0.001 and 0.05 suggests moderate certainty.

A p-value between 0.05 and 0.1 suggests weak certainty.

A p-value larger than 0.1 suggests no certainty of correlation.

Strong correlation is indicated when the correlation coefficient is close to 1 or -1, and the p-value is less than 0.001.
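The p-value thresholds above can be written as a small lookup. This is only a sketch: the function name describe_p_value is hypothetical, and how the exact boundary values (0.001, 0.05, 0.1) are assigned is an assumption, since the summary does not say which side they fall on.

```python
def describe_p_value(p):
    """Map a p-value to the certainty labels used in the summary above.

    Boundary handling is an assumption: each threshold value itself
    is placed in the weaker category.
    """
    if p < 0.001:
        return "strong certainty"
    if p < 0.05:
        return "moderate certainty"
    if p < 0.1:
        return "weak certainty"
    return "no certainty"

for p in (0.0001, 0.01, 0.07, 0.5):
    print(p, "->", describe_p_value(p))
```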

Calculation of Pearson Correlation:


The Pearson Correlation can be calculated with a statistical package such as SciPy: the pearsonr function in its stats module returns both the correlation coefficient and the p-value.
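A minimal sketch of that call is shown below. The horsepower and price arrays are synthetic values generated only so the example runs on its own; they are not the car dataset used in the lesson, and the numbers used to generate them are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the lesson's dataset): price rises with
# horsepower, plus random noise
rng = np.random.default_rng(0)
horsepower = rng.uniform(50, 250, size=200)
price = 150 * horsepower + rng.normal(0, 4000, size=200)

# pearsonr returns the correlation coefficient and the two-sided p-value
r, p_value = stats.pearsonr(horsepower, price)
print(f"correlation coefficient = {r:.3f}, p-value = {p_value:.3g}")
```

With real data, the two arrays would typically be columns of a data frame, and rows with missing values need to be dropped or filled before the call.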

Interpreting Correlation Results:


An example was provided, analyzing the correlation between horsepower and car price.

The correlation coefficient was approximately 0.8, indicating a strong positive correlation.

The small p-value (< 0.001) suggested strong certainty about the correlation.

Creating a Correlation Heat Map:


All variables were considered to create a heat map indicating the correlation between each variable.

The color scheme indicated the Pearson correlation coefficient, providing insight into the strength of the correlation between variables.

The dark red diagonal running across the heat map shows a correlation of 1. This is expected, because each cell on the diagonal is the correlation of a variable with itself.

This correlation heat map provides a comprehensive overview of how different variables are related to one another, particularly in relation to car price.
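A heat map like the one described can be sketched with pandas and Matplotlib. Everything about the data here is an assumption for illustration: the column names, the synthetic values, the red-blue color map, and the output file name are not taken from the lesson.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in data with hypothetical column names
rng = np.random.default_rng(0)
n = 200
horsepower = rng.uniform(50, 250, size=n)
df = pd.DataFrame({
    "horsepower": horsepower,
    "engine_size": 0.8 * horsepower + rng.normal(0, 20, size=n),
    "highway_mpg": 50 - 0.1 * horsepower + rng.normal(0, 3, size=n),
    "price": 150 * horsepower + rng.normal(0, 4000, size=n),
})

# Pairwise Pearson correlation between every pair of columns
corr = df.corr(method="pearson")

# Draw the matrix as a colored grid; red is close to 1, blue is close to -1
fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="RdBu_r", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(image, ax=ax, label="Pearson correlation coefficient")
fig.tight_layout()
fig.savefig("correlation_heatmap.png")
```

Reading along the price row or column of the resulting grid shows how strongly each of the other variables is related to price.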




