Today on Wale’s musings, we will be reflecting on Correlation and Regression Analysis, a topic studied in Data Analysis class at the Lagos Business School (LBS).
Regression analysis is a statistical technique for examining the relationships between variables. Typically, the investigator seeks to determine the causal effect of one variable on another, such as the effect of a price increase on demand or the effect of changes in the money supply on the inflation rate.
More precisely, regression analysis is a collection of statistical techniques used to estimate the relationship between a dependent variable and one or more independent variables. It can be used to evaluate the strength of the relationship between variables and to model how that relationship may behave in the future.
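To make the idea concrete, here is a minimal sketch of the simplest case, a straight-line fit by ordinary least squares with one independent variable. The function name fit_line and the price and demand figures are made up purely for illustration; they are not data from the class.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: unit price (independent) and units sold (dependent)
prices = [10, 12, 14, 16, 18]
demand = [100, 92, 85, 74, 69]

slope, intercept = fit_line(prices, demand)
predicted_at_20 = intercept + slope * 20  # estimate demand at a price of 20
print(slope, intercept, predicted_at_20)
```

With these invented numbers the fitted slope is negative, which matches the intuition that demand falls as price rises. A fitted line on its own describes an association and does not prove that one variable causes the other.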
CORRELATION ANALYSIS
In research, correlation analysis is a statistical technique that measures the strength of the linear relationship between two variables. Put simply, it tells us how closely a change in one variable is associated with a change in the other.
What are the 4 types of correlation?
Correlation is a two-variable analysis that measures the strength and direction of the link between two variables. The correlation coefficient ranges between +1 and -1. A value of +1 or -1 indicates that the two variables are perfectly correlated, and as the value approaches zero, the relationship between the two variables weakens. The sign of the coefficient indicates the direction of the relationship: a plus sign indicates a positive relationship, while a minus sign indicates a negative relationship. Typically, four types of correlation are measured in statistics: the Pearson correlation, the Kendall rank correlation, the Spearman correlation, and the Point-Biserial correlation.
Pearson r correlation
Pearson r correlation is the most commonly used correlation statistic for determining the strength of a connection between two linearly related variables. In the financial market, for example, if we want to know how two stocks are related to each other, we can use Pearson r correlation to determine the degree of relationship.
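As a rough sketch of the stock example, Pearson r can be computed as the covariance of the two series divided by the product of their standard deviations. The function name pearson_r and the daily returns below are hypothetical values chosen for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalised covariance and variances (the 1/n factors cancel out)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily returns (in percent) for two stocks
stock_a = [0.5, -0.2, 0.3, 0.1, -0.4]
stock_b = [0.4, -0.1, 0.2, 0.3, -0.5]

r = pearson_r(stock_a, stock_b)
print(round(r, 3))
```

A result close to +1 would suggest the two stocks tend to move together, while a result close to -1 would suggest they tend to move in opposite directions.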
Assumptions of Pearson r correlation
The Pearson r correlation assumes that both variables are normally distributed. (Normally distributed variables have a bell-shaped curve.) Linearity and homoscedasticity are two further assumptions. Linearity assumes that the relationship between the two variables follows a straight line, while homoscedasticity assumes that the data are equally spread around the regression line.
Kendall rank correlation
The Kendall rank correlation is a non-parametric test that assesses the degree to which two variables depend on one another. Suppose we have two samples, a and b, each of size n. The total number of pairings of a with b is then n(n-1)/2.
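The n(n-1)/2 pairings are what the Kendall coefficient is built on. A minimal sketch, assuming the simplest variant (often called tau-a, which makes no correction for ties): count the pairs that are ordered the same way in both samples (concordant) and those ordered in opposite ways (discordant), then divide the difference by the total number of pairs. The function name kendall_tau and the sample values are illustrative.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a: (concordant - discordant) / (n(n-1)/2), no tie correction."""
    n = len(a)
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # Positive product: the pair is ordered the same way in a and in b
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical rankings given to five items by two judges
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 5]

tau = kendall_tau(a, b)
print(tau)
```

Here there are 10 pairs in total, 8 concordant and 2 discordant, so the coefficient works out to 0.6.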
Spearman rank correlation
The Spearman rank correlation is a non-parametric measure of the degree of association between two variables. It is the appropriate correlation analysis to use when the variables are measured on a scale that is at least ordinal, because it does not make any assumptions about the distribution of the data.
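Because Spearman works on ranks rather than raw values, one common way to compute it is to rank each variable and apply the shortcut formula 1 - 6 * sum(d^2) / (n(n^2 - 1)), where d is the difference between the two ranks of each observation. The sketch below assumes there are no tied values, since the shortcut formula is only exact in that case. The names ranks and spearman_rho and the scores are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical scores of five students on two tests
test_one = [35, 23, 47, 17, 10]
test_two = [30, 33, 45, 23, 8]

rho = spearman_rho(test_one, test_two)
print(rho)
```

Only two students swap places between the two tests, so the coefficient comes out high, at 0.9. Any relationship that always moves in the same direction, even a curved one, gives a coefficient of exactly 1.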
Until next time, when we will consider another reflection on my learnings at LBS, have a great week ahead.
OLAWALE