Correlation measures the strength and direction of the linear relationship between two variables. It takes values from -1 (a perfect negative correlation) to 1 (a perfect positive correlation). Covariance measures how two variables vary together: its sign gives the direction of the linear relationship, while its magnitude depends on the units of the variables. Unlike correlation, covariance has no fixed range.
In R, we calculate the correlation with the cor() function and the covariance with the cov() function. The two functions share the same syntax: we pass in either the two variables (vectors) for which we want to calculate the measure (e.g., cor(vector_1, vector_2) or cov(vector_1, vector_2)), or a whole data frame if we want the correlation or covariance between all the variables of that data frame (e.g., cor(df) or cov(df)). With two vectors, the result is a single value; with a data frame, the result is a correlation (or covariance) matrix.
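As a minimal sketch of both call styles, the snippet below uses made-up data: the vectors hours and score and the data frame df are hypothetical names chosen for illustration, not part of any dataset. It also checks that the correlation equals the covariance divided by the product of the two standard deviations, which is why correlation is bounded while covariance is not. Note that cor() and cov() expect numeric columns when given a data frame.

```r
# Hypothetical sample data, invented for this illustration
hours <- c(1, 2, 3, 4, 5)
score <- c(2, 4, 5, 4, 5)

# Two vectors in: each call returns a single number
r_xy   <- cor(hours, score)
cov_xy <- cov(hours, score)

# Correlation is the covariance rescaled by both standard deviations
r_manual <- cov_xy / (sd(hours) * sd(score))

# A data frame of numeric columns in: each call returns a matrix
df <- data.frame(hours = hours, score = score, errors = c(9, 7, 6, 4, 2))
cor_matrix <- cor(df)
cov_matrix <- cov(df)

print(r_xy)
print(cov_xy)
print(cor_matrix)
print(cov_matrix)
```

The diagonal of the correlation matrix is always 1 (each variable against itself), while the diagonal of the covariance matrix holds each variable's variance.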