Pearson Correlation

Pearson Correlation Definition

For a population

$$\rho_{X,Y} = \frac{cov(X,Y)}{\sigma_X\sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X\sigma_Y}$$

where cov is the covariance, $$\sigma_X$$ and $$\sigma_Y$$ are the standard deviations of X and Y, $$\mu_X$$ and $$\mu_Y$$ are the means of X and Y, and E is the expectation.

For a sample

$$r = \frac{\Sigma_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\Sigma_{i=1}^{n}(X_i-\bar{X})^2}\sqrt{\Sigma_{i=1}^{n}(Y_i-\bar{Y})^2}}$$

An equivalent expression, written in terms of standardized values, is the following:

$$r = \frac{1}{n-1}\Sigma_{i=1}^{n}(\frac{X_i-\bar{X}}{S_X})(\frac{Y_i-\bar{Y}}{S_Y})$$

where,

$$\bar{X} = \frac{1}{n}\Sigma_{i=1}^{n}X_i,\ \text{and}\ S_X = \sqrt{\frac{1}{n-1}\Sigma_{i=1}^{n}(X_i-\bar{X})^{2}}$$
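
As a quick check that the two sample formulas above agree, here is a minimal sketch in R. The vectors `x` and `y` are made-up illustrative values, not data from this page, and the names `r_def` and `r_std` are hypothetical. Both results are compared with the built-in `cor()` function.

```r
# Illustrative fixed sample (assumed values, chosen so the result is easy to verify)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
n <- length(x)

# First form: sum of cross-deviations over the product of the root sums of squares
r_def <- sum((x - mean(x)) * (y - mean(y))) /
  (sqrt(sum((x - mean(x))^2)) * sqrt(sum((y - mean(y))^2)))

# Second form: sum of products of standardized values, divided by n - 1
# sd() uses the n - 1 denominator, matching S_X and S_Y above
r_std <- sum(((x - mean(x)) / sd(x)) * ((y - mean(y)) / sd(y))) / (n - 1)

# Both should match the built-in Pearson correlation
all.equal(r_def, cor(x, y))
all.equal(r_std, cor(x, y))
```

For this sample the cross-deviations sum to 8 and each sum of squares is 10, so r = 8 / 10 = 0.8.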

Calculate Correlation in R

# use cor() from the stats package
# simulate two independent normal samples of size 10
data1 <- rnorm(10, mean = 2, sd = 1)
data2 <- rnorm(10, mean = 4, sd = 2)

# calculate the Pearson correlation coefficient
pearsonRes <- cor(data1, data2, use = "everything", method = "pearson")

Test of Pearson Correlation in R

# use cor.test() from the stats package
# simulate two independent normal samples of size 10
data1 <- rnorm(10, mean = 2, sd = 1)
data2 <- rnorm(10, mean = 4, sd = 2)

# run the two-sided significance test of the Pearson correlation
pearsonTest <- cor.test(data1, data2, alternative = "two.sided", method = "pearson")
# extract the estimated correlation and the p-value
pearsonTestCor <- pearsonTest$estimate
pearsonTestPvalue <- pearsonTest$p.value
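
To show where the p-value comes from, the sketch below recomputes it by hand. It assumes the usual t-based test for Pearson correlation: under the null hypothesis of zero correlation (and normally distributed samples), the statistic r * sqrt(n - 2) / sqrt(1 - r^2) follows a t distribution with n - 2 degrees of freedom. The vectors `x` and `y` and the names `t_stat` and `p_manual` are illustrative, not from the original text.

```r
# Illustrative fixed sample (assumed values)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
n <- length(x)

r <- cor(x, y)                                 # sample Pearson correlation
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)      # t statistic under H0: rho = 0
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)   # two-sided p-value

# Compare with the values reported by cor.test()
res <- cor.test(x, y, alternative = "two.sided", method = "pearson")
all.equal(unname(res$statistic), t_stat)
all.equal(unname(res$p.value), p_manual)
```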
