Prediction plot
Download data
Model Coefficients
Residual standard error (RSE)
(Represents roughly the average difference between the observed outcome values
and the values predicted by the model. The lower the RSE, the better the model fits our data.)
Multiple R-Squared
(Proportion of variance in the outcome explained by the model's predictors)
Adjusted R-Squared
(Adjusted R-Square takes into account the number of variables and is
most useful for multiple-regression.)
p-value
(The F-test checks whether at least one variable's coefficient is significantly
different from zero. This is a global test to help assess a model.
If the p-value is not significant (e.g. greater than 0.05), then your model
is essentially not explaining anything.)
Residuals are normally distributed
(The Jarque-Bera test checks whether the skewness and kurtosis of the residuals
are similar to those of a normal distribution.)
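An illustrative sketch of the Jarque-Bera test via scipy, run here on synthetic stand-in residuals rather than residuals from an actual model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
residuals = rng.normal(size=500)  # stand-in for a model's residuals

# A large p-value means we cannot reject normality of the residuals.
jb_stat, p_value = stats.jarque_bera(residuals)
print(f"JB statistic: {jb_stat:.3f}, p-value: {p_value:.3f}")
```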
Residuals have constant variance
(The Breusch-Pagan test for heteroskedasticity checks
whether the variance of the errors from a regression depends
on the values of the independent variables.)
Variance inflation factors (VIF)
(VIF provides a measure of multicollinearity among the independent variables
in a multiple regression model. A VIF of 1 means that there is no correlation
between the jth predictor and the remaining predictor variables.
The general rule of thumb is that VIFs exceeding 4 warrant further investigation,
while VIFs exceeding 10 are signs of serious multicollinearity requiring correction.)
Correlation plot
Download data
Pearson correlation
(Pearson correlation evaluates the linear relationship between
two continuous variables. For the Pearson r correlation,
both variables should be normally distributed.
Other assumptions include linearity, homoscedasticity, and the absence of outliers.
Linearity assumes a straight-line relationship between the two variables,
and homoscedasticity assumes that the data are equally distributed about
the regression line.)
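An illustrative Pearson correlation in Python with scipy, using synthetic linearly related variables (data and names are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(scale=0.5, size=100)  # linearly related to x

# r measures the strength of the linear relationship between x and y.
r, p_value = stats.pearsonr(x, y)
print(f"Pearson r = {r:.3f}, p-value = {p_value:.2e}")
```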
Spearman correlation
(The Spearman correlation coefficient is based on the ranked
values for each variable rather than the raw data. The Spearman rank correlation
test does not carry any assumptions about
the distribution of the data and is the appropriate correlation analysis when
the variables are measured on a scale that
is at least ordinal.)
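A sketch of the Spearman rank correlation with scipy; the synthetic relationship below is monotonic but non-linear, which is exactly where a rank-based measure is appropriate (data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.uniform(0, 5, size=100)
y = np.exp(x) + rng.normal(scale=1.0, size=100)  # monotonic but non-linear in x

# rho is computed on the ranks of x and y, not the raw values.
rho, p_value = stats.spearmanr(x, y)
print(f"Spearman rho = {rho:.3f}, p-value = {p_value:.2e}")
```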
Kendall correlation
(The Kendall rank correlation coefficient is non-parametric,
as it does not rely on any assumptions about the distributions of the variables.)
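A sketch of Kendall's tau with scipy on synthetic, monotonically related data (names and data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.normal(size=100)
y = x**3 + rng.normal(scale=0.5, size=100)  # monotonic, clearly non-normal

# tau counts concordant vs. discordant pairs of observations.
tau, p_value = stats.kendalltau(x, y)
print(f"Kendall tau = {tau:.3f}, p-value = {p_value:.2e}")
```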