Prediction plot
Download data
Model Coefficients
Residual standard error (RSE)
(Represents roughly the average difference between the observed outcome values
and the values predicted by the model. The lower the RSE, the better the model fits our data.)
Multiple R-Squared
(Proportion of variance in the outcome explained by the model's predictors)
Adjusted R-Squared
(Adjusted R-Square takes into account the number of variables and is
most useful for multiple-regression.)
p-value
(The F-test checks whether at least one variable's coefficient is significantly
different from zero. This is a global test to help assess a model.
If the p-value is not significant (e.g. greater than 0.05), then your model
is essentially not explaining anything.)
Residuals are normally distributed
(The Jarque-Bera test checks whether the skewness and kurtosis of the residuals
are similar to those of a normal distribution.)
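An illustrative sketch of the Jarque-Bera test via scipy, run here on synthetic stand-in residuals rather than residuals from an actual model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
residuals = rng.normal(size=500)  # stand-in for a model's residuals

# A large p-value means we cannot reject normality of the residuals.
jb_stat, p_value = stats.jarque_bera(residuals)
print(f"JB statistic: {jb_stat:.3f}, p-value: {p_value:.3f}")
```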
Residuals have constant variance
(The Breusch-Pagan test for heteroskedasticity checks
whether the variance of the errors from a regression depends
on the values of the independent variables.)
Variance inflation factors (VIF)
(VIF provides a measure of multicollinearity among the independent variables
in a multiple regression model. A VIF of 1 means that there is no correlation
between the jth predictor and the remaining predictor variables.
The general rule of thumb is that VIFs exceeding 4 warrant further investigation,
while VIFs exceeding 10 are signs of serious multicollinearity requiring correction.)
Correlation plot
Download data
Pearson correlation
(Pearson correlation evaluates the linear relationship between
two continuous variables. For the Pearson r correlation,
both variables should be normally distributed.
Other assumptions include linearity, homoscedasticity, and the absence of outliers.
Linearity assumes a straight-line relationship between the two variables,
and homoscedasticity assumes that the data are equally distributed about
the regression line.)
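An illustrative Pearson correlation in Python with scipy, using synthetic linearly related variables (data and names are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(scale=0.5, size=100)  # linearly related to x

# r measures the strength of the linear relationship between x and y.
r, p_value = stats.pearsonr(x, y)
print(f"Pearson r = {r:.3f}, p-value = {p_value:.2e}")
```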
Spearman correlation
(The Spearman correlation coefficient is based on the ranked
values for each variable rather than the raw data. The Spearman rank correlation
test does not carry any assumptions about
the distribution of the data and is the appropriate correlation analysis when
the variables are measured on a scale that
is at least ordinal.)
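A sketch of the Spearman rank correlation with scipy; the synthetic relationship below is monotonic but non-linear, which is exactly where a rank-based measure is appropriate (data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.uniform(0, 5, size=100)
y = np.exp(x) + rng.normal(scale=1.0, size=100)  # monotonic but non-linear in x

# rho is computed on the ranks of x and y, not the raw values.
rho, p_value = stats.spearmanr(x, y)
print(f"Spearman rho = {rho:.3f}, p-value = {p_value:.2e}")
```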
Kendall correlation
(The Kendall rank correlation coefficient is non-parametric,
as it does not rely on any assumptions about the distributions of the variables.)
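A sketch of Kendall's tau with scipy on synthetic, monotonically related data (names and data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.normal(size=100)
y = x**3 + rng.normal(scale=0.5, size=100)  # monotonic, clearly non-normal

# tau counts concordant vs. discordant pairs of observations.
tau, p_value = stats.kendalltau(x, y)
print(f"Kendall tau = {tau:.3f}, p-value = {p_value:.2e}")
```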