MSE is a risk function, corresponding to the expected value of the squared error loss or quadratic loss. Estimating the unknown population mean with the sample mean "costs us one degree of freedom".

The estimate of σ² shows up directly in Minitab's standard regression analysis output. Both analysis of variance and linear regression techniques estimate the MSE as part of the analysis and use the estimated MSE to determine the statistical significance of the factors or predictors under study. Will we ever know this value σ²?

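However, one of the assumptions of classical linear regression is that the error terms conditional on different $X$ values all have the same variance, that is, for any $X_i$ and $X_j$, $\operatorname{Var}(\varepsilon_i \mid X_i) = \operatorname{Var}(\varepsilon_j \mid X_j) = \sigma^2$.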

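The MSE can be written as the sum of the variance of the estimator and the squared bias of the estimator, $\operatorname{MSE}(\hat\theta) = \operatorname{Var}(\hat\theta) + \operatorname{Bias}(\hat\theta, \theta)^2$, providing a useful way to calculate the MSE and implying that for an unbiased estimator the MSE equals the variance.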
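The decomposition of the MSE into variance plus squared bias can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `mse_decomposition`, the simulated normal data, and the choice of the uncorrected sample variance as the estimator are illustrative assumptions, not part of the original text.

```python
import numpy as np

def mse_decomposition(estimates, true_value):
    """Return (mse, variance, bias) for a set of simulated estimates.

    Hypothetical helper for illustration only.
    """
    estimates = np.asarray(estimates, dtype=float)
    mse = np.mean((estimates - true_value) ** 2)  # empirical MSE
    variance = np.var(estimates)                  # spread of the estimator (ddof=0)
    bias = np.mean(estimates) - true_value        # average estimate minus truth
    return mse, variance, bias

# Assumed setup: 10,000 simulated samples of size 10 from N(0, 2^2),
# so the true variance is 4.
rng = np.random.default_rng(0)
true_var = 4.0
samples = rng.normal(loc=0.0, scale=2.0, size=(10_000, 10))

# Uncorrected sample variance (divides by n), a biased estimator of sigma^2.
estimates = samples.var(axis=1, ddof=0)

mse, variance, bias = mse_decomposition(estimates, true_var)
print("MSE:               ", mse)
print("variance + bias^2: ", variance + bias ** 2)
print("bias:              ", bias)
```

The two printed quantities agree up to floating-point error, and the bias comes out negative because dividing by n underestimates the variance on average.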

Example (mean). Suppose we have a random sample of size $n$ from a population, $X_1, \dots, X_n$. This definition for a known, computed quantity differs from the above definition for the computed MSE of a predictor in that a different denominator is used.

Suppose you have two brands (A and B) of thermometers, and each brand offers a Celsius thermometer and a Fahrenheit thermometer. Will this thermometer brand (A) yield more precise future predictions …? … or this one (B)? I have fit a multiple linear regression model in EViews, and I am asked to calculate the "estimated unbiased variance of the error term, i.e., $\hat\sigma^2$".

In the Analysis of Variance table, the value of MSE, 74.67, appears appropriately under the column labeled MS (for Mean Square) and in the row labeled Residual Error (for Error). The usual estimator for the variance is the corrected sample variance: $S^2_{n-1} = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$. When the error variance is not constant, weighted least squares is used to correct for the heteroscedasticity. For a Gaussian distribution this is the best unbiased estimator (that is, it has the lowest MSE among all unbiased estimators), but not, say, for a uniform distribution.
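The corrected sample variance divides the sum of squared deviations by n − 1 rather than n. A minimal sketch, assuming NumPy; the function name `corrected_sample_variance` and the data values are made up for illustration.

```python
import numpy as np

def corrected_sample_variance(x):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return np.sum((x - x.mean()) ** 2) / (n - 1)

# Hypothetical data, chosen only so the arithmetic is easy to follow:
# the mean is 5 and the squared deviations sum to 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s2 = corrected_sample_variance(data)
print(s2)                    # 32 / 7
print(np.var(data, ddof=1))  # NumPy's equivalent, using ddof=1
```

Passing `ddof=1` to `np.var` gives the same value; NumPy's default `ddof=0` gives the uncorrected version that divides by n.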

Now let's extend this thinking to arrive at an estimate for the population variance σ² in the simple linear regression setting. So just as with sample variances in univariate samples, reducing the denominator can make the value correct on average; that is, $s^2 = \frac{n}{n-p}s^2_n = \frac{RSS}{n-p}=\frac{1}{n-p}\sum_{i=1}^n(y_i-\hat y_i)^2$, where RSS is the residual sum of squares, $s^2_n = RSS/n$, and $p$ is the number of estimated coefficients. Based on the resulting data, you obtain two estimated regression lines: one for brand A and one for brand B.
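The estimate $s^2 = RSS/(n-p)$ can be computed from any least-squares fit. The sketch below assumes NumPy and a design matrix whose first column is the intercept, so that $p$ counts the intercept as well; the function name `residual_variance` and the five data points are hypothetical and are not taken from the thermometer example.

```python
import numpy as np

def residual_variance(X, y):
    """Return (s2, rss), where s2 = RSS / (n - p) for a least-squares fit.

    X is the n-by-p design matrix (including the intercept column, if any).
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
    residuals = y - X @ beta                      # y_i - yhat_i
    rss = float(residuals @ residuals)            # residual sum of squares
    return rss / (n - p), rss

# Made-up data for a simple linear regression (intercept plus one slope, p = 2).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])

s2, rss = residual_variance(X, y)
print("RSS:", rss)
print("s^2:", s2)
```

For these five points the fitted line is 2.2 + 0.6x, the RSS is 2.4, and dividing by n − p = 3 gives an estimate of 0.8.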

The numerator adds up how far each response \(y_i\) is from the estimated mean \(\bar{y}\) in squared units, and the denominator divides the sum by \(n-1\), not \(n\) as you would