Residual Calculator
Free residual calculator for linear regression. Residuals, SSE, MSE, RMSE, standardized, studentized
Why This Statistical Analysis Matters
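Why: Residuals show how well a fitted line describes the data and which observations it fits poorly.
How: Enter paired x and y values. The calculator fits a least-squares line and reports the residual diagnostics listed below.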
Residual Calculator — Regression Diagnostics
Individual residuals, SSE, MSE, RMSE, standardized, studentized, Cook's distance, leverage, Durbin-Watson.
Residual Table
| i | x | y | ŷ | Residual | Standardized | Studentized | Leverage | Cook's D |
|---|---|---|---|---|---|---|---|---|
| 1 | 1.0000 | 2.0000 | 2.0000 | 0.0000 | 0.0000 | 0.0000 | 0.6000 | 0.0000 |
| 2 | 2.0000 | 4.0000 | 3.7000 | 0.3000 | 1.1339 | 1.2247 | 0.3000 | 0.2755 |
| 3 | 3.0000 | 5.0000 | 5.4000 | -0.4000 | -1.4142 | -2.0000 | 0.2000 | 0.2500 |
| 4 | 4.0000 | 7.0000 | 7.1000 | -0.1000 | -0.3780 | -0.3162 | 0.3000 | 0.0306 |
| 5 | 5.0000 | 9.0000 | 8.8000 | 0.2000 | 1.0000 | 1.0000 | 0.6000 | 0.7500 |
[Charts: Residuals vs Fitted, Q-Q Plot of Residuals, Residual Histogram]
For educational and informational purposes only. Verify with a qualified professional.
Key Takeaways
- Residual: eᵢ = yᵢ − ŷᵢ. Difference between observed and fitted.
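- Properties: Σeᵢ = 0 and Σeᵢxᵢ = 0 for a least-squares fit that includes an intercept.
- Standardized residual: rᵢ = eᵢ / (s × √(1 − hᵢᵢ)), with s = √MSE. Puts all residuals on a common scale for comparison.
- Studentized (externally): Replaces s with the leave-one-out estimate s₍ᵢ₎, computed without point i. Better for outlier detection.
- Cook's distance: Dᵢ = rᵢ² × hᵢᵢ / (p(1−hᵢᵢ)), where p is the number of fitted coefficients (p = 2 for simple linear regression). Influential if Dᵢ > 4/n.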
- Leverage: hᵢᵢ = 1/n + (xᵢ−x̄)²/Σ(xⱼ−x̄)². High if hᵢᵢ > 2p/n.
- Durbin-Watson: DW ≈ 2 means no autocorrelation. DW < 1.5 or > 2.5 suggests issues.
- SSE, MSE, RMSE: SSE = Σeᵢ², MSE = SSE/(n−2), RMSE = √MSE.
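The residual and error-summary definitions above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's own implementation; the function names `fit_line` and `error_summary` are made up for this example, and the data are the five sample points from the residual table.

```python
import math

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def error_summary(x, y):
    """Return residuals, SSE, MSE = SSE/(n-2), and RMSE = sqrt(MSE)."""
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    mse = sse / (len(x) - 2)  # two coefficients are estimated
    return residuals, sse, mse, math.sqrt(mse)

# Sample data from the residual table
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 7, 9]
residuals, sse, mse, rmse = error_summary(x, y)
print([round(e, 4) for e in residuals])
print(round(sse, 4), round(mse, 4), round(rmse, 4))
```

The function needs at least three points with distinct x values; otherwise the division by n − 2 or by Σ(xᵢ−x̄)² fails.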
Formulas Reference
ŷ = b₀ + b₁x, eᵢ = yᵢ − ŷᵢ
SSE = Σeᵢ², MSE = SSE/(n−2), RMSE = √MSE
Standardized: rᵢ = eᵢ / (s × √(1 − hᵢᵢ)), s = √(SSE/(n−2))
Leverage: hᵢᵢ = 1/n + (xᵢ−x̄)² / Σ(xⱼ−x̄)²
Cook's D: Dᵢ = rᵢ² × hᵢᵢ / (p(1−hᵢᵢ)), p = number of fitted coefficients (2 for simple linear regression)
Durbin-Watson: DW = Σ(eᵢ − eᵢ₋₁)² / Σeᵢ²
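As a rough sketch of how the leverage, standardized-residual, and Cook's D formulas fit together, the following Python function evaluates them for simple linear regression. The name `diagnostics` and the constant `P` are illustrative assumptions, not part of the calculator; the data are the sample points from the residual table.

```python
import math

P = 2  # fitted coefficients in simple linear regression (b0 and b1)

def diagnostics(x, y):
    """Return (leverage, standardized residuals, Cook's D) as lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(ei ** 2 for ei in e) / (n - P))
    # h_ii = 1/n + (x_i - x_bar)^2 / Sxx
    h = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # r_i = e_i / (s * sqrt(1 - h_ii))
    r = [ei / (s * math.sqrt(1 - hi)) for ei, hi in zip(e, h)]
    # D_i = r_i^2 * h_ii / (p * (1 - h_ii))
    d = [ri ** 2 * hi / (P * (1 - hi)) for ri, hi in zip(r, h)]
    return h, r, d

h, r, d = diagnostics([1, 2, 3, 4, 5], [2, 4, 5, 7, 9])
for i, (hi, ri, di) in enumerate(zip(h, r, d), start=1):
    print(i, round(hi, 4), round(ri, 4), round(di, 4))
```

Computing leverage first and reusing it keeps each later quantity a one-line translation of its formula.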
Residual Diagnostics Checklist
- Residuals vs fitted: Random scatter around 0. Curves suggest nonlinearity.
- Q-Q plot: Points along diagonal suggest normal residuals.
- Histogram: Bell-shaped residuals support normality assumption.
- Cook's distance: Values > 4/n indicate influential points.
- Durbin-Watson: 1.5 < DW < 2.5 suggests no autocorrelation.
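The numeric checks in this list can be automated once Cook's D and leverage are known. The sketch below applies the 4/n rule for Cook's distance and the 2p/n rule for leverage; `flag_points` is a hypothetical helper name, and the input values are copied from the residual table.

```python
def flag_points(cooks_d, leverage, p=2):
    """Return 1-based indices of points exceeding the rule-of-thumb cutoffs."""
    n = len(cooks_d)
    d_cutoff = 4 / n        # Cook's distance rule of thumb
    h_cutoff = 2 * p / n    # leverage rule of thumb
    return {
        "influential": [i for i, d in enumerate(cooks_d, start=1) if d > d_cutoff],
        "high_leverage": [i for i, h in enumerate(leverage, start=1) if h > h_cutoff],
    }

# Values from the residual table (n = 5, so both cutoffs equal 0.8)
cooks_d = [0.0000, 0.2755, 0.2500, 0.0306, 0.7500]
leverage = [0.6, 0.3, 0.2, 0.3, 0.6]
flags = flag_points(cooks_d, leverage)
print(flags)
```

With only five points both cutoffs are 0.8, so no point is flagged even though point 5 comes close on Cook's D. These thresholds are screening aids rather than formal tests.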
Frequently Asked Questions
What is the difference between standardized and studentized residuals?
Standardized uses the full-sample SE. Studentized (externally) uses leave-one-out SE for each point, making outlier detection more accurate.
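To make the leave-one-out idea concrete, here is a minimal sketch that refits the line without each point in turn and uses that fit's residual standard error in the denominator. The helper names are illustrative; the approach assumes simple linear regression with at least four points, and it is one straightforward way to compute the statistic, not necessarily how this calculator does it.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1

def studentized_external(x, y):
    """Externally studentized residuals via explicit leave-one-out refits."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    out = []
    for i in range(n):
        e_i = y[i] - (b0 + b1 * x[i])              # residual from the full fit
        h_i = 1 / n + (x[i] - x_bar) ** 2 / sxx    # leverage from the full fit
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        c0, c1 = fit_line(x_rest, y_rest)          # refit without point i
        sse_rest = sum((yj - (c0 + c1 * xj)) ** 2 for xj, yj in zip(x_rest, y_rest))
        s_i = math.sqrt(sse_rest / (n - 1 - 2))    # n-1 points, 2 coefficients
        out.append(e_i / (s_i * math.sqrt(1 - h_i)))
    return out

t = studentized_external([1, 2, 3, 4, 5], [2, 4, 5, 7, 9])
print([round(ti, 4) for ti in t])
```

Because s₍ᵢ₎ excludes point i, an outlier cannot inflate its own denominator, which is why this version separates unusual points more sharply than the standardized residual.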
When is Cook's distance considered high?
Common rules of thumb flag Dᵢ > 4/n, or Dᵢ > 4/(n−k−1) where k is the number of predictors (k = 1 in simple linear regression). Investigate flagged points before deciding whether to exclude them.
What does Durbin-Watson tell us?
DW tests for first-order autocorrelation in the residuals and ranges from 0 to 4. DW ≈ 2 means no autocorrelation. DW < 1.5 suggests positive autocorrelation; DW > 2.5 suggests negative.
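The statistic itself is a short ratio, DW = Σ(eᵢ − eᵢ₋₁)² / Σeᵢ². A minimal sketch follows; the function name `durbin_watson` is chosen for this example, and the residuals are taken from the residual table in their listed order.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

residuals = [0.0, 0.3, -0.4, -0.1, 0.2]
dw = durbin_watson(residuals)
print(round(dw, 4))  # 0.76 / 0.30 → 2.5333
```

The result depends on the order of the residuals, so it is only meaningful when the observations have a natural sequence such as time.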
Why do residuals sum to zero?
Least squares minimizes Σeᵢ². When the model includes an intercept, the normal equations force Σeᵢ = 0 and Σeᵢxᵢ = 0.
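The two conditions come from setting the partial derivatives of the sum of squares to zero. Writing S for the quantity being minimized (a symbol introduced here for the derivation), with eᵢ evaluated at the minimizing b₀ and b₁:

```latex
\begin{aligned}
S(b_0, b_1) &= \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \\
\frac{\partial S}{\partial b_0} &= -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = -2 \sum_{i=1}^{n} e_i = 0 \\
\frac{\partial S}{\partial b_1} &= -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) = -2 \sum_{i=1}^{n} x_i e_i = 0
\end{aligned}
```

The first condition exists only because b₀ is a free parameter, which is why residuals from a regression forced through the origin need not sum to zero.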
How do I fix heteroscedasticity?
Consider transformations (log y), weighted least squares, or robust standard errors.
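As one illustration of the weighted least squares option, the sketch below fits a line where each point carries a weight wᵢ. The function name `weighted_fit` and the weights wᵢ = 1/xᵢ² are assumptions made purely for the example (they would suit errors whose standard deviation grows in proportion to x); in practice the weights should reflect how the error variance actually changes.

```python
def weighted_fit(x, y, w):
    """Weighted least-squares intercept and slope for y = b0 + b1*x."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 7, 9]

# Equal weights reproduce the ordinary least-squares fit
ols_b0, ols_b1 = weighted_fit(x, y, [1.0] * len(x))

# Hypothetical weights: down-weight points with larger x
wls_b0, wls_b1 = weighted_fit(x, y, [1 / xi ** 2 for xi in x])
print(round(ols_b0, 4), round(ols_b1, 4))
print(round(wls_b0, 4), round(wls_b1, 4))
```

Comparing the two fits shows how much the coefficients depend on the noisier observations.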
What is the difference between SSE, MSE, and RMSE?
SSE = sum of squared residuals. MSE = SSE/(n−2) divides SSE by the residual degrees of freedom and estimates the error variance. RMSE = √MSE has the same units as y.
Leverage vs Influence
Leverage measures how far a point is from the center of x. High leverage points can have large impact on the slope. Influence (Cook's D) combines leverage and residual size — a point is influential if removing it changes the fit substantially.
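A direct way to see the distinction is to delete each point and measure how much the slope moves. The sketch below does this for the sample data from the residual table; `slope` and `slope_change_when_removed` are names invented for this illustration.

```python
def slope(x, y):
    """Least-squares slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

def slope_change_when_removed(x, y):
    """Slope without point i minus the full-data slope, for each i."""
    full = slope(x, y)
    return [
        slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) - full
        for i in range(len(x))
    ]

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 7, 9]
changes = slope_change_when_removed(x, y)
print([round(c, 4) for c in changes])
```

Points 1 and 5 share the highest leverage (0.6), yet removing point 1 leaves the slope unchanged because its residual is zero, while removing point 5 lowers the slope from 1.7 to 1.6. High leverage is a capacity for influence, not influence itself.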
Chart Interpretation
Residuals vs fitted: Ideal: random scatter around 0. Curved pattern = nonlinearity. Funnel = heteroscedasticity.
Q-Q plot: Theoretical vs sample quantiles. Points along diagonal = normal residuals. Tails deviating = heavy tails or outliers.
Histogram: Bell-shaped distribution supports normality. Skewed or bimodal suggests violations.
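For readers who want to reproduce the Q-Q plot coordinates, the following sketch pairs sorted residuals with theoretical standard normal quantiles. It assumes plotting positions (i − 0.5)/n, which is one common convention and may differ from what this calculator uses; `qq_points` is a hypothetical helper name.

```python
from statistics import NormalDist

def qq_points(residuals):
    """Pair each sorted residual with a standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting positions (i - 0.5)/n for i = 1..n (an assumed convention)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

points = qq_points([0.0, 0.3, -0.4, -0.1, 0.2])
for q, e in points:
    print(round(q, 4), e)
```

Plotting the second value against the first gives the Q-Q plot. If the residuals are roughly normal the points fall near a straight line whose slope approximates the residual standard deviation.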
Applications
- Regression Diagnostics: Validate linear regression assumptions.
- Outlier Detection: Identify unusual observations.
- Model Improvement: Guide transformations and refinements.
- Quality Assurance: Check calibration and measurement errors.
Limitations
- Residual diagnostics assume the linear model is approximately correct. Severe nonlinearity may obscure patterns.
- With small n, Cook's D and leverage thresholds are approximate.
- Durbin-Watson assumes ordered data (e.g., time series). For cross-sectional data, order may be arbitrary.
- Removing influential points changes the model; document and justify any exclusions.
Worked Example
For data (1,2), (2,4), (3,5), (4,7), (5,9): x̄ = 3 and ȳ = 5.4, so b₁ = 17/10 = 1.7 and b₀ = 5.4 − 1.7×3 = 0.3, giving ŷ = 0.3 + 1.7x. Residuals: e₁ = 0, e₂ = 0.3, e₃ = −0.4, e₄ = −0.1, e₅ = 0.2. SSE = 0.30, MSE = 0.30/3 = 0.10, RMSE = √0.10 ≈ 0.316. Leverage h₁₁ = 1/5 + (1−3)²/10 = 0.6 and h₂₂ = 1/5 + (2−3)²/10 = 0.3. Standardized r₂ = 0.3/(0.316×√0.7) ≈ 1.13.
Disclaimer: This calculator is for educational purposes. For research, verify assumptions and use established statistical software.