Residual Calculator

Free residual calculator for linear regression: residuals, SSE, MSE, RMSE, and standardized and studentized residuals.

Residual Calculator — Regression Diagnostics

Individual residuals, SSE, MSE, RMSE, standardized, studentized, Cook's distance, leverage, Durbin-Watson.

SSE: 0.3000
MSE: 0.1000
RMSE: 0.3162
Durbin-Watson: 2.5333
Equation: ŷ = 0.30 + 1.70x
Residuals: [0.0000, 0.3000, -0.4000, -0.1000, 0.2000]

Residual Table

i	x	y	ŷ	Residual	Standardized	Studentized	Leverage	Cook's D
1	1.0000	2.0000	2.0000	0.0000	0.0000	0.0000	0.6000	0.0000
2	2.0000	4.0000	3.7000	0.3000	1.1339	1.2247	0.3000	0.2755
3	3.0000	5.0000	5.4000	-0.4000	-1.4142	-2.0000	0.2000	0.2500
4	4.0000	7.0000	7.1000	-0.1000	-0.3780	-0.3162	0.3000	0.0306
5	5.0000	9.0000	8.8000	0.2000	1.0000	1.0000	0.6000	0.7500

Charts: Residuals vs Fitted, Q-Q Plot of Residuals, Residual Histogram

Calculation Breakdown

REGRESSION FIT

Mean of x

3.0000

x̄ = Σx/n = (1+2+3+4+5)/5 = 15/5

Mean of y

5.4000

ȳ = Σy/n

Slope b₁

1.7000

b₁ = Σ(x-x̄)(y-ȳ) / Σ(x-x̄)²

Intercept b₀

0.3000

b₀ = ȳ − b₁x̄

Fitted equation

ŷ = 0.3000 + 1.7000x
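
The fit above can be reproduced with a few lines of plain Python. This is a minimal sketch of the slope and intercept formulas, not the calculator's own code; the function name `fit_line` is a hypothetical helper introduced only for illustration.

```python
# Simple linear regression by least squares, using the formulas
# b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b0 = ȳ - b1·x̄.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]


def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(xs, ys)
print(f"y_hat = {b0:.4f} + {b1:.4f}x")  # y_hat = 0.3000 + 1.7000x
```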

RESIDUAL METRICS

SSE (Sum of Squared Errors)

0.3000

SSE = Σ(yᵢ − ŷᵢ)²

MSE (Mean Squared Error)

0.1000

MSE = SSE/(n−2) = 0.3000/3

RMSE (Root MSE)

0.3162

RMSE = √MSE
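
As a rough check of the three metrics above, the sketch below recomputes them from the sample data. It assumes the fitted coefficients 0.3 and 1.7 reported earlier and hard-codes them; the variable names are illustrative only.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]
b0, b1 = 0.3, 1.7  # fitted coefficients reported above (assumed as given)

# Residual e_i = y_i - y_hat_i for each observation.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

sse = sum(e * e for e in residuals)   # sum of squared errors
mse = sse / (len(xs) - 2)             # divide by n - 2 residual degrees of freedom
rmse = math.sqrt(mse)

print(f"SSE={sse:.4f} MSE={mse:.4f} RMSE={rmse:.4f}")
# SSE=0.3000 MSE=0.1000 RMSE=0.3162
```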

AUTOCORRELATION

Durbin-Watson

2.5333

DW = Σ(eᵢ−eᵢ₋₁)² / Σeᵢ²
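
The Durbin-Watson formula can be sketched directly from the residual list. The function name `durbin_watson` is a hypothetical helper, and the residuals are copied from the results above in their original row order, which is what the statistic depends on.

```python
residuals = [0.0, 0.3, -0.4, -0.1, 0.2]  # in observation order


def durbin_watson(e):
    """DW = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    denominator = sum(v * v for v in e)
    return numerator / denominator


dw = durbin_watson(residuals)
print(f"DW = {dw:.4f}")  # DW = 2.5333
```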

Cook's D threshold (4/n)

0.8000

Influential if Dᵢ > 4/n

Leverage threshold (2p/n)

0.8000 (p = 2 coefficients, n = 5)

High leverage if hᵢᵢ > 2p/n


Key Takeaways

Residual: eᵢ = yᵢ − ŷᵢ. Difference between observed and fitted.
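Properties: Σeᵢ = 0 and Σeᵢxᵢ = 0 for a least-squares fit that includes an intercept.
Standardized residual: rᵢ = eᵢ / (s × √(1 − hᵢᵢ)). Same scale for comparison.
Studentized (externally): Rescales eᵢ by a standard error estimated with observation i left out. Better for outlier detection.
Cook's distance: Dᵢ = rᵢ² × hᵢᵢ / (p(1−hᵢᵢ)), where p is the number of estimated coefficients (p = 2 for simple linear regression). Influential if Dᵢ > 4/n.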
Leverage: hᵢᵢ = 1/n + (xᵢ−x̄)²/Σ(xⱼ−x̄)². High if hᵢᵢ > 2p/n.
Durbin-Watson: DW ≈ 2 means no autocorrelation. DW < 1.5 or > 2.5 suggests issues.
SSE, MSE, RMSE: SSE = Σeᵢ², MSE = SSE/(n−2), RMSE = √MSE.

Did You Know?

📊 Residuals sum to zero by construction when the model includes an intercept, and the regression line passes through (x̄, ȳ). Source: Least squares normal equations

📈 A curved pattern in residuals vs fitted suggests the relationship is nonlinear. Source: Penn State STAT 501

💰 Fan shape (heteroscedasticity) means variance changes with x. Consider weighted least squares. Source: NIST Handbook

🌡️ Standardized residuals above 2 or below −2 may indicate outliers. Source: Cook & Weisberg

🧪 A Durbin-Watson value near 0 suggests positive autocorrelation; a value near 4 suggests negative autocorrelation. Source: Durbin & Watson, 1950

📏 High leverage points are far from x̄; they can pull the line toward them. Source: Belsley, Kuh & Welsch

Formulas Reference

ŷ = b₀ + b₁x, eᵢ = yᵢ − ŷᵢ

SSE = Σeᵢ², MSE = SSE/(n−2), RMSE = √MSE

Standardized: rᵢ = eᵢ / (s × √(1 − hᵢᵢ)), s = √(SSE/(n−2))

Leverage: hᵢᵢ = 1/n + (xᵢ−x̄)² / Σ(xⱼ−x̄)²

Cook's D: Dᵢ = rᵢ² × hᵢᵢ / (p(1−hᵢᵢ))

Durbin-Watson: DW = Σ(eᵢ − eᵢ₋₁)² / Σeᵢ²
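
The leverage, standardized residual, and Cook's D formulas above can be combined in one short sketch. It assumes simple linear regression with p = 2 estimated coefficients (intercept and slope) and hard-codes the fitted values 0.3 and 1.7 for the sample data; it is an illustration of the formulas, not the calculator's implementation.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]
b0, b1 = 0.3, 1.7    # fitted coefficients for this data (assumed as given)
n, p = len(xs), 2    # p counts the intercept and the slope

x_bar = sum(xs) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
s = math.sqrt(sum(e * e for e in residuals) / (n - p))  # residual standard error

# h_ii = 1/n + (x_i - x̄)² / Σ(x_j - x̄)²
leverage = [1 / n + (x - x_bar) ** 2 / sxx for x in xs]
# r_i = e_i / (s · sqrt(1 - h_ii))
standardized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]
# D_i = r_i² · h_ii / (p · (1 - h_ii))
cooks_d = [r * r * h / (p * (1 - h)) for r, h in zip(standardized, leverage)]

for i, (h, r, d) in enumerate(zip(leverage, standardized, cooks_d), start=1):
    print(f"{i}: leverage={h:.4f} standardized={r:.4f} cooks_d={d:.4f}")
```

The first and last points share the highest leverage (0.6), yet only the last has a large Cook's D, because the first point's residual is essentially zero.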

Residual Diagnostics Checklist

• Residuals vs fitted: Random scatter around 0. Curves suggest nonlinearity.
• Q-Q plot: Points along diagonal suggest normal residuals.
• Histogram: Bell-shaped residuals support normality assumption.
• Cook's distance: Values > 4/n indicate influential points.
• Durbin-Watson: 1.5 < DW < 2.5 suggests no autocorrelation.

Frequently Asked Questions

What is the difference between standardized and studentized residuals?

Standardized uses the full-sample SE. Studentized (externally) uses leave-one-out SE for each point, making outlier detection more accurate.
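
To make the leave-one-out idea concrete, the sketch below refits the line with each point removed and uses that fit's residual standard error in place of s. It assumes the common textbook definition tᵢ = eᵢ / (s₍ᵢ₎ × √(1 − hᵢᵢ)), with hᵢᵢ taken from the full fit; tools differ in conventions, so the numbers may not match a given calculator's studentized column. The helper names are hypothetical.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]


def fit_line(xs, ys):
    """Least-squares intercept and slope."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b1 * x_bar, b1


def sse_of_fit(xs, ys):
    """Sum of squared residuals of the least-squares line through the points."""
    b0, b1 = fit_line(xs, ys)
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


n, p = len(xs), 2
b0, b1 = fit_line(xs, ys)
x_bar = sum(xs) / n
sxx = sum((x - x_bar) ** 2 for x in xs)

studentized = []
for i in range(n):
    h = 1 / n + (xs[i] - x_bar) ** 2 / sxx          # leverage from the full fit
    e = ys[i] - (b0 + b1 * xs[i])                   # residual from the full fit
    xs_loo, ys_loo = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
    # Residual standard error with observation i left out: n - 1 points, p coefficients.
    s_loo = math.sqrt(sse_of_fit(xs_loo, ys_loo) / (n - 1 - p))
    studentized.append(e / (s_loo * math.sqrt(1 - h)))

print([round(t, 4) for t in studentized])
# approximately 0, 1.2247, -2.0, -0.3162, 1.0
```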

When is Cook's distance considered high?

Common rules of thumb flag a point as influential when Dᵢ > 4/n or Dᵢ > 4/(n−p−1). Investigate flagged points before deciding whether to remove them.

What does Durbin-Watson tell us?

DW tests for autocorrelation. DW ≈ 2 means no autocorrelation. DW < 1.5 suggests positive autocorrelation; DW > 2.5 suggests negative.

Why do residuals sum to zero?

Least squares minimizes Σeᵢ². The normal equations force Σeᵢ = 0 and Σeᵢxᵢ = 0.
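
Both identities are easy to confirm numerically. The following is a small check on the sample data, fitting the line from scratch and testing the two sums against a floating-point tolerance (the tolerance 1e-9 is an arbitrary choice).

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
sum_e = sum(residuals)                                  # Σe_i
sum_ex = sum(e * x for e, x in zip(residuals, xs))      # Σe_i·x_i

# Both sums are zero up to floating-point rounding.
print(abs(sum_e) < 1e-9, abs(sum_ex) < 1e-9)  # True True
```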

How do I fix heteroscedasticity?

Consider transformations (log y), weighted least squares, or robust standard errors.

What is the difference between SSE, MSE, and RMSE?

SSE is the sum of squared residuals. MSE = SSE/(n−2) divides by the residual degrees of freedom rather than n, so it estimates the error variance. RMSE = √MSE has the same units as y.

Leverage vs Influence

Leverage measures how far a point is from the center of x. High leverage points can have large impact on the slope. Influence (Cook's D) combines leverage and residual size — a point is influential if removing it changes the fit substantially.
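
One way to see the distinction is to drop each point in turn and refit. The sketch below does this for the sample data (1,2), (2,4), (3,5), (4,7), (5,9); `fit_line` and `refits` are hypothetical names used only for illustration.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 9]


def fit_line(xs, ys):
    """Least-squares intercept and slope."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b1 * x_bar, b1


full_b0, full_b1 = fit_line(xs, ys)
print(f"all points: y_hat = {full_b0:.2f} + {full_b1:.2f}x")

# Refit with each observation removed and record the new coefficients.
refits = []
for i in range(len(xs)):
    b0_i, b1_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    refits.append((b0_i, b1_i))
    print(f"without point {i + 1}: y_hat = {b0_i:.2f} + {b1_i:.2f}x")
```

Points 1 and 5 are equally far from the mean of x, so they have the same leverage. Removing point 1, which lies on the fitted line, leaves the fit unchanged, while removing point 5 moves the line to ŷ = 0.50 + 1.60x.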

Chart Interpretation

Residuals vs fitted: Ideal: random scatter around 0. Curved pattern = nonlinearity. Funnel = heteroscedasticity.

Q-Q plot: Theoretical vs sample quantiles. Points along diagonal = normal residuals. Tails deviating = heavy tails or outliers.

Histogram: Bell-shaped distribution supports normality. Skewed or bimodal suggests violations.

Applications

Regression Diagnostics

Validate linear regression assumptions

Outlier Detection

Identify unusual observations

Model Improvement

Guide transformations and refinements

Quality Assurance

Check calibration and measurement errors

Limitations

• Residual diagnostics assume the linear model is approximately correct. Severe nonlinearity may obscure patterns.
• With small n, Cook's D and leverage thresholds are approximate.
• Durbin-Watson assumes ordered data (e.g., time series). For cross-sectional data, order may be arbitrary.
• Removing influential points changes the model; document and justify any exclusions.

Worked Example

For data (1,2), (2,4), (3,5), (4,7), (5,9): ŷ = 0.3 + 1.7x. Residuals: e₁ = 0, e₂ = 0.3, e₃ = −0.4, e₄ = −0.1, e₅ = 0.2. SSE = 0.30, MSE = 0.30/3 = 0.10, RMSE = √0.10 ≈ 0.316. Leverage h₁₁ = 1/5 + (1−3)²/10 = 0.6. Because e₁ = 0, the standardized residual r₁ is also 0. For the second point, h₂₂ = 1/5 + (2−3)²/10 = 0.3 and r₂ = 0.3/(0.316×√0.7) ≈ 1.13.