
Least Squares Regression

The line of best fit minimizes the sum of squared vertical distances (residuals). Slope a and intercept b give ŷ = ax + b. R² = r² is the proportion of variance explained.

Concept Fundamentals
Slope: a = (nΣxy − ΣxΣy)/(nΣx² − (Σx)²)
Intercept: b = ȳ − a·x̄
Variance explained: R² = r²
Residual: e = y − ŷ

The regression line passes through (x̄, ȳ), the centroid of the data. R² = 0.75 means that 75% of the variation in y is explained by the linear relationship with x. Correlation does not imply causation.

Why: Regression predicts y from x—sales from ads, weight from height, grades from study time.

How: Enter x,y pairs; the calculator finds slope and intercept that minimize Σ(y − ŷ)².

Least Squares Regression — Line of Best Fit

Enter X and Y values. Get slope, intercept, R², Pearson r, predicted values, residuals, and charts.

Results
Equation: y = 2.0248x - 0.1667
Pearson r: 0.9993
R²: 0.9986
n: 10
Slope ± SE: 2.0248 ± 0.0267
Intercept ± SE: -0.1667 ± 0.1656
Std Error (est): 0.2424

Data Points & Predictions

X          Y (Actual)   ŷ (Predicted)   Residual
1.0000     2.0000       1.8582           0.1418
2.0000     3.9000       3.8830           0.0170
3.0000     6.2000       5.9079           0.2921
4.0000     7.8000       7.9327          -0.1327
5.0000     9.5000       9.9576          -0.4576
6.0000     11.9000      11.9824         -0.0824
7.0000     14.1000      14.0073          0.0927
8.0000     15.8000      16.0321         -0.2321
9.0000     18.2000      18.0570          0.1430
10.0000    20.3000      20.0818          0.2182

[Figure: Scatter Plot & Regression Line]

[Figure: Residuals (Observed − Predicted)]

📐 Calculation Breakdown

SUMS
n (data points): 10
Σx: 55.0000
Σy: 109.7000
Σxy: 770.4000
Σx²: 385.0000
Σy²: 1542.1300

REGRESSION
Δ = n·Σx² − (Σx)² = 825.0000
Slope (a): 2.0248
Intercept (b): -0.1667

CORRELATION
Equation: y = 2.0248x - 0.1667
Pearson r: 0.9993
R²: 0.9986
Correlation strength: very strong

ERRORS
Std Error (slope): 0.0267
Std Error (intercept): 0.1656
Std Error of estimate: 0.2424
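
The breakdown does not spell out how the error figures are obtained. A minimal sketch, assuming the standard textbook formulas for simple linear regression (residual standard error with n − 2 degrees of freedom), should reproduce the values listed above to the displayed precision. The function name `standard_errors` is hypothetical and not taken from the calculator.

```python
import math

def standard_errors(xs, ys):
    """Return (SE of slope, SE of intercept, std error of estimate).

    Assumes the usual simple-linear-regression formulas; the calculator's
    own implementation is not shown on the page.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired points (n - 2 degrees of freedom)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)   # centered sum of squares of x
    if sxx == 0:
        raise ValueError("X must vary")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx                              # slope
    b = y_bar - a * x_bar                      # intercept
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))               # std error of estimate
    se_slope = s / math.sqrt(sxx)
    se_intercept = s * math.sqrt(sum(x * x for x in xs) / (n * sxx))
    return se_slope, se_intercept, s

# Example data from the table on this page
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.0, 3.9, 6.2, 7.8, 9.5, 11.9, 14.1, 15.8, 18.2, 20.3]
se_a, se_b, s = standard_errors(xs, ys)
print(f"SE(slope) = {se_a:.4f}, SE(intercept) = {se_b:.4f}, s = {s:.4f}")
```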

For educational and informational purposes only. Verify with a qualified professional.

📋 Key Takeaways

  • Least squares minimizes the sum of squared vertical distances from points to the line
  • Slope (a) = change in y per unit change in x; intercept (b) = y when x = 0
  • Pearson r (−1 to +1) measures the strength and direction of linear correlation; R² = r² is the proportion of variance explained
  • Residuals = observed − predicted; a good fit has residuals scattered randomly around zero

💡 Did You Know?

📊 The least squares line passes through the point (x̄, ȳ), the centroid of your data. Source: NIST
📈 R² = 0.75 means 75% of the variation in y is explained by the linear relationship with x. Source: Regression
🎯 Correlation does not imply causation. Two variables can be correlated due to a third factor. Source: Statistics
📐 The regression line minimizes Σ(y − ŷ)². No other line has a smaller sum of squared residuals. Source: Least Squares
📉 Outliers can strongly affect the regression line. Always check residual plots for patterns. Source: EDA
🔢 For simple linear regression, Pearson r = sign(slope) × √R². They are mathematically related. Source: Correlation

📖 How It Works

Enter X and Y values (comma or space separated). The calculator computes n, Σx, Σy, Σxy, Σx², Σy², then slope a = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and intercept b = (Σy − a·Σx) / n.

Slope Formula

a = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²). The denominator Δ must be non-zero (X must vary).

Intercept Formula

b = (Σy − a·Σx) / n = ȳ − a·x̄. The line always passes through (x̄, ȳ).
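
The centroid property follows directly from the intercept formula. Substituting x = x̄ into the fitted line ŷ = ax + b and using b = ȳ − a·x̄ gives:

```latex
\hat{y}(\bar{x}) = a\bar{x} + b = a\bar{x} + (\bar{y} - a\bar{x}) = \bar{y}
```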

Pearson r and R²

r = (nΣxy − ΣxΣy) / √[(nΣx²−(Σx)²)(nΣy²−(Σy)²)]. R² = r². R² is the proportion of variance in y explained by x.
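
The formulas above can be sketched in a few lines of Python. This is an illustrative sketch rather than the calculator's actual code; the function name `fit_line` and its return layout are assumptions made for the example. Run on the example data from this page, it should agree with the reported slope, intercept, r, and R² to four decimal places.

```python
import math

def fit_line(xs, ys):
    """Return (slope a, intercept b, Pearson r, R²) computed from raw sums."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal count")
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)

    delta = n * sxx - sx * sx              # Δ = nΣx² − (Σx)²
    if delta == 0:
        raise ValueError("X must vary (Δ = 0)")

    a = (n * sxy - sx * sy) / delta        # slope
    b = (sy - a * sx) / n                  # intercept, equal to ȳ − a·x̄

    denom_y = n * syy - sy * sy            # nΣy² − (Σy)²
    # r is undefined when Y does not vary; report NaN in that case
    r = (n * sxy - sx * sy) / math.sqrt(delta * denom_y) if denom_y > 0 else float("nan")
    return a, b, r, r * r

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.0, 3.9, 6.2, 7.8, 9.5, 11.9, 14.1, 15.8, 18.2, 20.3]
a, b, r, r2 = fit_line(xs, ys)
print(f"slope = {a:.4f}, intercept = {b:.4f}, r = {r:.4f}, R² = {r2:.4f}")
```

Working from raw sums mirrors the page's formulas; for data with large magnitudes, a centered formulation (subtracting x̄ and ȳ first) is usually less sensitive to rounding.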

🎯 Expert Tips

Check Linearity

Plot your data first. If the relationship is curved, linear regression may be inappropriate. Consider transformations.

Residual Plot

Residuals should be randomly scattered. Patterns (funnel, curve) suggest heteroscedasticity or non-linearity.
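
A minimal sketch of computing residuals for inspection, using the example data from this page. The helper name `residuals` is hypothetical. The sum-to-zero check relies on a standard property of least squares fits that include an intercept; it confirms the fit was computed correctly but says nothing about patterns, which still need a plot.

```python
def residuals(xs, ys):
    """Fit y = a*x + b by least squares and return observed minus predicted."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal count")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary")
    a = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b = y_bar - a * x_bar
    return [y - (a * x + b) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.0, 3.9, 6.2, 7.8, 9.5, 11.9, 14.1, 15.8, 18.2, 20.3]
res = residuals(xs, ys)

for x, e in zip(xs, res):
    print(f"x = {x:>2}  residual = {e:+.4f}")

# With an intercept in the model, residuals sum to zero up to rounding error
print("sum of residuals is ~0:", abs(sum(res)) < 1e-9)
```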

Sample Size

At least 10–30 points are recommended. Small samples can produce misleading R² values and unstable coefficients.

Outliers

Extreme points can pull the line. Verify they are not data errors. Consider robust regression for outlier-heavy data.

📊 Comparison Table

|r| Range    Strength      R² Interpretation
0.9 – 1.0    Very strong   81–100% variance explained
0.7 – 0.9    Strong        49–81% variance explained
0.5 – 0.7    Moderate      25–49% variance explained
0.3 – 0.5    Weak          9–25% variance explained
0.0 – 0.3    Very weak     <9% variance explained
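
The table can be expressed as a small lookup. This is a sketch only: the function name `correlation_strength` is hypothetical, and because the table's ranges share endpoints, treating each lower bound as inclusive is an assumption rather than something the page specifies.

```python
def correlation_strength(r):
    """Map Pearson r to the strength label from the comparison table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)  # strength depends on |r|, not on the sign
    # Assumption: a value on a shared boundary takes the stronger label
    if magnitude >= 0.9:
        return "very strong"
    if magnitude >= 0.7:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.3:
        return "weak"
    return "very weak"

print(correlation_strength(0.9993))   # prints "very strong"
print(correlation_strength(-0.62))    # prints "moderate"
```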

❓ FAQ

What is the difference between correlation and causation?

Correlation measures statistical association. Causation means one variable directly causes another. A strong correlation does not imply causation — a third variable may explain both.

When should I use linear regression?

When the relationship between X and Y appears approximately linear, and you want to predict Y from X or quantify the strength of the relationship.

What are the assumptions of linear regression?

Linearity, independence of errors, homoscedasticity (constant error variance), and normality of errors. Check residuals to validate.

How do I interpret the slope?

Slope = change in Y per 1-unit increase in X. A slope of 2.5 means Y increases by 2.5 when X increases by 1.

What does R² mean?

R² (coefficient of determination) is the proportion of variance in Y explained by X. R² = 0.80 means 80% of the variation in Y is explained by the linear relationship.

How do I enter my data?

Enter X values in one box and Y values in another, in the same order. Use commas, spaces, or newlines. Both must have the same count.
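
One way to parse input of this shape, as a sketch. `parse_values` and `parse_pairs` are hypothetical names, and the calculator's real parser may treat edge cases (empty fields, non-numeric tokens) differently.

```python
import re

def parse_values(text):
    """Split on commas, spaces, or newlines and convert each token to float."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]   # float() raises ValueError on bad tokens

def parse_pairs(x_text, y_text):
    """Parse both boxes and check that the counts match."""
    xs = parse_values(x_text)
    ys = parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    return xs, ys

xs, ys = parse_pairs("1, 2, 3\n4 5", "2.0 3.9 6.2, 7.8, 9.5")
print(xs)
print(ys)
```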

📊 Infographic Stats

r = ±1: Perfect linear fit
R²: Variance explained
Σe²: Sum of squared residuals (minimized)
(x̄, ȳ): Line passes through centroid

⚠️ Disclaimer: This calculator is for educational purposes. Verify critical analyses with professional statistical software when making decisions.
