Regression Calculator

Perform least-squares regression on your data. Toggle between Linear Regression (y = a + bx) and Polynomial Regression (degree 2–4). Add up to 50 (x,y) points, inspect fitted coefficients, R², residuals, and visualize the fit. You can copy results, download CSV, or print the page.

[Interactive calculator: enter up to 50 (x, y) points; a results table lists x, y, ŷ (predicted) and the residual for each point; the plot shows scatter points (blue) and the fitted line/curve (orange), with residuals drawn as gray vertical bars when enabled.]

Regression and Least Squares — Understanding Fit, Error and Interpretation

Regression analysis estimates relationships between variables. The simplest and most common form is linear regression, which finds the best-fit straight line y = a + bx by minimizing the sum of squared residuals between observed values y_i and predicted values ŷ_i. Polynomial regression generalizes this idea to fit a polynomial relationship. Least squares is simple, intuitive and computationally efficient (though sensitive to outliers), forming the core of many statistical and machine-learning pipelines.

Least squares in brief

Given n observations (x_i, y_i), the least-squares estimate chooses coefficients that minimize S = Σ (y_i − ŷ_i)². For linear regression (degree 1), closed-form solutions exist using simple summations: b = Cov(x,y)/Var(x) and a = ȳ − b x̄. For polynomial regression we solve the normal equations (XᵀX)β = Xᵀy for the coefficient vector β, where X is the Vandermonde (design) matrix.
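As a sketch (not this calculator's actual source), the degree-1 closed form above translates directly into code; `linearFit` is a hypothetical helper name:

```javascript
// Closed-form linear least squares: b = Cov(x,y)/Var(x), a = ȳ − b·x̄.
function linearFit(xs, ys) {
  const n = xs.length;
  const xBar = xs.reduce((s, v) => s + v, 0) / n;
  const yBar = ys.reduce((s, v) => s + v, 0) / n;
  let sxy = 0, sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - xBar) * (ys[i] - yBar); // n · Cov(x, y)
    sxx += (xs[i] - xBar) ** 2;             // n · Var(x)
  }
  const b = sxy / sxx;       // slope
  const a = yBar - b * xBar; // intercept
  return { a, b };
}
```

For instance, `linearFit([1, 2, 3], [2, 3, 5])` returns slope b = 1.5 and intercept a = 1/3.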

Key statistics

  • Slope / coefficients: parameters that define the fitted relationship.
  • Residuals: e_i = y_i − ŷ_i, indicating errors of the fit per point.
  • Sum of squared residuals (SSR): Σ e_i² — objective minimized by least squares.
  • Correlation coefficient (r): measures linear association between x and y (−1 to 1). For linear regression, b = r (s_y/s_x).
  • Coefficient of determination (R²): proportion of y-variance explained by the model: R² = 1 − SSR / SST where SST = Σ (y_i − ȳ)².
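The statistics listed above take only a few lines to compute; `fitStats` is a hypothetical helper (not this tool's code) that accepts any prediction function:

```javascript
// Residuals, SSR, SST and R² for an arbitrary fitted model `predict`.
function fitStats(xs, ys, predict) {
  const n = ys.length;
  const yBar = ys.reduce((s, v) => s + v, 0) / n;
  const residuals = ys.map((y, i) => y - predict(xs[i]));  // e_i = y_i − ŷ_i
  const ssr = residuals.reduce((s, e) => s + e * e, 0);    // Σ e_i²
  const sst = ys.reduce((s, y) => s + (y - yBar) ** 2, 0); // Σ (y_i − ȳ)²
  const r2 = 1 - ssr / sst;                                // R² = 1 − SSR/SST
  return { residuals, ssr, sst, r2 };
}
```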

Computational details

Polynomial regression solves the normal equations, which can be ill-conditioned for high degrees or poorly scaled x. This tool supports up to degree 4 and up to 50 points; for larger or noisier datasets, consider numerical libraries with regularization (ridge regression, orthogonal polynomials) or piecewise models (splines).
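A minimal sketch of the normal-equation route described above, assuming a small dense solve is acceptable at degree ≤ 4 (`polyFit` and `solve` are hypothetical names; production code should watch conditioning, as noted):

```javascript
// Fit β in (XᵀX)β = Xᵀy, where X[i][j] = x_i^j (Vandermonde matrix).
function polyFit(xs, ys, degree) {
  const m = degree + 1;
  const A = Array.from({ length: m }, () => new Array(m).fill(0)); // XᵀX
  const c = new Array(m).fill(0);                                  // Xᵀy
  for (let i = 0; i < xs.length; i++) {
    const powers = [];
    for (let j = 0; j < 2 * m - 1; j++) powers.push(xs[i] ** j);
    for (let j = 0; j < m; j++) {
      c[j] += powers[j] * ys[i];
      for (let k = 0; k < m; k++) A[j][k] += powers[j + k];
    }
  }
  return solve(A, c); // coefficients, lowest degree first
}

// Gaussian elimination with partial pivoting, then back substitution.
function solve(A, c) {
  const n = c.length;
  for (let col = 0; col < n; col++) {
    let p = col;
    for (let r = col + 1; r < n; r++)
      if (Math.abs(A[r][col]) > Math.abs(A[p][col])) p = r;
    [A[col], A[p]] = [A[p], A[col]];
    [c[col], c[p]] = [c[p], c[col]];
    for (let r = col + 1; r < n; r++) {
      const f = A[r][col] / A[col][col];
      for (let k = col; k < n; k++) A[r][k] -= f * A[col][k];
      c[r] -= f * c[col];
    }
  }
  const beta = new Array(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let s = c[r];
    for (let k = r + 1; k < n; k++) s -= A[r][k] * beta[k];
    beta[r] = s / A[r][r];
  }
  return beta;
}
```

On points sampled exactly from a quadratic, this recovers the generating coefficients up to floating-point error.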

Interpreting results

High R² indicates the model explains much of the variance, but watch for overfitting with high-degree polynomials. Residual plots help detect non-random patterns: residuals should be roughly randomly scattered — patterns suggest model misspecification.

Examples

Example 1 (linear): Points (1,2),(2,3),(3,5) produce slope b = 1.5 and intercept a = 1/3 ≈ 0.33, so the fitted line is ŷ ≈ 0.33 + 1.5x. Here R² = 27/28 ≈ 0.96, indicating the points lie tightly around the line.

Example 2 (polynomial): A quadratic fit to ballistic motion data yields constant, linear and quadratic coefficients corresponding to initial position, initial velocity and half the acceleration.

Best practices

  1. Plot residuals to check fit quality.
  2. Center and scale x when fitting higher-degree polynomials to reduce numerical issues.
  3. Prefer lower-degree models unless there is strong theoretical justification.
  4. For predictive tasks validate models on held-out data.
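Best practice 2 above can be sketched as follows; `standardize` is a hypothetical helper that maps x to z = (x − x̄)/s, so a higher-degree polynomial is fit in z and evaluated through the same transform:

```javascript
// Center and scale x to zero mean and unit standard deviation before
// fitting, which keeps the Vandermonde columns comparably sized.
function standardize(xs) {
  const n = xs.length;
  const mean = xs.reduce((s, v) => s + v, 0) / n;
  const sd = Math.sqrt(xs.reduce((s, v) => s + (v - mean) ** 2, 0) / n);
  return { z: xs.map(v => (v - mean) / sd), mean, sd };
}
```

To predict at a new x, reuse the stored mean and sd: evaluate the fitted polynomial at (x − mean) / sd.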

This calculator gives you the core tools to run exploratory regression quickly. Use the plot, residuals and summary statistics together to judge model appropriateness. For production analyses, complement this with cross-validation and diagnostic checks.

Frequently Asked Questions

1. How many points can I use?
Up to 50 points. Add/remove points with the controls above.
2. What does 'Show residuals' do?
It draws vertical bars between observed y values and predicted ŷ values to visualize pointwise errors.
3. Can I use polynomial regression?
Yes — choose Polynomial mode and select degree 2–4. Results include expanded coefficients.
4. Are inputs precise?
Inputs accept decimals and simple fractions like 3/4; results are computed with JavaScript numbers and rounded for display.
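A minimal sketch of the kind of input parsing this answer describes (`parseInput` is a hypothetical name, not this tool's actual code):

```javascript
// Accept a decimal ("2.5") or a simple fraction ("3/4", "-1/2").
// Real code should also reject a zero denominator and NaN results.
function parseInput(s) {
  const m = s.trim().match(/^(-?\d+(?:\.\d+)?)\s*\/\s*(-?\d+(?:\.\d+)?)$/);
  if (m) return parseFloat(m[1]) / parseFloat(m[2]);
  return parseFloat(s);
}
```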
5. What statistics are shown?
Slope/intercept (or polynomial coefficients), r, R², SSR, and standard error (for linear).
6. How can I export results?
Use 'Download CSV' to save x, y, ŷ and residuals. 'Copy Result' copies a summary to the clipboard.
7. What if x values are identical?
For linear regression there must be variance in x; identical x values cause a degenerate design matrix for polynomial fits. The tool will alert you.
8. Can I get exact rational coefficients?
Coefficients are numeric decimals; if you require rational exactness use symbolic tools with rational arithmetic.
9. Does R² always indicate a good model?
A high R² doesn't guarantee appropriateness—check residual patterns and consider overfitting.
10. Can you add robust regression or regularization?
Yes — methods like ridge, lasso or robust M-estimators can be added on request.