You have almost certainly seen a trendline on a scatter plot. That straight line through a cloud of dots? It wasn't guessed. It was calculated.
Nearly every basic regression model in statistics, economics, and data science rests on a single calculation: Ordinary Least Squares (OLS).
What is OLS, really?
Imagine trying to predict website traffic from the number of social media shares a post receives. You gather fifty data points. Plotting them reveals a clear upward trend, even though they don't form a perfect line.
OLS finds the straight line that best fits those points. "Best" means minimizing the total squared vertical distance between each actual data point and the line.
That line looks like this:
y = mx + b
Where:
- y = predicted traffic
- x = number of shares
- m = slope (how much traffic increases per share)
- b = intercept (baseline traffic with zero shares)
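The slope and intercept come straight out of closed-form formulas. Here is a minimal sketch, with made-up shares/traffic numbers standing in for the fifty data points:

```python
# Fit y = m*x + b by ordinary least squares, using the closed-form
# formulas. The shares/traffic numbers below are invented for illustration.
shares = [5, 10, 15, 20, 25, 30]        # x: social media shares
traffic = [60, 95, 140, 175, 230, 260]  # y: site visits (hypothetical)

n = len(shares)
mean_x = sum(shares) / n
mean_y = sum(traffic) / n

# slope m = covariance(x, y) / variance(x)
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(shares, traffic)) \
    / sum((x - mean_x) ** 2 for x in shares)
b = mean_y - m * mean_x  # the fitted line always passes through (mean_x, mean_y)

print(round(m, 2), round(b, 2))  # → 8.23 16.0
```

Read off the fit the same way as the bullets above: each extra share predicts about 8 more visits, on a baseline of 16 visits with zero shares.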
Why “Least Squares”?
Because OLS doesn’t just add up the errors (predicted vs. actual). It squares them first. Why?
- Squaring makes all errors positive (so negatives don’t cancel positives).
- It penalizes large errors more heavily — a big mistake is much worse than several tiny ones.
The result? One mathematically unique line that minimizes the total squared error.
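Both points above can be checked with a few lines of arithmetic:

```python
# Why square the errors? Two quick demonstrations with toy numbers.

# 1. Raw errors of +5 and -5 cancel out; squared errors don't.
raw = [5, -5]
print(sum(raw))                 # → 0 (looks like a perfect fit, but isn't)
print(sum(e**2 for e in raw))   # → 50

# 2. One big error outweighs many tiny ones once squared.
small = [1] * 10  # ten small errors of 1
big = [10]        # one big error of 10
print(sum(e**2 for e in small))  # → 10
print(sum(e**2 for e in big))    # → 100: the single big miss dominates
```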
What OLS gives you
- Coefficients – Easy-to-interpret numbers (e.g., “1 extra share = 5 more visits”).
- R-squared – How much of the variation in y your x explains (0–100%).
- Standard errors – How confident you can be in those coefficients.
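In practice a library (statsmodels, for instance) hands you all three, but they are simple enough to compute by hand. A sketch on made-up shares/traffic data:

```python
# Hand-rolled versions of the three OLS outputs: coefficients,
# R-squared, and the standard error of the slope. Data is invented.
import math

x = [5, 10, 15, 20, 25, 30]
y = [60, 95, 140, 175, 230, 260]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

sxx = sum((xi - mx) ** 2 for xi in x)
m = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b = my - m * mx

residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)       # unexplained variation
sst = sum((yi - my) ** 2 for yi in y)      # total variation in y
r_squared = 1 - sse / sst                  # share of variation explained
se_slope = math.sqrt(sse / (n - 2) / sxx)  # uncertainty in the slope estimate

print(f"slope={m:.2f}, R2={r_squared:.3f}, SE(slope)={se_slope:.3f}")
```

Here R-squared comes out near 1, so x explains almost all the variation in y; a slope many times larger than its standard error is a strong sign the relationship is real.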
When OLS works beautifully (and when it doesn’t)
Works well when:
- The relationship is roughly linear
- Errors are random and unrelated to x
- No extreme outliers dominate the line
Fails badly when:
- The true pattern is a curve (e.g., diminishing returns)
- One or two extreme points hijack the line
- Your variables are highly correlated (multicollinearity)
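The outlier failure mode is easy to see numerically. Below, a single wild point (invented for illustration) drags the slope far from the true value:

```python
# How one extreme point can hijack the line: fit the same data
# with and without an outlier.
def ols_slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # a perfect line with slope 2
print(ols_slope(x, y))  # → 2.0

x_out = x + [6]
y_out = y + [100]       # one wild point
print(round(ols_slope(x_out, y_out), 2))  # slope jumps well above 2
```

Because errors are squared, that one point's huge residual dominates the total, and the line bends toward it.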
A real-world example
Let’s say you run a small online store. You regress daily sales (y) on daily ad spend (x).
OLS tells you:
“For every $1 you spend on ads, sales increase by $4.20.”
That’s gold. Now you know whether ads are profitable.
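A toy version of that regression, with noiseless numbers invented so the slope lands on the $4.20 figure:

```python
# Regress daily sales on daily ad spend. The data is fabricated
# (and noise-free) purely to illustrate reading off the coefficient.
def ols_fit(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    m = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return m, my - m * mx

ad_spend = [10, 20, 30, 40, 50]    # daily ad dollars
sales = [92, 134, 176, 218, 260]   # daily sales dollars (made up)
m, b = ols_fit(ad_spend, sales)
print(round(m, 2))  # → 4.2: each ad dollar returns about $4.20 in sales
```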
The bottom line
OLS is not fancy. It’s not deep learning. It won’t win a Kaggle competition on messy image data.
But for understanding relationships, making simple predictions, and explaining results to a boss or client, OLS is still one of the most powerful tools you can learn.
It fits on a single line of Python (statsmodels or scikit-learn), R (lm()), or even Excel.
