What Is the Least Squares Method?
Imagine we have a set of data from observations or experiments consisting of pairs of values (x, y). If we plot these data points on a scatter diagram, we sometimes see a pattern or trend that resembles a straight line.
The question is: of the many straight lines we could draw through these points, which one best represents the entire dataset?
The Least Squares Method is a mathematical procedure used to find one unique straight line that is considered the best fit for the set of data points.
Minimizing Squared Errors
The main idea is to minimize the error, or residual, between each data point and the prediction line. So how do we determine the "best fit" line?
- Prediction Line: We try drawing a straight line ($\hat{y} = a + bx$) among the data points.
- Error (Residual): For each original data point ($x_i, y_i$), there will be a vertical distance to the predicted line ($\hat{y}_i = a + bx_i$). This distance is called the error or residual: $e_i = y_i - \hat{y}_i$
- Minimize Sum of Squared Errors: The Least Squares Method works by finding the straight line that makes the sum of the squares of all errors ($\sum e_i^2$) as small as possible. This is why it's called "Least Squares".
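The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes a line of the form ŷ = a + bx, and the function name `sum_squared_errors` and the five data points are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def sum_squared_errors(xs, ys, a, b):
    """Return the sum of squared residuals for the line y_hat = a + b * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = a + b * x        # point on the prediction line
        residual = y - y_hat     # vertical distance from the data point to the line
        total += residual ** 2   # squaring keeps every contribution non-negative
    return total


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Compare two candidate lines: the one with the smaller sum fits better.
sse_line_1 = sum_squared_errors(xs, ys, a=2.2, b=0.6)
sse_line_2 = sum_squared_errors(xs, ys, a=1.0, b=1.0)
print(sse_line_1, sse_line_2)
```

For this data the first candidate gives a sum of about 2.4 and the second gives 4.0, so the first line is the better fit of the two. The Least Squares Method looks for the line whose sum is smaller than that of every other candidate.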
Why square the errors?
- Squaring the errors makes all values positive, so errors above and below the line don't cancel each other out.
- Larger errors contribute disproportionately to the total sum (because they are squared), so the method penalizes large errors heavily.
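Both points can be checked with a small Python sketch. The residual values below are hypothetical and serve only to show the effect of squaring.

```python
# Hypothetical residuals: some points lie above the line, some below.
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]

# Without squaring, positive and negative errors cancel out,
# so the total looks like "no error" even though no point is on the line.
signed_sum = sum(residuals)

# With squaring, every error adds to the total.
squared_sum = sum(e ** 2 for e in residuals)

# An error three times as large contributes nine times as much.
small_error, large_error = 1.0, 3.0
ratio = large_error ** 2 / small_error ** 2

print(signed_sum, squared_sum, ratio)
```

Here the signed sum is zero up to floating-point rounding, the squared sum is about 2.4, and the ratio is 9.0.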
Visualization Example
For example, a company wants to see the relationship between the advertising costs they incur (in millions of rupiah) and the number of products sold (in thousands of units). The data they collected is as follows:
The straight line drawn on the diagram above is the best-fit line found using the Least Squares Method for this advertising cost and sales data. This line represents the general linear trend that most closely approximates all data points, and the dashed lines show the residuals being minimized.
Mathematical Basis
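Mathematically, we are looking for the line with the equation:
$$\hat{y} = a + bx$$
Where the values of $a$ (intercept) and $b$ (slope) are chosen such that the value of:
$$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - a - bx_i)^2$$
is minimized.
Through calculus (which we don't need to derive here), we obtain formulas for the values of $a$ and $b$ that satisfy this condition:
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
$$a = \bar{y} - b\bar{x}$$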
Formula key:
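- $n$ = Number of data pairs.
- $\sum x$, $\sum y$ = Sum of all x and y values.
- $\sum xy$ = Sum of the product of each x and y pair.
- $\sum x^2$ = Sum of the square of each x value.
- $\bar{x}$ = Mean of x ($\bar{x} = \frac{\sum x}{n}$).
- $\bar{y}$ = Mean of y ($\bar{y} = \frac{\sum y}{n}$).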
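The quantities in the formula key map directly onto code. The sketch below assumes the standard closed-form expressions for simple linear regression, with slope b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and intercept a = ȳ − b·x̄. The function name `least_squares_line` and the sample data are hypothetical and used for illustration only.

```python
def least_squares_line(xs, ys):
    """Return (a, b), the intercept and slope of the least squares line y_hat = a + b * x."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")

    n = len(xs)                                   # number of data pairs
    sum_x = sum(xs)                               # sum of all x values
    sum_y = sum(ys)                               # sum of all y values
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of the products x * y
    sum_x2 = sum(x * x for x in xs)               # sum of the squared x values

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every x is identical: the slope is undefined.
        raise ValueError("cannot fit a line when all x values are equal")

    b = (n * sum_xy - sum_x * sum_y) / denominator   # slope
    a = sum_y / n - b * (sum_x / n)                  # intercept: mean of y minus b times mean of x
    return a, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y_hat = {a:.2f} + {b:.2f}x")
```

For this data the sums are n = 5, Σx = 15, Σy = 20, Σxy = 66 and Σx² = 55, which gives b = 30 / 50 = 0.6 and a = 4 − 0.6 × 3 = 2.2.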
Thus, the Least Squares Method provides a systematic and objective way to find the best straight line representing the linear trend in the data based on the principle of minimizing the sum of squared errors.