Background of Hands-On Math 

Data Fitting - Curve Fitting - Interpolation -Approximation - Maximizing - Minimizing - Optimization - Solving

Common Elements of Data and Curve Fitting, Approximation and Optimization, Maximization and Minimization.

Data and Curve Fitting, Approximation, and Optimization are mathematical procedures. They all employ numerical representations of: an Initial State, Objectives to be achieved, Controllables that can be adjusted to reduce error, a Current State or result that can be compared with the objectives to produce a Current Error,  and a Strategy or strategies to find successive values for the controllables that reduce error.

Maximization and Minimization Procedures also employ numerical representations of Initial State, Controllables, Current State and Strategy but differ in that the Objective(s), and hence the Error(s), is(are) not usually defined by fixed numerical values.

Sometimes a tentative numerical objective is provided as a large or small value that can be compared with a Current State to produce, over some iterations, a least achievable numerical Error. Although Excel's Solver is equipped to find Max, Min and Equals as objectives, the tentative objective technique can be illustrated with a simple example, the minimum of (1+sinx/x). A graph of this function and the value of its minimum are shown next:


As Excel cannot readily find values for this function at x = 0, provide Solver with a constraint such as x > =  0.0000001. 

Now provide Solver with the tentative objective of setting the value of the function to, for example, -1,000,000,000.

Set x to an initial value of 1.0 and start Solver.  Solver will complain that it was unable to reach a feasible solution. Notwithstanding the complaint it will have found a somewhat inaccurate value for x quite close to the value for the location of the actual minimum, for example, 4.49330.

The user may now choose an objective only slightly smaller than the minimum, e.g. 0.782 and Solver finds the value for x as 4.4934341 i.e. in agreement with the value it would find if set up to locate a minimum rather than being set up to meet an objective.

Linear and Non-Linear

Gauss is credited with being first, in 1795 at age 18, to find the "Linear Least Squares" method that he used for fitting observations that contain observational errors. Linear implies that the parameters of a model are linearly combined.  When this is so, a closed form of solution can be found, not necessarily, but often in the form of polynomial coefficients. When the application is non-linear, iterative procedures are employed.

Gauss Applied Linear Least Squares to Predicting the Orbit of Ceres

Ceres, a new asteroid, was discovered on January 1, 1801 and tracked for 40 days at which time it was lost in the Sun's glare. At the age of 24, Gauss employed Linear Least Squares to estimate and project the orbit of Ceres so that it could be located when it emerged from behind the Sun.

In 1829, Gauss was able to show that a least-square estimator was the best linear unbiased estimator of the polynomial coefficients when the measurements being fitted had uncorrelated, equal variance,  zero-mean errors - all properties that might be expected of careful human measurements.  This is the reason that the Least Squares representation of error is used when observational data is to be fitted with a curve.  Least Squares minimization can also be thought of as a neat way of smoothing data that has a large error component

Fitting a Curve with a Curve, Approximation

There are cases where the curve to be fitted is not the result of noisy physical measurements that require smoothing and instead is the result of labour-intensive calculation. 

What is then desired is a more simple expression for a range of the calculated curve.  In this case the objective is to find a new curve that is less complex to calculate and which, although not an exact fit, is satisfactory in some sense.

This is a case where the data to be fitted is regarded as exact and, for the sake of reduced calculation effort, we are willing to accept a less precise representation.

As an example, the A and  B  coefficients of the function  Y = A * X + B / X  are now chosen, using Solver, to fit the calculated function Y =  X ^ 0.5 which has been calculated at 201 points from X = 0.25 to X = 2.25. (The fitting function has been chosen rather arbitrarily only to show the effect of using two alternatives to Least Squares.) 

Name the 201 values of calculated Y, C1 through C201.  Name the corresponding 201 values that will be found for function Y, F1 through F201.  Postulate errors, E1 through E201, which can be found for each point  by comparing the corresponding Ci and Fi values.  There are many possible ways of forming the Ei values.

Least Squares would set Ei = (Ci-Fi)^2 and Solver would be used to choose A and B so as to minimize the sum of the 201 Ei values.

Sometimes it is of interest to chose Ei as Abs(Ci-Fi) so as to minimize Absolute Error rather than squared error.

At other times it is Relative Error that is important.  We would might use either Ei = abs( Fi / Ci - 1) or Ei = ( Fi / Ci - 1) ^2.

In addition to such choices, some users would prefer to minimize the greatest of the Ei rather than their sum, MiniMax.  Perhaps there are other users who would prefer to make the two largest errors as equal and as small as possible.  Who knows?

To illustrate the effects that can occur with choice of the error criterion, Solver was employed to fit the list of square roots using MiniMax for both the Absolute Error criterion and the Relative Error criterion as seen next:


For the Minimum Absolute Error fit, blue, the maximum error is seen to occur at both the largest and smallest X values. This results in a comparatively high relative error for X near 0.25.

For the Minimum Relative Error fit, red, the maximum error is seen to occur at the largest X value. 

Which of these two fits would you prefer?

When the object is solely a close fit, greatest interest may be on using the criterion and other strategy that would provide a satisfactory fit with the least expenditure of calculation time.  That is a subject outside of the scope of Hands-On Math.


Note in the foregoing example of curve fitting that it was incidental that the fitting expression interpolated the 201-point list.  There can be occasions when the interpolation aspect of curve fitting is the primary objective although the fit remains important. 

Optimization - Minimization - Maximization

Curve fitting can be described as having the objective of fitting a curve with minimum error. However there are times when the objective is not a curve but is instead a combination of disparate items such as fuel efficiency, dollar cost and size and the objective is to minimize a weighted combination of such items.

If the objective were to maximize the ratio of size to dollar cost for a given fuel efficiency we would think of the problem as one where the optimum was found when that ratio was largest.

Methods for minimization and maximization differ little. A large quantity becomes small when; instead, its inverse is the objective. 

The term Optimization is most often employed when the objective combines disparate items whether or not the application is one of maximization or minimization.


When the objective function is a weighted sum of disparate items, such as are found in modeling a projectile aiming at a target. Example items are: the time that a position is reached, the angle of approach at that time and the velocity at that time. The controllables may also be disparate such as Charge, Elevation, and Azimuth.

If the projectile were a grenade aiming at a moving target then the time and angle of approach might be given most weight. If it were a shell aiming at a stationary target then velocity and angle of approach might be weighted most.

When the controllables are disparate. The sensitivities of the weighted objective to the controllables will be disparate and must be separately weighted in adjusting the controllables.  Otherwise one or more of the sensitivities could be too small to affect the optimization or so large as to weaken or destroy convergence.  This situation can be looked after by the user intervening initially to set the weights and from time to time intervening to adjust the weights.  In come cases means can be found to automatically adapt the sensitivity weights as iteration progresses.

Numerical Differentiation

Just as numerical integration resides at the core of the numerical solution of differential equations, numerical differentiation to determine sensitivities is the core procedure of numerical optimization.  The methods to be employed could be absolute or relative, e.g., a controllable could be changed by a small constant amount such as 0.0001 or by 0.01%, the former being a concern when the value of the controllable is both very large such as 1,000,000,000,000 and very small such as 0.000,000,000,001, the latter creating concerns when the value of the controllable is at or near zero. 

There are two-point, three-point and even five-point differentiation methods. There is Differential Quadrature, which uses weighted sums of function values. There are methods based on Splines, methods based on Taylor series, and other methods can be devised or found.

This writer tends to favour the use of forms of relative error in situations where the controllable does not take on the value of zero and the computer being used is efficient at division.


Sets of linear and non-linear equations or a mixture of both may be solved iteratively with an optimization routine such as Solver. 

This ability to solve can be looked upon more broadly as an ability to estimate or infer the cause of an observed effect. 

Could observation of the tides, at a position on the Earth, and over a period of months become a basis for any inferences about our Moon, its orbit or about properties of Earth? 

Top Prev. Topic Topics