Chapter 2
Sensitivity in Approximation
Steepest Descent
Although adjusting each coefficient individually works well with orthogonal functions, numerical precision can suffer when the error measure is dominated by errors in coefficients other than the one being adjusted. In addition, the contribution to the error from as-yet-unadjusted coefficients can make convergence quite slow. When the approximating functions are not orthogonal, or when simply optimizing coefficients to minimize cost, maximize efficiency, and the like, adjusting one coefficient at a time can fail to converge at all.
A Partial Analogy to the Minimization Process
Imagine a valley in the Canadian Rockies, ringed by mountains.
You are well up a mountain slope; the ground is very rough and snow-covered.
Snow is falling heavily and you can only just see your feet. You want to reach
your
canoe on the shallow lake in the low point of the valley as soon as possible.
What do you do?
You could put one foot forward and, if the ground felt lower in that direction, take further steps.
If it felt higher, you might turn around and step in the opposite direction.
You could repeat this until neither forward nor reverse steps felt lower.
Then you might make a 90-degree turn and repeat the process.
Hopefully you will
have reached your objective when neither forward nor backward
steps lower your
position. Are you at the water's edge or in the water?
If you are neither, then your 90-degree turn could have been less than precise. You might repeat
the entire process, though you could die of exposure
before reaching your objective.
(You might have been spiraling around the
valley, following a very small downward slope.)
As an alternative, consider probing one foot forward and then backward, noting
which of the two lowered your position the more, and by how much. Then turn 90 degrees
and repeat the probes. Now take a step in a direction that lies between
the two best probe directions and favours, proportionately, the one that lowered
your position the most: the direction of steepest descent.
It should be intuitive that the method of steepest descent offers the best chance of
reaching the lake before you die of exposure.
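The hiker's probing amounts to estimating the downhill slope in two perpendicular directions and stepping in proportion to both. The chapter works in a spreadsheet; the following is a minimal sketch of the same idea in Python, with an invented bowl-shaped "valley" function and arbitrarily chosen probe size, step rate, and tolerance.

```python
import math

def descend(f, x, y, probe=1e-4, rate=0.2, tol=1e-10, max_steps=10000):
    """Walk downhill on f by probing forward and backward in two
    perpendicular directions (the hiker's foot tests) and stepping
    in proportion to the measured slopes."""
    for _ in range(max_steps):
        # Slope in each direction, from a forward and a backward probe.
        gx = (f(x + probe, y) - f(x - probe, y)) / (2 * probe)
        gy = (f(x, y + probe) - f(x, y - probe)) / (2 * probe)
        step_x, step_y = -rate * gx, -rate * gy
        if abs(step_x) < tol and abs(step_y) < tol:
            break                      # no probe lowers us: stop
        x, y = x + step_x, y + step_y  # step toward steepest descent
    return x, y

# A smooth valley whose low point is at (1, -2).
valley = lambda x, y: (x - 1)**2 + 2 * (y + 2)**2
print(descend(valley, 5.0, 5.0))  # approaches (1, -2)
```

Note that the single step combines both probe directions at once, favouring the steeper one, exactly as the hiker does.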
But there are other problems. Too large an initial step could carry you into the next valley, or across this valley
to the opposite slope. Steps that are too small can fail to guide you out
of small depressions, called local minima.
Additionally, when adjusting only one coefficient at a time,
and in situations where the effects of the coefficient adjustments are not fully independent,
it can be impossible to reach the objective. In such cases, adjusting all coefficients
at once using steepest descent will often succeed.
The efficacy of steepest descent coefficient adjustment
will now be compared with the one-at-a-time process that
was used in the previous treatment of approximating a semicircle with sine functions.
First, determine the amount by which the squared error changes for a small change in
each of the 17 coefficients. These amounts indicate the sensitivity of the error to
adjustment of each coefficient. Then adjust all of the coefficients in proportion to
the negatives of their sensitivities, negative because we wish to reduce the error rather
than increase it. Continue these adjustments until the resulting reduction in the error
is less than a predetermined amount, in our case 0.00000001.
Recall that the error that resulted from the one-at-a-time adjustments was 0.008464. Some coefficients required 81 adjustment steps; the fewest required for any coefficient was 54. Employing steepest descent, with the aid of a macro (shown later), only 8 steps were required. The results for the initial state and the first two of the eight steps are shown in the graphs that follow. The final error was again 0.008464.
The sensitivities of the 17 coefficients at each step, as found by the macro, are
seen next.
There is no separate need to adjust the step sizes, since they are maintained in proportion to the continually decreasing sensitivities.
Next view the increments made to the coefficients in each step and their final
values.
The final values consist of the initial values plus the increments calculated in
each step.
The Macro follows.
Excel is a registered trademark of Microsoft Corporation; Excel 2000, used herein, is copyright Microsoft Corporation. Excel
contains a tool called Solver that is used for iterative
optimization. Solver and the foregoing macro found exactly the same coefficients
for the sinusoidal approximation to the semicircle.
Other uses for iterative optimization include solving sets of linear equations,
locating roots of polynomials, solving sets of differential equations and more.
Use Solver
to solve a few linear equations of your own
choosing. (Hint:
assign initial values to the unknown coefficients, then let Solver adjust the assigned values until the error between the results given by the equations and the known right-hand values becomes small.)
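Solver's internals are not shown in this text, but the hint above can be sketched directly with steepest descent: guess the unknowns, then repeatedly reduce the squared mismatch between the left- and right-hand sides. The 2x2 system and the step rate below are arbitrary choices for illustration.

```python
# An arbitrary pair of linear equations:
#   3x + 2y = 12
#    x -  y = 1      (exact solution: x = 2.8, y = 1.8)
def residuals(x, y):
    return (3 * x + 2 * y - 12, x - y - 1)

x = y = 0.0          # initial guesses for the unknowns
rate = 0.05
for _ in range(500):
    r1, r2 = residuals(x, y)
    # Sensitivities of the squared error r1**2 + r2**2 to x and y.
    gx = 2 * (3 * r1 + 1 * r2)
    gy = 2 * (2 * r1 - 1 * r2)
    x -= rate * gx   # step against the sensitivities
    y -= rate * gy
print(round(x, 4), round(y, 4))  # → 2.8 1.8
```

The same idea extends to locating polynomial roots and the other uses mentioned above: pose the problem as an error to be driven toward zero.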
It is said that one's understanding of mathematical methods is greatly advanced
by doing. Try to make some improvements to the given
macro. It is possible.
Next
The spreadsheet is used to find a close approximation to e, the base of Napierian
logarithms. An application of e is then explored.