Maxima and minima of functions of several variables, stationary point, Lagrange’s method of multipliers

SolitaryRoad.com

Website owner:  James Miller


MAXIMA AND MINIMA OF FUNCTIONS OF SEVERAL VARIABLES, STATIONARY POINT, LAGRANGE’S METHOD OF MULTIPLIERS

Def. Stationary (or critical) point. For a function y = f(x) of a single variable, a stationary (or critical) point is a point at which dy/dx = 0; for a function u = f(x₁, x₂, ... , x_n) of n variables it is a point at which

∂u/∂x₁ = 0, ∂u/∂x₂ = 0, ... , ∂u/∂x_n = 0 .

In the case of a function y = f(x) of a single variable, a stationary point corresponds to a point on the curve at which the tangent to the curve is horizontal. In the case of a function z = f(x, y) of two variables a stationary point corresponds to a point on the surface at which the tangent plane to the surface is horizontal.

In the case of a function y = f(x) of a single variable, a stationary point can be any of the following three: a maximum point, a minimum point or an inflection point. For a function z = f(x, y) of two variables, a stationary point can be a maximum point, a minimum point or a saddle point. For a function of n variables it can be a maximum point, a minimum point or a point that is analogous to an inflection or saddle point.

Maxima and minima of functions of several variables. A function f(x, y) of two independent variables has a maximum at a point (x₀, y₀) if f(x₀, y₀) ≥ f(x, y) for all points (x, y) in some neighborhood of (x₀, y₀). Such a function has a minimum at a point (x₀, y₀) if f(x₀, y₀) ≤ f(x, y) for all points (x, y) in some neighborhood of (x₀, y₀).

A function f(x₁, x₂, ... , x_n) of n independent variables has a maximum at a point (x₁', x₂', ... , x_n') if f(x₁', x₂', ... , x_n') ≥ f(x₁, x₂, ... , x_n) at all points in some neighborhood of (x₁', x₂', ... , x_n'). Such a function has a minimum at a point (x₁', x₂', ... , x_n') if f(x₁', x₂', ... , x_n') ≤ f(x₁, x₂, ... , x_n) at all points in some neighborhood of (x₁', x₂', ... , x_n').

Necessary condition for a maximum or minimum. A necessary condition for a function f(x, y) of two variables, whose first partial derivatives exist, to have a maximum or minimum at point (x₀, y₀) is that

∂f/∂x = 0, ∂f/∂y = 0

at the point (i.e. that the point be a stationary point).

In the case of a function f(x₁, x₂, ... , x_n) of n variables, the necessary condition for the function to have a maximum or minimum at point (x₁', x₂', ... , x_n') is that

1) ∂f/∂x₁ = 0, ∂f/∂x₂ = 0, ... , ∂f/∂x_n = 0

at that point (i.e. that the point be a stationary point).

To find the maximum or minimum points of a function we first locate the stationary points using 1) above. After locating the stationary points we then examine each stationary point to determine if it is a maximum or minimum. To determine if a point is a maximum or minimum we may consider values of the function in the neighborhood of the point as well as the values of its first and second partial derivatives. We also may be able to establish what it is by arguments of one kind or other. The following theorem may be useful in establishing maximums and minimums for the case of functions of two variables.

Sufficient condition for a maximum or minimum of a function z = f(x, y). Let z = f(x, y) have continuous first and second partial derivatives in the neighborhood of point (x₀, y₀). If at the point (x₀, y₀)

∂f/∂x = 0, ∂f/∂y = 0

and

Δ = (∂²f/∂x∂y)² - (∂²f/∂x²)(∂²f/∂y²) < 0

then there is a maximum at (x₀, y₀) if

∂²f/∂x² < 0

and a minimum if

∂²f/∂x² > 0 .

If Δ > 0 , point (x₀, y₀) is a saddle point (neither maximum nor minimum). If Δ = 0 , the nature of point (x₀, y₀) is undecided. More investigation is necessary.
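The test can be written as a small routine. The following is a minimal sketch, assuming the sign convention Δ = (∂²f/∂x∂y)² - (∂²f/∂x²)(∂²f/∂y²), which is the convention consistent with the statement that Δ > 0 gives a saddle point. The function name classify_stationary_point and the sample functions are illustrative and do not come from the text.

```python
def classify_stationary_point(fxx, fxy, fyy):
    """Classify a stationary point of z = f(x, y) from its second partials.

    Assumed convention: delta = fxy**2 - fxx*fyy, so delta < 0 signals
    an extremum and delta > 0 a saddle point.
    """
    delta = fxy**2 - fxx * fyy
    if delta > 0:
        return "saddle point"
    if delta == 0:
        return "undecided"
    # delta < 0 forces fxx and fyy to be nonzero and of the same sign
    return "maximum" if fxx < 0 else "minimum"


# Illustrative functions, each with a stationary point at (0, 0):
print(classify_stationary_point(2, 0, 2))    # f = x**2 + y**2    -> minimum
print(classify_stationary_point(-2, 0, -2))  # f = -(x**2 + y**2) -> maximum
print(classify_stationary_point(2, 0, -2))   # f = x**2 - y**2    -> saddle point
print(classify_stationary_point(0, 0, 0))    # f = x**4 + y**4    -> undecided
```

The last case shows why Δ = 0 is inconclusive: x⁴ + y⁴ clearly has a minimum at the origin, but the second partial derivatives alone cannot detect it.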

Example. Find the maxima and minima of function z = x² + xy + y² - y .

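Solution. Setting the first partial derivatives equal to zero gives

∂z/∂x = 2x + y = 0

∂z/∂y = x + 2y - 1 = 0

so that

x = -1/3 , y = 2/3

This is the stationary point. At this point ∂²z/∂x² = 2, ∂²z/∂y² = 2 and ∂²z/∂x∂y = 1, so that

Δ = (∂²z/∂x∂y)² - (∂²z/∂x²)(∂²z/∂y²) = 1 - 4 = -3 < 0

and

∂²z/∂x² = 2 > 0 ,

and the point is a minimum. The minimum value of the function is -1/3.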
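The arithmetic of this example can be checked with exact fractions. The sketch below is only a verification aid: it solves the two linear equations by Cramer's rule, evaluates Δ with the convention Δ = (∂²z/∂x∂y)² - (∂²z/∂x²)(∂²z/∂y²), and compares the function value at the stationary point with values at a few nearby points. The step size h and the variable names are arbitrary choices.

```python
from fractions import Fraction


def f(x, y):
    """The example function z = x^2 + xy + y^2 - y."""
    return x**2 + x * y + y**2 - y


# Solve 2x + y = 0, x + 2y = 1 by Cramer's rule (exact arithmetic).
det = Fraction(2 * 2 - 1 * 1)
x0 = Fraction(0 * 2 - 1 * 1) / det
y0 = Fraction(2 * 1 - 0 * 1) / det

# The second partials of f are constant: f_xx = 2, f_xy = 1, f_yy = 2.
fxx, fxy, fyy = 2, 1, 2
delta = fxy**2 - fxx * fyy

# Compare f at the stationary point with f at eight nearby points.
h = Fraction(1, 100)
neighbours = [f(x0 + i * h, y0 + j * h)
              for i in (-1, 0, 1) for j in (-1, 0, 1) if (i, j) != (0, 0)]

print("stationary point:", x0, y0)
print("value there:", f(x0, y0))
print("delta:", delta, " f_xx:", fxx)
print("all neighbours larger:", all(v > f(x0, y0) for v in neighbours))
```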

Maxima and minima of functions subject to constraints. Let us set ourselves the following problem: Let F(x, y) and G(x, y) be functions defined over some region R of the x-y plane. Find the points at which the function F(x, y) has maximums subject to the side condition G(x, y) = 0. Basically we are asking the question: At what points on the solution set of G(x, y) = 0 does F(x, y) have maximums? The solution set of G(x, y) = 0 corresponds to some curve in the plane. See Figure 1. The solution set (i.e. locus) of G(x, y) = 0 is shown in red. Figure 2 shows the situation in three dimensions where function z = F(x, y) is shown rising up above the x-y plane along the curve G(x, y) = 0. The problem is to find the maximums of z = F(x, y) along the curve G(x, y) = 0.

Let us now consider the same problem in three variables. Let F(x, y, z) and G(x, y, z) be functions defined over some region R of space. Find the points at which the function F(x, y, z) has maximums subject to the side condition G(x, y, z) = 0. Basically we are asking the question: At what points on the solution set of G(x, y, z) = 0 does F(x, y, z) have maximums? G(x, y, z) = 0 represents some surface in space. In Figure 3, G(x, y, z) = 0 is depicted as a spheroid in space. The problem then is to find the maximums of the function F(x, y, z) as evaluated on this spheroidal surface.

Let us now consider another problem. Suppose instead of one side condition we have two. Let F(x, y, z), G(x, y, z) and H(x, y, z) be functions defined over some region R of space. Find the points at which the function F(x, y, z) has maximums subject to the side conditions

2) G(x, y, z) = 0

3) H(x, y, z) = 0.

Here we wish to find the maximum values of F(x, y, z) on that set of points that satisfy both equations 2) and 3). Thus if D represents the solution set of G(x, y, z) = 0 and E represents the solution set of H(x, y, z) = 0 we wish to find the maximum points of F(x, y, z) as evaluated on the set D ∩ E (i.e. the intersection of sets D and E). In Fig. 4 G(x, y, z) = 0 is depicted as an ellipsoid and H(x, y, z) = 0 as a plane. The intersection of the ellipsoid and the plane is the set D ∩ E on which F(x, y, z) is to be evaluated.

The above can be generalized to functions of n variables F(x₁, x₂, ... , x_n), G(x₁, x₂, ... , x_n), etc. and m side conditions.

Methods for finding maxima and minima of functions subject to constraints.

1. Method of direct elimination. Suppose we wish to find the maxima or minima of a function F(x, y) with the constraint Φ(x, y) = 0. Suppose we are so lucky that Φ(x, y) = 0 can be solved explicitly for y, giving y = g(x). We can then substitute g(x) for y in F(x, y) and then find the maximums and minimums of F(x, g(x)) by standard methods. In some cases, it may be possible to do this kind of thing. We express some of the variables in the equations of constraint in terms of other variables and then substitute into the function whose extrema are sought, and find the extrema by standard methods.
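As an illustration of direct elimination, the sketch below takes F(x, y) = xy with the constraint Φ(x, y) = x + y - 10 = 0. This example is an assumption made for illustration and does not come from the text. The constraint is solved for y = g(x) = 10 - x, and the stationary point of F(x, g(x)) is located by bisection on a numerically estimated derivative.

```python
def F(x, y):
    return x * y


def g(x):
    # The constraint x + y - 10 = 0 solved explicitly for y.
    return 10.0 - x


def h(x):
    # F restricted to the constraint curve: a function of x alone.
    return F(x, g(x))


def dh(x, eps=1e-6):
    # Central-difference estimate of dh/dx.
    return (h(x + eps) - h(x - eps)) / (2 * eps)


def bisect_root(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_star = bisect_root(dh, 0.0, 10.0)
y_star = g(x_star)
print(x_star, y_star, F(x_star, y_star))  # approximately 5, 5, 25
```

Here the elimination can also be done by hand: F(x, 10 - x) = 10x - x², whose derivative 10 - 2x vanishes at x = 5, giving the constrained maximum F = 25 at (5, 5).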

2. Method of implicit functions. Suppose we wish to find the maxima or minima of a function u = F(x, y, z) with the constraint Φ(x, y, z) = 0. We note that Φ(x, y, z) = 0 defines z implicitly as a function of x and y i.e. z = f(x, y). We thus seek the extrema of the quantity

u = F(x, y, f(x, y)) .

The necessary condition for a stationary point, as given by 1) above, becomes

4) ∂u/∂x = F₁ + F₃ ∂z/∂x = 0, ∂u/∂y = F₂ + F₃ ∂z/∂y = 0

(where F₁ represents the partial of F with respect to x, etc.)

Taking partials of Φ with respect to x and y it follows that

5) Φ₁ + Φ₃ ∂z/∂x = 0, Φ₂ + Φ₃ ∂z/∂y = 0

(since the partial derivative of a function that is constant is zero).

From the pair of equations consisting of the first equation in 4) and 5) we can eliminate ∂z / ∂x giving

6) F₁Φ₃ - F₃Φ₁ = 0

From the pair of equations consisting of the second equation in 4) and 5) we can eliminate ∂z / ∂y giving

7) F₂Φ₃ - F₃Φ₂ = 0

Equations 6) and 7) can be written in determinant form as

8)  | F₁  F₃ |            | F₂  F₃ |
    | Φ₁  Φ₃ |  = 0 ,     | Φ₂  Φ₃ |  = 0

Equations 8) combined with the equation Φ(x, y, z) = 0 give us three equations which we can solve simultaneously for x, y, z to obtain the stationary points of function F(x, y, z). The maxima and minima will be among the stationary points.
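A short worked case may help. It is an illustrative assumption, not an example from the text: take F(x, y, z) = x² + y² + z² with the constraint Φ(x, y, z) = x + y + z - 3 = 0, so that F₁ = 2x, F₂ = 2y, F₃ = 2z and Φ₁ = Φ₂ = Φ₃ = 1. Conditions 6) and 7) together with the constraint then read

```latex
\begin{aligned}
F_1\Phi_3 - F_3\Phi_1 &= 2x - 2z = 0, \\
F_2\Phi_3 - F_3\Phi_2 &= 2y - 2z = 0, \\
\Phi(x, y, z) &= x + y + z - 3 = 0
\end{aligned}
\quad\Longrightarrow\quad
x = y = z = 1, \qquad F(1, 1, 1) = 3 .
```

Since F is the square of the distance from the origin, the stationary point (1, 1, 1) is the point of the plane nearest the origin, and F = 3 is a minimum.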

This same method can be used for functions of an arbitrary number of variables and an arbitrary number of side conditions (smaller than the number of variables).

Extrema for a function of four variables with two auxiliary equations. Suppose we wish to find the maxima or minima of a function

u = F(x, y, z, t)

with the side conditions

9) Φ(x, y, z, t) = 0, ψ(x, y, z, t) = 0.

Equations 9) define variables z and t implicitly as functions of x and y i.e.

10) z = f₁(x, y), t = f₂(x, y) .

We thus seek the extrema of the quantity

u = F(x, y, f₁(x, y), f₂(x, y)) .

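The necessary condition for a stationary point, as given by 1) above, becomes

11) F₁ + F₃ ∂z/∂x + F₄ ∂t/∂x = 0, F₂ + F₃ ∂z/∂y + F₄ ∂t/∂y = 0

Taking partials of Φ with respect to x and y it follows that

12) Φ₁ + Φ₃ ∂z/∂x + Φ₄ ∂t/∂x = 0, Φ₂ + Φ₃ ∂z/∂y + Φ₄ ∂t/∂y = 0

Taking partials of ψ with respect to x and y it follows that

13) ψ₁ + ψ₃ ∂z/∂x + ψ₄ ∂t/∂x = 0, ψ₂ + ψ₃ ∂z/∂y + ψ₄ ∂t/∂y = 0

Eliminating ∂z/∂x and ∂t/∂x from the first equations of 11), 12) and 13), and ∂z/∂y and ∂t/∂y from the second equations, we can derive the conditions

14)  | F₁  F₃  F₄ |            | F₂  F₃  F₄ |
     | Φ₁  Φ₃  Φ₄ |  = 0 ,     | Φ₂  Φ₃  Φ₄ |  = 0
     | ψ₁  ψ₃  ψ₄ |            | ψ₂  ψ₃  ψ₄ |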

Equations 14) combined with the auxiliary equations Φ(x, y, z, t) = 0 and ψ(x, y, z, t) = 0 give us four equations which we can solve simultaneously for x, y, z, t to obtain the stationary points of function F(x, y, z, t). The maxima and minima will be among the stationary points.
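The four equations can be checked numerically on a simple case. The sketch below is illustrative only. It takes conditions 14) to be the vanishing of the two third-order determinants whose rows are (F₁, F₃, F₄), (Φ₁, Φ₃, Φ₄), (ψ₁, ψ₃, ψ₄) and (F₂, F₃, F₄), (Φ₂, Φ₃, Φ₄), (ψ₂, ψ₃, ψ₄), which is the form obtained by eliminating the derivatives of z and t. The functions F = x² + y² + z² + t², Φ = x + y + z + t - 4 and ψ = x - y + z - t are assumed for the example and are not from the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def grad_F(x, y, z, t):      # F = x^2 + y^2 + z^2 + t^2
    return (2 * x, 2 * y, 2 * z, 2 * t)


def grad_Phi(x, y, z, t):    # Phi = x + y + z + t - 4
    return (1, 1, 1, 1)


def grad_psi(x, y, z, t):    # psi = x - y + z - t
    return (1, -1, 1, -1)


def conditions(p):
    """Return the two determinants and the two constraint values at p."""
    grads = (grad_F(*p), grad_Phi(*p), grad_psi(*p))
    # Columns: x (index 0) or y (index 1), followed by z (2) and t (3).
    d_x = det3([[g[0], g[2], g[3]] for g in grads])
    d_y = det3([[g[1], g[2], g[3]] for g in grads])
    x, y, z, t = p
    return d_x, d_y, x + y + z + t - 4, x - y + z - t


print(conditions((1, 1, 1, 1)))  # all four quantities vanish
print(conditions((2, 1, 0, 1)))  # constraints hold, first determinant does not vanish
```

The point (1, 1, 1, 1) satisfies all four equations and is the stationary point. The point (2, 1, 0, 1) satisfies both side conditions but not the determinant conditions, and F is larger there (6 against 4).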

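Extrema for a function of n variables with p auxiliary equations. Consider a function of n variables

u = F(x₁, x₂, ... , x_n)

and p auxiliary equations (i.e. side conditions), where p < n,

Φ(x₁, x₂, ... , x_n) = 0

Ψ(x₁, x₂, ... , x_n) = 0

.................................

Ω(x₁, x₂, ... , x_n) = 0

Suppose the auxiliary equations define the last p variables x_(n-p+1), ... , x_n implicitly as functions of the first n - p variables. The n - p equations corresponding to equations 14) above are then

∂(F, Φ, Ψ, ... , Ω) / ∂(x_k, x_(n-p+1), ... , x_n) = 0 ,   k = 1, 2, ... , n - p ,

where ∂(F, Φ, Ψ, ... , Ω) / ∂(x_k, x_(n-p+1), ... , x_n) denotes the determinant of order p + 1 whose rows consist of the first partial derivatives of F, Φ, Ψ, ... , Ω respectively with respect to x_k, x_(n-p+1), ... , x_n. (In the case above, n = 4 and p = 2, so there are two such equations.)

These n - p equations along with the p auxiliary equations can be solved simultaneously for the n variables x₁, x₂, ... , x_n to obtain the stationary points of F(x₁, x₂, ... , x_n). The maxima and minima will be among the stationary points.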

*********************************

Geometrical interpretation for extrema of function F(x, y, z) with a constraint. We shall now present a theorem that gives a geometrical interpretation for the case of extremal values of functions of type F(x, y, z) with a constraint.

Theorem 1. Suppose the functions F(x, y, z) and Φ(x, y, z) have continuous first partial derivatives throughout a certain region R of space. Let the equation Φ(x, y, z) = 0 define a surface S, every point of which is in the interior of R, and suppose that the three partial derivatives Φ₁, Φ₂, Φ₃ are never simultaneously zero at a point of S. Then a necessary condition for the values of F(x, y, z) on S to attain an extreme value (either relative or absolute) at a point of S is that F₁, F₂, F₃ be proportional to Φ₁, Φ₂, Φ₃ at that point. If C is the value of F at the point, and if the constant of proportionality is not zero, the geometric meaning of the proportionality is that the surface S and the surface F(x, y, z) = C are tangent at the point in question.

Rationale behind theorem. From 8) above, a necessary condition for F(x, y, z) to attain a maximum or minimum (i.e. a condition for a stationary point) at a point P is that

F₁Φ₃ - F₃Φ₁ = 0, F₂Φ₃ - F₃Φ₂ = 0

Thus at a stationary point the partial derivatives F₁, F₂, F₃ and Φ₁, Φ₂, Φ₃ are proportional. Now the partial derivatives F₁, F₂, F₃ and Φ₁, Φ₂, Φ₃ are the components of the gradients of the functions F and Φ; and the gradient, at any point P, of a scalar point function ψ(x, y, z) is a vector that is normal to that level surface of ψ(x, y, z) that passes through point P. If C is the value of F at the stationary point P, then the vector (F₁, F₂, F₃) at point P is normal to the surface F(x, y, z) = C at P. Similarly, the vector (Φ₁, Φ₂, Φ₃) at point P is normal to the surface Φ(x, y, z) = 0 at P. Since the partial derivatives F₁, F₂, F₃ and Φ₁, Φ₂, Φ₃ are proportional, the normals to the two surfaces lie along the same line at P and the surfaces must be tangent at point P.

Example. Consider the maximum and minimum values of F(x, y, z) = x² + y² + z² on the surface of the ellipsoid

x²/64 + y²/36 + z²/25 = 1 .

Since F(x, y, z) is the square of the distance from (x, y, z) to the origin, it is clear that we are looking for the points at maximum and minimum distances from the center of the ellipsoid. The maximum occurs at the ends of the longest principal axis, namely at (±8, 0, 0). The minimum occurs at the ends of the shortest principal axis, namely at (0, 0, ±5). Consider the maximum point (8, 0, 0). The value of F at this point is 64, and the surface F(x, y, z) = 64 is a sphere. The sphere and the ellipsoid are tangent at (8, 0, 0) as asserted by the theorem. In this case, writing the constraint as Φ(x, y, z) = x²/64 + y²/36 + z²/25 - 1 = 0, the ratios Φ₁:Φ₂:Φ₃ and F₁:F₂:F₃ at (8, 0, 0) are 1/4 : 0 : 0 and 16 : 0 : 0 respectively.

This example brings out the fact that the tangency of the surfaces (or the proportionality of the two sets of ratios), is a necessary but not a sufficient condition for a maximum or minimum value of F, for we note that the condition of proportionality exists at the points (0, ±6, 0), which are the ends of the principal axis of intermediate length. But the value of F is neither a maximum nor a minimum at these points.
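The claims of this example can be checked numerically. The sketch below assumes the ellipsoid x²/64 + y²/36 + z²/25 = 1, which is the equation consistent with the axis endpoints and the ratios quoted in the example. It tests proportionality of the two gradients through their cross product, and then moves a short way along the ellipsoid from (0, 6, 0) in two directions. The helper names and the angle theta are arbitrary choices.

```python
import math


def F(x, y, z):
    return x**2 + y**2 + z**2


def grad_F(x, y, z):
    return (2 * x, 2 * y, 2 * z)


def grad_Phi(x, y, z):
    # Phi = x^2/64 + y^2/36 + z^2/25 - 1 (assumed form of the constraint)
    return (x / 32, y / 18, 2 * z / 25)


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


for p in [(8, 0, 0), (0, 6, 0), (0, 0, 5)]:
    # A zero cross product means the two gradients are proportional.
    print(p, F(*p), cross(grad_F(*p), grad_Phi(*p)))

# Two points on the ellipsoid close to (0, 6, 0).
theta = 0.1
toward_x = (8 * math.sin(theta), 6 * math.cos(theta), 0.0)
toward_z = (0.0, 6 * math.cos(theta), 5 * math.sin(theta))
print(F(*toward_x) > 36 > F(*toward_z))
```

Moving from (0, 6, 0) toward the long axis increases F above 36, while moving toward the short axis decreases it below 36, which is why the point is neither a maximum nor a minimum even though the gradients are proportional there.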

Case of extrema of function F(x, y) with a constraint. A similar geometrical interpretation can be given to the problem of extremal values for F(x, y) subject to the constraint Φ(x, y) = 0. Here we have a curve defined by the constraint, and a one-parameter family of curves F(x, y) = C. At a point of extremal value of F the curve F(x, y) = C through the point will be tangent to the curve defined by the constraint.

3. Lagrange’s Method of Multipliers. Let F(x, y, z) and Φ(x, y, z) be functions defined over some region R of space. Find the points at which the function F(x, y, z) has maximums and minimums subject to the side condition Φ(x, y, z) = 0. Lagrange’s method for solving this problem consists of forming a third function G(x, y, z) given by

17) G(x, y, z) = F(x, y, z) + λΦ(x, y, z) ,

where λ is a constant (i.e. a parameter) to which we will later assign a value, and then finding the maxima and minima of the function G(x, y, z). A reader might quickly ask, “Of what interest are the maxima and minima of the function G(x, y, z)? How does this help us solve the problem of finding the maxima and minima of F(x, y, z)?” The answer is that examination of 17) shows that for those points corresponding to the solution set of Φ(x, y, z) = 0 the function G(x, y, z) is equal to the function F(x, y, z) since at those points equation 17) becomes

G(x, y, z) = F(x, y, z) + λ·0 .

Thus, for the points on the surface Φ(x, y, z) = 0, functions F and G are equal so the maxima and minima of G are also the maxima and minima of F. The procedure for finding the maxima and minima of G(x, y, z) is as follows: We regard G(x, y, z) as a function of three independent variables and write down the necessary conditions for a stationary point using 1) above:

18) F₁ + λΦ₁ = 0, F₂ + λΦ₂ = 0, F₃ + λΦ₃ = 0

We then solve these three equations along with the equation of constraint Φ(x, y, z) = 0 to find the values of the four quantities x, y, z, λ. More than one point can be found in this way and this will give us the locations of the stationary points. The maxima and minima will be among the stationary points thus found.
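As a concrete sketch of this procedure, take F(x, y, z) = x² + y² + z² and Φ(x, y, z) = x + 2y + 2z - 9. This example is an assumption chosen for illustration and is not from the text. Equations 18) become 2x + λ = 0, 2y + 2λ = 0, 2z + 2λ = 0, and together with the constraint they form a linear system in x, y, z, λ. The helper solve_linear below is a hypothetical Gauss-Jordan routine written for this example; in general equations 18) are nonlinear and need other methods.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Unknowns in order: x, y, z, lambda.
a = [[2, 0, 0, 1],   # F1 + lambda*Phi1 = 2x + lambda   = 0
     [0, 2, 0, 2],   # F2 + lambda*Phi2 = 2y + 2 lambda = 0
     [0, 0, 2, 2],   # F3 + lambda*Phi3 = 2z + 2 lambda = 0
     [1, 2, 2, 0]]   # constraint: x + 2y + 2z = 9
b = [0, 0, 0, 9]

x, y, z, lam = solve_linear(a, b)
print(x, y, z, lam)          # 1 2 2 -2
print(x**2 + y**2 + z**2)    # 9
```

The single stationary point (1, 2, 2) is the point of the plane nearest the origin, so F = 9 is a minimum.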

Let us now observe something. If equations 18) are to hold simultaneously, then it follows from the third of them (assuming Φ₃ ≠ 0) that λ must have the value

λ = - F₃/Φ₃ .

If we substitute this value of λ into the first two equations of 18) we obtain

19) F₁Φ₃ - F₃Φ₁ = 0, F₂Φ₃ - F₃Φ₂ = 0

We note that the two equations of 19) are identically the same conditions as 8) above for the previous method. Thus using equations 19) along with the equation of constraint Φ(x, y, z) = 0 is exactly the same procedure as the previous method in which we used equations 8) and the same constraint.

One of the great advantages of Lagrange’s method over the method of implicit functions or the method of direct elimination is that it enables us to avoid making a choice of independent variables. This is sometimes very important; it permits the retention of symmetry in a problem where the variables enter symmetrically at the outset.

Lagrange’s method can be used with functions of any number of variables and any number of constraints (smaller than the number of variables). In general, given a function F(x₁, x₂, ... , x_n) of n variables and h side conditions Φ₁ = 0, Φ₂ = 0, ... , Φ_h = 0, subject to which this function may have a maximum or minimum, equate to zero the partial derivatives of the auxiliary function F + λ₁Φ₁ + λ₂Φ₂ + ... + λ_hΦ_h with respect to x₁, x₂, ... , x_n , regarding λ₁, λ₂, ... , λ_h as constants, and solve these n equations simultaneously with the given h side conditions, treating the λ’s as unknowns to be eliminated. (Here Φ₁, Φ₂, ... , Φ_h denote h different constraint functions, not partial derivatives of a single function Φ.)
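A small worked case with two side conditions, chosen for illustration and not taken from the text: let F = x² + y² + z² with Φ₁ = x + y - 2 = 0 and Φ₂ = y + z - 2 = 0, where Φ₁ and Φ₂ are two constraint functions. Setting the partial derivatives of the auxiliary function F + λ₁Φ₁ + λ₂Φ₂ equal to zero and adjoining the side conditions gives

```latex
\begin{aligned}
2x + \lambda_1 &= 0, \\
2y + \lambda_1 + \lambda_2 &= 0, \\
2z + \lambda_2 &= 0, \\
x + y - 2 &= 0, \\
y + z - 2 &= 0 .
\end{aligned}
```

The first three equations give y = x + z; the side conditions then reduce to 2x + z = 2 and x + 2z = 2, so that

```latex
x = z = \tfrac{2}{3}, \qquad y = \tfrac{4}{3}, \qquad
\lambda_1 = \lambda_2 = -\tfrac{4}{3}, \qquad
F = \tfrac{4}{9} + \tfrac{16}{9} + \tfrac{4}{9} = \tfrac{8}{3} .
```

As a check by direct elimination, the side conditions give x = z = 2 - y, so F = 2(2 - y)² + y², whose derivative 6y - 8 vanishes at y = 4/3, in agreement with the result above.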

The parameter λ in Lagrange’s method is called Lagrange’s multiplier.
