American Journal of Applied Mathematics
Volume 5, Issue 1, February 2017, Pages: 19-30

On Modified DFP Update for Unconstrained Optimization

Saad Shakir Mahmood, Samira Hassan Shnywer

Department of Mathematics, College of Education, Almustansiryah University, Baghdad, Iraq

(S. S. Mahmood)
(S. H. Shnywer)

Saad Shakir Mahmood, Samira Hassan Shnywer. On Modified DFP Update for Unconstrained Optimization. American Journal of Applied Mathematics. Vol. 5, No. 1, 2017, pp. 19-30. doi: 10.11648/j.ajam.20170501.13

Received: December 25, 2016; Accepted: January 9, 2017; Published: February 6, 2017

Abstract: In this paper, we propose a new modify of DFP update with a new extended quasi-Newton condition for unconstrained optimization problem so called update. This update is based on a new Zhang Xu condition we show that  update preserves the value of determinant of the next Hessian matrix equal to the value of determinant of current Hessian matrix theoretically and practically. Global convergence of the modify is established. Local and super linearly convergence are obtained for the proposed method. Numerical results are given to compare a performance of the modify method with the standard DFP method on same function is selected.

Keywords: Quasi-Newton Equation, the DFP Updating Formula, Global Convergence and Super Linearly Convergence

Contents

1. Introduction

The quasi-Newton methods are very useful and efficient methods for solving the unconstrained minimization problem

(1)

Where is twice continuously differentiable. Starting from point  and a symmetric positive definite matrix  quasi-Newton method generates sequence { and { by the iteration of the form

(2)

where the update  satisfies the following famous quasi-Newton equation or (secant equation):

(3)

with

(4)

where  is the step length and  is the search direction that is obtained by solving the equation:

(5)

in which  is the gradient of  at  and  is an approximation to the Hessian matrix  The updating matrix  is required to satisfy the usual quasi-Newton equation (3) with equation (4). So  is reasonable approximation to .

The  update consist of iteration of the form (2) where  is the search direction which of the form (5) and the Hessian approximation B is update by the ( formula with quasi-Newton equation (3).

(6)

where (7)

The formula is the modifying of DFP update which is satisfy equation (3) and in the next section.

In the following discussion, we shall use  and  to denote the -norm and the frobenius norm, respectively. For a symmetric positive definite matrix , we show also use the following weighted norm

(8)

2.  Update

Let  and from Zhang Xu condition, we have:

which gives  (9)

From quasi-Newton equation (3), we get:

which is extended of quasi-Newton equation (10)

Now, substitution (9) in (10), we obtain:

(11)

Let we consider the determinant of Hessian matrix for the DFP update [7] with replace each  by , we get:

Det  (12)

And from equation (9), we obtain:

Det  (13)

Which gives:

Det  (14)

we suppose that Det .

Then the equation (14) becomes:

Now, by multiplying both sides with  we get:

(15)

Hence

.

Finally:

(16)

In addition, by more simplifying we can write  as follows:

(17)

which is equivalent the following formula:

(18)

and also the formula (17) is equivalent the following formula:

(19)

where  is defined by (9). It is clear that any formula is symmetric and satisfies the quasi-Newton equation.

3. Convergence Analysis

Now, we study the global convergence of the  update:

At the first, we need the following assumptions:

Assumption (3.1)

: f: is twice continuously differentiable on convex set

is uniformly convex, i.e., there exist positive constants n and N such that for all , which is convex, where  is starting point, we have:

(20)

The assumption (B) implies that is positive definite on ,

and that f has a unique minimizer in L.

By definition of weighted norm (8) and equation (9), satisfy extended (Q-N) equation then and  that is then

(21)

where .

Now by return to the property (20) and from definition of weighted norm, we get:

(22)

Where  is the average Hessian, which is defined as:

(23)

and

(24)

Since also

(25)

Assumption (B) of (3.1) means that  is positive definite proven, thus its square root is well defined. There is a symmetric square root  is satisfying

If we let , then (26)

(27)

Substitution equation (26) in equation (27), we get

(28)

And from (20) , we know  Then we can divide both sides from this, we get:

That mean

(29)

Then from equation (28), we get:

(30)

In addition, from equations (21) and (9), we get:

(31)

Which gives:

(32)

And

(33)

Therefore, from the above discussion, we have:

Lemma (3.2)

Let f:  satisfy Assumption (3.1). Then

Are bounded.

Hence, we have

Are bounded.

Lemma (3.3)

Under exact line search,  and  are convergent.

Which gives:

is convergent and bounded.

From lemma (3.2), we have

is convergent and bounded.

Since  is convergent, we get:

is convergent, then

Which implies  is convergent, where  is the minimum of .

Lemma (3.4)

For all vector x, the inequality

(34)

holds, where  is the minimum of .

Proof:

Since the function

is a convex function, we have:

In particular, set , then we have

Which gives

By multiplying both sides with (-1) we get

By Cauchy-Schwarz inequality, we get

(35)

From (24) and (9), we have

(36)

Hence

Let

(37)

Substituting (37) into (35) establishes (34).

Theorem (3.5)

Suppose that  satisfies Assumption (3.1). Then under exact line search the sequence { generated by method converges to the minimizer  of f.

Proof: Consider  formula of inverse Hessian approximation

(38)

and from  formula (19) of Hessian approximation

(39)

Obviously,  By computing the trace of (39), we have

(40)

The middle two terms can be written as:

From equation (4) and (5), we have

From the property of the DFP method [5] , we obtain:

From equation (4)and (5) again, we get

(41)

Fromthe positive definiteness property of , then (38) becomes:

Which gives:

By finding the inverse number of the expression, we obtain

(42)

using (41) and (42), then (40) becomes:

(43)

By recurrence, we obtain:

(44)

Therefore, by lemma (3.2), there exists a positive number N which is independent of k, such that

(45)

In the left part, we will prove that if the theorem does not hold, then the sum of the last two term in (45) is negative.

Now consider the trace of , from (38), we have

(46)

Since  is positive definite, the right hand side of (46)is positive. By lemma (3.2) there exists  which is independent of k such that

(47)

Note that

(48)

and

(49)

By the positive definiteness of  and exact line search, then by using (49), (48) and (47) in turn, we obtain:

(50)

By using Cauchy-Schwarz inequality and (50)

(51)

Now suppose that the theorem is not true, that is, there exists  such that for all sufficiently large k,

(52)

Also, by lemma (3.3), there exists a constant  such that

which gives  and further . Then, by (51) and (52) we deduce, for any k sufficiently large, that

(53)

The above inequality implies that the sum of the last two terms in (45) is negative.

By (53) and (45), we immediately obtain:

(54)

Note that, for a symmetric positive definite matrix, the inverse of trace is the lower bound of the last eigenvalue of inverse of the matrix. Then, it follows from (54) that

(55)

Where  is the lower bound of the last eigenvalue of . However, from the property of Rayleigh quotient [9], we have

(56)

which contradicts (55). This contradiction proves that { converges to  and that our theorem holds.

4. Local Linear Convergence of  Method

Now, we shall prove the local linear convergence of  method of equivalent formula (18) for α>0 under exact line search.

The  iteration we consider is:

(57)

(58)

So that replace  and  by  and  respectively.

In this discussion of this subsection, we need the following assumption:

Assumption (4.1)

The mapping  is continuously differentiable in open convex set

: There is  in D such that and is nonsingular. : satisfies the Lipshiz condition at  that is, there exists a constant  such that

,

We begin with some general converges results.

Lemma (4.2) [9]

Let satisfy assumption (A). Then for any    we have

(59)

Furthermore, assume that  satisfy assumption (C), then

(60)

and

(61)

where .(62)

Lemma (4.3)

Let ,. If

then  is positive and

(63)

Conversely, if  and (63) holds, then

(64)

Theorem (4.4) [9]

Let  is satisfy the assumptions and  in (4.1), U an update function such that for all  and

, we have (65)

(66)

where is some constant, or that

(67)

where  and  are some constants, and

(68)

Then, there exist constants and such that, for all  and , the iteration (57) and (65) is well-defined, and  converges to linearly.

To study the local convergence of method, it is required to estimate

As show in the following theorem, there is a matrix

in Since

(69)

It is a secant of the angle between  and . In general,  and  is not parallel. So may be quite big, and it is not suitable to estimate  by means of norm. However, near ,

closes a quadratic function, and hence  and  are approximately parallel, where . In motivates us some weighted norm to estimate . Then, we have

(70)

Below, we first develop the linear convergence of .

Theorem (4.5)

Let  satisfy Assumption  in (4.1). Also let

(71)

In a neighborhood of , where , then, there exist and such that for and , the iteration (57) and equivalent formula (18) of  method is well-defined, and the produced sequence { converges to linearly.

Proof:

Based on the lemma (4.4), to prove the linearly convergence of  method, it is enough to prove

(72)

where  and  are positive constants independent of  and , is defined by (68).

Le and  which is  and  respectively, also are symmetric positive definite matrices.

From (57) and the formula (18) of , it follows that

(73)

where

(74)

Thus, from (73), one has

(75)

Note that  is defined by (69).

The first term on the right hand side of (75) can be estimate as:

(76)

moreover, for the rest two terms on the right hand side of (75) and by (70) we have:

(77)

and

(78)

where

(79)

which by lemma (4.3) implies the curvature condition

Now, we estimate  by using (76), (77) and (78), we have

(80)

Note from lemma (4.2) that

(81)

Since

By lemma (4.3), we have:

Consequently, if  and  are in the neighborhood of , then

So, the two terms in (80) satisfy respectively.

(82)

and

(83)

combining (82) with (83) into (80), we have:

Which completes the proof by applying lemma (4.4) with  and

5. Super Linear Convergence of  Method

Now, we shall prove the super linear convergence of the  method. the convergence analysis in this section mainly Dennis and Mor'e [2]. The super linear convergence of the sequence generated by the iteration (57) is generally characterized by the following theorem.

Theorem (5.1) [2]

Let  is satisfy  and  in Assumption (4.1). Let { be a sequence, of nonsingular matrices. Suppose for , that the iteration generated by (57) remain in D. . Suppose also that { converges to  Then {converges to  at super linear rate if and only if

(84)

Theorem (5.1) indicates that if  converges to along the direction then method converges super linearly. This theorem is very important in analysis of . Equation (84)is called the Dennis

and Mor'e characterization of super linear convergence.

To apply theorem (5.1), we need a refinement estimate  which is established with the help of the following lemmas.

Lemma (5.2)

Let  be a nonsingular symmetric matrix, If, for ,

The inequality

(85)

holds, then for any non-zero matrix , we have:

(86)

(87)

(88)

Where

(89)

Proof:

Note that

(90)

Also, it follows from Cauchy-Schwarz inequality and (85) that

(91)

From (90) and (91), we get

And, in same way, we have

Which gives the first result (a).

Now, we will prove (b) By using the property of the Frobenius norm of a rank-one update.

To prove (b) we need the following property

In particular,

Let

By using (a) and (89), we get:

And therefore

from (89) again, we get:

Which shows (b):

Finally, we prove  by means of (b). It enough to prove that

(92)

from (a), we have:

(93)

which proves (c).

We have known that if  satisfies Assumption (4.1), then (72) holds.

Then under Assumptions of the theorem (4.5), the preceding lemma can be applied with the sitting

, ,

,

and

Due to the linear convergence of the sequence {gives in theorem (4.5), we have  and can sequnetly by lemma (5.3), there exists a constant  such that

(94)

Hence

,

is exists.

Lemma (5.3):

Let { and { be sequences of nonnegative numbers satisfying

(95)

and

(96)

then { is converges.

These results together then give rise to a refinement estimate of  as follows:

Lemma (5.4)

Under the assumption of theorem(4.5), there exist positive constants and , such that , we have

(97)

where  is defined by (68) and

(98)

Proof:

First m we write . From (75), we have:

Let

(99)

And

Similar to the proof of the theorem (4.5), we known that there exists  and  such that

If we let , then (75) becomes:

(100)

Since

Then, by use of lemma (5.2), and from (99), we get:

Note that  thus, by using lemma (5.2) again, we obtain:

Where  is defined by (98), From lemma (5.2) again and by (89), we get

(101)

Where  Substitution (101) into (100) and from (99), we deduce the desired result (97). The proof is complete

Finally, using the above four lemmas, we can establish the following super linear convergence theorem of method.

Theorem (5.5)

Under the assumption of theorem (4.5), method defined by (57) and (58) is convergent super linearly.

Proof:

Since  then (95) can be written as:

Summing the above from k = 1 to infinity gives:

Since, from theorem (4.5), { is linearly convergent, then

Also, since { is bounded, then

By (94), the  exists.

Hence, if some subsequence of {converges to zero.

The whole sequence converges to zero.

Therefore,

Which proves the super linear convergence of {by theorem (5.1). Otherwise, there exists a positive constant such that

then

Since it follows that

Furthermore, we have:

Where  is defined by (98).

Then, by using , we immediately obtain:

Hence, { is convergent super linearly, we complete the proof.

6. Numerical Results

This section is devoted to numerical experiments. Our purpose was to check whether the modified algorithm provide improvements on the corresponding standard DFP algorithm. The programs were written in MATLAP. The reason for their selection is that the problems appear to have been used in standard problems in most the literature these functions represent a result of application in the branch of technology and industry.

The test functions are chosen as follows:

. [1]

Rosen brook's function. [4]

.

Rosenbroc'k cliff function [8]

.

Generalized Edger function. [1]

.

Extended Himmelbla function [1]

Rosen rock's function [6]

Trigonometric function [1]

.

Extended Rosen rock function [1]

. c = 100

Watson function [6]

.

. and

Freudenstein and Roth function [3]

Table 1. Numerical results for DFP and  update.

 Fun. Starting point Dim. DFP The Best Feval Iter. Feval Iter. 1 2 6.0962e-016 2 1.7909e-017 2 DFP 1 2 2.7900e-020 2 2.7900e-020 2 Same 1 2 3.2494e-016 2 2.9082e-017 2 DFP 2 4 1.5076e-007 9 9.0146e-007 7 2 4 1.8142e-013 3 7.3965e-011 3 2 4 7.9747e-008 34 7.2012e-006 9 3 2 1.7525e-018 20 1.6887e-016 14 3 2 1.5615e-016 3 7.5140e-015 3 3 2 3.0198e-o17 14 2.4041e-017 14 Same 4 4 0.2011 3 0.2011 3 Same 4 12 0.2004 3 0.2004 3 Same 4 12 0.2007 3 0.2007 3 Same 5 2 5.5433e-015 8 8.8715e-014 3 5 18 2.9846e-008 9 3.6384e-008 6 5 18 1.4714e-010 7 5.3611e-009 6 6 2 8.1785e-012 7 3.2157e-011 6 6 2 1.3697e-013 8 3.1675e-009 8 6 2 1.1910e-012 8 2.7507e-012 8 7 8 2.0658e-011 6 1.9815e-010 24 7 8 5.1583e-012 6 3.1487e-011 8 7 4 1.7728e-010 2 1.7728e-010 2 Same 8 12 7.6464e-007 13 4.6941e-006 5 8 12 3.7119e-006 13 3.7598e-006 4 8 12 6.8965e-007 21 1.8229e-006 5 9 3 0 1 0 1 Same 9 3 0.0220 40 0.0118 9 9 2 3.5522 31 40.2765 8 10 4 6.0707e-017 3 1.0838e-016 3 10 4 2.1626e-013 3 3.5351e-008 3 10 10 1.6053e-010 2 5.5796e-010 2 11 2 5.7785e-010 12 2.2977e-010 8 11 2 2.3185e-013 15 1.7572e-013 9 11 2 1.6761e-012 13 8.3077e-012 10

7. Conclusion

In this thesis, we introduce a new modified of the DFP say  update, we show that under certain circumstances this update preserve the value of the determinant of hessian matrix and without Quasi-Newton or based on the Zhang Xu condition.

Global convergence of the proposed method establishes under exact line search. The proposed method possesses local linearly convergence and super linearly convergence for unconstrained optimization problem.

Numerical results show that the proposed is efficient for unconstrained optimization problem compared the modified  method with the standard DFP method on same function is selected, which suggests that a good improvement has been achieved.

References

1. Al-Bayati, A., (1991), A new family of self-scaling variable metric Algorithms for unconstraint optimization, Journal of Educ. and Sci., Iraq, Vol. 12, pp. 25-54.
2. Dennis J. E., More J., (1974), "A characterization of super linear Convergence and its application to quasi-Newton methods Math and Computation 28 (6) 549-60.
3. F. Freudenstein and B. Roth, (1962), Numerical solution of system of nonlinear equations, Journal of ACM, Vol. 10, No. 4, pp. 550-556.
4. H. H. Rosen brock, (1960), An automatic method for finding the greatest least value of a function, Computer Journal, Vol. 3, No. 3, pp. 175-184.
5. J. E. Dennis, Jr. and Robert B. Schnabel, (1996), Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Classics in Applied Mathematics.
6. Oren, S. S (1973), Self-scaling Variable Metric Algorithms without line search for Unconstrained minimization.Mathematics of computation, 27: 873-885.
7. Saad S. Mahmood, (2011), update for Unconstrained Opt-imization, Journal of college of Education, No.1, AlmustansiriyaUniversity.
8. Todd M. J., (1984), Quasi-Newton updates in abstract spaces, SIAM Review, 26: 367-377.
9. W. Sun and Y. Yuan, (2006), "Optimization Theory and Method: Nonlinear Programing", Vol. 1 of Springer optimization and its Applications, Springer, New York, NY, USA.
10. Yuan, Y., (1990), On a Modified Algorithm for Unconstrained Optimization, Computing Center, Academia Sinica, Beijing, China.

 Contents 1. 2. 3. 4. 5. 6. 7.
Article Tools