A descent modified Hager-Zhang conjugate gradient method and its global convergence

In this paper, based on the memoryless BFGS quasi-Newton method, we propose a new modified Hager-Zhang (HZ) type method. An attractive property of the proposed method is that the direction generated by the method is always a descent direction for the objective function. Moreover, if the exact line search is used, the new method reduces to the ordinary HS method. Under appropriate conditions, we show that the modified HZ method is globally convergent for convex and general functions. Numerical results are also reported.

1-Introduction.
We consider the unconstrained nonlinear optimization problem:
$$\min f(x), \quad x \in \mathbb{R}^n, \qquad (1)$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is a smooth, nonlinear function whose gradient will be denoted by $g$. The nonlinear Conjugate Gradient (CG) method is one of the most effective methods for solving the unconstrained nonlinear optimization problem (1). Its iterative formula is given by
$$x_{k+1} = x_k + \alpha_k d_k, \qquad (2)$$
and
$$d_{k+1} = -g_{k+1} + \beta_k d_k, \qquad d_1 = -g_1, \qquad (3)$$

where $d_k$ is the search direction and $\alpha_k$ is a step-length computed by carrying out a line search. The main step-length rules are as follows:
1. Armijo-Goldstein rule. Find an $\alpha_k > 0$ such that
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta\,\alpha_k\, g_k^T d_k \qquad (4)$$
and
$$f(x_k + \alpha_k d_k) \ge f(x_k) + (1-\delta)\,\alpha_k\, g_k^T d_k, \qquad (5)$$
where $0 < \delta < 1/2$.
2. Weak Wolfe-Powell rule (WWP). Find an $\alpha_k > 0$ satisfying (4) and
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma\, g_k^T d_k, \qquad (6)$$
where $0 < \delta < \sigma < 1$.
3. Strong Wolfe-Powell rule (SWP). Find an $\alpha_k > 0$ satisfying (4) and
$$|g(x_k + \alpha_k d_k)^T d_k| \le -\sigma\, g_k^T d_k; \qquad (7)$$
see (Hong et al., 2009). In (3), $\beta_k$ is a scalar and $g_k$ denotes $g(x_k)$. There are some famous formulas for $\beta_k$, such as
$$\beta_k^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}, \qquad (8)$$
$$\beta_k^{PR} = \frac{g_{k+1}^T y_k}{\|g_k\|^2}, \qquad (9)$$
$$\beta_k^{HS} = \frac{g_{k+1}^T y_k}{d_k^T y_k}, \qquad (10)$$
$$\beta_k^{DY} = \frac{\|g_{k+1}\|^2}{d_k^T y_k}, \qquad (11)$$
where $y_k = g_{k+1} - g_k$ and $\|\cdot\|$ denotes the Euclidean norm. Hager and Zhang (Hager & Zhang, 2005) proposed a new formula as follows:
$$\beta_k^{HZ} = \frac{1}{d_k^T y_k}\left(y_k - 2 d_k \frac{\|y_k\|^2}{d_k^T y_k}\right)^T g_{k+1}. \qquad (12)$$
Obviously, formula (12) can be rewritten as
$$\beta_k^{HZ} = \beta_k^{HS} - 2\,\frac{\|y_k\|^2}{d_k^T y_k}\cdot\frac{g_{k+1}^T d_k}{d_k^T y_k}. \qquad (13)$$
This method can be regarded as a modified HS method. Hager and Zhang proved that this method with the Wolfe line search converges globally (Hager & Zhang, 2006). Other interesting results about the global convergence of CG methods can be found in (Dai & Yuan, 1996), (Dai, 1988), (Sun & Zhang, 2001) and (Liu et al., 1995).
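To make the iteration (2)-(3) with the HZ update concrete, here is a minimal Python sketch. It is an editorial illustration, not the authors' Fortran implementation; it assumes SciPy's strong-Wolfe line search as a stand-in for the SWP rule (4) and (7).

```python
import numpy as np
from scipy.optimize import line_search

def beta_hz(g_new, g_old, d):
    # Hager-Zhang scalar, formulas (12)/(13)
    y = g_new - g_old
    dy = d @ y
    return (y - 2.0 * d * (y @ y) / dy) @ g_new / dy

def cg_hz(f, grad, x0, eps=1e-6, max_iter=10_000):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                      # d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:
            break
        # SciPy's line_search enforces the strong Wolfe conditions
        alpha = line_search(f, grad, x, d)[0]
        if alpha is None:                       # line search failed: restart
            d, alpha = -g, 1e-4
        x_new = x + alpha * d                   # iteration (2)
        g_new = grad(x_new)
        d = -g_new + beta_hz(g_new, g, d) * d   # iteration (3) with (12)
        x, g = x_new, g_new
    return x

# Usage sketch on a standard test function:
#   from scipy.optimize import rosen, rosen_der
#   x_star = cg_hz(rosen, rosen_der, np.full(10, -1.0))
```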
Our paper is organized as follows. In Section 2 we recall the memoryless BFGS method. In Section 3 we derive the new modification of the HZ method and present a new algorithm; the sufficient descent property of the new algorithm is also proved in that section. In Section 4 we establish the global convergence of the new algorithm for convex and general functions. The preliminary numerical results are contained in Section 5.

2-The memoryless quasi-Newton method.
In order to introduce our method, let us briefly recall the well-known BFGS quasi-Newton method. This type of method was suggested for the first time by Perry (Perry, 1978). He noted that in eq. (3) the scalar $\beta_k$ was chosen to make $d_k$ and $d_{k+1}$ conjugate under an exact line search; since, in general, the line search is not exact, Perry relaxed this requirement and rewrote eq. (3), with $\beta_k$ defined by (10), in an equivalent form assuming an inexact line search, thus obtaining
$$d_{k+1} = -Q_{k+1}\, g_{k+1}, \qquad Q_{k+1} = I - \frac{d_k y_k^T}{d_k^T y_k}, \qquad (14)$$
but this matrix is not of full rank; hence he modified (14) as
$$Q_{k+1} = I - \frac{d_k y_k^T}{d_k^T y_k} + \frac{d_k d_k^T}{d_k^T y_k}. \qquad (15)$$
Then (Shanno, 1978) addressed the issue that (15) does not satisfy the actual QN condition, so he modified it in order to make it do so, obtaining
$$Q_{k+1} = I - \frac{d_k y_k^T + y_k d_k^T}{d_k^T y_k} + \left(1 + \frac{y_k^T y_k}{d_k^T y_k}\right)\frac{d_k d_k^T}{d_k^T y_k}. \qquad (16)$$
This new form of the projection matrix $Q_{k+1}$ has a special relationship with the BFGS update formula, which is defined by (Dennis & Moré, 1974, 1977) and (Al-Bayati, 1996) as
$$H_{k+1} = H_k - \frac{H_k y_k d_k^T + d_k y_k^T H_k}{d_k^T y_k} + \left(1 + \frac{y_k^T H_k y_k}{d_k^T y_k}\right)\frac{d_k d_k^T}{d_k^T y_k}. \qquad (17)$$
It is easily seen that (16) is equivalent to (17) with $H_k$ replaced by $I$ (i.e., if $H_k \equiv I$, where $I$ is the identity matrix); with this replacement the BFGS method becomes the memoryless BFGS method. The CG method which is referred to as the memoryless BFGS method is therefore defined by
$$d_{k+1} = -Q_{k+1}\, g_{k+1} = -g_{k+1} + \left(\frac{g_{k+1}^T y_k}{d_k^T y_k} - \left(1 + \frac{y_k^T y_k}{d_k^T y_k}\right)\frac{g_{k+1}^T d_k}{d_k^T y_k}\right) d_k + \frac{g_{k+1}^T d_k}{d_k^T y_k}\, y_k. \qquad (18)$$
The above equality can be rewritten in the equivalent forms (19) and (20); for more details see (Zhang, 2006) and (Zhang, 2009). Since an exact line search ensures $g_{k+1}^T d_k = 0$, (20) then reduces to the standard HS direction; that is, with exact line search the memoryless BFGS method reduces to a nonlinear version of the Hestenes-Stiefel CG scheme. (Zhang, 2009) replaced a term in this direction and obtained a new LS-type CG method, defined by (25); he called the method (2) with (25) a modified LS (MLS) method. It is clear that the MLS method reduces to the standard LS method if an exact line search is used, since this line search ensures $g_{k+1}^T d_k = 0$.
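The equivalence of (16) and the expanded three-term form (18) can be sanity-checked numerically. The following short Python snippet is an editorial aid, not part of the original paper: it verifies that applying Shanno's matrix (16) to $-g_{k+1}$ and evaluating (18) give the same direction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
d = rng.standard_normal(n)   # d_k
y = rng.standard_normal(n)   # y_k
g = rng.standard_normal(n)   # g_{k+1}
dy = d @ y

# Shanno's projection matrix (16), i.e. memoryless BFGS with H_k = I
Q = np.eye(n) - (np.outer(d, y) + np.outer(y, d)) / dy \
              + (1.0 + (y @ y) / dy) * np.outer(d, d) / dy
d_matrix = -Q @ g

# Expanded three-term form (18)
d_expanded = (-g
              + ((g @ y) / dy - (1.0 + (y @ y) / dy) * (g @ d) / dy) * d
              + ((g @ d) / dy) * y)

print(np.allclose(d_matrix, d_expanded))  # True
```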

3-A new Hager-Zhang conjugate gradient (MHZ) method.
(Al-Bayati, 1991) investigated another family of QN methods for which the updating matrix $H_{k+1}$ was defined by (31); this updating formula generates positive definite matrices. If we use the memoryless version (i.e., $H_k \equiv I$), the matrix takes the form (32), and substituting it into $d_{k+1} = -Q_{k+1} g_{k+1}$ yields the new search direction (33). We call the method given by (2) and (33) the MHZ method. Now, we present a concrete algorithm as follows:

Algorithm of the MHZ method.
Step 1: Set $k = 1$ and $d_1 = -g_1$.
Step 2: Set $x_{k+1} = x_k + \alpha_k d_k$, where $\alpha_k$ is a step-length chosen in such a way that the SWP conditions (4) and (7) hold.
Step 3: Check for convergence: if $\|g_{k+1}\| \le \epsilon$, where $\epsilon$ is a small positive tolerance, stop.
Step 4: Otherwise, if $k = n$ or the restart criterion $|g_{k+1}^T g_k| \ge 0.2\,\|g_{k+1}\|^2$ holds, compute the restart search direction $d_{k+1} = -g_{k+1}$, set $k = 1$, and go to Step 2. Else, set $k = k + 1$.
Step 5: Compute the new search direction defined in (33) and go to Step 2. A minimal Python skeleton of this loop is sketched below; after it, we establish that the search direction defined by (32) always yields a descent direction.
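The skeleton below is an editorial sketch, not the authors' Fortran code. Because formula (33) itself is not reproduced here, the scalar $\beta_k^{MHZ}$ is left as a user-supplied callable, and the restart test in Step 4 is assumed to be Powell's criterion.

```python
import numpy as np
from scipy.optimize import line_search

def mhz(f, grad, x0, beta_mhz, eps=1e-6, max_iter=10_000):
    # beta_mhz(g_new, g_old, d) must implement formula (33).
    x = np.asarray(x0, dtype=float)
    n = x.size
    g = grad(x)
    d = -g                                      # Step 1: k = 1, d_1 = -g_1
    k = 1
    for _ in range(max_iter):
        alpha = line_search(f, grad, x, d)[0]   # Step 2: SWP rules (4), (7)
        if alpha is None:                       # line search failed
            alpha = 1e-4
        x = x + alpha * d
        g_new = grad(x)
        if np.linalg.norm(g_new) <= eps:        # Step 3: convergence test
            return x
        # Step 4: restart if k = n or Powell's criterion fires (assumed form)
        if k == n or abs(g_new @ g) >= 0.2 * (g_new @ g_new):
            d, k = -g_new, 1
        else:
            d = -g_new + beta_mhz(g_new, g, d) * d   # Step 5: direction (33)
            k += 1
        g = g_new
    return x
```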

Theorem
Let $\{x_k\}$ and $\{d_k\}$ be generated by the new HZ method (MHZ). Then the sufficient descent condition
$$g_k^T d_k \le -c\,\|g_k\|^2, \qquad c > 0, \qquad (35)$$
holds for all $k$.
Proof:
Expanding $g_k^T d_k$ via (33) and bounding the cross term with the inequality $u^T v \le \tfrac{1}{2}(\|u\|^2 + \|v\|^2)$, exactly as in the argument reproduced below for the original HZ direction, yields (35), which is the sufficient descent condition. $\square$
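For reference, here is the standard argument of (Hager & Zhang, 2005) for the original HZ direction; it is included editorially because it shows where a bound of the form (35) comes from, and the proof above follows the same pattern. With $d_{k+1} = -g_{k+1} + \beta_k^{HZ} d_k$ and (13),
$$g_{k+1}^T d_{k+1} = -\|g_{k+1}\|^2 + \frac{(g_{k+1}^T y_k)(g_{k+1}^T d_k)}{d_k^T y_k} - 2\,\frac{\|y_k\|^2 (g_{k+1}^T d_k)^2}{(d_k^T y_k)^2}.$$
Applying $u^T v \le \tfrac{1}{2}(\|u\|^2 + \|v\|^2)$ with $u = \tfrac{1}{2}(d_k^T y_k)\, g_{k+1}$ and $v = 2 (g_{k+1}^T d_k)\, y_k$ to the middle term gives
$$\frac{(g_{k+1}^T y_k)(g_{k+1}^T d_k)}{d_k^T y_k} \le \frac{1}{8}\,\|g_{k+1}\|^2 + 2\,\frac{\|y_k\|^2 (g_{k+1}^T d_k)^2}{(d_k^T y_k)^2},$$
hence $g_{k+1}^T d_{k+1} \le -\tfrac{7}{8}\,\|g_{k+1}\|^2$, i.e. a bound of type (35) with $c = 7/8$.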

4-The global convergence of the MHZ method.
In order to establish the global convergence result for MHZ, we impose the following assumption on $f$, which has been used often in the literature to analyze the global convergence of CG methods with inexact line searches.

Assumption (A). (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded. (ii) In some neighborhood $N$ of $\Omega$, $f$ is continuously differentiable and its gradient $g$ is Lipschitz continuous; namely, there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L\,\|x - y\| \quad \text{for all } x, y \in N.$$

Theorem
Suppose that Assumption (A) holds, and consider any CG method of the form (2) and (3) in which the direction $d_{k+1}$ given by (32) is a descent direction and $\alpha_k$ is obtained by the SWP rule. Then the Zoutendijk condition
$$\sum_{k \ge 1} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty \qquad (38)$$
holds. Combining (36), (37), (39), (40) and the SWP conditions yields (38); the chain of inequalities is the standard one reproduced below. Condition (38) is the tool we will use to conclude, by contradiction, that the gradients cannot be bounded away from zero (Nocedal & Gilbert, 1992).
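The standard derivation of (38) under Assumption (A) and the Wolfe conditions runs as follows (an editorial addition, following the usual Wolfe-condition analysis). From the curvature condition (6) and the Lipschitz continuity of $g$,
$$(\sigma - 1)\, g_k^T d_k \le (g_{k+1} - g_k)^T d_k \le L\,\alpha_k\, \|d_k\|^2,$$
so $\alpha_k \ge \dfrac{\sigma - 1}{L}\cdot\dfrac{g_k^T d_k}{\|d_k\|^2} > 0$, since $g_k^T d_k < 0$ and $\sigma < 1$. Substituting this lower bound into the sufficient decrease condition (4) gives
$$f(x_k) - f(x_{k+1}) \ge -\delta\,\alpha_k\, g_k^T d_k \ge \frac{\delta(1-\sigma)}{L}\cdot\frac{(g_k^T d_k)^2}{\|d_k\|^2},$$
and summing over $k$, using that $f$ is bounded below on the bounded level set $\Omega$, yields (38).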

Lemma
Suppose that Assumption (A) holds, and consider the method (2)-(3) with $\beta_k$ defined by (32), where the line search satisfies the Zoutendijk condition (38) and the sufficient descent condition (35), and where (46) holds. Then $d_k \ne 0$ for all $k$ and
$$\sum_{k \ge 1} \|u_{k+1} - u_k\|^2 < \infty, \qquad \text{where } u_k = \frac{d_k}{\|d_k\|}.$$

Proof:
We can rewrite (33) in the form $d_{k+1} = -g_{k+1} + \beta_k d_k$ with $\beta_k$ given by (32). Now we define
$$u_k = \frac{d_k}{\|d_k\|}, \qquad r_{k+1} = \frac{-g_{k+1}}{\|d_{k+1}\|}, \qquad \delta_k = \frac{\beta_k \|d_k\|}{\|d_{k+1}\|}.$$
From (3), we have for $k \ge 1$
$$u_{k+1} = r_{k+1} + \delta_k u_k. \qquad (43)$$
Using the identity $\|u_k\| = \|u_{k+1}\| = 1$ and (43), we have
$$\|r_{k+1}\| = \|u_{k+1} - \delta_k u_k\| = \|\delta_k u_{k+1} - u_k\| \qquad (44)$$
(the last equality can be verified by squaring both sides). Using the condition $\delta_k \ge 0$, the triangle inequality, and (44), we obtain
$$\|u_{k+1} - u_k\| \le (1 + \delta_k)\,\|u_{k+1} - u_k\| \le \|u_{k+1} - \delta_k u_k\| + \|\delta_k u_{k+1} - u_k\| = 2\,\|r_{k+1}\|. \qquad (45)$$
Now, using the SWP rule and (37), we get $\sum_{k \ge 1} \|r_{k+1}\|^2 < \infty$. This inequality together with (45) completes our proof.
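Identity (44) is easy to check numerically. The snippet below is an editorial aid: for unit vectors $u_k$, $u_{k+1}$ and any scalar $\delta_k$, both norms agree because both squares equal $1 + \delta_k^2 - 2\delta_k\, u_k^T u_{k+1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(5):
    u = rng.standard_normal(8); u /= np.linalg.norm(u)   # u_k, unit norm
    v = rng.standard_normal(8); v /= np.linalg.norm(v)   # u_{k+1}, unit norm
    delta = rng.uniform(0.0, 3.0)                        # delta_k >= 0
    # Both sides of (44); each square equals 1 + delta^2 - 2*delta*(u.v)
    print(np.isclose(np.linalg.norm(v - delta * u),
                     np.linalg.norm(delta * v - u)))     # True
```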

Property (*):
Consider a method of the form (2)-(3), and suppose that
$$0 < \gamma \le \|g_k\| \le \bar{\gamma} \qquad (46)$$
for all $k \ge 1$. Under this assumption we say that the method has Property (*) if there exist constants $b > 1$ and $\lambda > 0$ such that, for all $k$,
$$|\beta_k| \le b, \qquad \text{and} \qquad \|s_k\| \le \lambda \;\Rightarrow\; |\beta_k| \le \frac{1}{2b}, \qquad (47)$$
where $s_k = x_{k+1} - x_k$. See (Nocedal & Gilbert, 1992).
We use this property to show that, asymptotically, the search directions generated by Algorithm (3.1) with $\beta_k$ defined by (32) change slowly.

Lemma
Suppose that Assumption (A) holds. Consider the CG Algorithm (3.1) with the second SWP condition (7) and with (46). Then, for $b$ and $\lambda$ defined in (49) and (50), respectively, the relations in (47) hold; that is, the method has Property (*).
It is clear that many other choices of $\beta_k$ give rise to algorithms with Property (*). The next lemma shows that if the gradient is bounded away from zero, and if the method has Property (*), then a fraction of the steps cannot be too small; therefore the algorithm makes rapid progress toward the optimum. We let $N^*$ denote the set of positive integers and, for $\lambda > 0$, define the set of indices $K_\lambda = \{ i \in N^* : \|s_i\| > \lambda \}$. The following lemma is similar to Lemma 3.5 in (Dai, 2001) and Lemma 4.2 in (Nocedal & Gilbert, 1992). Before stating it, here is a worked instance of Property (*) for the PR formula (9).
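As an editorial illustration (this follows Gilbert and Nocedal's original argument, not the paper's own example), consider $\beta_k^{PR} = g_{k+1}^T y_k / \|g_k\|^2$ under Assumption (A) and (46). Then
$$|\beta_k^{PR}| \le \frac{\|g_{k+1}\|\,\|y_k\|}{\|g_k\|^2} \le \frac{\bar{\gamma}\,(\|g_{k+1}\| + \|g_k\|)}{\gamma^2} \le \frac{2\bar{\gamma}^2}{\gamma^2} =: b > 1,$$
and, using the Lipschitz bound $\|y_k\| \le L\,\|s_k\|$ and choosing $\lambda = \gamma^2/(2 b L \bar{\gamma})$, we get, whenever $\|s_k\| \le \lambda$,
$$|\beta_k^{PR}| \le \frac{\bar{\gamma}\, L\, \|s_k\|}{\gamma^2} \le \frac{\bar{\gamma}\, L\, \lambda}{\gamma^2} = \frac{1}{2b},$$
so Property (*) holds for the PR method under (46).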

Lemma
Suppose that all the assumptions of Lemma 4.4 are satisfied. Then there exists a $\lambda > 0$ such that, for any $\Delta \in N^*$ and any index $k_0$, there is a $k \ge k_0$ for which more than half of the indices $i \in \{k, \dots, k+\Delta-1\}$ satisfy $\|s_i\| > \lambda$. The proof of this lemma is similar to that of Theorem 3.6 in (Dai, 2001) or Theorem 3.2 in (Hager & Zhang, 2005) and is omitted here.

Theorem. Suppose that Assumption (A) holds and consider Algorithm (3.1), where: (i) $\beta_k = \beta_k^{MHZ}$; (ii) the line search satisfies the Zoutendijk condition (38) and the sufficient descent condition (35); and (iii) Property (*) holds. Then $\liminf_{k \to \infty} \|g_k\| = 0$; i.e., the new Algorithm (3.1) with $\beta_k^{MHZ}$ is globally convergent.

5-Numerical results
In this section, we compare the performance of the new MHZ method on a set of test functions. The codes were written in Fortran 77 in double-precision arithmetic. Our experiments are performed on a set of 25 nonlinear unconstrained test functions. These test functions are contained in CUTE (Bongratz et al., 1995) and (Andrei, 2008). All algorithms are implemented with the strong Wolfe-Powell line search conditions (4) and (7). All methods terminate when the stopping criterion of Step 3, $\|g_{k+1}\| \le \epsilon$, is met. In Table (1), we compare numerical results for the Shanno and MHZ methods. In order to summarize our numerical results, we are concerned only with the totals over the N different dimensions N = 100, 500, 1000, 5000, 10000 and 50000.
In Table (2), we compare the percentage performance of the new MHZ method against the Shanno method over the totals for our 25 test functions, while in Table (3) we compare the Shanno, Hestenes-Stiefel (HS), Hager-Zhang (HZ) and modified Hager-Zhang (MHZ) methods over the totals for N different dimensions (N = 100, 1000 and 10000) for each test function.
From these three tables we see that, on most test functions, the MHZ method performs considerably better than the others.
For the purposes of our comparisons we record the number of iterations (NOI), the number of function evaluations (NOF), and the dimension of each test problem (N).
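A Table (2)-style percentage comparison can be computed as in the short sketch below; this is an editorial illustration with placeholder totals, not the paper's actual data.

```python
# Placeholder totals (NOI, NOF) summed over all 25 test functions;
# the real values come from the Fortran runs reported in Table (1).
shanno_totals = {"NOI": 1000, "NOF": 2000}   # hypothetical baseline = 100%
mhz_totals    = {"NOI": 850,  "NOF": 1700}   # hypothetical MHZ totals

for metric in ("NOI", "NOF"):
    pct = 100.0 * mhz_totals[metric] / shanno_totals[metric]
    print(f"{metric}: MHZ is {pct:.1f}% of Shanno "
          f"({100.0 - pct:.1f}% improvement)")
```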
