Optimal CD-DY Conjugate Gradient Methods with Sufficient Descent Directions

Conjugate Gradient (CG) methods are widely used for large-scale unconstrained optimization problems. Most CG-methods do not always generate a descent search direction, so the descent condition is usually assumed in both the analysis and the implementation. In this paper we study several modified CG-methods based on the well-known CD CG-method, and show that the new proposed CG-methods produce sufficient descent directions and converge globally whenever the Wolfe conditions are satisfied. Moreover, they reduce to the original CD CG-method when the line searches are exact. The numerical results show that the new methods are effective and promising in comparison with the standard CD and DY CG-methods.


Introduction
We are concerned with the following unconstrained minimization problem:

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1.1)$$

where $f:\mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and its gradient $g(x) = \nabla f(x)$ is available. There are several kinds of numerical methods for solving (1.1), including the Steepest Descent (SD) method, the Newton method and Quasi-Newton (QN) methods. Among them, the CG-method is one choice for solving large-scale problems, because it does not need to store any matrices [13, 15]. CG-methods are iterative methods whose general form at the $k$-th iteration is

$$x_{k+1} = x_k + \alpha_k d_k, \qquad (1.2)$$

where the step-length $\alpha_k$ is positive and the directions $d_k$ are computed by

$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k, \qquad (1.3)$$

where $g_k$ denotes $\nabla f(x_k)$ and $\beta_k \in \mathbb{R}$ is a scalar parameter which characterizes the CG-method. If $f$ is a strictly convex quadratic function and the line search is exact, then the iterative method (1.2)-(1.3) is called the linear CG-method. Well-known formulas for $\beta_k$ are the Fletcher-Reeves (FR), Polak-Ribière-Polyak (PRP) and Hestenes-Stiefel (HS) formulas (see [8]; [15], [16]; and [11] respectively), and the Conjugate Descent (CD) [9], Liu-Storey (LS) [13] and Dai-Yuan (DY) [6] formulas. They are given by

$$\beta_k^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}, \qquad \beta_k^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2}, \qquad \beta_k^{HS} = \frac{g_{k+1}^T y_k}{d_k^T y_k},$$

$$\beta_k^{CD} = -\frac{\|g_{k+1}\|^2}{d_k^T g_k}, \qquad \beta_k^{LS} = -\frac{g_{k+1}^T y_k}{d_k^T g_k}, \qquad \beta_k^{DY} = \frac{\|g_{k+1}\|^2}{d_k^T y_k},$$

where $\|\cdot\|$ denotes the Euclidean norm and $y_k = g_{k+1} - g_k$. The global convergence properties of the FR, PRP and HS methods without regular restarts have been studied by many researchers, including Al-Baali [1], Gilbert and Nocedal [10], Zoutendijk [18], Liu et al. [12], Dai and Yuan [5], Powell [17] and Dai and Yuan [4]. To establish the convergence results of these methods, it is normally required that the step-length $\alpha_k$ satisfy the following strong Wolfe conditions:

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \qquad (1.10)$$

$$|g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k, \qquad (1.11)$$

with $0 < \delta < \sigma < 1$. Some convergence analyses even require that the step-size $\alpha_k$ be computed by an exact line search, namely

$$g(x_k + \alpha_k d_k)^T d_k = 0.$$

On the other hand, many other numerical methods for unconstrained optimization are proved to be convergent under the standard Wolfe conditions, in which (1.11) is replaced by $g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k$; see, for example, Fletcher [9]. Hence, it is interesting to investigate whether there exists a CG-method that converges under the standard Wolfe
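The generic iteration (1.2)-(1.3) with the six classical $\beta_k$ formulas can be sketched as follows. This is a minimal, dependency-free illustration, not the authors' code: a plain Armijo backtracking search replaces the Wolfe line searches discussed above, and the quadratic test problem at the end is an arbitrary illustrative choice.

```python
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))
def axpy(x, a, d): return [xi + a * di for xi, di in zip(x, d)]

def beta(rule, g_new, g_old, d):
    """The six classical CG parameters listed above; y_k = g_{k+1} - g_k."""
    y = [a - b for a, b in zip(g_new, g_old)]
    n2 = dot(g_new, g_new)
    if rule == "FR":  return n2 / dot(g_old, g_old)
    if rule == "PRP": return dot(g_new, y) / dot(g_old, g_old)
    if rule == "HS":  return dot(g_new, y) / dot(d, y)
    if rule == "CD":  return -n2 / dot(d, g_old)
    if rule == "LS":  return -dot(g_new, y) / dot(d, g_old)
    if rule == "DY":  return n2 / dot(d, y)
    raise ValueError(rule)

def cg(f, grad, x, rule="CD", tol=1e-6, max_iter=2000):
    """Iteration (1.2)-(1.3); Armijo backtracking is a simplification of
    the Wolfe line searches used in the paper."""
    g = grad(x)
    d = [-gi for gi in g]                          # d_0 = -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol:
            break
        alpha, fx, gtd = 1.0, f(x), dot(g, d)
        while alpha > 1e-12 and f(axpy(x, alpha, d)) > fx + 1e-4 * alpha * gtd:
            alpha *= 0.5                           # backtracking on (1.10)
        x = axpy(x, alpha, d)                      # (1.2)
        g_new = grad(x)
        b = beta(rule, g_new, g, d)
        d = [-gn + b * di for gn, di in zip(g_new, d)]   # (1.3)
        if (k + 1) % len(x) == 0 or dot(g_new, d) >= 0:
            d = [-gn for gn in g_new]              # safeguard: restart on non-descent
        g = g_new
    return x

# strictly convex quadratic test problem (illustrative, not from CUTE)
f = lambda v: v[0] ** 2 + 10 * v[1] ** 2
grad = lambda v: [2 * v[0], 20 * v[1]]
x_star = cg(f, grad, [3.0, -2.0], rule="CD")
```

The restart safeguard mirrors the motivation of the paper: the raw formulas do not always produce descent directions, so either a restart or a sufficient-descent modification is needed.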

conditions. Recently, CG-methods which satisfy the sufficient descent condition independently of the line search have received much attention, and such research falls into two types. The first type modifies the parameter $\beta_k$, and the second modifies the search direction $d_k$.

A New Type of CD CG-Method
In this section we first investigate how to determine a descent direction of the objective function. Let $x_k$ be the current iterate and let $d_k$ be defined by (2.1)-(2.2), where the new parameter $\beta_k^{New}$ depends on a scalar $\mu$. We note that the new method reduces to the standard CD method if the line search is exact. Furthermore, if $\mu = 0$ then $\beta_k^{New} = \beta_k^{CD}$, and if $\mu = 1$ then $\beta_k^{New} = \beta_k^{DY}$. In general, however, we prefer to use an inexact line search (such as the Wolfe line search). We first prove that $d_k$ is a sufficient descent direction when the new $\beta_k^{New}$ is used with $0 \le \mu \le 1$.
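The reduction to the CD method under exact line searches rests on the identity $g_{k+1}^T d_k = 0$, which makes the CD and DY denominators agree, so any combination of the two formulas collapses to CD. The check below verifies this numerically; the SPD matrix and starting point are arbitrary illustrative choices.

```python
# With an exact line search on a quadratic, g_{k+1}^T d_k = 0, hence
# d_k^T y_k = -d_k^T g_k and the CD and DY formulas coincide; a convex
# combination of them (one reading of the new parameter) reduces to CD.
def dot(u, v): return sum(a * b for a, b in zip(u, v))

A = [[4.0, 1.0], [1.0, 3.0]]                  # SPD matrix, f(x) = 0.5 x^T A x
matvec = lambda v: [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]

x = [2.0, -1.0]
g = matvec(x)                                 # gradient of the quadratic at x
d = [-gi for gi in g]                         # d_0 = -g_0
alpha = -dot(g, d) / dot(d, matvec(d))        # exact minimizer along d
x_new = [xi + alpha * di for xi, di in zip(x, d)]
g_new = matvec(x_new)

beta_cd = -dot(g_new, g_new) / dot(d, g)
y = [a - b for a, b in zip(g_new, g)]         # y_k = g_{k+1} - g_k
beta_dy = dot(g_new, g_new) / dot(d, y)
print(abs(dot(g_new, d)), abs(beta_cd - beta_dy))   # both ~ 0
```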

Lemma 2.1.
Suppose that $d_k$ is given by (2.1)-(2.2), and assume that $\alpha_k$ satisfies the strong Wolfe conditions (1.10)-(1.11) with $0 \le \mu \le 1$. Then $d_k$ satisfies the sufficient descent condition $g_k^T d_k \le -c\|g_k\|^2$ for some constant $c > 0$. Indeed, using the strong Wolfe condition (1.11) in (2.4) yields the required bound. Hence, from Lemma 2.1, it is known that $d_k$ is a sufficient descent direction of $f$ at $x_k$.

Computations of The New Scalar
We can compute the value of $\mu$ by three different approaches, and the results yield three different CG-methods:

Descent Direction
By Lemma 2.1, the direction $d_k$ defined in (2.3) is a descent direction. Now, we have:

Outline of the New 1 CG-Algorithm:
Step 1: Initialization: choose an initial point $x_0$ and set $d_0 = -g_0$, $k = 0$.
Step 6: Search Direction: compute the new search direction $d_k$ from (2.2); if the restart criterion of Powell, $|g_{k+1}^T g_k| \ge 0.2\|g_{k+1}\|^2$, is satisfied, restart with $d_{k+1} = -g_{k+1}$.
Step 7: Set $k = k + 1$ and go to Step 2.

Conjugacy Property
First, we recall the definition of conjugate directions: "a set of non-zero vectors $\{d_0, d_1, \ldots, d_n\}$ is called conjugate with respect to a nonsingular symmetric matrix $G$ if $d_i^T G d_j = 0$ for all $i \ne j$." The second way to compute the value of $\mu$ uses the conjugacy property. Dai and Liao [3] showed that for quadratic functions with an exact line search, (2.8) can be written as $d_{k+1}^T y_k = 0$, and with an inexact line search as $d_{k+1}^T y_k = -t\, s_k^T g_{k+1}$, $t \ge 0$. In this paper we suggest another value of $\mu$ by using (2.9):
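The conjugacy property can be observed numerically: for $f(x) = \tfrac12 x^T G x$ with exact line searches, linear CG generates directions satisfying $d_i^T G d_j = 0$ for $i \ne j$, and since $y_k = \alpha_k G d_k$ this gives $d_{k+1}^T y_k = 0$, the exact-line-search form of the Dai-Liao condition. A small dependency-free check with an arbitrary SPD matrix:

```python
# Linear CG (exact line search, FR = PRP = HS on a quadratic) produces
# pairwise conjugate directions: d_i^T G d_j = 0 for i != j.
def dot(u, v): return sum(a * b for a, b in zip(u, v))

G = [[6.0, 2.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]]   # SPD
matvec = lambda v: [sum(G[i][j] * v[j] for j in range(3)) for i in range(3)]

x = [1.0, 1.0, 1.0]
g = matvec(x)                       # gradient of f(x) = 0.5 x^T G x
d = [-gi for gi in g]
dirs = []
for _ in range(3):
    dirs.append(d)
    alpha = -dot(g, d) / dot(d, matvec(d))                 # exact line search
    x = [xi + alpha * di for xi, di in zip(x, d)]
    g_new = matvec(x)
    b = dot(g_new, g_new) / dot(g, g)                      # FR formula
    d = [-gn + b * di for gn, di in zip(g_new, d)]
    g = g_new

# largest pairwise conjugacy violation |d_i^T G d_j|, i != j
off = max(abs(dot(dirs[i], matvec(dirs[j])))
          for i in range(3) for j in range(3) if i != j)
print(off)  # ~ 0 (machine precision)
```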

Outline of The New 2 CG-Algorithm:
Step 1: Initialization: choose an initial point $x_0$ and set $d_0 = -g_0$, $k = 0$.
Step 6: Search Direction: compute the new search direction $d_k$ from (2.2).
Step 7: Set $k = k + 1$ and go to Step 2.

A Parallel Direction to the Newton Direction
The third way to compute another new value of the scalar $\mu$ is to require the search direction to be parallel to the Newton direction. As is well known, when the initial point $x_0$ is close enough to a local minimizer $x^*$, the best direction to follow from the current point $x_k$ is the Newton direction $d_k^N = -\nabla^2 f(x_k)^{-1} g_k$ (2.14). Therefore, our motivation is to choose the parameter $\beta_k^{New}$ in (2.2) so that the direction $d_k$ matches the best direction we know, i.e. the Newton direction. Hence, using the Newton direction (2.14) in (2.11):

Observe that the Newton direction is used here only as a motivation for the formula. The main drawback of formula (2.16) is the presence of the Hessian. One of the first CG-algorithms using the Hessian matrix was given by Daniel [7]. For large-scale problems, choices of the update parameter that do not require evaluating the Hessian are usually preferred in practice. As is well known, in QN-methods an approximation matrix is used and updated so that the new matrix $H_k$ satisfies the secant condition $H_{k+1} s_k = y_k$. This motivates a CG-algorithm in which the Hessian term is replaced via the secant condition; substituting $\mu_3$ from (2.17) into (2.2), we get (2.18).
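The substitution invoked above rests on a simple fact: the gradient difference $y_k = g_{k+1} - g_k$ equals the Hessian-times-step product $\nabla^2 f \, s_k$ exactly for a quadratic (and approximates it in general), so Hessian terms can be replaced by quantities already computed. A quick check with an arbitrary small Hessian and step:

```python
# Secant-condition check: for f(x) = 0.5 x^T H x, the gradient is g(x) = H x,
# so y_k = g_{k+1} - g_k = H (x_{k+1} - x_k) = H s_k holds exactly.
H = [[5.0, 1.0], [1.0, 3.0]]                           # Hessian of the quadratic
matvec = lambda v: [sum(H[i][j] * v[j] for j in range(2)) for i in range(2)]
grad = matvec                                          # g(x) = H x

x_old, x_new = [1.0, 2.0], [0.4, 1.1]                  # illustrative iterates
s = [a - b for a, b in zip(x_new, x_old)]              # s_k = x_{k+1} - x_k
y = [a - b for a, b in zip(grad(x_new), grad(x_old))]  # y_k = g_{k+1} - g_k

Hs = matvec(s)
err = max(abs(a - b) for a, b in zip(y, Hs))
print(err)  # ~ 0 (up to rounding)
```

For a non-quadratic $f$, $y_k = H s_k$ holds only approximately, which is exactly why the secant condition is the standard Hessian-free surrogate.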

Convergence Analysis
In this section, we study the global convergence of the new proposed CG-methods in their different versions. We first state the following mild assumptions, which will be used in the proof of the global convergence property.

Assumption (H).
(i) The level set $S = \{x : f(x) \le f(x_0)\}$ is bounded, where $x_0$ is the starting point. (ii) In a neighborhood $N$ of $S$, $f$ is continuously differentiable and its gradient $g$ is Lipschitz continuous; namely, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L\|x - y\|$ for all $x, y \in N$.

Obviously, from Assumption (H, i) there exists a positive constant $D$ such that $\|x - y\| \le D$ for all $x, y \in S$, where $D$ is the diameter of $S$. From Assumption (H, ii), we also know that there exists a constant $\gamma > 0$ such that $\|g(x)\| \le \gamma$ for all $x \in S$. In some studies of CG-methods, the sufficient descent condition or the descent condition plays an important role; unfortunately, this condition is hard to guarantee in general.

Theorem
In the CG-algorithm (1.2)-(1.3) with the new parameter, the direction $d_k$ is a descent one. Then, by the strong Wolfe condition (1.11) and the Powell restarting criterion, we get the required result. Hence the proof is completed.

Theorem
Under Assumptions (H, i) and (H, ii), suppose that $d_k$ is given by (2.1)-(2.2) for all $k$. Then, from (2.1), it follows that:

The 6th Scientific Conference of the College of Computer Sciences & Mathematics
That is, since $d_1 = -g_1$, it follows that
which contradicts the assumption of the theorem. Hence, the proof is complete.

Numerical Experiments
The main work of this section is to report the performance of the new methods on a set of test problems. The codes were written in Fortran in double-precision arithmetic, and all tests were performed on a PC. Our experiments used a set of 35 nonlinear unconstrained problems with available second derivatives. These test problems come from CUTE [2], and their details are given in the Appendix. For each test function we considered 10 numerical experiments with the number of variables $n = 100, 200, \ldots, 1000$. In order to assess the reliability of our new proposed methods, we tested them against the standard classical CD and DY CG-methods on the same test problems. All these methods terminate when the following stopping criterion is met.
We also force these routines to stop if the number of iterations exceeds 1000 or the number of function evaluations reaches 2000 without achieving the minimum. To summarize our numerical results, we report only the totals over the different dimensions $n = 100, 200, \ldots, 1000$ for all the measures used in these comparisons. The percentage performance of the three new methods against the CD and DY methods (each taken as 100%) is given in Tables (4.4)-(4.9).
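The bookkeeping just described (NOI and NOFG counters, a gradient-tolerance stop, and forced stops after 1000 iterations or 2000 evaluations) can be sketched as below. This is an illustrative harness, not the paper's Fortran driver: steepest descent with backtracking stands in for the methods under test, and the tolerance $10^{-6}$ is an assumed stand-in since the exact stopping criterion is elided above.

```python
import math

def run(f, grad, x, tol=1e-6, max_noi=1000, max_nofg=2000):
    """Return (x, NOI, NOFG): iterations and function/gradient evaluations."""
    noi = nofg = 0
    g = grad(x); nofg += 1
    while math.sqrt(sum(gi * gi for gi in g)) > tol:
        if noi >= max_noi or nofg >= max_nofg:
            break                                   # forced stop
        fx = f(x); nofg += 1
        gg = sum(gi * gi for gi in g)
        alpha = 1.0
        while True:                                 # Armijo backtracking
            trial = [xi - alpha * gi for xi, gi in zip(x, g)]
            ft = f(trial); nofg += 1
            if ft <= fx - 1e-4 * alpha * gg or alpha < 1e-12:
                break
            alpha *= 0.5
        x = trial
        g = grad(x); nofg += 1
        noi += 1
    return x, noi, nofg

# illustrative problem (not one of the 35 CUTE tests)
f = lambda v: v[0] ** 2 + 5 * v[1] ** 2
grad = lambda v: [2 * v[0], 10 * v[1]]
x, noi, nofg = run(f, grad, [1.0, 1.0])
```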

Concluding Remarks.
In this paper, we have proved the global convergence property of new proposed nonlinear CG-methods with the new parameters (2.6), (2.12) and (2.17). In general, they generate sufficient descent directions independently of the line search. Numerical results show that our three new proposed CG-methods are effective for solving large-scale unconstrained optimization problems.

Step 2: ... Step 3: ...
Step 4: Computation of the Line Search: compute $\alpha_k$ satisfying the strong Wolfe conditions (1.10)-(1.11) and then evaluate $f_{k+1}$ and $g_{k+1}$.
Step 5: Test for Convergence: if the stopping criterion is satisfied, then the iterations are stopped. If $\mu > 1$ then put $\mu = 1$ and set ...
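The line-search step above can be sketched as a simple bisection-style routine that returns a step satisfying the strong Wolfe conditions (1.10)-(1.11). This is an illustrative stand-in, not the routine used in the paper's code; here `phi(a)` $= f(x_k + a d_k)$ and `dphi(a)` $= g(x_k + a d_k)^T d_k$, with `dphi(0) < 0` required (a descent direction).

```python
def wolfe_step(phi, dphi, c1=1e-4, c2=0.1, max_iter=100):
    """Bisection/expansion search for a step satisfying strong Wolfe."""
    phi0, dphi0 = phi(0.0), dphi(0.0)
    lo, hi, a = 0.0, None, 1.0
    for _ in range(max_iter):
        if phi(a) > phi0 + c1 * a * dphi0:
            hi = a                          # Armijo (1.10) fails: shrink
        elif abs(dphi(a)) > -c2 * dphi0:    # curvature (1.11) fails
            if dphi(a) < 0:
                lo = a                      # still descending: step too short
            else:
                hi = a
        else:
            return a                        # both conditions hold
        a = (lo + hi) / 2.0 if hi is not None else 2.0 * a
    return a

# 1-D quadratic model along a descent direction: phi(a) = 49 - 50 a + 92.5 a^2
phi = lambda a: 49.0 - 50.0 * a + 92.5 * a * a
dphi = lambda a: -50.0 + 185.0 * a
a = wolfe_step(phi, dphi)
```

Production codes typically use the more elaborate bracketing-and-zoom procedure, but the bisection sketch is enough to see how (1.10) and (1.11) steer the step.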

Step 2: ...
Step 4: Computation of the Line Search: compute $\alpha_k$ satisfying the strong Wolfe conditions (1.10)-(1.11) and then evaluate ...; if the stopping criterion is satisfied, the iterations are stopped. If $\mu > 1$ then put $\mu = 1$ and set ...

Outline of the New 3 CG-Algorithm:
Step 2: ...
Step 4: Computation of the Line Search: compute $\alpha_k$ satisfying the strong Wolfe conditions (1.10)-(1.11) and then evaluate ...; if the stopping criterion is satisfied, the iterations are stopped. If $\mu > 1$ then put $\mu = 1$ and set ...
Step 5: Computation of the New Scalar Parameter: compute the new parameter $\mu_3^{New}$ from (2.18).
Step 6: Search Direction: compute the new search direction $d_k$ as in (2.2); apply the restart criterion of Powell if it is satisfied.

Tables (4.1)-(4.3) compare numerical results for the New 1 - New 3 CG-methods against the CD and DY CG-methods respectively. These tables report (n) the dimension of the problem; (NOI) the number of iterations; (NOFG) the number of function and gradient evaluations; and (CPU) the total time required to complete the evaluation process for each test problem. In Tables (4.4)-(4.9), we compare the percentage performance of the new CG-methods against the CD and DY methods, taking the latter over all measures as 100%.

Table 4.4:
Percentage Performance of Table (4.1) against CD-Method (%). Clearly, from the above table, we find that the new proposed method beats the classical CD method in about (53.8)% NOI, (51.6)% NOFG and (70.0)% Time.

Table 4.5:
Percentage Performance of Table (4.1) against DY-Method. Clearly, from the above table, we find that the new proposed method beats the DY method in about (53.4)% NOI, (51.4)% NOFG and (70.8)% Time.

Table 4.6:
Percentage Performance of Table (4.2) against CD-Method. Clearly, from the above table, we find that the new proposed method beats the classical CD method in about (45.1)% NOI, (43.3)% NOFG and (52.0)% Time.

Table 4.7:
Percentage Performance of Table (4.2) against DY-Method. Clearly, from the above table, we find that the new proposed method beats the DY method in about (44.6)% NOI, (43.0)% NOFG and (53.4)% Time.

Table 4.8:
Percentage Performance of Table (4.3) against CD-Method. Clearly, from the above table, we find that the new proposed method beats the CD method in about (42.6)% NOI, (41.7)% NOFG and (60.8)% Time.

Table 4.9:
Percentage Performance of Table (4.3) against DY-Method. Clearly, from the above table, we find that the new proposed method beats the DY method in about (42.0)% NOI, (41.3)% NOFG and (62.0)% Time.