A New Extended PR Conjugate Gradient Method for Solving Smooth Minimization Problems

In this paper, we discuss and investigate an extended PR-CG method which uses both function and gradient values. The new method belongs to the family of extended CG-methods and possesses the sufficient descent and global convergence properties under certain conditions. We have obtained some important numerical results using a standard computer program, compared against the Wu and Chen (2010) method in this field.


Introduction.
Our problem is to minimize a smooth nonlinear function of n variables:

min f(x), x ∈ R^n, ..........(1)

where f : R^n → R and its gradient ∇f(x) is available. At the current iterate x_k, the Conjugate Gradient (CG) method has the following form:

x_{k+1} = x_k + α_k d_k, ..........(2)
d_{k+1} = -g_{k+1} + β_k d_k, d_0 = -g_0, ..........(3)

where α_k is a step-length, d_k is a search direction, g_k = ∇f(x_k) and β_k is the conjugacy parameter. The CG-method has played a special role in solving large-scale nonlinear optimization problems due to the simplicity of its iterations and its very low memory requirements. Some well-known formulas for β_k are the Fletcher-Reeves (FR), Polak-Ribière (PR) and Hestenes-Stiefel (HS) methods, given respectively by:

β_k^FR = ||g_{k+1}||² / ||g_k||², β_k^PR = g_{k+1}^T y_k / ||g_k||², β_k^HS = g_{k+1}^T y_k / d_k^T y_k,

where y_k = g_{k+1} - g_k. The step-length α_k is required to satisfy either the conditions

f(x_k + α_k d_k) ≤ f(x_k) + ρ α_k g_k^T d_k, ..........(4a)
g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k, ..........(4b)

or the stronger pair

f(x_k + α_k d_k) ≤ f(x_k) + ρ α_k g_k^T d_k, ..........(5a)
|g(x_k + α_k d_k)^T d_k| ≤ -σ g_k^T d_k, ..........(5b)

with 0 < ρ < σ < 1. Equations (4a)-(4b) and (5a)-(5b) are called the "Standard Wolfe" and "Strong Wolfe" conditions, respectively. It has been shown by Dai and Yuan [32] that, for the FR scheme, the strong Wolfe-Powell conditions may not yield a direction of descent unless σ ≤ 1/2. In typical implementations of the Wolfe-Powell conditions, it is often most efficient to choose σ close to one; hence the constraint σ ≤ 1/2, needed to ensure descent, represents a significant restriction in the choice of the line search parameters. For the PR scheme, the strong Wolfe-Powell conditions may not yield a direction of descent for any choice of σ. Although all these methods are equivalent in the linear case, their behaviors for general objective functions may be far different. In the PR method, if a bad direction and a tiny step occur, the next direction and step are also likely to be poor unless a restart along the gradient direction is performed. For general functions, [19] proved the global convergence of the PR method with exact line search. On the other hand, the PR and HS methods perform similarly in terms of theoretical properties. Both methods are preferred to the FR method in numerical performance, because they essentially perform a restart after encountering a bad direction. Nevertheless, [25] showed that the PR and the HS
methods can cycle infinitely without approaching a solution, which implies that they do not possess the global convergence property. Therefore, over the past few years, much effort has been devoted to finding new formulae for CG-methods that have not only the global convergence property for general functions but also good numerical performance [21] and [26]. New kinds of nonlinear CG-methods have been developed by using new conjugacy conditions, such as [31]; [20]; [18] and [35]. Recently, [2] proposed a new three-term preconditioned gradient memory method. Their method subsumes some other families of nonlinear preconditioned gradient memory methods as its subfamilies, with Powell's restart criterion and inexact Armijo line searches; their search direction is defined in terms of a step-size α_k obtained by an inexact Armijo line search procedure and a conjugacy parameter β_k. [11] introduced two versions of a CG-algorithm with their own search directions. More recently, [5] introduced a new three-term CG-method; an attractive property of their proposed method is that the generated directions are always descent directions, and this property is independent of the line search used and of the convexity of the objective function. In order to ensure global convergence for general functions, Dai and Liao restricted β_k to be non-negative, that is, β_k^+ = max{β_k, 0}; the search direction of their method uses the quantity defined in (9) together with a parameter t. Also, [7] proposed several extended CG-methods which combine both quadratic and non-quadratic models; their extended search directions are defined in (16). Finally, [3] is a modified three-term CG-method whose search direction is defined in terms of a parameter u, with the limiting case obtained as u approaches 0.
The search direction generated by this method satisfies the descent condition at each iteration; the optimal value of the parameter u is given in (20).
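As an illustration of the iteration (2)-(3), the following is a minimal Python/NumPy sketch of a PR-type CG loop with the non-negativity restriction max{β_k, 0} and a simple bisection line search for the standard Wolfe conditions (4a)-(4b). All names, tolerances and the safeguard restart are illustrative choices for the sketch, not the paper's Fortran implementation.

```python
import numpy as np

def wolfe_line_search(f, grad, x, d, c1=1e-4, c2=0.1):
    """Bisection search for a step satisfying the standard Wolfe
    conditions (4a)-(4b): sufficient decrease and curvature."""
    lo, hi, a = 0.0, None, 1.0
    fx, slope = f(x), grad(x) @ d            # slope < 0 for a descent direction
    for _ in range(60):
        if f(x + a * d) > fx + c1 * a * slope:   # (4a) violated: step too long
            hi = a
        elif grad(x + a * d) @ d < c2 * slope:   # (4b) violated: step too short
            lo = a
        else:
            return a
        a = 0.5 * (lo + hi) if hi is not None else 2.0 * a
    return a

def pr_plus_cg(f, grad, x0, tol=1e-6, max_iter=500):
    """CG iteration (2)-(3) with the PR formula and beta = max(beta, 0)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                       # initial direction d_0 = -g_0
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        a = wolfe_line_search(f, grad, x, d)
        x_new = x + a * d
        g_new = grad(x_new)
        y = g_new - g                            # y_k = g_{k+1} - g_k
        beta = max((g_new @ y) / (g @ g), 0.0)   # PR+ conjugacy parameter
        d = -g_new + beta * d                    # next search direction
        if g_new @ d >= 0.0:                     # safeguard: restart along -g
            d = -g_new
        x, g = x_new, g_new
    return x, max_iter

# usage: minimize the convex quadratic f(x) = 0.5 x^T A x - b^T x
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star, iters = pr_plus_cg(lambda x: 0.5 * x @ A @ x - b @ x,
                           lambda x: A @ x - b,
                           np.zeros(2))
```

The restart to -g when the PR direction fails to be a descent direction mirrors the gradient-restart remedy mentioned above for bad directions.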

In this paper, we propose a new formula for β_k by applying the rational non-quadratic model and Perry's conjugacy condition [1],
where H_k is an approximation to the inverse Hessian. The two new formulae can be seen as modifications of the HS and PR methods, respectively. In comparison with classic CG-methods, the decrease of the objective function value is incorporated in the two new formulae. Moreover, β_k^New keeps the key property of the PR method: if a very small step is generated, the next search direction tends to the Steepest Descent (SD) direction, preventing a sequence of tiny steps from happening. Furthermore, finite quadratic termination is retained for the new methods. Since the sufficient descent condition is of great importance for the global convergence analysis of any CG-method, we modify the conjugacy parameter of [14] to implement the non-quadratic rational model so that it satisfies the sufficient descent property and the standard Wolfe-Powell conditions. In addition, the global convergence property of the new proposed CG-method is discussed, and a set of numerical results is presented showing that the new method is efficient.
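The tiny-step property just mentioned can be made precise for the underlying PR formula; a short sketch of the standard argument, using the Lipschitz constant L from the assumptions of Section 4 and s_k = x_{k+1} - x_k:

```latex
\beta_k^{PR} = \frac{g_{k+1}^{T} y_k}{\|g_k\|^{2}},
\qquad \|y_k\| = \|g_{k+1}-g_k\| \le L\,\|s_k\|,
\quad\text{so}\quad
|\beta_k^{PR}| \le \frac{\|g_{k+1}\|\, L\,\|s_k\|}{\|g_k\|^{2}}
\xrightarrow[\;\|s_k\|\to 0\;]{} 0,
\quad\text{hence}\quad
d_{k+1} = -g_{k+1} + \beta_k^{PR} d_k \to -g_{k+1}.
```

That is, after a tiny step the method automatically falls back toward the steepest descent direction.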

The Fourth Scientific Conference of the College of Computer Science & Mathematics

Materials and Methods.
2.1 Extended CG-Methods for Non-Quadratic Models.
Many attempts have been made to investigate functions more general than the quadratic as a basis for CG-methods. Over the years, various authors have published work in this area, and a large variety of methods have been derived to solve this problem for many sorts of objective functions. The CG-methods discussed so far assume a local quadratic representation of the objective function. However, quadratic models may not always be adequate to incorporate all the information needed to represent the objective function successfully, particularly in problems where the quadratic representation is not good. When we are remote from such a region, a non-quadratic model may better represent the objective function, which leads to speculation on a better way to choose a type of non-quadratic model.

Extended Rational CG-Method. [8]
The CG-methods so far discussed rest on a local quadratic representation of the objective function. In problems where the quadratic representation is not good, or when we are remote from such a region, a nonlinear scaling F(x) = f(q(x)) of a quadratic function q(x), where f is monotonically increasing, may better represent the objective, and this gives an advantage to a method based on such a model. The aim is to obtain a better global rate of convergence for minimization methods when applied to functions more general than the quadratic. In this paper, Al-Bayati's 1993 extended CG-method, which is invariant to nonlinear scaling of quadratic rational functions, is combined with the standard conjugacy condition of [14] to increase the efficiency of this type of CG-method. There is some precedent for this approach: if q(x) is a quadratic function, then F is defined as a nonlinear scaling of q(x) if F(x) = f(q(x)) with df/dq > 0; the invariancy property to nonlinear scaling of [17] has been considered by [15]. Al-Bayati introduced several non-quadratic rational models; see for example Boland's theorem [30]; [8]; [4]; [10] and [9]. Al-Bayati's 1993 non-quadratic model investigated here is defined as the quotient of two quadratic functions and so also belongs to the class of rational functions. If q(x) is the quadratic function, then the model determines the solution x_min in a finite number of iterations not exceeding n, and f[q(x)] satisfies property (23).
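The invariance idea behind such extended methods can be sketched as follows; this is a generic observation about nonlinear scaling, not the specific model (24):

```latex
F(x) = f\bigl(q(x)\bigr), \quad \frac{df}{dq} > 0
\;\Longrightarrow\;
\nabla F(x) = f'\bigl(q(x)\bigr)\,\nabla q(x).
```

Since ∇F is a positive multiple of ∇q at every point, a CG-method that estimates and cancels the scalar f'(q(x_k)) generates on F the same search directions it would generate on the quadratic q, and therefore inherits finite termination in at most n iterations.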

Outline of Al-Bayati's Extended Rational CG-Model.
Step 1: Compute the scalars a, b and c.
Step 2: Set r_k = 1 and go to Step 4.
Step 4: Compute the search direction d_k, where δ is a suitable tolerance value. This direction d_k is then used instead of the direction in the standard CG-formula, and since the model satisfies conditions (23), the resulting algorithm has finite convergence on model (24). Recently, [6] introduced a new extended CG-method whose search directions involve the scalar r_k defined in (26).

Wu and Chen (2010) CG-Method.
In this section, we present the recent work of Wu and Chen (2010), who introduced several well-known CG-formulas. The conjugacy parameters of these CG-methods are given in (37). They proved that all of these CG-methods satisfy the sufficient descent condition and have the global convergence property.
From Section 2 we can obtain r_k using (26) and employ it in the new extended CG-method, whose conjugacy parameter β_k^New is defined in (44). Note that the scalar r_k may be rewritten as in (45); substituting (45) into (44) gives the final form of the new conjugacy parameter.

4.1 Outline of the New Extended CG-Method.
Step 1: …
Step 3: The step-length α_k is obtained by the Wolfe-Powell (WP) procedure.

4.2 Theoretical Properties for the New Extended CG-Method.
In this section, we focus on the convergence behavior of the β_k^New method with exact line searches. Hence, we make the following basic assumptions on the objective function.

4.3 Assumption.
f is bounded below in the level set L_{x_0} = {x : f(x) ≤ f(x_0)}; f is continuously differentiable in a neighborhood U of L_{x_0}, and its gradient ∇f is Lipschitz continuous there, namely, there exists a constant L > 0 such that

||∇f(x) − ∇f(y)|| ≤ L ||x − y|| for all x, y ∈ U.

4.4 Lemma (Zoutendijk Condition).
Suppose that Assumption 4.3 holds. Consider any CG-type method of the form x_{k+1} = x_k + α_k d_k, where d_k is a descent direction and α_k satisfies the Wolfe-Powell line search conditions (4) and (5). Then we have that:

Σ_{k≥0} (g_k^T d_k)² / ||d_k||² < ∞.
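As a sanity check, the Zoutendijk sum can be computed numerically for the simplest special case, steepest descent with exact steps on a strictly convex quadratic. This is an illustration only; the matrix and starting point below are arbitrary choices, not data from the paper.

```python
import numpy as np

# Illustrating the Zoutendijk condition: for steepest descent (d_k = -g_k)
# the summand (g_k^T d_k)^2 / ||d_k||^2 reduces to ||g_k||^2, and the
# partial sums must stay bounded. f(x) = 0.5 x^T A x, exact line search.
A = np.array([[4.0, 1.0], [1.0, 3.0]])   # arbitrary SPD matrix
x = np.array([5.0, -3.0])                # arbitrary starting point
partial_sums = []
total = 0.0
for k in range(200):
    g = A @ x                            # gradient of the quadratic
    if g @ g < 1e-30:                    # stop once the gradient vanishes
        break
    d = -g                               # steepest descent direction
    total += (g @ d) ** 2 / (d @ d)      # Zoutendijk summand
    x = x + ((g @ g) / (g @ A @ g)) * d  # exact step: alpha = g^T g / g^T A g
    partial_sums.append(total)
```

The gradient norms decay geometrically, so the partial sums settle to a finite limit, exactly as the lemma requires.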

Theorem
Suppose that Assumption 4.3 holds. Consider the new extended CG-method defined in (47) with β_k^New. If α_k is obtained by an exact line search, then:

lim inf_{k→∞} ||g_k|| = 0.

We prove the theorem by contradiction and assume that there exists a constant γ > 0 such that ||g_k|| ≥ γ for all k ≥ 0. The compactness of the level set L_{x_0} implies that there also exists a constant γ̄ > 0 such that ||g_k|| ≤ γ̄. Since s_k → 0, we know that there is a k̄ such that, for all k > k̄, ||s_k|| is as small as required, where p is the same as in Lemma 4.4. Then, for all k > k̄, we have:

Proof:
For the initial direction we have d_0 = -g_0, which satisfies (54). Now let the theorem be true for all k−1, i.e. g_{k-1}^T d_{k-1} < 0. Multiplying the search direction of (47) by g_k and using the Wolfe-Powell conditions (4) and (5), we obtain (57). If exact line searches are used, then (57) becomes, using (56), equation (58). Hence, for ELS, the search directions are sufficiently descent. For inexact line searches, since our function f is uniformly convex either in the quadratic or in the non-quadratic regions, there exists a Lipschitz constant L > 0 and a constant such that (59) holds for all x, y ∈ L_{x_0}, or equivalently (60). From the Powell restarting criterion we have (61); using (62) and (63) in (61), the required bound follows. Thus our new proposed extended CG-method also generates sufficient descent directions with inexact line searches, provided the Powell restarting condition is used. Therefore, the method has the global convergence property by satisfying the conditions of the Zoutendijk theorem [19].

Numerical Results
The main work of this section is to report the performance of the new method on a set of test problems. The codes are written in Fortran with double precision arithmetic, and all the tests are performed on a PC. Our experiments use a set of 35 nonlinear unconstrained problems with second derivatives available. These test problems are taken from the CUTE collection and their details are given in the Appendix. Our numerical results are divided into three branches according to the number of variables:
1. 10 numerical experiments with n = 100, 200, ..., 1000.
2. 5 numerical experiments with n = 100, 300, 500, 700, 900.
3. 4 numerical experiments with n = 100, 400, 700, 1000.
In order to assess the reliability of our new proposed method, we have tested it against the standard Wu & Chen modified PRCG-method [14] using the same set of test problems. All these methods terminate when the stopping criterion is met. Tables 5.4, 5.5 and 5.6 compare the percentage performance of the new extended PRCG-method against the standard Wu & Chen PRCG-method, taking the latter's totals over all tools as 100%. In order to summarize our numerical results, we consider only the totals over the (n) different dimensions for all tools used in these comparisons.
Another important issue related to the performance of CG-methods is the line search, which requires sufficient accuracy to ensure that the search directions yield descent. Common criteria for line search accuracy are the Wolfe-Powell conditions (4) and (5).

Tables 5.1, 5.2 and 5.3
compare some numerical results for the modified PRCG-method due to Wu & Chen and the new extended PRCG-method on the 35 test functions. In all these tables, (n) indicates the dimension of the problem; (NOI) the number of iterations; (NOFG) the number of function and gradient evaluations; and (TIME) the total time required to complete the evaluation process for each test problem.

It is clear from Table 5.4 that, taking the Wu & Chen PRCG-method's totals over all tools as 100%, the New Extended PRCG-method shows an improvement of about 12.3% in NOI, 11.5% in NOFG and 2.5% in TIME. Likewise, Table 5.5 shows an improvement of about 6.1% in NOI, 5.3% in NOFG and 3.4% in TIME, and Table 5.6 an improvement of about 12.3% in NOI, 11.3% in NOFG and 2.2% in TIME. These results indicate that the new extended PRCG-method is, in general, the best.
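For clarity, the percentage figures above are computed by taking Wu & Chen's total for each tool as the 100% baseline; a small sketch with hypothetical totals (the real totals are in the tables themselves):

```python
# Hypothetical totals, for illustration only; Wu & Chen's total for a
# tool (here NOI) is the 100% baseline against which the new method is measured.
noi_wu_chen = 1000.0   # hypothetical total NOI for Wu & Chen's method
noi_new     = 877.0    # hypothetical total NOI for the new extended method
improvement = 100.0 * (noi_wu_chen - noi_new) / noi_wu_chen   # percent saved
```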