DFG Zentrum Matheon

Project C12

General Purpose, Linearly Invariant Algorithm for Large-Scale Nonlinear Programming


Duration: January 2004 - May 2010
Project head: Prof. Andreas Griewank, Ph.D.

Institut für Mathematik
Humboldt-Universität zu Berlin,
Rudower Chaussee 25
12489 Berlin-Adlershof

Tel: +49 (0)30 - 2093 5820 (office), +49 (0)30 - 2093 5833 (secretary Jutta Kerger)
email: griewank@math.hu-berlin.de
Responsible: Kshitij Kulshreshtha

Institut für Mathematik
Humboldt-Universität zu Berlin,
Rudower Chaussee 25
12489 Berlin-Adlershof

Tel: +49 (0)30 - 2093 5868
email: kshitij@math.hu-berlin.de
Funding: DFG Research Center "Mathematics for Key Technologies" (MATHEON)

BACKGROUND. For several decades sequential quadratic programming (SQP) algorithms [GMW81] have been the method of choice for the solution of nonlinearly constrained optimization problems (NLP). Recently, the interior point (IP) approach originally developed for linear programming (LP) has been adapted very successfully to nonlinear problems. Currently, the handful of NLP codes that seem most efficient on large problems in the popular test sets (http://plato.asu.edu/topics/testcases.html) are all interior point variants.

There are also efforts to adapt SQP to certain problem classes with particular structure in order to overcome some of its drawbacks. In its classical form SQP requires the evaluation and factorization of the constraint Jacobian at each major (= outer) iteration and a low-rank update of the factorization at each minor (= inner) iteration. Moreover, each major iteration expends considerable effort on solving an inequality constrained quadratic program (QP) whose Hessian is only a rough approximation, which may lead to a faulty identification of the active constraint set. In contrast, interior point algorithms utilize exact derivative information for both objective and constraints, which typically requires that the problem be specified in a special modeling language like AMPL or GAMS. IP codes usually perform one (positive definite) matrix factorization at each major iteration, even though preconditioned conjugate gradients as minor iterations are certainly an option.
The cost of forming and factoring derivative matrices in both SQP and IP codes grows excessively with the numbers of variables (n) and constraints (m), except when Jacobian and Hessian structure is apparent and can be exploited effectively by the optimization algorithm. Often this requires additional user sophistication and extra programming effort. We will exploit several ideas to develop more efficient and user-convenient NLP solvers for problems without apparent structure. All too often such problems are currently attacked in industrial practice by ad hoc methods based on nonmathematical analogies.
In unconstrained optimization so-called quasi-Newton methods have been very successful and are widely used. They eliminate the need to evaluate second derivatives and, at the same time, reduce the linear algebra effort for computing each new iteration point from cubic to quadratic with respect to n. Classical SQP implementations salvage only the first but not the second desirable feature in the more general context of constrained optimization. Also, they require at each iteration the evaluation of all constraint gradients, which is in general m times as expensive as evaluating the constraint functions themselves.
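As a minimal illustration of the quadratic per-iteration cost in the unconstrained case, the following sketch applies the classical BFGS inverse-Hessian update; the test function and sizes are illustrative choices, not taken from the project.

```python
import numpy as np

def bfgs_minimize(grad, x0, tol=1e-8, max_iter=100):
    """Quasi-Newton iteration maintaining an inverse Hessian approximation H.

    Each step costs O(n^2): one matrix-vector product for the direction
    plus a rank-two BFGS update of H -- no O(n^3) (re)factorization.
    For brevity a unit step is taken; a production code would line search.
    """
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                       # initial inverse Hessian guess
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        s = -H @ g                      # quasi-Newton step, O(n^2)
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y                      # curvature along the step
        if sy > 1e-12:                  # skip the update if curvature is lost
            Hy = H @ y
            H += ((sy + y @ Hy) / sy**2) * np.outer(s, s)
            H -= (np.outer(Hy, s) + np.outer(s, Hy)) / sy
        x, g = x_new, g_new
    return x

# Example: minimize f(x) = 0.5 x^T A x - b^T x, whose gradient is A x - b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = bfgs_minimize(lambda x: A @ x - b, np.zeros(2))
```

Note that only gradient values enter the update; no second derivatives are evaluated, and no matrix is ever factored.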

EXPERTISE. The 'total quasi-Newton' approach suggested in [GW02] combines the more selective evaluation of derivatives with the approximation of the Jacobian and Hessian matrices by new, linearly invariant update formulas. As a result, the linear algebra cost per iteration is only quadratic with respect to n+m, and the evaluation effort is bounded by a small multiple of that needed to compute the objective and constraint values by themselves. Due to the heredity property of the updates, equality constrained quadratic programs are solved exactly in n steps for almost all initializations of the primal and dual variables as well as the Jacobian and Hessian approximations.
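The heredity property can be illustrated with the classical symmetric rank-one (SR1) secant update, a simpler relative of the linearly invariant formulas above (not the project's own update): on a quadratic model, n updates along independent directions reproduce the exact Hessian. The matrix and directions below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)             # SPD Hessian of a model quadratic

B = np.eye(n)                           # initial Hessian approximation
for _ in range(n):
    s = rng.standard_normal(n)          # an (almost surely independent) step
    y = A @ s                           # exact curvature along s on a quadratic
    r = y - B @ s                       # residual of the secant condition
    B = B + np.outer(r, r) / (r @ s)    # symmetric rank-one (SR1) update
```

Heredity means each update preserves all previously satisfied secant conditions B s_i = y_i, so after n independent directions B agrees with A on all of R^n. (In practice the denominator r^T s can vanish, a well-known SR1 caveat; it stays safely positive for these illustrative data.)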

RESEARCH PROGRAM. Along the lines of the original proposal we intend to develop an NLP solver that requires only the input of an evaluation program for objective and constraints, with first and second derivative vectors being computed by automatic differentiation (AD). It should be highly competitive for problems with n and m in the thousands, where dense Jacobians and Hessians can be stored in main memory and thus efficiently manipulated. For larger problems data-sparse matrix representations of the kind suggested in [DFW90], [NW99] are envisaged. Moreover, sparse preconditioning that renders the relative matrix discrepancies compact must be incorporated for structured problem classes, such as those arising in optimal control. Starting in summer 2008 we have pursued a limited memory approach to reduce the storage requirement and, as a side benefit, maintain the positive definiteness of the approximating Hessian. A general description is given in the recent conference proceedings article and the Leuven talk by Torsten Bosse. In successive diploma theses by Volker Schlosshauer (March 2009), Torsten Bosse, and Simon Rösel (December 2009/January 2010) the following improvements have been accomplished:

The total storage requirement comes down to 2*n*l + m*m/2, where n is the number of variables, m the number of active constraints, and l the user-selected number of secant pairs kept in storage. Superlinear convergence is lost, but the number of iterations was found to grow only moderately compared to the full-storage implementation, which uses n*n/2 + m*m/2 + n*m storage. A critical challenge remains the active set strategy. Especially on highly nonlinear problems, where the number of (nearly) violated constraints at early iterates may greatly exceed the number of variables, solving the local QP seems to require an inordinate effort, which is unlikely to pay off. That holds all the more when the Jacobian is only approximated, as is the case in our method. We are therefore pursuing an approach of constraint aggregation.
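The limited-memory idea can be sketched with the classical L-BFGS two-loop recursion, used here as a standard stand-in for (not a reproduction of) the project's limited-memory update; the sizes n, m, l at the end are hypothetical, chosen only to evaluate the storage formulas quoted above.

```python
import numpy as np

def lbfgs_direction(g, s_list, y_list):
    """Two-loop recursion: apply the implicit inverse Hessian built from
    the l most recent secant pairs (s_i, y_i) to the gradient g.

    Only the pairs themselves are stored (2*n*l numbers), and the result
    is a descent direction whenever every pair has curvature s_i^T y_i > 0.
    """
    q = np.array(g, dtype=float)
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        a = rho * (s @ q)               # backward pass, newest pair first
        alphas.append(a)
        q -= a * y
    if s_list:                          # scale by gamma = s^T y / y^T y
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * (y @ q)            # forward pass, oldest pair first
        q += (a - beta) * s
    return -q                           # quasi-Newton step direction

# Storage counts from the text, for hypothetical sizes n=10000, m=100, l=20:
n, m, l = 10_000, 100, 20
full    = n * n // 2 + m * m // 2 + n * m   # dense implementation
limited = 2 * n * l + m * m // 2            # limited-memory implementation
```

For these sizes the dense count is 51,005,000 matrix entries against 405,000 for the limited-memory variant, a reduction of more than two orders of magnitude.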

COOPERATION. There will be close cooperation with the groups of Andrea Walther at TU Dresden and Lawrence Biegler at CMU Pittsburgh. They will develop a trust-region based version, whereas we will pursue a line-search approach with inertia controlling [GMW81] to keep the reduced Hessian positive definite. Especially with regard to the comparative efficiency of the new algorithm, we will cooperate with Michael Saunders at Stanford University and Phil Gill at the University of California, San Diego, who are both very experienced experts on the development of NLP algorithms and codes; we have had intensive discussions with them on various aspects of our SQP method. Paul Boggs at Sandia Labs, California is an expert on merit functions for NLP, and Hans Mittelmann at Arizona State University maintains the benchmark test sets cited above. Within the center we will cooperate directly with the groups of Peter Deuflhard, Werner Römisch, and Fredi Tröltzsch, who solve structured optimization problems by SQP and IP methods. We will work with NAG Oxford to develop a library NLP routine that exploits their new Fortran compiler with AD capability.


Last Update: Kshitij Kulshreshtha 2009-04-27