COMPUTATION OF LAGRANGE MULTIPLIERS FOR LINEAR LEAST SQUARES PROBLEMS WITH EQUALITY CONSTRAINTS
Received December 21, 2000; revised June 7, 2001
Abstract
The computation of Lagrange multipliers in linear least squares problems with equality constraints is studied. A perturbation analysis is performed to examine the conditioning of the problems; the behaviour of two algorithms on a class of test problems is shown on some graphs.
AMS Subject Classifications: 65K05, 90C31.
Key Words: Constrained least squares problems, Lagrange multipliers, stability.
1. Introduction
The numerical solution of a linear least squares problem with equality constraints

$$\min_x \|Cx - d\| \quad \text{subject to} \quad E^Tx = b, \qquad (1)$$

where C is a k × n matrix, E is an n × m matrix, d is a k-vector and b is an m-vector, has been widely studied. Algorithms for solving problem (1) are given in [2, 10]; perturbation theories are presented in [3, 5] and the condition numbers of a general formula given in [5] are examined in [9]; the numerical stability, using backward error analysis, of the null space algorithm and of the direct elimination method is analysed in [3, 7] and [8].
In this paper we study the numerical computation of Lagrange multipliers: a detailed perturbation analysis is given; moreover, two algorithms, a null space algorithm (method 2 in [3], also studied in [7]) and a direct elimination algorithm [8], are extended to the computation of Lagrange multipliers and tested on a large set of problems.
It is well known that, to solve problems with inequality constraints, the Lagrange multipliers of problems with equality constraints must be computed; in particular, active set algorithms compute, at each step, the solution and the Lagrange multipliers of a problem with equality constraints.
Now we make some remarks about the existence and uniqueness of the solution and of the Lagrange multiplier vector of problem (1).
We assume that E is of rank m and that the matrix $\binom{C}{E^T}$ is of rank n. Under these conditions the solution of problem (1) exists and is unique; moreover, the linear system obtained from the Kuhn–Tucker conditions of (1)

$$\begin{pmatrix} C^TC & E \\ E^T & 0 \end{pmatrix}\begin{pmatrix} x \\ v \end{pmatrix} = \begin{pmatrix} C^Td \\ b \end{pmatrix} \qquad (2)$$

has a unique solution $z^* = \binom{x^*}{v^*}$, where x∗ is the solution of (1) and v∗ is the corresponding m-vector of Lagrange multipliers.
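As an illustration (a NumPy sketch, not the authors' code), the Kuhn–Tucker system (2) can be assembled and solved directly; the checks use the constraint and relation (8), Ev∗ = C^Tr. The block form of the system, the sizes and the data below are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, m = 8, 5, 2          # sizes as in the paper: C is k x n, E is n x m
C = rng.standard_normal((k, n))
E = rng.standard_normal((n, m))
d = rng.standard_normal(k)
b = rng.standard_normal(m)

# Kuhn-Tucker system (2): [[C^T C, E], [E^T, 0]] [x; v] = [C^T d; b]
B = np.block([[C.T @ C, E], [E.T, np.zeros((m, m))]])
f = np.concatenate([C.T @ d, b])
z = np.linalg.solve(B, f)
x, v = z[:n], z[n:]

r = d - C @ x                        # residual vector
assert np.allclose(E.T @ x, b)       # constraint satisfied
assert np.allclose(E @ v, C.T @ r)   # equation (8): E v* = C^T r
```

Random Gaussian data satisfies the rank assumptions with probability one, so the system is nonsingular here.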
If we set

$$B = \begin{pmatrix} C^TC & E \\ E^T & 0 \end{pmatrix}, \qquad f = \begin{pmatrix} C^Td \\ b \end{pmatrix}, \qquad (3)$$

the solution $z^* = \binom{x^*}{v^*}$ of (2) becomes

$$z^* = B^{-1}f. \qquad (4)$$
Let us consider the orthogonal decomposition

$$E = Q\begin{pmatrix} R \\ 0 \end{pmatrix},$$

where Q is an n × n orthogonal matrix and R is an m × m upper triangular matrix.
By decomposing Q into two matrices Q_1 and Q_2 of dimensions n × m and n × (n − m) respectively,

$$Q = (Q_1 \;\; Q_2),$$

we have

$$E = Q_1R.$$

If we define

$$Y = Q_1R^{-T}, \qquad Z = Q_2,$$

then from the second equation of (2) we can prove that

$$Yb = Q_1Q_1^Tx^* = (I - ZZ^T)x^*. \qquad (5)$$

It is easy to verify that [6]
$$B^{-1} = \begin{pmatrix} Z(Z^TC^TCZ)^{-1}Z^T & S \\ S^T & U \end{pmatrix}, \qquad (6)$$

where

$$S = \left[I - Z(Z^TC^TCZ)^{-1}Z^TC^TC\right]Y, \qquad U = -Y^TC^TCS;$$

moreover it holds that

$$U^T = U = -S^TC^TCY, \qquad S^TC^TCZZ^T = 0. \qquad (7)$$

If we call r the residual vector
$$r = d - Cx^*,$$

the first equation of (2) can be expressed as

$$Ev^* = C^Tr. \qquad (8)$$

Moreover, from (4) and (6) we obtain

$$v^* = S^TC^Td + Ub,$$

which, by (5) and (7), can be written as

$$v^* = (CS)^Tr.$$

2. Computing Methods
The solution and the Lagrange multiplier vector of problem (1) can be computed by solving the linear system (2); two algorithms are obtained by applying to system (2) the orthogonal decompositions

$$H^TEP = \begin{pmatrix} R \\ 0 \end{pmatrix}$$

and

$$Q^TE^TP_1 = (R_1 \;\; R_2),$$

where a pivoting strategy has been considered.
The matrices R and R_1 are m × m upper and lower triangular respectively, H and Q are orthogonal matrices of order n and m, and P and P_1 are permutation matrices.
R and R_1 are nonsingular since E^T is of full row rank.
Null space algorithm (A1)
$$H^TEP = \begin{pmatrix} R \\ 0 \end{pmatrix}, \qquad CH = (C_1 \;\; C_2)$$
$$R^Ty_1 = P^Tb$$
$$\tilde d = d - C_1y_1$$
$$\min \|C_2y_2 - \tilde d\|$$
$$x^* = H\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$$
$$r = \tilde d - C_2y_2$$
$$Ru = C_1^Tr$$
$$v^* = Pu$$

Direct elimination algorithm (A2)
$$Q^TE^TP_1 = (R_1 \;\; R_2)$$
$$\bar b = Q^Tb$$
$$R_1y = \bar b$$
$$R_1F = R_2$$
$$\bar C = CP_1, \qquad \bar C = (\bar C_1 \;\; \bar C_2)$$
$$T = \bar C_1F$$
$$V = \bar C_2 - T$$
$$f = d - \bar C_1y$$
$$\min \|V\bar x_2 - f\|$$
$$h = F\bar x_2$$
$$\bar x_1 = y - h$$
$$x^* = P_1\begin{pmatrix} \bar x_1 \\ \bar x_2 \end{pmatrix}$$
$$r = f - V\bar x_2$$
$$R_1^Tu = \bar C_1^Tr$$
$$v^* = Qu$$

3. Perturbation Analysis
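The two algorithms can be sketched in NumPy as follows. This is illustrative code, not the authors' FORTRAN/LINPACK implementation: pivoting is omitted for brevity (P = P_1 = I), and NumPy's QR of E^T yields an upper triangular R_1 rather than the paper's lower triangular factor. Since the solution of (1) is unique, both sketches must return the same x∗ and v∗.

```python
import numpy as np

def null_space_lse(C, E, d, b):
    """Null space algorithm (A1): min ||Cx - d|| s.t. E^T x = b.

    Sketch without pivoting (P = I): E = H [R; 0] via a complete QR
    factorization, H n x n orthogonal, R m x m upper triangular.
    """
    m = E.shape[1]
    H, R0 = np.linalg.qr(E, mode="complete")
    R = R0[:m, :]
    C1, C2 = (C @ H)[:, :m], (C @ H)[:, m:]
    y1 = np.linalg.solve(R.T, b)                 # R^T y1 = b
    dt = d - C1 @ y1
    y2 = np.linalg.lstsq(C2, dt, rcond=None)[0]  # min ||C2 y2 - d~||
    x = H @ np.concatenate([y1, y2])
    r = dt - C2 @ y2
    u = np.linalg.solve(R, C1.T @ r)             # R u = C1^T r
    return x, u, r                               # v* = u since P = I

def direct_elim_lse(C, E, d, b):
    """Direct elimination algorithm (A2) for the same problem.

    Sketch without column pivoting (P1 = I); here R1 is upper triangular.
    """
    m = E.shape[1]
    Q, R12 = np.linalg.qr(E.T)                   # Q^T E^T = [R1 R2]
    R1, R2 = R12[:, :m], R12[:, m:]
    y = np.linalg.solve(R1, Q.T @ b)             # R1 y = b_bar
    F = np.linalg.solve(R1, R2)                  # R1 F = R2
    C1, C2 = C[:, :m], C[:, m:]
    V = C2 - C1 @ F
    f = d - C1 @ y
    x2 = np.linalg.lstsq(V, f, rcond=None)[0]    # min ||V x2 - f||
    x = np.concatenate([y - F @ x2, x2])
    r = f - V @ x2
    u = np.linalg.solve(R1.T, C1.T @ r)          # R1^T u = C1^T r
    return x, Q @ u, r                           # v* = Q u

rng = np.random.default_rng(1)
k, n, m = 8, 5, 2
C, E = rng.standard_normal((k, n)), rng.standard_normal((n, m))
d, b = rng.standard_normal(k), rng.standard_normal(m)
x1_, v1_, r1_ = null_space_lse(C, E, d, b)
x2_, v2_, r2_ = direct_elim_lse(C, E, d, b)
assert np.allclose(x1_, x2_) and np.allclose(v1_, v2_)
assert np.allclose(E.T @ x1_, b) and np.allclose(E @ v1_, C.T @ r1_)
```

The multiplier step costs only one extra matrix-vector product and one triangular solve, which is consistent with the remark in Section 4 that extending the algorithms to v∗ does not change the order of the operation count.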
The following perturbation analysis is linear: we suppose that the data perturbations are small enough that quadratic and higher order components are negligible. The analysis starts along the same lines followed in [1] for quadratic programming problems.
Let us consider the perturbed problem

$$\min_x \|\tilde Cx - \tilde d\| \quad \text{subject to} \quad \tilde E^Tx = \tilde b, \qquad (9)$$

where

$$\tilde C = C + \Delta C, \qquad \tilde E = E + \Delta E, \qquad \tilde d = d + \Delta d, \qquad \tilde b = b + \Delta b.$$

The Kuhn–Tucker conditions of problem (9) are:
$$\begin{pmatrix} \tilde C^T\tilde C & \tilde E \\ \tilde E^T & 0 \end{pmatrix}\begin{pmatrix} x \\ v \end{pmatrix} = \begin{pmatrix} \tilde C^T\tilde d \\ \tilde b \end{pmatrix}.$$

This linear system can be written as

$$(B + \Delta B)z = f + \Delta f, \qquad (10)$$

where B is defined in (3) and

$$\Delta B = \begin{pmatrix} C^T\Delta C + \Delta C^TC + \Delta C^T\Delta C & \Delta E \\ \Delta E^T & 0 \end{pmatrix},$$

$$\Delta f = \begin{pmatrix} C^T\Delta d + \Delta C^Td + \Delta C^T\Delta d \\ \Delta b \end{pmatrix}.$$
If we assume that the perturbation matrix ΔB satisfies the inequality

$$\|B^{-1}\|\,\|\Delta B\| < 1,$$

then the spectral radius of B^{-1}ΔB satisfies the inequality

$$\rho(B^{-1}\Delta B) \le \|B^{-1}\Delta B\| \le \|B^{-1}\|\,\|\Delta B\| < 1,$$

B + ΔB is nonsingular and the solution of system (10)

$$z = (B + \Delta B)^{-1}(f + \Delta f) = z^* + \Delta z$$

exists and is unique.
From the first order perturbation result for the system (10), which is

$$\Delta z \doteq B^{-1}(\Delta f - \Delta Bz^*),$$

we obtain

$$\Delta z \doteq B^{-1}\begin{pmatrix} C^T(\Delta d - \Delta Cx^*) + \Delta C^Tr - \Delta Ev^* \\ \Delta b - \Delta E^Tx^* \end{pmatrix}.$$

Here ≐ means that on the right side of the equality we neglect the components of the perturbations of order higher than one.
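The first order estimate Δz ≐ B^{-1}(Δf − ΔBz∗) can be checked numerically: for a perturbation of size ε, the discrepancy between the exact Δz and the estimate is of order ε², hence tiny relative to Δz itself. An illustrative sketch with arbitrary sizes and data:

```python
import numpy as np

rng = np.random.default_rng(3)
k, n, m = 8, 5, 2
C, E = rng.standard_normal((k, n)), rng.standard_normal((n, m))
d, b = rng.standard_normal(k), rng.standard_normal(m)

def kkt(C, E, d, b):
    """Assemble and solve the Kuhn-Tucker system (2); return B, f, z."""
    mm = E.shape[1]
    B = np.block([[C.T @ C, E], [E.T, np.zeros((mm, mm))]])
    f = np.concatenate([C.T @ d, b])
    return B, f, np.linalg.solve(B, f)

B, f, z = kkt(C, E, d, b)

eps = 1e-6                 # perturbation size; neglected terms are O(eps^2)
dC = eps * rng.standard_normal(C.shape)
dE = eps * rng.standard_normal(E.shape)
dd = eps * rng.standard_normal(k)
db = eps * rng.standard_normal(m)

Bp, fp, zp = kkt(C + dC, E + dE, d + dd, b + db)
dz_exact = zp - z
dz_first = np.linalg.solve(B, (fp - f) - (Bp - B) @ z)

# first order estimate matches the exact change up to O(eps^2) terms
assert np.linalg.norm(dz_exact - dz_first) <= 1e-3 * np.linalg.norm(dz_exact)
```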
From (6) we have

$$B^{-1} = \begin{pmatrix} Z(Z^TC^TCZ)^{-1}Z^T & S \\ S^T & U \end{pmatrix}$$

and then

$$\Delta x \doteq Z(Z^TC^TCZ)^{-1}Z^T\left[C^T(\Delta d - \Delta Cx^*) + \Delta C^Tr - \Delta Ev^*\right] + S(\Delta b - \Delta E^Tx^*),$$

$$\Delta v \doteq S^T\left[C^T(\Delta d - \Delta Cx^*) + \Delta C^Tr - \Delta Ev^*\right] + U(\Delta b - \Delta E^Tx^*).$$
Define
Multiple line equation(s) cannot be represented in ASCII text
Denote by P(E) the projection matrix onto the null space of E^T,

$$P(E) = I - (E^T)^+E^T,$$

where (E^T)^+ is the Moore–Penrose generalized inverse of E^T; define K(E) and K(C) to be the condition numbers of E and C,

$$K(E) = \|E\|\,\|E^+\|, \qquad K(C) = \|C\|\,\|C^+\|,$$

and, as in [5], set

$$L_{IC}^+ = \left[I - (CP(E))^+C\right](E^T)^+;$$

thus define
Multiple line equation(s) cannot be represented in ASCII text (11)
it is easy to see that
Multiple line equation(s) cannot be represented in ASCII text (12)
and
Multiple line equation(s) cannot be represented in ASCII text (13)
Because of ‖Ev∗‖ = ‖C^Tr‖, see (8), from formulae (11), (12), (13) it follows
Multiple line equation(s) cannot be represented in ASCII text (14)
and it is easy to see that
$$\rho\beta \ge 1.$$

The first inequality of (14) can be obtained from [3, 5].
Furthermore, in [9] it has been proved that, if C and E are full rank matrices, μ_1 ≥ μ_2 ≥ ··· ≥ μ_n are the singular values of C, and K(C) and K(E) are the condition numbers of C and E, then K_E(C), K^C(E), v(C, E^T), ρ and γ satisfy the inequalities
Multiple line equation(s) cannot be represented in ASCII text (15)
From now on, it is supposed that these relations are satisfied.
Moreover, if K(C) = 1, that is μ_1 = μ_2 = ··· = μ_n = μ, it follows that
$$K_E(C) = 1, \qquad K^C(E) = K(E), \qquad v(C, E^T) = K(E). \qquad (16)$$
The first and second of these equalities follow from (15). The third is proved by considering the singular value decomposition

$$C = U\begin{pmatrix} D \\ 0 \end{pmatrix}V^T,$$

where U and V are k × k and n × n orthogonal matrices, U_1 is the matrix whose columns are the first n columns of U, and D is an n × n diagonal matrix; by observing that now

$$D = \mu I, \qquad C^TC = \mu^2I, \qquad C = \mu U_1V^T,$$

it follows that

$$CL_{IC}^+ = \mu U_1V^TQ_1R^{-T}$$

and then
Multiple line equation(s) cannot be represented in ASCII text
From the first of (14) it follows that the sensitivity of x∗ is governed by K_E(C), K^C(E), v(C, E^T) and also by ρ and γ. If ρ is small or zero, γ is also small by (15); then the sensitivity of x∗ is measured by K_E(C) and K^C(E) and, since K_E(C) and K^C(E) are bounded by K(C) and K(E) as shown in (15), a sufficient condition for problem (1) to be well conditioned is that E and C are both well conditioned. If ρ is not small, the sensitivity of x∗ is measured by (K_E(C))²v(C, E^T)ρ ≤ (K(C))²K(E)ρ.
From the second of (14) it follows that the sensitivity of v∗ is also influenced by β; if ‖r‖ ≈ 0, that is ρ is small, and ‖C^Tr‖ ≈ 0, that is β is large, the effect of data perturbations can be small for the computation of x∗ and large for the computation of v∗. We observe that, if C^Tr = 0, then v∗ = 0 and x∗ is a solution of the unconstrained problem min ‖Cx − d‖; in particular, if r = 0, x∗ is a solution of the linear system Cx = d. If β and ρβ are small, a sufficient condition for the computation of v∗ to be well conditioned is that E and C are both well conditioned.
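The remark that C^Tr = 0 forces v∗ = 0 can be illustrated numerically: if b is chosen as E^Tx_u for the unconstrained minimizer x_u, the constraint does not bind and the multipliers vanish. An illustrative sketch (the block form of system (2) and the data are assumptions of the example):

```python
import numpy as np

rng = np.random.default_rng(6)
k, n, m = 8, 5, 2
C, E = rng.standard_normal((k, n)), rng.standard_normal((n, m))
d = rng.standard_normal(k)

xu = np.linalg.lstsq(C, d, rcond=None)[0]   # unconstrained minimizer: C^T r = 0
b = E.T @ xu                                # constraint already satisfied by xu

# solve the Kuhn-Tucker system (2)
B = np.block([[C.T @ C, E], [E.T, np.zeros((m, m))]])
f = np.concatenate([C.T @ d, b])
z = np.linalg.solve(B, f)
x, v = z[:n], z[n:]

assert np.allclose(x, xu)   # x* coincides with the unconstrained solution
assert np.allclose(v, 0)    # C^T r = 0 implies v* = 0
```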
Extensive numerical experimentation on the estimation of x∗ is reported in [9]; the present experiments are directed to point out the different behaviour of the error propagation in the computation of x∗ and v∗ when k, n, m, K(E), K(C), ‖r‖ and β change.
4. Numerical Experiments
For the experimentation we consider the test problems described in [9]. E and C are assumed to be full rank matrices; the solution x∗ and the residual norm ‖r‖ = ‖d − Cx∗‖ are assigned; the residual direction u = r/‖r‖ and the vector v of Lagrange multipliers are randomly generated. Four orthogonal matrices Q, K, H and V of appropriate sizes are randomly computed such that C and E can be defined as

$$C = Q\begin{pmatrix} D \\ 0 \end{pmatrix}K^T, \qquad E = H\begin{pmatrix} S \\ 0 \end{pmatrix}V^T,$$

where D and S are diagonal matrices with diagonal elements d_ii = μ_i and s_ii = s_i.
The parameters that change in the experiments are ρ and the condition numbers K(C) = μ_1/μ_n and K(E) = s_1/s_m.
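An illustrative NumPy sketch of this construction (not the authors' code): random orthogonal factors are obtained from QR factorizations of Gaussian matrices, and the singular values are assigned in geometric progression so that the condition numbers take prescribed values. Sizes, seed and target condition numbers are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
k, n, m = 20, 10, 5
KC, KE = 1e2, 1e3          # target condition numbers K(C) and K(E)

def rand_orth(p):
    """Random p x p orthogonal matrix from the QR of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
    return Q

# singular values in geometric progression: mu_1 = 1, mu_n = 1/K(C)
mu = KC ** (-np.arange(n) / (n - 1))
s = KE ** (-np.arange(m) / (m - 1))

Qo, Ko, Ho, Vo = rand_orth(k), rand_orth(n), rand_orth(n), rand_orth(m)
C = Qo[:, :n] @ np.diag(mu) @ Ko.T     # C = Q [D; 0] K^T, k x n
E = Ho[:, :m] @ np.diag(s) @ Vo.T      # E = H [S; 0] V^T, n x m

assert np.isclose(np.linalg.cond(C), KC)
assert np.isclose(np.linalg.cond(E), KE)
```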
In the present experiments β also assumes different values. To this purpose, the direction u = r/‖r‖ is not randomly generated but is computed as

$$u = tu_i + \sqrt{1 - t^2}\,u_j,$$

where u_i and u_j are columns of U and t is a number between 0 and 1 (0 ≤ t ≤ 1). If we consider the partition of U as

$$U = (U_1 \;\; U_2),$$

where U_1 is a k × n matrix, we have

$$C^Tu_j = 0, \qquad \|C^Tu_i\| = \mu_i,$$

so that

$$\|C^Tr\| = \|r\|\,\|C^Tu\| = t\mu_i\|r\|.$$

In particular, ‖C^Tr‖ = 0 for t = 0, that is when r has the direction of u_j and x∗ is also a solution of the unconstrained problem min ‖Cx − d‖; ‖C^Tr‖ = ‖r‖μ_i for t = 1, that is when r has the direction of u_i.
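A small numerical check of this choice of u (an illustrative sketch; it assumes the combination u = t·u_i + √(1 − t²)·u_j, with u_i a column of U_1 and u_j a column of U_2, so that C^Tu_j = 0):

```python
import numpy as np

rng = np.random.default_rng(5)
k, n = 20, 10
C = rng.standard_normal((k, n))
U, mu, VT = np.linalg.svd(C)       # full SVD: columns of U beyond n span null(C^T)

i, j = 0, k - 1                    # i = 1 and j = k in the paper's numbering
t = 0.3
u = t * U[:, i] + np.sqrt(1 - t**2) * U[:, j]
rnorm = 2.5                        # assigned residual norm ||r||
r = rnorm * u

# ||C^T r|| = t mu_i ||r||: zero for t = 0, mu_i ||r|| for t = 1
assert np.isclose(np.linalg.norm(u), 1.0)
assert np.isclose(np.linalg.norm(C.T @ r), t * mu[i] * rnorm)
```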
In the experimentation, the values of K(E) and K(C) are assigned and the singular values of E and C are generated in geometric progression:

$$\mu_i = \mu_1K(C)^{-(i-1)/(n-1)}, \qquad s_i = s_1K(E)^{-(i-1)/(m-1)},$$

where μ_1 = 1 and s_1 is computed in the construction of the test problem; moreover we have chosen i = 1 and j = k, and ‖x∗‖ = 1, so that, being ‖C‖ = ‖x∗‖ = μ_1 = 1, it follows:
Multiple line equation(s) cannot be represented in ASCII text (17)
The A1 and A2 algorithms have been implemented in FORTRAN by using the LINPACK library [4]. The experiments have been carried out on a Pentium PC for many values of k, n, m, K(C), K(E) and ‖r‖, varying t. The test problems have been constructed in double precision (machine precision eps = 2.22E−16); the algorithms have been implemented in single precision (eps = 1.19E−07).
The two algorithms require almost the same computing time. This behaviour can be explained because, as noticed in [2], the operation count for the computation of x∗ is nearly equal for the two algorithms; moreover, the last three steps of the algorithms do not change the order of the computational complexity.
The results obtained by the two algorithms are substantially the same and agree with formulae (14), (15) and (16); hence A1 and A2 work as well-behaved algorithms on the test problems considered. We recall that in [3, 7, 8] only the stability of algorithms A1 and A2 for the computation of x∗ is shown.
Some results of the experiments are shown in Figs. 1–16. In these figures, plots of the relative errors of the solution, ‖x̄ − x∗‖/‖x∗‖, and of the Lagrange multipliers, ‖v̄ − v∗‖/‖v∗‖, versus t are reported, for k = 200, n = 100, m = 50 and for some values of ‖r(x∗)‖, K(C) and K(E); here x̄ and v̄ are the computed vectors of the solution and of the Lagrange multipliers. Each graph represents the average values obtained in 20 executions that differ only in the initialization of the random-number generator used to construct the test problems.
The graphs confirm that the error in x∗ does not depend on t, that is, on β; it depends on K_E(C) and K^C(E) for small values of ρ, and on (K_E(C))²v(C, E^T)ρ for large values of ρ. For small values of t, that is for large values of β, the error in v∗ depends on K(E)v(C, E^T)β and, if ρ is not too small, on K^C(E)ρβ.
Finally, we observe that in Fig. 9 the Lagrange multiplier errors are larger than those of Fig. 13 because, as shown in (16) and (17), if K(C) = 1 then v(C, E^T) = K(E), while if K(C) > 1 then v(C, E^T) ≤ K(E).
GRAPH: Figures 1–3
GRAPH: Figures 4–6
GRAPH: Figures 7–9
GRAPH: Figures 10–12
GRAPH: Figures 13–15
GRAPH: Figure 16
References
[1] Arioli, M.: The use of QR factorization in sparse quadratic programming and backward error issues. SIAM J. Matrix Anal. Appl. 21, 825–839 (2000).
[2] Björck, Å.: Numerical methods for least squares problems. Philadelphia: SIAM, 1996.
[3] Cox, A. J., Higham, N. J.: Accuracy and stability of the null space method for solving the equality constrained least squares problem. BIT 39, 34–50 (1999).
[4] Dongarra, J. J., Bunch, J. R., Moler, C. B., Stewart, G. W.: LINPACK User's Guide. Philadelphia: SIAM, 1979.
[5] Eldén, L.: Perturbation theory for the least squares problem with linear equality constraints. SIAM J. Numer. Anal. 17, 338–350 (1980).
[6] Fletcher, R.: Practical methods of optimization. Chichester: Wiley, 1987.
[7] Galligani, E., Laratta, A.: Error analysis of null space algorithm for linear equality constrained least squares problems. Computing 52, 161–176 (1994).
[8] Galligani, E., Zanni, L.: On the stability of the direct elimination method for equality constrained least squares problems. Computing 64, 263–277 (2000).
[9] Laratta, A., Zironi, F.: Numerical solution of linear least squares problems with linear equality constraints. J. Optim. Theory Appl. 65, 67–83 (1990).
[10] Lawson, C. L., Hanson, R. J.: Solving least squares problems. Philadelphia: SIAM, 1995.
By A. Laratta, Modena and F. Zironi, Modena
A. Laratta Dipartimento di Matematica Università di Modena e Reggio Emilia Via Campi, 213/B I-41100 Modena Italy e-mail: laratta@unimo.it
F. Zironi Dipartimento di Matematica Università di Modena e Reggio Emilia Via Campi, 213/B I-41100 Modena Italy e-mail: zironi@unimo.it