lsqr(A, b, damp=0.0, atol=1e-06, btol=1e-06, conlim=100000000.0, iter_lim=None, show=False, calc_var=False, x0=None)
The function solves Ax = b
or min ||Ax - b||^2
or min ||Ax - b||^2 + d^2 ||x - x0||^2
.
The matrix A may be square or rectangular (over-determined or under-determined), and may have any rank.
1. Unsymmetric equations -- solve Ax = b 2. Linear least squares -- solve Ax = b in the least-squares sense 3. Damped least squares -- solve ( A )*x = ( b ) ( damp*I ) ( damp*x0 ) in the least-squares sense
LSQR uses an iterative method to approximate the solution. The number of iterations required to reach a certain accuracy depends strongly on the scaling of the problem. Poor scaling of the rows or columns of A should therefore be avoided where possible.
For example, in problem 1 the solution is unaltered by row-scaling. If a row of A is very small or large compared to the other rows of A, the corresponding row of ( A b ) should be scaled up or down.
In problems 1 and 2, the solution x is easily recovered following column-scaling. Unless better information is known, the nonzero columns of A should be scaled so that they all have the same Euclidean norm (e.g., 1.0).
In problem 3, there is no freedom to re-scale if damp is nonzero. However, the value of damp should be assigned only after attention has been paid to the scaling of A.
The parameter damp is intended to help regularize ill-conditioned systems, by preventing the true solution from being very large. Another aid to regularization is provided by the parameter acond, which may be used to terminate iterations before the computed solution becomes very large.
If some initial estimate x0
is known and if damp == 0
, one could proceed as follows:
Compute a residual vector
r0 = b - A@x0
.Use LSQR to solve the system
A@dx = r0
.Add the correction dx to obtain a final solution
x = x0 + dx
.
This requires that x0
be available before and after the call to LSQR. To judge the benefits, suppose LSQR takes k1 iterations to solve A@x = b and k2 iterations to solve A@dx = r0. If x0 is "good", norm(r0) will be smaller than norm(b). If the same stopping tolerances atol and btol are used for each system, k1 and k2 will be similar, but the final solution x0 + dx should be more accurate. The only way to reduce the total work is to use a larger stopping tolerance for the second system. If some value btol is suitable for A@x = b, the larger value btol*norm(b)/norm(r0) should be suitable for A@dx = r0.
Preconditioning is another way to reduce the number of iterations. If it is possible to solve a related system M@x = b
efficiently, where M approximates A in some helpful way (e.g. M - A has low rank or its elements are small relative to those of A), LSQR may converge more rapidly on the system A@M(inverse)@z =
b
, after which x can be recovered by solving M@x = z.
If A is symmetric, LSQR should not be used!
Alternatives are the symmetric conjugate-gradient method (cg) and/or SYMMLQ. SYMMLQ is an implementation of symmetric cg that applies to any symmetric A and will converge more rapidly than LSQR. If A is positive definite, there are other implementations of symmetric cg that require slightly less work per iteration than SYMMLQ (but will take the same number of iterations).
Representation of an m-by-n matrix. Alternatively, A
can be a linear operator which can produce Ax
and A^T x
using, e.g., scipy.sparse.linalg.LinearOperator
.
Right-hand side vector b
.
Damping coefficient. Default is 0.
Stopping tolerances. lsqr
continues iterations until a certain backward error estimate is smaller than some quantity depending on atol and btol. Let r = b - Ax
be the residual vector for the current approximate solution x
. If Ax = b
seems to be consistent, lsqr
terminates when norm(r) <= atol * norm(A) * norm(x) + btol * norm(b)
. Otherwise, lsqr
terminates when norm(A^H r) <=
atol * norm(A) * norm(r)
. If both tolerances are 1.0e-6 (default), the final norm(r)
should be accurate to about 6 digits. (The final x
will usually have fewer correct digits, depending on cond(A)
and the size of LAMBDA.) If atol
or :None:None:`btol`
is None, a default value of 1.0e-6 will be used. Ideally, they should be estimates of the relative error in the entries of A
and b
respectively. For example, if the entries of A
have 7 correct digits, set atol = 1e-7
. This prevents the algorithm from doing unnecessary work beyond the uncertainty of the input data.
Another stopping tolerance. lsqr terminates if an estimate of cond(A)
exceeds :None:None:`conlim`
. For compatible systems Ax =
b
, :None:None:`conlim`
could be as large as 1.0e+12 (say). For least-squares problems, conlim should be less than 1.0e+8. Maximum precision can be obtained by setting atol = btol =
conlim = zero
, but the number of iterations may then be excessive. Default is 1e8.
Explicit limitation on number of iterations (for safety).
Display an iteration log. Default is False.
Whether to estimate diagonals of (A'A + damp^2*I)^{-1}
.
Initial guess of x, if None zeros are used. Default is None.
The final solution.
Gives the reason for termination. 1 means x is an approximate solution to Ax = b. 2 means x approximately solves the least-squares problem.
Iteration number upon termination.
norm(r)
, where r = b - Ax
.
sqrt( norm(r)^2 + damp^2 * norm(x - x0)^2 )
. Equal to :None:None:`r1norm`
if damp == 0
.
Estimate of Frobenius norm of Abar = [[A]; [damp*I]]
.
Estimate of cond(Abar)
.
Estimate of norm(A'@r - damp^2*(x - x0))
.
norm(x)
If calc_var
is True, estimates all diagonals of (A'A)^{-1}
(if damp == 0
) or more generally (A'A +
damp^2*I)^{-1}
. This is well defined if A has full column rank or damp > 0
. (Not sure what var means if rank(A)
< n
and damp = 0.
)
Find the least-squares solution to a large, sparse, linear system of equations.
>>> from scipy.sparse import csc_matrix
... from scipy.sparse.linalg import lsqr
... A = csc_matrix([[1., 0.], [1., 1.], [0., 1.]], dtype=float)
The first example has the trivial solution :None:None:`[0, 0]`
>>> b = np.array([0., 0., 0.], dtype=float)
... x, istop, itn, normr = lsqr(A, b)[:4]
... istop 0
>>> x array([ 0., 0.])
The stopping code :None:None:`istop=0`
returned indicates that a vector of zeros was found as a solution. The returned solution x
indeed contains :None:None:`[0., 0.]`
. The next example has a non-trivial solution:
>>> b = np.array([1., 0., -1.], dtype=float)
... x, istop, itn, r1norm = lsqr(A, b)[:4]
... istop 1
>>> x array([ 1., -1.])
>>> itn 1
>>> r1norm 4.440892098500627e-16
As indicated by :None:None:`istop=1`
, lsqr
found a solution obeying the tolerance limits. The given solution :None:None:`[1., -1.]`
obviously solves the equation. The remaining return values include information about the number of iterations (:None:None:`itn=1`
) and the remaining difference of left and right side of the solved equation. The final example demonstrates the behavior in the case where there is no solution for the equation:
>>> b = np.array([1., 0.01, -1.], dtype=float)
... x, istop, itn, r1norm = lsqr(A, b)[:4]
... istop 2
>>> x array([ 1.00333333, -0.99666667])
>>> A.dot(x)-b array([ 0.00333333, -0.00333333, 0.00333333])
>>> r1norm 0.005773502691896255
:None:None:`istop`
indicates that the system is inconsistent and thus x
is rather an approximate solution to the corresponding least-squares problem. :None:None:`r1norm`
contains the norm of the minimal residual that was found.
The following pages refer to to this document either explicitly or contain code examples using this.
scipy.sparse.linalg._isolve.lsqr.lsqr
Hover to see nodes names; edges to Self not shown, Caped at 50 nodes.
Using a canvas is more power efficient and can get hundred of nodes ; but does not allow hyperlinks; , arrows or text (beyond on hover)
SVG is more flexible but power hungry; and does not scale well to 50 + nodes.
All aboves nodes referred to, (or are referred from) current nodes; Edges from Self to other have been omitted (or all nodes would be connected to the central node "self" which is not useful). Nodes are colored by the library they belong to, and scaled with the number of references pointing them