Assuming that a_{1,1} ≠ 0, one can subtract multiples of the first row of A from the other rows, so that the coefficients a_{i,1} for i > 1 become zero. Of course, the same multiples of b_1 have to be subtracted from the corresponding b_i. This process can be repeated for the remaining (n − 1) × (n − 1) submatrix, in order to eliminate the coefficients of x_2 in the second column. After completion of the process, the remaining matrix has zeros below the diagonal, and the linear system can then easily be solved. For a dense linear system, this way of computing the solution x requires roughly (2/3)n^3 arithmetic operations.
In order to make the process numerically stable, the rows
of A are permuted so that the largest element (in absolute
value) in the first column appears in the first position. This
process is known as partial pivoting and it is repeated for
the submatrices.
The process of Gaussian elimination is equivalent to the decomposition of A as

A = LU

with L a lower triangular matrix and U an upper triangular matrix, and this is what is done in modern software. After the decomposition, one has to solve LUx = b, and this is done in two steps (see the sketch below):
1. First solve y from Ly = b.
2. Then x is obtained from solving Ux = y.
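As a minimal sketch of these two triangular solves, using SciPy's lu and solve_triangular (the 3 × 3 matrix and right-hand side are made-up example data):

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

# Example system (values made up for illustration).
A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([1.0, 2.0, 3.0])

P, L, U = lu(A)  # A = P L U, computed with partial pivoting

# Step 1: solve L y = P^T b (the permutation is applied to b first).
y = solve_triangular(L, P.T @ b, lower=True)

# Step 2: solve U x = y by back substitution.
x = solve_triangular(U, y, lower=False)

print(np.allclose(A @ x, b))  # True
```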
The computational costs for a single linear system are exactly the same as for Gaussian elimination, and partial pivoting is included without noticeable cost. The permutations associated with the pivoting process are represented by an index array, which is used to rearrange b before solving Ly = b.
If one has a number of linear systems with the same matrix A but with different right-hand sides, then one can reuse the LU decomposition for all these right-hand sides. The solution for each new right-hand side then takes only O(n^2) operations, which is much cheaper than repeating the Gaussian elimination procedure afresh for each right-hand side.
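A sketch of this reuse with SciPy's lu_factor and lu_solve (the matrix size and random data are arbitrary example choices):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n))   # example matrix

lu_piv = lu_factor(A)             # O(n^3) factorization, done once

# Each additional right-hand side costs only O(n^2).
for _ in range(10):
    b = rng.standard_normal(n)
    x = lu_solve(lu_piv, b)       # two triangular solves
```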
This process of LU decomposition, with partial pivoting, is the recommended strategy for the solution of dense linear systems. Reliable software for this process is available from software libraries including NAG and LAPACK (Anderson et al., 1992), and the process is used in Matlab. It is relatively cheap to compute a good estimate of the condition number of A, which shows how sensitive the linear system may be to perturbations of the elements of A and b
(see Theorem 1). It should be noted that checking whether the computed solution x̂ satisfies

‖b − Ax̂‖_2 / ‖b‖_2 ≤ ε

for some small ε does not provide much information on the validity of x̂ without further information on A. If A is close to a singular matrix, then small changes in the input data (or even rounding errors) may lead to large errors in the computed solution. The condition number of A is a measure of how close A is to a singular matrix (cf. Theorem 1).
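A small sketch of this effect, using SciPy's hilbert to generate a standard ill-conditioned test matrix: the relative residual of the computed solution is tiny, yet the error in x̂ can still be large.

```python
import numpy as np
from scipy.linalg import hilbert, lu_factor, lu_solve

n = 12
A = hilbert(n)                 # condition number of order 1e16
x_true = np.ones(n)
b = A @ x_true

x_hat = lu_solve(lu_factor(A), b)

# Tiny relative residual ...
print(np.linalg.norm(b - A @ x_hat) / np.linalg.norm(b))
# ... but a much larger relative error in the solution:
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
print(np.linalg.cond(A))
```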
The computation of the factors L and U, with partial pivoting, is in general rather stable (small perturbations to A lead to acceptable perturbations in L and U). If this is a point of concern (visible through a relatively large residual b − Ax̂), then the effects of these perturbed L and U can be largely removed with iterative refinement. The idea of iterative refinement is to compute r = b − Ax̂ and to solve Az = r, using the available factors L and U. The computed solution ẑ is used to correct the approximate solution to x̂ + ẑ. The procedure can be repeated if necessary. Iterative refinement is most effective if r is computed in higher precision. Apart from this, the process is relatively cheap because it requires only O(n^2) operations (compared to the O(n^3) operations for the LU factorization). For further details, we refer to Golub and Van Loan (1996).
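A minimal sketch of one refinement step (the helper name refine_step is made up; np.longdouble stands in for higher-precision accumulation of the residual, and its actual precision is platform dependent):

```python
import numpy as np
from scipy.linalg import lu_solve

def refine_step(A, b, lu_piv, x_hat):
    # Compute the residual r = b - A x_hat in extended precision.
    r = (b.astype(np.longdouble)
         - A.astype(np.longdouble) @ x_hat.astype(np.longdouble))
    # Solve A z = r by reusing the existing L and U factors (O(n^2)).
    z = lu_solve(lu_piv, r.astype(np.float64))
    return x_hat + z  # corrected solution x_hat + z
```

The step can be applied repeatedly until the residual stops decreasing.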
For increasing n, the direct solution method sketched above becomes increasingly expensive (O(n^3) arithmetic operations), and for that reason all sorts of alternative algorithms have been developed to help reduce the costs for special classes of systems.
An important subclass is the class of symmetric positive-definite matrices (see Section 2.1). A symmetric positive-definite matrix A can be decomposed as

A = LL^T

and this is known as the Cholesky decomposition of A. The Cholesky decomposition can be computed in about half the time of an LU decomposition, and pivoting is not necessary. It also requires half the amount of computer storage, since only half of A and only L need to be stored (L may even overwrite A if A is not needed for other purposes). It may be good to note that the numerical stability of Cholesky's process does not automatically lead to accurate solutions x. This depends, again, on the condition number of the given matrix. The stability of the Cholesky process means that the computed factor L is relatively insensitive to perturbations of A.
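A sketch using SciPy's cho_factor and cho_solve (the symmetric positive-definite test matrix is constructed only for the example):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)   # symmetric positive definite by construction
b = rng.standard_normal(5)

c_low = cho_factor(A)           # Cholesky factorization, no pivoting needed
x = cho_solve(c_low, b)

print(np.allclose(A @ x, b))    # True
```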
There is an obvious way to transform Ax = b into a system with a symmetric positive-definite matrix:

A^T Ax = A^T b
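A sketch of this transformation, with the standard caveat (a well-known fact, not stated above) that forming A^T A squares the condition number of A:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))   # example nonsymmetric matrix
b = rng.standard_normal(6)

# Form the normal equations A^T A x = A^T b; A^T A is symmetric
# positive definite when A is nonsingular, so Cholesky applies.
# Note: cond(A^T A) = cond(A)^2, so accuracy can suffer.
AtA = A.T @ A
Atb = A.T @ b
x = cho_solve(cho_factor(AtA), Atb)

print(np.allclose(A @ x, b))
```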