陆吾生华东师大讲义笔记

所需积分/C币:35 2017-12-25 11:06:21 690KB PDF
收藏 收藏
举报

陆吾生 华东师大 讲义 最优化 梯度下降法 随机梯度下降法
1.2 Constrained Optimization: The Karush-Kuhn-Tucker(KKT) Conditions( Chap. 10 The constrained optimization problem minimize X subject to: a, (x)=0 for i=1, 2,,p c(x)≥0forj=1,2,…,q · Fcasiblc rcgion:{x:a1(x)=0fori=1,2,…,p,ande(x)≥0forj=1,2,…,q} A point x is said to bc fcasible if it is in the fcasiblc rcgion(insidc or on the boundary) wc say an inequality constraint c(x)20 is active at a feasible point x, if c(x)=0. And inequality constraint c; ( r) 20 is said to be inactive at a feasible point x if c(x)>0 KKT conditions(first-order necessary conditions for point x* to be a minimum point If x' is a local minimizer of the constrained problem. then (a)a(x)=0fori=1,2,…,p (b)c、(x")≥0forj=1,2,…,g (c)there exist Lagrange multipliers A, for isis p and u for 1sjsq such that 7(x)=∑va(x)+∑Ve(x) (d)(x)=0forj=1,2,…,q (e)42≥0forj=1, Remarks Conditions(a)and(b) are merely the constraints imposed in the problem 9 Condition(c)says at a local minimizer the gradient of the objective function is a linear combination of the gradient of the constraint functions and the coefficients of the combination are lagrange multipliers. Scc Examplc 10.10 and Fig. 10. 8 for illustration ◆ Define lagrangian L(x,,1)=(x)∑礼a(x)-∑uc(x), the equation in(c) can be expresse ed as VL(x,,C)=0 Also note that the equality and inequality constraints can be expressed in terms of Lagrangian as L(x:2,)=0 (x,2,p)≤0 Conditions in(d)are called complementarity conditionS. It follows that if the kth inequality constraint is in active at x then uk=0. In other words, the equation in(c) only contains those terms Vc, (x) that are active at x t The total number of equations p (from(a))+n(from(c))+ g(from(d)) are equal to the total number of unknowns x, n, u'i 1.3 Convcx Scts, Convex Functions, and Convex Programming(CP)(Chaps. 2, 10, 12, 13, 14) ◆ A set is said to be convex if for every pair of points x1,x2∈ Rand every real c with0≤a≤1, point x=x1+(1-a)x2 belongs to灾 for al0≤a≤1 +A function f()defined over a convex set Ris said to be convex if for every pair of points xi, x2 E Rand every real a with0≤a≤1, f(ax1+(1-a)x2)≤(x)+(1-c)f(x2) uppose f(r)EC. Function f(x)is convex over a convex set if and only if f(x,> f(x)+g(x)(x-x or all x and x1∈呎 4 Using the above property, it can be shown that a function fa)eC is convex over a convex set Rif anf only if the Hessian H(x)off(x)is positive semidefinite forx E R 4 The problem of minimizing a convex function f()(with no constraints )is not very complicated, because the first-order necessary condition becomes a sufficient condition, and any local minimizer of a convex function is a global minimizer +A constrained minimization problem minimize subject to: a; (x)=0 for i=1, 2, c(x)≥0forj=1,2,…,q is said to be a convex programming( CP)problem if the objective function is convex and the feasible region defined by the constraints, i. e,x: a(x)=0 for i=1, 2, ..,P, and c; (x)20 for j=1, 2,. 9i is a convex set. In words, a CP problem is a problem of minimizing a convex function over a convex region. It can be shown that the feasible region x: a; (x)=0 for i=1, 2,,P, and c (x)>0 for j=1, 2,. y is convex if(1)all equality constraint functions ai(x)are linear, and(ii)all inequality functions-ci()are convex An important property of the class of cp problems is that the kKt conditions become both necessary and sufficient. This implies that any fx, 1, u) satisfying the KKt conditions gives a local solution (minimizer)x*for the CP problem and any local minimizer of a Cp problem is a global minimizer 1. 4 Wolfe duality( Chap 10 Wolfe's theorem(1991)on duality is concerned with the general Cp problem minimize subject to: a, (x)=ax-b=o for i=1, 2, .,P C,(x)≥0 which is referred as the primal problem. Wolfe's theorem says that if ix,2,u solve the KKt system for the primal problem, then 2, u also solves the dual problem which is defined by maximize L(x, 1, u) x,瓦, (功) subject to: V,L(x, 2, F)=0 ≥0 In addition, f(x)=L(x,a,u) Proof: From the KKT conditions of the primal problem, we obtain f(x)=L(x, n,u)and u 20 For a feasible set x,a, ui, we have u>0 and V L(x, 1, p)=0,hence L(x,元∵,)=f(x)2f(x2)-∑11(x)-∑c1(x)=L(x:,) With u>0 the Lagrangian L(x, n, u)is convex, hence I(x,,A)≥L(x,,p)+(x2-x)V,L(x,,p)=L(x,4,p) Therefore L(x,巩,g)≥L(x,λ,p),i,e.,set{x,A,p} solves the dual problem 4 If we define the duality gap S(x,a,u)=f(x)-L(x, n, u), then the Kkt conditions imply that 8(x, 1, u)20 for fcasiblc (x, i, u)and S(x,2,u)=0. This is bccausc for a fcasiblc sct(, 1, u),wc have d(x,A,)=f(x)-L(x,,)=∑11(x)+∑c(x)-∑c(x)≥0,and 0(x,死,)=∑比c(x)=0 1.5 Linear Programming and Quadratic Programming( Chap 12) 9 Linear programming problems are of great importance in both theoretical and practical perspectives. The modern theory and algorithms of interior-point methods were initially developed for the lp problems in 1980,s and the basic concepts and solution techniques for Lp have since been extended to larger classes of CP and non-convex problems. On the other hand, there are a great many of practical problems that can be accurately or approximately modeled as LP problcms 9 The standard-form linear programming (LP) problem is given by minimize f(x)=c'x subject to: Ax= b ≥0 where matrix A is of size p x n with p< n and a is assumed to have full row rank. It is important to realize that lp problems are convex programming problems. Hence the kkT conditions are necessary and sufficient conditions, and any local solution is a global solution Moreover, the wolfe theorem applies: the dual of the above problem is given by maximI ize h(a)=b'n (①) subject to: A2+ 0 9 The kkt conditions in this case are given b λ+ Ar= b 0forl≤i≤n x≥0,≥0 4 A useful concept is the development of interior-point algorithms for LP is central path. To introduce it we first write the kkt conditions as Ax= b with x≥0 4+H= c with A≥0 XH=O where X-diagx1, x2,...,xn). The central path for a standard-form LP is defined by a set of vectors ix(r),(r), u(r)) that satisfy Ax= b with x > o (c) A n+u=c with u>0 X with T>0,e=[1l1…l The duality gap on the central path is found to be [x(),(r)=cx(r)-b(r) =[2()4+(]x)-b( u(r)x(T)=nt We see that duality gap approaches zero as parameter T approaches zero. In other words, the central path defines a parameterized trajectory approaching to the solution of both primal and dual problems as I approaches zero 4 Below we describe a primal-dual path-following algorithm for LP problems. The idea behind the algorithm is to start with a strictly feasible initial set of vectors wo=o, no, uo and use first-order perturbations of the equations (c) to obtain a set of increment vectors to update point wk. So in the kth iteration we seeks=8,8,8) such that the next iterate x:41,1}={x+6+62,1+6n} remains strictly feasible and approaches the central path defined by (c)with t=tk+l. This leads(c) to a system of linear equations as follows 48.=0 A6.+=0 n8.+X6 where M=diag{(),(A)2…、(A)n} ◆ Additional details: 2+a,8 whcrc ak is obtaincd by a linc-scarch stcp to cnsurc thc strict fcasibility of point wk-l a The value of parameter I, is determine by ith p ntp u An interior-point algorithm with nonfeasible initialization cab be developed in a similar way, see sec 12.5.2. Lp problems may be formulated and solved in an alternative-form which are equivalent to lp problems in the standard form: minimize f(x=c'x abject to:Ax≥b Counterpart algorithms for Lp problems in alternative-form can also be developed 1.6 Semidefinite Programming(SDP)(Chap. 14 SDP is a class of CP problems that includes Lp and convex quadratic programming problems as its subclasses and a lot more other CP problems of practical usefulness, 1990s have witnessed intensive research interest in SDP that has resulted in many efficient solution methods and software tools for SDP problems Notation and definitions We denote the set of n x n symmetric matrices by S". For 4,B E S", the inner product of 4 with B is defined by AB= trace(AB)=∑∑ab Matrix A being positive semidefinite is signified as A>0, and a being positive definite is denoted by A20 4 The primal sdp problem is defined as minimize C·.X abject to:A·X=b,fo l.2 X>0 where C, X, and 4; are members of s". Note that if c=diag[c, 4i-diagail, and x=diag X for some vectors c and ai, then the sdp problem becomes a standard Lp problem minimize X=c subject to: Ax= b 式 where A is a matrix with a as its i-th row and b=[61 b2.. b,] 9 The dual sdp problem assumes the form maximize by bject to:∑nA+S=C By eliminating S and making simple variable changes, we can write the dual problem as Minimize c r (①) ctto:F(x)≥0 whcrc The kkt conditions for sdp ∑yA+S=C A…X= 6. for i=1,2 SX=0 X≥0,S≥0 7 An example: convex quadratic programming(QP) problems The general convex oP problems can be expressed as minimize x Hx+pr with HO subject to:Ax≥b which is equivalent to minimize subjcct to:xHx+px≤ Ax>b where, by using a dccomposition of H=HH, the first constraint can be cxprcsscd as Px-(H (Hx)(Hx)>0 which is equivalent to G(8,x) 0 In addition the second constraint can be written as F(x)=F+∑xF0 where Fo--diag b, and Fi-diagiai s with a; being the i-th column of A By defining an augmented vector x=l, the convex QP problem is now formulated as the SDP minimize C subject to:E(x)≥0 where=1 00, and E(x)=diag(8,x), F(x)) 1.7 Sccond-Ordcr Conc Programming(SOCP)(Chap. 14) 4 The class of socp problem is a subset of the class of sDp problems but it is still large enough to include LP, QP, and many other CP problems of practical importance. Moreover, interior-point algorithms that are specifically designed for soCp is considerably more efficient than what the best an SDP algorithm can offer t Notation and definitions a A second-order cone(also known as quadratic cone or Lorentz cone)of dimension n is defined as R,u∈ R" for ul< a Examples: n=1: a ray on the t-axis starting at t=0, for n=2 and 3 see Fig 14.3 a note that the second-order cone is convex in r The primal socp problem is given b minimize ∑ subject to A x1∈K,fori=1,2,…,q where∈R,x∈R",A∈R"m,b∈Rm1, and K, is a second- order cone of dimension ni We note an analogy between SoCP and LP: both involve a linear objective function and a set of linear equality constraints. While the variable x in an LP problem is constrained to the region x: x>0 which is a convex cone, each variable vector x; in an SoCP problem is constrained to the second-order cone Ki The dual socp problem assumes the form maximize by (①) subject to:A.y+S1=c;fori=1,2,…,q o We also notice a similar analogy between the dual soCP and dual LP problems 9 s,∈K for ◆ Now if we let y, a and then the dual socp problem can be expressed as minimize subjcct to:|A4x+c‖≤bx+d,fori=1,2,,q u An example: Linear fractional problem minimize x十C subject to: ax+c>0 for i=l, 2,,p bx+d≥0fori=1,2, By introducing auxiliary constraints ar+ (an2x+c1)≥1 the linear fractional problem can be formulated as minimize ubject to: S(ax+c)21 for i=1, 2,,P ≥0 b.x+d≥0 We can show that u2≤lv,u≥0,v≥0 if and only if 2w + Hence the first two constraints in the above problem are equivalent to r+c+o or I and the linear fractional problem can be converted into minimize ∑8 2 subject to <ax+c.+δ.fori=1.2 D ax+c.-d b.x+d.≥0 which is an SocP problem

...展开详情
试读 43P 陆吾生华东师大讲义笔记
立即下载 低至0.43元/次 身份认证VIP会员低至7折
一个资源只可评论一次,评论内容不能少于5个字
木2木 老师讲的好,是个不错的资源
2019-08-08
回复
yeeeeeeeeee 陆老师的课程讲得通熟易懂
2018-04-02
回复
关注 私信 TA的资源
上传资源赚积分,得勋章
最新推荐
陆吾生华东师大讲义笔记 35积分/C币 立即下载
1/43
陆吾生华东师大讲义笔记第1页
陆吾生华东师大讲义笔记第2页
陆吾生华东师大讲义笔记第3页
陆吾生华东师大讲义笔记第4页
陆吾生华东师大讲义笔记第5页
陆吾生华东师大讲义笔记第6页
陆吾生华东师大讲义笔记第7页
陆吾生华东师大讲义笔记第8页
陆吾生华东师大讲义笔记第9页

试读结束, 可继续读4页

35积分/C币 立即下载 >