Training artificial neural networks (ANNs) efficiently is currently an urgent problem, especially when the dataset is large. Online learning has been used to address this problem, and most online learning methods are based on the least mean square (LMS) algorithm. However, implementing LMS on conventional digital hardware is inefficient because the memory arrays are physically separated from the arithmetic module. To overcome this limitation, CMOS technology has been used. However, it consumes too much power and area while designing