Deep Learning: Fundamentals, Theory and Applications
Cognitive Computation Trends is an exciting new book series covering cutting-edge research, practical applications and future trends across the whole spectrum of multi-disciplinary fields encompassed by the emerging discipline of Cognitive Computation. The series aims to bridge the existing gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities. The broad scope of Cognitive Computation Trends covers basic and applied work involving bio-inspired computational, theoretical, experimental and integrative accounts of all aspects of natural and artificial cognitive systems, including perception, action, attention, learning and memory, decision making, language processing, communication, reasoning, problem solving, and consciousness. More information about this series is available from the publisher.

Editors: Kaizhu Huang (Xi'an Jiaotong-Liverpool University, Suzhou, China), Amir Hussain (School of Computing, Edinburgh Napier University, Edinburgh, UK), Qiu-Feng Wang (Xi'an Jiaotong-Liverpool University, Suzhou, China), Rui Zhang (Xi'an Jiaotong-Liverpool University, Suzhou, China)

Deep Learning: Fundamentals, Theory and Applications. Springer.

ISSN 2524-5341, ISSN 2524-535X (electronic): Cognitive Computation Trends
ISBN 978-3-030-06072-5, ISBN 978-3-030-06073-2 (eBook)
Library of Congress Control Number: 2019930405

© Springer Nature Switzerland AG 2019. This work is subject to copyright.
All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Preface

Over the past 10 years, deep learning has attracted a lot of attention, and many exciting results have been achieved in various areas, such as speech recognition, computer vision, handwriting recognition, machine translation, and natural language understanding. Rather surprisingly, the performance of machines has even surpassed humans' in some specific areas. The fast development of deep learning has already started impacting people's lives; however, challenges still exist.
In particular, the theory of successful deep learning has yet to be clearly explained, and realization of state-of-the-art performance with deep learning models requires tremendous amounts of labelled data. Further optimization of deep learning models can require substantially long times for real-world applications. Hence, much effort is still needed to investigate deep learning theory and apply it in various challenging areas. This book looks at some of the problems involved and describes, in depth, the possible solutions and latest techniques achieved by researchers in the areas of machine learning, computer vision, and natural language processing. The book comprises six chapters, each preceded by an introduction and followed by a comprehensive list of references for further reading and research. The chapters are summarized below.

Density models provide a framework to estimate distributions of the data, which is a major task in machine learning. Chapter 1 introduces deep density models with latent variables, which are based on a greedy layer-wise unsupervised learning algorithm. Each layer of the deep models employs a model that has only one layer of latent variables, such as the Mixtures of Factor Analyzers (MFAs) and the Mixtures of Factor Analyzers with Common Loadings (MCFAs).

Recurrent Neural Network (RNN)-based deep learning models have been widely investigated for sequence pattern recognition, especially the Long Short-Term Memory (LSTM). Chapter 2 introduces a deep LSTM architecture and a Connectionist Temporal Classification (CTC) beam search algorithm, and evaluates this design on online handwriting recognition.

Following the above deep learning-related theories, Chapters 3, 4, 5 and 6 introduce recent advances in applications of deep learning methods in several areas.
Chapter 3 overviews the state-of-the-art performance of deep learning-based Chinese handwriting recognition, including both isolated character recognition and text recognition.

Chapters 4 and 5 describe applications of deep learning methods in natural language processing (NLP), which is a key research area in artificial intelligence (AI). NLP aims at designing computer algorithms to understand and process natural language in the same way as humans do. Specifically, Chapter 4 focuses on NLP fundamentals, such as word embedding or representation methods via deep learning, and describes two powerful learning models in NLP: Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). Chapter 5 addresses deep learning technologies in a number of benchmark NLP tasks, including entity recognition, super-tagging, machine translation and text summarization.

Finally, Chapter 6 introduces oceanic data analysis with deep learning models, focusing on how CNNs are used for ocean front recognition and LSTMs for sea surface temperature prediction, respectively.

In summary, we believe this book will serve as a useful reference for senior (undergraduate or graduate) students in computer science, statistics, and electrical engineering, as well as others interested in studying or exploring the potential of exploiting deep learning algorithms. It will also be of special interest to researchers in the areas of AI, pattern recognition, machine learning, and related areas, alongside engineers interested in applying deep learning models in existing or new practical applications. In terms of prerequisites, readers are assumed to be familiar with basic machine learning concepts, including multivariate calculus, probability and linear algebra, as well as computer programming skills.

Suzhou, China        Kaizhu Huang
Edinburgh, UK        Amir Hussain
Suzhou, China        Qiu-Feng Wang
Suzhou, China        Rui Zhang
March 2018

Contents

1  Introduction to Deep Density Models with Latent Variables
   Xi Yang, Kaizhu Huang, Rui Zhang, and Amir Hussain
2  Deep RNN Architecture: Design and Evaluation .......... 31
   Tonghua Su, Li Sun, Qiu-Feng Wang, and Da-Han Wang
3  Deep Learning Based Handwritten Chinese Character and Text Recognition
   Xu-Yao Zhang, Yi-Chao Wu, Fei Yin, and Cheng-Lin Liu
4  Deep Learning and Its Applications to Natural Language Processing
   Haiqin Yang, Linkai Luo, Lap Pong Chueng, David Ling, and Francis Chin
5  Deep Learning for Natural Language Processing .......... 111
   Jiajun Zhang and Chengqing Zong
6  Oceanic Data Analysis with Deep Learning Models .......... 139
   Guoqiang Zhong, Li-Na Wang, Qin Zhang, Estanislau Lima, Xin Sun, Junyu Dong, Hui Wang, and Biao Shen
Index .......... 161

Chapter 1
Introduction to Deep Density Models with Latent Variables
Xi Yang, Kaizhu Huang, Rui Zhang, and Amir Hussain

Abstract  This chapter introduces deep density models with latent variables, which are based on a greedy layer-wise unsupervised learning algorithm. Each layer of the deep models employs a model that has only one layer of latent variables, such as the Mixtures of Factor Analyzers (MFAs) and the Mixtures of Factor Analyzers with Common Loadings (MCFAs). As background, the MFA and MCFA approaches are reviewed. Comparing these two approaches, sharing the common loading is more physically meaningful, since the common loading can be regarded as a kind of feature selection or reduction matrix. Importantly, MCFAs can remarkably reduce the number of free parameters compared with MFAs. Then the deep models (deep MFAs and deep MCFAs) and their inferences are described, which show that the greedy layer-wise algorithm is an efficient way to learn deep density models and that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures.
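The greedy layer-wise idea described in the abstract can be sketched in a few lines. The code below is an illustrative stand-in, not the chapter's MFA/MCFA algorithm: each "layer" here is a simple one-factor linear model (mean plus leading direction of variation, found by power iteration), and the latent coordinates produced by one layer become the training data for the next. The function names `fit_one_layer` and `greedy_layerwise` are our own.

```python
import random

def fit_one_layer(data, iters=100):
    """Stand-in for a one-latent-layer density model: estimate the mean and
    the leading direction of variation by power iteration on the sample
    covariance, and map each point to its 1-D latent coordinate."""
    d = len(data[0])
    n = len(data)
    mean = [sum(y[j] for y in data) / n for j in range(d)]
    centred = [[y[j] - mean[j] for j in range(d)] for y in data]
    w = [1.0] * d  # initial direction
    for _ in range(iters):
        # multiply w by the sample covariance: S w = (1/N) X^T (X w)
        proj = [sum(c[j] * w[j] for j in range(d)) for c in centred]
        w = [sum(p * c[j] for p, c in zip(proj, centred)) / n
             for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    latent = [[sum(c[j] * w[j] for j in range(d))] for c in centred]
    return w, latent

def greedy_layerwise(data, n_layers=2):
    """Greedy layer-wise scheme: train layer 1 on the observations, then
    treat each layer's latent representation as the 'data' for the next."""
    layers = []
    current = data
    for _ in range(n_layers):
        w, latent = fit_one_layer(current)
        layers.append(w)
        current = latent
    return layers, current

# toy data stretched along the first axis
random.seed(1)
data = [[random.gauss(0, 5), random.gauss(0, 1)] for _ in range(500)]
layers, top_latent = greedy_layerwise(data, n_layers=2)
```

The point of the sketch is structural: each layer is fit in isolation by the same single-layer routine, which is exactly what makes the greedy scheme cheap compared with jointly optimizing all layers.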
The performance is evaluated between two shallow models and two deep models, separately, on both density estimation and clustering. Furthermore, the deep models are also compared with their shallow counterparts.

Keywords  Deep density model · Mixture of factor analyzers · Common component factor loading · Dimensionality reduction

X. Yang · K. Huang (✉) · R. Zhang
Xi'an Jiaotong-Liverpool University, Suzhou, China

A. Hussain
School of Computing, Edinburgh Napier University, Edinburgh, UK
e-mail: a.hussain@napier.ac.uk

1.1 Introduction

Density models provide a framework for estimating distributions of the data and therefore emerge as one of the central theoretical approaches for designing machines (Rippel and Adams 2013; Ghahramani 2015). One of the essential probabilistic methods is to adopt latent variables, which reveal data structure and explore features for subsequent discriminative learning. Latent variable models are widely used in machine learning, data mining and statistical analysis.

In recent advances in machine learning, a central task is to estimate deep architectures for modeling data with complex structure, such as text, images, and sounds. Deep density models have always been a hot spot for constructing sophisticated density estimates. Existing models, namely probabilistic graphical models, not only prove theoretically that deep architectures can create a better prior for complex structured data than shallow density architectures, but also have practical significance in prediction, reconstruction, clustering, and simulation (Hinton et al. 2006; Hinton and Salakhutdinov 2006). However, they often encounter computational difficulties in practice due to a large number of free parameters and costly inference procedures (Salakhutdinov et al. 2007).

To this end, the greedy layer-wise learning algorithm is an efficient way to learn deep architectures which consist of many layers of latent variables. With this algorithm, the first layer is learned by using a model that has only one layer of latent variables. The same scheme can then be extended to train the following layers, one layer at a time. Compared with previous methods, this deep latent variable model has fewer free parameters, by sharing parameters between successive layers, and a simpler inference procedure, due to a concise objective function. Deep density models with latent variables are often used to solve unsupervised tasks. In many applications, the parameters of these density models are determined by maximum likelihood, which typically adopts the expectation-maximization (EM) algorithm (Ma and Xu 2005; Do and Batzoglou 2008; McLachlan and Krishnan 2007). In Sect. 1.4, we shall see that EM is a powerful and elegant method to find the maximum likelihood solutions for models with latent variables.

1.1.1 Density Model with Latent Variables

Density estimation is a major task in machine learning. In general, the most commonly used method is Maximum Likelihood Estimation (MLE). In this way, we can establish a likelihood function L(μ, Σ) = Σ_{n=1}^{N} ln p(y_n | μ, Σ). However, directly maximizing this likelihood function runs into computational difficulties because of the very high dimensionality of Σ. Thus, a set of variables x is defined to govern multiple y, and once the distribution p(x) is found, p(y) can be determined by the joint distribution over y and x; typically the covariance Σ is ruled out. In this setting, x is assumed to affect the manifest variables (observable variables), but it is not directly observable. Thus, x is the so-called latent variable (Loehlin 1998). Importantly, the introduction of latent variables allows the formation of complicated distributions from simpler components.
(Notation: y_n, n = 1, …, N, denotes the observed data; μ the mean; Σ the covariance.)
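The interplay of MLE, latent variables and EM described above can be made concrete on the simplest latent-variable density model: a two-component 1-D Gaussian mixture, where the latent variable x_n is the unobserved component that generated y_n. The following is a minimal sketch (not taken from the chapter); `em_gmm`, the initialisation, and the toy data are illustrative choices.

```python
import math
import random

def gauss_pdf(y, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-(y - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(data, iters=50):
    """EM for a two-component 1-D Gaussian mixture.

    The E-step computes the posterior of the latent component for each point
    (its 'responsibility'); the M-step re-estimates weights, means and
    variances from those responsibilities, increasing the log-likelihood."""
    mu = [min(data), max(data)]  # crude initialisation from the data range
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities resp[n][k] = p(x_n = k | y_n)
        resp = []
        for y in data:
            p = [w[k] * gauss_pdf(y, mu[k], var[k]) for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: maximise the expected complete-data log-likelihood
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(data)
            mu[k] = sum(r[k] * y for r, y in zip(resp, data)) / nk
            var[k] = sum(r[k] * (y - mu[k]) ** 2
                         for r, y in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
    return w, mu, var

# two well-separated clusters around 0 and 10
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)] + \
       [random.gauss(10.0, 1.0) for _ in range(200)]
w, mu, var = em_gmm(data)
```

Even this toy example shows the point made in the text: marginalizing the latent component yields a bimodal density that no single Gaussian could represent, built from two simple components.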
