Speech_Synthesis_Paul_Taylor.pdf

Points/C-coins required: 17 · 2017-07-28 19:01:20 · 4.49 MB · PDF

Speech Synthesis Paul Taylor.
Summary of Contents

1 Introduction
2 Communication and Language
3 The Text-to-Speech Problem
4 Text Segmentation and Organisation
5 Text Decoding
6 Prosody Prediction from Text
7 Phonetics and Phonology
8 Pronunciation
9 Synthesis of Prosody
10 Signals and Filters
11 Acoustic Models of Speech Production
12 Analysis of Speech Signals
13 Synthesis Techniques based on Vocal Tract Models
14 Synthesis by Concatenation and Signal Processing Modification
15 Hidden Markov Model Synthesis
16 Unit Selection Synthesis
17 Further Issues
18 Conclusions

Preface

1 Introduction
1.1 What are text-to-speech systems for?
1.2 What should the goals of text-to-speech system development be?
1.3 The Engineering Approach
1.4 Overview of the book
1.4.1 Viewpoints within the book
1.4.2 Readers' backgrounds
1.4.3 Background and specialist sections

2 Communication and Language
2.1 Types of communication
2.1.1 Affective communication
2.1.2 Iconic communication
2.1.3 Symbolic communication
2.1.4 Combinations of symbols
2.1.5 Meaning, form and signal
2.2 Human communication
2.2.1 Verbal communication
2.2.2 Linguistic levels
2.2.3 Affective prosody
2.2.4 Augmentative Prosody
2.3 Communication processes
2.3.1 Communication factors
2.3.2 Generation
2.3.3 Encoding
2.3.4 Decoding
2.3.5 Understanding
2.4 Discussion
2.5 Summary

3 The Text-to-Speech Problem
3.1 Speech and Writing
3.1.1 Physical nature
3.1.2 Spoken form and written form
3.1.3 Use
3.1.4 Prosodic and verbal content
3.1.5 Component balance
3.1.6 Non-linguistic content
3.1.7 Semiotic systems
3.1.8 Writing Systems
3.2 Reading aloud
3.2.1 Reading silently and reading aloud
3.2.2 Prosody in reading aloud
3.2.3 Verbal content and style in reading aloud
3.3 Text-to-speech system organisation
3.3.1 The Common Form model
3.3.2 Other models
3.3.3 Comparison
3.4 Systems
3.4.1 A Simple text-to-speech system
3.4.2 Concept to speech
3.4.3 Canned Speech and Limited Domain Synthesis
3.5 Key problems in Text-to-speech
3.5.1 Text classification with respect to semiotic systems
3.5.2 Decoding natural language text
3.5.3 Naturalness
3.5.4 Intelligibility …
3.5.5 Auxiliary generation for prosody
3.5.6 Adapting the system to the situation
3.6 Summary

4 Text Segmentation and Organisation
4.1 Overview of the problem
4.2 Words and Sentences
4.2.1 What is a word?
4.2.2 Defining words in text-to-speech
4.2.3 Scope and morphology
4.2.4 Contractions and Clitics
4.2.5 Slang fo…
4.2.6 Hyphenated forms
4.2.7 What is a sentence?
4.2.8 The le…
4.3 Text Segmentation
4.3.1 Tokenisation
4.3.2 Tokenisation and punctuation
4.3.3 Tokenisation Algorithms
4.3.4 Sentence Splitting
4.4 Processing documents
4.4.1 Markup language
4.4.2 Interpreting ch…
4.5 Text-to-Speech Architectures
4.6 Discussion
4.6.1 Further Reading
4.6.2 Summary

5 Text Decoding: Finding the words from the text
5.1 Overview of Text decoding
5.2 Text Classification Algorithms
5.2.1 Features and algorithms
5.2.2 Tagging and word sense disambiguation
5.2.3 Ad-hoc approaches
5.2.4 Deterministic rule approaches
5.2.5 Decision lists
5.2.6 Naive Bayes Classifier
5.2.7 Decision trees
5.2.8 Part-of-speech Tagging
5.3 Non-Natural Language Text
5.3.1 Semiotic Classification
5.3.2 Semiotic Decoding
5.3.3 Verbalisation
5.4 Natural Language Text
5.4.1 Acronyms and letter sequences
5.4.2 Homograph disambiguation
5.4.3 Non-homographs
5.5 Natural Language parsing
5.5.1 Context Free Grammars
5.5.2 Statistical Parsing
5.6 Discussion
5.6.1 Further reading
5.6.2 Summary

6 Prosody Prediction from Text
6.1 Prosodic form
6.2 Phrasing
6.2.1 Phrasing Phenomena
6.2.2 Models of Phrasing
6.3 Prominence
6.3.1 Syntactic prominence patterns
6.3.2 Discourse prominence patterns
6.3.3 Prominence systems, data and labelling
6.4 Intonation and tune
6.5 Prosodic meaning and Function
6.5.1 Affective Prosody
6.5.2 Suprasegmental
6.5.3 Augmentative Prosody
6.5.4 Symbolic communication and prosodic style
6.6 Determining Prosody from the Text
6.6.1 Prosody and human reading
6.6.2 Controlling the degree of augmentative prosody
6.6.3 Prosody and synthesis techniques
6.7 Phrasing prediction
6.7.1 Experimental formulation
6.7.2 Deterministic approaches
6.7.3 Classifier approaches
6.7.4 HMM approaches
6.7.5 Hybrid approaches
6.8 Prominence prediction
6.8.1 Compound noun phrases
6.8.2 Function word prominence
6.8.3 Data driven approaches
6.9 Intonational Tune Prediction
6.10 Discussion
6.10.1 Labelling schemes and labelling accuracy
6.10.2 Linguistic theories and prosody
6.10.3 Synthesising suprasegmental and true prosody
6.10.4 Prosody in real dialogues
6.10.5 Conclusion
6.10.6 Summary

7 Phonetics and Phonology
7.1 Articulatory phonetics and speech production
7.1.1 The vocal organs
7.1.2 Sound sources
7.1.3 Sound output
7.1.4 The vocal tract filter
7.1.5 Vowels
7.1.6 Consonants
7.1.7 Examining speech production
7.2 Acoustics phonetics and speech perception
7.2.1 Acoustic representations
7.2.2 Acoustic characteristics
7.3 The communicative use of speech
7.3.1 Communicating discrete information with a continuous channel
7.3.2 Phonemes, phones and allophones
7.3.3 Allophonic variation and phonetic context
7.3.4 Coarticulation, targets and transients
7.3.5 The continuous nature of speech
7.3.6 Transcription
7.3.7 The distinctiveness of speech in communication
7.4 Phonology: the linguisti…
7.4.1 Phonotactics
7.4.2 Word formation
7.4.3 Distinctive Features and Phonological Theories
7.4.4 Syllables
7.4.5 Lexical Stress
7.5 Discussion
7.5.1 Further reading
7.5.2 Summary

8 Pronunciation
8.1 Pronunciation representations
8.1.1 Why bother?
8.1.2 Phonemic and phonetic input
8.1.3 Difficulties in deriving phonetic input
8.1.4 A Structured approach to pronunciation
8.1.5 Abstract phonological representations
8.2 Formulating a phonological representation system
8.2.1 Simple consonants and vowels
8.2.2 Difficult consonants
8.2.3 Diphthongs and affricates
8.2.4 Approximant-vowel combinations
8.2.5 Defining the full inventory
8.2.6 Phoneme names
8.2.7 Syllabic issues
8.3 The Lexicon
8.3.1 L… d rule…
8.3.2 Lexicon formats
8.3.3 The offline lexicon
8.3.4 The system lexicon
8.3.5 Lexicon quality
8.3.6 Determining the pronunciation of unknown words
8.4 Grapheme-to-Phoneme Conversion
8.4.1 Rule based techniques
8.4.2 Grapheme to phoneme alignment
8.4.3 Neural networks
8.4.4 Pronunciation by analogy
8.4.5 Other data driven techniques
8.4.6 Statistical Techniques
8.5 Further Issues
8.5.1 Morphology
8.5.2 Language origin and names
8.5.3 Post-lexical processing
8.6 Summary

9 Synthesis of Prosody
9.1 Intonation Overview
9.1.1 F0 and pitch
9.1.2 Intonational fo…
9.1.3 Models of F0 contours
9.1.4 Mi…
9.2 Intonational behaviour
9.2.1 Intonational tune
9.2.2 Downdrift
9.2.3 Pitch Range
9.2.4 Pitch Accents and boundary tones
9.3 Intonation Theories and models
9.3.1 Traditional models and the British school
9.3.2 The Dutch school
9.3.3 Autosegmental-Metrical and ToBI models
9.3.4 The INTSINT Model
9.3.5 The Fujisaki model and Superimpositional models
9.3.6 The Tilt model
9.3.7 Comparison
9.4 Intonation Synthesis with AM models
9.4.1 Prediction of AM labels from text
9.4.2 Deterministic synthesis method
9.4.3 Data Driven synthesis methods
9.4.4 Analysis with Autosegmental models
9.5 Intonation Synthesis with Deterministic Acoustic Models
9.5.1 Synthesis with superimpositional models
9.5.2 Synthesis with the Tilt model
9.5.3 Analysis with Fujisaki and Tilt models
9.6 Data Driven Intonation models
9.6.1 Unit selection style approaches
9.6.2 Dynamic System Models
9.6.3 Hidden Markov models
9.6.4 Functional model
