Unsupervised language identification based on Latent Dirichlet Allocation

W. Zhang et al. / Computer Speech and Language 39 (2016) 47-66

Note that (1) and (3) concern the representation of our data, namely what feature space can express the documents for training and for identifying the language present. Word-level features, used by, for example, Grefenstette (1995) and Rehurek and Kolkus (2009), have drawbacks in this respect. The more appropriate alternative is to create a language n-gram model based on characters (Cavnar and Trenkle, 1994) or encoded bytes (Dunning, 1994). With n-grams where n < 3, linguistic phenomena with granularity smaller than words can usually be characterized, a relatively small training corpus can be used, and training converges faster (Souter et al., 1994). According to Chen and Goodman (1996), Zhai and Lafferty (2001), and Zhai and Lafferty (2004), among many others, there is empirical evidence that smoothing or interpolation yields better results in Natural Language Processing. Smoothing a language model serves to better estimate the true probabilities and avoid bias, because the training set never contains sufficient data to estimate the probabilities of the languages accurately. However, the need for smoothing, pruning or interpolation makes generating n-gram features time-consuming. In this paper we will see that even raw n-gram counts can be used as features for effective language identification, as long as an appropriate learning algorithm is employed to provide the smoothing (via the Dirichlet parameters alpha and beta in our case). Conversely, (2) means that the local linkage information of words (or n-grams) may lead Chinese-Whispers-based algorithms to fail to filter out mixed-language short sentences, since words from other languages add incorrect linkages between language clusters.
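As a concrete illustration of the character n-gram features discussed above, the following minimal sketch (our own illustration, not the authors' implementation; the function name and the `^`/`$` boundary markers are assumptions) extracts raw character n-gram counts from a single sentence:

```python
from collections import Counter

def char_ngram_counts(sentence, n_max=3):
    """Raw character n-gram counts for one document (sentence).

    Sentence boundaries are marked with '^' and '$' so that n-grams
    anchored at the start/end of the sentence are captured too.
    """
    text = "^" + sentence + "$"
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

counts = char_ngram_counts("the cat", n_max=2)
print(counts["th"])   # count of the bigram "th"
print(counts["t"])    # count of the unigram "t"
```

No smoothing or pruning is applied here; as the paragraph above argues, the raw counts are left as-is and the smoothing is delegated to the learning algorithm.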
We show that an appropriately structured probabilistic model, which utilizes global statistical information, is able to avoid these local incorrect interventions.

This paper presents an unsupervised language identification method using raw n-gram counts as features and a reformulated Latent Dirichlet Allocation (LDA) topic model (Blei et al., 2003). This approach is tested on the ECI/MCI benchmark as a language identification tool, and on other data as a filtering tool. Additionally, we compare four kinds of measure, and also the Hierarchical Dirichlet Process (HDP), on several configurations of the ECI/MCI benchmark to determine the number of languages present.

The outline of the rest of this paper is as follows: in Section 3, we conduct a detailed analysis of the problem we are addressing and propose the Language Identification model based on Latent Dirichlet Allocation (LDA-LI). In Section 4, we introduce the feature space, training and inference algorithms of LDA-LI. In Section 4 we also briefly evaluate four kinds of measure and the HDP, comparing their differences in finding a suitable topic number (number of clusters). In Section 5, we present experimental results, including analyses using the ECI/MCI benchmark and a Wikipedia-based Swahili corpus. Finally, Section 6 draws conclusions and proposes future directions.

3. Towards an unsupervised approach

3.1. The Latent Dirichlet Allocation model

To be able to identify language in an unsupervised fashion we adopt and adapt a model from the field of topic modelling. Here, a topic model is a statistical model for discovering the abstract topics that occur in a collection of documents. Papadimitriou et al. (1998) give a probabilistic analysis of Latent Semantic Indexing, which is seen as an early topic model. Almost at the same time, Hofmann (1999) proposed probabilistic Latent Semantic Indexing (PLSI) with a suitable EM learning algorithm.
The most common topic model currently in use is the generalization of PLSI into Latent Dirichlet Allocation, which allows documents to contain a mixture of topics, developed by Blei et al. (2003). In LDA, each document is modelled as a mixture of K latent topics, where each topic k is a multinomial distribution φ_k over a W-word vocabulary. For any document j, its topic mixture θ_j is a probability distribution drawn from a Dirichlet prior with parameter α. For each ith word x_ij in j, a topic z_ij = k is drawn from θ_j, and x_ij is drawn from φ_k. The generative process for LDA is thus given by

θ_j ~ Dir(α),  φ_k ~ Dir(β),  z_ij = k | θ_j ~ Mult(θ_j),  x_ij | φ_{z_ij} ~ Mult(φ_{z_ij}),

where Dir(·) denotes the Dirichlet distribution and Mult(·) the multinomial distribution.

Fig. 1. Bayesian network of LDA.

LDA is a kind of Bayesian hierarchical model; its graphical model is illustrated in Fig. 1, where the observed variables, that is, the words x_ij and the hyperparameters α and β, are shaded. For a more detailed description of LDA, see Blei et al. (2003).

The LDA model has three merits. The first is exchangeability: according to Blei et al. (2003), topics are conditionally independent and identically distributed within a fixed document, which means the topics are infinitely exchangeable within the document. The second is that the Dirichlet distribution is conjugate to the multinomial distribution. The exchangeability, together with the conjugacy between the Dirichlet and the multinomial, makes the learning algorithm relatively simple, as pseudo-counts can be used directly by Expectation Maximization (Blei et al., 2003) or Gibbs sampling. In Section 4 our implementation uses Collapsed Gibbs Sampling (Griffiths and Steyvers, 2004). The third merit of LDA is that it inherently provides some degree of automatic smoothing. Asuncion et al. (2009) point out that LDA is a flexible latent variable framework for modelling sparse data in extremely high-dimensional spaces.
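The generative process above can be sketched directly in code. The following toy simulation is our own illustration (the dimensions and the Poisson document-length assumption are arbitrary choices, not from the paper); it draws a small corpus from the LDA model:

```python
import numpy as np

rng = np.random.default_rng(0)

K, W, D = 3, 50, 5          # topics, vocabulary size, documents (toy sizes)
alpha, beta = 0.1, 0.01     # symmetric Dirichlet hyperparameters

# phi_k ~ Dir(beta): one word distribution per topic
phi = rng.dirichlet(np.full(W, beta), size=K)

docs = []
for j in range(D):
    theta_j = rng.dirichlet(np.full(K, alpha))          # topic mixture theta_j ~ Dir(alpha)
    N_j = rng.poisson(20) + 1                           # document length (toy assumption)
    z = rng.choice(K, size=N_j, p=theta_j)              # z_ij ~ Mult(theta_j)
    x = np.array([rng.choice(W, p=phi[k]) for k in z])  # x_ij ~ Mult(phi_{z_ij})
    docs.append(x)
```

With a small alpha each toy document is dominated by one or two topics, which is exactly the behaviour LDA-LI later exploits for (mostly) monolingual sentences.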
Even with the default hyperparameter settings of those learning algorithms, LDA can smooth the sparse count data and perform inference on unseen data (Blei et al., 2003). Thus in Section 4.2 we simply use the raw n-gram counts as features. These counts range from unigrams to 5-grams, so they include both lower-order and higher-order n-grams.

3.2. LDA for language identification

The basic idea behind traditional LDA is that documents are represented as random mixtures over latent topics, where each topic is characterised by a distribution over words (Blei et al., 2003). To adapt this to language identification, we consider documents represented as random mixtures over latent languages, where each language is characterised by a distribution over letter n-gram counts. In this way, the document-language and language-n-gram hierarchies can similarly be modelled by LDA for language identification; we call this approach LDA-LI for short. Fig. 2 gives the pseudo-code of the generative LDA-LI model.

During the inference phase of LDA-LI, to classify a document as a given language, either the most probable language can be chosen, or a threshold can be set and multiple languages can be assigned to individual documents. Additionally, the formulation of the model places no restrictions on the length of the document, and it is able to classify very short documents, i.e., individual short sentences. As our speech synthesis work currently builds systems in a single language at a time, this paper explores the result of assuming that we are only interested in sentences that are purely of one language, and investigates whether we can identify these sentences appropriately. However, there is plenty of scope for using this model in scenarios that do not make this assumption.
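The two inference-time decision rules described above (choosing the most probable language versus thresholding the mixture) can be sketched as follows; the helper is our own illustration, operating on a document's inferred mixture θ_j:

```python
import numpy as np

def assign_languages(theta_j, threshold=None):
    """Assign language(s) to one document from its inferred mixture theta_j.

    With threshold=None the single most probable language is returned;
    otherwise every language whose posterior share exceeds the threshold.
    """
    theta_j = np.asarray(theta_j)
    if threshold is None:
        return [int(theta_j.argmax())]
    return [k for k, p in enumerate(theta_j) if p > threshold]

print(assign_languages([0.7, 0.2, 0.1]))         # -> [0]
print(assign_languages([0.55, 0.4, 0.05], 0.3))  # -> [0, 1]
```

The thresholded form is what a mixed-language filtering scenario would use; the single-language form matches the pure-sentence assumption made in this paper.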
// Language plate
for all languages k ∈ [1, K] do
    sample components φ_k ~ Dir(β)
end
// Document plate
for all documents j ∈ [1, D] do
    sample mixture proportion θ_j ~ Dir(α)
    // the document length is N_j
    for all n-grams i ∈ [1, N_j] in document j do
        sample z_ij ~ Mult(θ_j)
        sample n-gram x_ij ~ Mult(φ_{z_ij})
    end
end

Fig. 2. Generative model of LDA-LI.

4. Algorithm and model selection

4.1. Gibbs sampling and inference

The learning algorithm in this paper is based on Collapsed Gibbs Sampling (CGS) (Griffiths and Steyvers, 2004), a Markov-chain Monte Carlo method. The model parameter Φ = {φ_k ~ Dir(β)}, the set of topic distributions, can be integrated out using the Dirichlet-multinomial conjugacy. The posterior distribution p(z|w) can then be estimated using the Collapsed Gibbs Sampling algorithm, which, in each iteration, updates each topic assignment z_ij ∈ z by sampling from the full conditional posterior distribution

p(z_ij = k | z^{-ij}, w) ∝ (C^{word}_{kw} + β) / (Σ_{w'=1}^{W} C^{word}_{kw'} + Wβ) · (C^{doc}_{jk} + α),   (1)

where k ∈ [1, K] is a topic, w ∈ [1, W] is a word in the vocabulary, x_ij denotes the ith word in document j and z_ij the topic assigned to x_ij; w^{-ij} denotes the words in the corpus with x_ij excluded and z^{-ij} the corresponding topic assignments. In addition, C^{word}_{kw} denotes the number of times that word w is assigned to topic k, not including the current instance x_ij and z_ij, and C^{doc}_{jk} the number of times that topic k has occurred in document j, not including x_ij and z_ij. Whenever z_ij is assigned a sample drawn from (1), the matrices C^{word} and C^{doc} are updated. After enough sampling iterations to burn in the Markov chain, Θ = {θ_j}_{j=1}^{D} and Φ = {φ_k}_{k=1}^{K} can be estimated by

θ_jk = (C^{doc}_{jk} + α) / (Σ_{k'=1}^{K} C^{doc}_{jk'} + Kα),   (2)

φ_kw = (C^{word}_{kw} + β) / (Σ_{w'=1}^{W} C^{word}_{kw'} + Wβ).   (3)

From Eqs. (2) and (3), we see that CGS learning and inference amount to a kind of pseudo-count over the original corpus. The implementation of CGS used in this paper is based upon the implementation of Wang et al. (2009) using a MapReduce parallel framework, with efficiency improvements by Liu et al. (2011) using a Message Passing Interface.¹

¹ https://code.google.com/p/plda/

Sometimes we are required to actually identify the individual languages present (for example, when evaluating the model with a test set and calculating precision and recall). For each language (topic) cluster, we examine the sentence assigned to that language with the highest probability, manually determine which language it is, and then label all the sentences assigned to this language cluster as being of that language. Merging languages is also performed manually if there are more language clusters than actual languages known to be present. This strategy is used throughout our experiments, which means use of LDA-LI reduces the need for annotation from hundreds or thousands of sentences to a few representative ones for the given languages. This classifies each sentence as being of an actual language, which can then be evaluated as correct or not in experiments where ground truth is known.

4.2. Feature space and sample representation

A corpus is first converted into samples by considering each individual sentence a document. These documents are then converted into character-based n-gram counts (tokens for spaces, and beginning- and end-of-sentence markers, are included for each document). Cavnar and Trenkle (1994) show that for supervised learning n < 3 is a sufficient n-gram length. We, however, found improved performance with our unsupervised method when we included n-grams with n in the range 1-5, which we believe allows us to capture more information across both short and long contexts. We did attempt to build models using n > 5, but they proved computationally impractical to train. Due to the smoothing ability of LDA with large sparse data, as discussed in Section 3.1, we are able to use the raw n-gram counts. In practice the smoothing and pruning are realised by the hyperparameters α and β, which are configured with their default small values (<1) suggested by Liu et al. (2011).

4.3. Model selection and topic number

An important issue with LDA topic modelling is how to determine that an adequate number of individual topics is being modelled. In most cases (Blei et al., 2003; Griffiths and Steyvers, 2004; Newman et al., 2009; Wallach et al., 2009; Grün and Hornik, 2011), perplexity is used to evaluate the resulting model on held-out data. In our experiments (see Section 5), we found that the perplexity always decreases as the number of languages is increased, as shown in Fig. 9 of Blei et al. (2003) and Fig. 10 of Newman et al. (2009); this continues beyond the point where the number of languages in the model is larger than the actual number of different languages in the data. In fact, the perplexity 2^{H(p,q)} is another form of the cross-entropy over the test set,

H(p, q) = -Σ_v p_v log q_v = H(p) + KL(p‖q),   (4)

where p_v is the probability of each word v, estimated by p_v = n_v / Σ_{v'} n_{v'} on the test set, and q_v is the probability of each word v computed by the LDA model, q_v = (1/D) Σ_{j=1}^{D} Σ_{k=1}^{K} θ_{jk} φ_{kv}. H(p) denotes the entropy of p and KL(p‖q) is the Kullback-Leibler divergence (the KL divergence, or relative entropy) of q from p. From this analysis, it can be seen that there is no explicit penalty term on the language number in Eq. (4), which means perplexity is not biased towards minimising the number of languages. Furthermore, even if a model achieves p_v = q_v on every word, so that KL(p‖q) = 0, it is by no means the best model, because H(p, q) = H(p) is not the minimum of the real underlying probability distribution; it is just a biased estimate of H(p) on a limited test set.
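A compact, single-machine sketch of the collapsed Gibbs sampler of Eqs. (1)-(3) is given below for reference. This is our own illustration, not the MapReduce/MPI implementation of Wang et al. (2009) and Liu et al. (2011):

```python
import numpy as np

def collapsed_gibbs(docs, K, W, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampler for LDA.

    docs: list of arrays of token ids in [0, W). Returns point estimates
    (theta, phi) computed from the counts as in Eqs. (2) and (3).
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    C_word = np.zeros((K, W))                      # topic-word counts
    C_doc = np.zeros((D, K))                       # document-topic counts
    z = [rng.integers(K, size=len(d)) for d in docs]
    for j, d in enumerate(docs):                   # initialise counts
        for i, w in enumerate(d):
            C_word[z[j][i], w] += 1
            C_doc[j, z[j][i]] += 1
    for _ in range(iters):
        for j, d in enumerate(docs):
            for i, w in enumerate(d):
                k_old = z[j][i]                    # remove current assignment
                C_word[k_old, w] -= 1
                C_doc[j, k_old] -= 1
                # full conditional p(z_ij = k | z^{-ij}, w), Eq. (1)
                p = (C_word[:, w] + beta) / (C_word.sum(axis=1) + W * beta) * (C_doc[j] + alpha)
                k_new = rng.choice(K, p=p / p.sum())
                z[j][i] = k_new                    # add the new assignment
                C_word[k_new, w] += 1
                C_doc[j, k_new] += 1
    theta = (C_doc + alpha) / (C_doc.sum(axis=1, keepdims=True) + K * alpha)   # Eq. (2)
    phi = (C_word + beta) / (C_word.sum(axis=1, keepdims=True) + W * beta)     # Eq. (3)
    return theta, phi
```

On toy data whose documents use disjoint vocabularies, the sampler separates the documents into distinct topics, mirroring how LDA-LI separates languages over n-gram vocabularies.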
This implies that perplexity may not be the best measure for finding the smallest topic number that does not significantly degrade performance.

Another way to find the correct number of languages is to use the Hierarchical Dirichlet Process (HDP) (Teh et al., 2006) instead of LDA, to hierarchically cluster the documents into languages and thus automatically find the number of languages. However, here we find that HDP tends to choose far more languages than the measure of Arun et al. (2010) applied to LDA (see Table 2 of Section 5.3). The HDP behaviour is more like a language hashing process than one that can find the minimal language number appropriate for a given dataset. This means HDP is not suitable for finding the minimum topic number, since its split-merge process is recursively run on sub-topics.

In Cao et al. (2009), the standard cosine distance (similarity) is used to measure the correlation between topics:

corre(φ_i, φ_j) = Σ_{v=1}^{W} φ_{iv} φ_{jv} / (√(Σ_{v=1}^{W} φ_{iv}²) · √(Σ_{v=1}^{W} φ_{jv}²)),   (5)

where i, j ∈ [1, K] and v ∈ [1, W]. When corre(φ_i, φ_j) is smaller, the topics are more independent. The average cosine distance between every pair of topics is used to measure the stability of a topic structure:

ave_dis(structure) = Σ_{i=1}^{K-1} Σ_{j=i+1}^{K} corre(φ_i, φ_j) / (K(K-1)/2).   (6)

A smaller ave_dis indicates that the structure is more stable, so ave_dis is minimised. However, in some situations more topics always reduce ave_dis even though they are unnecessary, as we saw with perplexity. This happens in our experiments; see Section 5.3 and Fig. 8(a) for detail. Alternatively, Zavitsanos et al. (2008) use KL divergence instead of cosine similarity. Here, we revise this notion by replacing the term from Eq. (5) in Eq. (6) with the symmetric KL divergence, since it is more reasonable that Eq. (6) should average the correlation between pairs of topics.
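The pairwise measures above can be sketched together. This is our own illustration of Eqs. (5) and (6) and of the symmetric-KL replacement of the cosine term; the "symkl" branch assumes strictly positive topic rows:

```python
import numpy as np

def avg_pairwise(phi, measure="cosine"):
    """Average pairwise correlation/divergence over the topic rows of phi.

    measure="cosine" follows Cao et al. (2009), Eqs. (5)-(6);
    measure="symkl" replaces the cosine term with symmetric KL divergence
    (assumes every entry of phi is strictly positive).
    """
    K = phi.shape[0]
    total = 0.0
    for i in range(K - 1):
        for j in range(i + 1, K):
            p, q = phi[i], phi[j]
            if measure == "cosine":
                total += p @ q / (np.linalg.norm(p) * np.linalg.norm(q))
            else:
                total += np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))
    return total / (K * (K - 1) / 2)
```

With the cosine measure the quantity is minimised; with the symmetric-KL variant it is maximised, since well-separated language distributions should diverge maximally.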
Now we can use this to find the language number with the maximal average KL divergence, since the distributions of the word vectors of different languages should have maximal average divergence when the correct number of languages is selected; either decreasing or increasing the number of languages will then reduce the difference (in divergence). However, we find this measure is not sufficiently sensitive to the language number, and its value hardly changes with the language number in some situations, as confirmed by Fig. 8(b). The disadvantage of both Cao et al. (2009) and Zavitsanos et al. (2008) is that they only consider the information in the stochastic language-letter-n-gram matrix and ignore the sentence-language matrix.

Arun et al. (2010) view LDA as a matrix factorization mechanism, where a given corpus C is split into two matrix factors given by

C_{D×W} = Θ_{D×K} · Φ_{K×W},  Θ_{D×K} = [θ_1, θ_2, ..., θ_D]ᵀ,  Φ_{K×W} = [φ_1, φ_2, ..., φ_K]ᵀ,

where D is the number of documents present in the corpus and W is the size of the vocabulary as mentioned in Section 4.1. The quality of the split depends on K, the right number of topics chosen. This measure is computed in terms of the symmetric KL divergence of salient distributions that are derived from these matrix factors. Θ_{D×K} is the document-topic (sentence-language) matrix, while Φ_{K×W} is the topic-word (language-n-gram) matrix. Note that here {θ_j}_{j=1}^{D} and {φ_k}_{k=1}^{K} are different from Section 4.1, being just the numerators of Eqs. (2) and (3). So it is clear that

Σ_{v=1}^{W} φ(k, v) = Σ_{d=1}^{D} θ(d, k),  k ∈ [1, K],   (7)

where φ(k, v) is the kth-row, vth-column element of matrix Φ, and likewise for θ(d, k). Eq. (7) is the number of words assigned to each topic looked at in two different ways: one as a row sum over words and the other as a column sum over documents. However, when both these matrices are row-normalized (as done by LDA), this equality no longer holds. This is the reason only the numerators of Eqs. (2) and (3) are used.

They propose the symmetric KL divergence of C_Φ and C_Θ,

symKL = KL(C_Φ ‖ C_Θ) + KL(C_Θ ‖ C_Φ),   (8)

where C_Φ is the distribution of singular values of the topic-word matrix Φ, and C_Θ is the distribution obtained by normalizing the vector L · Θ (L is the 1×D vector of lengths of each document in the corpus and Θ is the document-topic matrix). Both the distributions C_Φ and C_Θ are sorted, so that the corresponding topic components are expected to match. With the measure in Eq. (8), they find that the divergence between the C_Φ distribution and the C_Θ distribution initially decreases, then starts to increase once the right number of topics is reached. In Section 5.4, we will see the effectiveness of symKL, also compared with the other measures mentioned above.

5. Experimental evaluation

It is difficult to fairly evaluate the LDA-LI model directly against other supervised language identification models, partly due to the bias of the tasks that they are designed to perform and partly due to the supervised-unsupervised difference. Therefore, results comparing systems should not be treated judgementally to say one system is better than another; rather, they should be used to understand how the models behave differently.

Experiments 1 and 2 evaluate LDA-LI in this way as a general language identification model. Here we performed experiments either using LDA-LI as an unsupervised learner (Experiment 1) or as an unsupervised clusterer (Experiment 2), and compared our model to other approaches using ECI/MCI, a benchmark corpus for language identification studies (Armstrong-Warwick et al., 1994).

We then perform a number of experiments to further evaluate LDA-LI in the way we intend to use it. In Experiment 3, we compared the measures related to the topic number mentioned in Section 4.3.
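The symKL measure of Eq. (8) can be sketched as follows. This is our own reading of Arun et al. (2010): theta and phi are the matrix factors described above, and the small eps guard is our addition for numerical safety:

```python
import numpy as np

def arun_symkl(theta, phi, doc_lengths):
    """Symmetric KL divergence (Eq. (8)) between the singular-value
    distribution of phi and the length-weighted topic distribution L * theta.

    Both distributions are sorted (descending) so that corresponding
    topic components are expected to match.
    """
    c_phi = np.linalg.svd(phi, compute_uv=False)       # K singular values
    c_phi = np.sort(c_phi / c_phi.sum())[::-1]
    c_theta = np.asarray(doc_lengths, dtype=float) @ theta
    c_theta = np.sort(c_theta / c_theta.sum())[::-1]
    eps = 1e-12                                        # numerical guard (our addition)
    return (np.sum(c_phi * np.log((c_phi + eps) / (c_theta + eps)))
            + np.sum(c_theta * np.log((c_theta + eps) / (c_phi + eps))))
```

Computing this quantity for a range of K and picking the K at the minimum is the model-selection recipe evaluated in Section 5.4.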
We also filtered either the ECI/MCI dataset (Experiment 4) or a corpus of Swahili from Wikipedia (Experiment 5) as tests on found data where a language was mixed with unknown other languages. Finally, in Experiment 6, we compared the performance of LDA-LI to that of a more traditional LDA word-feature model, to show that letter n-gram counts are a much better representation than words for this task.

As mentioned in Section 4.1, to actually calculate precision and recall, for each topic modelled by LDA-LI we examined the first sentence assigned to that topic (that with the highest probability, as computed and output by the LDA approach via Eq. (2) in Section 4.1), manually identified its language, and then labelled all the other sentences assigned to that topic as being of that same language. We then, where necessary, manually merged topics assigned the same language. This strategy was used throughout our experiments.

5.1. Experiment 1: LDA-LI as pseudo-supervised learning

Experiments 1 and 2 used nine languages³ with the same configurations as Takci and Gungor (2012). First, in Experiment 1, we tested the hypothesis that our system was equal to supervised systems trained with the same limited data (actually, for LDA-LI we consider this pseudo-training, as no use was made of the language labels). We performed 10-fold Cross Validation (CV) training and testing with the same configuration used by Takci and Gungor (2012) and calculated precision, recall and F-score. We compared our LDA-LI model to three available existing methods of language identification: the langid tool⁴ of Lui and Baldwin (2012), the guess_language tool⁵ (the current version of TextCat; Cavnar and Trenkle, 1994) and the ICF of Takci and Gungor (2012).

We first prepared the ECI/MCI data² for 10-fold Cross Validation with the same configuration as Takci and Gungor (2012). With this we then trained and tested langid, guess_language and our LDA-LI models.
Note that for the LDA-LI model this is pseudo-training, as we did not use the language classes present. The average precision, recall and F-scores were then compared (the precision, recall and F-score were first averaged across the 10 CV folds, then averaged according to the language-ratio configuration of Takci and Gungor (2012)). We also include the results reported by Takci and Gungor (2012) for their ICF model, which used the same configuration. Fig. 3(a), (b) and Table 1 show that our LDA-LI system, though unsupervised, generally compared favourably in performance with the supervised systems. Note that we present the LDA-LI model with 16 topic classes, where the symmetric KL divergence reaches its minimum. This is discussed further in Experiment 3.

Fig. 3. Precisions and recalls on 10-fold CV: (a) average precisions; (b) average recalls.

Table 1
Performance over 10-fold CV. Note: the ICF method is from Takci and Gungor (2012), guess_language is from Cavnar and Trenkle (1994), and langid is from Lui and Baldwin (2012).

Method          Sentence length of test                       Precision  Recall   F-score
LDA-LI          Max 1297, min 10, average 81.65 characters    95.71%     95.03%   95.35%
langid          Max 1297, min 10, average 81.65 characters    95.71%     96.00%   95.73%
guess_language  Max 1297, min 10, average 81.65 characters    99.27%     95.00%   97.05%
ICF             100 characters                                97.10%     97.50%   97.30%

² http://www.elsnet.org/eci.html
³ Dutch 291K, English 108K, French 108K, German 171K, Italian 99K, Portuguese 107K, Spanish 107K, Swedish 91K, and Turkish 109K.
⁴ https://github.com/saffsd/langid.py
⁵ https://bitbucket.org/spirit/guess_language/overview

5.2. Experiment 2: LDA-LI as clustering

One criticism of Experiment 1 is that it is not fair on the supervised systems, as they are designed to be trained on larger data sets. We address this with Experiment 2, in which we tested the hypothesis that our system, as an unsupervised algorithm, would perform as well as supervised systems in the more real-world situation where the supervised systems are fully trained on additional data. We used our LDA-LI system to cluster the ECI/MCI data in an unsupervised fashion and compared it to the other systems fully trained on their wider variety of training data.

We evaluated on the full ECI/MCI subset used above; the default versions of langid and guess_language were taken pre-trained on their very large corpora, while LDA-LI ran solely on the ECI/MCI data as a standard clustering algorithm without additional language knowledge. The trade-off here is that our LDA-LI system was solely pseudo-trained on in-domain data, whereas the other systems were trained on substantially more (including in-domain) data. We justify this comparison as it reflects how each of these systems would be used in practice. Here the LDA-LI model was used with 16 topic classes, and evaluation was performed using 10-fold CV as described in Experiment 1.

Fig. 4(a) and (b) show that our LDA-LI system generally outperformed the other systems. In a sense this was not surprising in terms of the data used, as LDA-LI clusters all the data, and hence was in-domain. The more general training regimes of the other systems were to their disadvantage here. Domain and style differences are the likely explanation for langid outperforming guess_language, as the former is more up to date in terms of the content of its training corpora and closer to the data being classified. The general conclusion here is that training the supervised systems only on the in-domain data does not harm them; this may not, however, be the case for smaller data sets.

5.3. Experiment 3: comparison of measures on topic number

In Experiment 3, we investigated measures for finding the correct number of languages present in a corpus.
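The first of these measures is perplexity, as defined in Eq. (4). A minimal sketch of its computation (our own helper; it assumes the model's corpus-level word distribution q has already been computed from θ and φ) is:

```python
import numpy as np

def perplexity(test_counts, q):
    """Perplexity 2**H(p, q) of a model word distribution q on a held-out set.

    test_counts[v] is the count of word v in the test set; p is its
    empirical distribution, as in Eq. (4).
    """
    test_counts = np.asarray(test_counts, dtype=float)
    p = test_counts / test_counts.sum()
    cross_entropy = -np.sum(p[p > 0] * np.log2(q[p > 0]))  # H(p, q) in bits
    return 2.0 ** cross_entropy
```

As the experiments below show, this quantity keeps shrinking as the modelled language number grows, which is why it proves unsuitable as a stopping criterion.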
We computed the perplexity and HDP measures in conjunction with the nine languages used in Experiments 1 and 2. We also used an additional nine languages, for 18 languages in total.⁶ For each N in {3, 6, 9, 12, 15, 18}, we randomly selected N languages and split the data from these languages to perform a 10-fold Cross Validation for that N.

⁶ Dutch, English, French, German, Italian, Spanish, Swedish, Turkish, Portuguese, Albanian, Bulgarian, Czech, Estonian, Latin, Lithuanian, Modern Greek, Norwegian and Russian.

Fig. 4. Precision and recall scores: (a) comparison of precision scores; (b) comparison of recall scores.

Fig. 5. Perplexity related to topic number, with 9 languages present.

We then averaged the cosine distance (Cao et al., 2009), word KL divergence (Zavitsanos et al., 2008) and symmetric KL divergence (Arun et al., 2010). To test for consistency in our choice of languages, we repeated this set of experiments 5 times, randomly choosing different subsets of languages each time for each N. Results were found to be consistent irrespective of the languages chosen.

Experiments 1 and 2 showed that we can use LDA-LI as a language identification tool with the ability to generalize well, provided we can find the appropriate number of topics present. As a general language identification tool there are two requirements on the language number: firstly, the language number must be large enough to account for all the languages present.
Secondly, it is better for the language number to be as close as possible to the actual number of languages present, in order to avoid unnecessary examination and merging of language clusters that are actually of the same language.

To evaluate the ability to find the correct number of languages present, we calculated the perplexity, HDP and symKL (all averaged across 10-fold CV) on the nine languages used in Experiments 1 and 2, and compared them with the precision, recall and F-scores. We used the 18 languages to compare the cosine distance, word KL divergence and symmetric KL divergence (abbreviated as cosim, wordKL and symKL, respectively). The five measures are described in detail in Section 4.3.

In Fig. 5 we can see that perplexity always decreased with an increase in language number, even when the number of languages was greater than 50 and the precision and recall were degraded (see Fig. 6; each time the topic number increases, a new LDA is trained, so the F-score might drop); therefore we discard perplexity. Table 2 shows