Contents lists available at ScienceDirect
Computers in Biology and Medicine
journal homepage: www.elsevier.com/locate/compbiomed
MAMDA: Inferring microRNA-Disease associations with manifold alignment
Fang Yan (a), Yuanjie Zheng (a,b,∗), Weikuan Jia (a), Sujuan Hou (a), Rui Xiao (c)
(a) School of Information Science and Engineering at Shandong Normal University, Jinan, China
(b) Key Lab of Intelligent Computing & Information Security in Universities of Shandong, Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Institute of Biomedical Sciences, Shandong Normal University, Jinan, China
(c) Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
ARTICLE INFO
Keywords: microRNA-disease association; Computational model; microRNA; Disease; Computational biology
ABSTRACT
Uncovering disease-related microRNAs (miRNAs) by inferring miRNA-disease associations is of critical importance for understanding the pathogenesis of disease and carrying out treatment and prevention. Recently developed computational models for inferring miRNA-disease associations assume that functionally related miRNAs are associated with phenotypically similar diseases and hence infer miRNA-disease associations by using miRNA-miRNA and disease-disease similarities, which are concretely determined by mining existing biological resources. From the perspective of manifold learning, miRNA-miRNA similarities and disease-disease similarities determine a low-dimensional manifold for miRNAs and diseases, respectively, and the basic assumption of current computational models is equivalent to consistency between the manifold structures of miRNA and disease. In this paper, we propose a novel microRNA-disease inference framework (MAMDA) that explicitly takes advantage of this consistency property and infers miRNA-disease associations by aligning the manifold structure of miRNA with that of disease, supervised by experimentally verified miRNA-disease associations. Experimental results show, in three respects, that the proposed framework outperforms several representative state-of-the-art techniques. First, AUC values using k-fold cross-validation indicate that our method acquires more reliable predictions than four classical techniques (HGIMDA, HDMP, RLSMDA, and NCPMDA). Second, 48/48 predicted associations between miRNAs and breast cancer are validated with the HMDD and dbDEMC, showing the effectiveness of the method for isolated diseases with no known related miRNAs. Third, two case studies of colon neoplasms and lung neoplasms validate the superior accuracy of MAMDA, with 48/50 and 48/50 predicted associations in the HMDD and dbDEMC, respectively.
1. Introduction
As a class of noncoding RNAs, microRNAs (miRNAs) are single-stranded RNA molecules approximately 22 nucleotides in length that play pivotal roles in regulating gene expression [1]. They may fine-tune the expression of as many as 30% of all mammalian protein-encoding genes [2]. In addition, research on miRNAs has shown that they act as important regulatory molecules in different human diseases and have great potential in the diagnosis and treatment of many diseases. For instance, it is well documented that upregulation or downregulation of miRNAs occurs in various human cancers. Silencing, antisense blocking and modification of miRNAs are potential therapeutic treatments involving these miRNAs [3,4].
Uncovering miRNA-disease associations is of critical importance not only for investigating disease pathogenesis at the molecular level and facilitating diagnosis, treatment and prevention of disease [5–7] but also for formulating personalized treatment regimens [8]. Several experimental methods have been successfully exploited, such as microarray profiling and RT-PCR [9–11]. However, these experimental methods are expensive and time consuming [12]. To overcome the drawbacks of these experimental techniques, computational methods have been developed to predict and rank disease-related miRNAs by inferring miRNA-disease associations [13].
The main purpose of these computational methods is to offer reliable miRNA candidates for biological experiments in combination with existing experimental data. This strategy can make up for the shortcomings of the experimental approaches and further improve efficiency and success rates, given the many-to-many association maps between miRNAs and diseases. Computational strategies developed in recent years for predicting associations can be divided into two categories: relation-learning methods and similarity-computation approaches. Both are based on the assumption that functionally related miRNAs tend to be associated with phenotypically similar diseases and vice versa [14]. Specifically, the relation-learning methods focus on learning an miRNA-disease relationship, while the similarity-computation approaches generate predictions by simultaneously considering the experimentally verified miRNA-disease associations, miRNAs' functional similarities, and diseases' phenotypic similarities, among other relationships [14–21]. However, existing studies on miRNA-disease association prediction still face the following problems. On the one hand, the relation-learning methods are subject to unevenly distributed data samples. Currently, only a relatively small number of miRNA-disease associations have been confirmed by experiments. miRNA target gene prediction produces false positives, the disease-related annotation information is insufficient, and numerous undiscovered miRNA-disease associations still exist, leading to the problem of an uneven data distribution. On the other hand, the similarity-computation approaches suffer from inaccurate prediction caused by the sparsity of data and by isolated diseases (diseases not related to other diseases and miRNAs). Known diseases and miRNAs have some known associations, which provide prior knowledge for subsequently predicting potential relationships. Isolated diseases and unknown miRNAs, in contrast, provide hardly any prior knowledge and thus reduce prediction accuracy.

https://doi.org/10.1016/j.compbiomed.2019.05.014
Received 31 December 2018; Received in revised form 17 May 2019; Accepted 17 May 2019
∗ Corresponding author. School of Information Science and Engineering at Shandong Normal University, Jinan, China.
E-mail address: yjzheng@sdnu.edu.cn (Y. Zheng).
Computers in Biology and Medicine 110 (2019) 156–163
0010-4825/ © 2019 Published by Elsevier Ltd.
The basic assumption that miRNAs with similar functions are associated with diseases with similar phenotypes is equivalent to consistency between the underlying low-dimensional manifold structure of miRNAs and that of diseases [22]. It is reasonable to assume that the latent data fully describing miRNAs or diseases lie in a high-dimensional space, considering their high biological complexity. At the same time, the data should also form an underlying low-dimensional manifold, considering the established similarities between miRNA functions or disease phenotypes. This low-dimensional manifold can be reconstructed from the similarities between miRNAs or diseases using recent manifold learning techniques [23–25]. However, this valuable information is not explicitly leveraged by most current miRNA-disease prediction techniques.
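As one illustration of how a low-dimensional manifold can be recovered from pairwise similarities alone, consider Laplacian eigenmaps, a standard manifold learning technique. The text cites [23–25] without naming a specific method, so this choice and the symbols $W$, $D$, $L$, $Y$ and $k$ are assumptions made here for illustration only.

```latex
% W: symmetric n x n similarity matrix (e.g., miRNA functional similarities)
% y_i: the k-dimensional embedding of item i (row i of Y)
D_{ii} = \sum_{j} W_{ij}, \qquad L = D - W

\min_{Y \in \mathbb{R}^{n \times k}} \;
  \frac{1}{2} \sum_{i,j} W_{ij} \, \lVert y_i - y_j \rVert^{2}
  \;=\; \operatorname{tr}\!\left( Y^{\top} L \, Y \right)
\quad \text{subject to} \quad Y^{\top} D \, Y = I
```

The minimizer is given by the generalized eigenvectors of $L y = \lambda D y$ with the smallest nonzero eigenvalues, so items with high similarity are placed close together in the embedding.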
In this paper, we develop a novel framework for inferring miRNA-disease associations with manifold alignment (MAMDA). It establishes a regression between miRNAs and diseases by explicitly taking advantage of the consistency between their underlying low-dimensional manifold structures. Specifically, we first construct a graph Laplacian for miRNAs and another for diseases, determined by the functional similarity of miRNAs and the phenotypic similarity of diseases, respectively. We then present a formulation that joins the graph representations of miRNAs and diseases by considering the experimentally verified miRNA-disease associations, miRNAs' functional similarities and diseases' phenotypic similarities together. This formulation results in a common low-dimensional embedding over the joined graph and provides an alignment of the underlying low-dimensional manifold structures (benefitting from the consistency between the low-dimensional manifold structures of miRNAs and diseases). The resulting manifold alignment offers an inference of the miRNA-disease associations. Experimental results show that the proposed manifold alignment-based technique for association prediction between miRNAs and diseases outperforms several representative state-of-the-art approaches.
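The pipeline outlined above (per-set graph Laplacians, a joined graph, a common embedding, then inference) can be sketched as follows. This is a minimal illustration of generic semi-supervised manifold alignment, not the exact MAMDA formulation, which is not given in this section. The function names, the coupling weight `mu`, the embedding dimension `dim`, the distance-based score and the toy matrices are all hypothetical.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric weight matrix."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def align_and_score(S_m, S_d, A, mu=1.0, dim=2):
    """Embed miRNAs and diseases in one space and score every pair.

    S_m: miRNA-miRNA similarity (n_m x n_m), S_d: disease-disease
    similarity (n_d x n_d), A: known associations (n_m x n_d, 0/1).
    Assumes the joined graph is connected, so that only the first
    eigenvector is the trivial constant one.
    """
    n_m = A.shape[0]
    # Joined graph: similarities inside each set, known associations across sets.
    W = np.block([[S_m, mu * A], [mu * A.T, S_d]])
    L = graph_laplacian(W)
    # eigh returns eigenvalues in ascending order; skip the constant eigenvector.
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, 1:dim + 1]
    Y_m, Y_d = Y[:n_m], Y[n_m:]
    # Closer in the shared embedding means a higher score.
    dist = np.linalg.norm(Y_m[:, None, :] - Y_d[None, :, :], axis=2)
    return -dist

# Hypothetical toy data: two groups of miRNAs and two groups of diseases.
S = np.array([[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.9],
              [0.1, 0.1, 0.9, 1.0]])
A = np.zeros((4, 4))
A[0, 0] = 1.0  # miRNA 0 is known to be associated with disease 0
A[2, 2] = 1.0  # miRNA 2 is known to be associated with disease 2

scores = align_and_score(S, S, A)
```

In this toy setting, miRNA 1 has no known association, yet it scores higher against disease 1 than against disease 3 because both sit in the same aligned group. Ranking the columns of `scores` for a given disease would give a candidate list in the spirit of the inference step described above.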
2. Related works
Grounded in known miRNA-disease associations, miRNA functional similarities, disease phenotypic similarities, or protein-disease relations, recent computational methods provide effective algorithms for systematically predicting miRNAs related to given diseases and screening candidates for molecular biological experiments, in turn reducing costs and shortening experiment cycles [14,26,27]. Existing methods for predicting miRNA-disease associations are classified mainly into relation-learning methods and similarity-computation methods.
In general, relation-learning methods focus on learning miRNA-disease relations. Ala Qabaja et al. [28] proposed a protein network-based Lasso regression model to infer associations between miRNAs and diseases. Xu et al. [29] and Jiang et al. [7] applied the support vector machine (SVM) to classify proven miRNA-disease associations and negative ones. Chen et al. [30] first introduced a decision tree learning-based model (EGBMMDA) to infer miRNA-disease associations. The authors used a gradient boosting model to train the regression tree based on the results of miRNA-disease relations calculated by statistical measures, graph theoretical measures and matrix factorization. It is well known that this type of method has difficulty collecting negative training samples, which here refer to unknown miRNA-disease relationships. To overcome this limitation, Zou et al. [21] employed CATAPULT, treating all negative associations as unlabeled data to address the lack of negative samples. Therein, semisupervised techniques were introduced to handle imbalanced and unlabeled samples and uncover potential microRNA-disease associations. Chen et al. [31] proposed the RLSMDA method, which combines a semisupervised strategy with a regularized least squares framework to identify associations for diseases without known related miRNAs. Chen et al. [32] developed a restricted Boltzmann machine-based method named RBMMMDA to infer multiple types of relationships between miRNA and disease on a large scale. In addition, Chen et al. [33] introduced a bipartite network projection-based method (BNPMDA) for predicting miRNA-disease relations. However, this method cannot make predictions for isolated diseases without known related miRNAs. To solve this problem, three similar methods based on known associations and integrated miRNA similarity and disease similarity were introduced. Chen et al. [34] proposed another semisupervised model based on low-rank inductive matrix completion (IMCMDA) to identify the missing associations between miRNAs and diseases via known associations and integrated similarity for miRNA and disease. Chen et al. [35] developed a novel computational model (MDHGI) for the prediction of miRNA-disease associations through matrix decomposition-based sparse learning. Combining multiple feature spaces to achieve a single classifier, Chen et al. [36] projected the feature profile of miRNAs or diseases into a common subspace and predicted relationships between miRNAs and diseases via the Laplacian regularized sparse subspace learning method (LRSSLMDA) to acquire reliable predictions.
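To make the regularized least squares idea mentioned above more concrete, the following is a minimal sketch of kernel regularized least squares applied to an association matrix, once in miRNA space and once in disease space. It is a generic illustration and not the published RLSMDA implementation: the combination rule, the weight `w`, the regularization strengths `eta_m` and `eta_d`, the function names and the toy data are all assumptions.

```python
import numpy as np

def rls_scores(S, Y, eta=1.0):
    """Kernel regularized least squares: F = S (S + eta * I)^{-1} Y."""
    n = S.shape[0]
    return S @ np.linalg.solve(S + eta * np.eye(n), Y)

def combined_scores(S_m, S_d, A, eta_m=1.0, eta_d=1.0, w=0.5):
    """Blend predictions made in miRNA space and in disease space."""
    F_m = rls_scores(S_m, A, eta_m)        # n_m x n_d, smoothed over miRNAs
    F_d = rls_scores(S_d, A.T, eta_d).T    # n_m x n_d, smoothed over diseases
    return w * F_m + (1.0 - w) * F_d

# Hypothetical toy data: two groups of miRNAs and two groups of diseases.
S = np.array([[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.9],
              [0.1, 0.1, 0.9, 1.0]])
A = np.zeros((4, 4))
A[0, 0] = 1.0  # miRNA 0 is known to be associated with disease 0
A[2, 2] = 1.0  # miRNA 2 is known to be associated with disease 2

F = combined_scores(S, S, A)
```

Here all unverified pairs enter as zeros rather than as confirmed negatives, which reflects the semisupervised treatment of unlabeled pairs described above. In the toy data, miRNA 1 receives a higher score for disease 0 than for disease 2 because it is highly similar to miRNA 0.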
The similarity-computation approaches can be roughly classified into local and global network-based approaches. The strategies used include hypergeometric distribution, random walk, graph theory, path-based approaches, and social network analysis. Local network-based similarity-computation measures, which use only miRNA neighbor information, are summarized as follows. Jiang et al. [5] first constructed a Boolean miRNA network containing nodes and edges to infer miRNA-disease associations. Xuan et al. [37] proposed the HDMP technique based on the weighted k most-similar neighbors, which overcame the shortcoming that many false positives are produced in the Boolean network. Mørk et al. [27] presented protein-driven inference (miRPD) for prediction based on miRNA-disease and protein-disease associations. In contrast, Chen et al. [38] proposed a global network similarity strategy to predict miRNA-disease associations using three inference methods, namely, microRNA-, phenotype-, and network consistency-based similarity inference. Similarly, Chen et al. [39] also exploited a global network similarity strategy called RWRMDA, inspired by a random walk. Gu et al. [16] introduced a network consistency projection approach called NCPMDA to predict the potential miRNA-disease associations in all diseases without known negative samples. The better performance of the global approach indicated that it may obtain higher accuracy than approaches using the local network similarity model, which focuses only on neighboring relationships; however, this method ignores the information in proteins. Then, Shi et al. [40] developed the RWR algorithm employing protein-