Network predicting drug’s anatomical therapeutic chemical code

Yong-Cui Wang, Shi-Long Chen, Nai-Yang Deng and Yong Wang 3,∗

Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, China, 810001.

College of Science, China Agricultural University, Beijing, China, 100083.

National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China, 100190.

ABSTRACT

Motivation: Discovering the rules by which drugs receive Anatomical Therapeutic Chemical (ATC) codes at the molecular level is vital to understanding the action of a vast majority of drugs. However, few studies have attempted to annotate the potential ATC-codes of drugs by computational approaches.

Results: Here, we introduce the drug-target network to computationally predict the ATC-codes of drugs and propose a novel method named NetPredATC. Starting from the assumption that drugs with similar chemical structures or target proteins share common ATC-codes, NetPredATC assigns potential ATC-codes to drugs by integrating chemical structures and target proteins. Specifically, we first construct a gold-standard positive dataset from ATC-code annotation databases. Then we characterize each ATC-code and each drug by its similarity profile and define a kernel function to correlate them. Finally, we use a kernel method, the support vector machine (SVM), to automatically predict ATC-codes. Our method was validated on four drug datasets with different classes of target proteins: enzymes, ion channels (ICs), G-protein-coupled receptors (GPCRs), and nuclear receptors (NRs). We found that both chemical structure and target protein information are predictive, and that target protein information yields better accuracy. Integrating the two data sources revealed more experimentally validated ATC-codes for drugs. We extensively compared NetPredATC with SuperPred, a method based only on chemical similarity. Experimental results showed that NetPredATC outperforms SuperPred not only in predictive coverage but also in accuracy. In addition, database searches and functional annotation analysis support that our novel predictions are worthy of future experimental validation.

Conclusion: Our new method, NetPredATC, predicts the ATC-codes of drugs more accurately by incorporating the drug-target network and integrating data, which will promote the understanding of drug mechanisms as well as drug repositioning and discovery.

Availability: NetPredATC is available at http://doc.aporc.org/wiki/NetPredATC.

Contact: ycwang@nwipb.cas.cn, ywang@amss.ac.cn

∗ To whom correspondence should be addressed.

1 INTRODUCTION

The Anatomical Therapeutic Chemical (ATC) classification system categorizes drug substances at different levels by their therapeutic, chemical, and pharmacological properties and their practical applications. The system is recommended by the World Health Organization (WHO), and ATC-codes have been applied in almost all drug utilization studies (WHO, 2006). Specifically, the ATC classification system serves as a basic tool for drug utilization research and enables the presentation and comparison of drug consumption statistics at the international level. In addition, ATC prediction will greatly facilitate recent drug repositioning and drug combination studies. Though useful, mapping ATC-codes to drugs is quite challenging.

Recently, ATC-codes for some well-characterized drugs have been deposited in databases such as KEGG BRITE (Kanehisa et al., 2006) and DrugBank (Wishart et al., 2008). These databases provide high-quality, expert-curated data. However, they are small in scale, and their coverage is far from sufficient for practical use. Even for some well-collected drug datasets, the ATC-code assignments are far from complete. For example, the dataset in Yamanishi et al. (2008) contains drugs with four different types of target proteins: enzymes, ion channels (ICs), G-protein-coupled receptors (GPCRs), and nuclear receptors (NRs). These drugs all have manually curated target proteins from KEGG BRITE (Kanehisa et al., 2006), BRENDA (Schomburg et al., 2004), SuperTarget (Gunther et al., 2008), and DrugBank (Wishart et al., 2008). Even in this high-quality dataset, 102 of the 445 drugs targeting enzymes, 13 of the 210 drugs targeting ICs, 23 of the 223 drugs targeting GPCRs, and 4 of the 54 drugs targeting NRs have no ATC-code. By these counts, the percentage of drugs without ATC-codes ranges from about 6% (ICs) to about 23% (enzymes).
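The share of unannotated drugs in each target class follows directly from the counts quoted above. A minimal check, where the dictionary keys and the helper name are illustrative choices rather than anything defined in the paper:

```python
# Counts quoted in the text: (drugs without any ATC-code, total drugs) per target class.
counts = {
    "enzyme": (102, 445),
    "IC": (13, 210),
    "GPCR": (23, 223),
    "NR": (4, 54),
}


def missing_fraction(missing, total):
    """Fraction of drugs in a target class that lack any ATC-code."""
    return missing / total


fractions = {name: missing_fraction(m, t) for name, (m, t) in counts.items()}

# Print one percentage per target class, rounded to one decimal place.
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```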

The bottleneck is that the current data collection procedure relies heavily on human curation and is not efficient. One way out is to learn the underlying ATC-code classification rules from the available high-quality ATC-code annotations, and then automatically assign new ATC-codes to drugs with a computational predictor. This strategy will accelerate the functional characterization of drugs under the ATC classification system, especially barely characterized drugs. Importantly, it will greatly speed up understanding of the mechanisms of action of a vast majority of drugs, and narrow the gap between medical indications and the elucidation of drug effects at the molecular level (Dunkel et al., 2008).

Associate Editor: Dr. Olga Troyanskaya

Bioinformatics Advance Access published April 5, 2013

Downloaded from http://bioinformatics.oxfordjournals.org/ at Periodicals Department/Lane Library on April 5, 2013

However, few studies have attempted to address this important problem. Dunkel et al. tackled the challenge by proposing a computational method that classifies given compounds into the ATC classification system. Their method is based on drug similarity in chemical structure and physicochemical properties (Dunkel et al., 2008). They also developed a useful web server, which allows prognoses about the medical indication of novel compounds and helps find new leads for known targets (Dunkel et al., 2008). Nevertheless, chemical structure describes only the static state of a drug, whereas cells use networks of proteins and small molecules (drugs, metabolites, or ligands) to dynamically coordinate multiple biological functions. For instance, a single drug may have different biological functions by targeting different proteins. Therefore, integrating drug target information into the prediction can be expected to improve performance. In this paper, we follow this idea to design a new predictive method. That is, we map ATC-codes to a given drug based not only on its chemical structure similarity to other compounds, but also on its target proteins.

A commonly accepted assumption in drug discovery is that drugs with similar pharmacological or therapeutic properties usually share common functions (Yamanishi et al., 2008, 2010; Zhao and Li, 2010; Wang et al., 2010). Existing efforts have demonstrated that chemical structure similarity is useful in classifying compounds into the ATC classification system (Dunkel et al., 2008). Here we note that the pharmacological or therapeutic similarity of drugs may be due to the fact that they interact with common or similar target proteins. Thus it is reasonable to assume that drugs with similar target proteins usually share common ATC-codes. Starting from this assumption, we propose a novel computational approach called NetPredATC to predict potential ATC-codes for drugs. Specifically, we first construct the drug and ATC-code interaction network from the known drug ATC-code annotations. Then we characterize each ATC-code and each drug by its similarity profile, and define a kernel function to correlate drugs with ATC-codes. Finally, we infer the ATC-codes of drugs by training a machine learning model, namely support vector machines (SVMs). SVMs are motivated by statistical learning theory and have proven successful on many different classification problems in bioinformatics (Scholkopf et al., 2004). Our contributions lie not only in incorporating drug target information into ATC-code prediction for the first time, but also in designing a novel predictive model based on data integration.

The performance of our method was validated on four classes of drug target proteins: enzymes, ICs, GPCRs, and NRs. Cross-validation experiments and statistical evaluation show that both chemical structure and target protein information are predictive, and that target protein information is the more powerful of the two. By combining them, our method outperforms the method based only on chemical similarity and uncovers more experimentally observed drug ATC-code annotations.

The remainder of this paper is structured as follows. In the Materials and Methods section, we construct the drug and ATC-code interaction network by collecting available drug ATC-code annotations. Chemical structure and target protein information are then extensively investigated. We characterize the drugs and ATC-codes by their similarity profiles and train the SVM-based predictor. In the Results section, we compare the predictive power of chemical structure, target protein, and their combination, and show that the improvement in accuracy arises from the drug-target network and data integration. Lastly, the discussion and conclusions are presented.

2 MATERIALS AND METHODS

We propose a novel computational algorithm, NetPredATC, to infer the ATC-codes of drugs by using drug-target network information. Our algorithm works in three phases (Fig. 1): (A) Formulating the known ATC annotations of drugs as a bipartite graph. We extracted the known ATC annotations from the KEGG BRITE (Kanehisa et al., 2006) and DrugBank (Wishart et al., 2008) databases. (B) Extracting drug-drug and ATC-code-ATC-code similarity metrics. Drug similarity is derived from chemical structure and target protein information. ATC-code similarity profiles are calculated by a probabilistic model (Lin, 1998). (C) Feeding the similarities among drugs and the similarities among ATC-codes to a kernel method and applying an SVM-based classifier to predict the unknown ATC-codes of drugs.
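Phase (B) can be illustrated with a Lin-style similarity between ATC-codes. The sketch below uses the general form of Lin's measure, 2·log P(c0) / (log P(a) + log P(b)), where c0 is the most specific class containing both codes. Everything else is an assumption made for illustration and not taken from the paper: classes are taken to be ATC prefixes, P(prefix) is estimated as the share of annotated codes under that prefix, the code list is a toy example, and all function names are hypothetical.

```python
import math
from collections import Counter

# Character lengths of the prefixes that identify ATC levels 1 to 5.
ATC_PREFIX_LENGTHS = (1, 3, 4, 5, 7)


def ancestors(code):
    """Return the level prefixes of a full 7-character ATC-code, shortest first."""
    return [code[:n] for n in ATC_PREFIX_LENGTHS if n <= len(code)]


def prefix_probabilities(codes):
    """Assumed estimator: P(prefix) = share of annotated codes under that prefix."""
    counts = Counter(p for code in codes for p in ancestors(code))
    return {prefix: c / len(codes) for prefix, c in counts.items()}


def lin_similarity(a, b, prob):
    """Lin-style similarity of two ATC-codes that both occur in the annotation set."""
    if a == b:
        return 1.0
    shared = [p for p in ancestors(a) if p in ancestors(b)]
    if not shared:
        # Only the root is shared; P(root) = 1, so the numerator is zero.
        return 0.0
    c0 = shared[-1]  # most specific common prefix
    return 2.0 * math.log(prob[c0]) / (math.log(prob[a]) + math.log(prob[b]))


# Toy annotation set, used only to exercise the functions.
codes = ["A10BA02", "A10BB01", "C07AB02", "N02BE01"]
prob = prefix_probabilities(codes)
same_group = lin_similarity("A10BA02", "A10BB01", prob)
different_group = lin_similarity("A10BA02", "C07AB02", prob)
```

With this toy set, two codes that share a third-level prefix score higher than two codes from different main groups, which is the qualitative behaviour a hierarchy-aware similarity should have.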

2.1 Constructing drug and ATC-codes interaction network

In the ATC system, drugs are divided into fourteen main groups (1st level), with pharmacological/therapeutic subgroups (2nd level). The 3rd and 4th levels are chemical/pharmacological/therapeutic subgroups, and the 5th level is the chemical substance. The hierarchical structure of ATC-codes makes the prediction a hierarchical multi-label classification problem. Existing models for this problem are complicated and computationally expensive (Rousu et al., 2004; Cai and Hofmann, 2004), which greatly restricts their scope of application. Here, we propose a low-cost computational method that treats ATC-code prediction as a binary classification problem. Specifically, we construct the drug and ATC-code interaction network from available ATC annotations of drugs, extracted from the KEGG BRITE (Kanehisa et al., 2006) and DrugBank (Wishart et al., 2008) databases. That is, using the known ATC-codes for drugs, we construct a bipartite graph (Fig. 1A), in which interactions exist only between drugs and ATC-codes. In this way, ATC-code prediction can be cast as a binary classification problem: we aim to determine whether a given drug and ATC-code pair interacts or not. The advantage is that we can use a very popular machine learning method, the SVM, to handle this high-dimensional learning problem at relatively low cost.
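The recasting described in this subsection can be sketched in a few lines. The example assumes the conventional seven-character layout of a full ATC-code (one letter, two digits, two letters, two digits). The drug identifiers and annotations are hypothetical toy data, and treating every unobserved drug and ATC-code pair as a negative example is a simplifying assumption of this sketch, not a statement of how the negative set is chosen in the paper.

```python
from itertools import product


def atc_levels(code):
    """Map each ATC level (1 to 5) to the corresponding prefix of a full code."""
    return {1: code[:1], 2: code[:3], 3: code[:4], 4: code[:5], 5: code[:7]}


# Hypothetical annotations: drug identifier -> set of known ATC-codes.
annotations = {
    "drug_1": {"A10BA02"},
    "drug_2": {"A10BB01", "C07AB02"},
    "drug_3": {"N02BE01"},
}


def bipartite_edges(annotations):
    """Edges of the bipartite graph: one (drug, ATC-code) pair per known annotation."""
    return {(drug, code) for drug, codes in annotations.items() for code in codes}


def labelled_pairs(annotations):
    """Label every candidate pair: +1 if annotated, -1 otherwise (assumption)."""
    edges = bipartite_edges(annotations)
    drugs = sorted(annotations)
    codes = sorted({c for cs in annotations.values() for c in cs})
    return [((d, c), 1 if (d, c) in edges else -1) for d, c in product(drugs, codes)]


pairs = labelled_pairs(annotations)
n_positive = sum(1 for _, label in pairs if label == 1)
```

Each element of `pairs` is one training sample for a binary classifier, so the hierarchical multi-label structure never has to be modelled explicitly.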

2.2 Collecting chemical structure and target protein data

Given two drug-ATC-code pairs, we construct a kernel function that reflects their similarity. Since a kernel function represents, in some sense, the similarities among the training samples (Hofmann et al., 2008), we focus on the similarity scores among drugs and the similarity scores among ATC-codes. We therefore construct similarity profiles to characterize drugs and ATC-codes in the following subsections.
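One common way to turn two similarity measures into a kernel on pairs is to multiply them. The sketch below assumes that product form, K((d, a), (d', a')) = S_drug(d, d') · S_ATC(a, a'); the text above does not state the exact kernel, so this is an illustration of the idea rather than the paper's definition. The similarity values are made-up toy numbers and `pairwise_kernel` is a hypothetical helper name.

```python
def pairwise_kernel(pairs, drug_sim, atc_sim):
    """Gram matrix over (drug, ATC-code) pairs using a product of two similarities."""
    return [
        [drug_sim[d1][d2] * atc_sim[a1][a2] for (d2, a2) in pairs]
        for (d1, a1) in pairs
    ]


# Toy symmetric similarity tables (illustrative values only).
drug_sim = {
    "drug_1": {"drug_1": 1.0, "drug_2": 0.5},
    "drug_2": {"drug_1": 0.5, "drug_2": 1.0},
}
atc_sim = {
    "A10BA02": {"A10BA02": 1.0, "A10BB01": 0.5, "C07AB02": 0.0},
    "A10BB01": {"A10BA02": 0.5, "A10BB01": 1.0, "C07AB02": 0.0},
    "C07AB02": {"A10BA02": 0.0, "A10BB01": 0.0, "C07AB02": 1.0},
}

pairs = [("drug_1", "A10BA02"), ("drug_2", "A10BB01"), ("drug_2", "C07AB02")]
gram = pairwise_kernel(pairs, drug_sim, atc_sim)
```

A Gram matrix of this shape is the input expected by SVM implementations that accept precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. For the product to be a valid kernel, both similarity matrices need to be positive semidefinite, which raw similarity scores do not guarantee.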

2.2.1 Chemical structure data It is generally believed that drugs with similar chemical structures carry out common therapeutic functions and are thus likely to share common ATC-codes. Each drug can therefore be characterized by its chemical structure similarity profile with other drugs. The chemical structure similarity between two drugs $d$ and $d'$ is computed by the SIMCOMP algorithm (Hattori et al., 2003), a graph-based method for comparing pairwise chemical structures. Suppose that we have $n$ drugs in total; a matrix $S_{chem} \in \mathbb{R}^{n \times n}$ is then constructed to represent the chemical structure similarity for all drug pairs. Each row (or column) of this matrix is the chemical structure similarity profile of a single drug.
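The similarity-profile construction can be sketched as follows. SIMCOMP itself is not reproduced here: as a loudly labelled stand-in, the example scores two drugs with a Tanimoto (Jaccard) coefficient over hypothetical substructure feature sets. Only the shape of the result, a symmetric n × n matrix whose rows serve as per-drug profiles, mirrors the description above; the drug identifiers, features, and names are invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two feature sets; a stand-in for SIMCOMP."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical substructure features per drug (not real chemistry).
features = {
    "drug_1": {"f1", "f2", "f3"},
    "drug_2": {"f2", "f3", "f4"},
    "drug_3": {"f5"},
}

drugs = sorted(features)

# n x n similarity matrix over all drug pairs, in the order given by `drugs`.
chem_matrix = [[tanimoto(features[p], features[q]) for q in drugs] for p in drugs]

# Each row is the similarity profile of one drug against all drugs.
profiles = dict(zip(drugs, chem_matrix))
```

Because the score is symmetric, reading a profile along a row or along a column gives the same vector.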
