Abstract:
Predicting drug-target binding affinity is a critical phase in computer-aided drug design: it can accelerate the drug development process and reduce the experimental validation costs caused by high false-positive rates. Hence, developing in-silico computational algorithms to predict drug-target binding affinity values has become an important research area. Machine learning approaches have been proposed for this task, including models that use readily available biomolecule sequences and heterogeneous networks enriched with drug- and target-related information. We present WideDeepDTA, the first approach that leverages both text-based and network-based methods to predict drug-target binding affinities. Given homogeneous and heterogeneous networks containing multiple types of biological entities, the relationships between these entities, and pre-trained language models for biomolecular language, WideDeepDTA first learns low-dimensional feature representations of drugs and targets using the node embedding technique Metapath2Vec. It then predicts affinity values based on the learned features. On the BDB dataset, WideDeepDTA demonstrates its ability to create rich representations for the drug-target affinity prediction task when compared, in terms of concordance index and mean squared error, with DeepDTA, one of the state-of-the-art methods. Experiments indicate that integrating pre-trained language models with heterogeneous information improves model performance, especially when predicting affinity values between proteins and unseen ligands. Moreover, the results show that model performance improves when heterogeneous graphs are enriched with information extracted from text-based representations.
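The two evaluation metrics named above, concordance index and mean squared error, can be made concrete with a minimal sketch. The Python code below is illustrative only and is not the evaluation code used in this work: the function names and the toy affinity values are assumptions, and counting tied predictions as half-concordant is one common convention rather than a detail stated here.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predictions are ranked correctly.

    A pair is comparable when its two measured affinities differ.
    Assumption: tied predictions count as half-concordant.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # pairs with equal measured affinity are skipped
        comparable += 1
        # Orient the pair so that p_i belongs to the larger measured affinity.
        if t_i < t_j:
            p_i, p_j = p_j, p_i
        if p_i > p_j:
            concordant += 1.0
        elif p_i == p_j:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical affinity values, for illustration only.
    measured = [5.0, 6.2, 7.1, 8.4]
    predicted = [5.1, 6.0, 7.5, 7.4]
    print("CI :", concordance_index(measured, predicted))
    print("MSE:", mean_squared_error(measured, predicted))
```

In this toy example, five of the six comparable pairs are ranked correctly, so the concordance index is 5/6. A concordance index of 1.0 indicates a perfect ranking and 0.5 corresponds to random ordering, whereas a lower mean squared error indicates predictions closer to the measured values.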