Archives and Documentation Center
Digital Archives

Mention extraction and normalization using ontologies in the biomedical domain

dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Özgür, Arzucan.
dc.contributor.author Tiftikci, Mert.
dc.date.accessioned 2023-03-16T10:03:59Z
dc.date.available 2023-03-16T10:03:59Z
dc.date.issued 2019.
dc.identifier.other CMPE 2019 T54
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12387
dc.description.abstract This thesis proposes a combined machine learning and rule-based system for identifying adverse drug reaction (ADR) entity mentions in the text of drug labels and normalizing them to terms in the MedDRA dictionary. The machine learning component is based on a recently proposed deep learning model that works at the sentence level. The model represents each token by combining pre-trained word embeddings with Convolutional Neural Network (CNN) embeddings generated from the token's characters. These token representations are first passed through bi-directional Long Short-Term Memory (Bi-LSTM) layers for feature extraction. A Conditional Random Fields (CRF) classifier is then trained on the extracted features to predict the target mentions. The rule-based component, used for normalizing the identified ADR mentions to MedDRA terms, is based on an extension of the text-mining system SciMiner. The proposed system is evaluated on the TAC-ADR 2017 challenge dataset. Since this dataset contains disjoint and overlapping mentions, the model also uses a recently proposed chunking scheme designed to handle those types. The model obtained an F-score of 76.97 on the TAC dataset. Among the factors behind the lower performance, compared with models trained on generic newspaper text, are the small size of the training dataset and the uneven distribution of the class instances.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019.
dc.subject.lcsh Medical informatics.
dc.subject.lcsh Knowledge representation (Information theory)
dc.subject.lcsh Ontology.
dc.title Mention extraction and normalization using ontologies in the biomedical domain
dc.format.pages xiii, 44 leaves ;
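
The abstract describes a CRF classifier that predicts mention tags from features extracted by the Bi-LSTM layers. As a rough illustration of that final prediction step only, the sketch below shows Viterbi decoding, the standard way to find the highest-scoring tag sequence in a linear-chain CRF. Everything concrete here is an assumption made for illustration: the function name `viterbi_decode`, the plain BIO tag set, and the hand-picked emission and transition scores. In a real system the emission scores would come from the trained network, and the thesis uses a chunking scheme for disjoint and overlapping mentions that is not reproduced here.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[i][t] is the score of tag t at token i (hypothetical values
    standing in for network outputs); transitions[(p, c)] is the score of
    moving from tag p to tag c, defaulting to 0.0 when absent.
    """
    if not emissions:
        return []

    # best[i][t]: score of the best path that ends with tag t at token i.
    best = [{t: emissions[0][t] for t in tags}]
    back = []  # back[i-1][t]: previous tag on the best path ending in t at i

    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[i][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative tag set and scores (assumed, not taken from the thesis).
TAGS = ["O", "B-ADR", "I-ADR"]
# Penalize an inside tag that directly follows an outside tag.
TRANSITIONS = {("O", "I-ADR"): -10.0}
# Made-up scores for the four tokens of "may cause severe headache".
EMISSIONS = [
    {"O": 2.0, "B-ADR": 0.1, "I-ADR": 0.0},
    {"O": 2.0, "B-ADR": 0.2, "I-ADR": 0.1},
    {"O": 0.5, "B-ADR": 1.0, "I-ADR": 1.2},
    {"O": 0.2, "B-ADR": 0.5, "I-ADR": 2.0},
]

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(decoded)  # ['O', 'O', 'B-ADR', 'I-ADR']
```

Picking the best tag for each token independently would label the third token `I-ADR`, producing a mention with no beginning tag. Decoding over the whole sentence lets the transition penalty steer that token to `B-ADR`, which is the usual motivation for placing a CRF on top of per-token features.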