Archives and Documentation Center
Digital Archives

Improving image captioning with language modeling regularizations

dc.contributor Graduate Program in Electrical and Electronic Engineering.
dc.contributor.advisor Anarım, Emin.
dc.contributor.advisor Akgül, Ceyhun Burak.
dc.contributor.author Ulusoy, Okan.
dc.date.accessioned 2023-03-16T10:20:22Z
dc.date.available 2023-03-16T10:20:22Z
dc.date.issued 2019.
dc.identifier.other EE 2019 U68
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12969
dc.description.abstract Inspired by recent work in language modeling, we investigate the effects of a set of regularization techniques on the performance of a recurrent neural network based image captioning model. Using these techniques, we achieve a 13-point BLEU-4 improvement over using no regularization. We show that our model does not suffer from loss-evaluation mismatch, and we connect model performance to dataset properties through experiments on the MSCOCO dataset. Further, we propose two applications of our image captioning model: a human-in-the-loop system and zero-shot object detection. The former improves the CIDEr score of our best model by a further 30 points using only the first two tokens of a reference sentence for an image. In the latter, we train our image captioning model as an object detector that classifies each object in an image without finding its location. The main advantage of this detector is that it does not require object locations during the training phase.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2019.
dc.subject.lcsh Automatic speech recognition.
dc.title Improving image captioning with language modeling regularizations
dc.format.pages xvi, 99 leaves ;