Score level multi cue fusion for sign language recognition

Gökçe, Çağrı.

dc.contributor	Graduate Program in Computer Engineering.
dc.contributor.advisor	Akarun, Lale.
dc.contributor.author	Gökçe, Çağrı.
dc.date.accessioned	2023-03-16T10:04:39Z
dc.date.available	2023-03-16T10:04:39Z
dc.date.issued	2020.
dc.identifier.other	CMPE 2020 G75
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/12427
dc.description.abstract	In this thesis, we propose a Score-Level Multi Cue Fusion approach that improves the sign language recognition performance of three-dimensional convolutional neural networks (3D CNNs). Sign language is the language of Deaf and hearing-impaired individuals and is performed using hand movements, facial gestures, and body alignment. Sign Language Recognition (SLR) is the task of understanding sign language, and it is gaining popularity as the efficiency of neural networks makes the task feasible. Previous work uses 3D CNN variants to inspect sign language properties in different settings. The vanilla 3D variant uses 3D kernels with high processing cost, the mixed convolution variant applies both 3D and 2D kernels, and the R(2+1)D variants use bottleneck connections to exploit the bottleneck dimension. Various studies use these networks to build end-to-end frameworks for tasks such as sign classification and translation. To achieve better performance, 3D CNN methods use complicated neural network architectures that have a branch for every cue. We evaluate the performance of the 3D networks and propose a more straightforward approach that adopts only a single neural network architecture and can process multiple cues at test time. We exploit the hand, body, and face cues by training an individual network for each cue and fuse the results using weighted score fusion. We test our method on the recently published Turkish Isolated SLR dataset. Despite the simple architecture, our method achieves a 94% classification rate on 744 different sign glosses. We hope that the multi cue approach can help with other SLR tasks such as translation, which is left as future work.
dc.format.extent	30 cm.
dc.publisher	Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2020.
dc.subject.lcsh	Sign language.
dc.title	Score level multi cue fusion for sign language recognition
dc.format.pages	xvii, 59 leaves ;
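
The weighted score fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name fuse_scores, the cue names, the example scores, and the weights are all hypothetical, and the sketch assumes each cue network outputs one probability per class and that the weights are normalized to sum to 1 (the record does not say how the weights are chosen).

```python
from typing import Dict, List, Sequence


def fuse_scores(cue_scores: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> List[float]:
    """Return the weighted sum of per-cue class scores (hypothetical helper).

    cue_scores maps a cue name to its per-class scores; weights maps the
    same cue names to non-negative weights, normalized here to sum to 1.
    """
    total_weight = sum(weights[cue] for cue in cue_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    num_classes = len(next(iter(cue_scores.values())))
    fused = [0.0] * num_classes
    for cue, scores in cue_scores.items():
        if len(scores) != num_classes:
            raise ValueError(f"cue {cue!r} has a different number of classes")
        w = weights[cue] / total_weight
        for i, s in enumerate(scores):
            fused[i] += w * s
    return fused


# Illustrative values only: three cues, three classes.
example_scores = {
    "hand": [0.6, 0.3, 0.1],
    "body": [0.2, 0.5, 0.3],
    "face": [0.1, 0.2, 0.7],
}
example_weights = {"hand": 0.5, "body": 0.3, "face": 0.2}

fused = fuse_scores(example_scores, example_weights)
# The predicted class is the index with the highest fused score.
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Fusing at the score level keeps each cue network independent, so a cue can be added, removed, or reweighted without retraining the others.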