Archives and Documentation Center
Digital Archives

Automated requirements classification using feature selection based on linguistic features


dc.contributor Graduate Program in Systems and Control Engineering.
dc.contributor.advisor Aydemir, Fatma Başak.
dc.contributor.author Çevikol, Sercan.
dc.date.accessioned 2023-03-16T11:35:00Z
dc.date.available 2023-03-16T11:35:00Z
dc.date.issued 2021.
dc.identifier.other SCO 2021 C48
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/15681
dc.description.abstract Requirements classification is an important problem in organizing systems and their requirements, and it is widely used in handling large requirements data sets. A basic example of a requirements classification problem is the distinction between functional and non-functional (quality) requirements. State-of-the-art classifiers are most effective when they use a large set of word features such as text n-grams or part-of-speech n-grams. However, as the number of features increases, the approach becomes more difficult to interpret, because many redundant features that do not capture the meaning of the requirements have to be explored. In this study, we propose the use of more general linguistic features, such as dependency types, for the construction of interpretable machine learning classifiers for requirements engineering. Through a feature engineering effort, assisted by tools that show graphically how classifiers work, we derive a set of linguistic features. While classifiers that use the proposed features fit the training set slightly worse than those that use high-dimensional feature sets, this approach generally performs better on validation data sets and is more interpretable. We use industry data sets, and we perform experimental runs with several automated feature selection algorithms to explore whether our feature set can be optimized further. Although impressive results were obtained on some data sets, the automated selection algorithms did not yield a significant improvement; on average, their results were even worse than those obtained using the set based on linguistic features.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2021.
dc.subject.lcsh Software engineering.
dc.subject.lcsh Linguistics -- Software.
dc.title Automated requirements classification using feature selection based on linguistic features
dc.format.pages xi, 63 leaves