Abstract:
The music industry makes large investments every year to produce hit songs. The growing number of songs available through digital platforms can enable the development of learning models that predict hit songs and identify their common features. This thesis investigates classifying a song as a hit or non-hit using various machine learning methods. Besides the basic musical features provided by Spotify, more complex features based on the chords and melody extracted from the music files are designed using music theory. Chord-based features are created from important chord progressions in tonal harmony, while melody-based features are designed in an intuitive way. In addition, new benchmark datasets are created from both hit and non-hit songs in the dance and rock genres. The results show that combining chord- and melody-based features with the basic musical features may improve hit song prediction performance. For rock songs, the Random Forest classifier achieves a significant improvement when these features are used. It is also observed that using a specific feature combination with the Support Vector Machine classifier increases the accuracy of hit dance song prediction. Finally, all the features used in this study are analyzed for each dataset in the last part of the thesis.