Abstract:
Non-negative matrices appear in many domains, from item recommendation and audio signal processing to computer vision, in which data instances have a bounded non-negative range. For various tasks in these areas, probabilistic approaches have been widely applied, and matrix factorizations are among the state-of-the-art methods. One such method is a latent variable model called Poisson Factorization, which models bounded data with the Poisson distribution and thereby assigns them an unbounded range. In this work, we extend Poisson Factorization to model bounded data with bounded distributions such as Bernoulli, Binomial, Categorical and Multinomial. The resulting model is named Sum Conditioned Poisson Factorization, as it is constructed by conditioning multiple Poisson Factorizations on their sum. We present two algorithms for inference in Sum Conditioned Poisson Factorization: a Gibbs sampler and Expectation-Maximization. The algorithms and the model are tested on simulated and real data sets. First, we compare the algorithms on data generated synthetically from the model. Then, we demonstrate the interpretability of the model on a binary-valued data set named Swimmer. To measure the performance of the model on ordinal ratings data, we use MovieLens 500-K. The results indicate that the proposed model outperforms Poisson Factorization and other models in terms of predictive performance for test ratings and top-K recommendation. Finally, we conduct experiments on piano roll data extracted from Bach Chorales to investigate the use of the model in time series. The experiments reveal that the model provides parameters that can be used for prior distributions in time series analysis.
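The sum-conditioning idea rests on a standard identity: independent Poisson variables, conditioned on their total, follow a Multinomial distribution (a Binomial for two variables, a Bernoulli when the total is 1). The following minimal sketch checks the two-variable case numerically. It is only an illustration, not the authors' implementation: the function names and the rates 2.0 and 3.0 are assumptions chosen for the example, whereas in the model the rates would come from the factorizations.

```python
import math

def poisson_pmf(k, rate):
    """Probability that a Poisson(rate) variable equals k."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def conditional_pmf(k, total, rate_a, rate_b):
    """P(x_a = k | x_a + x_b = total) for independent Poisson x_a, x_b."""
    joint = poisson_pmf(k, rate_a) * poisson_pmf(total - k, rate_b)
    # The sum of independent Poissons is Poisson with the summed rate.
    marginal = poisson_pmf(total, rate_a + rate_b)
    return joint / marginal

def binomial_pmf(k, total, p):
    """Probability of k successes in `total` trials with success probability p."""
    return math.comb(total, k) * p**k * (1 - p)**(total - k)

# Illustrative rates (assumed values, not taken from the paper).
rate_a, rate_b, total = 2.0, 3.0, 5
p = rate_a / (rate_a + rate_b)

# Largest absolute difference between the two distributions over k = 0..total.
max_gap = max(
    abs(conditional_pmf(k, total, rate_a, rate_b) - binomial_pmf(k, total, p))
    for k in range(total + 1)
)
print(max_gap)
```

The printed gap should be at floating-point rounding level, which is consistent with the conditioned variable being bounded by `total` even though each Poisson variable alone is unbounded.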