Abstract:
Extracting new, unknown facts from given facts expressed as triples (entity-relation-entity) is a popular statistical relational learning task, known as the knowledge graph link prediction problem. Due to the nature of the problem definition, tensors are widely used to represent the existing datasets. Under the assumption that entities and relations have latent features, tensor factorization models are used to approximate the original dataset tensor. These latent features are estimated during the approximation, and the interactions between them reveal the probabilities that triples exist. In this thesis, we propose a tensor extension of the recently introduced Sum Conditioned Poisson Factorization (SCPF) for use in knowledge graph problems. SCPF is an alternative to Generalized Linear Models (GLMs) and can be used to model bounded data with L component Poisson factorizations that are conditioned on their sum. Unlike GLMs, which factorize the canonical parameters, SCPF directly decomposes the moment parameters. For knowledge graph problems, we define two Poisson tensor factorizations and condition their sum to be a tensor of ones. We introduce maximum likelihood parameter estimation with Expectation Maximization, and Bayesian inference with variational inference and Gibbs sampling. We compare the predictive performance of the SCPF models with that of a state-of-the-art Generalized Linear Model, Logistic Tensor Factorization, on standard datasets (Nation, UMLS, and Kinship).
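
As a worked illustration of the conditioning step for binary triples, consider the case of L = 2 components. The symbols below (the latent counts s, the intensities \lambda, and the index convention i, j, k for subject, relation, and object) are assumptions introduced for this sketch and are not notation taken from the thesis. The sketch relies only on the standard result that independent Poisson variables conditioned on their sum follow a multinomial (here, Bernoulli) distribution:

```latex
% Assumed notation: \lambda^{(1)}_{ijk} and \lambda^{(0)}_{ijk} are the nonnegative
% entries of two Poisson tensor factorizations; their exact factorized form is
% left unspecified here.
\begin{align}
  s^{(1)}_{ijk} &\sim \mathcal{PO}\bigl(\lambda^{(1)}_{ijk}\bigr), \qquad
  s^{(0)}_{ijk} \sim \mathcal{PO}\bigl(\lambda^{(0)}_{ijk}\bigr), \\
  % Conditioning the sum on the corresponding entry of a tensor of ones:
  s^{(1)}_{ijk} \,\big|\, \bigl(s^{(1)}_{ijk} + s^{(0)}_{ijk} = 1\bigr)
  &\sim \mathcal{BE}\!\left(
      \frac{\lambda^{(1)}_{ijk}}{\lambda^{(1)}_{ijk} + \lambda^{(0)}_{ijk}}
    \right).
\end{align}
```

Under these assumptions, the probability that a triple exists is a ratio of the two factorized intensities, that is, a moment parameter, rather than a logistic function of a factorized canonical parameter as in Logistic Tensor Factorization.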