Design and Development of Probabilistic Inference Pipelines

Shterionov, Dimitar; Janssens, Gerda

Author:

Shterionov, Dimitar

Janssens, Gerda

Abstract:

Logic is the foundation of many Artificial Intelligence (A.I.) systems, as it provides an intuitive mechanism to represent knowledge. Logic can be used to address a wide range of problems, but applications that require reasoning with uncertain knowledge need more expressive formalisms. This necessity led to the establishment and rapid development of the fields of Probabilistic Logic Programming (PLP), Statistical Relational Learning (SRL) and others. ProbLog is a PLP framework, comprising a language and an inference system, that started as a probabilistic extension of Prolog. Soon after, it established a dominant position within the PLP community. In this thesis we present recent advances in the design and development of ProbLog, with a focus on the ProbLog2 system. First, we study the architecture of a ProbLog inference system. It is a pipeline architecture, called the ProbLog inference pipeline, that applies a sequence of transformation steps encapsulated in four separate components. The transformations performed by the ProbLog pipeline reduce the expensive probabilistic inference task to a weighted model counting (WMC) problem that can be solved efficiently. The modularity of this architecture makes it possible to (i) substitute the implementation of one component with another; (ii) extend the inference pipeline with new components or processing steps; and (iii) build new inference and learning tasks with minimal effort. We discuss 14 implementations, 5 of which are newly introduced ProbLog inference pipelines. We then evaluate their performance and determine the crucial points in an inference pipeline. Next, we focus on optimizing the pipeline. We present a method that aims at improving knowledge compilation, one of the four main components, by compacting the input Boolean formulae. Our method detects seven subformula patterns (4 that retain logical equivalence and 3 that retain the WMC) and uses them to rewrite a given Boolean formula into a more compact representation.
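To make the reduction to weighted model counting concrete, the following is a minimal Python sketch, not the ProbLog2 implementation. It computes a WMC by brute-force enumeration rather than by knowledge compilation, and the two-fact program (`burglary`, `earthquake`, `alarm`) and the function name `weighted_model_count` are assumptions introduced only for illustration.

```python
from itertools import product


def weighted_model_count(variables, weights, formula):
    """Sum the weights of all truth assignments that satisfy `formula`.

    weights[v] is the weight of v being true; v being false has weight
    1 - weights[v]. Brute force: exponential in the number of variables.
    """
    total = 0.0
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            weight = 1.0
            for v in variables:
                weight *= weights[v] if assignment[v] else 1.0 - weights[v]
            total += weight
    return total


# Hypothetical program: 0.1::burglary. 0.2::earthquake.
#                       alarm :- burglary.  alarm :- earthquake.
weights = {"burglary": 0.1, "earthquake": 0.2}


def alarm(assignment):
    # Boolean formula for the query: alarm <-> burglary or earthquake.
    return assignment["burglary"] or assignment["earthquake"]


p_alarm = weighted_model_count(list(weights), weights, alarm)
print(p_alarm)  # probability of the query `alarm`
```

The enumeration visits every assignment, which is exactly the cost that compiling the formula into a more tractable representation is meant to avoid.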
Next, we augment the inference pipeline of ProbLog2 to handle two extensions of the ProbLog language: constraints and annotated disjunctions. Constraints are First-Order Logic sentences that need to hold; annotated disjunctions provide an intuitive way to encode random events with multiple, mutually exclusive outcomes. To incorporate constraints in ProbLog, we devise a method that (i) converts constraints into ProbLog syntax and (ii) uses the default ProbLog2 pipeline to perform inference. Annotated disjunctions were previously incorporated in the ProbLog language by means of an encoding that is correct for some of the inference tasks that ProbLog supports but incorrect for more recent ones. To provide support for annotated disjunctions that is correct for all inference tasks of ProbLog2, we devise a constraint-based method to encode annotated disjunctions as ProbLog programs. The modularity of the inference pipeline enabled a systematic design and implementation of these approaches.
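As an illustration of what encoding an annotated disjunction with independent probabilistic facts involves, the sketch below uses a commonly described "switch" encoding: outcome i fires when switch i is the first one that is on. This is not the constraint-based encoding developed in the thesis. The example disjunction (0.2 red, 0.3 green, 0.4 blue) and all function names are assumptions made for illustration.

```python
from itertools import product


def switch_probabilities(outcome_probs):
    """Map outcome probabilities to probabilities of independent switches.

    Switch i gets p_i / (1 - sum of earlier p_j). Assumes the partial sums
    stay below 1 before each division (otherwise this divides by zero).
    """
    switches, remaining = [], 1.0
    for p in outcome_probs:
        switches.append(p / remaining)
        remaining -= p
    return switches


def outcome_distribution(switches):
    """Enumerate all switch assignments and accumulate outcome probabilities.

    Returns (dist, none_prob): dist[i] is the probability that outcome i
    fires, none_prob the probability that no outcome fires.
    """
    dist = [0.0] * len(switches)
    none_prob = 0.0
    for values in product([True, False], repeat=len(switches)):
        weight = 1.0
        for on, p in zip(values, switches):
            weight *= p if on else 1.0 - p
        if True in values:
            dist[values.index(True)] += weight  # first switch that is on wins
        else:
            none_prob += weight
    return dist, none_prob


# Hypothetical annotated disjunction: 0.2::red; 0.3::green; 0.4::blue.
switches = switch_probabilities([0.2, 0.3, 0.4])
dist, none_prob = outcome_distribution(switches)
print(dist, none_prob)
```

Because exactly one outcome (or none) fires in every assignment, the outcomes are mutually exclusive and their marginal probabilities match the annotations up to floating-point rounding.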