TY - JOUR
T1 - Conditional sum-product networks
T2 - Modular probabilistic circuits via gate functions
AU - Shao, Xiaoting
AU - Molina, Alejandro
AU - Vergari, Antonio
AU - Stelzner, Karl
AU - Peharz, Robert
AU - Liebig, Thomas
AU - Kersting, Kristian
N1 - Publisher Copyright:
© 2021 The Authors
PY - 2022
Y1 - 2022
N2 - While probabilistic graphical models are a central tool for reasoning under uncertainty in AI, they are in general not as expressive as deep neural models, and inference is notoriously hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions and ensure tractable inference, but still lack the expressive power of intractable models based on deep neural networks. In this paper, we introduce conditional SPNs (CSPNs)—conditional density estimators for multivariate and potentially hybrid domains—and develop a structure-learning approach that derives both the structure and parameters of CSPNs from data. To harness the expressive power of deep neural networks (DNNs), we also show how to realize CSPNs by conditioning the parameters of vanilla SPNs on the input using DNNs as gate functions. In contrast to SPNs, whose high-level structure cannot be explicitly manipulated, CSPNs can naturally be used as tractable building blocks of deep probabilistic models whose modular structure maintains high-level interpretability. In experiments, we demonstrate that CSPNs are competitive with other probabilistic models and yield superior performance on structured prediction, conditional density estimation, auto-regressive image modeling, and multilabel image classification. In particular, we show that employing CSPNs as encoders and decoders within variational autoencoders can help to relax the commonly used mean-field assumption and in turn improve performance.
AB - While probabilistic graphical models are a central tool for reasoning under uncertainty in AI, they are in general not as expressive as deep neural models, and inference is notoriously hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions and ensure tractable inference, but still lack the expressive power of intractable models based on deep neural networks. In this paper, we introduce conditional SPNs (CSPNs)—conditional density estimators for multivariate and potentially hybrid domains—and develop a structure-learning approach that derives both the structure and parameters of CSPNs from data. To harness the expressive power of deep neural networks (DNNs), we also show how to realize CSPNs by conditioning the parameters of vanilla SPNs on the input using DNNs as gate functions. In contrast to SPNs, whose high-level structure cannot be explicitly manipulated, CSPNs can naturally be used as tractable building blocks of deep probabilistic models whose modular structure maintains high-level interpretability. In experiments, we demonstrate that CSPNs are competitive with other probabilistic models and yield superior performance on structured prediction, conditional density estimation, auto-regressive image modeling, and multilabel image classification. In particular, we show that employing CSPNs as encoders and decoders within variational autoencoders can help to relax the commonly used mean-field assumption and in turn improve performance.
KW - Conditional probabilistic modeling
KW - Probabilistic circuits
KW - Sum-product networks
KW - Tractable inference
KW - Variational inference
UR - http://www.scopus.com/inward/record.url?scp=85118751823&partnerID=8YFLogxK
U2 - 10.1016/j.ijar.2021.10.011
DO - 10.1016/j.ijar.2021.10.011
M3 - Article
SN - 1873-4731
VL - 140
SP - 298
EP - 313
JO - International Journal of Approximate Reasoning
JF - International Journal of Approximate Reasoning
ER -