Learning Perceptual Causality from Video
Amy Fire and Song-Chun Zhu


[Figure: Parse graphs]

Perceptual causality is the perception of causal relationships from observation. Humans, even as infants, form such models from observation of the world around them [Saxe and Carey 2006]. For a deeper understanding, the computer must build similar models through the analogous form of observation: video. In this paper, we provide a framework for the unsupervised learning of this perceptual causal structure from video. Our method takes action and object-status detections as input and uses heuristics suggested by cognitive science research to produce the causal links perceived between them.

[Figure: Complete C-AOG]

We greedily modify an initial distribution, in which potential causes and effects are independent, by adding the dependencies that maximize information gain. The learned causal relationships are compiled into a Causal And-Or Graph, a probabilistic and-or representation of causality that adds a prior on causality. Validated against human perception, experiments show that our method correctly learns causal relations, attributing status changes of objects to their causing actions amid irrelevant actions. It outperforms Hellinger's χ²-statistic by considering hierarchical action selection, and outperforms the treatment effect by discounting coincidental relationships.
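The greedy selection step above can be sketched as follows. This is a simplified, hypothetical Python illustration (the released code is MATLAB), scoring each candidate cause by empirical mutual information with the effect; the detection vectors, threshold, and function names are illustrative assumptions, not the authors' actual interface.

```python
import math

def entropy(values):
    """Empirical Shannon entropy of a list of discrete values (bits)."""
    n = len(values)
    probs = [values.count(v) / n for v in set(values)]
    return -sum(p * math.log2(p) for p in probs)

def info_gain(effect, cause):
    """I(effect; cause) = H(effect) - H(effect | cause), estimated empirically."""
    n = len(effect)
    cond = 0.0
    for c in set(cause):
        sub = [e for e, cc in zip(effect, cause) if cc == c]
        cond += (len(sub) / n) * entropy(sub)
    return entropy(effect) - cond

# Toy per-clip detections (illustrative data, not from the paper):
# did each action occur in the clip, and did the fluent change?
open_door = [1, 1, 0, 0, 1, 0, 1, 0]
wave_hand = [1, 1, 1, 1, 0, 0, 0, 0]   # irrelevant action
door_open = [1, 1, 0, 0, 1, 0, 1, 0]   # fluent change tracks open_door

candidates = {"open_door": open_door, "wave_hand": wave_hand}
threshold = 0.1          # assumed stopping criterion
links = []
remaining = dict(candidates)
while remaining:
    # Greedily pick the candidate cause with maximum information gain.
    best = max(remaining, key=lambda a: info_gain(door_open, remaining[a]))
    gain = info_gain(door_open, remaining[best])
    if gain < threshold:
        break                       # no remaining dependency is worth adding
    links.append((best, gain))
    del remaining[best]

print(links)  # → [('open_door', 1.0)]
```

The irrelevant action (`wave_hand`) contributes no information about the fluent change, so the greedy loop stops after adding the single genuine causal link.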


@article{fire-zhu-tist,
  title     = {Learning Perceptual Causality from Video},
  author    = {Fire, A. and Zhu, S.-C.},
  journal   = {ACM Trans. Intell. Syst. Technol.},
  publisher = {ACM}
}


The learning code, written in MATLAB, is available on GitHub.


The minimal data needed to replicate the experiments consists of action and fluent (object status) detections.
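As a hypothetical illustration of this minimal input, detections might be represented as labeled frame intervals for actions and time-stamped values for fluents; the tuple layouts and names below are assumptions for illustration, not the released data format.

```python
# Hypothetical minimal input (names and tuple layouts are illustrative):
# actions as (label, start_frame, end_frame), fluents as (fluent, value, frame).
action_detections = [
    ("open_door", 120, 150),
    ("wave_hand", 200, 230),
]
fluent_detections = [
    ("door_open", False, 119),
    ("door_open", True, 151),
]

def fluent_changes(detections):
    """Yield (fluent, frame) whenever a fluent's detected value flips."""
    last = {}
    for fluent, value, frame in sorted(detections, key=lambda d: d[2]):
        if fluent in last and last[fluent] != value:
            yield fluent, frame
        last[fluent] = value

print(list(fluent_changes(fluent_detections)))  # → [('door_open', 151)]
```

The fluent change at frame 151 falls just after the `open_door` interval, which is the kind of co-occurrence the learning method scores when proposing causal links.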