PGM Representation
These notes were completed with the assistance of ChatGPT.
Lecture 20: PGM Representation (COMP90051 Statistical Machine Learning)
- Probabilistic Graphical Models
  - Motivation: applications, unifies algorithms
  - Motivation: ideal tool for Bayesians
  - Independence lowers computational/model complexity
  - PGMs: compact representation of factorised joints
  - U-PGMs (undirected PGMs)
- Example PGMs and Applications
- Additional Resource
  - (ML 13.1) Directed graphical models - introductory examples (part 1)
- Next time: elimination for probabilistic inference
Topics Covered:
- (Directed) Probabilistic Graphical Models (D-PGMs): graphical representations of probability distributions. Arrows point from one variable to another, encoding the conditional dependence structure; in many applications this direction is given a causal or generative reading.
 
- Motivations for Using PGMs:
  - Applications & Unification of Algorithms: PGMs appear across many applications and unify a range of algorithms under a single framework.
  - Ideal Tool for Bayesians: Bayesians use probability distributions to represent uncertainty, and PGMs provide a structured way to represent those distributions, making them an ideal tool for Bayesian analysis.
 
- Importance of Independence:
  - Lowers Computational/Model Complexity: when variables are independent, both computation and the model itself become simpler, because you do not need to account for relationships between every pair of variables.
  - Conditional Independence: a specific type of independence where two variables are independent given the value of a third. For example, if A and B are conditionally independent given C, then knowing the value of C breaks any dependence between A and B, as the factorisation below makes precise.
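  Concretely, \(A \perp B \mid C\) means \[ \Pr(A, B \mid C) = \Pr(A \mid C)\,\Pr(B \mid C), \] or equivalently \(\Pr(A \mid B, C) = \Pr(A \mid C)\): once \(C\) is observed, \(B\) carries no additional information about \(A\).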
 
- PGMs as a Compact Representation:
  - Factorised Joints: PGMs represent joint probability distributions compactly by breaking them into smaller factors, which makes computation more efficient; the parameter count below shows the scale of the saving.
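  To make the saving concrete (illustrative numbers, not from the slides): a full joint over \(n\) binary variables has \(2^n - 1\) free parameters, while a factorised model in which each variable has at most \(k\) parents needs only about \(n \cdot 2^k\). For \(n = 20\) and \(k = 2\), that is \(2^{20} - 1 \approx 10^6\) parameters versus \(20 \times 2^2 = 80\).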
 
- Undirected PGMs and Conversion from D-PGMs:
  - While directed PGMs are often read causally, undirected PGMs (U-PGMs) represent symmetric, non-causal relationships between variables. A directed PGM can be converted to an undirected one by moralisation; a sketch follows below.
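A minimal sketch of moralisation, the standard D-PGM-to-U-PGM conversion: connect each node's parents to each other, then drop all edge directions. The graph and code here are illustrative, not from the lecture:

```python
from itertools import combinations

def moralise(parents):
    """parents: dict mapping each node to a list of its parent nodes.
    Returns the edge set of the moral (undirected) graph."""
    edges = set()
    for child, pas in parents.items():
        # Drop direction: connect each node to its parents.
        for p in pas:
            edges.add(frozenset((p, child)))
        # "Marry" the parents: connect every pair of co-parents.
        for p, q in combinations(pas, 2):
            edges.add(frozenset((p, q)))
    return edges

# Classic v-structure A -> C <- B: moralisation adds the edge A - B.
print(moralise({"A": [], "B": [], "C": ["A", "B"]}))
```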
 
- Example PGMs & Applications:
  - Specific examples of PGMs and how they are applied in real-world scenarios.
 
Notes:
- PGMs provide a visual representation of complex probability distributions. 
- They can represent both directed (often interpreted causally) and undirected (symmetric, non-causal) relationships. 
- Independence and conditional independence are crucial concepts in PGMs, simplifying computations and the model. 
- PGMs are widely used in machine learning and statistics for various applications. 
Lecture Summary: Probabilistic Graphical Models (PGMs) in Statistical Machine Learning
- Introduction to PGMs:
  - PGMs provide a compact representation of factorised joint distributions, making them ideal for Bayesian modelling.
 
- Joint Distributions:
  - Represented as \(\Pr(X_1, X_2, ..., X_n)\). Working with these directly becomes computationally intensive as the number of variables grows.
 
- Probabilistic Inference:
  - Bayes Rule: \[ \Pr(A \mid B) = \frac{\Pr(B \mid A) \times \Pr(A)}{\Pr(B)} \]
  - Marginalisation: \[ \Pr(A) = \sum_{B} \Pr(A, B) \]
  - These tools allow us to derive probabilities and update beliefs based on new evidence; a numeric sketch follows.
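A small numeric check of both tools, using hypothetical disease-test numbers (the variable names and values are assumptions for illustration):

```python
p_disease = 0.01                      # prior Pr(D=1)
p_pos_given_disease = 0.95            # sensitivity Pr(T=1 | D=1)
p_pos_given_healthy = 0.05            # false-positive rate Pr(T=1 | D=0)

# Marginalisation: Pr(T=1) = sum_D Pr(T=1 | D) Pr(D)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes rule: Pr(D=1 | T=1) = Pr(T=1 | D=1) Pr(D=1) / Pr(T=1)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"Pr(D=1 | T=1) = {p_disease_given_pos:.3f}")   # ~0.161
```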
- Factoring Joint Distributions:
  - Chain Rule: \[ \Pr(X_1, X_2, ..., X_n) = \prod_{i=1}^{n} \Pr(X_i \mid X_1, ..., X_{i-1}) \]
  - This expresses a joint distribution as a product of conditional probabilities; independence assumptions can then simplify the product, as in the example below.
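  For instance, with \(n = 3\): \[ \Pr(X_1, X_2, X_3) = \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_1, X_2). \] If we additionally assume \(X_3 \perp X_1 \mid X_2\) (a Markov-chain structure), the last factor shrinks to \(\Pr(X_3 \mid X_2)\).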
- Directed PGMs (Bayesian Networks):
  - Nodes represent random variables; directed edges, which must form an acyclic graph, represent conditional dependencies.
  - Joint distribution factorisation: \[ \Pr(X_1, X_2, ..., X_n) = \prod_{i} \Pr(X_i \mid \text{parents}(X_i)) \]
  - The joint probability is the product of each variable's probability given its parents in the graph; see the sketch below.
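A minimal sketch of this factorisation on the classic Rain/Sprinkler/GrassWet network (the conditional probability table values below are hypothetical):

```python
p_rain = {True: 0.2, False: 0.8}                         # Pr(R)
p_sprinkler = {True: {True: 0.01, False: 0.99},          # Pr(S | R)
               False: {True: 0.4, False: 0.6}}
p_wet = {(True, True): 0.99, (True, False): 0.8,         # Pr(W=1 | R, S)
         (False, True): 0.9, (False, False): 0.0}

def joint(r, s, w):
    """Pr(R=r, S=s, W=w) = Pr(r) * Pr(s | r) * Pr(w | r, s)."""
    pw = p_wet[(r, s)]
    return p_rain[r] * p_sprinkler[r][s] * (pw if w else 1 - pw)

# Sanity check: the factorised joint sums to 1 over all assignments.
total = sum(joint(r, s, w) for r in (True, False)
            for s in (True, False) for w in (True, False))
print(total)  # ~1.0 (up to float rounding)
```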
- Naïve Bayes Classifier:
  - Assumes features \(X_1, ..., X_d\) are conditionally independent given the class label \(Y\).
  - Joint probability: \[ \Pr(Y, X_1, ..., X_d) = \Pr(Y) \times \prod_{i=1}^{d} \Pr(X_i \mid Y) \]
  - For prediction, it selects the \(Y\) maximising \(\Pr(Y \mid X_1, ..., X_d)\); since \(\Pr(X_1, ..., X_d)\) does not depend on \(Y\), this is equivalent to maximising the joint above. A sketch follows.
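A minimal prediction sketch under these assumptions, with hypothetical priors and per-class feature likelihoods (computed in log space for numerical stability):

```python
import math

prior = {"spam": 0.3, "ham": 0.7}                        # Pr(Y)
likelihood = {                                            # Pr(X_i = 1 | Y)
    "spam": [0.8, 0.6, 0.1],
    "ham":  [0.2, 0.3, 0.4],
}

def predict(x):
    """Return argmax_Y Pr(Y) * prod_i Pr(X_i = x_i | Y)."""
    scores = {}
    for y in prior:
        logp = math.log(prior[y])
        for p, xi in zip(likelihood[y], x):
            logp += math.log(p if xi else 1 - p)
        scores[y] = logp
    return max(scores, key=scores.get)

print(predict([1, 1, 0]))  # "spam" under these numbers
```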
- Benefits of PGMs:
  - Structured, efficient, and intuitive probabilistic modelling: the factorisation and independence assumptions reduce both the computational burden and the risk of overfitting.
 
