# Probability Theory
It is sometimes said that [[self.stats/Statistics]] is just inverse probability. What does that really mean?
A key concept in the field of [[Artificial Intelligence]], and in life in general, is uncertainty. It arises both through noise on measurements and through the finite size of datasets.
## Definition of Probability
*Probability theory provides a consistent framework for the quantification and manipulation of uncertainty and forms one of the central foundations of pattern recognition.*
## Axioms of Probability
$P(A) \geq 0$
$P(\Omega) = 1$
$P\big(\bigcup_{n=1}^{\infty} A_n\big) = \sum_{n=1}^{\infty}P(A_n) \quad (A_i \cap A_j = \emptyset \text{ for } i \neq j)$
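The three axioms can be checked by hand on a small finite example. The sketch below assumes a hypothetical sample space (a fair six-sided die) and a helper `prob` that is not part of the note. On a finite space, the third axiom reduces to finite additivity, which is all this example exercises.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample space (an assumption for illustration): a fair die.
omega = frozenset(range(1, 7))
p_outcome = {w: Fraction(1, 6) for w in omega}

def prob(event):
    """P(A) for an event A given as a set of outcomes in omega."""
    return sum((p_outcome[w] for w in event), Fraction(0))

# Every event is a subset of omega; enumerate all 2**6 of them.
events = [frozenset(c)
          for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

# Axiom 1: P(A) >= 0 for every event A.
assert all(prob(a) >= 0 for a in events)

# Axiom 2: P(Omega) = 1.
assert prob(omega) == 1

# Axiom 3 (finite case): disjoint events add.
evens, ones = frozenset({2, 4, 6}), frozenset({1})
assert evens & ones == frozenset()
assert prob(evens | ones) == prob(evens) + prob(ones)
```

Exact fractions are used instead of floats so that the equalities hold without rounding tolerance.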
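Two rules for manipulating the joint probability of discrete variables $X$ and $Y$, where $Y$ takes the values $y_1, \dots, y_L$:
$ \begin{aligned}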
P(X = x_i) &= \sum_{j=1}^{L}P(X=x_i, Y=y_j) \\
\\
P(X) &= \sum_Y P(X,Y) \rightarrow \text{sum rule}\\
\\
P(X,Y) &= P(Y|X)P(X) \rightarrow \text{product rule}\\
\end{aligned} $
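The sum and product rules above can be sketched on a small joint table. The joint distribution and the function names `marginal_x` and `conditional_y_given_x` are hypothetical, chosen only for illustration. Because the conditional is computed here as the ratio of joint to marginal, the final loop is a consistency check rather than an independent verification.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X, Y); keys are (x, y) pairs.
joint = {
    ("x1", "y1"): Fraction(1, 8), ("x1", "y2"): Fraction(3, 8),
    ("x2", "y1"): Fraction(1, 4), ("x2", "y2"): Fraction(1, 4),
}

def marginal_x(x):
    """Sum rule: P(X=x) = sum over y of P(X=x, Y=y)."""
    return sum((p for (xi, _), p in joint.items() if xi == x), Fraction(0))

def conditional_y_given_x(y, x):
    """P(Y=y | X=x) = P(X=x, Y=y) / P(X=x), defined when P(X=x) > 0."""
    return joint[(x, y)] / marginal_x(x)

# Product rule: P(X, Y) = P(Y | X) P(X) for every cell of the table.
for (x, y), p_xy in joint.items():
    assert p_xy == conditional_y_given_x(y, x) * marginal_x(x)

print(marginal_x("x1"))                   # 1/2
print(conditional_y_given_x("y2", "x1"))  # 3/4
```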
## Probability Densities
$ \begin{aligned}
p(x \in (a,b)) &= \int_a^b p(x)dx \\
\\
p(x) &\geq 0 \\
\\
\int_{-\infty}^{\infty}p(x)dx &= 1
\end{aligned} $
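These properties of a density can be checked numerically. The sketch below assumes a standard Gaussian as the example density and a simple midpoint-rule integrator. Both are illustrative choices, not something the note prescribes, and the infinite integration range is truncated to $(-10, 10)$, where the neglected tails are negligible.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over (a, b)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Normalization: the density integrates to (approximately) 1.
total = integrate(gaussian_pdf, -10.0, 10.0)

# Probability of an interval: P(x in (-1, 1)) is the integral of p(x) over it.
p_interval = integrate(gaussian_pdf, -1.0, 1.0)

print(round(total, 6), round(p_interval, 4))
```

Note that $p(x)$ itself is not a probability. Only its integral over an interval is, which is why a density may exceed 1 at a point as long as it integrates to 1.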
Under a nonlinear change of variable, a probability density transforms differently from a simple function because of the Jacobian factor: if $x = g(y)$, then $p_y(y) = p_x(g(y))\,|g'(y)|$.
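A small numerical sketch can make the role of the Jacobian factor concrete. It assumes a hypothetical example, $x \sim \text{Uniform}(0, 1)$ with $y = x^2$, so that $x = g(y) = \sqrt{y}$ and $|g'(y)| = 1/(2\sqrt{y})$. The function names are illustrative only.

```python
import math

def p_x(x):
    """Density of x ~ Uniform(0, 1) (assumed example)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def g(y):
    """Inverse map x = g(y) for the change of variable y = x**2 on (0, 1)."""
    return math.sqrt(y)

def g_prime(y):
    """Derivative of g, which supplies the Jacobian factor."""
    return 0.5 / math.sqrt(y)

def p_y(y):
    """Transformed density p_y(y) = p_x(g(y)) * |g'(y)|."""
    return p_x(g(y)) * abs(g_prime(y))

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over (a, b)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.25, 1.0
prob_x = integrate(p_x, g(a), g(b))           # P(x in (0.5, 1))
prob_y = integrate(p_y, a, b)                 # P(y in (0.25, 1)), with Jacobian
naive = integrate(lambda y: p_x(g(y)), a, b)  # Jacobian omitted

print(round(prob_x, 4), round(prob_y, 4), round(naive, 4))
```

With the Jacobian factor, the probability of the interval is the same in both variables (about 0.5). Substituting $g(y)$ into $p_x$ without the factor gives 0.75, which is not a valid probability for this event.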
![[probability_mindmap_mml.png]]