Multivariate continuous probability distribution
In statistics, the matrix F distribution (or matrix variate F distribution) is a matrix variate generalization of the F distribution which is defined on real-valued positive-definite matrices. In Bayesian statistics it can be used as the semi-conjugate prior for the covariance matrix or precision matrix of multivariate normal distributions, and related distributions.[1][2][3][4]
The probability density function of the matrix $F$ distribution is:
$$f_{\mathbf{X}}(\mathbf{X};\mathbf{\Psi},\nu,\delta)=\frac{\Gamma_{p}\left(\frac{\nu+\delta+p-1}{2}\right)}{\Gamma_{p}\left(\frac{\nu}{2}\right)\Gamma_{p}\left(\frac{\delta+p-1}{2}\right)|\mathbf{\Psi}|^{\frac{\nu}{2}}}~|\mathbf{X}|^{\frac{\nu-p-1}{2}}\,|\mathbf{I}_{p}+\mathbf{X}\mathbf{\Psi}^{-1}|^{-\frac{\nu+\delta+p-1}{2}}$$
where $\mathbf{X}$ and $\mathbf{\Psi}$ are $p\times p$ positive definite matrices, $|\cdot|$ is the determinant, $\Gamma_{p}(\cdot)$ is the multivariate gamma function, and $\mathbf{I}_{p}$ is the $p\times p$ identity matrix.
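The density above can be evaluated numerically on the log scale. The following is a minimal sketch, assuming NumPy and SciPy are available; the function name `matrix_f_logpdf` is hypothetical and not taken from any library, and both matrix arguments are assumed to be symmetric positive definite.

```python
import numpy as np
from scipy.special import multigammaln


def matrix_f_logpdf(X, Psi, nu, delta):
    """Log-density of the matrix F distribution F(Psi, nu, delta) at X.

    Hypothetical helper: X and Psi are assumed symmetric positive definite.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Psi = np.atleast_2d(np.asarray(Psi, dtype=float))
    p = X.shape[0]
    # slogdet returns (sign, log|det|); only the log-determinant is needed here.
    _, logdet_X = np.linalg.slogdet(X)
    _, logdet_Psi = np.linalg.slogdet(Psi)
    # |I_p + X Psi^{-1}| equals |I_p + Psi^{-1} X|, which avoids an explicit inverse.
    _, logdet_mix = np.linalg.slogdet(np.eye(p) + np.linalg.solve(Psi, X))
    a = 0.5 * (nu + delta + p - 1)
    log_norm = (
        multigammaln(a, p)
        - multigammaln(0.5 * nu, p)
        - multigammaln(0.5 * (delta + p - 1), p)
        - 0.5 * nu * logdet_Psi
    )
    return log_norm + 0.5 * (nu - p - 1) * logdet_X - a * logdet_mix


# Example values chosen only for illustration.
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])
X = np.array([[1.5, 0.2], [0.2, 0.8]])
print(matrix_f_logpdf(X, Psi, nu=5, delta=4))
```

Working with log-determinants and `multigammaln` rather than raw determinants and gamma functions is a design choice that keeps the computation stable for larger $p$ or large degrees of freedom.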
Construction of the distribution
The standard matrix F distribution, with an identity scale matrix $\mathbf{I}_{p}$, was originally derived by Olkin and Rubin.[1] If $\mathbf{\Phi}_{1}\sim\mathcal{W}(\mathbf{I}_{p},\nu)$ and $\mathbf{\Phi}_{2}\sim\mathcal{W}(\mathbf{I}_{p},\delta+p-1)$ are independent and $\mathbf{X}=\mathbf{\Phi}_{2}^{-1/2}\mathbf{\Phi}_{1}\mathbf{\Phi}_{2}^{-1/2}$, then $\mathbf{X}\sim\mathcal{F}(\mathbf{I}_{p},\nu,\delta)$.
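This construction translates directly into a sampler. The sketch below assumes SciPy's `wishart` distribution and takes the symmetric inverse square root of each $\mathbf{\Phi}_{2}$ from its eigendecomposition; the function name `sample_standard_matrix_f` and the parameter values are illustrative assumptions. As a sanity check, the Monte Carlo average is printed next to $\frac{\nu}{\delta-2}\mathbf{I}_{p}$, the value implied by the mean formula given later in the article.

```python
import numpy as np
from scipy.stats import wishart


def sample_standard_matrix_f(p, nu, delta, size, rng):
    """Draw `size` (>= 2) matrices X = Phi2^{-1/2} Phi1 Phi2^{-1/2} (hypothetical helper)."""
    eye = np.eye(p)
    phi1 = wishart.rvs(df=nu, scale=eye, size=size, random_state=rng)
    phi2 = wishart.rvs(df=delta + p - 1, scale=eye, size=size, random_state=rng)
    # Symmetric inverse square root: V diag(1/sqrt(w)) V^T for each Phi2 = V diag(w) V^T.
    w, v = np.linalg.eigh(phi2)
    inv_sqrt = (v / np.sqrt(w)[:, None, :]) @ np.swapaxes(v, -1, -2)
    return inv_sqrt @ phi1 @ inv_sqrt


rng = np.random.default_rng(0)
draws = sample_standard_matrix_f(p=2, nu=5, delta=8, size=20000, rng=rng)
print(draws.mean(axis=0))       # Monte Carlo estimate of E(X)
print(5 / (8 - 2) * np.eye(2))  # nu / (delta - 2) * I_p
```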
If $\mathbf{X}|\mathbf{\Phi}\sim\mathcal{W}^{-1}(\mathbf{\Phi},\delta+p-1)$ and $\mathbf{\Phi}\sim\mathcal{W}(\mathbf{\Psi},\nu)$, then, after integrating out $\mathbf{\Phi}$, $\mathbf{X}$ has a matrix F-distribution, i.e.,

$$f_{\mathbf{X}|\mathbf{\Psi},\nu,\delta}(\mathbf{X})=\int f_{\mathbf{X}|\mathbf{\Phi},\delta+p-1}(\mathbf{X})\,f_{\mathbf{\Phi}|\mathbf{\Psi},\nu}(\mathbf{\Phi})\,d\mathbf{\Phi}.$$
This construction is useful for building a semi-conjugate prior for a covariance matrix.[3]
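The two-stage form also gives a way to simulate from a matrix F distribution with a general scale matrix. The following is a sketch, assuming SciPy's `wishart` and `invwishart` follow the usual parameterizations; the function name and the example values of $\mathbf{\Psi}$, $\nu$ and $\delta$ are illustrative assumptions.

```python
import numpy as np
from scipy.stats import invwishart, wishart


def sample_matrix_f_via_inverse_wishart(Psi, nu, delta, size, rng):
    """Draw Phi ~ W(Psi, nu), then X | Phi ~ W^{-1}(Phi, delta + p - 1).

    Hypothetical helper; `size` is assumed to be at least 2.
    """
    Psi = np.asarray(Psi, dtype=float)
    p = Psi.shape[0]
    phis = wishart.rvs(df=nu, scale=Psi, size=size, random_state=rng)
    # The inverse-Wishart scale changes with every draw, so sample one matrix at a time.
    return np.stack([
        invwishart.rvs(df=delta + p - 1, scale=phi, random_state=rng)
        for phi in phis
    ])


rng = np.random.default_rng(42)
Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
draws = sample_matrix_f_via_inverse_wishart(Psi, nu=5, delta=10, size=5000, rng=rng)
print(draws.mean(axis=0))    # Monte Carlo estimate of E(X)
print(5 / (10 - 2) * Psi)    # nu / (delta - 2) * Psi
```

Marginalizing over $\mathbf{\Phi}$ happens implicitly here: only the second-stage draws are kept, so the returned matrices follow the integrated density.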
If $\mathbf{X}|\mathbf{\Phi}\sim\mathcal{W}(\mathbf{\Phi},\nu)$ and $\mathbf{\Phi}\sim\mathcal{W}^{-1}(\mathbf{\Psi},\delta+p-1)$, then, after integrating out $\mathbf{\Phi}$, $\mathbf{X}$ has a matrix F-distribution, i.e.,

$$f_{\mathbf{X}|\mathbf{\Psi},\nu,\delta}(\mathbf{X})=\int f_{\mathbf{X}|\mathbf{\Phi},\nu}(\mathbf{X})\,f_{\mathbf{\Phi}|\mathbf{\Psi},\delta+p-1}(\mathbf{\Phi})\,d\mathbf{\Phi}.$$
This construction is useful for building a semi-conjugate prior for a precision matrix.[4]
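The integral in this Wishart and inverse-Wishart mixture can be evaluated in closed form. The following is a sketch of the calculation, assuming the standard parameterizations of the Wishart and inverse Wishart densities (their normalizing constants are not stated in the text above) and writing $m=\delta+p-1$ as shorthand introduced only for this derivation:

```latex
\begin{aligned}
f_{\mathbf{X}|\mathbf{\Phi},\nu}(\mathbf{X})
  &= \frac{|\mathbf{X}|^{\frac{\nu-p-1}{2}}
           \exp\!\left(-\tfrac{1}{2}\operatorname{tr}(\mathbf{\Phi}^{-1}\mathbf{X})\right)}
          {2^{\frac{\nu p}{2}}\,|\mathbf{\Phi}|^{\frac{\nu}{2}}\,\Gamma_{p}\!\left(\tfrac{\nu}{2}\right)},
\qquad
f_{\mathbf{\Phi}|\mathbf{\Psi},m}(\mathbf{\Phi})
   = \frac{|\mathbf{\Psi}|^{\frac{m}{2}}\,|\mathbf{\Phi}|^{-\frac{m+p+1}{2}}
           \exp\!\left(-\tfrac{1}{2}\operatorname{tr}(\mathbf{\Psi}\mathbf{\Phi}^{-1})\right)}
          {2^{\frac{mp}{2}}\,\Gamma_{p}\!\left(\tfrac{m}{2}\right)}, \\[6pt]
\int |\mathbf{\Phi}|^{-\frac{\nu+m+p+1}{2}}
     \exp\!\left(-\tfrac{1}{2}\operatorname{tr}\big((\mathbf{X}+\mathbf{\Psi})\mathbf{\Phi}^{-1}\big)\right)
     d\mathbf{\Phi}
  &= 2^{\frac{(\nu+m)p}{2}}\,\Gamma_{p}\!\left(\tfrac{\nu+m}{2}\right)
     |\mathbf{X}+\mathbf{\Psi}|^{-\frac{\nu+m}{2}}, \\[6pt]
f_{\mathbf{X}|\mathbf{\Psi},\nu,\delta}(\mathbf{X})
  &= \frac{\Gamma_{p}\!\left(\tfrac{\nu+m}{2}\right)}
          {\Gamma_{p}\!\left(\tfrac{\nu}{2}\right)\Gamma_{p}\!\left(\tfrac{m}{2}\right)}\,
     |\mathbf{\Psi}|^{\frac{m}{2}}\,|\mathbf{X}|^{\frac{\nu-p-1}{2}}\,
     |\mathbf{X}+\mathbf{\Psi}|^{-\frac{\nu+m}{2}}.
\end{aligned}
```

The second line is the normalizing integral of an inverse Wishart kernel with scale $\mathbf{X}+\mathbf{\Psi}$ and $\nu+m$ degrees of freedom. Using $|\mathbf{X}+\mathbf{\Psi}|=|\mathbf{\Psi}|\,|\mathbf{I}_{p}+\mathbf{X}\mathbf{\Psi}^{-1}|$ and $m=\delta+p-1$, the powers of $|\mathbf{\Psi}|$ combine to $|\mathbf{\Psi}|^{-\nu/2}$, which recovers the density given at the start of the article.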
Marginal distributions from a matrix F distributed matrix
Suppose $\mathbf{A}\sim F(\mathbf{\Psi},\nu,\delta)$ has a matrix F distribution. Partition the matrices $\mathbf{A}$ and $\mathbf{\Psi}$ conformably with each other,

$$\mathbf{A}=\begin{bmatrix}\mathbf{A}_{11}&\mathbf{A}_{12}\\\mathbf{A}_{21}&\mathbf{A}_{22}\end{bmatrix},\;\mathbf{\Psi}=\begin{bmatrix}\mathbf{\Psi}_{11}&\mathbf{\Psi}_{12}\\\mathbf{\Psi}_{21}&\mathbf{\Psi}_{22}\end{bmatrix},$$

where $\mathbf{A}_{ij}$ and $\mathbf{\Psi}_{ij}$ are $p_{i}\times p_{j}$ matrices. Then $\mathbf{A}_{11}\sim F(\mathbf{\Psi}_{11},\nu,\delta)$.
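The marginal property can be checked by simulation for a $1\times 1$ leading block. The sketch below assumes SciPy; it draws $\mathbf{A}$ through the Wishart and inverse-Wishart mixture and compares $\mathbf{A}_{11}$ with a beta prime distribution with shape parameters $\nu/2$ and $\delta/2$ and scale $\mathbf{\Psi}_{11}$, which is what the density reduces to when $p=1$. The values of $\mathbf{\Psi}$, $\nu$, $\delta$ and the sample size are arbitrary assumptions for illustration.

```python
import numpy as np
from scipy.stats import betaprime, invwishart, kstest, wishart

rng = np.random.default_rng(1)
p, nu, delta, n = 3, 5, 6, 2000
Psi = np.array([[2.0, 0.5, 0.2],
                [0.5, 1.0, 0.3],
                [0.2, 0.3, 1.5]])

# Draw A ~ F(Psi, nu, delta) via Phi ~ W^{-1}(Psi, delta + p - 1) and A | Phi ~ W(Phi, nu),
# keeping only the top-left entry A_11 of each draw.
phis = invwishart.rvs(df=delta + p - 1, scale=Psi, size=n, random_state=rng)
a11 = np.array([wishart.rvs(df=nu, scale=phi, random_state=rng)[0, 0] for phi in phis])

# Reference law for the 1 x 1 block: F(Psi_11, nu, delta) written as a scaled beta prime.
reference = betaprime(nu / 2, delta / 2, scale=Psi[0, 0])
result = kstest(a11, reference.cdf)
print(result.statistic, result.pvalue)
```

A small Kolmogorov-Smirnov statistic (and a p-value that is not close to zero) is consistent with the marginal having the same $\nu$ and $\delta$ as the full matrix, with only the scale replaced by $\mathbf{\Psi}_{11}$.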
Let $\mathbf{X}\sim F(\mathbf{\Psi},\nu,\delta)$.

The mean is given by:

$$E(\mathbf{X})=\frac{\nu}{\delta-2}\mathbf{\Psi}.$$
The (co)variances of the elements of $\mathbf{X}$ are given by:[3]

$$\operatorname{cov}(X_{ij},X_{ml})=\Psi_{ij}\Psi_{ml}\,\frac{2\nu^{2}+2\nu(\delta-2)}{(\delta-1)(\delta-2)^{2}(\delta-4)}+(\Psi_{il}\Psi_{jm}+\Psi_{im}\Psi_{jl})\left(\frac{2\nu+\nu^{2}(\delta-2)+\nu(\delta-2)}{(\delta-1)(\delta-2)^{2}(\delta-4)}+\frac{\nu}{(\delta-2)^{2}}\right).$$
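The covariance expression is easy to mistype, so a direct transcription into code can help. The sketch below assumes NumPy; the function name `matrix_f_cov` is hypothetical, indices are 0-based, and the guard $\delta>4$ is an assumption added here because the denominator vanishes at $\delta=4$.

```python
import numpy as np


def matrix_f_cov(Psi, nu, delta, i, j, m, l):
    """cov(X_ij, X_ml) for X ~ F(Psi, nu, delta), with 0-based indices (hypothetical helper)."""
    if delta <= 4:
        # Assumption: the formula is only used where its denominator is positive.
        raise ValueError("the covariance formula is evaluated here only for delta > 4")
    Psi = np.asarray(Psi, dtype=float)
    denom = (delta - 1) * (delta - 2) ** 2 * (delta - 4)
    # Coefficient of Psi_ij * Psi_ml.
    c1 = (2 * nu**2 + 2 * nu * (delta - 2)) / denom
    # Coefficient of (Psi_il * Psi_jm + Psi_im * Psi_jl).
    c2 = (2 * nu + nu**2 * (delta - 2) + nu * (delta - 2)) / denom + nu / (delta - 2) ** 2
    return Psi[i, j] * Psi[m, l] * c1 + (Psi[i, l] * Psi[j, m] + Psi[i, m] * Psi[j, l]) * c2


# Example values chosen only for illustration.
Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
print(matrix_f_cov(Psi, nu=5, delta=6, i=0, j=0, m=0, l=0))  # variance of X_11
print(matrix_f_cov(Psi, nu=5, delta=6, i=0, j=1, m=0, l=1))  # variance of X_12
```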
The matrix F-distribution has also been termed the multivariate beta II distribution.[5] See also [6] for a univariate version.
A univariate version of the matrix F distribution is the F-distribution. With $p=1$ (i.e. univariate), $\mathbf{\Psi}=\delta/\nu$, and $x=\mathbf{X}$, the probability density function of the matrix F distribution becomes that of the univariate F distribution with $\nu$ and $\delta$ degrees of freedom:

$$f_{x\mid\nu,\delta}(x)=\operatorname{B}\left(\tfrac{\nu}{2},\tfrac{\delta}{2}\right)^{-1}\left(\tfrac{\nu}{\delta}\right)^{\nu/2}x^{\nu/2-1}\left(1+\tfrac{\nu}{\delta}\,x\right)^{-(\nu+\delta)/2}.$$
In the univariate case, with $p=1$ and $x=\mathbf{X}$, setting $\nu=1$ makes $\sqrt{x}$ follow a half-t distribution with scale parameter $\sqrt{\psi}$ and $\delta$ degrees of freedom. The half-t distribution is a common prior for standard deviations.[7]
[1] Olkin, Ingram; Rubin, Herman (1964-03-01). "Multivariate Beta Distributions and Independence Properties of the Wishart Distribution". The Annals of Mathematical Statistics. 35 (1): 261–269. doi:10.1214/aoms/1177703748. ISSN 0003-4851.
[2] Dawid, A. P. (1981). "Some matrix-variate distribution theory: Notational considerations and a Bayesian application". Biometrika. 68 (1): 265–274. doi:10.1093/biomet/68.1.265. ISSN 0006-3444.
[3] Mulder, Joris; Pericchi, Luis Raúl (2018-12-01). "The Matrix-F Prior for Estimating and Testing Covariance Matrices". Bayesian Analysis. 13 (4). doi:10.1214/17-BA1092. ISSN 1936-0975. S2CID 126398943.
[4] Williams, Donald R.; Mulder, Joris (2020-12-01). "Bayesian hypothesis testing for Gaussian graphical models: Conditional independence and order constraints". Journal of Mathematical Psychology. 99: 102441. doi:10.1016/j.jmp.2020.102441. S2CID 225019695.
[5] Tan, W. Y. (1969-03-01). "Note on the Multivariate and the Generalized Multivariate Beta Distributions". Journal of the American Statistical Association. 64 (325): 230–241. doi:10.1080/01621459.1969.10500966. ISSN 0162-1459.
[6] Pérez, María-Eglée; Pericchi, Luis Raúl; Ramírez, Isabel Cristina (2017-09-01). "The Scaled Beta2 Distribution as a Robust Prior for Scales". Bayesian Analysis. 12 (3). doi:10.1214/16-BA1015. ISSN 1936-0975.
[7] Gelman, Andrew (2006-09-01). "Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper)". Bayesian Analysis. 1 (3). doi:10.1214/06-BA117A. ISSN 1936-0975.