Scoped Entropy

WORKING DRAFT

Both variance and entropy are measures of uncertainty and thus of information. Scoped entropy is a generalization of both statistical variance and Shannon entropy [1]. It generalizes the two jointly by means of a construct called a “scope of relevance” [2]. Scoped entropy can be applied to both discrete and continuous variables.

A random object is a function whose domain is a probability space [3]. When the function values are real numbers, it is a random variable (also called a real random object in this document). In this document, a finite random object means a random object that takes on finitely many values; in other words, the range of a finite random object is a finite set.
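For example, the roll of a fair six-sided die can be modeled as a finite random object \(X\) on a probability space with \(\Omega = \{1, \dots, 6\}\) and \(X(\omega) = \omega\); its range is the finite set \(\{1, \dots, 6\}\), and since these values are real numbers it is also a random variable.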

Both entropy and variance are functions of random objects. In the case of Shannon entropy, the random object is finite (with values often referred to as symbols). In the case of variance, the random object is a (real) random variable. The distances between values of a finite random variable affect variance, but not Shannon entropy.
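This distinction can be seen in a small example: let \(X\) take the values \(0\) and \(1\) and \(Y\) take the values \(0\) and \(100\), each with probability \(\tfrac{1}{2}\). The two have equal Shannon entropy but very different variances: \[\begin{eqnarray*} \operatorname{H}(X) \;=\; \operatorname{H}(Y) & = & \tfrac{1}{2}\log_2 2 + \tfrac{1}{2}\log_2 2 \;=\; 1 \text{ bit} \\ \operatorname{Var}(X) & = & \tfrac{1}{2}\left(0-\tfrac{1}{2}\right)^2 + \tfrac{1}{2}\left(1-\tfrac{1}{2}\right)^2 \;=\; \tfrac{1}{4} \\ \operatorname{Var}(Y) & = & \tfrac{1}{2}(0-50)^2 + \tfrac{1}{2}(100-50)^2 \;=\; 2500 \\ \end{eqnarray*}\]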

Given a scope of relevance \(\theta\) and a finite or real random object \(X\), the scoped entropy of \(X\) is denoted \[ \operatorname{V}_{ \theta}(X) \]

There is a unirelevant decomposition function \(\mathfrak{h}\) which, given any finite random object \(X\), generates a scope \(\mathfrak{h}(X)\) such that \[\begin{eqnarray*} \operatorname{V}_{ \mathfrak{h}(X)}(X) & = & \operatorname{H}(X) \\ \end{eqnarray*}\] where \(\operatorname{H}(X)\) is Shannon entropy.

Similarly, there is a distribution decomposition function \(\mathfrak{d}\) which, given any random variable \(X\), generates a scope \(\mathfrak{d}(X)\) such that \[\begin{eqnarray*} \operatorname{V}_{ \mathfrak{d}(X)}(X) & = & \operatorname{Var}(X) \\ \end{eqnarray*}\]

Scoped Entropy

Given \(\theta\), a scope of relevance [2], let:

\[\begin{eqnarray*}
\mathcal{A}_\theta(0) & := & \{ \Omega \} \\
\mathcal{A}_\theta(i+1) & := & \left\{ A \cap B : A \in \mathcal{A}_\theta(i), B \in \pi, \pi \in \operatorname{dom}_{ i}{ \theta} \right\} \\
h(A|B) & := & \operatorname{P}(A|B) \log_2\frac{1}{\operatorname{P}(A|B)} \\
\operatorname{U}_{ \pi}(Q) & := & \operatorname{P}({\cup{ \pi}}|Q) \sum_{S \in \pi} h(S|{\cup{ \pi}} \cap Q) \\
\operatorname{U}_{ \theta}(Q) & := & \sum^\infty_{i = 0} \sum_{A \in \mathcal{A}_\theta(i)} \operatorname{P}(A|Q) \sum_{\pi \in \operatorname{dom}_{ i}{ \theta}} \theta(\pi) \operatorname{U}_{ \pi}(A \cap Q) \\
\operatorname{U}_{ \theta}(\pi) & := & \sup \left\{ \sum_{Q \in \rho} \operatorname{P}(Q) \operatorname{U}_{ \theta}(Q) : \text{ finite partition } \rho \underset{\text{(or equal to)}}{\text{ coarser than }} \pi \right\} \\
\operatorname{V}_{ \theta}(Q) & := & \operatorname{U}_{ \theta}(\Omega) - \operatorname{U}_{ \theta}(Q) \\
\operatorname{V}_{ \theta}(\pi) & := & \operatorname{U}_{ \theta}(\{\Omega\}) - \operatorname{U}_{ \theta}(\pi) \\
\ker{X} & := & \left\{ \{ \omega \in \Omega : X(\omega)=v \} : v \text{ in the range of } X \right\} \\
\operatorname{U}_{ \theta}(X) & := & \operatorname{U}_{ \theta}(\ker{X}) \\
\operatorname{V}_{ \theta}(X) & := & \operatorname{V}_{ \theta}(\ker{X}) \\
\end{eqnarray*}\]
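To make these definitions concrete, the following is a minimal computational sketch, assuming a finite sample space and a depth-\(0\) scope (a scope whose domain consists of partitions at level \(i = 0\) only, so that only the \(i = 0\) term of \(\operatorname{U}_{ \theta}(Q)\) contributes). The supremum over finite partitions \(\rho\) in the definition of \(\operatorname{U}_{ \theta}(\pi)\) is not searched; only the candidate \(\rho = \pi\) is evaluated, which suffices to illustrate the single-partition scope used in the next section. All identifiers are choices of this sketch rather than of the references.

```python
from math import log2

# Sketch only: scoped entropy on a finite sample space with a depth-0 scope.
# Events and partition cells are frozensets of outcomes; `prob` maps each
# outcome to its probability; a scope is a list of (partition, weight) pairs.

def p(event, prob):
    return sum(prob[w] for w in event)

def p_cond(a, b, prob):
    pb = p(b, prob)
    return p(a & b, prob) / pb if pb > 0 else 0.0

def h(a, b, prob):
    # h(A|B) = P(A|B) log2(1/P(A|B)), with the convention 0 log2(1/0) = 0
    q = p_cond(a, b, prob)
    return q * log2(1.0 / q) if q > 0 else 0.0

def U_pi(pi, Q, prob):
    # U_pi(Q) = P(union(pi)|Q) * sum over S in pi of h(S | union(pi) & Q)
    union = frozenset().union(*pi)
    return p_cond(union, Q, prob) * sum(h(S, union & Q, prob) for S in pi)

def U_theta(theta, Q, prob):
    # U_theta(Q) for a depth-0 scope: the only A in A_theta(0) is Omega,
    # with P(Omega|Q) = 1 and A & Q = Q, and the deeper terms vanish.
    return sum(w * U_pi(pi, Q, prob) for pi, w in theta)

def V_theta(theta, pi, omega, prob):
    # V_theta(pi) = U_theta of the event Omega (equal to U_theta({Omega}))
    # minus U_theta(pi), evaluating only rho = pi instead of the supremum.
    u_trivial = U_theta(theta, omega, prob)
    u_pi = sum(p(Q, prob) * U_theta(theta, Q, prob) for Q in pi)
    return u_trivial - u_pi

# Single-partition scope for a fair coin flip X: one partition ker X, weight 1.
omega = frozenset({"H", "T"})
prob = {"H": 0.5, "T": 0.5}
ker_X = (frozenset({"H"}), frozenset({"T"}))
theta = [(ker_X, 1.0)]

print(V_theta(theta, ker_X, omega, prob))  # prints 1.0, matching H(X) = 1 bit
```

Running the sketch prints \(1.0\), which matches \(\operatorname{H}(X) = 1\) bit for a fair coin, as the next section shows in general.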

Shannon Entropy Equality

Given any finite random object \(X\), the unirelevant decomposition \(\mathfrak{h}(X)\) is the scope \(\theta\) that assigns \(1\) to the single partition consisting of the events on which \(X\) takes each of its values: \[\begin{eqnarray*} \operatorname{dom}{ \theta} & = & \{\ker{X}\} \\ \theta(\ker{X}) & = & 1 \\ \end{eqnarray*}\]
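For instance, if \(X\) is the outcome of a fair coin flip with values heads and tails, then \(\ker{X}\) is the two-cell partition \(\left\{ \{X = \text{heads}\}, \{X = \text{tails}\} \right\}\), and the unirelevant decomposition of \(X\) is the scope with \(\operatorname{dom}{ \theta} = \{\ker{X}\}\) and \(\theta(\ker{X}) = 1\).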

Theorem: Unirelevant Scoped Entropy equals Shannon Entropy

Given any finite random objects \(X\) and \(Y\) and scope \(\theta\) equal to the unirelevant decomposition of \(X\), \[\begin{eqnarray*} \operatorname{U}_{ \theta}(Y) & = & \operatorname{H}(X|Y) \\ \end{eqnarray*}\] It follows as a corollary that \[\begin{eqnarray*} \operatorname{V}_{ \theta}(X) & = & \operatorname{H}(X) \\ \end{eqnarray*}\]

Proof

Consider the single-partition scope \(\theta = \{\pi \mapsto 1 \}\) with \(\pi = \ker{X}\), so that \({\cup{ \pi}} = \Omega\). Then \(\operatorname{dom}_{ i}{ \theta}\) is non-empty only at \(i=0\), where \(\operatorname{dom}_{ 0}{ \theta} = \{\pi\}\) and \(\mathcal{A}_\theta(0) = \{\Omega\}\), so only the \(i=0\) term contributes to \(\operatorname{U}_{ \theta}(Q)\). Thus by definition \[\begin{eqnarray*} \operatorname{U}_{ \theta}(Y) & = & \sum_{Q \in \ker{Y}} \operatorname{P}(Q) \operatorname{P}(\Omega|Q) \theta(\pi) \operatorname{P}({\cup{ \pi}}|\Omega,Q) \sum_{B \in \pi} h(B|{\cup{ \pi}},\Omega,Q) \\ & = & \sum_{Q \in \ker{Y}} \operatorname{P}(Q) \sum_{B \in \pi} h(B|Q) \\ & = & \sum_{Q \in \ker{Y}} \sum_{B \in \pi} \operatorname{P}(Q \cap B) \log_2\frac{1}{\operatorname{P}(B|Q)} \\ & = & \operatorname{H}(X|Y) \\ \end{eqnarray*}\] The corollary follows from \[ \operatorname{V}_{ \theta}(X) = \operatorname{U}_{ \theta}(\{\Omega\}) - \operatorname{U}_{ \theta}(X) = \operatorname{H}(X|\{\Omega\}) - \operatorname{H}(X|X) = \operatorname{H}(X) \]
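As a worked instance of the theorem, let \(X\) be the outcome of a fair six-sided die, \(\theta\) its unirelevant decomposition, and \(Y\) the parity (odd or even) of \(X\). Conditioning on either parity leaves \(X\) uniform over three values, so \[\begin{eqnarray*} \operatorname{U}_{ \theta}(Y) & = & \operatorname{H}(X|Y) \;=\; \log_2 3 \\ \operatorname{V}_{ \theta}(X) & = & \operatorname{H}(X) \;=\; \log_2 6 \\ \end{eqnarray*}\]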

References

1. Shannon CE, Weaver W (1998) The mathematical theory of communication. Univ. of Illinois Press, Urbana

2. Ellerman EC Entropy scope of relevance. http://castedo.com/osa/129/

3. Ash RB, Doléans-Dade C (2000) Probability and measure theory, 2nd ed. Harcourt/Academic Press, San Diego