# 13.1. Categorical Cross Entropy (CCE)

## 13.1.1. Example Architecture

This figure shows a generic classification network and how CCE is likely to be used within it.

## 13.1.2. Graph

The graph shows categorical cross entropy, where green is the target value, orange is the predicted value, red is the output of CCE, and blue is the local gradient of CCE.

See https://www.desmos.com/calculator/q2dwniwjsp for an interactive version.

## 13.1.3. API

Categorical Cross Entropy (CCE) as a node abstraction.

`class fhez.nn.loss.cce.CCE`

Categorical cross entropy for multi-class classification.

This is also known as softmax loss, since it is mostly used with the softmax activation function.

Not to be confused with binary cross-entropy / log loss, which is for multi-label classification and is instead used with the sigmoid activation function.
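
To make the softmax pairing concrete, here is a minimal NumPy sketch (illustrative only, not the fhez implementation) of softmax probabilities being scored by CCE:

```python
import numpy as np

logits = np.array([1.0, 3.0, 0.5])   # raw network outputs
y = np.array([0.0, 1.0, 0.0])        # one-hot target

# Softmax turns logits into a probability distribution (sums to 1).
p_hat = np.exp(logits - logits.max())
p_hat = p_hat / p_hat.sum()

# CCE then penalises low predicted probability on the true class.
loss = -np.sum(y * np.log(p_hat))
print(loss)  # approx 0.197 here
```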

`backward(gradient)`

Calculate the gradient of the loss with respect to the predicted distribution $\hat{p(y)}$:

$\frac{d\textit{CCE}(\hat{p(y)})}{d\hat{p(y_i)}} = \frac{-1}{\hat{p(y_i)}}p(y_i)$
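
A minimal sketch of this local gradient in NumPy (a hypothetical helper, not the library source), with $p(y_i)$ as the one-hot target and $\hat{p(y_i)}$ as the prediction:

```python
import numpy as np

def cce_gradient(y: np.ndarray, y_hat: np.ndarray) -> np.ndarray:
    # d(CCE)/d(y_hat_i) = -y_i / y_hat_i, element-wise
    return -y / y_hat

y = np.array([0.0, 1.0, 0.0])
y_hat = np.array([0.1, 0.7, 0.2])
print(cce_gradient(y, y_hat))  # [-0. -1.42857143 -0.]
```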
`property cost`

Get the cost of this node, which is 0 since the loss is calculated in plaintext.

`forward(signal=None, y: Optional[numpy.ndarray] = None, y_hat: Optional[numpy.ndarray] = None, check=False)`

Calculate cross entropy and save its state for backprop.

It can either be given a network `signal` with both `y_hat` and `y` stacked, or `y` and `y_hat` can be passed explicitly.
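
A usage sketch under the signature above, assuming `fhez` is installed and that `forward` returns the scalar loss:

```python
import numpy as np
from fhez.nn.loss.cce import CCE

cce = CCE()
y = np.array([0.0, 1.0, 0.0])       # one-hot ground truth
y_hat = np.array([0.1, 0.7, 0.2])   # e.g. a softmax output

# Explicit form; internally the inputs are cached for backprop.
loss = cce.forward(y=y, y_hat=y_hat)
print(loss)  # -ln(0.7), approx 0.357
```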

`loss(y: numpy.ndarray, y_hat: numpy.ndarray)`

Calculate the categorical cross entropy statelessly.

$CCE(\hat{p(y)}) = -\sum_{i=0}^{C-1} p(y_i) \log_e(\hat{p(y_i)})$

where:

$$\begin{aligned}
\sum_{i=0}^{C-1} \hat{p(y_i)} &= 1 \\
\sum_{i=0}^{C-1} p(y_i) &= 1
\end{aligned}$$
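
As a cross-check of the formula, a stateless NumPy sketch (hypothetical, not the library's `loss`):

```python
import numpy as np

def categorical_cross_entropy(y: np.ndarray, y_hat: np.ndarray) -> float:
    # CCE = -sum_i p(y_i) * ln(p_hat(y_i)); both inputs sum to 1
    return float(-np.sum(y * np.log(y_hat)))

print(categorical_cross_entropy(
    np.array([0.0, 1.0, 0.0]),
    np.array([0.1, 0.7, 0.2]),
))  # approx 0.3567, i.e. -ln(0.7)
```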
`fhez.nn.loss.cce.CategoricalCrossEntropy`

Alias of `fhez.nn.loss.cce.CCE`.