13.1. Categorical Cross Entropy (CCE)
13.1.1. Example Architecture
This figure shows a generic classification network and where the CCE is likely to be used.
13.1.2. Graph
The graph here shows categorical cross entropy plotted on the x and y axes, where green is the target value, orange is the predicted value, red is the output of CCE, and blue is the local gradient of CCE.
See https://www.desmos.com/calculator/q2dwniwjsp for an interactive version.
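Below is a minimal matplotlib sketch (not part of the library) that reproduces the red and blue curves of this graph for a single class, assuming the target probability is fixed at 1:

```python
import numpy as np
import matplotlib.pyplot as plt

# Single-class slice of CCE with the target probability fixed at 1.
y = 1.0                                 # target p(y_i) (green in the Desmos graph)
y_hat = np.linspace(0.01, 1.0, 200)     # predicted probability (orange)

cce = -y * np.log(y_hat)                # CCE output (red)
grad = -y / y_hat                       # local gradient of CCE (blue)

plt.plot(y_hat, cce, "r", label="CCE output")
plt.plot(y_hat, grad, "b", label="local gradient")
plt.xlabel("predicted probability")
plt.legend()
plt.show()
```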
13.1.3. API
Categorical Cross Entropy (CCE) as a node abstraction.
- class fhez.nn.loss.cce.CCE
Categorical cross entropy for multi-class classification.
This is also known as softmax loss, since it is most often used with the softmax activation function.
It is not to be confused with binary cross entropy (log loss), which is for multi-label classification and is instead used with the sigmoid activation function.
CCE Graph: https://www.desmos.com/calculator/q2dwniwjsp
- backward(gradient: numpy.ndarray)
Calculate gradient of loss with respect to \(\hat{y}\).
\[\frac{d\textit{CCE}(\hat{p(y)})}{d\hat{p(y_i)}} = \frac{-1}{\hat{p(y_i)}}p(y_i)\]
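A minimal NumPy sketch of this gradient (illustrative only; it assumes y and y_hat were cached by a prior forward pass, and the names are not the library's internals):

```python
import numpy as np

def cce_backward(y: np.ndarray, y_hat: np.ndarray,
                 gradient: np.ndarray) -> np.ndarray:
    """Chain the local CCE gradient with the incoming gradient."""
    local_gradient = -y / y_hat       # -p(y_i) / \hat{p}(y_i) per class
    return local_gradient * gradient  # chain rule with upstream gradient
```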
- property cost
Get the cost of this node, which is 0 since the loss is calculated in plaintext.
- forward(signal=None, y: Optional[numpy.ndarray] = None, y_hat: Optional[numpy.ndarray] = None, check=False)
Calculate cross entropy and save its state for backpropagation.
It can either be given a network signal with both y_hat and y stacked together, or y and y_hat can be passed explicitly, as in the sketch below.
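A usage sketch based on the signature above (illustrative; it assumes forward returns the scalar loss when y and y_hat are given explicitly):

```python
import numpy as np
from fhez.nn.loss.cce import CCE

node = CCE()
y = np.array([0, 1, 0])                # one-hot target distribution p(y)
y_hat = np.array([0.1, 0.7, 0.2])      # predicted distribution, e.g. softmax output
loss = node.forward(y=y, y_hat=y_hat)  # caches y and y_hat for backward()
```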
- loss(y: numpy.ndarray, y_hat: numpy.ndarray)
Calculate the categorical cross entropy statelessly.
\[CCE(\hat{p(y)}) = -\sum_{i=0}^{C-1} p(y_i) \log_e(\hat{p(y_i)})\]

where:
\[ \begin{align}\begin{aligned}\sum_{i=0}^{C-1} \hat{p(y_i)} = 1\\\sum_{i=0}^{C-1} p(y_i) = 1\end{aligned}\end{align} \]
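A stateless NumPy sketch of this sum (illustrative, not the library source):

```python
import numpy as np

def cce_loss(y: np.ndarray, y_hat: np.ndarray) -> float:
    """Categorical cross entropy over C classes, statelessly."""
    return float(-np.sum(y * np.log(y_hat)))

# True class at index 1, predicted with probability 0.7:
y = np.array([0, 1, 0])
y_hat = np.array([0.1, 0.7, 0.2])
print(cce_loss(y, y_hat))  # -ln(0.7), approximately 0.357
```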
- fhez.nn.loss.cce.CategoricalCrossEntropy: alias of fhez.nn.loss.cce.CCE