Description
K-Means finds some centroids and assigns each data point to its nearest centroid:

```python
c = torch.argmin(torch.cdist(x, centroids)**2, dim=1)
```
Neither the assignment nor the underlying distance computation is expressed in terms of `torch.nn.Module`s, so zennit has no layer to attach LRP rules to. The assignment itself is a hard `argmin` decision, which also makes it unsuitable for gradient-based explanations as is.
Fixes
The paper *From Clustering to Cluster Explanation via Neural Networks* (2022) presents a method based on neuralization. It shows that the k-means cluster assignment can be exactly rewritten as a set of piecewise linear functions

$$f_c(x) = \min_{k\neq c}\lbrace s_k(x)\rbrace, \qquad s_k(x) = w_k^\top x + b_k, \qquad w_k = 2(\mu_c - \mu_k), \qquad b_k = \lVert\mu_k\rVert^2 - \lVert\mu_c\rVert^2,$$

and the original k-means cluster assignment can be recovered as

$$c = \arg\max_k\lbrace f_k(x)\rbrace.$$

The cluster discriminant $f_c$ is then an ordinary two-layer network (a linear layer followed by a min-pooling over the $K-1$ competitors), so it can be explained with LRP.
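As a quick sanity check of this rewriting (a minimal sketch, not code from the paper; all variable names are illustrative), the discriminants can be built directly from fitted centroids and compared against the original nearest-centroid assignment:

```python
import torch

torch.manual_seed(0)
K, D, N = 4, 8, 16
centroids = torch.randn(K, D)   # e.g. obtained from a fitted k-means model
x = torch.randn(N, D)

# original k-means assignment
c_kmeans = torch.argmin(torch.cdist(x, centroids) ** 2, dim=1)

# neuralized discriminants: f_c(x) = min_{k != c} (w_k^T x + b_k)
f = []
for c in range(K):
    others = [k for k in range(K) if k != c]
    w = 2 * (centroids[c] - centroids[others])                        # (K-1, D)
    b = (centroids[others] ** 2).sum(1) - (centroids[c] ** 2).sum()   # (K-1,)
    s = x @ w.T + b                                                   # (N, K-1)
    f.append(s.min(dim=1).values)                                     # (N,)
f = torch.stack(f, dim=1)                                             # (N, K)

# argmax over the discriminants recovers the original assignment
assert torch.equal(f.argmax(dim=1), c_kmeans)
```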
Additional Information
- There is no pairwise distance layer in PyTorch. Since zennit canonizers apply to `torch.nn.Module` modules, such a module would need to be added.
- Each discriminant $f_c$ has its own weight matrix $W\in\mathbb{R}^{(K-1)\times D}$. In principle, a neuralized k-means layer could be written as a layer

  ```python
  torch.nn.Sequential(
      torch.nn.Linear(in_features=D, out_features=K * (K - 1)),
      torch.nn.Unflatten(1, torch.Size([K, K - 1])),
  )
  ```

  or alternatively via `torch.einsum`, which I think is easier (see the sketch after this list). Either way, a special layer is needed for that.
- The paper suggests a specific LRP rule for the $\min_{k\neq c}\lbrace s_k\rbrace$ to make the explanations continuous. In practice, we can replace it by $\beta^{-1}{\rm LogMeanExp}_k\lbrace \beta\cdot s_k\rbrace$, which approximates the min for $\beta < 0$ but has continuous gradients. Gradient propagation through this layer recovers exactly the LRP rule from the paper, and it also provides explanation continuity for other gradient-based explanation methods. I would suggest implementing such a `LogMeanExpPool` layer (a sketch follows this list). An alternative could be to implement a `MinPool` layer with a custom LRP rule.
- A canonizer could be used to replace the distance/k-means layer by its neuralized counterpart. Alternatively, a composite might also work, but it seems a bit more complicated (where are the weights computed? how to set LRP rules for the linear/einsum layer?)
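A minimal sketch of what the einsum-based layer could look like. The class name `NeuralizedKMeans`, its constructor, and the `(K, K-1, D)` / `(K, K-1)` parameter layout are assumptions for illustration, not an existing zennit or PyTorch API; the weights follow the construction $w_k = 2(\mu_c-\mu_k)$, $b_k = \lVert\mu_k\rVert^2-\lVert\mu_c\rVert^2$ from the Fixes section above.

```python
import torch


class NeuralizedKMeans(torch.nn.Module):
    """Sketch of a neuralized k-means layer (hypothetical name and API).

    Holds one (K-1, D) weight matrix and one (K-1,) bias per cluster and
    computes s[n, c, k] = w[c, k] @ x[n] + b[c, k] via torch.einsum.
    """

    def __init__(self, weight, bias):
        super().__init__()
        self.weight = torch.nn.Parameter(weight)  # shape (K, K-1, D)
        self.bias = torch.nn.Parameter(bias)      # shape (K, K-1)

    def forward(self, x):
        # contract over the feature dimension; output shape (N, K, K-1)
        return torch.einsum('nd,ckd->nck', x, self.weight) + self.bias

    @classmethod
    def from_centroids(cls, centroids):
        # build w_ck = 2 (mu_c - mu_k) and b_ck = |mu_k|^2 - |mu_c|^2 for k != c
        K, D = centroids.shape
        weight, bias = [], []
        for c in range(K):
            others = [k for k in range(K) if k != c]
            weight.append(2 * (centroids[c] - centroids[others]))
            bias.append((centroids[others] ** 2).sum(1) - (centroids[c] ** 2).sum())
        return cls(torch.stack(weight), torch.stack(bias))
```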
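And a sketch of the suggested `LogMeanExpPool`, computing $\beta^{-1}{\rm LogMeanExp}_k\lbrace\beta\cdot s_k\rbrace$ over one dimension; the class name and the default $\beta$ are again assumptions for illustration:

```python
import math

import torch


class LogMeanExpPool(torch.nn.Module):
    """Sketch of a LogMeanExpPool layer (hypothetical name and API).

    Computes beta^{-1} * log(mean_k(exp(beta * s_k))) over `dim`, which
    approaches min_k(s_k) for beta -> -inf but has continuous gradients.
    """

    def __init__(self, beta=-10.0, dim=-1):
        super().__init__()
        self.beta = beta
        self.dim = dim

    def forward(self, x):
        n = x.shape[self.dim]
        # log-mean-exp = logsumexp - log(n)
        return (torch.logsumexp(self.beta * x, dim=self.dim) - math.log(n)) / self.beta
```

Stacking `NeuralizedKMeans` and `LogMeanExpPool` would then compute $f_c(x)$ for all clusters at once, so that gradient or LRP propagation through the two layers can produce the explanations described in the paper.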