Description:
When running the code on non-graph datasets (e.g., hhar, usps from the repository), I encountered NaN values for r_loss. Upon inspecting the implementation, I noticed a possible discrepancy between the paper and the code.
Issue Details:
In the regularization loss r_loss defined in the paper, the divergence is computed between the distributions derived from AZ and Z. In the code, however, both KL-divergence terms use p_output (from AZ), instead of comparing p_output (from AZ) with q_output (from Z).
Current Code:
import torch.nn.functional as F

def r_loss(AZ, Z):
    loss = 0
    for i in range(2):
        for j in range(3):
            p_output = F.softmax(AZ[i][j], dim=1)
            q_output = F.softmax(Z[i][j], dim=1)
            log_mean_output = ((p_output + q_output) / 2).log()
            loss += (F.kl_div(log_mean_output, p_output, reduction='batchmean') +
                     F.kl_div(log_mean_output, p_output, reduction='batchmean')) / 2  # p_output used twice
    return loss
Proposed Fix:
The line:
loss += (F.kl_div(log_mean_output, p_output, ...) + F.kl_div(log_mean_output, p_output, ...)) / 2
should likely be:
loss += (F.kl_div(log_mean_output, p_output, ...) + F.kl_div(log_mean_output, q_output, ...)) / 2
so that the symmetric divergence between p_output and q_output is computed, as described in the paper.
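For reference, a minimal corrected sketch of the whole function. This assumes, as in the snippet above, that AZ and Z are 2x3 nested lists of logit tensors; variable names follow the issue text, not necessarily the repository's exact implementation.

import torch.nn.functional as F

def r_loss(AZ, Z):
    # Symmetric, JS-style regularization between propagated (AZ) and raw (Z) representations.
    loss = 0
    for i in range(2):
        for j in range(3):
            p_output = F.softmax(AZ[i][j], dim=1)   # distribution derived from AZ
            q_output = F.softmax(Z[i][j], dim=1)    # distribution derived from Z
            log_mean_output = ((p_output + q_output) / 2).log()
            # F.kl_div(log_m, p) computes KL(p || m), so this averages KL(p || m) and KL(q || m)
            loss += (F.kl_div(log_mean_output, p_output, reduction='batchmean') +
                     F.kl_div(log_mean_output, q_output, reduction='batchmean')) / 2
    return loss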
Steps to Reproduce:
- Run training on non-graph datasets (e.g., hhar, usps).
- Observe NaN values for r_loss during training.
Expected Behavior:
r_loss should compute the divergence between the distributions derived from AZ and Z without numerical instability.
Actual Behavior:
NaN values are produced, likely due to incorrect KL divergence terms.
Additional Notes:
After correcting the KL divergence terms to use p_output and q_output separately, the code runs without NaN issues.
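A quick standalone check of the corrected term can be run like this (the tensor shapes are hypothetical, chosen only for illustration):

import torch

# hypothetical shapes: 2 branches x 3 layers of (batch=8, dim=5) logits
AZ = [[torch.randn(8, 5) for _ in range(3)] for _ in range(2)]
Z = [[torch.randn(8, 5) for _ in range(3)] for _ in range(2)]

loss = r_loss(AZ, Z)          # corrected r_loss from the sketch above
print(loss)                   # finite, non-negative scalar
assert torch.isfinite(loss)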
Environment:
- Repository: https://github.com/yueliu1999/Awesome-Deep-Graph-Clustering
- Datasets: hhar, usps
Thank you for your time and assistance!