Weight sharing #62

@omg777

Hi,
Could you explain why the gradients are multiplied by 0.5?

    # Average the gradients
    for p in D_shared.parameters():
        p.grad.data = 0.5 * p.grad.data
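
One possible reading, going only by the "Average the gradients" comment: if `D_shared` receives gradients from two separate losses, PyTorch adds each `backward()` result into the same `.grad` buffer, so the buffer holds a sum and the 0.5 factor turns it into a mean. The sketch below mimics that with plain Python arithmetic. The scalar weight `w`, the helper `grad_sq_error`, and both toy losses are hypothetical and are not taken from the repository.

```python
# Hypothetical toy setup: one shared scalar weight `w` feeds two losses,
#   loss_a = (w * x_a - y_a) ** 2
#   loss_b = (w * x_b - y_b) ** 2
# This only imitates gradient accumulation; it is not the repository's code.

def grad_sq_error(w, x, y):
    """Derivative of (w * x - y) ** 2 with respect to w."""
    return 2.0 * (w * x - y) * x

w = 1.0
grad_a = grad_sq_error(w, x=2.0, y=1.0)  # contribution from the first loss
grad_b = grad_sq_error(w, x=3.0, y=0.0)  # contribution from the second loss

# Two backward passes into the same parameter add up their gradients.
accumulated = grad_a + grad_b

# Scaling by 0.5 converts the sum of two gradients into their average,
# which keeps the shared weights' step size comparable to unshared ones.
averaged = 0.5 * accumulated

print(accumulated, averaged)
```

Whether this matches the author's intent depends on how many losses actually flow through `D_shared`, which the snippet above does not show.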
