Why use a batch size of 1 when computing the gradients? #42

Open
@Leon-0602

Thank you for proposing LESS; it is great work.

However, I am wondering why the authors use a batch size of 1 when computing the gradients in get_info.py. What would happen if we set batch_size > 1?
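
To make the question concrete, here is a minimal sketch of my current understanding. It uses a toy linear model in plain Python, not the code in get_info.py, and the helper names are hypothetical. My assumption is that with batch_size > 1 and a mean-reduced loss, a single backward pass yields only the average of the per-example gradients, so the individual gradients are no longer available separately.

```python
# Toy model: loss_i = 0.5 * (w . x_i - y_i)^2
# Hypothetical helpers for illustration only; not taken from get_info.py.

def per_sample_grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w for one example."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

def batch_grad(w, xs, ys):
    """Gradient of the mean loss over a batch.

    By linearity this equals the average of the per-sample gradients,
    which is what one backward pass on a mean-reduced loss would give.
    """
    grads = [per_sample_grad(w, x, y) for x, y in zip(xs, ys)]
    n = len(grads)
    return [sum(g[j] for g in grads) / n for j in range(len(w))]

w = [1.0, -1.0]
xs = [[1.0, 0.0], [0.0, 2.0]]
ys = [0.0, 0.0]

# batch_size = 1: one gradient vector per example
individual = [per_sample_grad(w, x, y) for x, y in zip(xs, ys)]

# batch_size = 2: a single averaged vector; the two gradients above are mixed
averaged = batch_grad(w, xs, ys)

print(individual)
print(averaged)
```

If this is the reason, I would also like to know whether per-example gradients could still be obtained with a larger batch in some other way.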

Thanks in advance for your reply.
