Geometric Parametrization for Inf-CLIP #4

Open
@zer0int

Dear researchers, thank you very much for your paper & code!

I am keen to hear your thoughts on implementing Geometric Parametrization (GmP) with Inf-CLIP.
I have previously implemented GmP for 'classic' CLIP fine-tuning. In a nutshell:

GmP CLIP MLP:

(mlp): Sequential(
  (c_fc): GeometricLinear()
  (gelu): QuickGELU()
  (c_proj): GeometricLinear()
)

Parameters of c_fc:
  visual.transformer.resblocks.0.mlp.c_fc.r
  visual.transformer.resblocks.0.mlp.c_fc.theta
  visual.transformer.resblocks.0.mlp.c_fc.bias

Parameters of c_proj:
  visual.transformer.resblocks.0.mlp.c_proj.r
  visual.transformer.resblocks.0.mlp.c_proj.theta
  visual.transformer.resblocks.0.mlp.c_proj.bias

(Same for [text] transformer.resblocks)
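To illustrate the arithmetic behind the `r` / `theta` parameters listed above, here is a minimal pure-Python sketch. It assumes that `r` holds the norm of each weight row and `theta` holds the row's direction, so that the effective weight is `r * theta / ||theta||`. The function names `to_gmp`, `from_gmp` and `geometric_linear` are hypothetical and are not taken from the fine-tuning repository; the actual `GeometricLinear` module may differ in detail.

```python
import math


def to_gmp(weight):
    """Split each row of a weight matrix into a norm (r) and a direction (theta).

    Assumes every row is non-zero; a zero row would divide by zero.
    """
    r, theta = [], []
    for row in weight:
        norm = math.sqrt(sum(w * w for w in row))
        r.append(norm)
        theta.append([w / norm for w in row])
    return r, theta


def from_gmp(r, theta):
    """Rebuild the effective weight: each row is r * theta / ||theta||."""
    rebuilt = []
    for radius, direction in zip(r, theta):
        norm = math.sqrt(sum(t * t for t in direction))
        rebuilt.append([radius * t / norm for t in direction])
    return rebuilt


def geometric_linear(x, r, theta, bias):
    """Apply a linear layer whose weight is stored as (r, theta)."""
    weight = from_gmp(r, theta)
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


if __name__ == "__main__":
    w = [[3.0, 4.0], [0.0, 2.0]]
    r, theta = to_gmp(w)
    print(r)      # row norms
    print(theta)  # unit-length row directions
    print(geometric_linear([1.0, 1.0], r, theta, [0.5, -1.0]))
```

Because `theta` is re-normalized in `from_gmp`, rescaling `theta` leaves the effective weight unchanged; only `r` controls the magnitude of each row.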

I was able to achieve a marked improvement over pre-trained OpenAI CLIP ViT-L/14 with this technique (dataset: COCO-SPRIGHT-40k). The model was fine-tuned on 1x RTX 4090 with a batch size of 40 (!).

[Image: GmP-results]

Evals:
github.com/LAION-AI/CLIP_benchmark
objectnet.dev/mvt/

Code to reproduce results / fine-tune:
github.com/zer0int/CLIP-fine-tune

Models (dataset is linked) + further results (retrieval, multimodal gap):
huggingface.co/zer0int/CLIP-GmP-ViT-L-14

CLIP GmP was inspired by this paper:
ReLU Characteristic Activation Analysis


I have forked your Inf-CLIP and provided an initial implementation of GmP:
https://github.com/zer0int/Inf-CLIP

I am unable to test it myself because I am 'GPU-poor' (see above). However, I would be curious to see whether GmP provides additional benefits for Inf-CLIP or, conversely, whether combining GmP with Inf-CLIP causes problems.

Also, I am in the process of further modifying your code to implement a "fake" distributed backend that computes the 'tiles' sequentially on a single GPU. Any tips (from anybody who happens to read this) on handling the data exchange, which would inevitably involve the CPU, are welcome. Again, thank you for your work!
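For anyone thinking along about the sequential 'tiles' idea, here is a minimal pure-Python sketch of one possible approach: the row-wise log-sum-exp of the image-text similarity matrix is accumulated one column tile at a time, keeping only a running maximum and a running sum per row. This is an illustration under my own assumptions and is not the Inf-CLIP implementation; `tiled_row_logsumexp` and its parameters are hypothetical names, and the features are plain lists rather than GPU tensors.

```python
import math


def tiled_row_logsumexp(image_feats, text_feats, tile_size):
    """Row-wise log-sum-exp of image_feats @ text_feats^T, one column tile at a time.

    Only a running max and a running (rescaled) sum are kept per row,
    so the full similarity matrix is never materialized.
    """
    n = len(image_feats)
    running_max = [-math.inf] * n
    running_sum = [0.0] * n

    for start in range(0, len(text_feats), tile_size):
        tile = text_feats[start:start + tile_size]  # one tile of text features
        for i, img in enumerate(image_feats):
            logits = [sum(a * b for a, b in zip(img, txt)) for txt in tile]
            new_max = max(running_max[i], max(logits))
            # Rescale the old sum to the new max, then add this tile's terms.
            running_sum[i] = (
                running_sum[i] * math.exp(running_max[i] - new_max)
                + sum(math.exp(l - new_max) for l in logits)
            )
            running_max[i] = new_max

    return [m + math.log(s) for m, s in zip(running_max, running_sum)]


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.5, 0.5]]
    texts = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]
    print(tiled_row_logsumexp(images, texts, tile_size=2))
```

In a single-GPU setting, each tile could in principle be loaded from CPU memory in turn while only the per-row running statistics stay resident; whether that transfer pattern is fast enough in practice is an open question here.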
