
Conversation

@iejMac (Contributor) commented Mar 15, 2023

No description provided.

@iejMac (Contributor Author) commented Mar 15, 2023

OK, the current state is a bare-minimum version to get things roughly working. That means:

  • We tokenize the image using VQGAN (I think this is correct, but I still need to write some decoding code to verify the tokenization)
  • We create an image decoder transformer, just like the text decoder transformer, and predict the next image token autoregressively
  • We calculate the loss (a rough end-to-end sketch follows this list)
  • The code is as decent as I could make it in one sitting; it still needs improvement
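
Not the PR's actual code, but here is a minimal, self-contained sketch of the flow described above, with a toy VQ quantizer standing in for the pretrained VQGAN; all module names, shapes, and hyperparameters are placeholders:

```python
# Illustrative sketch only: toy VQ tokenizer standing in for a pretrained VQGAN,
# plus a decoder-only transformer predicting the next image token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyVQTokenizer(nn.Module):
    """Stand-in for a VQGAN encoder: conv downsample + nearest-codebook lookup."""
    def __init__(self, codebook_size=1024, dim=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, 4, stride=4), nn.GELU(),
            nn.Conv2d(dim, dim, 4, stride=4),
        )  # 224x224 image -> 14x14 grid of latents
        self.codebook = nn.Embedding(codebook_size, dim)

    @torch.no_grad()
    def forward(self, images):                               # (B, 3, H, W)
        z = self.encoder(images).flatten(2).transpose(1, 2)  # (B, N, D)
        codes = self.codebook.weight.unsqueeze(0).expand(z.size(0), -1, -1)
        return torch.cdist(z, codes).argmin(dim=-1)          # (B, N) token ids

class ImageTokenDecoder(nn.Module):
    """Decoder-only transformer over image tokens, mirroring the text decoder."""
    def __init__(self, codebook_size=1024, dim=512, depth=6, heads=8, max_len=256):
        super().__init__()
        self.start_id = codebook_size                        # extra start_of_image id
        self.tok_emb = nn.Embedding(codebook_size + 1, dim)
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)
        self.to_logits = nn.Linear(dim, codebook_size)

    def forward(self, token_ids):                            # (B, N) target tokens
        b, n = token_ids.shape
        start = torch.full((b, 1), self.start_id, device=token_ids.device)
        x = torch.cat([start, token_ids[:, :-1]], dim=1)     # shift right by one
        x = self.tok_emb(x) + self.pos_emb[:, :n]
        mask = torch.full((n, n), float("-inf"), device=x.device).triu(1)
        h = self.blocks(x, mask=mask)                        # causal self-attention
        return self.to_logits(h)                             # (B, N, codebook_size)

tokenizer, decoder = ToyVQTokenizer(), ImageTokenDecoder()
images = torch.randn(2, 3, 224, 224)
targets = tokenizer(images)                                  # (2, 196) image token ids
logits = decoder(targets)
loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
```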

@iejMac marked this pull request as draft March 17, 2023 03:24
@iejMac (Contributor Author) commented Mar 18, 2023

@iejMac (Contributor Author) commented Mar 18, 2023

TODO:

Code:

  • CoCa generation code should be modality-agnostic - it should be able to generate images and text based on the shape (or parameters) of the input (a rough sketch follows this list)
  • create a start_of_image token !!!
  • BIG cleanup. Can we avoid adding dependencies on taming-transformers and omegaconf?
  • Config cleanup + update old CoCa configs
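
Not from the PR, but a rough sketch of what a modality-agnostic generate() could look like, where the start token id and target length decide whether image or text tokens get sampled (all names here are hypothetical):

```python
# Hypothetical sketch: one autoregressive sampling loop shared by both
# modalities; the start token id and sequence length select what gets generated.
# Assumes `decoder(tokens)` returns next-token logits of shape (B, N, vocab).
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(decoder, start_token_id, seq_len, batch_size=1, temperature=1.0, device="cpu"):
    tokens = torch.full((batch_size, 1), start_token_id, dtype=torch.long, device=device)
    for _ in range(seq_len):
        logits = decoder(tokens)[:, -1] / temperature         # logits for the next position
        probs = F.softmax(logits, dim=-1)
        next_tok = torch.multinomial(probs, num_samples=1)    # sample one token per row
        tokens = torch.cat([tokens, next_tok], dim=1)
    return tokens[:, 1:]                                      # drop the start token

# image_tokens = generate(image_decoder, start_of_image_id, seq_len=196)
# text_tokens  = generate(text_decoder,  start_of_text_id,  seq_len=76)
```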

Model:

  • Train something at B/32 scale
  • drop out the text conditioning 10% of the time, as suggested by Katherine (either put nothing in cross-attention or use a learned sequence); a small sketch follows this list
  • axial positional embeddings, as suggested by lucidrains
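
A small sketch of the conditioning-dropout idea mentioned above (the module interface and names are illustrative, not the PR's code): with some probability per sample, the text embeddings fed to cross-attention are swapped for a learned null sequence.

```python
# Illustrative sketch of 10% text-conditioning dropout: per sample, replace the
# cross-attention context with a learned "null" sequence during training.
import torch
import torch.nn as nn

class ConditioningDropout(nn.Module):
    def __init__(self, context_len, dim, drop_prob=0.1):
        super().__init__()
        self.drop_prob = drop_prob
        self.null_context = nn.Parameter(torch.zeros(1, context_len, dim))

    def forward(self, text_context):                 # (B, L, D) text embeddings
        if not self.training or self.drop_prob == 0:
            return text_context
        b = text_context.size(0)
        drop = torch.rand(b, device=text_context.device) < self.drop_prob
        null = self.null_context.expand(b, -1, -1)
        return torch.where(drop[:, None, None], null, text_context)
```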

@iejMac (Contributor Author) commented Apr 2, 2023

https://arxiv.org/abs/2303.13455
