CPT Tuner #2168

tsachiblau · 2024-10-22T10:00:53Z

Hey,

This pull request introduces Context-Aware Prompt Tuning (CPT), a new and effective technique that builds on In-Context Learning (ICL) and Prompt Tuning (PT) with enhancements through adversarial optimization. CPT allows for better generalization and stability on various classification tasks.

The approach is based on a research paper, which will soon be available. The core idea of CPT is demonstrated and implemented in the following repository:
https://github.com/tsachiblau/CPT.

We are submitting this pull request to integrate the CPT method into the PEFT library, allowing users to experiment with this novel method. Thank you for reviewing this contribution!

The paper is attached
Context_aware_Prompt_Tuning__Advancing_In_Context_Learning_with_Adversarial_Methods_PEFT.pdf

Thanks,
Tsachi

BenjaminBossan · 2024-10-22T14:39:45Z

Hi, thanks for creating this PR to add this new method to PEFT.

I did not have time yet to do a review, but I wanted to alert you that with the provided information, you could be de-anonymized as paper author. Not sure if that's a big deal for the submission process, but just wanted to let you know.

tsachiblau · 2024-10-22T16:00:55Z

I plan to upload it to arXiv anyway, so that works for me. Thanks for letting me know.

BenjaminBossan

Thanks for this PR to add the Context-aware Prompt Tuning method to PEFT.

I plan to upload it to arXiv anyway, so that works for me. Thanks for letting me know.

It's now there, right: https://arxiv.org/abs/2410.17222? Let's add a link to the paper into the docstring of the config class for reference.

Something small I noticed:

VERA (Kopiczko et al., 2023) builds on LoRA by incorporating adaptive learning rates

I think this is not an accurate characterization of VeRA. Did you maybe mean to reference LoRA+ instead?

I reviewed your method and added a couple of comments, please check them out. On top of these, I have a more general question for my understanding: According to the paper, part of the reason why this method works will relates to the changes in the loss calculation and how the parameters are updated. Is this fully covered by the CPT implementation here or would users need to consider something in addition when defining their training loop?

Regarding the testing, thanks for including a few functional tests. Let's also add CPT to the general testing framework, similar to how we do it for prompt tuning:

peft/tests/testing_common.py

Line 129 in fb6108a

"prompt_tuning": (PromptTuningConfig, CONFIG_TESTING_KWARGS[4]),

However, IIUC, it only works for decoder models (causal LM), right? That means we need to create a separate PeftTestConfigManager = ClassInstantier(CLASSES_MAPPING) instance that uses CLASSES_MAPPING with CPT added on top. This instance should then be used in test_decoder_models.py.

Finally, before merging this, we also need to add some docs and at least one example. However, we can work on those in a later iteration and iron out the implementation first.

src/peft/peft_model.py

BenjaminBossan · 2024-10-24T12:37:11Z

src/peft/tuners/cpt/__init__.py

@@ -0,0 +1,20 @@
+# Copyright 2023-present the HuggingFace Inc. team.


Suggested change

# Copyright 2023-present the HuggingFace Inc. team.

# Copyright 2024-present the HuggingFace Inc. team.

src/peft/tuners/cpt/config.py

BenjaminBossan · 2024-10-24T12:39:28Z

src/peft/tuners/cpt/config.py

+from peft.utils import PeftType
+
+
+class PromptTuningInit(str, enum.Enum):


This is identical to the PromptTuningInit class from prompt_tuning/config.py, right? I wonder if we should give it a different name to avoid confusion. If you think the options will always be the same, we can also import that class here instead.

BenjaminBossan · 2024-10-24T12:40:14Z

src/peft/tuners/cpt/config.py

+    """
+
+    # Token-related configurations
+    CPT_token_ids: Optional[torch.Tensor] = field(


For variable names, let's avoid using capitalization. So this variable should be cpt_token_ids, same with all the variables below.

Also, since this and the next arguments should not be None, I think it makes more sense to remove the default=None and to remove the Optional type annotation. Then we don't need to check that in check_config.

Finally, list[int] would also be valid here, right?

BenjaminBossan · 2024-10-24T13:15:29Z

tests/CPT_test.py

+
+    trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset, data_collator=collator)
+
+    try:


No need to catch the exception, just let it fail.

BenjaminBossan · 2024-10-24T13:16:05Z

tests/CPT_test.py

+    except Exception as e:
+        pytest.fail(f"Training failed with error: {e}")
+
+    assert torch.all(model.prompt_encoder.default.embedding.weight.data.clone().detach().cpu() == emb.cpu())


For my understanding, this is to test that these embeddings are frozen? Let's add a comment.

BenjaminBossan · 2024-10-24T13:16:49Z

tests/CPT_test.py

+    assert torch.all(norm_delta <= epsilon)
+
+
+def test_model_training_text(sst_data, global_tokenizer, collator, config_text):


Similar comments to the test above.

BenjaminBossan · 2024-10-24T13:18:18Z

tests/CPT_test.py

+    assert torch.all((norm_delta == 0) == (~non_label_idx))
+
+
+def test_model_batch_training_text(sst_data, global_tokenizer, collator, config_text):


Same argument as for the tests above. Also, what exactly is different in this test, just the batch size? Why is it important to test batch size 1 and 2?

BenjaminBossan · 2024-10-24T13:24:38Z

tests/CPT_test.py

Let's rename this to test_cpt.py

tsachiblau · 2024-10-25T15:39:43Z

Thank you for the constructive feedback! 🙂

Something small I noticed:

Sorry, I can’t find where I mentioned it. Could you please point me to it?

Is this fully covered by the CPT implementation here or would users need to consider something in addition when defining their training loop?

Yes, it is fully covered by this implementation, including both loss and projection. Full details are available in the demo (https://github.com/tsachiblau/CPT/tree/main/notebooks) and will also be added to this repository to make it easier to use.

However, IIUC, it only works for decoder models (causal LM), right? That means we need to create a separate PeftTestConfigManager = ClassInstantier(CLASSES_MAPPING) instance that uses CLASSES_MAPPING with CPT added on top. This instance should then be used in test_decoder_models.py

Yes, we only support causal LMs. Adding test_CPT to testing_common.py causes errors because the configuration is not initialized correctly, and I’m unsure how to address this. Could you clarify how to correctly include my tests?

we can work on those in a later iteration and iron out the implementation first.

Let’s handle it later.

Also, since this and the next arguments should not be None, I think it makes more sense to remove the default=None and to remove the Optional type annotation. Then we don't need to check that in check_config.

If the user choose RANDOM option then this values should be None.

As for the remaining comments, I’ve addressed them all and pushed the changes to the branch.

… created _cpt_forward for readability, updated copyright to 2024, renamed class to CPTPromptInit, changed config variables to lowercase and list[int], removed exception catch from tests, added assertion docs, removed batch_size=1 test, and renamed test file to test_cpt.py.

BenjaminBossan

Thanks for all the updates to the PR.

Something small I noticed:

Sorry, I can’t find where I mentioned it. Could you please point me to it?

This is in section 2 of the paper, 2nd paragraph starting with "Efficient Fine-Tuning".

Yes, it is fully covered by this implementation, including both loss and projection.

Okay, nice.

Yes, we only support causal LMs. Adding test_CPT to testing_common.py causes errors because the configuration is not initialized correctly, and I’m unsure how to address this. Could you clarify how to correctly include my tests?

So I think the following approach should work. See this line:

peft/tests/testing_common.py

Line 202 in b3176ef

PeftTestConfigManager = ClassInstantier(CLASSES_MAPPING)

Let's create a new instance below called PeftTestConfigManagerForDecoderModels. You instantiate it the same, but with the class mapping extended to add CPT:

PeftTestConfigManagerForDecoderModels = ClassInstantier({**CLASSES_MAPPING, **DECODER_MODELS_EXTRA})

Of course, we need to define DECODER_MODELS_EXTRA, which should be:

DECODER_MODELS_EXTRA = {
    "cpt": (CPTConfig, CONFIG_TESTING_KWARGS[12])
}

You can add this after the definition of CLASSES_MAPPING:

peft/tests/testing_common.py

Line 124 in b3176ef

CLASSES_MAPPING = {

Next, inside of test_decoder_models.py, we have to use this new class:

- from .testing_common import PeftCommonTester, PeftTestConfigManager
+ from .testing_common import PeftCommonTester, PeftTestConfigManagerForDecoderModels as PeftTestConfigManager

Let me know if you have further questions.

As for the remaining comments, I’ve addressed them all and pushed the changes to the branch.

After all the changes have been made, make sure to also call make style to make the linter happy.

There are still a couple of unaddressed comments, please check again.

BenjaminBossan · 2024-10-28T11:13:14Z

src/peft/tuners/cpt/config.py

+    )
+
+    # Prompt tuning initialization method
+    cpt_prompt_tuning_init: Optional[str] = field(


I think this can be type annotated as CPTPromptInit.

BenjaminBossan · 2024-10-28T11:13:57Z

src/peft/tuners/cpt/config.py

+        """
+        self.peft_type = PeftType.CPT  # Specifies that the PEFT type is CPT.
+        self.target_modules = None  # Placeholder for target modules in CPT.
+        self.task_type = "CAUSAL_LM"  # Ensures task type is causal language modeling.


Also, since this and the next arguments should not be None, I think it makes more sense to remove the default=None and to remove the Optional type annotation. Then we don't need to check that in check_config.

If the user choose RANDOM option then this values should be None.

Okay, let's add a check here to ensure that the argument is correctly set.

I think this comment is still relevant.

What I mean is that a check should be performed here which checks what you mentioned in the quote above.

tsachiblau · 2024-10-30T20:47:53Z

This is in section 2 of the paper, 2nd paragraph starting with "Efficient Fine-Tuning".

Thanks, I will check it out.

I've made the other requested changes. :)

…lization in config. Renamed cpt_prompt_tuning_init to cpt_prompt_init. Changed the class from PeftConfig to PromptLearningConfig. model: Removed check_config function. peft_model: Fixed bugs. tests: Added PeftTestConfigManagerForDecoderModels in test_decoder_models.py and testing_common.py.

BenjaminBossan

Thanks for the updates, this PR is approaching the finish line. Please ensure to always run make style though, or else the linter will complain and tests won't run (I checked them locally and they passed).

Now, let's get back to what I mentioned earlier, namely adding some docs and an example. To give a recent example of a PR with docs and examples, check this one:

https://github.com/huggingface/peft/pull/2172/files

The example doesn't have to be this elaborate, but it should be something that users can easily adopt to their own uses cases. Maybe you can add something that resembles one of the experiments from the paper. That way, we can use the example to ensure that the experiment can be replicated with the PEFT implementation.

When writing the docs, put yourself in the shoes of a user who may not have read the paper and might be curious why they should consider this method.

BenjaminBossan · 2024-10-31T15:04:55Z

tests/testing_common.py

+                    list_names.append(name)
+                else:
+                    assert param.grad is None
+            ''


BenjaminBossan · 2024-10-31T15:05:29Z

tests/testing_common.py

@@ -1189,10 +1202,24 @@ def _test_training_prompt_learning_tasks(self, model_id, config_cls, config_kwar
        loss = output.sum()
        loss.backward()

+        if issubclass(config_cls, CPTConfig):
+            parameters = []
+            list_names = []


Why do we need list_names?

It is redundant

BenjaminBossan · 2024-10-31T15:06:05Z

tests/testing_common.py

+            parameters = []
+            list_names = []
+            for name, param in model.prompt_encoder.named_parameters():
+                if name not in ['default.embedding.weight']:


Suggested change

if name not in ['default.embedding.weight']:

if name != "default.embedding.weight":

BenjaminBossan · 2024-10-31T15:26:28Z

tests/test_cpt.py

+MODEL_NAME = "bigscience/bloom-1b7"
+MAX_INPUT_LENGTH = 1024


For unit testing, we should use much smaller models. Check what the other tests are using. One possibility would be "hf-internal-testing/tiny-random-OPTForCausalLM".

BenjaminBossan · 2024-10-31T15:27:24Z

tests/testing_common.py

            assert param.grad is not None

+


BenjaminBossan · 2024-10-31T15:34:09Z

tests/test_cpt.py

+
+def test_model_initialization_text(global_tokenizer, config_text):
+    """Test model loading and PEFT model initialization."""
+    base_model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, cache_dir=".", trust_remote_code=True)


Suggested change

base_model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, cache_dir=".", trust_remote_code=True)

base_model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

Same below

src/peft/peft_model.py

tsachiblau · 2024-11-03T11:47:17Z

I added the documentation, except for an example, as I'm unsure where to place it. I noticed that some models have examples in the /examples/ directory, but I can't find a way to access these examples from https://huggingface.co/docs/peft/.

…dded into _toctree.yml.

BenjaminBossan

Thanks for the recent updates. Please ensure to run make style so that the linter is happy.

except for an example, as I'm unsure where to place it. I noticed that some models have examples in the /examples/ directory, but I can't find a way to access these examples from https://huggingface.co/docs/peft/.

Yes, examples should go into the examples/ directory. For this method, examples/causal_language_modeling/ could be a good option.

I'm not sure why you want to access the examples from the docs. The docs can link to the example, but I don't understand what else you would like to achieve there.

BenjaminBossan · 2024-11-05T10:02:45Z

docs/source/package_reference/cpt.md

+
+[[autodoc]] tuners.cpt.config.CPTConfig
+
+## CPTModel


There is no CPTModel, only CPTEmbedding.

Can I add a link to an example in the cpt.md file? I didn't see any other methods linked to examples. If you have an example, it would be a great help.

Yes, I see no reason not to add a link there. Just be aware that the link would be invalid until the PR is merged.

BenjaminBossan · 2024-11-05T12:47:48Z

@tsachiblau Heads up, another PR was merged to PEFT which added a new method, resulting in a bunch of merge conflicts, but they should be easy to resolve. LMK if you have questions.

tsachiblau · 2024-11-05T19:16:33Z

Merge is done

tsachiblau · 2024-11-07T07:33:16Z

Anything else that needs to be done?

HuggingFaceDocBuilderDev · 2024-11-07T18:44:32Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan · 2024-11-07T18:55:20Z

@tsachiblau Could you please run make style?

tsachiblau · 2024-11-07T19:19:05Z

Done

BenjaminBossan · 2024-11-07T20:05:38Z

@tsachiblau The linter is still complaining. Do you have the right ruff version installed? It should be v0.6.9.

BenjaminBossan · 2024-11-07T20:39:46Z

@tsachiblau ruff has now passed successfully but doc-builder is complaining, can you try running it?

doc-builder style src/peft tests docs/source --max_len 119

tsachiblau · 2024-11-08T05:20:13Z

Done

BenjaminBossan · 2024-11-09T16:11:08Z

The doc builder is still complaining, can you successfully run make style locally?

tsachiblau · 2024-11-09T16:47:02Z

Yes, I get this message:

ruff check --fix src tests examples docs scripts docker
All checks passed!
ruff format src tests examples docs scripts docker
209 files left unchanged
doc-builder style src/peft tests docs/source --max_len 119

It seems to fail on the error handling that we implemented, such as

raise ValueError("CPT works only with causal LM models.")

BenjaminBossan · 2024-11-09T18:07:36Z

Ah yes, somehow I looked at an old log, now the linting indeed passes but the tests are failing. Currently I can't investigate why they're failing, but if the is_decoder attribute is not reliable, we unfortunately need to remove the check.

tsachiblau · 2024-11-10T18:34:49Z

Done again :)

BenjaminBossan · 2024-11-12T22:28:19Z

@tsachiblau A couple of tests are failing. Mostly that concerns a test with gradient chechkpointing that checks the existence of a gradient on the prompt encoder. Could you check if this is a false alarm and the test needs adapting or if something else is going on? There is also another failing test during initialization. You can check the logs of the CI for more details. Thanks.

tsachiblau · 2024-11-13T04:22:41Z

Lets try again

BenjaminBossan

Thanks for the updates. The tests are now passing (some flaky tests are failing, but we can ignore those for how). I did another check of the PR and found some smaller areas for improvement. Moreover, I saw that some of my previous comments are still unaddressed. If you disagree with a suggestion I made, just let me know, not everything needs to be changed, but if there is no reply I can't tell if you read it or not.

BenjaminBossan · 2024-11-13T14:35:46Z

src/peft/peft_model.py

@@ -1779,7 +1779,7 @@ def _cpt_forward(
            else:
                N_tokens = input_ids.shape[1]
            input_type_mask = torch.zeros((batch_size, N_tokens)).to(device)
-            input_type_mask[:, -1] = 4
+            input_type_mask[:, :] = 4


input_type_mask.fill_(4) would also work. Could you add a short comment on what "4" means here?

4 is the id for the tokens used for the loss calculation. I changed the code to
input_type_mask = torch.ones((batch_size, N_tokens)).to(device) * 4

tests/testing_common.py

BenjaminBossan · 2024-11-13T14:54:42Z

src/peft/tuners/cpt/config.py

+        """
+        self.peft_type = PeftType.CPT  # Specifies that the PEFT type is CPT.
+        self.target_modules = None  # Placeholder for target modules in CPT.
+        self.task_type = "CAUSAL_LM"  # Ensures task type is causal language modeling.


I think this comment is still relevant.

BenjaminBossan · 2024-11-13T14:56:26Z

src/peft/tuners/cpt/config.py

+    )
+
+    # Prompt tuning initialization method
+    cpt_prompt_init: Optional[str] = field(


Using Literal["TEXT", "RANDOM"] as type annotation would be a bit more precise.

I think this comment is still relevant.

It already exists in the code.

Using Literal["TEXT", "RANDOM"] as type annotation would be a bit more precise.

I changed it.

BenjaminBossan · 2024-11-13T14:56:46Z

src/peft/tuners/cpt/config.py

+    )
+
+    # Loss-related configurations
+    opt_weighted_loss_type: Optional[str] = field(


Still relevant

BenjaminBossan · 2024-11-13T14:57:17Z

src/peft/tuners/cpt/config.py

+    # Virtual token configurations
+    num_virtual_tokens: int = field(default=0, metadata={"help": "Number of virtual tokens used in the prompt."})
+
+    # CPT-specific static attributes


WDYT about this suggestion?

BenjaminBossan · 2024-11-13T14:58:12Z

src/peft/tuners/cpt/model.py

+
+        return epsilon
+
+    def projection(self):


tsachiblau · 2024-11-13T18:44:55Z

Still relevant

I think this comment is still relevant.

WDYT about this suggestion?

WDYT?

Can you please explain these points? I do not get what you suggest.

BenjaminBossan

Thanks for the latest updates.

Can you please explain these points? I do not get what you suggest.

I tried to clarify the open comments. I'm not sure if you can see the full context. If not, go to https://github.com/huggingface/peft/pull/2168/files and scroll down, you should see the full context of my comments.

BenjaminBossan · 2024-11-15T16:24:12Z

src/peft/tuners/cpt/config.py

+    )
+
+    # Loss-related configurations
+    opt_weighted_loss_type: Optional[str] = field(


My suggestion is to change the type annotation to Literal["none", "decay"].

BenjaminBossan · 2024-11-15T16:26:17Z

src/peft/tuners/cpt/config.py

+    )
+
+    # Virtual token configurations
+    num_virtual_tokens: int = field(default=0, metadata={"help": "Number of virtual tokens used in the prompt."})


I think having 0 as the default here makes little sense. WDYT about using a good default here, say, 10?

BenjaminBossan · 2024-11-15T16:26:58Z

src/peft/tuners/cpt/config.py

+    # Virtual token configurations
+    num_virtual_tokens: int = field(default=0, metadata={"help": "Number of virtual tokens used in the prompt."})
+
+    # CPT-specific static attributes


Does it ever make sense to let users pass these arguments? If not, I would remove them here and place them inside the __post_init__ method.

BenjaminBossan · 2024-11-15T16:27:59Z

src/peft/tuners/cpt/config.py

+        """
+        self.peft_type = PeftType.CPT  # Specifies that the PEFT type is CPT.
+        self.target_modules = None  # Placeholder for target modules in CPT.
+        self.task_type = "CAUSAL_LM"  # Ensures task type is causal language modeling.


What I mean is that a check should be performed here which checks what you mentioned in the quote above.

BenjaminBossan · 2024-11-15T16:28:26Z

src/peft/tuners/cpt/model.py

+        """
+        if self.config.CPT_prompt_tuning_init == PromptTuningInit.TEXT:
+            tensor_ICL_mask = torch.Tensor(self.config.CPT_tokens_type_mask).long()
+            mask_input_template = torch.remainder(tensor_ICL_mask, 4) == 1


Bumping this comment.

BenjaminBossan · 2024-11-15T16:29:33Z

src/peft/tuners/cpt/model.py

+
+        return epsilon
+
+    def projection(self):


My suggestion is to rename this method to get_projection. Then, in the last line, instead of self.delta_embedding.weight.data = new_embeddings_weights, just return new_embeddings_weights. It is then on the caller side that the delta_embeddings are updated.

BenjaminBossan · 2024-11-15T16:29:43Z

src/peft/tuners/cpt/model.py

+            base_model_output (ModelOutput): Output from the base model containing logits.
+            labels (torch.Tensor): Ground-truth labels for the input tokens.
+            CPT_type_mask (torch.Tensor): Token type mask used for filtering valid loss terms.
+            config (Namespace): Configuration object containing loss-related hyperparameters.


BenjaminBossan · 2024-11-15T16:29:53Z

src/peft/tuners/cpt/model.py

+            ModelOutput: The base model output with computed loss.
+        """
+
+        if config.opt_weighted_loss_type in ["decay"]:


BenjaminBossan · 2024-11-15T16:30:15Z

src/peft/tuners/cpt/model.py

+            # Compute the weighted mean loss
+            loss = (loss[shift_labels_bool] * shift_labels_weights[shift_labels_bool]).mean()
+            base_model_output.loss = loss
+        elif config.opt_weighted_loss_type not in ["none"]:


Why not if config.opt_weighted_loss_type == "none":?

tsachiblau and others added 2 commits October 22, 2024 10:57

added CPT model to peft

92b9e1a

Merge branch 'huggingface:main' into main

e54d380

BenjaminBossan requested changes Oct 24, 2024

View reviewed changes

tsachiblau added 2 commits October 24, 2024 19:21

Merge branch 'huggingface:main' into main

023f071

Merge branch 'huggingface:main' into main

54cddaf

tsachiblau and others added 2 commits October 25, 2024 18:35

Merge branch 'huggingface:main' into main

ba4b115

BenjaminBossan requested changes Oct 28, 2024

View reviewed changes

Merge branch 'huggingface:main' into main

f8c8317

tsachiblau added 2 commits October 30, 2024 22:41

Merge branch 'main' of https://github.com/tsachiblau/peft_CPT

b01b214

BenjaminBossan requested changes Oct 31, 2024

View reviewed changes

tsachiblau and others added 2 commits November 3, 2024 07:28

Merge branch 'huggingface:main' into main

6ed1723

tests: Updated test_cpt and testing_common as per the PR requirements.

77bb0b9

tsachiblau and others added 3 commits November 3, 2024 13:40

Created cpt.md in package_regerence. Updated the prompting.md file. a…

dbcdedf

…dded into _toctree.yml.

Merge branch 'huggingface:main' into main

f7138d4

verifying that the model is causal LM

0a5fb20

BenjaminBossan reviewed Nov 5, 2024

View reviewed changes

Changed CPTModel to CPTEmbedding

7206db5

merge with main branch

24b0af9

make style

81ffa09

tsachiblau added 2 commits November 7, 2024 21:56

make style

130ec76

make style

70067d8

make doc

9397314

tsachiblau closed this Nov 9, 2024

BenjaminBossan reopened this Nov 9, 2024

Merge branch 'huggingface:main' into main

249713c

Removed redundant checks

0a43473

tsachiblau added 2 commits November 13, 2024 06:21

Fixed errors

144f042

merge with peft

97449da

BenjaminBossan requested changes Nov 13, 2024

View reviewed changes

Minor code updates.

dacb400

BenjaminBossan requested changes Nov 15, 2024

View reviewed changes

		@@ -0,0 +1,20 @@
		# Copyright 2023-present the HuggingFace Inc. team.

	# Copyright 2023-present the HuggingFace Inc. team.
	# Copyright 2024-present the HuggingFace Inc. team.

		from peft.utils import PeftType


		class PromptTuningInit(str, enum.Enum):


		trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset, data_collator=collator)

		try:

		assert torch.all(norm_delta <= epsilon)


		def test_model_training_text(sst_data, global_tokenizer, collator, config_text):

		assert torch.all((norm_delta == 0) == (~non_label_idx))


		def test_model_batch_training_text(sst_data, global_tokenizer, collator, config_text):

	if name not in ['default.embedding.weight']:
	if name != "default.embedding.weight":

	base_model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, cache_dir=".", trust_remote_code=True)
	base_model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

		MODEL_NAME = "bigscience/bloom-1b7"
		MAX_INPUT_LENGTH = 1024

CPT Tuner #2168

Are you sure you want to change the base?

CPT Tuner #2168

Conversation

tsachiblau commented Oct 22, 2024

BenjaminBossan commented Oct 22, 2024

tsachiblau commented Oct 22, 2024

BenjaminBossan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau commented Oct 25, 2024 • edited Loading

BenjaminBossan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau commented Oct 30, 2024 • edited Loading

BenjaminBossan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau commented Nov 3, 2024

BenjaminBossan left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau Nov 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

BenjaminBossan commented Nov 5, 2024

tsachiblau commented Nov 5, 2024

tsachiblau commented Nov 7, 2024

HuggingFaceDocBuilderDev commented Nov 7, 2024

BenjaminBossan commented Nov 7, 2024

tsachiblau commented Nov 7, 2024

BenjaminBossan commented Nov 7, 2024

BenjaminBossan commented Nov 7, 2024

tsachiblau commented Nov 8, 2024

BenjaminBossan commented Nov 9, 2024

tsachiblau commented Nov 9, 2024 • edited Loading

BenjaminBossan commented Nov 9, 2024

tsachiblau commented Nov 10, 2024

BenjaminBossan commented Nov 12, 2024

tsachiblau commented Nov 13, 2024

BenjaminBossan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau commented Nov 13, 2024 • edited Loading

BenjaminBossan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsachiblau commented Oct 25, 2024 •

edited

Loading

tsachiblau commented Oct 30, 2024 •

edited

Loading

BenjaminBossan left a comment •

edited

Loading

tsachiblau Nov 5, 2024 •

edited

Loading

tsachiblau commented Nov 9, 2024 •

edited

Loading

tsachiblau Nov 13, 2024 •

edited

Loading

tsachiblau commented Nov 13, 2024 •

edited

Loading