
Clarification on Training Data and Prompt Optimization in AdaCLIP #35


Open
arifur-rahman-ar opened this issue Mar 12, 2025 · 0 comments

Comments

@arifur-rahman-ar

Thank you for sharing your work on AdaCLIP. I have gone through the paper and the repository, and I have a few points that need clarification:

  1. Training Data Details: The paper initially describes using auxiliary data for training, but later states that "the description of the utilized training set is not accurate." Could you clarify whether VisA & ClinicDB are used exclusively for evaluation, or whether they also play a role in model fine-tuning?

  2. Prompt Optimization: The concept of hybrid prompts (static and dynamic) is intriguing. However, are the dynamic prompts generated per image from predefined embeddings, or do they adapt iteratively during testing? Additionally, do the prompts remain consistent across similar anomalies in different domains? (A small sketch of what I mean by per-image dynamic prompts follows this list.)

  3. Performance Stability: Given that FP16 training can be unstable, have you observed significant variance in performance across multiple runs? Would incorporating additional stability mechanisms (e.g., gradient accumulation or mixed-precision loss scaling, as sketched below) improve robustness?
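To make the second question concrete, here is a minimal, hypothetical sketch of what I mean by combining static (shared, learnable) prompt tokens with dynamic tokens generated per image from its embedding. This is not AdaCLIP's actual code; the dimensions, the linear projection, and the concatenation are all my own assumptions:

```python
import torch
import torch.nn as nn

class HybridPrompt(nn.Module):
    """Illustrative only: static learnable prompt tokens plus
    dynamic tokens projected from the per-image embedding."""

    def __init__(self, embed_dim=768, prompt_len=4):
        super().__init__()
        # Static prompts: shared across all images, learned during training.
        self.static_prompts = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Dynamic prompts: generated per image from its image embedding.
        self.dynamic_proj = nn.Linear(embed_dim, prompt_len * embed_dim)
        self.prompt_len = prompt_len
        self.embed_dim = embed_dim

    def forward(self, image_embedding):  # image_embedding: (B, embed_dim)
        b = image_embedding.shape[0]
        dynamic = self.dynamic_proj(image_embedding).view(b, self.prompt_len, self.embed_dim)
        static = self.static_prompts.unsqueeze(0).expand(b, -1, -1)
        # My question is whether the dynamic part is computed once per test
        # image like this, or refined iteratively at test time.
        return torch.cat([static, dynamic], dim=1)  # (B, 2 * prompt_len, embed_dim)
```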
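For the third question, the kind of stability mechanism I had in mind is the standard PyTorch AMP pattern with gradient accumulation, sketched generically below. It is not tied to this repository; `model` is assumed to return the loss directly, and `accum_steps` is a hypothetical setting:

```python
import torch

def train_fp16(model, loader, optimizer, accum_steps=4, device="cuda"):
    """Generic mixed-precision loop with gradient accumulation (not AdaCLIP-specific)."""
    scaler = torch.cuda.amp.GradScaler()        # rescales the loss to avoid FP16 underflow
    model.train()
    optimizer.zero_grad(set_to_none=True)
    for step, (images, targets) in enumerate(loader):
        images, targets = images.to(device), targets.to(device)
        with torch.cuda.amp.autocast():         # forward pass in mixed precision
            loss = model(images, targets) / accum_steps
        scaler.scale(loss).backward()           # accumulate scaled gradients
        if (step + 1) % accum_steps == 0:
            scaler.step(optimizer)              # unscales grads; skips the step on inf/nan
            scaler.update()
            optimizer.zero_grad(set_to_none=True)
```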

I appreciate your time and look forward to your insights. Thanks again for this excellent work!
