Thank you for sharing your work on AdaCLIP. I have gone through the paper and the repository, and I have a few points that need clarification:
Training Data Details: The paper initially describes using auxiliary data for training, but it is later stated that "the description of the utilized training set is not accurate." Could you clarify whether VisA and ClinicDB are used exclusively for evaluation, or whether they also play a role in fine-tuning the model?
Prompt Optimization: The concept of hybrid prompts (static and dynamic) is intriguing. Are the dynamic prompts generated once per image from predefined embeddings, or do they adapt iteratively at test time? Additionally, do the prompts remain consistent across similar anomalies in different domains?
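To make this question concrete, below is a minimal sketch of what I have in mind by "dynamic prompts generated per image": learned static context tokens shifted by a projection of the image embedding (CoCoOp-style conditioning). All names here (`DynamicPromptGenerator`, `meta_net`, the dimensions) are placeholders of my own and not taken from the AdaCLIP code:

```python
import torch
import torch.nn as nn

class DynamicPromptGenerator(nn.Module):
    """Sketch only: per-image prompt tokens conditioned on a global image feature.

    This illustrates one possible reading of "dynamic prompts"; it is not
    AdaCLIP's actual implementation.
    """
    def __init__(self, embed_dim: int = 768, n_ctx: int = 4):
        super().__init__()
        # Static, image-agnostic learnable context tokens (the "static" part).
        self.static_ctx = nn.Parameter(torch.randn(n_ctx, embed_dim) * 0.02)
        # Small meta-network mapping the image embedding to a per-image offset.
        self.meta_net = nn.Sequential(
            nn.Linear(embed_dim, embed_dim // 4),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim // 4, embed_dim),
        )

    def forward(self, image_feat: torch.Tensor) -> torch.Tensor:
        # image_feat: (batch, embed_dim) global feature from the CLIP image encoder.
        offset = self.meta_net(image_feat)                         # (batch, embed_dim)
        # Each image gets its own shifted copy of the static context tokens.
        return self.static_ctx.unsqueeze(0) + offset.unsqueeze(1)  # (batch, n_ctx, embed_dim)
```

Is this roughly what the dynamic branch does (a single forward pass per image), or is there an additional iterative update at test time?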
Performance Stability: Given that FP16 training can be unstable, have you observed significant variance in performance across multiple runs? Would incorporating additional stability mechanisms (e.g., gradient accumulation or mixed-precision adjustments) improve robustness?
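For reference, the kind of "stability mechanisms" I am thinking of is the standard PyTorch AMP recipe sketched below (dynamic loss scaling plus gradient accumulation). `model`, `loader`, and `compute_loss` are placeholders, not names from your codebase; this is only meant to make the question concrete:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

def train_one_epoch(model, loader, optimizer, compute_loss, accum_steps: int = 4):
    scaler = GradScaler()              # dynamic loss scaling to avoid FP16 gradient underflow
    optimizer.zero_grad(set_to_none=True)
    for step, (images, labels) in enumerate(loader):
        # Tensors are assumed to already be on the GPU.
        with autocast():               # run the forward pass in mixed precision
            loss = compute_loss(model(images), labels) / accum_steps
        scaler.scale(loss).backward()  # accumulate scaled gradients
        if (step + 1) % accum_steps == 0:
            scaler.step(optimizer)     # skips the update if gradients overflowed
            scaler.update()
            optimizer.zero_grad(set_to_none=True)
```

Did you find anything along these lines necessary in practice, or was plain FP16 training stable enough across seeds?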
I appreciate your time and look forward to your insights. Thanks again for this excellent work!