what user_query can i use? #23

Open
xiaoyazhu opened this issue Aug 7, 2024 · 5 comments

@xiaoyazhu

xiaoyazhu commented Aug 7, 2024

If I want to locate a specific target, such as "a person wearing a yellow hat", what user_query can I use at inference?

python -m groma.eval.run_groma \
    --model-name {path_to_groma_7b_finetune} \
    --image-file {path_to_img} \
    --query {user_query} \
    --quant_type 'none' # support ['none', 'fp16', '8bit', '4bit'] for inference
@machuofan
Collaborator

You can simply use "Locate <p> a person wearing a yellow hat </p> in the image." or something similar. Just remember to enclose the referring expression in <p> and </p>.
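
As a minimal sketch of the advice above, the helper below wraps a referring expression in <p> and </p> and assembles the inference command from the first post. The function names `build_grounding_query` and `build_command` are hypothetical and not part of Groma; only the flags shown in the original command are used. Because the query contains `<` and `>`, which a shell would otherwise treat as redirections, the sketch quotes each argument with `shlex.join`.

```python
import shlex


def build_grounding_query(expression: str) -> str:
    """Wrap a referring expression in <p>...</p> (hypothetical helper)."""
    expression = expression.strip()
    if not expression:
        raise ValueError("referring expression must not be empty")
    return f"Locate <p> {expression} </p> in the image."


def build_command(model_path: str, image_path: str, expression: str) -> str:
    """Return a shell-safe command line using the flags from the original post."""
    args = [
        "python", "-m", "groma.eval.run_groma",
        "--model-name", model_path,
        "--image-file", image_path,
        "--query", build_grounding_query(expression),
        "--quant_type", "none",
    ]
    # shlex.join quotes every argument, so <p> and </p> are not parsed by the shell.
    return shlex.join(args)


if __name__ == "__main__":
    # The paths below are placeholders, not real files.
    print(build_command("path/to/groma_7b_finetune", "path/to/img.jpg",
                        "a person wearing a yellow hat"))
```

When typing the command by hand, putting the whole query in single quotes has the same effect.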

@xiaoyazhu
Author

But when I tested several images, I found that the model outputs a localization result whether or not the object is actually present in the test image. Is there any way to adjust the settings, or is this caused by model hallucination?

@machuofan
Collaborator

Yes, such hallucination is probably caused by the training data: for grounding training, we only had positive QA pairs, i.e., the object mentioned in the question is guaranteed to occur in the image. To remedy this, you can curate some negative QA pairs and finetune the model.
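
One possible way to curate such negative QA pairs is sketched below. Everything here is an assumption for illustration: the record layout (`image`, `question`, `answer`), the wording of the negative answer, and the function name `make_negative_pairs` are hypothetical and would need to be adapted to whatever format Groma's finetuning data actually uses. The idea is simply to ask, for each image, about phrases that are known not to appear in it.

```python
import random

# Hypothetical wording for the "object is absent" answer.
NEGATIVE_ANSWER = "There is no such object in the image."


def make_negative_pairs(image_objects, vocabulary, per_image=1, seed=0):
    """Build negative QA pairs from per-image annotations.

    image_objects: dict mapping image path -> set of phrases present in it.
    vocabulary: iterable of candidate phrases to ask about.
    """
    rng = random.Random(seed)
    pairs = []
    for image, present in sorted(image_objects.items()):
        # Phrases from the vocabulary that are NOT annotated in this image.
        absent = sorted(set(vocabulary) - set(present))
        for phrase in rng.sample(absent, min(per_image, len(absent))):
            pairs.append({
                "image": image,
                "question": f"Locate <p> {phrase} </p> in the image.",
                "answer": NEGATIVE_ANSWER,
            })
    return pairs


if __name__ == "__main__":
    annotations = {"a.jpg": {"dog"}, "b.jpg": {"cat", "dog"}}
    for pair in make_negative_pairs(annotations, ["dog", "cat", "yellow hat"]):
        print(pair)
```

Exact string matching is a naive test for absence: a phrase can be missing from the annotations while the object is still visible, or be a synonym of an annotated one, so sampled negatives are worth spot-checking before finetuning.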

@Eman-Abdelrahman

If I need to fine-tune the model to locate target elements in an image, are there particular hyperparameters I should focus on, please?

@LLH-Harward

If I want to provide the object's label and bounding box to Groma to help generate more accurate image descriptions, how should I structure the prompt?
