Add gr.Dialogue component #11092

Open
wants to merge 43 commits into base: main

Conversation

freddyaboulton
Collaborator

Description

dialogue_component

🎯 PRs Should Target Issues

Before you create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.

Not adhering to this guideline will result in the PR being closed.

Testing and Formatting Your Code

  1. PRs will only be merged if tests pass on CI. We recommend at least running the backend tests locally; please set up your Gradio environment locally and run the backend tests: bash scripts/run_backend_tests.sh

  2. Please run these bash scripts to automatically format your code: bash scripts/format_backend.sh, and (if you made any changes to non-Python files) bash scripts/format_frontend.sh

@gradio-pr-bot
Collaborator

gradio-pr-bot commented Apr 28, 2025

🪼 branch checks and previews

| Name | Status | URL |
| --- | --- | --- |
| Spaces | ready! | Spaces preview |
| Website | ready! | Website preview |
| Storybook | ready! | Storybook preview |
| 🦄 Changes | detected! | Details |

Install Gradio from this PR

pip install https://gradio-pypi-previews.s3.amazonaws.com/ed79ab4bb17edf0649826bdd52924f9d7c4e997d/gradio-5.29.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@ed79ab4bb17edf0649826bdd52924f9d7c4e997d#subdirectory=client/python"

Install Gradio JS Client from this PR

npm install https://gradio-npm-previews.s3.amazonaws.com/ed79ab4bb17edf0649826bdd52924f9d7c4e997d/gradio-client-1.14.2.tgz

Use Lite from this PR

<script type="module" src="https://gradio-lite-previews.s3.amazonaws.com/ed79ab4bb17edf0649826bdd52924f9d7c4e997d/dist/lite.js"></script>

@gradio-pr-bot
Collaborator

gradio-pr-bot commented Apr 28, 2025

🦄 change detected

This Pull Request includes changes to the following packages.

| Package | Version |
| --- | --- |
| @gradio/dialogue | minor |
| @gradio/dropdown | minor |
| gradio | minor |
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Add gr.Dialogue component

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@abidlabs abidlabs requested review from dawoodkhan82, pngwn, hannahblair and abidlabs and removed request for pngwn April 29, 2025 07:48
freddyaboulton and others added 15 commits April 29, 2025 12:16
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* fix: ensure all translation files work as expected

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* Fix scaling issue when setting height in Image component

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
…ensions (#11093)

* Update client.py to always send file data, even for files without extensions

Fixes #10775

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* changes

* changes

* changes
…1097)

* Fix #10320: Chatbot - Ensure all messages in a group are editable
Previously, only the first message in a group from the same sender
was editable. This happened because message.svelte rendered a
single textarea for the group, and ChatBot.svelte only updated
the first message.

Now, edit_message is a list (edit_messages), storing edits for all
messages in the group. Each message gets its own textarea, and on
submit, handle_action is executed for each message in the group,
ensuring all are updated correctly.

Also, added an end-to-end test to verify this behavior, modifying
the demo to allow adding messages only for the user.

* add changeset

* format + notebooks

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Dawood <[email protected]>
* changes

* testing

* add changeset

* changes

* changes

* changes

* add changeset

* changes

* changes

* revert demo

* changes

* fixes

* changes

* format

* changes

* changes

* format

* notebook

* lcean

* format

* format

* add changeset

* add changeset

* changes

* changes

* changes

* changes

* changes

* format

* changes

* changes

* changes

* changes

* add changeset

* Changes

* changes

* changes

* changes

* changes

* changes

* changes

* add changeset

* hygiene

* next

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* add demo

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* changes

* revert demos

* changes

* changes

* format

* fix utils

* changes

* format

* changes

* more tests

* changes

* changes

* changes

* changes

* pydantic

* changes

* change

* mcp

* calculator

* remove base64

* changes

* changes

* changes

* changes

* add changeset

* changes

* add changeset

* guide

* changes

* changes

* guides

* change

* changes

* changes

* guide

* changes

* changes

* changes

* changes

* tests

* reqs

* fix

* change

* changes

* changes

* changes

* changes

* changes

* changes

* fixes

* fixes

* changes

* format

* guide

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ab (#11098)

* Fix #10281: Dragging image replaces existing instead of opening new tab
I updated the drag and drop event handlers in the imageUploader
component (on_drag_over and on_drop) to clear the
existing image before processing the newly dropped file.

Additionally, I modified an end-to-end test to verify that the number
of browser tabs remains the same after dropping a new file.

* add changeset

* Apply code review suggestions: rename to snake_case and reuse is_valid_file

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Freddy Boulton <[email protected]>
* update docs

* change

* add schema url
abidlabs and others added 20 commits May 13, 2025 15:32
… API" page (#11103)

* add route

* add changeset

* add changeset

* add test

* changes

* changes

* changes

* changes

* changes

* changes

* add more images/videos

* Update guides/04_additional-features/07_sharing-your-app.md

Co-authored-by: Ali Abdalla <[email protected]>

* Update guides/04_additional-features/14_view-api-page.md

Co-authored-by: Ali Abdalla <[email protected]>

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Ali Abdalla <[email protected]>
* Add code

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* Fix datetime

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* update STDIO instructions to specify sse-only transport

* formatter

* fix

* add changeset

---------

Co-authored-by: Abubakar Abid <[email protected]>
Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Freddy Boulton <[email protected]>
* add guide

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* Fix markdown change event

* Fix html change event too

* add changeset

* Fix

* docstring

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Abubakar Abid <[email protected]>
* test

* add changeset

* typefix

* openapi note

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* changes

* add changeset

---------

Co-authored-by: Ali Abid <[email protected]>
Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Abubakar Abid <[email protected]>
* Add code

* Fix

* add changeset

* Add unit test

---------

Co-authored-by: gradio-pr-bot <[email protected]>
…te (#11173)

* changes

* add changeset

* add changeset

* js

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* Fix

* add changeset

* Update gradio/utils.py

Co-authored-by: Abubakar Abid <[email protected]>

---------

Co-authored-by: gradio-pr-bot <[email protected]>
Co-authored-by: Abubakar Abid <[email protected]>
* Add code

* add changeset

* add changeset

* backend fix

* comment

* add changeset

* unit tests

* empty

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* chore: support Path type for the favicon

* add changeset

* chore: convert favicon from Path to str internally

---------

Co-authored-by: gradio-pr-bot <[email protected]>
…nt is directory (#11174)

* lint

* add changeset

---------

Co-authored-by: gradio-pr-bot <[email protected]>
* changes

* changes

* label

* serial
@freddyaboulton
Collaborator Author

Added a "plain text" mode for the dialogue component. Should be good for review!

dialogue_demo_new

@freddyaboulton freddyaboulton marked this pull request as ready for review May 13, 2025 20:40
@@ -502,7 +502,11 @@ def test_slider_random_value_config(self):
    def test_io_components_attach_load_events_when_value_is_fn(self, io_components):
        interface = gr.Interface(
            lambda *args: None,
            inputs=[comp(value=lambda: None, every=1) for comp in io_components],
Member

Just wondering why do we have to exclude gr.Dialogue from this list?

Member

And if we need to exclude it, should we just remove it from the io_components fixture? That seems more maintainable than manually adjusting the count by one.
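
One way to read the suggestion above is to filter the excluded component out where the fixture is built, so individual tests never adjust counts by hand. A minimal sketch, assuming stand-in classes and a hypothetical `EXCLUDED_FROM_IO` set; none of these names come from the actual Gradio test suite:

```python
# Stand-in component classes; the real fixture would collect Gradio components.
class Textbox: ...
class Slider: ...
class Dialogue(Textbox): ...

ALL_COMPONENTS = [Textbox, Slider, Dialogue]

# Hypothetical exclusion set kept next to the fixture definition.
EXCLUDED_FROM_IO = {Dialogue}


def io_components():
    """Return the component classes that the shared IO tests should cover."""
    return [c for c in ALL_COMPONENTS if c not in EXCLUDED_FROM_IO]


print([c.__name__ for c in io_components()])
```

In a real test module this function would presumably carry `@pytest.fixture`; the decorator is left out here so the sketch runs without pytest installed.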

Comment on lines +75 to +76
if (distance_from_bottom > distance_from_top || from_top) {
console.log("distance_from_top", distance_from_top);
Member

remove?

Suggested change
if (distance_from_bottom > distance_from_top || from_top) {
console.log("distance_from_top", distance_from_top);

@@ -22,7 +24,11 @@
    function calculate_window_distance(): void {
        const { top: ref_top, bottom: ref_bottom } =
            refElement.getBoundingClientRect();
        distance_from_top = ref_top;
        if (from_top) {
Member

Can you explain the change here?

root: list[DialogueLine] | str


class Dialogue(Textbox):
Member

@abidlabs abidlabs May 14, 2025

What is the benefit of inheriting from gr.Textbox? If there's not a strong benefit, consider keeping them separate, as inheritance couples these components in a way that we might need to change later.

value: Value of the dialogue. It is a list of dictionaries, each containing a 'speaker' key and a 'text' key. If a function is provided, the function will be called each time the app loads to set the initial value of this component.
speakers: The different speakers allowed in the dialogue.
formatter: A function that formats the dialogue line dictionary, e.g. {"speaker": "Speaker 1", "text": "Hello, how are you?"} into a string, e.g. "Speaker 1: Hello, how are you?".
emotions: The different emotions and intonation allowed in the dialogue. Emotions are displayed in an autocomplete menu below the input textbox when the user starts typing `:`. Use the exact emotion name expected by the AI model or inference function.
Member

@abidlabs abidlabs May 14, 2025

As discussed in Slack somewhere, consider a more general name for this parameter, like completions or tags.

Suggested change
emotions: The different emotions and intonation allowed in the dialogue. Emotions are displayed in an autocomplete menu below the input textbox when the user starts typing `:`. Use the exact emotion name expected by the AI model or inference function.
emotions: The different emotions and intonation allowed in the dialogue. Emotions are displayed in an autocomplete menu below the input textbox when the user starts typing `:`. Use the exact emotion name expected by the AI model or inference function.
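
The docstring above says tags appear in an autocomplete menu once the user types `:`. A rough sketch of that kind of filtering, written in Python for illustration only (the component's real frontend logic is not shown in this thread). The function name `matching_tags` and the case-insensitive substring rule are assumptions:

```python
def matching_tags(text: str, tags: list[str]) -> list[str]:
    """Return the tags to suggest for the text typed so far.

    Assumption: the menu opens at the last ':' and is filtered by whatever
    the user typed after it, using a case-insensitive substring match.
    """
    _, sep, query = text.rpartition(":")
    if not sep:  # no colon typed yet, so no menu
        return []
    query = query.strip().lower()
    return [tag for tag in tags if query in tag.lower()]


tags = ["(laughs)", "(sighs)", "(claps)"]
print(matching_tags("That was funny :lau", tags))
```

Because the filter only depends on a list of strings, the same logic works for any kind of tag, which is one argument for the more general parameter name suggested in the comment.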

Comment on lines +44 to +46
        info: str
        | None = "Type colon (:) in the dialogue line to see the available emotion and intonation tags",
        placeholder: str | None = "Enter dialogue here...",
Member

@abidlabs abidlabs May 14, 2025

Do we really want to have default values for info and placeholder? For consistency with other components, it may be better to leave them as None.

<h1 style='text-align: center; display: flex; align-items: center; justify-content: center;'>
<img src="https://huggingface.co/datasets/freddyaboulton/bucket/resolve/main/dancing_huggy.gif" alt="Dancing Huggy" style="height: 100px; margin-right: 10px"> Dia Dialogue Generation Model
</h1>
<h2 style='text-align: center; display: flex; align-items: center; justify-content: center;'>Model by <a href="https://huggingface.co/nari-labs/Dia-1.6B"> Nari Labs</a>. Powered by HF and <a href="https://fal.ai/">Fal AI</a> API.</h2>
Member

For some reason I'm not seeing spaces before and after the links:

image
Suggested change
<h2 style='text-align: center; display: flex; align-items: center; justify-content: center;'>Model by <a href="https://huggingface.co/nari-labs/Dia-1.6B"> Nari Labs</a>. Powered by HF and <a href="https://fal.ai/">Fal AI</a> API.</h2>
<h2 style='text-align: center; display: flex; align-items: center; justify-content: center;'>Model by &nbsp;<a href="https://huggingface.co/nari-labs/Dia-1.6B"> Nari Labs</a>. Powered by HF and &nbsp; <a href="https://fal.ai/">Fal AI</a>&nbsp; API.</h2>

Member

Really great demo! Btw for some reason I'm getting some strange outputs. The output audio is always exactly 30 seconds long and has long periods of silence and other artifacts. Not sure if it's an issue with their API or something with our preprocessing:

image

Collaborator Author

I think it's the Fal API. The zero-gpu demo in their org is a lot better.

@abidlabs
Member

Very cool component @freddyaboulton! Just a couple of suggestions:

  • I think it would be great if we could support the auto-complete tags in the textbox mode:
image

This would allow us to close #10865 as well

  • It would be great if gr.Dialogue also worked as an output component for speaker diarization demos. Right now, it shows up, but doesn't work:
Screen.Recording.2025-05-14.at.1.47.36.PM.mov
import gradio as gr


emotions = [
    "(laughs)",
    "(clears throat)",
    "(sighs)",
    "(gasps)",
    "(coughs)",
    "(singing)",
    "(sings)",
    "(mumbles)",
    "(beep)",
    "(groans)",
    "(sniffs)",
    "(claps)",
    "(screams)",
    "(inhales)",
    "(exhales)",
    "(applause)",
    "(burps)",
    "(humming)",
    "(sneezes)",
    "(chuckle)",
    "(whistles)",
]
speakers = ["Speaker 1", "Speaker 2"]


def formatter(speaker, text):
    speaker = speaker.split(" ")[1]
    return f"[S{speaker}] {text}"


dialogue = gr.Dialogue(
    speakers=speakers, emotions=emotions, formatter=formatter
)
audio = gr.Dialogue(speakers=speakers, emotions=emotions, formatter=formatter)
examples = [
    [
        [
            {
                "speaker": "Speaker 1",
                "text": "Why did the chicken cross the road?",
            },
            {"speaker": "Speaker 2", "text": "I don't know!"},
            {
                "speaker": "Speaker 1",
                "text": "to get to the other side! (laughs)",
            },
        ]
    ],
    [
        [
            {
                "speaker": "Speaker 1",
                "text": "I am a little tired today (sighs).",
            },
            {"speaker": "Speaker 2", "text": "Hang in there!"},
        ]
    ],
]

demo = gr.Interface(lambda x: x, [dialogue], [audio], examples=examples)

if __name__ == "__main__":
    demo.launch()
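
For reference, the `formatter` in the snippet above can be checked without launching Gradio. This sketch copies the function as written and applies it to the first two lines of the first example. Joining the formatted lines with a single space is an assumption for illustration; how the component itself joins lines is not stated in this thread:

```python
def formatter(speaker, text):
    # Same logic as the snippet above: "Speaker 1" becomes "[S1]".
    speaker = speaker.split(" ")[1]
    return f"[S{speaker}] {text}"


lines = [
    {"speaker": "Speaker 1", "text": "Why did the chicken cross the road?"},
    {"speaker": "Speaker 2", "text": "I don't know!"},
]

# Assumption: formatted lines are joined with one space.
formatted = " ".join(formatter(line["speaker"], line["text"]) for line in lines)
print(formatted)
```

Note that this formatter relies on speaker names having the form "Speaker N"; a name without a space would raise an `IndexError`.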
