feat(weave): Implement integration with 🤗 inference client #2795

Draft

soumik12345 wants to merge 25 commits into master
Conversation

soumik12345 (Contributor) commented Oct 28, 2024

Description

Implement autopatch integration with the 🤗 inference client (huggingface_hub.InferenceClient and AsyncInferenceClient), so calls are traced automatically after weave.init().
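
A rough sketch of the autopatch idea (illustrative only; the names and internals below are assumptions, not this PR's actual implementation): the integration replaces the client's methods with weave.op-wrapped versions so that every call is recorded as a trace.

import weave
from huggingface_hub import InferenceClient

# Illustrative sketch: the real patching lives inside weave's integration code.
_original_chat_completion = InferenceClient.chat_completion

@weave.op()
def _traced_chat_completion(self, *args, **kwargs):
    # Delegate to the original method; weave records inputs and outputs.
    return _original_chat_completion(self, *args, **kwargs)

InferenceClient.chat_completion = _traced_chat_completion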

Multi-modal chat completion

Sync generation


Without streaming

import os
import weave
from huggingface_hub import InferenceClient


weave.init("test-huggingface")
client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
response = client.chat_completion(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Describe this image in one sentence."},
            ],
        }
    ],
    max_tokens=500,
)

print(response.choices[0].message.content)

Sample Trace

With streaming

import os
import weave
from huggingface_hub import InferenceClient


weave.init("test-huggingface")
client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
response = client.chat_completion(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Describe this image in one sentence."},
            ],
        }
    ],
    max_tokens=500,
    stream=True,
)

for r in response:
    print(r.choices[0].delta.content, end="")

Sample Trace

Note: Usage metadata appears as None in the trace because value.usage is always None when stream=True. This might be due to a bug in huggingface_hub.InferenceClient.
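
A quick way to confirm this behavior (a minimal check, assuming the streamed chunks expose a usage attribute like the non-streaming output does):

import os
from huggingface_hub import InferenceClient

client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))
response = client.chat_completion(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[{"role": "user", "content": "Say hi."}],
    max_tokens=10,
    stream=True,
)
for r in response:
    # getattr guards against chunks that omit the field entirely
    print(getattr(r, "usage", None))  # expected: None for every chunk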

Async generation


Without streaming

import asyncio
import os
import weave
from huggingface_hub import AsyncInferenceClient

weave.init("test-huggingface")
client = AsyncInferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"

response = asyncio.run(
    client.chat_completion(
        model="meta-llama/Llama-3.2-11B-Vision-Instruct",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": "Describe this image in one sentence."},
                ],
            }
        ],
        max_tokens=500,
    )
)

print(response.choices[0].message.content)

Sample Trace

With streaming

import asyncio
import os
import weave
from huggingface_hub import AsyncInferenceClient

weave.init("test-huggingface")
client = AsyncInferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"

async def generate():
    response = await client.chat_completion(
        model="meta-llama/Llama-3.2-11B-Vision-Instruct",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": "Describe this image in one sentence."},
                ],
            }
        ],
        max_tokens=500,
        stream=True,
    )

    async for r in response:
        print(r.choices[0].delta.content, end="")


asyncio.run(generate())

Sample Trace

Text-to-image generation

import os
import weave
from huggingface_hub import InferenceClient

weave.init("test-huggingface")
InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY")).text_to_image(
    prompt="A whimsical and creative image depicting a hybrid creature that is a mix of a waffle and a hippopotamus, basking in a river of melted butter amidst a breakfast-themed landscape. It features the distinctive, bulky body shape of a hippo. However, instead of the usual grey skin, the creature's body resembles a golden-brown, crispy waffle fresh off the griddle. The skin is textured with the familiar grid pattern of a waffle, each square filled with a glistening sheen of syrup. The environment combines the natural habitat of a hippo with elements of a breakfast table setting, a river of warm, melted butter, with oversized utensils or plates peeking out from the lush, pancake-like foliage in the background, a towering pepper mill standing in for a tree.  As the sun rises in this fantastical world, it casts a warm, buttery glow over the scene. The creature, content in its butter river, lets out a yawn. Nearby, a flock of birds take flight",
    model="stabilityai/stable-diffusion-3.5-large",
)

Sample Trace
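
As a usage note, text_to_image returns a PIL.Image.Image, so the result can be saved or inspected directly (a minimal follow-up, assuming Pillow is installed; the filename is arbitrary):

import os
from huggingface_hub import InferenceClient

client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))
image = client.text_to_image(
    prompt="A waffle-hippopotamus hybrid basking in a river of melted butter",
    model="stabilityai/stable-diffusion-3.5-large",
)
image.save("waffle_hippo.png")  # result is a standard PIL image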

@soumik12345 soumik12345 self-assigned this Oct 28, 2024
@soumik12345 soumik12345 requested a review from a team as a code owner October 28, 2024 18:07
@soumik12345 soumik12345 marked this pull request as draft October 28, 2024 18:07