feat(weave): Implement integration with 🤗 inference client #2795


Closed
soumik12345 wants to merge 45 commits

Conversation

soumik12345 (Contributor) commented Oct 28, 2024

Description

Implement autopatch integration with 🤗 inference client.
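For context, here is a hedged sketch of what autopatching saves: without it, each call would need a manual weave op wrapper like the hypothetical one below; with this integration, weave.init() alone instruments the client's methods.

import os
import weave
from huggingface_hub import InferenceClient

weave.init("test-huggingface")
client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

# Hypothetical manual wrapper -- unnecessary once the client is autopatched.
@weave.op()
def chat(messages: list, **kwargs):
    return client.chat_completion(messages=messages, **kwargs)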

Multi-modal text completion

Sync generation

Expand to see code snippets and traces

Without streaming

import os
import weave
from huggingface_hub import InferenceClient


weave.init("test-huggingface")
client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
client.chat_completion(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Describe this image in one sentence."},
            ],
        }
    ],
    max_tokens=500,
)

Sample Trace

With streaming

import os
import weave
from huggingface_hub import InferenceClient


weave.init("test-huggingface")
client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
response = client.chat_completion(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Describe this image in one sentence."},
            ],
        }
    ],
    max_tokens=500,
    stream=True,
)

for r in response:
    print(r.choices[0].delta.content, end="")

Sample Trace

Note: Usage metadata is reported as None because value.usage is always None when stream=True. This might be due to a bug in huggingface_hub.InferenceClient.
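
As a sanity check for the issue above, the stream can be watched for any chunk that does carry usage metadata. A minimal sketch, assuming the chunks follow the OpenAI-compatible schema where usage is optional and, if present at all, arrives on a late chunk (drop-in replacement for the for-loop above):

for r in response:
    if r.choices:
        print(r.choices[0].delta.content or "", end="")
    # usage is None on most (possibly all) chunks; log it if any chunk carries it
    if getattr(r, "usage", None) is not None:
        print("\nusage:", r.usage)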

Async generation

Expand to see code snippets and traces

Without streaming

import asyncio
import os
import weave
from huggingface_hub import AsyncInferenceClient

weave.init("test-huggingface")
client = AsyncInferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"

response = asyncio.run(
    client.chat_completion(
        model="meta-llama/Llama-3.2-11B-Vision-Instruct",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": "Describe this image in one sentence."},
                ],
            }
        ],
        max_tokens=500,
    )
)
print(response.choices[0].message.content)

Sample Trace

With streaming

import asyncio
import os
import weave
from huggingface_hub import AsyncInferenceClient

weave.init("test-huggingface")
client = AsyncInferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

image_url = "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"

async def generate():
    response = await client.chat_completion(
        model="meta-llama/Llama-3.2-11B-Vision-Instruct",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": "Describe this image in one sentence."},
                ],
            }
        ],
        max_tokens=500,
        stream=True,
    )

    async for r in response:
        print(r.choices[0].delta.content, end="")


asyncio.run(generate())

Sample Trace

Text-to-image generation

Expand to see code snippets and traces
import os
import weave
from huggingface_hub import InferenceClient

weave.init("test-huggingface")
client = InferenceClient(api_key=os.getenv("HUGGINGFACE_API_KEY"))

# text_to_image returns a PIL.Image.Image; bind the result so it can be reused.
image = client.text_to_image(
    prompt="A whimsical and creative image depicting a hybrid creature that is a mix of a waffle and a hippopotamus, basking in a river of melted butter amidst a breakfast-themed landscape. It features the distinctive, bulky body shape of a hippo. However, instead of the usual grey skin, the creature's body resembles a golden-brown, crispy waffle fresh off the griddle. The skin is textured with the familiar grid pattern of a waffle, each square filled with a glistening sheen of syrup. The environment combines the natural habitat of a hippo with elements of a breakfast table setting, a river of warm, melted butter, with oversized utensils or plates peeking out from the lush, pancake-like foliage in the background, a towering pepper mill standing in for a tree.  As the sun rises in this fantastical world, it casts a warm, buttery glow over the scene. The creature, content in its butter river, lets out a yawn. Nearby, a flock of birds take flight",
    model="stabilityai/stable-diffusion-3.5-large",
)

Sample Trace
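
For reference, text_to_image returns a PIL.Image.Image (Pillow must be installed), so the image bound above can be saved or inspected directly; the filename here is hypothetical:

image.save("waffle_hippo.png")
print(image.size)  # (width, height) in pixels, model-dependent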

@soumik12345 soumik12345 self-assigned this Oct 28, 2024
@soumik12345 soumik12345 requested a review from a team as a code owner October 28, 2024 18:07
@soumik12345 soumik12345 marked this pull request as draft October 28, 2024 18:07

socket-security bot commented Jan 17, 2025

Updated dependencies detected. Learn more about Socket for GitHub.

Package: pypi/[email protected] 🔁 pypi/[email protected]
New capabilities: None
Transitives: +169
Size: 1.48 GB

View full report

@soumik12345 soumik12345 marked this pull request as ready for review January 22, 2025 12:03
ayulockin (Member) commented

Hey @soumik12345, can you ensure a green CI?

Hey @wandb/weave-team, this integration will help us gain more traction with open models and projects that use HF inference. Can we get some review time on this PR?

soumik12345 (Contributor, Author) commented

> Hey @soumik12345, can you ensure a green CI?
>
> Hey @wandb/weave-team, this integration will help us gain more traction with open models and projects that use HF inference. Can we get some review time on this PR?

Made the CI green; however, the llamaindex and langchain tests keep failing (I don't think that's because of this PR).

soumik12345 (Contributor, Author) commented

Continuing at #3612

@soumik12345 soumik12345 closed this Feb 6, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Feb 6, 2025