
Commit e2f80e3

Merge pull request #331 from kbenkhaled/functiongemma
FunctionGemma
2 parents 58e9b98 + cb36ea4

4 files changed: +119 −14 lines

src/components/RunCommand.astro

Lines changed: 2 additions & 2 deletions

```diff
@@ -72,7 +72,7 @@ const engineUi = engines.map((e) => ({ id: e.id, label: e.label }));
 const engineImagesOrin: Record<string, string> = {
   ollama: 'ollama/ollama:latest',
   vllm: 'ghcr.io/nvidia-ai-iot/vllm:latest-jetson-orin',
-  llamacpp: 'ghcr.io/ggerganov/llama.cpp:server',
+  llamacpp: 'ghcr.io/nvidia-ai-iot/llama_cpp:latest-jetson-orin',
   tensorrtllm: 'nvcr.io/nvidia/tensorrt-llm:latest',
   diffusers: 'pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime',
   comfyui: 'ghcr.io/comfyanonymous/comfyui:latest',
@@ -83,7 +83,7 @@ const engineImagesOrin: Record<string, string> = {
 const engineImagesThor: Record<string, string> = {
   ollama: 'ollama/ollama:latest',
   vllm: 'ghcr.io/nvidia-ai-iot/vllm:latest-jetson-thor',
-  llamacpp: 'ghcr.io/ggerganov/llama.cpp:server',
+  llamacpp: 'ghcr.io/nvidia-ai-iot/llama_cpp:latest-jetson-thor',
   tensorrtllm: 'nvcr.io/nvidia/tensorrt-llm:latest',
   diffusers: 'pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime',
   comfyui: 'ghcr.io/comfyanonymous/comfyui:latest',
```
Lines changed: 100 additions & 0 deletions

@@ -0,0 +1,100 @@
---
title: "FunctionGemma"
model_id: "functiongemma"
short_description: "Google's specialized function calling model built on Gemma 3 270M, optimized for tool use"
family: "Google Gemma3"
icon: "💎"
is_new: true
order: 0.5
type: "Text"
memory_requirements: "1GB RAM"
precision: "FP8"
model_size: "0.5GB"
hf_checkpoint: "ggml-org/functiongemma-270m-it-GGUF"
minimum_jetson: "Orin Nano"
supported_inference_engines:
  - engine: "llama.cpp"
    type: "Container"
    run_command_orin: "sudo docker run -it --rm --runtime=nvidia --network host ghcr.io/nvidia-ai-iot/llama_cpp:latest-jetson-orin llama-server --jinja -fa on -hf ggml-org/functiongemma-270m-it-GGUF --alias functiongemma"
    run_command_thor: "sudo docker run -it --rm --runtime=nvidia --network host ghcr.io/nvidia-ai-iot/llama_cpp:latest-jetson-thor llama-server --jinja -fa on -hf ggml-org/functiongemma-270m-it-GGUF --alias functiongemma"
---

FunctionGemma is a lightweight, open model from Google, designed as a foundation for creating your own specialized function calling models. Built on the Gemma 3 270M model with the same research and technology used to create the Gemini models, FunctionGemma has been trained specifically for function calling. It has the same architecture as Gemma 3 but uses a different chat format optimized for tool use.

**Note:** FunctionGemma is not intended for use as a direct dialogue model. It is designed to be highly performant after further fine-tuning, as is typical of models this size. The model is well suited for text-only function calling scenarios.

This model is a strong fit for applications like home assistants: a voice command is first transcribed with speech-to-text (STT), and the resulting text is passed to the model, which calls the appropriate tool. Commands like "turn off the lights," "open the garage," "set the thermostat to 72 degrees," or "turn on the coffee maker" can be processed efficiently, as the sketch below shows. The model can also call tools in parallel, making it efficient for multiple commands or complex multi-step actions.
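As a concrete sketch of that flow (the `set_thermostat` tool name and schema here are illustrative, not something the model or the server provides), the transcribed command becomes the user message and the smart-home action becomes a tool definition:

```bash
curl http://localhost:8080/v1/chat/completions -d '{
  "model": "functiongemma",
  "messages": [
    {"role": "user", "content": "Set the thermostat to 72 degrees"}
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "set_thermostat",
      "description": "Set the target temperature of the home thermostat",
      "parameters": {
        "type": "object",
        "properties": {
          "temperature": {
            "type": "number",
            "description": "Target temperature in degrees Fahrenheit, e.g. 72"
          }
        },
        "required": ["temperature"]
      }
    }
  }]
}'
```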
## Supported Platforms

- ✅ Jetson Orin (Orin Nano, Orin NX, AGX Orin)
- ✅ Jetson Thor

You can use FunctionGemma with your favorite orchestration framework or any library/software that supports OpenAI-compatible API backends.
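For instance, once the container from the run command above is up (llama-server listens on port 8080 by default, which the examples below assume), a quick sanity check of the OpenAI-compatible endpoint might look like this; the returned model list should include the `functiongemma` alias set in the run command:

```bash
# List models served by llama-server via the OpenAI-compatible API
curl http://localhost:8080/v1/models
```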
## Getting Started

### Quick Hello World Example

Here's a simple CLI example to get you started with function calling:
```bash
curl http://localhost:8080/v1/chat/completions -d '{
  "model": "functiongemma",
  "messages": [
    {"role": "system", "content": "You are a chatbot that uses tools/functions. Dont overthink things."},
    {"role": "user", "content": "What is the weather in Istanbul?"}
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and country/state, e.g. `San Francisco, CA`, or `Paris, France`"
          }
        },
        "required": ["location"]
      }
    }
  }]
}'
```
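If the call succeeds, the server answers in the OpenAI chat completions format. An abridged sketch of what the response typically looks like (exact fields and ids vary by llama.cpp version):

```json
{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "arguments": "{\"location\": \"Istanbul, Turkey\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
```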
### Parallel Tool Calling

To enable parallel tool calling, simply add `"parallel_tool_calls": true` to your request payload:
```bash
curl http://localhost:8080/v1/chat/completions -d '{
  "model": "functiongemma",
  "parallel_tool_calls": true,
  "messages": [
    {"role": "user", "content": "Turn on the living room lights and set the temperature to 70"}
  ],
  "tools": [...]
}'
```
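With two matching tools defined (say, hypothetical `set_lights` and `set_thermostat` functions), a parallel response carries one entry per call in the same `tool_calls` array, roughly:

```json
"tool_calls": [
  {"type": "function", "function": {"name": "set_lights", "arguments": "{\"room\": \"living room\", \"state\": \"on\"}"}},
  {"type": "function", "function": {"name": "set_thermostat", "arguments": "{\"temperature\": 70}"}}
]
```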
## Key Features

- 🎯 **Specialized for Function Calling**: Purpose-built for tool use and API calling
- **Lightweight**: Only 270M parameters, runs efficiently on edge devices
- 🔄 **Parallel Execution**: Call multiple tools simultaneously
## Inputs and Outputs

**Input:**
- Text string with system and user messages
- Tool/function definitions in OpenAI format
- An optional flag to enable parallel tool calling

**Output:**
- Structured function calls with appropriate parameters
- Compatible with OpenAI chat completions format
- JSON-formatted tool invocations

src/layouts/Layout.astro

Lines changed: 16 additions & 11 deletions

```diff
@@ -361,20 +361,25 @@ const { title, description = "Experience the latest generative AI models optimiz
 function buildShell(meta, st) {
   var engineLower = (st.engine || '').toLowerCase();
 
-  // Check for full custom command from supportedEngines (for vLLM)
-  if (engineLower === 'vllm' && meta.supportedEngines && meta.supportedEngines.length > 0) {
-    var vllmEngine = meta.supportedEngines.find(function(e) {
-      return e.engine && e.engine.toLowerCase() === 'vllm';
+  // Check for full custom command from supportedEngines (for vLLM and llama.cpp)
+  if (meta.supportedEngines && meta.supportedEngines.length > 0) {
+    var customEngine = meta.supportedEngines.find(function(e) {
+      if (!e.engine) return false;
+      var engineName = e.engine.toLowerCase();
+      // Match by exact name or by removing dots/special chars (llama.cpp -> llamacpp)
+      var normalizedName = engineName.replace(/[.\-_]/g, '');
+      var normalizedEngineLower = engineLower.replace(/[.\-_]/g, '');
+      return engineName === engineLower || normalizedName === normalizedEngineLower;
     });
-    if (vllmEngine) {
+    if (customEngine) {
       var customCmd = null;
       var device = st.device || 'Jetson Orin';
-      if (device === 'Jetson Thor' && vllmEngine.run_command_thor) {
-        customCmd = vllmEngine.run_command_thor;
-      } else if (device === 'Jetson Orin' && vllmEngine.run_command_orin) {
-        customCmd = vllmEngine.run_command_orin;
-      } else if (vllmEngine.run_command) {
-        customCmd = vllmEngine.run_command;
+      if (device === 'Jetson Thor' && customEngine.run_command_thor) {
+        customCmd = customEngine.run_command_thor;
+      } else if (device === 'Jetson Orin' && customEngine.run_command_orin) {
+        customCmd = customEngine.run_command_orin;
+      } else if (customEngine.run_command) {
+        customCmd = customEngine.run_command;
       }
       if (customCmd) {
         return customCmd;
```
src/pages/models/index.astro

Lines changed: 1 addition & 1 deletion

```diff
@@ -66,7 +66,7 @@ const families = Array.from(new Set(models.map(m => m.family)));
           {model.description}
         </p>
 
-        <div class="mt-auto pt-4 border-t border-gray-100 relative z-20">
+        <div class="mt-auto pt-4 border-t border-gray-100">
           <div class="flex items-center gap-2">
             <RunCommand modelId={model.id} modelName={model.name} category={model.category} forceModal={model.category === 'Image'} supportedEngines={model.supported_inference_engines} hfCheckpoint={model.hf_checkpoint} />
             <a href={`/models/${model.id}`} class="px-3 py-1.5 rounded-md bg-nvidia-gray-100 text-nvidia-gray-900 hover:bg-nvidia-green hover:text-nvidia-black text-xs font-semibold transition-colors">
```