15 changes: 15 additions & 0 deletions scripts/copy_leaderboard/.gitignore
@@ -0,0 +1,15 @@
# Virtual environment
.venv/

# Downloaded artifacts
artifacts/

# W&B run data
wandb/

# State files
.last_sync.txt

# Python cache
__pycache__/
*.pyc
223 changes: 223 additions & 0 deletions scripts/copy_leaderboard/README.md
@@ -0,0 +1,223 @@
# Copy Leaderboard Tool

A tool for copying the W&B Nejumi Leaderboard into your own W&B environment, such as a dedicated cloud or on-premises deployment.

## Features

1. **Run Migration**: Migrate runs with specific tags (e.g., "leaderboard") from source to destination
2. **Artifact Migration**: Copy dataset artifacts to your environment
3. **Report Creation**: Create a new leaderboard report using W&B Reports API
4. **Incremental Updates**: Sync only new runs since the last update

## Installation

### Using uv (recommended)

```bash
cd scripts/copy_leaderboard
uv sync
```

### Using pip

```bash
cd scripts/copy_leaderboard
pip install -e .
```

## Configuration

Copy and edit the configuration file:

```bash
cp config.yaml my_config.yaml
```

Edit `my_config.yaml` with your settings:

```yaml
source:
  base_url: "https://api.wandb.ai"
  entity: "llm-leaderboard"
  project: "nejumi-leaderboard4"
  run_tag: "leaderboard"

destination:
  base_url: "https://api.wandb.ai"  # or your W&B server
  entity: "your-entity"
  project: "your-project"

artifacts:
  - "wandb-japan/llm-leaderboard3/jaster:v6"
  # ... add more artifacts

options:
  max_workers: 4
  skip_existing: true
```
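
Before starting a long migration, it can help to check that the edited config no longer contains the placeholder values. The following is a minimal sketch, not part of `copy_leaderboard.py`: the helper name `find_config_problems` and the set of keys treated as required are assumptions based on the example config above.

```python
from typing import Any, Dict, List

# Assumption: these are the keys shown in the example config above.
# The real tool may require more or fewer of them.
REQUIRED_KEYS = {
    "source": ("base_url", "entity", "project", "run_tag"),
    "destination": ("base_url", "entity", "project"),
}
# Placeholder values taken from the example config.
PLACEHOLDERS = {"your-entity", "your-project"}


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            value = values.get(key)
            if not value:
                problems.append(f"{section}.{key} is missing")
            elif value in PLACEHOLDERS:
                problems.append(
                    f"{section}.{key} still has the placeholder value {value!r}"
                )
    return problems
```

The function takes a plain dictionary, so it can be fed the result of parsing `my_config.yaml` with any YAML loader (for example `yaml.safe_load` from PyYAML).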

### Private Artifacts Access

Some artifacts (such as toxicity datasets) require special access permissions:

```yaml
artifacts:
  - "wandb-japan/toxicity-dataset-private/toxicity_dataset_full:v3"
  - "wandb-japan/toxicity-dataset-private/toxicity_judge_prompts:v1"
```

To access these private artifacts:

- **Enterprise license holders**: Please contact your designated W&B support engineer who will grant temporary access.
- **Other users**: Please contact [email protected].

*Note: We may not be able to accommodate all access requests. Thank you for your understanding.*

## Usage

### Set API Keys

```bash
# Source W&B API key (for reading from Nejumi leaderboard)
export WANDB_SRC_API_KEY="your-source-api-key"

# Destination W&B API key (for writing to your environment)
export WANDB_DST_API_KEY="your-destination-api-key"
```
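
A quick way to confirm both keys are visible to Python before running a command is sketched below. The helper `missing_api_keys` is hypothetical and not part of the tool; only the two environment variable names come from this README.

```python
import os
from typing import List, Mapping, Optional

# Environment variable names as documented above.
REQUIRED_ENV_VARS = ("WANDB_SRC_API_KEY", "WANDB_DST_API_KEY")


def missing_api_keys(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]
```

Calling `missing_api_keys()` with no arguments inspects the current process environment; an empty list means both keys are set.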

### Full Migration

Run complete migration (artifacts, runs, and report creation). Use this for the initial one-time migration:

```bash
uv run python copy_leaderboard.py -c my_config.yaml full-migration
```

### Continuous Migration

For ongoing synchronization after the initial migration, use the sync command. This migrates only new runs since the last update.

#### Using start_date

Set `start_date` in your config to specify the starting point for incremental sync:

```yaml
options:
  start_date: "2024-01-01"  # ISO format: YYYY-MM-DD
  skip_existing: true
```

#### Sync Command

```bash
uv run python copy_leaderboard.py -c my_config.yaml sync
```

The sync command will:
- Only migrate runs created after `start_date` (if specified)
- Skip runs that already exist in the destination (when `skip_existing: true`)
- Automatically track the last sync time for subsequent runs
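
The selection rules listed above can be illustrated with a small sketch. This is not the tool's actual implementation: `RunInfo` and `select_runs_to_sync` are hypothetical names, timestamps are assumed to be naive (no time zone), and "after `start_date`" is assumed to mean strictly later.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional, Set


class RunInfo(NamedTuple):
    """Hypothetical stand-in for a source run; not a W&B type."""
    run_id: str
    created_at: datetime


def select_runs_to_sync(
    runs: Iterable[RunInfo],
    existing_ids: Set[str],
    start_date: Optional[str],
    skip_existing: bool = True,
) -> List[RunInfo]:
    """Keep runs newer than start_date that are not already in the destination."""
    # Accepts "YYYY-MM-DD" as well as "YYYY-MM-DDTHH:MM:SS".
    cutoff = datetime.fromisoformat(start_date) if start_date else None
    selected = []
    for run in runs:
        if cutoff is not None and run.created_at <= cutoff:
            continue  # created on or before the cutoff
        if skip_existing and run.run_id in existing_ids:
            continue  # already present in the destination
        selected.append(run)
    return selected
```

With `start_date` set to `None` and `skip_existing=False`, every run is selected, which corresponds to a full re-migration.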

### Individual Commands

#### Migrate Runs Only

```bash
uv run python copy_leaderboard.py -c my_config.yaml migrate-runs
```

With options:
```bash
uv run python copy_leaderboard.py -c my_config.yaml migrate-runs --tag leaderboard --limit 10
```

#### Migrate Artifacts Only

```bash
uv run python copy_leaderboard.py -c my_config.yaml migrate-artifacts
```

#### Create Report

```bash
uv run python copy_leaderboard.py -c my_config.yaml create-report --title "My Leaderboard"
```

**Note**: This command creates a basic report with a `WeavePanelSummaryTable` that displays `runs.summary['leaderboard_table']`. The generated report is intentionally simple and serves as a starting point.

If you want to customize the report (e.g., add additional panels, adjust column widths, add filters, or include custom visualizations), please edit the report manually through the W&B UI after creation.

## Architecture

```
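copy_leaderboard/
├── __init__.py           # Package exports
├── pyproject.toml        # Package and dependencies
├── config.yaml           # Default configuration
├── config_example.yaml   # Example configuration for a test environment
├── copy_leaderboard.py   # Main script
└── README.md             # This file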
```

## API Reference

### LeaderboardMigrator

Main class for handling migrations.

```python
from copy_leaderboard import LeaderboardMigrator

migrator = LeaderboardMigrator(
    src_base_url="https://api.wandb.ai",
    src_api_key="your-src-key",
    dst_base_url="https://api.wandb.ai",
    dst_api_key="your-dst-key",
)

# Migrate runs
result = migrator.migrate_runs(
    entity="llm-leaderboard",
    project="nejumi-leaderboard4",
    dst_entity="your-entity",
    dst_project="your-project",
    tag="leaderboard",
)
```

### Config

Configuration loader.

```python
from copy_leaderboard import Config

config = Config("config.yaml")
print(config.source) # Source settings
print(config.destination) # Destination settings
print(config.artifacts) # Artifact paths to migrate
```

## Troubleshooting

### Common Issues

1. **Authentication Error**: Ensure API keys are set correctly
   ```bash
   export WANDB_SRC_API_KEY="..."
   export WANDB_DST_API_KEY="..."
   ```

2. **Permission Denied**: Verify you have access to both source and destination projects

3. **Artifact Not Found**: Check artifact paths in config match existing artifacts

### Debug Mode

Enable verbose logging:

```bash
WANDB_DEBUG=true uv run python copy_leaderboard.py -c config.yaml migrate-runs
```

## License

Same as the parent llm-leaderboard project.
17 changes: 17 additions & 0 deletions scripts/copy_leaderboard/__init__.py
@@ -0,0 +1,17 @@
"""Copy Leaderboard - Tool to copy W&B Nejumi Leaderboard to your own environment."""

from .copy_leaderboard import (
    Config,
    LeaderboardMigrator,
    WandbRun,
    WandbParquetRun,
    main,
)

__all__ = [
    "Config",
    "LeaderboardMigrator",
    "WandbRun",
    "WandbParquetRun",
    "main",
]
52 changes: 52 additions & 0 deletions scripts/copy_leaderboard/config.yaml
@@ -0,0 +1,52 @@
# Copy Leaderboard Configuration
# This file configures the migration of W&B runs, artifacts, and reports

# Source configuration (Nejumi Leaderboard)
source:
  base_url: "https://api.wandb.ai"
  entity: "llm-leaderboard"
  project: "nejumi-leaderboard4"
  # Private project for sensitive data
  private_project: "nejumi-leaderboard4-private"
  # Report URL for reference
  report_url: "https://wandb.ai/llm-leaderboard/nejumi-leaderboard4/reports/Nejumi-LLM-4--VmlldzoxMzc1OTk1MA"
  # Tag to filter runs for migration
  run_tag: "leaderboard"

# ============================================================
# ★★★ REQUIRED: Configure your destination W&B environment ★★★
# ============================================================
destination:
  # Change this to your W&B server URL (e.g., "https://api.wandb.ai" or your self-hosted URL)
  base_url: "https://api.wandb.ai"  # <-- CHANGE THIS
  # Your W&B entity (team or username)
  entity: "your-entity"  # <-- CHANGE THIS
  # Your target project name
  project: "your-project"  # <-- CHANGE THIS

# Artifacts to migrate (from base_config.yaml)
artifacts:
  - "wandb-japan/llm-leaderboard3/jaster:v6"
  - "wandb-japan/llm-leaderboard3/lctg:v0"
  - "wandb-japan/llm-leaderboard3-private/jbbq:v2"
  - "wandb-japan/toxicity-dataset-private/toxicity_dataset_full:v3"
  - "wandb-japan/toxicity-dataset-private/toxicity_judge_prompts:v1"
  - "wandb-japan/llm-leaderboard/mtbench_ja_question:v4"
  - "wandb-japan/llm-leaderboard/mtbench_ja_question_small_for_test:v5"
  - "wandb-japan/llm-leaderboard/mtbench_ja_referenceanswer:v2"
  - "wandb-japan/llm-leaderboard/mtbench_ja_referenceanswer_small_for_test:v2"
  - "wandb-japan/llm-leaderboard/mtbench_ja_prompt:v1"

# ============================================================
# ★★★ OPTIONAL: Adjust migration options as needed ★★★
# ============================================================
options:
  # Number of parallel workers for migration (adjust based on your needs)
  max_workers: 4  # <-- ADJUST IF NEEDED
  # Skip runs that already exist in destination
  skip_existing: true
  # For incremental updates: only migrate runs after this date (ISO format)
  # Example: "2024-01-01T00:00:00" or null to migrate all runs
  start_date: null  # <-- SET FOR INCREMENTAL UPDATES
  # Limit number of runs to migrate (null for no limit, set a number for testing)
  limit: null  # <-- SET A NUMBER FOR TESTING
50 changes: 50 additions & 0 deletions scripts/copy_leaderboard/config_example.yaml
@@ -0,0 +1,50 @@
# Test Configuration for Copy Leaderboard
# Uses the test W&B environment

# Source configuration (Nejumi Leaderboard)
source:
  base_url: "https://api.wandb.ai"
  entity: "llm-leaderboard"
  project: "nejumi-leaderboard4"
  private_project: "nejumi-leaderboard4-private"
  report_url: "https://wandb.ai/llm-leaderboard/nejumi-leaderboard4/reports/Nejumi-LLM-4--VmlldzoxMzc1OTk1MA"
  run_tag: "leaderboard"

# ============================================================
# ★★★ REQUIRED: Configure your destination W&B environment ★★★
# ============================================================
destination:
  # Change this to your W&B server URL (e.g., "https://api.wandb.ai" or your self-hosted URL)
  base_url: "https://fe-crew.wandb.io"  # <-- CHANGE THIS
  # Your W&B entity (team or username)
  entity: "aise"  # <-- CHANGE THIS
  # Your target project name
  project: "japanese-leaderboard-move"  # <-- CHANGE THIS

# Artifacts to migrate (subset for testing)
artifacts:
  - "wandb-japan/llm-leaderboard3/jaster:v6"
  - "wandb-japan/llm-leaderboard3/lctg:v0"
  - "wandb-japan/llm-leaderboard3-private/jbbq:v2"
  - "wandb-japan/toxicity-dataset-private/toxicity_dataset_full:v3"
  - "wandb-japan/toxicity-dataset-private/toxicity_judge_prompts:v1"
  - "wandb-japan/llm-leaderboard/mtbench_ja_question:v4"
  - "wandb-japan/llm-leaderboard/mtbench_ja_question_small_for_test:v5"
  - "wandb-japan/llm-leaderboard/mtbench_ja_referenceanswer:v2"
  - "wandb-japan/llm-leaderboard/mtbench_ja_referenceanswer_small_for_test:v2"
  - "wandb-japan/llm-leaderboard/mtbench_ja_prompt:v1"


# ============================================================
# ★★★ OPTIONAL: Adjust migration options as needed ★★★
# ============================================================
options:
  # Number of parallel workers for migration (adjust based on your needs)
  max_workers: 2  # <-- ADJUST IF NEEDED
  # Skip runs that already exist in destination
  skip_existing: true
  # For incremental updates: only migrate runs after this date (ISO format)
  # Example: "2024-01-01T00:00:00" or null to migrate all runs
  start_date: null  # <-- SET FOR INCREMENTAL UPDATES
  # Limit number of runs to migrate (null for no limit, set a number for testing)
  limit: 5  # <-- SET A NUMBER FOR TESTING