Commit 8d38d65

caoshiyi, DachengLi1, cck0517, Xiuyu-Li, Shangyint committed
SStar Code
Co-authored-by: Dacheng Li <[email protected]>
Co-authored-by: Shiyi Cao <[email protected]>
Co-authored-by: Chengkun Cao <[email protected]>
Co-authored-by: Xiuyu Li <[email protected]>
Co-authored-by: Shangyin Tan <[email protected]>
1 parent 4c2085a commit 8d38d65

161 files changed: +19741 -0 lines changed

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# S*: Test Time Scaling for Code Generation

This folder provides the code for the paper "S*: Test Time Scaling for Code Generation".

![Overview of S* approach](assets/figure1.png)

## Installation (Main packages)

```dspy=2.6.2, torch, vllm```

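As a quick sanity check before running the scripts, the main packages can be looked up in the current environment. The sketch below is a hypothetical helper, not part of this repository. It assumes the packages are installed under the distribution names `dspy`, `torch`, and `vllm`; if DSPy was installed under another distribution name, the lookup will report it as missing.

```python
from importlib import metadata

# Package names and the pinned dspy version come from the list above;
# this helper itself is a hypothetical convenience, not repository code.
MAIN_PACKAGES = {"dspy": "2.6.2", "torch": None, "vllm": None}


def check_main_packages(required=MAIN_PACKAGES):
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # A package is ok if it is installed and matches the pin, when there is one.
        ok = installed is not None and (wanted is None or installed == wanted)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_main_packages().items():
        status = "ok" if ok else "missing or version mismatch"
        print(f"{name}: {installed} ({status})")
```
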
## Usage

The scripts to reproduce the results in the paper are in the `scripts` folder.
- Baselines are in `baselines`, `baselines_selfdebug`, and `majority_baselines`.
- Experiments on the dev set are in `sec[4,5,6]`.
- Experiments on the final test set are in `final_[]`. First run the commands under `final_oracle` to produce all generations without applying the different selection methods, then run the commands under `final_[]_cached` to produce generations with the different selection methods.

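The two-stage workflow for the final test set (generate once, then apply selection methods to the cached generations) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Candidate` record, the two selector functions, and the sample data are hypothetical, and the repository's actual selection methods may work differently. "Oracle" is taken to mean picking a generation that passes the ground-truth tests whenever one exists, and majority voting is used only as a stand-in for a non-oracle method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    """One cached generation (hypothetical record layout)."""
    code: str
    public_outputs: tuple  # outputs on public tests, usable by any selector
    passes_hidden: bool    # ground-truth result, visible only to the oracle


def oracle_select(candidates):
    """Pick a candidate that passes the hidden tests if one exists (an upper bound)."""
    for cand in candidates:
        if cand.passes_hidden:
            return cand
    return candidates[0]


def majority_select(candidates):
    """Pick the first candidate whose public-test outputs are the most common."""
    counts = Counter(c.public_outputs for c in candidates)
    top_outputs, _ = counts.most_common(1)[0]
    return next(c for c in candidates if c.public_outputs == top_outputs)


# Stage 1 (run once and cached): all generations for one problem.
cache = [
    Candidate("sol_a", ("3", "7"), False),
    Candidate("sol_b", ("3", "7"), False),
    Candidate("sol_c", ("3", "8"), True),
]

# Stage 2: apply several selection methods to the same cached generations.
selectors = {"oracle": oracle_select, "majority": majority_select}
chosen = {name: select(cache).code for name, select in selectors.items()}
print(chosen)  # {'oracle': 'sol_c', 'majority': 'sol_a'}
```

Separating the stages this way means the expensive generation step is paid once, and each selection method is compared on identical candidates.
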
Results are available on Google Drive ([Link](https://drive.google.com/drive/u/1/folders/1kmCoJ7Mkvj-umpkfsA5960hYpNrgH4X4)).

The command below is a simple run that produces generations with oracle selection and 3 rounds of generation for gpt-4o-mini.

Set `OPENAI_API_KEY` in your environment with `export OPENAI_API_KEY=xxx`.

```
python evaluate_multiprocess.py \
    --difficulty=easy \
    --temperature=0.7 \
    --num_threads=32 \
    --n=16 \
    --selection oracle_all_rounds \
    --lcb_version release_v2 \
    --num_round 3 \
    --result_json_path="results/final_4omini_n_16_debug_public3_select_oracle_easy.json"
```

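To repeat this run for several settings, the command can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command` and `require_api_key` are hypothetical helpers, the flags and result path mirror the command above, and the `medium` and `hard` difficulty values are assumptions, since only `easy` appears in this README.

```python
import os
import shlex


def build_command(difficulty, n=16, num_round=3, selection="oracle_all_rounds"):
    """Build the argument list for one run of evaluate_multiprocess.py."""
    # The file name pattern copies the example above; it is only a naming convention.
    result_path = (
        f"results/final_4omini_n_{n}_debug_public3_select_oracle_{difficulty}.json"
    )
    return [
        "python", "evaluate_multiprocess.py",
        f"--difficulty={difficulty}",
        "--temperature=0.7",
        "--num_threads=32",
        f"--n={n}",
        "--selection", selection,
        "--lcb_version", "release_v2",
        "--num_round", str(num_round),
        f"--result_json_path={result_path}",
    ]


def require_api_key(env=os.environ):
    """Fail early if OPENAI_API_KEY is not set in the given environment."""
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")


if __name__ == "__main__":
    # "medium" and "hard" are assumed values; only "easy" is shown in the README.
    for difficulty in ("easy", "medium", "hard"):
        print(shlex.join(build_command(difficulty)))
```

The sketch only prints the commands. To launch them, call `require_api_key()` first and pass each list to `subprocess.run(cmd, check=True)`.
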
To run experiments with locally served models, first serve the model with ```vllm serve model_name```.

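Before starting a long run against a local model, it can help to confirm that the server is reachable. The sketch below assumes vLLM's defaults: an OpenAI-compatible server on `localhost:8000` that lists its models at `/v1/models`. The helper names are hypothetical, and how the evaluation scripts are pointed at the local server is not described in this README.

```python
import json
import urllib.error
import urllib.request


def models_url(host="localhost", port=8000):
    """URL of the OpenAI-compatible model listing (vLLM's default port is assumed)."""
    return f"http://{host}:{port}/v1/models"


def parse_model_ids(payload):
    """Extract model ids from a /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(host="localhost", port=8000, timeout=5.0):
    """Return the served model ids, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(models_url(host, port), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    return parse_model_ids(payload)


if __name__ == "__main__":
    models = list_served_models()
    if models is None:
        print("No server reachable at", models_url())
    else:
        print("Served models:", models)
```
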
## Citation
```
@article{li2025sstar,
    title={S*: Test Time Scaling for Code Generation},
    author={Li, Dacheng and Cao, Shiyi and Cao, Chengkun and Li, Xiuyu and Tan, Shangyin and Keutzer, Kurt and Xing, Jiarong and Gonzalez, Joseph E. and Stoica, Ion},
    year={2025}
}
```

skythought/test-time-scaling/codecontest_evaluate_multiprocess.py

Lines changed: 473 additions & 0 deletions
Large diffs are not rendered by default.

skythought/test-time-scaling/evaluate_multiprocess.py

Lines changed: 454 additions & 0 deletions
Large diffs are not rendered by default.
