FRANC

This project contains the source code for the paper "FRANC: A Framework for Improving the Quality of Automatically Generated Code", accepted at the 24th IEEE International Conference on Source Code Analysis and Manipulation (SCAM 2024).

Abstract

In recent years, the use of automated source code generation utilizing transformer-based generative models has grown in popularity. These models can generate code according to the developers’ requirements. However, recent research showed that such automatically generated source code can contain vulnerabilities and other quality issues. Despite researchers’ and practitioners’ attempts to enhance code generation models, retraining and fine-tuning large language models is not only time-consuming but also resource-intensive and costly. Thus, in this paper, we describe FRANC, a lightweight framework for recommending more secure and high-quality source code derived from transformer-based code generation models. FRANC includes a static filter that uses heuristics to make the generated code compilable and a quality-aware ranker that sorts the code snippets by a quality score. Moreover, the framework uses prompt engineering to fix persistent quality issues. We evaluated FRANC with five Python and Java code generation models and six prompt datasets, including a newly created one in this work (SOEval). The static filter improves the compilability of 9% to 46% of Java suggestions and 10% to 43% of Python suggestions. The average improvement in the NDCG@10 score for the ranking system is 0.0763, and the repair techniques fix up to 80% of prompts. On average, FRANC takes 1.98 seconds for Java and 0.08 seconds for Python.
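To make the filter-then-rank idea from the abstract concrete, here is a minimal Python sketch. It is not the code in Static_Filter or Quality_Analyzer: the function names, the "drop trailing lines until it parses" heuristic, and the toy quality score (counting a few flagged calls) are all assumptions made for illustration.

```python
import ast
from typing import List, Optional


def is_compilable(snippet: str) -> bool:
    """Return True if the snippet parses as valid Python source."""
    try:
        ast.parse(snippet)
        return True
    except (SyntaxError, ValueError):
        return False


def trim_to_compilable(snippet: str) -> Optional[str]:
    """Hypothetical heuristic: drop trailing lines until the snippet parses."""
    lines = snippet.splitlines()
    while lines:
        candidate = "\n".join(lines)
        if candidate.strip() and is_compilable(candidate):
            return candidate
        lines.pop()
    return None  # nothing compilable could be recovered


def quality_score(snippet: str) -> float:
    """Placeholder score (an assumption): fewer flagged calls is better."""
    flagged = ("eval(", "exec(", "os.system(")
    issues = sum(snippet.count(pattern) for pattern in flagged)
    return 1.0 / (1.0 + issues)


def filter_and_rank(suggestions: List[str]) -> List[str]:
    """Keep suggestions that can be made compilable, best score first."""
    fixed = [trim_to_compilable(s) for s in suggestions]
    compilable = [s for s in fixed if s is not None]
    return sorted(compilable, key=quality_score, reverse=True)


if __name__ == "__main__":
    suggestions = [
        "def f(x):\n    return eval(x)\n",
        "def g(x):\n    return x + 1\ndef h(",  # truncated generation
        "(((",                                  # unrecoverable
    ]
    for snippet in filter_and_rank(suggestions):
        print(repr(snippet))
```

In this sketch the truncated suggestion is cut back to its last parsable prefix and the unrecoverable one is dropped before ranking. A real quality score would come from static analysis tools rather than substring counts.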

File Structure

Benchmarks: Contains the benchmark datasets used in the paper.
DatasetCollection: Contains the code for collecting the dataset from StackOverflow and creating SOEval.
SuggestionGenerator: Contains the code for the suggestion generator using OpenAI's ChatGPT and Hugging Face's open-source code generation models.
Static_Filter: Contains the code for the static filter in which the generated code is cleaned and made compilable.
Quality_Analyzer: Contains the code for the quality analyzer in which the generated code is ranked based on the quality.
Quality_Analyzer_Before_Static_Filter: Contains the code for the quality analyzer before running the static filter.
Repair_*: Contains the code for the repair techniques for smelly code. The three Benchmarks folders correspond to the three repair scenarios.
Utils: Contains the code for the result analysis, repair prompt creation, and samples for NDCG score calculation.
FRANC_Appendix.pdf: Contains the appendix of the paper.
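Since Utils includes samples for NDCG score calculation, the following sketch shows one common way to compute NDCG@k in Python. It assumes the linear-gain formulation (relevance divided by a log2 position discount); the repository's scripts may use a different variant, such as the exponential gain 2^rel - 1, and the function names here are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items, so the ranking cannot be scored
    return dcg_at_k(relevances, k) / ideal


if __name__ == "__main__":
    # Relevance of each suggestion, listed in the order the ranker returned them.
    print(ndcg_at_k([3, 2, 1]))  # already in ideal order
    print(ndcg_at_k([1, 3, 2]))  # best suggestion is not ranked first
```

A ranking that is already sorted by relevance scores 1.0, and any other order scores lower, so a gain in NDCG@10 means that better suggestions moved closer to the top of the list.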
