[LLM] Refactor LLMWithFeedbackCycle and RequestManager#492
DanielRendox wants to merge 22 commits into development from
Conversation
… tests and remove unused fixtures
Additional changes:
...lin/org/jetbrains/research/testspark/tools/llm/generation/gemini/GeminiRequestManagerTest.kt
vsantele
left a comment
I had two errors when testing this PR, which I've commented on in the code.
I didn't review all the changes, just the part where I had issues.
core/src/main/kotlin/org/jetbrains/research/testspark/core/generation/llm/ChatSessionManager.kt
...src/main/kotlin/org/jetbrains/research/testspark/core/generation/llm/LLMWithFeedbackCycle.kt
...src/main/kotlin/org/jetbrains/research/testspark/core/generation/llm/LLMWithFeedbackCycle.kt
…lectChunks function
Hi, I don't know if it's in the scope of this PR, but the … To check that, you can add a breakpoint here: generate tests on a piece of code that requires at least 2 cycles and check the content of …
@vsantele If you have time, it would be nice if you could fix it :) Please push directly to this branch.
Added a call to `testsAssembler.clear()` to ensure it is empty at the start of each iteration in the feedback cycle. This prevents residual data from previous iterations from affecting later ones.
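A minimal sketch of that fix, using a simplified stand-in for the real `TestsAssembler` (the class and method names below are illustrative, not the actual TestSpark API):

```kotlin
// Simplified stand-in for the real TestsAssembler; illustrative only.
class TestsAssembler {
    private val content = StringBuilder()

    fun consume(chunk: String) {
        content.append(chunk)
    }

    // Reset accumulated content so a new iteration starts from a clean state.
    fun clear() {
        content.setLength(0)
    }

    fun assembled(): String = content.toString()
}

// Hypothetical feedback loop showing where the clear() call goes: at the
// start of each iteration, not at the end (the real cycle has several
// early-termination paths, which makes clearing at the end error-prone).
fun runFeedbackCycle(iterations: List<List<String>>, assembler: TestsAssembler): List<String> {
    val results = mutableListOf<String>()
    for (chunks in iterations) {
        assembler.clear() // the fix: no residual data from the previous iteration
        chunks.forEach(assembler::consume)
        results.add(assembler.assembled())
    }
    return results
}
```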
@DanielRendox I made a fix to clear the content before each iteration. This was much easier and cleaner than trying to find the best spot at the end of the cycle (there are several early terminations). It also fixed a bug that caused the number of tests generated to increase with each iteration without being reset to zero. Now, the counter increases up to the maximum of the first iteration. Then it potentially increases if the second iteration produces more tests, but it will show a "false" number if the second iteration produces fewer tests. I already have an easy solution for resetting the counter between cycles (override the clear method in …).
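The suggested counter fix could look roughly like this sketch, assuming a simplified assembler hierarchy (the class names and the `@Test`-counting heuristic are illustrative, not the real implementation):

```kotlin
// Illustrative base assembler that counts generated tests while consuming
// streamed LLM output.
open class TestsAssembler {
    var testCount: Int = 0
        protected set

    private val raw = StringBuilder()

    fun consume(chunk: String) {
        raw.append(chunk)
        // Naive counting heuristic, for the sketch only.
        testCount += Regex("@Test").findAll(chunk).count()
    }

    open fun clear() {
        raw.setLength(0)
    }
}

// The idea from the comment: override clear() so the counter is reset
// between cycles instead of accumulating across iterations.
class CountResettingAssembler : TestsAssembler() {
    override fun clear() {
        super.clear()
        testCount = 0
    }
}
```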
```kotlin
        continue
    }
    val testSuite = testSuiteResult.data
    generatedTestSuites.add(testSuite)
```
`generatedTestSuites` is not cleared between two cycles. At the end of the process, all generated tests from every iteration are displayed.
In my opinion, we only need to keep the compilable ones, or only those from the latest cycle.
The old logic used a set with all compilable tests.
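The two options discussed here can be sketched as follows; the types and function names are illustrative stand-ins, not TestSpark code:

```kotlin
// Minimal stand-in for a generated test suite.
data class TestSuite(val name: String, val compiles: Boolean)

// Option 1: keep only the suites that actually compiled, across all cycles,
// mirroring the old set-based logic.
fun collectCompilable(cycles: List<List<TestSuite>>): Set<TestSuite> =
    cycles.flatten().filter { it.compiles }.toSet()

// Option 2: keep only the results of the latest cycle by clearing the
// accumulator at the start of each iteration.
fun collectLatest(cycles: List<List<TestSuite>>): List<TestSuite> {
    val generated = mutableListOf<TestSuite>()
    for (suites in cycles) {
        generated.clear() // drop results from the previous cycle
        generated.addAll(suites)
    }
    return generated
}
```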
stephanlukasczyk
left a comment
I have at least a couple of comments; some are nitpicks or questions, feel free to ignore them. I hope I caught the main changes — at least it worked when I tried it out, which gives me some confidence.
.../kotlin/org/jetbrains/research/testspark/core/generation/llm/network/model/LlmCommonModel.kt
.../kotlin/org/jetbrains/research/testspark/core/generation/llm/network/model/LlmCommonModel.kt
core/src/main/kotlin/org/jetbrains/research/testspark/core/test/SupportedLanguage.kt
.../org/jetbrains/research/testspark/core/test/parsers/kotlin/KotlinJUnitTestSuiteParserTest.kt
```kotlin
/*
 * The current attempt does not count as a failure since it was rejected due to the prompt size
 * exceeding the threshold.
 */
if (testSuiteResult.error is LlmError.PromptTooLong) iteration--
```
The new logic keeps the same kind of bug as #483.
If the prompt is too long, the `chatHistory` is not cleared, so the next request will send two user messages, with the original and the "reduced" prompts. The error will be the same.
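The failure mode described here can be illustrated with a toy chat session; everything below (class names, the length-based rejection) is a hypothetical sketch, not the real `ChatSessionManager`:

```kotlin
sealed class LlmError {
    object PromptTooLong : LlmError()
    data class Other(val message: String) : LlmError()
}

// Toy chat session: the point is that the rejected user message must be
// dropped from the history before retrying with a reduced prompt. Otherwise
// the next request carries both the original and the reduced prompt, and the
// total size exceeds the limit again.
class ChatSession(private val maxPromptLength: Int) {
    val history = mutableListOf<String>()

    // Returns an error if the accumulated history exceeds the limit.
    fun send(prompt: String): LlmError? {
        history.add(prompt)
        return if (history.sumOf { it.length } > maxPromptLength) LlmError.PromptTooLong else null
    }

    // Remove the message that was rejected so it is not re-sent.
    fun dropLastMessage() {
        history.removeAt(history.size - 1)
    }
}
```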
Description of changes made

- `Result`: since we agreed to use `TestSparkError` as the parent class for all errors in the plugin, having a generic parameter in the `Result` class for the error field is redundant. Passing it in all places of the project where `Result` is used just adds more unnecessary boilerplate.
- Migrated from `HttpRequests` (from the IntelliJ Platform SDK) to the Ktor client, which helped to introduce coroutines to the Request to LLM module, return streamed API data in the form of a Kotlin flow, enable detailed logging of each HTTP request, and generally make the API logic a lot more concise and readable.
- Fixed "`HuggingFaceRequestManager` has Llama-specific prompt instruction" (#289) and "Add language-agnostic parsing of the backtick block with code snippet for `HuggingFaceRequestManager`" (#290) by redirecting to the specific Hugging Face model implementation (for now, to Llama only).
- Extracted `ChatSessionManager`, making the `RequestManager` easily testable and mockable.
- `runBlockingWithIndicatorLifecycle` now monitors indicator cancellation and stops the coroutine immediately in case of detected cancellation.
- Refactored `LLMWithFeedbackCycle`, making the code a lot more readable.
- Added the `kotlin` folder to `.gitignore`.
- Updated `JUnitTestSuiteParserStrategy` to remove all triple quotes in the LLM response. This handles the edge case when the LLM response contains more than two triple quotes, which is, for some reason, the case for Llama.

What is missing?
Even though after some manual testing everything seems to work, I think it is very important to cover `LLMWithFeedbackCycle` with unit tests now, because I may have broken something.
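As a starting point for such tests, the iteration behavior can be pinned down against a drastically simplified cycle with its collaborators injected as lambdas. This is a hypothetical sketch; the real `LLMWithFeedbackCycle` has many more collaborators and states:

```kotlin
// Hypothetical, simplified feedback cycle: request tests, compile them, and
// feed compilation failures back to the next request until everything
// compiles or the iteration budget runs out.
class FeedbackCycle(
    private val requestTests: (feedback: String?) -> List<String>,
    private val compile: (String) -> Boolean,
    private val maxIterations: Int,
) {
    fun run(): List<String> {
        var feedback: String? = null
        repeat(maxIterations) {
            val tests = requestTests(feedback)
            val failing = tests.filterNot(compile)
            if (failing.isEmpty()) return tests // success: all tests compile
            feedback = "These tests do not compile: $failing"
        }
        return emptyList() // budget exhausted without a fully compiling suite
    }
}
```

With the dependencies as lambdas, a unit test can assert both the final result and how many LLM requests were made, without touching any network or compiler.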