Commit f6337f0

Merge pull request #1 from KasarLabs/core/migration
core: Cairo agent migration from starknet-agent repo
2 parents 0f8a8e8 + 5d9ae70 commit f6337f0

124 files changed, +14236 -2946 lines changed

.cursor/rules/coding_standards.mdc

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
---
description: Coding Standards
globs: *.ts,*.tsx,*.js,*.jsx
---
# Coding Standards for Starknet Agent

## Naming Conventions
- Variables and functions: Use `camelCase` (e.g., `fetchData`, `generateEmbeddings`).
- Classes and components: Use `PascalCase` (e.g., `RagAgent`, `ChatInterface`).
- Constants: Use `UPPER_CASE` with underscores (e.g., `DEFAULT_CHAT_MODEL`).
- Type interfaces: Use `PascalCase` with `I` prefix (e.g., `IAgentConfig`).
- Ingester classes: Use `PascalCase` with `Ingester` suffix (e.g., `CairoBookIngester`).
- Pipeline components: Use descriptive names ending with their role (e.g., `QueryProcessor`, `DocumentRetriever`).

## Indentation and Formatting
- Use 2 spaces for indentation (no tabs).
- Keep lines under 100 characters where possible.
- Place opening braces on the same line as the statement (e.g., `if (condition) {`).
- Use Prettier for consistent formatting across the codebase.
- Run `pnpm format:write` before committing changes.

## Imports and Structure
- Group external imports first, followed by internal modules.
- Use barrel exports (index.ts files) to simplify imports.
- Prefer destructured imports when importing multiple items from a single module.
- Order imports alphabetically within their groups.
- Use relative paths for imports within the same package, absolute paths for cross-package imports.

## Comments
- Add JSDoc comments for functions and classes, especially in the agent pipeline and ingester components.
- Use `//` for single-line comments and `/* ... */` for multi-line comments.
- Document ingester classes with clear descriptions of the source and processing approach.
- Include explanations for complex algorithms or non-obvious design decisions.
- For the RAG pipeline components, document the input/output expectations clearly.

## TypeScript Usage
- Use explicit typing for function parameters and return values.
- Prefer interfaces over types for object definitions.
- Use generics where appropriate, especially in the pipeline components and ingester classes.
  - Example: `function processQuery<T extends BaseQuery>(query: T): Promise<QueryResult>`
- Use abstract classes for base implementations (e.g., `BaseIngester`).
- Leverage type guards for safe type narrowing.
- Use discriminated unions for state management, especially in the UI components.
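
As an illustration of the guidelines above, the following sketch combines a generic function, a type guard, and a discriminated union. Only the `processQuery` signature comes from this file; the shapes of `BaseQuery` and `QueryResult`, the `ChatState` union, and the helper names are assumptions made for the example, and the function body is a placeholder.

```typescript
// Hypothetical shapes for illustration; the real project types may differ.
interface BaseQuery {
  text: string;
}

interface QueryResult {
  answer: string;
}

// Generic function with explicit parameter and return types (placeholder body).
async function processQuery<T extends BaseQuery>(query: T): Promise<QueryResult> {
  return { answer: `Processed: ${query.text}` };
}

// Type guard: narrows a value of unknown origin (e.g. parsed JSON) to BaseQuery.
function isBaseQuery(value: unknown): value is BaseQuery {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { text?: unknown }).text === 'string'
  );
}

// Discriminated union: the `status` tag tells the compiler which fields exist.
type ChatState =
  | { status: 'idle' }
  | { status: 'streaming'; partial: string }
  | { status: 'error'; message: string };

function describeState(state: ChatState): string {
  switch (state.status) {
    case 'idle':
      return 'Waiting for input';
    case 'streaming':
      return `Receiving: ${state.partial}`;
    case 'error':
      return `Failed: ${state.message}`;
  }
}
```

The union is declared with `type` because interfaces cannot express unions; the "prefer interfaces" rule still applies to the plain object shapes.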

## Error Handling
- Wrap async operations in `try/catch` blocks.
- Log errors with context using the logger utility (e.g., `logger.error('Failed to retrieve documents:', error)`).
- Use custom error classes for specific error types in the agent pipeline and ingestion process.
- Implement proper cleanup in error handlers, especially for file operations in ingesters.
- Ensure errors are propagated appropriately and handled at the right level of abstraction.
- Use async/await with proper error handling rather than promise chains where possible.
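
A minimal sketch of how these rules might fit together is shown here. The `IngestionError` class, the `ingestSource` function, and its callback parameters are hypothetical, and the `logger` object is a stand-in for the project's logger utility rather than its actual implementation.

```typescript
// Hypothetical custom error for the ingestion process.
class IngestionError extends Error {
  constructor(
    message: string,
    public readonly source: string,
    public readonly original?: unknown,
  ) {
    super(message);
    this.name = 'IngestionError';
    // Keeps `instanceof` checks working if the code is compiled to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Stand-in for the project's logger utility (assumption).
const logger = {
  error: (message: string, error: unknown): void => console.error(message, error),
};

// Wraps an async step in try/catch, logs with context, and always cleans up.
async function ingestSource(
  source: string,
  download: () => Promise<string>,
  cleanup: () => Promise<void>,
): Promise<string> {
  try {
    return await download();
  } catch (error) {
    logger.error(`Failed to ingest ${source}:`, error);
    // Re-throw a domain-specific error so callers can handle it at their level.
    throw new IngestionError(`Ingestion failed for ${source}`, source, error);
  } finally {
    // Runs on both success and failure, so temporary files are not left behind.
    await cleanup();
  }
}
```

Passing `download` and `cleanup` as callbacks keeps the sketch free of file-system calls; a real ingester would remove its temporary directory in the `finally` block.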

## Testing
- Write unit tests for utility functions, pipeline components, and ingester classes.
- Use Jest as the testing framework.
- Mock external dependencies (LLMs, vector stores, etc.) using jest-mock-extended.
- Aim for high test coverage in core agent functionality and ingestion processes.
- Test each ingester implementation separately.
- Use descriptive test names that explain the behavior being tested.
- Follow the AAA pattern (Arrange, Act, Assert) for test structure.

## Code Organization
- Keep files focused on a single responsibility.
- Group related functionality in directories.
- Separate business logic from UI components.
- Organize ingesters by source type in dedicated directories.
- Follow the template method pattern for ingester implementations.
- Use the factory pattern for creating appropriate instances based on configuration.
- Implement dependency injection for easier testing and component replacement.

.cursor/rules/common_patterns.mdc

Lines changed: 95 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,95 @@
---
description: Common Patterns
globs: *.ts,*.tsx,*.js,*.jsx
---
# Common Patterns in Starknet Agent

## RAG Pipeline Architecture
- Core pattern for information retrieval and response generation.
- Steps in the RAG pipeline:
  1. **Query Processor**: `packages/agents/src/pipeline/queryProcessor.ts`
     - Analyzes user queries and chat history
     - Reformulates queries to optimize document retrieval
  2. **Document Retriever**: `packages/agents/src/pipeline/documentRetriever.ts`
     - Converts queries to vector embeddings
     - Searches vector database using cosine similarity
     - Returns relevant document chunks with metadata
  3. **Answer Generator**: `packages/agents/src/pipeline/answerGenerator.ts`
     - Uses LLMs to generate comprehensive responses
     - Includes source citations in the response
     - Handles different conversation contexts
  4. **RAG Pipeline**: `packages/agents/src/pipeline/ragPipeline.ts`
     - Orchestrates the entire process flow
     - Manages error handling and logging
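
The four steps above can be sketched as three stage contracts plus an orchestrator. The interface shapes, method names, and the `RetrievedChunk` type are assumptions made for illustration; the actual signatures in `packages/agents/src/pipeline/` may differ.

```typescript
// Hypothetical stage contracts; not the project's actual interfaces.
interface RetrievedChunk {
  content: string;
  source: string;
}

interface QueryProcessor {
  process(query: string, history: string[]): Promise<string>;
}

interface DocumentRetriever {
  retrieve(processedQuery: string): Promise<RetrievedChunk[]>;
}

interface AnswerGenerator {
  generate(query: string, chunks: RetrievedChunk[]): Promise<string>;
}

// Orchestrator: runs the stages in order and handles errors in one place.
class RagPipeline {
  constructor(
    private readonly queryProcessor: QueryProcessor,
    private readonly documentRetriever: DocumentRetriever,
    private readonly answerGenerator: AnswerGenerator,
  ) {}

  async run(query: string, history: string[] = []): Promise<string> {
    try {
      const processed = await this.queryProcessor.process(query, history);
      const chunks = await this.documentRetriever.retrieve(processed);
      return await this.answerGenerator.generate(query, chunks);
    } catch (error) {
      console.error('RAG pipeline failed:', error);
      throw error;
    }
  }
}

// Usage with stub stages standing in for the LLM and the vector store.
const pipeline = new RagPipeline(
  { process: async (q) => q.trim().toLowerCase() },
  { retrieve: async (q) => [{ content: `Docs about ${q}`, source: 'cairo-book' }] },
  { generate: async (q, chunks) => `${q} -> ${chunks.map((c) => c.source).join(', ')}` },
);
```

Because the stages are passed in through the constructor, the same orchestrator can run against real services or against stubs like the ones in the usage example.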

## Factory Pattern
- Used for creating RAG agents with different configurations.
- Example: `packages/agents/src/ragAgentFactory.ts`
  - Creates different agent instances based on focus mode.
  - Configures appropriate vector stores and prompt templates.
- Also used in the ingester package: `packages/ingester/src/IngesterFactory.ts`
  - Creates appropriate ingester instances based on documentation source.
  - Enables easy addition of new document sources.
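
A minimal sketch of a factory keyed by focus mode follows. The mode identifiers, the `AgentConfig` fields, and every value in `AGENT_CONFIGS` are illustrative assumptions, not the contents of `ragAgentFactory.ts`.

```typescript
// Hypothetical focus modes and settings; all values are illustrative only.
type FocusMode = 'starknetEcosystem' | 'cairoBook';

interface AgentConfig {
  name: string;
  vectorCollection: string;
  systemPrompt: string;
  maxSourceCount: number;
}

const AGENT_CONFIGS: Record<FocusMode, AgentConfig> = {
  starknetEcosystem: {
    name: 'Starknet Ecosystem',
    vectorCollection: 'starknet_ecosystem',
    systemPrompt: 'Answer questions about the Starknet ecosystem.',
    maxSourceCount: 10,
  },
  cairoBook: {
    name: 'Cairo Book',
    vectorCollection: 'cairo_book',
    systemPrompt: 'Answer questions using the Cairo Book.',
    maxSourceCount: 5,
  },
};

class RagAgent {
  constructor(public readonly config: AgentConfig) {}
}

class RagAgentFactory {
  // Callers ask for a mode; the factory decides how the agent is configured.
  static createAgent(mode: FocusMode): RagAgent {
    const config = AGENT_CONFIGS[mode];
    if (!config) {
      // Guards against untyped input, e.g. a mode string from a request body.
      throw new Error(`Unknown focus mode: ${mode}`);
    }
    return new RagAgent(config);
  }
}
```

Supporting a new document source in this sketch means adding one entry to the configuration map, which is the property the bullets above describe as easy addition of new sources.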

## Template Method Pattern
- Used in the ingester package for standardizing the ingestion process.
- Example: `packages/ingester/src/BaseIngester.ts`
  - Defines the skeleton of the ingestion algorithm in a method.
  - Defers some steps to subclasses (download, extract, process).
  - Ensures consistent process flow while allowing customization.
- Common workflow: Download → Extract → Process → Generate Embeddings → Store
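
The workflow above could be expressed as a template method along these lines. The method names, the `Chunk` type, the placeholder embedding, and the `InMemoryIngester` subclass are assumptions for illustration; the real `BaseIngester` may look different.

```typescript
// Hypothetical chunk type for illustration.
interface Chunk {
  content: string;
}

abstract class BaseIngester {
  // Template method: fixes the order Download → Extract → Process → Embed → Store.
  async ingest(): Promise<number> {
    const archive = await this.download();
    const pages = await this.extract(archive);
    const chunks = await this.process(pages);
    const embeddings = await this.generateEmbeddings(chunks);
    return this.store(chunks, embeddings);
  }

  // Source-specific steps deferred to subclasses.
  protected abstract download(): Promise<string>;
  protected abstract extract(archive: string): Promise<string[]>;
  protected abstract process(pages: string[]): Promise<Chunk[]>;

  // Shared steps; placeholder bodies stand in for the embedding model and the database.
  protected async generateEmbeddings(chunks: Chunk[]): Promise<number[][]> {
    return chunks.map((chunk) => [chunk.content.length]);
  }

  protected async store(chunks: Chunk[], embeddings: number[][]): Promise<number> {
    // Returns how many chunks would have been written.
    return Math.min(chunks.length, embeddings.length);
  }
}

// Hypothetical subclass that "downloads" a hard-coded document.
class InMemoryIngester extends BaseIngester {
  protected async download(): Promise<string> {
    return '# Intro\n# Felts';
  }

  protected async extract(archive: string): Promise<string[]> {
    return archive.split('\n');
  }

  protected async process(pages: string[]): Promise<Chunk[]> {
    return pages.map((content) => ({ content }));
  }
}
```

Subclasses override only the three abstract steps, so every ingester runs the same sequence regardless of its source.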

## WebSocket Streaming Architecture
- Used for real-time streaming of agent responses.
- Example: `packages/backend/src/websocket/`
- Components:
  - `connectionManager.ts`: Manages WebSocket connections and sessions
  - `messageHandler.ts`: Processes incoming messages and routes to appropriate handlers
- Flow: Connection → Authentication → Message Handling → Response Streaming
- Enables real-time, chunk-by-chunk delivery of LLM responses
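
To make the chunk-by-chunk delivery concrete, here is a sketch of the response-streaming step only. The `StreamMessage` shape and the function names are hypothetical, `fakeLlmStream` stands in for a real LLM token stream, and the `send` callback takes the place of a WebSocket's send method so the example needs no dependencies.

```typescript
// Hypothetical wire format for streamed responses.
type StreamMessage =
  | { type: 'chunk'; data: string }
  | { type: 'end' }
  | { type: 'error'; data: string };

// Stand-in for an LLM token stream: yields the text in fixed-size pieces.
async function* fakeLlmStream(text: string, size: number): AsyncGenerator<string> {
  for (let i = 0; i < text.length; i += size) {
    yield text.slice(i, i + size);
  }
}

// Forwards each chunk as soon as it arrives, then signals completion.
async function streamResponse(
  chunks: AsyncIterable<string>,
  send: (payload: string) => void,
): Promise<void> {
  try {
    for await (const data of chunks) {
      const message: StreamMessage = { type: 'chunk', data };
      send(JSON.stringify(message));
    }
    const end: StreamMessage = { type: 'end' };
    send(JSON.stringify(end));
  } catch (error) {
    // Report the failure to the client instead of leaving the stream hanging.
    const failure: StreamMessage = {
      type: 'error',
      data: error instanceof Error ? error.message : String(error),
    };
    send(JSON.stringify(failure));
  }
}
```

In a server, `send` would typically be bound to the client's socket, and connection handling and authentication would happen before this function is called.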

## Repository Pattern
- Used for database interactions.
- Example: `packages/agents/src/db/vectorStore.ts`
  - Abstracts MongoDB vector search operations
  - Provides methods for similarity search and filtering
  - Handles connection pooling and error handling
- Used in ingester for vector store operations: `packages/ingester/src/utils/vectorStoreUtils.ts`
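
The sketch below shows the shape of a repository abstraction over vector search. It is not the MongoDB-backed implementation: `VectorRepository`, `StoredDocument`, and the in-memory class are hypothetical, and the cosine-similarity ranking is computed locally as a stand-in for what the database would do.

```typescript
// Hypothetical document shape and repository contract.
interface StoredDocument {
  id: string;
  content: string;
  embedding: number[];
}

interface VectorRepository {
  upsert(doc: StoredDocument): Promise<void>;
  similaritySearch(query: number[], k: number): Promise<StoredDocument[]>;
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// In-memory stand-in; a production class would delegate to the database instead.
class InMemoryVectorRepository implements VectorRepository {
  private readonly docs = new Map<string, StoredDocument>();

  async upsert(doc: StoredDocument): Promise<void> {
    this.docs.set(doc.id, doc);
  }

  async similaritySearch(query: number[], k: number): Promise<StoredDocument[]> {
    return Array.from(this.docs.values())
      .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map(({ doc }) => doc);
  }
}
```

Callers depend only on `VectorRepository`, so pipeline code can be tested against the in-memory class and run in production against a database-backed one.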

## Configuration Management
- Centralized configuration using TOML files.
- Example: `packages/agents/src/config.ts` and `packages/agents/sample.config.toml`
  - Loads configuration from files and environment variables.
  - Provides typed access to configuration values.
  - Supports multiple LLM providers (OpenAI, Anthropic, etc.)
  - Configures multiple vector databases for different focus modes
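
One way to combine file values with environment variables behind a typed interface is sketched here. The `AppConfig` fields, the default values, the `PORT` and `MONGODB_URI` variable names, and the rule that environment variables override file values are all assumptions; TOML parsing is left out, so `fileValues` represents an already-parsed file.

```typescript
// Hypothetical configuration shape; field names are illustrative.
interface AppConfig {
  port: number;
  defaultChatModel: string;
  mongoUri: string;
}

// Placeholder defaults, not the project's real values.
const DEFAULT_CONFIG: AppConfig = {
  port: 3001,
  defaultChatModel: 'example-chat-model',
  mongoUri: 'mongodb://localhost:27017',
};

// Merge order (an assumption): defaults < file values < environment variables.
function loadConfig(
  fileValues: Partial<AppConfig>,
  env: Record<string, string | undefined>,
): AppConfig {
  const merged: AppConfig = { ...DEFAULT_CONFIG, ...fileValues };

  const rawPort = env['PORT'];
  if (rawPort !== undefined) {
    const port = Number(rawPort);
    if (!Number.isInteger(port) || port <= 0) {
      throw new Error(`Invalid PORT: ${rawPort}`);
    }
    merged.port = port;
  }

  const mongoUri = env['MONGODB_URI'];
  if (mongoUri) {
    merged.mongoUri = mongoUri;
  }

  return merged;
}
```

In an application the second argument would usually be `process.env`; passing it explicitly keeps the function easy to test.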

## Dependency Injection
- Used for providing services to components.
- Example: `packages/agents/src/ragAgentFactory.ts`
  - Injects vector stores, LLM providers, and config settings into pipeline components
  - Makes testing easier by allowing mock implementations
  - Enables flexible configuration of different agent types

## Focus Mode Implementation
- Pattern for targeting specific document sources.
- Example: `packages/agents/src/config/agentConfigs.ts`
  - Defines different focus modes (Starknet Ecosystem, Cairo Book, etc.)
  - Configures different vector stores for each mode
  - Customizes prompts and retrieval parameters per mode
  - Enables specialized knowledge domains

## React Hooks for State Management
- Custom hooks for managing UI state and WebSocket communication.
- Example: `packages/ui/lib/hooks/`
  - Encapsulates WebSocket connection logic.
  - Manages chat history and UI state.
  - Handles real-time streaming of responses.

## Error Handling and Logging
- Centralized error handling with detailed logging.
- Example: `packages/agents/src/utils/logger.ts`
  - Configurable log levels based on environment
  - Context-rich error messages with timestamps and stack traces
  - Proper error propagation through the pipeline
- Used throughout the codebase for consistent error reporting.
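
A small logger with a configurable minimum level, timestamps, and stack traces might look like the following. This is a dependency-free sketch, not the contents of `logger.ts`; the `Logger` interface, `createLogger`, and the line format are assumptions.

```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

const LEVEL_ORDER: Record<LogLevel, number> = { debug: 0, info: 1, warn: 2, error: 3 };

// Hypothetical logger contract.
interface Logger {
  debug(message: string): void;
  info(message: string): void;
  warn(message: string): void;
  error(message: string, error?: unknown): void;
}

// `sink` receives each formatted line; it defaults to the console.
function createLogger(minLevel: LogLevel, sink: (line: string) => void = console.log): Logger {
  const log = (level: LogLevel, message: string, error?: unknown): void => {
    // Drop messages below the configured level.
    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) {
      return;
    }
    // Append the stack trace (or the message, if no stack is available).
    const details = error instanceof Error ? ` ${error.stack ?? error.message}` : '';
    sink(`${new Date().toISOString()} [${level.toUpperCase()}] ${message}${details}`);
  };

  return {
    debug: (message: string): void => log('debug', message),
    info: (message: string): void => log('info', message),
    warn: (message: string): void => log('warn', message),
    error: (message: string, error?: unknown): void => log('error', message, error),
  };
}
```

The minimum level could be chosen per environment, for example `debug` during development and `warn` in production.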

.cursor/rules/documentation.mdc

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
---
description: Documentation
globs:
---
# Documentation for Starknet Agent

## External Resources
- Starknet Documentation: [https://docs.starknet.io](https://docs.starknet.io)
  - Referenced in the agent's knowledge base.
- Cairo Book: [https://book.cairo-lang.org](https://book.cairo-lang.org)
  - Core resource for Cairo language information.
- MongoDB Atlas Vector Search: [https://www.mongodb.com/docs/atlas/vector-search/](https://www.mongodb.com/docs/atlas/vector-search/)
  - Used for vector database implementation.
- Anthropic Claude API: [https://docs.anthropic.com/claude/reference/getting-started-with-the-api](https://docs.anthropic.com/claude/reference/getting-started-with-the-api)
  - Used for LLM integration.

## Internal Documentation
- Architecture Overview: `docs/architecture/README.md`
  - Explains the RAG pipeline architecture.
- API Integration Guide: `API_INTEGRATION.md`
  - Details how to integrate with the agent's API.
- Contributing Guidelines: `CONTRIBUTING.md`
  - Instructions for contributing to the project.

## Code Documentation
- JSDoc comments are used throughout the codebase, especially in:
  - `packages/agents/src/pipeline/`: Documents the RAG pipeline components.
  - `packages/agents/src/core/`: Documents core agent functionality.
  - `packages/backend/src/websocket/`: Documents WebSocket communication.

## Configuration Documentation
- Sample configuration: `packages/agents/sample.config.toml`
  - Documents available configuration options.
- Environment variables: `.env.example` files
  - Documents required environment variables.

## Database Schema
- MongoDB collections structure is documented in:
  - `packages/agents/src/db/`: Database interaction code.
  - Vector embeddings format and schema.

## Deployment Documentation
- Docker deployment: `docker-compose.yaml` and related Dockerfiles
  - Instructions for containerized deployment.
- Production hosting: `docker-compose.prod-hosted.yml`
  - Configuration for production environments.

.cursor/rules/imports.mdc

Lines changed: 100 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,100 @@
---
description: Cairo Imports
globs: *.ts,*.tsx,*.js,*.jsx
---
# Imports in Cairo Coder

## External Libraries

### Backend and Agent Libraries
- `express`: Web server framework.
  - Used in: `packages/backend/src/app.ts`
  - Import: `import express from 'express';`
- `cors`: CORS middleware for Express.
  - Used in: `packages/backend/src/app.ts`
  - Import: `import cors from 'cors';`
- `mongodb`: MongoDB client for database operations.
  - Used in: `packages/agents/src/db/`
  - Import: `import { MongoClient } from 'mongodb';`
- `anthropic`: Anthropic Claude API client.
  - Used in: `packages/agents/src/lib/`
  - Import: `import Anthropic from '@anthropic-ai/sdk';`
- `openai`: OpenAI API client.
  - Used in: `packages/agents/src/lib/`
  - Import: `import OpenAI from 'openai';`
- `@google/generative-ai`: Google AI API client.
  - Used in: `packages/agents/src/lib/`
  - Import: `import { GoogleGenerativeAI } from '@google/generative-ai';`

### Frontend Libraries
- `react`: UI library.
  - Used in: `packages/ui/components/`
  - Import: `import React from 'react';`
- `next`: React framework.
  - Used in: `packages/ui/app/`
  - Import: `import { useRouter } from 'next/router';`
- `tailwindcss`: CSS framework.
  - Used in: `packages/ui/components/`
  - Applied via class names.

## Internal Modules

### Agent Modules
- `pipeline`: RAG pipeline components.
  - Used in: `packages/agents/src/core/ragAgentFactory.ts`
  - Import: `import { QueryProcessor, DocumentRetriever, CodeGenerator } from './pipeline';`
- `config`: Configuration management.
  - Used in: `packages/agents/src/`
  - Import: `import { config } from './config';`
- `db`: Database interaction.
  - Used in: `packages/agents/src/core/`
  - Import: `import { VectorStore } from './db/vectorStore';`
- `models`: LLM and embedding models interfaces.
  - Used in: `packages/agents/src/core/`
  - Import: `import { LLMProviderFactory } from './models/llmProviderFactory';`
  - Import: `import { EmbeddingProviderFactory } from './models/embeddingProviderFactory';`

### Backend Modules
- `routes`: API routes.
  - Used in: `packages/backend/src/app.ts`
  - Import: `import { generateRoutes } from './routes/generate';`
  - Import: `import { modelsRoutes } from './routes/models';`
- `handlers`: Request handlers.
  - Used in: `packages/backend/src/routes/`
  - Import: `import { generateHandler } from '../handlers/generateHandler';`

### Ingester Modules
- `baseIngester`: Abstract base class for all ingesters.
  - Used in: `packages/ingester/src/ingesters/`
  - Import: `import { BaseIngester } from '../BaseIngester';`
- `ingesterFactory`: Factory for creating ingesters.
  - Used in: `packages/ingester/src/scripts/`
  - Import: `import { IngesterFactory } from '../IngesterFactory';`
- `utils`: Utility functions.
  - Used in: `packages/ingester/src/`
  - Import: `import { downloadFile, extractArchive } from './utils/fileUtils';`
  - Import: `import { processContent, splitMarkdown } from './utils/contentUtils';`

## Common Import Patterns

### For Backend API Routes
```typescript
import express from 'express';
import { generateHandler } from '../handlers/generateHandler';
import { config } from '../config';
```

### For Agent Core
```typescript
import { VectorStore } from './db/vectorStore';
import { LLMProviderFactory } from './models/llmProviderFactory';
import { EmbeddingProviderFactory } from './models/embeddingProviderFactory';
```

### For Ingesters
```typescript
import { BaseIngester } from '../BaseIngester';
import { BookPageDto, ParsedSection, BookChunk } from '../types';
import { Document } from 'langchain/document';
import { VectorStore } from '../../agents/src/db/vectorStore';
```
