Description
📋 Summary
Extend the CodeGraph pipeline beyond Python to support multiple programming languages including JavaScript/TypeScript, Java, C#, Go, Rust, and others for comprehensive code repository analysis.
🔍 Background
Currently, the CodeGraph pipeline processes only Python files (.py) and relies on Python-specific parsing logic. Modern repositories often contain several languages, which should be analyzed together for a complete picture of the codebase.
🎯 Acceptance Criteria
- Language Support
  - JavaScript/TypeScript: .js, .ts, .jsx, .tsx files
  - Java: .java files with package imports
  - C#: .cs files with using statements
  - Go: .go files with import statements
  - Rust: .rs files with use statements
  - C/C++: .c, .cpp, .h, .hpp files
- File Discovery Enhancement
  - Update get_source_code_files() to accept language_config parameter
  - Add language-specific file extension mapping
  - Support multi-language repository scanning
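The file discovery items above could be sketched roughly as follows. This is only an illustration: `LANGUAGE_EXTENSIONS` and `discover_source_files` are hypothetical names, the language keys are made up for the example, and the real `get_source_code_files()` signature is not shown in this issue, so the sketch does not try to reproduce it.

```python
import os
from typing import Dict, List, Optional, Tuple

# Hypothetical extension map; the extensions come from the list in this issue,
# the language keys are assumptions made for illustration.
LANGUAGE_EXTENSIONS: Dict[str, Tuple[str, ...]] = {
    "python": (".py",),
    "javascript": (".js", ".jsx"),
    "typescript": (".ts", ".tsx"),
    "java": (".java",),
    "csharp": (".cs",),
    "go": (".go",),
    "rust": (".rs",),
    "c_cpp": (".c", ".cpp", ".h", ".hpp"),
}


def discover_source_files(
    repo_path: str,
    language_config: Optional[Dict[str, Tuple[str, ...]]] = None,
) -> Dict[str, List[str]]:
    """Walk repo_path and group matching files by language."""
    config = language_config or LANGUAGE_EXTENSIONS
    # Invert the map once so each file needs a single dictionary lookup.
    ext_to_lang = {ext: lang for lang, exts in config.items() for ext in exts}
    found: Dict[str, List[str]] = {lang: [] for lang in config}
    for root, _dirs, files in os.walk(repo_path):
        for name in sorted(files):
            ext = os.path.splitext(name)[1].lower()
            lang = ext_to_lang.get(ext)
            if lang is not None:
                found[lang].append(os.path.join(root, name))
    return found
```

Passing a smaller mapping, for example `{"python": (".py",)}`, restricts the scan to those languages, which would keep today's Python-only behavior available as a special case.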
- Language-Specific Parsing
  - Extend tree-sitter integration for each language
  - Create language-specific dependency extractors
  - Handle different import/module systems:
    - Python: import, from...import
    - JavaScript: import, require, export
    - Java: import, package
    - Go: import, package
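To make the differences between import systems concrete, here is a minimal sketch of per-language dependency extraction. It deliberately uses regular expressions as a stand-in for the tree-sitter queries this issue asks for, so it misses cases a real parser would handle (for example `import a, b` in Python or dynamic imports in JavaScript), and it ignores `package` declarations. The name `extract_dependencies` and the language keys are hypothetical.

```python
import re
from typing import Dict, List, Pattern

# Regex stand-ins for real tree-sitter queries (an assumption of this sketch).
_PATTERNS: Dict[str, List[Pattern[str]]] = {
    "python": [
        re.compile(r"^[ \t]*import[ \t]+([\w.]+)", re.MULTILINE),
        re.compile(r"^[ \t]*from[ \t]+([\w.]+)[ \t]+import\b", re.MULTILINE),
    ],
    "javascript": [
        # import x from 'y', import {a} from "y", and bare import 'y'
        re.compile(
            r"""^[ \t]*import\s+(?:[^'";]*?\bfrom\s+)?['"]([^'"]+)['"]""",
            re.MULTILINE,
        ),
        # export ... from 'y'
        re.compile(
            r"""^[ \t]*export\s+[^'";]*?\bfrom\s+['"]([^'"]+)['"]""",
            re.MULTILINE,
        ),
        # CommonJS require('y')
        re.compile(r"""require\(\s*['"]([^'"]+)['"]\s*\)"""),
    ],
    "java": [
        re.compile(
            r"^[ \t]*import[ \t]+(?:static[ \t]+)?([\w.]+(?:\.\*)?)\s*;",
            re.MULTILINE,
        ),
    ],
}


def _go_imports(source: str) -> List[str]:
    """Collect single-line imports first, then grouped import ( ... ) blocks."""
    deps = re.findall(
        r'^[ \t]*import[ \t]+(?:[\w.]+[ \t]+)?"([^"]+)"', source, re.MULTILINE
    )
    blocks = re.findall(
        r"^[ \t]*import[ \t]*\((.*?)\)", source, re.MULTILINE | re.DOTALL
    )
    for block in blocks:
        deps.extend(re.findall(r'"([^"]+)"', block))
    return deps


def extract_dependencies(language: str, source: str) -> List[str]:
    """Return imported module names, grouped by pattern rather than source order."""
    if language == "go":
        return _go_imports(source)
    deps: List[str] = []
    for pattern in _PATTERNS.get(language, []):
        deps.extend(pattern.findall(source))
    return deps
```

Keeping one extractor per language behind a single entry point means the rest of the pipeline would not need to know which import syntax a file uses. An unknown language simply yields an empty list.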
- Configuration
  - Add supported_languages parameter to run_code_graph_pipeline()
  - Detect language from file extensions
  - Support per-language exclusion patterns
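One possible shape for the configuration items above, as a sketch only. `LanguageConfig`, `LANGUAGE_CONFIGS`, `detect_language`, and `select_files` are hypothetical names, and the exclusion patterns are example values rather than agreed defaults. The `supported_languages` argument mirrors the parameter proposed for `run_code_graph_pipeline()`, whose actual signature is not shown in this issue.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class LanguageConfig:
    extensions: Tuple[str, ...]
    exclude_patterns: Tuple[str, ...] = ()


# Example values only; real defaults would need to be agreed on.
LANGUAGE_CONFIGS: Dict[str, LanguageConfig] = {
    "python": LanguageConfig((".py",), ("*__pycache__/*",)),
    "javascript": LanguageConfig((".js", ".jsx"), ("*node_modules/*", "*.min.js")),
    "typescript": LanguageConfig((".ts", ".tsx"), ("*node_modules/*",)),
    "go": LanguageConfig((".go",), ("*vendor/*",)),
}


def detect_language(
    path: str, supported_languages: Optional[Sequence[str]] = None
) -> Optional[str]:
    """Map a path to a language name by extension, or None if unsupported."""
    suffix = PurePosixPath(path).suffix.lower()
    names = supported_languages if supported_languages is not None else LANGUAGE_CONFIGS
    for name in names:
        config = LANGUAGE_CONFIGS.get(name)
        if config is not None and suffix in config.extensions:
            return name
    return None


def select_files(
    paths: Iterable[str], supported_languages: Sequence[str]
) -> Dict[str, List[str]]:
    """Group paths by language, dropping those hit by an exclusion pattern."""
    selected: Dict[str, List[str]] = {
        name: [] for name in supported_languages if name in LANGUAGE_CONFIGS
    }
    for path in paths:
        language = detect_language(path, supported_languages)
        if language is None:
            continue
        patterns = LANGUAGE_CONFIGS[language].exclude_patterns
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            continue
        selected[language].append(path)
    return selected
```

Exclusions are attached to the language rather than to the repository, so `node_modules` is skipped for JavaScript and TypeScript without affecting how Go or Python files are filtered.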
- CodeGraph Entities Update
  - Extend CodeFile entity with language type
  - Add language-specific dependency relationships
  - Detect cross-language dependencies
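The entity changes above might look something like the following sketch. The `CodeFile` here is a simplified stand-in, not the project's actual entity class, and `Dependency`, `kind`, and `cross_language_edges` are hypothetical. It assumes a cross-language dependency is simply an edge whose two endpoints have different languages.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CodeFile:
    """Simplified stand-in for the real CodeFile entity."""
    path: str
    language: str  # the new field proposed in this issue


@dataclass(frozen=True)
class Dependency:
    source: CodeFile
    target: CodeFile
    kind: str  # hypothetical label, e.g. "import", "require", "use", "ffi"

    @property
    def is_cross_language(self) -> bool:
        # Assumption: differing endpoint languages define "cross-language".
        return self.source.language != self.target.language


def cross_language_edges(edges: Iterable[Dependency]) -> List[Dependency]:
    """Return only the edges that connect files of different languages."""
    return [edge for edge in edges if edge.is_cross_language]
```

For example, an edge from `app.py` to a Rust extension source would be reported, while an edge between two TypeScript files would not:

```python
app = CodeFile("app.py", "python")
native = CodeFile("native/lib.rs", "rust")
client = CodeFile("web/client.ts", "typescript")
util = CodeFile("web/util.ts", "typescript")

edges = [
    Dependency(app, native, "ffi"),
    Dependency(client, util, "import"),
]
flagged = cross_language_edges(edges)
```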