Commit 4326601

feat: implement per-module logging for worker pool
Add structured logging for individual module processing with detailed tracking across all pipeline phases (clone, enrich, analyze).

Changes:
- Create module-logger.js utility for per-module log files
- Integrate logging into processModule() and worker processes
- Organize logs by run timestamp: logs/{runId}/modules/{moduleId}.log
- Auto-flush on errors and buffer overflow
- Add worker-specific ESLint overrides to central config
- Update worker pool README with logging documentation

Benefits:
- Isolated debugging per module (no log mixing in parallel execution)
- Complete audit trail with timestamps and structured data
- Worker PID tracking for troubleshooting
- Historical run comparison via timestamped directories
1 parent 7547424 commit 4326601

8 files changed: +531 -10 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 /.pipeline-runs/
+/logs/
 /modules/*
 /modules_/*
 /modules_temp/*

docs/pipeline-refactor-roadmap.md

Lines changed: 9 additions & 1 deletion
@@ -72,12 +72,20 @@ Merge stages 3+4+5 into parallel worker processes. See [worker-pool-design.md](p
 | P7.1 | ✅ Design complete |
 | P7.2 | ✅ Single-worker prototype |
 | P7.3 | ✅ Worker pool orchestration |
-| P7.4 | Per-module logging |
+| P7.4 | Per-module logging |
 | P7.5 | Cleanup old stage scripts |
 | P7.6 | Incremental mode integration |

 **Note:** P7.6 (Incremental mode) deferred until after P7.5 (Cleanup). The existing cache logic in `scripts/check-modules/index.ts` continues to work; integrating it into the new worker architecture makes more sense once the old pipeline is removed.

+**P7.4 Implementation (Feb 2026):**
+
+- Created module-specific logger utility (`module-logger.js`)
+- Per-module log files organized by run timestamp: `logs/{runId}/modules/{moduleId}.log`
+- Detailed logging across all processing phases (clone, enrich, analyze)
+- Automatic buffer flushing on errors and completion
+- Integrated with existing worker pool architecture
+
 **P7.3 Implementation (Jan 2026):**

 - Worker process wrapper with IPC communication
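
The layout named in the P7.4 notes above, `logs/{runId}/modules/{moduleId}.log`, can be sketched as a few pure functions. This is only an illustration: `makeRunId`, `logFileName`, and `relativeLogPath` are hypothetical names, and the way the run ID is derived from a date is an assumption based on the example `2026-02-04T10-30-45` in the worker README. The sanitization and the optional `.worker-{pid}` suffix follow what `module-logger.js` in this commit does.

```javascript
// Assumption: the run ID is an ISO timestamp without milliseconds,
// with ":" replaced by "-" so it is safe to use as a directory name.
function makeRunId(date = new Date()) {
  return date.toISOString().slice(0, 19).replace(/:/g, "-");
}

// Same rule as module-logger.js: every character that is not a letter,
// a digit, "_" or "-" becomes "_". A worker ID, when given, is appended.
function logFileName(moduleId, workerId) {
  const safeModuleId = moduleId.replace(/[^a-zA-Z0-9_-]/gu, "_");
  return workerId
    ? `${safeModuleId}.worker-${workerId}.log`
    : `${safeModuleId}.log`;
}

// Path relative to the project root, joined with "/" for readability.
// The commit itself uses path.join() on an absolute project root.
function relativeLogPath(runId, moduleId, workerId) {
  return ["logs", runId, "modules", logFileName(moduleId, workerId)].join("/");
}
```

Note that when a worker ID is passed, the file name carries the `.worker-{pid}` suffix, so the actual name is slightly longer than the `{moduleId}.log` template suggests.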

eslint.config.js

Lines changed: 8 additions & 0 deletions
@@ -102,6 +102,14 @@ export default defineConfig([
       "@typescript-eslint/no-explicit-any": "warn"
     }
   },
+  {
+    files: ["pipeline/workers/**/*.js"],
+    rules: {
+      "max-lines": ["warn", 900],
+      "max-lines-per-function": ["warn", 200],
+      "max-depth": ["warn", 6]
+    }
+  },
   { files: ["**/*.json"], ignores: ["package.json", "package-lock.json"], plugins: { json }, extends: ["json/recommended"], language: "json/json" },
   { files: ["package.json"], plugins: { packageJson }, extends: ["packageJson/recommended"], rules: { "package-json/sort-collections": "off" } },
   { files: ["**/*.md"], plugins: { markdown }, language: "markdown/gfm", extends: ["markdown/recommended"] }

pipeline/workers/README.md

Lines changed: 38 additions & 3 deletions
@@ -83,9 +83,9 @@ Successfully tested with multiple module sets:

 ### Next Steps (P7.4+)

-- [ ] P7.4: Integrate incremental mode with module cache
-- [ ] P7.5: Add per-module logging to files
-- [ ] P7.6: Remove old stage scripts after migration complete
+- [x] P7.4: Per-module logging to files ✅
+- [ ] P7.5: Remove old stage scripts after migration complete
+- [ ] P7.6: Integrate incremental mode with module cache
 - [ ] P7.7: Performance benchmarking and optimization

 ### Configuration Options
@@ -148,6 +148,41 @@ See [../docs/pipeline/worker-pool-design.md](../../docs/pipeline/worker-pool-des

 ## Design Decisions

+### Per-Module Logging (P7.4) ✅
+
+Each module gets its own log file with detailed processing information:
+
+**Log Structure:**
+
+```text
+logs/
+  {runId}/  # e.g., 2026-02-04T10-30-45
+    modules/
+      MMM-Module-----Author.worker-12345.log
+      MMM-OtherModule-----Dev.worker-12346.log
+```
+
+**Features:**
+
+- Organized by run timestamp for historical tracking
+- Includes worker PID in filename for debugging
+- Structured log entries with phase, level, message, and optional data
+- Auto-flush on errors and when buffer reaches 100 entries
+- Closed automatically when module processing completes
+
+**Log Format:**
+
+```text
+[2026-02-04T10:30:45.123Z] [INFO] [clone] Starting clone stage {"url":"...","branch":"master"}
+[2026-02-04T10:30:47.456Z] [INFO] [clone] Repository cloned successfully
+[2026-02-04T10:30:47.500Z] [INFO] [enrich] Starting enrichment stage
+[2026-02-04T10:30:48.100Z] [INFO] [end] Module processing completed successfully {"processingTimeMs":2977}
+```
+
+**Usage:**
+
+The logger is automatically created for each module and passed via config. No manual setup required in module processing code.
+
 ### Single Module Processing Function

 Instead of calling separate stage scripts, `processModule()` executes all stages inline:
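
Because the README documents a fixed line format (`[timestamp] [LEVEL] [phase] message {json}`), log files from different runs can be read back into structured records, for example to compare processing times. The following is a best-effort sketch rather than part of the commit: `parseLogLine` and `LINE_PATTERN` are hypothetical names, and the parser assumes the optional data payload is a single JSON object at the end of the line.

```javascript
// Hypothetical helper, not part of this commit.
// Matches: [timestamp] [LEVEL] [phase] rest-of-line
const LINE_PATTERN = /^\[([^\]]+)\] \[([A-Z]+)\] \[([^\]]+)\] (.*)$/u;

function parseLogLine(line) {
  const match = LINE_PATTERN.exec(line);
  if (!match) {
    return null; // Not a line in the documented format
  }

  const [, timestamp, level, phase, rest] = match;
  let message = rest;
  let data;

  // Assumption: structured data, when present, is a JSON object
  // appended after the message and separated by a single space.
  const jsonStart = rest.indexOf(" {");
  if (jsonStart !== -1 && rest.endsWith("}")) {
    try {
      data = JSON.parse(rest.slice(jsonStart + 1));
      message = rest.slice(0, jsonStart);
    } catch {
      // Trailing text is not valid JSON: keep it as part of the message.
    }
  }

  return { timestamp, level: level.toLowerCase(), phase, message, data };
}
```

A message that itself contains ` {` before the payload falls back to being returned whole, so callers that need strict parsing may prefer a different delimiter strategy.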

pipeline/workers/module-logger.js

Lines changed: 182 additions & 0 deletions
@@ -0,0 +1,182 @@
+/**
+ * Per-Module Logging Utility (P7.4)
+ *
+ * Provides structured logging for individual module processing.
+ * Logs are written to files organized by run timestamp and module ID.
+ */
+
+import { ensureDirectory } from "../../scripts/shared/fs-utils.js";
+import fs from "node:fs/promises";
+import path from "node:path";
+
+/**
+ * @typedef {Object} ModuleLoggerOptions
+ * @property {string} projectRoot - Project root directory
+ * @property {string} runId - Unique run identifier (timestamp)
+ * @property {string} moduleId - Module identifier (name-----maintainer)
+ * @property {number} [workerId] - Worker process ID
+ */
+
+/**
+ * @typedef {Object} LogEntry
+ * @property {string} timestamp - ISO timestamp
+ * @property {string} level - Log level (info, warn, error, debug)
+ * @property {string} phase - Processing phase (clone, enrich, analyze)
+ * @property {string} message - Log message
+ * @property {Object} [data] - Additional structured data
+ */
+
+/**
+ * Create a module-specific logger that writes to file
+ *
+ * @param {ModuleLoggerOptions} options
+ * @returns {Promise<ModuleLogger>}
+ */
+export async function createModuleLogger(options) {
+  const { projectRoot, runId, moduleId, workerId } = options;
+
+  // Create logs directory structure: logs/{runId}/modules/
+  const logsDir = path.join(projectRoot, "logs", runId, "modules");
+  await ensureDirectory(logsDir);
+
+  // Sanitize module ID for filename (replace special chars)
+  const safeModuleId = moduleId.replace(/[^a-zA-Z0-9_-]/gu, "_");
+  const logFileName = workerId
+    ? `${safeModuleId}.worker-${workerId}.log`
+    : `${safeModuleId}.log`;
+
+  const logFilePath = path.join(logsDir, logFileName);
+
+  // Log buffer (written periodically and on close)
+  const logBuffer = [];
+  let closed = false;
+
+  /**
+   * Write buffered logs to file
+   */
+  async function flush() {
+    if (logBuffer.length === 0 || closed) {
+      return;
+    }
+
+    const content = `${logBuffer.join("\n")}\n`;
+    await fs.appendFile(logFilePath, content, "utf8");
+    logBuffer.length = 0;
+  }
+
+  /**
+   * Format log entry
+   * @param {LogEntry} entry
+   * @returns {string}
+   */
+  function formatLogEntry(entry) {
+    const { timestamp, level, phase, message, data } = entry;
+    const parts = [`[${timestamp}]`, `[${level.toUpperCase()}]`, `[${phase}]`, message];
+
+    if (data && Object.keys(data).length > 0) {
+      parts.push(JSON.stringify(data));
+    }
+
+    return parts.join(" ");
+  }
+
+  /**
+   * Add log entry
+   * @param {string} level
+   * @param {string} phase
+   * @param {string} message
+   * @param {Object} [data]
+   */
+  async function log(level, phase, message, data) {
+    if (closed) {
+      return;
+    }
+
+    const entry = {
+      timestamp: new Date().toISOString(),
+      level,
+      phase,
+      message,
+      ...(data && { data })
+    };
+
+    logBuffer.push(formatLogEntry(entry));
+
+    // Auto-flush on error or if buffer is large
+    if (level === "error" || logBuffer.length >= 100) {
+      await flush();
+    }
+  }
+
+  /**
+   * Close logger and flush remaining logs
+   */
+  async function close() {
+    if (closed) {
+      return;
+    }
+
+    await flush();
+    closed = true;
+  }
+
+  return {
+    /**
+     * Log info message
+     * @param {string} phase
+     * @param {string} message
+     * @param {Object} [data]
+     */
+    info: (phase, message, data) => log("info", phase, message, data),
+
+    /**
+     * Log warning message
+     * @param {string} phase
+     * @param {string} message
+     * @param {Object} [data]
+     */
+    warn: (phase, message, data) => log("warn", phase, message, data),
+
+    /**
+     * Log error message
+     * @param {string} phase
+     * @param {string} message
+     * @param {Object} [data]
+     */
+    error: (phase, message, data) => log("error", phase, message, data),
+
+    /**
+     * Log debug message
+     * @param {string} phase
+     * @param {string} message
+     * @param {Object} [data]
+     */
+    debug: (phase, message, data) => log("debug", phase, message, data),
+
+    /**
+     * Flush logs to file
+     */
+    flush,
+
+    /**
+     * Close logger and flush
+     */
+    close,
+
+    /**
+     * Get log file path
+     */
+    getLogFilePath: () => logFilePath
+  };
+}
+
+/**
+ * @typedef {Object} ModuleLogger
+ * @property {(phase: string, message: string, data?: Object) => Promise<void>} info
+ * @property {(phase: string, message: string, data?: Object) => Promise<void>} warn
+ * @property {(phase: string, message: string, data?: Object) => Promise<void>} error
+ * @property {(phase: string, message: string, data?: Object) => Promise<void>} debug
+ * @property {() => Promise<void>} flush
+ * @property {() => Promise<void>} close
+ * @property {() => string} getLogFilePath
+ */
