
Commit 47552dd

feat: add partial callback for real-time ghost text (#230)
## Why is this change needed?

I have a use case where I need to show streaming speech-to-text as it lands. Currently there are no partials for me to latch onto in the `StreamingEouAsrManager`. This adds that support; tested and using successfully on this fork.

https://github.com/user-attachments/assets/5c1f1d49-0f10-4a93-8eaa-4d35338a4b0f

## Summary

Add `setPartialCallback()` to `StreamingEouAsrManager` that fires after each chunk that decodes new tokens, passing the current accumulated transcript. This enables real-time "ghost text" display during speech, which is useful for live transcription UIs that want to show text as it's being spoken, before the utterance is finalized.

## Changes

- Add `PartialCallback` type alias (matching existing `EouCallback` pattern)
- Add `partialCallback` private property
- Add `setPartialCallback(_:)` public method
- Invoke callback after each chunk's token accumulation in `processChunkAndDecode()`

## Usage

```swift
await manager.setPartialCallback { transcript in
    // Update UI with partial transcript (ghost text)
    print("Partial: \(transcript)")
}
```

The callback receives the full accumulated transcript after each 320ms chunk is processed (provided that chunk decoded new tokens), allowing UIs to display evolving text before EOU is detected.

---------
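Because each partial carries the full accumulated transcript rather than a delta, a consumer would typically replace its ghost text on every callback instead of appending to it. The following is a minimal sketch of that idea, not code from this commit: `GhostTextState` and its members are hypothetical names, and it assumes the caller clears the ghost text itself once the EOU callback delivers the final transcript.

```swift
// Hypothetical UI-side state holder; not part of FluidAudio.
struct GhostTextState {
    /// Finalized utterances, in the order they were committed.
    private(set) var committed: [String] = []
    /// In-progress text for the utterance currently being spoken.
    private(set) var ghost: String = ""

    /// Partials are cumulative, so the new value replaces the old one.
    mutating func applyPartial(_ transcript: String) {
        ghost = transcript
    }

    /// On end of utterance, promote the final transcript and clear the ghost text.
    mutating func commit(_ transcript: String) {
        if !transcript.isEmpty {
            committed.append(transcript)
        }
        ghost = ""
    }

    /// Committed text followed by whatever is still in flight.
    var displayText: String {
        var parts = committed
        if !ghost.isEmpty {
            parts.append(ghost)
        }
        return parts.joined(separator: " ")
    }
}

var state = GhostTextState()
state.applyPartial("turn")
state.applyPartial("turn on the")
print(state.displayText)  // turn on the
state.commit("turn on the lights")
print(state.displayText)  // turn on the lights
```

In an app, `applyPartial` would be called from the closure passed to `setPartialCallback` and `commit` from the existing EOU callback, most likely after hopping to the main actor, since both closures are `@Sendable` and are invoked from the manager's actor.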
1 parent 2aafbb6 commit 47552dd

1 file changed: +18 -0 lines changed
Sources/FluidAudio/ASR/Streaming/StreamingEouAsrManager.swift

Lines changed: 18 additions & 0 deletions
@@ -149,6 +149,10 @@ public enum StreamingChunkSize: Sendable {
 /// - Parameter transcript: The accumulated transcript up to the EOU point
 public typealias EouCallback = @Sendable (String) -> Void
 
+/// Callback invoked when new tokens are decoded (for ghost text).
+/// - Parameter transcript: The current accumulated partial transcript
+public typealias PartialCallback = @Sendable (String) -> Void
+
 /// High-level manager for the Parakeet EOU streaming pipeline.
 /// Uses native Swift mel spectrogram for exact NeMo parity.
 public actor StreamingEouAsrManager {
@@ -185,6 +189,8 @@ public actor StreamingEouAsrManager {
     public private(set) var eouDetected: Bool = false
     /// Optional callback invoked when EOU is detected
     private var eouCallback: EouCallback?
+    /// Optional callback invoked after each chunk with partial transcript
+    private var partialCallback: PartialCallback?
 
     // EOU Debouncing - requires sustained silence before triggering
     /// Minimum duration of silence (in ms) before EOU is confirmed
@@ -229,6 +235,12 @@ public actor StreamingEouAsrManager {
         self.eouCallback = callback
     }
 
+    /// Set a callback to be invoked when new tokens are decoded.
+    /// Useful for displaying "ghost text" during speech.
+    public func setPartialCallback(_ callback: @escaping PartialCallback) {
+        self.partialCallback = callback
+    }
+
     public func loadModels(modelDir: URL) async throws {
         logger.info("Loading CoreML models from \(modelDir.path)...")

@@ -421,6 +433,12 @@ public actor StreamingEouAsrManager {
             validOutLen: chunkSize.validOutputLen)
         accumulatedTokenIds.append(contentsOf: decodeResult.tokenIds)
 
+        // Invoke partial callback for ghost text (only when new tokens decoded)
+        if let callback = partialCallback, let tokenizer = tokenizer, !decodeResult.tokenIds.isEmpty {
+            let partial = tokenizer.decode(ids: accumulatedTokenIds)
+            callback(partial)
+        }
+
         // Track total samples for timing
         totalSamplesProcessed += shiftSamples

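The invocation rule in the last hunk (accumulate every chunk's tokens, but decode and report only when the chunk produced new ones) can be illustrated in isolation. This is a toy sketch under stated assumptions: `ToyTokenizer`, its vocabulary, and `collectPartials` are hypothetical stand-ins, not the real tokenizer or the body of `processChunkAndDecode()`.

```swift
// Hypothetical stand-in for the real tokenizer: maps token ids to words.
struct ToyTokenizer {
    let vocabulary: [Int: String]

    func decode(ids: [Int]) -> String {
        ids.compactMap { vocabulary[$0] }.joined(separator: " ")
    }
}

// Mirrors the rule from the hunk above: every chunk's tokens are accumulated,
// but a partial is only reported when the chunk actually decoded new tokens.
func collectPartials(chunks: [[Int]], tokenizer: ToyTokenizer) -> [String] {
    var accumulatedTokenIds: [Int] = []
    var partials: [String] = []
    for chunkTokenIds in chunks {
        accumulatedTokenIds.append(contentsOf: chunkTokenIds)
        if !chunkTokenIds.isEmpty {
            // Decode the whole accumulated sequence, not just the new tokens.
            partials.append(tokenizer.decode(ids: accumulatedTokenIds))
        }
    }
    return partials
}

let tokenizer = ToyTokenizer(vocabulary: [1: "turn", 2: "on", 3: "the", 4: "lights"])
// The second chunk decodes no tokens (e.g. silence), so it yields no partial.
let partials = collectPartials(chunks: [[1, 2], [], [3, 4]], tokenizer: tokenizer)
print(partials)  // ["turn on", "turn on the lights"]
```

One consequence of this design is that silent chunks do not trigger redundant UI updates, while every reported partial is a complete snapshot that can be rendered without tracking earlier callbacks.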