Releases: pipecat-ai/pipecat

v0.0.22

23 May 21:04
b1a6229

Added

v0.0.21

23 May 04:45
d4b2741

Added

  • Added vision support to Anthropic service.

  • Added WakeCheckFilter which allows you to pass information downstream only if you say a certain phrase/word.
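
The wake-phrase idea can be sketched in plain Python. This is only an illustration of the concept, not the WakeCheckFilter implementation: `wake_filter` and `_normalize` are hypothetical names, the input is assumed to be a stream of transcript strings, and the sketch stays awake forever once the phrase is heard.

```python
import re
from typing import Iterable, Iterator


def _normalize(text: str) -> str:
    # Lowercase and strip punctuation so "Hey, Robot!" matches "hey robot".
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def wake_filter(transcripts: Iterable[str], wake_phrases: list[str]) -> Iterator[str]:
    """Yield transcripts only once one of the wake phrases has been heard."""
    phrases = [_normalize(p) for p in wake_phrases]
    awake = False
    for text in transcripts:
        # Hypothetical behavior: the first matching transcript wakes the filter.
        if not awake and any(p in _normalize(text) for p in phrases):
            awake = True
        if awake:
            yield text


heard = ["what time is it", "Hey, Robot! What time is it?", "and tomorrow?"]
passed = list(wake_filter(heard, ["hey robot"]))
```

Here the first transcript is dropped because no wake phrase has been spoken yet; everything from the matching transcript onward is passed downstream.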

Changed

  • Filter has been renamed to FrameFilter and it's now under processors/filters.

Fixed

  • Fixed Anthropic service to use new frame types.

  • Fixed an issue in LLMUserResponseAggregator and UserResponseAggregator that would cause frames after a brief pause to not be pushed to the LLM.

  • Clear the audio output buffer if we are interrupted.

  • Re-add exponential smoothing after volume calculation. This makes sure the volume value being used doesn't fluctuate so much.
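
For readers unfamiliar with exponential smoothing, a minimal sketch follows. The function name `exp_smooth` and the `factor` parameter are assumptions made for illustration; the actual smoothing factor and code used by the library are not stated in these notes.

```python
def exp_smooth(values, factor=0.2):
    """Return exponentially smoothed values: s = s_prev + factor * (v - s_prev)."""
    smoothed = []
    prev = None
    for v in values:
        # The first sample seeds the running value; later samples nudge it.
        prev = v if prev is None else prev + factor * (v - prev)
        smoothed.append(prev)
    return smoothed


raw = [0.0, 1.0, 0.0, 1.0]            # a volume that jumps between extremes
smooth = exp_smooth(raw, factor=0.5)  # [0.0, 0.5, 0.25, 0.625]
```

A smaller `factor` reacts more slowly to new samples and therefore damps fluctuations more strongly.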

v0.0.20

22 May 21:29
6ac012a

Added

  • To improve interruptions we now compute a loudness level using pyloudnorm. Audio coming from WebRTC transports (e.g. Daily) has an Automatic Gain Control (AGC) algorithm applied to the signal; however, we don't do that on our local PyAudio signals. This means that incoming audio from PyAudio is currently somewhat broken. We will fix it in future releases.

Fixed

  • Fixed an issue where StartInterruptionFrame would cause LLMUserResponseAggregator to push the accumulated text, causing the LLM to respond in the wrong task. The StartInterruptionFrame should not trigger any new LLM response because that would be spoken in a different task.

  • Fixed an issue where tasks and threads could be paused because the executor didn't have more tasks available. This was causing issues when cancelling and recreating tasks during interruptions.

v0.0.19

21 May 04:41
147bd1a

Changed

  • LLMUserResponseAggregator and LLMAssistantResponseAggregator internal messages are now exposed through the messages property.

Fixed

  • Fixed an issue where LLMAssistantResponseAggregator was accumulating short sentences instead of the full response. If there is an interruption, only what the bot has spoken so far is accumulated, including in the middle of a long response.

v0.0.18

20 May 17:33

Fixed

  • Fixed an issue in DailyOutputTransport where transport messages were not being sent.

v0.0.17

20 May 02:29

Added

  • Added google.generativeai model support, including vision. This new google service defaults to using gemini-1.5-flash-latest. Example in examples/foundational/12a-describe-video-gemini-flash.py.

  • Added vision support to openai service. Example in examples/foundational/12a-describe-video-gemini-flash.py.

  • Added initial interruptions support. The assistant contexts (or aggregators) should now be placed after the output transport. This way, only the completed spoken context is added to the assistant context.

  • Added VADParams so you can control voice confidence level and others.

  • VADAnalyzer now uses an exponentially smoothed volume to improve speech detection. This is useful when voice confidence is high (because there's someone talking near you) but volume is low.
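
The combination of voice confidence and volume described above can be pictured as a simple two-condition gate. The sketch below is an assumption-laden illustration: `GateParams`, its field names, its default thresholds, and `is_speech` are hypothetical and do not reflect the real VADParams fields or VADAnalyzer logic.

```python
from dataclasses import dataclass


@dataclass
class GateParams:
    # Hypothetical thresholds, chosen only for this example.
    confidence: float = 0.7
    min_volume: float = 0.6


def is_speech(confidence: float, smoothed_volume: float, params: GateParams) -> bool:
    """Treat audio as speech only if it is both confidently voice and loud enough."""
    return confidence >= params.confidence and smoothed_volume >= params.min_volume


params = GateParams()
near_speaker = is_speech(0.9, 0.8, params)    # confident and loud
background = is_speech(0.95, 0.1, params)     # confident but quiet, e.g. someone nearby
```

Under these assumed thresholds, a confident but quiet voice is rejected, which is the situation the note describes.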

Fixed

  • Fixed an issue where TTSService was not pushing TextFrames downstream.

  • Fixed issues with Ctrl-C program termination.

  • Fixed an issue that was preventing StopTaskFrame from exiting the PipelineTask.

v0.0.16

17 May 01:16

Fixed

  • DailyTransport: don't publish camera and audio tracks if not enabled.

  • Fixed an issue in BaseInputTransport that was causing frames to be pushed downstream in the wrong order.

v0.0.15

16 May 00:08

Fixed

  • Quick hot fix for receiving DailyTransportMessage.

v0.0.14

15 May 23:00

Added

  • Added DailyTransport event on_participant_left.

  • Added support for receiving DailyTransportMessage.

Fixed

  • Images are now resized to the size of the output camera. Previously, a size mismatch caused images not to be displayed.

  • Fixed an issue in DailyTransport that would not allow the input processor to shut down if no participant ever joined the room.

  • Fixed base transports start and stop. In some situations processors would halt or not shut down properly.

v0.0.13

15 May 02:09

Changed

  • MoondreamService argument model_id is now model.

  • VADAnalyzer arguments have been renamed for more clarity.

Fixed

  • Fixed an issue with DailyInputTransport and DailyOutputTransport that could cause some threads to not start properly.

  • Fixed STTService. Added max_silence_secs and max_buffer_secs to better handle what is passed to the STT service. Also added exponential smoothing to the RMS.

  • Fixed WhisperSTTService. Add no_speech_prob to avoid garbage output text.