diff --git a/.mock/definition/empathic-voice/__package__.yml b/.mock/definition/empathic-voice/__package__.yml index 435263c7..f62dfaa1 100644 --- a/.mock/definition/empathic-voice/__package__.yml +++ b/.mock/definition/empathic-voice/__package__.yml @@ -13,7 +13,9 @@ errors: - value: {} types: AssistantEnd: - docs: When provided, the output is an assistant end message. + docs: >- + **Indicates the conclusion of the assistant's response**, signaling that + the assistant has finished speaking for the current conversational turn. properties: custom_session_id: type: optional @@ -33,7 +35,15 @@ types: source: openapi: evi-asyncapi.json AssistantInput: - docs: When provided, the input is spoken by EVI. + docs: >- + **Assistant text to synthesize into spoken audio and insert into the + conversation.** EVI uses this text to generate spoken audio using our + proprietary expressive text-to-speech model. + + + Our model adds appropriate emotional inflections and tones to the text + based on the user's expressions and the context of the conversation. The + synthesized audio is streamed back to the user as an Assistant Message. properties: custom_session_id: type: optional @@ -62,7 +72,10 @@ types: source: openapi: evi-openapi.json AssistantMessage: - docs: When provided, the output is an assistant message. + docs: >- + **Transcript of the assistant's message.** Contains the message role, + content, and optionally tool call information including the tool name, + parameters, response requirement status, tool call ID, and tool type. properties: custom_session_id: type: optional @@ -102,7 +115,10 @@ types: source: openapi: evi-asyncapi.json AssistantProsody: - docs: When provided, the output is an Assistant Prosody message. + docs: >- + **Expression measurement predictions of the assistant's audio output.** + Contains inference model results including prosody scores for 48 emotions + within the detected expression of the assistant's audio sample. properties: custom_session_id: type: optional @@ -145,7 +161,17 @@ types: source: openapi: evi-openapi.json AudioInput: - docs: When provided, the input is audio. + docs: >- + **Base64 encoded audio input to insert into the conversation.** The + content is treated as the user's speech to EVI and must be streamed + continuously. Pre-recorded audio files are not supported. + + + For optimal transcription quality, the audio data should be transmitted in + small chunks. Hume recommends streaming audio with a buffer window of `20` + milliseconds (ms), or `100` milliseconds (ms) for web applications. See + our [Audio Guide](/docs/speech-to-speech-evi/guides/audio) for more + details on preparing and processing audio. properties: custom_session_id: type: optional @@ -175,17 +201,20 @@ types: The type of message sent through the socket; must be `audio_input` for our server to correctly identify and process it as an Audio Input message. - - - This message is used for sending audio input data to EVI for - processing and expression measurement. Audio data should be sent as a - continuous stream, encoded in Base64. source: openapi: evi-openapi.json AudioOutput: docs: >- - The type of message sent through the socket; for an Audio Output message, - this must be `audio_output`. + **Base64 encoded audio output.** This encoded audio is transmitted to the + client, where it can be decoded and played back as part of the user + interaction. The returned audio format is WAV and the sample rate is + 48kHz. 
+ + + Contains the audio data, an ID to track and reference the audio output, + and an index indicating the chunk position relative to the whole audio + segment. See our [Audio Guide](/docs/speech-to-speech-evi/guides/audio) + for more details on preparing and processing audio. properties: custom_session_id: type: optional @@ -257,7 +286,16 @@ types: source: openapi: evi-asyncapi.json ChatMetadata: - docs: When provided, the output is a chat metadata message. + docs: >- + **The first message received after establishing a connection with EVI**, + containing important identifiers for the current Chat session. + + + Includes the Chat ID (which allows the Chat session to be tracked and + referenced) and the Chat Group ID (used to resume a Chat when passed in + the `resumed_chat_group_id` query parameter of a subsequent connection + request, allowing EVI to continue the conversation from where it left off + within the Chat Group). properties: chat_group_id: type: string @@ -392,7 +430,14 @@ types: Encoding: type: literal<"linear16"> WebSocketError: - docs: When provided, the output is an error message. + docs: >- + **Indicates a disruption in the WebSocket connection**, such as an + unexpected disconnection, protocol error, or data transmission issue. + + + Contains an error code identifying the type of error encountered, a + detailed description of the error, and a short, human-readable identifier + and description (slug) for the error. properties: code: type: string @@ -452,8 +497,15 @@ types: openapi: evi-openapi.json PauseAssistantMessage: docs: >- - Pause responses from EVI. Chat history is still saved and sent after - resuming. + **Pause responses from EVI.** Chat history is still saved and sent after + resuming. Once this message is sent, EVI will not respond until a Resume + Assistant message is sent. + + + When paused, EVI won't respond, but transcriptions of your audio inputs + will still be recorded. See our [Pause Response + Guide](/docs/speech-to-speech-evi/features/pause-responses) for further + details. properties: custom_session_id: type: optional @@ -495,8 +547,15 @@ types: openapi: evi-openapi.json ResumeAssistantMessage: docs: >- - Resume responses from EVI. Chat history sent while paused will now be - sent. + **Resume responses from EVI.** Chat history sent while paused will now be + sent. + + + Upon resuming, if any audio input was sent during the pause, EVI will + retain context from all messages sent but only respond to the last user + message. See our [Pause Response + Guide](/docs/speech-to-speech-evi/features/pause-responses) for further + details. properties: custom_session_id: type: optional @@ -539,7 +598,16 @@ types: openapi: evi-openapi.json inline: true SessionSettings: - docs: Settings for this chat session. + docs: >- + **Settings for this chat session.** Session settings are temporary and + apply only to the current Chat session. + + + These settings can be adjusted dynamically based on the requirements of + each session to ensure optimal performance and user experience. See our + [Session Settings + Guide](/docs/speech-to-speech-evi/configuration/session-settings) for a + complete list of configurable settings. properties: audio: type: optional @@ -764,7 +832,16 @@ types: source: openapi: evi-openapi.json ToolErrorMessage: - docs: When provided, the output is a function call error. + docs: >- + **Error message from the tool call**, not exposed to the LLM or user. 
Upon + receiving a Tool Call message and failing to invoke the function, this + message is sent to notify EVI of the tool's failure. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage` if the tool fails. See our + [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for further + details. properties: code: type: optional @@ -818,7 +895,16 @@ types: source: openapi: evi-openapi.json ToolResponseMessage: - docs: When provided, the output is a function call response. + docs: >- + **Return value of the tool call.** Contains the output generated by the + tool to pass back to EVI. Upon receiving a Tool Call message and + successfully invoking the function, this message is sent to convey the + result of the function call back to EVI. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage`. See our [Tool Use + Guide](/docs/speech-to-speech-evi/features/tool-use) for further details. properties: content: type: string @@ -877,9 +963,9 @@ types: openapi: evi-openapi.json UserInput: docs: >- - User text to insert into the conversation. Text sent through a User Input - message is treated as the user's speech to EVI. EVI processes this input - and provides a corresponding response. + **User text to insert into the conversation.** Text sent through a User + Input message is treated as the user's speech to EVI. EVI processes this + input and provides a corresponding response. Expression measurement results are not available for User Input messages, @@ -896,11 +982,6 @@ types: User text to insert into the conversation. Text sent through a User Input message is treated as the user's speech to EVI. EVI processes this input and provides a corresponding response. - - - Expression measurement results are not available for User Input - messages, as the prosody model relies on audio input and cannot - process text alone. type: type: literal<"user_input"> docs: >- @@ -910,7 +991,17 @@ types: source: openapi: evi-openapi.json UserInterruption: - docs: When provided, the output is an interruption. + docs: >- + **Indicates the user has interrupted the assistant's response.** EVI + detects the interruption in real-time and sends this message to signal the + interruption event. + + + This message allows the system to stop the current audio playback, clear + the audio queue, and prepare to handle new user input. Contains a Unix + timestamp of when the user interruption was detected. For more details, + see our [Interruptibility + Guide](/docs/speech-to-speech-evi/features/interruptibility) properties: custom_session_id: type: optional @@ -935,7 +1026,17 @@ types: source: openapi: evi-asyncapi.json UserMessage: - docs: When provided, the output is a user message. + docs: >- + **Transcript of the user's message.** Contains the message role and + content, along with a `from_text` field indicating if this message was + inserted into the conversation as text from a `UserInput` message. + + + Includes an `interim` field indicating whether the transcript is + provisional (words may be repeated or refined in subsequent `UserMessage` + responses as additional audio is processed) or final and complete. Interim + transcripts are only sent when the `verbose_transcription` query parameter + is set to true in the initial handshake. 
properties: custom_session_id: type: optional @@ -997,54 +1098,217 @@ types: discriminated: false union: - type: AssistantEnd - docs: When provided, the output is an assistant end message. + docs: >- + **Indicates the conclusion of the assistant's response**, signaling + that the assistant has finished speaking for the current + conversational turn. - type: AssistantMessage - docs: When provided, the output is an assistant message. + docs: >- + **Transcript of the assistant's message.** Contains the message role, + content, and optionally tool call information including the tool name, + parameters, response requirement status, tool call ID, and tool type. - type: AssistantProsody - docs: When provided, the output is an Assistant Prosody message. + docs: >- + **Expression measurement predictions of the assistant's audio + output.** Contains inference model results including prosody scores + for 48 emotions within the detected expression of the assistant's + audio sample. - type: AudioOutput docs: >- - The type of message sent through the socket; for an Audio Output - message, this must be `audio_output`. + **Base64 encoded audio output.** This encoded audio is transmitted to + the client, where it can be decoded and played back as part of the + user interaction. The returned audio format is WAV and the sample rate + is 48kHz. + + + Contains the audio data, an ID to track and reference the audio + output, and an index indicating the chunk position relative to the + whole audio segment. See our [Audio + Guide](/docs/speech-to-speech-evi/guides/audio) for more details on + preparing and processing audio. - type: ChatMetadata - docs: When provided, the output is a chat metadata message. + docs: >- + **The first message received after establishing a connection with + EVI**, containing important identifiers for the current Chat session. + + + Includes the Chat ID (which allows the Chat session to be tracked and + referenced) and the Chat Group ID (used to resume a Chat when passed + in the `resumed_chat_group_id` query parameter of a subsequent + connection request, allowing EVI to continue the conversation from + where it left off within the Chat Group). - type: WebSocketError - docs: When provided, the output is an error message. + docs: >- + **Indicates a disruption in the WebSocket connection**, such as an + unexpected disconnection, protocol error, or data transmission issue. + + + Contains an error code identifying the type of error encountered, a + detailed description of the error, and a short, human-readable + identifier and description (slug) for the error. - type: UserInterruption - docs: When provided, the output is an interruption. + docs: >- + **Indicates the user has interrupted the assistant's response.** EVI + detects the interruption in real-time and sends this message to signal + the interruption event. + + + This message allows the system to stop the current audio playback, + clear the audio queue, and prepare to handle new user input. Contains + a Unix timestamp of when the user interruption was detected. For more + details, see our [Interruptibility + Guide](/docs/speech-to-speech-evi/features/interruptibility) - type: UserMessage - docs: When provided, the output is a user message. + docs: >- + **Transcript of the user's message.** Contains the message role and + content, along with a `from_text` field indicating if this message was + inserted into the conversation as text from a `UserInput` message. 
+ + + Includes an `interim` field indicating whether the transcript is + provisional (words may be repeated or refined in subsequent + `UserMessage` responses as additional audio is processed) or final and + complete. Interim transcripts are only sent when the + `verbose_transcription` query parameter is set to true in the initial + handshake. - type: ToolCallMessage - docs: When provided, the output is a tool call. + docs: >- + **Indicates that the supplemental LLM has detected a need to invoke + the specified tool.** This message is only received for user-defined + function tools. + + + Contains the tool name, parameters (as a stringified JSON schema), + whether a response is required from the developer (either in the form + of a `ToolResponseMessage` or a `ToolErrorMessage`), the unique tool + call ID for tracking the request and response, and the tool type. See + our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for + further details. - type: ToolResponseMessage - docs: When provided, the output is a function call response. + docs: >- + **Return value of the tool call.** Contains the output generated by + the tool to pass back to EVI. Upon receiving a Tool Call message and + successfully invoking the function, this message is sent to convey the + result of the function call back to EVI. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage`. See our [Tool Use + Guide](/docs/speech-to-speech-evi/features/tool-use) for further + details. - type: ToolErrorMessage - docs: When provided, the output is a function call error. + docs: >- + **Error message from the tool call**, not exposed to the LLM or user. + Upon receiving a Tool Call message and failing to invoke the function, + this message is sent to notify EVI of the tool's failure. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage` if the tool fails. See + our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for + further details. source: openapi: evi-asyncapi.json JsonMessage: discriminated: false union: - type: AssistantEnd - docs: When provided, the output is an assistant end message. + docs: >- + **Indicates the conclusion of the assistant's response**, signaling + that the assistant has finished speaking for the current + conversational turn. - type: AssistantMessage - docs: When provided, the output is an assistant message. + docs: >- + **Transcript of the assistant's message.** Contains the message role, + content, and optionally tool call information including the tool name, + parameters, response requirement status, tool call ID, and tool type. - type: AssistantProsody - docs: When provided, the output is an Assistant Prosody message. + docs: >- + **Expression measurement predictions of the assistant's audio + output.** Contains inference model results including prosody scores + for 48 emotions within the detected expression of the assistant's + audio sample. - type: ChatMetadata - docs: When provided, the output is a chat metadata message. + docs: >- + **The first message received after establishing a connection with + EVI**, containing important identifiers for the current Chat session. 
+ + + Includes the Chat ID (which allows the Chat session to be tracked and + referenced) and the Chat Group ID (used to resume a Chat when passed + in the `resumed_chat_group_id` query parameter of a subsequent + connection request, allowing EVI to continue the conversation from + where it left off within the Chat Group). - type: WebSocketError - docs: When provided, the output is an error message. + docs: >- + **Indicates a disruption in the WebSocket connection**, such as an + unexpected disconnection, protocol error, or data transmission issue. + + + Contains an error code identifying the type of error encountered, a + detailed description of the error, and a short, human-readable + identifier and description (slug) for the error. - type: UserInterruption - docs: When provided, the output is an interruption. + docs: >- + **Indicates the user has interrupted the assistant's response.** EVI + detects the interruption in real-time and sends this message to signal + the interruption event. + + + This message allows the system to stop the current audio playback, + clear the audio queue, and prepare to handle new user input. Contains + a Unix timestamp of when the user interruption was detected. For more + details, see our [Interruptibility + Guide](/docs/speech-to-speech-evi/features/interruptibility) - type: UserMessage - docs: When provided, the output is a user message. + docs: >- + **Transcript of the user's message.** Contains the message role and + content, along with a `from_text` field indicating if this message was + inserted into the conversation as text from a `UserInput` message. + + + Includes an `interim` field indicating whether the transcript is + provisional (words may be repeated or refined in subsequent + `UserMessage` responses as additional audio is processed) or final and + complete. Interim transcripts are only sent when the + `verbose_transcription` query parameter is set to true in the initial + handshake. - type: ToolCallMessage - docs: When provided, the output is a tool call. + docs: >- + **Indicates that the supplemental LLM has detected a need to invoke + the specified tool.** This message is only received for user-defined + function tools. + + + Contains the tool name, parameters (as a stringified JSON schema), + whether a response is required from the developer (either in the form + of a `ToolResponseMessage` or a `ToolErrorMessage`), the unique tool + call ID for tracking the request and response, and the tool type. See + our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for + further details. - type: ToolResponseMessage - docs: When provided, the output is a function call response. + docs: >- + **Return value of the tool call.** Contains the output generated by + the tool to pass back to EVI. Upon receiving a Tool Call message and + successfully invoking the function, this message is sent to convey the + result of the function call back to EVI. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage`. See our [Tool Use + Guide](/docs/speech-to-speech-evi/features/tool-use) for further + details. - type: ToolErrorMessage - docs: When provided, the output is a function call error. + docs: >- + **Error message from the tool call**, not exposed to the LLM or user. + Upon receiving a Tool Call message and failing to invoke the function, + this message is sent to notify EVI of the tool's failure. 
+ + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage` if the tool fails. See + our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for + further details. source: openapi: evi-asyncapi.json ConnectSessionSettingsAudio: @@ -1229,31 +1493,82 @@ types: discriminated: false union: - type: SessionSettings - docs: Settings for this chat session. + docs: >- + **Settings for this chat session.** Session settings are temporary and + apply only to the current Chat session. + + + These settings can be adjusted dynamically based on the requirements + of each session to ensure optimal performance and user experience. See + our [Session Settings + Guide](/docs/speech-to-speech-evi/configuration/session-settings) for + a complete list of configurable settings. - type: UserInput docs: >- - User text to insert into the conversation. Text sent through a User - Input message is treated as the user's speech to EVI. EVI processes - this input and provides a corresponding response. + **User text to insert into the conversation.** Text sent through a + User Input message is treated as the user's speech to EVI. EVI + processes this input and provides a corresponding response. Expression measurement results are not available for User Input messages, as the prosody model relies on audio input and cannot process text alone. - type: AssistantInput - docs: When provided, the input is spoken by EVI. + docs: >- + **Assistant text to synthesize into spoken audio and insert into the + conversation.** EVI uses this text to generate spoken audio using our + proprietary expressive text-to-speech model. + + + Our model adds appropriate emotional inflections and tones to the text + based on the user's expressions and the context of the conversation. + The synthesized audio is streamed back to the user as an Assistant + Message. - type: ToolResponseMessage - docs: When provided, the output is a function call response. + docs: >- + **Return value of the tool call.** Contains the output generated by + the tool to pass back to EVI. Upon receiving a Tool Call message and + successfully invoking the function, this message is sent to convey the + result of the function call back to EVI. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage`. See our [Tool Use + Guide](/docs/speech-to-speech-evi/features/tool-use) for further + details. - type: ToolErrorMessage - docs: When provided, the output is a function call error. + docs: >- + **Error message from the tool call**, not exposed to the LLM or user. + Upon receiving a Tool Call message and failing to invoke the function, + this message is sent to notify EVI of the tool's failure. + + + For built-in tools implemented on the server, you will receive this + message type rather than a `ToolCallMessage` if the tool fails. See + our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for + further details. - type: PauseAssistantMessage docs: >- - Pause responses from EVI. Chat history is still saved and sent after - resuming. + **Pause responses from EVI.** Chat history is still saved and sent + after resuming. Once this message is sent, EVI will not respond until + a Resume Assistant message is sent. + + + When paused, EVI won't respond, but transcriptions of your audio + inputs will still be recorded. See our [Pause Response + Guide](/docs/speech-to-speech-evi/features/pause-responses) for + further details. 
- type: ResumeAssistantMessage docs: >- - Resume responses from EVI. Chat history sent while paused will now be - sent. + **Resume responses from EVI.** Chat history sent while paused will now + be sent. + + + Upon resuming, if any audio input was sent during the pause, EVI will + retain context from all messages sent but only respond to the last + user message. See our [Pause Response + Guide](/docs/speech-to-speech-evi/features/pause-responses) for + further details. source: openapi: evi-openapi.json ErrorResponse: diff --git a/.mock/definition/empathic-voice/chatWebhooks.yml b/.mock/definition/empathic-voice/chatWebhooks.yml index 7f3da91b..6e66c302 100644 --- a/.mock/definition/empathic-voice/chatWebhooks.yml +++ b/.mock/definition/empathic-voice/chatWebhooks.yml @@ -49,7 +49,6 @@ webhooks: custom_session_id: null timestamp: 1 tool_call_message: - custom_session_id: null name: name parameters: parameters response_required: true diff --git a/.mock/definition/tts/__package__.yml b/.mock/definition/tts/__package__.yml index 53427e35..96e9aa75 100644 --- a/.mock/definition/tts/__package__.yml +++ b/.mock/definition/tts/__package__.yml @@ -213,7 +213,12 @@ service: docs: Specifies the output audio file format. include_timestamp_types: type: optional> - docs: The set of timestamp types to include in the response. + docs: >- + The set of timestamp types to include in the response. When used + in multipart/form-data, specify each value using bracket + notation: + `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. + Only supported for Octave 2 requests. content-type: multipart/form-data response: docs: Successful Response @@ -251,7 +256,12 @@ service: docs: Specifies the output audio file format. include_timestamp_types: type: optional> - docs: The set of timestamp types to include in the response. + docs: >- + The set of timestamp types to include in the response. When used + in multipart/form-data, specify each value using bracket + notation: + `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. + Only supported for Octave 2 requests. content-type: multipart/form-data response-stream: docs: Successful Response diff --git a/.mock/fern.config.json b/.mock/fern.config.json index 904feebc..8ee40941 100644 --- a/.mock/fern.config.json +++ b/.mock/fern.config.json @@ -1,4 +1,4 @@ { "organization" : "hume", - "version" : "0.108.0" + "version" : "0.114.0" } \ No newline at end of file diff --git a/poetry.lock b/poetry.lock index c29f926d..ef9f0876 100644 --- a/poetry.lock +++ b/poetry.lock @@ -1,4 +1,4 @@ -# This file is automatically @generated by Poetry 1.8.3 and should not be changed by hand. +# This file is automatically @generated by Poetry 1.8.5 and should not be changed by hand. 
[[package]] name = "aiofiles" @@ -3220,97 +3220,80 @@ test = ["pytest", "websockets"] [[package]] name = "websockets" -version = "13.1" +version = "15.0.1" description = "An implementation of the WebSocket Protocol (RFC 6455 & 7692)" optional = false -python-versions = ">=3.8" +python-versions = ">=3.9" files = [ - {file = "websockets-13.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:f48c749857f8fb598fb890a75f540e3221d0976ed0bf879cf3c7eef34151acee"}, - {file = "websockets-13.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:c7e72ce6bda6fb9409cc1e8164dd41d7c91466fb599eb047cfda72fe758a34a7"}, - {file = "websockets-13.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:f779498eeec470295a2b1a5d97aa1bc9814ecd25e1eb637bd9d1c73a327387f6"}, - {file = "websockets-13.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4676df3fe46956fbb0437d8800cd5f2b6d41143b6e7e842e60554398432cf29b"}, - {file = "websockets-13.1-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a7affedeb43a70351bb811dadf49493c9cfd1ed94c9c70095fd177e9cc1541fa"}, - {file = "websockets-13.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1971e62d2caa443e57588e1d82d15f663b29ff9dfe7446d9964a4b6f12c1e700"}, - {file = "websockets-13.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:5f2e75431f8dc4a47f31565a6e1355fb4f2ecaa99d6b89737527ea917066e26c"}, - {file = "websockets-13.1-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:58cf7e75dbf7e566088b07e36ea2e3e2bd5676e22216e4cad108d4df4a7402a0"}, - {file = "websockets-13.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:c90d6dec6be2c7d03378a574de87af9b1efea77d0c52a8301dd831ece938452f"}, - {file = "websockets-13.1-cp310-cp310-win32.whl", hash = "sha256:730f42125ccb14602f455155084f978bd9e8e57e89b569b4d7f0f0c17a448ffe"}, - {file = "websockets-13.1-cp310-cp310-win_amd64.whl", hash = "sha256:5993260f483d05a9737073be197371940c01b257cc45ae3f1d5d7adb371b266a"}, - {file = "websockets-13.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:61fc0dfcda609cda0fc9fe7977694c0c59cf9d749fbb17f4e9483929e3c48a19"}, - {file = "websockets-13.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:ceec59f59d092c5007e815def4ebb80c2de330e9588e101cf8bd94c143ec78a5"}, - {file = "websockets-13.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:c1dca61c6db1166c48b95198c0b7d9c990b30c756fc2923cc66f68d17dc558fd"}, - {file = "websockets-13.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:308e20f22c2c77f3f39caca508e765f8725020b84aa963474e18c59accbf4c02"}, - {file = "websockets-13.1-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:62d516c325e6540e8a57b94abefc3459d7dab8ce52ac75c96cad5549e187e3a7"}, - {file = "websockets-13.1-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:87c6e35319b46b99e168eb98472d6c7d8634ee37750d7693656dc766395df096"}, - {file = "websockets-13.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:5f9fee94ebafbc3117c30be1844ed01a3b177bb6e39088bc6b2fa1dc15572084"}, - {file = "websockets-13.1-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:7c1e90228c2f5cdde263253fa5db63e6653f1c00e7ec64108065a0b9713fa1b3"}, - {file = "websockets-13.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:6548f29b0e401eea2b967b2fdc1c7c7b5ebb3eeb470ed23a54cd45ef078a0db9"}, - {file = 
"websockets-13.1-cp311-cp311-win32.whl", hash = "sha256:c11d4d16e133f6df8916cc5b7e3e96ee4c44c936717d684a94f48f82edb7c92f"}, - {file = "websockets-13.1-cp311-cp311-win_amd64.whl", hash = "sha256:d04f13a1d75cb2b8382bdc16ae6fa58c97337253826dfe136195b7f89f661557"}, - {file = "websockets-13.1-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:9d75baf00138f80b48f1eac72ad1535aac0b6461265a0bcad391fc5aba875cfc"}, - {file = "websockets-13.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:9b6f347deb3dcfbfde1c20baa21c2ac0751afaa73e64e5b693bb2b848efeaa49"}, - {file = "websockets-13.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:de58647e3f9c42f13f90ac7e5f58900c80a39019848c5547bc691693098ae1bd"}, - {file = "websockets-13.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a1b54689e38d1279a51d11e3467dd2f3a50f5f2e879012ce8f2d6943f00e83f0"}, - {file = "websockets-13.1-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:cf1781ef73c073e6b0f90af841aaf98501f975d306bbf6221683dd594ccc52b6"}, - {file = "websockets-13.1-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8d23b88b9388ed85c6faf0e74d8dec4f4d3baf3ecf20a65a47b836d56260d4b9"}, - {file = "websockets-13.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3c78383585f47ccb0fcf186dcb8a43f5438bd7d8f47d69e0b56f71bf431a0a68"}, - {file = "websockets-13.1-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:d6d300f8ec35c24025ceb9b9019ae9040c1ab2f01cddc2bcc0b518af31c75c14"}, - {file = "websockets-13.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:a9dcaf8b0cc72a392760bb8755922c03e17a5a54e08cca58e8b74f6902b433cf"}, - {file = "websockets-13.1-cp312-cp312-win32.whl", hash = "sha256:2f85cf4f2a1ba8f602298a853cec8526c2ca42a9a4b947ec236eaedb8f2dc80c"}, - {file = "websockets-13.1-cp312-cp312-win_amd64.whl", hash = "sha256:38377f8b0cdeee97c552d20cf1865695fcd56aba155ad1b4ca8779a5b6ef4ac3"}, - {file = "websockets-13.1-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:a9ab1e71d3d2e54a0aa646ab6d4eebfaa5f416fe78dfe4da2839525dc5d765c6"}, - {file = "websockets-13.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:b9d7439d7fab4dce00570bb906875734df13d9faa4b48e261c440a5fec6d9708"}, - {file = "websockets-13.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:327b74e915cf13c5931334c61e1a41040e365d380f812513a255aa804b183418"}, - {file = "websockets-13.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:325b1ccdbf5e5725fdcb1b0e9ad4d2545056479d0eee392c291c1bf76206435a"}, - {file = "websockets-13.1-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:346bee67a65f189e0e33f520f253d5147ab76ae42493804319b5716e46dddf0f"}, - {file = "websockets-13.1-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:91a0fa841646320ec0d3accdff5b757b06e2e5c86ba32af2e0815c96c7a603c5"}, - {file = "websockets-13.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:18503d2c5f3943e93819238bf20df71982d193f73dcecd26c94514f417f6b135"}, - {file = "websockets-13.1-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:a9cd1af7e18e5221d2878378fbc287a14cd527fdd5939ed56a18df8a31136bb2"}, - {file = "websockets-13.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:70c5be9f416aa72aab7a2a76c90ae0a4fe2755c1816c153c1a2bcc3333ce4ce6"}, - {file = "websockets-13.1-cp313-cp313-win32.whl", hash = 
"sha256:624459daabeb310d3815b276c1adef475b3e6804abaf2d9d2c061c319f7f187d"}, - {file = "websockets-13.1-cp313-cp313-win_amd64.whl", hash = "sha256:c518e84bb59c2baae725accd355c8dc517b4a3ed8db88b4bc93c78dae2974bf2"}, - {file = "websockets-13.1-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:c7934fd0e920e70468e676fe7f1b7261c1efa0d6c037c6722278ca0228ad9d0d"}, - {file = "websockets-13.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:149e622dc48c10ccc3d2760e5f36753db9cacf3ad7bc7bbbfd7d9c819e286f23"}, - {file = "websockets-13.1-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:a569eb1b05d72f9bce2ebd28a1ce2054311b66677fcd46cf36204ad23acead8c"}, - {file = "websockets-13.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:95df24ca1e1bd93bbca51d94dd049a984609687cb2fb08a7f2c56ac84e9816ea"}, - {file = "websockets-13.1-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d8dbb1bf0c0a4ae8b40bdc9be7f644e2f3fb4e8a9aca7145bfa510d4a374eeb7"}, - {file = "websockets-13.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:035233b7531fb92a76beefcbf479504db8c72eb3bff41da55aecce3a0f729e54"}, - {file = "websockets-13.1-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:e4450fc83a3df53dec45922b576e91e94f5578d06436871dce3a6be38e40f5db"}, - {file = "websockets-13.1-cp38-cp38-musllinux_1_2_i686.whl", hash = "sha256:463e1c6ec853202dd3657f156123d6b4dad0c546ea2e2e38be2b3f7c5b8e7295"}, - {file = "websockets-13.1-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:6d6855bbe70119872c05107e38fbc7f96b1d8cb047d95c2c50869a46c65a8e96"}, - {file = "websockets-13.1-cp38-cp38-win32.whl", hash = "sha256:204e5107f43095012b00f1451374693267adbb832d29966a01ecc4ce1db26faf"}, - {file = "websockets-13.1-cp38-cp38-win_amd64.whl", hash = "sha256:485307243237328c022bc908b90e4457d0daa8b5cf4b3723fd3c4a8012fce4c6"}, - {file = "websockets-13.1-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:9b37c184f8b976f0c0a231a5f3d6efe10807d41ccbe4488df8c74174805eea7d"}, - {file = "websockets-13.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:163e7277e1a0bd9fb3c8842a71661ad19c6aa7bb3d6678dc7f89b17fbcc4aeb7"}, - {file = "websockets-13.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:4b889dbd1342820cc210ba44307cf75ae5f2f96226c0038094455a96e64fb07a"}, - {file = "websockets-13.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:586a356928692c1fed0eca68b4d1c2cbbd1ca2acf2ac7e7ebd3b9052582deefa"}, - {file = "websockets-13.1-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7bd6abf1e070a6b72bfeb71049d6ad286852e285f146682bf30d0296f5fbadfa"}, - {file = "websockets-13.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6d2aad13a200e5934f5a6767492fb07151e1de1d6079c003ab31e1823733ae79"}, - {file = "websockets-13.1-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:df01aea34b6e9e33572c35cd16bae5a47785e7d5c8cb2b54b2acdb9678315a17"}, - {file = "websockets-13.1-cp39-cp39-musllinux_1_2_i686.whl", hash = "sha256:e54affdeb21026329fb0744ad187cf812f7d3c2aa702a5edb562b325191fcab6"}, - {file = "websockets-13.1-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:9ef8aa8bdbac47f4968a5d66462a2a0935d044bf35c0e5a8af152d58516dbeb5"}, - {file = "websockets-13.1-cp39-cp39-win32.whl", hash = "sha256:deeb929efe52bed518f6eb2ddc00cc496366a14c726005726ad62c2dd9017a3c"}, - {file = 
"websockets-13.1-cp39-cp39-win_amd64.whl", hash = "sha256:7c65ffa900e7cc958cd088b9a9157a8141c991f8c53d11087e6fb7277a03f81d"}, - {file = "websockets-13.1-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:5dd6da9bec02735931fccec99d97c29f47cc61f644264eb995ad6c0c27667238"}, - {file = "websockets-13.1-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:2510c09d8e8df777177ee3d40cd35450dc169a81e747455cc4197e63f7e7bfe5"}, - {file = "websockets-13.1-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f1c3cf67185543730888b20682fb186fc8d0fa6f07ccc3ef4390831ab4b388d9"}, - {file = "websockets-13.1-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:bcc03c8b72267e97b49149e4863d57c2d77f13fae12066622dc78fe322490fe6"}, - {file = "websockets-13.1-pp310-pypy310_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:004280a140f220c812e65f36944a9ca92d766b6cc4560be652a0a3883a79ed8a"}, - {file = "websockets-13.1-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:e2620453c075abeb0daa949a292e19f56de518988e079c36478bacf9546ced23"}, - {file = "websockets-13.1-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:9156c45750b37337f7b0b00e6248991a047be4aa44554c9886fe6bdd605aab3b"}, - {file = "websockets-13.1-pp38-pypy38_pp73-macosx_11_0_arm64.whl", hash = "sha256:80c421e07973a89fbdd93e6f2003c17d20b69010458d3a8e37fb47874bd67d51"}, - {file = "websockets-13.1-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:82d0ba76371769d6a4e56f7e83bb8e81846d17a6190971e38b5de108bde9b0d7"}, - {file = "websockets-13.1-pp38-pypy38_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:e9875a0143f07d74dc5e1ded1c4581f0d9f7ab86c78994e2ed9e95050073c94d"}, - {file = "websockets-13.1-pp38-pypy38_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a11e38ad8922c7961447f35c7b17bffa15de4d17c70abd07bfbe12d6faa3e027"}, - {file = "websockets-13.1-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:4059f790b6ae8768471cddb65d3c4fe4792b0ab48e154c9f0a04cefaabcd5978"}, - {file = "websockets-13.1-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:25c35bf84bf7c7369d247f0b8cfa157f989862c49104c5cf85cb5436a641d93e"}, - {file = "websockets-13.1-pp39-pypy39_pp73-macosx_11_0_arm64.whl", hash = "sha256:83f91d8a9bb404b8c2c41a707ac7f7f75b9442a0a876df295de27251a856ad09"}, - {file = "websockets-13.1-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7a43cfdcddd07f4ca2b1afb459824dd3c6d53a51410636a2c7fc97b9a8cf4842"}, - {file = "websockets-13.1-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:48a2ef1381632a2f0cb4efeff34efa97901c9fbc118e01951ad7cfc10601a9bb"}, - {file = "websockets-13.1-pp39-pypy39_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:459bf774c754c35dbb487360b12c5727adab887f1622b8aed5755880a21c4a20"}, - {file = "websockets-13.1-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:95858ca14a9f6fa8413d29e0a585b31b278388aa775b8a81fa24830123874678"}, - {file = "websockets-13.1-py3-none-any.whl", hash = "sha256:a9a396a6ad26130cdae92ae10c36af09d9bfe6cafe69670fd3b6da9b07b4044f"}, - {file = "websockets-13.1.tar.gz", hash = "sha256:a3b3366087c1bc0a2795111edcadddb8b3b59509d5db5d7ea3fdd69f954a8878"}, + {file = 
"websockets-15.0.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:d63efaa0cd96cf0c5fe4d581521d9fa87744540d4bc999ae6e08595a1014b45b"}, + {file = "websockets-15.0.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:ac60e3b188ec7574cb761b08d50fcedf9d77f1530352db4eef1707fe9dee7205"}, + {file = "websockets-15.0.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:5756779642579d902eed757b21b0164cd6fe338506a8083eb58af5c372e39d9a"}, + {file = "websockets-15.0.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0fdfe3e2a29e4db3659dbd5bbf04560cea53dd9610273917799f1cde46aa725e"}, + {file = "websockets-15.0.1-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4c2529b320eb9e35af0fa3016c187dffb84a3ecc572bcee7c3ce302bfeba52bf"}, + {file = "websockets-15.0.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ac1e5c9054fe23226fb11e05a6e630837f074174c4c2f0fe442996112a6de4fb"}, + {file = "websockets-15.0.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:5df592cd503496351d6dc14f7cdad49f268d8e618f80dce0cd5a36b93c3fc08d"}, + {file = "websockets-15.0.1-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:0a34631031a8f05657e8e90903e656959234f3a04552259458aac0b0f9ae6fd9"}, + {file = "websockets-15.0.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:3d00075aa65772e7ce9e990cab3ff1de702aa09be3940d1dc88d5abf1ab8a09c"}, + {file = "websockets-15.0.1-cp310-cp310-win32.whl", hash = "sha256:1234d4ef35db82f5446dca8e35a7da7964d02c127b095e172e54397fb6a6c256"}, + {file = "websockets-15.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:39c1fec2c11dc8d89bba6b2bf1556af381611a173ac2b511cf7231622058af41"}, + {file = "websockets-15.0.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:823c248b690b2fd9303ba00c4f66cd5e2d8c3ba4aa968b2779be9532a4dad431"}, + {file = "websockets-15.0.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:678999709e68425ae2593acf2e3ebcbcf2e69885a5ee78f9eb80e6e371f1bf57"}, + {file = "websockets-15.0.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:d50fd1ee42388dcfb2b3676132c78116490976f1300da28eb629272d5d93e905"}, + {file = "websockets-15.0.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d99e5546bf73dbad5bf3547174cd6cb8ba7273062a23808ffea025ecb1cf8562"}, + {file = "websockets-15.0.1-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:66dd88c918e3287efc22409d426c8f729688d89a0c587c88971a0faa2c2f3792"}, + {file = "websockets-15.0.1-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8dd8327c795b3e3f219760fa603dcae1dcc148172290a8ab15158cf85a953413"}, + {file = "websockets-15.0.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:8fdc51055e6ff4adeb88d58a11042ec9a5eae317a0a53d12c062c8a8865909e8"}, + {file = "websockets-15.0.1-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:693f0192126df6c2327cce3baa7c06f2a117575e32ab2308f7f8216c29d9e2e3"}, + {file = "websockets-15.0.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:54479983bd5fb469c38f2f5c7e3a24f9a4e70594cd68cd1fa6b9340dadaff7cf"}, + {file = "websockets-15.0.1-cp311-cp311-win32.whl", hash = "sha256:16b6c1b3e57799b9d38427dda63edcbe4926352c47cf88588c0be4ace18dac85"}, + {file = "websockets-15.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:27ccee0071a0e75d22cb35849b1db43f2ecd3e161041ac1ee9d2352ddf72f065"}, + {file = 
"websockets-15.0.1-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:3e90baa811a5d73f3ca0bcbf32064d663ed81318ab225ee4f427ad4e26e5aff3"}, + {file = "websockets-15.0.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:592f1a9fe869c778694f0aa806ba0374e97648ab57936f092fd9d87f8bc03665"}, + {file = "websockets-15.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0701bc3cfcb9164d04a14b149fd74be7347a530ad3bbf15ab2c678a2cd3dd9a2"}, + {file = "websockets-15.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e8b56bdcdb4505c8078cb6c7157d9811a85790f2f2b3632c7d1462ab5783d215"}, + {file = "websockets-15.0.1-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0af68c55afbd5f07986df82831c7bff04846928ea8d1fd7f30052638788bc9b5"}, + {file = "websockets-15.0.1-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:64dee438fed052b52e4f98f76c5790513235efaa1ef7f3f2192c392cd7c91b65"}, + {file = "websockets-15.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:d5f6b181bb38171a8ad1d6aa58a67a6aa9d4b38d0f8c5f496b9e42561dfc62fe"}, + {file = "websockets-15.0.1-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:5d54b09eba2bada6011aea5375542a157637b91029687eb4fdb2dab11059c1b4"}, + {file = "websockets-15.0.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:3be571a8b5afed347da347bfcf27ba12b069d9d7f42cb8c7028b5e98bbb12597"}, + {file = "websockets-15.0.1-cp312-cp312-win32.whl", hash = "sha256:c338ffa0520bdb12fbc527265235639fb76e7bc7faafbb93f6ba80d9c06578a9"}, + {file = "websockets-15.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:fcd5cf9e305d7b8338754470cf69cf81f420459dbae8a3b40cee57417f4614a7"}, + {file = "websockets-15.0.1-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:ee443ef070bb3b6ed74514f5efaa37a252af57c90eb33b956d35c8e9c10a1931"}, + {file = "websockets-15.0.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:5a939de6b7b4e18ca683218320fc67ea886038265fd1ed30173f5ce3f8e85675"}, + {file = "websockets-15.0.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:746ee8dba912cd6fc889a8147168991d50ed70447bf18bcda7039f7d2e3d9151"}, + {file = "websockets-15.0.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:595b6c3969023ecf9041b2936ac3827e4623bfa3ccf007575f04c5a6aa318c22"}, + {file = "websockets-15.0.1-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3c714d2fc58b5ca3e285461a4cc0c9a66bd0e24c5da9911e30158286c9b5be7f"}, + {file = "websockets-15.0.1-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0f3c1e2ab208db911594ae5b4f79addeb3501604a165019dd221c0bdcabe4db8"}, + {file = "websockets-15.0.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:229cf1d3ca6c1804400b0a9790dc66528e08a6a1feec0d5040e8b9eb14422375"}, + {file = "websockets-15.0.1-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:756c56e867a90fb00177d530dca4b097dd753cde348448a1012ed6c5131f8b7d"}, + {file = "websockets-15.0.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:558d023b3df0bffe50a04e710bc87742de35060580a293c2a984299ed83bc4e4"}, + {file = "websockets-15.0.1-cp313-cp313-win32.whl", hash = "sha256:ba9e56e8ceeeedb2e080147ba85ffcd5cd0711b89576b83784d8605a7df455fa"}, + {file = "websockets-15.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:e09473f095a819042ecb2ab9465aee615bd9c2028e4ef7d933600a8401c79561"}, + {file = 
"websockets-15.0.1-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:5f4c04ead5aed67c8a1a20491d54cdfba5884507a48dd798ecaf13c74c4489f5"}, + {file = "websockets-15.0.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:abdc0c6c8c648b4805c5eacd131910d2a7f6455dfd3becab248ef108e89ab16a"}, + {file = "websockets-15.0.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:a625e06551975f4b7ea7102bc43895b90742746797e2e14b70ed61c43a90f09b"}, + {file = "websockets-15.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d591f8de75824cbb7acad4e05d2d710484f15f29d4a915092675ad3456f11770"}, + {file = "websockets-15.0.1-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:47819cea040f31d670cc8d324bb6435c6f133b8c7a19ec3d61634e62f8d8f9eb"}, + {file = "websockets-15.0.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ac017dd64572e5c3bd01939121e4d16cf30e5d7e110a119399cf3133b63ad054"}, + {file = "websockets-15.0.1-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:4a9fac8e469d04ce6c25bb2610dc535235bd4aa14996b4e6dbebf5e007eba5ee"}, + {file = "websockets-15.0.1-cp39-cp39-musllinux_1_2_i686.whl", hash = "sha256:363c6f671b761efcb30608d24925a382497c12c506b51661883c3e22337265ed"}, + {file = "websockets-15.0.1-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:2034693ad3097d5355bfdacfffcbd3ef5694f9718ab7f29c29689a9eae841880"}, + {file = "websockets-15.0.1-cp39-cp39-win32.whl", hash = "sha256:3b1ac0d3e594bf121308112697cf4b32be538fb1444468fb0a6ae4feebc83411"}, + {file = "websockets-15.0.1-cp39-cp39-win_amd64.whl", hash = "sha256:b7643a03db5c95c799b89b31c036d5f27eeb4d259c798e878d6937d71832b1e4"}, + {file = "websockets-15.0.1-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:0c9e74d766f2818bb95f84c25be4dea09841ac0f734d1966f415e4edfc4ef1c3"}, + {file = "websockets-15.0.1-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:1009ee0c7739c08a0cd59de430d6de452a55e42d6b522de7aa15e6f67db0b8e1"}, + {file = "websockets-15.0.1-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:76d1f20b1c7a2fa82367e04982e708723ba0e7b8d43aa643d3dcd404d74f1475"}, + {file = "websockets-15.0.1-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f29d80eb9a9263b8d109135351caf568cc3f80b9928bccde535c235de55c22d9"}, + {file = "websockets-15.0.1-pp310-pypy310_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b359ed09954d7c18bbc1680f380c7301f92c60bf924171629c5db97febb12f04"}, + {file = "websockets-15.0.1-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:cad21560da69f4ce7658ca2cb83138fb4cf695a2ba3e475e0559e05991aa8122"}, + {file = "websockets-15.0.1-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:7f493881579c90fc262d9cdbaa05a6b54b3811c2f300766748db79f098db9940"}, + {file = "websockets-15.0.1-pp39-pypy39_pp73-macosx_11_0_arm64.whl", hash = "sha256:47b099e1f4fbc95b701b6e85768e1fcdaf1630f3cbe4765fa216596f12310e2e"}, + {file = "websockets-15.0.1-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:67f2b6de947f8c757db2db9c71527933ad0019737ec374a8a6be9a956786aaf9"}, + {file = "websockets-15.0.1-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d08eb4c2b7d6c41da6ca0600c077e93f5adcfd979cd777d747e9ee624556da4b"}, + {file = 
"websockets-15.0.1-pp39-pypy39_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4b826973a4a2ae47ba357e4e82fa44a463b8f168e1ca775ac64521442b19e87f"}, + {file = "websockets-15.0.1-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:21c1fa28a6a7e3cbdc171c694398b6df4744613ce9b36b1a498e816787e28123"}, + {file = "websockets-15.0.1-py3-none-any.whl", hash = "sha256:f7a866fbc1e97b5c617ee4116daaa09b722101d4a3c170c787450ba409f9736f"}, + {file = "websockets-15.0.1.tar.gz", hash = "sha256:82544de02076bafba038ce055ee6412d68da13ab47f0c60cab827346de828dee"}, ] [[package]] @@ -3440,4 +3423,4 @@ microphone = ["sounddevice"] [metadata] lock-version = "2.0" python-versions = ">=3.9,<4" -content-hash = "ac3ed54f9220a726793d76d482bfd8200f55fa4620eebc890038d5687305c72b" +content-hash = "0d8c6b62783e6e3a1447345a49dc0a1175830c1bdf378d69c16628085eed419f" diff --git a/pyproject.toml b/pyproject.toml index 62bb52e1..08b91fd1 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -3,7 +3,7 @@ name = "hume" [tool.poetry] name = "hume" -version = "0.13.5" +version = "0.13.6" description = "A Python SDK for Hume AI" readme = "README.md" authors = [] @@ -68,7 +68,7 @@ pydantic = ">= 1.9.2" pydantic-core = ">=2.18.2" sounddevice = { version = "^0.4.6", optional = true} typing_extensions = ">= 4.0.0" -websockets = "^13.1" +websockets = ">=12.0" [tool.poetry.group.dev.dependencies] mypy = "==1.13.0" diff --git a/reference.md b/reference.md index 67631787..16d2f1dc 100644 --- a/reference.md +++ b/reference.md @@ -1,6 +1,6 @@ # Reference -## Tts -
client.tts.synthesize_json(...) +## EmpathicVoice ControlPlane +
client.empathic_voice.control_plane.send(...)
@@ -12,9 +12,7 @@
-Synthesizes one or more input texts into speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. - -The response includes the base64-encoded audio and metadata in JSON format. +Send a message to a specific chat.
@@ -30,28 +28,14 @@ The response includes the base64-encoded audio and metadata in JSON format. ```python from hume import HumeClient -from hume.tts import FormatMp3, PostedContextWithUtterances, PostedUtterance +from hume.empathic_voice import SessionSettings client = HumeClient( api_key="YOUR_API_KEY", ) -client.tts.synthesize_json( - context=PostedContextWithUtterances( - utterances=[ - PostedUtterance( - text="How can people see beauty so differently?", - description="A curious student with a clear and respectful tone, seeking clarification on Hume's ideas with a straightforward question.", - ) - ], - ), - format=FormatMp3(), - num_generations=1, - utterances=[ - PostedUtterance( - text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", - description="Middle-aged masculine voice with a clear, rhythmic Scots lilt, rounded vowels, and a warm, steady tone with an articulate, academic quality.", - ) - ], +client.empathic_voice.control_plane.send( + chat_id="chat_id", + request=SessionSettings(), ) ``` @@ -68,11 +52,7 @@ client.tts.synthesize_json(
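If `ControlPlanePublishEvent` covers the same message types as the socket-side `Publish` union defined earlier in this diff, the same endpoint can presumably also pause and resume EVI for an active chat. A minimal sketch under that assumption — the `PauseAssistantMessage` and `ResumeAssistantMessage` imports mirror the `SessionSettings` import used in the generated example and are not confirmed by this hunk:

```python
from hume import HumeClient
from hume.empathic_voice import PauseAssistantMessage, ResumeAssistantMessage

client = HumeClient(api_key="YOUR_API_KEY")

# Assumption: pause events are accepted by the control plane.
# While paused, EVI stays silent but transcription continues.
client.empathic_voice.control_plane.send(
    chat_id="chat_id",
    request=PauseAssistantMessage(),
)

# Later, resume responses; per the docs above, EVI keeps the context from
# messages sent while paused but replies only to the last user message.
client.empathic_voice.control_plane.send(
    chat_id="chat_id",
    request=ResumeAssistantMessage(),
)
```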
-**utterances:** `typing.Sequence[PostedUtterance]` - -A list of **Utterances** to be converted to speech output. - -An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`. +**chat_id:** `str`
@@ -80,7 +60,7 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. +**request:** `ControlPlanePublishEvent`
@@ -88,51 +68,82 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**format:** `typing.Optional[Format]` — Specifies the output audio file format. +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration.
+ +
-
-
-**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests. -
+
+## EmpathicVoice ChatGroups +
client.empathic_voice.chat_groups.list_chat_groups(...)
-**num_generations:** `typing.Optional[int]` +#### 📝 Description -Number of audio generations to produce from the input utterances. +
+
-Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - +
+
+ +Fetches a paginated list of **Chat Groups**.
+
+
+ +#### 🔌 Usage
-**split_utterances:** `typing.Optional[bool]` - -Controls how audio output is segmented in the response. +
+
-- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments. +```python +from hume import HumeClient -- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets. +client = HumeClient( + api_key="YOUR_API_KEY", +) +response = client.empathic_voice.chat_groups.list_chat_groups( + page_number=0, + page_size=1, + ascending_order=True, + config_id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", +) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page -This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When setting to `false`, avoid including utterances with long `text`, as this can result in distorted output. - +``` +
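For completeness, the same pager can presumably be consumed from the async client as well. A sketch assuming `AsyncHumeClient` mirrors the sync surface and that the returned pager supports `async for` (neither is shown in this hunk):

```python
import asyncio

from hume import AsyncHumeClient


async def main() -> None:
    client = AsyncHumeClient(api_key="YOUR_API_KEY")
    response = await client.empathic_voice.chat_groups.list_chat_groups(
        page_number=0,
        page_size=10,
        ascending_order=True,
    )
    # Iterate items across pages as they are fetched lazily.
    async for item in response:
        print(item.id)


asyncio.run(main())
```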
+
+#### ⚙️ Parameters +
-**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). +
+
+ +**page_number:** `typing.Optional[int]` + +Specifies the page number to retrieve, enabling pagination. + +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -140,13 +151,19 @@ This setting affects how the `snippets` array is structured in the response, whi
-**version:** `typing.Optional[OctaveVersion]` +**page_size:** `typing.Optional[int]` -Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. +Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. -Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. + +
+
-For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. +
+
+ +**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true.
@@ -154,12 +171,11 @@ For a comparison of Octave versions, see the [Octave versions](/docs/text-to-spe
-**instant_mode:** `typing.Optional[bool]` +**config_id:** `typing.Optional[str]` -Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). -- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. -- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). -- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted). +The unique identifier for an EVI configuration. + +Filter Chat Groups to only include Chats that used this `config_id` in their most recent Chat.
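To show how these pagination parameters combine in practice, here is a minimal sketch (assuming a valid API key) that iterates the pager returned by `list_chat_groups` to collect every Chat Group that used a given config:

```python
from hume import HumeClient

client = HumeClient(api_key="YOUR_API_KEY")

# Iterating the pager yields Chat Groups across pages; `page_size` controls
# how many are fetched per request, and `config_id` narrows the results to
# Chat Groups whose most recent Chat used that configuration.
pager = client.empathic_voice.chat_groups.list_chat_groups(
    page_number=0,
    page_size=25,
    ascending_order=True,
    config_id="1b60e1a0-cc59-424a-8d2c-189d354db3f3",
)
all_chat_groups = list(pager)
print(f"Fetched {len(all_chat_groups)} chat groups")
```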
@@ -179,7 +195,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-
client.tts.synthesize_file(...) +
client.empathic_voice.chat_groups.get_chat_group(...)
@@ -191,9 +207,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-Synthesizes one or more input texts into speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. - -The response contains the generated audio file in the requested format. +Fetches a **ChatGroup** by ID, including a paginated list of **Chats** associated with the **ChatGroup**.
@@ -209,23 +223,15 @@ The response contains the generated audio file in the requested format. ```python from hume import HumeClient -from hume.tts import FormatMp3, PostedContextWithGenerationId, PostedUtterance client = HumeClient( api_key="YOUR_API_KEY", ) -client.tts.synthesize_file( - context=PostedContextWithGenerationId( - generation_id="09ad914d-8e7f-40f8-a279-e34f07f7dab2", - ), - format=FormatMp3(), - num_generations=1, - utterances=[ - PostedUtterance( - text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", - description="Middle-aged masculine voice with a clear, rhythmic Scots lilt, rounded vowels, and a warm, steady tone with an articulate, academic quality.", - ) - ], +client.empathic_voice.chat_groups.get_chat_group( + id="697056f0-6c7e-487d-9bd8-9c19df79f05f", + page_number=0, + page_size=1, + ascending_order=True, ) ``` @@ -242,11 +248,19 @@ client.tts.synthesize_file(
-**utterances:** `typing.Sequence[PostedUtterance]` +**id:** `str` — Identifier for a Chat Group. Formatted as a UUID. + +
+
-A list of **Utterances** to be converted to speech output. +
+
-An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`. +**page_size:** `typing.Optional[int]` + +Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. + +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10.
@@ -254,7 +268,11 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. +**page_number:** `typing.Optional[int]` + +Specifies the page number to retrieve, enabling pagination. + +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -262,7 +280,7 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**format:** `typing.Optional[Format]` — Specifies the output audio file format. +**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true.
@@ -270,35 +288,72 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests. +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration.
+ +
+ + + +
+ +
client.empathic_voice.chat_groups.get_audio(...)
-**num_generations:** `typing.Optional[int]` +#### 📝 Description -Number of audio generations to produce from the input utterances. +
+
-Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - +
+
+ +Fetches a paginated list of audio for each **Chat** within the specified **Chat Group**. For more details, see our guide on audio reconstruction [here](/docs/speech-to-speech-evi/faq#can-i-access-the-audio-of-previous-conversations-with-evi). +
+
+#### 🔌 Usage +
-**split_utterances:** `typing.Optional[bool]` +
+
-Controls how audio output is segmented in the response. +```python +from hume import HumeClient -- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments. +client = HumeClient( + api_key="YOUR_API_KEY", +) +client.empathic_voice.chat_groups.get_audio( + id="369846cf-6ad5-404d-905e-a8acb5cdfc78", + page_number=0, + page_size=10, + ascending_order=True, +) -- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets. +``` +
+
+
+
-This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When setting to `false`, avoid including utterances with long `text`, as this can result in distorted output. +#### ⚙️ Parameters + +
+
+ +
+
+ +**id:** `str` — Identifier for a Chat Group. Formatted as a UUID.
@@ -306,7 +361,11 @@ This setting affects how the `snippets` array is structured in the response, whi
-**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). +**page_number:** `typing.Optional[int]` + +Specifies the page number to retrieve, enabling pagination. + +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -314,13 +373,11 @@ This setting affects how the `snippets` array is structured in the response, whi
-**version:** `typing.Optional[OctaveVersion]` - -Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. +**page_size:** `typing.Optional[int]` -Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. +Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. -For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10.
@@ -328,12 +385,7 @@ For a comparison of Octave versions, see the [Octave versions](/docs/text-to-spe
-**instant_mode:** `typing.Optional[bool]` - -Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). -- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. -- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). -- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted). +**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true.
@@ -341,7 +393,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response. +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration.
@@ -353,7 +405,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-
client.tts.synthesize_file_streaming(...) +
client.empathic_voice.chat_groups.list_chat_group_events(...)
@@ -365,7 +417,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-Streams synthesized speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. +Fetches a paginated list of **Chat** events associated with a **Chat Group**.
@@ -381,22 +433,21 @@ Streams synthesized speech using the specified voice. If no voice is provided, a ```python from hume import HumeClient -from hume.tts import PostedUtterance, PostedUtteranceVoiceWithName client = HumeClient( api_key="YOUR_API_KEY", ) -client.tts.synthesize_file_streaming( - utterances=[ - PostedUtterance( - text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", - voice=PostedUtteranceVoiceWithName( - name="Male English Actor", - provider="HUME_AI", - ), - ) - ], +response = client.empathic_voice.chat_groups.list_chat_group_events( + id="697056f0-6c7e-487d-9bd8-9c19df79f05f", + page_number=0, + page_size=3, + ascending_order=True, ) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ``` @@ -412,47 +463,7 @@ client.tts.synthesize_file_streaming(
-**utterances:** `typing.Sequence[PostedUtterance]` - -A list of **Utterances** to be converted to speech output. - -An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`. - -
-
- -
-
- -**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. - -
-
- -
-
- -**format:** `typing.Optional[Format]` — Specifies the output audio file format. - -
-
- -
-
- -**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests. - -
-
- -
-
- -**num_generations:** `typing.Optional[int]` - -Number of audio generations to produce from the input utterances. - -Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. +**id:** `str` — Identifier for a Chat Group. Formatted as a UUID.
@@ -460,23 +471,11 @@ Using `num_generations` enables faster processing than issuing multiple sequenti
-**split_utterances:** `typing.Optional[bool]` - -Controls how audio output is segmented in the response. - -- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments. - -- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets. - -This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When setting to `false`, avoid including utterances with long `text`, as this can result in distorted output. - -
-
+**page_size:** `typing.Optional[int]` -
-
+Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. -**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10.
@@ -484,13 +483,11 @@ This setting affects how the `snippets` array is structured in the response, whi
-**version:** `typing.Optional[OctaveVersion]` - -Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. +**page_number:** `typing.Optional[int]` -Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. +Specifies the page number to retrieve, enabling pagination. -For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -498,12 +495,7 @@ For a comparison of Octave versions, see the [Octave versions](/docs/text-to-spe
-**instant_mode:** `typing.Optional[bool]` - -Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). -- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. -- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). -- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted). +**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true.
@@ -511,7 +503,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response. +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration.
@@ -523,7 +515,8 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-
client.tts.synthesize_json_streaming(...) +## EmpathicVoice Chats +
client.empathic_voice.chats.list_chats(...)
@@ -535,9 +528,7 @@ Enables ultra-low latency streaming, significantly reducing the time until the f
-Streams synthesized speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. - -The response is a stream of JSON objects including audio encoded in base64. +Fetches a paginated list of **Chats**.
@@ -553,24 +544,20 @@ The response is a stream of JSON objects including audio encoded in base64. ```python from hume import HumeClient -from hume.tts import PostedUtterance, PostedUtteranceVoiceWithName client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.tts.synthesize_json_streaming( - utterances=[ - PostedUtterance( - text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", - voice=PostedUtteranceVoiceWithName( - name="Male English Actor", - provider="HUME_AI", - ), - ) - ], +response = client.empathic_voice.chats.list_chats( + page_number=0, + page_size=1, + ascending_order=True, ) -for chunk in response.data: - yield chunk +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ``` @@ -586,11 +573,11 @@ for chunk in response.data:
-**utterances:** `typing.Sequence[PostedUtterance]` +**page_number:** `typing.Optional[int]` -A list of **Utterances** to be converted to speech output. +Specifies the page number to retrieve, enabling pagination. -An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`. +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -598,15 +585,11 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. - -
-
+**page_size:** `typing.Optional[int]` -
-
+Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. -**format:** `typing.Optional[Format]` — Specifies the output audio file format. +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10.
@@ -614,7 +597,7 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests. +**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true.
@@ -622,11 +605,7 @@ An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overvi
-**num_generations:** `typing.Optional[int]` - -Number of audio generations to produce from the input utterances. - -Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. +**config_id:** `typing.Optional[str]` — Filter to only include chats that used this config.
@@ -634,72 +613,35 @@ Using `num_generations` enables faster processing than issuing multiple sequenti
-**split_utterances:** `typing.Optional[bool]` - -Controls how audio output is segmented in the response. - -- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments. - -- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets. - -This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When setting to `false`, avoid including utterances with long `text`, as this can result in distorted output. +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration.
- -
-
- -**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). -
-
-
- -**version:** `typing.Optional[OctaveVersion]` - -Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. - -Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. -For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. -
+
+
client.empathic_voice.chats.list_chat_events(...)
-**instant_mode:** `typing.Optional[bool]` +#### 📝 Description -Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). -- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. -- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). -- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted). - -
-
+
+
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. - -
-
+Fetches a paginated list of **Chat** events.
- - -
- -
client.tts.convert_voice_json(...) -
-
#### 🔌 Usage @@ -715,9 +657,17 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.tts.convert_voice_json() -for chunk in response.data: - yield chunk +response = client.empathic_voice.chats.list_chat_events( + id="470a49f6-1dec-4afe-8b61-035d3b2d63b0", + page_number=0, + page_size=3, + ascending_order=True, +) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ```
@@ -733,7 +683,7 @@ for chunk in response.data:
-**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). +**id:** `str` — Identifier for a Chat. Formatted as a UUID.
@@ -741,17 +691,11 @@ for chunk in response.data:
-**audio:** `from __future__ import annotations - -typing.Optional[core.File]` — See core.File for more documentation - -
-
+**page_size:** `typing.Optional[int]` -
-
+Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. -**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10.
@@ -759,15 +703,11 @@ typing.Optional[core.File]` — See core.File for more documentation
-**voice:** `typing.Optional[PostedUtteranceVoice]` - -
-
+**page_number:** `typing.Optional[int]` -
-
+Specifies the page number to retrieve, enabling pagination. -**format:** `typing.Optional[Format]` — Specifies the output audio file format. +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -775,7 +715,7 @@ typing.Optional[core.File]` — See core.File for more documentation
-**include_timestamp_types:** `typing.Optional[typing.List[TimestampType]]` — The set of timestamp types to include in the response. +**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true.
@@ -795,8 +735,7 @@ typing.Optional[core.File]` — See core.File for more documentation
-## Tts Voices -
client.tts.voices.list(...) +
client.empathic_voice.chats.get_audio(...)
@@ -808,7 +747,7 @@ typing.Optional[core.File]` — See core.File for more documentation
-Lists voices you have saved in your account, or voices from the [Voice Library](https://platform.hume.ai/tts/voice-library). +Fetches the audio of a previous **Chat**. For more details, see our guide on audio reconstruction [here](/docs/speech-to-speech-evi/faq#can-i-access-the-audio-of-previous-conversations-with-evi).
@@ -828,14 +767,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.tts.voices.list( - provider="CUSTOM_VOICE", +client.empathic_voice.chats.get_audio( + id="470a49f6-1dec-4afe-8b61-035d3b2d63b0", ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -851,44 +785,7 @@ for page in response.iter_pages():
-**provider:** `VoiceProvider` - -Specify the voice provider to filter voices returned by the endpoint: - -- **`HUME_AI`**: Lists preset, shared voices from Hume's [Voice Library](https://platform.hume.ai/tts/voice-library). -- **`CUSTOM_VOICE`**: Lists custom voices created and saved to your account. - -
-
- -
-
- -**page_number:** `typing.Optional[int]` - -Specifies the page number to retrieve, enabling pagination. - -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. - -
-
- -
-
- -**page_size:** `typing.Optional[int]` - -Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. - -For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. - -
-
- -
-
- -**ascending_order:** `typing.Optional[bool]` +**id:** `str` — Identifier for a chat. Formatted as a UUID.
@@ -908,7 +805,8 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.tts.voices.create(...) +## EmpathicVoice Configs +
client.empathic_voice.configs.list_configs(...)
@@ -920,9 +818,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Saves a new custom voice to your account using the specified TTS generation ID. +Fetches a paginated list of **Configs**. -Once saved, this voice can be reused in subsequent TTS requests, ensuring consistent speech style and prosody. For more details on voice creation, see the [Voices Guide](/docs/text-to-speech-tts/voices). +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -942,10 +840,15 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.tts.voices.create( - generation_id="795c949a-1510-4a80-9646-7d0863b023ab", - name="David Hume", +response = client.empathic_voice.configs.list_configs( + page_number=0, + page_size=1, ) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ``` @@ -961,7 +864,11 @@ client.tts.voices.create(
-**generation_id:** `str` — A unique ID associated with this TTS generation that can be used as context for generating consistent speech style and prosody across multiple requests. +**page_number:** `typing.Optional[int]` + +Specifies the page number to retrieve, enabling pagination. + +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -969,7 +876,27 @@ client.tts.voices.create(
-**name:** `str` — Name of the voice in the `Voice Library`. +**page_size:** `typing.Optional[int]` + +Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. + +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. + +
+
+ +
+
+
+**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each config. To include all versions of each config in the list, set `restrict_to_most_recent` to false.
+
+
+
+ +
+
+ +**name:** `typing.Optional[str]` — Filter to only include configs with this name.
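For example, a small sketch (assuming a valid API key) that uses the `name` filter to look up Configs by name:

```python
from hume import HumeClient

client = HumeClient(api_key="YOUR_API_KEY")

# Only configs whose name matches the filter are returned; by default only
# the most recent version of each matching config is included.
matching = client.empathic_voice.configs.list_configs(
    name="Weather Assistant Config",
)
for config in matching:
    print(config)
```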
@@ -989,7 +916,7 @@ client.tts.voices.create(
-
client.tts.voices.delete(...) +
client.empathic_voice.configs.create_config(...)
@@ -1001,7 +928,9 @@ client.tts.voices.create(
-Deletes a previously generated custom voice. +Creates a **Config** which can be applied to EVI. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1017,12 +946,47 @@ Deletes a previously generated custom voice. ```python from hume import HumeClient +from hume.empathic_voice import ( + PostedConfigPromptSpec, + PostedEventMessageSpec, + PostedEventMessageSpecs, + PostedLanguageModel, + VoiceName, +) client = HumeClient( api_key="YOUR_API_KEY", ) -client.tts.voices.delete( - name="David Hume", +client.empathic_voice.configs.create_config( + name="Weather Assistant Config", + prompt=PostedConfigPromptSpec( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", + version=0, + ), + evi_version="3", + voice=VoiceName( + provider="HUME_AI", + name="Ava Song", + ), + language_model=PostedLanguageModel( + model_provider="ANTHROPIC", + model_resource="claude-3-7-sonnet-latest", + temperature=1.0, + ), + event_messages=PostedEventMessageSpecs( + on_new_chat=PostedEventMessageSpec( + enabled=False, + text="", + ), + on_inactivity_timeout=PostedEventMessageSpec( + enabled=False, + text="", + ), + on_max_duration_timeout=PostedEventMessageSpec( + enabled=False, + text="", + ), + ), ) ``` @@ -1039,7 +1003,7 @@ client.tts.voices.delete(
-**name:** `str` — Name of the voice to delete +**evi_version:** `str` — EVI version to use. Only versions `3` and `4-mini` are supported.
@@ -1047,72 +1011,95 @@ client.tts.voices.delete(
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. +**name:** `str` — Name applied to all versions of a particular Config.
- -
+
+
+**builtin_tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedBuiltinTool]]]` — List of built-in tools associated with this Config. +
-
-## EmpathicVoice ControlPlane -
client.empathic_voice.control_plane.send(...)
-#### 📝 Description +**ellm_model:** `typing.Optional[PostedEllmModel]` -
-
+The eLLM setup associated with this Config. + +Hume's eLLM (empathic Large Language Model) is a multimodal language model that takes into account both expression measures and language. The eLLM generates short, empathic language responses and guides text-to-speech (TTS) prosody. + +
+
-Send a message to a specific chat. +**event_messages:** `typing.Optional[PostedEventMessageSpecs]` +
+ +
+
+ +**language_model:** `typing.Optional[PostedLanguageModel]` + +The supplemental language model associated with this Config. + +This model is used to generate longer, more detailed responses from EVI. Choosing an appropriate supplemental language model for your use case is crucial for generating fast, high-quality responses from EVI. +
-#### 🔌 Usage -
+**nudges:** `typing.Optional[PostedNudgeSpec]` — Configures nudges, brief audio prompts that can guide conversations when users pause or need encouragement to continue speaking. Nudges help create more natural, flowing interactions by providing gentle conversational cues. + +
+
+
-```python -from hume import HumeClient -from hume.empathic_voice import SessionSettings +**prompt:** `typing.Optional[PostedConfigPromptSpec]` + +
+
-client = HumeClient( - api_key="YOUR_API_KEY", -) -client.empathic_voice.control_plane.send( - chat_id="chat_id", - request=SessionSettings(), -) +
+
-``` +**timeouts:** `typing.Optional[PostedTimeoutSpecs]` +
+ +
+
+ +**tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedUserDefinedToolSpec]]]` — List of user-defined tools associated with this Config. +
-#### ⚙️ Parameters -
+**version_description:** `typing.Optional[str]` — An optional description of the Config version. + +
+
+
-**chat_id:** `str` +**voice:** `typing.Optional[VoiceRef]` — A voice specification associated with this Config.
@@ -1120,7 +1107,7 @@ client.empathic_voice.control_plane.send(
-**request:** `ControlPlanePublishEvent` +**webhooks:** `typing.Optional[typing.Sequence[typing.Optional[PostedWebhookSpec]]]` — Webhook config specifications for each subscriber.
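Since every field other than `name` and `evi_version` is optional, a minimal sketch of creating a Config (assuming a valid API key) only needs those two values; the remaining settings take their defaults:

```python
from hume import HumeClient

client = HumeClient(api_key="YOUR_API_KEY")

# name and evi_version are the only required fields; prompt, voice,
# language_model, and the other optional settings can be added later by
# creating a new version of the Config.
config = client.empathic_voice.configs.create_config(
    name="Weather Assistant Config",
    evi_version="3",
)
print(config)
```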
@@ -1140,8 +1127,7 @@ client.empathic_voice.control_plane.send(
-## EmpathicVoice ChatGroups -
client.empathic_voice.chat_groups.list_chat_groups(...) +
client.empathic_voice.configs.list_config_versions(...)
@@ -1153,7 +1139,9 @@ client.empathic_voice.control_plane.send(
-Fetches a paginated list of **Chat Groups**. +Fetches a list of a **Config's** versions. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1173,11 +1161,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.chat_groups.list_chat_groups( - page_number=0, - page_size=1, - ascending_order=True, - config_id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", +response = client.empathic_voice.configs.list_config_versions( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", ) for item in response: yield item @@ -1199,6 +1184,14 @@ for page in response.iter_pages():
+**id:** `str` — Identifier for a Config. Formatted as a UUID. + +
+
+ +
+
+ **page_number:** `typing.Optional[int]` Specifies the page number to retrieve, enabling pagination. @@ -1223,19 +1216,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true. - -
-
- -
-
- -**config_id:** `typing.Optional[str]` - -The unique identifier for an EVI configuration. - -Filter Chat Groups to only include Chats that used this `config_id` in their most recent Chat. +**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each config. To include all versions of each config in the list, set `restrict_to_most_recent` to false.
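To illustrate, a brief sketch (assuming a valid API key) that lists every version of a Config rather than just the latest one by setting `restrict_to_most_recent` to `False`:

```python
from hume import HumeClient

client = HumeClient(api_key="YOUR_API_KEY")

# With restrict_to_most_recent=False, all versions of the Config are
# returned instead of only the most recent one.
versions = client.empathic_voice.configs.list_config_versions(
    id="1b60e1a0-cc59-424a-8d2c-189d354db3f3",
    restrict_to_most_recent=False,
)
for version in versions:
    print(version)
```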
@@ -1255,7 +1236,7 @@ Filter Chat Groups to only include Chats that used this `config_id` in their mos
-
client.empathic_voice.chat_groups.get_chat_group(...) +
client.empathic_voice.configs.create_config_version(...)
@@ -1267,7 +1248,9 @@ Filter Chat Groups to only include Chats that used this `config_id` in their mos
-Fetches a **ChatGroup** by ID, including a paginated list of **Chats** associated with the **ChatGroup**. +Updates a **Config** by creating a new version of the **Config**. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1283,15 +1266,52 @@ Fetches a **ChatGroup** by ID, including a paginated list of **Chats** associate ```python from hume import HumeClient - -client = HumeClient( - api_key="YOUR_API_KEY", -) -client.empathic_voice.chat_groups.get_chat_group( - id="697056f0-6c7e-487d-9bd8-9c19df79f05f", - page_number=0, - page_size=1, - ascending_order=True, +from hume.empathic_voice import ( + PostedConfigPromptSpec, + PostedEllmModel, + PostedEventMessageSpec, + PostedEventMessageSpecs, + PostedLanguageModel, + VoiceName, +) + +client = HumeClient( + api_key="YOUR_API_KEY", +) +client.empathic_voice.configs.create_config_version( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", + version_description="This is an updated version of the Weather Assistant Config.", + evi_version="3", + prompt=PostedConfigPromptSpec( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", + version=0, + ), + voice=VoiceName( + provider="HUME_AI", + name="Ava Song", + ), + language_model=PostedLanguageModel( + model_provider="ANTHROPIC", + model_resource="claude-3-7-sonnet-latest", + temperature=1.0, + ), + ellm_model=PostedEllmModel( + allow_short_responses=True, + ), + event_messages=PostedEventMessageSpecs( + on_new_chat=PostedEventMessageSpec( + enabled=False, + text="", + ), + on_inactivity_timeout=PostedEventMessageSpec( + enabled=False, + text="", + ), + on_max_duration_timeout=PostedEventMessageSpec( + enabled=False, + text="", + ), + ), ) ``` @@ -1308,7 +1328,7 @@ client.empathic_voice.chat_groups.get_chat_group(
-**id:** `str` — Identifier for a Chat Group. Formatted as a UUID. +**id:** `str` — Identifier for a Config. Formatted as a UUID.
@@ -1316,11 +1336,15 @@ client.empathic_voice.chat_groups.get_chat_group(
-**page_size:** `typing.Optional[int]` +**evi_version:** `str` — The version of the EVI used with this config. + +
+
-Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. +
+
-For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. +**builtin_tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedBuiltinTool]]]` — List of built-in tools associated with this Config version.
@@ -1328,11 +1352,11 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**page_number:** `typing.Optional[int]` +**ellm_model:** `typing.Optional[PostedEllmModel]` -Specifies the page number to retrieve, enabling pagination. +The eLLM setup associated with this Config version. -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. +Hume's eLLM (empathic Large Language Model) is a multimodal language model that takes into account both expression measures and language. The eLLM generates short, empathic language responses and guides text-to-speech (TTS) prosody.
@@ -1340,7 +1364,7 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true. +**event_messages:** `typing.Optional[PostedEventMessageSpecs]`
@@ -1348,72 +1372,43 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. - -
-
- -
+**language_model:** `typing.Optional[PostedLanguageModel]` +The supplemental language model associated with this Config version. +This model is used to generate longer, more detailed responses from EVI. Choosing an appropriate supplemental language model for your use case is crucial for generating fast, high-quality responses from EVI. + -
-
client.empathic_voice.chat_groups.get_audio(...)
-#### 📝 Description - -
-
+**nudges:** `typing.Optional[PostedNudgeSpec]` + +
+
-Fetches a paginated list of audio for each **Chat** within the specified **Chat Group**. For more details, see our guide on audio reconstruction [here](/docs/speech-to-speech-evi/faq#can-i-access-the-audio-of-previous-conversations-with-evi). -
-
+**prompt:** `typing.Optional[PostedConfigPromptSpec]` +
-#### 🔌 Usage -
-
-
- -```python -from hume import HumeClient - -client = HumeClient( - api_key="YOUR_API_KEY", -) -client.empathic_voice.chat_groups.get_audio( - id="369846cf-6ad5-404d-905e-a8acb5cdfc78", - page_number=0, - page_size=10, - ascending_order=True, -) - -``` -
-
+**timeouts:** `typing.Optional[PostedTimeoutSpecs]` +
-#### ⚙️ Parameters - -
-
-
-**id:** `str` — Identifier for a Chat Group. Formatted as a UUID. +**tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedUserDefinedToolSpec]]]` — List of user-defined tools associated with this Config version.
@@ -1421,11 +1416,7 @@ client.empathic_voice.chat_groups.get_audio(
-**page_number:** `typing.Optional[int]` - -Specifies the page number to retrieve, enabling pagination. - -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. +**version_description:** `typing.Optional[str]` — An optional description of the Config version.
@@ -1433,11 +1424,7 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-**page_size:** `typing.Optional[int]` - -Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. - -For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. +**voice:** `typing.Optional[VoiceRef]` — A voice specification associated with this Config version.
@@ -1445,7 +1432,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true. +**webhooks:** `typing.Optional[typing.Sequence[typing.Optional[PostedWebhookSpec]]]` — Webhook config specifications for each subscriber.
@@ -1465,7 +1452,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.chat_groups.list_chat_group_events(...) +
client.empathic_voice.configs.delete_config(...)
@@ -1477,7 +1464,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Fetches a paginated list of **Chat** events associated with a **Chat Group**. +Deletes a **Config** and its versions. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1497,17 +1486,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.chat_groups.list_chat_group_events( - id="697056f0-6c7e-487d-9bd8-9c19df79f05f", - page_number=0, - page_size=3, - ascending_order=True, +client.empathic_voice.configs.delete_config( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -1523,39 +1504,7 @@ for page in response.iter_pages():
-**id:** `str` — Identifier for a Chat Group. Formatted as a UUID. - -
-
- -
-
- -**page_size:** `typing.Optional[int]` - -Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. - -For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. - -
-
- -
-
- -**page_number:** `typing.Optional[int]` - -Specifies the page number to retrieve, enabling pagination. - -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. - -
-
- -
-
- -**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true. +**id:** `str` — Identifier for a Config. Formatted as a UUID.
@@ -1575,8 +1524,7 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-## EmpathicVoice Chats -
client.empathic_voice.chats.list_chats(...) +
client.empathic_voice.configs.update_config_name(...)
@@ -1588,7 +1536,9 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-Fetches a paginated list of **Chats**. +Updates the name of a **Config**. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1608,16 +1558,10 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.chats.list_chats( - page_number=0, - page_size=1, - ascending_order=True, +client.empathic_voice.configs.update_config_name( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", + name="Updated Weather Assistant Config Name", ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -1633,31 +1577,7 @@ for page in response.iter_pages():
-**page_number:** `typing.Optional[int]` - -Specifies the page number to retrieve, enabling pagination. - -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. - -
-
- -
-
- -**page_size:** `typing.Optional[int]` - -Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. - -For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. - -
-
- -
-
- -**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true. +**id:** `str` — Identifier for a Config. Formatted as a UUID.
@@ -1665,7 +1585,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**config_id:** `typing.Optional[str]` — Filter to only include chats that used this config. +**name:** `str` — Name applied to all versions of a particular Config.
@@ -1685,7 +1605,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.chats.list_chat_events(...) +
client.empathic_voice.configs.get_config_version(...)
@@ -1697,7 +1617,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Fetches a paginated list of **Chat** events. +Fetches a specified version of a **Config**. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1717,17 +1639,10 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.chats.list_chat_events( - id="470a49f6-1dec-4afe-8b61-035d3b2d63b0", - page_number=0, - page_size=3, - ascending_order=True, +client.empathic_voice.configs.get_config_version( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", + version=1, ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -1743,19 +1658,7 @@ for page in response.iter_pages():
-**id:** `str` — Identifier for a Chat. Formatted as a UUID. - -
-
- -
-
- -**page_size:** `typing.Optional[int]` - -Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. - -For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. +**id:** `str` — Identifier for a Config. Formatted as a UUID.
@@ -1763,19 +1666,13 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**page_number:** `typing.Optional[int]` - -Specifies the page number to retrieve, enabling pagination. +**version:** `int` -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. - -
-
+Version number for a Config. -
-
+Configs, Prompts, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. -**ascending_order:** `typing.Optional[bool]` — Specifies the sorting order of the results based on their creation date. Set to true for ascending order (chronological, with the oldest records first) and false for descending order (reverse-chronological, with the newest records first). Defaults to true. +Version numbers are integer values representing different iterations of the Config. Each update to the Config increments its version number.
@@ -1795,7 +1692,7 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-
client.empathic_voice.chats.get_audio(...) +
client.empathic_voice.configs.delete_config_version(...)
@@ -1807,7 +1704,9 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-Fetches the audio of a previous **Chat**. For more details, see our guide on audio reconstruction [here](/docs/speech-to-speech-evi/faq#can-i-access-the-audio-of-previous-conversations-with-evi). +Deletes a specified version of a **Config**. + +For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1827,8 +1726,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.chats.get_audio( - id="470a49f6-1dec-4afe-8b61-035d3b2d63b0", +client.empathic_voice.configs.delete_config_version( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", + version=1, ) ``` @@ -1845,7 +1745,21 @@ client.empathic_voice.chats.get_audio(
-**id:** `str` — Identifier for a chat. Formatted as a UUID. +**id:** `str` — Identifier for a Config. Formatted as a UUID. + +
+
+ +
+
+ +**version:** `int` + +Version number for a Config. + +Configs, Prompts, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. + +Version numbers are integer values representing different iterations of the Config. Each update to the Config increments its version number.
@@ -1865,8 +1779,7 @@ client.empathic_voice.chats.get_audio(
-## EmpathicVoice Configs -
client.empathic_voice.configs.list_configs(...) +
client.empathic_voice.configs.update_config_description(...)
@@ -1878,7 +1791,7 @@ client.empathic_voice.chats.get_audio(
-Fetches a paginated list of **Configs**. +Updates the description of a **Config**. For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration).
@@ -1900,15 +1813,11 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.configs.list_configs( - page_number=0, - page_size=1, +client.empathic_voice.configs.update_config_description( + id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", + version=1, + version_description="This is an updated version_description.", ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ```
@@ -1924,11 +1833,7 @@ for page in response.iter_pages():
-**page_number:** `typing.Optional[int]` - -Specifies the page number to retrieve, enabling pagination. - -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. +**id:** `str` — Identifier for a Config. Formatted as a UUID.
@@ -1936,19 +1841,13 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-**page_size:** `typing.Optional[int]` - -Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. +**version:** `int` -For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. - -
-
+Version number for a Config. -
-
+Configs, Prompts, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. -**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each tool. To include all versions of each tool in the list, set `restrict_to_most_recent` to false. +Version numbers are integer values representing different iterations of the Config. Each update to the Config increments its version number.
@@ -1956,7 +1855,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**name:** `typing.Optional[str]` — Filter to only include configs with this name. +**version_description:** `typing.Optional[str]` — An optional description of the Config version.
@@ -1976,7 +1875,8 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.configs.create_config(...) +## EmpathicVoice Prompts +
client.empathic_voice.prompts.list_prompts(...)
@@ -1988,9 +1888,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Creates a **Config** which can be applied to EVI. +Fetches a paginated list of **Prompts**. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2006,48 +1906,19 @@ For more details on configuration options and how to configure EVI, see our [con ```python from hume import HumeClient -from hume.empathic_voice import ( - PostedConfigPromptSpec, - PostedEventMessageSpec, - PostedEventMessageSpecs, - PostedLanguageModel, - VoiceName, -) client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.create_config( - name="Weather Assistant Config", - prompt=PostedConfigPromptSpec( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", - version=0, - ), - evi_version="3", - voice=VoiceName( - provider="HUME_AI", - name="Ava Song", - ), - language_model=PostedLanguageModel( - model_provider="ANTHROPIC", - model_resource="claude-3-7-sonnet-latest", - temperature=1.0, - ), - event_messages=PostedEventMessageSpecs( - on_new_chat=PostedEventMessageSpec( - enabled=False, - text="", - ), - on_inactivity_timeout=PostedEventMessageSpec( - enabled=False, - text="", - ), - on_max_duration_timeout=PostedEventMessageSpec( - enabled=False, - text="", - ), - ), +response = client.empathic_voice.prompts.list_prompts( + page_number=0, + page_size=2, ) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ``` @@ -2063,7 +1934,11 @@ client.empathic_voice.configs.create_config(
-**evi_version:** `str` — EVI version to use. Only versions `3` and `4-mini` are supported. +**page_number:** `typing.Optional[int]` + +Specifies the page number to retrieve, enabling pagination. + +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page.
@@ -2071,7 +1946,11 @@ client.empathic_voice.configs.create_config(
-**name:** `str` — Name applied to all versions of a particular Config. +**page_size:** `typing.Optional[int]` + +Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. + +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10.
@@ -2079,7 +1958,7 @@ client.empathic_voice.configs.create_config(
-**builtin_tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedBuiltinTool]]]` — List of built-in tools associated with this Config. +**restrict_to_most_recent:** `typing.Optional[bool]` — Only include the most recent version of each prompt in the list.
@@ -2087,11 +1966,7 @@ client.empathic_voice.configs.create_config(
-**ellm_model:** `typing.Optional[PostedEllmModel]` - -The eLLM setup associated with this Config. - -Hume's eLLM (empathic Large Language Model) is a multimodal language model that takes into account both expression measures and language. The eLLM generates short, empathic language responses and guides text-to-speech (TTS) prosody. +**name:** `typing.Optional[str]` — Filter to only include prompts with name.
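Taken together, these filters make it possible to look up a specific Prompt by name and walk every stored version of it. A small sketch combining `name` with `restrict_to_most_recent`; the attribute names read off each returned item are assumptions and may differ by SDK version:

```python
from hume import HumeClient

client = HumeClient(
    api_key="YOUR_API_KEY",
)

# List every version of Prompts matching this name, not just the latest one.
response = client.empathic_voice.prompts.list_prompts(
    name="Weather Assistant Prompt",
    restrict_to_most_recent=False,
)
for prompt in response:
    # `name` and `version` are assumed fields on the returned Prompt objects.
    print(prompt.name, prompt.version)
```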
@@ -2099,59 +1974,72 @@ Hume's eLLM (empathic Large Language Model) is a multimodal language model that
-**event_messages:** `typing.Optional[PostedEventMessageSpecs]` +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration.
+ +
-
-
- -**language_model:** `typing.Optional[PostedLanguageModel]` - -The supplemental language model associated with this Config. -This model is used to generate longer, more detailed responses from EVI. Choosing an appropriate supplemental language model for your use case is crucial for generating fast, high-quality responses from EVI. -
+
+
client.empathic_voice.prompts.create_prompt(...)
-**nudges:** `typing.Optional[PostedNudgeSpec]` — Configures nudges, brief audio prompts that can guide conversations when users pause or need encouragement to continue speaking. Nudges help create more natural, flowing interactions by providing gentle conversational cues. - -
-
+#### 📝 Description
-**prompt:** `typing.Optional[PostedConfigPromptSpec]` - -
-
-
-**timeouts:** `typing.Optional[PostedTimeoutSpecs]` - +Creates a **Prompt** that can be added to an [EVI configuration](/reference/speech-to-speech-evi/configs/create-config). + +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +
+
+#### 🔌 Usage +
-**tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedUserDefinedToolSpec]]]` — List of user-defined tools associated with this Config. - +
+
+ +```python +from hume import HumeClient + +client = HumeClient( + api_key="YOUR_API_KEY", +) +client.empathic_voice.prompts.create_prompt( + name="Weather Assistant Prompt", + text="You are an AI weather assistant providing users with accurate and up-to-date weather information. Respond to user queries concisely and clearly. Use simple language and avoid technical jargon. Provide temperature, precipitation, wind conditions, and any weather alerts. Include helpful tips if severe weather is expected.", +) + +```
+
+
+ +#### ⚙️ Parameters
-**version_description:** `typing.Optional[str]` — An optional description of the Config version. +
+
+ +**name:** `str` — Name applied to all versions of a particular Prompt.
@@ -2159,7 +2047,13 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-**voice:** `typing.Optional[VoiceRef]` — A voice specification associated with this Config. +**text:** `str` + +Instructions used to shape EVI's behavior, responses, and style. + +You can use the Prompt to define a specific goal or role for EVI, specifying how it should act or what it should focus on during the conversation. For example, EVI can be instructed to act as a customer support representative, a fitness coach, or a travel advisor, each with its own set of behaviors and response styles. + +For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-speech-evi/guides/prompting).
@@ -2167,7 +2061,7 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-**webhooks:** `typing.Optional[typing.Sequence[typing.Optional[PostedWebhookSpec]]]` — Webhook config specifications for each subscriber. +**version_description:** `typing.Optional[str]` — An optional description of the Prompt version.
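Because a Prompt is referenced from a Config by its identifier and version, a common pattern is to create the Prompt first and then point a new Config at it. A sketch of that flow, assuming the returned Prompt object exposes `id` and `version` fields (adjust if your SDK version shapes the response differently):

```python
from hume import HumeClient
from hume.empathic_voice import PostedConfigPromptSpec

client = HumeClient(
    api_key="YOUR_API_KEY",
)

# 1. Create the Prompt.
prompt = client.empathic_voice.prompts.create_prompt(
    name="Weather Assistant Prompt",
    text="You are an AI weather assistant providing users with accurate and up-to-date weather information.",
)

# 2. Reference it from a new Config by ID and version.
#    `prompt.id` and `prompt.version` are assumed response fields.
client.empathic_voice.configs.create_config(
    name="Weather Assistant Config",
    evi_version="3",
    prompt=PostedConfigPromptSpec(
        id=prompt.id,
        version=prompt.version,
    ),
)
```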
@@ -2187,7 +2081,7 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-
client.empathic_voice.configs.list_config_versions(...) +
client.empathic_voice.prompts.list_prompt_versions(...)
@@ -2199,9 +2093,9 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-Fetches a list of a **Config's** versions. +Fetches a list of a **Prompt's** versions. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2221,14 +2115,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.configs.list_config_versions( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", +client.empathic_voice.prompts.list_prompt_versions( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -2244,7 +2133,7 @@ for page in response.iter_pages():
-**id:** `str` — Identifier for a Config. Formatted as a UUID. +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2276,7 +2165,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each config. To include all versions of each config in the list, set `restrict_to_most_recent` to false. +**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each prompt. To include all versions of each prompt in the list, set `restrict_to_most_recent` to false.
@@ -2296,7 +2185,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.configs.create_config_version(...) +
client.empathic_voice.prompts.create_prompt_version(...)
@@ -2308,9 +2197,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Updates a **Config** by creating a new version of the **Config**. +Updates a **Prompt** by creating a new version of the **Prompt**. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2326,52 +2215,14 @@ For more details on configuration options and how to configure EVI, see our [con ```python from hume import HumeClient -from hume.empathic_voice import ( - PostedConfigPromptSpec, - PostedEllmModel, - PostedEventMessageSpec, - PostedEventMessageSpecs, - PostedLanguageModel, - VoiceName, -) client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.create_config_version( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", - version_description="This is an updated version of the Weather Assistant Config.", - evi_version="3", - prompt=PostedConfigPromptSpec( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", - version=0, - ), - voice=VoiceName( - provider="HUME_AI", - name="Ava Song", - ), - language_model=PostedLanguageModel( - model_provider="ANTHROPIC", - model_resource="claude-3-7-sonnet-latest", - temperature=1.0, - ), - ellm_model=PostedEllmModel( - allow_short_responses=True, - ), - event_messages=PostedEventMessageSpecs( - on_new_chat=PostedEventMessageSpec( - enabled=False, - text="", - ), - on_inactivity_timeout=PostedEventMessageSpec( - enabled=False, - text="", - ), - on_max_duration_timeout=PostedEventMessageSpec( - enabled=False, - text="", - ), - ), +client.empathic_voice.prompts.create_prompt_version( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", + text="You are an updated version of an AI weather assistant providing users with accurate and up-to-date weather information. Respond to user queries concisely and clearly. Use simple language and avoid technical jargon. Provide temperature, precipitation, wind conditions, and any weather alerts. Include helpful tips if severe weather is expected.", + version_description="This is an updated version of the Weather Assistant Prompt.", ) ``` @@ -2388,79 +2239,7 @@ client.empathic_voice.configs.create_config_version(
-**id:** `str` — Identifier for a Config. Formatted as a UUID. - -
-
- -
-
- -**evi_version:** `str` — The version of the EVI used with this config. - -
-
- -
-
- -**builtin_tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedBuiltinTool]]]` — List of built-in tools associated with this Config version. - -
-
- -
-
- -**ellm_model:** `typing.Optional[PostedEllmModel]` - -The eLLM setup associated with this Config version. - -Hume's eLLM (empathic Large Language Model) is a multimodal language model that takes into account both expression measures and language. The eLLM generates short, empathic language responses and guides text-to-speech (TTS) prosody. - -
-
- -
-
- -**event_messages:** `typing.Optional[PostedEventMessageSpecs]` - -
-
- -
-
- -**language_model:** `typing.Optional[PostedLanguageModel]` - -The supplemental language model associated with this Config version. - -This model is used to generate longer, more detailed responses from EVI. Choosing an appropriate supplemental language model for your use case is crucial for generating fast, high-quality responses from EVI. - -
-
- -
-
- -**nudges:** `typing.Optional[PostedNudgeSpec]` - -
-
- -
-
- -**prompt:** `typing.Optional[PostedConfigPromptSpec]` - -
-
- -
-
- -**timeouts:** `typing.Optional[PostedTimeoutSpecs]` +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2468,23 +2247,13 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-**tools:** `typing.Optional[typing.Sequence[typing.Optional[PostedUserDefinedToolSpec]]]` — List of user-defined tools associated with this Config version. - -
-
- -
-
+**text:** `str` -**version_description:** `typing.Optional[str]` — An optional description of the Config version. - -
-
+Instructions used to shape EVI's behavior, responses, and style for this version of the Prompt. -
-
+You can use the Prompt to define a specific goal or role for EVI, specifying how it should act or what it should focus on during the conversation. For example, EVI can be instructed to act as a customer support representative, a fitness coach, or a travel advisor, each with its own set of behaviors and response styles. -**voice:** `typing.Optional[VoiceRef]` — A voice specification associated with this Config version. +For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-speech-evi/guides/prompting).
@@ -2492,7 +2261,7 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-**webhooks:** `typing.Optional[typing.Sequence[typing.Optional[PostedWebhookSpec]]]` — Webhook config specifications for each subscriber. +**version_description:** `typing.Optional[str]` — An optional description of the Prompt version.
@@ -2512,7 +2281,7 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-
client.empathic_voice.configs.delete_config(...) +
client.empathic_voice.prompts.delete_prompt(...)
@@ -2524,9 +2293,9 @@ This model is used to generate longer, more detailed responses from EVI. Choosin
-Deletes a **Config** and its versions. +Deletes a **Prompt** and its versions. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2546,8 +2315,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.delete_config( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", +client.empathic_voice.prompts.delete_prompt( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", ) ``` @@ -2564,7 +2333,7 @@ client.empathic_voice.configs.delete_config(
-**id:** `str` — Identifier for a Config. Formatted as a UUID. +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2584,7 +2353,7 @@ client.empathic_voice.configs.delete_config(
-
client.empathic_voice.configs.update_config_name(...) +
client.empathic_voice.prompts.update_prompt_name(...)
@@ -2596,9 +2365,9 @@ client.empathic_voice.configs.delete_config(
-Updates the name of a **Config**. +Updates the name of a **Prompt**. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2618,9 +2387,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.update_config_name( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", - name="Updated Weather Assistant Config Name", +client.empathic_voice.prompts.update_prompt_name( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", + name="Updated Weather Assistant Prompt Name", ) ``` @@ -2637,7 +2406,7 @@ client.empathic_voice.configs.update_config_name(
-**id:** `str` — Identifier for a Config. Formatted as a UUID. +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2645,7 +2414,7 @@ client.empathic_voice.configs.update_config_name(
-**name:** `str` — Name applied to all versions of a particular Config. +**name:** `str` — Name applied to all versions of a particular Prompt.
@@ -2665,7 +2434,7 @@ client.empathic_voice.configs.update_config_name(
-
client.empathic_voice.configs.get_config_version(...) +
client.empathic_voice.prompts.get_prompt_version(...)
@@ -2677,9 +2446,9 @@ client.empathic_voice.configs.update_config_name(
-Fetches a specified version of a **Config**. +Fetches a specified version of a **Prompt**. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2699,9 +2468,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.get_config_version( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", - version=1, +client.empathic_voice.prompts.get_prompt_version( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", + version=0, ) ``` @@ -2718,7 +2487,7 @@ client.empathic_voice.configs.get_config_version(
-**id:** `str` — Identifier for a Config. Formatted as a UUID. +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2728,11 +2497,11 @@ client.empathic_voice.configs.get_config_version( **version:** `int` -Version number for a Config. +Version number for a Prompt. -Configs, Prompts, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. +Prompts, Configs, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine prompts and revert to previous versions if needed. -Version numbers are integer values representing different iterations of the Config. Each update to the Config increments its version number. +Version numbers are integer values representing different iterations of the Prompt. Each update to the Prompt increments its version number.
@@ -2752,7 +2521,7 @@ Version numbers are integer values representing different iterations of the Conf
-
client.empathic_voice.configs.delete_config_version(...) +
client.empathic_voice.prompts.delete_prompt_version(...)
@@ -2764,9 +2533,9 @@ Version numbers are integer values representing different iterations of the Conf
-Deletes a specified version of a **Config**. +Deletes a specified version of a **Prompt**. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2786,8 +2555,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.delete_config_version( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", +client.empathic_voice.prompts.delete_prompt_version( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", version=1, ) @@ -2805,7 +2574,7 @@ client.empathic_voice.configs.delete_config_version(
-**id:** `str` — Identifier for a Config. Formatted as a UUID. +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2815,11 +2584,11 @@ client.empathic_voice.configs.delete_config_version( **version:** `int` -Version number for a Config. +Version number for a Prompt. -Configs, Prompts, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. +Prompts, Configs, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine prompts and revert to previous versions if needed. -Version numbers are integer values representing different iterations of the Config. Each update to the Config increments its version number. +Version numbers are integer values representing different iterations of the Prompt. Each update to the Prompt increments its version number.
@@ -2839,7 +2608,7 @@ Version numbers are integer values representing different iterations of the Conf
-
client.empathic_voice.configs.update_config_description(...) +
client.empathic_voice.prompts.update_prompt_description(...)
@@ -2851,9 +2620,9 @@ Version numbers are integer values representing different iterations of the Conf
-Updates the description of a **Config**. +Updates the description of a **Prompt**. -For more details on configuration options and how to configure EVI, see our [configuration guide](/docs/speech-to-speech-evi/configuration). +See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt.
@@ -2873,8 +2642,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.configs.update_config_description( - id="1b60e1a0-cc59-424a-8d2c-189d354db3f3", +client.empathic_voice.prompts.update_prompt_description( + id="af699d45-2985-42cc-91b9-af9e5da3bac5", version=1, version_description="This is an updated version_description.", ) @@ -2893,7 +2662,7 @@ client.empathic_voice.configs.update_config_description(
-**id:** `str` — Identifier for a Config. Formatted as a UUID. +**id:** `str` — Identifier for a Prompt. Formatted as a UUID.
@@ -2903,11 +2672,11 @@ client.empathic_voice.configs.update_config_description( **version:** `int` -Version number for a Config. +Version number for a Prompt. -Configs, Prompts, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. +Prompts, Configs, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine prompts and revert to previous versions if needed. -Version numbers are integer values representing different iterations of the Config. Each update to the Config increments its version number. +Version numbers are integer values representing different iterations of the Prompt. Each update to the Prompt increments its version number.
@@ -2915,7 +2684,7 @@ Version numbers are integer values representing different iterations of the Conf
-**version_description:** `typing.Optional[str]` — An optional description of the Config version. +**version_description:** `typing.Optional[str]` — An optional description of the Prompt version.
@@ -2935,8 +2704,8 @@ Version numbers are integer values representing different iterations of the Conf
-## EmpathicVoice Prompts -
client.empathic_voice.prompts.list_prompts(...) +## EmpathicVoice Tools +
client.empathic_voice.tools.list_tools(...)
@@ -2948,9 +2717,9 @@ Version numbers are integer values representing different iterations of the Conf
-Fetches a paginated list of **Prompts**. +Fetches a paginated list of **Tools**. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -2970,7 +2739,7 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.prompts.list_prompts( +response = client.empathic_voice.tools.list_tools( page_number=0, page_size=2, ) @@ -3018,7 +2787,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**restrict_to_most_recent:** `typing.Optional[bool]` — Only include the most recent version of each prompt in the list. +**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each tool. To include all versions of each tool in the list, set `restrict_to_most_recent` to false.
@@ -3026,7 +2795,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**name:** `typing.Optional[str]` — Filter to only include prompts with name. +**name:** `typing.Optional[str]` — Filter to only include tools with name.
@@ -3046,7 +2815,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.prompts.create_prompt(...) +
client.empathic_voice.tools.create_tool(...)
@@ -3058,9 +2827,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Creates a **Prompt** that can be added to an [EVI configuration](/reference/speech-to-speech-evi/configs/create-config). +Creates a **Tool** that can be added to an [EVI configuration](/reference/speech-to-speech-evi/configs/create-config). -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3080,12 +2849,15 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.create_prompt( - name="Weather Assistant Prompt", - text="You are an AI weather assistant providing users with accurate and up-to-date weather information. Respond to user queries concisely and clearly. Use simple language and avoid technical jargon. Provide temperature, precipitation, wind conditions, and any weather alerts. Include helpful tips if severe weather is expected.", -) - -``` +client.empathic_voice.tools.create_tool( + name="get_current_weather", + parameters='{ "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "format": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the users location." } }, "required": ["location", "format"] }', + version_description="Fetches current weather and uses celsius or fahrenheit based on location of user.", + description="This tool is for getting the current weather.", + fallback_content="Unable to fetch current weather.", +) + +```
@@ -3099,7 +2871,7 @@ client.empathic_voice.prompts.create_prompt(
-**name:** `str` — Name applied to all versions of a particular Prompt. +**name:** `str` — Name applied to all versions of a particular Tool.
@@ -3107,13 +2879,19 @@ client.empathic_voice.prompts.create_prompt(
-**text:** `str` +**parameters:** `str` -Instructions used to shape EVI's behavior, responses, and style. +Stringified JSON defining the parameters used by this version of the Tool. -You can use the Prompt to define a specific goal or role for EVI, specifying how it should act or what it should focus on during the conversation. For example, EVI can be instructed to act as a customer support representative, a fitness coach, or a travel advisor, each with its own set of behaviors and response styles. +These parameters define the inputs needed for the Tool's execution, including the expected data type and description for each input field. Structured as a stringified JSON schema, this format ensures the Tool receives data in the expected format. + +
+
-For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-speech-evi/guides/prompting). +
+
+ +**description:** `typing.Optional[str]` — An optional description of what the Tool does, used by the supplemental LLM to choose when and how to call the function.
@@ -3121,7 +2899,15 @@ For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-spee
-**version_description:** `typing.Optional[str]` — An optional description of the Prompt version. +**fallback_content:** `typing.Optional[str]` — Optional text passed to the supplemental LLM in place of the tool call result. The LLM then uses this text to generate a response back to the user, ensuring continuity in the conversation if the Tool errors. + +
+
+ +
+
+ +**version_description:** `typing.Optional[str]` — An optional description of the Tool version.
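Because `parameters` must be a stringified JSON schema, it is often easier to build the schema as a Python dict and serialize it with `json.dumps` than to hand-write the string. A sketch of that approach using the same weather example shown above:

```python
import json

from hume import HumeClient

client = HumeClient(
    api_key="YOUR_API_KEY",
)

# Build the JSON schema as a dict, then serialize it into the required string.
weather_schema = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "The temperature unit to use. Infer this from the user's location.",
        },
    },
    "required": ["location", "format"],
}

client.empathic_voice.tools.create_tool(
    name="get_current_weather",
    parameters=json.dumps(weather_schema),
    description="This tool is for getting the current weather.",
    fallback_content="Unable to fetch current weather.",
)
```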
@@ -3141,7 +2927,7 @@ For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-spee
-
client.empathic_voice.prompts.list_prompt_versions(...) +
client.empathic_voice.tools.list_tool_versions(...)
@@ -3153,9 +2939,9 @@ For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-spee
-Fetches a list of a **Prompt's** versions. +Fetches a list of a **Tool's** versions. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3175,9 +2961,14 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.list_prompt_versions( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", +response = client.empathic_voice.tools.list_tool_versions( + id="00183a3f-79ba-413d-9f3b-609864268bea", ) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ``` @@ -3193,7 +2984,7 @@ client.empathic_voice.prompts.list_prompt_versions(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3225,7 +3016,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each prompt. To include all versions of each prompt in the list, set `restrict_to_most_recent` to false. +**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each tool. To include all versions of each tool in the list, set `restrict_to_most_recent` to false.
@@ -3245,7 +3036,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.prompts.create_prompt_version(...) +
client.empathic_voice.tools.create_tool_version(...)
@@ -3257,9 +3048,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Updates a **Prompt** by creating a new version of the **Prompt**. +Updates a **Tool** by creating a new version of the **Tool**. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3279,10 +3070,12 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.create_prompt_version( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", - text="You are an updated version of an AI weather assistant providing users with accurate and up-to-date weather information. Respond to user queries concisely and clearly. Use simple language and avoid technical jargon. Provide temperature, precipitation, wind conditions, and any weather alerts. Include helpful tips if severe weather is expected.", - version_description="This is an updated version of the Weather Assistant Prompt.", +client.empathic_voice.tools.create_tool_version( + id="00183a3f-79ba-413d-9f3b-609864268bea", + parameters='{ "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "format": { "type": "string", "enum": ["celsius", "fahrenheit", "kelvin"], "description": "The temperature unit to use. Infer this from the users location." } }, "required": ["location", "format"] }', + version_description="Fetches current weather and uses celsius, fahrenheit, or kelvin based on location of user.", + fallback_content="Unable to fetch current weather.", + description="This tool is for getting the current weather.", ) ``` @@ -3299,7 +3092,7 @@ client.empathic_voice.prompts.create_prompt_version(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3307,13 +3100,19 @@ client.empathic_voice.prompts.create_prompt_version(
-**text:** `str` +**parameters:** `str` -Instructions used to shape EVI's behavior, responses, and style for this version of the Prompt. +Stringified JSON defining the parameters used by this version of the Tool. -You can use the Prompt to define a specific goal or role for EVI, specifying how it should act or what it should focus on during the conversation. For example, EVI can be instructed to act as a customer support representative, a fitness coach, or a travel advisor, each with its own set of behaviors and response styles. +These parameters define the inputs needed for the Tool's execution, including the expected data type and description for each input field. Structured as a stringified JSON schema, this format ensures the Tool receives data in the expected format. + +
+
-For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-speech-evi/guides/prompting). +
+
+ +**description:** `typing.Optional[str]` — An optional description of what the Tool does, used by the supplemental LLM to choose when and how to call the function.
@@ -3321,7 +3120,15 @@ For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-spee
-**version_description:** `typing.Optional[str]` — An optional description of the Prompt version. +**fallback_content:** `typing.Optional[str]` — Optional text passed to the supplemental LLM in place of the tool call result. The LLM then uses this text to generate a response back to the user, ensuring continuity in the conversation if the Tool errors. + +
+
+ +
+
+ +**version_description:** `typing.Optional[str]` — An optional description of the Tool version.
@@ -3341,7 +3148,7 @@ For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-spee
-
client.empathic_voice.prompts.delete_prompt(...) +
client.empathic_voice.tools.delete_tool(...)
@@ -3353,9 +3160,9 @@ For help writing a system prompt, see our [Prompting Guide](/docs/speech-to-spee
-Deletes a **Prompt** and its versions. +Deletes a **Tool** and its versions. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3375,8 +3182,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.delete_prompt( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", +client.empathic_voice.tools.delete_tool( + id="00183a3f-79ba-413d-9f3b-609864268bea", ) ``` @@ -3393,7 +3200,7 @@ client.empathic_voice.prompts.delete_prompt(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3413,7 +3220,7 @@ client.empathic_voice.prompts.delete_prompt(
-
client.empathic_voice.prompts.update_prompt_name(...) +
client.empathic_voice.tools.update_tool_name(...)
@@ -3425,9 +3232,9 @@ client.empathic_voice.prompts.delete_prompt(
-Updates the name of a **Prompt**. +Updates the name of a **Tool**. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3447,9 +3254,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.update_prompt_name( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", - name="Updated Weather Assistant Prompt Name", +client.empathic_voice.tools.update_tool_name( + id="00183a3f-79ba-413d-9f3b-609864268bea", + name="get_current_temperature", ) ``` @@ -3466,7 +3273,7 @@ client.empathic_voice.prompts.update_prompt_name(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3474,7 +3281,7 @@ client.empathic_voice.prompts.update_prompt_name(
-**name:** `str` — Name applied to all versions of a particular Prompt. +**name:** `str` — Name applied to all versions of a particular Tool.
@@ -3494,7 +3301,7 @@ client.empathic_voice.prompts.update_prompt_name(
-
client.empathic_voice.prompts.get_prompt_version(...) +
client.empathic_voice.tools.get_tool_version(...)
@@ -3506,9 +3313,9 @@ client.empathic_voice.prompts.update_prompt_name(
-Fetches a specified version of a **Prompt**. +Fetches a specified version of a **Tool**. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3528,9 +3335,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.get_prompt_version( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", - version=0, +client.empathic_voice.tools.get_tool_version( + id="00183a3f-79ba-413d-9f3b-609864268bea", + version=1, ) ``` @@ -3547,7 +3354,7 @@ client.empathic_voice.prompts.get_prompt_version(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3557,11 +3364,11 @@ client.empathic_voice.prompts.get_prompt_version( **version:** `int` -Version number for a Prompt. +Version number for a Tool. -Prompts, Configs, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine prompts and revert to previous versions if needed. +Tools, Configs, Custom Voices, and Prompts are versioned. This versioning system supports iterative development, allowing you to progressively refine tools and revert to previous versions if needed. -Version numbers are integer values representing different iterations of the Prompt. Each update to the Prompt increments its version number. +Version numbers are integer values representing different iterations of the Tool. Each update to the Tool increments its version number.
@@ -3581,7 +3388,7 @@ Version numbers are integer values representing different iterations of the Prom
-
client.empathic_voice.prompts.delete_prompt_version(...) +
client.empathic_voice.tools.delete_tool_version(...)
@@ -3593,9 +3400,9 @@ Version numbers are integer values representing different iterations of the Prom
-Deletes a specified version of a **Prompt**. +Deletes a specified version of a **Tool**. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3615,8 +3422,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.delete_prompt_version( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", +client.empathic_voice.tools.delete_tool_version( + id="00183a3f-79ba-413d-9f3b-609864268bea", version=1, ) @@ -3634,7 +3441,7 @@ client.empathic_voice.prompts.delete_prompt_version(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3644,11 +3451,11 @@ client.empathic_voice.prompts.delete_prompt_version( **version:** `int` -Version number for a Prompt. +Version number for a Tool. -Prompts, Configs, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine prompts and revert to previous versions if needed. +Tools, Configs, Custom Voices, and Prompts are versioned. This versioning system supports iterative development, allowing you to progressively refine tools and revert to previous versions if needed. -Version numbers are integer values representing different iterations of the Prompt. Each update to the Prompt increments its version number. +Version numbers are integer values representing different iterations of the Tool. Each update to the Tool increments its version number.
@@ -3668,7 +3475,7 @@ Version numbers are integer values representing different iterations of the Prom
-
client.empathic_voice.prompts.update_prompt_description(...) +
client.empathic_voice.tools.update_tool_description(...)
@@ -3680,9 +3487,9 @@ Version numbers are integer values representing different iterations of the Prom
-Updates the description of a **Prompt**. +Updates the description of a specified **Tool** version. -See our [prompting guide](/docs/speech-to-speech-evi/guides/phone-calling) for tips on crafting your system prompt. +Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI.
@@ -3702,10 +3509,10 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.prompts.update_prompt_description( - id="af699d45-2985-42cc-91b9-af9e5da3bac5", +client.empathic_voice.tools.update_tool_description( + id="00183a3f-79ba-413d-9f3b-609864268bea", version=1, - version_description="This is an updated version_description.", + version_description="Fetches current temperature, precipitation, wind speed, AQI, and other weather conditions. Uses Celsius, Fahrenheit, or kelvin depending on user's region.", ) ``` @@ -3722,7 +3529,7 @@ client.empathic_voice.prompts.update_prompt_description(
-**id:** `str` — Identifier for a Prompt. Formatted as a UUID. +**id:** `str` — Identifier for a Tool. Formatted as a UUID.
@@ -3732,11 +3539,11 @@ client.empathic_voice.prompts.update_prompt_description( **version:** `int` -Version number for a Prompt. +Version number for a Tool. -Prompts, Configs, Custom Voices, and Tools are versioned. This versioning system supports iterative development, allowing you to progressively refine prompts and revert to previous versions if needed. +Tools, Configs, Custom Voices, and Prompts are versioned. This versioning system supports iterative development, allowing you to progressively refine tools and revert to previous versions if needed. -Version numbers are integer values representing different iterations of the Prompt. Each update to the Prompt increments its version number. +Version numbers are integer values representing different iterations of the Tool. Each update to the Tool increments its version number.
@@ -3744,7 +3551,7 @@ Version numbers are integer values representing different iterations of the Prom
-**version_description:** `typing.Optional[str]` — An optional description of the Prompt version. +**version_description:** `typing.Optional[str]` — An optional description of the Tool version.
@@ -3764,8 +3571,8 @@ Version numbers are integer values representing different iterations of the Prom
-## EmpathicVoice Tools -
client.empathic_voice.tools.list_tools(...) +## Tts +
client.tts.synthesize_json(...)
@@ -3777,9 +3584,9 @@ Version numbers are integer values representing different iterations of the Prom
-Fetches a paginated list of **Tools**. +Synthesizes one or more input texts into speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +The response includes the base64-encoded audio and metadata in JSON format.
@@ -3795,19 +3602,29 @@ Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-ca ```python from hume import HumeClient +from hume.tts import FormatMp3, PostedContextWithUtterances, PostedUtterance client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.tools.list_tools( - page_number=0, - page_size=2, +client.tts.synthesize_json( + context=PostedContextWithUtterances( + utterances=[ + PostedUtterance( + text="How can people see beauty so differently?", + description="A curious student with a clear and respectful tone, seeking clarification on Hume's ideas with a straightforward question.", + ) + ], + ), + format=FormatMp3(), + num_generations=1, + utterances=[ + PostedUtterance( + text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", + description="Middle-aged masculine voice with a clear, rhythmic Scots lilt, rounded vowels, and a warm, steady tone with an articulate, academic quality.", + ) + ], ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -3823,11 +3640,11 @@ for page in response.iter_pages():
-**page_number:** `typing.Optional[int]` +**utterances:** `typing.Sequence[PostedUtterance]` -Specifies the page number to retrieve, enabling pagination. +A list of **Utterances** to be converted to speech output. -This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. +An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`.
@@ -3835,11 +3652,15 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-**page_size:** `typing.Optional[int]` +**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. + +
+
-Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. +
+
-For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. +**format:** `typing.Optional[Format]` — Specifies the output audio file format.
@@ -3847,7 +3668,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each tool. To include all versions of each tool in the list, set `restrict_to_most_recent` to false. +**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests.
@@ -3855,7 +3676,62 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**name:** `typing.Optional[str]` — Filter to only include tools with name. +**num_generations:** `typing.Optional[int]` + +Number of audio generations to produce from the input utterances. + +Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. + +
+
+ +
+
+ +**split_utterances:** `typing.Optional[bool]` + +Controls how audio output is segmented in the response. + +- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments. + +- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets. + +This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When setting to `false`, avoid including utterances with long `text`, as this can result in distorted output. + +
+
+ +
+
+ +**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). + +
+
+ +
+
+ +**version:** `typing.Optional[OctaveVersion]` + +Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. + +Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. + +For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. + +
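For instance, pinning a request to Octave version `2` requires naming a voice on each utterance. A hedged sketch of that pairing; the `PostedUtteranceVoiceWithName` class and the voice name below are assumptions rather than part of this reference, so substitute a voice from your own account or Hume's Voice Library:

```python
from hume import HumeClient
from hume.tts import PostedUtterance, PostedUtteranceVoiceWithName

client = HumeClient(
    api_key="YOUR_API_KEY",
)

# version=2 requests are rejected unless a voice is specified.
client.tts.synthesize_json(
    version=2,
    utterances=[
        PostedUtterance(
            text="Beauty is no quality in things themselves.",
            voice=PostedUtteranceVoiceWithName(
                name="Ava Song",  # assumed example voice name
                provider="HUME_AI",
            ),
        )
    ],
)
```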
+
+ +
+
+ +**instant_mode:** `typing.Optional[bool]` + +Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). +- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. +- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). +- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted).
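Once a response comes back, the base64-encoded audio described above still needs to be decoded before it can be played or written to disk. A minimal sketch of that post-processing step; the `generations`, `audio`, and `generation_id` attribute names are assumed from the JSON response shape and may differ across SDK versions:

```python
import base64

from hume import HumeClient
from hume.tts import FormatMp3, PostedUtterance

client = HumeClient(
    api_key="YOUR_API_KEY",
)

response = client.tts.synthesize_json(
    format=FormatMp3(),
    utterances=[
        PostedUtterance(
            text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.",
            description="Middle-aged masculine voice with a warm, steady tone.",
        )
    ],
)

# Decode the base64-encoded audio and write it out as an MP3 file.
generation = response.generations[0]
with open("generation.mp3", "wb") as audio_file:
    audio_file.write(base64.b64decode(generation.audio))

# The generation ID can be passed back as context (see the
# PostedContextWithGenerationId usage in the synthesize_file example below)
# to keep speech style consistent across requests.
print(generation.generation_id)
```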
@@ -3875,7 +3751,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.tools.create_tool(...) +
client.tts.synthesize_file(...)
@@ -3887,9 +3763,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Creates a **Tool** that can be added to an [EVI configuration](/reference/speech-to-speech-evi/configs/create-config). +Synthesizes one or more input texts into speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +The response contains the generated audio file in the requested format.
@@ -3905,16 +3781,23 @@ Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-ca ```python from hume import HumeClient +from hume.tts import FormatMp3, PostedContextWithGenerationId, PostedUtterance client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.tools.create_tool( - name="get_current_weather", - parameters='{ "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "format": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the users location." } }, "required": ["location", "format"] }', - version_description="Fetches current weather and uses celsius or fahrenheit based on location of user.", - description="This tool is for getting the current weather.", - fallback_content="Unable to fetch current weather.", +client.tts.synthesize_file( + context=PostedContextWithGenerationId( + generation_id="09ad914d-8e7f-40f8-a279-e34f07f7dab2", + ), + format=FormatMp3(), + num_generations=1, + utterances=[ + PostedUtterance( + text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", + description="Middle-aged masculine voice with a clear, rhythmic Scots lilt, rounded vowels, and a warm, steady tone with an articulate, academic quality.", + ) + ], ) ``` @@ -3931,7 +3814,11 @@ client.empathic_voice.tools.create_tool(
-**name:** `str` — Name applied to all versions of a particular Tool. +**utterances:** `typing.Sequence[PostedUtterance]` + +A list of **Utterances** to be converted to speech output. + +An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`.
@@ -3939,11 +3826,15 @@ client.empathic_voice.tools.create_tool(
-**parameters:** `str` +**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. + +
+
-Stringified JSON defining the parameters used by this version of the Tool. +
+
-These parameters define the inputs needed for the Tool's execution, including the expected data type and description for each input field. Structured as a stringified JSON schema, this format ensures the Tool receives data in the expected format. +**format:** `typing.Optional[Format]` — Specifies the output audio file format.
@@ -3951,7 +3842,7 @@ These parameters define the inputs needed for the Tool's execution, including th
-**description:** `typing.Optional[str]` — An optional description of what the Tool does, used by the supplemental LLM to choose when and how to call the function. +**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests.
@@ -3959,7 +3850,11 @@ These parameters define the inputs needed for the Tool's execution, including th
-**fallback_content:** `typing.Optional[str]` — Optional text passed to the supplemental LLM in place of the tool call result. The LLM then uses this text to generate a response back to the user, ensuring continuity in the conversation if the Tool errors. +**num_generations:** `typing.Optional[int]` + +Number of audio generations to produce from the input utterances. + +Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency.
@@ -3967,7 +3862,15 @@ These parameters define the inputs needed for the Tool's execution, including th
-**version_description:** `typing.Optional[str]` — An optional description of the Tool version.
+**split_utterances:** `typing.Optional[bool]`
+
+Controls how audio output is segmented in the response.
+
+- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments.
+
+- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets.
+
+This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When set to `false`, avoid including utterances with long `text`, as this can result in distorted output.
@@ -3975,7 +3878,42 @@ These parameters define the inputs needed for the Tool's execution, including th
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. +**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). + +
+
+ +
+
+ +**version:** `typing.Optional[OctaveVersion]` + +Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. + +Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. + +For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. + +
+
+ +
+
+ +**instant_mode:** `typing.Optional[bool]` + +Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). +- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. +- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). +- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted). + +
+
+ +
+
+ +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response.
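Putting the parameters above together, here is a hedged end-to-end sketch for `synthesize_file`: it pins the Octave version, names a voice (required when `version` is `2`), and writes the result to disk. The exact literal accepted by `OctaveVersion`, the assumption that the call yields raw audio bytes, and the output filename and sample text are all assumptions rather than part of the reference.

```python
from hume import HumeClient
from hume.tts import FormatMp3, PostedUtterance, PostedUtteranceVoiceWithName

client = HumeClient(api_key="YOUR_API_KEY")

audio = client.tts.synthesize_file(
    format=FormatMp3(),
    version=2,  # assumed numeric literal; version 2 requests must also name a voice
    utterances=[
        PostedUtterance(
            text="The mind is a kind of theatre, where several perceptions successively make their appearance.",
            voice=PostedUtteranceVoiceWithName(
                name="Male English Actor",
                provider="HUME_AI",
            ),
        )
    ],
)

# Assumes the returned value is an iterator of audio byte chunks.
with open("hume_tts_output.mp3", "wb") as f:
    for chunk in audio:
        f.write(chunk)
```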
@@ -3987,7 +3925,7 @@ These parameters define the inputs needed for the Tool's execution, including th
-
client.empathic_voice.tools.list_tool_versions(...) +
client.tts.synthesize_file_streaming(...)
@@ -3999,9 +3937,7 @@ These parameters define the inputs needed for the Tool's execution, including th
-Fetches a list of a **Tool's** versions. - -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +Streams synthesized speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody.
@@ -4017,18 +3953,22 @@ Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-ca ```python from hume import HumeClient +from hume.tts import PostedUtterance, PostedUtteranceVoiceWithName client = HumeClient( api_key="YOUR_API_KEY", ) -response = client.empathic_voice.tools.list_tool_versions( - id="00183a3f-79ba-413d-9f3b-609864268bea", +client.tts.synthesize_file_streaming( + utterances=[ + PostedUtterance( + text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", + voice=PostedUtteranceVoiceWithName( + name="Male English Actor", + provider="HUME_AI", + ), + ) + ], ) -for item in response: - yield item -# alternatively, you can paginate page-by-page -for page in response.iter_pages(): - yield page ``` @@ -4044,7 +3984,11 @@ for page in response.iter_pages():
-**id:** `str` — Identifier for a Tool. Formatted as a UUID. +**utterances:** `typing.Sequence[PostedUtterance]` + +A list of **Utterances** to be converted to speech output. + +An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`.
@@ -4052,11 +3996,15 @@ for page in response.iter_pages():
-**page_number:** `typing.Optional[int]` +**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. + +
+
-Specifies the page number to retrieve, enabling pagination. +
+
-This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. +**format:** `typing.Optional[Format]` — Specifies the output audio file format.
@@ -4064,11 +4012,19 @@ This parameter uses zero-based indexing. For example, setting `page_number` to 0
-**page_size:** `typing.Optional[int]` +**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests. + +
+
-Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. +
+
-For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. +**num_generations:** `typing.Optional[int]` + +Number of audio generations to produce from the input utterances. + +Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency.
@@ -4076,7 +4032,15 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**restrict_to_most_recent:** `typing.Optional[bool]` — By default, `restrict_to_most_recent` is set to true, returning only the latest version of each tool. To include all versions of each tool in the list, set `restrict_to_most_recent` to false.
+**split_utterances:** `typing.Optional[bool]`
+
+Controls how audio output is segmented in the response.
+
+- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments.
+
+- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets.
+
+This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When set to `false`, avoid including utterances with long `text`, as this can result in distorted output.
@@ -4084,7 +4048,42 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. +**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). + +
+
+ +
+
+ +**version:** `typing.Optional[OctaveVersion]` + +Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. + +Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. + +For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. + +
+
+ +
+
+ +**instant_mode:** `typing.Optional[bool]` + +Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). +- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. +- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). +- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted). + +
+
+ +
+
+ +**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response.
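As a companion to the generated snippet above, the following sketch shows one plausible way to consume the streamed file output, assuming the call yields audio bytes chunk by chunk; the filename and the `strip_headers` choice are illustrative.

```python
from hume import HumeClient
from hume.tts import FormatMp3, PostedUtterance, PostedUtteranceVoiceWithName

client = HumeClient(api_key="YOUR_API_KEY")

stream = client.tts.synthesize_file_streaming(
    format=FormatMp3(),
    strip_headers=True,  # concatenated chunks then form a single playable file
    utterances=[
        PostedUtterance(
            text="Streaming lets playback start before the whole file is rendered.",
            voice=PostedUtteranceVoiceWithName(
                name="Male English Actor",
                provider="HUME_AI",
            ),
        )
    ],
)

# Assumes the call returns an iterator of audio byte chunks.
with open("streamed_output.mp3", "wb") as f:
    for chunk in stream:
        f.write(chunk)
```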
@@ -4096,7 +4095,7 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-
client.empathic_voice.tools.create_tool_version(...) +
client.tts.synthesize_json_streaming(...)
@@ -4108,9 +4107,9 @@ For example, if `page_size` is set to 10, each page will include up to 10 items.
-Updates a **Tool** by creating a new version of the **Tool**. +Streams synthesized speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech's style and prosody. -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +The response is a stream of JSON objects including audio encoded in base64.
@@ -4126,17 +4125,24 @@ Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-ca ```python from hume import HumeClient +from hume.tts import PostedUtterance, PostedUtteranceVoiceWithName client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.tools.create_tool_version( - id="00183a3f-79ba-413d-9f3b-609864268bea", - parameters='{ "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "format": { "type": "string", "enum": ["celsius", "fahrenheit", "kelvin"], "description": "The temperature unit to use. Infer this from the users location." } }, "required": ["location", "format"] }', - version_description="Fetches current weather and uses celsius, fahrenheit, or kelvin based on location of user.", - fallback_content="Unable to fetch current weather.", - description="This tool is for getting the current weather.", -) +response = client.tts.synthesize_json_streaming( + utterances=[ + PostedUtterance( + text="Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.", + voice=PostedUtteranceVoiceWithName( + name="Male English Actor", + provider="HUME_AI", + ), + ) + ], +) +for chunk in response.data: + yield chunk ``` @@ -4152,19 +4158,11 @@ client.empathic_voice.tools.create_tool_version(
-**id:** `str` — Identifier for a Tool. Formatted as a UUID. - -
-
- -
-
- -**parameters:** `str` +**utterances:** `typing.Sequence[PostedUtterance]` -Stringified JSON defining the parameters used by this version of the Tool. +A list of **Utterances** to be converted to speech output. -These parameters define the inputs needed for the Tool's execution, including the expected data type and description for each input field. Structured as a stringified JSON schema, this format ensures the Tool receives data in the expected format. +An **Utterance** is a unit of input for [Octave](/docs/text-to-speech-tts/overview), and includes input `text`, an optional `description` to serve as the prompt for how the speech should be delivered, an optional `voice` specification, and additional controls to guide delivery for `speed` and `trailing_silence`.
@@ -4172,7 +4170,7 @@ These parameters define the inputs needed for the Tool's execution, including th
-**description:** `typing.Optional[str]` — An optional description of what the Tool does, used by the supplemental LLM to choose when and how to call the function. +**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output.
@@ -4180,7 +4178,7 @@ These parameters define the inputs needed for the Tool's execution, including th
-**fallback_content:** `typing.Optional[str]` — Optional text passed to the supplemental LLM in place of the tool call result. The LLM then uses this text to generate a response back to the user, ensuring continuity in the conversation if the Tool errors. +**format:** `typing.Optional[Format]` — Specifies the output audio file format.
@@ -4188,7 +4186,7 @@ These parameters define the inputs needed for the Tool's execution, including th
-**version_description:** `typing.Optional[str]` — An optional description of the Tool version. +**include_timestamp_types:** `typing.Optional[typing.Sequence[TimestampType]]` — The set of timestamp types to include in the response. Only supported for Octave 2 requests.
@@ -4196,71 +4194,62 @@ These parameters define the inputs needed for the Tool's execution, including th
-**request_options:** `typing.Optional[RequestOptions]` — Request-specific configuration. - -
-
- -
+**num_generations:** `typing.Optional[int]` +Number of audio generations to produce from the input utterances. +Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. + -
-
client.empathic_voice.tools.delete_tool(...)
-#### 📝 Description +**split_utterances:** `typing.Optional[bool]` -
-
+Controls how audio output is segmented in the response. -
-
+- When **enabled** (`true`), input utterances are automatically split into natural-sounding speech segments. -Deletes a **Tool** and its versions. +- When **disabled** (`false`), the response maintains a strict one-to-one mapping between input utterances and output snippets. -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. -
-
+This setting affects how the `snippets` array is structured in the response, which may be important for applications that need to track the relationship between input text and generated audio segments. When set to `false`, avoid including utterances with long `text`, as this can result in distorted output.
+
-#### 🔌 Usage -
+**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). + +
+
+
-```python -from hume import HumeClient +**version:** `typing.Optional[OctaveVersion]` -client = HumeClient( - api_key="YOUR_API_KEY", -) -client.empathic_voice.tools.delete_tool( - id="00183a3f-79ba-413d-9f3b-609864268bea", -) +Selects the Octave model version used to synthesize speech for this request. If you omit this field, Hume automatically routes the request to the most appropriate model. Setting a specific version ensures stable and repeatable behavior across requests. -``` -
-
+Use `2` to opt into the latest Octave capabilities. When you specify version `2`, you must also provide a `voice`. Requests that set `version: 2` without a voice will be rejected. + +For a comparison of Octave versions, see the [Octave versions](/docs/text-to-speech-tts/overview#octave-versions) section in the TTS overview. +
-#### ⚙️ Parameters -
-
-
+**instant_mode:** `typing.Optional[bool]` -**id:** `str` — Identifier for a Tool. Formatted as a UUID. +Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on [instant mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). +- A [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) must be specified when instant mode is enabled. Dynamic voice generation is not supported with this mode. +- Instant mode is only supported for streaming endpoints (e.g., [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). +- Ensure only a single generation is requested ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) must be `1` or omitted).
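To make the base64 handling concrete, here is a hedged sketch of consuming the JSON stream. It assumes each streamed object exposes its encoded audio on an `audio` field, as suggested by the endpoint description; the field name and output path are assumptions, not part of the generated reference.

```python
import base64

from hume import HumeClient
from hume.tts import FormatMp3, PostedUtterance, PostedUtteranceVoiceWithName

client = HumeClient(api_key="YOUR_API_KEY")

response = client.tts.synthesize_json_streaming(
    format=FormatMp3(),
    utterances=[
        PostedUtterance(
            text="Each streamed JSON object carries a base64-encoded slice of audio.",
            voice=PostedUtteranceVoiceWithName(
                name="Male English Actor",
                provider="HUME_AI",
            ),
        )
    ],
)

with open("streamed_output.mp3", "wb") as f:
    for chunk in response.data:
        # `chunk.audio` is assumed to hold the base64 payload for this segment.
        f.write(base64.b64decode(chunk.audio))
```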
@@ -4280,11 +4269,11 @@ client.empathic_voice.tools.delete_tool(
-
client.empathic_voice.tools.update_tool_name(...) +
client.tts.convert_voice_json(...)
-#### 📝 Description +#### 🔌 Usage
@@ -4292,15 +4281,23 @@ client.empathic_voice.tools.delete_tool(
-Updates the name of a **Tool**. +```python +from hume import HumeClient -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +client = HumeClient( + api_key="YOUR_API_KEY", +) +response = client.tts.convert_voice_json() +for chunk in response.data: + yield chunk + +```
-#### 🔌 Usage +#### ⚙️ Parameters
@@ -4308,32 +4305,41 @@ Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-ca
-```python -from hume import HumeClient +**strip_headers:** `typing.Optional[bool]` — If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). + +
+
-client = HumeClient( - api_key="YOUR_API_KEY", -) -client.empathic_voice.tools.update_tool_name( - id="00183a3f-79ba-413d-9f3b-609864268bea", - name="get_current_temperature", -) +
+
-``` +**audio:** `from __future__ import annotations + +typing.Optional[core.File]` — See core.File for more documentation +
+ +
+
+ +**context:** `typing.Optional[PostedContext]` — Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. +
-#### ⚙️ Parameters -
+**voice:** `typing.Optional[PostedUtteranceVoice]` + +
+
+
-**id:** `str` — Identifier for a Tool. Formatted as a UUID. +**format:** `typing.Optional[Format]` — Specifies the output audio file format.
@@ -4341,7 +4347,7 @@ client.empathic_voice.tools.update_tool_name(
-**name:** `str` — Name applied to all versions of a particular Tool. +**include_timestamp_types:** `typing.Optional[typing.List[TimestampType]]` — The set of timestamp types to include in the response. When used in multipart/form-data, specify each value using bracket notation: `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. Only supported for Octave 2 requests.
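The generated snippet above omits the request body, so here is a hedged sketch of a fuller `convert_voice_json` call: it passes a local recording as the `audio` file part along with a target voice, then decodes the streamed result. Passing an open binary file handle for a `core.File` parameter, and reading the audio from a `chunk.audio` field, are assumptions; the file paths are illustrative.

```python
import base64

from hume import HumeClient
from hume.tts import FormatMp3, PostedUtteranceVoiceWithName

client = HumeClient(api_key="YOUR_API_KEY")

with open("input_recording.wav", "rb") as source_audio:
    response = client.tts.convert_voice_json(
        audio=source_audio,  # assumed: core.File accepts an open binary file handle
        voice=PostedUtteranceVoiceWithName(
            name="Male English Actor",
            provider="HUME_AI",
        ),
        format=FormatMp3(),
    )
    with open("converted_output.mp3", "wb") as out:
        for chunk in response.data:
            out.write(base64.b64decode(chunk.audio))
```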
@@ -4361,7 +4367,8 @@ client.empathic_voice.tools.update_tool_name(
-
client.empathic_voice.tools.get_tool_version(...) +## Tts Voices +
client.tts.voices.list(...)
@@ -4373,9 +4380,7 @@ client.empathic_voice.tools.update_tool_name(
-Fetches a specified version of a **Tool**. - -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +Lists voices you have saved in your account, or voices from the [Voice Library](https://platform.hume.ai/tts/voice-library).
@@ -4395,10 +4400,14 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.tools.get_tool_version( - id="00183a3f-79ba-413d-9f3b-609864268bea", - version=1, +response = client.tts.voices.list( + provider="CUSTOM_VOICE", ) +for item in response: + yield item +# alternatively, you can paginate page-by-page +for page in response.iter_pages(): + yield page ``` @@ -4414,7 +4423,12 @@ client.empathic_voice.tools.get_tool_version(
-**id:** `str` — Identifier for a Tool. Formatted as a UUID. +**provider:** `VoiceProvider` + +Specify the voice provider to filter voices returned by the endpoint: + +- **`HUME_AI`**: Lists preset, shared voices from Hume's [Voice Library](https://platform.hume.ai/tts/voice-library). +- **`CUSTOM_VOICE`**: Lists custom voices created and saved to your account.
@@ -4422,13 +4436,31 @@ client.empathic_voice.tools.get_tool_version(
-**version:** `int` +**page_number:** `typing.Optional[int]` -Version number for a Tool. +Specifies the page number to retrieve, enabling pagination. -Tools, Configs, Custom Voices, and Prompts are versioned. This versioning system supports iterative development, allowing you to progressively refine tools and revert to previous versions if needed. +This parameter uses zero-based indexing. For example, setting `page_number` to 0 retrieves the first page of results (items 0-9 if `page_size` is 10), setting `page_number` to 1 retrieves the second page (items 10-19), and so on. Defaults to 0, which retrieves the first page. + +
+
-Version numbers are integer values representing different iterations of the Tool. Each update to the Tool increments its version number. +
+
+ +**page_size:** `typing.Optional[int]` + +Specifies the maximum number of results to include per page, enabling pagination. The value must be between 1 and 100, inclusive. + +For example, if `page_size` is set to 10, each page will include up to 10 items. Defaults to 10. + +
+
+ +
+
+ +**ascending_order:** `typing.Optional[bool]`
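For comparison with the account-scoped example above, a minimal sketch that pages through Hume's shared Voice Library presets instead, using the `HUME_AI` provider; the attribute used to print each voice's name is an assumption.

```python
from hume import HumeClient

client = HumeClient(api_key="YOUR_API_KEY")

# List shared Voice Library presets rather than your own saved voices.
response = client.tts.voices.list(
    provider="HUME_AI",
    page_size=10,
)
for voice in response:
    print(voice.name)  # assumed: each returned voice exposes a `name` field
```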
@@ -4448,7 +4480,7 @@ Version numbers are integer values representing different iterations of the Tool
-
client.empathic_voice.tools.delete_tool_version(...) +
client.tts.voices.create(...)
@@ -4460,9 +4492,9 @@ Version numbers are integer values representing different iterations of the Tool
-Deletes a specified version of a **Tool**. +Saves a new custom voice to your account using the specified TTS generation ID. -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +Once saved, this voice can be reused in subsequent TTS requests, ensuring consistent speech style and prosody. For more details on voice creation, see the [Voices Guide](/docs/text-to-speech-tts/voices).
@@ -4482,9 +4514,9 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.tools.delete_tool_version( - id="00183a3f-79ba-413d-9f3b-609864268bea", - version=1, +client.tts.voices.create( + generation_id="795c949a-1510-4a80-9646-7d0863b023ab", + name="David Hume", ) ``` @@ -4501,7 +4533,7 @@ client.empathic_voice.tools.delete_tool_version(
-**id:** `str` — Identifier for a Tool. Formatted as a UUID. +**generation_id:** `str` — A unique ID associated with this TTS generation that can be used as context for generating consistent speech style and prosody across multiple requests.
@@ -4509,13 +4541,7 @@ client.empathic_voice.tools.delete_tool_version(
-**version:** `int` - -Version number for a Tool. - -Tools, Configs, Custom Voices, and Prompts are versioned. This versioning system supports iterative development, allowing you to progressively refine tools and revert to previous versions if needed. - -Version numbers are integer values representing different iterations of the Tool. Each update to the Tool increments its version number. +**name:** `str` — Name of the voice in the `Voice Library`.
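A typical follow-up to `create` is reusing the saved voice by name in a later synthesis call. The sketch below strings the two together: the `generation_id` is a placeholder (use one returned by a previous synthesis), and passing `provider="CUSTOM_VOICE"` when referencing a saved voice is an assumption based on the provider values documented for `voices.list`.

```python
from hume import HumeClient
from hume.tts import PostedUtterance, PostedUtteranceVoiceWithName

client = HumeClient(api_key="YOUR_API_KEY")

# Save a previous generation as a named custom voice (placeholder generation_id).
client.tts.voices.create(
    generation_id="795c949a-1510-4a80-9646-7d0863b023ab",
    name="David Hume",
)

# Reuse the saved voice by name for consistent style and prosody across requests.
client.tts.synthesize_file_streaming(
    utterances=[
        PostedUtterance(
            text="A saved voice keeps style and prosody consistent across requests.",
            voice=PostedUtteranceVoiceWithName(
                name="David Hume",
                provider="CUSTOM_VOICE",  # assumed provider value for saved voices
            ),
        )
    ],
)
```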
@@ -4535,7 +4561,7 @@ Version numbers are integer values representing different iterations of the Tool
-
client.empathic_voice.tools.update_tool_description(...) +
client.tts.voices.delete(...)
@@ -4547,9 +4573,7 @@ Version numbers are integer values representing different iterations of the Tool
-Updates the description of a specified **Tool** version. - -Refer to our [tool use](/docs/speech-to-speech-evi/features/tool-use#function-calling) guide for comprehensive instructions on defining and integrating tools into EVI. +Deletes a previously generated custom voice.
@@ -4569,10 +4593,8 @@ from hume import HumeClient client = HumeClient( api_key="YOUR_API_KEY", ) -client.empathic_voice.tools.update_tool_description( - id="00183a3f-79ba-413d-9f3b-609864268bea", - version=1, - version_description="Fetches current temperature, precipitation, wind speed, AQI, and other weather conditions. Uses Celsius, Fahrenheit, or kelvin depending on user's region.", +client.tts.voices.delete( + name="David Hume", ) ``` @@ -4589,29 +4611,7 @@ client.empathic_voice.tools.update_tool_description(
-**id:** `str` — Identifier for a Tool. Formatted as a UUID. - -
-
- -
-
- -**version:** `int` - -Version number for a Tool. - -Tools, Configs, Custom Voices, and Prompts are versioned. This versioning system supports iterative development, allowing you to progressively refine tools and revert to previous versions if needed. - -Version numbers are integer values representing different iterations of the Tool. Each update to the Tool increments its version number. - -
-
- -
-
- -**version_description:** `typing.Optional[str]` — An optional description of the Tool version. +**name:** `str` — Name of the voice to delete
diff --git a/src/hume/base_client.py b/src/hume/base_client.py index 0feb2d67..31bfb7be 100644 --- a/src/hume/base_client.py +++ b/src/hume/base_client.py @@ -75,18 +75,10 @@ def __init__( else httpx.Client(timeout=_defaulted_timeout), timeout=_defaulted_timeout, ) - self._tts: typing.Optional[TtsClient] = None self._empathic_voice: typing.Optional[EmpathicVoiceClient] = None + self._tts: typing.Optional[TtsClient] = None self._expression_measurement: typing.Optional[ExpressionMeasurementClient] = None - @property - def tts(self): - if self._tts is None: - from .tts.client import TtsClient # noqa: E402 - - self._tts = TtsClient(client_wrapper=self._client_wrapper) - return self._tts - @property def empathic_voice(self): if self._empathic_voice is None: @@ -95,6 +87,14 @@ def empathic_voice(self): self._empathic_voice = EmpathicVoiceClient(client_wrapper=self._client_wrapper) return self._empathic_voice + @property + def tts(self): + if self._tts is None: + from .tts.client import TtsClient # noqa: E402 + + self._tts = TtsClient(client_wrapper=self._client_wrapper) + return self._tts + @property def expression_measurement(self): if self._expression_measurement is None: @@ -165,18 +165,10 @@ def __init__( else httpx.AsyncClient(timeout=_defaulted_timeout), timeout=_defaulted_timeout, ) - self._tts: typing.Optional[AsyncTtsClient] = None self._empathic_voice: typing.Optional[AsyncEmpathicVoiceClient] = None + self._tts: typing.Optional[AsyncTtsClient] = None self._expression_measurement: typing.Optional[AsyncExpressionMeasurementClient] = None - @property - def tts(self): - if self._tts is None: - from .tts.client import AsyncTtsClient # noqa: E402 - - self._tts = AsyncTtsClient(client_wrapper=self._client_wrapper) - return self._tts - @property def empathic_voice(self): if self._empathic_voice is None: @@ -185,6 +177,14 @@ def empathic_voice(self): self._empathic_voice = AsyncEmpathicVoiceClient(client_wrapper=self._client_wrapper) return self._empathic_voice + @property + def tts(self): + if self._tts is None: + from .tts.client import AsyncTtsClient # noqa: E402 + + self._tts = AsyncTtsClient(client_wrapper=self._client_wrapper) + return self._tts + @property def expression_measurement(self): if self._expression_measurement is None: diff --git a/src/hume/core/client_wrapper.py b/src/hume/core/client_wrapper.py index f55c269f..d7a684b1 100644 --- a/src/hume/core/client_wrapper.py +++ b/src/hume/core/client_wrapper.py @@ -23,10 +23,10 @@ def __init__( def get_headers(self) -> typing.Dict[str, str]: headers: typing.Dict[str, str] = { - "User-Agent": "hume/0.13.5", + "User-Agent": "hume/0.13.6", "X-Fern-Language": "Python", "X-Fern-SDK-Name": "hume", - "X-Fern-SDK-Version": "0.13.5", + "X-Fern-SDK-Version": "0.13.6", **(self.get_custom_headers() or {}), } if self.api_key is not None: diff --git a/src/hume/empathic_voice/chat/client.py.diff b/src/hume/empathic_voice/chat/client.py.diff deleted file mode 100644 index 29d9577b..00000000 --- a/src/hume/empathic_voice/chat/client.py.diff +++ /dev/null @@ -1,422 +0,0 @@ -diff --git a/src/hume/empathic_voice/chat/client.py b/src/hume/empathic_voice/chat/client.py -index 2a3732f5..910917c9 100644 ---- a/src/hume/empathic_voice/chat/client.py -+++ b/src/hume/empathic_voice/chat/client.py -@@ -1,11 +1,7 @@ - # This file was auto-generated by Fern from our API Definition. 
- --from contextlib import asynccontextmanager, contextmanager -- --import json - import typing -- --from typing_extensions import deprecated -+from contextlib import asynccontextmanager, contextmanager - - import httpx - import websockets.exceptions -@@ -14,34 +10,16 @@ from ...core.api_error import ApiError - from ...core.client_wrapper import AsyncClientWrapper, SyncClientWrapper - from ...core.request_options import RequestOptions - from ...core.serialization import convert_and_respect_annotation_metadata --from ...core.query_encoder import single_query_encoder - from ..types.connect_session_settings import ConnectSessionSettings - from .raw_client import AsyncRawChatClient, RawChatClient --from .socket_client import AsyncChatSocketClient, ChatSocketClient, ChatConnectOptions -- --from ...core.events import EventEmitterMixin, EventType --from ...core.pydantic_utilities import parse_obj_as --from ..types.assistant_input import AssistantInput --from ..types.audio_input import AudioInput --from ..types.pause_assistant_message import PauseAssistantMessage --from ..types.resume_assistant_message import ResumeAssistantMessage --from ..types.session_settings import SessionSettings --from ..types.tool_error_message import ToolErrorMessage --from ..types.tool_response_message import ToolResponseMessage --from ..types.user_input import UserInput --from .types.publish_event import PublishEvent --from ..types.subscribe_event import SubscribeEvent -- --from ...core.api_error import ApiError --import asyncio -- --from ...core.websocket import OnErrorHandlerType, OnMessageHandlerType, OnOpenCloseHandlerType -+from .socket_client import AsyncChatSocketClient, ChatSocketClient - - try: - from websockets.legacy.client import connect as websockets_client_connect # type: ignore - except ImportError: - from websockets import connect as websockets_client_connect # type: ignore - -+ - class ChatClient: - def __init__(self, *, client_wrapper: SyncClientWrapper): - self._raw_client = RawChatClient(client_wrapper=client_wrapper) -@@ -62,6 +40,7 @@ class ChatClient: - self, - *, - access_token: typing.Optional[str] = None, -+ allow_connection: typing.Optional[bool] = None, - config_id: typing.Optional[str] = None, - config_version: typing.Optional[int] = None, - event_limit: typing.Optional[int] = None, -@@ -69,7 +48,6 @@ class ChatClient: - verbose_transcription: typing.Optional[bool] = None, - api_key: typing.Optional[str] = None, - session_settings: ConnectSessionSettings, -- allow_connection: typing.Optional[bool] = None, - request_options: typing.Optional[RequestOptions] = None, - ) -> typing.Iterator[ChatSocketClient]: - """ -@@ -84,6 +62,9 @@ class ChatClient: - - For more details, refer to the [Authentication Strategies Guide](/docs/introduction/api-key#authentication-strategies). - -+ allow_connection : typing.Optional[bool] -+ Allows external connections to this chat via the /connect endpoint. -+ - config_id : typing.Optional[str] - The unique identifier for an EVI configuration. 
- -@@ -137,6 +118,8 @@ class ChatClient: - query_params = httpx.QueryParams() - if access_token is not None: - query_params = query_params.add("access_token", access_token) -+ if allow_connection is not None: -+ query_params = query_params.add("allow_connection", allow_connection) - if config_id is not None: - query_params = query_params.add("config_id", config_id) - if config_version is not None: -@@ -149,12 +132,18 @@ class ChatClient: - query_params = query_params.add("verbose_transcription", verbose_transcription) - if api_key is not None: - query_params = query_params.add("api_key", api_key) -- if allow_connection is not None: -- query_params = query_params.add("allow_connection", str(allow_connection).lower()) -- if session_settings is not None: -- flattened_params = single_query_encoder("session_settings", session_settings) -- for param_key, param_value in flattened_params: -- query_params = query_params.add(param_key, str(param_value)) -+ if ( -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ) -+ is not None -+ ): -+ query_params = query_params.add( -+ "session_settings", -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ), -+ ) - ws_url = ws_url + f"?{query_params}" - headers = self._raw_client._client_wrapper.get_headers() - if request_options and "additional_headers" in request_options: -@@ -197,14 +186,14 @@ class AsyncChatClient: - self, - *, - access_token: typing.Optional[str] = None, -+ allow_connection: typing.Optional[bool] = None, - config_id: typing.Optional[str] = None, - config_version: typing.Optional[int] = None, - event_limit: typing.Optional[int] = None, - resumed_chat_group_id: typing.Optional[str] = None, - verbose_transcription: typing.Optional[bool] = None, - api_key: typing.Optional[str] = None, -- session_settings: typing.Optional[ConnectSessionSettings] = None, -- allow_connection: typing.Optional[bool] = None, -+ session_settings: ConnectSessionSettings, - request_options: typing.Optional[RequestOptions] = None, - ) -> typing.AsyncIterator[AsyncChatSocketClient]: - """ -@@ -219,6 +208,9 @@ class AsyncChatClient: - - For more details, refer to the [Authentication Strategies Guide](/docs/introduction/api-key#authentication-strategies). - -+ allow_connection : typing.Optional[bool] -+ Allows external connections to this chat via the /connect endpoint. -+ - config_id : typing.Optional[str] - The unique identifier for an EVI configuration. - -@@ -261,11 +253,6 @@ class AsyncChatClient: - - session_settings : ConnectSessionSettings - -- allow_connection : typing.Optional[bool] -- Flag that allows the resulting Chat to accept secondary connections via -- the control plane `/connect` endpoint. Defaults to `False` on the server. -- Set to `True` to enable observer connections for the session. -- - request_options : typing.Optional[RequestOptions] - Request-specific configuration. 
- -@@ -277,6 +264,8 @@ class AsyncChatClient: - query_params = httpx.QueryParams() - if access_token is not None: - query_params = query_params.add("access_token", access_token) -+ if allow_connection is not None: -+ query_params = query_params.add("allow_connection", allow_connection) - if config_id is not None: - query_params = query_params.add("config_id", config_id) - if config_version is not None: -@@ -289,12 +278,18 @@ class AsyncChatClient: - query_params = query_params.add("verbose_transcription", verbose_transcription) - if api_key is not None: - query_params = query_params.add("api_key", api_key) -- if allow_connection is not None: -- query_params = query_params.add("allow_connection", str(allow_connection).lower()) -- if session_settings is not None: -- flattened_params = single_query_encoder("session_settings", session_settings) -- for param_key, param_value in flattened_params: -- query_params = query_params.add(param_key, str(param_value)) -+ if ( -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ) -+ is not None -+ ): -+ query_params = query_params.add( -+ "session_settings", -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ), -+ ) - ws_url = ws_url + f"?{query_params}" - headers = self._raw_client._client_wrapper.get_headers() - if request_options and "additional_headers" in request_options: -@@ -315,234 +310,3 @@ class AsyncChatClient: - headers=dict(headers), - body="Unexpected error when initializing websocket connection.", - ) -- -- @deprecated("") -- async def _wrap_on_open_close( -- self, on_open: typing.Optional[OnOpenCloseHandlerType] -- ): -- if on_open is not None: -- if asyncio.iscoroutinefunction(on_open): -- await on_open() -- else: -- on_open() -- -- @deprecated("") -- async def _wrap_on_error( -- self, exc: Exception, on_error: typing.Optional[OnErrorHandlerType] -- ) -> None: -- if on_error is not None: -- if asyncio.iscoroutinefunction(on_error): -- await on_error(exc) -- else: -- on_error(exc) -- -- @deprecated("") -- async def _wrap_on_message( -- self, -- message: SubscribeEvent, -- on_message: typing.Optional[OnMessageHandlerType[SubscribeEvent]], -- ) -> None: -- if on_message is not None: -- if asyncio.iscoroutinefunction(on_message): -- await on_message(message) -- else: -- on_message(message) -- -- async def _process_connection( -- self, -- connection: AsyncChatSocketClient, -- on_message: typing.Optional[OnMessageHandlerType], -- on_error: typing.Optional[OnErrorHandlerType], -- ) -> None: -- async for message in connection: -- try: -- await self._wrap_on_message(message, on_message) -- except Exception as exc: -- await self._wrap_on_error(exc, on_error) -- -- def _construct_ws_uri(self, options: typing.Optional[ChatConnectOptions]): -- query_params = httpx.QueryParams() -- -- api_key = self._raw_client._client_wrapper.api_key -- if options is not None: -- maybe_api_key = options.get("api_key") -- if maybe_api_key is not None: -- api_key = maybe_api_key -- maybe_config_id = options.get("config_id") -- if maybe_config_id is not None: -- query_params = query_params.add("config_id", maybe_config_id) -- maybe_config_version = options.get("config_version") -- if maybe_config_version is not None: -- query_params = query_params.add( -- "config_version", maybe_config_version -- ) -- maybe_resumed_chat_group_id = options.get("resumed_chat_group_id") -- if maybe_resumed_chat_group_id is not None: -- 
query_params = query_params.add( -- "resumed_chat_group_id", maybe_resumed_chat_group_id -- ) -- maybe_verbose_transcription = options.get("verbose_transcription") -- if maybe_verbose_transcription is not None: -- query_params = query_params.add( -- "verbose_transcription", -- "true" if maybe_verbose_transcription else "false", -- ) -- elif api_key is not None: -- query_params = query_params.add("apiKey", api_key) -- -- maybe_voice_id = options.get("voice_id") -- if maybe_voice_id is not None: -- query_params = query_params.add("voice_id", maybe_voice_id) -- -- maybe_session_settings = options.get("session_settings") -- if maybe_session_settings is not None: -- # Handle audio settings -- audio = maybe_session_settings.get("audio") -- if audio is not None: -- channels = audio.get("channels") -- if channels is not None: -- query_params = query_params.add( -- "session_settings[audio][channels]", str(channels) -- ) -- encoding = audio.get("encoding") -- if encoding is not None: -- query_params = query_params.add( -- "session_settings[audio][encoding]", str(encoding) -- ) -- sample_rate = audio.get("sample_rate") -- if sample_rate is not None: -- query_params = query_params.add( -- "session_settings[audio][sample_rate]", str(sample_rate) -- ) -- -- # Handle context settings -- context = maybe_session_settings.get("context") -- if context is not None: -- text = context.get("text") -- if text is not None: -- query_params = query_params.add( -- "session_settings[context][text]", str(text) -- ) -- context_type = context.get("type") -- if context_type is not None: -- query_params = query_params.add( -- "session_settings[context][type]", str(context_type) -- ) -- -- # Handle top-level session settings -- custom_session_id = maybe_session_settings.get("custom_session_id") -- if custom_session_id is not None: -- query_params = query_params.add( -- "session_settings[custom_session_id]", str(custom_session_id) -- ) -- -- event_limit = maybe_session_settings.get("event_limit") -- if event_limit is not None: -- query_params = query_params.add( -- "session_settings[event_limit]", str(event_limit) -- ) -- -- language_model_api_key = maybe_session_settings.get("language_model_api_key") -- if language_model_api_key is not None: -- query_params = query_params.add( -- "session_settings[language_model_api_key]", str(language_model_api_key) -- ) -- -- system_prompt = maybe_session_settings.get("system_prompt") -- if system_prompt is not None: -- query_params = query_params.add( -- "session_settings[system_prompt]", str(system_prompt) -- ) -- -- variables = maybe_session_settings.get("variables") -- if variables is not None: -- query_params = query_params.add( -- "session_settings[variables]", json.dumps(variables) -- ) -- -- voice_id_setting = maybe_session_settings.get("voice_id") -- if voice_id_setting is not None: -- query_params = query_params.add( -- "session_settings[voice_id]", str(voice_id_setting) -- ) -- elif api_key is not None: -- query_params = query_params.add("apiKey", api_key) -- -- base = self._raw_client._client_wrapper.get_environment().evi + "/chat" -- return f"{base}?{query_params}" -- -- @deprecated("Use .on() instead.") -- @asynccontextmanager -- async def connect_with_callbacks( -- self, -- options: typing.Optional[ChatConnectOptions] = None, -- on_open: typing.Optional[OnOpenCloseHandlerType] = None, -- on_message: typing.Optional[OnMessageHandlerType[SubscribeEvent]] = None, -- on_close: typing.Optional[OnOpenCloseHandlerType] = None, -- on_error: typing.Optional[OnErrorHandlerType] = 
None, -- ) -> typing.AsyncIterator["AsyncChatSocketClient"]: -- """ -- Parameters -- ---------- -- on_open : Optional[OnOpenCloseHandlerType] -- A callable to be invoked on the opening of the websocket connection. -- -- on_message : Optional[OnMessageHandlerType[SubscribeEvent]] -- A callable to be invoked on receiving a message from the websocket connection. This callback should expect a `SubscribeEvent` object. -- -- on_close : Optional[OnOpenCloseHandlerType] -- A callable to be invoked on the closing of the websocket connection. -- -- on_error : Optional[OnErrorHandlerType] -- A callable to be invoked on receiving an error from the websocket connection. -- -- Yields -- ------- -- AsyncIterator["AsyncChatSocketClient"] -- """ -- -- ws_uri = self._construct_ws_uri(options) -- -- background_task: typing.Optional[asyncio.Task[None]] = None -- -- try: -- async with websockets.connect( -- ws_uri, -- extra_headers=self._raw_client._client_wrapper.get_headers(), -- ) as protocol: -- await self._wrap_on_open_close(on_open) -- connection = AsyncChatSocketClient(websocket=protocol) -- background_task = asyncio.create_task( -- self._process_connection(connection, on_message, on_error) -- ) -- -- yield connection -- -- # Special case authentication errors -- except websockets.exceptions.InvalidStatusCode as exc: -- status_code: int = exc.status_code -- if status_code == 401: -- raise ApiError( -- status_code=status_code, -- body="Websocket initialized with invalid credentials.", -- ) from exc -- raise ApiError( -- status_code=status_code, -- body="Unexpected error when initializing websocket connection.", -- ) from exc -- -- # Except all other errors to apply the on_error handler -- except Exception as exc: -- await self._wrap_on_error(exc, on_error) -- raise -- -- # Finally, apply the on_close handler -- finally: -- if background_task is not None: -- background_task.cancel() -- try: -- await background_task -- except asyncio.CancelledError: -- pass -- await self._wrap_on_open_close(on_close) -- diff --git a/src/hume/empathic_voice/chat/raw_client.py.diff b/src/hume/empathic_voice/chat/raw_client.py.diff deleted file mode 100644 index cb61a2b3..00000000 --- a/src/hume/empathic_voice/chat/raw_client.py.diff +++ /dev/null @@ -1,112 +0,0 @@ -diff --git a/src/hume/empathic_voice/chat/raw_client.py b/src/hume/empathic_voice/chat/raw_client.py -index fefee870..ed718e98 100644 ---- a/src/hume/empathic_voice/chat/raw_client.py -+++ b/src/hume/empathic_voice/chat/raw_client.py -@@ -10,7 +10,6 @@ from ...core.api_error import ApiError - from ...core.client_wrapper import AsyncClientWrapper, SyncClientWrapper - from ...core.request_options import RequestOptions - from ...core.serialization import convert_and_respect_annotation_metadata --from ...core.query_encoder import single_query_encoder - from ..types.connect_session_settings import ConnectSessionSettings - from .socket_client import AsyncChatSocketClient, ChatSocketClient - -@@ -29,6 +28,7 @@ class RawChatClient: - self, - *, - access_token: typing.Optional[str] = None, -+ allow_connection: typing.Optional[bool] = None, - config_id: typing.Optional[str] = None, - config_version: typing.Optional[int] = None, - event_limit: typing.Optional[int] = None, -@@ -50,6 +50,9 @@ class RawChatClient: - - For more details, refer to the [Authentication Strategies Guide](/docs/introduction/api-key#authentication-strategies). - -+ allow_connection : typing.Optional[bool] -+ Allows external connections to this chat via the /connect endpoint. 
-+ - config_id : typing.Optional[str] - The unique identifier for an EVI configuration. - -@@ -103,6 +106,8 @@ class RawChatClient: - query_params = httpx.QueryParams() - if access_token is not None: - query_params = query_params.add("access_token", access_token) -+ if allow_connection is not None: -+ query_params = query_params.add("allow_connection", allow_connection) - if config_id is not None: - query_params = query_params.add("config_id", config_id) - if config_version is not None: -@@ -115,10 +120,18 @@ class RawChatClient: - query_params = query_params.add("verbose_transcription", verbose_transcription) - if api_key is not None: - query_params = query_params.add("api_key", api_key) -- if session_settings is not None: -- flattened_params = single_query_encoder("session_settings", session_settings) -- for param_key, param_value in flattened_params: -- query_params = query_params.add(param_key, str(param_value)) -+ if ( -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ) -+ is not None -+ ): -+ query_params = query_params.add( -+ "session_settings", -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ), -+ ) - ws_url = ws_url + f"?{query_params}" - headers = self._client_wrapper.get_headers() - if request_options and "additional_headers" in request_options: -@@ -150,6 +163,7 @@ class AsyncRawChatClient: - self, - *, - access_token: typing.Optional[str] = None, -+ allow_connection: typing.Optional[bool] = None, - config_id: typing.Optional[str] = None, - config_version: typing.Optional[int] = None, - event_limit: typing.Optional[int] = None, -@@ -171,6 +185,9 @@ class AsyncRawChatClient: - - For more details, refer to the [Authentication Strategies Guide](/docs/introduction/api-key#authentication-strategies). - -+ allow_connection : typing.Optional[bool] -+ Allows external connections to this chat via the /connect endpoint. -+ - config_id : typing.Optional[str] - The unique identifier for an EVI configuration. 
- -@@ -224,6 +241,8 @@ class AsyncRawChatClient: - query_params = httpx.QueryParams() - if access_token is not None: - query_params = query_params.add("access_token", access_token) -+ if allow_connection is not None: -+ query_params = query_params.add("allow_connection", allow_connection) - if config_id is not None: - query_params = query_params.add("config_id", config_id) - if config_version is not None: -@@ -236,10 +255,18 @@ class AsyncRawChatClient: - query_params = query_params.add("verbose_transcription", verbose_transcription) - if api_key is not None: - query_params = query_params.add("api_key", api_key) -- if session_settings is not None: -- flattened_params = single_query_encoder("session_settings", session_settings) -- for param_key, param_value in flattened_params: -- query_params = query_params.add(param_key, str(param_value)) -+ if ( -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ) -+ is not None -+ ): -+ query_params = query_params.add( -+ "session_settings", -+ convert_and_respect_annotation_metadata( -+ object_=session_settings, annotation=ConnectSessionSettings, direction="write" -+ ), -+ ) - ws_url = ws_url + f"?{query_params}" - headers = self._client_wrapper.get_headers() - if request_options and "additional_headers" in request_options: diff --git a/src/hume/empathic_voice/chat/socket_client.py.diff b/src/hume/empathic_voice/chat/socket_client.py.diff deleted file mode 100644 index 77c80ea0..00000000 --- a/src/hume/empathic_voice/chat/socket_client.py.diff +++ /dev/null @@ -1,161 +0,0 @@ -diff --git a/src/hume/empathic_voice/chat/socket_client.py b/src/hume/empathic_voice/chat/socket_client.py -index de3b4a5e..18ee74ab 100644 ---- a/src/hume/empathic_voice/chat/socket_client.py -+++ b/src/hume/empathic_voice/chat/socket_client.py -@@ -6,21 +6,10 @@ from json.decoder import JSONDecodeError - - import websockets - import websockets.sync.connection as websockets_sync_connection --from typing_extensions import deprecated --from contextlib import asynccontextmanager -- - from ...core.events import EventEmitterMixin, EventType - from ...core.pydantic_utilities import parse_obj_as --from ..types.assistant_input import AssistantInput --from ..types.audio_input import AudioInput --from ..types.pause_assistant_message import PauseAssistantMessage --from ..types.resume_assistant_message import ResumeAssistantMessage --from ..types.session_settings import SessionSettings --from ..types.tool_error_message import ToolErrorMessage --from ..types.tool_response_message import ToolResponseMessage --from ..types.user_input import UserInput --from .types.publish_event import PublishEvent - from ..types.subscribe_event import SubscribeEvent -+from .types.publish_event import PublishEvent - - try: - from websockets.legacy.client import WebSocketClientProtocol # type: ignore -@@ -29,58 +18,6 @@ except ImportError: - - ChatSocketClientResponse = typing.Union[SubscribeEvent] - --class ChatConnectSessionSettingsAudio(typing.TypedDict, total=False): -- channels: typing.Optional[int] -- encoding: typing.Optional[str] -- sample_rate: typing.Optional[int] -- -- --class ChatConnectSessionSettingsContext(typing.TypedDict, total=False): -- text: typing.Optional[str] -- -- --SessionSettingsVariablesValue = typing.Union[str, float, bool] -- --class ChatConnectSessionSettings(typing.TypedDict, total=False): -- audio: typing.Optional[ChatConnectSessionSettingsAudio] -- context: 
typing.Optional[ChatConnectSessionSettingsContext] -- custom_session_id: typing.Optional[str] -- event_limit: typing.Optional[int] -- language_model_api_key: typing.Optional[str] -- system_prompt: typing.Optional[str] -- variables: typing.Optional[typing.Dict[str, SessionSettingsVariablesValue]] -- voice_id: typing.Optional[str] -- --@deprecated("Use .connect() with kwargs instead.") --class ChatConnectOptions(typing.TypedDict, total=False): -- config_id: typing.Optional[str] -- """ -- The ID of the configuration. -- """ -- -- config_version: typing.Optional[str] -- """ -- The version of the configuration. -- """ -- -- api_key: typing.Optional[str] -- -- secret_key: typing.Optional[str] -- -- resumed_chat_group_id: typing.Optional[str] -- -- verbose_transcription: typing.Optional[bool] -- -- """ -- ID of the Voice to use for this chat. If specified, will override the voice set in the Config -- """ -- voice_id: typing.Optional[str] -- -- session_settings: typing.Optional[typing.Dict] -- """ -- Session settings to apply at connection time. Supports all SessionSettings fields except -- builtin_tools, type, metadata, and tools. Additionally supports event_limit. -- """ - - class AsyncChatSocketClient(EventEmitterMixin): - def __init__(self, *, websocket: WebSocketClientProtocol): -@@ -141,38 +78,6 @@ class AsyncChatSocketClient(EventEmitterMixin): - """ - await self._send(data.dict()) - -- @deprecated("Use send_publish instead.") -- async def send_audio_input(self, message: AudioInput) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_session_settings(self, message: SessionSettings) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_user_input(self, message: UserInput) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_assistant_input(self, message: AssistantInput) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_tool_response(self, message: ToolResponseMessage) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_tool_error(self, message: ToolErrorMessage) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_pause_assistant(self, message: PauseAssistantMessage) -> None: -- await self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- async def send_resume_assistant(self, message: ResumeAssistantMessage) -> None: -- await self.send_publish(message) -- - - class ChatSocketClient(EventEmitterMixin): - def __init__(self, *, websocket: websockets_sync_connection.Connection): -@@ -232,35 +137,3 @@ class ChatSocketClient(EventEmitterMixin): - Send a Pydantic model to the websocket connection. 
- """ - self._send(data.dict()) -- -- @deprecated("Use send_publish instead.") -- def send_audio_input(self, message: AudioInput) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_session_settings(self, message: SessionSettings) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_user_input(self, message: UserInput) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_assistant_input(self, message: AssistantInput) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_tool_response(self, message: ToolResponseMessage) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_tool_error(self, message: ToolErrorMessage) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_pause_assistant(self, message: PauseAssistantMessage) -> None: -- self.send_publish(message) -- -- @deprecated("Use send_publish instead.") -- def send_resume_assistant(self, message: ResumeAssistantMessage) -> None: -- self.send_publish(message) diff --git a/src/hume/empathic_voice/client.py.diff b/src/hume/empathic_voice/client.py.diff deleted file mode 100644 index 84b74be1..00000000 --- a/src/hume/empathic_voice/client.py.diff +++ /dev/null @@ -1,201 +0,0 @@ -diff --git a/src/hume/empathic_voice/client.py b/src/hume/empathic_voice/client.py -index e9119462..241410a1 100644 ---- a/src/hume/empathic_voice/client.py -+++ b/src/hume/empathic_voice/client.py -@@ -4,8 +4,6 @@ from __future__ import annotations - - import typing - --from hume.empathic_voice.chat.client import AsyncChatClient, ChatClient -- - from ..core.client_wrapper import AsyncClientWrapper, SyncClientWrapper - from .raw_client import AsyncRawEmpathicVoiceClient, RawEmpathicVoiceClient - -@@ -13,6 +11,7 @@ if typing.TYPE_CHECKING: - from .chat_groups.client import AsyncChatGroupsClient, ChatGroupsClient - from .chats.client import AsyncChatsClient, ChatsClient - from .configs.client import AsyncConfigsClient, ConfigsClient -+ from .control_plane.client import AsyncControlPlaneClient, ControlPlaneClient - from .prompts.client import AsyncPromptsClient, PromptsClient - from .tools.client import AsyncToolsClient, ToolsClient - -@@ -21,12 +20,12 @@ class EmpathicVoiceClient: - def __init__(self, *, client_wrapper: SyncClientWrapper): - self._raw_client = RawEmpathicVoiceClient(client_wrapper=client_wrapper) - self._client_wrapper = client_wrapper -- self._tools: typing.Optional[ToolsClient] = None -- self._prompts: typing.Optional[PromptsClient] = None -- self._configs: typing.Optional[ConfigsClient] = None -- self._chats: typing.Optional[ChatsClient] = None -+ self._control_plane: typing.Optional[ControlPlaneClient] = None - self._chat_groups: typing.Optional[ChatGroupsClient] = None -- self._chat: typing.Optional[ChatClient] = None -+ self._chats: typing.Optional[ChatsClient] = None -+ self._configs: typing.Optional[ConfigsClient] = None -+ self._prompts: typing.Optional[PromptsClient] = None -+ self._tools: typing.Optional[ToolsClient] = None - - @property - def with_raw_response(self) -> RawEmpathicVoiceClient: -@@ -40,20 +39,28 @@ class EmpathicVoiceClient: - return self._raw_client - - @property -- def tools(self): -- if self._tools is None: -- from .tools.client import ToolsClient # noqa: E402 -+ def control_plane(self): -+ if self._control_plane is None: -+ from .control_plane.client import 
ControlPlaneClient # noqa: E402 - -- self._tools = ToolsClient(client_wrapper=self._client_wrapper) -- return self._tools -+ self._control_plane = ControlPlaneClient(client_wrapper=self._client_wrapper) -+ return self._control_plane - - @property -- def prompts(self): -- if self._prompts is None: -- from .prompts.client import PromptsClient # noqa: E402 -+ def chat_groups(self): -+ if self._chat_groups is None: -+ from .chat_groups.client import ChatGroupsClient # noqa: E402 - -- self._prompts = PromptsClient(client_wrapper=self._client_wrapper) -- return self._prompts -+ self._chat_groups = ChatGroupsClient(client_wrapper=self._client_wrapper) -+ return self._chat_groups -+ -+ @property -+ def chats(self): -+ if self._chats is None: -+ from .chats.client import ChatsClient # noqa: E402 -+ -+ self._chats = ChatsClient(client_wrapper=self._client_wrapper) -+ return self._chats - - @property - def configs(self): -@@ -64,32 +71,32 @@ class EmpathicVoiceClient: - return self._configs - - @property -- def chats(self): -- if self._chats is None: -- from .chats.client import ChatsClient # noqa: E402 -+ def prompts(self): -+ if self._prompts is None: -+ from .prompts.client import PromptsClient # noqa: E402 - -- self._chats = ChatsClient(client_wrapper=self._client_wrapper) -- return self._chats -+ self._prompts = PromptsClient(client_wrapper=self._client_wrapper) -+ return self._prompts - - @property -- def chat_groups(self): -- if self._chat_groups is None: -- from .chat_groups.client import ChatGroupsClient # noqa: E402 -+ def tools(self): -+ if self._tools is None: -+ from .tools.client import ToolsClient # noqa: E402 - -- self._chat_groups = ChatGroupsClient(client_wrapper=self._client_wrapper) -- return self._chat_groups -+ self._tools = ToolsClient(client_wrapper=self._client_wrapper) -+ return self._tools - - - class AsyncEmpathicVoiceClient: - def __init__(self, *, client_wrapper: AsyncClientWrapper): - self._raw_client = AsyncRawEmpathicVoiceClient(client_wrapper=client_wrapper) - self._client_wrapper = client_wrapper -- self._tools: typing.Optional[AsyncToolsClient] = None -- self._prompts: typing.Optional[AsyncPromptsClient] = None -- self._configs: typing.Optional[AsyncConfigsClient] = None -- self._chats: typing.Optional[AsyncChatsClient] = None -+ self._control_plane: typing.Optional[AsyncControlPlaneClient] = None - self._chat_groups: typing.Optional[AsyncChatGroupsClient] = None -- self._chat: typing.Optional[AsyncChatClient] = None -+ self._chats: typing.Optional[AsyncChatsClient] = None -+ self._configs: typing.Optional[AsyncConfigsClient] = None -+ self._prompts: typing.Optional[AsyncPromptsClient] = None -+ self._tools: typing.Optional[AsyncToolsClient] = None - - @property - def with_raw_response(self) -> AsyncRawEmpathicVoiceClient: -@@ -103,28 +110,20 @@ class AsyncEmpathicVoiceClient: - return self._raw_client - - @property -- def tools(self): -- if self._tools is None: -- from .tools.client import AsyncToolsClient # noqa: E402 -+ def control_plane(self): -+ if self._control_plane is None: -+ from .control_plane.client import AsyncControlPlaneClient # noqa: E402 - -- self._tools = AsyncToolsClient(client_wrapper=self._client_wrapper) -- return self._tools -+ self._control_plane = AsyncControlPlaneClient(client_wrapper=self._client_wrapper) -+ return self._control_plane - - @property -- def prompts(self): -- if self._prompts is None: -- from .prompts.client import AsyncPromptsClient # noqa: E402 -- -- self._prompts = AsyncPromptsClient(client_wrapper=self._client_wrapper) 
-- return self._prompts -- -- @property -- def configs(self): -- if self._configs is None: -- from .configs.client import AsyncConfigsClient # noqa: E402 -+ def chat_groups(self): -+ if self._chat_groups is None: -+ from .chat_groups.client import AsyncChatGroupsClient # noqa: E402 - -- self._configs = AsyncConfigsClient(client_wrapper=self._client_wrapper) -- return self._configs -+ self._chat_groups = AsyncChatGroupsClient(client_wrapper=self._client_wrapper) -+ return self._chat_groups - - @property - def chats(self): -@@ -135,17 +134,25 @@ class AsyncEmpathicVoiceClient: - return self._chats - - @property -- def chat_groups(self): -- if self._chat_groups is None: -- from .chat_groups.client import AsyncChatGroupsClient # noqa: E402 -+ def configs(self): -+ if self._configs is None: -+ from .configs.client import AsyncConfigsClient # noqa: E402 - -- self._chat_groups = AsyncChatGroupsClient(client_wrapper=self._client_wrapper) -- return self._chat_groups -+ self._configs = AsyncConfigsClient(client_wrapper=self._client_wrapper) -+ return self._configs - - @property -- def chat(self): -- if self._chat is None: -- from .chat.client import AsyncChatClient # noqa: E402 -+ def prompts(self): -+ if self._prompts is None: -+ from .prompts.client import AsyncPromptsClient # noqa: E402 - -- self._chat = AsyncChatClient(client_wrapper=self._client_wrapper) -- return self._chat -+ self._prompts = AsyncPromptsClient(client_wrapper=self._client_wrapper) -+ return self._prompts -+ -+ @property -+ def tools(self): -+ if self._tools is None: -+ from .tools.client import AsyncToolsClient # noqa: E402 -+ -+ self._tools = AsyncToolsClient(client_wrapper=self._client_wrapper) -+ return self._tools diff --git a/src/hume/empathic_voice/types/assistant_end.py b/src/hume/empathic_voice/types/assistant_end.py index 53c0cb27..b8caf0cc 100644 --- a/src/hume/empathic_voice/types/assistant_end.py +++ b/src/hume/empathic_voice/types/assistant_end.py @@ -8,7 +8,7 @@ class AssistantEnd(UniversalBaseModel): """ - When provided, the output is an assistant end message. + **Indicates the conclusion of the assistant's response**, signaling that the assistant has finished speaking for the current conversational turn. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/assistant_input.py b/src/hume/empathic_voice/types/assistant_input.py index 11e4d279..449e8d83 100644 --- a/src/hume/empathic_voice/types/assistant_input.py +++ b/src/hume/empathic_voice/types/assistant_input.py @@ -8,7 +8,9 @@ class AssistantInput(UniversalBaseModel): """ - When provided, the input is spoken by EVI. + **Assistant text to synthesize into spoken audio and insert into the conversation.** EVI uses this text to generate spoken audio using our proprietary expressive text-to-speech model. + + Our model adds appropriate emotional inflections and tones to the text based on the user's expressions and the context of the conversation. The synthesized audio is streamed back to the user as an Assistant Message. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/assistant_message.py b/src/hume/empathic_voice/types/assistant_message.py index 794a07fb..52032677 100644 --- a/src/hume/empathic_voice/types/assistant_message.py +++ b/src/hume/empathic_voice/types/assistant_message.py @@ -10,7 +10,7 @@ class AssistantMessage(UniversalBaseModel): """ - When provided, the output is an assistant message. 
+ **Transcript of the assistant's message.** Contains the message role, content, and optionally tool call information including the tool name, parameters, response requirement status, tool call ID, and tool type. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/assistant_prosody.py b/src/hume/empathic_voice/types/assistant_prosody.py index 12cd01a3..bc363182 100644 --- a/src/hume/empathic_voice/types/assistant_prosody.py +++ b/src/hume/empathic_voice/types/assistant_prosody.py @@ -9,7 +9,7 @@ class AssistantProsody(UniversalBaseModel): """ - When provided, the output is an Assistant Prosody message. + **Expression measurement predictions of the assistant's audio output.** Contains inference model results including prosody scores for 48 emotions within the detected expression of the assistant's audio sample. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/audio_input.py b/src/hume/empathic_voice/types/audio_input.py index b61beb5a..4325fc44 100644 --- a/src/hume/empathic_voice/types/audio_input.py +++ b/src/hume/empathic_voice/types/audio_input.py @@ -8,7 +8,9 @@ class AudioInput(UniversalBaseModel): """ - When provided, the input is audio. + **Base64 encoded audio input to insert into the conversation.** The content is treated as the user's speech to EVI and must be streamed continuously. Pre-recorded audio files are not supported. + + For optimal transcription quality, the audio data should be transmitted in small chunks. Hume recommends streaming audio with a buffer window of `20` milliseconds (ms), or `100` milliseconds (ms) for web applications. See our [Audio Guide](/docs/speech-to-speech-evi/guides/audio) for more details on preparing and processing audio. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) @@ -30,8 +32,6 @@ class AudioInput(UniversalBaseModel): type: typing.Literal["audio_input"] = pydantic.Field(default="audio_input") """ The type of message sent through the socket; must be `audio_input` for our server to correctly identify and process it as an Audio Input message. - - This message is used for sending audio input data to EVI for processing and expression measurement. Audio data should be sent as a continuous stream, encoded in Base64. """ if IS_PYDANTIC_V2: diff --git a/src/hume/empathic_voice/types/audio_output.py b/src/hume/empathic_voice/types/audio_output.py index 8dc4cf66..ec37c5c9 100644 --- a/src/hume/empathic_voice/types/audio_output.py +++ b/src/hume/empathic_voice/types/audio_output.py @@ -8,7 +8,9 @@ class AudioOutput(UniversalBaseModel): """ - The type of message sent through the socket; for an Audio Output message, this must be `audio_output`. + **Base64 encoded audio output.** This encoded audio is transmitted to the client, where it can be decoded and played back as part of the user interaction. The returned audio format is WAV and the sample rate is 48kHz. + + Contains the audio data, an ID to track and reference the audio output, and an index indicating the chunk position relative to the whole audio segment. See our [Audio Guide](/docs/speech-to-speech-evi/guides/audio) for more details on preparing and processing audio. 
""" custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/chat_metadata.py b/src/hume/empathic_voice/types/chat_metadata.py index 4574b522..56be620e 100644 --- a/src/hume/empathic_voice/types/chat_metadata.py +++ b/src/hume/empathic_voice/types/chat_metadata.py @@ -8,7 +8,9 @@ class ChatMetadata(UniversalBaseModel): """ - When provided, the output is a chat metadata message. + **The first message received after establishing a connection with EVI**, containing important identifiers for the current Chat session. + + Includes the Chat ID (which allows the Chat session to be tracked and referenced) and the Chat Group ID (used to resume a Chat when passed in the `resumed_chat_group_id` query parameter of a subsequent connection request, allowing EVI to continue the conversation from where it left off within the Chat Group). """ chat_group_id: str = pydantic.Field() diff --git a/src/hume/empathic_voice/types/pause_assistant_message.py b/src/hume/empathic_voice/types/pause_assistant_message.py index 2cb93f85..52dcd6cf 100644 --- a/src/hume/empathic_voice/types/pause_assistant_message.py +++ b/src/hume/empathic_voice/types/pause_assistant_message.py @@ -8,7 +8,9 @@ class PauseAssistantMessage(UniversalBaseModel): """ - Pause responses from EVI. Chat history is still saved and sent after resuming. + **Pause responses from EVI.** Chat history is still saved and sent after resuming. Once this message is sent, EVI will not respond until a Resume Assistant message is sent. + + When paused, EVI won't respond, but transcriptions of your audio inputs will still be recorded. See our [Pause Response Guide](/docs/speech-to-speech-evi/features/pause-responses) for further details. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/resume_assistant_message.py b/src/hume/empathic_voice/types/resume_assistant_message.py index 1660b222..3fd67f96 100644 --- a/src/hume/empathic_voice/types/resume_assistant_message.py +++ b/src/hume/empathic_voice/types/resume_assistant_message.py @@ -8,7 +8,9 @@ class ResumeAssistantMessage(UniversalBaseModel): """ - Resume responses from EVI. Chat history sent while paused will now be sent. + **Resume responses from EVI.** Chat history sent while paused will now be sent. + + Upon resuming, if any audio input was sent during the pause, EVI will retain context from all messages sent but only respond to the last user message. See our [Pause Response Guide](/docs/speech-to-speech-evi/features/pause-responses) for further details. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/session_settings.py b/src/hume/empathic_voice/types/session_settings.py index 998f7aa6..b0cc93fb 100644 --- a/src/hume/empathic_voice/types/session_settings.py +++ b/src/hume/empathic_voice/types/session_settings.py @@ -13,7 +13,9 @@ class SessionSettings(UniversalBaseModel): """ - Settings for this chat session. + **Settings for this chat session.** Session settings are temporary and apply only to the current Chat session. + + These settings can be adjusted dynamically based on the requirements of each session to ensure optimal performance and user experience. See our [Session Settings Guide](/docs/speech-to-speech-evi/configuration/session-settings) for a complete list of configurable settings. 
""" audio: typing.Optional[AudioConfiguration] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/tool_error_message.py b/src/hume/empathic_voice/types/tool_error_message.py index 9e143879..fc35e49a 100644 --- a/src/hume/empathic_voice/types/tool_error_message.py +++ b/src/hume/empathic_voice/types/tool_error_message.py @@ -10,7 +10,9 @@ class ToolErrorMessage(UniversalBaseModel): """ - When provided, the output is a function call error. + **Error message from the tool call**, not exposed to the LLM or user. Upon receiving a Tool Call message and failing to invoke the function, this message is sent to notify EVI of the tool's failure. + + For built-in tools implemented on the server, you will receive this message type rather than a `ToolCallMessage` if the tool fails. See our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for further details. """ code: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/tool_response_message.py b/src/hume/empathic_voice/types/tool_response_message.py index cbc30a5c..e299d4ea 100644 --- a/src/hume/empathic_voice/types/tool_response_message.py +++ b/src/hume/empathic_voice/types/tool_response_message.py @@ -9,7 +9,9 @@ class ToolResponseMessage(UniversalBaseModel): """ - When provided, the output is a function call response. + **Return value of the tool call.** Contains the output generated by the tool to pass back to EVI. Upon receiving a Tool Call message and successfully invoking the function, this message is sent to convey the result of the function call back to EVI. + + For built-in tools implemented on the server, you will receive this message type rather than a `ToolCallMessage`. See our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for further details. """ content: str = pydantic.Field() diff --git a/src/hume/empathic_voice/types/user_input.py b/src/hume/empathic_voice/types/user_input.py index 448cc47a..5db7a433 100644 --- a/src/hume/empathic_voice/types/user_input.py +++ b/src/hume/empathic_voice/types/user_input.py @@ -8,7 +8,7 @@ class UserInput(UniversalBaseModel): """ - User text to insert into the conversation. Text sent through a User Input message is treated as the user's speech to EVI. EVI processes this input and provides a corresponding response. + **User text to insert into the conversation.** Text sent through a User Input message is treated as the user's speech to EVI. EVI processes this input and provides a corresponding response. Expression measurement results are not available for User Input messages, as the prosody model relies on audio input and cannot process text alone. """ @@ -21,8 +21,6 @@ class UserInput(UniversalBaseModel): text: str = pydantic.Field() """ User text to insert into the conversation. Text sent through a User Input message is treated as the user's speech to EVI. EVI processes this input and provides a corresponding response. - - Expression measurement results are not available for User Input messages, as the prosody model relies on audio input and cannot process text alone. 
""" type: typing.Literal["user_input"] = pydantic.Field(default="user_input") diff --git a/src/hume/empathic_voice/types/user_interruption.py b/src/hume/empathic_voice/types/user_interruption.py index 58914b38..c3be4945 100644 --- a/src/hume/empathic_voice/types/user_interruption.py +++ b/src/hume/empathic_voice/types/user_interruption.py @@ -8,7 +8,9 @@ class UserInterruption(UniversalBaseModel): """ - When provided, the output is an interruption. + **Indicates the user has interrupted the assistant's response.** EVI detects the interruption in real-time and sends this message to signal the interruption event. + + This message allows the system to stop the current audio playback, clear the audio queue, and prepare to handle new user input. Contains a Unix timestamp of when the user interruption was detected. For more details, see our [Interruptibility Guide](/docs/speech-to-speech-evi/features/interruptibility) """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/user_message.py b/src/hume/empathic_voice/types/user_message.py index b043de23..ebc07db8 100644 --- a/src/hume/empathic_voice/types/user_message.py +++ b/src/hume/empathic_voice/types/user_message.py @@ -11,7 +11,9 @@ class UserMessage(UniversalBaseModel): """ - When provided, the output is a user message. + **Transcript of the user's message.** Contains the message role and content, along with a `from_text` field indicating if this message was inserted into the conversation as text from a `UserInput` message. + + Includes an `interim` field indicating whether the transcript is provisional (words may be repeated or refined in subsequent `UserMessage` responses as additional audio is processed) or final and complete. Interim transcripts are only sent when the `verbose_transcription` query parameter is set to true in the initial handshake. """ custom_session_id: typing.Optional[str] = pydantic.Field(default=None) diff --git a/src/hume/empathic_voice/types/web_socket_error.py b/src/hume/empathic_voice/types/web_socket_error.py index 1f5b8b5a..ebf81857 100644 --- a/src/hume/empathic_voice/types/web_socket_error.py +++ b/src/hume/empathic_voice/types/web_socket_error.py @@ -8,7 +8,9 @@ class WebSocketError(UniversalBaseModel): """ - When provided, the output is an error message. + **Indicates a disruption in the WebSocket connection**, such as an unexpected disconnection, protocol error, or data transmission issue. + + Contains an error code identifying the type of error encountered, a detailed description of the error, and a short, human-readable identifier and description (slug) for the error. """ code: str = pydantic.Field() diff --git a/src/hume/expression_measurement/batch/types/inference_job.py.diff b/src/hume/expression_measurement/batch/types/inference_job.py.diff deleted file mode 100644 index f2e72b13..00000000 --- a/src/hume/expression_measurement/batch/types/inference_job.py.diff +++ /dev/null @@ -1,24 +0,0 @@ -diff --git a/src/hume/expression_measurement/batch/types/inference_job.py b/src/hume/expression_measurement/batch/types/inference_job.py -index 08add412..83a68f84 100644 ---- a/src/hume/expression_measurement/batch/types/inference_job.py -+++ b/src/hume/expression_measurement/batch/types/inference_job.py -@@ -1,7 +1,6 @@ - # This file was auto-generated by Fern from our API Definition. 
- - import typing --from typing_extensions import deprecated - - import pydantic - from ....core.pydantic_utilities import IS_PYDANTIC_V2 -@@ -16,11 +15,6 @@ class InferenceJob(JobInference): - Jobs created with the Expression Measurement API will have this field set to `INFERENCE`. - """ - -- @property -- @deprecated("Use .state.status instead") -- def status(self) -> str: -- return self.state.status -- - if IS_PYDANTIC_V2: - model_config: typing.ClassVar[pydantic.ConfigDict] = pydantic.ConfigDict(extra="allow", frozen=True) # type: ignore # Pydantic v2 - else: diff --git a/src/hume/expression_measurement/client.py.diff b/src/hume/expression_measurement/client.py.diff deleted file mode 100644 index 612621be..00000000 --- a/src/hume/expression_measurement/client.py.diff +++ /dev/null @@ -1,70 +0,0 @@ -diff --git a/src/hume/expression_measurement/client.py b/src/hume/expression_measurement/client.py -index f75d9210..a7651e40 100644 ---- a/src/hume/expression_measurement/client.py -+++ b/src/hume/expression_measurement/client.py -@@ -8,16 +8,14 @@ from ..core.client_wrapper import AsyncClientWrapper, SyncClientWrapper - from .raw_client import AsyncRawExpressionMeasurementClient, RawExpressionMeasurementClient - - if typing.TYPE_CHECKING: -- from .batch.client_with_utils import AsyncBatchClientWithUtils, BatchClientWithUtils -- from .stream.stream.client import StreamClient, AsyncStreamClient -+ from .batch.client import AsyncBatchClient, BatchClient - - - class ExpressionMeasurementClient: - def __init__(self, *, client_wrapper: SyncClientWrapper): - self._raw_client = RawExpressionMeasurementClient(client_wrapper=client_wrapper) - self._client_wrapper = client_wrapper -- self._batch: typing.Optional[BatchClientWithUtils] = None -- self._stream: typing.Optional[StreamClient] = None -+ self._batch: typing.Optional[BatchClient] = None - - @property - def with_raw_response(self) -> RawExpressionMeasurementClient: -@@ -33,25 +31,17 @@ class ExpressionMeasurementClient: - @property - def batch(self): - if self._batch is None: -- from .batch.client_with_utils import BatchClientWithUtils # noqa: E402 -+ from .batch.client import BatchClient # noqa: E402 - -- self._batch = BatchClientWithUtils(client_wrapper=self._client_wrapper) -+ self._batch = BatchClient(client_wrapper=self._client_wrapper) - return self._batch - -- @property -- def stream(self): -- if self._stream is None: -- from .stream.stream.client import StreamClient # noqa: E402 -- self._stream = StreamClient(client_wrapper=self._client_wrapper) -- return self._stream -- - - class AsyncExpressionMeasurementClient: - def __init__(self, *, client_wrapper: AsyncClientWrapper): - self._raw_client = AsyncRawExpressionMeasurementClient(client_wrapper=client_wrapper) - self._client_wrapper = client_wrapper -- self._batch: typing.Optional[AsyncBatchClientWithUtils] = None -- self._stream: typing.Optional[AsyncStreamClient] = None -+ self._batch: typing.Optional[AsyncBatchClient] = None - - @property - def with_raw_response(self) -> AsyncRawExpressionMeasurementClient: -@@ -67,15 +57,7 @@ class AsyncExpressionMeasurementClient: - @property - def batch(self): - if self._batch is None: -- from .batch.client_with_utils import AsyncBatchClientWithUtils # noqa: E402 -+ from .batch.client import AsyncBatchClient # noqa: E402 - -- self._batch = AsyncBatchClientWithUtils(client_wrapper=self._client_wrapper) -+ self._batch = AsyncBatchClient(client_wrapper=self._client_wrapper) - return self._batch -- -- @property -- def stream(self): -- if self._stream 
is None: -- from .stream.stream.client import AsyncStreamClient # noqa: E402 -- -- self._stream = AsyncStreamClient(client_wrapper=self._client_wrapper) -- return self._stream diff --git a/src/hume/expression_measurement/stream/stream/socket_client.py.diff b/src/hume/expression_measurement/stream/stream/socket_client.py.diff deleted file mode 100644 index ac65dfb4..00000000 --- a/src/hume/expression_measurement/stream/stream/socket_client.py.diff +++ /dev/null @@ -1,170 +0,0 @@ -diff --git a/src/hume/expression_measurement/stream/stream/socket_client.py b/src/hume/expression_measurement/stream/stream/socket_client.py -index fcd83929..85935e4e 100644 ---- a/src/hume/expression_measurement/stream/stream/socket_client.py -+++ b/src/hume/expression_measurement/stream/stream/socket_client.py -@@ -1,18 +1,13 @@ - # This file was auto-generated by Fern from our API Definition. - --import base64 - import json - import typing - from json.decoder import JSONDecodeError --from pathlib import Path - - import websockets - import websockets.sync.connection as websockets_sync_connection -- --from ....core.api_error import ApiError - from ....core.events import EventEmitterMixin, EventType - from ....core.pydantic_utilities import parse_obj_as --from .types.config import Config - from .types.stream_models_endpoint_payload import StreamModelsEndpointPayload - from .types.subscribe_event import SubscribeEvent - -@@ -83,74 +78,6 @@ class AsyncStreamSocketClient(EventEmitterMixin): - """ - await self._send(data.dict()) - -- async def send_facemesh( -- self, -- landmarks: typing.List[typing.List[typing.List[float]]], -- config: typing.Optional[Config] = None, -- payload_id: typing.Optional[str] = None, -- ) -> StreamSocketClientResponse: -- landmarks_str = json.dumps(landmarks) -- payload = { -- "data": landmarks_str, -- "models": config.dict() if config else None, -- "raw_text": False, -- "payload_id": payload_id, -- } -- payload = {k: v for k, v in payload.items() if v is not None} -- await self._websocket.send(json.dumps(payload)) -- return await self.recv() -- -- async def send_text( -- self, -- text: str, -- config: typing.Optional[Config] = None, -- payload_id: typing.Optional[str] = None, -- ) -> StreamSocketClientResponse: -- payload = { -- "data": text, -- "models": config.dict() if config else None, -- "raw_text": True, -- "payload_id": payload_id, -- } -- payload = {k: v for k, v in payload.items() if v is not None} -- await self._websocket.send(json.dumps(payload)) -- return await self.recv() -- -- async def send_file( -- self, -- file_: typing.Union[str, Path], -- config: typing.Optional[Config] = None, -- payload_id: typing.Optional[str] = None, -- ) -> StreamSocketClientResponse: -- try: -- with open(file_, "rb") as f: -- bytes_data = base64.b64encode(f.read()).decode() -- except: -- if isinstance(file_, Path): -- raise ApiError(body=f"Failed to open file: {file_}") -- # If you cannot open the file, assume you were passed a b64 string, not a file path -- bytes_data = str(file_) -- -- payload = { -- "data": bytes_data, -- "models": config.dict() if config else None, -- "raw_text": False, -- "payload_id": payload_id, -- } -- payload = {k: v for k, v in payload.items() if v is not None} -- await self._websocket.send(json.dumps(payload)) -- return await self.recv() -- -- async def get_job_details(self) -> StreamSocketClientResponse: -- payload = {"job_details": True} -- await self._websocket.send(json.dumps(payload)) -- return await self.recv() -- -- async def reset(self) -> 
StreamSocketClientResponse: -- payload = {"reset_stream": True} -- await self._websocket.send(json.dumps(payload)) -- return await self.recv() -- - - class StreamSocketClient(EventEmitterMixin): - def __init__(self, *, websocket: websockets_sync_connection.Connection): -@@ -210,71 +137,3 @@ class StreamSocketClient(EventEmitterMixin): - Send a Pydantic model to the websocket connection. - """ - self._send(data.dict()) -- -- def send_facemesh( -- self, -- landmarks: typing.List[typing.List[typing.List[float]]], -- config: typing.Optional[Config] = None, -- payload_id: typing.Optional[str] = None, -- ) -> StreamSocketClientResponse: -- landmarks_str = json.dumps(landmarks) -- payload = { -- "data": landmarks_str, -- "models": config.dict() if config else None, -- "raw_text": False, -- "payload_id": payload_id, -- } -- payload = {k: v for k, v in payload.items() if v is not None} -- self._websocket.send(json.dumps(payload)) -- return self.recv() -- -- def send_text( -- self, -- text: str, -- config: typing.Optional[Config] = None, -- payload_id: typing.Optional[str] = None, -- ) -> StreamSocketClientResponse: -- payload = { -- "data": text, -- "models": config.dict() if config else None, -- "raw_text": True, -- "payload_id": payload_id, -- } -- payload = {k: v for k, v in payload.items() if v is not None} -- self._websocket.send(json.dumps(payload)) -- return self.recv() -- -- def send_file( -- self, -- file_: typing.Union[str, Path], -- config: typing.Optional[Config] = None, -- payload_id: typing.Optional[str] = None, -- ) -> StreamSocketClientResponse: -- try: -- with open(file_, "rb") as f: -- bytes_data = base64.b64encode(f.read()).decode() -- except: -- if isinstance(file_, Path): -- raise ApiError(body=f"Failed to open file: {file_}") -- # If you cannot open the file, assume you were passed a b64 string, not a file path -- bytes_data = str(file_) -- -- payload = { -- "data": bytes_data, -- "models": config.dict() if config else None, -- "raw_text": False, -- "payload_id": payload_id, -- } -- payload = {k: v for k, v in payload.items() if v is not None} -- self._websocket.send(json.dumps(payload)) -- return self.recv() -- -- def get_job_details(self) -> StreamSocketClientResponse: -- payload = {"job_details": True} -- self._websocket.send(json.dumps(payload)) -- return self.recv() -- -- def reset(self) -> StreamSocketClientResponse: -- payload = {"reset_stream": True} -- self._websocket.send(json.dumps(payload)) -- return self.recv() diff --git a/src/hume/tts/client.py.diff b/src/hume/tts/client.py.diff deleted file mode 100644 index ea61a075..00000000 --- a/src/hume/tts/client.py.diff +++ /dev/null @@ -1,438 +0,0 @@ -diff --git a/src/hume/tts/client.py b/src/hume/tts/client.py -index 5f6c2c70..7c5e8597 100644 ---- a/src/hume/tts/client.py -+++ b/src/hume/tts/client.py -@@ -4,8 +4,7 @@ from __future__ import annotations - - import typing - --from hume.tts.stream_input.client import StreamInputClient -- -+from .. 
import core - from ..core.client_wrapper import AsyncClientWrapper, SyncClientWrapper - from ..core.request_options import RequestOptions - from .raw_client import AsyncRawTtsClient, RawTtsClient -@@ -13,13 +12,13 @@ from .types.format import Format - from .types.octave_version import OctaveVersion - from .types.posted_context import PostedContext - from .types.posted_utterance import PostedUtterance -+from .types.posted_utterance_voice import PostedUtteranceVoice - from .types.return_tts import ReturnTts - from .types.timestamp_type import TimestampType - from .types.tts_output import TtsOutput - - if typing.TYPE_CHECKING: - from .voices.client import AsyncVoicesClient, VoicesClient -- from .stream_input.client import AsyncStreamInputClient, StreamInputClient - # this is used as the default value for optional parameters - OMIT = typing.cast(typing.Any, ...) - -@@ -29,7 +28,6 @@ class TtsClient: - self._raw_client = RawTtsClient(client_wrapper=client_wrapper) - self._client_wrapper = client_wrapper - self._voices: typing.Optional[VoicesClient] = None -- self._stream_input: typing.Optional[StreamInputClient] = None - - @property - def with_raw_response(self) -> RawTtsClient: -@@ -75,10 +73,12 @@ class TtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -187,10 +187,12 @@ class TtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -292,10 +294,12 @@ class TtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. 
Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -397,10 +401,12 @@ class TtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -471,6 +477,117 @@ class TtsClient: - ) as r: - yield from r.data - -+ def convert_voice_file( -+ self, -+ *, -+ audio: core.File, -+ strip_headers: typing.Optional[bool] = OMIT, -+ context: typing.Optional[PostedContext] = OMIT, -+ voice: typing.Optional[PostedUtteranceVoice] = OMIT, -+ format: typing.Optional[Format] = OMIT, -+ include_timestamp_types: typing.Optional[typing.List[TimestampType]] = OMIT, -+ request_options: typing.Optional[RequestOptions] = None, -+ ) -> typing.Iterator[bytes]: -+ """ -+ Parameters -+ ---------- -+ audio : core.File -+ See core.File for more documentation -+ -+ strip_headers : typing.Optional[bool] -+ If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). -+ -+ context : typing.Optional[PostedContext] -+ Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. -+ -+ voice : typing.Optional[PostedUtteranceVoice] -+ -+ format : typing.Optional[Format] -+ Specifies the output audio file format. -+ -+ include_timestamp_types : typing.Optional[typing.List[TimestampType]] -+ The set of timestamp types to include in the response. -+ -+ request_options : typing.Optional[RequestOptions] -+ Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response. 
-+ -+ Returns -+ ------- -+ typing.Iterator[bytes] -+ Successful Response -+ """ -+ with self._raw_client.convert_voice_file( -+ audio=audio, -+ strip_headers=strip_headers, -+ context=context, -+ voice=voice, -+ format=format, -+ include_timestamp_types=include_timestamp_types, -+ request_options=request_options, -+ ) as r: -+ yield from r.data -+ -+ def convert_voice_json( -+ self, -+ *, -+ strip_headers: typing.Optional[bool] = OMIT, -+ audio: typing.Optional[core.File] = OMIT, -+ context: typing.Optional[PostedContext] = OMIT, -+ voice: typing.Optional[PostedUtteranceVoice] = OMIT, -+ format: typing.Optional[Format] = OMIT, -+ include_timestamp_types: typing.Optional[typing.List[TimestampType]] = OMIT, -+ request_options: typing.Optional[RequestOptions] = None, -+ ) -> typing.Iterator[TtsOutput]: -+ """ -+ Parameters -+ ---------- -+ strip_headers : typing.Optional[bool] -+ If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). -+ -+ audio : typing.Optional[core.File] -+ See core.File for more documentation -+ -+ context : typing.Optional[PostedContext] -+ Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. -+ -+ voice : typing.Optional[PostedUtteranceVoice] -+ -+ format : typing.Optional[Format] -+ Specifies the output audio file format. -+ -+ include_timestamp_types : typing.Optional[typing.List[TimestampType]] -+ The set of timestamp types to include in the response. -+ -+ request_options : typing.Optional[RequestOptions] -+ Request-specific configuration. -+ -+ Yields -+ ------ -+ typing.Iterator[TtsOutput] -+ Successful Response -+ -+ Examples -+ -------- -+ from hume import HumeClient -+ -+ client = HumeClient( -+ api_key="YOUR_API_KEY", -+ ) -+ response = client.tts.convert_voice_json() -+ for chunk in response: -+ yield chunk -+ """ -+ with self._raw_client.convert_voice_json( -+ strip_headers=strip_headers, -+ audio=audio, -+ context=context, -+ voice=voice, -+ format=format, -+ include_timestamp_types=include_timestamp_types, -+ request_options=request_options, -+ ) as r: -+ yield from r.data -+ - @property - def voices(self): - if self._voices is None: -@@ -479,20 +596,12 @@ class TtsClient: - self._voices = VoicesClient(client_wrapper=self._client_wrapper) - return self._voices - -- @property -- def stream_input(self): -- if self._stream_input is None: -- from .stream_input.client import StreamInputClient # noqa: E402 -- self._stream_input = StreamInputClient(client_wrapper=self._client_wrapper) -- return self._stream_input -- - - class AsyncTtsClient: - def __init__(self, *, client_wrapper: AsyncClientWrapper): - self._raw_client = AsyncRawTtsClient(client_wrapper=client_wrapper) - self._client_wrapper = client_wrapper - self._voices: typing.Optional[AsyncVoicesClient] = None -- self._stream_input: typing.Optional[AsyncStreamInputClient] = None - - @property - def with_raw_response(self) -> AsyncRawTtsClient: -@@ -538,10 +647,12 @@ class AsyncTtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. 
- - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -658,10 +769,12 @@ class AsyncTtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -772,10 +885,12 @@ class AsyncTtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. -@@ -886,10 +1001,12 @@ class AsyncTtsClient: - Specifies the output audio file format. - - include_timestamp_types : typing.Optional[typing.Sequence[TimestampType]] -- The set of timestamp types to include in the response. -+ The set of timestamp types to include in the response. Only supported for Octave 2 requests. - - num_generations : typing.Optional[int] -- Number of generations of the audio to produce. -+ Number of audio generations to produce from the input utterances. -+ -+ Using `num_generations` enables faster processing than issuing multiple sequential requests. Additionally, specifying `num_generations` allows prosody continuation across all generations without repeating context, ensuring each generation sounds slightly different while maintaining contextual consistency. - - split_utterances : typing.Optional[bool] - Controls how audio output is segmented in the response. 
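For orientation, a minimal usage sketch of the request shape these `num_generations` and `include_timestamp_types` docstrings describe. This is not taken from the diff itself: the `synthesize_json` method name, the `utterances` parameter, and the `generations` list (and its fields) on the response are assumed from the published hume-python-sdk surface; only `num_generations`, `include_timestamp_types`, and the Octave 2 restriction come from the hunks above.

    # A minimal sketch, assuming the synthesize_json method name, the utterances
    # parameter, and the response's `generations` list from the published SDK
    # surface; these are not shown in the hunks above.
    from hume import HumeClient
    from hume.tts import PostedUtterance

    client = HumeClient(api_key="YOUR_API_KEY")

    # One request with num_generations > 1 produces several takes while keeping
    # prosody continuity, instead of issuing multiple sequential requests.
    result = client.tts.synthesize_json(
        utterances=[PostedUtterance(text="Hello from Octave.")],
        num_generations=2,
        include_timestamp_types=["word"],  # Octave 2 requests only, per the updated docstring
    )
    for generation in result.generations:
        print(generation.generation_id, generation.duration)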
-@@ -969,6 +1086,127 @@ class AsyncTtsClient: - async for _chunk in r.data: - yield _chunk - -+ async def convert_voice_file( -+ self, -+ *, -+ audio: core.File, -+ strip_headers: typing.Optional[bool] = OMIT, -+ context: typing.Optional[PostedContext] = OMIT, -+ voice: typing.Optional[PostedUtteranceVoice] = OMIT, -+ format: typing.Optional[Format] = OMIT, -+ include_timestamp_types: typing.Optional[typing.List[TimestampType]] = OMIT, -+ request_options: typing.Optional[RequestOptions] = None, -+ ) -> typing.AsyncIterator[bytes]: -+ """ -+ Parameters -+ ---------- -+ audio : core.File -+ See core.File for more documentation -+ -+ strip_headers : typing.Optional[bool] -+ If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). -+ -+ context : typing.Optional[PostedContext] -+ Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. -+ -+ voice : typing.Optional[PostedUtteranceVoice] -+ -+ format : typing.Optional[Format] -+ Specifies the output audio file format. -+ -+ include_timestamp_types : typing.Optional[typing.List[TimestampType]] -+ The set of timestamp types to include in the response. -+ -+ request_options : typing.Optional[RequestOptions] -+ Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response. -+ -+ Returns -+ ------- -+ typing.AsyncIterator[bytes] -+ Successful Response -+ """ -+ async with self._raw_client.convert_voice_file( -+ audio=audio, -+ strip_headers=strip_headers, -+ context=context, -+ voice=voice, -+ format=format, -+ include_timestamp_types=include_timestamp_types, -+ request_options=request_options, -+ ) as r: -+ async for _chunk in r.data: -+ yield _chunk -+ -+ async def convert_voice_json( -+ self, -+ *, -+ strip_headers: typing.Optional[bool] = OMIT, -+ audio: typing.Optional[core.File] = OMIT, -+ context: typing.Optional[PostedContext] = OMIT, -+ voice: typing.Optional[PostedUtteranceVoice] = OMIT, -+ format: typing.Optional[Format] = OMIT, -+ include_timestamp_types: typing.Optional[typing.List[TimestampType]] = OMIT, -+ request_options: typing.Optional[RequestOptions] = None, -+ ) -> typing.AsyncIterator[TtsOutput]: -+ """ -+ Parameters -+ ---------- -+ strip_headers : typing.Optional[bool] -+ If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk's audio will be its own audio file, each with its own headers (if applicable). -+ -+ audio : typing.Optional[core.File] -+ See core.File for more documentation -+ -+ context : typing.Optional[PostedContext] -+ Utterances to use as context for generating consistent speech style and prosody across multiple requests. These will not be converted to speech output. -+ -+ voice : typing.Optional[PostedUtteranceVoice] -+ -+ format : typing.Optional[Format] -+ Specifies the output audio file format. -+ -+ include_timestamp_types : typing.Optional[typing.List[TimestampType]] -+ The set of timestamp types to include in the response. -+ -+ request_options : typing.Optional[RequestOptions] -+ Request-specific configuration. 
-+ -+ Yields -+ ------ -+ typing.AsyncIterator[TtsOutput] -+ Successful Response -+ -+ Examples -+ -------- -+ import asyncio -+ -+ from hume import AsyncHumeClient -+ -+ client = AsyncHumeClient( -+ api_key="YOUR_API_KEY", -+ ) -+ -+ -+ async def main() -> None: -+ response = await client.tts.convert_voice_json() -+ async for chunk in response: -+ yield chunk -+ -+ -+ asyncio.run(main()) -+ """ -+ async with self._raw_client.convert_voice_json( -+ strip_headers=strip_headers, -+ audio=audio, -+ context=context, -+ voice=voice, -+ format=format, -+ include_timestamp_types=include_timestamp_types, -+ request_options=request_options, -+ ) as r: -+ async for _chunk in r.data: -+ yield _chunk -+ - @property - def voices(self): - if self._voices is None: -@@ -976,13 +1214,3 @@ class AsyncTtsClient: - - self._voices = AsyncVoicesClient(client_wrapper=self._client_wrapper) - return self._voices -- -- @property -- def stream_input(self): -- if self._stream_input is None: -- from .stream_input.client import AsyncStreamInputClient -- -- self._stream_input = AsyncStreamInputClient( -- client_wrapper=self._client_wrapper, -- ) -- return self._stream_input diff --git a/src/hume/tts/raw_client.py b/src/hume/tts/raw_client.py index 089211c6..5ad7fd4c 100644 --- a/src/hume/tts/raw_client.py +++ b/src/hume/tts/raw_client.py @@ -581,7 +581,7 @@ def convert_voice_file( Specifies the output audio file format. include_timestamp_types : typing.Optional[typing.List[TimestampType]] - The set of timestamp types to include in the response. + The set of timestamp types to include in the response. When used in multipart/form-data, specify each value using bracket notation: `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. Only supported for Octave 2 requests. request_options : typing.Optional[RequestOptions] Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response. @@ -668,7 +668,7 @@ def convert_voice_json( Specifies the output audio file format. include_timestamp_types : typing.Optional[typing.List[TimestampType]] - The set of timestamp types to include in the response. + The set of timestamp types to include in the response. When used in multipart/form-data, specify each value using bracket notation: `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. Only supported for Octave 2 requests. request_options : typing.Optional[RequestOptions] Request-specific configuration. @@ -1296,7 +1296,7 @@ async def convert_voice_file( Specifies the output audio file format. include_timestamp_types : typing.Optional[typing.List[TimestampType]] - The set of timestamp types to include in the response. + The set of timestamp types to include in the response. When used in multipart/form-data, specify each value using bracket notation: `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. Only supported for Octave 2 requests. request_options : typing.Optional[RequestOptions] Request-specific configuration. You can pass in configuration such as `chunk_size`, and more to customize the request and response. @@ -1384,7 +1384,7 @@ async def convert_voice_json( Specifies the output audio file format. include_timestamp_types : typing.Optional[typing.List[TimestampType]] - The set of timestamp types to include in the response. + The set of timestamp types to include in the response. 
When used in multipart/form-data, specify each value using bracket notation: `include_timestamp_types[0]=word&include_timestamp_types[1]=phoneme`. Only supported for Octave 2 requests. request_options : typing.Optional[RequestOptions] Request-specific configuration.
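A minimal end-to-end sketch of the new voice-conversion methods described above. The parameter names, the `Iterator[bytes]` return type, `strip_headers`, and the `word`/`phoneme` timestamp values come from the hunks; passing an open binary file handle for the `audio` parameter (`core.File`) and writing the yielded chunks to disk are assumptions made for illustration.

    # A minimal sketch, assuming an open binary file handle is acceptable for the
    # `audio` (core.File) parameter and that the chunks should be written to disk;
    # neither detail is specified in the diff.
    from hume import HumeClient

    client = HumeClient(api_key="YOUR_API_KEY")

    with open("input.wav", "rb") as source, open("converted.wav", "wb") as out:
        for chunk in client.tts.convert_voice_file(
            audio=source,
            strip_headers=True,
            include_timestamp_types=["word", "phoneme"],
        ):
            out.write(chunk)

The `convert_voice_json` variant is iterated the same way, except that each chunk is a `TtsOutput` model rather than raw bytes.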