Open-LLM-VTuber
diff --git a/blog/2025-02-04-v1-0-1-release.md b/blog/2025-02-04-v1-0-1-release.md
diff --git a/blog/2025-02-20-v1-1-0-release.md b/blog/2025-02-20-v1-1-0-release.md
@@ -0,0 +1,142 @@
+---
+title: 1.0.1 Release
+description: Version 1.0.1 Release
+slug: v1.0.1-release
+authors: [tim, ethan]
+tags: [release]
+image: https://i.imgur.com/mErPwqL.png
+hide_table_of_contents: false
+---
+
+
+# Open-LLM-VTuber v1.0.1 Release 💥
+
+This release marks a significant milestone for Open-LLM-VTuber: a complete rewrite of the backend and frontend across more than 240 new commits, along with numerous enhancements and new features. If you were using an earlier version, `v1.0.0` is essentially a new app.
+
+⚠️ Direct upgrades from older versions are impossible due to architectural changes. Please refer to our **[new documentation site](https://open-llm-vtuber.github.io/docs/intro)** for installation.
+
+(v1.0.0 turned out to have a bug right after release, so let's skip it and go straight to v1.0.1.)
+
+| ![i4_pet_desktop](https://github.com/user-attachments/assets/06eff9dc-e141-4401-90ac-823b08662aae) | ![i1](https://github.com/user-attachments/assets/e0175aa3-62c8-4cde-9c6f-5d010727c04f) |
+|:---:|:---:|
+| ![i3](https://github.com/user-attachments/assets/082d8f29-9b48-4dbb-87f6-0f12d89a92f2) | ![i2](https://github.com/user-attachments/assets/f6b50eda-8187-4d37-b39b-a34e33683328) |
+| ![i4](https://github.com/user-attachments/assets/fa4a5884-0ec7-4377-8a3b-204aafaf8ede) | ![i3_browser_world_fun](https://github.com/user-attachments/assets/8e0819d2-75dd-4ebf-97ab-399bf2d01795) |
+
+<!-- truncate -->
+
+## ✨ Highlights
+*   **Vision Capability:** Video chat with the AI.
+*   **Desktop Pet Mode:** A new Desktop Pet Mode lets you have your VTuber companion directly on your desktop.
+*   **Brand New Frontend:** A completely redesigned frontend built with React, Chakra UI, and Vite offers a modern user experience. It is available as web and desktop apps in the [Open-LLM-VTuber-Web](https://github.com/Open-LLM-VTuber/Open-LLM-VTuber-Web) repository.
+*   **Chat History Management:**  Implemented a system to store and retrieve conversation history, enabling persistent interactions with your AI.
+*   **New LLM support:**  Many new (stateless) LLM providers are now supported (and refactored), including Ollama, OpenAI, Gemini, Claude, Mistral, DeepSeek, Zhipu, and llama.cpp.
+*   **DeepSeek R1 Reasoning model support**: The reasoning chain will be displayed but not spoken. See your waifu's inner thoughts!
+*   **Major Backend Rewrite:** The core of Open-LLM-VTuber has been rebuilt from the ground up, focusing on asynchronous operations, improved memory management, and a more modular architecture.
+*   **Refactored Configuration:** The `conf.yaml` file was restructured, and `config_alts` has been renamed to `characters`.
+*   **TTS Preprocessor:** Text inside `asterisks`, `brackets`, `parentheses`, and `angle brackets` will no longer be spoken by the TTS.
+*   **Dependency management:** Switched to `uv` for dependency management and removed unused dependencies such as `rich`, `playsound3`, and `sounddevice`.
+*   **Documentation Site:** A comprehensive documentation site is now live at [https://open-llm-vtuber.github.io/](https://open-llm-vtuber.github.io/).
+
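The TTS Preprocessor highlight above can be illustrated with a minimal sketch. This is not the project's implementation: the function name `strip_unspoken` and the regular expressions are hypothetical, and the sketch assumes delimiters are not nested. It only shows the general idea of dropping delimited spans before text reaches the speech engine.

```python
import re

# Hypothetical patterns, one per delimiter pair named in the release notes.
# Each pattern matches an opening delimiter, any run of characters that is
# not the closing delimiter, and then the closing delimiter (no nesting).
_UNSPOKEN_PATTERNS = [
    re.compile(r"\*[^*]*\*"),   # *asterisks*
    re.compile(r"\[[^\]]*\]"),  # [brackets]
    re.compile(r"\([^)]*\)"),   # (parentheses)
    re.compile(r"<[^>]*>"),     # <angle brackets>
]


def strip_unspoken(text: str) -> str:
    """Remove delimited spans from `text` and collapse leftover whitespace."""
    for pattern in _UNSPOKEN_PATTERNS:
        # Replace each span with a space so neighbouring words stay separated.
        text = pattern.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(strip_unspoken("Hello *waves* there [smiles] friend (softly) <sigh> bye"))
```

Note that the angle-bracket pattern removes only the tags themselves, so text between a pair such as `<a>` and `</a>` would still be spoken in this sketch.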
+## 📋 Detailed Changes
+
+### 🧮 Backend
+
+*   **Architecture:**
+    *   The project structure has been reorganized to use the `src/` directory.
+    *   The backend is now fully asynchronous, improving responsiveness.
+    *   CLI mode (`main.py`) has been removed.
+    *   The "exit word" has been removed.
+    *   Models are initialized and managed using `ServiceContext`, offering better memory management, particularly when switching characters.
+    *   Refactored LLMs into `agent` and `stateless_llm`, supporting a wider range of LLMs with a new agent interface: `basic_memory_agent` and `hume_ai_agent`.
+*   **LLM (Language Model) Enhancements:**
+    *   New (and old but refactored) providers: Ollama, OpenAI (and any OpenAI Compatible API), Gemini, Claude, Mistral, DeepSeek, Zhipu, llama.cpp.
+    *   `temperature` parameter added.
+    *   No more tokens will be generated after interruption, improving the responsiveness of voice interruption.
+    *   Ollama models are preloaded at startup, kept in memory for the server's duration, and unloaded at exit.
+    *   Added a `hf_mirror` flag to specify whether to use the Hugging Face mirror source.
+*   **TTS (Text-to-Speech) Enhancements:**
+    *   TTS now generates multiple audio segments concurrently and sends them sequentially, reducing latency.
+    *   New interruption logic for smoother transitions.
+    *   Added filters (`asterisks`, `brackets`, `parentheses`) to prevent unwanted text from being spoken.
+    *   Implemented `faster_first_response` feature to prioritize the synthesis and playback of the first sentence fragment, minimizing latency.
+*   **ASR (Automatic Speech Recognition) Enhancements:**
+    *   Made Sherpa-onnx ASR with the **SenseVoiceSmall int8** model the default for both English and Chinese presets, with automatic model download.
+    *   Added a `provider` option for sherpa-onnx-asr.
+*   **Other Improvements:**
+    *   Chat log persistence is used to maintain conversation history.
+    *   All `print` statements are replaced with `loguru` for structured logging.
+    *   Added a Chinese configuration preset: `conf.CN.yaml`.
+    *   Basic AI proactive speaking (experimental).
+    *   Added checks to the CI/CD process.
+    *   Added an input/output type system to agents.
+    *   Added **Tencent Translate** in https://github.com/Open-LLM-VTuber/Open-LLM-VTuber/pull/107
+
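The TTS change listed above (segments generated concurrently, then sent in order) can be sketched with `asyncio`. This is an illustration under stated assumptions, not the project's code: `fake_synthesize` is a hypothetical stand-in for a real TTS call, and `speak_in_order` is a made-up name.

```python
import asyncio


async def fake_synthesize(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS call; longer sentences take longer."""
    await asyncio.sleep(0.01 * len(sentence))
    return sentence.encode("utf-8")


async def speak_in_order(sentences: list[str]) -> list[bytes]:
    """Synthesize all sentences concurrently, deliver them in input order."""
    # Start every synthesis task immediately so the work overlaps in time.
    tasks = [asyncio.create_task(fake_synthesize(s)) for s in sentences]
    delivered = []
    # Await the tasks in their original order: a later segment that finishes
    # early simply waits until the earlier segments have been delivered.
    for task in tasks:
        delivered.append(await task)
    return delivered


if __name__ == "__main__":
    audio = asyncio.run(
        speak_in_order(["A long first sentence.", "Short.", "Mid one."])
    )
    print(audio)
```

In this sketch the total wait is roughly the duration of the slowest segment rather than the sum of all segments, while playback order still matches the order of the text.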
+### 🖥️ Frontend
+
+*   **New frontend built with Electron, React, Chakra UI, and Vite.**
+*   **Multi-Mode in Single Codebase:**
+    *   Web Mode: Browser interface
+    *   Window Mode: Desktop window
+    *   Pet Mode: Transparent desktop companion
+    *   Seamless context sharing between Window and Pet modes, allowing for the preservation of settings, history, connections, and model states.
+*   **Enhanced UI Features**
+    *    Responsive layout with collapsible sidebar and footer
+    *    Customizable Live2D model interactions: mouse tracking for eye movement, click-triggered animations, and drag and resize capabilities.
+    *    Persistent local storage for user preference settings, including background, VAD configuration, Live2D size and interactions, and agent behavior.
+    *    Supports viewing, loading, and deleting conversation history with streaming subtitles.
+    *    (Electron Pet Mode) A transparent, always-on-top desktop companion: non-interactive areas are click-through, the Live2D model and UI can be dragged and hidden, and a right-click menu provides controls.
+    *    Camera and screen capturing panel
+    *    Switch characters easily
+
+### 📖 Documentation
+
+*   Rewritten README file.
+*   New comprehensive documentation with a dedicated website.
+
+### 🧹 Cleanup
+
+*   Removed unused and legacy code, including `TaskQueue.py`, `scripts/install_piper_tts.py`, `model_manager_old.py`, `service_context_old.py`, `main.py`, `asr_with_vad`, `vad`, `start_cli`, `fake_llm`, `MemGPT`, the `pywhispercpp` submodule, and the CoreML script.
+*   Removed unused dependencies: `rich`, `playsound3`, `sounddevice`, among others.
+*   Removed configuration options that are no longer relevant: `VOICE_INPUT_ON`, `MIC_IN_BROWSER`, `LIVE2D`, `EXTRA_SYSTEM_PROMPT_RAG`, `AI_NAME`, `USER_NAME`, `SAVE_CHAT_HISTORY`, `CHAT_HISTORY_DIR`, `RAG_ON`, `LLMASSIST_RAG_ON`, `SAY_SENTENCE_SEPARATELY`, `MEMORY_SNAPSHOT`, `PRELOAD_MODELS`, `tts_on`.
+
+
+## ⚠️⚠️⚠️ Critical Upgrade Notice
+
+
+1. No Direct Upgrades - Previous installations are incompatible
+
+2. Fresh Install Required - Follow new documentation
+
+3. Config Changes - Back up existing configurations before migration
+
+### Why the Hassle? 💡
+
+1. UV dependency manager replaces legacy systems
+2. Complete configuration schema overhaul
+
+
+
+Please check out the [new documentation](https://open-llm-vtuber.github.io/docs/quick-start/) to install Open-LLM-VTuber again. Fortunately, thanks to `uv`, there should be fewer headaches during installation.
+
+
+## 🎉 Contributors
+- @t41372 (that's me)
+- @ylxmf2005, the creator of the new frontend, who also implemented LLM vision capability, chat history management, TTS concurrency, the Hume AI agent, better sentence division, a better Live2D configuration, countless bug fixes, and more. He also wrote the majority of the documentation and provided countless insights. Version `v1.0.0` was a close collaboration with him and wouldn't have existed without his tremendous contribution.
+- @Stewitch, who added the `hf_mirror` option and is currently working on a launcher for this project to streamline installation and configuration. It's still a work in progress but will be completed very soon. https://github.com/Stewitch/LiZhen
+- @Fluchw, who added the Tencent translator and helped us fix the translator bug.
+
+And all the other contributors who worked on this project in previous versions.
+
+
+**Full Changelog**: https://github.com/Open-LLM-VTuber/Open-LLM-VTuber/compare/v0.5.2...v1.0.0
+
+
+## (Relatively) faster download links for users in mainland China
+Open-LLM-VTuber-v1.0.3.zip (includes the SenseVoice model for sherpa-onnx ASR, so it no longer has to be pulled from GitHub)
+- https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.0.3/Open-LLM-VTuber-v1.0.3.zip
+
+open-llm-vtuber-electron-1.0.0-frontend.exe (desktop frontend, Windows)
+- https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.0.3/open-llm-vtuber-electron-1.0.0-setup.exe
+
+open-llm-vtuber-electron-1.0.0-frontend.dmg (desktop frontend, macOS)
+- https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.0.3/open-llm-vtuber-electron-1.0.0.dmg
@@ -0,0 +1,80 @@
+---
+title: 1.1.0 Release
+description: Version 1.1.0 Release
+slug: v1.1.0-release
+authors: [tim, ethan]
+tags: [release]
+image: https://i.imgur.com/mErPwqL.png
+hide_table_of_contents: false
+---
+
+
+## What's Changed
+
+### Major Features
+* Implemented group chat functionality (@ylxmf2005)
+* Added Silero-VAD voice activity detection (@AnyaCoder)
+* Added CosyVoice2 text-to-speech support (@Warma10032)
+* Added frontend ASR/TTS tools, accessible at the `/web-tool` path of the running server (for example `http://localhost:<port>/web-tool`)
+  - Users can now directly use the project's speech recognition and text-to-speech engines
+* Introduced one-click CUDA-ready setup using pixi (@mokurin000)
+* Improved configuration management and update mechanism:
+  - `conf.yaml` is no longer tracked in git
+  - New config template system for generating and updating `conf.yaml` during upgrades
+
+<!-- truncate -->
+
+### Bug Fixes & Improvements
+* Fixed sentence divider issues
+* Fixed system prompt override bug for certain LLMs
+* Removed deprecated `prompts/persona` directory (unused since v1.0.0)
+* Major codebase refactoring of conversation and handler components (@ylxmf2005)
+
+### New Contributors
+* @mokurin000
+* @AnyaCoder
+* @Warma10032
+
+**Full Changelog**: https://github.com/Open-LLM-VTuber/Open-LLM-VTuber/compare/v1.0.0...v1.1.0
+
+
+## Which files should I get?
+
+### For Existing Open-LLM-VTuber Users (v1.0.0 or newer)
+1. Run `uv run upgrade.py` to update to the latest version.
+2. Download the new Electron app from the releases section (below).
+
+### For New Users or Versions Below v1.0.0
+Please refer to the [new deployment documentation](https://docs.llmvtuber.com/docs/quick-start) for installation instructions.
+
+### Download Files
+If the documentation sent you here, download both of the following files:
+1. The Electron app
+2. The language-specific ZIP file:
+   - English: `Open-LLM-VTuber-v1.1.0-en.zip`
+   - Chinese: `Open-LLM-VTuber-v1.1.0-zh.zip`
+
+Note: The ZIP files are identical except for the language of the configuration file. Both packages include the SenseVoiceSmall model file to ensure accessibility for users in mainland China.
+
+
+## (Relatively) faster download links for users in mainland China
+Open-LLM-VTuber-v1.1.0-zh.zip (includes the SenseVoice model for sherpa-onnx ASR, so it no longer has to be pulled from GitHub)
+- [Open-LLM-VTuber-v1.1.0-en.zip](https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.1.0/Open-LLM-VTuber-v1.1.0-en.zip)
+- [Open-LLM-VTuber-v1.1.0-zh.zip](https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.1.0/Open-LLM-VTuber-v1.1.0-zh.zip)
+
+open-llm-vtuber-electron-1.1.0-frontend.exe (desktop frontend, Windows)
+- https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.1.0/open-llm-vtuber-electron-1.1.0-setup.exe
+
+open-llm-vtuber-electron-1.1.0-frontend.dmg (desktop frontend, macOS)
+- https://pub-17317087be374bc68161ac63de2022a5.r2.dev/v1.1.0/open-llm-vtuber-electron-1.1.0.dmg