style_bert_vits2_jp_extra_cool_original: open-source Japanese child-voice TTS model

Style Bert Vits2 Jp Extra Cool Original

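Developed by RikkaBotan

A Japanese child-voice text-to-speech model trained on the style_bert_vits2_jp_extra framework; free for commercial use

Speech synthesis

Transformers

Supports multiple languages. Tags: #childlike voice, #Japanese TTS, #free for commercial use

Downloads: 15

Released: 4/25/2024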

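Model overview

A child-voice speech generation model optimized for Japanese. It is particularly suited to reading explanatory text aloud and produces natural-sounding speech.

Model features

Childlike voice

Generates a natural, soothing child-like voice

Commercial license

Free to use in both commercial and non-commercial settings

Multiple variants

Variants in other styles are available, including sweet, English, and ASMR versions

High naturalness

Built on the style_bert_vits2_jp_extra framework, which produces more natural speech than earlier models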

Model capabilities

Japanese text-to-speech

Child-voice generation

Speech style control

Long-text reading

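Use cases

Content creation

Video narration

Add child-voice narration to videos aimed at children

Makes the content more approachable

Audiobooks

Generate read-aloud audio of children's stories

Improves the experience for young listeners

Virtual streamers

VRM character voicing

Provide real-time speech for virtual streamer characters

A companion VRM model licensed for commercial use is available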

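🚀 Style-Bert-VITS2 JP Extra model (Cool version)

This project is a text-to-speech (TTS) model, trained from the style_bert_vits2_jp_extra model on specific voice data. It generates a childlike yet calm voice and is free for both commercial and non-commercial use.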

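🚀 Quick start

You can use the model in either of two ways:

Method 1: generate speech with the Style-Bert-VITS2 app

Place the three files (config.json, the .safetensors file, and style_vectors.npy) in the Style-Bert-VITS2/model_assets/rikka_botan/ folder. The following Colab program saves them there automatically: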

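# Mount Google Drive so the model files persist across Colab sessions
from google.colab import drive
drive.mount("/content/drive")

# Create the target folder; -p avoids an error if it already exists
!mkdir -p /content/drive/MyDrive/Style-Bert-VITS2/model_assets/rikka_botan/
%cd /content/drive/MyDrive/Style-Bert-VITS2/model_assets/

from huggingface_hub import snapshot_download

model_name = "RikkaBotan/style_bert_vits2_jp_extra_cool_original"
# Download every file in the repository into rikka_botan/
download_path = snapshot_download(
    repo_id=model_name,
    local_dir="rikka_botan/",
    # Store real files rather than symlinks (newer huggingface_hub versions may ignore this flag)
    local_dir_use_symlinks=False,
)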
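Before launching the app, it can help to confirm that the three files named above actually landed in the model folder. The following is a minimal sketch and not part of the original instructions: the helper name `missing_model_files` is hypothetical, and it assumes only that the folder must hold `config.json`, `style_vectors.npy`, and at least one `.safetensors` file.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return the required items that are absent from model_dir.

    Hypothetical helper: the required set mirrors the three files named in
    the instructions (config.json, a .safetensors file, style_vectors.npy).
    """
    model_dir = Path(model_dir)
    missing = []
    for name in ("config.json", "style_vectors.npy"):
        if not (model_dir / name).is_file():
            missing.append(name)
    # The weights file name may vary, so accept any .safetensors file.
    if not any(model_dir.glob("*.safetensors")):
        missing.append("*.safetensors")
    return missing
```

For example, `missing_model_files("/content/drive/MyDrive/Style-Bert-VITS2/model_assets/rikka_botan")` returns an empty list once all three files are in place.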

Then run the following program:

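# Work from /content so the clone does not land inside the Drive folder
%cd /content
# Clone Style-Bert-VITS2 and install its dependencies
!git clone https://github.com/litagin02/Style-Bert-VITS2.git
%cd Style-Bert-VITS2/
!pip install -r requirements.txt
!python initialize.py --skip_jvnv

import yaml
from google.colab import drive
drive.mount("/content/drive")

# Point the app at the dataset and model folders on Google Drive
dataset_root = "/content/drive/MyDrive/Style-Bert-VITS2/Data"
assets_root = "/content/drive/MyDrive/Style-Bert-VITS2/model_assets"
with open("configs/paths.yml", "w", encoding="utf-8") as f:
    yaml.dump({"dataset_root": dataset_root, "assets_root": assets_root}, f)

# Launch the web app; --share makes it reachable through a public URL
!python app.py --share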

Open the public URL shown in the output.

Method 2: use the following code

# First, install the required libraries
!git clone https://github.com/litagin02/Style-Bert-VITS2.git
%cd Style-Bert-VITS2/
!pip install -r requirements.txt
!pip install style-bert-vits2 --no-build-isolation  # avoids build errors

# Load the Japanese BERT model and tokenizer
from style_bert_vits2.nlp import bert_models
from style_bert_vits2.constants import Languages

bert_models.load_model(Languages.JP, "ku-nlp/deberta-v2-large-japanese-char-wwm")
bert_models.load_tokenizer(Languages.JP, "ku-nlp/deberta-v2-large-japanese-char-wwm")

# Download the model files into the model_assets directory
from pathlib import Path
from huggingface_hub import hf_hub_download

model_file = "rikka_botan_cool.safetensors"
config_file = "config.json"
style_file = "style_vectors.npy"

for file in [model_file, config_file, style_file]:
    print(file)
    hf_hub_download(
        "RikkaBotan/style_bert_vits2_jp_extra_cool_original",
        file,
        local_dir="model_assets"
    )

# Run a text-to-speech demo with the downloaded model
from style_bert_vits2.tts_model import TTSModel

assets_root = Path("model_assets")

model = TTSModel(
    model_path=assets_root / model_file,
    config_path=assets_root / config_file,
    style_vec_path=assets_root / style_file,
    device="cuda"  # use "cpu" if CUDA is unavailable
)

# Enter Japanese text here
from IPython.display import Audio, display

sr, audio = model.infer(text="ここに文章を入力してください")
display(Audio(audio, rate=sr))
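
The demo above only plays the audio inside the notebook. To keep the result, one option is to write it to a WAV file. The sketch below uses only the Python standard library; the helper name `save_wav` is hypothetical, and it assumes the synthesized audio is mono and can be supplied as integers in the 16-bit range (for a NumPy int16 array, `audio.tolist()` would do). If the audio is floating-point instead, it would need scaling first.

```python
import sys
import wave
from array import array


def save_wav(path, sample_rate, samples):
    """Write mono 16-bit PCM samples to a WAV file using only the stdlib.

    Assumption: `samples` is an iterable of integers in the int16 range,
    e.g. `audio.tolist()` if `audio` is a NumPy int16 array.
    """
    # Clamp defensively so out-of-range values cannot raise OverflowError.
    pcm = array("h", (max(-32768, min(32767, int(s))) for s in samples))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(pcm.tobytes())
```

With the variables from the demo, a call might look like `save_wav("output.wav", sr, audio.tolist())`.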

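✨ Key features

Generates a childlike yet calm voice, suited to reading explanatory text aloud.
Trained from the style_bert_vits2_jp_extra model, giving high accuracy and naturalness in Japanese speech generation.
Usable in a range of scenarios, both commercial and non-commercial.
Available in several style variants (sweet, English, ASMR, and Chinese versions) to cover different needs.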
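
For long explanatory passages, one possible approach is to split the text into sentences and synthesize them one at a time. The source does not say how long text should be handled, so the following is only an illustrative sketch: the helper name `split_sentences` is hypothetical, and the choice of sentence-ending characters is an assumption.

```python
import re

# Split after Japanese or ASCII sentence-ending punctuation, but not
# between consecutive marks such as "！？".
_SENTENCE_END = re.compile(r"(?<=[。！？!?])(?![。！？!?])")


def split_sentences(text):
    """Split text into sentences, keeping the closing punctuation."""
    pieces = []
    for line in text.splitlines():
        for part in _SENTENCE_END.split(line):
            part = part.strip()
            if part:  # drop empty fragments and blank lines
                pieces.append(part)
    return pieces
```

Each returned sentence could then be passed to `model.infer(text=...)` as in the Quick start code, and the resulting audio segments joined afterwards.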

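📦 Installation guide

Before using the model, follow the steps in the Quick start section above: clone the repository, install the dependencies, and download the model files.

💻 Usage example

Basic usage

The Quick start section above gives a basic code example for text-to-speech with this model; follow the steps in that code.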

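📚 Detailed documentation

Model description

This is a TTS (text-to-speech) model obtained by training style_bert_vits2_jp_extra on original voice data. style_bert_vits2_jp_extra is a speech generation model specialized for Japanese that produces more accurate and natural speech than earlier models. Because the training data consists solely of the voice of the researcher who created this model, it carries the same license as style_bert_vits2_jp_extra and is free for both commercial and non-commercial use.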