Zurich-14B-GCv2-10k open-source AI model: fine-tuned on GammaCorpus, aiming to outperform models of similar size

Zurich 14B GCv2 10k

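Developed by rubenroy

A Qwen 2.5 model fine-tuned on GammaCorpus, designed to outperform other models of similar size

Large language model

Transformers

Language: English | License: Apache-2.0 | #multi-turn dialogue optimized #14B parameters #GammaCorpus fine-tuned

Downloads: 47

Released: 1/29/2025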

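Model overview

Zurich 14B GammaCorpus v2-10k is a fine-tune of Alibaba's Qwen 2.5 14B Instruct model, intended to showcase the potential of the GammaCorpus v2-10k dataset.

Model features

Efficient fine-tuning

Fine-tuned with the Unsloth framework on a single A100 GPU in about 10 minutes, for 60 epochs

Modern architecture

Transformer architecture with RoPE, SwiGLU, RMSNorm, and attention QKV bias

Multi-turn dialogue support

Trained on GammaCorpus and well suited to structured multi-turn conversations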

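Model capabilities

Text generation

Multi-turn dialogue

Question answering

Use cases

Dialogue systems

AI assistant

Can act as an assistant that handles user queries

Generates coherent, helpful replies

Question answering

Fact lookup

Answers questions about factual information

Aims to give accurate factual answers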

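🚀 Zurich 14B GammaCorpus v2-10k

A Qwen 2.5 model fine-tuned on the GammaCorpus dataset

Zurich 14B GammaCorpus v2-10k is a fine-tune of Alibaba's Qwen 2.5 14B Instruct model. It is designed to outperform other models of similar size while showcasing the strengths of the GammaCorpus v2-10k dataset.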

🚀 Quick start

Requirements

We strongly recommend using the latest version of the transformers package. You can install it with the following pip command:

pip install transformers
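As a rough sanity check after installing, the sketch below reads the installed transformers version and compares it with 4.37.0. That threshold is an assumption taken from the upstream Qwen2.5 model cards, which report a KeyError for 'qwen2' on older releases; it is not stated on this page. The helper names (version_tuple, meets_minimum, check_transformers) are made up for illustration.

```python
import re
from importlib import metadata

# Assumption: 4.37.0 is the minimum named in the upstream Qwen2.5 model cards.
ASSUMED_MINIMUM = "4.37.0"


def version_tuple(version: str) -> tuple:
    """Turn '4.40.0.dev0' into (4, 40, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '4.37'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = ASSUMED_MINIMUM) -> bool:
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_transformers(minimum: str = ASSUMED_MINIMUM) -> str:
    """Describe whether the locally installed transformers looks recent enough."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed; run: pip install transformers"
    if meets_minimum(installed, minimum):
        return f"transformers {installed} looks recent enough"
    return f"transformers {installed} is older than {minimum}; consider upgrading"


print(check_transformers())
```

If the check reports an older version, running pip install --upgrade transformers should bring it up to date.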

Quickstart

The following snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate a response:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-14B-GCv2-10k"

# Load the weights; "auto" keeps the checkpoint dtype and spreads layers
# across the available devices (device_map relies on the accelerate package)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"
messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 14B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and append the
# assistant prefix so generation starts at the reply
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so only the newly generated reply is decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
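Since the model is described as tuned for multi-turn conversations, it may help to see how the single-turn snippet above could be extended to carry history between turns. The sketch below is one possible arrangement, not the author's method: start_conversation, chat_turn, and make_transformers_backend are hypothetical helpers. The backend wraps the same apply_chat_template and generate calls used above and is only defined here, not invoked, so the example runs without downloading the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def start_conversation(system_prompt: str) -> List[Message]:
    """Create a history list that begins with the system message."""
    return [{"role": "system", "content": system_prompt}]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_reply: Callable[[List[Message]], str],
) -> str:
    """Append the user message, ask the backend for a reply, and record it."""
    history.append({"role": "user", "content": user_text})
    reply = generate_reply(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def make_transformers_backend(model, tokenizer, max_new_tokens: int = 512):
    """Wrap a loaded model and tokenizer (as in the quickstart) as a backend."""

    def generate_reply(history: List[Message]) -> str:
        # Render the whole history, not just the last message
        text = tokenizer.apply_chat_template(
            history, tokenize=False, add_generation_prompt=True
        )
        inputs = tokenizer([text], return_tensors="pt").to(model.device)
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # Keep only the tokens generated after the prompt
        new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
        return tokenizer.decode(new_tokens, skip_special_tokens=True)

    return generate_reply


# Stand-in backend so the flow can be tried without loading the model
def fake_backend(history: List[Message]) -> str:
    return f"(reply after seeing {len(history)} messages)"


history = start_conversation("You are a helpful assistant.")
print(chat_turn(history, "How tall is the Eiffel tower?", fake_backend))
print(chat_turn(history, "And when was it built?", fake_backend))
```

With a real model, passing make_transformers_backend(model, tokenizer) in place of fake_backend would send the accumulated history through the chat template on every turn, so the second question is answered with the first exchange in context.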

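✨ Key features

Fine-tuned from Alibaba's Qwen 2.5 14B Instruct model, with the goal of outperforming other models of similar size.
Showcases the strengths of the GammaCorpus v2-10k dataset.

📦 Installation

Install the latest version of the transformers package with pip:

pip install transformers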

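📚 Documentation

Model details

Property	Details
Base model	Qwen/Qwen2.5-14B-Instruct
Type	Causal language model
Architecture	Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
Parameters	14.7B
Parameters (non-embedding)	13.1B
Layers	48
Attention heads (GQA)	40 for Q, 8 for KV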
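
To make the GQA row more concrete, the sketch below estimates how much key/value cache one token occupies with 8 KV heads versus a hypothetical layout in which every one of the 40 query heads had its own KV head. The layer and head counts come from the table; the head dimension of 128 and the 2-byte (16-bit) cache element are assumptions that do not appear on this page, and the function names are illustrative.

```python
def queries_per_kv_head(num_q_heads: int, num_kv_heads: int) -> int:
    """Number of query heads that share each key/value head under GQA."""
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly among KV heads")
    return num_q_heads // num_kv_heads


def kv_cache_bytes_per_token(
    num_layers: int, num_kv_heads: int, head_dim: int, bytes_per_element: int
) -> int:
    """Bytes needed to cache keys and values for one token across all layers."""
    keys_and_values = 2  # one K tensor and one V tensor per layer
    return keys_and_values * num_layers * num_kv_heads * head_dim * bytes_per_element


LAYERS = 48          # from the table
Q_HEADS = 40         # from the table
KV_HEADS = 8         # from the table
HEAD_DIM = 128       # assumption, not stated on this page
BYTES_PER_ELEM = 2   # assumption: 16-bit cache entries

gqa = kv_cache_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, BYTES_PER_ELEM)
mha = kv_cache_bytes_per_token(LAYERS, Q_HEADS, HEAD_DIM, BYTES_PER_ELEM)

print(queries_per_kv_head(Q_HEADS, KV_HEADS))  # 5 query heads per KV head
print(gqa // 1024, "KiB per token with 8 KV heads")    # 192
print(mha // 1024, "KiB per token with 40 KV heads")   # 960
```

Under these assumptions the cache is five times smaller than it would be with one KV head per query head, which is the main practical effect of the 40/8 split for long conversations.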