Qwen2.5-0.5B-Instruct-Gensyn-Swarm: Free Open-Source Model

Qwen2.5 0.5B Instruct Gensyn Swarm Peaceful Exotic Butterfly

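Developed by juliannode

A fine-tuned version of Gensyn/Qwen2.5-0.5B-Instruct, trained with the TRL framework and the GRPO algorithm, suited to instruction-following tasks.

Large Language Model

Transformers

#GRPO reinforcement learning #multi-turn instruction tuning #small-parameter efficient inference

Downloads: 16

Released: 4/2/2025

Model Overview

This is a fine-tuned language model focused on instruction understanding and text generation, trained with a reinforcement-learning swarm training method.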

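Model Features

GRPO training

Trained with the GRPO method introduced in the DeepSeekMath paper.

TRL framework

Trained with TRL (Transformer Reinforcement Learning), a library for reinforcement-learning training of Transformer models.

Instruction tuning

Optimized specifically for instruction understanding and generation tasks.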

Model Capabilities

Text generation

Instruction understanding

Dialogue generation

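Use Cases

Dialogue systems

Answering hypothetical questions

Answers hypothetical questions posed by users, such as the time-machine choice question.

Can generate reasonable, logically coherent answers.

Educational applications

Prompting thought

Helps students broaden their thinking by answering open-ended questions.

Offers a variety of viewpoints and angles of thought.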

🚀 Qwen2.5-0.5B-Instruct-Gensyn-Swarm-peaceful_exotic_butterfly

This is a fine-tuned Transformer-based model intended for question answering and text generation.

🚀 Quick Start

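from transformers import pipeline

# Example prompt; any user question can be used here.
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
# Load the fine-tuned model as a text-generation pipeline.
# device="cuda" requires a CUDA-capable GPU; use device="cpu" otherwise.
generator = pipeline("text-generation", model="juliannode/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-peaceful_exotic_butterfly", device="cuda")
# Pass the prompt as a chat message list and return only the newly generated text.
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])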
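The call above passes the prompt as a list of role/content messages, which the pipeline renders into a single prompt string using the tokenizer's chat template. The sketch below approximates that rendering for the ChatML layout used by Qwen2.5 models. The helper name `render_chatml` is hypothetical, and the sketch omits any default system prompt the real template may inject; in practice the tokenizer's `apply_chat_template` method is the authoritative implementation.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate ChatML rendering of a chat message list (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model to produce the next reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# A multi-turn conversation: earlier turns are simply earlier list entries.
conversation = [
    {"role": "user", "content": "If you had a time machine, where would you go?"},
    {"role": "assistant", "content": "I would visit the future."},
    {"role": "user", "content": "Why the future rather than the past?"},
]
prompt = render_chatml(conversation)
print(prompt)
```

Passing the same `conversation` list to the pipeline in place of the single-message list would be the way to continue a multi-turn dialogue.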

✨ Main Features

Fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct.
Trained with TRL.

📚 Detailed Documentation

Training Procedure

This model was trained with GRPO, a method introduced in the paper DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models.
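The central idea of GRPO, as described in the DeepSeekMath paper, is to sample a group of completions for the same prompt and score each one relative to the rest of its group, rather than against a learned value function. The sketch below illustrates only that group-relative normalization step, not the full training loop and not this model's actual training code. The function name, the `eps` stabilizer, and the choice of population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for a single prompt.

    eps is a hypothetical stabilizer that avoids division by zero when
    every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for four completions sampled for the same prompt (made-up values).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions scoring above the group mean receive positive advantages and those below receive negative ones, so the policy update favors the better members of each group.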

Framework Versions

TRL: 0.15.2
Transformers: 4.51.3
PyTorch: 2.5.1
Datasets: 3.5.0
Tokenizers: 0.21.1
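To check whether a local environment matches the versions listed above, a small script along these lines may help. It is a sketch using only the standard library; the helper names are hypothetical, and the distribution names (for example `torch` for PyTorch) are the usual PyPI names, assumed here rather than stated in the source.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the model card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "trl": "0.15.2",
    "transformers": "4.51.3",
    "torch": "2.5.1",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Report 'match', 'missing', or 'differs (...)' for each expected package."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found is None:
            report[package] = "missing"
        elif found == wanted or found.startswith(wanted + "+"):
            # Local build tags such as '2.5.1+cu121' still count as a match.
            report[package] = "match"
        else:
            report[package] = f"differs (installed {found})"
    return report


for package, status in compare_versions(EXPECTED).items():
    print(f"{package}: {status}")
```

A mismatch does not necessarily prevent inference; the listed versions describe the training environment.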

📄 License

The model card lists the license field as "license" without naming a specific license.

📚 Citation

To cite the GRPO method, use:

@article{zhihong2024deepseekmath,
    title        = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
    author       = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
    year         = 2024,
    eprint       = {arXiv:2402.03300},
}

To cite the TRL library, use:

@misc{vonwerra2022trl,
	title        = {{TRL: Transformer Reinforcement Learning}},
	author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
	year         = 2020,
	journal      = {GitHub repository},
	publisher    = {GitHub},
	howpublished = {\url{https://github.com/huggingface/trl}}
}

Model Information

Attribute	Details
Base model	Gensyn/Qwen2.5-0.5B-Instruct
Library name	transformers
Model name	Qwen2.5-0.5B-Instruct-Gensyn-Swarm-peaceful_exotic_butterfly
Tags	generated_from_trainer, rl-swarm, grpo, gensyn, I am peaceful exotic butterfly, trl
License	license