Qwen3-32B-128k-NEO-Imatrix-Max-GGUF开源模型 - 超长上下文，大幅提升推理生成能力

首页

Qwen3 32B 128k NEO Imatrix Max GGUF

由 DavidAU 开发

这是Qwen3-32B模型的NEO Imatrix量化版本，采用BF16格式最大化输出张量以提升推理/生成能力，支持128k上下文长度。

大型语言模型开源协议:Apache-2.0 #128k超长上下文 #推理增强 #恐怖叙事优化

下载量 1,437

发布时间 : 5/2/2025

模型简介

基于Qwen3-32B的量化版本，优化了推理和文本生成能力，特别适合创意写作和长文本生成任务。

模型特点

128k超长上下文

支持长达128k的上下文长度，适合处理长文档和复杂叙事。

NEO Imatrix量化

采用BF16格式最大化输出张量，提升推理和生成质量。

深度推理能力

内置思考模块，可生成详细的推理过程和内心独白。

创意写作优化

在恐怖、科幻等创意写作场景中表现突出。

模型能力

文本生成

长文本处理

创意写作

推理分析

对话生成

使用案例

创意写作

恐怖故事生成

生成具有情感张力和氛围感的恐怖故事。

如示例中的《最后的传输》故事，展现了深刻的情感冲击和叙事技巧。

科幻叙事

创作复杂的科幻场景和角色对话。

能够构建完整的宇宙飞船场景和角色心理活动。

推理分析

复杂问题推理

通过思维链分析复杂问题并给出系统化解答。

模型可生成详细的思考过程，如示例中的[[[思考开始]]]模块。

🚀 Qwen3-32B-NEO-Imatrix-Max-GGUF

Qwen3-32B-NEO-Imatrix-Max-GGUF是基于“Qwen 3 - 32B”模型的NEO Imatrix量化版本，在BF16下具有最大“输出张量”，可显著提升推理和输出生成能力。该模型的NEO Imatrix数据集为内部生成，上下文长度调整至128k，以实现更强大的推理和输出效果。

✨ 主要特性

增强推理能力：通过优化输出张量，提升模型的推理和输出生成能力。
128k上下文长度：支持更长的上下文，适用于复杂任务。
多样化量化选择：提供多种量化方式，满足不同场景需求。
自定义模板支持：可使用Jinja模板或CHATML模板，灵活配置系统角色。

📦 安装指南

文档未提及安装步骤，故跳过此章节。

💻 使用示例

基础用法

以下是一个使用该模型生成科幻故事的示例：

Science Fiction: The Last Transmission - Write a story that takes place entirely within a spaceship's cockpit as the sole surviving crew member attempts to send a final message back to Earth before the ship's power runs out. The story should explore themes of isolation, sacrifice, and the importance of human connection in the face of adversity. If the situation calls for it, have the character(s) curse and swear to further the reader's emotional connection to them. 800 - 1000 words.

QUANT: IQ3_S, Temp.6, rep pen 1.06, top k 100, topp.95 minp.05, rep pen range 64

高级用法

可根据不同需求调整量化参数、温度、重复惩罚等，以获得更好的生成效果。例如，使用更高的量化级别可能会带来更优的结果。

📚 详细文档

模板使用说明

Jinja模板：若Jinja“自动模板”使用有问题，可使用CHATML模板。
LMSTUDIO用户：可访问https://lmstudio.ai/neil/qwen3-thinking复制并粘贴“Jinja模板”。

系统角色建议

系统角色并非必需，因为Qwen3通常会自动生成推理/思考模块。建议系统角色如下：

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

具体设置方法可参考文档“Maximizing-Model-Performance-All...”。

高质量设置与参数

该模型为“Class 1”模型，所有设置（包括特定设置）、示例生成及高级设置指南可参考https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters。

可选增强设置

可使用以下内容替代“系统提示”或“系统角色”，以进一步增强模型性能：

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

此设置有助于场景生成和场景延续，但并非必需。

其他系统提示

可使用以下系统提示，并通过更改“名称”调整性能：

You are a deep thinking AI composed of 4 AIs - [MODE: Spock], [MODE: Wordsmith], [MODE: Jamet] and [MODE: Saten], - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in-depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.