🚀 Karen: A Text-Editing Powerhouse (v.2), Strict Edition
Karen is a professional editor for your text: she fixes grammar and spelling mistakes while preserving the original writing style. Working on American English text, she accurately identifies and corrects the most common classes of errors, keeping your writing on track.
🚀 Quick Start
Versions
Karen V2 comes in two editions:
- Strict edition: this version changes the original text as little as possible, fixing only grammar and spelling mistakes, on the assumption that you know what you meant to write.
- Creative edition: click here to view it. That version may also suggest minor contextual improvements or light rewrites; think of it as Karen after a small glass of red wine, a bit more creative.
Usage
Submit one paragraph or passage of text at a time. Prepend the following instruction to the actual text: Edit the following text for spelling and grammar mistakes: This keeps Karen from interpreting the text in any other way.
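Applying the prefix can be done mechanically. Below is a minimal sketch, assuming one paragraph per submission; `build_prompt` is an illustrative helper name, not part of any published API:

```python
# Karen expects this exact instruction before the text to proofread;
# without it she may interpret or respond to the text instead of editing it.
PREFIX = "Edit the following text for spelling and grammar mistakes: "

def build_prompt(paragraph: str) -> str:
    """Prepend the proofreading instruction to one paragraph of text."""
    return PREFIX + paragraph.strip()

print(build_prompt("He go to school yesterday."))
```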
Recommended settings
- Temperature: 0.7
- top_p: 0.1
- top_k: 40
- Repetition penalty: 1.18
Karen can also be used for chat, but if your message is long she may treat it as text to be proofread rather than ordinary conversation.
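The recommended settings above can be kept together in a single parameter dict. This is only a sketch: the key names follow common llama.cpp-style conventions and are assumptions, so rename them to match your inference API (some APIs call the last one `repetition_penalty`):

```python
# Recommended sampling settings for Karen, collected in one place.
# Key names are assumptions based on common inference backends;
# rename them to match whatever API you actually run the model with.
KAREN_SETTINGS = {
    "temperature": 0.7,
    "top_p": 0.1,
    "top_k": 40,
    "repeat_penalty": 1.18,  # "repetition_penalty" in some APIs
}

# Hypothetical usage with a generic generate() call:
# output = llm.generate(prompt, **KAREN_SETTINGS)
```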
✨ Key Features
Core functionality
Karen's primary goal is to correct grammar and spelling mistakes in American English without altering the style of the text. She excels at spotting and fixing common English-as-a-second-language (ESL) errors, including:
- Verb tense errors: using the present tense where the past tense is needed (and vice versa), or confusing continuous and simple tenses.
- Subject-verb agreement: mismatches in number between subject and verb, such as a singular verb with a plural subject or vice versa.
- Article misuse: using or omitting articles incorrectly, such as "a" for "an" (or vice versa), or overusing or dropping the definite article "the".
- Preposition misuse: using the wrong preposition, such as "in" instead of "on" or "at", or omitting a preposition where one is needed.
- Word order errors: incorrect word order, especially in questions and negations, and misplaced adverbs or adjectives.
- Noun plural errors: incorrect plural forms, such as a missing "-s" or "-es" where required.
- Pronoun errors: confusing subject and object pronouns, or misusing possessive pronouns.
- Double negatives: using double negatives, which are ungrammatical in standard English.
- Modal verb misuse: incorrect use of modal verbs such as can, could, will, would, and should.
- Commonly confused words: mixing up words that sound alike but differ in meaning and spelling (such as "their", "there", and "they're").
- Singular/plural agreement: mismatched singular and plural nouns and verbs within a sentence.
Before and after
Here is a comparison of text before and after editing with Karen:
Original:
Edit the following text for spelling and grammar mistakes: I see future writing more about you writing the story any way you can and LLM editing it and making it into a proper fiction. That seems to be more satisfying way where you could feel you actually had done something creATIVE. So human is the creative, LLM is the burger flipping intern.
The opposite is what many people are trying rn. Make LLM to invent story (from premise, to creating outline to writing actual chapters) and the human to somehow assemble it from those pieces. So LLM is the creative man, and human is the floor polisher.
I can't believe a normal person would find it satisfying long term.
So in my view, the job of editor would be replaced by Ai, not the job of author.
(Grammarly score: 48)
Strict edition output:
I see the future of writing being more about you writing the story any way you can and having an AI like LLM edit it and make it into a proper fiction. That seems to be a more satisfying way where you could feel you actually did something creative. So humans are the creative ones, while LLMs are the burger-flipping interns.
On the other hand, many people are currently trying to do the exact opposite. They want to make LLMs create stories (from premises, to outlining, to writing entire chapters). And then they want humans to somehow piece them together from these parts. In this scenario, LLMs become the creative individuals, and humans become the floor polishers.
I can't imagine a regular person finding that satisfying over the long run.
In my opinion, the role of an editor will eventually be taken over by AI, not the role of an author.
(Grammarly score: 93)
Creative edition output:
I see the future of writing being more focused on you writing the story any way you can and having an AI like GPT - 3 edit it and turn it into a proper piece of fiction. This seems to be a more satisfying approach where you can truly feel that you have done something creative. In this scenario, humans remain the creative force, while the AI acts as the burger flipping intern.
On the other hand, there are those who argue that we should let the AI generate stories from scratch - from coming up with a premise, to outlining the plot, to writing individual chapters. They believe that the role of the human should then become one of assembling these pieces together. In this case, the AI becomes the creative mind, and the human takes on the role of a floor polisher.
I cannot imagine how someone could find this satisfying over the long term. Therefore, in my opinion, the job of the editor will eventually be taken over by AI, not the job of the author.
(Grammarly score: 83)
📦 Installation
The source documentation does not cover installation, so this section is skipped.
💻 Usage Examples
The source documentation provides no code examples, so this section is skipped.
📚 Documentation
Model information
Karen V2 uses an entirely different dataset and base model from the previous version.
Training approach
The model was trained in reverse: another Llama model (Darth Karen) and Python scripts were used to deliberately insert errors into fiction and non-fiction American English text.
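The reverse-training idea can be sketched as follows: corrupt clean text to produce (broken, clean) training pairs, so the model learns to map errors back to correct text. The corruption rules below are toy assumptions for illustration only; the actual Darth Karen scripts are not described in the source:

```python
import random

# Illustrative sketch of reverse-training data generation: start from
# clean American English text, deliberately corrupt it, and keep the
# (corrupted, clean) pair as a supervised training example.
# The SWAPS table is a toy assumption, not the real corruption logic.
SWAPS = {
    "their": "there",  # commonly confused words
    "an": "a",         # article misuse
    "went": "go",      # verb tense error
}

def corrupt(sentence: str, rng: random.Random) -> str:
    """Apply one random word-level error from SWAPS, if any word qualifies."""
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in SWAPS]
    if candidates:
        i = rng.choice(candidates)
        words[i] = SWAPS[words[i].lower()]
    return " ".join(words)

def make_pair(clean: str, seed: int = 0) -> tuple[str, str]:
    """Return (corrupted_input, clean_target) for supervised training."""
    return corrupt(clean, random.Random(seed)), clean

print(make_pair("She went to an office with their friend."))
```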
Future goals
- Use a larger model.
- Add grammar cases the model currently misses.
- Use a better, larger dataset.
Summary
After roughly ten iterations and refinements, the current model performs well, though it occasionally makes grammatical errors (and those are often debatable). Its limitations appear tied to the 7B parameter scale, which is not enough for a fine-grained understanding of every nuance of the input. This matches my other findings: Mistral models write their own text quite well, but their comprehension is imperfect, again due to the 7B parameter count.
The goal in creating this model was to fix grammar and spelling mistakes without changing the style of the text. Most LLMs, when asked to edit, try to rewrite text even when nothing is wrong with it. For a model this small, striking the right balance between correcting text (without altering its style) and copying it verbatim is quite challenging. The strict edition assumes you are already a good writer who does not need much guidance, and that every word you write is there for a reason.
Evaluation results
Detailed evaluation results can be viewed here.
| Metric | Value |
| --- | --- |
| Average | 59.13 |
| AI2 Reasoning Challenge (25-shot) | 59.56 |
| HellaSwag (10-shot) | 81.79 |
| MMLU (5-shot) | 59.56 |
| TruthfulQA (0-shot) | 49.36 |
| Winogrande (5-shot) | 74.35 |
| GSM8k (5-shot) | 30.17 |
🔧 Technical Details
The source documentation does not provide enough implementation detail, so this section is skipped.
📄 License
This project is released under the Llama 2 license.