🚀 Karen: A Text-Editing Powerhouse (v.2), Strict Edition
Karen is a professional editor for your text: she fixes grammar and spelling mistakes while preserving the original writing style. Working on American English text, she accurately identifies and corrects the most common classes of errors, keeping your writing on track.
🚀 Quick Start
Versions
Karen V2 comes in two editions:
- Strict edition: this version changes the original text as little as possible, fixing only grammar and spelling mistakes, on the assumption that you know what you meant to write.
- Creative edition: click here to view it. That version may also suggest minor contextual improvements or light rewrites; think of it as Karen after a small glass of red wine, a bit more creative.
Usage
Submit one paragraph or passage of text at a time. Prepend the following instruction to the actual text: Edit the following text for spelling and grammar mistakes: This keeps Karen from interpreting the text in any other way.
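Applying the prefix can be done mechanically. Below is a minimal sketch, assuming one paragraph per submission; `build_prompt` is an illustrative helper name, not part of any published API:

```python
# Karen expects this exact instruction before the text to proofread;
# without it she may interpret or respond to the text instead of editing it.
PREFIX = "Edit the following text for spelling and grammar mistakes: "

def build_prompt(paragraph: str) -> str:
    """Prepend the proofreading instruction to one paragraph of text."""
    return PREFIX + paragraph.strip()

print(build_prompt("He go to school yesterday."))
```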
Recommended settings
- Temperature: 0.7
- top_p: 0.1
- top_k: 40
- Repetition penalty: 1.18
Karen can also be used for chat, but if your message is long she may treat it as text to be proofread rather than ordinary conversation.
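The recommended settings above can be kept together in a single parameter dict. This is only a sketch: the key names follow common llama.cpp-style conventions and are assumptions, so rename them to match your inference API (some APIs call the last one `repetition_penalty`):

```python
# Recommended sampling settings for Karen, collected in one place.
# Key names are assumptions based on common inference backends;
# rename them to match whatever API you actually run the model with.
KAREN_SETTINGS = {
    "temperature": 0.7,
    "top_p": 0.1,
    "top_k": 40,
    "repeat_penalty": 1.18,  # "repetition_penalty" in some APIs
}

# Hypothetical usage with a generic generate() call:
# output = llm.generate(prompt, **KAREN_SETTINGS)
```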
✨ Key Features
Core functionality
Karen's primary goal is to correct grammar and spelling mistakes in American English without altering the style of the text. She excels at spotting and fixing common English-as-a-second-language (ESL) errors, including:
- Verb tense errors: using the present tense where the past tense is needed (and vice versa), or confusing continuous and simple tenses.
- Subject-verb agreement: mismatches in number between subject and verb, such as a singular verb with a plural subject or vice versa.
- Article misuse: using or omitting articles incorrectly, such as "a" for "an" (or vice versa), or overusing or dropping the definite article "the".
- Preposition misuse: using the wrong preposition, such as "in" instead of "on" or "at", or omitting a preposition where one is needed.
- Word order errors: incorrect word order, especially in questions and negations, and misplaced adverbs or adjectives.
- Noun plural errors: incorrect plural forms, such as a missing "-s" or "-es" where required.
- Pronoun errors: confusing subject and object pronouns, or misusing possessive pronouns.
- Double negatives: using double negatives, which are ungrammatical in standard English.
- Modal verb misuse: incorrect use of modal verbs such as can, could, will, would, and should.
- Commonly confused words: mixing up words that sound alike but differ in meaning and spelling (such as "their", "there", and "they're").
- Singular/plural agreement: mismatched singular and plural nouns and verbs within a sentence.
Before and after
Here is a comparison of text before and after editing with Karen:
Original:
Edit the following text for spelling and grammar mistakes: I see future writing more about you writing the story any way you can and LLM editing it and making it into a proper fiction. That seems to be more satisfying way where you could feel you actually had done something creATIVE. So human is the creative, LLM is the burger flipping intern.
The opposite is what many people are trying rn. Make LLM to invent story (from premise, to creating outline to writing actual chapters) and the human to somehow assemble it from those pieces. So LLM is the creative man, and human is the floor polisher.
I can't believe a normal person would find it satisfying long term.
So in my view, the job of editor would be replaced by Ai, not the job of author.
(Grammarly score: 48)
Strict edition output:
I see the future of writing being more about you writing the story any way you can and having an AI like LLM edit it and make it into a proper fiction. That seems to be a more satisfying way where you could feel you actually did something creative. So humans are the creative ones, while LLMs are the burger-flipping interns.
On the other hand, many people are currently trying to do the exact opposite. They want to make LLMs create stories (from premises, to outlining, to writing entire chapters). And then they want humans to somehow piece them together from these parts. In this scenario, LLMs become the creative individuals, and humans become the floor polishers.
I can't imagine a regular person finding that satisfying over the long run.
In my opinion, the role of an editor will eventually be taken over by AI, not the role of an author.
(Grammarly score: 93)
Creative edition output:
I see the future of writing being more focused on you writing the story any way you can and having an AI like GPT - 3 edit it and turn it into a proper piece of fiction. This seems to be a more satisfying approach where you can truly feel that you have done something creative. In this scenario, humans remain the creative force, while the AI acts as the burger flipping intern.
On the other hand, there are those who argue that we should let the AI generate stories from scratch - from coming up with a premise, to outlining the plot, to writing individual chapters. They believe that the role of the human should then become one of assembling these pieces together. In this case, the AI becomes the creative mind, and the human takes on the role of a floor polisher.
I cannot imagine how someone could find this satisfying over the long term. Therefore, in my opinion, the job of the editor will eventually be taken over by AI, not the job of the author.
(Grammarly score: 83)
📦 Installation
The source documentation does not cover installation, so this section is skipped.
💻 Usage Examples
The source documentation provides no code examples, so this section is skipped.
📚 Documentation
Model information
Karen V2 uses an entirely different dataset and base model from the previous version.
Training approach
The model was trained in reverse: another Llama model (Darth Karen) and Python scripts were used to deliberately insert errors into fiction and non-fiction American English text.
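The reverse-training idea can be sketched as follows: corrupt clean text to produce (broken, clean) training pairs, so the model learns to map errors back to correct text. The corruption rules below are toy assumptions for illustration only; the actual Darth Karen scripts are not described in the source:

```python
import random

# Illustrative sketch of reverse-training data generation: start from
# clean American English text, deliberately corrupt it, and keep the
# (corrupted, clean) pair as a supervised training example.
# The SWAPS table is a toy assumption, not the real corruption logic.
SWAPS = {
    "their": "there",  # commonly confused words
    "an": "a",         # article misuse
    "went": "go",      # verb tense error
}

def corrupt(sentence: str, rng: random.Random) -> str:
    """Apply one random word-level error from SWAPS, if any word qualifies."""
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in SWAPS]
    if candidates:
        i = rng.choice(candidates)
        words[i] = SWAPS[words[i].lower()]
    return " ".join(words)

def make_pair(clean: str, seed: int = 0) -> tuple[str, str]:
    """Return (corrupted_input, clean_target) for supervised training."""
    return corrupt(clean, random.Random(seed)), clean

print(make_pair("She went to an office with their friend."))
```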
Future goals
- Use a larger model.
- Add grammar cases the model currently misses.
- Use a better, larger dataset.
Summary
After roughly ten iterations and refinements, the current model performs well, though it occasionally makes grammatical errors (and those are often debatable). Its limitations appear tied to the 7B parameter scale, which is not enough for a fine-grained understanding of every nuance of the input. This matches my other findings: Mistral models write their own text quite well, but their comprehension is imperfect, again due to the 7B parameter count.
The goal in creating this model was to fix grammar and spelling mistakes without changing the style of the text. Most LLMs, when asked to edit, try to rewrite text even when nothing is wrong with it. For a model this small, striking the right balance between correcting text (without altering its style) and copying it verbatim is quite challenging. The strict edition assumes you are already a good writer who does not need much guidance, and that every word you write is there for a reason.
Evaluation results
Detailed evaluation results can be viewed here.
| Metric | Value |
| --- | --- |
| Average | 59.13 |
| AI2 Reasoning Challenge (25-shot) | 59.56 |
| HellaSwag (10-shot) | 81.79 |
| MMLU (5-shot) | 59.56 |
| TruthfulQA (0-shot) | 49.36 |
| Winogrande (5-shot) | 74.35 |
| GSM8k (5-shot) | 30.17 |
🔧 Technical Details
The source documentation does not provide enough implementation detail, so this section is skipped.
📄 License
This project is released under the Llama 2 license.