🚀 internlm-chatbode-7b
InternLM-ChatBode is a language model fine-tuned for Portuguese, built on the InternLM2 model. It was further refined by fine-tuning on the UltraAlpaca dataset.
✨ Key Features
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("recogna-nlp/internlm-chatbode-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("recogna-nlp/internlm-chatbode-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()

response, history = model.chat(tokenizer, "Olá", history=[])
print(response)

response, history = model.chat(tokenizer, "O que é o Teorema de Pitágoras? Me dê um exemplo", history=history)
print(response)
```
Advanced Usage
Replies can be generated in a streaming fashion using the `stream_chat` method:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "recogna-nlp/internlm-chatbode-7b"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = model.eval()

length = 0
for response, history in model.stream_chat(tokenizer, "Olá", history=[]):
    print(response[length:], flush=True, end="")
    length = len(response)
```
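The `length` bookkeeping above exists because `stream_chat` yields the full response accumulated so far on every iteration, so only the newly generated suffix should be printed each time. The pattern can be sketched in isolation; the `cumulative_chunks` list below is a hypothetical stand-in for what `stream_chat` yields, not actual model output:

```python
def collect_stream(chunks):
    """Gather only the newly generated suffix of each cumulative chunk.

    `chunks` yields the full response so far (as stream_chat does);
    we track how much has already been emitted and keep just the rest.
    """
    pieces = []
    length = 0
    for response in chunks:
        pieces.append(response[length:])  # only the part not yet emitted
        length = len(response)
    return "".join(pieces)

# Hypothetical cumulative chunks mimicking a streamed reply
cumulative_chunks = ["Olá", "Olá! Como", "Olá! Como posso ajudar?"]
print(collect_stream(cumulative_chunks))  # → Olá! Como posso ajudar?
```

Because each chunk extends the previous one, concatenating the suffixes reconstructs exactly the final response, with no repeated text.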
📚 Documentation
Open Portuguese LLM Leaderboard Evaluation Results
Detailed results are available here and on the Open Portuguese LLM Leaderboard.
| Metric | Value |
|--------|-------|
| Average | 69.54 |
| ENEM Challenge (No Images) | 63.05 |
| BLUEX (No Images) | 51.46 |
| OAB Exams | 42.32 |
| Assin2 RTE | 91.33 |
| Assin2 STS | 80.69 |
| FaQuAD NLI | 79.80 |
| HateBR Binary | 87.99 |
| PT Hate Speech Binary | 68.09 |
| tweetSentBR | 61.11 |
📄 Citation
If you wish to use Chatbode in your research, please cite it as follows:
```bibtex
@misc{chatbode_2024,
  author    = {Gabriel Lino Garcia and Pedro Henrique Paiola and João Paulo Papa},
  title     = {Chatbode},
  year      = {2024},
  url       = {https://huggingface.co/recogna-nlp/internlm-chatbode-7b/},
  doi       = {10.57967/hf/3317},
  publisher = {Hugging Face}
}
```