开源多语言模型KunoRZN-Llama-3-3B - 适用于教育、医疗和日常任务

首页

Kunorzn Llama 3 3B

由 VinkuraAI 开发

KunoRZN-Llama-3-3B是VinkuraAI的旗舰多语言模型，支持12种以上印度语言及英语，专为教育、医疗和日常任务设计。

大型语言模型

Transformers

支持多种语言#印度多语言助手 #教育医疗双场景 #长链思维推理

下载量 9,214

发布时间 : 4/15/2025

模型简介

基于Meta Llama 3构建的多语言AI助手，擅长教育应用、医疗信息传递和交通管理，特别适应低资源计算环境。

模型特点

多语言支持

支持12种以上印度语言及英语，特别适合印度多语言环境。

双模式推理

可切换直觉响应模式和深度思考模式，适应不同任务需求。

低资源优化

专为印度普遍的低资源计算环境设计，运行效率高。

领域专业化

在教育、医疗和交通管理等特定领域表现优异。

模型能力

多语言文本生成

教育内容解释

医疗信息提供

交通管理建议

长链思维推理

函数调用

使用案例

教育

多语言课程解释

用多种印度语言向学生解释复杂概念

提高非英语母语学生的理解能力

医疗

地区语言医疗信息

用当地语言提供基本医疗信息和症状解释

提高医疗信息在非英语人群中的可及性

交通管理

本地化交通建议

根据印度城市特点提供交通路线和避堵建议

适应印度特有的交通状况

🚀 KunoRZN-Llama-3-3B

KunoRZN-Llama-3-3B 是 VinkuraAI 推出的一款强大的语言模型，它支持 12 种以上印度语言以及英语。该模型结合了“直觉式”传统模式响应和长链思维推理响应，用户可通过系统提示进行切换，能广泛应用于教育、医疗、交通管理等多个领域，为印度地区提供了便捷且强大的 AI 能力。

🚀 快速开始

模型信息

属性	详情
模型类型	KunoRZN-Llama-3-3B
基础模型	meta-llama/Meta-Llama-3.2-3B
支持语言	英语、印地语、泰米尔语、泰卢固语、卡纳达语、马拉雅拉姆语、马拉地语、孟加拉语、古吉拉特语、旁遮普语、奥里亚语、阿萨姆语、乌尔都语
许可证	llama3.2
标签	Llama - 3、instruct、finetune、chatml、multilingual、indian - languages、reasoning、education、healthcare、low - resource、vllm

示例配置

{
    "example_title": "KunoRZN",
    "messages": [
        {
            "role": "system",
            "content": "You are KunoRZN, a multilingual AI assistant fluent in both English and Indian languages, designed to help with education, healthcare information, and daily tasks."
        },
        {
            "role": "user",
            "content": "मुझे भारत के शिक्षा प्रणाली के बारे में बताएं।"
        }
    ]
}

推理基准

评估指标

在 IndicGLUE 数据集上进行文本生成任务，平均准确率达到 79.8%。

✨ 主要特性

多语言支持：支持 12 种以上印度语言和英语，打破语言障碍，为不同地区用户提供服务。
混合推理模式：统一“直觉式”传统模式响应和长链思维推理响应，用户可通过系统提示灵活切换。
广泛应用场景：适用于教育、医疗信息传递、交通管理系统等多个领域，满足不同场景需求。
低资源适配：能够在印度常见的低资源计算环境中运行。

📦 安装指南

使用 vLLM 运行该模型，首先确保已安装 vLLM：

pip install vllm

然后在终端运行以下命令：

vllm serve VinkuraAI/KunoRZN-Llama-3-3B

💻 使用示例

基础用法

深度思考模式

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import flash_attn
import time

tokenizer = AutoTokenizer.from_pretrained("VinkuraAI/KunoRZN-Llama-3-3B")

model = AutoModelForCausalLM.from_pretrained(
    "VinkuraAI/KunoRZN-Llama-3-3B",
    torch_dtype=torch.float16,
    device_map="auto",
    attn_implementation="flash_attention_2",
)

messages = [
    {
        "role": "system",
        "content": "You are a deep thinking AI assistant who can communicate in multiple Indian languages. You may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <thinking> </thinking> tags, and then provide your solution or response to the problem."
    },
    {
        "role": "user",
        "content": "भारतीय संविधान के मूल अधिकारों के बारे में बताइए और उनका महत्व समझाइए।"
    }
]

input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to("cuda")
generated_ids = model.generate(input_ids, max_new_tokens=3000, temperature=0.8, repetition_penalty=1.1, do_sample=True, eos_token_id=tokenizer.eos_token_id)
print(f"Generated Tokens: {generated_ids.shape[-1:]}")
response = tokenizer.decode(generated_ids[0], skip_special_tokens=True, clean_up_tokenization_space=True)
print(f"Response: {response}")

⚠️ 重要提示

对于复杂推理任务，KunoRZN 在思考过程中可能会使用多达 10,000 个标记。对于难题，您可能需要增加 max_new_tokens 的值。

标准“直觉式”响应模式

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import flash_attn
import time

tokenizer = AutoTokenizer.from_pretrained("VinkuraAI/KunoRZN-Llama-3-3B")

model = AutoModelForCausalLM.from_pretrained(
    "VinkuraAI/KunoRZN-Llama-3-3B",
    torch_dtype=torch.float16,
    device_map="auto",
    attn_implementation="flash_attention_2",
)

messages = [
    {
        "role": "system",
        "content": "You are KunoRZN, a multilingual AI assistant fluent in both English and Indian languages."
    },
    {
        "role": "user",
        "content": "தமிழ்நாட்டில் உள்ள பிரபலமான சுற்றுலா தலங்கள் என்ன?"
    }
]

input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to("cuda")
generated_ids = model.generate(input_ids, max_new_tokens=2500, temperature=0.8, repetition_penalty=1.1, do_sample=True, eos_token_id=tokenizer.eos_token_id)
print(f"Generated Tokens: {generated_ids.shape[-1:]}")
response = tokenizer.decode(generated_ids[0], skip_special_tokens=True, clean_up_tokenization_space=True)
print(f"Response: {response}")

高级用法

多语言用例

教育领域

messages = [
    {
        "role": "system",
        "content": "You are KunoRZN, an educational AI assistant that can explain concepts in simple terms for students."
    },
    {
        "role": "user",
        "content": "ಸೌರವ್ಯೂಹದ ಗ್ರಹಗಳ ಬಗ್ಗೆ ವಿವರಿಸಿ."
    }
]

医疗领域

messages = [
    {
        "role": "system",
        "content": "You are KunoRZN, a healthcare information assistant. Provide general health information while always recommending consultation with healthcare professionals."
    },
    {
        "role": "user",
        "content": "ডায়াবেটিস রোগের লক্ষণগুলি কী কী?"
    }
]

交通管理领域

messages = [
    {
        "role": "system",
        "content": "You are KunoRZN, a traffic management assistant. Help users navigate local traffic conditions and understand traffic rules."
    },
    {
        "role": "user",
        "content": "मुंबई में ट्रैफिक जाम से बचने के लिए क्या सुझाव हैं?"
    }
]

函数调用

使用特定系统提示和结构进行函数调用，示例如下：

<|start_header_id|>system<|end_header_id|>
You are a function calling AI model fluent in multiple Indian languages. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools: <tools> {"type": "function", "function": {"name": "get_weather", "description": "get_weather(city: str, state: str, country: str='India') -> dict - Get weather information for a given city.\\n\\n    Args:\\n        city (str): The city name.\\n        state (str): The state name.\\n        country (str): The country name, defaults to India.\\n\\n    Returns:\\n        dict: A dictionary containing weather information.\\n            Keys:\\n                - \'city\': The city name.\\n                - \'state\': The state name.\\n                - \'temperature\': The current temperature in Celsius.\\n                - \'humidity\': The current humidity percentage.\\n                - \'description\': Weather description.\\n                - \'forecast\': Forecast for next 3 days.", "parameters": {"type": "object", "properties": {"city": {"type": "string"}, "state": {"type": "string"}, "country": {"type": "string"}}, "required": ["city", "state"]}}}  </tools> Use the following pydantic model json schema for each tool call you will make: {"properties": {"arguments": {"title": "Arguments", "type": "object"}, "name": {"title": "Name", "type": "string"}}, "required": ["arguments", "name"], "title": "FunctionCall", "type": "object"} For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{"arguments": <args-dict>, "name": <function-name>}
</tool_call><|eot_id|><|start_header_id|>user<|end_header_id|>

📚 详细文档

量化版本

GGUF 量化版本：https://huggingface.co/VinkuraAI/KunoRZN-Llama-3-3B-GGUF

引用方式

@misc{
      title={KunoRZN-Llama-3-3B}, 
      author={VinkuraAI},
      year={2025}
}

📄 许可证

🔗 联系与支持

如需更多信息和支持，请访问 vinkura.in 或通过 support@vinkura.in 联系我们。

精选推荐AI模型

Llama 3 Typhoon V1.5x 8b Instruct

专为泰语设计的80亿参数指令模型，性能媲美GPT-3.5-turbo，优化了应用场景、检索增强生成、受限生成和推理任务

Cadet-Tiny是一个基于SODA数据集训练的超小型对话模型，专为边缘设备推理设计，体积仅为Cosmo-3B模型的2%左右。

Roberta Base Chinese Extractive Qa

基于RoBERTa架构的中文抽取式问答模型，适用于从给定文本中提取答案的任务。

问答系统中文

uer

2,694

智启未来，您的人工智能解决方案智库

简体中文