Trendyol-LLM-7B-chat-v4.1.0 open-source model: free text generation for the Turkish e-commerce domain

Trendyol LLM 7B Chat V4.1.0

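Developed by Trendyol

Trendyol LLM v4.1.0 is a generative model based on Trendyol LLM base v4.0 (a version of Qwen2.5 7B that was continually pre-trained on 13 billion tokens). It focuses on the e-commerce domain and on Turkish language understanding.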

Large language model

Safetensors

License: Apache-2.0 #e-commerce-optimized #Turkish-enhanced #function-calling

Downloads: 854

Release date: 3/7/2025

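Model overview

Trendyol LLM v4.1.0 is a generative model with enhanced e-commerce domain knowledge (product description generation, attribute extraction, content summarization, and similar tasks) and improved Turkish language understanding. It also supports function calling.

Model features

Enhanced e-commerce domain knowledge

The model is strengthened on e-commerce tasks such as product description generation, attribute extraction, and content summarization.

Improved Turkish language understanding

The model is optimized for Turkish, so it understands and generates Turkish content more reliably.

Function calling support

Function calling is partially supported, which lets the model interact with external tools.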

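Model capabilities

Product description generation

Attribute extraction

Content summarization

Fashion dialogue

Product tag extraction

Category detection

User persona interpretation

Retrieval-augmented generation (RAG)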

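Use cases

E-commerce

Product description generation

Generate appealing product descriptions automatically from product attributes.

Improve the conversion rate of product pages.

User persona interpretation

Build user personas from behavioral data.

Help merchants understand what their users need.

Multilingual support

Turkish content generation

Generate high-quality Turkish text.

Serve the needs of Turkish-speaking users.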

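🚀 Trendyol LLM v4.1.0: a cutting-edge large language model

Trendyol LLM v4.1.0 is a generative model based on Trendyol LLM base v4.0, which is a version of Qwen2.5 7B continually pre-trained on 13 billion tokens. This repository contains the chat model.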

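✨ Key features

- Enhanced e-commerce knowledge
  - Description generation
  - Attribute extraction
  - Summarization
  - Fashion dialogue
  - Product tag extraction
  - Category detection
  - Persona interpretation based on user behavior
  - Retrieval-augmented generation (RAG)
  - and more
- Improved Turkish language knowledge
- Function calling support (partially complete; to be completed in later iterations)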

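📦 Installation

The model card does not list installation steps. The usage examples depend on transformers and torch, which can be installed with pip: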

pip install transformers torch
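
The basic usage example in this card requests FlashAttention 2, which transformers only supports when the separate flash-attn package is installed on a compatible GPU. The card does not cover what to do without it, so the following is a minimal sketch under that assumption: it checks whether flash_attn is importable and otherwise falls back to PyTorch's scaled-dot-product attention ("sdpa"). The helper name pick_attn_implementation is hypothetical and not part of the original card.

```python
import importlib.util


def pick_attn_implementation():
    """Return "flash_attention_2" if the flash_attn package is importable, else "sdpa"."""
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


# These kwargs could be merged into the model_kwargs passed to pipeline(...).
model_kwargs = {"attn_implementation": pick_attn_implementation()}
print(model_kwargs)
```

The check only confirms that the package is present; whether FlashAttention 2 actually runs still depends on the GPU and the installed versions.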

💻 Usage examples

Basic usage

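from transformers import pipeline
import torch


model_id = "Trendyol/Trendyol-LLM-7B-chat-v4.1.0"

# Load the chat model as a text-generation pipeline in bfloat16.
pipe = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "use_cache": True,
        # Requires the separate flash-attn package. This replaces the
        # deprecated "use_flash_attention_2": True flag.
        "attn_implementation": "flash_attention_2",
    },
    device_map="auto",
)


sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9, repetition_penalty=1.1)
# Turkish system prompt: "You are a helpful assistant and you will try to
# produce the best answer in line with the instructions given to you."
DEFAULT_SYSTEM_PROMPT = "Sen yardımsever bir asistansın ve sana verilen talimatlar doğrultusunda en iyi cevabı üretmeye çalışacaksın."

messages = [
    {"role": "system", "content": DEFAULT_SYSTEM_PROMPT},
    # User question: "How many provinces are there in Türkiye?"
    {"role": "user", "content": "Türkiye'de kaç il var?"}
]

# return_full_text=False returns only the newly generated assistant reply.
outputs = pipe(
    messages,
    max_new_tokens=1024,
    return_full_text=False,
    **sampling_params
)

print(outputs[0]["generated_text"])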
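
The card lists product description generation from product attributes as a core use case but does not show a prompt for it. As a rough sketch, the chat messages for such a request could be assembled as below. The helper build_description_messages, the Turkish instruction text, and the sample attributes are all illustrative assumptions, not prompts published by Trendyol.

```python
# System prompt copied from the basic usage example in this card.
DEFAULT_SYSTEM_PROMPT = (
    "Sen yardımsever bir asistansın ve sana verilen talimatlar "
    "doğrultusunda en iyi cevabı üretmeye çalışacaksın."
)


def build_description_messages(attributes, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Build chat messages asking for a product description (hypothetical helper)."""
    # One "- name: value" line per product attribute.
    attribute_lines = "\n".join(f"- {name}: {value}" for name, value in attributes.items())
    # Assumed instruction: "Write a short product description based on the
    # following product attributes:"
    user_prompt = (
        "Aşağıdaki ürün özelliklerine göre kısa bir ürün açıklaması yaz:\n"
        + attribute_lines
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


# Made-up sample attributes: product, color, material.
messages = build_description_messages(
    {"Ürün": "Kadın Spor Ayakkabı", "Renk": "Beyaz", "Malzeme": "Deri"}
)
print(messages[1]["content"])
```

The resulting messages list can be passed to the pipe(...) call in place of the messages in the basic example.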

Advanced usage (function calling)

tools = [
    {
        "name": "get_city_count",
        "description": "Get current city count of given country.",
        "parameters": {
            "type": "object",
            "properties": {
                "country_name": {
                    "type": "string",
                    "description": 'The name of the country to get the count for.',
                },
            },
            "required": ["country_name"],
        },
    },
    {
        "name": "get_temperature_date",
        "description": "Get temperature at a location and date.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": 'The location to get the temperature for, in the format "City, State, Country".',
                },
                "date": {
                    "type": "string",
                    "description": 'The date to get the temperature for, in the format "Year-Month-Day".',
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": 'The unit to return the temperature in. Defaults to "celsius".',
                },
            },
            "required": ["location", "date"],
        },
    },
]

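messages = [
    # Turkish system prompt: "You are a helpful assistant with access to the
    # following functions. You can use them when needed -"
    {"role": "system", "content": "Sen, aşağıdaki fonksiyonlara erişimi olan yardımcı bir asistansın. Gerektiğinde bunları kullanabilirsin -"},
    {"role": "user", "content": "Türkiye'de kaç il var?"}
]

# Render the chat template together with the tool definitions, then generate.
text = pipe.tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, tokenize=False)
inputs = pipe.tokenizer(text, return_tensors="pt").to(pipe.model.device)
outputs = pipe.model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens. Slicing the decoded string by
# len(text) breaks if decoding does not reproduce the prompt exactly.
prompt_length = inputs["input_ids"].shape[1]
output_text = pipe.tokenizer.decode(outputs[0][prompt_length:])
print(output_text)
# Example output:
# '<function>{"name": "get_city_count", "arguments": \'{"country_name": "Turkey"}\'}</function><|im_end|>'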
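
The card stops at printing the raw model output. To act on it, the `<function>...</function>` payload has to be parsed and routed to real code. The sketch below assumes the output always follows the format of the sample above. In that sample the "arguments" value is wrapped in single quotes, so the payload is not strict JSON; the parser therefore falls back to ast.literal_eval. The names parse_function_calls, dispatch, and TOOL_REGISTRY, and the stub get_city_count implementation, are hypothetical.

```python
import ast
import json
import re

FUNCTION_RE = re.compile(r"<function>(.*?)</function>", re.DOTALL)


def parse_function_calls(output_text):
    """Return a list of (name, arguments_dict) tuples found in the model output."""
    calls = []
    for payload in FUNCTION_RE.findall(output_text):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            # Single-quoted values are invalid JSON but valid Python literals.
            call = ast.literal_eval(payload)
        arguments = call.get("arguments", {})
        if isinstance(arguments, str):
            # In the sample output, the arguments are a JSON-encoded string.
            arguments = json.loads(arguments)
        calls.append((call["name"], arguments))
    return calls


def get_city_count(country_name):
    """Stub tool for illustration; a real tool would query a data source."""
    return {"Turkey": 81}.get(country_name)


# Maps tool names from the `tools` schema to local Python callables.
TOOL_REGISTRY = {"get_city_count": get_city_count}


def dispatch(calls):
    """Run each parsed call against the registry and collect the results."""
    return [TOOL_REGISTRY[name](**arguments) for name, arguments in calls]


sample = '<function>{"name": "get_city_count", "arguments": \'{"country_name": "Turkey"}\'}</function><|im_end|>'
calls = parse_function_calls(sample)
print(calls)
print(dispatch(calls))
```

Because function calling is described as only partially complete, a production parser should also handle malformed payloads (ast.literal_eval raises ValueError or SyntaxError) and unknown tool names instead of assuming every output is well formed.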

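📚 Detailed documentation

Limitations, risks, bias, and ethical considerations

Limitations and known biases

Primary function and application: Trendyol LLM is an autoregressive language model whose primary function is to predict the next token in a text string. Although it is often used for a variety of applications, it has not been extensively tested in real-world use, and its effectiveness and reliability across different scenarios remain largely unverified.
Language comprehension and generation: The model was trained primarily on standard English and Turkish. Its performance may be limited when understanding or generating slang, informal language, or other languages, which can lead to errors or misinterpretations.
Generation of false information: Users should be aware that Trendyol LLM may produce inaccurate or misleading information. Its outputs should be treated as starting points or suggestions, not as definitive answers.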

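Risks and ethical considerations

Potential for harmful use: Trendyol LLM could be used to generate offensive or harmful language. We strongly discourage any such use and stress the need for application-specific safety and fairness evaluations before deployment.
Unintended content and bias: The model was trained on a large corpus of text that was not explicitly checked for offensive content or existing biases. As a result, it may inadvertently produce content that reflects these biases or contains inaccuracies.
Toxicity: Despite efforts to select appropriate training data, the model can still generate harmful content, especially when explicitly prompted to. We encourage the open-source community to develop strategies that minimize such risks.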

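Recommendations for safe and ethical use

Human oversight: We recommend adding a human review layer or filters in public-facing applications to manage and improve output quality. This helps reduce the risk of unexpectedly generating objectionable content.
Application-specific testing: Developers who intend to use Trendyol LLM should run thorough safety testing and tuning for their specific application. This is essential because the model's responses can be unpredictable and may occasionally be biased, inaccurate, or offensive.
Responsible development and deployment: Developers and users of Trendyol LLM are responsible for ensuring that their use meets ethical and safety standards. We urge users to keep the model's limitations in mind and to put appropriate safeguards in place to prevent misuse or harmful outcomes.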
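
The human-oversight recommendation above can be made concrete with a small gate that holds suspicious outputs for manual review. The following is only a sketch under stated assumptions: the blocklist pattern, the length limit, and the names needs_human_review and release_or_queue are placeholders invented for illustration. A real deployment would need an application-specific policy or a dedicated moderation classifier.

```python
import re

# Placeholder blocklist; replace with terms or classifiers suited to the application.
BLOCKED_PATTERNS = [re.compile(r"\bexample_banned_term\b", re.IGNORECASE)]


def needs_human_review(text, max_chars=2000):
    """Return True if the generated text should be held for manual review."""
    if not text.strip():
        return True  # empty generations are suspicious
    if len(text) > max_chars:
        return True  # unusually long outputs get a second look
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)


def release_or_queue(text, review_queue):
    """Return the text if it passes the filter; otherwise queue it and return None."""
    if needs_human_review(text):
        review_queue.append(text)
        return None
    return text


review_queue = []
print(release_or_queue("Merhaba, size nasıl yardımcı olabilirim?", review_queue))
print(release_or_queue("contains EXAMPLE_BANNED_TERM here", review_queue))
print(len(review_queue))
```

Keyword filters like this catch only the most obvious cases, so they complement human review and application-specific testing rather than replace them.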