finetuned-t5-xsum: a free, open-source text summarization model for extracting key points quickly and accurately

Finetuned T5 Xsum

Developed by Lakshan2003

A text summarization model based on T5-small, fine-tuned on the XSum dataset with LoRA

Text generation

Safetensors

Language: English

License: Apache-2.0

Tags: #LoRA fine-tuning, #English summarization, #Lightweight T5

Downloads: 22

Released: 2/9/2025

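Model Overview

This model fine-tunes T5-small with LoRA (Low-Rank Adaptation) to generate text summaries. It is particularly suited to abstractive summarization of news articles and similar content.

Model Features

LoRA fine-tuning

Uses low-rank adaptation for efficient fine-tuning: only a small set of added low-rank weights is trained while the base model stays frozen, which sharply reduces the number of trainable parameters while largely preserving model performance.

Specialized summarization

Optimized on the XSum news summarization dataset, with a focus on concise, accurate abstractive summaries.

Lightweight deployment

Built on the T5-small architecture, so it is suitable for deployment in resource-constrained environments.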
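
To make the parameter savings of LoRA concrete, the sketch below compares the size of one dense projection matrix with the size of a low-rank adapter attached to it. The hidden size of 512 matches t5-small, but the rank of 8 is an assumption chosen for illustration; the rank actually used for this model is not stated here. The function names are hypothetical.

```python
def dense_param_count(d_in: int, d_out: int) -> int:
    """Number of weights in a dense d_out x d_in projection (no bias)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Weights added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


D_MODEL = 512  # hidden size of t5-small
RANK = 8       # assumed rank, for illustration only

full = dense_param_count(D_MODEL, D_MODEL)
lora = lora_param_count(D_MODEL, D_MODEL, RANK)
ratio = lora / full

print(f"dense weights:   {full}")
print(f"adapter weights: {lora}")
print(f"adapter / dense: {ratio:.4f}")
```

Under these assumptions the adapter trains roughly 3% as many weights as the projection it modifies; a smaller rank would shrink that share further.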

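Model Capabilities

Text summary generation

News content distillation

Long-text condensation

Use Cases

News media

Automatic news summarization

Automatically generates key-point summaries of articles for news organizations

Produces concise, accurate news summaries and saves editors time

Content analysis

Automatic report summarization

Generates executive summaries of long research reports

Quickly extracts key information and speeds up reading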

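🚀 LoRA Fine-Tuned T5 Summarizer for XSum

This model is optimized for text summarization. It is t5-small fine-tuned with LoRA (Low-Rank Adaptation) on the XSum dataset, and it generates summaries efficiently.

🚀 Quick Start

This model is a fine-tuned version of t5-small on the xsum dataset.

✨ Key Features

This is a LoRA (Low-Rank Adaptation) fine-tuned version of T5-small optimized for text summarization. It was trained for abstractive summarization on the XSum dataset.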
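
For readers new to LoRA, the usual formulation of the adapted layer can be written as below. Here $W_0$ is a frozen pretrained weight matrix and only $A$ and $B$ are trained. The rank $r$ and scaling factor $\alpha$ used for this particular model are not stated here, so they are left symbolic.

```latex
h = W_0 x + \Delta W\, x = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad
W_0 \in \mathbb{R}^{d \times k},\quad
B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Because $BA$ has the same shape as $W_0$, the adapter can be applied on top of the base model at load time, which is what the usage example does with `PeftModel.from_pretrained`.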

💻 Usage Examples

Basic usage

from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
import torch

base_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
my_model = PeftModel.from_pretrained(base_model, "Lakshan2003/finetuned-t5-xsum")

def test_peft_summarizer(text, model, max_length=128, min_length=30):
    """
    Test the PEFT-loaded summarization model
    
    Args:
        text (str): Input text to summarize
        model: The loaded PEFT model
        max_length (int): Maximum length of the summary
        min_length (int): Minimum length of the summary
    """
    # Load the tokenizer published with the fine-tuned adapter (base model: t5-small)
    tokenizer = AutoTokenizer.from_pretrained("Lakshan2003/finetuned-t5-xsum")
    
    # Move model to GPU if available
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    
    # Prepare the input text
    prefix = "summarize: "
    input_text = prefix + text
    
    # Tokenize
    inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    # Generate summary
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_length=max_length,
            min_length=min_length,
            num_beams=4,
            length_penalty=2.0,
            early_stopping=True,
            no_repeat_ngram_size=3
        )
    
    # Decode the summary
    summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    
    return summary

# Test text
test_text = """
The United Nations has warned that climate change poses an unprecedented threat to human civilization. In a landmark report, scientists detailed how rising temperatures are affecting everything from weather patterns to food production. The report emphasizes that without immediate and substantial action to reduce greenhouse gas emissions, the world faces severe consequences including rising sea levels, more frequent extreme weather events, and widespread ecosystem collapse. Many countries have pledged to reduce their carbon emissions, but experts say current commitments fall short of what's needed to prevent the worst impacts of climate change. The report also highlights the disproportionate effect of climate change on developing nations, which often lack the resources to adapt to changing conditions.
"""

# Generate summary
summary = test_peft_summarizer(test_text, my_model)

print("Original Text:")
print(test_text)
print("\nGenerated Summary:")
print(summary)
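
The tokenizer call in the example truncates inputs at 512 tokens, so any text beyond that limit is dropped before generation. One possible workaround for longer documents, sketched here as an assumption rather than as part of the original model card, is to split the text into chunks and summarize each chunk separately. The helper name `chunk_words` and the 300-word default are hypothetical, and word count is only a rough proxy for T5 token count.

```python
from typing import List


def chunk_words(text: str, max_words: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


# Small demonstration with a tiny budget so the split is easy to see.
demo_chunks = chunk_words("one two three four five", max_words=2)
print(demo_chunks)
```

Each chunk could then be passed to `test_peft_summarizer` and the partial summaries joined. Splitting on a fixed word count can cut sentences in half, so summary quality near chunk boundaries may suffer.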