Desklib open-source AI text detection model v1.01: classifies English text as human-written or AI-generated

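AI Text Detector V1.01

Developed by desklib

An AI-generated text detection model developed by Desklib that distinguishes human-written from AI-generated English text. It is a leading performer on the RAID benchmark.

Text classification

Transformers

Language: English · License: MIT · #AI-text-detection #academic-integrity #adversarial-robustness

Downloads: 20.01k

Released: 2/16/2025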

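Model overview

The model is a fine-tuned version of microsoft/deberta-v3-large that focuses on detecting AI-generated text. It is suited to content moderation, academic integrity, and similar fields.

Model features

High-accuracy detection

A leading performer on the RAID AI detection benchmark; it accurately distinguishes human-written from AI-generated text.

Robustness

Handles a range of adversarial attacks across different domains while keeping detection performance stable.

Based on the DeBERTa architecture

Uses an improved BERT architecture that achieves better performance through disentangled attention and an enhanced mask decoder.

Model capabilities

AI-generated text detection

Content authenticity verification

Text classification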

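Use cases

Education

Academic integrity checks

Detect whether student assignments or papers contain AI-generated content, in support of academic integrity.

Helps educational institutions identify potential academic misconduct

Content moderation

Flagging AI-generated content

Flag AI-generated content on social media or news platforms to improve content transparency.

Strengthens user trust in content authenticity

Journalism

News authenticity verification

Verify whether news copy was written by a human, to help prevent the spread of AI-generated misinformation.

Helps maintain the credibility and professionalism of the news industry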

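🚀 desklib/ai-text-detector-v1.01

This is an AI-generated text detection model developed by Desklib. It classifies English text as either human-written or AI-generated, and it currently ranks among the top entries on the RAID AI detection benchmark. The model is Transformer-based: it is a fine-tuned version of microsoft/deberta-v3-large. It is accurate and robust, and it handles a range of adversarial attacks across different domains. It is useful wherever text authenticity matters, such as content moderation, academic integrity, and journalism.

Desklib provides AI-based personalized learning and study-assistance tools. This model is one of the many tools Desklib offers to students, educators, and universities.

Try the model online: Desklib AI Detector

GitHub repository: https://github.com/desklib/ai-text-detector

🚀 Quick start

The model classifies English text as human-written or AI-generated and is suited to content moderation, academic integrity, and similar scenarios. You can try it online through Desklib AI Detector, or run it locally by following the usage example below.

✨ Key features

High accuracy: strong results on the RAID benchmark, where it currently holds a leading position.
Robustness: handles a range of adversarial attacks across different domains.
Broad applicability: suited to content moderation, academic integrity, journalism, and other scenarios where text authenticity matters.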

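📦 Installation guide

The original documentation does not give specific installation steps. The following general steps can serve as a reference:

Make sure Python and the transformers library are installed.
Clone the model's GitHub repository: git clone https://github.com/desklib/ai-text-detector.git
Enter the repository directory and install the dependencies: pip install -r requirements.txt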
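Before running the usage example, it can help to confirm that the packages it imports are available. The sketch below is one minimal way to do that with the standard library. The helper name `missing_packages` and the package list are assumptions made for illustration; they are not part of the Desklib repository.

```python
from importlib.util import find_spec

# Packages imported by the usage example on this page (an assumed, minimal list).
REQUIRED_PACKAGES = ("torch", "transformers")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

An empty list means the imports in the usage example should resolve; it does not check package versions.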

💻 Usage example

Basic usage

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoConfig, AutoModel, PreTrainedModel

class DesklibAIDetectionModel(PreTrainedModel):
    config_class = AutoConfig

    def __init__(self, config):
        super().__init__(config)
        # Initialize the base transformer model.
        self.model = AutoModel.from_config(config)
        # Define a classifier head.
        self.classifier = nn.Linear(config.hidden_size, 1)
        # Initialize weights (handled by PreTrainedModel)
        self.init_weights()

    def forward(self, input_ids, attention_mask=None, labels=None):
        # Treat every token as real when no mask is supplied; the pooling below needs a mask.
        if attention_mask is None:
            attention_mask = torch.ones_like(input_ids)
        # Forward pass through the transformer
        outputs = self.model(input_ids, attention_mask=attention_mask)
        last_hidden_state = outputs[0]
        # Mean pooling: average token embeddings, weighted by the attention mask
        input_mask_expanded = attention_mask.unsqueeze(-1).expand(last_hidden_state.size()).float()
        sum_embeddings = torch.sum(last_hidden_state * input_mask_expanded, dim=1)
        sum_mask = torch.clamp(input_mask_expanded.sum(dim=1), min=1e-9)
        pooled_output = sum_embeddings / sum_mask

        # Classifier
        logits = self.classifier(pooled_output)
        loss = None
        if labels is not None:
            loss_fct = nn.BCEWithLogitsLoss()
            loss = loss_fct(logits.view(-1), labels.float())

        output = {"logits": logits}
        if loss is not None:
            output["loss"] = loss
        return output

def predict_single_text(text, model, tokenizer, device, max_len=768, threshold=0.5):
    encoded = tokenizer(
        text,
        padding='max_length',
        truncation=True,
        max_length=max_len,
        return_tensors='pt'
    )
    input_ids = encoded['input_ids'].to(device)
    attention_mask = encoded['attention_mask'].to(device)

    model.eval()
    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        logits = outputs["logits"]
        probability = torch.sigmoid(logits).item()

    label = 1 if probability >= threshold else 0
    return probability, label

def main():
    # --- Model and Tokenizer Directory ---
    model_directory = "desklib/ai-text-detector-v1.01"

    # --- Load tokenizer and model ---
    tokenizer = AutoTokenizer.from_pretrained(model_directory)
    model = DesklibAIDetectionModel.from_pretrained(model_directory)

    # --- Set up device ---
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    # --- Example Input text ---
    text_ai = "AI detection refers to the process of identifying whether a given piece of content, such as text, images, or audio, has been generated by artificial intelligence. This is achieved using various machine learning techniques, including perplexity analysis, entropy measurements, linguistic pattern recognition, and neural network classifiers trained on human and AI-generated data. Advanced AI detection tools assess writing style, coherence, and statistical properties to determine the likelihood of AI involvement. These tools are widely used in academia, journalism, and content moderation to ensure originality, prevent misinformation, and maintain ethical standards. As AI-generated content becomes increasingly sophisticated, AI detection methods continue to evolve, integrating deep learning models and ensemble techniques for improved accuracy."
    text_human = "It is estimated that a major part of the content in the internet will be generated by AI / LLMs by 2025. This leads to a lot of misinformation and credibility related issues. That is why if is important to have accurate tools to identify if a content is AI generated or human written"

    # --- Run prediction ---
    probability, predicted_label = predict_single_text(text_ai, model, tokenizer, device)
    print(f"Probability of being AI generated: {probability:.4f}")
    print(f"Predicted label: {'AI Generated' if predicted_label == 1 else 'Not AI Generated'}")

    probability, predicted_label = predict_single_text(text_human, model, tokenizer, device)
    print(f"Probability of being AI generated: {probability:.4f}")
    print(f"Predicted label: {'AI Generated' if predicted_label == 1 else 'Not AI Generated'}")

if __name__ == "__main__":
    main()
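The example above truncates its input at max_len=768 tokens, so anything past that point in a long document is never scored. One possible workaround, sketched below, is to split the text into word windows, score each window, and average the probabilities. This is an illustration only and is not described in the Desklib documentation: the helper names, the 300-word window, and the averaging rule are all assumptions, and `score_fn` stands for any callable that returns a probability for a string.

```python
from statistics import mean
from typing import Callable, List, Tuple


def split_into_chunks(text: str, words_per_chunk: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most `words_per_chunk` words."""
    if words_per_chunk <= 0:
        raise ValueError("words_per_chunk must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def score_long_text(
    text: str,
    score_fn: Callable[[str], float],
    words_per_chunk: int = 300,  # assumed window size, not a documented value
    threshold: float = 0.5,
) -> Tuple[float, int]:
    """Average the per-chunk probabilities and return (mean_probability, label)."""
    chunks = split_into_chunks(text, words_per_chunk)
    if not chunks:
        raise ValueError("text is empty")
    probability = mean(score_fn(chunk) for chunk in chunks)
    label = 1 if probability >= threshold else 0
    return probability, label


# With the objects from main() in scope, score_fn could wrap predict_single_text:
# score_fn = lambda chunk: predict_single_text(chunk, model, tokenizer, device)[0]
```

Averaging treats every chunk equally; taking the maximum instead would flag a document when any single chunk looks AI-generated. Which rule is appropriate depends on the use case.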

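📚 Detailed documentation

Performance

At the time of submission, the model achieved top results on the RAID benchmark: see the RAID leaderboard

Model architecture

The model is built on a fine-tuned microsoft/deberta-v3-large Transformer. Its core components are:

Transformer base model: the pretrained microsoft/deberta-v3-large model serves as the base. It uses DeBERTa (Decoding-enhanced BERT with disentangled attention), an improvement over BERT and RoBERTa that combines disentangled attention with an enhanced mask decoder for better performance.
Mean pooling layer: aggregates the Transformer's hidden states into a fixed-size representation of the input text. Token embeddings are averaged with the attention mask as weights, so padding tokens do not contribute, and the result captures the overall semantics.
Classifier head: a single linear layer takes the pooled representation and outputs one logit. The logit expresses the model's confidence that the input text is AI-generated; applying a sigmoid to it yields a probability.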
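The pooling and classification steps described above can be traced on a toy input. The sketch below is dependency-free and uses plain Python lists instead of tensors. The function names, the two-dimensional hidden size, and the weights and bias are made-up values for illustration, not parameters of the released model.

```python
import math


def masked_mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose mask entry is 1; masked tokens are ignored."""
    hidden_size = len(hidden_states[0])
    sums = [0.0] * hidden_size
    for vector, keep in zip(hidden_states, attention_mask):
        if keep:
            for j, value in enumerate(vector):
                sums[j] += value
    # Guard against an all-zero mask, mirroring the clamp in the usage example.
    count = max(sum(attention_mask), 1e-9)
    return [s / count for s in sums]


def sigmoid(logit):
    """Numerically stable logistic function."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def classify(pooled, weights, bias, threshold=0.5):
    """Linear head producing one logit, then sigmoid and threshold."""
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    probability = sigmoid(logit)
    return probability, int(probability >= threshold)


# Toy input: three tokens with hidden size 2; the last token is padding.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

pooled = masked_mean_pool(hidden_states, attention_mask)
probability, label = classify(pooled, weights=[0.5, -0.5], bias=0.5)
print(pooled, probability, label)
```

The padding vector has large values on purpose: because its mask entry is 0, it has no effect on the pooled representation.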

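🔧 Technical details

The model is a Transformer-based text classifier obtained by fine-tuning microsoft/deberta-v3-large to classify English text. Architecturally, it adds a mean pooling layer and a linear classifier head on top of the base model. The RAID benchmark dataset was used during training to support robustness across domains and under adversarial attack.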
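When labels are passed to the forward method in the usage example, it computes a loss with nn.BCEWithLogitsLoss, which is binary cross-entropy applied directly to the raw logit. The dependency-free sketch below shows that quantity in its numerically stable form. The function names are chosen for illustration, and the training procedure itself (data, optimizer, schedule) is not documented on this page, so nothing here should be read as the authors' training code.

```python
import math


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit z with target y in [0, 1].

    Stable form: max(z, 0) - z * y + log(1 + exp(-|z|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce_with_logits(logits, labels) -> float:
    """Average the per-example losses (mean reduction over the batch)."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not logits:
        raise ValueError("at least one example is required")
    return sum(bce_with_logits(z, y) for z, y in zip(logits, labels)) / len(logits)


# A confident correct prediction gives a small loss; a confident wrong one gives a large loss.
print(bce_with_logits(4.0, 1.0))
print(bce_with_logits(4.0, 0.0))
```

Working on the logit rather than on the sigmoid output avoids taking the logarithm of a probability that has rounded to 0 or 1.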