ProteusV0.5: an open-source AI image generation model with strong photorealism and multi-style handling

Proteusv0.5

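Developed by dataautogpt3

ProteusV0.5 is an advanced AI image generation model built on the core architecture of OpenDalleV1.1, with marked improvements in photorealism, prompt understanding, and handling of diverse styles.

Image generation | Open-source license: Apache-2.0 | #Custom CLIP optimization #Precise multi-style generation #Hyper-realistic detail

Downloads: 1,360

Released: 7/23/2024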

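Model overview

The model focuses on high-quality image generation, supports a wide range of artistic styles and complex scene compositions, and is specifically tuned for facial detail and skin texture.

Model features

Advanced custom CLIP integration

Uses a specially trained custom CLIP model that markedly improves prompt understanding and is estimated to account for about 90% of the model's performance gain.

Refined stylistic capability

Stronger generation across diverse artistic styles and better coherence in complex scene compositions.

Expanded training dataset

More than 400,000 images in total, significantly broadening the model's knowledge base and generative range.

Balance of creativity and accuracy

Corrects the earlier tendency toward over-stylization and improves how closely outputs match the prompt.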

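Model capabilities

Text-to-image generation

Conversion across multiple artistic styles

Highly detailed face generation

Complex scene composition

Surreal-style creation

Anime-style creation

Photorealistic-style creation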

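Use cases

Art creation

Character design

Generate fictional characters with specific traits.

The examples show character portraits rendered in different styles.

Scene building

Create complex scene images from text descriptions.

The examples show complex compositions such as a space scene and a bar scene.

Commercial applications

Concept art

Quickly generate visuals for product concepts or advertising ideas.

Stylized photography

Emulate a specific photographic style or film look.

The examples show image generation in a Kodak motion picture film style.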

🚀 ProteusV0.5

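ProteusV0.5 is the latest full release of the AI image generation model, a heavily optimized build on top of OpenDalleV1.1. This version brings marked improvements in photorealism, prompt understanding, and stylistic range across domains.

🚀 Quick start

The following code example shows how to use ProteusV0.5 with the 🧨 diffusers library: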

import torch
from diffusers import (
    StableDiffusionXLPipeline, 
    KDPM2AncestralDiscreteScheduler,
    AutoencoderKL
)

# Load VAE component
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", 
    torch_dtype=torch.float16
)

# Configure the pipeline
pipe = StableDiffusionXLPipeline.from_pretrained(
    "dataautogpt3/ProteusV0.5", 
    vae=vae,
    torch_dtype=torch.float16
)
pipe.scheduler = KDPM2AncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')

# Define prompts and generate image
prompt = "a cat wearing sunglasses on the beach"
negative_prompt = ""

image = pipe(
    prompt, 
    negative_prompt=negative_prompt, 
    width=1024,
    height=1024,
    guidance_scale=7,
    num_inference_steps=50,
    clip_skip=2
).images[0]

# Save the generated image to disk
image.save("generated_image.png")

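✨ Key features

About Proteus

Proteus builds on and enhances the core functionality of OpenDalleV1.1 to deliver better results. The main areas of improvement are greater responsiveness to prompts and stronger creative ability. The model was fine-tuned on a dataset of carefully selected copyright-free stock images and high-quality AI-generated image pairs.

Key improvements in V0.5

Advanced custom CLIP integration

Integrates a carefully trained custom CLIP model.
Developed over an extended period for stability.
Further fine-tuned for the specific qualities of Proteus and Prometheus.
Estimated to account for about 90% of the model's performance gain.
Requires clip skip to be set to 2 for best performance.

Further refinement of stylistic capability

Improved ability to generate diverse artistic styles.
Better coherence in complex scenes and compositions.

Expanded training dataset

Now more than 400,000 images in total.
Significantly broadens the knowledge base and generative capability.

Balancing creativity and accuracy

Addresses the earlier problem of outputs being "too stylized/creative".
Improves consistency between the user's prompt and the generated output.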

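Background on Proteus

Proteus is a heavily optimized version of OpenDalleV1.1 that builds on its core functionality to deliver better results. The main areas of improvement are greater responsiveness to prompts and stronger creative ability. To achieve this, it was fine-tuned on roughly 220,000 GPTV-captioned images drawn from copyright-free stock images (including some anime), which were then normalized. In addition, DPO (Direct Preference Optimization) was applied using 10,000 carefully selected pairs of high-quality AI-generated images. To maximize performance, numerous LoRA (Low-Rank Adaptation) models are trained independently and then selectively integrated into the main model through dynamic application methods. These techniques target specific parts of the model while avoiding interference with other areas during the learning phase. As a result, Proteus shows marked improvement in rendering intricate facial features and realistic skin textures, while retaining strong capability across a range of aesthetic domains, notably surrealism, anime, and cartoon-style visualization.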
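The DPO step described above learns from pairs of a preferred and a rejected image. The card does not say which DPO variant was used for this diffusion model, so the following is only a minimal sketch of the generic DPO loss for a single pair, assuming a log-likelihood is available for each sample under both the tuned model and a frozen reference. The function name, its arguments, and the `beta` value are hypothetical.

```python
import math


def dpo_pair_loss(policy_logp_win, policy_logp_lose,
                  ref_logp_win, ref_logp_lose, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    Illustrative only: inputs are log-likelihoods of the preferred ("win")
    and rejected ("lose") sample under the model being tuned and under a
    frozen reference model. `beta` is an assumed value, not from the card.
    """
    # How much more the tuned model favours each sample than the reference.
    win_margin = policy_logp_win - ref_logp_win
    lose_margin = policy_logp_lose - ref_logp_lose
    logits = beta * (win_margin - lose_margin)
    # -log(sigmoid(logits)) written as a numerically stable softplus(-logits).
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the tuned model equals the reference, the loss is log(2).
baseline = dpo_pair_loss(-1.0, -1.0, -1.0, -1.0)
# Favouring the preferred sample more than the reference lowers the loss.
improved = dpo_pair_loss(-0.5, -2.0, -1.0, -1.0)
```

The loss only falls when the tuned model raises the preferred sample relative to the rejected one by more than the reference does, which is what ties the update to the curated pairs.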

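Training details

Attribute	Details
Total training dataset	Now more than 400,000 images
Initial training data	Roughly 220,000 GPTV-captioned images from copyright-free stock images (including some anime)
Additional training data	Carefully selected photorealistic images
Fine-tuning method	Direct Preference Optimization (DPO) on 10,000 carefully selected pairs of high-quality AI-generated images
LoRA model handling	Trained independently and selectively integrated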
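The card does not describe how the independently trained LoRA models are selectively integrated. As a rough illustration of the general idea only, the sketch below applies the standard LoRA merge, W' = W + scale * (B @ A), to a chosen subset of layers and leaves every other layer untouched. The layer names, the `targets` argument, and the helper functions are hypothetical stand-ins, not the author's method.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(weights, lora, targets, scale=1.0):
    """Return a copy of `weights` with LoRA updates merged into `targets`.

    weights: {layer_name: W} with W of shape (out, in)
    lora:    {layer_name: (B, A)} with B of shape (out, r), A of shape (r, in)
    targets: layer names that should receive the update (a hypothetical
             stand-in for the "selective integration" step)
    """
    merged = {}
    for name, w in weights.items():
        if name in targets and name in lora:
            b, a = lora[name]
            delta = matmul(b, a)  # low-rank update, shape (out, in)
            merged[name] = [[wv + scale * dv for wv, dv in zip(wr, dr)]
                            for wr, dr in zip(w, delta)]
        else:
            merged[name] = [row[:] for row in w]  # copied unchanged
    return merged


# Toy 2x2 layers with rank-1 adapters; only "face_attn" is targeted.
weights = {
    "face_attn": [[1.0, 0.0], [0.0, 1.0]],
    "style_attn": [[2.0, 0.0], [0.0, 2.0]],
}
lora = {
    "face_attn": ([[1.0], [0.0]], [[0.5, 0.5]]),
    "style_attn": ([[1.0], [1.0]], [[1.0, 1.0]]),
}
merged = merge_lora(weights, lora, targets={"face_attn"})
```

Restricting the merge to named layers is one simple way to change a specific part of a model without disturbing the rest, which is the property the training notes emphasize.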

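Improvements

Better rendering of intricate facial features and realistic skin textures.
Stronger surrealism, anime, and cartoon-style visualization.
Better prompt understanding thanks to the custom-trained CLIP.
More diverse and accurate outputs from the expanded dataset.
A better balance between creativity and accuracy.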

Recommended settings

Setting	Value
Clip Skip	2
CFG Scale	7
Steps	25 - 50
Sampler	DPM++ 2M SDE
Scheduler	Karras
Resolution	1024x1024
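The table names DPM++ 2M SDE with a Karras schedule, whereas the quick-start code uses `KDPM2AncestralDiscreteScheduler`. As a hedged sketch of how the table values could be carried into a diffusers call: `DPMSolverMultistepScheduler` with `algorithm_type="sde-dpmsolver++"` and `use_karras_sigmas=True` is the combination the diffusers documentation maps to "DPM++ 2M SDE Karras". The helper names and the step-range check are illustrative assumptions, not part of the model card.

```python
# Values copied from the settings table.
RECOMMENDED = {
    "clip_skip": 2,
    "guidance_scale": 7,
    "width": 1024,
    "height": 1024,
}
MIN_STEPS, MAX_STEPS = 25, 50


def generation_kwargs(prompt, steps=MAX_STEPS, negative_prompt=""):
    """Build keyword arguments for a StableDiffusionXLPipeline call."""
    if not MIN_STEPS <= steps <= MAX_STEPS:
        raise ValueError(f"steps should be within {MIN_STEPS}-{MAX_STEPS}")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "num_inference_steps": steps,
        **RECOMMENDED,
    }


def use_dpmpp_2m_sde_karras(pipe):
    """Swap the pipeline's scheduler for DPM++ 2M SDE with Karras sigmas."""
    # Imported lazily so the helper above works without diffusers installed.
    from diffusers import DPMSolverMultistepScheduler

    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config,
        algorithm_type="sde-dpmsolver++",
        use_karras_sigmas=True,
    )
    return pipe


kwargs = generation_kwargs("a cat wearing sunglasses on the beach", steps=30)
```

With a pipeline loaded as in the quick start, this would be used as `use_dpmpp_2m_sde_karras(pipe)` followed by `pipe(**kwargs).images[0]`.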

📄 License

This project is licensed under the Apache 2.0 license.

🔍 Examples

Input prompt
black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed
(impressionistic realism by csybgh), a 50 something male, working in banking, very short dyed dark curly balding hair, Afro-Asiatic ancestry, talks a lot but listens poorly, stuck in the past, wearing a suit, he has a certain charm, bronze skintone, sitting in a bar at night, he is smoking and feeling cool, drunk on plum wine, masterpiece, 8k, hyper detailed, smokey ambiance, perfect hands AND fingers
high quality pixel art, a pixel art silhouette of an anime space-themed girl in a space-punk steampunk style, lying in her bed by the window of a spaceship, smoking, with a rustic feel. The image should embody epic portraiture and double exposure, featuring an isolated landscape visible through the window. The colors should primarily be dynamic and action-packed, with a strong use of negative space. The entire artwork should be in pixel art style, emphasizing the characters shape and set against a white background. Silhouette
The image features an older man, a long white beard and mustache, He has a stern expression, giving the impression of a wise and experienced individual. The mans beard and mustache are prominent, adding to his distinguished appearance. The close-up shot of the mans face emphasizes his facial features and the intensity of his gaze.
Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne d'Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd
cinematic film still of Kodak Motion Picture Film: (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy
1980s anime portrait of a character
(("Proteus"):text_logo:1)