paligemma-longprompt-v1-safetensors: an open-source vision model that combines tags and text to generate image prompts

Paligemma Longprompt V1 Safetensors

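Developed by mnemic

An experimental vision model that combines keyword tags with long-form descriptions to generate image prompts.

Image-to-Text

Transformers

License: GPL-3.0 #mixed-tag-captions #long-text-generation #image-understanding

Downloads: 38

Released: 6/15/2024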

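Model overview

This vision-language model generates image captions with a very long, complex structure. It outputs both comma-separated keywords and a long natural-language description, which makes it suitable for image content analysis and for writing generation prompts.

Model features

Mixed output format

Generates gallery-style tags (comma-separated keywords) together with a long natural-language description.

Complex structure handling

Specifically tuned to generate very long captions with a complex structure.

Dual-purpose output

Both the tags and the description can be used directly as image generation prompts.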
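Because tags and description arrive in one string, downstream tools may want them separated. The following is a minimal sketch of one way to do that, not something the model or its author provides. The function name `split_caption` and the `max_tag_words` threshold are hypothetical, and the heuristic assumes tags are short comma-separated segments that come before the prose.

```python
from typing import List, Tuple


def split_caption(caption: str, max_tag_words: int = 3) -> Tuple[List[str], str]:
    """Split a mixed caption into leading tags and the trailing description.

    Heuristic (an assumption, not part of the model): comma-separated
    segments with at most `max_tag_words` words count as tags; the first
    longer segment starts the free-text description.
    """
    segments = caption.split(",")
    tags: List[str] = []
    for index, segment in enumerate(segments):
        words = segment.split()
        if not words:
            continue  # skip empty segments such as those produced by ",,"
        if len(words) > max_tag_words:
            # Everything from this segment onward is treated as prose.
            description = ",".join(segments[index:]).strip()
            return tags, description
        tags.append(segment.strip())
    return tags, ""


if __name__ == "__main__":
    sample = (
        "waterfall, no humans, outdoors, "
        "A serene waterfall scene, hidden among trees. The water sparkles."
    )
    tags, description = split_caption(sample)
    print(tags)
    print(description)
```

The threshold is a judgment call: a description that happens to open with a very short clause would be misread as a tag, so results are worth spot-checking on real outputs.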

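Model capabilities

Image content analysis

Keyword extraction

Natural-language description generation

Image prompt writing

Use cases

Creative assistance

AI art prompt generation

Generates prompts containing both keywords and a detailed description for AI image generation tools.

Example output contains 20+ keywords and a coherent description of 100+ words.

Content tagging

Automatic image library tagging

Automatically generates searchable tags and description text for an image library.

Provides both searchable keywords and a readable description.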
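To illustrate the image library use case, here is a small sketch of a tag index built from already-extracted tags. It is an illustration under assumptions rather than part of the model: the names `build_tag_index` and `search` are hypothetical, and the input is assumed to be a mapping from image path to that image's tag list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_tag_index(tags_by_image: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each normalized tag to the set of image paths that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for image_path, tags in tags_by_image.items():
        for tag in tags:
            normalized = tag.strip().lower()
            if normalized:
                index[normalized].add(image_path)
    return dict(index)


def search(index: Dict[str, Set[str]], *tags: str) -> List[str]:
    """Return the sorted image paths that carry every requested tag."""
    if not tags:
        return []
    matches = [index.get(tag.strip().lower(), set()) for tag in tags]
    return sorted(set.intersection(*matches))


if __name__ == "__main__":
    index = build_tag_index({
        "a.jpg": ["Waterfall", "forest"],
        "b.jpg": ["forest", "lake "],
    })
    print(search(index, "forest"))
    print(search(index, "forest", "lake"))
```

Tags are lowercased and stripped before indexing so that small formatting differences in generated captions do not split one tag into several entries.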

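🚀 Long-description image captioning model

This is an experimental vision model, built around long and complex caption structures, that generates captions or prompts for input images. It combines tag-style keywords (comma-separated keyword tags) with longer descriptive text to produce high-quality prompts.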

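🚀 Quick start

Installation

Install the required dependencies and a CUDA-enabled build of PyTorch, then install Transformers from source:

pip install git+https://github.com/huggingface/transformers

Simple usage script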

from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image
import requests
import torch

model_id = "mnemic/paligemma-longprompt-v1-safetensors"

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)

model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to('cuda').eval()
processor = AutoProcessor.from_pretrained(model_id)

# Task prefix: ask the model for an English caption
prompt = "caption en"
model_inputs = processor(text=prompt, images=image, return_tensors="pt").to('cuda')
input_len = model_inputs["input_ids"].shape[-1]

with torch.inference_mode():
    generation = model.generate(**model_inputs, max_new_tokens=256, do_sample=False)
    generation = generation[0][input_len:]
    decoded = processor.decode(generation, skip_special_tokens=True)
    print(decoded)

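Batch processing script

This script batch-processes images and supports several quantization options (4-bit, 8-bit, or full precision). It also cleans up the generated text and retries poor generations to improve output quality.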

from transformers import AutoProcessor, PaliGemmaForConditionalGeneration, BitsAndBytesConfig
from PIL import Image
import torch
import os
import glob
from colorama import init, Fore, Style
from datetime import datetime
import time
import re
from huggingface_hub import snapshot_download

# Initialize colorama
init(autoreset=True)

# Settings
quantization_bits = 8  # Set to None for full precision, 4 for 4-bit quantization, or 8 for 8-bit quantization
generation_token_length = 256
min_tokens = 20  # Minimum number of whitespace-separated words required in the generated output
max_word_character_length = 30  # Maximum length of a word before it's considered too long
prune_end = True  # Remove any trailing chopped off end text until it reaches a . or ,
output_format = ".txt"  # Output format for the generated captions

# Clean up of poorly generated prompts
repetition_penalty = 1.15  # Control the repetition penalty (higher values discourage repetition)
retry_words = ["no_parallel"]  # If these words are encountered, the entire generation retries
max_retries = 10
remove_words = ["#", "/", "、", "@", "__", "|", "  ", ";", "~", "\"", "*", "^", ",,", "ON DISPLAY:"]  # Words or characters to be removed from the output results
strip_contents_inside = ["(", "[", "{"]  # Specify which characters to strip out along with their contents
remove_underscore_tags = True  # Option to remove words containing underscores

# Specify the model path
model_name = "mnemic/paligemma-longprompt-v1-safetensors"
models_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'models')
model_path = os.path.join(models_dir, model_name.split('/')[-1])

# Ensure the local directory is correctly specified relative to the script's location
script_dir = os.path.dirname(os.path.abspath(__file__))
local_model_path = model_path  # Use the specified model directory

# Directory paths
input_dir = os.path.join(script_dir, 'input')
output_in_input_dir = True  # Set this to False if you want to use a separate output directory
output_dir = input_dir if output_in_input_dir else os.path.join(script_dir, 'output')

# Create output directory if it doesn't exist
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Function to download the model from HuggingFace using snapshot_download
def download_model(model_name, model_path):
    if not os.path.exists(model_path):
        print(Fore.YELLOW + f"Downloading model {model_name} to {model_path}...")
        snapshot_download(repo_id=model_name, local_dir=model_path, local_dir_use_symlinks=False, local_files_only=False)
        print(Fore.GREEN + "Model downloaded successfully.")
    else:
        print(Fore.GREEN + f"Model directory already exists: {model_path}")

# Download the model if not already present
download_model(model_name, model_path)

# Check that the required files are in the local_model_path
required_files = ["config.json", "tokenizer_config.json"]
missing_files = [f for f in required_files if not os.path.exists(os.path.join(local_model_path, f))]
safetensor_files = [f for f in os.listdir(local_model_path) if f.endswith(".safetensors")]
if missing_files:
    raise FileNotFoundError(f"Missing required files in {local_model_path}: {', '.join(missing_files)}")
if not safetensor_files:
    raise FileNotFoundError(f"No safetensors files found in {local_model_path}")

# Load model and processor from local directory
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(Fore.YELLOW + "Loading model and processor...")
try:
    if quantization_bits == 4:
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
        model = PaliGemmaForConditionalGeneration.from_pretrained(
            local_model_path,
            quantization_config=bnb_config,
            device_map={"": 0},
        ).eval()
    elif quantization_bits == 8:
        bnb_config = BitsAndBytesConfig(
            load_in_8bit=True,
        )
        model = PaliGemmaForConditionalGeneration.from_pretrained(
            local_model_path,
            quantization_config=bnb_config,
            device_map={"": 0},
        ).eval()
    elif quantization_bits is None:
        model = PaliGemmaForConditionalGeneration.from_pretrained(
            local_model_path
        ).eval()
        model.to(device)  # Ensure the model is on the correct device
    else:
        raise ValueError("Unsupported quantization_bits value. Use None for full precision, 4 for 4-bit quantization, or 8 for 8-bit quantization.")

    processor = AutoProcessor.from_pretrained(local_model_path, local_files_only=True)
    print(Fore.GREEN + "Model and processor loaded successfully.")
except OSError as e:
    print(Fore.RED + f"Error loading model or processor: {e}")
    raise

# Process each image in the input directory recursively
image_extensions = ['jpg', 'jpeg', 'png', 'webp']
image_paths = []
for ext in image_extensions:
    image_paths.extend(glob.glob(os.path.join(input_dir, '**', f'*.{ext}'), recursive=True))

print(Fore.YELLOW + f"Found {len(image_paths)} image(s) to process.\n")

def prune_text(text):
    if not prune_end:
        return text
    # Find the last period or comma
    last_period_index = text.rfind('.')
    last_comma_index = text.rfind(',')
    prune_index = max(last_period_index, last_comma_index)
    if prune_index != -1:
        # Return text up to the last period or comma
        return text[:prune_index].strip()
    return text

def contains_retry_word(text, retry_words):
    return any(word in text for word in retry_words)

def remove_unwanted_words(text, remove_words):
    for word in remove_words:
        text = text.replace(word, ' ')
    return text

def strip_contents(text, chars):
    for char in chars:
        if char == "(":
            text = re.sub(r'\([^)]*\)', ' ', text)
        elif char == "[":
            text = re.sub(r'\[[^\]]*\]', ' ', text)
        elif char == "{":
            text = re.sub(r'\{[^}]*\}', ' ', text)
    text = re.sub(r'\s{2,}', ' ', text)  # Remove extra spaces
    text = re.sub(r'\s([,.!?;])', r'\1', text)  # Remove space before punctuation
    text = re.sub(r'([,.!?;])\s', r'\1 ', text)  # Normalize the whitespace after punctuation to a single space
    return text.strip()

def remove_long_words(text, max_word_length):
    words = text.split()
    for i, word in enumerate(words):
        if len(word) > max_word_length:
            # Strip back to the previous comma or period
            last_period_index = text.rfind('.', 0, text.find(word))
            last_comma_index = text.rfind(',', 0, text.find(word))
            prune_index = max(last_period_index, last_comma_index)
            if prune_index != -1:
                return text[:prune_index].strip()
            else:
                return text[:text.find(word)].strip()
    return text

def clean_text(text):
    text = remove_unwanted_words(text, remove_words)
    text = strip_contents(text, strip_contents_inside)
    text = remove_long_words(text, max_word_character_length)
    # Remove unwanted characters
    text = re.sub(r'[^\x00-\x7F]+', '', text)
    # Normalize spaces
    text = re.sub(r'\s+', ' ', text).strip()
    if remove_underscore_tags:
        text = ' '.join([word for word in text.split() if '_' not in word])
    return text

for image_path in image_paths:
    output_file_path = os.path.splitext(image_path)[0] + output_format if output_in_input_dir else os.path.join(output_dir, os.path.splitext(os.path.relpath(image_path, input_dir))[0] + output_format)
    
    if os.path.exists(output_file_path):
        # print(Fore.CYAN + f"Skipping {image_path}, output already exists.")
        continue

    try:
        start_time = datetime.now()
        print(Fore.CYAN + f"[{start_time.strftime('%Y-%m-%d %H:%M:%S')}] Starting processing for {image_path}")
        
        image = Image.open(image_path).convert('RGB')
        prompt = "caption en"
        model_inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)  # Ensure inputs are on the correct device
        input_len = model_inputs["input_ids"].shape[-1]

        # Generate the caption with additional parameters to reduce repetitiveness
        retries = 0
        success = False
        while retries < max_retries:
            with torch.inference_mode():
                generation_start_time = time.time()
                generation = model.generate(
                    **model_inputs,
                    max_new_tokens=generation_token_length,
                    do_sample=True,  # Enable sampling
                    temperature=0.7,  # Control randomness of predictions
                    top_k=50,  # Consider top 50 candidates
                    top_p=0.9,  # Consider tokens that comprise the top 90% probability mass
                    no_repeat_ngram_size=2,  # Avoid repeating 2-grams
                    repetition_penalty=repetition_penalty  # Apply a penalty to repeated tokens
                )
                generation_end_time = time.time()
                generation = generation[0][input_len:]
                decoded = processor.decode(generation, skip_special_tokens=True)
                pruned_text = prune_text(decoded)
                
                if not contains_retry_word(pruned_text, retry_words) and len(pruned_text.split()) >= min_tokens:
                    success = True
                    break
                retries += 1
                print(Fore.YELLOW + f"Retrying generation for {image_path} due to retry word or insufficient tokens, attempt {retries}")
            
            if retries == max_retries:
                print(Fore.RED + f"Max retries reached for {image_path}. Saving the result with retry word or insufficient tokens.")

        # Clean the text
        cleaned_text = clean_text(pruned_text)

        # Save the output to a text file, replicating the directory structure
        os.makedirs(os.path.dirname(output_file_path), exist_ok=True)
        with open(output_file_path, 'w', encoding='utf-8') as f:  # Specify UTF-8 encoding
            f.write(cleaned_text)
        
        end_time = datetime.now()
        duration = generation_end_time - generation_start_time
        
        print(Fore.GREEN + f"[{end_time.strftime('%Y-%m-%d %H:%M:%S')}] Processed {image_path}, saved to {output_file_path}")
        print(Fore.LIGHTBLACK_EX + f"Output: {cleaned_text}")
        print(Fore.LIGHTBLACK_EX + f"Time taken for generation: {duration:.2f} seconds\n")
        
        # Clear memory
        del model_inputs
        torch.cuda.empty_cache()
    except Exception as e:
        print(Fore.RED + f"Error processing {image_path}: {e}\n")

You can also use this script with other PaliGemma models. Recommended: https://huggingface.co/gokaygokay/paligemma-rich-captions
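The retry logic in the batch script can be easier to reason about in isolation. The sketch below restates it with a stand-in for the model call, so it runs without a GPU or model download. The function name `generate_with_retries` and the `generate` callable are hypothetical; only the acceptance rule (no retry word, enough words, bounded attempts) is taken from the script.

```python
from typing import Callable, Sequence, Tuple


def generate_with_retries(
    generate: Callable[[], str],
    retry_words: Sequence[str] = ("no_parallel",),
    min_words: int = 20,
    max_retries: int = 10,
) -> Tuple[str, bool]:
    """Call generate() until the text passes the checks or attempts run out.

    Returns (text, accepted). If every attempt fails, the last attempt is
    returned with accepted=False, mirroring the script's behavior of saving
    the result anyway.
    """
    text = ""
    for _ in range(max_retries):
        text = generate()
        has_retry_word = any(word in text for word in retry_words)
        long_enough = len(text.split()) >= min_words
        if not has_retry_word and long_enough:
            return text, True
    return text, False


if __name__ == "__main__":
    # Hypothetical stand-in for model.generate + decode:
    # two rejected outputs followed by an acceptable one.
    attempts = iter([
        "no_parallel, blurry, a b c d e",
        "too short",
        "waterfall, outdoors, a calm forest scene",
    ])
    text, accepted = generate_with_retries(lambda: next(attempts), min_words=5)
    print(accepted, text)
```

Returning the acceptance flag alongside the text lets the caller decide whether to save, log, or skip a caption that never passed the checks.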

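✨ Key features

This model is an experiment in longer, more complex captions. The goal is to combine keyword tags and descriptions so that both can be used together when prompting, producing high-quality prompts. The current version does not yet fully achieve this goal and needs further training and refinement.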

📚 Detailed documentation

Example output

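waterfall, no humans, outdoors, scenery, trees, lake, rocks, river, water, nature, plants, sky, grass, daytime, island, blue sky, solo, mountains, forest, A tranquil natural waterfall landscape with a small pond beneath the falls, hidden among the trees, created with digital art techniques. The lush green foliage of the trees, the blooming pink flowers, and the sparkling lake water create an indescribable sense of harmony and calm. The waterfall towers high, and the lush surroundings accentuate its majestic beauty. It stands above the tranquil pond like a great gift from nature. The whole scene is peaceful and serene, radiating a tranquil atmosphere. The image shows a beautiful tropical landscape with an impressive waterfall surrounded by rocks and trees. A few leaves float on the water, with some flowers scattered among them, adding color and texture to the surroundings. With their vivid colors and delicate petals, flowers add beauty to any scene. They are carefully arranged to draw the eye and highlight the natural beauty of this carefully designed masterpiece.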