CodeLlama-7b-Instruct-Solidity open-source model: generate Solidity smart contract code for free

Codellama 7b Instruct Solidity

Developed by AlfredPros

A fine-tuned version of the 7-billion-parameter Code LLaMA - Instruct model, specialized for generating Solidity smart contract code

Large language model

Transformers

Supports multiple languages #smart-contract-generation #4-bit-QLoRA-fine-tuning #Solidity-code-optimization

Downloads: 459

Released: 9/7/2023

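Model overview

A Solidity smart contract code generation model, fine-tuned from Code LLaMA - Instruct with 4-bit QLoRA and aimed at blockchain smart contract development tasks.

Model features

Solidity specialization

Fine-tuned specifically for Solidity smart contract development, with the goal of generating usable contract code

4-bit QLoRA fine-tuning

Fine-tuned with 4-bit quantized low-rank adaptation (QLoRA), which lowers the hardware resources needed for training

Instruction following

Takes natural-language instructions and generates a corresponding Solidity implementation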

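Model capabilities

Solidity code generation

Smart contract development

Blockchain application development

Code completion

Use cases

Blockchain development

Smart contract development

Generate complete Solidity smart contract code from a natural-language description

Quickly produce common contracts such as ERC-20 tokens and NFT contracts

Contract security auditing

Common security patterns can be taken into account when generating contract code

May help reduce common vulnerabilities such as reentrancy attacks and integer overflow

Developer tools

Code completion

Suggest completions for Solidity code as it is being written

Improve development speed and reduce syntax errors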

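🚀 Code LLaMA 7B Instruct Solidity model

This is a fine-tuned 7-billion-parameter Code LLaMA - Instruct model for generating Solidity smart contracts, trained with 4-bit QLoRA through the PEFT library. It targets one task in blockchain development: turning a natural-language instruction into smart contract code.

✨ Key features

Code generation: generates Solidity smart contract code.
Fine-tuning: trained with 4-bit QLoRA, which reduces the memory needed for fine-tuning.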

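📦 Installation

The original model card gives no installation steps. The usage example imports transformers, torch, and accelerate, and 4-bit loading through BitsAndBytesConfig additionally requires the bitsandbytes package.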

💻 Usage example

Basic usage

from transformers import BitsAndBytesConfig, AutoTokenizer, AutoModelForCausalLM
import torch
import accelerate

use_4bit = True
bnb_4bit_compute_dtype = "float16"
bnb_4bit_quant_type = "nf4"
use_double_nested_quant = True
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

# BitsAndBytesConfig 4-bit config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_use_double_quant=use_double_nested_quant,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
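    # FP32 CPU offload flag under its current BitsAndBytesConfig name
    # (the card used the older spelling load_in_8bit_fp32_cpu_offload)
    llm_int8_enable_fp32_cpu_offload=True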
)

# Load model in 4-bit
tokenizer = AutoTokenizer.from_pretrained("AlfredPros/CodeLlama-7b-Instruct-Solidity")
model = AutoModelForCausalLM.from_pretrained("AlfredPros/CodeLlama-7b-Instruct-Solidity", quantization_config=bnb_config, device_map="balanced_low_0")

# Describe the task in natural language (named `task` to avoid shadowing the built-in input())
task = 'Make a smart contract to create a whitelist of approved wallets. The purpose of this contract is to allow the DAO (Decentralized Autonomous Organization) to approve or revoke certain wallets, and also set a checker address for additional validation if needed. The current owner address can be changed by the current owner.'

# Make prompt template
prompt = f"""### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the following Task:

### Task:
{task}

### Solution:
"""

# Tokenize the input
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
# Run the model to infer an output
outputs = model.generate(input_ids=input_ids, max_new_tokens=1024, do_sample=True, top_p=0.9, temperature=0.001, pad_token_id=1)

# Detokenize and display the generated output
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(prompt):])
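The prompt template and the `[len(prompt):]` slice used above can be factored into two small helpers, which makes them reusable and testable without loading the model. This is a minimal sketch: `build_prompt` and `strip_prompt` are hypothetical names introduced here, not part of the model card or of transformers, and the fallback on the `### Solution:` marker is an assumption about how to handle decoded text that does not reproduce the prompt exactly.

```python
# Template text copied from the usage example; {task} is filled in per request.
PROMPT_TEMPLATE = """### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the following Task:

### Task:
{task}

### Solution:
"""


def build_prompt(task: str) -> str:
    """Fill the instruction template with a natural-language task description."""
    return PROMPT_TEMPLATE.format(task=task.strip())


def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the generated text, dropping the echoed prompt if present."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):]
    # Fallback (assumption): split on the first section marker when the
    # decoded text does not match the prompt character for character.
    marker = "### Solution:\n"
    return decoded.split(marker, 1)[-1] if marker in decoded else decoded


if __name__ == "__main__":
    prompt = build_prompt("Make a smart contract that stores a single uint256 value.")
    # Stand-in for a decoded model output: the prompt followed by generated code.
    fake_output = prompt + "pragma solidity ^0.8.0;\n"
    print(strip_prompt(fake_output, prompt))
```

With these helpers, the decode step could pass the `batch_decode` result through `strip_prompt` instead of slicing by length directly.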

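📚 Detailed documentation

Training dataset

The model was fine-tuned on AlfredPros' Smart Contracts Instructions dataset (https://huggingface.co/datasets/AlfredPros/smart-contracts-instructions). The dataset contains 6,003 pairs of GPT-generated human instructions and Solidity source code, already processed for training large language models.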

Training parameters

Bitsandbytes quantization config

Load in 4-bit: true
4-bit quantization type: NF4
4-bit compute dtype: float16
4-bit use double quantization: true
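To give a feel for what these settings buy, the following sketch estimates weight storage for a nominal 7-billion-parameter model. It is a rough estimate under stated assumptions, not a measurement of this model: the parameter count is the nominal "7B" figure, the per-parameter overhead of the quantization constants (about 0.5 bits without double quantization and about 0.127 bits with it) is taken from the QLoRA paper, and the estimate ignores layers that stay unquantized, activations, and the KV cache.

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float, overhead_bits: float) -> float:
    """Approximate weight storage in GiB for a given per-weight bit budget."""
    total_bits = n_params * (bits_per_weight + overhead_bits)
    return total_bits / 8 / 2**30


N_PARAMS = 7e9  # assumption: nominal "7B" parameter count

fp16 = quantized_weight_gib(N_PARAMS, 16, 0.0)           # unquantized float16 baseline
nf4_plain = quantized_weight_gib(N_PARAMS, 4, 0.5)       # NF4, no double quantization
nf4_double = quantized_weight_gib(N_PARAMS, 4, 0.127)    # NF4 with double quantization

print(f"float16:              {fp16:.2f} GiB")
print(f"NF4:                  {nf4_plain:.2f} GiB")
print(f"NF4 + double quant.:  {nf4_double:.2f} GiB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the float16 footprint, which is consistent with the training run fitting on a single GTX 1080Ti.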

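Supervised fine-tuning trainer parameters

Number of epochs: 1
FP16: true
FP16 opt level: O1
BF16: false
Per-device train batch size: 1
Gradient accumulation steps: 1
Gradient checkpointing: true
Max gradient norm: 0.3
Learning rate: 2e-4
Weight decay: 0.001
Optimizer: paged AdamW 32-bit
Learning rate scheduler type: cosine
Warmup ratio: 0.03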
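The learning rate, scheduler type, and warmup ratio listed above together define a schedule over the run. The sketch below reproduces the common shape (linear warmup followed by cosine decay to zero) in plain Python. It rests on assumptions: the total of 6003 optimizer steps is inferred from one epoch over 6,003 examples with batch size 1 and no gradient accumulation, rounding the warmup length up is a choice made here, and `lr_at` is a hypothetical helper rather than the trainer's own code.

```python
import math


def lr_at(step: int, total_steps: int = 6003, base_lr: float = 2e-4,
          warmup_ratio: float = 0.03) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for s in (0, 100, 181, 3000, 6003):
    print(f"step {s:>4}: lr = {lr_at(s):.6e}")
```

With a warmup ratio of 0.03 the ramp covers only the first 181 of 6003 steps, so nearly the whole run is spent on the cosine decay.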

Training details

GPU used: 1x NVIDIA GeForce GTX 1080Ti
Training time: 21 hours 4 minutes 57 seconds
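The reported training time can be turned into an average per-step cost with a little arithmetic. This sketch assumes the run took 6003 optimizer steps (one epoch over 6,003 examples at batch size 1 with no gradient accumulation); `seconds_per_step` is a hypothetical helper.

```python
def seconds_per_step(hours: int, minutes: int, seconds: int, steps: int) -> float:
    """Average wall-clock seconds per optimizer step."""
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return total_seconds / steps


# 21 h 4 min 57 s over an assumed 6003 steps
print(f"{seconds_per_step(21, 4, 57, 6003):.2f} s/step")
```

That works out to roughly 12.6 seconds per step on the single GTX 1080Ti.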

Training loss

Step	Training Loss
 100	0.330900
 200	0.293000
 300	0.276500
 400	0.290900
 500	0.306100
 600	0.302600
 700	0.337200
 800	0.295000
 900	0.297800
1000	0.299500
1100	0.268900
1200	0.257800
1300	0.264100
1400	0.294400
1500	0.293900
1600	0.287600
1700	0.281200
1800	0.273400
1900	0.266600
2000	0.227500
2100	0.261600
2200	0.275700
2300	0.290100
2400	0.290900
2500	0.316200
2600	0.296500
2700	0.291400
2800	0.253300
2900	0.321500
3000	0.269500
3100	0.295600
3200	0.265800
3300	0.262800
3400	0.274900
3500	0.259800
3600	0.226300
3700	0.325700
3800	0.249000
3900	0.237200
4000	0.251400
4100	0.247000
4200	0.278700
4300	0.264000
4400	0.245000
4500	0.235900
4600	0.240400
4700	0.235200
4800	0.220300
4900	0.202700
5000	0.240500
5100	0.258500
5200	0.236300
5300	0.267500
5400	0.236700
5500	0.265900
5600	0.244900
5700	0.297900
5800	0.281200
5900	0.313800
6000	0.249800
6003	0.271939
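The logged losses fluctuate from one checkpoint to the next, so a moving average makes the overall trend easier to read. The sketch below parses rows in the same tab-separated layout and smooths them. It is illustrative only: `parse_loss_table` and `moving_average` are hypothetical helpers, the window size is arbitrary, and the sample contains just the first five rows of the table.

```python
from statistics import mean


def parse_loss_table(text: str) -> list[tuple[int, float]]:
    """Parse 'step<TAB>loss' rows, skipping the header and malformed lines."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # header row or anything that is not 'int float'
        rows.append((int(parts[0]), float(parts[1])))
    return rows


def moving_average(values: list[float], window: int) -> list[float]:
    """Trailing moving average; the result has len(values) - window + 1 entries."""
    return [mean(values[i - window + 1 : i + 1]) for i in range(window - 1, len(values))]


# First five rows of the table above, in the same layout.
SAMPLE = "Step\tTraining Loss\n 100\t0.330900\n 200\t0.293000\n 300\t0.276500\n 400\t0.290900\n 500\t0.306100\n"

rows = parse_loss_table(SAMPLE)
losses = [loss for _, loss in rows]
print(moving_average(losses, window=3))
```

Feeding the full table through the same functions with a larger window would show whether the loss keeps decreasing over the epoch or plateaus.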