pipeline_tag: text-generation
base_model: ibm-granite/granite-20b-code-base-8k
inference: true
license: apache-2.0
datasets:
- bigcode/commitpackft
- TIGER-Lab/MathInstruct
- meta-math/MetaMathQA
- glaiveai/glaive-code-assistant-v3
- glaiveai/glaive-function-calling-v2
- bugdaryan/sql-create-context-instruction
- garage-bAInd/Open-Platypus
- nvidia/HelpSteer
metrics:
- code_eval
library_name: transformers
tags:
- code
- granite
model-index:
- name: granite-20b-code-instruct-8k
  results:
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis(Python)
    metrics:
    - name: pass@1
      type: pass@1
      value: 60.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis(JavaScript)
    metrics:
    - name: pass@1
      type: pass@1
      value: 53.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis(Java)
    metrics:
    - name: pass@1
      type: pass@1
      value: 58.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis(Go)
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.1
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis(C++)
    metrics:
    - name: pass@1
      type: pass@1
      value: 45.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis(Rust)
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain(Python)
    metrics:
    - name: pass@1
      type: pass@1
      value: 44.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain(JavaScript)
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain(Java)
    metrics:
    - name: pass@1
      type: pass@1
      value: 49.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain(Go)
    metrics:
    - name: pass@1
      type: pass@1
      value: 32.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain(C++)
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.1
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain(Rust)
    metrics:
    - name: pass@1
      type: pass@1
      value: 18.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix(Python)
    metrics:
    - name: pass@1
      type: pass@1
      value: 43.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix(JavaScript)
    metrics:
    - name: pass@1
      type: pass@1
      value: 43.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix(Java)
    metrics:
    - name: pass@1
      type: pass@1
      value: 45.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix(Go)
    metrics:
    - name: pass@1
      type: pass@1
      value: 41.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix(C++)
    metrics:
    - name: pass@1
      type: pass@1
      value: 41.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix(Rust)
    metrics:
    - name: pass@1
      type: pass@1
      value: 29.9
      verified: false

# Granite-20B-Code-Instruct-8K

## Model Summary
**Granite-20B-Code-Instruct-8K** is a 20B-parameter model fine-tuned from *Granite-20B-Code-Base-8K* on a combination of permissively licensed instruction data to enhance its instruction-following capabilities, including logical reasoning and problem-solving skills.

## Usage
### Intended Use
The model is designed to respond to coding-related instructions and can be used to build coding assistants.

### Generation Example
Below is a simple example of how to use the Granite-20B-Code-Instruct-8K model.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # or "cpu"
model_path = "ibm-granite/granite-20b-code-instruct-8k"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change the input text as desired
chat = [
    {"role": "user", "content": "Write a code to find the maximum value in a list of numbers."},
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")
# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# loop over the batch to print; in this example the batch size is 1
for i in output:
    print(i)
```
## Training Data
Granite Code Instruct models are trained on the following types of data.

## Infrastructure
We trained the Granite Code models using two of IBM's supercomputing clusters, Vela and Blue Vela, equipped with NVIDIA A100 and H100 GPUs respectively. These clusters provide a scalable and efficient infrastructure for training our models over thousands of GPUs.

## Ethical Considerations and Limitations
Granite Code Instruct models are primarily fine-tuned with instruction-response pairs in a specific set of programming languages, so their performance may be limited on out-of-scope programming languages. In such cases, providing few-shot examples helps steer the model's output. Moreover, developers should perform safety testing and target-specific tuning before deploying these models in critical applications. The model also inherits the ethical considerations and limitations of its base model; for more information, please refer to the Granite-20B-Code-Base-8K model card.
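The few-shot steering mentioned above can be sketched as follows. This is a minimal illustration, not part of the model card's official code: `build_few_shot_chat` is a hypothetical helper, and the COBOL strings are made-up examples for a less common target language. The resulting `chat` list can be passed to `tokenizer.apply_chat_template` exactly as in the generation example above.

```python
def build_few_shot_chat(examples, question):
    """Interleave (instruction, answer) pairs before the real question."""
    chat = []
    for instruction, answer in examples:
        chat.append({"role": "user", "content": instruction})
        chat.append({"role": "assistant", "content": answer})
    chat.append({"role": "user", "content": question})
    return chat

# One worked example in the target language, then the actual request.
examples = [
    ("Write a COBOL paragraph that adds two numbers.",
     "ADD-NUMBERS.\n    ADD NUM-A TO NUM-B GIVING NUM-SUM."),
]
chat = build_few_shot_chat(
    examples,
    "Write a COBOL paragraph that computes the average of three numbers.",
)
# chat now holds three turns: the example instruction, the example answer,
# and the new question, ready for tokenizer.apply_chat_template(...)
```

Adding one or two such in-context demonstrations is often enough to anchor the model's output format and language without any fine-tuning.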