---
language:
- en
license: llama3
tags:
- moe
model-index:
- name: L3-SnowStorm-v1.15-4x8B-B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 60.67
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xxx777xxxASD/L3-SnowStorm-v1.15-4x8B-B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 81.6
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xxx777xxxASD/L3-SnowStorm-v1.15-4x8B-B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 68.12
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xxx777xxxASD/L3-SnowStorm-v1.15-4x8B-B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 51.69
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xxx777xxxASD/L3-SnowStorm-v1.15-4x8B-B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 76.56
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xxx777xxxASD/L3-SnowStorm-v1.15-4x8B-B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 69.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xxx777xxxASD/L3-SnowStorm-v1.15-4x8B-B
      name: Open LLM Leaderboard
---
> [!NOTE]
> GGUF quantized versions
An experimental roleplay-oriented Mixture-of-Experts model. The goal is a model that performs as well as or better than Mixtral 8x7B and its finetunes in RP/ERP tasks.
Available versions:

Llama 3 SnowStorm v1.15B 4x8B

Base model: Sao10K_L3-8B-Stheno-v3.1
Gate mode: random
Datatype: bfloat16
Experts per token: 2
Experts:
- source model: Nitral-AI_Poppy_Porpoise-1.0-L3-8B
- source model: NeverSleep_Llama-3-Lumimaid-8B-v0.1-OAS
- source model: openlynn_Llama-3-Soliloquy-8B-v2
- source model: Sao10K_L3-8B-Stheno-v3.1
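In mergekit-moe terms, the recipe above corresponds to a config roughly like the sketch below (this is a reconstruction assuming mergekit's MoE config schema; with a random gate mode, no positive_prompts are needed):

```yaml
# Hypothetical reconstruction of the merge recipe from the fields listed above.
base_model: Sao10K_L3-8B-Stheno-v3.1
gate_mode: random          # router weights initialized randomly, no gate prompts required
dtype: bfloat16
experts_per_token: 2       # two experts active per token
experts:
  - source_model: Nitral-AI_Poppy_Porpoise-1.0-L3-8B
  - source_model: NeverSleep_Llama-3-Lumimaid-8B-v0.1-OAS
  - source_model: openlynn_Llama-3-Soliloquy-8B-v2
  - source_model: Sao10K_L3-8B-Stheno-v3.1
```

If mergekit was indeed the merge tool, a file like this would be passed to `mergekit-moe` to produce the merged checkpoint.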
Models used

Difference (from SnowStorm v1.0)

Vision
llama3_mmproj vision adapter

Prompt format: Llama 3
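For reference, the stock Llama 3 Instruct template looks like this (the system turn is optional; this is the upstream Llama 3 format, not anything specific to this merge):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```

When loading with transformers, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` should produce this layout automatically.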
Detailed results can be found here.
| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 68.01 |
| AI2 Reasoning Challenge (25-Shot) | 60.67 |
| HellaSwag (10-Shot)               | 81.60 |
| MMLU (5-Shot)                     | 68.12 |
| TruthfulQA (0-shot)               | 51.69 |
| Winogrande (5-shot)               | 76.56 |
| GSM8k (5-shot)                    | 69.45 |