license: cc-by-4.0
base_model: hiieu/halong_embedding
language: ["vi"]
library_name: sentence-transformers
pipeline_tag: sentence-similarity
inference: false
The hiieu/halong_embedding model in GGUF format
Original: https://huggingface.co/hiieu/halong_embedding
Quantization steps:
REL=b3827
wget https://github.com/ggerganov/llama.cpp/releases/download/$REL/llama-$REL-bin-ubuntu-x64.zip --content-disposition --continue &> /dev/null
wget https://github.com/ggerganov/llama.cpp/archive/refs/tags/$REL.zip --content-disposition --continue &> /dev/null
unzip -q llama-$REL-bin-ubuntu-x64.zip
unzip -q llama.cpp-$REL.zip
mv llama.cpp-$REL/* .
rm -r llama.cpp-$REL/ llama-$REL-bin-ubuntu-x64.zip llama.cpp-$REL.zip
pip install -q -r requirements.txt
rm -rf models/tmp/
git clone --depth=1 --single-branch https://huggingface.co/hiieu/halong_embedding models/tmp
huggingface-cli download intfloat/multilingual-e5-base sentencepiece.bpe.model --local-dir models/tmp
python convert_hf_to_gguf.py models/tmp/ --outfile model-f32.gguf --outtype f32
build/bin/llama-quantize model-f32.gguf model-f16.gguf f16 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-bf16.gguf bf16 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q8_0.gguf q8_0 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q6_k.gguf q6_k 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q5_k_m.gguf q5_k_m 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q5_k_s.gguf q5_k_s 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q4_k_m.gguf q4_k_m 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q4_k_s.gguf q4_k_s 2> /dev/null
rm -rf models/yolo/
mkdir -p models/yolo
mv model-*.gguf models/yolo/
touch models/yolo/README.md
huggingface-cli upload halong-embedding-gguf models/yolo .
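
The eight `llama-quantize` calls above differ only in the quantization type, so they could also be written as a loop. The following is a minimal sketch, not the script that was actually used: `QUANTIZE` and `quantize_all` are hypothetical names introduced here, and the function stops at the first failing type.

```shell
# Hypothetical helper: QUANTIZE and quantize_all are illustrative names,
# not part of llama.cpp. The binary path matches the steps above.
QUANTIZE="${QUANTIZE:-build/bin/llama-quantize}"

quantize_all() {
  # $1 is the f32 source file; the remaining arguments are quantization types.
  src="$1"
  shift
  for t in "$@"; do
    # Writes model-<type>.gguf next to the source, as in the steps above.
    "$QUANTIZE" "$src" "model-$t.gguf" "$t" 2> /dev/null || return 1
  done
}

# Usage (same types as in the steps above):
# quantize_all model-f32.gguf f16 bf16 q8_0 q6_k q5_k_m q5_k_s q4_k_m q4_k_s
```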
Usage:
build/bin/llama-embedding -m model-q5_k_m.gguf -p "She chats and laughs all day" --embd-output-format array 2> /dev/null
build/bin/llama-server --embedding -c 512 -m model-q5_k_m.gguf
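
Once `llama-server` is running, embeddings can be requested over HTTP. The sketch below assumes the server listens on its default address `http://localhost:8080` and that the OpenAI-compatible `/v1/embeddings` route is available in this build. `LLAMA_URL`, `embed_payload` and `embed` are hypothetical names introduced here, and the JSON escaping is deliberately naive (backslashes and double quotes only).

```shell
# Assumption: server started with the command above, on the default port.
LLAMA_URL="${LLAMA_URL:-http://localhost:8080}"

embed_payload() {
  # Escape backslashes and double quotes, then wrap the text in a JSON body.
  text=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"input": "%s"}' "$text"
}

embed() {
  # POST the payload; the response is JSON containing the embedding vector.
  curl -s "$LLAMA_URL/v1/embeddings" \
    -H "Content-Type: application/json" \
    -d "$(embed_payload "$1")"
}

# Usage:
# embed "some sentence to embed"
```

Inputs longer than the context size given with `-c 512` may be rejected or truncated by the server, so long documents should be split before embedding.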