Instructions to use Azazelle/L3-Persephone-8B-v1.0 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Azazelle/L3-Persephone-8B-v1.0 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Azazelle/L3-Persephone-8B-v1.0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the generated reply is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Azazelle/L3-Persephone-8B-v1.0")
model = AutoModelForCausalLM.from_pretrained("Azazelle/L3-Persephone-8B-v1.0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


vLLM

How to use Azazelle/L3-Persephone-8B-v1.0 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Azazelle/L3-Persephone-8B-v1.0"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Azazelle/L3-Persephone-8B-v1.0",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
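
The same request can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the base URL assumes the vLLM server started above is listening on its default port 8000.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       base_url="http://localhost:8000",
                       model="Azazelle/L3-Persephone-8B-v1.0"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the text of the first choice.

    Assumes the server is running and replies in the OpenAI chat format.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Building the request does not contact the server.
request = build_chat_request("What is the capital of France?")

# Uncomment once the server is up:
# print(send_chat_request(request))
```

Passing `base_url="http://localhost:30000"` should point the same sketch at a server on another port, such as the SGLang examples in this document.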

Use Docker

docker model run hf.co/Azazelle/L3-Persephone-8B-v1.0

SGLang

How to use Azazelle/L3-Persephone-8B-v1.0 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Azazelle/L3-Persephone-8B-v1.0" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Azazelle/L3-Persephone-8B-v1.0",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Azazelle/L3-Persephone-8B-v1.0" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Azazelle/L3-Persephone-8B-v1.0",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Azazelle/L3-Persephone-8B-v1.0 with Docker Model Runner:
```
docker model run hf.co/Azazelle/L3-Persephone-8B-v1.0
```

L3-Persephone-8B-v1.0

About:

This is a merge of pre-trained language models created using mergekit.

Recommended Samplers:

Temperature - 1.0
TFS - 0.85
Smoothing Factor - 0.3
Smoothing Curve - 1.1
Repetition Penalty - 1.1
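
As a rough illustration, the settings above can be kept in one place and filtered for the backend in use. This is a sketch under assumptions: the key names `tfs`, `smoothing_factor`, and `smoothing_curve` are chosen for illustration and may differ between front ends, and the helper `transformers_generate_kwargs` is hypothetical. Transformers does not appear to expose TFS or smoothing directly, so only temperature and repetition penalty are forwarded there.

```python
# Recommended samplers from this model card (key names are illustrative).
RECOMMENDED_SAMPLERS = {
    "temperature": 1.0,
    "tfs": 0.85,
    "smoothing_factor": 0.3,
    "smoothing_curve": 1.1,
    "repetition_penalty": 1.1,
}

# Subset that model.generate() in Transformers accepts as keyword arguments.
TRANSFORMERS_KEYS = {"temperature", "repetition_penalty"}


def transformers_generate_kwargs(samplers):
    """Keep only the settings Transformers understands and enable sampling."""
    kwargs = {k: v for k, v in samplers.items() if k in TRANSFORMERS_KEYS}
    # Temperature only takes effect when sampling is enabled.
    kwargs["do_sample"] = True
    return kwargs


generate_kwargs = transformers_generate_kwargs(RECOMMENDED_SAMPLERS)
print(generate_kwargs)
```

With the Transformers snippet earlier on this page, the result could be passed as `model.generate(**inputs, max_new_tokens=40, **generate_kwargs)`. The remaining settings would need a front end that supports them.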

Merge Method

This model was produced by a series of model stock and LoRA merges, followed by ExPO. It uses a mix of smart and roleplay-centered models to improve performance.
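
The ExPO steps in the configuration below use `task_arithmetic` with a single model, a weight above 1, and `normalize: false`. Under those settings the merge adds a scaled difference to the base, so each merged parameter is base + weight × (model − base), which extrapolates past the model when the weight is above 1. The following is a minimal sketch of that arithmetic on plain lists of floats. The function name and toy values are illustrative, and this is not mergekit's actual implementation.

```python
def task_arithmetic(base, models, weights):
    """Add weighted task vectors (model - base) to the base, without normalizing."""
    merged = list(base)
    for model, weight in zip(models, weights):
        for i, (m, b) in enumerate(zip(model, base)):
            merged[i] += weight * (m - b)
    return merged


# Toy "parameters" standing in for real tensors.
base = [0.0, 1.0, 2.0]
tuned = [1.0, 1.0, 4.0]

# weight = 1.0 reproduces the tuned model; weight = 1.28 moves 28% further
# along the same direction, which is the extrapolation step.
same = task_arithmetic(base, [tuned], [1.0])
extrapolated = task_arithmetic(base, [tuned], [1.28])
print(same, extrapolated)
```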

Configuration

The following YAML configuration was used to produce this model:

# Smart model mixing
models:
  - model: migtissera/Llama-3-8B-Synthia-v3.5
  - model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
  - model: openchat/openchat-3.6-8b-20240522
  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
  - model: WhiteRabbitNeo/Llama-3-WhiteRabbitNeo-8B-v2.0
  - model: chujiezheng/LLaMA3-iterative-DPO-final-ExPO
  - model: chujiezheng/Llama-3-Instruct-8B-SimPO-ExPO
  - model: NousResearch/Hermes-2-Theta-Llama-3-8B
  - model: mlabonne/Daredevil-8B-abliterated
  - model: mlabonne/NeuralDaredevil-8B-abliterated
  - model: iRyanBell/ARC1
  - model: iRyanBell/ARC1-II
  - model: aaditya/Llama3-OpenBioLLM-8B
  - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
  - model: Locutusque/Llama-3-Hercules-5.0-8B
  - model: OwenArli/Awanllm-Llama-3-8B-Cumulus-v1.0
  - model: TIGER-Lab/MAmmoTH2-8B-Plus
  - model: refuelai/Llama-3-Refueled
  - model: failspy/Meta-Llama-3-8B-Instruct-abliterated-v3
  - model: HPAI-BSC/Llama3-Aloe-8B-Alpha
  - model: abacusai/Llama-3-Smaug-8B
  - model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
  - model: turboderp/llama3-turbcat-instruct-8b
  - model: nbeerbower/llama-3-gutenberg-8B
  - model: chargoddard/prometheus-2-llama-3-8b
  - model: Magpie-Align/Llama-3-8B-OpenHermes-2.5-1M
  - model: Magpie-Align/Llama-3-8B-Magpie-Pro-MT-SFT-v0.1
merge_method: model_stock
base_model: NousResearch/Meta-Llama-3-8B-Instruct
dtype: float32
vocab_type: bpe
name: stop_it_nerd
---
# RP LoRA Mixing
models:
  - model: stop_it_nerd+Azazelle/Llama-3-8B-Abomination-LORA
  - model: stop_it_nerd+Azazelle/Llama-3-LimaRP-Instruct-LoRA-8B
  - model: stop_it_nerd+ToastyPigeon/Llama-3-8B-Instruct-SpringDragon-V2-QLoRA
  - model: stop_it_nerd+Azazelle/Llama-3-LongStory-LORA
  - model: stop_it_nerd+Azazelle/Llama3_RP_ORPO_LoRA
  - model: stop_it_nerd+Azazelle/RP_Format_QuoteAsterisk_Llama3
  - model: stop_it_nerd+Azazelle/Theory_of_Mind_Llama3
  - model: stop_it_nerd+Azazelle/Aura_Llama3
  - model: stop_it_nerd+Azazelle/Luna_Llama3
  - model: stop_it_nerd+Azazelle/BlueMoon_Llama3
  - model: stop_it_nerd+Azazelle/Smarts_Llama3
  - model: stop_it_nerd+Azazelle/Nimue-8B
  - model: stop_it_nerd+Azazelle/Llama-3-Instruct-LiPPA-LoRA-8B
  - model: stop_it_nerd+Azazelle/go-bruins-v3-lora
  - model: stop_it_nerd+Azazelle/L3-Daybreak-8b-lora
merge_method: model_stock
base_model: stop_it_nerd
dtype: float32
vocab_type: bpe
name: nerdy_rp
---
# RP Model Mixing
models:
  - model: ChaoticNeutrals/Hathor_RP-v.01-L3-8B
  - model: TheDrummer/Llama-3SOME-8B-v2
  - model: cgato/TheSalt-L3-8b-v0.3.2
  - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
  - model: Sao10K/L3-8B-Stheno-v3.2
  - model: ChaoticNeutrals/T-900-8B
  - model: ResplendentAI/Nymph_8B
  - model: vicgalle/Roleplay-Llama-3-8B
  - model: maldv/badger-mu-llama-3-8b
  - model: maldv/badger-iota-llama-3-8b
  - model: ContextualAI/Llama-3-8B-Instruct-EPO-checkpoint5376
  - model: hf-100/Llama-3-Spellbound-Instruct-8B-0.3
  - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
  - model: lodrick-the-lafted/Limon-8B
  - model: ChaoticNeutrals/Poppy_Porpoise-1.0-L3-8B
  - model: turboderp/llama3-turbcat-instruct-8b
merge_method: model_stock
base_model: NousResearch/Meta-Llama-3-8B-Instruct
dtype: float32
vocab_type: bpe
name: true_rp
---
# Component Mixing
models:
  - model: true_rp
  - model: nerdy_rp
merge_method: model_stock
base_model: NousResearch/Meta-Llama-3-8B-Instruct
dtype: float32
vocab_type: bpe
name: virgin_rp
---
# Normal ExPO
models:
  - model: virgin_rp
    parameters:
      weight: 1.28
merge_method: task_arithmetic
base_model: NousResearch/Meta-Llama-3-8B-Instruct
parameters:
  normalize: false
dtype: float32
vocab_type: bpe
name: virgin_dumb
---
# Instruct ExPO
models:
  - model: virgin_rp
    parameters:
      weight: 1.12
merge_method: task_arithmetic
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  normalize: false
dtype: float32
vocab_type: bpe
name: virgin_smart
---
# ExPO Mixing
models:
  - model: virgin_smart
  - model: virgin_dumb
merge_method: model_stock
base_model: virgin_rp
dtype: float32
vocab_type: bpe
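
The configuration above chains several named intermediate merges, each of which can feed later stages. As a sketch of how those stages depend on one another, the mapping below lists, for each stage, the earlier stages it reads from, taken from the `name`, `models`, and `base_model` fields above. The label `final` is hypothetical, since the last stage has no `name`, and models pulled from the Hub are omitted.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# stage name -> earlier named stages it consumes
STAGE_INPUTS = {
    "stop_it_nerd": set(),                  # smart model mixing
    "nerdy_rp": {"stop_it_nerd"},           # RP LoRA mixing on top of it
    "true_rp": set(),                       # RP model mixing
    "virgin_rp": {"true_rp", "nerdy_rp"},   # component mixing
    "virgin_dumb": {"virgin_rp"},           # ExPO against the Instruct base
    "virgin_smart": {"virgin_rp"},          # ExPO against the non-Instruct base
    "final": {"virgin_smart", "virgin_dumb", "virgin_rp"},  # ExPO mixing
}


def build_order(stage_inputs):
    """Return one valid order in which the stages could be built."""
    return list(TopologicalSorter(stage_inputs).static_order())


order = build_order(STAGE_INPUTS)
print(" -> ".join(order))
```

Any order produced this way builds each intermediate merge before the stages that consume it. Independent stages such as `stop_it_nerd` and `true_rp` may come out in either order.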

Downloads last month: 3

Safetensors

Model size: 8B params

Tensor type: F32

Model tree for Azazelle/L3-Persephone-8B-v1.0

Quantizations: 2 models
