# KARAKURI LM 7B APM v0.1

## Model Details

### Model Description
- Developed by: KARAKURI Inc.
- Model type: Causal decoder-only transformer language model
- Languages: Primarily English
- License: Gemma Terms of Use
- Finetuned from model: google/gemma-7b
- Contact: For questions and comments about the model, please email karakuri-rd@karakuri.ai
## Use in 🤗 Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "karakuri-ai/karakuri-lm-7b-apm-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
```
```python
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
]

# Render the prompt with the HelpSteer attribute label to inspect it.
tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize the same prompt and generate the attribute scores.
input_ids = tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
```
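The decoded output is a flat string of attribute scores, such as `helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1`. A minimal sketch of turning such a string into a dictionary, assuming the `name: value` format shown in the label example below (the `parse_scores` helper is ours, not part of the library):

```python
# Parse an attribute string like "helpfulness: 2 correctness: 1" into a dict.
# The exact output format is assumed from the label example in this card,
# not guaranteed by any library API.
def parse_scores(text: str) -> dict[str, int]:
    tokens = text.strip().split()
    # Tokens alternate between an attribute name ("helpfulness:") and a value.
    return {
        name.rstrip(":"): int(value)
        for name, value in zip(tokens[::2], tokens[1::2])
    }

print(parse_scores("helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1"))
# {'helpfulness': 2, 'correctness': 1, 'coherence': 2, 'complexity': 1, 'verbosity': 1}
```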
```python
# Append the predicted scores as a "label" turn, then continue the conversation.
messages += [
    {"role": "label", "content": "helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1"},
    {"role": "user", "content": "Thank you!"},
    {"role": "assistant", "content": "You're welcome! I'm happy to help however I can."},
]
tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    tokenize=False,
    add_generation_prompt=True,
)
```
```python
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
]

# The same workflow works with the OASST attribute label.
tokenizer.apply_chat_template(
    messages,
    label="oasst",
    tokenize=False,
    add_generation_prompt=True,
)

input_ids = tokenizer.apply_chat_template(
    messages,
    label="oasst",
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
```
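One common use of an attribute prediction model is reranking: decode an attribute string for each candidate reply and keep the highest-scoring one. A self-contained sketch under the assumption that each decoded output contains a `helpfulness: N` field; the candidate strings and the `helpfulness` helper are illustrative stand-ins, not model outputs or library APIs:

```python
import re

# Extract the helpfulness score from a decoded attribute string.
# Returns -1 when the field is missing so such candidates rank last.
def helpfulness(label: str) -> int:
    m = re.search(r"helpfulness:\s*(-?\d+)", label)
    return int(m.group(1)) if m else -1

# Illustrative stand-ins for decoded APM outputs, one per candidate reply.
candidates = [
    "helpfulness: 1 correctness: 2 coherence: 2 complexity: 1 verbosity: 2",
    "helpfulness: 3 correctness: 3 coherence: 2 complexity: 1 verbosity: 1",
]

# Index of the candidate with the highest helpfulness score.
best = max(range(len(candidates)), key=lambda i: helpfulness(candidates[i]))
print(best)  # → 1
```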
## Training Details

### Training Data

### Training Infrastructure
- Hardware: The model was trained on a single Amazon EC2 trn1.32xlarge instance.
- Software: We used code based on neuronx-nemo-megatron.
## Citation

```bibtex
@misc{karakuri_lm_7b_apm_v01,
    author    = { {KARAKURI} {I}nc. },
    title     = { {KARAKURI} {LM} 7{B} {APM} v0.1 },
    year      = { 2024 },
    url       = { https://huggingface.co/karakuri-ai/karakuri-lm-7b-apm-v0.1 },
    publisher = { Hugging Face },
    journal   = { Hugging Face repository }
}
```