
Model Card for Mistral7B-v0.1-coco-caption-de

This model is a fine-tuned version of the Mistral7B-v0.1 completion model, intended to produce German COCO-style captions.

The coco-karpathy-opus-de dataset was used to tune the model for German image caption generation.
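
For reference, a hedged sketch of loading that dataset with the datasets library; the exact Hub path and split layout are assumptions, not stated in this card:

```python
from datasets import load_dataset

# Hypothetical Hub path; adjust to the actual repo id of coco-karpathy-opus-de.
dataset = load_dataset("coco-karpathy-opus-de")
print(dataset)                # inspect the available splits
print(dataset["train"][0])    # assumes a "train" split containing German captions
```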

Model Details

Prompt format

The completion model was trained with the prompt prefix Bildbeschreibung: ("image description").

Examples (text after the prompt prefix is the model's completion):

>>> Bildbeschreibung: 
2 Hunde sitzen auf einer Bank neben einer Pflanze

>>> Bildbeschreibung: Wasser
fall und Felsen vor dem Gebäude mit Blick auf den Fluss.

>>> Bildbeschreibung: Ein grünes Auto mit roten 
 Reflektoren parkte auf dem Parkplatz.
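
A minimal generation sketch in Python; the Hub repo id and the sampling settings are assumptions, not taken from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mistral7B-v0.1-coco-caption-de"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# The fine-tune expects the completion prefix it was trained with.
prompt = "Bildbeschreibung: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```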


Uses

The model is meant to be used in conjunction with a BLIP-2 Q-Former to enable image captioning, visual question answering (VQA), and chat-like conversations.

Training Details

The preliminary training script uses PEFT and DeepSpeed to run the training.

Training Data

The model was fine-tuned on the coco-karpathy-opus-de dataset of German COCO captions described above.

Training Procedure

The model was trained using PEFT 4-bit QLoRA with the following parameters (see the sketch after the list):

  • rank: 256
  • alpha: 16
  • steps: 8500
  • bf16: True
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.03
  • gradient accumulation steps: 2
  • batch size: 4
  • input sequence length: 512
  • learning rate: 2.0e-5
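
A sketch of a QLoRA setup matching the listed hyperparameters; the quantization details, target modules, and the mapping of "batch size" to per-device batch size are assumptions, not taken from this card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization for QLoRA (nf4 and bf16 compute are assumed defaults).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=256,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="out",
    max_steps=8500,
    bf16=True,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    gradient_accumulation_steps=2,
    per_device_train_batch_size=4,  # assumes "batch size" is per device
    learning_rate=2.0e-5,
)
# Inputs would be tokenized with max_length=512 to match the listed
# input sequence length.
```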

Postprocessing

The LoRA adapter was merged into the base model, and the merged model was saved using the PeftModel API.
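
A minimal merge-and-save sketch using the PeftModel API; the adapter path is a placeholder:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base, "path/to/adapter")  # placeholder path

# Fold the LoRA weights into the base model and save the standalone result.
merged = model.merge_and_unload()
merged.save_pretrained("Mistral7B-v0.1-coco-caption-de")
```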

Framework versions

  • PEFT 0.8.2