SetFit with sentence-transformers/all-mpnet-base-v2
This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
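The first step can be illustrated with a small, simplified sketch of how labeled texts might be turned into contrastive training pairs. This is not SetFit's actual sampling code: the function name `generate_pairs`, the toy texts, and the rule of one same-label and one different-label pair per sample and iteration are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str, float]


def generate_pairs(
    texts: Sequence[str],
    labels: Sequence[int],
    num_iterations: int,
    seed: int = 42,
) -> List[Pair]:
    """Build (text_a, text_b, target) pairs for contrastive fine-tuning.

    Target 1.0 marks a same-label pair, 0.0 a different-label pair.
    """
    rng = random.Random(seed)
    pairs: List[Pair] = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates with the same label, excluding the anchor itself.
            same = [t for j, t in enumerate(texts) if labels[j] == label and j != i]
            # Candidates with any other label.
            diff = [t for j, t in enumerate(texts) if labels[j] != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


# Toy data, invented for illustration only (not taken from the training set).
toy_texts = ["flood warning issued", "bridge collapsed", "nice weather today", "great concert"]
toy_labels = [1, 1, 0, 0]
toy_pairs = generate_pairs(toy_texts, toy_labels, num_iterations=2)
print(len(toy_pairs))  # 4 samples * 2 iterations * 2 pairs = 16
```

With a cosine-similarity loss, pairs like these pull embeddings of same-label texts together and push embeddings of different-label texts apart. The classification head is then trained on the resulting embeddings.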
Model Details
Model Description
Model Sources
Model Labels
| Label | Examples |
|:------|:---------|
| 0 | |
| 1 | 'Police officer wounded suspect dead after exchanging shots: RICHMOND Va. (AP) \x89ÛÓ A Richmond police officer wa... http://t.co/Y0qQS2L7bS'<br>"There's a weird siren going off here...I hope Hunterston isn't in the process of blowing itself to smithereens..."<br>'Iranian warship points weapon at American helicopter... http://t.co/cgFZk8Ha1R' |
Evaluation
Metrics
| Label | Accuracy |
|:------|:---------|
| all | 0.8058 |
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
# Download the model from the Hugging Face Hub
model = SetFitModel.from_pretrained("pEpOo/catastrophy6")
# Run inference on a single text
preds = model("SHOUOUT TO @kasad1lla CAUSE HER VOCALS ARE BLAZING HOT LIKE THE WEATHER SHES IN")
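To classify several texts in one call, a small wrapper along the following lines may help. It is a sketch: the helper name `classify_batch` is hypothetical, and it assumes that calling the model with a list of strings returns one integer-like label (0 or 1, as in the Model Labels section) per input.

```python
from typing import Any, Callable, List, Sequence, Tuple


def classify_batch(
    model: Callable[[List[str]], Any],
    texts: Sequence[str],
) -> List[Tuple[str, int]]:
    """Run the model on several texts at once and pair each text with its label."""
    preds = model(list(texts))
    # int() turns tensor or NumPy scalars into plain Python integers.
    return [(text, int(pred)) for text, pred in zip(texts, preds)]


# Usage with the real model (requires `pip install setfit` and network access):
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained("pEpOo/catastrophy6")
# for text, label in classify_batch(model, ["first tweet", "second tweet"]):
#     print(label, text)
```

Passing the model in as an argument keeps the helper independent of how the model was loaded, so it can also be exercised with a stand-in callable.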
Training Details
Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count | 1 | 14.7175 | 54 |
| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 1335 |
| 1 | 948 |
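The class balance implied by these counts can be worked out directly. The sketch below also derives the accuracy of a hypothetical classifier that always predicts the most frequent label. Comparing that figure with the reported accuracy of 0.8058 is only indicative, because it assumes the evaluation data has a similar label distribution, which this card does not state.

```python
# Training sample counts per label, copied from the table above.
train_counts = {0: 1335, 1: 948}

total = sum(train_counts.values())
shares = {label: count / total for label, count in train_counts.items()}

# Accuracy of a hypothetical classifier that always predicts the most frequent label.
majority_label = max(train_counts, key=train_counts.get)
majority_baseline = shares[majority_label]

print(total)
for label, share in shares.items():
    print(label, round(share, 4))
print(majority_label, round(majority_baseline, 4))
```

The training set holds 2283 samples, of which roughly 58% carry label 0 and 42% carry label 1.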
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0094 | 1 | 0.0044 | - |
| 0.4717 | 50 | 0.005 | - |
| 0.9434 | 100 | 0.0007 | - |
| 0.0002 | 1 | 0.4675 | - |
| 0.0088 | 50 | 0.3358 | - |
| 0.0175 | 100 | 0.2516 | - |
| 0.0263 | 150 | 0.2158 | - |
| 0.0350 | 200 | 0.1924 | - |
| 0.0438 | 250 | 0.1907 | - |
| 0.0526 | 300 | 0.2166 | - |
| 0.0613 | 350 | 0.2243 | - |
| 0.0701 | 400 | 0.0644 | - |
| 0.0788 | 450 | 0.1924 | - |
| 0.0876 | 500 | 0.166 | - |
| 0.0964 | 550 | 0.2117 | - |
| 0.1051 | 600 | 0.0793 | - |
| 0.1139 | 650 | 0.0808 | - |
| 0.1226 | 700 | 0.1183 | - |
| 0.1314 | 750 | 0.0808 | - |
| 0.1402 | 800 | 0.0194 | - |
| 0.1489 | 850 | 0.0699 | - |
| 0.1577 | 900 | 0.0042 | - |
| 0.1664 | 950 | 0.0048 | - |
| 0.1752 | 1000 | 0.1886 | - |
| 0.1840 | 1050 | 0.0008 | - |
| 0.1927 | 1100 | 0.0033 | - |
| 0.2015 | 1150 | 0.0361 | - |
| 0.2102 | 1200 | 0.12 | - |
| 0.2190 | 1250 | 0.0035 | - |
| 0.2278 | 1300 | 0.0002 | - |
| 0.2365 | 1350 | 0.0479 | - |
| 0.2453 | 1400 | 0.0568 | - |
| 0.2540 | 1450 | 0.0004 | - |
| 0.2628 | 1500 | 0.0002 | - |
| 0.2715 | 1550 | 0.0013 | - |
| 0.2803 | 1600 | 0.0005 | - |
| 0.2891 | 1650 | 0.0014 | - |
| 0.2978 | 1700 | 0.0004 | - |
| 0.3066 | 1750 | 0.0008 | - |
| 0.3153 | 1800 | 0.0616 | - |
| 0.3241 | 1850 | 0.0003 | - |
| 0.3329 | 1900 | 0.001 | - |
| 0.3416 | 1950 | 0.0581 | - |
| 0.3504 | 2000 | 0.0657 | - |
| 0.3591 | 2050 | 0.0584 | - |
| 0.3679 | 2100 | 0.0339 | - |
| 0.3767 | 2150 | 0.0081 | - |
| 0.3854 | 2200 | 0.0001 | - |
| 0.3942 | 2250 | 0.0009 | - |
| 0.4029 | 2300 | 0.0018 | - |
| 0.4117 | 2350 | 0.0001 | - |
| 0.4205 | 2400 | 0.0012 | - |
| 0.4292 | 2450 | 0.0001 | - |
| 0.4380 | 2500 | 0.0003 | - |
| 0.4467 | 2550 | 0.0035 | - |
| 0.4555 | 2600 | 0.0172 | - |
| 0.4643 | 2650 | 0.0383 | - |
| 0.4730 | 2700 | 0.0222 | - |
| 0.4818 | 2750 | 0.0013 | - |
| 0.4905 | 2800 | 0.0007 | - |
| 0.4993 | 2850 | 0.0003 | - |
| 0.5081 | 2900 | 0.1247 | - |
| 0.5168 | 2950 | 0.023 | - |
| 0.5256 | 3000 | 0.0002 | - |
| 0.5343 | 3050 | 0.0002 | - |
| 0.5431 | 3100 | 0.0666 | - |
| 0.5519 | 3150 | 0.0002 | - |
| 0.5606 | 3200 | 0.0003 | - |
| 0.5694 | 3250 | 0.0012 | - |
| 0.5781 | 3300 | 0.0085 | - |
| 0.5869 | 3350 | 0.0003 | - |
| 0.5957 | 3400 | 0.0002 | - |
| 0.6044 | 3450 | 0.0004 | - |
| 0.6132 | 3500 | 0.013 | - |
| 0.6219 | 3550 | 0.0089 | - |
| 0.6307 | 3600 | 0.0001 | - |
| 0.6395 | 3650 | 0.0002 | - |
| 0.6482 | 3700 | 0.0039 | - |
| 0.6570 | 3750 | 0.0031 | - |
| 0.6657 | 3800 | 0.0009 | - |
| 0.6745 | 3850 | 0.0002 | - |
| 0.6833 | 3900 | 0.0002 | - |
| 0.6920 | 3950 | 0.0001 | - |
| 0.7008 | 4000 | 0.0 | - |
| 0.7095 | 4050 | 0.0212 | - |
| 0.7183 | 4100 | 0.0001 | - |
| 0.7270 | 4150 | 0.0586 | - |
| 0.7358 | 4200 | 0.0001 | - |
| 0.7446 | 4250 | 0.0003 | - |
| 0.7533 | 4300 | 0.0126 | - |
| 0.7621 | 4350 | 0.0001 | - |
| 0.7708 | 4400 | 0.0001 | - |
| 0.7796 | 4450 | 0.0001 | - |
| 0.7884 | 4500 | 0.0 | - |
| 0.7971 | 4550 | 0.0002 | - |
| 0.8059 | 4600 | 0.0002 | - |
| 0.8146 | 4650 | 0.0001 | - |
| 0.8234 | 4700 | 0.0035 | - |
| 0.8322 | 4750 | 0.0002 | - |
| 0.8409 | 4800 | 0.0002 | - |
| 0.8497 | 4850 | 0.0001 | - |
| 0.8584 | 4900 | 0.0001 | - |
| 0.8672 | 4950 | 0.0001 | - |
| 0.8760 | 5000 | 0.0003 | - |
| 0.8847 | 5050 | 0.0 | - |
| 0.8935 | 5100 | 0.0041 | - |
| 0.9022 | 5150 | 0.0001 | - |
| 0.9110 | 5200 | 0.0001 | - |
| 0.9198 | 5250 | 0.0001 | - |
| 0.9285 | 5300 | 0.0001 | - |
| 0.9373 | 5350 | 0.0001 | - |
| 0.9460 | 5400 | 0.0001 | - |
| 0.9548 | 5450 | 0.0001 | - |
| 0.9636 | 5500 | 0.0001 | - |
| 0.9723 | 5550 | 0.0001 | - |
| 0.9811 | 5600 | 0.0002 | - |
| 0.9898 | 5650 | 0.0271 | - |
| 0.9986 | 5700 | 0.0 | - |
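The Epoch column of the long run (steps 1 to 5700) can be reproduced from the hyperparameters and sample counts above. The sketch below assumes that each of the `num_iterations` passes yields one positive and one negative pair per training sample. That assumption matches the logged values but is not stated in this card. The first three rows (steps 1, 50 and 100 at epochs 0.0094 to 0.9434) do not fit this count and appear to belong to a separate, much shorter run.

```python
import math

# Values copied from the sections above.
n_samples = 1335 + 948   # training samples for labels 0 and 1
num_iterations = 20
batch_size = 16

# Assumption: one positive and one negative pair per sample and iteration.
total_pairs = 2 * num_iterations * n_samples
total_steps = math.ceil(total_pairs / batch_size)


def epoch_at(step: int) -> float:
    """Fraction of the single training epoch completed at a given step."""
    return round(step / total_steps, 4)


print(total_pairs)     # 91320
print(total_steps)     # 5708
print(epoch_at(50))    # 0.0088
print(epoch_at(5700))  # 0.9986
```

Under this assumption the run has 5708 optimizer steps, so the last logged row at step 5700 sits just short of a full epoch.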
Framework Versions
- Python: 3.10.12
- SetFit: 1.0.1
- Sentence Transformers: 2.2.2
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.15.0
- Tokenizers: 0.15.0
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}