SetFit

This is a SetFit model for text classification. It uses a scikit-learn OneVsRestClassifier instance as its classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
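
Step 1 depends on turning a small set of labeled sentences into sentence pairs for contrastive fine-tuning. The sketch below is a simplified, pure-Python illustration of that pair construction. It assumes same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, as is typical with a cosine-similarity loss. The function name `make_contrastive_pairs` and the toy sentences are hypothetical, and the SetFit library's own sampler differs in detail.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled sentences.

    Same-label pairs get target 1.0, different-label pairs get 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data, invented for illustration only (not from this model's dataset).
texts = ["great film", "loved it", "terrible plot", "awful acting"]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

Because pairs grow much faster than the number of labeled sentences, even a few examples per class yield many training signals for the Sentence Transformer body.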

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: a OneVsRestClassifier instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 3 classes

Model Labels

Label Examples
2
  • 'This research group is only interested in violent extremism – according to their website.'
  • 'No cop, anywhere, “signed up” to be murdered.'
  • "(Both those states are also part of today's federal lawsuit filed in the Western District of Washington.)"
1
  • 'In the meantime, the New Mexico district attorney who failed to file for a preliminary hearing within 10 days and didn’t show up for court is vowing to pursue prosecution of these jihadis.'
  • 'According to the Constitution, you, and you alone, are the sole head of the executive branch, and as such you are where the buck stop in making sure the laws are faithfully executed.'
  • 'And the death of the three-year-old?'
0
  • 'One of the Indonesian illegal aliens benefiting from her little amnesty took the hint and used the opportunity that Saris created to flee from arrest and deportation, absconding to a sanctuary church to hide from arrest.'
  • 'So, why did Mueller focus on Manafort?'
  • 'We had a lot of reporters in that room, many many reporters in that room and they were unable to ask questions because this guy gets up and starts, you know, doing what he’s supposed to be doing for him and for CNN and you know just shouting out questions and making statements, too."'

Evaluation

Metrics

Label Accuracy
all 0.9987

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("anismahmahi/doubt_repetition_with_noPropaganda_multiclass_SetFit")
# Run inference
preds = model("The Twitter suspension caught me by surprise.")

Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 20.4272 109

Label Training Sample Count
0 131
1 129
2 2479
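
The label counts are heavily skewed toward label 2. The short calculation below, a sketch that uses only the counts listed above (variable names are illustrative), makes the imbalance explicit. It describes the training data only, since the label distribution of the evaluation split is not given here.

```python
# Training sample counts per label, copied from this model card.
label_counts = {0: 131, 1: 129, 2: 2479}

total = sum(label_counts.values())
shares = {label: count / total for label, count in label_counts.items()}
majority_label = max(label_counts, key=label_counts.get)
majority_share = shares[majority_label]

for label, share in shares.items():
    print(f"label {label}: {share:.1%}")
print(f"total: {total}, majority label: {majority_label} ({majority_share:.1%})")
```

On data with this distribution, always predicting the majority label would already be right about nine times in ten, which is useful context when reading an aggregate accuracy figure.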

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 5
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
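
These settings can be cross-checked against the step counts in the training log. The sketch below assumes that, when num_iterations is set, the sampler generates num_iterations positive and num_iterations negative pairs per training sample, and that the first element of batch_size applies to the embedding fine-tuning phase. Both are assumptions about SetFit's behavior rather than statements from this card.

```python
import math

num_samples = 131 + 129 + 2479  # training sample counts from this card
num_iterations = 5              # from the hyperparameters
batch_size = 16                 # first element of batch_size (assumed: embedding phase)
num_epochs = 2                  # first element of num_epochs

# Assumption: num_iterations positive plus num_iterations negative pairs per sample.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs

print(num_pairs, steps_per_epoch, total_steps)
```

Under these assumptions the result is 1712 steps per epoch and 3424 steps in total, which agrees with the steps logged at epochs 1.0 and 2.0 in the Training Results table.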

Training Results

Epoch Step Training Loss Validation Loss
0.0006 1 0.3869 -
0.0292 50 0.3352 -
0.0584 100 0.2235 -
0.0876 150 0.1518 -
0.1168 200 0.1967 -
0.1460 250 0.1615 -
0.1752 300 0.1123 -
0.2044 350 0.1493 -
0.2336 400 0.0039 -
0.2629 450 0.0269 -
0.2921 500 0.0024 -
0.3213 550 0.0072 -
0.3505 600 0.0649 -
0.3797 650 0.0005 -
0.4089 700 0.0008 -
0.4381 750 0.0041 -
0.4673 800 0.0009 -
0.4965 850 0.0004 -
0.5257 900 0.0013 -
0.5549 950 0.0013 -
0.5841 1000 0.0066 -
0.6133 1050 0.0355 -
0.6425 1100 0.0004 -
0.6717 1150 0.0013 -
0.7009 1200 0.0003 -
0.7301 1250 0.0002 -
0.7593 1300 0.0008 -
0.7886 1350 0.0002 -
0.8178 1400 0.0002 -
0.8470 1450 0.0004 -
0.8762 1500 0.1193 -
0.9054 1550 0.0002 -
0.9346 1600 0.0002 -
0.9638 1650 0.0002 -
0.9930 1700 0.0002 -
1.0 1712 - 0.0073
1.0222 1750 0.0002 -
1.0514 1800 0.0006 -
1.0806 1850 0.0005 -
1.1098 1900 0.0001 -
1.1390 1950 0.0012 -
1.1682 2000 0.0003 -
1.1974 2050 0.0344 -
1.2266 2100 0.0038 -
1.2558 2150 0.0001 -
1.2850 2200 0.0003 -
1.3143 2250 0.0114 -
1.3435 2300 0.0001 -
1.3727 2350 0.0001 -
1.4019 2400 0.0001 -
1.4311 2450 0.0001 -
1.4603 2500 0.0005 -
1.4895 2550 0.0086 -
1.5187 2600 0.0001 -
1.5479 2650 0.0002 -
1.5771 2700 0.0001 -
1.6063 2750 0.0002 -
1.6355 2800 0.0001 -
1.6647 2850 0.0001 -
1.6939 2900 0.0001 -
1.7231 2950 0.0001 -
1.7523 3000 0.0001 -
1.7815 3050 0.0001 -
1.8107 3100 0.0 -
1.8400 3150 0.0001 -
1.8692 3200 0.0001 -
1.8984 3250 0.0001 -
1.9276 3300 0.0 -
1.9568 3350 0.0001 -
1.9860 3400 0.0002 -
2.0 3424 - 0.0053
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.16.1
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

  • Model size: 109M params (Safetensors)
  • Tensor type: F32