SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
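
The two steps above can be sketched without any ML dependencies. This is a minimal illustration, not SetFit's actual implementation: `make_contrastive_pairs` is a hypothetical helper and the toy messages are invented. It shows only how labeled examples become sentence pairs with a similarity target (1.0 for the same class, 0.0 for different classes), which is the input to contrastive fine-tuning. The classification head is then fit on embeddings produced by the fine-tuned body.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Turn labeled texts into (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; SetFit's own sampler differs in detail.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data, invented for this sketch (not taken from the training set).
texts = ["rs 100 debit a/c", "rs 250 debit upi", "inr 50 credit account", "otp login code"]
labels = [0, 0, 1, 2]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four labeled messages yield six training pairs, which is why contrastive fine-tuning can work with very few labeled examples.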

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 384 tokens
Number of Classes: 3

Model Sources

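Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)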

Model Labels

Label	Examples
2	'validity airtel xstream fiber id 20001896982 expire 04-sep-23 . please recharge rs 589 enjoy uninterrupted service . recharge , click www.airtel.in/5/c_summary ? n=021710937343_dsl . please ignore already pay .' 'initiate process add a/c . xxxx59 upi app - axis bank' 'google-pay registration initiate icici bank . do , report bank . card details/otp/cvv secret . disclose anyone .'
0	'rs 260.00 debit a/c xxxxxx7783 credit krjngm @ oksbi upi ref:325154274303. ? call 18005700 -bob' 'send rs.400.00 kotak bank ac x4524 7800600122 @ ybl 15-10-23.upi ref 328855774953. , kotak.com/fraud' 'send rs.400.00 kotak bank ac x4524 7800600122 @ ybl 15-10-23.upi ref 328855774953. , kotak.com/fraud'
1	'dear bob upi user , account credit inr 50.00 date 2023-08-27 11:41:09 upi ref 360562629741 - bob' 'receive rs.10000.00 kotak bank ac x4524 mahimagyamlani08 @ okaxis 21-08-23.bal:197,838.14.upi ref:323334598750' 'update ! inr5.66 credit federal bank account xxxx9374 jupiter app . happy bank !'

Evaluation

Metrics

Label	Accuracy
all	0.9722

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("vipinbansal179/SetFit_sms_Analyzer5c95292")
# Run inference
preds = model("< # > use otp : 8233 login turtlemintpro zck+rfoaqnm")
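
For more than one message, it can be convenient to get the predicted label id together with the probability of the top class. The sketch below assumes the loaded model exposes `predict` and `predict_proba`, as `SetFitModel` does; `classify_messages` is a hypothetical helper, not part of SetFit. This card does not name the three labels, so the helper returns the raw ids (0, 1, 2).

```python
from typing import Any, List, Sequence, Tuple


def classify_messages(model: Any, messages: Sequence[str]) -> List[Tuple[str, int, float]]:
    """Return (message, predicted label id, top class probability) per message."""
    batch = list(messages)
    label_ids = model.predict(batch)             # one label id per message
    probabilities = model.predict_proba(batch)   # one row of class probabilities per message
    return [
        (text, int(label_id), float(max(row)))
        for text, label_id, row in zip(batch, label_ids, probabilities)
    ]


# Usage, assuming setfit is installed and the Hub is reachable:
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained("vipinbansal179/SetFit_sms_Analyzer5c95292")
# messages = ["< # > use otp : 8233 login turtlemintpro zck+rfoaqnm"]
# for text, label_id, confidence in classify_messages(model, messages):
#     print(label_id, round(confidence, 3), text)
```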

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	5	20.5357	35

Label	Training Sample Count
0	31
1	28
2	81
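
Statistics of this kind can be recomputed for any labeled dataset. The following sketch assumes that a "word" is a whitespace-separated token, which may not match how the card's figures were produced; `summarize` is a hypothetical helper and the toy messages are invented. The last lines use the sample counts from the table above to show the class shares.

```python
from collections import Counter
from statistics import median


def summarize(texts, labels):
    """Word-count range and per-label sample counts for a labeled dataset."""
    word_counts = [len(text.split()) for text in texts]
    return {
        "min": min(word_counts),
        "median": median(word_counts),
        "max": max(word_counts),
        "per_label": dict(Counter(labels)),
    }


# Toy data, invented for this sketch.
toy_summary = summarize(
    ["rs 100 debit a/c", "inr 50 credit account today", "otp login"],
    [0, 1, 2],
)
print(toy_summary)

# Class shares from the "Training Sample Count" table in this card.
card_counts = {0: 31, 1: 28, 2: 81}
total = sum(card_counts.values())
shares = {label: count / total for label, count in card_counts.items()}
print(total, {label: round(share, 3) for label, share in shares.items()})
```

Label 2 accounts for 81 of the 140 training samples, so the training set is noticeably imbalanced.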

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (4, 4)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
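
With `loss: CosineSimilarityLoss`, the embedding body is trained so that the cosine similarity of each sentence pair matches its target. In Sentence Transformers this loss is the mean squared error between the cosine similarity of the two embeddings and the pair's label. A minimal pure-Python sketch of that computation follows; the function names are hypothetical and the two-dimensional vectors stand in for real sentence embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cos(u, v) and target over (u, v, target) triples."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same class, same direction: squared error 0.0
    ([1.0, 0.0], [0.0, 1.0], 1.0),  # same class but orthogonal: squared error 1.0
]
loss = cosine_similarity_loss(batch)
print(loss)  # 0.5
```

Minimizing this loss pulls same-class messages together in embedding space and pushes different-class messages apart, which is what makes the LogisticRegression head easy to fit afterwards.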

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0014	1	0.2939	-
0.0708	50	0.1698	-
0.1416	100	0.0557	-
0.2125	150	0.0614	-
0.2833	200	0.0099	-
0.3541	250	0.0005	-
0.4249	300	0.0002	-
0.4958	350	0.0001	-
0.5666	400	0.0001	-
0.6374	450	0.0001	-
0.7082	500	0.0001	-
0.7790	550	0.0001	-
0.8499	600	0.0002	-
0.9207	650	0.0001	-
0.9915	700	0.0001	-
1.0	706	-	0.0312
1.0623	750	0.0001	-
1.1331	800	0.0001	-
1.2040	850	0.0001	-
1.2748	900	0.0	-
1.3456	950	0.0001	-
1.4164	1000	0.0	-
1.4873	1050	0.0	-
1.5581	1100	0.0	-
1.6289	1150	0.0	-
1.6997	1200	0.0	-
1.7705	1250	0.0	-
1.8414	1300	0.0001	-
1.9122	1350	0.0	-
1.9830	1400	0.0001	-
2.0	1412	-	0.0366
2.0538	1450	0.0	-
2.1246	1500	0.0001	-
2.1955	1550	0.0	-
2.2663	1600	0.0	-
2.3371	1650	0.0	-
2.4079	1700	0.0	-
2.4788	1750	0.0	-
2.5496	1800	0.0	-
2.6204	1850	0.0	-
2.6912	1900	0.0	-
2.7620	1950	0.0	-
2.8329	2000	0.0	-
2.9037	2050	0.0	-
2.9745	2100	0.0	-
3.0	2118	-	0.0414
3.0453	2150	0.0	-
3.1161	2200	0.0	-
3.1870	2250	0.0	-
3.2578	2300	0.0	-
3.3286	2350	0.0	-
3.3994	2400	0.0	-
3.4703	2450	0.0	-
3.5411	2500	0.0	-
3.6119	2550	0.0	-
3.6827	2600	0.0	-
3.7535	2650	0.0	-
3.8244	2700	0.0	-
3.8952	2750	0.0	-
3.9660	2800	0.0	-
4.0	2824	-	0.0366

The bold row denotes the saved checkpoint.
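
The step counts in the table can be reproduced from the training sample counts and the hyperparameters. The sketch below rests on assumptions about SetFit's pair sampler that this card does not state: positive pairs are all same-class pairs including a sample paired with itself, negative pairs are all cross-class pairs, and `oversampling` repeats the smaller set until both sets are equal in size. Under those assumptions the arithmetic gives 706 steps per epoch and 2824 steps in total, which matches the log.

```python
import math
from itertools import combinations

label_counts = {0: 31, 1: 28, 2: 81}  # from the "Training Sample Count" table
batch_size = 16                        # first element of batch_size (embedding phase)
num_epochs = 4

# Assumed: same-class pairs, including a sample paired with itself.
positive_pairs = sum(n * (n + 1) // 2 for n in label_counts.values())
# Assumed: every cross-class pair counts as a negative pair.
negative_pairs = sum(a * b for a, b in combinations(label_counts.values(), 2))
# Assumed: oversampling pads the smaller set up to the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)

steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs
print(positive_pairs, negative_pairs, steps_per_epoch, total_steps)
# 4223 5647 706 2824
```

The Epoch column follows from the same numbers: step 50 of 706 is 50 / 706 ≈ 0.0708.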

Framework Versions

Python: 3.10.12
SetFit: 1.0.1
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu121
Datasets: 2.16.0
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
