SetFit with BAAI/bge-small-en-v1.5
This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
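The first step can be illustrated with a minimal sketch of how labeled texts become sentence pairs for contrastive fine-tuning. This is not SetFit's own sampler: the function name `make_contrastive_pairs` and the toy inputs are hypothetical. Pairs that share a label get a target similarity of 1.0 and pairs with different labels get 0.0, which is the kind of supervision a cosine-similarity loss expects.

```python
import itertools


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) for every unordered pair of texts.

    target is 1.0 when both texts share a label, otherwise 0.0.
    """
    pairs = []
    labeled = list(zip(texts, labels))
    for (text_a, label_a), (text_b, label_b) in itertools.combinations(labeled, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy, made-up inputs (not taken from the training set).
texts = [
    "solidity engineer",
    "rust developer",
    "seed stage investor",
    "investing in web3 founders",
]
labels = ["DEVELOPER", "DEVELOPER", "INVESTOR", "INVESTOR"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Four labeled texts already yield six pairs (two positive, four negative), which is why this approach can work with few labeled examples.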
Model Details
Model Description
Model Sources
Model Labels
Label | Examples
UNDETERMINED | 'Professor Emeritus of Cognitive Sciences at the University of California Irvine Research Visual perception evolutionary psychology consciousness AI Irvine CA'; 'Emeritus Professor of War Studies Kings College London just published Command The Politics of Military Operations from Korea to Ukraine UK Penguin US OUP '; 'XML apologist Erlang enthusiast Currently JVMs Performance stuff at Netflix Previously JVMs performative stuff at Twitter Hehim San Francisco California'
NFT_ARTIST | 'Artist Web3 Marketing Advisor Educator Making history everyday Trapped in the blockchain'; 'OwnYourAssets TokenGatedFile Access For CrossPlatformInteroperableGaming C5isComing CYBΞRVΞRSΞ'; 'Pronounced Akossya artist Zurich'
ONCHAIN_ANALYST | 'I write about onchain stuff fixer AleoHQ prev rabbithole_gg and plenty of DAOs youve heard of '; 'cofounder 3pochLabs onchain'; 'onchain data farcer building mosaicdrops media CryptoSapiens_ OntologyNetwork OrangeProtocol banklessDAO s0 _buildspace s4 Mosaicverse'
BUSINESS_DEVELOPER | 'Prev opensea TheBlock__ amazon '; 'Building HxroNetwork variable'; 'Building something old CoFounder alongsidefi '
NFT_COLLECTOR | 'Building glitchmarfa Collecting brightopps prev brtmoments '; 'My soul is a cat My two children rpcnftclub ChainFeedsxyz Bangkok'; 'prev OpenSea NYC'
DEVELOPER | 'Architect DoraHacks DoraFactory The everlasting hacker movement Menlo Park'; 'Engineer at Inria scikitlearn developer supported by Python and Machine Learning Between Vannes Paris France'; 'Working paritytech on substrate Views are my own I working mostly with rustlang nowadays '
TRADER | 'Applied game theorist blog occasionally at formerly not a very serious person Scott Alexander '; 'Crypto Trading Bitcoin class of 2013 insilicotrading COO Banana Cabana'; 'token maxi '
COMMUNITY_MANAGER | 'chutzpah controlled chaos connoisseur arbitrum chinshilling chinchillin thoughts are my own Rio de Janeiro Brazil'; 'commonsstack CoFounder tecmns Founding Steward KERNEL0x KB5 trustedseed tamaralens '; 'Community Admin at The Arbitrum Foundation Helping to scale Ethereum at Arbitrum Feed KOL Binance WEB3'
SECURITY_AUDITOR | 'founder adjacentfi cofounder former auditor osec_io MEV on solana '; 'Security Researcher Googles Threat Analysis Group 0days all day Love all things bytes assembly and glitter sheher '; '採用マーケ得意仮想通貨エンジニア4社1社ホワイトハッカーとして月110万達成現在歯科衛生士の妻と事業開始 実績年商1億超えのマーケ担当 開始5ヶ月で6名見学開始2年で累計DH11名見学6名採用 ハイライト要チェック ブログに今までの有益投稿をまとめました 岩手長野福岡ドバイ沖縄'
VENTURE_CAPITALIST | 'Liquid Crypto Brevan Howard Prev dragonfly_xyz consensys Arena'; 'maverick LA'; 'Founder of SavvyBooks Degen dcv_capital Summoner ElasticDAO metafam Judge code4rena Contributor CantoPublic Nomadic'
INVESTOR | 'Crypto Investor at Tephra Digital Ex Head of Research Grayscale DCGco FMR Head of Digital Asset Strategy Fundstrat New York NY'; 'Capital Allocators New York NY'; 'Director of Research Autonomous Technology Robotics ARKinvest Automation robotics energy storage alternative energy and space Disclosure New York NY'
ANGEL_INVESTOR | 'larp LawliettesLab angel uvocapital '; 'Initiator inverternetwork I Angel Investor I ex Gitcoin '; 'VP Head of BD AleoHQ Mainnet Launch Soon Strategic Advisor VoxiesNFT Angel Investor rcsdao ExOP ExCoinbase Professionally CuriousOpinions My Own Manhattan NY'
EXECUTIVE | 'Chief Strategy Marketing Officer of Liquidity Group Im also the cofounder of Hudson Rock RockHudsonRock a cybercrime intelligence company TelAviv'; 'CEO Polymarket Ethereum since 14 I love music and collect art new york'; 'CEO StartaleHQ Founder AstarNetwork All things for Web3 for billions Japanese Sota_Web3 Earth'
MARKETER | 'Director General en Kayum comparador de seguros insurance PPC tech crypto f1 Mexico City Mexico'; 'Insights about Web3 data economy and AI by oceanprotocol Currently in Marcom oceanprotocol ocean Ocean '; 'f加速 ethereum China internet culture history podcast growth marketing realmasknetwork prev newsbreakapp smartnews Zuzalu human Palo Alto USA'
DATA_SCIENTIST | 'data uniswap prev theTIEIO go bears New York NY'; 'engineering data science a16zcrypto '; 'LangChainAI previously robusthq kensho MLOps Generative AI sports analytics '
EDUCATOR | ' London'; 'MSc Immunology student Past cofounder prof director USF Center Applied Data Ethics math PhD math_rachelmastodonsocial sheher Brisbane Australia'; 'Here to build shared intelligence listen learn share via community tokenengineering KERNEL0x OptimismGov publicgoods education valuesmatter CyberDyn0x tauranga teikaamaui'
INFLUENCER | 'the destroyer Titan'; 'Healthy life style healthier bags Cape Town South Africa'; 'Beauty Brains Bitcoin Beauty in an anonymous world'
ADVISOR | 'A decentralized onchain governance consultant Health Wealth RunItUp The only Alpha discord youll ever need to joingametheoryweb3 squanchland Profit Land'; 'Design director Startup Advisor Midjourney Sharing learnings and prompts In my free time working on offscreenai Vancouver Canada'; 'I help fix and grow crypto portfolios through premium research and strategies 1000 members Founder cshift_io Podcast benandbergs Join 10k Crypto Investors '
BLOGGER | 'NOW Editor Forbes Writer Stripe HarvardBiz Back on Twitter after ignoring it for a decade I will try my best London'; 'larp coindesk '; ' '
RESEARCHER | 'Roblox Chief Scientist UWaterloo McGill Prof morgan3dbsky Known for NVIDIA Unity Graphics Codex Markdeep G3D Skylanders E Ink Titan Quest Williams Ontario Canada'; 'Simple human Simple life I am trying to do good around me Empathy creativity inspiration ArigatōMerci For ever apprenti researcher Nulle part ailleurs Nowhere'; 'Research community And we have our own NFT collection Telegram'
METAVERSE_ENTHUSIAST | 'fluent speaker of http and color virtual world evangelist game developer painter writer cj5 driver San Diego'; 'Blockchain Gaming Evangelist CritTheory Gaming CoFounder Earth'; 'We are a peeple obsessed recruiting service collective Treating everyone like a DMs checked infrequently Metaverse'
NODE_OPERATOR | 'into protocools and shitposting at nodeguardians '; ' CoFounder of onivalidator Filmmaker People Maxi Los Angeles CA'; 'I attest to block 247 Hobby involves the occasional block proposal Have commercial agreements with the MEV trade association Members of Sync Committees Los Angeles'
LAWYER | 'Law professor at Cal BerkeleyLaw Berkeley California'; 'IP litigator first sale doctrine respecter schedule a disrespecter wife mom to the tiny boss likes design patents needlework yarn new hampshire'; 'Lawyer FINTConsulting TechPolicy E4EProject upcoming GRC CybersecurityAnalyst ex InstituteGC Tweet law tech policy GRC Cybersecurity Decentralized'
DATA_ANALYST | 'Llama pilot at and '; 'blockchain data opensea kqian on Dune my views are my own dyor nfa data only wagmi open sea'; 'Blockchain analyst Cat and dog dad Taylor Swift fan Army veteran Pittsburgh PA'
MINER | 'Blockchain bitcoin mining since 2011 analyst 35 years in IT UnixNetwork engineer fpgachip design exCIO Bitfury BitfuryGroup LNSegWit taproot California USA'; 'Founder and CEO of Austin TX'; '在币圈捡矿泉水瓶子的人 0xb38544ccf295d78b7ae7b2bae5dbebdb1f09910dcrossbell Member of 33daoweb3 Metaverse'
SHITCOINER | 'Degen ETH and SOL lover '; 'VMPX mrjacklevin Draculaborg'; 'gripto alt notapornfolder_ '
FINANCIAL_ANALYST | 'Enrolled Agent Crypto Enthusiast Tax EXPERT StackingSats Chopping Tax Since 2016 NoSatoshiLeftBehind hodlmore payless crypto taxes Longmont CO'; 'Politico financial services editor zwarmbrodtpoliticocom zacharywarmbrodtprotonmailcom Washington DC'; 'Im just lookin for clues at the scene of the crime Sedona Arizona'
BUSINESS_ANALYST | 'Biz Analyst by day web3crypto learner by nightweekend Optimistic about Crypto FanVajpayeeji NaMo M Andreessen E Musk C Dixon Balaji S web3SF Bay Area'
Evaluation
Metrics
Label | Accuracy
all | 0.5565
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
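from setfit import SetFitModel

# Download the model from the Hugging Face Hub
model = SetFitModel.from_pretrained("kasparas12/crypto_individual_infer_model_setfit")
# Run inference on a single profile description; preds holds the predicted label
preds = model("producer business and elsewhere on leave views my own la gran manzana")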
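With a reported accuracy of 0.5565, the runner-up labels can be as informative as the top prediction. The sketch below ranks labels by predicted probability. It assumes the classification head is a scikit-learn LogisticRegression, so that `model.model_head.classes_` lists the labels in the column order of `predict_proba`. The helper names `rank_labels` and `classify_bios` are hypothetical and not part of SetFit.

```python
def rank_labels(labels, probabilities, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify_bios(bios, k=3):
    """Load the model and return the top-k labels for each bio.

    Requires `pip install setfit` and network access to the Hugging Face Hub.
    """
    # Imported lazily so rank_labels can be used without setfit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained("kasparas12/crypto_individual_infer_model_setfit")
    # Assumption: scikit-learn head, so classes_ matches predict_proba columns.
    labels = list(model.model_head.classes_)
    probabilities = model.predict_proba(bios)  # one row per input text
    return [rank_labels(labels, row, k) for row in probabilities.tolist()]


# Offline illustration with made-up probabilities:
print(rank_labels(["DEVELOPER", "TRADER", "INVESTOR"], [0.1, 0.7, 0.2], k=2))
```

Calling `classify_bios(["cofounder 3pochLabs onchain"])` would download the model and return one ranked list per input.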
Training Details
Training Set Metrics
Training set | Min | Median | Max
Word count | 1 | 13.3415 | 65
Label | Training Sample Count
DEVELOPER | 2111
DATA_SCIENTIST | 93
DATA_ANALYST | 25
NODE_OPERATOR | 71
MINER | 47
SECURITY_AUDITOR | 352
INVESTOR | 484
ANGEL_INVESTOR | 160
VENTURE_CAPITALIST | 941
TRADER | 270
SHITCOINER | 88
BUSINESS_DEVELOPER | 917
BUSINESS_ANALYST | 1
COMMUNITY_MANAGER | 401
MARKETER | 190
FINANCIAL_ANALYST | 72
ADVISOR | 150
RESEARCHER | 691
ONCHAIN_ANALYST | 45
EXECUTIVE | 741
INFLUENCER | 834
LAWYER | 137
BLOGGER | 198
NFT_COLLECTOR | 335
NFT_ARTIST | 598
EDUCATOR | 281
METAVERSE_ENTHUSIAST | 132
UNDETERMINED | 2216
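The training set is heavily imbalanced, which is worth keeping in mind when reading the single accuracy figure. The short sketch below restates the sample counts listed above and prints each label's share of the training set. The variable names are illustrative only.

```python
# Training sample counts restated from the table above.
TRAINING_COUNTS = {
    "DEVELOPER": 2111, "DATA_SCIENTIST": 93, "DATA_ANALYST": 25,
    "NODE_OPERATOR": 71, "MINER": 47, "SECURITY_AUDITOR": 352,
    "INVESTOR": 484, "ANGEL_INVESTOR": 160, "VENTURE_CAPITALIST": 941,
    "TRADER": 270, "SHITCOINER": 88, "BUSINESS_DEVELOPER": 917,
    "BUSINESS_ANALYST": 1, "COMMUNITY_MANAGER": 401, "MARKETER": 190,
    "FINANCIAL_ANALYST": 72, "ADVISOR": 150, "RESEARCHER": 691,
    "ONCHAIN_ANALYST": 45, "EXECUTIVE": 741, "INFLUENCER": 834,
    "LAWYER": 137, "BLOGGER": 198, "NFT_COLLECTOR": 335,
    "NFT_ARTIST": 598, "EDUCATOR": 281, "METAVERSE_ENTHUSIAST": 132,
    "UNDETERMINED": 2216,
}

total = sum(TRAINING_COUNTS.values())
largest = max(TRAINING_COUNTS, key=TRAINING_COUNTS.get)
smallest = min(TRAINING_COUNTS, key=TRAINING_COUNTS.get)

print(f"{len(TRAINING_COUNTS)} labels, {total} training samples")
# List labels from most to least frequent with their share of the total.
for label, count in sorted(TRAINING_COUNTS.items(), key=lambda item: item[1], reverse=True):
    print(f"{label:<22}{count:>6}{count / total:>8.1%}")
```

UNDETERMINED and DEVELOPER together account for roughly a third of the 12581 samples, while BUSINESS_ANALYST has a single one.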
Training Hyperparameters
- batch_size: (64, 64)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
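For reference, here is a hedged sketch of how the values above could be passed to `setfit.TrainingArguments`. It assumes SetFit 1.0.x, where these field names exist, and it has not been checked against the original training script. The helper `build_training_arguments` and the `HYPERPARAMETERS` dictionary are illustrative names. Where a value is a tuple, the first element applies to the embedding fine-tuning phase and the second to the classifier training phase.

```python
# Plain-valued hyperparameters restated from the list above.
HYPERPARAMETERS = {
    "batch_size": (64, 64),
    "num_epochs": (1, 1),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 20,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Build a setfit.TrainingArguments object; requires `pip install setfit`."""
    # Imported lazily so the dictionary can be inspected without setfit installed.
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    # distance_metric is left at the library default here
    # (assumption: cosine_distance, as reported in the list above).
    return TrainingArguments(loss=CosineSimilarityLoss, **HYPERPARAMETERS)


print(HYPERPARAMETERS["num_iterations"], HYPERPARAMETERS["batch_size"])
```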
Training Results
Epoch | Step | Training Loss | Validation Loss
0.0001 | 1 | 0.2625 | -
0.0064 | 50 | 0.2677 | -
0.0127 | 100 | 0.2515 | -
0.0191 | 150 | 0.2413 | -
0.0254 | 200 | 0.2374 | -
0.0318 | 250 | 0.2383 | -
0.0381 | 300 | 0.222 | -
0.0445 | 350 | 0.1972 | -
0.0509 | 400 | 0.2268 | -
0.0572 | 450 | 0.2333 | -
0.0636 | 500 | 0.199 | -
0.0699 | 550 | 0.2035 | -
0.0763 | 600 | 0.1676 | -
0.0827 | 650 | 0.1566 | -
0.0890 | 700 | 0.1909 | -
0.0954 | 750 | 0.189 | -
0.1017 | 800 | 0.1872 | -
0.1081 | 850 | 0.1576 | -
0.1144 | 900 | 0.1382 | -
0.1208 | 950 | 0.1603 | -
0.1272 | 1000 | 0.155 | -
0.1335 | 1050 | 0.1764 | -
0.1399 | 1100 | 0.1506 | -
0.1462 | 1150 | 0.1439 | -
0.1526 | 1200 | 0.1581 | -
0.1590 | 1250 | 0.1494 | -
0.1653 | 1300 | 0.1622 | -
0.1717 | 1350 | 0.1503 | -
0.1780 | 1400 | 0.1094 | -
0.1844 | 1450 | 0.1576 | -
0.1907 | 1500 | 0.1194 | -
0.1971 | 1550 | 0.1515 | -
0.2035 | 1600 | 0.1662 | -
0.2098 | 1650 | 0.1642 | -
0.2162 | 1700 | 0.0943 | -
0.2225 | 1750 | 0.1472 | -
0.2289 | 1800 | 0.1622 | -
0.2352 | 1850 | 0.0809 | -
0.2416 | 1900 | 0.1623 | -
0.2480 | 1950 | 0.1444 | -
0.2543 | 2000 | 0.1304 | -
0.2607 | 2050 | 0.1175 | -
0.2670 | 2100 | 0.078 | -
0.2734 | 2150 | 0.1189 | -
0.2798 | 2200 | 0.141 | -
0.2861 | 2250 | 0.1233 | -
0.2925 | 2300 | 0.1446 | -
0.2988 | 2350 | 0.1076 | -
0.3052 | 2400 | 0.1016 | -
0.3115 | 2450 | 0.0818 | -
0.3179 | 2500 | 0.1384 | -
0.3243 | 2550 | 0.1065 | -
0.3306 | 2600 | 0.1029 | -
0.3370 | 2650 | 0.1227 | -
0.3433 | 2700 | 0.0982 | -
0.3497 | 2750 | 0.0959 | -
0.3561 | 2800 | 0.0851 | -
0.3624 | 2850 | 0.1028 | -
0.3688 | 2900 | 0.1136 | -
0.3751 | 2950 | 0.1111 | -
0.3815 | 3000 | 0.115 | -
0.3878 | 3050 | 0.1183 | -
0.3942 | 3100 | 0.0689 | -
0.4006 | 3150 | 0.1004 | -
0.4069 | 3200 | 0.1079 | -
0.4133 | 3250 | 0.112 | -
0.4196 | 3300 | 0.0758 | -
0.4260 | 3350 | 0.09 | -
0.4323 | 3400 | 0.1267 | -
0.4387 | 3450 | 0.1024 | -
0.4451 | 3500 | 0.1352 | -
0.4514 | 3550 | 0.0681 | -
0.4578 | 3600 | 0.0483 | -
0.4641 | 3650 | 0.0937 | -
0.4705 | 3700 | 0.0744 | -
0.4769 | 3750 | 0.0926 | -
0.4832 | 3800 | 0.0764 | -
0.4896 | 3850 | 0.0814 | -
0.4959 | 3900 | 0.108 | -
0.5023 | 3950 | 0.0936 | -
0.5086 | 4000 | 0.0687 | -
0.5150 | 4050 | 0.0607 | -
0.5214 | 4100 | 0.0829 | -
0.5277 | 4150 | 0.0772 | -
0.5341 | 4200 | 0.0309 | -
0.5404 | 4250 | 0.0797 | -
0.5468 | 4300 | 0.063 | -
0.5532 | 4350 | 0.071 | -
0.5595 | 4400 | 0.0667 | -
0.5659 | 4450 | 0.121 | -
0.5722 | 4500 | 0.0565 | -
0.5786 | 4550 | 0.0915 | -
0.5849 | 4600 | 0.0613 | -
0.5913 | 4650 | 0.0479 | -
0.5977 | 4700 | 0.0622 | -
0.6040 | 4750 | 0.0687 | -
0.6104 | 4800 | 0.0635 | -
0.6167 | 4850 | 0.1233 | -
0.6231 | 4900 | 0.0351 | -
0.6295 | 4950 | 0.0717 | -
0.6358 | 5000 | 0.0906 | -
0.6422 | 5050 | 0.0712 | -
0.6485 | 5100 | 0.1133 | -
0.6549 | 5150 | 0.0757 | -
0.6612 | 5200 | 0.0809 | -
0.6676 | 5250 | 0.112 | -
0.6740 | 5300 | 0.0893 | -
0.6803 | 5350 | 0.0591 | -
0.6867 | 5400 | 0.0872 | -
0.6930 | 5450 | 0.0937 | -
0.6994 | 5500 | 0.038 | -
0.7057 | 5550 | 0.0793 | -
0.7121 | 5600 | 0.0569 | -
0.7185 | 5650 | 0.0861 | -
0.7248 | 5700 | 0.1022 | -
0.7312 | 5750 | 0.0759 | -
0.7375 | 5800 | 0.0451 | -
0.7439 | 5850 | 0.08 | -
0.7503 | 5900 | 0.058 | -
0.7566 | 5950 | 0.0423 | -
0.7630 | 6000 | 0.043 | -
0.7693 | 6050 | 0.109 | -
0.7757 | 6100 | 0.072 | -
0.7820 | 6150 | 0.0342 | -
0.7884 | 6200 | 0.0833 | -
0.7948 | 6250 | 0.0643 | -
0.8011 | 6300 | 0.1069 | -
0.8075 | 6350 | 0.0713 | -
0.8138 | 6400 | 0.0807 | -
0.8202 | 6450 | 0.0518 | -
0.8266 | 6500 | 0.0796 | -
0.8329 | 6550 | 0.0954 | -
0.8393 | 6600 | 0.0709 | -
0.8456 | 6650 | 0.0541 | -
0.8520 | 6700 | 0.0503 | -
0.8583 | 6750 | 0.0737 | -
0.8647 | 6800 | 0.0931 | -
0.8711 | 6850 | 0.0636 | -
0.8774 | 6900 | 0.0579 | -
0.8838 | 6950 | 0.1168 | -
0.8901 | 7000 | 0.0751 | -
0.8965 | 7050 | 0.0945 | -
0.9028 | 7100 | 0.0396 | -
0.9092 | 7150 | 0.0623 | -
0.9156 | 7200 | 0.0641 | -
0.9219 | 7250 | 0.0697 | -
0.9283 | 7300 | 0.0675 | -
0.9346 | 7350 | 0.0544 | -
0.9410 | 7400 | 0.0803 | -
0.9474 | 7450 | 0.0549 | -
0.9537 | 7500 | 0.0612 | -
0.9601 | 7550 | 0.0721 | -
0.9664 | 7600 | 0.0692 | -
0.9728 | 7650 | 0.07 | -
0.9791 | 7700 | 0.0476 | -
0.9855 | 7750 | 0.0673 | -
0.9919 | 7800 | 0.0606 | -
0.9982 | 7850 | 0.1001 | -
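The relation between the Epoch and Step columns can be reproduced with a little arithmetic. The sketch below assumes that the sampler generates `num_iterations` positive and `num_iterations` negative pairs per training sample. That assumption is not stated in this card, but it is consistent with the logged values. The sample total is the sum of the training sample counts listed earlier.

```python
import math

num_samples = 12581   # sum of the per-label training sample counts
num_iterations = 20   # from the training hyperparameters
batch_size = 64       # embedding-phase batch size (first element of batch_size)

# Assumption: num_iterations positive plus num_iterations negative pairs per sample.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)

print(f"pairs: {num_pairs}, steps per epoch: {steps_per_epoch}")

# Compare against a few (step, epoch) rows from the training log.
for step, logged_epoch in [(1, 0.0001), (50, 0.0064), (7850, 0.9982)]:
    computed_epoch = round(step / steps_per_epoch, 4)
    print(f"step {step}: computed {computed_epoch}, logged {logged_epoch}")
```

Under this assumption one epoch is 7864 steps, so the last logged step (7850) falls just short of a full epoch.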
Framework Versions
- Python: 3.9.16
- SetFit: 1.0.3
- Sentence Transformers: 2.2.2
- Transformers: 4.21.3
- PyTorch: 1.12.1+cu116
- Datasets: 2.4.0
- Tokenizers: 0.12.1
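To compare a local environment against the versions above, a small standard-library check can help. This is a convenience sketch, not part of SetFit. The PyPI distribution names (for example `torch` for PyTorch) and the helper `find_version_mismatches` are assumptions made for illustration. The Python interpreter version is not checked.

```python
from importlib import metadata

# Versions restated from the list above, keyed by PyPI distribution name.
EXPECTED_VERSIONS = {
    "setfit": "1.0.3",
    "sentence-transformers": "2.2.2",
    "transformers": "4.21.3",
    "torch": "1.12.1+cu116",
    "datasets": "2.4.0",
    "tokenizers": "0.12.1",
}


def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for packages that differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in find_version_mismatches(EXPECTED_VERSIONS).items():
    print(f"{package}: expected {wanted}, found {installed}")
```

A mismatch does not necessarily mean the model will fail to load. It only flags where the environment differs from the one used for training.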
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}