SetFit v1.0.0 Migration Guide

To update your code to work with v1.0.0, the following changes must be made:

General Migration Guide

  1. The keep_body_frozen argument of SetFitModel.unfreeze has been deprecated: pass "head" or "body" to unfreeze only that component, or no arguments to unfreeze both. See the sketch after this list.
  2. SupConLoss has been moved from setfit.modeling to setfit.losses. If you previously imported it with from setfit.modeling import SupConLoss, import it with from setfit import SupConLoss instead.
  3. use_auth_token has been renamed to token in SetFitModel.from_pretrained(). use_auth_token will keep working until the next major version, but with a warning.
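A minimal sketch of these three changes together. The model identifier and token value below are placeholders, and unfreeze only applies to models loaded with a differentiable head:

```python
from setfit import SetFitModel, SupConLoss  # previously: from setfit.modeling import SupConLoss

# Placeholder model identifier and token value.
model = SetFitModel.from_pretrained(
    "BAAI/bge-small-en-v1.5",
    use_differentiable_head=True,  # freeze/unfreeze apply to the differentiable head
    token="hf_...",                # previously: use_auth_token="hf_..."
)

# Unfreeze one component by name, or both with no arguments.
model.unfreeze("head")  # previously: model.unfreeze("head", keep_body_frozen=True)
model.unfreeze()        # unfreezes both the body and the head
```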

Training Migration Guide

  1. Replace all uses of SetFitTrainer with Trainer, and all uses of DistillationSetFitTrainer with DistillationTrainer.

  2. Remove num_iterations, num_epochs, learning_rate, batch_size, seed, use_amp, warmup_proportion, distance_metric, margin, samples_per_label and loss_class from the Trainer initialization and move them to a TrainingArguments initialization instead. This instance should then be passed to the trainer via the args argument (see the before/after sketch following this list).

    • num_iterations has been deprecated; the number of training steps should now be controlled via num_epochs, max_steps or EarlyStoppingCallback.
    • learning_rate has been split up into body_learning_rate and head_learning_rate.
    • loss_class has been renamed to loss.
  3. Stop providing training arguments like num_epochs directly to Trainer.train: pass a TrainingArguments instance via the args argument instead.

  4. Refactor the multiple trainer.train(), trainer.freeze() and trainer.unfreeze() calls that were previously necessary to train the differentiable head into a single trainer.train() call by setting batch_size and num_epochs on the TrainingArguments dataclass as tuples: the first value in each tuple is used for training the embeddings, and the second for training the classifier.
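A hedged before/after sketch of these changes. The model identifier is a placeholder, and train_dataset is assumed to be defined elsewhere:

```python
from setfit import SetFitModel, SupConLoss, Trainer, TrainingArguments  # previously: SetFitTrainer

model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5")

# Previously: SetFitTrainer(model=model, train_dataset=train_dataset, num_epochs=1, ...)
args = TrainingArguments(
    num_epochs=1,
    batch_size=16,
    body_learning_rate=2e-5,  # previously a single learning_rate argument
    head_learning_rate=1e-2,
    loss=SupConLoss,          # previously: loss_class=SupConLoss
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # assumed to be defined elsewhere
)
trainer.train()  # previously: trainer.train(num_epochs=1, ...)

# With a differentiable head, tuples replace the old freeze/unfreeze/train sequence:
# the first value applies to embedding training, the second to classifier training.
args = TrainingArguments(batch_size=(16, 2), num_epochs=(1, 16))
```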

Hard deprecations

  • SetFitBaseModel, SKLearnWrapper and SetFitPipeline have been removed. These can no longer be used starting from v1.0.0.

v1.0.0 Changelog

This list contains new functionality that can be used starting from v1.0.0.

  • SetFitModel.from_pretrained() now accepts new arguments:

    • device: Specifies the device on which to load the SetFit model.
    • labels: Specify labels corresponding to the training labels; useful if the training labels are integers ranging from 0 to num_classes - 1. These are automatically applied when calling SetFitModel.predict().
    • model_card_data: Provide a SetFitModelCardData instance storing data such as model language, license, dataset name, etc. to be used in the automatically generated model cards.
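For illustration, a sketch using all three new arguments. The model identifier, labels and model card metadata below are placeholder values:

```python
from setfit import SetFitModel, SetFitModelCardData

model = SetFitModel.from_pretrained(
    "BAAI/bge-small-en-v1.5",
    device="cuda:0",                  # or "cpu", "mps", etc.
    labels=["negative", "positive"],  # integer predictions are mapped to these strings
    model_card_data=SetFitModelCardData(
        language="en",
        license="apache-2.0",
        dataset_name="sst2",
    ),
)
```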
  • Certain SetFit configuration options, such as the new labels argument from SetFitModel.from_pretrained(), now get saved in config_setfit.json files when a model is saved. This allows labels to be automatically fetched when a model is loaded.

  • SetFitModel.predict() now accepts new arguments:

    • batch_size (defaults to 32): The batch size to use in encoding the sentences to embeddings. Higher often means faster processing but higher memory usage.
    • use_labels (defaults to True): Whether to use the SetFitModel.labels to convert integer labels to string labels. Not used if the training labels are already strings.
  • SetFitModel.encode() has been introduced to convert input sentences to embeddings using the SentenceTransformer body.

  • SetFitModel.device has been introduced to determine the device of the model.
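A short sketch of these additions together, reusing the model from above (the input sentences are placeholders):

```python
# Integer predictions are converted to model.labels when use_labels=True.
preds = model.predict(
    ["That was awesome!", "Never again."],
    batch_size=64,    # larger batches are often faster at the cost of memory
    use_labels=True,
)

# Embed sentences with the SentenceTransformer body.
embeddings = model.encode(["That was awesome!"])

# Inspect which device the model lives on.
print(model.device)
```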

  • AbsaTrainer and AbsaModel have been introduced for applying SetFit for Aspect Based Sentiment Analysis.
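A brief sketch of the ABSA entry points. The model identifiers below are examples; refer to the SetFit ABSA documentation for full usage:

```python
from setfit import AbsaModel

# An AbsaModel combines an aspect extraction model with a polarity model.
model = AbsaModel.from_pretrained(
    "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-aspect",
    "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity",
)
preds = model.predict("The food was great, but the service was slow.")
```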

  • Trainer now supports a callbacks argument for a list of transformers TrainerCallback instances.
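For example, a sketch passing transformers' EarlyStoppingCallback, assuming an evaluation strategy is configured so that evaluation metrics are produced during training:

```python
from transformers import EarlyStoppingCallback

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,  # assumed to be defined elsewhere
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```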

  • Trainer.evaluate() now works with string labels.

  • An updated contrastive pair sampler increases the variety of training pairs.

  • TrainingArguments supports various new arguments (a combined example follows this list):

    • output_dir: The output directory where the model predictions and checkpoints will be written.

    • max_steps: If set to a positive number, the total number of training steps to perform. Overrides num_epochs. The training may stop before reaching the set number of steps when all data is exhausted.

    • sampling_strategy: The sampling strategy of how to draw pairs in training. Possible values are:

      • "oversampling": Draws even number of positive/negative sentence pairs until every sentence pair has been drawn.
      • "undersampling": Draws the minimum number of positive/negative sentence pairs until every sentence pair in the minority class has been drawn.
      • "unique": Draws every sentence pair combination (likely resulting in unbalanced number of positive/negative sentence pairs).

      The default is set to "oversampling", ensuring all sentence pairs are drawn at least once. Alternatively, setting num_iterations will override this argument and determine the number of generated sentence pairs.

    • report_to: The list of integrations to report the results and logs to. Supported platforms are "azure_ml", "comet_ml", "mlflow", "neptune", "tensorboard", "clearml" and "wandb". Use "all" to report to all integrations installed, "none" for no integrations.

    • run_name: A descriptor for the run. Typically used for wandb and mlflow logging.

    • logging_strategy: The logging strategy to adopt during training. Possible values are:

      • "no": No logging is done during training.
      • "epoch": Logging is done at the end of each epoch.
      • "steps": Logging is done every logging_steps.
    • logging_first_step: Whether to log and evaluate the first global_step or not.

    • logging_steps: Number of update steps between two logs if logging_strategy="steps".

    • evaluation_strategy: The evaluation strategy to adopt during training. Possible values are:

      • "no": No evaluation is done during training.
      • "steps": Evaluation is done (and logged) every eval_steps.
      • "epoch": Evaluation is done at the end of each epoch.
    • eval_steps: Number of update steps between two evaluations if evaluation_strategy="steps". Will default to the same as logging_steps if not set.

    • eval_delay: Number of epochs or steps to wait for before the first evaluation can be performed, depending on the evaluation_strategy.

    • eval_max_steps: If set to a positive number, the total number of evaluation steps to perform. The evaluation may stop before reaching the set number of steps when all data is exhausted.

    • save_strategy: The checkpoint save strategy to adopt during training. Possible values are:

      • "no": No save is done during training.
      • "epoch": Save is done at the end of each epoch.
      • "steps": Save is done every save_steps.
    • save_steps: Number of update steps between two checkpoint saves if save_strategy="steps".

    • save_total_limit: If a value is passed, will limit the total number of checkpoints, deleting the older checkpoints in output_dir. Note that the best model is always preserved if the evaluation_strategy is not "no".

    • load_best_model_at_end: Whether or not to load the best model found during training at the end of training.

      When set to True, save_strategy needs to be the same as evaluation_strategy, and in the case it is "steps", save_steps must be a round multiple of eval_steps.
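Putting several of these together (all values below are illustrative only):

```python
from setfit import TrainingArguments

args = TrainingArguments(
    output_dir="setfit-checkpoints",
    sampling_strategy="oversampling",  # default; also "undersampling" or "unique"
    max_steps=1000,                    # overrides num_epochs when positive
    logging_strategy="steps",
    logging_steps=50,
    evaluation_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=100,                    # a round multiple of eval_steps
    save_total_limit=2,
    load_best_model_at_end=True,       # requires matching save/evaluation strategies
    report_to="none",
)
```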

  • Pushing SetFit or SetFitABSA models to the Hub with SetFitModel.push_to_hub() or AbsaModel.push_to_hub() now results in a detailed model card. As an example, see this SetFitModel or this SetFitABSA polarity model.