How to replicate CIFAR100 fine-tuning performance?

#3
by HuBot2020

Has anyone else managed to get decent accuracy with this pretrained model after fine-tuning on CIFAR-100? I've replicated the conditions claimed in the paper (aside from the resolution, which I changed to 324): batch size of 512, a cosine learning-rate scheduler with 10,000 warmup steps, and learning rates varied from 0.0001 to 0.1.
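
Concretely, the setup looks roughly like this (a sketch only: the checkpoint name is a placeholder for this model, and the image preprocessing and Trainer wiring are omitted):

```python
from transformers import AutoModelForImageClassification, TrainingArguments

checkpoint = "google/vit-base-patch16-224-in21k"  # placeholder checkpoint
model = AutoModelForImageClassification.from_pretrained(
    checkpoint,
    num_labels=100,                # CIFAR-100 has 100 classes
    ignore_mismatched_sizes=True,  # swap the pretraining head for a 100-way head
)

args = TrainingArguments(
    output_dir="vit-cifar100",
    per_device_train_batch_size=512,  # effective batch size of 512
    learning_rate=1e-4,               # swept from 1e-4 up to 1e-1
    lr_scheduler_type="cosine",
    warmup_steps=10_000,
    evaluation_strategy="epoch",
)
```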

I perform an 80/20 split on the train data provided on Hugging Face for training/validation, and then use the 'test' portion to measure top-1 accuracy. Validation accuracy reaches 90%, but I consistently get 40% accuracy on the test dataset no matter which parameters I tune.
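
The split itself is just this (assuming the standard `cifar100` dataset on the Hub; the seed is arbitrary):

```python
from datasets import load_dataset

ds = load_dataset("cifar100")
split = ds["train"].train_test_split(test_size=0.2, seed=42)
train_ds, val_ds = split["train"], split["test"]  # 80% train / 20% validation
test_ds = ds["test"]                              # held out for top-1 accuracy
```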

Has anyone managed to fine-tune this pretrained model on CIFAR-100 and gotten consistently high accuracy (80-90%) on the 'test' split of CIFAR-100?

I have the same problem as you, but I found that it works when I change the transformers version to 4.26.0.
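
If you want to try the same fix, a quick sketch to pin and verify the version:

```python
# First pin the library, e.g.:  pip install transformers==4.26.0
import transformers

print(transformers.__version__)  # should report 4.26.0 after the downgrade
```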
