arxiv:2310.07838

Towards the Fundamental Limits of Knowledge Transfer over Finite Domains

Published on Oct 11, 2023
Abstract

We characterize the statistical efficiency of knowledge transfer through $n$ samples from a teacher to a probabilistic student classifier with input space $\mathcal{S}$ over labels $\mathcal{A}$. We show that privileged information at three progressive levels accelerates the transfer. At the first level, only samples with hard labels are known, via which the maximum likelihood estimator attains the minimax rate $\sqrt{|\mathcal{S}||\mathcal{A}|/n}$. The second level additionally reveals the teacher's probabilities of the sampled labels, which turns out to boost the convergence rate lower bound to $|\mathcal{S}||\mathcal{A}|/n$. However, under this second data acquisition protocol, minimizing a naive adaptation of the cross-entropy loss results in an asymptotically biased student. We overcome this limitation and achieve the fundamental limit by using a novel empirical variant of the squared error logit loss. The third level further equips the student with the soft labels (complete logits) on $\mathcal{A}$ for every sampled input, provably enabling the student to enjoy a rate $|\mathcal{S}|/n$ free of $|\mathcal{A}|$. We find any Kullback-Leibler divergence minimizer to be optimal in the last case. Numerical simulations distinguish the four learners and corroborate our theory.
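
To make the three data-acquisition protocols concrete, below is a minimal numpy sketch of what each level reveals and how a simple learner can exploit it. This is our illustration under simplifying assumptions (uniform inputs over a finite $\mathcal{S}$), not code from the paper: in particular, the level-2 learner here is a naive plug-in that reuses revealed probabilities, standing in for the paper's squared-error logit loss, and all names (`teacher`, `plugin`, etc.) are hypothetical.

```python
# Three levels of privileged information in teacher-to-student transfer,
# illustrated on a random finite-domain teacher. Not the paper's exact
# estimators; a plug-in stands in for the squared-error logit learner.
import numpy as np

rng = np.random.default_rng(0)
S, A, n = 10, 5, 2000  # |S| inputs, |A| labels, n samples

# Random teacher: row-stochastic matrix, teacher[s, a] = P(label a | input s).
teacher = rng.dirichlet(np.ones(A), size=S)

# Draw n i.i.d. samples: uniform inputs, labels sampled from the teacher.
s_idx = rng.integers(0, S, size=n)
a_idx = np.array([rng.choice(A, p=teacher[s]) for s in s_idx])

# Level 1 -- hard labels only: maximum likelihood = empirical frequencies
# (uniform fallback for inputs never sampled).
counts = np.zeros((S, A))
np.add.at(counts, (s_idx, a_idx), 1.0)
row = counts.sum(axis=1, keepdims=True)
mle = np.where(row > 0, counts / np.maximum(row, 1.0), 1.0 / A)

# Level 2 -- the teacher's probability of each sampled label is revealed:
# once (s, a) is observed, p(a|s) is known exactly. Plug revealed entries
# in and spread the leftover mass uniformly over labels unseen at s.
seen = counts > 0
plugin = np.where(seen, teacher, 0.0)           # revealed entries are exact
miss = 1.0 - plugin.sum(axis=1, keepdims=True)  # leftover probability mass
unseen = np.maximum((~seen).sum(axis=1, keepdims=True), 1)
plugin = plugin + (~seen) * miss / unseen

# Level 3 -- full soft labels: any KL-divergence minimizer reproduces the
# teacher's whole probability vector at every input sampled at least once,
# so only input coverage matters and the rate is free of |A|.
hit = np.zeros(S, dtype=bool)
hit[s_idx] = True
kl_min = np.where(hit[:, None], teacher, 1.0 / A)

for name, est in [("hard-label MLE", mle),
                  ("level-2 plug-in", plugin),
                  ("soft-label KL minimizer", kl_min)]:
    tv = 0.5 * np.abs(est - teacher).sum(axis=1).mean()
    print(f"{name:>24}: mean TV error {tv:.4f}")
```

Running the sketch shows the expected ordering: the hard-label MLE has the largest error, the level-2 plug-in improves on it by exploiting the exact revealed probabilities, and the soft-label learner recovers every sampled row exactly.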

