Towards Better Understanding of Domain Shift on Linear-Probed Visual Foundation Models
White Paper
Publisher: Software Engineering Institute
Abstract
Visual foundation models have recently emerged to offer promise similar to that of their language counterparts: the ability to produce representations of visual data that can be used successfully across a variety of tasks and contexts. One common way this is demonstrated in the research literature is through "domain generalization" experiments on linear models trained from representations produced by foundation models (i.e., linear probes). These experiments largely limit themselves to a small number of benchmark datasets and report accuracy as the single figure of merit, but give little insight beyond these numbers into how different foundation models represent domain shifts.
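The linear-probe setup described above can be sketched minimally: a frozen embedding is treated as a fixed feature vector, and only a linear classifier is fit on top of it. The sketch below stands in for real foundation-model embeddings with random features, and uses a closed-form ridge-regularized least-squares classifier purely to stay dependency-free; the model names and dimensions are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

# Hypothetical stand-in: in practice X would hold frozen embeddings
# from a visual foundation model (e.g., CLIP or DINOv2), not noise.
rng = np.random.default_rng(0)
n, d, k = 200, 32, 3            # samples, embedding dim, number of classes
X = rng.normal(size=(n, d))     # "frozen" feature vectors
y = rng.integers(0, k, size=n)  # class labels

# Linear probe: fit only a linear map on top of the frozen features.
# Here we use ridge-regularized least squares on one-hot targets
# (a simple substitute for logistic regression, for illustration).
Y = np.eye(k)[y]                # one-hot label matrix, shape (n, k)
lam = 1e-2                      # ridge regularization strength
W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)  # shape (d, k)

# Predict by taking the highest-scoring class per sample.
preds = (X @ W).argmax(axis=1)
train_acc = (preds == y).mean()
```

Evaluating such a probe on a benchmark drawn from a different domain than the training data is what the "domain generalization" experiments above measure.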
In this work we perform an empirical evaluation that expands the scope of previously reported results in order to give a better understanding of how domain shifts are modeled. Namely, we investigate not just how models generalize across domains, but how models may enable domain transfer. Our evaluation spans a number of recent visual foundation models and benchmarks. We find that not only do linear probes fail to generalize on some shift benchmarks, but