Experimental Designs for Heteroskedastic Variance

Author

Justin Weltz

Published

September 11, 2023

Abstract

Most linear experimental design problems assume homogeneous variance, although heteroskedastic noise is present in many realistic settings. Let a learner have access to a finite set of measurement vectors \(\mathcal{X}\subset \mathbb{R}^d\) that can be probed to receive noisy linear responses of the form \(y=x^{\top}\theta^{\ast}+\eta\). Here \(\theta^{\ast}\in \mathbb{R}^d\) is an unknown parameter vector, and \(\eta\) is independent, mean-zero, \(\sigma_x^2\)-sub-Gaussian noise defined by a flexible heteroskedastic variance model, \(\sigma_x^2 = x^{\top}\Sigma^{\ast}x\). Assuming that \(\Sigma^{\ast}\in \mathbb{R}^{d\times d}\) is an unknown matrix, we propose, analyze, and empirically evaluate a novel design for uniformly bounding the estimation error of the variance parameters \(\sigma_x^2\). We demonstrate the benefits of this method on two adaptive experimental design problems under heteroskedastic noise: fixed-confidence transductive best-arm identification and level-set identification. We also prove the first instance-dependent lower bounds in these settings. Lastly, we construct near-optimal algorithms and show empirically that accounting for heteroskedastic variance in these designs yields large improvements in sample complexity.
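
To make the noise model concrete, the following is a minimal simulation sketch of the response model \(y=x^{\top}\theta^{\ast}+\eta\) with \(\sigma_x^2 = x^{\top}\Sigma^{\ast}x\). It is not the design or algorithm proposed in this work. The function name `simulate_responses`, the dimensions, the specific values of `theta_star` and `Sigma_star`, and the choice of Gaussian noise (one special case of sub-Gaussian noise) are all illustrative assumptions.

```python
import numpy as np

def simulate_responses(X, theta, Sigma, rng):
    """Draw one response y = x^T theta + eta for each row x of X.

    Assumption for illustration: eta is Gaussian with variance x^T Sigma x.
    """
    # Per-row quadratic form x_i^T Sigma x_i.
    variances = np.einsum("ij,jk,ik->i", X, Sigma, X)
    noise = rng.normal(size=X.shape[0]) * np.sqrt(variances)
    return X @ theta + noise, variances

rng = np.random.default_rng(0)
d, n_arms, n_pulls = 3, 5, 20000

# Hypothetical measurement vectors and parameters (not from the source).
X_arms = rng.normal(size=(n_arms, d))
theta_star = np.array([1.0, -0.5, 2.0])
A = rng.normal(size=(d, d))
Sigma_star = A @ A.T  # positive semidefinite, so every variance is nonnegative

# Probe each measurement vector n_pulls times.
X = np.repeat(X_arms, n_pulls, axis=0)
y, _ = simulate_responses(X, theta_star, Sigma_star, rng)

# Compare the model variance of each arm with its sample variance.
arm_var = np.einsum("ij,jk,ik->i", X_arms, Sigma_star, X_arms)
sample_var = y.reshape(n_arms, n_pulls).var(axis=1, ddof=1)
for v_model, v_sample in zip(arm_var, sample_var):
    print(f"model variance {v_model:8.3f}   sample variance {v_sample:8.3f}")
```

In this sketch the noise level differs from one measurement vector to the next, which is why a design that treats all measurements as equally noisy can allocate samples poorly.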

Advisor(s)

Alexander Volfovsky and Eric Laber