Rather than run for fixed number of samples, run until reach specified tolerance on evidence estimate.