\subsubsection{Combining stratum-level risk limits}\label{sec:stratumRisk}
We audit to test the two hypotheses $\{\omega_{w\ell,s} \ge \lambda_s V_{w\ell}\}_{s=1}^2$,
independently for the two strata.
If we reject \emph{both} hypotheses, we conclude that the contest outcome is correct;
otherwise, we manually re-tabulate the contest in one or both strata, depending on the
audit rules.
Those rules matter:
the two audits might need to be conducted to smaller risk limits individually than the desired
risk limit for the contest as a whole.
Recall that the samples are drawn independently from the two strata.
Pick $\alpha_1, \alpha_2 \in (0,\alpha]$.
(Below we discuss the choice further.)
We audit each stratum $s$ to test the hypothesis $\omega_{w\ell,s} \ge \lambda_s V_{w\ell}$
(the overstatement equals or exceeds the tolerable overstatement)
at risk limit $\alpha_s$,
as if it were its own election.
The audits can be conducted at the same time or sequentially; there is no coordination
between the audits unless one of them leads to a full hand count but the other does not:
see below.
How do these two stratum-level ``risk limits'' $\alpha_1$ and
$\alpha_2$ determine the
overall risk that the audit will not correct the outcome if the outcome is wrong?
The overall risk depends on the rule for what we do if the audit in one stratum leads
to a full manual tally of that stratum.
Here are the possibilities. Bear in mind that for the outcome to be wrong,
at least one stratum must have a net overstatement
greater than or equal to its tolerable overstatement:
that is, if $\omega_{w\ell,1} + \omega_{w\ell,2} \ge V_{w\ell}$, then $\omega_{w\ell,1}\ge \lambda_1V_{w\ell}$
or $\omega_{w\ell,2}\ge \lambda_2V_{w\ell}$, or both.
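To spell out this step, assume that $\lambda_1 + \lambda_2 = 1$
(this is taken from the definition of the $\lambda_s$ and is an assumption here, since it is not restated in this section).
The claim is then the contrapositive of
\begin{align*}
\omega_{w\ell,1} < \lambda_1 V_{w\ell} \mbox{ and } \omega_{w\ell,2} < \lambda_2 V_{w\ell}
\;\; &\Longrightarrow \;\;
\omega_{w\ell,1} + \omega_{w\ell,2} < (\lambda_1 + \lambda_2) V_{w\ell} = V_{w\ell}.
\end{align*}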
If the tolerable overstatement is exceeded in only one stratum, $h$, then the chance that the
stratum will be fully hand counted is at least $1-\alpha_h \ge 1- \alpha$.
If both $\omega_{w\ell,1} \ge \lambda_1V_{w\ell}$
and $\omega_{w\ell,2} \ge \lambda_2V_{w\ell}$, then the chance both
are completely tabulated by hand is at least
$(1-\alpha_1)(1-\alpha_2)$, since the audit samples in the two strata are independent.
What should we do if the audit leads to a full tally in one stratum, $h$,
that reveals that indeed its tolerable overstatement has been exceeded,
but the other audit has not led to a full tabulation, because it
has not started, because it is still underway, or because it terminated without
a full hand tally?
We consider two options.
The simpler is to automatically require a full hand count of the other stratum.
If the audit uses this rule, then we can take $\alpha_1 = \alpha_2 = \alpha$,
and the procedure will have risk limit~$\alpha$. However, this rule creates the
possibility of requiring a full hand count in circumstances where it may seem
substantively superfluous. For instance, one can imagine an audit of a statewide
contest in which the tolerable overstatement in no-CVR counties is exceeded,
yet the outcome still could be verified without a full hand count in the CVR counties.
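The risk limit claimed for this first rule can be sketched as follows, assuming that an audit that does not reject its
null hypothesis proceeds to a full hand count of its stratum.
If the outcome is wrong, let $h$ be a stratum with $\omega_{w\ell,h} \ge \lambda_h V_{w\ell}$.
With $\alpha_1 = \alpha_2 = \alpha$,
\begin{align*}
\Pr\{\mbox{full hand count of both strata}\}
&\ge \Pr\{\mbox{full hand count of stratum } h\} \\
&\ge 1 - \alpha_h = 1 - \alpha,
\end{align*}
because a full tally of stratum $h$ reveals that its tolerable overstatement has been exceeded,
which under this rule triggers a full hand count of the other stratum.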
The second approach is to adjust the tolerable overstatement in the other
stratum in light of the known manual tally $A_{w\ell,h}$
in the stratum $h$ that has been fully hand tallied:
we will test against the threshold
$V_{w\ell} - A_{w\ell,h} \equiv \lambda_t' V_{w\ell}$, rather than
the original value $\lambda_t V_{w\ell}$. (Because the overstatement
in stratum $h$ exceeded the tolerable overstatement, the updated tolerable
overstatement in stratum $t$ will be smaller than the original value.)
Then to reject the new null hypothesis in stratum $t$ is to conclude that the
overall outcome is correct.
If and when the hypothesis in stratum $t$ changes, the audit
in that stratum might be able to stop on the basis of the data already observed;
it might need to continue; or---if it had stopped based on the original threshold
$\lambda_t V_{w\ell}$---it might need to examine more ballots, possibly
continuing to a full hand tally.
We will now show in detail that this rule allows the contest to be audited at
risk limit~$\alpha$ by selecting values of~$\alpha_1$ and~$\alpha_2$ that sum to
a bit more than~$\alpha$: specifically, such that $(1-\alpha_1)(1-\alpha_2) \ge 1-\alpha$.
For instance, suppose we want the overall risk limit to be 5\%.
If we use a risk limit of 4\% in the no-CVR stratum and a risk limit of 1.04\% in the CVR stratum,
the overall risk limit is not larger than $1 - (1-\alpha_1)(1-\alpha_2) = 1 - 0.96\times 0.9896 < 0.05$.
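The largest admissible $\alpha_2$ for a given $\alpha_1$ can be worked out directly;
the following is a sketch of the arithmetic, not an additional requirement of the method.
Solving $(1-\alpha_1)(1-\alpha_2) \ge 1-\alpha$ for $\alpha_2$ gives
\begin{align*}
\alpha_2 \le 1 - \frac{1-\alpha}{1-\alpha_1} = \frac{\alpha - \alpha_1}{1-\alpha_1}.
\end{align*}
With $\alpha = 0.05$ and $\alpha_1 = 0.04$, the bound is $0.01/0.96 \approx 0.010417$,
so $\alpha_2 = 0.0104$ is admissible:
$0.96 \times 0.9896 = 0.950016$, and the overall risk limit is at most $1 - 0.950016 = 0.049984$.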
The statistical wrinkle is that adjusting for the manual tally in the hand-counted
stratum $h$
changes the hypothesis being tested in the other stratum $t$
in a way that is itself random:
whether the original null $\omega_{w\ell,t} \ge \lambda_t V_{w\ell}$ is tested
or the new null $\omega_{w\ell,t} \ge \lambda_t' V_{w\ell}$ is tested depends on what the
sample reveals in stratum $h$.
If the hypothesis does change, there is only one value possible for $\lambda_t'$---which
depends on the reported margin $V_{w\ell}$ and the count $A_{w\ell,h}$ in
stratum $h$---but $\lambda_t'$ is unknown until $A_{w\ell,h}$ is known.
We assume that before any data are collected, the audit specifies two families of tests:
for each stratum $s$, a family of level-$\alpha_s$ tests of the null hypothesis that
the overstatement in the stratum is greater than or equal to $c$, for all feasible values of $c$.
That is,
\beq
\Pr \{ \mbox{reject hypothesis that } \omega_{w\ell,s} \ge
c_s || \omega_{w\ell,s} \ge c_s \} \le \alpha_s,
\eeq
for $s = 1, 2$, and all feasible $c_s$.
Moreover, we insist that the test depend on data only from ballots selected from its stratum.
Because the samples in the two strata are independent, for all feasible pairs $c_1, c_2$,
\begin{align} \label{eq:stratum_families}
\Pr\{&\mbox{reject neither hypothesis } \omega_{w\ell,s} \ge c_s, \;\; s=1, 2 ||
\omega_{w\ell,s} \ge c_s \mbox{ for both } s=1, 2 \} \nonumber \\
&= \prod_{s=1}^2 \left( 1 - \Pr \{ \mbox{reject hypothesis that } \omega_{w\ell,s} \ge c_s || \omega_{w\ell,s} \ge c_s \} \right) \nonumber \\
& \ge (1-\alpha_1)(1-\alpha_2).
\end{align}
What is the chance that the audit leads to a full hand tabulation if the outcome is incorrect?
One way the audit can lead to a full hand tally is if it leads to a full count in one stratum,
the null hypothesis in the other stratum is changed, and the audit in the second
stratum then proceeds to a full manual tally.
(There are other ways the audit can lead to a full hand tally, for instance, if neither
null hypothesis is rejected, but this is one way.)
If the outcome is wrong, there is at least one stratum in which the overstatement
$\omega_{w\ell,s}$
exceeds the threshold $\lambda_s V_{w\ell}$.
Let $h$ be one such stratum.
Then the chance the audit in stratum $h$ leads to a full manual tally in that stratum
is at least $(1-\alpha_h)$.
If the audit leads to a full manual tally in stratum~$h$ and the overall outcome is wrong,
then the (new) null hypothesis in the other stratum, $t$, must be true.
If we started to audit that new hypothesis \emph{ab initio}, the chance that we would reject it
would be at most $\alpha_t$, so the chance the audit would lead to a full hand count
of stratum $t$ is at least $1-\alpha_t$.
The question is whether ``changing hypotheses'' could make that chance smaller.
Inequality~(\ref{eq:stratum_families}) shows that it cannot: for any feasible pair of
overstatements, $c = (c_1, c_2)$, if $\omega_{w\ell,1} \ge c_1$ and $\omega_{w\ell,2} \ge c_2$,
the chance that neither the hypothesis $\omega_{w\ell,1} \ge c_1$ nor the hypothesis
$\omega_{w\ell,2} \ge c_2$ will be rejected is at least $(1-\alpha_1)(1-\alpha_2)$.
Therefore, for this procedure, the chance that there will be a full hand count in both strata is at least
$(1-\alpha_1)(1-\alpha_2)$ if the outcome is incorrect,
even if the probability were zero that both of the original audits would proceed to a full hand count.
The overall risk limit is thus not larger than $1 - (1-\alpha_1)(1-\alpha_2)$.
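Expanding the product makes the comparison with a simple union bound explicit
(this restates the bound above and is not a new result):
\begin{align*}
1 - (1-\alpha_1)(1-\alpha_2) = \alpha_1 + \alpha_2 - \alpha_1\alpha_2 \le \alpha_1 + \alpha_2 .
\end{align*}
Hence choosing $\alpha_1 + \alpha_2 \le \alpha$ always suffices,
and the independence of the two samples recovers the small term $\alpha_1\alpha_2$,
which is why the stratum-level risk limits may sum to slightly more than~$\alpha$.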