Commit f124232 (1 parent: 8993f59). Showing 52 changed files with 3,879 additions and 3,496 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,29 +1,39 @@ | ||
--- | ||
title: 'Clustered data' | ||
|
||
--- | ||
|
||
# Non-independence {#clustering} | ||
|
||
Psychological data often contains natural *groupings*. In intervention research, multiple patients may be treated by individual therapists, or children taught within classes, which are further nested within schools; in experimental research participants may respond on multiple occasions to a variety of stimuli. | ||
|
||
Although disparate in nature, these groupings share a common characteristic: they induce *dependency* between the observations we make. That is, our data points are *not independently sampled* from one another. | ||
Psychological data often contains natural _groupings_. In intervention research, | ||
multiple patients may be treated by individual therapists, or children taught | ||
within classes, which are further nested within schools; in experimental | ||
research participants may respond on multiple occasions to a variety of stimuli. | ||
|
||
What this means is that observations *within* a particular grouping will tend, all other things being equal, to be more alike than those from a different group. | ||
Although disparate in nature, these groupings share a common characteristic: | ||
they induce _dependency_ between the observations we make. That is, our data | ||
points are _not independently sampled_ from one another. | ||
|
||
What this means is that observations _within_ a particular grouping will tend, | ||
all other things being equal, to be more alike than those from a different group. | ||
|
||
#### Why does this matter? {-} | ||
|
||
Think of the last quantitative experiment you read about. If you were the author of that study, and were offered 10 additional datapoints for 'free', which would you choose: | ||
Think of the last quantitative experiment you read about. If you were the author | ||
of that study, and were offered 10 additional datapoints for 'free', which would | ||
you choose: | ||
|
||
1. 10 extra datapoints from existing participants. | ||
2. 10 data points from 10 new participants. | ||
|
||
In general you will gain more *new information* from data from a new | ||
participant. Intuitively we know this is correct because an extra observation from | ||
someone we have already studied is *less likely to surprise us* or be | ||
In general you will gain more _new information_ from data from a new | ||
participant. Intuitively we know this is correct because an extra observation | ||
from someone we have already studied is _less likely to surprise us_ or be | ||
different from the data we already have than an observation from a new | ||
participant. | ||
|
||
Most traditional statistical models assume that data *are* sampled independently, however. And the precision of the inferences we can draw from statistical models is based on the *amount of information we have available*. This means that if we violate this assumption of independent sampling we will trick our model into thinking we have more information than we really do, and our inferences may be wrong. | ||
|
||
Most traditional statistical models assume that data _are_ sampled independently, | ||
however. And the precision of the inferences we can draw from statistical | ||
models is based on the _amount of information we have available_. This means | ||
that if we violate this assumption of independent sampling we will trick our | ||
model into thinking we have more information than we really do, and our | ||
inferences may be wrong. |
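
The final paragraph of the diff claims that ignoring clustering makes a model overconfident. The following is a minimal simulation sketch of that claim and is not part of the committed chapter. It uses only Python's standard library, and every setting in it is an assumption chosen for illustration: 10 hypothetical participants, 10 observations each, and between- and within-participant standard deviations of 1. It compares the naive standard error of the grand mean, which treats all 100 observations as independent, with the spread of the grand mean actually observed across repeated simulated studies.

```python
import random
import statistics


def simulate_grand_mean(n_participants, n_obs, sd_between, sd_within, rng):
    """Simulate one clustered dataset; return (grand mean, naive standard error)."""
    observations = []
    for _ in range(n_participants):
        # Each participant has their own underlying level (a random intercept),
        # which makes their observations more alike than those of other people.
        participant_level = rng.gauss(0, sd_between)
        for _ in range(n_obs):
            observations.append(participant_level + rng.gauss(0, sd_within))
    # The naive standard error treats every observation as independent.
    naive_se = statistics.stdev(observations) / len(observations) ** 0.5
    return statistics.mean(observations), naive_se


def compare_standard_errors(n_sims=2000, seed=1):
    """Return (actual SE of the grand mean, average naive SE) over n_sims studies."""
    rng = random.Random(seed)
    results = [simulate_grand_mean(10, 10, 1.0, 1.0, rng) for _ in range(n_sims)]
    means = [mean for mean, _ in results]
    naive_ses = [se for _, se in results]
    # The spread of the grand mean across replications is its actual standard error.
    return statistics.stdev(means), statistics.mean(naive_ses)


actual_se, naive_se = compare_standard_errors()
print(f"actual SE of the grand mean: {actual_se:.3f}")
print(f"average naive SE:            {naive_se:.3f}")
```

Under these assumed settings the naive standard error should come out well below the actual variability of the mean, which is the sense in which the model is "tricked" into thinking it has more information than it does.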