-
Notifications
You must be signed in to change notification settings - Fork 0
/
week8.html
156 lines (151 loc) · 63 KB
/
week8.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Week 8</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="css/bootstrap.min.css" rel="stylesheet">
<link href="css/custom.css" rel="stylesheet">
</head>
<body class="markdown github">
<header class="navbar-inverse navbar-fixed-top">
<div class="container">
<nav role="navigation">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="index.html" class="navbar-brand">J298 Data Journalism</a>
</div> <!-- /.navbar-header -->
<!-- Collect the nav links, forms, and other content for toggling -->
<div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
<ul class="nav navbar-nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Class notes<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="week1.html">What is data?</a></li>
<li><a href="week2.html">Types of stories</a></li>
<li><a href="week3.html">Working with spreadsheets</a></li>
<li><a href="week4.html">Acquiring, cleaning, and formatting data</a></li>
<li><a href="week5.html">R, RStudio, and the tidyverse</a></li>
<li><a href="week6.html">Data journalism in the tidyverse</a></li>
<li><a href="week7.html">Don't let the data lie to you</a></li>
<li><a href="week8.html">Databases and SQL</a></li>
<li><a href="week9.html">Finding stories using maps</a></li>
<li><a href="week10.html">Maps meet databases</a></li>
<li><a href="week11.html">More fun with R</a></li>
<li><a href="week12.html">R practice</a></li>
<li><a href="week13.html">PostGIS practice</a></li>
<li><a href="week14.html">More fun with R</a></li>
</ul>
</li>
<li><a href="software.html">Software</a></li>
<li><a href="datasets.html">Data</a></li> <li><a href="questions.html">If you get stuck</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Email instructors<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="mailto:[email protected]">Peter Aldhous</a></li>
<li><a href="mailto:[email protected]">Amanda Hickman</a></li>
</ul>
</li>
</ul>
</div><!-- /.navbar-collapse -->
</nav>
</div> <!-- /.navbar-header -->
</header>
<div class="container all">
<h2 id="week-8-tuesday-march-6-2018">Week 8 | Tuesday March 6, 2018</h2>
<p><em>Instructor: Amanda Hickman</em></p>
<p>The SQL portion of this lesson plan draws heavily from the Medicare opioid prescription data. If you're interested in reporting on the opioid crisis in the US, I highly recommend Charles Ornstein's guide to <a href="https://docs.google.com/document/d/1NDjd-7I2GlENPfRgMi5cT63Q6KRzWsvTHohbXEStnHg/edit">Covering Opioids with Data</a>. Unfortunately, this is a story with a lot of angles an endless opportunities to report.</p>
<h1 id="learning-to-love-the-terminal">Learning to Love the Terminal</h1>
<p>We're going to use a few command line tools. </p>
<h2 id="looking-better">Looking Better</h2>
<p>Before we do more, go ahead and look under <code>Terminal > Preferences</code> and set the default to "Homebrew" -- you can fuss with that on your own later if you'd like a different color scheme.</p>
<p>I also want you to add "open in terminal" to your finder. Right click in the finder and look for <code>services > New Terminal</code> check box.</p>
<h3 id="bash-term-console-huh-">Bash, Term, Console, Huh?</h3>
<p>Some vocabulary: <code>Terminal</code> is an OSX app that provides you direct access the command line. <code>Command Line Interface</code> or <code>cli</code> is the interface you see in the terminal. The <code>shell</code> is the command line interpreter -- there are a few different shells, but <code>bash</code> is the most common. The <code>console</code> is a hardware interface -- the physical machine, but people sometimes use the term to apply generically to terminal applications. The <code>prompt</code> is the actual line in your shell that is waiting for input. Most of the time by default that is something like <code>$</code> -- mine is <code>amanda@mona:~$</code></p>
<p>Technically this is accurate: Terminal runs a shell, such as bash, that gives you access to a command line interface where you can use the command line to run commands by typing them at the command prompt.</p>
<p>Functionally, the terms are more or less interchangeable.</p>
<p><a href="https://askubuntu.com/questions/506510/what-is-the-difference-between-terminal-console-shell-and-command-line">More on that.</a>.</p>
<h2 id="for-starters">For Starters</h2>
<p>The <a href="http://csvkit.readthedocs.org/en/latest/install.html">csvkit installation instructions</a> are probably way over your head. If they aren't, go ahead and install csvkit and walk through <a href="https://csvkit.readthedocs.io/en/1.0.2/tutorial/1_getting_started.html#about-this-tutorial">their tutorial</a>.</p>
<p>We're going to try a few commands, but I also want to flag that it is very possible to make a terrible mess using the terminal. For instance this simple command: <code>rm -r /</code> will ... wipe the contents of you r entire computer. So You shouldn't run anything at the terminal without knowing what it is?</p>
<ol>
<li>Open Terminal. Look at what you see there.</li>
<li>Try running <code>pwd</code> -- note that when I say "run <code>pwd</code>" what I mean is "type the command <code>pwd</code> at the prompt and hit <kbd>enter</kbd>". That command, "print working directory" will tell you what directory you are currently in.</li>
<li>Try running <code>cd</code>. And then run <code>pwd</code> again.</li>
<li>Type <code>cd Dow</code> and use tab completion to fill in the rest: it should populate with <code>Downloads</code>. Run it by hitting <kbd>enter</kbd>.</li>
<li>Try running <code>ls</code> and <code>ls -al</code> -- <code>-a</code> and <code>-l</code> are "options" that extend the <code>ls</code> command.</li>
<li>Run <code>man cat</code> -- what does the <code>cat</code> command do? If you run <code>cat</code> by itself it is going to hang, waiting for you to tell it what to concatenate. Use <kbd>ctrl</kbd><kbd>c</kbd> to escape.</li>
<li>When you ran <code>ls</code> did you see any <code>csv</code> files? Try printing one to the screen with <code>cat {filename}.csv</code> -- <code>{filename}.csv</code> is an argument that extends the <code>cat</code> command. The curly braces are a convention you'll see a fair bit, they indicate that you need to fill in the {generic term} with a real term. You probably don't have anything called <code>filename.csv</code> but you have a few things that have a filename.</li>
</ol>
<p>We're going to stop there because all we really need right now is access to the command line in general, and a foolproof way to figure out what directory we're in. As I said when we started out, there are tools you can use at the command line to trash your machine, so if someone tells you to "just run this command" you should always make sure you understand what it is doing. <code>man {command}</code> is your friend here.</p>
<p>Noah Veltman has an <a href="https://github.com/veltman/clmystery/blob/master/cheatsheet.md">excellent resource</a> for learning your way around the command line, or this <a href="https://github.com/chrislkeller/nicar15-command-line-basics">2015 NICAR Workshop</a> is a great place to start.</p>
<h2 id="installing-csvkit">Installing CSVkit</h2>
<p>One of the reasons we're doing this is to make <code>csvkit</code> a bit more accessible.
I wrote a <a href="https://github.com/amandabee/CUNY-SOJ-data-storytelling/wiki/Tutorial:-Installing-CSVKit">tutorial on installing CSVkit</a> that gives you a bunch of different options. I've heard that's a great place to start. From there you can <a href="https://github.com/amandabee/CUNY-data-storytelling/wiki/Tutorial:-Using-CSVkit">use</a> csvkit.</p>
<h1 id="using-posgres">Using Posgres</h1>
<p>There are a lot of reasons I want you to get comfortable with Postgres. Last week we talked a little bit about generating CREATE statements and I want you to be able to do that, but you're also going to find a lot of situations where facility with Postgres will make a big difference. It opens the door to some geographic queries that are quite powerful.</p>
<p>Postico does have <a href="https://eggerapps.at/postico/docs/v1.3.2/keyboard-shortcuts.html">keyboard shortcuts</a>.</p>
<p>I keep a running list of <a href="https://github.com/amandabee/CUNY-data-storytelling/wiki/Tip-Sheet:-SQL">SQL commands</a> that I wind up showing to students often.</p>
<p>I have a few conventions that I use. The <a href="https://www.postgresql.org/docs/9.5/static/sql-syntax.html">Postgres manual</a> is dense but incredibly helpful if you find yourself wondering <em>why</em> something is the way it is.</p>
<h3 id="-create-database-https-www-postgresql-org-docs-9-5-static-sql-createdatabase-html-"><a href="https://www.postgresql.org/docs/9.5/static/sql-createdatabase.html">CREATE DATABASE</a></h3>
<p>You probably want to create a new database for each project, just to stay organized. So go ahead and run <code>CREATE DATABASE week8</code> (you may need something closer to <code>CREATE DATABASE week8 WITH OWNER = amanda;</code>).</p>
<p>Everything we do tonight we're going to do in that Week8 database.</p>
<h3 id="-create-https-www-postgresql-org-docs-9-5-static-sql-createtable-html-"><a href="https://www.postgresql.org/docs/9.5/static/sql-createtable.html">CREATE</a></h3>
<p>We're going to start by pulling some of the data Peter had you working with into Postgres -- this should give you a nice sense of how the tools compare.</p>
<p>One thing to keep in mind: R is super efficient about referring back to your base data. If you start generating tables from old tables, you are going to duplicate data.</p>
<p>I used <code>csvkit</code> to generate my create statements: <code>csvsql *.csv</code> -- it took about 40 seconds to run through all of it.</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--meta syntax--create syntax--sql"><span class="syntax--keyword syntax--other syntax--create syntax--sql"><span>CREATE</span></span><span> </span><span class="syntax--keyword syntax--other syntax--sql"><span>TABLE</span></span><span> </span><span class="syntax--entity syntax--name syntax--function syntax--sql"><span>ca_discipline</span></span></span><span> (</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> alert_date </span><span class="syntax--storage syntax--type syntax--sql"><span>DATE</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> last_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>19</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> first_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>14</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> middle_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>19</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> name_suffix </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>33</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> city </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>22</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> state </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>11</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> license </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>9</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> action_type </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>62</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> action_date </span><span class="syntax--storage syntax--type syntax--sql"><span>DATE</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span>);</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--meta syntax--create syntax--sql"><span class="syntax--keyword syntax--other syntax--create syntax--sql"><span>CREATE</span></span><span> </span><span class="syntax--keyword syntax--other syntax--sql"><span>TABLE</span></span><span> </span><span class="syntax--entity syntax--name syntax--function syntax--sql"><span>ca_medicare_opioids</span></span></span><span> (</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> npi </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> nppes_provider_last_org_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>44</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> nppes_provider_first_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>20</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> nppes_provider_city </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>22</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> nppes_provider_state </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>2</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> specialty_description </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>62</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> description_flag </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>1</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> drug_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>30</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> generic_name </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>30</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> bene_count </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_claim_count </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_30_day_fill_count </span><span class="syntax--storage syntax--type syntax--sql"><span>FLOAT</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_day_supply </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_drug_cost </span><span class="syntax--storage syntax--type syntax--sql"><span>FLOAT</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> bene_count_ge65 </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> bene_count_ge65_suppress_flag </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>4</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_claim_count_ge65 </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> ge65_suppress_flag </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>4</span></span><span>)</span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_30_day_fill_count_ge65 </span><span class="syntax--storage syntax--type syntax--sql"><span>FLOAT</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_day_supply_ge65 </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> total_drug_cost_ge65 </span><span class="syntax--storage syntax--type syntax--sql"><span>FLOAT</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> year </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span>);</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--meta syntax--create syntax--sql"><span class="syntax--keyword syntax--other syntax--create syntax--sql"><span>CREATE</span></span><span> </span><span class="syntax--keyword syntax--other syntax--sql"><span>TABLE</span></span><span> </span><span class="syntax--entity syntax--name syntax--function syntax--sql"><span>npi_license</span></span></span><span> (</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> npi </span><span class="syntax--storage syntax--type syntax--sql"><span>INTEGER</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> plicnum </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>20</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span><span>,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> license </span><span class="syntax--storage syntax--type syntax--sql"><span>VARCHAR</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>22</span></span><span>)</span><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>NOT NULL</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span>);</span></span></div></pre>
<p>We can just run with this since we aren't going to add new data. Our next project is to load in the data. Last week we used this command to load <code>courses.csv</code> into the <code>courses</code> table:</p>
<p><code>COPY courses from '/path/to/courses.csv' DELIMITER ',' CSV HEADER;</code></p>
<p>How do you think we should populate the <code>ca_discipline</code> table? First, try running <code>head ca_discipline.csv</code> in your terminal. Does this file have a header row? What is the delimeter?</p>
<p>You probably got an error:</p>
<pre class="editor-colors lang-"><div class="line"><span class="syntax--text syntax--plain syntax--null-grammar"><span>ERROR: value too long for type character varying(11)</span></span></div><div class="line"><span class="syntax--text syntax--plain syntax--null-grammar"><span>CONTEXT: COPY ca_discipline, line 1314, column state: " South Korea"</span></span></div></pre><p>I used <code>csvkit</code> to see what was going on: <code>csvstat -c "state" ca_discipline.csv</code>.</p>
<p>I can use an <a href="https://www.postgresql.org/docs/9.5/static/sql-altertable.html">ALTER</a> statement to make more room.</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--meta syntax--alter syntax--sql"><span class="syntax--keyword syntax--other syntax--create syntax--sql"><span>ALTER</span></span><span> </span><span class="syntax--keyword syntax--other syntax--table syntax--sql"><span>TABLE</span></span><span> </span><span class="syntax--entity syntax--name syntax--function syntax--sql"><span>ca_discipline</span></span></span><span> ALTER COLUMN state TYPE </span><span class="syntax--storage syntax--type syntax--sql"><span>varchar</span></span><span>(</span><span class="syntax--constant syntax--numeric syntax--sql"><span>15</span></span><span>)</span></span></div></pre>
<p>And then let's try the <code>COPY</code> statement again. And once all the data is in there, we need to clean it up.</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT DISTINCT</span></span><span> state </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--comment syntax--block syntax--sql"><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>/*</span></span><span> That's a little disordered.</span><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>*/</span></span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT DISTINCT</span></span><span> state </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> state;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--comment syntax--block syntax--sql"><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>/*</span></span><span> Better. Just out of curiousity, how are these distributed? </span><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>*/</span></span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT DISTINCT</span></span><span> state, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> state;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--comment syntax--block syntax--sql"><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>/*</span></span><span> Ok. So now we need to trim those spaces. </span><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>*/</span></span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>UPDATE</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SET</span></span><span> state</span><span class="syntax--keyword syntax--operator syntax--comparison syntax--sql"><span>=</span></span><span class="syntax--support syntax--function syntax--string syntax--sql"><span>trim</span></span><span>(state);</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT DISTINCT</span></span><span> state, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> state </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> state;</span></span></div></pre>
<h3 id="-select-https-www-postgresql-org-docs-9-5-static-sql-select-html-"><a href="https://www.postgresql.org/docs/9.5/static/sql-select.html">SELECT</a></h3>
<p>We just tried some SELECT statements to examine the "State" column. We added in a <a href="">COUNT</a> function and two clauses, <a href="https://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUP">GROUP BY</a> and <a href="https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-ORDERBY">ORDER BY</a></p>
<p>The next thing we did in R was create a smaller view with just a few cities, so let's resolve our query to look at what cities exist in the data:</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT DISTINCT</span></span><span> city, state, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> city, state </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> city, state;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--comment syntax--block syntax--sql"><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>/*</span></span><span> and add a WHERE clause to see just a few Alameda County cities </span><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>*/</span></span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT DISTINCT</span></span><span> city, state, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>WHERE</span></span><span> city </span><span class="syntax--keyword syntax--other syntax--data-integrity syntax--sql"><span>IN</span></span><span> (</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Kensington</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Piedmont</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Berkeley</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Oakland</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Alameda</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>)</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> city, state </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> city, state;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> action_type, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> action_type;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--comment syntax--block syntax--sql"><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>/*</span></span><span> I like to know I'm hitting everything first </span><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>*/</span></span></span></span></div></pre>
<p>So: if we want to SELECT only those entries where the license was revoked, what are we going to do? I want everyone to sketch out what you think it is going to be, in <a href="https://public.etherpad-mozilla.org/p/J298">Etherpad</a>. And then try running it.</p>
<p><a href="https://public.etherpad-mozilla.org/p/J298">https://public.etherpad-mozilla.org/p/J298</a></p>
<p>Then see if you can make these two queries happen:</p>
<ul>
<li><p>Filter the ca_discipline data to show licenses Revoked for doctors based in Los Angeles, with the most recent first.</p>
</li>
<li><p>Filter the data to show licenses Suspended or Revoked for doctors in Los Angeles or San Diego. Sort the results by Doctor's names.</p>
</li>
</ul>
<p>Note: I find it a bit easier to compose my scripts separately in a <a href="https://beebom.com/best-text-editors-for-mac/">text editor</a> -- <a href="https://www.sublimetext.com/docs/3/linux_repositories.html">Sublime</a> and <a href="https://flight-manual.atom.io/getting-started/sections/installing-atom/#platform-mac">Atom</a> are both good bets. TextEdit is actually a lightweight word processor. (One good clue is that you can tweak the fonts in TextEdit) but you can use it as a text editor if you switch to Plain Text mode. (It's under format for any one document, or under preferences for the software as a whole.)</p>
<h4 id="-views-https-www-postgresql-org-docs-9-5-static-sql-createview-html-"><a href="https://www.postgresql.org/docs/9.5/static/sql-createview.html">VIEWS</a></h4>
<p>One of the ways that SQL is a little less efficient than R is that it doesn't do a great job of caching views. So sometimes it makes sense to create a new table with a subset of the data, while other times <a href="https://www.postgresql.org/docs/9.5/static/sql-createview.html">CREATE VIEW</a></p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span>CREATE TEMP VIEW ca_discipline_local_revoked </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> </span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>WHERE</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--constant syntax--other syntax--database-name syntax--sql"><span>ca_discipline</span></span><span>.</span><span class="syntax--constant syntax--other syntax--table-name syntax--sql"><span>city</span></span><span> </span><span class="syntax--keyword syntax--other syntax--data-integrity syntax--sql"><span>IN</span></span><span> (</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Alameda</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Albany</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Berkeley</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Dublin</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Emeryville</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Fremont</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Hayward</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Kensington</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Livermore</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Newark</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Oakland</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Piedmont</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Pleasanton</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>San Leandro</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>,</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Union City</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>)</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>AND</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--constant syntax--other syntax--database-name syntax--sql"><span>ca_discipline</span></span><span>.</span><span class="syntax--constant syntax--other syntax--table-name syntax--sql"><span>action_type</span></span><span> </span><span class="syntax--keyword syntax--operator syntax--comparison syntax--sql"><span>=</span></span><span> </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Revoked</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--comment syntax--block syntax--sql"><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>/*</span></span><span> Note that this assumes my database is called "public" </span><span class="syntax--punctuation syntax--definition syntax--comment syntax--sql"><span>*/</span></span></span></span></div></pre>
<p>Working with <a href="https://www.postgresql.org/docs/9.5/static/functions-datetime.html">dates</a>:</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> date_part(</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>year</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, alert_date) </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> year, date_part(</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>month</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, alert_date) </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> month, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>)</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline_local_revoked</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> date_part(</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>year</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, alert_date), date_part(</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>month</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, alert_date)</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> year, month;</span></span></div></pre>
<p>So let's think about how we might go about looking at how many licenses are revoked in CA each year. Sketch out what you think it is going to be, in a text editor. Drop your query in <a href="https://public.etherpad-mozilla.org/p/J298">Etherpad</a>. And then try running it.</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> date_part(</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>year</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, alert_date) </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> year, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>)</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>WHERE</span></span><span> action_type </span><span class="syntax--keyword syntax--operator syntax--comparison syntax--sql"><span>=</span></span><span> </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>Revoked</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> date_part(</span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>year</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span>, alert_date)</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> year;</span></span></div></pre>
<p>And if you want to start playing with fentanyl prescriptions:</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> generic_name, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_medicare_opioids</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>WHERE</span></span><span> generic_name </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>LIKE</span></span><span> </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>%FENTANYL%</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> generic_name;</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span></span></div><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> year, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>count</span></span><span>(</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>) </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_medicare_opioids</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>WHERE</span></span><span> generic_name </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>LIKE</span></span><span> </span><span class="syntax--string syntax--quoted syntax--single syntax--sql"><span class="syntax--punctuation syntax--definition syntax--string syntax--begin syntax--sql"><span>'</span></span><span>%FENTANYL%</span><span class="syntax--punctuation syntax--definition syntax--string syntax--end syntax--sql"><span>'</span></span></span><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> year </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> year;</span></span></div></pre>
<h4 id="don-t-just-copy-and-paste-">Don't just copy and paste.</h4>
<p>Create a summary, showing the number of opioid prescriptions written by each doctor, the total cost of the opioids prescribed, and the cost per claim.</p>
<p>Sketch out how you think you would do it. Drop your query in etherpad, and try running it.</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> npi, nppes_provider_last_org_name, nppes_provider_first_name, nppes_provider_city,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> specialty_description, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>SUM</span></span><span>(total_claim_count) </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> prescriptions,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>SUM</span></span><span>(total_drug_cost) </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> costs, </span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>SUM</span></span><span>(total_drug_cost)</span><span class="syntax--keyword syntax--operator syntax--math syntax--sql"><span>/</span></span><span class="syntax--support syntax--function syntax--aggregate syntax--sql"><span>SUM</span></span><span>(total_claim_count) </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> cost_per_prescription</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_medicare_opioids</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>GROUP BY</span></span><span> npi, nppes_provider_last_org_name, nppes_provider_first_name, nppes_provider_city,</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> specialty_description</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>ORDER BY</span></span><span> prescriptions </span><span class="syntax--keyword syntax--other syntax--order syntax--sql"><span>DESC</span></span></span></div></pre>
<p>What if you just wanted to look for Fentanyl prescriptions? Try it and share your query. </p>
<h3 id="-join-https-www-postgresql-org-docs-9-5-static-queries-table-expressions-html-queries-join-"><a href="https://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-JOIN">JOIN</a></h3>
<p>We played with this a bit last week, and I know that you did some joins Week 6 as well. So I wanted to translate those queries into SQL so we can look at what it looks like to do this in SQL.</p>
<pre class="editor-colors lang-sql"><div class="line"><span class="syntax--source syntax--sql"><span>CREATE TEMP VIEW ca_discipline_npi</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--alias syntax--sql"><span>AS</span></span><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>SELECT</span></span><span> ca_discipline.</span><span class="syntax--keyword syntax--operator syntax--star syntax--sql"><span>*</span></span><span>, </span><span class="syntax--constant syntax--other syntax--database-name syntax--sql"><span>npi_license</span></span><span>.</span><span class="syntax--constant syntax--other syntax--table-name syntax--sql"><span>npi</span></span><span>, </span><span class="syntax--constant syntax--other syntax--database-name syntax--sql"><span>npi_license</span></span><span>.</span><span class="syntax--constant syntax--other syntax--table-name syntax--sql"><span>plicnum</span></span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>FROM</span></span><span> ca_discipline </span><span class="syntax--keyword syntax--other syntax--DML syntax--sql"><span>LEFT JOIN</span></span><span> npi_license</span></span></div><div class="line"><span class="syntax--source syntax--sql"><span> </span><span class="syntax--keyword syntax--other syntax--DDL syntax--create syntax--II syntax--sql"><span>ON</span></span><span> </span><span class="syntax--constant syntax--other syntax--database-name syntax--sql"><span>npi_license</span></span><span>.</span><span class="syntax--constant syntax--other syntax--table-name syntax--sql"><span>license</span></span><span> </span><span class="syntax--keyword syntax--operator syntax--comparison syntax--sql"><span>=</span></span><span> </span><span class="syntax--constant syntax--other syntax--database-name syntax--sql"><span>ca_discipline</span></span><span>.</span><span class="syntax--constant syntax--other syntax--table-name syntax--sql"><span>license</span></span></span></div></pre>
<p><a href="https://blog.codinghorror.com/a-visual-explanation-of-sql-joins/">https://blog.codinghorror.com/a-visual-explanation-of-sql-joins/</a></p>
<h3 id="-update-https-www-postgresql-org-docs-9-5-static-sql-update-html-"><a href="https://www.postgresql.org/docs/9.5/static/sql-update.html">UPDATE</a></h3>
<p>We actually introduced update above. You can use Postico to manually tweak individual values, but if you want to change values en masse, UPDATE is your friend.</p>
</div> <!-- /.container all -->
<script src="https://code.jquery.com/jquery.min.js"></script>
<script src="js/bootstrap.min.js"></script>
</body>
</html>