Commit 76891b0

updated Docker guidance for class next week.
1 parent becd1b4 commit 76891b0

File tree

13 files changed

+634072
-56487
lines changed

.gitignore

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -137,4 +137,6 @@ docker-spark/data/irs990
137137
docker-spark/data/vermont
138138

139139
# ignore draft folder in scripts
140-
docker-spark//scripts/draft
140+
docker-spark//scripts/draft
141+
docker-spark/.DS_Store
142+
docker-spark/data/.DS_Store

README.md

Lines changed: 32 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -37,34 +37,22 @@ Another huge advantage – learning to use Docker will make you a better enginee
3737
## Getting started
3838

3939
1. [Install Docker Desktop](https://www.docker.com/get-started) (Windows users will need to [install WSL-2](windows_wsl2.md).)
40-
2. [Create a Dockerhub account](https://hub.docker.com/signup)
40+
2. [Create a Dockerhub account](https://hub.docker.com/signup) and verify your email.
41+
3. Go to your terminal and run the command `docker login`.
4142
3. [Fork this repo](https://github.com/byuibigdata/docker_guide_streamlit) and then clone your forked version to your desktop.
42-
4. Within the cloned repository, Open a terminal and switch the working directory to one of the two `docker-` directories. For example, using `cd docker-streamlit` will get you into the correct folder for streamlit.
43+
4. Within the cloned repository, open a terminal and switch the working directory to one of the two `docker-` directories.
44+
- For example, using `cd docker-streamlit` will get you into the correct folder for streamlit.
4345
- To use the [jupyter/all-spark-notebook](https://hub.docker.com/r/jupyter/all-spark-notebook) image instead, run `cd docker-spark`.
4446
5. Within the respective `docker-` folder in your terminal you can now run `docker compose up` to take advantage of the `docker-compose.yaml` file within the directory.
4547

46-
4748
_Note that the command line versions require that the full local volume path is specified. We will be able to use relative file paths with the yaml._
4849

49-
## Streamlit App
50-
51-
After opening a terminal in the directory `~/docker-streamlit` and running `docker compose up` you should see action in the containers section of Docker and your terminal.
52-
53-
Now you can open your streamlit app at [http://localhost:8501](http://localhost:8501)
54-
55-
### Developing your App
56-
57-
Microsoft's Visual Studio Code provides guidance on [developing inside a container using Visual Studio Code Remote Development](https://code.visualstudio.com/docs/devcontainers/containers). Let's use their [Get started with development containers in Visual Studio Code](https://code.visualstudio.com/docs/devcontainers/tutorial) tutorial.
58-
59-
Now we can have a VS Code window running on the OS environment within the container.
60-
6150
## Spark-Notebook
6251

6352
After opening a terminal in the directory `~/docker-spark` and running `docker compose up` you should see a lot of action in the containers section of Docker and your terminal.
6453

6554
Now open [http://localhost:8888/lab?token=easy](http://localhost:8888/lab?token=easy). Our token is set to `easy`, which is convenient for class but not recommended for anything beyond local development.
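If the page does not load, it can help to confirm that something is actually listening on the port before digging into the container logs. The following is a minimal sketch, not part of this repo: the helper name `is_port_open` is hypothetical, and it only checks that a TCP connection succeeds, not that JupyterLab itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8888 is the JupyterLab port used in the URL above.
print(is_port_open("localhost", 8888))
```

A `False` result usually means the container is still starting or `docker compose up` was run from a different directory.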
6655

67-
6856
### Starting Spark
6957

7058
You can use the `example.ipynb` script in the `scripts` folder of your container. It contains the code shown below.
@@ -104,6 +92,34 @@ spark = SparkSession.builder \
10492
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
10593
```
10694

95+
### Connecting to an external database
96+
97+
We will use the [PostgreSQL Docker Container](https://hub.docker.com/_/postgres) to create our Postgres server and database. After pulling the image with `docker pull postgres`, we can get started.
98+
99+
#### Adminer
100+
101+
Docker Hub has an [adminer image](https://hub.docker.com/_/adminer) that we can use.
102+
103+
- System: _PostgreSQL_
104+
- Server: _name of postgres docker_ (db if using the `docker run` command above)
105+
- Username: _postgres_
106+
- Password: _postgres1234_
107+
- Database: _lego_
108+
109+
The [Postgres sample databases](https://github.com/neondatabase-labs/postgres-sample-dbs/tree/main?tab=readme-ov-file) repository has several Postgres databases that you could use. We will use the [Lego example](https://github.com/neondatabase-labs/postgres-sample-dbs/tree/main?tab=readme-ov-file#lego-database). The `lego.sql` file is already in the `scratch` folder.
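The Adminer login fields listed above map directly onto a PostgreSQL connection URI of the kind that `psql -d` accepts. The sketch below shows that mapping under stated assumptions: the helper name `build_postgres_uri` is hypothetical and not part of this repo, and the host `db` assumes the Postgres container is reachable under that name.

```python
from urllib.parse import quote


def build_postgres_uri(user, password, host, database, port=None):
    """Build a libpq-style connection URI from the Adminer login fields."""
    # Percent-encode each part so characters like '@' or '/' stay unambiguous.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{host}:{port}" if port is not None else host
    return f"postgres://{auth}@{netloc}/{quote(database, safe='')}"


# Values taken from the Adminer list above; "db" is an assumed container name.
uri = build_postgres_uri("postgres", "postgres1234", "db", "lego")
print(uri)  # postgres://postgres:postgres1234@db/lego
```

Keeping the pieces separate makes it easy to swap the host for `localhost` and add a port when connecting from outside the Docker network.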
110+
111+
112+
## Streamlit App
113+
114+
After opening a terminal in the directory `~/docker-streamlit` and running `docker compose up` you should see action in the containers section of Docker and your terminal.
115+
116+
Now you can open your streamlit app at [http://localhost:8501](http://localhost:8501)
117+
118+
### Developing your App
119+
120+
Microsoft's Visual Studio Code provides guidance on [developing inside a container using Visual Studio Code Remote Development](https://code.visualstudio.com/docs/devcontainers/containers). Let's use their [Get started with development containers in Visual Studio Code](https://code.visualstudio.com/docs/devcontainers/tutorial) tutorial.
121+
122+
Now we can have a VS Code window running on the OS environment within the container.
107123

108124
## References
109125

docker-spark/scratch/commands.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
```
2+
psql -d "postgres://postgres:postgres1234@db/lego" -f lego.sql
3+
```
