- Get to the magic: Have your students see a glimpse of the "magic" early on. Images, plots and interactive pieces of code and apps are the most fun. You can use "cooking show" magic and provide your students with something 99% processed but then have them be able to customize or fill in the last bit of code.
- Emphasize the "yet": Encourage a growth mindset by assuring the student that with practice and perseverance they will learn this even if they haven’t learned it "yet".
- Validate trickiness: Try to stay away from using phrases like “it’s easy”. Although you might be attempting to impart confidence, it may make students feel inadequate if they don’t also find it easy. Validating the challenges in data science can be reassuring and help renew a student’s confidence.
- Assure togetherness: Sometimes frustration regarding a problem can feel lonely and impossible. Assuring the student that you are there to help can be a powerful learning tool.
- Celebrate the wins!: No matter how small the win, celebrate what your student has accomplished. This can mean congratulating them, but also may mean encouraging them to share their success with their peers.
- Do not compare: Be careful to not compare your students’ skills with your own or with other students. Emphasize the idea that everyone has different starting points and aptitudes. What matters is not where you are today, but that everyone continues to learn.
- Have madlibs code: By madlibs code, we mean code that is mostly written but has the student fill in the blanks. This allows the student to see the control they have over code without requiring them to write a whole project from scratch before they are ready to do so.
- Talk about mistakes you’ve made: It can be helpful if you, as the educator, also talk about times you’ve made mistakes in your work. This can help students replace the idea that mistakes should be hidden with the idea that mistakes are normal! This is something that we’ve embraced from the Data Mishaps Night.
- Model making mistakes: Although you don’t need to purposely make mistakes in live coding, if you do see yourself starting to make a mistake, maybe don’t stop it right away. Try to let students catch your mistake instead of pointing it out to them. When they do point it out to you, be sure to be happy about the fact that as a team, you have caught the mistake.
- Say "I don’t know": If a student asks a question that you are unsure of, say "I don’t know" proudly. Use this as an opportunity to demonstrate that not knowing something is expected. The class can also take the opportunity to look something up together. If the question is less relevant to the whole class, let the student know you will look into it and get back to them or you and the student can look into it together later.
- Encourage iteration: Reaffirm the idea that drafts are okay. Blank pages are harder to work from than pages full of mistakes. Rather than striving for perfection on the first try, emphasize the idea that we can use version control and return to this code later to better polish it.
- Normalize questions: Instead of “are there any questions?”, ask “what questions do people have?”. This small wording change can help lower the intimidation factor by implying questions are to be expected.
- Study silly data: Data science doesn’t always have to be about life-changing questions. Sometimes it can be a lot of fun to analyze datasets about movies, the best Halloween candies, or Bigfoot sightings.
- Use GIFs and cartoons: Cartoons and GIFs can be mood boosters and can even make salient points that will stick with your students after class is over.
- Use fun data examples: People like movies and pop culture. So long as the data is appropriate, it can be fun to use data examples with material that people enjoy.
- Use silly icebreakers: The more comfortable the class feels with each other, the more they will be ready to learn and participate. Let the class know it’s okay to be silly by asking a silly question.
- Take snack breaks: Snack breaks or other kinds of breaks don’t need to be silly per se, but breaks are important to help people stay refreshed and ready to learn.
- Encourage or require office hours: Many students believe office hours are not relevant to them. Encourage your students to drop by and introduce themselves any time, not only when they are stuck.
- Survey your students: In-class anonymous surveys can answer questions like "why are you taking this class?" and "what career fields interest you?", which can serve as a starting point for conversation.
- Provide mentorship: Young data scientists need support, particularly those from disadvantaged backgrounds. When possible, try to connect learners to supportive mentors who have time and understanding to devote to the learner. Ideally a mentor can be someone of a similar background to help encourage the learner through shared experience and understanding, but any form of mentorship is still beneficial.
- Don’t require people to buy expensive things: Income insecurity can be a massive barrier to entry into the field of data science but it doesn’t have to be. When possible, pursue cloud-based computing resources for your students to use. This will allow them to run more computationally costly analyses on nearly any machine, including relatively inexpensive computers like Chromebooks.
- Frequently take the temperature of the room: Silence and pauses may feel awkward but they are critical to good teaching. Allow students to have time to think. Another useful tool is using sticky notes to keep track of whether students who are actively coding need help or are doing okay. People who are doing well can put up a green sticky note, while people who need help with something can put up a different color sticky note. Try to create an atmosphere that helps to decrease the intimidation factor of asking questions. Additionally, tools like Slido can help you collect interactive responses from your students from their smartphones or computers.
- Explain things in multiple different ways: Perhaps you understand things well using a particular analogy. But that analogy may not resonate with all your students. Try to think outside of the box and explain things in multiple different ways.
- Do not assume everyone knows the basics: Err on the side of explaining the most fundamental piece of knowledge. Less experienced students will be less likely to get lost when the curriculum advances, and more experienced students may overestimate how well they know something. Everyone benefits from starting off on the same page with the basics.
- Use inclusive language: Refer to guidelines for creating inclusive communities [53]. Educators can unknowingly use phrases that reinforce stereotypes or perpetuate gaps in the STEM fields [16]. This also means that as an educator you should always be ready to be corrected and change course should a student share with you how they could be better accommodated.
- Look for ways to improve the accessibility of your classroom: This includes simple things like making sure your curriculum can be read by a screen reader and testing your curriculum for color vision compatibility with tools like ColorOracle [54]. See our Supplemental information for a longer list of accessibility items to consider.
- Be aware of implicit biases and stereotype threat: Though behaviors with implicit bias are by definition unintentional, they can be very harmful all the same.
- Use pair programming: Designate time for students to practice pair programming, or offer optional or required time when students can pair program with you or other tutors.
- Cover code review: Explicitly cover techniques for how to conduct formal code review and why it is important.
- Highlight coding communities: Introduce your students to online or in-person coding communities such as R-Ladies, StackOverflow, etc.
- Be aware of the stage of your audience: What concepts do your students understand well? What concepts overwhelm them? If you are at an early stage of the process where students are attempting to grapple with a lot of information at once, do not bring up alternatives.
- Acknowledge the existence of alternatives: If learners are likely to encounter common alternatives for particular methods in the real world, be upfront about this. This does not mean that learners necessarily need to dive into these alternatives, but simply noting the names of such methods can enable learners to recognize them in the future.
- Many solutions to the same endpoint: Reinforce the idea that there may be a multitude of ways in code to reach the same endpoint. The priorities should be that the code works and is relatively readable. This can tie in well with practicing code review, which lets students see and evaluate approaches taken by others.
- Stop yourself: Interrogate why you might give a complex answer. Is it because you want people to know you are knowledgeable? Are you very enthusiastic about the material? Remember to focus on the learner. Too much nuance or focusing on exceptions to rules will likely be a disservice to their learning experience.
- Chunk it out: In code outside of the classroom, you may try to reduce the number of lines and put similar steps together. However, in the classroom, it can be beneficial for you to break down each step separately. This may look like making one chunk of code into multiple separate steps that you walk through with the students. You should also explain how students can chunk out their own code as a troubleshooting technique, and encourage them to do so.
- Keep it practical: You likely have a lot more information about a topic than you need to share. Ask yourself what practical information your students would need to know in a "real world" data science project. For many topics, they won’t need to know deep history or the ins and outs of each parameter of a function. We need to be selective about when the history of something aids understanding and when it does not.
- Make it skimmable: You may notice we use lists and bold type in this paper to highlight main points. Respect that your students are busy and your class is not the only thing they have going on. What’s the most efficient way for you to communicate this (either in print or verbally)?
- Link it out: If you have additional information for the particularly curious student, feel free to share it, but don’t make it central. Add links or a collapsible menu where students can find more information, but don’t use time in a lecture to cover it.
- Pause and think: Sometimes, in an effort to complete a project quickly, we can move too fast and miss a critical clue in the data. Pauses are effective tools for thinking carefully about a project and what you are seeing. Reassure students that they do not have to answer questions right away. They can walk away, think about it for minutes or days, and come back to it.
- Show real examples: Real data have weirdness. Your curriculum should include an example of real data weirdness and how someone found that weirdness. What functions were used? What aspects of the data were the first red flags that the person who did the data analysis followed? Tell the story about how we found out this weird thing about this real data.
- Give them an investigative tool belt: Give your students a set of strategies they can use to investigate weirdness. What functions or tests can they use to interrogate a piece of data or weirdness in a package? Where can they go to find more information? Demonstrate Googling, StackOverflow, and package documentation as investigative tools.
- Model investigative data science: In a live coding or pair programming session, encourage your students to look for abnormalities. Ask them questions about what they think about the results or what we might want to look out for. Model checking your data after each step.
- Leave in the side journeys: Often, complete analyses involve several side explorations and dead ends. Although we often want to show a polished data science story, sometimes it can be beneficial to briefly demonstrate your development process and the side journeys it took to get there.
- Practice how to ask for help: Have students practice writing "help" posts and discuss the standard outline of what a call for help should have [63]. StackOverflow and other online communities can be very helpful, but often this starts with a well-crafted post.
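As an illustration of the "Chunk it out" and "Model investigative data science" tips above, here is a minimal Python sketch. The candy ratings, the 0 to 10 rating scale, and all variable names are hypothetical and invented for this example; the point is only to show one compact calculation broken into separate steps, with the data checked after each one.

```python
# Hypothetical toy data: (candy name, rating as text). Ratings are assumed
# to be on a 0-10 scale; one is missing and one looks suspicious.
raw_ratings = [
    ("chocolate bar", "8"),
    ("candy corn", "3"),
    ("licorice", ""),        # missing rating
    ("gummy bears", "70"),   # outside the assumed 0-10 scale
]

# Outside the classroom, this might be a single dense expression.
# In the classroom, each step is separate and followed by a check.

# Step 1: drop rows with a missing rating, then check how many rows remain.
non_missing = [(name, rating) for name, rating in raw_ratings if rating != ""]
print(f"Step 1: kept {len(non_missing)} of {len(raw_ratings)} rows")

# Step 2: convert ratings from text to integers, then look at the result.
numeric = [(name, int(rating)) for name, rating in non_missing]
print("Step 2:", numeric)

# Step 3: flag values outside the assumed 0-10 range before summarizing.
out_of_range = [(name, rating) for name, rating in numeric if not 0 <= rating <= 10]
print("Step 3: out-of-range rows:", out_of_range)

# Step 4: compute the mean using only the rows that passed the range check.
valid = [rating for _, rating in numeric if 0 <= rating <= 10]
mean_rating = sum(valid) / len(valid)
print("Step 4: mean rating =", mean_rating)
```

Printing after every step is deliberately verbose. It gives students a natural place to pause and ask whether the output matches what they expected, and the same pattern can be reused when they troubleshoot their own code.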
– Be available: Always reiterate your availability (and truly be available too). When students do come to you with questions, try to be enthusiastic and supportive of them.
– Automate questions: Set up systems that regularly ask your students what questions or problems they have. You can set up reminders for yourself or them.
– Structured one-on-one sessions: Set up structured one-on-one mentoring meetings. By structured, we mean use a document that your student fills out that asks them to answer: What are they working on? What is going well? What is not going so well? And so on.
– Model good communication yourself: When live coding, add documentation and try to stick to a code style. Example code should be even more thoroughly documented. Emphasize stories where you have messed up code or been stuck and asked someone for help.
– Low(er) stakes presentations: Have students practice presenting to their peers. Public speaking is notoriously scary, particularly if you are presenting on a new topic. The most effective way to make it less scary is to practice. Encourage your students to share their results regularly. When they do present, reaffirm that everyone is rooting for them and no one will interrogate them. Presentations by early professionals are the time to be supportive and enthusiastic, not to critique the results or code.
– Rubber ducking: Rubber ducking refers to debugging code by explaining it aloud, even if no one is listening to your explanation. Encourage students to walk through their code on their own and translate it into "normal speak". This not only helps them troubleshoot, but also builds explanatory skills and deeper understanding of their code.
– Version control your curriculum: Not only should our data analyses be well tracked, documented and version controlled, but our curriculum should be too. Where possible, curriculum should be open source and on GitHub [64]. Also consider using a permissive license, such as the Creative Commons CC-BY license, which requires attribution but otherwise allows repurposing and reuse.
– Minimize maintenance pains: Create your curriculum in a way that minimizes the pain of maintenance. We use Open-source Tools for Training Resources (OTTR) to create our curriculum [64]. We also utilize the exercise package to automatically generate our exercise notebooks without solutions [66]. See the Supplementary info for more resources for how to automate your curriculum maintenance.
– Take notes: In each iteration of your class, take notes and debrief with your education team about strengths of your course and opportunities for improvement. You can easily track ideas and notes as issues in your GitHub repository.
– Survey your students: Use short and focused surveys to take the temperature of the class. Note that some interpretation of surveys is needed. For example, if half your students feel the course speed is "too fast" and the other half feel that the course speed is "too slow", it may mean the course speed is just right.
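To make the idea of generating exercise material without solutions more concrete, here is a minimal Python sketch that turns a fully worked script into "madlibs" style code with a blank to fill in. This is not how the exercise package or OTTR works. The marker comments (`# BEGIN SOLUTION`, `# END SOLUTION`), the blank placeholder, and the `make_exercise` function are all assumptions made up for this illustration.

```python
# Hypothetical markers and placeholder, invented for this sketch.
BEGIN = "# BEGIN SOLUTION"
END = "# END SOLUTION"
BLANK = "____  # fill in the blank"


def make_exercise(solution_source: str) -> str:
    """Replace every marked solution region with a single blank line to fill in."""
    output_lines = []
    in_solution = False
    for line in solution_source.splitlines():
        stripped = line.strip()
        if stripped == BEGIN:
            in_solution = True
            # Keep the marker's indentation so the blank sits at the right level.
            indent = line[: len(line) - len(line.lstrip())]
            output_lines.append(indent + BLANK)
        elif stripped == END:
            in_solution = False
        elif not in_solution:
            # Lines outside a solution region are copied unchanged.
            output_lines.append(line)
    return "\n".join(output_lines)


# A tiny, fully worked "solution" version of an exercise.
solution = """\
ratings = [8, 3, 9]
# Compute the average rating.
# BEGIN SOLUTION
average = sum(ratings) / len(ratings)
# END SOLUTION
print(average)
"""

exercise = make_exercise(solution)
print(exercise)
```

Keeping a single solution file as the source of truth, and deriving the student version from it, means a fix only has to be made in one place. That is one way to reduce the maintenance burden described above.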