Syllabus

EDLD 651: Introduction to Data Science with R
(CRN: 11803; 3 credit hours)

Introduction to Course and Instructor


Term: Fall 2025

Time: Wed, 1:00-3:50pm

Classroom: 119 Lokey

Instructor: Joe Nese, PhD

  • email: jnese@uoregon.edu (preferred contact method)
  • office hours: By appointment

Course Overview

This is the first course in a sequence of courses that will eventually lead to a data science in educational research specialization. All courses will be taught through R, a free and open-source statistical computing environment. This course will introduce students to R and RStudio, version control with git and GitHub, dynamic and reproducible workflows with Quarto, and basic data wrangling and visualization with the tidyverse suite of packages.


Acknowledgment. This course, and much of the materials prepared and content presented, were originally developed by Daniel Anderson. Alison Hill, Chester Ismay, and Andrew Bray, in collaboration with Daniel Anderson, helped design the content for this course and the specialization as a whole.

Student Learning Outcomes

By the end of this course, students should be able to:

  • Understand the R package ecosystem (how to find, install, load, and learn about package functionality)
  • Read “flat” (i.e., rectangular) datasets into R
  • Perform basic data wrangling and transformations in R, using the tidyverse
    • Leverage appropriate functions for introductory data science tasks (pipeline)
    • “Clean up” the dataset using scripts and reproducible workflows
  • Use version control with R via GitHub
  • Use Quarto to create reproducible, dynamic reports
  • Understand and create different types of data visualizations

Course Modality

This is an in-person course. Unlike asynchronous online (ASYNC WEB) courses, we will meet during scheduled class meeting times in 119 Lokey. I will accommodate absences as described in the Course Attendance and Engagement section below. If you need additional flexibility, UO encourages you to consider ASYNC WEB courses. If you need accommodation related to a medical or other disability, you can request it by working with the Accessible Education Center.

Course Reading Materials and Resources

All required course readings are freely available online or will be provided by the instructor. Note that the assigned readings should be read before each class.

Books (required)

  • Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science. O’Reilly. (referred to as R4DS(2e) in the weekly schedule)
    • Freely available online in the link above. A full-color paper copy is also available from Bookshop.org for (currently) $74.39.

Books (optional and possibly helpful)

Resources

Supplemental Learning

The best way to learn to use R is to be exposed to and use R, so in this course we will use some supplemental content developed outside this course.

Codecademy

We will be using some content from the Learn R course through Codecademy. You will need to sign up for a free account. Codecademy can be used as a place for you to practice/grow the skills you learn in class, and a resource for your own exploration of topics not covered by this class if you so choose. You will be working through the following.

Helpful People/Groups

UO Libraries Data Services

The Data Services Help Desk is available this term Monday through Friday 11 am - 4 pm here in the Knight Library. The Help Desk offers drop-in services and consultations by appointment for statistical methods, R, Python, Git, Excel, Dedoose, Qualtrics, research data management, and entry-level GIS and Talapas questions.

They also provide interactive workshops on topics ranging from courses on R and Python to Git, Qualtrics, and High-Performance Computing.

If you are looking to connect with peers, their Coffee + Data && Code sessions offer short talks, while Coding Circles are casual co-working spaces. They also host a book club.

learnR4free

learnR4free includes all sorts of resources (including books, videos, interactive websites, papers) to learn R for free.

R-Ladies

Regardless of how you identify, but particularly if you identify as a woman or gender non-binary, connect with the global R-Ladies organization. They are an excellent organization of incredibly supportive individuals who routinely share great information.

Posit Community

The Posit Community is similar to Stack Overflow but is generally friendlier, more willing to engage in philosophical and deeper discussions than “how do you do X”, and more opinionated about workflows/software design. This last part is important because these discussions will generally be biased toward the {tidyverse} philosophy, and so it’s important that you understand that going in. However, this course is also biased toward the {tidyverse} philosophy, as I think it’s a good one. Stack Overflow is a very useful resource, but the Posit Community is a better place for beginners to post questions.

Assignments (400 points total)

As outlined in the Weekly Schedule, most class meetings will include a homework assignment. Supplemental Learning platforms (Codecademy) will also be used as additional homework assignments, and the course will conclude with a final project. More detail about each is provided below.

Homework (200 points; 50%)

Homework Assignments (130 points)

There are 10 homework assignments during the course, which must be submitted to the instructor prior to the start of the following class. At 13 points each, these homeworks will be scored on a “best honest effort” basis, which generally implies zero or full credit (i.e., the assignment was or was not fully completed). However, many of the homework assignments require that students complete specific portions before moving on to the next sections. If you find yourself stuck and unable to proceed, please contact the instructor for help rather than submitting incomplete work. Contacting the instructor is part of the “best honest effort”, and can result in full credit for an assignment even if the work is not fully complete. If the assignment is not complete, and the student has not contacted the instructor for help, it is likely to result in a score of zero.

If your homework is not submitted prior to the start of the following class without approval from the instructor, it will receive at most 7 points when/if it is submitted. If your homework is submitted more than one week late, it will receive zero points.

Supplemental Learning Platforms (70 points)

Codecademy-Learn R

In addition to providing supplemental support, seven Codecademy Lessons will be assigned and scored as part of homework at 10 points each. For the Codecademy assignments you will submit a screenshot of the completed lesson, as seen in the examples in the Weekly Schedule.

Codecademy Screenshots to submit (10 points each)

Final Project (200 points; 50%)

The final project in this class is a group project, requiring students to use a “real world” dataset to write, essentially, a miniature manuscript, including an introduction (a paragraph or two), methods, results, and discussion (again, maybe 2-3 paragraphs). Ideally, students would work with a dataset that includes variables they are interested in using beyond just this class; however, if students do not have access to a dataset, the instructor will provide one. Students who do not have access to data should plan to meet with the instructor as soon as possible so a dataset can be provided. Additionally, the dataset must be able to be shared publicly, as the full project will be required to be housed on GitHub and be fully reproducible. If making your data publicly available is a problem for you, please contact the instructor as soon as possible. We can work together to either find a dataset that will work for you, or simulate a dataset that is similar to the data you’d like to work with in reality (and then all your code should work for the real dataset, but you can share the simulated data with your classmates). Students are required to work in groups of 2-4 people. The final project is assigned during the first class, and groups must be finalized by the end of Week 2 (at which point students who have not self-selected into groups will be randomly assigned).

Outline (15 points)

A basic outline of the final project is due at the end of Week 5. The outline should include a description of the data to be used, a discussion of what preparatory work will need to be done, and how the requirements of the final project will be met. The outline is intended to be low-stakes and is primarily designed to be a means for you to obtain feedback on the feasibility of the project and areas to consider.

Draft Data Preparation Script (25 points)

At the end of Week 9, you must have a draft of the data preparation complete, including moving the data from its raw to tidy form and a variety of exploratory data visualizations. The final project must use the following functions: pivot_longer(), select(), filter(), mutate(), pivot_wider(), group_by(), and summarize().
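As a rough illustration of how the required functions can fit together, here is a minimal sketch. The dataset, its column names (student, school, grade, fall, spring), and the 20-point maximum score are all hypothetical and exist only for this example; your own project will use different data and likely a different order of steps.

```r
library(dplyr)
library(tidyr)

# Hypothetical "messy" data: one row per student, one column per test occasion
scores_wide <- tibble(
  student = c("a", "b", "c", "d"),
  school  = c("north", "north", "south", "south"),
  grade   = c(3, 3, 4, 4),
  fall    = c(10, 12, 9, 15),
  spring  = c(14, 13, NA, 19)
)

# Raw -> tidy: one row per student per occasion
scores_long <- scores_wide |>
  select(student, school, fall, spring) |>      # keep only the columns needed
  pivot_longer(
    cols = c(fall, spring),
    names_to = "occasion",
    values_to = "score"
  ) |>
  filter(!is.na(score)) |>                      # drop missing scores
  mutate(score_pct = score / 20 * 100)          # assumes a 20-point maximum

# Summary statistics by school and occasion
school_means <- scores_long |>
  group_by(school, occasion) |>
  summarize(mean_score = mean(score), .groups = "drop")

# Reshape the summary into a table with one column per occasion
school_table <- school_means |>
  pivot_wider(names_from = occasion, values_from = mean_score)

school_table
```

The long format is convenient for plotting and summarizing, while the final pivot_wider() step produces a layout that reads well as a descriptive table.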

Peer Review of Data Preparation Script (25 points)

Following the submission of the data preparation scripts, you will be assigned to review your peers’ code. The purpose of this exercise is to learn from each other. Programming is an immensely open-ended enterprise and there are lots of winding paths that all ultimately end up at the same destination. During your peer review, you must note (a) at least three areas of strength, (b) at least one thing you learned from reviewing their script, and (c) at least one and no more than three areas for improvement. Making your code publicly available can feel daunting. The purpose of this portion of the final project is to help us all learn from each other, not to denigrate. Under no circumstances will negative comments be tolerated. Any comments that could be perceived as negative, and outside the scope of the code, will result in an immediate score of zero. Be constructive in your feedback. Be kind. We are all learning.

Final Project Presentation (25 points)

Each group will present their final project, which is expected to still be in progress, during Week 10. These presentations are expected to be informal, and emphasize what learning occurred during the project. Specifically, the presentations are to commiserate with each other about the failures and challenges experienced along the way, while also celebrating the successes. Learning R is a difficult task, and we should all take solace knowing that others are struggling along with us. The final presentation should be equal parts “journey” and substantive findings/conclusions/results. Students are expected to present for approximately 10 minutes each (20-40 minutes per group), although the time may change depending on the enrollment of the class.

Final Project – Presentation Scoring Rubric

Criteria                                Points possible
Challenges faced along the way          5
Victories and things to celebrate       5
Challenges you are still facing         5
Substantive findings/interpretations    5
Next R hurdle to tackle                 5
Total                                   25

Final Paper (110 points)

The purpose of the final project is to allow students an opportunity to demonstrate all the skills they have learned throughout the course. The final project must (a) be a reproducible and dynamic R Markdown document with references to the extant literature; (b) be housed on GitHub, with contributions from all authors obvious; (c) demonstrate moving data from its raw “messy” format to a tidy data format through the R Markdown file, but not in the final document; (d) include at least two exploratory data visualizations; and (e) include at least summary statistics of the data in tables, although fitted models of any sort are an added bonus (not literally; there are no extra points for fitting a model). The points for the final project are broken down as follows.

Final Paper Rubric

Criteria                                        Points possible
Writing
  Abstract                                      5
  Introduction                                  5
  Methods                                       5
  Results                                       5
  Discussion                                    5
  References                                    5
Code
  Document is fully reproducible                25
  Demonstrate use of inline code                5
  At least two data visualizations              10 (5 pts each)
  Demonstrate tidying messy data using:
    pivot_longer()                              5
    mutate()                                    5
    select() and filter()                       5
    pivot_wider()                               5
  At least one table of descriptive statistics  10
  group_by()                                    5
  summarize()                                   5
Total                                           110

I will investigate the commits made by different authors when evaluating the final project. If it is obvious that one person did not utilize GitHub, and instead added all of their contributions through a single or only a few commits, I will dock points from that individual. There should be numerous commits by each author, and they should be roughly even in terms of contribution activity (which GitHub has metrics to track, both in terms of the number of commits as well as the number of lines modified).

Weekly Schedule (Topics, Assignments, and Readings)

Week 1: Introduction

Date: Oct-01

Reading
  • R4DS(2e) 5
  • R4DS(2e) 3
  • R4DS(2e) 7
  • Optional: MD 1.1

Assigned
  • A Quick Tour of the RStudio IDE
  • Codecademy: Introduction to R Syntax
  • Note: I do not rearrange my RStudio panes as he does.

Due
  • Installs

Final Project

Lecture passcode: TjPe*3C0

Week 2: Workflow

Date: Oct-08

Reading
  • Project-oriented Workflow
  • R4DS(2e) 29
  • here::here() Jenny Bryan
  • {rio} vignette

Assigned
  • Codecademy: Introduction to Data Frames in R
  • Codecademy: Introduction to Visualization with R
  • Homework 1

Due
  • A Quick Tour of the RStudio IDE
  • Codecademy: Introduction to R Syntax

Lecture passcode: 9.0Xhxz!

Week 3: {ggplot2}

Date: Oct-15

Reading
  • R4DS(2e) 2
  • Optional: MD 2.0 to 2.9
  • Optional: Healy Ch 3

Assigned
  • Codecademy: Modifying Data Frames
  • Codecademy: Aggregates in R
  • Homework 2
  • Homework 3

Due
  • Codecademy: Introduction to Data Frames in R
  • Codecademy: Introduction to Visualization with R
  • Homework 1

Final Project: Finalize Groups

Lecture passcode: 0m9#%C5B

Week 4: {dplyr}

Date: Oct-22

Reading
  • R4DS(2e) 4
  • Optional: MD 3.1 to 3.6, 3.8

Assigned
  • Download GitKraken
  • Watch What is a Git repository?
  • Watch What is a remote repository?
  • Homework 4

Due
  • Codecademy: Modifying Data Frames
  • Codecademy: Aggregates in R
  • Homework 2
  • Homework 3

Lecture passcode: NA

Week 5: GitHub

Date: Oct-29

Reading
  • Bryan 2017

Assigned
  • Markdown Tutorial
  • Homework 5
  • Homework 6

Due
  • Download GitKraken
  • Watch What is a Git repository?
  • Watch What is a remote repository?
  • Homework 4

Final Project: Outline

Lecture passcode: NA

Week 6: Quarto

Date: Nov-05

Reading
  • R4DS(2e) 29

Assigned
  • Codecademy: Joining Tables in R
  • Homework 7

Due
  • Markdown Tutorial
  • Homework 5
  • Homework 6

Lecture passcode: NA

Week 7: Mutating Joins

Date: Nov-12

Reading
  • R4DS(2e) 13

Assigned
  • Codecademy: Data Cleaning in R
  • Homework 8

Due
  • Codecademy: Joining Tables in R
  • Homework 6
  • Homework 7

Lecture passcode: NA

Week 8: Tidy Data

Date: Nov-19

Reading
  • R4DS(2e) 12
  • Wickham 2014
  • R-Ladies Sydney CleanItUp 5
  • Optional: MD 4.2 - 4.4

Assigned
  • Homework 9

Due
  • Codecademy: Data Cleaning in R
  • Homework 8

Final Project: Draft Data Script

Lecture passcode: NA

Week 9: Factors & Pull Request

Date: Nov-26

Reading
  • R4DS(2e) 17

Assigned
  • Homework 10

Due
  • Homework 9

Final Project: Peer Review of Script

Final Project: Draft Data Script

Lecture passcode: NA

Week 10: Presentations

Date: Dec-03

Due
  • Homework 10

Final Project: Presentation

Lecture passcode: NA

Week 11: No class: Final papers due

Course Policies

Grading Components

Lower %   Lower point range   Grade   Upper point range   Upper %
97        388                 A+
93        372                 A       384                 96
90        360                 A-      368                 92
87        348                 B+      356                 89
83        332                 B       344                 86
80        320                 B-      328                 82
77        308                 C+      316                 79
73        292                 C       304                 76
70        280                 C-      288                 72
                              F       276                 69
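To see how a point total maps to a letter grade, the lower point bounds in the table above can be sketched as a small lookup. The function name letter_grade and the vector grade_cutoffs are made up for this illustration. The sketch also assumes that a total falling between two bands (for example, 386) receives the lower grade, which the table does not state explicitly.

```r
# Lower point bound for each grade, taken from the table (400 points total)
grade_cutoffs <- c(
  "F" = 0, "C-" = 280, "C" = 292, "C+" = 308, "B-" = 320,
  "B" = 332, "B+" = 348, "A-" = 360, "A" = 372, "A+" = 388
)

# Hypothetical helper: return the letter grade for one or more point totals.
# Assumes totals between 0 and 400; in-between totals get the lower band.
letter_grade <- function(points) {
  names(grade_cutoffs)[findInterval(points, grade_cutoffs)]
}

letter_grade(c(388, 372, 359, 279))
# returns "A+" "A" "B+" "F"
```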

Student Engagement Inventory

Graduate: 1 credit hour = 40 hours of student engagement (3 credit hours = 120 hours of student engagement).

Educational activity: hours student engaged (explanatory comments, if any)

  • Course attendance: 28.33 (10 meetings at 170 minutes per meeting)
  • Assigned readings: 15.67 (Weekly readings are assigned, and expected to take approximately 1.5 hours each week)
  • Projects: 36.00 (Final project, as described above)
  • Homework: 40.00 (10 Labs, at approximately 3 hours per lab spent out of class (30 hours), plus 4 RStudio Primers, 3 R-Bootcamp Chapters, and 4 Codecademy Lessons at approximately 1 hour each (10 hours))
  • Total hours: 120.00

Communicating with Me: How and Why

How will I communicate with you? Our class will communicate through our Canvas site. Announcements and emails are archived there, automatically forwarded to your UO email, and can even reach you by text. Check and adjust your settings under Account > Notifications.

When I need to get in touch with individual students, I do so through email.

When giving feedback on assignments, I do so in Canvas, and turnaround time for feedback is generally one week.

How can you communicate with me? If your question (or comment) is about a technical challenge with Canvas or another technology, please contact the UO Service Portal. If it is about course content or activities, about something personal, time sensitive, or something else that doesn’t feel like it fits above, please reach out to me by email. I try to respond to questions within one business day.

Why should you communicate with me? I enjoy talking with students about our course material! Are you confused or excited about something? Wondering how what we’re learning relates to current events, career choices, or other classes you can take at UO? Please be in touch! Please also be in touch to tell me how you are doing in the course. If you are having trouble with some aspect of it, I would like to strategize with you. I believe every student can succeed in this course, and I care about your success.

Classroom Community Expectations

Participate and Contribute: All students are expected to participate by sharing ideas and contributing to the learning environment. This entails preparing, following instructions, and engaging respectfully and thoughtfully with others. While all students should participate, participation is not just talking, and a range of participation activities support learning. Participation might look like speaking aloud in the full class and in small groups as well as submitting questions prior to class or engaging with Discussion posts. We will establish more specific participation guidelines and criteria for contributions in our first weeks of the term.

Expect and Respect Diversity: All classes at the University of Oregon welcome and respect diverse experiences, perspectives, and approaches. What is not welcome are behaviors or contributions that undermine, demean, or marginalize others based on race, ethnicity, gender, sex, age, sexual orientation, religion, ability, or socioeconomic status. We will value differences and communicate disagreements with respect. We may establish more specific guidelines and protocols to ensure inclusion and equity for all members of our learning community.

Help Everyone Learn: Part of how we learn together is by learning from one another. To do this effectively, we need to be patient with each other, identify ways we can assist others, and be open-minded to receiving help and feedback from others. Don’t hesitate to contact me to ask for assistance or offer suggestions that might help us learn better.

Course Attendance and Engagement

This is a face-to-face course. Attendance is important because we will develop our knowledge through in-class activities that require your active engagement. We’ll have discussions, small-group activities, and do other work during class that will be richer for your presence, and that you won’t be able to benefit from if you are not there. Excessive absences make it impossible to learn well and succeed in the course. While there is not an automatic grade deduction for missing classes, it is unlikely that students who miss 6 or more classes will be able to pass this course. That said, if you are feeling ill, please stay home to heal and avoid infecting your classmates. Please take absences only when necessary, so when they are necessary, your prior attendance will have positioned you for success. If you must miss a class, please fill out the absence report form.

My course attendance and engagement policies were built with the absences and deadline flexibility that students commonly need in mind. There are, however, times when a student may experience an extraordinary circumstance (an unanticipated and significant crisis) that impacts their attendance. Exceptions to the attendance policy and/or deadlines may be granted in the event of extraordinary circumstances. Please contact me as soon as you are able to request an exception, ideally before the class or deadline has passed, or, if your circumstance makes this difficult, then as soon as possible afterwards. This exception will not be offered on an open-ended basis, so if you need to ask for it, please give some consideration to how much time you will realistically need to complete the work. To activate this policy, send me an email with “Extraordinary circumstance request” in the subject line, and if you are requesting deadline flexibility, let me know by what updated deadline you will be able to submit your assignment. There is no need to explain or offer information about the nature of the extraordinary circumstance in your email; we will trust you only to activate this policy in a crisis situation. Please note, too, that detailed feedback on your written work may be delayed or impossible to provide if you’re submitting to meet an extended deadline. At the end of term, granting extensions is hard for the teaching team because of UO’s tight turnaround on grading. Please be in touch in an emergency and we can discuss your options.

Generative Artificial Intelligence Use

Students can use GenAI tools in this class to help with course work and assignments. However, if you use a GenAI tool, you need to document your use, including the tool you use and when, where, and how in your work process you used it (for example: “I used ChatGPT to generate this part of my code, which I then revised before submitting”). In certain cases, as part of your documentation, I may ask you to submit any GenAI results you obtained, so you need to keep GenAI-created drafts and logs of your interactions with GenAI tools; failure to provide such documentation may result in a grade reduction in certain instances.

Along with documentation of your GenAI use, you are also required to cite GenAI if you use any GenAI-created content in your work submissions, for example text or images or graphics generated by GenAI tools. That is, you need to treat GenAI just like other sources such as books, articles, videos, etc.

Grievance Policy

A student or group of students of the College of Education may appeal decisions or actions pertaining to admissions, programs, evaluation of performance and program retention and completion. Students who decide to file a grievance should follow University student grievance procedures and/or consult with the College Associate Dean for Academic Affairs: Edward M. Olivos at emolivos@uoregon.edu or 541-346-2983.

University Policies

The University of Oregon policy statements now exist on the student-facing University Course Policies page and are also linked to from every Canvas course site.