Presentation Rubric | |
---|---|
Criteria | Points possible |
Challenges faced along the way | 5 |
Victories and things to celebrate | 5 |
Challenges you are still facing | 5 |
Substantive findings/interpretations | 5 |
Next R hurdle to tackle | 5 |
Total | 25 |
Syllabus
EDLD 651: Introduction to Data Science with R
(CRN: 11803; 3 credit hours)
Introduction to Course and Instructor
Term: Fall 2025
Time: Wed, 1:00-3:50pm
Classroom: 119 Lokey
Instructor: Joe Nese, PhD
- email: jnese@uoregon.edu (preferred contact method)
- office hours: By appointment
Course Overview
This is the first course in a sequence of courses that will eventually lead to a data science in educational research specialization. All courses will be taught through R, a free and open-source statistical computing environment. This course will introduce students to R and RStudio, version control with git and GitHub, dynamic and reproducible workflows with Quarto, and basic data wrangling and visualization with the tidyverse suite of packages.
Acknowledgment. This course, and much of the materials prepared and content presented, were originally developed by Daniel Anderson. Alison Hill, Chester Ismay, and Andrew Bray, in collaboration with Daniel Anderson, helped design the content for this course and the specialization as a whole.
Student Learning Outcomes
By the end of this course, students should be able to:
- Understand the R package ecosystem (how to find, install, load, and learn about package functionality)
- Read “flat” (i.e., rectangular) datasets into R
- Perform basic data wrangling and transformations in R, using the tidyverse
- Leverage appropriate functions for introductory data science tasks (pipeline)
- “Clean up” the dataset using scripts and reproducible workflows
- Use version control with R via GitHub
- Use Quarto to create reproducible, dynamic reports
- Understand and create different types of data visualizations
Course Modality
This is an in-person course: unlike asynchronous online/ASYNC WEB courses, we will meet during scheduled class meeting times in 119 Lokey. I will accommodate absences as described in the Absences policy below. If you need additional flexibility, UO encourages you to consider ASYNC WEB courses. If you need accommodations related to a medical or other disability, you can request them by working with the Accessible Education Center.
Course Reading Materials and Resources
All required course readings are freely available online or will be provided by the instructor. Note that the assigned readings should be read before each class.
Books (required)
- Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science. O’Reilly. (referred to as R4DS(2e) in the weekly schedule)
- Freely available online in the link above. A full-color paper copy is also available from Bookshop.org for (currently) $74.39.
Books (optional and possibly helpful)
- Ismay, C. and Kim, A. Y. (2021). Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. (referred to as MD in the weekly schedule)
Resources
Supplemental Learning
The best way to learn to use R is to be exposed to and use R, so in this course we will use some supplemental content developed outside this course.
Codecademy
We will be using some content from the Learn R course through Codecademy. You will need to sign up for a free account. Codecademy can be used as a place for you to practice and grow the skills you learn in class, and as a resource for your own exploration of topics not covered by this class if you so choose. You will be working through the seven lessons listed under Supplemental Learning Platforms below.
Helpful People/Groups
UO Libraries Data Services
The Data Services Help Desk is available this term Monday through Friday 11 am - 4 pm here in the Knight Library. The Help Desk offers drop-in services and consultations by appointment for statistical methods, R, Python, Git, Excel, Dedoose, Qualtrics, research data management, and entry-level GIS and Talapas questions.
They also provide interactive workshops on topics ranging from courses on R and Python to Git, Qualtrics, and High-Performance Computing.
If you are looking to connect with peers, their Coffee + Data && Code sessions offer short talks, while Coding Circles are casual co-working spaces. They also host a book club.
learnR4free
learnR4free includes all sorts of resources (including books, videos, interactive websites, papers) to learn R for free.
R-Ladies
Regardless of how you identify, but particularly if you identify as a woman or gender non-binary, connect with the global R-Ladies organization. They are an excellent organization of incredibly supportive individuals who routinely share great information.
Posit Community
The Posit community is similar to Stack Overflow but is generally more friendly, willing to engage in more philosophical and deeper discussions than “how do you do X”, and more opinionated about workflows/software design. This last part is important because these discussions will generally be biased toward the {tidyverse} philosophy, and so it’s important that you understand that going in. However, this course is also biased toward the {tidyverse} philosophy, as I think it’s a good one. Stack Overflow is a very useful resource, but the Posit community is a better place for beginners to post questions.
Assignments (400 points total)
As outlined in the Weekly Schedule, most class meetings will include a homework assignment. Supplemental Learning platforms (Codecademy) will also be used as additional homework assignments, and the course will conclude with a final project. More detail about each is provided below.
Homework (200 points; 50%)
Homework Assignments (130 points)
There are 10 homework assignments during the course, which must be submitted to the instructor prior to the start of the following class. At 13 points each, these homeworks will be scored on a “best honest effort” basis, which generally implies zero or full credit (i.e., the assignment was or was not fully completed). However, many of the homework assignments require that students complete specific portions before moving on to the next sections. If you find yourself stuck and unable to proceed, please contact the instructor for help rather than submitting incomplete work. Contacting the instructor is part of the “best honest effort”, and can result in full credit for an assignment even if the work is not fully complete. If the assignment is not complete, and the student has not contacted the instructor for help, it is likely to result in a score of zero.
If your homework is not submitted prior to the start of the following class without approval from the instructor, it will receive at most 7 points when/if it is submitted. If your homework is submitted more than one week late, it will receive zero points.
Supplemental Learning Platforms (70 points)
Codecademy-Learn R
In addition to providing supplemental support, seven Codecademy Lessons will be assigned and scored as part of homework at 10 points each. For the Codecademy assignments you will submit a screenshot of the completed lesson, as seen in the examples in the Weekly Schedule.
Codecademy Screenshots to submit (10 points each)
- Codecademy: Introduction to R Syntax: screenshot of completion checkmark (example)
- Codecademy: Introduction to Data Frames in R: screenshot of completion checkmark (example)
- Codecademy: Introduction to Visualization with R: screenshot of completion checkmark (example)
- Codecademy: Modifying Data Frames: screenshot of completion checkmark (example)
- Codecademy: Aggregates in R: screenshot of completion checkmark (example)
- Codecademy: Joining Tables in R: screenshot of completion checkmark (example)
- Codecademy: Data Cleaning in R: screenshot of completion checkmark (example)
Final Project (200 points; 50%)
The final project in this class is a group project, requiring students to use a “real world” dataset to write, essentially, a miniature manuscript, including an introduction (paragraph or two), methods, results, and discussion (again, maybe 2-3 paragraphs). Ideally, students would work with a dataset that includes variables they are interested in using beyond just this class; however, if students do not have access to a dataset, the instructor will provide one. Students who do not have access to data should plan to meet with the instructor as soon as possible so a dataset can be provided. Additionally, the dataset must be able to be shared publicly, as the full project will be required to be housed on GitHub and be fully reproducible. If making your data publicly available is a problem for you, please contact the instructor as soon as possible. We can work together to either find a dataset that will work for you, or simulate a dataset that is similar to the data you’d like to work with in reality (and then all your code should work for the real dataset, but you can share the simulated data with your classmates). Students are required to work in groups of 2-4 people. The final project is assigned during the first class, and groups must be finalized by the end of Week 2 (at which point students who have not self-selected into groups will be randomly assigned).
Outline (15 points)
A basic outline of the final project is due at the end of Week 5. The outline should include a description of the data to be used, a discussion of what preparatory work will need to be done, and how the requirements of the final project will be met. The outline is intended to be low-stakes and is primarily designed to be a means for you to obtain feedback on the feasibility of the project and areas to consider.
Draft Data Preparation Script (25 points)
At the end of Week 9, you must have a draft of the data preparation complete, including moving the data from its raw to tidy form and a variety of exploratory data visualizations. The final project must use the following functions: pivot_longer(), select(), filter(), mutate(), pivot_wider(), group_by(), and summarize().
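As a rough illustration of how the required functions can fit together in one data preparation script, here is a minimal sketch. The dataset (`scores_raw`), its column names, and the 20-point maximum score are all made up for this example; your own project data and steps will differ.

```r
library(dplyr)
library(tidyr)

# Hypothetical "messy" data: one row per student, one column per test occasion
scores_raw <- tibble(
  student = c("a", "b", "c", "d"),
  school  = c("north", "north", "south", "south"),
  fall    = c(10, 12, 9, 14),
  spring  = c(15, 13, 11, NA)
)

# Tidy form: one row per student per occasion
scores_tidy <- scores_raw |>
  pivot_longer(
    cols = c(fall, spring),
    names_to = "occasion",
    values_to = "score"
  ) |>
  filter(!is.na(score)) |>               # drop missing scores
  mutate(score_pct = score / 20 * 100) |> # assumes a 20-point maximum
  select(student, school, occasion, score_pct)

# Descriptive statistics: mean percent by school and occasion
school_means <- scores_tidy |>
  group_by(school, occasion) |>
  summarize(mean_pct = mean(score_pct), .groups = "drop")

# Reshape the summary into a display-friendly table (one column per occasion)
school_table <- school_means |>
  pivot_wider(names_from = occasion, values_from = mean_pct)

school_table
```

The tidy (long) data frame is usually the one to feed to {ggplot2} for the exploratory visualizations, while the wide summary is closer to what a table in the final paper might look like.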
Peer Review of Data Preparation Script (25 points)
Following the submission of the data preparation scripts, you will be assigned to review your peers’ code. The purpose of this exercise is to learn from each other. Programming is an immensely open-ended enterprise and there are lots of winding paths that all ultimately end up at the same destination. During your peer review, you must note (a) at least three areas of strength, (b) at least one thing you learned from reviewing their script, and (c) at least one and no more than three areas for improvement. Making your code publicly available can feel daunting. The purpose of this portion of the final project is to help us all learn from each other, not to denigrate. Under no circumstances will negative comments be tolerated. Any comments that could be perceived as negative, and outside the scope of the code, will result in an immediate score of zero. Be constructive in your feedback. Be kind. We are all learning.
Final Project Presentation (25 points)
Each group will present on their final project during Week 10, which is expected to still be in progress. These presentations are expected to be informal, and emphasize what learning occurred during the project. Specifically, the presentations are to commiserate with each other about the failures and challenges experienced along the way, while also celebrating the successes. Learning R is a difficult task, and we should all take solace knowing that others are struggling along with us. The final presentation should be equal parts “journey” and substantive findings/conclusions/results. Students are expected to present for approximately 10 minutes each (20-40 minutes per group), although the time may change depending on the enrollment of the class.
Final Project – Presentation Scoring Rubric
Final Paper (110 points)
The purpose of the final project is to allow students an opportunity to demonstrate all the skills they have learned throughout the course. The final project must (a) be a reproducible and dynamic R Markdown document with references to the extant literature; (b) be housed on GitHub, with contributions from all authors obvious; (c) demonstrate moving data from its raw “messy” format to a tidy data format through the R Markdown file, but not in the final document; (d) include at least two exploratory data visualizations; and (e) include at least summary statistics of the data in tables, although fitted models of any sort are an added bonus (not literally; there are no extra points for fitting a model). The points for the final project are broken down as follows.
Final Paper Rubric | |
---|---|
Criteria | Points Possible |
Writing | |
Abstract | 5 |
Introduction | 5 |
Methods | 5 |
Results | 5 |
Discussion | 5 |
References | 5 |
Code | |
Document is fully reproducible | 25 |
Demonstrate use of inline code | 5 |
At least two data visualizations | 10 (5 pts each) |
Demonstrate tidying messy data using: | |
pivot_longer() | 5 |
mutate() | 5 |
select() and filter() | 5 |
pivot_wider() | 5 |
At least one table of descriptive statistics | 10 |
group_by() | 5 |
summarize() | 5 |
Total | 110 |
I will investigate the commits made by different authors when evaluating the final project. If it is obvious that one person did not utilize GitHub, and instead added all of their contributions through a single or only a few commits, I will dock points from that individual. There should be numerous commits by each author, and they should be roughly even in terms of contribution activity (which GitHub has metrics to track, both in terms of the number of commits as well as the number of lines modified).
Weekly Schedule (Topics, Assignments, and Readings)
Week 1: Introduction
| Date | Reading | Slides | Lecture¹ | Assigned² | Due |
|---|---|---|---|---|---|
| Oct-01 | R4DS(2e) 5 | | | A Quick Tour of the RStudio IDE | Installs |
| | R4DS(2e) 3 | | | Codecademy: Introduction to R Syntax | |
| | R4DS(2e) 7 | | | Final Project | |
| | Optional: MD 1.1 | | | | |

¹ Passcode: TjPe*3C0
² I do not rearrange my RStudio panes as he does.
Week 2: Workflow
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Oct-08 | Project-oriented Workflow | | | Codecademy: Introduction to Data Frames in R | A Quick Tour of the RStudio IDE |
| | R4DS(2e) 29 | | | Codecademy: Introduction to Visualization with R | Codecademy: Introduction to R Syntax |
| | here::here() Jenny Bryan | | | Homework 1 | |
| | {rio} vignette | | | | |

¹ Passcode: 9.0Xhxz!
Week 3: {ggplot2}
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Oct-15 | R4DS(2e) 2 | | | Codecademy: Modifying Data Frames | Codecademy: Introduction to Data Frames in R |
| | Optional: MD 2.0 to 2.9 | | | Codecademy: Aggregates in R | Codecademy: Introduction to Visualization with R |
| | Optional: Healy Ch 3 | | | Homework 2 | Homework 1 |
| | | | | Homework 3 | Final Project: Finalize Groups |

¹ Passcode: 0m9#%C5B
Week 4: {dplyr}
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Oct-22 | R4DS(2e) 4 | | | Download GitKraken | Codecademy: Modifying Data Frames |
| | Optional: MD 3.1 to 3.6, 3.8 | | | Watch What is a Git repository? | Codecademy: Aggregates in R |
| | | | | Watch What is a remote repository? | Homework 2 |
| | | | | Homework 4 | Homework 3 |

¹ Passcode: NA
Week 5: GitHub
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Oct-29 | Bryan 2017 | | | Markdown Tutorial | Download GitKraken |
| | | | | Homework 5 | Watch What is a Git repository? |
| | | | | Homework 6 | Watch What is a remote repository? |
| | | | | | Homework 4 |
| | | | | | Final Project: Outline |

¹ Passcode: NA
Week 6: Quarto
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Nov-05 | R4DS(2e) 29 | | | Codecademy: Joining Tables in R | Markdown Tutorial |
| | | | | Homework 7 | Homework 5 |
| | | | | | Homework 6 |

¹ Passcode: NA
Week 7: Mutating Joins
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Nov-12 | R4DS(2e) 13 | | | Codecademy: Data Cleaning in R | Codecademy: Joining Tables in R |
| | | | | Homework 8 | Homework 6 |
| | | | | | Homework 7 |

¹ Passcode: NA
Week 8: Tidy Data
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Nov-19 | R4DS(2e) 12 | | | Homework 9 | Codecademy: Data Cleaning in R |
| | Wickham 2014 | | | | Homework 8 |
| | R-Ladies Sydney CleanItUp 5 | | | Final Project: Draft Data Script | |
| | Optional: MD 4.2 - 4.4 | | | | |

¹ Passcode: NA
Week 9: Factors & Pull Request
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Nov-26 | R4DS(2e) 17 | | | Homework 10 | Homework 9 |
| | | | | Final Project: Peer Review of Script | Final Project: Draft Data Script |

¹ Passcode: NA
Week 10: Presentations
| Date | Reading | Slides | Lecture¹ | Assigned | Due |
|---|---|---|---|---|---|
| Dec-03 | | | | | Homework 10 |
| | | | | | Final Project: Presentation |

¹ Passcode: NA
Week 11: No class: Final papers due
Course Policies
Grading Components
| Grading Components | | | | |
|---|---|---|---|---|
| Lower % | Lower point range | Grade | Upper point range | Upper % |
| 97 | 388 | A+ | | |
| 93 | 372 | A | 384 | 96 |
| 90 | 360 | A- | 368 | 92 |
| 87 | 348 | B+ | 356 | 89 |
| 83 | 332 | B | 344 | 86 |
| 80 | 320 | B- | 328 | 82 |
| 77 | 308 | C+ | 316 | 79 |
| 73 | 292 | C | 304 | 76 |
| 70 | 280 | C- | 288 | 72 |
| | | F | 276 | 69 |
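
To check where a point total falls, the lower point bounds in the table can be turned into a small lookup. This is only a sketch: the names `grade_cutoffs` and `letter_grade()` are made up here, and it assumes that a total falling between two listed bands (for example, 371) takes the grade whose lower bound it has reached, which the table itself does not specify.

```r
# Lower point bounds copied from the grading table (400 points possible)
grade_cutoffs <- c(
  "F" = 0, "C-" = 280, "C" = 292, "C+" = 308, "B-" = 320, "B" = 332,
  "B+" = 348, "A-" = 360, "A" = 372, "A+" = 388
)

# Assumption: a total between two bands gets the grade whose lower bound it meets
letter_grade <- function(points) {
  # findInterval() returns the index of the highest cutoff that is <= points
  names(grade_cutoffs)[findInterval(points, grade_cutoffs)]
}

letter_grade(c(400, 371, 360, 276))
```

For instance, a total of 360 points (90% of 400) sits exactly on the lower bound for an A-.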
Student Engagement Inventory
Graduate: 1 credit hour = 40 hours of student engagement (3 credit hours = 120 hours of student engagement).
Student Engagement Inventory | ||
---|---|---|
Educational activity | Hours student engaged | Explanatory comments (if any): |
Course attendance | 28.33 | 10 meetings at 170 minutes per meeting |
Assigned readings | 15.67 | Weekly readings are assigned, and expected to take approximately 1.5 hours each week |
Projects | 36.00 | Final project, as described above |
Homework | 40.00 | 10 Labs, at approximately 3 hours per lab spent out of class (30 hours), plus 4 RStudio Primers, 3 R-Bootcamp Chapters, and 4 Codecademy Lessons at approximately 1 hour each (10 hours) |
Total hours: | 120.00 |
Communicating with Me: How and Why
How will I communicate with you? Our class will communicate through our Canvas site. Announcements and emails are archived there, automatically forwarded to your UO email, and can even reach you by text. Check and adjust your settings under Account > Notifications.
When I need to get in touch with individual students, I do so through email.
When giving feedback on assignments, I do so in Canvas, and turnaround time for feedback is generally one week.
How can you communicate with me? If your question (or comment) is about a technical challenge with Canvas or another technology, please contact the UO Service Portal. If it is about course content or activities, about something personal, time sensitive, or something else that doesn’t feel like it fits above, please reach out to me by email. I try to respond to questions within one business day.
Why should you communicate with me? I enjoy talking with students about our course material! Are you confused or excited about something? Wondering how what we’re learning relates to current events, career choices, or other classes you can take at UO? Please be in touch! Please also be in touch to tell me how you are doing in the course. If you are having trouble with some aspect of it, I would like to strategize with you. I believe every student can succeed in this course, and I care about your success.
Classroom Community Expectations
Participate and Contribute: All students are expected to participate by sharing ideas and contributing to the learning environment. This entails preparing, following instructions, and engaging respectfully and thoughtfully with others. While all students should participate, participation is not just talking, and a range of participation activities support learning. Participation might look like speaking aloud in the full class and in small groups as well as submitting questions prior to class or engaging with Discussion posts. We will establish more specific participation guidelines and criteria for contributions in our first weeks of the term.
Expect and Respect Diversity: All classes at the University of Oregon welcome and respect diverse experiences, perspectives, and approaches. What is not welcome are behaviors or contributions that undermine, demean, or marginalize others based on race, ethnicity, gender, sex, age, sexual orientation, religion, ability, or socioeconomic status. We will value differences and communicate disagreements with respect. We may establish more specific guidelines and protocols to ensure inclusion and equity for all members of our learning community.
Help Everyone Learn: Part of how we learn together is by learning from one another. To do this effectively, we need to be patient with each other, identify ways we can assist others, and be open-minded to receiving help and feedback from others. Don’t hesitate to contact me to ask for assistance or offer suggestions that might help us learn better.
Course Attendance and Engagement
This is a face-to-face course. Attendance is important because we will develop our knowledge through in-class activities that require your active engagement. We’ll have discussions, small-group activities, and do other work during class that will be richer for your presence, and that you won’t be able to benefit from if you are not there. Excessive absences make it impossible to learn well and succeed in the course. While there is not an automatic grade deduction for missing classes, it is unlikely that students who miss 6 or more classes will be able to pass this course. That said, if you are feeling ill, please stay home to heal and avoid infecting your classmates. Please take absences only when necessary, so when they are necessary, your prior attendance will have positioned you for success. If you must miss a class, please fill out the absence report form.
My course attendance and engagement policies were built with absences and deadline flexibility that students commonly need in mind. There are, however, times when a student may experience an extraordinary circumstance (an unanticipated and significant crisis) that impacts their attendance. Exceptions to the attendance policy and/or deadlines may be granted in the event of extraordinary circumstances. Please contact me as soon as you are able to request an exception, ideally before the class or deadline has passed, or, if your circumstance makes this difficult, then as soon as possible afterwards. This exception will not be offered on an open-ended basis, so if you need to ask for it, please give some consideration to how much time you will realistically need to complete the work. To activate this policy, send me an email with “Extraordinary circumstance request” in the subject line, and if you are requesting deadline flexibility, let me know by what updated deadline you will be able to submit your assignment. There is no need to explain or offer information about the nature of the extraordinary circumstance in your email; we will trust you only to activate this policy in a crisis situation. Please note, too, that detailed feedback on your written work may be delayed or impossible to provide if you’re submitting to meet an extended deadline. At the end of term, granting extensions is hard for the teaching team because of UO’s tight turnaround on grading. Please be in touch in an emergency and we can discuss your options.
Generative Artificial Intelligence Use
Students can use GenAI tools in this class to help with course work and assignments. However, if you use a GenAI tool, you need to document your use, including the tool you use and when, where, and how in your work process you used it (for example: “I used ChatGPT to generate this part of my code, which I then revised before submitting”). In certain cases, as part of your documentation, I may ask you to submit any GenAI results you obtained, so you need to keep GenAI-created drafts and logs of your interactions with GenAI tools; failure to provide such documentation may result in a grade reduction in certain instances.
Along with documentation of your GenAI use, you are also required to cite GenAI if you use any GenAI-created content in your work submissions, for example text or images or graphics generated by GenAI tools. That is, you need to treat GenAI just like other sources such as books, articles, videos, etc.
Grievance Policy
A student or group of students of the College of Education may appeal decisions or actions pertaining to admissions, programs, evaluation of performance and program retention and completion. Students who decide to file a grievance should follow University student grievance procedures and/or consult with the College Associate Dean for Academic Affairs: Edward M. Olivos at emolivos@uoregon.edu or 541-346-2983.
University Policies
The University of Oregon policy statements now exist on the student-facing University Course Policies page and are also linked to from every Canvas course site.