COMP/MATH 112: Introduction to Data Science

10 RStudio Cheat Sheets

  • RMarkdown cheat sheet

  • Visualization (ggplot) cheat sheet

  • Data wrangling (dplyr) cheat sheet