Introductory R for Social Sciences

Welcome!

This site serves as a repository for the slides and code developed for the ‘Introductory R for Social Sciences’ workshop, tailored for undergraduate students at SMU. The workshop comprises five sessions and is designed for participants with basic statistics knowledge who are trying R programming, or coding in general, for the first time.

Date, Time, and Venue:

  • Thursday sessions:
    • Date: Thursdays, 18 Jan, 25 Jan, 1 Feb, 8 Feb, 15 Feb 2024.
    • Time: 3.30 PM to 5.30 PM
    • Venue: LKCSB Classroom 3.5 YPHSL Seminar Room 2.04
  • Friday sessions (for IDIS100):
    • Date: Fridays, 19 Jan, 26 Jan, 2 Feb, 16 Feb, 23 Feb 2024.
    • Time: 3.30 PM to 5.30 PM
    • Venue: SOSS/CIS Classroom 3.4 SOSS/CIS Seminar Room 3.1

Workshop Contents

Slides, scripts, and other materials will be progressively made available below.

Session 1: Introduction to R and RStudio

Learn the fundamentals of R and RStudio, including how to set up your working directory and R projects, basic data structures in R such as vectors and data frames, and how to import data files into R.
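As a rough illustration of the topics in this session, the sketch below builds a vector and a data frame, then writes and re-imports a CSV file. The object names (scores, faculty, csv_path) and the values are made up for illustration and are not taken from the workshop slides.

```r
# Check which folder R is currently working in
getwd()

# Two vectors (toy values for illustration only)
scores <- c(3.5, 4.0, 2.5)
ranks  <- c("AsstProf", "Prof", "AssocProf")

# Combine the vectors into a data frame: one row per observation
faculty <- data.frame(rank = ranks, score = scores)

# Write the data frame to a temporary CSV file, then import it back
csv_path <- tempfile(fileext = ".csv")
write.csv(faculty, csv_path, row.names = FALSE)
imported <- read.csv(csv_path)

str(imported)          # inspect column names and types
mean(imported$score)   # summarise one column
```

A temporary file is used here only so the example runs anywhere; in the workshop you would point read.csv() at a file inside your own R project.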

Slides available here (29 Jan - edited with code explanations)

Session 2: Data wrangling with tidyverse

Learn the basics of data wrangling with tidyverse, including how to remove duplicates, reverse-code items, filter data, reshape data to prepare it for analysis, and more.
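The wrangling steps named above could look something like the following minimal sketch, assuming the dplyr and tidyr packages (both part of tidyverse) are installed. The toy data and the object names (survey, cleaned, long) are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Toy survey data (illustrative values only); the row for pid 2 is duplicated
survey <- tibble(
  pid = c(1, 2, 2, 3),
  Q1  = c(7, 5, 5, 2),
  Q3  = c(1, 4, 4, 6)
)

cleaned <- survey %>%
  distinct() %>%            # remove exact duplicate rows
  mutate(Q3 = 8 - Q3) %>%   # reverse-code an item scored from 1 to 7
  filter(Q1 >= 4)           # keep only rows that meet a condition

# Reshape from wide to long: one row per pid-item pair
long <- cleaned %>%
  pivot_longer(cols = c(Q1, Q3), names_to = "item", values_to = "response")

long
```

Each step takes a data frame and returns a data frame, which is why the steps can be chained with the pipe.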

Slides available here (29 Jan - edited with code explanations & exercise answers)

Session 3: Data visualization and Quarto

Learn how to create data visualizations such as scatterplots, boxplots, and barplots with ggplot2. The session also introduces Quarto, a document format that integrates R code, text, plots, and citations into a single document, which can then be rendered to PDF, Word, or HTML.
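For a sense of what a ggplot2 plot looks like in code, here is a small sketch of a scatterplot, assuming the ggplot2 package is installed. The data frame and its column names (yrs_service, salary, rank) are hypothetical stand-ins, not the workshop dataset.

```r
library(ggplot2)

# Toy data (illustrative values only)
df <- data.frame(
  yrs_service = c(2, 5, 9, 14, 20),
  salary      = c(80, 88, 97, 110, 125),
  rank        = c("AsstProf", "AsstProf", "AssocProf", "Prof", "Prof")
)

# Map columns to aesthetics, then add a layer of points and axis labels
p <- ggplot(df, aes(x = yrs_service, y = salary, colour = rank)) +
  geom_point() +
  labs(x = "Years of service", y = "Salary (thousand USD)")

print(p)  # draw the plot
```

Swapping geom_point() for geom_boxplot() or geom_bar() gives the other plot types mentioned above, although each expects a suitable mix of categorical and numeric variables.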

Slides available here (6 Feb - edited with code explanations & exercise answers)

A few things you need to do before the session:

  • Download the supplementary materials here. The zip file contains two files: faculty-eval-with-scores.csv and apa-single-styled.csl. Put the CSV file inside the data-output folder in your project, and put the CSL file in the same location as your R scripts.

  • Create an account on quartopub.com.

  • (Sort of optional) Have Zotero installed on your laptop. If you already have it installed, you don’t have to do anything else :)

Session 4: Stats in R (part 1)

Learn how to conduct basic statistical analyses used in the social sciences, such as chi-square tests, t-tests, correlations, and ANOVA.
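The four analyses listed above are all available in base R. The sketch below runs each of them on a small made-up data frame; the column names and values are illustrative assumptions, and the choice of a paired t-test is only one possible t-test variant.

```r
# Toy data (illustrative values only, not from the workshop dataset)
df <- data.frame(
  sex    = c("Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male"),
  rank   = c("AsstProf", "AsstProf", "AssocProf", "AssocProf", "Prof", "Prof", "Prof", "AsstProf"),
  before = c(3.1, 3.4, 2.9, 3.8, 3.5, 3.0, 3.6, 3.2),
  after  = c(3.6, 3.5, 3.4, 4.0, 3.9, 3.3, 3.7, 3.8)
)

# Chi-square test of association between two categorical variables.
# suppressWarnings() is used only because this tiny toy sample triggers
# R's small-expected-count warning; do not hide it on real data.
chisq <- suppressWarnings(chisq.test(table(df$sex, df$rank)))

# Paired t-test comparing the same people before and after
tt <- t.test(df$after, df$before, paired = TRUE)

# Pearson correlation between two numeric variables
r <- cor.test(df$before, df$after)

# One-way ANOVA: does the mean 'after' score differ across ranks?
fit <- aov(after ~ rank, data = df)
summary(fit)
```

Each test returns an object; printing it (for example, typing tt in the console) shows the test statistic, degrees of freedom, and p-value.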

Slides available here (26 Feb - edited with code explanations & exercise answers)

Session 5: Stats in R (part 2)

Learn how to conduct simple linear and logistic regression in R, as well as best practices to improve code readability and reproducibility.
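As a minimal sketch of the two model types named above, base R provides lm() for linear regression and glm() with family = binomial for logistic regression. The data frame below, including the promoted outcome, is invented purely for illustration.

```r
# Toy data (illustrative values only)
df <- data.frame(
  yrs_service = c(1, 3, 5, 8, 10, 12, 15, 20),
  salary      = c(78, 82, 85, 93, 96, 103, 108, 121),
  promoted    = c(0, 0, 1, 0, 1, 0, 1, 1)   # hypothetical binary outcome
)

# Simple linear regression: numeric outcome, one predictor
lin_fit <- lm(salary ~ yrs_service, data = df)
summary(lin_fit)

# Logistic regression: binary (0/1) outcome, one predictor
log_fit <- glm(promoted ~ yrs_service, data = df, family = binomial)
summary(log_fit)

# Exponentiate the coefficients to read them as odds ratios
exp(coef(log_fit))
```

Both functions use the same formula syntax (outcome ~ predictor), so moving from one model to the other mostly means changing the function and the family argument.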

Slides available here (26 Feb - edited with code explanations & exercise answers)

Data for this Workshop

The workshop will use a dataset from the CSV file faculty_policy_eval.csv. It is a synthetic dataset derived from the Salaries dataset in the carData package.

Download: available here. Log in with your SMU account for access, and then go to File > Save As > Download a Copy. Change the file type to Comma-separated Values (.csv).
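Once the file is saved as a CSV, it can be read into R along the lines of the sketch below. The folder name "data" is a hypothetical location, so adjust the path to wherever you saved the file in your own project.

```r
# Assumed location: a folder called "data" inside the project (hypothetical)
csv_path <- file.path("data", "faculty_policy_eval.csv")

if (file.exists(csv_path)) {
  faculty <- read.csv(csv_path)
  str(faculty)   # compare column names and types with the data dictionary
} else {
  message("File not found at: ", csv_path)
}
```

The file.exists() check simply gives a readable message instead of an error when the path is wrong, which is a common stumbling block when starting out.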

The scenario for this dataset: The college is implementing a new policy that aims to improve faculty members’ teaching, research, and service performance scores, or TEARS. The new policy, called TED (TEARS Enhancement Directive), runs for 3 years, starting in 2020 and concluding in 2023.

Data dictionary:

pid

Unique ID to identify each observation

rank

An ordered factor with levels: AsstProf, AssocProf, Prof

discipline

A factor with levels: A (“theoretical” departments) or B (“applied” departments)

yrs.since.phd

Years since PhD

yrs.service

Years of service in the college

sex

A factor with levels: Female, Male

salary

Average annual salary over the 3-year period (2020 – 2023) in USD

teaching.2020

Faculty’s teaching score for 2020, before implementation of TED

research.2020

Faculty’s research score for 2020, before implementation of TED

service.2020

Faculty’s service score for 2020, before implementation of TED

teaching.2023

Faculty’s teaching score for 2023, after implementation of TED

research.2023

Faculty’s research score for 2023, after implementation of TED

service.2023

Faculty’s service score for 2023, after implementation of TED

Q1 to Q5

Faculty’s responses to a feedback survey using a 7-point Likert scale. The question was:

Please indicate your agreement with the following statements, with 1 = strongly disagree, 4 = neutral, and 7 = strongly agree.

  1. I feel adequately trained and informed about TED.

  2. TED implementation was effective and efficient.

  3. The communication about TED implementation was confusing.

  4. I found it challenging to understand the reasons behind TED.

  5. TED aligns well with our goal and values.
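Statements 3 and 4 are negatively worded, so they are natural candidates for reverse coding before items are combined. The sketch below assumes Q3 and Q4 are the items to reverse (the source does not state this explicitly) and uses made-up responses.

```r
# Toy responses (illustrative values only) using the Q1 to Q5 column names
survey <- data.frame(
  Q1 = c(6, 2),
  Q2 = c(5, 3),
  Q3 = c(2, 6),
  Q4 = c(1, 7),
  Q5 = c(7, 2)
)

# Assumption: Q3 and Q4 are the negatively worded items
neg_items <- c("Q3", "Q4")

# On a 1-7 scale, reversed score = (max + min) - original = 8 - original
survey[neg_items] <- 8 - survey[neg_items]

survey
```

After reversing, a higher number means a more favourable view of TED on every item, and the neutral midpoint of 4 stays at 4.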