Introductory R for Social Sciences
Welcome!
This site serves as a repository for the slides and code developed for the ‘Introductory R for Social Sciences’ workshop tailored for undergraduate students at SMU. The workshop comprises five sessions and is designed for individuals with fundamental statistics knowledge who are venturing into R programming or coding for the first time.
Date, Time, and Venue:
- Thursday sessions:
- Date: Thursdays, 18 Jan, 25 Jan, 1 Feb, 8 Feb, 15 Feb 2024.
- Time: 3.30 PM to 5.30 PM
- Venue:
LKCSB Classroom 3.5 / YPHSL Seminar Room 2.04
- Friday sessions (for IDIS100):
- Date: Fridays, 19 Jan, 26 Jan, 2 Feb, 16 Feb, 23 Feb 2024.
- Time: 3.30 PM to 5.30 PM
- Venue:
SOSS/CIS Classroom 3.4 / SOSS/CIS Seminar Room 3.1
Workshop Contents
Slides, scripts, and other materials will be progressively made available below.
Session 1: Introduction to R and RStudio
Learn the fundamentals of R and RStudio, including how to set up your working directory and R projects, basic data structures in R such as vectors and data frames, and how to import data files into R.
Slides available here (29 Jan - edited with code explanations)
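As a small taste of the Session 1 topics, here is a minimal sketch of vectors, data frames, and CSV import. The values and the object names (`scores`, `ranks`, `faculty`) are made up for illustration and are not taken from the workshop slides; the file is written to a temporary location so the example runs anywhere.

```r
# Vectors hold values of a single type
scores <- c(4.2, 3.8, 4.5)
ranks  <- c("AsstProf", "AssocProf", "Prof")

# A data frame combines equal-length vectors as columns
faculty <- data.frame(rank = ranks, score = scores)

# Write the toy data to a temporary CSV file, then import it again
csv_path <- tempfile(fileext = ".csv")
write.csv(faculty, csv_path, row.names = FALSE)
imported <- read.csv(csv_path)

str(imported)         # inspect the column types
mean(imported$score)  # average of the score column
```

In your own project you would replace `csv_path` with the path to a real file, relative to your working directory.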
Session 2: Data wrangling with tidyverse
Learn the basics of data wrangling with tidyverse, including how to remove duplicates, reverse-code items, filter data, reshape data to prepare it for analysis, and more.
Slides available here (29 Jan - edited with code explanations & exercise answers)
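Three of the Session 2 operations (removing duplicates, filtering, and reshaping) might look roughly like this. It is only a sketch on invented toy data, assuming the dplyr and tidyr packages are installed; the column names mimic the workshop dataset, but the values and the object names (`toy`, `clean`, `long`) are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy data (invented): pid 2 is entered twice on purpose
toy <- tibble(
  pid           = c(1, 2, 2, 3),
  discipline    = c("A", "B", "B", "A"),
  teaching.2020 = c(3.1, 4.0, 4.0, 2.8),
  teaching.2023 = c(3.6, 4.2, 4.2, 3.5)
)

clean <- toy %>%
  distinct() %>%              # drop exact duplicate rows
  filter(discipline == "A")   # keep discipline A only

# Reshape from wide (one column per year) to long (one row per year)
long <- clean %>%
  pivot_longer(
    cols         = starts_with("teaching"),
    names_to     = "year",
    names_prefix = "teaching.",
    values_to    = "teaching"
  )

long
```

The long format is usually what plotting and modelling functions expect when the same measure is recorded at several time points.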
Session 3: Data visualization and Quarto
Learn how to create data visualizations like scatterplots, boxplots, and barplots with ggplot. Additionally, the session will introduce Quarto, a document format that allows seamless integration of R code, text, plots, and citations into a single document, which can then be effortlessly converted to PDF, Word, or HTML.
Slides available here (6 Feb - edited with code explanations & exercise answers)
A few things you need to do before the session:
- Download the supplementary materials here. The zip file contains two files: faculty-eval-with-scores.csv and apa-single-styled.csl. Put the CSV file inside the data-output folder in your project, and put the CSL file in the same location as your R scripts.
- Create an account on quartopub.com.
- (Sort of optional) Have Zotero installed on your laptop. If you already have it installed, you don’t have to do anything else :)
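To give a feel for the ggplot part of Session 3, here is a minimal scatterplot sketch. It assumes the ggplot2 package is installed and uses a tiny invented data frame rather than the workshop file; the column names follow the data dictionary, but the numbers are hypothetical.

```r
library(ggplot2)

# Toy data, invented for illustration
toy <- data.frame(
  yrs.service = c(2, 5, 9, 14, 20, 26),
  salary      = c(78000, 85000, 97000, 110000, 124000, 131000),
  discipline  = c("A", "B", "A", "B", "A", "B")
)

# Map variables to aesthetics, then add a layer of points and axis labels
p <- ggplot(toy, aes(x = yrs.service, y = salary, colour = discipline)) +
  geom_point(size = 3) +
  labs(x = "Years of service", y = "Salary (USD)", colour = "Discipline")

p  # printing the object draws the plot
```

Swapping `geom_point()` for `geom_boxplot()` or `geom_bar()` (with suitable variables) gives the other chart types mentioned in the session description.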
Session 4: Stats in R (part 1)
Learn how to conduct basic statistical analysis in social sciences such as chi-square, t-tests, correlations, and ANOVA.
Slides available here (26 Feb - edited with code explanations & exercise answers)
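The four tests named for Session 4 are all available in base R. The sketch below runs each one on simulated data, so the numbers carry no real meaning; the choice of variables for each test is an assumption made for illustration, not the analysis used in the slides.

```r
set.seed(2024)  # make the simulated numbers reproducible
n <- 90

# Simulated data; column names mimic the workshop dataset
toy <- data.frame(
  rank          = factor(rep(c("AsstProf", "AssocProf", "Prof"), each = n / 3)),
  discipline    = factor(sample(c("A", "B"), n, replace = TRUE)),
  teaching.2020 = rnorm(n, mean = 3.5, sd = 0.5)
)
toy$teaching.2023 <- toy$teaching.2020 + rnorm(n, mean = 0.2, sd = 0.3)

# Chi-square test of independence between two categorical variables
chi <- chisq.test(table(toy$rank, toy$discipline))

# Paired t-test: the same people measured before and after
tt <- t.test(toy$teaching.2023, toy$teaching.2020, paired = TRUE)

# Pearson correlation between the two scores
ct <- cor.test(toy$teaching.2020, toy$teaching.2023)

# One-way ANOVA: does the 2023 score differ by rank?
fit_aov <- aov(teaching.2023 ~ rank, data = toy)

chi
tt
ct
summary(fit_aov)
```

Each test object can be printed as a whole or queried for a single element, for example `tt$p.value`.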
Session 5: Stats in R (part 2)
Learn how to conduct simple linear and logistic regression in R, as well as best practices to improve code readability and reproducibility.
Slides available here (26 Feb - edited with code explanations & exercise answers)
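For Session 5, a simple linear regression and a logistic regression can be sketched with `lm()` and `glm()` from base R. The data below are simulated, and the binary outcome `improved` is a hypothetical variable invented for this example; it is not a column in the workshop dataset.

```r
set.seed(2024)  # make the simulated numbers reproducible
n <- 100

# Simulated predictor and outcomes
toy <- data.frame(yrs.service = round(runif(n, min = 0, max = 40)))
toy$salary <- 80000 + 1000 * toy$yrs.service + rnorm(n, sd = 10000)

# Hypothetical binary outcome: 1 = score improved, 0 = did not
toy$improved <- rbinom(n, size = 1,
                       prob = plogis(-1 + 0.05 * toy$yrs.service))

# Simple linear regression: continuous outcome, one predictor
fit_lm <- lm(salary ~ yrs.service, data = toy)
summary(fit_lm)

# Logistic regression: binary outcome, one predictor
fit_glm <- glm(improved ~ yrs.service, data = toy, family = binomial)
summary(fit_glm)

# Exponentiated coefficients can be read as odds ratios
exp(coef(fit_glm))
```

Calling `set.seed()` before any random step is one small habit that supports the reproducibility goal mentioned above.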
Data for this Workshop
The workshop will use a dataset from the CSV file faculty_policy_eval.csv. It is a synthetic dataset derived from the Salaries dataset in the carData package.
Download: available here. Log in with your SMU account for access, then go to File > Save As > Download a Copy. Change the file type to Comma-separated Values (.csv).
The scenario for this dataset: The college is implementing a new policy that aims to improve faculty’s teaching, research, and service performance scores, or TEARS. The new policy, called TED (TEARS Enhancement Directive), is implemented over 3 years, starting in 2020 and concluding in 2023.
Data dictionary:
pid - Unique ID to identify each observation
rank - An ordered factor with levels: AsstProf, AssocProf, Prof
discipline - A factor with levels: A (“theoretical” departments) or B (“applied” departments)
yrs.since.phd - Years since PhD
yrs.service - Years of service in the college
sex - A factor with levels: Female, Male
salary - Average annual salary for the 3-year period (2020 – 2023) in USD
teaching.2020 - Faculty’s teaching score for 2020, before implementation of TED
research.2020 - Faculty’s research score for 2020, before implementation of TED
service.2020 - Faculty’s service score for 2020, before implementation of TED
teaching.2023 - Faculty’s teaching score for 2023, after implementation of TED
research.2023 - Faculty’s research score for 2023, after implementation of TED
service.2023 - Faculty’s service score for 2023, after implementation of TED
Q1 to Q5 - Faculty’s responses to a feedback survey on a 7-point Likert scale. The question was:
Please indicate your agreement with the following statements, with 1 = strongly disagree, 4 = neutral, and 7 = strongly agree.
I feel adequately trained and informed about TED.
TED implementation was effective and efficient.
The communication about TED implementation was confusing.
I found it challenging to understand the reasons behind TED.
TED aligns well with our goal and values.
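The third and fourth statements are negatively worded, so they are natural candidates for reverse coding before the items are combined. The sketch below assumes the five statements map to Q1 to Q5 in the order listed and that Q3 and Q4 are the items to reverse; the responses and the names `Q3_r`, `Q4_r`, and `ted_attitude` are invented for illustration.

```r
# Hypothetical responses from three faculty members (invented values)
survey <- data.frame(
  pid = 1:3,
  Q1  = c(6, 5, 2),
  Q2  = c(7, 4, 3),
  Q3  = c(2, 4, 6),
  Q4  = c(1, 3, 7),
  Q5  = c(6, 5, 1)
)

# On a 1-7 scale, reverse coding maps 1 -> 7, 2 -> 6, ..., 7 -> 1,
# which is (max + min) - x = 8 - x
survey$Q3_r <- 8 - survey$Q3
survey$Q4_r <- 8 - survey$Q4

# Mean of the five items, using the reversed versions of Q3 and Q4
survey$ted_attitude <- rowMeans(
  survey[, c("Q1", "Q2", "Q3_r", "Q4_r", "Q5")]
)

survey
```

After reversing, a higher value on every item points in the same direction (a more favourable view of TED), so averaging them makes sense.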