R programming for beginners (GV900)

Lesson 1: R & RStudio interface

Friday, December 22, 2023

Video of Lesson 1 ~ Part 1

1 About me

Hello everyone, I’m Reddy.

I’m a PhD candidate in Political Science at the University of Essex. My research focuses on authoritarian regimes, democratization, international conflict, coup d’état dynamics, and political economy.

As a Teaching Assistant for GV900–Introduction to Quantitative Methods & Data Analysis I, I’m excited to share my knowledge and help beginners navigate the world of R programming.

Let’s get started!

2 Install R & RStudio

3 Get familiar with RStudio interface

  1. Console

    • Prompt: >
  2. Bottom right pane

    • Files
    • Plots
    • Help
    • Viewer
  3. Top right pane

    • Environment
    • History

4 Basic operations: deal with numeric variables

  1. Addition: + \[ 123 + 321 \]
Code
123 + 321
[1] 444
  2. Subtraction: - \[ 123 - 321 \]
Code
123 - 321
[1] -198
  3. Multiplication: * \[ 123 \times 321 \]
Code
123 * 321
[1] 39483
  4. Division: / \[ 123 \div 321 \]
Code
123 / 321
[1] 0.3831776
  5. Exponent / power: ^ \[ 123^2 \]
Code
123^2
[1] 15129
  6. Square root: sqrt() \[ \sqrt{123} \]
Code
sqrt(123)
[1] 11.09054
  7. Remainder / modulus: %%
Code
10 %% 3
[1] 1
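
The operators above can be combined in one expression, and R then follows the usual order of operations. The short sketch below illustrates this; it also uses the integer division operator `%/%`, which is not covered in the lesson itself, and the names `quotient` and `remainder` are chosen only for this example.

```r
# Order of operations: * and / are evaluated before + and -
1 + 2 * 3        # 7
(1 + 2) * 3      # 9, parentheses change the order

# ^ is evaluated before the unary minus
-2^2             # -4
(-2)^2           # 4

# Integer division (%/%) and remainder (%%) belong together
quotient  <- 10 %/% 3      # 3
remainder <- 10 %% 3       # 1
quotient * 3 + remainder   # 10, the original number
```

When in doubt about the order, adding parentheses is a safe habit.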

5 Rerun previous commands

  1. Up Arrow: ↑

  2. History: the History tab, or history()

6 Strings

Video of Lesson 1 ~ Part 2

Code
"Hello world!"
[1] "Hello world!"
Code
'Hello world!'
[1] "Hello world!"
Code
"I said 'Hello'."
[1] "I said 'Hello'."
Code
'I said "Hello".'
[1] "I said \"Hello\"."
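
The backslashes in the last output are only how the console displays quotation marks inside a string; they are not stored as extra characters. A minimal sketch of this, using `writeLines()`, `nchar()` and `paste()`, none of which appear in the lesson (the object name `s` is just for illustration):

```r
s <- 'I said "Hello".'

print(s)        # shows the escaped form: [1] "I said \"Hello\"."
writeLines(s)   # shows the raw text:     I said "Hello".

nchar(s)        # 15 characters; the backslashes are not counted

paste("Hello", "world!")   # joins strings with a space: "Hello world!"
```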

7 Vectors

Code
1:10 # from 1 to 10, one by one
 [1]  1  2  3  4  5  6  7  8  9 10
Code
seq(from = 1, to = 100, by = 10) # from 1 to 100, by 10
 [1]  1 11 21 31 41 51 61 71 81 91
Code
seq(from = 1000, to = 100, by = -100) # from 1000 down to 100, by -100
 [1] 1000  900  800  700  600  500  400  300  200  100
Code
seq(from = 1, to = 100, length.out = 27) 
 [1]   1.000000   4.807692   8.615385  12.423077  16.230769  20.038462
 [7]  23.846154  27.653846  31.461538  35.269231  39.076923  42.884615
[13]  46.692308  50.500000  54.307692  58.115385  61.923077  65.730769
[19]  69.538462  73.346154  77.153846  80.961538  84.769231  88.576923
[25]  92.384615  96.192308 100.000000
Code
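# length.out = 27 asks for 27 equally spaced elements; R works out the step itself
# seq(from = 1, to = 100, by = 4, length.out = 27) # error: from, to, by and length.out cannot all be given at once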
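
To see why `by` and `length.out` conflict, it helps to compute the step that `length.out = 27` implies. The sketch below does this by hand; the names `lo`, `hi`, `n` and `step` are illustrative only.

```r
lo <- 1
hi <- 100
n  <- 27

# With n equally spaced points there are n - 1 gaps between lo and hi
step <- (hi - lo) / (n - 1)
step                                    # 3.807692

a <- seq(from = lo, to = hi, length.out = n)
b <- seq(from = lo, by = step, length.out = n)
all.equal(a, b)                         # TRUE: both describe the same vector

# A step of 4 gives a different number of elements
length(seq(from = 1, to = 100, by = 4)) # 25
```

So once `from`, `to` and `length.out` are fixed, the step is already determined, and asking for `by = 4` as well would contradict it.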
Code
rep(c(1, 3, 5), times = 2) # repeat the whole vector 1, 3, 5 twice
[1] 1 3 5 1 3 5
Code
rep(c(1, 3, 5), each = 2) # repeat each element of 1, 3, 5 twice
[1] 1 1 3 3 5 5
Code
sample(x = 1:100, size = 10, replace = TRUE) # randomly draw 10 numbers from 1 to 100 with replacement, so repeats are possible
 [1] 81 88 26 71 89 99 55 21 67 63
Code
sample(x = c("Head", "Tail"), size = 10, replace = TRUE) # toss a fair coin 10 times
 [1] "Tail" "Tail" "Tail" "Tail" "Head" "Tail" "Head" "Tail" "Tail" "Head"
Code
sample(x = c("Head", "Tail"), size = 10,
       replace = TRUE,
       prob = c(0.3, 0.7)) # unfair coin: P(Head) = 0.3, P(Tail) = 0.7
 [1] "Tail" "Head" "Tail" "Tail" "Head" "Tail" "Head" "Tail" "Tail" "Tail"
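
Because `sample()` is random, running the lines above again will usually give different results from those shown. A common way to make random draws repeatable is `set.seed()`, which the lesson does not cover; the sketch below assumes an arbitrary seed of 2023, and the object names are illustrative.

```r
set.seed(2023)   # arbitrary seed, chosen only for this example
first_draw <- sample(x = 1:100, size = 10, replace = TRUE)

set.seed(2023)   # resetting the same seed repeats the same draws
second_draw <- sample(x = 1:100, size = 10, replace = TRUE)

identical(first_draw, second_draw)   # TRUE

# With many tosses, the share of heads gets close to the stated probability
tosses <- sample(x = c("Head", "Tail"), size = 10000,
                 replace = TRUE, prob = c(0.3, 0.7))
mean(tosses == "Head")               # close to 0.3
```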

8 Assignment: create objects

Code
x <- sample(x = c("Head", "Tail"),
            size = 10, replace = TRUE,
            prob = c(0.3, 0.7))
# assign the result to x. Read as "x gets the value ..."

table(x)
x
Head Tail 
   3    7 
Code
x # typing the name of an object prints its contents
 [1] "Tail" "Tail" "Head" "Tail" "Tail" "Head" "Head" "Tail" "Tail" "Tail"
Code
y <- 27 # assignment alone prints nothing
(z <- 27) # wrapping the assignment in parentheses also prints the value
[1] 27
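
An object can be overwritten at any time, and its current value can be used to compute the new one. A minimal sketch, assuming the extra object names `v` and `w`, which are not part of the lesson:

```r
y <- 27
y <- y + 3     # the right-hand side is evaluated first, then stored in y
y              # 30

(v <- y * 2)   # 60, assigned and printed in one step

w = 5          # = also assigns at the top level, but <- is the usual R style

ls()           # list the objects created so far
rm(w)          # remove an object that is no longer needed
```

The objects listed by `ls()` are the same ones shown in the Environment tab of RStudio.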

9 Locate element(s) in a vector

Code
x
 [1] "Tail" "Tail" "Head" "Tail" "Tail" "Head" "Head" "Tail" "Tail" "Tail"
Code
x[3]
[1] "Head"
Code
x[c(1, 9)]
[1] "Tail" "Tail"
Code
x[4:6]
[1] "Tail" "Tail" "Head"
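
Square brackets accept more than positive positions. The sketch below retypes the vector `x` shown above so that it runs on its own, and illustrates negative indices, logical conditions and `which()`, which go slightly beyond the lesson.

```r
x <- c("Tail", "Tail", "Head", "Tail", "Tail",
       "Head", "Head", "Tail", "Tail", "Tail")

length(x)            # 10 elements
x[length(x)]         # the last element: "Tail"

x[-1]                # everything except the first element
x[x == "Head"]       # only the elements equal to "Head"
which(x == "Head")   # their positions: 3 6 7
```

Note that R counts positions from 1, not from 0.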

10 Vector operations

Code
y <- c(1, 4, 9) # y is overwritten here: it held 27 before
y * 2 # every element is multiplied by 2
[1]  2  8 18
Code
sqrt(y)
[1] 1 2 3
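
The same element-by-element logic applies when two vectors of equal length are combined. A short sketch, where the second vector `w` is made up for illustration:

```r
y <- c(1, 4, 9)
w <- c(10, 20, 30)

y + w           # 11 24 39, first with first, second with second, ...
y * w           # 10 80 270
y > 3           # FALSE TRUE TRUE, comparisons are element-wise too

length(y + w)   # 3, the result is as long as the inputs
sum(y)          # 14, sum() collapses a vector to a single number
```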

11 Rules to name objects

Code
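a1 <- c(1, 2) # a name must start with a letter (or a dot), not a digit
# test variable <- c(1, 2) # error: spaces are not allowed in names
Test_var <- c(1, 2) # names are case sensitive
# test_var # error: object 'test_var' not found, because capitalization matters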
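
One way to check whether a name is acceptable is `make.names()`, which turns any string into a valid R name. This function is not part of the lesson, and the sketch below uses it only to show what R changes.

```r
make.names("a1")              # "a1": already valid, left unchanged
make.names("test variable")   # "test.variable": the space is replaced by a dot
make.names("1a")              # "X1a": a name cannot start with a digit

Test_var <- c(1, 2)
exists("Test_var")            # TRUE
exists("test_var")            # FALSE in a fresh session: a different name
```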

12 Basic functions

Code
test <- rbinom(n = 100, size = 10, prob = 0.5)
# This is like tossing 10 fair coins (each with probability 0.5 of 'Heads')
# and repeating the experiment 100 times. The result is a vector of 100 numbers.
# Each number is the count of 'Heads' in one experiment, where 'Heads' is
# treated as the successful outcome.

test
  [1] 4 5 7 7 5 6 5 7 6 4 3 3 6 5 5 7 6 3 4 6 5 2 3 4 4 3 6 8 4 5 5 5 7 5 6 6 6
 [38] 6 6 6 5 5 3 7 5 6 7 4 4 6 5 8 3 3 3 6 7 6 3 8 7 4 7 7 3 2 7 8 4 5 7 5 4 3
 [75] 5 6 6 6 6 5 3 4 7 5 4 2 7 5 5 3 8 7 4 4 8 6 3 6 6 5
Code
table(test) # count how often each value occurs (frequencies, not proportions)
test
 2  3  4  5  6  7  8 
 3 15 15 22 23 16  6 
Code
mean(test) # average number of successes (heads) per experiment
[1] 5.19
Code
max(test) # find the maximum
[1] 8
Code
min(test) # find the minimum
[1] 2
Code
var(test) # variance
[1] 2.418081
Code
sd(test) # standard deviation
[1] 1.555018
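
`table()` returns counts; to turn them into proportions, one option is `prop.table()`. The sketch below also compares the sample mean and variance with the theoretical values of a binomial distribution. It assumes an arbitrary seed of 900, so the numbers will differ from the output shown above, and the names `counts` and `props` are illustrative.

```r
set.seed(900)   # arbitrary seed, chosen only so the example is repeatable
test <- rbinom(n = 100, size = 10, prob = 0.5)

counts <- table(test)          # how many times each value occurs
props  <- prop.table(counts)   # the same table as shares of the total
sum(props)                     # 1

# The standard deviation is the square root of the variance
all.equal(sd(test), sqrt(var(test)))   # TRUE

# Theoretical values for size = 10, prob = 0.5
10 * 0.5          # mean: 5
10 * 0.5 * 0.5    # variance: 2.5

summary(test)     # minimum, quartiles, mean and maximum in one call
```

The sample values shown earlier (mean 5.19, variance 2.418081) are close to these theoretical values, as expected with 100 experiments.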

13 How to get Help

Code
?mean # open the help page for mean()
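
Beyond `?`, base R offers a few other ways to look things up. A brief sketch of some of them; which is most convenient is a matter of taste:

```r
help("mean")              # the same as ?mean
args(sample)              # show the arguments of a function and their defaults
example("mean")           # run the examples from the bottom of the help page
??"standard deviation"    # search all help pages when the function name is unknown
```

In RStudio, the help page opens in the Help tab of the bottom right pane.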

14 Video tutorial: R programming for ABSOLUTE beginners

15 Homework

  1. Task 1

    1. Small vector: Write code (in the R script, not in the console) to create a vector, using the combine function c(), called “small” that holds the numbers from 1 to 4, i.e., the four values 1, 2, 3 and 4.

    2. Big vector: Next, create another vector called “big” that holds the values from 5 to 8.

    3. Third vector: Now create a third vector called “sum” by adding small and big. Display “sum”. What is the length of the vector called sum? Write the length in a comment for yourself.

  2. Task 2

    1. First vector of numbers: Create a vector of numbers from 0 to 50 in increments of 5; store it as a variable called “first.”

    2. Second vector of numbers: Create a variable called “second” that is a vector of numbers from 5 to 60 and has the same length as the first vector.

    3. Add vectors: Add the two vectors, naming the new vector “third.”

  3. Task 3

    1. Vector of names: Create a vector of six first names of politicians (real or hypothetical) and save it as a variable. Remember to name the variable.

    2. Sub vector: Save the third, fourth and fifth names as a separate variable.

  4. Task 4

    • Create a sample variable: Imagine we want to think about factors that affect whether someone turned out to vote in the last election. Call this variable ‘turnout’ and assume it is a nominal/categorical variable that can take on the value ‘yes’ or ‘no’. Create a sample variable of length 100 called turnout that randomly takes on these values such that approximately 80% of the total sample did turn out to vote.


Thank you!