class: center, middle, inverse, title-slide

# Introduction to R, part 2
## Research Methods and Skills
### 20/10/2020

---

# Interacting with R

.left-pull[
* The R Console
* REPL: Read Evaluate Print Loop
* Type stuff in, it tries to do it
]

.right-pull[
![:scale 60%](images/01/default_project.png)
]

---

# Basic use of R

## Use of R like a calculator

You can use the R console like a calculator, as below:

```r
5 + 5
```

```
## [1] 10
```

```r
10 - 6 * 13
```

```
## [1] -68
```

---

# Basic use of R

## Creating objects to store information

You assign values to objects using *<-*

```r
test_object <- 5
```

*<-* can be read as "is now", making the code above roughly mean

```r
The object "test_object" is now 5 # Do not run!
```

Objects stand in for their values:

```r
test_object
```

```
## [1] 5
```

---

# Basic use of R

## Creation of vectors

A vector is simply a one-dimensional collection of values of the same type.

For example, we can create a numeric vector using the **c()** function.

```r
c(5, 10, 3, -1, -5)
```

```
## [1]  5 10  3 -1 -5
```

This is a one-dimensional vector of length *five*, since it has 5 values.

---

# Basic use of R

## Using functions on objects

**Functions** do things to objects. Brackets after a word in these slides indicate that something is a function, e.g. **c()**, **mean()**

```r
mean(c(5, 8, 2, 4, 5))
```

```
## [1] 4.8
```

```r
test_object <- c(5, 8, 2, 4, 5)
mean(test_object)
```

```
## [1] 4.8
```

---
class: inverse, center, middle

# R Scripts

---

# R Scripts

*Scripts* are a way of writing out a sequence of commands that you want R to execute.

A typical script looks something like this:

```r
# Load in required packages using library()
library(tidyverse)

# Define any custom functions here (we haven't covered this!)

# Now load any data you want to work on. (again, we'll cover this later!)
test_data <- read_csv("data/a-random-RT-file.csv") %>% # I'll explain what %>% means later
  rename(RT = `reaction times`)

# The rest of the script then runs whatever analyses or plotting you want to do
ggplot(test_data, aes(x = RT, fill = viewpoint)) +
  geom_density()
```

---

# Why is this useful?

.large[
Somebody asks you how you performed a particular analysis.

Specifically, they want detailed instructions on how you created a plot, filtered out outliers or missing data, and performed a linear regression.

Q1: *How would you do that if you used SPSS?*

Q2: *How would you do that if you used R?*
]

---
class: inverse, center, middle

# Creating R Scripts

---
background-image: url(images/02/cloud_blank.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-create-script.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-script-window.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-examp-script.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-script-run.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-script-source.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-sourced-script.png)
background-size: contain
class: inverse

---

# R Markdown

.large[
**Literate programming** is a mixture of plain text and code.

Whereas in scripts you need to use the **#** symbol to indicate comments, as here

```r
# This is a comment
```

...with R Markdown you can mix plain text and code using **chunks** to delineate sections of code.

This allows you to create elaborate documents following the structure *you* want!
]

---
background-image: url(images/02/cloud-rmarkdown.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmarkdown-install.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmarkdown-new.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmd-example.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmd-chunk-lab.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmd-click-run.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmd-chunk-play.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmd-click-knit.png)
background-size: contain
class: inverse

---
background-image: url(images/02/cloud-rmd-html-file.png)
background-size: contain
class: inverse

---

# Some very important advice

R Markdown documents are like *recipes*. Every step needs to be written down.

When you press the Knit button, R starts from scratch: it forgets everything in your current session and follows the instructions in the document line by line, from top to bottom.

So be thorough, and write down everything in the order you want it to happen!
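
A minimal sketch of what "in the order you want it to happen" means in practice. The object names `reaction_times` and `mean_rt` and the values are made up for illustration:

```r
# Hypothetical example: an object must be created before it is used.
# Step 1: create the data
reaction_times <- c(350, 420, 390)

# Step 2: use the data. If this line came before step 1, a fresh R
# session would stop with an "object not found" error.
mean_rt <- mean(reaction_times)
mean_rt
```

Swapping the two steps may appear to work in a console where `reaction_times` already exists, but it would fail when the document is knitted.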

(One exception: NEVER put install.packages() in a script or R Markdown document; install packages once, from the console.)

---
class: inverse, center, middle

# Basic data types

---

# Basic data types

There are five basic data types that you will commonly encounter in R:

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Type </th>
   <th style="text-align:left;"> Description </th>
   <th style="text-align:left;"> Examples </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> integer </td>
   <td style="text-align:left;"> Whole numbers </td>
   <td style="text-align:left;"> 1, 2, 3 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> numeric </td>
   <td style="text-align:left;"> Any real number, fractions </td>
   <td style="text-align:left;"> 3.4, 2, -2.3 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> character </td>
   <td style="text-align:left;"> Text </td>
   <td style="text-align:left;"> "Hi there", "8.5", "ABC123" </td>
  </tr>
  <tr>
   <td style="text-align:left;"> logical </td>
   <td style="text-align:left;"> Assertion of truth/falsity </td>
   <td style="text-align:left;"> TRUE, FALSE </td>
  </tr>
  <tr>
   <td style="text-align:left;"> complex </td>
   <td style="text-align:left;"> Real and imaginary numbers </td>
   <td style="text-align:left;"> 0.34+5.3i </td>
  </tr>
</tbody>
</table>

There are some additional types to be aware of, particularly *factors*, but we'll come back to them in a later session.

---

# Checking data types

We can use the **class()** function to check what type a given object is.

```r
class(10)
```

```
## [1] "numeric"
```

```r
class(10L) # using L after the number turns it into an *integer*
```

```
## [1] "integer"
```

```r
class(TRUE)
```

```
## [1] "logical"
```

```r
class("Wednesday")
```

```
## [1] "character"
```

---
class: inverse, middle, center

# Basic containers

---
background-image: url(images/02/masonjars.jpg)
background-size: contain

---

# Vectors

A vector is a collection of values which all have the same basic **type**.

A numeric vector is thus a collection of numeric values:

```r
some_numbers <- c(5, 3, 6, 8)
some_numbers
```

```
## [1] 5 3 6 8
```

...
and a character vector is a collection of character values:

```r
char_example <- c("Monday", "Tuesday", "Wednesday", "Thursday")
char_example
```

```
## [1] "Monday"    "Tuesday"   "Wednesday" "Thursday"
```

---

# More about vectors

The colon (**:**) operator can be used to produce a sequence of numbers:

```r
one_to_ten <- 1:10
one_to_ten
```

```
##  [1]  1  2  3  4  5  6  7  8  9 10
```

Vectors can also be given names:

```r
one_to_four <- 1:4
names(one_to_four) <- char_example
one_to_four
```

```
##    Monday   Tuesday Wednesday  Thursday 
##         1         2         3         4
```

---

# Extracting values

Sometimes you only want a specific subset of a vector. For example, suppose that you only want the third value.

For this, we need the **[]** (square brackets) operator. We put an *index* between the square brackets.

```r
char_example[3]
```

```
## [1] "Wednesday"
```

Note that you can also supply *multiple* indices:

```r
char_example[2:3]
```

```
## [1] "Tuesday"   "Wednesday"
```

```r
char_example[c(2, 4)]
```

```
## [1] "Tuesday"  "Thursday"
```

---

# Extracting values

If your vector is *named*, you can also use the names as *indices*.

```r
one_to_four
```

```
##    Monday   Tuesday Wednesday  Thursday 
##         1         2         3         4
```

```r
one_to_four["Wednesday"]
```

```
## Wednesday 
##         3
```

```r
one_to_four[c("Monday", "Wednesday")]
```

```
##    Monday Wednesday 
##         1         3
```

---
background-image: url(images/02/wine_rack.jpg)
background-size: 60%

# Matrices

---

# Matrices

Matrices are 2-dimensional collections of values. All values must be of the same type.

```r
matrix(1:9, nrow = 3, ncol = 3)
```

```
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
```

This is quite a common format. For example, each row could represent an individual participant. Each column could represent a different numerical measure.

---

# Accessing matrices

Since matrices are two-dimensional, you need to give two indices to make sure you get the value you want.

Again, you can use the **[]** operator.

```r
*[row, col]
```

Here I extract the number in the 2nd row down, 3rd column across.

```r
test_matrix <- matrix(1:9, nrow = 3, ncol = 3)
test_matrix
```

```
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
```

```r
test_matrix[2, 3]
```

```
## [1] 8
```

---
background-image: url(images/02/masonjars.jpg)
background-size: 60%

# Lists

---

# Lists

A list is a collection of objects that can differ in length and type.

```r
album_list <- list(
  The_Beatles = c("Sgt. Pepper", "The White Album", "Revolver", "Abbey Road"),
  Nirvana = c("Bleach", "Nevermind", "In Utero")
)
```

Each element is labelled, just like a mason jar on a shelf.

Each element has different contents, just like our mason jars.

---

# Lists

```r
names(album_list)
```

```
## [1] "The_Beatles" "Nirvana"
```

```r
length(album_list)
```

```
## [1] 2
```

```r
album_list["The_Beatles"]
```

```
## $The_Beatles
## [1] "Sgt. Pepper"     "The White Album" "Revolver"        "Abbey Road"
```

---

# Tabular data

*Tabular* data is also a collection of different types of data, arranged in a rectangular format.

Most of the data you encounter in psychology is in this kind of format.

In tabular data, each column contains only values of one *type*, and each row thus contains different types of information about one thing.
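
A minimal sketch of this idea, using base R's **data.frame()** function and made-up values. The object name `participants` and its columns are hypothetical, chosen only for illustration:

```r
# Hypothetical data: one row per participant, one column per measure
participants <- data.frame(
  id = 1:3,                                   # integer column
  handedness = c("left", "right", "right"),   # character column
  mean_rt = c(412.5, 389.0, 455.2),           # numeric column
  stringsAsFactors = FALSE                    # keep text as character
)

# Each column holds values of a single type...
sapply(participants, class)

# ...while a single row mixes types to describe one participant
participants[2, ]
```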

---
background-image: url(images/05/import-foc.png)
background-size: contain
class: inverse

---

# Creating tabular data

In R, this type of structure is called a *data frame*.

.pull-left[
```r
days_of_the_week <- data.frame(
  day_name = c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday"),
  day_number = 1:7
)
```
]

.pull-right[
```r
days_of_the_week
```

```
##    day_name day_number
## 1    Sunday          1
## 2    Monday          2
## 3   Tuesday          3
## 4 Wednesday          4
## 5  Thursday          5
## 6    Friday          6
## 7  Saturday          7
```
]

---

# Extracting information from data frames

You can use the **[]** operator to extract single elements, rows, or columns:

```r
days_of_the_week[1, 2]
```

```
## [1] 1
```

```r
days_of_the_week[5, ]
```

```
##   day_name day_number
## 5 Thursday          5
```

```r
days_of_the_week[, 1]
```

```
## [1] "Sunday"    "Monday"    "Tuesday"   "Wednesday" "Thursday"  "Friday"    "Saturday"
```

---

# Extracting information from data frames

A special operator you can use for data frame columns is the dollar sign, **$**.

Combine the data frame's name with the column name as below:

```r
days_of_the_week$day_name
```

```
## [1] "Sunday"    "Monday"    "Tuesday"   "Wednesday" "Thursday"  "Friday"    "Saturday"
```

Question: what **class()** is this?

---
class: inverse, middle, center

# Wrapping up

---

## This week's concepts

.large[
- R Markdown
    - Chapter 27 of R4DS
    - see also https://rmarkdown.rstudio.com
- **vectors** and **lists** in Chapter 20 of R4DS
]

## Prep for next week

.large[
- Next week we'll talk again about data frames and consider how to *structure* data.
- Look at Section 2 (Wrangle) of R4DS for information on **tibbles** (which are essentially data frames...).
]