
R Quickstart for Data Science and Analysis

Jonathan Barrios • September 14, 2020

data-science data-analysis

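If you've been curious about R or data science in general, this 10-minute R Quickstart is for you. You're going to install R and start using it from scratch, and by the end of this course, you will know how to work in the R console, do arithmetic, assign variables, write a script, recognize R's basic data types, and explore a built-in dataset.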

What is R?

R is a programming language built for statistical computing and widely used for general data science. I enjoy being part of the vibrant R community, and the language is also popular at top companies (Google, Airbnb, HP, Accenture, Pfizer) and universities. From business intelligence to hypothesis testing, R is a very flexible and capable programming language. R is also open-source and free, so it is no surprise that R is so prevalent in 2020. Most importantly, R has excellent visualization tools, and you will often see R graphics used in the wild, in my experience more so than Python's. R also makes package creation relatively easy, as evidenced by the vast number of community-created packages available.

Installing R and RStudio

To install R, head over to the R Project website and follow the steps for your operating system. You don't need RStudio to use R, since R ships with a built-in console. That said, RStudio is an integrated development environment (IDE) built for R, with syntax highlighting, code completion, and smart indentation. I should also mention that you can use Visual Studio Code with a few R extensions, available through the Visual Studio Code Extensions tab.

I'm on a Mac, and I prefer using Homebrew, but feel free to follow the R Project website instructions instead. To install R and RStudio using Homebrew, enter the following commands (recent Homebrew versions use install --cask in place of the older cask install):

brew install --cask r
brew install --cask rstudio

Now that you have R installed, let's get started.

The Console

The R console is a window where you type R commands and view the results. During installation, Homebrew placed the R application in the Applications directory. Open it by clicking the R icon, and the R Console will appear. Let's begin.

Arithmetic with R

To begin, you can perform mathematical calculations. For example, you can type 1 + 2, and the console returns 3.

# addition
1 + 2

# subtraction
5 - 5

# multiplication
3 * 5

# division
(5 + 5) / 2

# exponentiation
2 ^ 5

# modulo
28 %% 6
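
The parentheses in the division example matter. As a small sketch of R's standard operator precedence (the variable names here are only for illustration), compare the same numbers with and without them:

```r
# Without parentheses, the division runs before the addition
no_parens <- 5 + 5 / 2        # 7.5

# With parentheses, the addition runs first
with_parens <- (5 + 5) / 2    # 5

# Exponentiation binds tighter than multiplication
power_first <- 2 * 3 ^ 2      # 18

print(c(no_parens, with_parens, power_first))
```

When in doubt, adding parentheses costs nothing and makes the intent obvious.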

Assigning Variables

You can assign variables using the <- operator. Try saving height and width variables.

height <- 4
width <- 7

To view what variables you have created so far, type:

ls()
[1] "height" "width"

Finally, create a new variable, area, by multiplying height and width. Then type area to print the area of our example rectangle, which is 28.

area <- height * width
area

Try ls() again, and you will see three objects as expected.
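
One detail worth seeing in action: a variable stores a value, not a formula. The sketch below reuses the height, width, and area variables from above (stale_area is a hypothetical name added only for illustration) to show that changing height later does not update area until you recompute it:

```r
height <- 4
width <- 7
area <- height * width

# Changing height afterwards does not touch the value already stored in area
height <- 10
stale_area <- area          # still 28

# Recompute to pick up the new height
area <- height * width      # 70

print(c(stale_area, area))
```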

Scripting with R

As you can imagine, it would be tricky to change the height and width values often, so let’s use a script instead. An R script is a text file containing R code. Create an R file using the same lines of code you previously entered, like this:

(r_script.R)
height <- 4
width <- 7
area <- height * width
area

To write and run scripts more comfortably, you could start using an IDE like RStudio, or even Visual Studio Code with the help of a few extensions.
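
If you would rather stay in the console, one way to run a script is the built-in source() function. The following is a minimal sketch, assuming the script is written to a temporary folder (script_path is a name chosen for this example; in practice you would point at wherever you saved r_script.R). It also uses print(area) instead of a bare area, because source() does not echo values by default the way the console does:

```r
# Write the script to a temporary folder (only for this sketch;
# normally you would create r_script.R in your editor)
script_path <- file.path(tempdir(), "r_script.R")
writeLines(c(
  "height <- 4",
  "width <- 7",
  "area <- height * width",
  "print(area)"
), script_path)

# Run every line of the script in the current session
source(script_path)
```

From a terminal, Rscript r_script.R runs the same file without opening the console.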

Common R Data Types

Next up, R's six basic data types, also known as atomic vector types: logical, integer, double, complex, character, and raw.

We'll explore the logical, integer, and character data types using a few examples.

Logical Data Type

To find out what data type a value has, use the class() function. For example, TRUE is a logical type, so class(TRUE) returns "logical".

TRUE
class(TRUE)

Logical types are boolean, so they are either TRUE or FALSE.

Integer Data Type

A plain number such as 2 is not stored as an integer by default. To declare it as an integer, add the L suffix:

2
2L
class(2)
class(2L)

You can also use is.numeric() and is.integer() like this:

is.numeric(2)
is.numeric(2L)

is.integer(2)
is.integer(2L)
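
For reference, here is a short sketch of what those checks return in a standard R session. The last two lines go one step further and use as.integer(), a base R function not covered above, to convert an existing number (converted is a hypothetical variable name):

```r
class(2)         # "numeric"
class(2L)        # "integer"

is.numeric(2)    # TRUE
is.numeric(2L)   # TRUE, integers also count as numeric

is.integer(2)    # FALSE, a plain 2 is not stored as an integer
is.integer(2L)   # TRUE

# Convert an existing number to an integer
converted <- as.integer(2)
is.integer(converted)   # TRUE
```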

Character Data Type

A string is a character type. Use straight quotes, since R does not accept curly ones:

"I love R."
class("I love R")
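
As a small illustration of why the type matters, anything wrapped in quotes is a character value, even if it looks like a number. The variable names below are made up for this sketch, and nchar() is a base R function that counts characters:

```r
message_text <- "I love R"
class(message_text)     # "character"
nchar(message_text)     # 8, spaces included

# A number inside quotes is text, not a number
quoted_number <- "2"
class(quoted_number)    # "character"
is.numeric(quoted_number)   # FALSE
```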

We won't cover the remaining three data types in detail, but here is a short description of each before moving on to the next section. The double type holds double-precision floating-point numbers and is what R uses for a plain number such as 2, the complex type handles complex numbers, and the raw type stores raw bytes.
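
To see all six types side by side, the sketch below uses typeof(), a base R function that reports how a value is stored. Note that it is more specific than class(): for a plain 2, class() says "numeric" while typeof() says "double". The names examples and types are only for illustration:

```r
# One example value for each of the six atomic types
examples <- list(TRUE, 2L, 2, 1 + 2i, "I love R", as.raw(255))

# Ask R how each value is stored
types <- sapply(examples, typeof)
print(types)
# "logical" "integer" "double" "complex" "character" "raw"
```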

Built-in Datasets

If you want to get started exploring R using a built-in dataset, you can type:

library(datasets)

To access the head of the iris dataset:

head(iris)

Look at a summary:

summary(iris)
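
If you want to dig a little deeper before plotting, here is a short sketch using a few more base R functions on the same iris dataset. The variable species_means is a hypothetical name, and averaging Sepal.Length per species is just one example of a question you might ask:

```r
# Size of the dataset: 150 rows and 5 columns
dim(iris)

# Column names
names(iris)

# How many flowers of each species (50 each)
table(iris$Species)

# Average sepal length per species
species_means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(species_means)
```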

To plot the dataset, pass its name to plot():

plot(iris)

To end this session and detach the datasets package, type:

detach("package:datasets", unload = TRUE)

To close the iris plot, close its window manually or use the following command:

dev.off()

I hope you enjoyed the R Quickstart course and feel excited about using R to uncover your next data project's insights. Keep going and, as always, happy analyzing!