Goal
Get up and running with RStudio!
Directions
Before attempting the homework you must:
- Review the course syllabus.
- Read the “General homework directions”, which can be found in the repository on the course website. For Homework 1, ignore the R Markdown instructions. Instead, record your work in a Word (or similar) document. Include a title (“Homework 1”) and your name at the top of the document.
- Watch the assigned Videos 1 and 2. These can be found at the bottom of the repository.
- Submit your final document under the “Homework 1” location on Moodle. Confirm that it has been received: you should be able to click on and open your file.
Warm-Up
The warm-up exercises review the course expectations outlined in the syllabus and homework expectations outlined in the “General homework directions” document. You can find the Homework 1 Warm-up here.
Exercise 1
Be sure to show your RStudio code and output for each exercise below.
- Use RStudio to calculate the sum of 52 and 49.
- Use the rep function in RStudio to repeat the number “10” six times.
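As a minimal sketch of the kind of commands involved (deliberately using different values than the exercise asks for), arithmetic can be typed directly at the console, and rep() takes the value to repeat followed by the number of repetitions. The object names total and repeated are illustrative only.

```r
# Add two numbers and store the result
total <- 3 + 4
total      # typing a name prints its value

# Repeat the value 7 three times
repeated <- rep(7, times = 3)
repeated
```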
Exercise 2
The “World Prison Brief” conducted by the International Centre for Prison Studies provides insight into how incarceration rates vary from country to country. Statistics from the 2010 brief (courtesy of chartsbin.com) are stored at http://www.macalester.edu/~ajohns24/data/WorldIncarceration.csv, where incarceration rates are reported as the number of present incarcerations per 100,000 persons. You will need to use this data for the remaining exercises. Since it’s stored as a CSV file on the internet, you can import the data into RStudio and store it under the name Prison by copying and pasting the following into your RStudio console:
Prison = read.csv("http://www.macalester.edu/~ajohns24/data/WorldIncarceration.csv")
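To see what read.csv() produces without downloading anything, here is a small sketch that reads CSV text supplied inline through the function's text argument. The names csv_text and toy, and the values in them, are made up for illustration and are unrelated to the prison data.

```r
# Hypothetical inline CSV text standing in for a file or URL
csv_text <- "name,score\nAda,90\nLin,85"

# read.csv() parses the text into a data frame,
# using the first line as the variable names
toy <- read.csv(text = csv_text)

# Confirm what kind of object was created, then print it
class(toy)
toy
```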
Reminder: Be sure to show your RStudio code and output for each exercise below.
- Using the appropriate RStudio function, show the first 6 cases in this data.
- Using the appropriate RStudio function, report the number of cases and the number of variables in this data.
- Using the appropriate RStudio function, report the names of the variables in the data.
- Which of the variables are categorical? Which are quantitative?
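The following sketch shows some common base R functions for inspecting a data frame, applied to a small made-up data set rather than to Prison. The data frame toy and its variables city and temp are hypothetical, and these are not necessarily the exact functions covered in the videos.

```r
# Hypothetical toy data: 8 cases, 2 variables
toy <- data.frame(
  city = c("A", "B", "C", "D", "E", "F", "G", "H"),
  temp = c(12.5, 18.0, 9.3, 21.1, 15.4, 7.8, 19.9, 14.2)
)

head(toy)    # first 6 rows by default
dim(toy)     # number of rows (cases), then number of columns (variables)
names(toy)   # variable names
str(toy)     # how each variable is stored (text, numeric, ...)
```

The output of str() can help when deciding whether a variable is categorical or quantitative, though how a variable is stored does not always settle the question.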
Exercise 3
- Use the summary function to get a summary of the Prison data. Be sure to show your RStudio code and output.
- Which continent has the most countries represented in this data set, and how many countries does it have?
- What is the median incarceration rate for the countries in this data? Be sure to report your answer with the appropriate units.
- What is the largest incarceration rate? Be sure to report your answer with the appropriate units.
- We can isolate the U.S. data using the subset function:
subset(Prison, Country=="United States")
What is the U.S. incarceration rate and how does this compare to other countries?
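As a rough illustration of how summary() and subset() behave, the sketch below applies them to a small made-up data frame. The names toy, region, rate, and south are hypothetical. The categorical variable is created with factor() explicitly because, in R 4.0 and later, text columns are no longer converted to factors by default, and summary() only reports per-category counts for factors.

```r
# Hypothetical toy data with one categorical and one quantitative variable
toy <- data.frame(
  region = factor(c("North", "South", "North", "East")),
  rate   = c(10, 40, 25, 55)
)

# Counts per category for the factor; min, quartiles, median,
# mean, and max for the numeric variable
summary(toy)

# Keep only the rows where the condition is TRUE
south <- subset(toy, region == "South")
south
```

If summary(Prison) shows only "Length", "Class", and "Mode" for a text variable, one possible workaround (an assumption about your R version, not part of the original directions) is to convert that variable with factor() first.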
Exercise 4
Recall that we can subset a data set in order to isolate cases and variables of interest. Reminder: Be sure to show your RStudio code and output for each exercise below.
- In RStudio, isolate (and show) the variable containing the list of country names. HINT: Use the $ notation!
- One of the Prison variables indicates the continent of each case (i.e. country) in the data set. Using the appropriate RStudio function, show the list of category labels for the continent variable.
- Suppose we’re only interested in the incarceration rates for countries in Asia. Using the appropriate RStudio functions, create a smaller data set with only the Asian countries and name this PrisonAsia.
- Using the appropriate RStudio function, report the number of cases (countries) in the Asian subset.
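The pattern behind these steps can be sketched on a small made-up data frame. The names toy, region, rate, and toy_north are hypothetical, and the functions shown are one reasonable choice among several that base R offers.

```r
# Hypothetical toy data; region is stored as a factor (categorical)
toy <- data.frame(
  region = factor(c("North", "South", "North", "East")),
  rate   = c(10, 40, 25, 55)
)

toy$rate             # $ pulls out a single variable as a vector
levels(toy$region)   # category labels of a factor, sorted alphabetically

# Store the rows for one category under a new name
toy_north <- subset(toy, region == "North")
nrow(toy_north)      # number of cases in the smaller data set
```

Note that levels() returns NULL for a variable stored as plain text rather than as a factor; in that case, wrapping the variable in factor() or using unique() are possible alternatives.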