Goal

Get up and running with RStudio!



Directions

Before attempting the homework you must:



Warm-Up

The warm-up exercises review the course expectations outlined in the syllabus and homework expectations outlined in the “General homework directions” document. You can find the Homework 1 Warm-up here.



Exercise 1

Be sure to show your RStudio code and output for each exercise below.

  1. Use RStudio to calculate the sum of 52 and 49.
  2. Use the rep function in RStudio to repeat the number “10” six times.
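As a reminder of the syntax involved, here is a minimal sketch using different values from those in the exercise. The numbers and the letter "a" are placeholders chosen only for illustration.

```r
# Arithmetic can be typed directly at the console
7 * 6

# rep(x, times = n) repeats the value x a total of n times
rep("a", times = 3)
```

Running each line prints its result in the console, which is the "code and output" you are asked to show.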



Exercise 2

The “World Prison Brief,” compiled by the International Centre for Prison Studies, provides insight into how incarceration rates vary from country to country. Statistics from the 2010 brief (courtesy of chartsbin.com) are stored at

http://www.macalester.edu/~ajohns24/data/WorldIncarceration.csv


where incarceration rates are reported as the number of people currently incarcerated per 100,000 people. You will use this data for the remaining exercises. Since it’s stored as a CSV file on the internet, you can import the data into RStudio and store it under the name Prison by copying and pasting the following into your RStudio console:

Prison = read.csv("http://www.macalester.edu/~ajohns24/data/WorldIncarceration.csv")
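read.csv accepts a local file path as well as a URL. The sketch below illustrates this with a tiny made-up file. The file contents and the name Pets are hypothetical and have nothing to do with the prison data.

```r
# Write a small, made-up CSV to a temporary file (hypothetical data)
tmp = tempfile(fileext = ".csv")
writeLines(c("Name,Type,Weight",
             "Rex,dog,30.5",
             "Tom,cat,4.2"), tmp)

# read.csv treats the first line as the column names by default
Pets = read.csv(tmp)
Pets
```

Typing the name of the stored object, as in the last line, prints the whole data set.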

Reminder: Be sure to show your RStudio code and output for each exercise below.

  1. Using the appropriate RStudio function, show the first 6 cases in this data.
  2. Using the appropriate RStudio function, report the number of cases and the number of variables in this data.
  3. Using the appropriate RStudio function, report the names of the variables in the data.
  4. Which of the variables are categorical? Which are quantitative?
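The functions that answer questions like these can be tried out on any data frame. Here is a minimal sketch on a made-up data set. Pets and its three variables are hypothetical, and the specific functions shown are one reasonable choice rather than the only one.

```r
# Hypothetical toy data, not part of the assignment
Pets = data.frame(
  Name   = c("Rex", "Tom", "Ada", "Bo", "Kit", "Max", "Zoe", "Sam"),
  Type   = c("dog", "cat", "cat", "dog", "cat", "dog", "cat", "dog"),
  Weight = c(30.5, 4.2, 3.9, 22.0, 5.1, 27.3, 4.8, 18.6)
)

head(Pets)   # first 6 rows (cases) by default
dim(Pets)    # number of rows (cases), then number of columns (variables)
names(Pets)  # the variable names
str(Pets)    # how each variable is stored (text, numeric, ...)
```

How a variable is stored is a useful hint about whether it is categorical or quantitative, but the final call depends on what the variable measures.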



Exercise 3

  1. Use the summary function to get a summary of the Prison data. Be sure to show your RStudio code and output.
  2. Which continent has the most countries represented in this data set and how many countries does it have?
  3. What is the median incarceration rate for the countries in this data? Be sure to report your answer with the appropriate units.
  4. What is the largest incarceration rate? Be sure to report your answer with the appropriate units.
  5. We can isolate the U.S. data using the subset function:

    subset(Prison, Country=="United States")

    What is the U.S. incarceration rate and how does this compare to other countries?
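To see what summary and subset produce, here is a minimal sketch on a made-up data set. Pets and its variables are hypothetical. In R 4.0 and later, text columns are stored as plain character by default, so the sketch sets stringsAsFactors = TRUE to make summary report category counts. Whether you need the same option in read.csv depends on your version of R.

```r
# Hypothetical toy data, not part of the assignment
Pets = data.frame(
  Name   = c("Rex", "Tom", "Ada", "Bo", "Kit", "Max", "Zoe", "Sam"),
  Type   = c("dog", "cat", "cat", "dog", "cat", "dog", "cat", "dog"),
  Weight = c(30.5, 4.2, 3.9, 22.0, 5.1, 27.3, 4.8, 18.6),
  stringsAsFactors = TRUE  # store text columns as categorical factors
)

# Counts for categorical variables; min, quartiles, median, mean,
# and max for quantitative variables
summary(Pets)

# Keep only the rows where the condition is TRUE (note the double ==)
subset(Pets, Name == "Rex")
```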



Exercise 4

Recall that we can subset a data set in order to isolate cases and variables of interest. Reminder: Be sure to show your RStudio code and output for each exercise below.

  1. In RStudio, isolate (and show) the variable containing the list of country names. HINT: Use the $ notation!
  2. One of the Prison variables indicates the continent of each case (i.e. country) in the data set. Using the appropriate RStudio function, show the list of category labels for the continent variable.
  3. Suppose we’re only interested in the incarceration rates for countries in Asia. Using the appropriate RStudio functions, create a smaller data set with only the Asian countries and name this PrisonAsia.
  4. Using the appropriate RStudio function, report the number of cases (countries) in the Asian subset.
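The tools mentioned above can be sketched on a made-up data set. Pets, Dogs, and the variable names are hypothetical. levels only returns labels for variables stored as factors, so the sketch sets stringsAsFactors = TRUE, which recent versions of R no longer do by default.

```r
# Hypothetical toy data, not part of the assignment
Pets = data.frame(
  Name   = c("Rex", "Tom", "Ada", "Bo", "Kit", "Max", "Zoe", "Sam"),
  Type   = c("dog", "cat", "cat", "dog", "cat", "dog", "cat", "dog"),
  Weight = c(30.5, 4.2, 3.9, 22.0, 5.1, 27.3, 4.8, 18.6),
  stringsAsFactors = TRUE  # store text columns as categorical factors
)

Pets$Name          # $ pulls out a single variable by name
levels(Pets$Type)  # category labels of a factor variable

# Store the subset under a new name so it can be reused
Dogs = subset(Pets, Type == "dog")
nrow(Dogs)         # number of cases in the subset
```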