Making choices: load or download file (solution)

Exercise

With large data files it can be useful to download a file only if it hasn’t already been downloaded. One way to do this is to check whether the file name exists in your working directory: if it does, load the file; if not, download it first and then load it. You can use the list.files() function to get a list of the files and directories in the working directory, and the download.file(url, filename) function to download the file at a URL to a specific file name.
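As a minimal sketch of the existence check described above, the following compares a file name against the result of list.files() before and after the file is created. The file name example_data.csv and the use of a temporary directory are assumptions made only for this illustration; they are not part of the exercise.

```r
# Work in a throwaway directory so the example does not touch real files
old_dir <- setwd(tempdir())

# Hypothetical file name used only for this illustration
file_name <- "example_data.csv"
unlink(file_name)  # make sure we start without the file

# list.files() returns a character vector of names in the working directory;
# %in% tests whether file_name is one of them
before <- file_name %in% list.files()

# Create the file, then repeat the same check
write.csv(data.frame(x = 1:3), file_name, row.names = FALSE)
after <- file_name %in% list.files()

print(c(before = before, after = after))

# Return to the original working directory
setwd(old_dir)
```

Base R also provides file.exists(), which answers the same question for a single path; the sketch uses list.files() because that is the function the exercise names.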

  1. Write a conditional statement that checks if surveys.csv exists in the working directory and, if it doesn’t, downloads it from https://ndownloader.figshare.com/files/2292172 using download.file(). Then load the file into a data frame and display the first few rows using the head() function. The URL needs to be in quotes since it is character data.

  2. Make a version of this conditional statement that is a function, where the name of the file is the first argument and the link for downloading the file is the second argument. The function should return the resulting data frame. Add some documentation to the top of the function describing what it does. Call this function using "species.csv" as the file name and https://ndownloader.figshare.com/files/3299483 as the link. Print the first few rows of the resulting data frame using head().
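One possible shape for the function in part 2 is sketched below. The function name load_or_download is hypothetical, and the use of readr::read_csv() is an assumption based on the tibble-style output shown in this solution; base read.csv() would also return a data frame, but printed differently. Part 1 is the same conditional written with the literal file name and URL instead of arguments.

```r
# Load a CSV file, downloading it first if it is not in the working directory.
#
# file_name: name of the file to look for (and save to) in the working directory
# url:       link to download the file from if it is missing
# Returns the contents of the file as a data frame (a tibble, via readr).
load_or_download <- function(file_name, url) {
  # Only download when the file is not already present
  if (!(file_name %in% list.files())) {
    download.file(url, file_name)
  }
  # Assumption: readr is installed; read_csv() is called with its namespace
  # so the function works without a prior library(readr)
  readr::read_csv(file_name)
}
```

With this definition, the call requested in the exercise would be species <- load_or_download("species.csv", "https://ndownloader.figshare.com/files/3299483"), followed by head(species). The first call downloads the file; later calls in the same working directory skip the download and only read it.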

Output solution

1

Rows: 35549 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): species_id, sex
dbl (7): record_id, month, day, year, plot_id, hindfoot_length, weight

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 6 × 9
  record_id month   day  year plot_id species_id sex   hindfoot_length weight
      <dbl> <dbl> <dbl> <dbl>   <dbl> <chr>      <chr>           <dbl>  <dbl>
1         1     7    16  1977       2 NL         M                  32     NA
2         2     7    16  1977       3 NL         M                  33     NA
3         3     7    16  1977       2 DM         F                  37     NA
4         4     7    16  1977       7 DM         M                  36     NA
5         5     7    16  1977       3 DM         M                  35     NA
6         6     7    16  1977       1 PF         M                  14     NA

2

Rows: 54 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): species_id, genus, species, taxa

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 6 × 4
  species_id genus            species         taxa  
  <chr>      <chr>            <chr>           <chr> 
1 AB         Amphispiza       bilineata       Bird  
2 AH         Ammospermophilus harrisi         Rodent
3 AS         Ammodramus       savannarum      Bird  
4 BA         Baiomys          taylori         Rodent
5 CB         Campylorhynchus  brunneicapillus Bird  
6 CM         Calamospiza      melanocorys     Bird