Portal Data Challenge (solution)

Exercise

If the file surveys.csv is not already in your working directory, download a copy.

Develop a data manipulation pipeline for the Portal surveys table that produces a table of data for only the three Dipodomys species (DM, DO, DS). The species IDs should be presented in lower case rather than upper case. The table should contain the date (year, month, and day), the species ID, the weight, and the hindfoot length. It should not include missing (null) values for either weight or hindfoot length. The table should be sorted first by species ID (so that each species is grouped together) and then by weight, with the largest weights at the top.
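
The requirements above can be sketched as a single dplyr pipeline. This is one possible solution, not necessarily the code that produced the output on this page. The function name `dipodomys_table` is a hypothetical helper introduced here for illustration. The column names (`species_id`, `weight`, `hindfoot_length`, `year`, `month`, `day`) are taken from the column specification in the output, and the sketch assumes surveys.csv is in the working directory.

```r
library(dplyr)
library(readr)

# Hypothetical helper: wraps the pipeline so it can be reused on any
# data frame with the same columns as the Portal surveys table.
dipodomys_table <- function(surveys) {
  surveys %>%
    # Keep only the three Dipodomys species
    filter(species_id %in% c("DM", "DO", "DS")) %>%
    # Drop rows where weight or hindfoot length is missing (NA)
    filter(!is.na(weight), !is.na(hindfoot_length)) %>%
    # Present species IDs in lower case
    mutate(species_id = tolower(species_id)) %>%
    # Keep the date columns, species ID, weight, and hindfoot length
    select(year, month, day, species_id, weight, hindfoot_length) %>%
    # Group species together, largest weights first within each species
    arrange(species_id, desc(weight))
}

# Run on the real data only if the file is present (assumed location)
if (file.exists("surveys.csv")) {
  surveys <- read_csv("surveys.csv")
  print(dipodomys_table(surveys))
}
```

Filtering on the upper-case IDs before calling `tolower()` keeps the comparison aligned with the values stored in the file. The two `filter()` calls could equally be combined into one.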

Output solution

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Rows: 35549 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): species_id, sex
dbl (7): record_id, month, day, year, plot_id, hindfoot_length, weight

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 14,558 × 6
    year month   day species_id weight hindfoot_length
   <dbl> <dbl> <dbl> <chr>       <dbl>           <dbl>
 1  1979     4    29 dm             65              37
 2  1991     8     7 dm             65              37
 3  2002     5    16 dm             64              35
 4  1984     5    13 dm             63              35
 5  1995    12     3 dm             63              38
 6  1980    10    12 dm             62              35
 7  1995    10    28 dm             62              37
 8  1996     1    28 dm             62              38
 9  1996     1    28 dm             62              38
10  1999    11     7 dm             62              36
# ℹ 14,548 more rows