Portal data review (solution)


Rows: 35549 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): species_id, sex
dbl (7): record_id, month, day, year, plot_id, hindfoot_length, weight

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 54 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): species_id, genus, species, taxa

Rows: 24 Columns: 2
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): plot_type
dbl (1): plot_id

  1. Create a data frame with only data for the species_id DO, with the columns year, month, day, species_id, and weight.
# A tibble: 3,027 × 5
    year month   day species_id weight
   <dbl> <dbl> <dbl> <chr>       <dbl>
 1  1977     8    19 DO             52
 2  1977    10    17 DO             33
 3  1977    10    17 DO             50
 4  1977    10    17 DO             48
 5  1977    10    17 DO             31
 6  1977    10    18 DO             41
 7  1977    11    12 DO             44
 8  1977    11    12 DO             48
 9  1977    11    14 DO             39
10  1977    12    10 DO             40
# ℹ 3,017 more rows
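A table like the one above could be produced with a `filter()` followed by a `select()`. This is a minimal sketch, not necessarily the code used to render this page; the helper name `select_do_records` is hypothetical, and `surveys` is assumed to be the 9-column survey table described in the first column specification.

```r
library(dplyr)

# Hypothetical helper: `surveys` is assumed to contain the columns
# year, month, day, species_id, and weight.
select_do_records <- function(surveys) {
  surveys |>
    filter(species_id == "DO") |>                  # keep only species DO
    select(year, month, day, species_id, weight)   # keep the requested columns, in order
}
```

If the survey table had been read into a data frame named `surveys`, calling `select_do_records(surveys)` would return the filtered table.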
  2. Create a data frame with only data for species IDs PP and PB from 1995 onward, with the columns year, species_id, and hindfoot_length, and with no null values for hindfoot_length.
# A tibble: 5,150 × 3
    year species_id hindfoot_length
   <dbl> <chr>                <dbl>
 1  1995 PP                      23
 2  1995 PP                      22
 3  1995 PP                      22
 4  1995 PP                      21
 5  1995 PP                      21
 6  1995 PP                      20
 7  1995 PP                      22
 8  1995 PP                      24
 9  1995 PP                      22
10  1995 PP                      22
# ℹ 5,140 more rows
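The three conditions in this exercise (species, year, and non-missing hindfoot length) can be combined in a single `filter()` call, where comma-separated conditions are joined with a logical AND. The sketch below assumes the same `surveys` table as before; the helper name is hypothetical.

```r
library(dplyr)

# Hypothetical helper: keeps PP and PB records from 1995 onward that have
# a recorded hindfoot length.
select_pp_pb_hindfoot <- function(surveys) {
  surveys |>
    filter(
      species_id %in% c("PP", "PB"),   # either of the two species
      year >= 1995,                    # 1995 and later
      !is.na(hindfoot_length)          # drop missing (NA) hindfoot lengths
    ) |>
    select(year, species_id, hindfoot_length)
}
```

Using `%in%` rather than two `==` comparisons joined with `|` keeps the condition short and makes it easy to add more species later.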
  3. Create a data frame with the average hindfoot_length for each species_id in each year, excluding null values of hindfoot_length.
`summarise()` has grouped output by 'species_id'. You can override using the
`.groups` argument.
# A tibble: 340 × 3
# Groups:   species_id [25]
   species_id  year `mean(hindfoot_length)`
   <chr>      <dbl>                   <dbl>
 1 AH          1999                    35  
 2 AH          2000                    31  
 3 BA          1989                    13  
 4 BA          1990                    13.8
 5 BA          1991                    12.9
 6 BA          1992                    12  
 7 DM          1977                    35.7
 8 DM          1978                    36.1
 9 DM          1979                    35.9
10 DM          1980                    35.8
# ℹ 330 more rows
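A grouped summary like the one above can be sketched with `group_by()` and `summarize()`. Unlike the output shown, this variant names the summary column (`mean_hindfoot_length` is a hypothetical name) and sets `.groups = "drop"`, which returns an ungrouped table and suppresses the grouping message.

```r
library(dplyr)

# Hypothetical helper: mean hindfoot length per species and year,
# computed only from records with a recorded hindfoot length.
mean_hindfoot_by_species_year <- function(surveys) {
  surveys |>
    filter(!is.na(hindfoot_length)) |>
    group_by(species_id, year) |>
    summarize(
      mean_hindfoot_length = mean(hindfoot_length),
      .groups = "drop"   # return an ungrouped data frame
    )
}
```

Filtering out the missing values first means that species and year combinations with no measurements are absent from the result, rather than appearing with a `NaN` mean as they would with `mean(hindfoot_length, na.rm = TRUE)` alone.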
  4. Create a data frame with the year, genus, species, weight, and plot_type for all cases where the genus is "Dipodomys".
# A tibble: 16,167 × 5
    year genus     species     weight plot_type                
   <dbl> <chr>     <chr>        <dbl> <chr>                    
 1  1977 Dipodomys merriami        NA Control                  
 2  1977 Dipodomys merriami        NA Rodent Exclosure         
 3  1977 Dipodomys merriami        NA Long-term Krat Exclosure 
 4  1977 Dipodomys merriami        NA Spectab exclosure        
 5  1977 Dipodomys merriami        NA Spectab exclosure        
 6  1977 Dipodomys spectabilis     NA Rodent Exclosure         
 7  1977 Dipodomys merriami        NA Rodent Exclosure         
 8  1977 Dipodomys merriami        NA Long-term Krat Exclosure 
 9  1977 Dipodomys merriami        NA Control                  
10  1977 Dipodomys merriami        NA Short-term Krat Exclosure
# ℹ 16,157 more rows
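This exercise needs columns from all three tables, so the survey records have to be joined to the species table (on species_id) and to the plots table (on plot_id) before filtering. The sketch below assumes data frames shaped like the three column specifications at the top of the page; the helper name and the choice of `inner_join()` are assumptions.

```r
library(dplyr)

# Hypothetical helper: `species` is assumed to have species_id, genus,
# species, and taxa; `plots` is assumed to have plot_id and plot_type.
select_dipodomys_records <- function(surveys, species, plots) {
  surveys |>
    inner_join(species, by = "species_id") |>   # adds genus, species, taxa
    inner_join(plots, by = "plot_id") |>        # adds plot_type
    filter(genus == "Dipodomys") |>
    select(year, genus, species, weight, plot_type)
}
```

Missing weights are kept here, which matches the `NA` values visible in the weight column of the output above.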
  5. Make a scatter plot with hindfoot_length on the x-axis and weight on the y-axis. Color the points by species_id. Include descriptive axis labels.
Warning: Removed 4811 rows containing missing values or values outside the scale range
(`geom_point()`).
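A scatter plot of this kind can be sketched with `geom_point()` and `labs()`. The helper name is hypothetical, and the units in the axis labels (millimetres and grams) are an assumption about the data that is not stated on this page.

```r
library(ggplot2)

# Hypothetical helper: scatter plot of weight against hindfoot length,
# colored by species. Units in the labels are assumed, not confirmed.
plot_weight_vs_hindfoot <- function(surveys) {
  ggplot(surveys, aes(x = hindfoot_length, y = weight, color = species_id)) +
    geom_point() +
    labs(
      x = "Hindfoot length (mm)",
      y = "Weight (g)",
      color = "Species ID"
    )
}
```

Rows with a missing hindfoot length or weight cannot be drawn, which is what the warning above reports; filtering them out beforehand would silence it.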

  6. Make a histogram of weights with a separate subplot for each species_id. Do not include species with no weights. Set the scales argument to "free_y" so that the y-axes can vary. Include descriptive axis labels.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
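One way to sketch this plot is to drop records with missing weights first, which also removes any species that has no weights at all, and then facet by species. The helper name and axis labels are hypothetical; `bins = 30` is set explicitly only to match the default reported in the message above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: one weight histogram per species, with
# independent y-axes. The unit in the x-axis label is assumed.
plot_weight_histograms <- function(surveys) {
  weighed <- surveys |>
    filter(!is.na(weight))   # species with no weights have no rows left

  ggplot(weighed, aes(x = weight)) +
    geom_histogram(bins = 30) +
    facet_wrap(~species_id, scales = "free_y") +
    labs(x = "Weight (g)", y = "Number of individuals")
}
```

With `scales = "free_y"`, rare species remain readable because their panels are not scaled to the counts of the most abundant species.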

  7. (Challenge) Make a plot with histograms of the weights of three species (PP, PB, and DM), with a different facet (i.e., subplot) for each of three plot_types: Control, Long-term Krat Exclosure, and Short-term Krat Exclosure.
Joining with `by = join_by(plot_id)`
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 438 rows containing non-finite outside the scale range
(`stat_bin()`).
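The challenge combines a join, a filter, and a faceted histogram. The sketch below is one possible reading of the prompt, not necessarily the code behind this page: it distinguishes species by fill color and uses one facet per plot type. Both helper names are hypothetical, and unlike the run above it drops missing weights before plotting.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: join surveys to plots and keep only the three
# species and three plot types named in the exercise.
prepare_krat_weights <- function(surveys, plots) {
  surveys |>
    inner_join(plots, by = "plot_id") |>
    filter(
      species_id %in% c("PP", "PB", "DM"),
      plot_type %in% c("Control",
                       "Long-term Krat Exclosure",
                       "Short-term Krat Exclosure"),
      !is.na(weight)
    )
}

# Hypothetical helper: weight histograms, one facet per plot type,
# with species shown by fill color. The unit in the label is assumed.
plot_krat_weights <- function(krat_weights) {
  ggplot(krat_weights, aes(x = weight, fill = species_id)) +
    geom_histogram(bins = 30) +
    facet_wrap(~plot_type) +
    labs(x = "Weight (g)", y = "Number of individuals", fill = "Species ID")
}
```

Replacing `facet_wrap(~plot_type)` with `facet_grid(species_id ~ plot_type)` would give each species its own row of panels instead of overlapping fills.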