Graphing adult vs newborn size (solution)



Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Rows: 1440 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (4): order, family, Genus, species
dbl (9): mass(g), gestation(mo), newborn(g), weaning(mo), wean mass(g), AFR(...
num (1): refs

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
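
The column-specification message above comes from readr when it reads a tab-delimited file. The following is a minimal sketch of that loading step, not the original solution code. It writes a tiny toy file to a temporary path so it can run on its own. The toy rows and the `-999` missing-value code are assumptions for illustration only; the column names are copied from the output above.

```r
library(readr)

# Hypothetical toy file standing in for the real tab-delimited data set
path <- tempfile(fileext = ".txt")
writeLines(c(
  "order\tGenus\tgestation(mo)\tmax. life(mo)",
  "Rodentia\tMus\t0.63\t48",
  "Rodentia\tRattus\t-999\t56",
  "Carnivora\tVulpes\t1.7\t144"
), path)

# na = "-999" is an assumed sentinel for missing values;
# show_col_types = FALSE quiets the column-specification message
mammals <- read_tsv(path, na = "-999", show_col_types = FALSE)
```

Setting `show_col_types = FALSE` is the option the message itself suggests; passing explicit `col_types` would be the stricter alternative.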

Graph lifespan (max. life(mo)) vs. gestation period (gestation(mo)). Label the axes with clearer names than the raw column names.

Warning: Removed 892 rows containing missing values or values outside the scale range
(`geom_point()`).
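
A scatter plot like this one could be sketched as follows. The data frame here is a small hypothetical stand-in (made-up values, real column names), so the number of removed rows will differ from the warning above. Because the column names contain spaces and parentheses, they need backticks inside `aes()`.

```r
library(ggplot2)

# Hypothetical toy values, NOT the real data; only the column names match
mammals <- data.frame(
  order = rep(c("Rodentia", "Carnivora"), each = 4),
  `gestation(mo)` = c(0.7, 0.8, 1.0, NA, 2.0, 3.1, 3.5, 8.0),
  `max. life(mo)` = c(36, 48, 60, 40, 180, 240, NA, 360),
  check.names = FALSE
)

# Lifespan on the y axis, gestation on the x axis, with readable labels
p <- ggplot(mammals, aes(x = `gestation(mo)`, y = `max. life(mo)`)) +
  geom_point() +
  labs(x = "Gestation (months)", y = "Maximum lifespan (months)")
```

Printing `p` draws the plot. Rows where either variable is `NA` are dropped by `geom_point()`, which is what the "Removed ... rows" warning reports.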

This looks like a pretty regular pattern, so you wonder if it varies among different groups. Graph lifespan vs. gestation period with the data points colored by order. Label the axes.

Warning: Removed 892 rows containing missing values or values outside the scale range
(`geom_point()`).
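
One way to color the points by order is to map `color` inside `aes()`. This is a sketch on the same kind of hypothetical toy data (made-up values), not the original solution code.

```r
library(ggplot2)

# Hypothetical toy values, NOT the real data; only the column names match
mammals <- data.frame(
  order = rep(c("Rodentia", "Carnivora"), each = 4),
  `gestation(mo)` = c(0.7, 0.8, 1.0, NA, 2.0, 3.1, 3.5, 8.0),
  `max. life(mo)` = c(36, 48, 60, 40, 180, 240, NA, 360),
  check.names = FALSE
)

# Mapping color to the order column gives each order its own color and a legend
p <- ggplot(mammals, aes(x = `gestation(mo)`, y = `max. life(mo)`, color = order)) +
  geom_point() +
  labs(x = "Gestation (months)", y = "Maximum lifespan (months)", color = "Order")
```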

Coloring the points was useful, but there are a lot of points and it’s kind of hard to see what’s going on with all of the orders. Use facet_wrap to create a subplot for each order.

Warning: Removed 892 rows containing missing values or values outside the scale range
(`geom_point()`).
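
The faceted version can be sketched by adding `facet_wrap(~order)` to the scatter plot. Again, the data frame is a hypothetical toy stand-in with made-up values.

```r
library(ggplot2)

# Hypothetical toy values, NOT the real data; only the column names match
mammals <- data.frame(
  order = rep(c("Rodentia", "Carnivora"), each = 4),
  `gestation(mo)` = c(0.7, 0.8, 1.0, NA, 2.0, 3.1, 3.5, 8.0),
  `max. life(mo)` = c(36, 48, 60, 40, 180, 240, NA, 360),
  check.names = FALSE
)

# One panel per order; by default all panels share the same x and y ranges
p <- ggplot(mammals, aes(x = `gestation(mo)`, y = `max. life(mo)`)) +
  geom_point() +
  facet_wrap(~order) +
  labs(x = "Gestation (months)", y = "Maximum lifespan (months)")
```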

Since different orders have different average sizes it can be hard to see the relationship for some orders. Let both the x and y axes vary across facets by setting the optional scales argument.

Warning: Removed 892 rows containing missing values or values outside the scale range
(`geom_point()`).
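
Letting each panel choose its own axis ranges is done with the `scales` argument of `facet_wrap()`. A minimal sketch on hypothetical toy data (made-up values):

```r
library(ggplot2)

# Hypothetical toy values, NOT the real data; only the column names match
mammals <- data.frame(
  order = rep(c("Rodentia", "Carnivora"), each = 4),
  `gestation(mo)` = c(0.7, 0.8, 1.0, NA, 2.0, 3.1, 3.5, 8.0),
  `max. life(mo)` = c(36, 48, 60, 40, 180, 240, NA, 360),
  check.names = FALSE
)

# scales = "free" frees both axes; "free_x" or "free_y" would free only one
p <- ggplot(mammals, aes(x = `gestation(mo)`, y = `max. life(mo)`)) +
  geom_point() +
  facet_wrap(~order, scales = "free") +
  labs(x = "Gestation (months)", y = "Maximum lifespan (months)")
```

Free scales make within-order patterns easier to see, at the cost of making panels harder to compare with each other.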

Use geom_smooth to fit a linear model to each order.

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 892 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning in qt((1 - level)/2, df): NaNs produced
Warning in qt((1 - level)/2, df): NaNs produced
Warning in qt((1 - level)/2, df): NaNs produced

Warning: Removed 892 rows containing missing values or values outside the scale range
(`geom_point()`).

Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
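
A per-order linear fit can be sketched by adding `geom_smooth(method = "lm")` to the faceted plot; within each facet the model is fitted only to that panel's points. The toy data below (made-up values) has three complete points per order, so it should not reproduce the `qt()` and `max()` warnings above, which most likely come from orders with too few usable points to compute a fit or a confidence band.

```r
library(ggplot2)

# Hypothetical toy values, NOT the real data; only the column names match
mammals <- data.frame(
  order = rep(c("Rodentia", "Carnivora"), each = 4),
  `gestation(mo)` = c(0.7, 0.8, 1.0, NA, 2.0, 3.1, 3.5, 8.0),
  `max. life(mo)` = c(36, 48, 60, 40, 180, 240, NA, 360),
  check.names = FALSE
)

# Points plus a straight-line fit (with confidence band) in every facet
p <- ggplot(mammals, aes(x = `gestation(mo)`, y = `max. life(mo)`)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(~order, scales = "free") +
  labs(x = "Gestation (months)", y = "Maximum lifespan (months)")
```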

Challenge (optional): Some of the orders don’t have enough data points to fit a meaningful linear model. Use group_by and summarize on your data frame to create a new data frame with counts of the number of species (i.e., rows) in each order. Join this data frame (using inner_join) to your main data frame and use the new species counts to filter the data frame to only keep orders with at least 20 species. Then remake the graph from step 5 (the per-order linear fits) with this filtered data. Note that there won’t be 20 points for all orders because some orders are missing values for some columns.

Joining with `by = join_by(order)`
`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 867 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 867 rows containing missing values or values outside the scale range
(`geom_point()`).
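
The count, join, and filter steps of the challenge could be sketched like this. The data is a hypothetical toy table (made-up values) with one order of 25 rows and one of 3 rows, and the names `order_counts`, `n_species`, and `common_orders` are illustrative choices rather than names from the original solution. The sketch passes `by = "order"` explicitly, which also silences the "Joining with" message seen above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy values, NOT the real data; only the column names match
mammals <- tibble(
  order = c(rep("Rodentia", 25), rep("Carnivora", 3)),
  `gestation(mo)` = c(seq(0.6, 3.0, length.out = 25), 2.0, 3.1, NA),
  `max. life(mo)` = c(seq(24, 120, length.out = 25), 180, NA, 360)
)

# Number of species (rows) in each order
order_counts <- mammals |>
  group_by(order) |>
  summarize(n_species = n())

# Attach the counts to every row, then keep only well-sampled orders
common_orders <- mammals |>
  inner_join(order_counts, by = "order") |>
  filter(n_species >= 20)

# Same faceted plot with linear fits, now on the filtered data
p <- ggplot(common_orders, aes(x = `gestation(mo)`, y = `max. life(mo)`)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(~order, scales = "free") +
  labs(x = "Gestation (months)", y = "Maximum lifespan (months)")
```

Because the counts are taken before missing values are dropped, a kept order can still show fewer than 20 points in its panel, as the exercise notes.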