M’Baïki Data Challenge (solution)

Exercise

A long-term study near M’Baïki in the Central African Republic has been monitoring tropical forest recovery from disturbance for 40 years.

Use the data on yearly tree measurements (in mbaiki_measures.csv), information on the individual trees (in mbaiki_trees.csv), and the names of the species in the forest (in mbaiki_species.csv) to answer the following questions (if any of these files isn't in your working directory, download it).

  1. Create a new data frame that contains the following information for each unique tree (each tree has a unique id_tree): The id_tree, the net growth (total change in diameter from the first year a tree is measured to the last year a tree is measured), the time period of sampling in years (number of years between the first and last measurement), and the growth rate (the net growth divided by the time period of sampling). Only include observations while the tree was alive in these calculations.
  2. Starting with the data frame you created in (1) create a new data frame that contains the following information on the average growth rate of trees in each species in each subplot: The ID of the subplot, the scientific name of the species, the mean growth rate of all of the trees of that species in that subplot, and the sample size used to estimate the mean (i.e., the number of trees of that species in that subplot). Make sure the resulting data frame is not grouped.
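
The two steps above can be sketched with dplyr as follows. This is a minimal illustration, not necessarily the code that produced the output below. It uses small invented tibbles in place of the three CSV files so that it runs on its own. The column names id_tree, year, hom_diameter, status, id_subplot, code_vernacular, and scientific_name match the column listings in the output, but treating status == "alive" as the marker for a living tree is an assumption that should be checked against the real data.

```r
library(dplyr)

# Toy stand-ins for the three CSV files (all values invented for illustration).
measures <- tibble(
  id_tree      = c(1, 1, 1, 2, 2, 3, 3),
  year         = c(1982, 1990, 2000, 1982, 1992, 1985, 1995),
  hom_diameter = c(10, 12, 15, 20, 21, 30, 34),
  # Assumption: living trees are recorded as "alive" in the status column.
  status       = c("alive", "alive", "dead", "alive", "alive", "alive", "alive")
)
trees <- tibble(
  id_tree         = c(1, 2, 3),
  id_subplot      = c(9, 9, 9),
  code_vernacular = c(100, 100, 200)
)
species <- tibble(
  code_vernacular = c(100, 200),
  scientific_name = c("Species a", "Species b")
)

# Step 1: one row per tree, using only measurements taken while it was alive.
tree_growth <- measures |>
  filter(status == "alive") |>
  arrange(id_tree, year) |>          # so first()/last() follow time order
  group_by(id_tree) |>
  summarise(
    net_growth  = last(hom_diameter) - first(hom_diameter),
    num_years   = max(year) - min(year),
    growth_rate = net_growth / num_years
  )

# Step 2: attach subplot and species name, then average within each
# subplot-by-species combination and drop the remaining grouping.
species_growth <- tree_growth |>
  inner_join(trees, by = "id_tree") |>
  inner_join(species, by = "code_vernacular") |>
  group_by(id_subplot, scientific_name) |>
  summarise(
    mean_growth_rate = mean(growth_rate),
    sample_size      = n()
  ) |>
  ungroup()

print(tree_growth)
print(species_growth)
```

A tree with only one live measurement has num_years equal to 0, so its growth_rate is NaN (0 / 0). Whether to keep or filter out such trees is a choice the exercise leaves open.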

Find out more about this dataset by accessing the full dataset or reading the associated paper: Bénédet, F., Gourlet-Fleury, S., Allah-Barem, F. et al. 2024. 40 years of forest dynamics and tree demography in an intact tropical forest at M’Baïki in central Africa. Sci Data 11, 734.

Output solution

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Rows: 248147 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): code_observation, msg, status
dbl  (8): id_tree, hom_diameter, code_mortality, year, code_mortality_orig, ...
date (2): measure_date, measure_date_orig

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 10885 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (10): id_tree, id_subplot, code_vernacular, x_plot, y_plot, number, id_c...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 205 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): vernacular_name, scientific_name
dbl (1): code_vernacular

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 10,885 × 4
   id_tree net_growth num_years growth_rate
     <dbl>      <dbl>     <dbl>       <dbl>
 1       1      21.6         40      0.54  
 2       2       2.9         40      0.0725
 3       3      14           40      0.35  
 4       4       1.30        22      0.0591
 5       5       2.5         40      0.0625
 6       6      22.7         16      1.42  
 7       7       3.3         13      0.254 
 8       8      12.4         40      0.31  
 9       9       2.10        40      0.0525
10      10       1.9         40      0.0475
# ℹ 10,875 more rows
`summarise()` has grouped output by 'id_subplot'. You can override using the
`.groups` argument.
# A tibble: 1,518 × 4
   id_subplot scientific_name                       mean_growth_rate sample_size
        <dbl> <chr>                                            <dbl>       <int>
 1          9 Afrostyrax lepidophyllus Mildbr.                0.139            2
 2          9 Afzelia bipindensis Harms                       0.0125           1
 3          9 Albizia adianthifolia (Schumach.) W.…           1.12             1
 4          9 Albizia ferruginea (Guill. & Perr.) …           0.423            1
 5          9 Albizia glaberrima (Schumach. & Thon…           0.287            9
 6          9 Albizia zygia (DC.) J.F.Macbr.                  0.424            4
 7          9 Allanblackia floribunda Oliv.                   0.14             1
 8          9 Amphimas pterocarpoides Harms                   0.38             2
 9          9 Angylocalyx pynaertii De Wild.                  0.103           19
10          9 Aningeria altissima (A.Chev.) Aubrév…           0.183            4
# ℹ 1,508 more rows
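
The `summarise()` message in the output above appears because, by default, summarising a data frame grouped by two variables removes only the last grouping level, so the result is still grouped by id_subplot. The sketch below, which uses a small invented tibble rather than the real data, shows that default and one way to get an ungrouped result directly with `.groups = "drop"`. Calling `ungroup()` afterwards is an equivalent alternative.

```r
library(dplyr)

# Invented per-tree growth rates, already joined to subplot and species.
toy <- tibble(
  id_subplot      = c(9, 9, 9, 10),
  scientific_name = c("Species a", "Species a", "Species b", "Species a"),
  growth_rate     = c(0.25, 0.10, 0.40, 0.30)
)

# Default behaviour: the result stays grouped by the first variable.
by_default <- toy |>
  group_by(id_subplot, scientific_name) |>
  summarise(mean_growth_rate = mean(growth_rate), sample_size = n())
group_vars(by_default)   # "id_subplot"

# .groups = "drop" returns an ungrouped data frame and silences the message.
dropped <- toy |>
  group_by(id_subplot, scientific_name) |>
  summarise(
    mean_growth_rate = mean(growth_rate),
    sample_size      = n(),
    .groups          = "drop"
  )
group_vars(dropped)      # character(0)
```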