
https://bit.ly/424cydp
To simulate the data that would have been collected if the statistical null hypothesis were true!
Permutation: simulate many rearrangements (shuffles) of the observations to build a possible null distribution.
Bootstrap: randomly resample the observations with replacement (allowing duplicates) and recalculate the test statistic of interest each time.
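The two resampling ideas above can be sketched in base R. This is only an illustration: the toy data frame, its values, and the helper diff_in_means() are made up here and do not come from the penguin data.

```r
# Toy data (made up for illustration; not the penguin data)
set.seed(1)
toy <- data.frame(
  group = rep(c("a", "b"), each = 5),
  value = c(3.1, 2.8, 3.6, 3.0, 3.3, 4.0, 4.4, 3.9, 4.6, 4.1)
)

# Hypothetical helper: difference in group means (a minus b)
diff_in_means <- function(d) {
  mean(d$value[d$group == "a"]) - mean(d$value[d$group == "b"])
}

# Permutation: shuffle the group labels and keep the values fixed.
# This breaks any link between group and value, as the null assumes.
permuted <- toy
permuted$group <- sample(toy$group)

# Bootstrap: draw rows with replacement, so some rows repeat
# and others are left out.
boot <- toy[sample(nrow(toy), replace = TRUE), ]

diff_in_means(toy)       # observed statistic
diff_in_means(permuted)  # one draw under the null
diff_in_means(boot)      # one bootstrap replicate
```

Repeating the permutation step many times gives a null distribution; repeating the bootstrap step many times gives a distribution for estimating uncertainty.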
infer package
The four infer verbs:
specify() allows you to specify the variable, or relationship between variables, that you’re interested in.
hypothesize() allows you to declare the null hypothesis.
generate() allows you to generate data reflecting the null hypothesis (by permutation or bootstrap).
calculate() allows you to calculate a distribution of statistics from the generated data to form the null distribution.
visualize() is a set of custom functions to plot results from the infer package.
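The pipelines in this deck use two data frames, adelie_gentoo and adelie_chinstrap, whose construction is not shown. A minimal sketch of how they might be built, assuming the data come from the palmerpenguins package (an assumption based on the column names species and body_mass_g, not something stated here). The infer pipelines themselves also need library(infer) to be loaded.

```r
# Assumption: the data come from the palmerpenguins package.
library(palmerpenguins)

# Keep two species at a time and drop rows with missing body mass.
# droplevels() removes the unused third species from the factor.
adelie_gentoo <- droplevels(subset(
  penguins,
  species %in% c("Adelie", "Gentoo") & !is.na(body_mass_g)
))

adelie_chinstrap <- droplevels(subset(
  penguins,
  species %in% c("Adelie", "Chinstrap") & !is.na(body_mass_g)
))
```

Dropping the unused factor level keeps the order argument of calculate() limited to the two species actually present in each data frame.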
Adelie and Gentoo
adelie_gentoo_observed <- adelie_gentoo |>
  specify(body_mass_g ~ species) |>
  calculate(
    stat = "diff in means",
    order = c("Adelie", "Gentoo")
  )
adelie_gentoo_observed

Adelie and Chinstrap
adelie_chinstrap_observed <- adelie_chinstrap |>
  specify(body_mass_g ~ species) |>
  calculate(
    stat = "diff in means",
    order = c("Adelie", "Chinstrap")
  )
adelie_chinstrap_observed

adelie_gentoo_null <- adelie_gentoo |>
  specify(body_mass_g ~ species) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(
    stat = "diff in means",
    order = c("Adelie", "Gentoo")
  )
adelie_gentoo_null |>
  get_p_value(
    obs_stat = adelie_gentoo_observed,
    direction = "two-sided"
  )

adelie_chinstrap_null <- adelie_chinstrap |>
  specify(body_mass_g ~ species) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(
    stat = "diff in means",
    order = c("Adelie", "Chinstrap")
  )
adelie_chinstrap_null |>
  get_p_value(
    obs_stat = adelie_chinstrap_observed,
    direction = "two-sided"
  )

Adelie and Gentoo
visualize(adelie_gentoo_null) +
  shade_p_value(
    obs_stat = adelie_gentoo_observed,
    direction = "two-sided"
  )

Adelie and Chinstrap
visualize(adelie_chinstrap_null) +
  shade_p_value(
    obs_stat = adelie_chinstrap_observed,
    direction = "two-sided"
  )
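To make the permutation test less of a black box, here is a base R sketch of the same idea on made-up toy data. The data, the helper diff_in_means(), and the p-value convention (twice the smaller tail proportion, capped at 1) are assumptions for illustration; get_p_value() may differ in detail internally.

```r
set.seed(42)

# Toy data (made up for illustration; not the penguin data)
value <- c(3.1, 2.8, 3.6, 3.0, 3.3, 4.0, 4.4, 3.9, 4.6, 4.1)
group <- rep(c("a", "b"), each = 5)

# Hypothetical helper: difference in means, a minus b
diff_in_means <- function(value, group) {
  mean(value[group == "a"]) - mean(value[group == "b"])
}

observed <- diff_in_means(value, group)

# Null distribution: recompute the statistic after shuffling the labels
null_stats <- replicate(1000, diff_in_means(value, sample(group)))

# Two-sided p-value: twice the smaller tail proportion, capped at 1
left <- mean(null_stats <= observed)
right <- mean(null_stats >= observed)
p_value <- min(1, 2 * min(left, right))
p_value
```

A small p-value means that shuffled labels rarely produce a difference as extreme as the observed one, which is evidence against the null hypothesis of independence.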
adelie_chinstrap_bootstrap <- adelie_chinstrap |>
  specify(body_mass_g ~ species) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(
    stat = "diff in means",
    order = c("Adelie", "Chinstrap")
  )
ac_ci <- adelie_chinstrap_bootstrap |>
  get_confidence_interval(point_estimate = adelie_chinstrap_observed)
ac_ci

adelie_gentoo_bootstrap <- adelie_gentoo |>
  specify(body_mass_g ~ species) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(
    stat = "diff in means",
    order = c("Adelie", "Gentoo")
  )
ag_ci <- adelie_gentoo_bootstrap |>
  get_confidence_interval(point_estimate = adelie_gentoo_observed)
ag_ci

Adelie and Chinstrap
adelie_chinstrap_bootstrap |>
  visualize() +
  shade_confidence_interval(endpoints = ac_ci)

Adelie and Gentoo
adelie_gentoo_bootstrap |>
  visualize() +
  shade_confidence_interval(endpoints = ag_ci)
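The bootstrap distribution tells us how the difference in mean body mass between Adelie and Chinstrap penguins would vary if we repeated the study many times; the confidence interval marks the range of plausible values for that difference.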
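A percentile confidence interval of this kind can be sketched in base R. The toy data and the helper diff_in_means() are made up for illustration, and resampling all rows together is just one simple choice; this is not a description of what infer does internally.

```r
set.seed(42)

# Toy data (made up for illustration; not the penguin data)
value <- c(3.1, 2.8, 3.6, 3.0, 3.3, 4.0, 4.4, 3.9, 4.6, 4.1)
group <- rep(c("a", "b"), each = 5)

# Hypothetical helper: difference in means, a minus b
diff_in_means <- function(value, group) {
  mean(value[group == "a"]) - mean(value[group == "b"])
}

# Bootstrap: resample observations with replacement and
# recompute the statistic each time
boot_stats <- replicate(1000, {
  idx <- sample(length(value), replace = TRUE)
  diff_in_means(value[idx], group[idx])
})

# A resample can miss one group entirely, giving NaN; drop those cases
boot_stats <- boot_stats[!is.na(boot_stats)]

# 95% percentile interval: the middle 95% of the bootstrap statistics
ci <- quantile(boot_stats, probs = c(0.025, 0.975))
ci
```

If the interval excludes zero, the data are hard to reconcile with no difference in means, which lines up with a small two-sided p-value from the permutation test.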
BIOL2205 - Inferencia e Informática - DCB - Uniandes