The bootstrap is a resampling method for making inferences about a population from a single sample. The basic idea is to repeatedly draw from the original sample with replacement, creating many new samples of the same size, called bootstrap samples. A statistic such as the mean is computed on each bootstrap sample, and the spread of those values is used to estimate the uncertainty of the statistic and to construct confidence intervals.
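To make the resampling step concrete, here is a minimal sketch that uses only the Python standard library. The function name bootstrap_means, the default of 1000 resamples, and the data values are assumptions chosen for illustration, not part of any library.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each of n_resamples bootstrap samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values from the original sample WITH replacement.
        boot = rng.choices(sample, k=n)
        means.append(statistics.fmean(boot))
    return means

# Hypothetical data, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
means = bootstrap_means(sample)

print("sample mean:", statistics.fmean(sample))
# The standard deviation of the bootstrap means estimates the standard error.
print("bootstrap standard error:", statistics.stdev(means))
```

Each bootstrap sample has the same size as the original sample, so the spread of the bootstrap means approximates how much the sample mean would vary from one sample to another.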
A confidence interval quantifies the uncertainty of an estimate. It provides a range of plausible values for a population parameter based on a sample. A simple and widely used way to construct one from the bootstrap is the percentile method, where the lower and upper bounds of the interval are percentiles of the statistic computed on the bootstrap samples, for example the 2.5th and 97.5th percentiles of the bootstrap means for a 95% interval.
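The percentile method can be sketched in a few lines of standard-library Python. The function name percentile_ci, its defaults, and the data are hypothetical, and the percentiles are taken with a simple nearest-rank rule on the sorted bootstrap means, which is only one of several reasonable conventions.

```python
import random
import statistics

def percentile_ci(sample, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Sorted means of the bootstrap samples (each drawn with replacement).
    boot_means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    # Nearest-rank positions of the lower and upper percentiles.
    lo_idx = max(0, round((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return boot_means[lo_idx], boot_means[hi_idx]

# Hypothetical data, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
lower, upper = percentile_ci(sample)
print(f"95% percentile interval for the mean: ({lower:.2f}, {upper:.2f})")
```

With a sample this small the interval is only a rough guide; the percentile method tends to behave better as the sample size grows.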
Hypothesis testing is a procedure for deciding whether sample data provide enough evidence against a claim about a population. The basic idea is to formulate a null hypothesis (e.g., the population mean is equal to a certain value) and an alternative hypothesis (e.g., the population mean is different from that value), and then use the sample data to test whether the null hypothesis can be rejected in favor of the alternative hypothesis.
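One way to connect hypothesis testing with the bootstrap is to shift the sample so that the null hypothesis holds, resample from the shifted data, and count how often the resampled mean lies at least as far from the hypothesized value as the observed mean does. The sketch below follows that approach; the function name bootstrap_mean_test, the hypothesized mean of 4.0, and the data are assumptions for illustration, and other test procedures could be used instead.

```python
import random
import statistics

def bootstrap_mean_test(sample, mu0, n_resamples=5000, seed=0):
    """Two-sided bootstrap p-value for H0: population mean == mu0."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    observed = statistics.fmean(sample)
    # Shift the data so that its mean equals mu0, i.e. so that H0 holds.
    shifted = [x - observed + mu0 for x in sample]
    observed_gap = abs(observed - mu0)
    extreme = 0
    for _ in range(n_resamples):
        boot_mean = statistics.fmean(rng.choices(shifted, k=n))
        # Count resamples at least as far from mu0 as the observed mean.
        if abs(boot_mean - mu0) >= observed_gap:
            extreme += 1
    return extreme / n_resamples

# Hypothetical data and hypothesized mean, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
p_value = bootstrap_mean_test(sample, mu0=4.0)
print("p-value:", p_value)
```

A small p-value means that a sample mean this far from the hypothesized value would be unusual if the null hypothesis were true, which is the evidence used to reject it.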
In Python, the scikit-learn library provides a simple implementation of bootstrap resampling through the resample function from the sklearn.utils module. Here is an example of how to use it to create bootstrap samples and estimate the mean of a population: