Abstract
Determining an appropriate sample size is crucial for constructing efficient machine learning models. Existing techniques often lack rigorous theoretical justification or are tailored to specific statistical hypotheses about model parameters. This paper introduces two novel methods based on likelihood values from resampled subsets to address this challenge. We demonstrate the validity of one of these methods in a linear regression model. Computational experiments on both synthetic and real-world datasets show that the proposed functions converge as the sample size increases, highlighting the practical utility of our approach.
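The abstract does not spell out the two proposed methods, so the following is only a rough illustration of the general idea of tracking likelihood values over resampled subsets, not the paper's actual definitions. It assumes a linear regression model with Gaussian noise of known variance, fits ordinary least squares on bootstrap subsets of size k, and records the mean and variance of the log-likelihood evaluated on the full dataset. The function names (`gaussian_loglik`, `loglik_spread`), the fixed noise variance, and the synthetic data are all hypothetical choices made for this sketch.

```python
import numpy as np


def gaussian_loglik(X, y, w, sigma2=1.0):
    """Mean Gaussian log-likelihood of y under y ~ N(Xw, sigma2).

    sigma2 is treated as known; this is an assumption of the sketch.
    """
    resid = y - X @ w
    return float(np.mean(-0.5 * np.log(2 * np.pi * sigma2) - resid**2 / (2 * sigma2)))


def loglik_spread(X, y, k, n_boot=200, sigma2=1.0, rng=None):
    """Mean and variance of the full-data log-likelihood over bootstrap subsets of size k."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    values = np.empty(n_boot)
    for b in range(n_boot):
        # Draw a subset of size k with replacement (bootstrap).
        idx = rng.integers(0, n, size=k)
        # Least-squares fit on the subset; lstsq also handles rank-deficient draws.
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        # Evaluate the fitted parameters on the full dataset.
        values[b] = gaussian_loglik(X, y, w, sigma2)
    return float(values.mean()), float(values.var())


# Synthetic data (hypothetical): 500 points, 3 features, unit-variance noise.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(500, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + data_rng.normal(size=500)

results = []
for k in (10, 50, 200):
    mean_ll, var_ll = loglik_spread(X, y, k, rng=1)
    results.append((k, mean_ll, var_ll))
    print(f"k={k:4d}  mean log-lik={mean_ll:.4f}  variance={var_ll:.6f}")
```

Under these assumptions, the variance of the log-likelihood across subsets shrinks as k grows, and one could treat a sample size as sufficient once that variance falls below a chosen threshold. The threshold and the number of bootstrap draws are tuning choices of this sketch, not values taken from the paper.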