This article compares machine learning techniques for drug discovery, evaluating their performance on datasets of varying size and chemical diversity. The authors propose a “Goldilocks learning paradigm” for selecting the most suitable modeling method based on dataset characteristics.