A new study predicts that the supply of publicly available training data for AI language models will be exhausted by the end of the decade, which could slow progress in AI development. Companies are racing to secure high-quality data sources but may eventually have to rely on sensitive or synthetic data.
