The launch of Large Language Models (LLMs) for Generative Artificial Intelligence (GenAI) has sparked both excitement and concern. While the potential of AI may be limitless, spending on data center server infrastructure and operations is estimated to exceed $76 billion by 2028, more than twice the estimated annual operating cost of Amazon's cloud service AWS. Neural Networks (NNs) designed to run at scale are highly optimized and will continue to improve, but the cost and scale of GenAI will demand further innovation in NN optimization and is likely to push computational load out from data centers to client devices.
