Dramatic improvements in computation hardware have reduced the per-token cost of generative AI models. These hardware improvements, along with more efficient algorithms and software, drive down the cost of inference and make generative AI-based capabilities more accessible to new applications and use cases.
- Sanjay Mehrotra