From the course: GenAIOps Foundations
Infrastructure planning
- [Instructor] Infrastructure for models and data is critical when planning the deployment of GenAI applications. There are some special considerations when planning for model deployment in GenAI, irrespective of whether the model is hosted locally or used as a third-party service. What kinds of infrastructure considerations are there? Let's begin with the self-hosted option. GenAI models are huge and require significant compute resources to run. To begin with, they need GPUs or TPUs for execution. Proper sizing is needed to ensure that the right capacity of such resources is provisioned for the model. Models also need significant memory to run. Again, sizing estimates are needed to provision the right amount of resources. Hardware and software accelerators may also be needed to support the deployment of self-hosted models. Serving frameworks may also be used for optimized deployment and management. Networking is also a key requirement to ensure that there is…
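
As a rough illustration of the sizing estimates mentioned for self-hosted models, the sketch below approximates the memory needed to hold a model's weights and the number of accelerators that implies. It is a back-of-the-envelope calculation, not a method from the course: the function names, the 20% overhead factor, and the example device sizes are all assumptions chosen for illustration, and real deployments also need headroom for activations, caches, and batching.

```python
import math

# Bytes used to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def estimate_memory_gb(num_params, precision="fp16", overhead=1.2):
    """Approximate memory (in GB) needed to load a model for inference.

    `overhead` is an assumed multiplier for runtime extras beyond the
    raw weights; 1.2 is an illustrative guess, not a measured value.
    """
    weights_bytes = num_params * BYTES_PER_PARAM[precision]
    return weights_bytes * overhead / 1e9


def accelerators_needed(required_gb, memory_per_device_gb):
    """Smallest number of devices whose combined memory covers the requirement."""
    return math.ceil(required_gb / memory_per_device_gb)


# Hypothetical example: a 7-billion-parameter model served in fp16.
required = estimate_memory_gb(7e9, "fp16")
print(f"Estimated memory: {required:.1f} GB")
print("Devices with 16 GB each:", accelerators_needed(required, 16))
print("Devices with 24 GB each:", accelerators_needed(required, 24))
```

Even an estimate this coarse helps with provisioning: it shows how the choice of precision and the memory available per device change the amount of hardware to plan for.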