GenAI Blog Series #2 – Host it or get SaaS-y

In the previous post we presented the choice between building and buying your GenAI platform. In this post, we will get a bit more specific about what it really takes to host your own solution and why choosing a SaaS-based product may be in your best interest.

GenAI solutions tend to need a lot of horsepower. The layers you build (UIs, APIs, pipelines, databases, orchestrations, etc.) will very likely be packaged up as containers, and those containers will need to be deployed somewhere. In the case of FoundationaLLM, you can choose between two supported Azure-based deployment paths: Azure Container Apps (ACA) and Azure Kubernetes Service (AKS). That doesn’t mean you could not run it on Kubernetes in Google Cloud (GCP) or Amazon Web Services (AWS); as of today, we just have the Bicep that knows all about Azure.

When these services get deployed, you have to know how many users and requests your agents will be receiving. This is important for several reasons.

  • You will need to scale up your container instances (Nodes/Pods) to meet the demand (which typically requires an increase of your vCPU quotas to support said demand).
  • You will need to make sure you don’t overload the backend model given the call pattern of your agent and tools. This is more of a model issue than a hosting problem, and we will explore it in a later blog post.
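To make the first point a bit more concrete, here is a rough sizing sketch. It uses Little's law (requests in flight = arrival rate × time in the system) to guess how many pods and vCPUs you would need at peak. Every number in it (peak requests per second, latency, per-pod concurrency, vCPUs per pod, the current quota) is a made-up assumption for illustration, not a FoundationaLLM measurement, so plug in your own.

```python
import math


def estimate_capacity(peak_rps, avg_latency_s, concurrency_per_pod,
                      vcpu_per_pod, headroom=0.25):
    """Rough pod and vCPU estimate for a containerized agent API.

    All inputs are assumptions you must measure or guess for your own workload.
    """
    # Little's law: average requests in flight = arrival rate * time in system.
    in_flight = peak_rps * avg_latency_s
    # Pad the estimate so bursts do not immediately saturate the pods.
    padded = in_flight * (1 + headroom)
    pods = max(1, math.ceil(padded / concurrency_per_pod))
    return pods, pods * vcpu_per_pod


# Hypothetical workload: 20 requests/second at peak, 4 seconds per request
# (LLM calls are slow), 10 concurrent requests per pod, 2 vCPUs per pod.
pods, vcpus = estimate_capacity(peak_rps=20, avg_latency_s=4.0,
                                concurrency_per_pod=10, vcpu_per_pod=2)

current_quota_vcpus = 10  # hypothetical subscription quota
needs_quota_increase = vcpus > current_quota_vcpus
print(f"pods={pods}, vCPUs={vcpus}, quota increase needed: {needs_quota_increase}")
```

With these sample numbers the estimate comes out to 10 pods and 20 vCPUs, which is already double the hypothetical quota. That is the kind of thing you want to find out before launch day, not during it.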

Azure Container Apps (ACA)

ACA is an incredibly simple way to get up and running quickly. If you are not an expert in YAML and Helm charts for AKS, it offers a nice way to stand something up without needing much management or experience to keep it running.

Azure Kubernetes Service (AKS)

ACA is a great product, but it does have some quirks that keep it from being a production-level system for a GenAI deployment. It is great for a development/QA/staging environment, but not something I’d go with for production.

So Kubernetes is really your best option. Luckily, Kubernetes runs everywhere (Azure, GCP, AWS). Choosing it means you’ll need to be comfortable with how the initial resources get deployed and how to ultimately secure them (zero trust). Greenfield environments work great, but the moment other things come into play (your own DNS, hub network, peering, VPNs, paths, routes, TLS, etc.), aka brownfield, you will need to consider all kinds of things to get a solution up and running. If you don’t have the resources to do this, you’ll need to bring on someone to help plan and match things up so your deployment works flawlessly.

Whoever you bring in will need to have some serious skills. If you plan on managing this yourself at some point, you will need to train up and/or hire folks with the knowledge to do it.

Upgrading

How easy is it to upgrade the solution? Even if it’s container-based, it won’t be as simple as changing the container image in the ACA config or the AKS deployment. There will always be extra steps to move you from one version to another.
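One way to picture this is as a plan you have to walk through hop by hop, where each version bump carries its own extra steps on top of the image swap. The sketch below is purely illustrative: the version numbers, the step names, and the ordering (extra steps before the image update) are all hypothetical and do not describe the actual FoundationaLLM upgrade process.

```python
# Hypothetical release history and per-hop extra steps, for illustration only.
VERSIONS = ["1.0", "1.1", "1.2"]
EXTRA_STEPS = {
    ("1.0", "1.1"): ["run database schema migration"],
    ("1.1", "1.2"): ["add new app configuration keys",
                     "re-assign role permissions"],
}


def plan_upgrade(current, target):
    """Return the ordered list of steps to move from current to target."""
    start, end = VERSIONS.index(current), VERSIONS.index(target)
    if start >= end:
        raise ValueError("target must be newer than current")
    plan = []
    # Walk every intermediate hop; skipping a version would skip its steps.
    for older, newer in zip(VERSIONS[start:end], VERSIONS[start + 1:end + 1]):
        plan.extend(EXTRA_STEPS.get((older, newer), []))
        plan.append(f"update container image tag to {newer}")
    return plan


plan = plan_upgrade("1.0", "1.2")
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step}")
```

In this made-up example, only two of the five steps are the image change itself. The rest is the work somebody on your team has to know about, schedule, and test.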

Getting SaaS-y

Hosting it sounds like a lot of work, right? Not to mention, it’s going to cost you in cloud spend right away. Just firing up the basic system is going to put you right around $5K in compute spend per month. So if you are not comfortable with all the work it takes to host a GenAI system yourself, you are probably better off going down a hosted/SaaS-based path.

There are several options out there, but note that you will need to consider the following:

  • Security/Identity – Does the solution support external identity providers? How easy is it to integrate? Does it support more than just users, such as groups? How might the solution utilize models in other cloud platforms? Can it support cross-IdP auth to take advantage of various models? An API key is not the auth you are looking for…
  • Compliance – Has the platform gone through basic compliance checks like SOC 2 Type II or other, more stringent process and data control verifications? Do they store your data? If so, where and how?
  • High-availability – What is the SLA on the system? What if it goes down?
  • Customization – Can it do what you need it to do, plus give you the flexibility to mold the system to your requirements?
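On the high-availability question, it helps to turn an SLA percentage into actual minutes of allowed downtime before you compare vendors. Here is a quick back-of-the-envelope sketch; the SLA tiers shown are just common examples, not figures from any particular product, and a 30-day month is assumed.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime an SLA permits over a period of the given length."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Example SLA tiers (assumed for illustration, not taken from a vendor).
for sla in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime per month")
```

A 99.9% SLA still allows roughly 43 minutes of downtime in a 30-day month, so ask what happens (credits, support response, failover) when that budget gets used up.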

Contact

Need help getting your GenAI project started and/or over the finish line? Ping me, always happy to help!

Email: givenscj@hotmail.com
Twitter: @givenscj
LinkedIn: http://linkedin.com/in/givenscj
