Hi Kodai, great questions.
1a (CBOX): during our beta phase, there are no HW changes. Existing Ice Lake and AMD EPYC (CPU-only) systems will be used. For services during beta which require GPUs, partners and customers will allocate/provision one or more of the following:
a. GPU systems which can be added to a Kubernetes (k8s) cluster. VAST is deploying a series of services on k8s which allow bi-directional communication with a VAST cluster, so that the VAST control plane can monitor and manage certain types of services (this is evolving as we iterate on our codebase).
b. GPU systems which are separate from k8s and are 100% customer-managed. Interaction with models deployed on those GPU servers will occur via configuration on the pipelines customers define on their VAST cluster. For example, if an NVIDIA NIM model is required for inference, and the model is hosted on an existing, non-managed GPU server, a customer could set an environment variable on their VAST-managed pipeline to send inference calls to a defined model endpoint (e.g. https://mygpu.client.com/v1/...)
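To make option (b) concrete, here is a minimal sketch of the env-var pattern. The variable name `MODEL_ENDPOINT` and the helper function are assumptions for illustration only; the actual variable names depend on how a given pipeline is configured. (NIM servers expose an OpenAI-compatible API, so the routed calls would typically go to paths like `/v1/chat/completions`.)

```python
import os

def resolve_model_endpoint(default="http://localhost:8000/v1"):
    """Return the inference endpoint the pipeline should call.

    If MODEL_ENDPOINT is set (e.g. pointing at a customer-managed GPU
    server), inference calls are routed there; otherwise fall back to
    a default. Variable name is hypothetical, for illustration.
    """
    return os.environ.get("MODEL_ENDPOINT", default).rstrip("/")

# Example: point the pipeline at an externally managed NIM server.
os.environ["MODEL_ENDPOINT"] = "https://mygpu.client.com/v1"
base = resolve_model_endpoint()
chat_url = f"{base}/chat/completions"  # OpenAI-compatible path on NIM
```

The key point is that the VAST-managed pipeline never needs to manage the GPU server itself; it only needs a reachable endpoint URL.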
- AI engineering → It seems your question is more about vector databases than the broader scope of ‘AI engineering’. VAST has already implemented a large-scale database platform. What’s missing in the current GA code is support for the data structures and query/search optimizations typically associated with searching vector embeddings. We are building these as extensions to our existing database, and will launch initial support for using VAST as a native vector store later this year.
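For context on what those query/search optimizations replace, here is the brute-force baseline every vector store starts from: exact top-k cosine-similarity search over stored embeddings. This is a generic illustration of the concept, not VAST's API; dedicated vector indexes (e.g. HNSW, IVF) exist to answer this same query without scanning every row.

```python
import numpy as np

def top_k_cosine(query, vectors, k=3):
    """Exact nearest-neighbor search over embedding vectors.

    Normalizes everything, scores each stored vector by cosine
    similarity to the query, and returns the k best matches.
    """
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = v @ q                   # cosine similarity per stored vector
    idx = np.argsort(-sims)[:k]    # indices of the k most similar rows
    return idx, sims[idx]

rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 64))          # 1000 stored embeddings, dim 64
q = db[42] + 0.01 * rng.normal(size=64)   # a query very close to row 42
idx, scores = top_k_cosine(q, db)
```

A scan like this is O(n·d) per query, which is exactly why a database needs purpose-built index structures before it can serve vector search at scale.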
The ‘short’ answer is “yes, VAST could potentially replace Milvus, Chroma, etc.”…once we complete our R&D effort.