What is InsightEngine?

I am interested in learning more about VAST InsightEngine and its capabilities.

InsightEngine seems very promising for future AI engineering, particularly in areas like LLMs (Large Language Models) and Generative AI. I would like to start using it as soon as it becomes available.

I have two specific questions about InsightEngine:

  1. System Architecture:
    Could you provide details about the system architecture of the C-Box when used with InsightEngine?
    I understand that InsightEngine utilizes NIM or NeMo, and my current understanding is that GPUs are required for running them. Does the new C-Box include GPUs to support this functionality?

  2. Role in AI Engineering:
    What role does InsightEngine play in AI engineering workflows? Specifically, how does VAST Data, in combination with InsightEngine, compare to traditional vector databases like Milvus or Chroma? Could VAST Data potentially replace them in certain use cases?

Does anyone know anything about this?

Hi kodai, great questions.

1 (C-Box): during our beta phase there are no HW changes; existing Ice Lake and AMD EPYC based (CPU-only) systems will be used. For services during beta which require GPUs, partners and customers will allocate/provision one or more of the following:

  a. GPU systems which can be added to a k8s cluster. VAST is deploying a series of services on k8s systems that allow bi-directional communication with a VAST cluster, so that the VAST control plane can monitor and manage certain types of services (this is evolving as we iterate on our codebase).
  b. GPU systems which are separate from k8s and are 100% customer managed. Interaction with models deployed on those GPU servers will occur via configuration on the pipelines which customers define on their VAST cluster. For example, if an NVIDIA NIM/model is required for inference and the model is hosted on an existing, non-managed GPU server, a customer could set an ENVIRONMENT_VARIABLE on their VAST-managed pipeline to send inference calls to a defined model endpoint (e.g., https://mygpu.client.com/v1/...); a minimal sketch of this call pattern follows below.
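
To make option (b) concrete, here is a minimal sketch of such a call. The variable name MODEL_ENDPOINT and the model ID are illustrative assumptions, not VAST-defined names; the request shape follows NIM's OpenAI-compatible /v1 API:

```python
import os
import requests

# Hypothetical variable name; the real setting is whatever the customer
# defines on their VAST-managed pipeline.
endpoint = os.environ.get("MODEL_ENDPOINT", "https://mygpu.client.com/v1")

# NIM serves an OpenAI-compatible chat completions API under /v1.
resp = requests.post(
    f"{endpoint}/chat/completions",
    json={
        "model": "meta/llama-3.1-8b-instruct",  # whichever model the NIM hosts
        "messages": [{"role": "user", "content": "Summarize this document."}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```
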
  2. AI engineering → It seems like your question is more related to vector DBs than to the broader scope of ‘AI engineering’. VAST has already implemented a large-scale database platform. What’s missing in current GA code is support for the types of data structures and query/search optimizations typically associated with searching for vector embeddings. We are in the process of creating these as extensions to our existing database, and we will launch initial support for using VAST as a native vector store later this year.

The ‘short’ answer is “yes, VAST could potentially replace Milvus, Chroma, etc.”…once we complete our R&D effort 🙂
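
For context, the kind of top-k similarity search those engines serve today (and which our vector-store work targets) looks like this minimal FAISS sketch, with random vectors standing in for real embeddings:

```python
import faiss
import numpy as np

d = 768            # embedding dimensionality (model-dependent)
nb, k = 10_000, 5  # corpus size and number of neighbors

# Random vectors stand in for real document embeddings.
xb = np.random.rand(nb, d).astype("float32")
faiss.normalize_L2(xb)

# Inner product over L2-normalized vectors equals cosine similarity.
index = faiss.IndexFlatIP(d)
index.add(xb)

query = np.random.rand(1, d).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, k)  # top-k most similar documents
print(ids[0], scores[0])
```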

and @kodai, I suppose some follow-ups:

  1. what kinds of pipelines are you deploying (test or prod) today?
  2. are you using milvus/chroma/weaviate/etc?
  3. do you use k8s?

we are definitely looking for feedback and use cases so we can align our strategy to what people are actually doing…

Thank you for your prompt reply.
I now have a clear understanding of the matter.

I am a member of a distribution company that handles VAST Data.
As one of the distributors, we are preparing to use InsightEngine for demonstration purposes.
This demonstration has the potential to strongly appeal to our customers who are developing or using AI technologies.

Therefore, we would appreciate receiving detailed information about InsightEngine as soon as possible.

Best regards.

I realize I have not yet answered your questions:

  1. what kinds of pipelines are you deploying (test or prod) today?
    → LLM and RAG using NIM and NeMo
  2. are you using milvus/chroma/weaviate/etc?
    → Milvus, Chroma, and FAISS
  3. do you use k8s?
    → Sorry, no.

If no k8s, please describe your GPU setup (e.g., system config, system topology including network, and current SW deployment, i.e., GPU-aware container instances running on …?)

I’m so sorry for the delayed response; I just noticed your message. I will check with the engineers shortly.

I apologize for replying late.
I asked our application engineering team.

They are using Run:AI to deploy multiple NIM containers and verify the RAG pipeline on GPUs.

The GPUs are located in our company’s data center, and for the RAG pipeline, they are using three H100s and one A40.
The H100s handle NIM’s LLM and embeddings, while the A40 is used for vector search (FAISS) and application deployment (Gradio).
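
Roughly, the flow they verify looks like the sketch below; the host names and model IDs here are placeholders rather than our actual config, the request shapes follow NIM's OpenAI-compatible API, and input_type is NVIDIA's retrieval-NIM extension:

```python
import faiss
import numpy as np
import requests

# Placeholder endpoints; in this setup they would be the NIM containers
# scheduled on the H100s via Run:AI.
EMBED_URL = "http://h100-embed.internal:8000/v1/embeddings"
LLM_URL = "http://h100-llm.internal:8000/v1/chat/completions"

def embed(texts, input_type):
    # NVIDIA retrieval NIMs expose an OpenAI-style embeddings API,
    # extended with an input_type field ("query" or "passage").
    r = requests.post(EMBED_URL, json={
        "model": "nvidia/nv-embedqa-e5-v5",  # assumed embedding NIM
        "input": texts,
        "input_type": input_type,
    }, timeout=60)
    r.raise_for_status()
    return np.array([d["embedding"] for d in r.json()["data"]], dtype="float32")

# Index the corpus once (the FAISS step that runs on the A40).
docs = ["VAST is a data platform.", "NIM serves models over HTTP."]
vecs = embed(docs, "passage")
faiss.normalize_L2(vecs)
index = faiss.IndexFlatIP(vecs.shape[1])
index.add(vecs)

# Answer a question: retrieve the top passage, then ask the LLM NIM.
q = "What does NIM do?"
qv = embed([q], "query")
faiss.normalize_L2(qv)
_, ids = index.search(qv, 1)
context = docs[ids[0][0]]

r = requests.post(LLM_URL, json={
    "model": "meta/llama-3.1-70b-instruct",  # assumed LLM NIM
    "messages": [{"role": "user",
                  "content": f"Context: {context}\n\nQuestion: {q}"}],
}, timeout=120)
r.raise_for_status()
print(r.json()["choices"][0]["message"]["content"])
```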

Additionally, they would like to confirm whether they can use the same type of NIM as the one used in InsightEngine.
This is because, while their current NIM setup requires GPUs, InsightEngine does not.
They suspect that some aspects of NIM, such as TensorRT, might have been modified.

Best Regards,