Granular Observability for Private AI Workflows: On VMware Private AI Foundation

In enterprise AI, success isn’t just about deploying large language models; it’s about controlling access, tracking usage, and understanding performance across every layer of your stack.
That’s why VMware and DKube are working together to enable secure, local LLM deployments with full-stack observability, using VMware Private AI Foundation with NVIDIA.
In this demo, we showcase:

  • Centralized API-level access control and model routing with LiteLLM (see the client sketch after this list)
  • RAG-based querying through OpenWebUI, backed by Postgres and PGVector (see the query sketch below)
  • Request-level tracing, usage metrics, and cost visibility via Langfuse
  • Backend insight into vector embeddings and data chunks through PGAdmin
  • A fully private deployment, with no external internet connectivity required
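
To make the routing and tracing pieces concrete, here is a minimal sketch of a client call through such a setup, assuming a LiteLLM proxy that exposes an OpenAI-compatible endpoint inside the private environment. The endpoint address, virtual key, model alias, and metadata fields are illustrative assumptions, not values from the demo.

```python
from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy instead of a public API.
# The base_url, virtual key, and model alias are hypothetical placeholders.
client = OpenAI(
    base_url="http://litellm.internal:4000/v1",
    api_key="sk-team-a-virtual-key",  # per-team virtual key issued by the proxy (assumed)
)

response = client.chat.completions.create(
    # Alias the proxy resolves and routes to a locally hosted model (assumed name).
    model="local-llama",
    messages=[{"role": "user", "content": "Summarize our deployment runbook."}],
    # When the proxy's Langfuse callback is enabled, LiteLLM can forward this
    # metadata so the request appears as a trace attributed to a specific user.
    extra_body={"metadata": {"trace_user_id": "analyst-42"}},
)
print(response.choices[0].message.content)
```

Because every request funnels through the proxy, access control, routing, and per-request usage accounting all happen at a single chokepoint.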

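On the retrieval side, here is a sketch of the kind of PGVector similarity lookup that backs each RAG answer, under an assumed schema (a document_chunks table with chunk_text and embedding columns); the demo's actual schema is not shown. These chunk-plus-embedding rows are the same data PGAdmin exposes for inspection.

```python
import psycopg2

# Hypothetical DSN; in the demo this would be the private Postgres instance.
conn = psycopg2.connect("dbname=rag user=rag host=pg.internal")

def top_chunks(query_embedding: list[float], k: int = 5):
    """Return the k document chunks closest to the query embedding."""
    # pgvector accepts a '[x1,x2,...]' literal cast to the vector type;
    # '<->' is its Euclidean-distance operator.
    literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, chunk_text, embedding <-> %s::vector AS distance
            FROM document_chunks
            ORDER BY embedding <-> %s::vector
            LIMIT %s
            """,
            (literal, literal, k),
        )
        return cur.fetchall()
```
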
Watch the demo to see how VMware and DKube bring transparency, traceability, and control to enterprise GenAI workflows, from secure model access down to every individual query.

Written by
Team DKube
