Granular Observability for Private AI Workflows: On VMware Private AI Foundation

Explore how VMware and DKube deliver secure, private GenAI workflows with full-stack observability using the VMware Private AI Foundation with NVIDIA. This demo features centralized LLM access control with LiteLLM, RAG-based querying through OpenWebUI, detailed request tracing via Langfuse, and backend visibility into vector data using PGAdmin — all running entirely within your environment.

In enterprise AI, success isn’t just about deploying large language models; it’s about controlling access, tracking usage, and understanding performance across every layer of your stack.
That’s why VMware and DKube are working together to enable secure, local LLM deployments with full-stack observability, using the VMware Private AI Foundation with NVIDIA.
In this demo, we showcase:

  • Centralized API-level access control and model routing with LiteLLM
  • RAG-based querying through OpenWebUI, backed by Postgres and PGVector
  • Request-level tracing, usage metrics, and cost visibility via Langfuse
  • Backend insight into vector embeddings and data chunks through PGAdmin
  • A fully private deployment, with no external internet connectivity required
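At the heart of the RAG querying above is a nearest-neighbor search over vector embeddings, which in this stack is performed inside Postgres by PGVector. As a rough illustration of what that lookup does, here is a minimal pure-Python sketch with toy, hypothetical 3-dimensional embeddings; the chunk names, vectors, and query values are invented for the example and do not come from the demo.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, the distance PGVector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy "document chunks" with pre-computed embeddings, standing in for
# rows of a PGVector-backed table (values are made up for illustration).
chunks = {
    "chunk_a": [0.9, 0.1, 0.0],
    "chunk_b": [0.1, 0.9, 0.0],
    "chunk_c": [0.7, 0.3, 0.1],
}

query_embedding = [1.0, 0.0, 0.0]  # embedding of the user's question

# Conceptually equivalent to a PGVector query such as:
#   SELECT id FROM chunks ORDER BY embedding <=> $1 LIMIT 2;
nearest = sorted(
    chunks, key=lambda name: cosine_distance(query_embedding, chunks[name])
)[:2]
print(nearest)  # → ['chunk_a', 'chunk_c']
```

In the demo, PGAdmin gives you direct visibility into exactly these stored embeddings and chunks, so you can inspect what the retrieval step is actually matching against.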

Watch the demo to see how VMware and DKube bring transparency, traceability, and control to enterprise GenAI workflows, from secure model access down to every individual query.

Written By
Team DKube