Shivay Lamba
Shivay Lamba is a software developer specializing in DevOps, Machine Learning and Full Stack Development.
He is an open source enthusiast who has mentored in programs such as Google Code-in and Google Summer of Code, and has also been an MLH Fellow.
He is actively involved in community work as well: he is a TensorFlow.js SIG member and a mentor in OpenMined, the CNCF Service Mesh community, and the SODA Foundation, and he has spoken at conferences including GitHub Satellite, Voice Global, FOSSASIA Tech Summit, and TensorFlow.js Show & Tell.
Sessions
Learn how to secure the AI/ML lifecycle with open, cloud-native tools like KitOps, Cosign, and Kubernetes. This talk covers packaging, signing, policy enforcement, and compliance without slowing down ML velocity.
AI models are increasingly critical to modern applications, yet most teams treat them as opaque binaries outside the bounds of traditional software supply chain security. This talk guides participants through an end-to-end cloud-native pipeline that secures model artifacts from training to deployment. Using open tools like KitOps, Sigstore/Cosign, and Kubernetes, we’ll package a Hugging Face model, generate and verify attestations, enforce policies, and trace provenance. This is not just theory — it’s hands-on, practical, and designed to align with cloud-native workflows. Attendees will leave with patterns and tools they can immediately apply to secure AI in production.
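To make the pipeline concrete, here is a minimal sketch of the packaging-and-signing steps, assuming the KitOps (`kit`) and Sigstore (`cosign`) CLIs are installed and authenticated against a registry; the registry reference is hypothetical, and keyless signing will prompt for an OIDC login:

```python
"""Minimal sketch: package a model as an OCI ModelKit, sign it, verify it.
Assumes the `kit` and `cosign` CLIs are installed; the registry
reference below is hypothetical."""
import subprocess

MODELKIT_REF = "registry.example.com/demo/sentiment-model:v1"  # hypothetical

def run(cmd):
    # Run one pipeline step and fail loudly if it errors.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Package the model directory (with its Kitfile) as an OCI ModelKit.
run(["kit", "pack", ".", "-t", MODELKIT_REF])

# 2. Push the ModelKit to the registry so it can be signed by digest.
run(["kit", "push", MODELKIT_REF])

# 3. Sign the artifact with Cosign (keyless Sigstore flow).
run(["cosign", "sign", MODELKIT_REF])

# 4. Verify the signature, e.g. before admitting the artifact to a cluster.
#    Demo-only: the wildcard regexps accept any signer identity; a real
#    policy would pin the expected identity and OIDC issuer.
run(["cosign", "verify",
     "--certificate-identity-regexp", ".*",
     "--certificate-oidc-issuer-regexp", ".*",
     MODELKIT_REF])
```

In a cluster, step 4 would typically run inside an admission controller or policy engine rather than a script, so unsigned or tampered model artifacts are rejected before deployment.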
As LLMs and generative models grow more complex, training them on a CPU or a single GPU is no longer feasible; they require multiple GPUs, and managing those can be complicated. GPU partitioning in the cloud is often perceived as a complicated, resource-intensive process reserved for narrowly focused teams or large enterprises. This talk explores why GPU partitioning is necessary for running Python AI workloads and how it can be done efficiently using open source tooling.
The talk will also address some common myths: that GPU partitioning demands advanced hardware configurations or prohibitive costs on systems like Kubernetes.
In this talk, we will illustrate how modern technologies like NVIDIA MIG, combined with vCluster, enable seamless sharing of GPUs across different teams, leading to more efficient resource utilization, higher throughput, and broader accessibility for workloads like LLM fine-tuning and inference. The talk aims to give developers and engineers the key techniques for efficient GPU scheduling and resource sharing across multiple GPU clusters with open source platform tooling like vCluster.
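As a taste of what this looks like in practice, here is a minimal sketch using the official Kubernetes Python client to schedule a pod onto a MIG slice, assuming the NVIDIA GPU Operator has MIG enabled so that slices such as `nvidia.com/mig-1g.5gb` are exposed as schedulable resources; the pod name and image are hypothetical:

```python
"""Minimal sketch: request a MIG slice instead of a whole GPU.
Assumes a cluster where the NVIDIA GPU Operator exposes MIG slices
(e.g. `nvidia.com/mig-1g.5gb`) as resources; names are hypothetical."""
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="registry.example.com/llm-inference:latest",  # hypothetical
                resources=client.V1ResourceRequirements(
                    # Request one 1g.5gb MIG slice rather than a full GPU,
                    # so up to seven such pods can share a single A100.
                    limits={"nvidia.com/mig-1g.5gb": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Because each pod requests only a slice, the scheduler can pack several inference or fine-tuning workloads onto one physical GPU, which is the core of the resource-utilization gains the talk discusses.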