COSCUP 2025

Shivay Lamba

Shivay Lamba is a software developer specializing in DevOps, Machine Learning and Full Stack Development.

He is an open source enthusiast and has taken part in programs such as Google Code-in and Google Summer of Code as a mentor, and has also been an MLH Fellow.
He is actively involved in community work as well: he is a TensorFlow.js SIG member, a mentor with OpenMined, the CNCF Service Mesh community, and the SODA Foundation, and has given talks at conferences such as GitHub Satellite, Voice Global, the FOSSASIA Tech Summit, and TensorFlow.js Show & Tell.


Sessions

08-09
11:10
30min
Trusting your AI models: Building a secure cloud-native supply chain
Shivay Lamba

Learn how to secure the AI/ML lifecycle using CNCF tools like KitOps, Cosign, and Kubernetes. This talk covers packaging, signing, enforcement, and compliance without slowing down ML velocity.
AI models are increasingly critical to modern applications, yet most teams treat them as opaque binaries outside the bounds of traditional software supply chain security. This talk guides participants through an end-to-end cloud-native pipeline that secures model artifacts from training to deployment. Using open tools like KitOps, Sigstore/Cosign, and Kubernetes, we’ll package a Hugging Face model, generate and verify attestations, enforce policies, and trace provenance. This is not just theory — it’s hands-on, practical, and designed to align with cloud-native workflows. Attendees will leave with patterns and tools they can immediately apply to secure AI in production.
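The packaging step the abstract describes can be sketched as a KitOps Kitfile, the manifest that declares what goes into an OCI-compatible ModelKit. This is a minimal, hypothetical example (the package name and all file paths are placeholders, not from the talk):

```yaml
# Kitfile: declares the contents of a ModelKit (hypothetical names and paths)
manifestVersion: "1.0"
package:
  name: demo-model
  description: Example Hugging Face model packaged for supply chain signing
model:
  name: demo-model
  path: ./model              # directory with weights/tokenizer from Hugging Face
code:
  - path: ./inference.py     # serving code shipped alongside the model
datasets:
  - name: eval-data
    path: ./data/eval.jsonl  # evaluation data kept for provenance tracking
```

Because a packed ModelKit is pushed as an OCI artifact, it can then be signed and verified with Cosign like any other container image, which is what enables the attestation and policy-enforcement steps mentioned above.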

Miscellaneous Open Source Topics
TR512
08-10
10:40
30min
Multi Cluster GPU Allocation for AI Research
Hrittik Roy, Shivay Lamba

As LLMs and generative models become more and more complex, one can no longer train them on a CPU or a single GPU; training requires multiple GPUs, and managing those can be complicated. GPU partitioning in the cloud is often perceived as a complicated, resource-consuming process reserved for narrowly focused teams or large enterprises. This talk explores why GPU partitioning is necessary for running Python AI workloads and how it can be done efficiently using open source tooling.

The talk will also address some common myths: that GPU partitioning requires advanced hardware configurations, or that it comes at prohibitive cost on systems like Kubernetes.

In this talk, we will illustrate how modern frameworks like NVIDIA MIG, combined with vCluster, enable seamless sharing of GPUs across different teams, leading to more efficient resource utilization, higher throughput, and broader accessibility for workloads like LLM fine-tuning and inference. The talk aims to help developers and engineers understand the key techniques for efficient GPU scheduling and resource sharing across multiple GPU clusters with open source platform tooling like vCluster.
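As a concrete illustration of the scheduling side, a minimal sketch of a workload requesting a MIG slice instead of a whole GPU. This assumes NVIDIA's Kubernetes device plugin with MIG enabled; the pod name and image are placeholders, and `1g.5gb` is one common A100 MIG profile:

```yaml
# Pod requesting a single MIG partition rather than an entire GPU
apiVersion: v1
kind: Pod
metadata:
  name: finetune-worker
spec:
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.01-py3   # example training image
      resources:
        limits:
          nvidia.com/mig-1g.5gb: 1   # one 1g.5gb slice of a MIG-enabled GPU
```

With vCluster, each team can be given a virtual cluster whose workloads request slices like this from the shared host cluster, which is how the per-team GPU sharing described above is realized.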

Open Source AI and Machine Learning
AU