Intro

Hello, I'm Soumik Rakshit 👋

🖥️ I am working at Weights & Biases as a Machine Learning Engineer.
📣 I am a Google Developer Expert in Machine Learning (JAX/Flax).
🚀 I build MLOps pipelines for open-source repositories like Keras, Kaolin-Wisp, YOLOv5, etc.
🌱 I'm currently learning Neural Rendering, Neural Approximation and Vision-Language Modelling.
👬 I would love to collaborate on interesting Computer Vision and Graphics projects and implementations of Deep Learning Research Papers.
📲 You can reach me at soumik.rakshit@wandb.com or on my social media profiles [twitter.com/soumikrakshit96 | linkedin.com/in/soumikrakshit]
🧔🏽‍♂️ Pronouns: He/His/Him.
🎮 Fun fact: I love playing video games (currently playing Elden Ring).
😁 More about myself at wandb.me/soumik-rakshit


📚 I create content on machine learning. You can find some of my work down below 👇

Reports
PoE-GAN: Generating Images from Multi-Modal Inputs
PoE-GAN is a recent, fascinating paper where the authors generate images from multiple inputs like text, style, segmentation, and sketch. We dig into the architecture, the underlying math, and of course, generate some images along the way.
11403 views
Last edit 3 years ago
Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild
Or, teaching four-legged robots to walk in the real world
2320 views
Last edit 3 years ago
Block-NeRF: Scalable Large Scene Neural View Synthesis
Representing large city-scale environments spanning multiple blocks using Neural Radiance Fields
6566 views
Last edit 3 years ago
EditGAN: High-Precision Semantic Image Editing
Robust and high-precision semantic image editing in real-time
4425 views
Last edit 3 years ago
Barbershop: Hair Transfer with GAN-Based Image Compositing Using Segmentation Masks
A novel GAN-based optimization method for photo-realistic hairstyle transfer
8316 views
Last edit 3 years ago
Implementing NeRF in JAX
This article uses JAX to create a minimal implementation of 3D volumetric rendering of scenes represented by Neural Radiance Fields, using W&B to track all metrics.
7606 views
Last edit 2 years ago
Extracting Triangular 3D Models, Materials, and Lighting From Images
In this article, we'll explore a novel and efficient approach for joint optimization of topology, materials, and lighting from multi-view image observations.
8436 views
Last edit 2 years ago
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
In this article, we explore how to achieve photorealistic rendering of large unbounded 3D scenes from novel camera angles while preserving fine-grained details.
8662 views
Last edit 2 years ago
Modern Evolution Strategies for Creativity
In this article, we revisit evolutionary strategy algorithms for computational creativity and look at how they improve quality and efficiency in generating art.
3410 views
Last edit 2 years ago
Digging Into StyleGAN-NADA for CLIP-Guided Domain Adaptation
In this article, we take a deep dive into how StyleGAN-NADA achieved the task of CLIP-guided domain adaptation and explore how we can use the model itself.
3249 views
Last edit 2 years ago
Writing a Training Loop in JAX and Flax
In this article, we explore an end-to-end training and evaluation pipeline in JAX, Flax, and Optax for image classification, using W&B to track experiments.
16375 views
Last edit 2 years ago
Training Semantic Segmentation Models for Autonomous Vehicles (A Step-by-Step Guide)
A short tutorial on leveraging Weights & Biases to train a semantic segmentation model for autonomous vehicles.
3894 views
Last edit 3 years ago
Object Detection for Autonomous Vehicles (A Step-by-Step Guide)
Digging into object detection and perception for autonomous vehicles using YOLOv5 and Weights & Biases
7425 views
Last edit 1 year ago
Improving Generative Images with Instructions: Prompt-to-Prompt Image Editing with Cross Attention Control
A primer on text-driven image editing for large-scale text-based image synthesis models like Stable Diffusion & Imagen
8500 views
Last edit 3 years ago
Hazy Image Restoration Using Keras
An introduction to building an image restoration model using TensorFlow, Keras, and Weights & Biases.
2461 views
Last edit 2 years ago
Unconditional Image Generation Using HuggingFace Diffusers
In this article, we explore how to train unconditional image generation models using HuggingFace Diffusers, then track these experiments and compare the results using Weights & Biases.
6805 views
Last edit 2 years ago
Paella: Fast Text-Conditional Image Generation
In this article, we explore the paper "Fast Text-Conditional Discrete Denoising on Vector-Quantized Latent Spaces" which introduces Paella, a novel text-to-image model.
1428 views
Last edit 2 years ago
Variable Bitrate Neural Fields: Create Fast Approximations of 3D Scenes
This article explores creating accurate, fast approximations of complex 3D scenes with a low memory footprint, as outlined in 'Variable Bitrate Neural Fields'.
3048 views
Last edit 2 years ago
What Makes Depthwise Separable Convolutions Faster
Exploring how Depthwise Separable Convolutions can make CNNs faster
109 views
Last edit 3 years ago
Point Cloud Classification Using PyTorch Geometric
In this article, we explore how to classify point cloud data from 3D CAD Models, implementing the PointNet++ architecture and using PyTorch Geometric and W&B.
8377 views
Last edit 2 years ago
Digging Into the ShapeNetCore Dataset
In this article, we dive into the ShapeNetCore Dataset for the classification and segmentation of point cloud data and explore how to work with it using Weights & Biases.
2459 views
Last edit 2 years ago
Point Cloud Segmentation Using Dynamic Graph CNNs
In this article, we explore a simple point cloud segmentation pipeline using Dynamic Graph CNNs, implemented using PyTorch Geometric along with Weights & Biases.
3781 views
Last edit 2 years ago
DeepFloydAI: A New Breakthrough in Text-Guided Image Generation
In this article, we explore DeepFloydAI — an AI Research Band which is working with StabilityAI to make AI open again.
1715 views
Last edit 2 years ago
Fine-Tuning Stable Diffusion Using Dreambooth in Keras
In this article, we quickly teach Stable Diffusion new visual concepts using Dreambooth in Keras, to produce fully-novel photorealistic images of a given subject.
3229 views
Last edit 2 years ago
Low-Light Image Enhancement: Lighting up Images in the Deep Learning Era
In this article, we explore some deep learning techniques for low-light image enhancement, so that you can enhance images taken under sub-optimal conditions.
4201 views
Last edit 2 years ago
XLA Compatibility of Vision Models in Keras
A set of comprehensive benchmarks around XLA compatibility of computer vision models implemented in Keras.
1215 views
Last edit 1 year ago
Supercharging Ultralytics with Weights & Biases
A guide on using Weights & Biases with Ultralytics workflows for computer vision models
3006 views
Last edit 2 years ago
A Guide to Using Stable Diffusion XL with HuggingFace Diffusers and W&B
A comprehensive guide to using Stable Diffusion XL (SDXL) for generating high-quality images using HuggingFace Diffusers and managing experiments with Weights & Biases
16319 views
Last edit 1 year ago
A Guide to Generating Music using AudioCraft
This article provides a comprehensive guide to running state-of-the-art music and audio generation models with AudioCraft from Meta, along with Weights & Biases.
3476 views
Last edit 2 years ago
Fine-Tuning a TorchVision Model using Keras
A comprehensive guide to fine-tuning a pre-trained model from TorchVision using Keras.
772 views
Last edit 2 years ago
Object Detection using YOLOv8: An End-to-End Workflow
A comprehensive guide to building an object detection workflow using Ultralytics YOLOv8 and Weights & Biases.
5111 views
Last edit 10 months ago
A Guide to Prompt Engineering for Stable Diffusion
A comprehensive guide to prompt engineering for generating images using Stable Diffusion, HuggingFace Diffusers and Weights & Biases.
6840 views
Last edit 1 year ago
PIXART-α: A Diffusion Transformer Model for Text-to-Image Generation
This article provides a short tutorial on how to run experiments with PixArt-α, the new transformer-based diffusion model for generating photorealistic images from text.
2193 views
Last edit 1 year ago
Brain tumor segmentation using MONAI and W&B Models
Supercharging your Deep Learning workflows for Medical Imaging using MONAI and Weights & Biases
2140 views
Last edit 6 months ago
Building an AI teacher's assistant using LlamaIndex and Groq
Today, we're going to leverage a RAG pipeline to create an AI TA capable of helping out with grading, questions about a class syllabus, and more
1209 views
Last edit 1 year ago
How to optimize LLM workflows using DSPy and W&B Weave
Learn how to use DSPy teleprompters and Weave to automatically optimize prompting strategies for causal reasoning
740 views
Last edit 1 year ago
Prompt upsampling for diffusion models
This article shows the implementation of an LLM-assisted prompt upsampling strategy to improve the quality of images generated by Stable Diffusion.
1178 views
Last edit 1 year ago
Building a GenAI-assisted automatic story illustrator
Until September 27th, Weights & Biases users can illustrate their own stories for free using Flux and GPT-4
1506 views
Last edit 1 year ago
Llama 3.2-Vision for multi-modal RAG in financial services
Understanding SEC filings with the help of foundation models
973 views
Last edit 7 months ago