Intro
Hello, I'm Soumik Rakshit 👋
🚀 I build MLOps pipelines for open-source repositories like Keras, Kaolin-Wisp, YOLOv5, etc.
🌱 I'm currently learning Neural Rendering, Neural Approximation and Vision-Language Modelling.
👬 I would love to collaborate on interesting Computer Vision and Graphics projects and implementations of Deep Learning Research Papers.
📲 You can reach me at soumik.rakshit@wandb.com or on my social media profiles [twitter.com/soumikrakshit96 | linkedin.com/in/soumikrakshit]
🧔🏽♂️ Pronouns: He/His/Him.
🎮 Fun fact: I love playing video games (currently playing Elden Ring).
📚 I create content on machine learning. You can find some of my work down below 👇
Reports
PoE-GAN: Generating Images from Multi-Modal Inputs
PoE-GAN is a recent, fascinating paper where the authors generate images from multiple inputs like text, style, segmentation, and sketch. We dig into the architecture, the underlying math, and of course, generate some images along the way.
Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild
Or, teaching four-legged robots to walk in the real world
Block-NeRF: Scalable Large Scene Neural View Synthesis
Representing large city-scale environments spanning multiple blocks using Neural Radiance Fields
EditGAN: High-Precision Semantic Image Editing
Robust and high-precision semantic image editing in real-time
Barbershop: Hair Transfer with GAN-Based Image Compositing Using Segmentation Masks
A novel GAN-based optimization method for photo-realistic hairstyle transfer
Implementing NeRF in JAX
This article uses JAX to create a minimal implementation of 3D volumetric rendering of scenes represented by Neural Radiance Fields, using W&B to track all metrics.
Extracting Triangular 3D Models, Materials, and Lighting From Images
In this article, we'll explore a novel and efficient approach for joint optimization of topology, materials, and lighting from multi-view image observations.
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
In this article, we explore how to achieve photorealistic rendering of large unbounded 3D scenes from novel camera angles while preserving fine-grained details.
Modern Evolution Strategies for Creativity
In this article, we revisit evolutionary strategy algorithms for computational creativity and look at how they improve quality and efficiency in generating art.
Digging Into StyleGAN-NADA for CLIP-Guided Domain Adaptation
In this article, we take a deep dive into how StyleGAN-NADA achieved the task of CLIP-guided domain adaptation and explore how we can use the model itself.
Writing a Training Loop in JAX and Flax
In this article, we explore an end-to-end training and evaluation pipeline in JAX, Flax, and Optax for image classification, using W&B to track experiments.
Training Semantic Segmentation Models for Autonomous Vehicles (A Step-by-Step Guide)
A short tutorial on leveraging Weights & Biases to train a semantic segmentation model for autonomous vehicles.
Object Detection for Autonomous Vehicles (A Step-by-Step Guide)
Digging into object detection and perception for autonomous vehicles using YOLOv5 and Weights & Biases
Improving Generative Images with Instructions: Prompt-to-Prompt Image Editing with Cross Attention Control
A primer on text-driven image editing for large-scale text-based image synthesis models like Stable Diffusion & Imagen
Hazy Image Restoration Using Keras
An introduction to building an image restoration model using TensorFlow, Keras, and Weights & Biases.
Unconditional Image Generation Using HuggingFace Diffusers
In this article, we explore how to train unconditional image generation models using HuggingFace Diffusers, then track the experiments and compare the results using Weights & Biases.
Paella: Fast Text-Conditional Image Generation
In this article, we explore the paper "Fast Text-Conditional Discrete Denoising on Vector-Quantized Latent Spaces" which introduces Paella, a novel text-to-image model.
Variable Bitrate Neural Fields: Create Fast Approximations of 3D Scenes
This article explores creating accurate, fast approximations of complex 3D scenes with a low memory footprint, as outlined in 'Variable Bitrate Neural Fields'.
What Makes Depthwise Separable Convolutions Faster
Exploring how Depthwise Separable Convolutions can make CNNs faster
Point Cloud Classification Using PyTorch Geometric
In this article, we explore how to classify point cloud data from 3D CAD Models, implementing the PointNet++ architecture and using PyTorch Geometric and W&B.
Digging Into the ShapeNetCore Dataset
In this article, we dive into the ShapeNetCore Dataset for the classification and segmentation of point cloud data and explore how to work with it using Weights & Biases.
Point Cloud Segmentation Using Dynamic Graph CNNs
In this article, we explore a simple point cloud segmentation pipeline using Dynamic Graph CNNs, implemented using PyTorch Geometric along with Weights & Biases.
DeepFloydAI: A New Breakthrough in Text-Guided Image Generation
In this article, we explore DeepFloydAI — an AI Research Band which is working with StabilityAI to make AI open again.
Fine-Tuning Stable Diffusion Using Dreambooth in Keras
In this article, we quickly teach Stable Diffusion new visual concepts using Dreambooth in Keras, to produce fully-novel photorealistic images of a given subject.
Low-Light Image Enhancement: Lighting up Images in the Deep Learning Era
In this article, we explore some deep learning techniques for low-light image enhancement, so that you can enhance images taken under sub-optimal conditions.
XLA Compatibility of Vision Models in Keras
A set of comprehensive benchmarks around XLA compatibility of computer vision models implemented in Keras.
Supercharging Ultralytics with Weights & Biases
A guide on using Weights & Biases with Ultralytics workflows for computer vision models
A Guide to Using Stable Diffusion XL with HuggingFace Diffusers and W&B
A comprehensive guide to using Stable Diffusion XL (SDXL) for generating high-quality images using HuggingFace Diffusers and managing experiments with Weights & Biases
A Guide to Generating Music using AudioCraft
This article provides a comprehensive guide to using state-of-the-art music and audio generation models using AudioCraft from Meta, along with Weights & Biases.
Fine-Tuning a TorchVision Model using Keras
A comprehensive guide to fine-tuning a pre-trained model from TorchVision using Keras.
Object Detection using YOLOv8: An End-to-End Workflow
A comprehensive guide to building an object detection workflow using Ultralytics YOLOv8 and Weights & Biases.
A Guide to Prompt Engineering for Stable Diffusion
A comprehensive guide to prompt engineering for generating images using Stable Diffusion, HuggingFace Diffusers and Weights & Biases.
PIXART-α: A Diffusion Transformer Model for Text-to-Image Generation
This article provides a short tutorial on how to run experiments with PixArt-α, the new transformer-based diffusion model for generating photorealistic images from text.
Brain tumor segmentation using MONAI and W&B Models
Supercharging your Deep Learning workflows for Medical Imaging using MONAI and Weights & Biases
Building an AI teacher's assistant using LlamaIndex and Groq
Today, we're going to leverage a RAG pipeline to create an AI TA capable of helping out with grading, questions about a class syllabus, and more
How to optimize LLM workflows using DSPy and W&B Weave
Learn how to use DSPy teleprompters and Weave to automatically optimize prompting strategies for causal reasoning
Prompt upsampling for diffusion models
This article shows the implementation of an LLM-assisted prompt upsampling strategy to improve the quality of images generated by Stable Diffusion.
Building a GenAI-assisted automatic story illustrator
Until September 27th, Weights & Biases users can illustrate their own stories for free using Flux and GPT-4
Llama 3.2-Vision for multi-modal RAG in financial services
Understanding SEC filings with the help of foundation models
Links
Resume
https://wandb.me/soumik-rakshit
Low-light image enhancement using MIRNet
https://keras.io/examples/vision/mirnet/
Zero-DCE for low-light image enhancement
https://keras.io/examples/vision/zero_dce/
Multiclass semantic segmentation using DeepLabV3+
https://keras.io/examples/vision/deeplabv3_plus/
Large-scale multi-label text classification
https://keras.io/examples/nlp/multi_label_classification/
GauGAN for conditional image generation
https://keras.io/examples/generative/gaugan/
Point cloud segmentation with PointNet
https://keras.io/examples/vision/pointnet_segmentation/
Radium
https://github.com/soumik12345/radium
Colorization using Optimization
https://github.com/soumik12345/colorization-using-optimization
Deep Deterministic Policy Gradients
https://github.com/soumik12345/DDPG
Twin Delayed DDPG
https://github.com/soumik12345/Twin-Delayed-DDPG
Arxiv2Kindle
https://github.com/soumik12345/Arxiv2Kindle
Manga Scraper
https://github.com/soumik12345/Manga-Scraper