
Soumik Rakshit

Machine Learning Engineer at Weights & Biases || Google Developer Expert (JAX)
Created on October 16 | Last edited on April 17

About Me

  • 💼 My Roles:
  • 🔭 Things I'm currently working on:
    • Restorers: A toolkit providing out-of-the-box TensorFlow implementations of SoTA image and video restoration models for tasks such as low-light enhancement, denoising, deblurring, super-resolution, etc.
    • wandb-addons: A repository consisting of additional utilities and community contributions for supercharging your Weights & Biases workflows.
    • Weights & Biases Callbacks for Keras: Callbacks for experiment tracking, model checkpointing, data evaluation, model surgery and many more features for Keras-based machine learning workflows.
    • Applications of generative machine learning to weather forecasting. Check out our Nvidia GTC showcase at wandb.me/gtc2023.
    • Clean and reproducible implementations of deep learning research papers.
  • 🌱 Things I’m currently learning:
    • Geometric Deep Learning
    • Real-time rendering using Vulkan
  • 👬 Looking forward to collaborating on:
    • Clean and reproducible implementation of Deep Learning Research Papers
    • Music recordings
  • 💬 Reach out to me if you:
    • want to engage in a meaningful conversation regarding Deep Learning, Mathematics or Computer Graphics
    • want to hire me
  • 📲 How to reach out to me:
  • ⚡ Fun fact:
    • I have insomnia 😵
    • I play Minecraft and Souls-like games (currently playing Sekiro 🗡️) 🎮



My Content

Check out my content at:

Research Paper Summaries on Two Minute Papers

Improving Generative Images with Instructions: Prompt-to-Prompt Image Editing with Cross Attention Control
A primer on text-driven image editing for large-scale text-based image synthesis models like Stable Diffusion & Imagen
Building Diverse Skillsets for Video Game Characters With Adversarial Skill Embeddings
In this article, we explore using large-scale reusable adversarial skill embeddings for physically simulated characters.
Robotic Telekinesis in the Wild
In this article, we explore a robotic hand imitator that learns by watching humans on YouTube, enabling any operator in the wild to control it with only a single uncalibrated color camera.
Digging Into StyleGAN-NADA for CLIP-Guided Domain Adaptation
In this article, we take a deep dive into how StyleGAN-NADA achieved the task of CLIP-guided domain adaptation and explore how we can use the model itself.
Modern Evolution Strategies for Creativity
In this article, we revisit evolutionary strategy algorithms for computational creativity and look at how they improve quality and efficiency in generating art.
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
In this article, we explore how to achieve photorealistic rendering of large unbounded 3D scenes from novel camera angles while preserving fine-grained details.
Extracting Triangular 3D Models, Materials, and Lighting From Images
In this article, we'll explore a novel and efficient approach for joint optimization of topology, materials, and lighting from multi-view image observations.
Barbershop: Hair Transfer with GAN-Based Image Compositing Using Segmentation Masks
A novel GAN-based optimization method for photo-realistic hairstyle transfer
Block-NeRF: Scalable Large Scene Neural View Synthesis
Representing large city-scale environments spanning multiple blocks using Neural Radiance Fields
Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild
Or, teaching four-legged robots to walk in the real world
PoE-GAN: Generating Images from Multi-Modal Inputs
PoE-GAN is a recent, fascinating paper where the authors generate images from multiple inputs like text, style, segmentation, and sketch. We dig into the architecture, the underlying math, and of course, generate some images along the way.
Paella: Fast Text-Conditional Image Generation
In this article, we explore the paper "Fast Text-Conditional Discrete Denoising on Vector-Quantized Latent Spaces" which introduces Paella, a novel text-to-image model.
Variable Bitrate Neural Fields: Create Fast Approximations of 3D Scenes
This article explores creating accurate, fast approximations of complex 3D scenes with a low memory footprint, as outlined in 'Variable Bitrate Neural Fields'.


Reports Published on Fully-Connected

Object Detection for Autonomous Vehicles (A Step-by-Step Guide)
Digging into object detection and perception for autonomous vehicles using YOLOv5 and Weights & Biases
Training Semantic Segmentation Models for Autonomous Vehicles (A Step-by-Step Guide)
A short tutorial on leveraging Weights & Biases to train a semantic segmentation model for autonomous vehicles.
Multi-Task Learning as a Bargaining Game
In this article, we explore gradient combination in multi-task learning (MTL) as a cooperative bargaining game, and discuss Nash MTL — a novel approach — in detail.
Implementing NeRF in JAX
This article uses JAX to create a minimal implementation of 3D volumetric rendering of scenes represented by Neural Radiance Fields, using W&B to track all metrics.
Working with FuncTorch: An Introduction
Working with JAX-like composable function transforms in PyTorch
Advanced Tensorboard Features: Graph Dashboard
An introduction to exploring the computation graphs of Machine Learning workflows with TensorBoard
Advanced Tensorboard Features: Tensorflow Debugger
An introduction to debugging Machine Learning workflows written in TensorFlow using the TensorBoard Debugger
Hazy Image Restoration Using Keras
An introduction to building an Image Restoration model using TensorFlow, Keras, and Weights & Biases.
Unconditional Image Generation Using HuggingFace Diffusers
In this article, we explore how to train unconditional image generation models using HuggingFace Diffusers, then track the experiments and compare the results using Weights & Biases.
What Makes Depthwise Separable Convolutions Faster
Exploring how Depthwise Separable Convolutions can make CNNs faster
Point Cloud Classification Using PyTorch Geometric
In this article, we explore how to classify point cloud data from 3D CAD Models, implementing the PointNet++ architecture and using PyTorch Geometric and W&B.
Point Cloud Segmentation Using Dynamic Graph CNNs
In this article, we explore a simple point cloud segmentation pipeline using Dynamic Graph CNNs, implemented using PyTorch Geometric along with Weights & Biases.
Digging Into the ShapeNetCore Dataset
In this article, we dive into the ShapeNetCore Dataset for the classification and segmentation of point cloud data and explore how to work with it using Weights & Biases.
Fine-Tuning Stable Diffusion Using Dreambooth in Keras
In this article, we quickly teach Stable Diffusion new visual concepts using Dreambooth in Keras, to produce fully-novel photorealistic images of a given subject.
DeepFloydAI: A New Breakthrough in Text-Guided Image Generation
In this article, we explore DeepFloydAI — an AI Research Band which is working with StabilityAI to make AI open again.
Low-Light Image Enhancement: Lighting up Images in the Deep Learning Era
In this article, we explore some deep learning techniques for low-light image enhancement, so that you can enhance images taken under sub-optimal conditions.



Examples Contributed to keras.io/examples




Other Open-Source Projects

Enet-Camvid
PyTorch implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, trained on the CamVid Dataset
YOLOv3 using Tensorflow 2.0
Implementation of YOLOv3 using TensorFlow 2.0
Relativistic AnimeGAN
Generating Anime Faces using Relativistic GAN
Automatic Number Plate Recognition
Automatic Number Plate Recognition in Hangul using Convolutional Recurrent Neural Network
Efficient Graph-Based Image Segmentation
Implementation of the paper "Efficient Graph-Based Image Segmentation" by P. Felzenszwalb and D. Huttenlocher
Weights and Biases Callback for Elegy
Automatically log your experiment results using elegy.callbacks.WandbCallback and save and version model checkpoints as Weights & Biases Artifacts
Colorization using Optimization
With this algorithm, an artist only needs to annotate the image with a few color scribbles or visual clues, and the indicated colors are automatically propagated in both space and time to produce a fully colorized image or sequence.
Radium
Radium is a small and lightweight Ray Tracing Engine written in C++ that runs on the CPU using shared-memory multiprocessing
Kinect-Vision
A computer-vision-based gesture detection system that automatically detects the number of fingers shown as a hand gesture and lets you control simple button-pressing games using your hand gestures
Arxiv2Kindle
Arxiv2Kindle is a simple script written in Python that takes LaTeX source downloaded from arXiv and recompiles it to better fit a reading device (such as a Kindle)
Manga Scraper
https://github.com/soumik12345/Manga-Scraper
BrainFuck-Interpreter
A simple BrainFuck Interpreter in Java