JSTASR: Joint Size and Transparency-Aware Snow Removal Algorithm

Reproduction of the 'JSTASR: Joint Size and Transparency-Aware Snow Removal Algorithm Based on Modified Partial Convolution and Veiling Effect Removal' paper by Chen et al. (2020), accepted to ECCV 2020.
Kajal Puri

Reproducibility Summary

This report was written for the reproducibility challenge and covers the ECCV 2020 paper "JSTASR: Joint Size and Transparency-Aware Snow Removal Algorithm Based on Modified Partial Convolution and Veiling Effect Removal" by Chen et al. (2020). The original code is available here. The paper addresses the removal of snow and snowflakes from images. The resulting model performs well at removing snow from both real-world and synthetic datasets, and the paper also reports benefits for object detection on snowy images.

Scope of Reproducibility

The authors have open-sourced a new large-scale snow dataset that includes the veiling effect, named "Snow Removal in Realistic Scenario" (SRRS), which can be downloaded here. They propose a snow removal model named "Joint Size and Transparency-Aware Snow Removal" (JSTASR). The method removes the veiling effect of snow from images, detects the size of the snow particles in order to remove them, and is aware of transparent snow in the image.

Methodology

To reproduce the paper, we mainly used the newly released SRRS snow dataset together with the model weights and code from the authors' open-source GitHub repository. The code ran without errors once the dataset and libraries were installed as described in the repository. We reproduced the results mainly on the provided test images and obtained decent results. We tried to use Weights & Biases (W&B) to log hyper-parameters during training and inference, but because the code is written mainly in Keras, for which we found W&B support to be limited, we could not log all parameters with W&B.

Results

We were able to reproduce the main results reported in the paper using GPUs. The primary results we obtained are closely aligned with the reported results. We had slight difficulty installing the right versions of the Keras and OpenCV libraries (the code uses older versions), but we resolved this by looking through the open GitHub issues. Overall, the results obtained from replicating the code and experiments support the authors' claim that JSTASR is able to remove snow from images.

What was easy

Because the open-sourced data, code, and extracted features were easy to locate, it was not difficult to understand the code and experiments referenced in the paper. The authors explain the reasoning behind the model in the "Training Detail" section of the paper as well as in the README file of their code. The network is simplified in the code, and the whole process is divided into modules (snow model formulation, joint size and transparency-aware snow removal, and veiling effect removal), which makes the code easy to follow.

What was difficult

The authors host the new SRRS dataset on Google Drive, which sometimes makes downloading difficult because one of the files is 20 GB in size. For most vision applications we almost always use PyTorch, whereas their code is written with Keras, OpenCV and TensorFlow, and with much older versions of these libraries. Before we could properly understand the code, we therefore had to go back to the Keras documentation to remind ourselves how data and model loading work in that library. Also, as W&B did not support many of the Keras features we needed, we had to drop our main ways of tracking the model's progress and evaluation during training.

Communication with the Authors

As the code is open-sourced and easily reproducible, we did not feel any need to contact the authors. The authors nevertheless reply actively to open GitHub issues. We are thankful to them for their work and responsiveness.

Introduction

This report is a reproduction of the ECCV 2020 paper "JSTASR: Joint Size and Transparency-Aware Snow Removal Algorithm Based on Modified Partial Convolution and Veiling Effect Removal" by Chen et al. (2020).
The paper introduces a novel technique to remove snow or snowflakes from images. The authors also release a new dataset of images that contain a significant amount of snow. They reformulate the snow model so that it takes the veiling effect of snow into account. They then employ a size and transparency-aware snow removal algorithm, named JSTASR, that can remove snow of different shapes, sizes and scales from the image. They test the method on both real-world and synthetic image datasets to check its efficacy, and report a significant improvement over existing methods. This report replicates the experiments described in the paper and supports the authors' claims by obtaining very similar results.

Scope of Reproducibility

The paper proposes the JSTASR method to remove snow, which is performed in two main steps:
  1. Size-aware snow identifier.
  2. Transparency-aware snow removal.
The first step identifies snow at different scales. Using this information, the transparency-aware model can deal with different transparency levels as well as different scales in order to remove the snow. As a third step, the authors add a differentiable dark channel layer and embed it into the model in order to remove the veiling effect of the snow, which works well in the final output.

Methodology

Let's look at their methodology in detail, step by step:
  1. Snow Model Formulation: Snow is distributed locally, unlike rain, clouds, mist etc. In addition, as with other atmospheric phenomena, the veiling effect also occurs in snow. The veiling effect is a global illumination effect arising from multiple scattering of light, which makes the image foggier and noisier. This effect generally limits the performance of snow removal techniques. The authors propose the following new model:
I(x) = K(x)T(x) + A(x)(1 − T(x))
In the above equation, I is the image captured by the camera, K is the veiling effect-free but snowy image, T is the media transmission and A is the atmospheric light of the veiling effect. This differs from existing snow removal algorithms because it also takes into account the local reconstruction performed by snow pixels. As a result, the recovered image has sharper and better defined edges.
  2. Joint Size and Transparency-Aware Snow Removal: JSTASR designs a size-aware snow removal model for the first time. The possible scales detected are small, medium and large. Three different networks, each consisting of various convolution and deconvolution layers, are employed to predict the three scales. With this architecture, the size, shape and location of the snow in the image can be estimated. Using this information, an accurate snow information map (SIM) can be generated.
For transparency awareness, an image inpainting technique is adopted, which tries to fill the holes that snow creates in the image. JSTASR uses a modified partial convolution to inpaint the irregular or broken areas. This transparency-aware architecture is mainly inspired by U-Net. In the encoder, unlike traditional partial convolution, a multi-scale architecture is employed. In the decoder, unlike the original partial convolution, both multi-scale deconvolution and up-sampling operations are adopted to prevent the recovered features from blurring. The authors also add a size-aware loss function to enhance the performance of the generator network.
  3. Veiling Effect Removal: In this module, the authors address the problem of estimating accurate transmission and atmospheric light values. As mentioned earlier, they add a differentiable dark channel prior (DDCP) layer with a patch map for this purpose. The dark channel prior has known limitations, such as color degradation in white and bright scenes. To overcome these issues, they introduce a patch-map-based differentiable dark channel prior layer that improves the performance of this method, and embed it into the proposed snow removal process to achieve fully end-to-end learning. For this architecture, they use VGG-16 as the backbone in which each convolution is replaced by multi-level pooling.
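
The veiling model from step 1 and its removal in step 3 can be illustrated with a small NumPy sketch. It uses the classical, non-differentiable dark channel prior (minimum over the color channels, then over a local patch) rather than the authors' patch-map-based differentiable layer, and it treats the atmospheric light A as known. The function names, the 3x3 patch, the weight omega and the clamp t_min are assumptions made for illustration; they are not taken from the paper or its code.

```python
import numpy as np


def dark_channel(img, patch=3):
    """Classical dark channel: min over color channels, then over a patch."""
    h, w, _ = img.shape
    per_pixel_min = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(per_pixel_min, pad, mode="edge")
    out = np.empty((h, w), dtype=per_pixel_min.dtype)
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + patch, x:x + patch].min()
    return out


def add_veil(K, T, A):
    """Forward model: I(x) = K(x) T(x) + A(x) (1 - T(x))."""
    T3 = T[..., None]  # broadcast the transmission over color channels
    return K * T3 + np.asarray(A) * (1.0 - T3)


def estimate_transmission(I, A, omega=0.95, patch=3):
    """Dark-channel estimate T = 1 - omega * dark_channel(I / A).

    omega < 1 keeps a small amount of veil; its value is an assumption.
    """
    normalised = I / np.asarray(A)
    return 1.0 - omega * dark_channel(normalised, patch)


def remove_veil(I, T, A, t_min=0.1):
    """Invert the model: K = (I - A (1 - T)) / T, with T clamped at t_min."""
    T3 = np.maximum(T, t_min)[..., None]
    return (I - np.asarray(A) * (1.0 - T3)) / T3
```

Given a veiled image I and atmospheric light A, calling estimate_transmission(I, A) followed by remove_veil(I, T, A) returns an estimate of K, the veil-free but still snowy image that the snow removal network then has to process. Clamping T avoids dividing by values close to zero in heavily veiled regions.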

Dataset

A large-scale snow dataset already exists, but it does not include the veiling effect, so the authors curated a new dataset named Snow Removal in Realistic Scenario (SRRS), which also includes the veiling effect of snow. It consists of 15K artificially synthesized images and 1000 real-life snow images downloaded from the internet. To generate the dataset, they use the popular haze benchmark dataset RESIDE to synthesize images with the veiling effect. Then, for each snow image, various types of snow are synthesized with Photoshop and the corresponding snow information (i.e., transparency, size, and location) is labeled. They randomly pick 2500 images for training and 1000 images for testing, the latter named Test A. They also create Test B, which contains the same images as Test A but without the veiling effect of snow.

Training

The network mainly consists of two sub-networks: JSTASR and veiling effect removal. First, the veiling effect removal model is trained on the 2500 hazy images based on the RESIDE dataset, and then the size-aware snow identifier is trained to predict the snow information. These two pre-training steps are carried out with the veiling effect removal network held fixed. After pre-training, the two sub-networks are trained together in a fine-tuning stage.

Hyperparameters

We used the same hyper-parameters as given in the authors' GitHub repository to reproduce the results. The primary ones are as follows: the learning rate is set to 0.0001 and the Adam optimizer is used for all experiments. The value of γ is 0.1, and in every epoch 15% of the training data is held out as the validation set.
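
The settings above can be collected into a small configuration, together with a sketch of the per-epoch 15% hold-out. This is only an illustration using the Python standard library: the names CONFIG and split_for_epoch, the seeding scheme, and the choice to reshuffle in every epoch are our assumptions and are not taken from the authors' code.

```python
import random

# Values as reported above; the key names are hypothetical.
CONFIG = {
    "learning_rate": 1e-4,
    "optimizer": "adam",
    "gamma": 0.1,
    "val_fraction": 0.15,
}


def split_for_epoch(n_samples, epoch, val_fraction=CONFIG["val_fraction"], seed=0):
    """Return (train_indices, val_indices) for one epoch.

    The indices are reshuffled with an epoch-dependent seed (an assumption),
    and the first val_fraction of them is held out for validation.
    """
    indices = list(range(n_samples))
    random.Random(seed + epoch).shuffle(indices)
    n_val = int(round(n_samples * val_fraction))
    return indices[n_val:], indices[:n_val]
```

With the 2500 training images of SRRS, this holds out 375 images for validation and leaves 2125 for training in each epoch.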

Experimental Setup

For all experiments on the SRRS dataset, we evaluated the models on an NVIDIA GeForce RTX 2080 Ti GPU. Testing took a few minutes on our GPU with the hyper-parameters mentioned above. All experiments were conducted with the code that the authors have publicly released in their GitHub repository.

Results

We reproduced JSTASR for the snow removal task on the SRRS dataset specifically. We used the pre-trained model open-sourced by the authors to evaluate JSTASR. The output images we obtained are similar to those reported in the paper.

Discussion

Based on the results obtained, we have sufficient evidence to validate the claims the authors make in the JSTASR paper. As reported in the paper, the method significantly outperforms competing snow removal methods, and the authors also introduce a new realistic snow removal dataset that includes a realistic veiling effect. One difficulty could be the creation of such datasets: real-life snow images are hard to obtain, and images created in Photoshop may not be realistic. Furthermore, the method effectively prevents the recovered images from blurring because the snow location is taken into account during the snow removal procedure. According to the paper, it outperforms state-of-the-art methods in both run time and the quality of the recovered images.

Conclusion

With JSTASR, the authors have successfully created a novel architecture and dataset for detecting and removing snow while recovering the underlying image. The method embeds joint size and transparency-aware filters and veiling effect removal. A differentiable DCP is proposed to remove the veiling effect. To optimize the recovered results for snow particles of different sizes, size-aware loss functions and a snow-free discriminator are also designed and added to the architecture. The experimental results show that the proposed method achieves better performance than other methods, even in complicated snow scenarios.