
KMNIST

Learn to identify 10 characters of classical Japanese cursive script with the Kuzushiji-MNIST (kmnist) dataset. A benchmark created with Weights & Biases.

Introduction

This community benchmark is a fork of Kuzushiji-MNIST (kmnist) instrumented with Weights & Biases to track and visualize model training and facilitate collaborative deep learning. Given an image, correctly classify it as showing one of ten classical Japanese cursive characters.

This fresh reimagining of MNIST preserves technical simplicity, offers more headroom for creativity, and relies less on visual intuition (only a small number of experts can read Kuzushiji script, regardless of fluency in contemporary Japanese). It is an exciting challenge with the potential to make classical Japanese literature more accessible.

Runs in this project can be submitted to the Kuzushiji-MNIST (kmnist) public benchmark.

Getting started

First clone and initialize the starter code repository. This will also download the training data into a folder named "dataset" inside the repo.

> git clone https://github.com/wandb/kmnist
> cd kmnist/benchmarks && pip install -U -r requirements.txt
> ./init.sh [PROJECT NAME]

Now you can run scripts to train models. Your results will show up in the "Project workspace" tab to the left. For a quick experiment that trains on 10% of the data, you can run the following example script:

> python cnn_kmnist.py --quick_run

You can modify the training configuration and hyperparameters by editing the example scripts or via the command line. Run the following to see all available options:

> python cnn_kmnist.py -h

You can explore various settings to improve these baseline models or write your own scripts.
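If you prefer to write your own script, the sketch below shows one possible minimal setup. It assumes the standard Kuzushiji-MNIST .npz files (e.g. "kmnist-train-imgs.npz", with arrays stored under the key "arr_0") were downloaded into the "dataset" folder by init.sh, and uses a hypothetical project name; adjust both to match your setup.

> # Minimal custom training sketch, assuming the standard KMNIST .npz layout.
> import numpy as np
> import wandb
> from wandb.keras import WandbCallback  # import path may vary by wandb version
> from tensorflow import keras
>
> def load(name):
>     # Assumed file names and array key from the standard KMNIST release.
>     return np.load(f"dataset/{name}.npz")["arr_0"]
>
> x_train = load("kmnist-train-imgs")[..., None] / 255.0
> y_train = load("kmnist-train-labels")
> x_test = load("kmnist-test-imgs")[..., None] / 255.0
> y_test = load("kmnist-test-labels")
>
> wandb.init(project="kmnist")  # replace with the project name passed to init.sh
>
> model = keras.Sequential([
>     keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
>     keras.layers.MaxPooling2D(),
>     keras.layers.Flatten(),
>     keras.layers.Dense(128, activation="relu"),
>     keras.layers.Dense(10, activation="softmax"),
> ])
> model.compile(optimizer="adam",
>               loss="sparse_categorical_crossentropy",
>               metrics=["accuracy"])
>
> model.fit(x_train, y_train,
>           epochs=5,
>           validation_data=(x_test, y_test),
>           callbacks=[WandbCallback()])
>
> # Log the benchmark metric under the expected name (see Evaluation below).
> _, val_acc = model.evaluate(x_test, y_test, verbose=0)
> wandb.log({"kmnist_val_acc": val_acc})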

Evaluation

We will be using the "kmnist_val_acc" metric to evaluate all entries to the benchmark. This metric logs the validation accuracy computed by tensorflow.keras on the 10,000 images in the provided test set. If you're using a different framework, please log an equivalent metric under the same name; PRs implementing this in other frameworks are most appreciated.
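For example, if you train in PyTorch instead of tensorflow.keras, you could compute accuracy over the test set yourself and log it under the same key. This is only a sketch; "model", "test_loader", and "device" are assumed to already exist in your script.

> import torch
> import wandb
>
> def log_kmnist_val_acc(model, test_loader, device):
>     model.eval()
>     correct, total = 0, 0
>     with torch.no_grad():
>         for images, labels in test_loader:
>             images, labels = images.to(device), labels.to(device)
>             preds = model(images).argmax(dim=1)
>             correct += (preds == labels).sum().item()
>             total += labels.size(0)
>     # Same key the Keras baselines use, so the benchmark can compare runs.
>     wandb.log({"kmnist_val_acc": correct / total})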

How to submit your results

Once you're happy with your model's performance on the "kmnist_val_acc" metric, you can submit it for consideration to the public benchmark from the "Runs" table in the "Project workspace" tab. To submit a specific run, hover over the run's name, click on the three-dot menu icon that appears to the left of the name, and select "Submit to benchmark".

All submissions are reviewed by the benchmark administrators before acceptance.