Demo: Errors in windowed total extraction

Created on December 3 · Last edited on December 3

Overview

After narrowing down good parameter ranges in the last big sweep, I wanted to do a full-size run with the best known settings, one that rendered its extraction errors to PDF.

I used: training_len: 8000, window_len: 30, use_string: 1, use_page: 1, use_geom: 1, use_amount: 1, vocab_size: 1000, vocab_embed_size: 64, epochs: 100, steps_per_epoch: 20

This differs from previous settings (before The Big Sweep) in two important ways:

  • More training, with epochs=100 and steps_per_epoch=20.
  • A larger vocabulary and embedding than before (vocab_size=1000, vocab_embed_size=64), since I suspect part of the problem is that the model needs to learn to "read" better.
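For reference, the settings above can be collected into a plain config dict. This is a minimal sketch using the parameter names from the run string; how the config is actually consumed by the training code is not shown here.

```python
# Settings for the full-size run, keyed by the parameter names in the run string.
config = {
    "training_len": 8000,     # number of training examples
    "window_len": 30,         # token window size
    "use_string": 1,          # string features enabled
    "use_page": 1,            # page features enabled
    "use_geom": 1,            # geometry features enabled
    "use_amount": 1,          # amount features enabled
    "vocab_size": 1000,       # bumped up from previous runs
    "vocab_embed_size": 64,   # bumped up from previous runs
    "epochs": 100,            # more training than before
    "steps_per_epoch": 20,
}

# Total number of optimizer steps implied by these settings.
total_steps = config["epochs"] * config["steps_per_epoch"]
```

With epochs=100 and steps_per_epoch=20 this comes to 2000 training steps, substantially more than the pre-sweep runs.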

Full config here. The run completed without issues and reached 96% doc_val_acc.




Run: PDF len:8000 win:30 str:1 page:1 geom:1 amt:1 voc:1000 emb:64 steps:20