2016. AI boffins picked a hell of a year to train a neural net by making it watch the news

Date: November 23, 2016

  • The Watch, Listen, Attend and Spell (WLAS) network has a lower word accuracy rate than LipNet, at 46.8 per cent compared to 93.4 per cent.
  • The WLAS network analysed the speech movements in the LRS dataset, which contains over 100,000 natural sentences and a vocabulary of 17,428 words.
  • LipNet, the lipreading network developed by researchers at the University of Oxford and DeepMind, can now lipread from TV shows better than professional lipreaders.
  • Like LipNet, WLAS still requires a lot of training, and only a small part of the LRS dataset is used to test the network.
  • When audio and lipreading are combined, the word error rate goes down to 50.8 per cent.
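
The points above quote both word accuracy and word error rate without saying how either is measured. As a rough sketch, assuming the usual edit-distance definition of word error rate (substitutions, deletions and insertions divided by the number of reference words), the calculation might look like this. The function name and the sample sentences are made up for illustration and do not come from the LRS dataset or from either network's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes the common definition WER = (S + D + I) / N; the article
    does not state which definition the researchers used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edits needed to turn the reference words seen so far
    # into the first j hypothesis words (classic Levenshtein row update).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair: one of five reference words is missed.
wer = word_error_rate("the news at ten tonight", "the news at tonight")
print(f"WER: {wer:.1%}")  # WER: 20.0%
```

Under this definition, word accuracy is often reported as one minus the word error rate, although the two only line up exactly when the hypothesis contains no inserted words, so figures quoted as accuracy and as error rate are not always directly interchangeable.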