Deep Image: Scaling up Image Recognition

Baidu's custom-built supercomputer, Minwa, recently scanned more than 1 million images and taught itself to sort them into about 1,000 categories with 95.42% top-5 accuracy, exceeding human performance. By comparison, Google's system scored 95.2% and Microsoft's 95.06%.

A recent paper, "Deep Image: Scaling up Image Recognition," details the use of a highly optimized parallel algorithm, new data partitioning and communication methods, larger deep neural network models, innovative data augmentation approaches, and multi-scale high-resolution images.

See paper here.

Abstract

We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning. The key components are a custom-built supercomputer dedicated to deep learning, a highly optimized parallel algorithm using new strategies for data partitioning and communication, larger deep neural network models, novel data augmentation approaches, and usage of multi-scale high-resolution images. On one of the most challenging computer vision benchmarks, the ImageNet classification challenge, our system has achieved the best result to date, with a top-5 error rate of 4.58% and exceeding the human recognition performance, a relative 31% improvement over the ILSVRC 2014 winner.
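
To make the reported figures concrete, the following is a minimal sketch of how a top-5 error rate is typically computed and how it relates to the accuracy and relative-improvement numbers quoted above. It is not the paper's evaluation code. The function names and the toy scores are hypothetical, and the baseline error is not taken from the text: it is back-calculated from the stated 4.58% error and 31% relative improvement.

```python
def top5_error(scores, labels):
    """Fraction of examples whose true label is NOT among the 5 highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the five classes with the highest scores for this example.
        top5 = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:5]
        if label not in top5:
            misses += 1
    return misses / len(labels)


def accuracy_from_error(error_pct):
    """Top-5 accuracy (in percent) is the complement of top-5 error (in percent)."""
    return 100.0 - error_pct


def relative_improvement(baseline_error, new_error):
    """Relative reduction in error with respect to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Toy data (hypothetical): 2 examples, 7 classes.
scores = [
    [0.10, 0.20, 0.05, 0.30, 0.15, 0.12, 0.08],  # true class 3 ranks first: a hit
    [0.30, 0.25, 0.20, 0.10, 0.08, 0.05, 0.02],  # true class 6 ranks last: a miss
]
labels = [3, 6]
toy_error = top5_error(scores, labels)  # one miss out of two examples

# Figures from the text: 4.58% top-5 error, a relative 31% improvement.
deep_image_error = 4.58
accuracy = accuracy_from_error(deep_image_error)

# Assumption: the baseline is inferred from the two stated numbers, not quoted.
implied_baseline = deep_image_error / (1.0 - 0.31)
check = relative_improvement(implied_baseline, deep_image_error)

print(f"toy top-5 error: {toy_error:.2f}")
print(f"top-5 accuracy: {accuracy:.2f}%")
print(f"implied baseline top-5 error: {implied_baseline:.2f}%")
print(f"relative improvement: {check:.0%}")
```

The 95.42% accuracy in the opening paragraph and the 4.58% top-5 error in the abstract are the same result expressed two ways. The implied baseline of roughly 6.6% is only as precise as the rounded 31% figure allows.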