NVIDIA Expands Its Deep Learning Inference Capabilities for Hyperscale Datacenters

by Angela Guess

According to a new press release, “NVIDIA today announced a series of new technologies and partnerships that expand its potential inference market to 30 million hyperscale servers worldwide, while dramatically lowering the cost of delivering deep learning-powered services. Speaking at the opening keynote of GTC 2018, NVIDIA founder and CEO Jensen Huang described how GPU acceleration for deep learning inference is gaining traction, with new support for capabilities such as speech recognition, natural language processing, recommender systems, and image recognition — in datacenters and automotive applications, as well as in embedded devices like robots and drones. NVIDIA announced a new version of its TensorRT inference software, and the integration of TensorRT into Google’s popular TensorFlow framework. NVIDIA also announced that Kaldi, the most popular framework for speech recognition, is now optimized for GPUs. NVIDIA’s close collaboration with partners such as Amazon, Facebook and Microsoft make it easier for developers to take advantage of GPU acceleration using ONNX and WinML.”

The release goes on, “NVIDIA unveiled TensorRT 4 software to accelerate deep learning inference across a broad range of applications. TensorRT offers highly accurate INT8 and FP16 network execution, which can cut datacenter costs by up to 70 percent. TensorRT 4 can be used to rapidly optimize, validate and deploy trained neural networks in hyperscale datacenters, embedded and automotive GPU platforms. The software delivers up to 190x faster deep learning inference compared with CPUs for common applications such as computer vision, neural machine translation, automatic speech recognition, speech synthesis and recommendation systems.”
