KDD2016 paper 801

Title: Robust Large-Scale Machine Learning in the Cloud

Steffen Rendle*, Google, Inc.
Dennis Fetterly, Google, Inc.
Eugene Shekita, Google, Inc.
Bor-Yiing Su, Google, Inc.

The convergence behavior of many distributed machine learning (ML) algorithms can be sensitive to the number of machines being used or to changes in the computing environment. As a result, scaling to a large number of machines can be challenging. In this paper, we describe a new scalable coordinate descent (SCD) algorithm for generalized linear models whose convergence behavior is always the same, regardless of how much SCD is scaled out and regardless of the computing environment. This makes SCD highly robust and enables it to scale to massive datasets on low-cost commodity servers. Experimental results on a real advertising dataset in Google are used to demonstrate SCD’s cost effectiveness and scalability. Using Google’s internal cloud, we show that SCD can provide near linear scaling using thousands of cores for 1 trillion training examples on a petabyte of compressed data. This represents 10,000x more training examples than the ‘large-scale’ Netflix prize dataset. We also show that SCD can learn a model for 20 billion training examples in two hours for about $10.
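
For readers unfamiliar with the underlying technique, the following is a minimal single-machine sketch of plain cyclic coordinate descent for L2-regularized least squares. It is not the SCD algorithm from the paper, whose partitioning and distribution scheme the abstract does not describe; the function name `coordinate_descent` and the parameters `l2` and `epochs` are illustrative assumptions.

```python
def coordinate_descent(X, y, l2=0.0, epochs=50):
    """Cyclic coordinate descent for 0.5*||y - Xw||^2 + 0.5*l2*||w||^2.

    Illustrative single-machine sketch only; not the paper's SCD.
    X is a list of rows, y a list of targets.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw, and w starts at zero
    for _ in range(epochs):
        for j in range(d):
            col = [X[i][j] for i in range(n)]
            denom = sum(v * v for v in col) + l2
            if denom == 0.0:
                continue  # all-zero feature with no penalty: nothing to update
            # Closed-form minimizer along coordinate j, other weights held fixed
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n))
            new_wj = rho / denom
            delta = new_wj - w[j]
            # Keep the residual consistent with the updated weight
            for i in range(n):
                residual[i] -= col[i] * delta
            w[j] = new_wj
    return w


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [1.0, 2.0, 3.0]
    print(coordinate_descent(X, y))
```

On this toy input the weights converge toward [1.0, 2.0], the exact least-squares solution. Each update touches one coordinate at a time in a fixed order, so the sequence of iterates is fully determined by the data and the update order.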

More on http://www.kdd.org/kdd2016/

KDD2016 Conference will be recorded and published on http://videolectures.net/