Big Data Analytics & Machine Learning @ Google Cloud Next

IMG_5272

(T) This week at the Moscone Center in San Francisco, Google held its annual cloud conference, Next, which focuses on the Google Cloud Platform (GCP). Google made many announcements covering GCP’s infrastructure, security, analytics, databases, machine learning, development tools, and enterprise collaboration and productivity, as well as how GCP works with Android and Chrome devices. Very impressive!

Below are the key announcements on machine learning and big data analytics, followed by a selection of my favorite sessions.

New announcements from Google about machine learning for GCP

Cloud Machine Learning Engine (GA) – Cloud ML Engine, now generally available, is for organizations that want to train and deploy their own models into production in the cloud.

Cloud Video Intelligence API (Private Beta) – A first of its kind, Cloud Video Intelligence API lets developers easily search and discover video content by providing information about entities (nouns such as “dog,” “flower,” or “human,” or verbs such as “run,” “swim,” or “fly”) inside video content.

Cloud Vision API (GA) – Cloud Vision API is now generally available and offers new capabilities for enterprises and partners to classify a more diverse set of images. The API can now recognize millions of entities from Google’s Knowledge Graph and offers enhanced OCR capabilities that can extract text from scans of text-heavy documents such as legal contracts, research papers, or books.
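To make the OCR and entity recognition capabilities more concrete, here is a minimal sketch of what a request body for the Vision API's public REST endpoint (images:annotate) could look like. The helper name build_ocr_request and the sample bytes are hypothetical, the feature types are those of the public REST API as I understand it, and the sketch only builds the JSON body: authentication and the actual HTTP call are left out.

```python
import base64
import json

# Public REST endpoint of the Vision API (shown for reference only;
# this sketch never sends a request).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes, max_labels: int = 5) -> dict:
    """Build an images:annotate body asking for dense-text OCR and labels.

    Hypothetical helper for illustration; not part of any Google library.
    """
    return {
        "requests": [
            {
                # Image content is sent inline as base64-encoded bytes.
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [
                    # OCR tuned for text-heavy documents (contracts, papers).
                    {"type": "DOCUMENT_TEXT_DETECTION"},
                    # Entity labels for the image content.
                    {"type": "LABEL_DETECTION", "maxResults": max_labels},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real scanned page.
    body = build_ocr_request(b"placeholder image bytes")
    print(json.dumps(body, indent=2))
```

In a real application, the body would be posted to the endpoint with valid credentials, or the official client library would be used instead of hand-built JSON.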

Machine Learning Advanced Solutions Lab (ASL) – ASL provides dedicated facilities for Google’s customers to collaborate directly with Google’s machine-learning experts and apply ML to their most pressing challenges.

Cloud Jobs API – A powerful aid to job search and discovery, Cloud Jobs API now has new features such as Commute Search, which will return relevant jobs based on desired commute time and preferred mode of transportation.

Machine Learning Startup Competition – Google announced a Machine Learning Startup Competition in collaboration with venture capital firms Data Collective and Emergence Capital, and with additional support from a16z, Greylock Partners, GV, Kleiner Perkins Caufield & Byers, and Sequoia Capital.

New announcements from Google about analytics for GCP

BigQuery Data Transfer Service (Private Beta) – BigQuery Data Transfer Service makes it easy for users to quickly get value from all their Google-managed advertising datasets. With just a few clicks, marketing analysts can schedule data imports from Google AdWords, DoubleClick Campaign Manager, DoubleClick for Publishers, and YouTube Content and Channel Owner reports.

Cloud Dataprep (Private Beta) – Cloud Dataprep is a new managed data service, built in collaboration with Trifacta, that makes it faster and easier for BigQuery end-users to visually explore and prepare data for analysis without the need for dedicated data engineer resources.

New Commercial Datasets – Businesses often look for datasets (public or commercial) outside their organizational boundaries. Commercial datasets offered include financial market data from Xignite, residential real-estate valuations (historical and projected) from HouseCanary, predictions for when a house will go on sale from Remine, historical weather data from AccuWeather, and news archives from Dow Jones, all immediately ready for use in BigQuery (with more to come as new partners join the program).

Python for Google Cloud Dataflow in GA – Cloud Dataflow is a fully managed data processing service supporting both batch and stream execution of pipelines. Until recently, these benefits have been available solely to Java developers. Now there’s a Python SDK for Cloud Dataflow in GA.

Stackdriver Monitoring for Cloud Dataflow (Beta) – Google has integrated Cloud Dataflow with Stackdriver Monitoring so that you can access and analyze Cloud Dataflow job metrics and create alerts for specific Dataflow job conditions.

Google Cloud Datalab in GA – This interactive data science workflow tool makes it easy to do iterative model and data analysis in a Jupyter notebook-based environment using standard SQL, Python and shell commands.

Cloud Dataproc updates – Google’s fully managed service for running Apache Spark, Flink and Hadoop pipelines has several updates: support for restarting failed jobs (including automatic restart as needed), in beta; the ability to create single-node clusters for lightweight sandbox development, in beta; GPU support; and the cloud labels feature, which gives more flexibility in managing your Dataproc resources and is now GA.

Selection of a few sessions

Introduction to Google Cloud machine learning

Lifecycle of a machine learning model

Machine learning APIs (by example)

Spark and Hadoop on GCP

TensorFlow without a PhD

Conversational UX with API.AI

Video intelligence

IoT on GCP

Nest on GCP

Note: The picture above shows the parking lot of the Menlo Park public library.

Copyright © 2005-2017 by Serge-Paul Carrasco. All rights reserved.
Contact Us: asvinsider at gmail dot com.