Machine Learning on Google Cloud Platform – Simplified

The previous blog, “Data Science, Machine learning, Business Intelligence – Demystified”, discussed the basic conceptual foundation of machine learning in the context of data science. This blog focuses on the tools, services, and software platforms available for data analysis with machine learning on Google Cloud Platform (GCP).

Today, GCP offers a wide variety of machine learning models, algorithms, and services. Although such variety is great, it may seem confusing if you cannot map your actual use case and business requirements to what GCP has made available for each kind of use case.

Let’s analyze the above statement by considering a couple of examples.

In all use cases, we have data that is referred to as input-data.

Let’s assume we need labels for the vehicles (car, bus, etc.) in a photo, and we may or may not already have labeled photos.

With GCP, life can be easy here: we can use one of several APIs (application programming interfaces, or endpoints) and feed our input-data from a GCP storage bucket (Cloud Storage) into the API endpoint (for photos we typically use the GCP Vision API). The output will be the labels, generated for us by the GCP machine learning algorithm. The nice thing here is that we don’t need ML experience, as we don’t need to select an algorithm, train a model, etc. It simply acts like a black box that does the job for us in this use case. What we really need to do is understand the output data, along with the likelihood of correctness, for that specific use case scenario.
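As a rough illustration of that black-box flow, the sketch below builds the JSON body for a label detection request against the Vision API REST endpoint and reads labels out of a response. The bucket name, file name, and sample response are made up for illustration, and authentication and the actual HTTP call are left out.

```python
import json

# REST endpoint for Vision API image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(gcs_uri, max_results=10):
    """Build the JSON body for a LABEL_DETECTION request on one image in Cloud Storage."""
    return {
        "requests": [
            {
                "image": {"source": {"gcsImageUri": gcs_uri}},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def extract_labels(response):
    """Return (description, score) pairs from an images:annotate response."""
    labels = []
    for result in response.get("responses", []):
        for annotation in result.get("labelAnnotations", []):
            labels.append((annotation["description"], annotation["score"]))
    return labels


# Hypothetical bucket and object name, for illustration only.
request_body = build_label_request("gs://my-example-bucket/street.jpg")
print(json.dumps(request_body, indent=2))

# Hand-written sample shaped like a Vision API response (not real output).
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Car", "score": 0.60},
                {"description": "Bus", "score": 0.91},
            ]
        }
    ]
}
print(extract_labels(sample_response))
```

In a real application the body would be POSTed to the endpoint with valid credentials, or the official client library would be used instead of raw JSON.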

Based on that, if a photo is labeled as 60% likely to be a car, is that acceptable for the business use case or expectations? For some use cases, like social media, the acceptable correctness might be above ~75%. In contrast, for medical applications the likelihood must be much higher, e.g. 95% or more. So the question is: how can we obtain a higher percentage? Simply put, we may need to create our own model. In other words, each use case needs different ML services and capabilities. This blog summarizes them in a simplified way, as illustrated in the figure below.
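To make the threshold idea concrete, here is a minimal sketch that keeps only the labels whose score meets a per-use-case cutoff. The cutoffs reuse the illustrative 75% and 95% figures above, and the label scores are invented sample values.

```python
# Illustrative cutoffs taken from the discussion above; tune them per use case.
THRESHOLDS = {"social_media": 0.75, "medical": 0.95}


def accepted_labels(labels, use_case):
    """Keep only the label names whose score meets the cutoff for the use case."""
    threshold = THRESHOLDS[use_case]
    return [name for name, score in labels if score >= threshold]


# Invented (label, score) pairs, shaped like the output of a labeling API.
labels = [("car", 0.60), ("bus", 0.80), ("truck", 0.97)]

print(accepted_labels(labels, "social_media"))
print(accepted_labels(labels, "medical"))
```

If too few labels survive the cutoff that the business requires, that is a signal that a pre-trained API may not be enough for the use case.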

At the top of the figure above, we have the ML software platforms offered by GCP, which we can consider as SaaS ML.

ML APIs can be thought of as plug-and-play ML services, in which you provide (ingest) your data and receive the results.

GCP ML APIs allow developers to extract actionable insights from video, photos, text, etc. without requiring any machine learning knowledge or skills, by taking advantage of a massive library, labels, and pre-trained models. For example, with the Cloud Vision API, “The API quickly classifies images into thousands of categories (such as “sailboat” or “Eiffel Tower”), detects individual objects and faces within images, and finds and reads printed words contained within images”.

As highlighted earlier, in some cases the required accuracy level, or some special cases, call for a more customized model or labels. This is where AutoML (such as AutoML Vision) helps, by building and training custom ML models with minimal ML expertise to meet domain-specific business needs.

According to GCP “Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs, by leveraging Google’s state-of-the-art transfer learning, and Neural Architecture Search technology”

In addition, with BigQuery ML, GCP is democratizing machine learning by enabling data analysts to use machine learning through existing SQL-based business intelligence tools and skills. At the time of writing, it supports the following types of ML models: linear regression, binary logistic regression, and multiclass logistic regression for classification. Multiclass models can be used to predict more than two classes, such as whether an input is “low-value”, “medium-value”, or “high-value”.
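To give a feel for the SQL-based workflow, the sketch below assembles a BigQuery ML `CREATE MODEL` statement as a string. The dataset, model, table, and column names are hypothetical, and the statement is only printed here; running it would require the BigQuery console or a client.

```python
def create_model_sql(model_name, model_type, label_column, source_table):
    """Assemble a BigQuery ML CREATE MODEL statement for a regression model.

    model_type is 'linear_reg' for linear regression or 'logistic_reg' for
    binary and multiclass logistic regression.
    """
    if model_type not in ("linear_reg", "logistic_reg"):
        raise ValueError(f"unsupported model type: {model_type}")
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical dataset, model, table, and label column names.
sql = create_model_sql(
    model_name="sales.customer_value_model",
    model_type="logistic_reg",
    label_column="value_class",
    source_table="sales.customer_features",
)
print(sql)
```

The names are interpolated directly into the string, so this sketch is only suitable for trusted, hard-coded identifiers.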

The other ML service category offered by GCP is Cloud ML Engine. First of all, Cloud ML Engine is not a SaaS platform that you just upload your data to and start using, like the Google ML APIs or AutoML. Instead, Cloud ML Engine can be thought of as a platform as a service (PaaS). How?

Once we start considering custom ML models, we are no longer concerned only with accessing an endpoint/API; we need to think about the entire ML model workflow as a process. Technically, we need to consider at least the following steps:

  • Complex data preparation: with custom ML models, data almost always requires more cleaning and transformation steps than you may need with the SaaS ML services. Common examples of data preparation: removing outliers, features, or columns, and transforming data types or formats.
  • Designing the model (ML model selection): this requires framing the targeted problem or goal, to decide which model should make the predictions.
    • Do we have labeled data, or known truth data? >> Supervised ML. Most common use cases fall under supervised machine learning problems, which could be regression (predicting a value) or classification (predicting the likelihood of membership in a class).
    • Are we trying to find correlations or relationships in the data that we are not aware of or do not understand? >> Unsupervised ML.
  • Code, train, evaluate, and tune the model: here we need to choose a suitable ML framework for the selected algorithm, which could be TensorFlow or the higher-level Keras API. We can also pick others, such as scikit-learn, but not every algorithm is available in scikit-learn.
  • Deploy and monitor the model: this stage involves:
    • Deploying your trained model.
    • Sending prediction requests to your model (online or batch prediction).
    • Monitoring the predictions and model versions.
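To show what the "code, train, evaluate" step means at its smallest, here is a toy supervised regression in plain Python. It is not TensorFlow, Keras, or scikit-learn; it swaps in a one-variable least-squares fit on synthetic data, purely to illustrate training on one split and evaluating on another.

```python
import random


def fit_line(xs, ys):
    """Train: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_squared_error(xs, ys, slope, intercept):
    """Evaluate: average squared prediction error."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic labeled data: y is roughly 3x + 2 with some noise.
rng = random.Random(0)
xs = [float(i) for i in range(100)]
ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]

# Hold out 20% of the examples for evaluation.
indices = list(range(100))
rng.shuffle(indices)
train_idx, test_idx = indices[:80], indices[80:]

slope, intercept = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
test_mse = mean_squared_error(
    [xs[i] for i in test_idx], [ys[i] for i in test_idx], slope, intercept
)
print(f"slope={slope:.3f} intercept={intercept:.3f} test MSE={test_mse:.3f}")
```

A real custom model replaces `fit_line` with a framework of choice and adds tuning, deployment, and monitoring around the same train-then-evaluate core.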

It’s obvious that taking this path is more complex, as it involves more steps and requires ML expertise, compared to the ML APIs/AutoML. This is simply because there is a big difference in the level of complexity (accessing an ML API endpoint vs. building, tuning, deploying, and maintaining a complete custom model).

Therefore, the business situation should warrant this level of complexity.

The role of GCP ML Engine here is like a PaaS: GCP spins up the underlying environment required to run the training and production models across its cloud. However, ML Engine by itself does not do ML; you, as the ML specialist or data scientist, need to code that.

Practically, creating a graph in TensorFlow to train on ML Engine is key to developing an application. “But what’s the point of a powerful prediction model that only a data scientist can use?” In IoT/big data use cases with predictive analytics, the goal is to obtain real-time predictions that can feed into a dashboard or other application layers to perform other functions. To do so, the model needs to be accessible from other cloud services or applications, such as GCP Cloud Functions written in Node.js, while the model itself could be built and written in Python.

In this case, GCP ML Engine offers the ability to deploy the model as a RESTful API that provides predictions at scale and makes the model available to all sorts of clients, whether we are dealing with a single user or millions of users. This applies to both online and offline (batch) predictions.
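Because the deployed model is just a REST resource, any client that can send JSON can call it. The sketch below builds the online prediction URL and request body for an ML Engine model. The project, model, and version names and the input features are hypothetical, and authentication and the HTTP call itself are omitted.

```python
import json


def predict_url(project, model, version=None):
    """Build the online prediction URL for a deployed model (default or named version)."""
    name = f"projects/{project}/models/{model}"
    if version:
        name += f"/versions/{version}"
    return f"https://ml.googleapis.com/v1/{name}:predict"


def predict_body(instances):
    """Wrap the input records in the 'instances' envelope expected by the predict call."""
    return json.dumps({"instances": instances})


# Hypothetical project, model, and version names and made-up input features.
url = predict_url("my-project", "sensor_model", "v1")
body = predict_body([{"temperature": 21.5, "humidity": 0.43}])

print(url)
print(body)
```

A Node.js Cloud Function, a dashboard backend, or a mobile app would send the same JSON body to the same URL, which is what makes the model usable beyond the data scientist who built it.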

The figure below, from GCP, illustrates where Cloud ML Engine provides managed services and APIs as part of the ML workflow (the blue-filled boxes).

According to GCP, ML Engine offers key advantages when running TensorFlow:

  • Running machine learning tasks in a serverless environment
  • Facilitating hyperparameter tuning
  • Hosting models as a RESTful API accessible from heterogeneous clients (not only Python)

As highlighted earlier in this blog, when building a custom model, data exploration and preparation (including preprocessing) is not as simple as uploading data to an ML API. Also, during the design, build, and evaluation of the custom model, there is always a need for some hyperparameter tuning, and feature engineering tasks are required to enhance the model. Jupyter notebooks are a great, proven tool for data preparation, because they are easy to share with subject matter experts: they include text annotations and visualizations in addition to the runnable code. In GCP, Cloud Datalab can run gcloud commands directly from its UI and run Jupyter notebooks in a managed environment. Cloud Datalab comes with ML Workbench, a library that simplifies interactions with TensorFlow.

The following are some of the tasks you might want to perform (according to GCP):

  • Filter out columns that won’t be an available input at prediction time. Example: agent ID
  • Make sure that your labels are correctly transformed and available. Example: delete empty values
  • Eliminate exceptions, or find more examples if they are not exceptions but recurrent events. Example: Non-existing category text
  • Split your data for training, evaluating, and testing. Example: 80%, 10%, 10%
  • Transform inputs and labels into usable features. Example: resolution time = (closing time – creation time)
  • Remove duplicated rows
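The tasks above can be sketched in a few lines of plain Python, the kind of code that might live in a notebook cell. The support-ticket rows, field names, and timestamps below are invented for illustration; a real project would more likely use a dataframe library.

```python
import random
from datetime import datetime


def prepare(rows):
    """Drop empty labels, duplicates, and the agent ID; derive the resolution time."""
    seen = set()
    prepared = []
    for row in rows:
        if not row["category"]:  # label is missing or empty
            continue
        key = tuple(sorted(row.items()))  # identical rows share the same key
        if key in seen:
            continue
        seen.add(key)
        # Remove the column that will not be available at prediction time.
        record = {k: v for k, v in row.items() if k != "agent_id"}
        created = datetime.fromisoformat(record.pop("created"))
        closed = datetime.fromisoformat(record.pop("closed"))
        record["resolution_hours"] = (closed - created).total_seconds() / 3600
        prepared.append(record)
    return prepared


def split(rows, seed=0):
    """Shuffle, then split into 80% training, 10% evaluation, and the rest for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_eval = len(shuffled) // 10
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_eval],
        shuffled[n_train + n_eval:],
    )


# Invented support-ticket rows: 20 valid, one exact duplicate, one with an empty label.
raw = [
    {
        "agent_id": f"a{i}",
        "created": "2019-01-01T08:00:00",
        "closed": f"2019-01-01T{9 + i % 10:02d}:00:00",
        "category": "billing" if i % 2 else "technical",
    }
    for i in range(20)
]
raw.append(dict(raw[0]))
raw.append({"agent_id": "a99", "created": "2019-01-01T08:00:00",
            "closed": "2019-01-01T09:00:00", "category": ""})

clean = prepare(raw)
train, evaluation, test = split(clean)
print(len(clean), len(train), len(evaluation), len(test))
```

Fixing the shuffle seed keeps the split reproducible, which makes later model comparisons fair.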

When it comes to data exploration and preparation at a larger scale and with complex extract, transform, and load (ETL) functions, you can consider GCP Cloud Dataprep by Trifacta, an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data that is not ready for immediate analysis. It can automatically detect schemas, data types, possible joins, and anomalies such as missing values, outliers, and duplicates, so you get to skip the time-consuming work of profiling your data and go right to the data analysis.

Dataprep uses Apache Beam behind the scenes, but it saves a lot of boilerplate code with its simple GUI. The Apache Beam tasks can run on Cloud Dataflow, which can help you develop and execute a wide range of data processing patterns, including ETL, batch computation, and streaming computation.

When using Cloud Dataprep for exploring, cleaning, and preparing data, we still need Cloud Datalab for building the model, splitting the data for training, evaluating, and testing, and running the model. Although we can also use it for manual hyperparameter tuning, according to GCP “The preferred approach is to tune hyperparameters using ML Engine. ML Engine tunes hyperparameters automatically based on a declarative YAML setup. The system jumps quickly to the best parameter combinations and stops before going through all the training steps, saving time, compute power, and money”.
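As a rough idea of what such a declarative setup contains, the sketch below expresses a hyperparameter tuning specification as a Python dictionary and prints it as JSON, which YAML parsers generally accept as well. The field names follow the ML Engine training job specification as I understand it; the metric tag, trial counts, and the two tuned parameters are assumptions chosen for illustration.

```python
import json

# Hypothetical tuning setup: the metric tag, trial counts, parameter names,
# and value ranges are illustrative assumptions, not recommended settings.
tuning_config = {
    "trainingInput": {
        "hyperparameters": {
            "goal": "MAXIMIZE",
            "hyperparameterMetricTag": "accuracy",
            "maxTrials": 20,
            "maxParallelTrials": 4,
            "enableTrialEarlyStopping": True,
            "params": [
                {
                    "parameterName": "learning_rate",
                    "type": "DOUBLE",
                    "minValue": 0.0001,
                    "maxValue": 0.1,
                    "scaleType": "UNIT_LOG_SCALE",
                },
                {
                    "parameterName": "hidden_units",
                    "type": "INTEGER",
                    "minValue": 16,
                    "maxValue": 256,
                    "scaleType": "UNIT_LINEAR_SCALE",
                },
            ],
        }
    }
}

print(json.dumps(tuning_config, indent=2))
```

The training code would then read `learning_rate` and `hidden_units` as command-line arguments and report the `accuracy` metric, so the service can compare trials.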

Marwan Al-shawi – CCDE No. 20130066, Google Cloud Certified Architect, AWS Certified Solutions Architect, Cisco Press author (author of the top Cisco certification design books “CCDE Study Guide” and the upcoming “CCDP Arch 4th Edition”). He is an experienced technical architect. Marwan has been in the networking industry for more than 12 years and has been involved in architecting, designing, and implementing various large-scale networks, some of which are global service provider-grade networks. Marwan holds a Master of Science degree in internetworking from the University of Technology, Sydney. Marwan enjoys helping and assessing others; he was selected as a Cisco Designated VIP by the Cisco Support Community (CSC) (official Cisco Systems forums) in 2012, and by the Solutions and Architectures subcommunity in 2014. In addition, Marwan was selected as a member of the Cisco Champions program in 2015 and 2016.