Governance setting for cache refreshes from Azure Analysis Services

Built on the proven analytics engine in Microsoft SQL Server Analysis Services, Azure Analysis Services delivers enterprise-grade BI semantic modeling capabilities with the scale, flexibility, and management benefits of the cloud. The success of any modern data-driven organization requires that information is available at the fingertips of every business user, not just IT professionals and data scientists, to guide their day-to-day decisions. Azure Analysis Services helps you transform complex data into actionable insights. Users in your organization can then connect to your data models using tools like Excel, Power BI, and many others to create reports and perform ad-hoc interactive analysis.

Data visualization and consumption tools over Azure Analysis Services (Azure AS) sometimes store data caches to enhance report interactivity for users. The Power BI service, for example, caches dashboard tile data and report data for the initial load of Live Connect reports. However, in enterprise BI deployments where semantic models are reused throughout an organization, a large number of dashboards and reports can end up sourcing data from a single Azure AS model. This can cause an excessive number of cache queries to be submitted to AS and, in extreme cases, can overload the server. This is especially relevant to Azure AS (as opposed to on-premises SQL Server Analysis Services) because models are often co-located in the same region as the Power BI capacity for faster query response times, and so may not benefit much from caching in the first place.

ClientCacheRefreshPolicy governance setting

The new ClientCacheRefreshPolicy property allows IT or the AS practitioner to override this behavior at the Azure AS server level and disable automatic cache refreshes. All Power BI Live Connect reports and dashboards will observe the setting irrespective of dataset-level settings or which Power BI workspace they reside in. You can set this property using SQL Server Management Studio (SSMS) in the Server Properties dialog box. See the Analysis Services server properties page for more information on how to use this property.

Source: Azure

Azure Notification Hubs and Google’s Firebase Cloud Messaging Migration

When Google announced its migration from Google Cloud Messaging (GCM) to Firebase Cloud Messaging (FCM), push services like Azure Notification Hubs had to adjust how they send notifications to Android devices to accommodate the change.

We updated our service backend, then published updates to our API and SDKs as needed. With our implementation, we made the decision to maintain compatibility with existing GCM notification schemas to minimize customer impact. This means that we currently send notifications to Android devices using FCM in FCM Legacy Mode. Ultimately, we want to add true support for FCM, including the new features and payload format. That is a longer-term change and the current migration is focused on maintaining compatibility with existing applications and SDKs. You can use either the GCM or FCM libraries in your app (along with our SDK) and we make sure the notification is sent correctly.

Some customers recently received an email from Google warning about apps using a GCM endpoint for notifications. This was just a warning, and nothing is broken: your app’s Android notifications are still sent to Google, and Google still processes them. However, some customers who had specified the GCM endpoint explicitly in their service configuration were still using the deprecated endpoint. We had already identified this gap and were working on a fix when Google sent the email.

We replaced that deprecated endpoint and the fix is deployed.

If your app uses the GCM library, go ahead and follow Google’s instructions to upgrade to the FCM library in your app. Our SDK is compatible with either, so you won’t have to update anything on our side (as long as you’re up to date with our SDK version).

Now, this isn’t how we want things to stay, so over the next year you’ll see API and SDK updates from us implementing full support for FCM (and likely deprecating GCM support). In the meantime, here are some answers to common questions we’ve heard from customers:

Q: What do I need to do to be compatible by the cutoff date (Google’s current cutoff date is May 29th and may change)?

A: Nothing. We will maintain compatibility with the existing GCM notification schemas. Your GCM key will continue to work as normal, as will any GCM SDKs and libraries used by your application.
If/when you decide to upgrade to the FCM SDKs and libraries to take advantage of new features, your GCM key will still work. You may switch to using an FCM key if you wish, but make sure you add Firebase to your existing GCM project when creating the new Firebase project. This will guarantee backward compatibility with customers who are running older versions of the app that still use GCM SDKs and libraries.

If you are creating a new FCM project and not attaching to the existing GCM project, once you update Notification Hubs with the new FCM secret you will lose the ability to push notifications to your current app installations, since the new FCM key has no link to the old GCM project.

Q: Why am I getting this email about old GCM endpoints being used? What do I have to do?

A: Nothing. We have been migrating to the new endpoints and will be finished soon, so no change is necessary. Nothing is broken; our one missed endpoint simply caused warning messages from Google.

Q: How can I transition to the new FCM SDKs and libraries without breaking existing users?

A: Upgrade at any time. Google has not yet announced any deprecation of the existing GCM SDKs and libraries. To ensure you don't break push notifications to your existing users, make sure that when you create the new Firebase project, you associate it with your existing GCM project. This ensures that the new Firebase secrets will work for users running older versions of your app with GCM SDKs and libraries, as well as for new users of your app with FCM SDKs and libraries.

Q: When can I use new FCM features and schemas for my notifications?

A: Once we publish an update to our API and SDKs. Stay tuned – we expect to have something for you in the coming months.

Learn more about Azure Notification Hubs and get started today.
Source: Azure

5 tips to get more out of Azure Stream Analytics Visual Studio Tools

Azure Stream Analytics is an on-demand real-time analytics service to power intelligent action. Azure Stream Analytics tools for Visual Studio make it easier for you to develop, manage, and test Stream Analytics jobs. This year we released two major updates, in January and March, introducing several useful new features. In this blog, we’ll walk through some of these capabilities to help you improve your productivity.

Test partial scripts locally

In the March update, we enhanced the local testing capability. Besides running the whole script, you can now select part of the script and run it locally against a local file or live input stream. Click Run Locally or press F5/Ctrl+F5 to trigger the execution. Note that the selected portion of the larger script file must be a logically complete query to execute successfully.

Share inputs, outputs, and functions across multiple scripts

It is very common for multiple Stream Analytics queries to use the same inputs, outputs, or functions. Since these configurations and code are managed as files in Stream Analytics projects, you can define them once and then use them across multiple projects. Right-click the project name or a folder node (Inputs, Outputs, Functions, etc.) and then choose Add Existing Item to point to a file you have already defined. You can keep the inputs, outputs, and functions in a standalone folder outside your Stream Analytics projects to make them easy to reference from various projects.

Duplicate a job to other regions

All Stream Analytics jobs running in the cloud are listed in Server Explorer under the Stream Analytics node. You can open Server Explorer by choosing it from the View menu.

If you want to duplicate a job to another region, just right-click the job name and export it to a local Stream Analytics project. Since credentials cannot be downloaded to the local environment, you must specify the correct credentials in the job’s input and output files. After that, you are ready to submit the job to another region by clicking Submit to Azure in the script editor.

Local input schema auto-completion

If you have specified a local file for an input to your script, the IntelliSense feature will suggest input column names based on the actual schema of your data file.

Testing queries against SQL database as reference data

Azure Stream Analytics supports Azure SQL Database as an input source for reference data. When you add a reference input using SQL Database, two SQL files are generated as code-behind files under your input configuration file.

In Visual Studio 2017 or 2019, if you have already installed SQL Server Data Tools, you can write the SQL query directly and test it by clicking Execute in the query editor. A wizard window pops up to help you connect to the SQL database, and the query result is shown in the window at the bottom.

Providing feedback and ideas

The Azure Stream Analytics team is committed to listening to your feedback. We welcome you to join the conversation and make your voice heard via our UserVoice. For tools feedback, you can also reach out to ASAToolsFeedback@microsoft.com.

Also, follow us @AzureStreaming to stay updated on the latest features.
Source: Azure

What’s in an image: fast, accurate image segmentation with Cloud TPUs

Google designed Cloud TPUs from the ground up to accelerate cutting-edge machine learning (ML) applications, from image recognition, to language modeling, to reinforcement learning. And now, we’ve made it even easier for you to use Cloud TPUs for image segmentation—the process of identifying and labeling regions of an image based on the objects or textures they contain—by releasing high-performance TPU implementations of two state-of-the-art segmentation models, Mask R-CNN and DeepLab v3+, as open source code. Below, you can find performance and cost metrics for both models that can help you choose the right model and TPU configuration for your business or product needs.

A brief introduction to image segmentation

Image segmentation is the process of labeling regions in an image, often down to the pixel level. There are two common types of image segmentation:

- Instance segmentation: This process gives each individual instance of one or multiple object classes a distinct label. In a family photo containing several people, this type of model would automatically highlight each person with a different color.
- Semantic segmentation: This process labels each pixel of an image according to the class of object or texture it represents. For example, pixels in an image of a city street scene might be labeled as “pavement,” “sidewalk,” “building,” “pedestrian,” or “vehicle.”

Autonomous driving, geospatial image processing, and medical imaging, among other applications, typically require both of these types of segmentation. And image segmentation is even an exciting new enabler for certain photo and video editing processes, including bokeh and background removal!
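
To make the difference concrete, here is a small illustrative example (ours, not from the original post) of how the two kinds of output are commonly represented as arrays; the shapes, labels, and names are purely for illustration:

```python
import numpy as np

# Toy 4x4 "image" with two people (class 1) on a background (class 0).
# A semantic segmentation output assigns every pixel a class ID, so both
# people share the same label:
semantic_mask = np.array([
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])

# An instance segmentation output additionally separates the two people,
# for example as one binary mask per detected object:
instance_masks = {
    "person_1": np.array([
        [0, 1, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]) == 1,
    "person_2": np.array([
        [0, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ]) == 1,
}

# The same pixels are "person" in both views; only instance identity differs.
assert (semantic_mask == 1).sum() == sum(m.sum() for m in instance_masks.values())
```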

High performance, high accuracy, and low cost

When you choose to work with image segmentation models, you’ll want to consider a number of factors: your accuracy target, the total training time to reach this accuracy, the cost of each training run, and more. To jump-start your analysis, we have trained Mask R-CNN and DeepLab v3+ on standard image segmentation datasets and collected many of these metrics in the tables below.

Instance segmentation using Mask R-CNN

Figure 1: Mask R-CNN training performance and accuracy, measured on the COCO dataset

Semantic segmentation using DeepLab v3+

Figure 2: DeepLab v3+ training performance and accuracy, measured on the PASCAL VOC 2012 dataset

As you can see above, Cloud TPUs can help you train state-of-the-art image segmentation models with ease, and you’ll often reach usable accuracy very quickly. At the time we wrote this blog post, the first two Mask R-CNN training runs and both of the DeepLab v3+ runs in the tables above cost less than $50 using the on-demand Cloud TPU devices that are now generally available. By providing these open source image segmentation models and optimizing them for a range of Cloud TPU configurations, we aim to enable ML researchers, ML engineers, app developers, students, and many others to train their own models quickly and affordably to meet a wide range of real-world image segmentation needs.

A closer look at Mask R-CNN and DeepLab v3+

In order to achieve the image segmentation performance described above, you’ll need to use a combination of extremely fast hardware and well-optimized software. In the following sections, you can find more details on each model’s implementation.

Mask R-CNN

Mask R-CNN is a two-stage instance segmentation model that can be used to localize multiple objects in an image down to the pixel level. The first stage of the model extracts features (distinctive patterns) from an input image to generate region proposals that are likely to contain objects of interest. The second stage refines and filters those region proposals, predicts the class of every high-confidence object, and generates a pixel-level mask for each object.

Figure 3: An image from Wikipedia with an overlay of Mask R-CNN instance segmentation results.

In the Mask R-CNN table above, we explored various trade-offs between training time and accuracy. The accuracy you wish to achieve as you train Mask R-CNN will vary by application: for some, training speed might be your top priority, whereas for others you’ll prioritize training to the highest possible accuracy, even if more training time and associated costs are needed to reach that accuracy threshold.

The training time your model will require depends on both the number of training epochs and your chosen TPU hardware configuration. When training for 12 epochs, Mask R-CNN training on the COCO dataset typically surpasses an object detection “box accuracy” of 37 mAP (“mean Average Precision”). While this accuracy threshold may be considered usable for many applications, we also report training results using 24 and 48 epochs across various Cloud TPU configurations to help you evaluate the current accuracy-speed trade-off and choose an option that works best for your application. All the numbers in the tables above were collected using TensorFlow version 1.13. While we expect your results to be similar to ours, your results may vary.

Here are some high-level conclusions from our Mask R-CNN training trials:

- If budget is your top priority, a single Cloud TPU v2 device (v2-8) should serve you well. With a Cloud TPU v2, our Mask R-CNN implementation trains overnight to an accuracy point of more than 37 mAP for less than $50. With a preemptible Cloud TPU device, that cost can drop to less than $20.
- Alternatively, if you choose a Cloud TPU v3 device (v3-8), you should benefit from a speedup of up to 1.7x over a Cloud TPU v2 device—without any code changes.
- Cloud TPU Pods enable even faster training at larger scale. Using just 1/16th of a Cloud TPU v3 Pod, Mask R-CNN trains to the highest accuracy tier in the table in under two hours.

DeepLab v3+

Google’s DeepLab v3+, a fast and accurate semantic segmentation model, makes it easy to label regions in images. For example, a photo editing application might use DeepLab v3+ to automatically select all of the pixels of sky above the mountains in a landscape photograph.

Last year, we announced the initial open source release of DeepLab v3+, which as of writing is still the most recent version of DeepLab. The DeepLab v3+ implementation featured above includes optimizations that target Cloud TPU.

Figure 4: Semantic segmentation results using DeepLab v3+ [image from the DeepLab v3 paper]

We trained DeepLab v3+ on the PASCAL VOC 2012 dataset using TensorFlow version 1.13 on both Cloud TPU v2 and Cloud TPU v3 hardware. Using a single Cloud TPU v2 device (v2-8), DeepLab v3+ training completes in about 8 hours and costs less than $40 (less than $15 using preemptible Cloud TPUs).
Cloud TPU v3 offers twice the memory (128 GB) and more than twice the peak compute (420 teraflops), enabling a speedup of about 1.7x without any code changes.

Getting started—in a sandbox, or in your own project

It’s easy to start experimenting with both the models above by using a free Cloud TPU in Colab right in your browser:

- Mask R-CNN Colab
- DeepLab v3+ Colab

You can also get started with these image segmentation models in your own Google Cloud projects by following these tutorials:

- Mask R-CNN tutorial (source code here)
- DeepLab v3+ tutorial (source code here)

If you’re new to Cloud TPUs, you can get familiar with the platform by following our quickstart guide, and you can also request access to Cloud TPU v2 Pods—available in alpha today. For more guidance on determining whether you should use an individual Cloud TPU or an entire Cloud TPU Pod, check out our comparison documentation here.

Acknowledgements

Many thanks to the Googlers who contributed to this post, including Zak Stone, Pengchong Jin, Shawn Wang, Chiachen Chou, David Shevitz, Barrett Williams, Liang-Chieh Chen, Yukun Zhu, Yeqing Li, Wes Wahlin, Pete Voss, Sharon Maher, Tom Nguyen, Xiaodan Song, Adam Kerin, and Ruoxin Sang.
Source: Google Cloud Platform

Connecting employers near and far to remote job seekers with Cloud Talent Solution

In March, Cloud Talent Solution announced new functionality that enables job search by preferred commute time and mode of transit, including walking, cycling, driving, and public transportation. We also enhanced our job search capabilities in more than 100 languages. And we continue to hear from employers that finding and engaging a larger talent pool is a top priority, which means meeting job seekers where they are. In today’s era of agility, speed and innovation, companies must be able to adapt across every part of their business to keep up with competition, starting with their workforce.

To help employers and recruitment platforms connect with candidates who need more flexibility, today we are announcing an improved job search experience that allows our customers to make remote work opportunities in the U.S. more discoverable on their career sites. This functionality supports users who search for jobs with terms like “work from home” or “WFH” and returns the relevant jobs that may be labeled differently as “remote” or “telecommute.”

Job seekers have different lifestyle and geographic needs that require flexibility. Working from home can enable parents and caregivers to be more available to their families. It can help retain a high performing employee who regularly relocates as a military spouse. And it can help increase the loyalty of millennial and Generation Z employees who are much likelier to stay in a role for 5+ years if their company is flexible about where and when they work.

Through a number of user studies, we learned that most remote jobs were not clearly labeled as such. And our customers have told us it is difficult to accurately detect and return remote opportunities when users enter “work from home” or related queries into their search box. By offering this capability to our customers, we hope to make remote work opportunities more discoverable to job seekers who need them.

What remote work flexibility means for businesses

Highmark Health is a national health and wellness organization whose more than 43,000 employees serve millions of Americans nationwide. Highmark Health offers diverse career opportunities to contribute to a remarkable health experience, across a wide variety of disciplines, including nursing, customer service, finance and IT. As an employer, Highmark Health invests heavily in building diverse talent pipelines, including students and new grads, veterans, and people with disabilities, because the organization believes that diverse perspectives drive a better employee and patient experience, as well as better business outcomes. By offering remote work opportunities, Highmark Health is able to attract a broader audience of job seekers who may require work location flexibility in order to do their best work.

Highmark Health has integrated Cloud Talent Solution through Symphony Talent to offer remote work and other inclusive job search functionality to all candidates who visit their career site.

“Remote positions allow our organization to remove geography as a barrier and tap into talent markets that we would not otherwise have access to,” commented Karl Sparre, Vice President, Talent Solutions, Highmark Health.
“Our collaboration with Google and Symphony Talent enables us to compete for top talent in the current record low unemployment market, as remote positions are critical to the success of our businesses that have a nationwide consumer presence.”

“By infusing our clients’ career sites with the power of Google Cloud Talent Solution’s enhanced search capabilities, we are empowering our customers to transform their candidate experience into one that is more personalized and responsive,” said Roopesh Nair, President and CEO, Symphony Talent. “Google is driving new levels of talent marketing experiences, making it the ideal strategic partnership to accelerate our passion to deliver the right fit, quality talent to our clients.”

RecruitMilitary connects organizations with veteran talent through over 30 products and services, all of which are fueled by their job board. Their job board, with over 1.4M members, is core to RecruitMilitary’s overall business and is powered by Google Cloud Talent Solution’s job search functionality. By offering an improved search experience for remote work opportunities on RecruitMilitary’s job board, job seekers in the military community can now discover more flexible jobs to suit their specific needs.

“We’re excited about this feature as it enhances our ability to deliver meaningful jobs to important members of our military community, military spouses and veterans with limited mobility,” said Mike Francomb, Senior Vice President of Technology for RecruitMilitary and Bradley-Morris. “With this capability, these audiences will now have ease of access to opportunities that accommodate the unique aspects of their job search. Spouses can now connect with more portable career options as they move to support their service member spouse. With this new feature, veterans who prefer to or need to work from home will now have a path to opportunities that allow them to do so.”

This new functionality supporting the discoverability of remote work opportunities is available to any site using Cloud Talent Solution to power its job search. And if you are an employer or running a job board or staffing agency and want to help more people find the right job opportunities on your site, visit our website to get started with Cloud Talent Solution today.
Source: Google Cloud Platform

Kubernetes Adoption Challenges Solved

The business wants better software, brought to market faster. Enterprise IT wants to manage cost and risk. The problem for large, complex organizations is that they need to accomplish these objectives and still maintain existing business operations. It is like changing the wheels of an F1 race car at full speed without a pit stop. […]
Source: OpenShift

AI in depth: Creating preprocessing-model serving affinity with custom online prediction on AI Platform Serving

AI Platform Serving now lets you deploy your trained machine learning (ML) model with custom online prediction Python code, in beta. In this blog post, we show how custom online prediction code helps maintain affinity between your preprocessing logic and your model, which is crucial to avoid training-serving skew. As an example, we build a Keras text classifier and deploy it for online serving on AI Platform, along with its text preprocessing components. The code for this example can be found in this Notebook.

Background

The hard work of building an ML model pays off only when you deploy the model and use it in production—when you integrate it into your pre-existing systems or incorporate your model into a novel application. If your model has multiple possible consumers, you might want to deploy the model as an independent, coherent microservice that is invoked via a REST API that can automatically scale to meet demand. Although AI Platform may be better known for its training abilities, it can also serve TensorFlow, Keras, scikit-learn, and XGBoost models with REST endpoints for online prediction.

While training that model, it’s common to transform the input data into a format that improves model performance. But when performing predictions, the model expects the input data to already exist in that transformed form. For example, the model might expect a normalized numerical feature, a TF-IDF encoding of terms in text, or a constructed feature based on a complex, custom transformation. However, the callers of your model will send “raw”, untransformed data, and the caller doesn’t (or shouldn’t) need to know which transformations are required. This means the model microservice will be responsible for applying the required transformation on the data before invoking the model for prediction.

The affinity between the preprocessing routines and the model (i.e., having both of them coupled in the same service) is crucial to avoid training-serving skew, since you’ll want to ensure that these routines are applied on any data sent to the model, with no assumptions about how the callers prepare the data. Moreover, the model-preprocessing affinity helps to decouple the model from the caller. That is, if a new model version requires new transformations, these preprocessing routines can change independently of the caller, as the caller will keep on sending data in its raw format.

Besides preprocessing, your deployed model’s microservice might also perform other operations, including postprocessing of the prediction produced by the model, or even more complex prediction routines that combine the predictions of multiple models.

To help maintain affinity of preprocessing between training and serving, AI Platform Serving now lets you customize the prediction routine that gets called when sending prediction requests to a model deployed on AI Platform Serving. This feature allows you to upload a custom model prediction class, along with your exported model, to apply custom logic before or after invoking the model for prediction.

Customizing prediction routines can be useful for the following scenarios:

- Applying (state-dependent) preprocessing logic to transform the incoming data points before invoking the model for prediction.
- Applying (state-dependent) post-processing logic to the model prediction before sending the response to the caller. For example, you might want to convert the class probabilities produced by the model to a class label.
- Integrating rule-based and heuristics-based prediction with model-based prediction.
- Applying a custom transform used in fitting a scikit-learn pipeline.
- Performing complex prediction routines based on multiple models, that is, aggregating predictions from an ensemble of estimators, or calling a model based on the output of the previous model in a hierarchical fashion.

The above tasks can be accomplished by custom online prediction, using the standard framework supported by AI Platform Serving, as well as with any model developed by your favorite Python-based framework, including PyTorch. All you need to do is include the dependency libraries in the setup.py of your custom model package (as discussed below). Note that without this feature, you would need to implement the preprocessing, post-processing, or any custom prediction logic in a “wrapper” service, using, for example, App Engine. This App Engine service would also be responsible for calling the AI Platform Serving models, but this approach adds complexity to the prediction system, as well as latency to the prediction time.

Next we’ll demonstrate how we built a microservice that can handle both preprocessing and post-processing scenarios using AI Platform custom online prediction, using text classification as the example. We chose to implement the text preprocessing logic and build the classifier using Keras, but thanks to AI Platform custom online prediction, you could implement the preprocessing using other libraries (like NLTK or scikit-learn), and build the model using any other Python-based ML framework (like TensorFlow or PyTorch). You can find the code for this example in this Notebook.

A text classification example

Text classification algorithms are at the heart of a variety of software systems that process text data at scale. The objective is to classify (categorize) text into a set of predefined classes, based on the text’s content. This text can be a tweet, a web page, a blog post, user feedback, or an email: in the context of text-oriented ML models, a single text entry (like a tweet) is usually referred to as a “document.”

Common use cases of text classification include:

- Spam filtering: classifying an email as spam or not.
- Sentiment analysis: identifying the polarity of a given text, such as tweets or product and service reviews.
- Document categorization: identifying the topic of a given document (for example, politics, sports, finance, etc.).
- Ticket routing: identifying to which department to dispatch a ticket.

You can design your text classification model in two different ways; choosing one versus the other will influence how you’ll need to prepare your data before training the model.

- N-gram models: In this option, the model treats a document as a “bag of words,” or more precisely, a “bag of terms,” where a term can be one word (uni-gram), two words (bi-gram) or n words (n-grams). The ordering of the words in the document is not relevant. The feature vector representing a document encodes whether a term occurs in the document or not (binary encoding), how many times the term occurs in the document (count encoding) or, more commonly, Term Frequency Inverse Document Frequency (TF-IDF encoding). Gradient-boosted trees and Support Vector Machines are typical techniques to use in n-gram models.
- Sequence models: With this option, the text is treated as a sequence of words or terms; that is, the model uses the word ordering information to make the prediction. Types of sequence models include convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variations.

In our example, we utilize the sequence model approach.

Hacker News is one of many public datasets available in BigQuery. This dataset includes titles of articles from several data sources. For the following tutorial, we extracted the titles that belong to either GitHub, The New York Times, or TechCrunch, and saved them as CSV files in a publicly shared Cloud Storage bucket at the following location: gs://cloud-training-demos/blogs/CMLE_custom_prediction

Here are some useful statistics about this dataset:

- Total number of records: 96,203
- Min, max, and average number of words per title: 1, 52, and 8.7
- Number of records in GitHub, The New York Times, and TechCrunch: 36,525, 28,787, and 30,891
- Training and evaluation percentages: 75% and 25%

The objective of the tutorial is to build a text classification model, using Keras to identify the source of the article given its title, and to deploy the model to AI Platform Serving using custom online prediction, to be able to perform text preprocessing and prediction post-processing.

Preprocessing text

Sequence tokenization with Keras

In this example, we perform the following preprocessing steps:

- Tokenization: Divide the documents into words. This step determines the “vocabulary” of the dataset (the set of unique tokens present in the data). In this example, you’ll make use of the 20,000 most frequent words, and discard the other ones from the vocabulary. This value is set through the VOCAB_SIZE parameter.
- Vectorization: Define a good numerical measure to characterize these documents. A given embedding’s representation of the tokens (words) will be helpful when you’re ready to train your sequence model. However, these embeddings are created as part of the model, rather than as a preprocessing step. Thus, what you need here is to simply convert each token to a numerical indicator. That is, each article’s title is represented as a sequence of integers, each an indicator of a token in the vocabulary that occurred in the title.
- Length fixing: After vectorization, you have a set of variable-length sequences. In this step, the sequences are converted into a single fixed length: 50. This can be configured using the MAX_SEQUENCE_LENGTH parameter. Sequences with more than 50 tokens will be right-trimmed, while sequences with fewer than 50 tokens will be left-padded with zeros.

Both the tokenization and vectorization steps are considered to be stateful transformations. In other words, you extract the vocabulary from the training data (after tokenization and keeping the top frequent words), and create a word-to-indicator lookup, for vectorization, based on the vocabulary. This lookup will be used to vectorize new titles for prediction. Thus, after creating the lookup, you need to save it in order to (re-)use it when serving the model.

The text preprocessing code is implemented in the TextPreprocessor class in the preprocess.py module, which includes two methods:

- fit(): applied on the training data to generate the lookup (tokenizer). The tokenizer is stored as an attribute of the object.
- transform(): applies the tokenizer to any text data to generate the fixed-length sequences of word indicators.
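
The full implementation lives in the linked Notebook and is not reproduced in this extract. As a minimal sketch, an implementation along these lines, using the Keras Tokenizer and pad_sequences utilities (the class and parameter names follow the post; the details are our assumptions), would behave as described:

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

VOCAB_SIZE = 20000          # keep the 20,000 most frequent words
MAX_SEQUENCE_LENGTH = 50    # fixed length of each vectorized title


class TextPreprocessor(object):
    def __init__(self, vocab_size=VOCAB_SIZE, max_sequence_length=MAX_SEQUENCE_LENGTH):
        self._vocab_size = vocab_size
        self._max_sequence_length = max_sequence_length
        self._tokenizer = None  # the "stateful" lookup, created by fit()

    def fit(self, text_list):
        # Build the word-to-indicator lookup from the training titles only.
        tokenizer = Tokenizer(num_words=self._vocab_size)
        tokenizer.fit_on_texts(text_list)
        self._tokenizer = tokenizer

    def transform(self, text_list):
        # Vectorize titles with the fitted lookup, then fix the length:
        # right-trim long sequences and left-pad short ones with zeros.
        sequences = self._tokenizer.texts_to_sequences(text_list)
        return pad_sequences(sequences, maxlen=self._max_sequence_length,
                             padding='pre', truncating='post')


# Typical usage during training: fit on the training titles, transform both
# splits, then pickle the processor for reuse at serving time, e.g.:
#   processor = TextPreprocessor()
#   processor.fit(train_texts)
#   train_texts_vectorized = processor.transform(train_texts)
#   eval_texts_vectorized = processor.transform(eval_texts)
#   import pickle
#   with open('processor_state.pkl', 'wb') as f:
#       pickle.dump(processor, f)
```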

Preparing training and evaluation data

The following code prepares the training and evaluation data (that is, it converts each raw text title to a NumPy array with 50 numeric indicators). Note that you use both fit() and transform() with the training data, while you only use transform() with the evaluation data, to make use of the tokenizer generated from the training data. The outputs, train_texts_vectorized and eval_texts_vectorized, will be used to train and evaluate our text classification model respectively.

Next, save the processor object (which includes the tokenizer generated from the training data) to be used when serving the model for prediction. The following code dumps the object to the processor_state.pkl file.

Training a Keras model

The following code snippet shows the method that creates the model architecture. We create a Sequential Keras model, with an Embedding layer and a Dropout layer, followed by two Conv1D and pooling layers, then a Dense layer with softmax activation at the end. The model is compiled with the sparse_categorical_crossentropy loss and the acc (accuracy) evaluation metric.

The following code snippet creates the model by calling the create_model method with the required parameters, trains the model on the training data, and evaluates the trained model’s quality using the evaluation data. Lastly, the trained model is saved to the keras_saved_model.h5 file.

Implementing a custom model prediction class

In order to apply a custom prediction routine that includes preprocessing and postprocessing, you need to wrap this logic in a Custom Model Prediction class. This class, along with the trained model and the saved preprocessing object, will be used to deploy the AI Platform Serving microservice. The Custom Model Prediction class (CustomModelPrediction) for our text classification example is implemented in the model_prediction.py module. Note the following points in the implementation:

- from_path is a classmethod, responsible for loading both the model and the preprocessing object from their saved files, and instantiating a new CustomModelPrediction object with the loaded model and preprocessor object (which are both stored as attributes of the object).
- predict is the method invoked when you call the “predict” API of the deployed AI Platform Serving model. The method does the following:
  - Receives the instances (list of titles) for which the prediction is needed.
  - Prepares the text data for prediction by applying the transform() method of the “stateful” self._processor object.
  - Calls self._model.predict() to produce the predicted class probabilities, given the prepared text.
  - Post-processes the output by calling the _postprocess method.
- _postprocess is the method that receives the class probabilities produced by the model, picks the label index with the highest probability, and converts this label index to a human-readable label: github, nytimes, or techcrunch.
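
The module itself is in the linked Notebook; the sketch below is our approximation of such a class, assuming Keras model loading with load_model, a pickled TextPreprocessor, and a particular label ordering (the file names come from the post, the label ordering is an assumption):

```python
import os
import pickle

import numpy as np
from tensorflow.keras.models import load_model

CLASS_LABELS = ['github', 'nytimes', 'techcrunch']  # assumed label order


class CustomModelPrediction(object):
    def __init__(self, model, processor):
        # Both the Keras model and the fitted preprocessor are kept as
        # attributes so predict() can reuse them for every request.
        self._model = model
        self._processor = processor

    def _postprocess(self, predictions):
        # Convert each vector of class probabilities to a readable label.
        return [CLASS_LABELS[int(np.argmax(probs))] for probs in predictions]

    def predict(self, instances, **kwargs):
        # instances is the list of raw titles sent by the caller.
        preprocessed = self._processor.transform(instances)
        probabilities = self._model.predict(preprocessed)
        return self._postprocess(probabilities)

    @classmethod
    def from_path(cls, model_dir):
        # Load the trained model and the pickled preprocessor that were
        # uploaded alongside it, then build the predictor instance.
        model = load_model(os.path.join(model_dir, 'keras_saved_model.h5'))
        with open(os.path.join(model_dir, 'processor_state.pkl'), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)
```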

Deploying to AI Platform Serving

Figure 1 shows an overview of how to deploy the model, along with its required artifacts for a custom prediction routine, to AI Platform Serving.

Uploading the artifacts to Cloud Storage

The first thing you want to do is upload your artifacts to Cloud Storage. First, you need to upload:

- Your saved (trained) model file: keras_saved_model.h5 (see the Training a Keras model section).
- Your pickled (serialized) preprocessing object (which contains the state needed for data transformation prior to prediction): processor_state.pkl (see the Preprocessing text section). Remember, this object includes the tokenizer generated from the training data.

Second, upload a Python package including all the classes you need for prediction (e.g., preprocessing, model classes, and post-processing). In this example, you need to create a pip-installable tar with model_prediction.py and preprocess.py: create a setup.py file and use it to generate the source distribution package. This creates a .tar.gz package under a new /dist directory created in your working directory. The name of the package will be $name-$version.tar.gz, where $name and $version are the ones specified in setup.py. Once you have successfully created the package, you can upload it to Cloud Storage.

Deploying the model to AI Platform Serving

Let’s define the model name, the model version, and the AI Platform Serving runtime (which corresponds to a TensorFlow version) required to deploy the model. First, create a model in AI Platform Serving using a gcloud command. Second, create a model version with another gcloud command, in which you specify the location of the model and preprocessing object (--origin), the location of the package(s) including the scripts needed for your prediction (--package-uris), and a pointer to your Custom Model Prediction class (--prediction-class). This should take one to two minutes.

Calling the deployed model for online predictions

After deploying the model to AI Platform Serving, you can invoke the model for prediction from any Python client.
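
The exact snippet is in the linked Notebook; a rough sketch of such a call using the Google API Python client is shown below, where the project, model, and version names and the sample titles are placeholders:

```python
from googleapiclient import discovery

PROJECT = 'my-project'        # placeholder Google Cloud project ID
MODEL_NAME = 'hacker_news'    # placeholder model name
VERSION_NAME = 'v1'           # placeholder version name

# Raw titles; the deployed CustomModelPrediction class handles preprocessing.
instances = [
    'A new open source machine learning framework',   # hypothetical titles
    'Tech startup raises a large funding round',
]

service = discovery.build('ml', 'v1')
name = 'projects/{}/models/{}/versions/{}'.format(PROJECT, MODEL_NAME, VERSION_NAME)

response = service.projects().predict(name=name, body={'instances': instances}).execute()
if 'error' in response:
    raise RuntimeError(response['error'])

print(response['predictions'])  # e.g. ['github', 'techcrunch']
```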

Given the titles defined in the request object of the Notebook example, the predicted source of each title from the deployed model would be as follows: [techcrunch, techcrunch, techcrunch, nytimes, nytimes, nytimes, github, github, techcrunch]. Note that the last one was mis-classified by the model.

Conclusion

In this tutorial, we built and trained a text classification model using Keras to predict the source media of a given article. The model required text preprocessing operations for preparing the training data and for preparing the incoming requests to the model deployed for online predictions. Then we showed you how to deploy the model to AI Platform Serving with custom online prediction code, in order to perform preprocessing on the incoming prediction requests and post-processing on the prediction outputs. Enabling a custom online prediction routine in AI Platform Serving allows for affinity between the preprocessing logic, the model, and the post-processing logic required to handle prediction requests end to end. This helps to avoid training-serving skew, and simplifies deploying ML models for online prediction.

Thanks for following along. If you’re curious to try out some other machine learning tasks on GCP, take this specialization on Coursera. If you want to try out these examples for yourself in a local environment, run this Notebook. Send a tweet to @GCPcloud if there’s anything we can change or add to make text analysis even easier on Google Cloud.

Acknowledgements

We would like to thank Lak Lakshmanan, Technical Lead, Machine Learning and Big Data in Google Cloud, for reviewing and improving the blog post.

Source: Google Cloud Platform