Run cron jobs reliably on Compute Engine with Cloud Scheduler

Many systems have regularly scheduled jobs, but getting those jobs to run reliably in a distributed environment can be surprisingly hard. Imagine trying to run the standard UNIX cron job scheduling service in a fleet of virtual machines. Individual machines come and go due to autoscaling and network partitioning, so a critical task might never run because the instance it was scheduled on became unavailable. Alternately, a task that was meant to run only once might be duplicated by many servers as your autoscaler brings them online.

Using Cloud Scheduler for scheduling and Google Cloud Pub/Sub for messaging, you can build a distributed and fault-tolerant scheduler for your virtual machines. In this design pattern, you schedule your jobs in Cloud Scheduler. Cloud Scheduler uses Cloud Pub/Sub to relay the events to a utility running on each Compute Engine instance. When that utility receives a message, it runs a script corresponding to the Cloud Pub/Sub topic. The scripts run locally on the instance just as if they were run by cron. In fact, you can reuse existing cron scripts with this design pattern.

Using Cloud Pub/Sub for distributed messaging means that you can schedule an event to run on only one of many servers, or to run a task on several servers concurrently. Using this topic and subscriber model (shown in the diagram below) allows you to control which instances receive and perform a given task.

For a detailed explanation of this design pattern, check out our Reliable Task Scheduling for Google Compute Engine article, which includes a sample implementation on GitHub. Feel free to make pull requests or open issues directly on the open source sample.
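To make the pattern concrete, here is a minimal sketch of the per-instance utility, assuming the google-cloud-pubsub Python client; the project, subscription, and script names are illustrative, and the sample on GitHub is the authoritative implementation:

```python
import subprocess
from google.cloud import pubsub_v1

# Hypothetical mapping from a job name (carried as a message attribute)
# to the local cron script that should run on this instance.
SCRIPTS = {
    "nightly-backup": "/opt/cron/nightly_backup.sh",
    "logrotate": "/opt/cron/logrotate.sh",
}

def callback(message):
    script = SCRIPTS.get(message.attributes.get("job", ""))
    if script:
        # Run the script locally, just as cron would.
        subprocess.run([script], check=False)
    message.ack()

subscriber = pubsub_v1.SubscriberClient()
subscription = subscriber.subscription_path("my-project", "cron-events")
future = subscriber.subscribe(subscription, callback=callback)
future.result()  # block so the utility keeps listening for scheduled events
```

Whether a job runs on one instance or on all of them then comes down to whether the instances share a single subscription (each message goes to one subscriber) or each instance has its own subscription to the topic.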
Source: Google Cloud Platform

AI in depth: monitoring home appliances from power readings with ML

As the popularity of home automation and the cost of electricity grow around the world, energy conservation has become a higher priority for many consumers. With a number of smart meter devices available for your home, you can now measure and record overall household power draw, and then with the output of a machine learning model, accurately predict individual appliance behavior simply by analyzing meter data. For example, your electric utility provider might send you a message if it can reasonably assess that you left your refrigerator door open, or if the irrigation system suddenly came on at an odd time of day.

In this post, you'll learn how to accurately identify home appliances' (e.g. electric kettles and washing machines, in this dataset) operating status using smart power readings, together with modern machine learning techniques such as long short-term memory (LSTM) models. Once the algorithm identifies an appliance's operating status, we can then build out a few more applications. For example:

- Anomaly detection: Usually the TV is turned off when there is no one at home. An application can send a message to the user if the TV turns on at an unexpected or unusual time.
- Habit-improving recommendations: We can present users the usage patterns of home appliances in the neighborhood at an aggregated level so that they can compare or refer to the usage patterns and optimize the usage of their home appliances.

We developed our end-to-end demo system entirely on Google Cloud Platform, including data collection through Cloud IoT Core, a machine learning model built using TensorFlow and trained on Cloud Machine Learning Engine, and real-time serving and prediction made possible by Cloud Pub/Sub, App Engine and Cloud ML Engine. As you progress through this post, you can access the full set of source files in the GitHub repository here.

Introduction

The growing popularity of IoT devices and the evolution of machine learning technologies have brought new opportunities for businesses. In this post, you'll learn how home appliances' (for example, an electric kettle and a washing machine) operating status (on/off) can be inferred from gross power readings collected by a smart meter, together with state-of-the-art machine learning techniques. An end-to-end demo system, developed entirely on Google Cloud Platform (as shown in Fig. 1), includes:

- Data collection and ingest through Cloud IoT Core and Cloud Pub/Sub
- A machine learning model, trained using Cloud ML Engine
- That same machine learning model, served using Cloud ML Engine together with App Engine as a front end
- Data visualization and exploration using BigQuery and Colab

Figure 1. System architecture

The animation below shows real-time monitoring, as real-world energy usage data is ingested through Cloud IoT Core into Colab.

Figure 2. Illustration of real-time monitoring

IoT extends the reach of machine learning

Data ingestion

In order to train any machine learning model, you need data that is both suitable and sufficient in quantity. In the field of IoT, we need to address a number of challenges in order to reliably and safely send the data collected by smart IoT devices to remote centralized servers. You'll need to consider data security, transmission reliability, and use case-dependent timeliness, among other factors. Cloud IoT Core is a fully managed service that allows you to easily and securely connect, manage, and ingest data from millions of globally dispersed devices.
The two main features of Cloud IoT Core are its device manager and its protocol bridge. The former allows you to configure and manage individual devices in a coarse-grained way by establishing and maintaining devices' identities along with authentication after each connection. The device manager also stores each device's logical configuration and is able to remotely control the devices—for example, changing a fleet of smart power meters' data sampling rates. The protocol bridge provides connection endpoints with automatic load balancing for all device connections, and natively supports secure connection over industry standard protocols such as MQTT and HTTP. The protocol bridge publishes all device telemetry to Cloud Pub/Sub, which can then be consumed by downstream analytic systems. We adopted the MQTT bridge in our demo system, and the following code snippet includes the MQTT-specific logic.
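The sketch below is a minimal illustration of that logic, assuming the paho-mqtt and PyJWT libraries; the project, registry, and device names are placeholders, and the full version lives in EnergyDisaggregationDemo_Client.ipynb:

```python
import datetime
import json
import jwt  # PyJWT, used to mint the token the bridge expects as a password
import paho.mqtt.client as mqtt

PROJECT, REGION = "my-project", "us-central1"
REGISTRY, DEVICE = "meter-registry", "meter-0"

def create_jwt():
    claims = {
        "iat": datetime.datetime.utcnow(),
        "exp": datetime.datetime.utcnow() + datetime.timedelta(minutes=60),
        "aud": PROJECT,  # Cloud IoT Core requires the project ID as audience
    }
    with open("rsa_private.pem", "r") as f:
        return jwt.encode(claims, f.read(), algorithm="RS256")

# Cloud IoT Core identifies the device through this exact client ID format.
client_id = "projects/{}/locations/{}/registries/{}/devices/{}".format(
    PROJECT, REGION, REGISTRY, DEVICE)
client = mqtt.Client(client_id=client_id)
client.username_pw_set(username="unused", password=create_jwt())
client.tls_set(ca_certs="roots.pem")  # Google root certificates
client.connect("mqtt.googleapis.com", 8883)

# Telemetry published to /devices/DEVICE/events flows into Cloud Pub/Sub.
reading = json.dumps({"device": DEVICE, "power_w": 1520.0})
client.publish("/devices/{}/events".format(DEVICE), reading, qos=1)
```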
Data consumption

After the system publishes data to Cloud Pub/Sub, it delivers a message request to the "push endpoint," typically the gateway service that consumes the data. In our demo system, Cloud Pub/Sub pushes data to a gateway service hosted in App Engine, which then forwards the data to the machine learning model hosted in Cloud ML Engine for inference, and at the same time stores the raw data, together with the received prediction results, in BigQuery for later (batch) analysis.

While there are numerous business-dependent use cases you can deploy based on our sample code, we illustrate raw data and prediction results visualization in our demo system. In the code repository, we have provided two notebooks:

- EnergyDisaggregationDemo_Client.ipynb: this notebook simulates multiple smart meters by reading in power consumption data from a real-world dataset and sends the readings to the server. All Cloud IoT Core-related code resides in this notebook.
- EnergyDisaggregationDemo_View.ipynb: this notebook allows you to view raw power consumption data from a specified smart meter and our model's prediction results in almost real time.

If you follow the deployment instructions in the README file and in the accompanying notebooks, you should be able to reproduce the results shown in Figure 2. Meanwhile, if you'd prefer to build out your disaggregation pipeline in a different manner, you can also use Cloud Dataflow and Pub/Sub I/O to build an app with similar functionality.

Data processing and machine learning

Dataset introduction and exploration

We trained our model to predict each appliance's on/off status from gross power readings, using the UK Domestic Appliance-Level Electricity (UK-DALE, publicly available here1) dataset, in order for this end-to-end demo system to be reproducible. UK-DALE records both whole-house power consumption and usage from each individual appliance every 6 seconds from 5 households. We demonstrate our solution using the data from house #2, for which the dataset includes a total of 18 appliances' power consumption. Given the granularity of the dataset (a sample rate of ⅙ Hz), it is difficult to estimate appliances with relatively tiny power usage. As a result, appliances such as laptops and computer monitors are removed from this demo. Based on a data exploration study shown below, we selected eight appliances out of the original 18 items as our target appliances: a treadmill, washing machine, dishwasher, microwave, toaster, electric kettle, rice cooker and "cooker," a.k.a., electric stovetop.

The figure below shows the power consumption histograms of selected appliances. Since all the appliances are off most of the time, most of the readings are near zero. Fig. 4 shows the comparisons between the aggregate power consumption of selected appliances (`app_sum`) and the whole-house power consumption (`gross`). It is worth noting that the input to our demo system is the gross consumption (the blue curve), because this is the most readily available power usage data, and is even measurable outside the home.

Figure 3. Target appliances and demand histograms

Figure 4. Data sample from House #2 (on 2013-07-04 UTC)

The data for House #2 spans from late February to early October 2013. We used data from June to the end of September in our demo system due to missing data at both ends of the period. A descriptive summary of the selected appliances is given in Table 1. As expected, the data is extremely imbalanced, in terms of both "on" vs. "off" time for each appliance and the power consumption scale of each appliance, which introduces the main difficulty of our prediction task.

Table 1. Descriptive summary of power consumption

Preprocessing the data

Since UK-DALE did not record individual appliance on/off status, one key preprocessing step is to label the on/off status of each appliance at each timestamp. We assume an appliance to be "on" if its power consumption is larger than one standard deviation from the sample mean of its power readings, given the fact that appliances are off most of the time and hence most of the readings are near zero. The code for data preprocessing can be found in the notebook provided, and you can also download the processed data from here.

With the preprocessed data in CSV format, TensorFlow's Dataset class serves as a convenient tool for data loading and transformation—for example, as the input pipeline for machine learning model training. In order to address the data imbalance issue, you can either down-sample the majority class or up-sample the minority class. In our case, we propose a probabilistic negative down-sampling method: we preserve the subsequences in which at least one appliance remains on, but we filter the subsequences with all appliances off, based on a certain probability and threshold. This filtering logic integrates easily with the tf.data API. Finally, you'll want to follow the best practices from the Input Pipeline Performance Guide to ensure that your GPU or TPU resources (if they are used to speed up training) are not wasted while waiting for data to load from the input pipeline. To maximize usage, we employ parallel mapping to parallelize data transformation, and prefetch data to overlap the preprocessing and model execution of a training step.
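The following is a condensed sketch of that input pipeline, combining CSV loading, sequence construction, probabilistic negative down-sampling, and parallel map/prefetch, written against the TensorFlow 1.x tf.data API of the time; the column layout, sequence length, and keep probability are illustrative:

```python
import tensorflow as tf

SEQ_LENGTH = 50       # length of each training subsequence (illustrative)
NUM_APPLIANCES = 8    # one on/off label per target appliance
KEEP_PROB = 0.2       # chance of keeping an "all appliances off" subsequence

def parse_line(line):
    # Assumed layout: gross power followed by 8 binary on/off labels.
    fields = tf.decode_csv(line, record_defaults=[[0.0]] * (1 + NUM_APPLIANCES))
    return fields[0], tf.stack(fields[1:])

def keep_subsequence(power_seq, label_seq):
    # Always keep subsequences where at least one appliance is on;
    # keep the all-off ones only with probability KEEP_PROB.
    any_on = tf.reduce_any(tf.greater(label_seq, 0.5))
    lucky = tf.random_uniform([]) < KEEP_PROB
    return tf.logical_or(any_on, lucky)

def input_fn(csv_path, batch_size=64):
    dataset = (tf.data.TextLineDataset(csv_path)
               .skip(1)  # header row
               .map(parse_line, num_parallel_calls=4)   # parallel transformation
               .batch(SEQ_LENGTH, drop_remainder=True)  # fixed-length subsequences
               .filter(keep_subsequence)
               .batch(batch_size)
               .prefetch(1))  # overlap preprocessing with model execution
    return dataset
```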
The machine learning model

We adopt a long short-term memory (LSTM) based network as our classification model. Please see Understanding LSTM Networks for an introduction to recurrent neural networks and LSTMs. Fig. 5 depicts our model design, in which an input sequence of length n is fed into a multilayered LSTM network, and a prediction is made for all m appliances. A dropout layer is applied to the input of the LSTM cell, and the output of the whole sequence is fed into a fully connected layer. We implemented this model as a TensorFlow estimator.

Figure 5. LSTM-based model architecture

There are two ways of implementing the above architecture: the TensorFlow native API (tf.layers and tf.nn) and the Keras API (tf.keras). Compared to TensorFlow's native API, Keras is a higher-level API that lets you train and serve deep learning models with three key advantages: ease of use, modularity, and extensibility. tf.keras is TensorFlow's implementation of the Keras API specification. In the code sample accompanying this post, we implemented the same LSTM-based classification model using both methods, so that you can compare the two.
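To give a flavor of the tf.keras variant, here is a minimal sketch of the architecture in Fig. 5; the layer sizes and dropout rate are illustrative, and the repository contains the full implementation of both variants:

```python
import tensorflow as tf

def build_model(seq_length=50, num_appliances=8, lstm_units=64, dropout_rate=0.3):
    model = tf.keras.Sequential([
        # Dropout on the input sequence, as in Fig. 5.
        tf.keras.layers.Dropout(dropout_rate, input_shape=(seq_length, 1)),
        tf.keras.layers.LSTM(lstm_units, return_sequences=True),
        tf.keras.layers.LSTM(lstm_units),
        # One sigmoid unit per appliance: an independent on/off prediction each.
        tf.keras.layers.Dense(num_appliances, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```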
Training and hyperparameter tuning

Cloud Machine Learning Engine supports both training and hyperparameter tuning. Figure 6 shows the average (over all appliances) precision, recall and f_score for multiple trials with different combinations of hyperparameters. We observed that hyperparameter tuning significantly improves model performance.

Figure 6. Learning curves from hyperparameter tuning

We selected two experiments with optimal scores from hyperparameter tuning and report their performance in Table 2.

Table 2. Hyperparameter tuning of selected experiments

Table 3 lists the precision and recall of each individual appliance. As mentioned in the "Dataset introduction and exploration" section above, the cooker and the treadmill ("running machine") are difficult to predict, because their peak power consumption is significantly lower than that of the other appliances.

Table 3. Precision and recall of predictions for individual appliances

Conclusion

We have provided an end-to-end demonstration of how you can use machine learning to determine the operating status of home appliances accurately, based only on smart power readings. Several products, including Cloud IoT Core, Cloud Pub/Sub, Cloud ML Engine, App Engine and BigQuery, are orchestrated to support the whole system, in which each product solves a specific problem required to implement this demo, such as data collection/ingestion, machine learning model training, and real-time serving/prediction. Both our code and data are available for those of you who would like to try out the system for yourself.

We are optimistic that both we and our customers will develop ever more interesting applications at the intersection of more capable IoT devices and fast-evolving machine learning algorithms. Google Cloud provides both the IoT infrastructure and the machine learning training and serving capabilities that make newly capable smart IoT deployments both a possibility and a reality.

1. Jack Kelly and William Knottenbelt. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Scientific Data 2, Article number: 150007, 2015, DOI:10.1038/sdata.2015.7.

Source: Google Cloud Platform

Introducing scheduled snapshots for Compute Engine persistent disk

From web hosting to databases, workloads running on Compute Engine need a reliable, convenient and automatic way to create periodic snapshots of the disks of VM instances. We are excited to announce that starting today, scheduled snapshots are available in beta. This feature lets you create automated snapshots, as well as manage snapshot retention. It is designed to reduce errors and save time, so you can focus on initiatives that create value for your business.

You use this feature by first defining a snapshot schedule, which supports hourly, daily and weekly frequencies. For example, you can create a schedule that says "Create a snapshot every six hours," or "Create a snapshot every Monday, Wednesday and Friday of each week." Scheduled snapshots also mean you no longer need to manage snapshot cleanup yourself. You can define the retention policy within the same schedule, and the system will automatically delete snapshots based on your defined retention policy.

A snapshot schedule can be applied to a single disk, or to multiple disks within the same region, so you can create scheduled snapshots at scale. You can also use the latest storage location feature for snapshots when defining snapshot schedules.

Using the scheduled snapshots feature

You can create scheduled snapshots via the API, in the CLI (gcloud) and through the GCP Developer Console. Here's how to get started. In this gcloud example, a snapshot schedule named "hourly-schedule" is created in the europe-west1 region to generate snapshots every six hours and delete them after 15 days; the schedule is then attached to the existing disk d1 in the europe-west1-b zone. A schedule can also be specified while creating a new disk.
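The commands below sketch those three steps, assuming the beta resource-policies surface as it existed at launch; exact command and flag names may differ in your gcloud version, and the disk names and start time are illustrative:

```bash
# 1. Create a schedule: a snapshot every six hours, retained for 15 days.
gcloud beta compute resource-policies create-snapshot-schedule hourly-schedule \
    --region=europe-west1 \
    --hourly-schedule=6 --start-time=04:00 \
    --max-retention-days=15

# 2. Attach the schedule to the existing disk d1.
gcloud beta compute disks add-resource-policies d1 \
    --zone=europe-west1-b \
    --resource-policies=hourly-schedule

# 3. Or specify the schedule while creating a new disk.
gcloud beta compute disks create d2 \
    --zone=europe-west1-b \
    --resource-policies=hourly-schedule
```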
You can also create and manage your automated snapshots in the Developer Console. As you can see in the screenshot below, simply go to the "Snapshots" page in Compute Engine and create and manage your snapshot schedules in the "Snapshot schedules" tab. Here is how the same schedule as in the CLI example above would look in the Developer Console, once you create a schedule through the "Create snapshot schedule" button at the top.

Navigate to the "Disks" tab to attach snapshot schedules to one or more disks. You can attach schedules to existing disks in the disk details view, or apply a schedule while creating a new disk. The screenshot below shows where to choose the snapshot schedule when creating a disk.

To learn more about this feature and other best practices for managing your VMs, check out our talk from Next '18. With the scheduled snapshots feature, you can focus on building creative applications without inventing your own tools for disk snapshots. Try it today.

Source: Google Cloud Platform

Revevol: How we built a BI dashboard with GCP to track G Suite adoption

Editor's note: Want inspiration on how to use Google Cloud Platform's extensive data analytics and visualization tools? Using a combination of App Engine, Cloud Storage, Cloud Dataflow, BigQuery and Data Studio, Google Cloud partner Revevol created a series of dashboards to help a client better understand how its employees were using G Suite. Read on to learn how Revevol did it.

Here at Revevol, we've deployed G Suite at dozens of companies, migrating over 400k users and training 80k people through our change management programs. After using G Suite for a few years, one of our largest clients, a 30k-employee global industrial company, wanted to understand more about how its employees were actually using G Suite. They needed a more data-driven approach to service delivery and change management in order to optimize collaboration.

But with their complex organizational structure, questions like "How is G Suite being used in my organization?", "Where are people struggling?" and "How do we improve?" were nearly impossible to answer using the G Suite admin console. They turned to us to give them a comprehensive picture of their users' G Suite usage based on activity logs and metadata.

As a long-time GCP shop, for both our projects and our products (like AODocs), we naturally turned to GCP to build the solution. Our customer wanted to be able to display usage and activity data filtered by location, country, region, business unit and time, and to export data to spreadsheets for further analysis. This entailed joining data from the G Suite Directory with G Suite usage data, and displaying it through a filterable business intelligence (BI) dashboard that looks like this:

At a high level, the architecture is as follows: we extract data from the G Suite APIs using App Engine, store it in Google Cloud Storage, transform it in Cloud Dataflow and store it in BigQuery for analysis. We then use Data Studio for visualization. Let's go over each of these components.

Data extraction

The first step in building a solution like this is to extract the data. There are two ways to do so: with REST APIs and with the BigQuery Export feature.

REST APIs

G Suite provides a large number of REST APIs that allow querying for service metadata, such as documents stored in Google Drive, and activity logs. In this case, we developed an extraction module on top of App Engine. App Engine is great because it is completely serverless and can scale up and down without having to tweak the configuration, provision capacity, or handle load balancing.

There are a number of APIs from which to extract data, and two kinds of extractions: snapshots of the current state and activity logs. Snapshots are extracted from the Directory API, the Drive API, or the Groups Settings API. The Directory extraction stores a list of users in BigQuery. For Drive, we request all the documents owned by each user by impersonating a service account, a special type of Google account that belongs to your application rather than to an individual end user. Your application assumes the identity of the service account to call Google APIs, so that the users aren't directly involved. Thanks to the integration of identity between G Suite and GCP, this is a breeze.

We make requests to the API using Google Cloud Tasks. Each user gets its own Task in a Task Queue, and we launch up to 100 tasks at a time. All API responses are pushed to Cloud Storage. If a user owns so many documents that it's impossible to page through all of them inside of the 10-minute limit, the user's task adds itself back to the task queue. If an extraction fails, it also goes back into the queue. The state of the extraction is maintained as a decreasing counter in Memcache that each task updates on success. Once the counter hits 0, the job is done, triggering the backup/transformation jobs.
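A minimal sketch of that fan-out pattern, using the App Engine standard environment's taskqueue and memcache APIs; the handler URLs and queue names are illustrative:

```python
from google.appengine.api import memcache, taskqueue

def start_extraction(user_emails):
    # One task per user; the queue's rate limits cap concurrency (~100 tasks).
    memcache.set('drive_extract_remaining', len(user_emails))
    for email in user_emails:
        taskqueue.add(queue_name='drive-extract',
                      url='/tasks/extract_drive',
                      params={'user': email})

def on_extraction_task_done():
    # Each successful task decrements the counter; 0 triggers the next stage.
    remaining = memcache.decr('drive_extract_remaining')
    if remaining == 0:
        taskqueue.add(url='/tasks/start_transform')
```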
If you've been paying attention, you're probably wondering, "What about Team Drives? How do I extract data from documents stored there?" Great question. While as an admin you can get the list of all the Team Drives in a domain, you cannot then list the documents stored inside those Team Drives, so it is a bit complex. This is how we do it: first we list all the Team Drives in the domain. Then we go through the users one by one in order to find a member of each Team Drive, impersonate them with the service account, and finally list the files in the Team Drive.

Thanks to the power and flexibility of Google Cloud Tasks, we were able to implement a parallel and coordinated task queue very easily, without worrying about servers, and extract all the Drive metadata of a 35k-employee company in under 10 minutes. In fact, the bottleneck here is the Drive API quota.

Extracting activity logs is more straightforward, as they come from the Admin Report API, so there is no need to impersonate all the users. As it is time-series data, we query it daily, triggered by a cron job in App Engine, again relying on Task Queues.

BigQuery Exports

Unlike with other G Suite products, G Suite Enterprise customers can export Gmail daily logs directly into BigQuery. This is an example of the tight integration between G Suite and GCP. If similar export capabilities existed for the other G Suite services, we would completely bypass the APIs and implement the whole solution without writing a single line of code (except SQL queries, of course).

Data transformation and storage

The data exported from the REST APIs now lies in Cloud Storage as raw JSON, and we keep it there for backup and archiving purposes. For analysis, however, we need to copy it over to BigQuery. At the end of each extraction, a message is sent to a Cloud Pub/Sub topic that we've subscribed to using Cloud Functions. The cloud function loads the data using the BigQuery and Cloud Storage APIs.
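A sketch of such a function, assuming a Pub/Sub-triggered Python Cloud Function and newline-delimited JSON files; the bucket, dataset and table names are placeholders:

```python
from google.cloud import bigquery

def load_extract_to_bigquery(event, context):
    # Triggered by a Pub/Sub message announcing a finished extraction.
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig()
    job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    job_config.autodetect = True
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

    uri = "gs://gsuite-extracts/drive/*.json"  # raw JSON pushed by the tasks
    load_job = client.load_table_from_uri(
        uri, "my-project.gsuite_analytics.drive_files_raw",
        job_config=job_config)
    load_job.result()  # wait for the load to complete
```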
If we want to transform the data before it reaches BigQuery, we can use Cloud Dataflow in batch mode. Otherwise, we can use BigQuery queries to create transformed tables from the raw tables. Your decision will likely be guided by two factors:

- Costs: The BigQuery and Dataflow pricing models are very different. If your transformations frequently scan large tables, BigQuery costs can quickly add up.
- Maintenance overhead: It's typically easier to maintain Cloud Dataflow and its clear, readable code than the typically long, complex, unversioned SQL queries stored in BigQuery's web UI.

We currently do transformations in BigQuery, as it is faster to prototype, but we plan on moving some of the larger ones to Cloud Dataflow soon. We might also experiment with Cloud Dataprep, which could enable us to describe Dataflow transformation pipelines without coding, but we haven't tried it yet.

Data visualization

Data Studio, a free BI product that's part of Google Marketing Platform (GMP), is a great example of the tight integration between Google Cloud and the greater Google ecosystem. Data Studio dashboards obey the same access patterns as Google Docs and Sheets (user, group and domain permissions), have the same real-time collaboration features, and are available to anyone who has a Google account.

From the outset, our client wanted one dashboard per G Suite service—one for Google Drive, Gmail, Hangouts Meet, etc. Data Studio provides a connector to BigQuery, which enables the creation of data sources based on BigQuery tables. Whoever creates the data source is the only person who needs access to the underlying dataset, and G Suite credentials allow authentication to happen in the background, with nothing to configure.

We created one data source per dashboard, to which we added charts and KPIs through the Data Studio UI. Thus, without writing any front-end code or SQL, we are able to display KPIs and charts based on BigQuery tables, all in a slick, professional dashboard, as well as add filters.

While Data Studio can be magical, there are some things to keep in mind. For instance:

- Additional aggregation functions like AVG_DISTINCT would help, for example to display the average number of Meet meeting participants filtered by location.
- If the simple G Suite/Drive-based access control does not work for your use case, you might have to implement a custom solution using row-level access in BigQuery or by building your own Community Connector.
- While reusable templates exist, declarative (YAML-like) sources would be great for industrializing dashboard management and maintenance.
- There is no easy way to drill down into the underlying data that was used to build a chart.
- Security controls on BigQuery data sources make collaborating on them slightly difficult: it would be great to be able to view the query even as a non-owner.

We look forward to seeing some of these features in future releases of Data Studio. That said, Data Studio is a simple but seriously powerful BI product that you can use for a long time before needing a competitive paid offering.

Data exports

As you might recall, our customer asked for a way to export underlying chart data to a spreadsheet for further analysis. Here we once again took advantage of the integrations between G Suite and GCP, using the brand new Google Sheets data connector for BigQuery. It's very straightforward to use: just select your project and schema from the Google Sheets UI, insert your query (which can be parameterized), run it and voilà! Google Sheets also provides a refresh button so spreadsheet end users can refresh the data.

It's important to realize that in both the visualization and export use cases, if GCP and G Suite services weren't so well integrated, we would have had to create a complex API to expose data from BigQuery, handle authentication, and maintain it all as our queries or schemas changed. With this solution, we didn't have to do any of this.
Conclusion

The brunt of our work fell into three areas:

- Extracting data from the APIs. In and of itself, this process adds no business value, and it could be completely bypassed if G Suite exported metadata and logs for all services directly into BigQuery, as it does for Gmail.
- Transforming the data to make it valuable from a business point of view. This is where our expertise in G Suite collaboration really shined through.
- Creating dashboards to make the transformed data easy to consume, which adds a lot of business value.

Notably absent from this list is integrating services together, or performing DevOps tasks such as setting up logging infrastructure, managing database capacity, replication and backup, and setting up networking and VMs. This allowed us to deliver a working business intelligence solution quickly, with a very small team of empowered developers, and is consistent with our overall impression of GCP: it does an amazing job of letting us focus on adding business value, rather than tending to servers and writing integration code. If you have further questions about how we built this solution, or if you want to learn more about Revevol and our offerings, you can find me at stanislas.marion@revevol.eu.
Source: Google Cloud Platform

How Box Skills can optimize your workflow with the help of Cloud AI

Have you ever had to manually upload and tag a lot of files? It's no fun. Increasingly, though, machine learning algorithms can help you or your team classify and tag large volumes of content automatically. And if your company uses Box, a popular file sharing, storage and collaboration service, you can now apply Google ML services to your files with just a few lines of code, using the Box Skills Kit, a new framework within Box's developer toolkit.

With technologies like image recognition, speech-to-text transcription, and natural language understanding, Google Cloud makes it easy to enrich your Box files with useful metadata. For example, if you have lots of images in your repository, you can use the Cloud Vision API to understand more about each image, such as the objects or landmarks it contains; for documents, you can parse their contents and identify elements that determine the document's category. If your needs extend beyond the functionality provided by Cloud Vision, you can point your Skill at a custom endpoint that serves your own custom-trained model.

An example integration in action

Now, let's look at an example. Many businesses use Box to store images of their products. With the Box Skills Kit and the product search functionality in the Cloud Vision API, you can automatically catalog these products. When a user uploads a new product image into Box, the product search feature within the Vision API helps identify similar products in the catalog, as well as the maximum price for such a product.

Configuring and deploying a product search Box Skill

Let's look at how you can use the Box Skills Kit to implement the use case outlined above.

1. Create an endpoint for your Skill
   a. Follow this QuickStart guide.
   b. You can use this API endpoint to call a pre-trained machine learning model to classify new data.
   c. Create a Cloud Function to point your Box Skill at the API endpoint created above.
   d. Clone the following repository.
   e. Next, follow the instructions to deploy the function to your project.
   f. Make a note of the endpoint's URI.

2. Configure a Box Custom Skills App in Box, then configure it to point to the Cloud Function created above.
   a. Follow these instructions.
   b. Then these instructions.

And there you have it. You now have a new custom Box Skill, enabled by Cloud AI, that's ready to use. Try uploading a new image to your Box drive and notice that the maximum retail price and information on similar products are both displayed under the "skills" console.

Using your new Skill

Now that you're all set up, you can begin by uploading an image file of household goods, apparel, or toys into your Box drive. The upload triggers a Box Skill event workflow, which calls the Cloud Function you deployed in Google Cloud and whose endpoint you specified in the Box Admin Console. The Cloud Function uses the Box Skills Kit's FileReader API to read the base64-encoded image string that Box automatically sends when the upload trigger occurs. The Function then calls the product search feature of Cloud Vision, and creates a Topics Card with the data returned from product search. Next, it creates a Faces card in which to populate a thumbnail scaled from the original image. Finally, the function persists the skills cards within Box using the skillswriter API.
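The Vision API call at the heart of this flow looks roughly like the following Python sketch, written against the pre-2.0 google-cloud-vision client; the product set path and category are placeholders, and the actual sample in the repository is built on the Box Skills Kit:

```python
from google.cloud import vision

def find_similar_products(image_bytes):
    client = vision.ImageAnnotatorClient()
    image = vision.types.Image(content=image_bytes)  # decoded bytes from Box
    image_context = vision.types.ImageContext(
        product_search_params=vision.types.ProductSearchParams(
            product_set="projects/my-project/locations/us-west1/productSets/catalog",
            product_categories=["homegoods-v2"],  # also: apparel-v2, toys-v2
        ))
    response = client.product_search(image=image, image_context=image_context)
    # Each result carries a matched catalog product and a similarity score.
    return [(r.product.display_name, r.score)
            for r in response.product_search_results.results]
```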
Now, you can open the image in Box drive and click on the "skills" menu (which expands when you click the "magic wand" icon on the right), and you'll see product catalog information, with similar products and the maximum price populated.

What's next?

Over the past several years, Google Cloud and Box have built a variety of tools to make end users more productive. Today, the Box Skills integration opens the door to a whole new world of advanced AI tools and services: in addition to accessing pre-trained models via the Vision API, Video Intelligence API or Speech-to-Text API, data scientists can train and host custom models written in TensorFlow, scikit-learn, Keras, or PyTorch on Cloud ML Engine. Lastly, Cloud AutoML lets you train a model on your dataset without having to write any code. Whatever your level of comfort with code or data science, we're committed to making it easy for you to run machine learning-enhanced annotations on your data.

You can find all the code discussed in this post and its associated documentation in its GitHub repository. Goodbye, tedious repetition! Hello, productivity.
Source: Google Cloud Platform

AI in Depth: serving a Keras text classifier with preprocessing using Cloud ML Engine

Cloud ML Engine now supports deploying your trained model with custom online prediction Python code, now in beta. In this blog post, we show how custom online prediction code helps maintain affinity between your preprocessing logic and your model, which is crucial to avoid training-serving skew. We show an example of building a Keras text classifier and deploying it for online serving in Cloud ML Engine, along with its text preprocessing components.

Cloud ML Engine pre-processing, training, and classification diagram

Background

The hard work of building a machine learning (ML) model pays off only when you deploy the model and use it in production—when you integrate it into your pre-existing systems or incorporate your model into a novel application. If your model has multiple possible consumers, you might want to deploy the model as an independent, coherent microservice that is invoked via a REST API and can automatically scale to meet demand. Although Cloud ML Engine may be better known for its training abilities, it can also serve TensorFlow, Keras, scikit-learn, and XGBoost models with REST endpoints for online prediction.

While training a model, it's common to transform the input data into a format that improves model performance. But when performing predictions, the model expects the input data to already exist in that transformed form. For example, the model might expect a normalized numerical feature, a TF-IDF encoding of terms in text, or a constructed feature based on a complex, custom transformation. However, the callers of your model will send "raw", untransformed data, and the caller doesn't (or shouldn't) need to know which transformations are required. This means the model microservice will be responsible for applying the required transformations to the data before invoking the model for prediction.

The affinity between the preprocessing routines and the model (i.e., having both of them coupled in the same service) is crucial to avoid training-serving skew, since you'll want to ensure that these routines are applied to any data sent to the model, with no assumptions about how the callers prepare the data. Moreover, the model-preprocessing affinity helps to decouple the model from the caller: if a new model version requires new transformations, these preprocessing routines can change independently of the caller, as the caller will keep on sending data in its raw format.

Besides preprocessing, your deployed model's microservice might also perform other operations, including postprocessing of the prediction produced by the model, or even more complex prediction routines that combine the predictions of multiple models.

To help maintain this affinity between training and serving, Cloud ML Engine now enables users to customize the prediction routine that gets called when sending prediction requests to a model deployed on Cloud ML Engine. This feature allows you to upload a custom model prediction class, along with your exported model, to apply custom logic before or after invoking the model for prediction. Customizing prediction routines can be useful in the following scenarios:

- Applying (state-dependent) preprocessing logic to transform the incoming data points before invoking the model for prediction.
- Applying (state-dependent) post-processing logic to the model's prediction before sending the response to the caller. For example, you might want to convert the class probabilities produced by the model to a class label.
- Integrating rule-based and heuristics-based prediction with model-based prediction.
- Applying a custom transform used in fitting a scikit-learn pipeline.
- Performing complex prediction routines based on multiple models, that is, aggregating predictions from an ensemble of estimators, or calling one model based on the output of the previous model in a hierarchical fashion.
These tasks can be accomplished with custom online prediction, using the standard frameworks supported by Cloud ML Engine, as well as with any model developed in your favorite Python-based framework, including PyTorch. All you need to do is include the dependency libraries in the setup.py of your custom model package (as discussed below). Note that without this feature, you would need to implement the preprocessing, post-processing, or any custom prediction logic in a "wrapper" service, using, for example, App Engine. This App Engine service would also be responsible for calling the Cloud ML Engine models, but this approach adds complexity to the prediction system, as well as latency to prediction time.

Next we'll demonstrate how we built a microservice that can handle both preprocessing and post-processing using Cloud ML Engine custom online prediction, with text classification as the example. We chose to implement the text preprocessing logic and build the classifier using Keras, but thanks to Cloud ML Engine custom online prediction, you could implement the preprocessing using other libraries (like NLTK or scikit-learn), and build the model using any other Python-based ML framework (like TensorFlow or PyTorch). You can find the code for this example in this Colab notebook.

A text classification example

Text classification algorithms are at the heart of a variety of software systems that process text data at scale. The objective is to classify (categorize) text into a set of predefined classes, based on the text's content. This text can be a tweet, a web page, a blog post, user feedback, or an email: in the context of text-oriented ML models, a single text entry (like a tweet) is usually referred to as a "document."

Common use cases of text classification include:

- Spam filtering: classifying an email as spam or not.
- Sentiment analysis: identifying the polarity of a given text, such as tweets or product and service reviews.
- Document categorization: identifying the topic of a given document (for example, politics, sports, finance, etc.).
- Ticket routing: identifying the department to which a ticket should be dispatched.

You can design your text classification model in two different ways; choosing one versus the other will influence how you'll need to prepare your data before training the model.

- N-gram models: In this option, the model treats a document as a "bag of words," or more precisely, a "bag of terms," where a term can be one word (uni-gram), two words (bi-gram) or n words (n-gram). The ordering of the words in the document is not relevant. The feature vector representing a document encodes whether a term occurs in the document (binary encoding), how many times the term occurs in the document (count encoding) or, more commonly, Term Frequency Inverse Document Frequency (TF-IDF encoding). Gradient-boosted trees and Support Vector Machines are typical techniques used with n-gram models.
- Sequence models: With this option, the text is treated as a sequence of words or terms; that is, the model uses the word-ordering information to make the prediction. Types of sequence models include convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variations.

In our example, we utilize the sequence model approach.

Hacker News is one of many public datasets available in BigQuery. This dataset includes titles of articles from several data sources. For the following tutorial, we extracted the titles that belong to either GitHub, The New York Times, or TechCrunch, and saved them as CSV files in a publicly shared Cloud Storage bucket at the following location:

gs://cloud-training-demos/blogs/CMLE_custom_prediction

Here are some useful statistics about this dataset:

- Total number of records: 96,203
- Min, max, and average number of words per title: 1, 52, and 8.7
- Number of records from GitHub, The New York Times, and TechCrunch: 36,525, 28,787, and 30,891
- Training and evaluation percentages: 75% and 25%

The objective of the tutorial is to build a text classification model, using Keras to identify the source of an article given its title, and to deploy the model to Cloud ML Engine using custom online prediction, to be able to perform text preprocessing and prediction post-processing.

Preprocessing text

Sequence tokenization with Keras

In this example, we perform the following preprocessing steps:

- Tokenization: Divide the documents into words. This step determines the "vocabulary" of the dataset (the set of unique tokens present in the data). In this example, you'll make use of the 20,000 most frequent words, and discard the others from the vocabulary. This value is set through the VOCAB_SIZE parameter.
- Vectorization: Define a good numerical measure to characterize these documents. A given embedding's representation of the tokens (words) will be helpful when you're ready to train your sequence model. However, these embeddings are created as part of the model, rather than as a preprocessing step. Thus, what you need here is simply to convert each token to a numerical indicator. That is, each article's title is represented as a sequence of integers, each an indicator of a token in the vocabulary that occurred in the title.
- Length fixing: After vectorization, you have a set of variable-length sequences. In this step, the sequences are converted to a single fixed length: 50. This can be configured using the MAX_SEQUENCE_LENGTH parameter. Sequences with more than 50 tokens will be right-trimmed, while sequences with fewer than 50 tokens will be left-padded with zeros.

Both the tokenization and vectorization steps are considered stateful transformations. In other words, you extract the vocabulary from the training data (after tokenization, keeping the most frequent words), and create a word-to-indicator lookup, for vectorization, based on the vocabulary. This lookup will be used to vectorize new titles for prediction. Thus, after creating the lookup, you need to save it in order to (re-)use it when serving the model.

The text preprocessing code lives in the TextPreprocessor class in the preprocess.py module, which includes two methods: fit(), applied to the training data to generate the lookup (the tokenizer, stored as an attribute of the object), and transform(), which applies the tokenizer to any text data to generate the fixed-length sequences of word indicators.
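A minimal sketch of that class, assuming tf.keras's text and sequence preprocessing utilities (the full version lives in preprocess.py):

```python
from tensorflow.keras.preprocessing import sequence, text

VOCAB_SIZE = 20000        # keep the 20,000 most frequent words
MAX_SEQUENCE_LENGTH = 50  # pad/trim every title to 50 tokens

class TextPreprocessor(object):
    def __init__(self, vocab_size=VOCAB_SIZE, max_length=MAX_SEQUENCE_LENGTH):
        self._vocab_size = vocab_size
        self._max_length = max_length
        self._tokenizer = None

    def fit(self, text_list):
        # Build the word-to-indicator lookup from the training titles.
        tokenizer = text.Tokenizer(num_words=self._vocab_size)
        tokenizer.fit_on_texts(text_list)
        self._tokenizer = tokenizer  # stored so the fitted state can be pickled

    def transform(self, text_list):
        # Vectorize, then left-pad (or right-trim) to the fixed length.
        sequences = self._tokenizer.texts_to_sequences(text_list)
        return sequence.pad_sequences(sequences, maxlen=self._max_length)
```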
Preparing training and evaluation data

The next step is to prepare the training and evaluation data: each raw text title is converted to a NumPy array of 50 numeric indicators. Note that you use both fit() and transform() on the training data, while you use only transform() on the evaluation data, in order to reuse the tokenizer generated from the training data. The outputs, train_texts_vectorized and eval_texts_vectorized, are used to train and evaluate our text classification model, respectively.

Next, save the processor object (which includes the tokenizer generated from the training data) to be used when serving the model for prediction; in our example, the object is pickled to a processor_state.pkl file.

Training a Keras model

Our model is a Sequential Keras model, with an Embedding layer and a Dropout layer, followed by two Conv1D-plus-pooling blocks, and finally a Dense layer with softmax activation. The model is compiled with the sparse_categorical_crossentropy loss and the acc (accuracy) evaluation metric. We create the model by calling a create_model method with the required parameters, train it on the training data, evaluate the trained model's quality on the evaluation data, and save the trained model to a keras_saved_model.h5 file.

Implementing a custom model prediction class

In order to apply a custom prediction routine that includes preprocessing and postprocessing, you need to wrap this logic in a custom model prediction class. This class, along with the trained model and the saved preprocessing object, is used to deploy the Cloud ML Engine online prediction microservice. The custom model prediction class (CustomModelPrediction) for our text classification example is implemented in the model_prediction.py module.
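Here is a minimal sketch of that class; the full implementation in the accompanying Colab notebook is authoritative:

```python
import pickle
from tensorflow.keras.models import load_model

class CustomModelPrediction(object):
    def __init__(self, model, processor):
        self._model = model          # trained Keras model
        self._processor = processor  # fitted TextPreprocessor (stateful)

    def _postprocess(self, probabilities):
        labels = ['github', 'nytimes', 'techcrunch']
        # Convert each class-probability vector to a human-readable label.
        return [labels[probs.argmax()] for probs in probabilities]

    def predict(self, instances, **kwargs):
        # instances is the list of raw titles sent by the caller.
        preprocessed = self._processor.transform(instances)
        probabilities = self._model.predict(preprocessed)
        return self._postprocess(probabilities)

    @classmethod
    def from_path(cls, model_dir):
        # Load both the model and the preprocessing state saved at training.
        model = load_model('{}/keras_saved_model.h5'.format(model_dir))
        with open('{}/processor_state.pkl'.format(model_dir), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)
```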
Note the following points in the custom model prediction class implementation:

- from_path is a "classmethod", responsible for loading both the model and the preprocessing object from their saved files, and for instantiating a new CustomModelPrediction object with the loaded model and processor (which are both stored as attributes of the object).
- predict is the method invoked when you call the "predict" API of the deployed Cloud ML Engine model. The method receives the instances (the list of titles) for which predictions are needed, prepares the text data by applying the transform() method of the "stateful" self._processor object, calls self._model.predict() to produce the predicted class probabilities for the prepared text, and postprocesses the output by calling the _postprocess method.
- _postprocess is the method that receives the class probabilities produced by the model, picks the label index with the highest probability, and converts this label index to a human-readable label: 'github', 'nytimes', or 'techcrunch'.

Deploying to Cloud ML Engine

Figure 1 shows an overview of how to deploy the model, along with the artifacts required for a custom prediction routine, to Cloud ML Engine.

Uploading the artifacts to Cloud Storage

The first thing you need to do is upload your artifacts to Cloud Storage. First, upload:

- Your saved (trained) model file, keras_saved_model.h5 (see the "Training a Keras model" section).
- Your pickled (serialized) preprocessing object, which contains the state needed for data transformation prior to prediction: processor_state.pkl (see the "Preprocessing text" section). Remember, this object includes the tokenizer generated from the training data.

Second, upload a Python package including all the classes you need for prediction (e.g., preprocessing, model classes, and post-processing). In this example, you need to create a pip-installable tar with model_prediction.py and preprocess.py. Start by creating a setup.py file, then generate the package by running python setup.py sdist. This creates a .tar.gz package under a new /dist directory in your working directory. The name of the package will be $name-$version.tar.gz, where $name and $version are the ones specified in setup.py. Once you have successfully created the package, upload it to Cloud Storage.

Deploying the model to Cloud ML Engine

Let's define the model name, the model version, and the Cloud ML Engine runtime (which corresponds to a TensorFlow version) required to deploy the model. First, create a Cloud ML Engine model resource using gcloud. Second, create a model version with a gcloud command in which you specify the location of the model and preprocessing object (--origin), the location of the package(s) including the scripts needed for your prediction (--package-uris), and a pointer to your custom model prediction class (--model-class). This should take one to two minutes.
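A sketch of these deployment steps follows; the bucket, package, and runtime names are illustrative, and because custom prediction code was in beta, the exact flag names (for example, --model-class) may vary across gcloud releases:

```bash
MODEL_NAME="news_classifier"
VERSION_NAME="v1"
BUCKET="gs://my-bucket/custom_prediction"

# Package the prediction code and upload all artifacts.
python setup.py sdist
gsutil cp dist/text_classification-0.1.tar.gz ${BUCKET}/
gsutil cp keras_saved_model.h5 processor_state.pkl ${BUCKET}/model/

# Create the model resource, then a version that points at the artifacts
# and at the custom prediction class.
gcloud ml-engine models create ${MODEL_NAME} --regions us-central1
gcloud beta ml-engine versions create ${VERSION_NAME} \
    --model ${MODEL_NAME} \
    --origin ${BUCKET}/model/ \
    --package-uris ${BUCKET}/text_classification-0.1.tar.gz \
    --model-class model_prediction.CustomModelPrediction \
    --runtime-version 1.10
```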
Calling the deployed model for online predictions

After deploying the model to Cloud ML Engine, you can invoke it for prediction through the Cloud ML Engine predict API. Given the titles defined in the request object, the predicted source of each title from the deployed model would be as follows: ['techcrunch', 'techcrunch', 'techcrunch', 'nytimes', 'nytimes', 'nytimes', 'github', 'github', 'techcrunch']. Note that the last one was misclassified by the model.

Conclusion

In this tutorial, we built and trained a text classification model using Keras to predict the source media of a given article. The model required text preprocessing operations both for preparing the training data and for preparing the incoming requests to the model deployed for online predictions. We then showed how you can deploy the model to Cloud ML Engine with custom online prediction code, in order to apply preprocessing to incoming prediction requests and post-processing to the prediction outputs. Enabling a custom online prediction routine in Cloud ML Engine allows for affinity between the preprocessing logic, the model, and the post-processing logic required to handle a prediction request end to end, which helps to avoid training-serving skew and simplifies deploying ML models for online prediction.

Thanks for following along. If you're curious to try out some other machine learning tasks on GCP, take this specialization on Coursera. If you want to try out these examples for yourself in a local environment, run this notebook on Colab. Send a tweet to @GCPcloud if there's anything we can change or add to make text analysis even easier on Google Cloud.

Source: Google Cloud Platform

From CCIE to Google Cloud Network Engineer: four things to think about

To stay relevant and wanted in the high-tech job market, it's important to keep abreast of new technologies—and get certified in them! Google Cloud offers a number of professional certifications, including the new Professional Cloud Network Engineer, currently in beta. Certifications such as this can make you a valuable asset in a multi-cloud world.

If you're coming from a traditional on-premises IT environment, there are some things that are helpful to know up front when studying for the Cloud Network Engineer certification. Personally, I spent nearly two decades working in mainstream IT operations settings before making the switch to cloud. As a former Cisco Certified Internetwork Expert (CCIE), I've had to let go of the past and open up to seeing and learning new things in a slightly different way. Here are some things to understand before you start studying. The sooner you see the difference between networking in the cloud and on-prem, the more successful you'll be.

1. Focus on workflows, not packets.

Figure 1 is a common network diagram that shows the data flow between two endpoints over a simple network. Data originates in applications on Endpoint 1 and flows up and down the TCP/IP network stack across the devices in the network, until it finally reaches the applications on Endpoint 2. Before a large chunk of data is sent out of Endpoint 1, it is sliced up into smaller pieces. Protocol headers are then prepended to these pieces before they are sent out onto the wire as packets. These packets, and their associated headers, are the atomic unit of the networking world.

Figure 1. Packetized data flow through network.

As a network engineer, though, you typically focus on the network devices in between the endpoints, not the endpoints themselves. As you can see in Router-1, the majority of traffic flows through the router; it comes in one interface (the so-called "goes-inta" interface) and passes out the "goes-outta" interface. Only a relatively small amount of traffic is destined for the router itself. Data destined for the network device includes control-plane communications, management traffic, and malicious attacks. This "through vs. to" traffic balance is common across all networking devices (switches, routers, firewalls, and load balancers) and results in a "goes-inta/goes-outta" view of the world as you configure, operate, and troubleshoot your network devices.

Once you step into the cloud engineer role, the atomic unit changes. Packets and headers are replaced with workflows and their associated datasets. Figure 2 shows this conceptual change through a typical three-tier web deployment. The idea of the network as you knew it is abstracted and distributed. The traffic pattern now inverts, with the majority of traffic either sourced from or destined for a cloud service or application that resides on a cloud resource, rather than the network devices between them.

Figure 2. Cloud-based three-tier web deployment.

You can see this when you look at how to configure the firewall rule named http-inbound. Even though you configure the rule in relation to the VPC, you now have to identify a target using either the --target-tags or the --target-service-accounts=[IAM Service Account] gcloud arguments. In addition, depending on the ingress or egress direction of the traffic, you configure only a source or a destination filter, not both, because the other half of the information is considered to be the target itself. In other words, the focus is on the data that enters and leaves the specific cloud service.
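For instance, an ingress rule like http-inbound might look like the following sketch; the network and tag names are illustrative:

```bash
gcloud compute firewall-rules create http-inbound \
    --network=my-vpc \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=web-server   # the target, not a destination filter
```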
2. Realize your building blocks have changed.

As you move from on-premises to the cloud, don't get hung up trying to fit all the networking details you already know into the new solutions you are learning. Remember that your new goal is to enable workflows.

In the old networking world there were tangible building blocks such as switches, routers, firewalls, load balancers, cables, racks, power outlets, and BTU calculations. The intangible building blocks were features and methods defined by IETF RFCs and vendor-proprietary protocols, with their ordered steps, finite-state machines, data structures, and timers. You physically assembled all these things to build interconnectivity between the end users and the applications they used to make your business run. Implementing all this took days and weeks. In addition, as the network grew, the associated management effort and cost to operate it grew disproportionately for your business.

Cloud solutions treat this complexity as a software problem and add a layer of abstraction between end users and workloads, removing or hiding many of the complex details associated with the old building blocks. Your new building blocks are cloud-based services and features like Google Compute Engine, Cloud SQL, Cloud Functions, and Cloud Pub/Sub. You assemble these new resources based on your needs to provide IaaS, PaaS, SaaS, and FaaS solutions. Your deployment schedule shrinks from days and weeks to seconds and minutes as you connect your enterprise network via Cloud VPN or Cloud Interconnect and deploy VPCs, Cloud Load Balancing, and Docker containers with Google Kubernetes Engine. You minimize management complexity through tools like Deployment Manager, Stackdriver, and Google Cloud's pricing tools. You no longer simply build connectivity between endpoints; rather, you enable virtualized environments by treating infrastructure as code.

3. Understand the power of a global fiber network.

Many cloud providers' infrastructure is made up of large data center complexes in geographical regions across the globe, with each region subdivided into zones for service redundancy. Connectivity between these regions, for the most part, happens over the public internet.

Figure 3. A typical cloud provider's global infrastructure.

The benefit of this approach is that the internet provides ubiquitous connectivity. Looking at Figure 3, though, you can see that there are several downsides:

- Management complexity. As your cloud footprint grows and you need your "island" VPCs to communicate over various peering options across regions, you inherit additional configuration, operational, and troubleshooting complexity.
- Unpredictable performance. You have no control over jitter, delay, and packet loss in the public internet.
- Suboptimal routes. The number of hops your traffic must traverse across the internet is most likely not optimized for your business—you are at the mercy of network outages and carriers' BGP policies.
- Security risks. The internet is where the good people are (your customers), but it's also, unfortunately, where the bad people are. While you can encrypt your traffic in transit, you still run a risk when sending inter-region communications over the public internet.

Figure 4. Google's Premium Tier cloud infrastructure.

Google Cloud's Premium Network Service Tier, now generally available, changes the game. As shown in Figure 4, the public internet sits outside of your global VPC. The core of your network is now Google's own private fiber network.
3. Understand the power of a global fiber network.

Many cloud providers’ infrastructure is made up of large data center complexes in geographical regions across the globe, with each region subdivided into zones for service redundancy. Connectivity between these regions, for the most part, happens over the public internet.

Figure 3. A typical cloud provider’s global infrastructure.

The benefit of this approach is that the internet provides ubiquitous connectivity. Looking at Figure 3, though, you can see that there are several downsides:

Management complexity. As your cloud footprint grows and you need your “island” VPCs to communicate over various peering options across regions, you inherit additional configuration, operational, and troubleshooting complexity.

Unpredictable performance. You have no control over jitter, delay, and packet loss in the public internet.

Suboptimal routes. The number of hops your traffic must traverse across the internet is most likely not optimized for your business—you are at the mercy of network outages and carriers’ BGP policies.

Security risks. The internet is where the good people are (your customers), but it’s also unfortunately where the bad people are. While you can encrypt your traffic in transit, you still run a risk when sending inter-region communications over the public internet.

Figure 4. Google’s Premium Tier cloud infrastructure.

Google Cloud’s Premium Network Service Tier, now generally available, changes the game. As shown in Figure 4, the public internet sits outside of your global VPC. The core of your network is now Google’s own private fiber network. This improves your situation in several ways:

You no longer have a cloud footprint made up of isolated geographic VPC islands—your infrastructure is one large homogeneous cloud network. This network can be regional to start and grow to a global footprint when you are ready, with minimal headache.

The issues of packet loss, delay, and jitter are mitigated significantly as compared to the public internet.

The number of hops between endpoints is significantly reduced. Once your traffic enters the Google network, it rides an optimal path rather than passing through various internet carrier networks.

By utilizing global load balancing and anycast addresses, traffic hops onto and off of Google’s network at the closest point to your end users.

Inter-region and private-access traffic is automatically encrypted, transparently to the application, and sent across the private fiber backbone. Because it doesn’t ride over the internet, that traffic is never exposed to the bad guys.

Of course, if lower bandwidth costs are more compelling than these advantages, Google Cloud also offers a Standard Networking Tier that routes traffic over the public internet for a lower price point.

4. Embrace the flexibility of the API, Client Libraries, SDK, and Console.

Sure, some networking devices have GUI-based management programs or web consoles, but if you’re like me, you’ve probably spent most of your career in the CLI of a networking device. This is because GUIs tend to make the basic stuff easy and CLIs make the hard stuff possible—they’re your go-to place for configuration, operation, and troubleshooting.

CLIs do have their limitations, though. If you want new features, you have to upgrade software, and before you upgrade, you have to test. That takes time and it’s expensive. If the CLI’s command structure or output changes, your existing automation and scripting breaks. In addition, in large networks with literally hundreds or thousands of devices, lack of software version consistency can be a management nightmare. Yes, you have SNMP, and where SNMP fails, XML schemas and NETCONF/YANG models step in to evolve things in the right direction. All this said, it’s a far cry from the programmatic access you are given once you step into the cloud.

Figure 5. Cloud API, Client Libraries, SDKs, and Console.

From a configuration, operation, and troubleshooting standpoint, the cloud has a lot of roads to the proverbial top of Mount Fuji. Figure 5 shows the different paths available. You are free to choose the one that best maps to your skill level and is most appropriate for the task at hand. While Google Cloud has a CLI-based SDK for shell scripting or interactive terminal sessions, you don’t have to use it. If you are developing an application or prefer a programmatic approach, you can use one of many client libraries that expose a wealth of functionality. If you’re an experienced programmer with specific needs, you can even write directly to the REST API itself. And of course, on the other end of the spectrum, if you are learning or prefer a visual approach, there’s always the console.

In addition to the tools above, if you need to create larger topologies on a regular cadence, you may want to look at Google’s Cloud Deployment Manager. If you want a vendor-agnostic tool that works across cloud providers, you can investigate the open-source program Terraform. Both solutions offer a jump from imperative to declarative infrastructure programming, which may be a good fit if you need a more consistent workflow across developers and operators as they provision resources. The configuration sketch below shows the declarative style in miniature.
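As a hedged illustration, here is what the earlier imperative VPC-and-subnet example might look like declared as a Cloud Deployment Manager configuration; the resource names are the same hypothetical placeholders as before:

# Hypothetical sketch: the same VPC and subnet as a declarative
# Deployment Manager configuration (e.g., saved as network.yaml)
resources:
- name: my-vpc
  type: compute.v1.network
  properties:
    autoCreateSubnetworks: false
- name: web-subnet
  type: compute.v1.subnetwork
  properties:
    network: $(ref.my-vpc.selfLink)
    region: us-central1
    ipCidrRange: 10.0.1.0/24

You would deploy it with gcloud deployment-manager deployments create my-network --config network.yaml. The shift in mindset is that you declare the desired end state and let the tooling converge on it, rather than issuing an ordered series of commands.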
Putting it all together

If this sounds like a lot, that’s because it is. Don’t despair, though; there’s a readily available resource that will really help you grok these foundational network concepts: the documentation.

You are most likely very familiar with the documentation sections of several network vendors’ websites. To get up to speed on networking on Google Cloud, your best bet is to familiarize yourself with Google’s documentation as well. There is documentation for high-level concepts like network tier levels, network peering, and hybrid connectivity. Each cloud service also has its own individual set of documentation, subdivided into concepts, how-tos, quotas, pricing, and other areas. Reviewing how it is structured and creating bookmarks will make studying and the certification process much easier. Better yet, it will also make you a better cloud engineer.

Finally, I want to challenge you to stretch beyond your comfort zone. Moving from networking to the cloud is about virtualization, automation, programming, and developing new areas of expertise. Your journey into the cloud should not stop at learning how GCP implements VPCs. Set long-term as well as short-term goals. There are so many new areas where your skill sets are needed and where you can provide value. You can do it; don’t doubt that for one minute.

In my next blog post, I’ll discuss an approach to structuring your cloud learning that will make the learning and certification process easier and prepare you for the Cloud Network Engineer role. Until then, the Google Cloud training team has lots of ways for you to increase your Google Cloud know-how. Join our webinar on preparing for the Professional Cloud Network Engineer certification exam on February 22, 2019 at 9:45am PST. Now go visit the Google certification page and set your first certification goal! Best of luck!
Quelle: Google Cloud Platform

Announcing Google Cloud Security Talks during RSA Conference 2019

Going to RSA Conference in San Francisco next month? In addition to keynote sessions, we’re hosting the third edition of Google Cloud Security Talks at Bespoke in Westfield San Francisco Centre, a five-minute walk from Moscone Center.

This series of 20 talks over two days will cover Google Cloud’s security products and capabilities, our 2019 vision and roadmap, and insights from our upcoming security report. The majority of the sessions will be led by Googlers, including Panos Mavrommatis, Engineering Director for Safe Browsing, and Eugene Liderman, Director for Android Security Strategy. You’ll also get to hear from security partners running workloads on GCP, including Palo Alto Networks, and from customers about how security is a differentiator for Google Cloud. You can view the full agenda below, and feel free to register for the event on our website.

In addition to presentations and panels, we’ll feature several interactive demos that showcase how Google prevents phishing and ransomware attacks and how partners integrate with our services.

Finally, various Google security experts will be speaking at the RSA Conference itself, as well as at additional parallel events throughout the week:

RSA CONFERENCE | Moscone Center

What Should a US Federal Privacy Law Look Like? [PRV-T09]
Tuesday, March 5 | 3:40 PM – 4:30 PM
Keith Enright, Chief Privacy Officer, Google

Attacking Machine Learning: On the Security & Privacy of Neural Networks [MLAI-W03]
Wednesday, March 6 | 9:20 AM – 10:10 AM
Nicholas Carlini, Research Scientist, Google

First Steps in RF: Lessons Learned [SBX3-W2]
Wednesday, March 6 | 1:50 PM – 2:50 PM
Dave Weinstein, Android Security Assurance Engineering Manager, Google

Kubernetes Runtime Security: What Happens If a Container Goes Bad? [CSV-R02]
Thursday, March 7 | 8:00 AM – 8:50 AM
Jen Tong, Security Advocate, Google Cloud

Anatomy of Phishing Campaigns: A Gmail Perspective [HT-R03]
Thursday, March 7 | 9:20 AM – 10:10 AM
Ali Zand, Software Engineer, Google & Nicolas Lidzborski, Senior Software Engineer, Google Cloud

Engineering Trust and Security in the Cloud Era, Based on Early Lessons [KEY-F03S]
Friday, March 8 | 11:10 AM – 12:00 PM
Suzanne Frey, Vice President, Engineering, Google Cloud; Quentin Hardy, Head of Editorial, Google Cloud & Amin Vahdat, Google Fellow and Networking Technical Lead, Google

THE CYBER RISK FORUM | The Fairmont, 950 Mason St, San Francisco

The Human Factor: How CEOs and Boards Can Ensure Your Employees are an Asset not a Liability in the War on Cyber
Monday, March 4 | 11:00 AM – 11:50 AM
Sam Srinivas, Director of Product Management, Google Cloud

BSides SF | City View at Metreon, 135 4th St #4000, San Francisco

You Might Still Need Patches for Your Denim, but You No Longer Need Them for Prod
Monday, March 4 | 3:30 PM – 4:00 PM
Maya Kaczorowski, Product Manager, Google Cloud & Dan Lorenc, Software Engineer, Google Cloud

Do Androids Dream of Electric Fences?: Defending Android in the Enterprise
Monday, March 4 | 4:50 PM – 5:20 PM
Brandon Weeks, Security Engineer, Google Cloud

At Google Cloud, we work hard to protect your underlying infrastructure end to end and give you control over your own data, while helping you comply with industry regulations, standards, and frameworks. We look forward to showing you how during RSA Conference next month!
Quelle: Google Cloud Platform

A little light reading: the latest on technology from around the Google-verse

As we collectively dive into 2019, we’ve already come across some great cloud-related reads from around the broader Google world. Here are a few stories to help you stay informed—and get inspired—about interesting technologies, projects and initiatives on everything from application development and Knative to renewable energy and AI research.

Brighten up your day with renewables news

A 40,000-strong solar panel farm in Taiwan is now part of our plan to meet our renewable energy goals. This is our first renewable energy project in Asia, and we’ll purchase 100% of the output of a 10-megawatt solar array. Located about 100 kilometers from our data center on the west side of Taiwan, the farm’s solar panels will be mounted several feet in the air above commercial fishing ponds. Take a peek at the solar farm site here.

See Cloud Functions through Firebase eyes

Here’s a look, with plenty of visuals, at how one mobile developer uses Firebase for app dev, along with Cloud Functions as a back end. Since Firebase is a Google product, it integrates with other Google products, so you can access Cloud Functions from within Firebase’s console, or vice versa. You may find yourself switching between the consoles during development, as well as writing and deploying code via either the Firebase CLI or the Cloud CLI (known as gcloud). Read the nitty-gritty details here.

Go back to school at the library

Grow with Google is a digital skills training and education program for students, teachers, job seekers, startups and others. It’s spreading its wings with a big push to bring in-person Grow with Google workshops to libraries in every U.S. state in 2019. The workshops and tools are all free, in true library spirit, and there’s also room for creative new ideas through a grant to the American Library Association. Check out the workshop list.

See what researchers accomplished in 2018

The accomplishments of Google researchers last year are inspiring, from creating assistive techniques in email to predicting earthquake aftershocks. It’s worth scrolling down for some great details about how quantum computing is developing, as well as how computational photography powers Night Sight in Pixel phones. Do your own investigation into this post.

Try a little light listening: All about Knative

Though this podcast doesn’t exactly count as “reading,” it’s a great primer for understanding Knative, which simplifies Kubernetes for developing serverless apps. Knative came out of the idea that developers don’t necessarily need to see every detail of Kubernetes tooling to use it effectively. Knative is a higher level of abstraction focused on making the Kubernetes experience easier for developers, and the podcast covers features like eventing and scale-to-zero capabilities. Plus, you’ll learn how to pronounce “Knative.” Hear from the developers on Knative details, and join the Knative community on GitHub.

That’s a wrap for January! What have you been reading lately? Tell us your recommendations here.
Quelle: Google Cloud Platform

Transforming healthcare in the cloud through data, analytics, and machine learning

From electronic health records to medical imaging, healthcare is an industry with an unprecedented amount of data. At Google Cloud, we want to help more healthcare organizations turn this data into health breakthroughs, through better care and more streamlined operations. Over the past year, we’ve enhanced Google Cloud offerings with healthcare in mind, expanded our compliance coverage, and welcomed new customers and partners. Here’s a look at a few milestones along the way.

Welcoming new healthcare customers to Google Cloud

The challenges of healthcare are increasingly data challenges—creating data, storing it, and analyzing it to find meaningful insights. This year we welcomed many new healthcare customers to Google Cloud, and we’re continually inspired by how these customers use data to benefit both patients and providers. Here are a few examples:

National Institutes of Health (NIH) is bringing the power of Google Cloud to biomedical research as part of its STRIDES (Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability) Initiative. As NIH’s first industry partner on this initiative, Google Cloud has made some of the most important NIH-funded datasets available to users with appropriate privacy controls and has helped simplify access to these datasets.

The BARDA DRIVe Solving Sepsis initiative is partnering with a research consortium consisting of Emory University School of Medicine, Massachusetts General Hospital (MGH), University of California San Diego (UCSD) School of Medicine, and Atlanta’s Grady Health System to leverage Google Cloud to develop interoperable learning software for early prediction of sepsis in hospital intensive care units. Now DRIVe can help develop and implement that platform to reduce the approximately 270,000 deaths from sepsis in the United States each year.

Hurley Medical Center is increasing operational efficiency, reducing costs, and improving patient outcomes by moving to G Suite from on-premises productivity software and email. Moving to G Suite has saved the organization $150,000 in annual software costs.

Hunterdon Healthcare uses G Suite to improve collaboration and efficiency, reclaiming 30% of caregivers’ time for patient interactions while reducing costs by $1.3 million over three years.

Imagia is leveraging GCP in its mission to help predict patient outcomes and detect disease-specific markers from imaging data. With GCP, the company has reduced test processing time from 16 hours to one hour and has improved time to discovery for researchers.

Wellframe uses GCP to power its platform that connects people and care teams, helping them build trusted relationships that drive early interventions. Automating care intelligence empowers Wellframe providers to scale care delivery and optimize care strategy, which has already resulted in an 80 percent increase in weekly patient care plan engagement.

We’re excited to see how these and other organizations in the healthcare space use data to solve their most pressing challenges.

Working with partners for better patient outcomes

Our Google Cloud partners play a critical role in helping healthcare providers and organizations embrace and evolve their cloud strategies.
Today, we are pleased to announce several new partnerships established to accelerate our commitment to data interoperability:

Our relationship with Health Level 7 (HL7), an international standards body for clinical data, builds upon our existing work with the FHIR Foundation to include the broader set of standards managed by the organization. Representatives from Google are joining the standards community.

By partnering with the SMART Advisory Council, a group designed to facilitate applications integrated directly into electronic health records, Google Cloud developers will be able to share feedback to improve the SMART specification and help maintain a robust set of tools for application designers, engineers, and users.

As a partner of Rock Health, an industry leader in digital health research and new-venture support, we will incorporate integration requirements from novel and fast-growing companies, share best practices for scalable and compliant product development around the world, and consult with the investors, industry executives, regulators, legislators, and academics shaping the future of digital health.

MITRE, a not-for-profit organization that operates federally funded research and development centers, is collaborating with Google Cloud to give developers access to SyntheticMass through the Cloud Healthcare API and Apigee Edge. SyntheticMass is a population-level, FHIR-formatted dataset that contains realistic but fictional residents of the state of Massachusetts. It statistically mirrors the real population in terms of demographics, disease burden, vaccinations, medical visits, and social determinants, which makes it a risk-free environment for experimenting with and exploring new healthcare solutions. SyntheticMass is generated by Synthea, an open-source synthetic patient generator that models the medical history of patients. The FHIR dataset will be made publicly available to developers soon; a sketch of what reading such data can look like follows this list.

We’ve also made great strides with other technology partners within the healthcare ecosystem:

Novo Nordisk selected the medical-grade BrightInsight platform, which is hosted on GCP, to build and operate digital health solutions for diabetes patients and to securely manage millions of its smart medical devices and the corresponding data within a regulatory-compliant environment.

Flywheel is integrating the Cloud Healthcare API, as well as BigQuery and AutoML Vision, with its platform to capture multi-modality images and data, boost the productivity of data classification, and securely collaborate with peers to manage analysis and metadata.

Life Image and the Athena Breast Health Network at the University of California selected Mammosphere on GCP for its breakthrough WISDOM Study to determine the optimal frequency and methods of breast cancer screening. Life Image is also using the Cloud Healthcare API to bridge the gap between care systems and applications built on Google Cloud.

Our partnership with Imprivata, the healthcare IT security company, makes it possible for Chrome devices to work seamlessly with Imprivata’s single sign-on and virtual desktop access platform for healthcare. This will enable secure mobile workstations and patient devices.

Elastifile launched Elastifile Cloud File Service, a fully managed file storage service. With scalable, high-performance, pay-as-you-go file storage at their fingertips, healthcare organizations can burst data-intensive NFS workloads to Google Cloud for accelerated processing.
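As promised above, here is a hedged sketch of retrieving FHIR resources through the Cloud Healthcare API. Treat it as illustrative only: the API’s version and surface have evolved since this post was written, and PROJECT_ID, DATASET_ID, FHIR_STORE_ID, and the family=Smith search parameter are placeholders rather than details of any partner’s deployment:

    # Hypothetical sketch: search Patient resources in a FHIR store
    # (PROJECT_ID, DATASET_ID, and FHIR_STORE_ID are placeholders)
    curl -X GET \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient?family=Smith"

The response is a standard FHIR Bundle in JSON, so the same tooling works against synthetic data like SyntheticMass and, with appropriate permissions, against real clinical stores.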
Unlocking the power of data with our products

At Google Cloud, we’re always looking to expand our healthcare product offerings—and help our customers do the same. Many organizations host datathon events as a way to collaboratively tackle data challenges and quickly iterate on new solutions or predictive models. To help, we’re announcing the Healthcare Datathon Launcher, which provides a secure computing environment for datathons. And if you want to learn how to do clinical analysis, the University of Colorado Anschutz Medical Campus has just launched a Clinical Data Science specialization on Coursera, with six online courses, giving you hands-on experience with Google Cloud.

Additionally, we’ve enhanced our healthcare offerings in numerous ways over the past year, including making radiology datasets publicly available to researchers through the Cloud Healthcare API and hosting over 70 public datasets from the Cancer Imaging Archive (TCIA) and NIH. With these datasets, researchers can quickly begin to test hypotheses and conduct experiments by running analytic workloads on GCP—without the need to worry about IT infrastructure and management.

Helping healthcare providers meet their security and compliance needs

Security and compliance are fundamental concerns for healthcare providers, and they are among Google Cloud’s topmost priorities. To date, more than three dozen Google Cloud Platform products and services enable HIPAA compliance, including Compute Engine, Cloud Storage, BigQuery, and, most recently, Apigee Edge and AutoML Natural Language. In addition, Google Cloud Platform and G Suite are HITRUST CSF certified. Google Cloud is also committed to supporting compliance with requirements such as the GDPR, PIPEDA, and more. We recently published a whitepaper on Handling Healthcare Data in the UK that provides an overview of NHS information governance requirements.

This week at HIMSS, in speaking sessions and at our booth (#2221), we’re highlighting the inspiring ways our customers and partners are using Google Cloud to positively transform healthcare. We’ll also be sharing more about the products and services we’ve designed specifically for our healthcare customers’ needs, with security and compliance top of mind. Finally, we are thrilled that Aashima Gupta, Google Cloud’s Director of Healthcare Solutions, has been recognized by HIMSS as one of 2019’s most influential women in health IT for her contributions in this space. Come see her and our many other speakers throughout the event.
Quelle: Google Cloud Platform