Coastal classifiers: using AutoML Vision to assess and track environmental change

Tracking changes in the coastline and its appearance is an effective means for many scientists to monitor both conservation efforts and the effects of climate change. That’s why the Harte Research Institute at TAMUCC (Texas A&M University – Corpus Christi) decided to use Google Cloud’s AutoML Vision classifiers to identify attributes in large data sets of coastline imagery, in this case of the coastline along the Gulf of Mexico. This post describes how AutoML’s UI helped TAMUCC’s researchers improve their model’s accuracy by making it much easier to build custom image classification models on their own image data. Of course, not every organization wants to analyze and classify aerial photography, but the techniques discussed in this post have much wider applications, for example industrial quality control and even endangered species detection. Perhaps your business has a use case that can benefit from AutoML Vision’s custom image classification capabilities.

The research problem: classification of shoreline imagery

The researchers at the Harte Research Institute set out to identify the types of shorelines within aerial imagery of the coast, in order to accurately predict the Environmental Sensitivity Index (ESI) of the shorelines displayed in the images. ESI values indicate how sensitive a section of shoreline would be to an oil spill.

Anthony Reisinger and his colleagues at the Harte Research Institute developed an Environmental Sensitivity Index map, for the State of Texas, of shorelines that may be impacted by oil spills. During this process, the team looked at oblique aerial photos and orthophotos similar to what one might find on Google Maps, and manually traced shorelines for the entire 8,950-mile length of the Texas coast (see below). After tracing the shoreline, the team coded it with ESI values that indicate how sensitive the shoreline is to oil.
These values were previously standardized by experts who had spent many years in the field scrutinizing coastal images.

Texas coast with cyan overlay of ESI shoreline

After an oil spill, the State of Texas uses these ESI shoreline classifications to send field crews to highly sensitive environments near the spill. The State then isolates sensitive habitats with floating booms (barriers that float on the water and extend below the surface) to minimize the oil’s impact on the environment and the animals that live there.

As you might imagine, learning how to identify the different types of environment classifications, and how sensitive these shorelines are to oil spills, takes years of first-hand experience, especially when imagery is only available at different scales and resolutions. Some of the team’s research over the years has utilized machine learning, so the researchers decided to see whether their expert knowledge could be transferred to a machine, automating the identification of the different types of ESI shorelines within the images and among the different types of imagery used.

Coastal environments can change rapidly due to natural processes as well as coastal development, so the state needs to update its ESI shoreline assessments periodically. The team currently plans to update the ESI shoreline data set for the entire Gulf Coast that lies within the State of Texas. During this process, new oblique imagery will be acquired to help identify the shorelines’ sensitivity to oil spills. With AutoML Vision, the team takes newly acquired oblique imagery and predicts the ESI values in the shoreline photos, thereby classifying (or coding) the new shoreline file it creates.

Imagery types

The team experimented with two different types of aerial shoreline images: oblique and orthorectified. For the orthorectified aerial photos, a grid was overlaid on the imagery and ESI shorelines, and both were extracted for each grid cell and joined together.
For the oblique shoreline photos, the team experimented with applying both single labels and multiple labels (multi-label classification). Details of this latter approach are described later in this post.

Rectified imagery overlaid with ESI shorelines, and the grid used to extract both imagery and shorelines.

But to begin, let’s take a look at the results on the oblique image set. In early experiments, precision and recall metrics of AutoML models trained on orthorectified aerial photos of different pixel resolutions and acquisition dates were compared to those of AutoML models trained on oblique photos. Interestingly, the team found that prediction accuracy was higher for the oblique imagery models than for the orthorectified imagery models. The oblique imagery models’ higher performance is likely due to the larger geographic coverage of the oblique imagery, and the inclusion of vertical information in these images.

Cloud Vision’s limitations for our use case

A little testing confirmed that the out-of-the-box Cloud Vision API wouldn’t help with this task: Cloud Vision can identify many image categories, but, unsurprisingly, the results proved too general for the team’s purposes.

Cloud Vision’s labels are too general for coastline classification.

The team then decided that the shoreline image dataset was a perfect fit for AutoML Vision, which let the team build its own domain-specific classifier.

Cloud AutoML Vision provides added flexibility

Cloud AutoML allows developers with limited machine learning expertise to train high-quality models specific to their data and business needs, by leveraging Google’s state-of-the-art transfer learning and Neural Architecture Search technology. Google’s suite of AutoML products currently includes Natural Language and Translate as well as Vision, all currently in beta.

By using AutoML Vision, the team was able to train custom image classification models with only a labeled dataset of images.
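As an illustration of what such a labeled dataset can look like: AutoML Vision takes a CSV file that pairs each Cloud Storage image URI with one or more labels. A minimal sketch in Python (the bucket path, file names, and label assignments here are hypothetical):

```python
import csv

# Hypothetical (image URI, labels) pairs. AutoML Vision's import CSV lists
# a Cloud Storage URI followed by one or more comma-separated labels; an
# image with several labels simply gets several label columns.
examples = [
    ("gs://my-bucket/shoreline_0001.jpg", ["salt_brackish_water_marshes"]),
    ("gs://my-bucket/shoreline_0002.jpg",
     ["gravel_shell_beaches", "salt_brackish_water_marshes"]),
]

with open("import.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for uri, labels in examples:
        # One row per image: URI first, then each label in its own column.
        writer.writerow([uri] + labels)
```

The resulting file is uploaded to Cloud Storage and referenced when creating the dataset in the AutoML UI.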
AutoML does all the rest for you: it trains advanced models using your data, lets you inspect your data and analyze the results via an intuitive UI, and provides an API for scalable serving.

The first AutoML experiment: single-labeled images

The team first experimented with a single-label version of the oblique image set, in which the label referred to a single primary shoreline type included in the image. The quality of this model was passable, but not as accurate as the team had hoped. To generate the image labels, the direction of the camera and the position of the aircraft were used to project a point to the closest shoreline, and each image was assigned a label based on both the camera’s heading and the proximity of the shoreline to the plane’s position.

Map showing the single-label method used to join the aircraft’s position with the nearest ESI shoreline from the image on the left. The image was taken from the aircraft’s position in the map. (Note: the projected point was assigned the value of the closest shoreline point to the plane/camera’s location; however, this photo contains multiple shoreline types.)

Precision and recall metrics across all labels, using the single-label dataset.

The AutoML UI allows easy visual inspection of your model’s training process and evaluation results, under the Evaluate tab. You can look at the metrics for all images, or focus on the results for a given label, including its true positives, false positives, and false negatives. Thumbnail renderings of the images allow you to quickly scan each category.

From inspection of the false positives, the team was able to determine that this first model often predicted coastline types that were actually present in the image, but did not match the single label. Below, you can see one example, in which the given coastline label was salt_brackish_water_marshes, but the model predicted gravel_shell_beaches with higher probability.
In fact, the image does show gravel shell beaches as well as marshes.

AutoML Vision correctly predicts that this image contains gravel shell beaches, even though it wasn’t labeled as such.

After examining these evaluation results, the team concluded that this data would be a better fit for multi-label classification, in which a given image can be labeled as containing more than one type of shoreline. (Similarly, you might want to apply multiple classes to your training data as well, depending on your use case.)

The second AutoML experiment: multi-labeled images

AutoML supports multi-labeled datasets, and enables you to train and use such models. With this capability in mind, the team generated a multi-label dataset from the original source images, and then ran a second set of experiments using the same images, tagged with multiple labels per image where possible. This dataset resulted in significantly more accurate models than those built using the single-label dataset.

Map illustrating the multi-label method of ESI shoreline label extraction, using the modeled field-of-view (FOV) of the camera and the image taken from the aircraft’s position on the map. (Note: this method allows the majority of the shorelines in the FOV to be assigned to the image.)

Precision and recall metrics across all labels, for a model trained on the multi-label dataset.

The following image is representative of how the multi-labeling helped: its labels include both gravel_shell_beaches and salt_brackish_water_marshes, and the model correctly predicts both.

This image’s multiple classifications were correctly predicted.

Viewing evaluation results and metrics

AutoML’s user interface (UI) makes it easy to view evaluation results and metrics. In addition to overall metrics, you can view how well the model performed with each label, including a display of representative true positives, false positives, and false negatives.
A slider lets you adjust the score threshold for classification (for all labels, or for just a single label) and then observe how the precision-recall tradeoff curve changes in response.

Often, classification accuracy is higher for some labels than others, especially if your dataset includes some bias. This information can be useful in determining whether you might increase model accuracy by sourcing additional images for some of your labels, then training further (or retraining).

Viewing evaluation results for the “gravel_shell_beaches” label.

Comparing models built using the same dataset

AutoML Vision allows you to indicate how much (initial1) compute time to devote to creating a model. As part of its experimentation, the team also compared two models: the first created using one hour of compute time, and the other using 24 hours. As expected, the latter model was significantly more accurate than the former. (This was the case for the single-label dataset as well.)

Viewing evaluation results for the “gravel_shell_beaches” label.

Using your models for prediction

AutoML Vision makes it easy to use your trained models for prediction. You can use the Predict tab in the UI to see visually how a model is doing on a few images. You can use the ‘export data’ feature in the top navigation bar to see which of your images were in which data set (training, validation, or test), to avoid using training images.

Predicting the classes of shoreline shown in a new image

Then, you can access your model via its REST API for scalable serving, either programmatically or from the command line. The Predict tab in the AutoML UI includes examples of how to do this.

What’s next

We hope this helps demonstrate how Cloud AutoML Vision can be used to accurately classify different types of shorelines in aerial images.
We plan to create an updated version of the ESI shoreline dataset in the future, and to use the AutoML model to predict shoreline types in newly acquired oblique photography and orthorectified imagery. AutoML will allow non-experts to assign ESI values to the shorelines we create.

Try it yourself

The datasets we used in these experiments are courtesy of the Harte Research Institute at Texas A&M University – Corpus Christi. You can use these datasets yourself. See this README for more information, and see this documentation page for permission details. Of course, you can use the same techniques to classify other types of geographical or geological features, or even entirely unrelated image categories. AutoML Vision lets you extend and retrain the models that back the Cloud Vision API with additional classes, on data from your organization’s use case.

Acknowledgements

Thanks to Philippe Tissot, Associate Director, Conrad Blucher Institute for Surveying and Science, Texas A&M University – Corpus Christi; James Gibeaut, Endowed Chair for Coastal and Marine Geospatial Sciences, Harte Research Institute, Texas A&M University – Corpus Christi; and Valliappa Lakshmanan, Tech Lead, Google Big Data and Machine Learning Professional Services, for their contributions to this work.

1. For options other than the default ‘1 hour’ of compute time, model training can be resumed later if desired. If you like, you can add additional images to the dataset before resumption.
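For readers who want to call a trained model programmatically, the request body for an AutoML Vision prediction can be sketched as follows. This is only a sketch: the project and model IDs are placeholders, the endpoint shown reflects the v1beta1 API, and the Predict tab in the UI remains the authoritative reference for the exact call.

```python
import base64

def build_predict_request(image_bytes: bytes) -> dict:
    """Build the JSON body for an AutoML Vision v1beta1 predict call.

    The image is sent base64-encoded inside the request payload."""
    return {
        "payload": {
            "image": {
                "imageBytes": base64.b64encode(image_bytes).decode("utf-8"),
            }
        }
    }

# Hypothetical identifiers; substitute your own project and model IDs.
PROJECT_ID = "my-project"
MODEL_ID = "ICN1234567890"
ENDPOINT = (
    "https://automl.googleapis.com/v1beta1/"
    f"projects/{PROJECT_ID}/locations/us-central1/models/{MODEL_ID}:predict"
)

# POST the body to ENDPOINT with an OAuth bearer token, for example:
#   requests.post(ENDPOINT, json=build_predict_request(img),
#                 headers={"Authorization": f"Bearer {token}"})
```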
Quelle: Google Cloud Platform

Announcing the general availability of Azure Data Box Disk

Since our preview announcement, hundreds of customers have been moving recurring workloads, media captures from automobiles, incremental transfers for ongoing backups, and archives from remote office/branch office (ROBO) locations to Microsoft Azure. We’re excited to announce the general availability of Azure Data Box Disk, an SSD-based solution for offline data transfer to Azure. Data Box Disk is now available in the US, EU, Canada, and Australia, with more countries/regions to be added over time. Also, be sure not to miss the announcement of the public preview for Blob Storage on Azure Data Box below!

Top three reasons customers use Data Box Disk

Easy to order and use: Each disk is an 8 TB SSD. You can easily order packs of up to five disks from the Azure portal, for a total capacity of 40 TB per order. The small form factor provides the right balance of capacity and portability to collect and transport data in a variety of use cases. Support is available for Windows and Linux.
Fast data transfer: These SSDs copy data at up to USB 3.1 speeds, and also support the SATA II and SATA III interfaces. Simply mount the disks as drives and use any tool of choice, such as Robocopy, or just drag and drop to copy files to the disks.
Security: The disks are encrypted using 128-bit AES encryption and can be locked with your custom passkeys. After the data upload to Azure is complete, the disks are wiped clean in accordance with NIST SP 800-88 r1 standards.
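As an illustration of the copy step above (Robocopy or drag-and-drop work just as well; the local source and Data Box mount paths below are hypothetical), a small script can copy a directory tree to the mounted disk and verify each file’s checksum along the way:

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    """Hash a file in chunks so large media files don't exhaust memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def copy_and_verify(src_root: Path, dst_root: Path) -> int:
    """Copy every file under src_root to dst_root, verifying checksums."""
    copied = 0
    for src in src_root.rglob("*"):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves timestamps as well
        if sha256(src) != sha256(dst):
            raise IOError(f"checksum mismatch for {src}")
        copied += 1
    return copied

# Hypothetical paths: local data and the mounted Data Box Disk drive.
# copy_and_verify(Path("/data/backups"), Path("/mnt/databox"))
```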

Get started now

Data Box Disk is currently available in the US, EU, Australia, and Canada, and we will continue to expand to more countries/regions in the coming months. To get started, refer to the online tutorial and order your Data Box Disk today. A complete list of supported operating systems can be found in our documentation, “Azure Data Box Disk system requirements.” For a deep dive on the toolset, see our documentation, “Tutorial: Unpack, connect, and unlock Azure Data Box Disk.”

Announcing Blob Storage on Azure Data Box – Public preview now available

We are also launching the public preview of Azure Data Box Blob Storage. When enabled, this feature allows you to copy data to Blob Storage on Data Box using the blob service REST APIs. We are working with leading partners in the space to ensure you can use your favorite data copy tools.

For more details on using Blob Storage with Data Box, see our official documentation for “Azure Data Box Blob Storage requirements” and a tutorial on copying data via Azure Data Box Blob Storage REST APIs.
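As a sketch of what such a REST call involves (the device endpoint, container, blob name, and API version header below are illustrative, and authentication, e.g. appending a SAS token to the URL, is omitted), a Put Blob request can be assembled like this:

```python
def build_put_blob_request(account_host: str, container: str,
                           blob_name: str, data: bytes):
    """Assemble the URL and headers for a blob service Put Blob call."""
    url = f"https://{account_host}/{container}/{blob_name}"
    headers = {
        "x-ms-blob-type": "BlockBlob",  # upload the payload as one block blob
        "x-ms-version": "2017-11-09",   # blob service API version (illustrative)
        "Content-Length": str(len(data)),
    }
    return url, headers

# Hypothetical device endpoint and container:
url, headers = build_put_blob_request(
    "mydevice.blob.example.com", "backups", "archive-001.tar", b"...")
# A PUT to `url` with `headers` and the data as the body uploads the blob.
```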

Thank you to everyone who participated in the preview of Azure Data Box Disk, and to those continuing to participate in previews for other products in the Data Box family including Data Box Heavy, Data Box Edge, and Data Box Gateway! In the coming months, we plan to make many enhancements based on your suggestions. Please continue to provide your valuable comments by posting on Azure Feedback.
Quelle: Azure

Optimize your Google Kubernetes Engine workloads with Spotinst Elastigroup

Managing Kubernetes is about more than making sure you have enough capacity to run your deployments; it’s also about continuously optimizing all the moving pieces to make sure everything is running as cost-effectively as possible. But this “Tetris” game of mixing and matching workloads with available compute resources can be a full-time job.

Spotinst, a DevOps automation provider and Google Cloud Platform (GCP) partner, provides a proactive workload scaling service that anticipates interruptions in excess cloud capacity. By leveraging GCP’s Preemptible VMs, Spotinst helps customers eliminate the need to manage infrastructure scaling, reducing costs and operational overhead. In fact, a Spotinst Elastigroup lets you run production-grade container environments on preemptible VMs while saving up to 70% on your compute expenses.

We recently published a tutorial that demonstrates how to configure a Spotinst Elastigroup to manage Google Kubernetes Engine (GKE) cluster workloads, automatically maintaining the availability of the cluster while lowering costs.

The following diagram shows a GKE cluster integrated with an Elastigroup. Once a new deployment is processed, the Elastigroup uses predictive algorithms to ensure it has sufficient resources.

With Spotinst Elastigroup, you can focus on building applications with GKE and not the infrastructure they run on. Get started by visiting the solution page and see a Spotinst Elastigroup in action.
Quelle: Google Cloud Platform

VW: Moia launches in Hamburg

More mobility for the Hanseatic city: Moia, VW’s ridesharing service, is entering test operation. Starting in April, passengers are expected to book rides in the small electric buses via their smartphone as a regular service. (public transport, electric car)
Quelle: Golem

German rail operator uses AI to fast-track responses to customer queries

A typical visit to the train station can come with some uncertainty. Is your ticket valid for the next train to Berlin? Or do peak time restrictions apply? Are there catering facilities on board, or will you need to buy something to eat beforehand?
To take the stress out of traveling, DB Dialog and DB Systel – parts of Deutsche Bahn AG, Germany’s largest rail operator – launched a smart travel assistant service that uses artificial intelligence (AI) to quickly answer customer queries by text message.
Keeping customer services on the right track
DB Dialog is the business unit within Deutsche Bahn responsible for customer communications. It receives 12 million inquiries per year, typically by phone, email or in writing.
Expectations around customer service are changing. Digital communications, such as social media, are becoming more and more popular, especially with younger generations who would rather send a text message than pick up the phone.
Eager to offer customers a more convenient way of getting in touch, DB Dialog introduced a mobile communication channel to complement its existing call center and email services. We teamed up with DB Systel, the group’s IT services provider, and came up with the idea for DB Reisebuddy (German for “travel buddy”): a virtual assistant that helps customer service agents answer questions sent by text message.
Deploying an AI-based chatbot
DB Dialog and DB Systel selected IBM Watson Assistant, a platform for developing AI solutions, to underpin the service. After integrating Watson Assistant with DB Dialog’s customer relationship management (CRM) system, the repository for customer messages, we took six weeks to build the DB Reisebuddy virtual assistant and train it to recognize and respond to common customer queries.
To learn more about how we developed the solution, read the IBM case study.
The DB Reisebuddy virtual assistant is now on hand to help passengers before, during and after their train journeys. Customers send their queries – When is the next train to Munich? How do I get to Hamburg-Altona from the central station? I misplaced my wallet on the 12:05 to Stuttgart, is it in lost property? –  through SMS and web chat, and get personalized answers back a few minutes later.
There are still real people behind the DB Reisebuddy service. The virtual assistant suggests answers based on its understanding of the given questions, and human customer service agents check the proposed responses, edit them if necessary, then send them out as SMS or web chat.
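The suggest-then-review flow described above can be sketched as follows. This is a toy illustration: the intent table and keyword matching stand in for Watson Assistant’s trained model, and the canned answers are invented.

```python
# Hypothetical intent -> draft answer table, standing in for the trained model.
SUGGESTED_ANSWERS = {
    "next_train": "The next train to Munich departs at 14:32 from platform 5.",
    "lost_property": "Please contact the lost property office at your arrival station.",
}

def classify(query: str) -> str:
    """Toy stand-in for the assistant: keyword matching instead of NLU."""
    if "next train" in query.lower():
        return "next_train"
    return "lost_property"

def handle_query(query: str, agent_review) -> str:
    """Draft an answer, let a human agent approve or edit it, then send it."""
    draft = SUGGESTED_ANSWERS[classify(query)]
    return agent_review(draft)  # the agent may pass the draft through or edit it

# An agent who approves drafts unchanged:
reply = handle_query("When is the next train to Munich?", lambda d: d)
```

The key design point is that the model only proposes; the human agent remains the final sender, which is how the service keeps quality high while the assistant is still learning.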
Putting a travel assistant in passengers’ pockets
Today, the DB Reisebuddy virtual assistant handles common queries regarding ticketing and timetables, leaving the customer service agents free to deal with more complex inquiries. This saves the agents a lot of time and effort, taking the pressure off during busy travel times and enabling them to respond to customers much faster.
During a recent marketing campaign, which coincided with the launch of the service, the DB Reisebuddy virtual assistant provided correct responses to 40 percent of queries. We were really impressed with the percentage of correct, automated answers at such an early stage in its lifecycle.
Looking to the future, DB Dialog and DB Systel are eager to further fine-tune the DB Reisebuddy virtual assistant to offer customers an even more responsive and convenient service.
Learn more about IBM Watson Assistant.
Quelle: Thoughts on Cloud