Dear Spark developers: Welcome to Azure Cognitive Services

This post was co-authored by Mark Hamilton, Sudarshan Raghunathan, Chris Hoder, and the MMLSpark contributors.

Integrating the power of Azure Cognitive Services into your big data workflows on Apache Spark™

Today at Spark AI Summit 2019, we're excited to introduce a new set of models in the SparkML ecosystem that make it easy to leverage the Azure Cognitive Services at terabyte scales. With only a few lines of code, developers can embed cognitive services within their existing distributed machine learning pipelines in Spark ML. Additionally, these contributions allow Spark users to chain or pipeline services together with deep networks, gradient boosted trees, and any SparkML model, and to apply these hybrid models in elastic and serverless distributed systems.

From image recognition and object detection to speech recognition, translation, and text-to-speech, Azure Cognitive Services makes it easy for developers to add intelligent capabilities to their applications in any scenario. To date, more than a million developers have already discovered and tried Cognitive Services to accelerate breakthrough experiences in their applications.

Azure Cognitive Services on Apache Spark™

Cognitive Services on Spark enables working with Azure’s intelligent services at massive scale within the Apache Spark™ distributed computing ecosystem. Cognitive Services on Spark is compatible with any Spark 2.4 cluster, such as Azure Databricks, Azure Distributed Data Engineering Toolkit (AZTK) on Azure Batch, Spark in SQL Server, and Spark clusters on Azure Kubernetes Service. Furthermore, we provide idiomatic bindings in PySpark, Scala, Java, and R (Beta).

Cognitive Services on Spark allows users to embed general purpose and continuously improving intelligent models directly into their Apache Spark™ and SQL computations. This contribution aims to liberate developers from low-level networking details, so they can focus on creating intelligent, distributed applications. Each Cognitive Service is a SparkML transformer, so users can add services to existing SparkML pipelines. We also introduce a new type of API to the SparkML framework that allows users to parameterize models by either a single scalar or a column of a distributed Spark DataFrame. This API yields a succinct yet powerful fluent query language that offers full distributed parameterization without clutter. For more information, check out our session.
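The scalar-or-column idea can be illustrated without Spark. The following is a minimal sketch in plain Python, not the MMLSpark API: the `ServiceParam` class, the `build_requests` helper, and the sample rows are all hypothetical names invented for illustration. It only shows how one parameter can be fixed for every row while another is read from each row.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ServiceParam:
    """Hypothetical parameter: either one scalar or the name of a column."""
    scalar: Optional[Any] = None
    column: Optional[str] = None

    def __post_init__(self) -> None:
        # Exactly one of the two forms must be supplied.
        if (self.scalar is None) == (self.column is None):
            raise ValueError("set exactly one of scalar or column")

    def resolve(self, row: Dict[str, Any]) -> Any:
        # A scalar applies to every row; a column is looked up per row.
        return row[self.column] if self.column is not None else self.scalar


def build_requests(rows: List[Dict[str, Any]],
                   text: ServiceParam,
                   language: ServiceParam) -> List[Dict[str, Any]]:
    """Build one request payload per row from the resolved parameters."""
    return [{"text": text.resolve(r), "language": language.resolve(r)}
            for r in rows]


rows = [
    {"review": "Great service", "lang": "en"},
    {"review": "Sehr gut", "lang": "de"},
]

# Same language for every row (scalar form).
fixed = build_requests(rows, ServiceParam(column="review"),
                       ServiceParam(scalar="en"))

# Language taken from each row (column form).
per_row = build_requests(rows, ServiceParam(column="review"),
                         ServiceParam(column="lang"))
```

In this sketch, switching from a fixed language to a per-row language changes only how the parameter is constructed, which is the kind of uncluttered parameterization the paragraph above describes.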

Use Azure Cognitive Services on Spark in these 3 simple steps:

1. Create an Azure Cognitive Services Account
2. Install MMLSpark on your Spark Cluster
3. Try our example notebook

Low-latency, high-throughput workloads with the cognitive service containers

The cognitive services on Spark are compatible with services from any region of the globe. However, many scenarios involve low or no connectivity, or require ultra-low latency. To tackle these with the cognitive services on Spark, we have recently released several cognitive services as Docker containers. These containers enable running cognitive services locally or directly on the worker nodes of your cluster for ultra-low latency workloads. To make it easy to create Spark clusters with embedded cognitive services, we have created a Helm Chart for deploying Spark clusters onto the popular container orchestration platform Kubernetes. Simply point the Cognitive Services on Spark at your container’s URL to go local!

Add any web service to Apache Spark™ with HTTP on Spark

The Cognitive Services are just one example of using networking to share software across ecosystems. The web is full of HTTP(S) web services that provide useful tools and serve as one of the standard patterns for making your code accessible in any language. Our goal is to allow Spark developers to tap into this richness from within their existing Spark pipelines.

To this end, we present HTTP on Spark, an integration between the entire HTTP communication protocol and Spark SQL. HTTP on Spark allows Spark users to leverage the parallel networking capabilities of their cluster to integrate any local, Docker, or web service. At a high level, HTTP on Spark provides a simple and principled way to integrate any framework into the Spark ecosystem.

With HTTP on Spark, users can create and manipulate their requests and responses using SQL operations, maps, reduces, filters, and any tools from the Spark ecosystem. When combined with SparkML, users can chain services together and use Spark as a distributed micro-service orchestrator. HTTP on Spark provides asynchronous parallelism, batching, throttling, and exponential back-offs for failed requests so that you can focus on the core application logic.
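To make the retry behavior concrete, here is a minimal sketch of exponential back-off in plain Python. It is not the HTTP on Spark implementation: `call_with_backoff`, its parameters, and the `flaky` stand-in for a web request are hypothetical, and the choice to retry only on `ConnectionError` is an assumption made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(send: Callable[[], T],
                      max_retries: int = 3,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call send(); on ConnectionError wait base_delay * 2**attempt and retry."""
    attempt = 0
    while True:
        try:
            return send()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last error
            sleep(base_delay * (2 ** attempt))  # delay doubles each attempt
            attempt += 1


# Hypothetical request that fails twice before succeeding.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping.
delays = []
result = call_with_backoff(flaky, base_delay=1.0, sleep=delays.append)
```

Passing the sleep function as an argument keeps the sketch easy to test; a real client would typically also cap the maximum delay and add jitter.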

Real world examples

The Metropolitan Museum of Art

At Microsoft, we use HTTP on Spark to power a variety of projects and customers. Our latest project uses the Computer Vision APIs on Spark and Azure Search on Spark to create a searchable database of art for The Metropolitan Museum of Art (The MET). More specifically, we loaded The MET’s Open Access catalog of images and used the Computer Vision APIs to annotate these images with searchable descriptions in parallel. We also used CNTK on Spark and SparkML’s Locality Sensitive Hash implementation to featurize these images and create a custom reverse image search engine. For more information on this work, check out our AI Lab or our GitHub.
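As a rough illustration of how locality sensitive hashing supports reverse image search, the sketch below buckets feature vectors by random projection in plain Python. It is not SparkML's implementation or the project's code: the function names, the three-dimensional feature vectors, the bucket length, and the image names are all hypothetical, and real image features would come from a deep network.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def make_projections(dim: int, num_hashes: int, seed: int = 0) -> List[List[float]]:
    """Draw random Gaussian directions used to hash feature vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)]


def lsh_key(vector: List[float], projections: List[List[float]],
            bucket_length: float) -> Tuple[int, ...]:
    """Project the vector on each direction and floor it into a bucket id."""
    return tuple(
        math.floor(sum(v * p for v, p in zip(vector, proj)) / bucket_length)
        for proj in projections
    )


def build_index(features: Dict[str, List[float]], projections: List[List[float]],
                bucket_length: float) -> Dict[Tuple[int, ...], List[str]]:
    """Group image names by hash key so lookups touch one bucket only."""
    index = defaultdict(list)
    for name, vec in features.items():
        index[lsh_key(vec, projections, bucket_length)].append(name)
    return index


def query(index, vector, projections, bucket_length) -> List[str]:
    """Return candidate matches: images whose key equals the query's key."""
    return index.get(lsh_key(vector, projections, bucket_length), [])


# Hypothetical image features.
features = {
    "painting_a": [0.9, 0.1, 0.0],
    "painting_b": [0.0, 0.2, 0.95],
}
projections = make_projections(dim=3, num_hashes=2)
index = build_index(features, projections, bucket_length=4.0)
candidates = query(index, features["painting_a"], projections, 4.0)
```

Vectors that are close together tend to land in the same bucket, so a query compares against a small candidate set instead of the whole catalog; exact distances are then computed only for those candidates.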

The Snow Leopard Trust

We partnered with the Snow Leopard Trust to help track and understand the endangered snow leopard population using the Cognitive Services on Spark. We began by creating a fully labelled training dataset for leopard classification by pulling snow leopard images from Bing on Spark. We then used CNTK and TensorFlow on Spark to train a deep classification system. Finally, we interpreted our model using LIME on Spark to refine our leopard classifier into a leopard detector without drawing a single bounding box by hand! For more information, you can check out our blog post.


With only a few lines of code, you can start integrating the power of Azure Cognitive Services into your big data workflows on Apache Spark™. The Spark bindings offer high throughput and run anywhere you run Spark. The Cognitive Services on Spark fully integrate with containers for high performance, on premises, or low connectivity scenarios. Finally, we have provided a general framework for working with any web service on Spark. You can start leveraging the Cognitive Services for your project with our open source initiative MMLSpark on Azure Databricks.


Source: Azure

5 data security techniques that help boost consumer confidence

These days, it seems like hardly any time passes between headlines about the most recent data breach. Consider the revelation in late September that a security intrusion exposed the accounts of more than 50 million Facebook users.
For that matter, not much time goes by without a new survey or study that confirms the difficulty of data security. Forbes recently reported that US businesses and government agencies suffered 668 million security intrusions and data breaches in the first half of 2018 alone. It’s no wonder consumers have little faith in organizations’ abilities to protect their data. Only 20 percent of US consumers completely trust organizations to keep their data private.
No business is immune to data breaches, but that doesn’t mean you can’t do everything in your power to prevent them. By taking proven, sensible measures to ensure data security, your enterprise will not only tighten its defenses, but also promote trust among customers.
Here are five steps your organization can take that will demonstrate to consumers that you’re committed to data security.
1. Encrypt sensitive information.
Many industry regulations require certain data be encrypted, but it wouldn’t hurt if your organization considered safeguarding other types of data too. Almost anything can be encrypted. There are the obvious resources: email, SMS messages, user names, passwords and databases. Other sensitive data, such as intellectual property and the personal data of customers and employees, can also be encrypted.
Before considering encryption, review whether a particular type of data would cause financial harm or reputational damage to your organization if someone exposed or manipulated it. Encryption isn’t foolproof, especially if the encryption key falls into the wrong hands, but it is a first-line security step that can show customers you take these matters seriously.
2. Optimize backup and recovery.
Most enterprises have data backup and recovery plans and likely rely on some form of disaster recovery (DR) technology, whether it’s offsite servers or a cloud service. But is it effective enough to boast about? An organization can’t make any stated commitment to protecting customers’ data if it’s at risk of losing it.
Because cyber incidents usually happen without notice and can go undetected for days, weeks or even longer, it’s critical to be able to restore data to its clean, pre-breach condition. It’s a complicated process, but cutting-edge, purpose-built resiliency technologies can automatically recover data to its correct state and enable enterprises to find their footing quickly after a breach.
3. Promote compliance and transparency.
This year, organizations around the world started abiding by the General Data Protection Regulation (GDPR), a European Union standard for the handling of customer data. The GDPR essentially puts the power in consumers’ hands, enabling them to control how their data is stored and managed. It’s a thorough and detailed mandate for any organization, no matter where it’s based, to properly handle European citizens’ data.
Companies that comply with GDPR should use this compliance to their advantage by promoting how they collect, use and store consumer data. Asking users to review privacy settings or agree to a laundry list of new standards won’t effectively relay the steps you’re taking on their behalf. Instead, organizations should separately promote the many ways they follow GDPR and other compliance standards in easily consumable marketing materials. This will show customers that the organization is serious about its commitment to protecting personal information.
4. Consider cyber insurance.
In its annual study on the expenses of cybercrime, Ponemon estimates that the global average cost of a data breach has increased 6.4 percent over last year, climbing to an average $3.86 million in 2018. Those high costs have prompted many businesses to view cyber risk insurance as a critical investment.
Businesses that want the support of insurance should look for a policy that covers common reimbursable expenses. These might include a forensics examination to review the data breach, as well as monetary losses from business interruption, crisis management costs, legal expenses and regulatory fines. Hopefully, your enterprise won’t face many of those costs, but cybercrime is unpredictable. The peace of mind that insurance can provide you and your customers is worth the cost.
5. Work with a data security expert.
It’s not easy deciding which technologies and data security management strategies will work best for your organization, and there are many to choose from. With regulations such as GDPR increasing expectations, don’t take any chances with customer data. Work with a data security expert who knows the lay of the land and already has insight into potential changes that would affect how you safeguard information.
Customers have a seemingly endless array of options to choose from on the digital market, so you might get only one chance with each consumer. Win their loyalty by demonstrating how you can expertly handle and preserve their data.
Learn about more ways IBM can help your organization secure your cloud platforms by registering for the guide to securing cloud platforms.
The post 5 data security techniques that help boost consumer confidence appeared first on Cloud computing news.
Source: Thoughts on Cloud

Galaxy Fold flop: Muted giggling in Shenzhen

Huawei is unlikely to be unhappy about the botched market launch of Samsung's Galaxy Fold: in the race for the first good foldable smartphone, China now holds the trump cards. However, Huawei should make sure the Mate X is really finished by its release date; at the moment, it apparently is not. An analysis by Tobias Költzsch (Samsung, Smartphone)
Source: Golem