Microsoft + Docker – Investing in the future of your applications

This post was authored by the Microsoft and Docker teams.

Did you know that when you combine Docker’s cross-platform support for Linux and Windows containers with Microsoft cloud technologies, you get a comprehensive offering that can support virtually every enterprise workload?

One platform, one journey for all applications

Microsoft and Docker aim to provide a modern platform for developers and IT pros to build, ship, and run any application on-premises, in the cloud, or through service providers, across both Windows and Linux operating systems. Together, we are bringing container applications to every platform, integrating across developer tools, the operating system, and cloud infrastructure to provide a seamless experience from development through test and production.

Whether you host your workloads in private datacenters, in the public cloud, or in a hybrid model, Microsoft and Docker offer great end-to-end solutions or individual components, from the developer’s keyboard to the cloud. Azure Container Service provides the simplest way to deploy a container orchestration environment, such as Docker Swarm, so your app teams can deploy their apps more quickly. Windows Server Containers are powered by the same Docker toolchain, so you use the same Docker tooling to build and run those containers as you do your Linux containers, together with the tools you already use, including Eclipse, Visual Studio, Jenkins, and Visual Studio Team Services. Windows Server Containers help secure and modernize existing enterprise .NET and line-of-business server applications with little or no code change. Package existing apps in containers to realize the benefits of a more agile DevOps model, then deploy on-premises, to any cloud, or in a hybrid model, and reduce infrastructure and management costs for those applications as well.
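As a sketch of how simple that deployment can be, the Azure CLI can stand up a Docker Swarm cluster in a couple of commands. The resource group and cluster names below are hypothetical:

```shell
# Hypothetical names; requires an authenticated Azure CLI (az login).
az group create --name containers-rg --location westus2

# Provision an Azure Container Service cluster with Docker Swarm as the orchestrator.
az acs create \
  --resource-group containers-rg \
  --name swarm-demo \
  --orchestrator-type Swarm \
  --generate-ssh-keys
```

Once the cluster is up, you point your local Docker client at the Swarm master (for example, over an SSH tunnel) and use the same docker commands you use everywhere else.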

See it in action @ DockerCon 2017

Come visit Docker + Microsoft sessions @ DockerCon, taking place in Austin, TX, April 17th–20th. Learn how to modernize traditional applications, as well as about new technologies to help you build your next great application. You’ll also hear customer success stories about how companies achieved their ROI targets and up to 80% cost savings through infrastructure consolidation and operational efficiencies with Docker Enterprise Edition (EE) and Azure.

Check out our sessions

Docker + Microsoft – Investing in the future of your applications on Tuesday, April 18th from 11:45am-12:25pm
Beyond – the path to Windows and Linux parity in Docker on Tuesday, April 18th from 2:00pm-2:40pm

There will also be hands-on labs for you to experience Docker on Windows. We’ll provision a Docker environment for you in Azure and provide self-paced learning guides. You can learn more by reading Elton Stoneman’s blog on Docker + Microsoft sessions at DockerCon, covering everything from modernizing traditional apps like .NET to building new Windows Server Container apps.
Source: Azure

Azure Container Registry now generally available

Companies of all sizes are embracing containers as a fast and portable way to lift, shift and modernize into cloud-native apps. As part of this process, customers need a way to store and manage images for all types of container deployments. In November, we announced the preview of Azure Container Registry, which enables developers to create and maintain Azure container registries to store and manage private Docker container images.

Today, we’re announcing the general availability of Azure Container Registry, supporting a network-close, private registry for Linux and Windows container images. Azure Container Registry integrates well with orchestrators hosted in Azure Container Service, including Docker Swarm, Kubernetes, and DC/OS, as well as other Azure services including Service Fabric and Azure App Service. Customers can benefit from using familiar tooling capable of working with the open source Docker Registry v2. Learn more by watching this Azure Container Registry GA video.
Building on the November Preview, we’ve added the following features and capabilities:

Availability in 23 regions, with a global footprint (with more coming)
Repository, tag, and manifest listing in the Azure portal
Dual admin passwords to support key rotation
Nested repositories
Azure CLI 2.0 support
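To illustrate the Azure CLI 2.0 support, here is a minimal sketch of creating a registry and pushing an image to it. The registry and image names are hypothetical, and available options may vary by CLI version:

```shell
# Hypothetical names; requires an authenticated Azure CLI and a local Docker daemon.
az group create --name acr-rg --location eastus
az acr create --resource-group acr-rg --name contosoregistry --sku Basic

# Authenticate the local Docker client against the registry.
az acr login --name contosoregistry

# Tag a local image with the registry's login server name and push it.
docker tag myapp:1.0 contosoregistry.azurecr.io/myapp:1.0
docker push contosoregistry.azurecr.io/myapp:1.0
```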

Global Availability

Azure Container Registry is now available globally. As part of our general availability release, all features are now available in all regions.

The full list of supported regions is:

Australia East
Australia Southeast
Brazil South
Canada Central
Canada East
Central India
Central US
East US 2
East US
Japan East
Japan West
North Central US
North Europe
South Central US
South India
Southeast Asia
UK South
UK West
West Central US
West Europe
West US 2
West US

Multi-Arch Support

With the release of Windows containers, we’re increasingly seeing customers who want both Windows and Linux images. While Azure Container Registry supports both Windows and Linux images, Docker has added the ability to pull a single named image and have it resolve to the right OS version based on the host pulling the image. Using multi-arch support, a customer can push both Windows- and Linux-based tags, and their development teams can write their dockerfiles using FROM contoso.com/aspnetcore:corpstandard. The Azure Container Registry multi-arch features will pull the appropriate image based on the host it’s pulled from.
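In practice, that means a single tag works from either OS. A minimal sketch, with a hypothetical image name:

```shell
# One tag, two platforms: Docker resolves the manifest list
# to the image matching the pulling host's OS.
docker pull contoso.com/aspnetcore:corpstandard

# The same tag works in a Dockerfile on both Windows and Linux build hosts:
#   FROM contoso.com/aspnetcore:corpstandard
```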

Nested Repositories

Development teams often work in hierarchies and deploy solutions based on collections. The bikesharing team may have a collection of images they wish to group together (bikesharing/web, bikesharing/api), while the headtrax team has their own collection (headtrax/web, headtrax/api, headtrax/admin), with a set of corporate images available to all members (aspnet:corpstandard).
Azure Container Registry supports nested repositories to enable teams to group their repos and images to match their development structure.
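A nested repository is simply a slash-separated name. A sketch of how the bikesharing team above might push its images, with all names hypothetical:

```shell
# Requires an authenticated Docker client and local images with these names.
docker tag bikesharing-web contosoregistry.azurecr.io/bikesharing/web:1.0
docker tag bikesharing-api contosoregistry.azurecr.io/bikesharing/api:1.0

# Pushing creates the nested repositories bikesharing/web and bikesharing/api.
docker push contosoregistry.azurecr.io/bikesharing/web:1.0
docker push contosoregistry.azurecr.io/bikesharing/api:1.0
```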

Repositories, tags, manifests

Customers have requested visibility into the contents of their registry. With the GA release, you will now have an integrated experience in the Azure portal to view the repositories, images, tags and the contents of manifests associated with an image.

To view repositories and tags you’ve already created in your repository:

Log in to the Azure Portal.
Select "More Services" on the left-side panel.
Search for "Container registries".
Select the registry you want to inspect.
On the left-hand side panel, select "Repositories".

The repositories blade displays a list of all the repositories (including nested repositories) that you have created, as well as the images stored in those repositories.

If you select a specific image, it will open up a "Tags" blade containing the tags associated with that image. Additionally, if you select a tag, you will have the ability to see the manifest for that image tag.

 

Improved passwords

We have also made improvements for registry admin accounts. While we recommend using a service principal as a best practice, we wanted to improve the safety of this alternative by providing the ability to rotate keys. As such, new container registries have access to two admin passwords, both of which can be regenerated. Having two passwords allows you to maintain connections, because you can swap to the other password while you regenerate one.
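With the Azure CLI, rotation looks roughly like this (registry name hypothetical): keep clients on one password while regenerating the other, then swap.

```shell
# Show the two admin passwords for the registry.
az acr credential show --name contosoregistry

# Regenerate only the second password; connections using the first keep working.
az acr credential renew --name contosoregistry --password-name password2
```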

To regenerate passwords, go to the "Access Keys" section of a registry on which you have enabled an Admin user.

Summary

We hope you enjoy the new features and capabilities of Azure Container Registry. If you have any comments, requests, or issues, you can reach out to us on Stack Overflow or log issues at https://github.com/azure/acr/issues.
Source: Azure

Using xEvents to monitor Azure Analysis Services

In addition to providing BI queries at the speed of thought and a user-friendly BI semantic model, Azure Analysis Services supports many manageability features. One such feature is a rich set of Extended Events (xEvents), which can be used for scenarios ranging from troubleshooting and diagnostics to in-depth auditing and usage analysis.

You can use SQL Server Management Studio (SSMS) to configure xEvents for Azure Analysis Services. Today, you can only configure Azure Analysis Services to log to a stream or ring buffer, not to a file. In some cases, you may want to log events for offline analysis or retain them historically. We have provided an example of using the Tabular Object Model (TOM) APIs to create an xEvents session and log the data to disk, and a richer sample that traces to a database with a Windows service. The xEvents Logging for Azure Analysis Services sample and the ASTrace sample are available on GitHub at https://github.com/Microsoft/Analysis-Services.

The easiest way to use this sample is to use SSMS to configure streaming xEvents to see which events you would like to log. First, create an xEvents session in SSMS. Then pick which events you’d like to record and set the data mode to streaming. Run some queries or perform other operations, and then look at the xEvents in the “Watch Live Data” option on the trace session in SSMS to verify the data. If these events are the ones you want, you can script them out to a file.

The sample program takes the TMSL script file to define the events it will record. You can then run the sample program to create a new session, and it will trace these events to a file. Be sure to install the latest Azure Analysis Services client libraries to ensure you have support for integrated authentication.

Let us know how it works for you, or check out the other Azure Analysis Services samples on GitHub!
Source: Azure

Introducing Dataiku’s DSS on Microsoft Azure HDInsight to make data science easier

We are pleased to announce the expansion of HDInsight Application Platform to include Dataiku.

Azure HDInsight is the industry-leading fully managed cloud Apache Hadoop & Spark offering, which allows customers to run reliable open source analytics backed by an industry-leading SLA. Dataiku develops Data Science Studio (DSS), a collaborative data science platform that enables companies to build and deliver their analytical solutions more efficiently.

This combined offering of DSS on HDInsight enables customers to easily use data science to build big data solutions and run them at enterprise grade and scale.

Microsoft Azure HDInsight – Reliable Open Source Analytics at Enterprise grade & scale

HDInsight is the only fully managed cloud Hadoop offering that provides optimized open source analytical clusters for Spark, Hive, Interactive Hive, MapReduce, HBase, Storm, Kafka, and R Server, backed by a 99.9% SLA. Each of these big data technologies is easily deployable as a managed cluster with enterprise-level security and monitoring.

The ecosystem of applications in big data has grown with the goal of making it easier for customers to solve their big data and analytical problems faster. Today, customers often find it challenging to discover these productivity applications, and then in turn struggle to install and configure them.

To address this gap, the HDInsight Application Platform provides an experience unique to HDInsight where Independent Software Vendors (ISVs) can directly offer their applications to customers, and customers can easily discover, install, and use these applications built for the big data ecosystem.
As part of this integration, Dataiku is bringing DSS to make collaborative data science much easier.

Dataiku Data Science Studio (DSS) – Prototype, deploy and run at scale

Dataiku provides Data Science Studio, a collaborative data science platform that enables professionals (data scientists, data engineers, etc.) to collaborate on building analytical solutions. DSS has an easy-to-use, team-based interface for data scientists and beginner analysts. A user can use DSS to implement a complete analytical solution, spanning data ingestion (all data types, sizes, and formats), data preparation, data processing, training and applying machine learning models, visualization, and operationalizing the solution.


DSS on HDInsight – Data science at enterprise grade & scale

A customer can install DSS on HDInsight Hadoop or Spark clusters, either on existing running clusters or while creating new clusters. DSS 4.0 also added support for using Azure Blob Storage as a connector for reading data.

When a user installs DSS on HDInsight, they get the benefits of Hadoop or Spark on HDInsight. Users can utilize DSS to build projects; the projects can generate MapReduce or Spark jobs, which makes DSS a great complement to your HDInsight cluster. These jobs are executed as regular MapReduce or Spark jobs, so they get all the benefits of running on an enterprise-grade platform. Since these jobs run on HDInsight, customers can scale the cluster on demand, which allows them to run DSS at scale on HDInsight.
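As a sketch, that on-demand scaling can be driven from the Azure CLI. The names below are hypothetical, and flag names may differ across CLI versions:

```shell
# Hypothetical names; requires an authenticated Azure CLI.
# Grow the worker node count before a heavy DSS job, and shrink it afterwards.
az hdinsight resize --resource-group hdi-rg --name dss-spark-cluster --workernode-count 8
```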

Getting started with DSS on HDInsight

Here is a quick walkthrough of installing and getting started with DSS on HDInsight. The following screenshot shows a Spark cluster in the Microsoft Azure portal. A user can click the Applications tile to see the list of installed applications.

A user can select DSS, agree to the terms of agreement, and install DSS; this is the simplicity of a one-click deployment experience. After the user has selected DSS, it is installed on the edge node, which is part of the cluster.

After DSS is installed, a customer can launch it using the “WEBPAGE” link (this is the link to the DSS product). A user must first authenticate with the cluster user credentials and can then log in with their DSS credentials.

The following screenshot shows what a typical data science project’s landing page would look like in DSS. This shows both the summary of the project, as well as the timeline of the changes made to the project.

Resources

Here are some resources for learning more about this integration, along with tutorials and videos.

Learn more about Azure HDInsight
Learn more about Dataiku DSS
Getting started with DSS on HDInsight
Use HDInsight and DSS to predict credit default
Dataiku and Microsoft HDInsight Integration
Video recording between Microsoft & Dataiku on using DSS on HDInsight
Install DSS on HDInsight from Azure marketplace
Ask HDInsight questions on stackoverflow
Dataiku Q&A

Summary

We are pleased to announce the expansion of HDInsight Application Platform to include Dataiku’s Data Science Studio. By deploying DSS on HDInsight, customers can easily build analytical solutions and run them at enterprise grade and scale.
Source: Azure

Azure Stack Technical Preview 3 refresh with Azure PaaS services

This post was authored by the Azure Stack Team.

Today, we are excited to announce preview releases of Azure PaaS services for Azure Stack and a refresh to Azure Stack TP3. Last month, we released Azure Stack TP3 and provided additional information about hybrid use cases, the pay-as-you-use business model for Azure Stack, and roadmap updates. If you haven’t already, read Jeffrey Snover’s Azure Stack TP3 blog post for more context. Additionally, we put together a whitepaper with an even more detailed roadmap of Azure services, integrated systems details, and initial geo-availability.

 

This update continues delivering Azure Services on premises so customers can create innovative applications for the hybrid cloud. Specifically, this release includes:

Azure App Service (Web apps, API apps, and Mobile apps)
Azure Functions
Updated versions of SQL/MySQL database services

New to App Service this release:

Azure Functions preview for AAD-based deployments
Deployment in disconnected environments
Deployment on ADFS-authenticated Azure Stack
Installation and deployment improvements
Azure Resource Manager (ARM) API version 2016-03-01 support for App Services
Synchronization of SKUs with Azure – i.e. Free (F1), Shared (D1), and Standard (S1, S2, S3)
Service reliability improvements

Azure Stack TP3 refresh: Based on feedback and several ongoing improvements/bug fixes, we’ve also updated the Azure Stack TP3 software for a better deployment and operational experience. A list of the latest features and improvements to Azure Stack TP3 is now available.

Important Note: If you have already deployed Azure Stack TP3, you will need to redeploy using the updated software before deploying the Azure PaaS services.

Visit our Azure Stack technical documentation page to guide your deployment efforts and look at the documentation for App Service and Functions, SQL, and MySQL.

Visit the Azure Stack forum for troubleshooting help and User Voice if you’d like to provide feedback. Learn more and see the current list of known issues.

We’d love to hear from you!
Source: Azure

Announcing HTTP/2 support for all Azure CDN customers

In August 2016, we announced HTTP/2 support for Azure CDN from Akamai. Today, we are pleased to announce that HTTP/2 is also available to all customers using Azure CDN from Verizon. No further action is required from customers: HTTP/2 is on by default for all existing and new Azure CDN profiles, at no additional fee.

HTTP/2 is designed to improve webpage loading speed and optimize the user experience. You can start enjoying the benefits of HTTP/2 today without updating any of your code base!
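One quick way to confirm that HTTP/2 is being negotiated for your endpoint (the hostname below is a placeholder) is curl with HTTP/2 support:

```shell
# Requires curl built with HTTP/2 (nghttp2) support.
curl -sI --http2 https://myendpoint.azureedge.net/ | head -1
# A response status line beginning "HTTP/2" confirms the protocol was negotiated.
```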

Read also

Azure CDN HTTP/2 doc
HTTP/2 spec
HTTP/2 FAQ

Source: Azure

What’s brewing in Visual Studio Team Services: April 2017 Digest

This post series provides the latest updates and news for Visual Studio Team Services and is a great way for Azure users to keep up-to-date with new features being released every three weeks. Visual Studio Team Services offers the best DevOps tooling to create an efficient continuous integration and release pipeline to Azure. With the rapidly expanding list of features in Team Services, teams can start to leverage it more efficiently for all areas of their Azure workflow, for apps written in any language and deployed to any OS.

Git tags

We’ve now added tag support into the web experience. Instead of creating tags from the command line and pushing the tags to the repository, you can now simply go to a commit and add a tag. The tag creation dialog will also let you tag any other ref in the repo.
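The web experience is equivalent to the usual command-line flow, which for reference looks like this (using a throwaway local repository):

```shell
# Create a scratch repo with a single commit, purely for illustration.
rm -rf /tmp/tag-demo
git init -q /tmp/tag-demo && cd /tmp/tag-demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Tag the current commit, then list the tags pointing at it.
git tag v1.0
git tag --points-at HEAD    # prints: v1.0

# Publishing the tag to the server would then be: git push origin v1.0
```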

Your commits will now show the tags that you have created.

The commit list view also supports a context menu. No need to go to the commit details page to create tags and create new branches.

Soon we will add a page for tag management.

Git branch policy improvements

Branch policies provide a great way to help maintain quality in your repos by allowing you to require a passing build, require code reviewers, and more. When reviewing pull requests, users often leave comments. You can now ensure that all comments in pull requests are addressed with the new Comments policy. Once enabled, active comments will block completion of the PR. Reviewers who leave comments for the PR author but optimistically approve the pull request can be sure those comments won’t be missed.

Sometimes you need to override policies, such as in the middle of the night when addressing a production issue. Users bypassing pull request policies must now specify a reason: in the Complete pull request dialog, users will see a new Reason field if they choose to bypass.

After entering the reason and completing the pull request, the message will be displayed in the pull request’s Overview.

Import Team Foundation Version Control into a Git repo

If you’re using Team Foundation Version Control (TFVC) and are looking for an easy way to migrate to Git, try out the new TFVC import feature. Select Import Repository from the repository selector drop-down.

Select TFVC for the source type. Individual folders or branches can be imported to a new Git repository, or the entire TFVC repository can be imported (minus the branches). You can import up to 180 days of history.

Team Foundation Version Control support for Android Studio, IntelliJ, and Rider

We’ve now officially released support for TFVC in Android Studio and a variety of JetBrains IDEs, such as IntelliJ IDEA and Rider EAP. Users can develop seamlessly without switching back and forth between the IDE and the command line to perform their Team Services actions. The plugin also includes features you otherwise wouldn’t get from the command-line client, such as an updated status of your repository’s related builds and the ability to browse work items assigned to you or from your personal queries.

Currently we support:

Check out a TFVC repository from Team Services or Team Foundation Server 2015+
Execute all basic version control actions such as add, delete, rename, move, etc.
View local changes and history for your files
Create, view, and edit your workspace
Check in and update local files
Merge conflicts from updates
Lock and unlock files and directories
Add labels to files and directories
Configure a TFS proxy

Check out our brief demo of getting up and running inside of Android Studio. For a more comprehensive look at the plugin, check out our presentation and tutorial inside of IntelliJ.

To start using the TFVC features, download the latest version of the plugin and follow the setup steps.

Continuous delivery in the Azure portal using any Git repo

You can now configure a continuous delivery (CD) workflow for an Azure App Service for any public or private Git repository that is accessible from the Internet. With a few clicks in the Azure portal, you can set up a build and release definition in Team Services that will periodically check your Git repository for any changes, sync those changes, run an automated build and test, followed by a deployment to Azure App Service.

Start using this feature today by navigating to your app’s menu blade in the Azure portal and clicking Continuous Delivery (Preview) under the App Deployment section.

Conditional build tasks

If you’re looking for more control over your build tasks, such as a task to clean things up or send a message when something goes wrong, we now support four built-in choices for you to control when a task is run:

If you are looking for more flexibility, such as a task to run only for certain branches, with certain triggers, under certain conditions, you can express your own custom conditions:

and(failed(), eq(variables['Build.Reason'], 'PullRequest'))

Take a look at the conditions for running a task.

Customizable backlog levels

You can now add backlog levels to manage the hierarchy of your work items and name them in a way that makes sense for your work item types. You can also rename and recolor existing backlog levels, such as Stories or Features. See Customize your backlogs or boards for details on how to get started.

Mobile work item discussion

Our mobile discussion experience has been optimized to provide a mobile-friendly, streamlined experience for submitting a comment. Discussion is the most common action that takes place on a mobile device. We look forward to hearing what you think about our new experience!

Extension of the month

If you are like us, you use open source software in your development projects. Reusing components enables great productivity gains. However, you can also reuse security vulnerabilities or violate licenses without realizing it.

The WhiteSource Bolt extension for build makes it easy to find out whether you are using vulnerable components. After installing it in your account, add it to your build definition and queue a new build. You’ll get a report like the following. In the table under the summary, you will see a list of components with issues and the recommended way to address those issues.

If you have Visual Studio Enterprise, you get 6 months of WhiteSource Bolt for one team project included with your subscription (redeem the code from your benefits page or see this page for VS subscribers for more detailed instructions).

Have a look at the full list of new features by checking out the release notes for March 8th and March 29th.

Happy coding!
Source: Azure

Enhance protection of VMs with Azure Advisor backup recommendations

We have seen a few cases where customers accidentally deleted VMs or data inside a VM running in Azure. While Azure provides protection against infrastructure-related failures, it can’t guard against user-initiated actions, such as an accidental deletion or a bad patch on the guest OS triggered by the customer. Azure Backup guards against accidental deletion and guest OS-level corruption scenarios using its cloud-first approach to backup, and seamlessly enables you to restore a full VM or instantly recover files inside a VM. Customers can configure backup either from a Recovery Services vault or directly from the VM management blade. However, we have seen customers miss configuring backup and put their critical data at risk. Today we are taking a step toward making sure your VMs are protected by advising you to back them up, using Advisor recommendations, which became generally available last week.

Azure Advisor is a personalized cloud consultant that helps you optimize your use of the cloud as you start on your digital transformation using Azure. It analyzes your Azure usage and provides timely recommendations to help optimize and secure your deployments. It provides recommendations in four categories: High Availability, Security, Performance, and Cost. With this announcement, it can recommend virtual machines that are not backed up, and with a few clicks it will let you enable backup on those virtual machines.

Value Proposition:

Periodic recommendations – Advisor provides hourly recommendations for virtual machines that are not backed up, so you never miss backing up important VMs. You can also control recommendations by snoozing them.

Seamless experience to back up – You can seamlessly enable backup on virtual machines by clicking a recommendation and specifying the vault (where backups will be stored) and the backup policy (the schedule of backups and the retention of backup copies).

Freedom from infrastructure – With Azure Backup integrated into recommendations, you don’t need to provision any additional infrastructure to configure backup.

Application-consistent backup – Azure Backup provides application-consistent backup for Windows and Linux, and by configuring backup using recommendations you get a consistent backup without needing to shut down the virtual machine.
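For reference, the same protection the Advisor recommendation enables can also be configured from the Azure CLI. A sketch with hypothetical resource names, assuming the vault and policy already exist:

```shell
# Hypothetical names; requires an authenticated Azure CLI,
# an existing Recovery Services vault, and an existing VM.
az backup protection enable-for-vm \
  --resource-group prod-rg \
  --vault-name prod-vault \
  --vm prod-vm-01 \
  --policy-name DefaultPolicy
```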

 

Related links and additional content

Want more details? Check out Azure Backup documentation and Azure Advisor documentation
New to Azure Backup and Azure Advisor? Sign up for a free Azure trial subscription
Need help? Reach out to Azure Backup forum for support
Tell us how we can improve Azure Backup by contributing new ideas and voting up existing ones.
Follow us on Twitter @AzureBackup for the latest news and updates

Source: Azure

Real-time machine learning on globally-distributed data with Apache Spark and DocumentDB

At the Strata + Hadoop World 2017 conference in San Jose, we announced the Spark to DocumentDB Connector. It enables real-time data science, machine learning, and exploration over globally distributed data in Azure DocumentDB. Connecting Apache Spark to Azure DocumentDB accelerates our customers’ ability to solve fast-moving data science problems, where data can be quickly persisted and queried using DocumentDB. The Spark to DocumentDB connector efficiently exploits the native DocumentDB managed indexes and enables updateable columns when performing analytics, as well as push-down predicate filtering against fast-changing globally distributed data, for scenarios ranging from IoT to data science and analytics. The Spark to DocumentDB connector uses the Azure DocumentDB Java SDK. You can get started today and download the Spark connector from GitHub!
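To experiment, you attach the connector (and the DocumentDB Java SDK it depends on) to your Spark session. The jar file names below are placeholders for the artifacts you download or build from the GitHub repository:

```shell
# Jar names are placeholders; substitute the versioned artifacts from GitHub.
spark-shell --master yarn \
  --jars azure-documentdb-spark.jar,azure-documentdb-java-sdk.jar
```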

What is DocumentDB?

Azure DocumentDB is our globally distributed database service designed to enable developers to build planet-scale applications. DocumentDB allows you to elastically scale both throughput and storage across any number of geographical regions. The service offers guaranteed low latency at P99, 99.99% high availability, predictable throughput, and multiple well-defined consistency models, all backed by comprehensive SLAs. By virtue of its schema-agnostic and write-optimized database engine, DocumentDB by default automatically indexes all the data it ingests and serves SQL, MongoDB, and JavaScript language-integrated queries in a scale-independent manner. As a cloud service, DocumentDB is carefully engineered with multi-tenancy and global distribution from the ground up.
These unique benefits make DocumentDB a great fit for both operational as well as analytical workloads for applications including web, mobile, personalization, gaming, IoT, and many other that need seamless scale and global replication.

What are the benefits of using DocumentDB for machine learning and data science?

DocumentDB is truly schema-free. By virtue of its commitment to the JSON data model directly within the database engine, it provides automatic indexing of JSON documents without requiring explicit schemas or the creation of secondary indexes. DocumentDB supports querying JSON documents using the familiar SQL language. DocumentDB query is rooted in JavaScript’s type system, expression evaluation, and function invocation. This, in turn, provides a natural programming model for relational projections, hierarchical navigation across JSON documents, self joins, spatial queries, and invocation of user-defined functions (UDFs) written entirely in JavaScript, among other features. We have now expanded the SQL grammar to include aggregations, thus enabling globally distributed aggregates in addition to these capabilities.

Figure 1: With Spark Connector for DocumentDB, data is parallelized between the Spark worker nodes and DocumentDB data partitions

Distributed aggregations and advanced analytics

While Azure DocumentDB has aggregations (SUM, MIN, MAX, and COUNT, with GROUP BY, DISTINCT, etc. in the works), as noted in Planet scale aggregates with Azure DocumentDB, connecting Apache Spark to DocumentDB allows you to easily and quickly perform an even larger variety of distributed aggregations by leveraging Apache Spark. For example, below is a screenshot of a distributed MEDIAN calculation using Apache Spark’s PERCENTILE_APPROX function via Spark SQL.

select destination, percentile_approx(delay, 0.5) as median_delay
from df
where delay < 0
group by destination
order by percentile_approx(delay, 0.5)

Figure 2: Area visualization for the above distributed median calculation via Jupyter notebook service on Spark on Azure HDInsight.

Push-down predicate filtering

As shown in the following animated GIF, queries from Apache Spark push down predicates to Azure DocumentDB and take advantage of the fact that DocumentDB indexes every attribute by default. Furthermore, by pushing computation close to where the data lives, we can do processing in situ and reduce the amount of data that needs to be moved. At global scale, this results in tremendous performance speedups for analytical queries.

For example, if you only want to ask for the flights departing from Seattle (SEA), the Spark to DocumentDB connector will:

Send the query to Azure DocumentDB.
As all attributes within Azure DocumentDB are automatically indexed, only the flights pertaining to Seattle will be returned to the Spark worker nodes quickly.

This way, as you perform your analytics, data science, or machine learning work, you transfer only the data you need.
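A rough sketch of the idea (the function and query shape are illustrative, not the connector's actual implementation): an equality filter expressed on the Spark side is translated into a DocumentDB SQL predicate, so the filtering happens where the index lives.

```python
def pushdown_query(collection: str, column: str, value: str) -> str:
    """Illustrative sketch: translate a Spark equality filter into the kind of
    DocumentDB SQL predicate that would be pushed down to the database.
    Because DocumentDB indexes every attribute by default, this predicate
    is served by the index rather than by scanning and shipping all rows."""
    return f"SELECT * FROM {collection} c WHERE c.{column} = '{value}'"

# Only flights departing from Seattle cross the wire to the Spark workers.
query = pushdown_query("flights", "origin", "SEA")
print(query)
```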

Blazing fast IoT scenarios

Azure DocumentDB is designed for high-throughput, low-latency IoT environments. The animated GIF below refers to a flights scenario.

Together, you can:

Handle high throughput of concurrent alerts (e.g., weather, flight information, global safety alerts, etc.)
Send this information downstream for device notifications, RESTful services, etc. (e.g., alert on your phone of an impending flight delay) including the use of change feed
At the same time, keep the ML models you are building against your data current with the latest information
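The fan-out step above can be sketched in a few lines of plain Python (all names here are hypothetical; in practice the documents would arrive via the DocumentDB change feed and the handlers would call real notification services):

```python
# Minimal sketch: fan out changed documents to downstream notification
# handlers, as in the alerting scenario above. Names are hypothetical.
def dispatch_alerts(changes, handlers):
    """Send each changed document to every registered handler; collect results."""
    delivered = []
    for doc in changes:
        for handler in handlers:
            delivered.append(handler(doc))
    return delivered

# One incoming change-feed document: a flight delay alert.
changes = [{"flight": "MS123", "delay_min": 30}]

# Two downstream consumers: a phone push notification and an audit log.
notify_phone = lambda d: f"push: {d['flight']} delayed {d['delay_min']} min"
log_alert = lambda d: f"log: {d['flight']}"

print(dispatch_alerts(changes, [notify_phone, log_alert]))
```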

Updateable columns

Related to the blazing fast IoT scenarios noted above, let's dive into updateable columns:

As new information comes in (e.g., the flight delay changes from 5 minutes to 30 minutes), you want to be able to quickly re-run your machine learning (ML) models to reflect it. For example, you can predict the impact of the 30-minute delay on all downstream flights. This re-run can be quickly initiated via the Azure DocumentDB Change Feed to refresh your ML models.
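As a toy illustration of the update-then-recompute loop (the "model" and field names are invented; a real system would re-run an actual ML model on the change feed event):

```python
# Toy sketch: a document's delay field is updated, and the change triggers
# a recomputation of the downstream prediction. Names are hypothetical.
def predict_connection_missed(delay_min, connection_buffer_min=20):
    """Toy 'model': a connection is at risk when the delay exceeds the buffer."""
    return delay_min > connection_buffer_min

doc = {"flight": "MS123", "delay_min": 5}
print(predict_connection_missed(doc["delay_min"]))  # 5 min: connection safe

# New information arrives (in practice, via the change feed): delay grows to 30 min.
doc["delay_min"] = 30
print(predict_connection_missed(doc["delay_min"]))  # connection now at risk
```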

Next steps

In this blog post, we’ve looked at the new Spark to DocumentDB Connector. Spark together with DocumentDB enables both ad-hoc, interactive queries on big data, as well as advanced analytics, data science, machine learning, and artificial intelligence. DocumentDB can be used for capturing data that is collected incrementally from various sources across the globe. This includes social analytics, time series, game or application telemetry, retail catalogs, up-to-date trends and counters, and audit log systems. Spark can then be used for running advanced analytics and AI algorithms at scale on top of the data coming from DocumentDB.

Companies and developers can employ this scenario in online shopping recommendations, spam classifiers for real time communication applications, predictive analytics for personalization, and fraud detection models for mobile applications that need to make instant decisions to accept or reject a payment. Finally, internet of things scenarios fit in here as well, with the obvious difference that the data represents the actions of machines instead of people.

To get started running queries, create a new DocumentDB account from the Azure Portal and work with the project in our Azure-DocumentDB-Spark GitHub repo. Complete instructions are available in the Connecting Apache Spark to Azure DocumentDB article.

Stay up-to-date on the latest DocumentDB news and features by following us on Twitter @DocumentDB or reach out to us on the developer forums on Stack Overflow.
Source: Azure

Announcing general availability of Azure HDInsight 3.6

This week at DataWorks Summit, we are pleased to announce general availability of Azure HDInsight 3.6, backed by our enterprise-grade SLA. HDInsight 3.6 brings updates to various open source components in the Apache Hadoop and Spark ecosystem to the cloud, allowing customers to deploy them easily and run them reliably on an enterprise-grade platform.

What’s new in Azure HDInsight 3.6

Azure HDInsight 3.6 is a major update to the core Apache Hadoop and Spark platform as well as to various open source components. HDInsight 3.6 ships the latest Hortonworks Data Platform (HDP) 2.6, a collaborative effort between Microsoft and Hortonworks to bring HDP to market cloud-first. You can read more about this effort here. HDInsight 3.6 GA also builds upon the public preview of 3.6, which included Apache Spark 2.1. We would like to thank you for trying the preview and providing feedback, which has helped us improve the product.

Apache Spark 2.1 is now generally available, backed by our existing SLA. We are introducing capabilities to support real-time streaming solutions through Spark integration with Azure Event Hubs and through the structured streaming connector for Kafka on HDInsight. This allows customers to use Spark to analyze millions of real-time events ingested into these Azure services, enabling IoT and other real-time scenarios. HDInsight 3.6 only includes Apache Spark 2.1 and above; there is no support for older versions such as 2.0.2 or below. Learn more on how to get started with Spark on HDInsight.

Apache Hive 2.1 enables roughly 2x faster ETL with robust SQL-standard ACID merge support and many more improvements. This release also includes an updated preview of Interactive Hive using LLAP (Live Long and Process), which enables up to 25x faster queries. With the new version of Hive, customers can expect sub-second performance, enabling enterprise data warehouse scenarios without the need for data movement.
Learn more on how to get started with Interactive Hive on HDInsight.

This release also includes new Hive views (Hive View 2.0), which provide an easy-to-use graphical user interface for developers getting started with Hadoop. Developers can use it to easily upload data to HDInsight, define tables, write queries, and get insights from data faster. The following screenshot shows the new Hive View 2.0 interface.

We are also expanding interactive data analysis by including the Apache Zeppelin notebook alongside Jupyter. Zeppelin is pre-installed when you use HDInsight 3.6, and you can easily launch it from the portal. The following screenshot shows the Zeppelin notebook interface.

Getting started with Azure HDInsight 3.6

It is very simple to get started with Azure HDInsight 3.6: simply go to the Microsoft Azure portal and create an Azure HDInsight service. Once you’ve selected HDInsight, you can pick the specific version and workload based on your desired scenario. Azure HDInsight supports a wide range of scenarios and workloads, with Hive, Spark, Interactive Hive (Preview), HBase, Kafka (Preview), Storm, and R Server as options you can select from. Learn more about creating clusters in HDInsight. Once you’ve completed the wizard, the appropriate cluster will be created. Apart from the Azure portal, you can also automate creation of the HDInsight service using the command-line interface (CLI). Learn more on how to create a cluster using the CLI.

We hope that you like the enhancements included in this release.
Following are some resources to learn more about this HDInsight 3.6 release:

Learn more and get help

Azure HDInsight overview
Getting started with Azure HDInsight
Use Hive on HDInsight
Use Spark on HDInsight
Use Interactive Hive on HDInsight
Use HBase on HDInsight
Use Kafka on HDInsight
Use Storm on HDInsight
Use R Server on HDInsight
Open source component guide on HDInsight
Extend your cluster to install open source components
HDInsight release notes
HDInsight versioning and support guidelines
How to upgrade an HDInsight cluster to a new version
Ask HDInsight questions on Stack Overflow
Ask HDInsight questions on MSDN forums

Summary

This week at DataWorks Summit, we are pleased to announce general availability of Azure HDInsight 3.6, backed by our enterprise-grade SLA. HDInsight 3.6 brings updates to various open source components in the Apache Hadoop and Spark ecosystem to the cloud, allowing customers to deploy them easily and run them reliably on an enterprise-grade platform.
Source: Azure