Data Management Gateway – High Availability and Scalability Preview

We are excited to announce the preview for Data Management Gateway – High Availability and Scalability.

You can now associate multiple on-premises machines with a single logical gateway. The benefits are: 

Higher availability of Data Management Gateway (DMG) – DMG will no longer be the single point of failure in your Big Data solution or cloud data integration with Azure Data Factory, ensuring continuity with up to 4 nodes.

Improved performance and throughput during data movement between on-premises and cloud data stores. Get more information on performance comparisons.
Both scale-out and scale-up support – Not only can DMG be installed across 4 nodes (scale out), but you can now increase or decrease the number of concurrent data movement jobs at each node (scale up/down) as needed.
Note: The Scale up/down feature is now available for all existing Single Node (GA) gateways. This update is not limited to this preview. 
Richer Data Management Gateway monitoring experience – You can monitor each node’s status and resource utilization, all in one place in the Azure portal. This helps simplify DMG management.

Note: Monitoring is now available for all existing Single Node (GA) gateways. This update is not limited to this preview. 

For more information on the Data Management Gateway ‘High Availability and Scalability’ feature, check our documentation.

Getting started

Scenario 1 – Setting up a new ‘Highly Available and Scalable’ Data Management Gateway.


Scenario 2 – Upgrading existing Data Management Gateway to enable the ‘High Availability and Scalability’ feature.


Prerequisite – This preview feature is supported on Data Management Gateway version 2.12.xxxx.x and above. Download the latest version of Data Management Gateway.

In case you have any queries, please feel free to reach out to us at dmghelp@microsoft.com.
Source: Azure

Announcing Microsoft’s Coco Framework for enterprise blockchain networks

Blockchain is a transformational technology with the potential to extend digital transformation beyond a company’s four walls and into the processes it shares with suppliers, customers and partners. A growing number of enterprises are investing in blockchain as a secure and transparent way to digitally track the ownership of assets across trust boundaries and to collaborate on shared business processes, opening up new opportunities for cross-organizational collaboration and imaginative new business models.

Microsoft is committed to bringing blockchain to the enterprise—and is working with customers, partners, and the blockchain community to continue advancing its enterprise readiness. Our mission is to help companies thrive in this new era of secure multi-party computation by delivering open, scalable platforms and services that any company—from ledger startups to retailers to health providers to global banks—can use to improve shared business processes.

As enterprises look to apply blockchain technology to meet their business needs, they’ve come to realize that many existing blockchain protocols fail to meet key enterprise requirements such as performance, confidentiality, governance, and required processing power. This is because existing systems were designed to function—and to achieve consensus—in public scenarios amongst anonymous, untrusted actors with maximum transparency. Because of this, transactions are posted “in the clear” for all to see, every node in the network executes every transaction, and computationally intensive consensus algorithms must be employed. These safeguards, while necessary to ensure the integrity of public blockchain networks, require tradeoffs in terms of key enterprise requirements such as scalability and confidentiality.

Efforts to adapt existing public blockchain protocols or to create new protocols to meet these needs have generally traded one required enterprise attribute for another—such as improved confidentiality at the cost of greater complexity or lower performance. 

Facilitating enterprise blockchain adoption

Today I am proud to introduce the Coco Framework, an open-source system that enables high-scale, confidential blockchain networks that meet all key enterprise requirements—providing a means to accelerate production enterprise adoption of blockchain technology.

Coco achieves this by designing specifically for confidential consortiums, where nodes and actors are explicitly declared and controlled. Based on these requirements, Coco presents an alternative approach to ledger construction, giving enterprises the scalability, distributed governance and enhanced confidentiality they need without sacrificing the inherent security and immutability they expect.

Leveraging the power of existing blockchain protocols, trusted execution environments (TEEs) such as Intel SGX and Windows Virtual Secure Mode (VSM), distributed systems and cryptography, Coco enables enterprise-ready blockchain networks that deliver:

Throughput and latency approaching database speeds.
Richer, more flexible, business-specific confidentiality models.
Network policy management through distributed governance.
Support for non-deterministic transactions.

By providing these capabilities, Coco offers a trusted foundation with which existing blockchain protocols can be integrated to deliver complete, enterprise-ready ledger solutions, opening up broad, high-scale scenarios across industries, and furthering blockchain's ability to digitally transform business.

We have already begun exploring Coco’s potential across a variety of industries, including retail, supply chain and financial services.

"Being able to run our existing supply chain Dapp code much faster within Coco framework is a great performance improvement that will reduce friction when we talk about enterprise Blockchain readiness with our retail customers. Adding data confidentiality support without sacrificing this improvement is what will enable us to lead the digital transformation we are envisioning with Smart Supply Chains."

– Tom Racette, Vice President, Global Retail Business Development, Mojix

Whether a customer is designing an end-to-end trade finance solution, using blockchain to ensure security at the edge or leveraging Enterprise Smart Contracts to drive back office efficiencies, Coco enables them to meet their enterprise requirements. Microsoft is the only cloud provider that delivers consistency across on-premises and the public cloud at hyperscale while providing access to the rich Azure ecosystem for the wide range of applications that will be built on top of blockchain as a shared data layer.

An open approach

By design, Coco is open and compatible with any blockchain protocol. Microsoft has already begun integrating Ethereum into Coco and we’re thrilled to announce that J.P. Morgan Chase, Intel and R3 have committed to integrating enterprise ledgers, Quorum, Hyperledger Sawtooth and Corda, respectively. This is just the beginning, and we look forward to exploring integration opportunities with other ledgers in the near future.

"Microsoft's Coco Framework represents a breakthrough in achieving highly scalable, confidential, permissioned Ethereum or other blockchain networks that will be an important construct in the emerging world of variously interconnected blockchain systems."

– Joseph Lubin, Founder of ConsenSys

I believe Coco can only benefit from the diverse and talented open source communities that are driving blockchain innovation today. While Coco started as a collaboration between Azure and Microsoft Research, it has benefitted from the input of dozens of customers and partners already. Opening up Coco is a way to scale development far beyond the reach and imagination of our initial working group, and our intent is to contribute the source code to the community in early 2018.

Coco will be compatible, by design, with any ledger protocol and can operate in the cloud and on premises, on any operating system and hypervisor that supports a compatible TEE. We are building in this flexibility in part to allow the community to integrate Coco with additional protocols, try it on other hardware and adapt it for enterprise scenarios we haven't yet thought of.

Industry enthusiasm for blockchain is growing, and while it will still take time for blockchain to achieve enterprise assurance, we remain laser focused on accelerating its development and enterprise adoption in partnership with the community.

To learn more about Coco you can read our technical whitepaper and watch my demo on the MSCloud YouTube page – be sure to star and follow the project on GitHub to keep up with the working group and receive notifications on the latest developments!
Source: Azure

General Availability: Azure Media Redactor

Azure Media Redactor is a powerful cloud video processing service that automatically detects and blurs faces in your videos, for use in cases such as public safety and news media. Based on artificial intelligence technology developed in-house, Redactor can be used in both automated and semi-manual ways to improve the efficiency of workflows that involve labor-intensive manual video editing.

In our previous blog post we discussed the preview release of Azure Media Redactor and the various ways you can use it. This release includes a couple of changes based on your feedback during the preview process, and updates the feature to include full SLA support. You can view updated pricing for this feature here.

Updates in this release include the following:

Greatly improved processing speed
Better face detection and tracking
Stickier face ID association
Multiple blur modes

View our full documentation page for details on using all these features.

See our pricing page for updated GA pricing for Azure Media Redactor.

Improved performance

Speed of processing varies quite a bit depending on video size, framerate, and number of faces in the video. Expect a 720p 30fps video to take between 1x and 2x real time to complete processing.
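The "1x to 2x real time" guidance translates into a simple back-of-envelope estimate. The helper below is purely illustrative (it is not part of any Azure SDK; the factor range is the assumption stated above for a 720p 30fps video):

```python
# Rough estimate of Redactor processing time from the 1x-2x real-time guidance.
# The function and its default factors are illustrative assumptions only.

def estimate_processing_minutes(video_minutes, low_factor=1.0, high_factor=2.0):
    """Return the (min, max) expected processing time in minutes."""
    return video_minutes * low_factor, video_minutes * high_factor

low, high = estimate_processing_minutes(10)  # a 10-minute 720p 30fps clip
print(f"Expect roughly {low:.0f}-{high:.0f} minutes of processing")
```

So a 10-minute clip would be expected to finish in roughly 10 to 20 minutes under this guidance.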

Another large improvement is in face grouping, where the same face that appears in the video at multiple points will be given the same ID. Previously, the same face could easily be assigned multiple IDs as it appeared throughout a video, which made selectively blurring individual faces much harder.

Accuracy of face detection has also been slightly improved from the previous version.

Blurring changes

We now offer 5 blurring modes you can choose from via the JSON configuration preset. By default ‘Med’ is used.

Example JSON:

{'version':'1.0', 'options': {'Mode':'Combined', 'BlurType':'High'}}
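If you build the preset programmatically, Python's `json` module emits a strictly valid (double-quoted) equivalent of the configuration above. This snippet is an illustrative sketch, not part of the Redactor SDK; the inline comments reflect the modes and values described in this post:

```python
import json

# Build the Redactor configuration preset programmatically.
# Strict JSON requires double-quoted keys and values; json.dumps emits valid JSON.
preset = {
    "version": "1.0",
    "options": {
        "Mode": "Combined",   # mode value shown in the sample preset above
        "BlurType": "High",   # one of: Low, Med, High, Debug, Black
    },
}
print(json.dumps(preset))
```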

The five modes are: Low, Med, High, Debug, and Black.

Source: Azure

Announcing the new and improved Azure Log Analytics

The Azure Log Analytics service is rolling out an upgrade to existing customers today, offering powerful search, smart analytics, and even deeper insights. This upgrade provides an interactive query language and an advanced analytics portal, powered by a highly scalable data store like the one used by Azure Application Insights. This creates a consistent monitoring experience for IT operations and developers.

In the biggest upgrade since its launch, the new and improved Azure Log Analytics brings you a simple yet very powerful query language with all the capabilities requested in the language feedback. Over the last couple of months, we have been working closely with 60+ customers who had early access to the upgrade, and their feedback has been very positive regarding the enhanced experience and capabilities of the new language. Here are some of their quotes I would like to share with you:

"Wizards of the Coast was fortunate enough to gain early access to Azure Log Analytics upgrade and it has been instrumental in our ability to diagnose issues within our code base and environments, and to view on a large scale the overall performance. The portal implementation is intuitive, and the query language is extremely easy to understand, and the IntelliSense implementation is refined and extremely helpful in its implementation.” 

–Scott Thomas, Infrastructure & Platform Architect, Wizards of the Coast

“I just got our workspace upgraded and the new query language is awesome (so far)! The queries are lightning fast, IntelliSense works great, and I can now do the aggregations I couldn't do before. This is light-years ahead of the old query engine. Bravo!”

–Microsoft IT

“With the new query language, we can carve up Log Analytics data in any way we need to visualize it. Key benefits include the ability to use unions, joins, functions and variables. We have been able to create queries which would not have been possible with the original query language. The upgrade experience was seamless and existing queries were converted automatically. Even with custom solutions which we had developed for Log Analytics the conversion was very straightforward.”

–Cameron Fuller, Principal Consultant, Catapult Systems

Why should I upgrade?

This upgrade opens endless possibilities, but here are some of the brand new key capabilities available immediately after the upgrade, which in most cases takes only a few seconds.

Powerful query language with built-in Smart Analytics

The query language provides powerful search, query time field extractions, calculated fields, joins and unions, as well as rich date time operators, string operators and native JSON support. The query language also supports let statements, lambda expressions and comments in queries, an extremely important feature to modularize the queries, especially when sharing queries with colleagues or using them for live site support and troubleshooting. The query language offers flexible machine learning constructs and time series functions to help customers get deeper insights into their data. For instance, the time series functions help analyze CPU performance from hundreds of computers and select the top N based on usage spikes. There are numerous other capabilities included in the language, which can be further explored in the Azure Log Analytics resource.

Now let’s look at some examples of these capabilities in the context of scenarios. None of the queries shown in the examples below were feasible in the previous query language.

This query calculates whether a service-level agreement (SLA) was met based on IIS call duration.

This example, using joins, shows a list of missing security updates for computers with a high-severity security alert detected in the last day.

Here is another example, using time series analysis to analyze the CPU performance of several computers and narrow it down to the two most relevant.
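The time series scenario above (pick the top N computers by CPU usage spikes) can be sketched in plain Python. This is only an illustration of the idea, not the Log Analytics query language; the sample data and helper are hypothetical:

```python
# Rank computers by their largest spike (max deviation above their own
# average) and keep the top N. Data and function names are hypothetical.

def top_n_by_spike(samples, n=2):
    def spike(series):
        avg = sum(series) / len(series)
        return max(series) - avg
    return sorted(samples, key=lambda kv: spike(kv[1]), reverse=True)[:n]

cpu = {
    "web01": [20, 22, 21, 95],  # large spike
    "web02": [50, 52, 51, 53],  # steady
    "db01":  [30, 31, 78, 29],  # medium spike
}
print(top_n_by_spike(list(cpu.items())))  # web01 and db01 rank highest
```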

Advanced Analytics Portal

The Advanced Analytics portal gives you the best experience for writing interactive ad hoc queries, whether it is for troubleshooting, diagnostics, analyzing trends or creating quick visualizations. This game-changing experience provides multi-line editing features with context-aware syntax highlighting and powerful built-in visualizations. You can save and share queries and export data to Excel.

Azure Portal, Power BI Desktop and Microsoft Flow Connector Integration

Now with one click, create a quick visualization on Analytics portal and pin the visualization to a shared Azure Dashboard. This enables you to create a single pane of glass across different workspaces, Azure resources and applications.

With this upgrade, you have a much more powerful integration with Power BI Desktop, the same type of integration as in Application Insights. You can take advantage of additional Power BI visualizations, publish and share them with your colleagues on PowerBI.com and enable automatic daily reports. Finally, you can now integrate with Microsoft Flow and Azure Logic Apps, enabling you to create business flows, notifications, and much more.

How to upgrade

This is probably the simplest upgrade process you’ll experience. Within the application you’ll see a banner prompting you to upgrade, and with just one click, it will enhance your workspace – automatically converting all your artifacts, such as saved searches, views, alerts, and computer groups. Later, all non-upgraded workspaces will automatically be upgraded to the new query language and the platform. Learn more about upgrade process and FAQs in Azure documentation.

Language documentation, learning tools and community

The Log Search page also provides a side-by-side experience with the old query language, enabling you to learn and ramp up on the new one. The main reason for a rolling upgrade rather than an automatic upgrade is to give you time to learn and ramp up at your own pace.

The language documentation site includes an extensive language reference, tutorials, examples, and cheat sheets. A full-featured demo environment enables you to try out queries. We are also launching a community site where you can interact with other product users, as well as the product team, with questions regarding the query language.

Summary

The upgrade enables an assortment of new capabilities and customers are already taking advantage of them. Over the last week and a half, during the soft launch period, hundreds of customers elected to upgrade their workspace, totaling more than 1,000 enhanced workspaces. Upgrade your workspace today and start using the new powerful search and query language to gain deep insights into your data! Register now to join us for a webinar on August 17, 2017, where we will share more details and demos of this improved experience.
Source: Azure

Azure AD authentication extensions for Azure SQL DB and SQL DW tools

With the latest SQL Server tools release, we extended Azure AD authentication support in the SQL DB and SQL DW tools to include token-based authentication (Universal authentication) with MFA support.

The following SQL Server tools have been extended with new functionality:

SSMS 17.2 supports the following functionalities:

Multiple-user Azure AD authentication for Universal authentication with multi-factor support (authentication option: Active Directory – Universal with MFA). A new user credential input field was added for the Universal authentication with MFA method to support multi-user authentication. See below myaccount@gmail.com as user name.          

Azure AD MFA Conditional Access (CA) is available for SQL DB and DW.
Database export/import for DacFx wizard using Universal authentication with MFA.
ADAL managed library used by Universal authentication with MFA was upgraded to 3.13.9 version.
Object Explorer support for Universal authentication with MFA.


SSMS 17.0 release supports “Azure AD domain name or tenant ID” in Connection Properties, an entry required for Azure AD guest users including Microsoft accounts such as hotmail.com, outlook.com, and live.com, as well as non-Microsoft accounts such as gmail.com. See below aadtest.onmicrosoft.com as AD domain name.

The latest SQLPackage.exe supports Universal authentication with MFA.
Rest API for DacFx supports Universal authentication with MFA.
New CLI interface for SQL DB/DW supports setup operations for Azure AD SQL administrator.

For more information about Azure AD authentication extensions please review the following documents:

Download SQL Server Management Studio (SSMS) July 2017 version 17.2
Configure multi-factor authentication for SQL Server Management Studio and Azure AD
Universal Authentication with SQL Database and SQL Data Warehouse (SSMS support for MFA)
Conditional Access (MFA) with Azure SQL Database and Data Warehouse
Configure and manage Azure Active Directory authentication with SQL Database or SQL Data Warehouse
Use Azure Active Directory Authentication for authentication with SQL Database or SQL Data Warehouse
SQLPackage.exe support for UA with MFA  
DacFx UA with MFA support (import a BACPAC file)
DacFx UA with MFA support (export a BACPAC file)
API for UA with MFA support
Download SQLPackage.exe and the DacFx API (SQL Server Data-Tier Application Framework)
CLI for Azure SQL Server Admin Setup
ADAL.dll 3.13.9 release

For further communication on this topic please contact the MFAforSQLDB@microsoft.com alias.
Source: Azure

Operating Azure Stack

Ever since we announced that Azure Stack is ready to order, we’ve seen a variety of questions related to managing and operating Azure Stack. This blog kicks off a series of blogs addressing these questions.

Operating Azure Stack is different. Today, your on-premises IT infrastructure provides a secure and controlled environment for your business solutions, but it also requires configuration, deployment, backup, and management tasks. Your IT administrators spend most of their time on these tasks, simply to keep your on-premises environments running. Azure Stack is an extension of Azure: it enables you to run Azure services in your on-premises environments. That way, you can enable a modern application development environment for your organization across cloud and on-premises, while taking advantage of all the Azure native toolsets and APIs.

To ensure you can successfully provide Azure services in your own on-premises environments and can operate them with cloud SLAs, we’ve spent the last several months talking with many of you. You’ve told us that the following infrastructure management tasks are the most important, time consuming, and complex, and that these should be our focus for simplification:

Managing capacity: Ensuring that your infrastructure capacity is configured to correctly deal with the demands of providing cloud capacity.
Checking and maintaining health: From monitoring, security, business continuity, and disaster recovery, customers want solutions that address these operational tasks and allow them to focus on service delivery.
Managing tenants’ use of resources: Infrastructure is successful only when tenants are satisfied with the services, and customers want to be assured that they can successfully provide and operate these services for tenants.

The “Azure Stack Operator” will be responsible for these tasks. It was with these tasks in mind that we made the necessary investments in the infrastructure management capabilities of Azure Stack and in the definition of the “Azure Stack Operator” role. This introductory post will be followed by a series of posts where we’ll go into more detail about each of these investments, including:

Monitoring and diagnostics: Monitoring, notifications, and management capabilities allow you to manage the infrastructure and service health, performance, and capacity that underlie your tenant workloads.
Patching and update: With Azure Stack, you can update your infrastructure software while minimizing the impact on your business applications, services, and workloads.
Business continuity: Azure Backup and Azure Site Recovery will enable tenant-driven protection for business applications and services.
Security and compliance: Azure Stack has a secure-by-design approach across network, data, and management.
Hardware lifecycle management: Azure Stack will have validated workflows to enable the replacement of failed components.
Intuitive experiences: A portal and command-line experience highlights the common actions you need to perform. This allows you to make decisions quickly and intuitively.

Future posts will also address the ways Azure Stack can be integrated into your existing datacenter, including networking, identity, and ticketing, and will go into more depth on the Azure Stack Operator role. Operating Azure Stack is different. Although many scenarios are familiar, I want to make sure you approach Azure Stack knowing that how you operate it will be different. Your value will be measured not only by how you manage Azure Stack infrastructure, but also by what services you provide to your developer community and how fast you can enable them.

More information

At Ignite this year in Orlando we will have a series of sessions that will educate you on all aspects of Azure Stack. See our list of sessions and register to attend.

Lastly, the Azure Stack team is extremely customer focused and we are always looking for new customers to talk to. If you are passionate about Hybrid Cloud and want to talk with the team building Azure Stack at Ignite, please sign up for our customer meetup.
Source: Azure

Announcing deploy to Azure app service Jenkins plugin and more

We are proud to announce the availability of the Azure App Service plugin for Jenkins, which provides Jenkins native capability to continuously deploy to Azure Web Apps. Depending on your environment, you can choose to use Team Services together with Jenkins, or leverage this plugin to deliver your cloud apps or services.

Azure Web App lets developers rapidly build, deploy, and manage powerful websites and web apps using .NET, Node.js, PHP, Python, and Java. It provides built-in autoscale, load balancing, high availability and auto-patching – letting you focus on your application code. Web App on Linux is now in Public Preview, giving you an additional option to run your cloud apps natively on Docker Containers for Linux.

This release of the Azure App Service plugin for Jenkins supports deploying to Azure Web App through:

Git and FTP for Web App and Web App on Linux
Docker for Web App on Linux

The plugin is pipeline-ready so you can use it in a Jenkinsfile. You can find a walkthrough of deploying a Java app to Web App on Linux on Jenkins Hub.

Additional support, such as deploying to Azure Functions, is on the roadmap. Stay tuned for more updates in the coming months.

Azure Storage plugin update

Speaking of pipeline support, from version 0.3.6 onwards, you can leverage the Azure Storage plugin in pipeline code to upload and download build artifacts. Here is the sample syntax for upload and download, respectively:

azureUpload storageCredentialId: '<credentials id>', storageType: 'blobstorage',
containerName: '<container>', filesPath: '<files in glob pattern>', virtualPath: '<remote path>'

azureDownload storageCredentialId: '<credentials id>', downloadType: 'container',
containerName: '<container>', includeFilesPattern: '<files in glob pattern>', downloadDirLoc: '<local path>'

You can refer to this article about Using Azure Storage with a Jenkins plugin on Jenkins Hub for more information.

As always, we would love to get your feedback via comments below. You can also email Azure Jenkins Support to let us know what you think.
Source: Azure

Automate Application Insights processes with the connector for Flow and Logic Apps

Azure Application Insights provides powerful search capabilities to query and get insights from your telemetry data.

Often, you may find yourself running the same queries repeatedly to validate whether your service is functioning properly or to find trends and anomalies. Wouldn’t it be nice to turn the repetitive queries into your own workflows so you can save time and turn those insights into actions? Well, this is now possible using the Application Insights connector for Microsoft Flow and Azure Logic Apps, as it will allow you to create automated workflows visually.

Using the Application Insights connector, now in preview, you can create workflows that use an Application Insights action to query or visualize your telemetry data. You can have them run together automatically along with any other subsequent actions you choose. There are hundreds of available actions, such as sending an email notification, creating a bug in Visual Studio Team Services, or posting a message in Slack or Microsoft Teams.

Below is a sample screenshot of the Microsoft Flow integration.

Below is a sample screenshot of the Azure Logic Apps integration.

Learn more about how to automate Azure Application Insights processes with the connector for Microsoft Flow or Azure Logic Apps.

Try today and share your feedback at aiflowfb@microsoft.com.

We’d love to hear from you!
Source: Azure

Bring Interactive Analytics to Azure HDInsight: Kyligence Analytics Platform enables sub-second query

SQL on Hadoop is improving continuously, but it is still common to wait minutes or even hours for a single query to return, especially when the dataset is huge. Most of these systems are resource-intensive: queries compete for runtime resources, and performance declines when the workload is high.

To solve this problem, Kyligence Analytics Platform (KAP) enables interactive analytics with sub-second query latency, even on massive datasets. KAP is a leading big data intelligence platform powered by Apache Kylin, and is widely adopted by enterprises such as Lenovo, China Mobile, and many more. We are happy to announce that the Kyligence team and the Azure HDInsight team have worked closely together to bring OLAP capabilities to HDInsight, and KAP is now available on Azure HDInsight as an HDInsight application.

HDInsight Application Platform

Azure HDInsight is the only fully-managed cloud Hadoop offering that provides optimized open source analytical clusters for Spark, Hive, MapReduce, HBase, Storm, Kafka, and R Server backed by a 99.9% SLA. Each of these big data technologies and ISV applications are easily deployable as managed clusters with enterprise-level security and monitoring.

The open source ecosystem of applications has grown with the goal of making it easier for customers to build their big data and analytical solutions. Today, customers find it challenging to discover these productivity applications, and struggle to install and configure the apps. To address this gap, the HDInsight Application Platform provides an experience unique to Microsoft, where ISVs can directly offer their applications to customers, and customers can easily discover, install, and use ISV applications built for the big data ecosystem.

As part of this integration, KAP can be deployed on HDInsight with one click.

Interactive Analytics over Trillions of Records on HDInsight

Hadoop is designed for large-scale data processing, but is not efficient enough for interactive analytics. KAP provides interactive analytics capability on HDInsight through the following integrations:

Native SQL support on Hadoop and HDInsight: Many existing big data analytics technologies have their own query language or proprietary storage engine optimized for analytics scenarios. It is difficult for analysts to learn a new query language or move data out of HDFS/BLOB storage to other platforms. With KAP's native SQL support and ODBC drivers, customers can use the standard SQL interface and choose their favorite BI tools on their large amount of data.
Sub-second query response: Query performance is the bottleneck for most big data use cases. Performance declines if cluster resources cannot scale out when the original data grows 10x. Making the sub-second query response consistent is the key for interactive analytics, and KAP on HDInsight solves this problem by providing pre-calculated cubes.
Elastic architecture: The dataset normally ranges from gigabytes, terabytes, and more. Hadoop provides the elastic infrastructure for batch processing, and KAP as an interactive analytics technology, also leverages the elastic capability of Hadoop to enable the scale-out solutions.
Native integration with HDInsight: Cloud is an effortless way to adopt new technology without worrying about deployment or monitoring. With KAP + HDInsight as a fully managed cloud solution, it can help users reduce operation cost as well as achieve high availability. KAP can work with all the supported Azure storage services (Azure Blob storage and Azure Data Lake Store), and can also work with HDInsight Kafka clusters to ingest data from Kafka.

KAP – Enterprise-ready data warehouse powered by Apache Kylin

KAP, an enterprise OLAP engine on Hadoop powered by Apache Kylin, enables sub-second SQL query latency on petabyte-scale datasets, provides high concurrency at internet scale, and empowers analysts to architect BI on Hadoop with industry-standard data warehouse and business intelligence methodology. KAP is a unified analytics platform that simplifies big data analytics for business users, analysts, and engineers: it is self-service, integrates seamlessly with BI tools, and requires no programming. KAP is a native-on-Hadoop OLAP solution that interacts with the cluster only via standard APIs and supports the main Hadoop distributions, from on-premises environments to the cloud.

On Azure, most data is stored in Azure Blob storage or Azure Data Lake Store, and is then loaded into Hive as external tables. KAP builds the cube (index) using MapReduce/Spark according to the data model designed by the modeler before analysis. During query runtime, all queries can access the pre-aggregated cube data and the result is returned in sub-second time. By leveraging this unique pre-calculation technology, KAP provides consistent query latency regardless of how much the data grows, even with limited resources. KAP also provides native integration with various Azure storage services, such as Azure Blob storage and Azure Data Lake Store. It can also connect with HDInsight Kafka clusters to ingest data from Kafka.
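The pre-calculation idea can be illustrated with a toy example in Python: aggregate a fact table by its dimensions once at build time, then answer queries with a lookup instead of a scan. The data and names below are hypothetical and are not the KAP or Kylin API:

```python
from collections import defaultdict

# Toy "fact table": (region, product, sales_amount) rows. Hypothetical data.
facts = [
    ("east", "widget", 100),
    ("east", "widget", 150),
    ("east", "gadget", 80),
    ("west", "widget", 120),
]

# Build time: pre-aggregate by (region, product) once -- this is the "cube".
cube = defaultdict(int)
for region, product, amount in facts:
    cube[(region, product)] += amount

# Query time: a constant-time lookup instead of a full scan of the raw data.
print(cube[("east", "widget")])  # 250
```

At real scale, the cube is far smaller than the raw data, which is why query latency stays consistent as the raw data grows.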

The screenshot below shows the KAP modeling GUI:


Compared to Hive, KAP is 100x faster without modifying the queries into the HiveQL dialect. ANSI SQL and JDBC/ODBC drivers are also supported, so users can choose their familiar BI tools for interactive analytics, for example Power BI or Tableau. Below is the performance comparison between Apache Kylin and Apache Hive on the SSB dataset:

Installing KAP on Azure HDInsight

With the KAP on Azure HDInsight solution, users can install KAP on their existing HDInsight cluster, or on a standalone optimized cluster designed for KAP, with a single click. Currently, KAP works as an application on the HDInsight HBase cluster.

After the one-click installation, you will get the following components:

KAP: The enterprise version of Apache Kylin, which provides the core OLAP analysis on HDInsight by building pre-calculated cubes.
KyAnalyzer: The built-in OLAP agile BI tool for quick BI analysis by connecting to KAP.

KAP will be installed on the Edge Node in the HBase cluster. To learn more details on how to use KAP on HDInsight, please check the Kyligence blog post.

Summary

KAP on Azure HDInsight brings quick insight into massive datasets with sub-second latency and empowers interactive analytics on Hadoop over trillions of records. It offers web-scale OLAP solutions for various industries to build their online and offline analytics platforms. With cloud-based technologies, computing resources can expand and shrink to handle bursts of data, with a more efficient deployment model, thus helping customers reduce cost and improve productivity.

For more resources to get started, please check the "more resources" section below. If you have any feedback or questions, feel free to drop us an email at hdiask@microsoft.com. We'd love to hear from you!

More resources

Getting Started to use KAP on HDInsight (Kyligence Blog or MSDN blog)
Video Tutorial for KAP on HDInsight
KAP Documentation
Learn more about Azure HDInsight
Ask HDInsight questions on stackoverflow
Learn more about Apache Kylin
Learn more about Kyligence Analytics Platform

Source: Azure