Microsoft 365 boosts usage analytics with Azure Cosmos DB

This post is part of a 2-part series about how organizations are using Azure Cosmos DB to meet real-world needs, and the difference it’s making for them. In this first post we explore the challenges that led the Microsoft 365 usage analytics team to take action, the architecture of the new solution, and migration of the production workload. In part 2, we’ll examine additional implementation details and the outcomes resulting from the team’s efforts.

The challenge: Understanding the behavior of more than 150 million active users

Office 365 is a flagship service within the Microsoft 365 Enterprise solution, with millions of commercial customers and more than 150 million active commercial users each month. Office 365 provides extensive reporting for administrators within each company on how the service is being used, including license assignment, product-level usage, user-level activity, site activity, group activity, storage consumption, and more. The Microsoft 365 usage analytics team incrementally adds new reports to cover more Office 365 services.

Previous architecture

The telemetry data needed to generate such reports was collected in a system called usage analytics, which until recently ran on the community version of MongoDB. The image below shows the data flow, with an importer web service used to write log streams collected in Azure Blob storage to MongoDB. An OData web service exposes APIs to extract the stored data for both reporting within the Microsoft 365 admin center and for access through Microsoft Graph. Every day, as part of a full daily refresh, several billion rows of data were added to the system.

Each of the primary geographies served by Office 365 has an independent usage analytics repository, all employing a similar architecture. In each geography, data was stored on two MongoDB clusters, with each cluster consisting of up to 50 virtual machines (VMs) hosted in Azure Virtual Machines and running MongoDB. The two clusters in each geography functioned in a primary/backup configuration. Data was written separately to both clusters and under normal operation, all reads were performed on the primary cluster.

Each cluster was designed for a write-heavy workload. To speed writes, data was sharded across individual cluster nodes using a random globally unique identifier (GUID) as the MongoDB shard key. Every day for a few hours, new data from Azure Blob storage was written using a multithreaded importer. Each thread wrote batches of 2,000 records at a time to all cluster nodes and waited for all records to finish before starting on the next batch of 2,000.
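The batch-and-wait import pattern described above can be sketched as follows. This is a minimal illustration, not the team's importer: the `write_batch` callback, the `max_attempts` limit, and the use of `IOError` to signal a failed write are all assumptions made for the example. Only the batch size of 2,000 comes from the text.

```python
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 2000  # batch size quoted in the text


def batches(records: Sequence[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


def import_records(
    records: Sequence[dict],
    write_batch: Callable[[List[dict]], None],  # hypothetical writer callback
    max_attempts: int = 3,                      # hypothetical retry limit
) -> int:
    """Write batches one after another; a failed batch is retried as a whole.

    Returns the total number of record writes attempted, including rewrites.
    """
    attempted = 0
    for batch in batches(records):
        for attempt in range(1, max_attempts + 1):
            attempted += len(batch)
            try:
                write_batch(batch)  # returns only when the whole batch is written
                break
            except IOError:
                if attempt == max_attempts:
                    raise
    return attempted
```

Because a failure anywhere in a batch forces the whole batch to be resent, a single error in this sketch costs 2,000 additional record writes.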

Problems and pains

This architecture presented several problems for the Microsoft 365 usage analytics team, ranging from excessive administrative effort and costs to limited performance, reliability, availability, and scalability. Some specific pains included:

Poor performance. Reads were inefficient and reports sometimes timed out because using a random GUID as the shard key required querying all nodes. In addition, during the few hours each day when new data was imported, writes and reads hit the primary cluster at the same time, and performance was poor. To make matters worse, if anything failed during a batch write, which often happened due to internal database errors, all 2,000 records had to be written again.
Full-time administration. Maintenance of the MongoDB clusters was manual and time-consuming, requiring team members to dedicate time to managing the clusters. This put an unnecessary resource constraint on the team, which would rather use its bandwidth to bring new reports to market. In addition, bugs in MongoDB 3.2 required all servers to be restarted weekly. Renewing the security certificates on each cluster node within the virtual network had to be completed annually and required an additional two weeks of effort per cluster. During such routine administrative tasks, if an operation failed on one cluster node, the entire cluster was down until the issue was resolved.
High costs. Significant costs were incurred to run the MongoDB backup clusters, which remained idle most of the time. Those costs continued to increase as Office 365 usage grew.
Limited scalability. Less than three years after MongoDB was initially deployed, the largest repository was almost at maximum capacity. Any spare capacity was forecast to run out within six months as more products and reports were added, with no easy way to scale.

While the team was dealing with the architectural limitations of its existing solution, they were looking ahead to a lineup of new, high-scale capabilities that they wanted to enable for customers in the usage analytics space. The team started looking for a new, cost-effective, and low-maintenance solution that would let them move from self-maintained VMs running MongoDB to a fully managed database service.

Geo-distribution on Azure Cosmos DB: The key to an improved architecture

After exploring their options, the team decided to replace MongoDB with Azure Cosmos DB, a fully managed, globally distributed, multi-model database service designed for virtually unlimited elastic scalability. The first step was to deploy the needed infrastructure.

In contrast to the primary/backup, two-cluster configuration that it had used with MongoDB, the team took advantage of turnkey global distribution of active data in Azure Cosmos DB. Using multiple Azure regions for data replication provided an easy way to write to any region, read from any region, and better balance the workload across the database instances—all while relying on Azure Cosmos DB to transparently handle active data replication and data consistency.

“True geo-replication had been deemed too hard to do with MongoDB, which is why the previous architecture separately wrote data to both the primary and backup clusters,” says Xiaodong Wang, a Software Engineer on the Microsoft 365 usage analytics team. “With Azure Cosmos DB, implementing transparent geo-distribution literally took minutes—just a few mouse clicks.”

The image below shows the internal architecture of the usage analytics system today. Each of the primary geographies served by Office 365 is served by Cosmos databases geo-replicated across two Azure regions within that geography. Under normal operating conditions, writes are sent to one region within each geography while reads are routed to both. If for some reason a region is prevented from serving reads, those reads are automatically routed to the other region serving that same geography.
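The routing behavior described above can be illustrated with a small sketch. It is only a model of the described behavior, not how Azure Cosmos DB or its SDKs implement routing. The `GeoRouter` class and the region names used with it are hypothetical.

```python
from itertools import cycle
from typing import Iterable, Set


class GeoRouter:
    """Toy model: one write region per geography, reads spread over all regions."""

    def __init__(self, write_region: str, read_regions: Iterable[str]):
        self.write_region = write_region
        self.read_regions = list(read_regions)
        self.unavailable: Set[str] = set()  # regions currently unable to serve reads
        self._rotation = cycle(self.read_regions)

    def route_write(self) -> str:
        """Writes always go to the single write region."""
        return self.write_region

    def route_read(self) -> str:
        """Round-robin over read regions, skipping any that are unavailable."""
        for _ in range(len(self.read_regions)):
            region = next(self._rotation)
            if region not in self.unavailable:
                return region
        raise RuntimeError("no region available for reads")
```

With two regions, reads alternate between them under normal conditions. If one region is added to `unavailable`, every read is routed to the other.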

Migrating a production workload to Azure Cosmos DB

Developers began writing a new data access layer on the new infrastructure to accommodate reads and writes, using the Azure Cosmos DB SQL (Core) API. After bringing the new system online, the team began to write new production data to both old and new systems, while continuing to serve production reports from the old one.

Developers began to address the reports that they would need to duplicate for the new solution, working through them one at a time. Separate Cosmos containers were created within the database for most reports, so that each collection would be separately scalable after the system came online. The largest reports were addressed first to ensure that Azure Cosmos DB could handle them, and after each new report was verified, the team began serving it from the new environment.
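The dual-write, report-by-report cutover described above can be sketched as follows. This is an illustration under simplifying assumptions: both stores are modeled as in-memory dictionaries, and the verification step is a plain equality check. The `MigrationFacade` name and its methods are hypothetical.

```python
from typing import Dict, List, Set


class MigrationFacade:
    """Writes go to both stores; each report is served from the new store once verified."""

    def __init__(self) -> None:
        self.old_store: Dict[str, List[dict]] = {}  # stands in for the old system
        self.new_store: Dict[str, List[dict]] = {}  # stands in for the new system
        self.cut_over: Set[str] = set()             # reports already served from the new store

    def write(self, report: str, row: dict) -> None:
        """Dual write: every new row lands in both systems."""
        self.old_store.setdefault(report, []).append(row)
        self.new_store.setdefault(report, []).append(row)

    def verify_and_cut_over(self, report: str) -> bool:
        """Switch a report to the new store only if both stores agree."""
        if self.old_store.get(report, []) == self.new_store.get(report, []):
            self.cut_over.add(report)
            return True
        return False

    def read(self, report: str) -> List[dict]:
        """Serve from the new store after cutover, otherwise from the old one."""
        source = self.new_store if report in self.cut_over else self.old_store
        return list(source.get(report, []))
```

Keeping the cutover decision per report is what allows one report at a time to move while the rest continue to be served from the old system.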

After all functionality and reports were being served by Azure Cosmos DB, and everything was running as it should, the team stopped writing new data to the old system and decommissioned the MongoDB environment. The development team was able to move to Azure Cosmos DB, rewrite the data access layer, and migrate all reports for all geographies without any service interruptions to end users.

In part 2 of this series, we'll cover additional implementation details and the outcomes resulting from the Microsoft 365 usage analytics team’s implementation of Azure Cosmos DB.
Source: Azure

Microsoft 365 boosts usage analytics with Azure Cosmos DB – Part 2

This post is part of a 2-part series about how organizations are using Azure Cosmos DB to meet real-world needs, and the difference it’s making for them. In part 1, we explored the challenges that led the Microsoft 365 usage analytics team to take action, the architecture of the new solution, and migration of the production workload. In this post, we’ll examine additional implementation details and the outcomes resulting from the team’s efforts.

Finding the right partition key—a critical design decision

After moving to Azure Cosmos DB, the team revisited how data would be partitioned (referred to as “sharding” in MongoDB). With Azure Cosmos DB, each collection must have a partition key. Documents that share a partition key value form a logical partition, which gives Azure Cosmos DB a natural boundary for distributing data across physical partitions. The data for a single logical partition must reside inside a single physical partition. Physical partitions are managed internally by Azure Cosmos DB.

The Microsoft 365 usage analytics team worked closely with the Azure Cosmos DB team to optimize data distribution in a way that would ensure high performance. The team initially tried the same approach they had used with MongoDB, which was using a random GUID as the partition key. However, this required scanning all of the partitions for reads and over-allocating resources for writes, making writes fast but reads slow. The team then tried using Tenant ID as the partition key but found that the vast difference in the amount of report data for each tenant made some partitions too hot, which would have required throttling, while others remained cold.

The solution lay in creating a synthetic partition key. In the end, the team solved both the slow-read problem and the hot and cold partition problem by grouping 100 documents per tenant ID into a bucket and then using the combination of tenant ID and bucket ID as the partition key. The bucket ID loops from 1 to n, where n is a variable and can be adjusted for each report.
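One way to read the synthetic key scheme described above is sketched below. The text does not say how documents are assigned to buckets or how the key is formatted, so the per-tenant document index, the underscore separator, and the function names are assumptions made for illustration.

```python
from typing import List

DOCS_PER_BUCKET = 100  # bucket size quoted in the text


def bucket_id(doc_index: int, n: int, docs_per_bucket: int = DOCS_PER_BUCKET) -> int:
    """Map a tenant's zero-based document index to a bucket ID that loops from 1 to n."""
    if n < 1 or doc_index < 0:
        raise ValueError("n must be >= 1 and doc_index must be >= 0")
    return (doc_index // docs_per_bucket) % n + 1


def synthetic_partition_key(tenant_id: str, doc_index: int, n: int) -> str:
    """Combine tenant ID and bucket ID into one partition key value."""
    return f"{tenant_id}_{bucket_id(doc_index, n)}"


def keys_to_query(tenant_id: str, n: int) -> List[str]:
    """A read for one tenant touches at most n known keys, not every partition."""
    return [f"{tenant_id}_{b}" for b in range(1, n + 1)]
```

Under this reading, a large tenant's data is spread across n logical partitions instead of one, while a read for that tenant only needs the n keys returned by `keys_to_query` rather than a scan of all partitions.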

Handling four terabytes of new data every day

In one region alone, more than 6 TB of data is stored in Azure Cosmos DB, with 4 TB of that written and refreshed daily. Both of those numbers are continuing to grow. The database consists of more than 50 different collections, and the largest is more than 300 GB in size. It consumes an average of 150,000 request units per second (RU/s) of throughput, scaling this number up and down as needed.

The different collections map closely to the different reports that the system serves, which in turn have different throughput requirements. This design enables the Microsoft 365 usage analytics team to optimize the number of RU/s that are allocated to each collection (and thus to each report), and to elastically scale that throughput up or down on a per-collection and per-report basis.

Built-in, cost-effective scalability and performance

With Azure Cosmos DB, the Microsoft 365 usage analytics team is delivering real-time customer insights with less maintenance, better performance, and improved availability—all at a lower cost. The new usage analytics system can now easily scale to handle future growth in the number of Office 365 commercial customers. All that was accomplished in less than five months, without any service interruptions. “The benefits of moving from MongoDB to Azure Cosmos DB more than justify the effort that it took,” says Guo Chen, Principal Software Development Manager on the Microsoft 365 usage analytics team.

Improved performance and service availability

The team’s use of built-in, turnkey geo-distribution provided a way to easily distribute reads and writes across two regions. Combined with the other work done by the team, such as rewriting the data access layer using the Azure Cosmos DB Core (SQL) API, this enabled the team to reduce the time for the majority of reads from 12 milliseconds to 3 milliseconds. The image below illustrates this performance improvement.

Although this difference may seem negligible in the context of viewing a report, it resulted in significant service improvements. “There are two ways to access reporting data in the usage analytics system: through the Microsoft 365 admin center, and through Microsoft Graph,” explains Xiaodong Wang, a Software Engineer on the Microsoft 365 usage analytics team. “In the past, people complained that the Graph API was too slow. That’s no longer an issue. In addition, service availability is better now because the chances of any query timing out are reduced.”

The image below shows just how much service availability is improved. The graph illustrates successful API requests divided by the total API requests and shows that the system is now delivering a service availability level of greater than 99.99 percent.

Zero maintenance and administration

Because Azure Cosmos DB is a fully managed service, the Office 365 development team no longer needs to devote one full-time person to database maintenance and administration. Annual certificate maintenance is no longer a burden, and VMs no longer need to be restarted weekly to protect against any compromises in service availability.

“In the past, with MongoDB, we had to allocate core developer resources to administrative management of the data store,” says Shilpi Sinha, Principal Program Manager on the Microsoft 365 usage analytics team. “Now that we are running on a fully managed service, we are able to repurpose developer resources towards adding new customer value instead of managing the infrastructure.”

Elastic scalability

The Microsoft 365 usage analytics team can now scale database throughput up or down on demand, as needed to accommodate a fluctuating workload that, on average, is growing at a rate of 8 percent every three months. By simply adjusting the number of RU/s allocated to each collection, which can be done in the Azure portal or programmatically, the team can easily scale up during heavy data-ingestion periods, to handle new reports, and most importantly, to accommodate continued overall growth of Office 365 around the world.

“Today, all we need to do is keep an eye on request unit usage versus what we have budgeted,” says Wang. “If we’re reaching capacity, we can allocate more RU/s in just a few minutes. We don’t have to pay for spare capacity until we need it and more importantly, we no longer need to worry whether we can handle future growth in data volumes or report usage.”
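To put the growth figure in perspective, the arithmetic can be sketched as follows. The 8 percent quarterly rate and the 150,000 RU/s average come from the text. Applying that growth rate directly to RU/s demand, and the 200,000 RU/s budget used in the usage note, are assumptions made only for illustration.

```python
QUARTERLY_GROWTH = 0.08  # 8 percent every three months, from the text


def annual_growth(quarterly: float = QUARTERLY_GROWTH) -> float:
    """Compound a quarterly growth rate over four quarters."""
    return (1 + quarterly) ** 4 - 1


def quarters_until_budget_reached(
    current_rus: float,
    budget_rus: float,
    quarterly: float = QUARTERLY_GROWTH,
) -> int:
    """Count the quarters until compounding demand reaches or exceeds the budget."""
    if current_rus <= 0 or quarterly <= 0:
        raise ValueError("current_rus and quarterly must be positive")
    quarters = 0
    load = current_rus
    while load < budget_rus:
        load *= 1 + quarterly
        quarters += 1
    return quarters
```

For example, `annual_growth()` gives roughly 0.36, or about 36 percent per year, and `quarters_until_budget_reached(150_000, 200_000)` returns 4 under these assumptions.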

Lower costs

On top of all of those benefits, the Microsoft 365 usage analytics team increased data and reporting volumes while reducing its monthly Microsoft Azure bill for the usage analytics system by more than 13 percent. “After we cut over to Azure Cosmos DB, our monthly Azure expenses decreased by almost 20 percent,” says Chen. “We undertook this project to better serve our customers. Being able to save close to a quarter-million dollars per year—and likely more in the future—is like icing on the cake.”

“Usage analytics are offered as part of the base capability to all Microsoft 365 customers, irrespective of the type of subscription they purchase,” said Sinha. “Keeping the costs of operating this service as low as possible contributes to our goal of running the overall Microsoft 365 service as efficiently as possible while at the same time giving our customers new and improved insights into how their people are using our services.”

Learn more about Microsoft usage analytics and Azure Cosmos DB today.

Microsoft Azure portal May 2019 update

This month is packed with updates on the Azure portal, including enhancements to the user experience, resource configuration, management tools and more.

Sign in to the Azure portal now and see for yourself everything that’s new. Download the Azure mobile app to stay connected to your Azure resources anytime, anywhere.

Here’s the list of May updates to the Azure portal:

User experience

Improvements to the Azure portal user experience
Tabbed browsing support for more portal links

IaaS

Improved VMSS Diagnostics and troubleshooting with Boot Diagnostics, Serial Console access, and Resource Health
Updated VM computer name and Hostname display
New full-screen create experience for Azure Container Instances
New integrations for Azure Kubernetes Service
Multiple node pools for Azure Kubernetes Service (preview)

Storage

Azure Storage Data Transfer

Management tools

View change history in Activity Log

Create your first cloud project with confidence

Azure Quickstart Center now generally available

Security Center

Changing a VM group membership on adaptive application controls
Advanced Threat Protection for Azure Storage now generally available
Virtual machine scale set support now generally available
Adaptive network hardening now in public preview
Regulatory Compliance Dashboard is now generally available

Site Recovery

Add a disk to an already replicated Azure VM
Enhancements to Process Server monitoring
Dynamic Non-Azure groups for Azure Update Management public preview

Intune

Updates to Microsoft Intune

Let’s look at each of these updates in greater detail.

User experience

Improvements to the Azure portal user experience

Several new improvements this month help enrich your experience in the Azure portal:

Improvements to Global Search
Faster and more intuitive resource browsing
Powerful resource querying capabilities

For a detailed view of all these improvements, please visit this blog, “Key improvements to the Azure portal user experience.”

Tabbed browsing support for more portal links

We have heard your feedback that despite being a single-page application, the portal should behave like a normal website in as many cases as possible. With this month's release you can open many more of the portal's links in a new tab using standard browser mechanisms such as right-click or Ctrl/Shift + left-click. The improvement is most visible in the pages that list resources. You'll find that the links in the NAME, RESOURCE GROUP, and SUBSCRIPTION columns all support this behavior. A normal click will still result in an in-place navigation.

IaaS

Improved VMSS diagnostics and troubleshooting with boot diagnostics, serial console access, and resource health

Azure Virtual Machine Scale Sets (VMSS) let you create and manage a group of load-balanced VMs. The number of VM instances can automatically increase or decrease in response to demand or a defined schedule. Scale sets provide high availability to your applications, and allow you to centrally manage, configure, and update a large number of VMs.

You can now manage and access additional diagnostic tools for your VMSS instances via the portal:

Boot diagnostics: access console output and screenshot support for Azure Virtual Machines.
Serial console: this serial connection connects to the COM1 serial port of the virtual machine, providing access independent of the virtual machine's network or operating system state.
Resource health: resource health informs you about the current and past health of your resources, including times your resources were unavailable in the past because of Azure service problems.

Serial console

To try out these tools, take the following steps:

Navigate to an existing Virtual Machine Scale Set instance.
In the left navigation menu, you'll find the Boot Diagnostics tab in the Support + troubleshooting section. Ensure that Boot diagnostics is enabled for the scale set (you'll need to create or select a storage account to hold the diagnostic logs).
If your scale set is set to automatic or rolling upgrade mode, each instance will be updated to receive the latest scale set model. If your scale set is set to manual upgrade mode, you will have to manually update instances from the VMSS > Instances blade.

Once each instance has received the latest model, boot diagnostics and serial console will be available for you.

Updated VM computer name and hostname display

The Azure naming convention documentation reminds you that Azure virtual machines have two names:

Virtual machine resource name: this is the Azure identifier for the virtual machine resource. It is the name you use to reference the virtual machine in any Azure automation. It cannot be changed.
Computer hostname: the runtime computer name of the in-guest operating system. The computer name can be changed at will.

If you create a VM using the Azure portal, for simplicity we use the same name for both the virtual machine resource name and the computer hostname. You could always log into the VM and change the hostname; however, the portal only showed the virtual machine resource name. With this change, the portal now exposes both the virtual machine name and the computer hostname in the VM overview blade. We also added more detailed operating system version info. These properties are visible for running virtual machines that have a healthy running VMAgent installed.

The resource name and guest computer hostname

New full-screen create experience for Azure Container Instances

The Azure Container Instances creation experience in the portal has been completely redone, moving it to the new create style with convenient tabs and a simplified flow. Specific improvements to adding environment variables and specifying container sizes (including support for GPU cores) were also included.

ACI now uses the same create pattern as other services

To try out the new create experience: 

Go to the "+ Create a resource" button in the top-left of the portal.
Choose the "Containers" category, and then choose "Container Instances".

New integrations for Azure Kubernetes Service

From an Azure Kubernetes Service cluster in the portal you can now add integrations with other Azure services including Dev Spaces, deployment center from Azure DevOps, and Policies. With the enhanced debugging capabilities offered by Dev Spaces, the robust deployment pipeline offered through the deployment center, and the increased control over containers offered by policies, setting up powerful tools for managing and maintaining Kubernetes clusters in Azure is now even easier.

New integrations now available

To try out the new integrations:

Go to the overview for any Azure Kubernetes Service cluster.
Look for the following new menu items on the left:

Dev Spaces
Deployment center (preview)
Policies (preview)

Multiple node pools for Azure Kubernetes Service (preview)

Multiple node pools for Azure Kubernetes Service are now shown in the Azure portal for any clusters in the preview. New node pools can be added to the cluster and existing node pools can be removed, allowing for clusters with mixed VM sizes and even mixed operating systems. Find more details on the new multiple node pool functionality.

Node pools blade

Add a node pool

To try out multiple node pools: 

If you are not already participating, please visit the multiple node pools preview to learn more about multiple node pools.
If you already have a cluster with multiple node pools, look for the new 'Node pools (preview)' option in the left menu for your cluster in the portal.

Storage

Azure Storage Data Transfer

Azure has numerous data transfer offerings, catering to different needs, to help users transfer data to a storage account. The new Data Transfer feature presents the recommended solutions depending on the available network bandwidth in your environment, the size of the data you intend to transfer, and the frequency at which you transfer. For each solution, a description, estimated time to transfer, and best use case are shown.

Data Transfer

To try out Azure Storage Data Transfer:

Select a Storage Account
Click on the "Data transfer" ToC menu item on the left-hand side
Select an item in the drop-down for each of three fields:

Estimate data size for transfer
Approximate available network bandwidth
Transfer frequency

For more in-depth information, check out the documentation.
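As a rough guide to how data size and bandwidth relate to transfer time, the basic arithmetic can be sketched as follows. This is not the portal's estimation logic, which is not described here. The function name, the decimal unit conventions, and the `efficiency` factor are assumptions made for illustration.

```python
def transfer_time_hours(
    data_size_gb: float,
    bandwidth_mbps: float,
    efficiency: float = 1.0,  # hypothetical fraction of the link actually usable
) -> float:
    """Estimate hours needed to move `data_size_gb` over a `bandwidth_mbps` link.

    Uses decimal units: 1 GB = 10**9 bytes, 1 Mbps = 10**6 bits per second.
    """
    if data_size_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, bandwidth, or efficiency")
    bits = data_size_gb * 8 * 1000 ** 3
    seconds = bits / (bandwidth_mbps * 1000 ** 2 * efficiency)
    return seconds / 3600
```

For example, `transfer_time_hours(1000, 100)` estimates a little over 22 hours to move 1 TB over a fully used 100 Mbps link. This kind of figure shows why, for large data sets or slow links, an offline transfer option may be the better fit.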

Management tools

View change history in Activity Log

The Activity Log shows you what changes happened to a resource during an event. Now you can view this information with Change history in preview.

For more details visit the blog, “Key improvements to the Azure portal user experience” and scroll to the “View change tracking in Activity Log” section.

Create your first cloud project with confidence

Azure Quickstart Center now generally available

The Azure Quickstart Center is a new experience to help you create and deploy your first cloud projects with confidence. We launched it as a preview at Microsoft Build 2018 and are now proud to announce it is generally available.

For more details, including the updated design, please visit the blog, “Key improvements to the Azure portal user experience” and scroll to the “Take your first steps with Azure Quickstart Center” section.

Security Center

Changing a VM group membership on adaptive application controls

Users can now move a VM from one group to another, and by doing that, the application control policy applied to it will change according to the settings of that group. Up to now, after a VM was configured within a specific group, it could not be reassigned. VMs can now also be moved from a configured group to a non-configured group, which will result in removing any application control policy that was previously applied to the VM. For more information, see Adaptive application controls in Azure Security Center.

Advanced Threat Protection for Azure Storage now generally available

Advanced Threat Protection (ATP) for Azure Storage provides an additional layer of security intelligence that detects unusual and potentially harmful attempts to access or exploit storage accounts. This layer of protection allows you to detect and address potential threats to your storage accounts as they occur, without needing to be an expert in security. To learn more, see Advanced Threat Protection for Azure Storage or read about the ATP for Storage price on the Azure Security Center pricing page.

Virtual machine scale set support now generally available

Azure Security Center now identifies virtual machine scale sets and provides recommendations for scale sets. For more information, see virtual machine scale sets.

Adaptive network hardening now in public preview

One of the biggest attack surfaces for workloads running in the public cloud is connections to and from the public internet. Our customers find it hard to know which Network Security Group (NSG) rules should be in place to make sure that Azure workloads are only available to required source ranges. With this feature, Security Center learns the network traffic and connectivity patterns of Azure workloads and provides NSG rule recommendations for internet-facing virtual machines. This helps our customers better configure their network access policies and limit their exposure to attacks.

For more information about network hardening, see Adaptive Network Hardening in Azure Security Center.

Regulatory Compliance Dashboard is now generally available

The Regulatory Compliance Dashboard in Security Center helps you streamline your compliance process by providing insights into your compliance posture for a set of supported standards and regulations.

The compliance dashboard surfaces security assessments and recommendations as you align to specific compliance requirements, based on continuous assessments of your Azure and hybrid workloads. The dashboard also provides actionable information for how to act on recommendations and reduce risk factors in your environment, to improve your overall compliance posture. The dashboard is now generally available for Security Center Standard tier customers. For more information, see Improve your regulatory compliance.

Azure Site Recovery feature updates

Add a disk to an already replicated Azure VM

Azure Site Recovery for IaaS VMs now supports the addition of new disks to an already replicated Azure virtual machine.

Adding new disks

To try out this feature:

Select any virtual machine that is protected using ASR.
Add a new disk to this virtual machine.
Navigate to the Recovery Services vault, where you will see a warning about the replication health of this virtual machine.
Click on this VM, navigate to Disks, click on the unprotected disk, and then select Enable Replication.
Refer to the documentation for more details.

Enhancements to Process Server monitoring

Azure Site Recovery has enhanced the health monitoring of your workloads on VMware or physical servers by introducing various health signals on the replication component, Process Server. Notifications are raised on multiple parameters of Process Server: free space utilization, memory usage, CPU utilization, and achieved throughput.

Enhancements to Process Server monitoring

For more details refer to this blog, “Monitoring enhancements for VMware and physical workloads protected with Azure Site Recovery.”

The new Process Server alerts for VMware and physical workloads also help when enabling new protections with Azure Site Recovery, and with load balancing of Process Servers. The signals become more valuable as the scale of the workloads grows. This guidance helps ensure that an appropriate number of virtual machines is connected to each Process Server, so that related issues can be avoided.

 

New alerts

To try out the new alerts:

Start the enable replication workflow for a Physical or a VMware machine.
At the time of source selection, choose the Process Server from the dropdown list.
The health of each Process Server is displayed next to it. A warning health status raises a warning to discourage the selection, while a critical health status blocks the Process Server from being selected.

Dynamic Non-Azure groups for Azure Update Management public preview

Non-Azure group targeting for Azure Update Management is now available in public preview. This feature supports dynamic targeting of patch deployments to non-Azure machines based on Log Analytics saved searches.

This feature enables dynamic resolution of the target machines for an update deployment based on saved searches. After the deployment is created, any new machines added to update management that meet the search criteria will be automatically picked up and patched in the next deployment run without requiring the user to modify the update deployment itself.

Dynamic non-Azure groups

To try out this feature:

Deploy Azure Update Management and add 1 or more non-Azure machines to be managed by the service.
Create a saved search that targets your non-Azure machines.
Create a new periodic Update Deployment in Azure Update Management.

For target machines, select Groups to Update and choose your saved search from the Non-Azure (preview) tab.

Complete your Update Deployment.
When new machines are added to update management that match the saved search, they will be picked up by this deployment.

To learn more about Azure Update Management and creating saved searches, see the documentation.
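The idea of resolving targets dynamically at each deployment run, rather than fixing a machine list when the deployment is created, can be sketched as follows. This is a conceptual illustration only. Real saved searches are Log Analytics queries, which are modeled here as a plain Python predicate. The `UpdateDeployment` class and the machine fields are hypothetical.

```python
from typing import Callable, Dict, List

Machine = Dict[str, str]


class UpdateDeployment:
    """Stores a search rather than a machine list, so targets are resolved per run."""

    def __init__(self, saved_search: Callable[[Machine], bool]):
        self.saved_search = saved_search  # stand-in for a saved search

    def resolve_targets(self, inventory: List[Machine]) -> List[str]:
        """Evaluate the search against the inventory as it exists at run time."""
        return [m["name"] for m in inventory if self.saved_search(m)]
```

Because the deployment holds the search and not its results, a machine added to the inventory after the deployment was created is included in the next run without editing the deployment.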

Intune

Updates to Microsoft Intune

The Microsoft Intune team has been hard at work on updates as well. You can find the full list of updates to Intune on the What's new in Microsoft Intune page, including changes that affect your experience using Intune.

Azure portal “how to” video series

Have you checked out our Azure portal “how to” video series yet? The videos highlight specific aspects of the portal so you can be more efficient and productive while deploying your cloud workloads from the portal. Recent videos include a demonstration of how to create a storage account and upload a blob and how to create an Azure Kubernetes Service cluster in the portal. Keep checking our playlist on YouTube for a new video each week.

Next steps

The Azure portal’s large team of engineers always wants to hear from you, so please keep providing us with your feedback in the comments section below or on Twitter @AzurePortal.

Don’t forget to sign in to the Azure portal and download the Azure mobile app today to see everything that’s new. See you next month!
Source: Azure

A Cosmonaut’s guide to the latest Azure Cosmos DB announcements

At Microsoft Build 2019 we announced exciting new capabilities, including the introduction of real-time operational analytics using new built-in support for Apache Spark and a new Jupyter notebook experience for all Azure Cosmos DB APIs. We believe these capabilities will help our customers easily build globally distributed apps at Cosmos scale.

Here are additional enhancements to the developer experience, announced at Microsoft Build:

Powering Kubernetes with etcd API

Etcd is at the heart of the Kubernetes cluster – it’s where all of the state is! We are happy to announce a preview of a wire-protocol compatible etcd API that lets self-managed Kubernetes developers focus on their apps rather than on managing etcd clusters. With the wire-protocol compatible Azure Cosmos DB API for etcd, Kubernetes developers automatically get highly scalable, globally distributed, and highly available Kubernetes clusters. Developers can scale Kubernetes coordination and state management data on a fully managed service with 99.999 percent high availability and elastic scalability backed by Azure Cosmos DB SLAs. This significantly lowers total cost of ownership (TCO) and removes the hassle and complexity of managing etcd clusters.

To get started, set up AKS Engine with Azure Cosmos DB API for etcd. You can also learn more and sign up for the preview.

Deepening our multi-model capabilities

The multi-model capabilities of Azure Cosmos DB’s database engine are foundational and bring important benefits to our customers, such as leveraging multiple data models in the same apps, streamlining development by focusing on a single service, reducing TCO by not having multiple database engines to manage, and getting the benefits of the comprehensive SLAs offered by Azure Cosmos DB.

Over the past two years, we have been steadily revamping our database engine’s type system and the storage encodings for both the Azure Cosmos DB database log and the index. The database engine’s type system is fully extensible and is now a complete superset of the native type systems of Apache Cassandra, MongoDB, Apache Gremlin, and SQL. The new encoding scheme for the database log is highly optimized for storage and parsing, and is capable of efficiently translating popular formats like Parquet, protobuf, JSON, BSON, and other encodings. The newly revamped index layout provides:

A significant improvement in query execution cost, especially for aggregate queries
New SQL query capabilities:

Support for OFFSET/LIMIT and DISTINCT keywords
Composite indexes for multi-column sorting
Correlated subqueries including EXISTS and ARRAY expressions

Learn more about SQL query examples and SQL language reference.
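As a rough illustration of the query capabilities listed above, the sketch below collects a few example queries and an indexing-policy fragment as plain Python data. The item shape (`city`, `name`, `age`, `tags`) is hypothetical, and the syntax follows the SQL API dialect as we understand it; treat it as a sketch and check the SQL language reference before relying on it.

```python
import json

# Hypothetical item shape: {"id", "city", "name", "age", "tags": [{"name": ...}]}
queries = {
    # OFFSET/LIMIT for paging through sorted results
    "paging": "SELECT c.id, c.name FROM c ORDER BY c.name OFFSET 10 LIMIT 5",
    # DISTINCT to de-duplicate values
    "distinct": "SELECT DISTINCT VALUE c.city FROM c",
    # Correlated subquery with EXISTS over an array property
    "exists": (
        "SELECT c.id FROM c "
        "WHERE EXISTS (SELECT VALUE t FROM t IN c.tags WHERE t.name = 'premium')"
    ),
    # Correlated subquery with an ARRAY expression
    "array": "SELECT c.id, ARRAY(SELECT VALUE t.name FROM t IN c.tags) AS tagNames FROM c",
    # Multi-property sort, which needs a matching composite index
    "multi_sort": "SELECT * FROM c ORDER BY c.name ASC, c.age DESC",
}

# Indexing-policy fragment declaring a composite index for the multi-property sort.
composite_index_policy = {
    "compositeIndexes": [
        [
            {"path": "/name", "order": "ascending"},
            {"path": "/age", "order": "descending"},
        ]
    ]
}

for label, text in queries.items():
    print(f"{label}: {text}")
print(json.dumps(composite_index_policy, indent=2))
```

The order of paths and sort directions in the composite index is meant to mirror the ORDER BY clause it serves.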

The type system and storage encodings have provided benefits to a plethora of Gremlin, MongoDB, and Cassandra (CQL) features. We are now near full compatibility with Cassandra CQL v4, and are bringing native change feed capabilities as an extension command in CQL. Customers can build efficient event sourcing patterns on top of Cassandra tables in Azure Cosmos DB. We are also announcing several Gremlin API enhancements, including support for the Execution Profile function for performance evaluation and string comparison functions aligned with the Apache TinkerPop specification.

To learn more, visit our documentation for Gremlin API Execution Profile and Azure Cosmos DB Gremlin API supported features.

SDK updates

General availability of Azure Cosmos DB .NET V3 SDK

Fully open-sourced, .NET Standard 2.0 compatible
~30 percent performance improvements including the new streaming API
More intuitive, idiomatic programming model with developer-friendly APIs
New change feed pull and push programming models

We will make .NET SDK V3 generally available later this month and recommend existing apps upgrade to take advantage of the latest improvements.

New and improved Azure Cosmos DB Java V3 SDK

New, reactor-based async programming model
Added support for Azure Cosmos DB direct HTTPS and TCP transport protocols, increasing performance and availability
All new query improvements of V3 SDKs

Java V3 SDK is fully open-sourced, and we welcome your contributions. We will make Java V3 SDK generally available shortly.

Change feed processor for Java

One of the most popular features in Azure Cosmos DB, change feed allows customers to programmatically observe changes to their data in Cosmos containers. It is used in many application patterns, including reactive programming, analytics, event store, and serverless. We’re excited to announce the change feed processor library for Java, allowing you to build distributed microservices architectures on top of change feed, and dynamically scale them using one of the most popular programming languages.

General availability of the cross-platform Table .NET Standard SDK

The 1.0.1 GA version of the cross-platform Table .NET Standard SDK has just come out. It is a single unified cross-platform SDK for both Azure Cosmos DB Table API and Azure Storage Table Service. Our customers can now operate against the Table service, either as a Cosmos Table or as an Azure Storage Table, using a .NET Framework app on Windows or a .NET Core app on multiple platforms. We’ve improved the development experience by removing unnecessary binary dependencies while retaining the improvements when invoking Table API via the REST protocols, such as using modern HttpClient, DelegatingHandler-based extensibility, and modern asynchronous patterns. It can also be used by the cross-platform Azure PowerShell to continue to power the Table API cmdlets.

More cosmic developer goodness

ARM support for databases, containers, and other resources in Azure Resource Manager

Azure Cosmos DB now provides support for Databases, Containers and Offers in Azure Resource Manager. Users can now provision databases and containers, and set throughput using Azure Resource Manager templates or PowerShell. This support is available across all APIs including SQL (Core), MongoDB, Cassandra, Gremlin, and Table. This capability also allows customers to create custom RBAC roles to create, delete, or modify the settings on databases and containers in Azure Cosmos DB. To learn more and to get started, see Azure Cosmos DB Azure Resource Manager templates.

Azure Cosmos DB custom roles and policies

Azure Cosmos DB provides support for custom roles and policies. Today, we announce the general availability of an Azure Cosmos DB Operator role. This role provides the ability to manage Azure Resource Manager resources for Azure Cosmos DB without providing data access. This role is intended for scenarios where customers need the ability to grant access to Azure Active Directory Service Principals to manage deployment operations for Azure Cosmos DB, including the account, databases, and containers. To learn more, visit our documentation on Azure Cosmos DB custom roles and policies support.

Upgrade single-region writes Cosmos accounts to multi-region writes

One of the most frequent customer asks has been the ability to upgrade existing Cosmos accounts configured with a single writable region (single-master) to multiple writable regions (multi-master). We are happy to announce that starting today, you will be able to make your existing accounts writable from all regions. You can do so using the Azure portal or Azure CLI. The upgrade is completely seamless and is performed without any downtime. To learn more about how to perform this upgrade, visit our documentation.

Automatic upgrade of fixed containers to unlimited containers

All existing fixed Azure Cosmos containers (collections, tables, graphs) in the Azure Cosmos DB service are now automatically upgraded to enjoy unlimited scale and storage. Please refer to this documentation for an in-depth overview of how to scale your existing fixed containers to unlimited containers.

Azure Cosmos Explorer now with Azure AD support

Enjoy a flexible Cosmos Explorer experience to work with data within the Azure portal, as part of the Azure Cosmos DB emulator and Azure Storage Explorer. We’ve also made it available “full-screen”, for when developers do not have access to the Azure portal or need a full screen experience. Today, we are adding support for Azure Active Directory to https://cosmos.azure.com, so that developers can authenticate directly with their Azure credentials, and take advantage of the full screen experience.

Azure portal and tools enhancements

To help customers correctly provision capacity for apps and optimize costs on Azure Cosmos DB, we have added built-in cost recommendations to Azure portal and Azure Advisor, along with updates to the Azure pricing calculator.

We look forward to seeing what you will build with Azure Cosmos DB!

Have questions? Email us at AskCosmosDB@microsoft.com any time.
Try out Azure Cosmos DB for free. (No credit card required)
For the latest Azure Cosmos DB news and features, stay up-to-date by following us on Twitter #CosmosDB, @AzureCosmosDB.

 

Azure Cosmos DB

Azure Cosmos DB is Microsoft's globally distributed, multi-model database service for mission-critical workloads. Azure Cosmos DB provides turnkey global distribution with unlimited endpoint scalability, elastic scaling of throughput at multiple granularities (e.g., database/key-space as well as tables/collections/graphs) and of storage worldwide, single-digit millisecond read and write latencies at the 99th percentile, five well-defined consistency models, and guaranteed high availability, all backed by industry-leading comprehensive SLAs.


Azure Firewall and network virtual appliances

Network security solutions can be delivered as appliances on premises, as network virtual appliances (NVAs) that run in the cloud or as a cloud native offering (known as firewall-as-a-service).

Customers often ask us how Azure Firewall is different from Network Virtual Appliances, whether it can coexist with these solutions, where it excels, what’s missing, and the TCO benefits expected. We answer these questions in this blog post.

Network virtual appliances (NVAs)

Third-party networking offerings play a critical role in Azure, allowing you to use brands and solutions you already know, trust, and have the skills to manage. Most third-party networking offerings are delivered as NVAs today and provide a diverse set of capabilities such as firewalls, WAN optimizers, application delivery controllers, routers, load balancers, proxies, and more. These third-party capabilities enable many hybrid solutions and are generally available through the Azure Marketplace. For best practices to consider before deploying an NVA, see Best practices to consider before deploying a network virtual appliance.

Cloud native network security

A cloud native network security service (known as firewall-as-a-service) is highly available by design. It auto scales with usage, and you pay as you use it. Support is included at some level, and it has a published and committed SLA. It fits into a DevOps model for deployment and uses cloud native monitoring tools.

What is Azure Firewall?

Azure Firewall is a cloud native network security service. It offers fully stateful network and application level traffic filtering for VNet resources, with built-in high availability and cloud scalability delivered as a service. You can protect your VNets by filtering outbound, inbound, spoke-to-spoke, VPN, and ExpressRoute traffic. Connectivity policy enforcement is supported across multiple VNets and Azure subscriptions. You can use Azure Monitor to centrally log all events. You can archive the logs to a storage account, stream events to your Event Hub, or send them to Log Analytics or the security information and event management (SIEM) product of your choice.

Is Azure Firewall a good fit for your organization’s security architecture?

Organizations have diverse security needs. In certain cases, even the same organization may have different security requirements for different environments. As mentioned above, third-party offerings play a critical role in Azure. Today, most next-generation firewalls are offered as network virtual appliances (NVAs), and they provide a richer next-generation firewall feature set, which is a must-have for specific environments and organizations. In the future, we intend to enable chaining scenarios to allow you to use Azure Firewall for specific traffic types, with an option to send all or some traffic to a third-party offering for further inspection. This third-party offering can be either an NVA or a cloud native solution.

Many Azure customers find the Azure Firewall feature set is a good fit and it provides some key advantages as a cloud native managed service:

DevOps integration – easily deployed using Azure Portal, Templates, PowerShell, CLI, or REST.
Built in HA with cloud scale.
Zero maintenance service model – no updates or upgrades.
Azure specialization – for example, service tags and FQDN tags.
Significant total cost of ownership saving for most customers.

But for some customers third party solutions are a better fit.

The following table provides a high-level feature comparison for Azure Firewall vs. NVAs:

Figure 1: Azure Firewall versus Network Virtual Appliances – Feature comparison

Why Azure Firewall is cost effective

Azure Firewall pricing includes a fixed hourly cost ($1.25/firewall/hour) and a variable cost per GB processed to support auto scaling. Based on our observation, most customers save 30 to 50 percent in comparison to an NVA deployment model. We are announcing a price reduction, effective May 1, 2019, for the firewall per GB cost to $0.016/GB (-46.6 percent) to ensure that high throughput customers maintain cost effectiveness. There is no change to the fixed hourly cost. For the most up-to-date pricing information, please go to the Azure Firewall pricing page.
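To make the pricing model concrete, here is a minimal sketch that combines the fixed and variable parts into a monthly estimate. The two rates come from this post; the 730-hour month and the function name `firewall_monthly_cost` are assumptions for illustration, and current prices may differ from the ones quoted here.

```python
FIXED_USD_PER_HOUR = 1.25      # fixed cost per firewall per hour (from this post)
USD_PER_GB_PROCESSED = 0.016   # per-GB rate effective May 1, 2019 (from this post)
HOURS_PER_MONTH = 730          # assumption: average number of hours in a month


def firewall_monthly_cost(gb_processed: float, firewalls: int = 1,
                          hours: float = HOURS_PER_MONTH) -> float:
    """Estimate monthly Azure Firewall cost in USD: fixed hourly part plus per-GB part."""
    fixed = FIXED_USD_PER_HOUR * hours * firewalls
    variable = USD_PER_GB_PROCESSED * gb_processed
    return round(fixed + variable, 2)


for gb in (1_000, 10_000, 100_000):
    print(f"{gb:>7} GB/month -> ${firewall_monthly_cost(gb):,.2f}")
```

Under these assumptions the fixed part is $912.50 per firewall per month, so the per-GB charge only starts to dominate at roughly 57 TB of processed traffic per month.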

The following table provides a conceptual TCO view for an NVA with full HA (active/active) deployment:

| Cost | Azure Firewall | NVAs |
| --- | --- | --- |
| Compute | $1.25/firewall/hour; $0.016/GB processed (30%-50% cost saving) | Two or more VMs to meet peak requirements |
| Licensing | (no separate charge listed) | Per NVA vendor billing model |
| Standard Public Load Balancer | (no separate charge listed) | First five rules: $0.025/hour; additional rules: $0.01/rule/hour; $0.005 per GB processed |
| Standard Internal Load Balancer | (no separate charge listed) | First five rules: $0.025/hour; additional rules: $0.01/rule/hour; $0.005 per GB processed |
| Ongoing/Maintenance | Included | Customer responsibility |
| Support | Included in your Azure Support plan | Per NVA vendor billing model |

Figure 2: Azure Firewall versus Network Virtual Appliances – Cost comparison
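The load balancer rows in the cost comparison can be turned into a small estimate as well. The sketch below assumes that "first five rules" is a flat hourly charge covering up to five rules, and uses a 730-hour month; the function name and the rule and traffic figures are hypothetical. VM compute and NVA licensing are vendor-specific and deliberately left out.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def standard_lb_monthly_cost(rules: int, gb_processed: float,
                             hours: float = HOURS_PER_MONTH) -> float:
    """Estimate the monthly cost in USD of one Standard Load Balancer."""
    if rules <= 0:
        return 0.0
    first_five = 0.025 * hours                      # flat charge for up to five rules
    additional = 0.01 * max(rules - 5, 0) * hours   # each rule beyond the fifth
    data = 0.005 * gb_processed                     # per GB processed
    return round(first_five + additional + data, 2)


# An active/active NVA deployment typically sits behind both a public and an internal LB.
public_lb = standard_lb_monthly_cost(rules=7, gb_processed=10_000)
internal_lb = standard_lb_monthly_cost(rules=3, gb_processed=10_000)
print(f"Public LB:   ${public_lb:,.2f}")
print(f"Internal LB: ${internal_lb:,.2f}")
```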

Next steps

Azure Firewall Documentation
March blog: Announcing new capabilities in Azure Firewall
Pricing
Azure Firewall management partners:

AlgoSec 
Barracuda
Tufin


Howden: How they built a knowledge mining solution with Azure Search

Customers across industries including healthcare, legal, media, and manufacturing are looking for new solutions to solve business challenges with AI, including knowledge mining with Azure Search.

Azure Search enables developers to quickly apply AI across their content to unlock untapped information.  Custom or prebuilt cognitive skills like facial recognition, key phrase extraction, and sentiment analysis can be applied to content using the cognitive search capability to extract knowledge that’s then organized within a search index. Let’s take a closer look at how one company, Howden, applies the cognitive search capability to reduce time and risk to their business.

Howden, a global engineering company, focuses on providing quality solutions for air and gas handling. With over a century of engineering experience, Howden creates industrial products that help multiple sectors improve their everyday processes, from mine ventilation and waste water treatment to heating and cooling.

Too many details, not enough time

Every new project requires the creation of a bid proposal. A typical customer bid can span thousands of pages in differing formats such as Word and PDF.  The team has to scour through detailed customer requirements to identify key areas of design and specialized components in order to produce accurate bids.  If they miss key or critical details, they can bid too low and lose money, or bid too high and lose the customer opportunity.  The manual process is time consuming, labor intensive, and creates multiple opportunities for human error. To learn more about knowledge mining with Azure Search and see how Howden built their solution, check out the Microsoft Mechanics show linked below.

Learn more

Leverage the solution accelerator to build your own application
Learn more about Azure Search


Premium files redefine limits for Azure Files

Premium files sets a new scale and performance bar for Azure Files, providing more power to developers and IT pros.

Today, we are excited to share that Azure Premium Files preview is now available to everyone! Premium files is a new performance tier that unlocks the next level of performance for fully managed file services in the cloud. Premium tier is optimized to deliver consistent performance for IO-intensive workloads that require high throughput and low latency. Premium shares store data on the latest solid-state drives (SSDs), making them suitable for a wide variety of workloads like file services, databases, shared cache storage, home directories, content and collaboration repositories, persistent storage for containers, media and analytics, highly variable and batch workloads, and many more. Our standard tier continues to provide reliable performance to workloads that are less sensitive to performance variability and is well-suited for general purpose file storage, development/test, and application workloads.

Provisioned performance – Dynamically scalable and consistent

With premium files, you can customize the performance of file storage to fit your workload needs. Premium file shares allow you to dynamically scale premium shares up and down without any downtime. The premium shares’ IOPS and throughput instantly scale based on changes to your provisioned capacity, while still offering low and consistent latency.

Defining premium shares performance:

Baseline IOPS = 1 * provisioned GiB (Up to a max of 100,000 IOPS).

Burst IOPS = 3 * provisioned GiB (Up to a max of 100,000 IOPS).

egress rate = 60 MiB/s + 0.06 * provisioned GiB

ingress rate = 40 MiB/s + 0.04 * provisioned GiB

Example: A 10 TiB provisioned share gets roughly 10K baseline IOPS, up to 30K burst IOPS, 675 MiB/s egress, and 450 MiB/s ingress. Please note that IOPS and egress/ingress rates can vary based on access patterns and IO sizes, and that shares reach peak performance at 100 TiB.
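The formulas above are easy to check with a few lines of Python. This is a minimal sketch that applies them as stated in this post; the function name `premium_share_performance` is ours, and actual limits may change after the preview.

```python
MAX_IOPS = 100_000  # cap stated above for both baseline and burst IOPS


def premium_share_performance(provisioned_gib: int) -> dict:
    """Apply the formulas from this post to a provisioned share size in GiB."""
    return {
        "baseline_iops": min(1 * provisioned_gib, MAX_IOPS),
        "burst_iops": min(3 * provisioned_gib, MAX_IOPS),
        "egress_mib_s": 60 + 0.06 * provisioned_gib,
        "ingress_mib_s": 40 + 0.04 * provisioned_gib,
    }


ten_tib = premium_share_performance(10 * 1024)       # 10 TiB = 10,240 GiB
hundred_tib = premium_share_performance(100 * 1024)  # 100 TiB = 102,400 GiB
print(ten_tib)
print(hundred_tib)
```

For 10 TiB this gives 10,240 baseline IOPS, 30,720 burst IOPS, about 674 MiB/s egress, and about 450 MiB/s ingress, which matches the rounded example. At 100 TiB both IOPS values hit the 100,000 cap, and egress and ingress come out at roughly 6 GiB/s and 4 GiB/s.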

So, how fast can it get? Let’s take a look at latency.

The above sample test results are based on internal testing performed with 8 KiB IO size reads and writes on a single virtual machine, Standard F16s_v2, connected over server message block (SMB) to a premium share. Our tests revealed that premium shares provide low and consistent latency for reads and writes. This means between two and three milliseconds for small IO sizes of less than 64 KiB, even with varying numbers of parallel threads (up to 10).

Premium shares offer performance with scale. They can massively scale up to 100K IOPS with a target egress rate of 6 GiB/s and ingress rate of 4 GiB/s for 100 TiB shares. To feed throughput-hungry workloads, we raised the bar for premium share throughput even higher. Now, you can get double the total throughput from when we first introduced premium files. In essence, you can get 100 times the IOPS and a total throughput of 10 GiB/s, which is an improvement of 170 times when compared to our current standard files offering.

What about workloads with variable access patterns? Frequently, applications have short peaks of intense IO, with a more predictable IO pattern most of the time. For these scenarios, premium files offers the best out-of-box experience. All premium shares start with full burst credit and a minimum total throughput of 100 MiB/s and with an ability to operate in burst mode.

Let’s look at how the burst mode works. Any unused baseline IOs are accrued in the burst credit bucket. Shares can burst up to three times their baseline IOPS if enough IO credits have accrued. On a best-effort basis, all shares can burst up to 3 IOPS per provisioned GiB for up to 60 minutes, and shares larger than 50 TiB can burst for longer than 60 minutes. For more details, please refer to our documentation on bursting.
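The burst behavior can be approximated with a simple credit-bucket simulation. This is a sketch, not the service's actual algorithm: it assumes one-second steps and a bucket sized at twice the baseline IOPS for 3,600 seconds, which is what lets an uncapped share run at three times baseline for 60 minutes. The function name `simulate_burst` is ours.

```python
from typing import List

MAX_IOPS = 100_000


def simulate_burst(provisioned_gib: int, demand_iops: List[int]) -> List[int]:
    """Return the IOPS served in each one-second step under a simple credit-bucket model."""
    baseline = min(provisioned_gib, MAX_IOPS)
    burst = min(3 * provisioned_gib, MAX_IOPS)
    # Assumption: the bucket holds 2 x baseline IOs for 3,600 seconds.
    capacity = 2 * baseline * 3600
    credits = capacity  # shares start with full burst credit
    served = []
    for demand in demand_iops:
        if demand <= baseline:
            # Unused baseline IOs accrue as credit, up to the bucket capacity.
            credits = min(capacity, credits + (baseline - demand))
            served.append(demand)
        else:
            # Spend credits for the portion above baseline, limited by the burst cap.
            extra = min(min(demand, burst) - baseline, credits)
            credits -= extra
            served.append(baseline + extra)
    return served


# 1,000 GiB share: an hour of heavy demand, one idle second, then a smaller spike.
served = simulate_burst(1000, [5000] * 3600 + [0, 3000])
print(served[0], served[3599], served[-2], served[-1])
```

In this run the share serves 3,000 IOPS for the full hour and drains its bucket; the idle second earns back 1,000 credits, so the final spike is served at 2,000 IOPS rather than 3,000.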

Pricing – Simple and predictable cost

Premium file shares are billed based on provisioned storage, rather than used storage. You only pay for each GiB you provision, with no transaction fees or any additional cost for throughput and bursting. This makes it much simpler to determine the total cost of ownership for a premium files-based deployment. Although the cost of premium storage per GiB is higher than for standard storage, with zero transaction fees, built-in bursting capability, and the flexibility to adjust provisioning size, Premium tier can be a more cost-effective solution than standard tier for some IO-intensive workloads. Refer to the pricing page for additional details.

Availability – Broad and global

At the time of this announcement, the Azure Premium Files public preview is available in East US2, East US, West US, West US2, Central US, North Europe, West Europe, SE Asia, East Asia, Japan East, Japan West, Korea Central, and Australia East regions. We are continuing to expand service to additional Azure regions. Stay up to date on region availability through the Azure products availability page.

Getting started – Quick and easy

It takes two minutes to get started with premium files. Premium tier is offered on a dedicated storage account type, FileStorage. Simply create a new FileStorage account type in any available region and create a new share with size provisioned based on your workload performance. You can use Azure portal, PowerShell, or CLI to create premium shares and any of your favorite Azure Files client tools and/or libraries to access data. Please see detailed steps for how to create a premium file share.

Currently, the Azure portal allows creating premium shares of up to 5 TiB. A portal update for creating shares larger than 5 TiB is coming soon. Meanwhile, you can use Azure PowerShell or CLI to create shares larger than 5 TiB, or to resize shares created through the portal beyond 5 TiB.

Next steps

Visit Azure Premium Files documentation to learn more and give it a try.

As always, you can share your feedback and experiences on the Azure Storage forum or just email us at PFSFeedback@microsoft.com. Post your ideas and suggestions about Azure Storage on the Azure Storage feedback forum.

Happy sharing!

Azure SQL Database Edge: Enabling intelligent data at the edge

The world of data changes at a rapid pace, with more and more data projected to be stored and processed at the edge. Microsoft has enabled enterprises to adopt a common programming surface area in their data centers with Microsoft SQL Server and in the cloud with Azure SQL Database. Latency, data governance, and network connectivity continue to pull data compute needs toward the edge. New sensors and chip innovation with analytical capabilities at a lower cost enable more edge compute scenarios to drive higher agility for business.

At Microsoft Build 2019, we announced Azure SQL Database Edge, available in preview, to help address the requirements of data and analytics at the edge using the performant, highly available and secure SQL engine. Developers will now be able to adopt a consistent programming surface area to develop on a SQL database and run the same code on-premises, in the cloud, or at the edge.

Azure SQL Database Edge offers:

A small footprint that allows the database engine to run in containers on ARM and x64 devices, including interactive devices, edge gateways, and edge servers.
Develop-once, deploy-anywhere scenarios through a common programming surface area across Azure SQL Database, SQL Server, and Azure SQL Database Edge.
Data streaming and time-series capabilities combined with in-database machine learning to enable low-latency analytics.
Industry-leading security capabilities of Azure SQL Database to protect data at rest and in motion on edge devices and edge gateways, with management from a central management portal in Azure IoT.
Cloud-connected and fully disconnected edge scenarios with local compute and storage.
Support for existing business intelligence (BI) tools for creating powerful visualizations with Power BI and third-party BI tools.
Bi-directional data movement between the edge and on-premises or the cloud.
Compatibility with the popular T-SQL language, so developers can implement complex analytics using R, Python, Java, and Spark, delivering instant analytics without data movement and faster real-time insights.

Provides support for processing and storing graph, JSON, and time series data in the database, coupled with the ability to apply our analytics and in-database machine learning capabilities on non-relational datatypes.

For example, manufacturers that employ the use of robotics or automated work processes can achieve optimal efficiencies by using Azure SQL Database Edge for analytics and machine learning at the edge. These real-world environments can leverage in-database machine learning for immediate scoring, initiating corrective actions, and detecting anomalies.

Key benefits:

With the same programming surface area as Azure SQL Database and SQL Server, the SQL engine at the edge allows engineers to build once and deploy on-premises, in the cloud, or at the edge.
The streaming capability enables instant analysis of the incoming data for intelligent insights.
In-database AI capabilities enable scenarios like anomaly detection, predictive maintenance, and other analytical scenarios without having to move data.

Train in the cloud and score at the edge

Because the programming surface area is consistent across on-premises, the cloud, and the edge, developers can use identical methods for securing data in motion and at rest while enabling high availability and disaster recovery architectures equal to those used in Azure SQL Database and SQL Server. This seamless transition of the application between locations means a cloud data warehouse could train an algorithm, push the machine learning model to Azure SQL Database Edge, and let it run scoring locally, giving real-time scoring using a single codebase.

Intelligent store and forward

The engine can take streaming datasets and replicate them directly to the cloud, enabling an intelligent store-and-forward pattern. At the same time, the edge can use its analytical capabilities while processing streaming data, or apply in-database machine learning. Fundamentally, the engine can process data locally and upload it using native replication to a central datacenter or cloud for aggregated analysis across multiple edge hubs.
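The store-and-forward idea itself is independent of any particular engine. The sketch below shows the general pattern in plain Python: readings are always stored locally first, and forwarded in a batch only when a connection is available. It does not use Azure SQL Database Edge or its native replication; the `StoreAndForward` class, the `upload` callback, and the sample readings are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Reading = Dict[str, float]


class StoreAndForward:
    """Buffer readings locally and forward them when connectivity allows."""

    def __init__(self, upload: Callable[[List[Reading]], None], max_buffered: int = 10_000):
        self._upload = upload
        # Bounded buffer: once full, the oldest readings are dropped first.
        self._buffer: Deque[Reading] = deque(maxlen=max_buffered)

    def ingest(self, reading: Reading) -> None:
        """Store locally first, so ingestion never depends on connectivity."""
        self._buffer.append(reading)

    def flush(self, connected: bool) -> int:
        """Forward everything buffered if connected; return the number of readings sent."""
        if not connected or not self._buffer:
            return 0
        batch = list(self._buffer)
        self._upload(batch)   # if this raises, the buffer is left intact for a retry
        self._buffer.clear()
        return len(batch)


cloud: List[Reading] = []                 # stand-in for the central datacenter or cloud
edge = StoreAndForward(upload=cloud.extend)
for temperature in (20.5, 21.0, 35.2):
    edge.ingest({"temperature": temperature})

sent_offline = edge.flush(connected=False)   # nothing is sent while disconnected
sent_online = edge.flush(connected=True)     # the whole backlog goes out in one batch
print(sent_offline, sent_online, len(cloud))
```

A more "intelligent" variant could filter or aggregate readings locally before forwarding, so that only the data needed for cross-site analysis leaves the edge.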

Unlock additional insights for your data that resides at the edge. Join the Early Adopter Program to access the preview and get started building your next intelligent edge solution.

Azure.Source – Volume 82.

What a great week we had at Build 2019! We all had tremendous fun meeting developers, talking about new technologies, and sharing our vision for the future. Plus, the weather was nearly perfect, and attendees had time to see some sights and sample Seattle’s terrific restaurant scene.