Account failover now in public preview for Azure Storage

Today we are excited to announce the public preview of account failover for customers with geo-redundant storage (GRS) enabled on their storage accounts. Customers using GRS or RA-GRS accounts can take advantage of this functionality to control when to fail over from the primary region to the secondary region for their storage accounts.

Customers have told us that they want to control storage account failover themselves, so they can fail over only when write access to the storage account is required and the replication state of the secondary region is understood.

If the primary region for your geo-redundant storage account becomes unavailable for an extended period of time, you can force an account failover. When you perform a failover, all data in the storage account is failed over to the secondary region, and the secondary region becomes the new primary region. The DNS records for all storage service endpoints – blob, Azure Data Lake Storage Gen2, file, queue, and table – are updated to point to the new primary region. Once the failover is complete, clients can automatically begin writing data to the storage account using the service endpoints in the new primary region, without any code changes.

The diagram below shows how account failover works. Under normal circumstances, a client writes data to a geo-redundant storage account (GRS or RA-GRS) in the primary region, and that data is replicated asynchronously to the secondary region. If write operations to the primary region fail consistently, you can trigger a failover.

After the failover is complete, write operations can resume against the new primary service endpoints.

Post failover, the storage account is configured to be locally redundant (LRS). To resume replication to the new secondary region, configure the account to use geo-redundant storage again (either RA-GRS or GRS). Keep in mind that converting a locally redundant (LRS) account to RA-GRS or GRS incurs a cost.

Account failover is supported in preview for new and existing Azure Resource Manager storage accounts that are configured for RA-GRS and GRS. Storage accounts may be general-purpose v1 (GPv1), general-purpose v2 (GPv2), or Blob Storage accounts. Account failover is currently supported in West US 2 and West Central US.

You can initiate account failover using the Azure portal, Azure PowerShell, Azure CLI, or the Azure Storage Resource Provider API. The process is simple to perform. The image below shows how to trigger account failover in the Azure portal in one step.

As is the case with most previews, account failover should not be used with production workloads. There is no production SLA until the feature becomes generally available.

It's important to note that account failover usually results in some data loss, because geo-replication is asynchronous and the secondary endpoint typically lags behind the primary. When you initiate a failover, any data that has not yet been replicated to the secondary region is lost.

We recommend that you always check the Last Sync Time property before initiating a failover to evaluate how far the secondary is behind the primary. To understand the implications of account failover and learn more about the feature, please read the documentation, “What to do if an Azure Storage outage occurs.”
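To make the exposure concrete, the potential data-loss window is simply the gap between the current time and the Last Sync Time. The sketch below uses plain Python, not the Azure SDK, and the timestamps are hypothetical values chosen for illustration:

```python
from datetime import datetime, timezone

def data_loss_window_seconds(last_sync_time: datetime, now: datetime) -> float:
    """Writes made after last_sync_time had not yet reached the secondary,
    so they would be lost if a failover happened at `now`."""
    return (now - last_sync_time).total_seconds()

# Hypothetical values: Last Sync Time as reported by the account's
# geo-replication stats, versus the moment you consider failing over.
last_sync = datetime(2019, 2, 4, 12, 0, 0, tzinfo=timezone.utc)
decision_time = datetime(2019, 2, 4, 12, 5, 30, tzinfo=timezone.utc)

print(data_loss_window_seconds(last_sync, decision_time))  # 330.0
```

If that window is larger than your workload can tolerate, it may be worth waiting to see whether the primary region recovers before forcing the failover.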

For questions about participation in the preview or about account failover, contact xstoredr@microsoft.com. We welcome your feedback on the account failover feature and documentation!
Source: Azure

Processing trillions of events per day with Apache Kafka on Azure

This blog is co-authored by Noor Abani and Negin Raoof, Software Engineers, who jointly performed the benchmark, optimization, and performance-tuning experiments under the supervision of Nitin Kumar, Siphon team, AI Platform.

Our sincere thanks to Dhruv Goel and Uma Maheswari Anbazhagan from the HDInsight team for their collaboration.

 

Figure 1: Producer throughputs for various scenarios. 2 GBps achieved on a 10 broker Kafka cluster.

In the current era, companies generate huge volumes of data every second. Whether it is for business intelligence, user analytics, or operational intelligence, ingesting and analyzing streaming data requires moving that data from its sources to the multiple consumers that are interested in it. Apache Kafka is a distributed, replicated messaging platform that serves as a highly scalable, reliable, and fast data ingestion and streaming tool. At Microsoft, we use Apache Kafka as the main component of our near-real-time data transfer service, handling up to 30 million events per second.

In this post, we share our experience and learnings from running one of the world's largest Kafka deployments. Besides underlying infrastructure considerations, we discuss several tunable Kafka broker and client configurations that affect message throughput, latency, and durability. After running hundreds of experiments, we have standardized the Kafka configurations required to achieve maximum utilization for various production use cases, and we demonstrate how to tune a Kafka cluster for the best possible performance.

Performance has two orthogonal dimensions – throughput and latency. From our experience, customer performance requirements fall in three categories A, B and C of the diagram below. Category A customers require high throughput (~1.5 GBps) and are tolerant of higher latency (< 250 ms). One such scenario is telemetry data ingestion for near real-time processes like security and intrusion detection applications. Category B customers have very stringent latency requirements (< 10 ms) for real-time processing, such as online spelling and grammar checks. Finally, Category C customers require both high throughput and low latency (~100 ms), but can tolerate lower data reliability, like service availability monitoring applications.

The graph above shows the maximum throughput we achieved in each case. Reliability is another requirement that has a trade-off against performance. Kafka provides reliability by replicating data and providing configurable acknowledgement settings. We quantify the performance impact that comes with these guarantees.        

Our goal is to make it easier for anyone planning to run a production Kafka cluster to understand the effect of each configuration, evaluate the tradeoffs involved, tune it appropriately for their use case and get the best possible performance.

Siphon and Azure HDInsight

To build a compliant and cost-effective near-real-time publish-subscribe system that can ingest and process 3 trillion events per day from businesses like O365, Bing, Skype, SharePoint Online, and more, we created a streaming platform called Siphon. Siphon is built for internal Microsoft customers on the Azure cloud, with Apache Kafka on HDInsight as its core component. Setting up and operating a Kafka cluster yourself (purchasing the hardware, installing and tuning the software, and monitoring the cluster) is very challenging. Azure HDInsight is a managed service with a cost-effective, VM-based pricing model to provision and deploy Apache Kafka clusters on Azure. HDInsight ensures that brokers stay healthy while performing routine maintenance and patching, with a 99.9 percent SLA on Kafka uptime. It also has enterprise security features such as role-based access control and bring-your-own-key (BYOK) encryption.

Benchmark setup

Traffic generator

To stress-test our system in general and the Kafka clusters specifically, we developed an application that constantly generates batches of random-byte messages and sends them to a cluster's front end. This application spins up 100 threads, each sending 1,000 messages of 1 KB of random data to each topic at 5 ms intervals. Unless explicitly mentioned otherwise, this is the standard application configuration.
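As a simplified sketch (not the actual internal stress tool), the core of such a generator is just worker threads producing fixed-size batches of random payloads. Thread count is scaled down here, and the 5 ms send cadence is omitted, purely for illustration:

```python
import os
import queue
import threading

BATCH = 1_000      # messages per batch, as in the stress tool
MSG_SIZE = 1_024   # 1 KB of random bytes per message
THREADS = 4        # the real tool used 100 producer threads

out = queue.Queue()

def produce_batch():
    # One batch of random 1 KB payloads, as the traffic generator would send
    out.put([os.urandom(MSG_SIZE) for _ in range(BATCH)])

workers = [threading.Thread(target=produce_batch) for _ in range(THREADS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

batches = [out.get_nowait() for _ in range(THREADS)]
print(len(batches), len(batches[0]), len(batches[0][0]))  # 4 1000 1024
```

In the real tool, each thread loops indefinitely and hands its batches to the front-end senders rather than a local queue.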

Event Server setup

Event Server is used as a front-end web server that implements the Kafka producer and consumer APIs. We provision multiple Event Servers in a cluster to balance the load and manage produce requests sent from thousands of client machines to Kafka brokers. We optimized Event Server to minimize the number of TCP connections to brokers by implementing partition affinity, whereby each Event Server machine connects to a randomly selected partition's leader, and the selection is reset after a fixed time interval. Each Event Server application runs in a Docker container on scale sets of Azure Standard F8s Linux VMs, and is allocated 7 CPUs and 12 GB of memory with a maximum Java heap size of 9 GB. To handle the large amount of traffic generated by our stress tool, we run 20 instances of these Event Servers.

Event Server also uses multiple sliding queues to control the number of outstanding requests from clients. New requests are queued to one of the multiple queues in an Event Server instance and are then processed by multiple parallel Kafka producer threads; each thread instantiates one producer. The number of sliding queues is controlled by the thread pool size. When testing producer performance for different thread pool sizes, we found that adding too many threads causes processing overhead and increases Kafka request queue time and local processing time. Adding more than 5 threads doubled the Kafka send latency without significantly increasing ingress throughput, so we chose 5 Kafka producer threads per Event Server instance.

Kafka Broker hardware

We used Kafka version 1.1 for our experiments. The Kafka brokers used in our tests are Azure Standard D4 V2 Linux VMs. We used 10 brokers with 8 cores and 28 GB RAM each. We never ran into high CPU utilization with this setup. On the other hand, the number of disks had a direct effect on throughput. We initially started by attaching 10 Azure Managed Disks to each Kafka broker. By default, Managed Disks use locally redundant storage (LRS), where three copies of data are kept within a single region. This introduces another level of durability, since write requests to an LRS storage account return successfully only after the data is written to all copies. Each copy resides in separate fault domains and update domains within a storage scale unit. This means that along with a 3x replication factor Kafka configuration, we are in essence ensuring 9x replication.

Consumers and Kafka Connect setup

In our benchmark, we used Kafka Connect as the connector service to consume data from Kafka. Kafka Connect is a built-in tool for producing and consuming Kafka messages in a reliable and scalable manner. For our experiments, we ran Null sink connectors which consume messages from Kafka, discard them and then commit the offsets. This allowed us to measure both producer and consumer throughput, while eliminating any potential bottlenecks introduced by sending data to specific destinations. In this setup, we ran Kafka Connect docker containers on 20 instances of Azure Standard F8s Linux VM nodes. Each container is allocated 8 CPUs and 10 GB Memory with maximum Java heap size of 7 GB.

Results

Producer configurations

The main producer configurations that we have found to have the most impact on performance and durability are the following:

batch.size
acks
compression.type
max.request.size
linger.ms
buffer.memory

Batch size

Each Kafka producer batches records per partition, optimizing the network and I/O requests issued to a partition leader. Therefore, increasing the batch size could result in higher throughput. Under light load, this may increase Kafka send latency, since the producer waits for a batch to be ready. For these experiments, we put our producers under a heavy load of requests and did not observe any increased latency up to a batch size of 512 KB. Beyond that, throughput dropped and latency started to increase, meaning that our load was sufficient to fill 512 KB producer batches quickly enough, but producers took longer to fill larger batches. Therefore, under heavy load it is recommended to increase the batch size to improve throughput and latency.

The linger.ms setting also controls batching. It puts a ceiling on how long producers wait before sending a batch, even if the batch is not full. In low-load scenarios, this improves throughput at the cost of latency. Since we tested Kafka under continuous high throughput, we didn't benefit from this setting.

Another configuration we tuned to support larger batches was buffer.memory, which controls the amount of memory available to the producer for buffering. We increased this setting to 1 GB.
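Collected together, the batching-related settings above might look like the following producer configuration. Parameter names follow the kafka-python client's keyword style; the linger value is a placeholder, since the post does not state the exact figure used:

```python
# Producer tuning for sustained heavy load, per the experiments above.
# Names follow kafka-python's KafkaProducer keyword arguments.
producer_config = {
    "batch_size": 512 * 1024,    # 512 KB batches gave peak throughput under load
    "linger_ms": 5,              # placeholder value: cap on batch wait, mattered little here
    "buffer_memory": 1024 ** 3,  # 1 GB of producer-side buffering
}

print(producer_config["batch_size"])  # 524288
```

These values suit a continuously loaded producer; under bursty or light traffic, a smaller batch size usually gives lower latency.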

Producer required acks

The producer's required-acks configuration determines the number of acknowledgments the producer requires before a write request is considered complete. This setting affects data reliability, and it takes the values 0, 1, or -1 (i.e., "all").

To achieve highest reliability, setting acks = all guarantees that the leader waits for all in-sync replicas (ISR) to acknowledge the message. In this case, if the number of in-sync replicas is less than the configured min.insync.replicas, the request will fail. For example, with min.insync.replicas set to 1, the leader will successfully acknowledge the request if there is at least one ISR available for that partition. On the other end of the spectrum, setting acks = 0 means that the request is considered complete as soon as it is sent out by producer. Setting acks = 1 guarantees that the leader has received the message.
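The acknowledgement rules can be summarized in a few lines of logic. This is a simplified model of the behavior described above, not actual broker code:

```python
def write_accepted(acks, in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """Simplified model of when a produce request succeeds."""
    if acks == 0:
        return True                   # considered complete once sent by the producer
    if acks == 1:
        return in_sync_replicas >= 1  # the leader's own write is enough
    # acks = -1 ("all"): fail if the ISR shrank below min.insync.replicas
    return in_sync_replicas >= min_insync_replicas

print(write_accepted(-1, 1, 1))  # True: one ISR satisfies min.insync.replicas = 1
print(write_accepted(-1, 1, 2))  # False: not enough in-sync replicas
```

The last case is the one that trades availability for durability: with acks = all and a strict min.insync.replicas, writes fail fast rather than being accepted with weaker guarantees.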

For this test, we varied the configuration across these three values. The results confirm the intuitive tradeoff between reliability guarantees and latency: while acks = -1 provides stronger guarantees against data loss, it results in higher latency and lower throughput.

Compression

A Kafka producer can be configured to compress messages before sending them to brokers. The compression.type setting specifies the compression codec to be used. Supported compression codecs are “gzip,” “snappy,” and “lz4.” Compression is beneficial and should be considered if there is a limitation on disk capacity.

Among the two commonly used compression codecs, “gzip” and “snappy,” “gzip” has a higher compression ratio resulting in lower disk usage at the cost of higher CPU load, whereas “snappy” provides less compression with less CPU overhead. You can decide which codec to use based on broker disk or producer CPU limitations, as “gzip” can compress data 5 times more than “snappy.”

Note that using an old Kafka producer (the Scala client) to send to newer Kafka versions creates an incompatibility in the message format (the magic byte), which forces brokers to decompress and recompress messages before writing them. This extra operation adds latency to message delivery and CPU overhead (almost 10 percent in our case). It is recommended to use the Java producer client with newer Kafka versions.

Broker configurations

Number of disks

Storage disks have limited IOPS (input/output operations per second) and read/write bytes per second. When creating new partitions, Kafka stores each new partition on the disk with the fewest existing partitions to balance them across the available disks. Despite this, when processing hundreds of partition replicas on each disk, Kafka can easily saturate the available disk throughput.

We used Azure standard S30 HDD disks in our clusters. In our experiments, we observed 38.5 MBps throughput per disk on average with Kafka performing multiple concurrent I/O operations per disk. Note that the overall write throughput includes both Kafka ingestion and replication requests.

We tested with 10, 12, and 16 attached disks per broker to study the effect on producer throughput. The results show that throughput increases with the number of attached disks. We were limited by the number of disks that can be attached to one VM (16 disks maximum); adding more disks would therefore require additional VMs, which would increase cost. We decided to continue with 16 standard HDDs per broker in the subsequent experiments. Note that this experiment was designed specifically to observe the effect of the number of disks and did not include the other configuration tuning done to optimize throughput; hence, the throughputs mentioned in this section are lower than the values presented elsewhere in this post.
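A back-of-the-envelope check (illustrative arithmetic only, ignoring OS overhead and consumer reads) shows why 16 disks per broker lines up with the ~2 GBps headline number:

```python
per_disk_mbps = 38.5     # observed average write throughput per S30 HDD
disks_per_broker = 16
brokers = 10
replication_factor = 3   # each ingested byte is written three times

broker_write_mbps = per_disk_mbps * disks_per_broker   # 616.0 MBps per broker
cluster_write_mbps = broker_write_mbps * brokers       # 6160.0 MBps cluster-wide
ingest_ceiling_mbps = cluster_write_mbps / replication_factor

print(round(ingest_ceiling_mbps))  # 2053 MBps, consistent with ~2 GBps
```

Since the disks absorb both ingestion and replication traffic, dividing the raw write bandwidth by the replication factor gives a rough ceiling on net ingest throughput.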

Number of topics and partitions

Each Kafka partition is a log file on the system, and producer threads can write to multiple logs simultaneously.

Similarly, since each consumer thread reads messages from one partition, consuming from multiple partitions is also handled in parallel. In this experiment, we quantify the effect of partition density (i.e., the number of partitions per broker, not including replicas) on performance. Increasing the partition density adds overhead related to metadata operations and per-partition request/response traffic between the partition leader and its followers. Even in the absence of data flowing through, partition replicas still send fetch requests to leaders, which results in extra processing of send and receive requests over the network. Therefore, we increased the number of I/O, network, and replica fetcher threads to utilize the CPU more efficiently. Note that once the CPU is fully utilized, increasing the thread pool sizes may not improve throughput. You can monitor network and I/O processor idle time using Kafka metrics.

Moreover, observing Kafka metrics for request and response queue times enabled us to tune the size of Kafka thread pools. Allocating more I/O and network threads can reduce both the request and response queue wait times. Higher request local latency indicated that the disk couldn’t handle the I/O requests fast enough. The key Kafka configurations are summarized in the list below.

Kafka can handle thousands of partitions per broker. We achieved the highest throughput at 100 partitions per topic, i.e., a total of 200 partitions per broker (we have 20 topics and 10 brokers). The throughput decline exhibited for higher partition density corresponds to the high latency, which was caused by the overhead of additional I/O requests that the disks had to handle.
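For reference, the partition density quoted above follows directly from the cluster layout (20 topics, 100 partitions each, 10 brokers, and the 3x replication used in these experiments):

```python
topics = 20
partitions_per_topic = 100
brokers = 10
replication_factor = 3

# Leader partitions per broker: the "partition density" in this experiment
leaders_per_broker = topics * partitions_per_topic // brokers
# Counting replicas, each broker actually hosts three times as many copies
copies_per_broker = leaders_per_broker * replication_factor

print(leaders_per_broker, copies_per_broker)  # 200 600
```

The replica count is what the disks actually see, which is why throughput degrades well before the nominal per-broker partition limit is reached.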

Also, keep in mind that increasing partition density may cause topic unavailability: each broker must store, and act as the leader for, a higher number of partitions, and in the event of an unclean shutdown of such a broker, electing new leaders can take several seconds, significantly impacting performance.

Number of replicas

Replication is a topic level configuration to provide service reliability. In Siphon, we generally use 3x replication in our production environments to protect data in situations when up to two brokers are unavailable at the same time. However, in situations where achieving higher throughput and low latency is more critical than availability, the replication factor may be set to a lower value.

Higher replication factor results in additional requests between the partition leader and followers. Consequently, a higher replication factor consumes more disk and CPU to handle additional requests, increasing write latency and decreasing throughput.

Message size

Kafka can move large volumes of data very efficiently. However, Kafka send latency can change based on the ingress volume, in terms of both the number of queries per second (QPS) and the message size. To study the effect of message size, we tested message sizes from 1 KB to 1.5 MB while keeping the total load constant. We observed a constant throughput of ~1.5 GBps and latency of ~150 ms irrespective of the message size. For messages larger than 1.5 MB, this behavior might change.

Conclusion

There are hundreds of Kafka configurations that can be tuned across producers, brokers, and consumers. In this blog, we pinpointed the key configurations that we have found to have an impact on performance, and we showed the effect of tuning these parameters on metrics such as throughput, latency, and CPU utilization. With appropriate settings for partition density, buffer size, and network and I/O threads, we achieved around 2 GBps with 10 brokers and 16 disks per broker. We also quantified the tradeoffs between reliability and throughput for configurations such as the replication factor and replica acknowledgements.
Source: Azure

Azure IoT drives next-wave innovation in infrastructure and energy

Many of the conveniences we enjoy today are dependent on the infrastructure cities and municipalities provide, such as water mains, streetlights, and roads. This infrastructure and its associated technology support our transportation systems, schools, hospitals, and more. A vital type of infrastructure is the electrical grid – every basic need we have is dependent on access to energy, but the way energy is managed is changing rapidly. Electric vehicles, solar roofs, battery storage, demand flexibility, and green energy are fundamentally changing grid management and driving the urgency to modernize the energy industry through innovation and technology.

Today at the DistribuTECH conference in New Orleans, Azure IoT partners are showcasing new solutions that bring the next level of “smart” to our grids. ABB, GE, EY, and Schneider Electric are demonstrating their energy solutions and platforms in their respective booths. We invited eight partners to the Microsoft booth to demonstrate their approach to modernizing infrastructure, and how Azure IoT dramatically accelerates time to results. Each partner is showing exciting new use cases for utilities, infrastructure, and cities that take advantage of cloud, AI, and IoT.

With Azure Digital Twins our partners can create digital replicas of spaces and infrastructure. New functionality in limited preview is the Twin Object Model for Grid, which enables partners to accelerate their solution development with simple APIs that create and update grid assets like substations, transformers, and distributed energy resources (DERs). Together with partners, we are working on new energy optimization and forecasting scenarios, including utilizing connected distributed energy resources to balance the distribution grid to avoid expensive infrastructure upgrades and outages, automated carbon footprint reduction, and new ways to include smart charging infrastructure for electric vehicles.

Partners Solutions at DistribuTECH

Agder Energi, Nodes, and Enfo facilitate DER participation in the energy market

Agder Energi, a Norwegian electric utility, is using Azure Digital Twins to identify ways to operate its electrical grid more efficiently through distributed energy resources, device controls, and predictive forecasting – thus avoiding costly and time consuming energy upgrades.

NODES, a new company created by Agder Energi and Nord Pool, Europe’s leading power market, connects local and central power markets into one integrated market. It is a fully automated marketplace capable of real-time trading of available flexibility, with transparent prices, open to all flexibility providers and grid operators. The NODES platform leverages the new energy optimization capability in Azure to trade local flexibility in a closed loop, in real time.

Enfo, a subsidiary of Agder Energi, has developed Flex tool, a full-service platform for energy flexibility providers, traders, aggregators, and power companies that aim to participate in the flexibility markets. The platform is running on Azure and is using Azure IoT services to forecast and optimize connecting flexible assets to the power system.

Allego adds more intelligence to e-mobility charging solutions

Allego is a leading European provider of charging solutions for electric vehicles (EVs) with significant expertise in e-mobility. The company operates over 12,000 charging points throughout Europe supporting companies and EV drivers via their cloud-based service platform.

Allego uses the new optimization capability for smart charging. New chargers maximize use of sustainably generated energy from local energy cooperatives and automatically reduce charging speeds during peak hours. Allego’s new smart charging solution is used across a network of 4,500 public charging points in 43 cities in the Netherlands. The solution will rapidly evolve to support advanced driver and operator needs and support millions of charge points across different geographies. Integration of renewable energy and energy storage into smart charging will allow charging aligned to energy production.

Eaton’s next-gen smart breaker for efficient energy management

Eaton, a global power management company, provides energy-efficient solutions to effectively manage electrical, hydraulic, and mechanical power more efficiently, safely, and sustainably. Eaton is using Azure IoT Central to enable easy application development for its industry-first Energy Management Circuit Breaker (EMCB). EMCB is a next-generation “smart breaker” that has the safety functionality of a standard circuit breaker with cloud-connectivity and on-board intelligence built in to help support grid optimization. It is a significant transformation of circuit breaker technology and offers revenue-grade branch circuit metering, communications capabilities, and remote access.

The EMCB is an entirely new kind of energy management device that offers advantages for utilities as well as consumers. It is suitable for a variety of end-use applications including Advanced Metering Infrastructure (AMI), Home Area Networks (HAN), as well as Demand Response (DR), and solar installation monitoring.

eSmart Systems’ connected drone improves asset and grid monitoring

eSmart Systems provides AI-driven software for the energy industry and service providers. Its Connected Drone software uses the Microsoft Azure platform for efficient and accurate power grid asset detection and classification from aerial images. It can analyze 180,000 images in less than an hour (that's more than a human can do in a year) to give a complete overview of inspected assets in the electric grid.

eSmart Systems is partnering with Microsoft on several projects, one of which is the “Smart Clean Energy Parking” project with the City of Fargo. It involves intelligent and efficient use of solar energy and battery storage, electric vehicle charging, and monitoring and reduction of carbon emissions. This project builds on the smart-grid digital intelligence and clean-energy thought leadership that eSmart and Microsoft bootstrapped in the context of the European EMPOWER and INVADE projects.

E.ON sets a new bar for Home Energy Management security using Azure Sphere

E.ON is one of Europe’s leading energy companies. By bringing together home energy resources in a personalized solution, E.ON Home puts a single, responsive energy management solution in the hands of its customers. Heating, lighting, and energy sources like solar panels, batteries, and electric car home charging points can all cooperate to help people define and manage their household energy profile.

E.ON is preparing for broad availability of the E.ON Home solution in 2019. To meet the highest standards of security, E.ON and Microsoft are partnering to power and secure devices across the E.ON ecosystem with Azure Sphere. Together we aim to design future-proof technology systems and to deliver E.ON customers the technology of tomorrow, today.

Itron creates state-of-the-art solutions for utilities and cities

Itron enables utilities and cities to safely, securely, and reliably deliver critical infrastructure services to communities around the globe. By combining its rich portfolio of smart networks, software, services, meters, and sensors with Azure Mixed Reality and Azure IoT services, Itron delivers a state-of-the-art solution for utilities and cities to manage energy and water.

The Itron Idea Labs team has created a virtual representation of the relationships between building materials, infrastructure, and various sensor types in a downtown Los Angeles neighborhood by leveraging Azure Digital Twins. A user can virtually install sensors, change rooftop materials, alter traffic patterns, plant trees, and experience the impact on every person, car, school, and building in the simulated environment by using the Microsoft HoloLens. Itron Idea Labs is among the first adopters of the Azure Digital Twins service, which enables developers to build repeatable, scalable experiences from digital sources and the physical world.

L&T develops smart energy solutions for buildings

L&T Group has decades of power and utilities expertise. L&T relies on Microsoft Azure and Azure IoT to deliver Smart Grid projects for several leading electric utilities including advanced analytics at utility, sub-station and meter level asset health performance, demand forecasting, outage management, and optimal use of renewable energy sources as part of the overall smart grid.

L&T Technology Services and Microsoft recently teamed up to deliver a sustainable Smart Campus for a leading technology company in Israel where integrated assets, systems, and predictive analytics are combined to reduce energy consumption by up to 40 percent. Together with Microsoft, L&T Group, L&T Power, and L&T Technology Services are developing next generation smart grid solutions leveraging Azure IoT services in the context of smart city, smart campus, and smart building scenarios.

Telensa develops smarter streetlights using edge and AI technologies

Telensa, the world leader in smart streetlighting systems, has chosen Azure IoT for its smart cities dashboard as well as a new initiative, the Urban Data Project. The project creates a trusted infrastructure for urban data to enable cities to collect, protect, and use their data for the benefit of all citizens. The data comes from Telensa’s streetlight-based multi-sensor pods, which will run on Azure IoT Edge and feature real-time AI and machine learning to extract insights from the raw data. The first deployment will be in Cambridge, UK.

Connected solutions to build a smarter grid

With solutions that take full advantage of the intelligent cloud and intelligent edge, we continue to demonstrate how cloud, IoT, and AI have the power to drastically transform every industry. Smart grids will drive efficiencies for power and utility companies, grid operators, and energy prosumers. Come see our partners at DistribuTECH and how they are delivering the smart-grid future today.

Partner links

ABB 
Allego Charging solutions
Eaton EMCB
E.ON Home
eSmart Connected Drone
Itron
L&T Technologies Services and L&T Power
Microsoft joins EEBUS initiative
Nodes Market
Telensa Urban Data Project

Learn more about Microsoft Azure IoT.
Source: Azure

Intelligent Edge support grows – Azure IoT Edge now available on virtual machines

Earlier this year, Microsoft announced the general availability of Azure IoT Edge, which enables customers to bring cloud intelligence to the edge and act immediately on real-time data, whether it be a drone recognizing a crack in a gas pipe or predicting equipment failure before it happens. Azure IoT Edge is built to be secure, portable, and open. The Azure IoT Edge runtime is open sourced on GitHub so you can easily modify code, and the open container approach allows you to deploy Microsoft and third-party services across a range of edge devices.

We’re committed to building an open, robust ecosystem and giving customers choices in deploying their edge solution. Today we’re announcing that Azure IoT Edge runs in a virtual machine (VM) on any of the supported operating systems. While this works with multiple virtualization technologies, VMware has simplified the process of deploying Azure IoT Edge to VMs using VMware vSphere. Additionally, vSphere 6.7 and later provide passthrough support for the Trusted Platform Module (TPM), allowing Azure IoT Edge to maintain its industry-leading security framework by leveraging a hardware root of trust.

Azure’s intelligent edge portfolio is designed to run on a breadth of hardware to match our customers’ scenarios. This includes everything from microcontroller units (MCUs) running Azure Sphere to a fully consistent experience that is both cloud and edge, powered by Azure Stack. Azure IoT Edge already supports a variety of Linux and Windows operating systems as well as a spectrum of hardware from devices smaller than a Raspberry Pi to servers. Supporting IoT Edge in VMware vSphere offers even more customer choice for those who want to run AI on infrastructure they already own.

The hardware portfolio available to customers to power scenarios at the intelligent edge is almost as diverse as the sectors it’s being used in. We see customers building hybrid cloud and edge solutions in virtually every industry, and the hardware they choose for each is fit for purpose:

Home appliance makers can use Azure Sphere certified chips in their appliances to ensure operation is never compromised and customer data stays secure.
The oil and gas industry is optimizing production and performing predictive maintenance by processing rod pump data on site with Azure IoT Edge devices, smaller than a Raspberry Pi.
Utilities companies are autonomously inspecting pipelines and powerlines for defects through video analytics running on drones with Azure IoT Edge.
Textile producers are detecting weaving defects by adding industrialized PCs running Azure IoT Edge to their production lines.
Large retailers are optimizing their stores’ energy usage by analyzing HVAC data with Azure IoT Edge in a VM, running on existing servers in each retail store.
Electronics makers are implementing quality control and audit compliance scenarios with Azure Data Box Edge.
Healthcare networks are using Azure Stack to optimize stocking vaccines while complying with industry regulations around personally identifiable medical data.

Every company’s digital transformation is unique. Some scenarios can be accomplished primarily in the cloud, while others require high-value cloud services to move out of data centers and run adjacent to, or directly on, the devices creating data. Azure provides the most secure, scalable, and flexible options, regardless of your company’s hybrid cloud and edge needs.
Source: Azure

New connectors added to Azure Data Factory empowering richer insights

Data is essential to your business. The ability to unlock business insights more efficiently can be a key competitive advantage to the enterprise. As data grows in volume, variety, and velocity, organizations need to bring together a continuously increasing set of diverse datasets across silos in order to perform advanced analytics and uncover business opportunities. The first challenge in building such big data analytics solutions is how to connect to and extract data from a broad variety of data stores. Azure Data Factory (ADF) is a fully managed data integration service for analytic workloads in Azure that empowers you to copy data from more than 80 data sources with a simple drag-and-drop experience. With its flexible control flow, rich monitoring, and CI/CD capabilities, you can also operationalize and manage your ETL/ELT flows to meet your SLAs.

Today, we are excited to announce the release of a set of new ADF connectors which enable more scenarios and possibilities for your analytic workloads. For example, you can now:

Ingest data from Google Cloud Storage into Azure Data Lake Storage Gen2, and process it using Azure Databricks jointly with data coming from other sources.
Bring data from any S3-compatible data store that you may consume from third-party data vendors into Azure.
Copy data from MongoDB and others to Azure Cosmos DB MongoDB API for application consumption.
Retrieve data from any RESTful endpoint as an extensible point to reach hundreds of SaaS applications.

For more information, see the following updates on new connectors and additional features for existing connectors.

Connector updates

Azure Cosmos DB MongoDB API

You can now copy data to and from Azure Cosmos DB MongoDB API, in addition to the already supported SQL API. For writing into Azure Cosmos DB specifically, the connector sink is built on top of the Azure Cosmos DB bulk executor library to provide the best performance. Learn more about Azure Cosmos DB MongoDB API.

Amazon S3

ADF now enables custom S3 endpoint configuration in the Amazon S3 connector. With this, you can copy data from any S3-compatible storage provider using the connector and are no longer limited to the official Amazon S3 service. Learn more about the Amazon S3 connector.

Google Cloud Storage

As Google Cloud Storage provides S3-compatible interoperability, you can now copy data from Google Cloud Storage. This leverages the S3 connector with Google Cloud Storage’s corresponding S3 endpoint. Learn more about Google Cloud Storage connector.

MongoDB

To address feedback on MongoDB feature coverage, performance, and scalability, ADF is releasing a new version of the MongoDB connector. It provides comprehensive native MongoDB support, including a generic MongoDB connection string with connection options, native MongoDB queries, extraction of hierarchical data, and more. Learn more about the MongoDB connector.

Azure Database for MariaDB

You can now copy data from Azure Database for MariaDB. Learn more about the Azure Database for MariaDB connector.

Generic REST

You can now retrieve data from various RESTful services and apps. ADF offers this more targeted REST connector in addition to the generic HTTP connector. To fulfill the two most common asks we’ve received, the REST connector supports Azure Active Directory (AAD) service principal and Managed Identity for Azure resources (MSI) authentication, as well as pagination rules. Learn more about the REST connector.
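To picture what a pagination rule does conceptually, here is a minimal sketch of following a next-link field across pages until none remains. The `nextLink`/`value` field names and the simulated payloads are illustrative assumptions, not ADF's actual configuration schema:

```python
def fetch_all(fetch, url, next_link_field="nextLink"):
    """Follow a next-link pagination rule until no link remains."""
    items = []
    while url:
        page = fetch(url)                     # returns a parsed JSON dict
        items.extend(page.get("value", []))   # "value" holds the page's items (assumed shape)
        url = page.get(next_link_field)       # a missing link ends pagination
    return items

# Simulated two-page REST response keyed by URL (hypothetical endpoint).
pages = {
    "https://api.example.com/items": {
        "value": [1, 2], "nextLink": "https://api.example.com/items?page=2"},
    "https://api.example.com/items?page=2": {"value": [3]},
}
print(fetch_all(pages.get, "https://api.example.com/items"))  # → [1, 2, 3]
```

A real connector applies the same loop over HTTP responses; the pagination rule just tells it which field carries the next-page link.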

Generic OData

ADF now supports AAD service principal and Managed Identity for Azure resources (MSI) authentication when copying data from an OData endpoint. Learn more about the OData connector.

Dynamics AX (preview)

You can now copy data from Dynamics AX using OData protocol with service principal authentication. This connector also works with Dynamics 365 Finance and Operations (F&O). Learn more about Dynamics AX connector.

We encourage you to give these additions a try and provide us with feedback. We hope you find them helpful in your scenarios. Please post your questions on the Azure Data Factory forum or share your thoughts with us on the Data Factory feedback site.
Source: Azure

Find out when your virtual machine hardware is degraded with Scheduled Events

One of the benefits of moving to the cloud is that you, our customer, don’t need to deal with hardware maintenance and repairs; you can focus your time on your business applications. Azure continuously monitors for hardware that shows signs of degradation or potential failure. When these conditions are detected, Azure will attempt to live migrate your virtual machines (VMs). If live migration isn’t possible, Azure will automatically redeploy VMs to a healthy machine. If you have a disaster recovery setup, which is highly recommended, the impact of this redeployment will be minimal. However, a redeployment to a healthy machine may be problematic for some applications that can’t tolerate disruption. We’ve received feedback that in this situation, when possible, customers prefer to control when the redeployment to a healthy machine occurs.

We introduced Scheduled Events in Azure as a programmatic way to notify your VMs of upcoming maintenance events such as a live migration, redeployment, or reboot, and to act on them. Upon receiving a scheduled event, customers can take actions such as failing over, saving state, draining sessions in the VMs, scheduling a time for manual maintenance, or notifying their own customers. We’re excited to announce that Scheduled Events will now be triggered when Azure predicts that hardware issues will require a redeployment to healthy hardware in the near future, and will provide a time window in which Azure will redeploy the VMs to healthy hardware if a live migration was not possible. Customers can initiate the redeployment of their VMs ahead of Azure automatically doing it.

Hardware failure prediction

Azure has taken insight from operating millions of servers in its data centers to identify when hardware health is degrading and predict in many cases a failure before it happens. For example, Azure can detect if there is degradation in disk IO performance on a given node, or detect memory errors, and determine if this will become fatal.

When Azure detects imminent hardware failure, VMs are proactively live migrated when possible. This should have minimal impact on your workloads; the customer experience is typically a freeze of a few seconds during the final phase. Subscribing to Scheduled Events allows your VM to be notified a few minutes before the live migration process starts. However, there are cases where live migration isn’t possible, such as on specialized hardware (M-Series, G-Series, etc.) or on legacy hardware, in which case the VMs are redeployed to a new instance. Some of our customers have expressed interest in being able to control when to initiate a reallocation from the node and to control the experience during the process. Based on this feedback, we enhanced Scheduled Events to notify you when hardware is detected as unhealthy and to give the time at which the VM will be moved to another machine, provided the hardware does not fail sooner. In many cases there can be multiple days before the hardware fails, and through mitigations Azure tries to delay the failure. Because the time to failure varies, we recommend customers move off degraded hardware as soon as possible.

How to listen to these Scheduled Events

Your VM must subscribe to Scheduled Events to get events related to maintenance. Watch this video to learn how to programmatically enable and react to Scheduled Events. You can also find code samples of how to listen to Scheduled Events and then approve them once you have done your mitigation.

To listen to hardware-related events, you don’t have to do anything different! Hardware-related events are delivered as a redeploy event. The NotBefore time, which is the property that gives the time window before the maintenance is performed, could range from a few hours to a few days and can change depending on the severity of the hardware fault. As Azure’s estimation for the time to failure improves, the NotBefore time window will change to become more accurate. But note that since you’re running on degraded hardware that can fail suddenly, you should initiate a redeployment or approve the scheduled event as soon as possible after initiating the corresponding automated or manual actions. Once you approve the request, your VM will be redeployed to a new physical machine. You can track the completion of the redeploy via Scheduled Events. If you don’t approve the scheduled event within the NotBefore time, you will no longer have control of the experience and Azure will redeploy your VM to a healthy machine.
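As a rough illustration of the approval step, the sketch below builds the StartRequests body that approves any Redeploy events found in a Scheduled Events document. The endpoint URL, `Metadata` header, and document shape follow the Scheduled Events documentation for api-version=2017-08-01; the helper function and sample values are our own illustrative naming:

```python
import json

# Scheduled Events metadata endpoint (same URL for the GET poll and the approval POST).
ENDPOINT = "http://169.254.169.254/metadata/scheduledevents?api-version=2017-08-01"
HEADERS = {"Metadata": "true"}  # required header on every request to the endpoint

def approval_payload(events_doc, event_type="Redeploy"):
    """Build the StartRequests body that approves matching scheduled events."""
    ids = [e["EventId"] for e in events_doc.get("Events", [])
           if e.get("EventType") == event_type]
    return {"StartRequests": [{"EventId": i} for i in ids]}

# Sample document in the shape returned by the endpoint (illustrative values).
doc = {"DocumentIncarnation": 1, "Events": [
    {"EventId": "A123", "EventType": "Redeploy", "EventStatus": "Scheduled",
     "Resources": ["myVM"], "NotBefore": "Mon, 11 Feb 2019 18:29:47 GMT"}]}
print(json.dumps(approval_payload(doc)))  # → {"StartRequests": [{"EventId": "A123"}]}
```

In a real handler you would GET the document from `ENDPOINT`, run your mitigation (failover, state save, session drain), then POST this payload back to the same URL to start the redeployment early.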

Support for hardware degradation information via Scheduled Events is already available worldwide! There are no API changes, so this feature is available from api-version=2017-08-01.

If you are sensitive to platform maintenance events, I would highly encourage you to build automation by handling Scheduled Events. Try this out and let us know what you think in the comments below.
Source: Azure

Azure Stream Analytics now supports Azure SQL Database as reference data input

Our goal on the Azure Stream Analytics team is to empower developers and make it incredibly easy to leverage the power of Azure to analyze big data in real-time. We achieve this by continuously listening to feedback from our customers and shipping features that are delightful to use and serve as tools for tackling complex analytics scenarios. We are excited to share the public preview of Azure SQL Database as a reference data input for Stream Analytics, which is the most requested feature on UserVoice!

Typical scenarios for reference data

Reference data is a dataset that is static or slow-changing in nature, which you can correlate with real-time data streams to augment them. Stream Analytics leverages versioning of reference data to augment streaming data with the reference data that was valid at the time the event was generated.

An example scenario would be storing currency exchange rates in Azure SQL Database which is regularly updated to reflect market trends, and then converting a stream of billing events in different currencies to a standard currency.

In IoT scenarios, you could have millions of IoT devices emitting a stream of events with critical values like temperature and pressure being monitored. Using Stream Analytics, you can join this real-time data stream with metadata about each IoT device stored in Azure SQL Database to apply per-device thresholds and metadata.
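The versioning behavior described above, joining each event against the reference data that was valid when the event was generated, can be sketched in a few lines. The snapshot timestamps and currency rates below are made-up illustrative values:

```python
from bisect import bisect_right

# Reference snapshots sorted by the time they became valid (hypothetical data).
snapshots = [
    (0, {"EUR": 1.10}),    # rate valid from t=0
    (100, {"EUR": 1.15}),  # updated rate valid from t=100
]
valid_from = [t for t, _ in snapshots]

def reference_at(event_time):
    """Return the reference version that was valid when the event occurred."""
    i = bisect_right(valid_from, event_time) - 1
    return snapshots[i][1]

# A billing event at t=50 joins against the older rate; one at t=150, the newer.
print(reference_at(50)["EUR"], reference_at(150)["EUR"])  # → 1.1 1.15
```

Stream Analytics performs this version lookup for you during the join, so late-arriving events still see the rates that applied when they happened.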

Easily integrate with Azure SQL Database input

Until today, Azure Blob Storage was the only way to store your reference data. We heard from our customers that Azure SQL Database is a natural place to store datasets that need to be used in correlation with real-time data streams.

Instead of writing your logic and building custom pipelines to transfer data periodically from Azure SQL Database to Azure Blob Storage, Stream Analytics now provides out-of-the-box support for Azure SQL Database as reference data input. We knew providing just this capability alone wouldn’t delight our customers. So, we took it one step further and are providing the ability to automatically refresh your reference dataset periodically. You can easily configure this refresh interval when adding your input to the job. The refresh interval can be as short as one minute.

You might have a complex query to pull reference data from Azure SQL Database. In order to preserve the performance of your Stream Analytics job, we also provide the option to fetch incremental changes from your Azure SQL Database by writing a delta query.
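Conceptually, a delta query lets the job merge only changed rows into its cached reference set instead of reloading everything on each refresh. A minimal sketch of that upsert-by-key pattern, with hypothetical row shapes:

```python
def refresh(cache, rows_modified_since):
    """Merge only rows changed since the last refresh into the cached reference set."""
    for row in rows_modified_since:
        cache[row["id"]] = row  # upsert by primary key
    return cache

# Initial full load, then a delta containing one updated and one new row.
cache = {1: {"id": 1, "threshold": 70}, 2: {"id": 2, "threshold": 80}}
delta = [{"id": 2, "threshold": 85}, {"id": 3, "threshold": 90}]
refresh(cache, delta)
print(sorted(r["threshold"] for r in cache.values()))  # → [70, 85, 90]
```

In Stream Analytics the delta query you write plays the role of `rows_modified_since`: it selects only rows changed after the last refresh time, keeping the periodic refresh cheap for large reference tables.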

Getting started

You can try using Azure SQL Database as a source of reference data input to your Stream Analytics job today. This feature is available for public preview in all Azure regions. This feature is also available in the latest release of Stream Analytics tools for Visual Studio. We hope you take full advantage of this functionality and are excited to see what you build with Stream Analytics.

Providing feedback and ideas

The Azure Stream Analytics team is highly committed to listening to your feedback. We welcome you to join the conversation and make your voice heard via our UserVoice. You can stay up-to-date on the latest announcements by following us on Twitter @AzureStreaming. You can also reach out to us at askasa@microsoft.com.
Source: Azure

Help us shape new Azure migration capabilities: Sign up for early access!

Based on Azure Migrate and Azure Site Recovery usage trends, we know that many of you are well along on your Azure migration journey. We’re now working on the next wave of innovation to further enhance and simplify your migration experience. We have a great opportunity for you to influence and shape product direction through early access to new capabilities.

Delivering an integrated end-to-end migration experience that enables you to discover, assess, and migrate servers to Azure is the goal. To that end, we have several new capabilities in our roadmap, including a new user experience with partner tool integration, Hyper-V environment assessment, and server migration enhancements. You are welcome to migrate your workloads to Azure using these new features, and we will enable you by providing production support.

If you’d like to be part of this awesome opportunity, please fill out and submit this form as soon as possible. We will review your submission and follow up with on-boarding steps, including detailed guidance on how to participate and provide feedback.

Your feedback is extremely valuable in helping us improve our product offerings. We look forward to sharing more about what we’ve been working on and to hearing your inputs!

Regards,

Azure Migrate Team
Source: Azure

Completers in Azure PowerShell

Since version 3.0, PowerShell has supported applying argument completers to cmdlet parameters. These argument completers allow you to tab through a set of values that are valid for the parameter. However, unlike ValidateSet, which enforces that only the provided values are passed to the cmdlet, argument completers do not restrict the values that can be passed to the string parameter. Additionally, argument completers can supply either a static or a dynamic set of strings. Using this feature, we have added argument completers to the Azure PowerShell modules which allow you to select valid parameter values without needing to look them up yourself: the completers make the required calls to Azure to obtain the valid parameter values.

To best capture the functionality of the completers, I have modified the key binding for “Tab” in the examples below to display all the possible values at once. If you want to replicate this setup, simply run: “Set-PSReadLineKeyHandler -Key Tab -Function Complete.”

Location completer

The first completer that we created was the Location completer. Since each resource type has a distinct list of available Azure regions, we wanted to create an easy, quick way to select a valid region when creating a resource. Thus, for every parameter in our modules which accepts an Azure region (which in most cases is called Location), we added an argument completer that returns only the regions in which the resource type can be created. In the example below, you can see the result of pressing tab immediately after -Location for the New-AzResourceGroup cmdlet.

In addition to listing out all available regions that a Resource Group can be created in, the Location completer allows you to filter the results by typing in the first few characters of the region you are looking for.

Resource Group Name completer

The second completer that we added to the PowerShell modules is the Resource Group Name completer. This completer was applied to all parameters which accept an existing resource group and returns all resource groups in the current subscription. Similar to the Location completer, you can filter the results by typing the first few characters of the resource group before pressing tab.

Resource Name completer

The third completer that we added to the PowerShell modules is the Resource Name completer. This completer returns the list of names of all resources that match the resource type required by the parameter. Additionally, this argument completer will filter by the resource group name if it is already provided to the cmdlet invocation. For example, in the screenshot below, when we tab after typing “Get-AzVM -Name test,” we see all four VMs in the current subscription that start with “test.” Then, when we tab after typing “Get-AzVM -ResourceGroupName maddie1 -Name test,” we only see the two VMs that are contained in the “maddie1” resource group.

Not only does the Resource Name completer filter by the resource group name, but, for all subresources, it also filters by the parent resources, if they are provided to the cmdlet invocation. The results will be filtered by each of the parent resources provided. In the example below, you can see the results of tab completion over “maddiessqldatabase” for various combinations of parameters being provided.

At the moment, this completer has only been applied to the Compute, Network, KeyVault, and SQL modules. If you enjoy this feature and would like to see it applied to more modules, please let us know by sending us feedback using the Send-Feedback cmdlet.

Resource Id completer

The final completer that we added to the Az modules is the Resource Id completer. This completer returns all resource Ids in the current subscription, filtered by the resource type that the parameter requires. The Resource Id completer allows you to filter the results by typing in a few characters, using the '*<characters>*' wildcard pattern. This completer was applied to all parameters in our cmdlets that accept an Azure resource Id.

Try it out

To try out Azure PowerShell for yourself, install our Az module via the PowerShell Gallery. For more information about our new Az module, please check out our Az announcement blog. We look forward to getting your feedback, suggestions or issues via the built-in “Send-Feedback” cmdlet. Alternatively, you can always open an issue in our GitHub repository.
Source: Azure

Modernizing payment management for online merchants

E-commerce merchants all over the world are innovating every day to offer customers the best user experience. To keep customers coming back, the buying experience should leave only good impressions, from beginning to end. To achieve this, merchants want to examine every step—especially the payment checkout. So, payment processors need to complement and support the innovations of the merchants. And the final experience needs to be as intuitive and seamless as possible, so it does not break the checkout flow; it should support the brand experience and leave customers with a pleasing memory. Helping a merchant craft a seamless payment experience is the domain of Newgen.

Solution key features

Guru is Newgen's fully integrated portal that enables merchants to have a complete view of their payments, generate reports, capture/void transactions, and perform refunds. It is a fully managed SaaS solution that comes as a value addition with Newgen's Payment Gateway—a cutting-edge payment technology for merchants. The solution competes in the market with these key features.

Intelligent transaction routing: Newgen’s engine automatically routes transactions taking into account the country, credit provider, volume and ratio (selecting the best destination based on the transaction amount), currency, and transaction fees. Using machine learning that is continually improving, the engine bases its decisions on platform health, performance, and fees—it will select the optimal route to maximize your gains. The service provides capabilities that would otherwise consume a merchant’s resources to reproduce.
Split payments: When friends and family want to split a charge, Newgen will make it easy for them to do so. The end user provides the email addresses for the participants, and sets the split ratio (for example, “evenly”). The engine generates the email inviting others to participate. The initiator can check the status of the payment progress.
Page builders: Your brand and UI should be distinct, and the payment process needs to be integrated into it. Newgen lets you build a custom UI with a drag-and-drop approach. You can craft the checkout experience to ensure your brand is present and reassuring to your customers.
Flash checkout: Requiring customers to fill in form data should be a one-time event. Newgen gives the customer the option of saving the data for reuse and instant checkout. The data is securely stored, and Newgen strictly adheres to PCI DSS level 1 for maximum security.
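Newgen's engine is proprietary, but the arithmetic behind an "even" split can be sketched simply: work in cents and hand out any leftover cents one per participant, so shares never differ by more than one cent. The function name and amounts below are illustrative assumptions:

```python
def split_evenly(amount_cents, participants):
    """Split a charge evenly, spreading any leftover cents one per person."""
    base, remainder = divmod(amount_cents, participants)
    return [base + (1 if i < remainder else 0) for i in range(participants)]

# A 100.00 charge split three ways: shares differ by at most one cent.
print(split_evenly(10000, 3))  # → [3334, 3333, 3333]
```

Working in integer cents rather than floating-point currency amounts guarantees the shares always sum back to the exact original charge.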

Azure services

Guru is a fully cloud-based solution hosted completely on Microsoft Azure. It benefits from Azure’s highly scalable and secure technologies, with the flexibility to develop non-trivial technology stacks. Guru specifically uses these Azure technologies:

Azure Virtual Machines
Azure SQL Database
Azure Files
Azure Site Recovery
Azure Functions

Recommended next steps

Explore Newgen’s various solutions to see what works for you. Or, go to the Azure Marketplace, and click Contact me.
Source: Azure