Instant File Recovery from Azure Linux VM backup using Azure Backup – Preview

We earlier announced instant file recovery from Azure Windows VM backups, which enables you to restore files instantly from the Azure Recovery Services vault with no additional cost or infrastructure. Today, we are excited to announce the same feature for Azure Linux VM backups, in preview. If you are new to Azure Backup, you can start backing up directly from the Azure IaaS VM blade and start using this feature.

Value proposition:

Instant recovery of files – Instantly recover files from the cloud backups of Azure VMs. Whether it is accidental file deletion or simply validating a backup, instant restore drastically reduces the time to recover your data.

Mount application files without restoring them – Our iSCSI-based approach allows you to open or mount application files directly from cloud recovery points to application instances, without having to restore them. For example, for a backed-up Azure Linux VM running MongoDB, you can mount BSON data dumps from the cloud recovery point and quickly validate the backup or retrieve individual items, such as collections, without having to download the entire data dump.

Learn how to instantly recover files from Azure Linux VM backups:

Basic requirements

The downloaded recovery script can be run on a machine that meets the following requirements. The OS of the machine where the script is run (the recovery machine) should support and recognize the underlying file system of the files present in the backed-up Linux VM. Ensure that the OS of the recovery machine is compatible with the backed-up VM and matches one of the following versions:

Linux OS: Versions
Ubuntu: 12.04 and above
CentOS: 6.5 and above
RHEL: 6.7 and above
Debian: 7 and above
Oracle Linux: 6.4 and above

The script requires Python and Bash components to execute and to provide a secure connection to the recovery point:

Component: Version
Python: 2.6.6 and above
Bash: 4 and above

Only users with root-level access can view the paths mounted by the script.
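The version requirements above can be checked up front. A minimal sketch (the downloaded recovery script performs its own validation; this is only illustrative):

```python
#!/usr/bin/env python
# Illustrative prerequisite check for the recovery machine, based on the
# requirements listed above. The downloaded script does its own validation.
import sys

MIN_PYTHON = (2, 6, 6)   # "Python 2.6.6 and above"
MIN_BASH_MAJOR = 4       # "Bash 4 and above"

def python_ok():
    # sys.version_info compares tuple-wise, so this works on 2.6+ and 3.x
    return tuple(sys.version_info[:3]) >= MIN_PYTHON

def bash_major_ok(version_string):
    # version_string is the output of: bash -c 'echo $BASH_VERSION'
    # e.g. "4.3.48(1)-release" -> major version 4
    return int(version_string.split(".")[0]) >= MIN_BASH_MAJOR

print("python ok:", python_ok())
print("bash ok:", bash_major_ok("4.3.48(1)-release"))
```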
Advanced configurations

Recovering files from LVM/software RAID arrays: If the backed-up Linux VM uses LVM or RAID arrays, you cannot run the script on that same virtual machine due to disk conflicts. Run the script on any other recovery machine (meeting the basic requirements above), and the script will attach the relevant disks, as shown in the script output. The following additional commands then need to be run to make the LVM/RAID array partitions visible and online.

For LVM partitions:

$ pvs <volume name as shown in the script output>  – lists all volume groups under this physical volume
$ lvdisplay <volume group name from the previous command's result>  – lists all logical volumes, their names, and their paths in the volume group
$ mount <LV path> </mountpath>  – mounts the logical volumes at a path of your choice

For RAID arrays:

$ mdadm --detail --scan  – displays details about all RAID disks on this machine

The relevant RAID disk from the backed-up VM is displayed with its name (/dev/md/<RAID array name in the backed-up VM>).

If the RAID disk has physical volumes, mount the disk directly to view all the volumes within it:

$ mount <RAID disk path> </mountpath>

If LVM was configured on top of the RAID disk, reuse the process defined for LVM above, supplying the volume name as input.

Related links and additional content

Want more details about this feature? Check out the Azure Backup Linux restore documentation.
Need help? Reach out to the Azure Backup forum for support.
Tell us how we can improve Azure Backup by contributing new ideas and voting up existing ones.
Follow us on Twitter @AzureBackup for the latest news and updates.
New to Azure Backup? Sign up for a free Azure trial subscription.
Source: Azure

doAzureParallel: Take advantage of Azure’s flexible compute directly from your R session

Users of the R language often require more compute capacity than their local machines can handle. However, scaling up their work to take advantage of cloud capacity can be complex, troublesome, and can often distract R users from focusing on their algorithms.

We are excited to announce doAzureParallel – a lightweight R package built on top of Azure Batch, that allows you to easily use Azure’s flexible compute resources right from your R session.

At its core, the doAzureParallel package is a parallel backend, for the widely popular foreach package, that lets you execute multiple processes across a cluster of Azure virtual machines. In just a few lines of code, the package helps you create and manage a cluster in Azure, and register it as a parallel backend to be used with the foreach package.

With doAzureParallel, there’s no need to manually create, configure, and manage a cluster of individual virtual machines. Instead, this package makes running your jobs at scale no more complex than running your algorithms on your local machine. With Azure Batch’s autoscaling capabilities, you can also increase or decrease the size of your cluster to fit your workloads, helping you to save time and/or money.

doAzureParallel also uses the Azure Data Science Virtual Machine (DSVM), allowing Azure Batch to configure the appropriate environment quickly and with minimal setup.

There is no additional cost for these capabilities – you only pay for the Azure VMs you use.

doAzureParallel is ideal for running embarrassingly parallel work such as parametric sweeps or Monte Carlo simulations, making it a great fit for many financial modelling algorithms (back-testing, portfolio scenario modelling, etc.).

Installation / Pre-requisites

To use doAzureParallel, you need to have a Batch account and a Storage account set up in Azure. More information on setting up your Azure accounts.

You can install the package directly from Github. More information on install instructions and dependencies.

Getting Started

Once you install the package, getting started is as simple as a few lines of code:

Load the package:

library(doAzureParallel)

Set up your parallel backend (which is your pool of virtual machines) with Azure:

# 1. Generate a pool configuration json file.
generateClusterConfig("pool_config.json")

# 2. Edit your pool configuration file.
# Enter your Batch account & Storage account information and configure your pool settings

# 3. Create your pool. This will create a new pool if your pool hasn’t already been provisioned.
pool <- makeCluster("pool_config.json")

# 4. Register the pool as your parallel backend
registerDoAzureParallel(pool)

# 5. Check that your parallel backend has been registered
getDoParWorkers()

Run your parallel foreach loop with the %dopar% keyword. The foreach function will return the results of your parallel code.

number_of_iterations <- 10
results <- foreach(i = 1:number_of_iterations) %dopar% {
    # This code is executed, in parallel, across your Azure pool.
    myAlgorithm(...)
}

When developing at scale, it is always recommended that you test and debug your code locally first. Switch between %dopar% and %do% to toggle between running in parallel on Azure and running in sequence on your local machine.

# run your code sequentially on your local machine
results <- foreach(i = 1:number_of_iterations) %do% { … }

# use the doAzureParallel backend to run your code in parallel across your Azure pool
results <- foreach(i = 1:number_of_iterations) %dopar% {…}

After you finish running your R code at scale, you may want to shut down your pool of VMs to make sure that you aren’t being charged anymore:

# shut down your pool
stopCluster(pool)

Monte Carlo Pricing Simulation Demo

The following demo will show you a simplified version of predicting a stock price after 5 years by simulating 5 million different outcomes of a single stock.

Let's imagine Contoso's stock price moves, on average, by a factor of 1.001 each day, with a volatility of 0.01. Given a starting price of $100, we can use a Monte Carlo pricing simulation to figure out what Contoso's stock price will be after 5 years.

First, define the assumptions:

mean_change = 1.001
volatility = 0.01
opening_price = 100

Create a function to simulate the movement of the stock price for one possible outcome over 5 years, by taking the cumulative product of draws from a normal distribution using the variables defined above:

simulateMovement <- function() {
    days <- 1825 # ~ 5 years
    movement <- rnorm(days, mean=mean_change, sd=volatility)
    path <- cumprod(c(opening_price, movement))
    return(path)
}

On our local machine, simulate 30 possible outcomes and graph the results:

simulations <- replicate(30, simulateMovement())
matplot(simulations, type='l') # plots all 30 simulations on a graph

To understand where Contoso's stock price will be in 5 years, we need to understand the distribution of the closing price for each simulation (as represented by the lines). But instead of looking at the distribution of just 30 possible outcomes, let's simulate 5 million outcomes to get a massive sample for the distribution.

Create a function to simulate the movement of the stock price for one possible outcome, but only return the closing price.

getClosingPrice <- function() {
    days <- 1825 # ~ 5 years
    movement <- rnorm(days, mean=mean_change, sd=volatility)
    path <- cumprod(c(opening_price, movement))
    closingPrice <- path[length(path)] # last element; path holds days + 1 prices
    return(closingPrice)
}

Using the foreach package and doAzureParallel, we can simulate 5 million outcomes in Azure. To parallelize this, let's run 50 iterations of 100,000 outcomes each:

closingPrices <- foreach(i = 1:50, .combine='c') %dopar% {
    replicate(100000, getClosingPrice())
}

After running the foreach package against the doAzureParallel backend, you can look at your Azure Batch account in the Azure Portal to see your pool of VMs running the simulation.

As the nodes in the heat map change color, we can see them busily working on the pricing simulation.

When the simulation finishes, the package will automatically merge the results of each simulation and pull them down from the nodes so that you are ready to use the results in your R session.

Finally, we'll plot the results to get a sense of the distribution of closing prices over the 5 million possible outcomes.

# plot the 5 million closing prices in a histogram
hist(closingPrices)

Based on the distribution above, Contoso's stock price will most likely move from the opening price of $100 to a closing price of roughly $500 after a 5-year period.
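That figure can also be sanity-checked analytically. For small volatility, the product of the daily factors is approximately lognormal, and the peak of a lognormal histogram has a closed form. A sketch of that check (an approximation, not part of the original demo):

```python
import math

# Parameters from the demo above
mean_change, volatility, opening_price, days = 1.001, 0.01, 100.0, 1825

# For small volatility, the log of each daily factor is approximately
# Normal with mean m and variance s2:
m = math.log(mean_change) - (volatility / mean_change) ** 2 / 2.0
s2 = (volatility / mean_change) ** 2

# The closing price is then approximately lognormal, and the peak of its
# histogram (the most likely closing price) is:
mode = opening_price * math.exp(days * m - days * s2)
print(round(mode))  # roughly 470, consistent with the histogram's peak
```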

 

We look forward to you using these capabilities and hearing your feedback. Please contact us at razurebatch@microsoft.com for feedback or feel free to contribute to our Github repository.

Additional information:

Download and get started with doAzureParallel
For questions related to using the doAzureParallel package, please see our docs, or feel free to reach out to razurebatch@microsoft.com
Please submit issues via Github

Additional Resources:

See Azure Batch, the underlying Azure service used by the doAzureParallel package
More general purpose HPC on Azure

Source: Azure

How Microsoft builds its fast and reliable global network

Every day, customers around the world connect to Microsoft Azure, Bing, Dynamics 365, Office 365, OneDrive, Xbox, and many other services through trillions of requests. These requests are for diverse types of data, such as enterprise cloud applications and email, VOIP, streaming video, IoT, search, and cloud storage.

Customers expect instant responsiveness and reliability from our services. The Microsoft global wide-area network (WAN) plays an important part in delivering a great cloud service experience. Connecting hundreds of datacenters in 38 regions around the world, our global network offers near-perfect availability, high capacity, and the flexibility to respond to unpredictable demand spikes.

As we build, expand, and run this world-class network, we rely on three guiding principles:

Be as close as possible to our customers for optimal latency.
Stay in control of capacity and resiliency to guarantee that the network can survive multiple failures.
Proactively manage network traffic at scale via software-defined networking (SDN).

We are as close to customers as possible

You want a fast, reliable response when you use Microsoft services. Data travels over our network at nearly the speed of light, so latency is largely a function of the distance between the customer and the datacenter: if your service is far away, say you're in London and the service is in Tokyo, the network path determines latency. We use innovative software to optimize network routing and to build and deploy network paths that are as direct as possible between customers and their data and services. This reduces latency to the limits imposed by the speed of light.
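As a back-of-the-envelope illustration of that limit: light in optical fiber travels at roughly two-thirds of its vacuum speed, so distance alone puts a hard floor under round-trip time. The London-to-Tokyo distance used below (~9,600 km great-circle) is an illustrative assumption:

```python
# Lower bound on round-trip latency imposed by the speed of light in fiber.
C_VACUUM_KM_S = 299792.0
FIBER_SPEED_KM_S = C_VACUUM_KM_S * 2.0 / 3.0  # ~200,000 km/s in glass

def min_rtt_ms(distance_km):
    """Round-trip time floor over a perfectly direct fiber path."""
    return 2.0 * distance_km / FIBER_SPEED_KM_S * 1000.0

# London -> Tokyo, ~9,600 km great-circle (illustrative)
print(round(min_rtt_ms(9600)))  # ~96 ms before any routing or queuing delay
```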

Customer traffic enters our global network through strategically placed Microsoft edge nodes, our points of presence. These edge nodes are directly interconnected to more than 2,500 unique Internet partners through thousands of connections in more than 120 locations. Our rich interconnection strategy optimizes the paths that data travels on our global network. Customers get a better network experience with lower latency, jitter, and packet loss, and higher throughput. Direct interconnections give customers better quality of service compared to transit links, because there are fewer hops, fewer parties, and better networking paths.

Figure 1. Microsoft Global WAN

Azure traffic between our datacenters stays on our network and does not flow over the Internet. This includes all traffic between Microsoft services anywhere in the world. For example, within Azure, traffic between virtual machines, storage, and SQL communication traverses only the Microsoft network, regardless of the source and destination region. Intra-region VNet-to-VNet traffic, as well as cross-region VNet-to-VNet traffic, stays on the Microsoft network.

Customers can use Azure ExpressRoute to create private network connections to Azure, Dynamics 365, Office 365, and Skype for Business. ExpressRoute connections bypass the Internet and offer more reliability, faster speeds, and less latency than typical Internet connections. With ExpressRoute, customers connect to Azure at an ExpressRoute location at specific Microsoft edge sites, such as an Internet exchange provider facility, or directly connect to Azure from an existing corporate WAN, such as a Multiprotocol Label Switching (MPLS) VPN provided by a network service provider.

For example, customers can connect to a local ExpressRoute site in Dallas and access virtual machines in Amsterdam, Busan, Dublin, Hong Kong, Osaka, Seoul, Singapore, Sydney, Tokyo, (or any of our datacenters) and the traffic will stay on our global backbone network. We have 37 ExpressRoute sites, and growing, with one near each Azure region, as well as other strategic locations. Every time we announce a new Azure region, like we recently did in Korea, you can expect that ExpressRoute will also be there, along with our global ecosystem of ExpressRoute partners.

Figure 2. A sampling of the Microsoft ExpressRoute partner ecosystem which includes the world’s largest network and co-location providers

Stay in control of capacity and provide resiliency

To give customers a service that works well, our network must be able to handle failures and rapidly respond to demand spikes. To support the tremendous growth of our cloud services and maintain consistent service level agreements, we invest in private fiber (sometimes called dark fiber) for our metro, terrestrial, and submarine paths. Microsoft owns and runs one of the largest backbone networks in the world, connecting our datacenters and customers. Over the last three years, we’ve grown our long-haul WAN capacity by 700 percent. Within a given region, we can support up to 1.6 Pbps of inter-datacenter bandwidth. We continue to increase capacity to meet the strong demand for Microsoft cloud services.

Microsoft owns and runs one of the largest WAN backbones in the world.

Our submarine investments improve resiliency, performance, and reliability across the Pacific and Atlantic Oceans. Our latest investment is the MAREA cable, a 6,600 km submarine cable between Virginia Beach, Virginia, USA, and Bilbao, Spain, which we jointly developed with Facebook. MAREA will be the highest-capacity subsea cable to cross the Atlantic, featuring eight fiber pairs and an initial estimated design capacity of 160 Tbps. This open cable system is an innovation in submarine cable design and delivery, which allows for greater bandwidth capacity thresholds and reduces cost. More importantly, it has given us the ability to introduce SDN principles into cable management, resulting in a better quality of service.

Over the last three years, we’ve grown our long-haul WAN capacity by 700 percent.

Global network infrastructure can be surprisingly vulnerable. For example, fiber optic cables can be cut by ship anchors dragging along the seabed, as happened when a ship accidentally cut Jersey’s internet cables with its anchor. To provide the reliability our cloud needs, we have many physical networking paths with automatic routing around failures for optimal reliability.

Figure 3. The inter-datacenter backbone connects datacenters globally with fiber optic cables

Controlling operations and managing traffic with software

Delivering traffic to millions of physical servers, and growing, isn’t possible with pre-cloud technologies. In partnership with Microsoft Research, we developed a range of SDN technologies to optimally manage routing and centralize control to meet network-wide goals. We use standard switches and routers, and then we manage them with our own software, which is built to handle the enormous volume of traffic on the Microsoft network.

We use an SDN-based architecture called SWAN to manage our WAN, which enables centralized management and control of network infrastructure and improves reliability and efficiency. SWAN controls when and how much traffic each service sends and automatically reconfigures the network’s data plane to match traffic demand. With SWAN, we control every network flow from the very farthest reaches of our network, across our global WAN, all the way down to the network interface card (NIC) on a server in one of our datacenters.

Conclusion

Whether you choose to reach the Microsoft cloud through the Internet or through a private network, we are committed to building the fastest and most reliable global network of any public cloud. We continue innovating and investing in a globally distributed networking platform to enable high performance, low latency, and the world’s most reliable cloud. We will continue to provide you with the best possible network experience, wherever in the world you happen to be.

Read more

To read more posts from this series please visit:

Networking innovations that drive the cloud disruption
SONiC: The networking switch software that powers the Microsoft Global Cloud

Source: Azure

DocumentDB: API for MongoDB now generally available

Today, we are excited to announce that DocumentDB: API for MongoDB is generally available. The API for MongoDB allows developers to experience the power of the DocumentDB database engine with the comfort of a managed service and the familiarity of the MongoDB SDKs and tools. With the announcement of its general availability, we are introducing a suite of new features for improvements in availability, scalability, and usability of the service.

What is API for MongoDB?

DocumentDB: API for MongoDB is a flavor of DocumentDB that enables MongoDB developers to use familiar SDKs, tool chains, and libraries to develop against DocumentDB. MongoDB developers can now enjoy the advantages of DocumentDB, which include auto-indexing, no server management, limitless scale, enterprise-grade availability backed by service level agreements (SLAs), and enterprise-grade customer support.

What’s new?

From preview to general availability, we have reached a few important milestones. We are proud to introduce a number of major feature releases:

Sharded Collections
Global Databases
Read-only Keys
Additional portal metrics

Sharded Collections – By specifying a shard key, API for MongoDB will automatically distribute your data among multiple partitions to scale out both storage and throughput. Sharded collections are an excellent option for applications that ingest large volumes of data or that require high-throughput, low-latency access to data. Sharded collections can be scaled in a matter of seconds in the Azure portal, and can scale to a nearly limitless amount of both storage and throughput.
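Because the API speaks the MongoDB wire protocol, a sharded collection is created with the standard shardCollection command sent through any MongoDB driver's db.command(...). The sketch below only builds that command document; the database, collection, and key names are placeholders, and hashed partitioning is one common choice:

```python
# Build the shardCollection command document a MongoDB driver would send
# via db.command(...). All names here are placeholders.
def shard_collection_command(db_name, coll_name, shard_key_field):
    return {
        "shardCollection": "%s.%s" % (db_name, coll_name),  # namespace
        "key": {shard_key_field: "hashed"},  # hash-partition by this field
    }

print(shard_collection_command("mydb", "orders", "customerId"))
```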

Global Databases – API for MongoDB now allows you to replicate your data across multiple regions to deliver high availability. You can replicate your data across any of Azure’s 30+ datacenters with just a few clicks from the Azure portal. Global databases are a great option for delivering low latency requests across the world or in preparation for disaster recovery (DR) scenarios. Global databases have support for both manual and policy driven failovers for full user control.

Read-only Keys – API for MongoDB now supports read-only keys, which will only allow read operations on the API for MongoDB database.

Portal Metrics – To improve visibility into the database, we are proud to announce that we have added additional metrics to the Azure portal. For all API for MongoDB databases, we provide metrics on the number of requests, request charges, and errored requests. Supplementing the portal metrics, we have also added a custom command, GetLastRequestStatistics, which allows you to programmatically determine a command’s request charge.

What’s next?

General availability is just the beginning for all the features and improvements we have in store for DocumentDB: API for MongoDB. In the near future, we will be releasing support for unique indexes and a couple of major performance improvements. Stay tuned!

In addition to API for MongoDB’s general availability, we are announcing a preview Spark connector. Visit our Github repo for more information.

We hope you take advantage of these new features and capabilities. Please continue to provide feedback on what you want to see next. Try out DocumentDB: API for MongoDB today by signing up for a free trial and creating an API for MongoDB account.

Stay up-to-date on the latest Azure DocumentDB news and features by following us on Twitter @DocumentDB.
Source: Azure

Announcing new capabilities of HDInsight and DocumentDB at Strata

This week in San Jose, Microsoft will be at Strata + Hadoop World, where we will be announcing new capabilities of Azure HDInsight, our fully managed OSS analytics platform for running all open-source analytics workloads at scale with enterprise-grade security and SLAs, and Azure DocumentDB, our planet-scale, fully managed NoSQL database service. Our vision is to deeply integrate both services and make it seamless for developers to process massive amounts of data with low latency and global scale.

DocumentDB announcements

DocumentDB is Microsoft’s globally distributed database service designed to enable developers to build planet-scale applications. DocumentDB allows you to elastically scale both throughput and storage across any number of geographical regions. The service offers guaranteed single-digit millisecond low latency at the 99th percentile, 99.99% high availability, predictable throughput, and multiple well-defined consistency models—all backed by comprehensive SLAs for latency, availability, throughput, and consistency. By virtue of its schema-agnostic and write-optimized database engine, DocumentDB, by default, is capable of automatically indexing all the data it ingests and serves across SQL, MongoDB, and JavaScript language-integrated queries in a scale-independent manner. As one of the foundational services of Azure, DocumentDB has been used virtually ubiquitously as a backend for first-party Microsoft services for many years. Since its general availability in 2015, DocumentDB is one of the fastest growing services on Azure.

Real-time data science with Apache Spark and DocumentDB

At Strata, we are pleased to announce the Spark connector for DocumentDB. It enables real-time data science and exploration over globally distributed data in DocumentDB. Connecting Apache Spark to Azure DocumentDB accelerates our customers’ ability to solve fast-moving data science problems, where data can be quickly persisted and retrieved using DocumentDB. The Spark to DocumentDB connector efficiently exploits the native DocumentDB managed indexes, and enables updateable columns, push-down predicate filtering, and advanced analytics against fast-changing, globally distributed data, across IoT, data science, and analytics scenarios. The Spark to DocumentDB connector uses the Azure DocumentDB Java SDK. Get started today and download the Spark connector from GitHub!

General availability of high-fidelity, SLA backed MongoDB APIs for DocumentDB

DocumentDB is architected to natively support multiple data models, wire protocols, and APIs. Today we are announcing the general availability of our DocumentDB’s API for MongoDB. With this, existing applications built on top of MongoDB can seamlessly target DocumentDB and continue to use their MongoDB client drivers and toolchain. This allows customers to easily move to DocumentDB while continuing to use the MongoDB APIs, but get comprehensive enterprise grade SLAs, turn-key global distribution, security, compliance, and a fully managed service.

HDInsight announcements

Cloud-first with Hortonworks Data Platform 2.6

Microsoft’s cloud-first strategy has already shown success with customers and analysts, having recently been placed as a Leader in the Forrester Big Data Hadoop Cloud Solutions Wave and a Leader in the Gartner Magic Quadrant for Data Management Solutions for Analytics. Operating a fully managed cloud service like HDInsight, which is backed by an enterprise-grade SLA, enables customers to deploy the latest bits of Hadoop and Spark on demand. To that end, we are excited that the latest Hortonworks Data Platform 2.6 will be continuously available on HDInsight even before its on-premises release. Hortonworks’ commitment to being cloud-first is especially significant given the growing importance of cloud for Hadoop and Spark workloads.

"At Hortonworks we have seen more and more Hadoop related work loads and applications move to the cloud. Starting in HDP 2.6, we are adopting a “Cloud First” strategy in which our platform will be available on our cloud platforms – Azure HDInsight at the same time or even before it is available on traditional on-premises settings. With this in mind, we are very excited that Microsoft and Hortonworks will empower Azure HDInsight customers to be the first to benefit from our HDP 2.6 innovation in the near future."
– Arun Murthy, co-founder, Hortonworks

Most secured Hadoop in a managed cloud offering

Last year at Strata + Hadoop World Conference in New York, we announced the highest levels of security for authentication, authorization, auditing, and encryption natively available in HDInsight for Hadoop workloads. Now, we are expanding our security capabilities across other workloads including Interactive Hive (powered by LLAP) and Apache Spark. This allows customers to use Apache Ranger over these popular workloads to provide a central policy and management portal to author and maintain fine-grained access control. In addition, customers can now analyze detailed audit records in the familiar Apache Ranger user interface.

New fully managed, SLA-backed Apache Spark 2.1 offering

With the latest release of Apache Spark for Azure HDInsight, we are providing the only fully managed, 99.9% SLA-backed Spark 2.1 cluster in the market. Additionally, we are introducing capabilities to support real-time streaming solutions with Spark integration to Azure Event Hubs and leveraging the structured streaming connector in Kafka for HDInsight. This will allow customers to use Spark to analyze millions of real-time events ingested into these Azure services, thus enabling IoT and other real-time scenarios. We made this possible through DirectStreaming support, which improves the performance and reliability of Spark streaming jobs as it processes data from Event Hubs. The source code and binary distribution of this work is now available publicly on GitHub.

New data science experiences with Zeppelin and ISV partnerships

Our goal is to make big data accessible for everybody. We have designed productivity experiences for different audiences including the data engineer working on ETL jobs with Visual Studio, Eclipse, and IntelliJ support, the data scientists performing experimentation with Microsoft R Server and Jupyter notebook support, and the business analysts creating dashboards with Power BI, Tableau, SAP Lumira, and Qlik support. As part of HDInsight’s support for the latest Hortonworks Data Platform 2.6, Zeppelin notebooks, a popular workspace for data scientists, will support both Spark 2.1 and Interactive Hive (LLAP). Additionally, we have added popular independent software vendors (ISVs) Dataiku and H2O.ai to our existing set of ISV applications that are available on the HDInsight platform. Through the unique design of HDInsight edge nodes, customers can spin up these data science solutions directly on HDInsight clusters, which are integrated and tuned out-of-the-box, making it easier for customers to build intelligent applications.

Enabling Data Warehouse scenarios through Interactive Hive

Microsoft has been involved from the beginning in making Apache Hive run faster with our contributions to Project Stinger and Tez, which sped up Hive query performance by up to 100x. We announced support for Hive using LLAP (Live Long and Process) to speed up query performance by up to an additional 25x. With support for the newest version of Apache Hive 2.1.1, customers can expect sub-second query performance, thus enabling data warehouse scenarios over all enterprise data, without the need for data movement. Interactive Hive clusters also support popular BI tools, which is useful for business analysts who want to run their favorite tools directly on top of Hadoop.

Announcing SQL Server CTP 1.4

Microsoft is excited to announce that a new preview for the next version of SQL Server, Community Technology Preview (CTP) 1.4, is available on both Windows and Linux. This preview offers enhancements to SQL Server v.Next on Linux. Another enhancement to SQL Server v.Next on Windows and Linux is support for resumable online B-tree index rebuilds, which extends flexibility in index maintenance scheduling and recovery. You can try the preview in your choice of development and test environments now. For additional detail on CTP 1.4, please visit What’s New in SQL Server v.Next, the release notes, and the Linux documentation.

Earlier today, we also announced a new online event that will take place next month – Microsoft Data Amp. During the event, Scott Guthrie and Joseph Sirosh will share some exciting new announcements around investments we are making that put data front and center of application innovation and artificial intelligence. I encourage you to check out Mitra Azizirad’s blog post to learn more about Microsoft Data Amp and save the date for what’s going to be an amazing event.

This week the big data world is focused on Strata + Hadoop World in San Jose, a great event for the industry and community. We are committed to making the innovations in big data and NoSQL natively available, easily accessible, and highly productive as part of our Azure services.
Quelle: Azure

Announcing general availability of Update 4.0 for StorSimple 8000 series

We are pleased to announce that StorSimple 8000 series Update 4.0 is now generally available. This release has the following new features and enhancements:

Heatmap-based restore – No more slowness when accessing data from the appliance after a device restore (DR). The new feature implemented in Update 4 tracks frequently accessed data to build a heatmap while the device is in use prior to DR. After DR, it uses the heatmap to automatically restore and rehydrate that data from the cloud.
Performance enhancements for locally pinned volumes – This update has improved the performance of locally pinned volumes in scenarios that have high data ingestion.
Bug fixes – In the areas of MPIO support for StorSimple Snapshot Manager, alerts, controller replacement, updates, and more.
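The heatmap idea in the first feature above can be illustrated conceptually: record access frequency per chunk of data while the device is in use, then prioritize the hottest chunks when rehydrating from the cloud after DR. This is only a toy Python sketch of the prioritization logic (all names are hypothetical; StorSimple's actual implementation is internal to the appliance):

```python
from collections import Counter

class Heatmap:
    """Toy model of heatmap-based restore prioritization (illustrative only)."""

    def __init__(self):
        self.access_counts = Counter()

    def record_access(self, chunk_id):
        # Called while the device is in use, before DR.
        self.access_counts[chunk_id] += 1

    def restore_order(self, all_chunks):
        # After DR: hottest chunks are rehydrated first, cold chunks later.
        return sorted(all_chunks, key=lambda c: self.access_counts[c], reverse=True)

heatmap = Heatmap()
for chunk in ["a", "b", "a", "c", "a", "b"]:
    heatmap.record_access(chunk)

print(heatmap.restore_order(["a", "b", "c", "d"]))  # most frequently accessed first
```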

This update is now generally available for customers to apply from the StorSimple Manager Service in Azure. You can also manually apply this update using the hotfix method.

Next steps:

Visit StorSimple 8000 Series Update 4 release notes for a full list of features and enhancements.

For step-by-step instructions on how to apply Update 4, please visit Install Update 4 on your StorSimple device.
Quelle: Azure

Announcing Storage Optimized Virtual Machines, L Series

We are excited to introduce a new series of virtual machine sizes. The L Series is storage optimized for workloads that require low latency, such as NoSQL databases (e.g. Cassandra, MongoDB, Cloudera, and Redis). This new series offers up to 32 CPU cores, using the Intel® Xeon® processor E5 v3 family, with CPU performance similar to the currently available G-Series.

L Series offers 4 new VM sizes from 4 cores, 32 GiB of memory, and 678 GB of fast local SSD, scaling up to 32 cores with 256 GiB of memory, and over 5.6 TB of local SSD. Please refer to the Azure VM pricing page for pricing details.

At general availability, L Series VMs are available in the following regions:

East US 2
West US
Southeast Asia
Canada Central
Canada East
Australia East

Please check the Azure services by region site for future updates to the geographic availability of L Series.

L Series: Standard and premium storage optimized

| VM sizes | CPU Cores | Memory | Temporary Disk (SSD) | Max Network Bandwidth |
| --- | --- | --- | --- | --- |
| Standard_L4s | 4 | 32 GiB | 678 GB | Moderate |
| Standard_L8s | 8 | 64 GiB | 1388 GB | High |
| Standard_L16s | 16 | 128 GiB | 2807 GB | Very high |
| Standard_L32s | 32 | 256 GiB | 5630 GB | Very high |

Note: Storage values for disk sizes use a legacy "GB" label. They are actually calculated in gibibytes, and all values should be read as "X GiB"
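The GB/GiB distinction matters when sizing: a disk labeled "678 GB" in the table is actually 678 GiB, i.e. 678 × 2³⁰ bytes rather than 678 × 10⁹ bytes. A quick check in Python shows the difference is roughly 7%:

```python
GIB = 2 ** 30  # gibibyte (binary unit)
GB = 10 ** 9   # gigabyte (decimal unit)

size_gib = 678
size_bytes = size_gib * GIB

# The same disk expressed in decimal gigabytes: ~728 GB, about 7.4% more
# than the "678 GB" label suggests.
print(size_bytes / GB)
```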
Quelle: Azure

Using templates to customize restored VMs from Azure Backup

Last week, we covered how you can configure backup on Azure VMs using Azure Quickstart templates. In this blog post, we will cover how you can customize the VM that will be created as part of restore operation from Azure backup to match your restore requirements.

Azure Backup provides three ways to restore from VM backup – create a new VM from the VM backup, restore disks from the VM backup and use them to create a VM, or instant file recovery from the VM backup. While creating a VM from a VM backup gives you a restored VM, it does not let you customize the configuration captured at backup time. If you want to test a restore or spin up a new VM with a different configuration, you can restore the disks and attach them to a different VM configuration using PowerShell. Today, we are happy to announce a feature that deploys a customizable template along with the restore disks option, letting you customize the configuration of the restored VM.

Customizing restored VM:

You can use the restore disks option to customize parameters that are not possible with the create a new VM option. The create VM option generates unique identifiers and uses them for some resource names to guarantee a successful restore. If you want to customize or add new parameters as part of the restore process, you can restore the disks and use the template generated as part of that operation to customize the restored VM to your requirements. This also lets you create a VM with your choice of configuration from restored disks more seamlessly, or restore a VM to different network settings to periodically test restores in your environment.

Once you trigger a restore job using the restore disks option, Azure Backup copies data from its vault to the selected storage account. Once this job completes, open the corresponding restore job to find the generated template, stored under the parameter Template Blob Uri. Using the path given there, go to the specified storage account and container to download the template. Once downloaded, you can use it in an Azure template deployment to trigger a new VM creation. By default, the template has a few parameters such as Vnet Name, Public IP name, OS Disk name, Data Disk name prefix, NIC name prefix, and an Availability set option (only available if your original VM was part of an availability set). If you want to specify different configuration parameters, edit the template and submit it for deployment.
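As an illustration of the "edit the template" step: the downloaded file is an ordinary ARM template in JSON, so you can modify its parameter defaults before redeploying. A hedged Python sketch follows – the template structure and parameter names here are simplified stand-ins, so check your actual downloaded template for the exact keys:

```python
import json

# A trimmed stand-in for the template downloaded from the storage container.
template = {
    "parameters": {
        "VirtualNetwork": {"type": "string", "defaultValue": "vnet-original"},
        "PublicIpName": {"type": "string", "defaultValue": "ip-original"},
    }
}

# Point the restored VM at an isolated test network instead of the original one,
# e.g. for a periodic test restore that must not touch production.
template["parameters"]["VirtualNetwork"]["defaultValue"] = "vnet-test-restore"

with open("restore-template.json", "w") as f:
    json.dump(template, f, indent=2)
```

The edited file can then be submitted through Azure template deployment as described above.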

The template is provided for all non-encrypted standard and premium unmanaged-disk VMs; support for encrypted and Managed Disks VMs will be added in a coming release. Please let us know your feedback at azurevmrestore@service.microsoft.com.

Related links and additional content

Want more details? Check out Azure Backup documentation and Azure Template walkthrough
Browse through Azure Quickstart templates for sample templates
Learn more about Azure Backup
Need help? Reach out to Azure Backup forum for support
Tell us how we can improve Azure Backup by contributing new ideas and voting up existing ones.
Follow us on Twitter @AzureBackup for the latest news and updates

Quelle: Azure

Planet scale aggregates with Azure DocumentDB

We’re excited to announce that we have expanded the SQL grammar in DocumentDB to support aggregate functions with the last service update. Support for aggregates is the most requested feature on the UserVoice site, so we are thrilled to roll this out to everyone who's voted for it.

Azure DocumentDB is a fully managed NoSQL database service built for fast and predictable performance, high availability, elastic scaling, global distribution, and ease of development. DocumentDB provides rich and familiar SQL query capabilities with consistent low latencies on JSON data. These unique benefits make DocumentDB a great fit for web, mobile, gaming, IoT, and many other applications that need seamless scale and global replication.

DocumentDB is truly schema-free. By virtue of its commitment to the JSON data model directly within the database engine, it provides automatic indexing of JSON documents without requiring explicit schema or creation of secondary indexes. DocumentDB supports querying JSON documents using SQL. DocumentDB query is rooted in JavaScript's type system, expression evaluation, and function invocation. This, in turn, provides a natural programming model for relational projections, hierarchical navigation across JSON documents, self joins, spatial queries, and invocation of user defined functions (UDFs) written entirely in JavaScript, among other features. We have now expanded the SQL grammar to include aggregations in addition to these capabilities.

Aggregates for planet scale applications

Whether you’re building a mobile game that needs to calculate statistics based on completed games, designing an IoT platform that triggers actions based on the number of occurrences of a certain event, or building a simple website or paginated API, you need to perform aggregate queries against your operational database. With DocumentDB you can now perform aggregate queries against data of any scale with low latency and predictable performance.

Aggregate support has been rolled out to all DocumentDB production datacenters. You can start running aggregate queries against your existing DocumentDB accounts or provision new DocumentDB accounts via the SDKs, REST API, or the Azure Portal. You must however download the latest version of the SDKs in order to perform cross-partition aggregate queries or use LINQ aggregate operators in .NET.

Aggregates with SQL

DocumentDB supports the SQL aggregate functions COUNT, MIN, MAX, SUM, and AVG. These operators work just like in relational databases, and return the computed value over the documents that match the query. For example, the following query retrieves the number of readings from the device xbox-1001 from DocumentDB:

SELECT VALUE COUNT(1)
FROM telemetry T
WHERE T.deviceId = "xbox-1001"

(If you’re wondering about the VALUE keyword – all queries return JSON fragments. By using VALUE, you get the scalar value of the count, e.g. 100, instead of the JSON document {"$1": 100}.)
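The two result shapes are easy to see with a toy in-memory version of the same query. This Python sketch (sample data is made up for illustration) mimics what the service returns with and without VALUE:

```python
# A few sample telemetry documents, mimicking the collection queried above.
telemetry = [
    {"deviceId": "xbox-1001", "temperature": 40},
    {"deviceId": "xbox-1001", "temperature": 42},
    {"deviceId": "xbox-2002", "temperature": 39},
]

# Equivalent of: SELECT COUNT(1) FROM telemetry T WHERE T.deviceId = "xbox-1001"
count = sum(1 for doc in telemetry if doc["deviceId"] == "xbox-1001")

print({"$1": count})  # without VALUE: a JSON document wrapping the result
print(count)          # with VALUE: the bare scalar
```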

We extended aggregate support in a seamless way to work with the existing query grammar and capabilities. For example, the following query returns the average temperature reading among devices within a specific polygon boundary representing a site location (combines aggregation with geospatial proximity searches):

SELECT VALUE AVG(T.temperature ?? 0)
FROM telemetry T
WHERE ST_WITHIN(T.location, {"type": "Polygon", … })

As an elastically scalable NoSQL database, DocumentDB supports storing and querying data of any storage or throughput. Regardless of the size or number of partitions in your collection, you can submit a simple SQL query and DocumentDB handles the routing of the query among data partitions, runs it in parallel against the local indexes within each matched partition, and merges intermediate results to return the final aggregate values. You can perform low latency aggregate queries using DocumentDB.
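The merge step described above can be sketched in plain Python. Note that AVG cannot be merged from per-partition averages directly; a combinable partial aggregate such as a (sum, count) pair is needed. This is a toy illustration of the idea, not DocumentDB code:

```python
# Partial aggregates computed independently against each partition's local index.
partials = [
    {"sum": 120.0, "count": 3},  # partition 1
    {"sum": 80.0, "count": 2},   # partition 2
    {"sum": 0.0, "count": 0},    # partition 3 (no matching documents)
]

# Merge: sums and counts combine safely; the average is derived at the end.
total_sum = sum(p["sum"] for p in partials)
total_count = sum(p["count"] for p in partials)
average = total_sum / total_count if total_count else None

print(average)  # 40.0
```

Averaging the per-partition averages (40.0 and 40.0 here, ignoring the empty partition) would happen to work in this example, but it is wrong in general when partitions hold different numbers of matching documents, which is why (sum, count) pairs are merged instead.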

In the .NET SDK, this can be performed via the CreateDocumentQuery<T> method as shown below:

client.CreateDocumentQuery<int>(
"/dbs/devicedb/colls/telemetry",
"SELECT VALUE COUNT(1) FROM telemetry T WHERE T.deviceId = 'xbox-1001'",
new FeedOptions { MaxDegreeOfParallelism = -1 });

For a complete example, you can take a look at our query samples on GitHub. 

Aggregates with LINQ

With the .NET SDK 1.13.0, you can query for aggregates using LINQ in addition to SQL. The latest SDK supports the operators Count, Sum, Min, Max, Average and their asynchronous equivalents CountAsync, SumAsync, MinAsync, MaxAsync, AverageAsync. For example, the same query shown previously can be written as the following LINQ query:

client.CreateDocumentQuery<DeviceReading>("/dbs/devicedb/colls/telemetry",
new FeedOptions { MaxDegreeOfParallelism = -1 })
.Where(r => r.DeviceId == "xbox-1001")
.CountAsync();

Learn more about DocumentDB’s LINQ support, including how asynchronous pagination is performed during aggregate queries.

Aggregates using the Azure Portal

You can also start running aggregate queries using the Azure Portal right away.

Next Steps

In this blog post, we looked at support for aggregate functions and query in Azure DocumentDB. To get started running queries, create a new DocumentDB account from the Azure Portal.

Stay up-to-date on the latest DocumentDB news and features by following us on Twitter @DocumentDB or reach out to us on the developer forums on Stack Overflow.
Quelle: Azure

Notice for developers using Azure AD B2C tenants configured for Google sign-ins

On April 20th 2017, Google will start blocking OAuth requests from embedded browsers, called "web-views". If you are using Google as an identity provider in Azure Active Directory B2C, you might need to make changes to your applications to avoid downtime. For more information about Google's plans, see Google's blog post.

Applications not impacted

We do not expect any impact for:

Applications that only use local accounts or do not have Google as a social identity provider
Web applications / Web APIs
Desktop (Windows) applications

Applications impacted

Applications impacted are those that have configured Google as a social identity provider in Azure AD B2C and support Android or iOS using:

Xamarin and MSAL Preview

Given its preview status, MSAL should not be in use in production, but if you have used it, contact Azure Support and we'll help you out.

Any library that uses embedded web-views, such as AndroidAuthClient/OIDCAndroidLib (Android), NXOAuth2Client (iOS), and ADAL Experimental (iOS & Android), or code that targets the protocol directly using embedded web-views, such as WebView (Android) and UIWebView (iOS). Android and iOS B2C samples posted before today used some of these libraries.

Our updated Android and iOS samples have instructions and working code with AppAuth, an open source library that uses the system web-views.

Azure AD B2C support for System Web-Views

Traditionally, applications using embedded web-views send an OAuth request to an identity provider with a redirect URN such as urn:ietf:wg:oauth:2.0:oob. Once the user signed in with the identity provider and the identity provider attempted to redirect the user back to the URN, the application, having full control of the web-view, would intercept the response and grab the authorization code.

Conversely, applications using system web-views do not have control over the web-view and thus can't intercept the OAuth response; they need a way for the system web-view to return control back to the application. To support system web-views, Azure AD B2C has added support for custom redirect URIs for native clients (e.g. com.onmicrosoft.fabrikamb2c.exampleapp://oauthredirect), which developers can set up in their application configuration to ensure that the system web-view sends the response back to the application. Also, to ensure that only the application that generated the OAuth request can redeem the authorization code, Azure AD B2C added support for Proof Key for Code Exchange (PKCE).
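PKCE itself is a small, standardized mechanism (RFC 7636): the client generates a random code verifier, sends its SHA-256 hash (the code challenge) with the authorization request, and later proves possession by presenting the verifier when redeeming the authorization code. A minimal Python sketch of the S256 method – the client-side derivation only, independent of any particular library:

```python
import base64
import hashlib
import os

def b64url(data: bytes) -> str:
    # Base64url without padding, as RFC 7636 requires.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

# The client generates a high-entropy verifier and derives the challenge from it.
code_verifier = b64url(os.urandom(32))
code_challenge = b64url(hashlib.sha256(code_verifier.encode("ascii")).digest())

# The challenge travels with the authorization request; the verifier is sent
# later with the token request, so an app that merely observed the redirect
# (and hence the authorization code) still cannot redeem it.
print(len(code_verifier), code_challenge != code_verifier)
```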

If you run into any issues, please contact Azure Support, or if you have coding questions, don't hesitate to post on StackOverflow using the azure-ad-b2c tag.
Quelle: Azure