Deep Learning, Simulation and HPC Applications with Docker and Azure Batch

The Azure Big Compute team is happy to announce version 1.0.0 of the Batch Shipyard toolkit, which enables easy deployment of batch-style Dockerized workloads to Azure Batch compute pools. Azure Batch enables you to run parallel jobs in the cloud without having to manage the infrastructure. It’s ideal for parametric sweeps, Deep Learning training with NVIDIA GPUs, and simulations using MPI and InfiniBand.

Whether you need to run your containerized jobs on a single machine or on hundreds or even thousands of machines, Batch Shipyard blends the features of Azure Batch (handling the complexities of large-scale VM deployment and management, high-throughput and highly available job scheduling, and auto-scaling so you pay only for what you use) with the power of Docker containers for application packaging. Batch Shipyard lets you harness the deployment consistency and isolation of containers for your batch-style and HPC workloads, and run them at any scale without developing directly against the Azure Batch SDK.

The initial release of Batch Shipyard has the following major features:

Automated Docker Host Engine installation tuned for Azure Batch compute nodes
Automated deployment of required Docker images to compute nodes
Accelerated Docker image deployment at scale to compute pools consisting of a large number of VMs via private peer-to-peer distribution of Docker images among the compute nodes
Automated Docker Private Registry instance creation on compute nodes, with Docker images backed by Azure Storage if specified
Automatic shared data volume support for:

Azure File Docker Volume Driver installation and share setup for SMB/CIFS backed by Azure Storage if specified
GlusterFS distributed network file system installation and setup if specified

Seamless integration with Azure Batch job, task and file concepts along with full pass-through of the Azure Batch API to containers executed on compute nodes
Support for Azure Batch task dependencies allowing complex processing pipelines and graphs with Docker containers
Transparent support for GPU accelerated Docker applications on Azure N-Series VM instances (Preview)
Support for multi-instance tasks to accommodate Dockerized MPI and multi-node cluster applications on compute pools with automatic job cleanup
Transparent assistance for running Docker containers that utilize InfiniBand/RDMA for MPI on low-latency HPC Azure VM instances (i.e., STANDARD_A8 and STANDARD_A9)
Automatic setup of SSH tunneling to Docker Hosts on compute nodes if specified

We’ve also made available an initial set of recipes that enable scenarios such as Deep Learning, Computational Fluid Dynamics (CFD), Molecular Dynamics (MD) and Video Processing with Batch Shipyard. In fact, we are aiming to make Deep Learning on Azure Batch an easy, low-friction experience. Once you have the toolkit installed and have Azure Batch and Azure Storage credentials, you can get CNTK, Caffe or TensorFlow running in an Azure Batch compute pool in under 15 minutes. As an example, we have CNTK running on a GPU-enabled STANDARD_NC6 VM via Batch Shipyard, with nvidia-smi confirming GPU access.

We hope to continue to expand the repertoire of recipes available for Batch Shipyard in the future.

The Batch Shipyard toolkit can be found on GitHub. We welcome any feedback and contributions!
Source: Azure

500 Million Yahoo Accounts Have Been Hacked


Yahoo has confirmed in a press release that a hacker, possibly working with a foreign government, stole 500 million users' account information in 2014.

The company said that it is working with law enforcement to catch the hacker. The data breach may have included names, email addresses, telephone numbers, dates of birth, hashed passwords, and security questions and answers. Financial data, according to Yahoo, were not part of the information taken.

Recode reports that a hacker nicknamed “Peace” may be responsible. In early August, a hacker by the same name had listed data from 200 million Yahoo accounts for sale on the Dark Web. At the time, Yahoo said it was aware of the listing, but it did not issue a password reset.

Yahoo is asking users to change their passwords and to be wary of any unsolicited communication. The company has updated its security FAQ page to include response measures, sent a security email to affected users, and issued a slew of other recommendations to users, including changing security questions, reviewing accounts for suspicious activity, and not clicking any links or downloading any materials from unverified emails. The company's investigation into the hack is ongoing.

The hack may affect the $4.8 billion sale of Yahoo's core business to Verizon. Verizon said in a prepared statement, “Within the last two days, we were notified of Yahoo's security incident. We understand that Yahoo is conducting an active investigation of this matter, but we otherwise have limited information and understanding of the impact. We will evaluate as the investigation continues through the lens of overall Verizon interests, including consumers, customers, shareholders and related communities. Until then, we are not in position to further comment.”

Many online responses criticized the pace of Yahoo's response and joked about its relevance in 2016.

The hack's impact may also spread to other websites and accounts. Yahoo account holders should change their passwords for other websites as well, cybersecurity experts advise. Shuman Ghosemajumder, CTO of Shape Security, said, “The real issue now is that these passwords will be used to breach thousands of other websites unrelated to Yahoo, as cybercriminals use advanced automated tools to discover where users have used those same passwords on other sites.”

Source: BuzzFeed

Using BigQuery and Firebase Analytics to understand your mobile app

Posted by Sara Robinson, Developer Advocate

At Google I/O this May, Firebase announced a new suite of products to help developers build mobile apps. Firebase Analytics, a part of the new Firebase platform, is a tool that automatically captures data on how people are using your iOS and Android app, and lets you define your own custom app events. When the data’s captured, it’s available through a dashboard in the Firebase console. One of my favorite cloud integrations with the new Firebase platform is the ability to export raw data from Firebase Analytics to Google BigQuery for custom analysis. This custom analysis is particularly useful for aggregating data from the iOS and Android versions of your app, and accessing custom parameters passed in your Firebase Analytics events. Let’s take a look at what you can do with this powerful combination.

How does the BigQuery export work?

After linking your Firebase project to BigQuery, Firebase automatically exports a new table to an associated BigQuery dataset every day. If you have both iOS and Android versions of your app, Firebase exports the data for each platform into a separate dataset. Each table contains the user activity and demographic data automatically captured by Firebase Analytics, along with any custom events you’re capturing in your app. Thus, after exporting one week’s worth of data for a cross-platform app, your BigQuery project would contain two datasets, each with seven tables.
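To see this structure for yourself, you can list the daily tables in one of the export datasets. The sketch below is a minimal, illustrative example using the google-cloud-bigquery Python client; the project and dataset names are placeholders for your own linked Firebase project, and the client library version is an assumption rather than something taken from this post:

from google.cloud import bigquery

# Placeholders -- substitute your own project ID and the dataset Firebase created for your app.
PROJECT = "your-project-id"
DATASET = "your_ios_dataset"

def list_export_tables():
    client = bigquery.Client(project=PROJECT)
    # Firebase Analytics writes one table per day, named app_events_YYYYMMDD, so a week
    # of exports shows up as seven tables in each platform's dataset.
    for table in client.list_tables(DATASET):
        print(table.table_id)  # e.g. app_events_20160601 ... app_events_20160607

if __name__ == "__main__":
    list_export_tables()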

Diving into the data

The schema for every Firebase Analytics export table is the same, and we’ve created two datasets (one for iOS and one for Android) with sample user data for you to run the example queries below. The datasets are for a sample cross-platform iOS and Android gaming app. Each dataset contains seven tables — one week’s worth of analytics data.

The following query will return some basic user demographic and device data for one day of usage on the iOS version of our app:

SELECT
user_dim.app_info.app_instance_id,
user_dim.device_info.device_category,
user_dim.device_info.user_default_language,
user_dim.device_info.platform_version,
user_dim.device_info.device_model,
user_dim.geo_info.country,
user_dim.geo_info.city,
user_dim.app_info.app_version,
user_dim.app_info.app_store,
user_dim.app_info.app_platform
FROM
[firebase-analytics-sample-data:ios_dataset.app_events_20160601]

Since the schema for every BigQuery table exported from Firebase Analytics is the same, you can run any of the queries in this post on your own Firebase Analytics data by replacing the dataset and table names with the ones for your project.
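If you would rather run these queries programmatically than in the BigQuery web UI, the same client shown above can execute them. The sketch below is illustrative only: the table name is a placeholder for one of your own daily export tables, and because the queries in this post are written in the legacy SQL dialect, the job is configured with use_legacy_sql:

from google.cloud import bigquery

# Placeholder table -- replace with one of your own daily export tables.
TABLE = "[your-project-id:your_ios_dataset.app_events_20160601]"

QUERY = """
SELECT
  user_dim.app_info.app_instance_id AS app_instance_id,
  user_dim.device_info.device_category AS device_category,
  user_dim.geo_info.country AS country
FROM
  {table}
""".format(table=TABLE)

def run_query():
    client = bigquery.Client(project="your-project-id")
    # The bracketed table syntax above is legacy SQL, so tell BigQuery not to use standard SQL.
    job_config = bigquery.QueryJobConfig(use_legacy_sql=True)
    for row in client.query(QUERY, job_config=job_config).result():
        print(row["app_instance_id"], row["device_category"], row["country"])

if __name__ == "__main__":
    run_query()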

The schema has user data and event data. All user data is automatically captured by Firebase Analytics, and the event data is populated by any custom events you add to your app. Let’s take a look at the specific records for both user and event data.

User data

The user records contain a unique app instance ID for each user (user_dim.app_info.app_instance_id in the schema), along with data on their location, device and app version. In the Firebase console, there are separate dashboards for the app’s Android and iOS analytics. With BigQuery, we can run a query to find out where our users are accessing our app around the world across both platforms. The query below makes use of BigQuery’s union feature, which lets you use a comma as a UNION ALL operator. Since a row is created in our table for each bundle of events a user triggers, we use EXACT_COUNT_DISTINCT to make sure each user is only counted once:
SELECT
user_dim.geo_info.country as country,
EXACT_COUNT_DISTINCT( user_dim.app_info.app_instance_id ) as users
FROM
[firebase-analytics-sample-data:android_dataset.app_events_20160601],
[firebase-analytics-sample-data:ios_dataset.app_events_20160601]
GROUP BY
country
ORDER BY
users DESC

User data also includes a user_properties record, which includes attributes you define to describe different segments of your user base, like language preference or geographic location. Firebase Analytics captures some user properties by default, and you can create up to 25 of your own.

A user’s language preference is one of the default user properties. To see which languages our users speak across platforms, we can run the following query:

SELECT
user_dim.user_properties.value.value.string_value as language_code,
EXACT_COUNT_DISTINCT(user_dim.app_info.app_instance_id) as users
FROM
[firebase-analytics-sample-data:android_dataset.app_events_20160601],
[firebase-analytics-sample-data:ios_dataset.app_events_20160601]
WHERE
user_dim.user_properties.key = "language"
GROUP BY
language_code
ORDER BY
users DESC

Event data

Firebase Analytics makes it easy to log custom events such as tracking item purchases or button clicks in your app. When you log an event, you pass an event name and up to 25 parameters to Firebase Analytics and it automatically tracks the number of times the event has occurred. The following query shows the number of times each event in our app has occurred on Android for a particular day:

SELECT
event_dim.name,
COUNT(event_dim.name) as event_count
FROM
[firebase-analytics-sample-data:android_dataset.app_events_20160601]
GROUP BY
event_dim.name
ORDER BY
event_count DESC

If you have another type of value associated with an event (like item prices), you can pass it through as an optional value parameter and filter by this value in BigQuery. In our sample tables, there is a spend_virtual_currency event. We can write the following query to see how much virtual currency players spend at one time:

SELECT
event_dim.params.value.int_value as virtual_currency_amt,
COUNT(*) as num_times_spent
FROM
[firebase-analytics-sample-data:android_dataset.app_events_20160601]
WHERE
event_dim.name = "spend_virtual_currency"
AND
event_dim.params.key = "value"
GROUP BY
1
ORDER BY
num_times_spent DESC

Building complex queries

What if we want to run a query across both platforms of our app over a specific date range? Since Firebase Analytics data is split into tables for each day, we can do this using BigQuery’s TABLE_DATE_RANGE function. This query returns a count of the cities users are coming from over a one week period:

SELECT
user_dim.geo_info.city,
COUNT(user_dim.geo_info.city) as city_count
FROM
TABLE_DATE_RANGE([firebase-analytics-sample-data:android_dataset.app_events_], DATE_ADD('2016-06-07', -7, 'DAY'), CURRENT_TIMESTAMP()),
TABLE_DATE_RANGE([firebase-analytics-sample-data:ios_dataset.app_events_], DATE_ADD('2016-06-07', -7, 'DAY'), CURRENT_TIMESTAMP())
GROUP BY
user_dim.geo_info.city
ORDER BY
city_count DESC

We can also write a query to compare mobile vs. tablet usage across platforms over a one week period:

SELECT
user_dim.app_info.app_platform as appPlatform,
user_dim.device_info.device_category as deviceType,
COUNT(user_dim.device_info.device_category) AS device_type_count
FROM
TABLE_DATE_RANGE([firebase-analytics-sample-data:android_dataset.app_events_], DATE_ADD('2016-06-07', -7, 'DAY'), CURRENT_TIMESTAMP()),
TABLE_DATE_RANGE([firebase-analytics-sample-data:ios_dataset.app_events_], DATE_ADD('2016-06-07', -7, 'DAY'), CURRENT_TIMESTAMP())
GROUP BY
1,2
ORDER BY
device_type_count DESC

Getting a bit more complex, we can write a query to generate a report of unique user events across platforms over the past two weeks. Here we use PARTITION BY and EXACT_COUNT_DISTINCT to de-dupe our event report by users, making use of user properties and the user_dim.user_id field:

SELECT
STRFTIME_UTC_USEC(eventTime, "%Y%m%d") as date,
appPlatform,
eventName,
COUNT(*) totalEvents,
EXACT_COUNT_DISTINCT(IF(userId IS NOT NULL, userId, fullVisitorid)) as users
FROM (
SELECT
fullVisitorid,
openTimestamp,
FORMAT_UTC_USEC(openTimestamp) firstOpenedTime,
userIdSet,
MAX(userIdSet) OVER(PARTITION BY fullVisitorid) userId,
appPlatform,
eventTimestamp,
FORMAT_UTC_USEC(eventTimestamp) as eventTime,
eventName
FROM FLATTEN(
(
SELECT
user_dim.app_info.app_instance_id as fullVisitorid,
user_dim.first_open_timestamp_micros as openTimestamp,
user_dim.user_properties.value.value.string_value,
IF(user_dim.user_properties.key = 'user_id', user_dim.user_properties.value.value.string_value, null) as userIdSet,
user_dim.app_info.app_platform as appPlatform,
event_dim.timestamp_micros as eventTimestamp,
event_dim.name AS eventName,
event_dim.params.key,
event_dim.params.value.string_value
FROM
TABLE_DATE_RANGE([firebase-analytics-sample-data:android_dataset.app_events_], DATE_ADD('2016-06-07', -7, 'DAY'), CURRENT_TIMESTAMP()),
TABLE_DATE_RANGE([firebase-analytics-sample-data:ios_dataset.app_events_], DATE_ADD('2016-06-07', -7, 'DAY'), CURRENT_TIMESTAMP())
), user_dim.user_properties)
)
GROUP BY
date, appPlatform, eventName

If you have data in Google Analytics for the same app, it’s also possible to export your Google Analytics data to BigQuery and do a JOIN with your Firebase Analytics BigQuery tables.
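As a rough sketch of what such a JOIN could look like, the snippet below builds a legacy-SQL query joining one day of a hypothetical Google Analytics export table with the Firebase Analytics sample table on a shared user ID. The ga_sessions_ table name, the userId field, and the assumption that you record the same user ID in both products are placeholders based on a typical Google Analytics BigQuery export, not part of the sample datasets used in this post, so treat this purely as a starting point. The resulting string can be run with the same legacy-SQL client configuration shown earlier:

# Hypothetical join between a Google Analytics export table and a Firebase Analytics
# export table; assumes the same user ID is recorded in both products.
GA_TABLE = "[your-project-id:your_ga_dataset.ga_sessions_20160601]"  # placeholder
FB_TABLE = "[firebase-analytics-sample-data:android_dataset.app_events_20160601]"

JOIN_QUERY = """
SELECT
  ga.acquisition_source AS acquisition_source,
  COUNT(fb.event_name) AS event_count
FROM (
  SELECT
    user_dim.user_id AS user_id,
    event_dim.name AS event_name
  FROM {fb}
) fb
JOIN (
  SELECT
    userId AS user_id,
    trafficSource.source AS acquisition_source
  FROM {ga}
) ga
ON fb.user_id = ga.user_id
GROUP BY acquisition_source
ORDER BY event_count DESC
""".format(fb=FB_TABLE, ga=GA_TABLE)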

Visualizing analytics data

Now that we’ve gathered new insights from our mobile app data using the raw BigQuery export, let’s visualize it using Google Data Studio. Data Studio can read directly from BigQuery tables, and we can even pass it a custom query like the ones above. Data Studio can generate many different types of charts depending on the structure of your data, including time series, bar charts, pie charts and geo maps.

For our first visualization, let’s create a bar chart to compare the device types from which users are accessing our app on each platform. We can paste the mobile vs. tablet query above directly into Data Studio to generate it.

From this chart, it’s easy to see that iOS users are much more likely to access our game from a tablet. Getting a bit more complex, we can use the event report query above to create a bar chart comparing the number of events across platforms.

Check out this post for detailed instructions on connecting your BigQuery project to Data Studio.

What’s next?
If you’re new to Firebase, get started here. If you’re already building a mobile app on Firebase, check out this detailed guide on linking your Firebase project to BigQuery. For questions, take a look at the BigQuery reference docs and use the firebase-analytics and google-bigquery tags on Stack Overflow. And let me know if there are any particular topics you’d like me to cover in an upcoming post.

Source: Google Cloud Platform

Microsoft Ignite: Azure Stack technical sessions

Last week my colleague, Wale Martins, posted a great summary of all the Microsoft Ignite sessions focused on Microsoft Azure Stack. For all of you attending Ignite, this is a good guide to learn more about some of the sessions we plan to deliver and when they will occur.

If you are like me, you crave as many details as you can get about each session, especially the technical sessions. This blog post provides more details about what you can expect from some technical sessions on Thursday and Friday at Ignite.

BRK3115: Becoming a Microsoft Azure Stack infrastructure rockstar

Are you ready to learn how to become a cloud administrator and Microsoft Azure Stack infrastructure-managing rockstar? If so, come start your journey to rockstar status in this session.

My colleague, Thomas Roettinger, program manager on the Azure Stack team, and I will be hosting a session about how we view infrastructure management in Azure Stack and what capabilities are included in Azure Stack Technical Preview 2. We will dive deep into several areas, including:

Integrating Azure Stack with your datacenter: What points of integration are available, why you should integrate, and how
Hardware management: How will cloud administrators manage the hardware supporting Azure Stack?
Monitoring: How are concepts like health and alerting enabled?

This session is just the start of our journey together. Feel free to follow us @chasat and @troettinger for more updates on these topics and come visit us while we are in Atlanta!

BRK3327: Dive deep into Microsoft Azure Stack IaaS

Azure delivers Infrastructure as a Service (IaaS) at hyper-scale, with a massive global infrastructure behind it. So how will Azure Stack deliver an IaaS offering that looks, tastes and feels just like Azure?

Scott Napolitan and Mallikarjun Chadalapaka, program managers on the Microsoft Azure Stack team, will show you how Azure Stack delivers an IaaS experience that is consistent with Azure yet uses infrastructure at a fraction of the scale, so it fits into your datacenter.

In this session, they will talk about how we took robust, scalable technologies directly from Azure and combined them with new features in Windows Server 2016 built for cloud. You will walk out with a better understanding of how the infrastructure works, and what IaaS scenarios are enabled by it. The session will dive deep into:

Compute, storage and networking resource providers and how they interact with the underlying infrastructure
The infrastructure and technologies that enable Azure Stack to surface simple resource primitives that can be consumed by the same APIs used with Azure
What to expect in terms of IaaS scenarios and features enabled in TP2
How cloud administrators will surface resources to their tenants

They also plan on doing some demos to help drive the lessons home and show how seamless the experience can be.

BRK3112: Learn about the community of templates for Azure Stack

Azure Stack provides consistency with Azure, which allows you to reuse artifacts across clouds. But how do you create those artifacts? And why are they so important?

Marc van Eijk and Ricardo Mendes will help you understand ARM templates across Azure and Azure Stack. They will cover the basics on how to get started with Azure Resource Manager templates including:

What tools you can use
How to create and deploy ARM templates
How to troubleshoot deployments

In this session, they will look at the existing community templates, how you can reuse them for your own purposes, and how you can contribute to the community templates, in addition to:

An introduction to GitHub
Repositories, forks, clones, branches, commits and pull requests
End-to-end example of how to make an update to the Azure Quickstart Templates

They will complete the session with a more advanced, production-ready deployment scenario. Join us and learn how to get started with ARM templates for Azure Stack and Azure.

BRK3141: Discuss Microsoft DevOps on Azure Stack

Do you want to learn how to give your organization’s developers the flexibility of the cloud with the security of your own datacenter? If so, attend this session and learn about how Azure Stack integrates into a modern DevOps workflow.

My colleagues Anjay Ajodha, Matthew McGlynn, and Shri Natarajan will showcase how Azure Stack allows you to adapt the skills you use to deploy and maintain complex applications in the cloud to your on-premises infrastructure. They'll go over some common examples of continuous integration and deployment, using both Microsoft-based and Linux-based stacks, and demonstrate the value of having your infrastructure defined and versioned through code. They’ll cover a breadth of concepts including:

What is DevOps and how does it bring value to your organization?
How can your developers define an application’s infrastructure through Resource Manager templates and deploy it to Azure Stack?
How can rapid changes be made to an application’s infrastructure?
How can you bring DevOps to your own organization?

They also have some exciting demos planned!

BRK3148: Learn about hybrid applications with Azure and Azure Stack

Do you build and operate applications that use public cloud resources and resources in your datacenter? Are you doing cutting-edge hybrid app development? If yes, this session is for you!

Please join my colleague, Ricardo Mendes, program manager on the Azure Stack team, to learn how you can use Azure Stack to build solutions composed of resources in the public cloud and on-premises, in a consistent way that leverages your knowledge of Azure.

This session will cover a broad set of concepts, including:

The different types of hybrid cloud apps
Why hybrid solutions?
Challenges in building these types of apps
Tooling and resources to get you started faster
Tips and tricks

Next Stop Atlanta

All our presenters are excited to share their knowledge of Azure Stack and answer your questions both after their sessions as well as in our booth on the Expo Hall floor. We hope you are looking forward to learning and networking next week. See you in Atlanta!
Source: Azure

Microsoft Azure Storage samples – cross platform quick starts and more

Getting started with new technology can sometimes be complex and time consuming. Often it requires searching for the right getting-started and operational guidance that includes samples, and posting questions on forums.

We at Azure Storage continue to strive to improve our end-user experiences to make it easy for you to discover and try out a sample in just 5 minutes. As part of this, we want to make our samples more easily discoverable, fully functional and community-friendly.

1. Discoverable: We now have a landing page with all the Azure Storage samples listed, with per-language GitHub repos. You can download the zip project file or fork the sample repo that you are interested in. Most of our Storage content pages either have already been updated, or will be updated, with links to the relevant sample pages so you can easily pick up a sample, compile it, and experiment with it. Beyond specifying either the storage emulator or an Azure Storage account with your credentials, the code should just work (a minimal sketch of this pattern follows this list).

2. Relevant: On the samples page, we initially focused on functional code samples for the most common Azure Storage usage scenarios for Blobs, Tables, Queues, and Files, written in .NET, Java, Node.js, Python, C++, Ruby and PHP. These are already available for you to use right away! Similarly, we have invested in functional samples for scripting/tooling options (PowerShell, Azure CLI).

Following this, we plan to invest in creating a few scenario samples, such as data movement solutions, image upload from a mobile device, designing for high availability, and client-side encryption across OS platforms and languages, that light up the rich service and client library capabilities of Storage and at the same time showcase patterns and best practices. Also, as we build new features, we will do our best to keep these samples up to date so you can see your favorite new features in action.

3. Open Source: Finally, the code is open source and readily usable from GitHub, which makes community contributions to the samples repository possible. Simply propose your change and we will review the design and code, then merge it in. You can help build new samples or keep these samples up to date as the client libraries and the service evolve.
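As a rough illustration of how little setup a sample needs, here is a minimal sketch in Python. It uses the current azure-storage-blob (v12) client library rather than the SDK versions that shipped alongside the original samples, and the container and blob names are arbitrary placeholders; the only configuration you supply is a connection string, which can point at a real storage account or at the local storage emulator:

import os

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

def main():
    # The connection string is the only credential needed; set it to your storage
    # account's connection string or to the local emulator's well-known string.
    service = BlobServiceClient.from_connection_string(
        os.environ["AZURE_STORAGE_CONNECTION_STRING"])

    # Create a container (tolerating the case where it already exists) and upload a blob.
    container = service.get_container_client("samples-demo")
    try:
        container.create_container()
    except ResourceExistsError:
        pass
    container.upload_blob("hello.txt", b"Hello, Azure Storage!", overwrite=True)

    # Read the blob back to confirm the round trip.
    print(container.download_blob("hello.txt").readall().decode("utf-8"))

if __name__ == "__main__":
    main()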

As always, we are interested in your feedback, so please let us know what you think by commenting on this post. As you start leveraging individual samples, please provide actionable feedback in the GitHub repo and/or in the comments on the Azure Storage documentation web page.

Go ahead, navigate to the Storage samples page, get started with the samples and explore how easy it is to build cloud applications on Storage!
Source: Azure

Umbraco uses Azure SQL Database Elastic Pools for thousands of CMS tenants in the cloud

Umbraco is a popular open-source content-management system (CMS) that can run anything from small campaign or brochure sites to complex applications for Fortune 500 companies and global media websites.

Azure SQL Database powers Umbraco-as-a-Service (UaaS), a software-as-a-service (SaaS) solution that eliminates the need for on-premises deployments, provides built-in scaling, and removes management overhead by enabling developers to focus on product innovation rather than solution management. Umbraco is able to provide all of those benefits by relying on the flexible platform-as-a-service (PaaS) model of Microsoft Azure SQL Database Elastic Database Pools.

To learn more about Umbraco's journey and how you can take advantage of Elastic Database Pools, take a look at this newly published case study.
Source: Azure