IoT × AI: Predicting home appliance activity in real time with machine learning

As home automation grows in popularity around the world and the cost of the electricity it consumes rises, saving energy has become a major concern for many consumers. With the arrival of household smart meters, it is now possible to measure and record a household's total power consumption. Moreover, by analyzing smart meter data with machine learning models, the behavior of individual appliances can be predicted accurately. This would allow, for example, a utility company to send a message to a subscriber when a refrigerator door appears to have been left open, or when a sprinkler suddenly starts running at an unreasonable hour. In this post, we show how to accurately determine the operating status of home appliances (in our dataset, devices such as an electric kettle and a washing machine) from smart meter measurements, and we introduce machine learning techniques such as LSTM (long short-term memory) models. Once an algorithm can determine appliance activity, applications can be built on top of it. For example:

Anomaly detection: Normally, the TV is off when no one is home. If the TV turns on at an unexpected or unusual time, the application sends the user a message.

Suggestions for better habits: An application that shows aggregated appliance usage patterns from neighboring households, which users can compare against their own usage, could help optimize how appliances are used.

We built an end-to-end demo system entirely on Google Cloud Platform (GCP), using Cloud IoT Core for data collection, TensorFlow for building the machine learning model, Cloud Machine Learning Engine (Cloud ML Engine) for training, and Cloud Pub/Sub, App Engine, and Cloud ML Engine for real-time serving and prediction. The complete source files are available in this GitHub repository so that you can follow along as you read.

Demo system overview

The growing popularity of IoT devices and advances in machine learning technology are creating new business opportunities. In this post, we show how to process aggregate power measurements collected by a smart meter with modern machine learning techniques to infer the operating status (on or off) of home appliances such as an electric kettle or a washing machine. The end-to-end demo system, built entirely on GCP, includes:

Data collection and ingestion with Cloud IoT Core and Cloud Pub/Sub
A machine learning model trained on Cloud ML Engine
The same machine learning model served with an App Engine frontend and Cloud ML Engine
Data visualization and exploration with BigQuery and Colab

Figure 1. Architecture of the demo system

The animation below shows real-time monitoring of actual power consumption data ingested into Colab through Cloud IoT Core.

Figure 2. Real-time monitoring in action

Machine learning possibilities expanded by IoT

Data ingestion

Training a machine learning model requires a sufficient amount of suitable data. In IoT, a number of challenges must be overcome to send data collected by smart IoT devices to a distant central server safely and reliably. In particular, data security, transmission reliability, and timeliness appropriate to the use case all need to be considered.

Cloud IoT Core is a fully managed service for easily and securely connecting to, managing, and ingesting data from millions of globally dispersed devices. Its two main components are the device manager and the protocol bridge. The device manager identifies and authenticates devices and maintains their identities, allowing individual devices to be configured and managed at a coarse level. It also stores the logical configuration of each device and can operate devices remotely; for example, the sampling rate of a large fleet of smart meters can be changed all at once. The protocol bridge provides an endpoint with automatic load balancing across all connected devices and natively supports secure connections over industry-standard protocols such as MQTT and HTTP. Device telemetry published to Cloud Pub/Sub can then be passed to downstream analytics systems. Our demo system uses the MQTT bridge; all of the MQTT-specific client logic is embedded in the client notebook described below.

Data flow

When data is published to Cloud Pub/Sub, Cloud Pub/Sub sends messages to a "push endpoint" (generally, a gateway service that accepts the data). In our demo system, Cloud Pub/Sub pushes data to a gateway service hosted on App Engine, which forwards it to the machine learning model hosted on Cloud ML Engine to run inference. The raw data and the returned predictions are also stored in BigQuery for later (batch) analysis. Our sample code can be adapted to a variety of business-specific use cases; the demo system visualizes the raw data and the predictions. The code repository includes two notebooks:

EnergyDisaggregationDemo_Client.ipynb: This notebook simulates multiple smart meters by reading power consumption data from a real dataset and sends the readings to the server. All of the Cloud IoT Core related code is contained in this notebook.

EnergyDisaggregationDemo_View.ipynb: This notebook lets you view the raw power consumption data from a specified smart meter and the model's predictions in near real time.

If you follow the deployment instructions in the README file and the accompanying notebooks, you should be able to reproduce the display shown in Figure 2. If you prefer to build the data-splitting pipeline differently, you can build an application with equivalent functionality using Cloud Dataflow and Pub/Sub I/O.

Data processing and machine learning

Dataset overview and exploration

To make the end-to-end demo system reproducible, we trained a model that predicts the on/off state of individual appliances from aggregate power measurements using the UK-DALE (UK Domestic Appliance-Level Electricity) dataset, which can be downloaded here [1]. UK-DALE records both whole-house power consumption and the power consumption of individual appliances for five households every six seconds. The demo system uses the data from House 2, which contains power measurements for 18 appliances in total. Given the granularity of the dataset (a sampling rate of 0.166 Hz), appliances with relatively low power consumption are hard to evaluate, so devices such as laptops and computer displays are not included in this demo. Based on the data exploration described below, we selected 8 of the 18 appliances: the treadmill, washing machine, dishwasher, microwave, toaster, kettle, rice cooker, and stove.

Figure 3 below shows histograms of the power consumption of the eight selected appliances. Because every appliance is switched off most of the time, the majority of readings are close to zero. Figure 4 compares the sum of the selected appliances' power consumption (app_sum) with the whole-house power consumption (gross). Note that the input to the demo system is the whole-house consumption (the blue curve); this is the power consumption data that is easiest to obtain and that can even be measured from outside the home.

Figure 3. The appliances studied and histograms of their power demand

Figure 4. A data sample from House 2 (July 4, 2013, UTC)

The House 2 data shown in Figure 4 spans late February through early October 2013, but because there are missing values near the beginning and end of that period, the demo system uses the data from June through the end of September. Table 1 summarizes the statistics of the selected appliances. As expected, the data is extremely imbalanced with respect to both the on/off state of each appliance and the scale of each appliance's power consumption, which is the main factor that makes the prediction task difficult.

Table 1. Summary statistics of power consumption

Data preprocessing

Because UK-DALE does not record the on/off state of individual appliances, a particularly important preprocessing step was labeling the on/off state of each appliance at every timestamp. Since the appliances are off most of the time and most readings are close to zero, we treat an appliance as on when its power consumption is more than one standard deviation above the sample mean of its readings. The preprocessing code is included in the notebooks, and the processed data can also be downloaded here.

The preprocessed data is stored in CSV format, so TensorFlow's Dataset class serves as a convenient tool for loading and transforming the data in the input pipeline for model training: a few lines of code are enough to load the data from the specified CSV files and convert it into time-series sequences.

To deal with class imbalance, we can either downsample the majority class or upsample the minority class. In our demo, we propose probabilistic negative downsampling: based on a probability and a threshold, subsequences in which at least one appliance is on are kept, while subsequences in which every appliance is off are filtered out. This filtering logic integrates easily with the tf.data API.

Finally, follow the best practices in the Data Input Pipeline Performance guide so that GPU/TPU resources are not wasted idly waiting for data to arrive from the input pipeline (when GPUs/TPUs are used to speed up training). To keep the GPU/TPU fully utilized, parallelize the data transformation with parallel mapping and use prefetching so that the preprocessing and model training steps run concurrently, as in the sketch below.
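The actual preprocessing and input-pipeline code lives in the notebooks in the GitHub repository; the following is only a minimal sketch of the kind of tf.data pipeline described above, written for recent TensorFlow. The file layout, window length, keep probability, and batch size are illustrative assumptions, not the values used in the demo.

```python
import tensorflow as tf

SEQ_LEN = 60          # length n of each input sequence (assumed)
NUM_APPLIANCES = 8    # m appliances whose on/off state we predict
KEEP_PROB = 0.1       # probability of keeping an "all appliances off" window (assumed)

def load_dataset(csv_files, batch_size=64):
    # Each CSV row is assumed to hold the aggregate reading followed by
    # one 0/1 on-off label per appliance.
    defaults = [tf.float32] * (1 + NUM_APPLIANCES)
    ds = tf.data.experimental.CsvDataset(csv_files, record_defaults=defaults, header=True)
    ds = ds.map(lambda *cols: (cols[0], tf.stack(cols[1:])),
                num_parallel_calls=tf.data.AUTOTUNE)

    # Slide a window of SEQ_LEN consecutive readings over the time series.
    ds = ds.window(SEQ_LEN, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda power, labels: tf.data.Dataset.zip(
        (power.batch(SEQ_LEN), labels.batch(SEQ_LEN))))

    # Probabilistic negative downsampling: always keep windows where at least
    # one appliance is on; keep "all off" windows only with probability KEEP_PROB.
    def keep(power_seq, label_seq):
        any_on = tf.reduce_max(label_seq) > 0.5
        lucky = tf.random.uniform([]) < KEEP_PROB
        return tf.logical_or(any_on, lucky)
    ds = ds.filter(keep)

    # Use the labels at the last time step as the prediction target for the window.
    ds = ds.map(lambda power_seq, label_seq: (power_seq, label_seq[-1]),
                num_parallel_calls=tf.data.AUTOTUNE)

    # Batch and prefetch so preprocessing overlaps with training on the GPU/TPU.
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Windows in which every appliance is off are dropped with probability 1 - KEEP_PROB, which counteracts the heavy class imbalance noted above, while parallel mapping and prefetching let the next batches be prepared while the current one is being trained on.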
Machine learning model

We use an LSTM-based network as the classification model. For the basics of RNNs (recurrent neural networks) and LSTMs, see "Understanding LSTM Networks". Figure 5 illustrates our model design: an input sequence of length n is fed into a multi-layer LSTM network, and predictions are produced for all m devices. We add a dropout layer on the input to the LSTM cells and feed the output of the whole sequence into a fully connected layer. We implemented this model as a TensorFlow Estimator.

Figure 5. Architecture of the LSTM-based model

There are two ways to implement the architecture above: the native TensorFlow APIs (tf.layers and tf.nn) and the Keras API (tf.keras). Keras is a higher-level API than TensorFlow's native APIs and enables training and serving of deep learning models with three key strengths: ease of use, modularity, and extensibility; tf.keras is TensorFlow's implementation of the Keras API specification. The repository contains the LSTM-based classification model implemented both ways, so you can compare the native TensorFlow implementation with the Keras implementation.

Training and hyperparameter tuning

Cloud ML Engine supports both training and hyperparameter tuning. Figure 6 shows the average precision, recall, and F-measure across all appliances for multiple trials with different combinations of hyperparameters. Hyperparameter tuning improves model performance considerably.

Figure 6. Hyperparameter tuning and learning curves

Table 2 summarizes the performance of the two best-scoring experiments from hyperparameter tuning.

Table 2. Hyperparameters of the two best-scoring experiments

Table 3 shows the prediction precision and recall for each appliance. As noted in the "Dataset overview and exploration" section, the stove and the treadmill are difficult to predict because their peak power consumption is considerably lower than that of the other devices.

Table 3. Prediction precision and recall for each appliance

Summary

We have shown how to accurately determine appliance activity from smart meter measurements alone, using an end-to-end demo system built around machine learning. The system combines Cloud IoT Core, Cloud Pub/Sub, Cloud ML Engine, App Engine, BigQuery, and more; each of these GCP products solves a specific problem needed to implement the demo, such as data collection and ingestion, machine learning model training, and real-time prediction. If you are interested in the system, get the code and the data and try it for yourself. We are optimistic that ever more interesting applications will continue to be developed at the intersection of increasingly capable IoT devices and rapidly advancing machine learning. By providing both IoT infrastructure and machine learning training, Google Cloud will keep exploring and realizing the possibilities of new, more capable smart IoT.

1. Jack Kelly and William Knottenbelt. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Scientific Data 2, Article number: 150007, 2015, DOI:10.1038/sdata.2015.7.

- By Yujin Tang, ML Strategic Cloud Engineer; Kunpei Sakai, Cloud DevOps and Infrastructure Engineer; Shixin Luo, Machine Learning Engineer; and Yiliang Zhao, Machine Learning Engineer
Source: Google Cloud Platform

Transform publicly available BigQuery data and Stackdriver logs into graph databases with Neo4j

Neo4j Enterprise, now available on Google Cloud Platform

The Google Cloud Partner Engineering team is excited to announce the availability of the Neo4j Enterprise VM solution and Test Drive on Google Cloud Launcher. Neo4j is very helpful whether your use case is better understanding NCAA mascots or analyzing your GCP security posture with Stackdriver logs. Both of these use cases call for a high-performance graph database. Graph databases emphasize the relationships between data points and store those connections as first-class citizens. Accessing nodes and relationships in a graph database is an efficient, constant-time operation that lets you traverse millions of connections quickly.

In today's blog post, we will give a light introduction to working with Neo4j's query language, Cypher, and demonstrate how to get started with Neo4j on Google Cloud. You will learn how to quickly turn your Google BigQuery data or your Google Cloud logs into a graph data model, which you can use to reveal insights by connecting data points. Let's take Neo4j for a test drive.

Neo4j with NCAA BigQuery public datasets

The Neo4j Test Drive will orient you on the basics of Neo4j and show you how to access BigQuery data using Cypher. There are also tutorials and getting started guides for learning more about the Neo4j graph database. Once you have either created or signed into your Orbitera account, you can deploy the Neo4j Enterprise test drive.

Exporting the NCAA BigQuery data

While we wait for our Neo4j graph to deploy, we can log into BigQuery and start to prepare a dataset for consumption by Neo4j. Click here for the BigQuery Public Dataset page. (More background information can be found here.) From this screen, click on the blue arrow of the mascots table, then click "export table". This will let you quickly and efficiently export the data associated with NCAA mascots into Google Cloud Storage as a CSV file. Populate the "Google Cloud Storage URI" field with a Cloud Storage bucket you created, or to which you have write access. Once you have exported the mascots data as CSV, switch back to the Google Cloud Console.

Find the Cloud Storage browser under Storage > Browser and locate the file you exported from BigQuery; ours is called mascots.csv. Since this is already a public dataset and does not contain sensitive data, the easiest way to give Neo4j access to this file is simply to share it publicly. Click the checkbox under "Share publicly".

Connecting BigQuery data to the Neo4j test drive

Now that our mascots data is accessible publicly, let's return to our Neo4j test drive and, on the trial status page, find the URL, username, and password. Once you are in the test drive browser, check to make sure you can import the CSV mascots data by running a quick check query in the box at the top and pressing the play button on the right-hand side (a sketch follows below). This query should return ten mascot results as text.

Turning the mascots data into a graph

As an example of how to turn our mascots data into a very simple graph, run a graph-building query in the Cypher block (also sketched below). It loads the data from your public Cloud Storage bucket and sets up some simple relationships in a graph data structure. For each unique conceptual identity, we create a node. Each node is given a label of either mascot, market, or the mascot's taxonomic rank. A very basic relationship is also maintained between all of these elements: either a "located in" relationship to associate the mascot with a market, or an "is" relationship to indicate that the mascot is a certain biological classification. By using MERGE in Cypher, only one node is created for each unique value of things such as kingdoms or phylums. In this way, we ensure that all the mascots of the same kingdom are linked to the same node. For a deeper discussion on Neo4j data modeling, see the developer guide.
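The Cypher statements themselves are not reproduced in this extract, so here is a rough sketch of what the check query and the graph-building query could look like. The original post runs the Cypher directly in the Neo4j Browser; purely to keep the examples in one language, this sketch wraps equivalent Cypher in the official Neo4j Python driver. The CSV column names (mascot, market, kingdom), the relationship names, and the connection details are assumptions, not the exact values from the original post.

```python
from neo4j import GraphDatabase

# Connection details come from the test drive status page (placeholders here).
driver = GraphDatabase.driver("bolt://<test-drive-host>:7687",
                              auth=("neo4j", "<password>"))

CSV_URL = "https://storage.googleapis.com/<your-bucket>/mascots.csv"  # public CSV export

# Quick check that the CSV is reachable: return the first ten rows.
CHECK_QUERY = f"""
LOAD CSV WITH HEADERS FROM '{CSV_URL}' AS row
RETURN row.mascot, row.market
LIMIT 10
"""

# Build a simple graph: one node per mascot, market, and taxonomic-rank value,
# connected by LOCATED_IN and IS relationships. MERGE guarantees a single node
# per unique value, so all mascots of the same kingdom share one kingdom node.
LOAD_QUERY = f"""
LOAD CSV WITH HEADERS FROM '{CSV_URL}' AS row
MERGE (m:Mascot {{name: row.mascot}})
MERGE (mk:Market {{name: row.market}})
MERGE (k:Kingdom {{name: row.kingdom}})
MERGE (m)-[:LOCATED_IN]->(mk)
MERGE (m)-[:IS]->(k)
RETURN count(row)
"""

with driver.session() as session:
    for record in session.run(CHECK_QUERY):
        print(record)
    total = session.run(LOAD_QUERY).single()[0]
    print("rows processed:", total)
driver.close()
```

The RETURN count(row) at the end of the loading query is what produces the record count mentioned below.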
When the query loading step is finished, you should see a return value of 274, the total number of records in the input file, which lets you know the query was successful.

One of the best ways to improve Neo4j graph performance is to make sure that each node has an index, which can be created with statements like the ones sketched below. Each index statement must run in a separate code block.
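Again as a sketch rather than the original statements, the following creates one index per label and property used for lookups. The label and property names follow the assumed model from the previous sketch, and the CREATE INDEX ON :Label(property) form is the Neo4j 3.x syntax current when this post was written; recent Neo4j versions use CREATE INDEX FOR (n:Label) ON (n.property) instead.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://<test-drive-host>:7687",
                              auth=("neo4j", "<password>"))

# One index per label/property pair used for lookups; adjust to your own model.
INDEX_STATEMENTS = [
    "CREATE INDEX ON :Mascot(name)",
    "CREATE INDEX ON :Market(name)",
    "CREATE INDEX ON :Kingdom(name)",
]

with driver.session() as session:
    # Each index statement runs as its own query, mirroring the requirement to
    # run each statement in a separate code block in the Neo4j Browser.
    for statement in INDEX_STATEMENTS:
        session.run(statement)
driver.close()
```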
Exploring the NCAA mascot graph with Cypher

To see what our NCAA mascot graph looks like, run a Cypher query that builds a graph from mascot nodes containing the name "Buckeye". This query should return a graph similar to the following image. We can quickly see that a mascot (node) labeled Buckeye Nut is located in [relationship] the market (node) of Ohio State. You can also see that each of the taxonomic ranks for a Buckeye also has an "IS" relationship. We could extend the complexity of this graph by creating relationships that maintain the hierarchy of the taxonomic ranks, but since we are just introducing the concept of converting BigQuery data to a graph, we will continue with this simple graph structure.

Tigers and eagles and bulldogs, oh my!

While the true power of Cypher is that it allows us to explore relationships within the data, it is also useful in providing the same type of aggregations on the data that SQL gives us. A query that counts mascots by type finds the three most popular types of mascots in the NCAA. The query result is a visualization that lets you quickly identify tigers, eagles, and bulldogs as the most common mascots in the NCAA, and it also lets us identify the various markets that are home to these mascots. Neo4j's Browser displays the result in this graphical way because the return type of the query contained nodes and edges. If we wanted a more traditional tabular result, we could modify the query to request only certain attributes. We can also modify the query to find the three most popular mascots that are human, as opposed to an animal.

What do a Buckeye Nut and an Orange have in common?

Because Neo4j is a graph database, we can use the taxonomy structure in the data to find the shortest paths between nodes, to give a sense of how biologically similar two different kinds of things are, or at least what attributes they share. All we need is to match two different nodes in the graph and then ask for all of the shortest paths connecting them. Here, the graph patterns show us that a buckeye nut and an orange share several classifications: they're both plants, both Eukaryotes, and both in the Sapindales order, which are flowering plants.

Completing our test drive

At this point, we've seen how easy it is to get started using Neo4j in a contained environment, how we can quickly convert a BigQuery public dataset into a graph structure, and how we can interrogate the graph using aggregation capabilities. Now the real fun of using a graph database starts! In the Neo4j Browser, run :play start. This command will launch a card of Neo4j tutorials. Following those guides will let you understand the real power of having your data structured as a graph. In our next section, we will move from test driving Neo4j with public datasets to a private implementation of Neo4j that we can use to better understand our GCP security posture.

Using Neo4j to understand Google Cloud monitoring data

Cloud infrastructure greatly increases the security posture of most enterprises, but it can also increase the sophistication of configuration management databases (CMDBs). We need tools to understand the varied and ephemeral relationships of IT assets in the cloud. A graph database such as Neo4j can help you better understand your full cloud architecture by making it easy to connect data relationships all the way from the Kubernetes microservices that collect the data to the rows of the BigQuery analysis where the data ends up. For more on how Neo4j can help with similar use cases, see the Manage and Monitor Your Complex Networks with Real-Time Insight white paper.

In this section, we will use Stackdriver Logging to collect BigQuery logs and then export them into a Neo4j graph. For easy understanding, this graph will be limited to a small subset of BigQuery logs, but the real value of the relationships in Stackdriver data comes once you expand your graph with logs across VMs, Kubernetes, various Google Cloud services, and even AWS.

Neo4j causal clustering

Unlike the NCAA public data, our Stackdriver logs will most likely contain a lot of sensitive data we would not want to put on a test drive or expose publicly. The easiest way to obtain a fault-tolerant Neo4j database in our private Google Cloud project is by using GCP Launcher's Neo4j Enterprise deployment. Simply click this link and then click the "Launch on Compute Engine" button as shown below. Once you obtain a license from Neo4j and populate the configuration on the next page, a Neo4j cluster is deployed into your project that provides:

A fault-tolerant platform for transaction processing that remains available even if there is a VM failure
Scale through read replicas
Causal consistency, meaning a client application is guaranteed to read at least its own writes

You can read more about Neo4j's Causal Clustering architecture here.

Exporting Stackdriver metrics to the Neo4j virtual machine

From within the Google Cloud Platform console, you can go to Stackdriver > Logging > Exports as shown below to create an export of your Stackdriver logs. In a production environment, you might set up an export to send logs to a variety of services. The example shown is similar to the export technique used for the NCAA mascot data above: logs are collected in Stackdriver, exported to BigQuery, and then BigQuery is used to export a CSV into Cloud Storage. In this particular graph, we limited our output to the results of a BigQuery standard SQL query over the exported logs.

You can also export the logs directly from Stackdriver into Google Cloud Storage, creating JSON files. To import those JSON files, you would usually install the APOC server extension for Neo4j, but to keep things simple we'll just use CSV for this example.

Note: An important distinction between the process for importing public data from Cloud Storage and the process for importing log data from Cloud Storage is that I do not make the log data publicly available. Instead, I copy the log files to a local file on the Neo4j VM running in my account. I do so over an SSH connection to the instance, copying the exported CSV from Cloud Storage (for example with gsutil cp) into a directory to which the Neo4j process has access. The Neo4j Launcher deployment already provides the necessary read scopes for Google Cloud Storage to make this possible; however, you may still need to grant the service account of the Neo4j Compute Engine instance permissions on your bucket.

Stackdriver as a graph

Let's start off by converting this sample Stackdriver data into a graph; a sketch of this kind of load follows below.
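The original Cypher for this step is not included in this extract; the sketch below shows one plausible way to load the exported log CSV and then pull the ERROR logs for a project together with their related nodes. The file name, column names, node labels, and relationship types are all assumptions for illustration.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://<neo4j-vm-host>:7687",
                              auth=("neo4j", "<password>"))

# The CSV was copied into Neo4j's import directory on the VM, so it is loaded
# with the file:/// prefix rather than a public URL. Column names (severity,
# project_id, resource_type, method_name, user_agent) are assumptions about
# the fields selected in the BigQuery export query.
LOG_LOAD_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///bq_logs.csv' AS row
MERGE (p:Project  {id: row.project_id})
MERGE (r:Resource {type: row.resource_type})
MERGE (c:Caller   {method: row.method_name, user_agent: row.user_agent})
CREATE (l:Log {severity: row.severity})
MERGE (l)-[:IN_PROJECT]->(p)
MERGE (l)-[:ON_RESOURCE]->(r)
MERGE (l)-[:CALLED_BY]->(c)
RETURN count(row)
"""

# Example follow-up: pull the ERROR logs for one project together with every
# node they are related to, which is the kind of query used below to find the
# root cause behind an alert on rising error counts.
ERROR_QUERY = """
MATCH (l:Log {severity: 'ERROR'})-[:IN_PROJECT]->(p:Project {id: $project}),
      (l)-[rel]-(other)
RETURN l, rel, other
"""

with driver.session() as session:
    print(session.run(LOG_LOAD_QUERY).single()[0], "log rows loaded")
    for record in session.run(ERROR_QUERY, project="my-gcp-project"):
        print(record)
driver.close()
```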
Even with this small subset of Stackdriver data, we can begin to see the value of having a Neo4j graph and Stackdriver working in tandem. Let's take an example where we have used the Stackdriver Alerting feature's built-in condition to detect an increase in logging byte count. When you create this condition in Stackdriver, you can group it by both severity and the specific GCP project where the increase in logs is occurring. Once this configuration is set up, we can have Stackdriver alert us when a threshold of our choosing is crossed. This threshold setting will help notify us that a particular GCP project is experiencing an increase in error logs. However, with just this alert, we may need additional information to help diagnose the problem.

This is where having your Stackdriver logs in Neo4j can help. Although the alert only tells us to look in a particular project for an increase in error logs, having the graph available makes it possible to quickly identify the root cause by looking at the relationships contained in those GCP project logs. Running a query along these lines (see the sketch above) returns the ERROR logs in a particular project but also shows the resource relationships associated with those logs, as well as any other type of node that has a relationship with our logs. The image below is the output of the query.

This single query makes it apparent that the error logs (nodes in blue) in the project are all attributed not only to a single resource, BigQuery, but also to a specific node that contains the same method type and HTTP user agent header. This single node tells us that BigQuery query jobs coming from the Chrome browser are responsible for the increase in errors in the project and gives us a good place to start investigating the issue.

To learn more about using Neo4j to model your network and IT infrastructure, run the command :play https://guides.neo4j.com/gcloud-testdrive/network-management.html in the Neo4j Browser.

Conclusion

We hope that this post helped you understand the benefits of the Neo4j graph data model on Google Cloud Platform. In addition, we hope you were able to see how easy it is to load your own BigQuery and Stackdriver data into a Neo4j graph without any programming or sophisticated ETL work. To get started for free, check out the Test Drive on Google Cloud Launcher.
Source: Google Cloud Platform

Google Cloud for Life Sciences: new products and new partners

Google Cloud helps enterprises manage, process, structure, and analyze all kinds of biomedical data through both products and partnerships. Imagine being able to make sense of the immense volume of genomic, transcriptomic, metabolomic, phenotypic, and other data generated in research and clinical labs by structuring all of it in the cloud to deliver patient insights across millions of samples.

This week at Bio-IT World, we're showcasing our progress toward this goal through new Google Cloud Platform products such as Variant Transforms, which helps organizations structure genomic variant data in BigQuery, as well as new partnerships that help teams achieve operational and scientific excellence through cloud computing. These partnerships include:

BC Platforms, a world leader in genomic data management and analytics, is bringing GeneVision to GCP. GeneVision provides an end-to-end SaaS solution delivering on the promise of precision medicine, from raw genome data to actionable patient report.

FireCloud from the Broad Institute is an open platform for secure and scalable analysis in the cloud. FireCloud uses Cromwell, a popular open source workflow engine also created by the Broad, to leverage the Google Genomics API Pipelines component to run secondary analysis pipelines at scale.

Dell EMC is offering Dell EMC Isilon, a leading scale-out NAS platform, on Google Cloud Platform (GCP). Currently in early access, Isilon Cloud for GCP allows organizations to deploy dedicated Isilon infrastructure with secure and sub-millisecond latency network access to Compute Engine clusters. Dell EMC will provide 24×7 proactive monitoring and support of the environment, while customers will be able to maintain full access to all Isilon OneFS management interfaces.

DNAstack is an advanced platform for genomics data storage, bioinformatics, and sharing in the cloud. DNAstack recently launched the Canadian Genomics Cloud, which provides a massively scalable platform compliant with Canadian federal and provincial regulations for data privacy and security, in Google Cloud's brand new Montreal region.

Elastifile delivers enterprise-grade, scalable NFS file services in the public cloud, on-premises, or across both environments. Teams like Silicon Therapeutics use Elastifile on Google Cloud to power advanced drug discovery with tools like SLURM, to handle heterogeneous datasets at speed.

Komprise helps biomedical IT organizations manage data growth and cut costs through intelligent data management software. Komprise provides visibility across your current storage to identify cold or inactive data, and then transparently archive, replicate, and move that data to Google Cloud Storage. Moved data looks exactly the same as before, so users and applications face no disruption.

OnRamp.Bio is the team behind the ROSALIND platform. ROSALIND is a biologist-friendly bioinformatics engine for the analysis and interpretation of genomic data sets, now running on Google Cloud Platform. ROSALIND provides push-button simplicity with interactive visualization for a deeper discovery of data, without all the complexity of hashing together open source tools via command line.

PetaGene addresses IT challenges for genomic data. PetaSuite Cloud Edition enables organizations to easily cloud-enable their existing pipelines while delivering genomic compression to accelerate cloud transfers and reduce storage costs by up to 10x.
For GCP customers, PetaSuite Cloud Edition lets existing pipelines read and write data directly to and from Google Cloud Storage as though the files were local, as well as from other public and private cloud storage.

Sentieon implements the industry-standard mathematical methods used in BWA/GATK/MuTect/Mutect2, with efficient computing algorithms and robust software implementation. The Sentieon tools are scalable, deployable, upgradable, software-only solutions that run affordably in Google Cloud. We've provided the documentation to enable you to try out Sentieon's tools right from your GCP account.

Seven Bridges offers hundreds of genomics tools, workflows, and datasets on Google Cloud Platform in a secure managed environment. Their team can work with you to deploy complex workflows and develop the capabilities your organization needs to learn from your data faster.

WuXi NextCODE is a genomics company enabling researchers and clinicians to use genomic data to improve global health by uncovering disease-associated genomic markers in patients, families, cohorts, and populations. WuXi NextCODE is bringing its genomics-aware suite of capabilities to Google Cloud and is now available through the Google Cloud Launcher marketplace.

If you aren't familiar with our partners, we encourage you to visit us at booth #410 and meet them. They'll be on hand to demonstrate their solutions. We've also set up a special website for the conference, where you can track our partners' talks and demos, sign up for a one-on-one meeting with our executive team, and register for our reception on Tuesday night. We hope to see you there!
Source: Google Cloud Platform

Cloud ML Engine adds Cloud TPU support for training

Starting today, Cloud Machine Learning Engine (ML Engine) offers the option to accelerate training with Cloud TPUs as a beta feature. Getting started is easy, since Cloud TPU quota is now available to all GCP customers.

Cloud ML Engine enables you to train and deploy machine learning models on datasets of many types and sizes, using the flexibility and production-readiness of TensorFlow. As a managed service, ML Engine handles the infrastructure, compute resources, and job scheduling on your behalf, allowing you to focus on data and modeling.

In March 2017, we launched Cloud ML Engine to provide a managed TensorFlow service, with the ability to scale machine learning workloads using distributed training and GPU acceleration. Over the last year, we have continued to release new features and improvements, including beta support for NVIDIA V100 GPUs, online prediction as a deployment capability, and improvements to the hyperparameter tuning feature.

Today, we are adding support for Cloud TPUs, enabling you to train a variety of high-performance, open-source reference models with differentiated performance per dollar. Or, you can choose to accelerate your own models written with high-level TensorFlow APIs.

Recently launched in beta, Cloud TPUs are a family of Google-designed hardware accelerators built from the ground up for machine learning. Cloud TPUs recently won the ImageNet Training Cost category of Stanford's DAWNBench competition, and their performance and cost advantages were recently analyzed in detail.

Getting started with Cloud TPU on ML Engine

ML Engine automatically handles provisioning and management of Cloud TPU nodes, so you can use TPUs just as easily as CPUs and GPUs. Additionally, you can use ML Engine's hyperparameter tuning feature in your Cloud TPU jobs to optimize your hyperparameters, combining scale, performance, and algorithms to improve your models. Finally, the resulting models can be deployed with ML Engine to issue prediction requests or submit batch prediction jobs.

Read this guide to learn more about how you can use Cloud TPUs with ML Engine for training jobs.
Source: Google Cloud Platform

Securing cloud-connected devices with Cloud IoT and Microchip

Maintaining the security of your products, devices, and live code is a perpetual necessity. Every year, researchers (and hackers!) unearth some new flaw. Occasionally, these prove to be especially worrisome, like the "Meltdown" and "Spectre" vulnerabilities discovered by Google's Project Zero team at the end of 2017.

Many companies believe they are too small or too inconsequential to ever be a target, but in the case of a distributed denial of service (DDoS) attack, for example, hackers will exploit random hosts (as many as possible) to hit a specific target. Regardless of who owns the site, the attacker will try to use all available local resources to do some damage. This can mean using compute and bandwidth resources, or exposing assets or personal information about users. The "Mirai" attack on IoT devices didn't target anyone in particular, but aimed to take over connected devices in order to deploy them in rogue, massively distributed denial of service attacks.

Security cannot be an afterthought. The best course of action for any company building connected devices is to apply a combination of strong identity, encryption, and access control. In the world of IoT, this process is not as simple as it sounds. Here we present the story of Acme, a hypothetical company planning to launch a new generation of connected devices.

Acme has several work streams for its project: mechanical design, PCB design, supply chain, firmware development, network connectivity, cloud back-end, mobile and web applications, data processing and analytics, and support. Let's look at what each of these work streams demands in terms of security, starting with the application layer.

Application layer security

At this layer, where the backend and user applications are delivered, the security models are well understood: access controls via permissions, roles, strong passwords, encryption in transit and at rest, logging, and monitoring all provide a very good set of security measures. The main problem today is deciding how a company should best get its data into the cloud securely.

Data encryption

Encryption starts with Transport Layer Security (TLS), which ensures that traffic between two parties is indecipherable to any potential eavesdropper. TLS is commonly used for accessing websites, your bank's site included, to ensure encryption of all transmitted data, keeping it safe from any prying eyes. Understandably, Acme wants to implement TLS for its devices as well as its services.

There is a trick: when you connect to your bank, the TLS session is only authenticating the bank, not you or your machine. Once you have the TLS in place, you typically enter a username and password. That password can be changed, and it is stored in your head (please, don't keep a sticky note reminder below your keyboard). The fact that you have to use your head to put in the password is proof for the verifier that you are physically present at the other end of the connection. It says, "Here I am, and here is my password." But your device is not a person. A device sending a password proves that it has the password, but not that it is actually the expected device trying to authenticate. It's similar to someone stealing your sticky note with your password on it. To address this issue, Acme will install certificates on its devices.

A certificate uses asymmetric cryptography, which implies a separation of roles. The party issuing the certificate (the Certificate Authority) guarantees the link between the physical device and the public key. Having the public key alone is insufficient. Furthermore, the verifier never gets anything of value (like the password) that could be used to impersonate the entity (device). This is in fact a much higher level of security, but unfortunately it brings a level of complexity into the picture. The good news is that machines are good at both automating repetitive tasks and handling complexity. The sketch below illustrates the signing and verification at the heart of this model.
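As a minimal, self-contained illustration of that separation of roles (not code from the original post), the sketch below uses the Python cryptography library to generate an elliptic-curve key pair, sign a message with the private key, and verify the signature with nothing but the public key. The curve and the message are arbitrary choices for the example.

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.exceptions import InvalidSignature

# The device generates its own key pair; the private key never needs to leave it.
private_key = ec.generate_private_key(ec.SECP256R1())
public_key = private_key.public_key()

# Only the public key is shared with the verifier (e.g. embedded in a certificate).
public_pem = public_key.public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)

# The device proves its identity by signing a challenge with the private key.
message = b"here I am"
signature = private_key.sign(message, ec.ECDSA(hashes.SHA256()))

# The verifier checks the signature using nothing but the public key.
verifier_key = serialization.load_pem_public_key(public_pem)
try:
    verifier_key.verify(signature, message, ec.ECDSA(hashes.SHA256()))
    print("signature valid: the holder of the private key signed this message")
except InvalidSignature:
    print("signature invalid")
```

The verifier never receives the private key, which is exactly the property that makes certificate-based device identity stronger than a shared password.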
Device identity

How does Acme use certificates for its devices? It needs its own Certificate Authority (CA). Acme can buy a root certificate from a CA provider and create its Authority. The CA has a root certificate and private key that have to be closely guarded; in the digital era, this is the key to the kingdom. That key can be used to generate an intermediate CA for the purpose of signing other keys, for example those of the connected devices. If the root key is compromised, the entire security chain is compromised. If an intermediate CA is compromised, it's not good, but remediation steps can be taken, such as revoking all certificates generated by that CA, and a new intermediate CA can be generated. Acme is aware of how difficult it is to protect the root key and decides to buy that service from a company that specializes in it.

Manufacturing security

Now that Acme and its engineering team have a CA, they can generate certificates for their devices. These are called "key pairs": a set of private keys and corresponding public keys, alongside a unique certificate for each device. These certificates need to be put on each device, and this is where friction enters the process. Acme, after validating a final hardware design for its device, has found an ODM (Original Design Manufacturer) in China capable of producing the devices at a reasonable price.

Acme asks the ODM to flash each device with its unique key pair and certificate during manufacturing. The ODM replies that this will be a custom flashing per device and will add dozens of cents to each product. Indeed, custom flashing is expensive. Acme hadn't really planned for this increase, but security is too important and they decide to move forward with the extra cost.

To get the certificates to the ODM, Acme has two choices: (1) send a big file with all the keys and certificates to the ODM, or (2) provide an API that can be called during manufacturing, so the ODM can retrieve the certificates at the time of flashing. The ODM pushes back on the second option because their manufacturing plant is not connected to the internet for security purposes. Even if it were, each API call would drastically slow down the manufacturing process, and those calls would have to be extremely reliable so that there is no failure. The calls would also have to be highly secure, even requiring certificate-based authentication between the manufacturing plant and the API endpoint. Furthermore, regulations in China do not allow fully encrypted tunnels in and out of the country. The only option seems to be to send a file.

The risk of doing this is obvious. A file can be easily copied, which unfortunately happens frequently. Acme needs to trust the manufacturer not to set aside a few of those certificates, and not to release copies of the devices themselves that would be indistinguishable from the real ones (except for price, of course!). Every new batch will require a new file, and new opportunities for a copy to leak.

Authentication

Let's assume for now that the ODM is trustworthy, which is in fact often the case.
The device will have to use the certificate to authenticate itself with the cloud endpoint and establish the encrypted channel prior to operation. Just to say hello securely, the device first needs to open a secure pipe (over TLS), and then needs to use that pipe for the cloud and the device to mutually authenticate each endpoint's identity. This process requires both the device and the cloud endpoint to have the public key of the other party. The public keys of all devices connecting to the cloud endpoint will have to be uploaded to the cloud at one point or another before the authentication happens.

To perform the mutual authentication, the device has to store its private key and public key, a TLS stack with mutual authentication support, and a certificate with the public key of the cloud endpoint it connects to, so that it can establish the first call to the cloud securely. All of a sudden, the memory requirements on the device become a problem: that minimum stack is on the order of a few hundred kilobytes. Acme didn't plan on that much. The devices have simple command and control systems and a few sensors in them. The non-volatile storage capacity of the device is well under 100 kB and is insufficient. Acme will need to move to a more powerful architecture, adding cost to the original design.

Secure storage and secure boot

With more memory (and added cost), Acme is now looking for the best way to store the private key securely. Indeed, what use is a private key if someone can access the device firmware by physically hacking into it, or remotely take control and retrieve the private key? Doing so would allow an attacker to copy the private key, start connecting to the cloud endpoint, and access data they are not supposed to.

If the device is compromised, its firmware can be modified, which is exactly what Mirai does, and the device can be used for purposes other than what it was intended for. Validating the firmware through signature verification is critical to ensure that what runs on the device is legitimate before the firmware even boots. There is no way to prevent modification of this signature if the validation is not in a separate memory location from the firmware itself.

Rotating keys

Similarly to how a user changes their password from time to time to reduce the window of opportunity for an attacker to use a compromised password, devices need to be able to rotate their keys. That rotation is not as simple as getting a new key. Imagine the cloud system tells the device to change its keys. The cloud can generate a new pair, and the device can download it securely using the old key. The cloud then invalidates the old public key for the device and replaces it with the new one. You have to hope that the device is able to update its key pair at this stage, because if not, the user will end up with a brick. It is critical that several keys can be used simultaneously for a single device, to enable key rotation and allow reverting to a working state in case the process fails.

Summary of the situation

The cost of securing the device has skyrocketed for Acme, as has the complexity of implementing and maintaining a high level of security.
Let's summarize:

Acme needs certificates, and therefore a Certificate Authority that needs to be protected with the highest level of care.
The cost of burning those keys into the device is a balance between dollar amounts (and finding an appropriate ODM) and the risk of credentials being compromised (copied) during manufacture.
Acme will need to use TLS to secure the communication, which now requires a bloated TLS stack on the device and a larger memory footprint than anticipated. These resource demands increase after you integrate the Online Certificate Status Protocol (OCSP, for the broker), which requires additional (memory-consuming) keys and (CPU-consuming) requests.
Keys are extremely difficult, if not impossible, to store securely in the firmware.
Secure boot to stop the device from running compromised firmware is impossible without separate secure storage.
Refreshing keys requires the cloud solution to store several identities per device in order to have a failsafe.

At Google, we have taken a hard look at this situation, and we believe we have come up with a solution that can serve companies like Acme very well. The main demonstration of this solution is our partnership with Microchip.

Step 1: Use a secure element

A secure element is a piece of hardware that can securely store key material. It usually comes with anti-tampering capabilities that block attempts to physically hack the device and retrieve the keys. All IoT devices should have a secure element; it is the only way to secure the storage of the private key. All secure elements do that well, but some secure elements do more. For example, the Microchip ATECC608A cryptographic coprocessor chip will not only store the private keys, it will also validate the firmware and offer a more secure boot process for the device.

Microchip ATECC608A

The ATECC608A offers even more features. For example, the private key is generated by the secure element itself, not an external party (CA). The chip uses a random number generator to create the key, making it virtually impossible to derive. The private key never leaves the chip, ever. Using the private key, the chip can generate a public key that can be signed by the company's chosen CA.

Microchip performs this signature in a dedicated secure facility in the US, where an isolated plant stores the customer's intermediate CA keys in a highly secure server plugged into the manufacturing line. The key pairs and certificates are all generated on this line in a regulatory environment that allows auditing and a high level of encryption. Once the secure elements have each generated their key pairs, the corresponding public keys are sent to the customer's Google Cloud account and stored securely in the Cloud IoT Core device manager. Because Cloud IoT Core can store up to three public keys per device, key rotation can be performed safely, with a failsafe.

All the customer has to do is provide an intermediate CA for a given batch of devices to Microchip, and they will return a roll of secure elements. These rolls can be sent to any manufacturer to be soldered onto the final PCB at high speed, with no customization, no risk of copying, and very low cost.

Step 2: Using a JWT for authentication

Using TLS is perfect for securing the communication between the device and the cloud, but the authentication stack is not ideal for IoT. The stack required for mutual authentication is large and has a downside: it needs to be aware of where the keys are stored. The TLS stack needs to know which secure element is used and how to communicate with it. An OpenSSL stack will assume the keys are stored in a file system and needs to be modified to access the secure element. This requires development and testing that has to be redone at each update of the stack. With TLS 1.3 coming up, it is likely that this work will have to happen several times, which is a cost for the company. The company can use a TLS stack that is already compatible with the secure element, like WolfSSL, but there is a licensing cost involved that adds to the cost of the device.

Google Cloud IoT uses a very common JWT (JSON Web Token) to authenticate the device instead of relying on the mutual authentication of a TLS stack. The device establishes a secure connection to the global cloud endpoint for Cloud IoT Core (mqtt.googleapis.com) using TLS, but instead of triggering the mutual authentication it generates a very simple JWT, signs it with its private key, and passes it as a password. The Microchip ATECC608 offers a simple interface to sign the device JWT securely without ever exposing the private key. The JWT is received by Google Cloud IoT, and the public key for the device is retrieved and used to verify the JWT signature. If valid, the mutual authentication is effectively established. The JWT validity period can be set by the customer but never exceeds 24 hours, making it very ephemeral. The sketch below illustrates this connection flow.

Secure flow with Microchip and Cloud IoT's Device Manager
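The post itself does not include client code, so the following is a rough sketch of the connection pattern it describes, using the PyJWT and Eclipse Paho MQTT libraries. For simplicity the JWT is signed here with a private key read from a PEM file; on a real device the signature would come from the ATECC608 instead. The project, region, registry, and device IDs, file names, and payload are placeholders.

```python
import datetime
import jwt                    # PyJWT
import paho.mqtt.client as mqtt

PROJECT_ID = "my-project"     # placeholders: use your own IDs
CLOUD_REGION = "us-central1"
REGISTRY_ID = "my-registry"
DEVICE_ID = "my-device"

def create_jwt(project_id, private_key_file, algorithm="ES256"):
    """Create a short-lived JWT signed with the device's private key.

    In production the signature would come from the secure element; here the
    key is read from a PEM file purely for illustration.
    """
    now = datetime.datetime.utcnow()
    claims = {
        "iat": now,
        "exp": now + datetime.timedelta(minutes=60),  # well under the 24h maximum
        "aud": project_id,
    }
    with open(private_key_file, "r") as f:
        private_key = f.read()
    return jwt.encode(claims, private_key, algorithm=algorithm)

client_id = (f"projects/{PROJECT_ID}/locations/{CLOUD_REGION}/"
             f"registries/{REGISTRY_ID}/devices/{DEVICE_ID}")
client = mqtt.Client(client_id=client_id)

# Cloud IoT Core ignores the username; the JWT is passed as the password.
client.username_pw_set(username="unused",
                       password=create_jwt(PROJECT_ID, "ec_private.pem"))

# TLS protects the channel; the JWT authenticates the device over it.
client.tls_set(ca_certs="roots.pem")
client.connect("mqtt.googleapis.com", 8883)
client.loop_start()

# Publish one telemetry message, wait for delivery, and disconnect.
info = client.publish(f"/devices/{DEVICE_ID}/events", payload='{"power": 42}', qos=1)
info.wait_for_publish()
client.loop_stop()
client.disconnect()
```

Because the JWT expires (here after 60 minutes, and never more than 24 hours), the device simply reconnects with a freshly signed token when the old one lapses.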
There are several benefits to this approach:

There is no dependency on the TLS stack used to perform the device authentication. Updating the TLS stack to 1.3 will be a breeze.
The devices do not need to store their public key and certificate, which frees a significant portion of memory on the device.
The device does not need to host a full TLS stack, which again frees memory for the application.
The memory requirements are well under 50 KB, which opens the door to using a much smaller MCU (microcontroller unit).

With these two steps, the full complexity of handling certificates is removed, and customers can focus on their product and customer experience.

Conclusion

Security is complex, and as we alluded to in the introduction, it cannot be an afterthought. Fortunately, with the use of the JWT authentication scheme and the partnership with Microchip around the ATECC608, security is turned into a simple BOM item. Google and Microchip even agreed on a discounted price of around 50 cents. This means customers pay less than a dollar not only to bring increased security to the provisioning of identity, authentication, and encryption, but also to free up a large amount of space on the device, enabling smaller and cheaper MCUs to work in the final design. The chip can even be retrofitted into existing designs as a companion chip, since the secure element communicates easily over I2C. We hope you'll consider integrating the ATECC608 in every IoT design you are looking into.

To learn more, take a look at the following links:

Google Cloud IoT Core product page
Microchip-Google partnership page
Google Cloud IoT Security webinar

We'll also be presenting our work around IoT and security at Google Cloud's NEXT 2018 event on July 24-26 in San Francisco. Here are a couple of sessions you might be interested in:

An overview of Cloud IoT Core
Google's vision for Industrial IoT

Register here
Source: Google Cloud Platform

Expanding the Azure Stack partner ecosystem

We continue to expand our ecosystem by partnering with independent software vendors (ISVs) around the globe to deliver prepackaged software solutions to Azure Stack customers. As we get closer to our two-year anniversary, we are humbled by the trust and confidence our partners have placed in the Azure Stack platform. We would like to highlight some of the partnerships we have built during this journey.

Security

Thales now offers its CipherTrust Cloud Key Manager solution through the Azure Stack Marketplace. It works with the Azure and Azure Stack "Bring Your Own Key" (BYOK) APIs to give customers control over their encryption keys. CipherTrust Cloud Key Manager creates Azure-compatible keys from the Vormetric Data Security Manager that can offer up to FIPS 140-2 Level 3 protection. Customers can upload, manage, and revoke keys, as needed, to and from Azure Key Vaults running in Azure Stack or Azure, all from a single pane of glass.

Migration

Every organization has a unique journey to the cloud based on its history, business specifics, culture, and, maybe most importantly, its starting point. The journey to the cloud provides many options, features, and functionalities, as well as opportunities to improve existing governance and operations, implement new ones, and even redesign applications to take advantage of cloud architectures.

When starting this migration, Azure Stack offers a number of ISV partner solutions that can help you start with what you already have and progress to modernizing your applications as well as your operations. These are described in the "Azure Stack at its core is an Infrastructure-as-a-Service (IaaS) platform" blog series.

Data protection and disaster recovery

Veeam Backup and Replication 9.5 is now available through the Azure Stack Marketplace, making it possible to protect both Windows and Linux-based workloads running in the cloud from one centrally managed console. Refer to this document to learn about all data protection and disaster recovery partner solutions that support the Azure Stack platform.

Networking

The VM-Series next-generation firewall from Palo Alto Networks allows customers to securely migrate their applications and data to Azure Stack, protecting them from known and unknown threats with application whitelisting and threat prevention policies. You can learn more about the VM-series next-generation firewall on Azure Stack.

Developer platform and tools

We continue to invest in open source technologies and Bitnami helps us make this possible with their extensive application catalog. Bitnami applications can be found on the Azure Stack Marketplace and can easily be launched directly on your Azure Stack platform. Learn more about Bitnami offerings.

With self-service simplicity, performance and scale, Iguazio Data Science Platform empowers developers to deploy AI apps faster on the edge. Iguazio Data Science Platform will be soon available through Azure Stack Marketplace.

IoT solutions

PTC's ThingWorx IIoT platform is designed for rapidly developing industrial IoT solutions, with the ability to scale securely from the cloud to the edge. ThingWorx runs on top of Microsoft Azure or Azure Stack and leverages Azure PaaS to bring a best-in-class IIoT solution to the manufacturing environment. Deploying ThingWorx on Azure Stack enables you to bring your cloud-based Industry 4.0 solution to the factory floor. On the show floor, experience a demonstration of how the ThingWorx connected factory solution pulls data from real factory assets and makes insightful data available in prebuilt applications that can be customized and extended using ThingWorx Composer and Mashup Builder.

Intelligent Edge devices

With the private preview of IoT Hub in Azure Stack, we are very excited to see our customers and partners creating solutions that perform data collection and AI inferencing in the field. Intel and its partners have created hardware kits that support IoT Edge and seamlessly integrate with Azure Stack. A few examples of such kits are the IEI Tank and up2, which enable the creation of computer vision solutions and deep learning inference using a CPU, GPU, or an optional VPU. These kits allow you to kick-start your targeted application development with a superior out-of-the-box experience that includes pre-loaded software like the Intel Distribution of OpenVINO™.

View all partner solutions available on Azure Stack Marketplace. 
Source: Azure

Easing compliance for UK public and health sectors with new Azure Blueprints

Earlier this month we released our latest Azure Blueprints for key compliance standards with the availability of the UK OFFICIAL blueprint for the United Kingdom's Government-Cloud (G-Cloud) standard and the UK NHS blueprint for National Health Service (NHS) Information Governance. The new blueprints map a set of Azure policies to appropriate UK OFFICIAL and UK NHS controls for any Azure-deployed architecture. This allows UK government agencies and partners, and UK health organizations, to more easily create Azure environments that might store and process UK OFFICIAL government data and health data.

Azure Blueprints is a service that enables customers to define a repeatable set of Azure resources that implement and adhere to standards, patterns, and requirements. Azure Blueprints help customers to set up governed Azure environments that can scale to support production implementations for large-scale migrations.

The National Health Service is the national health system for England, which holds the population's health data. NHS Digital published its guidance on the use of public cloud services for storing confidential patient data, which provides a single standard that governs the collection, storage, and processing of patient data. Adherence to this guidance helps protect the integrity and confidentiality of patient data against unauthorized access, loss, damage, and destruction.

G-Cloud is a UK government initiative to enable the adoption of cloud services by the UK public sector. The G-Cloud standard requires the implementation of 14 Cloud Security Principles. Every year, Microsoft submits evidence to attest that its in-scope cloud services comply with these principles, giving potential G-Cloud customers an overview of its risk environment. 

The UK OFFICIAL blueprint includes mappings to 8 of the 14 Cloud Security Principles:

1.  Data in transit protection. Assigns Azure Policy definitions to audit insecure connections to storage accounts and Redis cache.

2.  Data at rest protection (asset protection and resilience). Assigns Azure Policy definitions that enforce specific cryptographic controls and audit the use of weak cryptographic settings. Also includes policies to restrict deployment of resources to UK locations.

5.  Operational security. Assigns Azure Policy definitions that monitor missing endpoint protection, missing system updates, various vulnerabilities, unrestricted storage accounts, and whitelisting activity.

9.  Secure user management and 10. Identity and authentication. Assigns several Azure Policy definitions to audit external accounts, accounts that do not have multi-factor authentication (MFA) enabled, virtual machines (VMs) without passwords, and other issues.

11. External interface protection. Assigns Azure Policy definitions that monitor unrestricted storage accounts. Also assigns a policy that enables adaptive application controls on VMs.

12.  Secure Service Administration. Assigns Azure Policy definitions related to privileged access rights for external accounts, Azure Active Directory authentication, MFA enablement, etc.

13.  Audit Information for Users. Assigns Azure Policy definitions that audit or enable various log settings on Azure resources.

Microsoft has prepared a guide to explain how Azure can help customers comply with all 14 Cloud Security Principles, including principles 3, 4, 6, 7, 8, and 14. It can be found in our document 14 Cloud Security Controls for UK Cloud Using Microsoft Azure.

Compliance with regulations and standards such as ISO 27001, SSAE 16, PCI DSS, and UK OFFICIAL is increasingly necessary for all types of organizations, making control mappings to compliance standards a natural application for Azure Blueprints. Azure customers, particularly those in regulated industries, have expressed strong interest in compliance blueprints to make it easier to meet their compliance obligations.

We are committed to helping our customers leverage Azure in a manner that helps improve security and compliance. We have now released Azure Blueprints for ISO 27001, PCI DSS, UK OFFICIAL, and UK NHS. Over the next few months we will release new built-in blueprints for HITRUST, NIST SP 800-53, FedRAMP, and the Center for Internet Security (CIS) Benchmark. If you would like to participate in any early previews, please sign up with this form, or if you have a suggestion for a compliance blueprint, please share it via the Azure Governance Feedback Forum.

Learn more about the UK OFFICIAL and UK NHS blueprints in our documentation Control mapping of the UK OFFICIAL and UK NHS blueprint samples.
Source: Azure

Announcing Docker Enterprise 3.0 General Availability

Today, we're excited to announce the general availability of Docker Enterprise 3.0 – the only desktop-to-cloud enterprise container platform enabling organizations to build and share any application and securely run it anywhere – from hybrid cloud to the edge.

Docker Enterprise 3.0 Demo

Leading up to GA, more than 2,000 people participated in the Docker Enterprise 3.0 public beta program to try it for themselves. We gathered feedback from some of these beta participants to find out what excites them most about the latest iteration of Docker Enterprise. Here are 3 things that customers are excited about and the features that support them:

Simplifying Kubernetes

Kubernetes is a powerful orchestration technology but due to its inherent complexity, many enterprises (including Docker customers) have struggled to realize the full value of Kubernetes on their own. Much of Kubernetes’ perceived complexity stems from a lack of intuitive security and manageability configurations that most enterprises expect and require for production-grade software. We’re addressing this challenge with Docker Kubernetes Service (DKS) – a Certified Kubernetes distribution that is included with Docker Enterprise 3.0. It’s the only offering that integrates Kubernetes from the developer desktop to production servers, with ‘sensible secure defaults’ out-of-the-box.

“Increasing application development velocity and digital agility are a strategic imperative for companies in all sectors today. Developer experience is the killer app,” said RedMonk co-founder, James Governor. “Docker Kubernetes Service and Docker Application aim to package and simplify developer and operator experience, making modern container based workflows more accessible to developers and operators alike.”

You can learn more about Docker Kubernetes Service here.

Automating Deployment of Containers and Kubernetes

One of the most common requests we've heard from customers has been to make it easier to deploy and manage their container environments. That's why we introduced new lifecycle automation tools for day 1 and day 2 operations, helping customers accelerate and expand the deployment of containers and Kubernetes on their choice of infrastructure. Using a simple set of CLI commands, operations teams can easily deploy, scale, back up and restore, and upgrade their Docker Enterprise clusters across hybrid and multi-cloud deployments on AWS, Azure, or VMware.

You can learn more about lifecycle automation tools here.

Building Modern Applications 

With the ever-increasing emphasis on making things easier and faster for developers, it’s no surprise that Docker Desktop Enterprise and Docker Application created a lot of excitement amongst beta participants. Docker Desktop Enterprise is a new developer tool that decreases the “time-to-Docker” – accelerating developer onboarding and improving developer productivity. Docker Application, based on the CNAB standard, is a new application format that enables developers to bundle the many distributed resources that comprise a modern application into a single object that can be easily shared, installed and run anywhere. Docker Desktop Enterprise also allows users to quickly and easily create Docker Applications leveraging pre-defined Application Templates that support any language or framework.

“The Docker Enterprise platform and its approach to simplifying how containerized applications are built, shared and run allows us to fail fearlessly. We can test new services easily and quickly and if they work, we can immediately enhance the mortgage experience for our customers,” said Don Bauer, Lead DevOps Engineer, Citizens Bank. “Docker’s investment in new capabilities like Docker Application and simplified cluster management will further improve developer productivity and lifecycle automation for us so that we can continue to bring new, differentiated services to market faster.” 

You can learn more about Docker Applications here.

How to Get Started

Try Docker Enterprise 3.0 for Yourself
Learn More about What’s New in Docker Enterprise 3.0
Sign up for the upcoming webinar series: Drive High-Velocity Innovation with Docker Enterprise 3.0

Source: https://blog.docker.com/feed/

Azure solutions for financial services regulatory boundaries

Microsoft Azure is rapidly becoming the public cloud of choice for large financial services enterprises. Some of the biggest reasons Global Financial Services Institutions (GFSIs) are choosing Azure to augment or replace on-premises application environments are:

The high level of security that the Azure cloud provides.
The exceptional control enterprises can have over compliance and security within their subscriptions.
The many features that Azure has for data governance and protection.
The long list of global regulatory standards that the Azure cloud complies with. Please see the Microsoft Trust Center for more information.

Requirements for globally regulated Azure solutions

Azure is built to allow enterprises to control the flow of data between regions, and to control who has access to and can manage that data. Before we begin talking about solutions, we need to define the requirements.

Examples of global regulation

Many governments and coalitions have developed laws and regulations for how data is stored, where it can be stored, and how it must be managed. Some examples of the more stringent and well-known of these regulations are:

European Union (EU)

General Data Protection Regulation (GDPR) is a legal framework that sets guidelines for the collection and processing of personal information from individuals who live in the EU.

Germany

Federal Data Protection Act is a law that deals with the conditions for processing employee data, and restrictions on the rights enjoyed by data subjects.

Data Localization and Management Law is a law that states that data collected about German citizens must be properly protected and encrypted, stored only on physical devices within Germany’s political boundaries, and managed only by German citizens.

China

Cyber Security Law (CSL) is a set of laws concerned with data localization, infrastructure, and management.

Canada

The Canadian Personal Information Protection and Electronic Documents Act (PIPEDA) protects consumer data across Canada against misuse and disclosure.

Architecture and design requirements

Beyond the above-mentioned regulatory requirements there exist technical requirements specific to these scenarios. Cloud application and infrastructure architects are presented with the opportunity to develop solutions that provide business function while not violating international laws and regulations. The following are some of the requirements that need to be considered.

Globalization

A globalized business model provides access to multiple financial markets on a continuous basis each day. These markets differ in operations, language, culture, and of course regulation. Despite these differences, the services placed in the cloud need to be architected consistently across these markets to ensure manageability and a consistent customer experience.

Services and data management

Germany and China are prime examples of countries that only allow their citizens to manage data and the infrastructure on which that data resides.

Data localization

Many countries require that at least some of the data under their sovereignty remain physically within their borders. Regulated data cannot be transferred out of the country, and data that does not meet regulatory requirements cannot be transferred into the country.

Reliability

Due to many of the above requirements, it becomes more complicated to design for high availability, data replication, and disaster recovery. For example, data must be replicated only to locations consistent with the country’s or region’s standards and laws. Likewise, if a DR scenario is triggered, it must be ensured that the applications running in the DR site do not cross legal or regulatory boundaries to access information.

Authentication

Proper authentication to support role- and identity-based access controls must be in place to ensure that only intended and legally authorized individuals can access resources.

The Azure solution

Security components

Azure Active Directory (AAD)

Azure Active Directory (AAD) is the cloud-based version of Active Directory, so it takes advantage of the flexibility, scalability, and performance of the cloud while retaining the AD functionality that customers have grown used to. One of those functions is the ability to create sub-domains that can be managed separately and contain only those identities relevant to a given country or region. AAD also provides functionality to differentiate between business-to-business (B2B) and business-to-consumer (B2C) relationships. This differentiation helps distinguish customers’ access to their own data from management access.
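
As an illustration of tenant-scoped authentication, here is a minimal sketch using the MSAL library for Python to acquire an application token against a single directory; the tenant ID, client ID, and secret shown are placeholders.

```python
# A minimal sketch, assuming the msal package; all identifiers are placeholders.
import msal

TENANT_ID = "<tenant-id-for-the-regional-directory>"
CLIENT_ID = "<app-registration-client-id>"
CLIENT_SECRET = "<app-registration-secret>"

# Pinning the authority to one tenant keeps tokens (and therefore management
# operations) scoped to that directory's identities.
app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
if "access_token" in result:
    print("Token acquired for tenant", TENANT_ID)
else:
    print("Authentication failed:", result.get("error_description"))
```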

Azure Sentinel

Azure Sentinel is a scalable, cloud-native, security information event management (SIEM), and security orchestration automated response (SOAR) solution. Azure Sentinel delivers intelligent security analytics and threat intelligence across the enterprise, providing a single solution for alert detection, threat visibility, proactive hunting, and threat response.

Azure Key Vault 

Azure Key Vault helps safeguard cryptographic keys and secrets that cloud applications and services use. Key Vault streamlines the key management process and enables you to maintain control of keys that access and encrypt your data. Developers can create keys for development and testing in minutes, and then migrate them to production keys. Security administrators can grant (and revoke) permission to keys, as needed.
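
For example, the sketch below uses the Azure SDK for Python (azure-identity and azure-keyvault-secrets) to store and retrieve a secret; the vault URL and secret value are placeholders.

```python
# A minimal sketch, assuming the azure-identity and azure-keyvault-secrets packages.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential picks up a managed identity, environment variables,
# or a developer's CLI login, so no credentials are hard-coded here.
credential = DefaultAzureCredential()
client = SecretClient(vault_url="https://my-vault.vault.azure.net", credential=credential)

# Store a connection string as a secret, then read it back by name.
client.set_secret("sql-connection-string", "Server=...;Database=...;")
secret = client.get_secret("sql-connection-string")
print(secret.name, secret.properties.version)
```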

Role-based access control

Access management for cloud resources is a critical function for any organization that is using the cloud. Role-based access control (RBAC) helps you manage who has access to Azure resources, what they can do with those resources, and what areas they have access to. RBAC is an authorization system built on Azure Resource Manager that provides fine-grained access management of Azure resources.
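
As a hedged sketch (azure-mgmt-authorization model and method names vary slightly between SDK versions), the snippet below assigns the built-in Reader role to a principal at the scope of a single resource group, keeping that principal's access inside one boundary; the subscription ID, resource group name, and principal object ID are placeholders.

```python
# A hedged sketch, assuming azure-identity and azure-mgmt-authorization; exact
# model names differ between SDK versions, so treat this as illustrative only.
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"  # placeholder
client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)

# Scope the assignment to one resource group so the principal can only reach
# resources inside that regulatory boundary.
scope = f"/subscriptions/{subscription_id}/resourceGroups/germany-workloads"
reader_role_id = (
    f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization/"
    "roleDefinitions/acdd72a7-3385-48ef-bd42-f606fba81ae7"  # built-in Reader role
)

client.role_assignments.create(
    scope,
    str(uuid.uuid4()),  # role assignment names must be GUIDs
    RoleAssignmentCreateParameters(
        role_definition_id=reader_role_id,
        principal_id="<object-id-of-user-or-group>",  # placeholder
    ),
)
```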

Azure Security Center

Azure Security Center is a unified infrastructure security management system that strengthens the security posture of your datacenters. It also provides advanced threat protection across your hybrid workloads, whether they run in Azure, in other clouds, or on-premises.

Governance components

Azure Blueprints

Azure Blueprints helps you deploy and update cloud environments in a repeatable manner using composable artifacts such as Azure Resource Manager templates to provision resources, role-based access controls, and policies. Blueprints can be used to deploy certain policies or controls for a given location or geographic region. Sample blueprints can be found in our GitHub repository.
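
To make the artifact idea concrete, below is a hedged sketch of a policy-assignment artifact of the kind found in the blueprint samples, written here as a Python dict; the policy definition ID is the built-in "Allowed locations" policy, and the region list is illustrative.

```python
# A hedged sketch of a blueprint policy-assignment artifact, expressed as a
# Python dict; the structure follows the published blueprint samples and the
# values are illustrative.
policy_artifact = {
    "kind": "policyAssignment",
    "properties": {
        "displayName": "Restrict deployments to approved regions",
        "policyDefinitionId": (
            "/providers/Microsoft.Authorization/policyDefinitions/"
            "e56962a6-4747-49cd-b67b-bf8b01975c4c"  # built-in "Allowed locations"
        ),
        "parameters": {
            "listOfAllowedLocations": {
                "value": ["germanywestcentral", "germanynorth"]
            }
        },
    },
}
```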

Azure Policy

Azure Policy is a service in Azure that you use to create, assign, and manage policies. These policies enforce different rules and effects over your resources, so those resources stay compliant with your corporate standards and service level agreements. For example, a policy can be set to allow only certain roles to access a group of resources. Another example is setting a policy that only certain sized resources are allowed in a given resource group. If a new resource is added to the group, the policy automatically applies to that entity. Sample Azure Policy configurations can be found in our GitHub repository.
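
The second example above, allowing only certain resource sizes in a resource group, can be expressed as a policy rule like the hedged sketch below, written as a Python dict that mirrors the policy definition JSON; the parameter name and allowed SKU list are illustrative.

```python
# A hedged sketch of a policy rule that denies any virtual machine whose SKU is
# not on an approved list, expressed as a Python dict mirroring the policy JSON.
allowed_vm_sizes_policy = {
    "mode": "Indexed",
    "parameters": {
        "listOfAllowedSKUs": {
            "type": "Array",
            "metadata": {"description": "VM sizes permitted in this scope."},
        }
    },
    "policyRule": {
        "if": {
            "allOf": [
                {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
                {
                    "not": {
                        "field": "Microsoft.Compute/virtualMachines/sku.name",
                        "in": "[parameters('listOfAllowedSKUs')]",
                    }
                },
            ]
        },
        "then": {"effect": "deny"},
    },
}
```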

Azure Virtual Datacenter Program (VDC)

The Azure Virtual Datacenter Program (VDC) is a collection of methods and archetypes designed to help enterprises standardize deployments and controls across application and workload environments. VDC utilizes multiple other Azure products, including Azure Policy and Azure Blueprints. VDC samples can be found in our GitHub repository.

Infrastructure components

Azure Site Recovery (ASR)

Azure Site Recovery (ASR) provides data replication and disaster recovery services between Azure regions, or between on-premises environments and Azure. ASR can be easily configured to replicate and fail over between Azure regions within or outside a given country or geographic region.

High availability

Virtual machine (Infrastructure-as-a-Service, IaaS) high availability can be achieved in multiple ways within the Azure cloud. Azure provides two native methods of failover, sketched in the template fragments after the list below:

An Azure Availability Set (AS) is a group of virtual machines that are deployed across fault domains and update domains within the same Azure Datacenter. Availability sets make sure that your application is not affected by single points of failure, like the network switch or the power unit of a rack of servers. Azure Availability Sets provide a service level agreement (SLA) of 99.95%.
An Availability Zone (AZ) is like an availability set in that the virtual machines are deployed across fault and update domains. The difference is that AZs provide a higher level of availability (an SLA of 99.99%) by spreading the VMs across multiple Azure datacenters within the same region.
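
The hedged sketch below shows, as Python dicts that mirror the corresponding ARM-template resources, which settings drive each option; resource names, counts, API versions, and the region are illustrative.

```python
# A hedged sketch of the ARM-template fragments behind the two failover options,
# written as Python dicts; names, counts, and the region are illustrative.
availability_set = {
    "type": "Microsoft.Compute/availabilitySets",
    "apiVersion": "2019-07-01",
    "name": "trading-app-avset",
    "location": "germanywestcentral",
    "sku": {"name": "Aligned"},          # required when the VMs use managed disks
    "properties": {
        "platformFaultDomainCount": 2,   # spread across racks / power units
        "platformUpdateDomainCount": 5,  # spread across maintenance windows
    },
}

zonal_vm = {
    "type": "Microsoft.Compute/virtualMachines",
    "apiVersion": "2019-07-01",
    "name": "trading-app-vm1",
    "location": "germanywestcentral",
    "zones": ["1"],                      # pin this VM to availability zone 1
    # ...remaining VM properties (hardware profile, OS, NICs) omitted...
}
```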

For Platform-as-a-Service (PaaS) offerings, high availability is built into the services and does not need to be configured by the customer, as it does for the IaaS services above.

Data at rest encryption

Data at rest encryption is a common security requirement. In Azure, organizations can encrypt data at rest without the risk or cost of a custom key management solution. Organizations have the option of letting Azure completely manage encryption at rest. Additionally, organizations have various options to closely manage encryption or encryption keys.
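
One common way to closely manage the keys is to point a storage account at a customer-managed key held in Key Vault. The hedged sketch below shows the relevant encryption properties as a Python dict following the ARM schema; the key and vault names are placeholders.

```python
# A hedged sketch of the storage-account "encryption" block used to switch from
# Microsoft-managed keys to a customer-managed key in Key Vault; property names
# follow the ARM schema and the values are placeholders.
customer_managed_key_encryption = {
    "encryption": {
        "keySource": "Microsoft.Keyvault",
        "keyvaultproperties": {
            "keyname": "storage-cmk",
            "keyvaulturi": "https://my-vault.vault.azure.net",
        },
        "services": {
            "blob": {"enabled": True},
            "file": {"enabled": True},
        },
    }
}
```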

Conclusion

The above capabilities are available across Azure’s industry-leading regional coverage and extensive global network. Microsoft’s commitment to global regulatory compliance, data protection, data privacy, and security makes Azure uniquely positioned to support GFSIs as they migrate complex, mission-critical workloads to the cloud.

For more information on Azure compliance, please visit the Microsoft Trust Center compliance overview page.
Source: Azure