Amazon SageMaker Pipelines now integrates with Amazon SageMaker Experiments

Amazon SageMaker Pipelines, the first purpose-built continuous integration and continuous delivery (CI/CD) service for machine learning (ML), is now integrated with SageMaker Experiments, which lets customers organize, track, compare, and evaluate their ML experiments. Customers can now compare metrics such as model training accuracy across multiple executions of their SageMaker Pipelines, just as they can compare such metrics across multiple trials of an ML model training experiment. SageMaker Pipelines automatically creates an experiment named after the pipeline and an experiment trial for every execution of the pipeline. Creating an experiment for a pipeline and a trial for every pipeline execution is turned on by default. You can opt out of the automatic creation.
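To make the naming scheme concrete, here is a minimal sketch in plain Python. It does not call the SageMaker SDK: the `ExperimentStore` class, its methods, the pipeline name, the execution IDs, and the metric values are all hypothetical stand-ins chosen for illustration. It models one experiment named after the pipeline, one trial per pipeline execution, an opt-out flag, and a comparison of a training accuracy metric across executions.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentStore:
    """Hypothetical in-memory stand-in for an experiment tracker (not the SageMaker API)."""

    # experiment name -> trial name -> metric name -> value
    experiments: dict = field(default_factory=dict)

    def record_execution(self, pipeline_name, execution_id, metrics, track=True):
        """Record one pipeline execution as a trial under the pipeline's experiment."""
        if not track:
            # Mirrors the opt-out: no experiment or trial is created.
            return None
        trials = self.experiments.setdefault(pipeline_name, {})
        trials[execution_id] = dict(metrics)
        return execution_id

    def compare(self, pipeline_name, metric):
        """Return (trial, value) pairs for one metric, highest value first."""
        trials = self.experiments.get(pipeline_name, {})
        rows = [(name, m[metric]) for name, m in trials.items() if metric in m]
        return sorted(rows, key=lambda row: row[1], reverse=True)


store = ExperimentStore()
store.record_execution("churn-pipeline", "exec-001", {"train:accuracy": 0.91})
store.record_execution("churn-pipeline", "exec-002", {"train:accuracy": 0.94})
# This execution opts out of tracking, so it never appears in comparisons.
store.record_execution("churn-pipeline", "exec-003", {"train:accuracy": 0.89}, track=False)

ranking = store.compare("churn-pipeline", "train:accuracy")
for trial, accuracy in ranking:
    print(f"{trial}: {accuracy:.2f}")
```

Keying the experiment by pipeline name and the trial by execution ID is what lets every execution of the same pipeline line up for comparison without any extra bookkeeping.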
Source: aws.amazon.com

Security automation for digital transformation

As organizations adopt agile and DevOps practices to improve their processes and products at breakneck speed, security considerations may be left in the dust and digital risks left unmanaged. Organizations must therefore make security automation part of their digital transformation. This article provides security basics and an automation approach for assessing whether platforms, products, and services comply with security policies and with regulatory and compliance requirements.
Source: CloudForms

New Google Cloud innovations to unify your data cloud

Every company in every industry is on a journey to become more data-driven, whether that’s providing great digital experiences to customers, driving operational excellence through AI, or detecting hidden patterns in data to improve decision making. To help with this transformation, we are excited to announce new products and services designed to fully unify your databases, analytics, and AI in an open data cloud, so that you can get the most value from your data.

Here are some of our latest innovations to help your organization succeed in today’s data-driven world:

- Centrally manage, monitor, and govern your data across data lakes, data warehouses, and data marts, and make this data securely accessible to a variety of analytics and data science tools from a single view with Dataplex.
- Move and synchronize data between heterogeneous databases, storage, and applications reliably to support real-time analytics, database replication, and event-driven architectures with Datastream, our serverless change data capture (CDC) and replication service, available in preview.
- Access and share valuable datasets and analytics assets (think BigQuery ML models, Looker Blocks, data quality recipes, etc.) across any organizational boundary with Analytics Hub, a fully managed service built on BigQuery that allows you to efficiently and securely create data sharing ecosystems with governance in mind. Sign up for the Analytics Hub product preview to learn more.
- Speed up your rate of experimentation with AI projects and accelerate time to business value with Vertex AI, our comprehensive AI platform that gives data scientists and machine learning (ML) engineers a way to simplify the process of building, training, and deploying ML models at scale.

Multi-cloud investments with Anthos, BigQuery Omni, Looker, and our flexible data platform are helping organizations accelerate decision making regardless of their cloud strategy.

Customers leading data-driven transformation

Ever-changing consumer expectations and increased data complexity have made business decision-making much harder. As a result, the insight gap that keeps organizations from realizing value from their data continues to grow, with more data silos across the business and increased security risk. Digital transformation leaders are digging out of this complexity and offering increased value to their customers by leveraging an open data cloud.

Carrefour had less than 5% of its apps running in the cloud in 2018. By the end of 2020, more than 25% of its applications (approximately 800!) were cloud-based. Its 700TB data lake moved from on-premises to Google Cloud in only a few months without any service interruption, and it now scales to 2TB+ per day. Using BigQuery, its data scientists can access larger amounts of data and spend most of their time on model development. Carrefour is also using Looker to provide data-based insights to its suppliers to improve collaboration.

One of the largest transportation logistics companies in North America, JB Hunt, will use Google’s data cloud to better predict outcomes, empower users, and make informed decisions. Real-time data is a cornerstone of the $1 trillion logistics industry, as customers increasingly expect faster services and more transparency on their shipments.

And Etsy has helped its community of sellers turn their ideas into successful businesses. The company has adapted its marketplace, which connects creators with millions of buyers. Etsy achieved scale with better search and smarter recommendations that have helped grow buyer retention and business revenue, all while improving the sustainability of its business.

More data cloud innovations

In addition to the new products above, we are excited to announce updates to BigQuery, Dataflow, Looker, and Spanner technologies.

BigQuery Omni for Azure, now in preview, builds on our commitment to multi-cloud by giving you a way to analyze data across public clouds from a single pane of glass. We believe in flexibility when it comes to analytics, and this announcement, along with last year’s introduction of BigQuery Omni for AWS, helps you access and securely analyze data across Google Cloud, AWS, and Azure. Join our session Unlock Innovation and Flexibility with a MultiCloud Strategy to learn how customers like Electronic Arts are developing applications and analyzing data residing across multiple clouds with BigQuery Omni, Looker, and Apigee to innovate faster.

Looker hosted on Microsoft Azure, now generally available, adds Azure to our range of hosting options. With Looker, data teams can connect to data located on the cloud (or clouds) of their choice with support for more than 60 distinct database dialects, host Looker where it makes sense for their data strategy (Google Cloud, AWS, Azure, or self-hosted), and deliver data and insights to where they add the most value.

Dataflow Prime brings improved resource utilization, radical simplicity, and integrated ML to streaming ETL and continuous analytics use cases. With innovations in vertical autoscaling, right fitting, and proactive diagnostics, Dataflow Prime removes the operational toil associated with sizing and provisioning infrastructure, tuning, and debugging performance and data freshness problems. It provides ML integration, an open framework and APIs, and unified batch and streaming data processing for real-time applications.

And we’re making Cloud Spanner, our fully managed relational database that supports strong consistency and virtually unlimited scale, accessible to more customers by lowering the entry price by 90%.
We’re also offering more granular instance sizing (coming soon) while providing the same scale and reliability, opening up Spanner to more workloads. In addition, BigQuery federation to Spanner is coming soon, which lets users query transactional data residing in Spanner from BigQuery for richer, real-time insights. Key Visualizer, available now in public preview, provides interactive monitoring that allows developers to quickly identify trends and usage patterns in Spanner for improved decision making. Finally, we’re announcing that Bigtable joins Firestore and Spanner in offering an industry-leading 99.999% availability SLA.

Lastly, BigQuery ML Anomaly Detection provides a way to more easily detect problematic data patterns for a variety of use cases, including bank fraud detection and manufacturing defect analysis.

Data analytics partner ecosystem, powered by BigQuery

Google Cloud has a thriving partner ecosystem for data analytics, and we’re looking at new ways of celebrating those partners who are building data-driven applications and delivering new analytics services to their customers, all powered by BigQuery. Partners such as Quantum Metric, Shape Security, and Trax are leveraging processing, collection, storage, and analytics on BigQuery to solve their customers’ challenges in customer analytics, security, and data exchanges. Reach out to us through our Partner Advantage Program to learn more about how you can use BigQuery to power your applications. And watch keynote and strategy presentations on demand at the Data Cloud Summit to learn and share new ways we can all use data for good.
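
For a sense of scale, the 99.999% availability figure mentioned above for Bigtable, Firestore, and Spanner can be converted into a downtime budget. The sketch below is only the arithmetic, assuming a 365-day year; the function name is made up for this example, and real SLA terms define their own measurement periods and exclusions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Maximum downtime (in minutes) that still meets the availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_minutes


five_nines = downtime_budget_minutes(99.999)
four_nines = downtime_budget_minutes(99.99)
print(f"99.999%: {five_nines:.2f} minutes per year")
print(f"99.99%:  {four_nines:.2f} minutes per year")
```

Under these assumptions, 99.999% leaves roughly 5.26 minutes of downtime per year, a tenth of the roughly 52.56 minutes allowed at 99.99%.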
Source: Google Cloud Platform

Introducing Analytics Hub: secure and scalable sharing for data and analytics

Customers tell us that sharing and exchanging data with other organizations is a critical element of their analytics strategy, but it’s hamstrung by unreliable data and processes, and it is only getting harder with security threats and privacy regulations on the rise. Furthermore, traditional data sharing techniques use batch data pipelines that are expensive to run, create late-arriving data, and can break with any change to the source data. They also create multiple copies of data, which brings unnecessary costs and can bypass data governance processes. Nor do these techniques offer features for data monetization, such as managing subscriptions and entitlements. Altogether, these challenges mean that organizations are unable to realize the full potential of transforming their business with shared data.

To address these limitations, we are introducing Analytics Hub, a new fully managed service, available in preview in Q3, that helps you unlock the value of data sharing, leading to new insights and increased business value. With Analytics Hub you get:

- A rich data ecosystem by publishing and subscribing to analytics-ready datasets.
- Control and monitoring over how your data is being used, because data is shared in one place.
- A self-service way to access valuable and trusted data assets, including data provided by Google. For example, a unique dataset from Google Search Trends will be available that you can query and combine with your own data.
- An easy way to monetize your data assets without the overhead of building and managing the infrastructure.

Built on a decade of cross-organizational sharing

While Analytics Hub is a new service, it builds on BigQuery, Google’s petabyte-scale, serverless cloud data warehouse. BigQuery’s unique architecture separates compute from storage, enabling data publishers to share data with as many subscribers as they want without having to make multiple copies of their data.
With BigQuery, there are no servers to deploy or manage, which means that data consumers get immediate value from shared data. Data can be provided and consumed in real time using the streaming capabilities of BigQuery, and you can leverage the built-in machine learning, geospatial, and natural language capabilities of BigQuery or take advantage of the native business intelligence support with tools like Looker, Google Sheets, and Data Studio.

BigQuery has had cross-organizational, in-place data sharing capabilities since it was introduced in 2010. We took a look at usage metrics in BigQuery and found that over a 7-day period in April, we had over 3,000 different organizations sharing over 200 petabytes of data. These numbers don’t include data sharing between departments within the same organization.

As you can see, data sharing in BigQuery is already popular. But we want to make it easier and even more scalable.

Raising the bar on data sharing

To make data sharing easier and more scalable in BigQuery, Analytics Hub introduces the concepts of shared datasets and exchanges. As a data publisher, you create shared datasets that contain the views of data that you want to deliver to your subscribers. Next, you create exchanges, which are used to organize and secure shared datasets. By default, exchanges are completely private, which means that only the users and groups that you give access to can view or subscribe to the data. You can also create internal exchanges or leverage public exchanges provided by Google. Finally, you publish shared datasets into an exchange to make them available to subscribers. Data subscribers search through the datasets that are available across all exchanges to which they have access and subscribe to relevant datasets. This creates a linked dataset in their project that they can query and join with their own data. Subscribers pay for the queries that they run against the data, while the publisher pays for the storage of the data.
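
The publish and subscribe flow described above can be sketched as a small data model. This is a conceptual illustration in plain Python, not the Analytics Hub or BigQuery API: every class, method, dataset name, and user address below is hypothetical. It shows an exchange that is private by default, a publisher listing a shared dataset, and a subscriber receiving a linked dataset that references the shared data instead of copying it.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDataset:
    """Hypothetical shared dataset: a named set of views owned by a publisher."""
    name: str
    publisher: str
    views: list


@dataclass
class LinkedDataset:
    """Hypothetical linked dataset: a reference to shared data, not a copy."""
    subscriber: str
    source: SharedDataset


@dataclass
class Exchange:
    """Hypothetical exchange that organizes and secures shared datasets."""
    name: str
    allowed_users: set = field(default_factory=set)  # empty set: private by default
    listings: dict = field(default_factory=dict)

    def publish(self, dataset):
        self.listings[dataset.name] = dataset

    def search(self, user):
        """Dataset names visible to a user; nothing if the user has no access."""
        if user not in self.allowed_users:
            return []
        return sorted(self.listings)

    def subscribe(self, user, dataset_name):
        if user not in self.allowed_users:
            raise PermissionError(f"{user} has no access to exchange {self.name}")
        return LinkedDataset(subscriber=user, source=self.listings[dataset_name])


exchange = Exchange(name="retail-partners")
exchange.allowed_users.add("analyst@subscriber.example")

shared = SharedDataset("store_sales", "publisher.example", ["daily_sales_view"])
exchange.publish(shared)

linked = exchange.subscribe("analyst@subscriber.example", "store_sales")
print(exchange.search("analyst@subscriber.example"))
print(exchange.search("stranger@other.example"))
print(linked.source is shared)
```

Modeling the linked dataset as a reference is the point of the design: the publisher keeps a single stored copy (and pays for that storage), while each subscriber only holds a pointer they can run queries against.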
Data providers can add new data, new tables, or new columns to the shared dataset, and these will be immediately available to subscribers. In addition, the publisher can track subscribers, disable subscriptions, and see aggregated usage information for the shared data.

Analytics Hub makes it easy for you to publish, discover, and subscribe to valuable datasets that you can combine with your own data to derive unique insights. Here are some types of data that will be available through Analytics Hub:

- Public datasets: Easy access to the existing repository of over 200 public datasets, including data about weather and climate, cryptocurrency, healthcare and life sciences, and transportation.
- Google datasets: Unique, freely available datasets from Google. One example is the COVID-19 community mobility dataset. Another is the forthcoming Google Trends dataset, which will provide the top 25 search terms and top 25 rising search terms over a 5-year window in 210 distinct locations in the US. Trends data can be used by everyone in the organization to gain insights into what customers care about.
- Commercial (paid-for) datasets: We are working with leading commercial data providers to bring their data products to Analytics Hub. If you are interested in delivering your data via Analytics Hub, we’re also introducing Data Gravity, an initiative that provides storage benefits and new distribution paths for data published through Analytics Hub.
- Internal datasets: We know that data sharing can be challenging in larger organizations. Analytics Hub can be used for internal data, for example, to share standardized customer demographics with your sales engineering and data science teams.

Customers and partners using Analytics Hub

“Google Search Trends data has always been an important tool for our WPP agency data teams.
At WPP we believe that data variety is a superpower, which is why we are excited to use the new Trends dataset availability within BigQuery, plus the launch of Analytics Hub. The best creativity in the world is informed by data insights, and influenced by what people search for, so the operational efficiencies we’ll gain via the Analytics Hub and the insights we can drive with Trends data are just phenomenal.”
- Di Mayze, Global Head of Data and AI, WPP

“Equifax Ignite is our shared data analytics environment within our Equifax data fabric. We are excited to partner with Google to leverage Analytics Hub and BigQuery to deliver data to over 400 statisticians and data modelers as well as securely sharing data with our partner financial institutions.”
- Kumar Menon, SVP Data Fabric and Decision Science, Equifax

“The flow of data and insights between our teams at Deloitte and our clients is paramount for building truly transformational data cultures. With its purpose-built architecture for secure data exchanges and sharing analytics resources, Google Cloud’s Analytics Hub can help provide significant operational efficiencies for how Deloitte teams support our clients’ data-driven initiatives within their industry ecosystems. It will also help minimize the worries about scale, privacy and security, or the administrative burden associated with each.”
- Navin Warerkar, Managing Director, Deloitte Consulting LLP, and US Google Cloud Data & Analytics GTM Lead

“Crux Informatics is proud to partner with Google to support the launch of Analytics Hub, removing friction for those who need access to analytics-ready data. With thousands of datasets from over 140 sources, Crux Informatics will accelerate access to data on Analytics Hub and together provide a more efficient and cost effective solution to deliver datasets in Google Cloud’s ecosystem.”
- Will Freiberg, CEO, Crux Informatics

Next steps for Analytics Hub

This is just the beginning for Analytics Hub.
As we get to preview and general availability, we will be adding capabilities, including workflows for publishing and subscribing, the publishing of analytics assets (Looker Blocks, Data Studio reports, Connected Google Sheets) along with the shared data, the ability for data publishers to specify query restrictions on the usage of their data, and an easy way for data publishers to create sandbox environments in which subscribers can work with their data, even if they are not yet on Google Cloud. We will also provide features in Analytics Hub for monetization of data, including managing subscriptions, data entitlements, and billing.

Please sign up for the preview, which is scheduled to be available in the third quarter of 2021. In the meantime, you can learn more about BigQuery and how to leverage its built-in data sharing capabilities. Please go to g.co/cloud/analytics-hub to register your interest in Analytics Hub.
Source: Google Cloud Platform