Monitor and secure your containers with new Container Threat Detection

As more containerized workloads find their way into your organization, you want to be able to detect and respond to threats to containers running in this environment. Today, we're excited to announce the general availability of Container Threat Detection to help you monitor and secure your container deployments in Google Cloud.

Container Threat Detection is a built-in service in the Security Command Center Premium tier. It detects the most common container runtime attacks and alerts you to any suspicious activity. This release includes multiple new detection capabilities and provides an API.

Here are the key findings identified by Container Threat Detection:

Suspicious Binary Executions: Container Threat Detection can see when a binary that was not part of the original container image is executed and triggers a finding, indicating that an attacker may have control of the workload and is executing suspicious software such as malware or cryptocurrency mining software.

Suspicious Library Loaded: Container Threat Detection can also detect when a library that was not part of the original container image is loaded, a possible sign that the attacker has control of the workload and is executing arbitrary code.

Reverse Shell: Container Threat Detection monitors for processes started with stream redirection to a remote connected socket. An attacker can use a reverse shell to communicate from a compromised workload to an attacker-controlled machine and perform malicious activity, for example as part of a botnet.

Get started today

You can get started with Container Threat Detection by simply enabling the built-in service in Security Command Center with a Premium subscription. To enable a Premium subscription, contact your Google Cloud Platform sales team.

We've also made it easy for you to test Container Threat Detection in a non-production environment. To trigger Container Threat Detection findings in a test environment, follow the steps outlined in the Testing Container Threat Detection guide.

Security Command Center is a native security and risk management platform for Google Cloud. In addition to Container Threat Detection, it provides built-in services that enable you to gain visibility into your cloud assets, discover misconfigurations and vulnerabilities in your resources, and help maintain compliance based on industry standards and benchmarks.

You can learn more about Security Command Center and how it can help with your security operations in our product documentation.
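Findings also surface programmatically through the Security Command Center API. Below is a minimal sketch using the Python client library to list findings in one of the categories above; the organization ID is a placeholder, the "-" source wildcards across all sources, and the exact category string is an assumption to verify against the finding documentation.

```python
# Minimal sketch: list Container Threat Detection findings with the
# Security Command Center client (pip install google-cloud-securitycenter).
from google.cloud import securitycenter

ORG_ID = "123456789"  # placeholder organization ID

client = securitycenter.SecurityCenterClient()
all_sources = f"organizations/{ORG_ID}/sources/-"  # "-" = all sources

# Filter on a finding category; "Reverse Shell" mirrors the finding
# described above (exact category strings may differ per release).
response = client.list_findings(
    request={"parent": all_sources, "filter": 'category="Reverse Shell"'}
)
for result in response:
    finding = result.finding
    print(finding.name, finding.category, finding.event_time)
```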
Source: Google Cloud Platform

Best practices to use Apache Ranger on Dataproc

Dataproc is an easy-to-use, fully managed cloud service for running managed open source software, such as Apache Spark, Apache Presto, and Apache Hadoop clusters, in a simpler, more cost-efficient way. Dataproc allows you to have long-running clusters similar to always-on on-premises OSS clusters. But even better, it allows multiple smaller, customized, job-focused clusters that can be turned off when a job is done to help manage costs. However, using these ephemeral clusters raises a few questions: How do you manage secure and fine-grained access to Hadoop services in this new architecture? How can you audit user actions and make sure the logs persist beyond any cluster lifecycle?

In this blog, we propose an end-to-end architecture and best practices to answer these questions using Apache Ranger, an open source authorization framework for Hadoop, on Google Cloud.

In this architecture, several Dataproc clusters share a single Ranger backend database while each cluster has its own Ranger admin and plugin components. The database, hosted on Cloud SQL, centralizes the policies so that they stay synchronized among all the clusters. With this architecture, you don't have to deploy one Ranger database per cluster and, consequently, deal with policy synchronization and incur higher costs. Moreover, you don't need a central Ranger admin instance, which would require maintenance to keep it always up. Instead, the only centralized component is your Ranger database, backed by Cloud SQL, Google Cloud's fully managed relational database service.

How is the cloud different?

With Dataproc you create clusters in a few minutes, manage them easily, and save money by turning clusters off when you don't need them. You can create as many clusters as you need, tailor them for a job or a group of jobs, and have them around only while those jobs are running. That sounds great, but how are authentication and authorization managed in such an environment? Dataproc shares the Cloud Identity and Access Management (Cloud IAM) functionality with the rest of Google Cloud; however, IAM permissions are high-level and not specifically aimed at controlling very fine-grained access to the services in a Hadoop environment. That is where Ranger excels. If you are used to Ranger in your on-prem environments, you will feel at home on Dataproc. Dataproc supports Ranger as an optional component, so you can continue to have Ranger installed on each cluster.

In the first diagram, you can see four Dataproc clusters on Google Cloud. Each cluster hosts an instance of Ranger to control access to cluster services such as Hive, Presto, and HBase.

Users of these services have their identities defined in an identity provider service that is external to the clusters. As an example, the diagram shows an LDAP server such as Apache DS running on Google Compute Engine. However, you can also use your own identity provider, like Active Directory, on-prem or on a different cloud provider (see Authenticating corporate users in a hybrid environment). The access policies defined in Ranger are also external to the clusters. The diagram shows them stored in a centralized Cloud SQL instance, along with the Ranger internal users. Finally, auditing is externalized to Cloud Storage, with each cluster storing its logs in its own bucket and folder. Having the policies, internal users, and logs separated from the Hadoop clusters allows you to create and turn off clusters as needed.
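As a concrete starting point, here is a hedged sketch that creates such a cluster with the Dataproc Python client, enabling the Ranger and Solr optional components and wiring Ranger to a shared Cloud SQL instance. The project, region, Cloud SQL connection name, KMS key, and Cloud Storage URIs are placeholders, and the dataproc: property keys should be verified against the Dataproc Ranger component documentation.

```python
# Hedged sketch: create a Dataproc cluster with the Ranger and Solr optional
# components, backed by a shared Cloud SQL Ranger database. All names and
# property keys below are assumptions to verify against the Dataproc docs.
from google.cloud import dataproc_v1

project_id, region = "my-project", "us-central1"  # placeholders
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": project_id,
    "cluster_name": "ranger-cluster-1",
    "config": {
        "software_config": {
            "optional_components": ["RANGER", "SOLR", "ZOOKEEPER"],
            "properties": {
                # KMS-encrypted Ranger admin password stored in GCS (placeholders)
                "dataproc:ranger.kms.key.uri": "projects/my-project/locations/global/keyRings/kr/cryptoKeys/ranger",
                "dataproc:ranger.admin.password.uri": "gs://my-secrets/ranger-admin-password.encrypted",
                # Shared Cloud SQL instance that stores policies, users, and roles
                "dataproc:ranger.cloud-sql.instance.connection.name": "my-project:us-central1:ranger-db",
                "dataproc:ranger.cloud-sql.root.password.uri": "gs://my-secrets/ranger-db-root-password.encrypted",
                # Per-cluster folder for Solr-managed Ranger audit logs
                "dataproc:solr.gcs.path": "gs://my-audit-logs/ranger-cluster-1",
            },
        },
    },
}

operation = client.create_cluster(
    request={"project_id": project_id, "region": region, "cluster": cluster}
)
print(operation.result().cluster_name)  # blocks until the cluster is ready
```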
What is behind the scenes in a cluster?

Let's go under the hood of a cluster and drill down to the components that make this architecture possible.

Users of the system, shown at the top of the diagram, want to access one or more of the cluster services to process some data and get results back. They authenticate using an on-cluster Kerberos Key Distribution Center or, alternatively, an Apache Knox Gateway as described in this article. Both Kerberos and Apache Knox can verify the user identities defined in an external LDAP server. The Ranger User Sync Server periodically retrieves the identities from the LDAP server so that it can apply access policies to the users. Dataproc supports Kerberos integration on the cluster out of the box. If you use Kerberos in your cluster with this architecture, you need to use an LDAP server as an external cross-realm trust to map users and groups into Kerberos principals.

Once a user is authenticated, their request is routed to the appropriate service, where it is intercepted by the corresponding Ranger plugin for that service. The plugin periodically retrieves the policies from the Ranger Policy Server. These policies determine whether the user identity is allowed to perform the requested action on the specific service. If it is, the plugin allows the service to process the request and the user gets back the results. Note that the policies are external to the cluster and stored in a Cloud SQL database, so they persist independently of the cluster lifecycle.

Every user interaction with a Hadoop service, whether allowed or denied, is written to cluster logs by the Ranger Audit Server. Each cluster has its own logs folder in Cloud Storage. Ranger can index and search these logs by leveraging Apache Solr. Examining the logs of a previously deleted cluster is as easy as creating a new cluster and pointing the dataproc:solr.gcs.path property to the old cluster's logs folder.

Last but not least, the Ranger Admin UI is installed to provide an easy way to visualize and manage the different policies, roles, identities, and logs across clusters. Access to the Admin UI is given to a separate group of users, internal to Ranger and stored in the Ranger database.

All the Ranger components run on the Hadoop master node. Workers, which ultimately run jobs orchestrated through YARN, are not pictured in the diagram and do not need any particular configuration.

How does the architecture work with ephemeral clusters?

Dataproc allows you to run multiple long-running and/or ephemeral clusters simultaneously. Should you install Ranger in every cluster? The answer is yes and no. If every cluster had its own Ranger admin and database, it would be cumbersome to re-populate the users and policies every time you create a new cluster. On the other hand, a central Ranger service brings up scalability issues, since it has to deal with the user sync, policy sync, and audit logs for all the clusters.

The proposed architecture keeps a central Cloud SQL database always up while all the clusters can be ephemeral. The database stores policies, users, and roles. Every cluster has its own Ranger components synchronized with this database. The advantage of this architecture is that you avoid policy synchronization, and the only centralized component is Cloud SQL, which is managed by Google Cloud. See the first figure above for the full architecture.
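Because every cluster's Ranger admin works against the same shared database, a policy created through one cluster's admin endpoint is picked up by all of them. As a hedged sketch of what that looks like against Ranger's public REST API (the host, credentials, service name, and policy shape below are illustrative assumptions):

```python
# Hedged sketch: create a Hive policy through the Ranger Admin REST API.
# Host and credentials are placeholders; the policy lands in the shared
# Cloud SQL database, so every cluster's plugins pick it up.
import requests

RANGER_ADMIN = "http://ranger-cluster-1-m:6080"  # placeholder admin endpoint
AUTH = ("admin", "my-ranger-admin-password")     # placeholder credentials

policy = {
    "service": "hive-dataproc",                  # Ranger service name (placeholder)
    "name": "analysts-read-sales",
    "resources": {
        "database": {"values": ["sales"]},
        "table": {"values": ["*"]},
        "column": {"values": ["*"]},
    },
    "policyItems": [
        {
            "groups": ["analysts"],
            "accesses": [{"type": "select", "isAllowed": True}],
        }
    ],
}

resp = requests.post(
    f"{RANGER_ADMIN}/service/public/v2/api/policy", json=policy, auth=AUTH
)
resp.raise_for_status()
print(resp.json()["id"])  # id of the newly created policy
```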
How do you authenticate users?

For Ranger, there are two user types:

External users: These are users that access data processing services such as Hive. In most cases, they do not need explicit access to the Ranger UI. Ranger runs a user synchronization daemon in every cluster to fetch these users and groups from LDAP, then persists them in the Ranger database. This daemon can run safely in each Dataproc cluster as long as they all fetch users from the same LDAP server with the same parameters. To avoid race conditions, where a particular user is synchronized twice by different clusters, the Ranger database has a uniqueness constraint on user and group IDs.

Internal users: These are the users of the Ranger UI, and authentication works differently than for external users. You define authentication to the UI via an LDAP/AD setup or by manually creating the users. This method must be set up in every cluster explicitly, because every UI checks its own configuration to learn where to query for authentication. When you create a user directly via the UI, Ranger persists that user into the shared database, so the user is available in the Ranger UIs on all clusters without any additional configuration.

A Ranger admin user is a special internal user who has the authority to perform any action in the Ranger UI, such as creating policies, adding internal users, and assigning the admin role to others. The Dataproc Ranger component allows you to set the Ranger admin user password during startup and stores the credentials in the central Ranger database. Therefore, the admin user and password are the same across all the clusters.

How do you synchronize authorization policies across clusters?

Ranger stores authorization policies in a relational database. The architecture uses a shared Cloud SQL Ranger database so that policies are available to all clusters. Admin users can alter these policies by logging into any Ranger UI that shares the same database.

How do you audit user actions?

Apache Solr handles the Ranger audit logs and stores them in a Cloud Storage bucket for durability even after cluster deletion. When you need to read the logs of a deleted cluster, you create a cluster and point Solr to the same Cloud Storage folder. You will then be able to browse the logs in the Ranger UI of that cluster. The cluster you create for log retrieval can be small, such as a single-node cluster, and ephemeral (see the sketch below). To avoid having different Cloud Storage buckets per cluster, use the same bucket for all of them, as long as each cluster logs to a different folder. Clusters cannot write their audit logs to the same folder, since each cluster has its own Solr component managing these logs.

In addition to Ranger audit logs, Google Cloud provides Cloud Audit Logs. These logs are not as granular as the Ranger logs, but they are an excellent tool for answering "who did what, where, and when?" on your Google Cloud resources. For example, if you use the Dataproc Jobs API, you can find out through Cloud Audit Logging which Cloud IAM user submitted a job, or track the Dataproc service account's reads and writes on a Cloud Storage bucket.
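As a hedged sketch of that log-retrieval step, the following creates a small, ephemeral single-node cluster whose Solr component is pointed at a deleted cluster's audit folder. The cluster name, bucket path, and single-node property are assumptions to verify against the Dataproc documentation.

```python
# Hedged sketch: spin up a small single-node cluster just to browse the
# audit logs of a deleted cluster. Paths and names are placeholders;
# Ranger admin password properties are omitted here for brevity
# (see the cluster-creation sketch above).
from google.cloud import dataproc_v1

project_id, region = "my-project", "us-central1"  # placeholders
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": project_id,
    "cluster_name": "ranger-log-reader",
    "config": {
        "software_config": {
            "optional_components": ["RANGER", "SOLR", "ZOOKEEPER"],
            "properties": {
                # A single-node cluster is enough for log browsing
                "dataproc:dataproc.allow.zero.workers": "true",
                # Point Solr at the *old* cluster's audit log folder
                "dataproc:solr.gcs.path": "gs://my-audit-logs/ranger-cluster-1",
            },
        },
    },
}

operation = client.create_cluster(
    request={"project_id": project_id, "region": region, "cluster": cluster}
)
operation.result()  # browse the old logs in this cluster's Ranger UI, then delete it
```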
Use the right access control for your use case

Before we finish, we'd ask you to consider whether you need Ranger at all. Ranger adds minutes to cluster creation, and you have to manage its policies. As an alternative, you can create many ephemeral Dataproc clusters and assign them individual service accounts with different access rights. Depending on your company size, creating a service account and cluster per person may not be cost-effective, but creating shared clusters per team offers a sufficient degree of separation for many use cases. You can also use Dataproc Personal Cluster Authentication if a cluster is only intended for interactive jobs run by an individual (human) user.

Use these alternatives instead of Ranger when you don't need fine-grained authorization and auditing at the service, table, or column level: you can limit a service account or user account to access only a specific cluster and data set.

Get started with Ranger on Dataproc

In this blog post, we propose a Ranger architecture to serve multiple long-running and/or ephemeral Dataproc clusters. The core idea is to share the Ranger database, authentication provider, and audit log storage, and to run all other components, such as Ranger Admin, Ranger UI, Ranger User Sync, and Solr, in individual clusters. The database serves the policies, users, and roles for all the clusters. You don't need to run a central Ranger service because the Ranger components are stateless. Solr stores the audit logs on Cloud Storage to keep them available for further analysis even after a cluster is deleted.

Try Ranger on Dataproc with the Dataproc Ranger Component for easy installation. Combine it with Cloud SQL as the shared Ranger database. Go one step further and connect your visualization software to Hadoop on Google Cloud.
Source: Google Cloud Platform

Better service orchestration with Workflows

Going from a single monolithic application to a set of small, independent microservices has clear benefits: microservices enable reusability and make it easier to change and scale apps on demand. At the same time, they introduce new challenges. No longer is there a single monolith with all the business logic neatly contained and services communicating with simple method calls. In the microservices world, communication has to go over the wire with REST or some kind of eventing mechanism, and you need to find a way to get independent microservices to work toward a common goal.

Orchestration vs. Choreography

Should there be a central orchestrator controlling all interactions between services, or should each service work independently and only interact through events? This is the central question in the orchestration vs. choreography debate. In orchestration, a central service defines and controls the flow of communication between services. With centralization, it becomes easier to change and monitor the flow and to apply consistent timeout and error policies. In choreography, each service registers for and emits events as needed. There is usually a central event broker to pass messages around, but it does not define or direct the flow of communication. This allows services to be truly independent, at the expense of a less traceable and manageable flow and policies.

Google Cloud provides services supporting both approaches. Pub/Sub and Eventarc are suited for choreography of event-driven services, whereas Workflows is suited for centrally orchestrated services.

Workflows: Orchestrator and more

Workflows is a service to orchestrate not only Google Cloud services, such as Cloud Functions and Cloud Run, but also external services. As you might expect from an orchestrator, Workflows allows you to define the flow of your business logic in a YAML-based workflow definition language and provides a Workflows Execution API and Workflows UI to trigger those flows (see the sketch after this section).

It is more than a mere orchestrator, with these built-in and configurable features:

Flexible retry and error handling between steps for reliable execution.
JSON parsing and variable passing between steps to avoid glue code.
Expression formulas for decisions that allow conditional step executions.
Subworkflows for modular and reusable workflows.
Support for external services, allowing orchestration of services beyond Google Cloud.
Authentication support for Google Cloud and external services for secure step executions.
Connectors to Google Cloud services such as Pub/Sub, Firestore, Tasks, and Secret Manager for easier integration (in private preview soon).

Not to mention, Workflows is a fully managed serverless product: there are no servers to configure or scale, and you only pay for what you use.

Use cases

Workflows lends itself well to a wide range of use cases. For example, in an e-commerce application, you might have a chain of services that need to be executed in a certain order, and if any of the steps fail, you want to retry or fail the whole chain. Workflows, with its built-in error and retry handling, is perfect for this use case. In another application, you might need to execute different chains depending on a condition, using Workflows' conditional step execution. In long-running batch data processing applications, you usually need to execute many small steps that depend on each other, and you want the whole process to complete as a whole.
Workflows is well suited here because it:

Supports long-running workflows.
Supports a variety of Google Cloud compute options, such as GCE or GKE for long-running processing and Cloud Run or Cloud Functions for short-lived data processing.
Is resilient to system failures: even if there is a disruption to the execution of the workflow, it will resume at the last check-pointed state.

In the orchestration vs. choreography debate, there is no right answer. If you're implementing a well-defined process with a bounded context, something you can picture with a flow diagram, orchestration is often the right solution. If you're creating a distributed architecture across different domains, choreography can help those systems work together. You can also have a hybrid approach where orchestrated workflows talk to each other via events.

I'm definitely excited about using Workflows in my apps, and it will be interesting to see how people use Workflows with services on Google Cloud and beyond. For more information, check out the Workflows documentation, and feel free to reach out to me on Twitter @meteatamel with questions or feedback!
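To make the Execution API mentioned above concrete, here is a minimal sketch that triggers a deployed workflow from Python and passes it a JSON argument; the project, location, and workflow names are placeholder assumptions.

```python
# Hedged sketch: trigger a deployed workflow via the Workflows Execution API
# (pip install google-cloud-workflows). All names are placeholders.
import json
from google.cloud.workflows import executions_v1

client = executions_v1.ExecutionsClient()
parent = "projects/my-project/locations/us-central1/workflows/order-workflow"

execution = client.create_execution(
    request={
        "parent": parent,
        "execution": {"argument": json.dumps({"order_id": "12345"})},
    }
)
print(execution.name, execution.state)  # e.g. ... State.ACTIVE
```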
Source: Google Cloud Platform

Docker and AWS Resources for Developers

AWS re:Invent kicks off this week, and if you're anything like us, you're super geeked out to watch and attend all the talks lined up for the next three weeks.

To get ready for re:Invent, we’ve gathered some of our best resources and expert guidance to get the most out of the Docker platform when building apps for AWS. Check out these blogs, webinars and DockTalks from the past few weeks to augment your re:Invent experience over the next three weeks:

Expert Guidance from the Docker Team

Blogs

Docker Compose for Amazon ECS Now Available: Excellent blog post written by Docker Product Manager Ben De St Paer-Gotch (@Nebuk89) about how to get started with Docker Compose and ECS.

Deploying WordPress to the Cloud: Another excellent blog post by Ben De St Paer-Gotch on how to configure and deploy a full-blown WordPress instance to AWS ECS using the Docker CLI.

AWS Howdy Partner

AWS Howdy Partner Twitch Show: Back in July, I (@pmckee) was a guest on the AWS Howdy Partner show hosted on Twitch. Follow along as we walk through deploying a multi-container application to AWS ECS using the Docker CLI.

Webinar

From Docker Straight to AWS: Deploying from Docker straight to AWS with your existing workflow has never been easier. In this webinar, Chad Metcalf of Docker and Carmen Puccio of AWS do a hands-on walkthrough of how you can get started today.

DockTalks

DockTalk Q&A: From Docker Straight to AWS: Chad Metcalf and I welcomed Jonah Jones from AWS to the show to help answer all the questions you might have around using the Docker integrations with AWS ECS.

YouTube

Docker Hub Video Series: Learn how to take advantage of basic and advanced features of Docker Hub. In this series, you'll learn how to configure and set up your Organization and Teams to help maximize collaboration and development using containers.

Source: https://blog.docker.com/feed/

Amazon Aurora PostgreSQL patches 1.7.6 / 2.5.6 / 3.2.6 now available

Patches 1.7.6 / 2.5.6 / 3.2.6 are now available for customers using Amazon Aurora PostgreSQL. You can find detailed release notes in our version documentation. You can apply the new patch version in the AWS Management Console, via the AWS CLI, or via the RDS API. For detailed instructions, refer to our technical documentation.
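For the API route, here is a hedged sketch using boto3; the cluster ARN and region are placeholders, and engine patches typically surface as pending maintenance actions on the cluster.

```python
# Hedged sketch: apply a pending Aurora maintenance action with boto3.
# The cluster ARN is a placeholder; 'system-update' covers engine patches.
import boto3

rds = boto3.client("rds", region_name="eu-central-1")

response = rds.apply_pending_maintenance_action(
    ResourceIdentifier="arn:aws:rds:eu-central-1:123456789012:cluster:my-aurora-cluster",
    ApplyAction="system-update",
    OptInType="immediate",  # or "next-maintenance" to defer to the window
)
print(response["ResourcePendingMaintenanceActions"])
```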
Source: aws.amazon.com

Amazon SageMaker Studio now supports multi-GPU instances

We are excited to announce that Amazon SageMaker Studio now supports the multi-GPU instance sizes ml.g4dn.12xlarge, ml.p3.8xlarge, and ml.p3.16xlarge. Multi-GPU instances significantly accelerate the training of machine learning models, enabling users to train more advanced models that are too large for a single GPU. They also offer the flexibility to process larger data batches, such as 4K images for image classification and object detection.
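To picture what this enables, here is a hedged sketch of launching a training job on one of these instance types with the SageMaker Python SDK from a Studio notebook; the training script, S3 URI, and framework version are placeholder assumptions.

```python
# Hedged sketch: launch a training job on one of the newly supported
# multi-GPU instance types from a SageMaker Studio notebook.
import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()  # Studio resolves the attached IAM role

estimator = PyTorch(
    entry_point="train.py",         # your training script (placeholder)
    role=role,
    instance_count=1,
    instance_type="ml.p3.8xlarge",  # 4 x V100 GPUs on a single instance
    framework_version="1.6.0",      # placeholder framework version
    py_version="py3",
)
estimator.fit({"training": "s3://my-bucket/train"})  # placeholder S3 URI
```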
Source: aws.amazon.com

Introducing Amazon CloudFront in Thailand

Amazon CloudFront announces its first two edge locations in Thailand. These new edge locations in Bangkok will offer viewers up to a 30% reduction in p90 latency measurements. Pricing for these new edge locations falls within CloudFront's Asia-Pacific geographic region. To learn more about CloudFront's global infrastructure, see Amazon CloudFront Infrastructure.
Source: aws.amazon.com