
In the fragmented world of U.S. healthcare, patients often have to wait in line or on hold, navigate multiple patient portals, and fill out numerous request forms—all in pursuit of their own medical history. Healthcare technology startup PicnicHealth is on a mission to put control back with the patient, where it belongs. PicnicHealth’s growth, from closing successful venture rounds to winning machine learning (ML) competitions, speaks to not only improvements and opportunities in healthcare, but also how startups are leveraging Google Workspace and Google Cloud services to accelerate momentum. The company does the heavy lifting of collecting records and leverages human-in-the-loop ML to transcribe and validate them with an abstraction team of medical professionals. The records are then structured into a complete medical history that patients can access and share with providers to get better care. But PicnicHealth helps to improve patient health on more than one front. It allows patients to contribute their data to de-identified medical research, building high-quality, anonymized datasets that researchers and life sciences companies can use to better understand disease progression and treatment in the real world. Google Workspace has been part of PicnicHealth from day one, helping the founders collaborate and shape the company’s vision using cloud-synced documents and spreadsheets to collaborate and model predictions. “I’ve had a Gmail account since 2006, and in college 100% of people worked out of Google Docs. Workspace continues to be the best choice for online collaboration, and that’s why it’s still the default standard for startups,” said Troy Astorino, CoFounder & CTO of PicnicHealth. “When we started PicnicHealth, my co-founder Noga Leviner was in San Francisco and I was in Southern California, and of course we used Workspace.” Today, Google Workspace continues to play a central role in the company’s collaboration. 
“We create design documents in Google Docs, primarily for engineering and product changes, and get really healthy, vibrant discussions through comments,” noted Astorino. “This practice has grown beyond engineering and is used for everything from how the company operates to communication norms.” Google Workspace offers everything the team needs to collaborate, no matter where employees are. PicnicHealth’s team has spread from San Francisco across the country and around the world, so instant collaboration is crucial. “We work in a complex domain where people need a lot of information to make good decisions,” said Astorino. “Google Workspace allows us to operate in a mode of default transparency, where people can easily get the information they need even if it wasn’t intentionally or directly shared with them. Whether it’s working in Docs or scheduling in Calendar, we can operate much more effectively than we could otherwise.”
By any measure, PicnicHealth’s trajectory is one of record success. The startup is a 2014 alumnus of Y Combinator, a program that helped launch household names like Airbnb, DoorDash, and Dropbox. Three years later, the team went on to win the $1 million grand prize at Google’s Machine Learning Competition. And the momentum has continued: PicnicHealth recently announced a $60 million Series C round, bringing the total raised to date to over $100 million. With the Series C, PicnicHealth is investing in expanding its reach to more patients across more than 30 diseases.
As a healthcare startup, PicnicHealth faced a very particular set of challenges, especially when working with and accessing data. Data fragmentation and interoperability are only some of the challenges of realizing the value of big data in the cloud. The healthcare industry is notoriously difficult to navigate due to sensitive data protection laws and regulations like the Health Insurance Portability and Accountability Act (HIPAA).
PicnicHealth started in the cloud on Amazon Web Services (AWS). However, after migrating over to Kubernetes and facing an expanding list of requirements for HIPAA compliance, the company started to explore alternatives. “We needed to be HIPAA compliant, which was going to be painful on AWS, and we wanted to get away from managing and operating our own Kubernetes clusters,” recalled Astorino. “We had heard good things about GKE (Google Kubernetes Engine). And, particularly valuable for us, many technical requirements you need for HIPAA compliance are configured by default on Google Cloud.” PicnicHealth would have had to implement a lot of changes and get specialized instance types to get their existing configuration to work. So, they began experimenting with Google Cloud and discovered a much smoother experience. “It was a lot easier to manage in terms of product setup and developer experience,” said Astorino. “There is a sane product hierarchy of resources you can access and use through Google Cloud and the relationships between them, from coordinated IAM (identity and access management) to using Google Groups for granting permissions. Overall, it’s cleaner.”
Astorino added that the move has also opened the doors to taking advantage of other services in the Google Cloud ecosystem like Cloud SQL, BigQuery, and Cloud Composer. PicnicHealth also uses Security Command Center because it integrates easily with everything and also helps meet various compliance frameworks’ requirements, providing visibility, near-real-time asset discovery, and security information and event management. But most importantly, the integrated ecosystem has simplified the work needed for PicnicHealth to create a secure environment for employees to use when working with sensitive medical records while still providing all the tools they need. For example, abstractors not only use Google Workspace but also have Chromebooks because they are easy to manage and secure.
Altogether, Google Cloud helps form a technology stack that has enabled the startup to build a massive labeled dataset containing over 100 million labeled medical data concepts. In turn, it accelerates PicnicHealth’s ability to generate highly performant AI models and feed other ML pipelines, which has been vital for processing and reviewing data at scale.
To learn more about how Google Workspace and Google Cloud help startups like PicnicHealth accelerate their journey, visit our startups solutions pages for Google Workspace and Google Cloud.
Source: Google Cloud Platform

July 27, 2022

da Agency

New Google Cloud Marketplace Private Offers features to help our partners grow

As we shared at the beginning of the year, we are making significant investments in Google Cloud Marketplace to accelerate growth for our customers and partners. This includes new technical capabilities that provide the purchasing flexibility and choice our enterprise customers need when buying software from Google Cloud partners through Marketplace.
Private Offers are now more flexible than ever
Today, we are excited to announce that Private Offers in Google Cloud Marketplace is now generally available. With these new and expanded deal-making capabilities, Google Cloud partners can help our shared customers buy the way they want. All Marketplace partners now have more options to further customize pricing, payment schedules, and terms for privately negotiated Google Cloud Marketplace deals, including:
Support across product types: SaaS, Virtual Machine, and Kubernetes products can now be purchased via Private Offers.
Expanded subscription and discounting models: New committed use discounts (CUD) and enhanced flat fee and flat fee with usage experiences can better support your business model.
Flexible payment schedule and contract duration options: Align your Private Offers to how our mutual customers want to buy with prepay or postpay timing options and a choice of contract periods.
Prepay installments functionality: Allow customers to make multiple prepay payments of equal or increasing amounts over the course of the contract to align with when they want to pay.
Deal-specific terms: Upload pre-existing or customized license agreements to each offer, enabling customers to leverage deal-specific terms of service and accelerate the purchase process by reducing redlining.
Private Offer amendment and extension: Support for renewals, expanding existing deals, updating customer plans, and launching new product features.
Offering these capabilities is an important step forward in helping our partners grow their business on Google Cloud.
As Kathy Barboza, NetApp’s Worldwide Head of Google Cloud Sales Specialist, says, Private Offers open new and expanded opportunities, helping us better serve our customers together: “NetApp and Google Cloud have partnered to meet our customers’ unique needs through Private Offers on Google Marketplace and are collaborating to establish long term relationships, growth, and revenue. The partnership provides our joint customers with the ability to anticipate budgets along with the flexibility to address their business-critical requirements as they navigate digital transformation.”
And these new capabilities are on top of the existing customer benefits that accelerate deals transacted through Google Cloud Marketplace:
Buyers can leverage their existing agreement with Google Cloud for Marketplace purchases, simplifying procurement for quicker deployment and time to value.
Customers can decrement their committed spend through Marketplace transactions, which maximizes their cost savings and helps them spend smartly across first and third-party solutions.
All Marketplace purchases show up on one bill from Google, allowing customers to easily analyze and manage spend.
Check out the Marketplace Partner Fundamentals within Partner Advantage for more on the benefits of Google Cloud Marketplace for your business and customers.
This is a major step forward in helping customers solve business challenges more quickly and driving additional growth for our partner ecosystem. Google Cloud Marketplace was already the fastest way to show up to Google Cloud customers in-product worldwide. Now transacting and growing large, customized enterprise deals is easier and more flexible than ever.
Simple Private Offer configuration
Ready to grow your business on Google Cloud with Private Offers?
Let’s walk through the guided creation flow.
As a prerequisite, you’ll need to publish a transactable listing on Google Cloud Marketplace.
Once your product is published, customers can request a negotiated deal by reaching out to you directly within Marketplace, through their Google seller, or via an existing engagement. While confirming pricing, terms, and payment schedules with the customer offline, you can start creating a new Private Offer in Google Cloud Marketplace > Producer Portal and selecting the relevant transactable SaaS, VM, or Kubernetes product and plan that will support one of three subscription models per deal:
For SaaS, VM, and Kubernetes products with Usage Only pricing models, you can provide your customer with a committed use discount (CUD) subscription. The customer commits to spending a certain amount to use the product and receives a discount based on this commitment. They can apply this commitment flexibly towards different resources of the product.
For SaaS products specifically, there are two additional subscription models available:
Flat fee: your customer pays a set subscription fee for a specified quantity of software features.
Flat fee with usage: your customer pays a fee to use the software, including access to features in specific quantities. Customers pay an additional fee for resource usage incurred beyond what’s included in the flat fee.
The three types of Private Offer subscription models are committed use discount (CUD), flat fee, and flat fee with usage.
After selecting the product and plan, you’ll enter the recipient details for the customer or the Google Cloud reseller if this offer is being resold. Enter the recipient’s Billing Account ID, which they can learn more about identifying here. A Private Offer will apply to all projects assigned to their billing account. You’ll then provide a sales contact at your organization that the customer can reach out to if they have any questions on this Private Offer.
You can also add notes that your organization will see in the Private Offer dashboard. We’ve seen partners use this for order numbers, procurement IDs, or other CRM IDs to track deals within tools they use internally.
Next, you’ll select a payment schedule and the discounted pricing that you’re providing the customer. A postpay schedule will bill the customer monthly, while a prepay schedule (which many larger organizations prefer to help manage cloud spend) allows you to configure an installment schedule. Each installment can be up to a year in length, and each must be equal to or greater in value than the previous one. You’ll also indicate a contract duration and an offer acceptance deadline that can be up to 3 months from the creation date. For postpay schedules, you can also select whether the customer can automatically renew this order at the end of the contract duration.
Now, select the software license terms you want the customer to agree to for your solution. You can use Google’s standard end-user license agreement (EULA), or you can upload and name a custom deal-specific one. We see most Private Offers using the standard EULA, but you may want to provide custom terms in certain scenarios. For instance, reusing previously agreed-to terms with an existing customer could skip redundant legal reviews, saving you and your customer time.
Now you’re ready to review the details for accuracy and preview the customer view of the deal. When everything looks great, generate a link to the Private Offer that you can send for your customer to review and accept.
By the way, don’t worry about future-proofing your offer now. We’ve built in plenty of flexibility to support growth in customer usage and renewals.
Partners will also be able to amend existing offers as their customers’ needs grow, including:
Modifying installment contracts to upsell and upgrade
Adding future installments and editing unpaid installments
Extending contract durations
Offering new features that are launched on existing plans
Once configured, review the offer for accuracy and click Generate URL to send it to the customer or reseller.
Learn more about how to leverage this new feature set in the Private Offer documentation.
We’re excited to offer these new Private Offer features that provide you and our mutual customers with greater deal-making flexibility than ever in Google Cloud Marketplace. Stay tuned as we continue to invest in our partner ecosystem to unlock further opportunities that accelerate our customers’ digital transformation. See you in the cloud.
Source: Google Cloud Platform

July 27, 2022

da Agency

Unify data lakes and warehouses with BigLake, now generally available

Data continues to grow in volume and is increasingly distributed across lakes, warehouses, clouds, and file formats. As more users demand more use cases, the traditional approach of building data movement infrastructure is proving difficult to scale. Unlocking the full potential of data requires breaking down these silos, and is increasingly a top priority for enterprises.
Earlier this year, we previewed BigLake, a storage engine that extends innovations in BigQuery storage to open file formats running on public cloud object stores. This allows customers to build secure multi-cloud data lakes over open file formats. BigLake provides consistent, fine-grained security controls for Google Cloud and open-source query engines to interact with data. Today, we are excited to announce General Availability for BigLake, and a set of new capabilities to help you build a differentiated data platform.
“We are using GCP to build and extend one of the street’s largest risk systems. During several tests we have seen the great potential and scale of BigLake. It is one of the products that could support our cloud journey and drive application’s future efficiency” – Scott Condit, Director, Risk CTO Deutsche Bank.
Build a distributed data lake that spans across warehouses, object stores & clouds with BigLake
Customers can create BigLake tables on Google Cloud Storage (GCS), Amazon S3, and ADLS Gen 2 over supported open file formats, such as Parquet, ORC, and Avro. BigLake tables are a new type of external table that can be managed similarly to data warehouse tables. Administrators do not need to grant end users access to files in object stores, but instead manage access at a table, row, or column level. These tables can be created from a query engine of your choice, such as BigQuery or open-source engines using the BigLake connector. Once these tables are created, BigLake and BigQuery tables can be centrally discovered in the data catalog and managed at scale using Dataplex.
BigLake extends the BigQuery storage API to object stores to help you build a multi-compute architecture. BigLake connectors are built on the BigQuery storage API and enable Google Cloud Dataflow and open-source query engines (such as Spark, Trino, Presto, Hive) to query BigLake tables while enforcing security. This eliminates the need to move data for each query-engine-specific use case; security only needs to be configured in one place and is enforced everywhere.
“We are using GCP to design datalake solutions for our customers and transform their digital strategy to create a data-driven enterprise. BigLake has been critical for our customers to quickly realize the value of analytical solutions by reducing the need to build ETL pipelines and cutting-down time-to-market. The performance & governance features of BigLake enabled a variety of data lake use cases for our customers.” – Sureet Bhurat, Founding Board member – Synapse LLC
BigLake unlocks new use cases using Google Cloud and OSS Query engines
During the preview, we saw a large number of customers use BigLake in various ways. Some of the top use cases include:
Building secure and governed data lakes for open-source workloads – Workloads migrating from Hadoop, Spark-first customers, or those using Presto/Trino, can now use BigLake to build secure, governed, and performant data lakes on GCS. BigLake tables on GCS provide fine-grained security, table management (vs giving access to files), better query performance, and integrated governance with Dataplex. These characteristics are accessible across multiple OSS query engines when using the BigLake connectors.
“To support our data driven organization, Wizard needs a data lake solution that leverages open file formats and can grow to meet our needs. BigLake allows us to build and query on open file formats, scales to meet our needs, and accelerates our insight discovery.
We look forward to expanding our use cases with future BigLake features” – Rich Archer, Senior Data Engineer – Wizard
Eliminate or reduce data duplication across data warehouses and lakes – Customers who use GCS and BigQuery managed storage previously had to create two copies of data to support users using BigQuery and OSS engines. BigLake makes the GCS tables more consistent with BigQuery tables, reducing the need to duplicate data. Instead, customers can now keep a single copy of data split across BigQuery storage and GCS, and data can be accessed by BigQuery or OSS engines in either place in a consistent, secure manner.
Fine-grained security for multi-cloud use cases – BigQuery Omni customers can now use BigLake tables on Amazon S3 and ADLS Gen 2 to configure fine-grained security access control, and take advantage of localized data processing and cross-cloud transfer capabilities to do multi-cloud analytics. Tables created on other clouds are centrally discoverable in Data Catalog for ease of management & governance.
Interoperability between analytics and data science workloads – Data science workloads, using either Spark or Vertex AI notebooks, can now directly access data in BigQuery or GCS through the API connector, enforcing security & eliminating the need to import data for training models. For BigQuery customers, these models can be imported back into BigQuery ML to produce inferences.
Build a differentiated data platform with new BigLake capabilities
We are also excited to announce new capabilities as part of this General Availability launch. These include:
Analytics Hub support: Customers can now share BigLake tables on GCS with partners, vendors or suppliers as linked data sets.
Consumers can access this data in place through the query engine of their choice (BigQuery, Spark, Presto, Trino, TensorFlow).
BigLake tables are now the default table type in BigQuery Omni, upgraded from the previous default of external tables.
BigQuery ML support: BigQuery customers can now train their models on GCS BigLake tables using BigQuery ML, without needing to import data, and with data access governed by the access policies on the table.
Performance acceleration (preview): Queries for GCS BigLake tables can now be accelerated using the underlying BigQuery infrastructure. If you would like to use this feature please get in touch with your account team or fill out this form.
Cloud Data Loss Prevention (DLP) profiling support (coming soon): Cloud DLP can soon scan BigLake tables to identify and protect sensitive data at scale. If you would like to use this feature please get in touch with your account team or fill out this form.
Data masking and audit logging (coming soon): BigLake tables now support dynamic data masking, enabling you to mask sensitive data elements to meet compliance needs. End user query requests to GCS for BigLake tables are now audit logged and are available to query via logs.
Next steps
Refer to BigLake documentation to learn more, or get started with this quick start tutorial. If you are already using external tables today, consider upgrading them to BigLake tables to take advantage of the above-mentioned new features. For more information, reach out to the Google Cloud account team to see how BigLake can add value to your data platform.
Special mention to Anoop Johnson, Thibaud Hottelier, Yuri Volobuev and the rest of the BigLake engineering team for making this launch possible.
Source: Google Cloud Platform

July 27, 2022

da Agency

Azure empowers easy-to-use, high-performance, and hyperscale model training using DeepSpeed

This blog was written in collaboration with the DeepSpeed team, the Azure ML team, and the Azure HPC team at Microsoft.

Large-scale transformer-based deep learning models trained on large amounts of data have shown great results in recent years in several cognitive tasks and are behind new products and features that augment human capabilities. These models have grown several orders of magnitude in size during the last five years, from the few million parameters of the original transformer model all the way to the latest 530 billion-parameter Megatron-Turing (MT-NLG 530B) model, as shown in Figure 1. There is a growing need for customers to train and fine-tune large models at an unprecedented scale.

Figure 1: Landscape of large models and hardware capabilities.

Azure Machine Learning (AzureML) brings large fleets of the latest GPUs powered by the InfiniBand interconnect to tackle large-scale AI training. We already train some of the largest models including Megatron/Turing and GPT-3 on Azure. Previously, to train these models, users needed to set up and maintain a complex distributed training infrastructure that usually required several manual and error-prone steps. This led to a subpar experience both in terms of usability and performance.

Today, we are proud to announce a breakthrough in our software stack, using DeepSpeed and 1024 A100s to scale the training of a 2T parameter model with a streamlined user experience at 1K+ GPU scale. We are bringing these software innovations to you through AzureML (including a fully optimized PyTorch environment) that offers great performance and an easy-to-use interface for large-scale training.

Customers can now use DeepSpeed on Azure with simple-to-use training pipelines, using either the recommended AzureML recipes or bash scripts for VMSS-based environments. As shown in Figure 2, Microsoft is taking a full stack optimization approach where all the necessary pieces including the hardware, the OS, the VM image, the Docker image (containing optimized PyTorch, DeepSpeed, ONNX Runtime, and other Python packages), and the user-facing Azure ML APIs have been optimized, integrated, and well-tested for excellent performance and scalability without unnecessary complexity.

Figure 2: Microsoft full-stack optimizations for scalable distributed training on Azure.

This optimized stack enabled us to efficiently scale training of large models using DeepSpeed on Azure. We are happy to share our performance results supporting 2x larger model sizes (2 trillion vs. 1 trillion parameters), scaling to 2x more GPUs (1024 vs. 512), and up to 1.8x higher compute throughput/GPU (150 TFLOPs vs. 81 TFLOPs) compared to those published on other cloud providers.

We offer near-linear scalability both in terms of an increase in model size as well as increase in number of GPUs. As shown in Figure 3a, together with the DeepSpeed ZeRO-3, its novel CPU offloading capabilities, and a high-performance Azure stack powered by InfiniBand interconnects and A100 GPUs, we were able to maintain an efficient throughput/GPU (>157 TFLOPs) in a near-linear fashion as the model size increased from 175 billion parameters to 2 trillion parameters. On the other hand, for a given model size, for example, 175B, we achieve near-linear scaling as we increase the number of GPUs from 128 all the way to 1024 as shown in Figure 3b. The key takeaway from the results presented in this blog is that Azure and DeepSpeed together are breaking the GPU memory wall and enabling our customers to easily and efficiently train trillion-parameter models at scale.


Figure 3: (a) Near-perfect throughput/GPU as we increase the model size from 175 billion to 2 trillion parameters (BS/GPU=8), (b) Near-perfect performance scaling with the increase in number of GPU devices for the 175B model (BS/GPU=16). The sequence length is 1024 for both cases.

Learn more

To learn more about the optimizations, technologies, and detailed performance trends presented above, please refer to our extended technical blog.

Learn more about DeepSpeed, which is part of Microsoft’s AI at Scale initiative.
Learn more about Azure HPC + AI.
To get started with DeepSpeed on Azure, please follow our getting started tutorial.
The results presented in this blog were produced on Azure by following the recipes and scripts published as part of the Megatron-DeepSpeed repository. The recommended and easiest way to run the training experiments is to use the AzureML recipe.
If you are running experiments on a custom environment built using Azure VMs or VMSS, please refer to the bash scripts we provide in Megatron-DeepSpeed.

Source: Azure

July 27, 2022

da Agency

How to Build and Deploy a Task Management Application Using Go

Go (Golang) is designed to let developers rapidly build scalable and secure web applications. It ships with an easy-to-use, secure, and performant web server alongside its own web templating library. Enterprise users also leverage the language for rapid, cross-platform deployment. With its goroutines, native compilation, and URI-based package namespacing, Go code compiles to a single, small binary with zero dependencies, which makes it very fast.
Developers also favor Go’s performance, which stems from its concurrency model and CPU scalability. Whenever developers need to process an internal request, they use separate goroutines, which consume just one-tenth of the resources that Python threads do. Via static linking, Go actually combines all dependency libraries and modules into a single binary file based on OS and architecture.
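To make the goroutine-per-request idea concrete, here is a minimal sketch using only the standard library. The handleAll function and its string "requests" are hypothetical stand-ins for illustration and are not part of the Gopher project.

```go
package main

import (
	"fmt"
	"sync"
)

// handleAll processes each request in its own goroutine and returns the
// results in input order. The function and its inputs are illustrative only.
func handleAll(requests []string) []string {
	results := make([]string, len(requests))
	var wg sync.WaitGroup
	for i, req := range requests {
		wg.Add(1)
		go func(i int, req string) {
			defer wg.Done()
			// Each goroutine writes to its own slot, so no mutex is needed.
			results[i] = "handled " + req
		}(i, req)
	}
	wg.Wait() // block until every goroutine has finished
	return results
}

func main() {
	for _, r := range handleAll([]string{"a", "b", "c"}) {
		fmt.Println(r)
	}
}
```

The sync.WaitGroup plays the same role here that the done channel plays later in this tutorial: the caller does not return until all concurrent work has reported completion.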
Why is containerizing your Go application important?
Go binaries are small and self-contained executables. However, your application code inevitably grows over time as it’s adapted for additional programs and web applications. These apps may ship with templates, assets, and database configuration files. This raises the risk of files getting out of sync, encountering dependency hell, and pushing faulty deployments.
Containers let you synchronize these files with your binary. They also help you create a single deployable unit for your complete application. This includes the code (or binary), the runtime, and its system tools or libraries. Finally, they let you code and test locally while ensuring consistency between development and production.
We’ll walk through our Go application setup, and discuss the Docker SDK’s role during containerization.
Table of Contents

Building the Application
Key Components
Getting Started
Define a Task
Create a Task Runner
Container Manager
Sequence Diagram
Conclusion

Building the Application
In this tutorial, you’ll learn how to build a basic task system (Gopher) using Go.
First, we’ll create a system in Go that uses Docker to run its tasks. Next, we’ll build a Docker image for our application. This example will demonstrate how the Docker SDK helps you build cool projects. Let’s get started.
Key Components

Go Docker SDK

Microsoft Visual Studio Code

Docker Desktop

Getting Started
Before getting started, you’ll need to install Go on your system. Once you’ve finished up, follow these steps to build a basic task management system with the Docker SDK.
Here’s the directory structure that we’ll have at the end:
➜ tree gopher
gopher
├── go.mod
├── go.sum
├── internal
│   ├── container-manager
│   │   └── container_manager.go
│   ├── task-runner
│   │   └── runner.go
│   └── types
│       └── task.go
├── main.go
└── task.yaml

4 directories, 7 files

You can click here to access the complete source code developed for this example. This guide highlights important snippets, but the full code isn’t documented throughout.

Define a Task
First and foremost, we need to define our task structure. This task is going to be a YAML definition with the following structure:
version: v0.0.1
tasks:
  - name: hello-gopher
    runner: busybox
    command: ["echo", "Hello, Gopher!"]
    cleanup: false
  - name: gopher-loops
    runner: busybox
    command:
      [
        "sh",
        "-c",
        "for i in `seq 0 5`; do echo 'gopher is working'; sleep 1; done",
      ]
    cleanup: false

The following table describes the task definition:

Now that we have a task definition, let’s create some equivalent Go structs.
Structs in Go are typed collections of fields. They’re useful for grouping data together to form records. For example, this Task struct type has Name, Runner, Command, and Cleanup fields.
// internal/types/task.go

package types

// TaskDefinition represents a task definition document.
type TaskDefinition struct {
	Version string `yaml:"version,omitempty"`
	Tasks   []Task `yaml:"tasks,omitempty"`
}

// Task provides a task definition for gopher.
type Task struct {
	Name    string   `yaml:"name,omitempty"`
	Runner  string   `yaml:"runner,omitempty"`
	Command []string `yaml:"command,omitempty"`
	Cleanup bool     `yaml:"cleanup,omitempty"`
}
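Before handing a definition to a runner, it can help to reject incomplete tasks early. The following is a minimal sketch under stated assumptions: the validate helper is hypothetical and not part of the original project, and the structs are copied here without their yaml tags so the example needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
)

// Task and TaskDefinition mirror the structs from internal/types/task.go
// (yaml tags omitted to keep this sketch dependency-free).
type Task struct {
	Name    string
	Runner  string
	Command []string
	Cleanup bool
}

type TaskDefinition struct {
	Version string
	Tasks   []Task
}

// validate is a hypothetical helper, not part of the original project.
// It rejects definitions with no tasks and tasks that lack a name,
// a runner image, or a command.
func validate(def TaskDefinition) error {
	if len(def.Tasks) == 0 {
		return errors.New("task definition contains no tasks")
	}
	for i, t := range def.Tasks {
		switch {
		case t.Name == "":
			return fmt.Errorf("task %d: missing name", i)
		case t.Runner == "":
			return fmt.Errorf("task %q: missing runner", t.Name)
		case len(t.Command) == 0:
			return fmt.Errorf("task %q: missing command", t.Name)
		}
	}
	return nil
}

func main() {
	// The Go equivalent of the first task in task.yaml.
	def := TaskDefinition{
		Version: "v0.0.1",
		Tasks: []Task{
			{Name: "hello-gopher", Runner: "busybox", Command: []string{"echo", "Hello, Gopher!"}},
		},
	}
	if err := validate(def); err != nil {
		fmt.Println("invalid task definition:", err)
		return
	}
	fmt.Println("task definition is valid")
}
```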

Create a Task Runner
The next thing we need is a component that can run our tasks for us. We’ll use interfaces for this, which are named collections of method signatures. For this example task runner, we’ll simply call it Runner and define it below:

// internal/task-runner/runner.go

type Runner interface {
	Run(ctx context.Context, doneCh chan<- bool)
}

Note that we’re using a done channel (doneCh). It lets the caller start Run in its own goroutine, so the tasks run asynchronously, and it notifies the caller once the work is complete.
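The done-channel pattern can be tried in isolation, without Docker. The sketch below is an illustration only: sleepRunner, runAndWait, and demo are hypothetical names, and the "work" is simulated with a timer rather than a container.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Runner matches the interface defined above.
type Runner interface {
	Run(ctx context.Context, doneCh chan<- bool)
}

// sleepRunner is a hypothetical stand-in that pretends to work for a while.
type sleepRunner struct {
	work time.Duration
}

func (s sleepRunner) Run(ctx context.Context, doneCh chan<- bool) {
	select {
	case <-time.After(s.work): // simulated work finished
	case <-ctx.Done(): // caller cancelled the context early
	}
	doneCh <- true
}

// runAndWait starts r in its own goroutine and blocks until it signals.
func runAndWait(ctx context.Context, r Runner) bool {
	doneCh := make(chan bool)
	go r.Run(ctx, doneCh)
	return <-doneCh
}

// demo runs a short simulated job and reports whether the signal arrived.
func demo() bool {
	return runAndWait(context.Background(), sleepRunner{work: 10 * time.Millisecond})
}

func main() {
	fmt.Println("runner finished:", demo())
}
```

Because doneCh is unbuffered, the send inside Run only completes once the caller is receiving, so the caller cannot miss the signal.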
You can find your task runner’s complete definition here. In this example, however, we’ll stick to highlighting specific pieces of code:

// internal/task-runner/runner.go

func NewRunner(def types.TaskDefinition) (Runner, error) {
	client, err := initDockerClient()
	if err != nil {
		return nil, err
	}

	return &runner{
		def:              def,
		containerManager: cm.NewContainerManager(client),
	}, nil
}

func initDockerClient() (cm.DockerClient, error) {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		return nil, err
	}

	return cli, nil
}

The NewRunner function returns an instance of the runner struct, which implements the Runner interface. The instance also holds a connection to the Docker Engine. The initDockerClient function initializes this connection by creating a Docker API client instance from environment variables.
By default, this function creates an HTTP connection over a Unix socket, unix:///var/run/docker.sock (the default Docker host). If you’d like to change the host, you can set the DOCKER_HOST environment variable. The FromEnv option reads the environment variable and configures the client accordingly.
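The lookup order described above can be pictured with a few lines of standard-library Go. This is only a sketch of the idea, not the SDK's implementation: resolveDockerHost and defaultDockerHost are hypothetical names, and the default shown is the Unix socket path mentioned in the text.

```go
package main

import (
	"fmt"
	"os"
)

// defaultDockerHost is the Unix socket address mentioned in the text.
const defaultDockerHost = "unix:///var/run/docker.sock"

// resolveDockerHost is a hypothetical helper that mimics the lookup order
// described above: use DOCKER_HOST when it is set, otherwise fall back to
// the default socket. getenv is injected so the function can be exercised
// without touching the real environment.
func resolveDockerHost(getenv func(string) string) string {
	if host := getenv("DOCKER_HOST"); host != "" {
		return host
	}
	return defaultDockerHost
}

func main() {
	fmt.Println("docker host:", resolveDockerHost(os.Getenv))
}
```

Running it with, for example, DOCKER_HOST=tcp://127.0.0.1:2375 set in the shell prints that address instead of the default socket.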
The Run function defined below is relatively basic. It loops over a list of tasks and executes them. It also uses a channel named taskDoneCh to see when a task completes. It’s important to check if we’ve received a done signal from all the tasks before we return from this function.

// internal/task-runner/runner.go

func (r *runner) Run(ctx context.Context, doneCh chan<- bool) {
	taskDoneCh := make(chan bool)
	for _, task := range r.def.Tasks {
		go r.run(ctx, task, taskDoneCh)
	}

	// wait for a done signal from every task; the condition is checked
	// first so an empty task list does not block forever
	taskCompleted := 0
	for taskCompleted < len(r.def.Tasks) {
		if <-taskDoneCh {
			taskCompleted++
		}
	}

	doneCh <- true
}

func (r *runner) run(ctx context.Context, task types.Task, taskDoneCh chan<- bool) {
	defer func() {
		taskDoneCh <- true
	}()

	fmt.Println("preparing task – ", task.Name)
	if err := r.containerManager.PullImage(ctx, task.Runner); err != nil {
		fmt.Println(err)
		return
	}

	id, err := r.containerManager.CreateContainer(ctx, task)
	if err != nil {
		fmt.Println(err)
		return
	}

	fmt.Println("starting task – ", task.Name)
	err = r.containerManager.StartContainer(ctx, id)
	if err != nil {
		fmt.Println(err)
		return
	}

	statusSuccess, err := r.containerManager.WaitForContainer(ctx, id)
	if err != nil {
		fmt.Println(err)
		return
	}

	if statusSuccess {
		fmt.Println("completed task – ", task.Name)

		// cleanup by removing the task container
		if task.Cleanup {
			fmt.Println("cleanup task – ", task.Name)
			err = r.containerManager.RemoveContainer(ctx, id)
			if err != nil {
				fmt.Println(err)
			}
		}
	} else {
		fmt.Println("failed task – ", task.Name)
	}
}

The internal run function does the heavy lifting for the runner. It accepts a task and transforms it into a Docker container. A ContainerManager executes a task in the form of a Docker container.
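The runner waits for its tasks by counting signals on a channel. A common alternative for the same fan-out-and-wait shape is sync.WaitGroup from the standard library. The sketch below shows that alternative and is not how the example project does it: runAll and the work callback are hypothetical, and the per-task work is a placeholder for the pull, create, start, and wait steps.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll starts one goroutine per task name and waits for all of them.
// work is a placeholder for the real per-task logic.
// It returns how many tasks ran to completion.
func runAll(tasks []string, work func(name string)) int {
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		completed int
	)
	for _, name := range tasks {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			work(name)
			mu.Lock() // completed is shared between goroutines
			completed++
			mu.Unlock()
		}(name)
	}
	wg.Wait() // returns immediately when tasks is empty
	return completed
}

func main() {
	tasks := []string{"hello-gopher", "gopher-loops"}
	n := runAll(tasks, func(name string) {
		fmt.Println("running task -", name)
	})
	fmt.Println("tasks completed:", n)
}
```

With either approach the tasks run concurrently, so the order in which their log lines appear is not guaranteed.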
Container Manager
The container manager is responsible for:

Pulling a Docker image for a task

Creating the task container

Starting the task container

Waiting for the container to complete

Removing the container, if required

Therefore, with respect to Go, we can define our container manager as shown below:
// internal/container-manager/container_manager.go

type ContainerManager interface {
	PullImage(ctx context.Context, image string) error
	CreateContainer(ctx context.Context, task types.Task) (string, error)
	StartContainer(ctx context.Context, id string) error
	WaitForContainer(ctx context.Context, id string) (bool, error)
	RemoveContainer(ctx context.Context, id string) error
}

type DockerClient interface {
	client.ImageAPIClient
	client.ContainerAPIClient
}

type ImagePullStatus struct {
	Status         string `json:"status"`
	Error          string `json:"error"`
	Progress       string `json:"progress"`
	ProgressDetail struct {
		Current int `json:"current"`
		Total   int `json:"total"`
	} `json:"progressDetail"`
}

type containermanager struct {
	cli DockerClient
}

The containermanager struct has a field called cli with a DockerClient type. The DockerClient interface in turn embeds two interfaces from the Docker API, namely ImageAPIClient and ContainerAPIClient. Why do we need these interfaces?
For the ContainerManager interface to work properly, it must act as a client for the Docker Engine and API. For the client to work effectively with images and containers, it must be a type which provides required APIs. We need to embed the Docker API’s core interfaces and create a new one.
The initDockerClient function (seen above in runner.go) returns an instance that seamlessly implements those required interfaces. Check out the documentation here to better understand what’s returned upon creating a Docker client.
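Interface embedding is easier to see with small interfaces. In the sketch below, imageAPI and containerAPI are hypothetical, two-method stand-ins for the SDK's much larger ImageAPIClient and ContainerAPIClient, and fakeClient is an in-memory substitute. It illustrates a side benefit of depending on an interface: code like the container manager can be exercised without a running Docker Engine.

```go
package main

import "fmt"

// imageAPI and containerAPI are small hypothetical stand-ins for the SDK's
// client.ImageAPIClient and client.ContainerAPIClient interfaces.
type imageAPI interface {
	Pull(image string) error
}

type containerAPI interface {
	Create(image string) (id string, err error)
}

// dockerAPI embeds both, so its method set is the union of the two.
type dockerAPI interface {
	imageAPI
	containerAPI
}

// fakeClient records calls instead of talking to a Docker Engine.
type fakeClient struct {
	pulled  []string
	created int
}

func (f *fakeClient) Pull(image string) error {
	f.pulled = append(f.pulled, image)
	return nil
}

func (f *fakeClient) Create(image string) (string, error) {
	f.created++
	return fmt.Sprintf("fake-%d", f.created), nil
}

// manager depends only on the combined interface, never on a concrete client.
type manager struct {
	cli dockerAPI
}

// prepare pulls the image and then creates a container from it.
func (m manager) prepare(image string) (string, error) {
	if err := m.cli.Pull(image); err != nil {
		return "", err
	}
	return m.cli.Create(image)
}

func main() {
	fake := &fakeClient{}
	id, err := manager{cli: fake}.prepare("busybox")
	fmt.Println(id, err, fake.pulled)
}
```

Any type whose method set covers both embedded interfaces satisfies dockerAPI, which is the same reason the SDK client can be passed where a DockerClient is expected.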
Meanwhile, you can view the container manager’s complete definition here.
Note: We haven’t individually covered all functions of container manager here, otherwise the blog would be too extensive.
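One function worth a closer look is PullImage, because an image pull reports its progress as a stream of JSON messages, which is what the ImagePullStatus struct is shaped for. The sketch below shows one way such a stream could be consumed with the standard library. It is not the project's implementation: readPullStatus and decodeString are hypothetical helpers, and sampleStream is made-up input rather than captured Docker output.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// ImagePullStatus is the struct from container_manager.go.
type ImagePullStatus struct {
	Status         string `json:"status"`
	Error          string `json:"error"`
	Progress       string `json:"progress"`
	ProgressDetail struct {
		Current int `json:"current"`
		Total   int `json:"total"`
	} `json:"progressDetail"`
}

// readPullStatus is a hypothetical helper. It decodes a stream of
// concatenated JSON objects and returns the last status seen, or an
// error as soon as a message carries a non-empty "error" field.
func readPullStatus(r io.Reader) (string, error) {
	dec := json.NewDecoder(r)
	last := ""
	for {
		var msg ImagePullStatus
		err := dec.Decode(&msg)
		if errors.Is(err, io.EOF) {
			return last, nil // stream ended cleanly
		}
		if err != nil {
			return last, err // malformed JSON or read failure
		}
		if msg.Error != "" {
			return last, errors.New(msg.Error)
		}
		last = msg.Status
	}
}

// decodeString runs readPullStatus over an in-memory string.
func decodeString(s string) (string, error) {
	return readPullStatus(strings.NewReader(s))
}

// sampleStream is made-up input for illustration, not captured Docker output.
const sampleStream = `{"status":"Pulling from library/busybox"}
{"status":"Downloading","progressDetail":{"current":512,"total":1024}}
{"status":"Pull complete"}
`

func main() {
	status, err := decodeString(sampleStream)
	fmt.Println(status, err)
}
```

Reading the stream to the end also matters for correctness: the pull is only finished once the stream is exhausted, so a caller that returns early may try to create a container before the image is available.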
Entrypoint
Since we’ve covered each individual component, let’s assemble everything in our main.go, which is our entrypoint. The package main tells the Go compiler that the package should compile as an executable program instead of a shared library. The main() function in the main package is the entry point of the program.

// main.go

package main

func main() {
	args := os.Args[1:]

	if len(args) < 2 || args[0] != argRun {
		fmt.Println(helpMessage)
		return
	}

	// read the task definition file
	def, err := readTaskDefinition(args[1])
	if err != nil {
		fmt.Printf(errReadTaskDef, err)
		return // nothing to run without a valid definition
	}

	// create a task runner for the task definition
	ctx := context.Background()
	runner, err := taskrunner.NewRunner(def)
	if err != nil {
		fmt.Printf(errNewRunner, err)
		return // runner is nil here, so calling Run would panic
	}

	// run the tasks in the background and block until the runner signals completion
	doneCh := make(chan bool)
	go runner.Run(ctx, doneCh)

	<-doneCh
}

Here’s what our Go program does:

Validates arguments

Reads the task definition

Initializes a task runner, which in turn initializes our container manager

Creates a done channel to receive the final signal from the runner

Runs our tasks
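The argument check at the top of main can be pulled into a small function that is easy to test on its own. This is a sketch under assumptions: parseArgs is a hypothetical helper, the value "run" for argRun is inferred from the way the binary is invoked later in this post, and the usage text is a placeholder for helpMessage.

```go
package main

import (
	"fmt"
	"os"
)

// argRun is assumed to be "run", inferred from the "gopher run task.yaml" usage.
const argRun = "run"

// parseArgs is a hypothetical helper that returns the task file path when
// the arguments have the form: run <task-file>.
func parseArgs(args []string) (path string, ok bool) {
	if len(args) < 2 || args[0] != argRun {
		return "", false
	}
	return args[1], true
}

func main() {
	path, ok := parseArgs(os.Args[1:])
	if !ok {
		fmt.Println("usage: gopher run <task-file>") // placeholder help text
		return
	}
	fmt.Println("task definition file:", path)
}
```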

Building the Task System
1) Clone the repository
The source code is hosted on GitHub. Use the following command to clone the repository to your local machine.
git clone https://github.com/dockersamples/gopher-task-system.git

2) Build your task system
Run from the root of the cloned repository, the go build command compiles the packages along with their dependencies.
go build -o gopher

3) Run your tasks
You can execute the gopher binary directly to run the tasks:
$ ./gopher run task.yaml

preparing task – gopher-loops
preparing task – hello-gopher
starting task – gopher-loops
starting task – hello-gopher
completed task – hello-gopher
completed task – gopher-loops

4) View all task containers
You can view the full list of containers in Docker Desktop. The Dashboard displays this information:

5) View all task containers via CLI
Alternatively, running docker ps -a also lets you view all task containers:
$ docker ps -a
CONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS                     PORTS     NAMES
396e25d3cea8   busybox   "sh -c 'for i in `se…"   6 minutes ago   Exited (0) 6 minutes ago             gopher-loops
aba428b48a0c   busybox   "echo 'Hello, Gopher…"   6 minutes ago   Exited (0) 6 minutes ago

Note that in task.yaml the cleanup flag is set to false for both tasks. We’ve purposefully done this to retrieve a container list after task completion. Setting this to true automatically removes your task containers.
Sequence Diagram

Conclusion
Docker is a collection of software development tools for building, sharing, and running individual containers. With the Docker SDK’s help, you can build and scale Docker-based apps and solutions quickly and easily. You’ll also better understand how Docker works under the hood. We look forward to sharing more such examples and showcasing other projects you can tackle with the Docker SDK soon!
Want to start leveraging the Docker SDK yourself? Check out our documentation for install instructions, a quick-start guide, and library information.
References

Docker SDK
Go SDK Reference
Getting Started with Go

Source: https://blog.docker.com/feed/

Former employees claim the company placed pieces of pro-China content in its now-defunct US news app, TopBuzz, and censored negative stories about the Chinese government. ByteDance says it did no such thing.