Automate data governance, extend your data fabric with Dataplex-BigLake integration

Unlocking the full potential of data requires breaking down the silos between open-source data formats and data warehouses. At the same time, data governance teams need to be able to apply policies regardless of where the data resides, whether in file or columnar storage. Today, those teams have to become subject matter experts on each storage system that corporate data happens to live on. Since February 2022, Dataplex has offered a unified place to apply policies, which are then propagated across both lake storage and data warehouses in Google Cloud. Rather than specifying policies in multiple places and bearing the cognitive load of translating "what you want the storage system to do" into "how your data should behave," Dataplex offers a single point for unambiguous policy management.

Now we are making it easier for you to use BigLake. Earlier this year, we launched BigLake into general availability. BigLake unifies the data fabric between data lakes and data warehouses by extending BigQuery storage to open file formats. Today, we are announcing the BigLake integration with Dataplex, available in preview. This integration removes the configuration steps an admin would otherwise need to take advantage of BigLake, and it lets you manage policies across Google Cloud Storage (GCS) and BigQuery from a unified console.

Previously, you could point Dataplex at a GCS bucket, and Dataplex would discover and extract the metadata from the data lake and register it in BigQuery (as well as Dataproc Metastore and Data Catalog) for analysis and search. The BigLake integration builds on this by allowing an "upgrade" of a bucket asset: instead of just creating external tables in BigQuery for analysis, Dataplex creates policy-capable BigLake tables. The immediate implication is that admins can now assign column, row, and table policies to the BigLake tables auto-created by Dataplex, because with BigLake the infrastructure layer (GCS) is separate from the analysis layer (BigQuery); a minimal sketch of applying such a policy follows at the end of this post. Dataplex handles creating the BigQuery connection and the BigQuery publishing dataset, and it ensures the BigQuery service account has the correct permissions on the bucket.

But wait, there's more. With this release of Dataplex, we are also introducing advanced logging called governance logs. Governance logs let you track the exact state of policy propagation to tables and columns, adding a level of detail that goes beyond the high-level "status" for the bucket to fine-grained status and logs for individual tables and columns.

What's next? We have updated our documentation for managing buckets, with additional detail on policy propagation and the upgrade process. Stay tuned for an exciting roadmap ahead, with more automation around policy management.

For more information, please visit: Google Cloud Dataplex
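Here is the minimal sketch mentioned above: applying a row-level policy to a BigLake table that Dataplex has published. The project, dataset, table, column, and group names are hypothetical placeholders; the snippet assumes the google-cloud-bigquery Python client is installed and authenticated, and it uses standard BigQuery row access policy DDL rather than any Dataplex-specific API.

from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# Hypothetical names: Dataplex publishes discovered BigLake tables into a
# publishing dataset it manages; substitute your own dataset and table.
ddl = """
CREATE ROW ACCESS POLICY us_analysts_only
ON `my-project.my_lake_zone.orders`
GRANT TO ("group:us-analysts@example.com")
FILTER USING (country = "US")
"""

# Run the DDL; result() blocks until the policy has been created.
client.query(ddl).result()
print("Row access policy applied to the BigLake table.")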
Source: Google Cloud Platform

How HSBC is upskilling at scale with Google Cloud

Editor's note: Founded in 1865, HSBC is one of the world's largest banking and financial services organizations. In today's post, Adrian Phelan, Global Head of Google Cloud, HSBC, explains how the organization is working with Google Cloud to drive cloud adoption at scale.

Close to 90% of corporations say they're affected by digital skills gaps, or expect to be within the next five years. Technologies and business models are evolving rapidly, and companies are deploying a multi-pronged approach to ensure they have the right skills in the right places.

Here at HSBC, one of the bank's strategic priorities is digitizing at scale. As people operate in a more digital world, we want to supply them with services quickly and in ways they want to use them. We initially worked with Google Cloud to implement more than 1,700 data analytics, customer experience, cybersecurity and emission reduction projects. A big part of rolling these out has been getting our teams skilled in the right way.

Empowering employees with a culture of learning

This approach has evolved over time, but central to it has been proactively instilling a culture of learning. We started out in 2018 with a few small-scale training projects, and it quickly became clear that the teams who had participated in them delivered better and faster than those who hadn't. They were also more independent and less dependent on central expertise.

This inspired us to scale up our learning programs across the organization, which was a challenge because of the sheer size of our technical staff: tens of thousands of employees. After some really positive feedback for our early training programs with Google Cloud, we set up our Google Accelerated Certification Program (GACP). It's a 10-week blended learning model including self-learning, case studies, and hands-on practice followed by an examination preparation boot camp.

This combination of theory and practice in a safe environment helped build employees' confidence. Two thousand people have gone through this training so far, and it's really helped accelerate their journey towards achieving Google Cloud certification. The learning programs also offer other digital credentials, such as completion badges and skill badges, which provide encouragement and help participants measure their progress.

Company-wide knowledge building

When we started our learning journey, we focused on IT roles for obvious reasons, but we are increasingly moving towards training people in business functions.

One of our aims is to educate our less technical employees about the broad capabilities that exist within the cloud. IT teams are often the ones to say, "Hey, we could do this in a better, more efficient, different way by using the cloud", and to make that happen we need to work in close collaboration with our business colleagues, so it's equally important that they understand the technology.

To enable this kind of innovation, you have to educate the whole organization in the 'art of the possible'. One of the ways we did that was by organizing a month-long Cloud Festival that reached 10,000 employees, which included three Google Cloud sessions. This really helped us build a foundational level of knowledge with business and technology colleagues across the organization.

As we continue along our training path, interest in the cloud within the organization continues to increase. Our channel for communicating any changes related to cloud technology, processes or ways of working now has an audience of close to 8,000 employees.
Looking to the future with targeted training

The Google Cloud team has provided a lot of support in helping us get our training off the ground. It has always been a true process of co-creation, of listening, testing things, and seeing what works best. We meet weekly to keep our learning journey moving forward, listen to the demands of the business, understand the pipeline of work, and learn what the upcoming Google Cloud product launches are, so that we can stay one step ahead.

One example of this is the bespoke training we introduced for business leaders. So far, 250 senior business leaders have completed it with great feedback. They have told us that the program improved their understanding of how the cloud can help to more quickly meet customer expectations, increase speed to market, reduce overheads and grow revenue through new product streams and continuous innovation. It also covered potential business activities suitable for migration to the cloud.

When it comes to learning and training, you can either let it happen organically, or you can drive it. Our choice was to drive it and invest in it, and I'd highly recommend anybody trying to adopt cloud at scale does the same: they will see the return on that investment many times over.

Learn more about Google Cloud training and certification and the impact it can have on your team.
Source: Google Cloud Platform

Configure, Manage, and Simplify Your Observability Data Pipelines with the Calyptia Core Docker Extension

This post was co-written with Eduardo Silva, Founder and CEO of Calyptia.

Applications produce a lot of observability data. And it can be a constant struggle to source, ingest, filter, and output that data to different systems. Managing these observability data pipelines is essential for being able to leverage your data and quickly gain actionable insights.

In cloud and containerized environments, Fluent Bit is a popular choice for marshaling data across cloud-native environments. A super fast, lightweight, and highly scalable logging and metrics processor and forwarder, it recently reached three billion downloads.

Calyptia Core, from the creators of Fluent Bit, further simplifies the data collection process with a powerful processing engine. Calyptia Core lets you create custom observability data pipelines and take control of your data.

And with the new Calyptia Core Docker Extension, you can build and manage observability pipelines within Docker Desktop. Let’s take a look at how it works!

What is Calyptia Core?

Calyptia Core plugs into your existing observability and security infrastructure to help you process large amounts of logs, metrics, security, and event data. With Calyptia Core, you can:

Connect common sources to the major destinations (e.g. Splunk, Datadog, Elasticsearch, etc.)

Process 100,000 events per second per replica with efficient routing.

Automatically collect data from Kubernetes and its various flavors (GKE, EKS, AKS, OpenShift, Tanzu, etc).

Build reliability into your data pipeline at scale to debug data issues.

Why Calyptia Core?

Observability as a concept is common in the day-to-day life of engineers. But the different data standards, data schemas, storage backends, and dev stacks contribute to tool fatigue, resulting in lower developer productivity and increased total cost of ownership.  

Calyptia Core aims to simplify the process of building an observability pipeline. You can also augment the streaming observability data to add custom markers and discard or mask unneeded fields.
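To picture what augmenting and masking means at the level of a single record, here is a tiny, purely illustrative Python sketch. It is not Calyptia's actual rule syntax (those rules are configured in the pipeline itself); it only shows the kind of transformation such a rule performs, adding a custom marker field and masking a sensitive one.

import copy

def transform(record: dict) -> dict:
    """Illustrative in-flight transformation: add a marker, mask a field."""
    out = copy.deepcopy(record)
    out["env"] = "production"        # add a custom marker
    if "card_number" in out:
        out["card_number"] = "****"  # mask a field you don't want downstream
    return out

event = {"msg": "payment accepted", "card_number": "4111111111111111"}
print(transform(event))
# {'msg': 'payment accepted', 'card_number': '****', 'env': 'production'}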

Why run Calyptia Core as a Docker Extension?

Docker Extensions help you build and integrate software applications into your daily workflows. With Calyptia Core as a Docker Extension, you now have an easier, faster way to deploy Calyptia Core.

Once the extension is installed and started, you'll have a running Calyptia Core instance. This allows you to easily define and manage your observability pipelines and concentrate on what matters most: discovering actionable insights from the data.

Getting started with Calyptia Core

Calyptia Core is available in the Docker Extension Marketplace. In the tutorial below, we'll install Calyptia Core in Docker Desktop, build a data pipeline with mock data, and visualize it with Vivo.

Initial setup

Make sure you’ve installed the latest version of Docker Desktop (or at least v4.8+). You’ll also need to enable Kubernetes under the Preferences tab. This will start a Kubernetes single-node cluster when starting Docker Desktop.
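If you want to verify that the single-node cluster is up before installing the extension, one option is a quick check with the official Kubernetes Python client (pip install kubernetes). This assumes the kubeconfig context is named docker-desktop, which is Docker Desktop's default.

from kubernetes import client, config

# Load the kubeconfig entry that Docker Desktop creates for its cluster.
config.load_kube_config(context="docker-desktop")

# List the (single) node and report whether it is Ready.
for node in client.CoreV1Api().list_node().items:
    conditions = {c.type: c.status for c in (node.status.conditions or [])}
    ready = conditions.get("Ready") == "True"
    print(node.metadata.name, "Ready" if ready else "NotReady")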

Installing the Calyptia Core Docker Extension

Step 1

Open Docker Desktop and click “Add Extensions” under Extensions to go to the Docker Extension Marketplace.

Step 2

Install the Calyptia Core Docker Extension.

By clicking on the details, you can see what containers or binaries are pulled during installation.

Step 3

Once the extension is installed, you're ready to deploy Calyptia Core! Select "Deploy Core" and you'll be asked to log in and authenticate the token for the Docker Extension.

In your browser, you’ll see a message from https://core.calyptia.com/ asking to confirm the device.

Step 4

After confirming, Calyptia Core will be deployed. You can now select “Manage Core” to build, configure, and manage your data pipelines.

You’ll be taken to core.calyptia.com, where you can build your custom observability data pipelines from a host of source and destination connectors.

Step 5

In this tutorial, let’s create a new pipeline and set docker-extension as the name.

Add “Mock Data” as a source and “Vivo” as the destination.

NOTE: Vivo is a real-time data viewer embedded in the Calyptia Core Docker Extension. You can make changes to your data pipelines, like adding new fields or connectors, and view the streaming observability data from Vivo in the Docker Extension.

Step 6

Hit “Save & Deploy” to create the pipeline in the Docker Desktop environment.

With the Vivo Live Data Viewer, you can view the data without leaving Docker Desktop.

Conclusion

The Calyptia Core Docker Extension makes it simple to manage and deploy observability pipelines without leaving the Docker Desktop developer environment. And that's just the beginning. You can also use Calyptia Core's automated logging to collect data from your Kubernetes pods and use metadata to apply processing rules before the data is delivered to its chosen destination.

Give the Calyptia Core Docker Extension a try, and let us know what you think at hello@calyptia.com.
Source: https://blog.docker.com/feed/