Monitoring Cloud SQL with SQL Server database auditing

Cloud SQL for SQL Server is a fully managed database service that lets you run SQL Server in the cloud while Google takes care of the toil. In the past year, we've launched features that help you get the most out of SQL Server, such as support for Active Directory authentication, SQL Server 2019, and cross-region replicas. We're happy to add another SQL Server security feature: database auditing.

Database auditing allows you to monitor changes to your SQL Server databases, such as database creations, data inserts, or table deletions. Cloud SQL writes the audit logs generated by SQL Server to local disk and to Google Cloud Storage. You can specify how long logs are stored on the instance (up to seven days) and use a SQL Server function to inspect them. Cloud SQL also automatically writes all audit files to a Cloud Storage bucket that you manage, so you can decide how long to retain these records if you need them for longer than seven days, or consolidate them with audit files from other SQL Server instances.

To enable database auditing, go to the Google Cloud console, select your Cloud SQL for SQL Server instance, and select Edit from the Overview page. You can also enable SQL Server Audit when you create a new Cloud SQL instance.

Once you've enabled auditing for your Cloud SQL for SQL Server instance, you can create SQL Server audits and audit specifications, which determine what information is tracked on your databases. You can capture granular information about operations performed on your databases, including, for example, every time a login succeeds or fails. If you want to capture different information for each of your databases, you can create a separate audit specification for each database on your instance, or you can create server-level audit specifications to track changes across all databases.

SQL Server auditing is now available for all Cloud SQL for SQL Server instances. Learn more about how to get started with this feature today!
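Once audit files exist, they can be queried back with plain T-SQL. Below is a minimal sketch in Python, assuming a pyodbc connection to the instance and using the standard sys.fn_get_audit_file function; the instance address, credentials, and audit file path are hypothetical placeholders, not values from this post.

```python
# Minimal sketch: read recent SQL Server audit records over pyodbc.
# The server address, credentials, and audit file path below are assumptions.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=10.0.0.5;DATABASE=master;UID=sqlserver;PWD=<password>"
)

query = """
SELECT event_time, action_id, succeeded,
       session_server_principal_name, statement
FROM sys.fn_get_audit_file('/var/opt/mssql/audit/*.sqlaudit', DEFAULT, DEFAULT)
WHERE event_time > DATEADD(day, -1, SYSUTCDATETIME())
ORDER BY event_time DESC;
"""

cursor = conn.cursor()
for row in cursor.execute(query):
    # Each row describes one audited action, e.g. a failed login or a DROP TABLE.
    print(row.event_time, row.action_id, row.session_server_principal_name)
```

Cloud SQL's own documentation describes the exact function and on-instance file location to use on a managed instance; the path shown here is only a stand-in.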
Source: Google Cloud Platform

Impact.com: Forging a new era of business growth through partnerships

Business partnerships come in all shapes and forms, from affiliate and influencer marketing to SaaS providers and strategic B2B alliances. For all parties to be successful and drive business growth through a partnership, they must have a way to manage, track, and measure the incremental value their partners provide throughout the relationship. That's why impact.com has developed technology that makes it easy for businesses to create, manage, and scale an ecosystem of partnerships with the brands and communities that customers trust, so that businesses can focus on building great relationships.

The partnership management platform currently helps thousands of brands, including Walmart and Shopify, manage and optimize the ROI of their partnerships through its purpose-built platform. And impact.com has brought this vision to life with software built on Google Cloud Platform. "We believe in the power of technology and partnerships to create transformational growth for our customers, our company, and ourselves," says Lisa Riolo, VP Strategic Partnerships and Co-founder at impact.com. "This mindset has been integral to the success of impact.com, which grew from a five-person startup to a company valued at $1.5 billion as of September 2021."

Fuelling business growth with the right partnerships and technology

impact.com's original vision was to significantly improve the technology available to performance marketers while empowering traditional media channels with the data and measurement systems available to digital marketers. But it has always been clear that for the company to remain relevant, it needs to constantly evolve to meet the needs of the next generation of marketers.

"We designed our toolset to be future-proof, flexible, and able to adapt to the changing global landscape. Our customers rely on impact.com to manage their strategic partnerships on a global level," says Riolo. "This combined ability to be reliable and continually innovate is the sweet spot we look for when selecting the components of our technology setup, and that's what we found in Google Cloud."

As a company that focuses on helping businesses grow through their partnerships, scalability has always been a key criterion behind impact.com's technology. As it acquires multiple product-led technology companies throughout its growth, the importance of being scalable becomes even more evident. New companies joining impact.com suddenly gain access to a multitude of businesses they could be working with, while their customer base tends to multiply thanks to the exposure they gain through these new partnerships.

"From a strategic perspective, when you need something new you can build it, buy it, or partner with someone who has it. We do all three," says Riolo. "Each time we welcome a new company, we bring them on board Google Cloud so they can lean on the same reliability and scalability as we do. Having the ability to accommodate our growth is a must, and scaling on a Google Cloud environment is seamless and efficient. Additionally, we are taking advantage of the global footprint of the platform to run our applications closer to customers with low latency."

Helping more businesses to grow in the cloud

As buyers increasingly turn to cloud marketplaces to fulfill their procurement needs, impact.com is launching its partnership management software on Google Cloud Marketplace. This means that companies of all sizes can find and quickly deploy impact.com's software on Google Cloud without having to manually configure the software itself, its virtual machine instances, or its network settings. And by taking this step onto the cloud, they gain access to infrastructure that can keep up with their success.

"Our commitment to our partners centers on how we best support and enable their growth. I believe that as customers grow, being in the cloud is critical to their ability to scale up, no matter what type of business they are," explains Riolo. "So being on Google Cloud Marketplace is important for impact.com to get more exposure, and also important for Google Cloud, because the growth-focused businesses we attract need more cloud capabilities. That's how we grow together, and we're very excited about what this means for the future of our relationships."
Source: Google Cloud Platform

Improving developer agility and efficiency with Google Workspace

The software development process requires complex, cross-functional collaboration while continuously improving products and services. Our customers who build software tell us they value Google Workspace for its ability to drive innovation and collaboration throughout the entire software development life cycle. Developers can hold standups and scrums in Google Chat, Meet, and Spaces; create and collaborate on requirements documentation in Google Docs and Sheets; build team presentations in Google Slides; and manage their focus time and availability with Google Calendar. Development teams also use many other tools to get work done, like tracking issues and tasks in Atlassian's Jira, managing workloads with Asana, and handling incident management in PagerDuty. One of the benefits of Google Workspace is that it's an open platform designed to improve the performance of your tools by seamlessly integrating them. We're constantly expanding our ecosystem and improving Google Workspace, giving you the power to push your software development even further.

Make software development more agile

Google Workspace gives you real-time visibility into project progress and decisions to help you ship quality code fast and stay connected with your stakeholders, all without switching tools and tabs. By leveraging applications from our partners, you can pull valuable information out of silos, making collaboration on requirements, code reviews, bug triage, deployment updates, and monitoring operations easy for the whole team. This lets your teams stay focused on their priorities while keeping everyone aligned, ensuring collaborators are always in the loop.

Plan and execute together

When combined with integrations, Google Workspace makes the software development planning process more collaborative and efficient. For example, many organizations use Asana, a leading work management platform, to coordinate and manage everything from daily tasks to cross-functional strategic initiatives. To make the experience more seamless, Asana built integrations so users always have access to their tasks and projects from Google Drive, Gmail, and Chat. With these integrations for Google Workspace, you can turn your conversations into action and create new tasks in Asana, all without leaving Google Workspace.

"We've seen exceptional, heavy adoption of tasks being created from within the Gmail add-on. Our customers and community have also shown very strong interest in future development work, which is something we'll continue to prioritize."
Strand Sylvester, Product Manager, Asana

To date, users have installed the Asana for Gmail add-on over 2.5 million times, and the Asana for Google Workspace add-on for Google Drive has seen over 3.8 million installs.

Turn your conversations into action with the Asana for Google Chat app.

Start coding quickly

Google Workspace makes it easy for product managers, UX designers, and engineers to agree on what they're building and why. By bringing all stakeholders, decisions, and requirements into one place, whether it's a Gmail or Google Chat conversation or a document in Google Docs, Sheets, or Slides, Google Workspace removes friction, helping your teams finalize product specifications and get started right away.

Integrations like GitHub for Google Chat make the entire development process fit easily into a developer's workflow.
With this integration, teams can quickly push new commits, make pull requests, do code reviews, and provide real-time feedback that improves the quality of their code, all from Google Chat.

Get updates on GitHub without leaving the conversation.

Speed up testing

Integrations like Jira for Google Chat accelerate the entire QA process in the development workflow. The app acts as a team member in the conversation, sending new issues and contextual updates as they are reported, improving the quality of your code and keeping everyone informed on your Jira projects.

Quickly create a new Jira issue without ever leaving Google Chat.

Ship code faster

Developers use Jenkins, a popular open-source continuous integration and continuous delivery tool, to build and test products continuously. Along with other cloud-native tools, Jenkins supports strong DevOps practices by letting you continuously integrate changes into the software build. With Jenkins for Google Chat, development and operations teams can connect to their Jenkins pipeline and stay up to date by receiving software build notifications directly in Google Chat.

Jenkins for Google Chat helps DevOps teams stay up to date with build notifications.

Proactively monitor your services

Improving the customer experience requires capturing and monitoring data sources to improve application and infrastructure observability. Google Workspace supports DevOps teams and organizations by helping stakeholders collaborate and troubleshoot more effectively. When you integrate Datadog with Google Chat, monitoring data becomes part of your team's discussion, and you can efficiently collaborate to resolve issues as soon as they arise. The integration makes it easy to start a discussion with all the relevant teams by sharing a snapshot of a graph in any of your Chat spaces. When an alert notification is triggered, it allows you to notify each Chat space independently, precisely targeting your communication to the right teams.

Collaborate, share, and track performance with Datadog for Google Chat.

Improve service reliability

Orchestrating business-wide responses to interruptions is a cross-functional effort. When revenue and brand reputation depend on customer satisfaction, it's important to proactively manage service-impacting events. Google Workspace supports response teams by ensuring that urgent alerts reach the right people and by providing teams with a central space to discover incidents, find the root cause, and resolve them quickly. PagerDuty for Google Chat empowers developers, DevOps, IT operations, and business leaders to prevent and resolve business-impacting incidents for an exceptional customer experience, all from Google Chat. See and share details with link previews, and perform actions by creating or updating incidents. By keeping all conversations in a central space, new responders can get up to speed and solve issues faster without interrupting others.

PagerDuty for Google Chat keeps the business up to date on service-impacting incidents.

Accelerate developer productivity

Integrating your DevOps tools with Google Workspace allows your development teams to centralize their work, stay focused on what's important (like managing their work), build code quickly, ship quality products, and communicate better during service-impacting incidents.
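Under the hood, many of these Chat integrations boil down to posting structured messages into a space. As a rough illustration (not the official Jenkins, Datadog, or PagerDuty integration), a pipeline step can push a build notification through a Google Chat incoming webhook; the webhook URL and build details below are placeholders.

```python
# Illustrative only: post a build notification to a Google Chat space via an
# incoming webhook. The webhook URL and build details are hypothetical.
import requests

WEBHOOK_URL = "https://chat.googleapis.com/v1/spaces/AAAA.../messages?key=...&token=..."

def notify_build(job_name: str, build_number: int, status: str) -> None:
    message = {
        "text": f"Build *{job_name}* #{build_number} finished with status: {status}"
    }
    response = requests.post(WEBHOOK_URL, json=message, timeout=10)
    response.raise_for_status()

notify_build("frontend-ci", 128, "SUCCESS")
```

The prebuilt integrations described above layer richer cards and interactive commands on top of this same idea.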
For more apps and solutions that help centralize your work so you and your teams can connect, create, and get things done, check out the Google Workspace Marketplace, where you'll find more than 5,300 public applications that integrate directly with Google Workspace.
Source: Google Cloud Platform

How SLSA and SBOM can help healthcare resiliency

Taking prescription medication at the direction of anyone other than a trained physician is very risky, and the same could be said for selecting the technology used to run a hospital, manage a drug manufacturing facility and, increasingly, treat a patient for a medical condition.

To pick the right medication, physicians need to carefully consider its ingredients, the therapeutic value they collectively provide, and the patient's condition. Healthcare cybersecurity leaders similarly need to know what goes into the technology their organizations use to manage patient medical records, manufacture compound drugs, and treat patients in order to keep them safe from cybersecurity threats. Just like prescription medication, technology requires careful vetting and selection to ensure patient safety and to establish visibility and awareness into the systems modern healthcare depends on, creating a resilient healthcare system.

In this blog and the next, we focus on two topics critical to building resilience, the software bill of materials (SBOM) and Google's Supply-chain Levels for Software Artifacts (SLSA) framework, and on how to use them to make technology safe. Securing the software supply chain, that is, where the software we depend on comes from, is a critical security priority for defenders and something Google is committed to helping organizations do.

Diving deeper into the technology we rely on

Cybersecurity priorities for securing healthcare systems usually focus only on protecting sensitive healthcare information, like Protected Health Information (PHI). Maintaining the privacy of patient records is an important objective, and securing data and systems plays a big role in this regard. Healthcare system leadership and other decision makers often depend on cybersecurity experts to select technologies and service providers that can meet regulatory rules for protecting data as a first (and sometimes only) priority. Trust is often placed in the reputations and compliance programs of the vendors who manufacture the technology they buy, without much further inspection.

Decision makers need to approach every key healthcare and life science technology or service provider choice as a high-risk, high-consequence decision, but few healthcare organizations have the skills, resources, and time to "go deep" in vetting the security built into the technology they buy before it enters a care setting. Vetting needs to include penetrating analysis of all aspects of software and hardware, their architecture and engineering quality, and the provenance of all the parts they're made of, assessing each component for risk. Doing this can require deep technical skills and advanced knowledge of medical equipment threats that may not be easy to acquire. Instead of making additional investments to help secure their networks and systems, many organizations choose simpler paths.

The failure to properly assess technological susceptibility to risk has exposed healthcare organizations and their patients to a variety of safety and security issues that may have been preventable. PTC (formerly Parametric Technology Corporation, which makes medical device software) disclosed seven vulnerabilities in March that impacted equipment used for robotic radiosurgery. In October 2019, the VxWorks Urgent 11 series of vulnerabilities was announced, affecting more than 1 billion connected devices, many used throughout healthcare and life sciences. More examples of medical devices and software found to have vulnerable components can be found on the FDA's cybersecurity website and in its recall database.

How a physician understands, selects, and prescribes medication parallels how we address these concerns when selecting technology. Recent FDA guidance suggests manufacturers must soon provide increased levels of visibility into the technologies they market and sell in the healthcare industry. Here's where the SBOM, a key visibility mechanism, comes in.

What SBOMs do well, and how Google is helping make them better

The National Telecommunications and Information Administration defines the SBOM as a "nested inventory for software, a list of ingredients that make up software components." The concept of an SBOM appears to have found its start in enabling software makers back in the 1990s, although it originally stems from ideas popularized by visionary engineer and professor W. Edwards Deming. The SBOM has advanced as a concept since then, with multiple standards for generating and sharing SBOMs now in use.

Thanks to the continued focus on improving and using SBOMs, we expect it will become much easier for defenders to use SBOMs to track software and its components, where they come from, and what security vulnerabilities they contain, and to equip protectors with the ability to stop those vulnerabilities from being exploited, at scale, and before they impact patient care.

"Software bills of materials help to bridge the knowledge gap created by running unknown, unpatched software and components as too many healthcare organizations currently do," says Dan Walsh, chief information security officer at VillageMD, a tech-driven primary-care provider. "For security leaders, SBOM should be an extension of their asset inventory and management capability, regardless of whether that software was bought or built. At VillageMD, we are asking our vendors that store, transmit, receive or process PHI for an SBOM as part of our third-party vendor assessment program."

Today's SBOMs are most often basic text files generated by a software developer when the creation of software is complete and a product is assembled (or an application is created from source code). The text file contains information about the product's software components and subcomponents, where those components and subcomponents came from, and who owns them. But unlike a recipe used to make a pharmaceutical, for example, an SBOM also tracks the software versions of components and subcomponents. SBOMs often capture:

- Supplier Name
- Component Name
- Version of the Component
- Other Unique Identifiers
- Dependency Relationship
- Author of SBOM Data
- Timestamp

An SBOM generated using the SPDX v2.2.1 standard captures these fields in a simple structured document; a minimal sketch appears a little further below. Technology producers, decision makers, and operators in any industry can use this information to deeply understand the risks a product poses to patients and the health system. An SBOM, for example, can show a reader whether the software used on a medical device is merely out of date, or vulnerable to a cyber attack that could affect its safe use.

Google sponsors a number of initiatives focused on securing the software supply chain, including how to use SBOMs, through our work with U.S. government agencies, the Open Source Security Foundation, and the Linux Foundation, including a project focused on building and distributing SBOMs.
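Here is that sketch: a minimal, illustrative SBOM assembled in Python and printed as SPDX-style JSON. The component, supplier, and version values are invented for illustration, and the field names follow the SPDX JSON format only approximately; consult the SPDX specification for the authoritative schema.

```python
# Illustrative only: build a minimal SPDX-style SBOM document as JSON.
# All component, supplier, and version values below are made up.
import json
from datetime import datetime, timezone

sbom = {
    "spdxVersion": "SPDX-2.2",
    "dataLicense": "CC0-1.0",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "example-infusion-pump-firmware",
    "creationInfo": {
        "created": datetime.now(timezone.utc).isoformat(),
        "creators": ["Organization: Example Device Maker", "Tool: sbom-sketch-0.1"],
    },
    "packages": [
        {
            "SPDXID": "SPDXRef-Package-openssl",
            "name": "openssl",
            "versionInfo": "1.1.1k",
            "supplier": "Organization: OpenSSL Project",
            "downloadLocation": "NOASSERTION",
        }
    ],
    "relationships": [
        {
            "spdxElementId": "SPDXRef-DOCUMENT",
            "relationshipType": "DESCRIBES",
            "relatedSpdxElement": "SPDXRef-Package-openssl",
        }
    ],
}

print(json.dumps(sbom, indent=2))
```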
Learn about the SPDX project and CycloneDX, read the ISO/IEC 5962:2021 standard (for SPDX) and ISO/IEC 19770-2:2015 (for SWID, another artifact that provides an SBOM), and explore other training resources from the Linux Foundation.

As an additional measure, healthcare organizations that use SBOMs need to be sure they can trust that the SBOMs they rely on haven't been changed since the manufacturer produced them. To defend against this, software makers can cryptographically sign their SBOMs, making it easier to identify whether an SBOM has been maliciously altered since it was first published. While U.S. Executive Order 14028 created a federal mandate for the SBOM, and although many organizations have begun to incorporate that mandate into their software production workflows, many issues and roadblocks remain unresolved. At Google, we think the use of SBOMs will help organizations gain important visibility into the technologies entering our healthcare facilities and enable defenders to more capably protect both patient safety and patient data privacy.

Digging into SLSA

We believe resilient organizations have resilient software supply chains. Sadly, no single mechanism, such as the SBOM, can achieve this outcome on its own. That's why we created the SLSA framework and services like Assured Open Source Software. SLSA was developed from Google's own practices for securing its software supply chain. It is guidance for securing software supply chains using a set of incremental, enforceable security guidelines that can automatically create auditable metadata. That metadata can then confer a "SLSA certification" on a particular package or build platform. It's a verifiable way to assure consumers that the software they use hasn't been tampered with, something that doesn't broadly exist today. We've recently explained more about how SLSA works in blog posts covering SLSA basics and more in-depth SLSA details.

Similarly, Assured Open Source Software gives organizations the ability to use the same regularly tested and secured software packages Google uses to build its own software. Used in combination with an SBOM, technology makers can build reliable, safe, and verifiable products. Technology buyers, such as those who run your local healthcare system, can use those same mechanisms to gain visibility into a technology's safety and fitness for use.

Where do we go from here?

Visibility into the components that make up the technology we use to care for patients is critically necessary. We can't build a resilient healthcare system if our only priority is the privacy of data; we must add resilience and safety to the list of our top priorities. Gaining deep visibility into the technology that populates health system networks is a critical shift we must make. SBOM and SLSA help us make this shift. But remember, it's not one or the other. As Dan Walsh from VillageMD says, the SBOM has a way to go: "It won't solve all of your problems," he cautions, but adds that when used correctly, "SBOM will help you improve visibility into the software that runs on the critical systems that keep societies safe and we're excited to see it get traction." When complemented with SLSA and topics we'll cover next, such as the Vulnerability Exploitability eXchange (VEX), we are on a path to greater resilience.
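Returning to the point above about signing SBOMs: a signature lets a consumer verify, with nothing more than the producer's public key, that an SBOM hasn't been altered since publication. Below is a minimal sketch using an Ed25519 key pair from the Python cryptography library; real pipelines more commonly use purpose-built tooling such as Sigstore's cosign, and the file name here is a placeholder.

```python
# Illustrative only: sign an SBOM file and verify the signature later.
# The file name is a placeholder; key management is out of scope for this sketch.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

with open("sbom.spdx.json", "rb") as f:
    sbom_bytes = f.read()

signature = private_key.sign(sbom_bytes)  # distributed alongside the SBOM

try:
    public_key.verify(signature, sbom_bytes)
    print("SBOM signature verified: contents unchanged since signing.")
except InvalidSignature:
    print("SBOM has been altered or the signature is invalid.")
```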
Source: Google Cloud Platform

Four back-to-school and off-to-college consumer trends retailers should know

Is it September yet? Hardly! School is barely out for the summer. But according to Google and Quantum Metric research, the back-to-school and off-to-college shopping season, which in the U.S. is second only to the holidays in terms of purchasing volume1, has already begun. For retailers, that means planning for this peak season has kicked off as well. We'd like to share four key trends that emerged from Google research and Quantum Metric's Back-to-School Retail Benchmarks study of U.S. retail data, explore the reasons behind them, and outline the key takeaways.

1. Out-of-stock and inflation concerns are changing the way consumers shop.

Back-to-school shoppers are starting earlier every year, with 41% beginning even before school is out, and even more so when buying for college1. Why? The behavior is driven in large part by consumers' concerns that they won't be able to get what they need if they wait too long: 29% of shoppers start looking a full month before they need something1.

Back-to-school purchasing volume is quite high, with the majority of shoppers spending up to $500 and 21% spending more than $1,0001. In fact, looking at year-over-year data, we see that average cart values have not only doubled since November 2021, but increased since the holidays1. And keep in mind that back-to-school spending is a key indicator leading into the holiday season.

That said, as people react to inflation, they are comparing prices, hunting for bargains, and generally taking more time to plan. This is borne out by the fact that 76% of online shoppers are adding items to their carts and waiting to see if they go on sale before making the purchase1. And, to help stay on budget and reduce shipping costs, 74% plan to make multiple purchases in one checkout1. That carries over to in-store shopping, where consumers are buying more in one visit to reduce trips and save on gas.

2. The omnichannel theme continues.

Consumers continue to use multiple channels in their shopping experience. As the pandemic has abated, some 82% expect that their back-to-school buying will be in-store, and 60% plan to purchase online. But in any case, 45% of consumers report that they will use both channels, and more than 50% research online first before ever setting foot in a store2. Some use as many as five channels, including video and social media; these consumers (54%) spend 1.5 times more than those who use only two channels4.

And mobile is a big part of the journey. Shoppers are using their phones to make purchases, especially for deadline-driven, last-minute needs, and often check prices on other retailers' websites while shopping in-store. Anecdotally, mobile is a big part of how we ourselves shop with our children, who like to swipe through different options for colors and styles on the phone. We use our desktops when shopping on our own, especially for items that require research and represent a larger investment, and our study shows that's quite common.

3. Consumers are making frequent use of wish lists.

One trend we have observed is a higher abandonment rate, especially for apparel and general home and school supplies, compared to bigger-ticket items that require more research. But that can be attributed in part to the increasing use of wish lists. Online shoppers are picking a few things that look appealing or items on sale, saving them in wish lists, and then choosing just a few to purchase.
Our research shows that 39% of consumers build one or two wish lists per month, while 28% said they build one or two each week, often using their lists to help with budgeting1.

4. Frustration rates have dropped significantly.

Abandonment rates aside, shopper annoyance rates are down by 41% year over year1. This is despite out-of-stock concerns and higher prices. But one key finding showed that both cart abandonment and "rage clicks" are more frequent on desktops, possibly because people investing time in search also have more time to complain to customer service.

And frustration does still exist. Some $300 billion is lost each year in the U.S. from bad search experiences5. Data collected internationally shows that 80% of consumers view a brand differently after experiencing search difficulties, and 97% favor websites where they can quickly find what they are looking for5.

Lessons to Learn

What are the key takeaways for retailers? In general, consider the sources of customer pain points and find ways to erase friction. Improve search and personalization. And focus on improving the customer experience and building loyalty. Specifically:

- 80% of shoppers want personalization6. Think about how you can drive personalized promotions or experiences that will create higher engagement with your brand.
- 46% of consumers want more time to research1. Work toward providing more robust research and product information, like comparison charts, images, and specific product details.
- 43% of consumers want a discount1, but given current economic trends, retailers may not be offering discounts. To appease budget-conscious shoppers, retailers can consider other retention strategies, such as driving loyalty with points, rewards, or faster-shipping perks.
- Keep returns as simple as possible so consumers feel confident when making a purchase, and reduce possible friction points if a consumer decides to make a return. 43% of shoppers return at least a quarter of the products they buy and do not want to pay for shipping or jump through hoops1.

How We Can Help

Google-sponsored research shows that price, deals, and promotions are important to 68% of back-to-school shoppers7. In addition, shoppers want certainty that they will get what they want. Google Cloud can make it easier for retailers to enable customers to find the right products with discovery solutions. These solutions provide Google-quality search and recommendations on a retailer's own digital properties, helping to increase conversions and reduce search abandonment. In addition, Quantum Metric solutions, available on the Google Cloud Marketplace, are built with BigQuery, which helps retailers consolidate and unlock the power of their raw data to identify areas of friction and deliver improved digital shopping experiences.

We invite you to watch the Total Retail webinar "4 ways retailers can get ready for back-to-school, off-to-college" on demand and to view the full Back-to-School Retail Benchmarks report from Quantum Metric.

Sources:
1. Back-to-School Retail Benchmarks report from Quantum Metric
2. Google/Ipsos, Moments 2021, Jun 2021, online survey, US, n=335 back-to-school shoppers
3. Google/Ipsos, Moments 2021, Jun 2021, online survey, US, n=2,006 American general population 18+
4. Google/Ipsos, Holiday Shopping Study, Oct 2021 – Jan 2022, online survey, US, n=7,253 Americans 18+ who conducted holiday shopping activities in the past two days
5. Google Cloud Blog, Nov 2021, "Research: Search abandonment has a lasting impact on brand loyalty"
6. McKinsey & Company, "Personalizing the customer experience: Driving differentiation in retail"
7. Think with Google, July 2021, "What to expect from shoppers this back-to-school season"
Source: Google Cloud Platform

Google Cloud Data Heroes Series: Meet Francisco, the Ecuadorian American founder of Direcly, a Google Cloud Partner

Google Cloud Data Heroes is a series where we share stories of the everyday heroes who use our data analytics tools to do incredible things. Like any good superhero tale, we explore our Google Cloud Data Heroes' origin stories, how they moved from data chaos to a data-driven environment, what projects and challenges they are overcoming now, and how they give back to the community.

In this month's edition, we're pleased to introduce Francisco! He is based in Austin, Texas, but you'll often find him in Miami, Mexico City, or Bogotá, Colombia. Francisco is the founder of Direcly, a Google Marketing Platform and Google Cloud consulting/sales partner with a presence in the US and Latin America.

Francisco was born in Quito, Ecuador, and at age 13 came to the US to live with his father in Miami, Florida. He studied marketing at Saint Thomas University, and his skills in math landed him a job as a teaching assistant for statistics and calculus. After graduation, his professional career began at some of the nation's leading ad agencies before he eventually transitioned into the ad tech space. In 2016, he ventured into the entrepreneurial world and founded Direcly, a Google Marketing Platform, Google Cloud, and Looker sales/consulting partner obsessed with using innovative technological solutions to solve business challenges. Against many odds and with no external funding since its inception, Direcly became part of a select group of Google Cloud and Google Marketing Platform partners. Francisco's story was even featured in a Forbes Ecuador article!

Outside of the office, Francisco is an avid comic book reader and collector, a golfer, and a fantasy adventure book reader. His favorite comic book is The Amazing Spider-Man #252, and his favorite book is The Hobbit. He says he isn't the best golfer, but can ride the cart like a pro.

When were you introduced to the cloud, tech, or data field? What made you pursue this in your career?

I began my career in marketing/advertising, and I was quickly drawn to the tech/data space, seeing the critical role it played. I've always been fascinated by technology and how fast it evolves. My skills in math and tech ended up being a good combination. I began learning some open source solutions like Hadoop, Spark, and MySQL for fun and started to apply them in roles I had throughout my career. After my time in the ad agency world, I transitioned into the ad tech industry, where I was introduced to how cloud solutions were powering ad tech solutions like demand-side, data management, and supply-side platforms. I'm the type of person who can get easily bored doing the same thing day in and day out, so I pursued a career in data/tech because it's always evolving. As a result, it forces you to evolve with it. I love the feeling of starting something from scratch and slowly mastering a skill.

What courses, studies, degrees, or certifications were instrumental to your progression and success in the field? In your opinion, what data skills or competencies should data practitioners focus on acquiring to be successful in 2022, and why?

My foundation in math, calculus, and statistics was instrumental for me. Learning at my own pace and getting to know the open source solutions was a plus. What I love about Google is that it provides you with an abundance of resources and information to get started, become proficient, and master skills. Coursera is a great place to get familiar with Google Cloud and prepare for certifications.
Quests in Qwiklabs are probably one of my favorite ways of learning because you actually have to put in the work and experience first-hand what it's like to use Google Cloud solutions. Lastly, I would also say that just going to the Google Cloud documentation and spending some time reading and getting familiar with all the use cases can make a huge difference.

For those who want to acquire the right skills, I would suggest starting with the fundamentals. Before jumping into Google Cloud, make sure you have a good understanding of Python, SQL, data, and some popular open source tools. From there, start mastering Google Cloud by first learning the fundamentals and then putting things into practice with labs. Obtain a professional certification; it can be quite challenging, but it is rewarding once you've earned it. If possible, add more dimension to your data expertise by studying real-life applications in an industry that you are passionate about.

I am fortunate to be a Google Cloud Certified Professional Data Engineer and hold certifications in Looker, Google Analytics, Tag Manager, Display & Video 360, Campaign Manager 360, Search Ads 360, and Google Ads. I am also currently working to obtain my Google Cloud Machine Learning Engineer certification. Combining data applications with analytics and marketing has proven instrumental throughout my career. The ultimate skill is not knowledge of or competency in a specific topic, but the ability to draw on a varied range of abilities and views in order to solve complicated challenges.

You're no doubt a thought leader in the field. What drew you to Google Cloud? How have you given back to your community with your Google Cloud learnings?

Google Cloud solutions are highly distributed, allowing companies to use the same resources an organization like Google uses internally, but for their own business needs. With Google being a clear leader in the analytics/marketing space, the possibilities and applications are endless. As a Google Marketing Platform partner that has worked with the various ad tech stacks Google has to offer, merging Google Cloud and GMP for disruptive outcomes and solutions is really exciting.

I consider myself a very fortunate person, who came from a developing country and was given amazing opportunities from both an educational and a career standpoint. I have always wanted to give back in the form of teaching and creating opportunities, especially for Latinos and US Hispanics. Since 2018, I've partnered with Florida International University Honors College and Google to create industry-relevant courses. I've had the privilege to co-create the curriculum and teach on quite a variety of topics. We introduced a class called Marketing for the 21st Century, which had a heavy emphasis on the Google Marketing Platform. Given its success, in 2020 we introduced Analytics for the 21st Century, where we incorporated key components of Google Cloud into the curriculum. Students were even fortunate enough to learn from Googlers like Rob Milks (Data Analytics Specialist) and Carlos Augusto (Customer Engineer).

What are one or two of your favorite projects you've done with Google Cloud's data products?

My favorite project to date is the work we have done with Royal Caribbean International (RCI) and Roar Media. Back in 2018, we were able to transition RCI's efforts from a fragmented ad tech stack to a consolidated one within the Google Marketing Platform. Moreover, we were able to centralize attribution across all paid marketing channels.
With the vast amount of data we were capturing (17+ markets), it was only logical to leverage Google Cloud solutions in the next step of our journey. We centralized all data sources in the warehouse and deployed business intelligence across business units. The biggest challenge from the start was designing an architecture that would meet both business and technical requirements. We had to consider the best way to ingest data from several different sources, unify them, transform data as needed, visualize it for decision makers, and set the foundations to apply machine learning. Having deep expertise in marketing/analytics platforms combined with an understanding of data engineering helped me tremendously in leading the process, designing and implementing the ideal architecture, and presenting end users with information that makes a difference in their daily jobs.

We utilized BigQuery as a centralized data warehouse to integrate all marketing sources (paid, organic, and research) through custom-built pipelines. From there we created data-driven dashboards in Looker, decentralizing data and giving end users the ability to explore and answer key questions and make real-time, data-driven business decisions. An evolution of this initiative has been to go beyond marketing data and apply machine learning. We have created dashboards that look into COVID trends, competitive pricing, SEO optimizations, and data feeds for dynamic ads. On the ML side, we have created predictive models on the revenue side, built mixed marketing models, and applied machine learning to translate English-language ads into over 17 languages leveraging historical data.

What are your favorite Google Cloud data products within the data analytics, databases, and/or AI/ML categories? What use case(s) do you most focus on in your work? What stands out about Google Cloud's offerings?

I am a big fan of BigQuery (BQ) and Looker. Traditional data warehouses are no match for the cloud; they're not built to accommodate the exponential growth of today's data and the sophisticated analytics required. BQ offers a fast, highly scalable, cost-effective, and fully managed cloud data warehouse for integrated machine learning analytics and the implementation of AI. Looker, on the other hand, is truly next-generation BI. We all love Structured Query Language (SQL), but I think many of us have been in the position of writing dense queries and forgetting how some aspects of the code work, experiencing the limited collaboration options, knowing that people write queries in different ways, and finding how difficult it can be to track changes in a query if you changed your mind on a measure. I love how LookML solves all those challenges, and how it helps you reuse, control, and separate SQL into building blocks. Not to mention how easy it is to give end users with limited technical knowledge the ability to look at data on their terms.

What's next for you?

I am really excited about everything we are doing at Direcly. We have come a long way, and I'm optimistic that we can go even further. Next for me is to keep on working with a group of incredibly bright people who are obsessed with using innovative technological solutions to solve business challenges faced by other incredibly bright people.

From this story, I would like to tell those who are pursuing a dream, who are looking to provide a better life for themselves and their loved ones: do it, take risks, never stop learning, and put in the work.
Things may or may not go your way, but keep persevering; you'll be surprised at how it becomes more about the journey than the destination. And whether things don't go as planned or you have a lot of success, you will remember everything you've been through and how far you've come from where you started.

Want to join the data engineer community? Register for the Data Engineer Spotlight, where attendees have the chance to learn from four technical how-to sessions and hear from Google Cloud experts on the latest product innovations that can help you manage your growing data.
Source: Google Cloud Platform

Load balancing Google Cloud VMware Engine with Traffic Director

The following solution brief discusses a GCVE + Traffic Director implementation aimed at providing customers with an easy way to scale out web services while enabling application migrations to Google Cloud. The solution is built on top of a flexible and open architecture that exemplifies the unique capabilities of Google Cloud Platform. Let's elaborate:

- Easy: The full configuration takes minutes to implement and can be scripted or defined with Infrastructure-as-Code (IaC) for rapid consumption and minimal errors.
- Flexible and open: The solution relies on Envoy, an open source platform that enjoys tremendous popularity with the network and application communities.

The availability of Google Cloud VMware Engine (GCVE) has given GCP customers the ability to deploy cloud applications on a certified VMware stack that is managed, supported, and maintained by Google. Many of these customers also demand seamless integration between their applications running on GCVE and the various infrastructure services provided natively by our platform, such as Google Kubernetes Engine (GKE) or serverless frameworks like Cloud Functions, App Engine, and Cloud Run. Networking services are at the top of that list.

In this blog, we discuss how Traffic Director, a fully managed control plane for service mesh, can be combined with our portfolio of load balancers and with hybrid network endpoint groups (hybrid NEGs) to provide a high-performance front end for web services hosted in VMware Engine. Traffic Director also serves as the glue that links the native GCP load balancers and the GCVE backends, with the objective of enabling these technical benefits:

- Certificate Authority integration, for full lifecycle management of SSL certificates.
- DDoS protection with Cloud Armor, which helps protect your applications and websites against denial-of-service and web attacks.
- Cloud CDN, for cached content delivery.
- Intelligent anycast with a single IP and global reach, for improved failover, resiliency, and availability.
- Bring Your Own IP (BYOIP), to provision and use your own public IP addresses for Google Cloud resources.
- Integration of diverse backend types in addition to GCVE, such as GCE, GKE, Cloud Storage, and serverless.

Scenario #1 – External load balancer

The following diagram provides a summary of the GCP components involved in this architecture. This scenario shows an external HTTP(S) load balancer used to forward traffic to the Traffic Director data plane, implemented as a fleet of Envoy proxies. Users can create routable NSX segments and centralize the definition of all traffic policies in Traffic Director. The GCVE VM IP and port pairs are specified directly in the hybrid NEG, meaning all network operations are fully managed by a Google Cloud control plane.

Alternatively, GCVE VMs can be deployed to a non-routable NSX segment behind an NSX L4 load balancer configured at the Tier-1 level, and the load balancer VIP can be exported to the customer VPC via the import and export of routes in the VPC peering connection.
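For reference, here is a rough sketch of how the hybrid NEG used in these scenarios might be created and populated programmatically, whether its endpoints are individual GCVE VM IP:port pairs or an NSX load balancer VIP. This is not from the original post: the project, zone, network, and IP values are placeholders, and it drives the Compute Engine API through the Google API Python client, so verify the exact fields against the hybrid NEG documentation before relying on it.

```python
# Illustrative sketch: create a hybrid NEG and attach GCVE endpoints to it.
# Project, zone, network, and endpoint values below are placeholders.
from googleapiclient import discovery

PROJECT = "my-project"
ZONE = "us-central1-a"

compute = discovery.build("compute", "v1")

# 1. Create a zonal NEG of type NON_GCP_PRIVATE_IP_PORT (a "hybrid" NEG).
neg_body = {
    "name": "gcve-web-neg",
    "networkEndpointType": "NON_GCP_PRIVATE_IP_PORT",
    "defaultPort": 443,
    "network": f"projects/{PROJECT}/global/networks/customer-vpc",
}
compute.networkEndpointGroups().insert(
    project=PROJECT, zone=ZONE, body=neg_body
).execute()

# 2. Attach the GCVE web server IP:port pairs (or a single NSX LB VIP).
endpoints_body = {
    "networkEndpoints": [
        {"ipAddress": "192.168.10.11", "port": 443},
        {"ipAddress": "192.168.10.12", "port": 443},
    ]
}
compute.networkEndpointGroups().attachNetworkEndpoints(
    project=PROJECT,
    zone=ZONE,
    networkEndpointGroup="gcve-web-neg",
    body=endpoints_body,
).execute()
```

A backend service referenced by Traffic Director or the external load balancer would then point at this NEG as its backend.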
It is important to note that in GCVE, it is highly recommended that NSX-T load balancers be associated with Tier-1 gateways, not the Tier-0 gateway. The steps to configure load balancers in NSX-T, including server pools, health checks, virtual servers, and distribution algorithms, are documented by VMware and not covered in this document.

Fronting the web applications with an NSX load balancer allows for the following:

- Only VIP routes are announced, allowing the use of private IP addresses in the web tier, as well as overlapping IP addresses in the case of multi-tenant deployments.
- Internal clients (applications inside GCP or GCVE) can point to the VIP of the NSX load balancer, while external clients point to the public VIP in front of a native GCP external load balancer.
- An L7 NSX load balancer can also be used (not discussed in this example) for advanced application-layer services, such as cookie session persistence, URL mapping, and more.

To recap, the implementation discussed in this scenario uses an external HTTP(S) load balancer, but an external TCP/UDP network load balancer or TCP proxy could also be used to support protocols other than HTTP(S). There are certain restrictions when using Traffic Director in L4 mode, such as a single backend service per target proxy, which need to be accounted for when implementing your architecture.

Scenario #2 – Internal load balancer

In this scenario, the only change is the load balancing platform used to route requests to the Traffic Director-managed Envoy proxies. This use case may be appropriate in certain situations, for instance, whenever users want to take advantage of advanced traffic management capabilities not supported without Traffic Director, as documented here. The Envoy proxies controlled by Traffic Director can send traffic directly to GCVE workloads. Alternatively, and similar to what was discussed in Scenario #1, an NSX LB VIP can be used instead of the explicit GCVE VM IPs, which introduces an extra load balancing layer.

To recap, this scenario shows a possible configuration with an L7 internal load balancer, but an L4 internal load balancer can also be used to support protocols other than HTTP(S). Please note there are certain considerations when leveraging L4 vs. L7 load balancers in combination with Traffic Director, which are all documented here.

Conclusion

With the combination of multiple GCP products, customers can take advantage of the various distributed network services offered by Google, such as global load balancing, while hosting their applications in a Google Cloud VMware Engine environment that provides continuity for their operations, without sacrificing availability, reliability, or performance.

Go ahead and review the GCVE networking whitepaper today. For additional information about VMware Engine, please visit the VMware Engine landing page and explore our interactive tutorials. And be on the lookout for future articles, where we will discuss how VMware Engine integrates with other core GCP infrastructure and data services.
Source: Google Cloud Platform

Top 5 use cases for Google Cloud Spot VMs explained + best practices

Cloud was built on the premise of flexible infrastructure that grows and shrinks with your application demands. Applications that can take advantage of this elastic infrastructure and scale horizontally offer significant advantages over competitors by allowing infrastructure costs to scale up and down along with demand. Google Cloud's Spot VMs enable our customers to make the most of our idle capacity where and when it is available. Spot VMs are offered at a significant discount from list price to drive maximum savings, provided customers have flexible, stateless workloads that can handle preemption: Spot VMs can be reclaimed by Google with 30 seconds' notice. When you deploy the right workloads on Spot VMs, you can maintain elasticity while also taking advantage of the best discounts Google has to offer.

This blog discusses a few common use cases and design patterns we have seen customers use Spot VMs for, along with the best practices for these use cases. While this is not an exhaustive list, it serves as a template to help customers make the most of Spot VM savings while still reaching their application and workload objectives.

Media rendering

Rendering workloads (such as rendering 2D or 3D elements) can be both compute and time intensive, requiring skilled IT resources to manage render farms. Job management becomes even more difficult when the render farm is at 100% utilization. Spot VMs are ideal resources for fault-tolerant rendering workloads; when combined with a queuing system, customers can integrate the preemption notice to track preempted jobs. This allows you to build a render farm that benefits from reduced TCO. If your renderer supports taking snapshots of in-progress renders at specified intervals, writing these snapshots to a persistent data store (Cloud Storage) will limit any loss of work in the event the Spot VM is preempted. As subsequent Spot VMs are created, they can pick up where the old ones left off by using the snapshots in Cloud Storage. You can also leverage the new "suspend and resume a VM" feature, which allows you to keep VM instances during the preemption event without incurring charges while the VM is not in use.

Additionally, we have helped customers combine local render farms in their existing datacenters with cloud-based render farms, allowing a hybrid approach for large or numerous render workloads without increasing their investment in physical datacenters. Not only does this reduce their capital expenses, but it adds flexible scalability to the existing farm and provides a better experience for their business partners.

Financial modeling

Capital market firms have invested significantly in their infrastructure to create state-of-the-art, world-class compute grids. Since compute grids began, in-house researchers have leveraged these large grids in physical datacenters to test their trading hypotheses and perform backtesting. But as the business grows, what happens when all the researchers each have a brilliant idea and want to test it at the same time? Researchers have to compete with one another for the same limited resources, which leads to queueing their jobs and increased lead times for testing their ideas. And in financial markets, time is always scarce. Enter cloud computing and Spot VMs. Capital market firms can use Google Cloud as an extension of their on-premises grid by spinning up temporary compute resources.
Or they can go all in on cloud and build their grid in Google Cloud entirely. In either scenario, Spot VMs are ideal candidates for bursting research workloads given the transient nature of the workload and heavily discounted prices of VMs. This enables researchers to test more hypotheses at a lower cost per test, in turn producing better models for firms. Google Cloud Spot VM discounts not only apply to the VMs themselves, but also to any GPU accelerator attached to them, providing even more processing power to a firm looking to process larger more complex models. Once these jobs have completed, Spot VMs can be quickly spun down, maintaining strict control on costs. CI/CD pipelinesContinuous integration (CI) and Continuous delivery (CD) tools are very common for the modern application developer. These tools allow developers to create a testing pipeline that enables developers and quality engineers to ensure the newly created code works with their environment and that the deployment process does not break anything during deployment. CI/CD tools and test environments are great workloads to run on Spot VMs since CI/CD pipelines are not mission-critical for most companies — a delay in deployment or testing by 15 minutes, or even a few hours, is not material to their business. This means that companies can lower the cost of operating their CI/CD pipeline significantly through the use of Spot VMs. A simple example of this would be to install the Jenkins Master Server in a Managed Instance Group (MIG) with the VM type set to Spot. If the VM gets preempted, the CI/CD pipelines will stall until the MIG can find resources again to spin up a new VM. The first reaction may be concern that Jenkins persists data locally, which is problematic for Spot VMs. However, customers can move the Jenkins directory (/var/lib/Jenkins) to Google Cloud Filestore and preserve this data. Then when the new Spot VM spins up, it will reconnect to the directory. In the case of a large-scale Jenkins deployment, build VMs can utilize Spot VMs as part of a MIG to scale as necessary while ensuring that the builds can be maintained with on-demand VMs. This blended approach removes any risk to the builds, while still allowing customers to save up to 91% in costs of the additional VMs versus traditional on-demand VMs.Web services and appsLarge online retailers have found ways to drive massive increases in order volume. Typically companies like this target a specific time each month, such as the last day of the month, through a unique promotion process. This means that they are in many cases creating a Black Friday/Cyber Monday-style event, each and every month! In order to support this, companies traditionally used a “Build it like a stadium for Super Bowl Sunday” model. The issue with that, and a reason most professional sports teams have practice facilities, is that it’s very expensive to keep all the lights, climate control, and ancillary equipment running for the sole purpose of practice. 29-30 days of a month most infrastructure sits idle, wasting HVAC, electricity, etc. However, using the elasticity of cloud, we could manage this capacity and turn it up only when necessary. But to drive even more optimization and savings, we turn to Spot VMs. Spot VMs really shine during these kinds of scale-out events. Imagine the above scenario: what if behind a load balancer we could have:One MIG to help scale the web frontends. 
Web services and apps
Large online retailers have found ways to drive massive increases in order volume. These companies typically target a specific time each month, such as the last day of the month, with a unique promotion — in many cases creating a Black Friday/Cyber Monday-style event every single month. To support this, companies traditionally used a "build it like a stadium for Super Bowl Sunday" model. The problem with that approach — and a reason most professional sports teams have practice facilities — is that it's very expensive to keep all the lights, climate control, and ancillary equipment running just for practice. For 29 or 30 days of the month, most of the infrastructure sits idle, wasting HVAC, electricity, and more. With the elasticity of the cloud, we can instead turn this capacity up only when necessary; to drive even more optimization and savings, we turn to Spot VMs. Spot VMs really shine during these kinds of scale-out events. Imagine the scenario above: behind a load balancer we could have:
- One MIG to help scale the web frontends, sized with on-demand VMs to handle day-to-day traffic.
- A second MIG of Spot VMs that scales up starting at 11:45pm the night before the end of month. The first and second MIGs can now handle roughly 80-90% of the workload.
- A third MIG of on-demand VMs that spins up as the workload bursts to handle any remaining traffic, should the Spot MIG not find enough capacity — ensuring we meet our SLAs while keeping costs as tight as possible.

Kubernetes
Now you may say, "That's all well and good, but we're a fully modernized container shop using Google Kubernetes Engine (GKE)." You are in luck — Spot VMs are integrated with GKE, so you can quickly and easily save on your GKE workloads by using Spot VMs with Standard GKE clusters or Spot Pods with Autopilot clusters. GKE supports gracefully shutting down Spot VMs, notifying your workloads that they will be shut down and giving them time to exit cleanly; GKE then automatically reschedules your deployments. With Spot Pods, you can use Kubernetes nodeSelectors and/or node affinity to control the placement of spot workloads, striking the right balance between cost and availability across spot and on-demand compute.
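As a rough illustration of the nodeSelector approach, the sketch below uses the Kubernetes Python client to create a Deployment whose Pods are placed only on Spot capacity via the cloud.google.com/gke-spot node label. The Deployment name, image, and replica count are placeholders; if your Spot node pool is tainted, you would also add the matching toleration.

from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="batch-worker"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "batch-worker"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "batch-worker"}),
            spec=client.V1PodSpec(
                # Schedule these Pods only onto Spot capacity.
                node_selector={"cloud.google.com/gke-spot": "true"},
                # Aim to finish cleanup well within the short preemption notice.
                termination_grace_period_seconds=25,
                containers=[
                    client.V1Container(
                        name="worker",
                        image="gcr.io/my-project/worker:latest",  # placeholder image
                    ),
                ],
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)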
General best practices
To take advantage of Spot VMs, your use case doesn't have to exactly match any of those described above. If a workload is stateless, scalable, can be stopped and checkpointed in less than 30 seconds, or is location- and hardware-flexible, it may be a good fit for Spot VMs. There are several actions you can take to help your Spot workloads run as smoothly as possible. Below we outline a few best practices to consider:
1. Deploy Spot behind Regional Managed Instance Groups (RMIGs). RMIGs are a great fit for Spot workloads because they can recreate instances that are preempted. Using your workload's profile, determine the RMIG's target distribution shape. For example, with a batch research workload you might select the ANY target distribution shape, which lets Spot instances be distributed in any manner across the zones and take advantage of any underutilized resources. You can mix on-demand RMIGs and Spot RMIGs to maintain stateful applications while increasing availability in a cost-effective manner.
2. Ensure you have a shutdown script. When a Spot VM is preempted, use a shutdown script to checkpoint your workload to Cloud Storage and perform any graceful shutdown steps. When drafting your shutdown script, test it by manually stopping or deleting an instance with the script attached and validating the intended behavior.
3. Write checkpoint files to Cloud Storage.
4. Consider using multiple MIGs behind your load balancer.
Whether your workload is graphics rendering, financial modeling, scaled-out ecommerce, or any other stateless use case, Spot VMs are the best and easiest way to reduce your cost of operating it by more than 60%. By following the examples and best practices above, you can ensure that Spot VMs deliver the right outcome. Get started today with a free trial of Google Cloud.

Acknowledgement
Special thanks to Dan Sheppard, Product Manager for Cloud Compute, for contributing to this post.
Quelle: Google Cloud Platform

Building a Mobility Dashboard with Cloud Run and Firestore

Visualization is the key to understanding massive amounts of data. Today we have BigQuery and Looker to analyze petabyte-scale data and extract insights in sophisticated ways. But what about monitoring data that changes every second? In this post, we will walk through how to build a real-time dashboard with Cloud Run and Firestore.

Mobility Dashboard
Many business use cases require real-time updates: for example, inventory monitoring in retail stores, security cameras, and MaaS (Mobility as a Service) applications such as ride sharing. In the MaaS business, vehicle locations are very useful for making business decisions. In this post, we are going to build a mobility dashboard that monitors vehicles on a map in real time.

The Architecture
The dashboard should be accessible from a web browser without any setup on the client side. Cloud Run is a good fit because it generates a URL for the service and, of course, scales to handle millions of users. We also need an app that can plot geospatial data, and a database that can broadcast its updates. Here are the choices and architecture:
- Cloud Run — hosts the web app (dashboard)
  - streamlit — a library to visualize data and build the web app
  - pydeck — a library to plot geospatial data
- Firestore — a fully managed database that keeps your data in sync
The diagram below illustrates the architecture of the system. In a production environment, you may also need a data ingestion and transformation pipeline. Before going to the final form, let's take a few steps to understand each component.

Step 1: Build a data visualization web app with Cloud Run + streamlit
streamlit is an OSS web app framework for creating beautiful data visualization apps without front-end knowledge (e.g. HTML, JS). If you are familiar with pandas DataFrames for your data analytics, it won't take long to implement. For example, you can visualize a DataFrame in a few lines of code:

import streamlit as st
import pandas as pd
import numpy as np

chart_data = pd.DataFrame(
    np.random.randn(20, 3),
    columns=['a', 'b', 'c'])
st.line_chart(chart_data)

The chart on the webapp (Source)

Making this app runnable on Cloud Run is easy. Just add streamlit to requirements.txt and write a Dockerfile based on a typical Python web app image. If you are not familiar with Docker, buildpacks can do the job: instead of a Dockerfile, create a Procfile with just one line:

web: streamlit run app.py --server.port $PORT --server.enableCORS=false

To summarize, the only required files are:

.
|-- app.py
|-- Procfile
|-- requirements.txt

Deployment is also easy — a single command deploys the app to Cloud Run:

$ gcloud run deploy mydashboard --source .

This command builds your image with buildpacks and Cloud Build, so you don't need to set up a build environment on your local system. Once deployment is complete, you can access your web app at the generated URL, like https://xxx-[…].run.app.
Copy and paste the URL into your web browser, and you will see your first dashboard web app.

Step 2: Add a callback function that receives changes from the Firestore database
In Step 1, you can visualize your data with fixed conditions or interactively through streamlit UI widgets. Now we want the dashboard to update by itself. Firestore is a scalable NoSQL database that keeps your data in sync across client apps through real-time listeners. It is available on Android and iOS and provides SDKs for major programming languages; since our streamlit app is in Python, we use the Python client. Although this post doesn't cover Firestore in detail, it is easy to implement a callback function that runs when a specific "Collection" changes. [reference]

from google.cloud import firestore_v1

db = firestore_v1.Client()
collection_ref = db.collection(u'users')

def on_snapshot(collection_snapshot, changes, read_time):
    for doc in collection_snapshot:
        print(u'{} => {}'.format(doc.id, doc.to_dict()))

# Watch this collection
collection_watch = collection_ref.on_snapshot(on_snapshot)

In this code, the on_snapshot callback is called whenever the users Collection changes. You can also watch changes to a single Document. Since Firestore is a fully managed database, you don't need to provision the service ahead of time; you only choose the "mode" and location. To use the real-time sync functionality, select "Native mode", and pick the nearest or desired location.

Using Firestore with streamlit
Now let's combine Firestore with streamlit: we add the on_snapshot callback and update a chart with the latest data sent from Firestore. One quick note when using the callback with streamlit: on_snapshot runs in a sub-thread, whereas streamlit UI manipulation must happen in the main thread. Therefore, we use a Queue to pass the data between threads. The code looks something like this:

from queue import Queue

q = Queue()
def on_snapshot(collection_snapshot, changes, read_time):
    for doc in collection_snapshot:
        q.put(doc.to_dict())  # Put data into the Queue

# below will run in the main thread
snap = st.empty()  # placeholder

while True:
    # q.get() is a blocking call, so it's recommended to add a timeout
    doc = q.get()  # Read from the Queue
    snap.write(doc)  # Change the UI

Deploy this app and write something to the collection you are watching; you will see the updated data in your web app.
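One refinement before moving on: as the comment in the loop notes, a blocking q.get() with no timeout can leave the script waiting forever when no updates arrive. A small variation keeps the loop responsive; the 30-second timeout and variable names are illustrative only.

from queue import Empty  # raised by q.get() when the timeout expires

# Replaces the while-loop above; q and snap are defined as in the previous snippet.
last_doc = None
while True:
    try:
        last_doc = q.get(timeout=30)  # wait up to 30s for a change from on_snapshot
    except Empty:
        pass  # no update this cycle; keep showing the last known data
    if last_doc is not None:
        snap.write(last_doc)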
Step 3: Plot geospatial data with streamlit
We've seen how to host a web app on Cloud Run and how to keep data updated with Firestore. Now we need to plot geospatial data with streamlit. streamlit has multiple ways to plot geospatial data that includes latitude and longitude; here we use st.pydeck_chart(), a wrapper around deck.gl, a geospatial visualization library. For example, provide the latitude and longitude data to plot, then add layers to visualize it:

import streamlit as st
import pydeck as pdk
import pandas as pd
import numpy as np

df = pd.DataFrame(
    np.random.randn(1000, 2) / [50, 50] + [37.76, -122.4],
    columns=['lat', 'lon'])
st.pydeck_chart(pdk.Deck(
    map_provider="carto",
    map_style='road',
    initial_view_state=pdk.ViewState(
        latitude=37.76,
        longitude=-122.4,
        zoom=11,
        pitch=50,
    ),
    layers=[
        pdk.Layer(
            'HexagonLayer',
            data=df,
            get_position='[lon, lat]',
            radius=200,
            elevation_scale=4,
            elevation_range=[0, 1000],
            pickable=True,
            extruded=True,
        ),
        pdk.Layer(
            'ScatterplotLayer',
            data=df,
            get_position='[lon, lat]',
            get_color='[200, 30, 0, 160]',
            get_radius=200,
        ),
    ],
))

Plotting with st.pydeck_chart (Source)

pydeck supports multiple map platforms; here we chose CARTO. If you would like to see more great examples using CARTO and deck.gl, please refer to this blog.

Step 4: Plot mobility data
We are very close to the goal: plotting the locations of vehicles. pydeck offers several ways to plot data, and TripsLayer is a good fit for mobility data.

Demo using the Google Maps JavaScript API (Source)

TripsLayer visualizes location data as a time sequence: when you select a specific timestamp, it plots lines from the location data at that time, including the last n periods. It also draws an animation when you step through time in order. In the final form, we also add an IconLayer to mark each vehicle's latest location. This layer is also useful for plotting a static location — it works just like a "pin" on Google Maps. Now we need to think about how to feed this plot from Firestore. Let's create a Document per vehicle and save only the latest latitude, longitude, and timestamp of each vehicle. Why not save the full location history? For that, we should use BigQuery instead; here we just want the latest locations, updated in real time. Firestore is useful and scalable, but it is NoSQL — some workloads are a good fit and others are not.

Location data in Firestore Console
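To make the Step 4 description concrete, here is a rough pydeck sketch combining a TripsLayer for the trails with an IconLayer for each vehicle's latest position. The DataFrames, column names, icon URL, and timestamps below are illustrative placeholders, not the exact schema used in the demo.

import pandas as pd
import pydeck as pdk
import streamlit as st

ICON_URL = "https://example.com/marker.png"  # placeholder marker image

# One row per vehicle: 'path' is a list of [lon, lat] points and 'timestamps'
# the matching times; in the real app you would build these from the
# Firestore documents.
trips_df = pd.DataFrame([{
    "path": [[-74.00, 40.72], [-73.99, 40.73], [-73.98, 40.74]],
    "timestamps": [0, 60, 120],
}])
latest_df = pd.DataFrame([{
    "lon": -73.98, "lat": 40.74,
    "icon_data": {"url": ICON_URL, "width": 128, "height": 128, "anchorY": 128},
}])

st.pydeck_chart(pdk.Deck(
    map_provider="carto",
    map_style="road",
    initial_view_state=pdk.ViewState(latitude=40.72, longitude=-74.0, zoom=12),
    layers=[
        pdk.Layer(
            "TripsLayer",
            data=trips_df,
            get_path="path",
            get_timestamps="timestamps",
            get_color=[253, 128, 93],
            width_min_pixels=4,
            trail_length=120,   # how far back in time the trail reaches
            current_time=120,   # advance this value to animate the trips
        ),
        pdk.Layer(
            "IconLayer",
            data=latest_df,
            get_icon="icon_data",
            get_position="[lon, lat]",
            get_size=4,
            size_scale=10,
        ),
    ],
))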
Step 5: Run
Finally, we are here. Now let's ride in a car and record data… if possible. For demo purposes, we ingest dummy data into Firestore instead; it is easy to write data with the client library:

from google.cloud import firestore

db = firestore.Client()
col_ref = db.collection('connected')
col_ref.document(str(vehicle_ind)).set({  # vehicle_ind: index of the simulated vehicle
    'lonlat': [-74, 40.72],
    'timestamp': 0
})

While writing the dummy data, open the web page hosted on Cloud Run; you will see the map update as new data arrives.

Firestore syncs data on streamlit

Note that we used dummy data and manipulated the timestamps, so the locations update much faster than real time. This resolves itself once you use real data and a realistic update cycle.

Try it with your data
In this post, we learned how to build a real-time dashboard with Cloud Run and Firestore. Let us know when you find other use cases for these Google Cloud products. Find out more about automotive solutions here. Haven't used Google Cloud yet? Try it from here. Check out the source code on GitHub.

Related Article
Discover our new edge concepts at Hannover Messe that bring smart factories to life
Intel and Google Cloud demonstrate edge-to-cloud technology at Hannover Messe.
Read Article
Quelle: Google Cloud Platform

Announcing new BigQuery capabilities to help secure sensitive data

In order to better serve their customers and users, digital applications and platforms continue to store and use sensitive data such as Personally Identifiable Information (PII), genetic and biometric information, and credit card information. Many organizations that provide data for analytics face evolving regulatory and privacy mandates, ongoing risks from data breaches and leakage, and a growing need to control data access. Access control and masking of sensitive information become even more complex for large enterprises building massive data ecosystems. Copies of datasets are often created to manage access for different groups, and sometimes some copies are obfuscated while others aren't. This creates an inconsistent approach to protecting data that can be expensive to manage. To fully address these concerns, sensitive data needs to be protected with the right defense mechanism at the base table itself, so the data stays secure throughout its entire lifecycle. Today, we're excited to introduce two new capabilities in BigQuery that add a second layer of defense on top of access controls to help secure and manage sensitive data.

1. General availability of BigQuery column-level encryption functions
BigQuery column-level encryption SQL functions let you encrypt and decrypt data at the column level in BigQuery. These functions unlock use cases where data is natively encrypted in BigQuery and must be decrypted when accessed, as well as use cases where data is encrypted externally, stored in BigQuery, and then decrypted when accessed. The SQL functions support the industry-standard encryption algorithms AES-GCM (non-deterministic) and AES-SIV (deterministic); the AES-SIV functions allow grouping, aggregation, and joins on encrypted data. In addition to these SQL functions, we have integrated BigQuery with Cloud Key Management Service (Cloud KMS). This gives you additional control: you can manage your encryption keys in Cloud KMS, with secure on-access key retrieval and detailed logging. An additional layer of envelope encryption enables the generation of wrapped keysets to decrypt data; only users with permission to access both the Cloud KMS key and the wrapped keyset can unwrap the keyset and decrypt the ciphertext.

"Enabling dynamic field level encryption is paramount for our data fabric platform to manage highly secure, regulated assets with rigorous security policies complying with several regulations including FedRAMP, PCI, GDPR, CCPA and more.
BigQuery column-level encryption capability provides us with a secure path for decrypting externally encrypted data in BigQuery, unblocking analytical use cases across more than 800 analysts," said Kumar Menon, CTO of Equifax.

Users can leverage the available SQL functions for both non-deterministic and deterministic encryption, the latter enabling joins and grouping on encrypted data columns. The following query sample uses the non-deterministic SQL functions to decrypt ciphertext:

SELECT
  AEAD.DECRYPT_STRING(KEYS.KEYSET_CHAIN(
    @kms_resource_name,
    @wrapped_keyset),
  ciphertext,
  additional_data)
FROM
  ciphertext_table
WHERE
  ...

The following query sample uses the deterministic SQL functions to decrypt ciphertext:

SELECT
  DETERMINISTIC_DECRYPT_STRING(KEYS.KEYSET_CHAIN(
    @kms_resource_name,
    @wrapped_keyset),
  ciphertext,
  additional_data)
FROM
  ciphertext_table
WHERE
  ...
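Because the KMS resource name and wrapped keyset are passed as query parameters, these functions are also straightforward to call from client code. Below is a minimal sketch using the BigQuery Python client; the dataset, table, KMS key path, and keyset file are placeholders.

from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

sql = """
SELECT
  AEAD.DECRYPT_STRING(
    KEYS.KEYSET_CHAIN(@kms_resource_name, @wrapped_keyset),
    ciphertext,
    additional_data) AS plaintext
FROM
  my_dataset.ciphertext_table
"""

# The wrapped keyset produced when you set up envelope encryption,
# read here from a local file (placeholder path).
with open("wrapped_keyset.bin", "rb") as f:
    wrapped_keyset = f.read()

job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter(
            "kms_resource_name", "STRING",
            "gcp-kms://projects/my-project/locations/us/keyRings/my-ring/cryptoKeys/my-key"),
        bigquery.ScalarQueryParameter("wrapped_keyset", "BYTES", wrapped_keyset),
    ]
)

for row in client.query(sql, job_config=job_config):
    print(row.plaintext)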
2. Preview of dynamic data masking in BigQuery
Extending BigQuery's column-level security, dynamic data masking lets you obfuscate sensitive data and control user access while mitigating the risk of data leakage. It selectively masks column-level data at query time based on the defined masking rules, user roles, and privileges. Masking eliminates the need to duplicate data: you can define different masking rules on a single copy of the data to desensitize it, simplify user access to sensitive data, and protect against compliance, privacy, or confidentiality issues. Dynamic data masking supports different transformations of the underlying sensitive data to obfuscate it at query time. Masking rules are defined on policy tags in a taxonomy to grant varying levels of access based on the user's role and function and the type of sensitive data. Masking adds to the existing access controls, giving customers a wide range of options for controlling access: an administrator can grant a user full access, no access, or partial access with a particular masked value, depending on the data sharing use case. For the preview of data masking, three masking policies are supported:
- ALWAYS_NULL. Nullifies the content regardless of column data type.
- SHA256. Applies SHA256 to STRING or BYTES data types. Note that the same restrictions apply as for the SHA256 function.
- DEFAULT_VALUE. Returns the default value for the data type.
A user must first have all the permissions necessary to run a query job against a BigQuery table. In addition, to view the masked data of a column tagged with a policy tag, the user needs the MaskedReader role.

When to use dynamic data masking vs. encryption functions?
Common scenarios for using data masking or column-level encryption are:
- protecting against unauthorized data leakage
- access control management
- compliance with data privacy laws for PII, PHI, and PCI data
- creating safe test datasets
Specifically, masking can be used for real-time transactions, whereas encryption provides additional security for data at rest or in motion where real-time usability is not required. Any masking policies or encryption applied to the base tables carry over to authorized views and materialized views, and masking and encryption are compatible with other security features such as row-level security. These newly added BigQuery security features, along with automatic DLP, can help you scan data across your entire organization, gain visibility into where sensitive data is stored, and manage access and usability of data for different use cases across your user base. We're always working to enhance BigQuery's (and Google Cloud's) data governance capabilities to enable end-to-end management of your sensitive data. With these new releases, we are adding deeper protections for your data in BigQuery.

Related Article
Build a secure data warehouse with the new security blueprint
Introducing our new security blueprint that helps enterprises build a secure data warehouse.
Read Article
Quelle: Google Cloud Platform