Building your own private knowledge graph on Google Cloud

A Knowledge Graph ingests data from multiple sources, extracts entities (e.g., people, organizations, places, or things), and establishes relationships among the entities (e.g., owner of, related to) with the help of common attributes such as surnames, addresses, and IDs. Entities form the nodes in the graph, and the relationships are the edges or connections. Building this graph is a valuable step for data analysts and software developers, enabling entity linking and data validation.

The term “Knowledge Graph” was first introduced by Google in 2012 as part of a new Search feature that provides users with answer summaries based on previously collected data from other top results and sources.

Advantages of a Knowledge Graph

Building a Knowledge Graph for your data has multiple benefits:

Clustering text that is identified as one single entity, like “Da Vinci,” “Leonardo Da Vinci,” “L Da Vinci,” “Leonardo di ser Piero da Vinci,” etc., and attaching attributes and relationships to that entity, such as “painter of the Mona Lisa.”
Grouping entities based on similarities, e.g., grouping Da Vinci with Michelangelo because both are famous artists from the late 15th century.

A Knowledge Graph also provides a single source of truth that helps users discover hidden patterns and connections between entities. These linkages would be more challenging and computationally intensive to identify using traditional relational databases.

Knowledge Graphs are widely deployed for various use cases, including but not limited to:

Supply chain: mapping out suppliers, product parts, shipping, etc.
Lending: connecting real estate agents, borrowers, insurers, etc.
Know your customer: anti-money laundering, identity verification, etc.

Deploying on Google Cloud

Google Cloud has introduced two new services (both in Preview as of today):

The Entity Reconciliation API lets customers build their own private Knowledge Graph with data stored in BigQuery.
The Google Knowledge Graph Search API lets customers search for more information about their entities from the Google Knowledge Graph.

To illustrate the new solutions, let’s explore how to build a private knowledge graph using the Entity Reconciliation API and then use the generated ID to query the Google Knowledge Graph Search API. We’ll use the sample data from zoominfo.com for retail companies available on Google Cloud Marketplace (link 1, link 2). To start, enable the Enterprise Knowledge Graph API and then navigate to Enterprise Knowledge Graph from the Google Cloud console.

The Entity Reconciliation API can reconcile tabular records of organization, local business, and person entities in just a few clicks. Three simple steps are involved:

Identify the data sources in BigQuery that need to be reconciled and create a schema mapping file for each source.
Configure and kick off a reconciliation job through the console or API.
Review the results after job completion.

Step 1

For each job and data source, create a schema mapping file to tell Enterprise Knowledge Graph how to ingest the data and map it to a common ontology based on schema.org. This mapping file is stored in a bucket in Google Cloud Storage. For the purposes of this demo, I am choosing the organization entity type and passing in the database schema that I have for my BigQuery table.
Note: always use the latest schema mapping format from our documentation.

```yaml
prefixes:
  ekg: http://cloud.google.com/ekg/0.0.1#
  schema: https://schema.org/

mappings:
  organization:
    sources:
      - [yourprojectid:yourdataset.yourtable~bigquery]
    s: ekg:company_$(id_column_from_table)
    po:
      - [a, schema:Organization]
      - [schema:name, $(name_column_from_table)]
      - [schema:streetAddress, $(address_column_from_table)]
      - [schema:postalCode, $(ZIP_column_from_table)]
      - [schema:addressCountry, $(country_column_from_table)]
      - [schema:addressLocality, $(city_column_from_table)]
      - [schema:addressRegion, $(state_column_from_table)]
      - [ekg:recon.source_name, (chosen_source_name)]
      - [ekg:recon.source_key, $(id_column_from_table)]
```

Step 2

The console page shows the list of existing entity reconciliation jobs available in the project. Create a new job by clicking the “Run A Job” button in the action bar, then select an entity type for entity reconciliation. Add one or more BigQuery data sources and specify a BigQuery dataset destination where Enterprise Knowledge Graph will create new tables with unique names under the destination dataset. To keep the generated cluster IDs constant across different runs, advanced settings such as “previous BigQuery result table” are available. Click “DONE” to create the job.

Step 3

After the job completes, navigate to the output BigQuery table, then use a simple join query similar to the one below to review the output:

```sql
SELECT *
FROM `<dataset>.clusters_14002307131693260818` AS RS
JOIN `<dataset>.retail_companies` AS SRC
  ON RS.source_key = SRC.COMPANY_ID
ORDER BY cluster_id;
```

This query joins the output table with the input table(s) of our Entity Reconciliation API and orders the rows by cluster ID.
Upon investigation, we can see that two entities are grouped into one cluster. The confidence score indicates how likely it is that these entities belong to this group. Last but not least, the cloud_kg_mid column returns the linked Google Cloud Knowledge Graph machine ID (MID), which can be used with the Google Knowledge Graph Search API. Querying the Search API with this MID returns a response that contains a list of entities, presented in JSON-LD format and compatible with schema.org schemas with limited external extensions.

For more information, visit our documentation.

Special thanks to Lewis Liu, Product Manager, and Holt Skinner, Developer Advocate, for the valuable feedback on this content.
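The Search API lookup mentioned above is a simple GET request against the public kgsearch endpoint. The sketch below only builds the request URL; the API key and machine ID are placeholders, not values from this walkthrough:

```python
import urllib.parse

# Sketch of a Google Knowledge Graph Search API lookup. API_KEY and the
# machine ID (MID) below are placeholders; in practice the MID would come
# from the cloud_kg_mid column of the reconciliation output table.
API_KEY = "YOUR_API_KEY"
mid = "/m/0k8z"  # hypothetical example MID

params = urllib.parse.urlencode({
    "ids": mid,      # look up an entity by machine ID
    "key": API_KEY,
    "limit": 1,
})
url = f"https://kgsearch.googleapis.com/v1/entities:search?{params}"
print(url)

# To actually send the request (requires a valid API key):
#   import json, urllib.request
#   with urllib.request.urlopen(url) as resp:
#       print(json.dumps(json.load(resp), indent=2))
```

The response body is the JSON-LD entity list described above, using schema.org vocabulary.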
Quelle: Google Cloud Platform

7 reasons to join us at Azure Open Source Day

This post was coauthored by Katie Fritsch and ChatGPT.

Are you interested in learning more about Azure and open-source technologies? Do you want to learn about the latest AI capabilities on Azure and how Microsoft is leveraging open-source technologies to drive innovation? If so, you won't want to miss Azure Open Source Day on Tuesday, March 7, 2023, from 9:00 AM to 10:30 AM Pacific Time.

Azure Open Source Day is a great opportunity to learn more about Microsoft's role in the open-source community, its contributions, and vision. Microsoft has a long history of supporting and contributing to open-source projects, and it continues to be a leader in the community today. Learn how Microsoft is empowering developers to build innovative solutions using the best of cloud capabilities and open-source technologies.

Here are seven reasons why you should attend:

See app-building demos—Discover how to build intelligent applications that are fast, flexible, and scalable using containers, Azure Kubernetes Service (AKS), Azure managed databases, and Azure AI services. Azure provides a wide range of tools and services that can be used to build intelligent applications.
Learn from partners—See how to use the power of Azure to build intelligent apps fast and flexibly using the best of open-source technology. Hear about Microsoft and Nvidia’s partnership to allow developers to spin up a platform in a matter of minutes.
Discover new innovative technologies—Find out how to use Dapr—an open-source project developed by Microsoft and a growing community of contributors—to easily build, deploy, and scale microservices. Dapr helps you to focus on business logic while abstracting away the underlying infrastructure and platform.
Hear perspectives on open-source trends—Hear from Microsoft luminaries Brendan Burns and Sarah Novotny, and our partners (GitHub, HashiCorp, Redis, and Nvidia), about how open source can drive technological progress and foster collaboration between companies.
Get proven support—Get a first look at how Microsoft is committed to supporting its customers with their technology needs whatever they may be, including Web3 scenarios and use cases. Microsoft's Azure cloud platform, developer tools, and identity and security services can help customers build and run Web3 applications.
Learn how to protect your data—Protect your business assets by building on a highly secure cloud platform designed to meet your open-source security and compliance needs.
Ask the experts—Post your questions during the live chat Q&A. Azure Open Source Day features a live chat where attendees can ask the experts their questions and get detailed answers.

Learn more

Don't miss out on the opportunity to learn about the latest AI capabilities on Azure and how Microsoft is leveraging open-source technologies to drive innovation. Register for Azure Open Source Day today and join us Tuesday, March 7, 2023, from 9:00 AM to 10:30 AM Pacific Time.
Quelle: Azure

DDoS Mitigation with Microsoft Azure Front Door

This blog post was authored by Dave Burkhardt, Principal Product Manager, and co-authored by Harikrishnan M B, Program Manager, and Yun Zheng, Sr Program Manager.

Within the last few years, the complexity and size of distributed denial-of-service (DDoS) attacks have increased dramatically across the industry.

As we reported previously, TCP, UDP, and DNS-based attacks are still the most frequent, but layer 7/HTTP(S) based attacks have been breaking traffic records across the industry in 2022. As a recent example, we successfully mitigated an attack with over 60 billion malicious requests that were directed at a customer domain hosted on Azure Front Door (AFD).

Layer 7 attacks can affect any organization—from media and entertainment companies to financial institutions. Initially, attacks were unencrypted HTTP-based traffic (such as Slowloris and HTTP Flood), but the industry is now seeing an increase in weaponized botnet HTTPS-based attacks (like Mēris and Mirai).

Mitigation techniques utilizing Azure Front Door

Fortunately, there are battle-tested frameworks, services, and tools that organizations can use to mitigate a potential DDoS attack. Here are some initial steps to consider:

Content Delivery Networks (CDNs) such as AFD are architected to redistribute HTTP(S) DDoS traffic away from your origin systems in the event of an attack. As such, utilizing AFD’s 185+ edge POPs around the globe that leverage our massive private WAN will not only allow you to deliver your web applications and services faster to your users, but you will also be taking advantage of the AFD’s distributed systems to mitigate against layer 7 DDoS attacks. Additionally, layer 3, 4, and 7 DDoS protection is included with AFD, and WAF services are included at no extra charge with AFD Premium.
Front Door's caching capabilities can be used to protect backends from large traffic volumes generated by an attack. Cached resources will be returned from the Front Door edge nodes so they don't get forwarded to your origins. Even short cache expiry times (seconds or minutes) on dynamic responses can greatly reduce the load on your origin systems. You can also learn more about how AFD caching can protect you from DDoS attacks.
Leverage Azure Web Application Firewall (Azure WAF) integration with Azure Front Door to mitigate malicious activities, and prevent DDoS and bot attacks. Here are the key Azure WAF areas to explore before (ideally) or during a DDoS attack:

Enable rate limiting to cap the number of malicious requests that can be made over a certain time period.
Utilize the Microsoft Managed Default Rule Set for an easy way to deploy protection against a common set of security threats. Since these rulesets are managed by Microsoft and backed by the Microsoft Threat Intelligence team, the rules are updated as needed to protect against new attack signatures.
Enable the Bot Protection Ruleset to block known bad bots responsible for launching DDoS attacks. This ruleset includes malicious IPs sourced from the Microsoft Threat Intelligence feed and is updated frequently to reflect the latest intel from the immense Microsoft Security and Research organization.
Create Custom WAF rules to automatically block conditions that are specific to your organization.
Utilize our machine learning-based anomaly detection to automatically block malicious traffic spikes using Azure WAF integrated with Azure Front Door.
Enable Geo-filtering to block traffic from a defined geographic region, or block IP addresses and ranges that you identify as malicious.
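The rate-limiting and custom-rule settings above are defined on the WAF policy itself. As an illustrative sketch (not taken from this article), a rate-limit custom rule inside the `customRules` section of a Front Door WAF policy resource might look roughly like the following; the rule name, threshold, and match condition are placeholder values you would tune for your own traffic:

```json
{
  "customRules": {
    "rules": [
      {
        "name": "RateLimitSuspiciousClients",
        "priority": 1,
        "enabledState": "Enabled",
        "ruleType": "RateLimitRule",
        "rateLimitDurationInMinutes": 1,
        "rateLimitThreshold": 1000,
        "matchConditions": [
          {
            "matchVariable": "RequestUri",
            "operator": "Contains",
            "matchValue": [ "/" ]
          }
        ],
        "action": "Block"
      }
    ]
  }
}
```

With a rule like this, any client exceeding roughly 1,000 matching requests per minute is blocked for the remainder of the window, while legitimate traffic below the threshold passes through untouched.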

Determine all of your attack vectors. In this article, we mainly talked about layer 7 DDoS aspects and how Azure WAF and AFD caching capabilities can help prevent those attacks. The good news is AFD will protect your origins from layer 3 and 4 attacks if you have these origins configured to only receive traffic from AFD. This layer 3 and 4 protection is included with AFD and is a managed service provided by Microsoft—meaning, this service is turned on by default and is continuously optimized and updated by the Azure engineering team. That said, if you have internet-facing Azure resources that don’t utilize AFD, we strongly recommend you consider leveraging Microsoft’s Azure DDoS Protection product. Doing so will allow customers to receive additional benefits including cost protection, an SLA guarantee, and access to experts from the DDoS Rapid Response Team for immediate help during an attack.
Fortify your origins hosted in Azure by only allowing them to connect to AFD via Private Link. When Private Link is utilized, traffic between Azure Front Door and your application servers is delivered over a private network connection, so exposing your origins to the public internet is no longer necessary. If you do not utilize Private Link, origins connected over public IPs could be exposed to DDoS attacks, and our recommendation is to enable Azure DDoS Protection (Network or IP SKUs).
Monitor traffic patterns: Regularly monitoring traffic patterns can help identify unusual spikes in traffic, which could indicate a DDoS attack. As such, set up the following alerting to advise your organization of anomalies:

Configure Resource Health alerts within AFD.
Utilize Azure Monitor Alerts to contact you for anomalies noted within Azure WAF/AFD logs.
Consider leveraging Microsoft Sentinel, recognized as a Leader in the 2022 Gartner Magic Quadrant for SIEM, to help you collect, detect, investigate, and respond to anomalies.

Create playbooks to document how you will respond to a DDoS attack and other cybersecurity incidents.
Run fire drills to determine potential gaps and fine-tune.

Learn more about AFD

Information about the Azure Front Door Content Delivery Network.
Read LinkedIn’s case study on how they utilized AFD to scale and reduce latency at a global level.
Check out the Quickstart guide on how to set up high availability with Azure Front Door.
See the Azure Front Door pricing page.

Quelle: Azure