How reCAPTCHA Enterprise protected customers during the holidays

Every business had to adapt to a new reality in 2020 and make online business their primary channel. But as online business increased, so did web-based attacks. In research commissioned by Forrester Consulting, 84% of companies have seen an increase in bot attacks, 71% of organizations have seen an increase in the number of successful attacks, and 65% of businesses have experienced more frequent attacks and greater revenue loss due to bot attacks. With online fraud expected to only increase, the security of web pages has never been more important.

Online fraud and abuse impacts various industries differently, ranging from inventory problems to account access difficulties. Attack methods also vary; some businesses have to deal with frequent credential stuffing or payment fraud attacks, and some are more subject to account takeovers or spam logins. Credential stuffing is one of the most common attacks our customers face, due to a spike in the availability of usernames and passwords from a wide range of successful breaches, and the ease of scripting these kinds of attacks. Account takeovers are another common attack type, as billions of account records have been leaked over the last several years from breaches, and these credentials have been posted and sold on the dark web. While the attacks are varied, they all share the same end result: damage to your business, customers, and bottom line.

Successful online businesses require successful online security

The more digital an organization becomes, the more its success is tied to its ability to understand and manage online attacks. And though the 2020 holiday season unleashed more online attacks than ever before, customers using reCAPTCHA Enterprise were prepared. Any organization that conducts business online can be susceptible to online fraud. But this susceptibility can be mitigated by reCAPTCHA Enterprise, which is particularly helpful for businesses in the retail, gaming, media, entertainment, software, and internet industries. reCAPTCHA Enterprise customers create, sell, offer, or manage everything from smart home devices, to office supplies, to software, online marketplaces, social media, and streaming services. And all of them face a myriad of automated attacks that, unless properly defended against, could weaken their businesses.

For example, retailers need protection from bots putting inventory in their shopping carts, thereby decreasing the amount of inventory available to legitimate customers. They are sometimes faced with malicious attempts to identify missing start/expiry dates and security codes for stolen payment card data, by bots that test different values and personal information at checkout. Gaming, media, and entertainment customers are challenged by bad actors trying to log in to a legitimate customer's account with stolen credentials. Event companies deal with automated scalping, with bots buying up tickets and then reselling them later at a profit. And many vendors are challenged by repeated attempts to use a coupon number, voucher code, or discount token on web pages during payment.

Halting 2020 holiday hacks

The most common attacks our customers experienced this holiday season were credential stuffing, followed by scraping, card fraud, and account takeovers.

In a credential stuffing attack, bots test lists of stolen credentials against an application's authentication mechanisms to identify whether users have reused the same login credentials.
The stolen usernames (often email addresses) and password pairs could have been sourced directly from another application by the attacker, purchased in a criminal marketplace, or obtained from publicly available breach data dumps. reCAPTCHA Enterprise detects and stops credential stuffing attacks by recognizing bot behavior and introducing friction into the bot's attempt at an attack, alerting you that an attack is taking place and implementing a response like two-factor authentication to defeat the attempt while letting valid users through the website.

In a scraping attack, large volumes of data are extracted from web pages and applications. Scraping can be used to collect personal data from social media accounts, which malicious actors use to create applications for loans, credit cards, or other forms of identification. Scraping can also be used to collect legitimate information about products or services, then create fake products and services and trick buyers into purchasing them. reCAPTCHA Enterprise uses an adaptive risk analysis engine to keep malicious software from engaging in abusive activities on your site.

Another type of fraud that has been prominent in the last year is card cracking. Fraudsters often use automated tools to verify stolen credit cards before they're sold or used. reCAPTCHA uses machine learning models that analyze site-specific behavior to recognize patterns of legitimate and fraudulent transactions and detect this type of abuse. reCAPTCHA Enterprise returns a score based on interactions with your websites, with 1.0 being a likely good interaction and 0.0 being a likely abusive action. This can reduce the transaction costs of such abuse and prevent larger-scale attacks resulting from the use of stolen payment mechanisms.

Sometimes, a bad actor will use a stolen or leaked credential to log in and access a legitimate user's account, in an attack called an account takeover. Account takeovers are typically followed by the attacker transferring money, buying a gift card, or making purchases with the user's account. The reCAPTCHA Enterprise API risk score gives you the granularity and flexibility to protect your web pages in the way that makes the most sense for your business; you can decide which action to take based on that score. There's no one-size-fits-all approach to managing risk, so you should have different levels of protection for different web pages. A suspected fraudulent request on a login page could force a two-factor authentication challenge, while you could simply block the request on a less valuable web page.

reCAPTCHA Enterprise is built to help mitigate fraudulent online activity for your enterprise, with technology that has helped defend millions of websites for over a decade. The number and types of attacks your business will experience will only increase over time, so it's important to remember that the success of your business depends on how well you can protect against these attacks. To protect your business from online fraud and abuse, get started with reCAPTCHA Enterprise today.
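
To make the score-based approach concrete, here is a minimal sketch of how a backend might turn a score into a per-page decision. This is not the reCAPTCHA Enterprise API itself; the thresholds and helper names are purely illustrative and would need to be tuned for your own traffic.

package main

import "fmt"

// Action represents how a site might respond to a risk score.
// The thresholds used below are hypothetical, not product defaults.
type Action string

const (
    Allow     Action = "allow"
    Challenge Action = "require two-factor authentication"
    Block     Action = "block"
)

// decideAction maps a score (1.0 = likely legitimate, 0.0 = likely abusive)
// to a response, with stricter handling for high-value pages such as login.
func decideAction(score float64, highValuePage bool) Action {
    switch {
    case score >= 0.7:
        return Allow
    case highValuePage:
        // Suspicious traffic on a login or checkout page: add friction
        // rather than rejecting outright, so real users can still get in.
        return Challenge
    default:
        return Block
    }
}

func main() {
    fmt.Println(decideAction(0.9, true))  // allow
    fmt.Println(decideAction(0.3, true))  // require two-factor authentication
    fmt.Println(decideAction(0.3, false)) // block
}
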
Source: Google Cloud Platform

Compiling Containers – Dockerfiles, LLVM and BuildKit

Today we’re featuring a blog from Adam Gordon Bell at Earthly who writes about how BuildKit, a technology developed by Docker and the community, works and how to write a simple frontend. Earthly uses BuildKit in their product.

Introduction

How are containers made? Usually, from a series of statements like `RUN`, `FROM`, and `COPY`, which are put into a Dockerfile and built.  But how are those commands turned into a container image and then a running container?  We can build up an intuition for how this works by understanding the phases involved and creating a container image ourselves. We will create an image programmatically and then develop a trivial syntactic frontend and use it to build an image.

On `docker build`

We can create container images in several ways. We can use Buildpacks, or build tools like Bazel or sbt, but by far the most common way images are built is using `docker build` with a Dockerfile. The familiar base images Alpine, Ubuntu, and Debian are all created this way.

Here is an example Dockerfile:

FROM alpine
COPY README.md README.md
RUN echo "standard docker build" > /built.txt

We will be using variations on this Dockerfile throughout this tutorial. 

We can build it like this:

docker build . -t test

But what is happening when you call `docker build`? To understand that, we will need a little background.

Background

 A docker image is made up of layers. Those layers form an immutable filesystem.  A container image also has some descriptive data, such as the start-up command, the ports to expose, and volumes to mount. When you `docker run` an image, it starts up inside a container runtime.

 I like to think about images and containers by analogy. If an image is like an executable, then a container is like a process. You can run multiple containers from one image, and a running image isn’t an image at all but a container.

Continuing our analogy, BuildKit is a compiler, just like LLVM.  But whereas a compiler takes source code and libraries and produces an executable, BuildKit takes a Dockerfile and a file path and creates a container image.

Docker build uses BuildKit to turn a Dockerfile into a docker image, OCI image, or another image format. In this walk-through, we will primarily use BuildKit directly.

This primer on using BuildKit supplies some helpful background on using BuildKit, `buildkitd`, and `buildctl` via the command-line. However, the only prerequisite for today is running `brew install buildkit` or the appropriate OS equivalent steps.

How Do Compilers Work?

A traditional compiler takes code in a high-level language and lowers it to a lower-level language. In most conventional ahead-of-time compilers, the final target is machine code. Machine code is a low-level programming language that your CPU understands.

Fun Fact: Machine Code VS. Assembly

Machine code is written in binary. This makes it hard for a human to understand.  Assembly code is a plain-text representation of machine code that is designed to be somewhat human-readable. There is generally a 1-1 mapping between the instructions the machine understands (in machine code) and the opcodes in assembly.

Compiling the classic C “Hello, World” into x86 assembly code using the Clang frontend for LLVM looks like this:
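
In command-line terms, that flow is roughly the following. This is a sketch assuming a standard clang installation and a hello.c file in the working directory:

# Lower the C source to human-readable x86 assembly with the Clang frontend
clang -S hello.c -o hello.s

# Assemble and link that assembly into a machine-code executable, then run it
clang hello.s -o hello
./hello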

Creating an image from a Dockerfile works in a similar way:

BuildKit is passed the Dockerfile and the build context, which is the present working directory in the above diagram. In simplified terms, each line in the dockerfile is turned into a layer in the resulting image.  One significant way image building differs from compiling is this build context.  A compiler’s input is limited to source code, whereas `docker build` takes a reference to the host filesystem as an input and uses it to perform actions such as `COPY`.
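
One way to see this line-to-layer correspondence, assuming the `test` tag from the earlier `docker build`, is to list the image's layers; docker attributes each layer to the instruction that created it:

docker history test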

There Is a Catch

The earlier diagram of compiling “Hello, World” in a single step missed a vital detail. Computer hardware is not a singular thing. If every compiler were a hand-coded mapping from a high-level language to x86 machine code, then moving to the Apple M1 processor would be quite challenging because it has a different instruction set.  

Compiler authors have overcome this challenge by splitting compilation into phases.  The traditional phases are the frontend, the backend, and the middle. The middle phase is sometimes called the optimizer, and it deals primarily with an internal representation (IR).

This staged approach means you don’t need a new compiler for each new machine architecture. Instead, you just need a new backend. Here is an example of what that looks like in LLVM:

Intermediate Representations

This multiple backend approach allows LLVM to target ARM, X86, and many other machine architectures using LLVM Intermediate Representation (IR) as a standard protocol.  LLVM IR is a human-readable programming language that backends need to be able to take as input. To create a new backend, you need to write a translator from LLVM IR to your target machine code. That translation is the primary job of each backend.

Once you have this IR, you have a protocol that various phases of the compiler can use as an interface, and you can build not just many backends but many frontends as well. LLVM has frontends for numerous languages, including C++, Julia, Objective-C, Rust, and Swift.  

If you can write a translation from your language to LLVM IR, LLVM can translate that IR into machine code for all the backends it supports. This translation function is the primary job of a compiler frontend.

In practice, there is much more to it than that. Frontends need to tokenize and parse input files, and they need to return pleasant errors. Backends often have target-specific optimizations to perform and heuristics to apply. But for this tutorial, the critical point is that having a standard representation ends up being a bridge that connects many front ends with many backends. This shared interface removes the need to create a compiler for every combination of language and machine architecture. It is a simple but very empowering trick!

BuildKit

Images, unlike executables, have their own isolated filesystem. Nevertheless, the task of building an image looks very similar to compiling an executable. They can have varying syntax (dockerfile1.0, dockerfile1.2), and the result must target several machine architectures (arm64 vs. x86_64).

“LLB is to Dockerfile what LLVM IR is to C” – BuildKit Readme

This similarity was not lost on the BuildKit creators.  BuildKit has its own intermediate representation, LLB.  And where LLVM IR has things like function calls and garbage-collection strategies, LLB has mounting filesystems and executing statements.

LLB is defined as a protocol buffer, and this means that BuildKit frontends can make gRPC requests against buildkitd to build a container directly.

Programmatically Making An Image

Alright, enough background.  Let’s programmatically generate the LLB for an image and then build an image.  

Using Go

In this example, we will be using Go which lets us leverage existing BuildKit libraries, but it’s possible to accomplish this in any language with Protocol Buffer support.

Import LLB definitions:

import (
    "context"
    "os"

    "github.com/moby/buildkit/client/llb"
)

Create LLB for an Alpine image:

func createLLBState() llb.State {
    return llb.Image("docker.io/library/alpine").
        File(llb.Copy(llb.Local("context"), "README.md", "README.md")).
        Run(llb.Args([]string{"/bin/sh", "-c", "echo \"programmatically built\" > /built.txt"})).
        Root()
}

We are accomplishing the equivalent of a `FROM` by using `llb.Image`. Then, we copy a file from the local file system into the image using `File` and `Copy`.  Finally, we `RUN` a command to echo some text to a file.  LLB has many more operations, but you can recreate many standard images with these three building blocks.

The final thing we need to do is turn this into a protocol buffer and emit it to standard out:

func main() {
    dt, err := createLLBState().Marshal(context.TODO(), llb.LinuxAmd64)
    if err != nil {
        panic(err)
    }
    llb.WriteTo(dt, os.Stdout)
}

Let's look at what this generates using the `dump-llb` option of buildctl:

go run ./writellb/writellb.go |
buildctl debug dump-llb |
jq .

We get this JSON formatted LLB:

{
  "Op": {
    "Op": {
      "source": {
        "identifier": "local://context",
        "attrs": {
          "local.unique": "s43w96rwjsm9tf1zlxvn6nezg"
        }
      }
    },
    "constraints": {}
  },
  "Digest": "sha256:c3ca71edeaa161bafed7f3dbdeeab9a5ab34587f569fd71c0a89b4d1e40d77f6",
  "OpMetadata": {
    "caps": {
      "source.local": true,
      "source.local.unique": true
    }
  }
}
{
  "Op": {
    "Op": {
      "source": {
        "identifier": "docker-image://docker.io/library/alpine:latest"
      }
    },
    "platform": {
      "Architecture": "amd64",
      "OS": "linux"
    },
    "constraints": {}
  },
  "Digest": "sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7",
  "OpMetadata": {
    "caps": {
      "source.image": true
    }
  }
}
{
  "Op": {
    "inputs": [
      {
        "digest": "sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7",
        "index": 0
      },
      {
        "digest": "sha256:c3ca71edeaa161bafed7f3dbdeeab9a5ab34587f569fd71c0a89b4d1e40d77f6",
        "index": 0
      }
    ],
    "Op": {
      "file": {
        "actions": [
          {
            "input": 0,
            "secondaryInput": 1,
            "output": 0,
            "Action": {
              "copy": {
                "src": "/README.md",
                "dest": "/README.md",
                "mode": -1,
                "timestamp": -1
              }
            }
          }
        ]
      }
    },
    "platform": {
      "Architecture": "amd64",
      "OS": "linux"
    },
    "constraints": {}
  },
  "Digest": "sha256:ba425dda86f06cf10ee66d85beda9d500adcce2336b047e072c1f0d403334cf6",
  "OpMetadata": {
    "caps": {
      "file.base": true
    }
  }
}
{
  "Op": {
    "inputs": [
      {
        "digest": "sha256:ba425dda86f06cf10ee66d85beda9d500adcce2336b047e072c1f0d403334cf6",
        "index": 0
      }
    ],
    "Op": {
      "exec": {
        "meta": {
          "args": [
            "/bin/sh",
            "-c",
            "echo \"programmatically built\" > /built.txt"
          ],
          "cwd": "/"
        },
        "mounts": [
          {
            "input": 0,
            "dest": "/",
            "output": 0
          }
        ]
      }
    },
    "platform": {
      "Architecture": "amd64",
      "OS": "linux"
    },
    "constraints": {}
  },
  "Digest": "sha256:d2d18486652288fdb3516460bd6d1c2a90103d93d507a9b63ddd4a846a0fca2b",
  "OpMetadata": {
    "caps": {
      "exec.meta.base": true,
      "exec.mount.bind": true
    }
  }
}
{
  "Op": {
    "inputs": [
      {
        "digest": "sha256:d2d18486652288fdb3516460bd6d1c2a90103d93d507a9b63ddd4a846a0fca2b",
        "index": 0
      }
    ],
    "Op": null
  },
  "Digest": "sha256:fda9d405d3c557e2bd79413628a435da0000e75b9305e52789dd71001a91c704",
  "OpMetadata": {
    "caps": {
      "constraints": true,
      "platform": true
    }
  }
}

Looking through the output, we can see how our code maps to LLB.

Here is our `Copy` as part of a FileOp:

"Action": {
  "copy": {
    "src": "/README.md",
    "dest": "/README.md",
    "mode": -1,
    "timestamp": -1
  }
}

Here is mapping our build context for use in our `COPY` command:

"Op": {
  "source": {
    "identifier": "local://context",
    "attrs": {
      "local.unique": "s43w96rwjsm9tf1zlxvn6nezg"
    }
  }
}

Similarly, the output contains LLB that corresponds to our  `RUN` and `FROM` commands. 

Building Our LLB

To build our image, we must first start `buildkitd`:

docker run --rm --privileged -d --name buildkit moby/buildkit
export BUILDKIT_HOST=docker-container://buildkit

Then we can run our program and pipe its output into `buildctl build`:

go run ./writellb/writellb.go |
buildctl build \
  --local context=. \
  --output type=image,name=docker.io/agbell/test,push=true

The output flag lets us specify what backend we want BuildKit to use.  We will ask it to build an OCI image and push it to docker.io. 
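
If we didn't want to push anywhere, a small variation of the same command writes an OCI tarball to disk instead; `type=oci` with a `dest` path is one of buildctl's documented outputs, though the exact options available depend on your BuildKit version:

go run ./writellb/writellb.go |
buildctl build \
  --local context=. \
  --output type=oci,dest=image.tar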

Real-World Usage

In a real-world tool, we might want to programmatically make sure `buildkitd` is running and send the RPC request directly to it, as well as provide friendly error messages. For tutorial purposes, we will skip all that.
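
As a rough sketch of what that direct request could look like, here is a version of our main function that reuses `createLLBState` and talks to buildkitd through the `github.com/moby/buildkit/client` package. Treat it as an outline rather than a drop-in implementation: field names such as `LocalDirs` and the export attributes have shifted between BuildKit versions.

package main

import (
    "context"
    "os"

    "github.com/moby/buildkit/client"
    "github.com/moby/buildkit/client/llb"

    // Registers support for docker-container:// addresses like the
    // BUILDKIT_HOST we exported earlier.
    _ "github.com/moby/buildkit/client/connhelper/dockercontainer"
)

func main() {
    ctx := context.Background()

    // Connect to the running buildkitd.
    c, err := client.New(ctx, os.Getenv("BUILDKIT_HOST"))
    if err != nil {
        panic(err)
    }

    // Marshal the same LLB state we generated before.
    def, err := createLLBState().Marshal(ctx, llb.LinuxAmd64)
    if err != nil {
        panic(err)
    }

    // Drain progress events; a real tool would render them for the user.
    ch := make(chan *client.SolveStatus)
    go func() {
        for range ch {
        }
    }()

    // Ask buildkitd to solve the LLB and export the result as an image.
    _, err = c.Solve(ctx, def, client.SolveOpt{
        LocalDirs: map[string]string{"context": "."},
        Exports: []client.ExportEntry{
            {
                Type:  "image",
                Attrs: map[string]string{"name": "docker.io/agbell/test", "push": "true"},
            },
        },
    }, ch)
    if err != nil {
        panic(err)
    }
}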

We can run it like this:

> docker run -it --pull always agbell/test:latest /bin/sh

And we can then see the results of our programmatic `COPY` and `RUN` commands:

/ # cat built.txt
programmatically built
/ # ls README.md
README.md

There we go! The full code example can be a great starting place for your own programmatic docker image building.

A True Frontend for BuildKit

A true compiler front end does more than just emit hardcoded IR.  A proper frontend takes in files, tokenizes them, parses them, generates a syntax tree, and then lowers that syntax tree into the internal representation. Mockerfiles are an example of such a frontend:

#syntax=r2d4/mocker
apiVersion: v1alpha1
images:
- name: demo
  from: ubuntu:16.04
  package:
    install:
    - curl
    - git
    - gcc

And because Docker build supports the `#syntax` command, we can even build a Mockerfile directly with `docker build`:

docker build -f mockerfile.yaml .

To support the `#syntax` command, all that is needed is to put the frontend in a docker image that accepts a gRPC request in the correct format and publish that image somewhere. At that point, anyone can use your frontend with `docker build` by just adding `#syntax=yourimagename`.

Building Our Own Example Frontend for `docker build`

Building a tokenizer and a parser as a gRPC service is beyond the scope of this article. But we can get our feet wet by extracting and modifying an existing frontend. The standard dockerfile frontend is easy to disentangle from the moby project. I’ve pulled the relevant parts out into a stand-alone repo. Let’s make some trivial modifications to it and test it out.

So far, we've only used the docker commands `FROM`, `RUN`, and `COPY`. At a surface level, with its capitalized commands, Dockerfile syntax looks a lot like the programming language INTERCAL. Let's change these commands to their INTERCAL equivalents and develop our own Ickfile format.

Dockerfile    Ickfile
FROM          COME FROM
RUN           PLEASE
COPY          STASH

The modules in the dockerfile frontend split the parsing of the input file into several discrete steps, with execution flowing this way:

For this tutorial, we are only going to make trivial changes to the frontend.  We will leave all the stages intact and focus on customizing the existing commands to our tastes.  To do this, all we need to do is change `command.go`:

package command

// Define constants for the command strings
const (
    Copy = "stash"
    Run  = "please"
    From = "come_from"
)

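With those changes published as a frontend image, an Ickfile using the new keywords might look something like the sketch below. It is reconstructed from the keyword constants above, so the original post's exact file may differ slightly; building it with an ordinary `docker build -f ickfile .` works because of the `#syntax` line.

#syntax=agbell/ick
COME_FROM alpine
STASH README.md README.md
PLEASE echo "custom frontend built" > /built.txt
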
And we can then see the results of our `STASH` and `PLEASE` commands:

/ # cat built.txt
custom frontend built
/ # ls README.md
README.md

I've pushed this image to Docker Hub. Anyone can start building images using our `ickfile` format by adding `#syntax=agbell/ick` to an existing Dockerfile. No manual installation is required!

Enabling BuildKit

BuildKit is enabled by default in Docker Desktop. It is not enabled by default in the current version of Docker for Linux (`version 20.10.5`). To instruct `docker build` to use BuildKit, set the environment variable `DOCKER_BUILDKIT=1` or change the Engine config.
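
For example, to enable it for a single build without changing the daemon configuration:

DOCKER_BUILDKIT=1 docker build .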

Conclusion

We have learned that a three-phased structure borrowed from compilers powers building images, and that an intermediate representation called LLB is the key to that structure. Empowered by that knowledge, we have produced two frontends for building images.

This deep dive on frontends still leaves much to explore. If you want to learn more, I suggest looking into BuildKit workers. Workers do the actual building and are the secret behind `docker buildx` and multi-architecture builds. `docker build` also has support for remote workers and cache mounts, both of which can lead to faster builds.

Earthly uses BuildKit internally for its repeatable build syntax. Without it, our containerized Makefile-like syntax would not be possible. If you want a saner CI process, then you should check it out.

There is also much more to explore about how modern compilers work. Modern compilers often have many stages and more than one intermediate representation, and they are often able to do very sophisticated optimizations.
Source: https://blog.docker.com/feed/

Spring forward with BigQuery user-friendly SQL

Spring is here. Clocks move forward. The Sakura (cherry blossom) festival in Japan marks the celebration of the new season. In India, the Holi festival of colors ushers in the new harvest season. It's a time for renewal and new ways of doing things. This month, we are pleased to debut our newest set of SQL features in BigQuery to help our analysts and data engineers spring forward. It's time to set aside the old ways of doing things and instead look at these new ways of storing and analyzing all your data using BigQuery SQL.

Bigger data

Higher precision and more flexible functions to manage your ever-expanding data in BigQuery

BIGNUMERIC data type (GA)

We live in an era where intelligent devices and systems ranging from driverless vehicles to global stock and currency trading systems to high-speed 5G networks are driving nearly all aspects of modern life. These systems rely on large amounts of precision data to perform real-time analysis. To support these analytics, BigQuery is pleased to announce the general availability of the BIGNUMERIC data type, which supports 76 digits of precision and 38 digits of scale. Similar to NUMERIC, this new data type is available in all aspects of BigQuery from clustering to BI Engine, and is also supported in the JDBC/ODBC drivers and client libraries. The original example demonstrates the additional precision and scale using BIGNUMERIC applied to the various powers of e, Euler's number and the base of natural logarithms. Documentation

As an aside, did you know that the world record, as of December 5, 2020, for the maximum number of digits used to represent e stands at 10π trillion digits?

JSON extraction functions (GA)

As customers analyze different types of data, both structured and semi-structured, within BigQuery, JavaScript Object Notation (JSON) has emerged as the de facto standard for semi-structured data. JSON provides the flexibility of storing schemaless data in tables without requiring the specification of data types with associated precision for columns. As new elements are added, the JSON document can be extended to add new key-value pairs without requiring schema changes.

BigQuery has long supported JSON data and JSON functions to query and transform JSON data before they became a part of the ANSI SQL standard in 2016. JSON extraction functions typically take two parameters: a JSON field, which contains the JSON document, and a JSONPath, which points to the specific element or array of elements that need to be extracted. If the JSONPath references an element or elements containing reserved characters, such as dot (.), dollar ($) or star (*) characters, they need to be escaped so that they can be treated as strings instead of being interpreted as JSONPath expressions. To support escaping, BigQuery supports two types of JSON extraction functions: Standard and Legacy. The Standard (ANSI-compliant and recommended) way of escaping these reserved characters is by enclosing them in double quotes (" "). The Legacy (pre-ANSI) way is to enclose them in square brackets and single quotes ([' ']). Here's a quick summary of existing and new JSON extraction functions: Documentation

TABLESAMPLE clause (preview)

With the convergence and growth of all types of data within BigQuery, customers want to maintain control over query costs, especially when analysts and data scientists are performing ad hoc analysis of data in large tables.
We are pleased to introduce the TABLESAMPLE clause in queries, which allows users to sample a subset of the data, specified as a percentage of a table, instead of querying the entire data from large tables. This SQL clause can sample data from native BigQuery tables or external tables stored in buckets in Google Cloud Storage, by randomly selecting a percentage of data blocks from the table and reading all of the rows in the selected blocks, lowering query costs for ad hoc queries. (A short sketch of this clause appears at the end of this post.) Documentation

Agile schema

More commands and capabilities in SQL to allow you to evolve your data as your analytics needs change.

Dataset (SCHEMA) operations (GA)

In BigQuery, a dataset is the top-level container entity that contains the data and program objects, such as tables, views, and procedures. Creating, maintaining, and dropping these datasets have so far been supported in BigQuery using the API, the CLI, and the UI. Today, we're pleased to offer full SQL support (CREATE, ALTER and DROP) for dataset operations using SCHEMA, the ANSI standard keyword for the collection of logical objects in a database or a data warehouse. These operations greatly simplify data administrators' ability to provision and manage schemas across their BigQuery projects. Documentation for CREATE, ALTER and DROP SCHEMA syntax

Object creation DDL from INFORMATION_SCHEMA (preview)

Data administrators provision empty copies of production datasets to allow loading of fictitious data so that developers can test out new capabilities before they are added to production datasets; new hires can train themselves on production-like datasets with test data. To help data administrators generate the data definition language (DDL) for objects, the TABLES view in INFORMATION_SCHEMA in BigQuery now has a new column called DDL, which contains the exact object creation DDL for every table, view, and materialized view within the dataset. In combination with dynamic SQL, data administrators can quickly generate and execute the creation DDL commands for a specific object, all objects of a particular type (e.g. MATERIALIZED VIEW), or all data objects within a specified dataset with a single SQL statement, without having to manually reconstruct all options and elements associated with the schema object(s). Documentation

DROP COLUMN support (preview)

In October 2020, BigQuery introduced ADD COLUMN support to allow users to add columns to existing tables using SQL. As data engineers and analysts expand their tables to support new data, some columns may become obsolete and need to be removed from the tables. BigQuery now supports the DROP COLUMN clause as a part of the ALTER TABLE command to allow users to remove one or more of these columns. During the preview period, note that certain restrictions on DROP COLUMN operations will remain in effect. See Documentation for more details.

Longer column names (GA)

BigQuery now allows you to have longer column names, up to 300 characters, within tables, views, and materialized views, instead of the previous limit of 128 characters. Documentation

Storage insights

Storage usage analysis for partitioned and unpartitioned tables

INFORMATION_SCHEMA.PARTITIONS view for tables (preview)

Customers store their analytical data in tables within BigQuery and use the flexible partitioning schemes on large tables in BigQuery to organize their data for improved query efficiency.
To provide data engineers with better insight into storage and the record count for tables, both partitioned and unpartitioned, we are pleased to introduce the PARTITIONS view as a part of BigQuery INFORMATION_SCHEMA. This view provides up-to-date information on tables or partitions of a table, such as the size of the table (logical and billable bytes), the number of rows, the last time the table (or partition) was updated, and whether the specific table (or partition) is active or has aged out into cheaper long-term storage. Partition entries for tables are identified by their PARTITION_ID, while unpartitioned tables have a single NULL entry for PARTITION_ID.

Querying INFORMATION_SCHEMA views is more cost-efficient compared to querying base tables. Thus, the PARTITIONS view can be used in conjunction with queries to filter the query to specific partitions, e.g. finding data in the most recently updated partition or the maximum value of a partition key, as in the sketch at the end of this post. Documentation

We hope these new capabilities put a spring in the step of our BigQuery users as we continue to work hard to bring you more user-friendly SQL. To learn more about BigQuery, visit our website, and get started immediately with the free BigQuery Sandbox.

Related article: "BigQuery explained: Querying your data," which covers how to query datasets in BigQuery using SQL, save and share queries, and create views and materialized views.
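
As promised above, here is a minimal sketch of a few of these features in use. The dataset, table, and column names (mydataset.sales, order_id, legacy_notes) are hypothetical placeholders, and the PARTITIONS view column names should be checked against the current documentation before relying on them; the examples in the original post were images, so this is an illustration rather than a copy of them.

-- TABLESAMPLE: scan roughly 10 percent of the table's data blocks.
SELECT order_id
FROM mydataset.sales TABLESAMPLE SYSTEM (10 PERCENT);

-- DROP COLUMN: remove an obsolete column from an existing table.
ALTER TABLE mydataset.sales DROP COLUMN legacy_notes;

-- PARTITIONS view: per-partition storage and row-count insight.
SELECT partition_id, total_rows, total_logical_bytes, last_modified_time
FROM mydataset.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = 'sales'
ORDER BY last_modified_time DESC;
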
Source: Google Cloud Platform