How to build demand forecasting models with BigQuery ML

Retail businesses have a “goldilocks” problem when it comes to inventory: don’t stock too much, but don’t stock too little. With potentially millions of products, producing millions of forecasts is one challenge for a data science and engineering team; procuring and managing the infrastructure for continuous model training and forecasting is another, and it can quickly become overwhelming, especially for large businesses.

With BigQuery ML, you can train and deploy machine learning models using SQL. Combined with the fully managed, scalable infrastructure of BigQuery, this reduces complexity while accelerating time to production, so you can spend more time using the forecasts to improve your business. So how can you build demand forecasting models at scale with BigQuery ML for thousands to millions of products?

In this blogpost, I’ll show you how to build a time series model to forecast the demand of multiple products using BigQuery ML. Using Iowa Liquor Sales data, I’ll use 18 months of historical transactional data to forecast the next 30 days.

You’ll learn how to:
- pre-process data into the correct format needed to create a demand forecasting model using BigQuery ML
- train an ARIMA-based time-series model in BigQuery ML
- evaluate the model
- predict the future demand of each product over the next n days
- take action on the forecasted predictions:
  - create a dashboard to visualize the forecasted demand using Data Studio
  - set up scheduled queries to automatically re-train the model on a regular basis

The data: Iowa Liquor Sales

The Iowa Liquor Sales data, which is hosted publicly on BigQuery, is a dataset that “contains the spirits purchase information of Iowa Class “E” liquor licensees by product and date of purchase from January 1, 2012 to current” (from the official documentation by the State of Iowa). Because there may be multiple orders of the same product on any given date, the raw data needs one pre-processing step: calculate the total number of products sold, grouped by the date and the product.

Cleaned training data

In the cleaned training data, we now have one row per date per item_name, with the total amount sold on that day. This can be stored as a table or view. In this example, it is stored as bqmlforecast.training_data using CREATE TABLE, as sketched below.
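As a minimal sketch of that pre-processing step: the query below aggregates daily sales per product into a training table. The source table and column names are assumptions based on the public `bigquery-public-data.iowa_liquor_sales.sales` dataset, and the date filter is illustrative rather than the exact query from the original post.

```sql
-- Sketch only: build the cleaned training table (one row per date per item).
-- Table and column names are assumptions based on the public dataset.
CREATE OR REPLACE TABLE bqmlforecast.training_data AS
SELECT
  date,
  item_description AS item_name,
  SUM(bottles_sold) AS total_amount_sold
FROM
  `bigquery-public-data.iowa_liquor_sales.sales`
WHERE
  -- roughly 18 months of history, as described above
  date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 18 MONTH) AND CURRENT_DATE()
GROUP BY
  date, item_name;
```

Each row of the resulting table is one (date, item_name) pair with its daily total, which is the shape the time series model expects.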
Train the time series model using BigQuery ML

Training the time-series model is straightforward. But how does time-series modeling work in BigQuery ML? When you train a time series model with BigQuery ML, multiple models and components are used in the model creation pipeline. ARIMA is one of the core algorithms. Other components are also used, listed roughly in the order in which they are run:

- Pre-processing: automatic cleaning adjustments to the input time series, including handling missing values, duplicated timestamps, and spike anomalies, and accounting for abrupt level changes in the time series history.
- Holiday effects: time series modeling in BigQuery ML can also account for holiday effects. By default, holiday effects modeling is disabled. But since this data is from the United States, and the data includes a minimum of one year of daily data, you can also specify an optional HOLIDAY_REGION. With holiday effects enabled, spike and dip anomalies that appear during holidays will no longer be treated as anomalies. A full list of the holiday regions can be found in the HOLIDAY_REGION documentation.
- Seasonal and trend decomposition using the Seasonal and Trend decomposition using Loess (STL) algorithm.
- Seasonality extrapolation using the double exponential smoothing (ETS) algorithm.
- Trend modeling using the ARIMA model and the auto.ARIMA algorithm for automatic hyper-parameter tuning. In auto.ARIMA, dozens of candidate models are trained and evaluated in parallel, covering different values of p, d, q and drift. The best model is the one with the lowest Akaike information criterion (AIC).

Forecasting multiple products in parallel with BigQuery ML

You can train a time series model to forecast a single product, or forecast multiple products at the same time (which is really convenient if you have thousands or millions of products to forecast). To forecast multiple products at the same time, different pipelines are run in parallel. In this example, since you are training the model on multiple products in a single model creation statement, you need to specify the parameter TIME_SERIES_ID_COL as item_name. If you were only forecasting a single item, you would not need to specify TIME_SERIES_ID_COL. For more information, see the BigQuery ML time series model creation documentation.

Evaluate the time series model

You can use the ML.EVALUATE function (see its documentation) to see the evaluation metrics of all the created models, one per item. In this example, there were five models trained, one for each of the products in item_name. The first four columns (non_seasonal_{p,d,q} and has_drift) define the ARIMA model. The next three metrics (log_likelihood, AIC, and variance) are relevant to the ARIMA model fitting process. The fitting process determines the best ARIMA model by using the auto.ARIMA algorithm, one for each time series. Of these metrics, AIC is typically the go-to metric to evaluate how well a time series model fits the data while penalizing overly complex models; as a rule of thumb, the lower the AIC score, the better. Finally, the seasonal_periods detected for each of the five items happened to be the same: WEEKLY.

Make predictions using the model

Make predictions using ML.FORECAST (see the syntax documentation), which forecasts the next n values, as set in horizon. You can also change the confidence_level, the percentage of forecasted values that should fall within the prediction interval. With a forecast horizon of 30, the model makes predictions for the next 30 days, since the training data was daily, and the result contains 30 forecasted values per item (30 × the number of items rows in total). Each forecasted value also shows the upper and lower bound of the prediction_interval, given the confidence_level. The forecasting script can use DECLARE and EXECUTE IMMEDIATE to parameterize the horizon and confidence_level inputs; these HORIZON and CONFIDENCE_LEVEL variables make it easier to adjust the values later, which improves code readability and maintainability (see the sketch below). To learn how this syntax works, you can read the documentation on scripting in Standard SQL.
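Pulling the training, evaluation, and forecasting steps together, here is a minimal end-to-end sketch. It assumes the bqmlforecast.training_data table created earlier; the model name bqmlforecast.arima_model and the exact option values are illustrative placeholders rather than the code from the original post.

```sql
-- Sketch only: train one ARIMA-based model per item, then evaluate and forecast.
-- DECLARE statements must come first in a BigQuery script.
DECLARE HORIZON STRING DEFAULT "30";            -- forecast the next 30 days
DECLARE CONFIDENCE_LEVEL STRING DEFAULT "0.90";

-- Train: one time series per item_name, with US holiday effects enabled.
CREATE OR REPLACE MODEL bqmlforecast.arima_model
OPTIONS(
  MODEL_TYPE = 'ARIMA_PLUS',                    -- 'ARIMA' in older BigQuery ML versions
  TIME_SERIES_TIMESTAMP_COL = 'date',
  TIME_SERIES_DATA_COL = 'total_amount_sold',
  TIME_SERIES_ID_COL = 'item_name',
  HOLIDAY_REGION = 'US'
) AS
SELECT date, item_name, total_amount_sold
FROM bqmlforecast.training_data;

-- Evaluate: one row of ARIMA metrics (non_seasonal_{p,d,q}, AIC, ...) per item.
SELECT *
FROM ML.EVALUATE(MODEL bqmlforecast.arima_model);

-- Forecast: horizon and confidence level are parameterized via
-- DECLARE + EXECUTE IMMEDIATE, as described above.
EXECUTE IMMEDIATE FORMAT("""
    SELECT *
    FROM ML.FORECAST(MODEL bqmlforecast.arima_model,
                     STRUCT(%s AS horizon, %s AS confidence_level))
    """, HORIZON, CONFIDENCE_LEVEL);
```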
Plot the forecasted predictions

You can use your favourite data visualization tool, or use the template code on Github for matplotlib and Data Studio.

How do you automatically re-train the model on a regular basis?

If you’re like many retail businesses that need to create fresh time-series forecasts based on the most recent data, you can use scheduled queries to automatically re-run your SQL queries, including your CREATE MODEL, ML.EVALUATE or ML.FORECAST queries.

1. Create a new scheduled query in the BigQuery UI. You may need to first “Enable Scheduled Queries” before you can create your first one.
2. Input your requirements (e.g., repeats Weekly) and select “Schedule”.
3. Monitor your scheduled queries on the BigQuery Scheduled Queries page.

Extra tips on using time series with BigQuery ML

Inspect the ARIMA model coefficients

If you want to know the exact coefficients for each of your ARIMA models, you can inspect them using ML.ARIMA_COEFFICIENTS (see its documentation). For each of the models, ar_coefficients shows the model coefficients of the autoregressive (AR) part of the ARIMA model. Similarly, ma_coefficients shows the model coefficients of the moving-average (MA) part. Both are arrays, whose lengths are equal to non_seasonal_p and non_seasonal_q, respectively. The intercept_or_drift is the constant term in the ARIMA model.
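As a quick illustration of the coefficient inspection described above, using the same placeholder model name as the earlier sketches:

```sql
-- Sketch only: list AR/MA coefficients and the intercept/drift term
-- for each per-item ARIMA model.
SELECT *
FROM ML.ARIMA_COEFFICIENTS(MODEL bqmlforecast.arima_model);
```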
Summary

Congratulations! You now know how to train your time series models using BigQuery ML, evaluate your model, and use the results in production.

Code on Github

You can find the full code in this Jupyter notebook on Github: https://github.com/GoogleCloudPlatform/analytics-componentized-patterns/tree/master/retail/time-series/bqml-demand-forecasting

Join me on February 4 for a live walkthrough of how to train, evaluate and forecast inventory demand on retail sales data with BigQuery ML. I’ll also demonstrate how to schedule model retraining on a regular basis so your forecast models can stay up-to-date. You’ll have a chance to have your questions answered by Google Cloud experts via chat.

Want more?

I’m Polong Lin, a Developer Advocate for Google Cloud. Follow me on @polonglin or connect with me on Linkedin at linkedin.com/in/polonglin. Please leave me your comments with any suggestions or feedback.

Thanks to reviewers: Abhishek Kashyap, Karl Weinmeister

Source: Google Cloud Platform

Eventarc brings eventing to Cloud Run and is now GA

Back in October, we announced the public preview of Eventarc, a new eventing functionality that lets developers route events to Cloud Run services. In a previous post, we outlined more benefits of Eventarc: a unified eventing experience in Google Cloud, centralized event routing, consistency in eventing format and libraries, and an ambitious long-term vision. Today, we’re happy to announce that Eventarc is now generally available. Developers can focus on writing code to handle events, while Eventarc takes care of the details of event ingestion, delivery, security, observability, and error handling.

To recap, Eventarc lets you:
- Receive events from 60+ Google Cloud sources (via Cloud Audit Logs).
- Receive events from custom sources by publishing to Pub/Sub.
- Adhere to the CloudEvents standard for all your events, regardless of source, to ensure a consistent developer experience.
- Enjoy on-demand scalability and no minimum fees.

In the rest of the post, we outline some of the improvements to Eventarc since public preview.

gcloud updates

At GA, there are a few updates to Eventarc gcloud commands. First, you don’t need to specify beta in Eventarc commands anymore: instead of gcloud beta eventarc, you can simply use gcloud eventarc. Second, the --matching-criteria flag from public preview was renamed to --event-filters. Third, --destination-run-region is now optional when creating a regional trigger; if not specified, it is populated with the trigger location (specified via the --location flag or the eventarc/location property). For example, this is how you can create a trigger to listen for messages from a Pub/Sub topic in the same region as the trigger:
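A minimal sketch of what that command could look like; the trigger name, Cloud Run service, and region below are placeholder values, not taken from the original post:

```bash
# Sketch only: create a Pub/Sub-backed trigger targeting a Cloud Run service.
# Trigger, service, and region names are placeholders.
gcloud eventarc triggers create my-pubsub-trigger \
  --location=us-central1 \
  --destination-run-service=my-service \
  --event-filters="type=google.cloud.pubsub.topic.v1.messagePublished"
```

Because --destination-run-region is omitted, the Cloud Run service region defaults to the trigger location (us-central1), matching the new behavior described above.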
This trigger creates a Pub/Sub topic under the covers. If you want to use an existing Pub/Sub topic, Eventarc now allows that with an optional --transport-topic gcloud flag. There’s also a new command to list available regions for triggers. More on these below.

Bring your own Pub/Sub topic

In public preview, when you created a Pub/Sub trigger, Eventarc created a Pub/Sub topic under the covers for you to use as the transport topic between your application and a Cloud Run service. This was useful if you needed to easily and quickly create a Pub/Sub-backed trigger, but it was also limiting: there was no way to create triggers from an existing Pub/Sub topic or to set up a fanout from a single Pub/Sub topic. With today’s GA, Eventarc now allows you to specify an existing Pub/Sub topic in the same project with the --transport-topic gcloud flag, as follows:
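A sketch of the same trigger bound to an existing topic; my-project and my-topic, like the other names, are placeholders:

```bash
# Sketch only: attach the trigger to an existing Pub/Sub topic in the same
# project via --transport-topic (all names are placeholders).
gcloud eventarc triggers create my-pubsub-trigger \
  --location=us-central1 \
  --destination-run-service=my-service \
  --event-filters="type=google.cloud.pubsub.topic.v1.messagePublished" \
  --transport-topic=projects/my-project/topics/my-topic
```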
Regional expansion

In addition to the regions supported at public preview (asia-east1, europe-west1, us-central1, us-east1 and global), Eventarc is now available in four more Google Cloud regions: asia-southeast1, europe-north1, europe-west4 and us-west1. This lets you create regional triggers in eight regions, or create a global trigger and receive events from those regions. There’s also a new command, gcloud eventarc locations list, to see the list of available trigger locations. You can specify the trigger location with the --location flag on each command, or set the eventarc/location config property (via gcloud config set) to apply it globally to all commands.

Next steps

We’re excited to bring Eventarc to general availability. Getting started with Eventarc couldn’t be easier, as it does not require any setup to quickly create triggers that ingest events from various Google Cloud sources and direct them to Cloud Run services. Check out our documentation, try the Quickstart guide or our codelab.

Source: Google Cloud Platform

AWS App Mesh is now available in the AWS Africa (Cape Town) Region

AWS App Mesh is a service mesh that provides application-level networking, so your services can easily communicate with each other across multiple types of compute infrastructure. AWS App Mesh standardizes how your services communicate, giving you end-to-end visibility and options to tune for high availability of your applications.
Source: aws.amazon.com