Amazon QuickSight introduces data bars for tables

Amazon QuickSight now supports data bars for tables. Data bars are essentially bar charts displayed for a given column, where the length of each bar indicates the magnitude of the cell value relative to the range of values in that column. Data bars make it easier to read and compare values and to spot outliers quickly. You can apply data bars only to numeric fields, and you can set the colors for positive and negative values separately. More details are available here.
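As an illustration of how such bar lengths can be derived (a hypothetical sketch, not QuickSight code; the helper name is invented), each cell's bar can be scaled relative to the largest absolute value in the column:

```python
def data_bar_fractions(values):
    """Return the fraction of the cell width each data bar fills.

    Each bar's length reflects the magnitude of the cell value
    relative to the largest absolute value in the column; the sign
    tells the renderer whether to use the positive or negative color.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0.0 for _ in values]
    return [v / max_abs for v in values]

# Example: a numeric column with positive and negative entries
fractions = data_bar_fractions([50, -25, 100, 0])
# 100 fills the full bar width; -25 fills a quarter in the negative color
```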
Source: aws.amazon.com

Submit your entry now for our *new* Talent Transformation Google Cloud Customer Award

Has your business made talent investments that directly link to your digital transformation strategy? Google Cloud wants to honor your organization for its dedication to developing your team’s Google Cloud skills through our *new* Talent Transformation Google Cloud Customer Award. Submit your application before March 31, 2023 to be recognized as a global leader in cloud talent transformation.

Celebrating Google Cloud customers’ outstanding contributions

Google Cloud Customer Awards recognize organizations who are leading business transformation with Google Cloud products and solutions. We want to hear how you are growing one of the most important elements of your organization — your people! Tell us your story for a chance to win and enjoy benefits like:

- A Google Cloud Customer Award designation for your website
- Collaboration with Google Cloud leaders, engineers and product managers at a variety of roundtables, discussions and events
- Google Cloud press release and announcement support to help strengthen your brand as a visionary leader in technology
- Promotion through the Google Cloud results blog and social media to share your success story with our extensive customer and partner network
- Inclusion in the annual Google Cloud Customer ebook
- A place amongst the global leaders who are recognized at Google Cloud events and celebrations

How to get started and tips for a submission that will stand out

Tell your compelling and unique story about cloud talent transformation! This can include mentorship, skills training, Google Cloud certification preparation support or anything you’ve built to invest in your people’s Google Cloud skills. To help your accomplishments shine, use the distinct voice and personality of your organization.
You’ll want to begin by gathering:

- Business and deployment metrics
- Solution overview diagrams, workflows, architectural diagrams or images
- Existing public case studies, webinars or other content

These awards recognize customers who demonstrate unique transformation and innovation, business/operational excellence, industry-wide problem solving, and long-term, lasting benefits. You can add depth to your submission by asking stakeholders to share their perspectives — for example, your CEO or customer testimonies are great ways to do this. Metrics and impact are also important. Share how your company is now faster, smarter, more collaborative and flexible due to the Google Cloud skills development opportunities that you provided.

Judging the entries and results announcements

A diverse panel of senior technical judges from around the world carefully assesses hundreds of entries, which are ranked using a scoring framework. We ensure high quality assessment through a three-round process, using specified benchmarks at least twice per entry. The Google Cloud Customer Awards team and the judges are the only people who see submissions, and winners are under embargo until official announcements take place. All participants will be notified of results via email at least two months prior to announcements, with results notification scheduled for May 31, 2023. Results will be formally announced and celebrated at a special event later this year, where winners take their place amongst other outstanding leaders in innovative thinking and business transformation.

For inspiration, and to learn more about the transformative organizations that have won Customer Awards with their visionary achievements, take a look at last year’s industry winners. We encourage entry by any customers – new to established, small to large, across all types of products and solutions.
Submitting your entry

In order to enter, you must be a Google Cloud customer, with success you can demonstrate within the last 12 months. Google Cloud partners and Googlers can also submit on behalf of customers. Award categories include Industry Customer Awards across a range of verticals, and our Technology for Good Awards, which include the Talent Transformation Award. You may apply for one Industry Customer Award, plus any or all of the Technology for Good Awards.

Start by using this template to gather all of the relevant information as a team. Designate one person to complete the application and submit via the Customer Awards online questionnaire. The submission window is now open through March 31, 2023. We are so excited to hear about the wonderful things you are doing to empower your teams to build upon their Google Cloud knowledge and skills — making you a leader in your industry. Happy submitting — get started here!
Source: Google Cloud Platform

How to use advanced feature engineering to preprocess data in BigQuery ML

Preprocessing and transforming raw data into features is a critical but time-consuming step in the ML process. This is especially true when a data scientist or data engineer has to move data across different platforms to do MLOps. In this blog post, we describe how we streamline this process by adding two feature engineering capabilities in BigQuery ML.

Our previous blog outlines the data to AI journey with BigQuery ML, highlighting two powerful features that simplify MLOps – data preprocessing functions for feature engineering and the ability to export the BigQuery ML TRANSFORM statement as part of the model artifact. In this blog post, we share how to use these features to create a seamless experience from BigQuery ML to Vertex AI.

Data Preprocessing Functions

Preprocessing and transforming raw data into features is a critical but time-consuming step when operationalizing ML. We recently announced the public preview of advanced feature engineering functions in BigQuery ML. These functions help you impute, normalize or encode data. When this is done inside BigQuery, the entire preprocessing workflow becomes easier, faster, and more secure.

Here is a list of the new functions we are introducing in this release. The full list of preprocessing functions can be found here.

ML.MAX_ABS_SCALER: Scale a numerical column to the range [-1, 1] without centering, by dividing by the maximum absolute value.

ML.ROBUST_SCALER: Scale a numerical column by centering with the median (optional) and dividing by the quantile range of choice ([25, 75] by default).

ML.NORMALIZER: Turn an input numerical array into a unit norm array for any p-norm: 0, 1, >1, +inf. The default is 2, resulting in a normalized array where the sum of squares is 1.

ML.IMPUTER: Replace missing values in a numerical or categorical input with the mean, median or mode (most frequent).

ML.ONE_HOT_ENCODER: One-hot encode a categorical input. Optionally, it does dummy encoding by dropping the most frequent value.
It is also possible to limit the size of the encoding by specifying k for the k most frequent categories and/or a lower threshold for the frequency of categories.

ML.LABEL_ENCODER: Encode a categorical input to integer values [0, n categories], where 0 represents NULL and excluded categories. You can exclude categories by specifying k for the k most frequent categories and/or a lower threshold for the frequency of categories.

Model Export with TRANSFORM Statement

You can now export BigQuery ML models that include a feature TRANSFORM statement. The ability to include TRANSFORM statements makes models more portable when exporting them for online prediction. This capability also works when BigQuery ML models are registered with the Vertex AI Model Registry and deployed to Vertex AI Prediction endpoints. More details about exporting models can be found in BigQuery ML Exporting models. These new features are available through the Google Cloud Console, the BigQuery API, and the client libraries.

Step-by-step guide to using the two features

In this tutorial, we will use the bread recipe competition dataset to predict judges’ ratings using linear regression and boosted tree models.

Objective: To demonstrate how to preprocess data using the new functions, register the model with the Vertex AI Model Registry, and deploy the model for online prediction with Vertex AI Prediction endpoints.

Dataset: Each row represents a bread recipe, with columns for each ingredient (flour, salt, water, yeast) and procedure (mixing time, mixing speed, cooking temperature, resting time). There are also columns that record the judges’ ratings of the final product from each recipe.

Overview of the tutorial: Steps 1 and 2 show how to use the TRANSFORM statement. Steps 3 and 4 demonstrate how to manually export and register the models.
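To make the scaling and encoding behavior described above concrete, here is a rough plain-Python sketch of the underlying math (an analogue for intuition only, not the BigQuery ML implementation; category ordering and edge cases may differ):

```python
from statistics import median

def max_abs_scale(xs):
    """ML.MAX_ABS_SCALER analogue: divide by the max absolute value,
    mapping the column into [-1, 1] without centering."""
    m = max(abs(x) for x in xs)
    return [x / m for x in xs]

def robust_scale(xs, lo=25, hi=75):
    """ML.ROBUST_SCALER analogue: center on the median and divide by
    the chosen quantile range (25th-75th percentile by default)."""
    s = sorted(xs)
    def quantile(p):
        i = (len(s) - 1) * p / 100
        j = int(i)
        k = min(j + 1, len(s) - 1)
        return s[j] + (s[k] - s[j]) * (i - j)  # linear interpolation
    spread = quantile(hi) - quantile(lo)
    med = median(xs)
    return [(x - med) / spread for x in xs]

def label_encode(xs):
    """ML.LABEL_ENCODER analogue: map categories to integers,
    reserving 0 for NULL (and excluded) values."""
    cats = sorted(set(x for x in xs if x is not None))
    mapping = {c: i + 1 for i, c in enumerate(cats)}
    return [mapping.get(x, 0) for x in xs]

print(max_abs_scale([0, -5, 10]))                   # [0.0, -0.5, 1.0]
print(label_encode(["rye", None, "wheat", "rye"]))  # [1, 0, 2, 1]
```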
Steps 5 through 7 show how to deploy a model to a Vertex AI Prediction endpoint. For the best learning experience, follow this blog post alongside the tutorial notebook.

Step 1: Transform BigQuery columns into ML features with SQL

Before training an ML model, exploring the data within columns is essential to identify the data type, distribution, scale, missing patterns, and extreme values. BigQuery ML enables this exploratory analysis with SQL. With the new preprocessing functions, it is now even easier to transform BigQuery columns into ML features with SQL while iterating to find the optimal transformation. For example, when using the ML.MAX_ABS_SCALER function for an input column, each value is divided by the maximum absolute value (10 in the example):

```sql
SELECT
  input_column,
  ML.MAX_ABS_SCALER(input_column) OVER() AS scale_column
FROM
  UNNEST([0, -1, 2, -3, 4, -5, 6, -7, 8, -9, 10]) AS input_column
ORDER BY input_column
```

Once the input columns for an ML model are identified and the feature transformations are chosen, it is tempting to apply the transformations and save the output as a view. But this has an impact on predictions later on, because the same transformations will need to be applied before requesting predictions. Step 2 shows how to prevent this separation of preprocessing and model training.

Step 2: Iterate through multiple models with inline TRANSFORM functions

Building on the preprocessing explorations in Step 1, the chosen transformations are applied inline with model training using the TRANSFORM statement. This interlocks the model iteration with the preprocessing explorations while making any candidate model ready for serving with BigQuery or beyond. This means you can immediately try multiple model types without any delayed impact of feature transformations on predictions.
In this step, two models, linear regression and boosted tree, are trained side-by-side with identical TRANSFORM statements.

Training with linear regression – Model a:

```sql
CREATE OR REPLACE MODEL `statmike-mlops-349915.feature_engineering.03_feature_engineering_2a`
TRANSFORM (
  JUDGE_A,

  ML.MIN_MAX_SCALER(flourAmt) OVER() as scale_flourAmt,
  ML.ROBUST_SCALER(saltAmt) OVER() as scale_saltAmt,
  ML.MAX_ABS_SCALER(yeastAmt) OVER() as scale_yeastAmt,
  ML.STANDARD_SCALER(water1Amt) OVER() as scale_water1Amt,
  ML.STANDARD_SCALER(water2Amt) OVER() as scale_water2Amt,

  ML.STANDARD_SCALER(waterTemp) OVER() as scale_waterTemp,
  ML.ROBUST_SCALER(bakeTemp) OVER() as scale_bakeTemp,
  ML.MIN_MAX_SCALER(ambTemp) OVER() as scale_ambTemp,
  ML.MAX_ABS_SCALER(ambHumidity) OVER() as scale_ambHumidity,

  ML.ROBUST_SCALER(mix1Time) OVER() as scale_mix1Time,
  ML.ROBUST_SCALER(mix2Time) OVER() as scale_mix2Time,
  ML.ROBUST_SCALER(mix1Speed) OVER() as scale_mix1Speed,
  ML.ROBUST_SCALER(mix2Speed) OVER() as scale_mix2Speed,
  ML.STANDARD_SCALER(proveTime) OVER() as scale_proveTime,
  ML.MAX_ABS_SCALER(restTime) OVER() as scale_restTime,
  ML.MAX_ABS_SCALER(bakeTime) OVER() as scale_bakeTime
)
OPTIONS (
  model_type = 'LINEAR_REG',
  input_label_cols = ['JUDGE_A'],
  enable_global_explain = TRUE,
  data_split_method = 'AUTO_SPLIT',
  MODEL_REGISTRY = 'VERTEX_AI',
  VERTEX_AI_MODEL_ID = 'bqml_03_feature_engineering_2a',
  VERTEX_AI_MODEL_VERSION_ALIASES = ['run-20230112234821']
) AS
SELECT * EXCEPT(Recipe, JUDGE_B)
FROM `statmike-mlops-349915.feature_engineering.bread`
```

Training with boosted tree – Model b:

```sql
CREATE OR REPLACE MODEL `statmike-mlops-349915.feature_engineering.03_feature_engineering_2b`
TRANSFORM (
  JUDGE_A,

  ML.MIN_MAX_SCALER(flourAmt) OVER() as scale_flourAmt,
  ML.ROBUST_SCALER(saltAmt) OVER() as scale_saltAmt,
  ML.MAX_ABS_SCALER(yeastAmt) OVER() as scale_yeastAmt,
  ML.STANDARD_SCALER(water1Amt) OVER() as scale_water1Amt,
  ML.STANDARD_SCALER(water2Amt) OVER() as scale_water2Amt,

  ML.STANDARD_SCALER(waterTemp) OVER() as scale_waterTemp,
  ML.ROBUST_SCALER(bakeTemp) OVER() as scale_bakeTemp,
  ML.MIN_MAX_SCALER(ambTemp) OVER() as scale_ambTemp,
  ML.MAX_ABS_SCALER(ambHumidity) OVER() as scale_ambHumidity,

  ML.ROBUST_SCALER(mix1Time) OVER() as scale_mix1Time,
  ML.ROBUST_SCALER(mix2Time) OVER() as scale_mix2Time,
  ML.ROBUST_SCALER(mix1Speed) OVER() as scale_mix1Speed,
  ML.ROBUST_SCALER(mix2Speed) OVER() as scale_mix2Speed,
  ML.STANDARD_SCALER(proveTime) OVER() as scale_proveTime,
  ML.MAX_ABS_SCALER(restTime) OVER() as scale_restTime,
  ML.MAX_ABS_SCALER(bakeTime) OVER() as scale_bakeTime
)
OPTIONS (
  model_type = 'BOOSTED_TREE_REGRESSOR',
  booster_type = 'GBTREE',
  num_parallel_tree = 1,
  max_iterations = 30,
  early_stop = TRUE,
  min_rel_progress = 0.01,
  tree_method = 'HIST',
  subsample = 0.85,
  input_label_cols = ['JUDGE_A'],
  enable_global_explain = TRUE,
  data_split_method = 'AUTO_SPLIT',
  l1_reg = 10,
  l2_reg = 10,
  MODEL_REGISTRY = 'VERTEX_AI',
  VERTEX_AI_MODEL_ID = 'bqml_03_feature_engineering_2b',
  VERTEX_AI_MODEL_VERSION_ALIASES = ['run-20230112234926']
) AS
SELECT * EXCEPT(Recipe, JUDGE_B)
FROM `statmike-mlops-349915.feature_engineering.bread`
```

Because both models use identical input columns with the same preprocessing, you can easily compare their accuracy.
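The two metrics used for that comparison, mean squared error and R², can be computed by hand. A plain-Python sketch with made-up ratings and predictions (the numbers are invented for illustration, not from the bread dataset):

```python
def mse(y_true, y_pred):
    """Mean squared error: the average squared residual."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical judge ratings vs. model predictions
y_true = [7.0, 5.5, 8.0, 6.5]
y_pred = [6.8, 5.9, 7.6, 6.6]
print(mse(y_true, y_pred))       # lower is better
print(r2_score(y_true, y_pred))  # closer to 1 is better
```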
Using the BigQuery ML function ML.EVALUATE makes this comparison as simple as a single SQL query that stacks the outcomes with the UNION ALL set operator:

```sql
SELECT 'Manual Feature Engineering – 2A' AS Approach, mean_squared_error, r2_score
FROM ML.EVALUATE(MODEL `statmike-mlops-349915.feature_engineering.03_feature_engineering_2a`)
UNION ALL
SELECT 'Manual Feature Engineering – 2B' AS Approach, mean_squared_error, r2_score
FROM ML.EVALUATE(MODEL `statmike-mlops-349915.feature_engineering.03_feature_engineering_2b`)
```

The evaluation comparison shows that the boosted tree model is much better than linear regression, with drastically lower mean squared error and higher R². Both models are ready to serve predictions, but the clear choice is the boosted tree regressor. Once you decide which model to use, you can predict directly within BigQuery ML using the ML.PREDICT function. In the rest of the tutorial, we show how to export the model outside of BigQuery ML and predict using Google Cloud Vertex AI.

Using BigQuery Models for Inference Outside of BigQuery

Once your model is trained, if you want to do online inference for low-latency responses in your application, you have to deploy the model outside of BigQuery. The following steps demonstrate how to deploy the models to Vertex AI Prediction endpoints. This can be accomplished in one of two ways:

- Manually export the model from BigQuery ML and set up a Vertex AI Prediction endpoint. To do this, complete Steps 3 and 4 first.
- Register the model and deploy it from the Vertex AI Model Registry automatically. This capability is not available yet, but will be in a forthcoming release; once it is available, Steps 3 and 4 can be skipped.

Step 3. Manually export models from BigQuery

BigQuery ML supports an EXPORT MODEL statement to deploy models outside of BigQuery. A manual export includes two models – a preprocessing model that reflects the TRANSFORM statement and a prediction model. Both are exported with a single export statement in BigQuery ML:

```sql
EXPORT MODEL `statmike-mlops-349915.feature_engineering.03_feature_engineering_2b`
  OPTIONS (URI = 'gs://statmike-mlops-349915-us-central1-bqml-exports/03/2b/model')
```

The preprocessing model that captures the TRANSFORM statement is exported as a TensorFlow SavedModel file. In this example it is exported to a GCS bucket located at 'gs://statmike-mlops-349915-us-central1-bqml-exports/03/2b/model/transform'. The prediction models are saved in portable formats that match the frameworks in which they were trained by BigQuery ML: the linear regression model is exported as a TensorFlow SavedModel, and the boosted tree regressor is exported as a Booster file (XGBoost). In this example, the boosted tree model is exported to a GCS bucket located at 'gs://statmike-mlops-349915-us-central1-bqml-exports/03/2b/model'. These export files are in standard open formats for the native model types, making them completely portable – they can be deployed to Vertex AI (Steps 4-7 below), on your own infrastructure, or even in edge applications.

Steps 4 through 7 show how to register and deploy a model to a Vertex AI Prediction endpoint. These steps need to be repeated separately for the preprocessing model and the prediction model.

Step 4. Register models to the Vertex AI Model Registry

To deploy the models in Vertex AI Prediction, they first need to be registered with the Vertex AI Model Registry. To do this, two inputs are needed – the links to the model files and the URI of a pre-built container.
Go to Step 4 in the tutorial to see exactly how it’s done. The registration can be done with the Vertex AI console or programmatically with one of the clients. In the example below, the Python client for Vertex AI is used to register the models like this:

```python
vertex_model = aiplatform.Model.upload(
    display_name = 'gcs_03_feature_engineering_2b',
    serving_container_image_uri = 'us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-1:latest',
    artifact_uri = 'gs://statmike-mlops-349915-us-central1-bqml-exports/03/2b/model'
)
```

Step 5. Create Vertex AI Prediction endpoints

Vertex AI includes a service for hosting models for online predictions. To host a model on a Vertex AI Prediction endpoint, you first create an endpoint. This can also be done directly from the Vertex AI Model Registry console or programmatically with one of the clients. In the example below, the Python client for Vertex AI is used to create the endpoint like this:

```python
vertex_endpoint = aiplatform.Endpoint.create(
    display_name = '03_feature_engineering_manual_2b'
)
```

Step 6. Deploy models to endpoints

Deploying a model from the Vertex AI Model Registry (Step 4) to a Vertex AI Prediction endpoint (Step 5) is done in a single deployment action, where the model definition is supplied to the endpoint along with the type of machine to utilize. Vertex AI Prediction endpoints can automatically scale up or down to handle prediction traffic, based on the number of replicas you provide (the default is 1 for both min and max).
In the example below, the Python client for Vertex AI is used with the deploy method of the endpoint (Step 5), passing the model (Step 4):

```python
vertex_endpoint.deploy(
    model = vertex_model,
    deployed_model_display_name = vertex_model.display_name,
    traffic_percentage = 100,
    machine_type = 'n1-standard-2',
    min_replica_count = 1,
    max_replica_count = 1
)
```

Step 7. Request predictions from endpoints

Once the model is deployed to a Vertex AI Prediction endpoint (Step 6), it can serve predictions. Rows of data, called instances, are passed to the endpoint, and results are returned that include the processed information: a preprocessing result or a prediction. Getting prediction results from Vertex AI Prediction endpoints can be done with any of the Vertex AI API interfaces (REST, gRPC, gcloud, Python, Java, Node.js). Here, the request is demonstrated directly with the predict method of the endpoint (Step 6) using the Python client for Vertex AI:

```python
results = vertex_endpoint.predict(instances = [{
    'flourAmt': 511.21695405324624,
    'saltAmt': 9,
    'yeastAmt': 11,
    'mix1Time': 6,
    'mix1Speed': 4,
    'mix2Time': 5,
    'mix2Speed': 4,
    'water1Amt': 338.3989183746999,
    'water2Amt': 105.43955159464981,
    'waterTemp': 48,
    'proveTime': 92.27755071811586,
    'restTime': 43,
    'bakeTime': 29,
    'bakeTemp': 462.14028505497805,
    'ambTemp': 38.20572852497746,
    'ambHumidity': 63.77836403396154
}])
```

The result of an endpoint with the preprocessing model is identical to applying the TRANSFORM statement in BigQuery ML. The results can then be pipelined to the endpoint with the prediction model to serve predictions that match the results of the ML.PREDICT function in BigQuery ML.
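That two-stage flow, a preprocessing endpoint feeding a prediction endpoint, can be sketched locally with stand-in functions (the feature subset, scale factors, and weights below are invented purely for illustration and are not the exported models):

```python
def preprocess(instance):
    """Stand-in for the exported TRANSFORM model: scale each feature
    by a fixed per-feature maximum, mimicking ML.MAX_ABS_SCALER."""
    max_abs = {'restTime': 60.0, 'bakeTime': 40.0}  # hypothetical maxima
    return {k: v / max_abs[k] for k, v in instance.items()}

def predict(features):
    """Stand-in for the exported prediction model: a linear score
    over the already-scaled features."""
    weights = {'restTime': 2.0, 'bakeTime': 5.0}  # hypothetical weights
    return sum(weights[k] * v for k, v in features.items())

# Pipeline the preprocessing output into the prediction model, just as
# the preprocessing endpoint's result is sent on to the prediction endpoint.
raw = {'restTime': 43, 'bakeTime': 29}
rating = predict(preprocess(raw))
```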
The results of both methods, Vertex AI Prediction endpoints and BigQuery ML with ML.PREDICT, are shown side-by-side in the tutorial to show that the model’s results are replicated. The model can now be used for online serving with extremely low latency. This even includes using private endpoints, for even lower latency and secure connections with VPC Network Peering.

Conclusion

With the new preprocessing functions, you can simplify data exploration and feature preprocessing. Further, by embedding preprocessing within model training using the TRANSFORM statement, the serving process is simplified: prepped models can be served without additional preprocessing steps. In other words, predictions are done right inside BigQuery, or alternatively the models can be exported to any location outside of BigQuery, such as Vertex AI Prediction for online serving. The tutorial demonstrated how BigQuery ML works with the Vertex AI Model Registry and Vertex AI Prediction to create a seamless end-to-end ML experience. In the future, you can expect to see more capabilities that bring BigQuery, BigQuery ML and Vertex AI together.

Click here to access the tutorial, or check out the documentation to learn more about BigQuery ML.

Thanks to Ian Zhao, Abhinav Khushraj, Yan Sun, Amir Hormati, Mingge Deng and Firat Tekiner from the BigQuery ML team.
Source: Google Cloud Platform