Using Google Cloud Speech-to-Text to transcribe your Twilio calls in real-time

Developers have asked us how they can use Google Cloud's Speech-to-Text to transcribe speech (especially phone audio) coming from Twilio, a leading cloud communications PaaS. We're pleased to announce that it's now easier than ever to integrate live call data with Google Cloud's Speech-to-Text using Twilio's Media Streams. The new TwiML <Stream> command streams call audio to a websocket server. This makes it simple to move call audio from your business phone system into an AI platform that can transcribe that data in real time, use it for use cases like helping contact center agents and admins, and store it for later analysis. When you combine this new functionality with Google Cloud's Speech-to-Text and other infrastructure and analytics tools like BigQuery, you can create an extremely scalable, reliable, and accurate way of getting more value from your audio.

Architecture
The overall architecture for this flow works as follows. Twilio creates and manages the inbound phone number. Its new Stream command takes the audio from an incoming phone call and sends it to a configured websocket server running in a simple App Engine flexible environment. From there, forwarding the audio to Cloud Speech-to-Text as it arrives is straightforward. Once a transcript is created, it's stored in BigQuery, where real-time analysis can be performed.

Configuring your phone number
Once you've bought a number in Twilio, you'll need to configure your phone number to respond with TwiML, which stands for Twilio Markup Language. It's a tag-based language much like HTML; when a call comes in, Twilio hands off control to the TwiML that you provide. Next, navigate to your list of phone numbers and choose your new number. On the number settings screen, scroll down to the Voice section. There is a field labelled "A Call Comes In". Here, choose TwiML Bin from the drop-down and press the plus button next to the field to create a new TwiML Bin.

Creating a TwiML Bin
TwiML Bins are a serverless solution that can seamlessly host TwiML instructions. Using a TwiML Bin saves you from having to set up a webhook handler in your own web-hosted environment. Give your TwiML Bin a Friendly Name that you can remember later. In the Body field, enter TwiML along the lines of the sketch below, replacing the url attribute of the <Stream> tag and the phone number contained in the body of the <Dial> tag. The <Stream> tag starts the audio stream asynchronously, and then control moves on to the <Dial> verb. <Dial> will call that number. The audio stream will end when the call is completed.
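A minimal sketch of that TwiML might look like the following; the websocket URL and the number to dial are placeholders to replace with your own, and the asynchronous <Stream> is wrapped in a <Start> verb:

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Start>
        <Stream url="wss://your-app.appspot.com/" />
      </Start>
      <Dial>+15551234567</Dial>
    </Response>

With TwiML like this, Twilio forks the call audio to your websocket while the dialed leg of the call proceeds as usual.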
Save your TwiML Bin and make sure that you see your Friendly Name in the "A Call Comes In" drop-down next to TwiML Bin. Make sure to Save your phone number.

Setup in Google Cloud
This setup can be done either in an existing Google Cloud project or in a new one. To set up a new project, follow the instructions here. Once you have selected the project you want to work in, you'll need to set up a few key things before getting started:

1. Enable the Cloud Speech-to-Text API. You can do that by following the instructions here and searching for "Cloud Speech-to-Text API".
2. Create a service account for your App Engine flexible environment to use when accessing other Google Cloud services. You'll need to download the private key as a JSON file as well.
3. Add firewall rules to allow your App Engine flexible environment to accept incoming connections for the websocket. A command like the following should work from a gcloud-enabled terminal:

    gcloud compute firewall-rules create default-allow-websockets-8080 --allow tcp:8080 --target-tags websocket --description "Allow websocket traffic on port 8080"

App Engine flexible environment setup
For the App Engine application, we will take the sample code from Twilio's repository to create a simple Node.js websocket server. You can find the GitHub page here, with instructions on environment setup. Once the code is in your project folder, you'll need to do a few more things to deploy your application:

1. Take the service account JSON key you downloaded earlier, rename it to "google_creds.json", and place it in the same directory as the Node.js code.
2. Create an app.yaml file that looks like the following:

    runtime: nodejs
    env: flex
    manual_scaling:
      instances: 1
    network:
      instance_tag: websocket

Once these two items are in order, you can deploy your application with the command:

    gcloud app deploy

Once deployed, you can tail the console logs with the command:

    gcloud app logs tail -s default

Verifying your stream is working
Call your Twilio number, and you should immediately be connected with the number specified in your TwiML. You should see a websocket connection request made to the URL specified in the <Stream>, and your websocket should immediately start receiving messages. If you are tailing the logs in the console, the application will log the intermediate messages as well as any final utterances detected by Google Cloud's Speech-to-Text API.

Writing transcriptions to BigQuery
To analyze the transcripts later, we can create a BigQuery table and modify the sample code from Twilio to write to that table. Instructions for creating a new BigQuery table can be found here. Given the way Google Speech-to-Text structures its transcription results, a potential schema for the table might look something like the sketch below. Once a table like this exists, you can modify the Twilio sample code to also stream data to the BigQuery table using sample code found here.
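As a minimal sketch, the table could capture the transcribed text along with the Speech-to-Text confidence score and some call metadata. The dataset and field names below are illustrative assumptions, not a prescribed schema:

    CREATE TABLE call_data.transcripts (
      call_sid STRING,
      transcript STRING,
      confidence FLOAT64,
      created_at TIMESTAMP
    );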
Conclusion
Twilio's new Stream command allows users to quickly make use of the real-time audio moving through their phone systems. Paired with Google Cloud, that data can be transcribed in real time and passed on to numerous other applications. This ability to get high-quality transcription in real time can benefit businesses, from helping contact center agents document and understand phone calls to analyzing data from the transcripts of those calls. To learn more about Cloud Speech-to-Text, visit our website.
Quelle: Google Cloud Platform

Spot slow MySQL queries fast with Stackdriver Monitoring

When you're serving customers online, speed is essential for a good experience. As the amount of data in a database grows, queries that used to be fast can slow down. For example, if a query has to scan every row because a table is missing an index, response times that were acceptable with a thousand rows can turn into multiple seconds of waiting once you have a million rows. If this query is executed every time a user loads your web page, their browsing experience will slow to a crawl, causing user frustration. Slow queries can also impact automated jobs, causing them to time out before completion. If too many of these slow queries execute at once, the database can even run out of connections, causing all new queries, slow or fast, to fail.

The popular open-source database MySQL, and Google Cloud Platform's fully managed version, Cloud SQL for MySQL, include a feature to log slow queries, letting you find the cause and then optimize for better performance. However, developers and database administrators typically only access this slow query log reactively, after users have seen the effects and escalated the performance degradation.

With Stackdriver Logging and Monitoring, you can stay ahead of the curve on database performance with automatic alerts when query latency goes over a threshold, and a monitoring dashboard that lets you quickly pinpoint the specific queries causing the slowdown.

Architecture for monitoring MySQL slow query logs with Stackdriver
To get started, import MySQL's slow query log into Stackdriver Logging. Once the logs are in Stackdriver, it's straightforward to set up logs-based metrics that can both count the number of slow queries over time, which is useful for setting up appropriate alerts, and provide breakdowns by slow SQL statement, allowing speedy troubleshooting. What's more, this approach works equally well for managed databases in Cloud SQL for MySQL and for self-managed MySQL databases hosted on Compute Engine.
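For a self-managed MySQL instance on Compute Engine, a minimal sketch of turning on the slow query log looks like the statements below; the one-second threshold and file path are illustrative choices, and on Cloud SQL for MySQL you would set the equivalent database flags on the instance instead:

    -- Write slow queries to a log file that the Logging agent can pick up
    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL log_output = 'FILE';
    SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';
    -- Treat anything slower than one second as a slow query
    SET GLOBAL long_query_time = 1;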
For a step-by-step tutorial to set up slow query monitoring, check out Monitoring slow queries in MySQL with Stackdriver. For more ideas about what else you can accomplish with Stackdriver Logging, check out Design patterns for exporting Stackdriver Logging.
Quelle: Google Cloud Platform

What’s happening in BigQuery: Adding speed and flexibility with 10x streaming quota, Cloud SQL federation and more

We've been busy this summer releasing new features for BigQuery, Google Cloud's petabyte-scale data warehouse. BigQuery lets you ingest and analyze data quickly and with high availability, so you can find new insights, trends, and predictions to efficiently run your business. Our Google Cloud engineering team is continually making improvements to BigQuery to accelerate your time to value. Recently added BigQuery features include a newly built back end with 10x the streaming quota, the ability to query Cloud SQL datasets live, and the ability to run your existing TensorFlow models in BigQuery. These new features are designed to help you stream, analyze, and model more data faster, and with more flexibility. Read on to learn more about these new capabilities, and get quick demos and tutorial links so you can try these features yourself.

10x BigQuery streaming quota, now in beta
We know your data needs to move faster than your business, so we're always working on adding efficiency and speed. The BigQuery team has completely redesigned the streaming back end to increase the default Streaming API quota by a factor of 10, from 100,000 to 1,000,000 rows per second per project. The default quota for maximum bytes per second has also increased, from 100MB per table to 1GB per project, and there are now no table-level limitations. This means you get greater capacity and better performance for streaming workloads like IoT. There's no change to the current streaming API: you can choose whether you'd like to use the new streaming back end by filling out this form, and if you do, you won't have to change your BigQuery API code, since the new back end uses the same BigQuery Streaming API. Note that this quota increase only applies if you don't need the best-effort deduplication offered by the current streaming back end; you opt out of deduplication by not populating the insertId field for each row inserted when calling the streaming API.

Check out this demo from Google Cloud Next '19 to see data stream 20 GB per second from simulated IoT sensors into BigQuery, and check out the documentation for more on Streaming data into BigQuery.

Query Cloud SQL from BigQuery
Data can only create value for your business when you put it to work, and businesses need secure and easy-to-use methods to explore and manage data that is stored in multiple locations. Within Google Cloud, we use our own database tools and services to power what we do, including offering new Qwiklabs and courses each month. Internally, we manage the roadmap of new releases with a Cloud SQL back end, and an hourly Cloud Composer job pipes our transactional data from Cloud SQL into BigQuery for reporting. Such periodic exports carry considerable overhead, and they have the drawback that reports reflect data that is an hour old. This is a common challenge for enterprise business intelligence teams who want quicker insights from their transactional systems.

To avoid the overhead of periodic exports and increase the timeliness of your reports, we have expanded support for federated queries to include Cloud SQL. You can now query your Cloud SQL tables and views directly from BigQuery through a federated Cloud SQL connection (no more moving or copying data). Our curriculum dashboards now run on live data with one simple EXTERNAL_QUERY() instead of a complex hourly pipeline. This new connection feature supports both MySQL (second generation) and PostgreSQL instances in Cloud SQL.

After the initial one-time setup, you can write a query with the new SQL function EXTERNAL_QUERY(). Here's an example where we join existing customer data from BigQuery against the latest orders from our transactional system in Cloud SQL in one query.
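A minimal sketch of that pattern is shown below; the connection name, dataset, and table names are illustrative placeholders:

    SELECT
      c.customer_id,
      c.name,
      rq.order_id,
      rq.order_date
    FROM mydataset.customers AS c
    JOIN EXTERNAL_QUERY(
      'my-project.us.my-cloudsql-connection',
      'SELECT customer_id, order_id, order_date FROM orders ORDER BY order_date DESC LIMIT 100'
    ) AS rq
    ON rq.customer_id = c.customer_id;

The inner query runs in Cloud SQL, and its result set is joined against the BigQuery table like any other table.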
Note the cross-database JOIN on rq.customer_id = c.customer_id. BigQuery actively connects to Cloud SQL to get the latest order data. Getting live data from Cloud SQL federated in BigQuery means you will always have the latest data for reporting. This can save teams time, bring in the latest data faster, and open up analytics possibilities. We hear from customers that they are seeing the benefits of immediate querying, too.

"Our data is spread across Cloud SQL and BigQuery. We had to maintain and monitor extract jobs to copy Cloud SQL data into BigQuery for analysis, and data was only as fresh as the last run," says Zahi Karam, director of data science at Bluecore. "With Cloud SQL Federation, we can use BigQuery to run analysis across live data in both systems, ensuring that we're always getting the freshest view of our data. Additionally, we can securely enable less technical analysts to query Cloud SQL via BigQuery without having to set up additional connections."

Take a look at the demo for more, and check out the documentation to learn more about Cloud SQL federated queries from BigQuery.

BigQuery ML: Import TensorFlow models
Machine learning can do lots of cool things for your business, but it needs to be easy and fast for users. For example, say your data science teams have created a couple of models and they need your help to make quick batch predictions on new data arriving in BigQuery. With the new BigQuery ML TensorFlow prediction support, you can import your existing TensorFlow models and make batch predictions on your BigQuery tables using familiar BigQuery ML syntax. There are two steps, shown in the sketch below: first, import the model from a Cloud Storage bucket in your project; then, run batch predictions with the familiar BigQuery ML syntax. Want to run batch predictions at regular intervals as new data comes in? Simply set up a scheduled query to pull the latest data and also make the prediction. And as we highlighted in a previous post, scheduled queries can run as frequently as every 15 minutes. Check out the BigQuery ML TensorFlow User Guide for more.
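A minimal sketch of those two steps; the Cloud Storage path, dataset, model, and table names are illustrative placeholders:

    -- Step 1: import a saved TensorFlow model from Cloud Storage into BigQuery ML
    CREATE OR REPLACE MODEL mydataset.my_tf_model
    OPTIONS (
      MODEL_TYPE = 'TENSORFLOW',
      MODEL_PATH = 'gs://my-bucket/tf_model_dir/*'
    );

    -- Step 2: run batch predictions over a BigQuery table with the imported model
    SELECT *
    FROM ML.PREDICT(
      MODEL mydataset.my_tf_model,
      (SELECT * FROM mydataset.new_data)
    );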
Automatic re-clustering now available
Efficiency is essential when you're crunching through huge datasets. One key best practice for cost and performance optimization in BigQuery is table partitioning and clustering. As new data is added to your partitioned tables, it may get written into an active partition and need to be periodically re-clustered for better performance. In other data warehouses, comparable maintenance processes like "VACUUM" and "automatic clustering" traditionally require setup and come at a cost to the user. BigQuery now automatically re-clusters your data for you, at no additional cost and with no action needed on your part. Check out our recent blog post Skip the maintenance, speed up queries with BigQuery's clustering for a detailed walkthrough, and get more detail in the documentation: automatic re-clustering.

UDF performance now faster
If you run a query that uses JavaScript UDFs, it'll now take around a second less to execute on average, thanks to speedier logic for initializing the JavaScript V8 engine that BigQuery uses to compute UDFs. Don't forget you can persist and share your custom UDFs with your team, as we highlighted in our last post.

In case you missed it
For more on all things BigQuery, check out these recent posts, videos, and how-tos:

Skip the heavy lifting: Moving Redshift to BigQuery easily
Introducing the BigQuery Terraform module
Clustering 4,000 Stack Overflow tags with BigQuery k-means
Efficient spatial matching in BigQuery
Lab series: BigQuery for data analysts
GlideFinder: How we built a platform on Google Cloud that can monitor wildfires
Migrating Teradata and other data warehouses to BigQuery
How to use BigQuery ML for anomaly detection
BigQuery shared utilities GitHub library (scripts, UDFs)

To keep up on what's new with BigQuery, subscribe to our release notes and stay tuned to the blog for news and announcements. And let us know how else we can help.
Quelle: Google Cloud Platform

Introducing Red Hat OpenShift 4.2 in Developer Preview: Releasing Nightly Builds 

You might have read about the architectural changes and enhancements in Red Hat OpenShift 4 that resulted in operational and installation benefits. Or maybe you read about how OpenShift 4 assists with developer innovation and hybrid cloud deployments. I want to draw attention to another part of OpenShift 4 that we haven’t exposed to you yet…until today.
When Red Hat acquired CoreOS and had the opportunity to blend Container Linux with RHEL and Tectonic with OpenShift, the innovation did not remain only in the products we brought to market. 
An exciting part of working on new cloud-native technology is the ability to redefine how you work, to redefine how you hammer that nail. Our Red Hat engineers were building a house, and sometimes the tools they needed simply did not exist. 
OpenShift 4 represents new features, methods, and use cases that had not been attempted before with its upstream components (Kubernetes). A lot of the development process was around building internal tooling that would enable OpenShift to be successful as a distribution of Kubernetes and as an application platform. There were three specific results of that tooling work that resulted in some exciting improvements that you might already be using, but didn’t realize it just yet.

Over The Air
One of the technology concepts we leveraged from CoreOS is called "over the air" updates (OTA). If you are like me, you are probably thinking, "Why is that considered important? We have been doing yum, docker, and apt updates over the air for years!" This is different in some pretty significant ways. 

We created an entirely new software distribution system to intelligently suggest and apply Kubernetes cluster upgrades. This software component runs as a SaaS service on cloud.redhat.com and allows us to help OpenShift 4 clusters upgrade automatically.

By combining Red Hat's deep experience in managing software lifecycles with Kubernetes-native configuration, automatic management with Kubernetes Operators, and a deep feedback loop between cluster health and software rollout, we can help reduce the complexity and risk of updating your production clusters while increasing the speed at which you can roll out critical security fixes. Upgrade at the "click of a button," without stress.

This SaaS service has fleet-wide knowledge of what it is doing. Imagine if you had more than 1,200 lines of business (LOBs), each deploying up to 250 OpenShift clusters, installed on many different infrastructures, but all using your software to do so.

You would start to notice when some clusters organically updated before others, and you could observe success rates and issues. If you found a common pattern, you could stop the upgrade from reaching clusters that had not yet been upgraded. This allows Red Hat to partner more deeply with our customers and help shoulder the responsibility of running OpenShift.

OpenShift 4 introduced an optional component called telemeter. When it is enabled, we have immediate awareness and feedback when a piece of OpenShift isn't functioning properly, so we can create a fix for your issue before you even know the problem is there.

We take care of updating the operating system and the platform together. You no longer need to cycle them through two different maintenance windows, and we no longer have to suffer from configuration skew or a poorly matched target system while updating the layers on top of the stack.

All four of those benefits combine to form over the air updates (OTA). OTA is more than pushing software around. It’s about us reaching out to you and declaring that we want to do more for you. Enabling over the air updates in OpenShift may help you keep your OPEX as low as possible. 
Continuous Improvements
OpenShift 4 has been streaming continuous fixes for a few weeks. Hopefully you have noticed by now that OpenShift 4 has been able to release 8 z-stream patches in 11 weeks. That is almost one per week, and that is our goal. In OpenShift 3 we would typically release a z-stream patch once every 3-5 weeks. Why have we done this? For two reasons:

With the OTA upgrade framework in place, we have measurably higher confidence in the probability of a successful upgrade.
You do not have to take every z-stream when it is released, since they are cumulative; but by letting them out each week as they are ready, you also don't have to wait. If you are affected by an issue, you now have the lessons learned from other customers at your fingertips. 

When you are moving around that much software across public cloud and datacenter OpenShift deployments, you need strong automated testing solutions. Building that to the scale (1,000s of clusters per week) and diversity (AWS, Azure, GCP, IBM Public, vSphere, OpenStack, RHV) needed for OpenShift 4 was a large engineering project. 
In this new world of OTA coupled with continuous improvements, we refined how we were leveraging CI/CD to the point of extending its checkpoints all the way down to our customers.
Nightly Builds:
Starting today, we are giving our customers and partners the opportunity to access our nightly builds. 
In the past, we would have had to run a high touch beta to give our customers this level of access. By leveraging the new tooling built for OTA and the above continuous improvements, we are now in a position where we can expose nightly builds for a future release of OpenShift (in this case OpenShift 4.2).
These are the caveats of nightly builds: they are not for production usage, they are not supported, they will have little documentation, and not all features will be included in them. But we intend for them to get better and better the closer they get to a GA milestone. The documentation should slowly grow as well. 
What they do offer is the ability to get the earliest possible look at features as they are being developed. That can help during deployment planning and ISV-level integrations. It is for those reasons that we feel our customers and partners will truly enjoy this new opportunity.
It’s About Working as a Community:
We believe that a transformation in software development is beginning with our continuous contributions in open source, our new Kubernetes over the air upgrades, and the automated integration, testing and rollout that makes these nightly builds available. 
We have extended the classic continuous feedback loop used in agile software development to reduce the time it takes to bring fixes and innovation from customer, to community, to Red Hat, and back to the customer. The health of production clusters is assessed and translated directly into critical interventions and fixes that rapidly make their way out to the fleet. 
A software loop that in the past might have only been available to hosted services is now available for users and customers of OpenShift. We see this as a true evolution in open source software and I’m excited about where it is going to lead us…together. 
To find out more, go to try.openshift.com, log in, select your infrastructure, and then look for the new developer preview link!
Quelle: OpenShift