Amazon Aurora supports PostgreSQL 9.6.22, 10.17, 11.12, and 12.7 in the AWS GovCloud (US) Region

Following the announcement of updates to the PostgreSQL database by the open-source community, we have updated Amazon Aurora PostgreSQL compatibility to support PostgreSQL versions 9.6.22, 10.17, 11.12, and 12.7 in the AWS GovCloud (US) Region. These releases contain bug fixes and improvements from the PostgreSQL community. As a reminder, Amazon Aurora PostgreSQL 9.6 will reach end of life on January 31, 2022. Minor version 9.6.22 is limited to upgrades for clusters that are already running Aurora PostgreSQL 9.6.
Source: aws.amazon.com

Lincoln Laboratory honored for transfer of security-enhancing technologies

The Federal Laboratory Consortium for Technology Transfer (FLC) awarded their 2021 Excellence in Technology Transfer Award for the Northeast region to two Lincoln Laboratory technologies developed to improve security.

The first technology, Forensic Video Exploitation and Analysis (FOVEA), is a suite of analytic tools that makes it significantly easier for investigators to review surveillance video footage. The second technology, Keylime, is a software architecture designed to increase the security and privacy of data and services in the cloud. Both technologies have transitioned to commercial use via license or open-source access.

“These Federal Laboratory Consortium awards are an acknowledgement that the advanced capabilities developed at MIT Lincoln Laboratory are valued, not only for their contribution to enhancing national security, but also for their value to related private-sector needs,” says Bernadette Johnson, the chief technology ventures officer at Lincoln Laboratory. “Technology transfer is considered an integral element of the Department of Defense’s mission and is explicitly called out in the laboratory’s Prime Contract and Sponsoring Agreement. The transfer of these two technologies is emblematic of the unique ‘R&D-to-rapid-prototyping’ transition pipeline we have been developing at Lincoln.”

Speeding up video review 

The FOVEA program began under sponsorship from the Department of Homeland Security (DHS) to address the challenge of efficiently reviewing video surveillance footage. Searching for a specific event, investigating abandoned objects, or piecing together activity from multiple cameras can take investigators hours or even days. The task is especially challenging in large-scale closed-circuit TV systems, like those that surveil subway stations.

The FOVEA suite overcomes these challenges with three advanced tools. The first tool, video summarization, condenses all motion activity into a visual summary, transforming, for example, an hour of raw video into a three-minute product that highlights only motion. The second tool, called jump back, automatically seeks to the portion of the video where an idle object, such as a backpack, first appeared. The third tool, multi-camera navigation and path reconstruction, allows an operator to track a person or vehicle of interest across multiple camera views.
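
The article does not describe how FOVEA's video summarization works internally. As a rough illustration of the general idea of keeping only the moments that contain motion, here is a minimal Python sketch based on simple frame differencing. The function names, the tiny grayscale frames, and the threshold are all hypothetical and are not taken from FOVEA.

```python
# Minimal frame-differencing sketch (an assumption, not FOVEA's actual method).
# Each frame is a small grayscale image stored as a list of rows of pixel values.

def frame_difference(prev, curr):
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(
        abs(a - b)
        for row_prev, row_curr in zip(prev, curr)
        for a, b in zip(row_prev, row_curr)
    )
    pixels = len(curr) * len(curr[0])
    return total / pixels


def summarize_motion(frames, threshold=5.0):
    """Return the indices of frames that differ noticeably from the previous frame."""
    kept = []
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            kept.append(i)
    return kept


# Hypothetical footage: an empty scene, then a bright object appears and later leaves.
STATIC = [[10, 10], [10, 10]]
MOVED = [[10, 200], [10, 10]]
FRAMES = [STATIC, STATIC, MOVED, MOVED, STATIC, STATIC]

motion_frames = summarize_motion(FRAMES)
print(motion_frames)
```

In this toy footage only the frames where the object appears and disappears are kept. A real system would work on decoded video and would need to handle noise, lighting changes, and camera jitter.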

Notably, FOVEA’s analytic tools can be integrated directly into existing video surveillance systems and can be processed on any desktop or laptop computer. In contrast, most commercial offerings first require customers to export their video data for analysis and to purchase proprietary server equipment or cloud services.

“The project team worked very hard on not just the development of the FOVEA prototype, but also packaging the software in a way that accommodates hand-off to third-party deployment sites and transition partners,” says Marianne DeAngelus, who led the development of FOVEA with a team in the Homeland Sensors and Analytics Group.

Under government sponsorship, the developers first deployed FOVEA to two mass transit facilities. Through participation in an MIT-led Innovation-Corps program, the team then adapted the technology into a commercial application. Doradus Lab, Inc. has since licensed FOVEA for security surveillance in casinos.

“Though FOVEA was originally developed for a specific use case of mass transit security, our tech transfer to industry will make it available for a broader set of security applications that would benefit from accelerated forensic analysis of surveillance video. We and our DHS sponsor are happy that this may lead to a wider impact of the technology,” adds Jason Thornton, who leads the technical group.

Putting trust in the cloud

Keylime is making it possible for government and industry users with sensitive data to increase the security of their cloud and internet-of-things (IoT) devices. This free, open-source software architecture enables cloud customers to securely upload cryptographic keys, passwords, and certificates into the cloud without divulging these secrets to their cloud provider, and to secure their cloud resources without relying on their provider to do it for them.

Keylime started as an internal project funded through Lincoln Laboratory’s Technology Office in 2015. Eventually, the Keylime team began discussions with RedHat, one of the world’s largest open-source software companies, to expand the technology’s reach. With RedHat’s help, Keylime was transitioned in 2019 into the Cloud Native Computing Foundation as a sandbox technology with more than 30 open-source developers contributing to it from around the world. Most recently, IBM announced its plans to adopt Keylime into its cloud fleet, enabling IBM to attest to the security of its thousands of cloud servers.

“Keylime’s transfer and adoption into the open-source community and cloud environments helps to empower edge/IoT and cloud customers to validate provider claims of trustworthiness, rather than needing to rely solely on trust of the underlying environment for compliance and correctness,” says Charles Munson, who developed Keylime with former laboratory staff member Nabil Schear and adapted it as an open-source platform with Luke Hinds at RedHat. 

Keylime achieves its cloud security by leveraging a piece of hardware called a TPM, an industry-standard hardware security chip. A TPM generates a hash, a short string of numbers representing a much larger amount of data, that changes significantly if data are even slightly tampered with. Keylime can detect and react to this tampering in under a second.
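
To make the hash property concrete, the following Python sketch uses SHA-256 from the standard library to show how a one-character change in the input produces a very different digest. It is only an illustration of the hashing idea under stated assumptions: it does not talk to a TPM and does not use Keylime's API, and the helper names and sample data are hypothetical.

```python
import hashlib

# Illustration only: plain SHA-256 from the standard library, not a TPM
# measurement and not Keylime's API. All names and data are hypothetical.

def measure(data: bytes) -> str:
    """Return a short, fixed-length hex digest representing the data."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(digest_a: str, digest_b: str) -> int:
    """Count how many of the 256 digest bits differ between two hex digests."""
    return bin(int(digest_a, 16) ^ int(digest_b, 16)).count("1")


def is_untampered(data: bytes, expected_digest: str) -> bool:
    """Compare a fresh measurement against a previously trusted digest."""
    return measure(data) == expected_digest


original = b"kernel image v1.0"
tampered = b"kernel image v1.1"  # a single character changed

trusted_digest = measure(original)

print(is_untampered(original, trusted_digest))
print(is_untampered(tampered, trusted_digest))
print(differing_bits(trusted_digest, measure(tampered)), "of 256 bits differ")
```

Because the digest is short and fixed in length, comparing it against a known-good value is cheap, which is what makes frequent integrity checks practical.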

Before Keylime, TPMs were incompatible with cloud technology, slowing down systems and forcing engineers to change software to accommodate the module. Keylime gets around these problems by serving as a piece of intermediary software that allows users to leverage the security benefits of the TPM without having to make their software compatible with it.

Transferring to industry

The transition of Lincoln Laboratory’s technology to industry and government is central to its role as a federally funded research and development center (FFRDC).

The mission of the FLC is to facilitate technology transfer and to educate FFRDCs and industry about the process. More than 300 federal laboratories, facilities, research centers, and their parent agencies make up the FLC community.

The transfer of these FLC-awarded technologies was supported by Bernadette Johnson and Lou Bellaire in the Technology Ventures Office; David Pronchick, Drinalda Kume, Zachary Sweet, and Jayme Selinger of the Contracting Services Department; and Daniel Dardani in MIT’s Technology Licensing Office, along with the technology development teams. Both FOVEA and Keylime were also awarded R&D 100 Awards in 2020, acknowledging them among the year’s 100 most innovative technologies available for sale or license.

The FLC will recognize the award recipients at a regional meeting in October.
Source: Massachusetts Institute of Technology

How Lowe’s SRE reduced its mean time to recovery (MTTR) by over 80 percent

Editor’s Note: In a previous blog, we discussed how home improvement retailer Lowe’s was able to increase the number of releases it supports by adopting Google’s Site Reliability Engineering (SRE) framework on Google Cloud. Lowe’s went from one release every two weeks to 20+ releases daily, helping meet its customer needs faster and more effectively. Today, the Lowe’s SRE team shares how they used SRE principles to decrease their mean-time-to-recovery (MTTR) by over 80 percent.

The stakes of managing Lowes.com have never been higher, and that means spotting, troubleshooting and recovering from incidents as quickly as possible, so that customers can continue to do business on our site. To do that, it’s crucial to have solid incident engineering practices in place. Resolving an incident means mitigating the impact and/or restoring the service to its previous condition. The average time it takes to do this is called mean time to recovery (MTTR). Tracking this metric helps us stay on top of the overall reliability of our systems at Lowe’s, while simultaneously improving the speed with which we recover. Our goal is to keep the MTTR metric as low as possible, so that failures don’t negatively impact our business. Here are the four areas we addressed to drive holistic improvement in our MTTR.

Lowe’s incident reporting process

To reduce MTTR, we created a seamless incident reporting process following SRE principles. Our incident reporting process is a workflow that starts at the time an incident occurs, and ends with an SRE captain who closes the action items after a postmortem report. With this approach, we are able to limit the number of critical incidents. The reporting process involves three core components: monitoring, alerting, and blameless postmortems.

Monitoring and alerting

Having proper monitoring and alerting in place is crucial when it comes to incident management.
Monitoring and alerting tools let you detect issues as soon as they occur, and notify the right person in the shortest possible time to take action. From a measurement standpoint, we track this as our mean time to acknowledge (MTTA). This is the average time it takes from when an alert is triggered to when work on the issue begins.

At the time of an incident, our monitoring and alerting tools notify the on-call SRE first responder via PagerDuty in the form of a phone call, text message and email. Our SRE software engineering team has done a lot of automation to enable various Service Level Indicator (SLI) alerts and Service Level Agreement (SLA) notifications. The on-call SRE then initiates a triage call with our service/domain stakeholders to resolve the incident. As a result, we reduced our MTTA from 30 minutes in 2019 to one minute, a 97 percent decrease.

Blameless postmortems: learning from incidents

A postmortem is a written record of an incident, its impact, the actions taken to resolve it, the root cause and the follow-up actions to prevent the incident from recurring. A blameless postmortem builds on that and is a core part of an SRE culture, and our culture at Lowe’s. We ensure that individuals are not singled out, and the outcomes of all postmortems are directed toward learning and process improvement.

For us, the postmortem process is the biggest part of our incident workflow. When an SRE creates a new postmortem report, the first step is to conduct a postmortem session with domain stakeholders to review the report. The postmortem then goes into the review stage and gets reviewed by more stakeholders in our weekly postmortem meeting.
In the final stage of this process, the SRE captain will close the report once everyone in the weekly meeting agrees that the report is complete.

To conduct a successful postmortem, it is critical to keep the focus on identifying gaps and issues with the system and operations processes, rather than on an individual, and to generate concrete actions that address the problems we’ve identified. To ensure this, we follow a couple of best practices:

- We start by gathering the facts from the person who identified the problem, and each SLI owner has to identify a gap or the next SLI upstream owner who created the impact for them.
- Every SLI owner is provided full opportunity to present their case, and identifying the issue is done as a community exercise.

Once action items and process changes are identified, an owner is nominated, or volunteers, to complete the actions. For easy reference, we publish and store postmortems in our incident knowledge base. This process helps SREs continuously improve as future incidents arise.

Continuous improvement

Encouraging the culture of honest, transparent and direct feedback that blameless postmortems require is often an iterative process. It needs sponsorship from executives, who empower incident captains to lead the entirety of the discussion and outcomes. Running successful postmortems, and completing action items from them, needs to be recognized and accounted for in SRE performance objective assessment. As shared in Google’s SRE book, the best practice is to ensure that writing effective postmortems is a rewarded and celebrated practice, with leadership’s acknowledgement and participation. This is possibly the hardest part of an effective postmortem to accomplish during a cultural transformation unless you have full buy-in from leadership.

However, it’s all well worth it. This process is a key part of how we were able to improve our MTTR over time, from two hours in 2019 to just 17 minutes!
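
For readers who want to track the same metrics, here is a minimal Python sketch of how MTTA and MTTR could be computed from incident timestamps. The incident records and field names are hypothetical, and the sketch assumes that recovery time is measured from the moment the alert fires, which the article does not specify.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident log (not Lowe's data): when the alert fired,
# when work on the issue began, and when service was restored.
INCIDENTS = [
    {"alerted": "2021-03-01T10:00", "acknowledged": "2021-03-01T10:01", "resolved": "2021-03-01T10:15"},
    {"alerted": "2021-03-04T14:30", "acknowledged": "2021-03-04T14:32", "resolved": "2021-03-04T14:50"},
    {"alerted": "2021-03-09T09:00", "acknowledged": "2021-03-09T09:00", "resolved": "2021-03-09T09:16"},
]


def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def mtta_minutes(incidents) -> float:
    """Mean time to acknowledge: alert triggered until work on the issue begins."""
    return mean(minutes_between(i["alerted"], i["acknowledged"]) for i in incidents)


def mttr_minutes(incidents) -> float:
    """Mean time to recovery, assumed here to run from alert until resolution."""
    return mean(minutes_between(i["alerted"], i["resolved"]) for i in incidents)


def percent_reduction(before: float, after: float) -> float:
    """Relative improvement of a metric, as a percentage of the earlier value."""
    return (before - after) / before * 100


print(mtta_minutes(INCIDENTS), "minutes MTTA")
print(mttr_minutes(INCIDENTS), "minutes MTTR")
print(round(percent_reduction(30, 1)), "percent MTTA reduction")
```

The last line applies the percentage formula to the MTTA figures quoted above, 30 minutes down to one minute, which rounds to the 97 percent decrease reported in the article.
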
Our SRE incident reporting process has also transformed how our company solves issues. By streamlining this workflow from alerting, to solving an issue, to blameless postmortems, we have reduced our MTTR by 82 percent and our MTTA by 97 percent. Most importantly, our team is learning from every incident and becoming better engineers as a result. Visit the SRE Google Cloud website to learn more about implementing SRE best practices in the cloud.

Acknowledgement

Special thanks to Rahul Mohan Kola Kandy, Vivek Balivada, and the Digital SRE team at Lowe’s for contributing to this blog post.
Source: Google Cloud Platform