Guest post: Using GCP for massive drug discovery virtual screening

By Woody Sherman, CSO and Vipin Sachdeva, Principal Investigator, Silicon Therapeutics

[Editor’s note: Today we hear from Boston, MA-based Silicon Therapeutics, which is applying computational methods to complex biochemical problems relevant to human biology.]

As an integrated computational drug discovery firm, we recently deployed our INSITE Screening platform on Google Cloud Platform (GCP) to analyze over 10 million commercially available molecular compounds as potential starting materials for next-generation medicines. In one week, we performed over 500 million docking computations to evaluate how a protein responds to a given molecule. Each computation involved a docking program that predicted the preferred orientation of a small molecule to a protein and the associated energetics, so we could assess whether it would bind and alter the function of the target protein.

With a combination of Google Compute Engine standard and Preemptible VMs, we used up to 16,000 cores, for a total of 3 million core-hours and a cost of about $30,000. While this might sound like a lot of time and money, it’s a lot less expensive and a lot faster than experimentally screening all compounds. A physics-based approach such as our INSITE platform is much more computationally expensive than some other computational screening approaches, but it allows us to find novel binders without using any prior information about active compounds (this particular target has no drug-like compounds known to bind). In the final stage of the calculations, we performed all-atom molecular dynamics (MD) simulations on the top 1,000 molecules to determine which ones to purchase and experimentally assay for activity.

The bottom line: We successfully completed the screen using our INSITE platform on GCP and found several molecules that have recently been experimentally verified to have on-target and cell-based activity.

We chose to run this high-performance computing (HPC) job on GCP over other public cloud providers for a number of reasons:

Availability of high-performance compute infrastructure. Compute Engine has a good inventory of high-performance processors that can be configured with large numbers of cores and large amounts of memory. It also offers GPUs — a great fit for some of our computations, such as molecular dynamics and free energy calculations. SSD made a big difference in performance, as our total I/O for this screen exceeded 40 TB of raw data. Fast connectivity between the front end and the compute nodes was also a big factor, as the front-end disk was NFS-mounted on the compute nodes.
Support for industry standard tools. As a startup, we value the ability to run our workloads wherever we see fit. Our priorities can change rapidly based on project challenges (chemistry and biology), competition, opportunities and the availability of compute resources. Our INSITE platform is built on a combination of open-source and proprietary in-house software, so portability and repeatability across in-house and public clouds is essential.
An attractive pricing model. Preemptible VMs are a great combination of cost-effectiveness and predictability, offering up to 80% off standard instances — no bidding and no surprises. That means we don’t have to worry about jobs being killed in a bidding war, which can significantly delay our screens and require unnecessary human overhead to manage the jobs.

We initialized multiple clusters for the screening. The front end consisted of three full-priced n1-highmem-32 VM instances with 208GB of RAM that ran the queuing system and connected to a 2TB SSD NFS filestore that housed the compound library. Each of these front-end nodes then spawned up to 128 compute nodes configured as n1-highcpu-32 Preemptible VMs, each with 28.8GB of memory. Those compute nodes performed the actual molecular compound screens and wrote their results back to the filestore. Preemptible VMs run for a maximum of 24 hours; when that time elapsed, the front-end nodes drained any jobs remaining on the compute nodes and spawned a new set of nodes, until all 10 million compounds had been successfully run.

To manage compute jobs, we enlisted the help of two popular open-source tools: Slurm, a workload manager used by 60% of the world’s TOP500 clusters, and ElastiCluster, which provides a command-line tool to create, manage and set up compute clusters hosted on a variety of cloud infrastructures. Using these open-source packages is economical, provides the lion’s share of the functionality of paid software solutions and ensures we can run our workloads in-house or elsewhere.

More compute = better results
But ultimately, the biggest benefit of using GCP was being able to more thoroughly screen compounds than we could have done with in-house resources. The target protein in this particular study was highly flexible, and having access to massive amounts of compute power allowed us to more accurately model the underlying physics of the system by accounting for protein flexibility. This yielded more active compounds than we would have found without the GCP resources.

The reality is that all proteins are flexible, and undergo some form of induced fit upon ligand binding, so treating protein flexibility is always important in virtual screening if you want the best results. Most molecular docking programs only account for ligand flexibility, so if the receptor structure is not quite right then active compounds might not fit and therefore be missed, no matter how good the docking program is. Our INSITE screening platform incorporates protein flexibility in a novel way that can greatly improve the hit rate in virtual screening, even as it requires a lot of computational resources when screening millions of commercially available compounds.

Example of the dynamic nature of the protein target (Interleukin-18, IL-18)

From the initial 10 million compounds, we prioritized 250 promising compounds for experimental validation in our lab. As a small company, we don’t have the capabilities to experimentally screen millions of compounds, and there’s no need to do so with an accurate virtual screening approach like the one in our INSITE platform. We’re excited to report that at least five of these compounds have shown activity in human cells, making them promising starting points for new medicines. To our knowledge, there are no drug-like small molecule activators of this important and challenging immuno-oncology target.

To learn more about the science at Silicon Therapeutics, please visit our website. And if you’re an engineer with expertise in high performance computing, GPUs and/or molecular simulations, be sure to visit our job listings.
Source: Google Cloud Platform

Resumable Online Index Rebuild is in public preview for Azure SQL DB

We are delighted to announce that Resumable Online Index Rebuild (ROIR) is now available for public preview in Azure SQL DB. With this feature, you can resume a paused index rebuild operation from where it was paused rather than having to restart the operation at the beginning. Additionally, this feature rebuilds indexes using only a small amount of log space. You can use the new feature in the following scenarios:

Resume an index rebuild operation after an index rebuild failure (such as after a database failover or after running out of disk space). There is no need to restart the operation from the beginning. This can save a significant amount of time when rebuilding indexes for large tables.
Pause an ongoing index rebuild operation and resume it later. For example, you may need to temporarily free up system resources in order to execute a high priority task or you may have a single maintenance window that is too short to complete the operation for a large index. Instead of aborting the index rebuild process, you can pause the index rebuild operation and resume it later without losing prior progress.
Rebuild large indexes without using a lot of log space or holding a long-running transaction that blocks other maintenance activities. This helps with log truncation and avoids out-of-log errors that are possible for long-running index rebuild operations.
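
As a minimal sketch of these operations in T-SQL (the index and table names below are hypothetical), a resumable online rebuild can be started, paused, monitored, and resumed along the following lines:

     T-SQL commands

-- Start an online, resumable rebuild and cap each run at 60 minutes
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- Pause the rebuild to free up resources; progress is preserved
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders PAUSE;

-- Check the progress of resumable index operations
SELECT name, percent_complete, state_desc FROM sys.index_resumable_operations;

-- Resume the rebuild from where it was paused
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders RESUME;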

For more information about ROIR, please review the following documents:

Guidelines for Online Index Operations
ALTER INDEX (Transact-SQL)
sys.index_resumable_operations

For public preview communication on this topic please contact the ResumableIDXPreview@microsoft.com alias.
Source: Azure

Database Scoped Global Temporary Tables in public preview for Azure SQL DB

We are delighted to announce that Database Scoped Global Temporary Tables are in public preview for Azure SQL DB. Similar to global temporary tables for SQL Server (tables prefixed with ##table_name), global temporary tables for Azure SQL DB are stored in tempdb and follow the same semantics. However, rather than being shared across all databases on the server, they are scoped to a specific database and are shared among all users’ sessions within that same database. User sessions from other Azure SQL databases cannot access global temporary tables created by sessions connected to a given database. Any user can create global temporary objects.

Example

Session A creates a global temp table ##test in Azure SQL Database testdb1 and adds 1 row

     T-SQL command

CREATE TABLE ##test ( a int, b int);
INSERT INTO ##test values (1,1);

Session B connects to Azure SQL Database testdb1 and can access table ##test created by session A

     T-SQL command

SELECT * FROM ##test;
-- Results
1,1
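
To illustrate the database scoping (a sketch assuming a second, hypothetical database testdb2), Session C connects to Azure SQL Database testdb2 and cannot access table ##test created in testdb1

     T-SQL command

SELECT * FROM ##test;
-- Fails with an invalid object name error, because ##test is scoped to testdb1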

For more information on Database Scoped Global Temporary Tables for Azure SQL DB, see CREATE TABLE (Transact-SQL).
Source: Azure

Amazon Could Get Into Meal Kits, And It's Crushing Blue Apron

Investors think the meal-kit delivery service Blue Apron has a very dangerous enemy: Amazon.

Shares of the newly public meal-kit company dived again on Monday morning following news that Amazon had applied to trademark the phrase “We do the prep. You be the chef,” suggesting that the e-commerce giant is planning to launch its own prepared meal-kit service.

Blue Apron's stock fell about 11% to $6.55 as of noon on Monday, down 35% from its initial public offering price of $10. The IPO had originally been projected to be priced between $15 and $17, but the price was lowered as concerns mounted about Blue Apron's high marketing spending and about pressure from Amazon's plan to buy Whole Foods.

The trademark application is for “prepared food kits composed of meat, poultry, fish, seafood, fruit and/or vegetables and also including sauces or seasonings, ready for cooking and assembly as a meal” as well as frozen meals. While the filing doesn't mention delivery specifically, it is listed under, among other things, “retail store services and online retail store services in the field of fresh and prepared foods and dry goods.”

The British newspaper The Times first reported the trademark application on Sunday.

Amazon and Blue Apron did not immediately respond to requests for comment.

Amazon is planning to acquire Whole Foods for almost $14 billion, a move that put the fear of Bezos into much of the retail and grocery sector. The merger was announced while Blue Apron was preparing to go public, and may have contributed to the meal-kit company's bankers lowering their estimate of its share value right up until the day the shares began to trade.

While Amazon has said little about how it would operate Whole Foods, many analysts have speculated that some kind of meal-kit delivery service — using Amazon's existing logistics expertise and built-in network of Amazon Prime customers — would be a natural integration between the two.

Not all patents and trademarks registered by technology companies turn into actual services, but Blue Apron has been a skittish stock since it went public late last month — it fell over 10% last week after a brokerage firm put out a research report pegging its value at only $2 a share. Amazon shares were up by less than 1% by mid-Monday.


Source: BuzzFeed

Azure Data Lake Tools for Visual Studio Code (VSCode) July updates

We are pleased to announce the July updates of Azure Data Lake Tools for VSCode. This is a quality milestone: we added local debug capability for C# code-behind for Windows users, refined the Azure Data Lake (ADLA and ADLS) integration experience, and focused on refactoring components and fixing bugs. Azure Data Lake Tools for VSCode is an extension for developing U-SQL projects against Microsoft Azure Data Lake. It provides a cross-platform, lightweight and keyboard-focused authoring experience for U-SQL while maintaining a rich set of development functions.

Summary of key updates

Local Run for Windows users. This update allows you to perform a local run to test against your local data and execute your script locally before publishing production-ready code to ADLA. Use the command ADL: Start Local Run Service to start the local run service; the CMD console shows up. First-time users should enter 3 and set up the data root. Use the command ADL: Submit Job to submit your job to your local account. After job submission, you can view the submission details by clicking jobUrl in the output window, or view the job submission status in the CMD console.

Local Debug for Windows users. Local debug enables you to debug your C# code-behind, step through the code, and validate your script locally before submitting it to ADLA. Use the command ADL: Start Local Run Service to start the local run service and set a breakpoint in your code-behind, then run the command ADL: Local Debug to start the local debug service. You can debug through the debug console and view parameter, variable, and call stack information.

Register assemblies through configuration. Registering assemblies through configuration gives you more flexibility to register your dependencies and upload your resources. Use the command ADL: Register Assembly through Configuration to register your assembly, register the assembly dependencies, and upload resources through a simple configuration.

Upload file through configuration. Uploading files through configuration boosts your productivity and lets you upload multiple files at the same time. Use the command ADL: Upload File through Configuration to upload multiple files through a simple configuration.

How do I get started?

First, install Visual Studio Code and download the prerequisite files, including JRE 1.8.x, Mono 4.2.x (Linux and Mac), and .NET Core (Linux and Mac). Then get the latest ADL Tools by going to the VSCode Extension repository or VSCode Marketplace and searching for “Azure Data Lake Tools for VSCode”.

For more information about Azure Data Lake Tools for VSCode, please see:

Get more information on using Data Lake Tools for VSCode.
Watch the ADL Tools for VSCode user instructions video.
Learn more about how to get started on Data Lake Analytics.
Learn how to develop U-SQL assemblies for Azure Data Lake Analytics jobs.

If you encounter any issues, please submit them at https://github.com/Microsoft/AzureDatalakeToolsForVSCode/issues. Want to make this extension even more awesome? Share your feedback.
Source: Azure

Azure Stream Analytics now available in UK West, Canada Central and East

As part of our ongoing commitment to enable higher performance and support customer requirements around data location, we’re pleased to announce that Azure Stream Analytics is now available in 3 additional regions: UK West, Canada Central, and Canada East.

With this announcement, Stream Analytics is now available in 26 Azure regions worldwide. For more information about local pricing, please visit Azure Stream Analytics pricing webpage.

Azure Stream Analytics is a serverless, scale-out job service built to help customers easily develop and run massively parallel real-time analytics across multiple streams of data using a simple SQL-like language. For example, a recent case study demonstrates how SkyAlert leveraged Azure Stream Analytics in conjunction with other Azure services to build an early-warning system that alerts citizens about an impending earthquake up to 2 minutes before it is felt, which could potentially save many precious lives in the event of a natural disaster.
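
As an illustrative sketch of that SQL-like language (the input, output, and field names here are hypothetical), a query that computes a 30-second tumbling-window average per device might look like this:

SELECT deviceId, AVG(temperature) AS avgTemperature
INTO [alerts-output]
FROM [sensor-input] TIMESTAMP BY eventTime
GROUP BY deviceId, TumblingWindow(second, 30)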

New to Azure Stream Analytics? Learn to build your first Stream Analytics application by following this step-by-step guide.
Source: Azure

Now You Can Use AWS Directory Service for Microsoft Active Directory to Help Maintain HIPAA and PCI Compliance in the AWS Cloud

Now you can use AWS Directory Service for Microsoft Active Directory (Enterprise Edition), also known as AWS Microsoft AD, to build and run Active Directory (AD)–aware applications in the AWS Cloud that are subject to U.S. Health Insurance Portability and Accountability Act (HIPAA) or Payment Card Industry Data Security Standard (PCI DSS) compliance. AWS Microsoft AD reduces the effort required of you to deploy compliant AD infrastructure for your cloud-based applications, as you manage your own HIPAA risk management programs or PCI DSS compliance certification. 
Source: aws.amazon.com