Docker Docs Hackathon: April 17-21, 2017

During DockerCon 2017, Docker’s docs team will be running the first-ever Docker Docs hackathon, and you’re invited to participate and win prizes – whether you attend DockerCon or are just watching the proceedings online.
Essentially, it’s a bug-bash! We have a number of bugs filed against our docs up on GitHub for you to grab.
You can participate in one of two ways:

With the docs team’s help in the fourth floor hack room at DockerCon on Tuesday, April 18th and Wednesday, April 19th, from 1-6pm.
Online! Right here! During the whole week of DockerCon (April 17th – 21st).

Or, both – if you want to have the best shot. After all, we won’t be in the hack room 24/7 that whole week.
All participants who show up in the 4th floor hack room at DockerCon will get a way-cool magnet just for stopping by.

Quick links

Official hackathon page on Docs site
Event page on DockerCon website
View hackathon bugs on GitHub
Report your hackathon work
Browse prizes
The docs channel on Slack, if you have questions

How it works
We have a number of bugs that have built up in our docs queue on GitHub, and we have labeled a whole slew of them with the tag hackathon, which you can see here.
Submit fixes for these bugs, or close them if a bit of research shows they aren’t actually valid. Every action you take earns you points, and the points are redeemable for dollars in our hackathon store. These points also qualify you for valuable prizes like an Amazon gift card and a personally engraved trophy!
Prizes

All participants: Points are redeemable for t-shirts, hoodies, sweatshirts, mugs, beer steins, pint glasses, flasks, stickers, buttons, magnets, wall clocks, postcards, and even doggie t-shirts.
3rd place: A small trophy with a personal engraving, plus store credit
2nd place: A small trophy with a personal engraving, plus store credit, plus a $150 Amazon Gift Card
1st place: A large trophy with a personal engraving, plus store credit, plus a $300 Amazon Gift Card

Bonuses
A select few will get bonuses for being extra special contributors:

Largest single change introduced in a fix (files changed/lines of delta): 1000 points
Most bugs closed (resolved as no-op or handled): 1000 points
Most participation (attended all days): 1000 points

Choosing a prize
You can see the point values for the bugs in the GitHub queue. Those are worth cash in our rewards store at http://www.cafepress.com/dockerdocshackathon.
Our points-to-cash conversion rate will be figured out at the end of the hackathon, and will essentially be a function of the number of points that hackathon participants logged, and the number of dollars we have to spend on prizes.

View available rewards

When?
The docs hackathon is going on from April 17th through April 21st, 2017. This is the time when it’s possible to claim and resolve bugs.
Where?
In-person
Attending DockerCon? Come to the fourth floor hack room on Tuesday and Wednesday from 1pm to 6pm. We’ll be there to answer questions and help you.
Note: While the hackathon is officially ongoing all week online, working in the hack room with us for these two days is by far the best way to participate; the docs team will be on-hand to get you started, get you unstuck, and guide you.
Online
Drop into the community Slack channel for the docs and ask any questions you have. Otherwise, just go to GitHub, look at our hackathon label, and come here to claim your points when you’re done.
Claiming a bug
Whether attending in person or online, to claim a bug as one you are working on (so nobody else grabs it out from under you), you must leave a comment saying you claim it. Respect other people’s claims when you see them on a bug.

View available bugs

Claiming your points
Simply fill out this form when you’re done participating. We’ll take it from there.
Conversion rate
The points-to-cash ratio will be posted on the official page for the hackathon no later than Friday the 21st. We need to figure out how many points’ worth of fixes come in first.
Sorry, but we cannot send you cash for these points under any circumstances, even if you don’t spend them.
Questions?
Ask us anything at docs@docker.com or in the docs channel on Slack.
Thank you for participating in the 2017 Docs Hackathon!


Source: https://blog.docker.com/feed/

Enterprise Ready Software from Docker Store

Docker Store is the place to discover and procure trusted, enterprise-ready containerized software – free, open source, and commercial.
Docker Store is the evolution of the Docker Hub, which is the world’s largest container registry, catering to millions of users. As of March 1, 2017, we crossed 11 billion pulls from the public registry! Docker Store leverages the public registry’s massive user base and ensures our customers – developers, operators, and enterprise Docker users – get what they ask for. The Official Images program was developed to create a set of curated and trusted content that developers could use as a foundation for building containerized software. Building on the lessons learned and best practices from that program, Docker recently launched a certification program that enables ISVs around the world to take advantage of Store to offer great software, packaged to operate optimally on the Docker platform.

The Docker Store is designed to bring Docker users and ecosystem partners together with:

Certified content: ISV apps that have been validated against Docker Enterprise Edition and come with cooperative support from Docker and the ISV
Enhanced search and discovery capabilities for containers, including filtering by platform, category, and OS
A self-service publisher workflow and interface to facilitate a scalable marketplace
Support for a range of licensing models for published content

Publishers with certified content on Docker Store include: AVI Networks, Cisco, Bleemeo, BlobCity DB, Blockbridge, CodeCov, CoScale, Datadog, Dynatrace, GitLab, Hedvig, HPE, Hypergrid, Kaazing, Koekiebox, Microsoft, NetApp, Nexenta, Nimble, Nutanix, Polyverse, Portworx, Sysdig, and Weaveworks.
The simplest way to get started is to go check out Docker Store!

Using Docker Store
For developers and IT teams building Docker apps, the Docker Store is the best place to get the components they need, available as containers. Containerization technology has emerged as a strong solution for developers, devops, and IT – and enterprises especially need assurances that software packages are trusted and “just work” when deployed. The Docker Certification program takes containers through an end-to-end testing process and provides collaborative support for any potential issues. Read more about the certification program here!

Enhanced Discovery: Easily search across a wide range of solutions from Docker, ISV containers, and plugins. Use filters and categories to narrow results to specific characteristics.
Software Trials: Where available, free trials of commercial software (including Docker) can be obtained from the Docker Store.
Community Content: Developers can continue to browse and download public Docker Hub repos from the Docker Store. The Docker community is vibrant and active, and community images remain accessible from the Docker Store.
Notifications: Alerts and updates help you manage subscriptions to Docker Store listings, including patches, fixes, and new versions.

Publish Content to Docker Store
From large ISVs with hundreds of products to small startups building new tools, Docker Store provides a marketplace to package and distribute software and plugins in containers ready for use on the Docker platform, making publishers’ tools more accessible to the community of millions of Docker users and accelerating time to value with these partner solutions.
In addition, Publishers gain the following benefits from the Docker Store:

Access to a globally scalable container distribution service.
A path to certification for software and plugin content, differentiating the solution within the ecosystem and signaling additional value to end users.
Visibility and analytics, including subscriber management and sales reports.
Flexible fulfillment and billing support with “Paid via Docker” and BYOL (Bring Your Own License) models. You focus on creating great software, and we take care of the rest.
Reputation management via ratings and reviews.

Getting started as a publisher on Docker Store is as simple as 1-2-3!

Tips for becoming a publisher:

Create great containerized content (you have probably already done this!)
Follow best practices:

https://success.docker.com/store

https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/

Use an official image as your base image.
Run the checks in github.com/docker/docker-bench-security. We will keep adding more best practices and tools to make your content robust; a Dockerfile sketch illustrating these practices follows after these tips.

Go to https://store.docker.com and click on “Publish”.
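
As a concrete illustration of the best-practice links above, here is a minimal Dockerfile sketch. The app name and files are hypothetical; the point is starting from an official, version-pinned base image and not running as root:

```dockerfile
# Sketch only: "app.py" and its requirements file are placeholders.
# Start from an official image, pinned to a specific tag.
FROM python:3.6-slim

# Create an unprivileged user instead of running as root.
RUN useradd --create-home appuser
WORKDIR /home/appuser

# Install dependencies first so Docker's layer caching works in your favor.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application itself and drop privileges.
COPY app.py .
USER appuser

CMD ["python", "app.py"]
```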

More Resources

Learn more about certification. 
Sign up for a Docker Store Workshop at DockerCon
Learn More about Docker Enterprise Edition 


Source: https://blog.docker.com/feed/

Let’s Meet At OpenStack Summit In Boston!


 
The citizens of Cloud City are suffering — Mirantis is here to help!
 
We’re planning to have a super time at the summit, and we hope you can join us in the fight against vendor lock-in. Come to booth C1 to power up on the latest technology and our revolutionary Mirantis Cloud Platform.

If you’d like to talk with our team at the summit, simply contact us and we’ll schedule a meeting.

REQUEST A MEETING

 
Free Mirantis Training @ Summit
Take advantage of our special training offers to power up your skills while you’re at the Summit! Mirantis Training will be offering an Accelerated Bootcamp session before the big event. Our courses will be conveniently held within walking distance of the Hynes Convention Center.

Additionally, we’re offering a discounted Professional-level Certification exam and a free Kubernetes training, both held during the Summit.

 
Mirantis Presentations
Here’s where you can find us during the summit.
 
MONDAY MAY 8

Monday, 12:05pm-12:15pm
Level: Intermediate
Turbo Charged VNFs at 40 gbit/s. Approaches to deliver fast, low latency networking using OpenStack.
(Gregory Elkinbard, Mirantis; Nuage)

Monday, 3:40pm-4:20pm
Level: Intermediate
Project Update – Documentation
(Olga Gusarenko, Mirantis)

Monday, 4:40pm-5:20pm
Level: Intermediate
Cinder Stands Alone
(Ivan Kolodyazhny, Mirantis)

Monday, 5:30pm-6:10pm
Level: Intermediate
m1.Boaty.McBoatface: The joys of flavor planning by popular vote
(Craig Anderson, Mirantis)

 

TUESDAY MAY 9

Tuesday, 2:00pm-2:40pm
Level: Intermediate
Proactive support and Customer care
(Anton Tarasov, Mirantis)

Tuesday, 2:30pm-2:40pm
Level: Advanced
OpenStack, Kubernetes and SaltStack for complete deployment automation
(Aleš Komárek and Thomas Lichtenstein, Mirantis)

Tuesday, 2:50pm-3:30pm
Level: Intermediate
OpenStack Journey: from containers to functions
(Ihor Dvoretskyi, Mirantis; Iron.io, BlueBox)

Tuesday, 4:40pm-5:20pm
Level: Advanced
Point and Click ->CI/CD: Real world look at better OpenStack deployment, sustainability, upgrades!
(Bruce Mathews and Ryan Day, Mirantis; AT&T)

Tuesday, 5:05pm-5:45pm
Level: Intermediate
Workload Onboarding and Lifecycle Management with Heat
(Florin Stingaciu and Lance Haig, Mirantis)

 

WEDNESDAY MAY 10

Wednesday, 9:50am-10:30am
Level: Intermediate
Project Update &8211; Neutron
(Kevin Benton, Mirantis)

Wednesday, 11:00am-11:40am
Level: Intermediate
Project Update &8211; Nova
(Jay Pipes, Mirantis)

Wednesday, 1:50pm-2:30pm
Level: Intermediate
Kuryr-Kubernetes: The seamless path to adding Pods to your datacenter networking
(Ilya Chukhnakov, Mirantis)

Wednesday, 1:50pm-2:30pm
Level: Intermediate
OpenStack: pushing to 5000 nodes and beyond
(Dina Belova and Georgy Okrokvertskhov, Mirantis)

Wednesday, 4:30pm-5:10pm
Level: Intermediate
Project Update &8211; Rally
(Andrey Kurilin, Mirantis)

 

THURSDAY MAY 11

Thursday, 9:50am-10:30am
Level: Intermediate
OSprofiler: evaluating OpenStack
(Dina Belova, Mirantis; VMware)

Thursday, 11:00am-11:40am
Level: Intermediate
Scheduler Wars: A New Hope
(Jay Pipes, Mirantis)

Thursday, 11:30am-11:40am
Level: Beginner
Saving one cloud at a time with tenant care
(Bryan Langston, Mirantis; Comcast)

Thursday, 3:10pm-3:50pm
Level: Advanced
Behind the Scenes with Placement and Resource Tracking in Nova
(Jay Pipes, Mirantis)

Thursday, 5:00pm-5:40pm
Level: Intermediate
Terraforming OpenStack Landscape
(Mykyta Gubenko, Mirantis)

 

Notable Presentations By The Community
 
TUESDAY MAY 9

Tuesday, 11:15am-11:55am
Level: Intermediate
AT&T Container Strategy and OpenStack’s role in it
(AT&T)

Tuesday, 11:45am-11:55am
Level: Intermediate
AT&T Cloud Evolution: Virtual to Container based (CI/CD)^2
(AT&T)

WEDNESDAY MAY 10

Wednesday, 1:50pm-2:30pm
Level: Intermediate
Event Correlation & Life Cycle Management – How will they coexist in the NFV world?
(Cox Communications)

Wednesday, 5:20pm-6:00pm
Level: Intermediate
Nova Scheduler: Optimizing, Configuring and Deploying NFV VNF’s on OpenStack
(Wind River)

THURSDAY MAY 11

Thursday, 9:00am-9:40am
Level: Intermediate
ChatOpsing Your Production Openstack Cloud
(Adobe)

Thursday, 11:00am-11:10am
Level: Intermediate
OpenDaylight Network Virtualization solution (NetVirt) with FD.io VPP data plane
(Ericsson)

Thursday, 1:30pm-2:10pm
Level: Beginner
Participating in translation makes you an internationalized OpenStacker & developer
(Deutsche Telekom AG)

Thursday, 5:00pm-5:40pm
Level: Beginner
Future of Cloud Networking and Policy Automation
(Cox Communications)

Source: Mirantis

Ten Ways a Cloud Management Platform Makes your Virtualization Life Easier

I spent the last decade working with virtualization platforms and the certifications and accreditations that go along with them. During this time, I thought I understood what it meant to run an efficient data center. After six months of working with Red Hat CloudForms, a Cloud Management Platform (CMP), I now wonder what I was thinking. I encountered every one of the problems below, and each is preventable with the right solution. Remember, we live in the 21st century; shouldn’t the software that we use act like it?

We filled up a data store and all of the machines on it stopped working. 
It does not matter if it is a development environment or the mission-critical database cluster: when storage fills up, everything stops! More often than not it is due to an excessive number of snapshots. The good news is that CloudForms can quickly be set up with a policy to recognize and prevent this from happening. For example, we can check storage utilization and take action if it is over 90% full, or better yet, when it is within two weeks of being full based on usage trends. That way, if manual action is required, there is enough forewarning to do so. Another good practice is to set up a policy that prevents more than a few snapshots per VM. We all love to take snapshots, but there is a real cost to them, and there is no need to let them get out of hand.
I just got thousands of emails telling me that my host is down. The only thing worse than no email alert is receiving thousands of them. In CloudForms it is easy not only to set up alerts, but also to define how often they should be acted upon. For example: check every hour, but notify only once per day.
Your virtual machines (VMs) cannot be migrated because the VM tools updater CD-ROM image was not un-mounted correctly. 
This is a serious issue for a number of reasons. First, it breaks Disaster Recovery (DR) operations and can cause virtual machines to be out of balance. It also disables the ability to put a node into maintenance mode, potentially causing additional outages and delays. Most solutions involve writing a shell script that runs as root and attempts to periodically unmount the virtual CD-ROM drives. These scripts usually work, but they are both scary from a security standpoint and indiscriminately dangerous; imagine physically ejecting the CD-ROM while the database administrator is in the middle of a database upgrade! With CloudForms we can set up a simple policy that unmounts drives once a day, but only after sanity-checking that it is the correct CD-ROM image and that the system is in a state where it can be safely unmounted.
I have to manually ensure that all of my systems pass an incredibly detailed and painful compliance check (STIGS, PCI, FIPS, etc.) by next week! 
I have lost weeks of my life to this, and if you have not had the pleasure, count yourself lucky. When the “friendly” auditors show up with a stack of three-ring binders and a mandate to check everything, you might as well clear your calendar for the next few weeks. In addition, since these checks are usually a requirement for continuing operations, expect many of these meetings to involve layers of upper management you did not know existed, and this is definitely not the best time to become acquainted. The good news is that CloudForms allows you to run automatic checks on VMs and hosts. If you are not already familiar with its OpenSCAP scanning capability, you owe yourself a look. Not only that, but if someone attempts to bring a VM online that is not compliant, CloudForms can shut it right back down. That is the type of peace of mind that allows for sleep-filled nights.
Someone logged into a production server as root using the virtual console and broke it. Now you have to physically hunt down and interrogate all the potential culprits – as well as fix the problem.
Before you pull out your foam bat and roam the halls to apply some “sense” to the person who did this, it is good to know exactly who it was and what they did. With CloudForms you can see a timeline for each machine and who logged into which console, as well as perform a drift analysis to see what changed. With this knowledge you can not only fix the problem, but also “educate” the responsible party.
The developers insist that all VMs must have 8 vCPUs and 64GB of RAM.
The best way to fight flagrant waste of resources is with data. CloudForms provides the concept of “right-sizing”: it watches VMs operate and determines the ideal resource allocation. With this information in hand, CloudForms can either automatically adjust the allocations or produce a report showing what the excess resources are costing.
Someone keeps creating 32-bit VMs with more than 4GB of RAM!
As we know, there is no good way for a 32-bit VM to use that much memory, so it is essentially just waste. A simple CloudForms policy that checks for “OS Type = 32-bit” and “RAM > 4GB” can produce a very interesting report (see the API sketch after this list). Or better yet, put a policy in place to automatically adjust the memory to 4GB and notify the system owner.
I have to buy hardware for next year, but my capacity-planning formula involves a spreadsheet and a dart board. 
Long-term planning in IT is hard, especially with dynamic workloads in a multi-cloud environment. Once CloudForms is running, it automatically collects performance data and executes trend-line analysis to assist with operational management – for example, “in 23 days you will be out of storage on your production SAN.” If that does not get the system administrator’s attention, nothing will. It can also perform simulations to see what your environment would look like if you added resources, so you can see your trend lines and capacity if you added another 100 VMs of a particular type and size.
For some reason two hosts were swapping VMs back and forth, and I only found out when people complained about performance. 
As an administrator, there is no worse way to find out that something is wrong than being told by a user. Large-scale issues such as this can be hard to see in the logs, since they consist of typical output. With CloudForms, a timeline overview of the entire environment highlights issues like this so the root cause can be tracked down.
I spend most of my day pushing buttons, spinning up VMs, manually grouping them into virtual folders and tracking them with spreadsheets. 
Before starting a new administrator role it is always good to ask for the “point of truth” system that keeps track of which systems are running, where they are, and who is responsible for them. More often than not the answer is, “A guy, who keeps track of the list, on his laptop.” This may be how it was always done, but now with tools such as CloudForms, you can automatically tag machines based on location, project, user, or any other combination of characteristics, and as a bonus provide usage and costing information back to the user. Gary could only dream of providing that much helpful information.
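
To make the 32-bit memory check from the list above concrete, here is a minimal sketch against the CloudForms (ManageIQ) REST API using Python. The host, credentials, and attribute names are illustrative assumptions; check them against your appliance’s API documentation before relying on this:

```python
import requests

# Placeholders: point these at your own CloudForms appliance.
API = "https://cloudforms.example.com/api"
AUTH = ("admin", "password")

# Request VM resources with the attributes we need (attribute names are illustrative).
resp = requests.get(
    API + "/vms",
    params={"expand": "resources", "attributes": "name,ram_size,hardware.bitness"},
    auth=AUTH,
    verify=False,  # lab sketch only; verify TLS properly in production
)
resp.raise_for_status()

# Flag 32-bit VMs that have been given more than 4GB of RAM.
for vm in resp.json().get("resources", []):
    ram_mb = vm.get("ram_size") or 0
    bitness = (vm.get("hardware") or {}).get("bitness")
    if bitness == 32 and ram_mb > 4096:
        print("%s: 32-bit VM with %d MB RAM - candidate for adjustment" % (vm["name"], ram_mb))
```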

Conclusion
There is never enough time in the day, and the pace of new technologies is accelerating. The only way to keep up is to automate processes. The tools that got you where you are today are not necessarily the same ones that will get you through the next generation of technologies. It will be critical to have tools that work across multiple infrastructure components and provide the visibility and automation required. This is why you need a cloud management platform and where the real power of CloudForms comes into play.
Source: CloudForms

We installed an OpenStack cluster with close to 1000 nodes on Kubernetes. Here’s what we found out.

Late last year, we ran a number of tests that looked at deploying close to 1000 OpenStack nodes on a pre-installed Kubernetes cluster as a way of finding out what problems you might run into, and fixing them, if at all possible. In all we found several, and though we were generally able to fix them, we thought it would still be good to go over the types of things you need to look for.
Overall we deployed an OpenStack cluster that contained more than 900 nodes using Fuel-CCP on a Kubernetes cluster that had been deployed using Kargo. The Kargo tool is part of the Kubernetes Incubator project and uses the Large Kubernetes Cluster reference architecture as a baseline.
As we worked, we documented the issues we found and contributed fixes to both the deployment tool and the reference design document where appropriate. Here’s what we found.
The setup
We started with just over 175 bare metal machines, allocating 3 of them for Kubernetes control plane services (API servers, etcd, the Kubernetes scheduler, etc.); each of the others hosted 5 virtual machines, and every VM was used as a Kubernetes minion node.
Each bare metal node had the following specifications:

HP ProLiant DL380 Gen9
CPU – 2x Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
RAM – 264G
Storage – 3.0T on RAID on HP Smart Array P840 Controller, HDD – 12 x HP EH0600JDYTL
Network – 2x Intel Corporation Ethernet 10G 2P X710

The running OpenStack cluster (as far as Kubernetes is concerned) consists of:

OpenStack control plane services running on close to 150 pods over 6 nodes
Close to 4500 pods spread across all of the remaining nodes, at 5 pods per minion node

One major Prometheus problem
During the experiments, we used the Prometheus monitoring tool to verify resource consumption and the load put on the core system, Kubernetes, and OpenStack services. One note of caution when using Prometheus: deleting old data from Prometheus storage will indeed improve the Prometheus API speed – but it will also delete any previous cluster information, making it unavailable for post-run investigation. So make sure to document any observed issue and its debugging thoroughly!
Thankfully, we had in fact done that documentation, but one thing we’ve decided to do going forward to prevent this problem is to configure Prometheus to back up data to one of the persistent time-series databases it supports, such as InfluxDB, Cassandra, or OpenTSDB. By default, Prometheus is optimized to be used as a real-time monitoring/alerting system, and there is an official recommendation from the Prometheus developer team to keep monitoring data retention at only about 15 days to keep the tool quick and responsive. By setting up the backup, we can store old data for an extended amount of time for post-processing needs.
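As a sketch of what that might look like with a recent Prometheus and InfluxDB (the endpoint and database name are placeholders; older Prometheus 1.x versions used command-line flags for this instead), a remote_write section in prometheus.yml ships samples to long-term storage while local retention stays short:

```yaml
# prometheus.yml excerpt (sketch; endpoint and database name are placeholders)
remote_write:
  - url: "http://influxdb.example.com:8086/api/v1/prom/write?db=prometheus"
```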
Problems we experienced in our testing
Huge load on kube-apiserver
Symptoms
Initially, we had a setup with all nodes (including the Kubernetes control plane nodes) running in a virtualized environment, but the load was such that the API servers couldn’t function at all, so they were moved to bare metal. Still, both API servers running in the Kubernetes cluster were utilizing up to 2000% of the available CPU (up to 45% of total node compute capacity), even after we migrated them to hardware nodes.
Root cause
All services that are not on the Kubernetes masters (kubelet and kube-proxy on all minions) access kube-apiserver via a local NGINX proxy. Most of those requests are watch requests that sit mostly idle after they are initiated (most of their timeouts are defined to be about 5-10 minutes). NGINX was configured to cut idle connections after 3 seconds, which caused all clients to reconnect and (even worse) restart aborted SSL sessions. On the server side, this makes kube-apiserver consume up to 2000% of the CPU resources, making other requests very slow.
Solution
Set the proxy_timeout parameter to 10 minutes in the nginx.conf configuration file, which should be more than long enough to prevent cutting SSL connections before the requests time out by themselves. After this fix was applied, one api-server consumed only 100% of CPU (about 2% of total node compute capacity), while the second one consumed about 200% (about 4% of total node compute capacity), with an average response time of 200-400 ms.
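For reference, a minimal sketch of the relevant nginx.conf stanza on a minion, assuming the TCP (stream) proxy layout Kargo uses; the upstream addresses are placeholders:

```nginx
# nginx.conf excerpt on a minion node (sketch; addresses are placeholders)
stream {
    upstream kube_apiserver {
        server 10.0.0.11:6443;
        server 10.0.0.12:6443;
    }
    server {
        listen 127.0.0.1:6443;
        proxy_pass kube_apiserver;
        # keep idle watch connections open instead of cutting them after seconds
        proxy_timeout 10m;
    }
}
```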
Upstream issue status: fixed
Make the Kargo deployment tool set proxy_timeout to 10 minutes: issue fixed with a pull request by the Fuel CCP team.
KubeDNS cannot handle large cluster load with default settings
Symptoms
When deploying an OpenStack cluster at this scale, kubedns becomes unresponsive because of the huge load. This ends up with a slew of errors appearing in the logs of the dnsmasq container in the kubedns pod:
Maximum number of concurrent DNS queries reached.
Also, dnsmasq containers sometimes get restarted after hitting their memory limit.
Root cause
First of all, kubedns seems to fail often in this architecture, even without load. During the experiment we observed continuous kubedns container restarts even on an empty (but large enough) Kubernetes cluster. Restarts are caused by the liveness check failing, although nothing notable is observed in any logs.
Second, dnsmasq should have taken the load off kubedns, but it needs some tuning to behave as expected (or, frankly, at all) under large loads.
Solution
Fixing this problem requires several levels of steps:

Set higher limits for dnsmasq containers: they take on most of the load.
Add more replicas to the kubedns replication controller (we decided to stop at 6 replicas, as that solved the observed issue – for bigger clusters it might be necessary to increase this number even more).
Increase the number of parallel connections dnsmasq should handle (we used --dns-forward-max=1000, which is the recommended setting in the dnsmasq manual; see the sketch after this list).
Increase the cache size in dnsmasq: it has a hard limit of 10,000 cache entries, which seems to be a reasonable amount.
Fix kubedns to handle this behaviour properly.
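
As an illustration of the first four items, the dnsmasq container in the kubedns manifest can be tuned along these lines (the image tag and resource limits here are illustrative, not the exact values merged into Kargo):

```yaml
# kubedns manifest excerpt (sketch; values are illustrative)
- name: dnsmasq
  image: gcr.io/google_containers/kube-dnsmasq-amd64:1.4
  args:
    - --cache-size=10000      # dnsmasq's hard upper limit on cache entries
    - --dns-forward-max=1000  # allow more concurrent forwarded queries
  resources:
    limits:
      cpu: 200m
      memory: 512Mi
```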

Upstream issue status: partially fixed
Items 1 and 2 are fixed by making them configurable in Kargo by the Kubernetes team: issue, pull request.
For the others, work has not yet started.
Kubernetes scheduler needs to be deployed on a separate node
Symptoms
During the huge OpenStack cluster deployment against Kubernetes, the scheduler, controller-manager, and kube-apiserver start fighting for CPU cycles, as all of them are under a large load. The scheduler is the most resource-hungry, so we need a way to deploy it separately.
Solution
We moved the Kubernetes scheduler to a separate node manually; all other schedulers were manually killed to prevent them from moving to other nodes.
Upstream issue status: reported
Issue in Kargo.
Kubernetes scheduler is ineffective with pod antiaffinity
Symptoms
It takes a significant amount of time for the scheduler to process pods with pod antiaffinity rules specified on them. It spends about 2-3 seconds on each pod, which makes the time needed to deploy an OpenStack cluster of 900 nodes unexpectedly long (about 3 hours for scheduling alone). OpenStack deployment requires the use of antiaffinity rules to prevent several OpenStack compute nodes from being launched on a single Kubernetes minion node.
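For readers unfamiliar with the feature, the kind of rule described here looks roughly like this in a Kubernetes 1.6-style pod spec (label values are illustrative; in the 1.5 era the same intent was expressed via an alpha annotation):

```yaml
# pod spec excerpt (sketch; label values are illustrative)
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: nova-compute
        # never place two nova-compute pods on the same minion
        topologyKey: kubernetes.io/hostname
```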
Root cause
According to profiling results, most of the time is spent creating new Selectors to match existing pods against, which triggers the validation step. Basically, we had O(N^2) unnecessary validation steps (where N = the number of pods), even with just 5 deployment entities scheduled across most of the nodes.
Solution
In this case, we needed a specific optimization that speeds up scheduling to about 300 ms per pod. It’s still slow in terms of common sense (about 30 minutes spent just on pod scheduling for a 900-node OpenStack cluster), but it is at least close to reasonable. This solution lowers the number of very expensive operations to O(N), which is better, but still depends on the number of pods instead of deployments, so there is room for future improvement.
Upstream issue status: fixed
The optimization was merged into master (pull request) and backported to the 1.5 branch, and is part of the 1.5.2 release (pull request).
kube-apiserver has low default rate limit
Symptoms
Different services start receiving “429 Rate Limit Exceeded” HTTP errors, even though the kube-apiserver instances can take more load. This problem was discovered through a scheduler bug (see below).
Solution
Raise the rate limit for the kube-apiserver process via the --max-requests-inflight option. It defaults to 400; in our case it became workable at 2000. This number should be configurable in the Kargo deployment tool, as bigger deployments might require an even bigger increase.
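Concretely (the surrounding flags and unit file vary by deployment, so treat this as a sketch), the option is passed on the kube-apiserver command line:

```
# kube-apiserver invocation excerpt (sketch)
kube-apiserver --max-requests-inflight=2000 [other flags...]
```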
Upstream issue status: reported
Issue in Kargo.
Kubernetes scheduler can schedule incorrectly
Symptoms
When creating a huge number of pods (~4500 in our case) and faced with HTTP 429 errors from kube-apiserver (see above), the scheduler can schedule several pods of the same deployment on one node, in violation of the pod antiaffinity rules on them.
Root cause
See pull request below.
Upstream issue status: pull request
Fix from Mirantis team: pull request (merged, part of Kubernetes 1.6 release).
Docker sometimes becomes unresponsive
Symptoms
The Docker process sometimes hangs on several nodes, which results in timeouts in the kubelet logs. When this happens, pods cannot be spawned or terminated successfully on the affected minion node. Although many similar issues have been fixed in Docker since 1.11, we are still observing these symptoms.
Workaround
The Docker daemon logs do not contain any notable information, so we had to restart the docker service on the affected node. (During the experiments we used Docker 1.12.3, but we have observed similar symptoms in 1.13 release candidates as well.)
OpenStack services don’t handle PXC pseudo-deadlocks
Symptoms
When run in parallel, create operations on lots of resources were failing with a DBError saying that Percona XtraDB Cluster identified a deadlock and the transaction should be restarted.
Root cause
oslo.db is responsible for wrapping errors received from the DB into proper classes so that services can restart transactions if such errors occur, but it didn’t expect the error in the format that Percona sends. After we fixed this, however, we still experienced similar errors, because not all transactions that could be restarted were properly decorated in the Nova code.
Upstream issue status: fixed
The bug has been fixed by Roman Podolyaka’s CR and backported to Newton. It fixes Percona deadlock error detection, but there’s at least one place in Nova that still needs to be fixed.
Live migration failed with live_migration_uri configuration
Symptoms
With the live_migration_uri configuration, live migration fails because one compute host can’t connect to libvirt on another host.
Root cause
We can’t specify which IP address to use in the live_migration_uri template, so it was trying to use the address from the first interface, which happened to be in the PXE network, while libvirt listens on the private network. We couldn’t use live_migration_inbound_addr, which would solve this problem, because of a problem in upstream Nova.
Upstream issue status: fixed
A bug in Nova has been fixed and backported to Newton. We switched to using live_migration_inbound_addr after that.
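For illustration, the option lives in the libvirt section of nova.conf on each compute node; a minimal sketch, where the address is a placeholder for the host’s private-network IP:

```ini
# nova.conf excerpt (sketch; address is a placeholder)
[libvirt]
# carry live-migration traffic over the private network rather than the PXE one
live_migration_inbound_addr = 10.11.0.5
```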
Source: Mirantis

Docker Gives Back at DockerCon

Docker is actively working to improve opportunities for women and underrepresented minorities throughout the global ecosystem and to promote diversity and inclusion in the larger tech community.
For instance, at DockerCon 2016, attendees contributed to a scholarship program through the Bump Up Challenge unlocking funds towards full-tuition scholarships for three applicants to attend Hack Reactor. We selected two recipients in 2016 and are excited to announce our third recipient, Tabitha Hsia, who is already in her first week of the program.
In her own words:

“My name is Tabitha Hsia. I grew up in the East Bay. I come from an art-focused family with my sister being a professional cellist, my mother being a professional pianist, and my great grandfather being a famous Taiwanese painter. I chose Hack Reactor because of their impressive student outcomes and their weekly schedule. Already in my first week, I have learned a ton of information from lectures and their wealth of resources. I have enjoyed pair programming the most so far. While the lectures expose me to new topics, applying the topics to actual problems has deepened my understanding the most. After graduation, my long-term goal is to become a virtual reality developer. Seeing the integration of the solutions and tools into society excites me.”

DockerCon Gives Back  
Following the success of previous DockerCon initiatives promoting diversity in the tech industry, we’re proud to continue our efforts at the upcoming DockerCon 2017 in Austin.
With this year’s program, called DockerCon Gives Back, we’re recognizing four organizations that are doing outstanding work locally in Austin and globally. Attendees at the show will have the chance to connect with and support these great organizations by dropping their token in an organization’s box – each token represents a dollar that Docker will donate at the end of the conference.

            

Meet the DockerCon 2017 Diversity Scholarship winners
The DockerCon team is excited to announce the recipients of this year’s DockerCon Diversity Scholarship Program! The DockerCon Diversity Scholarship aims to provide support and guidance to members of the Docker Community who are traditionally underrepresented in tech through mentorship and a scholarship to attend DockerCon. Meet the recipients of this year’s scholarship here.


Source: https://blog.docker.com/feed/

Red Hat Summit 2017 – Planning your OpenStack labs

This year in Boston, MA, you can attend Red Hat Summit 2017, the event where you can get your updates on open source technologies and meet with all the experts you follow throughout the year.
It’s taking place from May 2-4 and is full of interesting sessions, keynotes, and labs.
This year I was part of the process of selecting the labs you are going to experience at Red Hat Summit, and I wanted to share some of them here to help you plan your OpenStack labs experience. These labs let you spend time with the experts who will teach you, hands-on, how to get the most out of your Red Hat OpenStack product.
Each lab is a 2-hour session, so planning is essential to getting the most out of your days at Red Hat Summit.
As you might be struggling to find and plan your sessions together with some lab time, here is an overview of the labs; check the session catalog for exact rooms and times. Each entry includes the lab number, title, abstract, and instructors, and is linked to the session catalog entry:

L103175 – Deploy Ceph Rados Gateway as a replacement for OpenStack Swift
Come learn about these new features in Red Hat OpenStack Platform 10: there is now full support for Ceph Rados Gateway, and “composable roles” let administrators deploy services in a much more flexible way. Ceph capabilities are no longer limited to block storage only. With a REST object API, you are now able to store and consume your data through a RESTful interface, just like Amazon S3 and OpenStack Swift. Ceph Rados Gateway has 99.9% API compliance with Amazon S3, and it can communicate with the Swift API. In this lab, you’ll tackle the REST object API use case, and to get the most out of your Ceph cluster, you’ll learn how to use Red Hat OpenStack Platform director to deploy Red Hat OpenStack Platform with dedicated Rados Gateway nodes.
Instructors: Sebastien Han, Gregory Charot, Cyril Lopez
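To give a flavor of what that S3 compatibility means in practice, here is a minimal sketch using the boto3 library against a Rados Gateway endpoint; the endpoint URL and credentials are placeholders you would obtain from your own cluster:

```python
import boto3

# Placeholders: point these at your own Rados Gateway endpoint and keys.
s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Standard S3 calls work against Rados Gateway thanks to its S3 API compliance.
s3.create_bucket(Bucket="demo-bucket")
s3.put_object(Bucket="demo-bucket", Key="hello.txt", Body=b"stored via RGW")
print(s3.list_objects_v2(Bucket="demo-bucket")["KeyCount"])
```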
 
L104387 – Hands on for the first time with Red Hat OpenStack Platform
In this lab, an instructor will lead you in configuring and running core OpenStack services in a Red Hat OpenStack Platform environment. We’ll also cover authentication, compute, networking, and storage. If you’re new to Red Hat OpenStack Platform, this session is for you.
Instructors: Rhys Oxenham, Jacob Liberman, Guil Barros
 
L102852 – Hands on with Red Hat OpenStack Platform director
Red Hat OpenStack Platform director is a tool set for installing and managing Infrastructure-as-a-Service (IaaS) clouds. In this two-hour instructor-led lab, you will deploy and configure a Red Hat OpenStack Platform cloud using OpenStack Platform director. This will be a self-paced, hands-on lab, and it’ll include both the command line and graphical user interfaces. You’ll also learn, in an interactive session, about the architecture and approach of Red Hat OpenStack Platform director.
Instructors: Rhys Oxenham, Jacob Liberman
 
L104665 – The Ceph power show—hands on with Ceph
Join our Ceph architects and experts for this guided, hands-on lab with Red Hat Ceph Storage. You’ll get an expert introduction to Ceph concepts and features, followed by a series of live interactive modules to gain some experience. This lab is perfect for users of all skill levels, from beginners to experienced users who want to explore advanced features of OpenStack storage. You’ll get some credits to the Red Hat Ceph Storage Test Drive portal that can be used later to learn and evaluate Red Hat Ceph Storage and Red Hat Gluster Storage. You’ll leave this session with a better understanding of Ceph architecture and concepts, experience on Red Hat Ceph Storage, and the confidence to install, set up, and provision Ceph in your own environment.
Instructors: Karan Singh, Kyle Bader, Daniel Messer
As you can see, there is plenty of OpenStack in these hands-on labs to get you through the week and hope to welcome you to one or more of the labs!
Source: RedHat Stack

Intelligent NFV performance with OpenContrail

The private cloud market has changed in the past year, and our customers are no longer interested in just getting an amazing tool for installing OpenStack; instead, they are looking more at use cases. Because we see a lot of interest in NFV cloud use cases, Mirantis includes OpenContrail as the default SDN for its new Mirantis Cloud Platform. In fact, NFV has become a mantra for most service providers, and because Mirantis is a key player in this market, we do a lot of testing and performance validation.
The most common value for performance comparison between solutions is bandwidth, which shows how much capacity a network connection has for supporting data transfer, as measured in bits per second. In this domain, the OpenContrail vRouter can reach near line speed (about 90%, in fact). However, performance also depends on other factors, such as latency and packets per second (pps), which are as important as bandwidth. The packets-per-second rate is a key factor for VNF instances (firewalls, routers, etc.) running on top of NFV clouds. In this article, we’ll compare the PPS rate for different OpenContrail setups so you can decide what will work best for your specific use case.
The simplest way to test the PPS rate is to run a VM-to-VM test. We will provide a short overview of OpenContrail low-level techniques for NFV infrastructure and perform a comparative analysis of different approaches using simple PPS benchmarking. To make testing fair, we will use only a 10GbE physical interface and will limit resource consumption for data plane acceleration technologies, making the environment identical for all approaches.
OpenContrail vRouter modes
For different use cases, Mirantis supports several ways of running the OpenContrail vRouter as part of Mirantis Cloud Platform 1.0 (MCP). Let’s look at each of them before we go ahead and take measurements.
Kernel vRouter
OpenContrail has a module called vRouter that performs data forwarding in the kernel. The vRouter module is an alternative to Linux bridge or Open vSwitch (OVS) in the kernel, and one of its functionalities is encapsulating packets sent to the overlay network and decapsulating packets received from the overlay network. A simplified schematic of VM to VM connectivity for 2 compute nodes can be found in Figure 1:

Figure 1: A simplified schematic of VM to VM connectivity for 2 compute nodes
The problem with a kernel module is that packets-per-second is limited by various factors, such as memory copies, the number of VM exits, and the overhead of processing interrupts. Therefore vRouter can be integrated with the Intel DPDK to optimize PPS performance.
DPDK vRouter
Intel DPDK is an open source set of libraries and drivers that perform fast packet processing by enabling drivers to obtain direct control of the NIC address space and map packets directly into an application. The polling model of NIC drivers helps avoid the overhead of interrupts from the NIC. To integrate with DPDK, the vRouter can now run in a user process instead of as a kernel module. This process links with the DPDK libraries and communicates with the vRouter host agent, which runs as a separate process. A simplified overview of vRouter-DPDK based nodes is shown in Figure 2:

Figure 2: The schematic for a simplified overview of vRouter-DPDK based nodes
vRouter-DPDK uses user-space packet processing and CPU affinity to dedicate poll mode drivers to particular CPUs. This approach enables packets to be processed in user space during their complete lifetime – from physical NIC to vhost-user port.
Netronome Agilio Solution
Software and hardware components distributed by Netronome provide an OpenContrail-based platform for high-speed packet processing. It’s a scalable, easy-to-operate solution that includes all server-side networking features, such as overlay networking based on MPLS over UDP/GRE and VXLAN. The Agilio SmartNIC solution supports DPDK, SR-IOV, and Express Virtio (XVIO) for data plane acceleration while running the OpenContrail control plane. Wide integration with OpenStack enables you to run VMs with Virtio devices or SR-IOV passthrough vNICs, as in Figure 3:

Figure 3:  OpenContrail network schematic based on Netronome Agilio SmartNICs and software
A key feature of the Netronome Agilio solution is deep integration with OpenContrail and offloading of lookups and actions for vRouter tables.
Compute nodes based on Agilio SmartNICs and software can work in an OpenStack cluster based on OpenContrail without changes to orchestration. That means it’s scale-independent and can be plugged into existing OpenContrail environments with zero downtime.
Mirantis Cloud Platform can be used as an easy and fast delivery tool to set up Netronome Agilio-based compute nodes and provide orchestration and analysis of the cluster environment. Using Agilio and MCP, it is easy to set up a high-performance cluster with a ready-to-use NFV infrastructure.
Testing scenario
To make the test fair and clear, we will use an OpenStack cluster with two compute nodes. Each node will have a 10GbE NIC for the tenant network.
As we mentioned before, the simplest way to test the PPS rate is to run a VM-to-VM test. Each VM will have 2 Virtio interfaces to receive and transmit packets, 4 vCPU cores, and 4096 MB of RAM, and will run Pktgen-DPDK inside to generate and receive a high rate of traffic. For each VM, a single Virtio interface will be used for generation and the other interface for receiving incoming traffic from the other VM.
To make an analytic comparison of all technologies, we will not use more than 2 cores for the data plane acceleration engines. The RX PPS rate summed across all VMs will be considered the result of the VM-to-VM test. (A sketch of a Pktgen-DPDK session appears below.)
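For readers who have not used it, a Pktgen-DPDK session inside one of the VMs looks roughly like the following; the core list, core-to-port mapping, and rates are all illustrative and depend on the build and VM layout:

```
# Illustrative Pktgen-DPDK invocation (all values are placeholders)
pktgen -l 0-3 -n 2 -- -P -m "[1:2].0"

# ...then at the Pktgen prompt, generate 64-byte UDP packets at full rate:
Pktgen:/> set 0 size 64
Pktgen:/> set 0 rate 100
Pktgen:/> start 0
```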
First of all, we tried to measure kernel vRouter VM-to-VM performance. The nodes were connected with Intel 82599 NICs. The following results were achieved for a UDP traffic performance test:
As you can see, the kernel vRouter is not suitable for providing a high packets-per-second rate, mostly because the interrupt-based model can’t handle a high rate of packets per second. With 64-byte packets, we achieved only 3% of line rate.
For the DPDK-based vRouter, we achieved the following results:

Based on these results, the DPDK based solution is better at handling high-rated traffic based on small UDP packets.
Lastly, we tested the Netronome Agilio SmartNIC-based compute nodes:

With only 2 forwarder cores, we were able to achieve line-rate speed on Netronome Agilio CX 10GbE SmartNICs at all packet sizes.
You can also see a demonstration of the Netronome Agilio Solution here.
Since we achieved line-rate speed on the 10GbE interface using Netronome Agilio SmartNICs, we wanted to find the maximum possible PPS rate based on 2 CPUs. To determine the maximum performance for this deployment, we upgraded the existing nodes with Netronome Agilio CX 40GbE SmartNICs and repeated the maximum PPS scenario one more time. We used a direct wire connection between the 40GbE ports and set up 64-byte UDP traffic. Even with hard resource limitations, we achieved:

Netronome Agilio CX 40GbE SmartNIC: 19.9 Mpps at 64-byte packet size

What we learned
Taking all of the results together, we can see a pattern:

Based on 64-byte UDP traffic, we can also see where each solution stands relative to the 10GbE line rate:

Netronome Agilio: 14.9 Mpps (100% of line rate)
vRouter DPDK: 4.0 Mpps (26% of line rate)
Kernel vRouter: 0.56 Mpps (3% of line rate)

OpenContrail remains the best production-ready SDN solution for OpenStack clusters, but to provide NFV-related infrastructure, OpenContrail can be used in different ways:

The kernel vRouter, based on interrupt-model packet processing, works, but does not satisfy the high PPS rate requirement.
The DPDK-based vRouter significantly improves the PPS rate, but due to high resource consumption, and because of the limitations we defined, it can’t achieve the required performance. We can also assume that using a more modern DPDK library would improve performance and optimize resource consumption.
The Netronome Agilio SmartNIC solution significantly improves OpenContrail SDN performance, focusing on saving host resources and providing a stable high-performance infrastructure.

With Mirantis Cloud Platform tooling, it is possible to provision, orchestrate, and destroy high-performance clusters with various networking features, making networking intelligent and agile.
Source: Mirantis

User Group Newsletter March 2017

User Group Newsletter March 2017
 
BOSTON SUMMIT UPDATE
Exciting news! The schedule for the Boston Summit in May has been released. You can check out all the details on the Summit schedule page.
Travelling to the Summit and need a visa? Follow the steps in this handy guide.
If you haven’t registered, there is still time! Secure your spot today! 
 
HAVE YOUR SAY IN THE SUPERUSER AWARDS!

The OpenStack Summit kicks off in less than six weeks and seven deserving organizations have been nominated to be recognized during the opening keynotes. For this cycle, the community (that means you!) will review the candidates before the Superuser editorial advisors select the finalists and ultimate winner. See the full list of candidates and have your say here. 
 
COMMUNITY LEADERSHIP CHARTS COURSE FOR OPENSTACK
About 40 people from the OpenStack Technical Committee, User Committee, Board of Directors and Foundation Staff convened in Boston to talk about the future of OpenStack. They discussed the challenges we face as a community, but also why our mission to deliver open infrastructure is more important than ever. Read the comprehensive meeting report here.
 
NEW PROJECT MASCOTS
Fantastic new project mascots were released just before the Project Teams Gathering. Read the story behind your favourite OpenStack project mascot via this Superuser post.
 
WELCOME TO OUR NEW USER GROUPS
We have some new user groups which have joined the OpenStack community.
Spain- Canary Islands
Mexico City – Mexico
We wish them all the best with their OpenStack journey and can’t wait to see what they will achieve! Looking for your local group? Are you thinking of starting a user group? Head to the groups portal for more information.
 
LOOK OUT FOR YOUR FELLOW STACKERS AT COMMUNITY EVENTS
OpenStack is participating in a series of upcoming Community events this April.
April 3: Open Networking Summit Santa Clara, CA

OpenStack is sponsoring the Monday evening Open Source Community Reception at Levi’s Stadium
Ildiko Vancsa will be speaking in two sessions:
Monday, 9:00-10:30am, on “The Interoperability Challenge in Telecom and NFV Environments”, with EANTC Director Carsten Rossenhovel and Chris Price, room 207
Thursday, 1:40-3:30pm, OpenStack Mini-Summit, topic “OpenStack: Networking Roadmap, Collaboration and Contribution” with Armando Migliaccio and Paul Carver from AT&T; Grand Ballroom A&B

 
April 17-19: DockerCon, Austin, TX

OpenStack will have a booth

 
April 19-20: Global Open Source Summit, Beijing, China

Mike Perez will be delivering an OpenStack keynote

 
OPENSTACK DAYS: DATES FOR YOUR CALENDAR
We have lots of OpenStack Days coming up:
Upcoming OpenStack Days
June 1: Australia
June 5: Israel
June 7: Budapest
June 26: Germany Enterprise (DOST)
Read further information about OpenStack Days on this website. You’ll find a FAQ, highlights from previous events, and an extensive toolkit for hosting an OpenStack Day in your region.
 
CONTRIBUTING TO UG NEWSLETTER
If you’d like to contribute a news item for the next edition, please submit it to this etherpad.
Items submitted may be edited down for length, style and suitability.
This newsletter is published on a monthly basis.
 
 
 
Source: openstack.org

Webinar recap: Docker 101 for federal government

Docker is driving a movement for IT teams across all industries to modernize their applications with container technology. Government agencies, like private sector companies, face pressure to accelerate software development while reducing overall IT costs and adopting new technologies and practices like cloud, DevOps, and more.
This webinar, titled “Docker 101 for the Federal Government,” features Andrew Weiss, Docker Federal Sales Engineer, and breaks down the core concepts of Docker and how they apply to government IT environments and unique regulatory compliance requirements. The presentation highlights how Docker Enterprise Edition can help agencies build a secure, cloud-first government.

Watch the on-demand webinar to learn how Docker is transforming the way government agencies deliver secure, reliable, and scalable services to organizations and citizens.

Here are the questions from the live session:
Q: Is Docker Datacenter available both hosted and as a cloud offering?
A: Docker Datacenter is now part of Docker Enterprise Edition (EE) – providing integrated container management and security from development to production. Docker EE provides a unified software supply chain for all apps – commercial off-the-shelf, homegrown monoliths, and modern microservices written for Windows or Linux environments on any server, VM, or cloud. Docker EE can be deployed on-premises (bare metal or VMs) or on any cloud provider.
Q: Can you install regular Windows Server apps into Docker containers in Windows 2016?
A: Yes. Docker running containers on Windows is the result of a two-year collaboration between Docker and Microsoft that involved the Windows kernel growing containerization primitives, Docker and Microsoft collaborating on porting the Docker Engine and CLI to Windows to take advantage of those new primitives, and Docker adding multi-arch image support to Docker Hub.
Q: From an implementation perspective, do you recommend one container per virtual machine or multiple containers?
A: We see a mix. Depending on the use case, you will get a range in density of containers per virtual or bare-metal machine. In some science and research communities, we have seen a 1:1 container-to-machine use case where developers are looking purely for portability of their existing workloads. However, containers are typically ephemeral, running on average for a few minutes, so that number is always changing depending on how each service is scaled out or back.
Q: How do you phrase the argument that a Linux kernel is the same everywhere?
A: The kernel is the one piece of the whole that is actually called “Linux”. It is the core of the system and manages the CPU, memory, and peripheral devices, and it is the “lowest” level of the OS. Because containers share the host’s kernel, and that kernel interface is consistent across distributions, a containerized app sees the same kernel everywhere.
Q: Is the AWS Quick Start of Docker EE available for Gov Cloud?
A: Docker Enterprise Edition (EE) Basic, Standard, and Advanced are all available in the AWS Marketplace for easy deployment of a highly available Docker EE environment in about 20 minutes. Built in accordance with best practices from AWS and Docker, these templates include the latest Docker software in a variety of regions and are directly integrated with AWS services.
Q: Will license pricing remain the same from DDC to Docker EE?
A: Docker Datacenter (DDC) is now part of Docker Enterprise Edition (EE) Standard tier. The subscription price has not changed. Customers who have previously purchased DDC are entitled to the latest version of Docker EE Standard. For more information, visit www.docker.com/pricing.
Continue your Docker journey with these helpful links:

Register for the next Federal Webinar on April 4th
Try Docker Enterprise Edition for free
Learn more about Docker in Government
Save your seat for the Docker Federal Summit on May 2nd


Source: https://blog.docker.com/feed/