OpenStack

at Canonical

Brad Marshall
brad.marshall@canonical.com

Who am I?

  • Unix sysadmin with 18 years' experience
  • Currently sysadmin at Canonical
  • Worked in environments from small software startups to universities
  • Volunteered for Debian, SAGE-AU, linux.conf.au, HUMBUG, CQLUG
  • Presentations/articles at SAGE-AU, AUUG, CQLUG, HUMBUG, Linux.com

Where Is It Used

  • Canonistack - dev / testing stack
  • Prodstack - production stack
  • Stagingstack - staging environments on prodstack
  • Prodstack 4.0 - Next version of production stack
  • Scalingstack - WIP stack for scaling type environments (builders etc)
  • Openstack Integration Lab - testing integration of hardware and software

What's Canonistack?

  • Openstack installation anyone in the company can use
  • 2 regions, running the previous two releases
  • Right now we have grizzly and havana
  • Allows oversubscribing to get the most usage
  • Allows devs to test across multiple releases and prototype things
  • Downtime can happen, but reliability is improving over time
  • Want developers to prefer this over other environments, such as AWS
  • Good learning environment

Canonistack Sizing

  • 10 Compute nodes across 2 regions each with:
    • 96G ram
    • 24 CPUs
    • Between 500G and 2TB storage
  • 3 Swift nodes each with:
    • 6G ram
    • 8 CPUs
    • 3 x 128G storage

Prodstack

  • Production deployment of Openstack
  • Stagingstack is the staging version of prodstack
  • Currently running internally developed services
  • Running 12.04 with Folsom currently
  • Using Ubuntu Cloud Archive which backports versions of Openstack to LTS

What runs in Prodstack

  • Certification website
  • Product search - Amazon search on desktop
  • Video search
  • Music streaming
  • Parts of ubuntu.com website
  • Juju charms website
  • Juju GUI website
  • Summit website
  • Cassandra backends for errors.ubuntu.com
  • Various private git and gerrit services
  • Ubuntu Developer API docs
  • Many others

Prodstack Sizing

  • 20 Compute nodes each with:
    • 96G memory
    • 1TB storage
    • 24 CPUs
  • 5 Swift storage nodes each with:
    • 16G memory
    • 3 x 1TB storage
    • 16 CPUs
  • 6 Ceph nodes each with:
    • 16G memory
    • 11 spindles with ~20TB of space
    • 16 CPUs
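
The per-node figures above can be rolled up into aggregate capacity. The following is a minimal sketch, using only the compute and Swift numbers from this slide; the variable and function names are illustrative, and the Swift total is raw disk before any replication overhead.

```shell
#!/bin/sh
# Per-node figures for Prodstack compute nodes, taken from the slide.
COMPUTE_NODES=20
RAM_PER_NODE_GB=96
CPUS_PER_NODE=24
DISK_PER_NODE_TB=1

# Per-node figures for Prodstack Swift storage nodes, taken from the slide.
SWIFT_NODES=5
SWIFT_DISKS_PER_NODE=3
SWIFT_DISK_TB=1

# Each helper (illustrative names) prints one aggregate figure.
total_ram_gb()  { echo $((COMPUTE_NODES * RAM_PER_NODE_GB)); }
total_cpus()    { echo $((COMPUTE_NODES * CPUS_PER_NODE)); }
total_disk_tb() { echo $((COMPUTE_NODES * DISK_PER_NODE_TB)); }
swift_raw_tb()  { echo $((SWIFT_NODES * SWIFT_DISKS_PER_NODE * SWIFT_DISK_TB)); }

echo "Compute RAM:    $(total_ram_gb) GB"
echo "Compute CPUs:   $(total_cpus)"
echo "Compute disk:   $(total_disk_tb) TB"
echo "Swift raw disk: $(swift_raw_tb) TB"
```

That works out to 1920 GB of memory and 480 CPUs across the compute nodes, before any oversubscription.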

Prodstack 4.0

  • Next version of Prodstack in progress
  • Will be Havana on Precise using Ubuntu Cloud Archive
  • HA on Neutron and firewalls to start
  • Once services are migrated off Prodstack, the hardware will be integrated

Prodstack 4.0 initial sizing

  • 7 x Compute nodes each with:
    • 96G ram
    • 24 CPUs
    • 1TB SAS storage
  • 3 x Swift nodes each with:
    • 16G ram
    • 16 CPUs
    • 11 x 3TB SATA storage
  • 3 x Ceph nodes each with:
    • 16G ram
    • 16 CPUs
    • 11 x 2TB SAS storage

Software Stack

  • Openstack
  • Ubuntu
  • MaaS
  • Juju
  • Landscape

How to deploy

  • MaaS and Juju to deploy Openstack to bare metal
  • Juju to deploy to Openstack
  • Landscape to manage servers

Service Orchestration vs Configuration Management

  • Configuration management is like a universal remote that lets you directly control the TV, DVD player, set-top box etc - eg Puppet, Chef
  • Service orchestration is like a remote where you say what you want to do, eg "Watch a DVD", and it handles the details for you - eg Juju, Heat

Juju

  • Service orchestration for the cloud
  • Can deploy to Openstack, MaaS, AWS, HP Cloud, Azure, Joyent (soon), local containers
  • Providers being added all the time
  • Charms are pre-setup, configurable services
  • Juju expresses relationships between charms
  • Bundle is a set of services expressed as charms
  • You can control where things are deployed via juju deploy --to <machine>, eg --to 0
  • Juju manual provisioning lets you deploy over ssh to hosts that are already installed - experimental at this stage

Juju example


$ juju bootstrap
$ juju deploy juju-gui --to 0
$ juju deploy -n 3 --config ceph.yaml ceph --constraints "mem=2G"
$ juju deploy -n 3 --config ceph.yaml ceph-osd --constraints "mem=2G"
$ juju add-relation ceph ceph-osd
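
Once the services above are running, scaling one of them out could be sketched with juju add-unit. This is a minimal sketch, not a tested deployment script: the run wrapper, the scale_out helper and the DRY_RUN variable are illustrative assumptions rather than part of Juju, and by default the script only prints the command it would run.

```shell
#!/bin/sh
# Default to dry-run so the sketch is safe on a machine without Juju.
DRY_RUN="${DRY_RUN:-1}"

# run: print a command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# scale_out SERVICE COUNT: add COUNT more units to a deployed service.
scale_out() {
    run juju add-unit -n "$2" "$1"
}

# Add two more ceph-osd units to the service deployed in the example.
scale_out ceph-osd 2
```

Running it with DRY_RUN=0 against a bootstrapped environment would execute the command for real; juju status can then be used to watch the new units come up.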

Juju GUI


$ juju bootstrap
$ juju deploy juju-gui

  • GUI that allows control over Juju
  • Can add new services, scale out, configure
  • Drag and drop services onto canvas, add relationships

MaaS - Metal As A Service

  • Controls deployment of physical hosts
  • Lets you treat bare metal the same as a cloud guest
  • Actively developed, designed for hyperscale
  • Juju provider can use MaaS natively
  • Has a RESTful API
  • Devs currently working on improving the user experience
  • Virtual MaaS allows simulation of MaaS using virtual machines
  • http://bazaar.launchpad.net/~virtual-maasers/charms/precise/virtual-maas/trunk/view/head:/README.md

Mojo - CI for Services

  • Continuous integration for Juju-based services
  • Uses Jenkins to deploy and test services
  • Used to ensure services are still deployable after changes
  • Uses the Nagios checks defined for services, as well as any other checks
  • Can test upgrades of charms or anything else you can script
  • Looking at using it for deployment as well
  • Still a work in progress, but already helping find issues

Workflow for apps

  • Devs work up a Juju charm and work out how to deploy it on Canonistack
  • Webops (the group who deal with devs) take the charm and work it into a usable deployment story on stagingstack, going back and forth with devs as required
  • Once everyone is happy, ops deploy the stack into prodstack and put it into production
  • Updates to apps are made by devs, then rolled into stagingstack and on to prodstack by ops
  • This allows good involvement between ops and devs, while still keeping separation of responsibilities

Problems

  • Moving to the cloud needs extra hardware up front to seed it - can't just magically replace everything overnight and pull the existing hardware in
  • Devs might think they'll be able to deploy anything out immediately - still need testing in production etc
  • Devs getting up to speed with how to write apps to use cloud
  • Ops have concerns around visibility of usage etc - tooling isn't quite there in some cases
  • Bandwidth limitations - looking at bonded nics for next version of prodstack

Problems cont

  • Picking the right hardware for cloud compute nodes - need more IOPS, since bringing up multiple guests with competing resource requirements can cause local disk IO storms
  • Making sure instances deployed in the cloud aren't running on the same physical hosts
  • Want HA + 1 (be able to remove a host to upgrade and still keep HA)
  • Need cross environment deployments in Juju
  • Colocating services in Juju is possible now - tricky though, since charms usually assume they're the only thing

Advantages

  • Vastly improved our density of services on servers, decreasing capex (buying fewer servers) and opex (powering fewer servers)
  • Dynamic scale up and down of services - ie, we can add more website units at release time
  • True reproducibility of deployments
  • Staging environment that closely matches prod without too much extra hardware required
  • Both devs and ops develop an awareness of each other's needs
  • IS is involved in the development side of things, and helps shape things to work in production
  • We are upstream for the OS - opportunity to get things in
  • Dogfooding our product and providing feedback to devs

Questions?

Contact me at brad.marshall@canonical.com