My Best of MMS 2012 Series: Private Cloud 2012 Lessons Learned from Our Early Adopters

An open discussion with people who have built private clouds at customer sites.

Elasticity

In real life things don’t shrink that often. :-) Free or really cheap chargeback rates are not doing anything to help.

My take on this is that you should look at elasticity as a flexibility feature, even if a cloud is not that elastic in both directions. You can shrink a cloud v1 to zero as you migrate the VMs to private cloud v2. Then dump the resources back into another pool, step by step or in one go. I’ll use whatever works to make my life easier.

Standardizing & Customer Centric Operations = Planning is Key!

These two can be hard to combine. It takes serious planning and as such an upfront investment.

Then you need to build it to optimize operations (cost & excellent service). This sounds nice but how good are we at this and what is the shelf life of a solution versus the investment?

There is a lot of preparation to do. There are a lot of things to consider: databases, storage, the network, security boundaries, disaster recovery planning. They advise not to do it cross domain. Hmmm … we need to address this. Seriously.

Testing => build decent scripts with variables & config files. This will help to deploy in test, acceptance & production without too many changes or too much work.
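
As a minimal sketch of that idea (in Python rather than whatever scripting language you actually deploy with, and with entirely hypothetical setting names such as vmm_server and vm_count), one script can read a per-environment config so the deployment steps stay identical across test, acceptance & production:

```python
import json

# Hypothetical per-environment settings; in practice these would live in
# separate config files (for example test.json, acceptance.json, production.json).
CONFIGS = {
    "test": {"vmm_server": "vmm-test.example.local", "vm_count": 2},
    "acceptance": {"vmm_server": "vmm-acc.example.local", "vm_count": 4},
    "production": {"vmm_server": "vmm-prod.example.local", "vm_count": 20},
}

REQUIRED_KEYS = {"vmm_server", "vm_count"}


def load_config(environment):
    """Return the settings for one environment, failing early on gaps."""
    if environment not in CONFIGS:
        raise ValueError(f"unknown environment: {environment}")
    config = CONFIGS[environment]
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"{environment} config is missing: {sorted(missing)}")
    return config


def build_deployment_plan(environment):
    """The same steps for every environment; only the variables differ."""
    config = load_config(environment)
    return [
        f"connect to {config['vmm_server']}",
        f"deploy {config['vm_count']} virtual machines",
        "run smoke tests",
    ]


if __name__ == "__main__":
    for env in ("test", "acceptance", "production"):
        print(env, json.dumps(build_deployment_plan(env)))
```

The logic never mentions an environment by name, so promoting a deployment from test to production means changing a config entry, not the script.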

Make sure you define all service accounts, groups and permissions you need.

It’s all about planning, and what we’re being told are best practices that already exist, private cloud or not.

Self Service

  • Service Catalog is a prerequisite.
  • Self Service is key to the private cloud.
  • If people can they will do things differently. You’ll have to learn to deal with this.
  • Billing for services should be clear. Not too much detail. VMs & Storage are two good ones. Keep it simple and don’t go into memory & vCPU. Just set boundaries.
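
To illustrate how simple such a bill can stay, here is a small sketch in Python. The rates and limits are made-up numbers, not a recommendation. The point is that only the number of VMs and the storage are charged, while memory & vCPU are just boundaries:

```python
from dataclasses import dataclass

# Hypothetical monthly rates; pick your own numbers.
RATE_PER_VM = 50.0          # per VM per month
RATE_PER_GB_STORAGE = 0.10  # per GB per month

# Boundaries instead of line items: memory & vCPU are capped, not billed.
MAX_VCPU_PER_VM = 4
MAX_MEMORY_GB_PER_VM = 16


@dataclass
class VirtualMachine:
    name: str
    vcpus: int
    memory_gb: int
    storage_gb: int


def within_boundaries(vm):
    """True when the VM stays inside the agreed vCPU and memory limits."""
    return vm.vcpus <= MAX_VCPU_PER_VM and vm.memory_gb <= MAX_MEMORY_GB_PER_VM


def monthly_bill(vms):
    """Charge only for the number of VMs and the storage they use."""
    for vm in vms:
        if not within_boundaries(vm):
            raise ValueError(f"{vm.name} exceeds the agreed boundaries")
    storage_gb = sum(vm.storage_gb for vm in vms)
    return len(vms) * RATE_PER_VM + storage_gb * RATE_PER_GB_STORAGE


if __name__ == "__main__":
    vms = [
        VirtualMachine("web01", vcpus=2, memory_gb=4, storage_gb=100),
        VirtualMachine("sql01", vcpus=4, memory_gb=16, storage_gb=500),
    ]
    print(f"monthly bill: {monthly_bill(vms):.2f}")
```

A customer can check a bill like this by counting VMs and gigabytes, which is about as much detail as most of them want.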

Demos

We dive into the System Center products and look at them from both the IT Pro and the consumer side of things.

Requests => Approval & Deployment

The approval process should be dynamic, based on what is requested & who’s making the request. You’ll also need SLAs & chargeback on these. Be careful not to overcomplicate it or you’ll encourage rogue IT.
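
A dynamic approval process does not have to be complicated. The sketch below is only an illustration in Python. The roles, thresholds and approver names are hypothetical and were not part of the session:

```python
def approval_route(requester_role, vm_count, storage_gb):
    """Return the list of approvers a request needs (hypothetical rules)."""
    approvers = []
    # Small requests from trusted roles are approved automatically,
    # which keeps the easy path easy and discourages rogue IT.
    if requester_role == "team_lead" and vm_count <= 2 and storage_gb <= 500:
        return approvers
    approvers.append("line_manager")
    # Larger requests also need somebody who owns capacity and budget.
    if vm_count > 10 or storage_gb > 5000:
        approvers.append("capacity_manager")
    return approvers


if __name__ == "__main__":
    print(approval_route("team_lead", 1, 100))
    print(approval_route("developer", 1, 100))
    print(approval_route("team_lead", 20, 100))
```

Two or three rules like these are usually enough. Every extra rule is one more reason for people to go around you.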

RANT: IT should make things as easy as possible. And in this discussion I’m not won over by chargeback. It often turns into an Excel exercise. Internal IT becomes more and more like an external service provider or integrator in this model. The inherent strength of being part of the business and being in the best position to help that business move ahead is lost. Is this a plot by the integrators? It fits their model, but a lot of that model is basically broken very badly. The last thing internal IT should do is become like them. That will do nothing for “Business-IT alignment”. We need to leverage the possibilities of the private cloud for our business or we have no unique selling point. Not that the service providers do a better job, but at least they are not on the payroll, so the bean counters like that. And as long as they can use public cloud to get their needs served, they couldn’t care less about who does the private cloud thingy for them. So a functional IT is first and foremost what we need. That is customer centered. Alignment of business & IT is worthless without that. The latter happens way too much.

Management

Well yes, this is important. We need reports, reviews and Service Improvement Plans, and we need to look for opportunities for automation.

Personal Best of MMS 2012 Series “Why We Fail–An Architect’s Journey to the Private Cloud”

Introduction

The speaker (Alex Jauch) addresses the confusion around cloud terminology and points out that everyone wants it anyway. So the pressure is on to deliver cloud.

But as an architect you can’t build with such vague notions of what it is. That just doesn’t work. 78% of enterprise IT shops will deploy a private cloud by 2014 (Gartner). 62% of all IT projects fail. For the record, building a private cloud is not an easy project.

For one, what are you building? What is it? There are way too many definitions. NIST seems to have one of the better definitions around. Specific, direct and actionable. We can work with that. I suggest you visit the NIST site for more information on:

  • Deployment Models: Private Cloud, Hybrid, Public
  • Service Models: SaaS, PaaS, IaaS
  • The Essential Characteristics
  • The Common Characteristics

Why We Fail

What happens:

  • Install Hyper-V
  • Deploy System Center
  • Build a solution

The essential element of cloud is that “The cloud is a customer centric business model, not technology”. It’s approached too much as a technology problem and that’s why we fail.

The architect should not allow this to happen, so he is to blame. The architectural practice is to marry business needs and wants to technology as a solution. This really hits home, but there are more people involved and then there is the entire business / IT alignment fiasco, as you can read in my blog post The shortage of skilled employees, are we making it worse?, but the buck stops with the architect.

How do we add value to the business? Commodities do not add value, they are necessities. So we need to decide what business we are in. Meeting standards is not a goal. Enabling business is the goal. Success means they think you’re doing a great job empowering them. After all, they are paying for it.

The Take Away

Traditional IT needs to evolve (fast) to customer centric IT. End user departments define the goals. Our operational proficiency used to be our pride, but what does it mean to the customer? Problems that do not affect the business don’t matter. So talk to customers to find out what they want and need. They don’t care about your skill set or certifications. You’ll need to extract the need from their wants.

What counts is the ability to take pain points away from customers. Small & medium sized projects do very well at this. But in a lot of companies they don’t promote you for those “smaller” projects. So the business also has to evolve.

I’d like to add that old style IT is also promoted by a lot of misguided security officers and business lawyers. Strict rules are their guidance and their instruments and no, those are not always in the best interest of the business either.

This relates to IT Portfolio Management: Strategic, High Potential, Key Operational & Support. We need to realize that whatever we work on that is strategic or high potential today will eventually move to key operational and support. They all need different approaches and types of management. So choose your methodologies wisely. Don’t just pick one and force that square peg into the round hole. This is my advice to both business and IT. I’ve seen bad business decisions turn support level products into high cost, high maintenance ones. So we might not have to be our brother’s keeper towards the business, but then again, do we really need those bridging functions? If we do, those guys or gals need to be at the top of their game, as I stated in The shortage of skilled employees, are we making it worse?

So keep things as simple and as effective as possible. Do it fast, rinse and repeat. You’ll learn a lot and improve along the way. So here comes the build or buy decision and the link to the NetApp plug by the speaker. This is very dependent on the situation of the organization at hand. So the fast track has its place here. Is speed of delivery of key importance, or absolute flexibility and adaptability? So it will depend. Yes, the consultant’s answer. But being a real consultant is a very respectable job. I can’t help it that the word has become meaningless due to misuse and inflationary titles for temps for hire. The System Center stack and how NetApp improves and leverages all this is briefly discussed. He ties the fast track into the discussion of portfolio management and working in a customer centric way.

Conclusion

Why are we doing what we do? Think about it. There is a nice book on this subject: “Why We Fail” by Alex Jauch.

Join the Private Cloud Roadshow in Brussels (25 April) and Ghent (26 April)

You will have heard of the Private Cloud concept by now. But a lot of us are still struggling with what it means to us and how we can leverage this concept. Microsoft is aware of this and wants to help us understand Private Cloud better. On top of that I can tell you from personal experience that Redmond also really wants to understand our needs in that regard better.

Recently they announced a free event (1/2 day) to achieve exactly that. Yes, they are proud to present “The Private Cloud Roadshow”. And I’m proud to say that two fellow MVPs and MEET members are delivering the content. :-) I included the agenda for you as well as the links to register.


Join experts Kurt Roggen (MVP) and Mike Resseler (MVP) for the Private Cloud Roadshow.

In this half day you will learn more about private cloud infrastructure setup and how you can monitor this. Learn how to create your private clouds and how to deploy standardized applications or services into these clouds. And as a final session you will learn how you can provide automation in your private cloud.

There are 2 options to attend (same content, different location):

25 April 2012 in Brussels

26 April 2012 in Ghent

This TechNet event is free of charge and Microsoft will be giving away one Nokia Lumia 800 Windows Phone in each location.

Detailed agenda:
13:00 : 13:30 Welcome & Coffee
13:30 : 14:30 Building your Private Cloud Infrastructure
Learn how to build your Private Cloud infrastructure, by dealing with Fabric Management (Compute/ Hypervisors, Storage, Network), which will serve as the basis for the Private Cloud that you will be creating. We will discuss how to deploy, configure and manage each of these different elements in your datacenter.
14:30 : 14:50 Break
14:50 : 16:30 Creating, Monitoring & Operating your Private Cloud
Learn how to create your private clouds and how to deploy standardized applications or services into these clouds. Learn how to monitor your clouds and how you can handle change requests. All these key areas will be addressed to give you an idea of what is happening in a private cloud after it is up, running, and in production.
16:30 : 16:50 Break
16:50 : 18:00 Automating & Delivering Services in your Private Cloud
Learn how you can provide automation in your private cloud. Discover also how your cloud services can be offered and consumed using different self-service portals and what their differences are.
18:00 : 19:00 Networking & Drinks

Thinking About Windows 8 Server & Hyper-V 3.0 Network Performance

Introduction

The main purpose of this post is, as mentioned in the title, to think. This is not a design or a technical reference. When it comes to designing a virtualization solution for your private cloud there are a lot of components to consider. Storage, networking, CPU and memory all come into play and there is no one size fits all. It all depends on your needs and budget, in combination with how good your insight into your future plans & requirements is. This is not an easy task. Virtualizing 40 web servers is very different from virtualizing SQL Server, Exchange or SharePoint. Server virtualization is different from VDI and VDI itself comes in many different flavors.

  • So what workloads are you hosting? Is it a homogeneous or a heterogeneous environment?
  • What kind of applications are you supporting? Client-Server, SOA/Web Services, Cloud apps?
  • What storage performance & features do you need to support all that?
  • What kind of network traffic does this require?
  • What does your business demand? Do you know, do they even know? Can you even know in a private cloud environment?
  • Do you have one customer or many (multi-tenancy), and how are they alike or different in both IT needs and business requirements?

The needs of true public cloud builders are different from those running their own private clouds in their own data centers or in a mix of those with infrastructure at a hosting provider. On top of that an SMB environment is different from large enterprises and companies of the same size will differ in their requirements enormously due to the nature of their business.

I’ve written about virtualization and CPU considerations before (NUMA, Power Save settings for both OS & 10Gbps network performance). I’ve also written a number of posts about 10Gbps networking and different approaches on how to introduce it without breaking the bank. In 2012 I intend to blog some more on networking and storage options with Windows 8 and Hyper-V 3.0. But I still need to get my hands on the betas and release candidates of Windows 8 to do so. You’ll notice I don’t talk about InfiniBand. Well, I just don’t circulate in the ecosystems where absolute top notch performance is so important that they can justify and get that kind of budget to throw at those needs.

To set the scene for these blog posts I’ll introduce some considerations around networking options with Hyper-V. There are many features and options in hardware, technologies, protocols and file systems. Even when everything is intended to make life simpler, people might get lost in all the options and choices available.

Windows Server 8 NIC features – The Alphabet Soup

  • Data Center Bridging (DCB)
  • Receive Segment Coalescing (RSC)
  • Receive Side Scaling (RSS)
  • Remote Direct Memory Access (RDMA)
  • Single Root I/O Virtualization (SR-IOV)
  • Virtual Machine Queue (VMQ)
  • IPsec offload (IPsecTO)

A lot of this stuff has to do with converged networks. These offer a lot of flexibility and the potential for cost savings along the way. But convergence & cost savings are not a goal. They are means to an end. Perhaps you can have better, cheaper and more effective solutions leveraging your existing network infrastructure by adding some 10Gbps switches & NICs where they provide the best bang for the buck. Chances are you don’t need to throw it all out and do a forklift replacement. Use what you need from the options and features. Be smart about it. Remember my post on A Fool With A Tool Is Still A Fool, don’t be that guy!

Now let’s focus on a couple of the features here that have to do with network I/O performance and not as much with convergence or QOS. As an example of this I like to use Live Migration of virtual machines with 10Gbps. Right now with one 10Gbps NIC I can use 75% of the bandwidth of a dedicated NIC for live migration. When running 20 or more virtual machines per host with 4GB to 8GB of memory each and with Windows 8 giving me multiple concurrent Live Migrations I can really use that bandwidth. Why would I want to cut it up to 2 or 3Gbps in that case? Remember the goal. All the features and concepts are just tools, means to an end. Think about what you need.
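
A rough back-of-the-envelope calculation shows why that bandwidth matters. The Python sketch below assumes that all VM memory is copied exactly once at the usable line rate. It ignores the re-copying of dirty pages and any protocol overhead, so read the results as lower bounds, not as measurements:

```python
def evacuation_seconds(vm_count, memory_gb_per_vm, link_gbps, usable_fraction):
    """Lower-bound time to move all VM memory off a host over one link."""
    total_gigabits = vm_count * memory_gb_per_vm * 8  # GB -> Gbit
    usable_gbps = link_gbps * usable_fraction
    return total_gigabits / usable_gbps


# 20 VMs with 8 GB each over a dedicated 10Gbps NIC used at 75%
full_nic = evacuation_seconds(20, 8, 10, 0.75)
# The same host when live migration is cut down to 2Gbps
capped = evacuation_seconds(20, 8, 2, 1.0)

if __name__ == "__main__":
    print(f"dedicated 10Gbps NIC: {full_nic / 60:.1f} minutes")
    print(f"capped at 2Gbps:      {capped / 60:.1f} minutes")
```

Under these assumptions the dedicated NIC empties the host in roughly 3 minutes, while the 2Gbps slice needs almost 11. Whether that difference matters depends on how often and how urgently you need to evacuate a host.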

But wait, in Windows 8 we have some new tricks up our sleeve. Let’s team two 10Gbps NICs, put all traffic over that team, and then divide the bandwidth up and use QOS to assure Live Migration gets 10Gbps when needed, but without taking it away from other network I/O when it’s not needed. That’s nice! Sounds rather cool, doesn’t it, and I certainly see a use for it. It might not be right if you can’t afford to lose that bandwidth when Live Migration kicks in, but if you can … more power & cost savings to you. But there are other reasons not to put everything on one NIC or team.
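
The idea of a guaranteed minimum that is only claimed when needed can be sketched in a few lines of Python. This is a conceptual illustration, not how Windows implements QOS. The traffic class names and numbers are made up, and treating the team as one 20Gbps pipe is a simplifying assumption:

```python
def allocate(capacity_gbps, guarantees, demands):
    """Work-conserving sharing: every class first gets min(demand, guarantee);
    leftover capacity then goes to the classes that still want more,
    in proportion to their guarantee."""
    if sum(guarantees.values()) > capacity_gbps:
        raise ValueError("guarantees exceed capacity")
    alloc = {n: min(demands[n], guarantees[n]) for n in guarantees}
    leftover = capacity_gbps - sum(alloc.values())
    hungry = {n for n in guarantees if demands[n] > alloc[n]}
    while leftover > 1e-9 and hungry:
        weight = sum(guarantees[n] for n in hungry)
        if weight == 0:
            break  # only classes without a guarantee are left
        granted = 0.0
        for n in list(hungry):
            share = leftover * guarantees[n] / weight
            extra = min(share, demands[n] - alloc[n])
            alloc[n] += extra
            granted += extra
            if demands[n] - alloc[n] <= 1e-9:
                hungry.discard(n)
        leftover -= granted
    return alloc


# Hypothetical guarantees on a 2 x 10Gbps team
GUARANTEES = {"live_migration": 10, "vm_traffic": 7, "cluster_mgmt": 3}

idle = allocate(20, GUARANTEES, {"live_migration": 0, "vm_traffic": 12, "cluster_mgmt": 1})
busy = allocate(20, GUARANTEES, {"live_migration": 20, "vm_traffic": 12, "cluster_mgmt": 1})

if __name__ == "__main__":
    print("no live migration:", idle)
    print("live migration running:", busy)
```

With no live migration running, the virtual machine traffic can use all 12Gbps it asks for. Once live migration kicks in, it gets at least its 10Gbps and the virtual machine traffic is squeezed back towards its own guarantee. That squeeze is exactly the bandwidth you have to be able to afford to lose.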

RSS, VMQ, SR-IOV

One thing all these have in common is that they are used to reduce the CPU load / bottleneck on the host and allow you to optimize the network I/O and bandwidth usage of your expensive 10Gbps NICs. Both avoiding a CPU bottleneck and optimizing the use of the available bandwidth mean you get more out of your servers. That translates into not having to buy more of them to get the same workload done.

RSS is targeted at the host network traffic. VMQ and SR-IOV are targeted at the virtual machine network traffic, but in the end they both result in the same benefits as stated above. RSS & VMQ integrate well with other advanced Windows features. VMQ, for example, can be used with the extensible Hyper-V switch while RSS can be combined with QOS & DCB in storage & cluster host networking scenarios. So these give you a lot of options and flexibility. SR-IOV and RDMA are more focused on raw performance and don’t integrate so well with the more advanced features for flexibility & scalability. I’ll talk some more on this in future blog posts.

Now with all these features that have their own requirements and compatibilities you might want to reconsider putting all traffic over one pair of teamed NICs. You can’t optimize them all in such a scenario and that might hurt you. Perhaps you’ll be fine, perhaps you won’t.

So what to use where and when depends on how many NICs you’ll use in your servers and for what purpose. For example, even in a private cloud for lightweight virtual machines running web services you might want to separate the host management & cluster traffic from the virtual machine network traffic. You see, RSS & VMQ are mutually exclusive. That way you can use RSS for the host/cluster traffic and DVMQ for the virtual machine network. Now if you need redundancy you might see that you’ll already use 2*2NIC with Windows 8 NLB in combination with two switches to avoid a single point of failure. Do you really need that bandwidth for the guest servers? Perhaps not, but you might find that it helps improve density because of better host & NIC performance, helping you avoid the cost of buying extra servers. If you virtualize SQL servers you’d be even more interested in all this. The picture below is just an illustration, just to get you to think, it’s not a design.

image
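
The mutual exclusivity of RSS & VMQ mentioned above is the kind of rule you can check before you start cabling. A small Python sketch, with hypothetical team names and only the one rule taken from this post:

```python
# Per the text: RSS and VMQ cannot both be used on the same NIC or team.
MUTUALLY_EXCLUSIVE = [{"RSS", "VMQ"}]


def conflicts(nic_plan):
    """Return (nic, features) pairs that combine mutually exclusive features."""
    problems = []
    for nic, features in nic_plan.items():
        for pair in MUTUALLY_EXCLUSIVE:
            if pair <= set(features):
                problems.append((nic, sorted(pair)))
    return problems


# Everything on one team versus separate teams for host and VM traffic
converged = {"team1": ["RSS", "VMQ"]}
separated = {"host_team": ["RSS"], "vm_team": ["VMQ"]}

if __name__ == "__main__":
    print("converged:", conflicts(converged))
    print("separated:", conflicts(separated))
```

More rules can be added to the list as the compatibility picture for the other features becomes clear.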

I’m sure a lot of matrices will be produced showing what features are compatible under what conditions, perhaps even with some decision charts to help you decide what to use where and when.