Workshop Datacenter Modernization -Microsoft Technical Summit 2014 Germany (Berlin)

While speaking (What’s new in Failover Clustering in Windows Server 2012 R2) and attending the Microsoft Technical Summit 2014 I’m taking the opportunity to see how Microsoft Germany and partners are doing a workshop which is based on the IT Camps they have been delivering over the past year. There is a lot of content to be delivered and both trainers Carsten Rachfahl (Rachfahl IT-Solutions GmbH) and Bernhard Frank (Partner Technology Strategist (Hosting), Microsoft) are doing that magnificently.

One thing I note is that they sure do put in a lot of effort. The one I’m attending requires some server infrastructure, a couple of switches, cabling for over 50 laptops etc. These have been neatly packed into road cases and the 50+ laptops had been placed, cabled and deployed using PXE boot /WDS the night before. Yes even in the era of cloud you need hardware especially if you’re doing an IT Camp on “Datacenter Modernization” (think private & hybrid infrastructure design and deployment).

image

Not bypassing this aspect of private cloud building adds value to the workshop and is made possible with the help of Wortmann AG. Yes the attendees get to deploy storage spaces, Scale Out File Server, networking etc. They don’t abstract any of the underlying technologies away, I like that a lot, it adds value and realism.

I’m happy to see that they leverage the real world experience of experts (fellow Hyper-V MVP Carsten Rachfahl) who helps hosting companies and enterprises deploy these technologies. Storage, Scale Out File Server, Hyper-V clusters, System Center and self service (Azure Pack) are the technologies used to achieve the goals of the workshop.

image

The smart use of PowerShell (workflows, PDT) allows to automate the process and frees up time to discuss and explain the technologies and design decisions. They take great care to explain the steps and tools used so the attendees can use these later in their own environments. Talking about their own experiences and mistakes helps the attendees avoid common mishaps and move along faster.

image

The fact that they have added workshops like this to the summit adds value. I think it’s a great idea that they are held on the last day as this means that attendees can put the information they gathered from 2 days of sessions into practice. This helps understanding the technologies better.

There is very little criticism to be given on the content and the way they deliver it. I have to say that it’s all very well done. Perhaps they make private cloud look a bit too easy Winking smile. Bernard, Carsten, well done guys, I’m impressed. If you’re based in Germany and you or your team members need to get up to speed on how these technologies can be leveraged to modernize your data center I can highly recommend these guys and their workshops/IT Camps.

The Microsoft Management Summit 2013

MMS 2013 is in Las Vegas, Nevada, USA

Time flies fast and it’s time to look ahead to 2013. My continuing investment in myself is part of that.  Despite a lot of rumors about big changes to MMS (its future, location, timing etc.) things will go forward as they’ve been in the past years. That includes the location. As you probably already heard it’s back in Las Vegas, state of Nevada, USA. So after the, for many people, somewhat disconcerting announcement at MMS 2012 indicating the above mentioned changes, MMS 2013 will once again be held in Las Vegas again. As before it will be focused on the entire System Center Suite. That was confirmed by a mail form the MSS conference team recently and a TechNet blog post

image

Recently is was announced that the MMS 2013 content survey is now open. So they’re planning for the Microsoft Management Summit 2013 content and they’d like to hear from us. Why? Well, the better they align the content of the conference to our needs, the better it will be as an experience. This means our return on investment will be bigger which is always a good thing. So if you’re going or thinking of going this is the place, MMS 2013 Content Survey, to voice your opinions on what it should look like content wise. You have two more weeks to fill it out and than it’s scheduled to close down.

Why Attend?

It’s great to have an event focused on managing, deploying and protecting the infrastructure we’ve spent so much time, effort and money building. This conference is dedicated to exactly that. Smaller in scale but very focused. All together in the same hotel/conference center for 5 long days living in System Center and nothing else. As the world’s top operators in this space are there, the networking opportunities are also excellent. I can still remember the amount of talking and discussing I did with my colleagues in 2012, that was stimulating.

It’s also the place to provide feedback to Microsoft about System Center. Things you like, don’t like, things that are missing etc. I most certainly have some feedback for them.

Will I attend?

I’ll most certainly try to attend, that’s for sure. So it’s time to fill out the request form and start cutting through the red tape. Let’s hope the economy doesn’t tank completely and that we can go. The chips might be down right now but let’s not cost cut ourselves out of skills, education, opportunities and a future. Remember, keep moving forward and don’t quit yet, you can always give up later Winking smile.

Fixing Hiccups in The SCVMM2008R2 GUI & Database

As you might very well know by experience sometimes the System Center Virtual Machine Manager GUI and database get out of sync with reality about what’s going on for real on the cluster. I’ve blogged about this before in SCVMM 2008 R2 Phantom VM guests after Blue Screen and in System Center Virtual Machine Manager 2008 R2 Error 12711 & The cluster group could not be found (0×1395)

The Issue

Recently I had to trouble shoot the “Missing” status of some virtual machines on a Hyper-V cluster in SCVMM2008R2. Rebooting the hosts, guests, restarting agents, … none of the usual tricks for this behavior seemed to do the trick. The SCVMM2008R2 installation was also fully up to date with service packs & patches so there the issue dot originate.

Repair was greyed out and was no use. We could have removed the host from SCVMM en add it again. That resets the database entries for that host en can help fix the issues but still is not guaranteed to work and you don’t learn what the root cause or solution is. But none of our usual tricks worked.We could have deleted the VMs from the database as in  but we didn’t have duplicates. Sure, this doesn’t delete any files or VM so it should show up again afterwards but why risk it not showing up again and having to go through fixing that.

The Cause

The VMs were in a “Missing” state after an attempted live migration during a manual patching cycle where the host was restarted the before the “start maintenance mode” had completed. A couple of those VMs where also Live Migrated at the same time with the Failover Cluster GUI. A bit of confusion al around so to speak nut luckily all VMs are fully operational an servicing applications & users so no crisis there.

The Fix

DISCLAIMER

I’m not telling you to use this method to fix this issue but you can at your own risk. As always please make sure you have good and verified backups of anything that’s of value to you Smile

We hade to investigate. The good news was that all VMs are up an running, there is no downtime at the moment and the cluster seems perfectly happy Smile.

But there we see the first clue. The Virtual machines on the cluster are not running on the node SCVMM thinks they are running, hence the “Missing” status.

First of all let’s find out what host the VM is really running on in the cluster and see what SCVMM thinks on what host the VM  is running. We run this little query against the VMM database. That gives us all hosts known to SCVMM.

SELECT [HostID],[ComputerName] FROM [VMM].[dbo].[tbl_ADHC_Host]

HostID                                                                        ComputerName

559D0C84-59C3-4A0A-8446-3A6C43ABF618          node1.test.lab

540C2477-00C3-4388-9F1B-31DBADAD1D8C        node2.test.lab

40B109A2-9E6B-47BC-8FB5-748688BFC0DF         node3.test.lab

C2DA03CE-011D-45E3-A389-200A3E3ED62E        node4.test.lab

6FA4ABBA-6599-4C7A-B632-80449DB3C54C         node5.test.lab

C0CF479F-F742-4851-B340-ED33C25E2013          node6.test.lab

D2639875-603F-4F49-B498-F7183444120A             node7.test.lab

CE119AAC-CF7E-4207-BE0B-03AAE0371165         node8.test.lab

AB07E1C2-B123-4AF5-922B-82F77C5885A2           node9.test.lab

(9 row(s) affected)

Voila en now the fun starts. SCVMM GUI tells us “MissingVM” is missing on node4.

We check this in the database to confirm:

SELECT Name, ObjectState, HostId
FROM VMM.dbo.tbl_WLC_VObject
WHERE Name = 'MissingVM'
GO

Which is indeed node4

Name                                                                                                                                                                                                                                                             ObjectState HostId

———  —  ————————————

node4  220  C2DA03CE-011D-45E3-A389-200A3E3ED62E

(1 row(s) affected)


In SCVMM we see that the moving of the VM failed. Between node 4 and node 6.

image

Now let’s take a look at what the cluster thinks … yes there it is running happily on node 6 and not on node 4. There’s the mismatch causing the issue.

So we need to fix this. We can Live Migrate the VM with the Failover Cluster GUI to the node SCVMM thinks the VM still resides on and see if that fixes it. If it does, great! You have to give SCVMM some time to detect all things and update its records.

But what to do if it doesn’t work out?  We can get the HostId from the node where the VM is really running in the cluster, which we can see in the Failover Cluster GUI, from the query we ran above and than update the record:

UPDATE VMM.dbo.tbl_WLC_VObject
SET HostId  = 'C0CF479F-F742-4851-B340-ED33C25E2013'
WHERE Name = 'MissingVM'
GO

We then reset the ObjectState to 0 to get rid of the Missing status. It would do this automatically but it takes a while.

UPDATE VMM.dbo.tbl_WLC_VObject
SET ObjectState = '0'
WHERE Name = 'MissingVM'
GO

After some patience & Refreshing all is well again and test with live migrations proves that all works again.

As I said before people get creative in how to achieve things due to inconsistencies, differences in functionality between Hyper-V Manager, Failover Cluster Manager and SCVMM 2008R2 can lead to some confusing situations. I’m happy to see that in Windows 8 the action you should perform using the Failover Cluster GUI or PowerShell are blocked in Hyper-V Manager. But SCVMM really needs a “reset” button that makes it check & validate that what it thinks is reality.

My Best of MMS 2012 Series: Private Cloud 2012 Lessons Learned from Our Early Adopters

An open discussion on people who have built private clouds at customers.

Elasticity

In real live things don’t shrink that often.  Smile Free or real cheap back charge rates are not doing anything to help.

My take on this is that you should look at elasticity as a flexibility feature. Even if a cloud is no that elastic in both ways. You can shrink a cloud v1 to zero as you migrate the VMs to private cloud v2. Than dump the resources back into another pool, step by step or in one go. I’ll use whatever works to make my live easier.

Standardizing & Customer Centric Operations = Planning is Key!

These two can be hard to combine. It takes serious planning and as such an upfront investment.

Than you need to build it to optimize operations (cost & excellent service). This sounds nice but how good are we at this and what is the shelf life of a solution versus the investment?

There is a lot of preparation to do. There is a lot of things to consider. Databases, Storage, the network, security boundaries, disaster recovery planning. They advise not to do it cross domain. Hmmm … we need to address this. Seriously..

Testing => build decent scripts with variables & config files. This will help to deploy in test, acceptance & production without to many changes/work.

Make sure you define all service accounts, groups and permissions you need.

It’s all about planning and what’s being told are best practices that exist already, private cloud or not.

Self Service

  • Service Catalog is a prerequisite.
  • Self Service is key to the private cloud.
  • If people can they will do things differently. You’ll have to learn to deal with this.
  • Billing for services should be clear. Not to much detail. VMs & Storage are two good ones. Keep it simple and don’t go into memory & vCPU. Just set boundaries.

Demos

We dive into the System Center products and look at it from both the IT Pro and the consumer side of things.

Requests => Approval & Deployment

Approval process should be dynamic based on what is requested & who’s making is. You’ll also need SLAs & chargeback on these. Be careful not over complicating it or you encourage rogue IT.

RANT: IT should make things as easy as possible. And in this discussion I’m not won over for charge back. It often turns into an excel exercise. Internal IT becomes more and more like an external service provider or integrator in this model. The inherent strength of being part of the business and being in the best position to help that business move ahead is lost. Is this a complot of the integrators? It fits their model but basically a lot of that is broken very badly. The last thing internal IT should do is become like them. That will do nothing for “Business-IT alignment”. We need to leverage the possibilities of the private cloud for our business or we have no unique selling point. Not that the service providers do a better job, but at least they are not on the pay roll so the bean counters like that. And as long as they can use public cloud to get their needs served hey couldn’t care less about who does the private cloud thingy for them. So a functional IT is first and foremost what we need. That is customer centered. Alignment of business & IT is worthless without that. The latter happens ay to much.

Management

Well yes this is important. We need reports, reviews, Service Improvement Plans, look for opportunities for automation.