WDeployConfigWriter Account Issues – Trouble Shooting Web Deploy 2.0 With Lessons Learned

Here’s a small recap of an incident we dealt with recently and that served as a coaching exercise for troubleshooting. It seems we have Web Deploy 2.0 in use for in house deployments of web apps. It seems to be a valued asset as well. At least valuable enough to land a help request on the desk of one of the young, eager, smart, and upward mobile IT Professionals when it stops working and they need some assistance.

Hello ICT,

To deploy our we websites remotely we use web deployment service (see http://technet.microsoft.com/en-us/library/dd569087(WS.10).aspx for more info).

This service runs under the network service account by default. Deploying fails now. In the security log on the server I find  “The specified account’s password has expired”.

Does anyone know the password of this account?

Best regards,

Hardworking Web Guy In Trouble

Basically, we have enough information to know something went wrong and that they need it to work again. But that’s about it. Password for the network service account expired? They also included an error log and reading it learns us something. The lesson to be learned here: investigate yourself, read the log, interpret them. Don’t let patients give you a diagnosis. Their input is critical, but you need to draw your own conclusions.

An account failed to log on.

Subject:
                Security ID:                           LOCAL SERVICE
                Account Name:                    LOCAL SERVICE
                Account Domain:                NT AUTHORITY
                Logon ID:                              0x3e5

Logon Type:                                         8

Account For Which Logon Failed:
                Security ID:                           NULL SID
                Account Name:                    WDeployConfigWriter
                Account Domain:                lab.test

Failure Information:
                Failure Reason:                     The specified account’s password has expired.
                Status:                                0xc000006e
                Sub Status:                            0xc0000071

Process Information:
Caller Process ID: 0x1f44
Caller Process Name: C:WindowsSystem32inetsrvWMSvc.exe

What did we just read and learn? No, it’s not the Network Service Account whose password has expired. This doesn’t happen/doesn’t work that way … so that was our first indication that this isn’t quite right in the support ticket. As you can see the real problem account mentioned in the error log:  WDeployConfigWriter. That account is indeed a local account.

Cool, now we check what service runs under that account by looking in the services panel …. none! The easy way to check is to sort on the “Log On As” column. You won’t find WDeployConfigWriter. Right … , what else do we learn from the Services panel. Well we do have service called Web Deployment Agent Service running under the local Network Service account. We can stop and start it just fine so there is nothing wrong with the Network Service account, which is as expected and this service is not our culprit.  What we also learn that this is Web Deploy 2.0.

As the Web Deployment Agent Service has nothing to do with the problem at hand. So where is that WDeployConfigWriter being used and what is it status? Let’s take a look.

Hey, how could this account have expired? This is impossible. Unless they changed it while trying to fix the error. We check this with a quick phone call and yes, they did exactly that.  The good thing is that this web guy is professional and tells us what they did. Some people think this might get them into trouble and won’t do that. It doesn’t change anything, things are what they are, but it does make communication less easy when you discover people act that way… So the lessons here are to double-check & verify what happened if at all possible. Originally the settings were:

They changed them after they ran into issues hop that checking those options might fix it. Well no, expired is expired and you can’t fix it like that. You need indeed to correct the settings if you don’t want the password to expire and even prevent the user from changing it but you also need to set a new password when it has already expired. After doing so we contact the hardworking web guy in trouble to let them test and predict a new error: whatever runs under that Account will now fail to run due to an incorrect password. And guess what? “Unknown user name or bad password” in the security log.

Log Name:      Security
Source:        Microsoft-Windows-Security-Auditing
Date:          24/06/2011 10:30:39
Event ID:      4625
Task Category: Logon
Level:         Information
Keywords:      Audit Failure
User:          N/A
Computer:     server1.lab.test
Description:
An account failed to log on.

Subject:
    Security ID:        LOCAL SERVICE
    Account Name:        LOCAL SERVICE
    Account Domain:        NT AUTHORITY
    Logon ID:        0x3e5

Logon Type:            8

Account For Which Logon Failed:
    Security ID:        NULL SID
    Account Name:        WDeployConfigWriter
    Account Domain:        lab.test

Failure Information:
    Failure Reason:        Unknown user name or bad password.
    Status:            0xc000006d
    Sub Status:        0xc000006a

Process Information:
    Caller Process ID:    0x1f44
    Caller Process Name:    C:WindowsSystem32inetsrvWMSvc.exe

 

The user wants to repair install or uninstall and reinstall the application to “get a quick fix” but we do not give in and keep troubleshooting. It’s better to learn what the cause really is and how to fix it instead of relying on wishful reinstalling.

So where is the thing that runs under that account? We start a quick search in the registry and on the file system for the account name just in case it’s configured in the registry or a configuration file and let it run while we keep investigating.  We also send a tweet into the universe, as perhaps someone out there knows this and can help out. We search the internet for Web Deploy 2.0 and WDeployConfigWriter. This results in very few hits, hmmm, interesting  … One of them is http://blogs.iis.net/msdeploy/archive/2011/04/05/announcing-web-deploy-2-0-refresh.aspx

Where we learn a few things, the most important is the one line from that blog post I formatted in bold and red from the blog snippet right below. I also enlarged the picture from the blog post to make it readable where you can find in IIS  what we learned here:

Notice that Web Deploy setup created two new local user accounts:

– WDeployConfigWriter, which has Write permissions to the IIS server’s applicationHost.config. This is used by delegation rules for createApp, appPoolNetFx and appPoolPipelineMode.

I’ve included the entire block of text from where this was taken below.

1. Easier setup for non-administrator deployments on IIS7

One of the common requests from our users was to make it easier to setup Web Deploy so non-administrators can publish to their sites. Typically, you will need to do this if you are running a shared hosting environment or if you are administering a build machine and you do not want users to have admin access.

If you launch the Web Deploy installer and choose “Custom”, you will notice a new option, “Configure for Non-administrator Deployments”:

If you choose this option, Web Deploy will automatically create Management Service Delegation rules for the following providers, as well as user the accounts needed for providers like createApp and recycleApp that need elevated privileges.

These are the rules you will have in the Management Service Delegation UI in IIS Manager after you install this component:

Notice that Web Deploy setup created two new local user accounts:

– WDeployConfigWriter, which has Write permissions to the IIS server’s applicationHost.config. This is used by delegation rules for createApp, appPoolNetFx and appPoolPipelineMode.

– WDeployAdmin, which is an administrator. This is used by delegation rules for recycleApp.

If you prefer to create these rules by hand, uncheck the component in the installer. We also provide a PowerShell script for creating delegation rules (more on this later in the post) if you prefer that route.

Well-armed with this information we go have a look at the Management Service Delegation:

Where we indeed find createApp, appPoolNetFx and appPoolPipelineMode:

So now we take a look a bit what we can configure here and  sure enough, by double-clicking on them the Edit Rule form:

So we click on Edit security credentials and are welcomed by this form:

So we enter the account name and the new password we set before (remember to do this for both providers):

Guess what, end user happy, things are working again. Jay! From service down report to the helpdesk to fully operational again in less than an hour with a technology new to the service desk.

How did this happen and did they end up with this funky configuration (expiring password of an account that no one knows where it is used for and where configured)? Aha, operational control => know the configuration of what you use and know why it is configured that way and where it’s configured. Is it a mistake/assumption in the installer that the accounts WDeployConfigWriter and WDeployAdmin have their passwords set to expired and can be changed by the user or did somebody mess with them after the install? Well, I did the test by setting it up on a test server and found that they are indeed installed with their passwords set to expire and that the password can be changed by the user. It assumes that the person doing the install knows and realizes the implications. I’m not saying either setting is wrong but you should know why, when, and where. There is no documentation on this as far as we could find right now and perhaps the installer should mention the benefits/risks of both types of configuration and ask what to choose. This, together with better documentation, could help prevent this issue. As always, no guarantees are given   

Overall lesson: don’t assume things, trust but verify …

KB Article 2522766 & KB Article 2135160 Published Today

At this moment in time I don’t have any more Hyper-V clusters to support that are below Windows Server 2008 R2 SP1. That’s good as I only have one list of patches to keep up to date for my own use. As for you guys still taking care of Windows 2008 R2 RTM Hyper-V cluster you might want to take a look at KN article 2135160 FIX: "0x0000009E" Stop error when you host Hyper-V virtual machines in a Windows Server 2008 R2-based failover cluster that was released today. The issue however is (yet again) an underlying C-State issue that already has been fixed in relation to another issue published as KB article 983460 Startup takes a long time on a Windows 7 or Windows Server 2008 R2-based computer that has an Intel Nehalem-EX CPU installed.

And for both Windows Server 2008 R2 RTM and SP1 you might take a look at an MPIO issue that was also published today (you are running Hyper-V on a cluster and your are using MPIO for redundant storage access I bet) KB article 2522766 The MPIO driver fails over all paths incorrectly when a transient single failure occurs in Windows Server 2008 or in Windows Server 2008 R2

It’s time I add a page to this blog for all the fixes related to Hyper-V and Failover Clustering with Windows Server 2008 R2 SP1 for my own reference Smile

Exchange 2010 SP1 DAG & Unified Messaging Now Supports Host Based High Availability & Live Migration!

Well due to rather nice virtualization support for Lync and the fact that Denali (SQL Server vNext) does support DAG like functionality with Live Migration and host based clustering, it was about time for Exchange 2010 to catch up. And when we read the white paper  Best Practices for Virtualizing Exchange Server 2010 with Windows Server® 2008 R2 Hyper V™ that moment has finally arrived. I have to thank Michel de Rooij at  for bringing this to our attention http://eightwone.com/2011/05/14/exchange-2010-sp1-live-migration-supported/. So now we have the best features in virtualization at our disposal and that simply rocks. We read:

“Exchange server virtual machines, including Exchange Mailbox virtual machines that are part of a Database Availability Group (DAG), can be combined with host-based failover clustering and migration technology as long as the virtual machines are configured such that they will not save and restore state on disk when moved or taken offline. All failover activity must result in a cold start when the virtual machine is activated on the target node. All planned migration must either result in shut down and a cold start or an online migration that utilizes a technology such as Hyper-V live migration.”

“Microsoft Exchange Server 2010 SP1 supports virtualization of the Unified Messaging role when it is installed on the 64-bit edition of Windows Server 2008 R2. Unified Messaging must be the only Exchange role in the virtual machine. Other Exchange roles (Client Access, Edge Transport, Hub Transport, Mailbox) are not supported on the same virtual machine as Unified Messaging. The virtualized machine configuration running Unified Messaging must have at least 4 CPU cores, and at least 16 GB of memory.”

And it is NOT ONLY for Hyper-V, look at the Exchange Team blog here “The updated support guidance applies to any hardware virtualization vendor participating in the Windows Server Virtualization Validation Program (SVVP).’” Nice!

Anyone who’s at TechEd USA 2011 in Atlanta should attend EXL306 for more details. Huge requirements yes, but the same goes for physical servers. That’s how they get the performance gains needed, it’s done by lowering IO by using large amounts of RAM.

Think about the above statement, we now have support for host clustering with live migration, possibly together with technology like for example Melio (SanBolic) on the software side or Live Volume (Compellent) on the storage side to protect against SAN Failure (local or remote) and combined with DAG high availability for the databases in Exchange 2010 (which can be multi site) this becomes a very resilient package. So to come back to my other post on a brighter future for public folders, if they can sort out this red headed stepchild of the Exchange portfolio they have covered all their bases and have a great platform with the option of making it better, easier and cheaper to implement, operate & use. No one will argue with that.

I know some people will say all this is overkill, to complex, to much or to expensive. I call it having options. When the S* hits the fan and you’re “in the fight of your life” wading your way through one or multiple IT disasters to keep that mail flow up an running it is good to have multiple options. Options mean you can get the job done using creativity and tools. If you have only one tool and one option Murphy will catch up with you. Actually this is one of my most heard shout outs to the team “give me options” when problems arise. But at what cost do these options come? That is up for the business and you to decide. We’re getting very robust options in Exchange that can be leveraged with other technologies for high availability that have become more and more main stream. This means none of all this needs to be bought and implemented just for Exchange. They are already in place. Unless your IT “strategy” the last 10 years was run Windows 2000 & Exchange 2000 until the servers fall apart and we don’t have any more spares available on e-bay before we consider moving along.

Déjà vu Bug: The network connection of a running Hyper-V virtual machine may be lost under heavy outgoing network traffic on a computer that is running Windows Server 2008 R2 SP1

Anyone who’s been doing virtualization with Hyper-V on Windows 2008 R2 has a good change of having seen the issue described in http://support.microsoft.com/kb/974909/en-us

You install the Hyper-V role on a computer that is running Windows Server 2008 R2.

  • You run a virtual machine on the computer.
  • You use a network adapter on the virtual machine to access a network.
  • You establish many concurrent network connections, or there is heavy outgoing network traffic.

In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter is disabled.
Note You have to restart the virtual machine to recover from this issue.

We’ve seen this one on VM’s that have indeed a lot of outgoing traffic.  In our environment the situation looks like this:

  • You can access the VM with Hyper-V Manager or SCVMM but not via RDP as all Network connectivity is lost.  The status the  guest NIS is always “Enabled” but there is no traffic/connectivity
  • You can try to disable the NIC but this tales a  very long time and when you try to enable it again this never succeeds. Disconnecting the NIC form the virtual network and connecting it again doesn’t help either.
  • You need to shut down the host but this takes an extremely long time, so long you really can’t afford to wait if it ever succeeds. It seems to hang at shutting down with a “non whirling whirly”.  So finally you’ll power off the VM and start it up again. Apart from entries related to having not connectivity the event logs are “clean” and there is no indication as to what happened.

Well this exact same issue is back with Windows 2008 R2 SP1. That’s the bad news. The good news is there is a hotfix for it already so you can fix it. You can read up on this issue in Knowledge Base article 2263829  and request the hotfix here. Instructions to get the hotfix are in there as well as a reference to the previous fixes for Windows 2008 R2 RTM.

Consider the following scenario:

  • You install the Hyper-V role on a computer that is running Windows Server 2008 R2 Service Pack 1 (SP1).
  • You run a virtual machine on the computer.
  • You use a network adapter on the virtual machine to access a network.
  • You establish many concurrent network connections. Or, there is heavy outgoing network traffic.

In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter may be disabled.
Notes

  • You must restart the virtual machine to recover from this issue.
  • This issue can also occur on versions of Windows Server 2008 R2 that do not have SP1 installed. To resolve the issue, apply the hotfix that is described in one of the following Microsoft Knowledge Base articles:

    974909 (http://support.microsoft.com/kb/974909/ ) The network connection of a running Hyper-V virtual machine is lost under heavy outgoing network traffic on a Windows Server 2008 R2-based computer
    2264080 (http://support.microsoft.com/kb/2264080/ ) An update rollup package for the Hyper-V role in Windows Server 2008 R2: August 24, 2010

Oh yeah, people often seem confused  as to where to install the hotfix. Does it go on the Hyper-V hosts or and/or on the guest?  It’s a hyper visor bug in Hyper-V so it goes on the hosts. Have a nice weekend.