WorkingHardInIT Blog Maintenance Window & Tools Used

As you might have noticed, my blog was down last night for about 1 hour and 45 minutes between 22:20 and 00:10. That was a bit longer than I wanted, but I needed more time to deal with the upgrade of MySQL as part of the routine maintenance I do on my WordPress blog server.

In the environments under my care I take the time to do routine maintenance to avoid falling too far behind on firmware, drivers, patches, etc. This takes some effort, but as it helps prevent bigger issues in the long run it’s worthwhile. I take the same approach with my blog as much as possible. Most of this maintenance goes by without you ever noticing, the Windows Update reboots being the exception. WordPress upgrades, plugin upgrades, PHP upgrades, etc. usually all go swiftly, which means I can do them frequently and stay pretty well covered there.

Upgrading MySQL, however, is always a bit of a time consuming effort, and depending on which version you’re upgrading from and to, it can actually mean multiple sequential upgrades (5.1 to 5.5.44 to 5.6.25).

image

I practiced this upgrade on a copy of the VM in Azure to make sure I could handle whatever came up, and still I had to deal with some challenges I did not encounter in the test environment. That shows that I’m not a full time hard core MySQL guru, I guess.

Anyway, after getting from MySQL 5.5.44 to 5.6.25, fixing some issues with the “TIMESTAMP with implicit DEFAULT value is deprecated” warning (an easy fix), and dealing with this error in MySQL Workbench:

An unhandled exception occurred (Error executing 'SELECT t.PROCESSLIST_ID,
IF (NAME = 'thread/sql/event_scheduler','event_scheduler',t.PROCESSLIST_USER) PROCESSLIST_USER,t.PROCESSLIST_HOST,t.PROCESSLIST_DB,t.PROCESSLIST_COMMAND,
t.PROCESSLIST_TIME,t.PROCESSLIST_STATE,t.THREAD_ID,t.TYPE,t.NAME,t.PARENT_THREAD_ID,
t.INSTRUMENTED,t.PROCESSLIST_INFO,a.ATTR_VALUE FROM performance_schema.threads t
LEFT OUTER JOIN performance_schema.session_connect_attrs a ON t.processlist_id = a.processlist_id AND (
a.attr_name IS NULL OR a.attr_name = 'program_name') WHERE t.TYPE <> 'BACKGROUND''
Native table 'performance_schema'.'threads' has the wrong structure.
SQL Error: 1682). Please refer to the log files for details.

which I fixed by running mysqld --performance_schema, I’m rocking everything up to date once more.

image

Always have good backups, make exports of your database schema, data and structures in MySQL and have multiple ways out when things go south. In Azure I’m relying on Backup Vault, where I protect my virtual machine with scheduled backup jobs. I also back up my WordPress with the data via a plugin and export the database via MySQL Workbench.
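A command-line dump can complement the MySQL Workbench export. What follows is a minimal sketch, not the exact procedure used for this blog: the function name, the option file and the output folder are hypothetical, and it assumes mysqldump.exe is on the PATH and that the credentials live in a client option file.

```powershell
# Sketch only: builds the argument list for a full logical dump with mysqldump.
# The function name, paths and file naming are illustrative assumptions.
function Get-MySqlDumpArgument {
    param(
        [Parameter(Mandatory)] [string] $OptionFile,    # client option file holding user/password
        [Parameter(Mandatory)] [string] $OutputFolder,  # folder where the .sql file should land
        [datetime] $Timestamp = (Get-Date)
    )

    $dumpFile = Join-Path $OutputFolder ("all-databases-{0:yyyyMMdd-HHmm}.sql" -f $Timestamp)

    # --defaults-extra-file has to be the first option on the command line.
    # --result-file lets mysqldump write the file itself, which avoids
    # PowerShell re-encoding the dump when output is redirected with '>'.
    @(
        "--defaults-extra-file=$OptionFile"
        '--all-databases'
        '--single-transaction'   # consistent snapshot for InnoDB tables
        '--routines'             # include stored procedures and functions
        '--triggers'
        '--events'
        "--result-file=$dumpFile"
    )
}

# Usage (commented out so the sketch can be loaded without MySQL installed):
# $dumpArgs = Get-MySqlDumpArgument -OptionFile 'C:\Backup\dump.cnf' -OutputFolder 'D:\Dumps'
# & mysqldump.exe @dumpArgs
# if ($LASTEXITCODE -ne 0) { throw "mysqldump failed with exit code $LASTEXITCODE" }
```

Keeping the argument list in a function makes it easy to review what will be dumped before running it as a scheduled job ahead of a maintenance window.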

image

Those dumps are copied out of the VM to wherever I want (Azure, OneDrive, home PC, a VM running in AWS …) to make sure I have multiple options to recover.

image

VEEAM FastSCP for Microsoft Azure comes in very handy for this by the way. You might want to check it out if you’re in need of an automated and secure way to get data out of a VM running in Microsoft Azure!

Dell Storage Replay Manager 7.6.0.47 for Compellent 6.5

Recently, Replay Manager version 7.6.0.47 became available to us as a DELL Compellent customer. I downloaded it and found some welcome new capabilities in the release notes.

  • Support for vSphere 6
  • 2048 bit public key support for SSL/TLS
  • The ability to retry failed jobs (Microsoft Extensions Only)
  • The ability to modify a backup set (Microsoft Extensions Only)

The ability to retry failed jobs is handy. There might be a conflicting backup running via a 3rd party tool leveraging the hardware VSS provider, so the ability to retry can mitigate this. As we do multiple replays per day and have them scheduled recurrently, we had already mitigated the negative effects of this, but this gives us more options to deal with such situations. It’s good.

image

The ability to modify a backup set is one I love. It was just so annoying not to be able to do this before. A change in the environment meant having to create a new backup set. That also meant keeping around the old job for as long as you wanted to retain the replays associated with that job. Not the most optimal way of handling change I’d say, so this made me happy when I saw it.

image

Now I’d like DELL to invest a bit more in making restores of volume based replays of virtual machines easier. I actually like the volume based ones with Hyper-V as it’s one snapshot per CSV for all VMs and it doesn’t require all the VMs to reside on the host where we originally defined the backup set. Optimally you do run all the VMs on the node that owns the CSV, but otherwise it has fewer restrictions. In my humble opinion anything that restricts VM mobility is bad and goes against the grain of virtualization and dynamic optimization. I wonder if this has more to do with older CSV/Hyper-V versions, current limitations in Windows Server Hyper-V or CSV, or a combination. This makes for a nice discussion, so if anyone from MSFT & the DELL Storage team responsible for Replay Manager wants to have one, just let me know.

Last but not least I’d love DELL to communicate in Q4 of 2015 on how they will integrate their data protection offering in Compellent/Replay Manager with the Windows Server 2016 backup changes and enhancements. That’s quite a change that’s happening for Hyper-V and it would be good for all to know what’s being done to leverage that. Another thing that is high on my priority list for success is to enable leveraging replays with Live Volumes. For me that’s the biggest drawback to Live Volumes: having to choose between high/continuous availability and application consistent replays (for data protection and other use cases).

I have some more things on my wish list but these are out of scope in regards to the subject of this blog post.

Free VEEAM Endpoint Backup Goes RTM – First Upgrade Experiences!

VEEAM Endpoint backup has gone RTM and that’s great news. I’ve been using it since the beta version with great results. I moved to the release candidate when that became available and now I’m running RTM. The version number of the RTM bits is 1.0.0.1954.

image

You can download it here and put it into action straight away!

Quick Tips & Findings

There is no supported upgrade path from the beta release. As a matter of fact the RTM version cannot read the backup files made by the beta. When trying to upgrade from beta to RTM you’ll be greeted with this message:

image

Now that’s OK. You should have been on the RC already and there things are better. Mind you, there’s no way to do an in-place upgrade either, but it can read the backups made by the RC version!

image

With a clean install (green field or after uninstalling the beta or RC version) the installation will kick off.

image

Now in the case of our RC backups we tested 2 things:

  • Can we restore the existing backups? Yes we can!

image

  • How are the backups made by the RTM version handled in regards to the already present ones? We just reconfigured the backups to the same repository and kicked off a backup. A new backup job folder was created and the backup was made there. So our DBA’s great self service SQL Server backup offloading repository made with the RC is still available for restores, while RTM backs up to its own new folder.

image

Well there you go, VEEAM Endpoint Backup just got launched in production. We still have to wait for the production ready update for integration with VEEAM Backup & Replication v8 but that will arrive soon enough. The future looks bright.

Optimizing Backups: PowerShell Script To Move All Virtual Machines On A Cluster Shared Volume To The Node Owning That CSV

When you are optimizing the number of snapshots to be taken for backups or are dealing with storage vendors’ software that leverages their hardware VSS provider, you sometimes encounter requirements that are at odds with virtual machine mobility and dynamic optimization.

For example when backing up multiple virtual machines leveraging a single CSV snapshot you’ll find that:

  • Some SAN vendor software requires that the virtual machines in that job are owned by the same host or the backup will fail.
  • Backup software can also require that all virtual machines are running on the same node when you want them to be protected using a single CSV snapshot. The better ones don’t let the backup job fail, they just create multiple snapshots when needed, but that’s less efficient and potentially makes you run into issues with your hardware VSS provider.

image

VEEAM B&R v8 in action … 8 SQL Server VMs with multiple disks on the same CSV being backed up by a single hardware VSS writer snapshot (DELL Compellent 6.5.20 / Replay Manager 7.5) and an off host proxy.

Organizing & orchestrating backups requires some effort, but can lead to great results.

Normally when designing your cluster you balance things out as well as you can. That helps to reduce the need for constant dynamic optimizations. You also make sure that, if at all possible, you keep all files related to a single VM together on the same CSV.

Naturally you’ll have drift. If not, you have a very stable environment or are not leveraging the capabilities of your Hyper-V cluster. Mobility, dynamic optimization and high to continuous availability are what we want, and we don’t block that to serve the backups. We do try to help out the backups as much as possible, however. A good design does this.

If you’re not running a backup every 15 minutes in a very dynamic environment you can deal with this by live migrating resources to where they need to be in order to optimize backups.

Here’s a little PowerShell snippet that will live migrate all virtual machines on the same CSV to the owner node of that CSV. You can run this as a script prior to the backups starting or you can run it as a weekly scheduled task to prevent the drift from the ideal situation for your backups becoming too large, requiring more VSS snapshots or even causing failing backups. The exact approach depends on the storage vendors and/or backup software you use in combination with the needs and capabilities of your environment.

Clear-Host

$Cluster = Get-Cluster
$AllCSV = Get-ClusterSharedVolume -Cluster $Cluster

ForEach ($CSV in $AllCSV)
{
    Write-Output "$($CSV.Name) is owned by $($CSV.OwnerNode.Name)"

    #We grab the friendly name (the mount point path) of the CSV
    $CSVVolumeInfo = $CSV | Select-Object -ExpandProperty SharedVolumeInfo
    $CSVPath = ($CSVVolumeInfo).FriendlyVolumeName

    #-match treats its right-hand side as a regular expression and \ is the regex escape character,
    #so we escape the path. We also anchor it so that Volume1 does not match Volume10 as well.
    $FixedCSVPath = '^' + [regex]::Escape($CSVPath.TrimEnd('\')) + '(\\|$)'

    #We grab all VMs whose owner node is different from the owner node of the CSV we're working with
    #From those we grab the ones that are located on the CSV we're working with
    $VMsToMove = Get-ClusterGroup | Where-Object {($_.GroupType -eq 'VirtualMachine') -and ($_.OwnerNode.Name -ne $CSV.OwnerNode.Name)} | Get-VM | Where-Object {($_.Path -match $FixedCSVPath)}

    ForEach ($VM in $VMsToMove)
    {
        Write-Output "`tThe VM $($VM.Name) located on $CSVPath is not running on host $($CSV.OwnerNode.Name), which owns that CSV,"
        Write-Output "`tbut on $($VM.ComputerName). It will be live migrated."
        #Live migrate that VM to the node that owns the CSV it resides on
        Move-ClusterVirtualMachineRole -Name $VM.Name -MigrationType Live -Node $CSV.OwnerNode.Name
    }
}
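
The step that decides whether a VM lives on a given CSV is a regular expression match on the path, so it can be tried out without a cluster. Below is a minimal sketch of that test in isolation. Test-PathOnCsv is a hypothetical helper name and the paths are made-up examples. It assumes the CSV mount point is a prefix of the VM path.

```powershell
# Sketch only: decides whether a path lives on a given CSV mount point.
# Test-PathOnCsv is a hypothetical helper, not part of any module.
function Test-PathOnCsv {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $CsvPath
    )

    # [regex]::Escape neutralises the backslashes (and any other regex
    # metacharacters) so the CSV path is matched literally.
    $escaped = [regex]::Escape($CsvPath.TrimEnd('\'))

    # Anchor at the start and require a path separator or the end of the
    # string, so that ...\Volume1 does not also match ...\Volume10.
    $Path -match ('^' + $escaped + '(\\|$)')
}

Test-PathOnCsv -Path 'C:\ClusterStorage\Volume1\SQL01'  -CsvPath 'C:\ClusterStorage\Volume1'   # True
Test-PathOnCsv -Path 'C:\ClusterStorage\Volume10\SQL02' -CsvPath 'C:\ClusterStorage\Volume1'   # False
```

An unanchored match on the escaped path alone would report both example VMs as residing on Volume1, which matters as soon as a cluster has ten or more CSVs with default names.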

Now there is a lot more to discuss, e.g. what and how to optimize for virtual machines that are clustered. For optimal redundancy you’ll have those running on different nodes and CSVs. But even beyond that, you might have the clustered VMs running on different clusters, as the cluster is the failure domain. But I get the remark that my blogs are wordy and verbose, so … that’s for another time.