Nuclear data waste

Introduction

Nowadays everyone seems to be heading to the hills to try and cash in on the new gold rush. Data! You have all heard that data is the new gold. Some call it the new oil, as that is their favorite fantasy, and that’s all good. But there are drawbacks to this.

The cost of data and data waste

Like any resource you mine, data comes with associated costs. Those costs have to be covered by the value you derive from it. That value has to exceed the money spent to gather, process and consume it. That can be an expensive business.

On top of that you have to deal with “the waste” it creates as a byproduct. Waste can be toxic. Mining data tends to produce nuclear data waste, the bad kind where “safe levels” are hard to determine. In the rush to grab the gold many forget a couple of important lessons from history. We should know by now that we need to act proactively to avoid waste. That is the cheapest option in the long run and mitigates many of the risks. We should also know that not all data is gold. Some of it just glitters, but it isn’t valuable. Fool’s data, like fool’s gold, is essentially worthless no matter how much money you have spent. Even worse, it still produces nuclear data waste that can get you into (legal) trouble.

Data storage and backups

How much data do you need to get to the gold, and at what cost? Storage capabilities as well as storage capacity grow fast and cost seems under control for now. But will this last forever? And even if so, what’s the ratio of data gold versus raw data stored? Can we improve this ratio? Because even when things are cheap, why even do it if it is not needed?

Protecting the data and the waste

And then there is the cost of protecting that data as well as the governance around it. The sad reality with data is that once you have it, the probability that it will get you into trouble is real. Data, sooner or later, will get lost, misplaced, sold, hacked, leaked, … it’s almost guaranteed. Ask any real InfoSec professional (not the standard issue, policy-quoting security officers, those are just window dressing) and they will open your eyes to the reality of the risks. It’s very sobering.

The gold rush

As with any hype or gold rush we can avoid costly mistakes by looking at history. Think about who benefited and who lost out. Think about why this happened and how. Can you see any parallels?

  • Many people are drawn to the data gold fields. Very few strike a gold vein.
  • There is a lot of money to be made selling the tools, supplies and gear to mine the data, process it, store and protect it.
  • Gathering raw data and processing it can be highly toxic.
  • Storing and protecting the gold is expensive and hard.

Let’s dive a bit deeper into these issues, what these mean and how they materialize. They all have one thing in common for sure and that is that the fear of missing out is one of the driving factors.

Gold diggers

The reality is that the many people who now become “data scientists” are not all highly skilled mathematicians and experts at statistical analysis. It’s a new hype, just like OLAP tools and data mining were before. We now have BI, big data and data science. That’s where the gold can be found so that’s where gold diggers flock to. Some have the skills, abilities and the luck to derive wealth from that. Most will just have the job of digging.

Pick Axe

There is a lot more data than there is science in the hype created around data scientists. Data scientists should be great at math and statistics. Those are fields of human endeavor that do not scale well. They are not even popular. Attaching “scientist” to something doesn’t make it a science. Be sure of the quality of your gold and make sure it is not fool’s gold. But the gold rush is on. There’s money to be made. In an era where science is viewed by many as “an opinion”, the urge to derive some credibility from adding “science” to any endeavor is a paradox. It is on the rise. Clearly, this shows the value of real science, even when some only seem to like it when it suits their agenda.

But the field is exploding as companies want people working on all the raw data they collect. As one statistician stated: “I used to be a boring, underpaid geek with glasses, now that I’m a data scientist I’m cool, in demand and paid very well even if the work is less scientific.” That’s the nature of the beast. Her employer got a real statistician, but as the mines require a lot more bodies, many will make do with less.

Merchants

The “sure” money is in the supply chain. Storage, networking, compute … no matter where it is (cloud, fog or on-premises computing). There is money to be made with tools to process the data, protect it (backups) and secure it against unwanted prying eyes and theft. If you’re selling any of those, business is booming.

Everyone seems obsessed with collecting data. Luckily storage costs are down per GB and we can store ever more. We also need to protect more. But who ever deletes data? Who dares push the button? A lot of data is collected “just in case”. We might find gold in there later, and if we don’t have it we certainly cannot. The fear of missing out in action. That is great if you’re selling stuff. Data lakes, data ponds, storage blobs or tables, MongoDB or SQL PaaS, storage arrays, data processing technology and data protection. These can be products or services, it doesn’t matter, there is money to be made. And while you’re selling you’re not asking the buyers if they really need it. You don’t question them, you praise their insights and help them protect their investment. Everyone is doing it, so must you. The copy/paste strategy in action.

Nuclear data waste

While the vast growth in data is spectacular, a lot of it is crap. But there is very little effort put into being selective. It’s too cheap right now to collect and store it. No one wants to say “we don’t need it” and be the one to blame if you don’t have it.

But in the age of data leaks, hackers, privacy concerns and ever more legislation around data protection it’s worth making sure you don’t store data just because you can. Storing data holds inherent risks: the risk of losing it, corrupting it, deriving faulty information from it, leaking it, or having it stolen or abused.

In the age of GDPR and many other rightful privacy and data protection concerns collecting data should be treated like nuclear power. The value it brings is undeniable. But you don’t need vast amounts of nuclear fuel to deliver that value. You do need very good processes, fail safes, regulation, capable people and technology.

We should start looking at data as nuclear fuel and as such, after use and processing, part of it is left as toxic nuclear data waste. It’s a long-term toxic byproduct of the process of deriving information from data. Minimize the collection and storage of data to achieve your goals at minimum cost and risk. Luckily, we have a very good solution for toxic data waste. You can delete it and wipe it securely.

We have to stop thinking that more is better when much of it is junk. The overhead of caring for that junk is ridiculous. We might have to do so for nuclear waste out of need. But for data there are alternatives. Destroy it if you don’t need it. It’s the safest way to handle the legal and reputation risks related to it. That will take a conscious effort.

Efforts and costs

Critical thinking about collecting data is lacking. That is understandable. A lot of the money to be made in data mining is in providing the tools to collect, process, store and protect the data. Even with many of the people who warn us about the security issues and legal responsibilities around it, it is often about selling services and products. For many all this might turn out to be a lot like the other gold rushes. There were a lot more suppliers of the tools that got rich than actual finders of profitable gold mines. This means there is also a lot of pressure and incentive to feed the “data is the new gold” beast.

Where a SQL database or a data warehouse at least meant you had to put effort into collecting the data, the rise of unstructured data technologies means way too often we don’t care and we’ll figure it out later. Imagine doing that to nuclear fuel! For now, the technical advances in storage and data technologies have allowed us to act without too much deliberation on the sanity of our choices. That might change. It might be wise to avoid the cold shower when it does and benefit from minimizing toxic data risks today.

Conclusion

Now true data gold is very valuable, but make sure you can recognize it. Just going through the motions and buying the tools, copying “in the know” statements from the internet isn’t going to cut it. That is called pretending. Sure, it’s fun. It is also a very dangerous and costly mistake when things get real. At best you look like an idiot with money. Many sales people will separate you from your money very efficiently.

The smarter organizations already have a data strategy that includes waste avoidance, reduction and management. Many don’t, unfortunately. Collecting data is the main goal for those, driven by the tyranny of action over strategy. You have to be seen acting and being in charge. The buzzwords have to be present and you have to come across as a “can do sir, yes sir” person. Well, that is what will kill you. The late Norman Schwarzkopf knew this all too well.

Take care of your weaknesses, figure them out before they hurt you and before they destroy your ability to exploit your strengths. That, people, is a strategy exercise. I can do that for you and it will cost you a lot of money. But remember, strategies are not products you can buy, they are not commodities and as such buying them is a paradox. A strategy is what will give you the edge over your competitors. If you have others determine your strategy, your competitors will pay them more to find out. So, roll up your sleeves and put in the effort yourself. In the end, it’s all about common sense and this is true for data mining, AI and BI as well.

Webinar: 3 Emerging Technologies that will change the way you use Hyper-V

I have the pleasure of doing a webinar on October 24th 2017 with two fellow MVPs. They are Andy Syrewicze from Altaro, who organize the webinar, and Thomas Maurer, who’s a well-known expert in the tech community and beyond on Cloud and Data Center technologies.

The subject of the webinar is 3 Emerging Technologies that will change the way you use Hyper-V. It’s a panel style discussion among the 3 of us on technology and trends that affect everyone in this business.

  • Public cloud computing platforms such as Azure and AWS
  • Azure Stack and the complete abstraction of Hyper-V
  • Containers and microservices: why they are game changers

image

There will be time for questions and discussions as we expect the subject to be of great interest to all. For my part I’ll try to look at the bigger picture of the technologies, both from a product and service perspective as well as from a strategic point of view, as part of a doctrine to achieve an organization’s goals.

Happy New Year from a renewed Microsoft MVP in 2016

It’s January 1st 2016, late in the afternoon here local time and I have just received great news to start the new year with. It came by way of an e-mail notifying me I have been renewed as a Microsoft Most Valuable Professional (MVP).

The Microsoft MVP Award provides us the unique opportunity to celebrate and honor your significant contributions and say “Thank you for your technical leadership.”

image

So it’s time for a Happy New Year from a renewed Microsoft MVP in 2016. My expertise is now Cloud and Data Center Management. It’s quite an honor to be renewed. Somewhere, people think I make a big enough difference to be recognized, and that caresses my ego just a little bit. More importantly however it means I get the opportunity to keep working with a lot of passionate and talented people. The ability to participate in a global community and ecosystem focused on our areas of expertise is something I have enjoyed for many years now. Attending the MVP Summit is the cherry on the cake and they sure do make you feel welcome at every place you stop on and around the campus.

image

My fellow MVPs are always very helpful. They are both an inspiration and a source of tremendous experience and knowledge. Being an MVP has opened opportunities to both learn and teach, both professionally and personally. That’s what enabled me to grow in depth and breadth within my areas of expertise, which ultimately translates into our new expertise assignment, Cloud and Data Center Management.

Thank you!

It’s a good time to wish you all a happy New Year. Let me take a moment to express my gratitude to all loyal or accidental readers of WorkingHardInIT. A blog without readers would be a sad thing but luckily you’re all reading this blog more and more, year after year.

image

I’m grateful to you for your continued support and for spending the time reading my blog. To the people, businesses and organizations that have given me so many opportunities and so much support, and with whom I had the pleasure to work in 2015, I say thank you and let’s continue to do so. I wish you all a marvelous 2016 with lots of joy, good health and tons of fun at and outside of work!

The road ahead

2016 will be an interesting year. There’s a lot going on in our industry, some of it is hype, a lot of it is real. That reality is sometimes sobering but often inspiring. Keep cool, don’t panic or go ballistic. Smart discipline with a good portion of common sense, insights and a solid, yet flexible plan wins the day. You’ll also need some luck and turn up at the right place at the right time every now and then, ready to make the most of an opportunity. You get the idea.

There are and have been, as always, personal and professional challenges. That’s a given. Only newbies and idiots make picture perfect plans. They then get “dazzled” by the first punch on their snout, which sends their plans falling apart like shattered glass. Sometimes the challenges are bigger and harder. This can mean you need to work even harder, smarter and perhaps even longer. It can also mean cutting your losses and disengaging. No matter how good you are, how long, hard and smart you work, you cannot right all wrongs in this world. Leave that to the self-promoting LinkedIn blogs on “personal success and growth” aimed at ridiculously entitled people or the painfully naïve.

Importance

2016 will also know its challenges. They will be met with all the attention and dedication required where and when needed. They will be passed by or ignored where the effort just isn’t worthwhile. There are good places to go, nice things to do and great people to meet. If I can seize as many opportunities in 2016 (TechEd, ITPROCeed, E2EVC, VEEAMON, Microsoft MVP Summit, ExpertsLive) as I have been able to do in 2015 I’ll be a happy man, both professionally and personally.

How to get a dream job in 2016?

I’ve been asked that a couple of times. I’m not one for handing out personal advice, that would only shock your parents and potentially shake your worldview as well. Professionally I’d say your profession, your career, is not the same as your job. It might be, but more often than not it isn’t. That’s OK. You can build a career in your (chosen) profession even despite your job or jobs. Most MVPs work very hard and we put a lot of personal time into our technical skills and community. It isn’t a lifestyle of the rich and famous as some would think when reading a blog about a conference or summit.

VEEAMON 2015 Party

Those are a fun part of work, that’s for sure, but they don’t define our work days. It’s lots of work, learning and sharing, and many battles are uphill! We all have jobs that require us to do things we’d rather not have to do. Do what you need to do to stay afloat but try to do as much of what you like and enjoy as possible. Do it smart and don’t waste your time or let others waste it. The latter is something you should not do to other people either. When it comes to jobs, for most people it’s not as simple as the sloganesque “Do what you love, versus work for money/the man/a pension/security”. Sure, most don’t like to admit that they have to take crap, but we all do. Anything else is as much BS as every employer that seems to pretend everybody has to be and is an engaged, inspired team player who’s going all out for the company, above and beyond what the job demands. That’s a bit too much like Office Space’s “Is this good for the company?” for comfort 😉

Production Checkpoints in Windows Server 2016

We’ve had snapshots, or better, checkpoints as we call them now for consistency amongst products, for the longest time in Hyper-V. I have always liked them and used them to my benefit. That’s what they are intended for. But you have to use them correctly, in a supported and smart manner. Some (or perhaps a not insignificant number of) people did not read the manual and/or do not test their assumptions before trying something in production. Sometimes that leads to a lesson, sometimes it leads to tears.

We now have the choice between two types of checkpoints: Production Checkpoints and Standard Checkpoints.

With a standard virtual machine checkpoint, all the memory state of running applications gets stored, and when you apply the checkpoint it’s back, as if by magic. Doing this to a production SQL or Exchange Server, for example, causes (huge) problems. With some applications these problems are minor or transient, but it’s not a healthy, consistent state to be in and recovery has to happen. That recovery could happen automatically or require disaster recovery, depending on the situation at hand.

Production checkpoints are made in an application consistent manner. For this they leverage the Volume Shadow Copy Service (VSS), or File System Freeze on Linux, which puts the virtual machine into a safe state to create a checkpoint that can be restored like a VSS based, application consistent backup or SAN snapshot. This does mean that applying a production checkpoint requires the restored virtual machine to boot from an offline state, just like with a restored backup.

The choice of checkpoint type can be made on a per virtual machine basis, which makes it flexible, as you can pick the best option for a particular virtual machine for a specific purpose. As you might have guessed, that still requires some insight, reading the manual and testing your assumptions. But you can now have the behavior you want, the behavior way too many assumed they already had.

image

We also have the option of allowing or disallowing a standard checkpoint to be made if, for any reason, a production checkpoint cannot be made (for example because VSS in a Windows guest or file system freeze in a Linux guest does not work or is not available). Here’s the table from MSDN of what type of checkpoint can be used when. I conclude that the chosen default is the best fitting one for most scenarios.

image

You also have the option of choosing standard checkpoints for a virtual machine. That gives you the exact same behavior as in all previous versions of Hyper-V.
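The choices described above map onto the values that the Hyper-V `Set-VM` cmdlet accepts for its `-CheckpointType` parameter. What follows is only a minimal sketch, assuming a Windows Server 2016 Hyper-V host with the Hyper-V PowerShell module and a virtual machine called `DemoVM` (a hypothetical name, not one from this post):

```powershell
# 'DemoVM' is a hypothetical virtual machine name, used for illustration only.
# Pick ONE of the values below:
#   Production     - production checkpoint, with a standard checkpoint as fallback
#   ProductionOnly - production checkpoint only, no fallback to standard
#   Standard       - standard checkpoint, the behavior of previous Hyper-V versions
#   Disabled       - no checkpoints can be taken for this virtual machine
$checkpointType = 'Production'

Set-VM -Name 'DemoVM' -CheckpointType $checkpointType
```

Keeping the value in a variable makes it easy to reuse the same line in a script that handles several virtual machines.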

I love the GUI for ad hoc work, but when I need to do this on dozens or hundreds of virtual machines, or potentially tens of thousands when running a larger private cloud, this is not the way to go. PowerShell is your long-time trusted friend here!

image

As you can see, with just the “-CheckpointType” parameter you control the checkpointing behavior. And as it is very easy to grab all virtual machines on a host or cluster, setting this for all or a selection of your virtual machines is easy and fast. Let’s set it to “ProductionOnly” and grab the setting for that VM via PowerShell.

image
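In text form, that could look roughly like the sketch below. It assumes the Hyper-V PowerShell module on a Windows Server 2016 host; `DemoVM` is a hypothetical virtual machine name, and the cluster example additionally assumes the FailoverClusters module is available and that you run it on a cluster node.

```powershell
# 'DemoVM' is a hypothetical virtual machine name, used for illustration only.

# Set a single virtual machine to production checkpoints without fallback
Set-VM -Name 'DemoVM' -CheckpointType ProductionOnly

# Read the setting back for that virtual machine
Get-VM -Name 'DemoVM' | Select-Object -Property Name, CheckpointType

# Do the same for every virtual machine on the local host
Get-VM | Set-VM -CheckpointType ProductionOnly

# Assumption: FailoverClusters module present, executed on a cluster node.
# Grab the virtual machines on all cluster nodes and set them in one go.
$nodes = (Get-ClusterNode).Name
Get-VM -ComputerName $nodes | Set-VM -CheckpointType ProductionOnly

# Verify the result on the local host
Get-VM | Sort-Object -Property Name | Select-Object -Property Name, CheckpointType
```

To target only a selection, filter the output of `Get-VM` (for example with `Where-Object` on the name) before piping it to `Set-VM`.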

When you create a checkpoint on a Windows Server 2016 Hyper-V host you’ll even get a nice notification by default (you can turn it off) that the production checkpoint used backup technology in the guest operating system.

image

It’s also important to realize that this capability is the basis of the new checkpoint based way of making backups in Windows Server 2016 as well. But that’s a subject for another blog post. Thank you for reading!