DevOps: Is AIOps just yet another almost meaningless acronym?

DevOps has quickly become a core part of how many organizations deliver IT, and in particular, how they deliver applications. But just as quickly as it has become popular, a whole series of XXXOps names have appeared. One of the latest is AIOps. So is it just yet another almost meaningless acronym?

Well, as Betteridge's Law of Headlines suggests, the answer is no.

When I first saw the term, I presumed it would be about how to deploy AI-based systems, and I wondered why on earth that would need a special name. But that's not what it is.

So what is AIOps?

AIOps is the use of artificial intelligence (AI) and machine learning (ML) techniques to allow us to analyze IT problems that occur so that we can respond to them fast enough to be useful.

The core problem is that we're now generating an enormous volume of metric and log data about every step of all our processes, and about the health of all our systems and applications, yet so much of that data is never processed, or at least not processed fast enough to be useful.

Only machines can do this.
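To make that a little more concrete, here is a minimal sketch of the simplest kind of check a machine can run continuously over a metric stream: flag any value that sits far outside the range of the readings just before it. The function name, the window size, the threshold and the sample response times are all made up for illustration; real AIOps tooling uses far richer models and correlates many sources.

```python
from collections import deque
from statistics import mean, pstdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return the indexes of values that deviate sharply from the preceding window."""
    recent = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(values):
        if len(recent) == window:
            baseline = mean(recent)
            spread = pstdev(recent)
            # A perfectly flat window has no spread, so it is skipped
            # rather than dividing by zero.
            if spread > 0 and abs(value - baseline) / spread > threshold:
                anomalies.append(index)
        recent.append(value)
    return anomalies


# Hypothetical response times in milliseconds, with one obvious spike.
response_times_ms = [100, 102, 98, 101, 99, 100, 103, 450, 101, 99]
print(find_anomalies(response_times_ms))  # [7]
```

Note that the spike itself then pollutes the window for the next few readings, which is one of the reasons production systems need something smarter than this.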

The term AIOps seems to have been coined by Will Cappelli (a former analyst with Gartner). In the end, humans won't be scouring logs and responding to what they find. Instead, they'll be teaching the machines what to look for, and how to correlate information from a variety of sources to find what is really going on.

Cappelli is now at Moogsoft and sums up AIOps quite succinctly:

AIOps is the application of artificial intelligence for IT operations. It is the future of ITOps, combining algorithmic and human intelligence to provide full visibility into the state and performance of the IT systems that businesses rely on.

People are already doing this, but it's likely that it will become a well-known job role in the future. It will be important to guide the machine's learning, to teach it to recognize the appropriate patterns.

If you are working in related IT roles, it might be time to start to add some data science, AI, and/or ML into your learning plans.

AI: Machine Learning and AI – What's in a name?

I regularly hear the terms AI and Machine Learning used almost interchangeably, along with a variety of other related terms. I thought it would be useful to add a post that defines some of the common terms and how they differ:

Artificial intelligence (AI) is a fairly generic term. It relates to all intelligent agents that are able to be aware of their environments (in some way), and to take actions where the aim is to achieve a specified goal. Sometimes these goals are terminal, i.e. they reach a final desired state. Other times, these goals are continuous, e.g. keeping speed at a desired value. It is considered "artificial" intelligence because, to an observer, it mimics cognitive functions that humans would imagine other humans performing.

Machine Learning (ML) describes a form of "learning" where a system improves its model of a specific behavior (i.e. it "learns"). It can then use the model to predict future outcomes. Machine Learning is considered a field of Artificial Intelligence. There are many types of Machine Learning.

The most common form of Machine Learning today is Data Mining, where the model is trained by analyzing existing outcomes, and then used to predict future outcomes. (This is usually called Predictive Analytics.)

The learning can be supervised (e.g. here are pictures of dogs, is this other picture a dog?), unsupervised (e.g. what are the common types of an object?), or a combination of the two (often called semi-supervised).
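As a toy illustration of the difference, the sketch below uses animal weights rather than pictures. The supervised function is given labelled examples and predicts a label for a new value; the unsupervised one is given no labels at all and simply finds two natural groups. Both functions, and all the data, are invented for this example, and the techniques used (nearest neighbour and a one-dimensional two-means clustering) are just two simple choices among many.

```python
def classify_nearest(labelled_examples, new_value):
    """Supervised: predict the label of the closest labelled example."""
    closest_value, closest_label = min(
        labelled_examples, key=lambda example: abs(example[0] - new_value)
    )
    return closest_label


def split_into_two_groups(values, rounds=10):
    """Unsupervised: 1-D two-means clustering, with no labels supplied."""
    low_centre, high_centre = min(values), max(values)
    for _ in range(rounds):
        # Assign each value to whichever centre it is closer to.
        low_group = [v for v in values if abs(v - low_centre) <= abs(v - high_centre)]
        high_group = [v for v in values if abs(v - low_centre) > abs(v - high_centre)]
        # Move each centre to the average of its group.
        low_centre = sum(low_group) / len(low_group)
        if high_group:
            high_centre = sum(high_group) / len(high_group)
    return sorted(low_group), sorted(high_group)


# Supervised: weights (kg) that someone has already labelled.
labelled_weights = [(4.0, "cat"), (5.0, "cat"), (25.0, "dog"), (30.0, "dog")]
print(classify_nearest(labelled_weights, 22.0))

# Unsupervised: the same kind of data, but nobody has said what is what.
unlabelled_weights = [4.0, 5.0, 4.5, 25.0, 30.0, 28.0]
print(split_into_two_groups(unlabelled_weights))
```

The unsupervised version can tell you that there are two groups, but not that one of them is called "dog". That naming is exactly what the labels add.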

Deep Learning is a form of Machine Learning where the models comprise many layers. "Deep" refers to the number of layers, not to any specific ability or insight. These models often do an amazing job, and in some cases are already performing better than humans at specific tasks such as speech-to-text transcription.

Reinforcement Learning is another form of Machine Learning that typically involves working out optimal ways for software agents to operate within defined software environments. Game theory, simulation experiments, etc. often form part of Reinforcement Learning. One common way to represent these environments is as a Markov Decision Process (a mathematical framework that defines the rules for decision making and the goals and rewards involved).
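To show what a Markov Decision Process looks like in code, here is a minimal sketch with two states and a handful of actions, each leading to possible next states with a probability and a reward. It is solved with value iteration, which is a classic planning method for an MDP whose rules are fully known, rather than a full reinforcement learning algorithm (where the agent has to discover the rules by trial and error). The states, actions, rewards and discount factor are all invented for the example.

```python
def value_iteration(transitions, discount=0.9, tolerance=1e-9):
    """Return (state values, best action per state) for a known MDP.

    transitions[state][action] is a list of
    (probability, next_state, reward) tuples.
    """
    values = {state: 0.0 for state in transitions}

    def action_value(outcomes):
        # Expected reward now plus discounted value of where we end up.
        return sum(
            probability * (reward + discount * values[next_state])
            for probability, next_state, reward in outcomes
        )

    while True:
        largest_change = 0.0
        for state, actions in transitions.items():
            best = max(action_value(outcomes) for outcomes in actions.values())
            largest_change = max(largest_change, abs(best - values[state]))
            values[state] = best
        if largest_change < tolerance:
            break

    policy = {
        state: max(actions, key=lambda action: action_value(actions[action]))
        for state, actions in transitions.items()
    }
    return values, policy


# A made-up world: moving usually reaches the goal, but sometimes fails.
toy_world = {
    "start": {
        "wait": [(1.0, "start", 0.0)],
        "move": [(0.8, "goal", 10.0), (0.2, "start", 0.0)],
    },
    "goal": {
        "stay": [(1.0, "goal", 0.0)],
    },
}

state_values, best_actions = value_iteration(toy_world)
print(best_actions)
```

In this toy world the best action from "start" is "move", because the expected reward of trying outweighs the guaranteed zero from waiting.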





AI: Detecting and Avoiding Customer Churn is Critical

I've flown a lot over the years. What continues to strike me though, is how poorly airlines use machine learning and AI, even when they are in strong competitive environments. A key indicator is detecting and avoiding customer churn. Let me give you an example:

We flew with QANTAS and with their partners in One World for many years. We were both platinum and I'd been platinum for many years. At a recent peak a few years ago, we were flying once or twice a week. That's not a crazy amount, but it's enough. And it's certainly enough to be able to see a purchasing pattern.

We finally got pretty fed up some years back, and one February, we said "enough", and stopped flying with them.

But what fascinated me was how the airline reacted to that.

At the end of the year, I got an email pointing out they'd noticed that I didn't fly as much that year, but because of previous custom, they'd keep my status in place.

After another whole year of not flying, I got another email saying they'd noticed a drop in my custom, and that they'd have to drop me down to Gold. Same thing again the next year.

But at no stage did they ever seek to work out what went wrong.

When could they have actually detected the change? Probably a month or two after we stopped. At that point, there's always a chance you can recover the situation. Every business teacher will tell you how much harder it is to gain a customer than to avoid losing them in the first place, and how very much harder it is to regain a lost customer.

Look at your own businesses, and ask yourself if you have systems in place to detect changes in your customers' behaviors, particularly if they've stopped dealing with you. Don't just detect total volumes over each year. Look for changes in behavior.
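As a rough sketch of what "look for changes in behavior" could mean in practice, the check below compares how long a customer has been silent with their own typical gap between purchases. The function name, the multiplier, the minimum history and the dates are all assumptions made up for the example; a real system would tune these and use more signals than purchase dates alone.

```python
from datetime import date, timedelta
from statistics import median


def looks_like_churn(purchase_dates, today, gap_multiplier=3.0, min_history=5):
    """Flag a customer whose current silence is far longer than their usual gap."""
    ordered = sorted(purchase_dates)
    if len(ordered) < min_history:
        # Not enough history to know what "normal" looks like.
        return False
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    typical_gap = max(median(gaps), 1)
    days_silent = (today - ordered[-1]).days
    return days_silent > gap_multiplier * typical_gap


# A hypothetical weekly flyer whose last flight was in early February.
weekly_flights = [date(2018, 1, 1) + timedelta(weeks=week) for week in range(6)]

print(looks_like_churn(weekly_flights, today=date(2018, 2, 12)))  # False
print(looks_like_churn(weekly_flights, today=date(2018, 3, 15)))  # True
```

For a customer who normally buys every week, five or six weeks of silence is already a loud signal. Waiting for an annual total to drop throws that signal away.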



Opinion: What's with the lack of coding standards in Data Science?

I've been spending a lot of time over the last few years working through data science and AI topics. One thing that's struck me consistently is the total lack of reasonable coding standards in almost all the sample code that I see.

I was doing an AI lab on edX recently, and one of the questions got me to open some sample Python code for a virtual environment, and asked me to work out how the virtual world that it created operated.

After working on it for quite a while, I realized that the #1 reason I was finding it hard was not that the concepts were crazy difficult. It was that the person writing the sample thought it was reasonable to have variables, arrays, etc. with names like r, x, np, d, and so on.

What's with that?

Suddenly it felt like I was reading code written by a self-taught programmer in 1970, at their first attempt at using Basic. There is absolutely no need for anyone to be doing this.

Please don't.

I was left wondering who on earth would write this and, interestingly enough, found that the person who translated the environments was in fact self-taught. I admire his efforts in teaching himself, but this is not acceptable code to be sharing with anyone else.

There is no reason for data science or AI code in Python, R, or whatever language to be written like this. (Yet I see it all the time.)
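Here is the kind of difference I mean. This is not the code from the lab; it's a made-up example of the same calculation written twice, once with single-letter names and once with names that say what the values are.

```python
# Hard to follow: what are r, x and d?
def f(r, x, d):
    return r + d * x


# The same calculation, with names that explain themselves.
def discounted_return(reward, next_state_value, discount_factor):
    """Immediate reward plus the discounted value of the next state."""
    return reward + discount_factor * next_state_value


print(f(1.0, 10.0, 0.9) == discounted_return(1.0, 10.0, 0.9))
```

Both functions return exactly the same number. Only one of them can be read without hunting back through the rest of the file.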

ML: Properties of the LaunchPad service – changing the concurrent session limit

In a recent post, I discussed issues I found when testing the Machine Learning Services setup in SQL Server 2017.

After that, my old friend Niels Berglund also posted about issues he found after installing CU7 (cumulative update 7) and how he solved them. Niels' article is here:

What each of those articles discussed, though, was how temporary files are used by Machine Learning Services in SQL Server to hold R or Python data for sessions. By default, SQL Server configures itself to hold data for up to 20 concurrent sessions.

You can change that number.

In SQL Server Configuration Manager, find the LaunchPad service:

In the Properties page for the service, select the Advanced tab. In the main image above, you can see the value that you can change to increase the number of concurrent sessions.

Values up to 100 are permitted.

SQL Server also creates passwords for these "external" users. If you have a policy that requires regular password changes, you'll see that there's also an option in this window to let you change all of them.

(It's worth noting that current NIST guidelines say that you shouldn't force regular password updates anyway.)




AI and ML: Why have machine learning in SQL Server at all?

In a post the other day, I described how to test if machine learning with R and/or Python was set up correctly within SQL Server 2017.

One of the comments on that post said that the info was useful, but that they were yet to be convinced why you'd want to have machine learning in the database in the first place.

Fair question.

I see several reasons for this. SQL Server Machine Learning Services is the result of embedding a predictive analytics and data science engine within SQL Server. Consider what happens in most data science groups today, where this type of approach isn't used.

I routinely see data scientists working with large amounts of data in generic data stores. This might mean that they have data in stores like Hadoop/HDInsight or Azure Data Lake Store but in many cases, I just see operating system files, often even just CSV files. Both the R and Python languages make it really easy to create data frames from these types of files. But where did this data come from? In some cases, it will have come from the generic data store, but in most cases that I see, it has come from within a database somewhere.

And that raises a number of questions:

  • What effort is required to extract that data (particularly for large volumes)?
  • How up to date is the data?
  • What is the security context for that data?

Often the answers to these questions aren't great. What I see is data science people extracting data from existing databases into CSV files, and then loading them up and processing them in tools like RStudio. And mostly, I see that data being processed single-threaded in those tools.

The outcome of this work, though, is either analytics or (more commonly) trained predictive models.

Having Machine Learning in SQL Server helps here in several ways. First, you can utilize the same security model that you're using for any other access to that same data. Second, as the data volumes grow, you don't need to move (and then refresh) the data. You can process it right where it is. Third, you can take advantage of the multi-threaded architecture of SQL Server.

With Operational Analytics in SQL Server 2016 and later (basically non-clustered columnstore indexes with delayed aggregation, built over transactional data), you might even be able to have the outcomes really up to date.

While being able to train and retrain predictive models is really important, and is hard work, it's when you use those models to create predictions that the real value becomes apparent. Trained models are quite lightweight execution-wise. You can add predictions right into your queries along with your other returned data, and very efficiently. This is where having Machine Learning within the database engine truly shines.

And you don't necessarily even need to create the predictive models. The SQL Server team have provided a series of world-class pretrained models that you can load directly into, and bind to, an instance of SQL Server.



Machine Learning: Testing your installation of R and Python in SQL Server 2017

One of the wonderful additions to SQL Server in 2016 was the R language. In SQL Server 2017, Python was also added, and the combination of both with SQL Server was rebranded as Machine Learning Services.

Why would you want these installed? The most common answer is to enable you to run predictive analytics.

But I've found that at many sites, getting R and/or Python installed turned out to be more complicated than it seemed.

Once you have the features added (the in-database options and not the standalone options) for R and Python, you need to enable the execution of external scripts. That's easy enough:

You need to restart the SQL Server service after doing this.

Now you can try to execute this script to see if the features are working:

If all is good, you'll see that it worked. Both values would be returned.
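As a rough sketch of those two steps, the Python below holds the T-SQL for enabling external scripts and a small pass-through test for each language, plus a helper that runs the tests. The T-SQL uses the documented sp_configure option and sp_execute_external_script procedure, but the exact test scripts are my own minimal versions, and the helper assumes you pass it a DB-API cursor (for example from pyodbc) that is connected to your instance. The helper name and the dictionary layout are hypothetical.

```python
# Run this once, then restart the SQL Server service before testing.
# Assumption: the connection uses autocommit, because RECONFIGURE
# cannot run inside a user transaction.
ENABLE_EXTERNAL_SCRIPTS = """
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;
"""

# Each test just passes a single value through the external runtime.
TEST_SCRIPTS = {
    "R": """
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- InputDataSet;',
    @input_data_1 = N'SELECT 1 AS Value';
""",
    "Python": """
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'OutputDataSet = InputDataSet',
    @input_data_1 = N'SELECT 1 AS Value';
""",
}


def check_runtimes(cursor):
    """Run each test script and return {language: first row or error text}."""
    results = {}
    for language, statement in TEST_SCRIPTS.items():
        try:
            cursor.execute(statement)
            results[language] = cursor.fetchone()
        except Exception as error:  # report the failure instead of stopping
            results[language] = f"failed: {error}"
    return results
```

You can equally paste the T-SQL strings straight into a query window. The point of the helper is only that each language is tested separately, so a broken R setup doesn't hide a working Python one.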

If you haven't changed the default SQL Server configuration though, I don't think that's what you'll see. More likely, you'll see this:

Msg 39012, Level 16, State 1, Line 4
Unable to communicate with the runtime for 'R' script. Please check the requirements of 'R' runtime.

STDERR message(s) from external script:
Fatal error: cannot create 'R_TempDir'

So, why does that happen? It's SQL Server's attempt at telling you that R doesn't like paths with spaces in them, and that's what the default configuration has. (I have no idea why).

Open Notepad (or your favorite editor) as an administrator, and navigate to this file:

The file rlauncher.config holds the configuration for the R feature.

That path for the working directory isn't going to work. Now you can change the values in it to 8.3 filenames like they have for RHOME in the first line, or, my preference, point it to a different temp folder.

I then make sure that C:\Temp actually exists, and that there's an ExtensibilityData folder under that. I then copy in all the folders from the original folder:

These are used for different processes running from the SQL Launch Pad.

Then restart both the SQL Server service and the SQL Launch Pad service and try your script again.

If you still have no luck, chances are high that the security isn't correct for the folders. Ensure that the SQL Launch Pad service account has full control on the ExtensibilityData folder and all sub-folders. By default, the service account will be MSSQLLaunchPad but you can check it in SQL Server Configuration Manager.

I'd restart both services again just for good luck, and then hopefully you'll see a response from both the queries.

Then you're ready to start investigating Machine Learning in SQL Server.




AI: New Microsoft Professional Program in Artificial Intelligence

In the last year or so, there has been a quiet revolution going on with how Microsoft delivers training and certification.

Previously, the main option was Microsoft Official Curriculum (MOC) courses delivered by Certified Learning Partners. For some years, I've been saying that I don't see that as the longer-term model for Microsoft. I believe that's for three reasons:

  • The learning experiences team in Microsoft have needed to be a profit center.
  • The product groups want as much information out there as possible and as free as possible.
  • The creation and delivery processes for MOC courses don't lend themselves well to constantly-evolving information.

That has to lead to real challenges within the company.

The partnership that Microsoft has formed with edX is an interesting alternative. If you haven't been involved with edX, they are an amazing organization that allows you to access some of the best learning in the world, mostly for free. If you'd like to see some of the best lecturers from MIT, Harvard, Berkeley, Hong Kong Polytechnic, University of British Columbia, etc. you can now do that. You can use it to learn almost anything, right from your home and at your leisure.

So where does Microsoft fit into this?

Microsoft have been putting many courses up onto edX and you can learn all the material for free. This fits directly with the needs of the product groups, to get information about their products and services, and how to use them, out there for everyone.

So what about certification?

Microsoft still needs to be able to certify people. When you take a course at edX, you have the option to choose a Verified course. This currently (typically) costs around $99 USD per course. And if you pass the right combination of courses, you can achieve one of Microsoft's Professional Program certificates. Here are the current tracks:

So you can choose to learn any of it for free, or pay to be verified and certified. That's a great combination. I currently see this as a much better learning model than the previous official curriculum model which was far too hard to keep up to date.

I've previously completed the Data Science track, and the Big Data track, and hope to complete the DevOps track this week.

But what has me really interested is the new Artificial Intelligence Track. The AI track requires 10 courses, and there is a small overlap with the Data Science track. In my case, as soon as I'd enrolled, I found that I had 3 courses already credited.

They were:

  • Introduction to Python for Data Science
  • Data Science Essentials
  • Principles of Machine Learning

The Python topic was optional in the Data Science track so those that did the R courses would not have this one. (Luckily I decided to do both the R and Python courses as I had an interest in both).

I'm looking forward to this track. Here are the overall areas covered:

I'd encourage you to check it all out and to consider enrolling if it's of interest to you.

Opinion: You have to live and breathe the technology to be good at it

Digital Transformation and Cloud Transformation are phrases that I hear bandied around at nearly every large organization that I'm currently doing consulting work for.

Yet, in so many cases, I can't see the organization achieving the changes required. This is for two core reasons:

  • The first is that the culture within the organizations is a major hurdle. There just isn't enough flexibility to think outside the box about alternative ways to work.
  • Worse (and probably more concerning), I see these companies taking advice on how to make these transformations from companies who don't themselves "get it".

An organization that is cloud-antagonistic internally, and stuck in an endless IT management quagmire, isn't likely to make a good cloud transformation itself, and it's certainly not going to be a good partner to help you make a cloud migration or implement a cloud transformation within your company.

An organization that doesn't use business intelligence (BI) or analytics internally isn't going to be able to help you make that transition either.

If the organization is claiming to be proficient in an area of technology, ask them about the use that they are making themselves of those same technologies. As a simple example, ask them about their internal analytics that they can see on their own phones.

To be any good at any of these areas of technology, companies need to live and breathe them daily. If they don't, find someone to help you who does.

R Tools for Visual Studio

In recent months, I've been brushing up my R skills. I've had a few areas of interest in this:

* R in Azure Machine Learning

* R in relation to Power BI and general analytics

* R embedded (somewhat) in SQL Server 2016

As a client tool, I've been using RStudio. It's been good and very simple but it's a completely separate environment. So I was excited when I saw there was to be a preview of new R tooling for Visual Studio.

I've been using a pre-release version of R Tools for Visual Studio for a short while but I've already come to quite like it. It's great to have this embedded directly within Visual Studio. I can do everything that I used to do in RStudio but really like the level of IntelliSense, etc. that I pick up when I'm working in R Tools for Visual Studio.

So today I was pleased to see the announcement that these tools have gone public. You'll find more info here in today's post from Shahrokh Mortazavi in the Azure Machine Learning blog: