Certification: First Experience of Searching Learn Content During Exams

I took AI-102 yesterday. It was the first exam I've taken where Learn content was available. There's an icon at the bottom centre of the screen, beside the one that lets you take a break. (I haven't used that break option yet, but it was also a great recent addition.)

Did it help?

There were a couple of places where I found it helped. It's not going to teach you the content, even though the training parts of Learn are available; that would mostly take far too long. What it worked well for was confirming the purpose of a specific option, and similar details.

Questions are meant to be designed around the application of knowledge, not around pure knowledge. (Yes, I know some of those still exist.)

Navigation

The way it works is that a browser appears on the right-hand side of your screen, and your question stays on the left. For most multiple-choice questions, that's easy enough, but it's a little more challenging (screen real estate wise) for case studies.

What was hard to work with was moving the exam monitoring/counter popup around, to keep it out of the way of the browser tabs. It was never a problem with the exam pages, as the popup was designed to be as small as possible, to stay out of the way. But I found it hard to move: you need to drag it like a normal window, yet there's almost no blank space on it that you can click without triggering something, instead of just selecting it.

The team could improve that experience by making sure the popup has some blank space that works well as a handle for dragging.

I was recently asked if there's a limit to the number of tabs you can have open, as someone mentioned that he'd had a large number open. I found it refused to open more after about five (I didn't note the exact number). Any more than that would be awkward to work with anyway.

The Search Experience

What I do find hard with it is searching for content. The search bar in Learn is not a good way to search Learn. Let's face it: almost no-one searches Learn content by going to the home page and using the search bar. I'd guess that 99% of traffic to Learn comes from search engines, and they do a far better job of indexing the content.

We're not going to get a Google search in there (i.e., what most people would actually use), but it would be interesting to have the option of using Bing search with an enforced site restriction to Learn. I didn't try to find out, but I'm guessing there's a limit on the external links that you can reach from within Learn.

Limit the Scope

Another important note: in the search, there are options to restrict the scope of the search. Make sure you use them!

Scope options in Learn search

I was doing an AI exam, and at first, I'd go looking for a class name and get results from Dynamics, etc. that I didn't want. Remember that it searches all of Learn: all products, all documentation, training materials, tutorials, and more. Using the scope options for this exam, I could say just .NET and Azure, and the results were far better.

Timing and Rabbit Holes

After getting into the exam, I knew I had plenty of time, as I was pretty confident with the material. I'd spent weeks prepping and doing the labs, etc., and a lot of it was familiar before I started. (Note for any certification people reading: the labs for AI-102 really need updating.)

Because of that, I searched as I went, but I'd strongly recommend that you don't do that unless you're really confident with the content.

I did find that when I tried to research topics, I really burned time. In fact, I burned way, way more time than I was expecting. Most of that was because the search option isn't good. But you could easily go down long rabbit holes with this, and simply lose track of time.

A rabbit
Unsplash image by Gary Bendig

I'd recommend that as you go through, you choose your best answer even if you have doubts, but flag the question for later review. You're better off making sure you've answered everything, then going back to research what you need to check, based on your available time.

Summary

Thanks to the certification team for a great new option! Two areas I'd love to see some work on:

  • Making the popup thing easier to move around
  • Considering whether a restricted Bing search might be a better option

 

Book Review: Technology Operating Models for Cloud and Edge

I was recently sent a copy of a new book by Ahilan Ponnusamy and Andreas Spanner, called Technology Operating Models for Cloud and Edge: Create your purpose-built distributed operating model for public, hybrid, multicloud, and edge.

It was interesting to read this book.

I spend more of my time on the Azure side of the fence than the AWS/Red Hat side of things, so I have some pretty different opinions to the authors on many of the topics. I have worked with multiple clouds, though, so what they were describing was familiar anyway.

I particularly liked their discussion near the beginning of the book that covers what often goes wrong in cloud moves. They talked about issues like:

  • Unclear direction-setting attempts like 'cloud first' that left teams unsure of their future or what to do.
  • Moving to the cloud being mistaken for innovation.
  • Unrealistic goals to move large numbers of applications to the public cloud in an established enterprise without significant change management.
  • High unanticipated bill shock.
  • Lift-and-shift shortcuts that don't leverage cloud on-demand scale-out/in features and, as a result, don't lead to the desired business-demand-aligned pricing.

I regularly write articles on making real cloud transformations, not just migrations. Too many people talk transformation but end up doing migrations, and then enter the well of disappointment. And for their users, things are even worse.

So I was glad to see their discussion warning against focusing on technology instead of transitioning people, processes, and culture to enable cloud ROI. And I had a chuckle when I read their recommendation to not start with logos that we want on our CVs. I can't tell you how often I've seen teams making that mistake.

Robust, Antifragile, Application Lifecycles, and Data

I'm not 100% convinced by the "Robust is out. Antifragile is in." discussion. I get why they're arguing that, but I see it differently. Similarly, I now view application lifecycles as very different to the:

  • innovate
  • operate
  • retire

options discussed.

Given my work focus, it's not entirely surprising that I differ a great deal from them on how data should be handled today.

Edge Services and Hybrid Cloud

The authors must be seeing far more edge-based services than I commonly see in the market. They had an interesting discussion on that. Quite a large part of the book is dedicated to tackling edge services.

As for hybrid cloud arrangements, I'm quite pleased to see the lead Microsoft has taken in their Arc-enabled offerings.

Vendor Lock-In

Finally, there are their arguments around avoiding vendor lock-in. This is a topical one for me. I see people worried by this, and ending up with lousy outcomes because of it.

It reminds me of application code that's written to avoid being locked into specific databases. Invariably, you end up with code that's not great with any database. And eventually, code gets added that's specific to one database. So then you have the worst of all possible worlds: poor code that's database agnostic, yet lock-in anyway because eventually, work just had to be done. And the kicker: I've never seen anyone who does this ever change databases anyway.

The same applies to multi-cloud thinking. What the authors don't really discuss is that it's easy to end up with outcomes that work equally poorly on multiple platforms.

And it's a similar discussion in relation to committing to only use open source software. I understand the sentiment, and I've used many of the services that they mention in the book. I just haven't loved those services the way I do many of the options that aren't open source.

In today's world, your company can get run over by competitors who are quicker to market by using the best aspects offered by a single platform.

Summary

Regardless, I'm glad they've written on these topics. Even if you don't agree, it gives you a list of topics to consider.

7 out of 10

Contributing to Microsoft Learn Community Content

Microsoft Learn has a community content pilot running right now. They are looking to see how content from community members might provide different value compared to the content written by their own team members.

So, I decided that I should submit an article to see how it goes, and to find out how it works.

At present, it's all hosted in a GitHub repository, and no surprise, is all written in Markdown.

To contribute, you fork the repository, choose a template, and create your content from the template, following the rules contained in it. You also need to add your content to the table of contents. Finally, you raise a pull request, and a build agent checks that your content follows the rules. If the team are then happy, you get author access and can merge your content in. Eventually, your new article should appear on Microsoft Learn.
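In practice, the Git side of that flow looks something like this (a sketch; the account and branch names are placeholders):

```
# After forking MicrosoftDocs/community-content on GitHub:
git clone https://github.com/<your-account>/community-content.git
cd community-content
git checkout -b my-new-article

# Copy a template, write your article, and update the table of contents,
# then push and raise a pull request against the upstream repo
git add .
git commit -m "Add new community content article"
git push origin my-new-article
```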

Here's the first article that I contributed:

https://learn.microsoft.com/en-us/community/content/remembering-the-users-in-cloud-transformation

As another example, here's another one from fellow MVP Luke Murray:

https://learn.microsoft.com/en-us/community/content/securing-backups-with-azure

I'd love to hear your thoughts on whether this style of content is useful, and if so, how it should be structured.

If you are interested in contributing, here's a link to get you started:

https://github.com/MicrosoftDocs/community-content/blob/main/CONTRIBUTING.md

You can also find more information in the Microsoft Learn Contributor Guide. It covers style, voice, etc.

https://learn.microsoft.com/en-us/contribute/content/

 

 

SQL: Understanding Change Data Capture for Azure SQL Database – Part 4 – Azure SQL Database Service Level Objectives

This is part 4 of a series on working with change data capture (CDC) in Azure SQL Database. This part discusses the service level objectives needed to use it.

When I was reading the documentation for CDC in Azure SQL Database, I kept coming across a mention that it required at least an S3 service level objective (SLO) if you were using a DTU-based database.

I really hoped that wasn't the case.

I was left saying "please say it ain't so!".

Testing

I thought it was time to find out what actually happens when you try it.

I started by creating a database with the lowest SLO (Basic). Now, I know a Basic database doesn't hold enough data to be very meaningful, but I wanted to know what would happen.

I didn't get far.

I created the table from the scripts in part 3 of this series, and then tried to enable CDC.
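The enable steps are just the standard CDC procedures (a sketch based on the part 3 scripts; the table name here is illustrative):

```sql
-- Enable CDC at the database level first
EXEC sys.sp_cdc_enable_db;

-- Then enable it for the table (name is illustrative)
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'NewEmployees',
    @role_name = NULL;
```

On the Basic database, the database-level enable failed, with an error indicating that the operation isn't supported at that service level (I'm paraphrasing the message).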

To say that's disappointing is an understatement. But it makes it clear that Basic, S0, S1, and S2 aren't going to let you enable it.

What about Scaling?

Then I wondered what would happen if I enabled it on an S3, and then tried to scale down.

So I scaled the DB to S3 without issue, and enabled CDC as in part 3. All worked as expected.

I then scaled the DB back to S2 using the Azure portal. And alas, that failed too.
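If you'd rather script that scale-down attempt than use the portal, it's something like this (a sketch; the database name is a placeholder):

```sql
-- Attempt to scale the database back down to S2
-- (this failed while CDC was enabled)
ALTER DATABASE [CDCTestDB] MODIFY (SERVICE_OBJECTIVE = 'S2');
```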

I suppose I'm not surprised, but I had hoped it might at least have automatically disabled CDC if it really had to.

It's a Mistake

I think these licensing restrictions for CDC in Azure SQL DB are a mistake.

Unhappy woman
Unsplash image from Getty

Back in SQL Server 2016 SP1, I was one of the main instigators in getting the product team to make sure that the programming surface of the database worked across all the editions (Enterprise, Standard, Developer, Express) wherever possible.

This restriction on CDC goes directly against that ethos. It harks back again to large single database thinking, not cloudy thinking.

I understand that it's probably needed for performance in production scenarios, but what I'm always trying to tell the team is that production scenarios aren't the only scenarios.

Developers also need to write code. We should let them write and test functionality. It can be clear to them that it won't work the same as on a production instance, but that's no different to many other aspects of the system. A Basic SLO database won't run a complex query over a lot of data either, but we don't expect it to.

But it's important to let them write their code and make sure it basically functions, no matter what level of SLO they're working with. If you require them all to have an expensive database just to test code or concepts, you greatly limit how many developers will use the features.

Clouds
Unsplash image by Dallas Reedy

Having those types of restrictions also reduces the cloudiness of the solution, in that you can't scale up and down across the full range of SLOs. We often take S7 or S8 databases and temporarily "park" them at S2 while they're not in use. This type of restriction kills those options as well, and makes the overall solution less cloudy.

And One More Part to This Series

That covers service level objectives for the use of CDC in Azure SQL DB. Next, we'll move on to accessing CDC data from another Azure SQL database. Here are the parts in this series:

  1. Why use Change Data Capture for Azure SQL Database?
  2. How Change Data Capture works in Azure SQL Database
  3. Enabling and using Change Data Capture in Azure SQL Database
  4. Change Data Capture and Azure SQL Database Service Level Objectives
  5. Accessing Change Data Capture Data from Another Azure SQL Database

 

Azure Data Factory (ADF) – Issues with parsing strings in CSVs

It's scary how much of the world's data lives in CSVs. Yet, as a standard, it's problematic, and so many implementations just don't work as expected. Today, I had a client asking why Azure Data Factory wouldn't read a CSV that, while odd, was in a valid format.

The simplest equivalent of the file that wouldn't load would be one like this:
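(The original sample isn't reproduced here, so the following is a reconstruction with made-up data that has the same shape, including the problem line.)

```
CustomerID,TradingName,City,Notes
1,Sample Trading,Sydney,Standard terms
2,"Bayside Goods, Retail",Melbourne,Standard terms
3,"The ""Original"" Gift Shop, Perth Branch",Perth,Net 30
4,Harbour Supplies,Brisbane,Net 30
```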

There are meant to be four columns. The source system wrapped strings in quotes only when the string contained a comma, as that was the delimiter for the file.

But the fourth line in that file would not load: ADF would split it across multiple columns, because it was not interpreting the double-double-quotes.

I was puzzled at first, as the file seemed OK. If I uploaded that file to any of the common online CSV validators or linters, they all said it was fine.

The client had configured the dataset with the default options, and those seemed OK to them.

It turns out the issue was with the escape character. ADF defaults to using a backslash, but for the majority of CSV files that I see, it needs to be a double-quote.
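In the dataset JSON, that's the escapeChar property on the DelimitedText dataset. Here's a trimmed sketch of just the relevant type properties (not the client's full dataset definition):

```json
{
    "type": "DelimitedText",
    "typeProperties": {
        "columnDelimiter": ",",
        "quoteChar": "\"",
        "escapeChar": "\"",
        "firstRowAsHeader": true
    }
}
```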

And sure enough, with that change in place, it loaded files similar to the one above without issue.

I don't know why that's not the default for CSV files in ADF, but I do hope this helps someone else.

Fabric Down Under show 3 with guest Mark Pryce-Maher discussing SQL Warehousing in Microsoft Fabric

I had the great pleasure of spending time today talking with Mark Pryce-Maher. We were discussing the SQL warehousing experience in Microsoft Fabric.

Mark is a Senior Program Manager at Microsoft, where he's working closely with the Microsoft Fabric team. He has a strong background in many areas of data; most recently, prior to Fabric, he was helping customers migrate to Azure Synapse Analytics, and he has worked extensively with the dedicated SQL pools in that service.

SQL is one of the core experiences for data engineers in Microsoft Fabric. In an upcoming show, I'll discuss the other core experience: Spark notebooks.

In the show, Mark provides a solid introduction to what's there today, and then we dive into many areas of how it works and how you might use it.

You'll find the show here: https://fabricdownunder.com

I hope you find it useful.

Book Review: Pro Power BI Architecture

I was pleased to see Reza Rad's latest book, Pro Power BI Architecture: Development, Deployment, Sharing, and Security for Microsoft Power BI Solutions, now out the door. Reza is an old friend, fellow Data Platform MVP, and fellow member of the Microsoft Regional Director program.

I was pleased to have been a technical reviewer for this book, and I hope that, along with the other reviewers, I've helped improve what was already a good book.

Not Just an Update

This is the second edition of the book that Reza produced in 2019, but it's not just a book with minor updates: most of it has been rewritten.

Reza is a book-writing machine. In this book, he has covered so very many aspects of architecture for Power BI, with an emphasis on reliability and ease of maintenance. In particular, I was pleased to see a discussion of environments, as that's often omitted in Power BI-related books, along with a discussion of how to save money by using the right licensing. Once again, that's a topic that's often not covered.

Target Audience and Style

The book is really for anyone who needs to build Power BI reports (often analysts and developers), but with a view to the bigger picture of how to structure a project so it continues to be usable as it (or the team) grows in size.

I like Reza's conversational style, and it shines through in his writing. When I read it, I often feel like I'm sitting in a room listening to him talk. That's a tough skill, and Reza does it effortlessly.

Summary

Great book, and well written.

8 out of 10

 

Book Review: Chernobyl 01:23:40 by Andrew Leatherbarrow

I've been fascinated by what happened at Chernobyl for a long time. Many of my readers wouldn't know that when I first started university, I was studying nuclear physics and mathematics. It wasn't long afterwards that I headed into computing but, at the time, there were very few degree courses in what's now called computing.

I decided to move on from that study for many reasons, but a primary one was that I could see Australia heading into a pretty solid ban on using nuclear power. The only reactor running in the country was a high-flux reactor at Lucas Heights, used primarily for producing medical isotopes. (It was replaced by a 20 MW open-pool light-water reactor in 2007.)

The ban on reactors in Australia has pretty much continued to this day, but there is now some renewed interest. Given the abundance of sunshine, wind, waves, etc. here, you wouldn't think we'd need them now, but over the years we've continued to mostly use coal. And while we've come a long way with renewables, we're still far, far behind where we need to be to provide our base-load needs from alternatives. For a start, we'd pretty much have to rewire the country.

It does fascinate me, though, that if you ask the average person here about nuclear power, they quote details of a handful of accidents, primarily from reactors built in the 1970s. There is an assumption that what would be built in 2023 is the same as what was built in 1971. Even allowing for the difference in severity, if we applied that logic to other areas, we would, for example, have stopped using aircraft long ago (drastically more people have died from aircraft accidents and incidents). And there seems to be a wilful blindness to the deaths directly and indirectly linked to the use of coal. Perhaps they just aren't as spectacular.

Chernobyl

I've seen a lot of material about Chernobyl over the years. I think my standout favourite was a 1996 Horizon documentary from the BBC called Inside Chernobyl's Sarcophagus. I saw it back in 1996, and it was the first time that I really started to understand what had happened there and, more importantly, how poorly the government reacted and how poorly those involved were treated. It was such a wake-up call about how technologies that we thought would work just didn't.

I'm not sure why Andrew Leatherbarrow's book Chernobyl 01:23:40 recently got me interested. It might have been the renewed public interest in the accident from the recent HBO/Sky miniseries. (I already knew enough about the accident to notice the amount of dramatic license taken by the miniseries.) I listened to Andrew's book on Audible.

The book was really not what I was expecting. And I say that in a good way.

Rather than just a dry rehashing of facts, it's Andrew's personal story. He had researched the accident for many years, and finally managed to make a trip to visit the site. For him, it was like a pilgrimage.

The book describes not only what he saw when he got there, but also the details of the trip, from the planning stages, to how he and his companions managed to get around when nearby, on site, and more.

I also particularly liked the detail he had on the legendary Chernobyl divers.

Summary

I have to say that I thoroughly enjoyed it.

8 out of 10

 

 

Service Principal vs Service Principle -> not the same thing!

I can't tell you how many times lately I've seen people writing about service principles when they really mean service principals. These are entirely different concepts!

I did a certification exam the other day, and it asked about service principles. I was left wondering how many people had reviewed that exam before it was released.

But what's worse is when people bake the wrong one into code libraries that others need to use. I was reading some content from Microsoft Learn, Quickstart: Manage data with Azure Cosmos DB Spark 3 OLTP Connector for API for NoSQL, and I saw this code example:
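(The quickstart's snippet isn't reproduced here, so the configuration below is my approximate reconstruction in PySpark terms; the exact property names are from memory, so treat them as indicative and check the current docs.)

```python
# Approximate reconstruction (not the exact Learn snippet):
# auth settings for the Cosmos DB Spark 3 connector
cosmos_config = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.auth.type": "ServicePrinciple",  # sic: the misspelled value
    "spark.cosmos.account.tenantId": "<tenant-id>",
    "spark.cosmos.auth.aad.clientId": "<client-id>",
    "spark.cosmos.auth.aad.clientSecret": "<client-secret>",
}
```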

Note the authType. I just thought it was a typo and was going to make a pull request to fix it. But then I went looking at the library it was using: the Azure Cosmos DB OLTP Spark 3 connector. What amazed me is that the typo runs right throughout the library.

I've passed this back to the Cosmos DB team, but it left me wondering what you should do when you see this sort of error in a library.

Do you just use the typo in your code, and forever propagate the mistake?

What should library authors do to avoid breaking existing code? Perhaps allow both the correctly spelled option and the wrong one?

Service Principal

A service principal is a security identity used by applications, services, or automated processes to access specific resources or perform tasks within an Azure Active Directory (Azure AD) tenant. It's most commonly a type of service identity that can be assigned permissions so that resources can be accessed in a secure manner. Service principals are very common in Azure, but they appear in many places across the IT industry.
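As a concrete example, here's one common way to create a service principal with the Azure CLI (a generic sketch; the display name is a placeholder):

```
# Creates an app registration and service principal, returning an appId,
# a client secret, and the tenant id for use by your application
az ad sp create-for-rbac --name my-demo-app
```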

Service Principle

A service principle really isn't something we talk about often, but if we did, it would perhaps be a rule that we try to follow when providing service. Richard Koch wrote a famous book called The 80/20 Principle, which is often called the Pareto Principle. Either way, it's got nothing to do with an identity that performs actions in IT systems.

 

 

Fabric Down Under show 2 with guest Josh Caplan discussing OneLake

I had the great pleasure of spending time today talking with Josh Caplan about OneLake.

Josh Caplan serves as a Principal Group Product Manager at Microsoft, where he's now leading product management for Microsoft OneLake. He has a strong background in managing products like Power BI, SQL Server Analysis Services, and Azure Analysis Services. Before his current role, Josh contributed to enhancing developer tools for Power BI and worked with Bing to harness its vast data resources.

OneLake is a foundational aspect of all things Fabric.

In the show, Josh provides a solid introduction to OneLake, and then we delve into many aspects of how it works and what it means for organizations.

You'll find it here: https://fabricdownunder.com

I hope you find it useful.