Business-Intelligence

Book Review: Data Cleaning with Power BI

I was excited to see Gus Frazer’s new book on Data Cleaning with Power BI. Our friends at PackT sent a copy for me to take a look at.

Gus has a background with both Power BI and Tableau, and it’s always interesting to see a mix of perspectives in any book. In this book, he shows a variety of data cleaning/cleansing tasks, then how to do many of them with M in Power Query. And you can tell Gus has a teaching background. because chapters have review questions. The book could well support a course.

2024-06-10

Book Review: Extending Power BI with Python and R

I’ve seen a few books lately from the PackT people. The latest was the second edition of Extending Power BI with Python and R by Luca Zavarella: Perform advanced analysis using the power of analytical languages.

Author

Luca Zavarella

The author is Luca Zavarella. I’ve been working with Power BI since before it was released, and ever since I’ve seen discussions around using R (initially) and Python (later), Luca has been one of those people that everyone listens to.

2024-05-06

SQL Down Under show 85 with guest Bob Duffy discussing building data warehouses is now published

And another new SQL Down Under show is published!

Once again, I had the great pleasure yesterday to record a podcast with one of my UK based friends Bob Duffy.

Bob is a Principal Data Architect at Prodata’s SQL Centre of Excellence in Dublin, Ireland. He helps enterprise customers build large scale enterprise data warehouses in Azure, typically integrating ERP and financial systems.

Bob is a long-term Data Platform MVP, a Microsoft Certified Architect, Master, and Analysis Services Maestro. He holds a number of Azure role-based certifications. Bob also specialises in performance tuning, architecture, load testing, DevOps, Software engineering, MQ architecture, semantic modelling and meta data driven design on the MS platform.

2024-03-07

SQL: Understanding Change Data Capture for Azure SQL Database - Part 4 - Azure SQL Database Service Level Objectives

This is part 4 of a series on working with change data capture (CDC) in Azure SQL Database. This part discusses how to enable it and how to use it.

When I was reading the documentation for CDC in Azure SQL Database, I kept coming across a mention that it required at least an S3 service level objective (SLO), if you were using a DTU-based database.

I really hoped that wasn’t the case.

2023-09-23

Azure Data Factory (ADF) - Issues with parsing strings in CSVs

It’s scary how much of the world’s data lives in CSVs. Yet, as a standard, it’s problematic. So many implementations just don’t work as expected. Today, I had a client asking about why Azure Data Factory wouldn’t read a CSV that, while odd, was in a valid format.

The simplest equivalent of the file that wouldn’t load, would be one like this:

First Column,Second Column,Third Column,Fourth Column
12,Terry Johnson,Paul Johnson,031-23423
13,Mary Johnson,"Paul,Johnson",031-23423
14,Mia Johnson,"Paul ""the beast"", Johnson",031-23423
16,Cherry Johnson,Paul Johnson,031-23423

There are meant to be four columns. The source system wrapped strings in quotes only when the string contained a comma, as that was the delimiter for the file.

2023-09-12

Fabric Down Under show 3 with guest Mark Pryce-Maher discussing SQL Warehousing in Microsoft Fabric

I had the great pleasure to get to spend time today, talking with Mark Pryce-Maker. We were discussing the SQL warehousing experience in Microsoft Fabric.

Mark is a Senior Program Manager at Microsoft, where he’s working closely with the Microsoft Fabric team. He has a strong background in many areas of data, most recently prior to Fabric, with helping customers migrate to Azure Synapse Analytics. He has worked extensively with the dedicated SQL pools in that service.

2023-09-07

ADF: Replacing carriage returns in Data Factory expression language

This one had me stumped lately.

I had an variable in ADF where the value contained carriage returns. I needed to remove them.

I guessed that would be easy:

@replace(variables(‘fileContent’), ‘\r’, ‘’)

But no matter what variation of this I tried, it wouldn’t remove them.

Thanks to the genius of fellow MVP Simon Sabin, the answer was:

@replace(variables(‘fileContent’),base64ToString(‘DQ==’),’’)

Note: if you need a line feed (LF) it’s base64ToString(‘Cg==’) and a carriage return / line feed pair (CRLF), is base64ToString(‘DQo=’).

2023-04-24

FIX: Data Factory ODBC linked service fails to Apply and returns Internal Server Error

I was working with a client who has having trouble debugging an ADF pipeline, related to an ODBC linked service not working as expected.

The user had configured the connection string property of an ODBC connection this way:

  •  Set a parameter to the linked service as ServiceDSN
  •  Configured the connection string as @concat(‘DSN=’,linkedService().ServiceDSN)

The Test Connection for that worked fine, but when you click Apply, it fails with an Internal Server Error. Specifically, the error was:

2023-04-24