Book Review: Azure Data Factory Cookbook – 2nd Edition

The people at Packt recently sent me a book to review, and I was happy to do so as it was on a topic that's dear to my heart: Azure Data Factory. The book was Azure Data Factory Cookbook and it's the second edition of the book. The authors are Dmitry Foshin, Tonya Chernyshova, Dmitry Anoshin, and Xenia Ireton.

Packt

In the past, I wasn't keen on Packt books. When they first appeared, they tended to be low-cost books from unknown authors, many of whom struggled with writing in English, and the editing of the content was pretty poor.

I'm really pleased to see how this has changed in recent times. The authors of most of their books are now people who are knowledgeable about the topics and write well in English, and the editing has improved out of sight.

I'm not sure how that was achieved, but I'm really pleased to see that it was.

In terms of production, there are only two comments I'd make:

  • I still find the font style, size, etc. harder to read than an equivalent book from, say, Apress, particularly for long periods.
  • I know it's hard to ask for colour, but I have to agree with one of the reviewers on Amazon who commented that the lack of colour makes some of the pictures hard to read.

Other than that, the book was large, solid, and well-presented.

Content Style

I like books that are cookbook style. I used to think the same about books on topics like MDX and DAX. There is a place for books that teach the theory, but often what people need once they get past the basics are books that just say "if you're trying to achieve this, do this", and have a big list of recipes.

This book does that. Most of the topics are covered with walkthroughs that step you through how to do a task. I liked that approach.

Topic Coverage

This book covers a lot of topics. Given that the title of the book is about ADF, I was really surprised to see the breadth of topics that were covered. The subtitle is "A data engineer's guide to building and managing ETL and ELT pipelines with data integration", and that gives a clue to the fact that the coverage is much, much broader than ADF.

I was surprised to see so much coverage of pipelines in other places like Synapse Analytics, Fabric, etc. but more surprised to see coverage of HDInsight and big data concepts. I can't remember the last time I saw anyone using HDInsight. I always thought it was seriously over-hyped while it was being promoted, and still think the same way.

It made more sense to see a bunch of coverage of Databricks, delta tables, and integrating ADF with Azure Machine Learning, Azure Logic Apps, Azure Functions and more. They are relatively common areas of integration for ADF, along with migrating on-premises SSIS packages to ADF.

Note: in general, I don't like migrating SSIS packages to ADF in any way except rewriting them. Most of my customers never complain about the cost of using ADF. The only ones I hear complaining are people who use either the SSIS runtime for ADF or dataflows in ADF. (I don't like using those either.)

Summary

The book is substantial, well written, and comprehensive.

What I really would have liked is more ADF content. I don't want the book to be larger, but for a book with this title, I'd prefer more depth on how to do things in ADF and less on other related but ancillary topics.

7 out of 10

SQL Down Under show 89 with guest Erin Stellato discussing SQL Server and data-related tools is now published!

Another bunch of fun today recording a SQL Down Under show and it's now published!

This time I had the pleasure of discussing SQL Server and other data-related tools with Erin Stellato.

Erin is a Senior Program Manager at Microsoft and works directly with the tools that I wanted to discuss. I've known Erin quite a while and she's always interesting to hear from.

I hope you enjoy it. You'll find this show (and previous shows) here: https://podcast.sqldownunder.com/

SQL Down Under show 88 with guest Angela Henry discussing data types in SQL Server is now published!

I really enjoyed recording today's SQL Down Under show and it's now published!

This time I had a great conversation with fellow Microsoft Data Platform MVP Angela Henry.

Angela is a principal consultant at Fortified, and on LinkedIn, describes herself as a data mover and shaper.

Angela has a particular fondness for the Microsoft BI Stack and Databricks.

You'll find Angela online as @sqlswimmer.

I hope you enjoy it. You'll find this show (and previous shows) here: https://podcast.sqldownunder.com/

SQL: Suggestion for SSMS -> Save as table

I often look at the results of a query in SSMS and want to save them off somewhere, and what I really want is a table. To do that, at present, I need to:

  • Right-click and use Save Results As to go to a CSV
  • Use the flat file import wizard (or something) to import the CSV

Now obviously, in some cases, if it was a SELECT query, I could add an INTO clause and just run the query again, but there are many, many cases where I want to save the outcome of another type of query. It could also be that I just can't run the query again for whatever reason.

CSV Pain

Going via a CSV has a whole host of other problems. We all know that outputting into a CSV and loading it again can involve a world of pain.

A real benefit of going direct is the data-typing. SSMS already knows the data type of the results, so it would avoid the entire mess of having to change column data types, etc. when importing CSVs. It's painful.
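As a rough illustration of the typing problem, here's a minimal Python sketch. It isn't anything SSMS does internally; the sample rows and the function name are made up for the example. It just shows that values which start out typed all come back as plain strings once they've been through CSV text, which is why an importer has to guess (or be told) the column types again.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical query results, typed the way a client tool would hold them.
rows = [
    (1, "00123", Decimal("19.90"), date(2024, 3, 1)),
    (2, "00456", Decimal("5.00"), date(2024, 3, 2)),
]


def round_trip_through_csv(typed_rows):
    """Write rows out as CSV text, then read them back with no type hints."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(typed_rows)
    buffer.seek(0)
    return [tuple(record) for record in csv.reader(buffer)]


restored = round_trip_through_csv(rows)

# Compare the Python type names before and after the round trip.
print([type(value).__name__ for value in rows[0]])
print([type(value).__name__ for value in restored[0]])
```

After the round trip, nothing in the data says whether a value like "00123" was a code with leading zeros or a number. That's exactly the information a direct save to a table wouldn't lose.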

Multi-Server Queries

Where I've really come across this lately is with multi-server queries. In those cases (apart from configuring a whole load of linked servers that I don't want), there really isn't another good option, apart from the CSV method.

What's needed?

I'd love to see an option in the Results grid in SSMS, in the Save To area, that lets you save to a table.

Agree? If so, you know the drill, vote once, vote often:

https://feedback.azure.com/d365community/idea/18481740-8aec-ee11-a73d-000d3adc65a4

SQL Down Under show 87 with guest Ronen Ariely discussing the importance of SQL Server internals is now published!

Today, another SQL Down Under show is published!

This time, I had the great pleasure of recording a podcast yesterday with an old friend from the SQL Server and data communities, Ronen Ariely.

Ronen is a senior consultant and an applications and data architect with more than 20 years of experience in a variety of programming languages and technologies.

He has been awarded Microsoft MVP seven times and is active in communities, mostly related to Microsoft Azure, Data Platforms, and .NET programming.

Ronen is a moderator in the Microsoft forums, a member of the board at the Microsoft Learn Community, and administers several large data platform groups on social media.

A prolific writer of technical blogs, tutorials, and articles, he leads the Data GlobalHebrew user group and the Cloud Data Driven user group in New Jersey.

Ronen is also a co-admin of the Data Driven Community and the Principal Organizer of the Future Data Driven summit, which is where I've mostly come across him.

In the show, we discuss important aspects of SQL Server internals that Ronen wishes people knew. You can find Ronen's blog here: https://ariely.info/Blog/tabid/83/language/en-US/Default.aspx

You can find the call for speakers at the Future Data Driven Summit here: https://sessionize.com/future-data-driven-summit-2024/

I hope you enjoy it. You'll find this show (and previous shows) here: https://podcast.sqldownunder.com/

SQL Down Under show 86 with guest Armando Lacerda discussing SQL Server 2022 snapshot backups and data virtualization is now published

And yet another new SQL Down Under show is published!

Once again, I had the great pleasure of recording a podcast yesterday with one of my long-term data platform MVP friends, Armando Lacerda.

Armando is a cloud architect and engineer focused on the data platform. He is a long-term data platform MVP, MCT, speaker, trainer and coder.

In the show, we discuss changes involving two aspects of SQL Server 2022:

  • Backups using snapshots
  • Data virtualization

I hope you enjoy it. You'll find this show (and previous shows) here: https://podcast.sqldownunder.com/

Cosmos Down Under show 11 with guest Khelan Modi discussing vector database and search is released

It's been a big week for Down Under podcasts. I really enjoyed recording another new Cosmos Down Under podcast this morning. It's now edited and released.

Show 11 features Azure Cosmos DB product manager Khelan Modi discussing the vector database and search features of Azure Cosmos DB, and particularly how that applies to large language models (LLMs).

Khelan is a product manager on the Azure Cosmos DB team. He leads the AI and Portal (UI) initiatives for the Vector Database service, Azure Cosmos DB for MongoDB vCore.

He's also responsible for the go-to-market strategies and the growth of the product.

I saw Khelan talking about this recently with Mark Brown, and I thought it would be great to have him on the show.

I hope you enjoy it.

You'll find it here, along with previous episodes: https://cosmosdownunder.com

Power BI Implementation Models for Enterprises Part 3: Cloud Friendly Clients

I've been writing a series on Power BI Implementation Models for Enterprises for https://tekkigurus.com.

Part 3, which covers what I consider Cloud Friendly Clients, is now published:

https://www.tekkigurus.com/power-bi-implementation-models-part-3-cloud-friendly-clients/

Enjoy!

Introducing PG Down Under – Focus on PostgreSQL

Welcome to PG Down Under!

I've worked with data most of my life and I love all aspects of it, and PostgreSQL has been part of it. Whenever possible, I've attended local PostgreSQL user groups to make sure I stay across what's happening with it, even though my primary work was with SQL Server. PostgreSQL has been part of many client engagements over the years.

For some time, I've been building up a series of podcasts for PostgreSQL developers and DBAs.

PG Down Under is a sister podcast for SQL Down Under, Cosmos Down Under, and Fabric Down Under.

In November 2022, as show 5 for Cosmos Down Under, I recorded an interview with Charles Feddersen about Azure Cosmos DB for PostgreSQL, and it covered a number of things about PostgreSQL in general.

So if you've ever watched Law and Order, or Chicago PD, or shows like that, consider this a cross-over episode to kick off the new series. While we might occasionally mention Azure options for Postgres, that's not the primary focus of this series. Most shows will be targeted at PostgreSQL in general.

Thanks for listening, and I hope you enjoy this flashback for a first episode. It might also help to provide some positioning for those coming from a Microsoft background.

Cosmos Down Under show 10 with guest Tara Bhatia discussing elasticity features is released

I also had the great pleasure of recording another new Cosmos Down Under podcast this morning. It's now edited and released.

Show 10 features Azure Cosmos DB product manager Tara Bhatia discussing the elasticity features of Azure Cosmos DB. Tara and her team have been helpful lately, helping me understand the burst capacity features of the product better, and I thought it'd be great to have her on the show.

I hope you enjoy it.

You'll find it here, along with previous episodes: https://cosmosdownunder.com