ADF: Changing or removing your Git connection from an Azure Data Factory

I've been working a lot with ADF (Azure Data Factory) again lately. If you've started to work with ADF and aren't using Git in conjunction with it, you need to change that now.

ADF objects are represented as JSON and lend themselves nicely to being stored in Git.

Another key advantage relates to saving your work. If you don't have Git connected, there's no Save button while you're working in ADF: if you're part way through making changes, you can either Publish them or Discard All. When you have Git in place, you also get the option to Save.

NOTE: If I'm working at a site that doesn't have Git connected and I need to stop working on something temporarily, I can grab the JSON code for the pipeline (or whatever I'm working on), save it away manually, then discard my changes. At least I can put it back again later. But that's basically what the Git option does for you anyway, so just use it.
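If you do save pipeline JSON away manually, it's worth knowing the shape of what you're copying. Here's a minimal sketch of a pipeline definition; the pipeline, activity, and dataset names are hypothetical:

```json
{
    "name": "PL_CopySales",
    "properties": {
        "activities": [
            {
                "name": "CopySalesToStaging",
                "type": "Copy",
                "inputs": [
                    { "referenceName": "DS_SalesSource", "type": "DatasetReference" }
                ],
                "outputs": [
                    { "referenceName": "DS_SalesStaging", "type": "DatasetReference" }
                ]
            }
        ]
    }
}
```

A real pipeline will have more detail (type-specific properties for each activity, for example), but this is the structure you'll see when you view the code for a pipeline.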

Changing or Removing the Git Connection

Today was the first time I'd needed to change the Git connection across to another repository. And it took me a while to work out how. You'll notice that in the main editing window, there's no option to change it:

Now I think it should be there. Failing that, it should be on the Properties page for the ADF instance. But it's not there either:

Now I think it should be in one of those two locations.

So where is it?

In one of the least obvious bits of UI ever created, it's tucked up in the top right corner of the Dashboard. That's the last place I would have looked, as that page is usually useless to me. It's great if you've just started, because it has lots of links to tutorials, etc., but otherwise it's not so interesting. Yet that's precisely where the option is:

I hope that helps someone.

 

BI: (Workaround) Changing partitioned tables in SSDT tabular designer

I was working with a client the other day and we had what we thought was a simple situation:

  • A table in an SSAS tabular model project
  • The table had two partitions
  • We needed to remove 6 columns from the table

So, we'd already removed the 6 columns from the SQL Server view that the table was being loaded from. We just needed to update the tabular model.

Now for an unpartitioned table, that's easy. You open the table properties, click Design, wait a moment, click below the query, then on the Query menu, click Refresh. When you then click Import, you can save the changes. We do that all the time.

However, the partitioned story was different.

  • If we tried to change the partitions, it would not let us save the changes because the partition differed from the table.
  • If we tried to change the table, it would not let us save the changes because the table differed from the partitions.

<SIGH>

There really doesn't seem to be any great way to do it using the standard GUI interface.

Official Method

The "official" way is to:

  • Remove all but one partition
  • Make the changes
  • Put the partitions back

That doesn't sound like much fun, particularly if you had a bunch of partitions.

Workaround

The workaround is to right-click the .bim file, then View Code, find the columns and carefully remove them from the JSON. When you save it, all is then happy.
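To give you an idea of what to look for: in the model's JSON, each table has a columns array, and removing a column means deleting its entire entry. The column name here is hypothetical:

```json
{
    "name": "LegacyRegionCode",
    "dataType": "string",
    "sourceColumn": "LegacyRegionCode"
}
```

Take care to also remove any trailing comma left behind, and check that nothing else in the model (a measure, relationship, or hierarchy) still references the column before saving.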

Wish it wasn't so, but I hope that helps someone.

Opinion: SQL Server Reporting Services – The reports of my death…

SQL Server Reporting Services (SSRS) is a wonderful product that's still really important, even though the product group at Microsoft hasn't been showing it much love lately.

I was at a client site the other week, and while we were using Power BI (PBI) for their dashboards and general visualizations, we were looking to use SSRS for part of the solution.

What fascinated me is that when they talked to the local Microsoft field staff, they kept being told that SSRS was old technology, and kept being asked why they'd look to use such old technology.

Worse, when I looked at what they were suggesting, it was Power BI Premium, combined with Azure Automation, which was launching Azure Functions, and/or using Power Automate.

Really?

Way too much Kool-aid drinking has been going on there.

Awesome image by Benjamin Voros

Let's make something clear:

I love Power BI, but there are still many, many scenarios where SSRS is the best solution available today.

Power BI Premium isn't for everyone, and in a number of areas, it's still quite a distance short of where SSRS already is. For a start, quite a number of things that are commonly used in SSRS just aren't available in PBI Premium's implementation of Paginated Reports.

Some other things will no doubt come, but until they do, it's also not the same development and maintenance story. (A good example is the current lack of shared data sources, report viewer controls, etc.)

Smaller Companies

At present, the pricing of Power BI Premium is simply too high for most smaller companies. We've been saying that since it first appeared. That might change, and alter the balance, but we can only consider the situation today.

And even if it does change, I also don't see any solution at all except SSRS for all the people currently running the Express or Standard editions of SQL Server.

Having paginated reports available in Power BI Pro would help a bit with this but it's still not the whole story.

Licenses

With SSRS, you have a single license for the report server. With Power BI, unless you're using Premium and/or Embedded, you need a Pro license for everyone using it.

For some companies, this is a major difference. Many companies have far more report recipients than people who want to browse reports in a dashboard.

And even though the client was a tier-1 financial, only around 40 people needed access. There's a major price difference between an SSRS license plus 40 Power BI Pro licenses on one hand, and a Power BI Premium license on the other.

Subscriptions

Data-driven subscriptions in SSRS are powerful. Do you have a whole lot of users who just need a pixel-perfect PDF of a report, or an Excel spreadsheet sent to them periodically (perhaps every Monday morning)? SSRS is perfect for that.

Do you need to send a pixel-perfect report to a whole lot of people, with different parameters for each? SSRS and data-driven subscriptions are perfect for that.

Summary

Power BI is wonderful and it's going in great directions. I'm really disappointed that the product group hasn't given SSRS much love in recent times, but there are many scenarios where it's still the best solution.

Don't discount it yet.

And if you find yourself proposing a convoluted set of services, just to start to achieve what SSRS already does, please think again. You owe it to your customers.

Power BI: (Workaround) Times disappear when datetime values are displayed in PBI tables

I'll start this post by mentioning that there's a general consensus that you should avoid columns in tabular data models that have both dates and times. That's largely because they don't compress well. However, sometimes you really do need to have both a date and a time in a single column.

For example, suppose you want to sort one column by both the date and the time: there's no option to sort one column by two other columns. And if you create a composite column to get around that, you're really back to the same problem as storing a datetime anyway.

Displaying Values in a Table

But what really surprised me recently was that when I added a datetime column from my Live Query data model (the source was Azure Analysis Services) into a table in Power BI, the time values disappeared.

Only the date was shown.

I was quite puzzled about what was going on.

Why This Happens

Power BI has a limited range of data types. If you have a date, it's stored in a date data type. But if you have a datetime, it's also stored in a date data type.

So when you're designing a report, Power BI has to decide a default format for showing you a date data type element. And no major surprise, it just shows the date.

Fixing the Issue

To make the time values appear, you need to fix the issue back in the tabular data model, by adding a custom display format for the datetime column.
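For example, in the model's JSON, the fix amounts to setting the column's formatString property. The column name here is hypothetical:

```json
{
    "name": "TransactionDateTime",
    "dataType": "dateTime",
    "sourceColumn": "TransactionDateTime",
    "formatString": "General Date"
}
```

"General Date" shows both the date and the time; you can also supply a custom format string such as "dd/mm/yyyy hh:nn:ss" if you need a specific layout.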

I can't say that I love this, as a display format decision should really be made at the client application. I'm OK with the server supplying a default display format, but the client should be able to change it.

Unfortunately in Power BI today, you have no option to change the format of elements from Live Query data sources.

I hope that helps someone.

 

Power BI (Bug): Power BI Desktop auto-hides visible tables with all columns hidden

I have a client who's publishing their tabular data models to Azure Analysis Services (AAS). They want to publish a table that's visible, but only has a single column that's hidden.

You might wonder why he wanted to do that.

He's trying to have a table with no visible columns that acts as an anchor point for report designers to attach their report-specific measures. There are already measures and computed columns in the tabular data model in AAS, but he wants a known location for measures that relate only to the specific report.

All the other BI tools that I've tried show the table. As an example, here's the table appearing in a browse window in SQL Server Management Studio (SSMS):

But Power BI Desktop automatically hides a table that has no visible columns.

Workaround

You can see the table if you use the "View Hidden" option in the fields list, but you shouldn't need to do that for a table that's supposed to be visible.

Bug?

I saw discussion that says it relates to a backwards compatibility issue with Excel, but to me it just seems like a bug.

You can add your opinion on this here: https://community.powerbi.com/t5/Desktop/Power-BI-auto-hides-visible-tables-with-all-columns-hidden-bug/m-p/994358#M472966

(And I'd encourage you to do so)

BI: Wondering where your Integration Services Connector for Power Query has gone in Visual Studio 2019?

I wrote recently about menu items being missing from SSIS and other BI projects in VS2019, but another thing that's gone MIA is the Integration Services Connector for Power Query.

This is the connector that allows you to use Power Query (and the M language) to input data into SSIS.

The SSIS designer for VS2019 recently went GA (general availability). It's great that it's released, but if you install it, you'll find it removes the Power Query option for SSIS.

I imagine the reason is that the connector is still in preview, whereas the rest of the designer is now GA.

To get the functionality back, you need to separately download the Integration Services Connector for Power Query. You'll find both 2017 and 2019 versions of it here:

https://www.microsoft.com/en-us/download/details.aspx?id=100619

 

BI: Wondering why your SSRS, SSAS, and SSIS menus are missing in Visual Studio 2019?

If you've recently installed Visual Studio 2019 (VS2019), and then installed the SQL Server Reporting Services (SSRS), the SQL Server Analysis Services (SSAS), and/or SQL Server Integration Services (SSIS) extensions, when you open a project you might be wondering where your menus went.

How it was

For example, if you had a report open in the VS2017 report designer, this menu was present:

And if you had an AS model open in the VS2017 tabular designer, these menus were present:

And similarly, if you had an SSIS package open in the VS2017 designer, these menus were present:

 

How it now is

Now, when you open VS2019, no matter which of these you have open, you won't find those menus.

So where are they?

They're all hidden within the "EXTENSIONS" menu, as these designers are all "Extensions". And where a designer had multiple menus, they're all in there separately:

Verdict?

I think this is a lousy idea for multiple reasons.

First, it means that when you're working on an object like a tabular data model, your menu bar is full of things unrelated to what you're working on, and the relevant menu items are hidden multiple levels down under "Extensions".

While I understand they're probably trying to keep the VS menu bar consistent, many people working with these projects could not care less about the rest of VS. That's just where these designers happen to be.

For the life of me, I can't see how that's good UI design.

Second, it makes the decades of blog posts and articles that have been written about these tools incorrect, and much harder to follow. The pain from this is hard to describe, and it's felt most by newcomers who will just be puzzled and not know what's going on.

I'd love to hear your thoughts. If you agree with mine, please tell Microsoft as well. I really hope they reconsider this.

Please vote here:

https://developercommunity.visualstudio.com/content/problem/934934/hiding-ssrs-ssis-ssas-menu-items-under-extensions.html

Spatial Data: If you are working with Australian mapping data, Mappify.io is worth knowing about

I spend quite a bit of each year working with spatial data (i.e. mostly mapping data but sometimes other data). One of the constant challenges is where to find the spatial data. Another challenge is where to find services that can help you to work with the data.

For Australian data, if you haven't considered Mappify.io, I think you should take a look.

You need to create an account before you start but it's free to create an account to try it out. Then there are a number of services.

Geocode

You can do simple web-based geocoding if you want. Enter values or upload a CSV and click Geocode.

You get back a map and results that you can export:

And of course there's an API that allows you to do this programmatically.

Reverse Geocoding

This takes a list of latitudes and longitudes and reverse geocodes them.

And again, a map and an exportable set of results:

Area Code

The Area Code option determines the area that a location is in. You can choose how the area is categorized:

For example, POA returns the postcodes.

Address Cleansing

I particularly love the option for cleaning addresses. Pass in a full address and have the system break the data into standard columns:

Others

There are many other options as well, like routing and driving directions and distances, spatial analytics, etc. This is a great tool and a great service.

Mappify.io currently say that if you create an account, the first 2,500 requests every day are free. After that they're 0.1¢ each. That seems a bargain.

Learning about Spatial Data

If you're not using spatial data in SQL Server, you should be. And if you need to learn about spatial data in SQL Server, check out our popular on-demand online course:

https://training.sqldownunder.com/p/sqlserver-spatial-data

Snowflake for SQL Server Users – Part 8 – Case Sensitivity

There are many things I like about Snowflake. How they handle case and collations is not one of them.

There are currently none of the rich options for handling case that you have in SQL Server, with its detailed control over both collations and case sensitivity.

I've previously written about how I think that case-sensitivity is a pox on computing. I see absolutely no value in case-sensitivity in business applications, and a significant downside.

Case preservation is a different thing. I expect systems to remember the case that I define things with, but 99.9% of the time, I want to search for them without caring about case. All that case-sensitivity really provides is the ability to have two objects in the same scope whose names differ only in capitalization. That's usually the result of lazy coding, and almost never a good idea.

Snowflake is basically case-sensitive. I wish it wasn't. There are, however, some workarounds.

Object Names

To get around some of the problems that case-sensitivity causes, Snowflake automatically upper-cases object names when you define or use them. By default, Snowflake treats the objects Customers, customers, and CUSTOMERS as the same object. In fact, if you execute a statement like:

CREATE TABLE Customers

what it will create is a table called CUSTOMERS.  If you execute

SELECT 2 AS Value;

you'll get back the value 2 in a column called VALUE, not the column name that you asked for.

That breaks my basic wish (in any language) for case-preservation. Most objects that you see in almost every Snowflake presentation have names that are all capitalized.

You can, however, get around this by quoting each name with double-quotes.

CREATE TABLE "Customers"

And then you need to do that for every table, every column, every object, etc. from then on. If you execute:

SELECT 2 AS "Value";

you'll get the value 2 with the column name that you're after.

Case-Related Comparisons

To get around the idea that most people won't want to compare strings in a case-sensitive way, they've created some different operators to deal with case. For example, if you use

WHERE "CustomerName" ILIKE 'Fred%'

you get a case-insensitive version of LIKE.
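Here's a slightly fuller sketch of the two common ways to compare strings without case sensitivity; the "Customers" table and "CustomerName" column are hypothetical:

```sql
-- ILIKE is a case-insensitive LIKE
SELECT "CustomerName"
FROM "Customers"
WHERE "CustomerName" ILIKE 'fred%';   -- matches Fred, FRED, fred, ...

-- Or normalize both sides of an ordinary comparison with UPPER()
SELECT "CustomerName"
FROM "Customers"
WHERE UPPER("CustomerName") = UPPER('Fred Smith');
```

Neither is ideal: you have to remember to do it on every comparison, rather than declaring the behavior once the way a collation would.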

Added To My Wish List

The Snowflake people really need to fix how they handle case. You can see from the requests in their user forums that I'm not the only one that thinks so.

This aspect of the product feels very Oracle-like, and also feels like being back in the 1960s. Humans don't like screaming snake case. I really hope they fix it soon, as it's currently one of the weakest aspects of the product.

 

For an index to all posts in this series, see the first post here.

Snowflake for SQL Server users – Part 5 – Editions and Security Features

Like most products, Snowflake comes in a number of editions, and you can see the current editions in the main image above. (Keep in mind that these could change at any time, so check their site for the current options.)

The first thing I need to say is that I really like the way that most of the SQL code surface is pretty much identical across editions. I wish the coverage was complete, but it currently doesn't include materialized views.

Note: That was a great change when the SQL Server team did the same back in SQL Server 2016 SP1.

There is no free edition like we have with SQL Server Express. Similarly, there's no free option for developers like we have with SQL Server Developer Edition. That's not surprising though, as they aren't the cloud provider, they are purchasing services from cloud providers. I find that the Standard edition is pretty low cost though: you only pay a fairly low amount for storage, and you only pay for compute when you use it. So that's not too bad.

Differences in Editions

The main difference between Standard and Premier is that the latter comes with premier support. So that's not a bad way to separate, say, development tasks from production tasks. That said, I'd rather see a single edition, with support as an optional extra across all editions.

Snowflake has a feature called Time Travel. This allows you to see what data looked like at earlier times. It's a bit like temporal tables, but also quite different from them. I'll talk more about it in another post.

Standard and Premier both have one day of Time Travel, and Enterprise edition takes you up to 90 days. I like to see that sort of feature used as a differentiator between editions, because it mostly wouldn't require code changes when moving between editions.
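To give a feel for Time Travel, queries use an AT or BEFORE clause on the table reference; the orders table, timestamp, and query ID here are hypothetical:

```sql
-- The table as it looked an hour ago (offset is in seconds)
SELECT * FROM orders AT(OFFSET => -3600);

-- The table as it looked at a specific point in time
SELECT * FROM orders AT(TIMESTAMP => '2020-01-06 09:00:00'::timestamp_ltz);

-- The table as it looked just before a particular statement ran
SELECT * FROM orders BEFORE(STATEMENT => '<query-id>');
```

How far back you can go is bounded by the edition's retention period (one day on Standard and Premier, up to 90 days on Enterprise).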

Business Critical basically adds more security. It adds HIPAA and PCI compliance, and the ability to use customer-managed encryption keys. I can't say that I love the idea of core compliance as a distinction between editions; everyone's data is important to them. Customer-managed keys are a good edition differentiator though.

Snowflake data gets encrypted anyway, and with a key that changes each month (for new data). But on lower editions, it doesn't get re-keyed. What Business Critical also adds is annual key rotation. Data that's a year old gets decrypted and re-encrypted with a new key.

VPS or Virtual Private Snowflake is for people who can't tolerate the idea of any sort of shared Snowflake infrastructure. The Snowflake team do a completely separate deployment of the whole Snowflake stack, just for each customer. It's super expensive (I heard it starts at over $50M AUD) and so I can't imagine too many customers using it, but I'm sure there will be a few, including right here in Australia.

I heard that VPS was only available on AWS at the time of writing, but I'm sure that will change. And I'm guessing if you front up with $50M+, and say you want it on Azure, it's unlikely they'd say no.

 

For an index to all posts in this series, see the first post here.