The Bit Bucket

Fabric RTI 101: Ingestion Modes

When we talk about ingestion modes in Fabric Real-Time Intelligence, we’re really thinking about how events move from their source into a destination, such as an Eventhouse table or another downstream system.

The pattern you choose affects latency, cost, data quality, and the flexibility of your real-time architecture. There are two broad models that students will encounter: direct ingestion and processing before ingestion.

Ingestion Modes

Direct ingestion is the simplest path. Events arrive from a source such as IoT devices, applications, or an event broker, and they are immediately written into the target system without any intermediate shaping. This mode gives the lowest latency because nothing happens in between. It is most useful when you want to preserve raw events for later analysis, replay, or transformations that happen downstream. It is also the right choice when your first priority is freshness and the consumers are able to handle any necessary cleaning or shaping themselves.
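As a rough illustration of the direct pattern, the sketch below writes each event into a destination exactly as it arrived and leaves the shaping to the consumer. It is not Fabric code: the list `raw_table`, the functions `ingest_direct` and `shape_on_read`, and the `deviceId` and `temp` fields are all hypothetical stand-ins.

```python
import json
from typing import Any, Dict, List

# Hypothetical stand-in for a destination table that stores raw events.
raw_table: List[str] = []


def ingest_direct(raw_event: str) -> None:
    """Direct ingestion: store the event unchanged, with no shaping on the way in."""
    raw_table.append(raw_event)


def shape_on_read(raw_event: str) -> Dict[str, Any]:
    """Cleaning happens downstream, when a consumer reads the raw event."""
    record = json.loads(raw_event)
    return {
        "device": str(record.get("deviceId", "")).strip().lower(),
        "temp_c": float(record["temp"]),
    }


ingest_direct('{"deviceId": " Sensor-7 ", "temp": "21.5"}')
shaped = shape_on_read(raw_table[0])
```

The raw event stays available for replay, and each consumer decides how much cleaning it needs.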

2026-03-30

SDU Tools: Jaro Winkler Similarity in SQL Server T-SQL

Our free SDU Tools for developers and DBAs now includes a very large number of tools, with procedures, functions, and views. The JaroWinklerSimilarity function that we have added calculates the Jaro Winkler similarity for two strings.

It essentially answers the question: “Do these two short strings probably refer to the same thing, even if they aren’t exactly the same?”

It can be used where typos are common, or characters are transposed, and where prefixes matter more than suffixes. Empty and NULL values on input return NULL.
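To show what the calculation involves, here is a minimal Python sketch of the standard Jaro Winkler algorithm. It is not the SDU Tools T-SQL implementation; the function names, the prefix scale of 0.1, and the four-character prefix limit are the usual textbook choices and are assumptions here. Returning None for empty or None input mirrors the NULL behavior described above.

```python
from typing import Optional


def jaro_similarity(s1: str, s2: str) -> float:
    """Standard Jaro similarity, from 0.0 (nothing in common) to 1.0 (identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk the matched characters in order and count those that disagree.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(s1: Optional[str], s2: Optional[str],
                            prefix_scale: float = 0.1) -> Optional[float]:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    if not s1 or not s2:
        return None  # empty or missing input gives no result
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)
```

For the classic test pair MARTHA and MARHTA, this gives a Jaro score of about 0.944, and the shared MAR prefix lifts the Jaro Winkler score to about 0.961.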

2026-03-29

Fabric RTI 101: What are Event Schema Sets?

Event Schema Sets in Fabric Real-Time Intelligence are essentially a way to standardize the shape of the data coming into your real-time environment. When events are flowing in from a variety of sources—such as IoT devices, applications, APIs, or logs—you typically see a lot of variation: different fields, inconsistent casing, unexpected nested structures, or additional attributes that drift over time. Schema Sets give you a central place to define the expected structure for those events.

2026-03-28

SDU Tools: Formatting Bytes in SQL Server T-SQL

Our free SDU Tools for developers and DBAs now includes a very large number of tools, with procedures, functions, and views. The FormatBytes function can now be used to take a number of bytes and format it as a string with appropriate units.

The calculation can be done using SI units (where 1000 bytes is one kB) or binary units (where 1024 bytes is one KB). It can also output IEC-based units like the kibibyte.
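The three unit systems can be sketched in a few lines of Python. This is only an illustration of the idea, not the FormatBytes function itself: the name `format_bytes`, its parameters, and the output layout are all assumptions.

```python
def format_bytes(number_of_bytes: int, units: str = "binary", decimals: int = 1) -> str:
    """Format a byte count using SI, binary, or IEC unit labels (hypothetical helper)."""
    systems = {
        "si": (1000, ["B", "kB", "MB", "GB", "TB", "PB"]),
        "binary": (1024, ["B", "KB", "MB", "GB", "TB", "PB"]),
        "iec": (1024, ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]),
    }
    base, labels = systems[units]
    value = float(number_of_bytes)
    index = 0
    # Step up one unit at a time until the value fits under the base.
    while abs(value) >= base and index < len(labels) - 1:
        value /= base
        index += 1
    if index == 0:
        return f"{number_of_bytes} B"  # whole bytes need no decimals
    return f"{value:.{decimals}f} {labels[index]}"
```

With these assumptions, 1500 bytes formats as 1.5 kB in SI units, while 1536 bytes formats as 1.5 KB in binary units and 1.5 KiB in IEC units.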

2026-03-27

Fabric RTI 101: Data Quality in Real-Time

One of the biggest differences between batch ETL pipelines and real-time pipelines is how you manage data quality. In a batch world, you often have long, multi-step processes that validate, clean, and enrich data before it ever reaches your reports. You can afford those extra passes because the data isn’t needed instantly.

In a real-time system, you don’t have that luxury. Events arrive continuously, and you need to deal with problems on the fly. That means data quality checks have to be fast, lightweight, and built directly into your ingestion or processing stage.
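A fast, lightweight check of this kind might look like the sketch below, which inspects each event as it arrives and sets problem events aside rather than stopping the stream. The field names `device_id` and `temperature`, the allowed range, and both function names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Event = Dict[str, Any]


def check_event(event: Event) -> List[str]:
    """Return a list of problems found in one event (empty if it looks fine)."""
    problems: List[str] = []
    if not event.get("device_id"):
        problems.append("missing device_id")
    temperature = event.get("temperature")
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        problems.append("temperature is not numeric")
    elif not -50.0 <= temperature <= 150.0:
        problems.append("temperature out of range")
    return problems


def split_stream(events: Iterable[Event]) -> Tuple[List[Event], List[Tuple[Event, List[str]]]]:
    """Check events one at a time, separating clean events from rejected ones."""
    accepted: List[Event] = []
    rejected: List[Tuple[Event, List[str]]] = []
    for event in events:
        problems = check_event(event)
        if problems:
            rejected.append((event, problems))
        else:
            accepted.append(event)
    return accepted, rejected
```

Each check is a cheap, in-memory test, so it adds very little latency, and the rejected events keep their list of problems for later review.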

2026-03-26

General: So what's a kibibyte? Binary, SI, and IEC Units

Whenever we’re talking about an amount of data, it’s important to understand the units that are used. In the early days of computing, it all seemed pretty simple. We had KB for 1024 bytes, MB for 1024 * 1024 bytes, etc.

The first people I saw messing that up were the hard drive manufacturers. Originally, they followed the standard units that we had been using in computing. But somewhere along the way, they changed how this worked. The vendors decided that if they had 10,000,000 bytes of storage, they would call that a 10MB hard drive, but of course it wasn’t, at least not in how we used to measure them. Some of the vendors even started talking about hard drive megabytes like it was some other new unit. That meant that suddenly a 128GB hard drive (128 * 1024 * 1024 * 1024 or 137,438,953,472 bytes) became a 137GB hard drive.
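The multiplication in that last example is easy to check. The short Python sketch below (variable names are only illustrative) computes 128 * 1024 * 1024 * 1024 bytes and then re-expresses the result in units of 1000 * 1000 * 1000 bytes, which is how the vendors arrive at the larger number.

```python
binary_unit = 1024 * 1024 * 1024    # three factors of 1024
decimal_unit = 1000 * 1000 * 1000   # three factors of 1000

capacity_bytes = 128 * binary_unit
in_decimal_units = capacity_bytes / decimal_unit

print(f"{capacity_bytes:,} bytes")             # 137,438,953,472 bytes
print(f"{in_decimal_units:.1f} decimal units")  # 137.4 decimal units
```

The same drive holds the same number of bytes either way; only the size of the unit used to describe it has changed.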

2026-03-25

Opinion: What's the most misleading error message you've ever seen?

I was part of a discussion the other day where the topic was the most misleading error message you’ve ever seen. I’ve been in the industry long enough that it’s a pretty long list of error messages that I need to consider.

The winner for me

But I finally decided on one:

Back in the VB6 days, there was a common error message that said Out of Memory.

There were many issues that could lead to that error message, but running out of memory was probably the least likely.

2026-03-23

AI: AB-731 Exam - Microsoft Certified AI Transformation Leader

I’ve previously written about how I like to always have a certification exam that I’m working on.

Recently, I saw the new AB-731 exam for AI Transformation Leader. It’s the only exam required for the Microsoft Certified: AI Transformation Leader certification.

What interested me is that it is one of the first exams designed to target the business people who are involved in making decisions within an organization. In this case, it’s about deciding how AI might impact the organization, how to plan and carry out the transformation, and which tools to use.

2026-03-22

SSRS and Fabric Paginated Reports: Be very careful with using "c" formatting for currency

While on site this week, another common problem that I see everywhere arose again.

When you need to format currency, you use the “c” format, right? It’s in nearly every set of course materials I’ve ever seen. And people do it in almost every demonstration.

But so often, that’s wrong!

When you do this, you’re telling the system to display the monetary value using the local currency.

Is that correct though?

2026-03-21