SSIS: Reading pipe delimited text and selecting particular output columns

SSIS: Reading pipe delimited text and selecting particular output columns

There was a question on the Q&A forums today, asking how to read data using SSIS, when it’s in this format:

|Col1| |Col2|Col3|Col|
|101| |A|21|DC|

One of the concerns was that there was a leading pipe. This is not a problem. When you have data like that, and you set | as the delimiter, because there are 6 delimiters, then there are 7 columns output. These are the values:

Column 1: blank string Column 2: 101 Column 3: blank string Column 4: A Column 5: 21 Column 6: DC Column 7: blank string

SSIS can handle that easily. Let me show you how.

Also, if you’d like to learn about SSIS from me, we have a great online on-demand course here:

SQL Server Integration Services for Developers and DBAs (sqldownunder.com)

Example SSIS project

I start by creating a new SSIS project. Next in the Control Flow tab, I drag in a Data Flow task, and rename it to Import the Rows.

Control Flow

Right-click the task, and choose the Edit option. (Note, you can also do this via the Advanced Edit button but there’s no need for that here)

Edit link

When the data flow task editor appears, drag in a Flat File Source. (Note: make sure it’s this one, not the useless Source Assistant at the top)

Flat file source

Rename the flat file source to SampleData and right-click it and choose Edit. The Flat File Source Editor will appear:

Flat File Source Editor

You need to add a new connection, so choose New:

Add source details

My SampleData.txt file looks contains this:

|Col1| |Col2|Col3|Col|
|101| |A|21|DC|
|102| |B|22|DC|
|103| |A|23|DA|

So, here I’ve added the link to the file name (note: in production code, I would parameterize this), and I’ve made the Vertical Bar be the Header row delimiter. I’ve ensured that column names in the first data row is selected, and I don’t want to skip any header rows.

Once I’ve done that, I go to the Columns page:

Columns page

Note that SSIS has already named the four columns that had names in the header row, and put generic names for the others. At this point, it’s important to check the OutputColumnWidth property for each column. Then we click OK to save it.

Back on the Flat File Source Editor page, go to the Columns page:

Columns page

Note here that I can then uncheck the columns that I don’t want, at the top:

Uncheck columns

Once I click OK, I now have configured the source as required.

Testing

The easiest way to test a source is to connect it to something, and add a viewer to the data path.

I find the easiest for this is to use a Multicast. You can connect to it, and it doesn’t care if there are no destinations. (Unlike real destinations)

Multicast

So I drag a Multicast onto the surface and connect the data path to it. Then, right-click the data path (the line between them), and choose to Enable Data Viewer.

Enable data viewer

Then click Start and you’ll see the data as expected:

Expected data

 

We have four output columns with the correct names as expected.

I hope that helps someone get started with this.

2024-07-19