It’s scary how much of the world’s data lives in CSV files. Yet as a standard, CSV is problematic: so many implementations just don’t work as expected. Today, a client asked me why Azure Data Factory wouldn’t read a CSV file that, while odd, was in a valid format.
The simplest equivalent of the file that wouldn’t load would be one like this:
First Column,Second Column,Third Column,Fourth Column
12,Terry Johnson,Paul Johnson,031-23423
13,Mary Johnson,"Paul,Johnson",031-23423
14,Mia Johnson,"Paul ""the beast"", Johnson",031-23423
16,Cherry Johnson,Paul Johnson,031-23423
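There are meant to be four columns. The source system wrapped a string in quotes only when it contained a comma, as that was the delimiter for the file. Inside a quoted string, a literal quote character was written as two quotes, as in the row for Mia Johnson.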
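As a quick sanity check that the sample really is well-formed, here is a minimal sketch using Python's standard `csv` module, which by default handles minimally quoted fields and doubled quotes. The names `SAMPLE`, `parse_rows`, and `write_rows` are mine, purely for illustration. This says nothing about how Azure Data Factory parses the file; it only shows that a conventional CSV parser sees four columns in every row.

```python
import csv
import io

# The sample file from above, reproduced verbatim.
SAMPLE = '''First Column,Second Column,Third Column,Fourth Column
12,Terry Johnson,Paul Johnson,031-23423
13,Mary Johnson,"Paul,Johnson",031-23423
14,Mia Johnson,"Paul ""the beast"", Johnson",031-23423
16,Cherry Johnson,Paul Johnson,031-23423
'''


def parse_rows(text):
    """Parse CSV text with the default dialect (comma delimiter, doubled quotes)."""
    return list(csv.reader(io.StringIO(text, newline="")))


def write_rows(rows):
    """Write rows back out, quoting a field only when it needs it."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_rows(SAMPLE)

# Every row, including the header, should have exactly four fields.
for row in rows:
    print(len(row), row)

# Writing the parsed rows back with minimal quoting reproduces the sample.
print(write_rows(rows) == SAMPLE)
```

The third field of the row for Mary Johnson comes back as `Paul,Johnson`, and the one for Mia Johnson as `Paul "the beast", Johnson`. Note that `csv.QUOTE_MINIMAL` also quotes fields containing a quote character or a line break, so it is slightly broader than the "only when there is a comma" rule described above, but for this sample the output is identical.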
2023-09-12