Snowflake for SQL Server Users - Part 10 - Working with file formats

One thing that I quite like about Snowflake is the way it cleanly works with a wide variety of file formats.

Obviously this changes over time but at the time of writing, you could COPY from the following source file formats:

CSV
JSON
AVRO
ORC
PARQUET
XML

There are also quite a number of options for configuring how these are used. Apart from the obvious options like record and field delimiters, skipping rows, etc, one of the most important of these options is compression. You can currently choose these options for compression:

AUTO
GZIP
BZ2
BROTLI
ZSTD
DEFLATE
RAW_DEFLATE
NONE

I had good outcomes using the AUTO setting. Mostly I’m using zip’d input files.

There are a few options for transforming data while loading it, but they are very limited. For example, you can alias a column or change its position.

Named File Format

Similar to the way you can avoid specifying full file locations every time by creating an external stage, you can avoid specifying all your file format details every time by creating a named file format.

No surprise, that’s done with the CREATE FILE FORMAT command.

Data Export

When you export data, you again use the COPY command, but currently there are less file format options available than there are for input. You can use:

CSV
JSON
PARQUET

I would be surprised if the others aren’t added soon.

For an index to all posts in this series, see the first post here.

2019-10-11