This post relates to another question I got on Stack Overflow recently.
The poster was saying that he was having no luck inserting data from teh Gujarati language, even though he was using nvarchar as a data type.
The most common problem that I see when people aren't getting the outcome they want when inserting into an nvarchar column is that they aren't putting N in front of their string values. (N is National Characters Set). Imagine a table like this:
1 2 3 4 5 |
CREATE TABLE dbo.Test ( TestID int IDENTITY(1,1) PRIMARY KEY, TestValue nvarchar(100) ); |
Here's a Chinese example that won't work (because of multi-byte string values):
1 |
INSERT dbo.Test (TestValue) VALUES ('Hello 你好'); |
You won't get an error on the insert, but if you then query it, you'll see this:
With just single quotes, it's just an ANSI string, basically made up of ASCII values, and so your multibyte characters get messed up.
Let's try that with and without the N using Gujarati:
1 2 3 4 5 6 7 8 9 |
TRUNCATE TABLE dbo.Test; INSERT dbo.Test (TestValue) VALUES ('lakhnār'), ('vikǎselũ'); INSERT dbo.Test (TestValue) VALUES (N'lakhnār'), (N'vikǎselũ'); SELECT * FROM dbo.Test; |
You can see the output in the main image above. Note that in rows 1 and 2, the characters are almost what was input. In this case, there is a basic mapping to single byte characters that worked out not too bad, but often you'll just see ?? instead.
In rows 3 and 4, where we used the N for the string definition, all worked as expected.
Depending on the example, even that may not be enough. My advice is to review these links when supporting global databases:
https://aka.ms/sqlunicode and https://aka.ms/internationaltsql
Hi Pedro, that's the main thing that I find people get wrong in released versions, and was this guy's issue.
What other things do you find people commonly get wrong today ? (When UTF8 gets involved, things get messier)