Learning Chinese: What is Meant by Simplified Chinese?

In a recent post, I talked about the benefits I'd gained by learning to read Chinese, or at least getting better at it.

A curious question that I sometimes get from people is about "learning to read Mandarin". I have to explain that Mandarin is a dialect (as is Cantonese), not a written language. I'll write more about dialects another time.

One of the upsides of learning to read Chinese nowadays is that it doesn't matter so much what dialect someone speaks. The written form is pretty much the same. Well, almost…

Chinese has a lot of characters. There have probably been upward of 30,000 over the years. Today though, the Table of General Standard Chinese Characters lists 8105 characters. That's a lot of characters to learn. I now know about 1900 and while I can make some sense of newspapers, etc., about 2500 is generally considered a good starting point for readers.

One of the challenges for people learning to read and write, though, was that some characters were pretty complicated. That might be OK if they are uncommon, but not if they are used regularly.

To improve literacy, starting in around 1956, the Chinese government decided to make some of them easier. A few hundred common characters were simplified as 简化字 (or jiǎnhuàzì). These are often also called 简体字 (or jiǎntǐzì).

Let's look at a couple of simple examples:

In English, we have collective nouns for groups of things, e.g. a flock of geese or a pack of wolves. Chinese, however, has measure words (量词 or liàngcí). So to say "three fish", you say 三条鱼 (or sān tiáo yú). In this case, tiáo is the measure word for fish (actually, it's used for many long, thin things).

There are many, many measure words, but the most common generic measure word is gè. So if I said "I have a secret", I'd say 我有一个秘密 (or Wǒ yǒu yīgè mìmì). In that sentence, the character 个 (or gè) is the measure word.

But notice how this sentence looks in Traditional Chinese:

我有一個秘密

Look at how much more complex the fourth character is. It's the same sentence, but with one different character.

As another example, let's look at another long, thin thing that uses tiáo as its measure word:

If I say "one dragon", it's 一条龙 (or Yītiáo lóng) in simplified Chinese but in traditional characters, it's 一條龍.

You can see why they wanted to make the change. You might also wonder about why they'd simplify a word like "dragon" when choosing common words to simplify, but dragons are surprisingly prevalent in Chinese culture.

Anyway, nothing is ever all that simple. Now that we've discussed what they are, in a later post, I'll discuss who does/doesn't use them.

(More info here: https://en.wikipedia.org/wiki/Simplified_Chinese_characters)

 

Book Review: Hit Refresh – Satya Nadella

When I first heard that Satya Nadella had a book out, I was somewhat surprised as at the time, he had just taken over running Microsoft. Usually you don't see books from CEOs until they've been in the role for quite a while and have become philosophical about things.

But given the impact I could see he would have, I was fascinated to read his book Hit Refresh.

It was actually quite a bit more than I expected. I really enjoyed the tales of his life and how it led up to his current role.

He shared a great deal about his family situation, and you can see that he's really been through some hard times, and you can also see a love of his family shining through.

On any measure, Microsoft is a far better company today than when he took the reins. I think people forget just how grim many things were looking at the time.

It's really interesting now to see Microsoft being seen as such an innovator, compared to, say, Apple. The jokes about Apple's biggest contributions lately being about removing headphone sockets from phones are only partially in jest. By comparison, Apple seems to have lost its mojo.

If you ever wonder if some of these CEOs are worth what they get paid, you only need to compare Microsoft today with Apple to get an idea of the impact that the right one can have.

What can I say? I really enjoyed the book.

Greg's rating: 9 out of 10

Note: as an Amazon Associate I earn from qualifying purchases but whether or not I recommend a book is unrelated to this. One day it might just help cover some of my site costs. (But given the rate, that's not really likely anyway 🙂)

Shortcut: Using bookmarks in SQL Server Management Studio

In a previous post, I was discussing how outlining can be helpful with navigating around within a large T-SQL script file.

If you were trying to do that within a Microsoft Word document, the most common thing to use is bookmarks, and SQL Server Management Studio (SSMS) has them as well.

Bookmarks are simply placeholders within a script. (They can also apply to other types of document within SSMS.) Where I find them very useful is when I'm working in two or three places within a long script at the same time. Perhaps I'm working on a function, and also on the code that calls the function. By using bookmarks, I'm not flipping endlessly around the script file, and can jump directly from placeholder to placeholder.

A quick check of the Bookmarks submenu (under the Edit menu), shows what's available:

You toggle (enable or disable) a bookmark at a particular point by pressing Ctrl-K then Ctrl-K. You can then navigate forwards or backwards by pressing Ctrl-K then Ctrl-N (next), or Ctrl-K then Ctrl-P (previous).

Note that there are options available for both the document and the folder. The folder option can be particularly powerful.

Bookmarks are an often-ignored but highly useful part of using SSMS. If you don't currently use them, you might want to consider them.

SDU Tools: DropTemporaryTableIfExists

I regularly find myself writing repetitive code in T-SQL. Some things are best done by just creating code snippets but we've also added several others to our free SDU Tools for developers and DBAs.

One straightforward one is DropTemporaryTableIfExists.

This just wraps all that's needed to remove a temporary table if it exists. The nice thing with this procedure is that you can call it before creating a temporary table, and call it again after you finish using the temporary table, as shown in the main image above.

In addition, to make it a little easier to use, you can reference the table with or without the # prefix, as shown below:

The outcome is the same in both cases.
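As a rough illustration of the idea (not the SDU Tools source), here is a minimal Python sketch that accepts a name with or without the # prefix and builds the T-SQL needed to drop a local temporary table only if it exists. The function names are hypothetical, and the sketch assumes simple alphanumeric table names and local (single #) temporary tables only.

```python
import re

# Hypothetical helpers for illustration only; this is not how SDU Tools is implemented.
_VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def normalize_temp_table_name(name: str) -> str:
    """Return the table name with exactly one leading '#'."""
    name = name.strip()
    if name.startswith("##"):
        # Global temporary tables are out of scope for this sketch.
        raise ValueError("Global temporary tables are not handled here")
    bare = name[1:] if name.startswith("#") else name
    if not _VALID_NAME.match(bare):
        # Refuse anything unexpected rather than pasting it into a SQL string.
        raise ValueError(f"Unexpected temporary table name: {name!r}")
    return "#" + bare


def drop_temp_table_sql(name: str) -> str:
    """Build T-SQL that drops a local temporary table only if it exists."""
    temp_name = normalize_temp_table_name(name)
    return (
        f"IF OBJECT_ID(N'tempdb..{temp_name}') IS NOT NULL\n"
        f"    DROP TABLE {temp_name};"
    )


if __name__ == "__main__":
    # Both calls produce the same statement.
    print(drop_temp_table_sql("Accounts"))
    print(drop_temp_table_sql("#Accounts"))
```

Because the generated statement is harmless when the table is missing, it can be run both before creating the temporary table and again after finishing with it.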

To become an SDU Insider and to get our free tools and eBooks, please just visit here:

http://sdutools.sqldownunder.com

Upcoming: Precon for SQL Saturday in Auckland

At the end of next month, I'll be delivering a pre-conference session in Auckland on Friday 31st August. I'd love to see you there. Details are as follows:

Developing SQL Server Applications that Perform

So many SQL Server applications today are so slow, even simple ones, yet the product is capable of amazing performance, even in tier-1 organizations. Greg spends his life working with SQL Server developers (from the smallest startup software houses to many large tier-1 organizations), working with them to get their applications flying.

In this one-day workshop, you'll learn:

* How SQL Server indexes work
* How data type choices affect index performance
* How to design appropriate table structures and indexes
* How indexing issues appear in query plans
* How to use query plans and other tools to find out why your queries aren't performing as expected
* Where columnstores fit into this picture (where you should and shouldn't use them)
* Where in-memory OLTP fits into this picture (where you can and can't use memory-optimized tables and native compilation)
* How to avoid common application performance design mistakes and anti-patterns that I see regularly

This session is for developers, DBAs, and consultants who need to know how to build better SQL Server applications, and to understand why SQL Server applications are slow.

Registration is available here:

https://www.eventbrite.com/e/developing-sql-server-applications-that-perform-31-august-2018-tickets-46835857310?aff=ehomecard

 

SQL: What's in a (default) name?

I often see people creating databases in SQL Server and not specifying the names of the defaults they are applying to columns. They define a column like this:

And there are general reasons why this makes sense. For example, a column can only have one default, so what does the name matter anyway?

There are two reasons why the name does matter:

Dropping columns

In SQL Server, you'll find that if you go to drop either of those columns, you'll see something like this:

SQL Server requires you to drop the default constraint on the column before you can drop the column, and unfortunately it requires you to do that by name. Notice the name that it chose for the default: DF__Customers__First__3A81B327. Life is far, far easier if you have a pattern that means you already know the name that will have been applied.

DevOps and Database Comparisons

As part of DevOps or other techniques, you'll end up wanting to compare two databases to find what's different. While some tools help with this, having the same table definition create defaults with different names in each database isn't going to make that comparison any easier.

Naming

It doesn't matter too much what pattern you use for the names. I'd use this one:

I use the schema name, the table name, and the column name. That can't go wrong or end in duplicates. And I already know the names of my defaults, and they're easy to generate programmatically.
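To show how easy such names are to generate programmatically, here is a minimal Python sketch. The exact format `DF_<schema>_<table>_<column>` is an assumption based on the description above (schema name, table name, column name), and the function names are hypothetical. The generated T-SQL uses the standard `ALTER TABLE ... ADD CONSTRAINT ... DEFAULT ... FOR` and `ALTER TABLE ... DROP CONSTRAINT` forms.

```python
def quote_name(identifier: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing brackets."""
    return "[" + identifier.replace("]", "]]") + "]"


def default_constraint_name(schema: str, table: str, column: str) -> str:
    """Build a predictable default constraint name (assumed DF_ prefix pattern)."""
    return f"DF_{schema}_{table}_{column}"


def add_default_sql(schema: str, table: str, column: str, expression: str) -> str:
    """Build T-SQL that adds a named default to an existing column."""
    name = default_constraint_name(schema, table, column)
    return (
        f"ALTER TABLE {quote_name(schema)}.{quote_name(table)} "
        f"ADD CONSTRAINT {quote_name(name)} DEFAULT ({expression}) "
        f"FOR {quote_name(column)};"
    )


def drop_column_sql(schema: str, table: str, column: str) -> str:
    """Build T-SQL that drops the named default first, then the column."""
    name = default_constraint_name(schema, table, column)
    target = f"{quote_name(schema)}.{quote_name(table)}"
    return (
        f"ALTER TABLE {target} DROP CONSTRAINT {quote_name(name)};\n"
        f"ALTER TABLE {target} DROP COLUMN {quote_name(column)};"
    )


if __name__ == "__main__":
    print(add_default_sql("dbo", "Customers", "FirstName", "N'Unknown'"))
    print(drop_column_sql("dbo", "Customers", "FirstName"))
```

Because the constraint name is derived purely from the schema, table, and column, the drop script never needs to look up a system-generated name first.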

Alternatives

Keep in mind that this is a SQL Server specific thing. In other database engines like PostgreSQL, you can't apply a name to a default. But you don't need to, because column defaults are automatically dropped when the column is. (No idea why SQL Server doesn't just do this.) And you can drop a default with the ALTER COLUMN ... DROP DEFAULT clause of ALTER TABLE without needing a name for the default.

These are things that SQL Server should copy but until it does, name your default constraints.

 

Book Review: Blockchain – by Samuel Rees

Another book I've read recently while sitting on a few planes is Blockchain – by Samuel Rees.

I've seen some big claims in the titles of books but this one had me intrigued:

The Ultimate Beginner Through Advanced Guide on Everything You Need to Know About Investing in Blockchain, Cryptocurrencies, Bitcoin, Ethereum and the Future of Finance

That's quite a claim. I was really hoping this book would provide a great amount of detail given its 'beginner through advanced' guide claim.

That's not what I found though. While it might be a useful book if you'd never learned anything at all about Blockchain, I thought the overall discussion was pretty shallow and there was a whole lot of "gee whiz how amazing is this" types of messaging that I really didn't enjoy.

I did persevere to the end though, as I was hoping there was more coming. What I did find was a bunch of info on the author's views on investing in cryptocurrencies. While he's careful to avoid straight-out telling you to invest, the tone is certainly that you should do so.

While Blockchain as a technology is still a strong option, when I read the book, I'd have to say I wasn't in love with the idea of investing in cryptocurrencies. Given how the majority have now tumbled and/or become extinct, and the way that even Bitcoin and Ethereum have plummeted lately, I'm glad I thought that way.

Greg's rating: 4 out of 10


Shortcut: Code outlining in SQL Server Management Studio

For some years now, SQL Server Management Studio (SSMS) has had the ability to use code outlining, the same way that other Visual Studio applications can.

This can be very useful when you are trying to navigate around a large script file.

The simplest usage is to collapse or expand a region of code. Note that in the following script, code regions have been automatically added by SSMS:

This allows us to click on the outline handles, and collapse the code:

Note that when the region of code is collapsed, the name of the region is shown as the first line of the code within the region, truncated.

If you hover over the ellipsis (the dot dot dot) at the beginning of the code region, you'll be shown what's contained within the region:

Now, what's missing?

I'd love to be able to just drag regions around.

I'd also love to be able to name the regions better. It's not too bad if the regions are procedures or functions but for other chunks of code, there's really no good option. Note that if I add a comment immediately above the code, it's not part of the same region. It might be better if it was like that, or if a specific comment could be treated as a region heading:

In the Edit menu, the Outlining submenu doesn't show anything else useful at this point, apart from bulk operations:

SDU Tools: DatesBetween – all dates between two dates in T-SQL

In our free SDU Tools for developers and DBAs, we have added many tools that help to manipulate dates.

When creating a date dimension (as part of dimensional modeling), you need to be able to get a list of all the dates between a start date and an end date. There are many other reasons why you might need to do this as well.

So we've added a table-valued function called DatesBetween to do just this. It takes a start date and an end date as parameters and returns all dates between. As well as the date values, it also numbers each of the dates.
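To make the behaviour concrete, here is a minimal Python sketch of the same idea. It is not the SDU Tools implementation. It assumes both the start and end dates are included in the output and that numbering starts at 1, and the function name is hypothetical.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def dates_between(start_date: date, end_date: date) -> Iterator[Tuple[int, date]]:
    """Yield (date_number, date_value) for each day from start to end.

    Assumptions for this sketch: both end points are included and
    numbering starts at 1. Nothing is yielded if start is after end.
    """
    current = start_date
    number = 1
    while current <= end_date:
        yield number, current
        current += timedelta(days=1)
        number += 1


if __name__ == "__main__":
    for date_number, date_value in dates_between(date(2018, 2, 27), date(2018, 3, 2)):
        print(date_number, date_value.isoformat())
```

Rows like these are the usual starting point for a date dimension: each date gets a row, and further columns (year, month, day name, and so on) are derived from the date value.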

In the main image above, you can see an example of it in use.

To become an SDU Insider and to get our free tools and eBooks, please just visit here:

http://sdutools.sqldownunder.com

Opinion: DIY security is not security

I spend a lot of time working in software houses. One of the nastiest things that I see again and again is developers attempting to roll their own security and authentication mechanisms.

Spend a moment and think about how many security incidents the big companies (Google, Apple, Microsoft, etc.) have had over the years. Now think about how much effort they've put into doing it right, yet they still have issues at times.

The scary part about trying to do this yourself is that you often don't even know how scary what you are doing is.

Apart from the ones who do a reasonable job of password hashing, etc., I also see a surprising number who still store plain text passwords, or think that applying some "special algorithm that they wrote" to "encrypt" passwords or other private information is acceptable.

It's not.

I cringe every time I see someone who's written an algorithm that does obfuscation on a value before storing it. Worse is when they refer to it as "encryption" within the organization.

So my post today is just a simple plea:

Please don't do this.

The minute you find yourself writing "encryption algorithms" or authentication code, just stop. Just because you think you've got away with it for years, don't tell yourself that you don't have an issue.

I've seen the outcome at sites where this all goes wrong, and it's not pretty. You do not want to be anywhere near it when the finger-pointing starts. It all ends in tears.
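For contrast with a home-grown "special algorithm", here is a minimal Python sketch of what leaning on a vetted primitive looks like: salted PBKDF2 from the standard library's `hashlib`, with a constant-time comparison from `hmac`. This is only an illustration of not inventing the algorithm yourself. It is not a recommendation to build your own login system, and an established identity provider or framework is still the safer route. The function names and the iteration count are assumptions for the example.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for illustration; check current guidance and your hardware.
DEFAULT_ITERATIONS = 600_000


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> Tuple[bytes, bytes, int]:
    """Return (salt, derived_key, iterations) using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)  # a fresh random salt for every password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived, iterations


def verify_password(password: str, salt: bytes, expected: bytes, iterations: int) -> bool:
    """Recompute the derived key and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, derived, iterations = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, derived, iterations))
    print(verify_password("wrong guess", salt, derived, iterations))
```

The point is that nothing here is invented locally: the hashing, the salt generation, and the comparison all come from well-reviewed library code, and only the stored values (salt, derived key, iteration count) belong to the application.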

Image by Tom Pumford