Which Unicode character should represent the English apostrophe? (And why the Unicode committee is very wrong.)

A trivial matter that proves to be not so trivial as it sounds.

Ted's Blog

The Unicode committee is very clear that U+2019 (RIGHT SINGLE QUOTATION MARK) should represent the English apostrophe.

Section 6.2 of the Unicode Standard 7.0.0 states:

U+2019 […] is preferred where the character is to represent a punctuation mark, as for contractions: “We’ve been here before.”

This is very, very wrong. The character you should use to represent the English apostrophe is U+02BC (MODIFIER LETTER APOSTROPHE). I’m here to tell you why.

Using U+2019 is inconsistent with the rest of the standard

Earlier in section 6.2, the standard explains the difference between punctuation marks and modifier letters:

Punctuation marks generally break words; modifier letters generally are considered part of a word.

Consider any English word with an apostrophe, e.g. “don’t”. The word “don’t” is a single word. It is not the word “don” juxtaposed against the word “t”. The apostrophe is part of the word, which, in…
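To make the distinction concrete, here is a minimal Python sketch (not from the original post) that looks up how the Unicode character database classifies the two candidates and then splits “don’t” into words with a naive `\w+` regex. The helper names `describe` and `words` are made up for illustration, and the regex only shows how one common tool treats these code points; it is not the standard’s own word-segmentation algorithm.

```python
import re
import unicodedata

RIGHT_SINGLE_QUOTE = "\u2019"   # the code point the standard recommends
MODIFIER_APOSTROPHE = "\u02bc"  # the code point this post argues for


def describe(ch):
    """Return the Unicode name and general category of a character."""
    return unicodedata.name(ch), unicodedata.category(ch)


def words(text):
    """Naive word splitter: every maximal run of \\w characters is a word."""
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    for ch in (RIGHT_SINGLE_QUOTE, MODIFIER_APOSTROPHE):
        name, category = describe(ch)
        # Show the classification, then how "don?t" splits with this character.
        print(f"U+{ord(ch):04X} {name} (category {category})")
        print("  words:", words(f"don{ch}t"))
```

In this sketch U+2019 reports category `Pf` (final punctuation) and the regex breaks “don’t” into two pieces, while U+02BC reports `Lm` (modifier letter) and the word stays whole.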

