MST PPL HV LTTL DFFCLTY N RDNG THS SNTNC
That statement, sans vowels, is Claude Shannon’s example of how information can often be left out of an English sentence without degrading your ability to read it. Shannon, famous as the father of information theory, used various measures to calculate that English is roughly 50% redundant. This may sound wasteful, but Shannon’s information theory is the very tool that tells us that redundancy is useful for rejecting errors, as when you are listening to a conversation in a noisy room.
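If you want to play with the vowel trick yourself, here is a minimal Python sketch. The function name `strip_vowels` is my own invention, and the full sentence is the presumed reading of the line at the top (it is not spelled out in the post). Note that the sketch treats Y as a consonant, which is what the example seems to do.

```python
def strip_vowels(text: str) -> str:
    """Drop the letters a, e, i, o, u (either case); keep everything else."""
    return "".join(ch for ch in text if ch.lower() not in "aeiou")


# Assumption: this is the sentence the vowel-free example stands for.
sentence = "MOST PEOPLE HAVE LITTLE DIFFICULTY IN READING THIS SENTENCE"
stripped = strip_vowels(sentence)

print(stripped)
# Share of characters (spaces included) that survive the stripping.
print(f"{len(stripped) / len(sentence):.0%} of the characters kept")
```

This is only a toy: dropping vowels is one crude way to remove redundancy, not the measure Shannon used to arrive at his 50% figure.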
Google Scribe is a new tool that uses statistics to help you write. It’s like a super-powered spell checker: an about-to checker, as in “here’s what I think you’re about to type.” Here’s how it works. Because Google knows everything, it knows what you’re about to type. Or actually, it’s more like this. Because Google has seen everything written in English so far, and because you’re such a staggeringly unoriginal writer, it’s not hard to guess what you’re going to say. In other words, it may feel like magic, but it’s not magic. You’re just boring. Sorry.
Here’s a sentence I tried to type: “Claude Shannon showed that up to fifty percent of the characters in a typical English sentence are redundant.” That’s 109 characters, including spaces. I counted my keystrokes as I used Google Scribe and got 58 keystrokes, or 53% of the original length. Not bad.
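For anyone who wants to check the arithmetic, a few lines of Python will do it. The keystroke count of 58 is simply the number reported above; the script cannot measure it, so it is entered by hand.

```python
sentence = (
    "Claude Shannon showed that up to fifty percent of the "
    "characters in a typical English sentence are redundant."
)
keystrokes = 58  # typed by hand: the count reported in the post

total_chars = len(sentence)       # counts letters, spaces, and the period
ratio = keystrokes / total_chars  # fraction of the sentence actually typed

print(f"{total_chars} characters, {keystrokes} keystrokes, {ratio:.0%}")
```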
Another fun game is to give it a starting letter and constantly accept the suggestion, just to see where it goes. Start with the letter I, and you get something like this: “In the case of these two types of information that is not appropriate for all users of the catalogue should also be noted that there is anything you would not believe.” Indeed.
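The "always accept the suggestion" game is easy to imitate with a toy model. The sketch below is emphatically not how Google Scribe works internally (the post does not say); it assumes the simplest possible predictor, one that counts which word most often follows the current word in a tiny made-up corpus. The names `build_bigrams`, `greedy_continue`, and `toy_corpus` are all hypothetical.

```python
from collections import Counter, defaultdict


def build_bigrams(corpus: str) -> dict:
    """Map each word to a Counter of the words seen immediately after it."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def greedy_continue(following: dict, start: str, max_words: int = 12) -> str:
    """Start from one word and always accept the most frequent next word."""
    out = [start]
    for _ in range(max_words - 1):
        candidates = following.get(out[-1])
        if not candidates:
            break  # the last word was never followed by anything
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# A made-up corpus, just large enough to show the effect.
toy_corpus = (
    "in the case of the users of the catalogue "
    "in the case of information that is not appropriate"
)
model = build_bigrams(toy_corpus)
result = greedy_continue(model, "in")
print(result)
```

With a corpus this small the output quickly falls into a loop of its most common phrase. A model that only looks one word back has no memory of where the sentence started, which is one plausible reason the real thing drifts into locally fluent nonsense.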
If you use Google Scribe to write your term paper, should you cite it as a co-author? Perhaps it will just reach in and add itself.
My god-daughter is having her Bat Mitzvah soon and will be reading from a Torah without Niqqud (i.e., Hebrew consonants without diacritical marks for vowels). I was opining that a written language can be constructed in shorthand if the only people using it are native speakers, which is basically what Claude Shannon is saying about English for English speakers. I wonder how many of Alan’s students could interpret that sentence, or if they would feel like they needed the Niqqud. Maybe my god-daughter should get a portable scanner with Google Scribe Hebrew Edition for her reading.
I was going to mention Hebrew (and Arabic too, for that matter). Whether or not they can benefit from still further compression is something I’d like to know (only use every other consonant?).
Back to Google Scribe: by setting the sort to “S” for maximum expected saving of typing time, I was able to type “Now is the time for all good men to come to the aid of their country” in only 26 keystrokes, just 38% of the sentence’s 68 characters.
Ancient Punic also skipped vowels. I read about it because I have a book idea starring Hnnbl f Crthg. Y knw, th gy wth th lphnts.
Yes, maybe the name of God could be reduced to the digrammaton, YW! Hey, was that lightning?
“I had my first ho…”
Dangit, that was going to be “…hockey game of the season” but the word suggested was “homosexual”.
I guess that’s the risk one takes when all the English text on the Internet is the corpus.