Search This Blog

Thursday, May 24, 2007

Evolution of the English Language

This semester, to fill up three units I needed for the good student discount on my car insurance, I took a basic introduction to linguistics class. For my third paper in the class, I wrote about the evolution of the English declension system, from Proto-Indo-European to modern Standard American English (the type used in things like textbooks, which is a bit more formal and "high" than everyday English, but not all that much different). Unfortunately, as the paper wasn't supposed to be more than three pages, all I could really do was give a brief overview of how the forms changed (and even then you can tell I basically ran out of space in the last paragraph and attempted to very quickly conclude the paper). Had it been a term paper (something more like ten pages), I could have also included such information as why the changes occurred, as well as some of my predictions for the future of English based on what I see today. In this post (and possibly future ones, depending on how lengthy this turns out to be), I'll talk a bit about the latter topic: the future of English and current trends (although here I'll address more than just declension).

First, and most relevant to the paper, is what is happening with declension today. In my previous comparison of case between languages, I mentioned that English currently has two cases for nouns (the general and possessive cases), and three for pronouns (subjective - 'I' - objective - 'me' - and possessive - 'my' or 'mine', depending on whether the pronoun is acting as a noun or an adjective).

However, in modern-day colloquial English (the type you'd use with your friends), the line between the subjective and objective cases of the pronoun is becoming blurry. It's now very common for native speakers to use objective pronouns even in the subject position (e.g. "Me and Josh went to the supermarket"). This is done even by people such as me, who know that it's technically incorrect (it should be 'I' instead of 'me').

So, where is this development going? Unfortunately, while this is growing in prominence, and is thus likely to shape tomorrow's English, it's not entirely clear what that shape will be. There are at least three possible ways this could go. First, as some have suggested, the subjective case may be lost, and English will be down to two cases for pronouns, just like nouns. This is clearly something English could take in stride. We already use syntax (word order) to differentiate between nouns being the subject or object of the sentence (in a manner outlined in my paper), so the two cases are really redundant.

However, it must be noted that no native speaker of standard American English (I can't speak for other dialects, as I don't know about them) would say something like "Me went to the supermarket"; it clearly sounds wrong. There are two alternate hypotheses that take this into account.

First, the application of the subjective case could be restricted. Rather than being used for every subject, the rule could be altered so that the subjective case appears only when the subject consists of a single noun/pronoun (as in the previous example), regardless of whether it's singular or plural. While the names subjective/objective make this sound counter-intuitive, you have to remember that the objective case already functions like an oblique case (which might, perhaps, be a better name for it) - a catch-all that includes everything but the subjective. Given that, this possibility doesn't sound so strange; of course, the subjective case could still stand a better name.

Finally, there's a third possibility. This one is particularly attractive to me, simply because it would be so unusual: we may be seeing the creation of a new, additional case - a conjunctive case, used when two or more nouns/pronouns appear together in a group (note that the possessive case already behaves fundamentally differently than the subjective or objective case*, and so wouldn't follow this new rule). In this case (pun not intended), the form of the conjunctive case is identical to the objective case. I suppose the surest proof of this hypothesis would be if the conjunctive and objective forms diverge in the future, while the subjective case is retained.

Lastly, a footnote about the possessive case. There are actually two forms of the possessive: the noun form (e.g. 'mine') and the adjective form (e.g. 'my'). The adjective form is clearly distinct in use from the subjective and objective cases, as it is an adjective and they are nouns. The noun form is also clearly distinct, because it refers to something completely different than the subjective or objective case does. 'I' and 'me' both refer to exactly the same thing - me; 'mine', however, refers to something that is obviously not me. Thus it's not unreasonable to consider the possessive forms as behaving fundamentally differently from the subjective and objective cases, and not require them to follow the same rules.

Friday, May 11, 2007

Reading Material & Stuff

No lengthy post today; just some other stuff for you to read (and you should read them). From least to most specific:

Threading 3D Game Engine Basics
Real-World Case Studies: Threading Games for High Performance on Intel Processors
Multi-threaded Rendering and Physics Simulation
Designing and Building Parallel Programs

Also, got some interesting stuff going on with me, BZ, and multithreading, but I don't have time to go into that right now.

Friday, May 04, 2007

I Didn't Actually Win

A ways back I posted about my great amount of amusement at one of the bugs that showed up on my list at work. Obviously I never got around to posting about what I found when I actually had a chance to investigate the bug.

It turned out to be a mixture of several problems. What was happening was that the program was crashing (a simple user-mode crash; nothing fancy). However, because a user-mode debugger wasn't installed on that computer, the crash launched the kernel debugger (don't ask me why there was a kernel debugger but not a user-mode debugger; I don't know). The kernel debugger halts the entire system and stops at a breakpoint in kernel-mode code; debugging can then be done by linking the computer to another computer (the one with the debugger client) with a serial cable. So, thanks to the kernel debugger getting invoked, a common crash got elevated to a complete system halt, complete with hosed hardware.

Annoyed, I installed WinDbg on the computer and tried it again, hoping to find what was crashing. The cause immediately became clear, to my further annoyance: IsBadReadPtr was throwing an access violation. For those not familiar with this function, it works by establishing a structured exception handling (SEH) frame, then reading from the supplied pointer. Normally, the access violation is caught by the exception handler and the function merely returns true (the pointer is bad). But in this case, something was catching the exception before the handler did.
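For the curious, the trick can be sketched portably. This is a rough POSIX analogue of the idea (hypothetical code, not the actual Windows implementation, which uses an SEH __try/__except frame rather than signals): probe one byte at the pointer and trap the fault.

```cpp
// A rough POSIX sketch of what IsBadReadPtr does internally (hypothetical
// code; Windows uses an SEH frame, not signals): try to read one byte from
// the pointer, and trap the fault if it's bad.
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf probe_env;

static void on_segv(int) {
    siglongjmp(probe_env, 1);  // jump back out of the faulting read
}

// Returns true if reading one byte from p faults (i.e. the pointer is "bad").
bool is_bad_read_ptr(const void* p) {
    struct sigaction sa = {}, old = {};
    sa.sa_handler = on_segv;
    sigaction(SIGSEGV, &sa, &old);
    bool bad = false;
    if (sigsetjmp(probe_env, 1) == 0) {
        volatile char c = *static_cast<const volatile char*>(p);  // the probe
        (void)c;
    } else {
        bad = true;  // we arrived here via the signal handler
    }
    sigaction(SIGSEGV, &old, nullptr);  // restore the previous handler
    return bad;
}
```

(Real code generally shouldn't do this at all - a pointer can be readable and still wrong, which is one reason IsBadReadPtr is widely considered harmful - but it illustrates what the function does internally.)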

That something was AppVerifier - a program offered by MS to perform very strict checks on a program's behavior. While these checks tend to whine a lot about stuff that isn't really a problem, they're helpful in that they can catch things that would normally result in a crash, often in rare circumstances (making the crash very difficult to debug). In this case, AppVerifier was catching the exception too early, and making a fuss about something that couldn't possibly have resulted in a crash anyway.

Unfortunately, that wasn't the end of the matter. A quick look at the stack revealed that IsBadReadPtr was being called from an internal Windows function - probably a parameter-validation check, which could mean my program was passing an invalid parameter to an API function (bad). That meant I couldn't ignore it.

It turned out to be a bug in the GUI library our company wrote and uses (the author of that library is my arch-nemesis). The list view class contains two image list classes, used for checkboxes and other icons. Because of the poor architecture of this library (which I fight with regularly), the list view class was being destructed before the list view window itself (actually, all windows work that way in this library). That meant the destruction of all child classes, including the two image lists. Unfortunately, the list view window was still USING those image lists, as the class did not unselect them from the window before destructing. When the dialog was closed, the list view window was destroyed, and the window attempted to free the image lists (this is the default behavior for list view windows; you can set an option to not automatically free them) - and of course the image list pointers were now invalid.
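The gist of the bug (and the fix) can be shown with a toy sketch - hypothetical class names, not the real library or the Win32 API: the window frees whatever image list is selected into it by default, so the wrapper class must detach the image list before freeing it itself.

```cpp
static int g_imageListFrees = 0;  // count frees so a double-free is visible

struct ImageList {
    ~ImageList() { ++g_imageListFrees; }
};

// Stand-in for the list view window: by default it frees the image list
// selected into it when it is destroyed.
struct ListViewWindow {
    ImageList* imageList = nullptr;
    ~ListViewWindow() {
        delete imageList;  // default behavior; double-free if already freed
    }
};

// Stand-in for the wrapper class, which (in this library) is destructed
// BEFORE the window it wraps.
struct ListViewClass {
    ListViewWindow* window;
    ImageList* icons;

    explicit ListViewClass(ListViewWindow* w)
        : window(w), icons(new ImageList) {
        window->imageList = icons;  // select the image list into the window
    }
    // The fix: unselect the image list from the window before freeing it,
    // so the window's destructor doesn't try to free it a second time.
    ~ListViewClass() {
        window->imageList = nullptr;  // detach first
        delete icons;                 // now safe to free
    }
};
```

Without the two lines in ~ListViewClass detaching before deleting, the window's destructor would free an already-freed pointer - exactly the invalid pointers IsBadReadPtr was choking on.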

Another day, another fixed bug, another few hundred calories burned laughing.

Campaign Finance Data

Given that several people are pointing fingers in this thread and other places, I thought a little bit of hard data was in order: campaign finance data for the movie and recording industries. More detailed information can be found here, here, here, and here.

- my post from Slashdot

Thursday, May 03, 2007

Now What?

For those of you who haven't been watching this unfold in real time, something absolutely unprecedented is happening, right now. Several days ago, Digg.com censored several stories which referred to the AACS (the copy protection system used by HD-DVD and I believe Blu-Ray) master encryption key, which can be used to decrypt any currently available movie.

This move was in response to a cease and desist order they received from the AACS forum, threatening prosecution under the DMCA anti-circumvention clause. As this was quite a serious threat (and has a much better chance of succeeding than the RIAA suits), Digg removed all of the posts referencing the key, without acknowledging or justifying the action to users. It even went so far as to ban users who repeatedly tried to submit those stories.

This caused a massive outrage (some might call it a temper-tantrum), where users submitted thousands of stories to Digg containing the key. Some consider it an act of civil disobedience, though in this case it's the Digg operators who stand to get sued, not the people actually disobeying the DMCA, so some would consider this worse than civil disobedience. After receiving many more submissions than the Digg operators could censor, they gave in, and promised not to remove any more posts containing the key.

As a side-effect of the outrage on Digg, hundreds of thousands of web sites have been created containing the key, either in its raw form or some manner of trivial encrypted form (so that anyone who wanted to could decrypt it). This thing has blown up to such a degree that offline news sources are covering it. BBC, the New York Times, and Forbes all have stories on the Digg incident, and likely many more will follow.

This is a completely unprecedented legal situation. No one's been in the position of the AACS forum before, and nobody really knows what to do next. They could issue hundreds of thousands of cease and desist orders, followed by many lawsuits (for those that don't comply). While they do have a very strong case based on the DMCA, the best they could hope for is that a few lawsuits would induce enough fear that all the others take the key down before they're sued - a technique that hasn't worked so well for the RIAA. Of course, it's likely that many countries wouldn't allow suits from the US based on US laws, so the key sites would probably just migrate out of the country.

Suing Google and other search engines (which I believe they're already planning on doing) might work to a limited extent, but that would only make the key harder to find, not actually get rid of it. What's least likely, unfortunately, is that they'll roll over and die, in the face of an enemy they can only hope to impede (not defeat). They'll probably push for stronger laws, and DMCA-like laws in other countries, but that's both a very long-term strategy, and a not wholly effective one.

We'll see.

Wednesday, May 02, 2007

Transition

So, for the last year or so, I've been planning to get a new computer (maybe someday it'll actually happen...). Something close enough to top of the line that I won't need to buy another one for several years. I first got the computer I have now back in 2001, when Warcraft III was going into beta (boy, did that chew up a lot of time; kinda like when World of Warcraft came out...). At the time it had an Athlon XP 1700+ (1.47 GHz), 512 megs of PC133 RAM, and a GeForce 2 video card. I had to upgrade to 1 gig of RAM for World of Warcraft, the CPU to an Athlon XP 2200+ (1.8 GHz) when my CPU fried (thanks to the heat sink fan failing), and a friend just happened to have a GeForce 3 he didn't need anymore (and wasn't going to try to sell). With upgrades, that makes this maybe a 2002 or 2003 computer, which makes it about due for an upgrade (particularly considering it's a "gaming" computer). With the improved graphics in Burning Crusade (the World of Warcraft expansion pack), the average frame rate I get is about 13 FPS (down from 17 before the expansion).

What I'm currently considering getting (though if I don't get a new computer till after summer, I could probably get more for the same price) is a Core 2 Duo E6600 (2.4 GHz), 2 gigs of DDR2 memory, and either a GeForce 7900 GS, Radeon X1950 PRO, or Radeon X1950 XT. In terms of raw clock rate across both cores (2 x 2.4 GHz vs. 1.8 GHz), that CPU would be 2.67x as fast as my current one, and improvements to instruction performance (cycles per instruction) would push the number even higher. But there's a problem: I'd really only see about half of that performance (1.33x my current speed, plus the instruction performance improvements). Why? Because nobody is very good at multithreading yet.

We're currently in a transition period. We're rapidly approaching the physical cap on clock speed (my personal prediction is that we won't see clock rates go past 5 GHz or so - twice current speeds - using transistor technology). It's already much more practical to increase the number of cores/ALUs than to increase the clock speed. That's why dual cores have become almost standard, and quad cores will become standard in the next 5 years or so (though honestly, most people - those that don't do anything CPU intensive - don't NEED a multi-core CPU).

Yet programmers aren't keeping up. The move to parallelization is recent enough that very few programmers have the skills needed to write effective multithreaded code. That's why the Cell, with its 8 cores, is such a terror to program (and the Xbox 360 CPU, with 3 cores, to a lesser extent). Some things, like web servers, where you're dealing with a lot of short, independent tasks, are easy to split among many threads (not that you'd need a lot of CPU power for a web server; maybe it's a digital signature verification server or something). But can you imagine trying to split your core game logic equally into 8 threads? "Nontrivial" would be a rather dramatic understatement.
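To make the contrast concrete, here's the easy kind of parallelism - lots of independent work items with no shared mutable state, each thread getting its own slice. (A sketch only, using std::thread from the later C++11 standard for brevity; code from this era would use CreateThread or boost::thread instead.)

```cpp
// The "web server" case: independent tasks split trivially across threads.
// Each worker sums its own disjoint slice of the data; no locks needed
// because no two threads ever touch the same element or the same result slot.
#include <numeric>
#include <thread>
#include <vector>

long long parallel_sum(const std::vector<int>& data, unsigned nthreads) {
    std::vector<long long> partial(nthreads, 0);  // one result slot per thread
    std::vector<std::thread> workers;
    size_t chunk = data.size() / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        size_t begin = t * chunk;
        // The last thread takes any leftover elements.
        size_t end = (t + 1 == nthreads) ? data.size() : begin + chunk;
        workers.emplace_back([&, begin, end, t] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) w.join();  // wait for every slice to finish
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Game logic is the opposite situation: the entities all interact with each other every frame, so there are no naturally disjoint slices to hand out, and that's where the pain comes from.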

Current PC games, such as World of Warcraft (the only game I play, at the moment), are still primarily single-threaded. They may have some helper threads that do various things (a music streaming thread is an easy helper thread to make), but those threads use up very little CPU, compared to the main (single) game logic thread.

I can only hope that this will get better with time. My prediction is that in 5-10 years, the ability to break down complex tasks equally into multiple threads will become mandatory for programming positions in every company producing programs that use a significant amount of CPU (e.g. games). Unfortunately, that doesn't help me now.

Of course, to be fair, multi-core CPUs still have their uses right now. Anything that uses large numbers of threads, including several potentially CPU-intensive ones, can benefit from multiple cores (assuming you don't have a braindead driver that disables all but one of them; I'm looking at you, Creative!). Our company just got a dual quad-core system (8 cores in all) for its VM server (hosting many VMs). That's an excellent use, because for the most part individual threads will not consume that much CPU, and can be load-balanced well. And, of course, you'll see a noticeable performance improvement if you typically run something that consumes a moderate amount of CPU in the background while you play a game (or run a second CPU-intensive program). But again, neither of those helps me :P

Finally, a recommendation. If your school offers a course on multithreading optimization, take it. Better to learn now than later, and hopefully accelerate this transition.

Tuesday, May 01, 2007

Public Service Announcement

9 hackers looking into poor security,
249 MPAA lawyers browsing porn in the silence before the storm.

17 sites spreading the news,
2 sites surviving the mass visits.

157 drops of sweat down the AACS team's cheeks,
116 frantic phone calls buzzing in the offices.

227 lawyers starting up Plan B,
There's now 91 sites to shut down.

$216 sent as bribe for the Digg staff,
still 65 sites still up and running.

86 shutdown reasons discovered by abusing the DMCA,
197 prayers one will work.

99 sites now publishing the keys... oh wait!
86 managers finding the case is slipping out of control.

136 confused MPAA members mumbling about HD-DVD keys,
192 reasons found to keep trying to stifle sales
- Jugalator

One Key to rule them all, One Key to find them, One Key to bring them all and in the darkness bind them.
- addendum by ClamIAm