Subcontinental Scripts: Hindi vs. Urdu

As I mentioned earlier in the week, I recently taught myself how to read the Urdu script, and it was quite challenging. Reading from right to left isn't so hard to get used to, but there are some letters that seem to be interchangeable (e.g., two different ways of writing 'k'/'q'), and other letters that look painfully similar to one another on the page ('d', 'r', 'v', etc.). Also, some of the vowel markers one sees in Hindi/Devanagari, though they do exist in Urdu as diacritic marks, are frequently omitted in practice, so you often have to guess from context which vowel is meant. Oh, and did I mention that there often aren't clear word breaks (depending on how the typography is done in a given book or newspaper)?

But once I got the script down (roughly), I was pleasantly surprised to find that Manto's Urdu vocabulary isn't that far off from standard Hindustani -- but then, he's a prose writer known for his accessible style. By contrast, the vocabulary of much Urdu poetry (e.g., Ghalib) is so full of Persian words as to be unintelligible -- at least to a barbarian ABD like myself.

Via the Sepia Mutiny News Tab (thanks, ViParavane), I came across a great post at the Language Log blog with a historical linguistics explanation for how the script (and language) divide came to be. I don't have much knowledge to offer on top of what Mark Liberman says, so the following are just the quotes in Liberman's post that I found most interesting.

Liberman has several quotes from an article by linguist Bob King on the "digraphia" (Greek for "two scripts") of Urdu and Hindi. First, we have the background:

Hindi and Urdu are variants of the same language characterized by extreme digraphia: Hindi is written in the Devanagari script from left to right, Urdu in a script derived from a Persian modification of Arabic script written from right to left. High variants of Hindi look to Sanskrit for inspiration and linguistic enrichment, high variants of Urdu to Persian and Arabic. Hindi and Urdu diverge from each other cumulatively, mostly in vocabulary, as one moves from the bazaar to the higher realms, and in their highest -- and therefore most artificial -- forms the two languages are mutually incomprehensible. The battle between Hindi and Urdu, the graphemic conflict in particular, was a major flash point of Hindu/Muslim animosity before the partition of British India into India and Pakistan in 1947. (link)


Then there are the social implications, which are not trivial:

One can easily imagine a condition of pacific digraphia: people who speak more or less the same language choose for perfectly benevolent reasons to write their language differently; but these people otherwise like each other, get on with one another, live together as amiable neighbors. It is a homey picture, and one wishes it were the norm. It is not. Digraphia is regularly an outer and visible sign of ethnic or religious hatred. Script tolerance, alas, is no more common than tolerance itself. In this too Hindi-Urdu is lamentably all too typical. People have died in India for the Devanagari script of Hindi or the Perso-Arabic script of Urdu. It is rare, except for scholars, for Hindi speakers to learn to read Urdu script or for Urdu speakers to learn to read Devanagari. (link)


(And yes, even those of us who pretend to be scholars struggle with "script tolerance.")

Another scholar (Kelkar) gives some concrete examples of differences in vocabulary, with specific attention to the points of divergence:

Common words like chai 'tea', milna 'to meet', and mashin 'machine' are the same in either Hindi or Urdu. Vocabulary diverges sharply as we move from Low to High. The Hindi words for 'south' and 'temperature' (as in weather) are dakshin and tapman, the Urdu words junub and darja-e-hararat. The sentence "Who is the prime minister at the moment?" is ajkal pradhan mantri kaun hai? in Hindi, ajkal vazir-e azam kaun hai? in Urdu.

An Indian linguist has illustrated how far the styles deviate from each other by asking how the abstract expression "salvation's true path" might be translated into Hindi and Urdu at different style levels and among different ethnic-social groups. Village people would render this as mukti-ki sacci sarak (Bazaar Hindustani). Pandits or educated Hindus would say mukti-ki satya upay (Highbrow Hindi). Cultured Muslims would translate the phrase as nájat-ki haqq rah (Highbrow Urdu). Indians who speak English as their second language might say salweshan-ki tru path. The only indication that these four "languages" are in some sense variants of the same language is the genitive marker -ki. Words like satya and upay in the Highbrow Hindi rendering are from Sanskrit. Every single content morpheme in the Highbrow Urdu version is from Persian or Arabic. One sees how dramatically the character of a language is changed when the sources of borrowed words for new concepts are as far apart as they are in Hindi and Urdu: we might as well be dealing with different languages. (link)


Liberman's post ends with a reference to Gandhi, who struggled -- as early as 1917! -- to conceive of a "secularist" solution to the script problem, but failed to do so.

Obviously, with Partition, the terms of the debate over "standard" scripts changed in the Indian subcontinent. The debate in Pakistan is essentially over, and Urdu has won. But according to the scholars Liberman cites, the split over scripts is very much alive in India (especially northern India, though I have Muslim friends from places like Hyderabad who say their families speak only Urdu at home).

The hybrid language spoken in much of northern India is Hindustani (mostly Hindi grammatical structures with a mix of Sanskritic and Persian vocabulary), which seems to have persisted there despite attempts at Sanskritization. But even with that shared spoken language, it appears the division over scripts remains.