Friday, January 04, 2008

Subcontinental Scripts: Hindi vs. Urdu

As I mentioned earlier in the week, I recently taught myself how to read the Urdu script, and it was quite challenging. Reading from right to left isn't so hard to get used to, but there are some letters that seem to be interchangeable (e.g., two different ways of writing 'k'/'q'), and other letters that look painfully similar to one another on the page ('d', 'r', 'v', etc.). Also, some of the vowel markers one sees in Hindi/Devanagari, though they do exist in Urdu as diacritic marks, are frequently omitted in practice, so you often have to guess which vowel should be used based on context. Oh, and did I mention that there often aren't clear word breaks (depending on how the typography is done in a given book or newspaper)?

But once I got the script down (roughly), I was pleasantly surprised to find that Manto's Urdu vocabulary isn't that far off from standard Hindustani -- but then, he's a prose writer known for his accessible style. By contrast, the vocabulary of much Urdu poetry (e.g., Ghalib) is so full of Persian words as to be unintelligible -- at least to a barbarian ABD like myself.

Via the Sepia Mutiny News Tab (thanks, ViParavane), I came across a great post at the Language Log blog with a historical linguistics explanation for how the script (and language) divide came to be. I don't have much knowledge to offer on top of what Mark Liberman says, so the following are just the quotes in Liberman's post I found to be most interesting.

Liberman has several quotes from an article by linguist Bob King on the "digraphia" (Greek for "two scripts") of Urdu and Hindi. First, we have the background:

Hindi and Urdu are variants of the same language characterized by extreme digraphia: Hindi is written in the Devanagari script from left to right, Urdu in a script derived from a Persian modification of Arabic script written from right to left. High variants of Hindi look to Sanskrit for inspiration and linguistic enrichment, high variants of Urdu to Persian and Arabic. Hindi and Urdu diverge from each other cumulatively, mostly in vocabulary, as one moves from the bazaar to the higher realms, and in their highest -- and therefore most artificial -- forms the two languages are mutually incomprehensible. The battle between Hindi and Urdu, the graphemic conflict in particular, was a major flash point of Hindu/Muslim animosity before the partition of British India into India and Pakistan in 1947. (link)

Then there are the social implications, which are not trivial:

One can easily imagine a condition of pacific digraphia: people who speak more or less the same language choose for perfectly benevolent reasons to write their language differently; but these people otherwise like each other, get on with one another, live together as amiable neighbors. It is a homey picture, and one wishes it were the norm. It is not. Digraphia is regularly an outer and visible sign of ethnic or religious hatred. Script tolerance, alas, is no more common than tolerance itself. In this too Hindi-Urdu is lamentably all too typical. People have died in India for the Devanagari script of Hindi or the Perso-Arabic script of Urdu. It is rare, except for scholars, for Hindi speakers to learn to read Urdu script or for Urdu speakers to learn to read Devanagari. (link)

(And yes, even those of us who pretend to be scholars struggle with "script tolerance.")

Another scholar (Kelkar) gives some concrete examples of differences in vocabulary, with specific attention to the points of divergence:

Common words like chai 'tea', milna 'to meet', and mashin 'machine' are the same in either Hindi or Urdu. Vocabulary diverges sharply as we move from Low to High. The Hindi words for 'south' and 'temperature' (as in weather) are dakshin and tapman, the Urdu words junub and darja-e-hararat. The sentence "Who is the prime minister at the moment?" is ajkal pradhan mantri kaun hai? in Hindi, ajkal vazir-e azam kaun hai? in Urdu.

An Indian linguist has illustrated how far the styles deviate from each other by asking how the abstract expression "salvation's true path" might be translated into Hindi and Urdu at different style levels and among different ethnic-social groups. Village people would render this as mukti-ki sacci sarak (Bazaar Hindustani). Pandits or educated Hindus would say mukti-ki satya upay (Highbrow Hindi). Cultured Muslims would translate the phrase as nájat-ki haqq rah (Highbrow Urdu). Indians who speak English as their second language might say salweshan-ki tru path. The only indication that these four "languages" are in some sense variants of the same language is the genitive marker -ki. Words like satya and upay in the Highbrow Hindi rendering are from Sanskrit. Every single content morpheme in the Highbrow Urdu version is from Persian or Arabic. One sees how dramatically the character of a language is changed when the sources of borrowed words for new concepts are as far apart as they are in Hindi and Urdu: we might as well be dealing with different languages. (link)

Liberman's post ends with a reference to Gandhi, who struggled -- as early as 1917! -- to conceive of a "secularist" solution to the script problem, but failed to do so.

Obviously, with Partition, the terms of the debate over "standard" scripts changed in the Indian subcontinent. The debate in Pakistan is essentially over, and Urdu wins. But according to the scholars Liberman cites, the split over scripts is very much alive in India (especially northern India, though I have Muslim friends from places like Hyderabad who say their families only speak Urdu at home).

The joint/hybrid language spoken in much of northern India is Hindustani (mostly Hindi grammatical structures with a mix of Sanskritic and Persian vocabulary), which seems to have persisted in northern India despite attempts at Sanskritization. But even with that shared spoken language, it appears the division over scripts remains.


David Boyk said...

I don't think it's actually true to say that it's rare for Urdu speakers to read Devanagari. It's hard to get along in north India now without being able to read Hindi, even if you spend most of your time reading Nastaliq.

ana said...

Ah, this is so interesting, historical/sociolinguistics was something I had hoped to study myself especially when it came to our languages.

The interesting thing about k/q being interchangeable is that in Arabic and Persian, where the "q" is used, what differentiates it is the sound, the "q" sounding at times more like the English "g". But I wonder if that differentiation may have slightly diminished in Urdu. The only way I could tell the difference between the two in spelling was if someone actually said qainchi (scissors)-wala-qaaf. It is interesting for me to see how the spelling of the name of Benazir's father, for example, varies between Zulfikar and Zulfiqar. If I still remember the days when I read Urdu journals, I think the more accurate would be Zulfiqar.

Also interesting, I think, is that this division in our shared language was evident in the spoken dialogue of quite a few Bollywood movies after the seventies. I noticed after years of watching Hindi films that I had to struggle to understand the dialogue, which was more Sanskritized. Even in the opening credits for some of the films, the Urdu title was absent. That was educational. :)

Anita Desai's In Custody comes to mind when writing of how rare it is for Hindi speakers to learn to read Urdu.

Anonymous said...

There was something called 'Langdi' or 'Landi' Hindi which was prevalent in the northern part of India; it looked similar to Urdu but was actually Hindi. I believe this was a kind of bridge between Hindi and Urdu.


narayan said...

"Also, some of the vowel markers one sees in Hindi/Devanagari, though they do exist in Urdu as diacritic marks, are frequently omitted in practice, so you often have to guess which vowel should be used based on context."
Anthony Burgess has an explanation in his book "A Mouthful of Air" ...
"All Semitic languages - Hebrew, Arabic, Phoenician, and the rest - possess in common a peculiar devotion to consonants. In fact, a Semite does not think of a Semitic word as being composed of syllables; he thinks of it as being made of the strong bones of consonants with the vowel sounds floating above like invisible spirits. Moreover, the vowels of a Hebrew or Arabic word have little to do with the determination of meaning. Meaning is firmly staked out by the consonants alone. Thus the three consonants k-t-b in Arabic possess a root meaning of "reading", so that kitab means book, khatib means mosque reader, the prefix m makes maktaba - bookshop, and mokhtab - college. ... Only in Semitic languages, rich in consonants but relatively poor in vowels, can a group of consonants stand for a whole word. We believe that the Phoenician traders of the Mediterranean were the first to take over simplified symbols and use them to represent consonants. They created a "betagam" rather than an alphabet - a BCD, not an ABC. They were not interested in finding vowel letters because vowel letters were not necessary to the writing of Phoenician, which shared that characteristic with other Semitic languages. This meant that they were able to make do with twenty-odd symbols - a tremendous and epoch-making economy." ...
"... wherever the flag of the star and crescent has been planted, the script of the Koran has been planted too. ... This has led to various modifications of the alphabet to fit the non-Semitic structure of languages like Spanish and Malay. Malay, for instance, is allowed to use more vowel signs than would be thought proper in Mecca, though even then it keeps them down to a minimum. Persian or Iranian is an Indo-European language, like English, and it has not taken kindly to the imposition of an alphabet that is nearly all consonants."

Wonderful reading!

junglibandar said...

Prof. Amardeep,
Very interesting post. You may want to look into the case of the Konkani language, which is spoken predominantly in Goa, and in some regions of Maharashtra and Karnataka. It was granted the status of official language a little after Goa gained statehood. Unfortunately this was extended to the Devanagari version only. Since then demands have been made to include the Roman version as well. The majority of the Christian population writes Konkani in Roman script. If local newspapers are anything to go by, this is still a very sensitive and emotional issue for Goans.

narayan said...

My last comment was prescient!
The NYTimes Book Review of 6 Jan 08 arrived with this afternoon's mail. I quote from the Essay, "Arabic Lessons", by Robert F. Worth :
"... all Arabic words have simple three- or four-letter roots, with systematically derived cognates that allow you to unfold a whole range of meanings from a single word. The word for 'to cook', for instance, is related in a predictable way to the words for 'kitchen', 'dish', 'chef', and so on. Arabic speakers are often dismayed to discover that the same principle is less common in English."
You would think that Mr. Worth had been reading Mr. Burgess in his sleep! The very next paragraph has :
"Arabic's hard 'h' letter, so difficult to pronounce at first, began to seem like a lovely breath of air, as if countless tiny parachutes were lifting the words above their glottal base."
My own experience of trying to learn a new language is filled with such tiny parachutes -- epiphanies they call them.

Fëanor said...

Nice one, Amardeep, and Happy new year! The issue of digraphia mapping religious divides is not restricted to India, of course. Consider Serbo-Croat. Serbs write it in Cyrillic and are Orthodox, Croats write it in the Roman script and are Catholic. For all practical purposes the dialects are almost 100% mutually intelligible. But can the speakers stand each other? There is no end to the divisions between humans on this vale of tears!

sarah said...

Excellent post, Amardeep.
As a language teacher, I really appreciate it.
Farah Beal

I have no nickname! said...

The reason why the k/q letters seem interchangeable in some cases is that words borrowed from Arabic or Persian keep their original form. It is something which people simply have to learn. For example, "mazak", which means to joke, is spelled differently than I thought it would be. I thought it would be meem, zai, alif, k. It is actually spelled meem, dhai, alif, q.

So I supposed that it is probably a borrowed word from Persian or Arabic.

Neha Sharma said...

Actually it takes some time to change a habit. I'm also learning Urdu nowadays and my problem is much the same as yours. My mind is trained to read from left to right; that's why my teacher told me to practise reading and writing Urdu daily in the morning, because it's the best time to change a habit.

Anonymous said...

Great article! Urdu is the mother language of Delhi, Haryana, Madhya Pradesh, Uttar Pradesh and Bihar, though it is also used in many other states throughout Hindustan. It's great that you have been teaching yourself Urdu :) In fact, that's how I taught myself as well.

Just to clarify, there is a difference between ک (for the k sound) and ق (for the q sound). ک is pronounced like the English k, and is used to write Urdu words such as kaam (کام), karna (کرنا), and koshish (کوشش).

ق is pronounced from the back of your throat, and produces a guttural sound. Words include qameez (قمیض), qabristan (قبرستان), and qaabil (قابل).
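
For readers who work with Urdu text on a computer, the k/q distinction discussed above also shows up at the encoding level: the two letters are separate Unicode characters. The following is only a minimal sketch, assuming the Urdu "k" is encoded as U+06A9 (Arabic-language text usually uses U+0643 instead); the helper names `describe` and `initial_sound` are made up for illustration.

```python
import unicodedata

# Code points are written as escapes so the example does not depend on how
# the letters were typed or copied.
# Assumption: Urdu "k" is U+06A9 (KEHEH); Arabic text typically uses U+0643.
KAF = "\u06a9"  # the "k" letter
QAF = "\u0642"  # the "q" letter


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def initial_sound(word: str) -> str:
    """Hypothetical helper: classify a word by its first letter."""
    if word.startswith(KAF):
        return "k"
    if word.startswith(QAF):
        return "q"
    return "other"


print(describe(KAF))
print(describe(QAF))
print(initial_sound("\u06a9\u0627\u0645"))        # kaam
print(initial_sound("\u0642\u0627\u0628\u0644"))  # qaabil
```

Because the two letters are distinct characters, a spelling such as Zulfikar versus Zulfiqar is ambiguous only in Roman transliteration, not in the Urdu script itself.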