GYPSY LANGUAGES
Ian Hancock
The category “Gypsy languages” is not a linguistic one, nor an accurate one by any criterion that would group them in any meaningful way. There are a number of quite distinct populations referred to as “Gypsies”, such as the Austronesian (Dayak)-speaking “Sea Gypsies”, the Irish Travellers and the groups dealt with in Grierson’s Gipsy Languages (Volume XI of his Linguistic Survey of India), which includes languages as unrelated as those of the Dravidian (Telugu)-speaking Bhamtas and the Indo-Aryan (Jaipuri)-speaking Pendaris.
The reason for the loose application of this label is traceable to 19th-century Britain, where the Romani population was receiving considerable attention as the supposed keepers of a much missed and much romanticized pre-industrial rural way of life. They were also the focus of attention as an “exotic” Asian population in the heart of England, as well as one considered greatly in need of Christian salvation. Both their language and their ancestry fascinated small groups of academics, folklorists and dilettantes, and over time gypsy (with a lower-case initial) has come to stand more for an imagined way of life than for an ethnic people, since the “True Romany” idealized by such individuals did not in fact exist.
The culprit would seem to be Sir Denzil Ibbetson, who used the word “gypsy” in his 1881 Census Report for the Punjab to refer to certain indigenous Indian populations which, because of their itinerant way of life, reminded him of the “gypsies” he was familiar with in England.
Despite this established use of the word, there may nevertheless still be a legitimate need for such a category if it be properly defined, since no other label exists for the large number of languages, or remnants of languages, spoken by populations that originated in India and left before or during the mediaeval period.
There are many such groups which, on the basis of more or fewer Indian elements in their speech, and sometimes subjectively from their physical appearance, are included in this category. For most of these, however, the ability to be more specific about their identity is made difficult by two factors: the names applied to them, and the fact that whatever Indic linguistic material is evident consists only of a handful of lexical items otherwise used in the grammatical matrix of whatever the local language happens to be. Arnold noted that “[i]t is a remarkable fact that in the wide stretch of country from the Indus to the Ægean (Byzantine Gypsies) in the West and to Syria (Nuri) in the South, there exists no nomadic group which speaks a language corresponding with European Romani” (1967: 110).
Names
The only overview of such languages to date is Kenrick (1976), where he lists many of the names for Middle Eastern “gypsy” groups found in the literature. These are Alimah, Abdal, Aptal, Awgon, Awgon-Luli, Baluji, Banu Sassan, Barake, Beluchi, Berberi, Beshawan, Bosha, Catchar, Djugi, Dom, Dshukihar, Dummi, Fayjan, Fiuj, Gavbar, Geygel, Goudari, Ghagar, Gurbat or Kurbat, Kustani, Guaidiya, Haddad, Helebi, Hindustani Kentschlini, Köčer, Krishmal, Luli, Jat, Kabuli, Kaloro, Kara-chi, Tabriz, Kara Luli, Karashmar or Krismal, Kasibar or Mugat, Kenites, Kersi, Koli, Koudji or Kochi, Kouli, Lom, Luli, Lur, Luri, Mazang, Midreb, Motribiyya, Multani, Nawar, Odjuli, Pessewann, Posha, Qarabana or Qarabtu, Qarachi, Quenites, Qorbati or Ghorbati, Quf, Rawazi, Sagvand, Sahsawan, Sayabigeh, Shurasti, Sozan, Suzmani, Tat, Tavoktarosh, Zangi, Zargari and Zott. For most of these we have only a passing reference in this or that document. Sometimes quite different names are applied to the same population (e.g. the Lom or Bosha); sometimes, like the very label gypsy itself, the same name is applied to quite distinct groups (e.g. Djugi being applied to both the Iranian Koli and the Tajikistani Mugat). Sometimes they are clearly geographical (e.g. Helebi < Aleppo or Kabuli < Kabul) and sometimes they are occupational (e.g. motribiyya < Arabic “musician”). The overriding commonality seems to be only that such populations are migrant, “gypsy” referring to their behaviour rather than their ethnicity. An excellent bibliography, as well as some grammatical discussion, is to be found in Windfuhr (2002).
Paucity of linguistic material
In cases where no grammatical material has been retained, it is not possible to make an identification with any particular Indian language or even dialect group, nor even necessarily with other “gypsy” languages. That two of these vernaculars may share the Indic word pani for “water” demonstrates nothing beyond an Indian connection; in India itself, a hundred languages have the same word. Nor is it the case that all of them were ever “full” languages at some earlier time; non-indigenous lexicon can be acquired in many ways and inserted into existing languages. That English, for example, contains the Chinese-derived word tycoon does not mean that it is a Sinitic language or even that the Anglo-Saxons ever interacted at first hand with speakers of Chinese.
Romani Studies, which began to emerge as an area of scientific (mainly linguistic and ethnographical) endeavour in the 1780s, has traditionally regarded “gypsy” languages as belonging to three branches: Romani, Domari and Lomavren (Hancock, 1995). The Indian origin of Romani, the language of the Rom or European “Gypsies”, was first recognised by western scholars in the late 1700s; by the middle of the following century Domari, the language of the Dom “Gypsies” in Syria and elsewhere, had begun to be documented, and by 1887 Lomavren, spoken by the Armenian Lom, had become a part of the discussion (Patkanov, 1887). It was initially assumed that all three were branches of the same original migration out of India, and attempts have been made to reconstruct protoforms based upon their combined analysis (Hancock, 1988). This no longer appears to be the case, at least for Domari. There are structural and lexical features of this language that point to a much earlier separation from India than is evident for the other two.
The first published account of Domari was Pott (1846), where he summarized notes on the language sent to him by the Reverend Eli Smith, an American missionary working in Syria. The only extensive grammatical and lexical account remains Macalister (1914), though it is flawed and seriously out of date; a modern linguistic description is in preparation by Matras (forthcoming), who has also written on the present state of the language in Jerusalem (1999).
The existence of Lomavren, spoken in Armenia, Georgia, eastern Turkey and probably elsewhere in the region, was first brought to the attention of European scholars in 1828 when a list of 100 words was published by von Joakimov (mentioned in Finck, 1907:2 which, together with Patkanov, op. cit., and Papazian 1901, is the principal published source on the language).
Speakers of Domari refer to themselves as Dom, but are most often called Luri or Nuri (plural Nawar) or Luli in the literature. Like the Lomavren-speaking Lom and the Romani-speaking Rom, they have not retained the details of their own history, and attempts to reconstruct it have been mostly speculative. It is possible that their ancestors left India in response to fifth and sixth century raids into India by the Huns, when the Indian kings sent out fighters to resist them. The word for a non-Dom is gačča which, like its equivalents in Lomavren (gačav) and Romani (gadžo), derives from the Middle Indo-Aryan gajjh- meaning “civilian; non-military person”, and the word luri itself, though usually attributed to an Arabic origin, may be from MIA luth- “plunderer” (compare Romani lur, ditto). Speakers of other varieties of Domari, among them Karači (a name possibly from Turkish karaca “swarthy”) and Mıtrıp (possibly from Arabic motribiyya “musician”), live elsewhere in the Middle East (Patkanov, 1887; Benninghaus, 1991; Hancock, 1995:31). Against so early a departure, and in support of a post-9th century move, is the fact that none of the Persian items in Domari (or in Lomavren or Romani for that matter) derive from the Middle-Iranic period.
Lomavren exists almost solely as a vocabulary, and is a register of Armenian rather than an Indo-Aryan language. For this reason the only clues as to its history and affiliation are in its lexicon. Nevertheless it shares much more of this, as well as of its phonology, with Romani than it does with Domari, and an argument might be made for their having been one before developing independently in separate directions in eastern Anatolia. If the koïné hypothesis outlined below proves to be correct, it might also be the case that Lomavren was never a discrete language at any time, but has always existed only as a cryptic lexicon.
Romani is the most widely spoken of all the Gypsy languages, and the most extensively studied. It supports an extensive literature, is used on a number of websites, and is taught regularly at several universities (Charles, Moscow, Texas, the Sorbonne). Some of its dialects, such as those spoken in northeastern Europe, demonstrate after a thousand years a truly remarkable retention of Indic lexicon and morphosyntax, while others survive only as a limited lexical corpus in the structural and phonological framework of a European language, and are typologically no longer Romani, e.g. Pogadijib in England or Caló in Spain (Bakker & Cortiade, 1991).
Determining its linguistic origins has preoccupied scholars since its Indo-Aryan identity was first realised over two centuries ago. The theory that gained the widest acceptance was proposed in 1830 by John Harriott, viz. that the origin of the Romani speakers was to be found in the Persian epic by Firdousi, the Shah Nameh, which describes events that took place during the Sassanid Dynasty. Here, 10,000 Indian musicians were sent as a gift to the shah Bahram Gur in AD 439 but after a year were sent away from Persia, presumably moving on into other parts of the Middle and Near East where they remained until eventually coming into Europe centuries later. If this account were true, it would seem to have more relevance to Domari than to Romani history.
The Rajput Hypothesis
A more recent hypothesis (Hancock, 2000, 2002a) maintains that the ancestors of the Rom were a conglomerate of different ethnic peoples assembled into a military force, the Rajputs, to confront the Islamic invaders led by Mahmud Ghaznavi in a series of raids between AD 1000 and 1027. It further holds that a koïnéized military lingua franca, which it calls Rajputic, emerged under the same circumstances that gave rise to the Urdu language (Urdu is from the Turkish word for “army camp”). That the hypothesized Rajputic was still spoken in India during this period is indicated by the redistribution of the original Middle Indo-Aryan neuter nouns. Their individual reassignments to the masculine and feminine sets in Romani are mirrored in Hindi and other languages still spoken in India and this process, a salient characteristic of New Indo-Aryan, did not happen earlier than ca. AD 1000. Domari on the other hand is historically a three-gender language, evidently separating from Indic-speaking territory at an earlier time.
It is also significant that while Romani, Domari and Lomavren each contain Persian-derived lexical adoptions, there is not one such item shared by all three, and Romani and Domari have less than a fifth of them in common. If all three had passed through Persian-speaking territory as one migration before separating, a higher incidence of shared items would be expected.
The Ghaznavids took prisoners of war (called both ghulams, i.e. “slaves”, and “Indians”) to use in their own forays; the Seljuqs in turn defeated the Ghaznavids (at Nishapur in Khurasan, in AD 1038) before later moving into Anatolia and establishing the Sultanate of Rum. Köymen writes that it is well documented that after that defeat, “soldiers from throughout Khurasan . . . some of whom may have served the Ghaznavids” joined the battalions of the Seljuqs (1957:356). Indians are recorded as having been used for the same purpose by the Seljuqs in their raids into eastern Anatolia (which had begun as early as AD 1015) and were likely the same prisoners of war captured from the Ghaznavids. As it was for the Rajputs, Persian was also the Seljuqs’ language of administration, though they were themselves a Ghuzz Turkic people. There is no Turkic material in the pre-European Romani lexicon, though modern Romani shares over half of its accreted Persian-derived lexicon with Urdu, far more than it does with Domari.
While this hypothesis requires elaboration, it is generally accepted that it was indeed the Seljuqs who were responsible for pushing the ancestors of the Rom into Anatolia following the Battle of Manzikert in AD 1071 (Soulis, 1961:163; Fraser, 1992:46) and perhaps even earlier. It would have been at this point that the ancestors of the Lom separated and moved east before the military lingua franca had acquired native speakers; there are no Greek items in Lomavren, but Byzantine Greek is the second largest contributor to the pre-European lexicon of Romani after its Indic items. It is also the case that Romani has as much a Balkan (Greek) character linguistically as it has an Indian one, and it might be surmised that it was in the Byzantine period that the language, as well as the ethnic identity, crystallized. It has also been suggested that the name Rom derives from Rum rather than from any Indic root; all citizens of the sultanate called themselves Romaivi, and the first Romanies to arrive in Greece called themselves Romiti.
Using the techniques of lexicostatistical dating, Fraser (1989) has shown that dialect splits within Romani began in the middle of the 11th century. According to one of a number of proposed schemata, the modern language demonstrates three major divisions, which Cortiade calls strata: the first, and geographically most widespread in Europe, stretching from the Balkans to the north, east and west; the second concentrated in south-central Europe; and the third emerging from the Romani-speaking populations who had been held in slavery in Moldavia and Wallachia until the 1860s. These last have since migrated to all parts of the world, and are particularly well represented in North and South America. The earliest reliable reference to a Romani presence in the Balkans dates from ca. AD 1100. It is entirely likely that these broad dialect divisions reflect more than one period of migration into Europe from the Byzantine Empire; some dialects, for example, show minimal lexical influence from Greek, suggesting an early move out of Anatolia. There are some sixty different Romani dialects today, and attempts are underway to create a universal written standard (Hancock, 2003).
Works cited
Acton, Thomas A., ed., 2000. Scholarship and the Gypsy Struggle: Commitment in Romani Studies. Hatfield: The University of Hertfordshire Press.
Arnold, Hermann, 1967. “Some observations on Turkish and Persian Gypsies”, Journal of the Gypsy Lore Society, 46(3/4):105-122.
Bakker, Peter, & Marcel Cortiade, eds., 1991. In the Margin of Romani: Gypsy Languages in Contact. Amsterdam: Instituut voor Algemene Taalwetenschap. Studies in Language Contact No. 1.
Bakker, Peter, & Yaron Matras, 1997. “Romani linguistics, a very brief history”, in Matras, Bakker & Kyuchukov, 1997:vij-xvj.
Benninghaus, Rüdiger, 1991. “Les Tsiganes de la Turquie orientale”, Etudes Tsiganes, 3:47-60.
Cortiade, Marcel, 1998. “The Gypsy language”, Interface, 31:9-14.
Daftary, Farimah, ed., 2003. Language Politics. Flensburg: European Centre for Minority Issues.
Finck, Franz N., 1907. Die Sprache der armenischen Zigeuner. St. Petersburg: Imperial Science Academy.
Fraser, Angus, 1989. “Looking into the seeds of time”. Paper presented at the annual meeting of the Gypsy Lore Society, Toronto, 7-9 April.
Fraser, Angus, 1992. The Gypsies. Oxford: Blackwell.
Grierson, George A., 1922. Linguistic Survey of India. Delhi: Motilal Banarsidass.
Hancock, Ian, 1988. “The development of Romani linguistics”, in Jazyery & Winter, 1988:183-223.
Hancock, Ian, 1995. “On the migration and affiliation of the Dōmba: Iranian words in Rom, Lom and Dom Gypsy”, in Matras, 1995:25-51.
Hancock, Ian, 2000. “The emergence of Romani as a koïné outside of India”, in Acton, 2000:1-13.
Hancock, Ian, 2002a. We Are the Romani People: Ame Sam e Rromane Džene. Hatfield: University of Hertfordshire Press.
Hancock, Ian, 2003. “Language corpus and language politics: the case of the standardization of Romani”, in Daftary, ed., 2003.
Ibbetson, Denzil C.J., 1881. Census Report for the Punjab. Calcutta: Indian Foreign Office.
Jazyery, Ali & Werner Winter, eds., 1988. Languages and Cultures: Studies in Honor of Edgar C. Polomé. Berlin & New York: Mouton de Gruyter.
Kenrick, Donald, 1976. “Romanies in the Middle East”. Roma, 1(4):5-8; 2(1):30-36; 2(2):3-39.
Köymen, Mehmet, 1957. Büyük Selçuklu Imparatorluğu’nun Kuruluşu I. Ankara: DTCF.
Macalister, R.A.S., 1914. The Language of the Nawar or Zutt: the Nomad Smiths of Palestine. Edinburgh: Constable. Gypsy Lore Society monograph No. 3.
Matras, Yaron, ed., 1995. Romani in Contact. Amsterdam: Benjamins.
Matras, Yaron, Peter Bakker & Hristo Kyuchukov, eds., 1997. The Typology and Dialectology of Romani. Amsterdam: Benjamins.
Matras, Yaron, 1999. “The state of present-day Domari in Jerusalem”. Mediterranean Language Review, 11:1-58.
Matras, Yaron. A Grammar of Domari. Berlin: Mouton de Gruyter. In preparation.
Papazian, V. M., 1901. Armenskije Boša (Ciganje): Etnografičeskij Očerk. Moscow.
Patkanov, K.N., 1887. Cigani Njeskoljko Slov o Narječijax Kavkazskix Cigan: Boša i Karači. St. Petersburg: Akademia Nauk. A shorter English version, “Some words on the dialects of the Transcaucasian Gypsies: Boša and Karači”, appears in the Journal of the Gypsy Lore Society, 1(3):229-334 (1908).
Pott, August F., 1846. “Ueber die Sprache der Zigeuner in Syrien”. Zeitschrift für die Wissenschaft der Sprache, 1:175-186.
Soulis, George C., 1961. “Gypsies in the Byzantine Empire and the Balkans in the late Middle Ages”, Dumbarton Oaks Papers, 15:142-165.
Windfuhr, Gernot, 2002. “Gypsy dialects”. Encyclopædia Iranica, Volume XI, Fascicle 4, pp. 415-421.