Rhotic degemination in Sanskrit and the etymology of Vedic ūrú- ‘thigh’, Hittite UZU(u)walla- ‘id.’

This paper examines the absence of geminate -rr- in Sanskrit and argues that the synchronic ban on this sequence results from continued high ranking of an Obligatory Contour Principle constraint against heteromorphemic geminates (inherited from PIE) combined with the substrate influence of Dravidian languages in which the rhotics are non-geminable. New -rr- sequences that arose in Proto-Indo-Iranian and Proto-Indo-Aryan from PIE *-LL- or *-LHL- after loss of the laryngeal and merger of *l with the rhotic were repaired through degemination. This hypothesis predicts a development of PIE *(-)CL̥HLV- to Sanskrit (-)Cī/ūrV-, which has not been previously recognized in the treatments of Indic historical phonology. This development is arguably found in mūrá- ‘stupid’ < *mūrra- < *mr̥hₓ-lo- (cf. Hitt. marlant- ‘stupid’), ūrú- ‘thigh’ < *u̯ūrru- < *(hₓ)u̯l̥hₓ-Lu- ← *(hₓ)u̯l̥hₓ-Lo- (cf. Hitt. walla- ‘thigh’), śīrá- ‘fervent’ < *śīrrá- < *k̑l̥hₓ-Ló- (cf. śrā́ya-ti ‘be fervent’), and perhaps in several other examples.

Indo-European Linguistics 9 (2021) 171-202

Another remarkable fact is that /r/ is the only sound expressly excluded by  from the list of Sanskrit consonants that may be doubled in clusters.8 The sequence -rr- is thus prohibited in Sanskrit both in derived contexts9 and morpheme-internally.10 Sanskrit is not typologically unique in this regard: for instance, in the Romanesco dialect of Italian the rhotic is the only consonant excluded from gemination in external sandhi (raddoppiamento sintattico).11 The same restriction is observed in Tamil, Malayalam, and other Dravidian languages where all consonants may be geminated (for instance, in the formation of causative stems), except for /ɾ/, /r/, and /ɻ/.12 This situation is reconstructed for Proto-Dravidian;13 its potential significance for the question of non-geminability of the rhotic in Sanskrit is obvious, since Dravidian languages exerted considerable contact influence on Indo-Aryan.14 There are other languages in which geminate consonants are generally permitted, but rhotics are joined by some other consonants in being non-geminable.15 The phonetic basis of the ban on geminate rr in Sanskrit is difficult to determine with certainty, since we do not know how exactly Sanskrit /r/ was articulated: Indian grammatical tradition variously describes it as dental (dantya), alveolar (dantamūlīya 'tooth-rootic'), or retroflex (mūrdhanya 'cerebral').16 However, evidence across languages suggests that rhotics of all types are marked as geminates.17 Podesva (2000) proposed a perceptual account of the constraint on geminate sonorants in general arguing that they are easily confused with corresponding singletons due to their relative acoustic similarity to flanking vowels.18 This hypothesis is supported with experimental evidence showing that intervocalic geminate sonorants are spectrally continuous with adjacent vowels making their constriction duration difficult to perceive.19 This paper is devoted to the diachronic aspect of the problem: how did the *GemRhotic constraint come about in Sanskrit?

To start at the very beginning, we know that there was no *-rr- in Proto-Indo-European where heteromorphemic geminates were not allowed.20 For instance, as Jochem Schindler observed,21 there are no adjective stems in *-ro- made from roots in final *-r: PIE speakers apparently avoided the derivation that would lead to a dispreferred structure. We can therefore theorize that Proto-Indo-Iranian inherited from PIE an Obligatory Contour Principle constraint against adjacent identical segments: heteromorphemic geminates continued being avoided in derivation and were repaired through deletion when unavoidable. The continued high ranking of this constraint in Indo-Aryan conspired with the nongeminability of rhotics in Dravidian substratum resulting in the restriction on geminate -rr- throughout the history of Sanskrit.22 Let us explore the predictions of this hypothesis in more detail. While it is certain that Proto-Indo-Iranian did not inherit any geminate liquids going back to PIE *-rr- or *-ll-, since these sequences were avoided in the proto-language, it seems that in PIE *-rl- or *-lr- were possible clusters: individual IE languages either show no restriction on these sequences (cf. Luw. ḫūtarlānni- '(little)

8 Optional gemination of the second consonant in a cluster is taught by Indian grammarians and supported by ample manuscript and inscriptional evidence, e.g., brahmā > brahmmā 'Brahman', apa hnute > apa hnnute 'denies', or adya > addya 'today', see Cardona (2013: 51-64).
9 Unsurprisingly, the sequence is not found in internal sandhi either. There is only one set of r-initial endings in the language, namely, 3 pl. -re / -rate, -ran / -ram and ipv. -rām, but there are no such forms in Vedic made from roots ending in -r: for instance, the root śr̥- 'to smash' makes aor. pass. (a)śari, but in the plural the verbal adjective has to be used instead, as in RV 1.174.6 tváyā śūrtaḥ 'they were shattered by you', not tváyā †aśr̥ran (where, admittedly, the cluster would have been -r̥r- and not -rr-). Middle perfects made from roots in final -r invariably use connecting -i- before 3 pl. ending: dadhriré, īriré, cakriré, etc.
10 Sasha Lubotsky points out to me that cases of distant dissimilation of r … r, e.g., intens. *árar- (root r̥-) > álar-ti 'be on the rise' (AiGr 1.221) or *durhr̥ṇā- 'rage' (root hr̥-) > durháṇā- (Narten 1982: 140) may be related to the phenomenon under discussion.
11 In modern Romanesco rhotic degemination is a sociolinguistic marker, see Nodari & Meluzzi (2020).
12 For (literary) Tamil see Lehmann (1994: 11); for Malayalam see Mohanan & Mohanan (1984: 581-582) and Sadanandan (1999: 22-23).
13 See Steever (1998: 16); Andronov (2003: 300); Krishnamurti (2003: 152).
14 See, e.g., Hock & Bashir (2016: 256-359 (Niang 1997: 48).
16 See Allen (1953: 54-55). Hock (1992: 72-73) argued for alveolar articulation of Sanskrit /r/ and Kobayashi (2004: 99) put forth a phonetic reason for the restriction on -rr- based on the doctrine that Sanskrit /r/ was an alveolar flap, but see the criticism by Ryan (2017: 303). To me retroflex articulation seems likeliest because of the nati-rule (n-retroflexion via harmony, e.g., rūpaṇi [ɻuːpaːɳi] < /ɻuːpaːni/ in ex. (1) above) and the dissimilation -ṣr- > -sr-, e.g., tisrás '3' (nom.pl. fem. of tri-) or usrás (gen.sg. of uṣár- 'dawn') on which see Hale (1998).
17 See the general discussion by Ryan (2019: 125-126).
18 See also Kawahara (2007).
19 See Hansen & Myers (2017); Kawahara & Pangilinan (2017).
20 See Szemerényi (1996: 109-110); Mayrhofer (1986: 110-111 & 120-121); Byrd (2015: 43-48 & 2018: 2071). So-called "expressive gemination" (Watkins 2013) is a special case.
21 Apud Mayrhofer (1986: 12199).
22 For the rhotics' aversion to gemination in Proto-Dravidian see n. 13 above.

27 Space limitations prohibit discussing the question of possible preservation of PIE *l in Indo-Iranian, and I am going to proceed on the admittedly simplified assumption that PIE *l > Proto-Indo-Aryan *r at least in those dialects of Indo-Aryan which underlie the texts from which the examples discussed in this paper have been drawn. For a recent discussion of PIE *l in Indo-Aryan, see Schoubben (2019).
28 Ved. dharā- 'blade, cutting edge' (YAv. dārā- 'id.', Ch. Sogd. d'r) has been explained as a metonymical extension of *dharā- 'gush, pouring, casting (of liquid metal)' (EWAia 1.789, root dhani), which is not entirely satisfactory on the semantic side. A possible direct comparandum may be found in Gmc. *darra-n 'spear', ON darr for which Schaffner (2001: 124-125) posited a preform TPdhor-Ló- and proposed an etymological connection with IE *dher- 'to hold (a weapon in hand)'. But under this root etymology Ved. dharā- can be just as well accounted for as a reflex of *dhór-eh2- or *dhḗr-eh2- (rather than *dhér-leh2-), while Gmc. *darra-n may also be analyzed as an outcome of *dhor-s-ó-.
29 Since it is unclear whether Proto-Indo-Iranian *Carra- became *Cara- or *Cāra- with compensatory lengthening, I am going to employ a compromise notation *Cara-.
There is another, more promising question that can be investigated under the hypothesis of a continued high ranking of an OCP constraint against geminates in Indo-Iranian: what happened to new -rr- sequences arising from PIE *-LHL- after application of various laryngeal-loss rules, such as the Saussure Effect or the creation of "long vocalic resonants"? For instance, there is no reason to doubt on phonological grounds that *CVLhₓ-Lo- could be a valid PIE formation: cf. *ghelh2-ro- > OIr. galar (n., -o-) 'sickness, disease', Hitt. kallar(a)- 'unfavorable, baleful'. An o-grade *CoLhₓ-Lo- would be subject to the Saussure Effect and become *CoL-Lo- already in PIE.30 If the liquids were identical (e.g., *Corhₓ-ro- > *Cor-ro-), we would expect the resulting sequence of two heteromorphemic resonants to be degeminated already in PIE: cf. *(s)tómh1-mn̥ 'a cut, opening' > *(s)tómmn̥ > *(s)tómn̥ > Gk. στόμα 'mouth', Hitt. ištaman- 'ear'.31 If the liquids were different (e.g., *Corhₓ-lo-), the resulting form *Cor-lo- would develop into Proto-Indo-Iranian *Car-ra- after the merger of *l and *r. Since we never find forms with geminate rr in Indo-Aryan or Iranian,32 it is not unreasonable to speculate that *Carra- in our exempli gratia reconstruction would be repaired as *Cara- since it violated the constraint against geminates. Unfortunately, this development cannot be demonstrated due to the same problem as above: a hypothetical Sanskrit or Avestan form *Cara- from PIE root *CeLhₓ- can always be explained as a plain thematic derivative, rather than a reflex of *CoL1hₓ-L2o-, unless the liquid suffix is supported by comparative evidence.33 Another non liquet.

But the thought experiment may continue: another valid PIE formation containing a sequence of two liquids separated by a laryngeal is *CL̥hₓ-Lo-.34 The expected development of this sequence on the way to Sanskrit is as follows:

Proto-Indo-Iranian *CL̥H-La- (merger of /e/, /o/, and /a/ as /a/)
Proto-Indo-Iranian *CL̥H-La- (merger of /h1/, /h2/, and /h3/ as /H/)
Proto-Indo-Iranian *CəLH-La- (epenthesis36)
Proto-Indo-Iranian *CərH-ra- (merger of /r/ and /l/37)
Proto-Indo-Aryan *Ci/urH-ra- (ə > i; rounding in a labial context38)
Proto-Indo-Aryan *Cī/ūrra- (loss of H with compensatory lengthening)

However, we never find forms like *Cī/ūrra- in Sanskrit. Just as above, it is possible to theorize that the continued high ranking of an OCP constraint against geminates (inherited from PIE) necessitated the repair of disallowed *Cī/ūrra- as *Cī/ūra-, except that in this case degemination would be of Proto-Indo-Aryan date: in the sequence CR̥HC the laryngeal was lost after Indo-Aryan separated from Iranian.39 Importantly, in this case there may be actual evidence to support the claim that the lautgesetzlich outcome of PIE sequence *CL̥hₓ-Lo- in Sanskrit was in fact *Cī/ūra-. Prior to discussing this evidence in the next section of the paper, I summarize my series of hypotheses in the table below, using lo- and ro-derivatives from roots *Cer- and *Cerhₓ- for illustrative purposes:

30 While I assume a PIE date of the Saussure Effect, this sound change has been contested by some scholars (Pronk 2011; van Beek 2011).
31 This analysis was proposed by Neri (2005: 21249); see also Vine (2019: 229). Another example of resonant degemination after the loss of a laryngeal is found in OIr. neim (n.) 'poison, Gift' < Proto-Celtic *neman- which cannot directly continue the expected protoform *nemh1-mn̥: for the final laryngeal in the root *nemh1- 'to give (one's due), to distribute' see Nikolaev (2010: 84-85) and LIV Add. s.v. The unexpected transition from PIE men-stem to a neuter n-stem (a virtually non-existent nominal class in Celtic or Indo-European) in this word is best explained as follows: after the loss of *h1 by Schmidt-Hackstein's Law (*hₓ > ø / C_CC) in the prevocalic allomorph *nemh1mn-, the resulting sequence *-mmn- underwent degemination to become *-mn- and then a new strong stem *nemn̥ was back-formed to oblique *nemn-. (Differently Byrd (2018: 2071), who follows Rasmussen (1999: 647) in assuming aniṭ character of the root *nem- and posits degemination */nem-men/ > *nemn̥.) It is instructive to compare David Stifter's analysis of OIr. gein (n.) 'birth' (apud Peters 1997 [2002]: 101) which likewise cannot be a regular phonological outcome of *gȇnh1-mn̥ (Ved. ján(i)man-, Lat. germen (Stüber 1998: 60-61 & 2015. I thank Michael Weiss for an enlightening discussion of the Irish material.
32 There are no geminate consonants in Old Iranian (Kümmel 2014: 208 & 210-211).
33 Finding evidence for the purported development *CoLhₓ-LV- > *CoL-LV- > *Car-rV- > Skt. Car-V- remains a task for the future. One important prerequisite for it is a better understanding of the semantics of the PIE suffixes *-ro- and *-lo-.
34 Cf. Gk. χλωρός 'pale green, yellowish' < TPghl̥h3-ró- or Lat. malleus 'hammer' derived from *malalo- < TPml̥h2-lo- (both transponates are unfortunately uncertain since suffixal *-ero- / *-elo- remains a possibility to be reckoned with).
35 The first four sound changes are not critically ordered in respect to one another.
36 See Cantera (2017: 489); Clayton (2018).
37 See n. 27 above.
38 For the phonological development *CL̥hₓC- > *Cī/ūrC-, cf. tīrṇá- 'one who has crossed' (< *tr̥h2-nó-) or gūrtí- 'praise' (< *gwr̥hₓ-ti-); see Pinault (1987-1988); Clayton (2018).
39 This is established on the basis of differing reflexes in Indo-Aryan and Iranian, cf. Ved. gīrbhíḥ 'with songs' vs. Av. garəbīš or Ved. stīrṇá- 'spread' vs. Av. starəta-.
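As an illustration only, the ordered changes posited above for *CL̥hₓ-Lo-, together with the final degemination step argued for in this paper, can be run mechanically as string rewrites. The following is a minimal sketch under stated assumptions and is not part of the original argument: the ASCII notation ("r0"/"l0" for syllabic liquids, "hx" for an unspecified laryngeal, "@" for the epenthetic vowel, doubled vowels for length), the function name `derive`, and the reduction of "rounding in a labial context" to "after p, b, m, w" are all hypothetical simplifications. Palatalization, accent, and morpheme boundaries are not modeled.

```python
import re

# Hypothetical simplification: a "labial context" is a directly preceding p, b, m, or w.
LABIALS = "pbmw"

def derive(pie_form):
    """Return the input plus every intermediate stage down to the Sanskrit-like output."""
    stages = [pie_form]
    form = pie_form

    def step(new_form):
        # Record the result of one sound change and make it the current form.
        nonlocal form
        form = new_form
        stages.append(form)

    # 1. Merger of /e/, /o/, and /a/ as /a/.
    step(re.sub(r"[eo]", "a", form))
    # 2. Merger of the laryngeals (h1, h2, h3, or unspecified hx) as /H/.
    step(re.sub(r"h[123x]", "H", form))
    # 3. Epenthesis: a syllabic liquid before H becomes schwa + liquid.
    step(re.sub(r"([rl])0(?=H)", r"@\1", form))
    # 4. Merger of /l/ with /r/.
    step(form.replace("l", "r"))
    # 5. Schwa > i, or u after a labial (simplified context, see above).
    step(re.sub(r"(.)@",
                lambda m: m.group(1) + ("u" if m.group(1) in LABIALS else "i"),
                form))
    # 6. Loss of H before the suffixal r, with compensatory lengthening.
    step(re.sub(r"([iu])rH(?=r)", r"\1\1r", form))
    # 7. Degemination of rr (the repair proposed in this paper).
    step(form.replace("rr", "r"))
    return stages
```

With the input `"mr0hxlo"` (standing in for *mr̥hₓ-lo-), the last two stages returned are `"muurra"` and `"muura"`, corresponding to *mūrra- > mūra-; with `"kl0hxlo"` the output is `"kiira"`, which matches śīrá- only up to the palatal, since the sketch does not model *k̑ > ś.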

3
Ved. ūrú-
I believe that another example of rhotic degemination may be found in Ved. ūrú- (m.) 'thigh, shank' (RV+).52 There is no Indo-European etymology for this word on the books: the search for a root *(hₓ)u̯ehₓ- with a suitable meaning from which *(hₓ)uh2-ru- (> Ved. ūrú-) could have been derived has not yielded a satisfactory result.53 The idea presented in this section is that the sequence ūr- in this word results not from *uhₓL- but from *u̯L̥hₓ- in preconsonantal position54 created by another (suffixal) liquid which was subsequently eliminated after the loss of the laryngeal, as argued in (2) above: *(hₓ)u̯L̥hₓ-Lu- > *u̯ūr-ru- > *u̯ūrú- > ūrú-.55

3.3
Assuming that the proposed derivation of Ved. ūrú- from *(hₓ)u̯L̥hₓ-Lo- does not founder on dubious phonology, what could the underlying root *(hₓ)u̯elhₓ- or *(hₓ)u̯erhₓ- be? There are two possibilities.
widely accepted in later scholarship. Both humans and animals have a right and a left walla- made up of bones and meat, the latter being particularly desirable;80 in catalogs of body parts, walla- is listed between the lap and the knee, and, in the vocabulary list KBo 1.51. rev. 6, walla- translates Akk. pēmu 'thigh' (CAD/P 321).81 Next to Hitt. walla- we find an i-stem walli-, easily identifiable as a Luwian intrusion: for instance, the word is used in Hittite-Luwian ritual fragment KUB 35.146 (MS, CTH 767) where we find acc. sg. wallin and acc. pl.

3.3.3
A tertium comparationis for Ved. ūrú- and Hitt. walla- may be found in Latin: this is the word volva (Var.+), Class. Lat. vulva (f.) 'human or animal womb; female sexual organ'.89 The standard etymology suggests a connection with the root *u̯el- 'to turn, wrap up, enclose', assuming that 'uterus', viz., 'membrane enclosing a fetus' was the original meaning of the Latin word.90 This is not an unreasonable approach: even though the meaning 'female genitals' (Pers.+) is not attested for vulva much later than the meaning 'womb', the diminutive vulvula is used by Gn. Naevius (3rd cent. BCE) in the meaning 'sow's womb used as food'.91 Nevertheless, in theory it remains possible that vulva originally referred to external female genitalia and then started to be applied to the uterus secondarily. Under this (admittedly speculative) assumption one may tentatively entertain an etymological connection between Lat. vulva and the Vedic and Hittite words for 'thigh'.92 The immediate phonological advantage of this analysis over the traditional connection with the root *u̯el- 'to enclose' is the explanation of vulva with its unassimilated cluster *-lu̯- as a syncope product of pre-Latin *u̯elau̯ā93 (TP(hₓ)u̯elhₓ-u̯eh294) where the medial *-a- would be the expected vocalization product of the root-final laryngeal posited in the hypothetical root *(hₓ)u̯elhₓ- in order to explain Ved. ūrú-. By contrast, pre-Latin *u̯el-u̯ā- from PIE *u̯el- 'to enclose' (so the traditional etymology) would be expected to give *volla.95

Granted 'female genitals' was the original meaning of Lat. vulva, how would the comparison with Hittite and Vedic words for 'thigh' work semantically? Across cultures, thighs are a common euphemism for genitals. One term for intercourse in Greek comedy is διαμηρίζειν 'get between the thighs' (Ar. Av. 669), which recalls Apuleius' coinage interfeminium 'the space between the thighs' (Apul. Apol. 33), but hardly served as its model.96 Queen Medb in Irish heroic saga is in the habit of offering sexual intimacy to men in exchange for alliances or services, calling these favors cardes mo ṡliasta-sa fessin 'the friendship of my thighs' (LL 97). In Vedic, sakthī 'thigh' (du.) is regularly used to refer to female genitals, e.g., út sakthyòr gr̥dáṃ dhehi 'bring the penis into the two thighs', uttered by the priest during the performance of the Aśvamedha ritual (TS 7.4.19)97 or an even more explicit phrase prayapsyánn iva sakthyàu | ví na indra mr̥dho jahi 'like someone who is going to bring the penis into the two thighs, smash aside our enemies, o Indra' (TB 2.4.6.5.

In view of these parallels,100 the proposed derivation of Lat. vulva from the same root as Ved. ūrú- and Hitt. walla- 'thigh' becomes a possibility to consider seriously: the meaning 'female genitals' may be due either to a euphemistic use (*(hₓ)u̯elhₓ-u̯o-101 'thigh' > 'female genitals') or to a morphological derivation of some kind (e.g., possessive *(hₓ)u̯elhₓ-u̯o- 'occupying the thighs'102 or locatival *(hₓ)u̯elhₓ-u̯o- 'located at/between the thighs'103).

ḫurnezzi 'sprinkles'. For the development of initial *u̯L̥- cf. u-ur-ri-er 'they helped' (KBo 3.60 ii 7 (OH/NS)) < *u̯r̥h1i-. However, *ulla- could then undergo a later (MH) development to walla-, cf. OH ulkiššara- > NS walkiššara- 'skilled' (see Kloekhorst (2008: 93-94203) and for the etymology n. 102 below) or OH úrani > MH warani 'burns', unless this change is viewed as an adjustment of the zero-grade middle to the root vocalism of act. warnu- (I would like to thank Craig Melchert for discussing these problematic forms with me). Hitt. walla- may therefore go back to *(h1)u̯l̥hₓ-o- or *(h1)u̯l̥hₓ-lo- (identical with the preform postulated for Ved. ūrú- above), as long as it is assumed that the spelling walluš in KUB 39.1 iv 9-10 (OH/NS) is an NS intrusion in the late copy.
88 After this paper was presented at the 40th East Coast Indo-European conference in June of 2021, I was pleased to learn through the courtesy of Václav Blažek that he already proposed a comparison between Ved. ūrú- and Hitt. walla-, without, however, addressing the long ū of the Vedic form (Blažek 2000: 55).
89 For the change *o > u /__lC cf. *u̯elti 'wants' > volt > vult, see Weiss (2020: 151).
90 See de Vaan (2008: 689). The usual comparandum is Ved. úlba- n. (RV 10.51.1+), úlva- (VS 19.76+) 'womb' (see AiGr 2.2: 867, EWAia 1.232), but the alternation úlba- / úlva- may suggest foreign origin of the Vedic word (Kuiper 1955: 179
Hitt. walla- was tentatively compared to Lat. vulva by Carruba (1991: 174) in the context of his (very doubtful) connection with Lyc. lada- 'wife' as "die Umhüllte".
93 For syncope feeding e > o /__ł cf. *u̯eluu̯ō 'roll' > volvō (Weiss 2020: 135 & 150).
Cf. also μηρῶν μεταξύ '(penis growing) between the (woman's) thighs' (Archil. fr. 66) or Solon 16.2 (Gentili-Prato) μηρῶν ἱμείρων 'desiring thighs' (in a pederastic context). Note also the clever word-play in an epigram of Rufinus (AP 5.36.2 = 12.2 Page), in which μηριόνην is employed to signify the female genitalia (τίς ἔχει κρείσσονα μηριόνην 'which of the three girls had the best pussy'): the form is perhaps best interpreted as acc. of an ā-stem *μηριόνη which at the same time makes an allusion to the name of the hero Meriones.
97 See Watkins (1995: 265-276).
98 On this passage see Nikolaev (2015: 232-235

Further examples of rhotic degemination
Following the adage that ten bad etymologies are better than one good one,107 I would like to signal several other etymological possibilities for Sanskrit words containing otherwise inexplicable prevocalic -ŪL-. As often with etymologies, they range from merely possible to unprovable, questionable, dubious, hopeless, ludicrous, and outlandish. One problem shared by the etymologies presented below is the total absence of comparative evidence for a liquid suffix. For instance, the old comparison between Ved. mula- 'root of a plant' and Gk. μῶλυ 'mythical herb' may be revived under the assumption that mula- < *ml̥hₓ-Lo- (while μῶλυ < *mōlhₓ-u-), but the reconstruction of the suffix *-ro- or *-lo- in this form is entirely hypothetical. The quality of some other examples is diminished by the uncertainty about the meaning of the Sanskrit word. For instance, if Ved. śīrá- (always of fire in the RV, also as a first compound member in śīráśociṣ-) means 'making hot/cooked', the word may be compared to Ved. śrā́ya-ti 'be fervent, become cooked' and derived from *k̑l̥hₓ-Ló-;108 but the exact mean-
this analysis, too, is merely a possibility, since the meaning of the name Pūrú- remains unknown. Future scholarship will show if there is value to any of the proposals made in this section.

Conclusion
This paper has examined the curious synchronic absence of geminate -rr- in Sanskrit (evinced by the facts of Vedic sandhi) and attempted to identify its diachronic antecedent in the PIE constraint against heteromorphemic geminates which received additional support from the contact between Indo-Aryan and Dravidian languages in which /r/ is non-geminable. The central hypothesis of this paper is that new -rr- sequences arising in Indo-Iranian and Indo-Aryan from PIE *-LL- or *-LHL- after merger of *l with *r and the loss of the laryngeal were repaired through degemination. While the development of sequences like PIE *Cer-lo- or *Colhₓ-ro- in Sanskrit cannot be established with certainty, one prediction of the degemination hypothesis, namely, the development of PIE *(-)CL̥HLV- to Sanskrit (-)Cī/ūrV-, is borne out by several examples: mūrá- 'stupid' < *mūrra- < *mr̥hₓ-lo- (analyzed by Nussbaum as a direct counterpart of Hitt. marlant- 'stupid'), ūrú- 'thigh' < *u̯ūrru- < *(hₓ)u̯l̥hₓ-Lu- ← *(hₓ)u̯l̥hₓ-Lo- (cf. Hitt. walla- 'thigh') and perhaps some others.