Pairing peers and pears

Changing conventions of Gheg Albanian heritage speakers

In: Language Dynamics and Change
Authors:
Barbara Sonnenhauser LMU Munich Slavonic Linguistics Munich Germany
University of Zurich Research Priority Program “Language and Space” Zurich Switzerland

https://orcid.org/0000-0003-2757-3143
,
Blertë Ismajli Universiteti i Prishtinës “Hasan Prishtina” Department of German Language and Literature Pristina Kosovo

https://orcid.org/0000-0002-4783-741X
, and
Paul Widmer University of Zurich Department of Comparative Language Science Zurich Switzerland
University of Zurich Research Priority Program “Language and Space” Zurich Switzerland

https://orcid.org/0000-0002-9949-5212
Open Access

Abstract

Migration events splitting speaker communities and establishing novel contact situations are among the major drivers of language variation and change. While the precise processes that lead to change cannot usually be determined for past events with any certainty, the study of minority and heritage language usage in apparent time may provide insight into the contribution of the linguistic behavior underlying the dynamics. We capitalize on this and compare parts of speech usage in Pear Story renarrations across Gheg Albanian speakers of three generations in German-speaking environments, applying methods from information theory. The results suggest that the changing conventions in parts of speech usage across generations and places of residence can be attributed to changing linguistic behavior within the speaker community in the migration setting. These findings highlight the impact of changing sociocultural embedding and the roles of vertical and horizontal transmission in language change.

1 Introduction

Demographic events are well-known drivers and inhibitors of language change. They often split speaker communities and establish novel contacts between speaker populations (Nichols, 1997; Janda and Joseph, 2003). For speaker communities that developed due to migration and subsequent geographical separation from the source population, the sociocultural context is of particular importance. It co-determines the quantity and quality of linguistic input these speakers experience and the output they produce. Changes in cultural context and in patterns of social interaction generally affect patterns of language use because of changing speaker behavior and linguistic practice (Beckner et al., 2009). This has immediate consequences for the cultural transmission of language, both in vertical transmission between generations (i.e., in language acquisition in early life) and in horizontal transmission later on in the linguistic biography when linguistic traits are imitated between interlocutors, among peers in particular (Bloomfield, 1933; Labov, 2001; Labov, 2010).

One prominent outcome of migration-related splits of speaker communities is the emergence of heritage languages, that is, languages of migrated speaker communities and their descendants that find themselves in new linguistic majority contexts (see Montrul and Polinsky, 2021). Such contexts may be helpful for a better understanding of actuation and propagation of language change. In fact, while the ever-growing body of quantitative methods allows us to model these processes for past events based on their observable outcomes, insight into actual variation and shifting preferences and their propagation as well as into the individual linguistic behavior contributing to these dynamics is much harder to obtain (Blythe and Croft, 2021) and thus likewise mainly inferred from their results. A possible approach to alleviate this issue is the study of minority languages and their speakers in real or apparent time: they provide an opportunity to open the black box and get a better grasp of the underlying conditions that inhibit or drive processes of language change (Bailey et al., 1991; Sankoff, 2006).

We pursue this idea by focusing on Gheg Albanian speakers in a German-speaking environment. Our aim is to empirically trace changing linguistic conventions, that is, the state of coordination between members of a speaker community across multiple communicative interactions (Clark, 1996; Croft, 2000). To do so, we control for communicative situation by keeping the pragmatic context fixed (renarration of a short video clip). Applying methods from information theory, we investigate the usage of parts of speech (PoS) in this context as a unit of measurement for linguistic behavior and convention. In order to explore how sociocultural conditions in contact situations shape and change language in narrative conventions, we control for input quality and quantity by sampling three groups of speakers with distinct sociolinguistic biographies.

We first motivate our focus on PoS as a measure of language use in renarration by illustrating their relevance as a signature of pragmatic context in Section 2. In Section 3, we describe our data and methods. The results are reported in Section 4, followed by a discussion in Section 5.

2 Assessing language use in renarration

2.1 Linguistic knowledge

Many studies of heritage languages are interested in the linguistic knowledge (competence) of their speakers and apply a range of experiments with explicitly designed tasks (e.g., Benmamoun, Montrul, and Polinsky, 2013; Scontras, Fuchs, and Polinsky, 2015; Embick, White, and Tamminga, 2020; Flores and Rinke, 2020). While this approach provides important and sensible insights into the grammar of individuals, it is not always clear how to interpret the results from a variationist and diachronic perspective. Even though it seems to be possible to approach underlying competence from utterances to some extent (e.g., Stoll et al., 2011), identifying the competence to produce a form does not in itself tell us anything about its availability in communicative practice. As a consequence, individual scores in tasks related to grammar are not necessarily informative of the behavior of individuals in linguistic interaction and practice, where arguably change is actuated and innovation propagated.

Concerning the specific example of narration, there are many ways for a speaker to tell a story and the reasons for the choices they make in terms of categories at the lexical or clausal level are manifold. In particular, these choices do not necessarily fully reflect the speaker’s lexical skills or knowledge of grammar. For example, the absence of a specific lexical item or a category such as the French passé simple, a synthetic past tense, in the renarration of a story by a French-speaking informant does not imply that they have no active mastery of a given lexical item or this verbal category. It may equally well comply with community-specific narrative convention to tell the story in other words or the French passé composé, an analytic past tense, or the present tense depending on the context (occasion, audience, etc.) and the content (topic, register, etc.). This is why narrative conventions also differ from narrative competence as assessed in terms of the ability to produce coherent stories by using appropriate lexical, morphological, and syntactic strategies and applying them for the proper coding of scenes, the packaging of information, and so on (e.g., Krasnoshchekova and Kashleva, 2019; Laurent, Nicoladis, and Marentette, 2015; Verhoeven, 2004).

2.2 PoS usage in narration

As for the comparison of narration, commonly used measures such as type-token ratios or attestations of morphological categories (see Polinsky, 2018) are meaningful only when interpreted against an adequate background and based on a sufficient and comparable amount of data. For instance, speakers using one single adjective display the same type-token frequency as speakers using 10 adjectives once and thus display the same entropy (Stoll et al., 2011). These speakers differ, however, in the narrative and syntactic complexity of their narrations. Similarly, the observation of speakers using few morphological ablatives or subjunctive forms or displaying a large number of nominatives does not tell us much unless embedded in the syntactic context of noun phrases, complex predicates, and syntactic functions, respectively.
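The limitation of type-token measures noted above can be made concrete with a minimal sketch. The adjective lists below are invented for illustration and are not taken from the corpus; the helper name `type_token_ratio` is likewise hypothetical.

```python
def type_token_ratio(tokens):
    """Number of distinct types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)

# Hypothetical speaker A: a single adjective, used once.
one_adjective = ["small"]

# Hypothetical speaker B: ten different adjectives, each used once.
ten_adjectives = ["small", "big", "old", "young", "green",
                  "heavy", "full", "empty", "long", "new"]

# Both narrations receive the same score although they clearly
# differ in how much descriptive material they contain.
print(type_token_ratio(one_adjective), type_token_ratio(ten_adjectives))
```

The score alone therefore says nothing about how many adjectives a narration contains, which is why such figures need to be read against comparable amounts of data.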

Concerning the lexicon, there are many ways for speakers to avoid or substitute specific lexical items if they are not available at a particular moment. For instance, general nouns such as man and thing can easily be used instead of more specific ones such as farmer or basket (see also Polinsky, 2018). But whatever grammatical means or lexical items are chosen, they are inevitably of a particular lexical class—speakers simply have to couch their speech into nouns, verbs, prepositions and the like, according to the specific conventions of the genre, which may depend on more general constraints. As for the usage of nouns, pronouns, and verbs in narratives, Seifart (2012: 9) discovered characteristic, sinusoidal alternations of the noun-to-verb ratio as narratives unfold, which he interpreted as cognitive constraints on the activation of discourse participants in this communicative situation. Therefore, we start from the assumption that the usage of PoS is better suited to establishing comparability across (groups of) speakers than lexical types or grammatical specifications.

To be sure, PoS usage is in several respects not fully independent from competence in morphology. For example, for Albanian it makes sense to treat auxiliaries and lexical verbs as distinct PoS because they differ in semantic, distributional, and morphosyntactic characteristics (Newmark, Hubbard, and Prifti, 1982; Buchholz and Fiedler, 1987). The preference of a fully bilingual Albanian heritage speaker for auxiliaries may be related to his or her inability to produce synthetic tense forms, possibly influenced by the dominant contact language. While such effects are undeniable—and part of ongoing change—it is also clear that morphology and morphological competence alone cannot explain all variation. In fact, analytical constructions are reported as a general preference of heritage speakers irrespective of the presence of auxiliary constructions in the dominant language (Polinsky, 2018: 183). In Albanian, for example, tense usage varies a lot across individuals and social groups as our data shows, indicating that speakers of all groups have at least some command both of synthetic and analytic past forms and are thus able to make a choice.

2.3 PoS usage in heritage language

Previous research provides evidence that heritage speakers retain PoS distinctions irrespective of the degree to which they develop their language. This is taken to confirm the universality of the major lexical classes, in particular nouns and verbs (Polinsky, 2018, ch. 6; Benmamoun, Montrul, and Polinsky, 2013: 147–148). By means of elicitation tasks and translation studies, the mastering of PoS distinctions is approached from their lexical and morphological specification as manifest in particular in type-token ratios and the usage of morphological categories. Eliciting the competence concerning PoS characteristics and taking this as evidence for the mastering of a specific variety implicitly starts from the assumption that the activation of this competence is not affected in contact situations and is largely invariable across registers and pragmatic contexts. However, it has been shown that patterns of PoS usage are indicative of narrative conventions and conversational practices (Stoll et al., 2011; Seifart, 2012; Anstatt, 2018). This is not unexpected, given that corpus studies show that there are notable differences between types of performative acts (informal dialogues, official speech, storytelling, poetic diction, etc.) in terms of their linguistic patterns (Biber, 2012; Szmrecsanyi, 2019).

For speakers of a minority language the register aspect is highly relevant. Speakers of minority languages are commonly exposed to colloquial speech, but potentially much less to the many varieties of other registers used in more specific contexts (formal, narrative, poetic, etc.). This may lead to a lack of familiarity with specific register properties and, as a consequence, to usage patterns and variations that differ from established varieties (Flores and Rinke, 2020). In addition, the degree of naturalness or spontaneity of the communicative event may have an impact on the linguistic structures used (Himmelmann, 1998; Klamer and Moro, 2020). This needs to be considered when analyzing data gained from “staged narratives” (Dobrushina and Sokur, 2022), such as the renarrations of the Pear Story video used in our paper.

In summary, PoS usage not only emerges as a basic feature of language usage in general but also as characteristic of particular pragmatic contexts and their specific patterns of language usage, viz. narrative conventions. Changing sociocultural contexts can impact on conversational practices and narrative conventions, measurable in terms of PoS usage. In the context of migration, each generation of speakers practices, develops, and/or acquires their language(s) under changing sociocultural circumstances. Based on these considerations we expect to find vestiges of incipient language change manifested in variation of PoS usage in subsequent generations of migrated and heritage speakers when controlling for pragmatic context.

3 Data and methods

3.1 Corpus

In order to make language usage in pragmatic contexts comparable, we controlled for the input by presenting a short video sequence (Wallace Chafe’s Pear Story film, Chafe, 1980) as a stimulus and asking our informants to renarrate what they saw. While this kind of data elicitation may have certain problems for documentary purposes (Haig, Schnell, and Wegener, 2011), it is well suited for comparing communicative practice in a very particular pragmatic context, as has been shown in the long research tradition using this well-established tool of data elicitation since Chafe’s original project (e.g., Du Bois, 2006; Stoll and Bickel, 2009; Chelliah, 2016; Fafulas, 2021; Frye, 2022). It goes without saying that our results are valid only for language usage in this specific setting.

3.2 Informants

Sociocultural conditions are controlled for by sampling informants according to linguistic biography via social generation (see below) and place of residence, namely Zurich, Switzerland (Z) and Munich, Germany (M). Possible effects of the degree of German-Albanian bilingualism are taken into account by including a convenience sample of speakers of Gheg Albanian from Prishtina; see Table 1.

The sociolinguistic situation of our informants is determined by their linguistic socialization (i.e., their language of schooling and family language) on the one hand, and their place of residence with the specifics of local networks and access to instruction in the heritage language on the other. As to the former, we distinguish three social generations. While in terms of age our three social generations largely overlap with demographic generations, our choice controls for a crucial factor, namely linguistic socialization, the speaker’s first exposure to German:

  • First generation speakers (G1) have a complete schooling in Albanian and migrated to Switzerland/Germany after age 25.

  • Second generation speakers (G2) have a partial schooling in Albanian, and migrated to Switzerland/Germany before the age of 12.

  • Third generation speakers (G3) were born in Switzerland/Germany, with their complete schooling being in German.

All G1 and G2 speakers are from a Gheg-speaking region, mainly Kosovo, but also North Macedonia and Serbia. Concerning the country of residence, Switzerland and Germany differ with respect to the proportion of Albanian speakers and visibility. In Germany, Albanian is not particularly prominent (ca. 400,000 speakers, making up ca. 0.4 % of the population; Schader, 2016: 480; Adler, 2019). In Switzerland, ca. 250,000 speakers of Albanian (ca. 3.1 % of the population) form the largest linguistic group in recent migratory history (third overall) and the Albanian community is highly visible in public life and discourse, in particular in urban areas (Schader, 2016; Bundesamt für Statistik, 2022; Burri Sharani et al., 2010). The supply of Albanian language teaching for heritage speakers is much more established in Switzerland, where it is also coordinated on a local level (for Zurich see Volksschulamt, 2023), whereas Munich and the state of Bavaria are not involved at all in providing Albanian language teaching (Mediendienst Integration, 2020: 6). For more statistics and further information, see Schader (2016), Novinšćak Kölker (2016) and Bundesamt für Statistik (2022).

The linguistic socialization of G1 speakers encompasses yet another factor that needs to be considered: they were raised with Gheg Albanian as their family language and language of everyday communication, but educated with Tosk-based Standard Albanian, which was taught as being highly prestigious (e.g., Buchholz and Fiedler, 1987; Lloshi, 1999; Pani, 2006; Schader, 2006; Ajeti, 2002). This manifestation of language ideology is also strongly at play in the diaspora. As a consequence, our G1 informants in Zurich and Munich display a great desire to adhere to this standard in the experimental situation. The attitude towards Standard Albanian differs across generations of speakers in Prishtina. The attitude of G1/G2 speakers from Prishtina towards Standard Albanian is similar to that of G1 informants from Zurich and Munich. For them, Standard Albanian was a strong building block of ethnic identity, as was the case for speakers of Albanian in different regions in former Yugoslavia (Ismajli, 2020: 23–30). Among G3 speakers from Prishtina, Standard Albanian is not considered prestigious; these speakers tend to speak the local variety of Gheg Albanian. Since Gheg and Tosk do not differ in their inventory of PoS, possible switches to the Tosk-based standard during renarration have no immediate effect on PoS usage. They may have an effect on the probability distributions of certain PoS, for example for auxiliaries (potential preference for synthetic tenses under Tosk influence) or particles (lack of infinitival constructions when adhering to Tosk). However, we consider that this mirrors ongoing change under changing conditions and thus is relevant to our research question.

Table 1

Number and age of the informants by generation and place of residence

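| Generation | Location  | N  | Minimum age | Maximum age | Median age |
|------------|-----------|----|-------------|-------------|------------|
| G1         | Munich    | 18 | 34          | 70          | 52         |
| G1         | Prishtina | 3  | 53          | 67          | 65         |
| G1         | Zurich    | 17 | 30          | 67          | 54         |
| G2         | Munich    | 18 | 18          | 40          | 24         |
| G2         | Prishtina | 3  | 29          | 39          | 36         |
| G2         | Zurich    | 21 | 20          | 48          | 28         |
| G3         | Munich    | 17 | 10          | 20          | 12         |
| G3         | Prishtina | 3  | 15          | 18          | 16         |
| G3         | Zurich    | 21 | 10          | 23          | 12         |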

3.3 Experimental setting

The Pear Story video (http://pearstories.org) was presented to our informants with the instruction to renarrate in Albanian what they had seen. In a second run several weeks later (three weeks on average, depending on availability), the same informants were shown the video again and asked to retell it in Swiss German or the German variety they use in Munich. While this keeps the communicative situation and pragmatic context stable across speakers, the situation is clearly quite different from naturalistically occurring language usage. Apart from the content, this pertains in particular to the forced language choice: speakers had to use Albanian or German, irrespective of whether they felt comfortable with that particular language in this specific context. As artificial as this might seem at first sight, it is not entirely so. There are indeed situations in which speakers feel obliged to use Albanian, irrespective of their mastery of this variety. As authentic recorded data involving family conversations including three generations of speakers show, once G1 speakers are present, G2 and also G3 speakers switch to Albanian, the language that G1 is more familiar with (Kelmendi n.d.). This results in a similar “forced” language choice as in our elicited data. In addition, this conscious usage of the less-familiar variety in order to maintain contact with older generations of speakers (not only of the source population, but also in the new setting) can be assumed to hold in historical situations as well. Once this contact across generations can no longer be maintained, the split of speaker communities gains speed.

3.4 Data processing and analysis

The recordings were transcribed and manually annotated in EXMARaLDA (Schmidt and Wörner, 2014) for part of speech and morphological categories. For analysis we extracted the parts of speech of each informant from the annotated texts, namely adjectives, adpositions, adverbs, auxiliaries, conjunctions, determiners (indefinite article and article preceding nominal and adjectival modifiers [see Newmark, Hubbard, and Prifti, 1982: 121] but not inflectional articles), nouns, numerals, particles, personal and demonstrative pronouns, other pronouns, and lexical verbs. Based on the informants’ PoS usage when renarrating the Pear Story we computed the pairwise Jensen-Shannon divergence (JS divergence). JS divergence is a method from information theory, whose general relevance to language and usefulness for linguistic analysis has been widely embraced over the past decades (Gibson et al., 2019). JS divergence is based on Shannon entropy, a measure of the predictability of a set of possible outcomes. In particular, JS divergence is a method to quantify the similarity between two probability distributions P and Q, in our case the probability distributions of PoS usage in Pear Story renarration of all pairs of informants. We use base 2 logarithms, in which case JS divergence is bounded by 0 and 1. JS divergence = 0 if and only if P = Q (Lin, 1991). To carry out the analysis, we used the function JSD from the package philentropy (Drost, 2018) in the R programming environment (R Core Team, 2022).
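As an illustration of the measure just described, the following is a minimal Python sketch, not the R/philentropy code used for the analysis. It computes JS divergence with base 2 logarithms as the entropy of the mixture of P and Q minus the mean of their individual entropies. The tag labels, the four-tag inventory, and the two tag sequences are invented for illustration; the function names are likewise hypothetical.

```python
import math
from collections import Counter

def pos_distribution(tags, inventory):
    """Relative frequency of each PoS tag over a fixed, shared inventory."""
    counts = Counter(tags)
    total = sum(counts[t] for t in inventory)
    return [counts[t] / total for t in inventory]

def entropy(p):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def js_divergence(p, q):
    """JS divergence: H(M) - (H(P) + H(Q)) / 2, with M = (P + Q) / 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

# Hypothetical, heavily reduced inventory and tag sequences for two informants.
INVENTORY = ["NOUN", "VERB", "AUX", "ADJ"]
speaker_a = ["NOUN", "VERB", "NOUN", "AUX", "VERB", "ADJ"]
speaker_b = ["NOUN", "NOUN", "NOUN", "VERB", "AUX", "AUX"]

p = pos_distribution(speaker_a, INVENTORY)
q = pos_distribution(speaker_b, INVENTORY)
print(round(js_divergence(p, q), 4))
```

Because the mixture M is nonzero wherever P or Q is, the measure stays finite even when one informant never uses a category that the other does, which makes it convenient for short renarrations.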

Since we are interested in the behavior of informants with the same social background we classify the pairs of informants by their group membership and evaluate the divergence between groups. For example, the divergence between all possible pairings of informants from G1 and G2 is indicative of the amount of similarity in PoS usage between the two generations involved. The distribution of G1-G2 pairwise divergences can be compared to the distribution of other groupings, such as G1 and G3 informants. The differences between G1-G2 and G1-G3 pairs provide information about the extent to which G3 behaves differently from G2 relative to G1. Lower average divergence values indicate more similarity between groups, with a narrower range of dispersion suggesting a higher amount of homogeneity. As for PoS usage in retelling the Pear Story, we interpret amount and dispersion of JS divergence as a measure of narrative conventions. To our knowledge, this is a novel application of JS divergence as previous research on PoS usage in narratives, such as Seifart (2012), has used various types of ratios.
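The grouping of pairwise divergences by group membership can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the analysis script: the five informants, their generation labels, and their PoS distributions are invented, and the median and range stand in for whatever summary of level and dispersion one prefers.

```python
import math
from itertools import combinations
from statistics import median

def entropy(p):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def js_divergence(p, q):
    """JS divergence with base 2 logarithms."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

# Hypothetical informants: (id, generation, PoS distribution over a shared inventory).
informants = [
    ("a", "G1", [0.40, 0.30, 0.20, 0.10]),
    ("b", "G1", [0.38, 0.32, 0.20, 0.10]),
    ("c", "G2", [0.35, 0.30, 0.25, 0.10]),
    ("d", "G3", [0.20, 0.25, 0.45, 0.10]),
    ("e", "G3", [0.25, 0.20, 0.40, 0.15]),
]

def grouped_divergences(informants):
    """Collect the JS divergence of every informant pair under a label such as 'G1-G3'."""
    groups = {}
    for (_, gen_1, p), (_, gen_2, q) in combinations(informants, 2):
        label = "-".join(sorted((gen_1, gen_2)))
        groups.setdefault(label, []).append(js_divergence(p, q))
    return groups

groups = grouped_divergences(informants)
for label in sorted(groups):
    values = groups[label]
    # Median as the level of divergence, max - min as a crude dispersion measure.
    print(label, len(values), round(median(values), 4), round(max(values) - min(values), 4))
```

Sorting the two generation labels before joining them ensures that a G3-G1 pair and a G1-G3 pair end up in the same group, so each pairing type is summarized once.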

4 Results

We first have a closer look at genre differences in terms of PoS usage in order to establish PoS as a sensible parameter of variation. Next, we zoom in on the variation within the Pear Story narrations across generations and compare G3 speakers from Munich or Zurich and Prishtina. We then explore the relevance of location/place of residence. In order to disentangle generation and age, we investigate the relevance of the age of speakers and the relevance of age difference between G1 and G3 speakers. Finally, we analyze PoS usage in German narrations produced by the Zurich informants.

4.1 Part of speech in pragmatic contexts

In order to ensure that PoS is a sensible parameter of variation across different pragmatic contexts we first compare the divergence in PoS usage between two different genres: folklore texts taken from Çetta (1982) and Panajoti (1988); and the renarrations of the Pear Story video by all our heritage G1, G2, and G3 informants from Zurich and Munich in Fig. 1.

Figure 1

Box, violin, and dot plots of pairwise JS divergence, grouped by all Albanian folklore texts (F-F) and the Albanian Pear Story renarration by G1, G2, G3 from Zurich and Munich and all folklore texts (G1-F, G2-F, G3-F). Each dot represents the JS divergence of one pair of probability distributions, e.g. of two folklore texts in F-F, a folklore text and a G1 informant in G1-F, etc. Outer box limits include the central 50 % of the data, the horizontal line that divides the box indicates the median. Whiskers extend from the box to the largest and smallest value in the data that are no further than 1.5 times the inter-quartile range (distance between the first and third quartiles) from the box. Data beyond are considered outliers. Violin plots use density curves to depict in more detail the distribution of the data; the width of the curve corresponds to the frequency of data points in the area.

Citation: Language Dynamics and Change 13, 2 (2023); 10.1163/22105832-bja10027

The divergence among the seven folklore texts (F-F) is very small in a narrow distribution. G1 and G2 are more or less equidistant from F while on average, the divergence from G3 to F is twice the average divergence between G1/G2 and F with a considerably wider distribution. These differences in distribution between F and G1/G2/G3 might well relate to the fact that the folklore texts are editorially revised and hence streamlined to a certain degree (i.e., adapted to reading conventions), whereas variation across speakers is a prevailing characteristic of linguistic behavior. However, the different average levels of divergence for F-F vs. F-G1/G2/G3 pairs still indicate differences in PoS usage. This indicates that based on the data we are using, F texts are very similar and homogeneous and indeed have a profile that is clearly different from that of the Pear Story renarrations by G1, G2, and G3. As predicted, PoS usage shows context-specific signatures.

Further, G1 and G2, who have a shared educational background until age 12 (see Section 3.2), display a similar degree of difference to F as opposed to G3. This hints at conventions in PoS usage being rather stable once established.

4.2 Generations

In order to assess variation, that is, the degree of (dis)similarity, in PoS usage within and across generations, we investigate how the variation is distributed within generations—that is, between pairs of informants belonging to the same generation (G1-G1, etc.)—and across generations—that is, between pairs of informants not belonging to the same generation (e.g., G1-G3, etc.). The results are reported in Fig. 2.

Figure 2

Box, violin, and dot plots of pairwise JS divergence in Albanian Pear Story renarrations across generations (G1-G2, G1-G3, G2-G3) and within generations (G1-G1, G2-G2, G3-G3) from both places of residence (Munich and Zurich). Each dot represents the JS divergence of one pair of probability distributions, e.g., of two G1 informants in G1-G1, a G1 and a G2 informant in G1-G2, etc.

Pairs of speakers within G1 and within G2, as well as mixed G1-G2 pairs, display the smallest amount of divergence at a similar level (first, second, and fourth plots in Fig. 2). Pairs involving G3 speakers (third, fifth, and sixth plots in Fig. 2) have an increased level of divergence, G1-G3 pairs being clearly distinct from pairs without G3 speakers. G1 and G2 have in common that up to age 12 they grew up in an Albanian-speaking environment before moving to Germany or Switzerland (Munich and Zurich respectively) into a German-speaking environment. The situation of G3 is more complex. Their family language, which provided the main input in their first years, clearly differs from the input they were exposed to upon entering the schooling system. The language of education and of many social contacts from this point onward was German. It is thus interesting to compare the narrative behavior of these unbalanced Albanian/German speakers with that of our informants from Prishtina of the same age group.

4.3 Linguistic biography

The comparison between the behavior of G3 in our Munich and Zurich sample (MZG3) and speakers of approximately the same age who grew up in an Albanian-speaking environment in Prishtina (PG3) is presented in Fig. 3.

Figure 3

Box, violin, and dot plot of pairwise JS divergence of PoS usage in Albanian Pear Story renarrations, grouped by generation and location from Munich and Zurich (MZG1 = G1 from Munich and Zurich; MZG2 = G2 from Munich and Zurich; MZG3 = G3 from Munich and Zurich) and G3 from Prishtina (PG3). Each dot represents the JS divergence of one pair of probability distributions, e.g., of a G1 and a G3 informant in G1-G3, etc.

Whereas MZG3-MZG3 pairs (from Zurich and Munich) diverge from one another to the same extent as from the Prishtina sample (MZG3-PG3; rightmost pair in Fig. 3), the distributions of divergences of G1-G3 and G2-G3 pairs differ visibly across conditions. Thus when narrating the Pear Story, at an early age speakers of Albanian from Prishtina (PG3) use parts of speech in a way that is quite similar to MZG1 and MZG2 migrants, who grew up in the same educational and social context (right boxplots in leftmost and middle pairs). MZG3 speakers from Zurich and Munich, who grew up in a context where both Albanian and German were spoken and had their schooling in German, use parts of speech differently from MZG1 and MZG2 migrants (left boxplots in leftmost and middle pairs).

This indicates that the social context and the quality and quantity of exposure to Albanian (i.e., the linguistic biography) have an impact on how speakers use PoS when narrating the Pear Story in Albanian. It also indicates that once a particular convention of narration is acquired, it remains stable even with migration. In addition, this convention is barely influenced by the dominant language as evidenced by the small divergence between MZG1/G2 and PG3 (see the second and fourth box plots in Fig. 3). This differentiates MZG1/G2 from MZG3, and might point towards age as playing a role as well.

In the following, we first zoom in on place of residence as the second major determinant of a speaker’s sociocultural embedding (Section 4.4), and subsequently on age as another potentially relevant factor (Section 4.5).

4.4 Place of residence

When taking into account the heritage speakers’ sociocultural embedding as represented by place of residence (Munich and Zurich) we observe that the divergence within and between G1 and G2 does not differ substantially (first, second, and fourth groups in Fig. 4). For G2-G3 pairings the impact of the place of residence on the overall amount of divergence is not very strong either, all combinations contributing to a similar extent. In G3-G3 pairings Munich informants show a special behavior insofar as they contribute less to the overall divergence and dispersion is smaller (rightmost group in Fig. 4) than in pairs of Zurich informants (Z-Z) and mixed pairs with informants from Munich and Zurich (M-Z).

The most notable difference between locations shows up in G1-G3 pairs (third group in Fig. 4): M-M combinations contribute much more to the overall G1-G3 divergence than Z-Z combinations. This points to a difference in social practices with respect to the use of Albanian between the Zurich and Munich communities. It seems possible to relate these differing practices to factors such as different network strengths (number of speakers, differences in access to tuition in Albanian, etc., see Section 3.2), but empirical verification remains for further research.


Figure 4

Box, violin, and dot plots of pairwise JS divergence of PoS usage in Albanian Pear Story renarrations, grouped by place of residence (M[unich]-M[unich], Z[urich]-M[unich], Z[urich]-Z[urich]) and generation (G1, G2, G3). Each dot represents the JS divergence of one pair of probability distributions, e.g., of a G1 and a G3 informant in G1-G3, etc.

Citation: Language Dynamics and Change 13, 2 (2023) ; 10.1163/22105832-bja10027
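The pairwise comparison underlying Fig. 4 can be illustrated with a minimal Python sketch. It is not the script used for the analysis; the informant records, the tag set, the helper names, and the choice of a base-2 logarithm are all assumptions made for illustration. Each informant's PoS tags are turned into a probability distribution, and the JS divergence of every pair is collected under its generation and place combination.

```python
from collections import Counter
from itertools import combinations
from math import log2

def pos_distribution(tags, tagset):
    """Relative frequency of each PoS tag in one renarration."""
    counts = Counter(tags)
    total = sum(counts[t] for t in tagset)
    return [counts[t] / total for t in tagset]

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so bounded by 0 and 1)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def pairwise_divergences(informants, tagset):
    """Group the JS divergence of every informant pair by generation and place."""
    groups = {}
    for a, b in combinations(informants, 2):
        gen = "-".join(sorted((a["gen"], b["gen"])))                      # e.g. G1-G3
        place = "-".join(sorted((a["place"], b["place"]), reverse=True))  # e.g. Z-M
        d = js_divergence(pos_distribution(a["tags"], tagset),
                          pos_distribution(b["tags"], tagset))
        groups.setdefault((gen, place), []).append(d)
    return groups

# Hypothetical toy data, invented for illustration only.
TAGSET = ["NOUN", "VERB", "PRON", "ADV"]
informants = [
    {"id": "m1", "gen": "G1", "place": "M", "tags": ["NOUN", "VERB", "NOUN", "PRON"]},
    {"id": "z1", "gen": "G3", "place": "Z", "tags": ["NOUN", "VERB", "VERB", "ADV"]},
    {"id": "z2", "gen": "G3", "place": "Z", "tags": ["NOUN", "NOUN", "VERB", "PRON"]},
]
groups = pairwise_divergences(informants, TAGSET)
for key, values in sorted(groups.items()):
    print(key, [round(v, 3) for v in values])
```

Each value in `groups` corresponds to one dot in a plot of this kind; the box and violin summaries are then computed per group.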

4.5 Age of speaker

The composition of the three generations in our sample is determined by shared linguistic biography based on age of first exposure to German (see Section 3.2). As a consequence, divergence in PoS usage is correlated not only with generation as discussed in Sections 4.2 and 4.3 but also with age, as seen in Fig. 5: in pairs of young and old speakers (upper left and lower right part) the density of increased divergence is easy to spot, with the brighter colors showing larger JS divergence.


Figure 5

Dot plot of pairwise JS divergence of PoS usage in Albanian Pear Story renarrations across all pairs of informants arranged by age of informants. The brighter the color of the dot, the larger the JS divergence of one pair of probability distributions.


To further explore the effects of age, we focus on the larger age difference between G1 and G3 speakers and control for possible effects of the place of residence (Munich and Zurich); see Fig. 6.


Figure 6

Dot plot of pairwise JS divergence of PoS usage in Albanian Pear Story renarrations in G1-G3 pairs from Munich (M) and Zurich (Z). A: by age and location of G3; B: by age difference between G1 and G3 (M-M = Munich-Munich pairs, Z-M = Zurich-Munich pairs, Z-Z = Zurich-Zurich pairs).


Panel A in Fig. 6 shows that the divergence between G1 and G3 decreases as the age of G3 increases. Panel B of Fig. 6 illustrates that across combinations of place, the increase of divergence with increasing age difference between G1 and G3 speakers largely follows the same trajectory, starting at different levels. The generally higher level of divergence for M-M pairs is consistent with the results discussed in Section 4.4.
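One way to make "the same trajectory, starting at different levels" concrete is to fit a straight line per place combination and compare slopes and intercepts. The following Python sketch does this with ordinary least squares. It is an illustrative assumption, not the procedure used for Fig. 6, and all ages and divergence values in it are invented.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical G1-G3 pairs: (place combination, age of G1, age of G3, JS divergence).
# All numbers are invented for illustration and are not taken from the study.
pairs = [
    ("M-M", 60, 12, 0.20), ("M-M", 55, 17, 0.15), ("M-M", 50, 22, 0.10),
    ("Z-Z", 60, 12, 0.14), ("Z-Z", 55, 17, 0.09), ("Z-Z", 50, 22, 0.04),
]

# Collect (age difference, divergence) points per place combination.
by_place = defaultdict(list)
for place, age_g1, age_g3, jsd in pairs:
    by_place[place].append((age_g1 - age_g3, jsd))

# Similar slopes with different intercepts correspond to parallel
# trajectories that start at different levels.
fits = {place: fit_line(*zip(*points)) for place, points in by_place.items()}
for place, (intercept, slope) in fits.items():
    print(place, round(intercept, 3), round(slope, 4))
```

In the toy data both place combinations have the same slope and differ only in their intercept, which is the pattern described for Panel B.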

Since the three G3 informants from Prishtina are older than 15, it remains unclear whether the increased level of divergence and dispersion in speakers younger than 15 is due to genre conventions being acquired only in late adolescence. Our study design does not allow us to investigate this in more detail. We can, however, approach the question by examining whether G3 speakers converge when renarrating in German. This is done in Section 4.6.

4.6 (Swiss) German

The results of the previous section give rise to the hypothesis that the increased PoS distribution divergence in pairs of G3 informants from Munich and Zurich is related to age. To test this, we take advantage of the fact that our informants from Zurich and Munich are bilingual speakers of (Swiss) German and Albanian: if the acquisition of conventions is overall driven by age, we expect the JS divergence in G3 pairs to be similar for both languages. Using the same methodology as for the Albanian renarrations, we analyze the (Swiss) German renarrations of the Pear Story recorded with our Zurich informants and compare the divergence in (Swiss) German against the divergence in Albanian. Results are reported in Fig. 7.
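The logic of this comparison can be sketched in Python as follows. The sketch assumes that the pairwise JS divergences of the G3-G3 pairs have already been computed for each language; the values, variable names, and the choice of median and interquartile range as summaries are illustrative assumptions, not the study's data or procedure.

```python
from statistics import median, quantiles

def summarize(divergences):
    """Median and interquartile range of a list of pairwise JS divergences."""
    q1, _, q3 = quantiles(divergences, n=4)
    return {"median": median(divergences), "iqr": q3 - q1}

# Hypothetical G3-G3 divergences per language, invented for illustration only.
g3_albanian = [0.02, 0.05, 0.08, 0.11, 0.15]
g3_german = [0.01, 0.02, 0.03, 0.04, 0.05]

summary = {
    "Albanian": summarize(g3_albanian),
    "(Swiss) German": summarize(g3_german),
}
for language, stats in summary.items():
    print(language, stats)
```

If convergence were driven by age alone, the two summaries would be expected to look alike; a clearly lower median and dispersion in one language would instead point to language-specific input and practice.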


Figure 7

Box, violin, and dot plots of pairwise JS divergence of PoS usage in Albanian and (Swiss) German Pear Story renarrations of informants from Zurich. Each dot represents the JS divergence of one pair of probability distributions, e.g., of two G1 informants in G1-G1, a G1 and a G2 informant in G1-G2, etc.


In both (Swiss) German and Albanian renarrations of the Pear Story, the divergence between pairs of G1-G3, G2-G2, and G2-G3 speakers remains more or less the same. Noteworthy differences between Albanian and (Swiss) German emerge for G1-G1 pairs with a higher mean level of divergence and dispersion in the German version. A similar pattern, but less pronounced, is observable in G1-G2 pairs. The inverse is the case for G3-G3 pairs, with the German version displaying less median divergence and dispersion than the Albanian one.

We observe that our G3 informants diverge much less in the Swiss German version of their renarration than in the Albanian version. Evidently, their younger age does not prevent them from converging with respect to PoS usage when renarrating the Pear Story. Of course, these results do not tell us whether this convergence corresponds to that of Swiss German speakers without Albanian heritage. A purely qualitative assessment of the Swiss German renarrations from our G3 informants by Swiss German speakers without a migrant background did not reveal any obvious differences, but within the scope of this article it is not possible to follow up on this issue empirically. Ultimately, what is decisive here is the group-internal convergence of our G3 informants from Zurich in their usage of PoS when using German.

5 Discussion

We started from the assumption that heritage languages might provide a window into the black box of migration-related splits of speaker communities and their languages. We explicitly focused on language usage, leaving aside questions concerning the heritage speakers’ underlying linguistic competence. From a historical perspective, using competence as a baseline is not operationalizable: we usually do not have any knowledge about the competence of individuals or groups of individuals for historical states.

In our investigation we analyzed PoS usage of speakers of Gheg Albanian in a German-speaking context when renarrating the Pear Story video, taking it as an indicator of narrative conventions. A comparison with PoS usage in folklore texts shows that this is indeed a sensible proxy for comparing patterns in communicative practice in different pragmatic contexts. Since we see meaningful patterns related to these pragmatic contexts, we interpret PoS usage as bearing a signal of sociocultural behavior, which can be investigated in various settings. Here we focus on the parameters of linguistic socialization, biography, and age.

Given the equal difference in PoS usage for folklore texts and G1/G2 Pear Story renarration (Fig. 1), we assume that narration techniques and their characteristics are entrenched in social practice and remain stable in an environment with Albanian as majority language and language of basic socialization, at least for the time span we cover (ca. 70 years for the oldest G1 informants). Zooming in on different sociocultural contexts reveals a more differentiated picture.

The comparison of the pairwise Jensen-Shannon divergence of PoS usage across three generations (G1, G2, G3) indicates that G1 and G2 show little divergence from each other. The amount of divergence between generation G1 and G2 (G1-G2 pairs) is very similar to within-generation divergence (G1-G1, G2-G2 pairs; Fig. 2). Compared to these pairs, divergence between cross-generation pairs that involve G3 participants is increased, for G1-G3 pairs more than for G2-G3 pairs. This hints at the relevance of sociocultural factors that contribute to the vertical (cross-generation) and horizontal (within-generation) transmission of narrative conventions.

G1 and G2 in our sample of participants have in common that they grew up in an Albanian-speaking environment with basic schooling in Albanian. They thereby differ from G3 (who grew up in an environment with both German and heritage Albanian, but were schooled in German). This suggests that overall, the shared linguistic socialization and biography have led G1 and G2 speakers to develop similar conventions in employing PoS in narrative contexts as mirrored in the similar PoS usage profiles. While G1 and G2 diverge from the G3 heritage Albanian speakers from Zurich and Munich, they hardly diverge from speakers from Prishtina of similar ages to the G3 speakers (Fig. 3). We interpret this as evidence that upon migrating, G1 and G2 kept a generation-internal coherence and cross-generational similarity in the new surroundings. Once acquired, conventions with respect to PoS in narration seem to be stable even with changing sociocultural embedding.

This brings age into play. For our informants, social generation and age are highly correlated in many cases. Overall we observe a distinct effect of age in the pairwise divergence of the relevant G1-G3 pairings (Fig. 6). Divergences decrease with increasing age of G3 speakers and increase with increasing age difference between G1 and G3 speakers from both locations; the small differences in the general level of divergence (Fig. 4) may relate to differences in the Albanian-speaking networks. For the sake of argument one could assume that PoS usage in renarration generally converges late in ontogeny and requires a sufficiently rich and stable input. In fact, studies on the development of narrative competences in fluent bilingual children with no difference between their languages report that these children develop narrative competences simultaneously in both languages and become more proficient with age (Laurent, Nicoladis, and Marentette, 2015: 67). On the other hand, in subsequent bilinguals in an L2 environment, the development of narrative skills in L2 is said to lag behind that of monolinguals, mainly up to age 10 (Verhoeven, 2004: 452). The sociolinguistic setting of our G3 is more complex than the classical situations of simultaneous and/or subsequent bilingualism. Starting with Albanian as L1, G3 speakers enter the educational system with its strong preference for German and the simultaneous emergence of social networks dominated by (Swiss) German. In this process the L1 Albanian of early childhood becomes a minority language with usually less socioeconomic and cultural prestige. Our results are compatible with the view that G3 Albanian heritage speakers basically follow a developmental trajectory comparable to that of the subsequent bilinguals in an L2 environment mentioned above. A possible reason for their developing Albanian narrative conventions in adolescence could be an increased interest in (re)learning the L1 of their childhood. Montrul (2015: 99–100) suggests such an idealized longitudinal development for heritage speakers who were simultaneous bilinguals in early childhood. However, for our cross-sectional data this interpretation is speculative.

That age is not the only relevant factor in this development is shown by the convergence of G3 speakers in PoS usage when narrating in Swiss German. The results from the comparison of Swiss German versions across Zurich informants support the idea that G3 participants converge in how they employ PoS in their majority language when renarrating the Pear Story. In fact, G3 within-generation divergence of renarrations in Swiss German, the dominant majority language in their sociocultural context, is much smaller and less dispersed than in Albanian (G3-G3 pairs in Fig. 7), and very similar to the G1 within-generation divergence in Albanian. To be sure, the results display convergence but say nothing about whether G3 speakers from Zurich apply the same strategies as their non-migrant-background Swiss German peers, or about how they perform in terms of morphology and lexicon. We do not know to what extent our G3 participants differ from speakers of the other majority language and/or possibly converge on an ethnolectal variety of (Swiss) German (Tissot, Schmid, and Galliker, 2011; Morand, Schwab, and Schmid, 2020). A qualitative assessment of the Swiss German versions of the renarration by non-migrant-background speakers does not support the assumption that G3 speakers differ substantially from native performance.

Therefore, all other things being equal, the observed differences within and across generations in heritage Albanian suggest that the emergence of narrative conventions depends on the sociocultural context via the difference in quality and quantity of linguistic input and exposure to communicative practices as well as opportunities for producing narratives (cf. Verhoeven, 2004: 452). Accordingly, we hypothesize that the quality and quantity of specific narrative input in Albanian from both G1 and G2 as well as from the Albanian-speaking peer generation and the opportunities for production have simply not been sufficient for our G3 speakers to converge more on PoS usage in Albanian narrations.

Certainly, sociocultural factors beyond our control might have been at work as well. An intriguing observation from our experimental setting was the influence of language ideologies, as manifested in the sociocultural prestige of (Tosk-based) Standard Albanian, the language of schooling and official media in all Albanian-speaking regions. This politically promoted ideology radiates into the heritage communities as well and triggers expectations. Some of our informants considered it situationally appropriate to aspire to adhere to Standard Albanian grammar in their renarrations, irrespective of their competence. This is an interesting sociolinguistic behavior and a potential confound, but we have no reason to assume that it influences PoS usage in a substantial way.

The relevance of input from the peer group is shown by our results for the renarrations in Swiss German. Here, too, G3 diverges from G1 and G2, but differently from the Albanian renarrations, the divergence we observe for Swiss German relates to an increased generation-internal convergence. G1 and G2 speakers do not converge at the same level in Swiss German as G3 speakers do. This suggests that the demographic G1 and G2 generations do not—and cannot—provide sufficient Swiss German input concerning narrative conventions to G3 speakers. Despite this lack of input from their ancestor generation, G3 speakers converge in PoS usage. This can be ascribed to the common practice and input in daily interaction with their peer generation and other speakers of (Swiss) German—including speakers with and without Albanian heritage background. Concerning the apparent time dynamics, this finding lends support to the assumption that the lack of interaction in Albanian after early childhood within G3—within their peer group—is an important factor for the transmission of narrative practices: it highlights the relevance of horizontal (intra-generational) transmission in the emergence of conventions (see also Morin, 2016).

6 Conclusions

Our findings allow for insight into the processes of growing variation and incipient change in situations of a split of speaker communities. They suggest that the major trigger for the divergences in PoS usage in a specific pragmatic context that we observe for our data is the changing sociocultural embedding and its impact on communicative practice. We observe that in minority/heritage languages, variation in a successor generation—indicated by decreased convergence and higher amount of dispersion—increases as a result of changing linguistic behavior and practice: the amount of exchange in terms of input received and output produced with the ancestor generation is smaller than in the source population setting. Thereby, our results also provide empirical insight into the roles of vertical and horizontal transmission on the one hand and on the emergence of “collective grammars” resulting from the mutual approximation of speakers in their narrative behavior on the other (Dediu et al., 2013; Dąbrowska, 2015). Because of the pairwise comparison within and across generations, the results also illustrate how individual behaviors, PoS usage in our case, might not be exactly the same for two single individuals, but may in sum converge towards evolving conventions.

Crucially, the increasing variation and its potential for actuation and change that we observe shows the importance of transmission processes via the changing linguistic behavior and convention within the speaker community. We take our results as evidence for the importance of narrative conventions as cultural techniques shared by social groups. These conventions are visible at a very overt layer of language usage and are presumably more important for social cohesion than morphology and phonology. The latter can be taught in isolation while the former require exposure. It thus seems fair to assume that in addition to particular structural genealogical signals in morphology, syntax, and lexicon, languages may also bear cultural signals of language usage that manifest themselves in the pragmatic conventions of speaker communities.

Supplementary materials

The script and data that were used to generate the analysis presented in this article are available online.

Acknowledgments

We gratefully acknowledge the support of the Swiss National Science Foundation (grant number 100015L_182126/1) and the University Research Priority Program “Language and Space” at the University of Zurich. In preparing this study we profited a lot from the input from our LMU Munich project partner Claudia Riehl (DFG grant RI 675/11–1). A huge faleminderit goes to Artan Islamaj, Lule Adili, Mimoza Avdiji, Dafina Salihu, and Ariana Dragusha, who recorded the Zurich renarrations, and transcribed and annotated all Albanian data; Naxhi Selimi also recorded some data in Zurich. Blerina Kelmendi supervised the collection and transcription of the Munich data. Magdalena Plamada provided technical support. Merci to Manuela Troxler and Nadja Naef for transcribing and annotating the Swiss German renarrations. We would also like to thank the two anonymous reviewers and the editors for their highly valuable remarks, which helped to improve the paper considerably.

References

  • Adler, Astrid. 2019. Sprachstatistik in Deutschland. Deutsche Sprache 47(3): 197–219.

  • Ajeti, Idriz. 2002. Shqipja standarde dhe shoqëria kosovare sot. Studime Filologjike 3(4): 23–29.

  • Anstatt, Tanja. 2018. Wortfrequenz und Textsorten. In Anna-Maria Meyer and Ljiljana Reinkowski (eds.), Im Rhythmus der Linguistik. Festschrift für Sebastian Kempgen zum 65. Geburtstag. Bamberg: University of Bamberg Press, 33–58.

  • Bailey, Guy, Tom Wilke, Jan Tillery, and Lori Sand. 1991. The apparent time construct. Language Variation and Change 3: 241–264.

  • Beckner, Clay, Richard Blythe, Joan Bybee, Morten H. Christiansen, William Croft, Nick C. Ellis, John Holland, Jinyun Ke, Diane Larsen-Freeman, and Tom Schoenemann. 2009. Language is a complex adaptive system. Position paper. Language Learning 59: 1–26.

  • Benmamoun, Elabbas, Silvina Montrul, and Maria Polinsky. 2013. Heritage languages and their speakers. Opportunities and challenges for linguistics. Theoretical Linguistics 39.3–4. https://doi.org/10.1515/tl-2013–0009.

  • Biber, Douglas. 2012. Register as a predictor of linguistic variation. Corpus Linguistics and Linguistic Theory 8(1): 9–37. https://doi.org/10.1515/cllt-2012–0002.

  • Bloomfield, Leonard. 1933. Language. New York: Holt.

  • Blythe, Richard and William Croft. 2021. How individuals change language. PLOS ONE 16.6, e0252582. https://doi.org/10.1371/journal.pone.0252582.

  • Buchholz, Oda and Wilfried Fiedler. 1987. Albanische Grammatik. Leipzig: Verlag Enzyklopädie.

  • Bundesamt für Statistik. 2022. Sprachenlandschaft in der Schweiz. Neuchâtel: BFS. https://dam-api.bfs.admin.ch/hub/api/dam/assets/23164427/master (accessed February 23, 2023).

  • Burri Sharani, Barbara, Denise Efionayi-Mäder, Stephan Hammer, Marco Pecoraro, Bernhard Soland, Astrit Tsaka, and Chantal Wyssmüller. 2010. Die kosovarische Bevölkerung in der Schweiz. Bundesamt für Migration (BFM). https://www.sem.admin.ch/dam/data/sem/publiservice/publikationen/diaspora/diasporastudie-kosovo-d.pdf (accessed February 23, 2023).

  • Çetta, Anton. 1982. Përralla. Prishtina: Instituti Albanologjik i Prishtinës.

  • Chafe, Wallace, ed. 1980. The Pear Stories. Cognitive, cultural and linguistic aspects of narrative production. Norwood, NJ: Ablex.

  • Chelliah, Shobhana L. 2016. Responsive methodology. Perspectives on data gathering and language documentation in India. Journal of South Asian Languages and Linguistics 3(2): 175–195.

  • Clark, Herbert H. 1996. Using language. Cambridge: Cambridge University Press.

  • Croft, William. 2000. Explaining language change. An evolutionary approach. London: Longman.

  • Dąbrowska, Ewa. 2015. Individual differences in grammatical knowledge. In Ewa Dąbrowska and Dagmar Divjak (eds.), Handbook of cognitive linguistics, 650–668. Berlin, Boston: De Gruyter.

  • Dediu, Dan, Michael Cysouw, Stephen Levinson, Andrea Baronchelli, Morten Christiansen, William Croft, Nicholas Evans, Simon Garrod, Russell Gray, Anne Kandler, and Elena Lieven. 2013. Cultural evolution of language. In Peter Richerson and Morten Christiansen (eds.), Cultural Evolution. Society, Technology, Language, and Religion, 303–332. Cambridge, MA: MIT Press.

  • Dobrushina, Nina and Elena Sokur. 2022. Spoken corpora of Slavic languages. Russian Linguistics 46(2): 77–93. https://doi.org/10.1007/s11185-022-09254-9.

  • Drost, Hajk-Georg. 2018. Philentropy: Information Theory and Distance Quantification with R. Journal of Open Source Software. https://doi.org/10.21105/joss.00765.
