
Sentiments Towards Heritage Languages and Their Speakers

In: Heritage Language Journal
Authors:
Clara Fridman PhD Candidate, Department of English Literature and Linguistics, Bar Ilan University Ramat Gan Israel

https://orcid.org/0000-0002-0675-3586
and
Onur Özsoy Postdoctoral Researcher, Research area 2: Language development and multilingualism, Leibniz-Zentrum Allgemeine Sprachwissenschaft (ZAS) Berlin Germany

https://orcid.org/0000-0003-3617-4697
Open Access

Abstract

Research on heritage languages has grown in both volume and scale over the last 30 years. More recently, researchers have begun observing shifts in the ways in which heritage languages and their speakers are being portrayed within the field, newly emphasizing more positive characterizations. In the present meta-analysis, we combine two methodologies to test the extent to which these positive trends in sentiment are reflected in the field of heritage language research: an automated sentiment analysis of 1,034 abstracts of relevant publications and a survey of 80 researchers in the field. Our findings show that while researchers report multiple positive trends in the field, the sentiments reflected in articles have always been consistently positive. Additionally, we provide novel metadata regarding the languages, subfields, and countries most represented within the field of heritage linguistics and discuss implications for future work.

1 Introduction

Heritage language (HL) research as it is known today emerged 30 years ago and has become a major productive stream in bilingualism research (Lynch, 2014). Scopus data show that the number of publications in this field has increased by 2500% within this time frame, from fewer than ten per year before 1994 to more than 250 articles a year since 2019 (Elsevier, 2023). However, the vast majority of these studies have focused on Spanish, Korean, and Chinese as HLs in combination with English as a majority language (Scontras & Putnam, 2020), leading researchers to call for increased diversity in subject languages. Furthermore, certain linguistic subdisciplines have been over-represented in the literature compared to others, with the field of morphosyntax taking the lead (Polinsky & Scontras, 2020). This, too, has begun to shift in recent years as other subdisciplines, such as semantics and lexicon, are gaining traction. Finally, the methodological approaches endorsed by the field appear to be shifting from a paradigm comparing HL speakers to monolinguals to a focus on individual differences within HL-speaking communities. This latter trend, in particular, has been credited with shifting HL research from a more negative view of HL speakers to a more positive one, a sentiment which has been echoed ubiquitously in recent HL-focused discussions, conferences, and publications. However, these claims lead us to question: Has there truly been a shift towards seeing HL speakers and HLs in general in a more positive light than before? Is this shift reflected both in sentiments from the research community and through methodological sentiment analysis across publications over time?

This paper presents a meta-study of the field of HLs, spotlighting the most commonly observed trends in HL research. It combines a quantitative sentiment analysis of 1,034 journal articles on HLs with a qualitative survey of 80 HL researchers. This hybrid methodology allows us both to address the main question of this study – whether an observable positive trend in sentiment exists – and to obtain a comprehensive and holistic overview of the current state of the field of HL research.

1.1 Heritage Language (HL) Speakers

In order to be able to assess the field of HL research, we must first put forward a working definition of what constitutes an HL. When it comes to the details of this matter, it becomes clear that there is no consensus among researchers as to the exact thresholds for defining an HL speaker (Aalberse et al., 2019; Montrul & Polinsky, 2021; Polinsky, 2018; Wiese et al., 2022). Therefore, we leave aside, for future work, the debate about finding a unified definition and set up a working definition that most researchers will be able to agree on. In its broadest definition, an HL is a minority language that is acquired naturalistically in a society with a different dominant majority language. This is similar to Rothman’s (2009) definition, which adds that an HL speaker must have some proficiency in the language to be classified as such. What unifies all HL researchers is the idea that HLs are distinct from other types of bilingual acquisition scenarios, such as the most typical case of L2 acquisition that starts in school (Polinsky & Scontras, 2020). The nature of this distinction leads to necessary comparisons with other groups in an attempt to characterize what exactly sets HLs and their speakers apart. From the beginning of the field, such conceptual and theoretical debates have always shaped the way empirical results were interpreted. Among the most debated issues in the literature is the matter of appropriate comparisons, although some papers also argue for more methodological advancements and linguistic diversity to better represent the world’s HLs.

1.2 Away from Deficit Account

When analyzing HLs, researchers often compared monolingual “baseline” speakers of the HL and the “divergent” HL speakers (Rothman et al., 2023). In these settings, HLs were initially framed as “incomplete,” “deficient,” “limited,” or “attrited.” Divergences from standard, mainstream, or other “homeland” baselines were often categorized as erroneous and invalid. HL speakers were described as “semi-speakers,” “incomplete acquirers,” “pseudo-bilinguals,” and “recessive bilinguals” (see Polinsky, 2015 for an overview). In fact, the notion of HL speakers’ limited HL abilities as compared to more proficient speakers seemed to be so heavily implied in discussions of HLs that Polinsky (2018) distinguished balanced child bilinguals, who received consistent input in the HL, from “true heritage speakers.”

Putnam & Sánchez (2013) revisit the incomplete acquisition hypothesis by adopting core aspects of previous scholarship while proposing an alternative model that offers a more accurate depiction of heritage speakers’ linguistic development. Over time, more researchers started to challenge deficit-oriented frameworks and proposed that HLs should be viewed as full native language varieties in their own right (Higby et al., 2023; Pascual y Cabo & Rothman, 2012; Rothman et al., 2023; Rothman & Treffers-Daller, 2014; Serratrice, 2020; Wiese et al., 2022). Proponents of this perspective argue that HLs are an accelerated case of language change as a result of contact with other languages (Kupisch & Polinsky, 2022; Lohndal et al., 2019). Pires & Rothman (2009, p. 216) describe it as a “new source to test diachronic proposals that are complicated by the maintenance in formal dialects of properties argued to have been lost but which are still present in the standard dialect used to educate monolinguals.” However, there are also contributions that defend the application of the term “incomplete acquisition,” especially in generative linguistics, while also addressing the societal impact of such terminology (Dominguez et al., 2019). Montrul and Polinsky (2021) also argue, contrary to Polinsky’s earlier view (2018), that the HL should be considered a fully-fledged language in its own right, due to its systematic set of linguistic rules and properties. Doing so, they claim, will remove the stigma of HL speakers “not speaking right.” Another argument is that it is unreasonable to expect speakers of an HL to produce the same language as those immersed in the language at a societal level. Instead, it should be expected that HL speakers acquire the language according to their needs, e.g., Korean HL speakers in the U.S. might only acquire the more informal registers of their HL (Polinsky & Kagan, 2007).

1.3 Away from Monolingual-HL Comparisons

Recent findings show that L1 and Ln speakers often employ similar processing mechanisms, which supports a move away from monolingual-HL comparisons (Coughlin & Tremblay, 2014). The Research Unit Emerging Grammars in Language Contact (RUEG) study also addressed the issue of comparisons and framings, focusing on a resource-oriented rather than deficit-oriented view of HL speakers. Moving away from monolingual-HL comparisons, the RUEG study highlights that noncanonical linguistic patterns are present in both monolingual and bilingual speakers, challenging the traditional comparative framework (Wiese et al., 2022). For example, Iefremenko (2024) shows that monolinguals and HL speakers both make abundant use of postverbal predicates in Turkish, against reports from standard Turkish grammars that claim that postverbal predicates are highly restricted. The variability in language use among monolinguals often exceeds that observed in bilinguals, debunking the myth of a uniform monolingual standard (Wiese et al., 2022). Furthermore, the dominance of noncanonical patterns in informal speech registers among both groups suggests that HL speakers’ language use is not merely a result of “incomplete acquisition” but part of the natural spectrum of language variation, further problematizing the notion of baselines and the idea of selecting any appropriate baseline (Özsoy & Blum, 2023). These findings advocate for the recognition of HL speakers within the broader continuum of native speakers, emphasizing their linguistic contributions and the insights they offer into language development and variation (Kupisch & Rothman, 2018; Wiese et al., 2022).

Along similar lines, more recent work calls for considering HL and homeland varieties of a language as two separate dialects, rather than assessing the former as an underdeveloped version of the latter (Montrul & Polinsky, 2021). This allows researchers to portray HLs as varieties of languages that have their own grammars and lexicons. Furthermore, López et al. (2023) point out that comparing HL speakers to monolinguals implies that they should be reaching monolingual competencies and that they often fail to do so, when in reality, given their starkly different linguistic and social surroundings, there is no reason to expect identical outcomes. Comparing HL speakers, who have proficiency in the HL in addition to full fluency in their societal language (SL), to monolinguals, whose full attention is devoted to a single language, will necessarily disadvantage the HL speakers, who are being held to an unreasonable standard.

1.4 Towards New Comparisons

Unjustly expecting HL speakers to meet a monolingual baseline is problematic not only due to the deficit framework such a comparison elicits, but for scientific, practical, and demographic reasons as well.

In a recent impactful conceptual article, Rothman et al. (2023) show that comparisons between monolinguals and bilingual HL speakers are flawed, as they do not fulfill the scientific criteria of empirical control. Such comparisons allude to an analogy from medicine, where many studies compare a typical baseline group and an experimental treatment group that receives some kind of drug. Comparing monolinguals and bilinguals in a similar fashion would imply that bilingualism is some kind of “language drug” that can be given to a monolingual. However, the reality shows that bilingualism is a highly complex experience and practice that cannot simply be contrasted with a monolingual control group. Following from this, Rothman et al. argue for alternative research methodologies and empirical analyses that account better for the HL bilingual data. For example, they refer to a study by Bayram et al. (2019), who focus on the qualitative observation that HL speakers were able to produce passive structures, rather than quantifying the numbers and drawing an inappropriate comparison with monolinguals.

Kissling (2018) points out that HL speakers are already a highly heterogeneous group, with, for example, HL-Spanish speakers in the US potentially having roots in many different Spanish-speaking countries. Comparing them to a much more homogenous monolingual baseline, i.e., of speakers of a particular form of Spanish from one particular area, not only disadvantages the HL speakers but is simply an inappropriate comparison. Wiese et al.’s (2022) suggestion to remedy these and similar inadequate comparisons is to focus on variation and variability within different groups of native speakers, both HL speakers and monolinguals. Such a perspective is not only more equitable, the authors claim, but is a sounder methodology, both theoretically and empirically. Other researchers choose the approach of comparing HL speakers of the same HL but different SLs (e.g., Fridman et al., 2023; Özsoy et al., 2022; Rodina et al., 2020) or speakers of different HLs with the same SL (van Osch et al., 2023; Wiese et al., 2022).

Along with a methodological diversification in having different comparison groups, there has also been an increase in more experimental and online empirical methodologies such as eye-tracking and brain imaging.

1.5 Towards New Methodologies: Psycho- and Neurolinguistics in HL Research

The vast majority of HL research has concentrated on the oral or written outputs of HL speakers and on offline experimental tasks in which participants fill in blanks or complete sentences (Bayram et al., 2021). Experimental investigations are a crucial method for identifying which individual factors influence HL acquisition (Rothman & Slabakova, 2018). For instance, an eye-tracking study by Sagarra and Casillas (2023), which considered numerous individual differences, demonstrated that co-activation and type frequency enhance lexical access in HL speakers, while factors like age of onset are less significant. This type of psycholinguistic and large-scale experimental research on HLs is relatively recent compared to other subareas. Montrul (2022) highlights a general shortage of studies exploring the intersection of bilingualism and psycho- and neurolinguistics, particularly those examining implicit speaker knowledge. This sentiment is echoed in calls for more research in this area (Rothman, 2024). Similarly, Bayram et al. (2021) emphasize the limitations of claims based solely on offline studies and advocate for integrating these findings with results from online studies and analyses of individual differences. According to Bayram et al.’s (2024) recent review of HL research, there has been “a paradigm shift” towards studies that account for individual variation in HL speakers’ performance, as well as a move towards online tasks that assess processing, such as eye-tracking.

1.6 More Studies Using “Heritage Language” than Before: Linguistic Diversity

Early HL research has not only been limited in its methodologies, experimental paradigms, and comparison groups, it has also lacked diversity in the most basic context of language sciences: linguistic diversity. We have already underlined the vast increase of studies in the field of HLs (using this specific terminology). One would expect that this increase would be linear, leading to more world languages being represented. However, many researchers have pointed out that representation in HL research is still limited and there is a need to diversify to better inform linguistic theory. Lohndal et al. (2019) discuss the significant role of HL research in understanding linguistic diversity and emphasize the importance of this field for broader linguistic theories. They argue that HLs provide unique insights into how languages change and develop under different social and cognitive conditions. They further advocate for integrating these findings into the broader context of formal linguistic theories, which traditionally have focused on more uniform and stable language contexts. This call for increased research into HLs aligns with the need to document and analyze lesser-studied languages, as such research not only enriches our understanding of linguistic diversity but also informs language preservation efforts and educational policies for multilingual communities.

Recently, more research has assessed the diversity of languages that are studied in other fields of linguistics as well. For example, Kidd and Garcia (2022) established that less than 1.5% of the world’s languages have been examined in major language acquisition journals so far. Their call for increased diversity in language acquisition research has catalyzed many discussions and follow-up articles. For the field of psycholinguistics, Collart (2024) also finds that the vast majority of research has focused on English and a select few Indo-European languages, in total accounting for less than 2% of the world’s languages. Additionally, Collart finds that this typological bias leads to a bias in the phenomena that are investigated and in the resulting theories of language processing, which cannot easily generalize to a broader sample of languages.

Narrowing in on the field of HLs, in their article, “Lesser-studied Heritage Languages: An Appeal to the Dyad,” Scontras and Putnam (2020) emphasize the importance of looking closer at the language dyads in HL research. They argue that the examination of lesser-studied HLs, particularly in the context of bilingual dyads, provides valuable insights into the dynamics of language maintenance and shift. This approach highlights the significance of assessing the interaction between an HL and the dominant societal language to better understand the complexities of HL acquisition and use. Scontras and Putnam (2020) also point out the need for a more balanced representation in HL research, which has often been skewed towards more commonly studied languages such as Spanish, Korean, and Chinese.

Similarly, Scontras et al. (2015) argue that HLs are essential for linguistic theory because they provide empirical data that challenge and refine existing models of language acquisition and use. They stress that HLs often exhibit a blend of features from both the HL and the dominant societal language, providing a rich tapestry of linguistic phenomena that are crucial for understanding bilingualism and language contact. This hybridity is important for developing more accurate and comprehensive theories that reflect the true diversity of human language.

1.7 Research Questions and Hypotheses

The literature review highlighted several conceptual shifts that will be investigated through quantitative and qualitative analyses. One central idea paired with each shift is that accounts of HLs have been more negative in the past and are increasingly becoming positive over time. We will empirically test this by analyzing sentiments from research articles and from researchers themselves.

To date there have been no empirical investigations that assess whether views have really been as deficit-oriented as is being claimed, and subsequently whether there really is a trend towards more positive framings. Furthermore, while several researchers have anecdotally referenced shifting sentiments towards HLs and their speakers, we do not currently have a cohesive understanding of HL researchers’ views on this topic. To this end, we posed two primary research questions.1

RQ1. Have sentiments in HL research shifted over time?
RQ2. How do HL researchers perceive shifts in HL sentiments over time, if they exist at all?

For RQ1, we considered the following three hypotheses.

H0. There is no clear trend in sentiment toward HLs over the last 25 years.
H1. There has been a shift from framing HLs more negatively in the early years of the field to more positive framings since the late 2010s.
H2. Sentiments toward HLs/HSs vary based on the HL investigated and the linguistic subfield, and these sentiments have shifted over time.

RQ2 was conceptualized as an exploratory research direction, to be assessed in the context of HL researchers’ general demographic and academic background.

To address RQ1, we conducted an objective sentiment analysis of 1,034 HL papers that were published since the year 2000. To address RQ2, we surveyed 80 HL researchers about their work, their background, and their views on sentiments in HL research.

2 Methods

2.1 Sentiment Analysis

Sentiment analysis is a tool from computational linguistics that allows for the automatic classification of sentiments in a text (for an extensive review, see Taboada, 2016). This can be output as a scale from very negative to very positive, or just as a categorical variable with the levels “positive” and “negative.” The methodology is most commonly applied to online texts, predominantly in social media research. There are two methods to conduct sentiment analysis: the machine learning approach and the lexicon-based approach. In the machine learning approach, algorithms are trained on manually annotated sets of data and use patterns in these datasets to classify new texts. In lexicon-based approaches, sentiments are based on the values of individual words in a sentiment dictionary, which contains words and corresponding sentiment scores, e.g., awesome with a positive score and terrible with a negative score. In our case, this includes terms that were commonly used to describe HLs, such as deficient, incomplete, divergent, and non-standard. The most common positive words in our data include well, positive, strong, successful, increased, advanced, dynamic, growth, vitality, stronger, and creative. The most common negative words in our data include limited, negative, lack, loss, lower, errors, vulnerable, weaker, problematic, and struggle. In the current study, we chose a lexicon-based approach as it is known to be robust across different texts and does not require any additional annotation of training data which might add bias to the analysis (Taboada et al., 2011). In order to ensure that the algorithm was making sensible tagging decisions, we exported and manually checked the five most negatively and most positively rated words for each abstract. The results of this check showed that the tagger was working well.
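The lexicon-based approach and the manual check described above can be illustrated with a minimal sketch. It is not the tagger used in this study: the toy lexicon reuses a few of the words listed above, but the numeric scores, the function names, and the averaging rule are assumptions made purely for illustration.

```python
import re

# Hypothetical toy lexicon: the words appear in the lists above, but the
# scores (+1.0 / -1.0) are invented for illustration only.
TOY_LEXICON = {
    "strong": 1.0, "successful": 1.0, "creative": 1.0, "growth": 1.0,
    "limited": -1.0, "lack": -1.0, "loss": -1.0, "errors": -1.0,
    "deficient": -1.0, "incomplete": -1.0,
}


def score_words(text, lexicon=TOY_LEXICON):
    """Return (word, score) pairs for every lexicon word found in the text."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [(tok, lexicon[tok]) for tok in tokens if tok in lexicon]


def text_score(text, lexicon=TOY_LEXICON):
    """Mean score of the matched words; 0.0 when no lexicon word occurs."""
    hits = score_words(text, lexicon)
    return sum(score for _, score in hits) / len(hits) if hits else 0.0


def extreme_words(text, n=5, lexicon=TOY_LEXICON):
    """The n most negative and n most positive matched words, for manual review."""
    hits = sorted(set(score_words(text, lexicon)), key=lambda pair: (pair[1], pair[0]))
    negative = [word for word, score in hits if score < 0][:n]
    positive = [word for word, score in reversed(hits) if score > 0][:n]
    return negative, positive
```

Calling `extreme_words` on each abstract yields the kind of per-abstract word lists that can be inspected by hand to judge whether the tagging decisions are sensible.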

As we aimed to conduct a comprehensive meta-analysis of the field of HL research, we needed to survey as much of the existing literature as possible. This problem could only be solved using a quantitative approach – in this case the sentiment analysis – that captures the largest amount of meaningful data. To achieve this, we relied on Scopus, one of the largest abstract and citation databases available (Elsevier, 2023). Using Scopus also ensured a high quality of journals represented, as they are regularly assessed based on several established citation measures such as the h-Index and CiteScore (Baas et al., 2020).

2.2 Data Export

We queried the n-gram “heritage language” in the Scopus interface and limited the search to journals. We expected that including books and dissertations would skew the results since they often work on several HLs at once. While we recognize the significant role that edited volumes have played in the field of HL studies, particularly during the 1990s and 2000s, when much of the foundational work was published in these formats, we chose to limit our sample to journal articles for several reasons. First, this decision ensured better comparability between data points, as the scope and structure of journal articles differ from that of edited volumes. Second, accessibility of data and abstracts was far better for journal articles indexed in Scopus, facilitating a more streamlined and consistent data extraction process. Although volumes have had a profound impact on HL research, we believe this approach allowed for a clearer, more homogeneous dataset for sentiment analysis.

Our export contained all articles that were indexed in Scopus and were published before June 5, 2023, the date when we accessed the data. In total, we exported 1,513 journal articles with relevant metadata such as author names, abstracts, year of publication, publisher, journal, etc. In addition to these automatically exported data, we extended our database with manual annotations of meta-variables that might be informative for the research questions at hand.2 These variables are potentially interesting to describe trends and sentiments in HL research over time, namely the HL(s) studied, the majority language(s) studied, the linguistic subfield, and the labeling of the groups as native, monolingual, or heritage. Articles that were not empirical investigations of HLs were manually excluded from the study, resulting in a final database of 1,114 articles. Excluded articles were generally studies that reviewed the concept of HLs or provided a broader overview of the literature regarding a specific topic.
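As a rough illustration of how an exported record might be combined with the manual annotations and then filtered, consider the following sketch. All field names, including the `is_empirical` flag, are hypothetical; the source does not describe the actual layout of the database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Article:
    """One exported record plus manual annotations (all field names hypothetical)."""
    title: str
    year: int
    abstract: str
    heritage_languages: List[str] = field(default_factory=list)
    majority_languages: List[str] = field(default_factory=list)
    subfield: str = ""
    group_labels: List[str] = field(default_factory=list)  # e.g. "native", "monolingual", "heritage"
    is_empirical: bool = True  # assumed to be set by hand during annotation


def keep_empirical(articles):
    """Drop reviews and overviews, keeping only empirical investigations of HLs."""
    return [article for article in articles if article.is_empirical]
```

Keeping the exclusion decision as an explicit field, rather than deleting rows, would make it possible to revisit borderline cases later.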

2.3 Sentiment Tagger

We chose to analyze sentiments in abstracts rather than full articles for two reasons. First, sentiment analysis is more appropriate for texts where opinions rather than facts are expressed (Taboada, 2016). In research articles, methods and results sections present objective information, while the introduction and discussion sections can be a mix of summarizing previous literature and framing the current study. These latter components also characterize abstracts; thus, we chose to analyze abstracts over full articles in order to avoid a topic-detection problem. Any mention of article sentiments in the present study refers to the sentiments in a given article’s abstract. Second, only 487 of the 1,114 articles in our sample were published with open access, which would have made them suitable for web scraping. However, we did not want to reduce the sample of articles in our data by more than half and also did not want to engage in web scraping as it is legally unsettled. Thus, to maintain sample size and methodological certainty, we used the accessible Scopus abstracts.

In the next step, we imported our database to a Python script and used the VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis library (Hutto & Gilbert, 2014) (https://github.com/cjhutto/vaderSentiment). This library is especially designed for natural language processing with short texts in online media and was deemed appropriate for research abstracts, which are highly dense in information.
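A minimal sketch of this step is given below. The import path and the `polarity_scores` call follow the public vaderSentiment package; the cut-offs of ±0.05 on the compound score are the conventional values suggested in the VADER documentation, and it is an assumption that the same cut-offs were applied here. The function names are hypothetical.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a categorical label.

    The default threshold is the conventional VADER cut-off; whether the
    study used exactly this value is an assumption.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def label_abstracts(abstracts):
    """Score each abstract with VADER and attach a categorical label.

    Requires the third-party package `vaderSentiment`; the import is kept
    inside the function so the labelling rule can be used without it.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    results = []
    for text in abstracts:
        compound = analyzer.polarity_scores(text)["compound"]
        results.append((compound, label_from_compound(compound)))
    return results
```

Keeping the continuous compound score alongside the label allows both categorical tallies and regression on the underlying scale.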

We then imported the data, which were enriched with the sentiment scores and the manual annotations, into R (R Core Team, 2023) to conduct several descriptive analyses, mostly relying on the tidyverse family of packages (Wickham et al., 2019), and we built a Bayesian mixed-effects regression model in brms (Bürkner, 2017).

All scripts and data that are presented here can be reproduced at https://osf.io/f9csa/?view_only=1c02d69ee6524a2c892fb387756f71b8.

2.4 Survey

A fully automated sentiment analysis algorithm is not without its flaws. Therefore, it was important for us to personally reach out to researchers in the field to understand their own experiences and observations regarding changes in attitudes toward HLs and speakers.

A total of 80 participants filled out a questionnaire distributed to them through international mailing lists (including the ISB list, the heritage-language-list, and others), social media, and personal recommendations. The questionnaire first asked the binary question of whether the participant had observed any shifts in the portrayal and perception of HLs and speakers in research over the last few decades. Participants were then given the option of expanding on their response. Next, we collected background demographic data, including age, gender, and country of residence. A total of 17 questions were posed (for the full set of questions, see Appendix 1).

We then proceeded to a language-oriented line of questioning, asking participants to list the languages they speak and languages they have “studied/researched.” The intention behind the latter question was for participants to list languages on which they had conducted research. It appears, based on the responses, that some participants interpreted this question to include languages they had studied in school or other language courses. Where this interpretation was obvious (e.g., “Mandarin (studied at school, know nothing)”), the response was excluded.

Next, we asked whether the participants self-identify as HL speakers. This question was intended to show whether the subjects of the discipline are represented among its researchers. For the purposes of the present study, we defined HL speakers by the following criteria: (1) the participant acquired a family language (or several) from birth, (2) this family language was not the dominant majority language of the society where the participant grew up, and (3) the participant was born in the latter society or moved there in childhood.

We then collected professional information including current position, countries where the person had studied or held academic positions, years since receiving a PhD, and domains, populations, and methodologies featured in participants’ research. Information about studied domains was collected in an open checklist format, where we provided several options (phonetics, phonology, morphosyntax, lexicon, semantics, pragmatics, sociolinguistics, psycholinguistics, family language policy, reading, writing), and allowed users to add their own options as needed. When participants manually added domains that we felt were included under a previously included heading, we grouped them together in the analysis to avoid duplicate or near-duplicate categories.

For most questions, data were analyzed descriptively, by tallying responses. Most challenging was the classification of open-ended responses, especially the primary question regarding observations of shifts in the field. The authors manually transcribed the general idea(s) conveyed in each comment and mapped each to a sentiment (“positive,” “negative,” “neutral”) based on the perceived tone in the remark. Additionally, the observed shifts were distilled into key points and summed as well, to understand the most salient changes in the field, according to researchers. Some of these distillations were quite straightforward (“the field is moving away from a deficiency account”) while others were interpreted to the extent possible to fit a broader category (“there is a shift from seeing heritage language as a burden to seeing it as an asset,” categorized as “away from deficit account”). When a given response did not fit well into any major category, a new category was created (“heritage languages are no longer viewed as exotic” did not fit a previously defined label, so a new one was added). The goal in the categorization was to broadly capture major sentiments without over-segmenting, while accurately reflecting the original responses.

3 Results

3.1 Sentiment Analysis Results: Metadata Overview

Exporting more than 1,000 articles from Scopus allowed us not just to analyze sentiments but also to observe general descriptive patterns in the field of HL research based on a highly representative, almost fully comprehensive, overview of the published peer-reviewed literature. Here, we first outline these patterns based on frequencies and broad categories for the main subfields, language pairs, and countries of author affiliations that were studied. Then, we move on to the inferential analysis of sentiments and how these are affected by different independent factors.

Figure 1 shows the top ten subfields represented among the papers that were published using the term “heritage language.” Subfields were classified based on the keywords listed by the authors and the methodology and conceptualization used in the study. Initially, narrower subfields were defined and manually annotated for all articles (e.g., grammar processing studies). We then redefined the most frequent subfield categories using broader labels, manual annotation, and ChatGPT 4.0 to group more articles under each broad category. Specifically, after manually classifying all articles, we asked ChatGPT to reclassify these annotations to fit in one of the broader categories that we defined or in the “other” category. Articles that were clearly at the interface of two or more subfields were assigned more than one subfield. Our findings show that language acquisition, syntax, applied linguistics, and discourse have the highest number of publications in HL research. This is followed by publications in other areas that fall under the traditional notion of “grammar” in linguistic research (phonology, semantics, syntax).

Figure 1

Top 10 subfields in HL research based on a literature survey

Citation: Heritage Language Journal 21, 1 (2024) ; 10.1163/15507076-bja10034

Figure 2 shows the ten most commonly studied HLs and the majority languages with which they are paired, making visible which language combinations are studied most. The figure is limited to the ten most common HLs, as adding many less-studied HLs would have made it unreadable. The left side represents the HLs studied, and the y-axis reports the number of studies on each language. The transparent green-to-yellow bands lead to the right side, which is split by the majority languages in these studies. By far the most common majority language is English, followed by German as a distant second. By far the most commonly studied HL is Spanish, followed by Chinese and Korean, all of which are overwhelmingly studied in the context of English.

Figure 2

Alluvial diagram of top 10 most commonly studied pairs of HLs and majority languages

The counts in Table 1 represent all the articles that were listed for each country. This means that multiple counting is possible (and highly probable) if one author has published many times from the same country or if one author has affiliations in multiple countries. A full list of all 64 countries is found in Appendix 2. Strikingly, ranks 2–10 sum to only 871, which is still fewer than the 1,063 authors of HL research publications affiliated with the USA.

Table 1

Top 10 countries with which authors of HL research are affiliated.

3.2 Sentiment Analysis Results Overall

Overall, 820 abstracts (79%) were classified as having a positive sentiment, 161 abstracts (16%) as having a negative sentiment, and 53 abstracts (about 5%) as neutral. In order to understand the overall frequencies and distributions in the data, we present the number of abstracts of articles per year in a bar graph which also encodes sentiments. Figure 3 covers the period from 1987 to approximately 2022. Each bar represents a year and is segmented by the sentiment of the articles: positive (green), negative (red), and neutral (blue). From 1987 to the early 2000s, the number of articles in the field of HL research, regardless of sentiment, was relatively low. There is a very small number of articles with positive sentiment and an even smaller number with negative or neutral sentiment. Starting in the mid-2000s, there is a gradual increase in the number of articles. This trend accelerates around 2010, and from 2010 to 2020 there is a marked increase in the number of articles, particularly those with positive sentiment. In every year, the majority of articles are positive. Although there is a notable presence of articles with negative sentiment, the positive articles dominate the chart. Neutral sentiment articles remain quite scarce throughout the years.

Figure 3

Articles and their tagged sentiments over time. Sentiment categories are encoded by colors: blue = neutral, red = negative, green = positive

Overall, the data suggest a substantial increase in the number of articles over time with a strong consistent leaning towards positive sentiment, a moderate presence of negative sentiment, and a relatively minimal number of articles with a neutral sentiment.
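The shares reported in this section can be recomputed directly from the class counts. The following is a minimal sketch of that arithmetic; the function name `sentiment_shares` is our own illustrative choice and not part of the original analysis pipeline.

```python
# Counts of abstracts per sentiment class, as reported in Section 3.2.
counts = {"positive": 820, "negative": 161, "neutral": 53}


def sentiment_shares(counts):
    """Return each class's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total = sum(counts.values())
shares = sentiment_shares(counts)

print(f"total abstracts: {total}")
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Under these counts, the three classes together account for all 1,034 abstracts, with the neutral class making up roughly one in twenty.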

3.3 Inferential Analysis: Modeling Sentiment Scores

Table 2 reports a Bayesian regression model predicting the sentiment score of HL research articles from publication year, linguistic subfield, and the HL under study. The key findings from the analysis are summarized below, with estimates and their 89% credible intervals (CIs). In Bayesian statistics, an 89% credible interval is the range within which the value of a parameter lies with 89% probability, given the data and prior beliefs. In this framework, we treat an effect as meaningful, and thus as a basis for inference, when its CI does not include zero. Bayesian analysis thus provides a nuanced approach to estimating uncertainties and evaluating the substantive impact of research findings beyond mere statistical significance (Kruschke, 2014; McElreath, 2018).
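To make the decision rule concrete, the sketch below computes an equal-tailed 89% interval from a set of posterior draws and checks whether it excludes zero. It is an illustration only and not the modelling code used for Table 2: the posterior draws are simulated, and the function names and the two example distributions are assumptions introduced here.

```python
import random


def credible_interval(draws, mass=0.89):
    """Equal-tailed interval: drop (1 - mass) / 2 of the sorted draws from each tail."""
    ordered = sorted(draws)
    n = len(ordered)
    cut = int((1.0 - mass) / 2.0 * n)  # number of draws trimmed per tail
    return ordered[cut], ordered[n - 1 - cut]


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Hypothetical posterior draws, simulated purely for illustration.
rng = random.Random(1)
draws_clear = [rng.gauss(-0.20, 0.08) for _ in range(4000)]  # effect away from zero
draws_null = [rng.gauss(0.00, 0.08) for _ in range(4000)]    # effect centred on zero

for name, draws in [("clear effect", draws_clear), ("null effect", draws_null)]:
    interval = credible_interval(draws)
    print(name, [round(x, 2) for x in interval], excludes_zero(interval))
```

An equal-tailed interval is only one common choice; a highest-density interval would be computed differently and can give slightly different bounds for skewed posteriors.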

Table 2

Bayesian regression model estimates for sentiment scores in HL research articles, including predictors such as publication year, linguistic subfields, and specific HLs.

The first concern of our study was to check whether sentiments change over time. The estimate for the year of publication is 0.00, with a credible interval of –0.01 to 0.01. This suggests that the year of publication does not have a meaningful impact on sentiment scores.

We also contrasted sentiments between different linguistic subfields. Among the various linguistic subfields, the estimates and their respective credible intervals are as follows: “Other” has an estimate of 0.00 (CI: –0.06 to 0.07), “Applied linguistics” has an estimate of 0.09 (CI: –0.03 to 0.21), “Discourse” has an estimate of –0.02 (CI: –0.16 to 0.12), “Language acquisition” has an estimate of 0.09 (CI: 0.01 to 0.18), “Phonology-Phonetics” has an estimate of –0.11 (CI: –0.31 to 0.09), “Semantics” has an estimate of 0.10 (CI: –0.06 to 0.26), “Sociolinguistics” has an estimate of 0.06 (CI: –0.14 to 0.26), and “Syntax” has an estimate of –0.21 (CI: –0.34 to –0.09). Notably, the subfield of language acquisition showed a positive association with sentiment scores, whereas syntax demonstrated a negative association.

In this model, we further investigated differences in sentiments for the ten most frequent HLs in the literature. For specific HLs, the estimates and credible intervals are as follows: “Arabic” has an estimate of 0.04 (CI: –0.10 to 0.18), “Chinese” has an estimate of 0.01 (CI: –0.08 to 0.09), “Greek” has an estimate of –0.04 (CI: –0.22 to 0.14), “Italian” has an estimate of –0.06 (CI: –0.23 to 0.10), “Japanese” has an estimate of 0.01 (CI: –0.12 to 0.14), “Korean” has an estimate of 0.16 (CI: 0.06 to 0.25), “Polish” has an estimate of 0.03 (CI: –0.12 to 0.19), “Russian” has an estimate of –0.03 (CI: –0.13 to 0.07), “Spanish” has an estimate of 0.09 (CI: 0.03 to 0.16), and “Turkish” has an estimate of –0.20 (CI: –0.33 to –0.08). The results indicate that articles on Korean and Spanish HLs have higher sentiment scores, whereas those on Turkish and the syntax subfield exhibit lower sentiment scores. Considering interactions between subfields and year did not yield any meaningful results.
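The rule that an effect is meaningful when its credible interval excludes zero can be applied mechanically to the estimates listed above. The sketch below does so with the values as reported in this section; the helper name `meaningful` is our own and serves only as an illustration.

```python
# (estimate, lower bound, upper bound) of the 89% credible intervals reported in Section 3.3.
year = {"Year": (0.00, -0.01, 0.01)}

subfields = {
    "Other": (0.00, -0.06, 0.07),
    "Applied linguistics": (0.09, -0.03, 0.21),
    "Discourse": (-0.02, -0.16, 0.12),
    "Language acquisition": (0.09, 0.01, 0.18),
    "Phonology-Phonetics": (-0.11, -0.31, 0.09),
    "Semantics": (0.10, -0.06, 0.26),
    "Sociolinguistics": (0.06, -0.14, 0.26),
    "Syntax": (-0.21, -0.34, -0.09),
}

languages = {
    "Arabic": (0.04, -0.10, 0.18),
    "Chinese": (0.01, -0.08, 0.09),
    "Greek": (-0.04, -0.22, 0.14),
    "Italian": (-0.06, -0.23, 0.10),
    "Japanese": (0.01, -0.12, 0.14),
    "Korean": (0.16, 0.06, 0.25),
    "Polish": (0.03, -0.12, 0.19),
    "Russian": (-0.03, -0.13, 0.07),
    "Spanish": (0.09, 0.03, 0.16),
    "Turkish": (-0.20, -0.33, -0.08),
}


def meaningful(effects):
    """Map each effect whose interval excludes zero to the direction of the effect."""
    flagged = {}
    for name, (_estimate, lower, upper) in effects.items():
        if lower > 0:
            flagged[name] = "positive"
        elif upper < 0:
            flagged[name] = "negative"
    return flagged


print(meaningful(year))
print(meaningful(subfields))
print(meaningful(languages))
```

Applied to the reported values, the rule flags language acquisition, Korean, and Spanish as positive, syntax and Turkish as negative, and leaves publication year unflagged.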

The model explains a modest amount of the variance in sentiment scores, with a Bayesian R² of 0.076. This suggests that while the predictors contribute to understanding sentiment scores, most of the variance remains unexplained. Overall, the results show that effects on sentiment vary across linguistic subfields and specific HLs, providing insights into where positive sentiments are common and where challenges persist. They underscore the importance of considering specific subfield contexts when evaluating overall sentiment in HL research.

3.4 Survey Results

Eighty participants (median age: 41) completed the questionnaire, including 60 who identified as female, 19 as male, and one who chose not to respond. Participants were currently based in 20 different countries, where the top three represented were the US (25), Germany (23), and Norway (6). Participants reported proficiency in an average of four languages per person, with 25 of the total 80 participants self-identifying as HL speakers themselves. Forty participants reported publishing at least five papers in the field of HLs, and on average, participants had studied or held academic positions in 1.9 different countries, with the top three being the US (33), Germany (27), and the UK (8).

As shown in Fig. 4, the top five fields represented by survey respondents were morphosyntax, sociolinguistics, psycholinguistics, lexicon, and phonetics. Additionally, 40 participants reported studying children and 63 reported studying adults, with 16 noting they study neurotypical populations and four studying clinical populations. Finally, 38 participants reported using only offline measures, six used only online measures, and 32 reported using both.

Figure 4

Top 5 subfields represented in the questionnaire

As shown in Fig. 5, the top five languages studied by respondents were English, German, Spanish, Russian, and Turkish. The top two responses align logically with the top two most mentioned current locations of the researchers (the US and Germany) and all top three countries along the academic path (the US, Germany, and the UK). Spanish and Turkish are highly common HLs in the US and Germany, respectively, while Russian is often studied in combination with both English and German. This also aligns with the findings of the literature review (see Section 3.1).

Figure 5

Top 5 languages studied

The following observed shifts were those most often mentioned by the survey respondents (see Fig. 6). First, 23 participants observed a shift away from a deficit account. Seventeen noted that the field is growing, with more studies than before. Fourteen observed that fewer studies were comparing heritage and monolingual populations. Eight mentioned that the definition of HL speakers is becoming more inclusive, while six each listed a shift toward new methodologies, more awareness of HL speakers in language education, and a greater acceptance of HL speakers as speakers of a valid and unique dialect of the language.

Figure 6

The top shifts observed in HL research

Sentiments were overwhelmingly positive. A few responses were tagged as “neutral” when a sentiment could not be clearly discerned from the respondent’s phrasing of a particular trend. Responses were tagged as “negative” when they expressed that not enough changes had been implemented to avoid particular frameworks. For example, the following comment was tagged as having a positive sentiment: “the field is moving away from a deficiency account and towards embracing neurodiversity.” The use of “moving away” from a negative salience (“deficiency”) and using the positive word “embracing” led to this categorization. Meanwhile, the following response was tagged as neutral: “shift in the types of groups compared.” It was not apparent from this phrasing whether the respondent saw this shift as a positive or negative change, or merely an observation, so the “neutral” tag was used to avoid potentially ascribing an unintended sentiment. Finally, the following is an example of a response tagged with a negative sentiment: “still, the racist attitude toward HL speakers does not change.”

Participants’ responses were divided into main observed shifts, with approximately 1.7 observed shifts per respondent. Of the 121 total shifts, 99 were marked as having a positive sentiment (82%), 14 neutral, and eight negative.

4 Discussion

The present study set out to examine whether, as has been anecdotally claimed, sentiments towards HLs and their speakers have indeed become increasingly positive. To investigate, we surveyed 80 HL researchers on their experiences in the field and analyzed the sentiments of 1,034 article abstracts, penned by 1,392 unique authors.

To help contextualize our sample of researchers and their observations, we begin by delving into their reported backgrounds. Questionnaire results showed that the sampled researchers were primarily based in the United States and Germany (60% of respondents). These two countries were also the top two most represented among author affiliations from the quantitative paper analysis. Notably, these were also the top two countries most commonly found somewhere along the researchers’ academic paths. Taken together, these observations indicate that the United States and Germany are the two main hubs for HL research today.

The top five subfields of study, as reported by the researchers, also included some overlap with those tagged in the sample of papers, although the labels were not identical. For instance, syntax (or morphosyntax) and phonetics (or phonology-phonetics) featured in the top five of both datasets, along with psycholinguistics (overlapping with language acquisition and applied linguistics). This indicates that HL research extends to many areas of linguistics and is not restricted to any particular subdiscipline.

Finally, we assessed the top ten HLs featured in the papers and the dominant languages they were paired with. Unsurprisingly, English was the majority language in the vast majority of studies, studied primarily in conjunction with Spanish, Korean, Chinese, and Russian. Russian and Turkish were the main HLs studied in contact with German, the second-most-common majority language. The prominence of English and German as majority languages fully aligns with the trends from researchers’ countries of origin and affiliations. The languages from the most common pairs are further reflected in the languages researchers reported studying, with the top two – English and German – likely referencing majority languages, and the remaining – Spanish, Russian, and Turkish – referencing HLs. The prominence of these five languages both among the sampled papers and among researchers’ reports underscores the point made by Scontras and Putnam in 2020 that there is a strong need for research on a broader variety of languages and language pairs within the field.

Addressing the main interest of this paper, sentiments in HL research, we find in the literature as well as in the survey sentiment analysis that sentiments are overwhelmingly positive. Figure 7 further demonstrates that the distributions of sentiments in the literature and the researcher survey are highly similar, indicating that researchers have a good grasp of developments in their field. However, it is important to emphasize that we asked researchers specifically to describe trends in HL research. Many of the trends that they observed indicate positive shifts, implying that some past concepts and practices in the field were perceived as negative and researchers are sensing that there are positive developments now.

Figure 7

Distribution of sentiments in the literature (n=1,034) and surveys (n=80)

Finally, addressing the crux of the present study, the sentiment analysis of papers on HL research (619 papers in the reduced sample with the top ten HLs, and 1,034 in the full sample with all HLs) found no meaningful change in sentiments over the last two decades. When considering the predictors of time, the most commonly studied subfields, and the most commonly studied HLs, no meaningful sentiment effects were found, with only a few exceptions. Studies in the subfield of syntax and studies focusing on HL-Turkish show lower sentiment scores, while studies focusing on HL-Korean and HL-Spanish, as well as studies in language acquisition, show higher sentiment scores. Reflecting on potential reasons for these findings, we observe that syntax has been the most studied domain of HL development and has the strongest tradition of comparisons to monolingual baselines, leading to deficit-based framings of HL speakers. Thus, when collapsing data from syntax papers over time, it is reasonable that we would observe this negative tilt. The distribution of sentiment effects among HL-Turkish, Korean, and Spanish may be related to the countries in which those languages are most commonly studied. Namely, HL-Turkish is most frequently studied in Europe, often paired with majority-German, while HL-Korean and HL-Spanish are most frequently studied in the United States. The American context may be more open to pluralistic perspectives than the European one (McKay, 1997; Ziegler, 2013), leading to more positive sentiments toward diverse language outcomes in the former and more negative ones in the latter. This explanation, however, does not account for why other HLs, especially those that are predominantly studied in only one of the aforementioned contexts, do not show the same effect. Thus, future studies should include “Country of Research” as a predictor in their models in order to further investigate this finding.

No changes in sentiments were observed in the papers over time, and our analysis found that the papers have consistently been predominantly positive. Yet, despite this static trend, researchers have clearly observed multiple shifts in the way HLs and their speakers are discussed, with over 80% of the shifts reported by surveyed researchers carrying a positive sentiment.

There are several potential explanations for this divergence. First, it is possible that the sampled abstracts were not sufficiently reflective of the sentiment of the full papers, and had we analyzed the full texts instead, a different picture would have emerged. While we do not reject this possibility, Multani (2024) found that sentiment analyses of larger scientific texts were less precise than those on abstracts, due to large neutral sections, such as methodologies.

Therefore, should a more detailed sentiment analysis be carried out in the future, it will be important to focus on more subjective sections, such as introductions and discussions only, in order to target areas where sentiments could be conveyed. Beyond the text samples themselves, it is possible that more positive language does not fully correlate with attitudes toward HLs or HL speakers. Perhaps the true sentiment is not captured in a sufficiently fine-grained manner.

Alternatively, there may be a divide between formal writing and informal presentations and discussions within the field. While the perspectives conveyed in the literature may not change as drastically, perhaps conversations among researchers in forums such as conferences, workshops, invited talks, social media, and others provided room for less nuanced sentiments. Thus, when these sentiments began to change, researchers were more sensitive to the shifts than an algorithm would be to abstract texts. This explanation highlights the importance of considering both more subjective and more objective methodologies for understanding trends in the field.

The specific shifts researchers reported observing aligned strongly with those that have been mentioned in the literature over the last few years. Most commonly, respondents cite a shift away from a deficit-based framing of HL speakers, a change that has been actively called for by many researchers (Rothman & Treffers-Daller, 2014; Wiese et al., 2022). Tentatively, it would seem that these calls are indeed yielding, or at least correlating with, palpable changes in the field. Complementing the shift away from deficit-based framings, researchers observed shifts away from comparisons between HL speakers and monolinguals (López et al., 2023; Rothman et al., 2023) and, intuitively, towards more inclusive definitions of HL speakers and languages and more HL-friendly classrooms, and towards accepting HLs in their own right. The observed shift of an increasing number of studies is fully supported by our data (see Fig. 3), indicating exponentially rising numbers of publications over time. Also backed by recent papers, researchers observed new methodologies emerging in the field, similarly indicating a developmental trend. Taken together, these observations lead to two insights. First, the trends that researchers are reporting are, in many cases, backed by data from publications. Second, there is strong alignment between trends that researchers report observing and those that are described in the literature. For HL research in general, this could be seen as an indicator that the field is vital and well-connected as researchers seem to observe real developments in the field accurately.

To summarize the findings from our study, we first gained an understanding of the main hubs of HL research and the primary languages being investigated. Both the analysis of 1,034 papers and the survey of 80 researchers demonstrated that source countries and target languages of study are limited to a select few, and thus we join prior calls from Scontras and Putnam (2020) and others to continue expanding the field both geographically and linguistically. Regarding sentiments, we learned that a few subfields exhibit a more negative attitude than others; namely, studies on syntax and on HL-Turkish. While we presented several potential explanations for these framings, ultimately, we call on researchers on these topics to join the rest of the field in a more optimistic approach. Finally, we found that, while papers have always been mostly positive in their outlook on HLs and their speakers, researchers have noticed multiple increasingly positive changes in the field in recent years. This finding highlights continuous positive growth and evolution within the discipline: Papers were already predominantly positive to begin with, yet framings continue to change for the better.

4.1 Limitations and Future Directions

A study combining a computational sentiment analysis with human survey data is not without its limitations. We are aware that limiting our n-gram search to “heritage language” potentially overlooks literature that investigates the same population but uses different terms such as “family language,” “immigrant language,” “regional language,” or “minority language.” However, as we were particularly interested in an analysis of research that is clearly identifiable as belonging to the subfield of HLs, we consider this limitation justified. Additionally, as noted by a reviewer, sentiment extraction methods can introduce noise, as they may capture positive language (e.g., “excellent correlation”) that does not reflect overall sentiment toward the research field. Varying writing styles and increasing pressure on researchers to present their work positively may also influence these results, and a qualitative analysis could provide further clarity.
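The kind of noise the reviewer describes can be illustrated with a deliberately simplified word-list scorer. The sketch below is a toy example and not the sentiment model used in this study: the lexicon, its weights, and both example sentences are hypothetical and chosen only to show how a methodological phrase can receive the same magnitude of score as a statement that frames HL speakers.

```python
# Toy lexicon with hypothetical weights, for illustration only.
TOY_LEXICON = {"excellent": 1.0, "robust": 0.5, "deficit": -1.0, "incomplete": -0.5}


def toy_score(text):
    """Sum the lexicon weights of the words in `text`, ignoring case and punctuation."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return sum(TOY_LEXICON.get(w, 0.0) for w in words)


# A sentence about a statistical result, with no stance toward HL speakers.
methods_sentence = "The two measures showed an excellent correlation."
# A sentence that frames HL speakers in deficit terms.
framing_sentence = "Heritage speakers show a deficit in case marking."

print(toy_score(methods_sentence))
print(toy_score(framing_sentence))
```

A word-level scorer of this kind cannot tell what the evaluative word is about, which is why restricting the analysis to more subjective sections, or adding a qualitative pass, could reduce this source of noise.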

Our questionnaire was designed with the aim of being concise while still giving us some demographic background on the researchers and giving them space to express their views on changes in the field. Had we expanded the questionnaire, many more interesting insights could have been gleaned. For instance, researchers could be asked what shifts they would like to observe in the field in the future, as a means of predicting and projecting future directions. Fine-tuning the existing questionnaire, we could have conducted the survey following, rather than in parallel to, collecting the literature data, and thus provide field labels based on the literature findings, allowing a more apples-to-apples comparison.

5 Conclusion

The field of HL research has been, and continues to be, rapidly growing and developing, as indicated by the exponentially rising number of publications in the discipline. As the field continues to evolve, it is important to pause and look back on how far the field has come, and how the field and its convictions have shifted (or remained consistent) over time. The present study found that research on HLs and their speakers has been consistently predominantly positive over the last two decades, while surveyed researchers noted many avenues in which the field has shifted positively in recent years. Taken together, these findings indicate a net positive growth within the discipline. However, based on the background data collected from HL researchers, we were able to identify points for further improvement. Namely, we found from both the survey and the paper analysis that most HL research comes from two countries – the US and Germany – and is limited to only a small set of language pairs. Therefore, we join prior calls for increased diversity. It is our hope that this paper will serve as a reflection of how far the field has come and where it has yet to go.

Notes on Contributors

Dr. Clara Fridman earned her PhD from Bar Ilan University. Her research investigates the lexical development of heritage speakers, ranging from innovative production patterns, to language processing among trilingual heritage speakers, to mapping proficiency and background factor networks across multiple language pairs.

Onur Özsoy examined morphosyntactic development in heritage speakers in his doctoral work at Humboldt-Universität and ZAS Berlin. His research investigates language development using eye-tracking and corpus linguistic methodologies, specializing in language acquisition and assessment in multilingual children and adults, with a focus on individual differences. He also engages in open science practices, Bayesian statistics, and Big Team Science projects.

Appendix 1: Questionnaire Questions

  • Do you feel that there have been shifts in the ways heritage languages / heritage speakers are discussed or portrayed in research over the last several decades?

    • Yes / No

  • What shifts (if any at all) have you noticed in the way heritage languages / heritage speakers are discussed or portrayed in research over the last two decades of research? (for example, shifts in the types of groups that are often compared, frameworks or definitions used or promoted, topics or perspectives covered, etc.)

  • Age in years (numeric)

  • Gender

  • In which country are you currently based?

  • How many languages do you speak / sign?

  • List the languages you speak or sign (in order of proficiency)

  • List the languages you have studied / researched

  • Do you identify as a heritage speaker according to the following criteria?

    1. You acquired a family language (or several) from birth

    2. Your family language is not the dominant majority language of the society where you grew up

    3. You were born in this society or moved there in childhood

      • Yes / No

  • Which best describes your current position?

    • PhD Student / Postdoctoral Fellow / Faculty Member / Senior Faculty Member (tenure) / Independent Researcher / Other

  • If you hold a PhD, how many years ago did you earn your PhD?

  • How many papers have you published in the field of heritage languages?

    • 0 / 1–4 / 5–9 / 10+

  • Which domain(s) do you primarily study?

    • Phonetics / Phonology / Morphosyntax / Semantics / Lexicon / Pragmatics / Sociolinguistics / Psycholinguistics / Family Language Policy / Reading / Writing / Other

  • Which population(s) do you primarily study?

    • Children / Adults / Neurotypical / Clinical

  • Which methodologies do you use?

    • Online / Offline

  • List countries in which you have studied or held an academic position (MA, PhD, Postdoc, professorship)

  • How did you find this survey?

Appendix 2: Countries Where Heritage Language Researchers Are Affiliated with Universities and Institutions

[Figure: full list of the 64 countries of author affiliation]

References

  • Aalberse, S., Backus, A., & Muysken, P. (2019). Heritage languages: A language contact approach (Studies in Bilingualism, Vol. 58). John Benjamins.

    • Search Google Scholar
    • Export Citation
  • Baas, J., Schotten, M., Plume, A., Côté, G., & Karimi, R. (2020). Scopus as a curated, high-quality bibliometric data source for academic research in quantitative science studies. Quantitative science studies, 1(1), 377386.

    • Search Google Scholar
    • Export Citation
  • Bayram, F., Kubota, M., & Pereira Soares, S. M. (2024). The next phase in heritage language studies: Methodological considerations and advancements. Frontiers in Psychology, 15, 1392474.

    • Search Google Scholar
    • Export Citation
  • Bayram, F., Pascual y Cabo, D., & Rothman, J. (2019). Intra-generational attrition: Contributions to heritage speaker competence. In M. S. Schmid & B. Köpke (Eds.), The Oxford Handbook of Language Attrition (pp. 446457). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780198793595.013.35

    • Search Google Scholar
    • Export Citation
  • Bayram, F., Rothman, J., Di Pisa, G., & Slabakova, R. (2021). Current trends and emerging methodologies in charting heritage language bilingual grammars. In S. Montrul & M. Polinsky (Eds.), The Cambridge Handbook of Heritage Languages and Linguistics (pp. 545578). Cambridge University Press.

    • Search Google Scholar
    • Export Citation
  • Bürkner, P. C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of statistical software, 80, 128.

  • Collart, A. (2024). A decade of language processing research: Which place for linguistic diversity?. Glossa Psycholinguistics, 3(1). https://doi.org/10.5070/G60111432.

    • Search Google Scholar
    • Export Citation
  • Coughlin, C. E., & Tremblay, A. (2014). Morphological decomposition in native and non-native French speakers. Bilingualism: Language and Cognition, 18(3), 524542.

    • Search Google Scholar
    • Export Citation
  • Dominguez, L., Hicks, G., & Slabakova, R. (2019). Terminology choice in generative acquisition research: The case of “incomplete acquisition” in heritage language grammars. Studies in Second Language Acquisition, 41(2), 241255.

    • Search Google Scholar
    • Export Citation
  • Elsevier. (2023). Scopus: Abstract and citation database. Retrieved June 15, 2023, from https://www.scopus.com.

  • Fridman, C., Polinsky, M., & Meir, N. (2023). Cross-linguistic influence meets diminished input: A comparative study of heritage Russian in contact with Hebrew and English. Second Language Research, 40(3), 675708. https://doi.org/10.1177/02676583231176379.

    • Search Google Scholar
    • Export Citation
  • Higby E., Gámez, E., & Holguín Mendoza, C. (2023). Challenging deficit frameworks in research on heritage language bilingualism. Applied Psycholinguistics, 44(4), 417430. https://doi.org/10.1017/S0142716423000048.

    • Search Google Scholar
    • Export Citation
  • Hutto, C., & Gilbert, E. (2014, May). Vader: A parsimonious rule-based model for sentiment analysis of social media text. Proceedings of the international AAAI conference on web and social media, 8(1), 216225.

    • Search Google Scholar
    • Export Citation
  • Iefremenko, K. (2024). Word order in Turkish and Kurmanji Kurdish in language contacts: Evidence of emerging varieties? [Doctoral dissertation, University of Potsdam].

  • Kidd, E., & Garcia, R. (2022). How diverse is child language acquisition research?. First Language, 42(6), 703735. https://doi.org/10.1177/01427237211066405.

    • Search Google Scholar
    • Export Citation
  • Kissling, E. M. (2018). An exploratory study of heritage Spanish rhotics: Addressing methodological challenges of heritage language phonetics research. Heritage Language Journal, 15(1), 2570.

    • Search Google Scholar
    • Export Citation
  • Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Elsevier.

  • Kupisch, T., & Polinsky, M. (2022). Language history on fast forward: Innovations in heritage languages and diachronic change. Bilingualism: Language and Cognition, 25(1), 112.

  • Kupisch, T., & Rothman, J. (2018). Terminology matters! Why difference is not incompleteness and how early child bilinguals are heritage speakers. International Journal of Bilingualism, 22(5), 564–582.

  • Lohndal, T., Rothman, J., Kupisch, T., & Westergaard, M. (2019). Heritage language acquisition: What it reveals and why it is important for formal linguistic theories. Language and Linguistics Compass, 13(12). https://doi.org/10.1111/lnc3.12357.

  • López, B. G., Luque, A., & Piña-Watson, B. (2023). Context, intersectionality, and resilience: Moving toward a more holistic study of bilingualism in cognitive science. Cultural Diversity and Ethnic Minority Psychology, 29(1), 24.

  • Lynch, A. (2014). The first decade of the Heritage Language Journal: A retrospective view of research on heritage languages. Heritage Language Journal, 11(3), 224–242.

  • McElreath, R. (2018). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.

  • McKay, S. L. (1997). Multilingualism in the United States. Annual Review of Applied Linguistics, 17, 242–262. https://doi.org/10.1017/S0267190500003378.

  • Montrul, S. (2022). Native speakers, interrupted: Differential object marking and language change in heritage languages. Cambridge University Press.

  • Montrul, S., & Polinsky, M. (Eds.). (2021). The Cambridge handbook of heritage languages and linguistics. Cambridge University Press.

  • Multani, P. K. (2024). Sentiment analysis using Python web scraping, NLTK, and R. [unpublished lab rotation report, Leibniz-Zentrum Allgemeine Sprachwissenschaft, Language Development and Multilingualism].

  • Özsoy, O., & Blum, F. (2023). Exploring individual variation in Turkish heritage speakers’ complex linguistic productions: Evidence from discourse markers. Applied Psycholinguistics, 44(4), 534–564.

  • Özsoy, O., Iefremenko, K., & Schroeder, C. (2022). Shifting and expanding clause combining strategies in heritage Turkish varieties. Languages, 7(3), 242.

  • Pascual y Cabo, D., & Rothman, J. (2012). The (il)logical problem of heritage speaker bilingualism and incomplete acquisition. Applied Linguistics, 33(4), 450–455.

  • Pires, A., & Rothman, J. (2009). Disentangling sources of incomplete acquisition: An explanation for competence divergence across heritage grammars. International Journal of Bilingualism, 13(2), 211–238.

  • Polinsky, M. (2015). Heritage languages and their speakers: State of the field, challenges, perspectives for future work, and methodologies. Zeitschrift für Fremdsprachwissenschaft, 26(1).

  • Polinsky, M. (2018). Bilingual children and adult heritage speakers: The range of comparison. International Journal of Bilingualism, 22(5), 547–563.

  • Polinsky, M., & Kagan, O. (2007). Heritage languages: In the ‘wild’ and in the classroom. Language and Linguistics Compass, 1(5), 368–395.

  • Polinsky, M., & Scontras, G. (2020). A roadmap for heritage language research. Bilingualism: Language and Cognition, 23(1), 50–55.

  • Putnam, M. T., & Sánchez, L. (2013). What’s so incomplete about incomplete acquisition? A prolegomenon to modeling heritage language grammars. Linguistic Approaches to Bilingualism, 3(4), 478–508.

  • R Core Team. (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/.

  • Rodina, Y., Kupisch, T., Meir, N., Mitrofanova, N., Urek, O., & Westergaard, M. (2020, March). Internal and external factors in heritage language acquisition: Evidence from heritage Russian in Israel, Germany, Norway, Latvia and the United Kingdom. Frontiers in Education, 5. https://doi.org/10.3389/feduc.2020.00020.

  • Rothman, J. (2009). Understanding the nature and outcomes of early bilingualism: Romance languages as heritage languages. International Journal of Bilingualism, 13(2), 155–163.

  • Rothman, J. (2024). Harnessing the bilingual descent down the mountain of life: Charting novel paths for Cognitive and Brain Reserves research. Bilingualism: Language and Cognition, 19. https://doi.org/10.1017/S1366728924000026.

  • Rothman, J., Bayram, F., DeLuca, V., Di Pisa, G., Dunabeitia, J. A., Gharibi, K., … & Wulff, S. (2023). Monolingual comparative normativity in bilingualism research is out of “control”: Arguments and alternatives. Applied Psycholinguistics, 44(3), 316–329. https://doi.org/10.1017/S0142716422000315.

  • Rothman, J., & Slabakova, R. (2018). The generative approach to SLA and its place in modern second language studies. Studies in Second Language Acquisition, 40(2), 417–442.

  • Rothman, J., & Treffers-Daller, J. (2014). A prolegomenon to the construct of the native speaker: Heritage speaker bilinguals are natives too! Applied Linguistics, 35(1), 93–98.

  • Sagarra, N., & Casillas, J. V. (2023). Practice beats age: Co-activation shapes heritage speakers’ lexical access more than age of onset. Frontiers in Psychology, 14, 1141174.

  • Scontras, G., Fuchs, Z., & Polinsky, M. (2015). Heritage language and linguistic theory. Frontiers in Psychology, 6, 1545.

  • Scontras, G., & Putnam, M. T. (2020). Lesser-studied heritage languages: An appeal to the dyad. Heritage Language Journal, 17(2), 152–155. https://doi.org/10.46538/hlj.17.2.2.

  • Serratrice, L. (2020). Lessons from studying language development in bilingual children. In C. F. Rowland, A. L. Theakston, B. Ambridge, & K. E. Twomey (Eds.), Current perspectives on child language acquisition. How children use their environment to learn (Trends in Language Acquisition Research, Vol. 12) (pp. 263–286). John Benjamins. https://doi.org/10.1075/tilar.27.12ser.

  • Taboada, M. (2016). Sentiment analysis: An overview from linguistics. Annual Review of Linguistics, 2(1), 325–347.

  • Taboada, M., Brooke, J., Tofiloski, M., Voll, K., & Stede, M. (2011). Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2), 267–307.

  • van Osch, B., Parafita Couto, M. C., Boers, I., & Sterken, B. (2023). Adjective position in the code-switched speech of Spanish and Papiamento heritage speakers in the Netherlands: Individual differences and methodological considerations. Frontiers in Psychology, 14, 1136023.

  • Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D. A., François, R., … & Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686.

  • Wiese, H., Alexiadou, A., Allen, S., Bunk, O., Gagarina, N., Iefremenko, K., … & Zuban, Y. (2022). Heritage speakers as part of the native language continuum. Frontiers in Psychology, 12, 717973.

  • Ziegler, G. (2013). Multilingualism and the language education landscape: Challenges for teacher training in Europe. Multilingual Education, 3, 1–23.

1

Our research questions, hypotheses and methods were pre-registered and time-logged at https://osf.io/zsfvp.

2

These annotations were conducted by three student research assistants at the Leibniz-Center General Linguistics, namely Nisa Büyükyıldırım, Mariya Burbelko, and Prabhjot K. Multani. We are grateful for their help in preparing the data.
