Response Cries or Response Statements? A Cross-Linguistic Analysis of Interjectional Expressions in Japanese and English

Goffman (1978: 800) claims that “[a] response cry doesn’t seem to be a statement in the linguistic sense (even a heavily elided one),” which suggests that such cries do not have the linguistic structures of statements (or descriptions) of either a speaker’s emotion/sensation or evaluation of a situation. This study conducted a questionnaire survey targeting Japanese and American English speakers to investigate expressions they would produce under eight circumstances of Goffman’s response cries. The results were contrary to Goffman’s claim. About half of the Japanese response cries were “descriptive interjections” like Ita(i) ‘Painful’ and Yaba(i) ‘Awful,’ and so were about 17 percent of the American response cries. On the other hand, non-descriptive interjections (swear words, vocatives) were more favoured by American English speakers, but extremely rare in Japanese. This study also addresses the questions of sociality and/or dialogicity of response cries in the two languages.

Izutsu, Kim and Izutsu | 10.1163/26660393-bja10038 | Contrastive Pragmatics (2022) 1–28


Introduction
When we are in solitude, we will probably not remain silent all the time. We sometimes produce interjections (or more generally, interjectional expressions) such as Oops! and Oh my God!1 These expressions are examples of what Erving Goffman (1978) calls "response cries." Goffman (1978: 800) describes "response cries" as "exclamatory interjections which are not full-fledged words." They are a verbal representation of the speaker's emotional display in response to an unexpected or uncontrolled situation. Some linguists assume that such interjections are "universal" (Ameka, 1992). However, the linguistic realisations of response cries may be diverse across languages, and the term "linguistic" per se is problematic because some response cries are considered to be "non-words" that "can't quite be called part of a language" (Goffman, 1978: 810). How linguistic are these cries? And what kinds of cross-linguistic differences exist? These are the first questions to be addressed in this study.

For the linguistic forms of response cries, Goffman (1978) claims that "[a] response cry doesn't seem to be a statement in the linguistic sense (even a heavily elided one), purportedly doing its work through the concatenated semantic reference of words" (p. 800, italics ours). This claim is repeated elsewhere: "[t]hese cries are conventionalized utterances which are specialized for an informative role; but in the linguistic and propositional sense, they are not statements" (p. 805, italics ours). Goffman (1978: 809) considers that the utterances used in response cries come from two sources: "taboo but full-fledged words" (e.g. Damn! Hell!) and "the broad class of non-word vocalizations" (e.g. Wow! Oops!).
He observes "a nice division of linguistic labor," which highlights the distinctiveness of these two sources from the language used in ostensive communication: "[f]ull-fledged words that are socially acceptable are allocated to communication in the openly directed sense, while taboo words and non-words are specialized for the more ritualized kind of communication" (p. 809). These accounts seem to suggest that response cries are either non-lexical vocalisations or swearing/vocative expressions (e.g. Damn! God!); they do not have linguistic structures of statements (specifically descriptions) of either a speaker's emotion/sensation or evaluation of a perceived situation.

1 "Interjectional expression" is used in this study as a cover term for words and phrases that represent the speaker's immediate reaction to a linguistic or physical context (see Ameka, 1992: n. 3 regarding a different use of the term). As argued in 4.2, this term is necessary to capture the continuous nature of forms used for this expressive purpose. Our discussion is specifically concerned with what Ameka (and Wilkins) refer to as "expressive interjections" (Ameka, 1992, 1994; Ameka and Wilkins, 2006).
The English examples provided by Goffman (1978) might support this observation. However, from examples of a typologically different language like Japanese, there emerges a different picture: apart from non-lexical vocalisations and swearing/vocative expressions, response cries can have the forms of statements or "even heavily elided ones." Another issue to consider is the functions of response cries. Goffman (1978: 814) argues that response cries are "creatures of social situations." In other words, they are used for remedial work in social situations, serving "as means of striking a self-defensible posture in the face of extraordinary events" (p. 806). For example, when a man trips on a pavement, he can utter a response cry to correct "the threat to his reputation" (p. 793) or for "the advertisement of self-respect" (p. 798). According to Goffman (1978), response cries presuppose the presence of others (a "gathering" or "witnesses," to use his terminology). Such "blurtings make a claim of sorts upon the attention of everyone in the social situation, a claim that our inner concerns should be theirs, too" (p. 814). Sometimes, they are "styled to be overheard in a gathering" (p. 799). When there is no one around, response cries may not be issued. If women and children are nearby, a male speaker is likely to "censor his cries accordingly" (p. 799), avoiding an f-word that he may use when he stumbles alone. In other words, "'[r]ecipient design' is involved" (p. 799) in the production of response cries.
The dialogic properties reported in many psychological studies are primarily observed in the use of language for self-regulation, which occurs when children (and adults) are engaged in cognitively challenging activities (e.g. Diaz, 1992: 62; John-Steiner, 1992; Duncan and Cheyne, 2002). Apart from such a self-regulatory function, however, some researchers (e.g. Fuson, 1979; Fossa, 2017) argue that private speech also fulfils several other functions including "an emotional/expressive function" (Fuson, 1979: 155) realised by response cries. In view of this, questions arise: Are we conscious of the presence of others (witnesses or addressees) when uttering response cries? Are there any differences in the conception of others among speakers of different languages (e.g. Japanese and English)?
In the present study, we explored forms and functions of response cries through a questionnaire survey designed to compare two typologically contrastive languages: Japanese and (American) English. The study investigated what kinds of expressions the two groups of speakers consider they would produce under eight circumstances of Goffman's response cries and whether the speakers would presuppose the presence of others (addressees or witnesses) when producing response cries. As a corollary of the first point, we also propose a continuous view of response cries and interjectional expressions.

Types of Response Cries
Goffman (1978) identifies nine types of response cries, as summarised in Table 1. This study examined expressions used for eight of them, excluding Sexual Moan.

Table 1 Types of response cries (Goffman, 1978: 801-805)
Type: Floor Cues. Example: Oh, no! when making a mistake (which leads to another person's query)
Type: Audible Glee. Example: Oooooo! when an adolescent girl was served a large crepe

These types are self-explanatory except Floor Cues (e.g. Oh, no!). Floor Cues include utterances such as "Oh, no!," which is expressed, for example, when a person deletes an important file on a computer by mistake. This kind of utterance "leads to, and apparently is designed to lead to, a colleague's query as to what was wrong" (Goffman, 1978: 804), hence providing a cue to the colleague's floor.

2.2 A Questionnaire Survey
A questionnaire survey was administered to Japanese and American English speakers, using Google Forms.2 The questionnaire consists of two sections:

A. Suppose that in the situations described below you are by yourself with no one around to hear you. In each case, what would you utter or say aloud when the following things happen? If you believe you would probably NEVER utter a thing, just write "nothing."
B. (i) When you make the utterances you gave as answers above, do you feel as if you were speaking to someone? (ii) If yes, who are you speaking to? Write down all the possibilities you can think of.

Section A asked the participants what kind of utterance they would produce in each of the eight circumstances of Goffman's response cries. The basic setting common to all the circumstances was: "you are by yourself with no one around to hear you," which excludes the possibility of the presence of others (i.e. a gathering or witnesses). Section B was concerned with the issue of dialogicity, asking whether the participants felt as if they were speaking to someone when producing response cries.
The following eight questions were used in Section A of the English questionnaire. When a situation was considered difficult to imagine, we provided a supplementary photo alongside the question.

The questionnaire survey was administered to 54 native speakers of Japanese (24 females, 30 males) and 49 speakers of American English (29 females, 20 males). The participants were all university students aged 18-35 years.

A Taxonomy of Response Cries
Since response cries are "exclamatory interjections" (Goffman, 1978: 800), our classification of response cries refers to Ameka's (1992) distinction between primary and secondary interjections: primary interjections are "little words or non-words" (p. 105) like Ouch! Wow! Oops!, while secondary interjections are "words which have an independent semantic value" (p. 111) like Damn! Hell! Help! Ameka (1992: 103-104) regards "interjection" as a part of speech, hence only reserving the term for the word class. However, this poses a classification problem, because in this view, words such as damn and God are classified as interjections but semantically similar phrases such as damn it and my God are not.3 Our study extended the distinction between primary and secondary interjections to all other interjectional expressions used for response cries. This notional extension leads to viewing interjectional expressions as forming a functional category rather than a word class. Since the common understanding of interjectional expressions is that they can "stand alone as a complete 'utterance'" (Jespersen, 1924: 90; see also Ameka, 1992, 1994; Ameka and Wilkins, 2006), they can be regarded as a type of pragmatic marker (Norrick, 2009) that can be independently used as an utterance. The forms of such utterances vary from non-words, words, phrases, and clauses to other composite expressions. Table 2 presents the taxonomy of the interjectional expressions used in this study, illustrated by English examples.
We distinguished secondary interjections into two subtypes: descriptive and non-descriptive types. The non-descriptive type includes swear words (e.g. Damn, Shoot) and vocative expressions (e.g. God, Man), while the descriptive type (e.g. Nice, Disgusting) is used to describe the emotion or sensation a speaker has just felt or a speaker's evaluation of a perceived situation. This distinction is crosscut by another distinction: simple-word and multiple-word expressions exemplified by Damn and God damn it, respectively.4 In fact, the recognition of words is problematic in Japanese, involving an argument whether functional morphemes such as case markers and tense and aspect markers should be counted as one word or not. Our study followed the definition of Suzuki (1972, 1996) and Okuda (1974), who consider "word" to represent a unified entity of lexicon and grammar. For example, these scholars regard Japanese past-tensed verbs (e.g. arui-ta 'walked' or it-ta 'went') as one word like their English counterparts; these words are viewed as a single linguistic unit composed of a lexical dimension (an act of 'walking' or 'going') and a grammatical dimension including the temporal meaning indicated by the past tense marker -ta.

4 We regarded Dammit as a multiple-word expression, because it is still analysable as consisting of two free morphemes (damn and it). When a multiple-word interjection consisted of two different types of interjection, the classification was based on a type of interjection that plays a more pivotal role in the meaning of the combination. For example, consider Wow this is heavy, which consists of the primary interjection (wow) and the (multiple-word) descriptive interjection (this is heavy). The utterance was classified as a (multiple-word) descriptive interjection rather than a primary interjection.

a For primary interjections, no distinction was made between simple-word and multiple-word expressions, because it was difficult to identify word boundaries for some expressions (e.g. Japanese Uohho, English ooOoOoOooOOoo, nngngnnngghhhh).

Table 3 summarises the response cries provided by Japanese speakers, which are classified according to the taxonomy of response cries given in Table 2. "Mouth sound" represents answers that describe a sound produced using the mouth, such as a gasp, a sigh, and clicking one's tongue.

Forms of Response Cries
The most frequent response cries were simple-word descriptive interjections (40.3%). When simple and multiple types were combined, descriptive interjections amounted to 50.7 percent; that is, about half of the Japanese response cries were descriptive interjections. On the other hand, non-descriptive interjections (swear words and vocatives) made up only one percent, which interestingly included an example of Japanised English: Oomaigaa 'Oh my God.' This clearly shows that non-descriptive interjections are extremely rare in Japanese.5 Two other types (primary interjections and "nothing to utter") also represented significant proportions: utterances comprised solely of primary interjections (e.g. Att, Aa, Ott, Oo, Uwa) accounted for 32.2 percent, and "nothing to utter" accounted for 15.3 percent. The prevalence of Japanese descriptive interjections is illustrated in (1), which shows that such interjections were provided by the participants for all the eight circumstances of Goffman's response cries in both simple and multiple types.

Table 3 Response cries provided by Japanese speakers
(1) Examples of Japanese descriptive interjections6
(i) [The Spill Cry] Simple-word expression: Simple-word expression:

About half of the Japanese response cries contained "descriptive" predicates: some comprise predicates alone and others even include their grammatical subjects or objects. In other words, Japanese response cries can be "a statement in the linguistic sense" (Goffman, 1978: 800). When including "a heavily elided one" (p. 800) (i.e. a one-word predicate), more than half of the response cries were statements or, more precisely, descriptions of a speaker's emotion/sensation or his/her evaluation of a perceived situation. The simple-multiple distinction shows that simple-word descriptive interjections (40.3%) were more frequently used by Japanese speakers than multiple-word ones (10.4%).

Now let us look at how American English speakers reacted in the eight circumstances of response cries. Table 4 shows that primary interjections (e.g. ew(w), oh, ouch, ugh, wow) were the most frequent across all the situation types (37.5%). The second most frequent were multiple-word non-descriptive interjections (e.g. ahh sh*t, God damn it, Oh man, oh my God). When simple and multiple types were combined, non-descriptive interjections amounted to 30.3 percent. "Nothing to utter" accounted for 13.5 percent, which is similar in proportion to its Japanese counterpart (15.3%).
Interestingly, contrary to the expectation that English response cries would not be statements, American English speakers also provided answers classified as descriptive interjections, which accounted for 17.1 percent, including simple and multiple types. Examples of English descriptive interjections are shown in (2).
Although the proportion was smaller than their Japanese counterparts, these examples evidence that American English speakers can also use the description of their own emotions or their evaluation of a perceived situation as response cries. For such descriptions, they preferred to employ multiple-word expressions rather than uttering only one word; simple-word descriptive interjections constituted only 2.3 percent of all their response cries.

Comparison of Response Cries by Japanese and American English Speakers
A comparison of Figures 1 and 2 clarifies the similarities and differences between the two groups of speakers. The simple-word and multiple-word distinctions are not reflected in the figures.
The two figures reveal that Japanese and American English speakers produced primary interjections with similar frequencies. The difference was not statistically significant (χ² = 2.34, p = .126). However, the uses of descriptive and non-descriptive interjections made a striking difference. Japanese showed a marked preference for descriptive interjections (e.g. Itai 'Painful,' Yabai 'Awful'), while their use of non-descriptive interjections was very limited (less than one percent). On the other hand, American English speakers favoured the use of non-descriptive interjections such as swear words and vocative expressions as expected, but as shown in (2) above, they also used descriptive interjections like AHH, hot and Wow this is heavy, though to a lesser degree than did Japanese speakers.

Hasegawa (2010: 209) remarks that solitude speech has often been studied by Japanese linguists but rarely by researchers of the English language, which suggests that Japanese speakers may produce response cries more often than English speakers. In other words, American English speakers were expected to be more likely to choose "nothing" than Japanese speakers. However, the comparison reveals that there is little difference in the proportion of "nothing." "Nothing" accounted for 15.3 percent of the Japanese answers and 13.5 percent of the American ones; the difference was not statistically significant (χ² = 0.38, p = .537). This indicates that most Japanese and American English speakers expect themselves to utter something in response to unexpected situations.

The main difference between the two groups of speakers is that Japanese speakers heavily relied on descriptive interjections, while American English speakers preferred non-descriptive ones. The high proportion of Japanese descriptive interjections (50.7%) was mainly attributed to simple-word interjections, which accounted for about four-fifths of all the descriptive interjections (see Table 3).
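The "about four-fifths" proportion can be checked against the percentages already reported. The following is a minimal sketch in Python, assuming only the rounded figures given in the text (40.3, 10.4 and 50.7 percent for Japanese; 2.3 and 17.1 percent for English); the helper name `share_of_simple` is hypothetical and is not part of the study's procedure. Because the inputs are rounded percentages rather than raw counts, the results may differ from count-based table values by about a tenth of a percentage point.

```python
# Recompute the simple-word share of descriptive interjections from the
# rounded percentages reported in the text (not from raw counts).


def share_of_simple(simple_pct: float, combined_pct: float) -> float:
    """Return simple-word items as a percentage of all descriptive items."""
    if combined_pct <= 0:
        raise ValueError("combined percentage must be positive")
    return 100.0 * simple_pct / combined_pct


# Percentages of all response cries, as reported in the text.
japanese = {"simple": 40.3, "multiple": 10.4, "combined": 50.7}
english = {"simple": 2.3, "combined": 17.1}

# Sanity check: simple and multiple types should add up to the combined figure.
assert abs(japanese["simple"] + japanese["multiple"] - japanese["combined"]) < 0.05

japanese_share = share_of_simple(japanese["simple"], japanese["combined"])
english_share = share_of_simple(english["simple"], english["combined"])

print(f"Japanese: {japanese_share:.1f}% of descriptive interjections are simple-word")
print(f"English:  {english_share:.1f}% of descriptive interjections are simple-word")
```

The Japanese share comes out at roughly 79.5 percent, that is, about four-fifths, while the English share is roughly 13 to 14 percent, which illustrates how strongly the two groups differ in their use of one-word descriptions.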
The most striking contrast is in (ii) Pain Cry, where the majority of the Japanese participants employed descriptive interjections (83.3%) like Itai 'painful' and their variant forms while Americans preferred to produce primary interjections and non-descriptive interjections in the same proportions (46.9% for each type). Descriptive interjections were also favoured among Japanese participants in (viii) Transition Display (63.0%) and (iii) Strain Grunt (59.3%), which constituted more than half of the response cries provided for each situation. It may be that instant sensory stimuli (pain, brightness, and heaviness) are related to the production of descriptive interjections by Japanese speakers, although further in-depth investigation is needed to fully understand this relationship. American speakers' heavy reliance on non-descriptive interjections was prominent in (i) Spill Cry (81.6%) and (iv) Floor Cues (61.2%), where Japanese tended to use primary interjections or descriptive interjections. The two types of response cries are both concerned with the speaker's inadvertent error. It might be reasonable to consider that the speaker wants to curse the event occasioned by his/her careless behaviour (e.g. Dammit, Sh*t) and/or to utter an exclamation calling the name of God (e.g.  (iv) Floor Cues (but not in others) again suggests a meaningful relationship between the use of non-descriptive interjections (swear words and vocative expressions) and the feeling of the speaker's disgust or irritation over his/her own accidental mistake.

Addressee Conception
The second part of the questionnaire asked the participants: "When you make the utterances you gave as answers above, do you feel as if you were speaking to someone?" This question was intended to investigate the issue of dialogicity, on which there appear to be divergent views between Japanese and Western scholars. As mentioned in Section 1, the presence of addressees or dialogic conception has long been presupposed in research on private or solitude speech. However, Hasegawa (2010) casts doubt on the dialogic nature of solitude speech or "soliloquy" in her terminology. She argues that "equating dyadic conversations with soliloquy, let alone with thought, emerges as an unwarranted oversimplification" (Hasegawa, 2010: 186). Admitting that some type of soliloquy "can be identified metaphorically as communication with self" (p. 192), Hasegawa contends that "such communication is by nature drastically different from ordinary dialogue" (p. 193).
Since response cries constitute a part of solitude speech, it is worthwhile looking at how the two groups of speakers conceptualise their own act of producing response cries. As explained in 2.2, the questionnaire presents a basic setting that excludes the presence of others ("you are by yourself with no one around to hear you"). We regarded the answers that only mentioned potential nearby people (e.g. anyone nearby, a random passerby, etc.) as "inappropriate answers." Figures 4 and 5 show the results of Section B(i) in the questionnaire survey. Figure 4 shows that almost all of the Japanese speakers did not think that they were speaking to someone when uttering response cries. One Japanese speaker answered that she felt as if speaking to a guardian spirit (shugorei), but we are not sure whether this answer counts as valid. For American English speakers (Figure 5), there were slightly more people who felt as if speaking to someone, but contrary to the prevailing view of dialogicity, the majority of the American English speakers (78%) did not presuppose the presence of addressees when giving response cries.
Nine American participants provided answers that could be categorised as "yes." For the question of "who you are speaking to" (Section B(ii)), all participants answered "myself" (see Table 5). Although the proportion was smaller than expected, such a view of conversation with oneself is worth investigating further, especially in terms of whether it is related to the fact that an act of producing solitude speech is phrased with a reflexive expression (English talk to oneself) that contains a speaking self or in terms of how familiar a speaker is with "dialogic engagement" with a ritual text (e.g. the Bible) (Du Bois, 2009, 2011; see Kádár, 2017 for cross-cultural differences in interpersonal ritual practices).9 Another point to note is that since American response cries include examples of religious swearing known as profanity or blasphemy, one could expect that their response cries were addressed to God or other religious entities. However, no answers in the present results show that their response cries were so addressed.

9 The Japanese counterpart of talking to oneself is hitorigoto-o iu 'make a lone (or solitude) speech,' which does not contain a reflexive pronoun.

Descriptive Interjections and Abbreviation
Our study revealed that Japanese and American English speakers both employed forms other than non-lexical vocalisations and swearing/vocative expressions for response cries. About half of the Japanese response cries (50.7%) were descriptive interjections, and so were 17.1 percent of the English response cries. These results were contrary to Goffman's claim that response cries are not statements (even heavily elided ones). Japanese speakers preferred to use statements or descriptions of either a speaker's emotion/sensation or his/her evaluation of a perceived situation, for example, Itai 'Painful,' Mabusii 'Bright,' Oisisoo '(That) looks delicious,' or even Wa, kabi haeteru 'Oh, mold has grown.' Although the proportion was lower, some English response cries had similar descriptive forms like AHH, hot and Wow that's bright. The main difference between the two groups is that Japanese speakers highly frequently expected themselves to produce descriptive interjections (e.g. Itai 'Painful,' Yabai 'Awful') with only a small proportion of non-descriptive interjections, whereas American English speakers were more likely to use non-descriptive interjections such as swearing or vocative expressions. Descriptive interjections were also used by American English speakers but not as frequently as their Japanese counterparts.
The widespread use of descriptive interjections in Japanese is partly related to the fact that the language frequently allows the omission of subjects (and other contextually inferable elements) and predicate-only forms are easily available to Japanese speakers. Table 6 shows the breakdown of descriptive interjections used by the two groups of speakers. Most Japanese descriptive interjections (79.5%) were simple-word expressions, namely predicates only (e.g. Itai 'Painful' and Mabushii 'Bright') in contrast to only 13.4 percent of their English counterparts. In the exigency of response-cry situations, such shortened forms may be best suited to the description of the speaker's sudden impulsive feeling.

Table 6 Descriptive interjections used by Japanese and American English speakers

This observation reminds us that many kinds of variant forms were available for Japanese descriptive interjections. As shown in (1), Japanese participants preferred to use clipped (or shortened) forms of adjectives (Iwasaki, 2014). Itai is the basic form of an adjective meaning 'painful,' but Japanese people also like using its clipped (or shortened) forms such as ita, itatt, itta, ittu, and even itt or i.10 Such shortened forms are sometimes considered to be mainly used by the younger generation including university students (Agency of Cultural Affairs, 2011: 32-37, 137-141; Harada, 2013), but the older generation also uses some of these forms; for example, ita, itta, and itatt are often uttered by older adults as an instant reaction to an incident that causes pain.11 Konno (2012: 21) argues that forms such as itatt 'painful' or yabatt 'awful' are exclusively used in private speech, produced as an immediate expression of the speaker's instinctive sensation or perception of an incident that s/he encounters at the very instant of speaking (see also Togashi, 2006: 167).
Hence, they belong to jitai sokuoogata 'immediate response type' (Nitta, 1997) or sokujibun 'sentence of immediacy' (Iwasaki and Ono, 2007; see also Konno 2012: 16). It is possible to consider that this immediacy or instantaneity of sensation/perception has an iconic relationship (Haiman, 1983) with such shortened forms of adjectives, as Iwasaki (2014: 63) succinctly summarises: "the simpler the form, the simpler the neurological process" referred to by the form.

10 Some variants of these forms are referred to as "-i drop construction" (Konno, 2012), "clipped adjective" (Iwasaki, 2014) or "adjective stem-type sentence" (Shimizu, 2015). These variants are formed "by deleting the final -i from an -i ending adjective, such as in ita-i > ita (painful)" (Iwasaki, 2014: 65), by replacing the final -i with sokuon 'moraic obstruent': ita-i > itatt (represented in this paper by geminating the consonant t), or by the coalescence of the two final vowels: ita-i > ite. The deletion of the final -i can also be accompanied by the insertion of a moraic obstruent within the stem: ita-i > itta or ittu, which leads to the formation of a further shortened form: itt or i.

11 The preference of these shortened forms depends on the kinds of adjectives; for example, samutt '(it)'s cold' is more frequently used and felt less unnatural than urusatt '(it)'s noisy' by the older generation (Agency of Cultural Affairs, 2011: 33, 37, 137, 141; Harada, 2013: 339).

In this connection, one may associate the frequent use of simple-predicate forms with Vygotsky's (1986 [1934]) account of inner speech as developing through the processes of "abbreviation" and "predication." However, we consider it too hasty to make such a simplistic association. Vygotsky (1986 [1934]) writes on the development of inner speech as follows:

We applied the genetic method and found that as egocentric speech develops, it shows a tendency toward an altogether specific form of abbreviation, namely: omitting the subject of a sentence and all words connected with it, while preserving predicates. This tendency toward predication appears in all experiments with such regularity that we must assume it to be the basic form of syntax of inner speech. (Vygotsky, 1986 [1934]: 236, italics ours)

Vygotsky assumes that predication and abbreviation represent linguistic features that appear in a transitional stage from social speech to inner speech, namely, "speech almost without words" (1986 [1934]: 244). However, the notions of predication and abbreviation require cautious interpretation when looking at languages with "a high degree of ellipsis" (Shibatani, 1990: 363). In Japanese, grammatical subjects are "not used in many cases" (Kindaichi, 2010 [1957]: 172); the language "utilizes a way of 'expression in which the subject is not present'" (p. 214). In other words, the omission or abbreviation of grammatical subjects (and other contextually recoverable elements) is quite common not only in private (egocentric) speech but also in social speech. For example, a Japanese speaker could produce the utterance without the grammatical subject and object Kobosityatta(a) '(I)'ve spilt (coffee)!' when s/he wants to ask someone to help wipe up spilt coffee, or a Japanese kid might use the simple-predicate form Itai(i)! 'Painful!' when trying to stop his/her brother/sister's infliction of injury upon him/herself in a sibling rivalry.12 It is important to emphasise that in languages that heavily rely on unexpressed elements in spoken discourse, careful consideration is necessary for discussing the role played by the features of predication and abbreviation in the transformation of "speech for others" into "speech for oneself" (Vygotsky, 1986 [1934]: 225). The condensation or reduction in the development of inner speech may differ in kind and/or degree across languages depending on how much ellipsis is allowed in a given language.
Alternatively, the differentiation between social and private speech may be less clear in some languages than those discussed by Vygotsky and his colleagues. These are open questions for linguists to ponder in future research.

Interjectional Expressions as a Continuous Category
The high frequency of Japanese descriptive interjections and some, though less frequent, uses of their English counterparts lead us to reconsider the view 12 The social and private uses of these utterances can be prosodically differentiated. For example, the utterances Kobosityatta(a) and Itai(i) are more likely to be interpreted as social speech when uttered with an emphatic high-pitch accent on the second mora and the significant lengthening of the final vowel.
of response cries as "being in prototype merely a matter of nonsymbolic emotional expression" (Goffman, 1978: 806, our italics). As demonstrated by the present results (and also supported by the everyday observation of native speakers of Japanese), it is natural for Japanese speakers who experience sudden physical pain to utter response cries like Itai 'Painful' (or its variant forms), which are descriptive and symbolic forms representing their painful sensation. These are more common than primary interjections like Aa 'Aw' or Att 'Ouch' in Japanese (see Table 3). This may challenge the conventional view of interjections as something being outside of grammar -interjections "are not very well integrated into the clause grammars of languages" (Ameka and Wilkins, 2006: 6) or "have no structural relationship to ambient sentential syntax" (O'Connell and Kowal, 2008: 134). It may also undermine the view regarding morphology that "interjections do not normally take inflections or derivations in those languages that make such forms" (Ameka, 1992(Ameka, : 106, 1994(Ameka, : 1713 These examples, provided by Japanese participants of our survey, are all prefaced by primary interjections (e.g. a or att), but utterances without primary interjections (e.g. Kobosi-tyat-ta '(I)'ve spilt (coffee)') are also perfectly natural as interjectional expressions. The response cries simply represent a coffee spilling accident, which can be designated in morphosyntactically different forms. The event can be expressed by the transitive verb kobosu 'spill' as in (3a) and (3b) or by the intransitive verb koboreru 'spill.' It can further be represented in the simple past as in (3a) and (3c) or in the past mirative as in 13 The following abbreviations are used in the present study: intr (intransitive), mir (mirative), past (past tense), tr (transitive).
(3b).14 Ameka (1992: 106, 1994: 1713) observes that there are some interjections derived from verbs, but argues that they "have become frozen and form a completely new word," not obeying the morphosyntactic rules of the language in question. As he observes, the development of shortened (or clipped) forms of some Japanese adjectives (e.g. ita(tt) rather than itai 'painful') as seen in 4.1 may lend itself to such fossilisation or frozen forms. However, utterances like kobosi-ta, kobosi-tyat-ta, kobore-ta unambiguously follow morphosyntactic rules of Japanese grammar; they are neither frozen nor fossilised forms specifically developed for the impulsive expression of an emotion or a feeling. Furthermore, there are cases in which a specific syntactic construction is exploited to express a speaker's emotion or sensation. (4a) is an example of (vii) Audible Glee, where a speaker utters a response cry when discovering a delicious-looking cake in a refrigerator. Note that the sentence is arranged in an "emotively motivated non-canonical word order" (Ono and Suzuki, 1992: 439), where the grammatical subject kore 'this' is placed after the nominal predicate nani 'what.' The canonical word order is given in (4b), which sounds like a simple wh-question asked to identify the referent of kore 'this.'
(4) a.
This kind of non-canonical word order was found in some other examples of Japanese response cries: Att satoozyan kore 'Oh (it)'s sugar, this' (Floor Cues), Yattawa kore '(I)'ve done this' (Revulsion Sounds).
One might claim that examples like (3) and (4) should not be categorised as interjections but as interjectional phrases or exclamatory utterances (Ameka, 1992, 1994), and that "the term interjection should be reserved for the word class" (Ameka, 1992: 103). However, as noted in 2.3, the notion of 'word' itself is problematic and the identification of words varies across languages (Crystal, 1991: 379-381). Our investigation of response cries and interjectional expressions of the two genetically and typologically unrelated languages suggests that such expressions would not support a clear-cut distinction between lexicon and grammar and between a non-propositional "mere expression" and a "proposition-like statement" (Goffman, 1978: 807).15 These emotional utterances should be viewed as forming a continuum with one extreme composed of non-word vocalisation and the other of descriptive interjections expressed in the form of a full-fledged clause.
14 -tyau (the adverbial form -tyat in (3b)) is a reduced form of -tesimau, which consists of the connective particle -te 'and' and the perfective or terminative auxiliary -simau. The form develops a new meaning that denotes "abruptness or unexpectedness" (Izutsu and Izutsu, 2008: 132) (cf. "inadvertence" or "non-volitionality" in Ono, 1992), hence glossed as mirative (mir) in (3b).

Conclusion
Based on the questionnaire survey targeting Japanese and American English speakers, this study demonstrated cross-linguistic diversity in their perceptions of response cries, namely what they considered they would produce as response cries in given situations. Contrary to Goffman's claim that "[a] response cry doesn't seem to be a statement in the linguistic sense (even a heavily elided one)" (1978: 800), our investigation of the eight types of response cries revealed that Japanese speakers heavily relied on descriptive interjections (e.g. Ita(i), Itta 'Painful,' Yaba(i), Yabe 'Awful') that contain predicates representing the speaker's emotion or sensation or his/her evaluation of a perceived situation. On the other hand, American English speakers preferred non-descriptive interjections (vocatives and swear words), which were rarely used by Japanese speakers.
Regarding the possibility of dialogic conception, almost all the Japanese speakers had no addressees in mind when uttering response cries, whereas about one-fifth of the American English speakers said that they were speaking to themselves. This result points to the possibility of cross-linguistic (or cross-cultural) variability of addressee conception in making utterances without actual addressees. Such variability might be ascribed to lexical or syntactic features of languages or to culture-specific religious beliefs or ritual practices. There is also another possibility that addressee conception may vary with the types of solitude speech. Our investigation of response cries showed an overall high proportion of non-dialogic conception for both groups of speakers.
15 In his deictic account of interjections, Wilkins (1992: 119) argues that interjections "convey complete propositions and have an illocutionary purpose" with all their referential arguments provided by context. However, he still seems to support the view, with some hedges provided within brackets, that an interjection "(typically) does not enter into construction with other word classes, is (usually) monomorphemic, and (generally) does not host inflectional or derivational morphemes" (1992: 124).
However, dialogicity has been a longstanding issue in educational and developmental psychology, where the primary concern is the use of private speech for self-regulation, that is, controlling one's own behaviour or thought while engaged in cognitively difficult tasks. This suggests that dialogic conceptions might not be uniform; they might be diverse even for a single speaker according to the type of solitude speech in which s/he is involved. Investigations like the present research or those based on more advanced methodologies need to be repeated to explore such possibilities. Related to this is the issue of the social situatedness of response cries. Goffman's (1978) account rests upon the presupposition of others (a gathering or witnesses) in producing response cries. According to Goffman, whether or not response cries are issued, and if they are, what kinds of cries are produced, depends on the presence of others and/or the kinds of persons who overhear them. The basic setting of our questionnaire survey excluded the possibility of the presence of others, saying "you are by yourself with no one around to hear you." Goffman's explanation would lead us to expect more answers of "nothing to utter," that is, the answer that one would probably NEVER utter a thing in a given situation. However, such answers accounted for only about 15 percent in both groups. Of course, there might be some participants who did not heed this basic instruction, but it is unlikely that they formed the majority. Thus, the results of our study imply that speakers are not necessarily conscious of the presence of others when uttering response cries, especially when they encounter a sudden, unexpected situation.
In fact, (ii) Pain Cry indicated the lowest proportion of "nothing to utter" of the eight response cries (5.6% for Japanese speakers; 0% for American English speakers), which suggests that the more urgent an incident is, the more likely it is for a speaker to utter something despite the absence of others.
Finally, we conclude this paper by proposing a continuous view of response cries. Much effort has been made to demarcate a "sharp, underlying difference between conventionally directed statements and imprecatory interjections" (Goffman, 1978: 808). There is "a nice division of linguistic labor" (1978: 809): speech for oneself is different in form and function from speech for communication, and even "more so" are response cries (Goffman, 1978: 808). However, the present discussion of Japanese (and, in part, English) response cries suggests that it would be difficult to draw a distinct boundary between nonpropositional utterances for non-addressed cries and propositional utterances for communication with others. Response cries (and interjectional expressions as well) would be better seen as forming a continuum with non-word vocalisation at one end and descriptive interjections expressed as full-fledged clauses at the other. Goffman's response cries would be more likely to fall closer to the former end, but where or how other response cries are located on the continuum would vary by language, being at least to some extent determined by language-specific morphosyntactic features.