The Role of Pitch and Tempo in Sound-Temperature Crossmodal Correspondences

We explored the putative existence of crossmodal correspondences between sound attributes and beverage temperature. An online pre-study was conducted first, in order to determine whether people would associate the auditory parameters of pitch and tempo with different imagined beverage temperatures. The same melody was manipulated to create a matrix of 25 variants with five different levels of both pitch and tempo. The participants were instructed to imagine consuming hot, room-temperature, or cold water, then to choose the melody that best matched the imagined drinking experience. The results revealed that imagining drinking cold water was associated with a significantly higher pitch than drinking both room-temperature and hot water, and with significantly faster tempo than room-temperature water. Next, the online study was replicated with participants in the lab tasting samples of hot, room-temperature, and cold water while choosing a melody that best matched the actual tasting experience. The results confirmed that, compared to room-temperature and hot water, the experience of cold water was associated with both significantly higher pitch and faster tempo. Possible mechanisms and potential applications of these results are discussed.


Introduction
In recent years, a growing body of empirical research has revealed various surprising yet robust crossmodal correspondences between auditory and gustatory stimulus attributes. For instance, people reliably associate a number of musical parameters, such as pitch, tempo, and timbre, with basic tastes (see Knöferle and Spence, 2012, for a review). However, correspondences between sound and the oral-somatosensory attributes of the eating/drinking experience (e.g., temperature, texture, viscosity) have mostly been limited to those sounds that are related to the consumption of food products. The sounds associated with the opening of product packaging can, for instance, communicate freshness, while the sounds of a liquid being poured might indicate levels of carbonation and viscosity, or perhaps the shape of the container, not to mention the liquid's temperature (see Spence and Wang, 2015, for a review). Velasco et al. (2013a) demonstrated that people (N = 33) can reliably distinguish hot (82-84°C) from cold water (6-8°C) based on nothing more than the sounds of pouring. Moreover, the perceived temperature of a liquid can be artificially raised simply by enhancing the volume around 200 Hz and decreasing it at around 5-6 kHz (and vice versa to lower the perceived temperature). Such findings suggest that people might associate hot and cold temperatures with different sound frequencies, and what applies to pouring sounds could potentially also be extended to the case of music.
In the present work, we therefore examined whether two basic auditory properties, namely tempo and pitch, are consistently associated with drinks of different temperatures. First, we conducted an online pre-study in order to assess sound-temperature associations with the imagined experiences of drinking water at different temperatures. Next, we conducted an experiment using real water samples in order to validate our findings.

Methods
Participants
Thirty participants (Note 1) (15 women, 15 men) between 21 and 52 years of age (M = 28.2, SD = 7.4) took part in the study. The participants gave their informed consent, and reported no hearing impairments. The participants were recruited from Prolific Academic. The study was approved by the Central University Research Ethics Committee of Oxford University (MSD-IDREC-C1-2014-205).

Auditory Stimuli
The same short melody (see Fig. 1) was manipulated to have five levels of both pitch and tempo. The melody was made up by the first author, in the major key, and was contained within a simple octave. In terms of pitch, the melody was shifted by an octave each time, resulting in melodies ranging from C3 (131 Hz) to C7 (2093 Hz). In terms of tempo, the melodies were presented at 60, 140, 220, 300, and 380 beats per minute. All of the soundtracks were RMS-equalised. One set of soundtracks was produced with GarageBand's Steinway Grand Piano plugin, and another with GarageBand's String Ensemble plugin (samples of the manipulated melody can be heard at https://soundcloud.com/janicewang09/sets/sound-of-cold-and-hot). The soundtracks were presented in a 5 × 5 grid, with variations in tempo along the horizontal axis and variations in pitch along the vertical axis. Two versions of the grid were produced, one with tempo increasing to the right and pitch decreasing going down, and one with tempo decreasing to the right and pitch increasing going down.
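The stimulus matrix described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to produce the soundtracks: the tempo levels and the C3 to C7 range come from the text, whereas the exact C3 frequency (130.81 Hz, equal temperament with A4 = 440 Hz) and the names `build_grid` and `reverse` are assumptions introduced here.

```python
# Values stated in the text: five octave steps from C3 to C7, five tempi in bpm.
C3_HZ = 130.81  # assumption: equal temperament with A4 = 440 Hz
PITCH_LEVELS_HZ = [C3_HZ * 2 ** k for k in range(5)]  # C3, C4, C5, C6, C7
TEMPO_LEVELS_BPM = [60, 140, 220, 300, 380]


def build_grid(reverse=False):
    """Return a 5 x 5 list of (pitch_hz, tempo_bpm) cells, rows listed top to bottom.

    reverse=False: tempo increases to the right, pitch decreases going down.
    reverse=True:  tempo decreases to the right, pitch increases going down.
    (The function name and the flag are hypothetical.)
    """
    pitches = sorted(PITCH_LEVELS_HZ, reverse=not reverse)
    tempi = sorted(TEMPO_LEVELS_BPM, reverse=reverse)
    return [[(p, t) for t in tempi] for p in pitches]


if __name__ == "__main__":
    for row in build_grid():
        print(["%.0f Hz @ %d bpm" % cell for cell in row])
```

Each octave shift doubles the frequency, which is why four shifts take the melody from roughly 131 Hz to roughly 2093 Hz.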

Procedure
The study was programmed on the Qualtrics online survey platform. Before the actual study began, the participants specified their gender and age, and self-rated their musical expertise (none, up to 2 years training, 2-10 years training, or 10+ years training). The participants had to answer an audio-based question correctly in order to ensure sound playback was functional and to allow the participant to adjust the volume to a comfortable listening level.
The participants were first given a practice trial in order to familiarise themselves with the 5 × 5 sound grid. They were instructed to imagine drinking a glass of warm milk, and to choose one sound from the grid that best matched the experience.
During the study, the participants answered two blocks of six trials each. For each trial, the participants were presented with a line of instruction asking them to imagine drinking either cold, room-temperature, or hot water. Below the instruction, on the same page, they were presented with a 5 × 5 sound grid with either piano or with string ensemble instrumentation. Each grid was associated with a sound clip and a number, and participants were asked to input the number of the sound clip that best matched the imagined drinking experience (see Fig. 2). The participants clicked 'next' to advance to the next page. The two blocks were identical except for the order of auditory stimuli along each axis. In the first block, the sounds were arranged from fast to slow going from left to right, and low pitch to high pitch from top to bottom. The directions were reversed in the second block. The order in which the questions were asked in each block was randomised.
The study lasted for approximately 3 min and the participants were paid £0.60 for their participation.

Data Analysis
Pitch and tempo information was extracted from participants' melody choices (each of the possible 25 melody choices was associated with one out of five levels of pitch and tempo). A repeated-measures multivariate analysis of variance (RM-MANOVA) was conducted with temperature (hot, room-temperature, or cold) and instrumentation (piano, string ensemble) as factors, and with pitch and tempo as measures.

Figure 2. A screenshot of how participants might see a question in the pre-study. The participants were presented with an audio grid below the instructions, and were asked to choose the sound clip that best matched the imagined drinking experience by picking the number that corresponded to the number in the audio grid.
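As an illustration of how a pitch level and a tempo level might be recovered from a single melody choice, the sketch below decodes a clip number into the two levels. The numbering scheme is an assumption: the text does not state how the 25 clips were numbered, so clips are taken here to be numbered 1 to 25 row by row, left to right and top to bottom. The function name `decode_choice` is likewise hypothetical.

```python
def decode_choice(clip_number, reverse=False, n_levels=5):
    """Map a clip number (1-25) to (pitch_level, tempo_level).

    Levels run from 1 (lowest pitch / slowest tempo) to 5 (highest / fastest).
    Assumption (not stated in the text): clips are numbered row by row,
    left to right, top to bottom.
    reverse=False: tempo increases to the right, pitch decreases going down.
    reverse=True:  tempo decreases to the right, pitch increases going down.
    """
    if not 1 <= clip_number <= n_levels ** 2:
        raise ValueError("clip_number must be between 1 and %d" % n_levels ** 2)
    row, col = divmod(clip_number - 1, n_levels)
    if reverse:
        pitch_level = row + 1          # top row holds the lowest pitch
        tempo_level = n_levels - col   # left column holds the fastest tempo
    else:
        pitch_level = n_levels - row   # top row holds the highest pitch
        tempo_level = col + 1          # left column holds the slowest tempo
    return pitch_level, tempo_level
```

Decoding both grid orientations onto the same 1 to 5 scale is what allows responses from the two blocks to be analysed together.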

Results
The mean values of pitch and tempo for each temperature are shown in [...] tempo [F(1.49, 43.41) = 4.58, p = 0.02, η² = 0.14]. The participants' choice of pitch was significantly higher for cold water than for room-temperature (p < 0.0005) or hot water (p = 0.02), and their choice of tempo was significantly higher for cold water than for room-temperature water (p = 0.002).
Overall, there was a significant positive correlation between pitch and tempo choices, r₁₇₉ = 0.47, p < 0.0005. More specifically, a positive correlation between pitch and tempo was observed when the participants were asked to imagine drinking cold (r₆₀ = 0.54, p < 0.0005) and hot water (r₆₀ = 0.30, p = 0.02), but not when imagining drinking room-temperature water (r₆₀ = 0.20, p = 0.13). In other words, high pitch tended to be associated with fast tempo, and low pitch with slow tempo, when it came to choosing the best matching melody for imagined cold and hot temperature drinks. This pattern of results can be seen in the colour-coded participant response frequency grid in Fig. 4, where the tendency for responses to cluster along the pitch-tempo diagonal is greater for the cold water and hot water conditions compared to the room-temperature condition.
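For readers who wish to see how such a pitch-tempo correlation is computed, a minimal Pearson correlation sketch follows. The five pairs of levels in the usage example are invented purely for illustration and are not data from the study; the function name `pearson_r` is also an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical pitch and tempo levels (1-5), NOT data from the study.
    toy_pitch = [1, 2, 3, 4, 5]
    toy_tempo = [2, 1, 4, 3, 5]
    print(pearson_r(toy_pitch, toy_tempo))
```

A positive value indicates that higher pitch levels tend to be chosen together with faster tempo levels, which corresponds to responses clustering along the pitch-tempo diagonal of the grid.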

Discussion
The imagined experience of drinking cold water is associated with a melody having a higher pitch (around C5 or 523 Hz) as compared to room-temperature and hot water; and associated with a faster tempo (around 300 bpm) as compared to room-temperature water. However, exact temperatures were not specified, only general descriptions of 'hot' and 'cold' that might perhaps have varied for each participant. Moreover, since the pre-study only involved imagined temperatures, it is possible that the association we uncovered only applies between sound attributes and imagined, not actually experienced, temperature. On the other hand, there is evidence that mental imagery and perception activate overlapping brain areas (see Kosslyn et al., 2001, for a review). For instance, eye movement patterns are similar when inspecting visual mental images and real images (Brandt and Stark, 1997), and the combination of visual and olfactory imagery can enhance salivatory response (Krishna et al., 2014). In terms of temperature, imagined warmness/coldness from an egocentric perspective (as in our pre-study) has been demonstrated to impact social evaluation in a similar way as real sensory experiences (Macrae et al., 2013).
In the main experiment, we set out to replicate the findings from the pre-study using actual water samples at different temperatures, in order to verify that the results of our pre-study do, in fact, apply to orally experienced temperature.

Participants
Twenty-four participants (11 women, 13 men) between 19 and 43 years of age (M = 28.2, SD = 5.7) took part in the study. The participants gave their informed consent, and reported no hearing impairments. The participants were recruited from the Oxford University Research Participant Scheme. The study was approved by the Central University Research Ethics Committee of Oxford University (MSD-IDREC-C1-2014-205).

Auditory Stimuli
The same auditory stimuli grid was used as in the pre-study.

Temperature Stimuli
Distilled-water samples were prepared at three different temperatures: cold (5°C), room-temperature (21°C), and hot (45°C). The cold water sample was kept in the refrigerator until needed for the relevant trials. The room-temperature water sample was kept at room temperature for at least 24 hours in order for the temperature to stabilise. The hot water sample was prepared immediately before it was needed for the relevant trials, by mixing boiling water and room-temperature water. The temperature of the samples was checked via an infrared thermometer (Benetech GM550 non-contact infrared digital thermometer) before each sample was served. The samples were served in 20 ml portions in 150 ml clear plastic cups.
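As a rough guide to the mixing step, the proportion of boiling water needed to reach a target temperature can be estimated from a simple heat balance. The sketch below assumes ideal mixing with no heat loss to the cup or the air, and boiling water at 100°C; neither assumption is stated in the text, and in practice the thermometer reading remains the final check. The function name `hot_fraction` is hypothetical.

```python
def hot_fraction(target_c, cool_c, hot_c):
    """Fraction of the final volume that must come from the hotter water.

    Assumes ideal mixing of two portions of water with no heat loss:
    target = f * hot + (1 - f) * cool, hence f = (target - cool) / (hot - cool).
    """
    if hot_c <= cool_c:
        raise ValueError("hot_c must be greater than cool_c")
    if not cool_c <= target_c <= hot_c:
        raise ValueError("target_c must lie between cool_c and hot_c")
    return (target_c - cool_c) / (hot_c - cool_c)


if __name__ == "__main__":
    # 45 degC target from 21 degC water and (assumed) 100 degC boiling water.
    f = hot_fraction(45.0, 21.0, 100.0)
    print("boiling-water share: %.1f%%" % (100 * f))
    print("per 20 ml portion: %.1f ml boiling water" % (20 * f))
```

Under these assumptions, slightly less than a third of each portion would be boiling water; real samples cool quickly, so a somewhat larger share may be needed.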

Procedure
Participants performed the study seated in a study booth in front of a computer screen. Each participant completed six trials, with one of two possible arrangements of pitch and tempo on the audio grid (either fastest tempo on the left and lowest pitch on top, or slowest tempo on the left and highest pitch on top). The procedure was otherwise identical to that of the pre-study. The only difference was that, for each trial, instead of being asked to imagine drinking water at different temperatures, the participants were instructed to wait for the experimenter to provide them with a sample of water to taste. The samples were either cold, room-temperature, or hot, but the temperature was not explicitly communicated to the participant. Each sample was prepared by the experimenter immediately before it was sampled by the participant, in order to ensure it was served at the intended temperature. Participants took a 60 s break between trials to avoid gustatory adaptation effects (Green and Nachtigal, 2012).
The study lasted for approximately 10 min and the participants were paid £2.00 for their participation.

Data Analysis
We performed the same data analysis as in our pre-study. In addition, the results of our pre-study were compared with those of the main experiment via an RM-MANOVA with temperature and instrumentation as within-participants factors, experiment condition (imagined or perceived temperature) as a between-participants factor, and with pitch and tempo as measures.

Results
The mean values of pitch and tempo for each temperature are shown in [...] The participants' choice of pitch was significantly higher for cold water than for room-temperature water, and higher for room-temperature water than for hot water (all comparisons p < 0.01). The participants' choice of tempo was significantly higher for cold water than for room-temperature (p = 0.025) and hot water (p = 0.015).
As in the pre-study, there was a significant positive correlation overall between participants' pitch and tempo choices, r₁₄₄ = 0.49, p < 0.0005. Specifically, a positive correlation between pitch and tempo was observed when participants tasted warm water (r₄₈ = 0.53, p < 0.0005), but not when they tasted cold water (r₄₈ = 0.26, p = 0.08) or room-temperature water (r₄₈ = −0.16, p = 0.27). This pattern of results can be seen in the colour-coded participant response frequency grid in Fig. 6, where the tendency for responses to cluster around the pitch-tempo diagonal is much greater for the warm water condition than for the cold or room-temperature water conditions. Figure 6 also demonstrates the consistency in the participants' choices, especially in the room-temperature and warm water scenarios, where just two adjacent squares account for roughly 50% of the total responses. In addition, Fig. 6 reveals that for the cold water condition, high pitch seemed to be favoured over any specific tempo range.

Discussion
The results of our main experiment confirmed the relationship between pitch/tempo and temperature shown in our pre-study. When the imagined experience of drinking water at different temperatures was replaced with real water samples, the results showed that cold water is associated with higher pitch and faster tempo compared to room-temperature and hot water. Moreover, the results of our main experiment revealed that when it came to actual water samples, the hot temperature sample was associated with a lower pitch than the room-temperature sample. We did not observe this result in the pre-study, possibly because, when given the prompt to imagine drinking hot water, participants thought of water at a higher temperature than the rather comfortable 45°C that was offered in the present study. This inconsistency in imagined hot water temperature is shown in Fig. 4C, where there was a tendency for participants to choose either slow-tempo, low-pitched soundtracks or fast-tempo, high-pitched soundtracks.

General Discussion
Why, one might ask, should people associate colder temperatures with higher pitch and faster tempo? One potential explanation points to emotional associations. There is already evidence that the crossmodal correspondences that exist between sound and smell (Levitan et al., 2015; see Deroy et al., 2013, for a review) and between sound and taste (Wang et al., 2016) are mediated by emotion. Both fast tempo (Van der Zwaag et al., 2011) and high pitch (see Wang et al., 2016, Appendix B) are associated with increased arousal. The experience of drinking cold water might therefore be associated with fast tempo and high pitch because it is deemed arousing and refreshing (see Brunstrom et al., 1997; Sandick et al., 1984). Hot water, on the other hand, may instead be associated with soothing, calming warm beverages like tea. This was especially true for the main study since the hot water was served at 45°C, a comfortable drinking temperature. It would be interesting to ask participants in an online study to associate pitch and tempo with extremely hot water (around boiling, at 100°C) as well as with water at a comfortable 45°C. One might expect the very hot (hence arousing) water to be associated with faster tempo and higher pitch compared to the comfortably warm water. Of course, to truly verify the emotional association hypothesis, a future study would need to be conducted to gather the emotions that participants associate with each beverage sample.
The emotional association hypothesis could also be explained by brain connectivity. At a neuronal level, both oral temperature and audition are represented in the orbitofrontal cortex (Guest et al., 2007; Kadohisa et al., 2004; Verhagen and Engelen, 2006). In addition to unimodal neurons representing oral temperature, the majority of temperature-sensitive oral-somatosensory neurons are multisensory, associated with combinations of temperature, taste, and viscosity (Kadohisa et al., 2004). On the one hand, fMRI studies have revealed that the same brain regions (the prefrontal cortex and pregenual cingulate cortex) are responsible for processing pleasantness of oral temperature as well as pleasantness of food flavour (Guest et al., 2007). On the other hand, the orbitofrontal cortex is also responsible for processing emotional responses to auditory stimuli (Royet et al., 2000). Furthermore, there is evidence that the processing of aesthetic stimuli, whether they be paintings, music, or food, overlaps in the right anterior insula, an area associated with the processing of negative valence stimuli/concepts such as disgust and pain (Brown et al., 2011). Put together, this provides further evidence for the emotion-mediation theory between temperature and sound attributes.
The association of cold temperature with high pitch and fast tempo may also have a statistical origin, as Velasco et al. (2013a, b) demonstrated that enhancing the pouring sound around the 5-6 kHz range lowered the perceived temperature of the liquid being poured. At the same time, we are all familiar with the sound of ice cubes tinkling in the glass, whereas a hot drink might give rise to images of low-pitched, gurgling, bubbling pots of mulled wine or hot chocolate. Finally, from an acoustics perspective, it is worth noting that a stringed instrument would sound flatter (i.e., lower in pitch) in warmer temperatures, as the string expands and loses tension (Tipler and Mosca, 2008). As these environmental observations seem to exist in nature, it would be helpful to conduct the same study with participants from different cultures in order to validate whether these mappings are indeed cross-cultural (see Knöferle et al., 2015).
It is worth pointing out here that what we have observed in the present work should not be thought of as synaesthesia per se. The results of the two studies reported here demonstrate a consistent general tendency for participants to associate certain temperatures with certain pitches and tempi without an accompanying sensory concurrent. While temperature-sound synaesthesia (where temperature induces a sound concurrent) does exist, it is extremely rare. According to one source, temperature-sound synaesthesia has been reported in 0.1% of the population with synaesthesia (Day, 2005). Accounts of those synaesthetes usually mention childhood associations of certain sounds with certain locations, such as the association of dragonflies with hot temperatures or video game music with cold (http://syndiscovery.com/a-childhood-memory-soundto-temperature-synaesthesia-2/).
The fact that temperature has distinct pitch and tempo associations is certainly of interest to those working in the food and beverage marketing industry. One could, for instance, imagine sonic branding or advertising jingles created to emphasize the cold, refreshing aspects of carbonated drinks, say, or the warming qualities of soup or tea. In addition, an interesting topic for future study would be to assess whether such sound-temperature associations might also apply to 'warming' or 'cooling' flavours such as cinnamon or menthol (Chartier, 2012; Green, 1992; Nagata et al., 2005).
In summary, oral temperature (e.g., of a drink) is a multisensory construct that does not only concern the tactile/oral-somatosensory senses, but also vision (Fenko et al., 2010; Wastiels et al., 2012), smell (Michael et al., 2010), and sound (Velasco et al., 2013a, b). The present research demonstrates that beyond the sounds that are made by the beverage in question, more abstract musical parameters of pitch and tempo may also exhibit consistent associations with specific temperatures. One interesting future test would be to assess whether 'cold' or 'hot' soundtracks might be able to alter the perceived temperature of a food/beverage or even regulate participants' body temperature (see Takakura et al., 2015, for an example of visual information affecting human thermodynamics).

Acknowledgement
CS would like to thank the AHRC grant entitled 'Rethinking the senses' (AH/L007053/1) for supporting this research.

Notes
1. Only 30 participants were recruited, given that prior research conducted with N = 33 participants revealed that people were able to tell the difference between the sound of pouring hot versus cold water (Velasco et al., 2013a).