Crossmodal Correspondences: Standing Issues and Experimental Guidelines

in Multisensory Research


Crossmodal correspondences refer to the systematic associations often found between seemingly unrelated sensory features from different sensory modalities. Such phenomena constitute a universal trait of multisensory perception, found even in non-human species, and seem to result, at least in part, from the adaptation of sensory systems to natural scene statistics. Despite recent developments in the study of crossmodal correspondences, a number of questions remain open concerning their definition, their origins, their plasticity, and their underlying computational mechanisms. In this paper, I will review such questions in the light of current research on sensory cue integration, where crossmodal correspondences can be conceptualized as natural mappings across different sensory cues that are present in the environment and learnt by the sensory systems. Finally, I will provide some practical guidelines for the design of experiments that might shed new light on crossmodal correspondences.
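The sensory cue integration framework invoked in the abstract is commonly formalized as reliability-weighted averaging of noisy cues (e.g., Ernst and Banks, 2002). The following is a minimal illustrative sketch, not code from the paper; the function name and example values are my own:

```python
def integrate_cues(m_a, var_a, m_b, var_b):
    """Reliability-weighted fusion of two noisy sensory estimates.

    Each cue is weighted in proportion to its reliability (inverse
    variance); the fused estimate has a lower variance than either
    single cue, which is the hallmark of optimal integration.
    """
    w_a = (1 / var_a) / (1 / var_a + 1 / var_b)
    w_b = 1 - w_a
    m_fused = w_a * m_a + w_b * m_b
    var_fused = 1 / (1 / var_a + 1 / var_b)
    return m_fused, var_fused

# Example: a precise visual size estimate (variance 1) combined with
# a noisier haptic estimate (variance 4).
m, v = integrate_cues(10.0, 1.0, 12.0, 4.0)
# The fused estimate (10.4) lies closer to the more reliable visual
# cue, and its variance (0.8) is below both single-cue variances.
```

On this view, a crossmodal correspondence would enter the model as a learnt prior over the joint values of the two cues, biasing which cue pairings the system treats as coming from a common source.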

References

Backus, B. T. (2011). Recruitment of new visual cues for perceptual appearance, in: Sensory Cue Integration, Trommershäuser, J., Körding, K. and Landy, M. (Eds), pp. 101–119. Oxford University Press, New York, NY, USA.

Baier, B., Kleinschmidt, A. and Müller, N. G. (2006). Cross-modal processing in early visual and auditory cortices depends on expected statistical relationship of multisensory information, J. Neurosci. 26, 12260–12265.

Bernstein, I. H. and Edelstein, B. A. (1971). Effects of some variations in auditory input upon visual choice reaction time, J. Exp. Psychol. 87, 241–247.

Bien, N., Ten Oever, S., Goebel, R. and Sack, A. T. (2012). The sound of size: crossmodal binding in pitch–size synesthesia: a combined TMS, EEG and psychophysics study, Neuroimage 59, 663–672.

Bien, N., Ten Oever, S., Goebel, R. and Sack, A. T. (2013). Corrigendum to ‘The sound of size: crossmodal binding in pitch–size synesthesia: a combined TMS, EEG and psychophysics study’, Neuroimage 72, 325.

Bremner, A. J., Caparos, S., Davidoff, J., De Fockert, J., Linnell, K. J. and Spence, C. (2013). “Bouba” and “Kiki” in Namibia? A remote culture make similar shape–sound matches, but different shape–taste matches to Westerners, Cognition 126, 165–172.

Crisinel, A. S. and Spence, C. (2012). A fruity note: crossmodal associations between odors and musical notes, Chem. Senses 37, 151–158.

Deroy, O. and Spence, C. (2013). Why we are not all synesthetes (not even weakly so), Psychonom. Bull. Rev. 20, 643–664.

Di Luca, M., Ernst, M. O. and Backus, B. (2010). Learning to use an invisible visual signal for perception, Curr. Biol. 20, 1860–1863.

Dolscheid, S., Hunnius, S., Casasanto, D. and Majid, A. (2014). Prelinguistic infants are sensitive to space–pitch associations found across cultures, Psychol. Sci. 25, 1256–1261.

Ernst, M. O. (2007). Learning to integrate arbitrary signals from vision and touch, J. Vis. 7(5), 7. DOI:10.1167/7.5.7.

Ernst, M. O. and Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion, Nature 415, 429–433.

Evans, K. K. and Treisman, A. (2010). Natural cross-modal mappings between visual and auditory features, J. Vis. 10(1), 6. DOI:10.1167/10.1.6.

Fechner, G. T. (1860). Elemente der Psychophysik (Elements of Psychophysics). Breitkopf and Härtel, Leipzig, Germany.

Flanagan, J. R. and Beltzner, M. A. (2000). Independence of perceptual and sensorimotor predictions in the size–weight illusion, Nat. Neurosci. 3, 737–741.

Flanagan, J. R., Bittner, J. P. and Johansson, R. S. (2008). Experience can change distinct size–weight priors engaged in lifting objects and judging their weights, Curr. Biol. 18, 1742–1747.

Gallace, A. and Spence, C. (2006). Multisensory synesthetic interactions in the speeded classification of visual size, Percept. Psychophys. 68, 1191–1203.

Gates, G. A. and Mills, J. H. (2005). Presbycusis, Lancet 366, 1111–1120.

Gescheider, G. A. (2013). Psychophysics: the Fundamentals. Lawrence Erlbaum, Mahwah, NJ, USA.

Green, D. and Swets, J. (1966). Signal Detection Theory and Psychophysics. Wiley, New York, NY, USA.

Guzman-Martinez, E., Ortega, L., Grabowecky, M., Mossbridge, J. and Suzuki, S. (2012). Interactive coding of visual spatial frequency and auditory amplitude-modulation rate, Curr. Biol. 22, 383–388.

Haijiang, Q., Saunders, J. A., Stone, R. W. and Backus, B. T. (2006). Demonstration of cue recruitment: change in visual appearance by means of Pavlovian conditioning, Proc. Natl Acad. Sci. USA 103, 483–488.

Helmholtz, H. von (1909). Handbuch der Physiologischen Optik (Handbook of Physiological Optics). Voss, Hamburg, Germany.

Hillis, J., Ernst, M. O., Banks, M. and Landy, M. (2002). Combining sensory information: mandatory fusion within, but not between, senses, Science 298, 1627–1630.

Holway, A. H. and Boring, E. G. (1941). Determinants of apparent visual size with distance variant, Am. J. Psychol. 54, 21–37.

Kaliuzhna, M., Prsa, M., Gale, S., Lee, S. J. and Blanke, O. (2015). Learning to integrate contradictory multisensory self-motion cue pairings, J. Vis. 15(1), 10. DOI:10.1167/15.1.10.

Keetels, M. and Vroomen, J. (2011). No effect of synesthetic congruency on temporal ventriloquism, Atten. Percept. Psychophys. 73, 209–218.

Knill, D. C. and Richards, W. (1996). Perception as Bayesian Inference. Cambridge University Press, Cambridge, UK.

Knöferle, K. M., Woods, A., Käppler, F. and Spence, C. (2015). That sounds sweet: using crossmodal correspondences to communicate gustatory attributes, Psychol. Marketing 32, 107–120.

Köhler, W. (1929). Gestalt Psychology. Liveright, New York, NY, USA.

Köhler, W. (1947). Gestalt Psychology: an Introduction to New Concepts in Modern Psychology. Liveright, New York, NY, USA.

Landy, M. S., Maloney, L. T., Johnston, E. B. and Young, M. (1995). Measurement and modeling of depth cue combination: in defense of weak fusion, Vis. Res. 35, 389–412.

Landy, M. S., Ho, Y.-X., Serwe, S., Trommershäuser, J. and Maloney, L. T. (2011). Cues and pseudocues in texture and shape perception, in: Sensory Cue Integration, Trommershäuser, J., Körding, K. and Landy, M. (Eds), pp. 263–278. Oxford University Press, New York, NY, USA.

Lewkowicz, D. J. and Minar, N. J. (2014). Infants are not sensitive to synesthetic cross-modality correspondences: a comment on Walker et al. (2010), Psychol. Sci. 25, 832–834.

Marks, L. E. (1989). On cross-modal similarity: the perceptual structure of pitch, loudness, and brightness, J. Exp. Psychol. Hum. Percept. Perform. 15, 586–602.

Marks, L. E. (2004). Cross-modal interactions in speeded classification, in: The Handbook of Multisensory Processes, Calvert, G. A., Spence, C. and Stein, B. E. (Eds), pp. 85–106. MIT Press, Cambridge, MA, USA.

Nielsen, A. and Rendall, D. (2011). The sound of round: evaluating the sound-symbolic role of consonants in the classic Takete–Maluma phenomenon, Can. J. Exp. Psychol. 65, 115–124.

Nielsen, A. K. and Rendall, D. (2013). Parsing the role of consonants versus vowels in the classic Takete–Maluma phenomenon, Can. J. Exp. Psychol. 67, 153–163.

Occelli, V., Esposito, G., Venuti, P., Arduino, G. M. and Zampini, M. (2013). The takete–maluma phenomenon in autism spectrum disorders, Perception 42, 233–241.

Orchard-Mills, E., Van der Burg, E. and Alais, D. (2013). Amplitude-modulated auditory stimuli influence selection of visual spatial frequencies, J. Vis. 13(3), 6. DOI:10.1167/13.3.6.

Parise, C. and Pavani, F. (2011). Evidence of sound symbolism in simple vocalizations, Exp. Brain Res. 214, 373–380.

Parise, C. and Spence, C. (2008). Synesthetic congruency modulates the temporal ventriloquism effect, Neurosci. Lett. 442, 257–261.

Parise, C. and Spence, C. (2009). When birds of a feather flock together: synesthetic correspondences modulate audiovisual integration in non-synesthetes, PLoS One 4, e5664. DOI:10.1371/journal.pone.0005664.

Parise, C. V. and Spence, C. (2012). Audiovisual crossmodal correspondences and sound symbolism: a study using the implicit association test, Exp. Brain Res. 220, 319–333.

Parise, C. and Spence, C. (2013). Audiovisual cross-modal correspondences in the general population, in: Oxford Handbook of Synaesthesia, Simner, J. and Hubbard, E. M. (Eds), pp. 790–815. Oxford University Press, Oxford, UK.

Parise, C. V., Knorre, K. and Ernst, M. O. (2014). Natural auditory scene statistics shapes human spatial hearing, Proc. Natl Acad. Sci. USA 111, 6104–6108.

Peters, M. A. K., Balzer, J. and Shams, L. (2015). Smaller = denser, and the brain knows it: natural statistics of object density shape weight expectations, PLoS One 10, e0119794. DOI:10.1371/journal.pone.0119794.

Plaisier, M. A. and Smeets, J. B. J. (2012). Mass is all that matters in the size–weight illusion, PLoS One 7, e42518. DOI:10.1371/journal.pone.0042518.

Pratt, C. C. (1930). The spatial character of high and low tones, J. Exp. Psychol. 13, 278–285.

Rader, C. M. and Tellegen, A. (1987). An investigation of synesthesia, J. Pers. Soc. Psychol. 52, 981–987.

Ramachandran, V. and Hubbard, E. (2001). Synaesthesia: a window into perception, thought and language, J. Conscious. Stud. 8, 3–34.

Rammsayer, T. H. and Verner, M. (2015). Larger visual stimuli are perceived to last longer from time to time: the internal clock is not affected by nontemporal visual stimulus size, J. Vis. 15(3), 5. DOI:10.1167/15.3.5.

Roffler, S. K. and Butler, R. A. (1968a). Factors that influence the localization of sound in the vertical plane, J. Acoust. Soc. Am. 43, 1255–1259.

Roffler, S. K. and Butler, R. A. (1968b). Localization of tonal stimuli in the vertical plane, J. Acoust. Soc. Am. 43, 1260–1266.

Rogers, M. E. and Butler, R. A. (1992). The linkage between stimulus frequency and covert peak areas as it relates to monaural localization, Percept. Psychophys. 52, 536–546.

Ross, H. E. (1969). When is a weight not illusory? Q. J. Exp. Psychol. 21, 346–355.

Rusconi, E., Kwan, B., Giordano, B. L., Umiltà, C. and Butterworth, B. (2006). Spatial representation of pitch height: the SMARC effect, Cognition 99, 113–129.

Sadaghiani, S., Maier, J. X. and Noppeney, U. (2009). Natural, metaphoric, and linguistic auditory direction signals have distinct influences on visual motion processing, J. Neurosci. 29, 6490–6499.

Sapir, E. (1929). A study in phonetic symbolism, J. Exp. Psychol. 12, 225–239.

Spence, C. (2011). Crossmodal correspondences: a tutorial review, Atten. Percept. Psychophys. 73, 971–995.

Spence, C. and Deroy, O. (2012). Crossmodal correspondences: innate or learned? i-Perception 3, 316–318.

Spence, C. and Parise, C. V. (2012). The cognitive neuroscience of crossmodal correspondences, i-Perception 3, 410–412.

Stevens, J. C. and Marks, L. E. (1965). Cross-modality matching of brightness and loudness, Proc. Natl Acad. Sci. USA 54, 407–411.

Stumpf, K. (1883). Tonpsychologie (Tone Psychology). Hirzel, Leipzig, Germany.

Suzuki, Y. I. and Takeshima, H. (2004). Equal-loudness-level contours for pure tones, J. Acoust. Soc. Am. 116, 918–933.

Trommershäuser, J., Körding, K. and Landy, M. (Eds) (2011). Sensory Cue Integration. Oxford University Press, New York, NY, USA.

Van Dam, L. C. J. and Ernst, M. O. (2015). Mapping shape to visuomotor mapping: learning and generalisation of sensorimotor behaviour based on contextual information, PLoS Comput. Biol. 11, e1004172. DOI:10.1371/journal.pcbi.1004172.

Van Dam, L. C. J., Parise, C. V. and Ernst, M. O. (2014). Modeling multisensory integration, in: Sensory Integration and the Unity of Consciousness, Bennett, D. and Hill, C. (Eds), pp. 209–229. MIT Press, Cambridge, MA, USA.

Walker, R. (1987). The effects of culture, environment, age, and musical training on choices of visual metaphors for sound, Percept. Psychophys. 42, 491–502.

Walker, P. and Smith, S. (1984). Stroop interference based on the synaesthetic qualities of auditory pitch, Perception 13, 75–81.

Walker, P. and Smith, S. (1985). Stroop interference based on the multimodal correlates of haptic size and auditory pitch, Perception 14, 729–736.

Walker, P., Bremner, J., Mason, U., Spring, J., Mattock, K., Slater, A. and Johnson, S. (2010). Preverbal infants’ sensitivity to synaesthetic cross-modality correspondences, Psychol. Sci. 21, 21–25.

Yates, M. J., Loetscher, T. and Nicholls, M. E. R. (2012). A generalized magnitude system for space, time, and quantity? A cautionary note, J. Vis. 12(7), 9. DOI:10.1167/12.7.9.


Figure 1. Mappings across redundant (left), relative (center), and unrelated cues (right). In the case of redundant cues, the mapping is usually defined by the identity line; in the case of relative cues, the mapping is not fully defined by the two cues alone (e.g., length and frequency), but depends on other factors (e.g., tension and density). When two cues are unrelated, there is no clear mapping across cues. This figure is published in colour in the online version.

Figure 2. Natural scene statistics and perceptual mappings. (A) External signals have their source in the environment and are filtered by the transfer function of our sensory organs before being converted into neural activity in the brain. Correlations across cues can already be present in the original external signals, or they can be introduced by the transfer functions of our sensory organs. The plots at the bottom of the figure represent the frequency–elevation mapping measured from natural scene statistics (B), the head-related transfer function (C), and the Bayesian priors representing the brain’s belief about the mapping between frequency and elevation in head-centred and world-centred reference frames (D; for details, see Parise et al., 2014). This figure is published in colour in the online version.

Figure 3. Schematic representation of a putative associative network of interconnected sensory cues.

Figure 4. Results of the survey. Error bars represent the 95% confidence intervals. This figure is published in colour in the online version.

