Every day we perceive events that are multisensory, well aligned in time, and unified. This percept is taken for granted, and we rarely consider that the synchrony of those events is not the rule but rather the exception. It is now known that the perceived order or subjective simultaneity of two sensory stimuli might not correspond to their actual physical order or objective synchrony, respectively. This was, however, unknown to astronomers back in the early 19th century who measured stellar transits (i.e., estimation of the position of a star across the reticules of the telescope between successive beats of a clock or metronome) via the ‘ear and eye’ method. This method was considered quite accurate; it was later observed, however, that astronomers’ judgments deviated from each other by intervals that could, in some cases, reach 800 ms (e.g., Mollon & Perkins, 1996). These deviations eventually led Gustav Fechner and Wilhelm Wundt to establish the fields of psychophysics and experimental psychology, respectively (Aghdaee, Battelli, & Assad, 2014).
The systematic discrepancy between objective and subjective stimulus timing (i.e., order and simultaneity) was partially accounted for by sensory arrival latencies, that is, differences in the time needed for the signals to be detected by the sense organs and transmitted to the appropriate processing centers (Sternberg & Knoll, 1973). Light, for example, travels through the air much faster than sound (i.e., approximately 300,000,000 m/s for light and 343 m/s for sound; Spence & Squire, 2003). Once detected, though, a visual stimulus needs more time to be transduced at the retina than an auditory stimulus needs to be processed by the hair cells of the inner ear (King & Palmer, 1985). Thus, sensory arrival latencies are identified at two different levels, the physical and the neuronal level.
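As a rough illustration of the physical-level latency, the following sketch computes how much later sound arrives than light for a source at a given distance, using the approximate propagation speeds quoted above. The function name and the example distances are illustrative assumptions, and neuronal transduction times are deliberately left out.

```python
SPEED_OF_LIGHT_M_S = 300_000_000.0  # approximate value quoted in the text
SPEED_OF_SOUND_M_S = 343.0          # approximate value quoted in the text


def physical_audio_lag_ms(distance_m):
    """Return how much later (in ms) sound arrives than light.

    Only travel time through air is considered; neuronal transduction
    latencies are not modelled here.
    """
    sound_ms = distance_m / SPEED_OF_SOUND_M_S * 1000.0
    light_ms = distance_m / SPEED_OF_LIGHT_M_S * 1000.0
    return sound_ms - light_ms


for distance in (1, 10, 100):  # hypothetical distances in metres
    print(distance, "m:", round(physical_audio_lag_ms(distance), 1), "ms")
```

Under these assumptions, the auditory signal from a source 10 m away trails the visual signal by roughly 29 ms, and the lag grows linearly with distance.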
At the physical level, one of the parameters affecting stimulus arrival times is the distance of the stimulus’ origin from the observer. As the distance of a multisensory (e.g., audiovisual) event increases, the auditory input lags increasingly behind the visual input. Take, for example, thunder in
Stimuli presented within the horizon of simultaneity and close in time are perceived as synchronous even though they are physically asynchronous (Stein & Meredith, 1993). One potential account of how simultaneity is perceived is the mechanism of the temporal window of integration (twi; e.g., Colonius & Diederich, 2012; King, 2005; Lewkowicz & Ghazanfar, 2009; Spence & Squire, 2003; van Wassenhove et al., 2007; Vatakis, 2013; Vroomen & Keetels, 2010). This ‘window’ represents the temporal range within which the brain tolerates asynchronies in the presented stimuli (e.g., visual or auditory leads/lags) so as to integrate the multisensory event and perceive the inputs as simultaneous (Diederich & Colonius, 2015; Stevenson & Wallace, 2013). Research has shown that the twi has a sensory bias, with higher tolerance to visual as compared to auditory leads, given the naturally occurring arrival delays of sound. Thus, for simultaneity to be perceived, the twi is asymmetric, characterized by a visual shift that is referred to as a visual bias (i.e., the visual stimulus has to be presented before the auditory stimulus for synchrony to be perceived; Hirsh & Sherrick, 1961; Lewald & Guski, 2003; Munhall, Gribble, Sacco, & Ward, 1996; Slutsky & Recanzone, 2001; Vatakis, Navarra, Soto-Faraco, & Spence, 2008; Zampini, Shore, & Spence, 2003; Zampini, Guest, Shore, & Spence, 2005a). The system’s tolerance to asynchronies, however, is malleable, since it has been shown that the system can be recalibrated when adapted to specific asynchronies (e.g., Fujisaki, Shimojo, Kashino, & Nishida, 2004; Navarra, Vatakis, Zampini, Soto-Faraco, Humphreys, & Spence, 2005; Vatakis, Navarra, Soto-Faraco, & Spence, 2007; Vroomen et al., 2004). Thus, a potential shift of the twi towards audition can be attained by manipulation of depth cues, sensory exposure, or experience (e.g., music experts vs. non-musicians; King, 2005; Petrini et al., 2009; Silva et al., 2014; Spence & Squire, 2003).
Moreover, the twi can also be modulated by the temporal ventriloquism effect (i.e., the phenomenon
Spatial proximity of the auditory and visual inputs is another parameter that affects integration of the incoming inputs (Lewald & Guski, 2003). For instance, two events that are close in space, time, and structure are usually perceived as emanating from the same underlying cause, while in the case of large spatial displacements the percept is associated with two different events originating from different sources (Kording et al., 2007). Thus, it is proposed that the brain uses causal inference to make estimations about optimal cue combinations (Kayser & Shams, 2015; Kording et al., 2007; Shams & Beierholm, 2010). In other words, the brain decides on whether the two stimuli originate from the same source and, subsequently, whether to integrate or segregate them (Bayesian causal inference; Kording et al., 2007). Thus, one could infer that the ecological validity of the stimuli presented could enhance the likelihood of multisensory integration and, thus, one’s percept of synchrony (Aschersleben, 1999; but see van Eijk, 2008).
Kohlrausch et al. (2013) have also suggested that the potential differences noted in the perception of synchrony between various audiovisual stimuli (i.e., simple stimuli such as flashes and beeps, or ecologically valid stimuli such as a bouncing ball that have inherent anticipatory and predictive characteristics) may be attributed to the apparent causality of the event and not to the visual event predictability per se. More specifically, the claim is that an ecologically valid multisensory stimulus is expected to promote the impression that the visual stimulus causes the auditory stimulus. Thus, this implied causal relationship leads to the expectation that the auditory stimulus cannot precede the visual one. Such expectations are not present in simple stimulation, thus leading to shifts in the judgment of synchrony between different types of stimulation (i.e., less tolerance to auditory-leading asynchronies for ecologically valid stimuli; Kohlrausch et al., 2013).
Many other lower- and higher-level parameters have been reported to affect both participants’ perception of synchrony and sensitivity to asynchrony (see Vatakis, 2013, for a review). Stimulus characteristics such as intensity, duration, type, and content (Eg & Behne, 2015), as well as task characteristics, and attentional and decisional mechanisms have been reported to modulate the twi and, thus, to affect participants’ multisensory synchronous percept (Garcia-Perez & Alcala-Quintana, 2012; Keetels & Vroomen, 2012; Schneider & Bavelier, 2003; Shore, Spence, & Klein, 2001; Spence, Shore, & Klein, 2001; Sternberg & Knoll, 1973; Zampini et al., 2005b). Moreover, adaptation to specific stimulus asynchronies (Fujisaki et al., 2004; Vatakis, Navarra,
2 Tasks for the Measurement of Synchrony Perception
Research on the perception of synchrony has made use of various tasks for the measurement of perceptual latencies (differences in processing speed) between different sensory modalities, stimulus characteristics, participant groups, cue types (e.g., prior-entry), or in temporal recalibration (Garcia-Perez & Alcala-Quintana, 2015b). The two most widely used tasks are the temporal order judgment (toj) task and the binary simultaneity judgment (SJ-2) task. In the toj task, stimuli of different modalities are presented at various stimulus onset asynchronies (soas) and participants have to decide which stimulus was presented first or second (i.e., visual/auditory-first or visual/auditory-second response). Similarly, in the SJ-2 task, participants are stimulated bimodally with various soas, but their response now concerns whether the two sensory inputs were simultaneously presented or not (i.e., synchronous or asynchronous response).
Another commonly used task for measuring the perception of synchrony is the ternary simultaneity judgment task (SJ-3; Kohlrausch et al., 2013; Kuling, van Eijk, Juola, & Kohlrausch, 2012; Ulrich, 1987; van Eijk, Kohlrausch, Juola, & van de Par, 2008). This task is a combination of the two previously described tasks (i.e., toj and SJ-2). That is, the participants are presented with synchronous and asynchronous bimodal stimulation and they have to decide whether the stimuli were presented in synchrony or not and, in the case of an asynchronous response, to report which sensory input was presented first. Thus, the three possible answers to this task are: synchronous, auditory-leading, or visual-leading stimulation.
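The response options of the three tasks can be sketched as different readouts of one perceived asynchrony. The code below is a minimal illustration and not a model taken from any of the cited studies: the sign convention (positive values mean that the visual stimulus is perceived first), the half-width `delta_ms` of the region judged as synchronous, and the function name are all assumptions made for the example.

```python
import random


def respond(perceived_soa_ms, task, delta_ms=50.0, rng=random):
    """Map a perceived asynchrony onto a response for the given task.

    perceived_soa_ms: positive = visual perceived first (assumed convention).
    delta_ms: hypothetical half-width of the region judged 'synchronous'.
    """
    # Ternary categorisation of the perceived asynchrony.
    if perceived_soa_ms > delta_ms:
        category = "visual-first"
    elif perceived_soa_ms < -delta_ms:
        category = "auditory-first"
    else:
        category = "synchronous"

    if task == "SJ-3":
        return category  # all three categories can be reported
    if task == "SJ-2":
        return "synchronous" if category == "synchronous" else "asynchronous"
    if task == "TOJ":
        if category == "synchronous":
            # No order is perceived, yet an order response is required,
            # so the simulated participant guesses.
            return rng.choice(["visual-first", "auditory-first"])
        return category
    raise ValueError("task must be 'TOJ', 'SJ-2', or 'SJ-3'")


for task in ("TOJ", "SJ-2", "SJ-3"):
    print(task, respond(120.0, task), respond(10.0, task))
```

The sketch makes one design point explicit: only the toj task forces an order response when no order is perceived, which is why the guessing branch appears there and nowhere else.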
Other, less commonly used tasks have also been implemented for the study of synchrony perception, such as reaction time tasks (Cardoso-Leite, Gorea, & Mamassian, 2007; Diederich & Colonius, 2015; Leone & McCourt, 2015), perceptual fusion tasks (where participants decide whether they perceived a unified event or not; Stevenson & Wallace, 2013), and two-interval forced
3 The Parameters Associated with Synchrony Perception
The raw data obtained from the toj, SJ-2, and SJ-3 tasks are most commonly processed so as to obtain the point of subjective simultaneity (pss; note that pss differs from the point of objective simultaneity due to the latencies described in the introduction section), the just noticeable difference (jnd; standard deviation of the distribution, sd, in sj tasks), and the twi (see Table 11.1). Although these derived parameters are interpreted in the same way across tasks, it is as yet unclear whether the different tasks actually measure the same perceptual processes. The discrepancies noted in these sensitivity parameters when these tasks are compared have raised concerns on whether these measures reflect differences in participants’ sensitivity to the event synchrony/asynchrony or response biases and experimental manipulations (e.g., Garcia-Perez & Alcala-Quintana, 2012, 2015a,b; Nicholls, Lew, Loetscher, & Yates, 2011; Spence & Parise, 2010; van Eijk et al., 2008; Vatakis et al., 2008; Vroomen & Keetels, 2010; Yates & Nicholls, 2011). We will explore these discrepancies later in this chapter, but let’s first explain what the parameters of these measures represent for each task and how they are calculated.
3.1 toj Measures of Sensitivity
In the TOJ task, the pss is an indirect measure of the perceived simultaneity of the stimuli presented (Garcia-Perez & Alcala-Quintana, 2015a). It represents the asynchrony between bimodal inputs at which participants cannot reliably detect their temporal order (i.e., indirect perception of synchrony; see Table 11.1). Thus, participant responses for temporal order at the pss are near chance level (for which modality was presented first or second), assuming no inherent response biases. To compute the pss in a toj task, the percentage of the visual-first (or auditory-first) responses
The jnd represents the smallest interval at which the participants can reliably decide which sensory input of the two presented was first (see Table 11.1). The steepness of the curve at the 50% point reflects participants’ sensitivity to temporal asynchronies. This measure can also be expressed as the jnd value and computed as half the difference between the 25% and the 75% point on the same curve (see Figure 11.1). Typically, a steep slope results in small jnds and, thus, in high participant sensitivity in the detection of asynchronies in the stimuli presented (i.e., high temporal resolution).
Finally, the twi represents the range of tolerance in audiovisual asynchronies within which the perceptual system integrates the sensory inputs and, thus, reliable detection of order is not possible. The range of the twi is computed as [pss ± jnd]. The left twi (i.e., pss − jnd) represents participants’ insensitivity to order when the auditory stimulation is leading, while the right twi (i.e., pss + jnd) represents participants’ insensitivity to order when the visual stimulation is leading.
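The three toj measures described in this section can be illustrated with a small numerical sketch. It reads the 25%, 50%, and 75% points off the proportion of visual-first responses by linear interpolation, which is a simplification standing in for a fitted psychometric function. The data are invented for illustration, the helper name `crossing` is hypothetical, and negative soas are assumed to denote auditory-leading stimulation.

```python
def crossing(soas, props, level):
    """Return the SOA at which a rising curve first crosses `level`,
    found by linear interpolation between neighbouring data points."""
    points = list(zip(soas, props))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= level <= p1 and p1 != p0:
            return x0 + (level - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("curve does not cross the requested level")


# Invented example data: SOAs in ms (negative = auditory first, assumed)
# and the proportion of 'visual-first' responses at each SOA.
soas = [-200, -100, 0, 100, 200]
p_visual_first = [0.05, 0.25, 0.45, 0.75, 0.95]

pss = crossing(soas, p_visual_first, 0.50)   # 50% point
x25 = crossing(soas, p_visual_first, 0.25)   # 25% point
x75 = crossing(soas, p_visual_first, 0.75)   # 75% point
jnd = (x75 - x25) / 2.0                      # half the 25%-75% distance
twi = (pss - jnd, pss + jnd)                 # [pss - jnd, pss + jnd]

print("pss:", round(pss, 1), "ms")
print("jnd:", round(jnd, 1), "ms")
print("twi:", tuple(round(edge, 1) for edge in twi), "ms")
```

With these invented data the pss is shifted towards visual leads, and the resulting window extends further on the visual-leading side than on the auditory-leading side.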
3.2 SJ Measures of Sensitivity
4 Differences in the Measures Obtained from toj and sj Tasks
A number of studies have used toj and sj tasks to investigate how the perception of synchrony is affected by a number of parameters such as the type of stimulation (e.g., speech or non-speech stimuli, vestibular, auditory, visual, or tactile stimulation; Barnett-Cowan & Harris, 2009, 2011; Eg & Behne, 2015; Fujisaki & Nishida, 2009; Li & Cai, 2014; Maier, Di Luca, & Noppeney, 2011; Sanders et al., 2011; Vatakis et al., 2008; Vroomen & Stekelenburg, 2010), participant group tested (e.g., patients with schizophrenia, video gamers, older adults; Bedard & Barnett-Cowan, 2016; Capa, Duval, Blaison, & Giersch, 2014; Donohue et al., 2010), attentional manipulations used (Schneider & Bavelier, 2003), potential confounds and biases (see Keetels & Vroomen, 2012, for a review), adaptation and recalibration effects (e.g., Fujisaki et al., 2004; Vroomen et al., 2004), and perceptual training (Cecere, Gross, & Thut, 2016; Stevenson et al., 2013). These studies revealed marked differences between the parameters obtained from the two tasks. For instance, the mean pss values across toj studies were mainly shifted towards audition (i.e., auditory-leading), whereas in sj studies the mean pss values were generally visually shifted (i.e., visual-leading; see van Eijk et al., 2008, for an extended literature review; but also see Leone & McCourt, 2015; Linares & Holcombe, 2014). A central question in the study of synchrony perception is, therefore, whether the toj and sj tasks reflect the same or different perceptual processes (e.g., Binder, 2015; Garcia-Perez & Alcala-Quintana, 2015a; Keetels & Vroomen, 2010; Love et al., 2013; Spence & Parise, 2010; van Eijk et al., 2008; Vatakis et al., 2008).
The debate about whether the two tasks tap into common underlying processes dates back to the 1970s. On the basis of the independent-channels models described by Sternberg and Knoll (1973), it has been argued that a central timing mechanism receives the signals from the two stimuli presented, which arrive with randomly distributed arrival latencies, and applies a ternary decision rule to the arrival-time difference between the two signals in order to determine order or synchrony judgments. It was, therefore, assumed that the two tasks were based on the same internal events (perceptual latency model; see Allan, 1975) and that the perception of successiveness/asynchrony was a necessary and sufficient condition for the perception of temporal order. According to Hirsh (1959), however, the perception of asynchrony is a necessary but not
4.1 Divergent Perceptual Processes
Some researchers have proposed a potential dissociation between the mechanisms involved during the execution of the toj and sj tasks. Zampini et al. (2003), for instance, have argued that the toj task may reflect processes related to temporal discrimination, while sj tasks may be related more to temporal binding mechanisms. Vatakis et al. (2008) also argued that the two tasks might not measure the same aspects of temporal perception. In their study, they measured the participants’ sensitivity in the sj and toj tasks using temporal recalibration in a simple pair of audiovisual stimuli with an audiovisual speech stimulus as the adaptor. The results showed positive pss values for both tasks, but these values were not correlated across tasks on an individual basis. Had the pss values been correlated across tasks, one could argue that the tasks measure the same underlying processes. Yet this was not the case. Similarly, no correlation was found for the jnd values obtained from the two tasks. Vatakis and colleagues attributed the differences in the pss values to the nature of the sj task, which could potentially bias participants’ responses toward a simultaneous rather than an asynchronous response, given that matched events tend to also be matched in time and, thus, should be in synchrony.
Moreover, van Eijk et al. (2008) argued for the adoption of different terminology as a function of task so as to avoid potential interpretations of obtained data as representing the same underlying perceptual processes. Specifically, van Eijk and colleagues tested the effect of the experimental method used and the stimulus type presented on the audiovisual temporal percept across the same participants (not in a temporal adaptation, Vatakis et al., 2008; or prior entry paradigm, Yates et al., 2011). They used simple (i.e., light flashes and sound clicks) and complex stimuli (i.e., bouncing balls and impact sounds) at various soas and asked participants to provide a toj, an SJ-2, and an SJ-3 response.
Similarly, Love et al. (2013) suggested that the sj and toj tasks do not represent the same underlying processes of the perception of synchrony. Their study extended the results of van Eijk et al. (2008) by using five different stimulus types at different levels of complexity (i.e., beep-flash, beep-flash-constant-visual, beep-flash-drumming, point-light-drumming, and face-voice stimuli). The aim of their study was to investigate whether the previously observed pss differences (van Eijk et al., 2008) were consistent across different stimulus types. To eliminate any potential confounds due to stimulus duration, they mostly created stimuli with equal overall duration. They measured the pss and the twi. The pss obtained from the toj task was auditory shifted (i.e., auditory-leading), while, for the sj task, the pss values were visually shifted (i.e., visual-leading). Regarding the twis, narrower windows were obtained for the toj task using simple stimuli as compared to the sj task, while for complex speech stimuli, the twi was longer for the toj task as compared to the sj task. For the rest of the complex stimuli, no significant differences were obtained in the twi across tasks. Similar to van Eijk et al., Love et al. found no correlation for the pss or twi between the two tasks. Participants were more accurate in detecting asynchrony in the sj task when the auditory modality was leading as compared to visual leads for all stimulus types except for the speech pairs.
As can be seen from the above-mentioned studies, differences in the pss values across stimulus types and across studies are not consistent. This may be due to the different stimulation used, the experimental set-ups, or the analyses implemented. Together, however, these differences suggest that the two tasks used for measuring synchrony might not tap into the same underlying processes.
4.2 Partially Overlapping Mechanisms and Potential Sources of Bias
Fewer studies have argued that the sj and toj tasks share, at least in part, some common mechanisms of temporal perception. Maier et al. (2011), for example, supported this for the perception of synchrony for audiovisual speech stimuli. In their study, they utilized speech and non-speech stimuli and fitted their data using both parametric (Gaussian and cumulative Gaussian psychometric functions) and non-parametric methods. Their results showed that the twi for the toj task was wider than that obtained from the sj task for the speech but not for the non-speech stimuli. This finding was attributed to an attentional shift of the participants’ focus to the onsets of the audiovisual signal during the toj task, making participants ignore the temporal information of the rest of the stimuli, whereas the sj task required judgments based on the combined auditory and visual signals. The pss and jnd parameters obtained in the toj and sj tasks were found to be significantly correlated (in the non-parametric fitting). Thus, although one could argue for some common processing, one could also claim that the analysis method can potentially affect the results and their interpretation.
Linares and Holcombe’s (2014) toj and sj comparison focused on the role of biases as a potential explanation for the differences noted between the two tasks (Garcia-Perez & Alcala-Quintana, 2012; Schneider & Bavelier, 2003; Shore, Spence, & Klein, 2001; Yarrow, Jahn, Durant, & Arnold, 2011; Vatakis et al., 2008). Using simple stimuli, they found positive (i.e., visual-leading) pss values both for the toj and sj tasks (at the individual level, larger deviations were obtained in the toj task, with both positive and negative pss values). Moreover, no pss correlations were found between the two tasks. Thus, Linares and Holcombe attributed the different perceptual latencies (i.e., pss values) across tasks to partially distinct sets of biases. For instance, the pss obtained from the toj task
Stevenson and Wallace (2013) also argued that it is possible for the two tasks to share some common underlying processes for the “ascription of temporal identity at a stimulus level”. Specifically, they investigated the effect of task and stimulus type on the twi at different statistical criterion levels to check whether and how the criterion level may affect twi outcomes. They used simple and complex (speech and non-speech) stimuli and asked participants to perform an sj and a toj task. Their results showed that the twi was dependent both on the task and stimulus type. The sj task yielded wider twis as compared to the toj task, suggesting that additional processing steps were required after the low-level analysis of the temporal relationship of the stimulus pair. Similarly, the speech stimuli yielded wider twis as compared to non-speech stimuli (for criterion level at 50%). Moreover, the right side of the twi (both at 50 and 70% criterion levels) was longer for the sj than the toj task, while for speech stimuli it was symmetrical as compared to the asymmetrical twi for non-speech stimuli, with criterion levels impacting this symmetry. Contrary to what Linares and Holcombe reported for individual performance levels, Stevenson and Wallace found strong within-participant twi correlations across tasks. On the basis of this finding, they proposed that while task and stimulus types may differentially affect the pss and twi values, the tasks strongly correlate in their elicitation of the twi.
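How a criterion level can change the width and symmetry of an sj-derived window can be illustrated with a short sketch. It is not the fitting procedure of Stevenson and Wallace (2013); it simply interpolates linearly between invented data points to find where the proportion of ‘synchronous’ responses falls below a chosen criterion on each side of the peak. The function names and the data are assumptions, and negative soas are taken to denote auditory leads.

```python
def interpolate(x0, p0, x1, p1, level):
    """Linear interpolation of the SOA at which the curve equals `level`."""
    return x0 + (level - p0) * (x1 - x0) / (p1 - p0)


def sj_window(soas, p_sync, criterion):
    """Return (left, right) SOAs bounding the region in which the
    proportion of 'synchronous' responses is at or above `criterion`."""
    peak = max(range(len(soas)), key=lambda i: p_sync[i])
    if p_sync[peak] < criterion:
        raise ValueError("curve never reaches the criterion")
    # Default to the tested range if the curve never drops below criterion.
    left, right = soas[0], soas[-1]
    for i in range(peak, 0, -1):              # walk towards auditory leads
        if p_sync[i - 1] < criterion <= p_sync[i]:
            left = interpolate(soas[i - 1], p_sync[i - 1],
                               soas[i], p_sync[i], criterion)
            break
    for i in range(peak, len(soas) - 1):      # walk towards visual leads
        if p_sync[i + 1] < criterion <= p_sync[i]:
            right = interpolate(soas[i], p_sync[i],
                                soas[i + 1], p_sync[i + 1], criterion)
            break
    return left, right


# Invented example data: SOAs in ms and proportion of 'synchronous' responses.
soas = [-300, -200, -100, 0, 100, 200, 300]
p_sync = [0.1, 0.3, 0.7, 0.9, 0.9, 0.6, 0.2]

window_50 = sj_window(soas, p_sync, 0.50)
window_70 = sj_window(soas, p_sync, 0.70)
print("50% criterion:", tuple(round(edge, 1) for edge in window_50))
print("70% criterion:", tuple(round(edge, 1) for edge in window_70))
```

In this invented data set, raising the criterion from 50% to 70% narrows the window on both sides, but not by the same amount, so the apparent asymmetry of the window depends on the criterion chosen.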
Similarly, Garcia-Perez and Alcala-Quintana (2015) argued that differences in performance across the sj and toj tasks could be due to task-dependent decisional and response processes that operate on the “timing processes that are identical under both tasks”. In previous studies, the reported performance was based on a curve fitting relative to the task at hand and systematic discrepancies between toj and sj tasks were obtained. According to Garcia-Perez and Alcala-Quintana (2012; 2015b), however, this kind of analysis could not distinguish whether the differences obtained across stimuli, tasks, or experimental manipulations were due to timing, decisional, or response processes. To address this problem, Garcia-Perez and Alcala-Quintana (2012) developed a computational model of timing judgments to address the individual contribution of each component (i.e., timing, decisional, and response processes). Their model was based on independent-channels models and, thus, on the
The Garcia-Perez and Alcala-Quintana model has been extended to explain previously reported differences of the toj and sj tasks (Garcia-Perez & Alcala-Quintana, 2015a,b). For instance, Garcia-Perez and Alcala-Quintana (2015b) attributed the pss values obtained in the toj and sj tasks utilized in Matthews and Welch’s (2015) study to a left visual field advantage and different resolution at the decision mechanism (that is, different perceived onset asynchronies were required to perceive asynchrony between the hemifields), but not to differential low-level temporal characteristics (i.e., visual acuity). In general, for stimuli and conditions similar across tasks, the differences obtained in temporal judgments were due to decisional and response processes between the two tasks rather than different timing processes (Garcia-Perez & Alcala-Quintana, 2015a). Thus, this is among the first approaches to investigate how sjs and tojs differ in the decisional space (Garcia-Perez & Alcala-Quintana, 2012, 2015a; Matsuzaki et al., 2014; Regener, Love, Petrini, & Pollick, 2015).
Similarly, Matthews et al. (2016) argued that the differences between the toj and sj tasks stem from decisional and response-related factors. Specifically, in their study they showed that decisional factors govern both the relative speed (i.e., reaction time) and accuracy of the relative timing judgments. Using an sj and a toj task within a visual rsvp task, Matthews and colleagues found that reaction times increased with uncertainty near the task-specific decision boundaries. Thus, for sj they found faster reaction times to synchronous as compared to asynchronous inputs, while the opposite pattern was obtained for toj. Overall, they found smaller reaction time (rt) patterns (rt to synchronized stimuli/rt to ± threshold asynchrony) for the sj than for the toj task (although not consistently across participants), suggesting that decisional and not stimulus-driven differences affected the rts obtained in the two tasks. Further testing of this rt pattern with other types of stimuli (not only visual) still remains to be done, as well as testing its reliability for timing judgments.
Machulla et al. (2016) also argued that the toj and sj tasks share, at least partly, common processes and, thus, are not independent. They tested pss and jnd values across the sj and toj tasks using simple audiovisual, audiotactile, and visuotactile stimulation and found that the pss values obtained from the two tasks were correlated. Moreover, the pss values from the toj task were negative across the different pairs of stimuli (i.e., audiovisual, audiotactile, and visuotactile), while the pss values from the sj task were positive only for the visuotactile pair. It is important to note that the data-fitting procedure was based on a non-parametric method (to allow for asymmetries around the pss).
Finally, an interesting proposal was recently put forward by Parise and Ernst (2016), who argued that a general mechanism could potentially explain perceptual processes such as those governing causality, synchrony, and order. Specifically, they developed a model by borrowing the structure of a neural mechanism that detects motion and motion direction in the visual system (known as the Hassenstein-Reichardt detector or elementary motion detector) and modified it so as to explain aspects of multisensory perception. In brief, this mechanism contains detectors/subunits (termed multisensory correlation detectors) that receive sensory information from different senses with spatially aligned receptive fields. The detectors’ inputs are subjected to low-pass temporal filtering (i.e., temporal shifts are applied) and the outputs are either multiplied or subtracted to detect causality (correlation) and temporal order, respectively. Through this structure, one can explain the spatiotemporal characteristics of multisensory integration in a single general mechanism accounting for both neuronal and behavioral level outcomes (Parise & Ernst, 2016).
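The general logic of such a detector can be sketched in a few lines. The code below is a heavily simplified illustration of the filter, multiply, and subtract structure described above and is not the published model: the first-order exponential filters, the use of identical filters for both modalities, the time constants, the pulse stimuli, and all function names are assumptions made for the example.

```python
def low_pass(signal, tau_ms, dt_ms=1.0):
    """First-order exponential low-pass filter (introduces a temporal shift)."""
    alpha = dt_ms / (tau_ms + dt_ms)
    filtered, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        filtered.append(y)
    return filtered


def pulse(onset_ms, n_samples=1000, width_ms=10):
    """Brief rectangular pulse sampled at 1 ms (hypothetical stimulus)."""
    return [1.0 if onset_ms <= t < onset_ms + width_ms else 0.0
            for t in range(n_samples)]


def correlation_detector(visual, audio, tau_fast=50.0, tau_slow=150.0):
    """Return (correlation, lag) outputs of a simplified detector.

    tau_fast and tau_slow are hypothetical time constants in ms.
    """
    v = low_pass(visual, tau_fast)
    a = low_pass(audio, tau_fast)
    v_slow = low_pass(v, tau_slow)   # additionally delayed visual signal
    a_slow = low_pass(a, tau_slow)   # additionally delayed auditory signal
    # Two mirror-symmetric subunits, each multiplying a delayed signal
    # from one modality with the undelayed signal from the other.
    u1 = [vs * x for vs, x in zip(v_slow, a)]
    u2 = [x * au for x, au in zip(v, a_slow)]
    n = len(u1)
    correlation = sum(p * q for p, q in zip(u1, u2)) / n  # multiplied outputs
    lag = sum(p - q for p, q in zip(u1, u2)) / n          # subtracted outputs
    return correlation, lag


corr_sync, lag_sync = correlation_detector(pulse(200), pulse(200))
corr_far, _ = correlation_detector(pulse(200), pulse(500))
_, lag_visual_lead = correlation_detector(pulse(200), pulse(300))
_, lag_audio_lead = correlation_detector(pulse(300), pulse(200))
print("correlation larger for synchronous pair:", corr_sync > corr_far)
print("lag sign, visual lead vs. auditory lead:",
      lag_visual_lead > 0, lag_audio_lead < 0)
```

In this sketch, the multiplied output is largest for synchronous input, while the sign of the subtracted output indicates which modality led; with the sign convention chosen here, positive values correspond to visual leads.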
Recently, therefore, an increasing number of researchers have argued that the toj and sj tasks may not be independent but may instead share common mechanisms at the low level of the timing judgment, or even one general mechanism. See Table 11.2 for an overview of the studies mentioned above. To further contribute to this discussion, the next section briefly covers what happens at the neuronal level when utilizing these two tasks.
4.3 The Underlying Neuronal Processes of the sj and toj Tasks
It is now important to examine what happens in the brain when someone performs a synchrony/temporal judgment task. From a neuronal level perspective, Binder (2015) explored within-participant differences in neural activation between toj and sj tasks for a simple audiovisual stimulation using event-related fMRI. One of the main findings of this study was that the active areas elicited by both tasks overlapped with regions usually associated with spatial selective attention. Thus, timing judgments of audiovisual sensory inputs activate regions that are used during tasks based on spatial information. Another
Using more complex stimuli and a mixed block/event-related fMRI design, Love et al. (submitted) argued that the sj and toj tasks have “divergent neural mechanisms” despite the common brain activity elicited for both tasks. In line with Binder’s (2015) results, Love et al. found that during the toj task, but not the sj, several regions in the left hemifield were activated (middle occipital, middle frontal, precuneus, and superior medial frontal cortex), while the left middle occipital cortex (moc) areas were deactivated. These findings suggest differential neural mechanisms for the two temporal tasks, challenging the notion that the two tasks are based on the same cognitive architecture using the same sensory information (i.e., perceptual latency between the sensory inputs; Love et al., submitted).
Miyazaki et al. (2016) also argued that, from the neuronal perspective, the two tasks are based on different mechanisms. In their study, they utilized unimodal tactile stimuli to test the neural activity between the toj and sj tasks and they found specific brain activation patterns for each of the two tasks. More specifically, during the toj task more areas were activated as compared to those during the sj task (i.e., left ventral and bilateral dorsal premotor cortices and left posterior parietal cortex for the toj task and posterior insular cortex for the sj task), providing support for the hypothesis that not only do the two tasks engage their own specific processes but also that the toj task involves more processes than sj and that sj processes are included in those for toj (Miyazaki et al., 2016).
To sum up, despite the growing number of studies on the comparison of the two main tasks for synchrony perception, the results and, thus, the conclusions are still inconsistent across studies. The different types of stimuli used, along with the different soas, the number of participants, the various techniques followed to fit the data, and the different parameters derived from each study (pss, jnd, or twi), do not allow for a direct comparison of the existing sj and toj studies. Moreover, research on the mechanisms of the perception of synchrony and order has not as yet expressed a clear position in relation to the initially proposed hypotheses of perceptual latency and two-stage models (but see Binder, 2016; Garcia-Perez & Alcala-Quintana, 2012).
5 Which Task Should One Use to Measure Synchrony Perception?
The sj and toj tasks have been treated as equivalent and used interchangeably when studying synchrony (Keetels & Vroomen, 2012). In view of the recent findings, however, the question of which task to use in one’s study arises. In many studies, it has been argued that the toj data are more variable in terms of pss as compared to the sj data (van Eijk et al., 2008), which could be due to the inherent response biases (Garcia-Perez & Alcala-Quintana, 2012; Schneider & Bavelier, 2003; Vatakis et al., 2008) that cannot be distinguished from perceptual effects (Garcia-Perez & Alcala-Quintana, 2013). Garcia-Perez and Alcala-Quintana’s (2012) findings suggest that the response bias is larger in toj experiments (resulting in a shallower psychometric function at the 50% point; Garcia-Perez & Alcala-Quintana, 2013) because participants are required to guess which stimulus was presented first even when they perceived multiple stimulations as a single event. In this respect, performance measures are contaminated and possibly cannot be directly compared to those obtained from sj tasks (Garcia-Perez & Alcala-Quintana, 2012). Yarrow et al. (2011), in contrast, questioned the suitability of sj tasks, arguing that they are not appropriate for understanding the underlying mechanisms of “apparent timing distortions”. As they argue, it is not clear for the sj task whether the pss reflects differences in perceptual latencies and/or shifts in the criterion used to decide between synchronous and asynchronous sensory inputs.
Vatakis et al. (2008) have also expressed concerns about inherent response bias in the sj task, given that participants may be biased toward binding the incoming sensory inputs and treating them as a unified percept. This bias, according to Vatakis et al., cannot affect performance in the toj task, where participants need to judge the order of presentation of the stimulus pair. These arguments are based on their findings of worse jnd values in the sj as compared to the toj task. Similar jnd effects were also obtained by Barnett-Cowan and Harris (2009), who tested the temporal sensitivity to vestibular stimulation in relation to auditory, visual, and tactile stimulation.
Moreover, the comparison between the measures obtained from the SJ-2 and SJ-3 tasks revealed that although the auditory leading boundary in both sj tasks was shorter than the visual, the pss values and visual-first boundary for the SJ-2 task were larger as compared to those obtained from the SJ-3 task (van Eijk et al., 2008). These results led van Eijk et al. to suggest that for the perception of synchrony the SJ-3 task is potentially a better choice. The suitability of the ternary sj task (i.e., SJ-3) for temporal sensitivity measures has also been proposed by other researchers (e.g., Garcia-Perez & Alcala-Quintana, 2013; Schneider & Bavelier, 2003; Spence & Parise, 2010; Ulrich, 1987; Zampini, Shore,
Garcia-Perez and Alcala-Quintana (2015b) agree with this suggestion and take it a step further, to the decision space. Given that participants in an SJ-3 task may perceive asynchrony but be unable to identify stimulus order, Garcia-Perez and Alcala-Quintana proposed a potential extra division of the existing decisional space (synchronous, auditory first, visual first) in order to cater for this fourth type of judgment (asynchronous but cannot report the order). The authors suggest that such a division could potentially also explain why the twi in a toj task is wider than that for an sj task; however, these arguments need to be further investigated.
A task recommendation that is accepted by the whole community has yet to be formulated. This is mainly due to the unresolved issues of what the underlying mechanisms governing each task are and what the data obtained from each task really mean in terms of the perception of synchrony. Thus, more research comparing the three tasks is needed and, for the time being, one’s choice of task should depend on the specific question asked in a given study.
6 Individual Differences
Generally, mean pss values are mainly positive for sj tasks and negative for toj tasks. On an individual level, though, both auditory- and visual-leading psss have been reported for both tasks (Linares & Holcombe, 2014; Stone et al., 2001; van Eijk et al., 2008). For instance, while some participants perceive simultaneity when the visual stimulus is leading, others perceive simultaneity when the sound stimulus is leading. As Stone et al. reported, pss values differ significantly between most individuals, as well as between each individual and the estimated population mean pss value. These consistent individual differences should not be disregarded; instead, they should receive more attention so as to better understand their underlying causes (Spence & Squire, 2003).
Moreover, there is controversy over whether measures of sensitivity on an individual level correlate across tasks. For instance, many researchers (Fujisaki & Nishida, 2009; Linares & Holcombe, 2014; Love et al., 2013; van Eijk et al., 2008; Vatakis et al., 2008; Vroomen & Stekelenburg, 2010) reported that within participants the twi or pss values were not correlated across tasks. On the other hand, Stevenson and Wallace (2013) found strong twi correlations on an individual level. Linares and Holcombe (2014) suggested that to evaluate the obtained differences in the estimation of the pss between the two tasks, one
7 Criteria for Excluding Participants
When collecting data for the study of the perception of synchrony, one may sometimes need to exclude some data; in this section, we therefore take a brief look at the criteria used for removing inappropriate data. Excluding participants’ data is more widespread in toj than in sj tasks for similar stimulation and experimental settings (e.g., Love et al., 2013; Matthews et al., 2016). This may imply that the toj task is more difficult to perform than the sj task. Different criteria for potential data exclusion are used across studies. One of these criteria is how well the participant’s data are fitted by the curves used (Love et al., 2013). The criterion for the goodness of fit between data and fitted function is usually the r² value. Values below 0.5 are taken as grounds for exclusion, while values above this level lead to the data being retained for further analysis. There are also occasions where participants are excluded from further analyses because they are unable to perform the task above chance levels (Matthews et al., 2016; van Eijk et al., 2008) and/or because their pss values exceed the range of tested asynchronies (Matthews et al., 2016; Vatakis & Spence, 2008; Zampini et al., 2003). In other studies, each participant’s pss value is compared to the group pss, with values falling more than two standard deviations away from the mean being excluded (Vatakis et al., 2008).
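The exclusion criteria listed above can be illustrated with a minimal sketch. The function names, the participant labels, and all numerical values below are hypothetical and serve only as an illustration; the 0.5 cutoff and the two-standard-deviation rule follow the text, whereas the use of the sample (rather than population) standard deviation is an assumption, since the cited studies may differ on this point.

```python
from statistics import mean, stdev


def r_squared(observed, predicted):
    """Coefficient of determination between observed and fitted proportions."""
    grand_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - grand_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def exclusion_reasons(r2, pss, soa_range, r2_cutoff=0.5):
    """Return the criteria that one participant fails (empty list = keep)."""
    reasons = []
    if r2 < r2_cutoff:
        reasons.append("poor fit")
    lowest, highest = soa_range
    if not (lowest <= pss <= highest):
        reasons.append("pss outside tested soa range")
    return reasons


def group_outliers(pss_by_participant, n_sd=2.0):
    """Participants whose pss lies more than n_sd sample SDs from the group mean."""
    values = list(pss_by_participant.values())
    group_mean, group_sd = mean(values), stdev(values)
    return sorted(
        pid for pid, pss in pss_by_participant.items()
        if abs(pss - group_mean) > n_sd * group_sd
    )


# Hypothetical pss values (ms) for eight participants.
pss_values = {"p1": 10, "p2": 20, "p3": 15, "p4": 25,
              "p5": 5, "p6": 30, "p7": 20, "p8": 400}
print(exclusion_reasons(r2=0.42, pss=30, soa_range=(-300, 300)))
print(group_outliers(pss_values))
```

Note that the individual criteria (goodness of fit, pss range) can be applied to each participant independently, whereas the group criterion depends on the composition of the sample, so the order in which the criteria are applied may change who is excluded.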
Love et al. (2013) had large exclusion rates in the toj task of their study (over 63% for complex stimuli with manipulations of duration) and this suggests
8 Differences in Fitting Data Procedures
The most recent research in synchrony perception mostly fits Gaussian and cumulative Gaussian psychometric functions to sj and toj data. Even the fitting procedures, however, are not consistent across studies, and this may cause difficulties in comparing results across studies. The curve used to fit the data can affect the estimation of the pss, either precluding its estimation or biasing the pss “away from the mean of the distribution and towards the median” (Linares & Holcombe, 2014; Maier et al., 2011). Nevertheless, the obtained psychometric function is usually modeled either by a cumulative Gaussian (Leone & McCourt, 2015; Linares & Holcombe, 2014; Love et al., 2013; Stevenson & Wallace, 2013; van Eijk et al., 2008) or by a logistic function, posing yet another issue for the comparison of the reported data across studies. Similarly, to analyze data obtained from an sj task, a bell-shaped psychometric function is used to fit the “synchronous” response curve (see Stone et al., 2001). Generally, though, this fitting procedure is conducted on an individual basis and mean values across participants are computed for each parameter. Regardless of the function used to fit the data, the parameters obtained from this task are derived using the formulas described earlier in this chapter. Thus, it is clear that even within a study, where the stimuli and asynchronies across tasks are the same, the parameters involved in the different fitted functions in the toj and sj tasks can lead to differences in a study’s outcomes (Garcia-Perez & Alcala-Quintana, 2012).
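As an illustration of the kind of fitting described above, the following minimal sketch fits a cumulative Gaussian to toj data by a least-squares grid search. Several points are assumptions made only for this example: the data are synthetic and noise-free, positive soas are taken to denote visual-leading pairs, the grid search stands in for whatever optimization routine a given study used, and the jnd is defined as half the interval between the 25% and 75% points (approximately 0.6745 times the standard deviation), which is a common convention but not necessarily the formula adopted in this chapter.

```python
import math


def cumulative_gaussian(soa, mu, sigma):
    """Predicted proportion of 'visual first' responses at a given soa (ms)."""
    return 0.5 * (1.0 + math.erf((soa - mu) / (sigma * math.sqrt(2.0))))


def fit_toj(soas, p_visual_first, candidate_mus, candidate_sigmas):
    """Least-squares grid search over candidate (mu, sigma) pairs."""
    best = None
    for mu in candidate_mus:
        for sigma in candidate_sigmas:
            sse = sum(
                (cumulative_gaussian(s, mu, sigma) - p) ** 2
                for s, p in zip(soas, p_visual_first)
            )
            if best is None or sse < best[0]:
                best = (sse, mu, sigma)
    return best[1], best[2]


# Synthetic, noise-free data generated from mu = 40 ms and sigma = 120 ms.
soas = [-300, -200, -100, 0, 100, 200, 300]
data = [cumulative_gaussian(s, 40, 120) for s in soas]

mu, sigma = fit_toj(soas, data, range(-100, 101, 5), range(20, 301, 5))
pss = mu               # 50% point of the fitted function
jnd = 0.6745 * sigma   # half the 25%-75% interval (assumed definition)
print(pss, sigma, round(jnd, 2))
```

Replacing the cumulative Gaussian with a logistic function in this sketch would generally yield a slightly different pss and slope for the same data, which is one way in which the choice of function complicates comparisons across studies.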
A shortcoming of the Gaussian function is its symmetry, whereas the data to be fitted are asymmetric (i.e., participants report “synchronous” for larger visual-leading asynchronies; Alcala-Quintana & Garcia-Perez, 2013). To capture asymmetries in data from sj tasks, some studies have used two cumulative Gaussian functions to fit the synchrony response curve, allowing for different slopes in the two halves (e.g., Hillock, Powers, & Wallace, 2011; Stevenson & Wallace, 2013; van Eijk et al., 2008). Thus, one function was fitted to the synchronous responses when sound was leading (left twi) and the other to the synchronous responses when sound was lagging (right twi). The maximum synchronous response proportion was calculated for both halves and the intersection of the two curves may be used to estimate
Researchers have also used non-parametric functions to fit their data (both sj and toj; see Machulla et al., 2016; Maier et al., 2011), allowing for asymmetries between the two halves. Comparing parametric and non-parametric fitting, Maier et al. found differences in the correlations obtained between the parameters of the two tasks (i.e., parametric functions did not reveal significant correlations, while non-parametric functions revealed some significant correlations), showing that the data analysis method affects the obtained results. Signal detection procedures have also been used to describe each participant’s temporal precision (d’) across soas (Matthews et al., 2016) and, subsequently, these values are fitted to the selected functions, as would be done with proportions of “visual first” or “synchronous” responses.
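For readers unfamiliar with the signal detection measure mentioned above, the following minimal sketch computes d’ as the difference between the z-transformed hit and false-alarm rates. How hits and false alarms were defined in the cited study is not specified here, so the mapping used below is an assumption: a hit is a “visual first” response on a visual-leading trial and a false alarm is a “visual first” response on an auditory-leading trial at the same absolute soa. The log-linear correction and all trial counts are likewise illustrative choices.

```python
from statistics import NormalDist


def d_prime(hits, signal_trials, false_alarms, noise_trials):
    """d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total)
    keeps both rates away from 0 and 1, where z would be infinite.
    """
    hit_rate = (hits + 0.5) / (signal_trials + 1)
    fa_rate = (false_alarms + 0.5) / (noise_trials + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts per absolute soa (ms): (hits, false alarms) out of 20 trials each.
counts = {50: (11, 9), 100: (14, 6), 200: (18, 2)}
sensitivity = {soa: d_prime(h, 20, fa, 20) for soa, (h, fa) in counts.items()}
for soa, value in sensitivity.items():
    print(soa, round(value, 2))
```

The resulting d’ values, one per soa, can then be submitted to the same curve-fitting step that would otherwise be applied to response proportions.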
It should be clear, therefore, that there is no consistent analysis used across studies to describe the observed toj or sj performance. The main problem, though, with fitting arbitrary functions to data is that these functions (although they adequately describe the trend of the data) cannot explain the differences obtained in sensitivity measures between the tasks (Alcala-Quintana & Garcia-Perez, 2012; Garcia-Perez & Alcala-Quintana, 2013) and do not address the potential link of the data to the sensory and decisional parameters that may affect judgments (Garcia-Perez & Alcala-Quintana, 2012). Garcia-Perez and Alcala-Quintana (2012, 2015a,b), therefore, propose the use of the model they constructed on the basis of the independent channels models of timing judgments in order to better describe participants’ performance and gain insight into how the different levels of perception (sensory, decisional, response) affect sensitivity to synchrony/asynchrony and temporal order perception. It is still, however, early days for the community to suggest other models or to adopt one specific model.
In this chapter, we aimed to describe the toj and sj tasks as well as the differences they yield in the estimation of perceptual latency. As has been described, research has shown that many factors affect the pss and the twi values, such as the task, the stimulus type, the analysis method, inherent biases, decisional factors, as well as the individual. Progress has also been made in comparing the different tasks and trying to disentangle the underlying processes governing an
Vatakis, A. (2013). The role of stimulus properties and cognitive processes in the quality of the multisensory perception of synchrony. In L. Albertazzi (Ed.), Handbook of Experimental Phenomenology: Visual Perception of Shape, Space and Appearance (pp. 243–263). UK: John Wiley and Sons.
Increasing the soa between two signals linearly increases the overall stimulus duration, potentially providing extra cues when judging asynchrony, although this cue is not useful when judging stimulus order. Nevertheless, it would be best to ensure equal stimulus duration for both the toj and sj tasks.