Time is a fundamental dimension of human perception, cognition and action, as the processing and cognition of temporal information is essential for everyday activities and survival. Innumerable studies have investigated the perception of time over the last 100 years, but the neural and computational bases for the processing of time remain unknown. Extant models of time perception are discussed before the proposition of a unified model of time perception that relates perceived event timing with perceived duration. The distinction between perceived event timing and perceived duration provides the current for navigating a river of contemporary approaches to time perception. Recent work has advocated a Bayesian approach to time perception. This framework has been applied to both duration and perceived timing, where prior expectations about when a stimulus might occur in the future (prior distribution) are combined with current sensory evidence (likelihood function) in order to generate the perception of temporal properties (posterior distribution). In general, these models predict that the brain uses temporal expectations to bias perception in such a way that stimuli are ‘regularized’, i.e., stimuli look more like what has been seen before. As such, the synthesis of perceived timing and duration models is of theoretical importance for the field of timing and time perception.
Time is a fundamental dimension that pervades all sensory, motor and cognitive processes. Organisms, such as human beings, must quantify time in order to survive and interact with the environment efficiently and successfully. Time is central to our everyday lives, from playing sports, speaking, dancing, singing and playing music to our sleep–wake cycle. Though time is an important dimension of perception, a slight unease may fill the reader when researchers refer to ‘time perception’. The fields of colour, object, taste, olfactory, distance, speech and depth perception all investigate tangible physical properties, whereas the dimension of time is invisible and transient. In fact, one could ask whether time even exists at all: theories of relativity, for example, suggest that all moments in the past, present and future are equally real, rendering the specious present something of an illusion (Callender, 2010; Davies, 2002; Einstein, 1916; James, 1890). In this article, we review classic and modern approaches to temporal perception, before discussing the data from recent experiments that have shown how the timing of events changes in a way that is consistent with Bayesian Decision Theory. Finally, this paper calls for a theory of time perception that brings together duration and event timing into a single unified framework.
1.1 Scales of Time
Time is perceived over a broad scale from microseconds to days, weeks and months (but probably not over sub-nanosecond or geological units of time). At the millisecond range, time is critical for speech generation (Schirmer, 2004), recognition (Mauk & Buonomano, 2004) and motor control (Edwards et al., 2002). At the interval range (seconds to minutes), time is crucial for foraging behaviour (Henderson et al., 2006; Meck, 2003), decision making (Brody et al., 2003), sequential actions (Bortoletto et al., 2011) and associative learning (Gallistel & Gibbon, 2000). Interval timing has been demonstrated in many species of non-human animals, such as birds (Bateson & Kacelnik, 1997; Buhusi et al., 2002; Henderson et al., 2006; Ohyama et al., 1999), rodents (Buhusi et al., 2002; Gallistel et al., 2004), fish (Drew et al., 2005) and primates (Gribova et al., 2002; Janssen & Shadlen, 2005), as well as in human infants (Brannon et al., 2004) and adults (Church & Deluty, 1977; Gibbon et al., 1984). Circadian rhythms are based on the 24-hour light/dark cycle that arises from the rotation of the Earth in relation to the Sun, and help control waking times, sleep times and metabolic fitness (Buhusi & Meck, 2005; Czeisler et al., 1999).
Timing at the millisecond, interval and circadian scales is believed to be supported by different (or even competing) computational or neural mechanisms (Buhusi & Meck, 2005; Ivry & Schlerf, 2008; Merchant & de Lafuente, 2014). This review focuses on human behaviour and perception at the scale of hundreds of milliseconds and as such describes historical accounts of how the brain may deal with interval timing. ‘Timing’ can mean either how long an event lasted (the duration of an interval delimited by two stimuli) or when an event transpired (Merchant & de Lafuente, 2014). A large body of research has been concerned with revealing the mechanisms underlying the estimation of how long an interval is. The central aim of this review, however, is to elucidate how the brain may estimate when an event occurred in the world (Di Luca & Rhodes, 2016; Yarrow et al., 2015), and how this should be related to interval timing. Firstly, we will discuss the methods employed in time perception to estimate the duration between two events, before describing how current models can explain temporal processing. Then, we introduce recent research that suggests the brain uses a Bayesian inferential processing approach to estimate time in the world.
2 Measuring Perceived Duration
If a mechanism for time perception exists in the brain, then what might its function be? One might argue that an optimal mechanism would try to perceive time as close to veridical (physical) time as possible. Thus, the two main dependent variables in time perception research historically concern the mean accuracy and variability of temporal estimates. Estimates of a temporal characteristic, such as the duration of an event, are prone to temporal distortions by stimulus properties (Horr & Di Luca, 2015a, 2015b; Thomas & Brown, 1974; Wearden et al., 2007), complexity (Schiffman & Bobko, 1977), sensory modality (Goldstone & Lhamon, 1974; Wearden et al., 1998, 2006), and context (Dyjas & Ulrich, 2014); and as such, the mean accuracy of an estimate deviates from real time. Whilst the mean accuracy may approximate real time, the system may be suboptimal and as such the variability in the system may sometimes lead to experiencing an event as shorter or longer than the physical duration (Grondin, 2010).
2.1 From Perceived Duration to Perceived Timing
Temporal reproduction and production (Allan, 1979; Goldstone, 1968), verbal estimation (Vierordt, 1868) and the method of comparison (Bald et al., 1942; Dinnerstein & Zlotogura, 1968; Hamlin, 1895; Höring, 1864; Spence et al., 2001; Wichmann & Hill, 2001; Zampini et al., 2003) have been used classically to assess the perceived duration of events. Of central interest to this review, however, is the perception of when an event occurs in the world rather than how long something lasts. In order to understand how we could measure the perceived timing of a stimulus, we briefly introduce the psychophysics of relative timing, and how the method of comparison can be used to estimate the time point at which a stimulus is perceived. The word ‘perceived’ here is used in the loosest sense: the above methods cannot demonstrably show changes in low-level sensory processing of time (Rhodes, 2017). It is equally plausible that the methods we use in time perception are measuring changes in the decisional criteria associated with time (Solomon et al., 2012; Treisman, 1984; Yarrow et al., 2011, 2015, 2016).
3 Perceived Event-Timing and Psychophysics
3.1 Psychophysical Methods
Psychophysics is the scientific investigation of the functional interrelations between the physical and phenomenal world (Ehrenstein & Ehrenstein, 1999; Fechner, 1860). The aim of psychophysics is to quantify and measure subjective experience by determining the relationship between perception and physical stimuli. A central tenet of modern psychophysics is to control and vary the properties of an external stimulus and then ask a participant to report what they have experienced, with as simple a question as possible. For example, one may be interested in the detection of whether a sound is present or not (e.g., did you hear that stimulus?) or, further, in identifying what kind of stimulus characteristic is present (e.g., where was the stimulus?). As such, one might regard detection as the sensing of a stimulus, and identification as a higher-level process that can sometimes fail. For example, if a stimulus is weak and noisy, it may be sensed but a participant may be unable to identify or report a characteristic associated with it.
3.2 Measuring Intersensory Synchrony and Temporal Order
We live in a multisensory environment where perception is not simultaneous: it takes time. The perception of synchrony or temporal order is not straightforward, as differences in neural and physical transmission times can cause synchronous events to be perceived as asynchronous, and vice versa. When a distant bolt of lightning illuminates the sky at night and sends out thunderous sound waves, we see the light first and then hear the sound even though both signals were emitted simultaneously. The discrepancy in the perception of a simultaneous multisensory event is due to the relative difference in the time the signals take to reach the eyes and ears, as light travels much quicker than sound (300,000,000 vs. 330 metres per second). To complicate matters further, the processing time for visual stimuli (approx. 50 ms) is longer than that for auditory stimuli (approx. 10 ms), as the chemical transduction of light in the retina is slower than the mechanical transduction of sound waves in the ear (Allison et al., 1977; King, 2005; Spence & Squire, 2003; Vroomen & Keetels, 2010). The distance at which the differences in neural and physical transmission times cancel out, such that signals arrive at the primary sensory cortices synchronously, is around 10–15 metres away from the observer and has been called the horizon of simultaneity (Spence & Squire, 2003; Vroomen & Keetels, 2010). However, in interactions between a human observer and a sound/light emitting device at a close distance (~1–3 metres), it has been commonly reported that visual signals have to precede auditory signals for the perception of simultaneity (Vroomen & Keetels, 2010; Zampini et al., 2003, 2005a, b).
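The arithmetic behind the horizon of simultaneity can be sketched as follows. This is a minimal illustration that uses only the approximate speeds and processing latencies quoted above; the function names are hypothetical, and the model (travel time plus a fixed neural latency) is a deliberate simplification rather than a description of any published analysis.

```python
# Approximate values quoted in the text; all names here are illustrative.
SPEED_OF_LIGHT_M_S = 300_000_000.0
SPEED_OF_SOUND_M_S = 330.0
VISUAL_LATENCY_S = 0.050    # approx. 50 ms neural processing time
AUDITORY_LATENCY_S = 0.010  # approx. 10 ms neural processing time


def arrival_time(distance_m: float, speed_m_s: float, latency_s: float) -> float:
    """Physical travel time plus neural processing time, in seconds."""
    return distance_m / speed_m_s + latency_s


def audio_lag(distance_m: float) -> float:
    """Auditory minus visual arrival time; positive means sound arrives later."""
    audio = arrival_time(distance_m, SPEED_OF_SOUND_M_S, AUDITORY_LATENCY_S)
    visual = arrival_time(distance_m, SPEED_OF_LIGHT_M_S, VISUAL_LATENCY_S)
    return audio - visual


def horizon_of_simultaneity() -> float:
    """Distance (in metres) at which the two arrival times are equal."""
    latency_gap = VISUAL_LATENCY_S - AUDITORY_LATENCY_S
    return latency_gap / (1.0 / SPEED_OF_SOUND_M_S - 1.0 / SPEED_OF_LIGHT_M_S)


if __name__ == "__main__":
    print(f"horizon of simultaneity: {horizon_of_simultaneity():.1f} m")
    for distance in (1.0, 50.0):
        print(f"{distance:5.1f} m: audio lag {audio_lag(distance) * 1000:+.1f} ms")
```

With these values the two arrival times match at roughly 13 metres, which falls inside the 10–15 metre range cited above. Closer than that, the slower visual transduction dominates; farther away, the slower travel of sound dominates.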
The temporal difference between the senses is measured by finding the asynchrony necessary to perceive simultaneity, which is defined as the Point of Subjective Simultaneity (PSS). To measure this difference, one can use psychophysical methods. An extension of simply discriminating whether a signal is present or not is to present two stimuli (X and Y) with varying stimulus onset asynchronies (SOAs) (X–Y) and force participants to report whether the two stimuli are simultaneous (Exner, 1875; Fujisaki et al., 2004; Spence et al., 2001; Zampini et al., 2005a, b), or to report the temporal order of the pair (Boenke et al., 2009; Gibbon & Rutschmann, 1969; Jaśkowski, 1992; Yamamoto & Kitazawa, 2001; Zampini et al., 2003).
In the Simultaneity Judgment (SJ) task, participants judge whether X and Y appear to be simultaneous or not. Here, the proportions of ‘simultaneous’ responses are plotted as a function of SOA (Fig. 1E) and are commonly fitted with a Gaussian function. It is important to note, however, that fitting SJ data with a Gaussian function is rather arbitrary and without theoretical justification (see, e.g., Schneider & Bavelier, 2003; Sternberg & Knoll, 1973; Yarrow et al., 2011). The assumption is that the peak represents perceived simultaneity (i.e. the PSS), as this is the point at which participants are maximally sure that X and Y are synchronous. A further measure that can be derived from such a function is the standard deviation (SD) of the distribution of responses. The SD may characterize either the relative sensitivity, or how liberal participants’ criteria are for perceived simultaneity. Larger SDs suggest that participants either had a larger region of complete insensitivity to order or were more liberal with their criteria for two events being simultaneous (Yarrow et al., 2011).
In temporal order judgments (TOJs), the proportion of ‘Y first’ responses is generally an increasing function of SOA (Fig. 1E). One usually obtains a sigmoid function where the PSS corresponds to the SOA at which an observer is maximally unsure about the temporal order of the pair of stimuli (50% point). The steepness of the curve at the PSS reflects an observer’s sensitivity to temporal order and is expressed as the Just-Noticeable Difference (JND). Generally this measure is taken as half of the difference between the SOAs at the 25% and 75% points; however, other methods such as the Spearman–Kärber method may calculate this based on the 14% and 86% points (two sigma; see Miller & Ulrich, 2001). As such, the JND represents the smallest SOA at which an observer can reliably judge the temporal order of the pair. A flat curve results in a relatively larger JND and as such reflects an observer with low temporal sensitivity, whereas a steep curve yields a smaller JND and thus implies an observer with higher temporal sensitivity.
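The PSS and JND definitions above can be made concrete with a short sketch. It assumes that the sigmoid is a cumulative Gaussian whose mean and SD have already been estimated; the parameter values and function names are hypothetical, and the fitting step itself is not shown.

```python
from statistics import NormalDist


def p_y_first(soa_ms: float, mu_ms: float, sigma_ms: float) -> float:
    """Predicted proportion of 'Y first' responses at a given SOA."""
    return NormalDist(mu=mu_ms, sigma=sigma_ms).cdf(soa_ms)


def toj_summary(mu_ms: float, sigma_ms: float) -> dict:
    """PSS (50% point) and JND (half the 25%-75% spread) of the curve.

    mu_ms and sigma_ms stand in for fitted parameters (assumed values).
    """
    curve = NormalDist(mu=mu_ms, sigma=sigma_ms)
    soa_25 = curve.inv_cdf(0.25)
    soa_50 = curve.inv_cdf(0.50)
    soa_75 = curve.inv_cdf(0.75)
    return {"pss": soa_50, "jnd": (soa_75 - soa_25) / 2.0}


if __name__ == "__main__":
    steep = toj_summary(mu_ms=20.0, sigma_ms=30.0)  # more sensitive observer
    flat = toj_summary(mu_ms=20.0, sigma_ms=90.0)   # less sensitive observer
    print(f"steep curve: PSS {steep['pss']:.1f} ms, JND {steep['jnd']:.1f} ms")
    print(f"flat curve:  PSS {flat['pss']:.1f} ms, JND {flat['jnd']:.1f} ms")
```

For a cumulative Gaussian the 25%-75% definition makes the JND a fixed multiple (about 0.67) of the SD, so a flatter curve always yields a proportionally larger JND while leaving the PSS unchanged.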
3.3 Estimating Perceived Timing Using Psychophysics
We have discussed the psychophysical method and how one can measure the relative timing between two sensory events. Here, however, we will discuss how psychophysics may be used to estimate the perceived timing of an event through the PSS (defined as PSS here to avoid neologism, though it could be considered as the Point of Subjective Isochrony (PSI) in the following example). In a first type of task, participants are presented with a sequence of stimuli with the same inter-onset interval (IOI), except that the final stimulus has an anisochrony applied such that it could appear earlier or later than expected (Fig. 1A, B). Participants are then asked to report whether the final stimulus was on time (Di Luca & Rhodes, 2016) or, in a different task, whether it was early or late (Li et al., 2016).
If we consider standard TOJs, the PSS is only really a measure of the relative asynchrony at which two signals are perceived as simultaneous, not of when an event happened. To measure the perceived event timing of a stimulus, this review advocates presenting a sequence of regularly timed stimuli and pairing the last stimulus with a stimulus from another modality (which is unaffected by the sequence), to compare the PSS for stimuli presented on time, earlier than and later than expected. Presently, models of time perception do not predict that the PSS should change as a function of when a stimulus is presented. In the next section, we discuss such models and their predictions before introducing a Bayesian model of perceived event timing that makes explicit predictions.
4 Contemporary Models of Time Perception
The aim of this review is to increase the understanding of the computational mechanisms by which the brain may estimate the perceived timing of events; that is, how can the brain know when is now, when was then and when is next? Extant models of time perception are mostly based on the notion of perceived duration, i.e., how the brain may represent and encode the time between two signals. We now introduce and discuss such contemporary models of interval timing. Firstly, it should be noted that there is a large literature on different taxonomies of timing models. Some researchers have conceptualised models of time in terms of a dedicated neural mechanism for the perception of time (Creelman, 1962; Gibbon, 1977; Gibbon et al., 1984; Treisman, 1963; Wing & Kristofferson, 1973), in contrast to time being an intrinsic product of sensory information processing, where recurrent spatial or activity patterns read out duration without the need of an internal clock (Buonomano, 2009; Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007; Mauk & Buonomano, 2004). Further, dedicated models assume that there are specialized brain regions involved in the representation of temporal information, whilst intrinsic models primarily argue for a timing mechanism distributed over the brain (Ivry & Schlerf, 2008). This review is concerned with two popular classes of dedicated models for the perception of time: entrainment and interval models (Gibbon et al., 1984; Large & Jones, 1999; McAuley & Jones, 2003). We now introduce both before showing how they may be formulated to make predictions about the timing of individual stimuli.
4.1 Internal Clock Models
When one is asked ‘What time is it?’ or ‘How long have you been waiting?’, it is quite likely that this person will glance at their watch and use it to estimate what the present time is, or how long the wait has been. As such, it is intuitive to think that the brain may use a clock-like mechanism in order to deal with the perception of time. Internal clock (or interval) models of timing are born out of this analogy and they conceive time as a triad of clock, memory and decision processes (Creelman, 1962; Treisman, 1963). The most notable and influential interval model is Scalar Expectancy Theory (SET; Church et al., 1994; Gibbon, 1977; Gibbon et al., 1984). In the SET model, the internal clock is considered as a pacemaker–accumulator mechanism, where a dedicated pacemaker emits pulses continuously. To represent duration, the accumulator counts the number of pulses between two signals and then stores the count in memory (Fig. 2). The hallmark of the SET model is that as the mean duration of an interval increases, the associated standard deviation of the duration estimate also increases linearly; this is often called the ‘scalar property’ of interval timing. Such a property is an important characteristic of temporal perception and not just a feature of the SET model. It is also synonymous with the Weber–Fechner Law (Fechner, 1860), which asserts a logarithmic relationship between physical magnitudes and their representation in the perceptual system, such that the JND between two physical magnitudes is proportional to the absolute physical magnitude. Each interval is maintained in working memory before being passed to a more robust representation in long-term memory. The key point here is that time, in these accounts, is represented as discrete interval durations that are subsequently compared with other intervals at a decision stage (Allman et al., 2013; Church & Broadbent, 1990; Gibbon et al., 1984).
If the number of pulses in one interval is greater than that in another, then the former interval is perceived to be longer. After sufficient exposure to repeated intervals, the representation of the interval in memory becomes more refined and leads to better discrimination performance (Drake & Botte, 1993; Hoopen et al., 2011; Miller & McAuley, 2005; Schulze, 1978, 1989). Further, the stored intervals can be compared to the current clock reading in order to estimate the onset of a future stimulus.
The SET model does not try to explain any changes in the perceived timing of individual stimuli; rather, it is concerned with changes in the representation of duration. Stimuli, in this sense, are external cues that, after a processing delay, simply delimit intervals. Given this, interval models are also symmetric in the sense that they by and large do not explicitly predict any differences in the detection of temporal irregularities as a function of when a stimulus is presented (be that earlier or later than the expected time point). For example, if a stimulus is presented earlier than expected, then there should only be a small but predictable difference in its temporal discrimination. The scalar property can be used to predict asymmetric changes in temporal deviation detection by considering changes in the underlying transducer function of physical duration to perceived duration (García-Pérez, 2014), as well as the standard deviation of subjective duration being proportional to the average experienced duration (Church & Deluty, 1977; Church & Gibbon, 1982; Gibbon et al., 1984). Extending this to the idea of anisochrony, a stimulus presented earlier than expected delimits a shorter perceived duration, and as such a representation with a smaller standard deviation, than a stimulus presented later than expected, meaning that the earlier stimulus is easier to detect as irregular.
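The scalar-property argument for this small asymmetry can be sketched numerically. The Weber fraction of 0.1, the 700 ms inter-onset interval and the function names below are illustrative assumptions, not values taken from the cited studies.

```python
def scalar_sd(duration_ms: float, weber_fraction: float = 0.1) -> float:
    """SD of a duration representation under the scalar property.

    The SD grows in proportion to the represented duration, so the
    coefficient of variation (SD / mean) stays constant.
    """
    return weber_fraction * duration_ms


def final_interval_sd(ioi_ms: float, anisochrony_ms: float,
                      weber_fraction: float = 0.1) -> float:
    """SD for the last interval of a sequence whose final stimulus is shifted.

    Negative anisochrony means earlier than expected, positive means later.
    """
    return scalar_sd(ioi_ms + anisochrony_ms, weber_fraction)


if __name__ == "__main__":
    ioi = 700.0  # hypothetical inter-onset interval in ms
    for shift in (-40.0, 0.0, 40.0):
        sd = final_interval_sd(ioi, shift)
        print(f"anisochrony {shift:+6.1f} ms -> interval SD {sd:.1f} ms")
```

Under these assumptions an early final stimulus closes a shorter interval and therefore carries a slightly smaller SD than a late one, which is the small but predictable difference described above.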
A recent paper tested the predictions of interval models in event timing and reported that performance in detecting temporal irregularity differs with the sign of the anisochrony at which a stimulus is presented (Di Luca & Rhodes, 2016). The study reported that as the number of stimuli in a sequence increased, so did the ability to discriminate temporal irregularity, but only for stimuli presented earlier than expected. Further, differences in the perceived timing of stimuli as a function of their relation to expectation were reported: early stimuli were perceptually delayed whilst late stimuli were perceptually accelerated in order to appear closer to expectation. Interestingly, stimuli presented isochronously (on time) were perceptually accelerated (an effect that has also been reported for ‘early’ or ‘late’ judgments; Li et al., 2016). Interval models such as the Multiple-Look Model (Drake & Botte, 1993; Miller & McAuley, 2005) cannot account for these patterns of results; however, entrainment models can be formulated to explain at least the acceleration of stimuli presented isochronously.
4.2 Entrainment Models
Entrainment models offer an alternative realisation of interval timing. As in interval models, a clock-like mechanism is assumed, but here the clock is an entrainable oscillator that peaks in amplitude at the expected onset of future stimuli (Large & Jones, 1999; Large & Palmer, 2002; Large & Snyder, 2009; McAuley & Jones, 2003). Phase coincidence (Miall, 1989), recurrence of activity patterns (Buonomano, 2009; Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007), and a Bayesian-like model that combines noisy estimates of duration with a resonance-like mechanism that regularizes sequences of intervals (Burr et al., 2013) have also been proposed as alternative intrinsic entrainment models. Whilst interval models have mainly been formulated to explain interval timing and determining which of two intervals is longer (or shorter), entrainment models are more conducive to explaining stimulus timing in rhythmic sequences, as internal oscillations gradually adjust to the phase of external rhythms.
Dynamic Attending Theory (DAT) (Jones & Boltz, 1989; Large & Jones, 1999; Large & Palmer, 2002) is one realization of the concept of entrainment in time perception. Here, attention is not distributed evenly over time, but rather ebbs and flows with time’s passing. Originally proposed as a model of rhythmic expectancy, DAT proposes that rhythm perception is induced by way of entrainment to external signals. Internal fluctuations in attentional energy (attentional ‘peaks’) generate temporal expectancies about the onset of future events that can acclimate to the period and phase of external events by way of an adaptive internal oscillator (Fig. 3). At the neural level, the perception of regular events has been proposed to originate from neural oscillations that adjust and resonate with external signals (Henry & Herrmann, 2014; Large & Snyder, 2009; Zanto et al., 2006). The framework of active sensing (Schroeder & Lakatos, 2009; Schroeder et al., 2010), which concerns the fluctuation of excitation/inhibition cycles, can be tied directly to DAT. The high excitability phase of neural oscillations is thought to be associated with the peak of the attentional pulse and as such to facilitate sensory selection and processing of stimuli that coincide with the peak of an oscillation (Henry & Herrmann, 2014; Lakatos et al., 2008). Therefore, one can reason that if a stimulus occurs at the peak of an oscillation and high excitability phase, then it should be given a perceptual boost and processed faster. This effect is similar to prior entry (Spence & Parise, 2010; Sternberg et al., 1971), where attended stimuli are processed quicker than unattended ones.
The idea of prioritized processing of attended stimuli exists in the visual cognition domain (Summerfield & Egner, 2009), and such attentional facilitation of perception has been highlighted in a number of studies in the temporal processing literature (Spence et al., 2001; Sternberg & Knoll, 1973; Zampini et al., 2005b) as well as at the neural level (McDonald et al., 2005).
DAT accounts for perceived stimulus timing by considering that humans detect asynchronies between an expected stimulus onset time and the actual stimulus onset time (McAuley, 1995). If the stimulus onset occurs after the expected peak then a stimulus is perceived as being late, whilst if it is before the expected peak then it is perceived as being early. Intuitively, when a stimulus onset time coincides with the peak of the expected time, then it is perceived as being on time; though as shown above, entrainment models could be formulated to predict an acceleration of attended-to stimuli that occur at the peak of an oscillation. As a consequence of increasing attentional expectancies due to entrainment, sensitivity to temporal deviations improves as a function of increasing sequence length (Barnes & Jones, 2000; Drake & Botte, 1993; McAuley & Kidd, 1998; Miller & McAuley, 2005).
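The core idea of comparing each onset with an expected time point, and then adjusting the expectation, can be sketched as below. This is a deliberately simplified linear phase-and-period correction scheme, not the nonlinear attentional-pulse formulation of Large and Jones (1999); the gain values, starting period and function names are all hypothetical.

```python
def classify_onset(onset_ms: float, expected_ms: float,
                   tolerance_ms: float = 0.0) -> str:
    """Label an onset relative to the expected time point."""
    deviation = onset_ms - expected_ms
    if deviation > tolerance_ms:
        return "late"
    if deviation < -tolerance_ms:
        return "early"
    return "on time"


def entrain(onsets_ms, period_ms, phase_gain=0.5, period_gain=0.2):
    """Track a stimulus sequence with linear phase and period correction.

    Returns the deviation (onset minus expectation) for every onset after
    the first, together with the final adapted period.
    """
    expected = onsets_ms[0] + period_ms
    deviations = []
    for onset in onsets_ms[1:]:
        deviation = onset - expected
        deviations.append(deviation)
        period_ms += period_gain * deviation             # adapt the period
        expected += phase_gain * deviation + period_ms   # adapt the phase
    return deviations, period_ms


if __name__ == "__main__":
    # Isochronous 500 ms sequence, oscillator initially too slow (600 ms).
    onsets = [i * 500.0 for i in range(40)]
    deviations, period = entrain(onsets, period_ms=600.0)
    print(f"first deviation: {deviations[0]:+.1f} ms")
    print(f"last deviation:  {deviations[-1]:+.4f} ms")
    print(f"adapted period:  {period:.2f} ms")
```

In this sketch the deviations shrink as the sequence lengthens, which mirrors the claim that expectancies sharpen with increasing sequence length. The actual rate of convergence depends entirely on the assumed gains.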
Entrainment models can at least explain the perceptual acceleration of expected stimuli, yet this is still rather speculative. Extant Bayesian models of time perception have been formulated (Jazayeri & Shadlen, 2010; Miyazaki et al., 2005; Shi et al., 2013), but primarily for the representation of intervals. We now introduce the idea of Bayesian time perception for duration perception before discussing a contemporary Bayesian account of perceived event timing in rhythmic sequences.
4.3 A Bayesian Model of Interval Timing
As mentioned previously, time is subject to various contextual distortions. A seminal example of contextual calibration is Vierordt’s law (Lejeune & Wearden, 2009; Vierordt, 1868). When observers are presented with various intervals of different lengths and subsequently asked to reproduce each interval, they tend to overestimate the duration of short intervals and underestimate long ones (Jazayeri & Shadlen, 2010, 2015). This is a type of ‘central-tendency’ effect: participants shift their estimates of duration towards the mean of exposed intervals. A prevalent account of such an effect is that the perception of interval duration is derived not only from the perception of current sensory information, but also from prior knowledge of the duration of previously exposed intervals (Jazayeri & Shadlen, 2010; Lejeune & Wearden, 2009; Murai & Yotsumoto, 2016; Petzschner & Glasauer, 2011; Petzschner et al., 2015; Roach et al., 2016; Shi & Burr, 2016; Taatgen & van Rijn, 2011). Prior knowledge of the temporal statistics of the environment, in this sense, biases temporal perception.
A suitable candidate to explain the central-tendency effect observed in time perception is the Bayesian framework (Bayes, 1763). Bayesian models of perception have been successfully used to model several perceptual domains (Ernst, 2006; Ernst & Banks, 2002; Ernst & Bülthoff, 2004; Knill, 2007; Knill & Richards, 1996; Maloney & Mamassian, 2009; Mamassian et al., 2002) and have been applied to duration estimation (Hartcher-O'Brien et al., 2014; Shi et al., 2013) and reproduction (Jazayeri & Shadlen, 2010; Miyazaki et al., 2005). Further, Bayesian models have been used to describe the perception of temporal order for near synchronous events (Miyazaki et al., 2006). In contrast to audiovisual recalibration effects (which are somewhat ‘Anti-Bayesian’; Di Luca et al., 2009; Fujisaki et al., 2004; Roach et al., 2011; Vroomen et al., 2004), tactile temporal order appears to follow Bayesian principles, whereby previous experience of adapted SOAs (i.e. SOAs distributed around a negative or positive SOA) biases responses such that the perceived temporal order of tactile events is closer to prior experience.
Under the Bayesian framework, a generative model combines current sensory information (likelihood) with a priori knowledge of the world (prior) in order to give rise to a percept (posterior). The likelihood and prior in this model are weighted by their relative uncertainties (Colas et al., 2010; Fernandes et al., 2014; Griffiths & Tenenbaum, 2011; Lucas & Griffiths, 2009; Vilares & Körding, 2011). For example, percepts of noisier stimuli (those with more uncertain likelihoods) are influenced more by previous sensory experience (Fig. 4).
The Bayesian framework has recently been applied to the SET model of interval timing (Shi et al., 2013). The central tenet of such a Bayesian model is that the triad of components of the SET model is translated into the Bayesian framework: the likelihood, prior and posterior are considered analogous to the clock, memory and decision stages (Fig. 2). The clock stage represents the likelihood function, that is, present perceptual information. Suppose an interval delimited by two stimuli has duration $D$, with an allied internal clock count $C$ that represents the number of ‘ticks’ accumulated by the time the second stimulus has delimited the interval. The likelihood function $P_l(C \mid D)$ is then the probability of acquiring the perceived duration (clock count) $C$ given the external stimulation $D$. It is unclear, however, how continuous probability distributions such as likelihood functions are formed from discrete measures such as clock counts, i.e., how does the pacemaker–accumulator transform accumulated ticks into probabilistic representations of perceived duration? The width of the likelihood probability distribution indicates the relative sensory uncertainty given the measurement: a steep (narrow) function would indicate little uncertainty about the observed duration $D$, whilst a flatter function would indicate great uncertainty about $D$.
The memory stage is analogous to the prior probability distribution $P_p(D)$. The prior is a probability distribution that is centred at the objective mean of the sample intervals presented to subjects. As with the likelihood function, the prior’s width determines the precision of recent experience: flatter priors indicate greater uncertainty about the mean of sample intervals, whilst a sharp prior would indicate more precise estimates. In order to arrive at an estimate of perceived duration, according to Bayes’ rule, the prior is combined with the likelihood in order to form the posterior distribution $P_q(D \mid C)$:
$$P_q(D \mid C) \propto P_l(C \mid D)\, P_p(D)$$
The posterior distribution is considered as synonymous with the decision stage of the SET model. Given the posterior, a Bayesian ideal observer chooses an action given a loss function that specifies the relative cost or success of a potential behavioural response (Acerbi et al., 2012, 2014; Kording & Wolpert, 2004; Wolpert, 2007). If we consider the perception of duration, then the model predicts that noisy sensory estimates of duration are biased towards the mean of the prior probability distribution (Fig. 4). Research on Bayesian interval timing is still in its infancy with regard to the depth of studies investigating such models; however, there is recent work showing that the central tendency effect is stronger in vision than in the auditory modality (Cicchini et al., 2012). This result can be interpreted in two ways: either the prior is relatively weaker in the auditory modality and as such has little influence on the estimate, or auditory likelihood functions are more precise (steeper) and are not captured by the prior. A recent study claims that priors are modality dependent (Murai & Yotsumoto, 2016); however, the data appear to suggest that priors are in fact not modality dependent, but rather that the precision of duration estimates differs between modalities, given that auditory stimuli have greater reliability in temporal judgments (Ortega et al., 2014). Further, recent data also suggest that subjects form a general prior over two distinct sensory contexts (Roach et al., 2016).
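The way a prior pulls noisy duration estimates towards its mean can be illustrated with a minimal sketch. It assumes that both the likelihood and the prior are Gaussian and that the observer reports the posterior mean; these are simplifying assumptions chosen for clarity, not the specific forms used in the cited studies, and all numerical values and function names are hypothetical.

```python
def gaussian_posterior(likelihood_mean: float, likelihood_sd: float,
                       prior_mean: float, prior_sd: float):
    """Combine a Gaussian likelihood and prior; return posterior mean and SD.

    Each source is weighted by its precision (1 / variance), so the more
    uncertain source has less influence on the estimate.
    """
    likelihood_precision = 1.0 / likelihood_sd ** 2
    prior_precision = 1.0 / prior_sd ** 2
    total_precision = likelihood_precision + prior_precision
    likelihood_weight = likelihood_precision / total_precision
    mean = (likelihood_weight * likelihood_mean
            + (1.0 - likelihood_weight) * prior_mean)
    return mean, (1.0 / total_precision) ** 0.5


if __name__ == "__main__":
    prior_mean, prior_sd = 800.0, 100.0  # hypothetical mean of sampled intervals
    for label, sensory_sd in (("noisy (e.g. visual)", 100.0),
                              ("precise (e.g. auditory)", 50.0)):
        for interval in (600.0, 1000.0):
            estimate, _ = gaussian_posterior(interval, sensory_sd,
                                             prior_mean, prior_sd)
            print(f"{label:24s} {interval:6.0f} ms -> {estimate:6.1f} ms")
```

Under these assumptions short intervals are overestimated and long intervals underestimated, and the bias is smaller when the likelihood is more precise. The second interpretation of the modality difference discussed above corresponds to changing only the likelihood SD while keeping the prior fixed.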
4.4 Summary of Models
In summary, interval models of duration perception are based on the idea that an internal clock keeps track of time by counting the number of pulses between one event and another. When considering the perceived timing of a single stimulus, these models make no explicit predictions about changes in the timing of a stimulus due to the temporal structure of the sequence in which it is embedded. Entrainment models, on the other hand, can be formulated to predict that expected stimuli are processed faster and, as such, perceived earlier. However, entrainment accounts have not been specifically formulated to explain how temporal structure may change the perceived timing of stimuli. In contrast to these accounts of time perception, the Bayesian framework has been applied to several perceptual domains, and has recently been applied to duration estimation (Hartcher-O'Brien et al., 2014; Shi et al., 2013). The Bayesian framework has been used to show how the representation of duration is calibrated so that intervals appear more similar to previously experienced intervals (a central tendency effect). The likelihood function is similar to the clock stage of the SET interval-based model: the clock is responsible for measuring the duration of an external event. The prior is akin to the long-term reference and memory stages of the SET model and, as such, represents the learned knowledge of the average durations experienced. The posterior distribution represents a percept, and an observer chooses a response according to a decision rule, which is similar to the decision stage of the SET model. The model is useful in connecting the computational principles of Bayesian modelling with the information-processing account of duration perception offered by interval models.
However, as with other interval-based models, the Bayesian account of SET described above only makes predictions about the representation of intervals and, as such, does not predict any changes to the perceived timing of stimuli in sequences.
4.5 Shifting Focus from Perceived Duration to Perceived Event Timing?
Interval and entrainment models were born out of modelling the perception of duration. Numerous studies have sought to understand how the ability to discriminate temporal irregularities improves as the number of stimuli increases (Drake & Botte, 1993; Halpern & Darwin, 1982; Hoopen et al., 2011; Lunney, 1974; McAuley & Kidd, 1998; Miller & McAuley, 2005). These models predict that the detection of temporal irregularity is symmetric around an expected time point [though the application of SET to temporal bisection and generalization in duration perception does predict asymmetries in deviation detection (García-Pérez, 2014)]. Di Luca and Rhodes (2016) tested this prediction by asking participants to report whether the last stimulus in a unimodal sequence of isochronous tones of different lengths (3, 4, 5 or 6 stimuli) was ‘on time’ or ‘off time’ (Fig. 1A, B). In contrast to the predictions of multiple-look interval models, the increases in irregularity detection were asymmetric: with increasing sequence length, stimuli presented earlier than expected were better discriminated as irregular than stimuli appearing later than expected.
Changes in the perceived timing of the final stimulus could account for this asymmetry. To measure the perceived timing of the final stimulus (rather than perceived isochrony), a sequence of isochronous tones was presented, but this time the final tone was paired with a stimulus in another modality (Fig. 1C, D). From the participants’ responses, it was possible to calculate the PSS: the audiovisual asynchrony necessary to perceive both stimuli as simultaneous (Fig. 1E). The data show that if the final stimulus is presented a little earlier than expected, its perceived timing is delayed towards its expected timing. Conversely, stimuli presented a little later than expected are perceptually accelerated towards expectation. This shift of perceived timing towards the expected time point can be understood as temporal regularization, which is similar to central tendency effects in the time perception literature, such as Vierordt’s Law (Lejeune & Wearden, 2009; Vierordt, 1868), where the duration of an interval is biased by the average duration of intervals previously experienced (Jazayeri & Shadlen, 2010, 2015; Petzschner et al., 2015). However, in opposition to a central tendency effect, the authors also found an asymmetry in the perceived timing of stimuli presented at their expected time (on time): these stimuli are perceptually accelerated away from expectation. Adding weight to this finding, it has recently been reported that perceived timing is accelerated for stimuli presented at the expected time point (Li et al., 2016).
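As a rough illustration of how a PSS can be read off response data, the sketch below finds the asynchrony at which the proportion of 'light first' responses crosses 0.5. The response proportions, the sign convention (positive asynchrony means the light came first) and the function name are hypothetical, and linear interpolation is used here as a simplification; published analyses typically fit a psychometric function such as a cumulative Gaussian instead.

```python
def estimate_pss(soas_ms, p_light_first):
    """Return the asynchrony at which p('light first') crosses 0.5.

    Assumes soas_ms is sorted in ascending order and that the response
    proportions rise with asynchrony. Uses linear interpolation between
    the two tested asynchronies that bracket 0.5.
    """
    points = list(zip(soas_ms, p_light_first))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= 0.5 <= y1 and y1 > y0:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("responses never cross 0.5 within the tested range")


# Hypothetical data: asynchronies in ms and proportion of 'light first' responses.
soas_ms = [-120, -80, -40, 0, 40, 80, 120]
p_light_first = [0.05, 0.10, 0.30, 0.60, 0.85, 0.95, 1.00]

pss_ms = estimate_pss(soas_ms, p_light_first)
print(f"PSS = {pss_ms:.1f} ms")
```

A shift of this estimate between conditions (for example, early versus on-time final stimuli) is what indicates a change in the perceived timing of the stimulus being probed.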
This review centres on the distinction between perceived event timing and perceived duration. The perception of duration has a vast and important literature (Gibbon et al., 1984; Meck, 2003, 2005; Treisman, 1963; van Rijn et al., 2014), but the perception of events occurring at physical time points is less well understood. Interval timing and perceived event timing, though related, differ. Intervals are delimited by the presence of two stimuli, or by the onset and offset of one stimulus (i.e., a ‘filled’ duration). However, the SET interval timing model (Gibbon, 1977; Gibbon et al., 1984) does not explicitly state what happens to the timing of either of the stimuli that delimit an interval. When temporal expectations induce changes in the timing of a single stimulus (Di Luca & Rhodes, 2016; Rhodes & Di Luca, 2016), it is not apparent in SET whether the first and/or second stimulus delimiting an interval is subject to any change in its timing. One might ask: are perceived event timing and duration subserved by different systems, or are they parts of the same system? The distinction between the two becomes blurred when one considers effects such as differences in the perceived duration of intervals that are filled with a continuous stimulus (Buffardi, 1971; Thomas & Brown, 1974; Wearden et al., 2007) or filled with a series of regularly or irregularly timed stimuli (Horr & Di Luca, 2014, 2015a). Here, durations filled with a continuous tone or a series of events are perceived as longer than empty intervals. How does the perceived timing of events interact with the perception of duration to produce such a phenomenon? It may be that the perceived timing of events feeds forward, in series or in parallel, to a system that computes the perceived duration of an interval. As such, timing models that explicitly synthesize perceived timing and duration are of theoretical importance.
The perception of the timing between two events is well researched (Fujisaki et al., 2004; Grondin, 2010; Roseboom, 2017; Roseboom et al., 2015; Spence, 2007; Spence & Parise, 2010; Vroomen & Keetels, 2010), but there is a distinction between the relative timing of stimuli and their anchored time points. Humans appear to combine estimates of stimuli in a statistically optimal fashion using maximum likelihood estimation (Ernst, 2006; Ernst & Banks, 2002); however, such an approach does not reveal the absolute time at which a single stimulus is perceived, but only changes in the relationship between, or integration of, two events. When subjects complete audiovisual temporal order or simultaneity tasks (Di Luca et al., 2009; Fujisaki et al., 2004; Hartcher-O'Brien et al., 2014; Noel et al., 2016; Spence, 2007; Van der Burg et al., 2013; van Eijk et al., 2008), they (1) do not know when a trial will occur, and (2) are not asked about the timing of one of the stimuli with regard to an absolute timeline. Given this, the exact timing of a single sensory event cannot be known. As such, the following section discusses methods that may be able to measure the perceived timing of events with regard to a physical timeline.
5 A Bayesian Model of Perceived Event Timing
A Bayesian model based on the dynamic updating of temporal expectations can explain the asymmetries in the detection of irregularity and also in the perceived event timing of stimuli (Di Luca & Rhodes, 2016). Within a single trial, perceived timing (the posterior distribution) is the result of combining, at each point in time, the probability of sensing a stimulus (likelihood) with the time at which it was expected (prior) (Fig. 5). As opposed to current Bayesian accounts of time perception that use Gaussian probability distributions (Hartcher-O'Brien et al., 2014; Miyazaki et al., 2006; Shi et al., 2013), the key tenet of the model is the relaxation of the assumption of normality in the probability distributions (Acerbi et al., 2012; Di Luca & Rhodes, 2016; Jazayeri & Shadlen, 2010). Probability distributions in the temporal domain are asserted to be necessarily asymmetric due to the way time flows. The anisotropic nature of time means that evidence about stimulus timing can only begin to accumulate for the likelihood function after a short delay due to neural processing. Although a stimulus cannot be sensed before it is presented, there is always the chance that it could be perceived a little later than average due to noise in the sensory system. Prior distributions about the expected timing of future events should also be asymmetric: an organism cannot predict a second event to occur before the first event, so the prior should start at 0 when the first event occurs and continue to rise until the expected timing of the second event. However, due to the anisotropy of time, the second event could still be expected tomorrow, and as such the prior should have a long off tail.
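The combination of an asymmetric prior and an asymmetric likelihood can be sketched numerically on a time grid. The example below is a minimal illustration, not the authors' parameterisation: the two-piece Gaussian shape, all widths, the 700 ms expected time and the function names are assumptions, and the short neural delay described above is omitted for simplicity.

```python
import math


def two_piece_normal(t, mode, sd_left, sd_right):
    """Unnormalised asymmetric bump: steep rise before `mode`, long tail after."""
    sd = sd_left if t < mode else sd_right
    return math.exp(-0.5 * ((t - mode) / sd) ** 2)


def normalise(values):
    """Scale a list of non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def mean_of(grid, probs):
    """Mean of a discrete distribution defined on `grid`."""
    return sum(t * p for t, p in zip(grid, probs))


grid = list(range(0, 2001))  # time in ms; t = 0 is the onset of the first event
expected = 700               # hypothetical expected time of the second event

# Prior: zero at the first event, rising to the expected time, long off tail.
prior = normalise([two_piece_normal(t, expected, 80.0, 200.0) if t > 0 else 0.0
                   for t in grid])
# Likelihood for an on-time stimulus: steep onset, long off tail.
likelihood = normalise([two_piece_normal(t, expected, 20.0, 60.0) for t in grid])
# Posterior: pointwise product of prior and likelihood, renormalised.
posterior = normalise([p * l for p, l in zip(prior, likelihood)])

prior_mean = mean_of(grid, prior)
posterior_mean = mean_of(grid, posterior)
print(f"prior mean:     {prior_mean:.1f} ms")
print(f"posterior mean: {posterior_mean:.1f} ms")
```

With these illustrative widths, the posterior mean falls earlier than the prior mean even though the stimulus arrives exactly at the expected time; other parameter choices may behave differently.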
The Bayesian model of perceived event timing makes two predictions. First, the perceived timing of stimuli in an environment where trials are isochronous should exhibit the temporal regularization effect: early stimuli should be delayed towards expectation whilst late stimuli should be accelerated. Stimuli presented on time, in contrast, are perceptually accelerated, as the mean of the posterior is earlier in time than the mean of the prior and is, as such, reported earlier (Fig. 5). However, stimuli presented in a random sequence of irregular timings should not build up any temporal expectations. Because no prior is built, their perceived timing should not be modulated. Second, an implicit assumption of the model is that noisier measurements should lead to broader likelihood functions that are captured more by the prior probability distribution. In the next section, we will consider empirical data that support these two predictions.
The perception of regularity has historically been investigated in terms of deviations from its inverse: irregularity (Drake & Botte, 1993; Halpern & Darwin, 1982; Lunney, 1974; McAuley & Kidd, 1998; Repp, 1999; Schulze, 1978, 1989; Tanaka et al., 2008). But what makes a sequence of isochronous tones be perceived as regular? Extant models of rhythm perception assume that if a stimulus is presented in an isochronous structure then it is simply perceived as such. Time, however, is a physical dimension that is often subject to distortion in human perception (Allman & Meck, 2012; Hellström & Rammsayer, 2015; Hoopen et al., 1995; Horr & Di Luca, 2015a, b; Jazayeri & Shadlen, 2010; Lejeune & Wearden, 2009; Petzschner et al., 2015; van Wassenhove et al., 2008; Wearden et al., 2007); so why should a temporal property such as regularity be taken for granted?
Rhodes & Di Luca (2016) investigated whether the temporal environment could influence the perception of regularity. If a sequence contains temporally irregular events, then the perceived timing of a stimulus should not be modulated, as the prior that biases perceived timing cannot be built. The authors found that a regularly timed environment promotes the perception of regularity and changes the perceived timing of stimuli so that slightly irregular stimuli appear more regular. An irregular environment of jittered tones, on the other hand, makes perfectly regular tones embedded within it appear slightly irregular.
These results can be interpreted within the context of the Bayesian model of perceived event timing. In a regular environment, temporal expectations dynamically build after each stimulus and subsequently bias the perception of slightly irregular stimuli to make them appear more regular (Fig. 6B). However, in an irregular environment, temporal expectations are less precise and as such do not build up, and therefore do not bias the perceived timing of stimuli. As the posterior is less precise (Fig. 6A), the distribution from which the perception of regularity is taken is wider, and as such there is a chance that an isochronous stimulus is perceived as being irregular. The lack of integration between the prior and likelihood could be due to the large differences between the sources of information, i.e., isochronous sequences versus highly anisochronous sequences. The system discounts the discrepant source of information (isochronous trials) and does not combine the priors and likelihoods (Banks & Backus, 1998; Ernst & Banks, 2002).
5.1 Impact of Bayesian Perceived Timing on Contemporary Models of Time Perception
The Bayesian model with asymmetric likelihood functions accurately captures recent data from experiments showing how anisochronous stimuli are temporally regularized and isochronous stimuli are perceptually accelerated (Di Luca & Rhodes, 2016; Li et al., 2016; Rhodes & Di Luca, 2016). Previous timing models, such as interval and entrainment models of time perception, cannot account for the asymmetric patterns of results observed in these experiments. Di Luca and Rhodes (2016) show an asymmetry in temporal deviation detection: stimuli that are presented earlier than expected are better detected as off-time as the length of a sequence increases. Both interval and entrainment models predict a symmetric increase in temporal discrimination performance as the number of stimuli in a sequence increases (Drake & Botte, 1993; Large & Jones, 1999; Large & Palmer, 2002; ten Hoopen et al., 2011). The Multiple-Look Model (MLM), an interval-based model of temporal discrimination, is based on the idea that as sequence length increases so does the precision of an estimate for each interval (Drake & Botte, 1993; Miller & McAuley, 2005). Similarly, the beat-averaging (Schulze, 1978, 1989), diminishing returns (ten Hoopen et al., 2011) and internal-reference models (Bausenhart et al., 2014; Dyjas et al., 2012; Ulrich, 1987) are all based on similar premises (Li et al., 2016). As the factor of change in such accounts is the improved internal representation of an interval, interval-based models do not make explicit predictions about changes in the perceived timing of stimuli (Gibbon, 1977; Gibbon et al., 1984; Shi et al., 2013), as stimuli simply delimit intervals.
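One common way to express the multiple-look idea is sketched below. It assumes that each of the n intervals in a sequence yields an independent estimate t_i with the same standard deviation σ and symmetric noise; these symbols are introduced here for illustration and do not appear in the cited formulations verbatim.

```latex
\hat{T}_n = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad \mathrm{SD}\!\left(\hat{T}_n\right) = \frac{\sigma}{\sqrt{n}}
```

Under these assumptions the averaged estimate becomes more precise with every additional interval, but its error remains symmetric, so early and late deviations of the same size are predicted to be equally detectable.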
A key interval-based model to explain such changes in representation is SET (Gibbon et al., 1984). In this model, an internal pacemaker emits pulses that are accumulated and counted between two events, leading to a duration estimate. In order to account for the modulations in perceived timing, the SET model must be augmented. Rather than being in competition with SET, the model presented here highlights a general problem: how ‘global’ context effects can be reconciled with ‘local’ changes in perception. For instance, it has been shown that the duration of just the previous stimulus can affect the perceived simultaneity of the next (Van der Burg et al., 2013), in addition to the temporal regularization phenomena reported in this review. As such, a general model of time perception that estimates both perceived timing and duration is of paramount importance in order to reconcile such different ways of understanding how humans and animals perceive time.
Entrainment models of temporal perception similarly predict symmetrical performance in determining whether stimuli are earlier or later than expected (Henry & Herrmann, 2014; Large & Jones, 1999; Large & Palmer, 2002). Entrainment models are based on the idea that the phase and frequency of temporal patterns adjust to rhythmic events: at the neural level, recurrent activity patterns (Buonomano, 2009; Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007; Laje & Buonomano, 2013) or phase coincidence (Miall, 1989) progressively tune to the frequency and phase of external stimulation. Though not originally formulated to predict changes in perceived timing, entrainment models could appeal to the rhythmic deployment of attention at an expected time point, which would facilitate the processing of on-time stimuli so that they are perceived faster (Rohenkohl et al., 2011). However, the data show that early stimuli are delayed towards expectation, and current formulations of entrainment models cannot account for this finding (Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007; Large & Jones, 1999; Large & Palmer, 2002; Large & Snyder, 2009; Miall, 1989). These models are principally based on phase correction for the next stimulus in a sequence rather than on modifications of a stimulus at the present time, and it is unclear how they could account for perceptual delay. As with interval models, entrainment accounts of temporal processing should consider the modulation of PSS that results in temporal regularization.
To summarize, the Bayesian model of perceived timing can explain the delay of early stimuli as well as the acceleration of on-time and later-than-expected stimuli. Interval models do not make any explicit predictions about changes in the perceived timing of stimuli and as such cannot account for these data. However, if one considers recent Bayesian interval timing models (Jazayeri & Shadlen, 2010), a maximum-likelihood estimator based on a Gaussian conditional probability would accelerate the temporal perception of events due to the asymmetry of the likelihood function. Entrainment accounts could be formulated to explain the acceleration of on-time stimuli; however, they cannot explain the delay of early stimuli towards expectation.
5.2 Impact on Sensory Processing Theories
Sensory processing involves three separate stages: (1) detecting incoming information, (2) representing incoming information and (3) interpreting that representation (Wei & Stocker, 2015). Two distinct accounts exist to explain these processes. The efficient coding hypothesis explains how limited neural resources lead to efficient representations that are optimized with regard to the natural statistics of the environment (Barlow, 1961; Lewicki, 2002; Simoncelli, 2003; Wei & Stocker, 2015). The role of primary sensory processing is, as such, to reduce the inefficiency and redundancy in representing a raw image by recoding the representation into an efficient form (Huang & Rao, 2011). However, in this framework, it is difficult to determine how perceptual biases may arise. Built on such a theoretical basis, the predictive coding hypothesis suggests that sensory processing is the result of combining current sensory information with prior knowledge about the world (Friston & Kiebel, 2009; von Helmholtz, 1963; Kersten et al., 2004; Knill & Richards, 1996; Ma et al., 2006; Srinivasan et al., 1982) according to Bayes’ (1763) rule. Such an information-processing approach can explain the myriad of data that show consistent perceptual biases (Ernst, 2006; Ernst & Banks, 2002; Knill & Richards, 1996; Körding & Wolpert, 2004a; Mamassian et al., 2002; Petzschner et al., 2015; Wolpert & Ghahramani, 2000). Recently, however, a unified model has been proposed that reconciles a predictive coding (Bayesian) approach with efficient coding of a sensory representation (Wei & Stocker, 2012, 2015) by constraining priors and likelihoods with natural stimulus statistics.
Recent data show how sensory information may be represented at the neural level by constraining the likelihood function with the anisotropy of time (Di Luca & Rhodes, 2016). The authors introduce the idea that the likelihood function is necessarily asymmetric in the temporal dimension, with a steep onset and long off tail. The asymmetric likelihood function explains how stimuli that are presented on time are perceptually accelerated, an anti-Bayesian effect. Interestingly, a recent article has shown concurrent repulsions away from the peak of the prior through similarly asymmetric likelihoods and priors (Wei & Stocker, 2012, 2015). The relaxation of the assumption of normality is thus of theoretical importance, as up until now probability distributions have generally been described as Gaussians in the Bayesian framework (Ernst, 2006; Ernst & Banks, 2002; Knill & Richards, 1996; Miyazaki et al., 2005; Sciutti et al., 2014; Shi et al., 2013), though asymmetric distributions have been used (e.g. Acerbi et al., 2012; Jazayeri & Shadlen, 2010).
Behavioural data hint at the brain optimizing perception in order to process sensory information more efficiently (Di Luca & Rhodes, 2016; Petzschner et al., 2015; Wei & Stocker, 2015). Why regularize stimuli if most are actually irregular? One possible answer is that the exploitation of temporal regularities decreases neural metabolic consumption (VanRullen & Dubois, 2011). In addition, the predictable timing of future stimuli leads to improved stimulus discrimination and detection in a plethora of tasks (Brochard et al., 2013; Carnevale et al., 2015; Correa et al., 2005; Cravo et al., 2013; Escoffier et al., 2010; Jazayeri & Shadlen, 2010; Rohenkohl & Nobre, 2011), whilst the rhythmic entrainment of stimuli allows the automatizing of behaviour for activities such as dance, locomotion, speech, and music production (McNeill, 1995; Repp, 2005).
If noisier signals lead to shallower likelihood functions, their estimates should be captured more by the prior than those of less noisy signals. This sort of effect has also been found in human speed perception, whereby a broader likelihood function results in speed estimates that are more dominated by the prior (Senna et al., 2015; Stocker & Simoncelli, 2006). Given that this effect extends to the domain of temporal perception, one could posit that it is applicable to other perceptual modalities and is, as such, perception-general.
6 Directions for Future Research
In order to continue to validate the proposed Bayesian model of perceived timing, the model must be tested and subsequently modified in order to reflect the findings of future work. In this section, we discuss explicit predictions based on this model to stimulate ideas for future research.
To elicit temporal regularization effects, single sequences of isochronous events or intervals are presented in order to build up prior expectations; yet in the environment, sequences of repeated events are often not isochronous. In almost all forms of music around the world, completely isochronous melodies are rare: music has distinct and complex temporal patterns operating at different hierarchies and time signatures (Large & Palmer, 2002; Vuust & Witek, 2014). Syncopated rhythms, for example, carry expectations about the future timing of events, yet are not completely isochronous (Fitch & Rosenfeld, 2007). How can the brain predict such events in the context of a unified model if it is based on the isochronous presentation of stimuli? Models, at present, would predict that a syncopated, and as such deviant, stimulus would be biased towards the expected timing/interval; yet it seems that when a stimulus is obviously earlier than expected, we perceive it as such. To clarify this issue, the extent of the regularization effect must be mapped over a whole range of anisochronies. One may predict that at a certain magnitude of anisochrony the regularization effect goes away. If this is the case, it may mean that a hierarchical prior takes over and modulates the tendency to regularize deviant stimuli. Further, one could also imagine another prior, based on the rhythm and syncopation of a sequence, which also influences the lower-level regularization and, as such, the combination of the prior and likelihood.
The prior is built through the presentation of isochronous events or intervals, but sometimes events may not be sensed or may not even occur. In the active sensing framework, entrained oscillations remain phase consistent after the end of external stimulation, yet decay after some time (Lakatos et al., 2005, 2008; Schroeder & Lakatos, 2009). In the same way, does the prior decay over time, or does it stop exerting an influence the moment a beat is missed? To test this, one could think of an experiment where the final stimulus is omitted and instead presented at T + 1, T + 2, T + 3 etc., where T is the timing of the final stimulus. If the prior is still present (yet decayed), it should still modulate perceived timing, but the effect should diminish as the number of missed beats increases.
Moving away from the perception of auditory or visual stimuli, the model could be extended to the realm of motor control. It has been consistently shown that humans synchronize to sensorimotor events such as finger tapping or dancing (Elliott et al., 2009; Elliott et al., 2010, 2014; Repp, 1999, 2005; Repp & Su, 2013). A consistent finding in such studies is that the time of a tap (i.e., the time at which a finger touches a surface) precedes the onset of an isochronous metronome. The model could account for such a negative error, as it predicts that isochronous events are perceived earlier than expected, resulting in earlier taps. Further, how should an observer know when to initiate a tap? Due to the build-up of temporal expectations via the stimulation of a metronome, observers can anticipate the timing of future taps and use this information to initiate a movement.
7 A Unified Model of Time Perception?
What should a unified model of time perception look like? A great deal of literature has been dedicated to the perception of time and, in particular, interval timing (Creelman, 1962; Gibbon et al., 1984; Matell & Meck, 2004; Meck, 2005; Merchant & de Lafuente, 2014; Treisman, 1963). The perception of duration has been described with the SET model and, in this framework, has been tied to thalamo–cortico–striatal circuitry (Matell & Meck, 2004). Contextual calibration effects on perceived duration have been modelled in the Bayesian framework, whereby duration estimates are biased towards the mean of previously experienced intervals (Jazayeri & Shadlen, 2010; Miyazaki et al., 2006; Shi et al., 2013). Context effects are limited by the long time course needed to learn the temporal statistics of the environment (Acerbi et al., 2012). The motivation of current work from our lab was to re-focus temporal perception from the duration dimension to perceived timing, as well as to show how the perception of time can be biased rapidly on a trial-to-trial basis. It is therefore important that future work should seek to link together the existing frameworks for perceived duration and perceived event timing. As both event timing and contextual calibration of perceived duration (Di Luca & Rhodes, 2016; Jazayeri & Shadlen, 2010; Miyazaki et al., 2006; Shi et al., 2013) have been described in the Bayesian framework, a neural model of Bayesian inference that explains both perceived duration and timing could lead to a unified and neurophysiologically plausible account of time perception.
There are several theories of how the brain may represent probability distributions (Beck et al., 2008; Deneve et al., 1999; Fiser et al., 2010; Hoyer & Hyvarinen, 2003; Pouget et al., 2000; Zemel et al., 1998). Whilst ultimately a computational framework to explain how prior expectations can be combined with current sensory evidence to arrive at a best estimate of the state of the world, Bayesian inference has been shown to operate at the neural level through probabilistic population coding (Ma et al., 2006). A constellation of psychophysical experiments shows that humans perform near Bayes-optimal inference (Beierholm et al., 2009; Ernst, 2006; Ernst & Banks, 2002; Kersten & Yuille, 2003; Knill & Richards, 1996; Körding & Wolpert, 2004a, b; Ma et al., 2006; Petzschner & Glasauer, 2011; Shi et al., 2013; Stocker & Simoncelli, 2006; Vilares & Körding, 2011), and recent work has described how subjects use Bayesian inference in the domain of event timing.
In order to translate the Bayesian model of perceived event timing to the neural level, one must first consider that such a model is not in competition with interval-based accounts of time perception that have tried to link the internal clock model with the Bayesian framework (Creelman, 1962; Gibbon et al., 1984; Jazayeri & Shadlen, 2010; Petzschner et al., 2015; Shi et al., 2013; Treisman, 1963); rather, the model should be synthesized with such models in order to arrive at a general model of time perception. A hierarchically organized Bayesian neural inference model, in which low-level population codes encode the perceived timing of stimuli and then feed forward to a higher level that encodes the duration between two stimuli, may offer a way of harmonizing perceived duration and timing.
During the last 150 years, great steps have been made in understanding how the human brain may perceive time. The advent of the psychophysical approach to studying perception has allowed researchers to precisely measure temporal properties of stimuli and as such, a large body of research has sought to understand the mechanisms underpinning temporal–perceptual phenomena. Contemporary models of time perception consider temporal processing from the perspective of duration. A recent Bayesian model of perceived timing re-focuses temporal perception research towards an event-based outlook. The model sets the scene to unify temporal processing accounts at neural, computational and behavioural levels, with the future goal of leading to a general model of time perception that is neurobiologically plausible, grounded in computational principles and accounts for both interval and event timing.
This review was funded by the Marie Curie CIG 304235 ‘TICS’ and supported by the EU FET Proactive grant TIMESTORM: Mind and Time: Investigation of the Temporal Traits of Human–Machine Convergence, with additional support from the Dr Mortimer and Dame Theresa Sackler Foundation, which supports the work of the Sackler Centre for Consciousness Science. I would like to thank Max Di Luca, Warrick Roseboom and Anil Seth for all their continued help and support.
Acerbi, L., Wolpert, D. M., & Vijayakumar, S. (2012). Internal representations of temporal statistics and feedback calibrate motor-sensory interval timing. PLoS Comput. Biol., 8, e1002771. doi: 10.1371/journal.pcbi.1002771.
Acerbi, L., Vijayakumar, S., & Wolpert, D. M. (2014). On the origins of suboptimality in human probabilistic inference. PLoS Comput. Biol., 10, e1003661. doi: 10.1371/journal.pcbi.1003661.
Allison, T., Matsumiya, Y., Goff, G. D., & Goff, W. R. (1977). The scalp topography of human visual evoked potentials. Electroencephalogr. Clin. Neurophysiol., 42, 185–197.
Allman, M. J., Teki, S., Griffiths, T. D., & Meck, W. H. (2013). Properties of the internal clock: First- and second-order principles of subjective time. Annu. Rev. Psychol., 65, 743–771.
Bald, L., Berrien, F. K., Price, J. B., & Sprague, R. O. (1942). Errors in perceiving the temporal order of auditory and visual stimuli. J. Appl. Psychol., 26, 382–388.
Banks, M. S., & Backus, B. T. (1998). Extra-retinal and perspective cues cause the small range of the induced effect. Vis. Res., 38, 187–194.
Barlow, H. B. (1961). The coding of sensory messages. In W. H. Thorpe & O. L. Zangwill (Eds), Current problems in animal behaviour (pp. 330–360). Cambridge, UK: Cambridge University Press.
Bausenhart, K. M., Dyjas, O., & Ulrich, R. (2014). Temporal reproductions are influenced by an internal reference: Explaining the Vierordt effect. Acta Psychol. (Amst.), 147, 60–67.
Beck, J. M., Ma, W. J., Kiani, R., Hanks, T., Churchland, A. K., Roitman, J., Shadlen, M. N., Latham, P. E., & Pouget, A. (2008). Probabilistic population codes for Bayesian decision making. Neuron, 60, 1142–1152.
Beierholm, U. R., Quartz, S. R., & Shams, L. (2009). Bayesian priors are encoded independently from likelihoods in human multisensory perception. J. Vis., 9, 23.1–9. doi: 10.1167/9.5.23.
Boenke, L. T., Deliano, M., & Ohl, F. W. (2009). Stimulus duration influences perceived simultaneity in audiovisual temporal-order judgment. Exp. Brain Res., 198, 233–244.
Brochard, R., Tassin, M., & Zagar, D. (2013). Got rhythm… for better and for worse. Cross-modal effects of auditory rhythm on visual word recognition. Cognition, 127, 214–219.
Brody, C. D., Hernández, A., Zainos, A., & Romo, R. (2003). Timing and neural encoding of somatosensory parametric working memory in macaque prefrontal cortex. Cereb. Cortex, 13, 1196–1207.
Buffardi, L. (1971). Factors affecting the filled-duration illusion in the auditory, tactual, and visual modalities. Percept. Psychophys., 10, 292–294.
Buhusi, C. V., & Meck, W. H. (2005). What makes us tick? Functional and neural mechanisms of interval timing. Nat. Rev. Neurosci., 6, 755–765.
Buhusi, C. V., Sasaki, A., & Meck, W. H. (2002). Temporal integration as a function of signal and gap intensity in rats (Rattus norvegicus) and pigeons (Columba livia). J. Comp. Psychol., 116, 381–390.
Buonomano, D. V., & Merzenich, M. M. (1995). Temporal information transformed into a spatial code by a neural network with realistic properties. Science, 267(5200), 1028–1030.
Burr D. Rocca E. D. & Morrone M. C. (2013). Contextual effects in interval-duration judgements in vision, audition and touch. Exp. Brain Res.23087–98.
Carnevale F. de Lafuente V. Romo R. Barak O. & Parga N. (2015). Dynamic control of response criterion in premotor cortex during perceptual detection under temporal uncertainty. Neuron861067–1077.
Church R. M. Meck W. H. & Gibbon J. (1994). Application of scalar timing theory to individual trials. J. Exp. Psychol. Anim. Behav. Process.20135–155.
Cicchini G. M. Arrighi R. Cecchetti L. Giusti M. & Burr D. C. (2012). Optimal encoding of interval timing in expert percussionists. J. Neurosci.321056–1060.
Correa A. A. Lupiáñez J. J. & Tudela P. P. (2005). Attentional preparation based on temporal expectancy modulates processing at the perceptual level. Psychonom. Bull. Rev.12328–334.
Cravo A. M. Rohenkohl G. Wyart V. & Nobre A. C. (2013). Temporal expectation enhances contrast sensitivity by phase entrainment of low-frequency oscillations in visual cortex. J. Neurosci.334002–4010.
Czeisler C. A. Duffy J. F. Shanahan T. L. Brown E. N. Mitchell J. F. Rimmer D. W. Ronda J. M. Silva E. J. Allan J. S. Emens J. S. Dijk D. J. & Kronauer R. E. (1999). Stability, precision, and near-24-hour period of the human circadian pacemaker. Science284(5423) 2177–2181.
- Search Google Scholar
- Export Citation
( Czeisler C. A. Duffy J. F. Shanahan T. L. Brown E. N. Mitchell J. F. Rimmer D. W. Ronda J. M. Silva E. J. Allan J. S. Emens J. S. Dijk D. J. Kronauer R. E. 1999). Stability, precision, and near-24-hour period of the human circadian pacemaker. , 284( 5423), 2177– 2181.
Deneve S. Latham P. E. & Pouget A. (1999). Reading population codes: A neural implementation of ideal observers. Nat. Neurosci.2740–745.
Di Luca M. & Rhodes D. (2016). Optimal perceived timing: Integrating sensory information with dynamically updated expectations. Sci. Rep. 628563. doi: 10.1038/srep28563.
Di Luca M. Machulla T. K. & Ernst M. O. (2009). Recalibration of multisensory simultaneity: Cross-modal transfer coincides with a change in perceptual latency. J. Vis.91–16.
Dinnerstein A. J. & Zlotogura P. (1968). Intermodal perception of temporal order and motor skills: Effects of age. Percept. Mot. Skills26987–1000.
Drake C. C. & Botte M. C. M. (1993). Tempo sensitivity in auditory sequences: Evidence for a multiple-look model. Percept. Psychophys.54277–286.
Drew M. R. Zupan B. Cooke A. Couvillon P. A. & Balsam P. D. (2005). Temporal control of conditioned responding in goldfish. J. Exp. Psychol. Anim. Behav. Process.3131–39.
Dyjas O. & Ulrich R. (2014). Effects of stimulus order on discrimination processes in comparative and equality judgements: Data and models. Q. J. Exp. Psychol.671121–1150.
Dyjas O. Bausenhart K. M. & Ulrich R. (2012). Trial-by-trial updating of an internal reference in discrimination tasks: Evidence from effects of stimulus order and trial sequence. Atten. Percept. Psychophys.741819–1841.
Ehrenstein W. H. & Ehrenstein A. (1999). Psychophysical methods. In Windhorst U. & Håkan J. (Eds) Modern techniques in neuroscience (p. 1211–1241). Heidelberg, Germany: Springer-Verlag.
Elliott M. T. Wing A. M. & Welchman A. E. (2010). Multisensory cues improve sensorimotor synchronisation. Eur. J. Neurosci.311828–1835.
Elliott M. T. Wing A. M. & Welchman A. E. (2014). Moving in time: Bayesian causal inference explains movement coordination to auditory beats. Proc. Biol. Sci.281(1786) 20140751–20140751. doi: 10.1098/rspb.2014.0751.
Ernst M. O. (2006). A Bayesian view on multimodal cue integration. In Knoblich G. Thornton I. M. Grosjean M. & Shiffrar M. (Eds) Human Body Perception From the Inside Out (pp. 105–131). New York, NY, USA: Oxford University Press.
Ernst M. O. & Banks M. S. M. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature415(6870) 429–433.
Fernandes H. L. Stevenson I. H. Vilares I. & Körding K. P. (2014). The generalization of prior uncertainty during reaching. J. Neurosci.3411470–11484.
Fiser J. Berkes P. Orbán G. & Lengyel M. (2010). Statistically optimal perception and learning: from behavior to neural representations. Trends Cogn. Sci.14119–130.
Gallistel C. R. King A. & McDonald R. (2004). Sources of variability and systematic error in mouse timing behavior. J. Exp. Psychol. Anim. Behav. Process.303–16.
García-Pérez M. A. (2014). Does time ever fly or slow down? The difficult interpretation of psychophysical data on time perception. Front. Hum. Neurosci.8415. doi: 10.3389/fnhum.2014.00415.
Goldstone S. & Lhamon W. T. (1974). Studies of auditory–visual differences in human time judgment. 1. Sounds are judged longer than lights. Percep. Mot. Skills3963–82.
Gribova A. Donchin O. Bergman H. Vaadia E. & de Oliveira S. C. (2002). Timing of bimanual movements in human and non-human primates in relation to neuronal activity in primary motor cortex and supplementary motor area. Exp. Brain Res.146322–335.
Griffiths T. L. & Tenenbaum J. B. (2011). Predicting the future as Bayesian inference: People combine prior knowledge with observations when estimating duration and extent. J. Exp. Psychol. Gen.140725–743.
Grondin S. (2010). Timing and time perception: A review of recent behavioral and neuroscience findings and theoretical directions. Atten . Percept. Psychophys. 72561–582.
Hamlin A. J. (1895). On the least observable interval between stimuli addressed to disparate senses and to different organs of the same sense. Am. J. Psychol.6564–575.
Hartcher-O’Brien J. Di Luca M. & Ernst M. O. (2014). The duration of uncertain times: audiovisual information about intervals is integrated in a statistically optimal fashion. PLoS One9 e89339. doi: 10.1371/journal.pone.0089339.
Hellström Å. & Rammsayer T. H. (2015). Time-order errors and standard-position effects in duration discrimination: An experimental study and an analysis by the sensation-weighting model. Atten. Percept. Psychophys. 772409–2423.
Henderson J. Hurly T. A. Bateson M. & Healy S. D. (2006). Timing in free-living rufous hummingbirds, Selasphorus rufus . Curr. Biol.16512–515.
Henry M. J. & Herrmann B. (2014). Low-frequency neural oscillations support dynamic attending in temporal context. Timing Time Percept.262–86.
Horr N. K. & Di Luca M. (2015a). Filling the blanks in temporal intervals: the type of filling influences perceived duration and discrimination performance. Front. Psychol.6114. doi: 10.3389/fpsyg.2015.00114.
Horr N. K. & Di Luca M. (2015b). Taking a long look at isochrony: Perceived duration increases with temporal, but not stimulus regularity. Atten. Percept. Psychophys.77592–602.
Hoyer P. O. & Hyvärinen A. (2003). Interpreting neural response variability as Monte Carlo sampling of the posterior. In Becker S. Thrun S. & Obermayer K. (Eds) Advances in Neural Information Processing Systems Vol. 15 [Neural Information Processing Systems NIPS 2002 December 9–14 2002 Vancouver BC Canada] (pp. 277–284).
Karmarkar U. R. & Buonomano D. V. (2007). Timing in the absence of clocks: encoding time in neural network states. Neuron53427–438.
Laje R. & Buonomano D. V. (2013). Robust timing and motor patterns by taming chaos in recurrent neural networks. Nat. Neurosci. 16 925–933.
Lakatos P. Shah A. S. Knuth K. H. Ulbert I. Karmos G. & Schroeder C. E. (2005). An oscillatory hierarchy controlling neuronal excitability and stimulus processing in the auditory cortex. J. Neurophysiol. 941904–1911.
Lakatos P. Karmos G. Mehta A. D. Ulbert I. & Schroeder C. E. (2008). Entrainment of neuronal oscillations as a mechanism of attentional selection. Science320(5872) 110–113.
Lejeune H. & Wearden J. H. (2009). Vierordt’s The Experimental Study of the Time Sense (1868) and its legacy. Eur. J. Cogn. Psychol.21941–960.
Li M. S. Rhodes D. & Di Luca M. (2016). For the last time: Temporal sensitivity and perceived timing of the final stimulus in an isochronous sequence. Timing Time Percept. 4123–146.
Lucas C. G. & Griffiths T. L. (2009). Learning the form of causal relationships using hierarchical Bayesian models. Cogn. Sci.34113–147.
Ma W. J. Beck J. M. Latham P. E. & Pouget A. (2006). Bayesian inference with probabilistic population codes. Nat. Neurosci.91432–1438.
Maloney L. T. & Mamassian P. (2009). Bayesian decision theory as a model of human visual perception: Testing Bayesian transfer. Vis. Neurosci.26147–155.
Mamassian P. Landy M. S. & Maloney L. T. (2002). Bayesian modelling of visual perception. In Rao R. P. N. Olshausen B. A. & Lewicki Michael S. (Eds) Probabilistic models of the brain: Perception and neural function (pp. 13–36). Cambridge, MA, USA: MIT Press.
Matell M. S. & Meck W. H. (2004). Cortico-striatal circuits and interval timing: Coincidence detection of oscillatory processes. Cogn. Brain Res.21139–170.
McAuley J. D. & Jones M. R. (2003). Modeling effects of rhythmic context on perceived duration: A comparison of interval and entrainment approaches to short-interval timing. J. Exp. Psychol. Hum. Percept. Perform.291102–1125.
McAuley J. D. & Kidd G. R. (1998). Effect of deviations from temporal expectations on tempo discrimination of isochronous tone sequences. J. Exp. Psychol. Hum. Percept. Perform.241786–1800.
McDonald J. J. J. Teder-Sälejärvi W. A. W. Di Russo F. F. & Hillyard S. A. S. (2005). Neural basis of auditory-induced shifts in visual time-order perception. Nat. Neurosci.81197–1202.
Miller N. S. N. & McAuley J. D. J. (2005). Tempo sensitivity in isochronous tone sequences: The multiple-look model revisited. Percept. Psychophys.671150–1160.
Miller J. J. & Ulrich R. R. (2001). On the analysis of psychometric functions: The Spearman–Kärber method. Percept. Psychophys.631399–1420.
Miyazaki M. Yamamoto S. Uchida S. & Kitazawa S. (2006). Bayesian calibration of simultaneity in tactile temporal order judgment. Nat. Neurosci.9875–877.
Murai Y. & Yotsumoto Y. (2016). Timescale- and sensory modality-dependency of the central tendency of time perception. PLoS One11 e0158921. doi: 10.1371/journal.pone.0158921.
Noel J.-P. De Niear M. Van der Burg E. & Wallace M. T. (2016). Audiovisual simultaneity judgment and rapid recalibration throughout the lifespan. PLoS One11(8) e0161698. doi: 10.1371/journal.pone.0161698.
Ohyama T. Gibbon J. Deich J. D. & Balsam P. D. (1999). Temporal control during maintenance and extinction of conditioned keypecking in ring doves. Anim. Learn. Behav.2789–98.
Ortega L. Guzman-Martinez E. Grabowecky M. & Suzuki S. (2014). Audition dominates vision in duration perception irrespective of salience, attention, and temporal discriminability. Atten . Percept. Psychophys.761485–1502.
Petzschner F. H. & Glasauer S. (2011). Iterative Bayesian estimation as an explanation for range and regression effects: a study on human path integration. J. Neurosci.3117220–17229.
Repp B. H. (1999). Detecting deviations from metronomic timing in music: Effects of perceptual structure on the mental timekeeper. Percept. Psychophys.61529–548.
Rhodes D. & Di Luca M. (2016). Temporal regularity of the environment drives time perception. PLoS One11(7) e0159842. doi: 10.1371/journal.pone.0159842.
Roach N. W. Heron J. Whitaker D. & McGraw P. V. (2011). Asynchrony adaptation reveals neural population code for audio-visual timing. Proc. Biol. Sci.278(1710) 1314–1322.
Roach N. W. McGraw P. V. Whitaker D. J. & Heron J. (2016). Generalization of prior information for rapid Bayesian time estimation. Proc. Natl Acad. Sci. U. S. A.114412–417.
Rohenkohl G. & Nobre A. C. (2011). Alpha oscillations related to anticipatory attention follow temporal expectations. J. Neurosci.3114076–14084.
Rohenkohl G. Coull J. T. & Nobre A. C. (2011). Behavioural dissociation between exogenous and endogenous temporal orienting of attention. PLoS One6 e14620. doi: 10.1371/journal.pone.0014620.
Roseboom W. Linares D. & Nishida S. (2015). Sensory adaptation for timing perception. Proc. Biol. Sci.282(1805) 20142833. doi: 10.1098/rspb.2014.2833.
Schiffman H. R. & Bobko D. J. (1977). The role of number and familiarity of stimuli in the perception of brief temporal intervals. Am. J. Psychol. 9085–93.
Schroeder C. E. & Lakatos P. (2009). Low-frequency neuronal oscillations as instruments of sensory selection. Trends Neurosci.329–18.
Schroeder C. E. Wilson D. A. Radman T. Scharfman H. & Lakatos P. (2010). Dynamics of active sensing and perceptual selection. Curr. Opin. Neurobiol.20172–176.
Sciutti A. Burr D. Saracco A. Sandini G. & Gori M. (2014). Development of context dependency in human space perception. Exp. Brain Res.2323965–3976.
Senna I. Parise C. V. & Ernst M. O. (2015). Hearing in slow-motion: Humans underestimate the speed of moving sounds. Sci. Rep.514054. doi: 10.1038/srep14054.
Solomon J. A. Cavanagh P. & Gorea A. (2012). Recognition criteria vary with fluctuating uncertainty. J. Vis.122. doi: 10.1167/12.8.2.
Srinivasan M. V. Laughlin S. B. & Dubs A. (1982). Predictive coding: A fresh view of inhibition in the retina. Proc. R. Soc. B Biol. Sci.216(1205) 427–459.
Sternberg S. & Knoll R. L. (1973). The perception of temporal order: Fundamental issues and a general model. In Kornblum S. (Ed.) Attention and performance IV (pp. 629–685). New York, NY, USA: Academic Press.
Stocker A. A. & Simoncelli E. P. (2006). Noise characteristics and prior expectations in human visual speed perception. Nat. Neurosci.9578–585.
Tanaka S. Tsuzaki M. Aiba E. & Kato H. (2008). Auditory sensitivity to temporal deviations from perceptual isochrony: Comparison of the starting point and ending point of acoustic change. Jpn Psychol. Res.50223–231.
ten Hoopen G. G. Hartsuiker R. R. Sasaki T. T. Nakajima Y. Y. Tanaka M. M. & Tsumura T. T. (1995). Auditory isochrony: Time shrinking and temporal patterns. Perception24577–593.
ten Hoopen G. G. Van Den Berg S. Memelink J. Bocanegra B. & Boon R. (2011). Multiple-look effects on temporal discrimination within sound sequences. Atten. Percept. Psychophys.732249–2269.
Treisman M. (1963). Temporal discrimination and the indifference interval. Implications for a model of the “internal clock”. Psychol. Monogr.771–31.
Treisman M. (1984). A theory of criterion setting: An alternative to the attention band and response ratio hypotheses in magnitude estimation and cross-modality matching. J. Exp. Psychol. Gen.113443–463.
van Eijk R. L. J. Kohlrausch A. Juola J. F. & van de Par S. (2008). Audiovisual synchrony and temporal order judgments: Effects of experimental method and stimulus type. Percept. Psychophys.70955–968.
van Rijn H. Gu B.-M. & Meck W. H. (2014). Dedicated clock/timing-circuit theories of time perception and timed performance. In Merchant H. & de Lafuente V. (Eds) Neurobiology of interval timing. Advances in experimental medicine and biology Vol. 829 (pp. 75–99). New York, NY, USA: Springer New York.
van Wassenhove V. Buonomano D. V. Shimojo S. & Shams L. (2008). Distortions of subjective time perception within and across senses. PLoS One3 e1437. doi: 10.1371/journal.pone.0001437.
Vilares I. & Körding K. P. (2011). Bayesian models: The structure of the world, uncertainty, behavior, and the brain. Ann. N. Y. Acad. Sci.122422–39.
Vroomen J. Keetels M. de Gelder B. & Bertelson P. (2004). Recalibration of temporal order perception by exposure to audio-visual asynchrony. Cogn. Brain Res.2232–35.
Vuust P. & Witek M. A. G. (2014). Rhythmic complexity and predictive coding: A novel approach to modeling rhythm and meter perception in music. Front. Psychol.51111. doi: 10.3389/fpsyg.2014.01111.
Wearden J. H. Edwards H. Fakhri M. & Percival A. (1998). Why “sounds are judged longer than lights”: Application of a model of the internal clock in humans. Q. J. Exp. Psychol. B Comp. Physiol. Psychol.5197–120.
Wearden J. H. Norton R. Martin S. & Montford-Bebb O. (2007). Internal clock processes and the filled-duration illusion. J. Exp. Psychol. Hum. Percept. Perform.33716–729.
Wearden J. H. Todd N. P. M. & Jones L. A. (2006). When do auditory/visual differences in duration judgements occur? Q. J. Exp. Psychol.591709–1724.
Wei X. X. & Stocker A. (2012). Efficient coding provides a direct link between prior and likelihood in perceptual Bayesian inference. In Bartlett P. L. Pereira F. Burges C. Botou L. & Weinberger K. Q. (Eds) Advances in Neural Information Processing Systems 25 [Neural Information Processing Systems NIPS 2012 December 3 2012 Lake Tahoe CA USA] (pp. 1313–1321).
Wei X.-X. & Stocker A. A. (2015). A Bayesian observer model constrained by efficient coding can explain “anti-Bayesian” percepts. Nat. Neurosci.1–11.
Yarrow K. Jahn N. Durant S. & Arnold D. H. (2011). Shifts of criteria or neural timing? The assumptions underlying timing perception studies. Consc. Cogn.201518–1531.
Yarrow K. Martin S. E. Di Costa S. Solomon J. A. & Arnold D. H. (2016). A roving dual-presentation simultaneity-judgment task to estimate the point of subjective simultaneity. Front. Psychol.7416. doi: 10.3389/fpsyg.2016.00416.
Yarrow K. Minaei S. & Arnold D. H. (2015). A model-based comparison of tghree theories of audiovisual temporal recalibration. Cogn. Psychol.8354–76.