On the Distinction Between Perceived Duration and Event Timing: Towards a Unified Model of Time Perception


Abstract
Time is a fundamental dimension of human perception, cognition and action, as the perception and cognition of temporal information are essential for everyday activities and survival. Innumerable studies have investigated the perception of time over the last 100 years, but the neural and computational bases for the processing of time remain unknown. Here, extant models of time perception are discussed before a unified model is proposed that relates perceived event timing to perceived duration. The distinction between perceived event timing and perceived duration provides the current for navigating contemporary approaches to time perception. Recent work has advocated a Bayesian approach to time perception. This framework has been applied to both duration and perceived timing, where prior expectations about when a stimulus might occur in the future (prior distribution) are combined with current sensory evidence (likelihood function) in order to generate the perception of temporal properties (posterior distribution). In general, these models predict that the brain uses temporal expectations to bias perception such that stimuli are 'regularized', i.e. stimuli look more like what has been seen before. As such, the synthesis of perceived timing and duration models is of theoretical importance for the field of timing and time perception.

Introduction
Time is a fundamental dimension that pervades all sensory, motor and cognitive processes. Organisms, such as human beings, must quantify time in order to survive and interact with the environment efficiently and successfully. Time is central to our everyday lives, from playing sports, speaking, dancing, singing and playing music to our sleep-wake cycle. Though time is an important dimension of perception, a slight unease may fill the reader when researchers refer to 'time perception'. The fields of colour, object, taste, olfactory, distance, speech and depth perception all investigate tangible physical properties, whereas the dimension of time is invisible and transient. In fact, one could ask whether time even exists at all: for example, theories of relativity suggest that all moments in the past, present and future are equally real, rendering the specious present something of an illusion (Callender, 2010; Davies, 2002; Einstein, 1916; James, 1890). In this article, we review classic and modern approaches to temporal perception, before discussing the data from recent experiments that have shown how the perceived timing of events changes in a way that is consistent with Bayesian Decision Theory. Finally, we call for a theory of time perception that brings together duration and event timing into a single unified framework.

Scales of Time
Time is perceived over a broad scale from microseconds to days, weeks and months (but probably not over sub-nanosecond or geological units of time). At the millisecond range, time is critical for speech generation (Schirmer, 2004), recognition (Mauk & Buonomano, 2004) and motor control (Edwards, Alder, & Rose, 2002). At time perception to estimate the duration between two events, before describing how current models can explain temporal processing. Then, we introduce recent research that suggests the brain uses a Bayesian inferential processing approach to estimate interval timing.

Measuring Perceived Duration
If a mechanism for time perception exists in the brain, then what might its function be? One might argue that an optimal mechanism would try to perceive time as close to veridical (physical) time as possible. Thus, the two main dependent variables in time perception research have historically been the mean accuracy and the variability of temporal estimates. Estimates of a temporal characteristic, such as the duration of an event, are prone to distortion by stimulus properties (Horr & Di Luca, 2015a; 2015b; Thomas & Brown, 1974; Wearden, Norton, Martin, & Montford-Bebb, 2007), complexity (Schiffman & Bobko, 1977), sensory modality (Goldstone & Lhamon, 1974; Wearden, Edwards, Fakhri, & Percival, 1998; Wearden, Todd, & Jones, 2006), and context; as such, the mean of an estimate can deviate from real time. Even when the mean estimate approximates real time, variability in the system may sometimes lead to an event being experienced as shorter or longer than its physical duration (Grondin, 2010).

From Perceived Duration to Perceived Timing
Temporal reproduction and production (Allan, 1979; Goldstone, 1968), verbal estimation (Vierordt, 1868) and the method of comparison (Bald, Berrien, Price, & Sprague, 1942; Dinnerstein & Zlotogura, 1968; Hamlin, 1895; Höring, 1864; Spence, Shore, & Klein, 2001; Wichmann & Hill, 2001; Zampini, Shore, & Spence, 2003) have classically been used to assess the perceived duration of events. Of central interest to this review, however, is the perception of when an event occurs rather than how long something lasts. In order to understand how we could measure the perceived timing of a stimulus, we briefly introduce the psychophysics of relative timing, and how the method of comparison can be used to estimate the time point at which a stimulus is perceived. The word 'perceived' is used here in the loosest sense: the above methods cannot demonstrably show changes in low-level sensory processing of time (Rhodes, 2017). It is equally plausible that the methods we use in time perception are measuring changes in the decisional criteria associated with time (Solomon, Cavanagh, & Gorea, 2012; Treisman, 1984; Yarrow et al., 2015; Yarrow, Jahn, Durant, & Arnold, 2011; Yarrow, Martin, Di Costa, Solomon, & Arnold, 2016).

Psychophysical Methods
Psychophysics is the scientific investigation of the functional interrelations between the physical and phenomenal world (Ehrenstein & Ehrenstein, 1999; Fechner, 1860).
The aim of psychophysics is to quantify and measure subjective experience by determining the relationship between perception and physical stimuli. A central tenet of modern psychophysics is to control and vary the properties of an external stimulus and then ask a participant to report what they have experienced, with as simple a question as possible. For example, one may be interested in the detection of whether a sound is present or not (e.g. did you hear that stimulus?) or, further, in identifying what kind of stimulus characteristic is present (e.g. where was the stimulus?). As such, one might regard detection as the sensing of a stimulus, and identification as a higher-level process that can sometimes fail even when a stimulus has been sensed. For example, if a stimulus is weak and noisy, it may be sensed but a participant may be unable to identify or report a characteristic associated with it.

Measuring Intersensory Synchrony and Temporal Order
We live in a multisensory environment where perception is not simultaneous: it takes time. The perception of synchrony or temporal order is not straightforward, as differences in neural and physical transmission times can cause synchronous events to be perceived as asynchronous, and vice versa. When a distant bolt of lightning illuminates the sky at night and sends out thunderous sound waves, we see the light first and then hear the sound even though both signals were emitted simultaneously.
The discrepancy in the perception of a simultaneous multisensory event is due to the relative differences in the time the signals take to reach the eyes and ears, as light travels much faster than sound (300,000,000 vs. 330 metres per second). To complicate matters further, the processing time for visual stimuli (approx. 50 ms) is longer than that for auditory stimuli (approx. 10 ms), as the chemical transduction of light in the retina is slower than the mechanical transduction of sound waves in the ear (Allison, Matsumiya, Goff, & Goff, 1977; King, 2005; Spence & Squire, 2003; Vroomen & Keetels, 2010).
The distance at which the differences in neural and physical transmission times are negated and signals arrive at the primary sensory cortices synchronously is around 10-15 metres away from the observer and has been called the horizon of simultaneity (Spence & Squire, 2003;Vroomen & Keetels, 2010). However, in interactions between a human observer and a sound/light emitting device at a close distance (~1-3 metres), it has been commonly reported that visual signals have to precede auditory signals for the perception of simultaneity (Vroomen & Keetels, 2010;Zampini et al., 2003;Zampini, Guest, Shore, & Spence, 2005a;Zampini, Shore, & Spence, 2005b).
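The horizon of simultaneity follows directly from the figures quoted above. The short sketch below treats the approximate speeds and processing times from the text as fixed constants (an idealisation; the function and constant names are illustrative only) and solves for the distance at which the auditory and visual signals would reach cortex at the same moment.

```python
# Approximate values quoted in the text, treated here as fixed constants.
SPEED_OF_LIGHT_M_S = 300_000_000.0
SPEED_OF_SOUND_M_S = 330.0
VISUAL_PROCESSING_S = 0.050    # approx. 50 ms
AUDITORY_PROCESSING_S = 0.010  # approx. 10 ms


def arrival_time(distance_m, speed_m_s, processing_s):
    """Physical travel time plus neural processing time, in seconds."""
    return distance_m / speed_m_s + processing_s


def horizon_of_simultaneity():
    """Distance (m) at which auditory and visual arrival times are equal.

    Solves d / v_sound + t_audio = d / v_light + t_visual for d.
    """
    processing_gap = VISUAL_PROCESSING_S - AUDITORY_PROCESSING_S
    slowness_gap = 1.0 / SPEED_OF_SOUND_M_S - 1.0 / SPEED_OF_LIGHT_M_S
    return processing_gap / slowness_gap


distance = horizon_of_simultaneity()
print(f"Horizon of simultaneity: {distance:.1f} m")
```

With these round numbers the result falls inside the 10-15 metre range cited above; different assumed processing times would shift it accordingly.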
The temporal difference between the senses is measured by finding the asynchrony necessary to perceive simultaneity, which is defined as the Point of Subjective Simultaneity (PSS). To measure this difference, one can use psychophysical methods. An extension of simply detecting whether a signal is present or not is to present two stimuli (X and Y) at varying stimulus onset asynchronies (SOAs; X-Y) and force participants to report whether the two stimuli are simultaneous (Exner, 1875; Fujisaki, Shimojo, Kashino, & Nishida, 2004; Spence et al., 2001; Zampini, Guest, Shore, & Spence, 2005a; Zampini, Shore, & Spence, 2005b), or to report the temporal order of the pair (Boenke, Deliano, & Ohl, 2009; Gibbon & Rutschmann, 1969; Jaśkowski, 1992; Yamamoto & Kitazawa, 2001; Zampini et al., 2003).
In the Simultaneity Judgment (SJ) task, participants judge whether X and Y appear to be simultaneous or not. Here, the proportions of 'simultaneous' responses are plotted as a function of SOA (Fig. 1E) and are commonly fitted with a bell-shaped (Gaussian) function. It is important to note, however, that fitting SJ data with a Gaussian function is rather arbitrary and without theoretical justification (see e.g. Schneider & Bavelier, 2003; Sternberg & Knoll, 1973; Yarrow et al., 2011). The assumption is that the peak represents perceived simultaneity (i.e. the PSS), as this is the point at which participants are maximally sure that X and Y are synchronous. A further measure that can be derived from such a function is the standard deviation (SD) of the distribution of responses. The SD may characterize either the relative sensitivity of participants or how liberal their criteria are for perceived simultaneity. Larger SDs suggest that participants either had a larger region of complete insensitivity to order or, alternatively, were more liberal with their criteria for two events being simultaneous (Yarrow et al., 2011).
In temporal order judgments (TOJs), the proportion of 'Y first' responses is generally an increasing function of SOA (Fig. 1E). One usually obtains a sigmoid function where the PSS corresponds to the SOA at which an observer is maximally unsure about the temporal order of the pair of stimuli (50% point). The steepness of the curve at the PSS reflects an observer's sensitivity to temporal order and is expressed as the Just-Noticeable Difference (JND). Generally this measure is taken as half of the difference between the SOAs at the 25% and 75% points; however, other methods such as the Spearman-Kärber may calculate this based on the 14% and 86% points (two sigma; see J. J. Miller & Ulrich, 2001). As such, the JND represents the smallest SOA at which an observer can reliably judge temporal order. A flat curve results in a relatively large JND and thus reflects an observer with low temporal sensitivity, whereas a steep curve results in a smaller JND and thus implies an observer with higher temporal sensitivity.
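As a minimal illustration of how the PSS and JND are read off a TOJ psychometric function, the sketch below assumes that the sigmoid is a cumulative Gaussian whose mean and standard deviation have already been estimated. That choice of function, the helper name `toj_summary` and the example parameter values are assumptions made for illustration, not values from any study cited here.

```python
from statistics import NormalDist


def toj_summary(mu_ms, sigma_ms):
    """Return (PSS, JND) in ms for a cumulative-Gaussian TOJ curve.

    mu_ms and sigma_ms are the (already fitted) mean and SD of the
    cumulative Gaussian describing P('Y first') as a function of SOA.
    """
    curve = NormalDist(mu=mu_ms, sigma=sigma_ms)
    pss = curve.inv_cdf(0.50)  # SOA at the 50% point
    # Half the distance between the 25% and 75% points.
    jnd = (curve.inv_cdf(0.75) - curve.inv_cdf(0.25)) / 2.0
    return pss, jnd


# Hypothetical observers: same PSS, different slopes.
steep_pss, steep_jnd = toj_summary(mu_ms=10.0, sigma_ms=20.0)
flat_pss, flat_jnd = toj_summary(mu_ms=10.0, sigma_ms=40.0)
print(f"steep curve: PSS = {steep_pss:.1f} ms, JND = {steep_jnd:.1f} ms")
print(f"flat curve:  PSS = {flat_pss:.1f} ms, JND = {flat_jnd:.1f} ms")
```

Under this assumption the JND is simply about 0.67 times the fitted SD, which makes explicit why a flatter curve (larger SD) corresponds to lower temporal sensitivity.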

Estimating Perceived Timing using Psychophysics
We have discussed the psychophysical method and how one can measure the relative timing between two sensory events. Here, however, we will discuss how psychophysics may be used to estimate the perceived timing of an event through the PSS (defined as PSS here to avoid neologism, though it could be considered as the Point of Subjective Isochrony (PSI) in the following example). In a first type of task, participants are presented with a sequence of stimuli with the same inter-onset interval (IOI), except that the final stimulus has an anisochrony applied such that it could appear earlier or later than expected (Fig. 1A,B). They are then asked to report whether the final stimulus was on time or, in a different task, whether it was early or late (Li, Rhodes, & Di Luca, 2016).
If we consider standard TOJs, the PSS is only really a measure of the relative asynchrony in the time it takes to process two signals for them to be perceived as simultaneous, not of when an event happened. To measure the perceived event timing of a stimulus, this review advocates presenting a sequence of regularly timed stimuli and pairing the last stimulus with a stimulus from another modality (which is unaffected by the sequence), in order to compare the PSS for stimuli presented on time, earlier than and later than expected. Presently, models of time perception do not predict that the PSS should change as a function of when a stimulus is presented. In the next section, we discuss such models and their predictions before introducing a Bayesian model of perceived event timing that makes explicit predictions.
Figure 1 (caption, partially recovered): ... judgment tasks. The PSS denotes the SOA on each curve at which subjects mostly report that the two stimuli were on time (SJs) or were equally unsure (.5 point) about the temporal order (TOJs) of the pair. In this example, the PSSs are positive, meaning that the final stimulus had to be presented around 10 ms before the visual probe to be perceived as simultaneous. The SD represents the smallest difference (JND) at which participants reliably report that the stimuli were asynchronous, or reliably report their temporal order.

Contemporary Models of Time Perception
The aim of this review is to increase the understanding of the computational mechanisms by which the brain may estimate the perceived timing of events; that is, how can the brain know when is now, when was then and when is next? Extant models of time perception are mostly based on the notion of perceived duration, i.e. how the brain may represent and encode the time between two signals. We now introduce and discuss such contemporary models of interval timing. First, it should be noted that there is a large literature on different taxonomies of timing models. Some researchers have conceptualised models of time in terms of a dedicated neural mechanism for the perception of time (Creelman, 1962; Gibbon, 1977; Gibbon et al., 1984; Treisman, 1963; Wing & Kristofferson, 1973), in contrast to time being an intrinsic product of sensory information processing, where recurrent spatial or activity patterns read out duration without the need for an internal clock (Buonomano, 2009; Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007; Mauk & Buonomano, 2004). Further, dedicated models assume that there are specialized brain regions involved in the representation of temporal information, whilst intrinsic models primarily argue for a timing mechanism distributed over the brain (Ivry & Schlerf, 2008). This review is concerned with two popular classes of dedicated models for the perception of time: entrainment and interval models (Gibbon et al., 1984; Large & Jones, 1999; McAuley & Jones, 2003). We now introduce both before showing how they may be formulated to make predictions about the timing of individual stimuli.

Internal Clock Models
When one is asked 'what time is it?' or 'how long have you been waiting?', it is quite likely that this person will glance at their watch and use it to estimate what the present time is, or how long the wait has been. As such, it is intuitive to think that the brain may use a clock-like mechanism in order to deal with the perception of time.
Internal clock (or interval) models of timing are born out of this analogy, and they conceive of timing as a triad of clock, memory and decision processes (Creelman, 1962; Treisman, 1963). The most notable and influential interval model is Scalar Expectancy Theory (SET; Church, Meck, & Gibbon, 1994; Gibbon, 1977; Gibbon et al., 1984). In the SET model, the internal clock is a pacemaker-accumulator mechanism, where a dedicated pacemaker emits pulses continuously. To represent duration, the accumulator counts the number of pulses between two signals and then stores the count in memory (Fig. 2). The hallmark of the SET model is that as the mean duration of an interval increases, the standard deviation of the duration estimate increases linearly with it; this is often called the 'scalar property' of interval timing. Such a property is an important characteristic of temporal perception and not just a feature of the SET model. It is also closely related to the Weber-Fechner Law (Fechner, 1860), which asserts a logarithmic relationship between physical magnitudes and their representation in the perceptual system, such that the JND between two physical magnitudes is proportional to the absolute physical magnitude. Each interval is maintained in working memory before being passed to a more robust representation in long-term memory. The key point here is that time, in these accounts, is represented as discrete interval durations that are subsequently compared with other intervals at a decision stage (Allman, Teki, Griffiths, & Meck, 2013; Church & Broadbent, 1990; Gibbon et al., 1984). If the number of pulses in one interval is greater than in another, then the former interval is perceived to be longer.
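The scalar property can be made concrete with a toy pacemaker-accumulator simulation. The sketch below assumes, purely for illustration, that the only source of noise is a pacemaker rate that varies from trial to trial in proportion to its mean; this is one simple way of producing scalar variability and is not a full implementation of SET. All names and parameter values (`simulate_counts`, `rate_cv`, the 100 Hz mean rate) are hypothetical.

```python
import random
import statistics


def simulate_counts(duration_s, n_trials=2000, mean_rate_hz=100.0,
                    rate_cv=0.15, seed=1):
    """Accumulated pulse counts for repeated presentations of one duration.

    Assumption: on each trial the pacemaker runs at a rate drawn from a
    Gaussian whose SD is rate_cv * mean_rate_hz, and the accumulator
    stores rate * duration (an idealised, fractional pulse count).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_trials):
        rate = rng.gauss(mean_rate_hz, rate_cv * mean_rate_hz)
        counts.append(rate * duration_s)
    return counts


def coefficient_of_variation(values):
    """SD divided by mean: constant across durations if timing is scalar."""
    return statistics.stdev(values) / statistics.fmean(values)


for duration in (0.5, 1.0, 2.0):
    counts = simulate_counts(duration)
    print(f"{duration:.1f} s: mean count = {statistics.fmean(counts):.1f}, "
          f"SD = {statistics.stdev(counts):.1f}, "
          f"CV = {coefficient_of_variation(counts):.3f}")
```

In this toy model the SD of the stored count grows in proportion to the timed duration, so the coefficient of variation stays constant, which is the signature of scalar timing described above.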
After sufficient exposure to repeated intervals, the representation of the interval in memory becomes more refined and leads to better discrimination performance (Drake & Botte, 1993; Hoopen, Van Den Berg, Memelink, Bocanegra, & Boon, 2011; Miller & McAuley, 2005; Schulze, 1978). Further, the stored intervals can be compared to the current clock reading in order to estimate the onset of a future stimulus.
Figure 2 (caption, partially recovered): ... shows how a Bayesian inference approach to duration estimation may be reconciled with SET (Shi et al., 2013). Sensed evidence (likelihood) is determined from the clock stage of the SET. The prior represents knowledge of previously exposed durations. The posterior is the combination of the prior and likelihood, resulting in an estimation of the duration of an interval.
The SET model does not try to explain any changes in the perceived timing of individual stimuli; rather, it is concerned with changes in the representation of duration. Stimuli, in this sense, are external cues that, after a processing delay, simply delimit intervals. Given this, interval models are also symmetric in the sense that they by and large do not predict any differences in the detection of temporal irregularities according to whether a stimulus is presented earlier or later than the expected time point. For example, if a stimulus is presented earlier than expected, then there should only be a small but predictable difference in its temporal discrimination. The scalar property can, however, be used to predict asymmetric changes in temporal deviation detection by considering changes in the underlying transducer function of physical duration to perceived duration (García-Pérez, 2014), as well as the standard deviation of subjective duration being proportional to the average experienced duration (Church & Deluty, 1977; Church & Gibbon, 1982; Gibbon et al., 1984). Extending this to the idea of anisochrony, a stimulus presented earlier than expected has a shorter perceived duration, and as such a representation with a smaller standard deviation, than a stimulus presented later than expected, meaning the earlier stimulus is easier to detect as irregular.
A recent paper tested the predictions of interval models in event timing, and reported a difference in the detection of temporal irregularity depending on the sign of the anisochrony at which a stimulus is presented.
The study reported that as the number of stimuli in a sequence increased, so did the ability to discriminate temporal irregularity, but only for stimuli presented earlier than expected. Further, differences in the perceived timing of stimuli as a function of their relation to expectation were reported: early stimuli were perceptually delayed whilst late stimuli were perceptually accelerated, so that both appeared closer to expectation. Interestingly, stimuli presented isochronously (on time) were perceptually accelerated (an effect that has also been reported for 'early' or 'late' judgments, Li et al., 2016). Interval models such as the Multiple-Look Model (Drake & Botte, 1993; Miller & McAuley, 2005) cannot account for this pattern of results; however, entrainment models can be formulated to explain at least the acceleration of stimuli presented isochronously.

Entrainment Models
Entrainment models offer an alternative realisation of interval timing. As in interval models, a clock-like mechanism is assumed, but here it is an entrainable oscillator that peaks in amplitude at the expected onset of future stimuli (Large & Jones, 1999; Large & Palmer, 2002; Large & Snyder, 2009; McAuley & Jones, 2003), though phase coincidence (Miall, 1989), recurrence of activity patterns (Buonomano, 2009; Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007), or a Bayesian-like model that combines noisy estimates of duration with a resonance-like mechanism that regularizes sequences of intervals (Burr, Rocca, & Morrone, 2013), have also been proposed as alternative intrinsic entrainment models.
Whilst interval models have mainly been formulated to explain interval timing and to determine which of two intervals is longer (or shorter), entrainment models are better suited to explaining stimulus timing in rhythmic sequences, as internal oscillations gradually adjust to the phase of external rhythms.
Dynamic Attending Theory (DAT) (Jones & Boltz, 1989; Large & Jones, 1999; Large & Palmer, 2002) is one realization of the concept of entrainment in time perception. Here, attention is not distributed evenly over time, but rather ebbs and flows with time's passing. Originally proposed as a model of rhythmic expectancy, DAT proposes that rhythm perception is induced by way of entrainment to external signals. Internal fluctuations in attentional energy (attentional 'peaks') generate temporal expectancies about the onset of future events that can adapt to the period and phase of external events by way of an adaptive internal oscillator (Fig. 3). At the neural level, the perception of regular events has been proposed to originate from neural oscillations that adjust to and resonate with external signals (Henry & Herrmann, 2014; Large & Snyder, 2009; Zanto, Snyder, & Large, 2006). The framework of active sensing (Schroeder & Lakatos, 2009; Schroeder, Wilson, Radman, Scharfman, & Lakatos, 2010), which concerns the fluctuation of excitation/inhibition cycles, can be tied directly to DAT. The high-excitability phase of neural oscillations is thought to be associated with the peak of the attentional pulse and as such to facilitate sensory selection and processing of stimuli that coincide with the peak of an oscillation (Henry & Herrmann, 2014; Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008). Therefore, one can reason that if a stimulus occurs at the peak of an oscillation, in its high-excitability phase, then it should be given a perceptual boost and processed faster.
This effect is similar to prior entry (Spence & Parise, 2010; Sternberg, Knoll, & Gates, 1971), where attended stimuli are processed more quickly than unattended ones. The idea of prioritized processing of attended stimuli exists in the visual cognition domain (Summerfield & Egner, 2009), and such attentional facilitation of perception has been highlighted in a number of studies in the temporal processing literature (Spence et al., 2001; Sternberg & Knoll, 1973; Zampini, Shore, & Spence, 2005b) as well as at the neural level (McDonald, Teder-Sälejärvi, Di Russo, & Hillyard, 2005).
DAT accounts for perceived stimulus timing by considering that humans detect asynchronies between an expected stimulus onset time and the actual stimulus onset time (McAuley, 1995). If the stimulus onset occurs after the expected peak, then the stimulus is perceived as being late, whilst if it occurs before the expected peak, then it is perceived as being early. Intuitively, when a stimulus onset time coincides with the peak of the expected time, then it is perceived as being on time, though as shown above, entrainment models could be formulated to predict an acceleration of attended-to stimuli that occur at the peak of an oscillation. As a consequence of increasing attentional expectancies due to entrainment, sensitivity to temporal deviations improves as a function of increasing sequence length (Barnes & Jones, 2000; Drake & Botte, 1993; McAuley & Kidd, 1998; Miller & McAuley, 2005).
Figure 3 (caption, partially recovered): ... oscillator is a dynamic system that periodically generates temporal expectancies (Jones & Boltz, 1989; Large & Jones, 1999; Large & Palmer, 2002). The oscillations, coupled with pulses of attentional energy at (recurrent) expected time points, given the phase of a rhythm, result in attention being allocated at the expected time point. Discrepancies between the onset time of a stimulus and its expected onset give rise to the detection of temporal irregularities.
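The logic of entrainment and asynchrony detection can be sketched with a deliberately simplified oscillator that corrects its phase and period by a fixed fraction of each asynchrony. This linear error-correction scheme is an assumption made for illustration; it is not the formulation of Large and Jones (1999), and the function name `entrain` and the gain values are hypothetical.

```python
def entrain(onsets_ms, initial_period_ms, phase_gain=0.5, period_gain=0.2):
    """Track a stimulus sequence with a simple adaptive oscillator.

    Returns the list of asynchronies (actual minus expected onset, in ms)
    and the final period estimate. A negative asynchrony means the
    stimulus arrived earlier than expected, a positive one later.
    """
    period = initial_period_ms
    expected = onsets_ms[0] + period  # first expectancy after the first onset
    asynchronies = []
    for onset in onsets_ms[1:]:
        error = onset - expected
        asynchronies.append(error)
        # Adapt the period, then shift the phase towards the actual onset
        # and project the next expected onset one period ahead.
        period += period_gain * error
        expected += phase_gain * error + period
    return asynchronies, period


# An isochronous 500 ms sequence tracked by an oscillator that starts too slow.
isochronous = [i * 500 for i in range(40)]
errors, final_period = entrain(isochronous, initial_period_ms=600)
print(f"first asynchrony: {errors[0]:.1f} ms, last: {errors[-1]:.4f} ms")
print(f"final period estimate: {final_period:.2f} ms")

# Once entrained, a final stimulus presented 40 ms early yields a
# negative asynchrony, i.e. it would be classed as 'early'.
early_final = [i * 500 for i in range(40)] + [40 * 500 - 40]
errors_early, _ = entrain(early_final, initial_period_ms=500)
print(f"asynchrony of early final stimulus: {errors_early[-1]:.1f} ms")
```

With these gains the asynchronies shrink over successive stimuli, which mirrors the claim that expectancies sharpen with increasing sequence length; a different choice of gains could make the oscillator adapt more slowly or become unstable.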
Entrainment models can at least explain the perceptual acceleration of expected stimuli, yet this is still rather speculative. Extant Bayesian models of time perception have been formulated (Jazayeri & Shadlen, 2010; Miyazaki, Nozaki, & Nakajima, 2005; Shi, Church, & Meck, 2013), but primarily for the representation of intervals. We now introduce the idea of Bayesian time perception for duration perception before discussing a contemporary Bayesian account of perceived event timing in rhythmic sequences.

A Bayesian Model of Interval Timing
As mentioned previously, time is subject to various contextual distortions. A seminal example of contextual calibration is Vierordt's law (Lejeune & Wearden, 2009; Vierordt, 1868). When observers are presented with various intervals of different lengths and subsequently asked to reproduce each interval, they tend to overestimate the duration of short intervals and underestimate long ones (Jazayeri & Shadlen, 2010). This is a type of 'central-tendency' effect: participants' estimates of duration migrate towards the mean of the exposed intervals. A prevalent account of such an effect is that the perception of interval duration is derived not only from current sensory information, but also from prior knowledge of the duration of previously exposed intervals (Jazayeri & Shadlen, 2010; Lejeune & Wearden, 2009; Murai & Yotsumoto, 2016; Petzschner & Glasauer, 2011; Petzschner, Glasauer, & Stephan, 2015; Roach, McGraw, Whitaker, & Heron, 2016; Shi & Burr, 2016; Taatgen & van Rijn, 2011). Prior knowledge of the temporal statistics of the environment, in this sense, biases temporal perception.
Under the Bayesian framework, a generative model combines current sensory information (likelihood) with a priori knowledge of the world (prior) in order to give rise to a percept (posterior). The likelihood and prior in this model are weighted by their relative uncertainties (Colas, Diard, & Bessiere, 2010; Fernandes, Stevenson, Vilares, & Körding, 2014; Griffiths & Tenenbaum, 2011; Lucas & Griffiths, 2009; Vilares & Körding, 2011). For example, the percepts of noisier stimuli (those with more uncertain likelihoods) are influenced more by previous sensory experience.
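For the special case in which both the prior and the likelihood are Gaussian, this weighting has a closed form: each source is weighted by its precision (inverse variance). The sketch below uses that standard result to illustrate the central-tendency effect; the Gaussian assumption, the function name `gaussian_posterior` and the example durations are illustrative choices, not parameters taken from the studies cited.

```python
def gaussian_posterior(prior_mean, prior_sd, observed, observed_sd):
    """Combine a Gaussian prior with a Gaussian likelihood.

    Returns the posterior mean and SD. The observation is weighted by
    its precision relative to the total precision, so noisier
    observations are pulled more strongly towards the prior mean.
    """
    prior_precision = 1.0 / prior_sd ** 2
    likelihood_precision = 1.0 / observed_sd ** 2
    total_precision = prior_precision + likelihood_precision
    weight_on_observation = likelihood_precision / total_precision
    posterior_mean = (weight_on_observation * observed
                      + (1.0 - weight_on_observation) * prior_mean)
    posterior_sd = (1.0 / total_precision) ** 0.5
    return posterior_mean, posterior_sd


# Hypothetical prior over exposed intervals: mean 800 ms, SD 100 ms.
short_estimate, _ = gaussian_posterior(800.0, 100.0, 600.0, 100.0)
long_estimate, _ = gaussian_posterior(800.0, 100.0, 1000.0, 100.0)
noisy_short_estimate, _ = gaussian_posterior(800.0, 100.0, 600.0, 200.0)

print(f"600 ms interval perceived as {short_estimate:.0f} ms")
print(f"1000 ms interval perceived as {long_estimate:.0f} ms")
print(f"600 ms interval, noisier measurement: {noisy_short_estimate:.0f} ms")
```

Short intervals are overestimated and long ones underestimated, and doubling the sensory noise moves the estimate further towards the prior mean, which is the qualitative pattern described for Vierordt's law.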
The Bayesian framework has recently been applied to the SET model of interval timing (Shi et al., 2013). The central tenet of such a Bayesian model is that the triad of components of the SET model is translated into the Bayesian framework: the likelihood, prior and posterior are considered analogous to the clock, memory and decision stages (Fig. 2). The clock stage represents the likelihood function, that is, present perceptual information, and is rendered as follows: if an interval delimited by two stimuli has duration $D$, with an allied internal clock count of $C$, which represents the number of 'ticks' accumulated by the time the second stimulus has delimited the interval, then the likelihood function $P(C \mid D)$ is the probability of acquiring the clock count $C$ given an interval of duration $D$. The memory stage is analogous to the prior probability distribution $P(D)$. The prior is a probability distribution that is centred at the objective mean of the sample intervals presented to subjects. As with the likelihood function, the prior's width determines the precision of recent experience: flatter priors indicate greater uncertainty about the mean of the sample intervals, whilst a sharp prior would indicate more precise estimates. In order to arrive at an estimate of perceived duration, according to Bayes' rule, the prior is combined with the likelihood in order to form the posterior distribution $P(D \mid C)$:

$$P(D \mid C) = \frac{P(C \mid D)\,P(D)}{P(C)} \propto P(C \mid D)\,P(D)$$

The posterior distribution is considered analogous to the decision stage of the SET model. Given the posterior, a Bayesian ideal observer chooses an action given a loss function that specifies the relative cost or success of a potential behavioural response (Acerbi, Vijayakumar, & Wolpert, 2014; Acerbi, Wolpert, & Vijayakumar, 2012; Kording & Wolpert, 2004; Wolpert, 2007). If we consider the perception of duration, then the model predicts that noisy sensory estimates of duration are biased towards the mean of the prior probability distribution.
Research on Bayesian interval timing is still in its infancy with regard to the depth of studies investigating such models; however, recent work shows that the central tendency effect is stronger in vision than in the auditory modality (Cicchini, Arrighi, Cecchetti, Giusti, & Burr, 2012). This result can be interpreted in two ways: either the prior is relatively weaker in the auditory modality and, as such, has little influence on the likelihood; or auditory likelihood functions are more precise (steeper) and are not captured by the prior. A recent study claims that priors are modality dependent (Murai & Yotsumoto, 2016); however, the data appear to suggest that priors are in fact not modality dependent, but rather that the precision of duration estimates differs between modalities, given that auditory stimuli have greater reliability in temporal judgments (Ortega, Guzman-Martinez, Grabowecky, & Suzuki, 2014).
Further, recent data also suggest that subjects form a general prior over two distinct sensory contexts (Roach et al., 2016).

Summary of Models
In summary, interval models of duration perception are based on the idea that an internal clock keeps track of time by counting the number of pulses between the onset of one event and the onset of another. When considering the perceived timing of a single stimulus, these models make no explicit predictions about changes in the timing of a stimulus due to the temporal structure of the sequence in which it is embedded. Entrainment models, on the other hand, can be formulated to predict that expected stimuli are processed faster and, as such, perceived earlier. However, entrainment accounts have not been specifically formulated to explain how temporal structure may change the perceived timing of stimuli. In contrast to these accounts of time perception, the Bayesian framework has been applied to several perceptual domains, and has recently been applied to duration estimation (Hartcher-O'Brien et al., 2014; Shi et al., 2013). The Bayesian framework has been used to show how the representation of duration is calibrated in order to make intervals appear more similar to the duration of previously experienced intervals. Like interval-based models, however, the Bayesian account of SET described above only makes predictions about what happens to the representation of intervals and, as such, does not predict any changes to the perceived timing of stimuli in sequences.

Shifting Focus from Perceived Duration to Perceived Event Timing?
Interval and entrainment models were born out of modelling the perception of duration. Numerous studies have sought to understand how the discrimination of temporal irregularities improves as the number of stimuli increases (Drake & Botte, 1993; Halpern & Darwin, 1982; Hoopen et al., 2011; Lunney, 1974; McAuley & Kidd, 1998; Miller & McAuley, 2005). These models predict that the detection of temporal irregularity is symmetric around an expected time point (though applications of SET to temporal bisection and generalization in duration perception do predict asymmetries in deviation detection; García-Pérez, 2014). Di Luca & Rhodes (2016) tested this prediction by asking participants to report whether the last stimulus in a unimodal sequence of isochronous tones of different lengths (3, 4, 5 or 6 stimuli) was 'on time' or 'off time' (Fig. 1A, B). In contrast to the predictions of multiple-look interval models, the increases in irregularity detection were asymmetric: with increasing sequence length, stimuli presented earlier than expected were better discriminated as irregular than stimuli appearing later than expected.
Changes in the perceived timing of the final stimulus could account for this asymmetry. To measure the perceived timing of the final stimulus (rather than perceived isochrony), a sequence of isochronous tones was presented, but this time the final tone was paired with a stimulus in another modality (Fig. 1C, D). From the participants' responses, it was possible to calculate the point of subjective simultaneity (PSS): the audiovisual asynchrony necessary to perceive both stimuli as simultaneous (Fig. 1E). The data show that if the final stimulus is presented a little earlier than expected, its perceived timing is delayed towards its expected timing. Conversely, stimuli presented a little later than expected are perceptually accelerated towards expectation. This shift of stimuli towards the time at which they are expected can be understood as temporal regularization, which is similar to central tendency effects in the time perception literature, such as Vierordt's Law (Lejeune & Wearden, 2009; Vierordt, 1868), where the duration of an interval is biased by the average duration of intervals previously experienced (Jazayeri & Shadlen, 2010; Petzschner et al., 2015).
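The PSS is commonly estimated by fitting a psychometric function to the proportion of responses at each asynchrony and reading off its 50% point. The sketch below illustrates that general idea and is not the fitting procedure of the cited study: it assumes a cumulative Gaussian psychometric function and a simple grid search over its two parameters, and the response counts, grid ranges and function names (`p_tone_first`, `fit_pss`) are made up for illustration.

```python
import math


def p_tone_first(soa, pss, sigma):
    """Cumulative Gaussian: probability of a 'tone first' response at a given asynchrony."""
    return 0.5 * (1.0 + math.erf((soa - pss) / (sigma * math.sqrt(2.0))))


def fit_pss(soas, n_first, n_trials, pss_grid, sigma_grid):
    """Grid-search maximum-likelihood fit; returns (pss, sigma) in ms."""
    best = None
    for pss in pss_grid:
        for sigma in sigma_grid:
            log_lik = 0.0
            for soa, k, n in zip(soas, n_first, n_trials):
                # Clip probabilities so that log() never receives 0.
                p = min(max(p_tone_first(soa, pss, sigma), 1e-9), 1.0 - 1e-9)
                log_lik += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if best is None or log_lik > best[0]:
                best = (log_lik, pss, sigma)
    return best[1], best[2]


# Made-up data: asynchronies in ms (positive = second-modality stimulus after the tone)
# and the number of 'tone first' responses out of 40 trials at each asynchrony.
soas = [-200, -100, -50, 0, 50, 100, 200]
n_first = [0, 2, 6, 14, 24, 32, 39]
n_trials = [40] * len(soas)

pss, sigma = fit_pss(soas, n_first, n_trials, range(-100, 101), range(20, 201, 2))
print(f"PSS = {pss} ms, sigma = {sigma} ms")
```

The made-up counts were rounded from a function with a PSS of 30 ms and a sigma of 80 ms, so the fitted values should fall close to those numbers. A shift in the fitted PSS between conditions is what indexes a change in perceived timing.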
However, in opposition to a central tendency effect, the authors also found asymmetries in the perceived timing of stimuli presented at their expected time (on time): such stimuli are perceptually accelerated away from expectation. Adding weight to this finding, it has recently been reported that the perceived timing of a stimulus is accelerated when it is presented at the expected time point (Li et al., 2016).
This review focuses on the distinction between perceived event timing and perceived duration. The perception of duration has a vast and important literature (Gibbon et al., 1984; Meck, 2003; Treisman, 1963; van Rijn, Gu, & Meck, 2014), but the perception of events occurring at physical time points is less well understood. Interval timing and perceived event timing, though related, differ.
Intervals are delimited by the presence of two stimuli, or by the onset and offset of one stimulus (i.e. a 'filled' duration). However, the SET interval timing model (Gibbon, 1977; Gibbon et al., 1984) does not explicitly state what happens to the timing of either of the stimuli that delimit an interval. When temporal expectations induce changes in the timing of a single stimulus, it is not apparent in SET whether the first and/or second stimulus that delimits an interval is subject to any change in its timing. One might ask: are perceived event timing and duration subserved by different systems, or are they parts of the same system? The distinction between the two becomes blurred when one considers effects such as differences in the perceived duration of intervals, whether filled with a continuous stimulus (Buffardi, 1971; Thomas & Brown, 1974; Wearden et al., 2007) or with a series of regularly or irregularly timed stimuli (Horr & Di Luca, 2014; 2015a). Here, durations filled with a continuous tone or a series of events are perceived as longer than empty intervals. How does the perceived timing of events interact with the perception of duration in order to produce such a phenomenon? It may be that the perceived timing of events feeds forward, in series or in parallel, towards a system that computes the perceived duration of an interval. As such, timing models that explicitly synthesize perceived timing and duration are of theoretical importance.
The perception of the timing between two events is well researched (Fujisaki et al., 2004; Grondin, 2010; Roseboom, 2017; Roseboom, Linares, & Nishida, 2015; Spence, 2007; Spence & Parise, 2010; Vroomen & Keetels, 2010), but there is a distinction between the relative timing of stimuli and their anchored time points. Humans appear to combine estimates of stimuli in a statistically optimal fashion using maximum likelihood estimation (Ernst, 2006; Ernst & Banks, 2002). However, such an approach does not reveal when, in absolute time, a single stimulus is perceived; it reveals only changes in the relationship between, or the integration of, two events. When subjects complete audiovisual temporal order or simultaneity tasks (Di Luca et al., 2009; Fujisaki et al., 2004; Hartcher-O'Brien et al., 2014; Noel, De Niear [...]), they are not asked about the timing of one of the stimuli in the sequence with regard to an absolute timeline and, given this, the exact timing of a single sensory event cannot be known. As such, the following section discusses methods which may be able to measure the perceived timing of events with regard to a physical timeline.
[...] can only start after a short delay due to neural processing. A stimulus cannot be sensed before it is presented, yet there is always the chance that it is perceived a bit later than average due to noise in the sensory system. Prior distributions about the expected timing of future events should also be asymmetric: an organism cannot predict a second event to occur before the first event, so the prior should start at 0 when the first event occurs and continue to rise until the expected timing of the second event. However, due to the anisotropy of time, the second event could still be expected tomorrow, and as such the prior should have a long tail.

A Bayesian Model of Perceived Event Timing
The Bayesian model of perceived event timing makes two predictions. First, the perceived timing of stimuli in an environment where sequences are isochronous should exhibit the temporal regularization effect: early stimuli should be delayed towards expectation whilst late stimuli should be accelerated. Stimuli presented on time, in contrast, are perceptually accelerated, as the mean of the posterior is earlier in time than the mean of the prior and is, as such, reported earlier (Fig. 5). However, for stimuli presented in a random sequence of irregular timings, no temporal expectations should build up. Their perceived timing should therefore not be modulated, as no prior is built. Second, an implicit assumption of the model is that noisier measurements should lead to broader likelihood functions that are captured more by the prior probability distributions. In the next section, we will consider empirical data that support these two predictions.
The perception of regularity has historically been investigated in terms of deviations from its inverse: irregularity (Drake & Botte, 1993; Halpern & Darwin, 1982; Lunney, 1974; McAuley & Kidd, 1998; Repp, 1999; Schulze, 1978; Tanaka, Tsuzaki, Aiba, & Kato, 2008). But what makes a sequence of isochronous tones be perceived as regular? Extant models of rhythm perception assume that if a stimulus is presented in an isochronous structure then it is simply perceived as such. Time, however, is a physical dimension that is often subject to distortion in human perception (Allman & Meck, 2012; Hellström & Rammsayer, 2015; Hoopen et al., 1995; Horr & Di Luca, 2015a; 2015b; Jazayeri & Shadlen, 2010; Lejeune & Wearden, 2009; Petzschner et al., 2015; van Wassenhove, Buonomano, Shimojo, & Shams, 2008; Wearden et al., 2007); so why should a temporal property such as regularity be taken for granted?
Rhodes & Di Luca (2016) investigated whether the temporal environment could influence the perception of regularity. If a sequence contains temporally irregular events, then the perceived timing of a stimulus should not be modulated, as the prior that biases perceived timing cannot be built. The authors found that a regularly timed environment promotes the perception of regularity and changes the perceived timing of stimuli to make slightly irregular stimuli appear more regular. An irregular environment of jittered tones, on the other hand, makes perfectly regular tones embedded within it be perceived as slightly irregular.
These results can be interpreted within the context of the Bayesian model of perceived event timing. In a regular environment, temporal expectations dynamically build after each stimulus and subsequently bias the perception of slightly irregular stimuli to make them appear more regular (Fig. 6B). However, in an irregular environment, temporal expectations are less precise and as such do not build up, and therefore do not bias the perceived timing of stimuli. As the representations are less precise (Fig. 6A), the posterior distribution from which the perception of regularity is taken is wider, and as such there is a chance that an isochronous stimulus is perceived as being irregular. The lack of integration between the prior and likelihood could be due to the large differences between the information present, i.e. isochronous sequences versus highly anisochronous sequences. The system discounts the discrepant source of information (isochronous trials) and does not combine the priors and likelihoods (Banks & Backus, 1998; Ernst & Banks, 2002).
[...] (Rhodes & Di Luca, 2016), a temporal regularization effect was found: stimuli presented earlier than expected are perceptually delayed whilst late stimuli are perceptually accelerated. Importantly, addressing the motivation of this experiment, Rhodes & Di Luca (under review) found that the temporal regularization effect is strongest when the final stimulus is of weak amplitude, providing evidence that the noise characteristics of stimuli influence perceived timing (Fig. 7).
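The two predictions of the model can be sketched numerically by multiplying an asymmetric likelihood with an asymmetric prior on a time grid. This is an illustrative sketch under stated assumptions, not the published implementation: the gamma-shaped forms of `prior` and `likelihood`, every parameter value, and the choice to measure the perceptual shift as the posterior mean minus the likelihood-only mean are assumptions made here for illustration.

```python
import math

EXPECTED = 500.0     # expected time of the stimulus, ms after the previous one (assumption)
PRIOR_SHAPE = 156.0  # larger values give a narrower prior (assumption)
DX = 0.5             # grid step in ms
X_MAX = 400.0        # the grid extends this far after stimulus onset (ms)


def prior(t):
    """Gamma-shaped prior: zero at t = 0, peak (value 1) at EXPECTED, long right tail."""
    r = t / EXPECTED
    return math.exp(PRIOR_SHAPE * (math.log(r) - r + 1.0))


def likelihood(x, scale):
    """Gamma-shaped likelihood over time x since stimulus onset: steep rise, long tail."""
    return x * math.exp(-x / scale)


def perceived_shift(onset, scale):
    """Posterior mean minus likelihood-only mean, in ms (negative = perceived earlier)."""
    xs = [(i + 1) * DX for i in range(int(X_MAX / DX))]
    like = [likelihood(x, scale) for x in xs]
    post = [l * prior(onset + x) for l, x in zip(like, xs)]
    like_mean = sum(x * l for x, l in zip(xs, like)) / sum(like)
    post_mean = sum(x * p for x, p in zip(xs, post)) / sum(post)
    return post_mean - like_mean


for label, onset in (("early", 440.0), ("on time", 500.0), ("late", 560.0)):
    print(f"{label:8s} shift = {perceived_shift(onset, scale=10.0):+.1f} ms")
print(f"on time, noisier likelihood: shift = {perceived_shift(500.0, scale=20.0):+.1f} ms")
```

With these hypothetical parameters, the early stimulus is shifted later (delayed towards expectation), whereas the on-time and late stimuli are shifted earlier (accelerated). Doubling the likelihood scale, which stands in for a noisier or weaker stimulus, enlarges the shift of the on-time stimulus.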

Models of Time Perception
Both interval and entrainment models predict a symmetric increase in temporal discrimination performance as the number of stimuli in a sequence increases (Drake & Botte, 1993; ten Hoopen et al., 2011; Large & Jones, 1999; Large & Palmer, 2002). The Multiple-Look Model (MLM), an interval-based model of temporal discrimination, is based on the idea that as sequence length increases so does the precision of an estimate for each interval (Drake & Botte, 1993; Miller & McAuley, 2005). Similarly, the beat-averaging (Schulze, 1978), diminishing returns (Hoopen et al., 2011) and internal-reference models (Bausenhart, Dyjas, & Ulrich, 2014; Dyjas, Bausenhart, & Ulrich, 2012; Ulrich, 1987) are all based on similar premises (Li et al., 2016). As the factor of change in such accounts is the improved internal representation of an interval, interval-based models do not make explicit predictions about changes in the perceived timing of stimuli (Gibbon, 1977; Gibbon et al., 1984; Shi et al., 2013), as stimuli simply delimit intervals.
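The multiple-look idea can be sketched in a few lines. This is a simplified illustration, not the formulation of any of the cited models: it assumes that each interval in the sequence provides an independent, equally noisy 'look', so that the standard deviation of the averaged estimate falls with the square root of the number of looks. The value of `SINGLE_LOOK_SD` and the counting of looks as the number of stimuli minus one are assumptions.

```python
import math


def interval_sd(single_look_sd, n_intervals):
    """SD of the averaged interval estimate after n independent, equally noisy looks."""
    return single_look_sd / math.sqrt(n_intervals)


SINGLE_LOOK_SD = 40.0  # ms, hypothetical noise of a single interval estimate

for n_stimuli in (3, 4, 5, 6):
    n_intervals = n_stimuli - 1  # assumption: every interval in the sequence counts as a look
    sd = interval_sd(SINGLE_LOOK_SD, n_intervals)
    # The same SD applies to early and late deviations, hence the symmetric prediction.
    print(f"{n_stimuli} stimuli: SD of interval estimate = {sd:.1f} ms")
```

Because the sharpened estimate is the only quantity that changes with sequence length, a scheme of this kind predicts equal improvement for early and late deviations and no change in when the stimuli themselves are perceived.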
A key interval-based model to explain such changes in representation is SET (Gibbon et al., 1984). In this model, an internal pacemaker emits pulses that are accumulated and counted between two events, leading to a duration estimate. In order to account for the modulations in perceived timing, the SET model must be augmented. Rather than being in competition with SET, the model presented here addresses a general issue: how 'global' context effects can be reconciled with 'local' changes in perception. It has been shown that the duration of just the previous stimulus can affect the perceived simultaneity of the next (Van der Burg et al., 2013), and the temporal regularization phenomena reported in this review are a further example.
As such, a general model of time perception that estimates both perceived timing and duration is of paramount importance in order to reconcile such different ways of understanding how humans and animals perceive time.
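The pacemaker-accumulator stage described above can be sketched as a simple simulation. The sketch assumes a Poisson pacemaker with a hypothetical rate of 100 pulses per second and covers only the counting stage; the memory and decision stages of SET are omitted, so it does not reproduce the scalar property. The function names are illustrative.

```python
import random


def count_pulses(duration_s, rate_hz, rng):
    """Count Poisson pacemaker pulses emitted between two events."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(rate_hz)  # waiting time until the next pulse
        if elapsed > duration_s:
            return count
        count += 1


def estimate_duration(duration_s, rate_hz, rng):
    """Convert the accumulated pulse count back into seconds."""
    return count_pulses(duration_s, rate_hz, rng) / rate_hz


rng = random.Random(1)  # fixed seed so that the example is reproducible
RATE_HZ = 100.0         # hypothetical pacemaker rate
estimates = [estimate_duration(1.0, RATE_HZ, rng) for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)
print(f"mean estimate of a 1 s interval: {mean_estimate:.3f} s")
```

Note that the two events serve only to start and stop the count: nothing in this scheme represents when either event is itself perceived, which is why the model must be augmented to account for changes in perceived timing.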
Entrainment models of temporal perception similarly predict symmetrical performance in determining whether stimuli are earlier or later than expected (Henry & Herrmann, 2014; Large & Jones, 1999; Large & Palmer, 2002). Entrainment models are based on the idea that the phase and frequency of temporal patterns adjust to rhythmic events: at the neural level, recurrent activity patterns (Buonomano, 2009; Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007; Laje & Buonomano, 2013) or phase coincidence (Miall, 1989) progressively tune to the frequency and phase of external stimulation. Though not originally formulated to predict changes in perceived timing, entrainment models could appeal to the rhythmic deployment of attention at an expected time point, which would facilitate the processing of on-time stimuli so that they are perceived faster (Rohenkohl, Coull, & Nobre, 2011). However, the data show that early stimuli are delayed towards expectation and, as such, current formulations of entrainment models cannot account for this finding (Buonomano & Merzenich, 1995; Karmarkar & Buonomano, 2007; Large & Jones, 1999; Large & Palmer, 2002; Large & Snyder, 2009; Miall, 1989). These models are principally based on phase correction for the next stimulus in a sequence, not on modifications of a stimulus at the present time, and it is also unclear how they could account for perceptual delay. As with interval models, entrainment accounts of temporal processing should consider the modulation of PSS that results in temporal regularization.
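The logic of phase and period adjustment can be sketched with a generic linear error-correction scheme. This is not the oscillator model of any of the cited papers: the update rule, the gains `phase_gain` and `period_gain`, and the starting period are assumptions chosen for illustration.

```python
def entrain(onsets, period, phase_gain=0.5, period_gain=0.2):
    """Adapt an internal period and expected time to a sequence of onsets.

    Returns the final period and the list of prediction errors
    (onset minus expected time; negative = stimulus earlier than expected).
    """
    expected = onsets[0]
    errors = []
    for onset in onsets:
        error = onset - expected
        errors.append(error)
        period += period_gain * error            # frequency (period) correction
        expected += phase_gain * error + period  # phase correction for the NEXT event
    return period, errors


onsets = [k * 500.0 for k in range(30)]  # isochronous sequence, 500 ms apart
final_period, errors = entrain(onsets, period=450.0)
print(f"final period: {final_period:.2f} ms, last prediction error: {errors[-1]:.3f} ms")
```

In this sketch the error computed for the current stimulus only changes the expectation for the following one. Nothing alters the registered time of the current stimulus, which mirrors the limitation noted above.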
To summarize, the Bayesian model of perceived timing can explain the delay of early stimuli as well as the acceleration of on-time and later-than-expected stimuli.
Interval models do not make any explicit predictions about changes in the perceived timing of stimuli and as such cannot account for our data. However, if one considers recent Bayesian interval timing models (Jazayeri & Shadlen, 2010), a maximum-likelihood estimator based on a Gaussian conditional probability would accelerate the temporal perception of events due to the asymmetry of the likelihood function. Entrainment accounts could be formulated to explain the acceleration of on-time stimuli; however, they cannot explain the delay of early stimuli towards expectation.

Impact to Sensory Processing Theories
Sensory processing involves three separate stages: (1) detecting incoming information, (2) representing incoming information and (3) interpreting that representation (Wei & Stocker, 2015). Two distinct accounts exist to explain these processes. The efficient coding hypothesis explains how limited neural resources lead to efficient representations that are optimized with regard to the natural statistics of the environment (Barlow, 1961; Lewicki, 2002; Simoncelli, 2003; Wei & Stocker, 2015). The role of primary sensory processing is, as such, to reduce the inefficiency and redundancy in representing a raw image by recoding a representation into an efficient form (Huang & Rao, 2011). However, in this framework, it is difficult to determine how perceptual biases may arise. Built on such a theoretical basis, the predictive coding hypothesis suggests that sensory processing is the result of combining current sensory information with prior knowledge about the world (Friston & Kiebel, 2009; Helmholtz, 1963; Kersten, Mamassian, & Yuille, 2004; Knill & Richards, 1996; Ma, Beck, Latham, & Pouget, 2006; Srinivasan, Laughlin, & Dubs, 1982) according to Bayes' (1763) rule. Such an information-processing approach can explain the myriad data that show consistent perceptual biases (Ernst, 2006; Ernst & Banks, 2002; Knill & Richards, 1996; Körding & Wolpert, 2004; Mamassian et al., 2002; Petzschner et al., 2015; Wolpert & Ghahramani, 2000). Recently, however, a unified model has been proposed that reconciles a predictive coding (Bayesian) approach with efficient coding of a sensory representation (Wei & Stocker, 2012) by constraining priors and likelihoods with natural stimulus statistics. Recent data show how sensory information may be represented at the neural level by constraining the likelihood function with the anisotropy of time. The authors introduce the idea that the likelihood function is necessarily asymmetric in the temporal dimension, with a steep onset and a long tail.
The asymmetric likelihood function explains how stimuli that are presented on time are perceptually accelerated, an anti-Bayesian effect. Interestingly, a recent article has shown concurrent repulsions away from the peak of the prior through similarly asymmetric likelihoods and priors (Wei & Stocker, 2012). The relaxation of the assumption of normality is thus of theoretical importance, as up until now probability distributions have generally been described as Gaussians in the Bayesian framework (Ernst, 2006; Ernst & Banks, 2002; Knill & Richards, 1996; Miyazaki et al., 2005; Sciutti, Burr, Saracco, Sandini, & Gori, 2014; Shi et al., 2013), though asymmetric distributions have been used (e.g. Acerbi et al., 2012; Jazayeri & Shadlen, 2010).
Behavioural data hint that the brain optimizes perception in order to process sensory information more efficiently [...] (McNeill, 1995; Repp, 2005).
Under the hypothesis that noisier signals lead to shallower likelihood functions, such signals should be captured more by the prior than less noisy signals. This sort of effect has also been found in the context of human speed perception, whereby a broader likelihood function results in speed estimates that are more dominated by the prior (Senna, Parise, & Ernst, 2015; Stocker & Simoncelli, 2006). Given that this effect has been translated into the domain of temporal perception, one could posit that it is applicable to other perceptual modalities and is, as such, perception-general.

Directions for Future Research
In order to continue to validate the proposed Bayesian model of perceived timing, the model must be tested and subsequently modified in order to reflect the findings of future work. In this section, I will discuss explicit predictions based on this model to stimulate ideas for future research.

Predictions
To elicit temporal regularization effects, single sequences of isochronous events or intervals are presented in order to build up prior expectations, yet in the environment, sequences of repeated events are often not isochronous. In almost all forms of music around the world, completely isochronous melodies are rare: music has distinct and complex temporal patterns operating at different hierarchical levels and time signatures (Large & Palmer, 2002; Vuust & Witek, 2014). Syncopated rhythms, for example, carry expectations about the future timing of events, yet are not completely isochronous (Fitch & Rosenfeld, 2007). How can the brain predict such events in the context of a unified model if that model is based on the isochronous presentation of stimuli? Models, at present, would predict that a syncopated (and, as such, deviant) stimulus would be biased towards the expected timing or interval, yet it seems that when a stimulus is obviously earlier than expected, we perceive it as such.
To clarify this issue, the extent of the regularization effect must be mapped over a whole range of anisochronies. One may predict that at a certain magnitude of anisochrony, the regularization effect goes away. If this is the case, it may mean that a hierarchical prior takes over and modulates the tendency to regularize deviant stimuli. Further, one could also imagine another prior that is based on the rhythm and syncopation of a sequence, which also influences the lower-level regularization and, as such, the combination of the prior and likelihood.
The prior is built after the presentation of isochronous events or intervals, yet sometimes events may not be sensed, or may not occur at all. In the active sensing framework, entrained oscillations remain phase-consistent after the end of the external stimulation, yet decay after some time (Lakatos et al., 2008; Schroeder & Lakatos, 2009). In the same way, does the prior decay over time, or does it stop influencing perception the moment a beat is missed? To test this, one could think of an experiment where the final stimulus is omitted and instead presented at T+1, T+2, T+3 etc., where T is the timing of the final stimulus. If the prior is still present (yet decayed), it should still modulate perceived timing, but the effect should diminish as the number of missed beats increases.
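One way to make this prediction concrete is a Gaussian approximation in which the prior broadens with every missed beat. The sketch below is purely hypothetical: the decay rule in `prior_sd_after_misses`, the Gaussian forms and all parameter values are assumptions introduced here for illustration, not part of the model described above.

```python
import math


def prior_weight(like_sd, prior_sd):
    """Weight of the prior in a Gaussian combination of prior and likelihood."""
    return like_sd**2 / (like_sd**2 + prior_sd**2)


def prior_sd_after_misses(base_sd, n_missed, growth_sd):
    """Hypothetical decay rule: the prior variance grows with every missed beat."""
    return math.sqrt(base_sd**2 + n_missed * growth_sd**2)


LIKE_SD, BASE_PRIOR_SD, GROWTH_SD = 30.0, 40.0, 40.0  # ms, illustrative values
ASYNCHRONY = -40.0  # final stimulus presented 40 ms earlier than expected

shifts = []
for n_missed in range(4):
    w = prior_weight(LIKE_SD, prior_sd_after_misses(BASE_PRIOR_SD, n_missed, GROWTH_SD))
    shift = -w * ASYNCHRONY  # perceptual delay towards the expected time (ms)
    shifts.append(shift)
    print(f"missed beats: {n_missed}, predicted delay: {shift:.1f} ms")
```

Under this decay rule the predicted delay of an early stimulus shrinks gradually with each missed beat. A prior that stops operating as soon as a beat is missed would instead predict an abrupt drop to zero after the first omission, so the two alternatives are distinguishable in principle.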
Moving away from the perception of audio or visual stimuli, the model could be extended to the realm of motor control. It has been consistently shown that humans synchronize actions such as finger tapping or dancing to rhythmic events (Elliott, Welchman, & Wing, 2009; Elliott, Wing, & Welchman, 2010; 2014; Repp, 1999; Repp & Su, 2013). A consistent finding in such studies is that the time of a tap (i.e. the time at which a finger touches a surface) precedes the onset of the isochronous metronome beat. The model could account for such a negative error, as it predicts that isochronous events are actually perceived earlier than expected, resulting in earlier taps. Further, how should an observer know when to initiate a tap? Due to the build-up of temporal expectations via the stimulation of a metronome, observers can anticipate the timing of future taps and use this information to initiate a movement.

A Unified Model of Time Perception?
What should a unified model of time perception look like? A great deal of literature has been dedicated to the perception of time and, in particular, interval timing (Creelman, 1962; Gibbon et al., 1984; Matell & Meck, 2004; Meck, 2005; Merchant & de Lafuente, 2014; Treisman, 1963). The perception of duration has been described with the SET model and, in this framework, has been tied to thalamo-cortico-striatal circuitry (Matell & Meck, 2004). Contextual calibration effects on perceived duration have been modelled in the Bayesian framework, whereby duration estimates are biased towards the mean of previously experienced intervals (Jazayeri & Shadlen, 2010; Miyazaki et al., 2006; Shi et al., 2013). Context effects are limited by the fact that the temporal statistics of the environment take a long time to learn (Acerbi et al., 2012). The motivation of current work from our lab was to re-focus temporal perception from the duration dimension to perceived timing, as well as to show how the perception of time can be biased rapidly on a trial-to-trial basis. Therefore, it seems of some importance that future work should seek to link together the existing frameworks for perceived duration and perceived event timing. As both event timing and contextual calibration of perceived duration (Jazayeri & Shadlen, 2010; Miyazaki et al., 2006; Shi et al., 2013) have been described in the Bayesian framework, a neural model of Bayesian inference to explain both perceived duration and timing could lead to a unified and neurophysiologically plausible account of time perception.
In order to translate the Bayesian model of perceived event timing to the neural level, one must first consider that such a model is not in competition with interval-based accounts of time perception that have tried to link the internal clock model with the Bayesian framework (Creelman, 1962; Gibbon et al., 1984; Jazayeri & Shadlen, 2010; Petzschner et al., 2015; Shi et al., 2013; Treisman, 1963). Rather, the model should be synthesized with such accounts in order to arrive at a general model of time perception. A hierarchically organized Bayesian neural inference model, in which low-level population codes encode the perceived timing of stimuli and feed forward to a higher level that encodes the duration between two stimuli, may offer a way of harmonizing perceived duration and timing.

Conclusions
During the last 150 years, great steps have been made in understanding how the human brain may perceive time. The advent of the psychophysical approach to studying perception has allowed researchers to precisely measure temporal properties of stimuli and as such, a large body of research has sought to understand the mechanisms underpinning temporal-perceptual phenomena. Contemporary models of time perception consider temporal processing from the perspective of duration. A recent Bayesian model of perceived timing re-focuses temporal perception research towards an event-based outlook. The model sets the scene to unify temporal processing accounts at neural, computational and behavioural levels, with the future goal of leading to a general model of time perception that is neurobiologically plausible, grounded in computational principles and accounts for both interval and event timing.