Beats are among the basic units of perceptual experience. Produced by regular, intermittent stimulation, beats are most commonly associated with audition, but the experience of a beat can result from stimulation in other modalities as well. We studied the robustness of visual, vibrotactile, and bimodal signals as sources of beat perception. Subjects attempted to discriminate between pulse trains delivered at 3 Hz or at 6 Hz. To investigate signal robustness, we intentionally degraded signals on two-thirds of the trials using temporal-domain noise. On these trials, inter-pulse intervals (IPIs) were stochastic, perturbed independently from the nominal IPI by random samples from zero-mean Gaussian distributions with different variances. These perturbations produced directional changes in the IPIs, which either increased or decreased the likelihood of confusing the two pulse rates. In addition to affording an assay of signal robustness, this paradigm made it possible to gauge how subjects’ judgments were influenced by successive IPIs. Logistic regression revealed a strong primacy effect: subjects’ decisions were disproportionately influenced by a trial’s initial IPIs. Response times and parameter estimates from drift-diffusion modeling showed that information accumulates more rapidly with bimodal stimulation than with either unimodal stimulus alone. Analysis of error rates within each condition suggested consistently optimal decision making, even with increased IPI variability. Finally, beat information delivered by vibrotactile signals proved just as robust as information conveyed by visual signals, confirming vibrotactile stimulation’s potential as a communication channel.
When a brief stimulus repeats periodically on a short time scale, its repetition can generate perceptual experiences known as beats. Although the experience of a beat is most commonly associated with repetition of musical sounds, a comparable experience can also be reliably generated by repeated presentation of brief visual stimuli (Grahn, 2012; Guttman et al., 2005; Patel et al., 2005). However, beat perception with modalities other than vision and audition has attracted little attention. For example, with a few notable exceptions (e.g., Frings and Spence, 2010, 2011; Meng et al., 2015), researchers have ignored beat perception with vibrotactile stimulation. This neglect is somewhat surprising. After all, the human skin, a target for vibrotactile stimulation, is the largest of our sensory receptor surfaces, ∼1.7 m² in average adults (Bender and Bender, 1999). As a result, vibrotactile beats afford a potentially valuable communication channel.
Although vibrotactile signals are widely used in cellphones and vehicles, the ability of such signals to transmit information more nuanced than simple alerts remains to be examined. Additionally, little is known about the effect of vibrotactile stimulation when paired with stimulation from one or more other sensory modalities. For example, although rate perception is enhanced when auditory and visual stimuli are paired (Levitan et al., 2015; Maddox et al., 2015; Recanzone, 2003), would a comparable benefit arise from a stimulus pairing that includes vibrotactile stimulation?
To investigate a sensory signal’s information-carrying capacity, one can measure the robustness of information transmission when the signal is perturbed by some random variable. Under real world conditions, random variation can diminish signal reliability and undermine perception; for example, variation in the skin’s contact with a vibrating source introduces variability into signals received by the skin. In controlled research settings, the addition of random variation, or noise, to a stimulus can yield valuable insights into the computations that support perception (Allard et al., 2015). Measurements of visual detection or identification in the presence of noise have been used to quantify vision’s efficiency (Gold et al., 1999) and to identify the particular features that subjects use to perform a task (Allard et al., 2015; Sekuler et al., 2004). Would systematic degradation help us understand more about the robustness of vibrotactile signals?
To address these questions, we had subjects discriminate between sequences of pulses delivered at two different mean rates, 3 Hz and 6 Hz. Pulses were presented in one of three modes: as vibrotactile stimuli delivered to the palms and fingers, as visual stimuli generated by turning on and off a spot of light, or as concurrent stimuli in both vibrotactile and visual modalities. To test signal robustness, external time-domain noise was added to the pulses in a sequence. Because we focused on the perception of beats, a purely temporal feature, we employed a form of noise that was defined in the temporal domain. Beats within any modality were generated by identical pulses, which forced subjects to base perceptual judgments solely on the temporal gaps between pulses, the inter-pulse intervals (IPIs). We inserted random temporal noise into our stimuli by adding a zero-mean Gaussian random variate to each IPI in a pulse sequence. This rendered the temporal separations between successive pulses stochastic. For these stochastic stimuli, IPIs in a sequence varied independently. This same basic operation was applied to stimuli from each of the three different modalities.
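The noise manipulation described above can be sketched in a few lines of Python. This is an illustration only, not the code used to drive the tablet. The function names, the `noise_sd_ms` parameter (the standard deviation of the Gaussian, in ms), and the clipping of negative intervals at `floor_ms` are assumptions introduced here; the text does not say how negative intervals, if any, were handled.

```python
import random


def stochastic_ipis(nominal_ipi_ms, noise_sd_ms, n_ipis, rng=None, floor_ms=0.0):
    """Return n_ipis inter-pulse intervals, each perturbed independently."""
    if rng is None:
        rng = random.Random()
    ipis = []
    for _ in range(n_ipis):
        # Add a zero-mean Gaussian variate to the nominal IPI.
        perturbed = nominal_ipi_ms + rng.gauss(0.0, noise_sd_ms)
        # Assumption: an interval cannot be negative, so clip at floor_ms.
        ipis.append(max(floor_ms, perturbed))
    return ipis


def pulse_onsets(ipis_ms, pulse_ms=33.0):
    """Onset time (ms) of each pulse, with the first pulse at time 0.

    An IPI is measured from the end of one pulse to the onset of the next,
    so successive onsets are separated by pulse_ms + IPI.
    """
    onsets = [0.0]
    for ipi in ipis_ms:
        onsets.append(onsets[-1] + pulse_ms + ipi)
    return onsets
```

Setting `noise_sd_ms=0` reproduces a deterministic sequence, which corresponds to the trials without added noise. Passing a seeded `random.Random` instance makes a stochastic sequence reproducible.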
We had theoretical as well as applied reasons for examining how rate discrimination was affected by inserting random intervals into the stimuli. Changes in behavioral measures associated with increased variability inform the robustness of sensory signals (Pelli and Farell, 1999). If we observed different effects of temporal variability on task performance among modalities, we could infer differences in signal robustness among modalities as well. We wanted to benchmark the robustness of information carried by vibrotactile signals against the robustness of visual information, in particular. With signal detection efficiencies as high as 83% (Burgess et al., 1981), visual information provides a ‘gold standard’ for signal robustness and serves as a baseline for comparisons. Additionally, as explained later, we wanted to exploit random variation in the IPIs for a detailed analysis of how successive pulses were processed. Sequential sampling models of decision making vary with respect to their assumptions about evidence accumulation, some assuming a constant rate, and others emphasizing the role of later evidence, akin to a recency effect (Evans et al., 2017). Specifically, we asked whether all IPIs in a stochastic pulse sequence contributed equally to a response, as assumed by drift-diffusion models, or whether specific IPI positions carried additional weight. Finally, we wanted to assess whether designers of signaling devices should consider combining visual and vibrotactile stimuli to optimize information transfer. In particular, we wanted to determine whether processing of visual signals would benefit from coordination with vibrotactile signals.
Twenty-five subjects (13 female, 10 male, 2 declined to identify; mean age = 19.1 years, SD = 1.2) served in the experiment. Previous rate perception work from our laboratory (Bushmakin and Sekuler, 2016) demonstrated that such sample sizes are sufficient for observing significant effects. All subjects had best-corrected Snellen acuity 20/40 or better. Experimental procedures were approved by Brandeis University’s Institutional Review Board and were conducted in accordance with the Declaration of Helsinki. All subjects gave written informed consent prior to participation.
All stimuli were delivered via a handheld computer tablet (Samsung Note 10; Fig. 1). Subjects held the tablet bimanually so that on average it was ∼41 cm from their eyes.
Stimuli were sequences of pulses delivered either as visual (V), vibrotactile (vT), or concurrent visual and vibrotactile (V-vT) pulses. In all three of these conditions, each pulse in a sequence was 33 ms in duration, but the IPIs, defined as the time elapsed from the end of one pulse to the onset of the next, varied as explained below. Stimuli with two different mean pulse rates, 3 Hz and 6 Hz, were randomly intermixed; pilot testing showed that these two rates would be reliably, though imperfectly, discriminated. Subjects categorized the rate of each stimulus as either ‘slow’ (3 Hz) or ‘fast’ (6 Hz). Hereafter, we refer to each variable (modality, noise, and rate) in small capitals to differentiate the experimental variable from the broader meaning of each term.
Vibrotactile stimulation was produced by the tablet’s built-in motor whose rotating eccentric load delivered vibrations to the subjects’ fingers and palms. Located behind the tablet’s touch screen, the motor rotated, and therefore vibrated, at 250 Hz. Each 33-ms vibrotactile stimulus pulse was generated by gating the motor’s rotation on and off. The strength of each vibrotactile stimulus pulse was well above detection threshold and did not vary throughout the task. Visual pulses were generated by turning on and turning off a small circular Gabor patch presented at the center of the tablet screen. At the average viewing distance, each Gabor subtended ∼1.4°, and had a peak luminance of 125 cd/m² on a steady background luminance of 52 cd/m². As with vibrotactile pulses, visual pulse strength was constant throughout the experiment. For V-vT stimuli, the concurrent visual and vibrotactile pulses were synchronized at the start of each trial. Previous work from our laboratory showed that vibrotactile input can either aid or interfere with processing of concurrent visual stimulation, depending upon whether information received from one modality reinforced or contradicted information received from the other (Bushmakin and Sekuler, 2016). To control for such complications, concurrent visual and vibrotactile pulses in our bimodal stimuli were always presented at the same rates and in synchrony.
Nominal IPIs for the two stimulus rates were set at 300 ms for 3-Hz sequences (33 ms pulse + 300 ms IPI = 333 ms cycle length) and 133 ms for the 6-Hz sequences (33 ms pulse + 133 ms IPI = 166 ms cycle length). As mentioned before, each IPI in a sequence could be independently perturbed by a new random variate,
Our choice to vary
To control for the effect of switching attention between modalities (Boulter, 1977), V, vT, and V-vT stimuli were presented in separate test blocks, of 130 trials each. For V-vT stimuli, concurrent visual and vibrotactile pulses were always synchronized with one another. Departing from the approach others have taken with bimodal stimuli (Varghese et al., 2017), we allowed subjects to base their decisions about V-vT pulse rate on whichever modality they preferred: visual or vibrotactile. The 130 trials in each block were randomized by mean pulse rate (3 Hz or 6 Hz) and noise level (
Figure 2A depicts overall trial structure. A fixation cross, displayed at the tablet’s center for 33 ms, alerted the subject that a trial was about to begin. Stimulus pulses began 500 ms later and continued until the subject made a categorization response by tilting the handheld tablet either downward, away from themselves, or upward, toward themselves. A downward tilt ⩾17° meant that the pulse train was categorized as ‘slow’ (3 Hz) by the subject; the same amount of tilt upward meant that the pulse train was categorized as ‘fast’ (6 Hz). Differences in latency and speed of movement of the wrist’s flexor and extensor muscles caused the average times to rotate the tablet in each direction to differ. The failure to counterbalance how judgment categories (‘slow’ and ‘fast’) mapped onto directions of tablet rotation, upward or downward, foreclosed comparing judgment times for 3- and 6-Hz stimuli, but had no other effect on data analysis or interpretation. On each trial, subjects received pulses until they responded by tilting the tablet. While there was no limit to the number of pulses subjects could receive, they were encouraged to respond as quickly as possible. Whenever a response time exceeded 1600 ms, the tablet screen displayed a message encouraging more rapid responses on subsequent trials.
Response time (RT) was defined as the time elapsed from the onset of the trial’s pulse sequence to the time of the subject’s response. For the entire experiment, a reminder remained on the screen telling subjects how to tilt the tablet for the two different responses. Subjects received feedback for their responses only during each test block’s first 10 trials. To prevent any auditory signals generated by the tablet’s vibrations from impacting subjects’ responses, throughout the experiment subjects heard a masking sound, white noise delivered over noise-canceling headphones.
2.5. Data Analysis
Of the 25 subjects who were tested, data from two were excluded. One excluded subject consistently performed at or below chance (percent correct ⩽ 50%); the second excluded subject consistently gave abnormally long response times (that is, up to five seconds long). For the remaining 23 subjects, data for the first 10 trials in each block were treated as practice and were discarded. We analyzed RTs only from trials on which the subject’s response was correct. In addition, we excluded RTs that were outliers relative to a subject’s entire set of correct RTs. Specifically, we identified each subject’s first and third quartiles as well as their interquartile range (IQR), and then excluded trials when RT < Q1 − 1.5 × IQR, or when RT > Q3 + 1.5 × IQR (Tukey, 1977). Using these criteria, 4.35% of all correct trials were removed. After such trials were excluded, we summarized a subject’s RTs in each condition by that subject’s median RT. Data from all test trials were used for analyses of accuracy.
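The outlier rule just described (Tukey's fences applied to each subject's correct-trial RTs, followed by a median) can be sketched as follows. The function names are hypothetical, and the quartile convention is an assumption: several definitions of Q1 and Q3 are in common use, and the text does not state which one was applied.

```python
import statistics


def tukey_filter(rts_ms, k=1.5):
    """Keep RTs inside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: "inclusive" quartiles; other conventions shift the
    # fences slightly and can change which trials are dropped.
    q1, _, q3 = statistics.quantiles(rts_ms, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [rt for rt in rts_ms if lower <= rt <= upper]


def median_rt(rts_ms):
    """Summarize a set of correct-trial RTs by the median of the retained RTs."""
    return statistics.median(tukey_filter(rts_ms))
```

As described above, the fences are computed from a subject's entire set of correct RTs, so `tukey_filter` would be applied once per subject before the retained trials are split by condition and summarized by their median.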
2.5.2. Accuracy and Response Time Analyses
Separate within-subject factorial ANOVAs were used to test how accuracy and response time were affected by noise and modality. Each ANOVA incorporated two sets of contrasts for planned comparisons: polynomial coefficients were used to dissect the ordered effect of noise, and Helmert coefficients were used to dissect the unordered categorical variable of modality.
2.5.3. Equivalence Testing
Signal detection theory defines an optimal decision criterion as one that elicits equal error rates across experimental conditions (Macmillan and Creelman, 2005). In our study, an optimal decision criterion would have yielded equal probabilities for responding ‘slow’ when the stimulus rate was fast [Pr(‘slow’ | fast stimulus)] and for responding ‘fast’ when the stimulus rate was slow [Pr(‘fast’ | slow stimulus)]. We did not expect to observe a response bias in our study, as both stimulus rates occurred equally often and carried equal payoff incentives (Macmillan and Creelman, 2005). The theoretical criterion IPI that would have fulfilled the equation Pr(‘fast’ | slow stimulus) = Pr(‘slow’ | fast stimulus) differed among the three noise levels, and each is shown by a dashed vertical line in the panels of Fig. 3.
To determine whether subjects actually used the optimal criteria described in the preceding paragraph, we tested the proposition that pairs of error rates, one for 3-Hz stimuli and one for 6-Hz stimuli, would be sufficiently similar for each noise level as to be reasonably considered equivalent. Equivalence testing was performed using the Two One-Sided Tests (TOST) method (Lakens, 2017) implemented in the TOSTER package in R. The TOST procedure determines whether an observed effect is small enough to be considered statistically equivalent to zero. The approach involves specifying a set of lower (
2.5.4. Drift-Diffusion Modeling
The drift-diffusion model (DDM) framework provides a comprehensive analysis of subjects’ decision-making processes, portraying decisions as the product of a stochastic process that accumulates information over time toward one of two response criteria (Ratcliff and McKoon, 2008). In our case, the two response criteria were either ‘slow’ or ‘fast’. The basic model includes three parameter estimates that characterize the two-choice decision-making process: (i) drift rate, the rate at which evidence accumulates in favor of one response; (ii) decision criterion, the amount of evidence needed to make a response; and (iii) non-decision time, described as the portion of response time allocated to both stimulus encoding and motor response. We used the Python-based toolbox HDDM (see Wiecki et al., 2013) to estimate these parameters.
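To make the three parameters concrete, the sketch below simulates a single two-choice trial with a simple Euler scheme. It is not the HDDM estimation procedure, which fits these parameters to data rather than simulating from fixed values. The function name, the time step, the unit noise scale, and the unbiased starting point midway between the two boundaries are all assumptions made for illustration.

```python
import random


def simulate_ddm_trial(drift, criterion, non_decision_s, rng,
                       dt=0.001, noise_sd=1.0, max_t=10.0):
    """Simulate one trial; return (choice, rt_s).

    choice is "upper" or "lower", or None if neither boundary is
    reached within max_t seconds of decision time.
    """
    # Assumption: unbiased start, halfway between boundaries at 0 and criterion.
    evidence = criterion / 2.0
    t = 0.0
    step_sd = noise_sd * dt ** 0.5  # diffusion noise scales with sqrt(dt)
    while t < max_t:
        # Evidence changes by the drift plus Gaussian noise on each step.
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if evidence >= criterion:
            return "upper", t + non_decision_s
        if evidence <= 0.0:
            return "lower", t + non_decision_s
    return None, None
```

In this sketch a larger `drift` shortens decision time and raises the proportion of "upper" responses, a larger `criterion` lengthens decision time while reducing errors, and `non_decision_s` simply shifts every RT by a constant.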
Using Markov Chain Monte Carlo sampling, we generated 10 000 samples from the joint posterior distribution of all model parameters and discarded the first 1000 samples as burn-in. For each parameter, Bayesian methods in HDDM compared the posterior distributions estimated for each modality and noise condition (Kruschke, 2014). To assess model fit, we performed posterior predictive checks at the individual subject level. For each subject and each of the nine conditions, 500 parameter values were sampled from the posterior distributions and used to simulate a different data set for each parameter value. To judge the model fit in terms of accuracy as well as RT, both simulated and empirical RT distributions were error-coded such that RTs on correct trials were given a positive sign and RTs on incorrect trials were given a negative sign. Simulated RT distributions were compared to empirical RT distributions using two-sample Kolmogorov–Smirnov tests for each subject and condition (Siegel, 1956).
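The error-coding step and the comparison of simulated with empirical distributions can be illustrated with the short sketch below. The function names are hypothetical. The second function computes only the two-sample Kolmogorov–Smirnov statistic D, that is, the largest gap between the two empirical cumulative distributions; it does not compute the associated p value, for which a statistics library would normally be used.

```python
def error_code(rts, correct):
    """Give RTs on correct trials a positive sign and RTs on errors a negative sign."""
    return [rt if ok else -rt for rt, ok in zip(rts, correct)]


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every observation <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d
```

Because errors are mapped to negative values, a single comparison of the signed distributions is sensitive to mismatches in accuracy as well as in RT shape.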
2.5.5. Modeling the Influence of Individual IPIs
The idea that subjects accumulate information over successive samples or ‘looks’ (Holt and Carney, 2005; Moore, 2003) is not novel; it is actually central to evidence-accumulation models. However, there is no principled reason to assume that successive samples are integrated without regard to their serial order. In fact, quite the opposite may be true: sequential sampling decision strategies that give extra weight to early samples can outperform their fixed sample-size counterparts (Wald, 1947). The power of a sequential sampling strategy and its demonstrated usefulness in diverse decision-making domains (Ratcliff et al., 2016) led us to examine how subjects processed the IPIs that comprised a stimulus sequence. Our analysis exploited the fact shown in Table 1: with stochastic stimuli, each individual IPI should have a directionally predictable influence on subjects’ responses, either promoting correct judgments or promoting errors. Knowing how each random variate might influence subjects’ judgments allowed us to test whether all IPIs in a series of IPIs contributed equally to the ultimate response. Such a test is important because theories of evidence accumulation in cognitive tasks typically assume that successive stimulus samples are given equal weight in the decision-making process (Evans et al., 2017).
We used mixed logistic regression to model the impact of each IPI in a pulse sequence on the likelihood of a correct response. With the lme4 package in R, we regressed accuracy on the first three IPIs for each trial, controlling for stimulus Modality. We focused our analysis exclusively on 6-Hz stimuli because their shorter IPIs meant that many responses would have been made after at least four pulses (and therefore three IPIs) occurred (see Section 3. Results; mean number of IPIs before response in the 6-Hz condition: 4.92 IPIs, SD = 1.69). In contrast, with the longer IPIs of 3-Hz stimuli, response times on nearly one-half of all trials would not have permitted at least four pulses (and three IPIs) to be delivered before the response (see Section 3. Results; mean number of IPIs before response in the 3-Hz condition: 2.46 IPIs, SD = 1.03).
We included the first three IPIs in the analysis, reasoning that three pulse cycles would afford a clear, though incomplete, picture of how successive IPIs influenced subjects’ decisions. While we expected subjects to base their decisions on more than the first three IPIs in a sequence, we did not include these additional IPIs individually at the risk of overfitting the models. We fit models for 20% noise trials and 40% noise trials separately; observations from trials with deterministic stimuli (noise = 0%) were omitted because all IPIs were the same in this condition.
For the reason explained already, these logistic regression models could not be applied to the 3-Hz data. To get around this limitation, we took another approach, examining only the impact of the initial IPI on each trial, for both 3- and 6-Hz stimuli. First, we subsetted the data into four groups for each noise level in a way that corresponded to our predictions in Table 1: (i) 3-Hz trials with a shortened first IPI, (ii) 3-Hz trials with a lengthened first IPI, (iii) 6-Hz trials with a shortened first IPI, and (iv) 6-Hz trials with a lengthened first IPI. Given the large impact of the first IPI and the small effect of modality in our logistic regression analysis, we focused on the first IPI and collapsed data across modality conditions. Then, to test for directional effects of IPI perturbations, we used paired Student’s t-tests to compare subjects’ mean accuracy on trials with negatively perturbed (shortened) IPIs to that on trials with positively perturbed (lengthened) IPIs. We performed separate t-tests on 20% noise and 40% noise data for each rate condition.
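The first-IPI analysis can be sketched in two steps: split one subject's trials by the direction of the first perturbation, then compare the resulting per-subject accuracies with a paired t statistic. The function names and the trial representation, a list of `(first_ipi_ms, correct)` pairs, are assumptions made for illustration. The sketch returns the t statistic and its degrees of freedom only, not a p value.

```python
import math
import statistics


def accuracy_by_direction(trials, nominal_ipi_ms):
    """Return (accuracy with shortened first IPI, accuracy with lengthened first IPI).

    trials: iterable of (first_ipi_ms, correct) pairs for one subject,
    one rate, and one noise level.
    """
    shortened = [1.0 if ok else 0.0 for ipi, ok in trials if ipi < nominal_ipi_ms]
    lengthened = [1.0 if ok else 0.0 for ipi, ok in trials if ipi > nominal_ipi_ms]
    return statistics.mean(shortened), statistics.mean(lengthened)


def paired_t(x, y):
    """Paired Student's t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

In use, `accuracy_by_direction` would be called once per subject, and the two resulting lists of accuracies, one value per subject in each, would be passed to `paired_t`.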
Figure 4A shows variation in mean response accuracy with noise level and modality. As the ANOVA summary in Table 2 confirms, noise and modality each impacted accuracy significantly, but the two variables’ interaction did not,
3.2. Response Time
Figure 4B displays mean RT as a function of stimulus noise and modality. As Table 3 shows, RT was significantly impacted by both noise,
3.3. Subjects Adopt Optimal Criteria
To determine whether subjects used optimal criteria as defined by Macmillan and Creelman (2005), we tested the proposition that the error rate for 3-Hz stimuli [Pr(‘fast’ | slow stimulus)] and the error rate for 6-Hz stimuli [Pr(‘slow’ | fast stimulus)] would be sufficiently similar within each noise level as to be reasonably considered equivalent.
Using equivalence bounds of (−0.0768, 0.0768), calculated with
3.4. Evidence Accumulation Varies With Condition
As Fig. 4 shows, results with our two dependent variables, accuracy and response time, revealed an interesting, clear divergence. Response accuracy showed little or no reliable differences among V, vT, and V-vT conditions; in contrast, response times to V-vT stimuli were shorter than response times to stimuli in either of the unimodal conditions. We used DDM to reconcile this divergence. Effects of noise and modality on the posterior distributions of each model parameter are shown in Fig. 6. Each panel shows the values of an individual parameter associated with a particular noise level and modality. Results of group-level posterior comparisons are reported below as Bayesian probabilities that describe differences between pairs of conditions. To facilitate interpretation,
Both modality and noise had a reliable effect on drift rate (Fig. 6). For all modalities, increased levels of noise were associated with lower drift rates [V:
Only noise reliably affected the decision criterion parameter (Fig. 6). For all modalities, increased levels of noise were associated with lower decision criteria [V:
Both modality and noise had a reliable effect on non-decision time (Fig. 6). For the visual modality, increased levels of noise were associated with longer non-decision time values [
Assessment of model fit showed that the DDM fit our data well overall. Figures 7 and 8 show how model-simulated RT distributions fit the empirical RT distributions for two representative subjects. The subject whose data are shown in Fig. 7 was the ‘best’ fit; for all nine conditions, Kolmogorov–Smirnov tests showed no significant differences between simulated and empirical RT distributions (all
3.5. First Impressions (and First IPIs) Matter Most
To examine successive IPIs’ influence on subjects’ categorization of pulse rate, we applied multiple logistic regression to the first three IPIs for each trial with stochastic stimuli in the 6-Hz condition. Separate analyses were done for 20% noise and 40% noise levels. We focused our analysis exclusively on 6-Hz stimuli as their shorter IPIs meant that many responses were made after at least four pulses (and therefore three IPIs) occurred (mean number of IPIs before response in the 6-Hz condition: 4.92, SD = 1.69; 3-Hz condition: 2.46, SD = 1.03). For each stochastic stimulus sequence with a 6-Hz pulse rate, we regressed decision accuracy (a binary variable) against predictors including the modality of testing (V, vT, and V-vT) and the values of each of the first three IPIs in the stimulus. The nested models shown in Table 5 were each fit sequentially to determine the impact of each parameter on the likelihood of a correct response. These models omit interaction terms because preliminary tests that included those terms produced unstable model fits. In the 20% noise condition, interactions between the first and second IPIs [
We fit the mixed logistic regression models listed in Table 5 using the lme4 package in R. Likelihood ratio testing was performed using the deviance (−2LL) statistics for each nested model. The table shows that, as expected, deviance, sometimes described as ‘badness of fit’, shrinks as additional parameters are included in successive nested models. Comparing the values of Δ−2LL for successive models yielded the p values presented in Table 5 and gave a sense of each additional parameter’s impact. With this criterion, we found that for both noise levels, the first and second IPIs each significantly affected response accuracy. The effects of the third IPI and of stimulus modality were statistically significant only for the 40% noise condition. Additionally, the third IPI and the stimulus modality each had a considerably smaller impact on model deviance than did the first and second IPIs.
The odds ratio (OR) for each parameter estimate in the full model (Table 5) expresses the magnitude of each IPI’s effect on the odds of responding correctly. With 20% noise added to 6-Hz stimuli, increasing the first IPI by just 10 ms decreased the likelihood of a correct response by 11.2% (OR = 0.888, 95% CI = [0.856, 0.921],
These results suggest that subjects’ decisions were based disproportionately on the first two IPIs in a pulse sequence, and that only in the 40% noise condition did subjects make appreciable use of subsequent IPIs. It is important to consider the additive nature of the results presented: subjects undoubtedly based their decisions on sequences of IPIs rather than on any one IPI in isolation, and many or all IPIs prior to the decision could be taken into consideration by the subject. For example, if both the first and second IPIs in a 6-Hz stimulus with 20% noise deviated from the nominal IPI by +10 ms [Pr(IPI1 ⩾ 143 ms ∩ IPI2 ⩾ 143 ms) = 14.6%], holding all other IPIs in the trial constant, the likelihood that a subject would respond correctly on that trial would decrease by more than 19%.
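The arithmetic linking an odds ratio to the percentage changes quoted above can be sketched as follows. The helper names are hypothetical. Note that an odds ratio describes a multiplicative change in the odds of a correct response, which is not the same as a change in its probability. The only numerical input used here is the odds ratio of 0.888 per 10 ms reported for the first IPI with 20% noise; the odds ratio for the second IPI is given in Table 5 and is not repeated here.

```python
import math


def pct_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def log_odds_coefficient(odds_ratio, step_ms=10.0):
    """Logistic regression coefficient per ms, given the OR for a step_ms change."""
    return math.log(odds_ratio) / step_ms


def scaled_odds_ratio(odds_ratio, delta_ms, step_ms=10.0):
    """OR for a change of delta_ms, given the OR for a change of step_ms.

    Effects are additive on the log-odds scale, so they multiply as ORs.
    """
    return odds_ratio ** (delta_ms / step_ms)
```

For example, `pct_change_in_odds(0.888)` gives the 11.2% decrease in the odds quoted above. In an additive model without interaction terms, the combined effect of perturbing two IPIs is the product of their individual odds ratios.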
As predicted, directional changes in the first IPI affected accuracy differentially for 3-Hz and 6-Hz stimuli (Fig. 9). Figure 9A shows that, with 20% noise, subjects were more accurate when the first IPI on a 3-Hz trial was lengthened compared to when the first IPI was shortened,
4.1. Review of Key Findings
The preceding sections expanded a basic examination of accuracy and response time results by deploying three different, but complementary analytic approaches. As expected, the multiple analytic approaches produced a variety of findings. So, before discussing our study’s implications, it will be useful to summarize what we consider our principal findings:
- 1. The results of our accuracy and response time analyses showed that, compared to either unimodal condition, bimodal stimulus presentation was associated with improved task performance, particularly faster response speed on correct trials (Fig. 4B). Performance did not differ substantially between the two unimodal conditions. The analyses of accuracy and response times also showed no significant noise × modality interactions, indicating that time-domain noise did not affect performance differently among the V, vT, and V-vT conditions.
- 2. Error rate analysis showed that subjects used statistically optimal decision criteria, adapting in real time to the random noise present in a stimulus. Drift-diffusion modeling confirmed this result.
- 3. Drift-diffusion modeling also offered one potential explanation of why modality had a larger effect on RT than on accuracy: with bimodal stimuli, subjects accumulated evidence more quickly and took less time to encode, process, and respond than with either V or vT stimuli (Fig. 6).
- 4. Importantly, logistic regression revealed that early evidence had the most weight in subjects’ responses, and later evidence only informed decisions when signals were degraded by noise (Table 5). The results of the logistic regression analysis suggest that each IPI in a sequence provided subjects a different amount of evidence, but this is in direct conflict with the assumptions of the drift-diffusion model.
4.2. An Assay of Signal Robustness
Of all three modality conditions in our experiment, subjects were least accurate with visual stimuli. Multiple studies have demonstrated that sensory systems are specialized for processing different stimulus attributes (the ‘modality appropriateness hypothesis’; Welch and Warren, 1980). When experiments combine auditory and visual stimuli, the modalities’ relative influences vary with task demands: when judgments must be based on temporal information, auditory cues are afforded more weight than visual cues; when judgments must be based on spatial information, visual cues are given more weight than auditory cues (Gebhard and Mowbray, 1959; Michalka et al., 2015; Recanzone, 2003; Welch et al., 1986). Analogous relationships have been reported between visual and haptic cues (Ernst, 2007; Ernst and Banks, 2002). The relatively poorer performance that we observed with visual stimuli may, therefore, be related to vision’s specialization for processing spatial information rather than the temporal information demanded by our task.
Our results may actually underestimate the potential effect of temporal variation in the IPIs. Specifically, difference thresholds for duration of vibrotactile stimuli have been reported to be as high as 13% (Francisco et al., 2015). With the Gaussian distributions from which we drew random variates, for the 20% noise condition many of the IPI perturbations would have been below threshold in vT and V-vT conditions. However, for the 40% noise condition, fewer IPI perturbations were likely below threshold (Fig. 10). Future work could apply temporal noise that extends the narrow range of noise we examined, possibly drawing noise from distributions that are less heavily weighted toward the mean.
Because we applied noise relative to the nominal IPI in a pulse sequence, the widths of IPI sampling distributions at any one noise level differed between the two rates (Fig. 10). Adding variability to the IPIs in this way meant that the amount of nominal noise (in ms) varied between the two pulse rates, but this approach scaled the amount of temporal noise present in a pulse sequence relative to that sequence’s other temporal features (namely, its frequency). Our failure to counterbalance the way the two response types were signaled, however, prevented us from testing for differences in the effect of noise on 3-Hz and 6-Hz trials.
4.3. The Benefit of Bimodality
Bimodality had a beneficial impact on both response speed and drift-diffusion model estimates of the rate at which evidence is accumulated (Figs 4 and 6). This result is consistent with earlier suggestions that multisensory stimuli are processed by supramodal cortical mechanisms (Crommett et al., 2017; Levitan et al., 2015). Interestingly, this hypothesis is supported by a recent functional magnetic resonance imaging (fMRI) demonstration that auditory frequency is broadly represented in human cerebral cortex, including in classically defined somatosensory cortex (Pérez-Bellido et al., 2018).
As mentioned earlier, with V-vT combinations, subjects could base their decisions on whichever modality they chose. With concurrent stimuli in multiple modalities, previous work showed that a subject’s preferred modality depends upon the modalities’ relative reliabilities (Bresciani and Ernst, 2007; Bresciani et al., 2008; Ernst and Banks, 2002). However, the close equivalence in accuracy for V and vT stimuli in our experiment meant that neither modality offered a clear advantage in terms of reliability. Despite this, subjects may have consistently relied upon one of the two unimodal cues that composed a V-vT stimulus, inadvertently rendering cues from the other modality task-irrelevant and undermining the true bimodal nature of the V-vT condition. It is worth noting that task-irrelevant stimuli can still affect perception and become automatically integrated with target stimuli (Bresciani et al., 2008; Maddox et al., 2015; Varghese et al., 2017). Additionally, because the separate concurrent components of V-vT stimuli were perfectly correlated and synchronized, the two components might have been bound perceptually, an effect likely amplified by their shared spatial location (Badde et al., 2018; Locke and Landy, 2017). These speculations could be tested in experiments that added independent noise samples to the separate V and vT components of bimodal V-vT stimuli, and/or varied their spatial relationship.
4.4. Investigating the Decision-Making Process
Drift-diffusion modeling yielded deeper insight into subjects’ decision-making process than did our separate analyses of accuracy and response speed. The diffusion model analysis showed that information was extracted from noisier stimuli more slowly than from less noisy stimuli. This result implies that in noisy sequences, individual IPIs provided less evidence toward a decision. Similar changes in drift rate were seen when diffusion modeling was applied to tasks that vary in difficulty (Voss et al., 2004; Wagenmakers, 2009).
Our results diverge from standard drift-diffusion accounts in two important ways. First, our diffusion modeling results, as well as the results of equivalence testing, imply that subjects adjusted their decision thresholds as a function of the noise level they experienced within a trial. Subjects’ ability to adjust decision thresholds in real time, during the course of a trial, has special theoretical interest. Diffusion models of decision making typically assume fixed boundaries set by the subject at trial onset (Ratcliff and McKoon, 2008; Ratcliff et al., 2016). In our study, however, noise level was randomized, guaranteeing that subjects had no prior knowledge of how much IPI variability they might encounter until the trial was underway. If subjects set and held a decision threshold at the beginning of each trial, we would have seen no differences in the decision criterion parameter between noise levels. Second, classic diffusion models also assume a constant rate of evidence accumulation during a trial (Ratcliff and McKoon, 2008; Ratcliff et al., 2016). In our study, where the IPI was the unit of evidence, a constant drift rate would imply that each IPI in a single trial provided the same amount of evidence toward the decision criterion threshold. Our logistic regression analysis, however, suggested a different narrative: that each IPI in a pulse sequence yielded a different amount of evidence, with the first and second IPIs in a sequence providing the most information and later IPIs providing comparatively less.
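For reference, the two standard assumptions discussed above, a decision boundary fixed at trial onset and a constant drift rate, can be sketched as a simple simulation. This is a generic illustration of a basic drift-diffusion process, not the model-fitting procedure used in this study; the function name and all parameter values are arbitrary assumptions chosen for illustration.

```python
import math
import random

def simulate_ddm_trial(drift, boundary, rng, dt=0.001, noise_sd=1.0,
                       max_time=5.0):
    """Simulate one trial of a basic drift-diffusion process.

    Evidence starts midway between the lower boundary (0) and the upper
    boundary (`boundary`). Both the boundary and the drift rate stay
    constant for the whole trial. Returns (choice, decision_time), with
    choice None if no boundary is reached before `max_time`.
    """
    evidence = boundary / 2.0
    t = 0.0
    while t < max_time:
        # Constant drift plus Gaussian diffusion noise scaled by sqrt(dt).
        evidence += drift * dt + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if evidence >= boundary:
            return "upper", t
        if evidence <= 0.0:
            return "lower", t
    return None, max_time

rng = random.Random(0)  # seeded only to make the illustration repeatable
trials = [simulate_ddm_trial(drift=2.0, boundary=1.0, rng=rng)
          for _ in range(200)]
upper_share = sum(choice == "upper" for choice, _ in trials) / len(trials)
print(f"share of upper-boundary responses: {upper_share:.2f}")
```

Letting `boundary` change with within-trial noise, or letting `drift` depend on the ordinal position of each IPI, would be two ways to relax these assumptions in such a simulation.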
While our implementation of the drift-diffusion model yielded good model fits (Figs 7 and 8), the departures from the model assumptions described above may have detracted from the utility of fitting such a model. Holmes and Trueblood (2018) have reported on the limited ability of diffusion models to account for non-stationary decision criteria in particular, and suggested fitting a piecewise variant of the classic drift-diffusion model to account for time-varying parameters. Additionally, alternative sequential sampling models may have been better suited for our task. The accumulator model, for example, assumes that values of decision criteria vary exponentially across trials (Ratcliff and Smith, 2004). That model also assumes that evidence is sampled at discrete time points, reflecting the trial structure in our task more accurately.
We used our regression analysis to investigate the weight of each discrete unit of evidence in a trial, but what we have called the full model (Table 5) was not meant to be a complete account of subjects’ responses. Our analysis focused mainly on the relative importance of the first few IPIs that a subject experienced during a trial; an improved approach would account not only for the first three IPIs but for all of the IPIs in a pulse sequence. To avoid overfitting the model, we chose not to include each IPI experienced before a response as an individual predictor. An alternative full model could have included predictors for each of the first three IPIs, as in Table 5, as well as an additional predictor that captured the mean of the remaining IPIs experienced before a response. We opted not to use this approach, however, because of the amount of data that we would have excluded to accommodate such an analysis. Our full model undoubtedly omitted other potentially consequential variables as well, such as interactions among IPIs, trial-to-trial variation in attention (e.g., Chambers and Pressnitzer, 2014; Parise and Ernst, 2017; Schwiedrzik et al., 2014), and variability in the decision criterion (Cabrera et al., 2015; Mueller and Weidemann, 2008), as discussed above.
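The data cost of the alternative full model can be made concrete with a minimal sketch of how its predictors might be built. The function name and the example IPI values below are hypothetical and purely illustrative; they are not data from this study. A trial contributes a predictor row only if at least four IPIs were experienced before the response, which is why faster responses would be excluded.

```python
from statistics import mean

def alternative_predictors(ipis_before_response):
    """Return [IPI 1, IPI 2, IPI 3, mean of the remaining IPIs], or None
    when fewer than four IPIs preceded the response, so that no
    'remaining' IPIs exist and the trial must be excluded."""
    if len(ipis_before_response) < 4:
        return None
    first_three = list(ipis_before_response[:3])
    return first_three + [mean(ipis_before_response[3:])]

# Made-up IPIs (ms) for three illustrative trials.
example_trials = [
    [300.0, 350.0, 320.0, 340.0, 360.0],  # five IPIs: usable
    [310.0, 345.0, 330.0],                # three IPIs: excluded
    [170.0, 160.0],                       # two IPIs: excluded
]
rows = [alternative_predictors(trial) for trial in example_trials]
n_excluded = sum(row is None for row in rows)
print(rows[0])
print(f"excluded {n_excluded} of {len(example_trials)} trials")
```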
4.5. Conclusions and Future Work
Our experiment set out to (i) benchmark the robustness of information conveyed by vibrotactile signals compared to visual signals, (ii) investigate the decision-making process in rate discrimination tasks with stochastic temporal sequences, and (iii) assess the potential of combined visual and vibrotactile stimuli for use in signaling devices. Subjects extracted the mean rate equally well from either of the two unimodal conditions; from this, we conclude that vibrotactile signals are as robust as visual signals as vehicles for transmitting information about the rate at which stimuli occur. Our results also support the notion that early information is given the most weight when making speeded decisions, and show that processing of rate information is expedited when information is presented bimodally.
Vehicles, mobile devices, and medical equipment now utilize pulsatile cues, usually auditory or visual, to convey information. Our results suggest that adding vibrotactile stimuli to the mix could promote faster, more accurate responses. Currently, vibrotactile signals are used in mobile devices primarily to alert users to some event, such as an incoming call. The robustness of beat-like vibrotactile signals demonstrated in our study confirms that various attributes of these cues could be manipulated to provide more information than a simple alert. We concur with the suggestions made by Meng et al. (2015) about the considerable information-carrying potential of vibrotactile stimulation, particularly when its temporal, spatial, and intensity dimensions can all be varied. Future studies would do well to ascertain the limits on the ability of vibrotactile stimuli to convey meaningful, timely information, perhaps in combination with concurrent stimuli in other sensory modalities.
MBV and RS were responsible for study concept and design. MBV and RFS collected and analyzed the data under the supervision of RS. MBV and RS drafted the manuscript; RFS provided critical revisions. All authors approved the final version of this manuscript for submission.
This work was supported by NIGMS Training Grant T32GM132498 (MBV) and NIH Training Grant R90DA033463 (RFS). Publication of this open access article was funded by the Brandeis Library Open Access Fund. We thank Xiaodong Liu for statistical advice, Paul DiZio for constructive comments on an earlier version of the manuscript, and Maxim Bushmakin for programming contributions early in the project.
Declaration of Conflicting Interests
The authors declared that there were no conflicts of interest with respect to the authorship or publication of this article.
Bender, D. A. and Bender, A. E. (1999). Body surface area, in: Benders’ Dictionary of Nutrition and Food Technology, D. A. Bender (Ed.), p. 61. CRC Press, Boca Raton, FL, USA.
Bresciani, J.-P. and Ernst, M. O. (2007). Signal reliability modulates auditory–tactile integration for event counting, Neuroreport 18, 1157–1161.
Bresciani, J.-P., Dammeier, F. and Ernst, M. O. (2008). Tri-modal integration of visual, tactile and auditory signals for the perception of sequences of events, Brain Res. Bull. 75, 753–760.
Cabrera, C. A., Lu, Z.-L. and Dosher, B. A. (2015). Separating decision and encoding noise in signal detection tasks, Psychol. Rev. 122, 429–460.
Chambers, C. and Pressnitzer, D. (2014). Perceptual hysteresis in the judgment of auditory pitch shift, Atten. Percept. Psychophys. 76, 1271–1279.
Crommett, L. E., Pérez-Bellido, A. and Yau, J. M. (2017). Auditory adaptation improves tactile frequency perception, J. Neurophysiol. 117, 1352–1362.
Ernst, M. O. and Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion, Nature 415, 429–433.
Evans, N. J., Hawkins, G. E., Boehm, U., Wagenmakers, E. J. and Brown, S. D. (2017). The computations that support simple decision-making: a comparison between the diffusion and urgency-gating models, Sci. Rep. 7, 16433. DOI:10.1038/s41598-017-16694-7.
Francisco, E. M., Holden, J. K., Nguyen, R. H., Favorov, O. V. and Tommerdahl, M. (2015). Percept of the duration of a vibrotactile stimulus is altered by changing its amplitude, Front. Syst. Neurosci. 9, 77. DOI:10.3389/fnsys.2015.00077.
Frings, C. and Spence, C. (2011). Increased perceptual and conceptual processing difficulty makes the immeasurable measurable: negative priming in the absence of probe distractors, J. Exp. Psychol. Hum. Percept. Perform. 37, 72–84.
Gold, J. M., Murray, R. F., Sekuler, A. B., Bennett, P. J. and Sekuler, R. (2005). Visual memory decay is deterministic, Psychol. Sci. 16, 769–774.
Guttman, S. E., Gilroy, L. A. and Blake, R. (2005). Hearing what the eyes see: auditory encoding of visual temporal sequences, Psychol. Sci. 16, 228–235.
Levitan, C. A., Ban, Y.-H. A., Stiles, N. R. B. and Shimojo, S. (2015). Rate perception adapts across the senses: evidence for a unified timing mechanism, Sci. Rep. 5, 8857. DOI:10.1038/srep08857.
Locke, S. M. and Landy, M. S. (2017). Temporal causal inference with stochastic audiovisual sequences, PLoS One 12, e0183776. DOI:10.1371/journal.pone.0183776.
Maddox, R. K., Atilgan, H., Bizley, J. K. and Lee, A. K. C. (2015). Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners, eLife 4, e04995. DOI:10.7554/eLife.04995.
Meng, F., Gray, R., Ho, C., Ahtamad, M. and Spence, C. (2015). Dynamic vibrotactile signals for forward collision avoidance warning systems, Hum. Fact. 57, 329–346.
Michalka, S. W., Kong, L., Rosen, M. L., Shinn-Cunningham, B. G. and Somers, D. C. (2015). Short-term memory for space and time flexibly recruit complementary sensory-biased frontal lobe attention networks, Neuron 87, 882–892.
Mueller, S. T. and Weidemann, C. T. (2008). Decision noise: an explanation for observed violations of signal detection theory, Psychon. Bull. Rev. 15, 465–494.
Parise, C. V. and Ernst, M. O. (2017). Noise, multisensory integration, and previous response in perceptual disambiguation, PLoS Comput. Biol. 13(7), e1005546. DOI:10.1371/journal.pcbi.1005546.
Patel, A. D., Iversen, J. R., Chen, Y. and Repp, B. H. (2005). The influence of metricality and modality on synchronization with a beat, Exp. Brain Res. 163, 226–238.
Pérez-Bellido, A., Barnes, K. A., Crommett, L. E. and Yau, J. M. (2018). Auditory frequency representations in human somatosensory cortex, Cereb. Cortex 28, 3908–3921. DOI:10.1093/cercor/bhx255.
Ratcliff, R. and McKoon, G. (2008). The diffusion decision model: theory and data for two-choice decision tasks, Neural Comput. 20, 873–922.
Ratcliff, R., Smith, P. L., Brown, S. D. and McKoon, G. (2016). Diffusion decision model: current issues and history, Trends Cogn. Sci. 20, 260–281.
Schwiedrzik, C. M., Ruff, C. C., Lazar, A., Leitner, F. C., Singer, W. and Melloni, L. (2014). Untangling perceptual memory: hysteresis and adaptation map into separate cortical networks, Cereb. Cortex 24, 1152–1164.
Sekuler, A. B., Gaspar, C. M., Gold, J. M. and Bennett, P. J. (2004). Inversion leads to quantitative, not qualitative, changes in face processing, Curr. Biol. 14, 391–396.
Varghese, L., Mathias, S. R., Bensussen, S., Chou, K., Goldberg, H. R., Sun, Y., Sekuler, R. and Shinn-Cunningham, B. G. (2017). Bi-directional audiovisual influences on temporal modulation discrimination, J. Acoust. Soc. Am. 141, 2474. DOI:10.1121/1.4979470.
Voss, A., Rothermund, K. and Voss, J. (2004). Interpreting the parameters of the diffusion model: an empirical validation, Mem. Cogn. 32, 1206–1220.
Wagenmakers, E.-J. (2009). Methodological and empirical developments for the Ratcliff diffusion model of response times and accuracy, Eur. J. Cogn. Psychol. 21, 641–671.
Welch, R. B., DuttonHurt, L. D. and Warren, D. H. (1986). Contributions of audition and vision to temporal rate perception, Percept. Psychophys. 39, 294–300.
Wiecki, T. V., Sofer, I. and Frank, M. J. (2013). HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python, Front. Neuroinform. 7, 14. DOI:10.3389/fninf.2013.00014.