When information is available in more than one sensory modality, the central nervous system integrates the cues to obtain a statistically optimal estimate of the event or object perceived (Alais and Burr, 2004; Ernst and Banks, 2002). When movements are synchronised to a stream of events, this multisensory advantage appears as reduced temporal variability of the movements compared to unimodal conditions (Elliott et al., 2010, 2011; Wing et al., 2010). To date, this has been demonstrated for upper limb movements (finger tapping). Here, we investigate synchronisation of lower limb movements (stepping on the spot) to auditory, visual and combined auditory-visual metronome cues. In addition, we compare movement corrections to a phase perturbation in the metronome across the three sensory modality conditions. We hypothesised that, as with upper limb movements, there would be a multisensory advantage, with stepping variability being lowest in the bimodal condition. Accordingly, we further expected correction to the phase perturbation to be quickest in the bimodal condition. Our results show that, although there was evidence of multisensory integration, there was no multisensory advantage in the phase correction task: correction under the bimodal condition was almost identical to that under the auditory-only condition. On each step after the perturbation, both the bimodal and auditory-only conditions showed larger corrections than the visual-only condition. We conclude that rapid lower limb corrections are possible when synchronising with salient, regular auditory cues, such that integration of information from other modalities does not improve correction efficiency. However, if the auditory modality is less reliable, it is likely that multisensory cues would become advantageous in such a task.
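
The statistically optimal integration referred to above is commonly formalised as maximum-likelihood (inverse-variance) weighting of the unimodal estimates. The following is a minimal sketch of that general rule, assuming independent Gaussian noise in each modality; the function name `combine_cues` and the standard deviations used are hypothetical illustrations, not values or methods from this study. It shows why a bimodal estimate gains little when one cue is already much more reliable than the other.

```python
import math


def combine_cues(sigma_a, sigma_v):
    """Inverse-variance weighting of two independent Gaussian cues.

    sigma_a, sigma_v: standard deviations of the auditory and visual
    timing estimates (hypothetical inputs, in ms).
    Returns (weight_a, weight_v, sigma_av), where sigma_av is the
    predicted standard deviation of the combined estimate.
    """
    var_a, var_v = sigma_a ** 2, sigma_v ** 2
    # Each cue is weighted by the other cue's share of the total variance,
    # so the more reliable cue receives the larger weight.
    weight_a = var_v / (var_a + var_v)
    weight_v = var_a / (var_a + var_v)
    # The combined variance is never larger than either unimodal variance.
    var_av = (var_a * var_v) / (var_a + var_v)
    return weight_a, weight_v, math.sqrt(var_av)


if __name__ == "__main__":
    # Hypothetical cases: a highly reliable auditory cue, then equal cues.
    for sigma_a, sigma_v in [(20.0, 60.0), (40.0, 40.0)]:
        w_a, w_v, sigma_av = combine_cues(sigma_a, sigma_v)
        print(f"sigma_a={sigma_a}, sigma_v={sigma_v}: "
              f"w_a={w_a:.2f}, w_v={w_v:.2f}, sigma_av={sigma_av:.2f}")
```

Under these assumptions, the predicted bimodal benefit is largest when the two cues are equally reliable and shrinks as one cue dominates, which is consistent with the expectation that multisensory cues would matter more if the auditory modality were less reliable.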