The two-visual-systems hypothesis (what vs. how, or perception vs. action) proposed by Goodale and his colleagues has been widely debated, and researchers have provided a variety of evidence for and against it. For instance, a study by Aglioti et al. offered good evidence for the two-visual-systems theory using the Ebbinghaus illusion, but researchers who used other visual illusions failed to find consistent results. We therefore used perceptual tasks involving conflict or interference to test this hypothesis. If conflict or interference in perception influenced perceptual processing alone and did not affect the processing of action, we could infer that the two visual systems are separate; if it also affected action, we could infer that they are not. In the current study, we carried out two experiments that employed the Stroop, Garner and SNARC paradigms and used graspable 3-D Arabic numerals. We aimed to determine whether the effects resulting from perceptual conflict or interference would affect participants’ grasping and pointing. The results showed that the interaction between Stroop and numeral order (ascending or descending, or SNARC) was significant, and the SNARC effect significantly affected action, but the main effects of Stroop and Garner interference were not significant. The results indicated that, to some degree, perceptual conflict affects action processing. The results did not provide evidence for two separate visual systems.
On the basis of neuropsychological studies (Whitwell et al., 2014), researchers hypothesized that the primate visual system might be organized into two parallel pathways, i.e., one for conscious perception and the other to guide action. It remains unclear, however, whether the visual system has two separate subsystems, that is, the ‘vision for perception’ system and the ‘vision for action’ system. In recent years, there has been a heated debate about the two visual systems, and the question has not yet been resolved (Foley et al., 2015; Franz, 2001; Franz and Gegenfurtner, 2008; Goodale, 2014; Namdar and Ganel, 2017; Smeets and Brenner, 2001; Whitwell et al., 2014). Based on neurophysiological and psychological research, early researchers proposed that there are two early visual processing pathways, a ‘What’ and a ‘Where’ stream (Ungerleider and Haxby, 1994). Goodale and colleagues (Goodale, 1992; Milner and Goodale, 2008) proposed the ‘two-visual-system’ hypothesis (TVSH), which assumes there are two separate specialized visual pathways: the ventral stream and the dorsal stream. Information from the primary visual cortex (V1) is projected onto the two pathways; the ventral stream transforms visual information into perceptual representations for the processing of an object’s shape, size, and orientation, and the dorsal stream is mainly responsible for mediating the visual control of actions (Milner and Goodale, 2008).
Since Goodale and colleagues proposed the two-visual-systems hypothesis, it has attracted the attention of many researchers (Bruno et al., 2016). The theory reconciled the constructivist and ecological approaches, which had long coexisted in perception research (Norman, 2002). A considerable body of neurophysiological, neuropsychological, and psychological evidence for the existence of two distinct visual systems has been presented. For example, Aglioti et al. (1995) found that the Ebbinghaus illusion (or ‘Titchener circles’ illusion) has a powerful effect on perception (subjects usually report that the target circle surrounded by the array of smaller circles appears to be larger than the target surrounded by larger circles) but does not affect grasping movements (when subjects were asked to pick up a disc, their grip aperture was largely determined by the true size of the target disc and not its illusory size). This finding has been considered strong evidence for the two-visual-systems hypothesis. Ganel and Goodale (2003) assumed that the visual signals that give us the experience of objects and events are not the same ones that control our actions. Kroliczak et al. (2006) also demonstrated that the hollow-face illusion has opposite effects on perception and action. Moreover, Westwood and Goodale (2011) supported the TVSH with converging evidence from patient D.F. and from healthy adults. The findings mentioned above can be interpreted within the framework of the two-visual-systems hypothesis.
Many studies, however, have challenged the hypothesis that perception and action are separate processes. Some researchers believe that perception and action are based on the same internal mental representation and that there is only one visual system (Franz and Gegenfurtner, 2008). For example, Franz (2001) did not replicate the results of Aglioti et al. (1995) when using the Ebbinghaus illusion. Gilster et al. (2006) found that the Ebbinghaus illusion also deceives grasping. Dewar and Carey (2006) did not find grasping of the Müller–Lyer display to be immune to the illusion when attention was directed towards both shafts. Smeets and Brenner (2001) proposed that it was impossible to separate perception from action.
In most studies, visual illusions were used to test the TVSH (Foster et al., 2012), but the results were highly controversial (Carey, 2001), and methodological differences might lead to different results. Some researchers found that the effect of the illusion on action depended on the spatial attribute of the object in addition to task type (Smeets et al., 2002). Considering the evidence against the TVSH, Goodale and colleagues admitted that the visual illusion could not be viewed as strong evidence for the TVSH (Goodale and Westwood, 2004). It was not clear whether the hypothesis could be generalized to other items and other actions such as pointing. Therefore, in the current study we tried to test the TVSH using visual conflict with the Garner interference or Garner speeded classification task, which provides a reliable measure of how efficiently people can process one dimension (e.g., length) of an object while ignoring its other dimensions (e.g., width) (Garner, 1978). The Stroop conflict was also employed, and we combined the Stroop conflict and Garner interference (see Section 2.1). Previous research has provided examples of combining the two tasks (Namdar et al., 2014; Shalev and Algom, 2000), in which the numerical magnitude of the objects affected grip aperture during the initial stages of the grasp. In the current study, two 3-D digits (2, 8) made of polyvinyl chloride (PVC) were used as experimental items. The participants were asked to grasp or point at the 3-D digits. According to Algom et al. (1996), the representation of the digits is influenced by their numerical value. If the mental representation of the numerical magnitude differs from the physical size of the numbers, then Stroop-like conflicts arise. Pansky and Algom (2002) confirmed that the inconsistency between the physical size of a number and the corresponding size of the mental representation of the digits could produce Stroop conflict.
The current study aimed to test the TVSH by using Stroop-like conflict and the Garner interference task. If the hypothesis is correct (i.e., there are separate visual systems for perception and action), we can predict that visual conflict influences perception but not action. Conversely, if the hypothesis is incorrect (i.e., perception and action are not separable), then the visual conflict will affect action (motion processing) as well as perception. The function and logic of visual conflict in testing the TVSH are exactly the same as those of visual illusion; however, visual conflict avoids a shortcoming of visual illusions (they are usually presented in 2-D, which does not resemble the objects we meet in everyday life) and offers another way to address the arguments about the TVSH.
2. Experiment 1
In the first experiment, the participants were asked to grasp 3-D experimental representations of the digits 2 and 8 made of PVC. The dependent variable was the maximum grip aperture (MGA, the maximum distance between the thumb and index finger during a reach). Pansky and Algom (2002) demonstrated that the MGA is linearly related to the size of the object.
If the perceptual conflict affects the processing of action, it can be predicted that the main effects of the Stroop-like conflict and/or Garner interference will be significant. For example, a small mental representation of ‘2’ results in a smaller MGA than a relatively large mental representation of ‘8’ when the physical width of both objects is the same. Similarly, the MGA of digit ‘8’ would be bigger than for the digit ‘2’ because of its larger mental representation while they actually have the same width. The interaction between Stroop-like conflicts, Garner interference and numbers would also be significant.
Eight university students (four male and four female) were recruited with flyers posted on the BBS of Zhejiang University, China. Their ages ranged from 20 to 26 years. All the participants had normal or corrected-to-normal vision. Participants were paid RMB 20 for their participation.
2.1.2. Materials and Apparatus
White 3-D numbers ‘2’ and ‘8’ made of PVC (see Fig. 1) were used as targets. The sizes of both targets were either 75 × 30 mm or 60 × 30 mm (length × width), and the thickness was always 20 mm. Each number (2, 8) was presented at two vertical physical sizes (long: 75 mm, short: 60 mm), resulting in four combinations of numbers and lengths.
Optical motion capture equipment (MotionAnalysis Corporation) was employed to capture the trajectory of the grasping movement with ten high-speed synchronous cameras; the sampling rate was 60 Hz. Two markers were attached to the ends of the index finger and thumb (Fig. 1).
2.1.3. Experimental Design and Procedure
The design of Experiment 1 was 2 (Garner effect: baseline; filtering) × 2 (Stroop effect: congruent; incongruent) × 2 (Digits: 2; 8). All the factors were treated as within-subject variables. The dependent variable was the maximum grip aperture (MGA). For the Garner tasks, in the baseline blocks the relevant dimension (length) varied between trials while the irrelevant dimension (width) remained constant; in the filtering blocks, both the relevant and irrelevant dimensions varied randomly and unpredictably between trials. The order of the conditions (baseline/filtering) was counterbalanced across participants.
As shown in Fig. 1, participants sat in front of a table on which the target was placed at a viewing distance of approximately 50 cm. The MGA was recorded by a Mocap movement tracking device. Before a trial began, each participant rested his or her thumb and index finger on the starting point (the dock). The starting dock was placed at the center of the table. When the trial started, the participants were asked to pick up the target along its width as quickly as possible by opening their thumb and index finger and placing them on the table (the digit was concealed by a stiff piece of paper which prevented the participants from seeing the digit before the trial began; the trial began when the paper was removed). After finishing the grasping movement, the participants retracted their hand back to the starting point while keeping the thumb and index finger still. After practice, the formal experiment began. There were four blocks, with each block including 32 trials, which resulted in a total of 128 trials. There was an approximately one-minute interval after each block; the whole experiment lasted about 15 minutes.
2.1.4. Data Acquiring and Analyzing
The Mocap system recorded the three-dimensional coordinates of the two markers, namely the coordinate values of the two markers relative to the origin point (the origin point of the coordinates was set on the ground on the central line of the table; its position did not affect the calculation). System software package EVaRT gave the markers’ coordinates. The coordinates were imported into MATLAB to calculate the relative distance between the two markers (i.e., MGA). As the marker itself had a certain length and the thickness of participants’ fingers was different, we computed the MGA by subtracting these values to eliminate their effects. Therefore, the dependent variable MGA equalled (the distance between the two markers) − (the length of the marker itself and the thickness of the fingers). SPSS software was used to analyze the data and make statistical inferences.
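The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names, the array layout (one row of x, y, z coordinates per frame, in mm) and the example values are assumptions made for illustration, and `offset_mm` stands for the combined marker length and finger thickness that is subtracted for a given participant.

```python
import numpy as np

def grip_aperture(thumb_xyz, index_xyz):
    """Frame-by-frame Euclidean distance between the two markers (mm)."""
    thumb = np.asarray(thumb_xyz, dtype=float)
    index = np.asarray(index_xyz, dtype=float)
    return np.linalg.norm(index - thumb, axis=1)

def maximum_grip_aperture(thumb_xyz, index_xyz, offset_mm):
    """Peak marker distance in a trial minus the fixed marker/finger offset."""
    return grip_aperture(thumb_xyz, index_xyz).max() - offset_mm

# Hypothetical trial of four frames (values invented for illustration only).
thumb = np.array([[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
index = np.array([[20, 0, 0], [50, 0, 0], [80, 0, 0], [55, 0, 0]])
mga = maximum_grip_aperture(thumb, index, offset_mm=20.0)
print(mga)
```

Because only the difference between the two marker positions enters the distance, shifting the origin of the coordinate system leaves the result unchanged, which is consistent with the remark that the position of the origin does not affect the calculation.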
2.2.1. Grip Aperture and the MGA
As shown in Fig. 2, the grip apertures changed as a function of time. In Fig. 2, ‘a–b’ represents the participant’s hand placed on the starting point (grip aperture is zero); ‘a–b’ is nearly straight. At point b the participant begins to grasp, and the distance between the fingers increases rapidly. At point c the grip aperture reaches a peak. Line ‘d–e’ corresponds to the period of seizing and holding the target. Line ‘e–f–g’ corresponds to the process where the participant lets go of the digit and puts it back, and line ‘g–h’ shows the participant withdrawing his or her hand to the starting point and waiting for the next trial to begin.
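The ordinate of line ‘a–b–h’ represents the distance between the two markers when the two fingers are pinched together. This distance, represented by the symbol K, is a fixed length for each participant. The dependent variable MGA can therefore be calculated by the following formula: $\mathrm{MGA} = D_{\max} - K$, where $D_{\max}$ denotes the peak distance between the two markers during the reach (the ordinate of point c in Fig. 2).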
2.2.2. Analysis of the MGA
The MGA data were entered into a repeated-measures 2 (Garner effect) × 2 (Stroop-like effect) × 2 (Digits) ANOVA. The main effects were not significant [Garner factors, , ; Stroop-like effect, , ; Digits, , ]. Also, a non-significant interaction was observed between Garner interference and Stroop conflict [, ], Garner interference and Digits [, ], and Stroop conflict and Digits [, ].
The results show that although the interaction between Stroop-like conflicts and Digits was not significant [baseline: , ; filtering: , ], there was a consistent tendency (see Fig. 3).
The results of Experiment 1 show that Garner interference and Stroop-like conflict do not affect motion processing. Previous research had shown that the fingers’ grip aperture during the grasp is affected by the numerical value of the grasped digits. Numerically larger digits lead to larger grip apertures than numerically smaller digits. In the current experiment, it should be noted that though the main effect of Stroop was not statistically significant, we can see from Fig. 3 that, at the baseline level in the Garner task, the maximum grip aperture of ‘2’ is smaller in the conflict condition than in the consistent condition; on the contrary, the maximum grip aperture of ‘8’ in the conflict condition was larger than in the consistent condition. If the interaction does exist, it is in line with our expectations, indicating that the action processing is influenced by the perceptual interference or conflict (the effect might become significant upon increasing the number of participants or using stronger conflict stimuli).
We can use the vertical–horizontal illusion to explain the interaction in Fig. 3. If the vertical–horizontal illusion (long objects appear small, short objects look stumpy) can affect grasping, it would make the grip aperture for long objects smaller than that for short targets (because the long object seems smaller while the short object seems wider). This would lead to the maximum grip aperture for a consistent 2 being bigger than that for a conflict 2, and the maximum grip aperture for a consistent 8 being smaller than that for a conflict 8. The results suggest that the illusion affects motion processing, which does not support the two-visual-systems hypothesis.
The main effects of Garner interference and Stroop conflict, and their interaction, were not significant in this experiment. The non-significant results might be due to the small range of the variables, or to the dependent variable not being sensitive enough, so that the effects of perceptual conflict on the processing of action were not big enough to detect. Consequently, the experiment failed to find effects of Garner interference and Stroop conflict on the action. A more sensitive dependent variable was therefore required in our next study.
3. Experiment 2
The results of Experiment 1 found there were no Garner interference and Stroop conflict effects in grasping movements. Why did this occur? One possibility is that the Garner interference and the Stroop conflict indeed had no effect on the action. However, there were some other confounding factors. For example, the perceptual conflict caused by the experimental items was not strong enough and the range of MGA was too small. Therefore, in the current experiment we used the maximum speed of pointing as the dependent variable because the speed allows a greater range. A comparison task, which could generate stronger perceptual conflict, was employed to test the effect of perceptual conflict on the action.
The participants were identical to those in Experiment 1.
3.1.2. Materials and Apparatus
The experimental apparatus was the same as in the first experiment. The size of the experimental objects was 18 mm × 12 mm and 99 mm × 66 mm. The digits (2, 8) were paired for each size. All pairs were presented in either ascending (e.g. 2, 8) or descending (e.g. 8, 2) order.
3.1.3. Experimental Design and Procedure
A repeated-measures design was also used in this experiment, and Stroop and Garner interference were again employed. In half of the trials, the numerically larger digit (8) was physically larger (the no-conflict, or congruent, condition). In the remaining trials, the numerically larger digit was physically smaller (the conflict, or incongruent, condition).
At the beginning of the experiment, the participants were asked to naturally rest the index finger of their dominant hand on the starting dock on the desktop. The distance between the two digits was about 21 cm. When the two digits were presented, all participants performed two tasks: (1) in the quantity comparison task, participants selected the numerically larger digit by pointing at it; and (2) in the physical size comparison task, they judged which digit was larger in physical size, also by pointing at it.
For the quantity comparison, the relevant dimension for the baseline condition of the Garner task was the numerical magnitude, and the irrelevant dimension was physical size. Stimuli consisted of two kinds of pairs (small 2 vs. small 8, large 2 vs. large 8). For the filtering condition of the Garner task, stimulus pairs were small 2 vs. large 8 and large 2 vs. small 8. The stimulus pairs were presented in pseudo-random order. The Stroop-like effect only appeared in the filtering condition. Small 8 and large 2 constituted the incongruent condition, while the congruent condition consisted of large 8 and small 2.
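The stimulus coding for the quantity comparison task can be summarized in a short Python sketch. This is only an illustration of the rules stated above, not the authors' procedure: the `Stimulus` type, the `classify_pair` function and the left/right argument order are hypothetical names introduced here, and the sketch covers the quantity comparison task only.

```python
from typing import NamedTuple

class Stimulus(NamedTuple):
    digit: int  # 2 or 8
    size: str   # "small" or "large"

def classify_pair(left, right):
    """Label a digit pair for the quantity comparison task."""
    # Baseline pairs share one physical size; filtering pairs differ in size.
    garner = "baseline" if left.size == right.size else "filtering"
    # Ascending means the numerically smaller digit comes first, e.g. (2, 8).
    order = "ascending" if left.digit < right.digit else "descending"
    stroop = None  # the Stroop-like effect is defined only in the filtering condition
    if garner == "filtering":
        larger_digit = max(left, right, key=lambda s: s.digit)
        stroop = "congruent" if larger_digit.size == "large" else "incongruent"
    return {"garner": garner, "stroop": stroop, "order": order}

print(classify_pair(Stimulus(2, "small"), Stimulus(8, "large")))
print(classify_pair(Stimulus(8, "small"), Stimulus(2, "large")))
```

Returning `None` for the Stroop label in baseline pairs reflects the statement that the Stroop-like effect only appeared in the filtering condition.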
For judging the physical size, the irrelevant dimension under the baseline condition of the Garner task was the numerical magnitude, and the relevant dimension was physical size. Stimuli were made up of two kinds of pairs (small 2 and large 2, small 8 and large 8). The objects for the filtering condition were similar to what we used in the quantity comparison but the participants were asked to point at the physically larger one.
In this experiment, because the digits were used as stimulus material, there could be a perceptual conflict called spatial–numerical association of response codes, or SNARC effect (Gevers and Lammertyn, 2005). Participants would react faster if the small numbers were on the left-hand side than on the right side. Similar to the SNARC effect, the task produced a pair-order effect (faster judgments for ascending pairs) and a reverse effect for descending order (Turconi et al., 2006). In the present study, we added numerical order as the third factor to be analyzed. The Stroop effect existed only in the filtering condition, therefore we analyzed the Stroop conflict and the SNARC effect in the filtering condition.
The dependent variable used in the current study was the maximum speed while pointing because there were difficulties in precisely measuring reaction time. When the distances were equal, the shorter the reaction time was, the faster the maximum speed would be. Consequently, the maximum speed could be used as an indirect indicator reflecting the reaction time. The average speed was not used as the dependent variable because we could not accurately determine the start and end time of the pointing movement and thus, the maximum speed was a relatively better option. Moreover, the maximum speed could exclude calculation bias of the MGA. Compared to the MGA, the maximum speed had better tolerance to error and could amplify the difference resulting from the independent variable.
3.1.4. Data Analysis
In this experiment, we only recorded the pointing speed of the index finger. Therefore only a single marker was attached to the index finger of the participant. A series of three-dimensional coordinates of the marker were acquired on each trial. The coordinates of the marker in two adjacent frames were used to calculate the distance of two adjacent positions of the index finger, and the distance was divided by the duration (1/sampling frequency) to get the average pointing speed between two adjacent frames. In this way, the average velocity of all adjacent frame pairs was obtained. All the average velocity values were imported into MATLAB to obtain the maximum speed of each trial.
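The speed computation described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' MATLAB code: the function names and the example trajectory are assumptions, the 60 Hz sampling rate is taken from the apparatus description of Experiment 1, and the trajectory is assumed to have been trimmed to the outward movement so that the return movement does not contribute to the peak.

```python
import numpy as np

SAMPLING_RATE_HZ = 60.0  # sampling rate reported for the motion capture system

def frame_speeds(positions_mm, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Average speed (mm/s) between each pair of adjacent frames."""
    positions = np.asarray(positions_mm, dtype=float)
    # Distance travelled by the marker between adjacent frames.
    step_lengths = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    # Dividing by the frame duration (1 / sampling rate) equals multiplying by the rate.
    return step_lengths * sampling_rate_hz

def max_pointing_speed(positions_mm, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Largest frame-to-frame speed (mm/s) within the supplied frames."""
    return frame_speeds(positions_mm, sampling_rate_hz).max()

# Hypothetical outward movement along one axis (values invented for illustration).
trajectory = np.array([[0, 0, 0], [1, 0, 0], [4, 0, 0],
                       [9, 0, 0], [12, 0, 0], [13, 0, 0]], dtype=float)
speeds = frame_speeds(trajectory)
peak = max_pointing_speed(trajectory)
print(speeds, peak)
```

A trajectory of N frames yields N − 1 speed values, one for each adjacent frame pair.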
3.2.1. The Pattern for Results of the Pointing Speed
As shown in Fig. 4, the pointing speed of the index finger changed as a function of time. Section ‘a–b’ corresponds to the participant’s hand resting on the starting dock; the speed is around 0 mm/s. From point ‘b’ the participant’s index finger begins to move and the velocity increases rapidly. At point ‘c’ the speed reaches its peak. Shortly afterwards, the speed decreases, and point ‘d’ corresponds to the moment when the finger touches the digit. After a brief pause the speed increases quickly because the participant retracts his or her index finger from the selected digit back to the starting dock; the speed also reaches a maximum during this period (the value at point ‘e’). However, we are concerned with the speed value at point ‘c’ (maximum speed of pointing). Section ‘f–g’ represents the participant retracting his or her index finger to the starting dock and waiting for the next trial to begin. The area under curve ‘b–c–d’ in the graph corresponds to the distance the finger travels from the starting dock to the digit.
3.2.2. Analysis of Maximum Speed of Pointing
In the quantity comparison task, the data for the maximum speed of pointing were entered into a repeated-measures 2 (Stroop-like effect) × 2 (SNARC) ANOVA. The main effect of the Stroop-like conflict was not significant [, ]. The interaction between the Stroop-like conflict and the SNARC effect was also not significant [, ]. However, the main effect of SNARC was significant [, ]. Meanwhile, a repeated-measures ANOVA was performed, with factors of SNARC effect by Garner interference. The main effect of SNARC was also significant [, ]. The main effect of Garner interference [, ] and interaction between Garner interference and SNARC effect [, ] were not significant, however (see Fig. 5).
In the physical size comparison task, the main effects of Stroop-like conflict and SNARC effect [, ; , ] were not significant. However, a significant interaction was found between Stroop-like conflict and SNARC effect [, ; see Fig. 6]. All other effects were not significant.
In addition, the maximum speed in the filtering condition was slower than the maximum speed in the baseline condition in the quantity comparison task. However, the difference was not significant [, ]. In the physical size comparison task, the maximum speed in the filtering condition was also slower than that in the baseline condition. The result of a t-test was also not significant [, ].
The results of Experiment 2 provide a good explanation for Experiment 1. In Experiment 1, the main effect of Stroop-like conflict was not significant because processing the action required instant information and the stimuli were presented singly rather than in pairs. The mental representation of numbers is not absolute but relative. For example, the mental representation of 2 is not always smaller than that of 8. If 2 and 8 appear at the same time, then the mental representation of 2 should be smaller than that of 8. If 2 and 8 appear separately, the representation of 2 could be larger than that of 8, or the two might have the same mental size. Therefore, presenting a digit on its own does not necessarily produce strong conflict.
In the current experiment, the Garner interference still did not show a significant main effect. For Garner interference, the relevant or irrelevant dimensions had to rely on the previous trials. However, the size and shape of the stimuli in the previous trials might not be saved in working memory, and consequently the information was not used (or the memory trace was very weak, which resulted in an inability to access the information or its incomplete retrieval). This led to each action merely responding to an instant stimulus, and thus there was no or no sufficiently strong Garner interference.
The results showed that the speed in the filtering condition tended to be slower than in the baseline condition. In the filtering condition, the numerical quantity and physical size of the stimulus were presented simultaneously, and thus the mental representation of the numerical quantity and the physical size were compared during the processing of the action. Consequently, paired stimuli could result in strong conflict which slowed the processing of the action.
Although the results for the effect of Stroop-like conflict and Garner interference were non-significant, as can be seen in Fig. 5 the main effect of SNARC was significant. The maximum speed in the ascending condition was much faster than that in the descending condition, indicating that perceptual conflict significantly affects the action. Meanwhile, the results from Fig. 6 confirm that the interaction between the Stroop and SNARC is significant, suggesting that Stroop-like conflict has an influence on action processing in conditions of relatively strong conflict. Further experiments should produce stronger Stroop conflict.
The current experiment found that the numerical order (SNARC effect) significantly affects processing of the action, and the interaction between Stroop and SNARC was significant. These results together suggest that perceptual conflict does affect action processing. Consequently, the results do not support TVSH, which predicts that the action is immune to the effects of perception (perception and action are separate).
4. General Discussion
In the current research, we used 3-D non-pictorial objects that elicited the perception conflict and were not used by previous studies testing the two-visual-systems hypothesis. The results (see Figs 5 and 6) show that perceptual conflict does affect the processing of action. As can be seen in Fig. 5, the main effect of SNARC is significant. In Fig. 6 we can see that the interaction of SNARC and Stroop-like conflict in the physical size comparison task is significant, and both results indicate that the processing of action is affected by perceived conflict.
The results show that the main effect of Stroop-like conflict was not significant in Experiment 1. This could be due to the dependent variable (MGA) not being sensitive enough and the range of the MGA being small; consequently, measurement bias might have further distorted the results. In Experiment 2, we found that the interaction between Stroop-like conflict and SNARC was significant, which suggests that the Stroop-like conflict affected the action to some extent. If the dependent variable is changed to a more sensitive indicator, such as reaction time, we may find that Stroop-like conflicts clearly affect the action.
It is noteworthy that the separation of perception and action is a necessary but not a sufficient condition for the theory of two visual systems. We were unable to prove that ‘if perception and action are separate, then there must be two systems’. Nor could the hypothesis be proven from non-significant results. However, results that do not support the separation of perception and action might falsify the two-visual-systems hypothesis. To some extent, the current results show that perceptual conflicts affect the processing of action (pointing), which could lead to the conclusion that perception and action are not separate.
The two visual systems might be separate: if one reaches for an object (e.g., a computer mouse), the brain must use the instant information from the object and the hand (position, size, etc.), relying mostly on working memory. However, for one to recognize the object (e.g., is it a mouse?), the information used would mostly come from long-term memory. The results of the current study, however, do not support the hypothesis of two separate visual systems. If the two visual systems are not separate, then how do they interact? In the future, more studies are needed to investigate how the neural mechanisms underlying the two visual systems allow them to interact.
In summary, interference or conflict in visual perception has an impact on action, and our results suggest that, to some degree, perception and action are not separated. From a representational viewpoint, perception and action may share the same representation.
This project was partially supported by the National Natural Science Foundation of China (No. 31460251), the Project of Social Sciences in Jiangxi Province (Project No. 14JY28) and the Science and Technology Project of the Educational Commission of Jiangxi Province (No. GJJ150879).