Specific Features of Perception of Semantically Equivalent Stimuli in the Verbal and Visual Form

Response time and evoked potentials were registered for visual images belonging to two categories, fruit and tableware, as well as for their verbal representations. The stimuli were presented in random order. The subjects were to attribute each stimulus, regardless of its form (a word or an image), to one of the categories. Eleven female and 10 male subjects (average age 21.9±2.9 years) participated in the tests. Six components of the evoked potentials were singled out: P1 (P66), N1 (N124), P2 (P180), N2 (N248), P3 (P331) and N3 (N456). The analysis showed that both female and male subjects demonstrated reliably longer response times for words than for the corresponding images. For words, the evoked potentials had a more complex configuration, with a shorter latency period for the early components (P1, N1) and a longer latency period for the later ones (P2, N2, P3, N3). The evoked potential amplitude in response to verbal stimuli was smaller than that for visual ones. Evoked potential components in response to target stimuli (both images and words) had, in general, shorter latency. The amplitude of the N1, P2 and N2 components was lower, while that of P3 and N3 was higher, for target stimuli than for non-target ones. The obtained results allow us to assume that the type of information (verbal or visual) can be evaluated at the early stages of stimulus perception (up to 120-150 ms). Further analysis includes either a more detailed description of the spatial features of the visual stimuli in the parietal and occipital lobes or an estimation of the semantics of a word employing the frontal and temporal areas. Decision-making on forming a response barely depends on the manner of information presentation (visual or verbal).


Introduction
One of the key properties of thinking is the ability to form a concept of the environment in the form of a mental representation. Over a lifetime, a human learns to manipulate both visual and verbal information with equal success and creates mental representations using different types of external events. Theoretically, there exist at least 3 ways to form mental representations: (1) dependent on the type of external stimuli and the objectives (visual representations for visual information, verbal ones for words, etc.); (2) independent of external impact (visual representations for visual information, verbal ones for words, etc.); (3) with a universal code, when all representations are presented only in the verbal form.
Forming mental representations begins with the process of perception, which has been explored in a vast variety of experimental studies. As a rule, these involve the method of evoked and event-related potentials [1][2][3][4][5]. Alongside it, other methods are used, such as dipole source localization [6,7], fMRI [8][9][10] and others.
The bioelectric activity of the brain is known to be highly changeable and variable, even when images with identical physical parameters are presented and the same type of activity is carried out. Individual variations cannot be removed from it completely, even by applying various methods of averaging. The nature of this variation is still not fully understood. One assumption is that the nervous system, at different levels of its organization, may exist not in one but in two or more stable states (i.e. it is characterized by bi- and even multi-stability). The transitions between these states occur under the influence of various internal and external factors. The presence of these states has been demonstrated experimentally at the level of single neurons and their local groups [11][12][13] and of the brain as a whole [14,15], as well as in neural network models [16,17]. There is reason to believe that multi-stability is one of the fundamental properties of the nervous system and can manifest itself at any level of its organization and during any perceptual or cognitive process. The understanding of multi-stability is currently limited by the absence of strict criteria for its identification and by the dependence of activity variability on a number of factors at every level of CNS organization.
In this respect, the process of perception, particularly that of ambiguous visual images, is of great interest. Attempts have been made to study the mechanisms of multi-stability and to create mathematical models demonstrating it [18,19]. However, information recognition and categorization have barely been studied in this context. In our opinion, one way to investigate multi-stability is to study the electrographic correlates of the perception of semantically equivalent objects presented in the form of visual images and words. This approach would allow us to answer a number of questions: whether the manner of information presentation affects the speed and quality of its recognition; whether the activity patterns formed in the cortex during recognition of visual information differ from those formed during recognition of verbal information; which brain structures participate in this process; how this is reflected in the parameters of the brain's evoked activity; etc.
The objective of the current study was to explore the behavioral (response time) and electrographic (evoked brain activity parameters) phenomena linked to perceiving semantically equivalent objects presented in the visual and verbal form.

Subjects
The research was conducted on 21 subjects (11 females and 10 males), students of various higher educational institutions of Rostov-on-Don, Russia. The average age of the subjects was 21.9±2.9 years. All of them confirmed in writing their voluntary consent to participate in the study, as prescribed by the standing order issued by the SFedU Commission on Bioethics.

Stimulation
The tests were conducted in a light- and soundproof chamber during the daytime (mostly before noon). During the testing, subjects were seated in an armchair in a comfortable position. The stimuli were shown on an LCD monitor located at the level of the subject's eyes at a distance of 1 meter. Responses were given by pressing one of the buttons on the computer mouse. The hand, the button and the manner of pushing it were chosen by the subject based on their convenience.
The visual stimuli were grayscale images that appeared in the center of the screen on a white background. The stimuli were 6 images, each related to one of the two categories: fruit (an apple, a pear, a lemon) and tableware (a glass, a spoon, a plate), and 6 words denoting these objects. The images subtended 4×10 arc degrees. All the images were matched in size and color. They were presented in a random sequence, each exposed for 300 ms, with a 2-s interval between the stimuli. To fix the gaze at the center of the screen, a 2×2 cm gray cross was presented between the stimuli.

Tasks
The study comprised 2 series with different target stimuli. In the first series, the subject was supposed to respond as quickly as possible to an image or a word related to the category 'fruit'; in the second one, to 'tableware'. Hence, each stimulus presented could be either a target or an indifferent (non-target) one. The response time (RT) was registered for the target stimuli. The duration of the study did not exceed 45 min, during which 760 stimuli were presented. Each series was preceded by instructions informing the subjects of the random order of the stimuli and of the actions to be taken once a stimulus appeared. They were also instructed to respond as quickly as possible.

Signal Analysis
Using the marks corresponding to stimulus presentation, 1-s EEG epochs were singled out automatically. Each epoch spanned from 200 ms before stimulus onset to 800 ms after it. EEG epochs containing artifacts were excluded from further analysis. Evoked potentials (EP) were determined for each type of stimuli by averaging single EPs in each series. The resulting EP was centered and smoothed with a second-order FIR filter with a frequency band of 1-17 Hz. By averaging the EP registered for every subject for all the stimuli, the general grand mean (GGM) across all electrodes and the grand mean (GM) for each individual electrode were calculated (Figure 1). The latter was necessary considering the significant difference between the EP characteristics registered on different electrodes ([1,20] and others). A temporal mismatch of the EP components across electrodes was observed (dark areas in Figure 1). Therefore, after centering, the main components were first singled out in the GGM. These components were then identified in the GM and in individual EPs using original software developed in the laboratory (lead programmer: K. B. Kalinin). The time windows in which the component search was carried out were defined individually for each electrode.
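The epoching and averaging steps described above can be illustrated with a minimal sketch. It is not the laboratory's software: the function names, the single-channel list representation and the choice to center on the mean of the pre-stimulus interval are assumptions made for illustration, and artifact rejection and the 1-17 Hz filtering are left out.

```python
from statistics import fmean

def extract_epochs(signal, onsets, fs, pre_s=0.2, post_s=0.8):
    """Cut one EEG channel into epochs from pre_s before to post_s after each onset.

    signal: list of samples; onsets: stimulus-onset sample indices; fs: sampling rate in Hz.
    (All names are hypothetical; the paper only states the 200 ms / 800 ms split.)
    """
    pre = int(round(pre_s * fs))
    post = int(round(post_s * fs))
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue  # skip epochs that run off the recording
        epochs.append(signal[start:stop])
    return epochs

def average_ep(epochs, pre_samples):
    """Average epochs sample by sample, then center on the pre-stimulus mean.

    Centering on the pre-stimulus interval is an assumption; the paper says
    only that the resulting EP was 'centered'.
    """
    if not epochs:
        raise ValueError("no epochs to average")
    mean_wave = [fmean(column) for column in zip(*epochs)]
    baseline = fmean(mean_wave[:pre_samples])
    return [value - baseline for value in mean_wave]
```

Applied per stimulus type and per electrode, the output of `average_ep` would correspond to an individual EP; averaging such EPs across subjects would give the GM for an electrode, and averaging across electrodes as well would give the GGM.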
During the preliminary GGM analysis, 6 main components were singled out, i.e. P1 or P66, N1 or N124, P2 or P180, N2 or N248, P3 or P331 and N3 or N456. The corresponding time windows estimated from the GM were 18-71 ms (for P1), 33-131 ms (N1), 103-193 ms (P2), 120-263 ms (N2), 284-422 ms (P3) and 314-492 ms (N3). Within each of these time windows, for the EP in response to each type of stimuli, in each subject and on each electrode, one peak with the maximal positive (P) or negative (N) amplitude was singled out. For each of those peaks, the latency period (LP) was determined. To estimate the wave structure of the evoked response, the overall quantity of extrema (QE) was counted, regardless of their polarity, amplitude and latency period.
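The peak search and the extrema count can be sketched as follows. The time windows are those listed above; everything else (function names, taking the largest sample in the window as the peak, and counting extrema as sign changes of the first difference) is an assumption for illustration rather than a description of the original software.

```python
# Time windows (ms) reported in the text for the six components.
WINDOWS_MS = {
    "P1": (18, 71), "N1": (33, 131), "P2": (103, 193),
    "N2": (120, 263), "P3": (284, 422), "N3": (314, 492),
}

def peak_in_window(ep, times_ms, window, polarity):
    """Return (latency_ms, amplitude) of the extreme sample inside the window.

    polarity "P" picks the most positive sample, "N" the most negative one.
    This is a simplification: it does not check that the sample is a true local peak.
    """
    lo, hi = window
    candidates = [(t, v) for t, v in zip(times_ms, ep) if lo <= t <= hi]
    if not candidates:
        raise ValueError("window contains no samples")
    pick = max if polarity == "P" else min
    return pick(candidates, key=lambda tv: tv[1])

def count_extrema(ep):
    """Count local maxima and minima regardless of polarity, amplitude or latency."""
    diffs = [b - a for a, b in zip(ep, ep[1:]) if b != a]  # ignore flat steps
    return sum(1 for d0, d1 in zip(diffs, diffs[1:]) if d0 * d1 < 0)
```

For a component name such as "N1", the call would be `peak_in_window(ep, times_ms, WINDOWS_MS["N1"], "N")`; the first element of the result plays the role of the LP and the second that of the amplitude.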
Statistical evaluation of the amplitude and LP of the EP components was carried out via ANOVA/MANOVA (with repeated measures, RP). Individual EPs were grouped according to the following factors: TYPE of stimulus (levels: image (I) and word (W)); GROUP (levels: target (T) and non-target (NT) stimuli); COMPONENTS (R1, 84 dependent variables: 6 components × 14 electrodes). The amplitudes (absolute values), latency periods and components, as well as the complexity of the EP, were analyzed separately. The stimuli of the categories fruit and tableware were pooled, which is why the influence of the factor CATEGORY was not analyzed in this article. The RT values were analyzed using the same method, taking into account the factors GENDER and TYPE of stimulus.
To evaluate the reliability of the observed differences, 2 levels of significance were used. With p<0.05 the differences were considered reliable; with 0.05<p<0.08 they were considered significant at the level of a trend. The differences (in %) for each of the analyzed EP parameters were normalized against the first value in the pair compared, in accordance with the following formula: Diff. (%) I-W = (W - I) / I × 100%.
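As a small worked illustration of the normalization formula and the two significance levels, the sketch below uses hypothetical function names, and the numbers in the usage note are made up rather than taken from the study.

```python
def percent_diff(first, second):
    """Difference in %, normalized against the first value of the pair.

    For the image-word comparison: first = I, second = W, i.e. (W - I) / I * 100.
    """
    return (second - first) / first * 100.0

def classify_p(p):
    """Label a p-value according to the two levels used in the text.

    The text defines p<0.05 and 0.05<p<0.08; treating p == 0.05 as a trend
    is an assumption made here to keep the function total.
    """
    if p < 0.05:
        return "reliable"
    if p < 0.08:
        return "trend"
    return "not significant"
```

For example, with an illustrative value of 200 for images and 250 for words, `percent_diff(200.0, 250.0)` gives 25.0, meaning the value for words exceeds that for images by 25%; a negative result would mean the value for words is lower.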

Results
The analysis has shown that the RT for the verbal stimuli was reliably longer than that for the visual ones in both female and male subjects (Table 1). No gender differences were detected for either visual or verbal stimuli. The analysis of the LP and amplitude of the EP components registered in the 2 series of experiments has shown significant differences for all 3 factors: TYPE of stimulus (image and word), GROUP (target and non-target) and COMPONENTS (R1) (Table 2). The target status of the stimulus had a more substantial influence on the LP of the EP components (reliable Type×Group interaction) than on their amplitude.
A comparative analysis of the EP components in response to visual and verbal stimuli (factor TYPE) revealed reliable differences in the LP of the components P1, N1, N2 and P3 and in the amplitude of the components P1, P2, P3 and N3 (Table 2). While the amplitudes of all the listed EP components were lower for verbal stimuli than for images, the LP differences were multidirectional. A more detailed examination of these differences using single-factor analysis (breakdown & one-way ANOVA) showed (Table 2, I-W) that the LP of the early EP components (P1, N1) was shorter for verbal stimuli (especially in the left hemisphere), while the LP of the later ones (N2, P3) was shorter on the back electrodes and longer on the frontal electrodes. The amplitude of the P1, P2 and N2 components registered for verbal stimuli was lower mostly on the occipital electrodes, that of N1 and P3 on the frontal electrodes, and that of N3 on all electrodes. In the EP registered on the front (frontal, temporal and central) electrodes for verbal stimuli, the amplitude of the P1 and N2 components was, on the contrary, higher.
EP in response to verbal stimuli had a substantially more complex wave structure than EP in response to the semantically equivalent visual images. The quantity of extrema in the EP for verbal stimuli was significantly higher than for visual stimuli (F I-W (1; 1006)=33.64, p<0.001). This relation was observed on all electrodes, but it was most pronounced in the occipital areas (Figure 2).
The comparative analysis of the EP characteristics for visual and verbal stimuli with regard to the factor GROUP has shown reliable differences at the level of the main effects for almost all the parameters (QE, LP, A) (Table 3). The single-factor analysis (one-way ANOVA) has shown (Figure 2, NT, T) that the differences in the EP parameters for words, as compared to those for images, largely coincided for target and indifferent stimuli. The LP of the early (P1, N1) EP components in response to both target and non-target verbal stimuli was shorter than that for visual stimuli. The LP of the EP components N2, P3 and N3 registered for target verbal stimuli was longer than that for visual stimuli. The LP of the EP components N2 and P3 in response to indifferent verbal stimuli was also longer in the frontal areas but, on the contrary, shorter in the back areas than that for visual stimuli. Only weak differences were found between the LP values of the P2 component.
The amplitudes of the EP components in response to verbal stimuli were, in general, lower than those for visual stimuli, regardless of their significance. Only the EP components P1 and N2 in response to verbal stimuli had a higher amplitude in the frontal areas.
EP in response to verbal stimuli had a more complex wave structure than EP in response to visual stimuli in all cases, and the differences were more substantial for the target stimuli than for the indifferent ones.
The analysis conducted separately for the visual and verbal stimuli has shown (Figure 3) that the differences in the EP characteristics between target and non-target stimuli are, in general, similar for both manners of information presentation. EP in response to the target stimuli had a lower amplitude of the N1 component on the front electrodes and of the P2 and N2 components on the occipital electrodes, and a higher amplitude of the P3 and N3 components in the occipital areas. A distinctive feature typical of both manners of information presentation was that the EP in response to target stimuli (compared to non-target ones) on the occipital electrodes showed a shorter LP of the N2 and P3 components and a higher amplitude of the N3 component. With verbal presentation, the EP in response to target stimuli had a longer LP of the N3 component on the temporal and occipital electrodes compared to the non-target stimuli.
Designations: for further details, see the caption to Figure 1.

Discussion
In the literature, there are data pointing to both the presence [21,22] and the absence [23][24][25] of gender differences in the efficiency of activity linked to operating with words and images. We did not find any gender differences in the response time for visual and verbal stimuli when attributing them to different categories. On this basis, the data of all subjects were pooled for further analysis.
All the stimuli (both verbal and visual ones) were presented in the form of visual images having equivalent physical parameters (size, exposure time, color scheme). Each of them was presented multiple times, which practically excluded the necessity to read the words or to search for further detailed differences in order to identify the image. It is known that word identification can be carried out within 250 ms [26]. Words that come up frequently and are analyzed as images can be identified within a 200-ms interval, presumably due to the formation of more efficient neural networks [27,28]. However, despite the fact that, owing to multiple presentations, verbal stimuli are perceived as visual ones (gestalts), the responses registered upon their presentation had reliably longer latency periods than those registered for semantically equivalent visual stimuli. This suggests that semantically equivalent constructions presented in visual and verbal forms are, apparently, perceived and processed by the nervous system in different ways, the analysis of verbal stimuli demanding more time.
A number of cognitive memory theories suggest that visual and verbal information are stored separately. They also suppose that images are stored not as separate objects but as prototypes [29,30] representing the most characteristic features of all the objects within the class. The information received during image perception is formalized, but not transformed into a verbal construct. Experiments based on the suggestion that images have a spatial status have shown [31,32] that visual representations are depictive, although in some cases they can also be represented propositionally, in the way typical of words. Furthermore, the concept of dual coding [33,34] suggests the existence of two data processing systems interacting with each other. Within each system, there are three levels of processing visual (in the author's terms, imagens) or verbal (logogens) information, and the transition from a lower level to a higher one entails generalization of that information.
Thus, in almost every case, two different processing mechanisms are suggested that provide the perception and analysis of visual and verbal information.
Unlike the visual stimuli, the verbal ones consisted of a number of elements. It was shown [35] that the perception of more structured visual stimuli composed of multiple different elements is associated with an increased prominence of the later EP components (P180-230 and N230-260) and a decreased prominence of the earlier ones. However, as shown in our research, the amplitude is lower for almost every EP component in response to verbal stimuli compared with that for visual ones. The only exceptions were P1 and N2 in the EP registered on the front electrodes, which had a higher amplitude for verbal stimuli (especially for the indifferent ones) than for visual stimuli. The LP of the early EP components for verbal stimuli was shorter, and that of the later components longer, than for the corresponding images. All of this suggests that the longer response time for verbal stimuli and the EP differences are more likely to be determined not by the complexity of the visual stimuli but by differences in the mechanisms of processing information presented in the verbal and visual form. Similar results were obtained in a study of visual EP registered for letter-based (verbal) and symbol-based (pictograms, i.e. images) stimuli [4]. In that study, 7 subjects out of 9 showed higher amplitudes of the later components (P3 and N4) when presented with symbol matrices rather than letter matrices. The LP of the EP components N1 and P3a registered for symbol matrices was shorter than that for letter ones, while that of the components P3b and N4 was, on the contrary, longer.
According to the obtained results, when verbal stimuli were categorized, the EP registered, compared with that for the semantically equivalent visual stimuli, had the following features: (1) the earlier (P1, N1) components had a shorter LP; (2) the middle and later (P2, N2, P3 and N3) components had a longer LP; (3) the amplitude of the EP components was lower; and (4) their structure was more complex.
The shorter LP of the earlier (P1 and N1) components registered for verbal stimuli, compared with the semantically equivalent visual stimuli and regardless of their significance (target or non-target), enables us to suppose that the sensory analysis of the stimulus, involving detection of its type (visual/verbal), takes place as early as the first 120-150 ms after stimulus presentation. The recognition of known verbal stimuli happens more quickly than that of the semantically equivalent visual stimuli. A number of researchers have also shown that well-known words can be identified within 100 and even 40 ms after they are presented [36,37].
The LP of the later (especially N2 and P3) EP components registered over the front (frontal and temporal) lobes of the cortex was longer for verbal stimuli than for visual stimuli; these differences did not depend on the significance of the stimulus. The differences in the EP registered for indifferent stimuli of different types (images and words) in the frontal areas were observed in the P2-N2-P3 complex (i.e. within the interval of 200-350 ms), while those for the significant stimuli were observed in the N2-P3-N3 complex (300-500 ms) (Figure 2).
The N2 component registered in the frontal association cortex is associated with the mechanisms of cognitive control, including suppression of inadequate reactions and choosing the correct decision out of the possible options [38,39], and with monitoring the correspondence of events to the objective of the cognitive task being carried out [40,41]. The P3 component registered in the same areas is associated with comparing the incoming information to the inner model of the stimulus and deciding whether they are identical or not [42,43]. It is also supposed that the frontal areas of the cortex are connected with working memory mechanisms [44,45] and are responsible for evaluating the significance of verbal and non-verbal stimuli depending on the context and past experience [46,47]. The activation of those areas associated with verbal thinking (from analyzing letters up to pronouncing words mentally) was demonstrated, in particular, by fMRI [9,48]. This allows us to suggest that, unlike sensory analysis, the semantic analysis of words with their subsequent attribution to a corresponding category is more extended and hence demands more time than the analysis of equivalent images. The longer LP of the later EP components is evidence of this. These differences do not depend on the stimulus significance and largely determine the RT for the target verbal stimuli.
Unlike in the frontal cortex, the difference in the LP of the later EP components registered in the temporal and occipital areas for stimuli of different types depended on the factor GROUP, which is reflected in a significant Type×Group interaction (see Figure 2). When indifferent stimuli were perceived, the LPs of the EPs registered in the TPO (temporal-parietal-occipital) areas of the cortex were shorter for the verbal stimuli than for the visual ones; when significant stimuli were perceived, the corresponding LPs were longer. In the former case (non-target stimuli), the differences were most pronounced for the components N2 and P3, while in the latter case (target stimuli) they were most pronounced for P3 and N3. It should be noted that the N3 component developed almost immediately after the button-press response. The results appear to indicate that, during categorization, there are differences in the mechanisms of visual and verbal information perception (for both target and indifferent stimuli).
Unlike the P3 component, which is associated with retrieving engrams from memory, comparing the current information with that in memory and making a decision [1,42,43], the N3 component is associated with the final stages of semantic analysis: detecting the contextual connections of words within a sentence, letters within a word, parts within an image, etc. [49]. The obtained results allow us to draw the following conclusion: after the stage of type identification (visual/verbal), performed within the first 120-150 ms, further analysis of the verbal stimuli proceeds as an analysis of semantic categories employing predominantly the front (association) areas of the cortex. This supposition is consistent with the predictive coding model [50,51], according to which a holistic concept of an image is formed at higher levels of the analysis hierarchy, in the association and prediction areas of the cortex [52,53]. During that time, the activity of the neural systems of the lower (sensory) analysis levels decreases.
'Recognition' of the visual stimuli demands a more detailed description of the image's physical characteristics. The longer LP of the early EP components (compared with that for verbal stimuli) is indicative of that. Further semantic analysis and categorization employ predominantly the parietal-occipital lobes, which are associated with evaluating the spatial characteristics of visual images [54,55,56]. These areas supposedly also participate in top-down control and in organizing the search for the relevant object in memory [57,58]. The semantic analysis of a visual stimulus is, apparently, over by that time, which explains the shorter LP of the components N2 and P3 in the EP registered over the front areas of the cortex. The analysis of verbal stimuli (unlike that of the visual ones), apparently, did not end with the response. That is indicated, in particular, by a pronounced N3 component whose LP was longer than the average response time for verbal stimuli. It is pointed out in the literature that this component, when registered in the EEG over the temporal and parietal lobes of the cortex, is linked to checking the correctness of the decision [59,60]. When operating with words, this checking may be more extended and may continue after the decision to attribute the stimulus to a certain category has been made.
An increase in the complexity of the stimulus [35] and in the activity of the cortical structures participating in its analysis [61] is generally associated with an increase in the amplitude of the EP components. This conclusion, however, was drawn in rather simple experimental paradigms. In more complex situations, like ours [62], an increase in the level of activation is often accompanied not by an increase but rather by a decrease in the amplitude of the evoked responses. This took place, in particular, when the recognition of the stimulus was carried out simultaneously with another cognitive activity (e.g. arithmetic calculations). A decrease in the amplitude of EP components with increasing working memory load, attention level and complexity of the choice options was revealed in a number of other works as well [63][64][65][66]. Our results also support this conclusion. For example, the amplitude was lower for almost all the EP components in response to verbal stimuli compared with semantically equivalent visual stimuli, particularly for the target ones. This could indicate that words are more complex stimuli to detect than images. Only for non-target stimuli was the amplitude of the N2 EP component registered for verbal stimuli higher than that for visual ones. As already mentioned, the N2 component is associated with suppressing inadequate reactions and choosing the correct option [38,67], as well as with evaluating the correspondence of current events to the cognitive task being carried out [40,68]. This most likely indicates that the listed processes have higher significance for categorizing images than for categorizing words.
It is well known that an increase in the amplitude of focal potentials reflects, first of all, activity synchronization in a population of neurons. There is reason to suppose [62,69,70] that this synchronization may be regulated by various mechanisms and may reflect various states of this population. On the one hand, an increase in the amplitude of evoked responses, first of all in the primary projection areas of the cortex, with increasing intensity of the stimulus may be connected to a stronger afferent flow and reflect the involvement of more neurons in the activity, the synchronization of the neurons being provided by that flow. On the other hand, non-specific systems could have a synchronizing effect by providing a transition from more active wakefulness to less active wakefulness, up to somnolence and sleep. Such effects can cause neuron populations to aggregate into larger units through activity synchronization, as a rule in the lower frequency range. Besides that, synchronized activity in local neuron populations (neural ensembles) may be caused by an increase in specific informational processes in them, which is reflected in the formation of high-frequency gamma oscillations [71,72].
Thus, synchronized neural activity and, consequently, increased amplitude of the focal potentials can, apparently, reflect both an increase and a decrease in their activity, both functional mobilization and immobilization. The frequency of the appearing oscillations is of major significance and is, as a rule, inversely proportional to the amplitude of the registered potentials.
The recognition of visual stimuli depends on their physical characteristics [73,74], the organization of attention devoted to their sensory analysis [75,76], the sensory differentiation of visual objects [77,78] and the interaction of attention and perception [79,80], all of which are reflected in the amplitude and LP of the early EP components (P1 and N1, in particular). There is, however, evidence that the parameters of the visual stimulus, in particular its intensity, have a crucial influence on the EP configuration in general [81]. Our results indicate that categorizing visually presented word stimuli is a more active process than categorizing the semantically equivalent image stimuli. It employs more stages of information processing, and it is accompanied not just by a decrease in amplitude compared to the analysis of semantically equivalent visual stimuli, but also by an increase in the complexity of the EP configuration. The latter appears to indicate that higher frequencies appear in the EP, and that they reflect the functioning of more local neuron populations that form asynchronous (successive) activity patterns.
Finally, the differences between the EP parameters registered for indifferent and target stimuli were barely dependent on the type of information presented (visual/verbal). For the significant stimuli, in both cases, EPs with shorter LP and lower amplitude of the P1-N2 complex components in the front areas were registered. On the contrary, the amplitude of the P3-N3 components (especially those registered in the back areas) was higher for the target stimuli than for the indifferent ones, but these components were registered after the response (see Figure 3). It is reasonable to suppose that in the current experimental paradigm, with the fastest possible reaction to a target stimulus in a multiple-choice situation, what is crucial for categorization, regardless of the type of information presented, is detecting the key features of the stimuli. Their quick detection is enough to make a decision on forming a response and to curtail further analysis. This decreases the analysis time for the stimuli. The shorter LP of most of the EP components registered for significant stimuli of both types is indicative of that. The increase in the amplitude of the P3 and N3 components can be explained by more attention being concentrated on carrying out the next task [82,83,84,85] and by 'turning off', through synchronization of their activity, the neural populations not employed in this task.
Thus, the results obtained in our work indicate that classifying the visual stimuli and detecting their type (verbal/visual) is carried out at the early stages of perception (up to 120-150 ms). After that, either a more detailed analysis of their spatial characteristics (for visual stimuli), employing predominantly the back (parietal-occipital) areas of the cortex, or a semantic analysis (for word stimuli), employing predominantly the front (frontal and temporal) areas, is carried out. Detecting the group of the stimulus (target/non-target) and making a decision on forming a motor response barely depends on the information type (verbal/visual) and employs the front areas of the cortex. All this points to the existence of at least two mechanisms underlying visual recognition, in which top-down and bottom-up influences are used in different proportions. This, in turn, suggests at least two stable states in the functioning of the system for semantic analysis of visual information.

Conclusion
1. It was experimentally shown that the speed of categorizing verbal and visual stimuli does not depend on gender. Under categorization, the response time for word stimuli is reliably longer than that for the semantically equivalent image stimuli.
2. Under categorization, the EP registered for verbal stimuli (compared with the semantically equivalent visual stimuli) has a more complex configuration. The mechanisms responsible for processing verbal and visual information are different.
3. Detecting the type of information (verbal/visual) under categorization occurs at the early stages of stimulus perception (up to 120-150 ms). Further analysis includes either a more detailed description of the spatial characteristics (for images), employing the back (parietal-occipital) areas, or detecting the semantic meaning (for words), involving the front (frontal and temporal) areas of the cortex, predominantly in the left hemisphere.
4. For target stimuli (both images and words) under categorization, the EP is registered with shorter latency periods and lower amplitudes of the components within the P1-N2 complex. These differences are most pronounced in the EP registered mainly on the front electrodes, which indicates the participation of the front areas in making the decision on forming a response.
The current research was carried out in the framework of the basic part of the state task MES № 6.5961.2017/8.9.