On the costs and benefits of faces and words: Process characteristics of feature search in highly meaningful stimuli
Indiana University, United States
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
FRANCIS
Further information
The authors present a comprehensive consideration of the process characteristics of visual search in contexts that vary in their meaningfulness. The authors frame hypotheses regarding process architecture, stopping rule, capacity, and channel independence, using analytic results and a rigorously specified dynamic system to characterize a set of alternative hypotheses that vary along all of these dimensions. Results of the tests of these hypotheses suggest that process architecture and the stopping rule do not distinguish the processing of meaningful and meaningless forms. The major distinction between configural and nonconfigural processing was with regard to processing capacity, potentially implicating channel interdependencies. All of these conclusions hold for both faces and words.
On the Costs and Benefits of Faces and Words: Process Characteristics of Feature Search in Highly Meaningful Stimuli
<cn> <bold>By: Michael J. Wenger</bold>
<bold>James T. Townsend</bold>
<bold>Acknowledgement: </bold>This work was supported in part by National Institute of Mental Health Grant 5R01MH57717-02 to James T. Townsend, a fellowship from the Mathematical Modeling of Cognitive Processes training grant within Indiana University's Cognitive Science program to Michael J. Wenger, and National Research Service Award 1 F32 MH11491-01 to Michael J. Wenger. We sincerely thank Brad Gibson, Christof Schuster, and Michael Tombu for comments on earlier versions of this work.
Can meaning and organization be double-edged swords? Certainly background knowledge can greatly improve performance in a range of perceptual and cognitive tasks (e.g., Ahissar & Hochstein, 2000; Chase & Ericsson, 1981; Dosher & Lu, 1999; Johnson & Carnot, 1990; Wenger & Payne, 1995). For example, trained users of ultrasound equipment can, with apparent speed and accuracy, localize structures that novices can only find in a slow, error-prone manner. The effects of experience appear to include the ability to group elements into meaningful wholes and improvements in the ability to locate particular elements in those groupings. Yet there are also examples of how knowledge and meaningful organization can have negative effects on performance (e.g., Kuehn & Jolicœur, 1994; Radvansky & Zacks, 1991; Suzuki & Cavanagh, 1992).
Although organization clearly can exert both positive and negative influences on performance, the mechanisms by which these influences are expressed are ambiguous. For example, the often profound effects associated with perception of human faces have suggested physiological (e.g., Kanwisher, McDermott, & Chun, 1997), qualitative (e.g., Farah, Wilson, Drain, & Tanaka, 1998), and quantitative (e.g., Bartlett & Searcy, 1993; O'Toole, Wenger, & Townsend, 2001) distinctions. Nonetheless, the characteristics of the processing systems that lead to variations in performance are far from clear. For example, is lower performance in one condition (e.g., perception of houses) when compared with a second condition (e.g., perception of faces) due to serial instead of parallel structure, limited instead of unlimited capacity, or possibly some combination?
The work we summarize in this article compares performance with two important types of stimuli associated with the notions of holistic or gestalt processing: faces and words. <anchor name="b-fn1"></anchor><sups>1</sups> In particular, our goal is to explicitly characterize search performance, given different types of stimulus organization, in terms of four basic characteristics of information processing: architecture, the stopping rule, independence or nonindependence in rates of processing, and process capacity (for more detailed discussions, see, e.g., Townsend & Ashby, 1983; Townsend & Nozawa, 1995). Although a number of studies have considered a subset of these four (particularly architecture and the stopping rule), the present study is the first, to our knowledge, to consider all four characteristics simultaneously, within a factorial design, in the data of individual observers rather than in group averages.
We have theoretically analyzed the behaviors of standard serial and parallel systems in a completely general way for all of the experimental conditions used in the present work, at the level of means of response time (RT) distributions as well as at the level of the distributions themselves (e.g., Townsend, 1974; Townsend & Ashby, 1983; Townsend & Wenger, 2004a, 2004b). The models developed in the present work extend those general analyses by implementing the theoretical alternatives as real-time (dynamic) process models. In addition, because there were conditions (in the experiment we report) in which observed performance deviated in some rather profound ways from the predictions of standard models, our process models allowed us to probe alternative explanations—in particular, models incorporating interactive mechanisms—for these departures.
We based the specific models on our linear dynamic systems theory approach (augmented with decisional criteria and stochastic elements; see Townsend & Wenger, 2004b). This model class bears relations to a growing number of theoretical constructions in the literature (e.g., Busemeyer & Townsend, 1993; Usher & McClelland, 2004). This approach permitted us to represent a set of natural theoretical alternatives and provided guidance toward experimental conditions and empirical outcomes that led to a preferred theoretical candidate from among these alternatives.
A word is in order concerning our general approach. We believe that, when feasible, it is preferable to use properties or predictions that are universal to an entire set of models and that are qualitative (e.g., specified at the level of orderings on or the signs of measures; Townsend & Nozawa, 1995) rather than quantitative model fits. In this study, we implemented only the most fundamental conceptual aspects of the various dimensions of processing that we have mentioned. Our experimental results are directly interpretable by way of these conceptions and in terms of qualitative relations rather than parametric model fits. Although we certainly judge that both approaches are not only beneficial but sometimes mutually reinforcing (e.g., as in Thomas, 2001), there seems little to be gained by using numerical fits in the present investigation.
In anticipation, an important conclusion to come from the present work is that what distinguishes the processing of meaningful and meaningless stimuli in our results is not the common distinction between serial and parallel processing (see also Ingvalson & Wenger, 2005; Wenger & Townsend, 2001). Parallel processing so far has been uniformly confirmed with our methodology, among individuals and across a growing number of studies and tasks (e.g., Ingvalson & Wenger, 2005; Townsend & Nozawa, 1995; Wenger, 1999; Wenger & Townsend, 2001). Instead, stimulus organization seems to have profound impacts at the level of processing capacity, and we suggest that specific strategies and dependencies within parallel processing channels, differing according to pattern structure, may be responsible for producing these effects.
Visual Search and the Effects of Stimulus Organization
Visual search is perhaps one of the most widely used tasks in explorations of human information processing (for a review and introduction to the basic issues, see Dosher, 1998). With respect to the central question of the present investigation—the effects of meaningful organization—visual search was among the first tasks to be used in explorations of the processing of facial stimuli (e.g., Bruce, 1979; Ellis, 1975; Hampton, Purcell, Bersine, Hansen, & Hansen, 1989; Neisser, 1964). Whether the search stimulus is a meaningful organization of elements or an abstract field, two basic characteristics of information processing have frequently been of concern. The first of these is the architecture of the processing system. For example, assume that the stimulus display that is presented to the observer consists of
The second characteristic of processing that has been of persistent concern is the stopping or decision rule (sometimes referred to as the
Although architecture and the stopping rule have been the major issues of concern in studies of visual search, there are two additional process characteristics that the field must consider. The third characteristic is the preservation or violation of independence in the rates at which each of the display elements is processed. Researchers often use the term
The final characteristic to consider is the capacity available for processing the display elements. Capacity can be considered at a number of different levels of analysis (see Townsend & Ashby, 1978, 1983), including the capacity for processing the individual elements and the effective capacity of the system as a whole (see, e.g., Townsend & Nozawa, 1995; Townsend & Wenger, 2004b; Wenger & Gibson, 2004; Wenger & Townsend, 2000). Our measurement baseline in this study is the performance of unlimited-capacity, independent, parallel models. Such parallel models possess channels whose efficiency (capacity in terms of speed) does not degrade or improve as the number of other working channels is increased. The overall speed (or capacity, as given by the speed at which the stopping rule is fulfilled) is determined by the architecture, the dependence relations among items or channels, and, of course, the stopping rule. For any given stopping rule, the capacity is measured against that hallmark standard parallel model. Typically, the experimental psychologist expects that increased workload—in the form of more features to process, for instance—will result in at best unlimited and more likely limited capacity. For example, it is easy to imagine a situation in which meaningful organization of the stimulus elements would lead to an improvement in overall processing capacity, relative to a scrambled display, by way of positive dependencies. If the architecture is parallel, such improvements will be assessed, in terms of our measure, as supercapacity.
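As a rough illustration of this baseline, the following Python sketch simulates an unlimited-capacity, independent, parallel model. The exponential completion times, the rate value, and the function names are assumptions made for the example and are not taken from the article. The self-terminating case further assumes that every displayed element is a target, so that the first channel to finish ends the trial.

```python
import random

def ucip_finish_time(n_channels, rate, rule, rng):
    """One trial of an unlimited-capacity, independent, parallel model.

    Assumption: each channel's completion time is exponential with the
    same rate at every display size (unlimited capacity), and channels
    are sampled independently of one another.
    """
    times = [rng.expovariate(rate) for _ in range(n_channels)]
    if rule == "self_terminating":
        # All elements are assumed to be targets: the fastest one ends the trial.
        return min(times)
    if rule == "exhaustive":
        # Every element must finish before a response can be made.
        return max(times)
    raise ValueError(f"unknown stopping rule: {rule}")

def mean_finish_time(n_channels, rate, rule, n_trials=20000, seed=1):
    """Monte Carlo estimate of the mean finishing time for one condition."""
    rng = random.Random(seed)
    total = sum(ucip_finish_time(n_channels, rate, rule, rng)
                for _ in range(n_trials))
    return total / n_trials

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n,
              round(mean_finish_time(n, 1.0, "self_terminating"), 3),
              round(mean_finish_time(n, 1.0, "exhaustive"), 3))
```

Under these assumptions each channel keeps the same rate at every display size, so any change in mean finishing time across display sizes reflects the stopping rule alone. Observed performance that is faster than this baseline would count as supercapacity, and slower performance as limited capacity.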
With these four dimensions of processing—architecture, stopping rule, independence, and capacity—in mind, the next step is consideration of the implications of the evidence on the effects of meaningfulness in search tasks. In particular, it is important to review inferences concerning specific combinations of values (e.g., parallel and exhaustive processing) on these dimensions. There are a number of early precedents for considering the effects of meaningful organization on search performance (e.g., Gilford & Juola, 1976; Johnson & Carnot, 1990; Karlin & Bower, 1976; Naus, 1974; Naus, Glucksberg, & Ornstein, 1972; Strongman & Brown, 1966), and the cumulating body of evidence generally seems to suggest that the perceptual system has access to and uses numerous levels of coding, ranging from physical to abstract (or meaningful), during visual search (as discussed, e.g., by Estes, 1975; Sperling & Dosher, 1986; Suzuki & Cavanagh, 1995). As theorists have evolved their conceptualizations to keep track of the empirical phenomena associated with visual search, they have widely argued that a complete understanding of search performance needs to consider both top-down and bottom-up influences (e.g., Cave & Wolfe, 1990; Dosher, 1998; Tong & Nakayama, 1999).
Possibly the most prominent set of findings pertinent to the effects of meaningful organization, at least for present purposes, has to do with what are often referred to as
A meaningful context or organization does not, however, uniformly produce improvements in performance. In fact, a number of studies have suggested that such contexts or organizations have the ability to hinder performance, relative to nonmeaningful contexts or organizations. For example, Klein (1978) demonstrated that a 3-D context (e.g., that used in Weisstein & Harris, 1974) produced a slowing of responding, relative to when that context was not present (see also Widmayer & Purcell, 1982). Researchers have obtained similar impairments in search performance with facial stimuli. For example, the biologically appropriate arrangement of facial features has been shown to produce decrements in performance (e.g., Mermelstein, Banks, & Prinzmetal, 1979). In addition, facial expressions have been shown to be capable of having both positive and negative effects on search performance (e.g., Nothdurft, 1993; Suzuki & Cavanagh, 1992, 1995).
Two detailed examinations of faces (and facelike stimuli) in visual search tasks give some hints as to the characteristics of faces that may be influential in producing both types of effects. Kuehn and Jolicœur (1994) examined search for a target face among various types of distractors. They documented that upright faces slowed search relative to line drawings (Experiment 1), that digitized images of real faces slowed search relative to schematic faces (Experiment 2), that a face in which the images were scrambled but placed in the proper top-to-bottom arrangement slowed search relative to faces in which this top-to-bottom ordering was not preserved (Experiment 4), and that a face in which the features were scrambled but arranged to preserve left–right symmetry slowed search relative to faces that did not preserve this symmetry (Experiment 4). Suzuki and Cavanagh (1995) found that facial organization improved performance in search for a conjunction of features and hindered performance in search for a single feature (Experiment 1; see also Koshino, 2001). In both studies, the authors put forward processing hypotheses to account for the pattern of results. Kuehn and Jolicœur (1994) suggested a serial self-terminating search (with precedent in Karlin & Bower, 1976), whereas Suzuki and Cavanagh (1995) suggested a form of parallel processing, and both sets of authors suggested search mechanisms that may be specialized for the processing of human faces (there is suggestive physiological evidence in both visual pathologies and normal human functioning; see Kuehn & Jolicœur, 1994, pp. 96–97).
This snapshot of the literature suggests a critical set of issues: First, few of the studies that have examined process hypotheses in visual search have relied on explicit and rigorous models for performance (although see Bricolo, Gianesini, Fanini, Bundesen, & Chelazzi, 2002; Bundesen, 1990; Dosher et al., 2004; Kinchla, Chen, & Evert, 1995; McElree & Carrasco, 1999; Palmer, 1994). This is true for studies using both meaningful and nonmeaningful stimuli. In addition, when researchers have relied on explicitly articulated models, they have typically limited themselves to consideration of only two of the four characteristics of processing—architecture and the stopping rule—without explicitly considering channel independence or capacity (although see Fisher, 1982). Consequently, we set for ourselves the goal of being able to simultaneously (i.e., in the data of individual observers) consider all four characteristics of processing, within the context of a mathematically specified and general set of process models.
Second, investigators have not manipulated the nature of the meaningful context in a uniform manner across stimulus types. For example, both Kuehn and Jolicœur (1994) and Suzuki and Cavanagh (1995) required a search for a target face (or facelike stimulus) among other objects. In contrast, many of the studies documenting word and object superiority effects (e.g., Reicher, 1969; Weisstein & Harris, 1974; Wheeler, 1970) have placed the target feature within the larger meaningful context. Consequently, we felt it important to equate search tasks across stimulus types, having observers search for facial features (or target letters) in the context of either a well-organized face (or word) or a scrambled set of facial features (or scrambled letter string; see also Martelli, Majaj, & Pelli, 2005).
Yet another important theoretical issue concerns the popular hypothesis that visual processing mechanisms may be specialized for facial stimuli (as discussed, e.g., in Kuehn & Jolicœur, 1994; Suzuki & Cavanagh, 1995); we sought to compare search performance across stimulus types within the performance of individual observers to ascertain whether the same individuals use similar or distinct processing strategies with different classes of stimuli. Note that we are not simply asking whether faces are better than words or vice versa. Instead, we are seeking to understand the general characteristics of processing that may be shared across stimulus forms.
Next, it is well known that agglomerating data across observers can obscure critical individual differences (see, e.g., Townsend & Fific, 2004), distort the form of psychological functions (e.g., Estes, 1956), and even support the wrong model (e.g., Ashby, Maddox, & Lee, 1994). Consequently, we chose to take the more time-consuming and expensive approach, with the goal of providing the strongest set of evidence for our hypothesis tests. Finally, we pursue these questions in the context of a set of provisional hypotheses that we earlier put forth as characterizing processing of configural patterns (see O'Toole et al., 2001; Townsend & Wenger, 2004b; Wenger & Townsend, 2001): (a) Processing is parallel on all parts of a configural stimulus pattern, (b) processing is always exhaustive on all parts of a configural pattern, (c) processing on all parts of a configural pattern is positively dependent (in the extreme, perfectly correlated, meaning that all parts start and stop processing together), and (d) processing of a configural pattern is supercapacity (i.e., performance improves as more features are available for processing) relative to an unlimited-capacity, independent, parallel model. We realize that some of these stipulations are very strong but believe they fairly capture the modal conceptions of configural processing.
Pursuing the Questions
The choice to take this comprehensive approach has a range of implications. In particular, if we wish to simultaneously consider architecture, the stopping rule, independence, and capacity, all as functions of stimulus organization, we need to (a) have a task capable of providing the range of data we need and (b) have explicit models of the candidate hypotheses to determine where in the data we should be looking to allow for the strongest possible inferences.
<h31 id="xhp-32-3-755-d291e575">Designing the Task</h31>
We decided to design a visual search task modified from a classic experimental design. We designated a set of elements as targets and another set as nontargets. Then we manipulated both the total number of elements in any test display and the total number of target elements present (following Ashby, 1976). Researchers have used these manipulations of workload and target redundancy for decades to consider questions of processing architecture in visual and memory search (see, e.g., Palmer et al., 2000; Townsend & Wenger, 2004a), and such manipulations are critical if one is to examine questions of architecture and capacity. Because we also needed to consider the stopping rule, we included a manipulation of this factor. In one condition, we instructed observers to give a positive response if any element in the display was drawn from the target set; we refer to this as an
<anchor name="fig1"></anchor>
The first of the three types of organization we refer to as
We can compare the two types of gestalt stimuli (SC and SI) with the nongestalt stimuli in a way that allows us to avoid confounds as a function of display size, number of target elements, and the correct response (positive or negative). In particular, we can compare the SC-gestalt stimuli with nongestalt stimuli in which the number of target elements (
The task as described contains all of the manipulations needed for the theoretical questions of interest. However, not all of the conditions resulting from the factorial combination of these manipulations are critical in supporting strong-inference hypothesis testing. To better explicate where in the data we should be looking to distinguish among the theoretical possibilities, we developed a set of dynamic process models for those possibilities.
We began by developing both a serial and a parallel model for the experimental task. These initial models epitomize the predictions of the entire class of such models and serve as null hypotheses with respect to the effect of stimulus organization. That is, these initial models, by design, possess no means of distinguishing a well-organized or meaningful stimulus from a scrambled or meaningless stimulus.
Our approach to providing natural, specific models in this type of task is based on linear dynamic systems theory, augmented with decision thresholds and stochastic elements (for a more complete description of the general modeling approach, see Townsend & Wenger, 2004b). As Townsend and Wenger (2004b) described, our approach makes contact with a number of contemporary developments in the field (e.g., Ashby, 2000; Ratcliff & Smith, 2004; Smith, 1995, 1998, 2000; Smith & Ratcliff, 2004). <anchor name="b-fn4"></anchor><sups>4</sups> It is also connected to a substantial foundation of work on stochastic process representations of psychological hypotheses (e.g., Dzhafarov, 1997; Schweickert, 1983; Schweickert, Fisher, & Goldstein, 1992; Schweickert, Giorgini, & Dzhafarov, 2000; Townsend, 1972, 1974, 1976, 1984, 1990a; Townsend & Nozawa, 1995; Townsend & Schweickert, 1989).
Figure 2 presents a schematic representation of the basic parallel model for the search task. We limit ourselves here to an informal description of the models and provide a more complete and mathematical description in the Appendix. Beginning at the left-hand side of the figure, we represent an input (i.e., an element from either the target or the nontarget set of elements) as a step function that is zero at the outset of a trial and steps to a constant value
<anchor name="fig2"></anchor>
The serial model is presented in schematic form in Figure 3; we created it from the parallel model by imposing sequentiality at the level of the inputs. In brief, the input to the second channel is not admitted for processing until the first channel has completed its work (i.e., reached a decision threshold). The remaining inputs are admitted for processing in a similar manner. Details of the implementation of this idea are provided in the Appendix.
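The contrast between the two architectures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (which is specified in the Appendix): the leak, input strength, noise level, threshold, and time step are hypothetical values, and each channel is modeled as a noisy linear accumulator driven by a step input. Because the channels in these null models do not interact, admitting inputs one at a time is treated here as equivalent to summing the channels' first-passage times.

```python
import random

def first_passage_time(rng, u=8.0, leak=1.0, noise_sd=0.5,
                       threshold=1.0, dt=0.001, max_time=10.0):
    """Time for one noisy linear channel to reach its decision threshold.

    All parameter values are hypothetical and chosen only for illustration.
    """
    x, t = 0.0, 0.0
    while t < max_time:
        # Euler step of dx = (-leak * x + u) dt + noise, with a step input u.
        x += (-leak * x + u) * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
        if x >= threshold:
            return t
    return max_time  # guard for a channel that never reaches threshold

def exhaustive_time(n_channels, architecture, rng):
    """Time to finish every channel under an exhaustive stopping rule."""
    times = [first_passage_time(rng) for _ in range(n_channels)]
    if architecture == "parallel":
        # All channels start together; the slowest one determines the finish.
        return max(times)
    if architecture == "serial":
        # Each input is admitted only after the previous channel finishes.
        return sum(times)
    raise ValueError(f"unknown architecture: {architecture}")

def mean_exhaustive_time(n_channels, architecture, n_trials=200, seed=7):
    """Monte Carlo estimate of the mean exhaustive finishing time."""
    rng = random.Random(seed)
    total = sum(exhaustive_time(n_channels, architecture, rng)
                for _ in range(n_trials))
    return total / n_trials

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n,
              round(mean_exhaustive_time(n, "parallel"), 3),
              round(mean_exhaustive_time(n, "serial"), 3))
```

With these settings the serial mean grows roughly in proportion to the number of channels, whereas the parallel mean grows much more slowly, which is the kind of qualitative separation the simulations are meant to expose.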
<anchor name="fig3"></anchor>
Simulating the Search Task
As we have observed, the laws of basic performance for the standard parallel and serial models are well established (e.g., Townsend & Ashby, 1983; Townsend & Wenger, 2004b). Even so, in addition to the illustrative value of simulations, it is interesting to examine the extent to which the two architectures (serial and parallel) produce empirically distinguishable patterns of data for the search task, with parameters that are reasonable in the present experimental context. The simulations involved display sizes (
Table 1 summarizes the simulation results for all three types of stimuli. We begin with the results for those nongestalt stimuli that require
<anchor name="tbl1"></anchor>
<anchor name="fig4"></anchor>
In contrast, with an
Consider next the results for the nongestalt stimuli when a
Because the models at this point are unable to distinguish nongestalt from gestalt configurations, we define gestalt status in terms of the number of target elements relative to the number of display elements. SC-gestalt stimuli are those in which the number of target elements is equal to the number of display elements (
Consider first the simulation results for those display configurations that require a
<anchor name="fig5"></anchor>
Consider next the simulation results for those display configurations that require a
The simulation results agree with our earlier analytic findings, suggesting that there is a specific subset of the experimental conditions in which the two models produce distinguishable results (see the summary in Table 1). These are (a) trials involving the nongestalt stimuli in which no elements of the target set are present, an
We can draw a final set of expectations from the simulations, and these concern processing capacity. For both the SC- and the SI-gestalt stimuli, one can compare performance with performance on nongestalt stimuli possessing the same number of targets and display elements. That is, it should be possible to determine (using a statistical measure of processing capacity, which we define later in the article) whether stimulus organization affects capacity, holding the number of target and display elements constant. In the case of the simulations discussed so far, given the absence of any mechanisms to distinguish processing on the basis of stimulus organization, the expectation is that there will be no differences in capacity due to stimulus organization. Consequently, any observed differences in capacity as a function of stimulus organization should suggest that we need to reject our initial null models in favor of models that differ as a function of stimulus organization.
The simulations presented so far instantiate a set of expectations for what we should observe in an identically structured search experiment. To determine whether stimulus organization does require distinct process characteristics, we used stimulus features that could be presented in either meaningful or meaningless arrangements. Because we were also interested in the extent to which possible process distinctions might be specific to one class of stimulus forms (human faces, in particular), we used two classes of highly meaningful stimuli: faces and words.
Method
<h31 id="xhp-32-3-755-d291e883">Participants</h31>
Four individuals were recruited from the cognitive science community at Indiana University. Participants were reimbursed at the rate of $6 per hour, and all reported normal or corrected-to-normal vision. Each participated for at least forty 1-hr sessions, with no more than 2 days elapsing between successive sessions (we made exceptions for illness and personal commitments).
<h31 id="xhp-32-3-755-d291e887">Materials</h31>
Figure 6 illustrates the construction of the face stimuli. Two faces, each a photograph of a middle-aged Caucasian man, were used as the sources for target and nontarget features. Each of these source faces served as the target set equally often across observers. A third face, also a photograph of a middle-aged Caucasian man, was blurred (via application of a Gaussian blur) so that only the overall outline and contours were detectable. This blurred image served as the background for the presentation of all features in all of the trials involving faces. All three faces were free of facial hair, glasses, and jewelry. The two source faces and the blurred background image were sized so that, at a viewing distance of 55 cm, the images subtended 2.1° of visual angle for both height and width.
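For readers who wish to reproduce the stimulus dimensions, the physical size implied by the stated visual angle follows from the standard relation size = 2 × distance × tan(angle / 2). The short Python sketch below applies it to the values reported above; the function name is an illustrative choice and does not come from the article.

```python
import math

def stimulus_extent_cm(visual_angle_deg, viewing_distance_cm):
    """Physical extent of a stimulus that subtends a given visual angle."""
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * viewing_distance_cm * math.tan(half_angle)

# Values reported for the face stimuli: 2.1 degrees at 55 cm.
face_extent = stimulus_extent_cm(2.1, 55.0)
print(f"{face_extent:.2f} cm")
```

For the face stimuli this works out to an on-screen height and width of about 2 cm.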
<anchor name="fig6"></anchor>
The target features in each of the source faces were the left eye (and accompanying eyelid), the right eye (and accompanying eyelid), the nose, and the mouth. Each of these features was copied from its source face, along with a small (approximately 1 pixel) part of the surrounding area. This surrounding area was blurred and matched in shade to the background face. Pilot work with these features, involving a simple discrimination task (based on the source face) with only the target features, indicated that discriminability was approximately equal across all features. This being true, however, it is important to note that the powerful qualitative predictions of standard parallel and standard serial models do not depend on equal discriminability. Four locations on the background face were identified for the placement of the features, such that when all of the features from one of the source faces were present and in their biologically appropriate position, the resulting stimulus had a natural appearance (to us).
Two four-letter words were used as the source words, and a string of upper-case Xs was used as a common background. The letters in all strings were of equal width, and the four-letter strings subtended 2.0° of visual angle. We roughly equated the 2 four-letter words (
The facial features and letters were combined in three different ways, in parallel with the definitions used for the three classes of stimuli considered in the simulations (see Figure 1); Figure 6 illustrates the three types of forms for the facial stimuli. An SC-gestalt face was one in which all of the features present were from the same source face and were present in their biologically appropriate position. An SI-gestalt face was one in which the features present were from both of the source faces and were in their biologically appropriate position. A nongestalt face was one in which the features were not in their biologically appropriate position.
The same three types of stimulus forms were defined for the word stimuli. An SC-gestalt word was one in which all of the letters present were from the same source and were in their proper position. An SI-gestalt word was a legal (i.e., locatable in a dictionary) English word, which we formed by combining letters from each of the two source words. Note that this definition excluded pronounceable nonwords and did not take position in either of the source words into account. A nongestalt word was one in which the letters present were not all in their correct position and the resulting string was not a legal English word.
The nature of the stimulus set meant that the numbers of stimuli at each level of the design could not be completely equated. For example, the number of SI-gestalt stimuli could not be equated across stimulus form (faces, words). Actual stimulus and presentation frequencies (per block of self-terminating or exhaustive trials) are presented in Table 2. These trial frequencies balance trials requiring positive and negative responses for the nongestalt and SC-gestalt stimuli; in addition, within the three stimulus organizations, the overall assignment of trial frequencies roughly equates frequencies across number of display elements.
<anchor name="tbl2"></anchor>
All images (faces and words) were displayed via a 33-cm diagonal video graphics array monitor, controlled by a PC-compatible microcomputer, and projected through one field of a Gerbrands two-field tachistoscope to restrict all nontask-specific sources of light. All timings (display and RTs) were controlled by the computer, and all collected RTs were accurate to ±1 ms. Display onsets were synchronized to the vertical refresh rate of the monitor.
The experiment was conducted as a 2 (stimulus type: faces, words) × 2 (stopping rule: self-terminating, exhaustive) × 3 (stimulus form: SC gestalt, SI gestalt, nongestalt) × 3 (number of elements displayed: two, three, four) × 5 (total number of target features present: none, one, two, three, four) incomplete factorial, with all factors manipulated within subject. Note that it was possible to have an SC-gestalt stimulus only when the number of target features was equal to the number of elements displayed. It was possible to have an SI-gestalt stimulus only when there were at least one but no more than
Each experimental session lasted approximately 1 hr and began with participants dark adapting for approximately 5 min. Sessions consisted of between four and eight blocks of either 212 trials (in the self-terminating blocks) or 214 trials (in the exhaustive blocks; see Table 2). Each block was consistent in terms of stimulus type and stopping rule. We sequenced the four orders of these four block types for each observer using a balanced Latin square.
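The sequencing of the four block types can be illustrated with a short sketch. It assumes the standard construction of a balanced Latin square for an even number of conditions, in which each block type occupies each serial position once and follows every other block type exactly once. The function name and the block labels are hypothetical and are not taken from the original materials.

```python
from itertools import product


def balanced_latin_square(conditions):
    """Return one order per row; each condition appears once per row and per
    serial position, and each ordered pair of neighbours appears exactly once.

    Uses the standard construction for an even number of conditions.
    """
    n = len(conditions)
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # The first row follows the index pattern 0, 1, n-1, 2, n-2, ...
    first_row = [0]
    low, high = 1, n - 1
    while len(first_row) < n:
        first_row.append(low)
        low += 1
        if len(first_row) < n:
            first_row.append(high)
            high -= 1
    # Every later row shifts the first row by one condition (mod n).
    return [[conditions[(i + shift) % n] for i in first_row] for shift in range(n)]


# Hypothetical labels for the four block types (stimulus type x stopping rule).
BLOCK_TYPES = [f"{stim}/{rule}" for stim, rule in
               product(("faces", "words"), ("self-terminating", "exhaustive"))]
ORDERS = balanced_latin_square(BLOCK_TYPES)
```

Each row of `ORDERS` is one candidate block order; how the orders were assigned to observers across sessions is not specified here and is left open in the sketch.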
At the beginning of each block of trials, participants were informed of the stimulus type and stopping rule. They were then given a chance to study the two source stimuli, which were presented side by side. The stimulus on the left was arbitrarily designated as the stimulus containing the target features, whereas the stimulus on the right was arbitrarily designated as containing the nontarget features. <anchor name="b-fn8"></anchor><sups>8</sups> These stimuli were presented intact (i.e., the complete source faces or source words), and participants could study the stimuli for as long as they wished.
Participants were then instructed on the stopping rule for the block. In the case of the self-terminating stopping rule (
Each trial began with the presentation of a fixation cross, positioned at the location of either the nose (for face trials) or the point between the second and third letters (for word trials), for 1,000 ms. This was then replaced by the trial stimulus, which was present for 75 ms. Following the participants' response, a tone was sounded briefly (250 ms) to indicate either a correct (880 Hz) or an incorrect (220 Hz) response. There was then a 500-ms intertrial interval. Participants were given short breaks at the midpoint of each block and were given feedback about their mean RT and overall accuracy at the end of each block.
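The trial sequence just described can be summarized as a small configuration sketch. The durations and tone frequencies are those stated above; the class and method names are hypothetical. The sketch also assumes that RT is timed from stimulus onset and that the feedback tone begins at the response, neither of which is stated explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialTiming:
    """Per-trial timing parameters, in milliseconds and hertz."""
    fixation_ms: int = 1000
    stimulus_ms: int = 75
    feedback_ms: int = 250
    intertrial_ms: int = 500
    correct_tone_hz: int = 880
    incorrect_tone_hz: int = 220

    def feedback_tone_hz(self, correct: bool) -> int:
        # High tone for a correct response, low tone for an incorrect one.
        return self.correct_tone_hz if correct else self.incorrect_tone_hz

    def fixed_duration_ms(self) -> int:
        # Portion of the trial that does not depend on the response latency.
        # The 75-ms display is not added: under the assumption above it falls
        # inside the response interval.
        return self.fixation_ms + self.feedback_ms + self.intertrial_ms

    def trial_duration_ms(self, rt_ms: int) -> int:
        # Assumption: RT is measured from stimulus onset.
        if rt_ms < 0:
            raise ValueError("rt_ms must be non-negative")
        return self.fixed_duration_ms() + rt_ms
```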
Results
To best compare what we observed with what we expected (on the basis of the simulations of the null models; see Table 1), we organize the analyses of the experimental data to correspond to those presented for the simulations. We began, however, by examining the data for effects due to practice. Although all participants did show improvements in the initial sessions of the experiment, there was no evidence to suggest differential effects of practice as a function of stimulus class, organization, and so forth. In addition, most of the improvements in performance occurred within the first eight blocks of trials with each stimulus type. Consequently, we discarded the first 10 sessions of data for each participant prior to completing our analyses.
For the data that remained, we first calculated median RTs for correct responses for each stimulus type (see Table 2). <anchor name="b-fn9"></anchor><sups>9</sups> We performed all remaining analyses on the means of these medians. For the data involved in the analyses reported here, error rates were in all cases less than 3.5%. We used an alpha level of .05 in all analyses and performed all analyses separately on the data for each observer.
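The two-step summary (medians of correct RTs, then means of those medians) can be sketched as follows for a single observer. The record layout, the field names, and the choice to group medians by a condition label are assumptions made for illustration, not a description of the original analysis code.

```python
from collections import defaultdict
from statistics import mean, median


def median_rt_by_stimulus(trials):
    """Median RT of correct responses for each (condition, stimulus) pair.

    `trials` is an iterable of dicts with the hypothetical keys
    'condition', 'stimulus', 'rt_ms', and 'correct'.
    """
    rts = defaultdict(list)
    for trial in trials:
        if trial["correct"]:  # error trials are excluded from the RT summary
            rts[(trial["condition"], trial["stimulus"])].append(trial["rt_ms"])
    return {key: median(values) for key, values in rts.items()}


def mean_of_medians(trials):
    """Average the per-stimulus medians within each condition."""
    by_condition = defaultdict(list)
    for (condition, _stimulus), med in median_rt_by_stimulus(trials).items():
        by_condition[condition].append(med)
    return {condition: mean(meds) for condition, meds in by_condition.items()}
```

Using medians at the first step limits the influence of occasional very slow responses on each cell before the cells are averaged.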
<h31 id="xhp-32-3-755-d291e976">Nongestalt Stimuli</h31>For the purposes of the analyses, we separated the nongestalt trials into two sets. The first set contained stimuli in which the number of target elements was less than the number of display elements and those elements were not in their appropriate locations in the stimulus (0 ≤
We analyzed RTs for correct
<anchor name="tbl3 tbl4"></anchor>
<anchor name="fig7"></anchor>
Consider first the data for the
Next consider the data for the
The data so far suggest that observers were processing the elements of both the face and the word nongestalt stimuli in a parallel manner, using a stopping rule that was consistent with the task demands. These results replicate the findings we have obtained with scrambled facial images (Wenger & Townsend, 2001) and schematic forms (Ingvalson & Wenger, 2005), for which the inferences regarding the processing of nonfacial forms were identical to those we obtained for facial forms. The qualitative form of the results and theoretical inferences are also consistent with the results of experiments with simple dot detection (Townsend & Nozawa, 1995).
<h31 id="xhp-32-3-755-d291e1053">SI- and SC-Gestalt Stimuli</h31>As was the case for the simulation data, we define the SI-gestalt stimuli to be those in which the number of targets is one less than the number of display elements and the elements are all in their appropriate positions. We define the SC-gestalt stimuli to be those in which the number of targets equals the number of display elements and the elements are all in their appropriate places. As the two types of gestalt stimuli differed in terms of the number of target elements, we performed comparisons of each type with the nongestalt stimuli that possessed the same number of target and display elements. That is, we compared performance on the SI-gestalt stimuli with performance on the nongestalt stimuli in which the number of target elements was one less than the number of display elements (
<anchor name="fig8"></anchor>
Consider first the data for the
<anchor name="tbl5"></anchor>
Another critical result from these analyses is the consistently reliable effect of gestalt type. For both faces and words, the effect of going from a nongestalt to an SI-gestalt configuration was a substantial slowing of responding. Recall that the null models we presented at the outset had no way of distinguishing the organization of the stimuli and, as a consequence, predicted no difference as a function of gestalt type. Obviously, our initial models require alterations to account for the effects of meaningful organization, and we consider two possible modifications in the Discussion section.
We found additional divergence from the original models in the data from the SI-gestalt trials involving an
Consider next the data for the SC-gestalt stimuli (see Table 6). Here we found additional discrepancies with our null models. These were stimuli that required
<anchor name="tbl6"></anchor>
In summary, the data from the nongestalt trials are firmly in line with standard parallel processing predictions. In contrast, the SI-gestalt and SC-gestalt results departed in notable ways from the standard predictions of either class of architecture. These departures include invariance of RT to display size in several pertinent gestalt conditions and quite substantial detrimental (for SI gestalts) or beneficial (for SC gestalts) influences of stimulus organization. We consider and model these findings in the Discussion.
<h31 id="xhp-32-3-755-d291e1121">Capacity Effects</h31>The data so far suggest the following preliminary inferences regarding the characteristics of processing: Observers appeared to be processing the inputs in parallel, using a stopping rule that was consistent with task demands. In addition, the orderings on latencies as a function of stimulus organization (see Figure 8) suggest that some aspect of processing must differ as a function of stimulus organization. In particular, models that assume independence and within-item unlimited capacity of processing in the channels, even parallel channels, cannot account for the patterns observed in the SC- and SI-gestalt data. The two remaining process characteristics to consider are channel independence and process capacity, and we focus on the latter as a way of drawing inferences regarding the former.
To directly assess the extent to which process capacity was influenced by stimulus organization, we used a set of tools that allow for hypothesis testing at a statistical level of analysis that maps much more directly and in a more fine-grained manner to the construct of capacity than does the level of the mean (Townsend & Ashby, 1978; Townsend & Nozawa, 1995; Townsend & Wenger, 2004b; Wenger & Gibson, 2004). In particular, we examined the extent to which stimulus organization affected performance at the level of the RT distribution hazard function.
The hazard function is a conditional probability density function, <anchor name="eq1"></anchor>
where
To test hypotheses regarding effects of experimental manipulations at the level of the hazard function, we take advantage of a set of tools developed in areas outside experimental psychology. These tools are regression techniques known broadly as proportional hazards models (Allison, 1984; Collett, 1994; Cox, 1972; Therneau & Grambsch, 2000). Very recently, we explored the application of these approaches to RT data and found them to have a number of desirable properties (see Wenger & Gibson, 2004; Wenger, Schuster, Petersen, & Petersen, in press).
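As a concrete handle on the quantity being modeled, the hazard function is standardly defined as h(t) = f(t) / [1 - F(t)], the density of responding at time t given that no response has yet occurred. The sketch below computes a simple binned (life-table style) estimate of this quantity from a sample of RTs. It is an illustration only: the function name and the binning scheme are assumptions, and it is not the regression-based procedure used for the hypothesis tests reported here.

```python
def empirical_hazard(rts_ms, bin_width_ms):
    """Binned hazard estimate: responses in a bin divided by the number of
    trials still 'at risk' at the start of the bin, per millisecond.

    Returns a list of (bin_start_ms, hazard_per_ms) pairs.
    """
    if not rts_ms:
        raise ValueError("need at least one RT")
    if bin_width_ms <= 0:
        raise ValueError("bin_width_ms must be positive")
    start = min(rts_ms)
    n_bins = int((max(rts_ms) - start) // bin_width_ms) + 1
    counts = [0] * n_bins
    for rt in rts_ms:
        counts[int((rt - start) // bin_width_ms)] += 1
    at_risk = len(rts_ms)  # trials with no response before the current bin
    hazard = []
    for k, responded in enumerate(counts):
        hazard.append((start + k * bin_width_ms,
                       responded / (at_risk * bin_width_ms)))
        at_risk -= responded
    return hazard
```

A uniformly higher hazard in one condition than in another indicates that, at every moment, a response that has not yet occurred is more likely to occur in the next instant, which is the sense in which the hazard function indexes capacity.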
The particular models we use for hypothesis testing assume that
where α
We applied these models in two sets of comparisons, the first involving the SC-gestalt stimuli and the nongestalt stimuli (with equal numbers of target and display elements;
Here, the subscripts
<anchor name="fig9"></anchor>
Discussion
We began our investigation by specifying a set of models to represent combinations of assumptions regarding process architecture (serial or parallel) and the stopping rule (
We then compared the data obtained in our experiment against the patterns generated by each of the simulation models. Although individual performance varied (see also Townsend & Fific, 2004), there was also a high level of consistency in the inferences derived from the data. With a single (if not highly persuasive) exception, the evidence favors parallel processing. In many cases, outside certain of the SI-gestalt and SC-gestalt results, the robust qualitative evidence is consistent with parallel processing that was independent and unlimited in capacity. At this level, our results are consistent with those of Palmer (1994, 1995; Palmer et al., 2000), whose accuracy data (from experiments using reasonably simple stimulus forms) were consistent with independent, unlimited-capacity, parallel processing. Although the patterns we found were stable across observers, of even more import is the uniformity across stimulus forms. Thus, although words and faces supported different levels of performance, the inferences regarding process architecture were invariant across these levels of performance. One can see this best by comparing the analyses presented in Tables 3–8, which show consistently similar outcomes, independent of levels of performance. Note that the need to preserve a common scaling in all figures results in some difficulty in seeing reliable differences in the word data. This qualitative equivalence across stimulus forms is a pattern we have observed in other work (see, in particular, Ingvalson & Wenger, 2005; Wenger & Ingvalson, 2002, 2003; Wenger & Townsend, 2001). This striking result provides evidence against any theory that proposes a shift of architecture from parallel to serial in moving from easy-to-process (here, gestalt) patterns to difficult-to-process patterns. 
For instance, an extension of the Treisman (1986) type of system, in which single-feature targets are located in parallel but conjunctions must be sought serially, would not be appropriate for our data.
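The qualitative contrast between the standard architectures can be made concrete with a deliberately simplified sketch. It assumes that every displayed element is a target, that item completion times are independent and exponentially distributed with a common mean, and that there is no residual (base) time. These are textbook simplifications chosen for illustration; they are not the dynamic system used in our simulations, and the function name is hypothetical.

```python
def predicted_mean_rt(n_items, mean_item_ms, architecture, stopping_rule):
    """Mean completion time under independent exponential item times.

    architecture: 'serial' or 'parallel' (unlimited capacity, independent).
    stopping_rule: 'self-terminating' (stop at the first completed target)
                   or 'exhaustive' (wait for every item).
    """
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    if stopping_rule not in ("self-terminating", "exhaustive"):
        raise ValueError("unknown stopping rule")
    if architecture == "serial":
        # Items finish one after another; means simply add.
        n_completed = 1 if stopping_rule == "self-terminating" else n_items
        return n_completed * mean_item_ms
    if architecture == "parallel":
        if stopping_rule == "self-terminating":
            # Minimum of n independent exponentials.
            return mean_item_ms / n_items
        # Maximum of n independent exponentials: mean times the harmonic number.
        return mean_item_ms * sum(1.0 / k for k in range(1, n_items + 1))
    raise ValueError("unknown architecture")
```

Under these assumptions, adding targets speeds a parallel self-terminating system, slows a parallel exhaustive system at a decelerating rate, and slows a serial exhaustive system linearly, which is the kind of ordinal pattern against which the observed effects of display size can be compared.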
With respect to response rule, the data from the nongestalt stimuli in all cases and the SI-gestalt stimuli in
It may seem odd at first encounter that RTs for SC-gestalt stimuli in the
However, how could exhaustive processing be faster on
It is more difficult to say why SI-gestalt RTs did not decrease with increasing numbers of targets (and display size) in
The emerging patterns of inference from the architecture and stopping rule analyses are buttressed by the patterns observed in our capacity statistics. We found that the SC-gestalt stimuli conferred a benefit in capacity, and this was particularly true for faces. In contrast, the SI stimuli imposed a cost in capacity. Our use of the dynamic systems models to explore capacity relations in parallel processing (Townsend & Wenger, 2004b) suggests that these types of shifts in capacity are very closely related to dependencies among the channels (with empirical explorations—e.g., Ingvalson & Wenger, 2005; Wenger & Townsend, 2001—being consistent with these suggestions). Consequently, the patterns observed in our data strongly suggest that we need to examine the assumption of channel independence. We emphasize that we intentionally did not attempt to consider the outcomes just described in our initial set of null models, models in which we assumed no processing differences to exist as a function of stimulus form. We did this to (a) bring an initial focus on the questions of architecture and stopping rule, as these have been the dimensions most frequently considered in the literature, and (b) examine the effects of stimulus organization from the perspective of models constructed according to well-understood assumptions (channel independence and unlimited capacity).
We considered two ways we could relax the assumption of channel independence to produce patterns consistent with the data. First, we could allow the channels to share activations as they are accumulating. This is the approach we used in our prior theoretical explorations and is a possibility that leads to reliable and large changes in capacity (Townsend & Wenger, 2004b). Given our assumptions regarding the coding of the inputs (positive sign for target elements, negative for nontarget elements), allowing cross-talk between the channels would allow the accumulation of positive evidence to be slowed by the influence of the negatively signed evidence associated with nontarget elements. We refer to this model as the
The results are presented in Figure 10, with error rates corresponding to each of the displayed means presented in the bottom of each panel. Generally, both the cross-talk and the decisional-shift model were able to produce costs and benefits. However, the cross-talk model diverged from the observed data in one very important way. That is, with an
<anchor name="fig10"></anchor>
In contrast, the decisional-shift model produced results for the
In sum, our conclusions offer strong support for our hypotheses regarding configural processing mechanisms. First, we have strong and consistent evidence supporting parallel processing with the nongestalt as well as the gestalt patterns. Second, given a well-configured set of elements drawn from a consistent source, capacity was super relative to that obtained for unorganized stimuli, both for faces and words. Although independence cannot be directly assessed in this (and most) RT paradigms, channel dependencies form a cogent explanation for both the facilitatory and the inhibitory effects in the SC-gestalt and SI-gestalt stimuli, respectively, and hold particular importance with respect to the shifts in capacity. The surprising discovery of exhaustive processing (when self-terminating responding was possible) with several of the gestalt patterns, though not ubiquitous in all of our previous experiments, suggests that it can indeed occur.
Certainly, we have not found any evidence for serial processing either in nongestalt or in gestalt patterns. If we accept the general conclusion of parallelism, the critical dimension along which the processing of gestalt and nongestalt stimuli differs is capacity, with changes in capacity in the present study apparently being determined by channel interdependencies, expressed through shifts in internal response criteria. As such, our conclusions reinforce a set of inferences that we have documented in other work. In particular, we have shown, in a set of studies, that the dimension of process architecture does not distinguish the processing of gestalt and nongestalt forms (Ingvalson & Wenger, 2005; Wenger & Townsend, 2001). In addition, we have documented that shifts in stimulus form that have previously been interpreted in terms of a distinction between holistic and nonholistic processing (per, e.g., Farah et al., 1998; Tanaka & Farah, 1993; Tanaka & Sengco, 1997) are produced in large part by shifts in response criteria (Wenger & Ingvalson, 2002, 2003). Although we do not believe that this pattern of outcomes across studies holds any necessary implication for the debate regarding the anatomical segregation of processing gestalt and nongestalt (or face and object) forms (e.g., Gauthier, Skudlarski, Gore, & Anderson, 2000; Gauthier, Tarr, Anderson, Skudlarski, & Gore, 1999; Kanwisher et al., 1997; Kanwisher, Stanley, & Harris, 1999; McDermott, Buckner, Petersen, Kelley, & Sanders, 1999), we do believe that our data challenge the notion that there exist distinct forms of processing for gestalt and nongestalt stimuli.
In closing, in the present study we have relied on a dynamic metatheory of processing (described in detail in Townsend & Wenger, 2004b) to exemplify earlier predictions for standard serial and parallel models. These models are intuitively reasonable candidates for perceptual and cognitive processing. We can readily modify them via alterations in our basic processing dimensions to probe nonstandard model assumptions, which, for example, are clearly needed to encompass coherently patterned stimuli. Through simulations, the models can produce both qualitative and quantitative predictions for similarly structured experimental tasks, allowing one to work from theory to data.
Footnotes
<anchor name="fn1"></anchor><sups> 1 </sups> Certain segments of the literature on holistic (we use the term generically) stimuli debate whether they differ from nonholistic stimuli via interpart relations versus a templatelike holism. In fact, adherents to one or the other view have frequently adopted a variety of relatively neutral terms (e.g.,
<sups> 2 </sups> Readers will certainly note the logical symmetry of these two response rules, relative to the use of target and nontarget elements for positive and negative responses. In principle, it is possible for observers to strategically shift these assignments. However, if this were the case, the rules for terminating processing would remain logically distinct. Consequently, the methods we use for distinguishing various characteristics of processing, particularly architecture, are unaffected. Furthermore, observers are generally slower and less efficient when processing negation (relative to the positive alternative; see, e.g., Wenger, 1999; Zbrodoff & Logan, 2000), an effect we observe in the experimental work reported here. Finally, over many years of experimenting with these types of response rules, we have yet to obtain evidence suggestive of observers searching for negatives.
<anchor name="fn3"></anchor><sups> 3 </sups> We reinforce the distinction between the two stopping rules in terms of positive and negative responses, as we discussed in Footnote 2.
<anchor name="fn4"></anchor><sups> 4 </sups> Although diffusion models have received a good deal of well-deserved attention in recent years, it is worth noting that specific models—including those developed both by Ratcliff and Smith (2004) and by the approach we describe—are all special cases of noisy linear systems (see, e.g., Townsend & Wenger, 2004b). The advantage that we find in the approach used here is the ability to specify the form of the input, the initial channel conditions, and the complete set of channel relations in a way that allows for a very direct translation of system-level hypotheses into formal representations.
<anchor name="fn5"></anchor><sups> 5 </sups> We designed the simulations to implement the distinction between positive (
<sups> 6 </sups> All of the patterns we discuss here are supported by statistical analyses of the simulation data, which are available on request. We analyzed the error rate data generated in the simulations for the nongestalt stimuli following an arcsine transformation (see, e.g., Zar, 1999); there were no reliable effects due to any of the independent variables.
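A minimal sketch of the transformation follows. It assumes the common form in which a proportion p is mapped to arcsin(sqrt(p)) in radians; some texts multiply this value by 2, and which variant was applied is not specified here. The function name is hypothetical.

```python
import math


def arcsine_transform(proportion):
    """Variance-stabilizing transform for a proportion in [0, 1]."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))
```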
<anchor name="fn7"></anchor><sups> 7 </sups> Analyses of the transformed values of the error rates for the simulation data for the two gestalt types showed no reliable effects due to any of the independent variables.
<anchor name="fn8"></anchor><sups> 8 </sups> Assignment of stimuli to target and nontarget status was balanced across observers. This stimulus assignment was constant for each observer across his or her entire period of participation.
<anchor name="fn9"></anchor><sups> 9 </sups> Note that in Table 2, differences in total and item-specific presentation frequency are confounded with gestalt type. This was an unfortunate concomitant to our desire to balance overall presentation frequency across available item types. To assess the extent to which we could interpret the data in spite of this confound, we examined the capacity results (see Tables 7 and 8, to be introduced later) to determine the answers to three questions. First, were there any item effects? The answer to this question was negative, meaning that we could examine total presentation frequency (
<anchor name="tbl7 tbl8"></anchor>
<sups> 10 </sups> We transformed error rates for each observer using the arcsine transformation used with the simulation data. For the nongestalt trials, error rates were invariant across all of the independent variables.
<anchor name="fn11"></anchor><sups> 11 </sups> Analyses of the transformed error rates for the two gestalt types indicated that error rates were invariant across all of the independent variables.
<anchor name="fn12"></anchor><sups> 12 </sups> Indeed, the magnitude of the effect, quantified in terms of a standardized measure of strength of association for linear regression models (Camp & Maxwell, 1983), was 0.2901 for faces and 0.2636 for words (averaged across observers).
<sups> 13 </sups> One might think that, with the reliable orderings observed in the mean RT data, we should not have to perform these tests. However, as demonstrated by Townsend (1990b), an ordering at the level of the mean does not imply an ordering at the level of the hazard function. Consequently, these tests are necessary for both conceptual and statistical purposes.
<anchor name="fn14"></anchor><sups> 14 </sups> The test of the hypothesis proceeds by estimating a likelihood ratio that compares the restricted model (i.e., the one in which all of the predictor coefficients are zero) with a fully parameterized model. The resulting likelihood ratio statistic has an asymptotic χ<sups>2</sups> distribution, which is used to guide the final inference. Additional details can be found in Collett (1994) and Therneau and Grambsch (2000).
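The logic of this test can be sketched as follows. The statistic is twice the difference between the maximized log-likelihoods of the full and restricted models, and it is referred to a χ2 distribution with degrees of freedom equal to the number of coefficients fixed at zero. The function names are hypothetical, the log-likelihood values are assumed to come from a separate model-fitting step, and the tail probability uses standard closed forms for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and df >= 1")
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite correction sum.
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    normal_pdf = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * normal_pdf * total


def likelihood_ratio_test(loglik_restricted, loglik_full, df):
    """Return (statistic, p_value) for nested models differing by df terms."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_restricted))
    return statistic, chi2_sf(statistic, df)
```

For example, `likelihood_ratio_test(-105.0, -100.0, df=1)` yields a statistic of 10 and a p value well below the .05 criterion used throughout.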
References
<anchor name="c1"></anchor>Ahissar, M., & Hochstein, S. (2000). The spread of attention and learning in feature search: Effects of target distribution and task difficulty.
Allison, P. D. (1984).
Ashby, F. G. (1976).
Ashby, F. G. (2000). A stochastic version of general recognition theory.
Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U., & Waldron, E. M. (1999). A neuropsychological theory of multiple systems in category learning.
Ashby, F. G., & Lee, W. W. (1993). Perceptual variability as a fundamental axiom of perceptual science. In S. C. Masin (Ed.),
Ashby, F. G., Maddox, W. T., & Lee, W. W. (1994). On the dangers of averaging across subjects when using multidimensional scaling or the similarity choice model.
Ashby, F. G., & Townsend, J. T. (1986). Varieties of perceptual independence.
Atkinson, R. C., Holmgren, J. R., & Juola, J. F. (1969). Processing time as influenced by the number of elements in a visual display.
Bartlett, J. C., & Searcy, J. (1993). Inversion and configuration of faces.
Blaha, L. M., & Townsend, J. T. (2004, July).
Bricolo, E., Gianesini, T., Fanini, A., Bundesen, C., & Chelazzi, L. (2002). Serial attention mechanisms in visual search: A direct behavioral demonstration.
Bruce, V. (1979). Searching for politicians: An information processing approach to face recognition.
Bundesen, C. (1990). A theory of visual attention.
Busemeyer, J. R., & Townsend, J. T. (1993). Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment.
Camp, C. J., & Maxwell, S. E. (1983). A comparison of various strength-of-association measures commonly used in gerontological research.
Cave, K. R., & Wolfe, J. M. (1990). Modeling the role of parallel processing in visual search.
Chase, W. G., & Ericsson, K. A. (1981). Skilled memory. In J. R. Anderson (Ed.),
Collett, D. (1994).
Cox, D. R. (1972). Regression models and life tables (with discussion).
Dosher, B. A. (1998). Models of visual search: Finding a face in a crowd. In D. Scarborough & S. Sternberg (Eds.),
Dosher, B. A., Han, S., & Lu, Z.-L. (2004). Parallel processing in visual search asymmetry.
Dosher, B. A., & Lu, Z.-L. (1999). Mechanisms of perceptual learning.
Doyle, J. R., & Leach, C. (1988). Word superiority in signal detection: Barely a glimpse, yet reading nonetheless.
Dzhafarov, E. N. (1997). Process representations and decompositions of response times. In A. A. J. Marley (Ed.),
Eckstein, M. P. (1998). The lower visual search efficiency for conjunctions is due to noise and not serial attentional processing.
Eckstein, M. P., Thomas, J. P., Palmer, J., & Shimozaki, S. S. (2000). A signal detection model predicts the effects of set size on visual search accuracy for feature, conjunction, triple conjunction, and disjunction displays.
Egeth, H. (1966). Parallel versus serial processes in multidimensional stimulus discrimination.
Ellis, H. D. (1975). Recognizing faces.
Enns, J. T., & Rensink, R. (1990). Sensitivity to three-dimensional orientation in visual search.
Enns, J. T., & Rensink, R. (1991). Preattentive recovery of three-dimensional orientation from line drawings.
Estes, W. K. (1956). The problem of inference from curves based on group data.
Estes, W. K. (1975). The locus of inferential and perceptual processes in letter identification.
Estes, W. K., & Brunn, J. L. (1987). Discriminability and bias in the word superiority effect.
Farah, M. J., Wilson, K. D., Drain, M., & Tanaka, J. N. (1998). What is “special” about face perception?
Fisher, D. L. (1982). Limited-channel models of automatic detection: Capacity in scanning in visual search.
Gauthier, I., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). Expertise for cars and birds recruits brain areas involved in face recognition.
Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P., & Gore, J. C. (1999). Activation of the middle fusiform “face area” increases with expertise in recognizing novel objects.
Gilford, R. M., & Juola, J. F. (1976). Familiarity effects on memory and visual search.
Hampton, C., Purcell, D. G., Bersine, L., Hansen, C. H., & Hansen, R. D. (1989). Probing “pop-out”: Another look at the face-in-the-crowd effect.
Ingvalson, E. M., & Wenger, M. J. (2005). A strong test of the dual mode hypothesis.
Johnson, N. F., & Carnot, M. J. (1990). On time differences in searching for letters in words and nonwords: Do they emerge during the initial encoding or the subsequent scan?
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception.
Kanwisher, N., Stanley, D., & Harris, A. (1999). The fusiform face area is selective for faces not animals.
Karlin, M. B., & Bower, G. H. (1976). Semantic category effects in visual word search.
Kinchla, R. A., Chen, Z., & Evert, D. (1995). Precue effects in visual-search: Data or resource limited?
Klein, R. (1978). Visual detection of line segments: Two exceptions to the object superiority effect.
Koshino, H. (2001). Activation and inhibition of stimulus features in conjunction search.
Kuehn, S. M., & Jolicœur, P. (1994). Impact of the quality of the image, orientation, and similarity of the stimuli on visual search for faces.
Levin, D. N. (2000). A differential geometric description of the relationship among percepts.
Levin, D. T. (1996). Classifying faces by race: The structure of face categories.
Levin, D. T., & Angelone, B. (2001). Visual search for a socially defined feature: What causes the search asymmetry favoring cross-race faces?
Lu, Z.-L., & Dosher, B. A. (1998). External noise distinguishes attention mechanisms.
Martelli, M., Majaj, N. J., & Pelli, D. G. (2005). Are faces processed like words? A diagnostic test for recognition by parts.
Massaro, D. W. (1979). Letter information and orthographic context in word perception.
McDermott, K. B., Buckner, R. L., Petersen, S. E., Kelley, W. M., & Sanders, A. L. (1999). Set- and code-specific activation in frontal cortex: An fMRI study of encoding and retrieval of faces and words.
McElree, B., & Carrasco, M. (1999). The temporal dynamics of visual search: Evidence for parallel processing in feature and conjunction searches.
Mermelstein, R., Banks, W., & Prinzmetal, W. (1979). Figural goodness effects in perception and memory.
Naus, M. J. (1974). Memory search of categorized lists: A consideration of alternative self-terminating search strategies.
Naus, M. J., Glucksberg, S., & Ornstein, P. A. (1972). Taxonomic word categories and memory search.
Neisser, U. (1963). Decision time without reaction time: Experiments in visual scanning.
Neisser, U. (1964). Visual search.
Nothdurft, H. C. (1993). Faces and facial expressions do not pop out.
O'Toole, A. J., Wenger, M. J., & Townsend, J. T. (2001). Quantitative models of perceiving and remembering faces: Precedents and possibilities. In M. J. Wenger & J. T. Townsend (Eds.),
Palmer, J. (1994). Set-size effects in visual-search: The effect of attention is independent of the stimulus for simple tasks.
Palmer, J. (1995). Attention and visual search: Distinguishing four causes of a set-size effect.
Palmer, J., Verghese, P., & Pavel, M. (2000). The psychophysics of visual search.
Peterson, M. A., & Rhodes, G. (Eds.). (2003).
Prinzmetal, W. (1992). The word-superiority effect does not require a T-scope.
Radvansky, G. A., & Zacks, R. T. (1991). Mental models and the fan effect.
Ratcliff, R., & Smith, P. L. (2004). A comparison of sequential sampling models for two-choice reaction time.
Reicher, G. M. (1969). Perceptual recognition as a function of meaningfulness of stimulus material.
Schweickert, R. (1983). Latent network theory: Scheduling of processes in sentence verification and the Stroop effect.
Schweickert, R., Fisher, D. L., & Goldstein, W. M. (1992).
Schweickert, R., Giorgini, M., & Dzhafarov, E. (2000). Selective influence and response time cumulative distribution functions in serial-parallel task networks.
Smith, P. L. (1995). Psychophysically principled models of visual simple reaction time.
Smith, P. L. (1998). Bloch's law predictions from diffusion process models of detection.
Smith, P. L. (2000). Stochastic dynamic models of response time and accuracy: A foundational primer.
Smith, P. L., & Ratcliff, R. (2004). Psychology and neurobiology of simple decisions.
Sperling, G., & Dosher, B. A. (1986). Strategy and optimization in human information processing. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.),
Sternberg, S. (1966, August 8). High-speed scanning in human memory.
Strongman, K. T., & Brown, R. (1966). Visual search with meaningful and non-meaningful material.
Suzuki, S., & Cavanagh, P. (1992). Facial expression as an emergent feature in visual search.
Suzuki, S., & Cavanagh, P. (1995). Facial organization blocks access to low-level features: An object inferiority effect.
Tanaka, J. W., & Farah, M. J. (1993). Parts and wholes in face recognition.
Tanaka, J. W., & Sengco, J. A. (1997). Features and their configuration in face recognition.
Therneau, T. M., & Grambsch, P. M. (2000).
Thomas, R. D. (2001). Characterizing perceptual interactions in face identification using multidimensional signal detection theory. In M. J. Wenger & J. T. Townsend (Eds.),
Toglia, M. P., & Battig, W. F. (1978).
Tong, F., & Nakayama, K. (1999). Robust representations for faces: Evidence from visual search.
Townsend, J. T. (1972). Some results concerning the identifiability of parallel and serial processes.
Townsend, J. T. (1974). Issues and models concerning the processing of a finite number of inputs. In B. H. Kantowitz (Ed.),
Townsend, J. T. (1976). Serial and within-stage independent parallel model equivalence on the minimum completion time.
Townsend, J. T. (1984). Uncovering mental processes with factorial experiments.
Townsend, J. T. (1990a). Serial vs. parallel processing: Sometimes they look like Tweedledum and Tweedledee but they can (and should be) distinguished.
Townsend, J. T. (1990b). Truth and consequences of ordinal differences in statistical distributions: Toward a theory of hierarchical inference.
Townsend, J. T., & Ashby, F. G. (1978). Methods of modeling capacity in simple processing systems. In J. Castellan & F. Restle (Eds.),
Townsend, J. T., & Ashby, F. G. (1983).
Townsend, J. T., & Colonius, H. (1997). Parallel processing response times and experimental determination of the stopping rule.
Townsend, J. T., & Fific, M. (2004). Parallel versus serial processing and individual differences in high-speed search in human memory.
Townsend, J. T., Hu, G. G., & Evans, R. J. (1984). Modeling feature perception in brief displays with evidence for positive interdependencies.
Townsend, J. T., Hu, G. G., & Kadlec, H. (1988). Feature sensitivity, bias, and interdependencies as a function of intensity and payoffs.
Townsend, J. T., & Nozawa, G. (1995). On the spatio-temporal properties of elementary perception: An investigation of parallel, serial, and coactive theories.
Townsend, J. T., & Schweickert, R. (1989). Toward the trichotomy method: Laying the foundation of stochastic mental networks.
Townsend, J. T., & Wenger, M. J. (2004a). The serial-parallel dilemma: A case study in a linkage of theory and method.
Townsend, J. T., & Wenger, M. J. (2004b). A theory of interactive parallel processing: New capacity measures and predictions for a response time inequality series.
Treisman, A. M. (1986). Features and objects in visual processing.
Usher, M., & McClelland, J. L. (2004). Loss aversion and inhibition in dynamical models of alternative choice.
Weisstein, N., & Harris, C. S. (1974, November 22). Visual detection of line segments: An object superiority effect.
Wenger, M. J. (1999). On the whats and hows of retrieval in the acquisition of a simple skill.
Wenger, M. J., & Gibson, B. S. (2004). Using hazard functions to assess changes in processing capacity in an attentional cuing paradigm.
Wenger, M. J., & Ingvalson, E. M. (2002). A decisional component of holistic encoding.
Wenger, M. J., & Ingvalson, E. M. (2003). Preserving informational separability and violating decisional separability in facial perception and recognition.
Wenger, M. J., & Payne, D. G. (1995). On the acquisition of mnemonic skill: Application of skilled memory theory.
Wenger, M. J., Schuster, C., Petersen, L. M., & Petersen, R. C. (in press). Applying proportional hazards models to response time data. In C. Bergeman & S. Boker (Eds.),
Wenger, M. J., & Townsend, J. T. (2000). Basic response time tools for studying general processing capacity in attention, perception, and cognition.
Wenger, M. J., & Townsend, J. T. (2001). Faces as gestalt stimuli: Process characteristics. In M. J. Wenger & J. T. Townsend (Eds.),
Wheeler, D. D. (1970). Processes in word identification.
Widmayer, M., & Purcell, D. G. (1982). Visual scanning of line segments: Object superiority and its reversal.
Zar, J. H. (1999).
Zbrodoff, J. N., & Logan, G. D. (2000). When it hurts to be misled: A Stroop-like effect in a simple addition production task.
The verbal description of the models provided in the text can be formalized, first for a parallel processing architecture, as a system of differential equations for the four processing channels. <sup>A1</sup> The manner in which the perceptual information in the system changes over time can be written as <anchor name="eq4"></anchor>
Here, x(t) is a four-element vector holding the level of information in each of the four processing channels at any given instant in time, A is a 4 × 4 matrix of coefficients that determines the rate at which each channel accumulates evidence, and u(t) is a four-element vector of inputs (one for each channel), with the form for each input given by <anchor name="eq5"></anchor>
i = 1 … 4, with t* indexing the time of the onset of the input to the channel. <sup>A2</sup> When the input is from the target set of elements, k > 0; when the input is from the nontarget set, k < 0. Finally, B is a 4 × 4 array of coefficients that specifies how the inputs are distributed to each of the processing channels.
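One standard way to write a linear system that matches these definitions is sketched below. It is offered as an assumption about the general form, not as a transcription of Equations A1 and A2. The noise symbol η_i(t) is a label introduced here for the zero-mean Gaussian component of the input.

```latex
\frac{d}{dt}\,\mathbf{x}(t) = \mathbf{A}\,\mathbf{x}(t) + \mathbf{B}\,\mathbf{u}(t),
\qquad
u_i(t) =
\begin{cases}
k + \eta_i(t), & t \ge t^{*} \\
0,             & t < t^{*}
\end{cases}
\qquad i = 1, \ldots, 4 .
```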
Our parallel version of the null hypothesis (no difference in the processing of meaningful and meaningless inputs) places some constraints on this general representation. That is, we assume that the channels are, at all points in processing, independent. We implement this by requiring that the off-diagonal elements of both A and B be set to 0, resulting in <anchor name="eq6"></anchor>
and <anchor name="eq7"></anchor>
or the identity matrix.
An additional constraint on the specification of the model can be found in the diagonal elements of A. These elements, −r, are the rate parameters for the channels. All of the models presented in this article assume that −r < 0 (i.e., r > 0). We have made this assumption to make the channels stable (i.e., to prevent their activation levels from running off to either +∞ or −∞). In addition, we assume that the value of the rate parameter is constant across channels, inputs, and time. Under all of these assumptions, we can expand the system of differential equations for the parallel system (Equation A1) and write it as <anchor name="eq8"></anchor>
Under these assumptions and with the additional assumption that the level of activation in any channel is zero at the outset of a trial, the mean of the solution to the stochastic differential equation for channel i is <anchor name="eq9"></anchor>
The overall stochastic trajectory will be normally distributed at each point in time and will possess a variance that is proportional to the elapsed time (see Townsend & Wenger, 2004b). For the independent models we are considering at this point, the covariance of each channel with the others will be zero.
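As a worked step, the mean trajectory of a single independent channel can be derived under three assumptions that are made here for illustration: the diagonal entries of A are −r with r > 0, B is the identity matrix, and the input is a constant k plus zero-mean noise from onset time t* onward. Taking expectations removes the noise term and leaves a first-order linear equation for the mean m_i(t) of channel i.

```latex
\frac{d}{dt}\,m_i(t) = -r\,m_i(t) + k, \qquad m_i(t^{*}) = 0
\quad\Longrightarrow\quad
m_i(t) = \frac{k}{r}\left(1 - e^{-r\,(t - t^{*})}\right), \qquad t \ge t^{*} .
```

Under this form the mean approaches k/r, so a channel can reach a threshold of magnitude γ on average only if |k|/r exceeds γ.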
The activation level in each channel is compared at each instant in time with two thresholds, +γ and −γ, <sup>A3</sup> corresponding to thresholds for positive and negative responses, respectively. The response from any single channel is determined by the threshold that is exceeded first for that channel, and the corresponding latency for that channel's response is the first time for which its threshold is exceeded.
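The threshold comparison for one channel can be sketched as a discrete simulation in 1-ms steps. This is a minimal illustration, not the authors' code: the function name, the default parameter values, and the noise scale `sigma` are hypothetical, and the update assumes a channel of the form dx/dt = −r·x + u with r > 0.

```python
import numpy as np

def simulate_channel(k, r=0.01, gamma=20.0, sigma=1.0, t_onset=0,
                     max_ms=5000, rng=None):
    """Return (response, latency_ms) for one channel.

    response is +1 or -1 according to the threshold reached first,
    or 0 (with latency None) if neither is reached within max_ms.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = 0.0  # activation is zero at the outset of the trial
    for t in range(1, max_ms + 1):
        # Before onset the input is zero; afterwards it is k plus Gaussian noise.
        u = k + sigma * rng.standard_normal() if t > t_onset else 0.0
        x += -r * x + u  # one 1-ms Euler step of dx/dt = -r*x + u
        if x >= gamma:
            return +1, t
        if x <= -gamma:
            return -1, t
    return 0, None

# Example: a target input (k > 0) and a nontarget input (k < 0).
rng = np.random.default_rng(1)
print(simulate_channel(k=1.0, rng=rng))
print(simulate_channel(k=-1.0, rng=rng))
```

Setting `sigma=0` gives the noise-free trajectory, which is convenient for checking that the sign of k determines which threshold is reached.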
We determine the response for the system by combining the responses from all of the channels in a way that is consistent with the type of trial (or, and) being simulated. For example, assume that the inputs consist of two elements from the target set and two elements from the nontarget set and assume that an or response rule is being used. Assume that this produces two positive outputs and two negative outputs (from the appropriate channels). A correct positive response in this case would be the minimum of the two latencies associated with the two positive channel responses, conditional on that minimum being shorter than the maximum of latencies from the two channels processing the negative inputs. To simulate observable latencies, we add a value for a normally distributed random variable, N(μm, σm), representing the distribution of motor output times.
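The or combination rule described above can be sketched as a small function over per-channel outputs. This is one reading of the verbal description, not the authors' implementation: the function names are hypothetical, each channel output is assumed to be a (sign, latency) pair with sign +1 or −1, and σm is treated as a standard deviation.

```python
import numpy as np

def or_rule(outputs):
    """Combine parallel channel outputs under an OR rule.

    outputs: non-empty list of (sign, latency_ms) pairs, sign in {+1, -1}.
    A positive response occurs at the fastest positive latency, provided
    it beats the slowest negative latency; otherwise the response is
    negative at the slowest negative latency.
    """
    pos = [lat for sign, lat in outputs if sign > 0]
    neg = [lat for sign, lat in outputs if sign < 0]
    if pos and (not neg or min(pos) < max(neg)):
        return +1, min(pos)
    return -1, max(neg)

def observed_rt(decision_ms, mu_m, sigma_m, rng):
    """Add a normally distributed motor time (sigma_m assumed to be an SD)."""
    return decision_ms + rng.normal(mu_m, sigma_m)

# Two target channels and two nontarget channels, as in the example above.
outputs = [(+1, 310), (+1, 270), (-1, 400), (-1, 350)]
response, decision_ms = or_rule(outputs)
print(response, decision_ms)  # 1 270
rng = np.random.default_rng(0)
print(observed_rt(decision_ms, mu_m=200.0, sigma_m=20.0, rng=rng))
```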
We can picture our standard serial models as being transformations from the previously described parallel models. Thus, we first require a mechanism that implements sequentiality. We do this by respecifying the input functions (Equation A2). In particular, for any channel 1 < i ≤ 4 (where i refers to the temporal order in which the channel is to be processed), <anchor name="eq10"></anchor>
where t*i−1 is the time at which the preceding channel generated either a positive or a negative response. We also need to change the way the outputs of the channels are combined to generate the observable response and its associated latency. For example, if we again assume that we have two target and two nontarget inputs, we assume they are to be processed in the order nontarget, target, target, nontarget, and we assume an or response rule, then the latency for a correct positive response will be the sum of the processing times for the first and second channels (plus the random variable representing motor output time). Figure 3 presents a schematic representation of the serial model for the search task.
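The serial combination in the example above can be sketched as a running sum that stops at the first positive channel. The function name and the numeric durations are hypothetical; each entry is assumed to hold a channel's response sign and its own processing time, listed in processing order.

```python
def serial_or_latency(ordered_outputs):
    """Combine serially processed channels under an OR rule.

    ordered_outputs: list of (sign, duration_ms) pairs in processing order.
    Each channel starts when the preceding one has responded, so the
    decision latency is the sum of durations up to and including the
    first positive response, or the sum of all durations if none occurs.
    """
    elapsed = 0.0
    for sign, duration in ordered_outputs:
        elapsed += duration
        if sign > 0:
            return +1, elapsed
    return -1, elapsed

# Order nontarget, target, target, nontarget: the positive response
# arrives after the first and second channels have finished.
print(serial_or_latency([(-1, 120), (+1, 90), (+1, 100), (-1, 110)]))  # (1, 210.0)
```

Motor time would then be added to the returned latency in the same way as for the parallel models.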
This description reveals one critical way the serial model must differ from the parallel model. The serial and parallel models evolve an order of processing of the different items in quite distinct ways (see Townsend & Wenger, 2004a). In the work reported here, we simulate all possible positions of targets in a given display size and then average across all of these configurations in reporting our simulation results. Parameters and their values for the simulations are presented in Table A1. <sup>A4</sup>
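Averaging over all target positions can be sketched by enumerating every placement of the targets within a display and averaging a per-display result. The names below are hypothetical, and `simulate` stands in for whatever function returns a latency for a given arrangement.

```python
from itertools import combinations

def target_placements(display_size, n_targets):
    """Yield each display as a tuple of +1 (target) / -1 (nontarget) signs."""
    for positions in combinations(range(display_size), n_targets):
        yield tuple(+1 if i in positions else -1 for i in range(display_size))

def average_over_placements(display_size, n_targets, simulate):
    """Average simulate(signs) across every placement of the targets."""
    results = [simulate(signs) for signs in target_placements(display_size, n_targets)]
    return sum(results) / len(results)

# Six ways to place two targets among four positions.
for signs in target_placements(4, 2):
    print(signs)
```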
We instantiated models for the two possibilities for interactive processing (presented in the Discussion) in the following ways. For the cross-talk model, we allowed the off-diagonal elements of the matrix containing the rate parameters (the A matrix) to be nonzero. This allowed the information accumulating in one channel to influence the amount accumulated in the others. A positive level of evidence in one channel would thus add to the values in the other channels, whereas a negative level of evidence in one channel would subtract from the values in the other channels. This would produce a shift toward the respective response thresholds for consistently signed inputs and impede the move toward those thresholds for inconsistently signed inputs. For the decisional-shift model, we reduced the magnitudes of the response thresholds when the inputs were of a consistent sign and increased the magnitudes when the inputs were of inconsistent signs. We simulated performance under these assumptions (with the values of the relevant parameters listed in Table A1) for parallel models using both the or and the and response rules and compared the results with those from the simulations of the independent channel models.
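Both interactive variants can be sketched in a few lines. The code below is a noise-free illustration under stated assumptions, not the authors' simulation: the function names, the single shared cross-talk coefficient `c`, the proportional threshold shift `delta`, and all numeric values are hypothetical. For the system to remain stable with four channels, this construction needs c < r/3.

```python
import numpy as np

def rate_matrix(r, c=0.0, n=4):
    """Diagonal entries -r; every off-diagonal entry c (c = 0: independent)."""
    return -r * np.eye(n) + c * (np.ones((n, n)) - np.eye(n))

def first_crossing_ms(A, k, gamma, max_ms=5000):
    """Noise-free first time any channel reaches +gamma or -gamma (1-ms steps)."""
    k = np.asarray(k, dtype=float)
    x = np.zeros(len(k))
    for t in range(1, max_ms + 1):
        x = x + A @ x + k  # Euler step of dx/dt = A x + u with constant input
        if np.any(np.abs(x) >= gamma):
            return t
    return None

def shifted_threshold(gamma, k, delta):
    """Decisional shift: smaller threshold for same-signed inputs, larger otherwise."""
    same_sign = len(set(np.sign(k))) == 1
    return gamma * (1 - delta) if same_sign else gamma * (1 + delta)

independent = rate_matrix(r=0.03)
cross_talk = rate_matrix(r=0.03, c=0.005)
consistent, mixed = [1, 1, 1, 1], [1, -1, 1, -1]

print(first_crossing_ms(independent, consistent, gamma=20.0))
print(first_crossing_ms(cross_talk, consistent, gamma=20.0))  # earlier than independent
print(first_crossing_ms(cross_talk, mixed, gamma=20.0))       # later than independent
print(shifted_threshold(20.0, consistent, delta=0.25), shifted_threshold(20.0, mixed, delta=0.25))
```

With consistently signed inputs, positive cross-talk speeds the approach to threshold; with inconsistently signed inputs it slows it, which matches the qualitative pattern described above.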
<cn> <bold>Appendix Footnotes</bold> </cn> <anchor name="fn15"></anchor><sups> A1 </sups> The mathematical discussion here is intentionally brief. Readers interested in a more comprehensive description of this modeling approach should see Townsend and Wenger (2004b). We write the model equations in the traditional derivative style. When there is a stochastic element, as in Equation A2, mathematicians prefer to write the stochastic differential equation (Equation A1) in differential form, to emphasize that classical differentiation and integration are inoperable in the presence of stochastic elements and that the solutions require specialized techniques (see, e.g., Smith, 2000).
<anchor name="fn16"></anchor><sups> A2 </sups> The performance of the systems we describe is evaluated by way of numerical simulations. Thus, by necessity, all results reflect the need to discretize time in the simulations. In our case, the time steps in our simulations are equal to 1 ms. The samples from the Gaussian distribution (in Equation A2) are independent of one another and taken at 1-ms intervals.
<anchor name="fn17"></anchor><sups> A3 </sups> In all of the applications reported here, we assume that these thresholds are symmetric. However, it is possible to have nonsymmetric response thresholds on each of the channels, which would allow one way of simulating the effects of response bias.
<anchor name="fn18"></anchor><sups> A4 </sups> To assess the generality of the predictions presented here, we repeated our simulations across ranges of values for each of the parameters. With the exception of certain “pathological” cases (e.g., channel thresholds approaching zero, input signals that were extremely weak relative to the channel noise), results from these simulations were qualitatively identical to those reported in the text.