Introduction

In Latin-based orthographies, the capitalized initial letter of a word has a specific linguistic function. Words are capitalized at the beginning of a text or sentence, after a period, or if the word is a proper name. While there are some differences across languages concerning the capitalization of some common nouns (e.g., months: February [English] vs. febrero [Spanish]), German orthography is unique because all nouns start with a capitalized letter.

It has been suggested that the capitalization of common nouns in German helps readers because it indicates the grammatical class of a word even before its meaning is accessed (e.g., Bock, 1989; Hohenstein & Kliegl, 2013; Müsseler et al., 2005). However, whether German capitalized common nouns actually facilitate word identification and reading speed has been a matter of debate (Bock, 1986; Hohenstein & Kliegl, 2013; Jacobs et al., 2008; Wimmer et al., 2016). Whereas research by Bock et al. (1989) and by Pauly and Nottbusch (2020) found that capitalization benefits reading rates in German (though to varying degrees), Hohenstein and Kliegl (2013) did not find such an advantage in sentences. Furthermore, when typing, German capitalized common nouns may complicate the writing process (i.e., additional use of the shift key of the keyboard when typing common nouns). Indeed, other Germanic languages such as Danish and Norwegian ceased to capitalize common nouns during the first half of the twentieth century (see Bandle et al., 2005).

Further insight into whether capitalization of German nouns benefits word identification is not only critical for educational and practical reasons, but is also theoretically important. As noted by Davis (2010), an assumption of interactive activation models (interactive activation model, McClelland & Rumelhart, 1981; multiple read-out [MRO] model, Grainger & Jacobs, 1996; dual-route cascaded [DRC] model, Coltheart et al., 2001; connectionist dual-process [CDP+] model, Perry et al., 2007; spatial coding [SC] model, Davis, 2010) is that there is an abstract level of case-invariant letter identities that drives visual word recognition: lexical activation spreads from letter features to abstract letter units and then to an orthographic lexicon (e.g., time, Time, or TIME would activate the same lexical entry). However, the letter level in the current implementation of these models only includes uppercase letters—i.e., Rumelhart and Siple’s (1974) font—between the level of letter features and the level of the orthographic lexicon. That is, interactive activation models can encode the word TIME but not Time or time. More explicit assumptions on how letter features are mapped onto abstract letter units are made by contemporary neurally inspired models of visual word recognition such as the local combinations detector (LCD) model (see Dehaene et al., 2005; see also Grainger et al., 2008). These models posit that visual input is mapped onto case-invariant abstract letter units at a prelexical stage (i.e., case-invariant detectors would respond similarly to e and E). These abstract units guide the process of lexical access (i.e., the orthographic representations of time and TIME are the same). Results of masked priming studies that manipulated the letter case support this view. Responses to target words such as TIME are similar when briefly primed with the word TIME (same letter case) or time (different letter case) (see Jacobs et al., 1995, for French; see Perea et al., 2015a, 2015b, for English and Spanish). These results suggest that a matching versus mismatching letter case between prime and target does not affect response times to target words (see Grainger, 2018, for a review). A similar pattern was reported in neuroimaging studies, where activation of the left fusiform gyrus (in the so-called visual word form area [VWFA]) was found to be independent of the letter case of the masked prime (see Dehaene et al., 2001). Furthermore, event-related potential (ERP) masked priming experiments have shown that the N250 component (i.e., a component associated with orthographic processing) is similar for time-TIME and TIME-TIME (Vergara-Martínez et al., 2015; see also Gutiérrez-Sigut et al., 2019). Taken together, these findings favor the view that in languages such as English, French, or Spanish, both lowercase and uppercase words activate the same orthographic representations (see Lu et al., 2021, for recent fMRI evidence; see also Vergara-Martínez et al., 2020, for ERP evidence with unprimed paradigms).

Nevertheless, there is empirical evidence that letter-case information may not always be lost at a prelexical processing stage. In a series of lexical decision experiments with proper and common nouns in Italian, Peressotti et al. (2003) found that the form of the initial letter of a word influences lexical access. For proper nouns in Italian, response times were faster when the items contained a capitalized first letter (e.g., Anna faster than anna; see also Sulpizio & Job, 2018, for electrophysiological evidence). However, this effect did not occur in common nouns, which are usually written in lowercase (e.g., carne [flesh]). To account for these findings, Peressotti and colleagues proposed the so-called “Orthographic Cue” hypothesis (henceforth, OC hypothesis). This hypothesis postulates that letter-case is not a superfluous visual element—as typically assumed in models of visual word recognition (e.g., Dehaene et al., 2005); rather, it may serve as an orthographic cue at an abstract information level when processing proper nouns. For a given word, an orthographic cue would mark the first grapheme in a binary way: “yes, initial capital letter” or “no initial capital letter”. As this abstract marker would be used to pre-activate lexical units whose initial letter is written in the same letter-case, the OC hypothesis can easily accommodate the advantage of the initial capitalization in proper nouns (e.g., the proper noun Mary would be stored in the orthographic lexicon with a marker of initial capital letter). Peressotti et al. (2003) stressed that the OC hypothesis is compatible with the idea of abstract letter identities driving lexical access: it is not that letter units themselves contain letter-case information, but rather that there is an abstract marker of letter-case for the initial letter. As Peressotti et al. (2003) indicated, this hypothesis could be implemented in interactive activation models by assuming that the capitalization advantage for proper nouns occurs between the letter level and the orthographic lexicon—note that none of the current implementations of these models take letter-case information into account. It is unclear, however, whether an abstract marker for the capitalization of the initial letter could be compatible with neurally inspired models of word recognition such as Dehaene et al.’s LCD model (see Sulpizio & Job, 2018, for discussion). One goal of the present experiments was to evaluate whether it is necessary for these models, as a general principle, to consider the processing of letter-case information in future implementations.

Converging evidence for the importance of letter-case information during word recognition has been found with two special types of words that are usually presented in the same letter-case: brand names and acronyms (e.g., IKEA, FBI). In single-presentation brand decision tasks, IKEA is identified as a brand name faster than ikea (Gontijo et al., 2002; Perea et al., 2015a, 2015b). Similarly, acronyms are identified faster when presented in their characteristic letter-case configuration (e.g., FBI) than when presented in an unfamiliar letter-case configuration (e.g., fbi) (Henderson & Chard, 1976).

Notably, proper nouns, brand names, and acronyms represent specific categories of words and their mental representations can be different from common words (see Brysbaert et al., 2009; Gontijo & Zhang, 2007, for discussion). To examine whether the encoding of letter-case information is a general principle of visual word recognition, it is necessary to test the role of letter-case using common words. As proposed by Jacobs et al. (2008) and Wimmer et al. (2016), an ideal scenario for that purpose is German because all common nouns (but not verbs, adjectives, …) are written with initial capitalization. Jacobs et al. (2008) used a tachistoscopic identification task in which participants perceived German nouns and non-nouns with or without an initial capitalized letter. In Experiment 1, Jacobs et al. (2008) found that German nouns were identified more accurately when the first letter was capitalized (e.g., Beruf [job]) than when presented in all lowercase (e.g., beruf)—the accuracy of uppercase words (e.g., BERUF) was in between these other conditions. In Experiment 2, they found that German non-nouns (i.e., adjectives, verbs, adverbs) were identified more accurately when written in all lowercase letters (e.g. eilen [to hurry]) or with the first letter capitalized (e.g. Eilen) than when presented in all uppercase letters (EILEN). Wimmer et al. (2016) further investigated the impact of familiarity on capitalized initial letters of German words in a lexical decision task. The stimuli were either nouns or non-nouns (e.g., adjectives, verbs), and were presented in all-lowercase format or with an initial capitalized letter. Wimmer et al. (2016) found that readers responded faster and more accurately to German nouns that were presented with a capitalized first letter than to all-lowercase nouns (e.g., Ball [ball] faster and more accurate than ball). 
Conversely, German non-nouns presented in all lowercase letters were responded to faster and more accurately than capitalized non-nouns (e.g., blau [blue] faster and more accurate than Blau). Further, they reported higher neural activity for ball and Blau (all-lowercase noun and capitalized non-noun) than for Ball and blau (capitalized noun and all-lowercase non-noun) in the VWFA. They interpreted these findings as supporting the idea that orthographic representations in German contain case-specific information.

Taken together, the experiments conducted by Jacobs et al. (2008) and Wimmer et al. (2016) showed that letter-case information does play a role in identifying German common words: the initial capitalized letter facilitates response times to nouns but not to non-nouns in word identification and lexical decision tasks. However, a shortcoming of these studies is that neither of these tasks necessarily requires unique access to semantic information (see Forster & Shen, 1996, for discussion). First, it is possible to identify a letter string independent of lexical access. Second, identifying a combination of letters as a word in a lexical decision task can be done in the absence of unique word identification (see Grainger & Jacobs, 1996); furthermore, the obtained effects can be modulated by visual familiarity (see Perea et al., 2020; Perea et al., 2018, for discussion). For instance, Perea et al. (2020) found that, in lexical decision, responses to words are faster when the items were presented in a familiar letter-case configuration (e.g., HOUSE) than in an unfamiliar letter-case configuration (e.g., hOuSe), whereas the opposite occurs for pseudowords (e.g., TEBADA yields slower and less accurate responses than TeBaDa). Then, one might argue that the advantage of Buch [book, noun] and blau [blue, adjective] over buch and Blau could have been due to their higher visual familiarity. To circumvent these limitations in the present experiments, we used a semantic categorization task. This task requires the reader to retrieve the meaning of the presented words (Forster & Shen, 1996) and, furthermore, the effects in this task are not modulated by the words’ visual familiarity (e.g., hOuSe produces similar response times as HOUSE; see Perea et al., 2020).

In sum, we designed two experiments to directly examine the impact of initial letter capitalization in words on lexical access using a semantic categorization task (“Is the word an animal name?”). In Experiment 1, we compared word identification times of German words (animal names, common nouns, adjectives/verbs) presented with the initial capital letter (e.g., Hund, Buch, Blau) or in all-lowercase form (hund, buch, blau). Experiment 2 was conducted to replicate and extend Experiment 1 using an orthographically legal all-uppercase condition instead of the all-lowercase condition (e.g., HUND, BUCH, BLAU)—note that hund or buch would not follow the German orthographic rules for nouns.

The predictions for the two experiments were as follows: If initial capitalization of German words facilitates lexical access in common nouns, we would expect faster responses to capitalized common nouns (both animal and non-animal nouns) and a disadvantage for capitalized non-nouns (e.g., adjectives/verbs). This outcome would strongly suggest that future implementations of models of visual word recognition should consider how letter-case information may affect lexical access, as first suggested by Peressotti et al. (2003) with proper nouns. Alternatively, if initial capitalization of words in German does not play a role during lexical access, word identification times should be similar regardless of the format. This outcome would favor the view that the identification of common words is driven by abstract case-invariant letter units (see Coltheart et al., 2001; Davis, 2010; Dehaene et al., 2005; Grainger et al., 2008; Perry et al., 2007).

Experiment 1

Methods

Participants

We tested 42 participants (30 women, 12 men)—the number of observations in the initial capital condition and the all-lowercase condition were 6300 and 6300, respectively. All individuals were native German speakers with no reading problems and with normal or corrected-to-normal vision. Their mean age was 31.12 years (SD = 10.67). Participants received a small monetary incentive for their participation (2 vouchers of 20€). All participants signed an informed consent form before the experiment. Ethical approval of this research was obtained from the Research Ethics Committee of the University of Valencia, and the study followed the requirements of the Helsinki convention.

Materials

We selected a set of 300 German words from the SUBTLEX-DE word database (Brysbaert et al., 2011), of which 100 were animal nouns (e.g., Hund [dog]), 100 were other common nouns (e.g., Buch [book]), and 100 were verbs or adjectives (e.g., klein [small]). As shown in Table 1, non-nouns and common nouns were matched in the number of letters, word frequency, bigram frequency, and OLD20 (Oganian et al., 2016); all three groups of items were matched for number of letters. The list of stimuli can be found in Appendix A. Each word was presented with all letters in lowercase or with an initial capital letter (e.g., common nouns: Buch vs. buch; non-nouns: Klein vs. klein; animal nouns: Hund vs. hund). We created two lists to counterbalance the materials in a Latin Square manner (e.g., Buch was presented in List 1, whereas buch was presented in List 2). Each list contained 100 animal nouns (50 with the initial capital), 100 common nouns (50 with the initial capital), and 100 non-nouns (50 with the initial capital). Each participant was presented with only one of the lists and the assignment to the lists was counterbalanced.
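The two-list counterbalancing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' list-construction script: the function name `build_lists`, the alternating assignment rule, and the condition labels are hypothetical, and the toy items include one word (katze) that is not taken from the stimulus list.

```python
def build_lists(words_by_category):
    """Create two counterbalanced lists: every word appears with an initial
    capital in one list and in all lowercase in the other.

    The alternating (even/odd) assignment is an assumption for illustration.
    """
    list_1, list_2 = [], []
    for category, words in words_by_category.items():
        for i, word in enumerate(words):
            capitalized = word.capitalize()  # e.g., "Buch"
            lowercase = word.lower()         # e.g., "buch"
            if i % 2 == 0:
                list_1.append((category, capitalized, "initial_capital"))
                list_2.append((category, lowercase, "all_lowercase"))
            else:
                list_1.append((category, lowercase, "all_lowercase"))
                list_2.append((category, capitalized, "initial_capital"))
    return list_1, list_2


# Toy items (two per category instead of 100); "katze" is a placeholder
# added for illustration and is not claimed to be in the real materials.
toy_items = {
    "animal_noun": ["hund", "katze"],
    "common_noun": ["buch", "ball"],
    "non_noun": ["klein", "blau"],
}

list_1, list_2 = build_lists(toy_items)
for entry in list_1:
    print(entry)
```

With an even number of items per category, each list ends up with half of the items of every category in each form, which corresponds to the 50/50 split per category reported for the actual lists.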

Table 1 Comparison of the mean characteristics of the word items

Procedure

The experiment was conducted in an online setting, using PsychoPy 3 (Peirce & MacAskill, 2018), and its corresponding online server Pavlovia (www.pavlovia.org). Participants were instructed to do the experiment in a quiet room without any distractions and to perform a semantic categorization task (“Does this word refer to an animal?”) by pressing the button “M” on their keyboard for answering “yes” and the button “X” on their keyboard for answering “no” as fast and accurately as possible. Before starting the actual experiment, all participants went through ten practice trials to get familiarized with the task. Within one trial, a fixation point was presented initially in the center of the screen for 500 milliseconds (ms). Then, the target item was presented until the response was made (or until a deadline of 2000 ms). The order of presentation of trials was randomized for each participant. Altogether, the experiment took about 8–12 min, including a short break after 150 trials.

Data analysis

For the statistical analyses, we employed Bayesian linear mixed-effects models using the rstan and brms packages (Bürkner, 2017) in the R environment. We chose this approach over the more traditional one (i.e., linear mixed-effects models using the lme4 package) because the Bayesian models normally converge with the maximal random effect structure of the design. In contrast, non-Bayesian models often fail to converge. Of note is that reducing the random effect structure in a model to achieve convergence may increase the risk of Type-I errors (Barr et al., 2013).

For the analyses of the non-animal items (“no” responses), the two fixed factors of the models were Form (all lowercase vs. initial capital) and Grammatical category (nouns vs. non-nouns). The levels of each fixed factor were centered on zero (i.e., −0.5 vs. 0.5). As response time (RT) data typically show a positive skew, we used the ex-Gaussian distribution (family = exgaussian(link = "identity")) for the analysis of the latency data, whereas we applied the Bernoulli distribution (family = bernoulli) for the analysis of the accuracy data. We fit the maximal model in terms of random factor structure (i.e., Dependent_Variable ~ word_format × grammatical_category + (1 + word_format × grammatical_category | subject) + (1 + word_format × grammatical_category | item)). For the analyses of the animal nouns (“yes” responses), the strategy was the same as explained above except that the only factor was Form. We ran each model with four chains of 5000 Markov chain Monte Carlo iterations with a warmup of 1000 iterations for each chain. The priors were the default values for the parameters, and the chains were started from random initial values (i.e., inits = “random”).
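To make the −0.5/0.5 coding concrete, the following Python sketch recovers the fixed-effect coefficients of a 2 × 2 design from four cell means. The cell means are invented for illustration and are not the values in Table 2; the function names and the choice of which level receives −0.5 are likewise assumptions. The sketch ignores the random effects and treats the coefficients as acting directly on the mean (identity link). Under this coding, the intercept is the grand mean, each main effect is the difference between the two marginal means, and the interaction is the difference between the two simple effects of Form.

```python
def coefficients_from_cells(cells):
    """Return (intercept, b_form, b_category, b_interaction) for a 2 x 2
    design whose factors are coded -0.5 / +0.5.

    `cells` maps (form_code, category_code) to a cell mean.
    """
    lo_lo = cells[(-0.5, -0.5)]
    hi_lo = cells[(0.5, -0.5)]
    lo_hi = cells[(-0.5, 0.5)]
    hi_hi = cells[(0.5, 0.5)]

    intercept = (lo_lo + hi_lo + lo_hi + hi_hi) / 4
    b_form = (hi_lo + hi_hi) / 2 - (lo_lo + lo_hi) / 2
    b_category = (lo_hi + hi_hi) / 2 - (lo_lo + hi_lo) / 2
    b_interaction = (hi_hi - lo_hi) - (hi_lo - lo_lo)
    return intercept, b_form, b_category, b_interaction


def predict(coefs, form, category):
    """Cell mean implied by the coefficients for one design cell."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * form + b2 * category + b3 * form * category


# Hypothetical cell means in ms (NOT the data reported in the paper).
# Assumed coding: Form -0.5 = all lowercase, 0.5 = initial capital;
# Grammatical category -0.5 = nouns, 0.5 = non-nouns.
example_cells = {
    (-0.5, -0.5): 600.0,  # lowercase nouns
    (0.5, -0.5): 620.0,   # capitalized nouns
    (-0.5, 0.5): 590.0,   # lowercase non-nouns
    (0.5, 0.5): 630.0,    # capitalized non-nouns
}

coefs = coefficients_from_cells(example_cells)
print(coefs)
```

In this toy example, the simple effect of Form is 20 ms for nouns and 40 ms for non-nouns, so the interaction coefficient equals their difference, and plugging the coefficients back into `predict` reproduces each cell mean.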

Bayesian linear mixed-effects models provide not only an estimate of each parameter but also its Bayesian 95% credible interval. An effect was considered significant when its 95% credible interval did not include zero.
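As an illustration of this decision rule, the Python sketch below computes an equal-tailed 95% interval from a set of posterior draws and checks whether it excludes zero. The draws are simulated rather than taken from the fitted models, and the helper names `credible_interval` and `excludes_zero` are hypothetical. The interval uses a simple rounding-based percentile rule, which may differ slightly from the summaries that brms reports.

```python
import random


def credible_interval(draws, mass=0.95):
    """Equal-tailed interval from posterior draws (simple percentile rule)."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - mass) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def excludes_zero(interval):
    """True when both interval bounds lie on the same side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Simulated draws only: normal samples centered on two illustrative
# (estimate, spread) pairs. 16000 = 4 chains x (5000 - 1000) post-warmup draws.
rng = random.Random(1)
effect_draws = [rng.gauss(-18.0, 4.0) for _ in range(16000)]
null_draws = [rng.gauss(-1.0, 4.0) for _ in range(16000)]

print(excludes_zero(credible_interval(effect_draws)))
print(excludes_zero(credible_interval(null_draws)))
```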

Results and discussion

We conducted separate analyses for the correct reaction times and the accuracy of the participants’ responses. To minimize the influence of fast guesses on the latency data, response times shorter than 250 ms were removed from the RT analyses. Due to the 2000 ms deadline, there were no response times above 2000 ms. We excluded two participants from further analyses because they made more than 15% of errors across all words. Two common nouns (Mensch [human] and Fleisch [flesh]) also yielded more than 15% of errors and were excluded from further analysis. Of note, the pattern of significant effects was exactly the same if we kept all the participants and items in the analyses.
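The participant-level exclusion and RT trimming steps can be sketched as follows. The trial records, field names, and the helper `clean_trials` are hypothetical; only the 250 ms cutoff and the 15% error criterion come from the text. Item-level exclusion is omitted for brevity, and this is a Python illustration rather than the authors' R pipeline.

```python
MIN_RT_MS = 250        # responses faster than this are treated as fast guesses
MAX_ERROR_RATE = 0.15  # participants above this error rate are excluded


def participant_error_rates(trials):
    """Proportion of incorrect trials per participant."""
    totals, errors = {}, {}
    for trial in trials:
        pid = trial["participant"]
        totals[pid] = totals.get(pid, 0) + 1
        if not trial["correct"]:
            errors[pid] = errors.get(pid, 0) + 1
    return {pid: errors.get(pid, 0) / totals[pid] for pid in totals}


def clean_trials(trials):
    """Return (rt_data, accuracy_data) after applying the exclusions."""
    rates = participant_error_rates(trials)
    accuracy_data = [t for t in trials
                     if rates[t["participant"]] <= MAX_ERROR_RATE]
    # RT analyses use correct responses at or above the 250 ms cutoff.
    rt_data = [t for t in accuracy_data
               if t["correct"] and t["rt_ms"] >= MIN_RT_MS]
    return rt_data, accuracy_data


# Toy data: p1 makes 1 error in 10 trials (kept), p2 makes 2 in 10 (excluded).
toy_trials = []
for i in range(10):
    toy_trials.append({"participant": "p1", "correct": i != 0,
                       "rt_ms": 200 if i == 1 else 600})
    toy_trials.append({"participant": "p2", "correct": i >= 2, "rt_ms": 650})

rt_data, accuracy_data = clean_trials(toy_trials)
print(len(rt_data), len(accuracy_data))
```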

We conducted separate analyses for non-animal and animal words as they required different responses (“no” vs. “yes”). The mean response times and the accuracy rates for each condition are presented in Table 2. The Bayesian models with the maximal random-effect structure converged well for both latency and accuracy data: the values of R̂ (i.e., a measure of the convergence of the estimates across the four chains) were 1.00 for all coefficients.

Table 2 Mean response times (in ms) and accuracy of the answers (proportion) for non-nouns (i.e. verbs and adjectives), common nouns, and animal nouns, written with an initial capital letter or in all lowercase letters in Experiment 1

Analysis of the RT data

Animal nouns. Response times were faster when the animal nouns were presented with an initial capital letter than when presented in all lowercase letters (b = 24.24, SE = 2.89, 95% CrI [18.55, 29.92]).

Non-animal words. On average, participants responded faster to words in all lowercase than to words with an initial capital letter (main effect of Form: b = −18.46, SE = 3.93, 95% CrI [−26.22, −10.75]), whereas there was no main effect of Grammatical category (b = −1.26, SE = 4.03, 95% CrI [−9.18, 6.59]). Critically, we found an interaction between Form and Grammatical category (b = −14.49, SE = 4.66, 95% CrI [−23.52, −5.35]). This interaction revealed an all-lowercase advantage that was greater for non-nouns (95% CrI [25.6, 40.9]) than for common nouns (95% CrI [10.6, 26.0]). For a graphical representation of the effects, see Fig. 1A.

Fig. 1 Highest density intervals for the Bayesian linear mixed-effects models of the response times (A) and the accuracy (B) of the non-animal words. Boundaries of the 95% credible intervals are marked in purple. An effect was considered significant when the 95% credible interval of its possible parameter values did not include zero. (A) Significant effects were obtained for the fixed factor Format

Analysis of the accuracy data

Animal nouns. Accuracy was higher when the animal nouns were presented with an initial capital letter than when presented in all lowercase letters (b = −0.61, SE = 0.27, 95% CrI [−1.14, −0.07]).

Non-animal words. We did not find any significant effects—note that accuracy was at ceiling in all conditions (above 0.985; see Table 2 and Fig. 1B for a graphical representation).

The present semantic categorization experiment examined lexical access of German common nouns and non-nouns (verbs and adjectives) written with an initial capital letter or with all lowercase letters. We found that participants responded faster when the word was presented in its most common form for animal nouns (Hund faster than hund) and non-nouns (blau faster than Blau). These findings favor the idea that the presence/absence of the initial capitalization helps lexical access.

However, we found an unexpected outcome for non-animal common nouns: participants responded faster when the word was written in lowercase than when written with an initial capital letter (buch faster than Buch). To explain this paradoxical advantage of buch over Buch, it may be important to consider the characteristics of the task: one-third of the trials were animal nouns (i.e., “yes” responses), whereas two-thirds of the trials were either non-animal nouns or adjectives/verbs (i.e., “no” responses). This implies that if an item was presented in lowercase (e.g., hund [“yes” response], blau [“no” response], buch [“no” response]), it would be more likely to be a “no” response than a “yes” response, and this may have sped up responding “no” to buch when compared to Buch.

To examine this possibility, we conducted a pilot experiment with 12 participants. The experiment was identical to the present experiment except that we did not include verbs/adjectives (i.e., all words were common nouns). This way, the proportion of “yes” and “no” responses was the same, and thereby form was not diagnostic of the response. The mean RTs and accuracy per condition are presented in Table 3. As can be seen in Table 3, the findings of the pilot experiment mirrored those of Experiment 1 (i.e., Hund was faster than hund, but Buch was slower than buch), thus ruling out the above explanation.

Table 3 Mean response times (in ms) and accuracy of the answers (proportion) for common nouns (e.g. Buch) and animal nouns (e.g. Hund), written with an initial capital letter or in all lowercase letters in a pilot experiment (N = 12)

Another reason for the puzzling advantage of lowercase common nouns over initial capitalized common nouns for non-animals in the semantic categorization task could be as follows: lowercase common nouns (buch) do not follow German orthographic rules for nouns—German common nouns must be written with an initial uppercase letter. Thus, in terms of word recognition models that include a decision mechanism as accumulation of evidence (e.g., the Leaky-Competing Accumulator model; Dufau et al., 2012), one might expect that the combination of a “no” response with an item that is orthographically illegal in German may have sped up the responses to words such as buch.

Consequently, we designed Experiment 2, in which we replaced the (orthographically illegal) full lowercase format with a full uppercase format (i.e., BUCH)—note that uppercase words are the typical choice in most word recognition experiments. Hence, unlike buch, both Buch and BUCH would follow the German orthographic rules. To keep a 50% ratio of yes/no responses, the materials were composed of animal and non-animal nouns, either with the initial capitalized letter or in all uppercase letters (e.g., Hund vs. HUND; Buch vs. BUCH). Thus, if the initially capitalized nouns facilitate lexical access, one would expect a processing advantage for Hund and Buch over HUND and BUCH, respectively.

Experiment 2

Methods

Participants

We tested 46 participants (16 women, 27 men, 3 diverse), using Prolific Academic, a UK-based online crowdworking platform (http://prolific.ac)—this corresponds to 4600 items in each format. Prolific Academic’s recruitment filter was used such that only native German speakers with no reading problems and with normal or corrected-to-normal vision could participate. Their mean age was 25.54 years (SD = 5.09) and, on average, they took less than 10 min to complete the task. As in Experiment 1, all participants gave informed consent before the experiment.

Materials

We used the same words as in Experiment 1, except for the 100 verbs/adjectives. Further, we replaced the two common nouns Mensch [human] and Fleisch [flesh], which had yielded a high error rate in Experiment 1, with two other common non-animal nouns (Anfang [beginning] and Zeichen [sign]) with similar word length, word frequency, bigram frequency, and OLD20. Each word was presented with all letters in uppercase or with an initial capital letter (e.g., common nouns: Buch vs. BUCH; animal nouns: Hund vs. HUND). The counterbalanced lists of stimuli were created in the same way as in Experiment 1.

Procedure

The procedure was the same as in Experiment 1.

Data analysis

As in Experiment 1, we employed Bayesian linear mixed-effects models for the statistical analyses. We chose the same strategy of analyses as in Experiment 1, except that we analyzed animal nouns and common nouns together. Consequently, the two fixed factors of the model were Form (all uppercase vs. initial capital) and Word category (common nouns vs. animal nouns).

Results and discussion

As in Experiment 1, we conducted separate analyses for correct response times (250–2000 ms range) and the accuracy of the participants’ responses. Four participants were excluded due to inaccurate measurements of the response times on their computers (e.g., RTs were always 400, 500, 600 ms). The mean response times and the accuracy rates for each condition are presented in Table 4.

Table 4 Mean response times (in ms) and accuracy of the answers (proportion) for common nouns and animal nouns, written with an initial capital letter or in all uppercase letters in Experiment 2

Analysis of the RT data

Importantly, response times for animal nouns and common nouns did not differ significantly (b = −0.55, SE = 4.08, 95% CrI [−8.57, 7.33]). For both animal nouns and common nouns, participants responded faster to words with an initial capital letter than to words written in all uppercase letters (main effect of Form: b = 7.74, SE = 2.43, 95% CrI [3.01, 12.51]). No interaction was found between Word category and Form (b = 2.58, SE = 3.41, 95% CrI [−4.19, 9.32]). For a graphical representation, see Fig. 2A.

Fig. 2 Highest density intervals for the Bayesian linear mixed-effects models of the response times (A) and accuracy (B). Boundaries of the 95% credible intervals are marked in purple

Analysis of the accuracy data

On average, participants responded more accurately to common nouns than to animal nouns (main effect of Word category: b = −0.89, SE = 0.24, 95% CrI [−1.34, −0.42]) and more accurately when words were presented with an initial capital letter than in all uppercase letters (main effect of Form: b = −0.64, SE = 0.23, 95% CrI [−1.09, −0.20]). There was an interaction of Form and Word category (b = 0.54, SE = 0.26, 95% CrI [0.03, 1.07]; see Fig. 2B for a graphical representation), indicating that the effect of Form occurred primarily for common nouns (see Table 4).

The present semantic categorization experiment compared common nouns and animal nouns written with an initial capital letter (Buch, Hund) or with all uppercase letters (BUCH, HUND). We found that, regardless of whether the nouns corresponded to animals or not, participants responded faster when nouns started with a capital letter than when nouns were written in uppercase (e.g., Buch faster than BUCH; Hund faster than HUND). Thus, using two orthographically correct forms and an equal ratio of yes/no responses, we found a consistent advantage for capitalized non-animal nouns in German.

General discussion

We designed two semantic categorization experiments that investigated the role of initial letter capitalization of German words in lexical access. Each word was presented either with an initial capital letter or with all letters in the same case (lowercase in Experiment 1; uppercase in Experiment 2). In both experiments, we found faster responses to animal nouns with initial letter capitalization (Hund faster than hund [Experiment 1]; Hund faster than HUND [Experiment 2]). In Experiment 1, we also found a sizeable advantage for lowercase over initially capitalized non-nouns (blau faster than Blau). Experiment 1 also revealed a puzzling benefit for lowercase over initially capitalized non-animal nouns (buch faster than Buch). This paradoxical finding was likely due to the combination of a “no” response with orthographically illegal word forms (i.e., buch should be written as Buch or BUCH). Indeed, in Experiment 2, when comparing word identification times of German common nouns and animal nouns with an orthographically legal uppercase format, we found an advantage of German common nouns with initial capitalization (Buch faster than BUCH). Thus, our experiments confirm, using a task that requires access to meaning (“is the word an animal name?”), that initial letter capitalization for nouns influences the speed of lexical access in German: it helps the word identification of nouns (Hund faster than hund or HUND) and hinders word identification of non-nouns (blau faster than Blau) (see Jacobs et al., 2008, for converging evidence with an identification task; see Wimmer et al., 2016, for evidence with the lexical decision task).

The present results have clear theoretical implications: the characteristics of an orthography can shape the process of word recognition (see Frost, 2012). As stated in the Introduction, current neurally inspired models of visual word recognition assume that lexical access is guided by case-invariant representations. For instance, Dehaene et al.’s (2005) LCD model assumes that letter-case only plays a role in the earliest stages of word processing, before visual input is mapped onto case-invariant abstract letter units (e.g., the words time, Time, and TIME would activate exactly the same arrays of case-invariant units). That is, letter-case information would not play a role during lexical access of common words (Dehaene et al., 2005). Similarly, computational models based on the classical McClelland and Rumelhart (1981) interactive activation model do not take letter-case information into account either—further, their current implementation only includes a letter feature level for uppercase letters. However, the present semantic categorization experiments showed that the capitalization of the initial letter influences lexical access of German words (see also Jacobs et al., 2008; Wimmer et al., 2016, for evidence with other word recognition tasks). Thus, neither the current version of the LCD model (Dehaene et al., 2005) nor the current implementation of the interactive activation model and its successors (for instance, the DRC model, the MRO model, the CDP+ model, and the SC model) can accommodate these findings with common words in German, as they assume that lexical access is guided by case-invariant representations.

As indicated by Sulpizio and Job (2018) in the context of the effects of initial capitalization of Italian proper nouns, “the recognition process is more complex than a simple mapping between letters and abstract letter/word representation” (p. 115). They suggested that the orthographic lexicon could contain information about letter case that is used when identifying printed words, in the spirit of the OC hypothesis, which was initially proposed for Italian proper nouns (Peressotti et al., 2003). The present results with German common words align well with the OC hypothesis: the capitalization of the initial letter in German nouns should not be seen as a superfluous visual feature to be mapped onto a case-invariant abstract representation, but rather as an element that forms part of the orthographic lexicon. Using the logic of the OC hypothesis, the first letter of the German common noun Hund could be tagged with an abstract marker in the orthographic lexicon, indicating an initial capital letter. As a result, the visually presented word Hund would be identified faster than hund (Experiment 1): hund shares all the abstract letter units (and their positions) with the orthographic representation of Hund, but not the marker of initial capitalization.

Experiment 2 showed faster processing for Hund than for HUND. Taken literally, the OC hypothesis would predict similar word identification times for Hund and HUND: both stimuli would carry the initial capital marker and share all the abstract letters with the abstract word unit Hund. Nonetheless, the initial capital H clearly serves as an orthographic cue in Hund but not in HUND. Thus, our findings require further refinement of the OC hypothesis: this hypothesis should consider an abstract marker of letter case not only for the initial letter, but also for the other letter positions. A drawback of this argument, however, is that assuming the existence of letter-case markers for every letter of an orthographic representation may not seem parsimonious (or biologically plausible).
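The contrast between the literal and the refined reading of the OC hypothesis can be made concrete with a minimal sketch. It is a toy illustration only, not an implementation of any published model: the function names are hypothetical, and matching is reduced to an all-or-none comparison between a stimulus and a stored form.

```python
def case_pattern(word: str) -> tuple[bool, ...]:
    """Return one flag per letter position: True where the letter is uppercase."""
    return tuple(ch.isupper() for ch in word)


def matches_initial_marker(stimulus: str, stored: str) -> bool:
    """Literal OC reading (assumed here): same abstract letters,
    plus the same case on the first letter only."""
    same_letters = stimulus.casefold() == stored.casefold()
    return same_letters and stimulus[0].isupper() == stored[0].isupper()


def matches_positional_markers(stimulus: str, stored: str) -> bool:
    """Refined reading (assumed here): same abstract letters,
    plus the same case at every letter position."""
    same_letters = stimulus.casefold() == stored.casefold()
    return same_letters and case_pattern(stimulus) == case_pattern(stored)


if __name__ == "__main__":
    for stimulus in ("Hund", "HUND", "hund"):
        print(
            stimulus,
            matches_initial_marker(stimulus, "Hund"),
            matches_positional_markers(stimulus, "Hund"),
        )
```

With a marker on the initial letter only, Hund and HUND both match the stored form Hund in full, so no difference between them is predicted. With a marker at every position, only Hund matches in full. The sketch says nothing about graded activation or about the relative cost of hund versus HUND.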

Another option to capture the advantage of the initial capitalization of German nouns is to assume that the units in the orthographic lexicon in interactive activation models (e.g., DRC model, SC model, MRO model, CDP+ model) are stored in their most frequent letter-case format (e.g., Buch, blau) rather than as abstract case-invariant units, and that the letter level preserves the letter case of the visual input (see Footnote 4). Note that the idea of an orthographic lexicon keeping not only the word’s abstract letter units but also some indexical properties of the word such as letter case is not new (see Goldinger, 1998, for an episodic theory of lexical access). In this scenario, the visually presented stimuli Buch and blau (i.e., consistent with the most familiar form) would produce a greater level of activation in the orthographic lexicon than buch and Blau (see Jacobs et al., 2008, for a similar observation). This account may also explain why all-uppercase words like BUCH produce (if anything) a smaller cost relative to Buch than all-lowercase words like buch. While we do not encounter all-lowercase nouns (e.g., buch) in German (i.e., they would not follow the orthographic rules), all-uppercase words are often encountered in titles, advertisements, etc., and they can also be stored in the orthographic lexicon. Their level of activation would simply be lower than that of Buch, as they occur less frequently in this format. A similar argument applies to neurally inspired word recognition models. As first suggested by Wimmer et al. (2016), the neural representations of written words, at least in German, would contain information about the initial letter case of a word (note that this assumption contrasts with the case-invariant abstract representations of the LCD model by Dehaene et al., 2005). As a result, in this revised model, the most frequent letter-case format of a word (e.g., Buch, blau) would lead to the highest level of neural activation (see Wimmer et al., 2016, for fMRI evidence with German nouns vs. non-nouns). Notably, the idea of an orthographic lexicon that keeps indexical information besides the identity and order of each word’s abstract letter units fits well with the advantage of words presented in their more common format, be they proper nouns (Anna faster than anna; Peressotti et al., 2003), acronyms (FBI faster than fbi; Henderson & Chard, 1976), brand names (IKEA faster than ikea; Gontijo et al., 2002), or common words (molecule faster than MOLECULE, but restaurant [which occurs frequently both in lowercase and uppercase] is not responded to faster than RESTAURANT; Perea et al., 2018).
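The frequency-based account can likewise be sketched as a lookup in a lexicon that stores case-specific surface forms. This is a toy illustration under strong assumptions: the relative frequencies below are invented purely for the example (they are not corpus estimates), the name LEXICON and the function activation are hypothetical, and activation is simply equated with the stored weight of the presented form.

```python
# Hypothetical relative frequencies of each stored surface form.
# These numbers are invented for illustration only, not taken from any corpus.
LEXICON: dict[str, dict[str, float]] = {
    "buch": {"Buch": 0.95, "BUCH": 0.05},   # noun: the lowercase form is never stored
    "blau": {"blau": 0.90, "Blau": 0.10},   # non-noun: lowercase is the familiar form
}


def activation(stimulus: str) -> float:
    """Return the stored weight of the exact surface form, or 0.0 if the
    word is known but this letter-case format has no stored entry."""
    forms = LEXICON.get(stimulus.casefold(), {})
    return forms.get(stimulus, 0.0)


if __name__ == "__main__":
    for stimulus in ("Buch", "BUCH", "buch", "blau", "Blau"):
        print(stimulus, activation(stimulus))
```

Under these assumed weights, Buch yields more activation than BUCH, which in turn yields more than the never-encountered buch, and blau yields more than Blau. The ordering follows entirely from the invented frequencies, so the sketch shows the logic of the account rather than any evidence for it.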

Thus, at a general level of theorizing, the present findings can inform future implementations of computational and neurally inspired models of word recognition. We have shown, using German common words in a task that requires lexical access (semantic categorization), that letter-case information is not lost early in processing. Converging evidence has also been found in sentence reading. Hohenstein and Kliegl (2013) found that initially capitalized German nouns are processed faster when reading a sentence. This finding has been interpreted as showing that the capitalization of German nouns helps readers to quickly detect the word class, thus helping lexical access (Hohenstein and Kliegl, 2013). Future research could further explore the influence of reading skills on the processing advantage of initially capitalized German nouns. Developing readers and learners of German, as well as individuals with a reading-related disorder, may process capitalization in German differently (see Bock, 1986) and, hence, may provide some clues on the origins and development of the processing advantage of capitalized nouns. Further research is also needed to determine the neural representation of initial letter capitalization, using fMRI with orthographically legal formats of German nouns (e.g., Buch vs. BUCH), and to examine the time course of the processing advantage of capitalized German nouns by measuring its ERP correlates.

To sum up, we conducted two semantic categorization experiments to compare word identification times of German words that were presented in their standard format (Hund, Buch, blau) or not (hund, buch, Blau [Experiment 1]; HUND, BUCH [Experiment 2]). We found a processing advantage for capitalized nouns over other orthographically legal written forms (e.g., Hund is identified faster than HUND and Buch is identified faster than BUCH). Thus, letter-case information of the initial letter in German nouns is both preserved and used during lexical access. These results pose some problems for the case-invariance assumption shared by most current computational and neurally inspired models of visual word recognition. Additional research should focus on the processing advantage of capitalized German nouns during sentence reading, as well as on its developmental progression in individuals learning German (e.g., developing readers and individuals with reading-related disorders).