Syntax · 14 min

X-Bar Theory

A schema for phrase structure: every phrase provides head, complement, specifier, and adjunct positions arranged in a uniform XP / X-bar / X tree. It is the structural framework that organizes constituency-test data into a coherent theory of syntactic categories. Foundational in 1970s generative syntax; later reworked in the Minimalist Program.

Why This Matters

Constituency tests (constituency-tests) reveal that words group into hierarchical phrases (NP, VP, PP, etc.). The empirical data is rich; the theoretical question is why phrases have the structure they have. X-bar theory (Chomsky 1970, Jackendoff 1977) is the classical generative answer and still the standard pedagogical bridge from constituency tests to formal phrase structure: phrases are analyzed with a shared XP / X-bar / X schema.

The schema:

        XP (maximal projection)
       / \
      /   \
   YP      X-bar
 (specifier) / \
            /   \
        X-bar    ZP (adjunct)
        /  \
       /    \
      X      WP (complement)
   (head)

Four positions per category X:

  • Head (X): the lexical core (a noun for NP, verb for VP, etc.).
  • Complement (sister of X): an argument selected by the head.
  • Specifier (sister of X-bar, daughter of XP): an argument or modifier attached at the maximal-projection level.
  • Adjunct (sister of an X-bar and daughter of another X-bar): optional modifiers, recursive.
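The definitions above are purely configurational, so they can be read off a tree mechanically. The following is a minimal sketch of that idea, not a standard tool: the `Node` class, the numeric `level` encoding (0 for X, 1 for X-bar, 2 for XP), and the `classify` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    cat: str                      # category letter, e.g. "X", "N", "V"
    level: int                    # 0 = head X, 1 = X-bar, 2 = maximal projection XP
    children: list = field(default_factory=list)
    label: str = ""               # display name for unanalyzed leaves

def classify(xp):
    """Walk down the projection line of xp and name each non-head daughter."""
    roles = []
    node = xp
    while node.level > 0:
        # The projection line continues through the one daughter that shares
        # the category and is not itself a maximal projection.
        spine = next(c for c in node.children
                     if c.cat == node.cat and c.level < 2)
        for sister in node.children:
            if sister is spine:
                continue
            if node.level == 2:
                role = "specifier"      # sister of X-bar, daughter of XP
            elif spine.level == 1:
                role = "adjunct"        # sister of X-bar, daughter of X-bar
            else:
                role = "complement"     # sister of the head X
            roles.append((sister.label, role))
        node = spine
    return roles

# The abstract schema: [XP YP [X-bar [X-bar X WP] ZP]]
schema = Node("X", 2, [
    Node("Y", 2, label="YP"),
    Node("X", 1, [
        Node("X", 1, [Node("X", 0, label="X"), Node("W", 2, label="WP")]),
        Node("Z", 2, label="ZP"),
    ]),
])
print(classify(schema))
```

Nothing in `classify` inspects what the sister phrases mean; the role follows entirely from where the phrase attaches, which is the point of the schema.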

The schema applies uniformly across categories. The same structural template gives:

  • NP: noun head, complement (e.g., a PP after the noun), specifier (a determiner, in older X-bar; modern theory splits NP into DP).
  • VP: verb head, complement (the direct object), specifier (in some analyses, the subject).
  • AP: adjective head, complement (e.g., a PP, as in proud of her son), specifier (a degree word such as very).
  • PP: preposition head, complement (the object of the preposition), specifier.
  • IP/TP: inflection/tense head (the auxiliary or inflected verb), complement (VP), specifier (the subject).
  • CP: complementizer head (that, whether), complement (IP), specifier (the wh-phrase).

The cross-categorial generalization is the substance of the theory. Languages differ in parameters (head-initial vs head-final, where the specifier sits) but agree on the structural schema.
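One way to see the cross-categorial claim is to treat the schema as a rule template and instantiate it per category. The sketch below assumes English-style daughter order (left specifier, head before complement); the function name `xbar_rules` and the rule encoding as `(lhs, rhs)` pairs are illustrative choices, not part of the theory.

```python
def xbar_rules(cat, spec=None, comp=None, adjuncts=()):
    """Instantiate the X-bar schema for one category as (lhs, rhs) rules.

    Daughter order is head-initial with a left specifier, as in English.
    """
    xp, xbar = cat + "P", cat + "-bar"
    rules = [(xp, [spec, xbar] if spec else [xbar])]       # XP -> (Spec) X-bar
    for adj in adjuncts:
        rules.append((xbar, [xbar, adj]))                  # X-bar -> X-bar Adjunct
    rules.append((xbar, [cat, comp] if comp else [cat]))   # X-bar -> X (Complement)
    return rules

for lhs, rhs in xbar_rules("N", spec="Det", comp="PP", adjuncts=["PP"]):
    print(lhs, "->", " ".join(rhs))
for lhs, rhs in xbar_rules("V", comp="NP", adjuncts=["PP", "AdvP"]):
    print(lhs, "->", " ".join(rhs))
```

The same three rule shapes appear for every category; only the category letter and the permitted specifier, complement, and adjunct types change.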

The Three Levels

X-bar's distinctive feature is the intermediate X-bar level between the head X and the maximal projection XP. Why have an intermediate level?

The argument: certain constituency-test data only makes sense with three levels:

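  • One-replacement in English NPs: this king of England and that one. The pro-form one stands for king of England, an intermediate constituent (N-bar), not the head alone (king) and not the full NP (this king of England). Replacing only the head while stranding its complement is degraded for many speakers: *the king of England and the one of France.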
  • Coordination at the intermediate level: the [king of England] and [queen of France]. Two N-bar constituents conjoined under a single determiner.
  • Adjuncts attaching to X-bar: the [happy] [king of England]. The adjective happy attaches to N-bar, not to N or NP directly.

The three-level analysis explains all of these consistently; flat structures cannot.
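The N-bar argument can be made concrete by listing which word strings a given tree treats as N-bar constituents, since those are the strings the analysis predicts one can stand for. This is a small sketch under assumptions: trees are nested tuples of the form `(label, child, ...)`, phrases such as the PP are left unanalyzed, and the helper names `leaves` and `constituents` are hypothetical.

```python
def leaves(tree):
    """Return the words dominated by a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, *kids = tree
    return [w for k in kids for w in leaves(k)]

def constituents(tree, label):
    """Yield the word string of every subtree whose root carries `label`."""
    if isinstance(tree, str):
        return
    root, *kids = tree
    if root == label:
        yield " ".join(leaves(tree))
    for k in kids:
        yield from constituents(k, label)

np = ("NP",
      ("Det", "the"),
      ("N-bar",
       ("AP", "angry"),
       ("N-bar", ("N", "king"), ("PP", "of England"))))

print(list(constituents(np, "N-bar")))   # candidate antecedents for "one"
print(list(constituents(np, "N")))       # the bare head is not among them
```

A flat structure with Det, AP, N, and PP as sisters would yield no such intermediate strings at all, which is why it cannot state the one-replacement facts.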

Specifiers, Complements, and Adjuncts

The three non-head positions have distinct properties.

Complement (selected by the head):

  • Is the immediate sister of X.
  • Is selected by the head's lexical specification (a verb that takes a direct object, a preposition that takes a noun-phrase object).
  • Is required (the head's argument structure demands it, modulo specific exceptions).

Specifier (sister of X-bar):

  • Is the sister of the topmost X-bar and a daughter of XP.
  • Often hosts the external argument (the subject in VPs).
  • Is determined by the syntactic-functional role.

Adjunct (sister of X-bar, dominated by another X-bar):

  • Is recursive: multiple adjuncts can stack at the same level.
  • Is optional.
  • Modifies the head's meaning (degree, manner, location, etc.).

Diagnostic differences: complements sit next to the head that selects them and do not iterate; adjuncts can be omitted or stacked freely; specifiers are limited, in the classical schema, to one per phrase.

Worked Example: NP Structure

Consider the NP the angry king of England with the gold crown.

Tree (in classical X-bar with NP rather than DP):

                NP
               /  \
            Det    N-bar
             |     /    \
            the  N-bar    PP (adjunct)
                 /   \     |
               AP    N-bar with the gold crown
               |     /   \
              Adj   N     PP (complement)
               |    |      |
            angry  king   of England

Reading:

  • The complement of king is of England (selected; the noun king takes an of-X complement specifying which king).
  • angry is an adjunct (modifies king of England; could be removed).
  • with the gold crown is another adjunct (further modifies).
  • the is in the specifier position (the determiner).

The recursion in adjunct positions explains why we can have the angry, eccentric, [recently-deposed] king of England with the gold crown and the silver scepter: each adjunct adds another N-bar layer, so adjunct stacking is unbounded.
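The unbounded stacking follows from the fact that adjunction maps an N-bar to another N-bar. A minimal sketch of that recursion, assuming nested-tuple trees and hypothetical helpers `adjoin`, `words`, and `count_label`:

```python
def adjoin(nbar, modifier, side="left"):
    """Return a new N-bar whose daughters are the old N-bar and a modifier."""
    if side == "left":
        return ("N-bar", modifier, nbar)
    return ("N-bar", nbar, modifier)

def words(tree):
    """Return the words dominated by a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, *kids = tree
    return [w for k in kids for w in words(k)]

def count_label(tree, label):
    """Count the nodes in a tree that carry `label`."""
    if isinstance(tree, str):
        return 0
    root, *kids = tree
    return (root == label) + sum(count_label(k, label) for k in kids)

# Head plus complement forms the innermost N-bar.
nbar = ("N-bar", ("N", "king"), ("PP", "of England"))
nbar = adjoin(nbar, ("AP", "angry"))                         # first adjunct
nbar = adjoin(nbar, ("PP", "with the gold crown"), "right")  # second adjunct
np = ("NP", ("Det", "the"), nbar)

print(" ".join(words(np)))
print(count_label(np, "N-bar"))   # one N-bar per adjunct, plus the innermost one
```

Because `adjoin` returns the same kind of object it takes, it can be applied any number of times; the complement slot, by contrast, is filled once when the innermost N-bar is built.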

Modern Modifications: DP, IP, CP, vP

Since the original X-bar formulation, several refinements have entered the standard analysis.

The DP hypothesis (Abney 1987): the determiner is the head of the noun phrase, not the noun. The king is a DP (determiner phrase) with the as the head D and the NP king as its complement.

       DP
      /  \
     D    NP
     |    |
    the   N
          |
         king

Split-IP (Pollock 1989): the inflectional head splits into separate functional heads, each projecting its own phrase. Pollock argued for distinct Tense and Agreement projections; later work added further heads such as Aspect. The richer functional structure explains cross-linguistic variation in word order and verb movement.

The vP / VP shell (Larson 1988, Chomsky 1995): the verb phrase has a "little v" (light verb) head above the lexical V. The split explains how verbs assign theta-roles to subjects (specifier of vP) and objects (complement of V).

Bare phrase structure (Chomsky 1995, Minimalist Program): the X-bar template is derived from more general principles (Merge, Move) rather than stipulated. Modern minimalism uses bare phrase structure in which projections are determined by feature checking rather than by an X-bar schema.

These modifications complicate the surface representation but preserve the cross-categorial generalization that X-bar identified.

Other syntactic frameworks, including LFG, HPSG, CCG, and dependency grammar, do not use X-bar theory in the same way. X-bar is therefore best read as a classical generative schema and a teaching bridge, not as the only serious theory of syntactic structure.

Cross-Linguistic Parameters

X-bar's structural schema is universal; languages set parameters that determine word order.

Head-direction parameter:

  • Head-initial languages (English, Romance, and Mandarin in its verb phrase): head precedes complement. The king of England; to the store; eats apples.
  • Head-final languages (Japanese, Korean, Turkish, basic word order in Hindi/Urdu): head follows complement. The Japanese equivalent of "of England" precedes "king".

Specifier-direction parameter:

  • Most languages: specifier precedes X-bar. The specifier sits on the left, and the order of head and complement inside X-bar is set separately by the head-direction parameter.
  • A few languages have post-head specifiers; these are typological outliers.

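The combination of head-direction and specifier-direction parameters yields SVO and SOV fairly directly. VSO, the third major type in Greenberg's typology, is usually analyzed as involving additional verb movement rather than a separate base order.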
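The effect of the two parameters on word order can be sketched as a linearization function over an unordered phrase description. This is a toy under strong assumptions: one global setting is applied to every category (real languages can mix settings), words are English glosses rather than data from any particular language, and the dictionary encoding and the name `linearize` are invented for illustration.

```python
def linearize(phrase, head_initial=True, spec_first=True):
    """Order the words of a phrase given the two parameter settings.

    `phrase` is a dict with a 'head' word and optional 'comp' and 'spec'
    entries, each either a word or another phrase dict.
    """
    def render(part):
        if part is None:
            return []
        if isinstance(part, str):
            return [part]
        return linearize(part, head_initial, spec_first)

    head = [phrase["head"]]
    comp = render(phrase.get("comp"))
    spec = render(phrase.get("spec"))
    xbar = head + comp if head_initial else comp + head   # head-direction
    return spec + xbar if spec_first else xbar + spec     # specifier-direction

clause = {"spec": "she", "head": "eats", "comp": "apples"}
pp = {"head": "to", "comp": {"head": "the", "comp": "store"}}

print(" ".join(linearize(clause, head_initial=True)))    # subject-verb-object order
print(" ".join(linearize(clause, head_initial=False)))   # subject-object-verb order
print(" ".join(linearize(pp, head_initial=True)))
print(" ".join(linearize(pp, head_initial=False)))
```

The hierarchical description is identical in every call; only the parameter values change, which is the sense in which the schema is held constant while word order varies.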

ML Connections

Constituency parsing

Constituency parsers produce phrase-structure trees, which are related to the X-bar tradition but usually simplified for annotation. Penn Treebank-style trees are not full X-bar analyses; they are annotated structures built for consistent parsing and evaluation.

Probing transformer representations for X-bar structure

The Hewitt-Manning structural probe (Hewitt and Manning 2019; probing-classifiers-for-linguistic-structure) recovers dependency-tree structure from BERT. Targeted syntactic evaluations (Marvin and Linzen 2018) and edge probes (Tenney et al. 2019) test related phrase-structure information. BERT's middle layers often contain enough syntactic information for probes to recover substantial tree structure, but a probe result is evidence of extractable information, not proof of an explicit X-bar tree in the model.

Tree-LSTM and tree-structured architectures

Pre-transformer models (Socher et al. 2013, Tai et al. 2015) explicitly built neural networks whose computation follows a parse tree, composing phrase representations bottom-up. The structural bias helped on tasks sensitive to long-range dependencies but didn't scale to large corpora as well as transformers.

Compositional distributional semantics

The Coecke-Sadrzadeh-Clark framework maps syntactic types (pregroup-grammar types in its original formulation, playing a role comparable to head-argument structure in X-bar) to vector spaces and tensors. Argument-taking words such as verbs correspond to higher-order tensors; their subjects and objects correspond to vectors that contract into the tensor. The resulting sentence representation is compositional in a precise sense.
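The contraction step can be illustrated with a toy transitive verb, represented as an order-3 tensor with one index for the subject, one for the sentence space, and one for the object. The sketch below uses plain nested lists and made-up numbers; the function name, the dimensions, and the word vectors are all assumptions for illustration, not values from the framework's literature.

```python
def apply_transitive_verb(verb, subj, obj):
    """Contract verb[i][k][j] with subj[i] and obj[j], leaving sentence index k."""
    n_sentence = len(verb[0])
    return [
        sum(subj[i] * verb[i][k][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj)))
        for k in range(n_sentence)
    ]

# Toy 2-dimensional noun space and 1-dimensional sentence space (made-up numbers).
dogs, cats = [1, 0], [0, 1]
chase = [[[0, 1]],    # subject dimension 0 responds to object dimension 1
         [[0, 0]]]    # subject dimension 1 gives no response

print(apply_transitive_verb(chase, dogs, cats))   # "dogs chase cats"
print(apply_transitive_verb(chase, cats, dogs))   # "cats chase dogs"
```

The two sentences contain the same words but receive different values, because the tensor distinguishes which vector fills the subject slot and which fills the object slot.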

Common Mistakes

Watch Out

Confusing X-bar theory with bare phrase structure

X-bar theory (1970s-1980s): structural schema is stipulated; all phrases follow the X-bar template. Bare phrase structure (1995 onward): projections emerge from Merge operations and feature-checking; the X-bar template is derived rather than primitive. Modern minimalism uses bare phrase structure; classical theory uses X-bar.

Watch Out

Treating English as the universal type

English is head-initial in most categories. Many languages (Japanese, Korean, Turkish, classical Sanskrit) are head-final in most categories. The X-bar schema is universal; the parameter settings differ.

Watch Out

Forgetting the DP hypothesis

Modern syntax treats the king as a DP with the as the head, not as an NP with king as the head. The shift is substantively motivated by determiner behavior; introductory treatments sometimes still use the older NP-with-determiner specifier analysis.

Watch Out

Confusing complements with adjuncts

Complements are selected by the head; adjuncts are not. Iterativity is one diagnostic: adjuncts can stack (the angry, eccentric king); complements cannot. Movement and ellipsis diagnostics also distinguish them.

Cross-Network Links

  • LinguisticsPath internal: prerequisites constituency-tests, morpheme-and-allomorph; next natural topics are Government and Binding, Minimalism, and dependency grammar.
  • TheoremPath direction: constituency parsing, neural grammars, and syntactic probes are the ML-theory side of this syntax topic.
  • ComputationPath: context-free grammars, X-bar production rules as a CFG fragment, and parsing algorithms (CYK, Earley) provide the formal-language framing.

References

Canonical:

  • Chomsky, Noam. "Remarks on Nominalization." Readings in English Transformational Grammar (1970) 184-221.
  • Jackendoff, Ray. X-bar Syntax: A Study of Phrase Structure (1977).
  • Carnie, Andrew. Syntax: A Generative Introduction (2021, 4th ed.), Chapter 6.
  • Radford, Andrew. Analysing English Sentences (2009, 2nd ed.), Chapters 3-4.
  • Adger, David. Core Syntax: A Minimalist Approach (2003), Chapter 4.

Later Developments and Computation:

  • Chomsky, Noam. "Bare Phrase Structure." Government and Binding Theory and the Minimalist Program (1995) 383-439.
  • Pollock, Jean-Yves. "Verb Movement, Universal Grammar, and the Structure of IP." Linguistic Inquiry 20 (1989) 365-424.
  • Kitaev, Nikita, and Dan Klein. "Constituency Parsing with a Self-Attentive Encoder." ACL (2018).