
Section
Linguistics
Path
Language structure for machine learning, formal grammar, and meaning.
Sounds, word structure, syntax, meaning, use, and learned representations belong on the same map. The point is to keep the linguistic object precise before connecting it to NLP and language models.
Start here
The launch spine
9 published topics
Phonology · 12 min
Phoneme vs Allophone
The phoneme is the smallest unit of sound that can distinguish meaning in a language; an allophone is a context-determined surface realization of a phoneme. Distinguishing them is the foundational analytical mo...
Morphology · 11 min
Morpheme and Allomorph
The morpheme is the smallest unit of meaning in a language; an allomorph is a context-determined surface realization of a morpheme. The plural -s in English is realized as three allomorphs depending on the precedi...
Syntax · 11 min
Constituency Tests
Standard diagnostics for whether a sequence of words forms a syntactic unit (constituent): substitution, movement, coordination, ellipsis, and pro-form replacement. The empirical foundation of phrase-structure...
Computational Linguistics · 14 min
The Distributional Hypothesis
Words that occur in similar contexts tend to have similar meanings; the empirical claim that grounds vector semantics, word2vec, and the dense-embedding parts of modern language models.
Computational Linguistics · 14 min
Probing Classifiers for Linguistic Structure
Train a small classifier to predict a linguistic label (POS tag, dependency relation, syntactic depth) from a frozen language model's hidden states. If the probe succeeds under appropriate controls, the represe...
Phonology
Sound systems, contrast, allophony, and the analytical machinery behind speech.
Morphology
Meaning-bearing word pieces, surface variants, and how morphology differs from tokenization.
Syntax
Constituents, phrase structure, and the tests that make syntactic claims empirical.
Constituency Tests
core · 11 min
Standard diagnostics for whether a sequence of words forms a syntactic unit (constituent): substitution, movement, coordination, ellipsis, and pro-form replacement. The empirical foundation of phrase-structure analysis and the entry point t...
X-Bar Theory
advanced · 14 min
A schema for phrase structure: every phrase has a head, with complement, specifier, and adjunct positions arranged in a uniform XP / X-bar / X tree. The structural framework that organizes constituency-test data into a coherent theory of syntact...
Formal Semantics
Truth-conditional meaning, typed lambda calculus, and compositional interpretation.
Pragmatics
Meaning in use: implicature, cooperation, context, and what literal content leaves unsaid.
Computational Linguistics
Distributional semantics, embeddings, and careful tests of linguistic structure in language models.
Probing Classifiers for Linguistic Structure
advanced · 14 min
Train a small classifier to predict a linguistic label (POS tag, dependency relation, syntactic depth) from a frozen language model's hidden states. If the probe succeeds under appropriate controls, the representation contains information p...
The Distributional Hypothesis
core · 14 min
Words that occur in similar contexts tend to have similar meanings; the empirical claim that grounds vector semantics, word2vec, and the dense-embedding parts of modern language models.
Vector Semantics and word2vec, Revisited
advanced · 14 min
Words as vectors in a high-dimensional space, where geometric closeness reflects semantic similarity. The algorithmic instantiation of the distributional hypothesis: Mikolov's 2013 word2vec (skip-gram and CBOW) and the Levy-Goldberg 2014 re...