A proposal for a formal model, Fragment Grammars, that treats productivity and reuse as the target of inference in a probabilistic framework.
Language allows us to express and comprehend an unbounded number of thoughts. This fundamental and much-celebrated property is made possible by a division of labor between a large inventory of stored items (e.g., affixes, words, idioms) and a computational system that productively combines these stored units on the fly to create a potentially unlimited array of new expressions. A language learner must discover a language's productive, reusable units and determine which computational processes can give rise to new expressions. But how does the learner differentiate between the reusable, generalizable units (for example, the affix -ness, as in coolness, orderliness, cheapness) and apparent units that do not actually generalize in practice (for example, -th, as in warmth but not coolth)? In this book, Timothy O'Donnell proposes a formal computational model, Fragment Grammars, to answer these questions. This model treats productivity and reuse as the target of inference in a probabilistic framework, asking how an optimal agent can make use of the distribution of forms in the linguistic input to learn the distribution of productive word-formation processes and reusable units in a given language.
O'Donnell compares this model to a number of other theoretical and mathematical models, applies them to the English past tense and to English derivational morphology, and shows that Fragment Grammars unifies a number of superficially distinct empirical phenomena in these domains and justifies certain seemingly ad hoc assumptions in earlier theories.