I S K O

edited by Birger Hjørland and Claudio Gnoli

 

Semantic primitives

by

(This is a version of an article published in JASIST, see colophon.)

Table of contents:
1. Introduction: levels of part–whole relations
2. Semantic primitives
3. Compositionality and componential analysis
    3.1 Compositionality
    3.2 Componential analysis
4. Theoretical perspectives
    4.1 Empiricism
    4.2 Rationalism
    4.3 Historicism
    4.4 Pragmatism
    4.5 Atomism versus holism
    4.6 The analogy with the periodic system
    4.7 Further arguments against primitives
5. Compositionality and semantic primitives in information science
    5.1 Facet analysis and semantic factoring
    5.2 Szostak and basic concepts
    5.3 Thesauri and Spärck Jones
    5.4 Other examples and issues
6. Conclusion
Acknowledgments
Endnotes
References
Colophon

Abstract:
The term semantic primitives refers to a set of basic, atomic concepts from which all other (compound) concepts are constructed. It presupposes the principle of compositionality — the idea that complex items or expressions can be formed by combining simpler constituents. Both notions are of particular relevance to knowledge organization (KO), where concepts are understood to be the primary objects of organization in knowledge organization systems (KOS). Semantic primitives, therefore, may be viewed as candidates for foundational units in such systems. Moreover, these concepts play important roles in fields such as automatic language processing, lexicography, word sense disambiguation, and artificial intelligence. In KO, they relate to methods such as semantic factoring and facet analysis, while in linguistics they parallel componential analysis. Nevertheless, semantic primitives and compositionality remain controversial, with strong arguments both for and against their very existence. The philosophical assumptions underlying these debates have significant implications for information science and knowledge organization.

 

1. Introduction: levels of part–whole relations

Primitives are elements that can be combined to form wholes. According to the semiotic triangle, three levels of such part–whole relations may be distinguished: (A) the ontological level, (B) the conceptual or semantic level, and (C) the linguistic/semiotic level [1].

Re A: Things (objects, wholes) may consist of parts or elements. A person, for example, consists of arms, which include hands, which in turn include fingers, and so on. This exemplifies the level of ontological relations between wholes and their parts. An example from chemistry is that all matter is composed of approximately 118 chemical elements. This level concerns compositionality at the level of reality. Part–whole relationships such as these are studied in individual sciences like anatomy and chemistry, and on a more general level in fields such as mereology (Varzi 2019). However, as argued in the next section, any knowledge of things is necessarily achieved through the conceptualization of those things.

Re B: Things and their parts are also conceptualized — for example, the concepts [hand], [finger], [water], and [hydrogen] [2]. The relations between such concepts are called conceptual or semantic relations. We often assume that our concepts represent reality unambiguously, but this presupposes a form of naïve realism (i.e., the belief that the world is as we perceive it to be). An alternative view is that we necessarily view reality from a perspective, a conceptualization, or a paradigm [3]. For example, most conceptualizations of [finger] include the thumb, but not all do; modern science conceptualizes water as composed of oxygen and hydrogen, whereas ancient science regarded water as an indivisible element. From this latter perspective, all things and all relations are known to us only as concepts and through their semantic relations [4]. We cannot avoid concepts when speaking about reality, but we must critically consider which conceptualizations best represent reality for a given purpose, and which epistemologies allow us to construct the most fruitful theories and concepts. In chemistry, conceptualizations include those of chemical elements and their compounds. This level concerns compositionality at the level of mental representation. Contemporary chemistry employs concepts of elements and compounds that differ fundamentally from the ancient four elements (fire, air, water, and earth), once believed to be composed of Platonic solids (cf. Scerri 2020, 3). This implies that our concepts and their semantic relations — including part–whole relations — are theory-dependent and dynamic. The conceptual level is of particular importance to knowledge organization (KO), where concepts are understood to be the entities organized in → knowledge organization systems (KOS) (cf. Dahlberg 2009; Hjørland 2021) [5]. Semantic primitives may thus be of interest as candidates for basic units in KOSs.

Re C: Concepts may or may not be represented in a language or in another sign system, such as chemical notation. When they are, the concepts are said to be lexicalized [6]. Relations between compound signs and their components are referred to as lexical or semiotic relations. Lexical and semantic relations are often confused in the literature, and they can sometimes be difficult to distinguish. Prominent linguists (e.g., Murphy 2003, 3) argue that there is no consensus on whether a lexical–conceptual boundary exists at all. As Murphy writes: “We thus need to look at how words, or word meanings are related — not just how things in the world are related” (Murphy 2003, 11). This perspective contrasts with a common view in KO, in which semantic and lexical relations are treated as distinct. Soergel (1998, 2–3), for instance, described lexical relations as linking individual words, while conceptual–semantic relations link concepts — a “commonly accepted distinction”.

Soergel (1998) also demonstrated how this distinction was confused by the prominent cognitive scientist George Miller, who claimed in the book under review that “the basic semantic relation in WordNet is synonymy”. In → thesauri, however, synonymy (represented by the symbols USE and UF, “used for”) is treated as a lexical relation, as it holds between different terms denoting the same concept. In contrast, semantic relations such as hypernymy (represented by the symbol BT, “broader term”) link different concepts. There can be no synonymy relation between concepts, because two “identical” concepts would simply be one and the same concept [7]. Thus, Soergel's criticism of Miller appears well founded.

Primitives at the linguistic/semiotic level are understood as terms (or signs) that are used to explain other terms, concepts, or expressions, but which themselves cannot be defined using other terms or signs. Chemical formulas represent the composition of chemical compounds: H₂O, for example, denotes water as consisting of hydrogen (H) and oxygen (O) in a specific quantitative relation [8]. Yet, there is not necessarily a direct correspondence between complex concepts and compound words or phrases. The term greenhouse, for instance, is not to be analyzed as [green] + [house], but rather as expressing the concepts [glass building] + [to grow plants].

Unlike chemical formulas, natural languages do not systematically represent ontological or conceptual relations based on scientific theories. Nevertheless, Wierzbicka (2006) proposed that all natural languages contain a universal set of semantic primitives — lexicalized in every language — from which all other concepts are derived, albeit differently across languages. This level concerns compositionality at the level of linguistic or semiotic representation (although Wierzbicka's theory is at the same time also a theory about the conceptual level).

The present article examines different views on compositionality and semantic primitives, with a focus on their epistemological implications and their application in information science. It provides a broad review of the literature, drawing on multiple fields including linguistics, cognitive science, philosophy, and knowledge organization. The discussion spans developments from Aristotle to the present day.

Section 2 introduces the notion of semantic primitives in relation to perspectives from artificial intelligence and contemporary linguistic research. Section 3 addresses compositionality and componential analysis: Section 3.1 discusses compositionality as a prerequisite for working with semantic primitives, while Section 3.2 focuses on componential analysis as a method for decomposing complex concepts. Section 4 presents theoretical perspectives and is subdivided into seven parts: 4.1 Empiricism, 4.2 Rationalism, 4.3 Historicism, 4.4 Pragmatism, 4.5 Atomism versus holism, 4.6 The analogy with the periodic system, and 4.7 Further arguments against primitives.

Section 5 explores the application of semantic primitives and semantic decomposition in information science, organized into four subsections: 5.1 Facet analysis and semantic factoring, 5.2 Szostak and basic concepts, 5.3 Thesauri and Spärck Jones, and 5.4 Other examples and issues. Finally, Section 6 offers the overall conclusion of the study.

2. Semantic primitives

Semantic primitives may also be termed “semantic primes”, “primes”, “conceptual primitives”, “atomic concepts”, or “basic sense-components”. They are opposed to complex concepts, which are decomposable concepts composed of simpler concepts. A famous example is → Aristotle's definition by genus and differentia, such as [man] defined as [animal] + [rational]. Aristotle imagined that the concepts [rational] and [animal] themselves could be analyzed further, until everything would be defined in terms of a set of primitive concepts, which could not be further analyzed. In Categories (Aristotle 1984, 3–24) Aristotle called the ultimate primitives “categories”, of which he suggested 10: substance, quantity, quality, relation, place, time, situation, condition, action, and passion. However, in other writings, Aristotle listed different sets of categories and, according to Sowa (1984, 14–15), he never gave a definite set of primitives. As we shall see, it is important that Aristotle did not depend on these primitives in his biological classification [9].

The concept of semantic primes has played important roles in relation to automatic language processing, lexicography, meaning disambiguation, knowledge organization, artificial intelligence, etc. Sowa (1984, 14) wrote about the first mechanical systems based on primes:

The AI [artificial intelligence] goal of mechanically reducing concepts to primitives was first proposed by Ramon Lull [Llull] in the thirteenth century. His Ars Magna [1305–1308] was a system of disks inscribed with primitive concepts, which could be combined in various ways by rotating the disks. Under the influence of Lull's system, Leibniz (1679, 1903) developed his Universal Characteristic. He represented primitive concepts by prime numbers and compound concepts by products of primes. Then statements of the form All A is B are verified by checking whether the number for A is divisible by the number for B. If PLANT is represented by 17 and DECIDUOUS by 29, their product 493 would represent DECIDUOUS-PLANT. If BROAD-LEAFED-PLANT is represented by 20,213 and VINE by 1,192,567, the statement All vines are broad-leafed plants is judged to be true because 1,192,567 is divisible by 20,213. Leibniz envisioned a universal dictionary for mapping concepts to numbers and a calculus of reasoning that would automate the syllogism. To simplify the computations, he invented the first calculating machine that could do multiplication and division. With the advent of electronic computers, computational linguistics set out to implement Leibniz's universal dictionary.
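The arithmetic in Sowa's description can be checked with a minimal Python sketch. The numeric codes are those given in the quotation; the function names `combine` and `all_a_are_b` are illustrative assumptions and do not come from Leibniz or Sowa.

```python
from math import prod

# Codes taken from the quotation above.
PLANT = 17
DECIDUOUS = 29
BROAD_LEAFED_PLANT = 20_213
VINE = 1_192_567


def combine(*codes: int) -> int:
    """Represent a compound concept as the product of its component codes."""
    return prod(codes)


def all_a_are_b(a: int, b: int) -> bool:
    """'All A is B' holds when the number for A is divisible by the number for B."""
    return a % b == 0


deciduous_plant = combine(PLANT, DECIDUOUS)
print(deciduous_plant)                         # 493
print(all_a_are_b(VINE, BROAD_LEAFED_PLANT))   # True
print(all_a_are_b(BROAD_LEAFED_PLANT, PLANT))  # True
print(all_a_are_b(PLANT, DECIDUOUS))           # False
```

Because 20,213 = 17 × 29 × 41, the code for BROAD-LEAFED-PLANT already contains the codes for PLANT and DECIDUOUS, which is why the divisibility test propagates up the hierarchy.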

There have been many suggestions of sets of semantic primitives, but there have also been serious arguments about the very idea of their existence [10], and even the meaning of the term is not generally agreed upon. One of the most influential researchers in this field today is the Polish linguist Anna Wierzbicka (2006, 134), who defined semantic primitives this way:

“Semantic primitives” (or primes) are hypothetical elementary building blocks from which all meanings and all human thoughts are built — rather like chemical elements discovered by Mendeleev are elementary building blocks of all chemical substances [11].

Wierzbicka's theory, called Natural Semantic Metalanguage (NSM), was described by Goddard (1998, 129) as “the most resilient and well-developed theory of semantic primitives”. Wierzbicka refers to Leibniz's idea of an “alphabet of human thoughts” and holds that, drawing on contemporary linguistic knowledge, she has accomplished what Leibniz attempted but did not achieve (and which, in her opinion, he could not have achieved, lacking adequate linguistic tools). Wierzbicka's definition includes the following claims:

a. Primes are undefinable concepts.
b. All other concepts are derived from primes (and, by implication, any concept can be reduced to these primes or “kernels”).
c. Primes are universal across all human languages.
d. The analogy with chemical elements is appropriate.

In addition, Wierzbicka suggests (although not in this quote):

e. Primes are innate in human beings.

Wierzbicka's definition seems to be a theory rather than a definition, a trait it shares with all accounts of semantic primitives. It is therefore more fruitful to examine these theoretical claims than to take them as given. Wierzbicka (1996) identified primes in three steps: first defining the concepts, then applying componential analysis (see Section 3.2), and finally comparing the meanings of the resulting elements with corresponding elements in other languages, accepting as primes only those that can be universally identified. In the process of defining concepts, she uses a broad spectrum of sources about their ordinary understanding (as contrasted with their scholarly understanding). In contrast to most other scholars, who tend to base their identification of primitives exclusively on logic and common sense, Wierzbicka's method also involves cultural studies.

Wierzbicka has identified a set of 65 primes. The number grew from an initial 14 (Wierzbicka 1972) but, according to Wierzbicka (2021), has remained relatively stable since 2014 (Table 1) [12].

Table 1: Wierzbicka's list of 65 semantic primes
(adapted from Levisen and Waters 2017, 12 and Goddard and Wierzbicka 2014)
 
Category | Primes
Substantives | I~me, you, someone, people, something/thing, body
Relational substantives | Kind, part
Determiners | This, the same, other~else~another
Quantifiers | One, two, some, all, much/many, little/few
Evaluators | Good, bad
Descriptors | Big, small
Mental predicates | Think, know, want, don't want, feel, see, hear
Speech | Say, words, true
Actions, events, movement | Do, happen, move
Existence, possession | Be (somewhere), there is, be (someone/something), (is) mine
Life and death | Live, die
Time | When/time, now, before, after, a long time, a short time, for some time, moment
Space | Where/place, here, above, below, far, near, side, inside, touch (contact)
Logical concepts | Not, maybe, can, because, if
Intensifier, augmentor | Very, more
Similarity | Like/as/way

Wierzbicka approaches this theory from the standpoint of linguistics, basing her analysis on natural language rather than on subject-specific vocabularies or scientific theories. As noted earlier, her view implies that the distinction between conceptual relations and lexical/semiotic relations collapses: the same primitives serve both semantic and lexical functions. She argues that a universal set of semantic primitives exists across all human languages. While these primes are lexicalized differently across languages, the corresponding words share the same meanings.

According to Wierzbicka, all complex concepts are ultimately derived from this set of approximately 65 primitives. This claim, however, poses interpretive challenges. Take the example of the ordinary concept [sugar]. As neither [mouth] nor [sweet] is included among the 65 primes, one would need to define sugar using only the approved primitives — for example, [a kind of thing], [this thing is small], [people can do something with this thing], [when people eat it, they feel something good]. Such a definition appears inadequate, particularly for use in contexts such as thesauri for information retrieval. It seems doubtful that complex and specialized concepts can be satisfactorily reduced to a small, fixed set of primitives without significant loss of meaning or practical usefulness.

We will consider more arguments that have been raised against Wierzbicka's theory in Section 4.

We conclude this section by presenting an alternative definition of semantic primitives proposed by the British computer scientist Yorick Wilks (2007, 106). His formulation is, in some respects, less restrictive and more directly useful for applications in information science:

A PRIMITIVE (or rather a set of primitives plus a syntax etc.) is a reduction device which yields a semantic representation for a natural language via a translation algorithm and which is not plausibly explicated in terms of or reducible to other entities of the same type.

This definition leaves open, as it is intended to, the serious question of whether or not primitives are explicable in terms of, or reducible to, entities of some quite other type. This is a serious question because most attacks on the use of primitives take the form of demands that they be explicated in terms of some other type of entity altogether; just as most bad defenses of primitives take the form of offering some very weak equivalence between primitives and other types of entity.

As we shall see, there are several ways of defining and understanding prime and related terms. We will also encounter the term basic concept, which is sometimes defined differently from prime (cf. Section 5.2). The theoretical perspectives of researchers employing the term semantic primitive significantly influence how the term is understood, and the reported number of semantic primitives in the literature varies widely. Elkin & Brown (2023, 222) note that, for the clinical domain alone, the estimated number of primitives ranges from 20,000 to 1,000,000 — though it has not been possible to verify this claim in their cited sources.

3. Compositionality and componential analysis

The idea of combinable elements is central to the concept of compositionality, which plays an important role in linguistics, faceted classification, and other areas. Montague (1970) argued that natural languages are theoretically very similar to artificial logical languages (which are prototypical examples of compositional systems). This claim has sparked sustained interest in linguistic work on compositionality.

The method for analyzing complex concepts is often referred to as componential analysis or semantic factoring. These approaches are also widely discussed in the literature on computational linguistics, information science, and knowledge organization.

3.1 Compositionality

Compositionality is a concept that is presupposed in theories involving semantic primitives: for such primitives to exist, they must be capable of combining into more complex concepts. As Goldberg (2016, 419) explains:

A principle of compositionality is generally understood to entail that the meaning of every expression in a language must be a function of the meaning of its immediate constituents and the syntactic rules used to combine them.

Compositionality thus implies a building-block model of meaning. It is a widely held — though highly contested — view that natural languages operate on this principle. Pelletier (1994) surveyed the literature and found approximately 318 arguments raised against compositionality, while only three or four arguments were offered in its favor.

Costello and Keane (2000, 337) emphasized four reasons in support of the principle of compositionality, which they argue is central to theories of language:

  1. First, compositionality is important in theories of language because it allows communication between people who have different knowledge. Under compositionality two language users will be able to understand each other as long as they both know the meaning of words in their language. Any differences in any other knowledge they have is irrelevant.
  2. Second, compositionality is important because it provides for the generative nature of language. An almost infinite number of new expressions can be produced by combining the words in a language in novel ways; under compositionality, all new expressions can be understood by anybody who knows the meaning of words in the language. If language is noncompositional, even someone who knows the words in the language could nevertheless be unable to understand some new expressions, if they lack the further specific information necessary for those expressions.
  3. Third, compositionality is important for accounts of language learning (Butler, 1995). Under compositionality, once a learner has grasped the meaning of the words in a language they will be able to understand any complex expression they come across, without needing to learn any further information. If language is noncompositional, a learner's task may never be complete: before understanding any complex expression they would have [to] learn not only its constituent words, but also any further specific information necessary for understanding that expression.
  4. Finally, compositionality is important for accounts of access to information in comprehension. Under compositionality, the information accessed in understanding a complex expression is exactly that information accessed in understanding the constituent words of that expression. The same information is accessed in comprehending a word no matter what complex expression it occurs in. If language is noncompositional, different information will be accessed in comprehending a word when it occurs in different complex expressions.

These arguments illustrate the theoretical and practical significance often attributed to compositionality — despite the numerous objections raised in the literature.

Pagin (1997, 14) wrote: “Compositionality seems to imply that the meaning of a complex expression is determined locally, by nothing else than what is internal to it, i.e. the meaning of its parts and its mode of composition. So the parts must have a meaning prior to the complex expression itself”. As we will see in Section 4.5, this assumption is directly challenged by theories of semantic holism.

More recently, systems based on artificial intelligence — such as ChatGPT — have introduced new perspectives on compositionality. As Nefdt and Potts (2024) observe:

Perhaps the most striking finding is that modern large language models seem to induce partial compositional analysis just from training on large quantities of text, using learning objectives that simply push the model to imitate those texts. In other words, these models seem, at least in some instances, to arrive at process compositional analyses, and this may explain how they are able to successfully generalize to novel expressions. This is an exciting new opportunity for interdisciplinary collaboration, as linguists and computer scientists can collaborate on assessing models using compositional phenomena and on studying the ways in which models perform highly abstract compositional analyses.

In Section 4, we will turn to the main arguments that have been raised against the principle of compositionality.

3.2 Componential analysis

Geeraerts (2005, 709) defined componential analysis (also known as decomposition analysis or semantic factoring) as follows:

Componential analysis is an approach that describes word meanings as a combination of elementary meaning components called semantic features or semantic components. The set of basic features is supposed to be finite [i.e., forming a set of semantic primitives] [13].

These features can either be present or absent in a given word and serve to define the word's meaning. For example, the word father can be analyzed into the semantic features [male] and [parent]. Similarly, Sowa (2003) defined semantic factoring as “the process of analyzing some or all of the categories of an ontology into a collection of primitives”.

More generally, componential analysis, decomposition analysis, and semantic factoring are variant terms for the method of analyzing the meaning of specific concepts by identifying the more general features or concepts they instantiate. The inverse process is referred to as conceptual combination (cf. Costello and Keane 2000), in which concepts are combined to form more complex ones.

In bibliographic classification, the processes of semantic factoring and conceptual combination are typically termed analysis and synthesis, respectively. Their combination constitutes the principle of analytico-synthetic classification (cf. Hjørland 2013, 547).

Lyons (1977, 317) pointed out an important precondition for componential analysis:

[Componential analysis] rests upon the thesis that the sense of every lexeme can be analyzed in terms of a set of more general sense-components* (or semantic features*), some or all of which will be common to several different lexemes in the vocabulary. […]
Componential analysis, interpreted this way, can be related to the idea of Leibniz and → Wilkins, which, as we saw earlier, served as an inspiration to Roget in the compilation of his thesaurus. [14]

Lyons's reference to Leibniz highlights the historical and conceptual connection between componential analysis and Wierzbicka's theory of semantic primitives. The connection to thesauri will be further discussed in Section 5.3.

According to the theory of componential analysis, every word can be decomposed into minimal, distinctive units of meaning — referred to as semantemes, semantic features, or semantic components. These minimal units form the theoretical basis for comparing meanings across lexical items and for modeling semantic structure.

One well-known application of componential analysis involves kinship terms, as demonstrated by Henning (2020) (Table 2).

Table 2: Example of componential analysis
(adapted from Henning 2020, 68)
 
Word | Semantemes
Father | male + parent
Mother | female + parent
Son | male + offspring
Daughter | female + offspring
Brother | male + sibling
Sister | female + sibling

In this example, six kinship terms are analyzed using five semantemes: male, female, parent, offspring, and sibling. As the number of kinship terms analyzed increases, the relative efficiency of the method becomes more apparent, since a relatively small set of semantemes can represent a large vocabulary of relational terms. Kinship terms such as those shown above may also be analyzed using three abstract components: sex, generation, and lineage. Sex is specified as male or female; generation can be represented numerically, with 0 denoting the proband's generation, -1 the previous generation, and +1 the next; and lineage can be classified as direct, colineal (as in siblings), or ablineal (as in uncles and aunts).
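The two analyses described above can be sketched in Python as a small data structure. The word-to-semanteme mapping follows Table 2; the values in `DIMENSIONS` are one reading of the sex, generation, and lineage description in the preceding paragraph, and all names (`analyze`, `synthesize`, `Kin`) are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional

# Table 2 as data: each kinship term is a set of semantemes.
KINSHIP = {
    "father": frozenset({"male", "parent"}),
    "mother": frozenset({"female", "parent"}),
    "son": frozenset({"male", "offspring"}),
    "daughter": frozenset({"female", "offspring"}),
    "brother": frozenset({"male", "sibling"}),
    "sister": frozenset({"female", "sibling"}),
}


def analyze(word: str) -> frozenset:
    """Componential analysis: decompose a word into its semantemes."""
    return KINSHIP[word]


def synthesize(*semantemes: str) -> Optional[str]:
    """Conceptual combination: find the word matching a set of semantemes."""
    wanted = frozenset(semantemes)
    for word, features in KINSHIP.items():
        if features == wanted:
            return word
    return None  # the combination is not lexicalized in this toy vocabulary


class Kin(NamedTuple):
    sex: str
    generation: int  # 0 = proband's generation, -1 = previous, +1 = next
    lineage: str     # "direct", "colineal", or "ablineal"


# Assumed assignment of the three abstract components to the six terms.
DIMENSIONS = {
    "father": Kin("male", -1, "direct"),
    "mother": Kin("female", -1, "direct"),
    "son": Kin("male", +1, "direct"),
    "daughter": Kin("female", +1, "direct"),
    "brother": Kin("male", 0, "colineal"),
    "sister": Kin("female", 0, "colineal"),
}

all_semantemes = frozenset().union(*KINSHIP.values())
print(sorted(all_semantemes))
print(synthesize("female", "sibling"))
print(sorted(analyze("father") & analyze("son")))
```

The first representation treats semantemes as an unordered set of features, while the second treats them as values along fixed dimensions. The second scales more naturally to terms such as uncle or aunt, which would only need a new lineage value rather than a new semanteme.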

Geeraerts (2005, 712) observed “that there is widespread agreement in linguistics about the usefulness of componential analysis as a descriptive and heuristic tool, but the associated epistemological view that there is a primitive set of basic features is generally treated with much more caution”. This skepticism is also found in information science, as will be discussed in Section 5.

4. Theoretical perspectives


4.1 Empiricism

Empiricism is an epistemological position that holds that all knowledge originates from sense experience, from which more general knowledge is drawn by inductive methods. For empiricists, simple perceptual experiences — for example, an observational report such as “this thing is red” — form the atomic basis from which more complex concepts can be constructed (e.g., “this thing is a tomato”).

Quine (1951, 36) characterized this view as a form of reductionism:

Every meaningful statement is held to be translatable into a statement (true or false) about immediate experience. Radical reductionism, in one form or another, well antedates the verification theory of meaning explicitly so-called. Thus Locke and Hume held that every idea must either originate directly in sense experience or else be compounded of ideas thus originating.

This view also underlies the logical atomism proposed by Russell and Wittgenstein (see Oliver 1998, 773), which similarly envisions perceptual facts as the building blocks of knowledge.

Empiricism thus aligns well with the idea of semantic primitives as derived from simple sense experiences that can be combined into increasingly complex concepts. This alignment was explored in detail by Miller and Johnson-Laird (1976), whose psychological research attempted to derive semantic primitives from perceptual features. For example, the concept of table was decomposed into primitives such as OBJECT, CONNECTED, RIGID, TOP, FLAT FACE, HORIZONTAL, and VERTICAL. However, their project encountered significant difficulties. As Aitchison (2012, 94) summarized:

In brief, they [Miller and Johnson-Laird 1976] concluded, first, that even words for straightforward objects such as tables had elements of meaning which were not perceptually based. Second, there were an enormous number of vocabulary items whose meaning could not be tied down to a perceptual foundation. They therefore came to the reluctant conclusion that “much of the lexicon is based on primitive concepts that are not perceptual” [ibid., 688]. It seems, then, that even if semantic primitives exist, they cannot be based purely on perception.

This conclusion challenges the core empiricist assumption that semantic primitives can be grounded in perceptual input. It suggests that the empiricist model of concept formation is insufficient to explain the full structure of meaning in language.

Importantly, empiricism should not be confused with empirical research. The critique of empiricism concerns foundational assumptions about observation — particularly the belief that observations are neutral and independent of the observer's cultural, historical, and scientific background. On the contrary, as many critics argue, all observations are theory-laden and embedded in paradigmatic frameworks.

4.2 Rationalism

Rationalism is an epistemological position that prioritizes logic, deduction, and the axiomatic method as the primary means of attaining knowledge. While it emphasizes reasoning, rationalism does not wholly deny the role of sense experience in forming certain types of knowledge. Rationalists generally assume that any field of knowledge can be constructed from a set of fundamental principles or axioms that are either self-evident or derived through rational intuition. From these foundations, further knowledge is generated through systematic and logical deduction. The model for rationalism is Euclidean geometry, whose axiomatic method is generalized to all fields of knowledge.

The rationalist view often treats language and cognition as representational systems, analogous to systems of formal logic. In this framework, meaning is determined by the structure and combinatorial properties of symbols. As Weiskopf (n.d., Section 5b) explains:

A representational system is compositional if the properties of complex symbols are completely determined by the properties of the simpler symbols that make them up, plus the properties of their mode of combination. So predicate logic is compositional, since the semantic value of “Fa” is determined by the semantic values of the predicate “F” and the individual constant “a”. Similarly “Fa & Fb” is semantically determined as a function of “Fa”, “Fb”, and the interpretation of conjunction. Many have argued that thought is compositional as well (Fodor 1998), which entails that the properties of complex concepts derive wholly from the properties of their constituents.
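Weiskopf's predicate-logic example can be made concrete with a short Python sketch. The interpretation (which individuals “a” and “b” denote, and which individuals fall under “F”) is an assumption made purely for illustration, as are the class and function names.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical interpretation: constants denote individuals,
# and the predicate F denotes the set of individuals it is true of.
CONSTANTS = {"a": "Alice", "b": "Bob"}
PREDICATES = {"F": {"Alice"}}


@dataclass(frozen=True)
class Atom:
    """An atomic formula such as Fa."""
    predicate: str
    constant: str


@dataclass(frozen=True)
class And:
    """A conjunction such as Fa & Fb."""
    left: "Formula"
    right: "Formula"


Formula = Union[Atom, And]


def value(formula: Formula) -> bool:
    """Compute a formula's truth value from its parts and their mode of combination."""
    if isinstance(formula, Atom):
        # The value of "Fa" depends only on the values of "F" and "a".
        return CONSTANTS[formula.constant] in PREDICATES[formula.predicate]
    if isinstance(formula, And):
        # The value of "Fa & Fb" depends only on "Fa", "Fb", and conjunction.
        return value(formula.left) and value(formula.right)
    raise TypeError(f"unsupported formula: {formula!r}")


Fa = Atom("F", "a")
Fb = Atom("F", "b")
print(value(Fa))           # True under the assumed interpretation
print(value(Fb))           # False under the assumed interpretation
print(value(And(Fa, Fb)))  # False
```

Nothing outside a formula's own constituents is consulted when it is evaluated, which is the sense of “completely determined” in the quotation.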

This rationalist conception of thought and meaning — particularly its reliance on compositionality and formal representation — has strongly influenced semantic theories, artificial intelligence, and cognitive science. We have already met the rationalist philosopher Leibniz, who in 1666 in Dissertatio de arte combinatoria claimed that a proper philosophical language [=scientific language] should analyze all concepts into their simplest elements, that is, into “the alphabet of thought”. Ducheyne (2005, 113) wrote:

A proper symbol should indicate a thing's nature, in other words, it needs to define it by means of its appearance. Leibniz's attempt presupposed that (1) ideas can be analyzed into primitive notions, that (2) ideas can be represented symbolically, and that (3) it is further possible to represent the relations between these ideas (Rossi, [1983] 2001, 177). Paolo Rossi ([1983] 2001, 159–160) remarks that the seventeenth-century attempt to construct a universal philosophical language presupposed that a complete enumeration of human knowledge could be given.

From the perspective of semiotics, Eco (1995) characterizes the very idea of such a universal language as a utopian dream, which he traces from the Bible through the Enlightenment philosophers to recent attempts to create a natural language for artificial intelligence.

Influential rationalists from the 20th and 21st centuries include Noam Chomsky, Jerry Fodor, Stephen Laurence, and Eric Margolis. Chomsky was one of the pioneers in cognitive science, a field that partly replaced behaviorism (which was dominated by empiricism). While Chomsky strongly argued for the existence of innate concepts, he based his view on empirical arguments, not on a priori methods. (However, according to Laurence & Margolis (2024, 10), Chomsky's research focused on language rather than on concepts.) Fodor's notorious view was that virtually all lexical concepts are innate, a view that, according to Laurence & Margolis (2024, 533), “really is an extreme outlier” among rationalist theories [15].

Laurence and Margolis (2024) offer the most comprehensive and up-to-date defense of the rationalist view of concepts, including the claim that some concepts are innate and universal among humans. This is especially evident in their treatment of logical concepts such as [or], [all], and [not], which they argue are human universals. As they note (2024, 326): “Our claim is that children universally interpret disjunction in natural language in just one way despite considerable evidence from adult speech for contrary interpretations”. They further argue that:

Empiricists, then, are committed to the existence of innate primitives in exactly the same sense as rationalists and for exactly the same reasons that rationalists are. (Laurence and Margolis 2024, 132)

Rationalism, therefore — like empiricism — may appear at first glance to support the idea of semantic primitives. It is closely linked to the position known as conceptual nativism, which holds that at least some concepts are inborn. Wierzbicka's theory of semantic primes aligns with this view in its claim that such primitives are objective, neutral, and culture-independent (Wierzbicka 1999, 16). She describes these primes as capturing “people's fundamental conceptual models” (1999, 10), and refers to them as “presumably innate ‘indefinables’”. Wierzbicka also claimed to have empirically validated Leibniz's hypothesis of an “alphabet of human thoughts” [16].

Laurence & Margolis (2024, 131) acknowledge that rationalists are indeed committed to conceptual primitives but argue that these should be understood as innate psychological structures involved in concept acquisition, rather than as primitives in an absolute or metaphysical sense. As they write (2024, 130), “the building blocks must themselves be built”, meaning that such primitives are relative, not absolute. While they accept some of Wierzbicka's proposed primes, they do not endorse a specific inventory of semantic primitives.

Still, as with empiricism, the rationalist view faces major challenges. Although it is uncontroversial that something in the human mind must be innate, attempts to specify a definite set of building blocks — and to show how complex concepts are systematically constructed from them — have not succeeded. As Aitchison (2012, 95–96) notes:

So far, then, we have noted, first, that a number of linguists believe in the existence of a universal store of ‘semantic primitives’, small components of meaning out of which words are built. Second, no one agrees what these components are, and no one has been able to find them.

All this is still not proof that semantic primitives are non-existent: “The problem is that it is very difficult to show conclusively that something does not exist”. (Pulman 1983, 31)

Aitchison concludes her chapter with the following assessment: “We conclude, therefore, that atomic globules do not exist in the mind” (Aitchison 2012, 97).

Rationalism shares a central weakness with empiricism: both are individualistic epistemologies that do not adequately recognize that the mind, its observations, and its conceptual processes are shaped by the socio-cultural and paradigmatic contexts in which individuals are raised. As the following sections will show, other, more context-sensitive perspectives offer alternative understandings of semantic primitives and compositionality.

4.3 Historicism

Historicism refers to a family of views that emphasize knowledge and cognition as historically shaped and contextually embedded. Under this label, we include a range of perspectives, the most prominent of which is hermeneutics. Historicism stresses that knowledge, ideas, and cultural practices cannot be fully understood apart from their temporal, social, and situational development. According to this view, methods for studying and producing knowledge must account for the social, cultural, and historical embeddedness of concepts and theories.

In linguistics, historicism is associated with relativism, the idea that different languages express different world views and influence the cognitive processes of their speakers. Language, on this view, is not a neutral medium but shapes how people conceptualize and classify reality. In contrast to empiricism and rationalism — which often assume reductionist or atomistic foundations — historicism shifts the focus from isolated facts to broader contexts (see Section 4.5 Atomism versus holism).

With respect to semantics and semantic primitives, historicism generally holds either that primitives do not exist or that they are relative to specific historical and disciplinary contexts. In any given domain of knowledge, different sets of primitives (atoms or elements) may be identified or constructed depending on the prevailing theoretical framework (as discussed earlier in relation to the term chemical element). From this perspective, concepts are not derived from raw sense impressions (as in empiricism), nor are they innately structured in the brain (as in rationalism). Thomas Kuhn's theory of scientific paradigms is a well-known example of a historicist position (see further in Section 4.5).

Linguists mostly focus on general languages (such as English) rather than special languages, such as scientific terminologies and semiotic systems. This is also true of Wierzbicka, who aims to distinguish the semantics of natural languages like English from domain-specific or encyclopedic knowledge. While scientific and everyday languages may be partially distinct systems of meaning, scientific concepts frequently migrate into general discourse and reshape it over time. As Andersen (1997, 73) notes (see Section 4.4), “the content system of the national language no longer appears as a homogeneous entity, but as a conglomerate of possibly different sublanguages”. These sublanguages develop their own semantic structures based on specific disciplinary needs, with the more dominant sublanguages exerting greater influence on general language use.

Whorf (1956, 252) formulated what is probably the most influential culturally oriented position in linguistics, which he described as “new to Western science” but based on “unimpeachable evidence”:

[E]very language is a vast pattern-system, different from others, in which are culturally ordained the forms and categories by which the personality not only communicates, but also analyzes nature, notices or neglects types of relationship and phenomena, channels his reasoning, and builds the house of his consciousness.

Whorf's position was grounded in empirical studies, such as his research on the Hopi language, which led him to conclude that basic concepts — such as [time] — differ across languages and are thus shaped by cultural frameworks [17].

Goddard (2003) has compared the views of Whorf and Wierzbicka on language, meaning, and cognition. While he found some of Wierzbicka's work to be “neo-Whorfian”, he also identified aspects of her program that run counter to Whorf's perspective. As Goddard (2003, 427) summarized:

The main affinities between the work of Whorf and that of Wierzbicka are that they both see linguistic semantics as fundamental to human cognition, that they both recognise that natural languages differ hugely in their semantic organisation, and that they have both sought to demonstrate and explore semantic differences through empirical studies of non-English languages. […]
On the other hand, NSM [Natural Semantic Metalanguage] research indicates that there is a very small ‘core’ of simple meanings and grammatical constructions which all languages share; and that this universal core can be used as a kind of semantic bridge between the vastly different conceptual worlds embodied in full natural languages. This contention could be seen as ‘counter-Whorfian’.

One of the clearest examples of a historicist approach to semantic primitives is found in the work of Blumczynski (2013; 2016). Although he acknowledges a strong intellectual debt to Wierzbicka, he distances himself from the NSM framework. Blumczynski (2016, 12) writes (indented listing added):

While I gratefully acknowledge a significant inspiration by Wierzbicka's work in conceptual semantics and cross-cultural pragmatics […]. My method differs radically from NSM in several important aspects:
First, NSM postulates a finite set of semantic primitives (even though it has been expanding over the years). In contrast, the set of primitive concepts suggested here is open and flexible […]
Secondly, NSM employs semantic primitives in explications that are meant to be reductive, i.e. that aspire to exhausting the entire semantic content of a word or expression. My theoretical position is radically different: Since words are protean in nature, their semantic contribution will depend on the context, which can never be fully predicted or explicated, which invalidates any attempts of decontextualized semantic explications.
Thirdly, NSM claims that the set of semantic primitives is language-independent, that is, translatable into any language without loss or gain; the various language versions of semantic primitives are claimed to be isomorphic. I certainly do not share this axiom of fully reversible equivalence and stress that my set of primitive concepts is inevitably bound to the English language. […]
Fourthly, NSM, despite its claims to derive its semantic primitives from natural language, nevertheless uses them in a technical way (namely, as terms) by stripping them of their natural polysemy. I am stressing that they are concepts, prone to interpersonal and contextual variation in interpretation and use.

Blumczynski thus promotes a view of semantic primitives as historically and linguistically embedded, rejecting the NSM framework's claims of universality, reducibility, and semantic invariance.

Riemer (2016, 312) adds a broader epistemological critique of decomposition approaches in semantics. While he acknowledges their “considerable heuristic utility”, he notes that these methods face “no less significant problems” as theories of underlying semantic structure. Riemer recommends an epistemological shift in semantics toward hermeneutics, thereby supporting the present article's emphasis on epistemological positions in this area.

4.4 Pragmatism

Whereas empiricism emphasizes data derived from observation, rationalism emphasizes logical intuitions, and historicism emphasizes the historical embeddedness of knowledge, pragmatism emphasizes the consequences and implications of beliefs. At its core, pragmatist epistemology examines how interests, purposes, goals, values, and political contexts shape the development and function of knowledge.

Peirce (1877, 293) formulated the canonical pragmatist maxim:

Consider what effects, which might conceivably have practical bearings, we conceive the object of our conception to have. Then, our conception of these effects is the whole of our conception of the object.

Pragmatism shares important assumptions with historicism, particularly its orientation as a social epistemology. Both emphasize that knowledge is not context-free and must be examined in light of its historical embeddedness. In this sense, pragmatism challenges abstract and ahistorical accounts of meaning, including universalist claims about semantic primitives.

Feminist epistemology — alongside critical theory and related traditions — exemplifies the pragmatic approach to knowledge. As Grasswick (2018, 1) writes:

Motivated by the political project of eliminating the oppression of women, feminist epistemologists are interested in how the norms and practices of knowledge production affect the lives of women and are implicated in systems of oppression. Feminist epistemologists seek to understand not only how our social relations of gender have shaped our knowledge practices, but also whether and how these relations should play a role in good knowing.

A significant consequence of the pragmatist position for the theory of semantic primitives is illuminated by Legg (2007, 424):

His [Charles Sanders Peirce's] philosophical pragmatism led him to see any attempt to formalize the entire meaning of a body of knowledge as impossible. In his view, an irreducible dimension of the meaning of any term (such as “hard” or “magnetic”) is constituted by the effects that an agent situated in the world would experience in relevant situations and the sum total of such effects can never be known in advance (or there would be no need for scientific inquiry). (Peirce 1940)

This does not mean that pragmatically oriented scholars cannot work with the idea of semantic primitives. However, pragmatism — like historicism — is skeptical toward any claim that primitives can be captured in a single, finite, and context-independent set. Pragmatists emphasize that meanings are contingent on perspective, interest, and function, and therefore resist attempts to formalize universal meanings divorced from human practice.

Andersen (1997, 72–73), writing under the heading “A materialist view of language”, articulated this position by contrasting biological and functional explanations for language universals [18]:

The materialistic point of view may also contribute to our general understanding of language. Why is language as it is? If there are universals of language, what are the reasons for them? Chomsky [e.g. Chomsky 1968, section 3] believes that our cognitive apparatus determines the limits on possible language, and that, conversely, a study of language universals may be used to characterize the cognitive faculties of man. True as it may be, there are other explanations of language universals, namely that anything that functions as a human language must be able to perform certain functions [see e.g. Halliday 1978]. Thus, universals of language do not necessarily express biological properties of the human brain but may simply reflect basic constraints on human societies. For example, any society involves some kind of division of labor, some kind of cooperation, some kind of commodity exchange, and some kind of reproduction. These activities are impossible without language, and conversely, any human language must be able to work in these activities.
If this line of thought is pursued to its conclusion, the content system of the national language no longer appears as a homogeneous entity, but as a conglomerate of possibly different sublanguages […].

Here, Andersen positions Chomsky's view within the rationalist tradition, which is grounded in the search for general psychological mechanisms. Zwicky (1973, 474) notes, however, that Chomsky also acknowledged a functional explanation for semantic universals, namely that all languages must fulfill core communicative functions across human cultures. Nevertheless, Chomsky largely ignored sociocultural studies as explanations, making it reasonable to associate him with the rationalist tradition.

The final sentence of Andersen's quote is especially important: while most linguists treat a national language as a coherent semantic system, the pragmatic view instead sees it as a conglomerate of different sublanguages. This perspective aligns with Bakhtin's (1981) theory of heteroglossia, which emphasizes the coexistence of multiple discourses and semantic frameworks within any living language. This insight is crucial for the present article, as it challenges the assumption of a uniform conceptual base across domains and cultures — a key presupposition behind many theories of semantic primitives.

4.5 Atomism versus holism

As noted earlier, Pagin (1997, 14) asserts a foundational assumption of compositional systems: that the parts must have a meaning prior to the complex expression itself. In other words, elements are assumed to retain their meanings across combinations and contexts. This assumption is the target of one of the central philosophical arguments against semantic primitives: meaning holism [19].

Jackman (2020, §1) defines meaning holism in contrast to atomism and molecularism [20]:

The label “meaning holism” is generally applied to views that treat the meanings of all of the words in a language as interdependent. Meaning holism is typically contrasted with atomism about meaning (where each word's meaning is independent of every other word's meaning), and molecularism about meaning (where a word's meaning is tied to the meanings of some comparatively small subset of other words in the language).

Holism and the principle of contextuality thus stand in opposition to atomistic models of meaning. Early 20th-century philosophy saw logical atomism emerge as a dominant position, particularly in the work of Russell and the early Wittgenstein. According to Proops (2022, §1):

Although it has few adherents today, logical atomism was once a leading movement of early twentieth-century analytic philosophy. Different, though related, versions of the view were developed by Bertrand Russell and Ludwig Wittgenstein. Russell's logical atomism is set forth chiefly in his 1918 work “The Philosophy of Logical Atomism” (Russell, 1956), Wittgenstein's in his Tractatus Logico-Philosophicus of 1921 (Wittgenstein, 1981). The core tenets of Wittgenstein's logical atomism may be stated as follows:
    i. Every proposition has a unique final analysis which reveals it to be a truth-function of elementary propositions (Tractatus 3.25, 4.221, 4.51, 5);
    ii. These elementary propositions assert the existence of atomic states of affairs (3.25, 4.21);
    iii. Elementary propositions are mutually independent — each one can be true or false independently of the others (4.211, 5.134);
    iv. Elementary propositions are immediate combinations of semantically simple symbols or “names” (4.221);
    v. Names refer to items wholly devoid of complexity — so-called “objects” (2.02 & 3.22);
    vi. Atomic states of affairs are combinations of these objects (2.01).

Wittgenstein later rejected this atomistic model and developed a use-oriented perspective centered on language games. As Heal (2011) explains:

Wittgenstein came to think that the idea that words name simple objects was incoherent, and instead introduced the idea of “language games”. We teach language to children by training them in practices in which words and actions are interwoven. To understand a word is to know how to use it in the course of the projects of everyday life. We find our ways of classifying things and interacting with them so natural that it may seem to us that they are necessary and that in adopting them we are recognizing the one and only possible conceptual scheme. But if we reflect, we discover that we can at least begin to describe alternatives which might be appropriate if certain very general facts about the world were different or if we had different interests.

Wittgenstein's development thus illustrates a shift from supporting atomism and compositionality toward a rejection of these views in favor of a context-sensitive, pragmatic understanding of meaning [21].

A particularly influential form of meaning holism is found in Kuhn's (1962) theory of scientific paradigms. According to Kuhn, the meaning of a term is relative to the paradigm in which it functions. He famously illustrated this through the transformation of astronomical terms during the Copernican revolution. Prior to Copernicus, the geocentric model (with Earth at the center) shaped the meanings of terms like star and planet; after the paradigm shift to the heliocentric model, the same terms were redefined:

  1. In paradigm one, Ptolemaic astronomers might learn the concepts [star] and [planet] by having the Sun, the Moon, and Mars pointed out as instances of the concept [planet] and some fixed stars as instances of the concept [star].
  2. In paradigm two, Copernicans might learn the concepts [star], [planet] and [satellite] by having Mars and Jupiter pointed out as instances of the concept [planet], the Moon as an instance of the concept [satellite] and the Sun and some fixed stars as instances of the concept [star].
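
The two learning situations can be written down as two classification tables over the same celestial bodies. The sketch below is illustrative only: the dictionary and function names are assumptions introduced here, and the tables contain nothing beyond the instances named in the two list items above.

```python
# Each paradigm assigns the bodies named in the text to a concept.
PTOLEMAIC = {
    "Sun": "planet",
    "Moon": "planet",
    "Mars": "planet",
    "fixed stars": "star",
}

COPERNICAN = {
    "Mars": "planet",
    "Jupiter": "planet",
    "Moon": "satellite",
    "Sun": "star",
    "fixed stars": "star",
}


def extension(paradigm, concept):
    """Return the bodies a paradigm points out as instances of a concept."""
    return sorted(body for body, assigned in paradigm.items() if assigned == concept)


def reclassified(old, new):
    """Return bodies named in both paradigms whose concept differs between them."""
    return {
        body: (old[body], new[body])
        for body in old
        if body in new and old[body] != new[body]
    }


changes = reclassified(PTOLEMAIC, COPERNICAN)
```

Here `changes` contains the Sun (from [planet] to [star]) and the Moon (from [planet] to [satellite]), while Mars keeps its label. Even so, `extension(PTOLEMAIC, "planet")` and `extension(COPERNICAN, "planet")` differ, which is the sense in which the same term carries a different meaning in each paradigm.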

Thus, the terms planet, star, and satellite acquired new meanings, and astronomy adopted a new classification of celestial bodies. In Kuhn's model, the ontology of science (its image of reality) is primary; when it changes, so too do the concepts. Meaning, then, is not determined by all of language but rather by a set of core terms within a given theoretical system. This provides a concrete criterion for determining meaning in scientific discourse.

A Kuhn-inspired theory of concepts offers further critique of the idea of fixed semantic primitives. As Gopnik et al. (1996) explain:

Children seem to understand the meaning of the words they hear in terms of the theories they have; they treat the words of natural language the way that scientists treat theoretical terms. Moreover, rather than reflecting some fixed set of semantic primitives, children's understanding of words changes in parallel with their changing theoretical understanding of the world. Finally, language itself seems to play an important role in theory-formation. We have also shown empirically that the words children hear influence the development of their theories. [22]

While linguists typically take a natural language and its lexicon as the point of departure, Kuhn's model begins with scientific paradigms. The theory theory of concepts generalizes this approach, showing how both adults and children build and revise conceptual structures in tandem with their evolving theoretical understandings [23].

In sum, this section has introduced contextuality and semantic holism as serious challenges to the principle of semantic primitives. Section 4.7 will present further arguments against this principle.

4.6 The analogy with the periodic system

As noted in the introduction, Wierzbicka (2006, 134) compared semantic primitives to chemical elements, writing that they are “like chemical elements discovered by Mendeleev are elementary building blocks of all chemical substances”. Other linguists have drawn similar analogies, for example, Baker (2001) and Zwicky (1973; 1980).

In Section 1, we distinguished between ontological and conceptual part–whole relations. Although both natural language and the periodic system consist of concepts and conceptual relations, they differ significantly in their structure and function. In particular, chemical names, unlike expressions in natural language, explicitly reflect the composition of chemical compounds according to contemporary chemical theory.

The periodic system, along with chemical nomenclature and conceptual frameworks, represents the cumulative results of centuries of chemical and physical research. While questions about how best to classify the elements remain, there is broad consensus about the main structure of the periodic system and its deep entwinement with chemical and physical theory [24]. This classification was not imposed by external classificationists; rather, it emerged from within the field of chemistry itself. As Bawden (2017), referring to Brock (2016), observed:

[T]o someone like myself who studied chemistry, it is interesting to reflect on the extent to which information representation and communication has gone hand-in-hand with the development of concepts and theories in chemistry, so that it is difficult to tell where the one ends and the other begins.

In ideal cases of scientific development, scientists construct both theories and the sign systems used to represent them (including nomenclatures and notations), and these mutually support further conceptual and theoretical refinement. This point has important implications for the theory of semantic primitives. As Rast (2023, 69) noted, since decompositions depend on world-level theories, they may ultimately be mistaken if those underlying theories are revised or refuted.

The lesson from the periodic system is that the very definition of chemical elements is theory-dependent and historically situated, as are any attempts to classify them. In this context, the symbols of the chemical elements can be seen as a kind of semantic primitive. However, these “primes” differ from Wierzbicka's linguistic primitives in several fundamental respects. They are not universal across languages or semiotic systems; they cannot be discovered through linguistic, psychological, or sociological methods; and they are certainly not accessible via introspection, intuition, or formal logic alone.

A crucial difference between Wierzbicka's linguistic primes and the elements of the periodic system lies in their epistemic foundations. Wierzbicka holds that linguistic semantics can be distinguished from world knowledge or encyclopedic knowledge [25]. In contrast, scientific classifications — such as those found in chemistry — are explicitly based on theoretical world knowledge and constructed representational systems designed to best express that knowledge.

4.7 Further arguments against primitives

We have already encountered arguments challenging Wierzbicka's semantic primes in Section 4.3, particularly through Blumczynski's relativist approach, and in Section 4.5 with Kuhn's theory of paradigms and conceptual holism. This section expands the critical perspective by introducing additional objections rooted in contemporary cognitive and philosophical theories [26].

From the standpoint of the theory theory of concepts, Weiskopf (n.d., §5b) raises a significant challenge: if concepts are compositional, then the broader theories in which these concepts are embedded must also be compositional — a requirement he argues is implausible. Weiskopf concludes that even if individual concepts have internal structure grounded in domain-specific theories, new compound concepts (“pet fish” is given as an example) often involve emergent properties not derivable from those theories in isolation. This undermines the notion of strong compositionality and, by extension, the presupposition of stable semantic primitives.

A different but related argument stems from Goodman's analysis of primitives in formal systems. Goodman noted (1951, 57):

It is not because a term is indefinable that it is chosen as primitive; rather, it is because a term has been chosen as primitive for a system that it is indefinable in that system. No term is absolutely indefinable. And if indefinability is taken to mean incomprehensibility, incomprehensible terms have no place at all in a system. […]. There is no absolute primitive, and no one correct selection of primitives.

According to Goodman, then, primitives are not discovered but stipulated — defined by their role within a specific theoretical or formal framework. There is no such thing as an absolute semantic primitive; what counts as primitive is relative to the structure of the system in question. Wierzbicka (1996, 11–13) acknowledged this critique but argued that Goodman's relativism only applies to artificial languages and not to natural languages, which she claimed have an objective and universal semantic core. However, even if Wierzbicka is right that natural languages may permit absolute primitives, this argument still does not apply to knowledge organization systems, which are by nature artificial. Therefore, Goodman's relativism remains highly relevant for discussions in information science and classification theory.

A final critique concerns the division between linguistic and encyclopedic knowledge, which Wierzbicka attempts to maintain in support of her semantic primes. The idea is that linguistic meaning can be captured independently of factual or encyclopedic content. However, this distinction appears increasingly untenable. As scientific knowledge advances, it frequently reshapes the meanings of even the most basic terms in ordinary language. The interpenetration of language and evolving knowledge systems undermines the assumption that linguistic meaning forms a stable and autonomous system, isolated from epistemic and conceptual development.

Together, these criticisms, which rest on the failure of compositionality for theory-embedded concepts, on conceptual relativism, and on the inseparability of linguistic and world knowledge, converge to cast doubt on the coherence, applicability, and utility of the notion of stable, universal semantic primitives, at least outside highly constrained domains.

5. Compositionality and semantic primitives in information science

5.1 Facet analysis and semantic factoring

In knowledge organization, the approach known as faceted classification has had a dominant influence (cf. Broughton 2023). The basic philosophy of semantic primitives appears to correspond closely with the theoretical foundation of → facet analysis. Both approaches assume that a basic list of elements can be defined through logical analysis and then used to synthesize any complex concept.

S. R. Ranganathan (1892–1972) was a major contributor to this approach. According to Raghavan (2019, section 8), he found inspiration for a new method of classification in observing a Meccano toy set composed of a few basic components that could be combined to construct various toys. A central idea in faceted classification is that concepts from different facets can be combined to classify a document. As Mills, Broughton and Lang (1980, xviii) explained:

In principle, a faceted classification consists of facets and arrays of relatively elementary terms; all compounds are formed by the classifier assigning classmarks to them by means of synthesis. So compound classes are not usually to be found enumerated in the schedules.
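
The principle of synthesis can be illustrated with a toy schedule. Everything in the sketch below is hypothetical: the facets, the terms, the notation, the citation order, and the function name `synthesize` are invented for illustration and do not reproduce any published scheme. The point is only that the schedule enumerates elementary terms, while compound classes are built by the classifier.

```python
# A toy schedule: each facet enumerates elementary terms with a notation.
SCHEDULE = {
    "thing": {"ships": "S", "bridges": "B"},
    "process": {"maintenance": "m", "design": "d"},
    "place": {"Denmark": "1", "India": "2"},
}

# Assumed citation order: the fixed sequence in which facets are combined.
CITATION_ORDER = ["thing", "process", "place"]


def synthesize(**terms):
    """Build a compound classmark from one elementary term per chosen facet."""
    unknown = set(terms) - set(SCHEDULE)
    if unknown:
        raise KeyError(f"facets not in schedule: {sorted(unknown)}")
    parts = [
        SCHEDULE[facet][terms[facet]]
        for facet in CITATION_ORDER
        if facet in terms
    ]
    return ":".join(parts)


classmark = synthesize(place="Denmark", thing="ships", process="maintenance")
```

The compound class “maintenance of ships in Denmark” appears nowhere in `SCHEDULE`, yet `synthesize` yields a classmark for it. A subject that needs a term missing from every facet raises an error instead, which is why the schedule has to be revised when genuinely new subjects appear.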

Hjørland (2013, 548), however, argued — contrary to Ranganathan — that the fundamental theory in knowledge organization is shared by both enumerative and faceted classifications. Although faceted systems are more flexible, they cannot represent all new subjects as combinations of a basic set of preexisting elements.

Foskett (1996, 78) described semantic factoring as follows:

During the 1950s a team at Case Western Reserve University worked on a system of analysis known as semantic factoring (Perry & Kent, 1958). The objective was to break down every concept into a set of fundamental concepts called semantic factors. Because of their fundamental nature, there would only be a limited number of these factors. A concept would be denoted by the appropriate combination of semantic factors, and the use of a complex set of roles and links enabled the indexer to write a “telegraphic abstract”, which would represent the subject of a document in a computer file.

The method is clearly a powerful one, but is open to some doubts and objections. Exactly how far does one carry such an analysis? Heat and temperature, for example, could be specified as movement of molecules. Again, it is possible to specify a concept by using only some of its attributes; or perhaps more significantly, is it ever possible to specify all the attributes for a given concept?

Soergel (1985, 256–261) presents concept combination, semantic factoring, and their relation to facet analysis. He shows how the concept [ship] can be analyzed as [vehicle]:[water transport]; these concepts may be further analyzed as [vehicle] = [means]:[transportation]:[mobile]; and [water transport] = [water]:[transport]. We arrive at:

[ship] = [means]:[transportation]:[mobile]:[water]
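The two-step factoring of [ship] can be sketched as a recursive expansion over a lookup table. The following is a minimal Python sketch under stated assumptions, not Soergel's own procedure: the names `FACTORS`, `ALIASES`, `factor`, and `notation` are hypothetical, and treating [transport] and [transportation] as the same factor is an assumption made so that the result matches the formula above.

```python
# Factoring table taken from the [ship] example; any concept that is absent
# from the table is treated as elementary for the purposes of this sketch.
FACTORS = {
    "ship": ["vehicle", "water transport"],
    "vehicle": ["means", "transportation", "mobile"],
    "water transport": ["water", "transport"],
}

# Assumption: [transport] and [transportation] denote one and the same factor.
ALIASES = {"transport": "transportation"}


def factor(concept, table=FACTORS, aliases=ALIASES):
    """Return the elementary factors of a concept, in order, without repeats."""
    concept = aliases.get(concept, concept)
    if concept not in table:
        return [concept]  # no further factoring is recorded in the table
    result = []
    for part in table[concept]:
        for elementary in factor(part, table, aliases):
            if elementary not in result:
                result.append(elementary)
    return result


def notation(concept):
    """Write the factoring in the bracket-and-colon notation used in the text."""
    return f"[{concept}] = " + ":".join(f"[{f}]" for f in factor(concept))


print(notation("ship"))
```

Which concepts count as elementary in this sketch depends entirely on what the table contains: adding an entry for [water] would change the result for [ship].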

Soergel (1985, 257) wrote: “Carrying this process to the end leads to elementary concepts that cannot be factored further”. It is not clear whether he meant that [ship] cannot be factored further than just shown (which seems questionable, as [water], e.g., may be broken down into H2O). Soergel discusses the application of this semantic factoring for indexing languages. He presents two options:

  1. to use [ship] as a subject descriptor and then show its relations to semantic factors for the benefit of the searcher;
  2. to omit [ship] from the index language and instruct the indexer instead to use one or another of the two levels of semantic factoring.

Soergel (1985, 258) found: “Facet analysis is helpful for finding all semantic factors and for solving difficult cases. Facets are aspects or points from which entities — such as food products or subjects (topics, themes) in an area such as education — can be analyzed”. (Examples are given in the book.)

Soergel presented arguments in favor of semantic factoring (Soergel 1985, 280–281), but he did not discuss the potential disadvantages of carrying the analysis too far. Therefore, the question arises: should the ideal for indexing languages such as thesauri be to proceed all the way to semantic primitives (if such primitives exist)? ISO 25964-1:2011, section 7.3, under the heading “Deciding whether or not to admit a complex concept”, addresses this issue, noting that the decision is often difficult and subjective. This reflects the understanding that there are limits to the use of semantic factoring, even if no formal criteria exist for determining these limits. This appears to be the current situation regarding semantic factoring for indexing languages. A likely way forward is to consider how concepts are used within subject domains and to avoid constructing an indexing language too far from that use.

Soergel (2017) presents a range of concepts — including compositionality, entity–relationship (ER) modeling, and description logic — thus placing faceted classification within a broader theoretical context. He wrote (2017, 43–44):

The paper illustrates this principle [compositionality] through a number of examples for (1) simple composition, such as Chinese characters, sign language, and other systems from linguistics and knowledge organization, and (2) more structured systems, including entity–relationship modeling, facets (including facets in the UDC), and frames and record structures, introduced by the faceted arrangement of the Semitic–Greek alphabet. These examples show that:
  1. The idea of facets has been around for a long time.
  2. Following the principle of compositionality through many contexts will improve our understanding of faceted classification.

It seems important, as noted by both Soergel (2017) and Kashyap (2003), to connect facet analysis with related approaches. A substantial literature exists on entity–relationship modeling, and a review article offering a critical examination of this approach in the context of knowledge organization is still needed.

Three important questions arise in relation to facet analysis:

  1. The issue raised by ISO 25964-1:2011 concerning the difficult and subjective nature of deciding when to apply semantic factoring, and how far this process should go — there is a clear need for principled guidelines on this point;
  2. The challenge posed by the principle of semantic holism (see Section 4.5), which suggests that a concept cannot be assumed to retain the same meaning across different contexts — does this limit the utility of analytico-synthetic classification?
  3. Whether a specific way of specifying a document in facets necessarily introduces a bias toward a particular perspective. For instance, Hjørland and Barros (2024) ask whether the treatment of medicine in → Bliss Bibliographic Classification, 2nd ed. (BC2) reflects a traditional (reductionist) biomedical model at the expense of, for example, a biopsychosocial model. In other words, are facet analysis and semantic factoring neutral techniques, or do they imply an (often unconscious) theoretical prioritization?

5.2 Szostak and basic concepts

Szostak (2011) advocates the construction of classification systems based on the decomposition of “complex concepts” into “basic concepts”, aiming to identify concepts that (2011, 2247) “can be understood in a similar way across disciplines and cultures”. These “basic concepts” are then proposed as the foundation for universal classification systems. Szostak argued:

Interdisciplinary communication, and thus the rate of progress in scholarly understanding, would be greatly enhanced if scholars had access to a universal classification of documents or ideas not grounded in particular disciplines or cultures. Such a classification is feasible if complex concepts can be understood as some combination of more basic concepts.

Szostak here assumes that basic concepts, in contrast to complex ones, are universally intelligible, and that classifications based on such concepts would be free from cultural, disciplinary, or theoretical biases. In this section, we shall examine how Szostak supports this assumption.

The first point to address is Szostak's notion that “basic concepts” are better understood — also in interdisciplinary contexts — than more complex ones. Here, an argument by Riemer (2006, 355) is pertinent. He contended that intelligibility is not primarily a matter of using “simpler” words in any universal sense but rather depends on the learner's prior knowledge. Similarly, Hajibayova (2013, 685) found that “basic-level categories vary across individuals and cultures because of differences in the everyday experiences and activities of individuals”.

These perspectives challenge Szostak's assumption that decomposed concepts form a conceptual level at which general interdisciplinary and cross-cultural understanding is naturally achieved. Moreover, Szostak does not offer substantial argumentation or empirical support for his position; rather, his view appears to be based primarily on intuitive reasoning.

If the goal is to create a classification system that is broadly intelligible, one option would be to ground it in the study of → folk classifications (see Hjørland & Gnoli 2021) or in the theory of “basic-level categories” developed by Rosch et al. (1976), which suggests a pragmatic understanding of categories: that ordinary people tend to organize concepts in terms of their practical interaction with objects, and that basic-level categories are those most cognitively salient and widely shared. By contrast, both more general (superordinate) and more specific (subordinate) categories are often less well developed and less consistently understood across communities.

Rather than adopting such views, Szostak does the opposite. For example, “fish” and “bird” are well-established and broadly understood categories in both folk classification and traditional biological taxonomy. However, in 2021, Szostak revised flora and fauna in his classification (Szostak 2013) to reflect a cladistic view. The most commonly recognized animal classes, including “fish” and “bird”, were not retained as classes but subsumed under the category FH “Hypothesized Species”. How does this correspond to cladistics? Contemporary cladistics does not recognize “fish” as a taxonomic group (whereas “bird” is a recognized taxon) [27], which is not to say that fish are hypothesized species, but as formulated by Weitzman and Parenti (2025): “The term fish is applied to a variety of vertebrates of several evolutionary lines. It describes a life-form rather than a taxonomic group”. (But still, it seems inappropriate for a system aiming to offer broad intelligibility to avoid the concept “fish”, which also raises practical difficulties for indexing, retrieving and interdisciplinary communication of subjects such as fish farming, fish economics, fish as pets, or fishing industries.) In the case of “bird”, Szostak's classification is demonstrably incorrect, as “bird” is a valid clade also according to cladistics.

While I share Szostak's commitment to grounding bibliographic classifications in scientific knowledge — rather than folk classifications or psychological studies — this does not entail that narrower, highly specific terms are always better understood than broader, established ones. Moreover, Szostak's interpretation of cladistics and its application to bibliographic classification needs theoretical justification [28].

A second point concerns Szostak's (2011, 2248, 2249, 2254) arguments against requiring excessive conceptual precision in classification design. But how much precision is enough? A too-relaxed attitude toward conceptual definition may result in low standards for both scholarly discourse and professional practice in information science. Notably, Szostak (2024, 314), in a different context, argues that classification does, in fact, require precision.

A third issue is illustrated by Szostak's example of decomposing → Dewey Decimal Classification (DDC) entries into basic concepts. He writes (Szostak 2011, 2247):

Can we take the subject entries in existing universal but discipline-based classifications, and break these into a set of more basic concepts that can be applied across disciplinary classes? The author performs this sort of analysis for Dewey classes 300 to 339.9. This analysis will serve to identify the sort of ‘basic concepts’ that would lie at the heart of a truly universal classification.

Szostak's first example is DDC 303.376 [Censorship], which he analyzes as [preventing] [publication], two components that he claims may be treated as basic concepts. First, however, this interpretation seems too narrow. Censorship, broadly understood, involves the control, suppression, or restriction of information, ideas, expression, or access to knowledge, often by an authority seeking to maintain power, enforce norms, or prevent dissent. For example, the prohibition against slaves learning to read or write in the United States during the 18th and 19th centuries is a clear instance of censorship, yet it would be difficult to accommodate this case within Szostak's analysis. Second, [publication] may not qualify as a basic concept in either of Szostak's senses: it is unlikely to be interpreted uniformly across user communities, and it may be decomposed into constituent ideas such as [document] and [made public], both of which are themselves ambiguous.

Our final point concerns Szostak's (2011, 2247) argument that he finds support for his position in each of the five prominent theories of concepts he identifies:

[These theories] provide some support for the idea of breaking complex into basic concepts that can be understood across disciplines or cultures, but each has detractors. None of these criticisms represents a substantive obstacle to breaking complex concepts into basic concepts within information science.

He continues (2011, 2248):

The purpose is to identify what sort of classification projects — with particular emphasis on the possibility of a truly universal classification — might be justified with respect to all major concept theories in philosophy. I will argue that that is the appropriate stance for the information science community to take toward the philosophical literature.

Szostak's emphasis on the relevance of concept theories to the question of compositionality is a valuable and commendable contribution. However, his argumentative strategy appears to be backward: rather than coordinating his theory of classification with one particular theory of concepts, he instead tries to show that none of the five theories contradicts his position. A more coherent approach would be to develop and justify his classification model by explicitly grounding it in a specific theory of concepts and coordinating its assumptions accordingly.

A full discussion of the five concept theories lies beyond the scope of this article, although such a discussion would be highly worthwhile. It may be claimed here, however, that only the so-called classical theory of concepts unequivocally supports compositionality in the strong, universal sense Szostak assumes. According to the classical theory, concepts are defined by sets of necessary and sufficient features (e.g., [bachelor] = [unmarried] + [man]). In contrast, the remaining four theories — including the “theory theory of concepts” discussed in Section 4.5, which the present author endorses — pose significant challenges to the notion of strict compositionality. Szostak does not demonstrate how the theory theory of concepts supports his proposal for a universal classification system; on the contrary, he offers a critical appraisal of this theory and of Hjørland's (2009) application of it.

Crucially, Szostak's argument overlooks the historical and theoretical development of scientific concepts and how this development is often reflected in everyday language. As scientific knowledge progresses, so do its conceptual systems. For example, we now understand the [sun] to be a [star], in contrast to ancient classifications, and recent taxonomic revisions — such as the reclassification of birds (see Fjeldså 2013) — have already been adopted by amateur ornithologists and are gradually permeating educational curricula and common usage.

Szostak's approach implies that concepts can be defined and classified from a position of neutrality — what Nagel (1986) famously called “the view from nowhere”. Yet this idea has been widely challenged; Metzinger (2003, 582), among others, argues that such a view is epistemologically untenable.

In conclusion, it is difficult to accept Szostak's assertion that each of the five concept theories provides meaningful support for the idea of decomposing complex concepts into universally understood basic ones. Different epistemological approaches — rationalist, empirical, historicist, or pragmatic — may all involve analyzing concepts into constituent elements, but they diverge in how they understand this process. Rationalist epistemologies tend to treat such analysis as value-neutral, performed from a “God's eye” perspective, whereas pragmatic epistemologies emphasize that all conceptual analysis is situated and interest-laden, serving particular purposes and potentially privileging certain perspectives over others.

5.3 Thesauri and Spärck Jones

Karen Spärck Jones was a pioneer in the fields of → information retrieval and natural language processing. She observed (Spärck Jones 1992, 1609) that the theory of semantic primitives was influential in early thesaurus construction:

Thesaurus classes in IR [information retrieval] define generic indexing descriptors, and the thesaurus has been envisaged as having a similar function in NLP [natural language processing], that of providing semantic primitives. This connection was explicitly recognized in early work on the use of a thesaurus for NLP, and specifically for machine translation (MT). Early workers on MT, attempting general text translation, say of research papers, were immediately faced with the problem of word sense identification and, wherever direct links between input sentences and their output equivalents were not, or could not be, provided, with the problem of output word (sense) selection as well. A thesaurus was seen as providing a set of general-purpose, domain-independent semantic primitives allowing the specification, or at least an indication, of the essential semantic concepts expressed by, or relating to, a text, and hence allowing sense determination.

In this passage, semantic primitives are equated with generic terms. While such equivalence may hold in specific instances where a generic term denotes a concept basic enough to function as a primitive, this is not universally the case. Moreover, by not exploring the underlying nature of concepts, Spärck Jones missed a crucial theoretical dimension.

In the introductory remark to the published version of her dissertation — originally written in 1964 but first published in 1986 — Spärck Jones (1986, 1) claimed that it offered “a characterization of, and a basis for deriving, semantic primitives, that is, the general concepts under which natural language words and messages are categorized”. However, Wilks and Tait (2006, 5) questioned this retrospective claim:

Perhaps the most striking feature of her retrospective, as compared to the original SSC [the dissertation, published in Spärck Jones 1986], is the emphasis on semantic primitives and the explicit opening claim [cited above]. This view of SSC is not one that a reader of the original thesis would necessarily come to from its text, although it makes perfect sense if we take semantic primitives to mean the topic markers that are the 1000 or so Roget heads, such as 324 SOFTNESS. However, and as noted in the previous section there are some problems with reconciling this notion of predefined primitives and truly emergent ones.

Wilks and Tait (2006, 6) point to a tension in Spärck Jones's approach. On the one hand, she “shies away from putting forward anything which cannot be directly observed in text”, relying on statistical methods to identify emergent semantic primitives (i.e., she adheres to an empiricist philosophy). On the other hand, she makes use of the overarching a priori structure of Roget's Thesaurus, a framework derived from intuition and consistent with a rationalist philosophy. Wilks and Tait concluded that “Sparck Jones was accepting a great deal of decoration beyond the words themselves”, which caused her supposed primitives to mirror Roget's thematic headings.

We shall not delve further into Spärck Jones's (1986) treatment of semantic primitives, but note that while her dissertation was rightly praised as a pioneering work, its treatment of semantic primitives remains ambiguous. Despite the retrospective emphasis on primitives, the dissertation's explicit focus, according to its title and substantive content, is synonymy. While it offers a rigorous account of theories of synonymy, it does not engage deeply with theories of semantic primitives, making the later claim about its focus rather surprising.

A later article by Spärck Jones (2007) explicitly addressed semantic primitives, but her views remain unclear. The conclusion of that article (2007, 251) — “while you can't make a language processor without semantic primitives somewhere, you choose your semantic primitive cloth, and tailor it, to suit your processor climate” — might be interpreted as aligning with the pragmatic perspective on semantic primitives discussed in Section 4.4. However, Halpin (2013, 190) noted that Spärck Jones eventually abandoned the notion of semantic primitives, even in its most open-ended formulation.

Spärck Jones is not a central representative of the main tradition in thesaurus construction, which is more accurately represented by works such as Aitchison et al. (2000), Dextre Clarke (2019), and ISO 25964-1 (2011). This mainstream tradition has been discussed elsewhere in the present article, for example, in Section 5.1. Nevertheless, Spärck Jones provided the most explicit connection between semantic primitives and thesaurus theory and has therefore been examined in this section.

5.4 Other examples and issues

Issues related to the reduction of semantic complexity are widespread in the literature, although the discussions are often scattered and lack coherence. This section presents a few additional examples that illustrate the diversity of approaches involved.

In the biomedical domain, more than 730,000 concepts have been identified in the Unified Medical Language System® (UMLS). These have been grouped into 134 semantic types, which McCray et al. (2001) further reduced into 15 high-level groupings, as a way to manage conceptual complexity in large-scale systems:

  1. Activities and Behaviors
  2. Anatomy
  3. Chemicals and Drugs
  4. Concepts and Ideas
  5. Devices
  6. Disorders
  7. Genes and Molecular Sequences
  8. Geographic Areas
  9. Living Beings
  10. Objects
  11. Occupations
  12. Organizations
  13. Phenomena
  14. Physiology
  15. Procedures

While this grouping helps manage complexity, it arguably lacks a consistent logical structure. For example, “disorders” and “living beings” are themselves “phenomena”, yet they are listed as separate categories. A more logically coherent classification of medical knowledge can be found in BC2, as discussed by Mills (2004, 552–553).

A related effort to define foundational concepts is found in the development of upper → ontologies (also called top-level ontologies) [29]. These aim to provide highly general, domain-independent concepts that serve as the basis for more specific domain ontologies. Thus, they function — if the task is feasible at all — as sets of “semantic universals”, akin to semantic primitives. As discussed in Section 5.2, such attempts presume a “view from nowhere”, which has been critiqued in philosophy.

Kausch (2024, 387) made the following claim in support of such upper ontologies:

The use of a top-level ontology composed of semantic primitives could not only aid in interoperability, but also contribute to the long-term preservation of knowledge organization systems, and by extension, the resilience of the scientific project, where resilience is defined as the ability of future generations to understand the original intention and spirit of a knowledge organization system.

While Kausch explicitly refers to Wierzbicka's research, the fundamental difference between linguistic approaches (like hers) and the epistemological and ontological strategies used in scientific ontology design is not addressed. It should be emphasized that Wierzbicka's 65 semantic primitives are qualitatively different from the concepts employed in well-known upper ontologies such as Sowa's ontology, Cyc, or the Suggested Upper Merged Ontology (SUMO) (cf. Gómez-Pérez et al. 2004, 71–78).

These examples illustrate attempts — beyond semantic primitives and semantic factoring — to manage semantic complexity in knowledge organization systems.

It should also be noted that compositionality has received substantial attention in applied fields such as computational linguistics, information retrieval, and medical informatics. Studies by Amigo et al. (2022), Wang et al. (2019), McKnight et al. (1999), Elkin et al. (1998), and Elkin and Brown (2023) investigate compositionality from a computational perspective, often testing specific algorithms or formal representations. These works rarely incorporate the theoretical insight — central to the present article — that concepts are (or are individuated by) theories. In particular, the Kuhnian view that meaning is paradigm-dependent (e.g., the reclassification of “star” and “planet”) seems to be entirely absent in such research, despite its potential significance for understanding semantic dynamics in applied systems.

6. Conclusion

This article took its point of departure in knowledge organization, classification research, and information science, but has drawn upon interdisciplinary research to achieve a deeper level of understanding. It has shown that questions concerning semantic primitives and componential analysis are deeply connected to a fundamental problem in information science. Consequently, the notions of semantic primes and compositionality are important for the theoretical foundations of this field.

The nature of compositionality has been shown to relate to broader philosophical frameworks, here broadly classified as rationalist, empiricist, historicist, and pragmatic. The latter two, which represent versions of social epistemology, view human cognition and perception as shaped by socio-cultural contexts. From these perspectives, both semantic primitives and compositionality are seen as relative to cultural, social, and theoretical contexts — as well as to scientific paradigms. This article has argued in favor of such a social epistemological view, which assumes a relativistic, holistic, and anti-essentialist conception of meaning. This aligns with Kuhn's view that ontological categories and relations are theory-dependent, as illustrated by an example involving the terms star and planet, and supported by contemporary examples from biological systematics.

The facet-analytic approach to knowledge organization is closely tied to the concepts of semantic primitives and compositional theory. This approach has been hailed as one of the most important innovations in the field (e.g. Furner & Hjørland 2023). However, it is important to consider a development from a rationalist to a pragmatic interpretation. Ranganathan postulated five facets, about which his student Gopinath (1976, 60) wrote:

[facet analysis] has led to the conjecture that there may be an “absolute syntax” among the constituents of the subjects within a basic subject, perhaps parallel to the sequence of thought process itself, irrespective of the language in which the ideas may be expressed, irrespective of the cultural background or other differences in the environments in which the specialists, as creators as well as the users of the subject, may be placed.

This clearly expresses a rationalist position. However, Vickery (1960, 23) suggested 13 facets, and Broughton expanded this number [30]. Such expansions support a more pragmatic understanding: that facets are neither eternal nor universal but are discovered or constructed as new domains of knowledge evolve and require classification. Aristotle, for instance, could not depend on his pre-established set of categories when he made his biological classification.

Semantic building blocks should not be assumed, as done by Wierzbicka, to represent elements common to all languages — or even to a single language such as English. From the perspective of knowledge organization, they are better understood as domain-specific elements (e.g., chemical or kinship concepts), embedded within models and theories of particular fields.

The most important conclusion is that the philosophy of science — which is largely absent in contemporary research on compositionality and semantic primitives — needs to play a more central role. As Rast (2023, 69) emphasized: because decompositions rest on world-level theories in one way or another, they may be wrong if the underlying theory turns out to be false. All texts are produced by someone for particular purposes and on the basis of certain assumptions — what might be called “theories” or “paradigmatic views”. These provide the foundation for the meaning of the words and signs used. To identify those meanings, and the semantic relations between them, it is necessary to identify the relevant contexts — implying the need to map the (often hidden) paradigms within the information being represented.

Acknowledgments

Thanks to the anonymous peer reviewers and to the ARIST editorial team. Their feedback has contributed to improving this article.

Endnotes

1. The most influential illustration of the semiotic triangle is found in Ogden and Richards (1923, 11), but it has a long history that includes Aristotle, Gottlob Frege, and Charles Sanders Peirce (cf. Sowa 2000, 58–59).

2. In this article, square brackets are used to indicate concepts (as opposed to words or signs).

3. The philosophical view that we see things mediated by our conceptions was argued by Kant (1997); in the 20th century, it was taken up in a modified form by, among others, Einstein and Kuhn. Einstein (1949, 674) described his theoretical attitude as “distinct from that of Kant only by the fact that we do not conceive of the ‘categories’ as unalterable (conditioned by the nature of the understanding) but as (in the logical sense) free conventions. They appear to be a priori only insofar as thinking without the positing of categories and of concepts in general would be as impossible as is breathing in a vacuum”. Kuhn (2000, 264) said: “I go around explaining my own position saying I am a Kantian with moveable categories”. This view may be called “the mediated view” and is also subscribed to by the present author.

4. Mitchell and Panzer (2013, 192) wrote: “GeoNames (like many other ontologies) does not contain descriptions of or identifiers for concepts of places; it contains descriptions of and identifiers for the places themselves”. This quote represents the naïve-realist view that a KOS can represent reality rather than a conception of reality. For a further discussion of this point, see Hjørland (2021).

5. It is almost universally accepted that KOSs organize concepts. For a detractor, see Smith (2004); for a rebuttal, see Hjørland (2021).

6. An example that demonstrates that not all concepts are lexicalized in all natural languages is [the succession of one day and one night], which in Danish is lexicalized as et døgn but is not lexicalized in English.

7. The terms happiness and joy are often understood as synonymous words, meaning that they refer to the same concept. (If somebody considers [happiness] and [joy] to be different concepts, the terms are not synonyms but possibly near-synonyms.) “Synonymous concepts” is a contradiction in terms; concepts instead have degrees of semantic similarity, relative to perspectives and theories. Biological taxonomists have claimed, based on DNA analysis, that of the three concepts [salmon], [lungfish], and [cow], [lungfish] and [cow] are more closely related, sharing a more recent common ancestor than either does with [salmon]. Although this claim is debated (see Williams and Ebach 2020, 401–403), it shows that we should not take our received understanding of semantic similarities as given.

8. Whether modern science represents an objective picture of reality is debated among, for example, scientific realists and social constructivists. On the view of water as H2O see Chang (2012).

9. The term category is related to the concept of → facet in knowledge organization (see Moss 1964) and to the idea of a fundamental set of categories which, according to Mazzocchi and Gnoli (2010, 137), “should permit the analysis and classification of any phenomenon or object”.

10. Richard (1998, 477), for example, concluded: “Whether natural languages have compositional semantics, and whether the meanings of their sentences are determined simply by the meaning of their parts and syntax, is still not settled”; Pagin and Westerståhl (2010, 279) likewise concluded, “All in all, it seems that the issue of compositionality in natural language will remain live, important and controversial for a long time to come”.

11. The cited quote is from the HTML version of Wierzbicka (2006); the PDF version of this article differs (!) in that it gives no definition and instead begins by referring to Leibniz's view.

12. Matthewson (2003, 263) found, however: “Indeed, there may not be a single proposed semantic primitive which fails to strike formal semanticists as extremely complex. Thus, it is difficult for us to accept the NSM claim that primitives such as I, YOU, SOMEONE, THIS, THINK, and WANT are ‘simple words’ and that they are ‘intuitively comprehensible and self-explanatory’ (Durst, 2003, p. 2)”.

13. Semantic markers are a specific type of semantic feature used in componential analysis. Katz and Fodor (1963) introduced semantic markers in linguistics, together with combination rules called “projection rules”. They are theoretical meaning components used to represent word meaning. According to Sowa (1984, 15), they support only either-or dichotomies, not concepts with slightly different shades of meaning. An updated presentation and criticism of this concept is given by Pitt (2003).

14. An anonymous reviewer wrote: “[Wilkins'] artificial language as documented in Essay towards a real character and a philosophical language [1668] is very close in nature to a faceted classification, and some of his examples are very similar to those in the figure from Henning [2020, 68]”. For a critical discussion of Wilkins see Borges (1964) and Mai (2016).

15. Pulman (2005, 165–166) wrote: “However, the actual sense in which all concepts are innate turns out to be rather weak. Faced with the absurd possibility that concepts like ‘doorknob’ or ‘spade’ might be innate, Fodor claims instead that what is innate is the ability to form concepts of that kind on the basis of rapid and variable exposure to instances”.

16. In relation to complex concepts, however, Wierzbicka is not a rationalist. Although she understands these as built on the primes, this is done in culturally specific ways. In this respect, her theory aligns with historicist linguists such as Benjamin Lee Whorf, and she investigates complex concepts in their cultural embeddedness.

17. Whorf did believe, however, that there is a common basis for language in human psychology, especially the psychology of perception. Contemporary philosophers of science, by contrast, tend to embrace the theory-ladenness of observation (see Boyd and Bogen 2021) and thereby undermine this basis for a universalist view of language.

18. Andersen developed his “materialist” view following the student revolt in Denmark in the 1970s, with a strong focus on sociological differences between groups of people. He conducted empirical work on the language of mechanics at a car workshop. However, I find that the terms functional and, especially, pragmatic cover the same view but are broader, which is why I prefer the latter.

19. According to Riemer (2016, 226), Gestalt psychology also contrasts with the theoretical assumptions underlying research on semantic primitives.

20. Linguists often reject meaning holism. For example, Vallée (2006) concluded: “Wherever one looks, one finds only implausible consequences of SH [semantic holism]. As a consequence, it never served as the basis for a detailed research program in semantics. However, it did bring the analytic philosopher one step closer to the Hermeneutic tradition. Hermeneutics is thoroughly holistic”.

21. However, Butler (1995) argued that not all varieties of sensitivity to context pose difficulties for the compositionality principle.

22. Gopnik (2003) wrote further on “theory theory”: “Since semantics, by definition, relates linguistic expressions to our understanding of the world, and I have argued that our everyday understanding of the world is theory-like, this is not surprising [that ‘rather than reflecting some fixed set of semantic primitives, children's understanding of words changes in parallel with their changing theoretical understanding of the world’]”.

23. Theory is an ambiguous word. Influenced by “theory theory”, Hjørland (2015, 116–117) argued that a theory is a statement or conception that is considered open to question and connected with background assumptions.

24. It is interesting to see that even today researchers participate in a “periodic debate” and tend to divide into those who are more rationalist-oriented and those who are more pragmatic-oriented (cf. Stewart 2011, in a comment on Bradley 2011).

25. In Semantics: Primes and Universals, Wierzbicka (1996, 14) states: “The meaning of a word is not the same as what people know or believe about the thing or situation that the word refers to. Semantics is not encyclopedic knowledge”. This distinction is foundational to her methodology, which attempts to isolate core semantic content using a small set of universal semantic primes, avoiding the conflation of semantics with contingent cultural or factual knowledge. Wierzbicka's analogy to the periodic table strengthens rather than weakens the case that semantic analysis must rest on an extensive base of world knowledge, cultural context, and theory-laden interpretation. Thus, her attempt to isolate “pure linguistic meaning” from encyclopedic knowledge seems problematic — even internally inconsistent.

26. Spärck Jones and Kay (1973, 124) wrote that some linguists dismissed the notion of semantic primitives, but did not develop or discuss these views: “There are some linguists who, with Lyons (1968), attempt to avoid the problems associated with a notation based on semantic primitives by dismissing the notion altogether”. However, Lyons did discuss “basic sense-components” in both his 1968 and 1977 books (Lyons 1968, 470–481: “Componential analysis and universal semantics”; Lyons 1977, 317–335: “Componential analysis”). Goldberg (2016, 430) described unresolved issues, including whether the principle of compositionality is based on empirical evidence, as some researchers claim (Fodor 2001; Pelletier 1994), or whether it is a methodological assumption, as others claim (Barker and Jacobson 2007; Dowty 2007; Groenendijk and Stokhof 2005; Janssen 1983; Partee 1995). Goldberg further argued that there are empirical questions that need to be addressed for the principle of compositionality to be upheld.

27. “Fish” is a paraphyletic group of animals, meaning that it does not include all descendants of its most recent common ancestor. “Birds”, however, is a valid clade: birds form a monophyletic group, since all modern birds descend from a common ancestor. It therefore seems even stranger that birds are also classified by Szostak as FH “Hypothesized Species”.

28. The two approaches to cladistics, “process cladistics” and “pattern cladistics”, agree in rejecting paraphyletic groups like “fish” and in accepting birds as a valid clade. Process cladistics, however, is more willing to make historical and evolutionary statements about these taxa, for example, that birds are dinosaurs, while pattern cladistics understands birds as part of Dinosauria.

29. For a comparison of selected upper ontologies, see Mascardi et al. (2007).

30. For example, Mills et al. (1993), when classifying mathematics for BC2, added a new facet: relation.

References

Aitchison, J. 2012. Words in the mind: An introduction to the mental lexicon, 4th ed. John Wiley & Sons.

Aitchison, J., Gilchrist, A., and Bawden, D. 2000. Thesaurus construction and use: A practical manual, 4th ed. Aslib.

Amigo, E., Ariza-Casabona, A., Fresno, V., and Martí, M. A. 2022. Information theory–based compositional distributional semantics. Computational Linguistics, 48(4), 907–948. https://doi.org/10.1162/coli_a_00454.

Andersen, P. B. 1997. A theory of computer semiotics: Semiotic approaches to construction and assessment of computer systems. Cambridge University Press.

Aristotle. 1984. In J. Barnes (Ed.), The complete works of Aristotle. The revised Oxford translation, vol. 1. Princeton University Press.

Baker, M. C. 2001. The atoms of language: The mind's hidden rules of grammar. Oxford University Press.

Bakhtin, M. 1981. Discourse in the novel. In M. Holquist and C. Emerson (Eds.), The dialogic imagination: Four essays by M.M. Bakhtin, 239–422. University of Texas Press. (Original work published 1934.)

Barker, C., and Jacobson, P. I. (Eds.). 2007. Direct compositionality. Oxford University Press.

Bawden, D. 2017. Chemistry and its (information) history. In The occasional informationist: Irregular thoughts on the information sciences (blog). https://theoccasionalinformationist.com/2017/09/03/.

Blumczynski, P. 2013. Turning the tide: A critique of natural semantic metalanguage from a translation studies perspective. Translation Studies, 6(3), 261–276. https://doi.org/10.1080/14781700.2013.781484.

Blumczynski, P. 2016. Ubiquitous translation. Routledge.

Borges, J. L. 1964. Other inquisitions (1937–1952). Translated by Ruth L. C. Simms. University of Texas Press. (Original work published 1942.) https://www.crockford.com/wilkins.html.

Boyd, N. M., and Bogen, J. 2021. Theory and observation in science. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. Stanford University. https://plato.stanford.edu/archives/win2021/entries/science-theory-observation/.

Bradley, D. 2011. At last, a definitive periodic table? Chemistry views: The Magazine of Chemistry Europe. https://doi.org/10.1002/chemv.201000107.

Brock, W. H. 2016. The history of chemistry: A very short introduction. Oxford University Press.

Broughton, V. 2023. Facet analysis: The evolution of an idea. Cataloging & Classification Quarterly, 61(5–6), 411–438. https://doi.org/10.1080/01639374.2023.2196291.

Butler, K. 1995. Content, context and compositionality. Mind & Language, 10(1–2), 3–24. https://doi.org/10.1111/j.1468-0017.1995.tb00003.x.

Chang, H. 2012. Is water H2O? Evidence, realism and pluralism. Springer Nature. https://doi.org/10.1007/978-94-007-3932-1.

Chomsky, N. 1968. Language and mind. Harcourt, Brace & World.

Costello, F. J., and Keane, M. T. 2000. Efficient creativity: Constraint-guided conceptual combination. Cognitive Science, 24(2), 299–349.

Dahlberg, I. 2009. Brief communication: Concepts and terms — ISKO's major challenge. Knowledge Organization, 36(2–3), 169–177. https://doi.org/10.5771/0943-7444-2009-2-3-169.

Dextre Clarke, S. G. 2019. The information retrieval thesaurus. Knowledge Organization, 46, 439–459. Also in B. Hjørland and C. Gnoli (Eds.), ISKO encyclopedia of knowledge organization. https://www.isko.org/cyclo/thesaurus.

Dowty, D. 2007. Compositionality as an empirical problem. In C. Barker and P. Jacobson (Eds.), Direct compositionality (pp. 23–101). Oxford University Press.

Ducheyne, S. 2005. Paul Otlet's theory of knowledge and linguistic objectivism. Knowledge Organization, 32(3), 110–116.

Durst, U. 2003. The natural semantic metalanguage approach to linguistic meaning. Theoretical Linguistics, 29(3), 157–200. https://doi.org/10.1515/thli.29.3.157.

Eco, U. 1995. The search for the perfect language. Blackwell Publishing. (Original work published 1993.)

Einstein, A. 1949. Remarks concerning the essays brought together in this co-operative volume. In P. A. Schilpp (Ed.), Albert Einstein: Philosopher-scientist (pp. 665–688). The Library of Living Philosophers.

Elkin, P. L., and Brown, S. H. 2023. Compositionality. In P. L. Elkin (Ed.), Terminology, ontology and their implementations, 2nd ed. Springer. https://doi.org/10.1007/978-3-031-11039-9_7.

Elkin, P. L., Tuttle, M. S., Keck, K., Campbell, K. C., Atkin, G. E., and Chute, C. G. 1998. The role of compositionality in standardized problem list generation. Studies in health technology and informatics, 52(1), 660–664.

Fjeldså, J. 2013. Avian classification in flux. In J. Hoyo, A. Elliott, J. Sargatal, and D. A. Christie (Eds.), Handbook of the birds of the world, vol. 17, pp. 77–146. Lynx Edicions.

Fodor, J. A. 1998. Concepts: Where cognitive science went wrong. Oxford University Press.

Fodor, J. A. 2001. Language, thought and compositionality. Mind & Language, 16(1), 1–15. https://doi.org/10.1111/1468-0017.00153.

Foskett, A. C. 1996. The subject approach to information, 4th ed. Library Association.

Furner, J., and Hjørland, B. 2023. The coverage of information science and knowledge organization in the Library of Congress Subject Headings (LCSH). Journal of Documentation, 79(5), 1265–1284. https://doi.org/10.1108/JD-11-2022-0256.

Geeraerts, D. 2005. Componential analysis. In K. Brown (Ed.), Encyclopedia of language and linguistics, 2nd ed., pp. 709–712. Elsevier. https://doi.org/10.1016/B0-08-044854-2/01029-4.

Goddard, C. 1998. Bad arguments against semantic primitives. Theoretical Linguistics, 24(2–3), 129–156. https://doi.org/10.1515/thli.1998.24.2-3.129.

Goddard, C. 2003. Whorf meets Wierzbicka: Variation and universals in language and thinking. Language Sciences, 25(4), 393–432. https://doi.org/10.1016/S0388-0001(03)00002-0.

Goddard, C., and Wierzbicka, A. 2014. Words and meanings: Lexical semantics across domains, languages and cultures. Oxford University Press.

Goldberg, A. E. 2016. Compositionality. In N. Riemer (Ed.), Routledge handbook of semantics, 419–433. Routledge.

Gómez-Pérez, A., Fernández-López, M., and Corcho, O. 2004. Ontological engineering: With examples from the areas of knowledge management, e-commerce and the semantic web. Springer.

Goodman, N. 1951. The structure of appearance. Harvard University Press.

Gopinath, M. A. 1976. Colon classification. In A. Maltby (Ed.), Classification in the 1970s: A second look, 51–80. Clive Bingley.

Gopnik, A. 2003. The theory theory as an alternative to the innateness hypothesis. In L. M. Antony and N. Hornstein (Eds.), Chomsky and his critics (pp. 238–254). Basil Blackwell. https://doi.org/10.1002/9780470690024.ch10.

Gopnik, A., Choi, S., and Baumberger, T. 1996. Cross-linguistic differences in early semantic and cognitive development. Cognitive Development, 11(2), 197–227. https://doi.org/10.1016/S0885-2014(96)90003-9.

Grasswick, H. 2018. Feminist social epistemology. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (fall 2018 edition). Stanford University. https://plato.stanford.edu/archives/fall2018/entries/feminist-social-epistemology/.

Groenendijk, J., and Stokhof, M. 2005. Why compositionality? In G. N. Carlson and F. J. Pelletier (Eds.), Reference and quantification: The Partee effect (pp. 83–106). CSLI Publications.

Hajibayova, L. 2013. Basic-level categories: A review. Journal of Information Science, 39(5), 676–687. https://doi.org/10.1177/0165551513481443.

Halliday, M. A. K. 1978. Language as social semiotic: The social interpretation of language and meaning. Edward Arnold.

Halpin, H. 2013. Social semantics. In H. Halpin (Ed.), Social semantics: The search for meaning on the web (pp. 187–203). Springer. https://doi.org/10.1007/978-1-4614-1885-6_7.

Heal, J. 2011. Wittgenstein, Ludwig Josef Johann (1889–1951). In E. Craig (Ed.), Routledge encyclopedia of philosophy. Taylor and Francis. https://doi.org/10.4324/9780415249126-DD072-2.

Henning, J. 2020. Langmaker: Celebrating Conlangs [Language maker: Celebrating constructed languages]. Yonagu.

Hjørland, B. 2009. Concept theory. Journal of the American Society for Information Science and Technology, 60(8), 1519–1536. https://doi.org/10.1002/asi.21082.

Hjørland, B. 2013. Facet analysis: The logical approach to knowledge organization. Information Processing and Management, 49(2), 545–557. https://doi.org/10.1016/j.ipm.2012.10.001.

Hjørland, B. 2015. Theories are knowledge organizing systems (KOS). Knowledge Organization, 42(2), 113–128. https://doi.org/10.5771/0943-7444-2015-2-113.

Hjørland, B. 2021. Information retrieval and knowledge organization: A perspective from the philosophy of science. Information (Basel), 12(3), 135. https://doi.org/10.3390/info12030135.

Hjørland, B., and Barros, T. H. B. 2024. Domain analysis versus facet analysis. Knowledge Organization, 51(1), 19–25. https://doi.org/10.5771/0943-7444-2024-1-19.

Hjørland, B., and Gnoli, C. 2021. Folk classification. In B. Hjørland and C. Gnoli (Eds.), ISKO encyclopedia of knowledge organization. International Society for Knowledge Organization. https://www.isko.org/cyclo/folk.

ISO 25964-1. 2011. Information and documentation: Thesauri and interoperability with other vocabularies, Part 1: Thesauri for information retrieval. International Organization for Standardization.

Jackman, H. 2020. Meaning holism. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (winter 2020 edition). Stanford University. https://plato.stanford.edu/archives/win2020/entries/meaning-holism/.

Janssen, T. 1983. Foundations and applications of Montague grammar. PhD thesis, Department of Mathematics, University of Amsterdam.

Kant, I. 1997. The critique of pure reason. P. Guyer and A. W. Wood (Trans.). Cambridge University Press. (Original work published 1781 and 1787.) https://cpb-us-w2.wpmucdn.com/...

Kashyap, M. M. 2003. Likeness between Ranganathan's postulations based approach to knowledge classification and entity relationship data modelling approach. Knowledge Organization, 30(1), 1–19.

Katz, J. J., and Fodor, J. A. 1963. The structure of semantic theory. Language, 39(2), 170–210. https://doi.org/10.2307/411200.

Kausch, J. 2024. Nuclear semiotics and knowledge organization: Five design heuristics for semantic primitives. Advances in Knowledge Organization, 20, 385–391.

Kuhn, T. S. 1962. The structure of scientific revolutions. University of Chicago Press.

Kuhn, T. S. 2000. In J. Conant and J. Haugeland (Eds.), The road since Structure: Philosophical essays, 1970–1993, with an autobiographical interview. University of Chicago Press.

Laurence, S., and Margolis, E. 2024. The building blocks of thought: A rationalist account of the origins of concepts. Oxford University Press. https://doi.org/10.1093/9780191925375.001.0001.

Legg, C. 2007. Ontologies on the semantic web. Annual Review of Information Science and Technology, 41, 407–452. https://doi.org/10.1002/aris.2007.1440410116.

Leibniz, G. W. 1903. Elementa characteristica universalis. In L. Couturat (Ed.), Opuscules et fragments inédits de Leibniz, extraits des manuscrits de la Bibliothèque royale de Hanovre (pp. 42–92). Ancienne Librairie Germer Baillière. (Original work published 1679.)

Levisen, C., and Waters, S. 2017. How words do things with people. In C. Levisen and S. Waters (Eds.), Cultural keywords in discourse, 8–35. John Benjamins.

Llull, R. 1305–1308. Ars generalis ultima. In Raimundi Lulli Opera Latina XIV/Corpus Christianorum Continuatio Mediaevalis 75, 1986, 4–527. Available in English from Mnemonic Arts of Blessed Raymond Lull. https://lullianarts.narpan.net/ArsGeneralisUltima.pdf.

Lyons, J. 1968. Introduction to theoretical linguistics. Cambridge University Press.

Lyons, J. 1977. Semantics (Vol. 1). Cambridge University Press.

Mai, J. E. 2016. Marginalization and exclusion: Unraveling systemic bias in classification. Knowledge Organization, 43(5), 324–330. https://doi.org/10.5771/0943-7444-2016-5-324.

Mascardi, V., Cordì, V., and Rosso, P. 2007. A comparison of upper ontologies. In WOA 2007: From objects to agents. Agents and industry: Technological applications of software agents, 24–25 September 2007, 55–64. Università di Genova. Seneca.

Matthewson, L. 2003. Is the meta-language really natural? Theoretical Linguistics, 29(3), 263–274. https://doi.org/10.1515/thli.29.3.263.

Mazzocchi, F., and Gnoli, C. 2010. S.R. Ranganathan's PMEST categories: Analyzing their philosophical background and cognitive function. Information Studies [Ranganathan Centre for Information Studies], 16(3), 133–147.

McCray, A. T., Burgun, A., and Bodenreider, O. 2001. Aggregating UMLS semantic types for reducing conceptual complexity. Studies in Health Technology and Informatics, 84(1), 216–220. https://doi.org/10.3233/978-1-60750-928-8-216.

McKnight, L. K., Elkin, P. L., Ogren, P. V., and Chute, C. G. 1999. Barriers to the clinical implementation of compositionality. Proceedings of the AMIA Symposium, 2, 320–324. https://www.ncbi.nlm.nih.gov/...

Metzinger, T. 2003. Being no one: The self-model theory of subjectivity. MIT Press.

Miller, G. A., and Johnson-Laird, P. N. 1976. Language and perception. Cambridge University Press.

Mills, J. 2004. Faceted classification and logical division in information retrieval. Library Trends, 52(3), 541–570.

Mills, J., Broughton, V., and Lang, V. 1980. Bliss Bibliographic Classification, 2nd ed. Class H: Anthropology, human biology, health sciences. Butterworth.

Mills, J., Broughton, V., and Lievesley, N. 1993. Bliss Bibliographic Classification, 2nd ed. Class AM/AX: Mathematics, statistics, and probability. Bowker Saur.

Mitchell, J. S., and Panzer, M. 2013. Dewey linked data: Making connections with old friends and new acquaintances. JLIS.it, 4(1), 177–199. https://doi.org/10.4403/jlis.it-5467.

Montague, R. 1970. Universal grammar. Theoria, 36(3), 373–398. https://doi.org/10.1111/j.1755-2567.1970.tb00434.x.

Moss, W. R. 1964. Categories and relations: Origins of two classification theories. American Documentation, 15(4), 296–301. https://doi.org/10.1002/asi.5090150408.

Murphy, M. L. 2003. Semantic relations and the lexicon: Antonymy, synonymy, and other paradigms. Cambridge University Press.

Nagel, T. 1986. The view from nowhere. Oxford University Press.

Nefdt, R. M., and Potts, C. 2024. Compositionality. In M. C. Frank and A. Majid (Eds.), Open encyclopedia of cognitive science. MIT Press. https://doi.org/10.21428/e2759450.494deacd.

Ogden, C. K., and Richards, I. A. 1923. The meaning of meaning: A study of the influence of language upon thought and of the science of symbolism. Kegan Paul.

Oliver, A. 1998. Logical atomism. In E. Craig (Ed.), Routledge encyclopedia of philosophy, Vol. 5, pp. 772–775. Routledge. https://doi.org/10.4324/9780415249126-N030-1.

Pagin, P. 1997. Is compositionality compatible with holism? Mind & Language, 12(1), 11–33. https://doi.org/10.1111/j.1468-0017.1997.tb00060.x.

Pagin, P., and Westerståhl, D. 2010. Compositionality II: Arguments and problems. Philosophy Compass, 5(3), 265–282. https://doi.org/10.1111/j.1747-9991.2009.00229.x.

Partee, B. H. 1995. Lexical semantics and compositionality. In L. R. Gleitman and M. Liberman (Eds.), Language: An invitation to cognitive science, 311–360. MIT Press.

Peirce, C. S. 1877. Illustration of the logic of science: Second paper. How to make our ideas clear. Popular Science Monthly January 1878, pp. 286–302. http://archive.org/details/popularsciencemo12newy.

Peirce, C. S. 1940. In J. Buchler (Ed.), The philosophy of Peirce: Selected writings. Routledge and Kegan Paul.

Pelletier, F. J. 1994. The principle of semantic compositionality. Topoi, 13(1), 11–24. https://doi.org/10.1007/BF00763644.

Perry, J. W., and Kent, A. 1958. Tools for machine literature searching: Semantic code dictionary, equipment, procedures. Interscience.

Pitt, D. 2003. On Markerese. The Philosophical Forum, 34(3–4), 267–300. https://doi.org/10.1111/1467-9191.00139.

Proops, I. 2022. Wittgenstein's logical atomism. In E. N. Zalta and U. Nodelman (Eds.), The Stanford encyclopedia of philosophy (Fall 2022 edition). Stanford University. https://plato.stanford.edu/archives/fall2022/entries/wittgenstein-atomism/.

Pulman, S. G. 1983. Word meaning and belief. Croom Helm.

Pulman, S. G. 2005. Lexical decomposition: For and against. In J. I. Tait (Ed.), Charting a new course: Natural language processing and information retrieval: Essays in honour of Karen Spärck Jones, 155–173. Springer. https://doi.org/10.1007/1-4020-3467-9_10.

Quine, W. V. O. 1951. Two dogmas of empiricism. The Philosophical Review, 60(1), 20–43.

Raghavan, K. S. 2019. Shiyali Ramamrita Ranganathan. In B. Hjørland and C. Gnoli (Eds.), ISKO encyclopedia of knowledge organization. https://www.isko.org/cyclo/ranganathan.

Rast, E. 2023. Metalinguistic disputes, semantic decomposition, and externalism. Linguistics and Philosophy, 46(1), 65–85. https://doi.org/10.1007/s10988-022-09357-y.

Richard, M. 1998. Compositionality. In E. Craig (Ed.), Routledge encyclopedia of philosophy, Vol. 2, 476–477. Routledge. https://doi.org/10.4324/9780415249126-U007-1.

Riemer, N. 2006. Reductive paraphrase and meaning: A critique of Wierzbickian semantics. Linguistics and Philosophy, 29(3), 347–379. https://doi.org/10.1007/s10988-006-0001-4.

Riemer, N. 2016. Lexical decomposition. In N. Riemer (Ed.), The Routledge handbook of semantics, 213–232. Routledge.

Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., and Boyes-Braem, P. 1976. Basic objects in natural categories. Cognitive Psychology, 8(3), 382–439. https://doi.org/10.1016/0010-0285(76)90013-X.

Rossi, P. 2001. Logic and the art of memory: The quest for a universal language, 2nd ed. Athlone Press. (Original work published 1983.)

Russell, B. 1956. The philosophy of logical atomism. In R. C. Marsh (Ed.), Russell: Logic and knowledge: Essays 1901–1950, 175–281. George Allen & Unwin. (Original work published 1918.)

Scerri, E. 2020. The periodic table: Its story and significance, 2nd ed. Oxford University Press.

Smith, B. 2004. Beyond concepts: Ontology as reality representation. In A. C. Varzi and L. Vieu (Eds.), Proceedings of the FOIS 2004. International conference on formal ontology and information systems, Turin, Italy, 4–6 November 2004, 73–84. IOS Press. https://philpapers.org/archive/SMIBCO.pdf.

Soergel, D. 1985. Organizing information: Principles of database and retrieval systems. Academic Press.

Soergel, D. 1998. WordNet [review of the book WordNet by Fellbaum, C., Ed.]. D-Lib Magazine, 4(10), 1–7. https://www.dlib.org/dlib/october98/10bookreview.html.

Soergel, D. 2017. The principle of compositionality and entity-relationship modelling: Faceted classification in a broader context. In A. Slavic and C. Gnoli (Eds.), Faceted classification today: Theory, technology and end users. Proceedings of the International UDC Seminar 14–15 September 2017 London, United Kingdom, 43–60. Ergon.

Sowa, J. F. 1984. Conceptual structures: Information processing in mind and machine. Addison-Wesley.

Sowa, J. F. 2000. Ontology, metadata, and semiotics. In Lecture notes in computer science, Vol. 1867, 55–81. Springer.

Sowa, J. F. 2003. Ontology. In Glossary. http://www.jfsowa.com/ontology/gloss.htm

Spärck Jones, K. 1986. Synonymy and semantic classification. Edinburgh University Press.

Spärck Jones, K. 1992. Thesaurus. In S. C. Shapiro (Ed.), Encyclopedia of artificial intelligence, Vol. 2, 2nd ed., pp. 1605–1613. Wiley.

Spärck Jones, K. 2007. Semantic primitives: The tip of the iceberg. In K. Ahmad, C. Brewster, and M. Stevenson (Eds.), Words and intelligence II: Essays in honour of Yorick Wilks, 235–253. Springer.

Spärck Jones, K., and Kay, M. 1973. Linguistics and information science. Academic Press.

Stewart, P. 2011. Comment to Bradley 2011. Chemistry views: The Magazine of Chemistry Europe. https://doi.org/10.1002/chemv.201000107.

Szostak, R. 2011. Complex concepts into basic concepts. Journal of the American Society for Information Science and Technology, 62(11), 2247–2265. https://doi.org/10.1002/asi.21635.

Szostak, R. 2013. Basic concepts classification (web site, updated since 2013). https://sites.google.com/ualberta.ca/rick-szostak/Basic-Concepts-Classification.

Szostak, R. 2024. Knowledge organization. In F. Darbellay (Ed.), Elgar encyclopedia of interdisciplinarity and transdisciplinarity, 314–317. Edward Elgar.

Vallée, R. 2006. Holism, semantic and epistemic. In K. Brown (Ed.), Encyclopedia of language and linguistics, 2nd ed., 374–377. Elsevier. https://doi.org/10.1016/B0-08-044854-2/01175-5.

Varzi, A. 2019. Mereology. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (spring 2019 edition). Stanford University. https://plato.stanford.edu/archives/spr2019/entries/mereology/.

Vickery, B. C. 1960. Faceted classification: A guide to the construction and use of special schemes. Aslib.

Wang, D., Li, Q., Lima, L. C., Simonsen, J. G., and Lioma, C. 2019. Contextual compositionality detection with external knowledge bases and word embeddings. In L. Liu and R. White (Eds.), The Web Conference 2019 — Companion of the World Wide Web Conference, WWW 2019 (pp. 317–323). Association for Computing Machinery. https://doi.org/10.1145/3308560.3316584.

Weiskopf, D. A. n.d. The theory-theory of concepts. In J. Fieser and B. Dowden (Eds.), The Internet encyclopedia of philosophy. https://iep.utm.edu/theory-theory-of-concepts/.

Weitzman, S. H., and Parenti, L. R. 2025. Fish. In Encyclopædia Britannica. Encyclopædia Britannica. https://www.britannica.com/animal/fish.

Whorf, B. L. 1956. In J. B. Carroll (Ed.), Language, thought, and reality: Selected writings of Benjamin Lee Whorf. MIT Press. (Original work published 1941.) https://ia601605.us.archive.org/...

Wierzbicka, A. 1972. Semantic primitives. Athenäum.

Wierzbicka, A. 1996. Semantics: Primes and universals. Oxford University Press.

Wierzbicka, A. 1999. Emotions across languages and cultures: Diversity and universals. Cambridge University Press. https://doi.org/10.1017/CBO9780511521256.

Wierzbicka, A. 2006. Semantic primitives. In K. Brown (Ed.), Encyclopedia of language and linguistics, 2nd ed., pp. 134–137. Elsevier. https://doi.org/10.1016/B0-08-044854-2/01073-7.

Wierzbicka, A. 2021. ‘Semantic primitives’, fifty years later. Russian Journal of Linguistics, 25(2), 317–342. https://doi.org/10.22363/2687-0088-2021-25-2-317-342.

Wilkins, J. 1668. An essay towards a real character, and a philosophical language. Printed for Sa: Gellibrand, and for John Martyn printer to the Royal Society. https://archive.org/...

Wilks, Y. 2007. Good and bad arguments about semantic primitives. In K. Ahmad, C. Brewster, and M. Stevenson (Eds.), Words and intelligence I: Selected papers by Yorick Wilks, 103–139. Springer. (Original work published 1977.) https://doi.org/10.1007/1-4020-5285-5_6.

Wilks, Y., and Tait, J. 2006. A retrospective view of synonymy and semantic classification. In J. I. Tait (Ed.), Charting a new course: Natural language processing and information retrieval: Essays in honour of Karen Spärck Jones, 1–11. Springer. https://doi.org/10.1007/1-4020-3467-9_1.

Williams, D. M., and Ebach, M. C. 2020. Cladistics: A guide to biological classification, 3rd ed. Cambridge University Press.

Wittgenstein, L. 1981. Tractatus Logico-Philosophicus. C. K. Ogden (Trans.). Routledge and Kegan Paul. (Original work published 1921.)

Zwicky, A. M. 1973. Linguistics as chemistry: The substance theory of semantic primes. In S. R. Anderson and P. Kiparsky (Eds.), A festschrift for Morris Halle (pp. 467–485). Holt.

Zwicky, A. M. 1980. The analogy of linguistics with chemistry. In M. Key (Ed.), The relationship of verbal and nonverbal communication, 319–326. De Gruyter Mouton. https://doi.org/10.1515/9783110813098.319.



Version 1.0 published 2025-08-27

Article category: Theoretical concepts

This article (version 1.0) was first published open access in JASIST. How to cite it:
Hjørland, Birger. In press. “Semantic primitives and compositionality: An Annual Review of Information Science and Technology (ARIST) paper”. Journal of the Association for Information Science and Technology. https://doi.org/10.1002/asi.70011. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, .
To quote text edited in a later version, you should save it in the Wayback Machine and cite the saved version.

CC BY-NC-ND.