edited by Birger Hjørland and Claudio Gnoli


Subject (of documents)

by Birger Hjørland

Table of contents:
1. Introduction;
2. Theoretical views: 2.1 Charles Ammi Cutter (1837-1903), 2.2 S. R. Ranganathan (1892-1972), 2.3 Patrick Wilson (1927-2003), 2.4 "Content oriented" versus "request oriented" views, 2.5 Issues of subjectivity and objectivity, 2.6 The subject knowledge view, 2.7 Other views and definitions;
3. Related concepts: 3.1 Words versus concepts versus subjects, 3.2 Aboutness, 3.3 Topic, 3.4 Isness, 3.5 Ofness, 3.6 Theme, 3.7 Content;
4. Conclusion;

This article presents and discusses the concept "subject" or subject matter (of documents) as it has been examined in library and information science (LIS) for more than 100 years. Different theoretical positions are outlined, and it is found that the most important distinction is between document-oriented and request-oriented views. The document-oriented view conceives of subject as something inherent in documents, whereas the request-oriented view (or the policy-based view) understands subject as an attribution made to documents in order to facilitate certain uses of them. Related concepts such as concepts, aboutness, topic, isness, ofness and content are also briefly presented. The conclusion is that the most fruitful way of defining the "subject" of a document is as the document's informative or epistemological potentials, that is, the document's potential to inform users and advance the development of knowledge.

1. Introduction

In Library and Information Science (LIS), documents (such as books, articles and pictures) are classified, indexed and searched by subject (as well as by other attributes such as author, genre and language). This makes "subject" a fundamental concept in this field (see Golub 2014 for a recent text). This use of "subject" in LIS is part of the broader use of the concept that refers to all kinds of utterances ("what is he talking about"). LIS specialists assign subject labels to documents to make them findable/retrievable. Such professionally assigned subject labels compete with other subject access points such as words from titles, abstracts and full-text, bibliographic references, user tagging etc. Therefore, research in subject representation is not limited to professionally assigned subject labels but includes the study of all possible subject access points.

There are many ways to produce subject representations, and there is often no consensus about which subject should be attributed to a given document. As stated by Lancaster (2003, 21), it is important to distinguish between the conceptual analysis and the translation stages in indexing and classification. In conceptual analysis, subjects are attributed to documents, and in the translation stage subject labels are assigned to documents. There tends to be great variation among indexers and classifiers in subject analysis and in the choice of subject labels, as measured, for example, by so-called inter-indexer consistency studies (see Saracevic 2008). To optimize subject representation and searching, we need a deeper understanding of the following questions:

  • What is the criterion that a given subject should be attributed to a given document?
  • What is to be understood by the statement 'document A belongs to subject category X'?
  • What is a subject?

This issue has been debated in the field for more than 100 years, often by using other terms such as aboutness or topic (cf., below).

One may think that the concept "subject" in this connection is self-evident and in no need of theoretical exploration. The claim of this article is, however, that it is a basic concept with different meanings and that a fruitful understanding of it is of fundamental importance for LIS. What Tredinnick wrote about the concepts information, knowledge, data, document and text is equally true of subject:

The difficulty in reaching agreement about their meaning in part derives from the kinds of research questions that are addressed, but also in part from fundamental differences in the conceptual outlooks into which they are slotted. Implicit in this is an ongoing cycle of appropriation and reappropriation of the meaning of these contested terms for particular ends (Tredinnick 2006, 19).

Therefore, we have to consider the different theoretical outlooks in order to decide which outlook and thereby understanding of "subject" is most fruitful for knowledge organization.


2. Theoretical views

This section provides a chronological presentation of definitions or understandings of "subject" in LIS. The presentation seeks to cover all significant views, without any guarantee of completeness (it has been difficult to produce because researchers have mostly ignored earlier definitions).

2.1 Charles Ammi Cutter (1837-1903)

For Cutter, the stability of subjects depends on a social process in which their meaning is stabilized in a name or a designation. Cutter's view is presented here as interpreted by Miksa (1983a) and Frohmann (1994).

Francis Miksa wrote:

"[A subject] referred [...] to those intellections [...] that had received a name that itself represented a distinct consensus in usage" and: the "systematic structure of established subjects" is "resident in the public realm" (Miksa 1983a, 69).
"[s]ubjects are by their very nature locations in a classificatory structure of publicly accumulated knowledge" (Miksa 1983a, 61).

Bernd Frohmann added:

The stability of the public realm in turn relies upon natural and objective mental structures which, with proper education, govern a natural progression from particular to general concepts. Since for Cutter, mind, society, and SKO [Systems of Knowledge Organization] stand one behind the other, each supporting each, all manifesting the same structure, his discursive construction of subjects invites connections with discourses of mind, education, and society. The Dewey Decimal Classification (DDC), by contrast, severs those connections. Melvil Dewey emphasized more than once that his system maps no structure beyond its own; there is neither a "transcendental deduction" of its categories nor any reference to Cutter's objective structure of social consensus. It is content-free: Dewey disdained any philosophical excogitation of the meaning of his class symbols, leaving the job of finding verbal equivalents to others. His innovation and the essence of the system lay in the notation. The DDC is a purely semiotic system of expanding nests of ten digits, lacking any referent beyond itself. In it, a subject is wholly constituted in terms of its position in the system. The essential characteristic of a subject is a class symbol which refers only to other symbols. Its verbal equivalent is accidental, a merely pragmatic characteristic [...] The conflict of interpretations over "subjects" became explicit in the battles between "bibliography" (an approach to subjects having much in common with Cutter's) and Dewey's "close classification". William Fletcher spoke for the scholarly bibliographer [...] Fletcher's "subjects", like Cutter's, referred to the categories of a fantasized, stable social order, whereas Dewey's subjects were elements of a semiological system of standardized, techno-bureaucratic administrative software for the library in its corporate, rather than high culture, incarnation (Frohmann 1994, 112-113).

Cutter's view on "subject" is probably wiser than most of the later understandings that dominated the 20th century, including the understanding reflected in the ISO standard quoted below (section 2.7). The early statements quoted by Frohmann indicate that subjects are somehow shaped in social processes. They also indicate that there was a conflict between Cutter and Dewey in their understanding of "subjects", which is reflected in their respective classification systems. That said, Cutter's view does not seem particularly detailed or clear: we get only a vague idea of the social nature of subjects.


2.2 S. R. Ranganathan (1892-1972)

Ranganathan provided the following definitions:

Subject: assumed term (Ranganathan 1963, 27)
Subject: Thought-content of a document (Ranganathan 1964, 109).
Subject - an organized body of ideas, whose extension and intension are likely to fall coherently within the field of interests and comfortably within the intellectual competence and the field of inevitable specialization of a normal person. (Ranganathan 1967, 82).

A related definition is given by one of Ranganathan's students:

A subject is an organized and systematized body of ideas. It may consist of one idea or a combination of several... (Gopinath 1976, 51).

The first of Ranganathan's definitions (1963) seems to consider "subject" self-evident and in no need of theoretical exploration. His second definition (1964) corresponds to the "content oriented view" (see section 2.4 below). Ranganathan's third definition (1967) and that of Gopinath (1976) are taken here as the point of departure; they are treated as equivalent. Ranganathan's 1967 definition of "subject" is clearly influenced by his Colon Classification system (CC), which is an analytic-synthetic scheme based on combining single elements from facets into subject designations. The definition needs to be understood in the context of his other concepts, such as "isolate" and "basic concept" in CC. Ranganathan's concepts are highly idiosyncratic; an example is the claim that gold cannot be a subject (it is instead termed "an isolate"). The concept "discipline" is replaced by "basic subject", defined (1967, 83) as a subject that does not have isolate ideas as a component (example: Mathematics).

We can see a problem with Ranganathan's concepts if we consider a simple sentence. The Dewey system (DDC) states: "No other feature of the DDC is more basic than this, that it scatters subjects by discipline" (Dewey 1979, xxxi). This makes sense, and "subject" as well as "discipline" are used here in a way that is not specific to DDC but can be applied generally. This is not so with Ranganathan's concepts, which can only be understood in relation to CC.

If we compare the 1967 definition with the definitions presented in the rest of this article, we can see that it provides no guidance in itself for subject analysis. It does not address, for example, the problems raised by Wilson (section 2.3) or the issues discussed in sections 2.4 and 2.5. That a subject is organized seems just to refer to how subjects are analyzed in CC, where subjects are organized and combined from single elements from facets. This is the reason why the organized or combined nature of subjects is emphasized. It seems unacceptable that Ranganathan defines the concept of "subject" in a way that favors his own system. A scientific concept like "subject" should make it possible to compare different ways of establishing access to information. Whether we speak of, for example, enumerative systems, pre-coordinative systems, faceted systems or post-coordinative systems, whether subjects are organized or not should not be a part of the definition of "subject" (but once "subject" has been defined, its degree of organization may be examined in specific cases). Ranganathan's definition also contains the pragmatic demand that a subject should be determined in a way that suits a normal person's competency or specialization. Again, we see a strange mixing of a general understanding of a concept with demands made by a specific system. What the concept "subject" represents is one thing; how to provide subject descriptions that fulfill demands such as precision and recall is quite another. Because Ranganathan's (1967) definition is too closely tied to his CC, using it makes comparative studies of different kinds of systems difficult.

This aspect of the theory was criticized by Metcalfe (1973, 318). Metcalfe's skepticism regarding Ranganathan's theory is formulated in harsh words (op. cit., 317): "This pseudo-science imposed itself on British disciples from about 1950 on...". Although this voice is contrary to Ranganathan's generally high prestige in LIS and partly dismissed by Drake (1960), it seems important that Ranganathan's theoretical assumptions be carefully examined and not taken for granted (see also Hjørland 2013). Ranganathan's concept of subject has been further presented by Dutta (2015), Dutta and Dutta (2013) and Dutta, Majumder and Sen (2013). These articles are, however, mostly summaries of other authors' papers.

Based on these arguments we may conclude that Ranganathan's definition of the concept of "subject" is not suited for general scientific use. Like the definition of "subject" given by the ISO-standard for topic maps (see section 2.7), Ranganathan's definition may be useful within his own closed system. The purpose of a scientific and scholarly field is, however, to examine the relative fruitfulness of systems such as topic maps and CC. For such purposes, another understanding of the concept of "subject" seems to be necessary.


2.3 Patrick Wilson (1927-2003)

In his book, Wilson (1968) examined — in particular by thought experiments — the suitability of different methods of determining the subject of a document. The methods were:

  • To identify the author's purpose for writing the document
  • To weigh the relative dominance and subordination of different elements in the picture, which the reading imposes on the reader
  • To group or count the document's use of concepts and references
  • To construe a set of rules for selecting elements deemed necessary (as opposed to unnecessary) for the understanding of the work as a whole.

Wilson demonstrated that each of these methods is insufficient to determine the subject of a document. He was led to conclude:

The notion of the subject of a writing is indeterminate... (Wilson 1968, 89)

or about what users may expect to find using a particular position in a library classification system:

For nothing definite can be expected of the things found at any given position. (Wilson 1968, 92)

In connection with the last quote, Wilson adds an interesting footnote in which he writes:

For example, I know more or less clearly what hostility is, that is, the word 'hostility' has a fairly sharp meaning for me, but far from a perfectly sharp and precise meaning. Now if I were to supply myself with an exact defined concept, got by explication of my imprecise notion, I might find that I could never use the new concept in describing any actual piece of writing; the concept might be too sharp ever to find application. There would be instances of hostility (in the new sense) that I could recognize, but no instances of writings on hostility that I could recognize, for no one would have written on hostility (as I now would understand it). If people write on what are for them ill-defined phenomena, a correct description of their subjects must reflect the ill-definedness (Wilson 1968, 92).

Hjørland (1992) discussed Wilson's concept of subject and found it problematic to give up a precise understanding of such a basic concept in LIS. Wilson's arguments led him to an agnostic position, which Hjørland found unacceptable and unnecessary. Concerning authors' use of ambiguous terms, the role of subject analysis is to determine which documents it would be fruitful for users to identify, regardless of whether the documents use one term or another and regardless of the meaning in which a given term is used. Information specialists provide an interpretation and a description (for example, based on a controlled vocabulary) that classifies the literature in a way that users may learn to use in order to identify the terms or classes that with high probability refer to the needed documents. Relevant concepts and distinctions in classification systems and controlled vocabularies may be fruitful even if applied to documents with ambiguous terminology. The problem is not whether there is a precise match between the documents' and the information specialist's concepts, but whether the subject representation makes distinctions that are relevant for the users.


2.4 "Content oriented" versus "request oriented" views

In this section, two kinds of indexing principles are presented that illuminate a core theoretical issue related to the concept "subject". Traditional indexing has been content- or document-oriented. One example is the 20% rule used by the Library of Congress, among others. According to this rule, at least 20% of any given document shall be about the subject indicated by the subject label:

Assign headings only for topics that comprise at least 20% of the work.
In the case of a work containing separate parts, for example, a narrative text plus an extensive bibliography or a section of maps (cf. H 1865), or a book with accompanying materials, such as a computer disc, assign separate headings for the individual parts or materials if they constitute at least 20% of the item and are judged to be significant (Library of Congress 2008, sheet H 180).
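The mechanical character of such a rule can be illustrated with a minimal Python sketch. The topic labels and page counts are hypothetical, and measuring a topic's share by page count is an assumption made only for illustration; the quoted instruction does not say how the proportion is to be measured.

```python
def headings_by_share(topic_pages, threshold=0.20):
    """Return the topics whose share of the work meets the threshold.

    topic_pages maps a topic label to the number of pages devoted to it.
    Page count is an illustrative proxy, not part of the quoted rule.
    """
    total = sum(topic_pages.values())
    if total == 0:
        return []
    return sorted(
        topic for topic, pages in topic_pages.items()
        if pages / total >= threshold
    )


# Hypothetical 300-page book.
book = {
    "Military campaigns": 180,
    "Women in French society": 45,
    "Law reform": 75,
}
print(headings_by_share(book))
```

In this invented example the topic covering 15% of the work receives no heading, regardless of who is expected to use the catalog.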

The alternative principle is request-oriented indexing, in which the anticipated requests from users influence how documents are indexed. The indexer asks himself or herself:

Under which descriptors should this entity [document] be found?
think of all the possible queries and decide for which ones the entity at hand is relevant (Soergel 1985, 230; see also Soergel 1974, Chapter F1, 356).

Request-oriented indexing may be indexing that is targeted towards a particular audience or user group. A library for feminist studies may, for example, index documents differently from a historical library: if a feminist library buys a book about, say, Napoleon, it must be assumed that it does so because the book is in some way relevant from a feminist perspective (i.e., it says something about women at the time of Napoleon). For the user of the catalog, it is important that this purpose and perspective be expressed in the subject representation of the book, in order to enable users to find books about women at that time. In other words, the purpose or perspective of a specific collection should ideally be reflected in its classification or indexing. (This is, of course, contrary to the economic incentive to standardize subject representation and reuse the work done by other libraries; there are thus conflicting interests at play.) It is probably best to understand request-oriented indexing as policy-based indexing: indexing done according to some ideals and reflecting the purpose of the library or database for which it is done. In this way, it is not necessarily a kind of indexing based on empirical user requests, but only on those anticipated requests that the library or database considers it within its purpose to answer. (Indexing should be regarded as user-based only if empirical data about use or users are applied.)

It is interesting to consider that mainstream automatic indexing is not purely document-oriented, because the frequency of terms in a given collection is taken into account. Terms that are used in many documents have a low discriminatory power and are therefore assigned a lower weight. In this way automatic indexing is less document-oriented and more contextual compared with, for example, the use of the 20% rule. Still, of course, this principle of automatic indexing does not fulfill the demands of "request oriented indexing"/"policy oriented indexing".
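The collection-dependence of term weights can be shown with a small Python sketch. It uses inverse document frequency, one widely used weighting that this article does not itself name, so its choice here is an assumption; the three-document collection is invented.

```python
import math


def inverse_document_frequency(collection):
    """Map each term to log(N / df).

    N is the number of documents in the collection and df is the number
    of documents that contain the term at least once.
    """
    n_docs = len(collection)
    doc_freq = {}
    for doc in collection:
        # Count each term once per document, however often it occurs.
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical collection of three titles.
docs = [
    "studies in crop variation",
    "studies in experimental design",
    "studies in sampling",
]
idf = inverse_document_frequency(docs)
print(idf["studies"], idf["crop"])
```

A term that occurs in every document ("studies") receives a weight of zero, while a term confined to one document ("crop") receives the highest weight. The weight of a term thus depends on the collection, not on the single document alone.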

The content-oriented view considers "subject" to be something inherent in documents. The request-oriented view, on the other hand, considers "subject" to be something attributed to documents by somebody in order to facilitate certain uses of the documents. The problem of whether the subject is something inherent in the document (and determined "objectively") or is context-dependent (and determined "subjectively") is related to the philosophical subject-object problem, to which we now turn.


2.5 Issues of subjectivity and objectivity

There exists an ideal, often implicit, that there is one right way to provide a subject representation for a given document. The previously mentioned inter-indexer consistency studies are an example of an attempt to measure the subjectivity in subject representation, based on the assumption that the majority of indexers are closer to the truth than the outliers. However, as pointed out by Cooper (1969), indexing may be consistently wrong. Indexers may be guided, for example, by the same bad principles or assumptions, and in that case their indexing will be consistently bad. Therefore, studies of inter-indexer consistency do not necessarily provide a basis for improving indexing quality. The implication is that we can only determine the quality of subject representation from the standpoint of a theory of what good indexing and classification should be like. If we take the request-oriented view as the point of departure, subject representation should only be consistent in relation to the same anticipated requests or the same policy framework. In other words, subject representation should be based on inter-subjectively stated goals, values and policies; it should not, as an ideal, be objective. In a way, the subjectivity of indexers should be an ideal (though not any form of subjectivity, of course, only a subjectivity developed to consider a specific perspective).
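Inter-indexer consistency is often quantified as the overlap between the sets of terms that two indexers assign to the same document. The Python sketch below uses one common formulation (terms agreed upon divided by all distinct terms used by either indexer, often attributed to Hooper). Both the choice of measure and the sample terms are assumptions for illustration and are not taken from the studies cited here.

```python
def consistency(terms_a, terms_b):
    """Share of agreed terms among all distinct terms used by either indexer."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        # Two empty term sets are treated as fully consistent.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical term assignments for one and the same document.
indexer_1 = {"crop variation", "agriculture"}
indexer_2 = {"crop variation", "agriculture"}
specialist = {"experimental design", "sampling", "crop variation"}

print(consistency(indexer_1, indexer_2))
print(consistency(indexer_1, specialist))
```

The first pair scores 1.0 and the second 0.25. The score alone cannot say which assignment serves users better: two indexers may agree perfectly and still both omit the terms that matter, which is the point made by Cooper.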

"Subject" may also mean the knowing subject (person) who retrieves documents that answer questions for him. In general, these two meanings are separated in information science, although, as we saw above, different persons may provide different subject representations, even as an ideal. In a recent monograph (Day 2014) these two meanings of "subject" are combined. The main point in Day's book is that indexes in a certain theoretical perspective "have more than simply a retrieval function; they do not only act as affordances and means for the fulfillment of 'information needs', but for the creation of such, and the creation of documentary-mediated persons and selves, as well" (Day 2014, 37). A point of view that may seem somewhat exaggerated.


2.6 The subject knowledge view

The subject knowledge view of subjects emphasizes the role of domain-specific knowledge in relation to both subject representation in practice and theoretical issues concerning the nature of "subject". It may also be called "the domain analytic view" or "the epistemological view" because it understands subject knowledge as formed by different theories, which in the end are connected with epistemological assumptions. Rowley & Hartley wrote:

In order to achieve good consistent indexing, the indexer must have a thorough appreciation of the structure of the subject and the nature of the contribution that the document is making to the advancement of knowledge within a particular discipline (Rowley & Hartley 2008, 109).

This is an important statement (which unfortunately has not been further developed by the authors). It clearly expresses that subject representation aims at supporting the advancement of knowledge in different domains and that subject knowledge is a precondition for doing so. This statement is in accordance with how Hjørland (1992, 185) defined subjects as the epistemological potentials of documents (or, synonymously, as the informative potentials of documents), a definition with the same two implications. Hjørland's definition contains the additional layer that different "paradigms" entail different subject representations. Therefore, the question of subject representation is closely linked to the question of which paradigms should be supported. In other words, subject representations cannot be regarded as neutral expressions. On the contrary, the activity of assigning a subject label to a given document represents a kind of power (cf. Olson 2002), which aims at facilitating certain uses of that document at the expense of other uses.

Let us consider a concrete example. Fisher (1921) published the article "Studies in crop variation" as part of a series. As indicated by the title, the subject is crop variation. Retrospectively, however, this title and subject attribution are considered poor:

Seldom in the history of science has a set of titles [Studies in crop variation] been such a poor description of the importance of the material they contain. In these papers, Fisher developed original tools for the analysis of data, derived the mathematical foundations of those tools, described their extensions into other fields, and applied them to the "muck" he found at Rothamsted. These papers show a brilliant originality and are filled with fascinating implications that kept theoreticians busy for the rest of the twentieth century, and will probably continue to inspire work in the years that follow (Salsburg 2001, 43).

Of course, Fisher (1921) is (also) about crop variation and should be indexed as such in indexes within agriculture. However, as the quote says, this article has had a much broader and deeper importance in the field of statistical probability, where two of its main subjects are experimental design and sampling. If the purpose of subject representation is to support future use of documents, then these last-mentioned subject labels are far more important than the one indicated by the title.

The bibliometrician Henry Small published an important paper, "Cited documents as concept symbols" (Small 1978), in which he found that highly cited papers tend to be cited for the same reasons and that these reasons are often represented in the citing documents as "concept symbols". For example, we may assume that most of the papers citing Fisher (1921) use a term such as "experimental design" as a concept symbol at the place of the reference in the text. Bibliometric methods may therefore be used automatically or semi-automatically to determine the subject of documents in a way that is in agreement with the subject knowledge view (cf. Schneider and Borlund 2004). Of course, this technique cannot be applied to assign subject labels to new documents, only retrospectively and only to (highly) cited documents. Whether or not we may apply this method in practice, the example provides a deep insight into the dynamic nature of "subjects". It demonstrates that the subject of a document is not independent of an evaluation of the potentials of that document.
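The idea of reading a cited document's subject off its citation contexts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the citation contexts and the candidate terms are invented, and the simple substring count is not the method of Small (1978) or of Schneider and Borlund (2004).

```python
from collections import Counter


def concept_symbols(citation_contexts, candidate_terms, top_n=1):
    """Count in how many citation contexts each candidate term occurs.

    Returns the top_n (term, count) pairs, most frequent first.
    """
    counts = Counter()
    for context in citation_contexts:
        text = context.lower()
        for term in candidate_terms:
            if term in text:
                counts[term] += 1
    return counts.most_common(top_n)


# Invented sentences surrounding references to the same cited paper.
contexts = [
    "The experimental design follows Fisher (1921).",
    "We adopt the experimental design introduced by Fisher (1921).",
    "Sampling was carried out as in Fisher (1921).",
    "Crop variation was first analysed this way by Fisher (1921).",
]
terms = ["experimental design", "sampling", "crop variation"]
print(concept_symbols(contexts, terms, top_n=1))
```

With these invented contexts the most frequent term is "experimental design" rather than the title's "crop variation". The label emerges from later use of the document, so it can only be computed retrospectively.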


2.7 Other views and definitions

In the ISO-standard for topic maps, the concept of subject is defined this way:

Anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever (ISO/IEC 13250 2002, 4).

This definition may work well within the closed system of concepts provided by the topic maps standard. In broader contexts, however, it is not fruitful because it does not contain any specification of how to determine the subject of a given document. If different methods of subject analysis imply different results, which of these results should then be preferred? Different persons may have different opinions about what the subject of a specific document is. The theoretical understanding of the concept of "subject" should be helpful for deciding principles of subject analysis. It is not helpful just to say that "subject" is "anything whatsoever".


3. Related concepts

3.1 Words versus concepts versus subjects

A proposal for differentiating between concept indexing and subject indexing was given by Bernier (1980). In his opinion subject indexes are different from, and can be contrasted with, indexes to concepts and words. Subjects are what authors are working and reporting on. A document can have the subject of "chromatography" if this is what the author wishes to inform readers about. Papers using chromatography as a research method or discussing it in a subsection do not have chromatography as their subject. Indexers can easily drift into indexing concepts and words rather than subjects, but this is not good indexing.

Bernier does not, however, differentiate authors' subjects from those of the information seekers. A user may want a document for reasons other than those its author intended. From the point of view of information systems, the subject of a document is related to the questions that the document can answer for the users (cf. the distinction between a content-oriented and a request-oriented approach presented above).

Words, concepts and subjects are often confused with one another. If "subject" is defined differently from words and concepts, it follows that its statistical distribution may also be different. Hjørland & Nicolaisen (2005), in their analysis of the concept of "subject" in relation to Bradford's law of scattering, made this distinction:

Lexical scattering is the scattering of words in texts and in collections of texts.
Semantic scattering is the scattering of concepts in texts and in collections of texts.
Subject scattering is the scattering of items useful to a given task or problem.

This example demonstrates that the concept of subject has wide-ranging implications not just for subject representation but also for bibliometrics and LIS in general.


3.2 Aboutness

Aboutness is a concept used in LIS, linguistics, philosophy of language, and philosophy of mind. In the philosophy of mind, it has often been considered synonymous with intentionality (cf. Siewert 2016); in the philosophy of logic and language it is understood as the way a piece of text relates to a subject matter or topic (cf. Demolombe and Jones 1999; Yablo 2014).

Robert A. Fairthorne (1969) is credited with coining the term aboutness in LIS, which became popular in LIS in the late 1970s, perhaps due to arguments put forward by William John Hutchins (1975; 1977 and 1978). Hutchins argued that aboutness was to be preferred to subject because it removed some epistemological problems (e.g., that different people may attribute different subjects to the same document). Hjørland (1992 and 1997) argued, however, that the same epistemological problems were also present in Hutchins' proposal (different people may also attribute different "aboutness" to the same document). Because the same problems are connected with aboutness, the reason to introduce this term as a substitute for subject is unsupported. By implication, aboutness and subject should be considered synonymous in LIS.

Tredinnick (2006) throughout the book considers the attribution of "aboutness" to documents to be a problematic activity in LIS ("subject" is not discussed). He wrote:

Any isolation of the aboutness of texts therefore involved an act of interpretation that seeks to limit the signifying value of the text, without any particular claim to authority or authenticity. In other words, what information means also becomes a matter of the socio-cultural values that we bring to it, what Eco (1976) calls the cultural codes within which signification occurs, and these values are neither neutral in the way we might assume, nor absolute. The identification of the aboutness of information imposes certain privileged perspectives on text. It happens that these perspectives can be mapped against sociocultural norms or particular discursive communities, such as the humanist outlook that influenced librarianship and the positivism of information science. This is a problem for the information profession, which largely occupies itself by isolating in various ways the aboutness of texts. (Tredinnick 2006, 138)

If I understand this quote correctly, it says that the determination of aboutness involves socio-cultural values (and this covers "subject" as well). It is difficult to understand, however, why this act in itself is considered a problem; it should only be considered a problem if epistemological and socio-cultural values are ignored.

3.3 Topic

Topic is a term often used synonymously with subject and aboutness. Examples are Jarneving (2005, 252), who wrote, "title words have a high topicality"; Xu and Yin (2008, 202), who wrote, "Topicality measures the 'aboutness' of a document to the topic area suggested by a query"; and Janes (1994, 161), who wrote, "Topicality, the relation of a document to the topic of a user's query". Huang (2009) is a dissertation with the title Topicality Reconsidered: A Multidisciplinary Inquiry into Topical Relevance Relationships.

Based on how the term topic is used in the literature of LIS, it is here concluded that it should be considered a synonym for subject.

3.4 Isness

Isness is a concept that has been suggested to cover terms for indexing that are considered to be beyond proper subject terms. The International Federation of Library Associations and Institutions wrote:

The FRSAR Working Group is aware that some controlled vocabularies provide terminology to express other aspects of works in addition to subject (such as form, genre, and target audience of resources). While very important and the focus of many user queries, these aspects describe isness or what class the work belongs to based on form or genre (e.g., novel, play, poem, essay, biography, symphony, concerto, sonata, map, drawing, painting, photograph, etc.) rather than what the work is about. (IFLA 2010, 10)

"Isness" thus expresses what something is as opposed to what it is about. It is, however, a rarely used term in LIS.

3.5 Ofness

In picture indexing, the term ofness is sometimes used to refer to objects or events in the picture:

Those LIS authors who have focused on the subjects of visual resources, such as artworks and photographs, have often been concerned with how to distinguish between the "aboutness" and the "ofness" (both specific and generic depiction or representation) of such works (Shatford 1986). In this sense, "aboutness" has a narrower meaning than that used above. A painting of a sunset over San Francisco, for instance, might be analyzed as being (generically) "of" sunsets and (specifically) "of" San Francisco, but also "about" the passage of time. (IFLA 2010, 11)

Shatford's analysis was inspired by Panofsky (1939), who identified three levels of meaning in works of art. At the first, or pre-iconographic, level, subject matter is designated as factual ("ofness") or expressional ("aboutness") and is based on the objects and events in an image as they can be interpreted through everyday experience. At the second, or iconographic, level, interpretation requires some cultural knowledge of themes and concepts (not "a sailor" but "Ulysses"). The third, or iconological, level requires interpretation at a sophisticated level using world and cultural knowledge plus a deeper understanding of the history and background of the work.

See further in: Baca & Harpring (2000), Krause (1988) and Shatford (1986).

3.6 Theme

In art history, literary studies, text linguistics and other fields, the notion of the theme of a work or a text is often discussed. "Thematics is the study of themes and motifs in text and discourse" (Louwerse and van Peer 2006).

Theme is often considered a synonym for subject. The ISO 5963 standard Methods for examining documents, determining their subjects and selecting indexing terms, for example, defines "subject" as follows: "Any concept or combination of concepts representing a theme in a document" (this definition and this standard are also clearly document-oriented). A similar definition is used in FRSAD, where thema is defined as "any entity used as a subject of a work". Therefore this model confirms one of the basic relationships defined in FRBR: "WORK has as subject THEMA / THEMA is subject of WORK" (Zeng, Žumer & Salaba 2010, p. 16).

The notion of theme occurs in Derek Austin's → PRECIS and the works of the Italian Gruppo di ricerca sull'indicizzazione per soggetto (GRIS) (Cheti 1996; 2008). In their analysis, a work's subject may consist of one base theme and possibly of some particular themes that are related to the base theme in the document's argumentation; the latter may be mentioned or not in subject headings, while the former is mandatory. Theme is then understood as a component of subject. Gnoli and Cheti (2013) argue that base theme should be cited before particular themes within a classmark and displayed earlier in search results.

According to Wikipedia (September 2023): "In contemporary literary studies, a theme is a central topic, subject, or message within a narrative. Themes can be divided into two categories: a work's thematic concept is what readers 'think the work is about' and its thematic statement being 'what the work says about the subject'" (again, both definitions are document-oriented rather than request-oriented). In linguistics, these are called respectively theme and rheme or, in slightly different senses, topic and comment or given and new. They can refer to a whole text macrostructure or to a sentence microstructure: "Concerning weather [theme], today it’s sunny [rheme]".

Weinberg (1988) argues that rheme should be expressed in subjects as well as theme. Lancaster (2003, p. 16) comments, however, that she "fails to convince that these distinctions are really useful in the context of indexing or that it might be possible for indexers to maintain such distinctions". The → Integrative Levels Classification provides a way to express the rheme of a document, although this is not expected to be a common application (Gnoli 2018).

Hjørland (1997) argues that subject indexing is not necessarily about the main idea of a document, which is why subject and theme should not be considered synonyms. An issue of a journal may be thematic: the articles share the same theme, but they are usually indexed differently and, by implication, their subjects are different (as understood in indexing and retrieving).

3.7 Content

Content analysis is a method used in many fields, for example, educational research, linguistics, psychology, and sociology. Although the term is not much used in fields such as literary studies and art studies, kinds of content analysis can be said to be used there as well. In information science different approaches have been used, including automated content analysis based on word frequencies. Content analysis is also applied to media other than texts; Short (2019), for example, is about content analysis of visual images.
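
The word-frequency approach mentioned above can be illustrated with a minimal sketch. It assumes a deliberately crude tokenizer and two invented sample texts; the function name term_document_matrix is hypothetical and is not taken from any of the works cited in this article. Each text becomes a row of word counts over a shared vocabulary, which is the kind of matrix to which statistical procedures can then be applied.

```python
import re
from collections import Counter


def term_document_matrix(texts):
    """Map each text to a row of word counts over a shared vocabulary."""
    # Lowercase and keep runs of ASCII letters only (an illustrative simplification).
    tokenized = [re.findall(r"[a-z]+", text.lower()) for text in texts]
    # The shared vocabulary is the sorted set of all words seen in any text.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # One row per text, one column per vocabulary word.
    matrix = [[count[word] for word in vocabulary] for count in counts]
    return vocabulary, matrix


# Invented sample texts, used only for illustration.
sample_texts = [
    "Subject indexing of documents",
    "Indexing pictures and indexing texts",
]
vocabulary, matrix = term_document_matrix(sample_texts)
```

Such counts only record which words occur in a text; whether they also indicate its subject is a separate question (see Section 3.1 on words versus concepts versus subjects).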

Chapter 6 in Broughton (2012) is called “Content analysis”. In this chapter (p. 65) she wrote:

Before you can do that it is necessary to decide what the item being catalogued is about. Whatever system of subject headings (or classification scheme or thesaurus) is being used to describe a document, you should try initially to make an independent assessment of what the subject of that document is. In practice, you will almost certainly be unable to represent this exactly using the artificial language of your system, but you should at least begin by deciding objectively what it is you want to express. This process may be called 'subject analysis', or 'document analysis', 'content analysis' or 'concept analysis'. The subject content of items is sometimes also referred to more grandly as 'intellectual content' or 'semantic content', but these are simply other ways of defining what a document is about.

Broughton’s advice to think of the subject before representing it in a particular knowledge organization system is important and corresponds to what Lancaster (2003, 9) describes as two steps in indexing: (1) “conceptual analysis”, determining the subject of a document, and (2) “translation”, finding the best place for the subject in a given knowledge organization system.

To Broughton’s list of what is sometimes used as synonyms for subject analysis, the term information analysis could be added. Vickery and Vickery (2004, 119) defined:

Human analysis of primary information message consists of a scan to select from it terms, phrases, and other expressions that are believed best to express its information content. The structure of the primary message itself often guides the human indexer — for example, the title of a paper, or a summary provided by the author, or his conclusions.

(It seems somewhat strange that Vickery and Vickery limit this definition to (1) primary messages, since secondary messages are indexed by the same principle, and (2) derivative indexing, since assigned indexing also needs a definition.)

Dousa (2009) used the term “information analysis” to cover the analysis of both whole documents and smaller units of these:

The late 19th and early 20th centuries witnessed numerous developments in the domain of classificatory and indexing activities known today as knowledge organization (KO). Among the most striking of these was the emergence of the idea that documents could be decomposed not only into smaller bibliographical units (as, for example, a periodical into articles or a book into chapters), but also into yet smaller information units (such as, for example, the concepts or facts discussed in discrete passages within a text) and that, once identified, these information units could be reconfigured in new arrangements that would facilitate their retrieval [Metcalfe 1957, 223; Frarey 1953, 221-2]. This idea, which I term information analysis, would have a long and influential career in information science (IS) and continues to influence IS theory and practice to this day.

However, the retrieval of smaller units from documents is mostly termed “passage retrieval” (see, e.g., O'Connor 1980) or, by Ranganathan (1963, 29), retrieval of ‘microdocuments’. In any case, the important issue is that what is retrieved are parts of documents, not something with an independent existence. In passage retrieval (or microdocument retrieval) the criterion for what should be retrieved is the subject of the passage or microdocument. Nothing is gained by calling it “information analysis” rather than “subject analysis”.

Broughton, in the above quote, considers content as one among many synonyms for subject. We have so far mentioned the following candidates as synonyms for subject analysis:

  • (Aboutness analysis. Section 3.2 in the present article argues that aboutness should be considered a synonym for subject, and therefore aboutness analysis a synonym for subject analysis).
  • Concept analysis (= conceptual analysis). This term has a long tradition in philosophy, from Plato’s view that questions like 'What is knowledge?', 'What is justice?', or 'What is truth?' can be answered solely on the basis of one's grasp of the relevant concepts. Hanna (1998) wrote “by the end of the 1970s the movement [i.e., conceptual analysis] was widely regarded as defunct”. The problems of concept analysis are also important for knowledge organization, for example, in the approach known as “formal concept analysis”, and hopefully IEKO will later contain an article about concept analysis. Here, our conclusion is that this philosophical tradition is distinct from subject analysis in knowledge organization, and that these two concepts therefore should not be considered synonyms.
  • Content analysis is a research method used in the social sciences and humanities, including library and information science. White and Marsh (2006) is an article about content analysis written for the LIS community. The article presents 24 selected studies within LIS 1991-2005 which are based on content analysis as the research method. None of these are about what is known as subject analysis in knowledge organization. Therefore, the terms content analysis and subject analysis should not be considered synonyms (although we shall consider a counterexample from literary studies below).
  • Document analysis. In Web of Science, this term was found 217 times in the “topic” field within Information science and library science, out of a total of 6,191. Limited to titles, the respective figures are 23 of 292. Among important uses in LIS are Gardin (1973) and Salminen, Kauppinen and Lehtovaara (1997). Outside LIS, the article by Bowen (2009) can be mentioned. Gardin (1973) used document analysis for “the extraction of meaning from documents”. This may be closely related to subject analysis, but subject analysis is not just about extracting, but also about assigning meanings to documents. In general, document analysis is used as a broader term and is more related to content analysis than to subject analysis.
  • Information analysis. Vickery & Vickery’s (2004, 119) definition of this term makes it fully synonymous with what in knowledge organization has traditionally been called “subject analysis”.
  • (Semantic analysis is a concept mostly connected with linguistics, see, e.g., Goddard 2011. Some parts of semantic analysis are closely related to establishing semantic relations in knowledge organization systems, but this is not synonymous with subject analysis. An article on semantic relations is planned for IEKO.)
  • Subject analysis is treated in the present article, where it is understood as a process aiming at describing documents in order to optimize their searchability; it is considered a prerequisite for indexing, classifying or retrieving documents (or passages) whether this is done by humans or by computers. It is suggested to be the preferred term.

More concepts could be considered, including discourse analysis, genre analysis, picture analysis and text analysis, but will not be discussed here.

As already said, content analysis is a term used by other disciplines in ways that differ from what is meant by subject analysis in library and information science. For example, Roberts (2015, 769) defined:

Content analysis is a class of techniques for mapping symbolic data into a data matrix suitable for statistical analysis. These techniques may be applied to any representative sample of cultural artifacts (e.g., books, paintings, technological innovations, etc.), whereby various nonnumeric attributes of these artifacts are mapped into a matrix of statistically manipulable symbols. Thus, content analysis involves measurement, not ‘analysis’ in the usual sense of the word.

Neuendorf (2002, 10) defines content analysis as

summarizing, quantitative analysis of messages that relies on the scientific method and is not limited to the types of variables that may be measured or the context in which the messages are created or presented.

These specific definitions limit content analysis to quantitative methods. Short (2016, 3) wrote, however:

Content analysis currently includes both quantitative and qualitative approaches. Quantitative approaches are used in fields concerned with mass communications (Neuendorf, 2002), while qualitative content analysis covers methods such as discourse analysis, social constructivist analysis, rhetorical analysis, and textual analysis.

That content analysis includes both quantitative and qualitative methods is explicitly the case with White and Marsh (2006), who quote a definition by Krippendorff (2004, 18): “a research technique for making replicable and valid inferences from texts (or other meaningful matter) to the context of their use”.

Despite our conclusion that content analysis is not a synonym for subject analysis, there are overlaps and exceptions. Stephens (2015, v) thus claimed that the question “What is this text about?” has always been the basis of analysis of (children’s) literature, whether the method of critical content analysis or literary analysis is applied. To the degree that this is true, content analysis can be considered a near-synonym for subject analysis. (The term critical in critical content analysis is discussed by Short (2016, 3), who wrote: “Adding the word ‘critical’ in front of content analysis signals a political stance by the researcher, particularly in searching for and using research tools to examine inequities from multiple perspectives. Researchers who adopt a critical stance focus on locating power in social practices by understanding, uncovering, and transforming conditions of inequity embedded in society”.)

Krippendorff (2018, 27-31) discusses six important philosophical problems in content analysis that are also relevant to consider for subject analysis in knowledge organization:

  1. Texts have no objective (that is, no reader-independent) qualities.
  2. Texts do not have single meanings that could be “found”, “identified”, or “described” for what they are or correlated with states of their sources.
  3. The meanings invoked by texts need not be shared.
  4. Meanings (contents) speak to something other than the given texts, even where convention suggests that messages “contain” them or texts “have” them.
  5. Texts have meanings relative to particular contexts, discourses, or purposes.
  6. The nature of texts demands that content analysts draw specific inferences from a body of texts to their chosen context: from print to what the printed matter means to particular users, from how analysts regard a body of texts to how selected audiences are affected by those texts, and from available data to unobservable phenomena.

Such principles correspond to the principle in knowledge organization according to which documents do not “have” subjects; rather, subjects are attributed to documents in order to serve selected purposes for potential users.

4. Conclusion

The concept "subject" has a long history in LIS but the different meanings have seldom been compared and examined. The main conclusions of this article are:

  • Any approach to subject representation is connected to a certain understanding of "subject", which is often implicit.
  • Different definitions or implicit views of "subject" are connected to different approaches and paradigms in information science. The concept "subject" cannot be properly understood or developed without considering basic theoretical issues in LIS.
  • The activity of assigning a subject label to a given document aims at facilitating certain uses of that document at the expense of other uses. This activity is done by somebody or by an algorithm based on his or her (or the programmer's) knowledge, theories, working conditions etc.
  • Any given document has an unlimited range of possible uses or potentials. The aim of subject analysis is to identify the most important potentials in order to facilitate the identification of documents that support important human activities. The subjects of a document are its informative or epistemological potentials, that is, its potential to inform users and advance the development of knowledge.

Acknowledgments

The author would like to thank Widad Mustafa El Hadi for serving as the editor of this article, the three anonymous referees for providing their valuable feedback, and Claudio Gnoli for contributing to the section on Theme.

References

Baca, Murtha and Harpring, Patricia (Eds.). 2009. "Categories for the description of works of art (CDWA)". Los Angeles, CA: The J. Paul Getty Trust and College Art Association, Getty Research Institute. Retrieved 2010-01-20 from: http://www.getty.edu/research/conducting%5Fresearch/standards/cdwa/index.html

Bernier, Charles L. 1980. "Subject indexes". In: Kent, Allen; Lancour, Harold & Daily, Jay E. (Eds.), Encyclopedia of Library and Information Science: Volume 29. New York, NY: Marcel Dekker, Inc.: 191-205.

Bowen, Glenn A. 2009. “Document Analysis as a Qualitative Research Method”. Qualitative Research Journal 9, no. 2: 27-40. DOI: 10.3316/QRJ0902027

Broughton, Vanda. 2012. Essential Library of Congress Subject Headings. London: Facet Publishing.

Cheti, Alberto. 1996. "Testo e contesto nell'analisi concettuale dei documenti" [Text and Context in Conceptual Analysis of Documents]. In Il linguaggio della biblioteca: scritti in onore di Diego Maltese, ed. Mauro Guerrini. Milano: Editrice Bibliografica, 833-55.

Cheti, Alberto. 2008. "Il punto di vista del GRIS sulla relazione di soggetto in FRBR" [GRIS' Viewpoint on the Subject Relationship in FRBR]. In Principi di catalogazione internazionali: una piattaforma europea? Considerazioni sull'IME ECC di Francoforte e Buenos Aires: Atti del convegno internazionale, Roma, Bibliocom-51o Congresso AIB, 27 ottobre 2004, ed. Mauro Guerrini. Rome: Associazione italiana biblioteche, 91-100. http://www.aib.it/aib/congr/c51/chetint.htm.

Cooper, William S. 1969. "Is interindexer consistency a hobgoblin?" American Documentation, 20: 268-278.

Day, Ronald E. 2014. Indexing it all: the subject in the age of documentation, information, and data. Cambridge, MA: The MIT Press.

Demolombe, Robert & Jones, Andrew J. I. 1999. "On sentences of the kind sentence "p" is about topic "t" ". Chapter in, H-J. Ohlbach, U. Reyle, editors. Logic, Language and Reasoning. Essays in honor of Dov Gabbay (pp. 125-144). Dordrecht: Kluwer. https://www.irit.fr/~Robert.Demolombe/publications/1996/gabbay96.pdf

Dewey, Melvil. 1979. Dewey Decimal Classification and relative index. (19th ed., Vol. 1). Albany, NY: Forest Press.

Dousa, Thomas M. 2009. “Facts and Frameworks in Paul Otlet's and Julius Otto Kaiser's Theories of Knowledge Organization”. Bulletin of the American Society for Information Science and Technology 36, no 2: 19-25. https://doi.org/10.1002/bult.2010.1720360208

Drake, Cyril Lewis. 1960. "What is a subject?" Australian Library Journal, 9: 34-41.

Dutta, Bidyarthi. 2015. "Ranganathan's elucidation of 'subject' in the light of 'Infinity (∞)' ". Annals of Library and Information Studies, 62: 255-264. Digital version: http://nopr.niscair.res.in/bitstream/123456789/33720/1/ALIS%2062(4)%20255-264.pdf

Dutta, Bidyarthi & Dutta, Chaitali. 2013. "Concept of "subject" in library and information science from a new angle". Annals of Library and Information Studies, 60(2): 78-87. Digital version: http://op.niscair.res.in/index.php/ALIS/article/download/2086/61

Dutta, Bidyarthi, Majumder, Krishnapada and Sen, B K. 2013. "In search of dimensions of subject from the standpoint of Ranganathan". Annals of Library and Information Studies, 60(1): 51-55.

Eco, Umberto. 1976. A Theory of Semiotics. Bloomington: Indiana University Press.

Fairthorne, Robert A. 1969. "Content analysis, specification and control". Annual Review of Information Science and Technology, 4: 73-109.

Fisher, Ronald Aylmer. 1921. "Studies in Crop Variation. I. An examination of the yield of dressed grain from Broadbalk". Journal of Agricultural Science. 11 (2): 107-135. doi:10.1017/S0021859600003750.

Frarey, Carlyle J. 1953. “Developments in Subject Cataloging”. Library Trends 2, no. 2: 217-35.

Frohmann, Bernd. 1994. "The social construction of knowledge organization: The case of Melvil Dewey". Advances in Knowledge Organization, 4: 109-117.

Gardin, Jean-Claude. 1973. “Document Analysis and Linguistic Theory”. Journal of Documentation 29, no. 2: 137-68.

Gnoli, Claudio. 2018. "Classifying Phenomena Part 4: Themes and Rhemes". Knowledge Organization 45, no. 1: 43-53. DOI:10.5771/0943-7444-2018-1-43.

Gnoli, Claudio and Alberto Cheti. 2013. "Sorting Documents by Base Theme with Synthetic Classification: The Double Query Method". In Classification & Visualization: Interfaces to Knowledge: Proceedings of the International UDC Seminar 24-25 October 2013 the Hague, the Netherlands, edited by Aida Slavic, Almila Akdag Salah and Sylvie Davies. Ergon: Würzburg, 225-232.

Goddard, Cliff. 2011. Semantic Analysis: A Practical Introduction. 2. ed. Oxford, UK: Oxford University Press.

Golub, Koraljka. 2014. Subject access to information: An interdisciplinary approach. Santa Barbara, CA: Libraries Unlimited.

Gopinath, Malur Aji. 1976. "Colon Classification". In: Arthur Maltby (Ed.): Classification in the 1970s: A second look (rev. ed.; pp. 51-80). London: Clive Bingley.

Hanna, Robert. 1998. “Conceptual Analysis”. In Routledge Encyclopedia of Philosophy, ed. Edward Craig. London: Routledge, 518-22.

Hjørland, Birger. 1992. "The concept of "subject" in information science". Journal of Documentation, 48(2):172-200.

Hjørland, Birger. 1997. Information seeking and subject representation. An activity-theoretical approach to information science. Westport & London: Greenwood Press.

Hjørland, Birger. 2013. "Facet analysis: The logical approach to knowledge organization". Information processing and management, 49(2): 545-557.

Hjørland, Birger & Nicolaisen, Jeppe. 2005. "Bradford's law of scattering: Ambiguities in the concept of "subject" ". In: Crestani, F. & Ruthven, I. (Eds.): CoLIS 2005, Proceedings of the 5th International Conference on Conceptions of Library and Information Science (pp. 96-106). Berlin: Springer-Verlag. (LNCS 3507)

Huang, Xiaoli. 2009. Topicality Reconsidered: A Multidisciplinary Inquiry into Topical Relevance Relationships. College Park, MD: University of Maryland, College of Information Studies. (PhD-dissertation).

Hutchins, W. John. 1975. Languages of indexing and classification. A linguistic study of structures and functions. London: Peter Peregrinus.

Hutchins, W. John. 1977. "On the problem of "aboutness" in document analysis." Journal of Informatics, 1: 17-35.

Hutchins, W. John. 1978. "The concept of "aboutness" in subject indexing." Aslib Proceedings, 30: 172-181.

IFLA. 2010. Functional requirements for subject authority data (FRSAD): A conceptual model. By IFLA Working Group on the Functional Requirements for Subject Authority Records (FRSAR). Edited by Marcia Lei Zeng, Maja Zumer, Athena Salaba. International Federation of Library Associations and Institutions. Berlin: De Gruyter. Retrieved 2011-09-14 from: http://www.ifla.org/files/classification-and-indexing/functional-requirements-for-subject-authority-data/frsad-final-report.pdf

ISO 5963:1985. Documentation: Methods for examining documents, determining their subjects and selecting indexing terms. International Organization for Standardization. https://www.iso.org/obp/ui/#iso:std:iso:5963:ed-1:v1:en

ISO/IEC 13250 Topic Maps. Information Technology. Document Description and Processing Languages. Second Edition. Geneva, 19 May 2002. http://xml.coverpages.org/TM-iso13250-2nd-ed-v2.pdf

Janes, Joseph W. 1994. "Other peoples' judgments: A comparison of users and others' judgments of document relevance, topicality, and utility". Journal of the American Society for Information Science and Technology, 45(3): 160-171.

Jarneving, Bo. 2005. "A comparison of two bibliometric methods for mapping of the research front". Scientometrics, 65(2): 245-263.

Krause, Michael G. 1988. "Intellectual problems of indexing picture collections". Audiovisual Librarian, 14(2): 73-81.

Krippendorff, Klaus. 2004. Content Analysis: An Introduction to Its Methodology. 2nd. Edition. Thousand Oaks, CA: SAGE Publications.

Krippendorff, Klaus. 2018. Content Analysis: An Introduction to Its Methodology. 4th. Edition. Thousand Oaks, CA: SAGE Publications.

Lancaster, Frederick Wilfrid. 2003. Indexing and abstracting in theory and practice. Third edition. London: Facet Publishing.

Library of Congress. 2008. The subject headings manual. Washington, D.C: Library of Congress, Policy and Standards Division.

Louwerse, Max M. and Willie van Peer. 2006. "Thematics". In Encyclopedia of Language and Linguistics, 2nd ed. Ed. Keith Brown. Oxford: Elsevier, 12, p. 653-658.

Metcalfe, John Wallace. 1957. Information Indexing and Subject Cataloging: Alphabetical, Classified, Coordinate, Mechanical. New York: Scarecrow Press.

Metcalfe, John. 1973. "When is a subject not a subject?" In Towards a theory of Librarianship. Ed. by Conrad H. Rawski. New York: Scarecrow Press.

Miksa, Francis. 1983a. "Melvil Dewey and the corporate ideal". Pp. 49-100 in: Melvil Dewey: The man and the classification. Ed. by G. Stevenson & J. Kramer-Greene. Albany, NY: Forest Press.

Miksa, Francis. 1983b. The subject in the dictionary catalog from Cutter to the present. Chicago: American Library Association.

Neuendorf, Kimberly A. 2002. Content Analysis Guidebook. Thousand Oaks, CA: Sage.

O'Connor, J. 1980. “Answer-Passage Retrieval by Text Searching”. Journal of the American Society for Information Science 31, no. 4: 227-39. https://doi.org/10.1002/asi.4630310402.

Olson, Hope A. 2002. The power to name: Locating the limits of subject representation in libraries. Dordrecht, The Netherlands: Kluwer Academic Publishers.

Panofsky, Erwin. 1939. Studies in iconology: Humanistic themes in the art of the Renaissance. New York: Oxford University Press.

Ranganathan, Shiyali Ramamrita. 1963. Documentation and Its Facets. New York: Asia Publishing House. http://arizona.openrepository.com/arizona/bitstream/10150/105426/3/documen.partb.pdf.

Ranganathan, Shiyali Ramamrita. 1964. "Subject heading and facet analysis", Journal of Documentation, 20, No. 3: 109-119.

Ranganathan, Shiyali Ramamrita. 1967. Prolegomena to library classification. Third edition. London: Asia Publishing House.

Roberts, Carl W. 2015. “Content Analysis”. In International Encyclopedia of the Social & Behavioral Sciences 2nd edition, ed. James D. Wright. Amsterdam, Netherlands: Elsevier, Volume 4: 769-73. https://doi.org/10.1016/B978-0-08-097086-8.44010-9

Rowley, Jennifer & Hartley, Richard. 2008. Organizing knowledge. An introduction to managing access to information. 4th edition. Aldershot: Ashgate Publishing Limited.

Salminen, Airi, Katri Kauppinen and Merja Lehtovaara. 1997. “Towards a Methodology for Document Analysis”. Journal of the American Society for Information Science 48, no. 7: 644-55.

Salsburg, David. 2001. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. New York: W. H. Freeman.

Saracevic, Tefko. 2008. "Effects of inconsistent relevance judgments on information retrieval test results: A historical perspective". Library Trends, 56(4):763-783. http://comminfo.rutgers.edu/~tefko/LibraryTrends2008.pdf

Schneider, Jesper W & Borlund, Pia. 2004. "Introduction to bibliometrics for construction and maintenance of thesauri: Methodical considerations". Journal of Documentation, 60, No. 5: 524-549.

Shatford, Sara. 1986. "Analyzing the subject of a picture: A theoretical approach". Cataloging & Classification Quarterly, 6 (3): 39-62.

Short, Kathy G. 2019. “Critical Content Analysis of Visual Images”. In Critical Content Analysis of Visual Images in Books for Young People: Reading Images, eds. Holly Johnson, Janelle Mathis and Kathy G. Short. New York: Routledge, 3-22.

Short, Kathy G. 2016 (with the Worlds of Words Community). “Critical Content Analysis as a Research Methodology”. In Critical Content Analysis of Children’s and Young Adult Literature: Reframing Perspective, eds. Holly Johnson, Janelle Mathis and Kathy Gnagey Short. London, England: Routledge, 1-15.

Siewert, Charles. 2016. "Consciousness and intentionality", The Stanford Encyclopedia of Philosophy (Fall 2016 Edition), Edward N. Zalta (ed.). http://plato.stanford.edu/entries/consciousness-intentionality/.

Small, Henry G. 1978. "Cited documents as concept symbols". Social studies of science, 8(3): 327-340.

Soergel, Dagobert. 1974. Indexing languages and thesauri: Construction and maintenance. Los Angeles, Calif: Melville Publishing.

Soergel, Dagobert. 1985. Organizing information: Principles of data base and retrieval systems. Orlando, FL: Academic Press.

Stephens, John. 2015. “Editorial: Critical Content Analysis and Literary Criticism”. International Research in Children’s Literature 8, no. 1: v-viii. DOI: 10.3366/ircl.2015.0144

Tredinnick, Luke. 2006. Digital information contexts: Theoretical approaches to understanding digital information. Oxford: Chandos.

Vickery, Brian C. and Alina Vickery. 2004. Information Science in Theory and Practice. 3rd ed. München: K. G. Saur.

Weinberg, Bella Hass. 1988. Why indexing fails the researcher. The Indexer, 16(1), 3-6.

Welty, Christopher A. 1998. "The ontological nature of subject taxonomies". In Nicola Guarino (ed.), Proceedings of the First Conference on Formal Ontology and Information Systems. Amsterdam: IOS Press. http://www.cs.vassar.edu/faculty/welty/papers/fois-98/fois-98-1.html

White, Marilyn Domas and Emily E. Marsh. 2006. “Content Analysis: A Flexible Methodology”. Library Trends 55, no. 1: 22–45. DOI: 10.1353/lib.2006.0053

Wilson, Patrick. 1968. Two kinds of power. An essay on bibliographical control. Berkeley: University of California Press.

Wikipedia, the free encyclopedia. Theme (narrative). https://en.wikipedia.org/wiki/Theme_(narrative)

Xu, Yunjie & Yin, Hainan. 2008. "Novelty and topicality in interactive information retrieval". Journal of the American Society for Information Science and Technology, 59(2): 201-215.

Yablo, Stephen. 2014. Aboutness. Princeton, NJ; Oxford: Princeton University Press.

Zeng, Marcia Lei, Žumer, Maja & Salaba, Athena (Eds.). 2010. Functional Requirements for Subject Authority Data (FRSAD): A Conceptual Model. Approved by the Standing Committee of the IFLA Section on Classification and Indexing. The Hague: International Federation of Library.

Version 1.0 published 2016-10-04
Version 1.1 published 2018-10-06: reference to Huang 2009
Version 2.0 published 2020-10-15: section 3.7 added
Version 2.1 published 2023-09-14: section 3.6 improved
Article category: Theoretical concepts

This is a major revision of an article formerly published by the present author on Wikipedia. This article (version 1.0) is published in Knowledge Organization, vol. 44 (2017), Issue 1, pp. 55-64.
How to cite it (version 1.0): Hjørland, Birger. 2017. “Subject (of documents)”. Knowledge Organization 44, no. 1: 55-64. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli, https://www.isko.org/cyclo/subject

©2016 ISKO. All rights reserved.