edited by Birger Hjørland and Claudio Gnoli


Fictional literature classification and indexing

by Jarmo Saarti

Table of contents:
1. Introduction
2. Information process of fiction
3. Aspects of fiction content description
4. Classification and indexing of fiction
5. Classification practices and principles for fiction
6. Development of fiction thesauri and ontologies
7. Systematic approach to the fictional knowledge organization
8. Conclusions
Appendix 1

Fiction content analysis and retrieval are interesting specific topics for two major reasons: (1) the extensive use of fictional works and (2) the multimodality and interpretational nature of fiction. The primary challenge in the analysis of fictional content is that there is no single meaning to be analysed; the analysis is an ongoing process involving an interaction between the text produced by the author, the reader and the society in which the interaction occurs. Furthermore, different audiences have specific needs to be taken into consideration. This article explores the topic of fiction knowledge organization, including both classification and indexing. It provides a broad and analytical overview of the literature and describes several experimental approaches and developmental projects for the analysis of fictional content. Traditional fiction indexing has been based mainly on the factual aspects of the work; it has since been expanded to handle other aspects of the fictional work. There have been attempts to develop vocabularies for fiction indexing. All the major classification schemes use the genre and language/culture of fictional works when subdividing fictional works into subclasses. The evolution of shelf classification of fiction and the appearance of different types of digital tools have revolutionized the classification of fiction, making it possible to integrate both indexing and classification of fictional works.

1. Introduction

There are several reasons why fiction content analysis and retrieval are interesting topics within the knowledge management and organization of documents: the practical need for fiction retrieval has remained unabated, while the possibilities for creating retrieval systems for fiction have increased. This can be traced to the development of computerised environments for information retrieval, and especially for the dissemination of fictional works by both commercial Internet-based vendors and the public sector. These developments have applied a multifaceted approach to analysing and describing texts, as this is an important feature of characterizing and finding the appropriate works of fiction. One must remember that fiction is the most popular type of literature, especially in public libraries.

The history of active content analysis of fiction is surprisingly short, only about one hundred years. The need for fiction indexing and classification was, and sometimes still seems to be, a political issue. As Eriksson stated:

An early significant event is an extensive classification of fiction carried out by the Free Library of Philadelphia in the very beginning of the 20th century. This work becomes a national issue in the USA when the classification is discussed for a few years at the ALA’s annual congress, but it ends up being dismissed. The thesis [i.e. Eriksson’s work] argues that this decision stopped the development of classification for fiction for decades, and quite possibly it is one of the reasons why bibliographic systems, even in the 1980s, did not reflect the topics or themes of fiction. Only eighty years later did the ALA change its mind and from 1990, fiction has been indexed in USA and Denmark, and this may be anticipated to spread to many other countries. (Eriksson 2010, VII)

The inexorable spread of the Internet, especially from the beginning of this millennium, served as an impetus for the organization of fictional knowledge, e.g. the development of specialized and fast information retrieval systems. First, different vendors (e.g. commercial bookstores and publishers) and even individual readers started to utilize classification and indexing as well as other tools in their Internet services. The evolving statistical and social media types of tools were also incorporated into both commercial and library information systems. Furthermore, the Internet created totally new tools for promoting fiction and supporting a reading culture. (Collins 2010; Ross, McKechnie and Rothbauer 2018; Birdi and Ford 2017)

There is already some evidence that enriched result lists and multiple entry points to fiction may help users to locate books (Mikkonen and Vakkari 2016, 67) whereas a simple access point is not as useful (Wilson et al. 2000). In addition, the search strategies used by readers to locate fiction have been analysed and found to support the multimodal nature of the fiction searching as well as considering the needs of each individual reader trying to find fiction (Saarinen and Vakkari 2013, 752–753).

The gradual shift to the digital distribution of information has meant that one needs new tools for analysing the contents of fictional material as well as for its indexing. In other words, texts and other materials that have not been analysed, described and classified and/or indexed in full text databases are hard or even impossible to retrieve. Another reason why we need to take a fresh approach to the content analysis of fictional material is that a free text search is not efficient when searching fictional material. This becomes apparent if we compare it with the search and retrieval of publications in the natural sciences, where even though the text and content may be very topical, its retrieval is usually rather straightforward.

From the viewpoint of information science, the analysis of fictional texts and the information dissemination process of fictional works clearly challenge but also enrich the traditional theoretical models and thus expand the theoretical tools and concepts underpinning this field of research. (See, e.g., Beghtol 1994b; 1997; Green 1997; Ward and Saarti 2018.)

This article evaluates the methods and tools for organizing fictional knowledge with a special emphasis on the content representation of fiction mainly from the perspective of the public libraries.

2. Information process of fiction

The main actors in the information process of fiction are the work of art, its creator (i.e. the writer), the reader and the social-historical environment where the publishing and reception takes place (Beghtol 1986, 93; Saarti 2000, see Fig. 1). Because of the special nature of a work of fiction, the reception of the work of art is not fulfilled unless all the above actors participate in the process. The role of the writer is to write works of art — novels, short stories, poems, plays — to be published. The role of the work of art is to be a medium through which the artist can communicate with his/her audience. However, the work of art has its own, autonomous life: after the book has been published, the writer can only have a role as one of its readers, i.e. an interpreter of the work.

Figure 1: Communication process of fiction (adapted from Segers 1985, 72 and Martens 1975, 36; Saarti 2000)

The role of the reader is that he/she is an interpreter of a work of art. The interpretation as well as the creation of a work of art takes place in a social-historical context that defines the language used and its means of artistic expression. Without a common language, there can be no communication between the writer and her/his readers. This influences the search for fiction: the knowledge about authors, works and their likeness to other works of art are major factors when searching for fiction and the systems should support this fact (Ross 2001).

It is also typical of fictional communication that it works on two levels. On the one hand, it can be considered in terms of factual meanings, i.e. references to actual happenings, historical events, geographical facts etc. (see e.g. Ranta 1991, 20-23); on the other hand, it has an aesthetic facet, which is based on individual interpretation and reception. This influences the content description: some grounds can be identified objectively, but other aspects are subjective and thus personal and diverse. This dichotomy was apparent in Saarti's study where test persons indexed and abstracted novels. The indexing was found to be very inconsistent (Saarti 2002) and the abstracts could be grouped into the following categories (Saarti 2000):

  • Abstracts that describe the structure and content of the novel (plot/thematical abstract).
  • Abstracts that describe the position of the novel in its writer's list of works or describe the novel's position in the literary canon (cultural/historical abstract).
  • Abstracts that describe the reading experience.
  • Critical abstracts.

Adkins and Bossaller (2007) conducted an analysis of the access points to fiction in computer-mediated book information sources. They stated that “online bookstores may be effective tools for librarians helping patrons find "good" books because of their increased use of access points. However, reader advisory databases, which contain reviews and subject headings, are occasionally more effective than online bookstores for identifying books published prior to the 1990s” (ibid. 354). They list altogether 35 different types of access points to fictional works that they found in databases, including contents, cataloguing information, visual information, plot information, reviews etc. (ibid. 368).

Vernitski (2007) has proposed a model for managing the intertextuality of fictional works. She postulated the following nodes for intertextual references: quotation, allusion, variation, sequel and prequel (ibid. 47–48). She stated that these types of indexes could be especially useful for the research community. Thus, the organization of fictional knowledge is also dependent on the point of view of the target audience: fiction can be read and interpreted in completely different ways and these need different types of tools and approaches for their management.
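
As a rough illustration of how such an intertextual index might be stored, the following Python sketch treats the five reference types as typed links between works. This is an assumption made for illustration only: the class and function names are hypothetical, the titles are placeholders, and nothing here reproduces Vernitski's actual scheme.

```python
from enum import Enum


class LinkType(Enum):
    """The five intertextual reference types named in the text above."""
    QUOTATION = "quotation"
    ALLUSION = "allusion"
    VARIATION = "variation"
    SEQUEL = "sequel"
    PREQUEL = "prequel"


def references_from(work, links, link_type=None):
    """Return the works that `work` refers to, optionally filtered by type."""
    return [
        target
        for source, kind, target in links
        if source == work and (link_type is None or kind is link_type)
    ]


# Placeholder titles, not real bibliographic data.
links = [
    ("Work B", LinkType.SEQUEL, "Work A"),
    ("Work C", LinkType.ALLUSION, "Work A"),
    ("Work C", LinkType.QUOTATION, "Work B"),
]
```

A researcher-oriented index of this kind could then answer questions such as "which works does Work C quote?" by calling `references_from("Work C", links, LinkType.QUOTATION)`.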

Thus, it is evident that the primary challenge for fiction content analysis is that there is no single topical meaning to be analysed; in fact, the analysis is an ongoing project due to the nature of the fictional process, i.e. there is a continual interaction between the author, text, society and reader. Furthermore, different audiences have their own specific needs that must be taken into consideration.

3. Aspects of fiction content description

Ranta (1991) has drawn a distinction between two basic kinds of elements to be indexed in fictional works — denotative and connotative. Denotative or factual elements consist of facts in fictional works, such as the setting, personae and factual elements of the plot. Connotative or imaginative elements consist of elements interpreted from fictional works, e.g. the theme and its interpretation and issues arising from the expressional aspects of the work of art (Ranta 1991, 20-23). Ranta has utilized Shatford's approach for indexing photographs, based on Panofsky's theory. Shatford divided the meaning into two categories, i.e. factual and expressional forms. The difference between these two categories is that the factual meanings are objective while the expressional meanings are subjective. "The former describes what the picture is Of, the latter, what it is About". Thus, the indexing of the factual meanings is far more straightforward than that of the expressional meanings (Shatford 1986, 42-50).

It has also been typical that traditional classifications of fiction have a very theoretical foundation, especially the traditional denotative classification systems. They are mainly built on the tradition of historical linguistics originating from the romantic era and ideologies with an educational basis. Unfortunately, in these approaches, the needs of the users are ignored. This was one of the reasons why Pejtersen carried out her study in Danish public libraries to determine what the users wanted to be classified/indexed from the novels. As a result, she divided the questions of the interviewed users into four categories: subject matter, frame, author's intention and accessibility (Pejtersen and Austin 1983, 234).

Pejtersen's categories can be divided into denotative (subject matter and frame) and connotative (author's intention) aspects. Furthermore, she has included aspects that are usually left to the cataloguing of books, in the accessibility group (e.g. physical characteristics). This shows that a system for fiction created according to the reader's wishes must be multi-faceted and include both denotative and connotative aspects: some that are easily recognizable and traditional, as well as some that are unfamiliar to the present systems of classifying and indexing (e.g., evaluating). Pejtersen's results also indicate that the clear division between cataloguing and classifying/indexing is of no relevance to users; their only interest is in locating the works of art they need as easily as possible. Thus, Green stated that the indexing terms of fiction should be divided into two categories: subject terms and attribute terms. The former are those "that reflect what a document or a user need is about". However: "This leaves attribute indexing to reflect such other characteristics of documents and user needs as language, recency, author affiliation, intended audience, and so on" (Green 1997, 86).

The most problematic aspect in Pejtersen's scheme is the author's intention because this is based on the indexer's point of view, i.e. on his/her interpretation. This is especially true in the case of emotional experience that does not belong to the work itself but to the reader. Categorizing the author's intention is also problematic because it is difficult, if not impossible, to determine from the work of art what the author's intention was. In addition, as Wellek and Warren already mentioned, the author can misinterpret his or her own intention: "It happens to all of us that we misinterpret or do not fully understand what we have written some time ago" (Wellek and Warren 1980, 148). Furthermore, in order to define the author's intention, we would have to ask the author him/herself, which would be very difficult, time-consuming and in many cases completely impossible.

Andersson and Holst modified Pejtersen's classification in their study, which was based on interviews of 100 users in two Swedish public libraries; they then analysed the descriptions of the novels' plots and compared them with the library's indexes (Andersson and Holst 1996, 88). Their model included the following categories: phenomena, the frame and the author's intention.

Andersson and Holst have added some important aspects to Pejtersen's categories that belong to fictional communication e.g., a borrowed motif, a subtler analysis of the phenomena of fictional works and a category related to modifications as well as additions to the author's intention, in which they have used a more neutral concept of message complemented with the reader's experience.

It is interesting to note that the above categories do not include a fundamental aspect of the work of art: its aesthetic and/or moral value. Of course, one reason is that valuing is usually very subjective and thus fits poorly with the traditional neutral approach of indexing and classifying works. On the other hand, when the valuing of a work of art is omitted, one aspect of an aesthetic object, and perhaps the most important one, is ignored. It also seems that users do want valuing of works of art. This can be observed in many forms, e.g., in marketing, criticism or the knowledge that a book has been a candidate for a prestigious literary award or prize.

It can also be seen that the aspects to be indexed/classified are mostly limited to those that are as objective (denotative) as possible. Pejtersen as well as Andersson and Holst have added a few mutable/fuzzy categories that are based upon readers' experiences. Nonetheless, some aspects are totally missing from the categories mentioned above, i.e. the history of different interpretations of a work of art as well as its position in the literary-historical continuity. In some cases, this aspect could be interesting and enlightening. In this respect, the author and his/her role are secondary in the above categories. On the other hand, this reveals that we must make clear definitions about what aspects are worth indexing in fictional works. In addition, it indicates that the systems for indexing fiction are clearly dependent on the environment for which they have been created.

We can see from the schemes described above that the traditional type of fiction indexing is mainly based on factual aspects. According to Nielsen (1997), these should be extended to incorporate aspects of thematical factors, as well as the features of the narrational structures. This is needed because in modern and post-modern fiction, the main point is how it is told and not what is told. The third aspect that Nielsen emphasises as one way of improving fiction indexing would be the inclusion of both cultural and historical facts that have affected the work, e.g. artistic schools and cultural periods (see also Negrini and Adamo 1996 where there is a more precise analysis of the literature domain).

For the classification of fiction, the different literary genres have often been used as a basis (see further Rafferty 2012). In this respect, a genre means literally a kind or a class. However, as Chandler (1997, 1) stated, the concept of genre is problematic in several ways. The concept of genre is often used in a biological way, i.e. in biology a genre can be thought of as a → genealogically defined species whereas in literature, genres are continually being re-defined.

There also seem to be different layers in genre definition. In fiction, the broadest genres are poetry, prose and drama and their consequent subdivisions. This classical definition can be seen in the traditional → classification schemes. When using specific genres as a basis for classification, one has to bear in mind that: “The classification and hierarchical taxonomy is not a neutral and 'objective' procedure. There are no undisputed 'maps' of the system of genres within any medium (though literature may perhaps lay some claim to a loose consensus). Furthermore, there is often a considerable theoretical disagreement about the definition of specific genres” (Chandler 1997, 1). For an example of the complexity of fiction genres, see Appendix 1.

4. Classification and indexing of fiction

Because of the nature of fiction, it has proved very difficult to separate the indexing from the classification of fiction: there are several significant facets to be considered in the indexing, and classification schemes thus become multi-faceted. In fact, some classification schemes use keywords as class notations.

One major feature of fiction indexing and classification studies has been the problem of identifying those aspects that are worth indexing and/or classifying in individual works. Traditionally, the general classification systems have utilized a literary basis (specifically genre), the year of publication (sometimes with the reference to an epoch) and the country of publication and/or the writer (sometimes with a reference to cultural regions). Some classification schemes have later expanded to include certain specific classes of subject matter. These have remained the basic foundations of the main classification systems (see e.g. Beghtol 1989; 1990). The literary genre, time of publication and geographical region are useful bases for classification. They can be considered to belong to the tradition of historical linguistics used for classifying languages and their literature. They can also be viewed as providing an objective basis for the classification. However, these classification systems leave the idea of describing the → subject content of fiction — what the fiction is about — untouched (see also Bierbaum 1995, 390).

The studies on the classification of fiction can be divided into two categories — those that discuss the shelf classification of fiction and those that believe that the classification should be a means to provide a content description of fiction.

Fiction classification studies have constantly emphasised the fact that the content description of fiction will necessarily be multi-faceted. Thus, Beghtol claimed in her study examining the different fiction classification schemes: "Characters, Events, Spaces and Times may be taken as fundamental data categories for fiction" (Beghtol 1994a, 157). Pejtersen made the same kind of claim in an empirical study on the basic aspects that patrons use while searching fiction for themselves (Pejtersen and Austin 1983; 1984). Pejtersen's studies imply also that indexing and classification — especially with respect to fiction — are merging into more holistic schemes where classes are described by indexing terms and vice versa. User-friendly systems such as Pejtersen's BookHouse (Pejtersen 1989), have adopted this type of classification with indexing terms as class notations.

Previous studies on fiction indexing can be divided into two categories; the first consists of those that discuss fiction indexing and the principles behind it at a general level. The second category includes those that deal with the creation of book indexes. The studies on book indexes have been mostly carried out in Anglo-American cultures which have a long tradition of book indexing, but some work has been done in the Nordic countries, especially in Denmark.

These studies have discussed the management of the complexity of fiction in indexing, as well as the concept of → aboutness in fiction retrieval (Andersson and Holst 1996; Beghtol 1992; Bell 1991; Pulli 1992; Ranta 1991; Moraes 2012). There are also publications with some similarities to these studies that have discussed the possibilities of creating AI systems for fiction, because those systems are basically built upon indexes (Rich 1979; 1986). Furthermore, there are several reports describing experiments of fiction indexing in various libraries (e.g., MacPherson 1987, who examined the creation of children's literature indexes in a school environment).

5. Classification practices and principles for fiction

All the major classification schemes used in libraries have included fiction. UDC and Dewey use the genre and language/culture of fictional works when dividing works into subclasses. The following subdivision is an example from the Dewey system:

820   English & Old English literatures
821       English poetry
822       English drama
823       English fiction
824       English essays
825       English speeches
826       English letters
827       English humour and satire
828       English miscellaneous writings
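
In the listing above, the final digit of each notation distinguishes the literary form within the 820 class. The following Python sketch shows one way such a listing could be held as a lookup table and read hierarchically. It is a minimal illustration built only from the nine classes listed here; the dictionary and function names are hypothetical and not part of the Dewey system.

```python
# Illustrative lookup containing only the nine classes listed above.
DEWEY_820S = {
    "820": "English & Old English literatures",
    "821": "English poetry",
    "822": "English drama",
    "823": "English fiction",
    "824": "English essays",
    "825": "English speeches",
    "826": "English letters",
    "827": "English humour and satire",
    "828": "English miscellaneous writings",
}


def describe(notation):
    """Return the caption path for a notation from the listing above."""
    if notation not in DEWEY_820S:
        raise KeyError(f"{notation} is not in the example listing")
    parent = notation[:2] + "0"  # e.g. 823 -> 820
    if parent == notation:
        return DEWEY_820S[notation]
    return f"{DEWEY_820S[parent]} > {DEWEY_820S[notation]}"
```

For example, `describe("823")` combines the caption of the parent class 820 with that of the subclass 823.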

Because of the analytical-synthetic and multi-faceted nature of the UDC, one can also apply a special auxiliary subdivision for literary forms, genres, techniques and different languages. The → Colon Classification is rather like the UDC applying the following facets for fiction: language, form, author, work (Satija 2017).

These main classification schemes have been utilised as a basis for the shelf classification of fiction, which has been an important aspect in developing the classification of fiction. The shelf classification of fiction has the longest tradition in Anglo-American libraries. The classes used have mainly been recreational and popular fiction genres, e.g. thrillers, horror, romances. The reason for using these genres is very clear: recreational genres are used in advertising, these books are often published in series, and they are usually written in the form of a certain genre which is targeted at certain readers. The rules of reading and writing generic fiction are very clear in recreational fiction. On the other hand, there are various heterogeneous sets of genre classifications, especially for the printed stock, and these are used in both libraries and bookstores.

Historically, we can distinguish three different ways of developing a shelf classification of fiction. The oldest and most widely used system is to separate a few well-known genres from the rest of the fiction stock. Usually these genres are also the most popular among the users of the library; for example, detective novels are considered as a distinct shelf class in nearly every public library (Harrell 1985, 14; Juntunen and Saarti 1992, 108; Jennings and Sear 1989). The second step in shelf classification is to separate popular fiction from the fiction stock and arrange it according to genres (see e.g. Alternative arrangement 1982, 75-76). Usually here, the most popular genres of fiction are shelved separately, e.g. science fiction, romance, thrillers and detective fiction. (For the definition of these genres, see Trott 2017.)

The third and the most challenging way is to try to classify the entire fiction stock. Two different approaches have been applied; in the first, the whole stock is divided into classes without any distinction made between recreational and serious fiction (see e.g. Burgess 1936; Saarti 1997). In the other model, the fiction stock is initially divided into two main classes — recreational and serious fiction — and then those main classes are divided into subcategories (see e.g. Spiller 1980, 241).

The idea of dividing fiction into classes based on genres has also been utilised in the present commercial and library software used on the Internet. All the major Internet bookshops have developed their own genre-based classifications for fiction. (Wikipedia has a list of 53 “genre” categories for fiction with a total of 528 subcategories; see Appendix 1.) In addition, statistical tools are used to analyse users' preferences in order to recommend new fiction to customers. The users can also create their own recommendation lists that are published. This type of social and statistical knowledge organization is also used in different types of so-called fan fiction sites (Smith 2017).

The major change here is that in a digital environment, the classification is not tied to physical shelves and thus the concept of having a multimodal classification can be realized, i.e. the same fictional work can be in different classes at the same time. This has also enhanced the integration between the indexing and classification of fiction (see e.g. Pawlicki 2017).
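
The difference from physical shelving can be sketched in Python as follows: on a shelf each work has exactly one place, whereas in a digital catalogue a work can carry a set of classes, and an inverted index makes it retrievable under each of them. This is only an illustrative sketch; the function name, class labels and titles are hypothetical placeholders.

```python
from collections import defaultdict


def build_class_index(assignments):
    """Invert a work -> classes mapping into a class -> works mapping."""
    index = defaultdict(set)
    for work, classes in assignments.items():
        for class_label in classes:
            index[class_label].add(work)
    return dict(index)


# Placeholder works; each may belong to several classes at the same time.
assignments = {
    "Work A": {"detective fiction", "historical fiction"},
    "Work B": {"science fiction"},
    "Work C": {"detective fiction"},
}

class_index = build_class_index(assignments)
```

Here "Work A" is found under both "detective fiction" and "historical fiction", which a single shelf location could not express.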

6. Development of fiction thesauri and ontologies

The → thesauri and subject heading lists for fiction started to evolve from the needs of individual libraries and/or because of the initiative of a single individual. Subsequently, these started to expand and recently we have also seen systems operating at the national level. At first, they were mostly simple word-lists or general thesauri/subject heading lists that had been supplemented with terms for fiction. Based on these experiments, the subject heading lists and fiction thesauri have evolved towards unity of indexing and centralised cataloguing services (Pulli 1992). In the Nordic countries, there is an on-going project based on the ideas of the BookHouse concept. Its main objective is to enable the dissemination of the cataloguing data of fiction between the Nordic countries (Pejtersen et al. 1996, 75).

In the United States, the development started at the national level when the American Library Association's Subject Analysis Committee published their Guidelines on Subject Access to Individual Works of Fiction, Drama etc. In the guidelines, the committee recommended that the following aspects should be indexed from fictional works: form/genre, persons, setting and topics. Based on this recommendation and on the 23-page supplementary word list for the Library of Congress Subject Headings, a project was started in 1991, when ten libraries began to index fiction. In addition, Olderr has devised a supplementary list of fiction subject headings which is broader than the LC thesaurus (Young 1992, 89-94; see also Young and Mandelstam 2013). The first edition of Olderr's fiction subject headings was published in 1987 and as a thesaurus in 1991. It includes terms from six different categories: topics, genres, geographical settings, chronological settings, characters and treatment (of the theme). The last category consists of terms that describe the genre of the work more specifically (Olderr 1991, ix-xx). The American Library Association (2000) has also published rules for the subject headings which are intended to ease access to fiction.

In Sweden, the largest thesaurus is Jansson's and Södervall's Tesaurus för indexering av skönlitteratur (Thesaurus for Indexing Fiction), which was published in 1987. It is divided into two parts — systematic and alphabetical — with the former being arranged as a thesaurus. In the systematic part, the terms are divided into three main facets, which are setting (ram), persons (person) and subject (ämne). These are divided into sub-facets so that setting is divided into time (tid) and place (rum); persons are divided into development (utveckling), social relations (sociala relationer) and profession/occupation (yrke/verksamhet) and subjects are divided into ideology (ideologi), action (aktivitet), nature (natur) and human body (människokropp). As stated by the editors, the borders between different facets are not fixed and placing some of the terms only in one facet is based predominantly on the principles of the design of this thesaurus, in which each term can be placed only in one facet (Jansson and Södervall 1987, 4-6). In the Nordic countries, several subject-heading lists have been developed based on the BookHouse concept (see the Pejtersen section above, section 3; see also Eriksson 2005).
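
The facet structure just described, together with the design rule that each term may be placed in only one facet, can be sketched in Python as follows. The facet and sub-facet names are taken from the description above (in English translation); the class name and the sample term are hypothetical and do not come from the thesaurus itself.

```python
# Facets and sub-facets as described above (English translations).
FACETS = {
    "setting": ["time", "place"],
    "persons": ["development", "social relations", "profession/occupation"],
    "subject": ["ideology", "action", "nature", "human body"],
}


class SingleFacetThesaurus:
    """Toy thesaurus enforcing the rule that a term lives in one facet only."""

    def __init__(self, facets):
        self.facets = facets
        self.placement = {}  # term -> (facet, sub_facet)

    def place(self, term, facet, sub_facet):
        if sub_facet not in self.facets.get(facet, []):
            raise ValueError(f"unknown facet/sub-facet: {facet}/{sub_facet}")
        if term in self.placement:
            raise ValueError(f"{term!r} is already placed in {self.placement[term]}")
        self.placement[term] = (facet, sub_facet)
```

Attempting to place the same term a second time raises an error, which mirrors the editors' remark that the placement of borderline terms is a design decision rather than a fixed property of the terms.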

The Swedish Library Association’s Fiction Indexing Committee was inaugurated in 2005. As a result of this Committee’s work, two subject heading lists were produced, i.e. subject headings of fiction for children and subject headings of fiction for adults. The subject headings have a hierarchical and faceted structure: 1. Genre, 2. Date, 3. Setting, 4. Subject, 5. Character and 6. Form. For children’s literature, form and genre are combined as form/genre (Aagaard and Viktorsson 2014, 68).

In Finland, some experiments on indexing fiction were also conducted by Finnish librarians and Finnish book traders before the appearance of the Finnish Thesaurus for Fiction. They all used the Finnish General Thesaurus, but it was very soon recognised that it lacked the appropriate terms for indexing fiction (Pulli 1992, 2-4). Based on the experiences of these pilot projects, as well as those of the Finnish project based on the BookHouse concept, it soon became apparent that there was a need for a centralised indexing service for fiction. This service was needed because the indexing of fiction is laborious and lacked traditions and guidelines (for example, a subject heading list); furthermore, there had been no decision about which thesaurus should be followed.

The Helsinki University Library — also the National Library of Finland — decided together with the BTJ Group Ltd to initiate a project in order to make a subject-heading list for fiction. The editing was started in the fall of 1993, and in addition to deciding who would be the editor, an editorial board was appointed to oversee the project. The subject-heading list was soon changed into the form of a thesaurus in order to match it to the other thesauri published by the Helsinki University Library. The first version was then tested in Finnish public libraries, and finally the first edition of Kaunokki was released in 1996 (in Swedish, Bella 1997).

The principal problem in devising a subject-heading list for fiction was deciding on the structure under which the terms were to be collected and organised. The editorial board of Kaunokki decided that the subject headings should be arranged in the form of a thesaurus and the organisation of the thesaurus should be made to follow the facets mentioned in the previous studies on the classification and indexing of fiction. In addition, an alphabetical index of all the terms used was added to the end of the thesaurus.

The facets used were as follows:

  • Terms that describe fictional genres and their explanations.
  • Terms that describe events, motives and themes.
  • Terms that describe actors.
  • Terms that describe settings.
  • Terms that describe times.
  • Terms that describe other, mostly technical and typographical aspects.

Four of the above-mentioned facets — events, actors, spaces and times — have been mentioned in almost all the previous studies as the main categories being applied for fiction indexing. Thus, Beghtol drew the conclusion that: "Characters, Events, Spaces and Times may be taken as fundamental data categories for fiction" (Beghtol 1994a, 157).

If we compare Beghtol's list to Ranganathan's PMEST facets, as Shatford did in her system for indexing pictures (Shatford 1986, 49), we can see that these categories are very similar to Shatford’s MEST (matter, energy, space, time) facets. In her system, Shatford decided to combine the personality and matter facets into one group, actors, and used the energy facet to refer to what these actors were doing. In Kaunokki, the solution was that terms that describe the genre of the fictional work were considered to correspond to the personality facet. This seems logical because the genre or the kind of literature describes the personality of the work and in fact determines many of the events, spaces and times described in a novel (see, e.g., Wellek and Warren 1980, 226-237; Saarti 1999). The matter facet, on the other hand, corresponds to that of events and motives in Kaunokki and the energy facet to that of actors. By incorporating Ranganathan's Basic Subject (Ranganathan 1969, 200), one could also make a distinction between different types of fictional works.
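The correspondence between the Kaunokki facets and PMEST can be written down as a simple lookup table. The following Python sketch is only an illustration: the mappings for genre, events and actors follow the text above, the mappings for settings and times are inferred from the facet names rather than stated explicitly, and the function name and sample record are hypothetical.

```python
# Kaunokki facet -> PMEST facet, following the correspondence described above.
KAUNOKKI_TO_PMEST = {
    "genre": "personality",
    "events, motives and themes": "matter",
    "actors": "energy",
    "settings": "space",  # assumption: inferred from the name, not stated in the text
    "times": "time",      # assumption: inferred from the name, not stated in the text
}


def to_pmest(record):
    """Regroup index terms keyed by Kaunokki facet under PMEST facet names.

    Facets without a PMEST counterpart (such as "other") are returned
    separately so that no index term is silently dropped.
    """
    mapped, unmapped = {}, {}
    for facet, terms in record.items():
        target = KAUNOKKI_TO_PMEST.get(facet)
        if target is None:
            unmapped[facet] = list(terms)
        else:
            mapped.setdefault(target, []).extend(terms)
    return mapped, unmapped


# Hypothetical indexing record for a single novel.
sample = {
    "genre": ["detective novels"],
    "actors": ["police officers"],
    "settings": ["Helsinki"],
    "other": ["large print"],
}
mapped, unmapped = to_pmest(sample)
```

Here `mapped` groups the terms under personality, energy and space, while the "other" facet ends up in `unmapped` because the comparison above assigns it no PMEST counterpart.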

In the group "other", mainly terms that describe aspects outside the factual text of the work were included, because library users regularly ask about them. These include, for example, the previously mentioned aspects in Pejtersen's accessibility category (Pejtersen and Austin 1983, 234).

When collecting the terms for the thesaurus, it was obvious that the context where the thesaurus is used would play an important role in choosing the right terms and the appropriate depth of the terms being chosen. A concrete example of that was the subject headings for the indexing of juvenile literature. They were included in Kaunokki, although they could as well have been published in a separate special thesaurus. Another problem was considering the environment where the thesaurus would be used. From the very outset, the decision was made that Kaunokki should be suitable for public libraries. For this reason, a great many of the terms that students of literature would consider important aspects of fictional works were omitted from the thesaurus. One solution for this problem would be to create a Thesaurus for Literary Research, which is currently under preparation. There is already an example of this in Italy: Thesaurus di letteratura italiana (Negrini and Zozi 1995, see also Negrini and Adamo 1996; Aschero et al. 1995). In the second edition of Kaunokki (2000), this aspect was incorporated. Kaunokki was also developed in order to make it a thesaurus for the entire spectrum of fiction, i.e. literature, movies, comics etc.

Kaunokki has also been implemented as an ontology-based linked metadata service, and this has been utilized in creating the Finnish BookSampo service for fictional works. BookSampo is a semantic portal encapsulating metadata about practically all Finnish fiction literature available in Finnish public libraries (Mäkelä, Hypén and Hyvönen 2011, 173; Saarti and Hypén 2010).

Branch et al. (2017) emphasize that there is a great need for ontological structures for fiction. This is because of (1) the multi-faceted nature of fiction and (2) the active and broad culture of fan fiction. It seems that there is no structural coherence and consistency between different types of fiction databases, i.e. library, commercial and fan-based environments. The ontology-based approach could help in improving this situation (see also Rafferty 2018 on social tagging).

7. Systematic approach to the fictional knowledge organization

It is apparent that not only the indexing and classification but also the search and retrieval systems for fiction must become multi-faceted in order to meet the diverse needs of different users. Fig. 2 describes a model for a search and retrieval system of fiction (Saarti 2000). It consists of five main blocks (databases) that represent the different actors of the fictional communication system: works of art (texts), their subject indexing and abstracts, history of their reception by readers, history of the writers and cultural history (see e.g. Spiteri and Pecoskie 2016). With the aid of this kind of system, one can document in a holistic manner the different aspects of the meaning of a work of fiction, i.e. what the work of fiction is about.

Figure 2: A broad model for a search and retrieval system for fiction
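The five blocks of the model can be sketched as five linked stores, with one function that gathers everything known about a work. This Python sketch is only an illustration under stated assumptions: the block names follow the description of the model above, whereas the class name, the field layout (for example linking cultural history by publication year) and the search function are hypothetical and are not taken from Saarti (2000).

```python
from dataclasses import dataclass, field


@dataclass
class FictionRetrievalSystem:
    # One store per block of the model; the key and value layouts are assumptions.
    works: dict = field(default_factory=dict)      # work_id -> {"title", "author", "year"}
    indexing: dict = field(default_factory=dict)   # work_id -> {"terms": [...], "abstract": str}
    reception: dict = field(default_factory=dict)  # work_id -> list of reader responses
    writers: dict = field(default_factory=dict)    # author name -> biographical notes
    culture: dict = field(default_factory=dict)    # year -> cultural-historical notes

    def describe(self, work_id):
        """Collect what all five blocks hold about one work."""
        work = self.works[work_id]
        return {
            "work": work,
            "indexing": self.indexing.get(work_id, {}),
            "reception": self.reception.get(work_id, []),
            "writer": self.writers.get(work["author"]),
            "cultural_context": self.culture.get(work["year"]),
        }

    def search(self, term):
        """Return the ids of works whose index terms include the given term."""
        return sorted(
            work_id
            for work_id, entry in self.indexing.items()
            if term in entry.get("terms", [])
        )
```

Keeping the blocks separate lets each be maintained by a different party (for example, indexing by librarians and reception data by readers) while `describe` still presents them together for a single work.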

During the past three decades, we have seen a rapid growth in various types of information systems for works of fiction. Figure 2 is a framework for the various layers of the system’s contents. As discussed earlier, the greatest challenge in the analysis of fictional content is its interpretational character. This means that a user-analysis is of the utmost importance when evaluating the pros and cons of any system.

It seems that the commercial systems are incorporating more content elements and especially more user behaviour-based data into their systems. For example, this can be seen when comparing Amazon books’ user interface (https://www.amazon.com) and WorldCat’s FictionFinder (https://experimental.worldcat.org/xfinder/fictionfinder.html). This multi-faceted use of tools and different types of access points seems to be very useful when searching for fiction. The aesthetic point of view has also given new possibilities for fiction retrieval, as can be seen e.g. in Whichbook.net (https://www.whichbook.net), where the user can utilize factor-based search tools with more interpretational types of data. The third, and maybe the most rapidly evolving, environment consists of the different types of user-motivated information systems, e.g. fan-fiction sites and services that utilize a lot of unstructured fiction content analysis based on the users’ needs (e.g. https://www.fanfiction.net/; Smith 2017).

8. Conclusions

One can conclude from the studies conducted on indexing and abstracting of fictional works that the interpretation of the work of art has a major impact on the content description of the work. This highlights the importance of these tools for librarians and patrons: they should not be so restrictive that they control the content as well as the vocabulary used in the indexing of (fictional) works. Of course, the interpretational aspect of content description is a subject that requires clarification, not only for fictional works but also for scientific material.

Additional studies will be needed in order to improve the indexing and classification of fiction. One important topic is the effect of the environment on indexing and whether the environment impacts on the use of indexes, which is also crucial for understanding the relationship between centralised and local indexing. Furthermore, democratic indexing in different libraries — a model that enables the users to contribute to the indexing — requires more investigation. This could be one model through which we could incorporate the interpretations and opinions of different individuals into our information systems (see Hidderley and Rafferty 1997 and investigations of the development in the search and retrieval systems of the Internet book-stores).

In addition, cultural and functional aspects are important from both the scientific and practical viewpoints. The multicultural point of view is especially interesting with respect to fiction. Centralised indexing services for fiction have been available in several countries for years, and their experiences can be a basis for assessing the benefits and drawbacks of a centralised service.

There is much work to be done in developing better information systems for handling fiction. In fact, at times it seems to be a never-ending task if one wishes to devise more sophisticated and more tailored indexing and classification systems (e.g. see Bartlett and Hughes 2011). The latest technological possibilities have created truly revolutionary tools for fiction retrieval. These have opened new perspectives for totally new types of indexing, e.g. emotional indexing referring to the reader’s experience and promotional tools for fictional literature. For libraries, this will also mean soul-searching, i.e. librarians need to decide what they must concentrate on in this field, what is best left for other actors and, finally, identify areas where co-operation will be most beneficial.


The author would like to thank Birger Hjørland, who served as the initiator and editor for this article, the two anonymous reviewers who provided valuable feedback that increased the value of this article, Mark Ward for bringing the author back to the subject, and especially Dr. Ewen Macdonald for linguistic advice.


Aagaard, Harriet and Viktorsson, Elisabet. 2014. “Subject Headings for Fiction in Sweden: A Cooperative Development”. Cataloging & Classification Quarterly 52, no. 1: 62-68. https://doi.org/10.1080/01639374.2013.855603.

Adkins, Denice and Bossaller, Jenny E. 2007. “Fiction Access Points Across Computer-Mediated Book Information Sources: A Comparison of Online Bookstores, Reader Advisory Databases, And Public Library Catalogs”. Library & Information Science Research 29, no. 3: 354–368.

American Library Association. Subcommittee on the Revision of the Guidelines on Subject Access to Individual Works of Fiction. 2000. Guidelines on Subject Access to Individual Works of Fiction, Drama, etc. 2nd ed. Chicago: American Library Association.

Alternative Arrangement: New Approaches to Public Library Stock. 1982. Edited by Patricia Ainley and Barry Totterdell. London: Association of Assistant Librarians.

Andersson, Rolf and Holst, Erik. 1996. “Indexes And Other Depictions Of Fiction: A New Model For Analysis Empirically Tested”. Swedish Library Research/Svensk biblioteksforskning nos. 2-3: 77-95.

Aschero, Benedetto et al. 1995. “SYSTEMATIFIER: A Guide for The Systematisation Of Italian Literature”. In: Fortschritte in der Wissensorganisation, Band 3: 125-133, Herausgegeben von Norbert Meder et al.

Bartlett, Sarah and Bill Hughes. 2011. “Intertextuality and the Semantic Web: Jane Eyre as a Test Case for Modelling Literary Relationships with Linked Data”. Serials: The Journal for the Serials Community 24, no. 2: 160-165.

Beghtol, Clare. 1986. “Bibliographic Classification Theory and Text Linguistics: Aboutness Analysis, Intertextuality and the Cognitive Act of Classifying Documents”. Journal of Documentation 42, no. 2: 84-113.

Beghtol, Clare. 1989. “Access to Fiction: A Problem in Classification Theory and Practice, Part 1”. International Classification 16, no. 3: 134-140.

Beghtol, Clare. 1990. “Access to Fiction: A Problem in Classification Theory and Practice, Part 2”. International Classification 17, no. 1: 21-27.

Beghtol, Clare. 1992. “Toward A Theory of Fiction Analysis for Information Storage And Retrieval”. In: Classification Research for Knowledge Representation and Organization. Editors N. J. Williamson and M. Hudson. Elsevier: Amsterdam.

Beghtol, Clare. 1994a. The Classification of Fiction: The Development Of A System Based On Theoretical Principles. Metuchen: Scarecrow Press.

Beghtol, Clare. 1994b. “Domain Analysis, Literary Warrant, And Consensus: The Case of Fiction Studies”. Journal of the American Society for Information Science 46, no. 1: 30-44.

Beghtol, Clare. 1997. “Stories: Applications of Narrative Discourse Analysis To Issues In Information Storage and Retrieval”. Knowledge Organization 24, no. 2: 64-71.

Bell, Hazel K. 1991. “Indexing Fiction: a Story of Complexity”. The Indexer 17, no. 4: 251-256.

Bella: Specialtesaurus för Skönlitteratur. 1997. Ansvarig redaktör Jarmo Saarti; översättning: Marita Rajalin, Ringa Sandelin, Ylva Thölix. Helsingfors: BTJ Kirjastopalvelu.

Bierbaum, Esther. 1995. “Book Reviews: The Classification of Fiction: The Development Of A System Based On Theoretical Principles. Metuchen: Scarecrow Press. Clare Beghtol”. Journal of the American Society for Information Science 46, no. 5: 389–390.

Birdi, Briony and Ford, Nigel. 2017. “Towards a New Sociological Model of Fiction Reading”. Journal of the Association for Information Science and Technology 69, no. 11 (2018): 1291–1303.

Branch, Frank et al. 2017. “Representing Transmedia Fictional Worlds Through Ontology”. Journal of the Association for Information Science and Technology 68, no. 12: 2771-82.

Burgess, L. A. 1936. “A System for The Classification and Evaluation Of Fiction”. The Library World 38: 179-182.

Chandler, Daniel. 1997. An Introduction to Genre Theory. [Digital resource.] http://visual-memory.co.uk/daniel/Documents/intgenre/intgenre1.html. Last visited 2019-02-25.

Collins, Jim. 2010. Bring on the Books for Everybody: How Literary Culture Became Popular Culture. Durham (NC): Duke University Press.

Eriksson, Rune. 2005. “Skønlitteraturen i Danbib: Klassifikation, Indeksering, Noter”. Dansk Biblioteksforskning 1, no. 3: 7–20.

Eriksson, Rune. 2010. Klassifikation og Indeksering af Skønlitteratur: Et Teoretisk og Historisk Perspektiv. Copenhagen: Royal School of Library and Information Science. https://curis.ku.dk/ws/files/47028127/Eriksson_phd_2010.pdf.

Green, Rebecca. 1997. “The Role of Relational Structures in Indexing For The Humanities”. Information Services & Use 17, nos. 2-3: 85-100.

Harrell, Gail. 1985. “The Classification and Organization of Adult Fiction in Large American Public Libraries”. Public Libraries, 24, no. 1: 13-14.

Hidderley, Rob and Rafferty, Pauline. 1997. “Democratic Indexing: An Approach to The Retrieval of Fiction”. Information Services & Use 17, nos. 2-3: 101-109.

Jansson, Eiler and Södervall, Bo. 1987. Tesaurus för indexering av skönlitteratur. Högskolan i Borås. Institutionen Bibliotekshögskolan. Specialarbete, 1987:7. Borås: Högskolan i Borås.

Jennings, Barbara and Sear, Lyn. 1989. “Novel Ideas: A Browsing Area for Fiction”. Public Library Journal 4, no. 3: 41-44.

Juntunen, Arja and Saarti, Jarmo. 1992. Kaunokirjallisuuden sisällönkuvailu yleisissä kirjastoissa. Kirjastotieteen- ja informatiikan pro gradu -tutkielma. Oulu: Oulun yliopisto.

Kaunokki: fiktiivisen aineiston asiasanasto. 2000. Ed. Jarmo Saarti. Helsinki: BTJ Kirjastopalvelu.

Kaunokki: kaunokirjallisuuden asiasanasto. 1996. Ed. Jarmo Saarti. Helsinki: BTJ Kirjastopalvelu.

MacPherson, Ruby. 1987. “Children's Literature Indexes at Moray House”. Library Review 4: 254-260.

Mäkelä, Eetu, Hypén, Kaisa and Hyvönen, Eero. 2011. ”BookSampo—Lessons Learned in Creating a Semantic Portal for Fiction Literature”. In: L. Aroyo et al. (Eds): ISWC 2011, Part II, LNCS 7032, pp. 173–188. Berlin: Springer-Verlag.

Martens, Gunter. 1975. “Textstrukturen aus rezeptionsästhetischer Sicht: Perspektiven einer Textästhetik auf der Grundlage des Prager Strukturalismus”. In: Literarische Rezeption: Beiträge zur Theorie des Text-Leser-Verhältnisses und seiner empirischen Erforschung. Herausgegeben und eingeleitet von Hartmut Heuerman and Peter Hühn and Brigitte Rötger. Ferdinand Schöningh, Paderborn. (ISL, 4): 23-49.

Mikkonen, Anna and Vakkari, Pertti. 2016. “Finding Fiction: Search Moves and Success in Two Online Catalogs”. Library & Information Science Research 38, no. 1: 60–68. http://dx.doi.org/10.1016/j.lisr.2016.01.006.

Moraes, João. 2012. “Aboutness in Fiction: Methodological Perspectives for Knowledge Organization”. Advances in Knowledge Organization, no. 13: 242-248.

Negrini, Giliola and Adamo, Giovanni. 1996. “The Evolution of a Concept System: Reflections on Case Studies Of Scientific Research, Italian Literature And Humanities Computing”. In: Advances in Knowledge Organization, vol. 5: 275-283, ed. by Rebecca Green.

Negrini, Giliola et al. 1995. Thesaurus di letteratura italiana. Note di bibliografia e di documentazione scientifica, LIX. Roma: C.N.R.

Nielsen, Hans Jørn. 1997. “The Nature of Fiction and Its Significance for Classification and Indexing”. Information Services & Use 17, nos. 2-3: 171-181.

Olderr, Steven. 1991. Olderr's Fiction Subject Headings: A Supplement and Guide to The LC Thesaurus. Chicago: ALA.

Pawlicki, Kamil. 2017. Genre Theory Applied: Genre and Form Terms in the National Library of Poland Catalogue. Paper presented at: IFLA WLIC 2017–Wroclaw, Poland–Libraries. Solidarity. Society. in Session 98: Bibliography and Subject Analysis. Available from: http://library.ifla.org/id/eprint/1644.

Pejtersen, Annelise Mark and Albrechtsen, Hanne and Lundgren, Lena and Sandelin, Ringa and Valtonen, Riitta. 1996. Subject Access to Scandinavian Fiction Literature: Index Methods and OPAC Development. Copenhagen: Nordic Council of Ministers. TemaNord: Culture, 608.

Pejtersen, Annelise Mark and Austin, Jutta. 1983. “Fiction Retrieval: Experimental Design and Evaluation of a Search System Based on Users' Value Criteria: Part 1”. Journal of Documentation 39, no. 4: 230-246.

Pejtersen, Annelise Mark and Austin, Jutta. 1984. “Fiction Retrieval: Experimental Design and Evaluation of a Search System Based On Users' Value Criteria: Part 2”. Journal of Documentation 40, no. 1: 25-35.

Pejtersen, Annelise Mark. 1989. “A Library System for Information Retrieval based on a Cognitive Task Analysis and Supported by an Icon-Based Interface”. ACM SIGIR Forum, Special issue: Proceedings of the 12th annual international ACM SIGIR conference on Research and development in information retrieval, N.J. Belkin and C.J. van Rijsbergen (Eds). 23, June 1989: 40-47. doi:10.1145/75335.75340.

Pulli, Riitta. 1992. Kaunokirjallisuuden keskitetty indeksointi Suomessa. Helsinki: Helsingin yliopiston kirjasto.

Rafferty, Pauline. 2012. “Epistemology, Literary Genre and Knowledge Organisation Systems”, paper presented at Actas del X Congreso de ISKO-España, Ferrol, 20 June-1 July 2011, 553-565. Available at: https://www.researchgate.net/publication/273120468_Epistemology_literary_genre_and_knowledge_organisation_systems

Rafferty, Pauline. 2018. “Tagging”. Knowledge Organization 45, no. 6: 500-516. Also available in ISKO Encyclopedia of Knowledge Organization, eds. Birger Hjørland and Claudio Gnoli. http://www.isko.org/cyclo/tagging.

Ranganathan, S. R. 1969. “Colon Classification Edition 7 (1971): A Preview”. Library Science with a Slant to Documentation 6, no. 3: 193-242.

Ranta, Judith A. 1991. “The New Literary Scholarship and a Basis For Increased Subject Catalog Access to Imaginative Literature”. Cataloging & Classification Quarterly 14, no. 1: 3-27.

Rich, Elaine. 1979. “User Modeling Via Stereotypes”. Cognitive Science no. 3: 329-354.

Rich, Elaine. 1986. “Users as Individuals: Individualizing User Models”. In: Intelligent Information Systems: Progress and Prospects. R. Davies (editor). Chichester: Ellis Horwood.

Ross, Catherine S. 2001. “Making Choices: What Readers Say About Choosing Books To Read For Pleasure”. The Acquisition Librarian 13, no. 25: 5-21.

Ross, Catherine Sheldrick, McKechnie, Lynne and Rothbauer, Paulette M. 2018. Reading Still Matters: What the Research Reveals about Reading, Libraries, and Community. Santa Barbara: ABC-CLIO.

Saarinen, Katariina and Vakkari, Pertti. 2013. "A Sign of a Good Book: Readers’ Methods of Accessing Fiction in The Public Library". Journal of Documentation 69, no. 5: 736-754. https://doi.org/10.1108/JD-04-2012-0041

Saarti, Jarmo. 1997. “Feeding with The Spoon, Or the Effects of Shelf Classification of Fiction on the Loaning Of Fiction”. Information Services & Use 17, nos. 2-3: 159-169.

Saarti, Jarmo. 1999. “Fiction Indexing and The Development Of Fiction Thesauri”. Journal of Librarianship and Information Science 31, no. 2: 85-92.

Saarti, Jarmo. 2000. “Taxonomy of Novel Abstracts Based on Empirical Findings”. Knowledge Organization 27, no. 4: 213-220.

Saarti, Jarmo. 2002. “Consistency of Subject Indexing of Novels by Public Library Professionals and Patrons”. Journal of Documentation 58, no. 1: 49-65.

Saarti, Jarmo and Hypén, Kaisa. 2010. “From Thesaurus to Ontology: The Development of the Kaunokki Finnish Fiction Thesaurus”. The Indexer 28, no. 9: 50–58.

Satija, M.P. 2017. “Colon Classification (CC)”. Knowledge Organization 44, no. 4: 291-307. Also available in Hjørland, Birger and Gnoli, Claudio eds. ISKO Encyclopedia of Knowledge Organization, http://www.isko.org/cyclo/colon_classification.

Segers, Rien T. 1985. Kirja ja lukija: johdatusta kirjallisuudentutkimuksen uuteen suuntaukseen. (Original title: Het lezen van literatuur: een inleiding tot een nieuwe literatuurbenadering). Translated by: Lili Ahonen. Helsinki: SKS. Tietolipas, 97.

Shatford, Sara. 1986. “Analyzing the Subject of a Picture: A Theoretical Approach”. Cataloging & Classification Quarterly 6, no. 3: 39-62.

Smith, Joanna. 2017. “The Ultimate Guide to Fanfiction and Fanfiction Sites”. Medium. https://medium.com/@joannasmith008/fanfiction-428029544a12. Last visited 2019-02-25.

Spiller, David. 1980. “The Provision Of Fiction For Public Libraries”. Journal of Librarianship 12, no. 4: 238-266.

Spiteri, Louise and Pecoskie, Jen. 2016. “In the Readers’ Own Words: How User Content in the Catalog Can Enhance Readers’ Advisory Services”. Reference & User Services Quarterly 56, no. 2: 91–95.

Trott, Barry. 2017. “Popular Literature Genres”. Encyclopedia of Library and Information Sciences, 4th. ed. Abingdon: Taylor & Francis. DOI: 10.1081/E-ELIS4-120043671.

Vernitski, Anat. 2007. “Developing an Intertextuality-Oriented Fiction Classification”. Journal of Librarianship and Information Science 39, no. 1: 41–52. DOI: 10.1177/0961000607074814

Ward, Mark and Saarti, Jarmo. 2018. “Reviewing, Rebutting, and Reimagining Fiction Classification”. Cataloging & Classification Quarterly 56, no. 4: 317-329. DOI: 10.1080/01639374.2017.1411414

Wellek, René and Warren, Austin. 1980. Theory of literature. (Rep) Harmondsworth: Penguin.

Wilson, Mary D., Spillane, Jodi L., Cook, Colleen and Highsmith, Anne L. 2000. “The Relationship Between Subject Headings for Works of Fiction and Circulation in an Academic Library”. Library Collections, Acquisitions, and Technical Services 24, no. 4: 459-465. https://doi.org/10.1016/S1464-9055(00)00156-1

Young, J. Bradford. 1992. “Olderr's Fiction Subject Headings: A Supplement and Guide to the LC Thesaurus. (Book review)” Cataloging & Classification Quarterly 15, no. 1: 89-94.

Young, Janis and Mandelstam, Yael. 2013. “It Takes a Village: Developing Library of Congress Genre/Form Terms”. Cataloging & Classification Quarterly 51, no. 1-3: 6-24, DOI: 10.1080/01639374.2012.715117

Appendix 1

Category:Fiction by genre. From Wikipedia, the free encyclopedia. C refers to the number of subcategories (Science fiction, for example, has 21 subcategories; total = 528 categories); P refers to the number of Wikipedia pages in the category.

Fictional characters by genre (17 C)
Fiction writers by genre (21 C)
Absurdist fiction (2 C, 61 P)
Adventure fiction (19 C, 27 P)
Children's literature (21 C, 28 P)
Christian fiction (7 C, 12 P)
Christianity in fiction (8 C, 10 P)
Coming-of-age fiction (6 C, 41 P)
Crossover fiction (12 C, 26 P)
Fiction narrated by a dead person (1 C, 66 P)
Dystopian fiction (15 C, 36 P)
Environmental fiction books (1 C, 77 P)
Erotic fiction (8 C, 7 P)
Family saga (1 C, 6 P)
Fantasy (21 C, 6 P)
Feminist fiction (4 C, 24 P)
Fiction with unreliable narrators (2 C, 258 P)
Ghost stories (6 C, 30 P)
Historical fiction (17 C, 51 P, 2 F)
Horror fiction (25 C, 50 P)
Islam in fiction (4 C, 31 P)
Islamic fiction (2 C, 2 P)
LGBT fiction (10 C)
Men's fiction (1 C)
Metafiction (4 C, 11 P)
Military fiction (8 C, 26 P)
Mockumentaries (3 C, 17 P)
Motorcycling in fiction (5 C, 5 P)
Mystery fiction (22 C, 43 P)
Mythopoeia (2 C, 12 P)
Novels by genre (72 C, 2 P)
Occult detective fiction (8 C, 31 P)
Overpopulation fiction (43 P)
Parallel literature (1 C, 32 P)
Penny dreadfuls (5 P)
Philosophical fiction (4 C, 11 P)
Political fiction (10 C, 14 P)
Psychological fiction (8 C, 10 P)
Pulp fiction (10 C, 25 P)
Rapid human age change in fiction (16 P)
Rapid human growth change in fiction (4 P)
Fiction about religion (30 C, 19 P)
Romantic fiction (14 C, 15 P)
Science fiction (21 C, 7 P)
Speculative fiction (39 C, 33 P)
Spy fiction (20 C, 5 P)
Thrillers (16 C, 21 P)
Urban fiction (19 P)
Utopian fiction (3 C, 30 P)
Western (genre) (20 C, 15 P)
Women's fiction (2 C, 9 P)
Wuxia (8 C, 5 P)
Young adult fiction (4 C, 53 P)

Pages in category "Fiction by genre" (This list may not reflect recent changes)

Atomic bomb literature
Authoritarian literature
Bizarro fiction
Caper story
Cell phone novel
Comic novel
Conspiracy fiction
Existentialist fiction
Exploitation fiction
Fragmentary novel
Hysterical realism
I Novel
Invasion literature
Musical fiction
New adult fiction
Northern (genre)
Urban fiction
Western (genre)
Young adult fiction
Young adult romance literature


Version 1.0, published 2019-03-26
Article category: KO in specific domains

©2019 ISKO. All rights reserved.