RILM Index (Répertoire international de littérature musicale)

Table of contents:
1. General introduction
2. Introduction to RILM and RILM Index
    2.1 Répertoire international de littérature musicale (RILM): 2.1.1 RILM as an organization; 2.1.2 RILM databases
    2.2 RILM Index: 2.2.1 Overview of RILM Index; 2.2.2 Index terms: topics vs. terms; 2.2.3 Basic indexing rules; 2.2.4 RILM Index history; 2.2.5 Comparisons between RILM Index and other KOSs for music
3. Issues in RILM Index
    3.1 Polysemes without disambiguation
    3.2 Concepts with multiple term representations
    3.3 Redundant index terms
    3.4 Inconsistent index terms
    3.5 Lack of control
    3.6 Eurocentrism
    3.7 Quality issues in RILM Index
4. RILM Music Thesaurus: a possible solution
5. Conclusion

RILM Index is a partially controlled vocabulary, created and curated by Répertoire international de littérature musicale (RILM), that is designed to index scholarly writings on music and related subjects. It has been developed over more than 50 years and has served the music community as a primary research tool. This analytical review of the characteristics of RILM Index reveals several issues, related to the Index’s history, that impinge on its usefulness. An in-progress thesaurus is presented as a possible solution to these issues. RILM Index, despite being imperfect, provides a foundation for developing an ontological structure for both indexing and information retrieval purposes.

1. General introduction

This article describes the structure, history, and issues of the RILM Index and discusses its future prospects. As one of the most complicated and richest subjects, music embraces different genres, cultures, ethnicities, and traditions, thus creating great challenges for libraries, bibliographic databases, and search engines to index music literature. The source documents of music indexing exist in three forms: music literature, musical works (e.g., scores in the form of sheet music, broadsheets), and recordings. There is a lack of standard controlled vocabularies for literature on music in the library community, and there has been limited work addressing this problem within the musicology community. The Library of Congress Subject Headings (LCSH) is one of the most widely used subject indexing tools in libraries. LCSH has main headings to represent a broad range of subjects, including music theories, musical instruments, voices, ensembles, compositions, performance, criticism, musicians, composers, and performers (Broughton 2012). To differentiate works about music, scores, and recordings, LCSH applies free-floating subdivisions and some conventions. However, LCSH was originally developed for the subject cataloging of Library of Congress collections and primarily serves the library community. Headings are usually assigned at the work-as-a-whole level rather than to individual parts of a work (e.g., individual articles of a journal). LCSH does not provide the specificity, granularity, or diversity required for indexing writings on music from non-Western countries, cultures, and traditions (Fan and Wu 2018).

2. Introduction to RILM and RILM Index

2.1 Répertoire international de littérature musicale (RILM)

2.1.1 RILM as an organization

Established in 1966, Répertoire international de littérature musicale (RILM) is an international music bibliography project co-sponsored by the International Association of Music Libraries, Archives, and Documentation Centres (IAML), the International Musicological Society (IMS), and the International Council for Traditional Music (ICTM). It was founded by the musicologist Barry S. Brook and is one of the four major international bibliographical series [1] for music scholars and librarians (Mackenzie 2007). RILM is “committed to representing the world’s knowledge about all musical traditions, and to making this knowledge accessible to research and performance communities worldwide via digital collections and advanced tools” (RILM n.d.).

RILM’s global network includes 41 national committees, the International Center housed at the Graduate Center of the City University of New York, and a Commission internationale mixte. The national committees consist of musicologists and librarians based at major universities, national libraries, and research institutes all over the world. They are responsible for collecting significant writings on music published in their respective countries or regions and creating descriptive bibliographic records and abstracts in both original languages and English. RILM editors, technology experts, and administrators at the International Center are responsible for compiling, editing, and publishing these bibliographic records (Brook 1969; Brook and Schiødt 1969). RILM editors, in particular, are music scholars who are also experts in other academic fields such as linguistics, biology, library and information science, and pedagogy. As much as their other responsibilities allow, the International Center staff also collect, edit, and index publications from countries and regions that do not have RILM national committees yet. RILM has been making efforts to reach out to these communities, especially those in Asia, Africa, and South America (Mackenzie 2008). The Commission internationale mixte, consisting of four representatives from each of the three sponsoring organizations, serves as RILM’s advisory board.

2.1.2 RILM databases

For over half a century RILM produced a single product, RILM Abstracts of Music Literature (hereafter called RILM Abstracts). In 1967 RILM Abstracts started as a printed triannual publication, with abstracts and an index published separately, and it became an exclusively online resource in 2000 (Blažekovic 2014). It collects scholarly writings on music and related subjects in all languages and includes all types of publications, such as scholarly journals, monographs, dissertations, conference proceedings, recording notes, sound recordings, performance programs, and commentary on editions of musical works. Articles on music and related subjects published in non-music journals are also included in RILM Abstracts (Green 2000). Currently RILM Abstracts is published and distributed by EBSCOhost (Figure 1). Due to its broad disciplinary and linguistic coverage, high-quality abstracts, and detailed descriptive index, RILM Abstracts is widely considered to be one of music scholars’ most important research tools (Arnold et al. 2004; Schuursma 1986; Spivacke 1968).

Over the last several years, RILM has launched several new products, including RILM Music Encyclopedias (RME) in 2015, Musik in Geschichte und Gegenwart (MGG) Online in 2016, and RILM Abstracts with Full Text (RAFT), also in 2016. In mid-2019, Index to Printed Music (IPM) became available as yet another addition to the RILM suite of music reference works. Among these new developments, RAFT can be viewed as an extension of RILM Abstracts, since it includes the full text of over 200 core music journals in addition to all the bibliographic records found in RILM Abstracts. RAFT is also published on the EBSCO platform.

2.2 RILM Index

RILM Index, which includes index terms and a set of indexing rules, is a partially controlled vocabulary developed and maintained by RILM. Index terms can be one-word nouns, compound terms, phrases, and numbers such as years. Publications included in two of RILM's databases, RILM Abstracts and RAFT, serve as the → literary warrant (Barité 2017) for RILM Index. Although RILM’s other products (RME, MGG Online, and IPM) do not use RILM Index directly, the topics covered in those databases have been used to develop the RILM Music Thesaurus (see Section 4), an ongoing project aiming to improve the representation and organization of subjects covered in music literature. After an overview of RILM Index and a brief introduction to its history and development, this article discusses its issues and limitations. The RILM Music Thesaurus is introduced at the end of this article as a solution to the current issues in RILM Index and as a reference tool for a broader audience.

2.2.1 Overview of RILM Index

As mentioned above, RILM Index is a compilation of all the index terms that have been assigned to the publications in RILM Abstracts and RAFT. Both databases are housed and maintained in a Web-based editorial system called IBIS developed by RILM in the early 2010s. IBIS is a relational database used for creating and maintaining bibliographic citations, abstracts, RILM Index, and authority records. In-house tools for developing the RILM Music Thesaurus are also being created in IBIS.

RILM’s index terms include both headwords and non-headwords. Headwords consist of topical terms, instrument families, personal names, family names, and names of countries, continents, and supranational regions. Headwords are somewhat similar to the main headings of LCSH and the Sears List of Subject Headings in terms of usage, but are different in categories and structure. Unlike LCSH and the Sears List, RILM’s headwords do not include corporate name, meeting name, and uniform title headings, but due to RILM’s subject specialty, they do include instruments and instrument families. Some of RILM’s headwords contain a subfield to refine the meaning or indicate the treatment of those headwords (e.g., performers--piano, therapy--music therapy, popular music--by place).

All the other index terms used after headwords in RILM Index are called non-headwords. While LCSH and the Sears List have only four types of subdivisions (i.e., topical, form, geographic, and chronological), RILM Index has 12 categories of non-headwords (i.e., topical, persons and families, organizations, schools, musical instruments and instrument families, ethnic groups, geographic places, musical works, literary works, periodicals, music manuscripts, and visual art works). RILM editors decide collectively whether to create, modify, or terminate headwords, which can be a lengthy process. For example, instrument family headwords were first developed by RILM editors in 1984 with reference to existing classification systems such as the → Hornbostel-Sachs system (Hornbostel and Sachs 1914; Lee 2020; Montagu 2009). Since then RILM editors have modified and expanded these headwords to include new musical instruments and to reflect new trends in organological research. Editorial discussions and decisions about headwords are documented in the RILM Manual, an internal wiki site. They are also included in annual reports that summarize important activities of RILM and are published in the IAML journal, Fontes artis musicae. By contrast, RILM editors have more flexibility to create and modify non-headwords independently on a regular basis.

Each index term in RILM Index has a unique identifier called a term ID, which is automatically generated by IBIS and assigned to each index term upon creation. As of July 2020, RILM Index had 1,418,106 unique index terms in use. Regardless of the original languages of publications, English is the designated language for RILM’s index terms. Terms in foreign languages or transliterations of foreign-language terms (e.g., Gebrauchsmusik, yuehu) are used as index terms only when the concepts cannot be translated into English or the foreign terms convey specific cultural/historical meanings.

RILM’s index terms are classified into 17 categories, each of which has a designated tag (Table 1). In theory, each index term must be validated by RILM editors based on inclusion in authoritative reference sources. The majority of index terms in most categories have been validated, except those in the N (names of individuals and families) and P (titles of individual periodicals) categories. RILM’s indexing rules require editors to index personal names and journal titles if they appear in the title and/or abstract of a publication. It is often difficult for RILM editors to obtain all the information needed to verify a personal name or journal title, especially when they do not have access to the publication where the name or title is mentioned. This is especially problematic for foreign and old publications. Around 51% of RILM’s index terms, called null terms, are under maintenance and do not have any assigned categories. The large number of null terms explains why RILM Index is considered to be a partially controlled vocabulary.

Table 1: Validation status of RILM Index terms by category
Tag | Category | Count of unique RILM index terms | Count of validated RILM index terms | % of validated RILM index terms
N | Names of individual persons or families | 561,503 | 202,400 | 36%
K | Names of individual persons in IPM | 13,905 | 12,535 | 90%
W | Titles of individual musical works | 37,652 | 29,874 | 79%
L | Titles of individual literary works | 7,201 | 6,694 | 93%
I | Musical instruments and instrument families | 2,974 | 2,856 | 96%
V | Titles of individual visual art works | 228 | 206 | 90%
O | Names of individual organizations | 33,221 | 30,531 | 92%
R | Titles of music manuscripts | 773 | 686 | 89%
S | Names of individual schools | 3,253 | 3,413 | 95%
F | Titles of films, TV shows, and music videos | 1,277 | 1,261 | 99%
P | Titles of individual periodicals | 2,734 | 1,449 | 53%
G | Names of individual geographic locations | 10,254 | 9,692 | 95%
E | Ethnic groups | 1,681 | 1,462 | 87%
T | Topical terms (headwords) | 1,475 | 1,459 | 99%
M | Topical terms (in the margin) | 71 | 64 | 90%
Z | Topical terms (narrower terms of specific topical headwords) | 1,669 | 1,660 | 99%
D | Dictionary terms (non-headword topical terms other than Z and M terms) | 13,595 | 12,507 | 92%
null | Terms under maintenance | 724,640 | 17 | 0%
Total | | 1,418,106 | 318,766 | 22%
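
The two headline percentages quoted in this section (about 51% null terms and 22% validated terms overall) can be checked directly against the Total and null rows of Table 1. The following is a minimal Python sketch; the variable and function names are illustrative only.

```python
# Figures copied from the "Total" and "null" rows of Table 1.
TOTAL_UNIQUE = 1_418_106
TOTAL_VALIDATED = 318_766
NULL_TERMS = 724_640


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


null_share = share(NULL_TERMS, TOTAL_UNIQUE)
validated_share = share(TOTAL_VALIDATED, TOTAL_UNIQUE)

print(f"null terms: {null_share}% of all index terms")
print(f"validated terms: {validated_share}% of all index terms")
```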

Each bibliographic record (Figure 1) included in RILM Abstracts and RAFT consists of descriptive bibliographic information about the publication (e.g., title in original language and English translation provided by a RILM editor, publication date, author names in original language and transliteration provided by a RILM editor, language), subjects, and an abstract (provided by a RILM editor if neither the author nor the committee has provided one) along with its English translation (provided by a RILM editor if the abstract from the author or journal is not in English).

Figure 1: A RILM Abstract record on EBSCOhost

Each subject assigned to a bibliographic record is referred to as an index string, consisting of at least two fields (Figure 2): the first field is always occupied by a headword, and the second field, directly after the headword, is called the margin term. The difference between headwords and non-headwords is that headwords can be used in the first and later fields of an index string while non-headwords are not allowed in the first field. All headwords can be used as non-headwords in index strings. Non-headwords in later fields of an index string are further subdivisions or topics related to the headword. Index strings are created and assigned by RILM editors in the International Center. RILM editors have developed a set of indexing rules governing how each headword should be assigned and how the non-headwords are formed and used after each headword. These indexing rules are highly complex and documented in the RILM Manual. It takes a new editor an average of six months to learn how to apply these rules.
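
The structural constraints described above (at least two fields, a headword in the first field, the margin term in the second) can be illustrated with a small Python sketch. It is not a description of how IBIS stores index strings: the HEADWORDS set is a tiny hypothetical sample, and the IndexString class and parse function are names invented here. The "--" separator follows the notation used for index strings in this article.

```python
from dataclasses import dataclass

# Hypothetical sample of headwords; the real list is maintained by RILM editors.
HEADWORDS = {"performance", "publishing and printing", "popular music"}


@dataclass(frozen=True)
class IndexString:
    """An index string: a headword followed by one or more later fields."""

    fields: tuple[str, ...]

    def __post_init__(self) -> None:
        if len(self.fields) < 2:
            raise ValueError("an index string needs at least two fields")
        if self.fields[0] not in HEADWORDS:
            raise ValueError(f"{self.fields[0]!r} is not a headword")

    @property
    def headword(self) -> str:
        return self.fields[0]

    @property
    def margin_term(self) -> str:
        # The margin term is the field directly after the headword.
        return self.fields[1]

    def __str__(self) -> str:
        return "--".join(self.fields)


def parse(text: str) -> IndexString:
    """Split a '--'-separated index string into its fields and validate it."""
    return IndexString(tuple(text.split("--")))


example = parse("publishing and printing--Chopin, Frédéric--Germany--19th c.")
print(example.headword)
print(example.margin_term)
```

Under these assumptions, parse("psychology--performance") raises a ValueError, because psychology is not in the sample headword set and so cannot occupy the first field.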

Index strings support RILM users’ browsing of headwords and all later fields on EBSCOhost. EBSCO provides two functionalities that facilitate browsing: explode and expand. Users can choose to explode selected headwords and to explode or expand any index terms in later fields. Clicking on a headword allows users to browse all of the index terms associated with that headword that are used in the margin and later fields. Exploding a selected headword will retrieve all records whose index strings contain that headword. Exploding a selected non-headword term allows users to retrieve all bibliographic records whose index strings contain that term and follow the same sequence of terms. Expanding an index term in the margin or later fields allows users to browse all terms used after the one selected. Users can also expand any term in any field of an index string to generate a search query for all records whose index strings contain the selected term and all preceding terms in the same order.
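
One way to read the last behavior described above (a query for all records whose index strings contain the selected term and all preceding terms in the same order) is as a prefix match on the fields of each index string. The Python sketch below illustrates that reading only; it does not describe EBSCOhost's implementation, and the record identifiers, sample index strings, and the match_prefix function are hypothetical.

```python
# Hypothetical records: record ID -> index strings, each stored as a tuple of fields.
RECORDS: dict[str, list[tuple[str, ...]]] = {
    "rec1": [("performance", "psychology")],
    "rec2": [("performance", "live performance"), ("jazz", "improvisation")],
    "rec3": [("jazz", "improvisation", "pedagogy")],
}


def match_prefix(
    records: dict[str, list[tuple[str, ...]]], prefix: tuple[str, ...]
) -> list[str]:
    """Return IDs of records with an index string that begins with `prefix`."""
    n = len(prefix)
    return sorted(
        record_id
        for record_id, strings in records.items()
        if any(fields[:n] == prefix for fields in strings)
    )


# Selecting "improvisation" in the margin after the headword "jazz":
print(match_prefix(RECORDS, ("jazz", "improvisation")))

# A one-element prefix approximates retrieving everything filed under a headword:
print(match_prefix(RECORDS, ("performance",)))
```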

Figure 2: Index strings assigned to a journal article about jazz improvisation

2.2.2 Index terms: topics vs. terms

RILM editors have selected around 20% of all the index terms as representations of concepts (Dextre Clarke 2019) to be included in the RILM Music Thesaurus. Only English index terms are used as preferred terms to represent those concepts, which are called topics in IBIS and are assigned unique topic IDs. Each topic can be associated with multiple terms, including equivalences in other languages and scripts. For index terms without corresponding concepts in the English language, RILM follows the ISO standards to transliterate non-Latin writing systems into Latin characters. For example, RILM editors selected the index term urtyn duu (term ID: 151211) as the preferred term to represent the concept of a traditional vocal music genre of the Mongol people (topic ID: 69571, Figure 3). The English term long song and a few Chinese terms (e.g., changdiao, 乌尔汀哆) were included as equivalent terms (lead-in terms) to express the topic for retrieval purposes. Since 2019 RILM editors have been required to use preferred terms to represent topics for indexing if they are available.

Figure 3: Multiple equivalent terms of urtyn duu

Among the different categories of index terms, RILM editors are currently creating and maintaining authority records only for persons. These authority records include biographic information on each person, links to internal authoritative sources (e.g., MGG Online, RME) and external ones (e.g., Virtual International Authority File (VIAF), Getty Thesaurus of Geographic Names (TGN)), and variant names of the person. RILM plans to create and maintain authority records for every topic in the Index. As necessary, each authority record can include up to four types of notes:

  • scope notes indicate the range of subjects to which a topic is applied, distinguish related topics, and specify the meaning of the topic in the context of RILM and its uses in RILM Index;
  • history notes provide the history of the creation, use, modification, and replacement of a term;
  • editorial notes are reserved for detailed editorial discussions about a topic, examples, and other editorial subjects concerning the topic; and
  • staff notes are for miscellaneous messages, notes, and communications between RILM staff members.

Authority records will also include the hierarchical and associative relationships between topics used in RILM and all equivalent terms referring to the same topic. Each different type of named/titled entity and conceptual topic will have different metadata fields in their authority records that supply contextual information relevant to that specific type of entity. The creation and maintenance of authority records are part of the RILM Music Thesaurus Project in progress.

2.2.3 Basic indexing rules

RILM Index has conventions regarding the use of specific index terms directly after four types of headwords: geographic names, personal/family names, instrument families, and four scholarly disciplines in music and dance (i.e., musicology, ethnomusicology, dance history, and ethnochoreology). For example, there are currently 30 index terms called personal margin terms in three levels of preference by specificity that can be used directly after personal-name headwords (Figure 4). For other headwords, there is an indexing rule called standard arrangement governing the types of index terms (subdivisions) used in the margin based on the order of priority ranging from first choice (personal name), second choice (geographic location), to third choice (topic). For example, to index Frédéric Chopin’s works published by German publishers in the 19th century under the headword publishing and printing, both the composer’s name and the geographic term Germany are logical choices as margin term candidates. Since personal names take preference over geographic terms, the index string assigned is publishing and printing--Chopin, Frédéric--Germany--19th c. but not publishing and printing--Germany--Chopin, Frédéric--19th c.
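
The standard arrangement rule can be illustrated as a sort by category priority. The Python sketch below reproduces the Chopin example; the category labels, the build_index_string function, and the treatment of the chronological term as a trailing field are assumptions made for illustration, since the text specifies only the order of preference for margin term candidates.

```python
# Order of preference stated in the standard arrangement rule.
PRIORITY = {"person": 0, "place": 1, "topic": 2}


def build_index_string(
    headword: str,
    candidates: list[tuple[str, str]],
    trailing: tuple[str, ...] = (),
) -> str:
    """Order (term, category) candidates by priority and join the fields.

    `trailing` holds terms placed after the prioritized ones, such as a
    period; putting the period last is an assumption based on the example.
    """
    ordered = sorted(candidates, key=lambda candidate: PRIORITY[candidate[1]])
    terms = [term for term, _category in ordered]
    return "--".join([headword, *terms, *trailing])


index_string = build_index_string(
    "publishing and printing",
    [("Germany", "place"), ("Chopin, Frédéric", "person")],
    trailing=("19th c.",),
)
print(index_string)
# publishing and printing--Chopin, Frédéric--Germany--19th c.
```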

Figure 4: Personal margin terms in the order of preference by specificity

2.2.4 RILM Index history

To clarify the origin of the issues discussed later, this section introduces the history and development of RILM Index. RILM started to publish an index to aid the finding of bibliographic records in 1966. Since its early years RILM has been a pioneer that has adopted the most advanced technologies to support the production and distribution of bibliographic and index data and to support its users. It was the first automated bibliography in the humanities and was the model for other constituents of the American Council of Learned Societies such as the Répertoire international de littérature de l’art. In 1979 RILM started to distribute RILM Abstracts data in digital format. RILM Abstracts remained largely a print publication until the end of the 20th century (Blažekovic 2014). Up to the early 1990s RILM Index was published every five years independently of RILM Abstracts. Consisting of an author index, subject index, and periodical index, each volume of RILM Index was a compilation of all the index strings assigned to the publications included in RILM Abstracts for the previous five years. The terms used in the subject index were also maintained in RILM Abstracts English-Language Thesaurus (RILM 1976; 1983) and RILM Abstracts of Music Literature: International Thesaurus (RILM 1990; 1993), two RILM publications that were published between 1976 and 1993. Brook (1989) described RILM’s → “thesauri” as a collection of subject headings created for the RILM Index. These “thesauri” include see and see also references, the equivalences of the subject headings in 17 European languages, and indexing rules. RILM’s subject headings were based on the Sears List of Subject Headings but adjusted to accommodate the indexing of music-specific content. During this process of adjustment, RILM Index was integrated into and became part of a digital database. Terms in “RILM English-Language Thesaurus” and “RILM International Thesaurus” were integrated into the IBIS term tables as part of the development of IBIS.

Despite its migration from a print publication to an online database, RILM Index retains its pre-coordinate style: the index strings are structured in a hierarchical manner, in which each headword is specified by a string of index terms that provides a descriptive summary of the music literature. As an increasing number of new terms, especially topical terms, came into regular use as index terms, the complexity of RILM Index grew. As the key content of the thesauri, index terms are constantly added, modified, merged, or replaced to reflect the vast growth of RILM’s disciplinary, geographic, and linguistic coverage. RILM editors usually perform retrospective indexing to revise the index strings in all bibliographic records to reflect changes to the index terms. For example, the headword Indians and Inuits, which had been used over 3,000 times in RILM Index to refer to indigenous peoples in the Americas, was changed to indigenous peoples--Americas, and the change was applied to all existing index strings in April 2020.
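
Retrospective indexing of this kind amounts to rewriting the leading field of every affected index string. The Python sketch below illustrates the headword change described above; the reindex function and the sample index strings are hypothetical, and the actual editorial procedure may involve additional rules that are not modeled here.

```python
def reindex(index_string: str, old_headword: str, new_fields: list[str]) -> str:
    """Replace a retired headword with its successor fields; keep the rest."""
    fields = index_string.split("--")
    if fields[0] != old_headword:
        return index_string  # not filed under the retired headword
    return "--".join([*new_fields, *fields[1:]])


# Hypothetical index strings, for illustration only.
old_strings = [
    "Indians and Inuits--dance--20th c.",
    "performance--psychology",
]

new_strings = [
    reindex(s, "Indians and Inuits", ["indigenous peoples", "Americas"])
    for s in old_strings
]
print(new_strings)
```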

2.2.5 Comparisons between RILM Index and other KOSs for music

As discussed above, music literature serves as the literary warrant for RILM index terms, which therefore reflect the scholarship on music rather than music itself (Mackenzie 2007). This makes RILM Index a unique → knowledge organization system (KOS: Mazzocchi 2017) compared to others. RILM Index terms represent not only concepts in music but also topics discussed by researchers from various disciplinary and cultural backgrounds. For example, RILM Index contains terms for concepts about instruments such as historical instruments and adapted instruments, and terms for actual instruments such as piano and zheng. These four terms are common topics in musicological writings and unsurprisingly piano and zheng are included in the KOSs for musical instruments such as the Library of Congress Medium of Performance Thesaurus (LCMPT). Historical instruments and adapted instruments, however, would fall outside of the scope of LCMPT, which is for cataloging real instruments used for music performance. RILM Index includes topics and/or terms as long as they are discussed in a piece of scholarly work regardless of the level of significance of the topics and/or terms outside music scholarship.

Table 2: Comparison of RILM Index and existing KOSs for music
KOS | Type | Domain/Scope | User communities | Structure
RILM Index | subject headings list | all subjects covered in scholarly writings on music and related fields | RILM editors and end users of RILM databases | a precoordinate system with equivalent and hierarchical relationships
LCMPT | thesaurus | musical instruments, voices, ensemble types, etc., used in the performance of musical works (Iseminger et al. 2017; Library of Congress 2020) | libraries and library users | a postcoordinate system with equivalent and hierarchical relationships
The Music Ontology | ontology | editorial, cultural, and acoustic information related to musical works (Abdallah et al. 2006), production and consumption of music (Turchet et al. 2019) | music industry and general public | linking business-related information about music found on the World Wide Web
DOREMUS | ontology | classical musical works and related information such as artists, recordings, and performances from the catalogs of three major French cultural institutions (Achichi et al. 2018) | libraries and other cultural institutions such as archives and orchestras | a common knowledge model describing musical works
MIMO thesaurus | taxonomy | musical instruments (Dolan 2017) | museums and general public | a classification tree of instruments
MusicBrainz | database | musical works and contextual information about these works (Swartz 2002) | music industry and general public | a relational database

The richness and breadth of topical coverage is another distinctive characteristic of RILM Index compared to existing KOSs for music. RILM index terms are highly diverse, representing concepts and topics in music and other subjects such as dance, dramatic arts, literature, and religions. For example, RILM Index has developed a collection of medical terms such as repetition suppression and music-evoked autobiographical memories (MEAMs) in order to index publications in fields like music therapy and music pedagogy. This diversity makes it challenging to build a well-structured thesaurus that accommodates all the topics covered in the literature in RILM Abstracts and RAFT. As shown in Table 2, other KOSs are exclusively for representing information on music and music products in various formats (e.g., scores, sound recordings, and performances). For example, Doing Reusable Musical Data (DOREMUS n.d.) is an ongoing project aiming to develop a common knowledge model for musical works using semantic Web technologies and musical work metadata collected from three major French cultural institutions (i.e., Radio France, Bibliothèque nationale de France, and Philharmonie de Paris). Research in the field of → knowledge organization also focuses on the organization of music information. Some studies concentrate on general topics such as classification systems for Western music (Lane 2002) and ontological representations of musical information (Madalli et al. 2015), while others concern more specific aspects of music such as the construction of classic music recording ontologies (Wu and Shi 2016), classification of chamber music ensembles (Lee 2017), and the relationship between original musical works, arrangements, and transcriptions (Lee 2019).

Although RILM Index has some distinctive characteristics, it is less organized and less robust compared to existing KOSs for music, including LCMPT, The Music Ontology, the data model for musical works by DOREMUS, the Music Instrument Museums Online (MIMO) thesaurus, and MusicBrainz Database by MusicBrainz (Table 2). RILM Index only has a one-level hierarchical relationship linking 275 headwords and 2,277 Z terms. Equivalent relationships are mainly applied to personal names (218,928 persons with 561,001 alternative personal names). Associative relationships between topics are largely absent in RILM Index.

Despite the differences between RILM Index and these KOSs in domain, scope, user communities, and structure, the overlaps are also obvious. For example, RILM could use the hierarchies defined by LCMPT and/or MIMO to establish its own hierarchical structure for instruments and instrument families. RILM could also create and link its own instrument authorities to those in LCMPT as linked data. Similarly, RILM could collaborate with and take advantage of existing KOSs for musical works. RILM is working on creating authority records for musical works from all cultures and traditions. Each work authority record provides efficient contextual information about the work and will be used by both RILM editors and end users. The main use of work authority records would be to help identify the correct works, especially those with a common (e.g., generic) title. The robust data models representing musical works proposed by The Music Ontology, DOREMUS, and MusicBrainz offer a deep level of granularity and may not be applicable to RILM work authorities directly. It is highly likely that RILM could extend its data model for musical works in the future referring to those models.

3. Issues in RILM Index

3.1 Polysemes without disambiguation

Homonymy and polysemy have long been issues in indexing and information retrieval (Furnas et al. 1987), and RILM Index is no exception. Many non-headword index terms are polysemous and are used to refer to different concepts in index strings. For example, programming is an index term used in RILM Index that could refer to different concepts, such as the process of coding algorithms to be executed by a computer, the activity of creating the performance program of a musical event, or the design of radio or television programs. In this case, the meaning of the term programming is entirely dependent on the context of the individual index strings where it is used. When indexed after the headword computer applications, the term most likely means computer programming; but when indexed after headwords like conducting or choral music, the term may refer to concert programming.

The historical reason for this kind of ambiguity in RILM is that for decades RILM collected literature mostly on music, and the Index primarily served to represent musical information. Some terms used in RILM Index may be polysemous in nature, but described only single musical concepts in the early years of RILM. As RILM’s coverage grew broader, editors started to use such terms to refer to more than one concept, and the ambiguity issue began to emerge. Editors often chose a term for indexing without being aware that it had already been used in the Index to refer to other concepts/topics. In recent years, RILM editors have become aware of these problems and, as recommended by the ISO standard 25964-1 (International Organization for Standardization 2011), have started adding parenthetical qualifiers to such terms. For example, sequence has been divided into sequence (genre) and sequence (structure). By July 2020, a small number of the more important index terms, such as headwords, had been disambiguated. The RILM Music Thesaurus Project is currently focusing on the disambiguation of all polysemous index terms.
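
Because the meaning of a polysemous term such as programming depends on the headword it follows, disambiguation can be pictured as a lookup keyed on the pair of headword and term. The Python sketch below is illustrative only: the qualifiers "(computing)" and "(concerts)" are hypothetical labels invented here, not qualifiers adopted by RILM; only sequence (genre) and sequence (structure) are attested in the text.

```python
# Hypothetical mapping from (headword, ambiguous term) to a qualified term.
# The qualifier labels are invented for illustration.
QUALIFIED: dict[tuple[str, str], str] = {
    ("computer applications", "programming"): "programming (computing)",
    ("conducting", "programming"): "programming (concerts)",
    ("choral music", "programming"): "programming (concerts)",
}


def disambiguate(headword: str, term: str) -> str:
    """Return the qualified form of `term` in this context, if one is known."""
    return QUALIFIED.get((headword, term), term)


print(disambiguate("computer applications", "programming"))
print(disambiguate("choral music", "programming"))
print(disambiguate("radio", "programming"))  # no rule yet: term is left as-is
```

Contexts without a rule, such as the radio example, are exactly the cases that still need an editorial decision.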

3.2 Concepts with multiple term representations

It is common that different index terms refer to the same concepts but are not connected as synonyms in RILM Index. For example, country rock and rock, country are two index terms that have been used interchangeably in RILM to index the same concept, a subgenre of rock music and country music. The ongoing RILM Music Thesaurus Project is solving this problem by connecting synonyms to topic IDs and selecting one of the index terms as the preferred term representing the topic.

3.3 Redundant index terms

RILM’s strict indexing rules have led to some index strings that violate the principle of specific entry (Broughton 2012; Chan and Salaba 2016). For example, the indexing rule for the headword performance is that it can be subdivided by place and topic and used for indexing various aspects of performance, including the “considerations of performance per se, physiological and psychological factors affecting performance, factors that help improve performance, how performers perceive themselves, cultural issues, the interface between performance and theory, and general writings on performance studies” (RILM Manual 2019). The term psychology is one of the high-frequency index terms used after the headword performance in RILM’s index strings. The intention of the index string performance--psychology could be to express the concept of “performance psychology” (i.e., a subdivision of psychology that applies psychological principles and techniques to performance) rather than the psychological aspect of performance. When an article is about performance psychology rather than the psychological aspects of performance, performance psychology is a better candidate for the index term than performance. However, the current indexing rules require using performance as a headword for “performance psychology” even if the publication is not about “performance” per se but about “performance psychology”. This example shows that using a single headword to govern a variety of topics can be unnecessary and confusing. The terms live performance and piano recital are two other index terms frequently used after the headword performance in index strings. In both cases, live performance and piano recital are types of performance based on different categorization logic. LCSH generally does not allow the subdivision of a topical heading to represent a species, part, or kind of the subject represented by the main heading (Chan and Salaba 2016). In these cases, RILM’s index strings exhibit indirect indexing (Keyser 2012), which violates the principle of specific entry and introduces redundant information.

Another RILM indexing rule that can produce redundancy is that each index string must have a minimum of two index fields. In cases where the headwords themselves are sufficient to represent the subjects discussed in a publication, RILM editors are forced to use a second term following the headword. For example, the index term general in the second field of the index string creative process--general is an unnecessary stand-alone adjective that could result in retrieving irrelevant search results (International Organization for Standardization 2011).

Unlike LCSH and the Sears List, RILM Index does not use parenthetical statements (e.g., “May subdiv. geog.”) to indicate the authorization of geographic subdivisions. Instead, the treatment of some headwords in RILM appears as their direct subdivisions, although this information could exist as scope notes or parenthetical statements. For example, RILM uses the headwords popular music--general, popular music--by genre, and popular music--by place to indicate three different treatments of the subject popular music. These direct subdivisions are unnecessary in an online index and lead to cumbersome index strings, such as popular music--by place--United States of America--Missouri--Kansas City--influence on jazz--early-mid 20th c.

3.4 Inconsistent index terms

RILM used to index the same personal names in different forms when located in different fields of the index strings, which is a practice inherited from the print era in order to save space on paper. If a person is considered as a subject heading for a given publication, RILM editors would assign her/his preferred name as the headword of one of the index strings; for the index strings where the personal name appeared in later fields, RILM editors would spell out the last name only but keep the first name initials (Figure 5). This violates the principle of uniformity (Chan and Salaba 2016) and may cause confusion to users when examining the bibliographic record of the publication, as Davis, M. could refer to the singer Martha Davis (1951-), the jazz musician Miles Davis (1926-1991), and a few others. For users not familiar with the subject or domain, it may appear that this publication is related to two different persons. RILM discontinued this practice years ago, but personal names created in this format remain in the database. RILM is solving this issue by running SQL queries to replace variant forms of names with preferred personal names in all the index strings.

Figure 5: Preferred name and variant name of the same person used in different index strings of the same publication
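The kind of SQL-based name normalization mentioned above might look roughly like the following sketch, written in Python with the standard sqlite3 module. The table index_fields, its columns, and the helper names are hypothetical; RILM’s actual schema and queries are not described in the source. Because a short form such as Davis, M. is ambiguous, the sketch replaces it only within records that also carry the matching full name as a headword.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table: one row per field of each index string.
    conn.execute(
        "CREATE TABLE index_fields ("
        "record_id INTEGER, string_no INTEGER, field_no INTEGER, term TEXT)"
    )


def abbreviate(preferred):
    """'Davis, Miles' -> 'Davis, M.' (the print-era short form)."""
    last, _, first = preferred.partition(", ")
    return f"{last}, {first[0]}." if first else preferred


def normalize_names(conn, preferred_names):
    """Replace short forms with preferred names; return the rows changed."""
    changed = 0
    for preferred in preferred_names:
        cursor = conn.execute(
            """
            UPDATE index_fields
               SET term = :preferred
             WHERE term = :short
               AND field_no > 1
               AND record_id IN (
                     SELECT record_id FROM index_fields
                      WHERE field_no = 1 AND term = :preferred)
            """,
            {"preferred": preferred, "short": abbreviate(preferred)},
        )
        changed += cursor.rowcount
    return changed
```

Scoping the update to a single record is a design choice of this sketch. A record that lists two persons sharing a surname and first initial as headwords would still need manual review.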

Some headwords used for indexing individual entities are the classes or groups to which the entities belong. For example, the headword performing groups--popular music is used for indexing individual popular music groups. RILM’s indexing rules require that the field following this headword must be the name of a popular music group. Such indexing rules could cause inconsistency in the index terms representing the same entity when that entity belongs to more than one class or group described by more than one headword. Take the jazz vocal group The Four Freshmen as an example. This band has been indexed under either the headword performing groups--vocal or performing groups--jazz and blues in different RILM records. The RILM Music Thesaurus project is addressing this issue by creating authority records that store characteristics or attributes of individual entities as semantic relationships. These authority records will be used to support indexing and searching.

Besides personal names, this inconsistency exists in some topical index terms. For example, pianists can only be used as a non-headword in index strings. When this concept is used as a headword, it must appear as performers--piano. The reason for formatting some headwords in this particular way is to enable users to browse all subdivisions of a broader concept, such as performers--piano, performers--dance, and performers--dramatic arts. This works well when the index is on paper, and the EBSCO interface does provide a browse function that can replicate the experience of browsing a printed index, but users are more likely to conduct keyword searches (Gross, Taylor and Joudrey 2015) using “pianist” or “piano performer” rather than “performer--piano”. In this case, RILM’s term selection violates the principle of common usage (Svenonius 2003), and may result in users not being able to retrieve bibliographic records indexed with performers--piano as a headword if they search for “pianist” or “pianists”. RILM is resolving this issue in the same way as the issue discussed in Section 3.2, by linking synonyms to topic IDs and selecting one of the index terms as the preferred term representing the topic.

3.5 Lack of control

Of the three key thesaurus relationships interlinking terms and concepts (Dextre Clarke 2019), IBIS currently supports the creation of the equivalent relationship between index terms and a one-level hierarchical relationship between headwords and Z terms. Although IBIS allows creating topic IDs and linking equivalent terms to the same topic IDs, RILM editors rarely articulate, curate, or maintain equivalent relationships between terms and constantly add new index terms representing topics already existing in the Index. As shown in the previous example, performers--piano is a verified concept with a topic ID, while the same concept expressed in a more natural form pianists is a null term, not linked to that topic ID. The heavy use of such null terms has contributed to the challenge of improving the recall of RILM Index.

RILM Index could benefit from the inclusion of hierarchical and associative relationships between concepts. For example, rapper dance is a subgenre of sword dance in England. Both are used as index terms in RILM but are not linked by a hierarchical relationship. The absence of cross references between the two index terms would prevent end users and machines from expanding their searches to broader, narrower, or related topics. RILM’s technology team is developing features for creating and maintaining hierarchical and associative relationships between concepts in IBIS as part of the thesaurus project, which is discussed in the following section.
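To illustrate how a hierarchical relationship would support search expansion, the following Python sketch stores broader/narrower links and expands a query term to its narrower terms. The class and method names are assumptions for the example and do not describe the features under development in IBIS.

```python
from collections import defaultdict


class ConceptHierarchy:
    """Stores broader/narrower links between index terms."""

    def __init__(self):
        self.narrower = defaultdict(set)
        self.broader = defaultdict(set)

    def add(self, broader_term, narrower_term):
        self.narrower[broader_term].add(narrower_term)
        self.broader[narrower_term].add(broader_term)

    def expand(self, term):
        """Return the term together with all of its transitive narrower terms."""
        seen, stack = set(), [term]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.narrower.get(current, ()))
        return seen


if __name__ == "__main__":
    hierarchy = ConceptHierarchy()
    hierarchy.add("sword dance", "rapper dance")
    print(sorted(hierarchy.expand("sword dance")))
```

With such a link in place, a search for sword dance could also retrieve records indexed only with rapper dance.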

As mentioned earlier, around 51% of RILM’s index terms are null terms that are unverified and have no assigned category tags. Around 20% (140,000) of the null terms are prepositional phrases that indicate relationships between nouns or noun phrases (e.g., influence on jazz, relation to traditional music). These prepositional phrases are used in index strings to describe relationships between entities: one expressed directly in the phrase and another indexed in previous fields of the same string. For example, the index string Bach, Johann Sebastian--works--influence on jazz indicates that the composer’s works have influenced jazz music. Johann Sebastian Bach’s works are one entity and the music genre jazz is another entity, and the two are connected by the type of relationship “influence on”. RILM editors use a limited number of prepositional phrases such as “relation to” and “influenced by” in order to maintain a certain degree of consistency, yet all such phrases are unverified. The relationships indicated by prepositional phrases in RILM Index provide valuable information for researchers and other KOSs for music. The RILM Music Thesaurus project is in the process of compiling and integrating these relationships into the RILM Music Thesaurus.
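The way a prepositional phrase links two entities within an index string can be sketched as a simple extraction step. The following Python example is a minimal illustration under stated assumptions: the function name, the returned tuple layout, and the short list of phrases (taken from the examples in this section) are not RILM’s actual method.

```python
# Relationship phrases taken from the examples in this section.
RELATION_PHRASES = ("influence on", "influenced by", "relation to")


def extract_relation(index_string):
    """Split an index string into (subject fields, relation, target).

    The subject is the tuple of fields preceding the prepositional phrase;
    the target is the entity named inside the phrase. Returns None when no
    known phrase occurs in the string.
    """
    fields = index_string.split("--")
    for position, text in enumerate(fields):
        for phrase in RELATION_PHRASES:
            if text.startswith(phrase + " "):
                subject = tuple(fields[:position])
                target = text[len(phrase) + 1:]
                return subject, phrase, target
    return None


if __name__ == "__main__":
    print(extract_relation("Bach, Johann Sebastian--works--influence on jazz"))
```

Triples of this shape are one possible intermediate form for carrying such relationships over into a thesaurus.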

3.6 Eurocentrism

The granularity of RILM’s topical headwords for Western art music genres, forms, and styles is significantly greater than that for headwords describing traditional genres and popular music. For example, RILM has 101 headwords for specific Western art music genres and only eight for traditional music genres. This has led to the unbalanced treatment of genres from different traditions. For example, the German song genre lied is a headword and can be used directly in index strings (e.g., lied--texts--English translations), while the vocal music genre urtyn duu of the Mongol peoples is not a headword and has to be indexed under headwords such as Mongolia (e.g., Mongolia--traditional music--urtyn duu--history and development--cultural transformation).

In addition to the unbalanced treatments of Western art music genres versus genres of traditional and popular music, RILM sometimes applies index terms representing concepts from Western art music to concepts from non-Western traditions due to its incomplete vocabulary for those traditions. For example, RILM uses the term temperament theory to index publications about “lü” (律), a concept specific to Northeast Asian countries such as China, Japan, and Korea. This issue reflects a historical bias in RILM toward publications from Western countries that focus on Western musicology. This bias has been noted by music researchers and librarians in the past (Green 2001; Keller 1980; Tsuge 1986).

Since the early 2000s, RILM has put great effort into expanding its geographic and linguistic coverage by reaching out to Asian, Central and South American, and African countries and regions that have been overlooked by major bibliographic repositories in the West (Mackenzie 2008). This effort includes hiring ethnomusicologists from underrepresented cultures and traditions as editors and subject experts; creating new headwords for traditional and popular music genres, styles, and concepts; and implementing non-Roman scripts such as Chinese on EBSCOhost. The number of bibliographic records from countries in those regions in RILM databases has grown rapidly in the past 15 years. As a result, RILM’s index terms have evolved to reflect the increasing diversity of the publications represented in its databases, even though RILM Index retains in large part a vocabulary that suits Western art music genres better than non-Western music traditions. Adopting more flexible indexing rules that promote direct use of terms from non-Western cultures and traditions as headwords would help RILM Index achieve more balanced term granularity.

3.7 Quality issues in RILM Index

The quality problems of controlled vocabularies are closely related to data quality issues (Mader et al. 2012). The six quality issues of RILM Index discussed above can be conceptualized as data or information quality problems (Stvilia et al. 2007) rather than unexpected phenomena. Quality can generally be defined as fitness for use (Wang and Strong 1996), and thus is contextual. Data quality can be defined as “the degree to which the data meet the needs and requirements of the activities in which they are used” (Stvilia et al. 2015, 247). The quality of RILM Index may not be sufficient to meet the needs and requirements of RILM editors and database users in all contexts. RILM Index’s data quality problems can be mapped to nine types of data quality problems proposed by Stvilia et al. (2007): intrinsic redundancy, intrinsic semantic inconsistency, relational informativeness/redundancy, intrinsic structural inconsistency, relational semantic inconsistency, lack of authority, relational incompleteness, relational structural inconsistency, and contextual structural incompleteness, as shown in Figure 6. Mader et al. (2012) identified 15 quality issues by analyzing 15 controlled vocabularies (including LCSH, AgroVoc, and MeSH), and developed computable quality metrics to detect them. With the exception of redundancy, the issues in RILM Index can be mapped to six of the 15 quality issues identified by Mader et al. (2012): label conflicts, omitted or invalid language tags, undocumented concepts, missing out-links, orphan concepts, and incomplete language coverage (Figure 6). Although RILM Index’s vocabulary is only partially controlled, it shares issues with some well-known controlled vocabularies such as LCSH and EuroVoc (Mader et al. 2012), and RILM might consider adopting some of the automatic or low-cost quality metrics reported in the literature (e.g., information noise, count of the same concepts using different index terms) to help identify or even resolve those quality issues (Mader et al. 2012; Stvilia 2007; Stvilia et al. 2007).

Figure 6: Mapping RILM Index issues to data quality problems and quality issues in controlled vocabularies
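As a rough idea of what a low-cost quality check could look like, the following Python sketch counts null terms and topics represented by more than one index term. It is not one of the metrics defined by Mader et al. (2012) or Stvilia et al. (2007); the function name, the input layout, and the topic IDs "T1" and "T2" are assumptions for illustration.

```python
from collections import Counter


def vocabulary_metrics(terms):
    """Compute two simple indicators from (label, topic_id) pairs.

    A topic_id of None marks a null term, i.e., one not linked to a topic.
    """
    terms = list(terms)
    null_count = sum(1 for _, topic_id in terms if topic_id is None)
    labels_per_topic = Counter(
        topic_id for _, topic_id in terms if topic_id is not None
    )
    return {
        # Share of terms that are not linked to any topic ID.
        "null_share": null_count / len(terms) if terms else 0.0,
        # Topics represented by more than one index term.
        "multi_label_topics": {
            topic_id: count
            for topic_id, count in labels_per_topic.items()
            if count > 1
        },
    }


if __name__ == "__main__":
    sample = [
        ("performers--piano", "T1"),
        ("pianists", None),
        ("country rock", "T2"),
        ("rock, country", "T2"),
    ]
    print(vocabulary_metrics(sample))
```

Counts of this kind are cheap to compute and could flag candidates for editorial review, although they cannot decide on their own which term should be preferred.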

4. RILM Music Thesaurus: a possible solution

Recognizing the index issues discussed above, RILM decided to improve the organization of its index terms and started the RILM Music Thesaurus Project in October 2017. The goal of the project is twofold: first, to develop a multilingual music thesaurus that organizes topical concepts and terms into a faceted structure clarifying the hierarchical, equivalent, and associative relationships between those concepts and terms; and second, to provide better authority control over named entities in RILM Index. The three types of thesaurus relationships and authority control provided by this thesaurus will help resolve the above-mentioned six data quality issues. This thesaurus is meant to be used by RILM editors for indexing and end users for searching music literature.

RILM has formed a thesaurus team consisting of two staff members who are both experienced musicologists and library and information professionals. The thesaurus team took a top-down, deductive approach (Keyser 2012) to construct the initial hierarchical structure of the thesaurus by referring to previous discussions and models of the music thesauri proposed by the library communities. In particular, the thesaurus team looked into the Music Thesaurus Project of the Music Library Association (MLA), the proposal of an international music thesaurus by Spilker (2005), and the plan for a thesaurus focusing on ethnomusicological topics. MLA’s project proposed a music thesaurus with seven facets (i.e., agents, events, forms/genres, geo-cultural attributes, sound devices, texts, and other topics). Although the project was never finished, the Music Thesaurus Project working group reported that they had plans to integrate vocabularies from LCSH, RILM Index, and other subject lists for music (Hemmasi 1994; MLA Music Thesaurus Project Working Group 1989). Spilker (2005) described a model for an international music thesaurus with 13 facets (agents, activities, associated concepts, content subjects, document types, events, forms and genres, historical contexts, instruments, philosophies and religions, research and analysis, styles and periods, and texts). This model was developed with reference to the MLA Music Thesaurus Project. A lesser-known project by the Ethnomusicology Archive at University of California, Los Angeles, is the only one focusing on a domain-specific thesaurus for ethnomusicology (Schuursma 1990; Spear 1986). It is unclear if this thesaurus project was finished or ever used. These attempts, though unsuccessful, helped RILM take the initial steps towards establishing the facets and top concepts (the broadest categories of similar concepts in the same facet) for the RILM Music Thesaurus.

RILM’s topical index terms represent topics in RILM Abstracts and RAFT and cover a wide range of subjects and disciplines. They guided the thesaurus team in determining the domain and scope of the thesaurus. The thesaurus team tested the feasibility of facets and top concepts by using them to categorize all of the topical terms and some frequently used topical null terms (approximately 10,000 topical index terms in total). After refining the facets and top concepts based on the feasibility test, the team took a bottom-up, inductive approach (Keyser 2012) to develop subclasses of each top concept using those 10,000 topical terms. As of July 2020, the RILM Music Thesaurus has a hierarchical structure consisting of eight facets and 38 top concepts (see below), and each top concept has a hierarchy up to four levels deep. The next steps for the thesaurus team include identifying topical null terms and situating them into this hierarchy.


  • People
  • Organizations
  • Animals
  • Disciplines
  • Events
  • Techniques and Methods
  • Games and Sports
  • Phenomena
  • General Activities
Periods and Styles
  • Chronological Periods
  • Styles
Forms of Expression
  • General Forms of Expression
  • Document Types
  • Versions
  • Formats
  • Notations
  • Symbols
  • Information and Communication Technology
  • Media for Musical Performance
  • Manmade and Natural Objects
  • Geographic Locations
  • Venues
  • Musical Concepts
  • Cultural and Social Concepts
  • Languages and Linguistic Concepts
  • Philosophical Concepts
  • Religions and Religious Concepts
  • Medical and Scientific Concepts
  • Dance Concepts
  • Dramatic Arts Concepts
  • Literary Concepts
  • Art and Architecture Concepts
  • Pedagogical Concepts
  • General Concepts
  • Music Genres and Types
  • Dramatic Arts Genres and Types
  • Arts and Architecture Genres and Types
  • Literary Genres and Types

The thesaurus team is currently working on adding parenthetical identifiers to polysemic topical index terms and compiling authority records for all topics, including adding links to outside reference sources and connecting synonyms that already exist in RILM Index. This will enable explicit identification and disambiguation of all types of topics indexed in RILM and partially resolve the issues of label conflicts, undocumented concepts, missing out-links, and orphan concepts. Once the editing of hierarchical and associative relationships between concepts becomes available in IBIS in the near future, the thesaurus team and RILM editors will have the means to create broader/narrower and associative relationships between topic IDs, which will help reduce undocumented and orphan concepts. RILM can avoid the issue of redundancy by changing its indexing rules, allowing more flexibility in directly choosing the most specific terms from the thesaurus as headwords. This will also help promote equality between concepts from Western and non-Western cultures and traditions, and in part resolve the issue of Eurocentrism in the current RILM Index. The long-term goal of the Thesaurus Project is to create a multilingual thesaurus, but the present focus is on building an English-language thesaurus. The anticipated multilingual feature of the thesaurus will also help alleviate the incomplete vocabulary issue discussed in Section 3.6.

Although traditional thesauri are usually limited to three main relationship types among their terms and concepts (i.e., equivalence, hierarchical, and associative), some attributes from ontologies can be adopted to improve thesauri by adding more useful types of relationships from a given domain (Dextre Clarke 2019; Hjørland 2016). As mentioned in Section 3.5, RILM Index contains a large number (around 140,000) of prepositional phrases indicating the relationships between nouns or noun phrases. Therefore, RILM Index can be considered as a KOS (Hjørland 2007) comprising a rich set of associative relationships among different concepts in the domains of musicology and related subjects. This is a unique and valuable asset of RILM. To the authors’ best knowledge, no other KOS for music has invested the time and effort to provide this type of information. The thesaurus team has extracted some of the relationships (e.g. “influenced by”, “arranged by”, “based on”, “composed for”, “performed for”, “libretto for”) and entity types indicated by the prepositional phrases in RILM Index, and is in the process of compiling and integrating more types of domain-specific relationships into the thesaurus to support granular search expansion (Dextre Clarke 2019; Kless et al. 2015).

Based on RILM Index and the categories of index terms, the thesaurus team is also building ontologies (Horwitz et al. 2017) for named/titled entities such as persons, organizations, and objects, and is linking these entities to other authoritative sources such as VIAF and the TGN. These ontologies supply contextual information for each entity, identified by its designated URI. This will help reduce the length of RILM’s index strings and improve recall and precision. For example, under the current indexing rules, in order to index a particular musical work by a known composer, the composer’s name must be the headword, followed by the margin term works, and lastly the title of the work (e.g., Chopin, Frédéric--works--preludes, piano, op. 28, no. 10). The ontology model for individual musical works will include all the information in this string (i.e., composer, opus number, catalogue number, genre), which can be reduced to one descriptor/concept representing this particular work to support post-coordinated indexing and searching (Hjørland 2018).
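The idea of collapsing a long index string into a single work entity can be illustrated with a small Python sketch. The class, its attribute names, and the example.org URI are hypothetical and do not represent RILM’s ontology model; the sketch only shows that one record can hold the information currently spread across the fields of an index string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicalWork:
    uri: str        # hypothetical identifier; real URIs would be assigned by RILM
    composer: str   # preferred name of the composer
    genre: str
    medium: str
    opus: str
    number: str

    def legacy_index_string(self):
        """Rebuild the index string required by the current indexing rules."""
        title = f"{self.genre}, {self.medium}, op. {self.opus}, no. {self.number}"
        return f"{self.composer}--works--{title}"


if __name__ == "__main__":
    work = MusicalWork(
        uri="https://example.org/works/chopin-op28-no10",  # placeholder URI
        composer="Chopin, Frédéric",
        genre="preludes",
        medium="piano",
        opus="28",
        number="10",
    )
    print(work.legacy_index_string())
```

An indexer would then assign only the work’s identifier, and the composer, genre, and opus number would remain searchable through the record.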

5. Conclusion

Considering the disciplinary and linguistic coverage of RILM databases, the richness of RILM’s index data, RILM editors’ expertise in music as well as library and information science, and their efforts in community outreach in different countries and regions all over the world, RILM is in a good position to develop a comprehensive and novel music thesaurus with ontological features. Despite the issues discussed above, RILM Index provides a solid foundation for the construction of a music thesaurus, especially the relationships between specific entities suggested by prepositional phrases. The RILM Music Thesaurus would benefit bibliographic databases, search engines, and music information seekers alike.


The authors would like to express their gratitude to two anonymous reviewers, Dr. Birger Hjørland, Dr. Zdravko Blažekovic, Dr. Christopher Bruhn, and Elizabeth Parry for their helpful suggestions.

1. The other three are Répertoire international des sources musicales (RISM, http://www.rism.info/home.html); Répertoire international d'iconographie musicale (RIdIM, https://ridim.org); and Répertoire international de la presse musicale (RIPM, https://www.ripm.org).

