I S K O

edited by Birger Hjørland and Claudio Gnoli

 

Jacques Maniez


Table of contents:
1. Introduction
2. Syntax and information retrieval
3. Documentary languages (DLs)
4. Classification and faceted classification
5. Indexing and indexes
6. Conclusion
Endnotes
Bibliography: Works by Jacques Maniez; Other references
Colophon

Abstract:
Jacques Maniez (1923-2020) was a French linguist and information science specialist. From 1977 to 2010, he explored four important topics: the role of syntax in information retrieval, the structure and functions of documentary languages, classification and facets, indexing and indexes. By bringing principles of knowledge organization, document analysis and information retrieval under the magnifying glass of the science of language, Maniez made a significant contribution to French thinking in these subfields of information science.


1. Introduction

Jacques Maniez (1923-2020) was a French specialist in information science (IS) who chose to adopt a linguistic perspective to analyze the characteristics and functions of documentary languages (DLs) [1]. With a degree in literature and a doctorate in linguistics, he taught humanities in several lycées (high schools) and, later, linguistics applied to documentation at the Institut universitaire de technologie (IUT) in Dijon, France. During his career, Maniez elaborated and led numerous training courses and he was a motivated participant in the early activities of the → International Society for Knowledge Organization (ISKO). In 1995, he set up ISKO-France, the French chapter of the organization, with his colleagues Danièle Dégez and Widad Mustafa El Hadi. Three years later, he co-edited the proceedings of the Fifth International ISKO Conference, held in Lille in 1998, with professors Mustafa El Hadi and A. Steven Pollitt.

Figure 1: Jacques Maniez in Dijon in June 2006

Maniez was not the most prolific author among French IS specialists, but his contribution to the field is significant. His most substantial writings include a doctoral dissertation on the role of syntax in information retrieval systems (1976-1977) and three papers: “Relationships in thesauri: some critical remarks” (1988), “Database merging and the compatibility of indexing languages” (1997b) and “Du bon usage des facettes: des classifications au thésaurus” (1999) [The proper use of facets: from classifications to thesauri]. We owe him three textbooks in French on indexing and DLs: Langages documentaires et classificatoires : conception, construction et utilisation dans les systèmes documentaires (1987) [Documentary and classificatory languages: design, construction and use in retrieval systems], Actualité des langages documentaires: fondements théoriques de la recherche d’information (2002) [Documentary languages: theoretical foundations of information searching], and Concevoir l’index d’un livre: histoire, actualité et perspectives (2010) [Creating the back-of-the-book index: past, present and future].

This article summarizes the most significant of Maniez’s writings. Its four sections take up themes explored by the teacher-researcher from the time he completed his doctoral research. The thesis is at the heart of the first section, announcing the topics that would capture the linguist’s attention over the following 30 years: the structure and functions of DLs, classification and facets, indexing and indexes.


2. Syntax and information retrieval

Having completed postgraduate studies at the Université de Franche-Comté, Jacques Maniez produces an imposing report in 1976-1977. Distributed in two volumes under the title Le rôle de la syntaxe dans les systèmes de recherche documentaire [The role of syntax in information retrieval systems], the thesis is inspired by the work of Maurice Coyaud and Jean-Claude Gardin. Maniez reports on his in-depth analysis of the syntactic characteristics of several contemporary → retrieval systems; in doing so, he joins the numerous colleagues who are then examining the main factors influencing system performance. At the source of Maniez’s reflection lies a general question: to what extent does taking syntactic links into account in the representation of document content contribute to the efficiency of bibliographic searching? Two specific questions follow: (1) What is the average rate of relevance in retrieval systems working without syntax? and (2) Do syntactic procedures significantly increase the relevance rate? Demonstrating that DLs stand at the crossroads of common languages and artificial codes, Maniez suggests that a study borrowing concepts and methods from the science of language will offer interesting answers.

In the first part of his thesis, the researcher draws a parallel between the problems inherent to natural languages and those of DLs, highlighting their similarities and differences. Adopting a semantic perspective to study the syntax of → indexing formulas, he characterizes languages and systems, using as a basis the degree of complexity of the syntactic rules that apply to them. The absence of syntax leads to ambiguity in the indexing formula and is a source of noise at the time of retrieval. But ambiguity is not the only pitfall. According to Maniez, synonymy is even more of a problem since it generates silence. In search of solutions to reduce noise and silence in retrieval systems, he explores the main linguistic theories of the time: Chomsky’s transformational grammar, Saumjan’s applicative grammar, Tesnière’s dependency grammar, and semantic-based grammars such as Fillmore’s case grammar and Pottier’s actancial schema. In context, the models developed by Tesnière, Fillmore and Pottier are considered the most interesting of all.
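The notions of noise and silence used here can be made concrete with a small sketch. It assumes the definitions customary in French documentation practice: noise is the share of retrieved documents that are not relevant (the complement of precision), and silence is the share of relevant documents that are not retrieved (the complement of recall). The function name and the document identifiers are hypothetical and do not come from Maniez's thesis.

```python
def retrieval_rates(retrieved, relevant):
    """Return precision, recall, noise and silence for a single search."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return {
        "precision": precision,
        "recall": recall,
        # Noise: retrieved but not relevant, as a share of what was retrieved.
        "noise": 1.0 - precision if retrieved else 0.0,
        # Silence: relevant but missed, as a share of what was relevant.
        "silence": 1.0 - recall if relevant else 0.0,
    }


# Hypothetical search: four documents retrieved, three relevant in the collection.
rates = retrieval_rates(
    retrieved={"d1", "d2", "d3", "d4"},
    relevant={"d1", "d2", "d5"},
)
```

In this invented case two of the four retrieved documents are irrelevant (noise of one half) and one of the three relevant documents is missed (silence of one third).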

In the second part of the report, Maniez states that the only justification for an integration of syntactic procedures in a retrieval system is an improvement of the precision rate. To support this assertion, he evaluates five existing systems which apply syntactic rules, using three criteria: motivation (why use syntactic rules?), theoretical coherence (which syntactic rules are used?), and efficiency in terms of results. The five systems are: Gardin’s SYNTOL, Jason Farradane’s Relational indexing, → S.R. Ranganathan’s → Colon Classification, Derek Austin’s → PRECIS, and DOCILIS, a full-text retrieval system in the field of law. A detailed description of each system is followed by a review of its positive and negative characteristics.

SYNTOL impresses Maniez with its clarity, rigour, ingenuity and the probity and meticulous care with which it was designed. But SYNTOL has only been exploited in the context of experimental applications, and its contribution is more theoretical than practical. Maniez considers that its overly complex syntax is a weakness, but this has not prevented the application of the system to discourse analysis and automatic indexing. Maniez disapproves of Farradane’s Relational indexing, seeing it as an intellectual exercise lacking practical justifications and whose contribution is more significant by the rigour of the proposed method than by its set of syntactic relators. Ranganathan’s proposal to use articulated facets for the representation of complex subjects is considered one of the most significant proposals of the 20th century in the field of documentation. Maniez demonstrates that the → facet, as defined by Ranganathan, is both a classificatory category and a syntactic category, at the crossroads of the paradigm and the syntagm. Yet he finds the faceted model too simple, too static: the division of the world as proposed by the Indian master implies an unrealistic predestination of objects fixed in immutable roles. In his view, the weakness of the system lies in the dubious identity of ill-defined facets. Maniez’s interest in the faceted system is nevertheless such that he will devote a substantial paper to the topic some 20 years later (see section 4 below). Maniez is not very impressed by PRECIS, a mechanical system driven by coding considerations and computer processing requirements rather than by logic. Humorously, he compares PRECIS’ syntax to a refinement pertaining to luxury products, while acknowledging the qualities of a system more widely used than the previously described ones. Finally, in DOCILIS, syntactic procedures are used to decrease silence. Maniez remarks that, as is the case in the majority of retrieval systems, one loses in recall what is gained in precision.

The general conclusion is the most theoretical part of the report. Maniez observes, and this comes as no surprise, that syntax is a source of problems in document and information research, and that the simplest syntactic systems are better suited where efficiency and costs are of concern. Challenging Morris’ postulate that any system of signs presupposes semantics, syntax and pragmatics, the researcher demonstrates that, while syntax is essential to natural language, a universal communication tool, it remains optional in DLs. The cost-benefit ratio of an integration of syntactic rules in retrieval systems is far from favourable, and Maniez concludes that the only syntactic universal of interest is the aggregation of distinct terms, since it is impossible to foresee all subjects that will eventually have to be represented.

Maniez’s report is an interesting contribution to the body of research on the nature of connections between the science of language and documentation, and on the influence of one on the other. A wider dissemination of the thesis would have made it possible for the important work carried out by British and American IS specialists to be acknowledged in France at an earlier date. → Eric Coates, Jason Farradane, Gerald Salton, Karen Spärck-Jones, Cyril Cleverdon, Frederick W. Lancaster and Dagobert Soergel were among those who inspired and influenced Jacques Maniez.

As a follow-up to his doctoral research, Maniez publishes “Problèmes de syntaxe dans les systèmes de recherche documentaire” [Syntactic problems in document and information retrieval systems] in 1983. In this paper, the author develops several observations first made in the dissertation. Having redefined the concepts of semantics (the study of meaning in social communication) and syntax (the rules which make it possible to compose complex units of meaning from simple elements), Maniez insists on the fact that the two are not heterogeneous since syntax has an undeniable role to play in the production of meaning. And because semantics is at the heart of information retrieval, syntax-related issues are unavoidable. These issues that affect the quality of information retrieval are syntactic ambiguity in the absence of a context, and syntactic synonymy, i.e. the possibility of conveying the same message in various ways. According to Maniez, contemporary DLs, and among them the popular → thesaurus of descriptors then entering its golden age, do not make it possible to remove ambiguity in content representation. Solutions to the problem exist though. The insertion of links between units of representation reduces the number of false combinations. Roles determine the syntactic-semantic function of each term in the indexing formula. Maniez believes, however, that relational indexing, as applied in SYNTOL and in the Vercingétorix language at the Université de Clermont-Ferrand [2], is not the ideal solution. The author also addresses the issue of free-text searching using examples from the STAIRS [3] and QUESTEL [4] systems, and evaluates the syntactic efficiency of distance operators.
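The effect of roles on false combinations can be illustrated with a minimal sketch. It is not a reconstruction of SYNTOL or of any system Maniez examined: the two documents, the descriptors and the role labels (agent, process, object) are all hypothetical. The point is only that a search on bare descriptors cannot tell "evaluation of teachers by students" from "evaluation of students by teachers", whereas a search on descriptor-role pairs can.

```python
# Two hypothetical documents share the same descriptors but assign them
# different roles in the indexing formula.
documents = {
    "doc_a": {("students", "agent"), ("evaluation", "process"), ("teachers", "object")},
    "doc_b": {("teachers", "agent"), ("evaluation", "process"), ("students", "object")},
}


def search_without_roles(query_terms):
    """Match on descriptors alone, ignoring roles (no syntax)."""
    wanted = set(query_terms)
    return sorted(
        doc_id
        for doc_id, formula in documents.items()
        if wanted <= {term for term, _role in formula}
    )


def search_with_roles(query_formula):
    """Match only when every descriptor appears with the requested role."""
    wanted = set(query_formula)
    return sorted(doc_id for doc_id, formula in documents.items() if wanted <= formula)


# Query: evaluation of teachers by students.
without_roles = search_without_roles(["students", "evaluation", "teachers"])
with_roles = search_with_roles([("students", "agent"), ("teachers", "object")])
```

Without roles both documents are returned, and one of them is noise; with roles only the first is returned. The price, as the text goes on to note, is that indexers and searchers must both assign roles consistently.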

In his thesis, Maniez analyzed the usefulness of syntactic procedures from a theoretical perspective. Here, he looks at field practices, criticizing the fact that librarians rely mostly on Boolean operators to modulate the rates of precision and recall. He is convinced that syntactic procedures do increase the efficiency of retrieval systems. If the least complex systems remain so popular, it is because the amount of noise attributable to the absence of syntax cannot be measured. Developers of indexing languages do exercise some form of syntactic control, but they recognize that a complex syntax leads to more time-consuming and tedious indexing, more expensive computer processing, and greater difficulty for the information seeker. In relational indexing models, the gain in precision at the expense of completeness (recall) is not yet cost-effective.


3. Documentary languages (DLs)

Following this first foray into IS through information retrieval systems, it is to DLs as an essential component of these systems that Maniez will devote most of his work over the next thirty years.

With his extended knowledge of both linguistics and → terminology (as a science and a practice), Maniez was among the first to establish a parallel between language products issuing from two distinct fields, terminology and IS. In 1977, when “Terminologies et thésaurus: divergences et convergences” [Terminologies and thesauri: divergences and convergences] was published, the thesaurus had not yet acquired the degree of popularity it would enjoy over the following decades. In this paper, Maniez compares similar products, one created by terminologists, the other by information specialists. His comparison adopts the perspective of the user, whom he describes as a scientific or technical domain specialist, an information specialist, or a terminologist. The author observes that both language tools, while differing in structure and purpose, also have much in common. A terminology is a list of technical and scientific terms used in a particular field, together with clear definitions of the objects or processes represented. The thesaurus is a list of terms chosen to represent the content of documents or search queries in a field of knowledge, subsequently allowing for a quick and efficient selection of documents relevant to an information need; thesaurus terms, called → keywords or descriptors, are used for indexing and for searching.

Maniez describes significant differences, in terms of purpose, between the two instruments. On the one hand, the terminology is a reference tool used to clarify the meaning of an unknown or poorly understood scientific or technical term; its function is to facilitate the transition between common discourse and the scientific/technical universe, as well as to establish a common discursive basis among specialists. The set of terms constitutes a subset of the natural language lexicon, and the terminology is itself a specific type of dictionary. On the other hand, the thesaurus is a tool that adapts to the needs of information retrieval; the function of its elementary components, the descriptors, is to provide a series of labels allowing for a precise description of subjects discussed in a document, and facilitating the later identification of this document. The thesaurus is therefore a particular type of DL in its function as retrieval code. The author then sets out the differences in the content and structure of each list, basing his observations on three elements: explanatory value, completeness, and degree of normativity of the lexicon. In a terminology, terms are complemented by definitions which can be considered the centerpiece of the whole work. Maniez explains that the definition, a type of didactic discourse, connects the learned or technical word to the common vocabulary of a language, integrating the unknown into the already known. In the thesaurus, descriptors are normally accompanied only by a few semantically related terms (synonyms, broader, narrower and associated terms). The meaning of descriptors is assumed to be known by users who are specialists in the field. The choice of terms that will become descriptors is therefore made intuitively by indexers, possibly with the help of a terminology if one exists in the field. Definitions are rarely found in the thesaurus, and where they have been added, they are motivated by a particular, uncommon, use of a descriptor. As for related terms accompanying a descriptor, their function is to help indexers and searchers adjust their indexing formula and search query, rather than to specify the meaning of said descriptor. In the terminology, the inventory of terms aims at being as complete as possible, to include all denominations used in a domain. The thesaurus lexicon, by contrast, is always limited; synonyms are eliminated and the thesaurus cannot fall below a certain threshold of specificity since the presence of multiple specific terms with similar meaning is detrimental to the effectiveness of the search. Finally, Maniez considers that terminology and thesaurus are both normative tools in that they rely on a series of conventions; in this regard, the thesaurus is stricter than the terminology because its objective is bi-univocity, a prerequisite for the effectiveness of any information retrieval system.

Nevertheless, Maniez does not see the terminology and the thesaurus as radically different tools. By emphasizing what distinguishes them, the author has only tried to show the limits of the exchange of good practices their respective users can expect from them. This should not conceal their fundamental kinship: terminology and thesaurus share two essential characteristics. The first, formal in nature, is the use of → alphabetic rather than numeric or alphanumeric signs. The second, of a semantic nature, is the fact that the delimitation of the units of content adopts the language habits of the domain represented. Indeed, the terminology and the thesaurus complement each other. A potential application of such complementarity is the consultation of lists of technical terms, and clarification of the meaning of these terms, by thesaurus developers; terminologies are a primary source of concepts and terms when building a thesaurus, whether in a monolingual or a multilingual environment. Both tools are equally useful to scientific and technical translators.

In 1987, Maniez publishes Langages documentaires et classificatoires : conception, construction et utilisation dans les systèmes documentaires [Documentary and classificatory languages: design, construction and use in retrieval systems]. In his introduction, the author specifies that this textbook is a complement to the one just published by his colleague Georges Van Slype under a similar title: Les langages d’indexation : conception, construction et utilisation dans les systèmes documentaires (1987) [Indexing languages: design, construction and use in retrieval systems]. Van Slype is a consulting engineer, a specialist of indexing languages, and a lecturer at the Université Libre de Bruxelles. The two textbooks form a whole and are the result of a joint project, devised in 1985, with the objective of publishing a study of DLs intended for information professionals but accessible also to other specialists and to members of the general public interested in documentation; Van Slype would deal with indexing languages, and Maniez would focus on classificatory languages. The ambition of the authors was to subordinate theory to practice, without sacrificing the first to emphasize the second. Their goal was to show the importance and originality of two families of DLs and to describe their common features, putting an end to the controversy between supporters of traditional → classification structures and proponents of modern thesauri by demonstrating that each generated benefits in the information retrieval process.

Maniez’s textbook is divided into two parts. In the first part, the author characterizes general knowledge and bibliographic classifications. Among the systems created in the 19th century, Maniez describes the Brunet Classification, the Dewey Decimal Classification (DDC), and the Library of Congress Classification (LCC). Of structures born in the 20th century, the author chooses to present the Universal Decimal Classification (UDC) and Bliss Bibliographic Classification (BC). In his discussion of structures introduced after the Second World War, Maniez emphasizes the influence of UNESCO and of the International Federation for Information and Documentation (FID), before commenting briefly on systems developed in the USSR and in the People’s Republic of China.

Maniez suggests that the DDC has asserted itself on all five continents thanks to existing translations in numerous languages of its complete and abridged editions. He believes that the system’s weaknesses are of little weight when compared to its practical advantages: a logical → hierarchical structure, a decimal → notation easy to handle and memorize, a detailed relative index, and widespread use in national bibliographies, catalogues and indexes. Maniez then suggests that Paul Otlet’s UDC, an offshoot of the DDC, has evolved freely and assimilated enough new techniques to guarantee its autonomy in the face of its powerful model. Maniez describes the context of creation of the UDC, as well as its evolution and popularity in a good number of countries. But we know that while it is used in numerous Eastern and Western countries, UDC remains quasi-invisible in North America. According to Maniez, this may be attributed to three important weaknesses of this system: the sluggishness of the updating work, the lack of an index showing equivalence and associative relations and the initial decision to reuse the DDC main classes.

Regarding Ranganathan’s Colon Classification (CC), Maniez agrees that its development marks an important stage in the history of classification; it is indeed based on principles that appear to be well ahead of those which support other universal classification structures. But the CC seems incapable of imposing itself, especially outside the country of its creator. Maniez suggests that this results from the fact that the Indian master sacrificed convenience of use to the rigor of his theory of facets; the system can only accommodate the growing diversity of subjects at the cost of a complicated syntax and convoluted notation.

The Library of Congress Classification, born of the need for a complete reorganization of the Library’s enormous collections at the end of the 19th century, has gained some ground outside of its parent institution; it is used in most North American university and research libraries. Although specialists consider that LCC meets the minimum requirements of a functional classification system with its disjoint classes, rational sequence of classes, sufficiently logical order of subjects within classes and easy-to-memorize notation, Maniez admits to not understanding how such a traditional and antiquated system has spread so widely and can exert so much influence.

There is a certain kinship between Bliss’s BC and Ranganathan’s CC. Born 20 years before his colleague, → Bliss also devoted a lifetime of work to a theoretical reflection on classification. Maniez observes that the two thinkers influenced each other; this influence is notably visible in the evolution of Bliss’s thinking and work following the publication of the first edition of his classification system. In Maniez’s textbook, the second edition is described, with a presentation of the principles on which it is based, its structure, its notation and the system’s limitations.

The first part of the textbook is completed by two chapters devoted respectively to the construction and updating of classification structures, and to their operational use.

The second part of the textbook is a general study of DLs. First, Maniez defines their fundamental characteristics in relation to their functional particularities. Then, he focuses on the relationship between DLs and other types of languages with which they must interact. The author believes that one can only perceive the originality of DLs by looking at “their raison d’être (to find documents easily and quickly) and at the particular communication context in which they are used” (Maniez 1987, 236) [5]. He describes the semantic and syntactic dimensions of DLs, emphasizing in passing their formal rigidity. Although DLs are seen as artificial languages, they derive from natural languages and suffer from the same weaknesses; synonymy and polysemy are obstacles to the communication of information.

Maniez reminds us that most DLs, and in particular enumerative classification structures and post-coordinated indexing languages, function without syntactic rules. But from the moment a DL integrates an explicit syntax, one must be aware of the risks of increasing silence at the time of retrieval; indeed, syntactic devices “often make it possible to represent the same content with two different arrangements of concepts” (Maniez 1987, 245) [6]. Drawing from his doctoral dissertation, Maniez describes syntactic processes reminiscent of case grammars, both for classification (in the faceted model) and in the thesaurus. Roles such as agent, means, goal, etc., inspired by the work of Charles Fillmore, could be used for example to identify associative relationships in a thesaurus.

Using examples, Maniez describes the problems related to the semantic and syntactic ambiguity of natural language and of DLs before explaining the concept of univocity, a principle that must prevail within the latter type of language to ensure effective communication of information. The author reminds us that the closer a DL remains to everyday language, the greater the risk of increased silence and noise in document and information retrieval.

To introduce the principles of → automatic indexing based on linguistic knowledge and techniques, Maniez explains the documentary function of the text and demonstrates the importance of opposing full words or signifying units (noun, verb, adjective) to stop-words (article, adverb, preposition). Automatic term extraction and automatic indexing both depend on the correct identification of stop-words.
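The opposition between full words and stop-words can be sketched in a few lines. This is only an illustration of the principle, under the assumption of a tiny hand-made stop list; the list, the function name and the sample sentence are hypothetical, and real term-extraction systems rely on far longer lists and on morphological analysis.

```python
import re

# Hypothetical, deliberately tiny stop list (articles, prepositions, etc.).
STOP_WORDS = {"the", "of", "in", "a", "an", "and", "to", "is", "very"}


def candidate_terms(text):
    """Return the full words of a text, in order, with stop-words removed."""
    # [^\W\d_]+ matches runs of letters only, including accented letters.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


terms = candidate_terms("The role of syntax in information retrieval")
```

Only the signifying units survive as candidate index terms; a word wrongly placed on, or omitted from, the stop list directly affects what can later be retrieved.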

The second part of the textbook concludes with an assessment of the situation of DLs as of 1987. Maniez summarizes what brings natural languages closer to DLs, and what differentiates them; this comparison is still useful today. He explains that if DLs may appear closer to computer languages due to their degree of formalization, they remain connected to natural language through their semantic content.

Maniez is most interested in the thesaurus, the DL most often used in the 1980s and 1990s for information representation and searching. Published in 1988, the too rarely cited “Relationships in thesauri: some critical remarks” is an enlightening paper including realistic recommendations. The author defines and contrasts two types of relationships. Paradigmatic relationships bind two concepts intrinsically, permanently and independently of any context (inter-conceptual relationships), or a concept to the class to which it belongs (structural relationships). Paradigmatic relationships are essential since the objective of classification and indexing is to use the same terms to represent the same subjects within indexing formulas and search queries. Syntagmatic relationships are the result of a creation process; they are impermanent relations between independent terms needed to represent a new concept. In an artificial DL, only paradigmatic relations are essential. Syntagmatic relations, however, are useful in faceted languages and in SYNTOL-type information systems. It should be noted that, in this article, Maniez considers syntactic processes applied at the time of use of the DL and not the syntax inherent in some pre-coordinated DLs, a hierarchical classification system for example.

Information searchers want to retrieve all documents that are relevant to their needs (recall/exhaustiveness) and only those documents that are relevant to their needs (precision). In this context, thesaural relations are useful. When the indexing formula is created, relations guide the indexer to the most specific terms that are available to represent a subject; these same relations later lead the user to the same terms, ensuring a satisfactory degree of recall and precision. Thesaurus relations also make it possible to expand logically the scope of a search, using semantically related concepts to increase exhaustiveness. Maniez notes that, although it is at the heart of the thesaurus structure, the network of relations is not clearly and explicitly characterized; the identification of semantic links between descriptors appears to be the product of intuition rather than the result of an evaluation of their potential usefulness.

Maniez identifies the three common types of relations that make up the structure of the thesaurus. He assesses their usefulness and questions their actual value in the indexing and searching processes. The relation of equivalence, for one, could advantageously be replaced by a set of automated links between synonyms and quasi-synonyms, thus eliminating the obligation to select a preferential term. As for the hierarchical relationship, its usefulness is restricted because it is indifferently used to link species to class (the thumb is a finger) or part to whole (the finger is part of the hand); since the transfer of essential characteristics is not the same in these two cases, hierarchical relationships cannot be exploited automatically to expand logically the scope of a search. Finally, Maniez suggests that the relevance of the associative relation is even more difficult to ascertain under the current conditions of its use. Indeed, since it is too loosely defined, it often links terms with little semantic affinity, and it seems of little use in making a search query more specific. The author sees only three cases of potential usefulness for the associative relation: semantic quasi-synonymy (an extension of the true lexical and/or conceptual synonymy), semantic overlap (when a term is used in the definition of another) and extra-semantic association based on facts and experience rather than terminological kinship (the causal relation for example). Maniez concludes by presenting a functional view of the thesaurus and its role in information retrieval. The transformation of the tool would imply the elimination of the associative and hierarchical relations; the former is deemed useless, and the latter generates too much noise. Each descriptor, integrated into a relevant class, would then be surrounded only by terms useful to the logical broadening of a search. A simplified structure would facilitate the choice of the most relevant descriptors for the generation of indexing and searching formulas, as well as the enrichment of the latter to ensure identification of the highest number of relevant sources.
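The difficulty raised about hierarchical relations can be sketched as follows. This is an illustration of the species/class versus part/whole distinction only, not a reconstruction of the restructured thesaurus Maniez proposes; the thesaurus fragment, the relation names and the function are hypothetical. The sketch assumes that the two kinds of hierarchical link are recorded separately, which is precisely what the thesauri he criticizes do not do.

```python
# Hypothetical thesaurus fragment; the relation labels are illustrative only.
EQUIVALENT = {"digit": "finger"}  # non-preferred term -> preferred term
NARROWER_GENERIC = {"finger": ["thumb", "index finger"]}  # species of a class
NARROWER_PARTITIVE = {"hand": ["finger", "palm"]}  # parts of a whole


def expand(term):
    """Expand a search term with its species, but not with its parts."""
    preferred = EQUIVALENT.get(term, term)
    expanded = [preferred]
    queue = [preferred]
    while queue:
        current = queue.pop(0)
        # Only generic (species-to-class) links are followed automatically.
        for narrower in NARROWER_GENERIC.get(current, []):
            if narrower not in expanded:
                expanded.append(narrower)
                queue.append(narrower)
    return expanded


finger_query = expand("digit")
hand_query = expand("hand")
```

A search on "finger" can safely be broadened to "thumb", since every thumb is a finger. A search on "hand" is left unexpanded here, because a document about a part is not necessarily about the whole; when both kinds of link are stored under a single hierarchical relation, this distinction is lost and automatic expansion becomes unreliable.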

Five years later, in “L’évolution des langages documentaires” (1993) [Evolution of documentary languages], Maniez revisits the basics when he describes the characteristics and assesses the usefulness of DLs in the context of rapidly evolving automation processes and proliferation of full-text search systems. At the end of a century which has seen information systems become increasingly efficient, a major obstacle remains: for the retrieval of relevant information to be feasible, the language of the indexer, of the searcher, and to a certain extent that of the document creator must still coincide. This is why specialists have conceived of an artificial language in which control of the lexicon and of the meaning is achieved by seeking a one-to-one correspondence between a concept and its representation. This language is ideally used by the indexer and by the searcher, or by the indexer and the automated system to which the searcher has delegated the search.

Maniez describes the changes that have marked the evolution of two main types of DLs, namely hierarchical classifications and analytical languages such as authority lists, thesauri and faceted languages. Hierarchical structures group documents according to their content. The classificatory language embeds two subsystems: a hierarchical list of all subjects, and a representation of each subject in the form of a more or less expressive notation. The author offers a rather negative assessment of the classification model, citing its rigidity, the inadequacy of semantic relations around which it is structured, and the irregular updating of structures, too dependent on local and economic conditions. While recognizing that the most popular classifications seem to be adapted to the needs of users, he does not envision any new applications of classification to information retrieval made possible by contemporary information systems.

Maniez claims that analytical languages deserve more attention than classifications since post-coordinate indexing is gaining in popularity. In this new model, the subject is no longer expressed as a single term or expression, a class number or a pre-formatted subject heading for example, but by a set of concepts represented by terms that are more or less controlled and remain independent from one another in the system. With a limited number of terms in the lexicon, it then becomes possible to represent a very large number of subjects. The passage through an artificial or coded language is no longer essential, and it is no longer required that the indexing formula and the search query coincide perfectly for relevant information to be retrieved. However, when analytical languages are used, the absence of syntax within the indexing formula may also constitute a weakness. This is why the faceted approach appears potentially more efficient since it determines priority and emphasizes dependency. The use of a faceted language in indexing is rooted in the following postulate: any subject can be broken down into a series of concepts connecting to fundamental categories (agent, process, product, attribute, place, time, etc.) whose order and function in the indexing formula is determined by syntactic rules. That being said, Maniez recognizes again that if the faceted language is theoretically the most appropriate to ensure precision and efficiency, it is the least popular because the complexity of the model renders the indexing process time-consuming and costly.
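The post-coordinate principle described above can be illustrated with a minimal sketch. It is not taken from Maniez's writings: the document identifiers, the descriptors and the function names are hypothetical, and the Boolean AND search is only one simple way of combining independent terms at query time.

```python
from collections import defaultdict

# Hypothetical mini-collection: each document carries a set of
# independent descriptors (no syntax links them to one another).
INDEXING = {
    "doc1": {"libraries", "automation", "France"},
    "doc2": {"libraries", "classification"},
    "doc3": {"automation", "thesauri", "France"},
}


def build_inverted_file(indexing):
    """Map each descriptor to the set of documents indexed with it."""
    inverted = defaultdict(set)
    for doc_id, descriptors in indexing.items():
        for descriptor in descriptors:
            inverted[descriptor].add(doc_id)
    return inverted


def search(inverted, *descriptors):
    """Combine descriptors at search time with a Boolean AND."""
    postings = [inverted.get(d, set()) for d in descriptors]
    return set.intersection(*postings) if postings else set()


inverted_file = build_inverted_file(INDEXING)

# The query uses only two of the three descriptors assigned to doc1:
# indexing formula and query do not have to coincide perfectly.
result = search(inverted_file, "libraries", "automation")
```

With three descriptors per document, the same small lexicon can express many different subjects. The sketch also shows the weakness noted in the text: since the descriptors are unordered, nothing in the indexing formula says which concept plays which role in the subject.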

In 1993, technological innovations make automatic indexing and full-text searching possible, eliminating the need for an intermediary between the researcher and the content of a collection; Maniez applauds this evolution. In his view, this does not condemn DLs to oblivion; they remain essential for lexical standardization. The link between DL and technology is indeed an interesting one. On the one hand, technology supports the creation and updating of DLs, and facilitates their integration into “intelligent” systems. On the other hand, technology competes with DLs because it diminishes the importance of the subject in the search for relevant documents, now also accessible via straightforward identification criteria such as date, language, document type, etc.

The multiplication of specialized databases and controlled vocabularies creates a problem of compatibility in which Maniez is naturally interested. Compatibility is at the forefront of IS specialists’ preoccupations at the end of the 1990s when he publishes “Database merging and the compatibility of indexing languages” (1997b). The author observes that the existence of more than 500 relevant articles, all published between 1950 and 1990, testifies to the growing importance of the issue. In this paper, Maniez demonstrates that the availability of powerful information technology makes it possible to envision the development of less costly and more efficient gateways between databases.

Every language is based on specificities distinguishing it from all others; these specificities generate communication difficulties. But Maniez reminds his readers that linguistic convergence can be achieved and could be sought at two levels: that of the language itself and that of the statements produced with this language. Approaching the compatibility issue from a linguistic perspective, the author outlines potential solutions to the problem. It must be remembered that DLs stem from a desire to achieve conceptual compatibility, i.e. an agreement between the creator of a document, the indexer and the searcher on the meaning of terms they all use. The search for convergence between DLs is a form of integration whose ultimate degree would be the universal adoption of a single, common indexing language, as independent as possible of a particular natural language. The process of harmonizing DLs may be more realistic but it is just as difficult to achieve; it is limited by the difficulty of reconciling different relational structures, as would exist in distinct natural or speciality languages. It seems more feasible then to harmonize indexing formulas, to translate formulas used in one database into the indexing language used to access another. Maniez points out, however, that the difficulties inherent in translating from one natural language into another will also manifest themselves in an artificial linguistic environment. To reduce the impact of these difficulties, solutions are proposed. The creation of a table of concordance based on the relation of equivalence allows for rudimentary translation, and it may be sufficient to reconcile small databases. The creation of a switching language appears more efficient since it does not require bidirectional conversion between all DLs used to access a single or multiple databases; all that is required is the compatibility of each DL with an intermediate language, a classification structure or a multilingual thesaurus for example.
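The economy of the switching language can be made concrete with a short sketch. It is an illustration under stated assumptions rather than a description of any system discussed by Maniez: the two thesauri, their terms and the concept codes are invented, and each DL is assumed to map to the intermediate language by simple equivalence.

```python
def pairwise_conversions(n_languages):
    """Directed conversion tables needed if every DL maps to every other."""
    return n_languages * (n_languages - 1)


def switching_conversions(n_languages):
    """Directed tables needed if every DL maps only to and from a hub."""
    return 2 * n_languages


# Hypothetical equivalence tables: term in a DL -> code in the
# intermediate (switching) language.
TO_SWITCH = {
    "thesaurus_a": {"automobiles": "C001", "lorries": "C002"},
    "thesaurus_b": {"cars": "C001", "trucks": "C002"},
}


def translate(term, source, target):
    """Translate a descriptor from one DL to another via the hub.

    Returns None when no equivalent exists on either side.
    """
    concept = TO_SWITCH[source].get(term)
    if concept is None:
        return None
    # Invert the target table: hub code -> term in the target DL.
    from_switch = {code: t for t, code in TO_SWITCH[target].items()}
    return from_switch.get(concept)


translated = translate("automobiles", "thesaurus_a", "thesaurus_b")
```

For five DLs, direct conversion in both directions would require 20 tables, against 10 with a switching language, and the gap widens as more DLs are added. The sketch handles exact equivalence only; partial or one-to-many correspondences, which are the hard cases in practice, are left out.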

The publication in 2002 of Actualité des langages documentaires [Modernity of documentary languages] was eagerly awaited in the French-speaking IS community; the last textbook published in French on the subject dated back to the 1980s. This work provides an update of the state of the art on DLs and aims to “defend the relevance of documentary languages today, [because it is] threatened by the ‘insolent success’ of the searching methods offered on the Web” (Amar 2002, 112) [7]. The automated processing of textual information, facilitated by the progress of linguistic technologies, is increasingly favoured, reinforcing the impression that DLs have become obsolete, an “endangered species” (Maniez 2002, 8) [8]. Maniez makes it his mission to demonstrate that the opposite is true in a convincing, rigorous way, and by thinking outside the box. Instead of drawing up a simple description of the situation, he chooses to prove that the existence of DLs is justified by the real and permanent requirements of information searching, as well as by the linguistic reasons put forward in the body of literature devoted to them.

The first chapters describe the close relationship between information systems and DLs: this is where the essential character of DLs must be drawn from. From the outset, the author introduces several elements that are essential to the study of search systems (functions, user habits and behaviour, technologies, object search versus information search, etc.). He differentiates target and search keys, ranking and classification, controlled vocabulary and documentary language. Maniez addresses the need for information within the framework of human communication and dialogue. Having distinguished systematic or on-demand dissemination and information searching, the author draws up a typology of retrieval systems on the basis of the object sought: factual information, contextualized information or documents.

In the eyes of the author, the chapter titled “Recherche de documents à partir d’un sujet” [Finding documents on the basis of a subject] is central to the textbook. In this chapter, Maniez reflects on the specificity of the document as object, on the specificity of information within the document, and on the issue of the search key, namely the → subject and its representation. The feasibility of subject searching is founded on the assumption that there exist identical subjects in the minds of searchers and in the documents of a collection. But these subjects do not express themselves in the same way: the former are articulated clearly in the form of a question, the latter exist in a diffuse way and must be brought to light explicitly.

Maniez exploits linguistic theories, discourse analysis protocols and methods, and hermeneutics to describe indexing problems. His presentation of the practical aspects of indexing reuses examples disseminated in his previous works. He also demonstrates the importance of the tools used to facilitate searching. Because computer technologies tend to amalgamate natural, indexing and query languages, the author argues that the availability of standardized terminological and semantic tools becomes all the more necessary. Maniez draws up a new typology of DLs based on the minimum unit of representation (theme, syntagm, concept) and on a definition of semantic codes of representation. Once the three fundamental components of DLs have been introduced, namely the notion of relation (with references to logic), the notion of facet (in its current sense and according to the meaning given to it by Ranganathan) and the notion of classification, the analysis of three types of DLs can begin. These are: hierarchical languages, syntagmatic languages, and descriptor-based languages. For each type of DL, its historical evolution, essential characteristics and prospects for use in a computerized environment are presented and commented on. An entire chapter of the textbook is devoted to problems of compatibility in an era of globalization, a subject for which Maniez maintains a strong interest.

Actualité des langages documentaires is the most accomplished and stimulating of Maniez’s contributions to documentation and IS. The textbook is an opportunity for its author to update, restructure and synthesize the theoretical and practical knowledge he has himself acquired and processed over the previous 25 years. With this text, he demonstrates not only his mastery of concepts, but also his ability to communicate in a clear style, integrating illustrations, examples, and the occasional touch of humour. The content is supported by a rich bibliography, and the subject index testifies to the exhaustiveness and relevance of the study.

In 2007, Maniez publishes his ultimate reflection on DLs, in a context that he describes as a welcome return to a familiar country: he has been invited to prepare the introductory text to a special issue on DLs of Documentaliste – Sciences de l’information. In the first part of this piece, the author summarizes brilliantly the articles that make up the issue, pleased to see that classification systems and thesauri continue to play their role of semantic mediators in the communication of information. In the second, more personal part, Maniez questions the very notion of DL, while assessing the likelihood that terms then in vogue (→ ontologies, → taxonomies, → knowledge organization systems, → folksonomies) will impose themselves in the communities that should be concerned with them. This personal reflection is launched when Maniez observes that the phrase langage documentaire, widely used in France and in Québec, is in fact poorly defined and has not succeeded in establishing itself in IS as a universal concept. The expression has no exact equivalent in the English-speaking world, where the pair “indexing languages and classification languages” is commonly preferred to represent the same object. Paradoxically, it is by our American and British colleagues that DLs have been studied most often, both from a theoretical and an application perspective. Is the lack of scientific visibility of the object in the Francophone research world due to the narrow definition of its nature and function, to its characterization as an artificial language very closely related to the traditional conception of the → document and to human-based description and indexing practices? Could it be because DLs prioritize lexicon over syntax? Maniez admits that DLs are associated with an era of supremacy of printed documentation and denounces the fact that they are underused in full-text searching contexts. But he points out that his own reading of the articles in this issue confirms that the so-called innovative tools available to organize data and represent content remain similar to traditional DLs, even if the authors do not explicitly refer to them or even acknowledge any kinship connection with them.

Forever concerned with terminological precision, Maniez remarks that the new language tools for the representation and retrieval of information are not more clearly characterized and defined than their predecessors. However, these tools have the merit of staying closer to natural languages and of being more flexible, capable of integration into information systems relying increasingly on artificial intelligence. While acknowledging the value of these knowledge organization systems, the author wonders about their viability in the long term. Indeed, the very notion of knowledge organization systems is vague. Furthermore, the concept is claimed as theirs by several disciplines (philosophy, linguistics, computer science, economics, etc.), each discipline using it in its own particular way.

4. Classification and faceted classification

Although a sizeable section of his textbook Langages documentaires et classificatoires [Documentary and classificatory languages] was occupied by descriptions of contemporary classification systems (see section 3 above), Maniez was less interested in the process and tools of classification than he was in verbal languages. His position on the place and function of classifications in information retrieval remains ambiguous. Throughout his career, he appears to have retained a traditional vision of the classification system, describing it as a useful tool for document organization and not considering its potential as an aid to searching.

The evolution of information systems at the turn of the 1990s sees information specialists questioning the role of traditional bibliographic classifications within search environments capable of retrieving large numbers of relevant sources with simple access keys. It is therefore appropriate to wonder whether the rigid structures that belong to the oldest generation of DLs are still useful, considering the capabilities, flexibility and speed offered by contemporary systems. Maniez participates in this debate with a contribution to the 1990 Rome conference focusing on DLs in databases. In “Are classifications still relevant in databases?” (1991), he describes the main advantages offered by classifications: their comprehensive and coherent approach to the organization of subjects, their systematic hierarchical structure, and their coded notation independent of any natural language. This last property makes them particularly attractive in multilingual databases and in any other environment where compatibility is required. The author observes that some form of searching language is used in the vast majority of bibliographic databases, with only half of them exploiting a systematic classification. Post-coordinated DLs are preferred, even if disadvantages linked to their use are well documented. Maniez suggests that combining descriptors and class numbers would be an effective means of ensuring relevance and precision, since it would better control any potential ambiguity in the representation of a subject. But for an optimization of its value in searching, the classification notation must be expressive (as is the UDC notation for example) and presented to the information searcher via a device making it easy to use, and possibly even transparent.

Would the faceted classification, whose popularity is growing at the end of the 20th century, be a more interesting option? Maniez does not believe so because the complications generated by the use of this type of DL seem too important to overcome. Representing or searching for a complex subject with facets requires a good knowledge of the structure and workings of a system that is anything but user-friendly. Maniez does not believe that the faceted classification can adapt to the requirements for quick data access.

Maniez was already interested in the notion of facet, as defined by S.R. Ranganathan, in his doctoral dissertation. Twenty years later, he revisits the subject in “Du bon usage des facettes: des classifications aux thésaurus” (1999) [The proper use of facets: from classifications to thesauri]. In this substantial contribution, he focuses again on the most important weakness of the now popular faceted approach, namely the ambiguous definition of the role and nature of the facet. The facet is an essential component of the Colon Classification and of the knowledge organization principles stated by Ranganathan. According to Maniez, the master has unfortunately chosen a term already loaded with meaning to name a new concept that he then places at the heart of his theory and at the foundation of his classification system. Maniez points out in passing that this phenomenon is reminiscent of what happens in the broader context of IS, where terminological ambiguity is the norm rather than the exception; he has himself, more than once, emphasized the lack of rigour shown by information scientists in their definition and use of fundamental concepts.

Maniez analyzes Ranganathan’s proposal from angles corresponding to two axes of language: the paradigmatic axis and the syntagmatic axis. He recognizes the originality and audacity of the master who has conceived of a subject representation model in which these two axes merge in the notion and term of facet. Indeed, Maniez demonstrates that the Personality, Matter, and Energy facets are both paradigmatic (indicative of nature or essence) and syntagmatic (indicative of function or role); this generates a significant degree of ambiguity that various attempts at rectification, that of the Classification Research Group (CRG) among others, have only made more apparent. Maniez points out that Ranganathan’s indexing formula does assign a function to each concept in the representation of a subject. In CC, an inventory of concepts is available, but the relation between concept and facet is established at the time of → subject analysis only, is meaningful in a particular context only, and is not permanent. The author offers a terminological solution to the ambiguity thus introduced, knowingly or not, by the Indian thinker. Faced with the apparent impossibility of creating distinct terms to represent different meanings, he suggests adding the qualifier classificatory to the term facet, so that it acquires a meaning specific to IS. He then distinguishes two categories of classificatory facets. Categorial facets are applicable to any classification of concepts, independently of domains, while structural facets represent the essential constituents of an entity or subject. Categorial facets are located on the paradigmatic axis; structural facets are aligned on the syntagmatic axis. Although intriguing, this terminological clarification does not resolve the original ambiguity as to the nature and function of the facet, and at the time of publication of this article, Maniez recognizes the limits of his proposal.

Even if Maniez’s remarks testify to his vision of the facet as a basis for classification, as a structuring device rather than as a navigation and searching aid, they remain relevant to this day. The fusion/confusion of semantics and syntax persists and it should be re-examined to feed the reflection on facet theory as well as critical evaluations of the many applications that are made of it today.

5. Indexing and indexes

Maniez was more concerned with languages than with the processes or products of indexing. In the early 1990s, however, he co-signs with British researcher D.J. Foskett the article titled “Indexation” in Encyclopædia Universalis. Its title is misleading since the article describes, albeit briefly, all operations related to traditional document processing, namely descriptive cataloguing, classification, abstracting and indexing. We hypothesize that Maniez was in fact the translator of the first part of the article. An additional section, attributed precisely to Jacques Maniez, discusses indexing and artificial intelligence. Observing that indexing is no longer a process exclusive to human intelligence, Maniez reviews the benefits of automatic indexing as it has been practiced for more than a decade: it is less expensive, more objective, faster and capable of analyzing the text of a document and users’ questions in exactly the same way. Maniez emphasizes, and it is the objective of his contribution, that advances in artificial intelligence make it possible to envisage great improvement in the quality of automated text analysis, as systems can now take into account the frequency of words, and are getting better at mastering techniques of morpho-syntactic analysis and at recognizing noun phrases and their functions in a sentence. Expert systems, which combine a network of semantic relationships and content analysis techniques, can even determine which subjects need to be represented. Maniez maintains that significant technological developments in machine translation also apply both to IS generally and to document analysis and representation specifically.

Maniez’s most significant contribution to the field of indexing is found elsewhere, in a “how-to” manual on back-of-the-book indexing, written in collaboration with Dominique Maniez of Université de Lyon. Published in 2010 as Concevoir l’index d’un livre: histoire, actualité et perspectives [Creating the back-of-the-book index: past, present and future], the textbook fills a void. In the English-speaking world, the preparation of back-of-the-book indexes is a legitimate professional occupation and textbooks on indexing abound. In Great Britain, where the importance and usefulness of the index was recognized as early as the 17th century, as well as in the United States and Australia, such textbooks exist in multiple editions and are regularly updated. The two authors have used them as their source of inspiration, France having long ceased to be a leader where alphabetical access to the subject content of documents is concerned.

There are two distinct sections in this work. The first section covers the technical aspects of book indexing. It introduces theoretical notions before addressing each stage of the index production and identifying the tasks that can be automated. All chapters are rich with definitions, examples, and links to additional sources. Full chapters are devoted respectively to vocabulary control and to syntactic processes in the index; Jacques Maniez’s mastery of these topics reveals itself in a text much more pragmatic than those he has published before. Several technical sections, most likely written by Dominique Maniez, present different computer-assisted solutions available to index designers: word processing, embedded indexing, dedicated indexing software, etc. The second section of the textbook outlines the history of book indexes, from ancient to contemporary times. Anybody interested in indexing and indexes will be delighted by the quotations and examples dotting these chapters in which we learn that, paradoxically, indexes were very common and popular in France in the 17th and 18th centuries; even novels and works of poetry were often published with an index. A description of the 21st century index precedes a chapter devoted to the automatic indexing of a book. The presentation of the problems and ongoing searches for solutions in this field is one of this textbook’s most interesting and important contributions.

6. Conclusion

Jacques Maniez’s theoretical and practical competencies resided at the crossroads of two disciplines, IS and language sciences. Maniez did not put forward theories, methods or recommendations. His writings were largely inspired and supported by his extensive knowledge and deep understanding of the fundamental elements of linguistics and of essential documentary processes (classification and indexing); this enabled him to name, interpret and explain the key concepts in our field. By bringing several principles of knowledge organization, document analysis and information retrieval under the magnifying glass of the science of language, he made a significant contribution to French thinking on these subjects. He also showed that it is the need to have quick access at all times to basic knowledge and to “know-how” (savoir-faire) that has given rise to a series of increasingly complex knowledge organization techniques.

Without going so far as to predict a brilliant future for them, Maniez believed in the sustainability of documentary languages, and in that he was not mistaken. The indexing and searching languages of today are less visible in the information processing chain, are referred to by other names, and are increasingly used in the background of information systems by so-called artificial intelligences. But their primary function has not changed. These standardized tools facilitate the transfer of relevant information and documents in monolingual and multilingual environments, automated or not, on the Web as in traditional libraries. For students needing to familiarize themselves with the concepts and terminology of indexing languages, for information specialists and professionals who wish to refresh their basic knowledge in the field, for all those who want to better understand the complexity of the processes involved in today’s information representation and retrieval, Jacques Maniez’s work remains accessible, useful and important.

Endnotes

1. To remain faithful to Maniez’s writings in French and in English and to the French tradition in knowledge organization, we use throughout a literal translation of the French langages documentaires, namely documentary languages (abbreviated DLs) rather than the common English equivalent classification and indexing languages. The term was originally used by Maurice Coyaud and Jean-Claude Gardin in the 1960s.

2. This system is described at https://hal.archives-ouvertes.fr/hal-01071423.

3. https://en.wikipedia.org/wiki/IBM_STAIRS.

4. https://questel.com/.

5. In French: “de leur raison d’être (retrouver facilement et rapidement les documents) et du contexte de communication particulier dans lequel on les utilise”. Our translation.

6. In French: “permet souvent de représenter le même contenu à partir de deux agencements différents des concepts”. Our translation.

7. In French: “défendre la pertinence des langages documentaires aujourd’hui, menacés par le ‘succès insolent’ des modes de recherche proposés sur le Web”. Our translation.

8. In French: “une espèce en voie de disparition”. Our translation.

Bibliography


Works by Jacques Maniez

Maniez, Jacques. 1977. “Terminologie et thésaurus : divergences et convergences”. In Terminologies 76 : Colloque international, Paris-La Défense, 15-18 juin 1976. Paris: La Maison du dictionnaire, p. IV39-IV50.

Maniez, Jacques. 1978. Rôle de la syntaxe dans les systèmes de recherche documentaires. Thèse de doctorat, Université de Besançon, sous la direction de Jean Peytard. Tome 1: Aspects linguistiques. Tome 2 : Étude critique de quelques systèmes de recherche documentaire.

Maniez, Jacques. 1980. Quatre leçons d'introduction à la linguistique générale. Dijon, IUT. 16 leaves.

Maniez, Jacques. 1983. “Problèmes de syntaxe dans les systèmes de recherche documentaire”. Documentaliste, 20, 2: 52-58.

Maniez, Jacques. 1985. “Outline of a Methodology for Training in Librarianship”. Education for Information 3: 51-54.

Maniez, Jacques. 1987. Les langages documentaires et classificatoires: conception, construction et utilisation dans les systèmes documentaires. Paris, Éditions d’Organisation.

Maniez, Jacques. 1988. “Relationships in Thesauri: Some Critical Remarks”. International Classification, 15: 133-138.

Maniez, Jacques. 1990. “Are Classifications Still Relevant in Databases?” In Documentary Languages in Databases: Papers from the Rome Conference, December 3-4, 1990. Frankfurt/Main, Germany, Indeks, 120-129. (Advances in Knowledge Organization, 3)

Maniez, Jacques. 1991. “A Decade of Research in Classification”. International Classification, 18: 73-77. [Homage to → Éric de Grolier in a special issue of International Classification.]

Aitchison, Jean and Alan Gilchrist. 1992. Construire un thésaurus: manuel pratique, traduction Dominique Hervieu, révision scientifique Jacques Maniez. Paris, ADBS Éditions.

Maniez, Jacques. 1992. Los lenguajes documentales y de clasificación: concepción, construcción y utilización en los sistemas documentales. Madrid, Spain, Germán Sánchez Ruipérez Foundation.

Maniez, Jacques. 1993. “L’évolution des langages documentaires”. Documentaliste - Sciences de l’information 30: 254-259.

Maniez, Jacques. 1997a. “Fusion de banques de données documentaires et compatibilité des langages d’indexation”. Documentaliste - Sciences de l’information, 34: 212-222.

Maniez, Jacques. 1997b. “Database Merging and the Compatibility of Indexing Languages”. Knowledge Organization, 24: 213-224.

Maniez, Jacques. 1998. Structures and Relations in Knowledge Organization: Proceedings of the Fifth International ISKO Conference, 25-29 August 1998, Lille, France. Widad Mustafa El Hadi, Jacques Maniez, A. Steven Pollitt, editors. Würzburg: Ergon.

Maniez, Jacques. 1999. “Du bon usage des facettes : des classifications aux thésaurus”. Documentaliste – Sciences de l’information, 36: 249-262.

Maniez, Jacques. 2001. Logotel pratique : manuel-disquette d'auto-formation. Paris, ADBS.

Maniez, Jacques. 2002. Actualité des langages documentaires: fondements théoriques de la recherche d’information. Paris, ADBS Éditions.

Maniez, Jacques. 2007. “Langages documentaires et outils linguistiques: principes, usages, perspectives : rupture ou continuité ?” Documentaliste - Sciences de l’information, 44: 12-16.

Maniez, Jacques. 2009. Concevoir l’index d’un livre: histoire, actualité, perspectives. Dominique Maniez, co-auteur. Paris, ADBS Éditions.

Foskett, Douglas J. and Jacques Maniez. n.d. “Indexation”. In Universalis.fr. https://www.universalis.fr/encyclopedie/indexation (accessed 19 October 2022).

Other references

Amar, Muriel. 2002. “Jacques Maniez, Actualité des langages documentaires: fondements théoriques de la recherche d'information”. Bulletin des bibliothèques de France, 2002, no 5: 112.

Blanquet, Marie-France. 2002. “Jacques Maniez, Actualité des langages documentaires: fondements théoriques de la recherche d'information, notes de lectures”. Documentaliste - Sciences de l’information, 39: 234.

Hudon, Michèle. 2018. “Du bon usage des facettes: un linguiste revisite la théorie de Ranganathan”. In Fondements épistémologiques et théoriques de la science de l’information-documentation : hommage aux pionniers francophones : Actes du 11e Colloque ISKO-France, 11-12 juillet 2017, Paris, France, p. 70-83. Londres: ISTE.

Van Slype, George. 1987. Les langages d’indexation: conception, construction et utilisation dans les systèmes documentaires. Paris, Éd. d’Organisation.

Version 1.0 published 2022-12-19

Article category: Biographical articles

©2022 ISKO. All rights reserved.