1 Patent Classifications

Patent offices routinely assign patents to classes. The classification process follows international standards, which were established long ago and are regularly updated to keep pace with technological evolution. These classifications are, in turn, adopted for a large variety of practical and analytical purposes, ranging from prior art search to econometric modeling.

Patent classifications follow a mix of functional and industrial criteria. By functional we mean a classification that is based on the engineering principles underlying the technical invention.

In the engineering literature, a function is defined as follows [40.1, p. 72]:

The function is a property of the technical system, and describes its ability to fulfill a purpose, namely to convert an input measure into a required output under precisely given condition.

A function is a transformation that takes place in the physical space. From a representational point of view, it can be described by a verbal expression (functional verb), usually associated with an object and a modal expression.

This notion is, needless to say, crucial for the working of the entire edifice of the patent system. Functional descriptions deal with the novelty and utility requirements of patents. The notion of utility requires that the invention deliver something of value for a user, at least potentially. For the utility to be delivered, what is needed is precisely that a function (perhaps a complex one) is implemented. The notion of novelty requires that this function has never been implemented the way it is implemented in the patent application. More rarely, the function itself is an invention, in the sense that it was not conceptualized before the invention. Industrial criteria, by contrast, refer to potential users of inventions. They deal mainly with the industrial requirement of patents.

The functional language is crucial in the definition of invention, as well as in the doctrine of functional equivalence, stating that solutions that realize the same function should be considered protected by the same patent. Furthermore, it can be shown that some of the most relevant changes in legislation and practice in the last decades, namely the patentability of software and of biotechnological inventions, have been promulgated on the basis of scientific theories about the alleged functional impact of patentability [40.2].

Notwithstanding the foundational role of the concept, legal and economic doctrines and practices rely on a relatively informal definition of functions. The International Patent Classification (IPC) and the Cooperative Patent Classification (CPC), for example, use a relatively loose definition of functions, and use it in a non-systematic and incomplete way. Furthermore, there has not been any systematic connection with the disciplines in which a great deal of work has been done to define and formalize functions, such as theories of engineering design, theories of systematic invention, or functional analysis.

In this chapter, we argue that the current classification system of patents, despite a significant effort to update it, is not able to keep up with the acceleration of technological developments. New pervasive and transversal technologies, broadly defined as digital technologies, have almost destroyed industrial boundaries and opened new forms of lateral and transversal competition. New products and services are developed by combining existing technologies in novel ways. Very often, inventions that were initially conceived for one industry find applications in entirely new domains. To mention just a couple of examples, blockchain solutions, which were initially conceived for the financial industry, now find applications in agriculture, where they are used to certify the place and time of an inspection of a protected label. Similarly, Second Life, which was intended as an entertainment software platform, is now widely used as a therapeutic tool.

Under this kind of technological dynamics, existing classifications almost inevitably lag behind. In particular, mixed functional-industrial classifications fail to capture the full potential of inventions to cut across industrial boundaries. Consequently, using existing patent classes does not allow fine-grained technology intelligence and misses almost entirely the opportunities for lateral vision.

We suggest integrating the existing classifications with a full-scale functional classification, based on functional hierarchies and supported for specific tasks by the construction of a large functional dictionary.

The chapter is structured as follows. First, we offer a short overview of the notion of function in a variety of disciplines and comment on recent advancements in computational linguistics that have made it possible to develop large scale dictionaries and classifications. In Sect. 40.3, we discuss in detail the limitations of existing patent classifications. In Sect. 40.4, we offer three short case studies of the application of the methodology. The final section concludes this chapter.

2 A Brief History of Functional Analysis

The main elements that characterize functions are as follows:

  1. Functions are abstract representations: they must be independent of specific technical solutions. This is called solution neutrality.

  2. Functions are normative: they describe a purpose, or a goal, or a raison d'être of an object, and in this way they describe the conditions under which the object may come into existence, and, correspondingly, the conditions under which the object may not work properly, or its dysfunction.

  3. Functions are hierarchical: they can be decomposed in an iterative way, moving up to functions of higher abstraction or down to functions of lower abstraction, or higher instantiation. However, there might be multiple hierarchies or different ways to decompose a higher level function.

  4. Functions involve a transformation in the physical space: they can then be made consistent with physical descriptions.
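The hierarchical property listed above can be illustrated with a small data structure. The following is only a minimal sketch, not a representation taken from the literature cited in this chapter: the class name `Function`, its methods, and the water-purification example are all hypothetical, chosen to show a solution-neutral function (a functional verb applied to an object) being decomposed into sub-functions of lower abstraction.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Function:
    """A solution-neutral function: a functional verb applied to an object."""
    verb: str
    obj: str
    children: List["Function"] = field(default_factory=list)

    def add(self, child: "Function") -> "Function":
        """Attach a sub-function (one step down in abstraction) and return it."""
        self.children.append(child)
        return child

    def leaves(self) -> List[str]:
        """Return the most concrete sub-functions reachable from this node."""
        if not self.children:
            return [f"{self.verb} {self.obj}"]
        result: List[str] = []
        for child in self.children:
            result.extend(child.leaves())
        return result

    def depth(self) -> int:
        """Number of abstraction levels below and including this node."""
        return 1 + max((c.depth() for c in self.children), default=0)


# Hypothetical decomposition; other decompositions of the same root are possible.
root = Function("purify", "water")
separate = root.add(Function("separate", "solid particles"))
separate.add(Function("filter", "sediment"))
separate.add(Function("settle", "sludge"))
root.add(Function("neutralize", "microorganisms"))

print(root.leaves())
print(root.depth())
```

Because a higher-level function may admit several decompositions, a fuller model would allow alternative child sets per node; the single tree here is a deliberate simplification.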

In the following, we call the attention of the community working on Science and Technology (S&T) indicators to the deeper intellectual roots of some of the concepts that we utilize.

2.1 Philosophical Foundations of Functional Thinking

According to Aristotle, the general notion of cause, as conceptualized for physical entities, was not sufficient to explain the relation between some forms of action and the world. In his Physics, Aristotle introduced the notion of telos, or goal, as the basis for a separate form of explanation, called teleological explanation [40.3, p. 8]:

Teleological explanation in Aristotle pertains broadly to goal-directed actions or behavior. Aristotle invokes teleology when an event or action pertains to goals: ‘that for the sake of which’.

The explanation was found by positing a separate notion of cause, called final cause. Final causality worked backward: the existence of a final goal required by necessity the working of individual elements in such a way that their coordination could ensure the working of the whole organism. Interestingly, Aristotle used the notion of telos to cover two distinct kinds of final teleology: agency-centered teleology (involving behavior and artifacts, or technology) and teleology pertaining to natural organisms [40.3]. While the distinction between the two classes is one of consciousness (agents are aware of the goal of their behavior, natural organisms are not), the notion of functions can be applied to both fields. According to Aristotle, consequently, the actions aimed at constructing artifacts have value only as manifestations of human goals, that is, they have no internal necessity [40.4].

This notion was rejected after the modern Scientific Revolution. Galileo had already called for reasoning only in terms of physical causes, leaving the overall why of things outside the scientific domain. Notably Darwin, in his rejection of Lamarck's explanation of the adaptation of life, called for a scientific reasoning that moved only from causes to effects and never backward [40.5, 40.6]. Adaptation comes from random variations that are selected only by virtue of their fitness to the environment and not by virtue of some finalization to superior forms. Variation is random, selection is blind. This is function without purpose [40.7, 40.8].

These arguments were formalized in the twentieth century in two separate traditions. On the one hand, the neo-Darwinian synthesis, introduced by authors such as S. Wright, R.A. Fisher, and J. Maynard Smith, provided the mathematical framework to examine the way in which random genotypic variations could generate new forms or new phenotypes [40.6, 40.9, 40.10, 40.11]. On the other hand, the neopositivist philosophy of science rejected teleological arguments, holding that arguments that step outside physical causality should not be considered valid [40.12, 40.13].

Overall, there was a general consensus until the end of the twentieth century on the idea that the notion of function, having inevitably a normative content, should not be admitted in scientific reasoning at all.

This agreement encountered some trouble at the end of the twentieth century, although the streams of thinkers that have criticized the dominant view are still somewhat peripheral. In Kuhnian terms, they point to anomalies in the dominant view, but they still have not gained central ground. Nevertheless, the convergence of critical arguments from a variety of theoretical starting points is an indication of deeper problems.

First, within the philosophy of biology, the notion of pure randomness of variations has been criticized. Starting from the famous metaphor of the spandrels of San Marco in Venice [40.14], Gould argued vigorously that variation is not random, but has an internal structure, most likely hierarchical, that makes the probabilities of change in different directions in the space of genetic possibilities highly constrained [40.15, 40.16, 40.17]. This argument, initially a minority position in evolutionary biology, found support in the recent advancements of epigenetics and systems biology.

Second, the notion of organism has gained prominence in a recent revival of the philosophy of Hegel. The German philosopher tried to conceptualize the existence of complex living entities by positing a new type of causality, called dialectics, as a mutual determination of constraints at different hierarchical levels of reality. Here, the main question is not whether causality can work backward in time (as in teleology) but whether it can work downward, that is, from a higher-level structure or organism down to more elementary units. The extent to which dialectics could be used in the philosophy of science is open to debate, but it is clearly on the table. It seems to be a way to give autonomous philosophical foundations to the notion of organization, which is also central in the analysis of complex systems.

Finally, in the last two decades, a number of contributions in analytical philosophy have reintroduced and rejuvenated the notion of function, in a way fully compatible with the procedures of scientific reasoning. In this tradition, starting from the earlier studies of Wright [40.18] and Cummins [40.19], biological and technological systems are considered an instantiation of a more general class of entities in which the internal working satisfies some general conditions, drawn from the scientific knowledge of the world, for survival or self-reproduction [40.20, 40.21, 40.22, 40.23, 40.24]. Functions may be described in causal terms, according to this analytic tradition, although with some qualifications [40.25, 40.26].

Summing up, and to make a long story short (for a longer story see [40.27]), we now witness an intellectual landscape in foundational disciplines that is favourable to the development of a general framework, which gives prominence to the notion of functions. This intellectual climate fits nicely with recent developments in the applied disciplines, such as engineering or design, to which we now turn.

The following is not a complete review of existing approaches to functional analysis, which would require a long exposition. There are several traditions, mainly developed in the USA, Germany, Russia, and Japan, with a large number of contributions. Rather, we try to identify those contributions that are more relevant for our issue: how to use functional analysis in the field of patent classification and patent search.

2.2 The German School of Systematic Engineering Design

While the notion of function is intuitive, the challenge is how to represent it in such a way as to be able to manipulate and use the concept in practical terms. This is the object of functional analysis, a stream of scholarship in engineering design whose goal is to develop theoretical frameworks and tools to represent technical problems in an abstract way.

This challenge was taken up systematically by a number of authors, mainly in mechanical engineering, who wrote in German and were active in German-speaking countries. Their books have subsequently been translated into English. Examples of this approach are Hubka and Eder's Theory of Technical Systems [40.1] and Pahl and Beitz's Engineering Design. A Systematic Approach [40.28, first edition 1977].

These fundamental books should be interpreted in the light of the professional tradition of German engineers and of their academic training (for a reconstruction of this tradition see [40.29]). German engineers are trained to design new technical solutions by moving from first principles in a highly structured way. The specific rules of reasoning are written down in a formal way and generate lists of suggestions. In some sense, the discipline is one of the systematic examination of many solutions, following a formal list of methods to guide the reasoning process (see also [40.30]).

It is within this tradition that one should understand functional analysis, as a systematic effort to describe and standardize the abstract requirements of design tasks. A development that took place independently but with a similar abstraction goal (and a rather similar time trajectory) should also be mentioned here, i. e., the theory of inventive principles, or TRIZ (the Russian acronym for the theory of inventive problem solving [40.31]). While this is not the object of this chapter, the TRIZ literature pushed forward the notion that there is a compact collection of general inventive principles to be applied to engineering problems, whose functional meaning can be made explicit. Indeed, even if treated from a different point of view, functional thinking has a significant role in TRIZ theory.

2.3 Artificial Intelligence and Design: From Herbert Simon to the Carnegie Mellon Project

From an entirely different intellectual tradition, a number of scholars of artificial intelligence started to ask whether the design task could be formalized and automated. This was in clear continuity with the ambitious tradition of cognitive science that started from the automation of well-structured problems, such as chess playing, and moved to ill-structured problems, such as scientific discovery or design [40.32, 40.33, 40.34, 40.35]. At Carnegie Mellon, this project was pursued systematically for many years [40.36, 40.37, 40.38, 40.39, 40.40].

The emphasis here was more on the cognitive procedures utilized to move in the design space, called heuristics, than on the initial requirements of design tasks. Nevertheless, the formal definition of requirements in terms of functions was a necessary element of the description of the problem [40.41]. A remarkable example of this tradition is Tong and Sriram's Artificial Intelligence in Engineering Design [40.42] and Sriram's Intelligent Systems for Engineering [40.43].

Interestingly, these efforts did not produce compelling results. The most relevant applications were found in the field of the design of electric and electronic circuits. Applications in the field of architectural design were also explored, but with limited success.

One might argue that the focus on the cognitive procedures missed a point, that is, the language with which people transform the abstract requirement (the why of the artifact) into broad ideas, then more precise concepts, down to detailed specifications and design. In other words, there must be an interaction between cognitive processes of the general type [40.44, 40.45] and the specific representational language in each of the design domains. Such interaction was almost entirely missed in earlier efforts to automate design, due to the emphasis on discovering and modeling general mechanisms of intelligent behavior [40.35].

2.4 Functional Bases

These limitations led to a different strategy, i. e., focusing more on the language than on the cognitive operations in the design space. The task was to develop a language that might be used to write functional expressions that were consistent with grammar and semantic rules.

Interestingly, this intuition led to the development of functional languages based on the manipulation of a very small set of functional verbs. This strategy followed some sort of Occam's razor argument: the goal was to describe functionally an artifact using the smallest possible number of different functional verbs. The main research goal was parsimony and elegance, rather than coverage and usability in every context. This approach, labeled functional basis, was developed mainly in the USA, starting with the pioneering works of Little et al [40.46] and Stone and Wood [40.47]. A more systematic version was elaborated shortly after by Hirtz et al [40.48] and further expanded by many authors [40.49, 40.50, 40.51]. The functional basis paradigm created a systematic linkage with earlier traditions of engineering analysis, such as value analysis. Several US research teams created a stream of research that transformed functional analysis into a usable tool.

Over time, however, this approach showed some limitations; in real world applications, it was practically impossible to use functional bases without the help of experts in specific engineering domains. In order to capture the functions of artifacts it was necessary to add several qualifications to verbs that were too broad or general. These limitations were evident when the trend towards automatic text processing became dominant. Functional bases were mainly intended for manual use. They were not suited for the treatment of large collections of texts.

2.5 Introducing Behavior in the Functional Representation: The Function-Behavior-Structure (FBS) Model

The limits of functional bases were clearly anticipated in a stream of literature that introduced a new layer of description.

In functional analysis, a distinction is made between the structure of the artifact (i. e., its geometry and material composition) and the function, or the abstract description of its purpose. The function is, however, implemented by the structure of the artifact in a dynamic way, that is, by producing a behavior. This behavior is consistent with the structure and is aimed at delivering the function. In functional analysis, the description of the behavior is absorbed in the functional description, by working with varying degrees of granularity in the hierarchy of functions.

Various authors [40.52, 40.53] suggested enriching the framework by developing a separate layer, called behavior, within a unitary framework called function–behavior–structure (FBS). In this way, functional descriptions can be left more general, and the implementation of functions can be described more carefully in dynamic terms, by following the behavior of artifacts in their context [40.53, 40.54, 40.55, 40.56]. This approach is much more flexible and articulated, as it permits the representation of functions in dynamic terms, as well as a more explicit link between the functions and the expected behavior or user expectation [40.57, 40.58, 40.59, 40.60]. In other words, in linking structure and function through the notion of behavior, this approach allows us to examine expected behavior by users as an indication of needs. It has received wide acclaim in the literature [40.61, 40.62].

Other authors have also articulated functional analysis in order to accommodate a separate behavior layer [40.52, 40.63].

2.6 The Ontology Revolution and the Role of Computational Linguistics

The developments discussed above (Pahl and Beitz's systematic design, functional bases, and the FBS framework) took place within the engineering design literature. This literature is produced by a relatively small community, whose main interests are in the construction of formal systems for representing technological problems moving from a deep knowledge of engineering disciplines [40.64]. In other words, this is a minority of scholars in engineering who, in addition to (or, more rarely, in substitution for) deep studies in specialized engineering disciplines, have a propensity for theoretical generalization and formalization.

Parallel to these developments, and initially with no overlap, the last two decades have witnessed impressive advancements in computational linguistics, and the ability of artificial systems to process large collections of texts has increased enormously. These developments are based on the construction of formal ontologies or abstract representations of entities and their relations. In parallel, powerful statistical methods for the extraction of meaningful information have been developed in the fields of information retrieval and data mining.

After several pioneering contributions, a full-scale effort to develop a functional ontology was promoted by Kitamura and co-authors [40.65, 40.66, 40.67, 40.68]. The construction of an ontology requires the formal modeling, using knowledge representation concepts and theorems, of the substantive relations among entities [40.69]. This is usually done in interaction between domain knowledge experts (in this case, engineers and designers) and computer science experts.

After the initial effort, functional ontologies were widely adopted in the literature [40.70]. Yet, other authors followed a different path: they built functional representations by massively processing technical texts (in particular, patents) in order to automatically extract functional information. This was done without a pre-existing functional ontology, but only on linguistic bases. In particular, Montecchi and Russo started to apply linguistic queries based on the FBS framework and its variants to the newly created patent classification (CPC) [40.71, 40.72, 40.73].

Other authors did not capitalize on existing engineering functional frameworks but reconstructed the notion of function on the basis of purely linguistic structures, e. g., actions. This approach is called SAO. The acronym SAO means subject-action-object structures: every verbal construct in which there is a subject doing an action that involves an object. Yoon and Kim [40.74] developed a patent analysis strategy based on the concepts of natural language processing. Taking advantage of parsing tools like the Stanford Parser or Knowledgist, patent claims are analyzed to reach a level where each of them is representable by a set of SAO structures. This permits a rapid identification of what the components in a new product are, and what their function is. The extracted structures are compared using a similarity measure that is purely linguistic, i. e., does not implement an underlying ontology. A similar approach is found in Choi et al [40.75], while Park et al [40.76] combine the SAO structure with TRIZ inventive principles. These contributions realize the goal of offering a rich language for functional descriptions. They might be labeled text mining without ontologies. The underlying engineering, and ultimately physical, constitution is not made manifest.
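To make the idea of comparing patents through SAO structures concrete, the following minimal sketch assumes that the subject-action-object triples have already been extracted by a parser and compares two patents with a Jaccard index over their triples. The Jaccard index is used here only as a simple stand-in for a purely linguistic similarity measure; it is not claimed to be the measure adopted in the works cited above, and the type `SAO`, the helper functions, and the two toy patents are hypothetical.

```python
from typing import NamedTuple, Set


class SAO(NamedTuple):
    """One subject-action-object structure extracted from a patent claim."""
    subject: str
    action: str
    obj: str


def normalize(triple: SAO) -> SAO:
    """Lower-case and strip each element so that trivial variants match."""
    return SAO(*(part.strip().lower() for part in triple))


def jaccard(a: Set[SAO], b: Set[SAO]) -> float:
    """Share of SAO triples common to both patents (0.0 if both are empty)."""
    a = {normalize(t) for t in a}
    b = {normalize(t) for t in b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical triples, as if produced by an upstream parser.
patent_a = {
    SAO("pump", "moves", "fluid"),
    SAO("valve", "regulates", "flow"),
    SAO("sensor", "measures", "pressure"),
}
patent_b = {
    SAO("Pump", "moves", "fluid"),
    SAO("valve", "regulates", "flow"),
    SAO("filter", "removes", "particles"),
}

print(jaccard(patent_a, patent_b))  # 2 shared triples out of 4 distinct: 0.5
```

Exact string matching treats "moves" and "transports" as unrelated actions, which is precisely the limitation of an approach that has no underlying ontology or dictionary.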

2.7 Functional Dictionaries

Finally, an alternative path was followed at the intersection between engineering design and machine learning, i. e., the development of large functional dictionaries. This path followed the notion that cognitive operations in design depend on the representation of the design task in specific, contextual, and semantically rich environments, in which the interplay between function and structure could be produced. This idea was also at the origin of the Functional Basis movement of Hirtz, Stone, Wood, and co-authors. These authors, however, pursued a goal of parsimony.

The functional dictionary approach goes in the opposite direction, developing the largest possible dictionary. It also departs from the ontology movement of Kitamura and co-authors, in the sense that there is no need to develop a full-scale ontology. What is needed is a procedure to generate the largest possible collection of functional lemmas and to demonstrate that the semantic space of design is saturated, that is, that no important elements are left undescribed. By saturation, we mean a methodology, borrowed from the social sciences, in which there is systematic interaction between observed data and modeling until the model includes all elements that are needed to explain the data. In the context of engineering design, this means completeness at two levels: the level of the categories of elements needed to describe an artifact and the level of all the various possible technical implementations existing in each category. A related crucial factor, which must be dealt with properly for the methodology to work, is the domain dependency of many types of technical concepts: saturation in one domain does not guarantee saturation in another, so completeness must be achieved at a third level too, that of all sectors of interest. In our case, we tested the dictionary in a dozen applications in highly disparate industries, up to the point where the entries were fully adequate to describe the problem at hand. The construction of large repositories of technical terminology has been investigated by various research groups, especially within the TRIZ community. As an example, the commercial software Goldfire Innovator (http://inventionmachine.com) includes a large database of functions and physical effects; furthermore, the extraction of functional information from patents was pioneered by Cascini and co-authors [40.77, 40.78, 40.79]. Dictionaries for specific sectors have been built by a number of authors (see, for example, [40.80, 40.81]).

This direction was taken over a decade ago also by the authors of this chapter [40.27, 40.82, 40.83, 40.84, 40.85, 40.86]. To reach the optimal result, we combined different techniques, ranging from advanced text mining and the use of knowledge patterns and structures to human factor analysis, all revised by experts in the technological sectors examined.

The most important achievement of this research effort was the construction of a functional dictionary containing more than 100,000 lemmas, of which approximately 12,000 are functional verbs. This dictionary contains functions, behaviors, and structures, defined as atoms of the artifact, in terms of process, action, or task that the artifact system is able to perform. All entries are linked to semantically related entries, such as synonyms, antonyms, and hyperonyms.

This approach has opened the way to an automatic procedure to extract functional information from patents [40.87], while keeping full control of the underlying physical description of functions. This dictionary has been repeatedly used in tasks of patent search, patent classification, topic modeling, technology foresight, and design crossover, in the last few years.
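As an illustration of how a dictionary with linked entries can support automatic extraction, the sketch below tags functional verbs in a sentence and maps synonyms to a canonical lemma. It is a toy under stated assumptions and not the procedure of [40.87]: the two-entry dictionary `FUNCTIONAL_VERBS`, the reverse index, and the sample claim are hypothetical, and no lemmatization is performed, so inflected forms such as "slices" would not be matched.

```python
import re
from typing import Dict, List

# Hypothetical miniature dictionary: canonical lemma -> semantically related entries.
FUNCTIONAL_VERBS: Dict[str, Dict[str, List[str]]] = {
    "cut": {"synonyms": ["sever", "slice"], "hypernyms": ["separate"]},
    "heat": {"synonyms": ["warm"], "hypernyms": ["change temperature"]},
}

# Reverse index: every surface form (lemma or synonym) -> canonical lemma.
SURFACE_TO_LEMMA: Dict[str, str] = {}
for lemma, relations in FUNCTIONAL_VERBS.items():
    SURFACE_TO_LEMMA[lemma] = lemma
    for synonym in relations["synonyms"]:
        SURFACE_TO_LEMMA[synonym] = lemma


def extract_functions(text: str) -> List[str]:
    """Return the canonical functional verbs found in text, in order of first appearance."""
    found: List[str] = []
    for token in re.findall(r"[a-z]+", text.lower()):
        lemma = SURFACE_TO_LEMMA.get(token)
        if lemma is not None and lemma not in found:
            found.append(lemma)
    return found


claim = "A blade arranged to slice the sheet and a resistor arranged to warm the adhesive."
print(extract_functions(claim))              # ['cut', 'heat']
print(FUNCTIONAL_VERBS["cut"]["hypernyms"])  # ['separate']
```

Mapping surface forms to a canonical lemma is what lets two documents that use different verbs for the same function be retrieved together; the hypernym links would allow the same match one level of abstraction higher.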

More recently, the same dictionary approach was followed in an effort to build up other technical dictionaries referring to advantages/disadvantages of artifacts and to users or stakeholders of artifacts. The idea is to increase the coverage of all the directions in the design space, thus expanding the possible applications. Overall, the following dictionaries have been developed:

  • Stakeholders: persons who have relations with a product or service. It has been built by merging multiple lists (e. g., users, workers, patients). Its size is about 77,000 entries [40.88].

  • Advantages and disadvantages: positive and negative effects of products/services. These classes could be also defined as benefits and failures. They consist of more than 20,000 entries.

  • Components: list of systems and sub-systems contained in products.

  • Physical quantities and units of measurement: physical properties of a phenomenon that can be quantified by numbers and units of measurement.

More recently, a vertical dictionary has been built, called Technimeter® 4.0 [40.89]. It is a list of technologies and techniques related to Industry 4.0. The dictionary and taxonomy behind the Technimeter are designed to map documents of the new industrial revolution. It has the form of a fully linked graph and consists of about 2,000 technologies and 200,000 links. It is in three languages (automatically expandable) and could be easily extended to fields like precision agriculture, the Internet of Things (IoT), smart cities, smart energy, and E-health.

A sample of entries from these dictionaries is given in Table 40.1.

Table 40.1 Sample of entries from the functional dictionary, the stakeholder or user dictionary, the advantage/disadvantage dictionary and the Technimeter 4.0 (in alphabetical order)

The approach, which is based on large, non-ontological dictionaries, is not suited for all types of analyses, at least in its current state of development. When the analysis requires a high level of abstraction with respect to the specific artifact descriptions, such as in the construction of functional diagrams or other functional modeling-related tasks, and in general for every study performed by a human expert, the variety and the details embedded in the database just add unneeded complexity. In such cases, more prescriptive and concise methods, such as those built on functional bases, usually deliver faster results.

On the other hand, in all cases in which the investigation needs to deal with the complexity and fuzziness of natural language, a complete functional dictionary provides, almost by design, an efficient and reliable instrument. This is the case of all forms of software-based, automated analyses of texts, which in turn are the only possible way of tackling large amounts of technical documents in tasks as diverse as information retrieval, knowledge extraction, or document categorization and labeling.

One of the most promising areas of application of the functional dictionary is, indeed, patent classification. We suggest that a full-scale functional dictionary allows a fine-grained representation (i. e., retrieval, mapping, clustering, and profiling) of patent information, opening the way to a variety of powerful applications. As a future development, integrating a full dictionary with computational linguistic algorithms may even allow tasks such as constructing functional diagrams in an automated way and comparing them.

In the next chapters, alongside the general discussion about functional classification of patents, we will also provide some examples of use, among others, of the dictionary approach.

3 Patent Search and the Limitations of Existing Patent Classifications

Patent classification is a necessary part of any patent system, for legal, administrative, and practical purposes. One of the main areas of utilization of patent classification is patent search, which has two main applications: ex ante patent search or the search of prior art done by inventors, assignees, attorneys, and patent officers before and during patent application, and ex post patent search, or the search in databases carried out after the publication of patents.

Patent search can be based on several alternative query strategies. Some queries are based exclusively on existing patent classifications (IPC or CPC) or industrial classifications; others use other metadata included in patent documents. Summing up, a patent search is usually done in one or more of the following ways:

  1. IPC or CPC classifications

  2. The codes of NACE-CLIO, that is, the European industry classification (the acronym stands for Nomenclature générale des Activités économiques dans les Communautés Européennes–Classification Input–Output), or equivalent national correspondents, such as the Italian ATECO (ATtività ECOnomiche), to identify the names of companies in the field of interest

  3. Keywords associated with technologies of interest

  4. Full names of companies and/or research centers that develop technologies in the field of interest

  5. Full names of inventors.

Interestingly, each of these search strategies suffers from a number of severe limitations. We review them in order.

3.1 IPC or CPC Classes

The official patent classifications are the most largely used in patent analysis, both in professional practice and in academic research. In the latter domain, patent classifications are routinely used in the economics of innovation and strategic management literature, in order to address issues such as diversification of companies, related variety, innovation search, or novelty. Yet these classifications suffer from limitations that are not always clearly recognized.

To start with, the large number of IPC codes (more than \(\mathrm{70000}\) among classes, subclasses, groups, and subgroups) [40.90] and of CPC entries (more than \(\mathrm{200000}\)) [40.91], while an indication of the effort of IPR authorities to follow the evolution of technology, makes classification a cumbersome task. Patent officers and analysts face a severe trade-off: using fine-grained classification requires extensive specialized knowledge, while using higher-level codes brings into the patent set a lot of noise from distant and unrelated documents. As a matter of fact, reading the definitions of patent classes does not resolve the uncertainties in classification or in information retrieval.

Second, the IPC/CPC classification has its own ambiguities of attribution. Compare, for example, the subgroup A61B 5/00–Measuring for Diagnostic Purposes with the class G01 Measuring, or the subclass F16F–Springs; Shock Absorbers with the equally valid subclass B60G–Vehicle Suspensions. It is clear that a patent of interest can be legitimately listed under one or the other code. This means that an incomplete IPC-based query (in particular, a query that fails to recognize these kinds of ambiguities) will miss important information.

Third, there are errors of classification. In some cases, the technology covered by the patent has nothing to do with the patent class, due to mistakes or misprints. However, the most intriguing (and disturbing) classification error is intentional. Applicants submit patent applications that intentionally include misleading information, which leads patent analysts at the patent office into misclassification. In other words, companies try to hide the true content of their patents from competitors, for defensive purposes or for creating hidden threats. It is not uncommon, for example, to see inventions for power windows in the automotive sector classified as blinds for use in houses, or solutions for gas turbines classified as solutions for standard combustion engines, and vice versa.

Fourth, the speed at which new patent classifications are introduced does not match the speed of technological evolution. Despite significant efforts, official classifications are several steps behind the technological state of the art. In particular, patent classifications are under pressure in following inventions that are transversal in nature. In general, patents with broad cross-field and cross-industry application are classified in several sectors of the IPC classification. On the other hand, CPC has tried to address the issue by introducing the Y class, but for the time being the coverage is far from complete. As a clear example, it has been shown [40.92] that only a tiny fraction (\({\mathrm{5}}\%\)) of the relevant patents in the field of bioinformatics is listed under the corresponding IPC code G06F19/10, while all the others are scattered among over 30 codes. A similar problem refers to the case of transversal or interdisciplinary technologies, which adopt several technical solutions and span several applications, and, therefore, can be pertinent to many classes. Consider for example patents for robotics, or IoT. More generally, even standard technologies may have multiple IPC/CPC attributions. A classic case is control software, which can be classified either under pure software classes or under classes related to the specific industrial sector of application.

Finally, the IPC/CPC classification has not yet been adopted in all National Patent Offices.

Summing up, the use of IPC or CPC classification schemes is justified as a first approximation, while it suffers from severe limitations if the goal is to identify emerging technologies and technological trends, as well as to build up strategic technology intelligence tools that allow for lateral innovation, boundary-crossing technologies, and strategic hiding behavior by competitors.

3.2 Industry Codes

A suitable alternative to the construction of lists of companies is to rely on industry codes. At the European level, they follow the NACE-CLIO nomenclature.

This approach, too, has various limitations. First, using industry classification creates the same problem of classification errors found for patent classes; sometimes companies are listed in classes that have nothing to do with the reality of their production, due to misclassification.

Secondly, large enterprises and holdings generally operate in more than one industrial sector; thus the reference is to multiple NACE codes, so that a clear association of enterprises to NACE is difficult. Furthermore, in the case of groups or holdings, quite often the parent company is classified under services, although the associated or subsidiary companies are manufacturing enterprises.

Finally, research centers are not classified by industry codes.

3.3 Keywords

Keywords are another widely used technique in patent search. After patent classes, keywords are probably the most important search tool. Keywords must be built on expert judgment. More recently, the elicitation of keywords by experts in the subject domain has been integrated with formal computer language methodologies (ontologies).

In practice, however, it is difficult to characterize a technology completely and precisely using only a limited number of keywords. The larger and more inclusive the choice of keywords (for example, including all synonyms of a given term), the greater the risk of finding unrelated patents due to polysemy or to the usage of the word in several other industries. Moreover, assignees may use (inadvertently or intentionally) different terminology to label the same technical concept. The list of variants is not known a priori.

Often inventions are described in ways that defy precise qualification by means of keywords. Or the same functions are described differently, so that the same concept ends up under different keywords.

Finally, the labeling with keywords may miss important information. As a matter of fact, even the most obvious keywords may not be present in patent documents. For example, the CPC subgroup F04D 19/042 is about turbomolecular vacuum pumps, and yet there are 52 documents classified in F04D 19/042 that do not contain the term turbomolecular pump or any other variation of such an expression (source: http://www.wipo.int/meetings/en/details.jsp?meeting_id=39303).

3.4 Full Name of Assignees (Companies or Research Centers)

It is usually quite difficult to start with a complete list of companies and/or research centers that may be designated as assignees of patents. This is even more so in rapidly growing sectors, due to the massive entry of newcomers, as well as frequent mergers and acquisitions. A list of the full names of companies can be found in some industrial sectors in sources like industrial repositories, catalogues, trade associations, and associated websites. Complementary sources are the commercial database services.

However, even if a complete list were available, there are several limitations, some of which are similar to those commonly found in bibliometrics and scientometrics.

The well-known problem of harmonization of company names is pervasive: there are countless variations of company names to be found in patents. Harmonization efforts come into play, but they are still incomplete. In addition, companies try to hide their identity by assigning patents to subsidiaries whose corporate links are difficult to reconstruct, or even to their long-term suppliers. In many cases, the assignees are the inventors themselves, so the name of the company is not visible in the patent data, even though the inventors are employees or collaborators of a company. This information is not available in patent documents, so it must be inferred from other sources. As a matter of fact, the information may be difficult or impossible to reconstruct.
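To give a feeling for the harmonization problem, the following minimal sketch collapses spelling variants of an assignee name into a single key. The company name, the short list of legal-form tokens, and the function name `normalize_assignee` are all assumptions made for this illustration; actual harmonization efforts are far more elaborate.

```python
import re

# Hypothetical, deliberately incomplete list of legal-form tokens.
LEGAL_FORMS = {"inc", "corp", "co", "ltd", "llc", "gmbh", "ag", "spa", "srl", "sa"}


def normalize_assignee(name: str) -> str:
    """Return a crude canonical key for an assignee name."""
    # Lower-case and drop dots, so that "S.p.A." becomes "spa".
    text = name.lower().replace(".", "")
    # Turn every remaining non-alphanumeric run into a single space.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    # Drop legal-form tokens, which vary freely between filings.
    tokens = [t for t in text.split() if t not in LEGAL_FORMS]
    return " ".join(tokens)


# Fictional variants of the same (fictional) company name.
variants = [
    "ACME Robotics SpA",
    "Acme Robotics S.p.A.",
    "acme-robotics spa",
    "Acme Robotics, S.P.A.",
]
keys = {normalize_assignee(v) for v in variants}
print(keys)
```

String normalization of this kind only addresses spelling variants. It cannot link a subsidiary, a supplier, or an individual inventor to the parent company, which is why the information often has to be inferred from other sources.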

Finally, lists of companies are typically based on criteria for inclusion that refer mainly to the final products, i. e., are based on industry-sector criteria. This corresponds to the traditional notion that members of an industry are only those companies that actively compete in the product markets, or, more formally, those for which the cross-elasticity of product demand is non-zero. This notion was entirely appropriate in an innovation landscape in which there was a strong coherence between the technology owned or controlled by a company and its product portfolio. However, in a landscape of pervasive digital technologies and disruptive business models, this strict correspondence is not warranted. As an example, in emerging technologies, one often finds among assignees names of companies, usually large ones, coming from completely unrelated fields. This means that they are studying the technology. The extent to which they will develop products based on these patents, becoming new entrants and newcomer competitors, is not obvious at all.

3.5 Full Names of Inventors

The inventor record in the patent text is a source of crucial information. Many studies have been carried out by using lists of inventors, as well as their affiliation, country of origin, nationality, extracted from given sets of patents. An interesting example is the classification of inventors by country based on automatic tools of disambiguation, which assign a country or region with a certain probability given the frequency distribution of names and surnames.

Inventors are, however, physical persons. Contrary to names of companies and research centers, which create a universe in the order of magnitude of tens of thousands, names of physical persons are in the order of millions. In addition, for companies there is an incentive to select corporate or brand names that are clearly distinguishable from competitors. This does not happen for physical persons. This means that issues of homonymy are cumbersome and may create a lot of noise in the data.

In addition, there is no validated list of inventors, for the time being.

For the convenience of the reader, the main limitations of existing patent search criteria and approaches are summarized in Table 40.2.

Table 40.2 Summary of the limitations of existing patent classifications

4 Functional Patent Classification: Three Case Studies

As mentioned in the previous chapters, a functional patent classification (FPC) is based on the main functions performed by the technology, rather than on the inventive solutions or their potential applications. The functional approach allows overcoming most of the above-mentioned limits. One aspect that makes functions such a powerful tool is their generality and abstraction. Representing logical, physical, or teleological concepts, functions are neither domain specific nor domain dependent. As an example, separation, movement, and control are present in every technical domain; what changes is only the structure that realizes these general goals or effects. Therefore, functions can help the identification of connections or even the creation of bridges between distant technologies or industrial areas.

The connection may be found in the two time directions. Looking retrospectively, it is possible to start from a given present-day solution and explore inventions of the past belonging to different sectors, either to make more complete the positioning of a technology and the understanding of its evolution trajectory, or to widen the scope of infringement, opposition, or freedom to operate analyses.

Looking forward, on the contrary, the existing patent corpus can be used to provide inspiration for the inventions of the future, tackling a creative process called crossover, i. e., the adaptation of technical solutions from one field to another. The same approach can help in anticipating the evolution of transversal technologies.

Furthermore, the search for prior art is very important in the everyday practice of engineers, designers, and IP professionals; however, the projection towards the future provided by crossover is probably even more important, since it leads to new technologies and businesses, and can provide valuable support to the strategic planning of companies and policy makers.

In the following sections, we will review some of the advantages of using FPC in a variety of directions. These include:

  • Patent search

  • Technology foresight

  • Prior art

  • Crossover analysis

In the first two case studies, we will show how adopting a functional reasoning and FPC allows a better retrieval of the patents:

  • Related to a technological cluster, e. g., during a foresight activity.

  • Related to a specific product that a company plans to patent and commercialize without infringing one or more existing IP rights (in the case of patent search and prior art analysis).

In the third case study, we will discuss how performing a patent search based on functional criteria permits us to identify different technologies that satisfy the same need and apply that to crossover activities, in order to support creative tasks in the conceptual design phase.

These applications are of interest for patent offices, patent attorneys, and patent analysts, as well as for entrepreneurs and venture capitalists, or researchers and analysts interested in technology and competitive intelligence.

4.1 Case Study No. 1: Patent Search

There are two main advantages of adopting a functional point of view when performing a patent search. The first is higher recall (in information retrieval, the term recall indicates a percentage parameter representing the completeness of a given target document set; in the present case, it gives the fraction of relevant patents that have actually been retrieved over the total number of existing relevant inventions). Quite often, relevant patents are filed under IPC/CPC classes different from that of the starting patent application, and traditional queries are usually not able to retrieve them, either because they rely too much on the IPC/CPC patent classification, or because the keywords used are too domain dependent. Even similarity search, based on semantic technologies, usually fails in this task, since it still bases its internal representation on the specific terminology of the initial example.

The second advantage is that finding solutions coming from different fields is often unexpected and, therefore, offers additional weapons in the IP dialectics (for example, in patent litigation or opposition), similar to the possibility of utilizing non-patent literature. Moreover, the reverse is also true, that is, using the functional approach it is possible to detect patents that have been hidden in classes that are far away from the obvious one, either for defensive or offensive purposes.

In the foresight activity of the biomedical industry commissioned in 2017 by Toscana Life Science, a non-profit organization in the support of biomedical research and acceleration of startup companies, the starting point was the creation of the set of relevant patents.

The biomedical field has been clustered in 12 areas, defined at a high level by using functional verbs that identify the main action performed by the technologies belonging to each area. For example, in the field of surgery, instead of listing individual technologies such as scalpel or cutting laser, we defined the cluster in terms of the main function, i. e., to separate/cut the tissues of a patient. Table 40.3 shows the main functions identified in the exercise. This segmentation is not intended to be exhaustive; it addresses the main areas of interest of the client. It gives a hint of the search strategy that the functional classification suggests.

Table 40.3 Classification of patents for medical devices adopting the similarity of functions performed as criterion for clustering

Let us consider the cluster of technologies whose function is to support the motor functions (listed as number 5 in Table 40.3). It contains the products and devices used for the rehabilitation and the aid of the mobility of a patient, such as crutches, wheelchairs, or training equipment.

Taking advantage of the functional dictionary to support the functionalization of the search, i. e., considering all possible variants of the functional concepts to be retrieved, we identified in this functional class a global patent set of \(\mathrm{133197}\) documents, belonging to \(\mathrm{45976}\) patent families, filed worldwide from 1900 to 2015.
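The idea of considering all variants of a functional concept can be sketched as a simple query expansion. The mini-dictionary below, its variants, the sample abstracts, and the names `FUNCTIONAL_VARIANTS`, `build_pattern`, and `matches_concept` are hypothetical and serve only as an illustration; the functional dictionary used in the study is far larger and is not reproduced here.

```python
import re

# Hypothetical mini-dictionary: one functional concept and a few variants.
FUNCTIONAL_VARIANTS = {
    "support motor functions": [
        "support walking",
        "assist walking",
        "aid mobility",
        "rehabilitate limb",
        "restore gait",
    ],
}


def build_pattern(concept: str) -> re.Pattern:
    """Compile one case-insensitive regex matching any variant of a concept."""
    variants = FUNCTIONAL_VARIANTS[concept]
    alternatives = "|".join(re.escape(v) for v in variants)
    # Word boundary only at the start, so that "limbs" still matches "limb".
    return re.compile(rf"\b(?:{alternatives})", re.IGNORECASE)


def matches_concept(text: str, concept: str) -> bool:
    """True when the text contains at least one variant of the concept."""
    return build_pattern(concept).search(text) is not None


# Invented abstracts, for illustration only.
abstracts = {
    "doc1": "A crutch designed to support walking on uneven ground.",
    "doc2": "A hydraulic press for forming sheet metal.",
    "doc3": "Exoskeleton used to rehabilitate limbs after a stroke.",
}
hits = [d for d, t in abstracts.items()
        if matches_concept(t, "support motor functions")]
print(hits)
```

The query is driven by what the device does rather than by what it is called or where it is classified, so a document can be retrieved even when neither its class nor its title mentions the biomedical field.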

Only a tiny fraction of these patents were filed in the region supporting the study (Tuscany). From the above set, in fact, 267 individual patents filed by assignees localized in the region have been found. They belong to 42 patent families, and their filing date is after 1985. Focusing on this small sample, it appears that some of the patents identified would not have been found, had we used the search criteria listed above.

For example, the application US2006113846_A1 (Mechanism of motor reduction with variable rigidity and rapidly controllable) is classified under the IPC groups B25J9/02, F16H19/06, and H02N3/00. These groups are labeled, respectively, Manipulators positioned in space by hand; Gearings comprising essentially only toothed gears or friction members and not capable of conveying indefinitely-continuing rotary motion; and Generators in which thermal or kinetic energy is converted into electrical energy by ionisation of a fluid and removal of the charge therefrom. If we had conducted the search using just those IPC classes for which the definition matches the concept of rehabilitation devices or mobility aids (that is, A61F, Medical or veterinary science; Hygiene; Filters implantable into blood vessels; Prostheses; Devices providing patency to, or preventing collapsing of, tubular structures of the body; Orthopaedic, nursing or contraceptive devices, or A61G, Medical or veterinary science; Hygiene; Transport, personal conveyances, or accommodation specially adapted for patients or disabled persons), we would not have identified the above relevant US application, since it is classified under classes apparently unrelated to the biomedical field.

Rather often, the assignee and the inventor of a patent overlap. This is generally true for US applications, since in that jurisdiction there is the presumption that the inventor is the initial owner of a patent or patent application. Sometimes the inventor is, indeed, a single professional working on his/her own. More frequently, however, particularly in some industries, the inventor is an employee in the R&D department of a company and, by contract, the owner of the intellectual property is the company, not the inventor. In certain cases, the reassignment from the physical person (the inventor) to the legal entity (the company) may still be in progress. In other cases, however, companies intentionally leave individual inventors as assignees, in order to hide the invention from competitors. Therefore, if these patents were searched using the names of the companies active in the industry, they would not be retrieved. For example, in the patent IT1252816_B (Reinforced cotyle for hip joint prosthesis) Mr. Massimo Giontella is both the assignee and the inventor. We started a search on other documents and discovered that this inventor works for a company (MP srl), and that this company owns several patents that refer to devices for the support of the motor functions. Had we searched using the standard criteria listed above, we would not have been able to reconstruct this hidden connection.

Another interesting remark about the above-mentioned company, MP srl, is that it performs mechanical manufacturing. Indeed, from its website it is not possible to infer that it produces biomedical equipment. For this reason, it would be difficult to retrieve its patents relying on the assignee information only, since it is not listed in any company list in the biomedical industry.

Similarly, we identified an assignee whose industrial classification was Integrated engineering design services (Ateco code 71.12.2 in the Italian industry classification). This industrial classification is too generic to infer any relatedness to the medical device industry. Yet it is the classification used for Prensilia, a university spinoff company, whose patent EP2653137_A1 (Self-contained multifunctional hand prosthesis) is clearly relevant to the biomedical industry. The patents of Prensilia would have been missed if the query had been based on industry classification only.

Finally, industrial classifications do not cover universities and research institutes and centers. In our case, as many as 34 patents related to the motor functions are assigned to Scuola Superiore Sant'Anna, a university institution. Thanks to the functional approach, it is, therefore, possible to find documents that do not have explicit reference to known assignees. In addition, it is also possible to elaborate on the relations between the technologies of interest and the strategic orientation of companies that do not appear in the core of the industry, and, therefore, are not under the regular scrutiny of competitors.

To sum up, in using the Functional Dictionary illustrated above, the levels of recall and precision were extremely high, by the standards adopted in the computational linguistics community (in information retrieval, the term precision indicates the fraction of relevant documents contained in the retrieved set; in the present case, such percentage parameter estimates how many of the patents in a given target patent-set do pertain to the technical area of interest). Functional thinking in general allows finding results that would have been missed otherwise, both by traditional patent search methods and by relying on pre-existing knowledge.
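The two measures can be made concrete with a small computation over sets of document identifiers. This is a minimal sketch with invented identifiers; the function name `precision_recall` is an assumption of this example, and the numbers are not taken from the study.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for a retrieved set against a relevant set."""
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented patent identifiers, for illustration only.
retrieved = {"P1", "P2", "P3", "P4"}
relevant = {"P2", "P3", "P4", "P5", "P6"}

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

In this toy case three of the four retrieved documents are relevant (precision 0.75), while three of the five relevant documents have been found (recall 0.6). In a real patent search the full set of relevant documents is not known, so recall can only be estimated.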

A final comment about the application of functional classification to the field of technology foresight is in order. Here, the functional approach is extended along the time dimension. The functional representation of technologies supports the identification of technical trends that project into the future the evolution of solutions, beyond the existing ones. This is a powerful counterbalance to the tendency of experts to reason about future technologies in terms of extensions of already existing solutions. Following the functional approach, technology foresight may lead to the prediction of forthcoming solutions that fulfill the needs and goals emerging from the analysis, or, stated more formally, the functions of interest [40.93]. In other words, by investigating functions not properly addressed by existing solutions, as well as by extrapolating trends well known from the theory of functional analysis, it is possible to identify the directions along which the next innovative steps will take place. In addition, the functional approach allows the early identification of the potential failures of inventions. Failures can, in fact, be conceptualized as negative functions. A functional representation allows early detection of the areas in which the promised benefits are likely to be frustrated [40.94].

4.2 Case Study No. 2: Prior Art and Out-of-Field Citations

The advantages of retrieving solutions coming from different fields were already pointed out in the previous section. There, the discussion was on search in general, but the same is true for the specific case of prior art search.

However, for anteriority search an objection can arise. How far apart (from the technical point of view) can two inventions be, so that one can still be considered legitimate prior art for the other? Indeed, one may object that, in principle, there might be a threshold beyond which two solutions are so different that they can hardly be considered by a person skilled in the art to share a similar inventive step, even if they perform a similar function. However, there is no commonly agreed-upon definition of an objective or measurable distance between artifacts that would allow setting such a threshold in a clear way. The judgement about the degree of similarity is usually left to the sensibility and experience of the IP professional. In addition, as a matter of fact, out-of-field citations are, indeed, used by patent examiners, patent attorneys, and companies' IP professionals.

To investigate the degree of usage/retrieval in prior art searches of solutions coming from external sectors, we used a set of over \(\mathrm{200000}\) patent applications, belonging to the biomedical sector and coming from several jurisdictions, which we had carefully selected for a previous study, and looked at the backward citations. For all data on citations and on search reports, we refer to the European Patent Office's PATSTAT service [40.95].

We assumed a very simple metric to compare any given application with its citations: two documents are considered as pertaining to different sectors only if they have a different IPC/CPC class, i. e., if the first three characters of their IPC/CPC code are different. Any difference in the subsequent characters has no relevance for the present purposes. Such a metric, relying on the IPC classification tree, the criticalities of which we have already highlighted, is probably not the most accurate for an in-depth one-to-one comparison, but it can be easily automated to process large numbers of documents, and the results are reliable on a statistical basis.
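The three-character comparison is straightforward to automate, as the following minimal sketch shows. The function names `ipc_class` and `is_out_of_field` are hypothetical, and the treatment of documents with multiple attributions (a citation counts as in-field as soon as one class is shared) is an assumption of this sketch, not necessarily the rule applied in the study.

```python
def ipc_class(code: str) -> str:
    """Return the class part of an IPC/CPC symbol, i.e., its first three characters."""
    return code.strip().upper()[:3]


def is_out_of_field(application_codes, citation_codes) -> bool:
    """True when the application and the cited document share no IPC/CPC class.

    Assumption of this sketch: with multiple attributions, one shared
    class is enough to treat the citation as in-field.
    """
    application_classes = {ipc_class(c) for c in application_codes}
    citation_classes = {ipc_class(c) for c in citation_codes}
    return application_classes.isdisjoint(citation_classes)


# Same subclass, different groups: classes coincide (A61), so in-field.
print(is_out_of_field(["A61B 5/00"], ["A61B 17/00"]))

# A61 versus B25 and D05: no class in common, so out-of-field.
print(is_out_of_field(["A61B"], ["B25H", "D05B"]))
```

Counting a citation as out-of-field whenever any single pair of codes differs would instead inflate the out-of-field share for documents with multiple attributions.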

The study of the above-mentioned patent set led to some interesting results.

First, out-of-field citation is quite common; in almost one out of two patent applications (\({\mathrm{46}}\%\)), the examiner cited in his/her search report at least one document belonging to a different sector. (Note that we restricted our analysis to citations made by patent office examiners and third parties only, neglecting the citations made by the applicant themselves. Please also note that the above percentage can be slightly overestimated because some documents have multiple IPC attributions, which may be both in-field and out-of-field.)

Out of almost 4 million backward citations from examiners for the whole set, \({\mathrm{32}}\%\) have a different IPC class with respect to the starting application (again, the exact percentage may be a bit lower when taking into account multiple attributions).

Second, even given the above, examiners very rarely rely on out-of-field citations only. In the various search reports of patent office examiners, around \(\mathrm{10000}\) applications were found to present prior art that would compromise the validity of one or more claims (X or Y categories of citation according to European Patent Office's convention: category X is applicable where a document is such that when taken alone, a claimed invention cannot be considered novel or cannot be considered to involve an inventive step; category Y is applicable where a document is such that a claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art). Of those, about half still presented at least one out-of-field citation, but only \({\mathrm{0.3}}\%\) (29 out of 10621) had only out-of-field citations. Even if we included non-invalidating citations (A category and similar), we reach only \({\mathrm{0.9}}\%\) of applications with out-of-field citations only.

Third, out-of-field citations are relatively more important in patent opposition (an opposition occurs when a third party challenges the validity of a patent; data for oppositions can also be found using the PATSTAT service). We found only 153 patent applications within the set that received an opposition. However, the percentage of documents opposed using out-of-field citations only now rises to \({\mathrm{4.6}}\%\) (7 out of 153), i. e., more than ten times the examiner's case.
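The shares quoted for examiners and for oppositions can be checked directly from the counts reported above; the variable names are chosen for this example only.

```python
# Counts reported in the text.
examiner_only_out_of_field = 29   # applications with X/Y citations, all out-of-field
examiner_total = 10621            # applications with X/Y citations
opposition_only_out_of_field = 7  # opposed applications, all citations out-of-field
opposition_total = 153            # opposed applications

examiner_share = examiner_only_out_of_field / examiner_total
opposition_share = opposition_only_out_of_field / opposition_total
ratio = opposition_share / examiner_share

print(f"examiner: {100 * examiner_share:.1f}%")
print(f"opposition: {100 * opposition_share:.1f}%")
print(f"ratio: {ratio:.1f}")
```

On the unrounded counts the ratio is close to 17, which is consistent with the statement that the share in oppositions is more than ten times that found in examiners' search reports.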

We now turn to specific examples. Consider patent application EP1943975 (A1), about a Holder for storage of surgical or medical equipment with filling template, assigned to IPC subclass A61B (diagnosis, surgery, identification; a drawing of the invention can be seen in Fig. 40.1a). The only critical prior art (it received the X category) cited by the examiner during his/her search is US5379887 (A), about a Method and apparatus for managing sewing machine spare parts, assigned to subclasses B25H (workshop equipment) and D05B (sewing) (Fig. 40.1b). Although the application sectors are very different, the two documents obviously share the same main function, i. e., storage of objects. Reading the patents, it is clear that they also share the additional function of displaying the correct position of objects within the box to the user. Similar functions often imply similar solutions, and indeed, as pointed out by the examiner, both patents resort, for the display function, to a template fixed to the lid.

Fig. 40.1a,b Example of similarity of functions in patents from distant IPC classes: (a) comes from patent EP1943975 and represents a holder for surgical instruments (after [40.96]), (b) from patent US5379887, refers to a holder for sewing machine parts (after [40.97]). The two inventions belong to different industrial areas, yet they perform the same function and indeed also share many features and part of the inventive step, as detailed in the text

As for EP 1479353 (A1) instead, a Control panel for electro-surgery devices, also filed under subclass A61B, the examiner has found only in-field prior art, such as for example patent DE3923024 (A1) about an Electrosurgical apparatus with operating, display and safety device. On the contrary, the patent application received an opposition citing the following three documents:

  • DE10022588 (A1), an Electronic device under H04M (telephonic communication)

  • DE19951100 (A1), an Operating element, filed under H01H (electric switches) and with a clear automotive application

  • WO0073867 (A1), an Indicator for a robotic machine, filed under various classes including A47L (domestic washing or cleaning) and concerning a robotic vacuum cleaner.

The documents belong to different sectors, yet they all perform the control and display functions in a similar way.

As a further example consider finally EP1670371 (A1), a Transport device for sterile media in A61B; the examiner cited, for example, the Fluid jet blood sampling device and methods of US 20020045912, still in A61B, but a competitor filed an opposition citing instead the Flow control system for liquid chromatographs of US4137011 (A) under, among others, F04B (positive-displacement machines for liquids; pumps).

Identifying out-of-field citations may be crucial for supporting patent litigation or for defending a competitive position. A strategy often adopted by attorneys who oppose a patent is to invoke the so-called general common knowledge: if a solution is adopted in other industries, one should infer that it is widely known. It is, therefore, of crucial importance to carry out an extensive out-of-field search in order to anticipate potential arguments for opposition. Indeed, in one case we studied, similar solutions were found in at least seven different industries.
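As a minimal illustration of how out-of-field citations could be flagged automatically, the following Python sketch encodes the three examples above and marks a cited document as out-of-field when it shares no IPC subclass with the citing application. The data structure and function names are hypothetical. Examiner and opposition citations are pooled for simplicity. The subclass of DE3923024 is an assumption (the text only describes it as in-field), and only the subclasses named in the text are listed for documents filed under several classes.

```python
# IPC subclasses as reported in the text. Assumptions: DE3923024 is taken to
# be in A61B because it is described as "in-field"; WO0073867 and US4137011
# are filed under several classes, of which only one is named in the text.
CITATIONS = {
    "EP1943975": {
        "ipc": {"A61B"},
        "cited": {"US5379887": {"B25H", "D05B"}},
    },
    "EP1479353": {
        "ipc": {"A61B"},
        "cited": {
            "DE3923024": {"A61B"},
            "DE10022588": {"H04M"},
            "DE19951100": {"H01H"},
            "WO0073867": {"A47L"},
        },
    },
    "EP1670371": {
        "ipc": {"A61B"},
        "cited": {"US20020045912": {"A61B"}, "US4137011": {"F04B"}},
    },
}


def out_of_field(record):
    """Return the cited documents that share no IPC subclass with the citing one."""
    own = record["ipc"]
    return sorted(doc for doc, ipc in record["cited"].items() if not (ipc & own))


def out_of_field_share(record):
    """Share of cited documents that are out-of-field (0.0 if nothing is cited)."""
    cited = record["cited"]
    return len(out_of_field(record)) / len(cited) if cited else 0.0


if __name__ == "__main__":
    for patent, record in CITATIONS.items():
        print(patent, out_of_field(record), round(out_of_field_share(record), 2))
```

Comparing at the subclass level is only one possible choice; a coarser (class) or finer (group) level would change which citations count as out-of-field.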

4.3 Case Study No. 3: Functional Crossover in Food Container Sterilization

Just as functions highlight connections between existing solutions in different sectors, they can be used to build a bridge toward inventions yet to be made, thus fostering the innovation process. Patents can be a very interesting source of ideas to support the activities of inventors and designers in the concept design stage of the new product development process. Indeed, understanding what has been created by others can spark creative solutions, in the form of variants or new combinations.

Going further, it is possible to use a technology traditionally developed in one industry to satisfy the needs of users in totally different fields of application. This goes under the name of crossover, and it is a well-known way in which inventions are generated, consciously or not. For example, biomimetics is a type of crossover, and while it is now a design discipline in its own right, humans have always taken inspiration from nature for new inventions.

Crossover requires analogical reasoning, that is, the ability to identify the similarity between the deep structures of problems, beneath their surface differences [40.98, 40.99, 40.100]. People capable of analogical reasoning discover similarity where others see only semantically irreducible problems.

Functional analysis offers a systematic approach to identifying similarity across distant industries and products. It builds abstract representations of the goals of products and technologies that cut across existing solutions described in structural terms. In fact, the same functions may be found in completely different industries. Harnessing the functional approach coupled with a proper mining of the patent corpus is the most effective way to generate crossovers [40.101, 40.102, 40.85]. Several heuristics can be used for this purpose, such as the search for variants, the use of the same physical principles for different functions, the systematic search for synonyms and antonyms, and the like.

We applied functional analysis to the field of food container sterilization. The goal was to identify novel technologies, outside the focal industry. This challenge could not be addressed by relying on any of the search criteria discussed above; no patent classification, no list of companies or inventors, no industry classification, and no keywords were available, and if they had been available, they would not have allowed the discovery of the same result.

The preliminary stage was the formal definition of the main functions of a food container sterilizer (i.e., the destruction or removal of bacteria and other organisms harmful to humans). It is crucial that the functions be described in a clear and formal way. This requires a good understanding of the functional paradigm and can benefit from the use of a complete functional dictionary.

The full-scale functional representation was then projected on the patent corpus in order to find those technologies that perform the functions, without imposing any restriction on the industrial sector. Following this approach, we identified as many as 50 patents about systems to sterilize materials and surfaces, outside the focal patent classes and industry classifications. In turn, these documents were classified according to the physical effect underlying the patented technology, for example X-rays, gamma rays, plasma, ultrasound, chemical agents, and so on (the latter classification can be performed in an automated way if a database of physical effects is available).
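The two steps just described (projecting a functional representation on a corpus, then grouping the hits by physical effect) can be sketched in a few lines of Python. This is a toy illustration under strong assumptions and not the procedure actually used in the case study: the functional dictionary, the physical-effect database, the document identifiers, and the abstracts are all invented, and matching is done by naive substring search rather than by proper linguistic analysis.

```python
import re
from collections import defaultdict

# Hypothetical, tiny functional dictionary for the sterilization function:
# stems of functional verbs and of the objects they act upon.
FUNCTION_VERBS = ("steriliz", "disinfect", "destroy", "remov", "kill", "inactivat")
FUNCTION_OBJECTS = ("bacteria", "microorganism", "pathogen", "spore")

# Hypothetical database of physical effects and the terms that signal them.
PHYSICAL_EFFECTS = {
    "x-rays": ("x-ray",),
    "gamma rays": ("gamma ray",),
    "plasma": ("plasma",),
    "ultrasound": ("ultrasound", "ultrasonic"),
    "chemical agents": ("chemical", "peroxide"),
}

# Invented abstracts from unrelated sectors (no restriction on industry).
CORPUS = {
    "DOC-1": "A catheter surface is exposed to low-temperature plasma to kill bacteria.",
    "DOC-2": "Water pipes are cleaned by ultrasound, which removes microorganisms from the walls.",
    "DOC-3": "A conveyor moves bottles to a filling station.",
    "DOC-4": "Gamma rays inactivate spores on textile fibres.",
}


def performs_function(text):
    """True if some sentence pairs a functional verb with a target object."""
    for sentence in re.split(r"[.;]", text.lower()):
        has_verb = any(verb in sentence for verb in FUNCTION_VERBS)
        has_object = any(obj in sentence for obj in FUNCTION_OBJECTS)
        if has_verb and has_object:
            return True
    return False


def physical_effects(text):
    """List the physical effects whose trigger terms occur in the text."""
    lowered = text.lower()
    return [effect for effect, terms in PHYSICAL_EFFECTS.items()
            if any(term in lowered for term in terms)]


def classify(corpus):
    """Group the documents that perform the function by physical effect."""
    groups = defaultdict(list)
    for doc_id, text in corpus.items():
        if performs_function(text):
            for effect in physical_effects(text) or ["unclassified"]:
                groups[effect].append(doc_id)
    return dict(groups)


if __name__ == "__main__":
    print(classify(CORPUS))
```

Note that the search never looks at patent classes or industry codes: a document is retained only because of what it does, which is the point of the functional approach.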

Table 40.4 shows a sample of results from this analysis.

Table 40.4 Sample of patents identified with an FPC in the field of food container sterilization

In Table 40.4 we list a few patents that were identified through the functional approach. They are also classified according to the correspondence between IPC classes and technical sectors developed by Schmoch and co-authors [40.103, 40.104]. It is clear from the table that highly relevant patents are found in industries and technical sectors that have no proximity to the food or packaging industries, such as, for example, medical technology.

After the identification of these patents, it was possible to set up brainstorming sessions aimed at exploring the underlying inventive principles and their relevance for the sterilization of food containers. This activity led to the validation of a large number of product concepts: as many as 55. These concepts were then subject to a process of screening and refining, until a small number was selected for implementation.

5 Conclusions and Future Research

The notion of function is at the core of the patent system. However, the legal and economic doctrine of patents, as well as professional practice, have largely ignored the theoretical and empirical developments of this notion in fields such as engineering design and design theory.

It is time to make an effort to put this notion at the core of analysis and practice. We have shown that the theoretical treatment of the notion is now mature, from a philosophical and epistemological point of view, as well as in engineering disciplines. These conceptual developments offer a robust background for a systematic analysis of the notion of function in the legal and economic doctrine of intellectual property. In turn, this might offer ground for more systematic and formal procedures of patent search carried out at patent offices.

We have also shown that the recent and impressive developments in computational linguistics and the automatic treatment of texts open the way for new applications.

In this chapter, we have suggested the integration between current approaches to patent classification and the functional classification approach. It is clear that a full-scale, purely functional classification of all existing patents is a long-term goal, requiring further research over many years. However, a promising intermediate step might be to compare existing classifications with functional classification in limited, controllable, new areas of technology that require dedicated updating efforts. Are the current approaches to classifying patents, say, in the field of Industry 4.0, appropriate? Or in the field of FinTech? It would be useful to develop a formal framework for the comparison of alternative approaches, based on well-defined metrics drawn from computational linguistics and from graph theory (e.g., precision, recall, predictive power, number and share of relevant out-of-field citations identified, and measures of distance in the classification graph).
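To make two of these metrics concrete, the following Python sketch computes precision and recall for a retrieved set of documents, and a simple tree distance between two IPC subclasses. It is only one possible formalization, stated under assumptions: the function names are hypothetical, the document identifiers in the usage lines are invented, and the distance counts edges in the section/class/subclass hierarchy with a single root above the sections.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a set of relevant documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ipc_path(subclass):
    """Split a four-character IPC subclass symbol (e.g. 'A61B') into its
    section, class, and subclass prefixes."""
    return [subclass[:1], subclass[:3], subclass[:4]]


def ipc_distance(a, b):
    """Number of edges between two subclasses in the IPC tree,
    assuming a single root node above the sections."""
    path_a, path_b = ipc_path(a), ipc_path(b)
    shared = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        shared += 1
    return (len(path_a) - shared) + (len(path_b) - shared)


if __name__ == "__main__":
    # Invented identifiers: four documents retrieved, three actually relevant.
    print(precision_recall(["P1", "P2", "P3", "P4"], ["P2", "P4", "P5"]))
    # Subclasses taken from the examples in Sect. 4.2.
    print(ipc_distance("A61B", "A47L"))  # same section, different class
    print(ipc_distance("A61B", "B25H"))  # different sections
```

Under this definition, two subclasses in the same class are at distance 2, two in the same section at distance 4, and two in different sections at distance 6, so the citations from A61B to B25H, H04M, H01H, and F04B discussed earlier would all sit at the maximum distance.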

Another long-term goal, which is, however, made realistic by the current developments in computational linguistics, is the definition of formal measures of technological distance and their semi-automatic computation.

Keeping the full-scale functional classification as a long-term goal, other short-term applications are already very promising. In the field of patent search and patent analysis, the functional approach allows us to overcome the limitations of existing classifications, by identifying several relevant inventions that would otherwise remain hidden. Applications to patent search will prove valuable in prior art analysis, freedom to operate, and litigation. Patent offices might find it useful to incorporate it in their routine procedures.

The functional approach offers new perspectives in fields of analysis that use patent datasets for a variety of purposes. It is a powerful tool for the profiling of emerging technologies, beyond existing technology or industry boundaries. It allows the identification of lateral opportunities, analogical solutions, and crossover applications in innovation management. It supports a systematic projection of technologies in the future, in studies of technology foresight, mitigating the cognitive and motivational biases of experts.

A promising direction is the use of large-scale functional dictionaries, based on deep engineering domain knowledge, coupled with powerful linguistic tools. Given the success in developing large-scale dictionaries based on functions, the same approach should be followed in the effort to reach saturation in dictionaries that deal with stakeholders/users, advantages and disadvantages, and physical descriptions of structures and behaviors.

A large research agenda is therefore open.