Market research for requirements analysis using linguistic tools

Luisa, Mich; Mariangela, Franch; Pierluigi, Novi Inverardi

doi:10.1007/s00766-003-0179-8

Original Article
Published: 30 October 2003

Volume 9, pages 40–56, (2004)

Requirements Engineering

An Erratum to this article was published on 01 April 2004

Abstract

Numerous studies in recent years have proposed the use of linguistic instruments to support requirements analysis. There are two main reasons for this: (i) the progress made in natural language processing and (ii) the need to provide the developers of software systems with support in the early phases of requirements definition and conceptual modelling. This paper presents the results of an online market research study intended (a) to assess the economic advantages of developing a CASE (computer-aided software engineering) tool that integrates linguistic analysis techniques for documents written in natural language, and (b) to verify the existence of potential demand for such a tool. The research included a study of the language – ranging from completely natural to highly restricted – used in documents available for requirements analysis, an important factor given that on a technological level there is a trade-off between the language used and the performance of the linguistic instruments. To determine the potential demand for such a tool, some of the survey questions dealt with the adoption of development methodologies and consequently with models and support tools; other questions referred to activities deemed critical by the companies involved. Through statistical correspondence analysis of the responses, we were able to outline two “profiles” of companies that correspond to two potential market niches, which are characterised by their very different approaches to software development.

1 Objectives and structure of the paper

1.1 Premise

This paper presents the results of an online market research study conducted in the spring and summer of 1999 by the Department of Computer and Management Sciences of Trento University, Italy. The study is part of a larger project whose principal aim is to identify the advantages and disadvantages of market research done online with respect to traditional methods and channels, and to look at its applicability in diverse product markets.^{Footnote 1} In methodological terms the objective of the research presented in this paper was to demonstrate the benefits of conducting online market studies for innovative products. Problems with such innovative products derive firstly from the fact that their characteristics cannot be thoroughly defined before conducting the research, and secondly from the fact that their availability in commercial form usually requires further sizeable investments in research and trialling. Both of these issues are critical for CASE (computer-aided software engineering) tools that use linguistic instruments to analyse documents in natural language, and that are therefore based on technologies for natural language processing (NLP) developed in the field of artificial intelligence. Working from the perspective of a company attempting to decide which products to develop (from among different projects related to NLP-based applications), our objective was to evaluate the potential demand for NLP-based CASE tools. In conducting the study we made the reasonable assumption that the respondents (people involved in developing software systems) could be contacted easily by Internet, a prerequisite that could not be guaranteed, particularly at a national level, for other sectors studied previously (e.g. tourism or electronic commerce of groceries).^{Footnote 2} At the same time, a certain predisposition not to participate in the study was to be expected, whether because of time constraints (noted even during the initial explorative interviews) or because of an already high level of saturation. In fact, both of these assumptions were confirmed during the course of the research. Nonetheless, we emphasise that this paper focusses on the results of the actual content of the research, and hereinafter we describe only methodological aspects that are pertinent to the interpretation of the results obtained.^{Footnote 3}

1.2 Objectives

As previously mentioned, the aim of the research was to analyse the potential demand for a CASE tool integrating linguistic instruments as a support to requirements analysis [2]. To give the context in which such a tool could be designed and used, the following section first describes the role of natural language in requirements engineering and then classifies the possible applications of linguistic instruments, making reference to the architecture of an ideal NLP system and to the three fundamental activities of requirements analysis: elicitation, modelling, and validation [3]. Our market research refers principally to the support of conceptual modelling, an activity that can benefit from linguistic instruments only if a dedicated modelling module is designed. The other activities could be supported by existing functionalities of an NLP system, with varying levels of performance.

It was found early in the study that none of the commercial CASE tools exploited linguistic instruments to support requirements modelling [4]; this meant, therefore, that the market research was to focus on a new product whose features could not be defined in relation to similar existing products (analysis of the competition). Numerous research projects do exist in this area, however, and serve as a testimony of the considerable interest in the use of linguistic instruments in requirements engineering [5, 6].^{Footnote 4} The common objective is to carry out a linguistic analysis of requirements documents in order to produce conceptual models of them.^{Footnote 5} Among the most recent projects, as an example, we can cite those described in [8, 9]. While a complete review is beyond the scope of this paper, it is worth noting how different approaches can be analysed by looking at two principal aspects (depending on the characteristics of the linguistic tools adopted):

a. How “natural” the input language is, which is normally subject to restrictions regarding grammar, vocabulary, or both;
b. How much intervention by an analyst is needed in order to process the text “semi-automatically” or to identify the key elements for conceptual modelling.

The survey described in this paper focusses on the first of these points, one that we deem of vital importance because whatever the approach adopted, the “naturalness” of the language directly affects the amount of effort needed to extract useful information from the documents. First, it was necessary to establish whether the documents gathered in the requirements elicitation phase were in ‘real’ natural language or in some type of restricted language, and if they were in natural language, whether the user or customer could be asked to describe the requirements using a more restricted language. In fact, if the documents are written in a ‘controlled’ language (restrictions on grammar or vocabulary), information can be extracted using syntactic or ‘shallow’ techniques, such as parse trees.^{Footnote 6} To obtain equivalent performances with documents in unrestricted natural language it is necessary to have a semantic representation of knowledge that embeds reasoning techniques. Such applications are currently being studied.^{Footnote 7} Moreover, the language used in the documents can be more or less linked to a particular application domain (for example, software for telecommunications), thus determining the degree of specialisation of the support linguistic tool to be used in the conceptual analysis, and therefore of its knowledge base. In other words, hypothesising that the basic NLP technologies are available, for a company that must decide whether or not to invest in the development of an NLP-based tool for requirements analysis, it is important to establish first if it is possible to design and realise a general-purpose tool to support software development for different application domains or if instead it is necessary to make further investments later to customise the tool for the different companies or customers it will eventually serve. 
These are all essential considerations in determining the investment necessary to convert a research prototype – like those developed in the existing research projects – into a commercial tool.

Results of preliminary interviews as well as the state of the art of existing prototypes led us to decide not to investigate the degree of analyst intervention required or the performance required of the tool (point b); on this point we limit ourselves to giving some general findings that emerged while conducting the research. Investigating these aspects would have required further investment in more extensive market research; such a study would be justifiable only given a positive outcome, by no means guaranteed, on the issues related to point a. Moreover, to assess the potential market for an NLP-based tool for requirements analysis, we studied aspects related to the diffusion of methods and instruments of software engineering. In particular, we intended to verify whether requirements analysis is in fact considered critical in relation to other important activities in software development (testing, documentation, etc.).

1.3 Structure of the paper

The paper is organised as follows: the next section describes the context of an NLP-enabled CASE tool and summarises possible applications of linguistic tools for requirements engineering. This provides information on the design of the questionnaire and the eventual interpretation of the results. The third section outlines the plan of the market research, noting the different phases and focussing on the questionnaire and on the characteristics of the respondents. The main results of the online survey are presented in the fourth section, where they are analysed using a statistical technique referred to as correspondence analysis. The profiles obtained have revealed the existence of two market niches characterised by their diverse approaches to software development. Finally, some observations are given regarding the characteristics of the survey and the extendibility of the results. The conclusions summarise how the results of the survey can be used by those who develop software in general, and by those who design tools and environments for requirements analysis in particular.

2 The role of natural language in requirements engineering

Much has been written on the importance of requirements analysis. In order to show why environments and tools to support such analysis are less satisfactory than those available for the other phases of the software life cycle, we shall briefly review the distinctive features of requirements engineering, defined as:

...the systematic approach of developing requirements through an iterative cooperative process of analysing the problem, documenting the resulting observations in a variety of representation formats, and checking the accuracy of the understanding gained. [3, p. 13]

This definition makes evident the central importance of communication^{Footnote 8} and knowledge. Compared with other phases of software engineering, requirements analysis and conceptual modelling [15] present unique difficulties. Many of the activities involved are cognitive and require creativity as well as knowledge about information technologies and the application domain. Moreover, the recent advances brought about by business process re-engineering (BPR) and the inclusion of innovative components in information systems are broadening the scope of projects. As a consequence, the number of actors, interactions, and languages involved has increased. Completing the picture are the needs of companies, which operate at ever higher levels of competitiveness and which demand increasingly flexible information systems.

In this context, the use of linguistic tools – more precisely of NLP systems – to support the development of software systems in general and requirements analysis in particular, may help the analyst to:

Concentrate on the problem rather than on the modelling;
Interact with other actors;
Take into account the various kinds of requirements (organisational, functional, etc.);
Achieve traceability as from the first documents produced;
Manage more efficiently the problem of the changing user requirements^{Footnote 9}.

As regards the possible applications of NLP systems to requirements engineering, it is worth noting that they are able to process both vocal and textual input, sometimes imposing restrictions such as limiting the vocabulary or the grammar.

NLP systems can be used to obtain, with different levels of performance, essentially three types of output:

Syntactic, semantic, or pragmatic analysis;
Text either in the same language or another one, natural or artificial;
Syntheses in the form of differently structured summaries or templates.

Figure 1 is a simplified scheme of an ideal general-purpose NLP system. It is important to remember that the systems for real applications are usually highly dependent on the task and on the domain.^{Footnote 10}

With reference to this scheme, linguistic tools of differing complexity and especially of differing maturity can be used:

a. In the requirements elicitation phase:
- To facilitate the digitising of requirements documents using speech recognition systems or NLP-based interrogation interfaces;
- To reveal ambiguities and contradictions in documents describing user needs (see, for example, [12, 18, 19]);
- To design questionnaires or interviews, by verifying the ambiguity of the questions;
- For automatic analysis of replies to open-ended questions, interpreting and classifying their contents [20].
b. To model requirements by extracting (directly from the text) the descriptions of the elements to be included in the conceptual models envisaged by the development method adopted, in particular UML (Unified Modelling Language^{Footnote 11}) diagrams (see Fig. 2).
c. To support requirements validation, by exploiting the generation functionality of NLP systems to produce descriptions in natural language based on the structures used to represent knowledge.

Fig. 2. The models generation process

A complete vision requires noting that NLP tools can also be used for documentation, generating reports on the various stages of requirements collection and modelling; for traceability, allowing a link to be maintained between the texts used and the models produced; and for the translation of documents into various languages, something that becomes increasingly necessary in the design of international information systems.

The survey described in this paper concerns the second of these points, that is, the use of NLP techniques to support the development of conceptual models, given that it requires the design of a modelling module. All the other activities could be supported by existing functionalities of an ideal NLP system, albeit with different performances. The most important assumption is that the requirements documents, once analysed, can contribute to a “knowledge base” from which to extract elements deemed useful for modelling activities. There are two important aspects to note regarding projects for developing this type of instrument: (i) many of these projects are based on ad hoc NLP systems, and therefore do not appear to correspond to the requirements for scalability and robustness of real applications; and (ii) given the complexity of natural language, almost all of them expect that documents will be written in restricted language or that some revision of the text will have taken place before undergoing the automatic analysis. These two facts are worth remembering when interpreting the results of market research and when estimating potential investments in NLP technologies, and certainly when developing a CASE module to support requirements analysis.

3 Plan and realisation of the market research

The decision to investigate the market for an NLP-based tool for requirements analysis was made in the context of a joint research project with the Department of Computer Sciences of Durham University (UK), in which a prototype CASE tool – called NL-OOPS^{Footnote 12} – was developed for requirements modelling according to the object-oriented approach [21, 22].

The market research described here was based on the administration of a questionnaire whose design required consideration of the experience gained throughout the development of NL-OOPS and of the methodology and techniques of online market research. Specifically, the research progressed in the following phases:

Preliminary survey
Identification of interview subjects
Designing and testing of the questionnaire
Selection of the contact method
Distribution of the questionnaire and reminders
Collection and analysis of the data

A description of each phase follows, with greater emphasis on the third phase (designing the questionnaire) and on the final stage (analysis of data).

Preliminary survey

The first step in the research project was to create a focus group composed of both companies that develop linguistic instruments as well as big and small businesses that develop software or offer services linked to the introduction of information technologies in the workplace. The goal of this phase was to collect information about the users’ needs that could be satisfied with an NLP-based CASE tool and to gather other information useful in designing the questionnaire. The researchers were immediately confronted with pessimistic views of tools which use NLP techniques to support requirements analysis. In particular, some focus group members expressed serious doubts that the language in the documents gathered for requirements analysis was sufficiently ‘natural’ to justify the adoption of a tool based on NLP techniques. Others questioned the technical feasibility of such tools, citing their own unsatisfactory experiences with other NLP applications such as translation programs.

Identification of interview subjects

In accordance with the objective of the study, the questionnaire was directed principally to persons involved in software development, and in addition to managers responsible for important decisions regarding the process of software development, including the decision to adopt methodologies and support instruments. From a statistical viewpoint, when dealing with a survey conducted via Internet, one of the main problems is to establish the degree to which the sample is representative of the target population, in this case the people or companies involved in software development. On the one hand, it is reasonable to assume that the intended respondents are reachable by Internet, while on the other hand the population has characteristics (number, size, geographic distribution, etc.) that are not documented. Given this and also considering the chosen methods of contact, the approach to the study is conceptually similar to a sequential sampling. Statistically, this would classify it as a descriptive study, and as such requires caution when extending the results outside of the survey sample.

Designing and testing of the questionnaire

Again considering the objectives of the study, in terms of both methodology and content, the survey was conducted only via Internet and it consisted of a questionnaire on a web page (see the Appendix).^{Footnote 13} This choice was the driving force during the design and testing stage, the aim being to have a concise questionnaire with close-ended questions in language as clear as possible.^{Footnote 14} As for the questions themselves, the choices were made as logical and pertinent issues emerged throughout the course of the focus group. After a phase of testing in which the questionnaire underwent the scrutiny – first directly and then online – of a select group of analysts and project managers, the final version was produced. The final questionnaire was divided into two sections, for a total of eighteen questions, and a final open question for further observations. The first group consisted of questions relating to the company (questions 1–4) and to the respondent (questions 5 and 6). The second part investigated processes of software production, so that one group of questions concerned the use of methodologies (questions 7–10) and tools (questions 13 and 14) in software development; another group dealt with documents used in requirements analysis (questions 11, 12, and 15) and the last three were about the efficiency of the development process (questions 16, 17, and 18). The respondents were also asked if they were interested in obtaining the results of the research or in viewing a demonstration of a prototype of an NLP-based CASE tool. The decision to introduce questions associated with an engineering approach to software development was made after verifying the possibility of using existing data. Surprisingly^{Footnote 15}, only a small amount of data was found, whether for the diffusion of object-oriented methodology or for the use of ‘classic’ models such as the entity-relationships models. 
These are important because the early research and conceptual models for linguistic analysis of requirements [7] looked to produce entity-relationships diagrams; moreover, these models can be seen as a particular case of the class models foreseen by the object-oriented approach. As regards the market for CASE tools^{Footnote 16}, in many cases they did not meet expectations and as a consequence did not have the desired market success [25]. We will have to wait for the adoption of the UML – developed about one year before the present research project began – as a standard for conceptual modelling by the OMG (Object Management Group); only then will there be a significant growth in the market for CASE tools, repackaged and renamed as object modelling tools or visual modelling tools. In short, the scarcity of data on the penetration and role of an engineering approach to software development influenced the choice of questions for the survey, but also, as we shall see, the ability to validate and extend the results.
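The grouping of the eighteen closed questions described above can be written down as a small lookup table, which makes it easy to check that the groups cover every question exactly once. This is only an illustrative sketch: the group names and the helper functions `all_questions` and `group_of` are labels chosen here, not identifiers used in the survey.

```python
# Question groups as described in the text; the group names are
# hypothetical labels chosen for this sketch.
QUESTION_GROUPS = {
    "company": [1, 2, 3, 4],
    "respondent": [5, 6],
    "methodologies": [7, 8, 9, 10],
    "documents": [11, 12, 15],
    "tools": [13, 14],
    "efficiency": [16, 17, 18],
}


def all_questions(groups):
    """Return every question number that appears in any group, sorted."""
    return sorted(q for numbers in groups.values() for q in numbers)


def group_of(question, groups=QUESTION_GROUPS):
    """Return the name of the group that contains the given question."""
    for name, numbers in groups.items():
        if question in numbers:
            return name
    raise KeyError(question)


if __name__ == "__main__":
    # The six groups should cover questions 1 to 18 with no gaps or repeats.
    print(all_questions(QUESTION_GROUPS) == list(range(1, 19)))
    print(group_of(15))
```

The final open question and the two requests about receiving results or a demonstration are left out, since the text counts them separately from the eighteen closed questions.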

The questions considered most important to verifying the existence of a market niche for an NLP-based CASE tool are those related to the documents used to collect requirements. In fact, as we have already seen, if documents are in real natural language, an even more sophisticated (and costly) technology is needed to develop an environment that effectively supports analysis using linguistic instruments. It is therefore useful to establish whether the company is in a position to require clients or analysts to describe requirements in a restricted language. Typical restrictions can include: (i) grammar – aiming to have syntactic constructions that are easier to analyse by requiring, for example, shorter phrases, using the active voice, by avoiding anaphorical references, etc.; and (ii) vocabulary – aiming to reduce ambiguity of terms. Moreover, in order to determine the degree of customisation required of a possible NLP-based tool, further questions dealt with the level of specialisation of the terminology and the domain knowledge required to develop the software.
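To make the two kinds of restriction concrete, the following minimal sketch flags sentences that break some typical controlled-language rules: long sentences, a possible passive construction, a possible anaphoric reference, and words outside an agreed vocabulary. It is not taken from NL-OOPS or from any tool discussed in the paper. The word limit, the pronoun list, the passive-voice pattern, and the function name `check_sentence` are all assumptions made for illustration, and the heuristics are deliberately crude.

```python
import re

# Hypothetical thresholds and word lists, chosen only for illustration.
MAX_WORDS = 20
PRONOUNS = {"it", "its", "they", "them", "their", "this", "these", "those"}
# Crude passive-voice heuristic: a form of "to be" followed by a word in -ed.
PASSIVE = re.compile(r"\b(is|are|was|were|be|been|being)\s+\w+ed\b", re.IGNORECASE)


def check_sentence(sentence, glossary=None):
    """Return a list of restriction warnings for one requirement sentence."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    warnings = []
    # Grammar restriction: prefer shorter sentences.
    if len(words) > MAX_WORDS:
        warnings.append("too long")
    # Grammar restriction: prefer the active voice.
    if PASSIVE.search(sentence):
        warnings.append("possible passive voice")
    # Grammar restriction: avoid anaphoric references.
    if any(word in PRONOUNS for word in words):
        warnings.append("possible anaphoric reference")
    # Vocabulary restriction: only terms from an agreed glossary, if one is given.
    if glossary is not None:
        unknown = sorted(set(words) - glossary)
        if unknown:
            warnings.append("outside vocabulary: " + ", ".join(unknown))
    return warnings


if __name__ == "__main__":
    print(check_sentence("The system stores each order."))
    print(check_sentence("The order is validated and it is stored."))
```

Checks of this kind are purely syntactic, which is why they are cheap to apply to restricted language; they say nothing about the meaning of a sentence, which is where unrestricted natural language becomes costly to analyse.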

In the questions related to the efficiency of production processes, respondents were asked in particular about the improvements that they would like to see (choosing from a list of eight possible activities considered critical, two of which are fundamental for the phase of requirements analysis) and how they could be achieved, the choice being between ‘internal delegation’, ‘outsourcing’, and ‘automation’. The final question was designed to ascertain whether the company was able to deliver the software systems or products without delays. Finally, in keeping with the general rule of market research, an incentive to participate was provided in the form of a random draw among respondents for tickets to an opera performance at the Arena in Verona.^{Footnote 17}

Selection of the contact method

The objectives of the research and the characteristics of the tool inherently required a contact method that would permit efficient use of time and resources while at the same time reach the largest number of potential respondents. On this point, to take into account the fact that there is a high level of saturation – due to the large number of such survey requests that the respondents receive – we had initially thought to send the questionnaire to some specialised newsgroups^{Footnote 18}, highlighting the academic nature of the research. In the first phase we identified three newsgroups whose work is related to the research topic (comp.object, comp.software-eng, alt.comp.software-tools); another 21 newsgroups were later added to the list (the complete list is available at http://online.cs.unitn.it/). Nonetheless, after this method of contact proved less successful than expected^{Footnote 19}, we decided to contact the companies directly by email, supplying them with the address of the Web page where they could find and complete the questionnaire. The companies’ addresses were acquired online using search engines, in particular a directory of Yahoo! (http://www.yahoo.com – Computer > Software > Developers).

Distribution of the questionnaire and reminders

As described above, the questionnaire was administered in two different ways. In the first phase it was publicised on a number of newsgroups devoted to software development (resulting in 44 completed questionnaires from 39 software companies), and in the second, requests to take part in the survey were sent by email to 1541 addresses corresponding to 1234 software companies. By means of this second method, 107 completed questionnaires, corresponding to 103 companies, were obtained. To get these results, it was necessary in many cases to send a message reminding the recipient to participate in the study, while at the same time allowing him or her to explain the decision not to complete the questionnaire. Reasons given for not completing the questionnaire frequently referred to a lack of time and the large number of requests of this kind received (the email messages sent are accessible online at http://on-line.cs.unitn.it/). In addition, several addresses were incorrect, although the percentage was rather low (7.6%, or 6.1% if calculated by number of companies).^{Footnote 20} Consequently, the number of valid contacts was 1424, corresponding to 1159 companies.

Collection and analysis of the data

A total of 151 questionnaires were returned, 91% within five days of sending the initial request or the questionnaire itself. The response rate calculated for the questionnaires sent via email was around 8%. This can be regarded as a satisfactory result when compared with traditional surveys conducted by post or fax, and with other surveys of software development, for which the response rate has been 3% [25].^{Footnote 21} In strictly statistical terms, the group of companies contacted – while constituting in itself a large number – cannot be taken as a representative sample of the population of software development companies. Given this, it is important that the results be interpreted in a descriptive mode, thus requiring caution in extending them. We shall see, however, that for some questions the quality of the survey results can be evaluated by comparing them with those obtained from other surveys and with data relative to the CASE market. The results of these comparisons are provided at the end of the next paragraph.
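The arithmetic behind the valid contacts and the email response rate can be reproduced from the figures reported for the second distribution phase. The sketch below is only a check on those figures: the paper does not state which denominator was used for the rate of "around 8%", so both the per-address and the per-company versions are computed, and the function names are hypothetical.

```python
def valid_contacts(sent, invalid_share):
    """Number of contacts left after removing the share of incorrect addresses."""
    return round(sent * (1 - invalid_share))


def response_rate(completed, contacts):
    """Completed questionnaires as a fraction of valid contacts."""
    return completed / contacts


# Figures reported for the email phase of the survey.
addresses = valid_contacts(1541, 0.076)  # valid email addresses
companies = valid_contacts(1234, 0.061)  # valid companies

rate_by_address = response_rate(107, addresses)
rate_by_company = response_rate(103, companies)

if __name__ == "__main__":
    print(addresses, companies)
    print(f"{rate_by_address:.1%} per address, {rate_by_company:.1%} per company")
```

Under these assumptions the two rates fall on either side of 8%, which is consistent with the rounded figure given in the text.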

On a methodological level, the use of newsgroups confirmed that little effort was required to ask respondents to participate, but the low number of questionnaires completed may nullify this advantage. Furthermore, the use of newsgroups should be evaluated on the basis of the following factors: level of specialisation^{Footnote 22}, number of messages, and presence of a moderator. In light of the results of our survey, in the case of very specialised newsgroups, even if the contents of the survey are relevant to them, in order to increase the response rate it is advisable to ask for the moderator’s consent, or to identify one or more newsgroup leaders who can legitimate the survey with their participation.

The initial analysis noted the geographic distribution of the respondents, most of whom are residents of European states or of North America (see Fig. 3). This first result of the research is supported by the analysis of similarities among different geographic distributions (using appropriate indices) showing, in fact, that these markets have similar characteristics. Given this, we present here results of the survey in its entirety, highlighting only those aspects where the geographic area of residence influenced the responses.

Eighty-six percent of the respondents filled roles relating to software development projects, 68% having occupied the role for more than six years.^{Footnote 23} Moreover, as was to be expected, length of service influenced the position occupied in the company, so that programming work was more frequently performed by persons employed for the shortest periods, while those who had worked in their companies for 6–10 years were almost uniformly distributed among roles. It should be noted that the majority of European respondents selected ‘System Engineer/Architect’ whereas their American counterparts selected ‘Project Manager’, which may have been because different terms are used to denote the same role in the two areas. Some 29% of the respondents worked in companies with more than 100 employees, although small-sized companies were also well represented (Table 1).

Table 1. Company size

The core business of the companies surveyed in 77% of the cases is ‘Software’ and in 23% is ‘Websites’ or ‘Other’. As expected, the highest percentage of companies engaged in other types of business (or rather, also in other types of business) consisted of larger-sized ones. As regards the type of software produced, 42% of the companies developed software for niche markets (Fig. 4), with a high of 48% for North America. This may be due to the presence of a larger number of small-sized companies, given that 59% of companies with five or fewer employees, and 24% of those with more than 100, operated in niche markets. Software products were mostly sold to the end user: 84%;^{Footnote 24} only 13% sold to another software company, and 3% to software shops. Interestingly, given the nature of this type of product, all the companies that developed websites sold their products directly to the end users.

The next section provides a detailed analysis of the results of research into the existence of a potential market for an innovative tool to support conceptual analysis – a tool that has the capability to analyse documents written in varying levels of natural language.

4 The results of the survey and the potential demand for an NLP-based tool to support requirements analysis

We can identify three groups of elements that are useful in evaluating potential demand^{Footnote 25} for a CASE tool to support requirements analysis for documents written in natural language. They can be described as follows, taking into account their interrelatedness:

The market for instruments supporting software development and requirements modelling

How extensive is the market? How much competition is there? Do software developers use CASE tools? If so, which ones? (Normally the use of a CASE tool presupposes the adoption of a development methodology.) This last point was important both for establishing which conceptual models the tool should support (an aspect that became less important with the diffusion of UML^{Footnote 26}), and for reasons of compatibility and integration with existing tools.^{Footnote 27} Some information on this point could be obtained by means of the data on sales of CASE tools, but one question on this topic was inserted regarding the tools supporting requirements analysis and top-level design.

Features of the tool

The requirements principally influencing the investments necessary to develop a tool for requirements analysis based on linguistic instruments are (a) the language found in the documents gathered in the elicitation of requirements phase, crucial in identifying appropriate techniques and linguistic instruments, and (b) the degree of specialised domain knowledge required of the tool, which determines the degree of specialisation required of the producer of the CASE tool (generality). Also, given the state of the art of linguistic instruments, an important consideration is the performance required of the tool; in other words, how ‘good’ does it have to be to merit purchase?^{Footnote 28}

Requirements analysis viewed as crucial

This is a vital element in identifying potential market niches and in ascertaining the tendency of users to invest in a tool that supports requirements analysis, as well as their willingness and ability to accept the changes that accompany the adoption of a new tool. Companies that have an engineering approach to software development have highly standardised processes and should therefore consider the activities lacking structure or support as crucial points demanding attention. A company employing a more informal or ‘craft’ process would not necessarily share this concern but would, however, be more interested in the use of natural language.

To glean the most useful information on these three points, we analysed the completed questionnaires in two phases. In the first phase we looked at individual answers, studying reciprocal relationships and dependencies. In the second phase we applied correspondence analysis [28], aiming to unveil the existence of profiles corresponding to potential market niches for an innovative CASE tool.

4.1 The market for instruments supporting software development and requirements modelling

As for the use of a tool supporting requirements analysis and top-level design, only 30% replied positively. As was expected, greater use was made of these tools in large-sized companies, reaching 51% in those with more than 100 employees, as is shown in the table of conditional distributions (Table 2). Not surprisingly, the use of these tools increases with length of service (rising from 17% to 36%) with analysts as the category of employee using them most frequently.
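The conditional distributions referred to above can be illustrated with a minimal sketch. The counts and the intermediate size class are hypothetical and are not the survey data; the fragment only shows how the share of tool users within each size class is compared with the share in the whole sample.

```python
# Hypothetical counts (NOT the survey data): rows are company size classes,
# columns are (uses a tool, does not use a tool).
counts = {
    "1-5 employees": (10, 40),
    "6-20 employees": (12, 38),
    "21-100 employees": (15, 30),
    ">100 employees": (26, 25),
}

def conditional_distribution(table):
    """Return, for each row, the share of each column within that row."""
    result = {}
    for label, row in table.items():
        total = sum(row)
        result[label] = tuple(value / total for value in row)
    return result

def marginal_distribution(table):
    """Return the column shares over the whole sample."""
    column_totals = [sum(col) for col in zip(*table.values())]
    grand_total = sum(column_totals)
    return tuple(total / grand_total for total in column_totals)

by_size = conditional_distribution(counts)
overall = marginal_distribution(counts)
for label, (uses, not_uses) in by_size.items():
    print(f"{label:>18}: uses tool {uses:.0%}, no tool {not_uses:.0%}")
print(f"{'all companies':>18}: uses tool {overall[0]:.0%}, no tool {overall[1]:.0%}")
```

A row whose share of tool users is well above the overall share, as for the largest class in this toy table, is what the text describes as greater use in large-sized companies.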

Table 2. Use of tools for requirements analysis and top-level design by company size

Moreover, 84% of the respondents stated that they used specific methodologies for software development. Size was a determining characteristic here: 78% of companies with five or fewer employees used specific methodologies, compared with 93% of those with more than 100. The type of software or the sales channel did not significantly influence the use of methodologies, although role and experience seemed to do so to some extent.

The best-known diagrams for data modelling, entity-relationship (E-R) diagrams, were used by 63% of respondents who adopted a methodology. Moreover, E-R diagrams were used less frequently in smaller companies (52% in companies with fewer than five employees, 73% in those with more than 100). The use of E-R diagrams was substantially greater among respondents who had worked longer in the computer business (increasing from 35% among those who had worked in the field for less than three years to 66% among those who had done so for more than ten). Finally, as regards the type of software, E-R diagrams were used to very different extents by respondents who developed general-purpose software (93%) and by those who developed network software (25%), while there were no substantial differences as far as the other items were concerned.

The percentage of respondents who used an object-oriented (OO) method was 68%, a percentage similar to that of E-R diagram users. The classification by company size shows a difference between companies with five or fewer employees (60% of which used OO methods) and those with more than 100 (74% of which did so). There were no significant variations with respect to years of experience, while there was a closer association with the position occupied within the company: the percentages ranged from 45% for programmers to 78% for system engineers/architects. An interesting comparison can be made in Table 3, where one notes that those who adopted OO methods were already accustomed to using E-R diagrams, which suggests that E-R users were more inclined to adopt an OO approach.

Table 3. Entity-relationship diagrams and object-oriented methods

As for the most widely used OO method, 77% of respondents who replied in the affirmative to the previous question declared that they used UML. This result confirms the establishment of UML as the industry standard for OO modelling. It is worth mentioning that the survey was carried out approximately one and a half years after the adoption of UML by the OMG.

It also emerged that the great majority of the respondents who said that they did not use methodologies did not use tools for requirements analysis and top-level design either (90%): indeed, there is an association between the use of methodologies and CASE tools. Another finding to be emphasised is the connection between the use of CASE tools for requirements analysis or top-level design and the type of language employed in documents. Not unexpectedly, these tools were used more frequently when the language was more formal (24% with ‘Common natural language’ and 63% with ‘Formalised language’). Even if these results should be treated with caution, given the low number of companies surveyed, they seemingly confirm the inability of currently available CASE tools to meet the needs of natural language processing by yielding environments that are effectively useful. As far as the tools used are concerned, 52% of respondents who replied in the affirmative to the previous question declared that they used Rational Rose.^{Footnote 29} Rational Rose was the tool with the highest market share both worldwide and in Europe.^{Footnote 30} In 1998 it accounted for 33% of the market, with an increase of 79% on the previous year.^{Footnote 31} For this reason, the percentage found by our survey (52% for the year 1999) appears to be as one would expect.^{Footnote 32}
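The association between methodologies and tools can be made concrete with a short calculation from the percentages reported above (84% use a methodology, 30% use a tool, 90% of non-adopters of a methodology use no tool). This is a sketch only: it assumes that all three figures refer to the same respondent base, which is an approximation because non-replies were omitted question by question.

```python
# Percentages reported in the text, expressed as fractions.
p_method = 0.84                   # respondents using a development methodology
p_tool = 0.30                     # respondents using a requirements/top-level design tool
p_no_tool_given_no_method = 0.90  # non-adopters of a methodology who use no tool

# Assumption: the three figures share one respondent base.
p_no_method = 1 - p_method
p_tool_and_no_method = p_no_method * (1 - p_no_tool_given_no_method)
p_tool_and_method = p_tool - p_tool_and_no_method
p_tool_given_method = p_tool_and_method / p_method

print(f"tool use without a methodology: {1 - p_no_tool_given_no_method:.0%}")
print(f"tool use with a methodology:    {p_tool_given_method:.0%}")
```

Under that assumption, roughly one in three methodology adopters would use such a tool, against one in ten of the non-adopters, which is the direction of the association described in the text.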

4.2 Features of the tool

As noted, the type of language used in requirements documents determines the complexity of the linguistic instruments and of the NLP techniques to be used. When documents are written in a constrained language (a subset of natural language) – which imposes restrictions on the grammar, vocabulary, or both – simpler and more mature linguistic tools can be used. However, it is not usually possible to impose restrictions on the language employed, for two reasons. First, it is necessary to adopt a customer-oriented approach in the development of software applications. Second, it is necessary to reduce the risk that the restrictions imposed on the language and the formalisms adopted will force the user, or even the analyst, to express what the models permit to be represented, rather than the real requirements of the system. The survey shows that, in both Europe and North America, requirements documents are furnished directly by the customer and integrated with interviews in around two-thirds of projects. The main difference between the two regions considered was the percentage of companies that conducted interviews with customers: 73% in North America and 58% in Europe, without significant differences in behaviour between small- and large-sized companies.

With regard to the level of the terminology in requirements documents, one finds that 79% of the latter are couched in natural language (Fig. 5). For the correspondence analysis, the final two modalities (structured and formalised language) have been merged.

An analysis of the interdependence of the use of natural language with the other factors examined did not show any significant association with type of company, nor with the adoption of a methodology.

Another important aspect concerning both the potential demand for an NLP-based CASE tool in particular and software development in general is the domain knowledge required for an adequate understanding of the problem so that the user’s requirements can be defined. In fact, in the presence of high levels of specialist knowledge, the tool must be adapted to the needs of every customer if it is to operate efficiently in different corporate settings. By contrast, a very low level permits the development of a single standard tool able to operate in different fields of application. In this regard, it was found that respondents required an average (54%) to high (34%) level of domain knowledge. It also emerged that the higher the level of domain knowledge required to develop the software, the greater the use of methodologies (9% for low levels, 53% for average ones, and 38% for high ones) and of tools for requirements analysis and top-level design (2%, 56%, and 42%, respectively).
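One way to read the percentages in the last sentence is as the distribution of methodology users (and of tool users) across the three levels of domain knowledge, since each triple sums to 100%. On that reading, which is an interpretation, the claim can be checked by dividing each share by the share of the same level in the whole sample. The 12% used for the low level is not stated in the text; it is inferred here as the remainder of 54% and 34%.

```python
levels = ("low", "average", "high")

# Share of all respondents at each level. The text reports 54% (average) and
# 34% (high); the 12% for "low" is inferred as the remainder (an assumption).
overall = {"low": 0.12, "average": 0.54, "high": 0.34}

# Distribution by level among methodology users and among tool users,
# as reported in the text.
methodology_users = {"low": 0.09, "average": 0.53, "high": 0.38}
tool_users = {"low": 0.02, "average": 0.56, "high": 0.42}

def representation_ratio(group, baseline):
    """A ratio above 1 means the level is over-represented in the group."""
    return {level: group[level] / baseline[level] for level in levels}

for name, group in (("methodologies", methodology_users), ("tools", tool_users)):
    ratios = representation_ratio(group, overall)
    print(name, {level: round(ratio, 2) for level, ratio in ratios.items()})
```

In both cases the ratio rises from the low to the high level, which is consistent with the statement that the use of methodologies and tools grows with the domain knowledge required.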

4.3 Requirements analysis viewed as crucial

As regards the efficiency of production processes, upon conclusion of the market study it was important to determine which software activities were viewed as crucial, as well as their weight relative to requirements (question 16).

In interpreting the answers to this question, it is worth noting that two selections were requested, so the percentages sum to more than 100. Fig. 6 shows that ‘Identify user requirements’ and ‘Model user requirements’ were cited as priorities by a high percentage of respondents.^{Footnote 33} ‘Identify user requirements’ was largely independent of the language used to model requirements (46% for ‘Common natural language’, 37% for ‘Structured natural language’, and 50% for ‘Formalised language’), as was ‘Testing the software’ (35%, 32%, and 38%, respectively). For ‘Model user requirements’, by contrast, the percentages were 38% for ‘Common natural language’ and 13% for ‘Formalised language’, in accordance with expectations. Another noteworthy finding is that testing was viewed as crucial by higher percentages (ranging from 19% to 46%) of the respondents who used no tools at all. A similar pattern is displayed by the level of domain knowledge necessary, where at low levels of knowledge, testing was perceived as more important than all the other activities (63%, compared to 32% and 30% for medium and high levels of knowledge). Also of interest is the fact that ‘Learn to use a new tool’ was selected by a higher percentage of respondents declaring that they did not use a tool for requirements analysis than by those who instead said that they used a tool of this kind.^{Footnote 34}
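The effect of requesting two selections per respondent can be seen in a small sketch. The answers below are hypothetical and are not the survey data; percentages are computed over respondents, not over selections.

```python
from collections import Counter

# Hypothetical answers (NOT survey data): each respondent picks exactly two activities.
answers = [
    ("Identify user requirements", "Testing the software"),
    ("Identify user requirements", "Model user requirements"),
    ("Model user requirements", "Testing the software"),
    ("Identify user requirements", "Learn to use a new tool"),
]

def share_of_respondents(responses):
    """Percentage of respondents who selected each option."""
    counts = Counter(option for picks in responses for option in set(picks))
    return {option: 100 * n / len(responses) for option, n in counts.items()}

shares = share_of_respondents(answers)
for option, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{option}: {pct:.0f}%")
print(f"sum of percentages: {sum(shares.values()):.0f}%")
```

Because every respondent contributes two selections, the percentages add up to 200% when all respondents make both choices, so each figure should be read as a share of respondents and not as a share of a single total.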

The importance of this question requires a comparison of the results for Europe and North America (see Fig. 7). Also the correspondence analysis – reported in the second part of this section – was done taking into account the centrality of this question with respect to the objectives of the market research, in which the activities considered most critical become determinative when identifying profiles.

To the question ‘What would be the most useful thing to improve general day-to-day efficiency?’, the majority (64%) chose the option ‘Automation’, while ‘Outsourcing’ was selected by 7% and ‘Internal delegation’ by 29%. Contrary to expectations, no particular differences emerged among the replies to this question with respect to company size; the only significant difference concerned companies with 6 to 20 employees, for which the percentage selecting ‘Internal delegation’ was nearly double that for other company groups, a difference that may be due to organisational shortcomings. Interestingly, the percentage of respondents who used a methodology or a requirements analysis tool and believed it less important to increase the level of internal delegation was above the average of the entire sample. By contrast, there were no differences regarding the documents available for requirements analysis.

Joint analysis of the two questions on the efficiency of software production processes shows that a larger percentage of respondents who believed it important to increase the level of automation had previously selected ‘Learn to use a new tool’ and ‘Model user requirements’ (Table 4).

Table 4. Efficiency of software development processes

For the final question, regarding the average delay in delivery of the software, the best performances were achieved by companies with 6–20 employees (29% of which delivered with less than one week of delay and 59% with less than one month) and by those who sold directly to the end consumer (probably for contractual reasons). Though not to a statistically significant extent, companies using formalised language delivered with the least delay, although there were no substantial differences as regards delays of more than one month (26% for common natural language, 33% for structured natural language, 25% for formalised language). A fair interpretation of these results requires one to remember that the answers do not factor in the length of the projects. Nonetheless, assuming that an average delay of less than one week corresponds to companies which on average deliver the software within the designated time, similar findings are reported in [32], where more than 80% of the respondents stated that their projects were sometimes or usually late.

Considering the purpose of this study, and particularly the question of whether there is a market for an NLP-based CASE tool for requirements analysis, the results presented thus far confirm the perception of requirements analysis as crucial for the development of systems, the widespread use of the object-oriented approach and of UML, and the important role of natural language. Specifically:

More than 80% of the companies adopt a methodology to develop their software, and nearly 68% of them adopt an object-oriented method (UML or one of the methods merged into UML).
The majority of the documents available for requirements analysis are in natural language and are either furnished by the customer or obtained by means of interviews.
The domain knowledge required is medium to high.
Tools supporting requirements analysis and top-level design are used in less than one-third of cases.
However, identifying and modelling requirements are perceived as being at least as important as testing the software.
A higher level of automation is indicated by around 64% of the respondents as the most useful means to improve day-to-day efficiency.

All of these elements work together to confirm the existence of a potential demand for a CASE tool based on NLP. To justify this claim, we undertook a correspondence analysis (CA) study. CA is a statistical technique suited to the study of relationships between the modalities of two or more variables, usually qualitative. The main steps of correspondence analysis can be concisely described as follows:

1. Define a cloud of points (rows and columns of a contingency table) in a multidimensional vector space.
2. Choose the metric structure on this space.
3. Produce the fit of the cloud in step 1 to a variable low-dimensional subspace onto which the points (row and column profiles) are projected for display.
4. Give an interpretation of the clusters of points corresponding to the projections of the rows and columns of the original contingency table; analyse their absolute contributions as guides to the interpretation of the underlying dimensions and their relative contributions (the so-called squared correlations) to indicate how well the points are described along the considered dimension.
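The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the computation used for the survey: the contingency table is hypothetical, the chi-square metric is assumed in step 2 because it is the usual choice in CA, and a plain power iteration with deflation stands in for the singular value decomposition that a statistical package would normally use, so that the fragment needs only the standard library. The function assumes the table has at least `n_axes` non-trivial dimensions and does not guard against degenerate input.

```python
import math

# Hypothetical contingency table (NOT the survey data).
row_labels = ["Identify requirements", "Model requirements", "Testing"]
col_labels = ["methodology: yes", "methodology: no", "tool: yes", "tool: no"]
table = [
    [30, 5, 20, 15],
    [10, 15, 4, 21],
    [20, 10, 8, 22],
]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def correspondence_analysis(counts, n_axes=2, iterations=200):
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)
    # Step 1: the cloud of points, as a correspondence matrix with masses.
    P = [[value / total for value in row] for row in counts]
    r = [sum(row) for row in P]
    c = [sum(col) for col in zip(*P)]
    # Step 2: chi-square metric (assumed), applied via standardised residuals.
    S = [[(P[i][j] - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(n_cols)] for i in range(n_rows)]
    row_inertia = [sum(s * s for s in row) for row in S]
    # Step 3: the best-fitting axes are the leading eigenvectors of S S^T,
    # found here by power iteration with deflation.
    A = [[sum(a * b for a, b in zip(S[i], S[k])) for k in range(n_rows)]
         for i in range(n_rows)]
    axes = []
    for _axis in range(n_axes):
        u = [(-1) ** i * (i + 1.0) for i in range(n_rows)]
        for _ in range(iterations):
            w = matvec(A, u)
            norm = math.sqrt(sum(x * x for x in w))
            u = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(u, matvec(A, u)))
        sigma = math.sqrt(eigenvalue)
        v = [sum(S[i][j] * u[i] for i in range(n_rows)) / sigma
             for j in range(n_cols)]
        axes.append({
            "eigenvalue": eigenvalue,
            "row_coords": [sigma * u[i] / math.sqrt(r[i]) for i in range(n_rows)],
            "col_coords": [sigma * v[j] / math.sqrt(c[j]) for j in range(n_cols)],
            # Step 4: diagnostics used to interpret each axis.
            "row_abs_contrib": [x * x for x in u],
            "col_abs_contrib": [x * x for x in v],
            "row_sq_corr": [(sigma * u[i]) ** 2 / row_inertia[i]
                            for i in range(n_rows)],
        })
        # Remove the axis just found so the next pass finds the following one.
        A = [[A[i][k] - eigenvalue * u[i] * u[k] for k in range(n_rows)]
             for i in range(n_rows)]
    return sum(row_inertia), axes

total_inertia, axes = correspondence_analysis(table)
for k, axis in enumerate(axes, start=1):
    print(f"axis {k}: {axis['eigenvalue'] / total_inertia:.1%} of total inertia")
    for label, coord in zip(row_labels, axis["row_coords"]):
        print(f"  {label:<22} {coord:+.3f}")
```

The absolute contributions on each axis sum to one, and for a table with three rows the two axes account for all of the inertia, so the squared correlations of each row also sum to one across them.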

The geometry of CA is very similar to Karl Pearson’s [33] geometric description of principal components analysis. The closeness of the points to a line, plane, or in general to a low-dimensional subspace is defined as the sum of squared distances from the points to the subspace. In general, it is important to avoid the direct comparison of the distances among the projections of row and column profiles because they belong to different low dimensional subspaces and the raw interpretation of their distances may produce misleading conclusions.

Here we have considered a CA involving one of the items of the questionnaire (‘What should be done more efficiently’) as a dependent variable and some other collected variables as independent variables: number of employees, core business, kind of software produced, use of any methodology, starting documentation, level of terminology, use of any tool, knowledge of domain, thing to improve day-to-day efficiency, and average delay in delivering the software. The aim was to verify whether and how much the answer to this item is influenced by the modalities of the other variables, and to identify relevant aggregations of modalities that can reveal the potential market demand for a CASE tool based on NLP.
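A contingency table of this kind can be assembled by cross-tabulating the dependent item against the modalities of each independent variable and placing the resulting blocks side by side. The sketch below assumes this stacked layout, which the text does not spell out; the records and field names are hypothetical, and each respondent is given a single answer to the dependent item for simplicity.

```python
from collections import Counter

# Hypothetical respondents (NOT survey data); field names are illustrative.
respondents = [
    {"critical": "Identify requirements", "methodology": "yes", "tool": "yes"},
    {"critical": "Identify requirements", "methodology": "yes", "tool": "no"},
    {"critical": "Model requirements", "methodology": "no", "tool": "no"},
    {"critical": "Testing", "methodology": "yes", "tool": "no"},
    {"critical": "Model requirements", "methodology": "yes", "tool": "no"},
]

def stacked_table(records, dependent, independents):
    """Cross-tabulate the dependent item against every modality of every
    independent variable, with one column per (variable, modality) pair."""
    counts = Counter()
    for record in records:
        for variable in independents:
            counts[(record[dependent], f"{variable}={record[variable]}")] += 1
    row_labels = sorted({row for row, _ in counts})
    col_labels = sorted({col for _, col in counts})
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table

rows, cols, table = stacked_table(respondents, "critical", ["methodology", "tool"])
print(" " * 24 + "  ".join(cols))
for label, row in zip(rows, table):
    print(f"{label:<24}" + "  ".join(f"{value:>{len(col)}}" for value, col in zip(row, cols)))
```

Each respondent is counted once per independent variable, so every row total equals the number of respondents giving that answer multiplied by the number of independent variables.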

We present here the result of the application of the CA based on the responses to the question regarding which activities are considered most critical (see Fig. 8).^{Footnote 35}

An initial interpretation of the graph can be reached by looking at the axes. Specifically, one can interpret the vertical axis in organisational terms, assuming that the request for more automation rather than internal delegation is due to an already more or less solid organisational structure. The horizontal axis, meanwhile, corresponds to an engineering or to a more informal approach to software development depending on the use or non-use of methodologies and instruments to support analysis and designing.

According to this interpretation of the graph, there are two potential market niches. The first market niche corresponds to companies that adopt methodologies and instruments to support requirements analysis and top-level design. We can safely assume that they use an ‘industrial’ rather than ‘craft’ software development process. For this type of company, project evaluation is considered a critical activity, along with requirements identification. These two activities, among the possible activities listed in the questionnaire, are the most interdisciplinary and at the same time the most difficult to structure. In particular, for purposes of our study, requirements identification can be efficiently supported by tools able to analyse documents in natural language. Moreover, for this type of company, the tool should be specialised to have an appropriate level of domain knowledge for the given area of software development. The client provides requirements documents and the software produced is in turn delivered to the client. For a customer-oriented approach, this means having only a limited possibility to ask the client to write the documents in a restricted form of natural language; however, these companies sometimes receive the documents in a somewhat structured (formalised) form. In these cases it is possible to envision the use of less sophisticated linguistic techniques to analyse requirements documents in order to produce conceptual models using the object-oriented approach.

The second market niche includes medium- or large-sized companies that use neither methodologies nor instruments to support requirements analysis and top-level design. They do, however, perceive requirements modelling as critical, along with other activities such as software documentation and testing, which are already supported in varying ways by existing CASE tools. One can reasonably conclude that this second group of companies also constitutes a market niche for a CASE tool enabled by linguistic instruments. In fact, a CASE tool of this type could integrate the functionalities of a traditional CASE tool, favouring the adoption of an engineering approach in software development. Another activity deemed critical is to learn new tools, an obstacle that could be surmounted by adopting a CASE tool that makes extensive use of natural language. The indication of requirements modelling rather than identification brings to light the fact that a problem at the level of requirements specification can hide deeper problems related to requirements elicitation (these can be supported by speech recognition systems and by all the functionalities envisaged in point (a) of Sect. 2). This is confirmed to some extent by the fact that identification, rather than modelling, of requirements is considered critical by the companies that adopt a more structured approach to software development.

An important aspect of this research is the broader application of the results. As noted, this research is descriptive, based on a large number of questionnaires (among the highest we have seen in our studies^{Footnote 36}), yet not fully representative of the population. The fact is that for the software industry, there simply is not enough information on the reference population to permit a meaningful and statistically correct extension of the results.

Having said this, we maintain that it is useful to make a comparison with data available in the literature. Table 5 summarises the most significant of these. Worth noting is the scarcity of existing data. Although the surveys to which these results refer are very different,^{Footnote 37} their similarities do stand out.

Table 5. Comparison with results relative to other surveys and the CASE market

We can also cite here some data found in [34], which contains detailed indications of the percentage of pages in natural language or similar forms – text with keywords, hierarchical enumeration, and tables – for three projects, having values ranging from 82% to 99% (73%, 43.9%, and 34.4%, respectively, only for natural language text).

Another aspect that enables positive assessment of the outcome of the survey is the low percentage of non-replies (1.65%) and the fact that in the case of replies for which the option ‘Other’ was selected, in 91% of cases a specification was given.

5 Conclusions

As the principal aim of this research project was to assess if there is a market for NLP-enabled CASE tools, the most important finding is that the majority of the documents available for requirements analysis are provided by the customer and couched in ‘real’ natural language, leading to the conclusion that the use of linguistic techniques and tools may perform a crucial role in providing support for requirements analysis.

Because an engineering approach suggests the use of linguistic tools suited to the language employed in the narrative description of user requirements, we find that in a majority of cases it is necessary to use NLP systems capable of analysing documents in full natural language. If the language used in the documents is controlled (that is, a subset of natural language), it is possible to use simpler and therefore less costly linguistic tools, which in some cases are already available. Instruments of this type can also be used to analyse documents in full natural language, even if in this case more analyst consultation is required to reduce the complexity of the language used in input documents or to intervene automatically in the models produced as output. Moreover, in many cases specialised knowledge of the domain is needed in addition to an adequate representation of the shared/common knowledge. Once again, the management of expert knowledge requires more substantial investments to adapt the tool to the company’s needs.

As for the potential demand for NLP-based CASE tools, two company profiles have been identified, corresponding to two distinct market niches. The first is composed of companies that have an engineering approach to software development and that indicated – of the two activities linked to requirements analysis – the identification of requirements as the more critical. In this case the tool could be configured as a module to integrate with the CASE tool already used by the company, and would provide support for phases where existing tools are insufficient. In the second market niche, the technologies of natural language are used to facilitate the adoption of a CASE tool and more generally of ‘best practices’ of software development, given that, along with requirements modelling, these companies also indicated as crucial some activities in which the contribution of software engineering is well developed (testing or software documentation, for example).

We can also make some preliminary observations here regarding the features expected of a tool based on NLP, proceeding from interviews with systems analysts/engineers and project managers in both small- and medium-sized companies. Specifically, they confirm assumptions made regarding potential demand and interest in the following features:

The possibility to accelerate the production of analysis models and to rapidly create models to be used in interactions with users and in project groups. The fact that, for example, the class models may contain spurious classes or that some classes may be missing was regarded as less important if the models are produced automatically.
The tool was also regarded as useful for the training of analysts, with the presentation of texts and the corresponding models, both for junior analysts and for the retraining of those unfamiliar with the object-oriented approach (the latter problem seems to be more important for small-sized companies).
The possibility of integrating the tool with CASE tools for drawing diagrams using the elements singled out by the algorithm, and with tools for document management.

Finally, for some questions in the survey (e.g. the use of methodologies and E-R models or the use of support tools in the initial phases of development) the contributions this paper makes to the field go beyond the confines of the market research as described by the title. It confirmed some expectations (the diffusion of the object-oriented approach), which on the surface could appear obvious, yet have not been sufficiently supported by hard data. It also confirmed the presence of significant possibilities for the adoption of instruments and methods of software engineering [35].

Notes

Multi-year project funded by the Department of Computer and Management Sciences of Trento University.
Some comparisons deriving from our research are described in [1].
For further study of issues related to online market research, the interested reader can refer to the literature (see for example, the publications found at ESOMAR – European Society for Opinion and Marketing Research – http://www.esomar.org).
A bibliography is available at http://nl-oops.cs.unitn.it.
The first proposals to use linguistic criteria for the extraction of entities and relations, and then objects and associations, from narrative descriptions of requirements date from the 1980s [7].
Included in this category are, for example, the instruments described in [10] and [11].
For example, to recognise if Washington is the name of a person, of an airport, or of a city in a given document requires a semantic approach. Limitations on space do not permit a deeper discussion of this issue here; see for example [12].
“The hard part, and the true essence of requirements, is trying to understand your customer’s needs. A person involved in requirements needs human skills, communication skills, understanding skills, feeling skills, listening skills” [13]. See also [14].
For a recent study on why it is impossible for users to know their requirements beforehand, see [16].
On this point, see, for example, the tasks required by the MUC competitions (Message Understanding Competition) organised by the DARPA (Defense Advanced Research Projects Agency) [17].
The official documents of the UML’s specifications can be found on the OMG (Object Management Group) website: http://www.omg.org.
Natural Language – Object-Oriented Production System, http://nl-oops.cs.unitn.it.
The questionnaire is available along with the data gathered and other related research material at http://on-line.cs.unitn.it.
For example, a questionnaire like the one used for the survey described in [23] would have to be radically altered to be used online.
In light of the observations in [24], this may not be so surprising.
The choice of tools for question 14 was made on the basis of sales data for a period prior to the study.
Because the survey concluded at the end of the Arena opera season, the tickets were replaced with CDs of opera music by Verdi.
One of the aims of the survey, in fact, was to investigate the conditions under which newsgroups can be used to carry out online surveys.
Limited number of questionnaires obtained (44) and accusations of spamming.
This is a rather high percentage, bearing in mind that they were collected from the homepages of official company websites. Another survey carried out in the same period on winter tourism, where the addresses were provided by a specialized magazine, found a very similar percentage of wrong addresses (8.9%), but the amount can be much higher. For example, in a survey of Internet users carried out in 1996, 35% of a total 1221 addresses were found to be wrong [26].
This was the minimum value for the traditional-type surveys, which achieved a maximum response rate of 20%. In the survey described by Glass and Howard [25], the percentage rose to 17% after the questionnaire mailings were supplemented by telephone contacts with fax follow-up.
For a survey on virtual supermarkets, a message was sent to 6 newsgroups obtaining 100 completed questionnaires.
All the percentages were calculated on the total number of respondents who answered the relative questions, with non-replies omitted.
Further investigation of this aspect would require knowledge of the number and size of the companies’ customers. This, however, is beyond the scope of our survey.
For an introduction to the evaluation of potential demand, see for example, [27].
In the past, the need to support different graphic notations was a drawback to the market for CASE, in that it required producers to choose which notation to support with their own tools, or to absorb the higher cost of developing different versions.
A CASE based on linguistic techniques for object-oriented analysis does not necessarily require the realisation of an entire support environment, but rather can be seen as a module that can be integrated with an existing product.
A study of the ‘robustness’ is of utmost importance also to establish the degree of analyst intervention required in developing requirements models, and should be conducted using a prototype of the tool. See also point (b) of the introduction and conclusion.
None of the tools indicated by those choosing the option ‘Other’ was selected more than twice.
International Data Corporation (IDC) data.
These figures seem to contradict the results of the survey by Glass and Howard [25], where CASE technologies are described as being in decline. However, it should be pointed out that where back-end or ‘lower’ CASE technologies are concerned, many of the functions offered by these tools are by now part of the development environment. Moreover, other expressions are often used instead of ‘CASE’: for example, the IDC surveys use OOAMDC (object-oriented analysis, modelling, design, and construction) tools. On the other hand, in 1998 the market for OOAMDC grew by more than 10% (24% in Europe). See also the results in [29].
It should be pointed out, however, that the data of our survey are expressed in terms of units of output by the companies surveyed, while the sales figures are calculated on invoices and consequently depend on the prices charged by vendors.
Note also that around one-third of the final observations concerned the role and importance of requirements. Taking into account the different goals of the surveys described in [30, 31], we can compare these results with those obtained for a question therein on the perceived relative importance of software problems in Europe (most of the software problems are in the area of requirements specification and managing customer requirements, followed by documentation and testing) and on the perceived scope of a generic process model (defining system requirements, 78%).
In this regard we quote a remark made in one of the questionnaires: “I hate to be a cynic, but there are hardly any worthwhile tools. The overhead in learning to use them is too great for the payoff.”
The contingency table is available at http://on-line.cs.unitn.it.
Notable exceptions are the surveys conducted by the European Software Institute: http://www.esi.es.
These surveys were carried out with different objectives and using different methods and samples. The survey described in [25] used 78 questionnaires completed mainly by directors or managers of information systems development in companies operating outside the software field, while the Finnish survey reports results for 12 Finnish companies, 8 of which worked exclusively in the software field.

References

Franch M, Mich L, Osti L (2000) Online research as decision tool for marketing and management strategies. In: Gan R (ed) Proceedings of the Information Technology for Business Management – ITBM2000, 16th IFIP WCC, Beijing, China, 21–25 August 2000, pp 737–743
D’Elia M (2000) On-line market research: an application to the software domain (in Italian). Degree Thesis, University of Trento
Loucopoulos P, Karakostas V (1995) System requirements engineering. McGraw-Hill
Chiocchetti N, Mich L (2000) The market for object-oriented CASE tools (in Italian). Tech Report, Department of Computer and Management Sciences, University of Trento
Burg JFM (1997) Linguistic instruments in requirements engineering. IOS, Amsterdam
Ryan K (1992) The role of natural language in requirements engineering. IEEE, pp 240–242
Chen PP-S (1983) English sentence structure and entity-relationship diagrams. Inf Sci 29:127–149
Ambriola V, Gervasi V (1999) An environment for cooperative construction of natural-language requirements bases. In: Proceedings of the 8th ICRE. IEEE Computer Society Press, pp 124–130
Juristo N, Moreno AM, López M (2000) How to use linguistic instruments for OO analysis. IEEE Softw 17(3):80–89
Fuchs NE, Schwitter R (1996) Attempto Controlled English. In: CLAW ’96, 1st international workshop on controlled language applications, Katholieke Universiteit, Leuven, Belgium
Delisle S, Barker K, Biskri I (1999) Object-oriented analysis: getting help from robust computational linguistic tools. In: Friedl G, Mayr HC (eds) Proceedings of the 4th International Conference on NLDB ’99, Klagenfurt, Austria, 17–19 June 1999: Application of natural language to information systems (OCG Schriftenreihe 129), pp 167–172
Mich L, Garigliano R (2000) Ambiguity measures in requirements engineering. In: Proceedings of ICS 2000 16th IFIP WCC, Beijing, China, 21–25 August 2000, pp 39–48
Davis AM (1998) The harmony in rechoirments. IEEE Softw, March/April:6–8
Di Nitto E, Fuggetta A (1995) Change vs consolidation: a challenge for SW development organisations. Riv Inf AICA 25(4):267–279
Mylopoulos J (1998) Information modeling in the time of the revolution. Inf Syst 23(3–4):127–156
Rugg G, Hooper S (1999) Knowing the unknowable: the causes and nature of changing requirements. In: Eder J, Maiden N, Missikoff M (eds) Proceedings of the 1st International Workshop EMRPS ’99, Venice, 25–27 September 1999, pp 183–192
AAA Message Understanding Conference (1991, 1992, 1993, 1995, 1998) Proceedings MUC-3, MUC-4, MUC-5, MUC-6, MUC-7. Morgan Kaufmann. http://www.itl.nist.gov/iaui/894.02/related_projects/muc/index.html
Fabbrini F, Fusani M, Gervasi V, Gnesi S, Ruggieri S (1998) Achieving quality in natural language requirements. In: Proceedings of International SW Quality Week, San Francisco, CA, May 1998
Laitenberger O, Atkinson C, Schlich M, El Emam K (2000) An experimental comparison of reading techniques for defect detection in UML design documents. J Syst Softw 53:183–204
Canzano G (1999) Natural language processing in market research: automatic analysis of replies to open-ended questions (in Italian). Degree Thesis, University of Trento
Mich L (1996) NL-OOPS: from natural language to OO requirements using the natural language processing system LOLITA. J Nat Language Eng 2(2):161–187
Mich L, Garigliano R (1999) The NL-OOPS project: OO modelling using the NLPS LOLITA. In: Friedl G, Mayr HC (eds) Proceedings of the 4th International Conference on NLDB ’99, Klagenfurt, Austria, 17–19 June 1999: Application of natural language to information systems (OCG Schriftenreihe 129), pp 215–218
Nikula U, Sajaniemi J, Kaelviaeinen H (2000) A state-of-the-practice survey on requirements engineering in small- and medium-sized enterprises. Research Report 1, Lappeenranta University of Technology
Zvegintzov N (1998) Frequently begged questions and how to answer them. IEEE Softw 15(2):93–96
Glass R, Howard A (1998) Software development state-of-the-practice. Managing Syst Dev June:7–8
Comley P (1996) The use of the Internet as a data collection method. SGA Market Research, 1996
Wheelwright SC, Makridakis S (1985) Forecasting methods. Wiley, New York
Greenacre MJ (1984) Theory and applications of correspondence analysis. Academic Press, New York
Dutta S, Lee M, Van Wassenhove L (1999) Software engineering in Europe: a study of best practices. IEEE Softw 16(3):82–90
ESI (1996) ESPITI, European user survey analysis. European Software Institute, Spain, Nov
ESI (1998) System engineering in Europe. Survey: summary of results. European Software Institute, Spain, Aug
van Genuchten M (1991) Why is software late? An empirical study of reasons for delay in software development. IEEE Trans SWE 17(6):582–590
Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 6(2):559–572
Melchisedech R (1998) Investigation of requirements documents written in natural language. Requirements Eng 3:91–97
ESI (1997) Software best practice questionnaire, analysis of results. European Software Institute, Spain, Dec

Author information

Authors and Affiliations

Department of Computer and Telecommunication Technology, University of Trento, Via Sommarive 14, 38050, Trento, Italy
Mich Luisa
Department of Computer and Management Sciences, University of Trento, Via Inama 5, 38100, Trento, Italy
Franch Mariangela & Novi Inverardi Pierluigi

Corresponding author

Correspondence to Mich Luisa.

Additional information

An erratum to this article can be found at http://dx.doi.org/10.1007/s00766-004-0195-3

Appendices

Appendix A: Questionnaire for a new CASE tool

Appendix B: Online material

(http://online.cs.unitn.it/)

Questionnaire (html form)
Contacted newsgroups list
E-mail messages
Correspondence analysis

About this article

Cite this article

Mich, L., Franch, M. & Novi Inverardi, P. Market research for requirements analysis using linguistic tools. Requirements Eng 9, 40–56 (2004). https://doi.org/10.1007/s00766-003-0179-8

Received: 15 October 2001
Accepted: 16 July 2003
Published: 30 October 2003
Issue Date: February 2004
DOI: https://doi.org/10.1007/s00766-003-0179-8
