Rethinking ‘Advanced Search’: A New Approach to Complex Query Formulation

Russell-Rose, Tony; Chamberlain, Jon; Kruschwitz, Udo

doi:10.1007/978-3-030-15719-7_31

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11438)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Knowledge workers such as patent agents, recruiters and media monitoring professionals undertake work tasks where search forms a core part of their duties. In these instances, the search task often involves the formulation of complex queries expressed as Boolean strings. However, creating effective Boolean queries remains an ongoing challenge, often compromised by errors and inefficiencies. In this demo paper, we present a new approach to query formulation in which concepts are expressed on a two-dimensional canvas and relationships are articulated using direct manipulation. This has the potential to eliminate many sources of error, makes the query semantics more transparent, and offers new opportunities for query refinement and optimisation.

1 Introduction

Many knowledge workers rely on the effective use of search applications in the course of their professional duties [6]. Patent agents, for example, depend on accurate prior art search as the foundation of their due diligence process [10]. Similarly, recruitment professionals rely on Boolean search as the basis of the candidate sourcing process [8], and media monitoring professionals routinely manage thousands of Boolean expressions on behalf of their client briefs [12].

The traditional solution is to formulate complex Boolean expressions consisting of keywords, operators and search commands, such as that shown in Fig. 1. However, the practice of using Boolean strings to articulate complex information needs suffers from a number of fundamental shortcomings [9]. First, it is poor at communicating structure: without some sort of physical cue such as indentation, parentheses and other delimiters can become lost among other alphanumeric characters. Second, it scales poorly: as queries grow in size, readability becomes progressively degraded. Third, it is error-prone: even if syntax checking is provided, it is still possible to place parentheses incorrectly, changing the semantics of the whole expression.
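The third shortcoming can be illustrated with a small sketch. The document collection, the terms and the two queries below are invented for illustration and do not come from Fig. 1. Both queries are syntactically valid and differ only in where the parentheses fall, yet they return different result sets.

```python
# Toy collection: each document is reduced to the set of terms it contains.
# The documents and terms are hypothetical, chosen only to show the effect.
docs = {
    "d1": {"java", "developer"},
    "d2": {"python", "developer"},
    "d3": {"java"},
}

# Intended query: (java OR python) AND developer
def intended(terms):
    return ("java" in terms or "python" in terms) and "developer" in terms

# Misplaced parentheses: java OR (python AND developer)
def misplaced(terms):
    return "java" in terms or ("python" in terms and "developer" in terms)

hits_intended = sorted(d for d, terms in docs.items() if intended(terms))
hits_misplaced = sorted(d for d, terms in docs.items() if misplaced(terms))

print(hits_intended)   # d1 and d2
print(hits_misplaced)  # d1, d2 and d3: d3 matches although it lacks "developer"
```

A syntax checker would accept both strings, which is why this kind of error tends to go unnoticed in a long expression.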

To mitigate these issues, many professionals rely on previous examples of best practice. Recruitment professionals, for example, draw on repositories such as the Boolean Search Strings Repository^{Footnote 1} and the Boolean String Bank^{Footnote 2}. However, these repositories store content as unstructured text strings, and as such their true value as a source of experimentation and learning may never be fully realized.^{Footnote 3}

2dSearch^{Footnote 4} offers an alternative approach. Instead of formulating Boolean strings, queries are expressed by combining objects on a two-dimensional canvas and relationships are articulated using direct manipulation. This eliminates many sources of syntactic error, makes the query semantics more transparent, and offers further opportunities for query refinement and optimisation.

2 Related Work

The application of data visualisation to search query formulation can offer significant benefits, such as fewer zero-hit queries, improved query comprehension and better support for exploration of an unfamiliar database [3]. An early example is that of Anick et al. [1], who developed a two-dimensional graphical representation of a user’s natural language query that supported reformulation via direct manipulation. Fishkin and Stone [2] investigated the application of direct manipulation techniques to database query formulation, using a system of ‘lenses’ to refine and filter the data. Jones [4] developed a query interface to the New Zealand Digital Library which uses Venn diagrams and integrated query result previews.

Yi et al. [13] applied a ‘dust and magnet’ metaphor to multivariate data visualization. Nitsche and Nürnberger [5] developed a system based on a radial user interface that supports phrasing and interactive visual refinement of vague queries. Another example is Boolify^{Footnote 5}, which provides a drag and drop interface to Google. More recently, de Vries et al. [11] developed a system which utilizes a visual canvas and elementary building blocks to allow users to graphically configure a search engine. 2dSearch differs from the prior art in offering a database-agnostic approach with automated query suggestions and support for optimising, sharing and re-using query templates and best practices.

3 Design Concept

At the heart of 2dSearch is a graphical editor which allows the user to formulate queries as objects on a two-dimensional canvas. Concepts can be simple keywords or attribute:value pairs representing controlled vocabulary terms or database-specific search operators. Concepts can be combined using Boolean (and other) operators to form higher-level groups and then iteratively nested to create expressions of arbitrary complexity. Groups can be expanded or collapsed on demand to facilitate transparency and readability.
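One way to picture this model is as a tree in which leaves are concepts and internal nodes are operator groups. The following is a minimal sketch under that assumption; the class names `Concept` and `Group`, their fields and the `render` method are hypothetical and are not taken from the 2dSearch implementation.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Concept:
    """A leaf: a plain keyword, or an attribute:value pair if attribute is set."""
    text: str
    attribute: str = ""

    def render(self) -> str:
        # Quote multi-word phrases so they survive as a single term.
        value = f'"{self.text}"' if " " in self.text else self.text
        return f"{self.attribute}:{value}" if self.attribute else value


@dataclass
class Group:
    """An internal node: children combined with one Boolean operator."""
    operator: str  # "AND" or "OR" in this sketch
    children: List[Union["Concept", "Group"]] = field(default_factory=list)

    def render(self) -> str:
        # Nesting in the tree becomes parentheses in the string.
        inner = f" {self.operator} ".join(c.render() for c in self.children)
        return f"({inner})"


query = Group("AND", [
    Group("OR", [Concept("project manager"), Concept("PM")]),
    Concept("Dublin"),
    Concept("linkedin.com", attribute="site"),
])

print(query.render())
# (("project manager" OR PM) AND Dublin AND site:linkedin.com)
```

Because the parentheses are derived from the tree rather than typed by hand, an unbalanced or misplaced bracket cannot arise in the rendered string.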

The application consists of two panes (see Fig. 2): a query canvas and a search results pane (which can be resized or detached in a separate window). The canvas can be resized or zoomed, and features an ‘overview’ widget to allow users to navigate to elements that may be outside the current viewport. Adopting design cues from Google’s Material Design language^{Footnote 6}, a sliding menu is offered on the left, providing file I/O and other options. This is complemented by a navigation bar which provides support for document-level functions such as naming and sharing queries.

Although 2dSearch supports creation of complex queries from a blank canvas, its value is most readily understood by reference to an example such as that of Fig. 1, which is intended to find social profiles for data migration project managers located in Dublin. Although relatively simple, this query is still difficult to interpret, optimise or debug. However, when opened with 2dSearch, it becomes apparent that the overall expression consists of a conjunction of OR clauses (nested blocks) with a number of specialist search operators (dark blue) and negated terms (white on black). To edit the expression, the user can move terms using direct manipulation or create new groups by combining terms. They can also cut, copy, delete, and lasso multiple objects. If they want to understand the effect of one group in isolation, they can execute it individually. Conversely, if they want to remove one element from consideration, they can disable it. In each case, the effects of each operation are displayed in real time in the adjacent search results pane.
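Executing a group in isolation and disabling an element can both be read as operations on a query tree. The sketch below assumes such a tree; the `Node` class, its `enabled` flag and the `render` function are illustrative names, not the tool's actual data model. A node without children is treated as a keyword.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str                                   # keyword, or operator for a group
    children: List["Node"] = field(default_factory=list)
    enabled: bool = True                         # disabled nodes are skipped


def render(node: Node) -> str:
    """Render a node and its enabled descendants as a Boolean string."""
    if not node.children:
        return node.label
    parts = [render(c) for c in node.children if c.enabled]
    parts = [p for p in parts if p]              # drop groups that rendered empty
    if not parts:
        return ""
    if len(parts) == 1:
        return parts[0]                          # no brackets around a single item
    return "(" + f" {node.label} ".join(parts) + ")"


roles = Node("OR", [Node("manager"), Node("lead")])
location = Node("Dublin")
query = Node("AND", [roles, location])

full = render(query)           # the whole expression
isolated = render(roles)       # one group executed on its own

location.enabled = False       # remove one element from consideration
without_location = render(query)

print(full)
print(isolated)
print(without_location)
```

In this sketch, running a group in isolation is simply rendering its subtree, and disabling an element leaves it on the canvas while excluding it from the rendered string.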

2dSearch functions as a meta-search engine, so it is in principle agnostic of any particular search technology or platform. In practice, however, to execute a given query, the semantics of the canvas content must be mapped to the API of the underlying database. This is achieved via an abstraction layer or set of ‘adapters’ for common search platforms such as Bing, Google, PubMed, Google Scholar, etc. These are user selectable via a drop-down control.
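The adapter idea can be sketched as a common interface with one implementation per platform. Everything below is an assumption made for illustration: the class names, the `build_url` method and the registry are hypothetical, and the adapters simply build public search-page URLs rather than calling whatever APIs 2dSearch actually uses.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlencode


class SearchAdapter(ABC):
    """Maps a rendered query string to a request for one platform."""

    @abstractmethod
    def build_url(self, query: str) -> str:
        ...


class GoogleAdapter(SearchAdapter):
    def build_url(self, query: str) -> str:
        return "https://www.google.com/search?" + urlencode({"q": query})


class BingAdapter(SearchAdapter):
    def build_url(self, query: str) -> str:
        return "https://www.bing.com/search?" + urlencode({"q": query})


class PubMedAdapter(SearchAdapter):
    def build_url(self, query: str) -> str:
        return "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": query})


# Stand-in for the drop-down control: the user picks an adapter by name.
ADAPTERS = {
    "Google": GoogleAdapter(),
    "Bing": BingAdapter(),
    "PubMed": PubMedAdapter(),
}

url = ADAPTERS["Google"].build_url("data migration")
print(url)
```

The canvas content stays the same whichever adapter is selected; only the final mapping step changes.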

Support for query optimisation is provided via a ‘Messages’ tab on the results pane. For example, if the user tries to execute via Bing a query string containing operators specific to Google, an alert is shown listing the unknown operators. 2dSearch also identifies redundant structure (e.g. spurious brackets or duplicate elements) and supports comparison of canonical representations. Query suggestions are provided via an NLP services API which utilises various Python libraries (for word embedding, keyword extraction, etc.) and SPARQL endpoints (for linked open data ontology lookup) [7].
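Two of these checks can be sketched in a few lines. The platform names and operator tables below are placeholders, not an accurate list of what any real search engine supports, and the tuple-based query representation and function names are assumptions made for this example only.

```python
import re

# Placeholder operator tables; real platforms differ and change over time.
SUPPORTED = {
    "platform_a": {"site", "intitle", "inurl"},
    "platform_b": {"site", "intitle"},
}

OPERATOR_RE = re.compile(r"\b([a-z]+):(?=\S)")


def unknown_operators(query: str, platform: str) -> list:
    """List operators used in the query that the platform table lacks."""
    used = set(OPERATOR_RE.findall(query.lower()))
    return sorted(used - SUPPORTED[platform])


def canonical(node):
    """Normalise a query given as a string or an (operator, children) pair.

    Same-operator nesting is flattened, duplicates are dropped and the
    children are sorted, so equivalent expressions compare equal.
    """
    if isinstance(node, str):
        return node
    op, children = node
    flat = []
    for child in map(canonical, children):
        if isinstance(child, tuple) and child[0] == op:
            flat.extend(child[1])        # spurious brackets: merge upwards
        else:
            flat.append(child)
    unique = sorted(set(flat), key=repr)  # remove duplicate elements
    if len(unique) == 1:
        return unique[0]
    return (op, tuple(unique))


alerts = unknown_operators('site:example.com inurl:profile "project manager"',
                           "platform_b")
print(alerts)

redundant = ("AND", ["dublin", ("AND", ["manager", "dublin"])])
tidy = ("AND", ["manager", "dublin"])
print(canonical(redundant) == canonical(tidy))
```

The first function corresponds to the alert listing unknown operators; the second shows how two differently written expressions might be compared through a shared canonical form.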

4 Summary and Further Work

2dSearch is a framework for search query formulation in which information needs are expressed by manipulating objects on a two-dimensional canvas. Transforming logical structure into physical structure mitigates many of the shortcomings of Boolean strings. This eliminates syntax errors, makes the query semantics more transparent and offers new ways to optimise, save and share best practices. In due course, we hope to engage in a formal, user-centric evaluation, particularly in relation to traditional query builders. We are currently engaging in an outreach programme and invite subject matter experts to work with us in building repositories of curated (or user generated) examples and templates.

Adopting a database-agnostic approach presents challenges, but it also offers the prospect of a universal framework in which information needs can be articulated in a generic manner and the task of mapping to an underlying database can be delegated to platform-specific adapters. This could have profound implications for the way in which professional search skills are taught, learnt and applied.

Notes

1. https://booleanstrings.ning.com/forum/topics/boolean-search-strings-repository, accessed 10 Oct 2018.
2. https://scoperac.com/booleanstringbank, accessed 10 Oct 2018.
3. http://booleanblackbelt.com/2016/01/the-most-powerful-boolean-search-operator, accessed 10 Oct 2018.
4. https://2dsearch.com, accessed 24 Oct 2018.
5. https://www.kidzsearch.com/boolify/, accessed 23 Oct 2018.
6. https://material.io.

References

1. Anick, P.G., Brennan, J.D., Flynn, R.A., Hanssen, D.R., Alvey, B., Robbins, J.M.: A direct manipulation interface for boolean information retrieval via natural language query. In: Proceedings of the 13th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 1990, pp. 135–150. ACM, New York, NY, USA (1990). https://doi.org/10.1145/96749.98015
2. Fishkin, K., Stone, M.C.: Enhanced Dynamic Queries Via Movable Filters, pp. 415–420. ACM Press, New York (1995)
3. Goldberg, J.H., Gajendar, U.N.: Graphical condition builder for facilitating database queries. U.S. Patent No. 7,383,513. 3 (2008)
4. Jones, S.: Graphical query specification and dynamic result previews for a digital library. In: Proceedings of the 11th Annual ACM Symposium on User Interface Software and Technology, UIST 1998, pp. 143–151. ACM, New York, NY, USA (1998). https://doi.org/10.1145/288392.288595
5. Nitsche, M., Nürnberger, A.: QUEST: querying complex information by direct manipulation. In: Yamamoto, S. (ed.) HIMI 2013. LNCS, vol. 8016, pp. 240–249. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39209-2_28
6. Russell-Rose, T., Chamberlain, J., Azzopardi, L.: Information retrieval in the workplace: a comparison of professional search practices. Inf. Process. Manag. 54(6), 1042–1057 (2018). https://doi.org/10.1016/j.ipm.2018.07.003
7. Russell-Rose, T., Gooch, P.: 2dsearch: a visual approach to search strategy formulation. In: Proceedings of DESIRES: Design of Experimental Search & Information REtrieval Systems. DESIRES 2018 (2018)
8. Russell-Rose, T., Chamberlain, J.: Real-world expertise retrieval: the information seeking behaviour of recruitment professionals. In: Ferro, N., et al. (eds.) ECIR 2016. LNCS, vol. 9626, pp. 669–674. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30671-1_51
9. Russell-Rose, T., Chamberlain, J.: Searching for talent: the information retrieval challenges of recruitment professionals. Bus. Inf. Rev. 33(1), 40–48 (2016)
10. Tait, J.I.: An introduction to professional search. In: Paltoglou, G., Loizides, F., Hansen, P. (eds.) Professional Search in the Modern World. LNCS, vol. 8830, pp. 1–5. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12511-4_1
11. de Vries, A.P., Alink, W., Cornacchia, R.: Search by strategy. In: Proceedings of the Third Workshop on Exploiting Semantic Annotations in Information Retrieval, pp. 27–28. ACM (2010)
12. Pazer, J.W.: The importance of the boolean search query in social media monitoring tools. DragonSearch white paper (2013). https://www.dragon360.com/wp-content/uploads/2013/08/social-media-monitoring-tools-boolean-search-query.pdf. (Accessed 22 Mar 2018)
13. Yi, J.S., Melton, R., Stasko, J., Jacko, J.A.: Dust & magnet: multivariate information visualization using a magnet metaphor. Inf. Vis. 4(4), 239–256 (2005). https://doi.org/10.1057/palgrave.ivs.9500099

Author information

Authors and Affiliations

UXlabs Ltd., Brunel House, 340 Firecrest Ct, Centre Park, Warrington, WA1 1RG, UK
Tony Russell-Rose
School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester, Essex, CO4 3SQ, UK
Jon Chamberlain & Udo Kruschwitz

Corresponding author

Correspondence to Tony Russell-Rose.

Editor information

Editors and Affiliations

University of Strathclyde, Glasgow, UK
Leif Azzopardi
Bauhaus Universität Weimar, Weimar, Germany
Benno Stein
Universität Duisburg-Essen, Duisburg, Germany
Norbert Fuhr
GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
Philipp Mayr
Delft University of Technology, Delft, The Netherlands
Claudia Hauff
University of Twente, Enschede, The Netherlands
Djoerd Hiemstra

About this paper

Cite this paper

Russell-Rose, T., Chamberlain, J., Kruschwitz, U. (2019). Rethinking ‘Advanced Search’: A New Approach to Complex Query Formulation. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds) Advances in Information Retrieval. ECIR 2019. Lecture Notes in Computer Science, vol 11438. Springer, Cham. https://doi.org/10.1007/978-3-030-15719-7_31

DOI: https://doi.org/10.1007/978-3-030-15719-7_31
Published: 07 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15718-0
Online ISBN: 978-3-030-15719-7
eBook Packages: Computer Science, Computer Science (R0)
