
1 Introduction

Math-aware search engines supporting keyword and formula search have been around since at least 2003, when the Digital Library of Mathematical Functions (DLMF; footnote 1) supported LaTeX in queries [13]. The new information that sophisticated math retrieval would provide, such as more easily locating definitions of symbols and other notations, finding usage, proofs, and mathematical properties across disciplines, and compiling information on applications (e.g., variations of the log loss for machine learning), has stimulated work in math-aware search, alongside parallel developments in math question answering within the Natural Language Processing community [2]. To realize their full potential, math-aware search interfaces must be engaging and easy to use for different levels of expertise, and particularly for non-experts (e.g., students in middle school).

2 Interface Design Elements

Formula Entry. Let’s first consider the problem of creating formulas. While formulas such as ‘\(y = x^2 + 1\)’ can be easily written in LaTeX, others such as:

$$\begin{aligned} \boldsymbol{\nabla} \times \boldsymbol{F} = \left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} \right) \mathbf{i} + \left( \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x} \right) \mathbf{j} + \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \mathbf{k} \end{aligned} \tag{1}$$

are large, complex, and contain symbols that many non-experts cannot name, let alone express in a query. Despite this, most math-aware search engines are restricted to two forms of input: (1) LaTeX (or MathML) entry in text boxes, and (2) visual template editors similar to the Microsoft Equation Editor [11, 13]. Many users find template editors confining, and so the text box approach is the most common, often in combination with a palette used to insert symbols and structures in the entry box. Text input is used by most online math-aware systems, including DLMF [8], WebMIAS [10], Math WebSearch [3], Wolfram Alpha, SymboLab, SearchOnMath, and the (now-defunct) Springer LaTeX Search.

Two challenges for text-based input are that (1) most users are unfamiliar with LaTeX (even fewer know MathML), and (2) rendered formulas are shown separately from the input, making entry errors difficult for users to locate [14]. Appealing solutions to these issues are handwritten formula input, formula image upload, and supporting the analogy of physically moving symbols around on a page [15]. These are key design elements in the MathSeer interface. In one study, a majority of the undergraduate participants reported preferring drawing over typing formulas when given a choice between the two [12]. They also expressed formulas with handwriting that they could not enter using a keyboard (e.g., \(\binom{4}{2}\)).

To address these issues, our MathSeer search interface (see Fig. 1) allows formula input using a combination of typing LaTeX, uploading formula images, and drawing formulas by hand (footnote 2). In MathSeer, handwritten symbols are recognized each time a user stops drawing for a short time. Pressing a button recognizes formula structure and copies the LaTeX result into the panel at the bottom left of the interface. The LaTeX can then be edited, with a rendering of the formula updated in real time (e.g., to quickly change ‘p’ to ‘P’). At bottom right, palettes containing symbols and structures may be used to insert the corresponding LaTeX at the cursor position in the LaTeX panel.
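The pause-triggered recognition described above can be illustrated with a small sketch. This is not MathSeer's implementation: the `PauseTrigger` class, its method names, and the 0.8 second threshold are assumptions made for illustration, and timestamps are passed in explicitly so the logic can be followed without a real timer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


@dataclass
class PauseTrigger:
    """Collects pen strokes and releases them once the user has paused.

    Hypothetical helper: the name and the default threshold are assumptions.
    """
    pause_seconds: float = 0.8
    pending: List[Stroke] = field(default_factory=list)
    last_stroke_end: float = 0.0

    def add_stroke(self, stroke: Stroke, end_time: float) -> None:
        # Each finished stroke restarts the idle clock.
        self.pending.append(stroke)
        self.last_stroke_end = end_time

    def poll(self, now: float) -> List[Stroke]:
        """Return the strokes to send to a symbol recognizer, or [] if the
        user is still drawing (or nothing is waiting)."""
        if self.pending and now - self.last_stroke_end >= self.pause_seconds:
            batch, self.pending = self.pending, []
            return batch
        return []
```

Batching all strokes drawn since the last pause lets multi-stroke symbols (e.g., ‘x’ or ‘=’) reach the recognizer together, rather than one stroke at a time.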

Images may be dragged-and-dropped on the canvas or uploaded using a button that presents a file navigation pop-up window. This produces a formula ‘chip’ on the canvas, which can be used directly in a query, edited, or used in constructing other formulas. A line-of-sight graph-based parsing technique is used to recognize formula images and handwritten formulas [4].

Users can freely alternate between drawing and manipulating symbols on the canvas, uploading images, and editing the LaTeX panel contents. Robust undo/redo operations are provided to easily reverse operations. Formulas in the query bar can be chosen for editing by clicking on them, allowing for quick switching between formulas. Mansouri et al. found that users search for math with keywords or in the context of a question [6]. To let users add this kind of information to a query, MathSeer also supports keywords in search queries (see Fig. 1).

Fig. 1. MathSeer interface. Query formulas and keywords are ‘chips’ at top left; keywords are entered using the box at top right. Formulas are created by manipulating symbols on the canvas, uploading formula images, and editing LaTeX in the panel at bottom left. At bottom right is a panel for ‘favorite’ formulas (two are shown here), the formula history, and palettes for symbols and structures to insert in the LaTeX panel.

Formula Containers and Reuse (‘Chips’). Handwritten formula entry is convenient for small expressions, but for large expressions such as Eq. 1 handwriting is slow [12], and accurate recognition is challenging [5]. Users may also want to avoid re-entering formulas, and to share formulas with others [12].

MathSeer introduces a new model for formula reuse, flexible containers that we call formula ‘chips’. Figure 1 shows a chip in the query bar, and there are two ‘favorite’ chips in a list at bottom-right. Chips can be created and used in a number of ways. In addition to the formula creation operations described above, chips can be created by selecting symbols on the canvas and ‘popping’ them up into a chip. All formula chips have their creation history automatically recorded, and are stored in a ‘history’ menu in the symbol palette panel. On the canvas, chips may be easily moved, resized, and ‘pushed’ onto the canvas (i.e., the symbols on the chip are added to the canvas, and the chip disappears).

Chips have two possible states: ‘recognized’ chips containing a LaTeX string, and ‘template’ chips representing only symbols on a canvas. Chips that are ‘favorites’ are shown using an orange border, and are either a recognized or template chip. As an example use for template (grey) chips, ‘\(\int dx\)’ with a large space in the middle can be used as a template to quickly create other formulas with an integral, by dragging and dropping the chip from the favorites or history tab in the palette panel to the canvas. Recognized (blue) chips in the history and favorites tabs in the palette panel can also be used like palette buttons: clicking on them inserts their LaTeX interpretation in the LaTeX panel, making it easy to reuse and insert large formulas. Chips may also be exported as images with metadata containing all chip data, allowing chip images to be later reused in MathSeer (e.g., using drag-and-drop) or shared with others (e.g., over email). Using chips for formula containers was inspired by the Approach0 interface (footnote 3).
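To make the chip model concrete, the following sketch shows one way the two states, the ‘favorite’ flag, and the ‘pop’ and ‘push’ operations could be represented. All class, field, and function names here are hypothetical; only the states and colors are taken from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ChipState(Enum):
    RECOGNIZED = "recognized"  # chip carries a LaTeX string
    TEMPLATE = "template"      # chip carries only symbols placed on a canvas


@dataclass
class Symbol:
    label: str
    x: float
    y: float


@dataclass
class Chip:
    # Hypothetical structure; field names are assumptions for illustration.
    symbols: List[Symbol] = field(default_factory=list)
    latex: Optional[str] = None
    favorite: bool = False
    history: List[str] = field(default_factory=list)

    @property
    def state(self) -> ChipState:
        return ChipState.RECOGNIZED if self.latex is not None else ChipState.TEMPLATE

    @property
    def border_color(self) -> str:
        # Favorites are orange; otherwise blue (recognized) or grey (template).
        if self.favorite:
            return "orange"
        return "blue" if self.state is ChipState.RECOGNIZED else "grey"


def pop_chip(canvas: List[Symbol], selected: List[Symbol]) -> Chip:
    """Lift the selected symbols off the canvas into a new template chip."""
    for sym in selected:
        canvas.remove(sym)
    return Chip(symbols=list(selected), history=["popped from canvas"])


def push_chip(canvas: List[Symbol], chip: Chip) -> None:
    """Place the chip's symbols back on the canvas; the caller discards the chip."""
    canvas.extend(chip.symbols)
    chip.history.append("pushed onto canvas")
```

Deriving the state from whether a LaTeX string is present, rather than storing it separately, keeps a chip from claiming to be ‘recognized’ without an interpretation attached.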

MathSeer records the entire editing session, including all formula chips, under an automatically generated URL that users can revisit later. The idea to use a URL to record editing state came from discussions with the creators of 2dsearch [9].
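As a rough illustration of recording editing state in a URL, the sketch below packs a session into the URL fragment and restores it again. This is one possible design, not a description of how MathSeer generates its URLs (it may equally store sessions on a server under a generated identifier). The base address, the `s=` fragment key, and the layout of the session dictionary are placeholders.

```python
import base64
import json
import zlib
from urllib.parse import urlsplit

BASE_URL = "https://example.org/editor"  # placeholder address, not MathSeer's


def session_to_url(session: dict) -> str:
    """Serialize a session to compact JSON, compress it, and place it in
    the URL fragment as URL-safe base64."""
    raw = json.dumps(session, sort_keys=True, separators=(",", ":")).encode("utf-8")
    token = base64.urlsafe_b64encode(zlib.compress(raw)).decode("ascii")
    return f"{BASE_URL}#s={token}"


def url_to_session(url: str) -> dict:
    """Invert session_to_url; raises ValueError if no state is present."""
    fragment = urlsplit(url).fragment
    if not fragment.startswith("s="):
        raise ValueError("URL carries no session state")
    token = fragment[len("s="):]
    return json.loads(zlib.decompress(base64.urlsafe_b64decode(token)))
```

For example, `url_to_session(session_to_url({"chips": [{"latex": "x^2"}], "keywords": ["parabola"]}))` returns the original dictionary. Embedding state in the fragment keeps the link self-contained, at the cost of long URLs for large sessions.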

Math Entity Cards. To support formula autocompletion using online data (e.g., from Wikidata), we use a new type of entity card that provides concept names and descriptions for formulas. We use these to provide names and descriptions for individual symbols and formulas in real-time as they are entered [1]. Formula search over the card collection is done using Tangent-CFT embedding vectors [7]. In addition, we will soon allow formulas to be quickly found by searching concept names on cards (e.g., typing ‘Pyt’ brings up the card and formula for the Pythagorean Theorem). Further, we plan to allow users to create their own entity cards for formula chips. An illustration of math entity cards is shown in Fig. 2. This view is expanded to show the full cards; in the unexpanded view only the formulas and titles are visible.
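The two card lookups mentioned above, similarity search over formula embeddings and lookup by concept-name prefix, can be sketched as follows. The three-dimensional vectors are hand-made stand-ins, not Tangent-CFT output, and `EntityCard`, `similar_cards`, and `cards_by_prefix` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EntityCard:
    title: str
    latex: str
    description: str
    vector: Sequence[float]  # toy stand-in for a formula embedding


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_cards(query_vec: Sequence[float], cards: List[EntityCard], k: int = 3) -> List[EntityCard]:
    """Rank cards by cosine similarity between the query and card vectors."""
    return sorted(cards, key=lambda c: cosine(query_vec, c.vector), reverse=True)[:k]


def cards_by_prefix(prefix: str, cards: List[EntityCard]) -> List[EntityCard]:
    """Case-insensitive match on the start of the concept name."""
    p = prefix.casefold()
    return [c for c in cards if c.title.casefold().startswith(p)]


# Illustrative collection; the vectors are invented for this example.
CARDS = [
    EntityCard("Pythagorean theorem", "a^2 + b^2 = c^2",
               "Relates the side lengths of a right triangle.", (0.9, 0.1, 0.0)),
    EntityCard("Mass-energy equivalence", "E = mc^2",
               "Relates rest mass to energy.", (0.1, 0.9, 0.1)),
    EntityCard("Quadratic formula", "x = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}",
               "Gives the roots of a quadratic equation.", (0.6, 0.2, 0.7)),
]
```

With this toy data, `cards_by_prefix("Pyt", CARDS)` returns only the Pythagorean theorem card, and a query vector close to `(1, 0, 0)` ranks that same card first in `similar_cards`.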

Fig. 2. Expanded auto-complete results displaying entity cards with similar formulas.

3 Conclusion and Future Work

The MathSeer interface addresses limitations of the standard text box + symbol palette formula entry technique common in math-aware search interfaces. MathSeer’s interface supports multimodal formula editing through handwritten, LaTeX, and image input. We have introduced formula chips, a new container to support storage, reuse, editing, and sharing of formulas. The chip creation history and favorites list support quick query reformulation and reuse. In future work, we are considering manual editing operations that let users define spatial relationships between symbols and/or sub-expressions directly, so that complex formulas need not be recognized automatically.