Keywords

1 Introduction

Due to the development of semi-high-throughput technologies for generating glycomics data in the glycosciences, such as mass spectrometry (MS), nuclear magnetic resonance (NMR), and glycan arrays, numerous glycomics databases and algorithms have been developed to analyze this data in the past decade. The latter (algorithms), however, have been mainly published in the informatics literature, and few have been implemented as easy-to-use tools for the wet-lab glycobiologist. Therefore, RINGS was developed since 2006 in order to provide a user-friendly interface to use these tools. In the current version of RINGS, the main text representation of glycan structures used is KCF (KEGG Chemical Structure Format) (Akune et al. 2010). As new formats are being developed, utilities to translate between KCF and other formats have also been incorporated into RINGS. In the future, however, RINGS tools will be designed to allow any format to be used as input so that the user need not be concerned with the glycan representation to use.

Fig. 15.1
figure 1

The DrawRINGS input screen, illustrating the menu displayed when using the core structure button

2 DrawRINGS

DrawRINGS is an easy-to-use, web-based drawing tool for searching glycans using KCaM (Aoki et al. 2004). Users can also obtain KCF text for glycans by drawing structures on the canvas.

2.1 DrawRINGS Input

The DrawRINGS interface is a Java applet that runs in web browsers. Because of security risks, many browsers these days do not permit applets to run by default. However, it is still possible to run them with the appropriate browser settings, which are described in the Tips Sect. 15.2.3. The main drawing interface for DrawRINGS is illustrated in Fig. 15.1. There are nine buttons across the top, which can be used for drawing monosaccharides (called nodes), glycosidic bonds (called edges), erasing nodes, making queries, etc. Table 15.1 describes each button and itsfunctionality.

Table 15.1 Each button in DrawRINGS allows users to perform the following tasks

2.1.1 Drawing/Inputting Structures onto the Canvas

To draw on the canvas, first use the buttons for either drawing monosaccharides or drawing structures. Then additional monosaccharides and glycosidic bonds can be added as necessary. Adjustments to the topology of the glycan can be made by using the select button and dragging the necessary components around on the canvas. Selected components can also be copied using Ctrl-C (or Command-C on Mac OS) and pasted using Ctrl-V (or Command-V on Mac OS).

A file containing a glycan in KCF format can also be loaded into DrawRINGS by selecting it using the Browse… button located below the canvas. Once selected, the structure can be drawn on the canvas by clicking the Upload button.

2.1.2 Obtaining KCF

Once a structure is drawn on the canvas, the KCF-formatted text representing it can be obtained by clicking the Output KCF button. Figure 15.2 illustrates how DrawRINGS looks when the KCF text is shown in the text area on the right-hand side of the canvas. This text area can be edited manually to modify the glycan displayed as well. The updated glycan figure can then be obtained by clicking the Draw KCF button, which will update the canvas with the glycan represented by the KCF text.

Fig. 15.2
figure 2

The DrawRINGS screen after KCF has been outputted. The text area containing the KCF data can be edited manually as needed. The updated glycan image can then be obtained by clicking the Draw KCF button, which will update the canvas with the glycan represented by the KCF text

2.1.3 Running Queries

The RINGS database stores all glycan structures derived from KEGG GLYCAN (Aoki-Kinoshita and Kanehisa 2015) and GlycomeDB (Ranzinger et al. 2011), in addition to several structures that were manually curated from the literature, up to 2009. The structures in these data can be queried using the Query button in DrawRINGS.

The algorithm behind this query tool is KCaM (Aoki et al. 2004), which implements an efficient tree structure matching algorithm. Three options are provided by this query tool: (1) score matrix, (2) database, and (3) score type. A glycan score matrix can be analogized to BLOSUM or PAM for protein sequence alignment, where particular glycosidic linkages and monosaccharides that should be considered similar to one another, in particular within a certain glycan class, can be scored more highly to obtain higher-scoring glycans within the same class. Predefined score matrices are available, including N-glycans, O-glycans, and sphingolipids. A fourth matrix, called Link_similarity, is a hand-generated score matrix based on expert knowledge of glycosidic linkages and monosaccharides that may be substituted more frequently with other glycosidic linkages and monosaccharides, respectively. For example, β1-3 may be considered more similar to β1-4 than α1-2.

For the target database, users may choose between querying either the data RINGS, which includes KEGG GLYCAN and manually curated structures, or GlycomeDB, or both of these databases. Because the latter contains almost 40,000 structures, it takes a few minutes longer to query it.

Finally, the score type allows users to specify the scoring system to use, and the results are always displayed in descending order of score. The Similarity option gives a maximum score of 100, whereas the Matched option sums the score of matching monosaccharides and linkages, where each matching glycosidic linkage (and its monosaccharide) receives a score of 100. In fact, the Similarity option actually divides the Matched score by the larger of the two structures. Because each glycosidic linkage can receive a maximum score of 100, the maximum possible score would be 100 times the maximum number of linkages. For example, if a glycan having nine glycosidic linkages is compared against one of four, the maximum possible score is 100 times the larger of the structures (in this case, 100 × 9 = 900). Thus, if the number of matching linkages (and their monosaccharides) is, say, five, then the Similarity score between the two structures would be 56 %( = 500 ÷ 900).

2.2 DrawRINGS Query Output

As a result of running a query, a new browser window listing the matching results in order of score will be displayed, as in Fig. 15.3. This figure is a screenshot of a query using the structure drawn in Fig. 15.2 and using the Link_Similarity score matrix on the GlycomeDB data. The results list the structures in order of similarity, where it is apparent that the most similar structures are displayed at the top. Each result is linked to the Glycan Entry page, shown in Fig. 15.4, which displays the various text representations of the structure. This page also provides a link to the KEGG GLYCAN database and reaction information about the glycan if available.

Fig. 15.3
figure 3

The DrawRINGS result screen using the structure drawn in Fig. 15.2 and using the Link_Similarity score matrix on the GlycomeDB data. The results list the structures in order of similarity, where it is apparent that the most similar structures are displayed at the top. Each result is linked to the Glycan Entry page of RINGS (Fig. 15.4), which displays detailed information about the structure

Fig. 15.4
figure 4

A Glycan Entry page for the fourth matching structure in the DrawRINGS query results shown in Fig. 15.3. The various text representations of this structure are listed, along with a link to the original KEGG GLYCAN database

2.3 DrawRINGS Tips

The following are some tips regarding using DrawRINGS:

  • Java needs to be enabled in the browser with appropriate security settings (usually at low levels).

  • Google Chrome may not work, but DrawRINGS should work on Safari, Firefox, and Internet Explorer.

  • Set the URLs http://www.rings.t.soka.ac.jp and http://rings.t.soka.ac.jp in the list of allowed sites in Java settings.

  • At the time of this writing, selecting the text area and using Ctrl-C (or Command-C on Macs) to copy the KCF are disabled on many browsers. Thus, in order to download the KCF text, it is recommended that DrawRINGS be used as a registered user having logged in and then a query be made on the RINGS database, in order to store the KCF in the user’s data library. The User Management system is described later in Sect. 15.9 and instructions on running queries are described in the next subsection.

3 ProfilePSTMM

The ProfilePSTMM Tool was built based on the probabilistic model proposed called PSTMM (probabilistic sibling-dependent tree Markov model) (Ueda et al. 2004), which was then modified such that profiles could be directly output from the model (Aoki-Kinoshita et al. 2006). In simple terms, this model learns patterns from the input data in a process called training. Once a model is trained on a particular data set, it can essentially be used to test other glycans to determine how well it matches the pattern that has been learned by the model. In other words, during the training process, ProfilePSTMM attempts to find patterns inherent within a given glycan data set containing many disparate glycan structures. These patterns are then visualized in the results as distributions of monosaccharides at particular positions of a branched topology. Once a pattern has been learned, it can potentially be used to assess, or test, whether a given glycan structure that was not in the original data set should belong to the set or not. That is, it assesses the glycan to see how well it matches the pattern learned.

As for the ProfilePSTMM Tool, it performs the training process and simply outputs the trained pattern that was learned from the data. The model is currently not stored for later testing on other data. However, such advanced analyses are being planned at the time of this writing.

The ProfilePSTMM model was originally developed to aid in the understanding of glycan patterns being recognized by glycan-binding agents such as proteins, viruses, and antibodies, whose data can be obtained from glycan array databases. It uses a probabilistic model to overcome the noise that can be found in many of the glycan-binding data available, as well as weak-binding patterns which may skew analytical results. This is in comparison to tree alignment algorithms, which do not allow for such flexibility.

Fig. 15.5
figure 5

A snapshot of the default ProfilePSTMM Tool input page. A list of KEGG GLYCAN IDs or KCF data can be specified

Fig. 15.6
figure 6

Result screen after running ProfilePSTMM

3.1 ProfilePSTMM Input

The ProfilePSTMM Tool takes as input a data set of glycans. The intention of this tool was to allow for the analysis of glycans that exhibit binding to a particular agent. Thus, data such as that from lectin and glycan arrays can be analyzed with this tool. A snapshot of the input screen is given in Fig. 15.5. A list of KEGG GLYCAN IDs or a concatenated list of KCF data can be specified. The shuffle number refers to the number of times the computation should be repeated to avoid local optima. Because this tool is a probabilistic model, it is suggested that it is run at least five times in order to find the optimal solution. A file can also be specified containing the input data, either as KEGG GLYCAN IDs or KCF data, instead of cutting and pasting into the text area.

3.2 ProfilePSTMM Output

As a result of training on the default glycan structures using shuffle number 5, the resulting glycan profiles will be displayed, as in Fig. 15.6. Looking closely at these results, one can see that there were basically two distinct profiles that were trained based on the data set, one with score 23.363 and one with 23.340. In both profiles, the biantennary pattern learned contained the same distribution of monosaccharides at the terminating ends. Mannose appeared 100 % on one end, and GlcNAc, mannose, glucose, and galactose appeared with a ratio of 37:33:6:17. The single difference between these profiles appears at the neighboring position, where fucose either appears alone (100 %) or may be replaced with a 30 % probability with GlcNAc. The computed score is actually computed based on the likelihood score computed when learning the patterns in the glycan data set. Future work will entail assessing these scores to obtain significant values that can also be displayed to the user.

Note that the shape, or topology, of the profile is currently based on the concept of maximum common subtree of the input data, meaning that the shape of the profile that is output will be based on the smallest glycan structure that exists in the input data set. This causes the learning process to combine monosaccharide distributions into a smaller space, causing the results to be misleading. This is a currently known issue that is being updated and will be made more flexible in future versions, when it will be easier to ascertain the results.

3.3 ProfilePSTMM Tips

  • At the time of this writing, the ProfilePSTMM Tool can be used for glycan data sets that do not contain very small glycan structures, but rather contain glycans that are of comparable size.

  • In order to specify stronger binding glycans compared to others in the data set, weights can be implied by specifying multiple copies of heavily binding glycans in the input.

  • The computation time may increase with larger shuffling number and larger number of glycans in the input.

4 Glycan Miner

The Glycan Miner Tool is based on a mining algorithm called α-closed frequent subtree mining (Hashimoto et al. 2008). This tool takes as input a large data set of glycans and attempts to find significant subtrees among the data. In a sense, it can find unique overly expressed glycan substructures within a set of glycan structures. It is suited for the analysis of glycan profiling data by mass spectrometry or glycan-binding data such as that from glycan arrays.

Fig. 15.7
figure 7

Input screen for Glycan Miner with default values

4.1 Glycan Miner Input

The input screen for the Glycan Miner Tool is shown in Fig. 15.7. The default values allow users to test the tool to see what kind of results can be obtained. The data should be specified in KCF format and can be inputted into the text area or specified with a filename containing the data. There are two options available: alpha and minimum support. These parameters refer to how a candidate subtree (extracted from the data set) should be handled. The support for any candidate subtree refers to the number of glycans in the input data set in which it appears. For example, if a data set of 80 structures, 50 N-glycans and 30 O-glycans, is given, the subtree consisting of the Man 3 GlcNAc 2 core structure of N-glycans would have a support of 50. Candidate subtrees are computed from the input data set; they consist of all possible subtrees among the data.

The second parameter, alpha, essentially specifies the uniqueness of the subtrees to be output. An example would best illustrate this parameter, so two output examples are provided in Sect. 15.4.2. In simple terms, a larger value of alpha, closer to 1, will output a larger range of glycan subtrees, whereas a smaller value, closer to 0, will filter out redundant glycan subtrees.

Fig. 15.8
figure 8

Results of running Glycan Miner on the default data set, using a value of 1 for support and.3 for α

Fig. 15.9
figure 9

Results of running Glycan Miner on the default data set, using a value of 1 for support and.9 for α

4.2 Glycan Miner Output

Figures 15.8 and 15.9 are snapshots of the Glycan Miner results using the same support  = 1 but different values for α. As illustrated, the results of α = . 9 contain seven glycans, whereas those of α = . 3 contain just three. The difference lies in the fact that several of the seven glycans are rather similar to one another. More specifically, their support values are rather close, whereas the three selected using α = . 3 are less similar to one another.

The α-closed frequent mining algorithm compares the support between every glycan subtree pair in which one contains the other. Let us assume that there are two subtrees, the N-glycan core Man 3 GlcNAc 2 which we call largerT and a subtree of this structure Man 2 GlcNAc 2 which we call smallerT. The α-closed frequent mining algorithm will compare each of their respective supports (denoted as supp(largerT) and supp(smallerT)) and will discard the larger if its support does not satisfy the equation supp(largerT) < max(α × supp(smallerT), minsup), where minsup is the minimum support specified by the user. Since it can be assumed that subtrees with larger values of support will be less overlapping, α can be used to make this differentiation.

The Glycan Miner Tool also sorts the resulting subtrees based on p-value, which is generated based on the data set inputted. That is, a random data set of subtrees are generated, with which the support of the subtrees in the results are compared. The smaller the p-value, the fewer the chance that the same results are obtained randomly, and so the subtrees with smaller p-value are listed first.

4.3 Glycan Miner Tips

  • p-values are generated based on the input, so the p-value on smaller data sets will not be very meaningful. Moreover, because it is generated for each run, the order of the results may change, especially for small data sets. Such results would imply that there is no significant difference between the results and that they should all be treated equally.

  • Glycosidic bond information is very specific: unspecified linkage information will not be matched with structures whose linkages are specified.

  • IUPAC-formatted data can be used for Glycan Miner by following the link at the top of the input page for using CFG data.

5 MCAW

MCAW, or multiple carbohydrate alignment with weights, was originally developed as a temporary tool to determine the shape of profiles for ProfilePSTMM. However, it was decided that MCAW could be used on its own to align multiple glycans to find consensus sequence patterns across a set of glycans, so it was made available as a web tool. MCAW generates easy-to-understand glycan profile images as results. Moreover, detailed parameters are available for advanced users who wish to fine-tune the results.

Similar to the ProfilePSTMM Tool, MCAW can be used to find patterns across a set of glycans, such as from glycan-array or lectin-array data. Note, however, that it is not a probabilistic model, meaning that all the input data will be used at face value, meaning that all the data will be used in the results. In contrast, a probabilistic model will account for noise in the data, and so discard data that may be outliers.

Fig. 15.10
figure 10

The MCAW input screen, where multiple KCF data can be specified, and a variety of parameters are available for fine-tuning the results

5.1 MCAW Input

The MCAW input screen is shown in Fig. 15.10. Multiple KCF data can be pasted into the text area, or a file containing such data can be selected. The advanced parameters are described in Table 15.2. The default values can be used normally, but these options allow users to fine-tune the results if specific characteristics of the data inputted are known.

5.2 MCAW Output

The output of MCAW is a figure, as in Fig. 15.11, of the glycans aligned into a profile, but the results in PKCF (profile KCF format) can also be downloaded by the link provided. A legend for the figure is also provided as a link.

In the figure, similar to ProfilePSTMM, distributions of glycans and their linkages are given at various positions of glycans. Highly consistent portions of the alignment will have higher distributions of a particular monosaccharide at those positions. Figure 15.11 illustrates an example. One can see that at the positions numbered 7 through 10, highly conserved patterns of monosaccharides are found. This is a sialyl-Lewis X structure, which is a common motif among glycans in mammals. Moreover, position 18, which is attached to position 7, also contains a 6-sulfate approximately 16 % of the time, indicating that 6-sulfated sialyl-Lewis X structures can also be found in the data.

Table 15.2 The advanced parameters in MCAW
Fig. 15.11
figure 11

A snapshot of the MCAW output screen, where distributions of monosaccharides and their glycosidic linkages are given at various positions

5.2.1 PKCF Format

The output of MCAW can also be downloaded as a PKCF (Profile KCF) file from the result page. The PKCF for the result in Fig. 15.11 is as follows.

ENTRY G04804-G04183-G04845-G05121-G05108-G04329 GlycanProfile

NODE 18

1 1=GlcNAc 2=GlcNAc 3=Glc 4=GalNAc 5=GalNAc 6=GlcNAc 0 0

2 1=GlcNAc 2=GlcNAc 3=- 4=- 5=- 6=Gal -8 6

3 1=Man 2=Man 3=- 4=- 5=- 6=GlcNAc -16 6

4 1=Man 2=Man 3=Gal 4=Gal 5=Gal 6=Gal -24 1

5 1=GlcNAc 2=GlcNAc 3=GlcNAc 4=GlcNAc 5=GlcNAc 6=GlcNAc -32 5

6 1=LFuc 2=LFuc 3=LFuc 4=LFuc 5=LFuc 6=LFuc -40 5

7 1=Gal 2=Gal 3=Gal 4=Gal 5=Gal 6=Gal -40 7

8 1=Neu5Ac 2=Neu5Ac 3=Neu5Ac 4=Neu5Ac 5=Neu5Ac 6=Neu5Ac -48 7

9 1=GlcNAc 2=GlcNAc 3=GlcNAc 4=0 5=0 6=0 -32 -3

10 1=Gal 2=Gal 3=Gal 4=0 5=0 6=0 -40 -2

11 1=Man 2=Man 3=0 4=0 5=0 6=LFuc -24 11

12 1=Neu5Ac 2=Neu5Ac 3=0 4=0 5=0 6=0 -48 -2

13 1=GlcNAc 2=GlcNAc 3=0 4=0 5=0 6=0 -32 11

14 1=Gal 2=Gal 3=0 4=0 5=0 6=0 -40 11

15 1=Neu5Ac 2=Neu5Ac 3=0 4=0 5=0 6=0 -48 11

16 1=0 2=0 3=0 4=Neu5Ac 5=0 6=LFuc -8 -6

17 1=0 2=0 3=LFuc 4=0 5=0 6=0 -40 -4

18 1=0 2=0 3=0 4=0 5=S 6=0 -40 3

EDGE 17

1 2-1:b1 1-1:4

1 2-2:b1 1-2:4

1 2-3 1-3

1 2-4 1-4

1 2-5 1-5

1 2-6:b1 1-6:4

2 3-1:b1 2-1:4

2 3-2:b1 2-2:4

2 3-3 2-3

2 3-4 2-4

2 3-5 2-5

2 3-6:b1 2-6:3

.

.

17 18-3 5-3

17 18-4 5-4

17 18-5 5-5:6

17 18-6 5-6

The general format of PKCF follows that of KCF, in that there are three major sections in order: ENTRY, NODE, and EDGE. The first section lists the glycan names as inputted into the MCAW tool, separated by hyphens, read in from the KCF files. The order of the names indicates the order of the items listed in the NODE and EDGE sections. The NODE and EDGE labels are followed by numbers, indicating the number of positions and connected positions, respectively. Usually, the number of edges is one less than the number of positions. The lines following the NODE heading are the details regarding each position. They are formatted as follows:

<pos#> <glycan#1>=<residue> <glycan#2>=<residue> …<glycan#n>=

<residue> <x-coord> <y-coord>

where pos# indicates the position number within the tree topology and x-coord and y-coord represent their x- and y-coordinates, respectively, as drawn. Thus at each position, the residue names derived from each glycan are listed in order of the glycans listed in the ENTRY line. This is followed by x- and y-coordinates, similar to the KCF format.

In contrast, the EDGE section consists of lines corresponding to each glycosidic bond in each glycan. Thus the number of lines in this section equals the number of connected positions times the number of glycans. In the example above, there are 17 edges in the tree topology and 5 glycans, so there is a total of 85 lines, which have been cut to save space. Each line in this section is formatted as follows:

<link#> <pos#>-<glycan#>:<anomer><carbon#> <pos#>-<glycan#>:

<carbon#>

where link# is a unique number given to each edge. However, if the anomer and/or carbon number information is unavailable, these are left out. If both are unavailable, then the colon “:” is omitted. This is, again, a similar format to the KCF representation, except with additional position information included.

5.3 MCAW Tips

  • The input data should be in KCF format, but with the extra restriction that a name must be included in the ENTRY line and that the name should not have a hyphen in it.

  • MCAW is still in alpha-stage and may not give results for complex data sets, such as those having a large variety of structures. It may work better on data sets with similar glycan structures.

  • Glycans with no glycosidic bonds (i.e., single monosaccharides) cannot be included in the input data.

  • As of May 2016, new functionality has been made available such that weights can be specified with IUPAC-formatted data: follow the link for using CFG array data, available at the top of the input screen of MCAW.

6 Kernel Tool

The Glycan Kernel Tool in RINGS is an implementation of the biochemically weighted Kernel (Jiang et al. 2011). This tool implements a kernel that utilizes a score matrix for weighting similar glycosidic linkages and monosaccharides more highly in order to extract biochemically meaningful features from the data. Combined with a support vector machine (SVM), this tool can be used to detect biologically significant motifs across two data sets of glycans. It has been shown to extract the same motifs as those that were manually curated, which were consistent with the literature. This detection procedure consists of using the SVM to classify the two input data sets based on the kernel, and then feature extraction is performed based on the classification to detect the most significant features, or glycan substructures, that most distinguish the two data sets. In layman’s terms, given two data sets of glycan structures to compare, this tool will find those glycan substructures that distinguish one data set from the other. That is, it will find substructures that are unique to the target data set, compared to the control.

Fig. 15.12
figure 12

A snapshot of the input screen for the Glycan Kernel Tool. Two text areas are provided for the user to provide KCF-formatted data of two data sets of glycans to compare. Because of the potentially long computation time, previously executed runs of this tool are assigned computation IDs, which can be retrieved from the text field on the left-hand side

6.1 Kernel Tool Input

Two data sets of glycans are required as input for comparison. The target data set of glycans is compared against the control data set to find unique substructures of glycans, or motifs. Figure 15.12 is a snapshot of the input screen for the Glycan Kernel Tool. Two text areas are provided for the user to provide KCF-formatted data of two data sets of glycans to compare. However, large numbers of glycans in the data sets will take longer to compute. Therefore, computation IDs are assigned to each run of this tool. Thus, users can close this browser window and retrieve the results later once they are complete by returning to this page and inputting the computation ID in the text field on the left-hand side of the page.

Fig. 15.13
figure 13

A snapshot of the output screen for the Glycan Kernel Tool. The list of glycans, their scores, and their layers are listed in order of significance

6.2 Kernel Tool Output

The output of this tool is a list of the glycan features and their layers, along with their scores, as shown in Fig. 15.13. Details are described in the original manuscript (Jiang et al. 2011), but in general terms, the layer of the glycan feature is the distance of the given feature from the root node, or reducing end, of the original glycan within which it was found. That is, the layer indicates the number of glycosidic linkages between the feature and the reducing end. Thus, features with larger layers are those that are found closer to the nonreducing end of the input glycan(s). In Fig. 15.13, the top feature is actually a subtree of the second highest feature, which has an additional mannose (green circle) at its reducing end. Accordingly, their layers differ by one. Other features are also subsets of these top two features, indicating that this top feature is probably the single motif that most distinguishes the two data sets.

6.3 Kernel Tool Tips

  • Data sets of over 40 structures may take over an hour to compute. Please be sure to save the computation ID for larger data sets.

  • Since substructures unique to the target and not in the control will be extracted by this tool, swapping target and control data sets may result in interesting results, where structures unique to the control and not the target can be found.

7 GPP: Glycan Pathway Predictor

The glycan pathway predictor (GPP) is a tool that computes the N-glycans that could be theoretically biosynthesized given a set of glyco-enzymes involved with N-glycan biosynthesis. This generation procedure is based on the mathematical model proposed by Krambeck et al. in (2009). In short, the substrate specificities of each enzyme are specified using a mathematical model, and it is assumed that there are no limitations in the amount of each enzyme, sugar donors, or substrates. Then, GPP computes all possible glycan that can be synthesized using the input glycan structure and selected enzymes. The output results in a network of the synthesized glycans.

7.1 GPP Input

The GPP input screen is shown in Fig. 15.14. A single N-glycan formatted in KCF should be specified in the text area on the left. Then one or more enzymes should be selected from the list on the right. Finally, because this mathematical model could potentially continue indefinitely, a limit for the largest glycan structure to generate, indicated by molecular mass, should be specified. Table 15.3 gives a description of each enzyme that is available in GPP.

Fig. 15.14
figure 14

A snapshot of the input screen for the GPP Tool. A single N-glycan formatted in KCF should be specified. Then one or more enzymes should be selected from the list on the right. Finally, the mass limit for the generated glycans should be entered in order to prevent the computation from continuing indefinitely

Table 15.3 The glycogenes available in GPP
Fig. 15.15
figure 15

A snapshot of the output screen of the GPP Tool

Fig. 15.16
figure 16

A snapshot of the output screen zoomed in

Fig. 15.17
figure 17

A snapshot of the detail screen for a glycan selected from the output of the GPP Tool

7.2 GPP Output

The output screen for GPP is a Java applet. Therefore, similar to the DrawRINGS applet, Java security permissions of the web browser used will need to be adjusted accordingly in order to obtain the results. An open-source applet called ZGRViewer (http://zvtm.sourceforge.net/zgrviewer.html) is used to display the resulting pathway. A snapshot of the output screen is shown in Fig. 15.15. The functionality available in ZGRViewer includes zoom, swipe, and double-clicking of nodes. Figure 15.16 is a snapshot of the zoomed in screen. Nodes represent each glycan structure synthesized, and they are connected by edges representing the enzyme involved in the biosynthesis reaction. Nodes can be double-clicked to open a browser window displaying the detailed information of the glycan, as in Fig. 15.17.

7.3 GPP Tips

  • The glycan structure used as the first structure must be an N-linked glycan and a substrate for at least one of the enzymes selected.

  • The resulting pathway map may take some time to load. Clicking on the canvas once may refresh the map and display it.

  • The detail screen may not display in newer browsers due to Java applet security issues. GPP will be updated to include more enzymes in the near future, and the web display will also be updated accordingly.

8 Converter Utility

The Convert Tool in RINGS is a tool that aims to combine and automate the various glycan text format conversion utilities currently available in RINGS. It takes as input any of the major glycan text formats and provides options to convert the structures in the supported output format.

Fig. 15.18
figure 18

A snapshot of the input screen for the Convert Tool. In this example, GLYDE-II is used as input. Note that only one structure at a time can be converted from GLYDE-II format

Fig. 15.19
figure 19

The output format selection screen of the Convert Tool after a structure(s) has been inputted. The input format is automatically recognized, and a pull-down menu of the available output formats is displayed

8.1 Convert Utility Input

The Convert Tool takes as input glycan structures in GLYDE-II, GlycoCTcondensed, KCF, IUPAC, LinearCode, and LINUCS formats. If multiple glycans are inputted, multiline formats such as GlycoCT and KCF need to be inputted with newlines in between. After clicking the “Submit” button, the representation format of the input is automatically recognized, and a list of possible output formats is provided as a pull-down menu. Figure 15.18 is a snapshot of the input screen for the Convert Tool, using GLYDE-II as an example. Note that only one structure in GLYDE-II can be converted at a time due to formatting issues in XML. Figure 15.19 shows how the input format of the text in Fig. 15.18 was recognized and that the possible representation formats to which it can be converted are KCF, GlycoCTcondensed, IUPAC, LinearCode, and MDL Mol. Moreover, the output file format can be selected from HTML, Text, JSON, and DL. HTML will display the results graphically, with images of the glycan structures that were converted. Text will display the results in plain text format, which is easier to process by computer and can be copied as text. JSON is a special format for computers to process the results, and DL refers to download, allowing users to download the results as a text file instead of displaying all the results in the browser, saving network bandwidth.

8.2 Convert Utility Output

The output screen depends on the output formatted selected from the input page. Figure 15.20 is a snapshot of when the HTML format is selected. If either KCF or GlycoCT is selected as the output representation format, then a figure of each converted structure is also provided. This allows the user to visually confirm that the converted structure is correct.

Fig. 15.20
figure 20

A snapshot of the Convert Tool results page, displayed in HTML format. Each converted structure is displayed along with its figure so that users can visually confirm the structure

8.3 Convert Utility Tips

  • Depending on the conversion path via which input formats are converted to the selected output formats, some information may be lost in the process of conversion. This may further cause errors during the conversion. In that case, it is suggested that a format that contains maximal structural information, such as GLYDE-II or GlycoCT, be used as the input format. To do so, the existing input format can be translated to GLYDE-II or GlycoCT, and any missing information should be added manually before being translated to the target format.

  • GLYDE-II-formatted structures can only be converted one structure at a time, due to formatting issues of XML.

9 Data Management System

A data management system is available for users who have registered an account on RINGS. Registration is free of charge, and all data is kept private on the RINGS server. The purpose of this system is to allow users to store their past analysis results online without having to download the results of each tool individually. Moreover, data and results stored on RINGS from one tool can be used as input to other tools when the data formats are compatible. Thus, this system enables users to use the RINGS tools more efficiently.

9.1 Data Management System Usage

Users can create an account with their email address by clicking the “User registration form” link located at the top of the main RINGS page (Fig. 15.21). After successfully creating an account, users can then log-in with their registered email address and password via the Login form at the top right of this page.

Fig. 15.21
figure 21

Top of RINGS home page, where links to create a user account are available via the “User registration form” link. Registered users can then log-in with their registered email address and password, located at the top right of this page

Fig. 15.22
figure 22

Details regarding a data set can be inputted by clicking the name of the data set. The name can be modified and comments can be stored with each data set

Once users have logged in with their registered email address and password, the main page displays a data tree at the left, as in Fig. 15.22. As shown in this figure, by clicking on the name of the data set itself, the name of the data set and comments regarding the data set can be inputted.

Figure 15.23 is a view of the input data of the data set in Fig. 15.22. Having clicked the “Input” folder for this data set, the glycan structures used for input to ProfilePSTMM on this data set is shown on the right. A pull-down menu is available at the top left for other tools that can also take this data as input. The data set itself can also be downloaded via the “Download” link at the top right.

Fig. 15.23
figure 23

Snapshot of RINGS after user has logged in and selected a data set. Here, the input data used for the analysis of Siglec 5 data is selected and can be viewed graphically on the right. Options to rerun this data using a tool in the pull-down menu are shown. The data itself can also be downloaded using the “Download” link at the right

By clicking on the “Output” folder of a data set, the results of the analysis using the input data are shown. In Fig. 15.24, the output view for ProfilePSTMM is shown using the Profile Viewer. An option to view the results in text format is also provided on this page.

Fig. 15.24
figure 24

The output view of the analytical results using ProfilePSTMM on the data in Fig. 15.22. The results can also be viewed as text by clicking the button at the top of this page

Finally, it is highly recommended that users log out using the “Logout” button at the bottom of the data tree when analysis is complete. This will ensure that the data is kept private and cannot be viewed by others. Note that the size of all the data is also indicated at the bottom. Although there is currently no quota, users can check their data usage and delete any unnecessary data as needed. Data can be deleted by clicking on a tool name in the data tree and then checking the data they want to delete. The “Delete” button is shown at the bottom of the page.

9.2 Data Management System Tips

  • Sometimes the data tree does not display after logging in, due to cached cookies and having not logged out a previous time. This can usually be solved by logging in again, and the tree should be displayed.

  • It is highly recommended that a data set name be specified whenever running a tool. The default name can be used of course, but managing the data can become difficult as more data is accumulated.

  • If for some reason the data tree is not updated with the latest results, the “Reload” button located at the bottom of the data tree can be used to refresh the data tree.

10 Feedback System

A feedback and bug report system is also available via the “Feedback search” link at the top of the main RINGS page, as shown in Fig. 15.21. Although anyone can view others’ feedback without logging in, users are required to register and log in order to report any new bugs or feedback.

Fig. 15.25
figure 25

The top of the Feedback Form where users can report new feedback using the “Report Form” or view previous feedback

10.1 Usage

Fig. 15.26
figure 26

The detailed page of a particular feedback. Users can vote for certain feedback if they agree and would like it to be prioritized

Figure 15.25 shows the top part of the Feedback Form. The “Report Form” link will open a new form for users to enter feedback after logging in. This form provides three fields: the tool regarding which feedback is being entered, a title to summarize the problem or suggestion, and the details of how the problem occurred or regarding the suggestion. All feedback can be viewed by selecting “all” and clicking the “Go” button using the main Feedback Form (Fig. 15.25). Feedback regarding a specific tool can also be selected in this form. The results of this form show the list of feedback, including Title, Report Data, Status (open, closed, or duplicate), the tool name, the number of votes for the feedback, any comments from developers, and a priority indicator. The bottom of the Feedback Form also shows the latest feedback. Similar to the search results, specific feedback can be seen in detail by clicking the title of the feedback. The feedback detail page, shown in Fig. 15.26, provides users an option to vote on particular feedback. Accumulated votes will then be considered to be prioritized more highly.

10.2 Tips

  • Voting on existing feedback will help motivate developers to prioritize future work.

  • A clear description, including the data used, when encountering problems, is essential to be able to reenact the problem and help the debugging process.

11 Summary

RINGS was developed to provide analytical tools for glycomics analysis, with an easy-to-use interface. However, without user feedback, the interfaces currently provided cannot be improved. Note that we are also under the process of rebuilding the RINGS interface using the latest technologies. Moreover, as mentioned in the Introduction, in the future, RINGS tools will be designed to allow any format to be used as input. We are also considering allowing GlyTouCan IDs to be used as well. Therefore, RINGS will continue to be improved as user demands are met.