1 Introduction

Siri Bhoovalaya is a renowned multi-lingual literary masterpiece (see [2, 9]) that dates back approximately a millennium, authored by the Jain monk Muni Kumudendu in Karnataka, India. One of its standout features is its unique composition—entirely in the numerals of the Kannada language. Furthermore, this work is so intricately designed that applying varied decryption methods unveils texts in different languages.

Each segment of this extensive work is termed a 'Chakra,' while the method to decode a chakra is known as a 'Bandha.' The text consists of an impressive 16,000 chakras, organized into 56 chapters and further grouped into 9 Khandas. Cumulatively, this amounts to 600,000 shlokas, encompassing roughly 1,400,000 characters. To put its magnitude into perspective, Siri Bhoovalaya is approximately sixfold the size of the epic Indian tale, Mahabharata. It employs intricate patterns such as Chakrabandha, Hamsabandha, Varapadmabandha, Sagarabandha, Sarasabandha, Kruanchabandha, Mayurabandha, Ramapabandha, Nakhabandha, among others. Recognizing these patterns is crucial to determine the appropriate decryption technique. The chakras span diverse fields, from religious doctrines like Jainism, Vedas, Ayurveda, and astrology, to scientific disciplines including mathematics, physics, chemistry, and astronomy.

Each chakra aligns with the Saangathya metre, a hallmark of Kannada poetry. Specifically, every chakra presents a 27 × 27 matrix filled with integers ranging from 1 to 64. Every integer corresponds to a phonetic alphabet in the Kannada language. When deciphered, these chakras translate into verses spanning 718 dialects prevalent across the Indian subcontinent. These dialects comprise 18 major languages, such as Sanskrit, Prakrit, Telugu, Tamil, Pali, Marathi, and Apabhramsha, in addition to 700 minor dialects.

Despite its impressive scope, this vast composition has largely remained obscure, primarily because its numeric-centric nature makes decryption daunting. Consequently, there arose a prevailing belief that the original work, along with the supposed five extant copies, had vanished. This notion persisted until the 1950s when Pundit Yellappa Shastri unveiled the sole surviving copy. However, this version only encompasses 1,270 chakras from the Prathama Khanda, termed as Mangala Prabhruta. To date, merely about 8% of its content has been revealed, necessitating the application of diverse cryptographic techniques, including substitution, transposition, and steganography.

The prevailing sentiment among scholars is that Muni Kumudendu didn't encrypt this work for the sake of hiding its contents. Instead, he harnessed these methods to ingeniously embed content from various languages into a singular cipher text.

2 Cryptographic techniques

2.1 Mono-alphabetic substitution cipher

In this technique, there exists a substitution table that gives the mapping from every plain alphabet to a cipher alphabet. The plain alphabet is encrypted by replacing it with the cipher alphabet given by the substitution table. Similarly, decryption occurs by replacing the cipher alphabet with the plain alphabet in accordance with the table [5].

This concept is used in Siri Bhoovalaya, where there exists a substitution table for every dialect. Figure 1 shows the substitution table for the Kannada language (cf. [2, 13]).

Fig. 1. Substitution table for Kannada language

The substitution table gives the mapping between a phonetic alphabet of the Kannada language and an integer between 1 and 64.
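The table lookup described above can be illustrated with a minimal Python sketch. The four-entry table below is a hypothetical placeholder and not the mapping of Fig. 1, and the function name is chosen here purely for illustration.

```python
# Hypothetical excerpt of a substitution table: integer -> phonetic symbol.
# The real table (Fig. 1) covers the integers 1 to 64; these entries are
# placeholders and do NOT reproduce the Kannada mapping.
TOY_TABLE = {1: "p", 2: "q", 3: "r", 4: "s"}

def substitute(numerals, table):
    """Replace each numeral with its symbol; fail loudly on unmapped numerals."""
    missing = sorted({n for n in numerals if n not in table})
    if missing:
        raise KeyError(f"numerals missing from table: {missing}")
    return [table[n] for n in numerals]

decoded = substitute([1, 3, 2, 1], TOY_TABLE)
```

Encryption would use the inverse dictionary, which exists only if the table is one-to-one.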

2.2 Transposition cipher

A transposition cipher is an encryption method in which the cipher text is a permutation of the plain text; to recover the plain text, the cipher text must be traversed in a particular order [6, 14].

There are a large number of transposition ciphers, two of which are described below.

2.2.1 Chakra bandha

This is a deciphering technique introduced by Muni Kumudendu. The bandha gives a transposition matrix as shown in Fig. 2. The cells of the chakra must be traversed as illustrated in the figure from cells 1 to 729. Cell 1 is situated at row 1, column 14, cell 2 is situated at row 27, column 15 and so on until cell 729 which is situated at row 27, column 14.

Fig. 2. Chakra Bandha transposition table
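The three cell positions stated above (cell 1 at row 1, column 14; cell 2 at row 27, column 15; cell 729 at row 27, column 14) are consistent with the classical "Siamese" rule for filling an odd-order square: move one cell up and one cell right with wrap-around, and step down one cell when the target is already occupied. The Python sketch below generates a traversal under that rule. Whether the full table of Fig. 2 follows this rule throughout is an assumption, and the function name is hypothetical.

```python
def chakra_bandha_order(n=27):
    """Return the 1-indexed (row, column) positions in visiting order.

    Assumption: the traversal follows the up-right wrap-around rule, which
    agrees with the three cells quoted in the text but has not been checked
    against the complete table.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    visited = [[False] * n for _ in range(n)]
    row, col = 0, n // 2                      # top row, middle column
    order = []
    for _ in range(n * n):
        visited[row][col] = True
        order.append((row + 1, col + 1))
        up, right = (row - 1) % n, (col + 1) % n   # up-right, wrapping around
        if visited[up][right]:
            row = (row + 1) % n               # blocked: step down one cell
        else:
            row, col = up, right
    return order

order = chakra_bandha_order()
```

Given a chakra stored as a list of 27 rows, the numerals would then be read as `[chakra[r - 1][c - 1] for r, c in order]`.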

2.2.2 Navmaank bandha

Here, a chakra can be divided into a set of 3 × 3 tiles. Each tile is a 9 × 9 matrix of cells as shown in Fig. 3. Cell traversal is in accordance with the transposition matrix shown in Fig. 4. The position of the tiles varies with chapters as shown in Fig. 5.

Fig. 3. Chakra divided into 9 tiles

Fig. 4. Navmaank Bandha transposition table

Fig. 5. Tile transposition scheme
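The tiling step can be sketched in Python as follows. Only the division of a 27 × 27 chakra into a 3 × 3 grid of 9 × 9 tiles is shown; the within-tile traversal (Fig. 4) and the chapter-dependent tile order (Fig. 5) are not reproduced in the text and are therefore not implemented. The function name and the dummy matrix are illustrative assumptions.

```python
def split_into_tiles(chakra, tile_size=9):
    """Split a square matrix into a grid of tile_size x tile_size tiles.

    tiles[tr][tc] is the tile in tile-row tr and tile-column tc (0-indexed),
    stored as a list of rows.
    """
    n = len(chakra)
    if n % tile_size != 0 or any(len(row) != n for row in chakra):
        raise ValueError("chakra must be square with side divisible by tile_size")
    per_side = n // tile_size
    return [
        [
            [row[tc * tile_size:(tc + 1) * tile_size]
             for row in chakra[tr * tile_size:(tr + 1) * tile_size]]
            for tc in range(per_side)
        ]
        for tr in range(per_side)
    ]

# Dummy 27 x 27 matrix whose entries encode their own position (not a real chakra).
demo = [[r * 27 + c for c in range(27)] for r in range(27)]
tiles = split_into_tiles(demo)
```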

2.3 Steganographic schemes

Steganography is the technique of concealing one message within another [7, 15, 16]. This technique is widely applied in Siri Bhoovalaya. For instance, when Chakra 1-1-1 is deciphered using Chakra Bandha and the substitution table for the Kannada language, the result is a Kannada text. When the first character of each line of the Kannada text is assembled, it gives a Gatha in Prakrit. Similarly, when the characters in the center of each line of the Kannada text are aggregated, the result is a Sanskrit shloka.
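The extraction of a hidden message from fixed positions in each line can be sketched in Python. Each line is represented as a list of phonetic symbols rather than a raw string, since one Kannada phonetic unit may span several Unicode code points. The choice of index `len(line) // 2` as the "center" is an assumption, and the sample lines are Latin placeholders, not text from the work.

```python
def first_symbols(lines):
    """Collect the first symbol of every non-empty line."""
    return [line[0] for line in lines if line]

def middle_symbols(lines):
    """Collect the symbol at index len(line) // 2 of every non-empty line."""
    return [line[len(line) // 2] for line in lines if line]

# Placeholder lines; a real input would hold decoded Kannada symbols.
toy_lines = [list("abcde"), list("fghij"), list("klmno")]
hidden_first = first_symbols(toy_lines)
hidden_middle = middle_symbols(toy_lines)
```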

The chakra's exclusively numerical composition and the complex cryptographic techniques required for decryption have warranted the involvement of computers. The work in [2] attempted decryption of chakras using Microsoft Small Basic. Its program takes the chakra as input and gives the deciphered characters as output, as shown in Fig. 6.

Fig. 6. Screenshot of chakra decrypted by [2]

This paper proposes a model that extends the automated decryption of a chakra using a bandha, as given in [2], by automatically associating the decrypted characters to form words. This is a step closer to understanding the shloka originally encrypted.

3 Model

The proposed model for automated decryption and logical association of alphabets into words is shown in Fig. 7.

Fig. 7. Model for automated decryption and alphabet association

The model consists of the following components:

3.1 Decryptor

This component of the model takes three inputs:

3.1.1 Chakra

The chakra is a 27 × 27 matrix of integers in the range 1 to 64 (as described in Section 1). An instance of a chakra is given in Fig. 8.

Fig. 8. Chakra 1-1-1

3.1.2 Bandha

The bandha is a technique to decrypt a chakra using transposition. There are several bandhas to decrypt a chakra (as discussed in Section 2.2). The algorithm for Chakra Bandha is given in Algorithm 3.

3.1.3 Substitution table

The numerals obtained on application of the bandha to the chakra are replaced with their corresponding alphabets using the substitution table. An instance of this is given in Fig. 1.

In summary, the decryptor applies the input bandha to the input chakra and substitutes the output numerals with the corresponding alphabets using the input substitution table, resulting in a list of decoded alphabets.
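The two-stage behaviour of the decryptor can be sketched in Python as follows. The bandha is represented here as an explicit list of 1-indexed (row, column) positions, which is an assumed encoding; the 3 × 3 matrix, the order, and the table are toy placeholders standing in for a 27 × 27 chakra, a full bandha, and the table of Fig. 1.

```python
def decrypt(chakra, order, table):
    """Read the chakra cells in bandha order, then substitute each numeral."""
    numerals = [chakra[r - 1][c - 1] for r, c in order]   # transposition step
    return [table[n] for n in numerals]                   # substitution step

# Toy stand-ins (hypothetical values, not taken from the work).
toy_chakra = [[1, 2, 3],
              [4, 1, 2],
              [3, 4, 1]]
toy_order = [(1, 2), (3, 3), (2, 1)]      # 1-indexed (row, column) pairs
toy_table = {1: "p", 2: "q", 3: "r", 4: "s"}

decoded = decrypt(toy_chakra, toy_order, toy_table)
```

Keeping the bandha as data rather than code means a further transposition scheme only requires a new position list.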

3.2 N-gram generator

It takes as input the list of decoded alphabets rendered by the decryptor and returns a list of 1-gram [1] to 25-gram sequences of alphabets.
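A minimal Python sketch of this component is given below, assuming that the sequences are contiguous sliding windows over the decoded list. The function name and the toy input are illustrative.

```python
def ngrams(symbols, max_n=25):
    """Return every contiguous run of 1..max_n symbols, shortest runs first."""
    out = []
    for n in range(1, min(max_n, len(symbols)) + 1):
        for start in range(len(symbols) - n + 1):
            out.append(tuple(symbols[start:start + n]))
    return out

grams = ngrams(["p", "q", "r"], max_n=2)
```

Under this assumption, a full chakra of 729 decoded alphabets yields 17,925 sequences, so matching cost is dominated by the size of the word list rather than by sequence generation.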

3.3 Bag of words generator

This component takes a corpus of documents in a particular language (the language used by the substitution table) and returns a dictionary that maps each word in the corpus to its frequency of occurrence.
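A minimal Python sketch of such a word-frequency dictionary follows. Splitting on whitespace is an assumption; the text does not state how the corpus is tokenized. The sample documents are placeholders.

```python
from collections import Counter

def bag_of_words(documents):
    """Count whitespace-separated tokens across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.split())
    return dict(counts)

bag = bag_of_words(["pq rs pq", "rs tu"])
```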

3.4 Word matcher

This takes the following inputs:

  1) List of generated sequences from the N-gram generator
  2) List of words from the bag of words generator
  3) List of decoded alphabets from the decryptor

These inputs are passed as parameters to the following procedures:

3.4.1 Finding partial matching words for alphabet sequences

Algorithm 1 requires the following predefined function:

  (1) str_search_list(sequence, word_list): Takes a string, sequence, as a regular expression. Searches for this regular expression in the list of strings, word_list. Returns a list of the strings in word_list that match sequence. If no matches are found, returns ϕ.

Algorithm 1. Find partial matches for alphabet sequences
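The helper str_search_list can be sketched in Python with the standard `re` module, with an empty list standing in for ϕ. The surrounding loop is only one plausible reading of Algorithm 1, whose body is not reproduced here; in particular, escaping each sequence so that it is matched literally is an assumption.

```python
import re

def str_search_list(sequence, word_list):
    """Return the words in word_list that match the regular expression `sequence`."""
    pattern = re.compile(sequence)
    return [word for word in word_list if pattern.search(word)]

def partial_matches(sequences, word_list):
    """Map each sequence to the words that contain it (assumed behaviour)."""
    result = {}
    for seq in sequences:
        hits = str_search_list(re.escape(seq), word_list)  # literal match
        if hits:                       # sequences without matches are skipped
            result[seq] = hits
    return result

partial = partial_matches(["pq", "zz"], ["pqr", "spq", "tu"])
```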

3.4.2 Finding exact matching words for alphabet list

Algorithm 2 requires the following predefined functions:

  (1) as.string(char_list): Returns a string formed from the combination of all the characters in char_list.
  (2) myarray.append(myelement): Appends myelement to the array, myarray.
  (3) str_search_str(substr, str): Returns True if the string substr is a substring of the string str. Otherwise, returns False.

Algorithm 2. Find exact matches for alphabet list

The component returns a dictionary of partial matches and a list of exact matches.
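The predefined functions of Algorithm 2 have direct Python counterparts, sketched below. The body of Algorithm 2 is not reproduced in the text, so the `exact_matches` loop is only an assumed reading: a word counts as an exact match when it occurs as a substring of the joined decoded alphabets.

```python
def as_string(char_list):
    """Counterpart of as.string: join all characters into one string."""
    return "".join(char_list)

def str_search_str(substr, s):
    """Counterpart of str_search_str: True if substr occurs inside s."""
    return substr in s

def exact_matches(decoded, word_list):
    """Collect the words that occur in the decoded text (assumed behaviour)."""
    text = as_string(decoded)
    found = []
    for word in word_list:
        if str_search_str(word, text):
            found.append(word)         # counterpart of myarray.append
    return found

exact = exact_matches(["p", "q", "r", "s"], ["qr", "rs", "sp"])
```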

3.5 Consolidator

This is the final component of the model. It takes as inputs:

  1) List of decoded alphabets from the decryptor
  2) Dictionary of partial matches
  3) List of exact matches

It returns a sequence in which exact matches substitute the corresponding alphabets, with unmatched alphabets left in place.
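One way to produce such a sequence is a greedy left-to-right pass that prefers the longest exact match at each position, sketched below in Python. The greedy rule is an assumption and not necessarily the strategy used by the model. The sketch also assumes that each decoded alphabet is a single character, and it omits the dictionary of partial matches.

```python
def consolidate(decoded, exact):
    """Emit the longest exact match at each position, else the single alphabet."""
    text = "".join(decoded)            # assumes one character per alphabet
    words = sorted(set(exact), key=len, reverse=True)   # longest first
    out, i = [], 0
    while i < len(text):
        for word in words:
            if word and text.startswith(word, i):
                out.append(word)
                i += len(word)
                break
        else:
            out.append(text[i])        # no word starts here: keep the alphabet
            i += 1
    return out

consolidated = consolidate(list("pqrst"), ["qr", "qrs"])
```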

4 Results

The model proposed in this paper has been implemented in the R programming language [8, 10] on a 64-bit Linux system with 128 GB of RAM, running R version 3.3.1. This implementation resulted in the generation of text files for the final and intermediate output. To eliminate the need for simultaneously viewing these text files, an interactive Shiny [3, 17] web application was developed and deployed. This application is hosted on shinyapps.io [4, 6] and can be accessed at the address: https://siribhoovalya.shinyapps.io/siribhoovalya/.

The functionality of the application is explained as follows:

  1) A chakra for decryption must be selected in the Input Chakra drop down. The selected chakra is displayed on the Input tab.
  2) The language is chosen in the Substitution Table drop down.
  3) The desired bandha is selected in the Decryption Algorithm drop down.
  4) Once the chakra, language and bandha are appropriately chosen, the Process button must be pressed.
  5) This will result in a Processing pop-up window to be visible in the bottom-left. This pop-up window will reflect the progress of the processing. The processing can be halted at any time by selecting the close button in the pop-up window.
  6) Completion of the processing will display the Output tab that contains the following:

    a) Decrypted Output pane that shows the list of decoded alphabets (as rendered by the Decryptor component of the model)
    b) Exact Matched Predicted words pane that depicts the exact matches (as presented by the Word Matcher component of the model)
    c) Unigram Predicted words pane that portrays the partial matches (as provided by the Word Matcher component of the model)
    d) Processed Output pane that provides the sequence of exact matches and unmatched alphabets (as given by the Consolidator component of the model)

Figure 9 shows the web application processing the given inputs, while Fig. 10 shows the output pane of the web application.

Fig. 9. Web application processing the inputs

Fig. 10. Output pane of web application

5 Conclusion

This paper presents a comprehensive model that, when given a chakra, a bandha, and a substitution table, not only returns a list of decrypted alphabets but also returns words predicted from these alphabets. The model also provides the words that partially match the alphabets.

Given that Siri Bhoovalaya encapsulates works from several fields of study and applies numerous encryption techniques to create a seemingly simple set of 729 numbers per page, the authors share the view of several contemporaries [2, 11, 12, 18] that researchers from no single area of research will be able to unravel the mysteries of this intriguing creation on their own.

This paper makes an initial attempt at associating alphabets to form words and presents both the partially matched words and the exactly matched words, with the intention of providing a common base for linguists, cryptographers, religious experts, and others to work towards solving the intricacies of Siri Bhoovalaya.

6 Future work

Two primary challenges stand out in the study at hand. First, while the paper outlines two transposition techniques in Section 2.2, there's an understanding that Muni Kumudendu utilized several other, perhaps lesser-known, transposition methods as mentioned in Section 1. Comprehensive research is required to pinpoint these techniques and develop a bespoke approach for their decryption concerning this work.

Second, the challenge of deciphering ancient word associations in Siri Bhoovalaya is intensified by its age. Given that it was composed roughly a millennium ago, it likely contains archaic terms that have since fallen into obscurity. Compounding this challenge is the fact that many potentially helpful reference materials, which might contain these obsolete words, haven't been digitized. Considering that the chakras decode into 718 dialects, some with antiquated terms and texts not readily available digitally, creating a pertinent corpus appears to be a monumental task. Collaborative efforts among experts spanning various disciplines—from linguistics to computer science—are imperative [19, 20].

The introduced web application represents a pioneering attempt to simplify the deciphering process, obviating the need for intricate hardware and software. However, it has its limitations in terms of the number of chakras, the substitution table, and the decryption algorithms it currently supports [21, 22]. Efforts are in progress to digitize the available chakras. Additionally, substitution tables for numerous primary languages are in development and will soon be accessible. Notwithstanding, integrating decryption algorithms is intricate and demands a deeper, more nuanced understanding [23,24,25].