Abstract
"The Siri Bhoovalaya is a seminal work of literature, believed to have been composed approximately a millennium ago, which encompasses diverse information encrypted using numerals of the Kannada language—a predominant language of southern India. Currently, only a portion of this enigmatic text is accessible, and deciphering its content remains largely a manual endeavor. This article presents a novel model designed to automate the conversion of these Kannada numerals into phonetic alphabets of the designated language. Subsequent to this conversion, algorithms rooted in Natural Language Processing (NLP) techniques are utilized to form coherent words. These algorithms adhere to the linguistic and grammatical structures of the target language. Through this research, we aim to establish an initial technical blueprint to shed light on the profound content encapsulated within this age-old masterpiece."
1 Introduction
Siri Bhoovalaya is a renowned multi-lingual literary masterpiece (see [2, 9]) that dates back approximately a millennium, authored by the Jain monk Muni Kumudendu in Karnataka, India. One of its standout features is its unique composition—entirely in the numerals of the Kannada language. Furthermore, this work is so intricately designed that applying varied decryption methods unveils texts in different languages.
Each segment of this extensive work is termed a 'Chakra,' while the method to decode a chakra is known as a 'Bandha.' The text consists of an impressive 16,000 chakras, organized into 56 chapters and further grouped into 9 Khandas. Cumulatively, this amounts to 600,000 shlokas, encompassing roughly 1,400,000 characters. To put its magnitude into perspective, Siri Bhoovalaya is approximately sixfold the size of the epic Indian tale, Mahabharata. It employs intricate patterns such as Chakrabandha, Hamsabandha, Varapadmabandha, Sagarabandha, Sarasabandha, Kruanchabandha, Mayurabandha, Ramapabandha, Nakhabandha, among others. Recognizing these patterns is crucial to determine the appropriate decryption technique. The chakras span diverse fields, from religious doctrines like Jainism, Vedas, Ayurveda, and astrology, to scientific disciplines including mathematics, physics, chemistry, and astronomy.
Each chakra aligns with the Saangathya metre, a hallmark of Kannada poetry. Specifically, every chakra presents a 27 × 27 matrix filled with integers ranging from 1 to 64. Every integer corresponds to a phonetic alphabet in the Kannada language. When deciphered, these chakras translate into verses spanning 718 dialects prevalent across the Indian subcontinent. These comprise 18 major languages, such as Sanskrit, Prakrit, Telugu, Tamil, Pali, Marathi, and Apabhramsha, in addition to 700 minor dialects.
Despite its impressive scope, this vast composition has largely remained obscure, primarily because its numeric-centric nature makes decryption daunting. Consequently, there arose a prevailing belief that the original work, along with the supposed five extant copies, had vanished. This notion persisted until the 1950s when Pundit Yellappa Shastri unveiled the sole surviving copy. However, this version only encompasses 1,270 chakras from the Prathama Khanda, termed as Mangala Prabhruta. To date, merely about 8% of its content has been revealed, necessitating the application of diverse cryptographic techniques, including substitution, transposition, and steganography.
The prevailing sentiment among scholars is that Muni Kumudendu didn't encrypt this work for the sake of hiding its contents. Instead, he harnessed these methods to ingeniously embed content from various languages into a singular cipher text.
2 Cryptographic techniques
2.1 Mono-alphabetic substitution cipher
In this technique, there exists a substitution table that gives the mapping from every plain alphabet to a cipher alphabet. The plain alphabet is encrypted by replacing it with the cipher alphabet given by the substitution table. Similarly, decryption occurs by replacing the cipher alphabet with the plain alphabet in accordance with the table [5].
This concept is used in Siri Bhoovalaya, where there exists a substitution table for every dialect. Figure 1 shows the substitution table for the Kannada language (cf. [2, 13]).
The substitution table gives the mapping between a phonetic alphabet of the Kannada language and an integer between 1 and 64.
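The table lookup described above can be sketched in a few lines of Python. This is only an illustration: the actual table of Fig. 1 is not reproduced here, so SUBSTITUTION_TABLE holds hypothetical placeholder strings, and the function names are assumptions rather than part of the original work.

```python
# Hypothetical stand-in for the table in Fig. 1. The real table maps each
# integer 1..64 to a Kannada phonetic alphabet; placeholders are used here.
SUBSTITUTION_TABLE = {n: f"<{n}>" for n in range(1, 65)}

# Inverse mapping (plain alphabet -> cipher numeral), used for encryption.
INVERSE_TABLE = {alphabet: n for n, alphabet in SUBSTITUTION_TABLE.items()}


def decrypt_numerals(numerals, table=SUBSTITUTION_TABLE):
    """Replace every cipher numeral with its plain alphabet."""
    return [table[n] for n in numerals]


def encrypt_alphabets(alphabets, inverse=INVERSE_TABLE):
    """Replace every plain alphabet with its cipher numeral."""
    return [inverse[a] for a in alphabets]
```

Because the mapping is one-to-one, encrypting the output of decrypt_numerals returns the original numerals.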
2.2 Transposition cipher
A transposition cipher is an encryption method in which the cipher text is a permutation of the plain text and must be traversed in a particular order to recover it [6, 14].
There are a large number of transposition ciphers, two of which are described below.
2.2.1 Chakra bandha
This is a deciphering technique introduced by Muni Kumudendu. The bandha gives a transposition matrix as shown in Fig. 2. The cells of the chakra must be traversed as illustrated in the figure from cells 1 to 729. Cell 1 is situated at row 1, column 14, cell 2 is situated at row 27, column 15 and so on until cell 729 which is situated at row 27, column 14.
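The authoritative traversal order is the one given in Fig. 2, which is not reproduced here. As a hedged illustration, the three positions stated above (cell 1 at row 1, column 14; cell 2 at row 27, column 15; cell 729 at row 27, column 14) are consistent with a wrap-around diagonal walk of the kind used in the siamese construction of odd-order magic squares. The Python sketch below implements that walk as an assumption; it should not be read as the algorithm of the original text, and the function name is hypothetical.

```python
def chakra_bandha_order_sketch(n=27):
    """Return 1-based (row, column) pairs in an assumed traversal order.

    Assumption: start in the middle of the top row, step one row up and one
    column right with wrap-around, and step one row down instead whenever
    the target cell has already been visited.
    """
    row, col = 0, n // 2
    visited = set()
    order = []
    for _ in range(n * n):
        order.append((row + 1, col + 1))
        visited.add((row, col))
        next_row, next_col = (row - 1) % n, (col + 1) % n
        if (next_row, next_col) in visited:
            # Diagonal target already used: drop down one row instead.
            next_row, next_col = (row + 1) % n, col
        row, col = next_row, next_col
    return order
```

For n = 27 this walk visits every one of the 729 cells exactly once and reproduces the first, second, and last positions quoted in the text; whether the intermediate cells agree with Fig. 2 has to be checked against the figure.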
2.2.2 Navmaank bandha
Here, a chakra can be divided into a set of 3 × 3 tiles. Each tile is a 9 × 9 matrix of cells as shown in Fig. 3. Cell traversal is in accordance with the transposition matrix shown in Fig. 4. The position of the tiles varies with chapters as shown in Fig. 5.
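The tiling step can be sketched as follows in Python. Only the division of the 27 × 27 chakra into nine 9 × 9 tiles is shown; the traversal inside each tile (Fig. 4) and the chapter-dependent tile positions (Fig. 5) are not reproduced in the text and are therefore left out. The function name and the (tile_row, tile_column) keys are assumptions made for illustration.

```python
def split_into_tiles(chakra, tile_size=9):
    """Split a square matrix (list of rows) into tile_size x tile_size tiles.

    Returns a dict keyed by 0-based (tile_row, tile_column).
    """
    tiles_per_side = len(chakra) // tile_size
    tiles = {}
    for tile_row in range(tiles_per_side):
        for tile_col in range(tiles_per_side):
            rows = chakra[tile_row * tile_size:(tile_row + 1) * tile_size]
            tiles[(tile_row, tile_col)] = [
                row[tile_col * tile_size:(tile_col + 1) * tile_size]
                for row in rows
            ]
    return tiles
```

A 27 × 27 input yields nine tiles, each of which can then be traversed independently according to the relevant transposition matrix.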
2.3 Steganographic schemes
Steganography is the technique of concealing one message within another [7, 15, 16]. This technique is widely applied in Siri Bhoovalaya. For instance, when Chakra 1-1-1 is deciphered using Chakra Bandha and the substitution table for the Kannada language, the result is a Kannada text. When the first character of each line of the Kannada text is assembled, it gives a Gatha in Prakrit. Similarly, when the characters in the center of each line of the Kannada text are aggregated, the result is a Sanskrit shloka.
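The extraction of a hidden message from fixed positions in each line can be sketched in Python as below. Each line is assumed to be a list of already decoded alphabets, and the "center" of a line is taken to be index len(line) // 2; the text does not say how a line of even length should be handled, so that choice is an assumption, as are the function names.

```python
def first_characters(lines):
    """Collect the first alphabet of every non-empty line."""
    return [line[0] for line in lines if line]


def middle_characters(lines):
    """Collect the alphabet at the center of every non-empty line.

    Assumption: for a line of even length the right-of-center alphabet
    (index len(line) // 2) is used.
    """
    return [line[len(line) // 2] for line in lines if line]
```

For example, for the two placeholder lines ["a", "b", "c"] and ["d", "e", "f"], the first function collects "a" and "d" while the second collects "b" and "e".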
The chakra's exclusive numerical composition and the complex cryptographic techniques required for decryption have warranted the involvement of computers. Reference [2] attempted decryption of chakras using Microsoft Small Basic. This program takes the chakra as input and gives the deciphered characters as output, as shown in Fig. 6.
This paper proposes a model that extends the automated decryption of a chakra using a bandha, as given in [2], by automatically associating the decoded characters to form words. This is a step closer to understanding the shloka originally encrypted.
3 Model
The proposed model for automated decryption and logical association of alphabets into words is shown in Fig. 7.
The model consists of the following components:
3.1 Decryptor
This component of the model takes three inputs:
3.1.1 Chakra
The chakra is a 27 × 27 matrix of integers in the range 1 to 64 (as described in Section 1). An instance of a chakra is given in Fig. 8.
3.1.2 Bandha
The bandha is a technique to decrypt a chakra using transposition. There are several bandhas to decrypt a chakra (as discussed in Section 2.2). The algorithm for Chakra Bandha is given in Algorithm 3.
3.1.3 Substitution table
The numerals obtained by applying the bandha to the chakra are replaced with their corresponding alphabets using the substitution table. An instance of this table is given in Fig. 1.
In summary, the decryptor applies the input bandha to the input chakra and substitutes the output numerals with the corresponding alphabets using the input substitution table. The result is a list of decoded alphabets.
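The decryptor can be summarised by the following minimal Python sketch. It assumes that the bandha is supplied as a list of 1-based (row, column) positions in reading order and that the substitution table is a dictionary from integers to alphabets; both representations, and the function name, are illustrative assumptions rather than the interfaces of the authors' implementation.

```python
def decrypt(chakra, traversal, table):
    """Read the chakra in bandha order and substitute each numeral.

    chakra:    square matrix (list of rows) of integers
    traversal: list of 1-based (row, column) positions, in reading order
    table:     dict mapping each integer to its alphabet
    """
    return [table[chakra[row - 1][col - 1]] for row, col in traversal]
```

With a toy 2 × 2 matrix [[1, 2], [3, 4]], the traversal [(1, 2), (2, 1), (1, 1), (2, 2)], and a table mapping n to a placeholder alphabet, the numerals are read in the order 2, 3, 1, 4 before substitution.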
3.2 N-gram generator
It takes as input the list of decoded alphabets rendered by the decryptor and returns a list of 1-gram [1] to 25-gram sequences of alphabets.
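A minimal Python sketch of such a generator is given below. It assumes the decoded alphabets are held in a list of strings and groups the contiguous sequences by their length n; the function name and the dictionary-of-lists return type are assumptions made for illustration.

```python
def generate_ngrams(alphabets, max_n=25):
    """Return {n: list of contiguous n-alphabet tuples} for n = 1..max_n."""
    ngrams = {}
    for n in range(1, min(max_n, len(alphabets)) + 1):
        ngrams[n] = [
            tuple(alphabets[i:i + n])
            for i in range(len(alphabets) - n + 1)
        ]
    return ngrams
```

For the 729 alphabets of one chakra this produces 729 − n + 1 sequences for each n, that is, 17,925 sequences in total for n = 1 to 25.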
3.3 Bag of words generator
This component takes a corpus of documents in a particular language (the language used by the substitution table) and returns a dictionary that maps each word in the corpus to its frequency of occurrence.
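The word-frequency dictionary can be sketched in Python as follows. The corpus is assumed to be a list of document strings, and words are assumed to be separated by whitespace; real corpora would likely need language-specific tokenisation, which is not covered here. The function name is hypothetical.

```python
from collections import Counter


def bag_of_words(documents):
    """Map every whitespace-separated word in the corpus to its frequency."""
    counts = Counter()
    for document in documents:
        counts.update(document.split())
    return dict(counts)
```

The keys of the returned dictionary serve as the word list against which decoded sequences are later matched.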
3.4 Word matcher
This takes the following inputs:
1) List of generated sequences from the N-gram generator
2) List of words from the bag of words generator
3) List of decoded alphabets from the decryptor
These inputs are passed as parameters to the following procedures:
3.4.1 Finding partial matching words for alphabet sequences
Algorithm 1 requires the following predefined function:
(1) str_search_list(sequence, word_list): Takes a string, sequence, as a regular expression and searches for it in the list of strings, word_list. Returns the list of strings in word_list that match sequence. If no matches are found, returns ϕ.
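Algorithm 1 itself is not reproduced in the text, but the helper it relies on can be sketched in Python with the standard re module. The empty list stands in for ϕ, and the sequence is compiled as a regular expression exactly as described above; everything else about the sketch is an assumption.

```python
import re


def str_search_list(sequence, word_list):
    """Return the words in word_list that match the regular expression."""
    pattern = re.compile(sequence)
    return [word for word in word_list if pattern.search(word)]
```

A sequence obtained from the N-gram generator would first be joined into a single string before being passed in; if it may contain regular-expression metacharacters, re.escape can be applied beforehand.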
3.4.2 Finding exact matching words for alphabet list
Algorithm 2 requires the following predefined functions:
(1) as.string(char_list): Returns a string formed from the combination of all the characters in char_list.
(2) myarray.append(myelement): Appends myelement to the array, myarray.
(3) str_search_str(substr, str): Returns True if the string substr is a substring of the string str. Otherwise, returns False.
The component returns a dictionary of partial matches and a list of exact matches.
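Since Algorithm 2 is not reproduced in the text, the following Python sketch only shows one plausible way in which the three helper functions listed for it could combine to find exact matches: the decoded alphabets are joined into one string and every dictionary word that occurs in it is collected. The names as_string and exact_matches are hypothetical, and the logic is an assumption.

```python
def as_string(char_list):
    """Join all characters (alphabets) into a single string."""
    return "".join(char_list)


def str_search_str(substr, string):
    """Return True if substr occurs inside string, else False."""
    return substr in string


def exact_matches(alphabets, word_list):
    """Collect the words that occur in the joined alphabet string."""
    text = as_string(alphabets)
    matches = []
    for word in word_list:
        if str_search_str(word, text):
            matches.append(word)
    return matches
```

One limitation of this simplification is that a match found on the joined string may start or end in the middle of a multi-character alphabet; an implementation working on alphabet boundaries would avoid this.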
3.5 Consolidator
This is the final component of the model that takes as inputs:
1) List of decoded alphabets from the decryptor
2) Dictionary of partial matches, and
3) List of exact matches
It returns a sequence in which the exact matches replace their corresponding alphabets, while the alphabets that remain unmatched are kept as they are.
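The text does not specify how overlapping matches are resolved, so the Python sketch below makes an explicit assumption: it scans the alphabets from left to right and, at each position, greedily takes the longest run of alphabets (up to 25) whose concatenation is an exact match. The dictionary of partial matches is not used in this simplified version, and the function name is hypothetical.

```python
def consolidate(alphabets, exact_matches, max_n=25):
    """Replace runs of alphabets by exact-match words, longest run first."""
    words = set(exact_matches)
    output = []
    i = 0
    while i < len(alphabets):
        # Try the longest candidate starting at position i first.
        for n in range(min(max_n, len(alphabets) - i), 0, -1):
            candidate = "".join(alphabets[i:i + n])
            if candidate in words:
                output.append(candidate)
                i += n
                break
        else:
            # No word starts here: keep the alphabet unmatched.
            output.append(alphabets[i])
            i += 1
    return output
```

For the placeholder input ["si", "ri", "x", "bhu", "va"] with exact matches "siri" and "bhuva", the sketch yields the sequence "siri", "x", "bhuva".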
4 Results
The model proposed in this paper has been implemented in the R programming language [8, 10] on a 64-bit Linux system with 128 GB of RAM running R version 3.3.1. This implementation generates text files for the final and intermediate output. To eliminate the need for simultaneously viewing these text files, an interactive Shiny [3, 17] web application was developed and deployed. This application is hosted on shinyapps.io [4, 6] and can be accessed at the address: https://siribhoovalya.shinyapps.io/siribhoovalya/.
The functionality of the application is explained as follows:
1) A chakra for decryption must be selected in the Input Chakra drop down. The selected chakra is displayed on the Input tab.
2) The language is chosen in the Substitution Table drop down.
3) The desired bandha is selected in the Decryption Algorithm drop down.
4) Once the chakra, language and bandha are appropriately chosen, the Process button must be pressed.
5) A Processing pop-up window then appears in the bottom-left and reflects the progress of the processing. The processing can be halted at any time by selecting the close button in the pop-up window.
6) Completion of the processing will display the Output tab that contains the following:
   a) Decrypted Output pane that shows the list of decoded alphabets (as rendered by the Decryptor component of the model)
   b) Exact Matched Predicted words pane that depicts the exact matches (as presented by the Word Matcher component of the model)
   c) Unigram Predicted words pane that portrays the partial matches (as provided by the Word Matcher component of the model)
   d) Processed Output pane that provides the sequence of exact matches and unmatched alphabets (as given by the Consolidator component of the model)
Figure 9 shows the web application processing the given inputs, while Fig. 10 shows the output pane of the web application.
5 Conclusion
This paper presents a comprehensive model which, when given a chakra, a bandha, and a substitution table, not only returns a list of decrypted alphabets but also returns words predicted from these alphabets. The model also provides the words that partially match the alphabets.
Given that Siri Bhoovalaya encapsulates works from several fields of study and applies numerous encryption techniques to create a seemingly simple set of 729 numbers per page, the authors share the view of several contemporaries [2, 11, 12, 18] that researchers from no single field will be able to unravel the mysteries of this intriguing creation on their own.
This paper makes an initial attempt at associating alphabets to form words and presents both the partially matched words and the exactly matched words, with the intention of providing a common base for linguists, cryptographers, religious experts, etc. to work towards solving the intricacies of Siri Bhoovalaya.
6 Future work
Two primary challenges stand out in the study at hand. First, while the paper outlines two transposition techniques in Section 2.2, there's an understanding that Muni Kumudendu utilized several other, perhaps lesser-known, transposition methods as mentioned in Section 1. Comprehensive research is required to pinpoint these techniques and develop a bespoke approach for their decryption concerning this work.
Second, the challenge of deciphering ancient word associations in Siri Bhoovalaya is intensified by its age. Given that it was composed roughly a millennium ago, it likely contains archaic terms that have since fallen into obscurity. Compounding this challenge is the fact that many potentially helpful reference materials, which might contain these obsolete words, haven't been digitized. Considering that the chakras decode into 718 dialects, some with antiquated terms and texts not readily available digitally, creating a pertinent corpus appears to be a monumental task. Collaborative efforts among experts spanning various disciplines—from linguistics to computer science—are imperative [19, 20].
The introduced web application represents a pioneering attempt to simplify the deciphering process, obviating the need for intricate hardware and software. However, it has its limitations in terms of the number of chakras, the substitution tables, and the decryption algorithms it currently supports [21, 22]. Efforts are in progress to digitize the available chakras. Additionally, substitution tables for numerous primary languages are in development and will soon be accessible. Nevertheless, integrating decryption algorithms is intricate and demands a deeper, more nuanced understanding [23, 24, 25].
Data Availability
All the data was collected from the simulation reports of the software and tools used by the authors. The authors are working on implementing the same using real-world data with appropriate permissions.
References
1. Brown PF, Desouza PV, Mercer RL, Della Pietra VJ, Lai JC (1992) Class-based n-gram models of natural language. Comput Linguist 18(4):467–479
2. Jain AK (2013) An inimitable cryptographic creation: Siri Bhoovalaya
3. Shiny (2017) Shiny. https://shiny.rstudio.com/
4. shinyapps.io (2017) shinyapps.io. https://www.shinyapps.io/
5. Stallings W (2006) Cryptography and network security: principles and practices, 4th edn. Pearson Education India, pp 35–49
6. Stinson DR (2005) Cryptography: theory and practice. Chapman and Hall/CRC
7. Stallings W (2007) Network security essentials: applications and standards. Pearson Education India
8. R Core Team (2016) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
9. You Y et al (2018) A review of cyber security controls from an ICS perspective. In: 2018 international conference on platform technology and service (PlatCon). IEEE
10. Zhang H, Lin Y, Xiao J (2017) An innovative analying method for the scale of distribution system security region. In: 2017 IEEE power & energy society general meeting, IEEE
11. Bianchi T, Bioglio V, Magli E (2014) On the security of random linear measurements. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE
12. Do T, Gan L, Nguyen N, Tran T (2012) Fast and efficient compressive sensing using structurally random matrices. IEEE Trans Signal Process 60(1):139–154
13. Rao A, Jha B, Kini G (2013) Effect of grammar on security of long passwords. In: Proceedings of the third ACM conference on data and application security and privacy (CODASPY '13). ACM, New York, pp 317–324. https://doi.org/10.1145/2435349.2435395
14. Yan Y, Huang J (2017) Cooperative output regulation of discrete-time linear time-delay multi-agent systems under switching network [J]. Neurocomputing 241(7):108–114
15. Zhou L, Li C (2017) Out sourcing Eigen-decomposition and singular value decomposition of large matrix to a public cloud. IEEE Access 4:869–879
16. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Stern J (ed) Advances in cryptology—Eurocrypt. Springer, Berlin, pp 223–238
17. Jha DP, Kohli R, Gupta A (2016) Proposed encryption algorithm for data security using matrix properties. In: 2016 International conference on innovation and challenges in cyber security (ICICCS-INBUSH). IEEE
18. Patel B, Desai P, Panchal U (2017) Methods of recommender system: A review. In: 2017 international conference on innovations in information, embedded and communication systems (ICIIECS). IEEE
19. Thomas A, Sujatha AK (2016) Comparative study of recommender systems. In: 2016 international conference on circuit, power and computing technologies (ICCPCT). https://doi.org/10.1109/iccpct.2016.7530304
20. Yang S, Chen B (2023) SNIB: improving spike-based machine learning using nonlinear information bottleneck. IEEE Trans Syst Man Cybern: Syst 53(12):7852–7863. https://doi.org/10.1109/TSMC.2023.3300318
21. Sharma RK (2018) Title of the article. J Indian History Culture 2:11–35
22. Kumar SP, Sethi R eds (2021) Krishna Sobti: A counter archive. Taylor & Francis
23. University of Kerala (2000) International journal of Dravidian linguistics, vol 29. Department of Linguistics, University of Kerala
24. Hong Z et al (2021) Challenges and advances in information extraction from scientific literature: a review. JOM 73(11):3383–3400
25. Khurana D, Koli A, Khatter K et al (2023) Natural language processing: state of the art, current trends and challenges. Multimed Tools Appl 82:3713–3744. https://doi.org/10.1007/s11042-022-13428-46
Funding
On behalf of all authors, the corresponding author states that they did not receive any funds for this project.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix 1
1.1 Chakra Bandha Transposition algorithm
Algorithm 3 requires the following predefined functions:
1) myarray.append(myelement): Appends myelement to the array, myarray.
2) mynumber++: Increments the integer or float, mynumber, by one.
3) mynumber1:mynumber2: Returns an array of all integers between the two integers, mynumber1 and mynumber2. If mynumber1 and mynumber2 are floats, then an array of all floats between these two floats is returned.
4) rev(myarray): Reverses the array, myarray.
5) len(myarray): Returns the number of elements in the array, myarray.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
D, J.S. Automated decryption of siri bhoovalaya using cryptography and natural language processing techniques. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18527-y
DOI: https://doi.org/10.1007/s11042-024-18527-y