Keywords

1 Introduction

The database (DB), as the most important corporate information resource, should be properly protected. And in the case of using a database, built on the basis of a scheme with the universal basis of relations, that allows to store simultaneously in the fixed structure of the relations of its scheme the data of various subject domains (SDs) necessary for the company, organization or institution according to its multidisciplinary activities, as a corporate repository, the problem of ensuring the database security, as the protection of its data from undesirable disclosure-use (violation of confidentiality), falsification (integrity violation), loss or decrease in availability measure, becomes even more actual. Herewith, it must be noted that the finished universal technique of integrated solution of the task ensuring of databases security today does not exist. This is explained by the variety of activities of enterprises, the structure of information systems, networks, database management systems (DBMS), data flows and ways to organize access to them. Therefore, in each specific situation, including when using a database with a universal basis of relations, this most important task requires an original approach, usually based on existing developments and solutions in this area.

2 Means and Methods for Protecting Databases Built on the Basis of the Scheme with the Universal Basis of Relations

The essence of the problem of ensuring the information security of databases is to develop methods and means that ensure the confidentiality, integrity, availability of their data under conditions of impact on them of any intentional or unintentional threats. Any threat should be considered as a potential possibility of security system violation that, if successful, can have some negative impact.

Existing approaches to solving this problem involves examining the database survey to identify possible threats that can lead to destruction, theft, data fraud, loss of confidentiality and data integrity, loss of availability and finding effective ways, means of confronting them. Herewith, first of all, the possibilities of means and methods of database protection that are supported by the database management system are analyzed.

The means and methods of databases protecting in various DBMS, on the platform of which the database scheme with the universal basis of relations [1,2,3] can be implemented, differ from each other. However, in varying degrees, often enough among them there are such as [4]: authorization, access control; views; backup and recovery; integrity support; encryption; the use of fault-tolerant hardware.

Below, without affecting the general security issues of computer systems, using the theory and the research results described in [4,5,6,7,8,9,10,11,12], we propose an approach implemented the complex use of various mechanisms ensuring databases security built on the basis of the database scheme that is invariant to subject domains (SDs) [1], taking into account the peculiarities of its structure.

It is known that to the violation of confidentiality leads both an intentional action aimed at the implementation of unauthorized access to the data and a random error of software or unskilled user action, which led to the disclosure of confidential information. Therefore, in the first place, the user authorization mechanism was analyzed to identify certain vulnerabilities in it and their subsequent elimination.

It is no secret that once a user get access right to database system, he can automatically be granted various privileges associated with his identifier. In particular, these privileges can include permission to access certain base (tables), virtual (views) relations of DB schema, its procedures, functions and other objects, as well as various actions with them. Such a method is sufficiently developed and flexible. It allows the database administrator to configure the access rights of users in accordance with their job responsibilities (the principle of access reasonableness). However, with it help, access is restricted only to named objects of the database schema, and, as a rule, only to a complete set of data that is provided to the respective users. While there is a need to control access at a lower level (although some DBMSs provide certain privileges for attributes of basic and virtual relations [5]). Using discretionary access control, it is impossible, for example, in full to restrict a subject access (a registered user) to only a part of DB relation tuples. And, taking into account the peculiarities of the database scheme with the universal basis of relations, which is invariant to subject domains and which allows to store data from different SDs in the fixed structure of the relations simultaneously, this is unacceptable. Thereby, analyzing the different approaches [4,5,6,7,8, 13] of the solution of this problem, it was concluded that it is advisable to use in this case of the mechanism of fine-grained access control, also known as row level security (RLS) [14].

That it was possible to take advantage of this mechanism benefits, as well as to simplify the procedure for registering the actions of database users (conducting an audit), ensuring auditability (one of the most important requirements of computer security), certain adjustments were made to the DB invariant scheme presented in [1] (in her base relations \( R^{sh} \)).

First, in each base relation of such database scheme \( R_{i}^{sh} \in R^{sh} \) the following attributes were added: user identifier (\( u_{i} \in at(R_{i}^{sh} ) \)); the tuple recording time (\( t_{i}^{ins} \in at(R_{i}^{sh} ) \)); tuple component update time (\( t_{i}^{upd} \in at(R_{i}^{sh} ) \)). Secondly, since the specific user (its identifier) \( u^{j} \in U_{1} \) is associated with the value of the attribute \( u_{i} \)(\( U_{1} \) – the set of user identifiers) for which access privileges to the tuples of the base relations should be defined \( R^{sh} \)\( p_{j} \in U_{3} \) (where \( U_{3} \) is the set of privileges granted to users for performing operations such as deletion, insert, update, select, as well as their combinations), the relation was defined whose extensions include data on the corresponding database user names, their identifiers and corresponding their access privileges. In a formalized form, such relation, referred to as the relation of users, can be represented as a subset of the Cartesian product \( U_{1} \times U_{2} \times U_{3} \):

$$ U = \{ (u_{1} ,u_{2} ,u_{3} )|u_{1} \in U_{1} \wedge u_{2} \in U_{2} \wedge u_{3} \in U_{3} \} , $$
(1)

where \( U_{2} \) is the set of user names; \( u_{1} = u^{j} \); \( u_{3} = p_{j} \);

$$ p_{j} = p_{j}^{gl} \cup p_{j}^{del} \cup p_{j}^{upd} \cup p_{j}^{sel} \cup p_{j}^{ins} , $$
(2)
$$ \begin{aligned} p_{j}^{gl} & = \left\{ {\begin{array}{*{20}l} {p^{gl} \in P_{user} ,{\text{ if }}u^{j} {\text{ has the privilege to access data }}\forall u^{k} , { }k = 1 \ldots |U_{1} |; \, } \hfill \\ {p^{none} = 0 \, (p^{none} \in P_{user} ),{\text{ else; }}} \hfill \\ \end{array} } \right. \\ p_{j}^{del} & = \left\{ {\begin{array}{*{20}l} {p^{del} \in P_{user} ,{\text{ if }}u^{j} {\text{ has the privilege to delete his data}}; \, } \hfill \\ {p^{none} = 0 \, (p^{none} \in P_{user} ),{\text{ else; }}} \hfill \\ \end{array} } \right. \\ p_{j}^{upd} & = \left\{ {\begin{array}{*{20}l} {p^{upd} \in P_{user} ,{\text{ if }}u^{j} {\text{ has the privilege to update his data}}; \, } \hfill \\ {p^{none} = 0 \, (p^{none} \in P_{user} ),{\text{ else, }}} \hfill \\ \end{array} } \right. \\ \end{aligned} $$

similarly for the privilege \( p_{j}^{sel} \) that allows to \( j \) user (\( u^{j} \)) to select the data available to him from the base relations \( R^{sh} \), and data insert privilege – \( p_{j}^{ins} \); \( P_{user} = \{ p^{none} ,p^{del} ,p^{upd} ,p^{sel} ,p^{ins} ,p^{gl} \} \) is user privilege domain.

The result of mapping the relation \( U \) to the base relation \( R_{U}^{sh} \in R^{sh} \) of the database scheme that is invariant to SDs is represented in the form of the main lines of data definition language code of the ISO SQL standard used in the CREATE (ALTER) TABLE operators:

figure a

Thirdly, to implement the possibility that the grantor (by one authorized user \( u^{j} \)) transferred the privileges belonging to him to other authorized users (grantees), the new mechanism was developed, since the traditional way (using GRANT command of the SQL standard) was not fully suitable for the database scheme with the universal basis of relations, taking into account its destination and structure of relations \( R^{sh} \). Such mechanism implemented within RLS technology required the development of a new relation for the database scheme that is invariant to SD, namely so-called of the access privilege distribution relation to the data of other users. In a formalized form, it can be represented as a subset of the Cartesian product \( U_{1} \times U_{1} \times U_{3} \):

$$ G = \{ (g_{1} ,g_{2} ,g_{3} )|g_{1} \in U_{1} \wedge g_{2} \in U_{1} \wedge g_{3} \in U_{3} \} . $$
(3)

The relation (3) extension is a set of tuples, each of which is associated with a specific data owner (\( g_{1} \)), which transmits its access privileges (\( g_{3} \)) to another authorized user (\( g_{2} \)).

The result of mapping this relation to the base schema relation (\( R_{G}^{sh} \in R^{sh} \)) is shown below in the form of the following main lines of SQL code used in CREATE (ALTER) TABLE operators:

figure b

Further in accordance with the RLS technology were determined:

  • a set of declarative commands (RLS policies) that determine how and when have to apply users access restrictions to the tuples of the schema base relations \( R^{sh} \);

  • a set of stored functions \( \Psi \) (combined in a package) that are called when the conditions specified in the security policy (RLS policy) are performed;

  • predicates formed by functions \( \Psi \) that the DBMS automatically appends to the end of the WHERE clause of user-executable SQL statements (the consumption of system resources depends on the correct predicate formation).

In the aggregate, all this can be represented as the implementation of rules for protecting relations \( R^{sh} \) and formalized in the form of the following expression:

$$ Sr = \{ R_{i}^{sh} ,oper_{i}^{j} ,policy_{i}^{k} ,\Psi _{i}^{l} ,attr_{i}^{\mu kl} ,pat_{contr}^{{R_{i}^{sh} }} \} , $$
(4)

where \( oper_{i}^{j} \) is the \( j \)-th combination (from the values: select, update, delete, insert) of the allowed access operations to the relation \( R_{i}^{sh} \in R^{sh} \); \( policy_{i}^{k} \) is name of the \( k \)-th RLS policy, which is applied for the base relation \( R_{i}^{sh} \); \( \Psi _{i}^{l} \in\Psi \) is the name of the \( l \)-th function (specifying the package name) that generates the predicate for the base relation \( R_{i}^{sh} \); \( attr_{i}^{\mu kl} \) is the value of the \( \upmu \)-th parameter for the \( k \)-th RLS policy and the \( l \)-th function; \( pat_{contr}^{{R_{i}^{sh} }} \) is pattern of the commands for managing access to \( R_{i}^{sh} \) (for implementation of the security policy).

Herewith, it should be noted that, for example, when implementing the database scheme with the universal basis of relations on the Oracle DBMS platform, in order to enhance the capabilities of RLS technology, guided by the recommendations [7, 8], it is also expediently to use the mechanism of so-called application contexts (named set of pairs “parameter-value”). The idea underlying the use of contexts is simple enough, but, at the same time, it allows to provide serious protection. The list of variables in memory (context), whose values are bound to sessions, is determined. Herewith, the session can get the current values of these variables, calling a special function, and variables in the context can be set only by calling a procedure associated with this context.

An example of the pattern of access control commands (of security policy implementation for Oracle DBMS) is given below.

figure c

The following symbols are used in the above pattern:

  • variables (in bold italic font): user_name is user name; table_name is base relation name \( R_{i}^{sh} \in R^{sh} \) to be protected by policy; access_rule is combinations of access operations (select, update, delete, insert – \( oper_{i}^{j} \), expression (4)) to the relation specified in table_name;

  • parameters of the add_policy procedure of the standard Oracle package dbms_rls: policy_name is the name of the RLS policy that is applied to the base relation \( R_{i}^{sh} \in R^{sh} \); policy_function is the name of the function owner that returns the condition; policy_function is the name of the function (with the name of the package) which generates a predicate for the base relation \( R_{i}^{sh} \in R^{sh} \); statement_types is statement types to which the policy applies; update_check is optional argument for INSERT or UPDATE statement types (the default is false). Setting update_check to true causes the server to also check the policy against the value after INSERT or UPDATE; enable is a parameter indicating the activation of the policy immediately after its addition (the default is true); static_policy is a parameter (the default is true). If it is set to true, the server assumes that the policy function for the static policy produces the same predicate string for anyone accessing the object, except for SYS or the privileged user who has the EXEMPT ACCESS POLICY privilege.

  • the braces are {}, the square brackets are [], the symbol | correspond to the notation taken from the extended Backus-Naur notation;

  • all other alphanumeric characters are either language keywords, either by standard package names or by accepted string literals.

As a rule, today in relational DBMS separate records, from the point of view of the access organization to them of various users, are not specially protected, although there are examples known from practice when it is required [13, 15]. Therefore, in order to provide such functionality, based on the capabilities of the above fine-grained access control mechanism, taking into account the predetermined relation \( R^{sh} \) structure, a special additional relation of the DB scheme with the universal basis of relations was developed, the data of which is used by the function forming the predicate. This relation, referred to as the relation of access restrictions to a specific data item, can be presented in a formalized form as a subset of the Cartesian product \( U_{1} \times U_{2} \times R_{name}^{sh} \times R_{ID}^{sh} \):

$$ A = \{ (a_{1} ,a_{2} ,a_{3} ,a_{4} )|a_{1} \in U_{1} \wedge a_{2} \in U_{2} \wedge a_{3} \in R_{name}^{sh} \wedge a_{4} \in R_{ID}^{sh} \} , $$
(5)

where \( R_{name}^{sh} \) is the set of schema relations \( R^{sh} \) names; \( R_{ID}^{sh} = \mathop \cup \limits_{i} R_{i}^{sh} [K_{{PK_{i} }} ] \) is a set of identifiers that are primary keys (\( K_{{PK_{i} }} \)) in the corresponding relations \( R^{sh} \), access to which is limited to a user with an identifier \( a_{1} \in U_{1} \) and a name \( a_{2} \in U_{2} \).

The result of mapping this relation to the base relation (\( R_{A}^{sh} \in R^{sh} \)) of the database scheme that is invariant to SD is represented as the following main lines of SQL code:

figure d

The packet body fragment forming the context attributes and the predicate string is shown in [10].

Herewith, in general, it is necessary to understand that the implementation of the access mechanism of different users to specific individual data elements leads to an increase in the total time required to process the corresponding requests to the database.

In order to organize access to data and system resources in the development of the database schema with the universal basis of relations, its following objects were also identified: roles designed to simplify the management of system and object privileges; synonyms required to specify an alternate object name of the database; profiles as a named set of resource limits and password parameters that restrict database usage for a user.

To protect important information stored in the database, access to it from one side should be limited, and on the other side it is advisable to encrypt it. Data encryption is a key component in implementing the principle of multilevel protection. The desire to reduce the risk of data confidentiality loss, including due to insider threats of privileged users, has become a motivated start for: developing mechanisms that provide the possibility of effective use for the protection of cryptographic primitives supported by the DBMS (if there are any and they satisfy the consumer of the information product); developing their own cryptographic protection means (if they are not available or they do not fully meet the requirements of the consumer of the information product); or for their complex use.

For this purpose, a package of subprograms, a technique for its application was developed. Also recommendations on the use of existing technologies of information encryption and hiding have been determined.

The developed package of subroutines and the technique of its application provide for the integrated use of both the supported cryptographic primitives by the DBMS (for example, for Oracle DBMS it is cryptographic algorithms AES, 3DES168, RC4; cryptographic hash algorithms: MD5, SHA-1, SHA-2; keyed hash (MAC) algorithms: HMAC_MD5, HMAC_SH1, HMAC_SH512 and others) and the symmetric block cipher “Kalyna” from the national Ukrainian encryption standard DSTU 7624:2014.

Large amounts of data and discretionary access to information stored in a database based on a database schema with the universal basis of relations to a certain extent complicate the implementation of an effective mechanism when the content is pre-decrypted and then, after use, is encrypted back. Therefore, in order to preserve the habitual tools and operating procedure in the proposed approach, it is recommended to use the so-called “transparent encryption” method, in which the information is automatically decrypted when reading from the medium (if the correct key was entered) and automatically encrypted during recording. So transparent data encryption (TDE) allows you to selectively encrypt vulnerable data that is stored in database files, as well as all stream file components, such as redo logs, archive logs, backup tapes. The TDE technique is inherent in various DBMSs (IBM patent 7426745 [16]). The main purpose of TDE is to protect vulnerable data in the appropriate operating system files.

In addition, without resorting to the encryption procedure, for several reasons: observance of intellectual property rights; commercial value; the code provides the solution of problems of protection and distribution of data access rights; the inadmissibility of code modification by other users (especially after installing the software to the consumer of the information product, that the latter had no reason to declare after his own modification (in fact, hacking) about the inoperability and unreliability of software), code of the main procedures, functions, packages, triggers of the database schema with the universal basis of relations is advisable to hide. For this purpose, it is suggested to use the corresponding DBMS tools. For example, in Oracle DBMS, such the most suitable means is the special utility WRAP. This utility allows to hide PL/SQL code of main objects of the database schema quite simply and effectively, transforming this code into an unreadable form, which in its turn is uniquely understood by the server. The server can compile and execute it. Herewith the code is changed, and not encrypted with complex algorithms. Such a substitution does not greatly affect the performance, unlike the encryption/decryption procedure. However, using this mechanism, it should be kept in mind that if you need to change the source code, you will have to change the original again, hide it with the utility and load the hidden version into the database.

In addition to this mechanism, special pipelined-functions were developed to hide the composition and structure of the base and virtual relations of the database schema with the universal basis of relations. These functions with the parameter in the form of a meta-description line of the data model language (LDM) [17, 18] can be used in SQL statements (in FROM clause) instead of base and virtual relations of the database schema with the universal basis of relations.

An example of using pipelined-function:

figure e

where get_spis_metadata is a pipelined function with a parameter in the form of a meta-description line of the LDM that is used in the SELECT statement as a virtual table to obtain a list of certain requested data of the considered SD.

It is known that the loss of database data integrity can have the most serious consequences for the future work of the organization. Therefore, in the database scheme with a universal basis of relations, appropriate means of maintaining the data integrity were realized, which were considered in detail in [1], which contribute to the overall security of the database, preventing the possibility of data transition to an inconsistent state, thereby excluding the threat of receiving erroneous or incorrect results.

Using databases based on a database scheme with the universal basis of relations, it is recommended as one of the mechanisms that contribute to increasing the DB availability, to perform periodically backup their contents and organize the storage of created copies in places provided with the necessary protection. Today, almost any modern DBMS provides backup tools that allow you to restore a database. At the same time, in addition to the capabilities of standard backup and recovery tools, it became expedient to define one more relation of the database scheme with the universal basis of relations. Namely, a relation, referred to as a log of changed data. In a formalized form, this relation can be represented as a subset of the Cartesian product \( OS_{user} \times IP \times U_{2} \times DB_{name} \times L_{DM} \times T_{DB} \times T_{DDB} \times Op \times P_{name} \):

$$ \begin{aligned} L = \{ (l_{1} ,l_{2} ,l_{3} ,l_{4} ,l_{5} ,l_{6} ,l_{7} ,l_{8} ,l_{9} )|l_{1} \in OS_{user} \wedge l_{2} \in IP \wedge l_{3} \in U_{2} \wedge l_{4} \in DB_{name} \wedge \hfill \\ \, l_{5} \in L_{DM} \wedge l_{6} \in T_{DB} \wedge l_{7} \in T_{DDB} \wedge l_{8} \in Op \wedge l_{9} \in P_{name} \} , \hfill \\ \end{aligned} $$
(6)

where \( OS_{user} \) is the set of device names (host-machines of clients) from which the session was activated; \( IP \) is the set of IP addresses of the devices from which the session was activated; \( DB_{name} \) is set of database names; \( L_{DM} \) is the set of meta description lines of LDM, leading to a change in the data stored in the database; \( T_{DB} \) is the set of times when changes were made to the current database; \( T_{DDB} \) is the set of times of data output to other databases (when replication, distribution of data in a distributed system); \( Op \) is the set of statement that lead to the modification of data stored in the database (\( Op = \{ insert,delete,update\} \)); \( P_{name} \) is the set of procedure names of the LDM interpreter [18] (\( P_{name} = \)\( \{ Proc_{metadata} ,Proc_{data} \} \)).

The result of mapping this relation to the base relation of the database schema with the universal basis of relations (\( R_{L}^{sh} \in R^{sh} \)) is presented below in the form of the main lines of SQL code:

figure f

Thanks to the information stored in the log table, which is automatically formed when the corresponding parameter of the stored procedure of the LDM interpreter is specified [18], the process of recovering incorrectly changed or lost data is simplified, and the procedure for determining users, times and the nature of their changes is facilitated. In addition, the information from the log-table of the changed data can be used in distributed systems when data is propagated (replicated).

It is no secret that an audit procedure is no less important for creating a complete database security system. Actions with critical data should be logged. Therefore to monitor the status, changes introduced to the database, user actions, in addition to using standard DBMS audit tools, on the platform of which the database scheme with the universal basis of relations was implemented, special diagnostic functions, including dynamic analyzers of code coverage implemented in the LDM interpreter, capable of detecting entering incorrect data, as well as triggers that support the logging of operations performed in the database have been developed. Also, for accountability of user actions, data from the log table of the changed data, as discussed above, can be used.

Thus, solving the problem of protecting corporate databases built on the basis of the database scheme with the universal basis of relations, from possible threats, special means (in the form of implemented scheme objects) and the rules for their use were developed in the process of creating this database scheme. These means and rules are based both on common methods and tools supported by the DBMS, on the platform of which the proposed scheme is implemented, and on its own mechanisms developed within the framework of creating this scheme. The means and methods implemented in the DB schema with the universal basis of relations to ensure the security of databases are shown in a systematized form in Fig. 1.

Fig. 1.
figure 1

The means and methods implemented in the DB schema with the universal basis of relations to ensure the security of databases

3 Conclusions

  1. 1.

    To ensure the security of data in databases built on the basis of the database scheme with the universal basis of relations, an approach is proposed of the integrated use of common methods and tools supported by the DBMS, on the platform of which this scheme is implemented, as well as its own mechanisms developed in the framework of creation of the DB scheme that is invariant to SDs.

  2. 2.

    Solving the problem of protecting databases as the most important corporate resource, in the process of creating database schema invariant to subject domains, special means were developed (in the form of implemented schema objects such as triggers, procedures, packages, tables, functions) and rules of their use, ensuring: access control to schema objects; data protection and hiding of objects; data integrity support; recovery of incorrectly modified or lost data; monitoring of the state, changes introduced into the database; logging user actions.

  3. 3.

    Implemented in DB scheme with the universal basis of relations, means and methods of protection allow you to control access to data up to a specific element.

  4. 4.

    The practice of using databases built on the basis of the database scheme with the universal basis of relations, for information systems of different subject domains, in the projects of which it was required to organize reliable, safe storage, adaptation to changes occurring in SD and legislation, timely processing of data, showed that they have a sufficiently high degree of controllability of access to data, cryptographic protection of data, reliability, stability.