1 Requirements to Benchmark Software

Huppler [3] described the five most important characteristics of a good benchmark: relevant, repeatable, fair, verifiable, and economical. Peakmarks meets all these requirements.

However, other features are also crucial for customer acceptance of benchmark software:

Simplicity.

It must be easy to install the benchmark software, run the benchmark, and interpret the results. Peakmarks is implemented entirely with the tools of the database, without operating-system scripts. It therefore runs unchanged wherever the database software is available, and any DBA can manage the benchmark software without additional know-how.

Speed.

The installation, the loading of the data, the processing of the various workloads, and the evaluation of the performance key figures should all be fast. Peakmarks is installed in a few hours, including all adjustments to the database. The loading time depends on the database size and the performance of the infrastructure; on powerful systems, load rates of 4 TByte per hour have been measured. Complete benchmark runs with all workloads typically take between 12 and 24 h, and the results are available immediately. A comprehensive benchmark project can be completed within a week. This is significantly faster than many proofs of concept, which may take several weeks and whose value is limited to the tested application.
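Given a measured load rate, the duration of the load phase follows from simple arithmetic; the sketch below reuses the 4 TByte per hour figure quoted above purely as an illustration:

```python
def estimated_load_hours(db_size_tb: float, load_rate_tb_per_hour: float = 4.0) -> float:
    """Estimate the duration of the data-load phase from database size and load rate."""
    if load_rate_tb_per_hour <= 0:
        raise ValueError("load rate must be positive")
    return db_size_tb / load_rate_tb_per_hour

# A 16-TByte database at the 4 TByte/h rate loads in about 4 hours; adding a
# 12-24 h benchmark run keeps the whole exercise within a working week.
hours = estimated_load_hours(16)
```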

Different Load Situations.

Often it is not the maximum value of a performance metric that is of interest, but the optimal value. Peakmarks analyzes the performance of a database service in all load situations. A benchmark test starts with a low load and increases the load continuously until the system is saturated. In this way, the optimum performance range of a database service can be determined.
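The ramp-up principle can be sketched as follows; the job counts and the saturation criterion (a throughput gain of less than 5% per step) are illustrative assumptions, not Peakmarks internals:

```python
def find_optimal_load(measure, job_steps, gain_threshold=0.05):
    """Increase the load step by step and report the last step whose throughput
    gain over the previous step still exceeds the threshold.

    measure(jobs) -> throughput at that load level (callable supplied by the caller).
    """
    best_jobs, prev_tp = job_steps[0], measure(job_steps[0])
    for jobs in job_steps[1:]:
        tp = measure(jobs)
        if prev_tp > 0 and (tp - prev_tp) / prev_tp < gain_threshold:
            break  # system saturated: additional load no longer pays off
        best_jobs, prev_tp = jobs, tp
    return best_jobs, prev_tp

# Example with a synthetic, saturating throughput curve (queries per second):
curve = {1: 100, 2: 190, 4: 340, 8: 500, 16: 510, 32: 505}
jobs, tp = find_optimal_load(curve.get, [1, 2, 4, 8, 16, 32])
```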

More Performance Metrics.

Many benchmarks provide only a single performance metric. This greatly simplifies the comparison of different systems, but a single metric is difficult to interpret [2]. Peakmarks instead provides a set of representative and easy-to-understand metrics for different aspects of performance. Concrete performance questions can be answered more easily, and performance bottlenecks and malfunctions are detected more readily. Since several performance metrics are available, the user must decide what priority each metric receives in the decision-making process when choosing the right cloud service.

Product Specific Workloads.

When customers have to pay license fees for database software, they are interested in getting the highest performance out of their database service. That is why we have deliberately implemented workloads that can only be found on certain database products but are essential for the solution architecture. Currently, Peakmarks is available for Oracle 12.2 and upwards; porting the software to other database systems is under consideration. Peakmarks is not suitable for comparing different database products; it only serves to compare the underlying infrastructure, on-premise or in the cloud.

2 Key Performance Indicators

Representative performance indicators of database services can be used for various tasks:

Quality Assurance.

A database service is validated for its performance properties. Performance bottlenecks can be identified quickly, and the performance promises of providers are easily verified.

Evaluation.

Performance indicators are used for price/performance considerations of various database services, technologies, or configurations.

Capacity Planning.

When systems migrate to new platforms or cloud services, performance indicators help with capacity planning.

License Cost Optimization.

Our experience over many years has shown that many users can halve their license costs for the same performance solely by optimizing the infrastructure. License costs often exceed infrastructure costs by far.

3 The Architecture of Peakmarks Benchmark Software

Peakmarks is written in the procedural SQL extension of the database system, in the case of Oracle in PL/SQL. The size of the database can be configured in a range from 250 GByte up to 64 TByte per database server. Clusters with multiple servers are supported. The record length of the benchmark tables can be configured between 80 and 4,000 bytes. The redundancy of the benchmark data can be controlled via a parameter, and the data can optionally be encrypted using a further configuration parameter; all encryption methods offered by the database system are supported. The scalable loading process of the benchmark data automatically adapts its parallelism to the performance capabilities of the database platform.
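The documented configuration limits can be summarized in a small validation sketch; the parameter names are illustrative assumptions, not actual Peakmarks identifiers:

```python
# Hypothetical configuration check mirroring the documented limits; the
# parameter names are illustrative, not real Peakmarks identifiers.
LIMITS = {
    "db_size_gb": (250, 64_000),   # 250 GByte .. 64 TByte per database server
    "record_bytes": (80, 4_000),   # configurable record length of benchmark tables
}

def validate_config(cfg: dict) -> dict:
    """Reject a benchmark configuration that falls outside the supported ranges."""
    for key, (lo, hi) in LIMITS.items():
        value = cfg[key]
        if not lo <= value <= hi:
            raise ValueError(f"{key}={value} outside supported range {lo}..{hi}")
    return cfg

validate_config({"db_size_gb": 1_000, "record_bytes": 200})  # accepted
```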

A workload generator generates the database load with database jobs, and a performance monitor collects all relevant performance statistics before and after each performance test. All workloads are generated within the database, and all performance statistics also originate from the database (Fig. 1).

Fig. 1.
figure 1

Peakmarks benchmark software – on-premise and in the cloud

Peakmarks provides a library of workloads to determine the most important performance indicators of database services for:

  • Server and storage systems in database operation.

  • Critical database background processes, responsible for transaction management (log writer) and buffer management (database writer).

  • Typical database operations such as data load, data analytics and transaction processing.

  • PL/SQL application code.

4 Simple and Complex Workloads

Peakmarks distinguishes between simple and complex workloads. Simple workloads execute precisely one type of load (SQL statement). Complex workloads are hierarchically composed of different simple workloads to simulate complex load situations.

The runtime of the workloads is configurable. Runtimes between 5 and 10 min per test have proven to be representative. Let us take a closer look at some of the workloads.
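The composition principle can be pictured as a small hierarchy; the class names and SQL statements below are illustrative, not Peakmarks internals:

```python
class SimpleWorkload:
    """Exactly one type of load, e.g. a single SQL statement pattern."""
    def __init__(self, name, statement):
        self.name, self.statement = name, statement
    def statements(self):
        return [self.statement]

class ComplexWorkload:
    """Hierarchically composed of simple (or other complex) workloads."""
    def __init__(self, name, parts):
        self.name, self.parts = name, parts
    def statements(self):
        # Flatten the hierarchy into the list of simple loads it executes.
        return [s for p in self.parts for s in p.statements()]

select = SimpleWorkload("select", "SELECT ... FROM t WHERE id = :1")
update = SimpleWorkload("update", "UPDATE t SET c = :2 WHERE id = :1")
mixed  = ComplexWorkload("tp-mixed", [select, update])
```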

4.1 Server Workloads

Server workloads determine the power of a server, with its processors, main memory, and internal memory channels, in database operation. These workloads are especially crucial when license costs must be optimized. License costs are often linked to the number of processors used; in this case, the server with the highest performance per processor is sought.

Server workloads also show the efficiency of multithreading and virtualization technologies and provide hints about scalability when high numbers of sockets and cores are used (NUMA effects). If database encryption is selected, its impact on the overall database performance can also be determined.

All server workloads access tables via SQL with different access patterns. The affected tables are fixed in the buffer cache. There are almost no I/O operations, so these workloads are entirely CPU-bound.

The essential primary performance metrics are queries per second (qps), the response time, and the scan rate of queries (memory bandwidth). A secondary performance metric is the number of logical reads per second (Table 1).

Table 1. Peakmarks workloads to determine server performance.

The benchmark report in Fig. 2 shows the performance of a database service for workload SRV-LIGHT. This benchmark comprises 5 tests. The second column shows the workload name. The column Nodes indicates how many cluster nodes are used in the test. The column Jobs gives the number of processes that generate the load. The next 4 columns show the percentage CPU load in the different CPU modes. The column Queries total gives the total number of queries processed per second, and the column Queries per cpu shows the performance per involved processor; this information is important for license cost considerations. The columns Logical reads total and Logical reads per cpu are the corresponding metrics for database accesses in the buffer cache. The column BuCache read displays the hit rate of all read accesses in the buffer cache and is only used to verify that the workload was processed optimally.

Fig. 2.
figure 2

Benchmark report for a server system workload.

It is easy to see how, as the load increases, the response time also increases while the number of queries per CPU decreases. The short response times of less than 30 μs show the exceptional efficiency of database queries when all data resides in the buffer cache. It is also noticeable that throughput and CPU utilization no longer correlate above 50% utilization, a typical characteristic of some processor architectures when multithreading is enabled.
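The per-processor columns of such a report are derived metrics. Assuming they are simply the totals divided by the number of processors involved (an assumption for illustration, not a documented Peakmarks formula), they can be computed as:

```python
def per_cpu_metrics(queries_total: float, logical_reads_total: float, cpus: int):
    """Derive the per-processor report columns from the totals of one report row."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return {
        "queries_per_cpu": queries_total / cpus,
        "logical_reads_per_cpu": logical_reads_total / cpus,
    }

# Per-CPU throughput is what matters when license costs scale with processor count.
row = per_cpu_metrics(queries_total=400_000, logical_reads_total=1_600_000, cpus=8)
```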

4.2 Storage Workloads

Conventional I/O benchmark tools such as vdbench, iometer, or Orion often report performance values that are not achieved in real database operation. The reason is the complexity of database I/O operations.

If a data block is read, the buffer cache management of the database has to perform many tasks: (a) find a free slot for the block; (b) if there is no free slot, replace older blocks; (c) synchronize all database processes that simultaneously compete for free slots in the buffer cache; (d) if a shared-disk cluster architecture is used, perform this synchronization cluster-wide; and (e) finally, check blocks for integrity and consistency during the I/O transfer.
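The tasks above can be sketched as a toy model; this is a didactic LRU cache in Python, not Oracle's actual buffer cache implementation, and the process synchronization of steps (c) and (d) is omitted:

```python
from collections import OrderedDict

class BufferCache:
    """Toy LRU buffer cache illustrating steps (a), (b), and (e).
    Steps (c)/(d), the (cluster-wide) process synchronization, are omitted."""

    def __init__(self, slots: int):
        self.slots = slots
        self.cache = OrderedDict()   # block_id -> block data, in LRU order

    def read_block(self, block_id, read_from_disk, check_integrity):
        if block_id in self.cache:            # cache hit: no I/O at all
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        if len(self.cache) >= self.slots:     # (b) no free slot: evict oldest block
            self.cache.popitem(last=False)
        block = read_from_disk(block_id)      # physical I/O
        if not check_integrity(block):        # (e) verify block during transfer
            raise IOError(f"corrupt block {block_id}")
        self.cache[block_id] = block          # (a) place block in a free slot
        return block
```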

Peakmarks, therefore, generates I/O load with so-called SQL-generated I/O operations to obtain representative performance metrics for the storage system (Table 2).

Table 2. Peakmarks workloads to determine storage performance.

Storage workloads show the efficiency of the I/O stack (I/O scheduler, queues, multipathing, virtualization), of the technologies used (HDD, SSD, Flash, SAS or PCI, NVMe, etc.), and of storage-specific functionalities (deduplication, compression, encryption, snapshots, mirroring, SQL offloading).

The benchmark report in Fig. 3 shows the performance of a database service for workload STO-RANDOM with 100% read operations.

Fig. 3.
figure 3

Benchmark report for random read storage system workload.

This report shows the difference between the maximum and the optimal performance range. The storage system can read over 800,000 random single database blocks per second (dbps), but only at a service time of just under one millisecond (Test 13). An all-flash storage system is used in this case study, and we expect a service time of less than 500 microseconds per single database block read for this storage technology. The optimal performance is therefore more like 750,000 dbps (Test 12). Higher values are possible, but only at the price of sharply increasing service times. It is advisable to keep the storage utilization below this value.
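Selecting the optimal rather than the maximum operating point can be expressed as a simple rule; the 500-microsecond limit is the service-time expectation for all-flash storage stated above:

```python
def optimal_throughput(results, service_time_limit_us=500):
    """Pick the highest throughput among tests whose service time stays within
    the target, as opposed to the overall maximum.

    results: list of (throughput_dbps, service_time_us) tuples.
    """
    within = [r for r in results if r[1] <= service_time_limit_us]
    return max(within) if within else None

# Illustrative test points modeled on the report discussed above:
tests = [(500_000, 250), (750_000, 450), (820_000, 980)]
best = optimal_throughput(tests)   # the 820k dbps point is rejected: too slow
```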

4.3 Data Load Workloads

System architects and capacity planners need performance metrics from database services regarding their ability to load data. This is particularly important for Data Warehouse and Data Analytics systems, where data volumes are constantly growing while the time available for loading shrinks.

Oracle provides different technologies for loading data: conventional loading via buffer cache and direct loading bypassing the buffer cache. Peakmarks provides workloads for both data loading techniques (Table 3).

Table 3. Peakmarks workloads to determine data load performance.

The benchmark report in Fig. 4 shows the performance of a database service for both data load workloads. The key performance metric for data load is the amount of data that can be loaded within a certain timeframe, shown in column Loaded user data.

Fig. 4.
figure 4

Benchmark report for data load workloads.

This case study was run on an Oracle Engineered System which uses flash cache technology. The buffered load generates more REDO data. The direct load in workload DL-DIRECT provides much higher throughput in data load. In the last test of each workload, the load is doubled but distributed over two database servers. In both cases, the system scales well.
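The scaling observation can be quantified with the usual speedup-over-ideal-speedup ratio; the throughput numbers in the example are illustrative, not taken from the report:

```python
def scaling_efficiency(throughput_1_node: float, throughput_n_nodes: float, nodes: int) -> float:
    """Measured speedup divided by ideal speedup; 1.0 means perfect scaling."""
    if throughput_1_node <= 0 or nodes <= 0:
        raise ValueError("throughput and node count must be positive")
    return (throughput_n_nodes / throughput_1_node) / nodes

# E.g. if one server loads 4.0 TByte/h and two servers together load 7.6 TByte/h,
# the system retains 95% of ideal scaling:
eff = scaling_efficiency(4.0, 7.6, 2)
```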

4.4 Data Analytic Workloads

System architects and capacity planners require performance metrics from database services regarding their ability to search for data. Data analytics applications are typically based on “full table scan” operations. The performance of “full table scans” depends on the position of the data in the storage hierarchy and on the technology used to boost scanning performance.

Peakmarks provides workloads to test different data locations (storage, memory) and to test different boost technologies (smart scan, in-memory column store) (Table 4).

Table 4. Peakmarks workloads to determine data analytics performance.

The benchmark report in Fig. 5 shows the performance of a database service for data analytic workloads. The key performance metric for data analytics is the amount of data that can be scanned within a certain timeframe in column Scanned user data.

Fig. 5.
figure 5

Benchmark report for data analytics workloads using storage.

“Full table scans” cause sequential storage reads on the storage system and are usually limited by the bandwidth between the storage system and server system, in this case around 6 GBps (test 34). When two database servers request sequential reads, the storage system scales well (2 cluster nodes are used in test 35 and test 42).

Tests 36 to 42 show the performance when SQL offload technology can be used. Even a single database server can then exploit the full performance capability of the storage system, which is several times higher than that of a conventional storage system. However, this technology requires specialized hardware and software (Oracle Engineered System).

The benchmark report in Fig. 6 shows the same workload, but with data stored in main memory using the row store (tests 43–51) or the column store (tests 52–60). The results differ by large factors and allow a fair comparison of the technologies when calculating the price/performance ratio of each data analytics solution.

Fig. 6.
figure 6

Benchmark report for data analytics workloads using main memory.
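The by-factors performance differences between data locations can be turned into rough scan-time estimates. Only the 6 GByte/s conventional-storage figure below comes from the measurements in this paper; the other per-tier scan rates are assumptions for illustration:

```python
# Illustrative scan rates in GByte/s per data location / boost technology.
# Only the 6.0 figure is from this paper; the rest are assumed for the example.
SCAN_RATE_GBPS = {
    "storage": 6.0,                 # conventional sequential read bandwidth
    "storage+smart-scan": 25.0,     # SQL offload (assumed)
    "memory-row-store": 15.0,       # assumed
    "memory-column-store": 60.0,    # assumed
}

def scan_seconds(table_gb: float, location: str) -> float:
    """Time to full-scan a table of the given size at the tier's scan rate."""
    return table_gb / SCAN_RATE_GBPS[location]

t_storage = scan_seconds(600, "storage")              # 100 s from disk
t_column  = scan_seconds(600, "memory-column-store")  # 10 s in the column store
```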

4.5 Transaction Processing Workloads

System architects and capacity planners need performance metrics from database services regarding their ability to run typical transaction processing applications. Peakmarks provides transaction processing workloads of varying complexity (light, medium, and heavy) (Table 5).

Table 5. Peakmarks workloads to determine transaction processing performance.

The following benchmark report in Fig. 7 shows the performance of a database service for the workloads TP-LIGHT and TP-HEAVY with an 80/20 select/update ratio. The percentage of update transactions can be configured from 0% to 100% in 10% steps (column Upd).

Fig. 7.
figure 7

Benchmark report for transaction processing, 20% updates.

The key performance metrics for these workloads are transactions per second (column Trans total) and the response time (column Resp time).

The performance of these workloads depends on various factors, including the ratio of database size to buffer cache size. The higher the hit rate of the buffer cache (column BuCache read), the higher the transaction rate and the lower the response time of the transactions (column Resp time). This is particularly true for low update rates, where the proportion of write operations is low relative to the number of read operations.
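This dependency can be made concrete with the textbook effective-access-time approximation (not a Peakmarks formula); the 30 μs cache and 500 μs storage figures reuse values mentioned earlier in this paper:

```python
def effective_read_time_us(hit_rate: float, cache_us: float = 30.0, disk_us: float = 500.0) -> float:
    """Average read time as a hit-rate-weighted mix of buffer-cache and storage reads.

    Defaults reuse figures quoted earlier in this paper (sub-30 us cached queries,
    sub-500 us flash reads); real systems vary.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit rate must be between 0 and 1")
    return hit_rate * cache_us + (1.0 - hit_rate) * disk_us

# Raising the hit rate from 90% to 99% roughly halves the average read time:
t99 = effective_read_time_us(0.99)
t90 = effective_read_time_us(0.90)
```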

The following benchmark report in Fig. 8 shows the performance of a database service for the workload “TP-MIXED” with a 20% update share. This complex workload is similar to a TPC-C workload, in which query, update, and insert transactions are processed simultaneously.

Fig. 8.
figure 8

Benchmark report for mixed transaction processing, 20% updates.

4.6 PL/SQL Application Performance

PL/SQL is the preferred programming language for complex transaction logic and algorithms. PL/SQL code is stored in the database server. Some large applications, e.g., core banking systems, are entirely implemented in PL/SQL.

Peakmarks provides workloads to test PL/SQL code efficiency on a particular processor. These workloads are entirely CPU-bound. The key performance metrics for PL/SQL application performance are the number of executed PL/SQL operations within a certain timeframe and the execution time of PL/SQL algorithms (Table 6).

Table 6. Peakmarks workloads to determine PL/SQL application performance.

The following benchmark report in Fig. 9 shows the performance of a database service with the workload “PLS_MIXED” with different numerical datatypes.

Fig. 9.
figure 9

Benchmark report for PL/SQL code with mixed datatypes.

4.7 Database Service Processes

In the case of Oracle, the performance of the log writer background process is critical. It is responsible for transaction logging and database recovery after system failures. The latency of transaction logging can have a significant impact on the response time of user transactions.

Optionally, the log writer is also used for database replication to synchronize standby databases. This technology is very popular for disaster recovery solutions. Replication can take place in synchronous or asynchronous mode. The data transfer between primary and standby databases can optionally be encrypted and/or compressed. With synchronous replication, local transactions have to wait until the standby databases have also applied the transaction log. This may delay local transaction processing considerably.
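The delay introduced by synchronous replication can be approximated with a simple additive latency model; this is a sketch, and the component figures in the example are assumed, not measured:

```python
def commit_latency_ms(local_log_write_ms: float, network_rtt_ms: float = 0.0,
                      standby_apply_ms: float = 0.0, synchronous: bool = False) -> float:
    """In synchronous mode a local commit also waits for the round trip to the
    standby and its log apply; in asynchronous mode it waits only for the local
    log write."""
    if synchronous:
        return local_log_write_ms + network_rtt_ms + standby_apply_ms
    return local_log_write_ms

# Assumed components: 0.5 ms local log write, 2.0 ms network round trip,
# 0.5 ms standby log apply.
async_ms = commit_latency_ms(0.5)
sync_ms = commit_latency_ms(0.5, network_rtt_ms=2.0,
                            standby_apply_ms=0.5, synchronous=True)
```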

To analyze the performance behavior of the log writer process in all possible situations, Peakmarks offers two different workloads: one analyzes log writer latency, the other log writer throughput (Table 7).

Table 7. Peakmarks workloads to determine log writer performance.

The following benchmark report in Fig. 10 shows the performance of a database service with the workload “LGWR-LAT” and different transaction sizes (column TX size) of 1 and 25 rows per transaction.

Fig. 10.
figure 10

Benchmark report for log writer latency.

The following benchmark report in Fig. 11 shows the performance of a database service with the workload “LGWR-THR” with large transactions.

Fig. 11.
figure 11

Benchmark report for log writer throughput.

4.8 Order in Which Workloads Are Executed

We run the workloads in a logical order and start with the server and storage workloads. For example, if the storage workloads do not perform satisfactorily, other workloads that are heavily dependent on storage performance will also deliver disappointing results.

The following order has proven workable:

  • Workloads for Server Systems.

  • Workloads for Storage Systems.

  • Workloads for Database Background Processes.

  • Workloads for Data Load.

  • Workloads for Data Analytics.

  • Workloads for Transaction Processing.

  • Workloads for PL/SQL application programs.

5 Case Study

The Peakmarks benchmark software offers a fast and comprehensive performance analysis of Database Cloud Services. The results are understandable key performance metrics for representative database operations and provide a reliable foundation for price/performance comparisons and capacity planning.

Here is a summary of the performance metrics of a database service with 8 processors, 32 GByte main memory, flash storage, and a 250 GByte database. The min/max values describe the system behavior across all load situations (best case, worst case).

The whole benchmark took less than 24 h. The customer selected the workloads that were important to him, and the workload parameters were chosen to best reflect the customer’s current environment (Tables 8, 9, 10, 11, 12, 13 and 14).

Table 8. Peakmarks key performance metrics for server component.
Table 9. Peakmarks key performance metrics for storage component.
Table 10. Peakmarks key performance metrics for data load.
Table 11. Peakmarks key performance metrics for data analytics.
Table 12. Peakmarks key performance metrics for transaction processing (80% read, 20% update).
Table 13. Peakmarks key performance metrics for PL/SQL application code.
Table 14. Peakmarks key performance metrics for database service processes.