Analysis and Prevention of Code-Injection Attacks on Android OS

University of South Florida

Digital Commons @ University of South Florida

USF Tampa Graduate Theses and Dissertations

10-22-2014


Grant Joseph Smith

University of South Florida


Follow this and additional works at: https://digitalcommons.usf.edu/etd

Part of the Computer Engineering Commons

Scholar Commons Citation

Smith, Grant Joseph, "Analysis and Prevention of Code-Injection Attacks on Android OS" (2014). USF Tampa Graduate Theses and Dissertations.

https://digitalcommons.usf.edu/etd/5391

This Thesis is brought to you for free and open access by the USF Graduate Theses and Dissertations at Digital

Commons @ University of South Florida. It has been accepted for inclusion in USF Tampa Graduate Theses and

Dissertations by an authorized administrator of Digital Commons @ University of South Florida. For more

information, please contact [email protected].

Analysis and Prevention of Code-Injection Attacks on Android OS

Grant J. Smith

A thesis submitted in partial fulﬁllment

of the requirements for the degree of

Master of Science in Computer Science

Department of Computer Science and Engineering

College of Engineering

University of South Florida

Major Professor: Jay Ligatti, Ph.D.

Yao Liu, Ph.D.

Hao Zheng, Ph.D.

Date of Approval:

October 22, 2014

Keywords: Injection Attacks, BroNIEs, Formal Deﬁnitions, SQL, OS Shell

© 2014, Grant J. Smith

ACKNOWLEDGMENTS

I would like to thank Dr. Jay Ligatti for being an inexhaustibly helpful, constructive, and

optimistic advisor. I would also like to thank Clayton Whitelaw for his help in reviewing and

proofreading this thesis. Finally, I would like to thank all the rest of my friends and family, who

have been very supportive of and interested in my work.

TABLE OF CONTENTS

LIST OF FIGURES
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 OS Shell Injection
1.2 SQLite Injection
1.3 Summary of Contributions
1.4 Thesis Outline
CHAPTER 2 RELATED WORK
2.1 General Android Security
2.2 Theory of Injection Attacks
2.3 Taint Tracking
CHAPTER 3 ATTACK SURFACES
3.1 SQLite
3.2 OS Shell
3.3 Possible Attacks into OS Shell
CHAPTER 4 A REVIEW OF BRONIES
4.1 Template String
4.2 Token Expansion
4.3 Token Classification
4.4 The NIE Property
4.5 Stopping a BroNIE
CHAPTER 5 PROTECTION MECHANISM IMPLEMENTATION
5.1 Taint Tracker
5.2 Weaving with AspectJ
5.3 Detecting BroNIE’s
CHAPTER 6 CONCLUSION
LIST OF REFERENCES

LIST OF FIGURES

Figure 1.1 Java code for an Android application vulnerable to shell injection
Figure 1.2 Vulnerable Java code.
Figure 1.3 Code for an application that uses Android’s SQL injection security mechanism
Figure 3.1 Code for our timing test application
Figure 3.2 A graph showing timing data gathered from the test application.
Figure 5.1 Code for the taint tracking module we have added to the Java libraries
Figure 5.2 AspectJ code to call the taint tracker from application code
Figure 5.3 Algorithm to detect BroNIEs using a template string and a program string

ABSTRACT

Injection attacks account for the top two causes of software errors and vulnerabilities, according to the

MITRE Common Vulnerabilities list [1]. This thesis presents a threat analysis of injection attacks

on applications built for Android, a popular but not rigorously studied operating system designed

for mobile devices. The following thesis is argued: Injection attacks are possible on oﬀ-the-shelf

Android systems, and such attacks have the capacity to compromise the device through resource

denial and leaking private data. Speciﬁcally, we demonstrate that injection attacks are possible

through the OS shell and through the SQLite API. To mitigate these attacks, we augment the

Android OS with a taint-tracking mechanism to monitor the ﬂow of untrusted character strings

through application execution. We use this taint information to implement a mechanism to detect

and prevent these injection attacks. Because a good definition of an attack is critical to preventing

it, our mechanism is based on Ray and Ligatti’s formalized “NIE” property, which states that

untrusted inputs must only insert or expand noncode tokens in output programs. If this property

is violated, an injection attack has occurred. This definition’s detection algorithm, in combination

with our taint tracker, allows our mechanism to defend against these attacks.


CHAPTER 1

INTRODUCTION

As of January 2014, a study has shown that 58% of American adults own and use a smartphone [2]. The increasing ubiquity of this new computing platform has also opened a new battleground in computer security. Smartphones are filled with private user data such as contact databases, call histories, camera images, and the contents of any mounted SD card. If this private data is compromised by an attacker, it could have high value to advertisers, spammers, and stalkers. Phones provided by the workplace may also store confidential company information, making it even more critical to secure these devices against software vulnerabilities. A leak of sensitive company data is a constant threat: 33% of IT professionals said that USB storage devices are the most vulnerable points for data leakage on a secure network [3]. Android phones, with their USB connectivity and SD card mounting capability, fall into this category.

Android OS is an open-source mobile device operating system [4] built on the Linux kernel.

It implements a security system modeled on Linux, with several modiﬁcations. Android restricts

at install time what an application may access on the system [5, 6]. Applications declare in their

package manifest which permissions they require to run properly and these permissions are displayed

to the user when downloading an application from the Google Play store or installing an application

from any other source. The user may then choose to install the application and grant the required

permissions, or abort the install. The user cannot force an application to run with fewer permissions

than it has requested in oﬀ-the-shelf distributions of Android. There is no monitoring of how an

application uses data or services; permissions determine only whether an application may use

certain data or services.

This lack of ﬁne-grained runtime monitoring makes Android vulnerable to injection attacks.

This thesis will argue that such attacks are possible on Android OS, through both the OS shell and

SQLite API, and that these attacks are able to compromise the device.

    // Obtain a process running an OS shell
    Process proc = Runtime.getRuntime().exec("sh -i");
    // Obtain a stream to send commands to that process
    OutputStream programInput = proc.getOutputStream();
    // Obtain some untrusted input
    String userinput = getUntrustedInput();
    // Create a command with untrusted input, and execute the command
    programInput.write(("ping -c 4 " + userinput + "\n").getBytes());
    // Flush so that the buffered command actually reaches the shell
    programInput.flush();

Figure 1.1: Java code for an Android application vulnerable to shell injection

1.1 OS Shell Injection

Consider an example application that uses user input in a simple shell program to “ping” a remote host specified by the user. Application code for this functionality might look like the code in Figure 1.1. This program uses the Runtime class and its method .exec() to obtain a shell running inside an instance of the Process class called proc, and obtains an OutputStream, programInput, to that Process. It then writes to programInput a shell program that pings the remote host specified by user input with four packets. The newline character at the end of the program instructs the shell to execute the command given by the string preceding the newline.

Unfortunately, this application accepts input from an untrusted source. A malicious attacker controlling that source could perform a code injection attack on the program that the application outputs to the shell. Consider the following possible input strings. Each item gives the text input by the attacker first, followed by the program that the application outputs as a result.

1. www.remotehostname.com" && rm -rf "data", to make the application output the program ping -c 4 "www.remotehostname.com" && rm -rf "data" \n. The && symbol causes the shell to execute the next command only if the previous command returned successfully. The first command will ping some supplied remote host name, as expected. The second command begins with rm, the command to delete directories or files from the filesystem. The first argument to rm is -rf, meaning that the program will remove subdirectories and files inside the specified directory recursively without warning the user. The second argument to rm is data, which will be used as the name of the directory to delete. If data is a location containing some crucial files, this injection attack could be very damaging.

2. www.remotehostname.com && f() { f|f& } && f \n #", to make the application output the program ping -c 4 "www.remotehostname.com" && f() { f|f& } && f \n #" \n. This program inserts a subprogram via the && symbol in the same fashion as before. The second program this time begins with the creation of a function f that recursively executes itself indefinitely and maintains a pipe from each parent process to each child process, to prevent the parent from terminating. It then executes this function f. The final # character is used to start a line comment and eliminate the unused closing quotes from the application code. The result of this program is a full process table, preventing the operating system from creating any new processes. To fix the resource denial, the user is required to wipe volatile memory, which might be accomplished by removing the phone’s battery. For mobile devices that have non-removable batteries, the user might need to wait until the device runs out of battery naturally. This attack is colloquially known as a “forkbomb”.

In practice, these vulnerabilities might be used in the following way. The user decides to install

some application that they trust. The user gives the application permission to access a protected

resource—the SD card, say. The trusted application, unbeknownst to the user, has an injection

vulnerability like the one above and takes input from an untrusted source. An attacker could then

inject into this application and take advantage of its permission to access the SD card to read, write,

and modify ﬁles. If the target phone was holding sensitive data, the results could be disastrous.
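The string construction behind both attacks can be sketched without executing anything. The following is a minimal illustration, not the application from Figure 1.1: the class name ShellInjectionDemo and the methods buildPingProgram and containsControlOperator are hypothetical, and the substring check is only a crude way to show that the untrusted input contributed shell operators. It is not the detection mechanism developed in this thesis.

```java
class ShellInjectionDemo {
    // Mirrors the concatenation in Figure 1.1; nothing is sent to a shell here.
    static String buildPingProgram(String untrustedInput) {
        return "ping -c 4 " + untrustedInput + "\n";
    }

    // Crude illustration only: reports whether the input contains a shell
    // control operator. A real defense must lex the whole output program.
    static boolean containsControlOperator(String untrustedInput) {
        return untrustedInput.contains("&&")
                || untrustedInput.contains(";")
                || untrustedInput.contains("|");
    }

    public static void main(String[] args) {
        String benign = "www.remotehostname.com";
        String malicious = "www.remotehostname.com && rm -rf data";

        // The benign input yields a single ping command.
        System.out.print(buildPingProgram(benign));
        // The malicious input yields a ping command followed by an rm command.
        System.out.print(buildPingProgram(malicious));

        System.out.println(containsControlOperator(benign));
        System.out.println(containsControlOperator(malicious));
    }
}
```

The point of the sketch is that the application never distinguishes the characters it wrote from the characters the attacker supplied; both end up in one string that the shell interprets as a whole.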

1.2 SQLite Injection

The Android OS ships a SQLite API with its development kit. SQL injection vulnerabilities are

well studied and are the single most common vulnerability in software, according to the MITRE

Common Vulnerabilities List [1]. A SQLite database in Android is not protected by a permission

restriction in the application manifest but does have several countermeasures against injection.

The countermeasures take the form of wrapper methods to access the database [7]. These wrapper

methods all have overloads that utilize prepared statements. Prepared statements are a common

    // Construct a SQLite query string
    String query =
        // Match rows where KEYVALUNAME is equal to untrusteduname and
        KEYVALUNAME + " = '" + untrusteduname + "' AND " +
        // rows where KEYVALPASS is equal to untrustedpass
        KEYVALPASS + " = '" + untrustedpass + "'";
    // Store results from the query into a cursor
    Cursor cursor = sqldatabase.query(
        // Pass to the query the table to search, the columns to return
        TABLEVALDEMO, new String[] { KEYVALID, KEYVALUNAME, KEYVALPASS },
        // and our query string. Other optional args are null.
        query, null, null, null, null, null);
    // If any match was found, return true, else return false
    return cursor.moveToFirst(); }

Figure 1.2: Vulnerable Java code.

SQL injection prevention mechanism that “prepares” a SQL query before concatenating in user

input [8]. The untrusted strings are then converted to equivalent values and added to the query.

We call this behavior “string binding.” Closed values are noncode, so this protection mechanism

fully defends against injection attacks if used properly. However, applications do not automatically

implement this mechanism. The onus is on the developer to make use of these overloads. Developers

who are unfamiliar with the dangers of injection attacks might not know that they need to program

their applications securely.

An example of such a poorly coded application might look like Figure 1.2. The program is querying a SQLite database with columns for primarykey, username, and password for the existence of a particular username and password in the database, attempting to authenticate some user. It constructs a query string that implements this behavior. Note that SQLite requires that string literals in queries be enclosed with single quote characters. The single quote characters to enclose untrusteduname and untrustedpass are included in the application code, before and after concatenating in the string variables. The query string is then used as one of the arguments to Android’s SQLite API method .query(). The code also passes to the .query() method the name of the table to be queried and which columns of information to return in the results. Other arguments, including how to sort the results and how to group the results, are left as null. The query method returns an instance of the Cursor class, a datatype that contains the results of the query. The code should then return true if it has found a match to the username/password combination in the database and false otherwise. The .moveToFirst() method of the Cursor class implements exactly that behavior, returning false if the Cursor is empty, and true otherwise. This is a standard example of how a SQLite database might be queried in an Android application. If the source of untrusteduname and untrustedpass is controlled by an attacker, an injection attack could be executed on this application code.

Consider the following example. Suppose that our example database from Figure 1.2 contains only one entry, the entry for the administrator account. The username for the administrator account is “admin” and the password is “adminpass”. According to our specifications, the code in Figure 1.2 should only return true if untrusteduname and untrustedpass each match the credentials on some row in the database. A malicious attacker attempts to authenticate to the application with username admin and password ' OR '1'='1. This results in the query String query = KEYVALUNAME + " = '" + "admin" + "' AND " + KEYVALPASS + " = '" + "' OR '1'='1" + "'".

Our example database should reject this authentication attempt. The values passed into the

.authenticate() method by the attacker do not match the values for the user account stored in

the database.

This application code does not reject the authentication attempt as desired. The username is matched as normal but the password provided is not tokenized as a string literal. Instead, the first single quote in the untrusted input closes the string literal that should have contained the password to match against the password field in the database. The untrusted input then injects an extra OR token and an equality expression '1'='1. Recall that SQLite also requires that literals be enclosed with single quotes. The single quote to the right-hand side of the final number 1 is left absent in the user-provided string because the trusted query string has one already: the single quote that would have closed the password string literal. This injection results in the SQL query matching all rows where the column KEYVALUNAME is equal to “admin” and the column KEYVALPASS is equal to “”, or where “1 = 1”. The equation “1 = 1” is, of course, a tautology, resulting in

    // Construct the query string, marking user input locations with ?'s.
    String query = KEYVALLOGIN + "=?" + " AND " + KEYVALPASS + "=?";
    // Pass to the query method the table to search,
    Cursor cursor = db.query(TABLEVALDEMO,
        // the columns to return, the query string
        new String[] { KEYVALID, KEYVALLOGIN, KEYVALPASS }, query,
        // and String array containing untrusted input to be bound as literals
        new String[] { untrusteduname, untrustedpass }, null, null, null, null);

Figure 1.3: Code for an application that uses Android’s SQL injection security mechanism

the query matching every row in the database. A row is returned, so the conditional at the end

of the code returns true and the malicious user successfully authenticates to the database, despite

not providing a valid username/password combination.
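The effect of the concatenation can be seen by printing the selection string for a legitimate login and for the attack. This is a minimal sketch in plain Java with no Android dependency; the class name SqlConcatDemo, the method buildSelection, and the column names "username" and "password" are assumptions standing in for the constants of Figure 1.2.

```java
class SqlConcatDemo {
    // Hypothetical column names standing in for the constants in Figure 1.2.
    static final String KEYVALUNAME = "username";
    static final String KEYVALPASS = "password";

    // Same concatenation as Figure 1.2: the quote characters come from the
    // application, while the content between them comes from the user.
    static String buildSelection(String untrusteduname, String untrustedpass) {
        return KEYVALUNAME + " = '" + untrusteduname + "' AND "
                + KEYVALPASS + " = '" + untrustedpass + "'";
    }

    public static void main(String[] args) {
        // prints: username = 'admin' AND password = 'adminpass'
        System.out.println(buildSelection("admin", "adminpass"));
        // prints: username = 'admin' AND password = '' OR '1'='1'
        System.out.println(buildSelection("admin", "' OR '1'='1"));
    }
}
```

Because AND binds more tightly than OR in SQL, the second selection is evaluated as (username = 'admin' AND password = '') OR '1'='1', which is true for every row.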

The injection attack described above was made possible by a poorly programmed application. The code of an equivalent application that takes advantage of Android’s protection mechanisms would look like the code in Figure 1.3. This application code makes use of a prepared statement. The programmer can use a prepared statement by marking areas where untrusted input will be used in the query with ‘?’ characters. The example code does this for both the untrusteduname and untrustedpass strings. Then, when the SQLite API method is called with our query string, the fourth argument is used to pass in a String array containing the untrusted strings. It is then the task of the API method to insert the untrusted strings into the query and ensure that they are bound as literals, meaning that the untrusted strings will be tokenized as literals only. The new secure application, instead of searching for rows for which either the password value is the empty string or the equation “1 = 1” is true, will now search for rows for which the password value is ' OR '1'='1. Since there is no row with that password in our example database, the query returns no rows, the conditional statement correctly returns false, and the injection attack is prevented.

This mechanism for preventing SQL injection attacks is present in the oﬀ-the-shelf Android OS

but is not automatic for all databases. The onus is on the programmer to understand why these

overloads are important and to use them properly in their code to secure their application against

injection attacks.
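One way to visualize what “bound as literals” means is to render the query that is effectively evaluated once each ‘?’ has received its value. The sketch below is a textual illustration only, under loudly stated assumptions: real SQLite binding passes the values separately from the SQL text and does not rewrite the query string, and the names BindingIllustration, asLiteral, and render are hypothetical. The rendering relies on the SQL convention that a single quote inside a string literal is written as two single quotes.

```java
class BindingIllustration {
    // Renders a value as ONE SQL string literal by doubling embedded quotes.
    static String asLiteral(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Illustration only (not how SQLite binds): substitutes each '?' in
    // order with the literal form of the corresponding argument.
    static String render(String selection, String[] selectionArgs) {
        StringBuilder out = new StringBuilder();
        int next = 0;
        for (char c : selection.toCharArray()) {
            if (c == '?' && next < selectionArgs.length) {
                out.append(asLiteral(selectionArgs[next++]));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String selection = "login=? AND pass=?";
        // prints: login='admin' AND pass=''' OR ''1''=''1'
        System.out.println(render(selection, new String[] { "admin", "' OR '1'='1" }));
    }
}
```

In the rendered query the attacker's text sits entirely inside one string literal, so no OR token is introduced and the comparison is against the literal password value ' OR '1'='1.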

So far as we have been able to determine, there is no reason why an application developer would not want to use the provided SQLite security mechanisms, if he or she were aware of them. If there is no penalty for using the protection mechanism, why is SQL injection still a problem? The documentation for the SQLiteDatabase class describes the argument used to implement prepared statements as follows: “You may include ?’s in selection, which will be replaced by the values from selectionArgs, in order that they appear in the selection. The values will be bound as Strings.” [7] This explanation makes no mention of what string binding is or why it is important to protect the application from injection attacks. The lack of detail might lead a developer unaware of the dangers of injection attacks to forgo using the security mechanism, leaving their code vulnerable. In addition, prepared statements have been a feature of the Android OS since API level one, the initial release of Android in 2008. Despite the existence of these prepared statements, SQL injection continues to be a problem. Clearly, many developers either do not understand these security features or simply neglect to use them.

1.3 Summary of Contributions

This thesis has been written on the results of a collaborative research project. It is important

to note which parts of this thesis are my work alone, which parts are the work of my peers, and

which parts are a collaborative eﬀort.

I am exclusively responsible for the taint tracking module of the protection mechanism and the

conception of the original idea of the ping drain and card wipe exploits.

I worked jointly with Clayton Whitelaw on the detection portion of the protection mechanism

and on the survey of existing Android applications.

I worked jointly with Hebron George, Ishan Mitra, and Clayton Whitelaw on the survey of the

capabilities of the Android shell.

The original idea for this research into injection attacks on Android was proposed by Jay

Ligatti and Donald Ray. The formal deﬁnition of the NIE property on which this work is based is

attributed to them as well [9].

1.4 Thesis Outline

This thesis makes two high-level contributions. First, we analyze the eﬀectiveness of injection

attacks on Android OS. Included in this analysis is a conﬁrmation that such attacks are possible

on both the OS shell and the SQLite database, through the APIs provided by Android. We

also demonstrate that these attacks have the capacity to compromise the target device and discuss

which types of attacks may be executed depending on which permissions the target application

has obtained. Finally, we will determine whether such vulnerabilities are present in applications

currently released on Google Play and whether application developers are likely to include these

vulnerabilities in their applications.

Our second contribution is the development and implementation of an extension to the Java

libraries used by Android OS that attempts to prevent injection attacks. Any mechanism that

attempts to precisely defend against injection attacks must precisely deﬁne an injection attack.

This thesis uses the definition developed by Ray and Ligatti [9, 10], which stipulates that

for all applications that generate output programs based on untrusted input, the untrusted input

must only cause the insertion or expansion of noncode tokens. Our extension to the Android OS

libraries is itself in two parts. The ﬁrst part consists of a proof of concept taint tracking module,

which monitors the ﬂow of untrusted input through an application’s execution. The second part

is a detection module, which contains an implementation of the detection algorithm described in

Ray and Ligatti’s work [9]. Components of this implementation include a classiﬁcation of SQL

and OS shell tokens into code and noncode, implementations of SQLite and OS shell lexers, and

an implementation of the detection algorithm itself. Finally, we present data on the performance

overhead of our implementation and its eﬀectiveness in preventing code injection.

This thesis is organized as follows:

• Chapter 2 discusses previous work on Android security, taint tracking, and the definition of and defense against code injection attacks.

• Chapter 3 details the implementation of SQLite and the OS shell in Android. We cover how these two applications are accessed, what protections exist for them, how injection into them occurs, and why these protections might be ignored by an application developer. We also demonstrate several example attacks on these applications.

• Chapter 4 discusses the formal definition of a BroNIE, the definition of an injection attack used by this thesis.

• Chapter 5 covers the development of our prevention mechanism, detailing the implementation methods and demonstrating its effectiveness.

• Chapter 6 discusses our results, summarizes runtime overhead data, and concludes.

CHAPTER 2

RELATED WORK

This chapter covers other research into topics related to this thesis. Although Android has not been

rigorously studied, work has been done on the security of the Android OS, with a large focus on its mechanism

of granting permissions to applications at install time. Previous work in the area of formalizing a

deﬁnition of injection attacks, another focus of this thesis, is cited and compiled here but discussed

in full detail in Chapter 4. Taint tracking as a general topic has received a signiﬁcant amount of

study, due to its usefulness in defense against injection attacks. Research into taint tracking on

Android has been focused more on its ability to implement runtime data ﬂow policies, a feature not

present in the off-the-shelf Android operating system. These runtime dataflow policies are useful to

monitor how sensitive user data is used by an application.

2.1 General Android Security

Existing Android security mechanisms, including their implementation, are analyzed and

discussed in [11]. Android creates a unique user id for each newly installed application.

This user id is granted permissions in the system on creation based on the application manifest,

an XML ﬁle contained in the installation package of the application. The permissions speciﬁed at

install time determine the capabilities the application has at runtime.

The install time permissions system relies on the “reputation” of an application. Decisions by

the user to grant permissions to an application must be made before the application is ever used.

Once an application is installed, it may use the permissions given in any way it desires. There

are no protections for how the permissions are used—only whether they are usable. The lack of

runtime permissions and protections could lead to user data being used by some application in a

way that the user might wish to prevent. For example, an application might purport to send only

anonymized user location data to its own database while in reality sending user-tagged data to

advertisers.

Previous works on exploits and vulnerabilities on Android [12] have studied phishing attacks

and privilege escalation. Privilege escalation is a technique also used in applications for “rooting”

a mobile device. Rooting a mobile device is the process of circumventing system protections to

obtain root user privileges on the device. Rooting is usually used to install modiﬁcations to the

default operating system, or to install an entirely new operating system.

Privilege escalation can also be achieved by using multiple applications communicating, as

discussed in [13]. If some application obtains the internet access permission from the system, it can

then transmit the data it obtains from the internet via one of several communication channels to

another application that does not have the internet permission. The ﬁrst application functions as

a medium, passing information to and from the internet and the second application. In this way,

two applications can both use the permission to access some resource, but the user is only notiﬁed

that one of the two is using the resource.

Research has been done on statically analyzing applications from the store, enabling users to

determine before installing some application whether it is malware [14, 15]. Malicious applications

are one of the primary concerns in Android security, given that application data usage is not

monitored at runtime in oﬀ-the-shelf Android systems. Being able to detect malicious applications

statically would allow for warnings to be posted on suspected malicious applications in the store,

to warn users.

2.2 Theory of Injection Attacks

A signiﬁcant body of work has been done on a precise and formal deﬁnition of injection attacks.

It is critical, in order to develop an adequate prevention mechanism, to define when these attacks

occur [16, 17, 18, 19, 20]. Previous works exhibit false positives and false negatives when faced

with programs that inject only noncode tokens through tainted input. This thesis uses the NIE

property [9, 10], which states that applications that generate output programs may not allow

anything other than the insertion or expansion of noncode tokens from untrusted input in their

output programs. This deﬁnition avoids the drawbacks of previous work. A more detailed deﬁnition

of a BroNIE will be discussed in Chapter 4.
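As a rough intuition for the property, consider an output program whose tokens have already been lexed and labeled as code or noncode, together with a per-character record of which characters came from untrusted input. The sketch below checks a deliberately simplified condition: no code token may contain an untrusted character. It is an assumption-laden illustration, not Ray and Ligatti's formal definition or their detection algorithm; the names NieSketch, Token, satisfiesSimplifiedNie, and taintMask are hypothetical, and the token boundaries are supplied by hand rather than by a lexer.

```java
import java.util.Arrays;
import java.util.List;

class NieSketch {
    // A token of the output program: character range [start, end) plus a label.
    static final class Token {
        final int start;
        final int end;
        final boolean isCode;

        Token(int start, int end, boolean isCode) {
            this.start = start;
            this.end = end;
            this.isCode = isCode;
        }
    }

    // Simplified check: untrusted characters may appear only in noncode tokens.
    static boolean satisfiesSimplifiedNie(List<Token> tokens, boolean[] tainted) {
        for (Token t : tokens) {
            if (!t.isCode) {
                continue;
            }
            for (int i = t.start; i < t.end; i++) {
                if (tainted[i]) {
                    return false;
                }
            }
        }
        return true;
    }

    // Builds a taint mask of the given length with [from, to) marked untrusted.
    static boolean[] taintMask(int length, int from, int to) {
        boolean[] mask = new boolean[length];
        Arrays.fill(mask, from, to, true);
        return mask;
    }

    public static void main(String[] args) {
        // Output program  p='ab'  where only ab is untrusted.
        List<Token> benign = Arrays.asList(
                new Token(0, 1, true),    // p    (identifier, code)
                new Token(1, 2, true),    // =    (operator, code)
                new Token(2, 6, false));  // 'ab' (string literal, noncode)
        System.out.println(satisfiesSimplifiedNie(benign, taintMask(6, 3, 5)));

        // Output program  p='' OR 'b'  where  ' OR 'b  is untrusted.
        List<Token> attack = Arrays.asList(
                new Token(0, 1, true),     // p
                new Token(1, 2, true),     // =
                new Token(2, 4, false),    // ''
                new Token(5, 7, true),     // OR  (code, from untrusted input)
                new Token(8, 11, false));  // 'b'
        System.out.println(satisfiesSimplifiedNie(attack, taintMask(11, 3, 10)));
    }
}
```

The first program passes the simplified check because the untrusted characters stay inside a string literal; the second fails because the untrusted input supplied the OR token.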

2.3 Taint Tracking

Taint tracking is the technique of marking data from some untrusted source with taint information

and propagating that taint information through the execution of the program. Programs

may then make policy enforcement decisions based on whether data is marked as untrusted. This

information is especially useful for preventing injection attacks. The NIE property uses the idea of

untrusted input in its deﬁnition to determine when injected symbols cause bad behavior in output

programs.
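Character-level taint propagation can be sketched with a small wrapper type that stores one taint bit per character and carries those bits through concatenation. This is a minimal sketch under stated assumptions, not the taint tracking module built in this thesis: the class TaintedString and its methods trusted, untrusted, concat, value, isTainted, and taintMap are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;

class TaintedString {
    private final String chars;
    // taint[i] is true when chars.charAt(i) came from an untrusted source.
    private final boolean[] taint;

    private TaintedString(String chars, boolean[] taint) {
        this.chars = chars;
        this.taint = taint;
    }

    // Wraps a string written by the application itself.
    static TaintedString trusted(String s) {
        return new TaintedString(s, new boolean[s.length()]);
    }

    // Wraps a string that arrived from an untrusted source.
    static TaintedString untrusted(String s) {
        boolean[] t = new boolean[s.length()];
        Arrays.fill(t, true);
        return new TaintedString(s, t);
    }

    // Concatenation propagates each character's taint bit unchanged.
    TaintedString concat(TaintedString other) {
        boolean[] merged = new boolean[taint.length + other.taint.length];
        System.arraycopy(taint, 0, merged, 0, taint.length);
        System.arraycopy(other.taint, 0, merged, taint.length, other.taint.length);
        return new TaintedString(chars + other.chars, merged);
    }

    String value() {
        return chars;
    }

    boolean isTainted(int index) {
        return taint[index];
    }

    // Renders the mask: '.' for a trusted character, '^' for an untrusted one.
    String taintMap() {
        StringBuilder sb = new StringBuilder();
        for (boolean t : taint) {
            sb.append(t ? '^' : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TaintedString s = trusted("ab").concat(untrusted("cd")).concat(trusted("e"));
        System.out.println(s.value());    // prints: abcde
        System.out.println(s.taintMap()); // prints: ..^^.
    }
}
```

A policy enforcer can then consult the mask at the point where a string is handed to a shell or a database to decide which characters of the output program originated from untrusted input.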

Previous work on tracking tainted inputs has been done on a variety of platforms, including Java applications [21, 22, 23] and JavaScript [24]. Web servers running Java frequently interface with SQL databases, so supplying accurate taint information to mechanisms designed to prevent SQL injection is a valuable research target.

Diglossia, a prevention mechanism for PHP requests [22], uses shadow characters to propagate

taint information. Shadow characters are stored separately from ordinary characters and are not

able to be used in the remainder of the program. At the point where some output program is

generated, two distinct parsers are executed on the program string and the shadow of the program

string. One parser accepts only ordinary program characters and the other accepts program char-

acters and shadow characters. Injections are detected by comparing the parse tree results of the

dual parsers and searching for diﬀerences.

There has also been platform-independent work on taint tracking. One paper focuses instead

on the enforcement of some arbitrary policy [25], in addition to discussion of the “privilege hijack”

attack. A privilege hijack attack can be said to occur when some program with legitimate privileges

granted by the system is exploited, allowing the attacker privileges on the system that they would

not ordinarily have through the program that they now control. Another paper examines syntax

embedding [26], a technique that allows some source language to generate a program in some target

language, with the source language aware of the syntax of the target language. This awareness

allows the source language to take measures against injection attacks.

Taint tracking on Android OS has also been studied [27, 28, 29, 30, 31]. Challenges in taint

tracking include the tradeoﬀs between the overhead of the system and its accuracy. Tracking of

taint through all program variables carries a high performance cost. Compromises can be made to

track only strings and characters, to reduce the load on the system. Much of the work into taint

tracking on Android has been done for the purpose of extending Android’s security mechanisms,

which are restricted to permissions at install time for oﬀ-the-shelf Android. Marking and tracking

the ﬂow of private data through untrusted applications would allow Android to enforce policies

about how the data and services are used. For instance, some policy enforcer might instrument

method calls that perform network activity. If marked private data, such as contact databases, call

histories, user images, or the contents of the SD card, are passed to the internet through one of

these methods, an exception is thrown, and the transfer does not succeed. Concerns about mobile

devices being increasingly used to store private data have given focus to this work.

CHAPTER 3

ATTACK SURFACES

In this chapter, we analyze the injection attacks we have discovered on Android OS. We analyze

how each entry point may be exploited by an attacker and provide examples of exploits that might

be used on vulnerable applications in the real world.

3.1 SQLite

A SQLite API is included in the Android SDK, for use by application developers. This API

creates a local database for the application, as one of several options available to the developer for

data storage. No declared manifest permission is required to use this database API. Data is inserted

into and retrieved from the database by way of wrapper methods. There are wrapper methods for

the insertion of rows through .insert(), deletion of rows through .delete(), updating of rows

through .update(), and querying the database through .query() and .rawQuery(). A ﬁnal

wrapper method, .execSQL(), covers the execution of any commands not covered by the ﬁrst four,

such as the dropping of entire tables. All ﬁve of the SQLite methods carry overloads that make

use of prepared statements [8]. The overloads take an additional array of strings as an argument.

These strings will be bound as values before being passed into the SQL command, forcing them to

be tokenized as string literals. Since the untrusted input can only be a string literal, an attacker

cannot use input passed to a wrapper method in this fashion to inject code into a SQLite query.

However, as discussed in the introduction, using this protection mechanism is not an automatic

feature of the SQLite implementation in Android. The programmer must go out of their way to

use the appropriate method overload, and they must organize their query string with question

marks in the appropriate places. Our mechanism for preventing injection attacks into SQLite

is automatic, not requiring knowledge or intervention from the programmer, an advantage over

oﬀ-the-shelf Android.
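The difference between the two ways of building a query can be illustrated with a short plain-Java sketch. The table and column names follow the example query used in this thesis; the class name, the method names, and the `*` column list are illustrative assumptions, and no Android classes are used so that the snippet runs on a desktop JVM.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative and not part of the Android SDK.
final class QueryShapes {
    // Constant SQL text with placeholders: untrusted values never become SQL syntax.
    static final String PARAMETERIZED =
            "SELECT * FROM accounts WHERE KEYVALUNAME=? AND KEYVALPASS=?";

    // Unsafe shape: untrusted text is spliced directly into the SQL text.
    static String concatenated(String user, String pass) {
        return "SELECT * FROM accounts WHERE KEYVALUNAME='" + user
                + "' AND KEYVALPASS='" + pass + "'";
    }

    // Safe shape: the values travel separately from the SQL text. On Android they
    // would be handed to an overload that accepts a String[] of bound arguments.
    static String[] bindArgs(String user, String pass) {
        return new String[] { user, pass };
    }

    public static void main(String[] args) {
        String attack = "' OR '1'='1";
        System.out.println(concatenated("admin", attack));
        System.out.println(PARAMETERIZED + " with " + Arrays.toString(bindArgs("admin", attack)));
    }
}
```

In the concatenated form the attacker's quotation marks change the structure of the query, whereas in the parameterized form the SQL text is identical for every input.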

This SQLite vulnerability, if exhibited by applications on the Google Play store, could leak

sensitive data from the database to an attacker. To determine whether it is present in the real

world, we conducted a study on the top 78 downloaded applications on the Google Play store for the

vulnerabilities we have described. To perform our analysis, we ﬁrst used “APKTool” [32], software

that can decode Android applications from .apk format into Dalvik Virtual Machine bytecode.

After obtaining this bytecode, we can use tools like grep, the Linux ﬁle search utility, to search

through it and ﬁnd potentially unsafe method invocations. Speciﬁcally, we searched for invocations

of .execSQL(), .rawQuery(), and .query() methods that have passed “null” as their prepared

statement argument. The absence of bound arguments suggests that any untrusted input used in the

query is concatenated directly into the query string. Our search returned 35 source files that

exhibited this unsafe behavior while using the .rawQuery() method, and 55 source ﬁles that used

.query() and .execSQL(). Our results suggest that some application developers are not aware

of, or choose not to use, the protection mechanisms available to them through the SQLite API.
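A search of this kind can be approximated in a few lines of Java. The sketch below is a stand-in for the grep-based search described above, not the procedure actually used in the study. It assumes the decoded bytecode is available as smali-style text lines, and the class and method names are hypothetical. It only locates invocations; deciding whether the argument array is null still requires inspecting the registers passed to the call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner over decoded bytecode text; a simplified stand-in for grep.
final class SmaliQueryScan {
    // Owner prefix of a method reference on SQLiteDatabase in smali notation.
    private static final String OWNER = "Landroid/database/sqlite/SQLiteDatabase;->";
    private static final String[] METHODS = { "rawQuery(", "query(", "execSQL(" };

    // Returns the invoke instructions that call one of the query methods.
    static List<String> findInvocations(List<String> smaliLines) {
        List<String> hits = new ArrayList<String>();
        for (String line : smaliLines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("invoke-")) {
                continue; // not a method invocation
            }
            for (String method : METHODS) {
                if (trimmed.contains(OWNER + method)) {
                    hits.add(trimmed);
                    break;
                }
            }
        }
        return hits;
    }
}
```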

We attempted to search through these potentially vulnerable applications and locate a concrete

example of a SQL injection attack against an application on the store. The lack of a static dataflow

analysis tool and the fact that the majority of the applications were obfuscated made it necessary

to inspect the code by hand. In our manual inspection, we were only able to ﬁnd instances where

trusted strings were passed to an unsafe query. As a result, we were unable to verify the existence

of any such vulnerable application in our search.

3.2 OS Shell

To access the OS shell, an application may use the Runtime class, as in Figure 1.1. This class

represents the computer on which the Java VM is running. Applications may obtain an instance

of the Runtime class and then create processes running on the parent computer by calling the

.exec() method [33]. These processes can be any program accessible from the command line, including

ls, sh, rm, and so on. The .exec() method may be called on either a string array or a simple

string. In the case of the string array, the ﬁrst element of the array is used as the program name

to be executed and any remaining elements as the arguments to that program. In the case of the

plain string, the .exec() method splits the input string by its whitespace and parses it into a

string array. The rules for execution then proceed the same as if the programmer had passed in a

string array to begin with. For example, the string rm -rf datadirectory would be parsed into

a command to execute the program rm, with -rf as the ﬁrst argument, and datadirectory as the

second.
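The whitespace split can be reproduced with the standard StringTokenizer class, which the Java documentation names as the mechanism behind the single-string form of .exec(). The sketch below only performs the split and does not start any process; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative helper: reproduces the split without executing anything.
final class ExecSplitSketch {
    // Breaks a command string into program name and arguments on whitespace.
    static String[] split(String command) {
        StringTokenizer tokenizer = new StringTokenizer(command);
        List<String> parts = new ArrayList<String>();
        while (tokenizer.hasMoreTokens()) {
            parts.add(tokenizer.nextToken());
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // First element is the program, the rest are its arguments.
        System.out.println(Arrays.toString(split("rm -rf datadirectory")));
    }
}
```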

A second class is also available to create operating system processes: the ProcessBuilder

class [34]. ProcessBuilder separates the setting of the command and the starting of the process

into the .command() and .start() methods respectively. Commands may be passed as single

strings or a list of strings to the command method. The parsing of a plain string into a list

of strings is also implemented with a whitespace split, the same algorithm used in the .exec()

method. The programmer may then call the .start() method to create the process and begin

execution.

The functionality of these two classes raises an important point. These classes do not themselves

connect directly to the shell of the Android OS. Commands passed to these methods are not parsed as

shell programs. Only a simple program lookup is performed on the first string in the command array.

The remainder of the command is passed directly to the executed program as its arguments. Because

of this, injection attacks into the shell such as the one demonstrated in Figure 1.1 are not possible

through these methods. Only if an attacker is given control over the full program string can they

choose which program is being executed. If given control over only a single program argument in a

string array, the attacker would only be able to modify that single argument. If the attacker’s input

were to be concatenated into a command string, the attacker would be able to insert additional

program arguments. For example, if the command passed to .exec() or .command() was

"ping -c 4 " + untrustedinput, the attacker could input -i .2 www.nameofhost.com, resulting in

the following string: ping -c 4 -i .2 www.nameofhost.com. Because these methods split the

command string by whitespace, the attacker will have added two additional arguments to the

program, -i and .2. This would decrease the interval between sending packets to one ﬁfth of a

second.
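The contrast between the concatenated form and the array form can be sketched as follows. The snippet only builds the argument arrays and never executes ping; the class and method names are illustrative, and the whitespace split is a simplified model of the behavior described above.

```java
import java.util.Arrays;

// Illustrative model of how untrusted input ends up in the argument array.
final class ArgumentInjectionSketch {
    // Simplified stand-in for the whitespace split applied to a command string.
    static String[] whitespaceSplit(String command) {
        return command.trim().split("\\s+");
    }

    // Concatenated form: the untrusted text is merged into the command before the split,
    // so it can contribute any number of arguments.
    static String[] concatenatedForm(String untrusted) {
        return whitespaceSplit("ping -c 4 " + untrusted);
    }

    // Array form: the untrusted text occupies exactly one argument slot.
    static String[] arrayForm(String untrusted) {
        return new String[] { "ping", "-c", "4", untrusted };
    }

    public static void main(String[] args) {
        String input = "-i .2 www.nameofhost.com";
        System.out.println(Arrays.toString(concatenatedForm(input)));
        System.out.println(Arrays.toString(arrayForm(input)));
    }
}
```

In neither form does a sequence such as && act as a shell operator; it is handed to the program as an ordinary argument.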

In order to perform a full code injection into the OS shell of an Android device, an application on

the system must first obtain a Process that is running a full shell, as in Figure 1.1. The example

application has a full shell running inside the Process ‘proc’, and our research indicates that

commands passed to this shell will be parsed and executed as shell programs, allowing injections

like the one discussed in the introduction to be performed.

Having identiﬁed this OS shell vulnerability, we next examine whether this vulnerability is

likely to exist in practice. First, consider whether an application developer would even need to

use the Android shell for any functionality. It is possible that all of the OS programs accessible

through the shell are implemented in the Android SDK. However, our research indicates that this

is not the case. The Android SDK does not implement a ping program. An application developer

could perhaps write their own ping program using the available socket tools native to Java but no

ping program exists oﬀ the shelf in Android, except through the native ping program accessible

through the shell. Application developers can also use the shell to determine if the device their

application is being run on is “rooted”. Application developers might also be more comfortable

performing certain tasks in an environment familiar to them. Those more comfortable with

shell scripting than with the Android SDK might choose to obtain a shell over working with the

Android SDK.

Now that we know that an application developer has reason to access the OS shell, we must

examine the beneﬁts to the application developer of obtaining a full shell to execute these programs,

instead of simply using the .exec() or .command() methods. Recall that only the full shell is vulnerable to

the attacks we have discovered. The most obvious potential beneﬁt is processing overhead. If the

developer decides to obtain a full shell and execute a series of OS programs using it, the overhead of

calling the .exec() method is removed with the exception of the ﬁrst .exec() call needed to obtain

the full shell. The developer can then use the Process containing the shell to execute commands

without the need for additional calls to .exec().

To determine exactly how much processing time is saved, we performed runtime analysis on a

test application that we created. The code that performs the testing is shown in Figure 3.1. The

test application uses the System class, and its static method .nanoTime() to time how long it takes

to perform a number of shell tasks specified by the user, both through obtaining a full shell and

through using calls to the .exec() method for each task. One call to .nanoTime() is performed

before the tasks begin and the value is stored. When the tasks are complete, another call is made.

The diﬀerence between these two values is the time, in nanoseconds, that elapsed during the test.
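The measurement pattern itself is small enough to show in isolation. The sketch below is a generic harness under the assumption that each task can be wrapped in a Runnable; it is not the test application of Figure 3.1, and the class and method names are illustrative.

```java
import java.util.concurrent.TimeUnit;

// Generic elapsed-time harness; names are illustrative.
final class ElapsedTimeSketch {
    // Runs the task `count` times and returns the elapsed time in nanoseconds.
    static long timeNanos(Runnable task, int count) {
        long start = System.nanoTime();   // value stored before the tasks begin
        for (int i = 0; i < count; i++) {
            task.run();
        }
        long end = System.nanoTime();     // second reading once the tasks are complete
        return end - start;
    }

    // Converts a nanosecond reading to whole milliseconds for reporting.
    static long toMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }
}
```

System.nanoTime() is only meaningful as a difference between two readings, which is why both readings are taken inside the same method.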

The data we collected is compiled in the graph in Figure 3.2. The number of milliseconds to

complete the tasks is shown on the y axis and the number of tasks to be completed is shown on

the x axis. The interactive shell, shown with the solid line, has an execution

time that grows very slowly with the number of tasks to be executed. The execution time of individual .exec()

method calls, shown by the dotted line, grows far more quickly in comparison.

The inconsistency in the data for task numbers 18, 19, and 20 for the individual .exec() method

calls could be the result of a process swap in the middle of the test. The process swap would

stop the shell tasks from executing for some time, while the system time value would continue to

increase. This time spent doing no work would lead to longer execution times than in the rest of our

tests. Using individual .exec() method calls is only faster than using an interactive shell for the

execution of a single task and it is only faster by 11 milliseconds. This graph demonstrates a clear

motivation for an application developer to use a full shell to perform tasks in their application—it

has the potential to save a signiﬁcant amount of processing time if many OS tasks are being run

over the execution of the application.

In a continuation of our work analyzing applications for SQLite vulnerabilities, we decompiled

and analyzed the same set of 78 top applications on Google Play for shell vulnerabilities. Using

the same bytecode search methods with APKTool, we searched for invocations of the .exec() or

.start() methods. In the majority of applications, we found that the .exec() method call was used at

some point in the code, indicating that application developers are aware of its existence and have

found it to be useful. Several applications did obtain a full shell but passed only constant strings

as programs into the shell. The potential for a vulnerability exists in the shell but we have been

unable to verify an application that exhibits this vulnerability among the top 78 applications on

Google Play.

3.3 Possible Attacks into OS Shell

With our analysis of the shell vulnerability complete, we now move on to a survey of what

functionality is available through the shell and what behaviors exist that a malicious attacker could

abuse. Recall that the Android OS uses an install-time permission system. This system is implemented

by giving each distinct application its own user id, with the permissions the application has

requested attached to that id. Our research indicates that when an application obtains a full shell,

that shell operates with the same permissions as that user id. The permissions of the injected

application therefore determine which exploits are available to the attacker.

    starttime = System.nanoTime(); // Start the clock for the interactive shell test
    p = r.exec(new String[] {"sh", "-i"}); // Start the interactive shell
    DataOutputStream processInput =
        new DataOutputStream(p.getOutputStream()); // Get a stream to put commands into the process
    for (i = 0; i < count; i++) // Execute "count" number of tasks
    {
        processInput.writeBytes("ls\n");
        processInput.flush();
    }
    processInput.writeBytes("exit\n"); // Tell the shell to exit once the tasks are done
    processInput.flush();
    try {
        p.waitFor(); // Wait for the process to exit
        endtime = System.nanoTime(); // Stop the clock for the interactive shell test
        p.destroy();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    starttime = System.nanoTime(); // Start the clock for the .exec method
    for (i = 0; i < count; i++) // Execute "count" number of tasks
    {
        p = r.exec("ls");
        try {
            p.waitFor();
            p.destroy();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
    endtime = System.nanoTime(); // Stop the clock for the .exec method

Figure 3.1: Code for our timing test application

Figure 3.2: A graph showing timing data gathered from the test application.

We have discovered three major exploits made possible by this vulnerability.

1. Ping Drain (Internet Permission Required) – Mobile devices can connect to the internet

through a Wi-Fi connection, or through the mobile data network, where data is transmitted

through cell base towers. Most cell providers have limitations and caps for “data” access to

their network. An attacker can use the ping program, with a maximum-size packet (65535

bytes) at the shortest interval (5/sec), to send roughly 328 KB/sec (65535 bytes × 5 per second) of data. If running constantly, this

program could very quickly accrue a large amount of mobile data usage, leading to the user

being assessed an excessive mobile phone bill, or simply being unable to use their mobile

network data access for the remainder of the billing period.

2. Card Wipe (External Storage Permission Required) – The external storage permission gives

an application read and modify permissions for the entire external storage. External storage

for an Android device usually takes the form of an SD card. If an application with this

permission were to be injected into, the attacker would have access to all data mounted on

the SD card’s ﬁlesystem. The attacker can read private data or delete some or all of the

contents of the SD card.

3. Fork Bomb (No Permissions Required) – Android applications are restricted in the number

of processes they may create, but only loosely. A normal app may create up to 6656 processes.

This number is more than enough to execute a resource denial attack by way of

a fork bomb and lock up the target device.

We believe that these exploits into the OS shell and the lack of automatic SQL injection prevention

are a significant enough concern to warrant implementing a security mechanism to mitigate

these injection attacks on Android.

CHAPTER 4

A REVIEW OF BRONIES

To begin our eﬀorts to implement a mechanism to defend against injection attacks, it is vital to

decide on a formal deﬁnition of attacks. Without knowledge of what the attacks are at a low level,

any protection mechanism is bound to have ﬂaws. Our protection mechanism is based on Ray and

Ligatti’s deﬁnition of a BroNIE [9].

A BroNIE begins, at a high level, when some output program is generated by using tainted

input. If the addition of this tainted input causes anything besides the expansion or insertion of

noncode tokens in the target programming language, a BroNIE has occurred. To deﬁne the formal

details of a BroNIE, Ray and Ligatti have constructed multiple subdeﬁnitions. We review them in

the following sections and demonstrate that this deﬁnition correctly identiﬁes the injection attacks

we have described earlier in this thesis.

4.1 Template String

In order to determine whether a token has been expanded or inserted by the addition of tainted

input, we must ﬁrst examine the original output program string, before the tainted input was

added. To do this, we determine the template string. The template string is the same as the

output string, but all tainted characters have been replaced with the 'ε' character, to represent the

empty string. The only purpose of the 'ε' characters in the template is to act as placeholders in

the program string, each marking a location where an injected symbol once was. They are otherwise

ignored by the tokenizer. For example, if hostname is the tainted input in the string ping "www.hostname.com", the template of that string is

ping "www.εεεεεεεε.com".
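Template construction can be sketched directly from this description. The snippet assumes taint is available as one boolean per character of the output string; the class name, method name, and that representation are illustrative choices rather than a fixed interface.

```java
// Illustrative template builder; assumes one taint flag per character.
final class TemplateSketch {
    static final char EPSILON = '\u03b5'; // Greek small letter epsilon, the placeholder symbol

    // Replaces every tainted character of `output` with the epsilon placeholder.
    static String template(String output, boolean[] tainted) {
        if (output.length() != tainted.length) {
            throw new IllegalArgumentException("taint array must match string length");
        }
        StringBuilder sb = new StringBuilder(output.length());
        for (int i = 0; i < output.length(); i++) {
            sb.append(tainted[i] ? EPSILON : output.charAt(i));
        }
        return sb.toString();
    }
}
```

Because each tainted character is replaced one for one, the template has the same length as the output string and untainted characters keep their original indices.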

4.2 Token Expansion

Token expansion in the BroNIE deﬁnition describes how tainted input is permitted to enlarge

existing tokens. Tokens are deﬁned in terms of what kind of token they are, the semantic value

of the token, and the range of indices over which the token appears in the program string. We

represent tokens in this thesis with the form $\tau_i(\mathit{tokensymbols})_j$, representing a token of type $\tau$,

composed of symbols "tokensymbols" and ranging from character index i to character index j in the

program string. For example, the string "2 + 10 * 3" would be tokenized (index subscripts omitted) as

{INT(2), PLUS(+), INT(10), MULT(*), INT(3)}.

A token $\tau$ can be said to expand another token $\tau'$ if and only if:

1. The token types of the two tokens are equal.

2. The character indices of token $\tau$ are equal to, or an expansion of, the character indices of token $\tau'$.

3. The semantic value of token $\tau'$ is a subsequence of the semantic value of token $\tau$.

As an example, consider the string ping "www.hostname.com" again. Recall that the template

of this string is ping "www.εεεεεεεε.com". The token for the string literal representing

the hostname in the template would be $\text{LITERAL}_i(\text{www..com})_j$. The token for the string

literal in the original program would be $\text{LITERAL}_i(\text{www.hostname.com})_j$, spanning the same indices i through j. Note that the epsilon

characters preserve the character indices of the template string's token, despite the tainted input

being removed. By this definition of token expansion, the original program token expands the

template string's token.
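The three conditions for token expansion translate into a short predicate. In the sketch below the class name, field names, and the use of 0-based inclusive indices are assumptions made for illustration; "an expansion of the indices" is read as the expanding token's range covering the other token's range.

```java
// Illustrative token with a type, a semantic value, and an index range.
final class TokenSketch {
    final String type;
    final String value;
    final int start; // index of first character (assumed 0-based, inclusive)
    final int end;   // index of last character (assumed inclusive)

    TokenSketch(String type, String value, int start, int end) {
        this.type = type;
        this.value = value;
        this.start = start;
        this.end = end;
    }

    // True if this token expands `other` under the three conditions in the text.
    boolean expands(TokenSketch other) {
        return type.equals(other.type)                    // 1. same token type
                && start <= other.start && end >= other.end // 2. indices equal or wider
                && isSubsequence(other.value, value);       // 3. other's value is a subsequence
    }

    // True if `small` can be obtained from `big` by deleting zero or more characters.
    static boolean isSubsequence(String small, String big) {
        int matched = 0;
        for (int j = 0; j < big.length() && matched < small.length(); j++) {
            if (small.charAt(matched) == big.charAt(j)) {
                matched++;
            }
        }
        return matched == small.length();
    }
}
```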

4.3 Token Classiﬁcation

To detect BroNIEs for some output programming language, we must ﬁrst partition the tokens of

the target language into code and noncode tokens. Code tokens are reducible by evaluation, meaning

that computation can be performed on them at runtime. A code token might be the ping operator.

This token speciﬁes a program to be executed, obviously necessitating some computation. Noncode

tokens are the opposite; they are already values, so no computation can be performed on them.

An example noncode token might be the String token with semantic value "www.hostname.com".

This is a String value and cannot be evaluated any further than it has been already.

Again considering our example of ping "www.hostname.com", shell strings can all be classified

under the token WORD. WORD tokens can themselves be split into three different token types: code,

open value, and literal. Of these three types, only the literal type is noncode. This string could

thus be fully tokenized as {CODEWORD(ping), LITERAL(www.hostname.com)}.

4.4 The NIE Property

Finally, we can formally deﬁne the NIE property. Given some program string s and the template

t(s), the NIE property holds if and only if it is possible to create s from t(s) by inserting or

expanding noncode tokens. If this cannot be achieved, we say that the output program exhibits

a BroNIE. Continuing with our example from the rest of the chapter, we examine the string

ping www.hostname.com and its template ping www.εεεεεεεε.com. As discussed in the previous

example, the LITERAL token in the output program is an expansion of the LITERAL token in

the template program, and the CODEWORD token is unchanged in the template string. Thus, we

can conclude that this example does not exhibit a BroNIE.
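A simplified check of the NIE property over two token streams can be sketched as follows. This is not the detection algorithm of Ray and Ligatti [9]: it assumes both strings have already been tokenized, treats LITERAL as the only noncode type, ignores character indices, and matches tokens greedily from left to right. All class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, index-free illustration of the NIE check; names are hypothetical.
final class NieSketch {
    static final class Tok {
        final String type;
        final String value;
        Tok(String type, String value) { this.type = type; this.value = value; }
    }

    // Assumption for this sketch: LITERAL is the only noncode token type.
    static boolean isNoncode(Tok t) {
        return t.type.equals("LITERAL");
    }

    static boolean isSubsequence(String small, String big) {
        int matched = 0;
        for (int j = 0; j < big.length() && matched < small.length(); j++) {
            if (small.charAt(matched) == big.charAt(j)) matched++;
        }
        return matched == small.length();
    }

    // A template token may stay unchanged, or be expanded if it is noncode.
    static boolean matches(Tok out, Tok tmpl) {
        if (!out.type.equals(tmpl.type)) return false;
        if (out.value.equals(tmpl.value)) return true;
        return isNoncode(out) && isSubsequence(tmpl.value, out.value);
    }

    // True if `output` can be reached from `template` by inserting or expanding noncode tokens.
    static boolean satisfiesNie(List<Tok> template, List<Tok> output) {
        int t = 0;
        for (Tok out : output) {
            if (t < template.size() && matches(out, template.get(t))) {
                t++;                 // template token kept or expanded
            } else if (!isNoncode(out)) {
                return false;        // a code token was inserted
            }                        // otherwise: an inserted noncode token, which is allowed
        }
        return t == template.size(); // every template token must be accounted for
    }

    public static void main(String[] args) {
        List<Tok> template = Arrays.asList(new Tok("CODEWORD", "ping"), new Tok("LITERAL", "www..com"));
        List<Tok> output = Arrays.asList(new Tok("CODEWORD", "ping"), new Tok("LITERAL", "www.hostname.com"));
        System.out.println(satisfiesNie(template, output));
    }
}
```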

4.5 Stopping a BroNIE

Let us now examine the example injection attacks into Android that we discussed in the

introduction and determine whether the definitions discussed in this chapter classify them as

BroNIEs. Consider the first example shell injection program from the introduction: ping -c

4 "www.remotehostname.com" && rm -rf "data" \n. The tainted input is everything between the first

and last quotation marks. First, we create a template of this string, as follows:

ping -c 4 "εεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεε" \n. The original program string is tokenized as

{CODEWORD(ping), ARGUMENT(-c), LITERAL(4), LITERAL("www.remotehostname.com"), &&(&&), CODEWORD(rm), ARGUMENT(-rf), LITERAL("data"), NEWLINE(\n)}.

The template string can be tokenized as {CODEWORD(ping), ARGUMENT(-c), LITERAL(4), LITERAL(""), NEWLINE(\n)}.

In order to convert from the template token stream to the original token stream, we will at the very

least need to add the token &&(&&). The && token is a code token: it can be reduced by evaluating

the expressions on either side of it. Since adding code tokens violates the NIE property, this program

exhibits a BroNIE. In similar fashion, the other shell program discussed in the introduction,

ping -c 4 "www.remotehostname.com" && f() { f|f& } && f \n #" \n, injects an && token, which cannot

be added back to the template's token stream in a way that satisfies the NIE property. This

program, too, exhibits a BroNIE.

Finally, we examine our SQLite example string from the introduction, shown here as a query

string, not a snippet of Java code: SELECT FROM 'accounts' WHERE KEYVALUNAME=

'admin' AND KEYVALPASS='' OR '1'='1'. This string's template is as follows: SELECT FROM

'accounts' WHERE KEYVALUNAME='εεεεε' AND KEYVALPASS='εεεεεεεεεεε'. The program string's

tainted input areas will be tokenized as {STRING('admin')} and

{STRING(''), OR(OR), STRING('1'), EQUALS(=), STRING('1')}. The same tainted

areas in the template string will be tokenized as {STRING('')} and {STRING('')}.

Clearly, there are code tokens, such as OR, present in the program string that do not appear

in the template string. This program, too, exhibits a BroNIE. All of our example attacks

from the introduction have now been successfully detected by the NIE property and its associated

definitions.

CHAPTER 5

PROTECTION MECHANISM IMPLEMENTATION

Having explored the surfaces through which injection attacks can occur on Android and deﬁned

formally how injection attacks occur, we can now implement a mechanism to defend against these

attacks. There are two high-level components to this mechanism, the taint tracker and the detection

algorithm. The taint tracker applies taint data to characters from untrusted sources and propagates

that taint information in strings through the execution of the application. The detection algorithm

will then use the taint information supplied by the taint tracker to generate the template string

described in Chapter 4. The detection algorithm compares the template string and the original

string. If it ﬁnds that the original string cannot be reached from the template string by inserting

or expanding noncode tokens, it throws an exception and the program or query is not executed [9].

5.1 Taint Tracker

In our modiﬁcations to the Java libraries used by Android OS, taint information is stored for

each String in the system as a boolean array, with one bit for each character. A false bit appears

for a speciﬁed character if and only if that character is untainted and a true if and only if the

character is tainted. Taint information should be linked to each string on creation. An “untrusted”

source would mark the full string created from it with taint bits.

It would be convenient to modify the String class directly and add our taint tracking infor-

mation as a ﬁeld to it. However, the Android OS performs some VM-level optimization with the

String class. The bit-level oﬀsets of the ﬁelds in the class are hardcoded into other sections of the

OS, making adding ﬁelds to the String class diﬃcult.

To circumvent this problem, we have implemented our taint information in a separate class, a

new addition to the Java libraries. We call it the TaintTrack class. It is inserted into the java.lang

package, alongside the String class. Selections of code from the class are displayed in Figure 5.1.

The primary data structure used by the class is a HashMap, which stores taint information for a

particular string keyed by the hash code of the string being stored. The class implements methods

for adding untainted and tainted strings or character sequences to the HashMap. The character

sequence adding method must ﬁrst convert the character sequence to an instance of the String

class. Both methods then allocate a boolean array of the size of the string and ﬁll it with the

appropriate taint data (false for untainted, true for tainted). In Figure 5.1, only the methods for

adding a tainted character sequence and an untainted string are shown. The methods for adding an

untainted character sequence and a tainted string require only trivial alterations from the methods

we have shown. There is a method for retrieving the taint information for a speciﬁc string, for which

the code needs only access the HashMap. Finally, there is a method for updating taint information

after two strings have been concatenated. For this method, a boolean array of the combined size

of the strings is allocated, and the taint information for the two original strings is retrieved and

combined together into the result array. The result array is then placed into the HashMap, under

the hash code of the ﬁnal concatenated string.
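The effect of the concatenation update can be seen in a small self-contained sketch. It mirrors the operations described above but is not the TaintTrack class itself: the class name, the instance-based design, and the method names are assumptions chosen so that the snippet runs on its own.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Stand-alone illustration of per-character taint and its propagation through concatenation.
final class TaintMapSketch {
    private final Map<String, boolean[]> taintMap = new HashMap<String, boolean[]>();

    // Registers a string as fully tainted or fully untainted.
    void add(String s, boolean tainted) {
        boolean[] bits = new boolean[s.length()];
        Arrays.fill(bits, tainted);
        taintMap.put(s, bits);
    }

    // Returns the taint flags recorded for a string, or null if none were recorded.
    boolean[] get(String s) {
        return taintMap.get(s);
    }

    // Records taint for `result`, assumed to equal left + right.
    // Both operands are assumed to have been registered already.
    void concat(String left, String right, String result) {
        boolean[] l = taintMap.get(left);
        boolean[] r = taintMap.get(right);
        boolean[] bits = new boolean[l.length + r.length];
        System.arraycopy(l, 0, bits, 0, l.length);
        System.arraycopy(r, 0, bits, l.length, r.length);
        taintMap.put(result, bits);
    }
}
```

After registering an untainted prefix such as "ping -c 4 " and a tainted suffix, the entry for the concatenated string carries false flags for the prefix characters and true flags for the suffix characters.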

Now that we have added all of the necessary helper methods to track taints, our implementation

must call them from the appropriate locations in the Java libraries and application code. To begin,

we instrumented the String class to add all created strings as untainted by default using the

.addStringUntainted() method from Figure 5.1. Tainted strings, in our example implementation,

are obtained from a text box in the application’s UI. The method for obtaining the text that a user

has entered into a text box is .getText() from the TextView class. We have instrumented the

.getText() method with the .addCharSeqTainted() method from Figure 5.1 to taint all strings

created through it. In a full implementation of this mechanism, the developer might wish to specify

other sources as tainted, and instrument those sources as well. For our example implementation, a

single taint source is suﬃcient to demonstrate the eﬃcacy of the system.

Next, we must call our string concatenation tracking method from the Java code. To do this,

it is important to know how the Java compiler handles the “+” operator on two strings. It ﬁrst

instantiates an instance of the StringBuilder class, with the string on the left hand side as its

data. Then, it calls the .append() method of the StringBuilder class, with the string on the right

hand side as its argument. Finally, it builds the string, and returns the result. It would be simplest

to instrument the .append() method of the StringBuilder class, but it also uses optimization at

the VM level and we were unable to instrument it inside the source code for the Java libraries.

    // The data structure used to hold taint information
    private static final HashMap<String, boolean[]> taintmap =
            new HashMap<String, boolean[]>();

    // Add some character sequence to the taintmap as a tainted string
    public static void addCharSeqTainted(CharSequence seq) {
        // Convert the CharSequence to a string
        String input = seq.toString();
        // Allocate a taint array of the size of the string
        boolean[] taint = new boolean[input.length()];
        // This string is fully tainted, so the array is all tainted
        Arrays.fill(taint, true);
        // Put the array into the tracker
        taintmap.put(input, taint);
    }

    // Add some string to the taintmap as an untainted string
    public static void addStringUntainted(String input) {
        // Allocate a taint array of the size of the string
        boolean[] taint = new boolean[input.length()];
        // This string is fully untainted, so the array is all untainted
        Arrays.fill(taint, false);
        // Put the array into the tracker
        taintmap.put(input, taint);
    }

    // Retrieve taint data from the map for some string
    public static boolean[] getDataForString(String input) {
        return taintmap.get(input);
    }

    // Propagate taint information through concatenation
    public static void stringCat(String input1, String input2, String input) {
        // Retrieve taint data from the two sources
        boolean[] taint1 = taintmap.get(input1);
        boolean[] taint2 = taintmap.get(input2);
        // Allocate a new taint array with the appropriate new size
        int len1 = taint1.length;
        int len2 = taint2.length;
        boolean[] taint = new boolean[len1 + len2];
        // Copy the taint information into the new array
        System.arraycopy(taint1, 0, taint, 0, len1);
        System.arraycopy(taint2, 0, taint, len1, len2);
        // Put the new taint information into the tracker
        taintmap.put(input, taint);
    }

Figure 5.1: Code for the taint tracking module we have added to the Java libraries

    // Declare advice that runs around the target method, and returns StringBuilder
    StringBuilder around(StringBuilder sb, String input) :
        // This advice wraps around StringBuilder.append()
        call(StringBuilder StringBuilder.append(String))
        // Bind the object that append() is called on to variable "sb"
        && target(sb)
        // Bind the first argument to variable "input"
        && args(input) {
        // Save the contents of the string builder before the concatenation
        String start = sb.toString();
        // Continue with execution, start this code again after done
        proceed(sb, input);
        // Save the contents of the string builder after the concatenation
        String end = sb.toString();
        // Call the taint tracker - abbreviates 42 lines in the full implementation
        callTaintTracker(start, input, end);
        // Return the string builder, and continue with normal execution
        return sb;
    }

Figure 5.2: AspectJ code to call the taint tracker from application code

5.2 Weaving with AspectJ

To circumvent this problem, our implementation makes use of AspectJ, an aspect-oriented extension
to Java that weaves additional code into a program. Weaving is the act of adding specific code
snippets at specific points declared by the programmer. To obtain the desired behavior, we wrote an
AspectJ file to instrument the appropriate method calls. A code snippet from this file is displayed
in Figure 5.2. The code begins by declaring a piece of advice, the AspectJ construct that holds the
code to run at the chosen points. It then specifies that this advice will weave “around” any calls
to the .append() method. Recall from Figure 5.1 that our method to update taint information after
concatenation requires both the strings being concatenated and the result of the concatenation. For
this reason, we wrap fully around the call to .append() and not just before or after it. Before the
.append() call is executed, our code saves the values of the initial strings. After the call to
.append(), it obtains the result string and calls the stringCat() method with all three strings.
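
The save, proceed, and record pattern described above can be imitated in plain Java without any weaving. The following is only a minimal sketch under stated assumptions: the class ConcatTaintSketch and the methods mark(), taintOf(), and trackedAppend() are hypothetical names introduced for illustration, the taint map is keyed by string value for simplicity, and strings the tracker has never seen are assumed to be untainted. It is not the code of our taint tracker.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the taint tracker; all names here are illustrative.
final class ConcatTaintSketch {
    // One taint flag per character of the string (true = untrusted input).
    private static final Map<String, boolean[]> TAINT = new HashMap<>();

    // Record a string as entirely tainted or entirely untainted.
    static void mark(String s, boolean tainted) {
        boolean[] flags = new boolean[s.length()];
        Arrays.fill(flags, tainted);
        TAINT.put(s, flags);
    }

    // Assumption: strings unknown to the tracker are treated as untainted.
    static boolean[] taintOf(String s) {
        boolean[] flags = TAINT.get(s);
        return flags != null ? flags : new boolean[s.length()];
    }

    // Plays the role of the around advice: capture the contents before the
    // append, perform the append, capture the result, then record its taint.
    static StringBuilder trackedAppend(StringBuilder sb, String input) {
        String start = sb.toString();
        sb.append(input);
        String end = sb.toString();

        boolean[] taint1 = taintOf(start);
        boolean[] taint2 = taintOf(input);
        boolean[] taint = new boolean[taint1.length + taint2.length];
        System.arraycopy(taint1, 0, taint, 0, taint1.length);
        System.arraycopy(taint2, 0, taint, taint1.length, taint2.length);
        TAINT.put(end, taint);
        return sb;
    }

    public static void main(String[] args) {
        mark("ls ", false);  // written by the application programmer
        mark("; rm", true);  // supplied by an untrusted source
        StringBuilder sb = new StringBuilder("ls ");
        trackedAppend(sb, "; rm");
        System.out.println(Arrays.toString(taintOf(sb.toString())));
    }
}
```

Keying the map by string value means that two equal strings with different origins would share one taint record; the sketch accepts that limitation to stay short.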

The final component of the taint tracker is the point at which the application generates an output
program. At that point, we again use AspectJ to call the taint tracker and obtain the taint
information for the string passed into the method that generates the output program. The work of
the taint tracker is then complete, and the rest is done by the detection algorithm.

5.3 Detecting BroNIEs

The detection component implements the algorithm developed by Ray and Ligatti [9]; the
implementation discussed in this thesis was primarily the work of Clayton Whitelaw. The algorithm
takes as input a program output string for some language, together with the taint information for
that string. It first generates the template string corresponding to the output string by replacing
every tainted character with an ε character. The template string is then tokenized by a modified
tokenizer for the target programming language; the modifications allow the tokenizer to ignore the
ε characters in the template string. We then tokenize the original output program string in the
same way, assuming that ε characters are not present in the original program output.
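
The template-generation step can be sketched in a few lines of Java. This is an illustration under assumptions rather than our implementation: the class TemplateSketch and the method toTemplate() are hypothetical names, and the taint information is assumed to be a boolean array with one entry per character of the program string.

```java
// Illustrative sketch of template generation; names are hypothetical.
final class TemplateSketch {
    // The placeholder that stands in for every tainted character.
    static final char EPSILON = '\u03B5';

    // Replace each tainted character of the program string with EPSILON.
    static String toTemplate(String program, boolean[] taint) {
        if (program.length() != taint.length) {
            throw new IllegalArgumentException("one taint flag per character is required");
        }
        StringBuilder template = new StringBuilder(program.length());
        for (int i = 0; i < program.length(); i++) {
            template.append(taint[i] ? EPSILON : program.charAt(i));
        }
        return template.toString();
    }

    public static void main(String[] args) {
        String program = "SELECT * FROM t WHERE n='bob'";
        boolean[] taint = new boolean[program.length()];
        // Assume the three characters of "bob" came from untrusted input.
        int start = program.indexOf("bob");
        for (int i = start; i < start + 3; i++) {
            taint[i] = true;
        }
        System.out.println(toTemplate(program, taint));
    }
}
```

Because the template keeps the length of the program string, every character position in the template still lines up with its taint flag.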

We now have two token streams, one from the template and one from the original program string, and
can apply the BroNIE detection algorithm from [9]. Our implementation of the algorithm is shown in
Figure 5.3. The algorithm scans through both token streams and compares the tokens at the current
position in each. If the two tokens are identical, or if the program token is noncode and has either
been expanded from the template token or inserted outright, the algorithm continues scanning. In any
other case, a BroNIE exception is thrown. If the end of the template token stream is reached first,
the algorithm scans past any additional noncode tokens in the program token stream. Finally, it
checks whether all tokens in both streams have been consumed. If they have not, then either there
are additional code tokens in the program string or there are tokens in the template string that do
not appear in the program string. Both cases constitute a BroNIE under the definition, so an
exception is thrown. If the program passes all of these checks, it is executed through the
appropriate application.

private static void check(String cmd) throws BronieException {
    // Retrieve taint information
    boolean[] taint = TaintTracker.getDataForString(cmd);
    // Tokenize the program string
    Tkn[] ori = tokenize(cmd);
    // Convert the program string into its template
    String template = getTemplateString(cmd);
    // Tokenize the template
    Tkn[] tem = tokenize(template);

    // Initialize indices for looping through the token streams
    int i = 0;
    int j = 0;
    while (i < ori.length && j < tem.length) {
        if (ori[i].equals(tem[j])) {
            // The tokens are the exact same, continue on
            i++;
            j++;
        } else if (ori[i].isNoncode() && tem[j].expandsTo(ori[i])) {
            // The template token can expand to the original token, continue on
            i++;
            j++;
        } else if (ori[i].isNoncode()) {
            // A noncode token has been injected, continue on
            i++;
        } else {
            // Something else has happened, not allowed, this program exhibits a BroNIE
            throw new BronieException("BroNIE detected");
        }
    }
    while (i < ori.length && ori[i].isNoncode()) {
        // Additional noncode tokens were inserted, continue on
        i++;
    }
    if (!(i >= ori.length && j >= tem.length)) {
        System.err.println(i + "<= " + ori.length + ", " + j + "<= " + tem.length);
        // Not all tokens were scanned - this program exhibits a BroNIE
        throw new BronieException("BroNIE detected");
    }
}

Figure 5.3: Algorithm to detect BroNIEs using a template string and a program string

With the detection algorithm in place, the security mechanism is complete and should fully defend
against BroNIEs on Android OS. To test the mechanism, we attempted to execute the example injection
attacks discussed in the introduction and Chapter 4. All three program strings were correctly
identified by the system as BroNIEs. Our proof-of-concept mechanism has thus been implemented and
has demonstrated its effectiveness against the attacks we have described.
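
To illustrate how such a test proceeds, the scan of Section 5.3 can be exercised on hand-built token streams. The following is a self-contained sketch under assumptions, not our implementation: the Tok class, its kind and text fields, and the helpers code() and literal() are hypothetical, the token streams are written by hand instead of being produced by a tokenizer, string literals are assumed to be the only noncode tokens, and a template token is assumed to expand to any program token of the same kind.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; Tok is a hypothetical stand-in for a real token class.
final class BronieScanSketch {
    static final class Tok {
        final String kind;
        final String text;
        final boolean noncode;

        Tok(String kind, String text, boolean noncode) {
            this.kind = kind;
            this.text = text;
            this.noncode = noncode;
        }

        boolean sameAs(Tok other) {
            return kind.equals(other.kind) && text.equals(other.text);
        }

        // Assumption: a template token expands to any program token of its kind.
        boolean expandsTo(Tok other) {
            return kind.equals(other.kind);
        }
    }

    static Tok code(String kind, String text) { return new Tok(kind, text, false); }
    static Tok literal(String text) { return new Tok("STR", text, true); }

    // Returns true when the program token stream exhibits a BroNIE.
    static boolean isBronie(List<Tok> ori, List<Tok> tem) {
        int i = 0;
        int j = 0;
        while (i < ori.size() && j < tem.size()) {
            Tok o = ori.get(i);
            Tok t = tem.get(j);
            if (o.sameAs(t) || (o.noncode && t.expandsTo(o))) {
                i++;  // identical or expanded noncode token
                j++;
            } else if (o.noncode) {
                i++;  // inserted noncode token
            } else {
                return true;  // code token that the template does not contain
            }
        }
        while (i < ori.size() && ori.get(i).noncode) {
            i++;  // trailing inserted noncode tokens
        }
        return !(i == ori.size() && j == tem.size());
    }

    public static void main(String[] args) {
        // Template for n='<input>' once the tainted characters are ignored.
        List<Tok> template = Arrays.asList(code("ID", "n"), code("OP", "="), literal("''"));
        List<Tok> benign = Arrays.asList(code("ID", "n"), code("OP", "="), literal("'bob'"));
        List<Tok> attack = Arrays.asList(code("ID", "n"), code("OP", "="), literal("'x'"),
                code("KW", "OR"), literal("'1'"), code("OP", "="), literal("'1'"));
        System.out.println("benign is BroNIE: " + isBronie(benign, template));
        System.out.println("attack is BroNIE: " + isBronie(attack, template));
    }
}
```

In this sketch the benign stream is accepted because its only difference from the template is an expanded string literal, while the attack stream is rejected because the injected OR keyword is a code token with no counterpart in the template.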

CHAPTER 6

CONCLUSION

This thesis set out to argue that injection attacks on Android are both possible and capable of
compromising the target device. We have presented evidence of several injection attacks on Android
that inject into both the OS shell and the SQLite API. The attacks we have demonstrated include the
ability to falsely authenticate to a SQLite database, to drain the mobile data allotment of the
target device, to modify or delete the contents of the SD card on a device, and to “forkbomb” a
device. A mechanism exists to defend against SQL injection attacks on Android, but it is not
automatic: it relies on the programmer of the application to implement it properly. There is no
mechanism to defend against OS shell attacks on Android. These attacks have the capacity to cause
damage to users, and we believe that they are a problem that warrants a solution.

Toward such a solution, we have implemented a defense mechanism to mitigate these injection
attacks, based on Ray and Ligatti’s BroNIE definition of injection attacks [9]. We have demonstrated
that the definition of a BroNIE accurately classifies our example attacks as injection attacks. Our
mechanism adds taint tracking via modification of the Java libraries used by the Android OS and via
automated compilation of Android applications with AspectJ. We have also implemented, in AspectJ,
the detection algorithm discussed in Ray and Ligatti’s work, which detects injection attacks based
on the taint information provided by the taint tracker.

LIST OF REFERENCES

[1] The MITRE Corporation. CWE/SANS Top 25 Most Dangerous Software Errors, 2011.
http://cwe.mitre.org/top25/archive/2011/2011_cwe_sans_top25.pdf.

[2] Pew Internet. Mobile technology fact sheet, 2014.
http://www.pewinternet.org/fact-sheets/mobile-technology-fact-sheet.

[3] CISCO. Data leakage worldwide: The high cost of insider threats, 2008.
http://www.cisco.com/c/en/us/solutions/collateral/enterprise-networks/data-loss-prevention/white_paper_c11-506224.pdf.

[4] Google. Android open source project, 2014. https://source.android.com.

[5] A. Felt, E. Chin, S. Hanna, D. Song, and D. Wagner. Android permissions demystified. In
Proceedings of the 18th ACM Conference on Computer and Communications Security, CCS, pages 627–638,
New York, NY, USA, 2011. http://doi.acm.org/10.1145/2046707.2046779.

[6] P. Kelley, S. Consolvo, L. Cranor, J. Jung, N. Sadeh, and D. Wetherall. A conundrum of
permissions: Installing applications on an android smartphone. In Financial Cryptography and Data
Security, volume 7398 of Lecture Notes in Computer Science, pages 68–79. Springer Berlin Heidelberg,
2012. http://dx.doi.org/10.1007/978-3-642-34638-5_6.

[7] Google. SQLiteDatabase class documentation, 2014.
http://developer.android.com/reference/android/database/sqlite/SQLiteDatabase.html.

[8] Open Web Application Security Project. SQL Injection Prevention Cheat Sheet, 2014.
https://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet.

[9] D. Ray and J. Ligatti. Defining injection attacks. Technical report, University of South
Florida, 2014. http://www.cse.usf.edu/~ligatti/papers/bronies.pdf.

[10] D. Ray and J. Ligatti. Defining code-injection attacks. In Proceedings of the Symposium on
Principles of Programming Languages, POPL, pages 179–190, 2012.

[11] W. Enck, M. Ongtang, P.D. McDaniel, et al. Understanding android security. IEEE Security &
Privacy, 7(1):50–57, 2009.

[12] T. Vidas, D. Votipka, and N. Christin. All your droid are belong to us: A survey of current
android attacks. In USENIX Workshop on Offensive Technologies, WOOT, pages 81–90, 2011.

[13] C. Orthacker, P. Teufl, S. Kraxberger, G. Lackner, M. Gissing, A. Marsalek, J. Leibetseder,
and O. Prevenhueber. Android security permissions can we trust them? In Security and Privacy in
Mobile Information and Communication Systems, volume 94 of Lecture Notes of the Institute for
Computer Sciences, Social Informatics and Telecommunications Engineering, pages 40–51. Springer
Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-30244-

[14] Y. Zhou, Z. Wang, W. Zhou, and X. Jiang. Hey, you, get off of my market: Detecting malicious
apps in official and alternative android markets. In NDSS, 2012.

[15] F. Di Cerbo, A. Girardello, F. Michahelles, and S. Voronkova. Detection of malicious
applications on android os. In Computational Forensics, pages 138–149. Springer Berlin Heidelberg,
2011.

[16] P. Bisht, P. Madhusudan, and V. N. Venkatakrishnan. Candid: Dynamic candidate evaluations for
automatic prevention of SQL injection attacks. ACM Trans. Inf. Syst. Secur., 13(2):14:1–14:39,
March 2010. http://doi.acm.org/10.1145/1698750.1698754.

[17] Z. Su and G. Wassermann. The essence of command injection attacks in web applications. In
Conference Record of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
POPL, pages 372–382, New York, NY, USA, 2006. http://doi.acm.org/10.1145/1111037.1111070.

[18] G. Wassermann and Z. Su. Sound and precise analysis of web applications for injection
vulnerabilities. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and
Implementation, PLDI, pages 32–41, 2007. http://doi.acm.org/10.1145/1250734.1250739.

[19] Oracle. How to write SQL injection proof PL/SQL, 2008.
http://www.oracle.com/technetwork/database/features/plsql/overview/how-to-write-injection-proof-plsql-1-129572.pdf.

[20] G.-V. Jourdan. Securing large applications against command injections. In Security
Technology, 41st Annual IEEE International Carnahan Conference on, pages 69–78, 2007.

[21] W.G.J. Halfond, A. Orso, and P. Manolios. Wasp: Protecting web applications using positive
tainting and syntax-aware evaluation. Software Engineering, IEEE Transactions on, 34(1):65–81,
Jan 2008.

[22] S. Son, K. S. McKinley, and V. Shmatikov. Diglossia: detecting code injection attacks with
precision and efficiency. In Proceedings of the ACM SIGSAC conference on Computer & communications
security, CCS, pages 1181–1192, New York, NY, USA, 2013. http://doi.acm.org/10.1145/2508859.2516696.

[23] A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically hardening
web applications using precise tainting. In Security and Privacy in the Age of Ubiquitous
Computing, IFIP 20th International Conference on Information Security (SEC), pages 295–308, 2005.

[24] Florian N., Nenad J., Engin K., Christopher K., and Giovanni V. Cross-site scripting
prevention with dynamic data tainting and static analysis. In Proceedings of the Network and
Distributed System Security Symposium, 2007.

[25] W. Xu, S. Bhatkar, and R. Sekar. Taint-enhanced policy enforcement: A practical approach to
defeat a wide range of attacks. In Proceedings of the 15th Conference on USENIX Security
Symposium - Volume 15, USENIX-SS, Berkeley, CA, USA, 2006.
http://dl.acm.org/citation.cfm?id=1267336.1267345.

[26] M. Bravenboer, E. Dolstra, and E. Visser. Preventing injection attacks with syntax
embeddings. In Proceedings of the 6th International Conference on Generative Programming and
Component Engineering, GPCE, pages 3–12, New York, NY, USA, 2007.
http://doi.acm.org/10.1145/1289971.1289975.

[27] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth.
Taintdroid: An information flow tracking system for real-time privacy monitoring on smartphones.
Commun. ACM, 57(3):99–106, March 2014. http://doi.acm.org/10.1145/2494522.

[28] C. Gibler, J. Crussell, J. Erickson, and H. Chen. Androidleaks: Automatically detecting
potential privacy leaks in android applications on a large scale. In Trust and Trustworthy
Computing, volume 7344 of Lecture Notes in Computer Science, pages 291–307. ACM, 2012.
http://dx.doi.org/10.1007/978-3-642-30921-2_17.

[29] P. Hornyack, S. Han, J. Jung, S. Schechter, and D. Wetherall. These aren’t the droids you’re
looking for: Retrofitting android to protect data from imperious applications. In Proceedings of
the 18th ACM Conference on Computer and Communications Security, CCS, pages 639–652, New York, NY,
USA, 2011. http://doi.acm.org/10.1145/2046707.2046780.

[30] Y. Zhou, X. Zhang, X. Jiang, and V. Freeh. Taming information-stealing smartphone applications
(on android). In Trust and Trustworthy Computing, volume 6740 of Lecture Notes in Computer Science,
pages 93–107. ACM, 2011. http://dx.doi.org/10.1007/978-3-642-21599-5_7.

[31] G. Sarwar, O. Mehani, R. Boreli, and D. Kaafar. On the effectiveness of dynamic taint analysis
for protecting against private information leaks on android-based devices. In Pierangela Samarati,
editor, SECRYPT, 10th International Conference on Security and Cryptography, pages 461–467,
Reykjavik, Iceland, July 2013.

[32] Google. Android-apktool, 2014. https://code.google.com/p/android-apktool/.

[33] Oracle. Class Runtime Javadoc, 1993.
http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html.

[34] Oracle. Class Process Builder Javadoc, 1993.
http://docs.oracle.com/javase/7/docs/api/java/lang/ProcessBuilder.html.