Introduction to Code Obfuscation

Part 1

Introduction

Remember the last time you went on vacation with your family? You packed your bags with the things you needed and left home, right? I am certain you didn't take all your belongings with you. You left some of them behind at your apartment or house, wherever you live. But can you recall whether you locked your house or not?

Absolutely. Now, coming straight to the point, why did you do that? The answer is obvious: to prevent anyone from trespassing in your absence and to keep your stuff safe. For this, you might have security devices installed, such as CCTV cameras, sensors, and, most basic of all, locks.

Yes, we are still talking about code!

To understand a technical topic, you don't need to know all of its jargon. Sometimes the best way is to relate it to something we have observed ourselves, probably several times.

Think of your source code as the stuff you left behind at home. When the code is deployed over the internet, you make it publicly available. Thieves may try to steal it, which in a software context is called piracy. So what do you do to prevent this theft? The simplest answer: just lock it!

Yes, you read that right, just lock it. But since software is not a physical commodity, we can't lock it the way we lock our house. Instead, we have a lot of techniques to choose from, among which my favorite is code obfuscation.

Code obfuscation is a technique for safeguarding your code so that an unauthorized person cannot easily read and understand the logic written in it.

More precisely, code obfuscation makes the source code of an application difficult to read and comprehend, so that it becomes very hard for any unauthorized third party, even with the tools available to them, to reverse engineer it.

We can also describe code obfuscation as converting our source code into a form that produces the same output as the original, un-obfuscated code but is no longer human-readable.

The burglar who invaded your place (here, your code) might use your stuff (the code logic) for personal gain or try to exploit it. In technical terms, this is called tampering with the application or bypassing the restrictions imposed by its license.

Need for Code Obfuscation

Nowadays, the code for many kinds of applications (mobile applications, web applications, and so on) is available on open-source platforms, where people can read it, suggest optimizations, and report bugs that went unnoticed. But this transparency also lets some groups or individuals reverse engineer the code to exploit it for personal gain. They may try to tamper with the application or bypass the restrictions imposed by its license.

Thus, to protect our code from such malicious users, we need to add some security to it, and code obfuscation is one way to provide that.

Methods and Techniques

Obfuscation methods are classified depending on the information they target. Some techniques target the lexical structure of the program while others target its data structures or the control flow.

Just like the security measures we take to protect our stuff, code obfuscation provides techniques to keep our code from being exploited. How much security we add depends on our needs. Some major code obfuscation techniques are:

1. Data Obfuscation: This targets the data structures of a program, either by replacing a variable with an encoded form of its value (e.g., storing c1*i + c2 instead of i) or by changing the form in which the data is stored.

Basically, we are trying to confuse the burglar about our stuff. They will be busy figuring out what stuff is lying there in front of them.
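As a rough sketch of the c1*i + c2 idea from data obfuscation, the loop below keeps its counter in encoded form and decodes it only when the real value is needed. The constants `C1` and `C2` and all function names are assumptions chosen for illustration.

```python
C1, C2 = 8, 3  # arbitrary encoding constants (assumed for this sketch)


def encode(i):
    return C1 * i + C2


def decode(v):
    return (v - C2) // C1


# Original: a plain counter i running from 0 up to limit.
def count_plain(limit):
    steps = []
    i = 0
    while i < limit:
        steps.append(i)
        i += 1
    return steps


# Obfuscated: the counter is stored as v = C1*i + C2 and never appears as i.
def count_encoded(limit):
    steps = []
    v = encode(0)                # stands for i = 0
    while v < encode(limit):     # stands for i < limit
        steps.append(decode(v))
        v += C1                  # stands for i += 1
    return steps


print(count_plain(5))    # [0, 1, 2, 3, 4]
print(count_encoded(5))  # [0, 1, 2, 3, 4]
```

An actual obfuscator would also fold the helper calls into literal numbers (here, a start value of 3 and a step of 8), so the reader never sees the encoding formula at all.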

2. Layout Obfuscation: This targets the layout or appearance of the code. It may change the indentation, rename variables to meaningless names, or add or delete comments.

Pretty much the same as the previous technique; we are exhibiting our couch as our bed.
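A layout obfuscator can be sketched with Python's standard `ast` module: parsing drops the comments, and a small transformer renames arguments and variables. The sample source and the `Renamer` class are assumptions for illustration, and `ast.unparse` needs Python 3.9 or later. This naive version would also rename built-ins such as `print`, so it only suits snippets like the one shown.

```python
import ast

SOURCE = '''
def rectangle_area(width, height):
    # Area is width times height.
    area = width * height
    return area
'''


class Renamer(ast.NodeTransformer):
    """Replace every argument and variable name with _0, _1, _2, ..."""

    def __init__(self):
        self.mapping = {}

    def _short(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"_{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._short(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._short(node.id)
        return node


tree = Renamer().visit(ast.parse(SOURCE))  # parsing already discards comments
obfuscated = ast.unparse(tree)
print(obfuscated)
```

The function name is left alone on purpose so that callers still work; only the names inside the function lose their meaning.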

3. Control Obfuscation: This manipulates the statements written in the code. For example, it may replace a function call with the function's body, or include the complete library or module used in the code. It may also alter the code flow by adding dead code.

In this technique, we simply change the layout of our place. The burglar may think they're going into the bedroom, where our precious things might be kept, but they end up in the bathroom.
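Two of the control obfuscation moves, inlining a function call and adding dead code, can be sketched as follows. The ticket-pricing functions are made up for illustration, and the always-false condition guarding the dead code (often called an opaque predicate) is one common way to do it, not the only one.

```python
# Original: the age check lives in a small, clearly named helper.
def is_adult(age):
    return age >= 18


def ticket_price(age):
    if is_adult(age):
        return 12
    return 8


# Obfuscated (hypothetical): the helper is inlined and dead code is added.
def ticket_price_obf(age):
    k = age * age + age      # age * (age + 1) is always even for integers
    if k % 2 == 1:           # so this condition is never true
        return age - 99      # dead code: never executed
    if age >= 18:            # body of is_adult, inlined
        return 12
    return 8


print(ticket_price(30), ticket_price_obf(30))  # 12 12
print(ticket_price(10), ticket_price_obf(10))  # 8 8
```

Someone reading the second version has to work out that the first branch can never run before they can see the real logic.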

You might be tempted to apply every one of these techniques at once. I am afraid to say it, but this won't be a good idea, because adding security to the code also impacts its performance. So, we have to keep that in mind as well.

Impact on Code Performance

Imagine you have put hundreds or thousands of locks on your place. When you come back, you need to unlock them all, right? And that will take time. The more locks you add, the more time you spend unlocking them or, in the case of code, the more time it takes to process any request for data or data manipulation.

Usually, the slowdown ranges from about 15% to 80%, depending on the obfuscation options used. The more options we enable, the slower our code's execution will be.
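The cost of obfuscation can be measured rather than guessed. The sketch below times a plain loop against a version padded with a never-true check, using the standard `timeit` module. The two functions are assumptions for illustration, and the printed numbers will differ from machine to machine; they are not meant to reproduce the range quoted above.

```python
import timeit


def total_plain(n):
    total = 0
    for i in range(n):
        total += i
    return total


def total_obfuscated(n):
    total = 0
    for i in range(n):
        if (i * i + i) % 2 == 1:  # never true, but still evaluated every pass
            total -= 1
        total += i
    return total


plain_time = timeit.timeit(lambda: total_plain(10_000), number=200)
obf_time = timeit.timeit(lambda: total_obfuscated(10_000), number=200)

print(f"plain:      {plain_time:.4f}s")
print(f"obfuscated: {obf_time:.4f}s")
print(f"slowdown:   {(obf_time / plain_time - 1) * 100:.0f}%")
```

Running a comparison like this on your own code, with the options you actually plan to enable, is the most reliable way to decide how many "locks" are worth their cost.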

Debugging in Obfuscated Code

Since obfuscation changes the structure of the code, along with some of its data, we cannot debug the obfuscated code directly. To make debugging possible, a source map file is generated during obfuscation. This file is given to de-obfuscating tools, which then produce a stack trace of the errors that occurred, with line numbers that refer to the original source code.

The source map is effectively the key to our house. So, it must be clear by now that we have to safeguard this key so that only we can unlock it. Pretty simple, right?
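The idea of translating an obfuscated stack trace back to the original source can be sketched with a toy lookup table. Everything here is hypothetical: the file names, the line numbers, and the dictionary format, which is a deliberate simplification and not the format real source map files use.

```python
# Hypothetical, simplified source map:
# obfuscated line number -> (original file, original line number)
SOURCE_MAP = {
    1: ("cart.py", 12),
    2: ("cart.py", 13),
    3: ("cart.py", 27),
}


def deobfuscate_trace(trace, source_map):
    """Translate (file, line) frames of an obfuscated trace to original ones."""
    restored = []
    for filename, line in trace:
        # Frames without a mapping are passed through unchanged.
        restored.append(source_map.get(line, (filename, line)))
    return restored


obfuscated_trace = [("cart.min.py", 3), ("cart.min.py", 1)]
print(deobfuscate_trace(obfuscated_trace, SOURCE_MAP))
# [('cart.py', 27), ('cart.py', 12)]
```

Anyone who holds this table can undo much of the obfuscation, which is why the mapping file should never be shipped with the obfuscated code.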

I hope you have a good idea of code obfuscation techniques by now. The technique is not foolproof, and debugging remains challenging even with de-obfuscating tools, but it still provides a good way to enhance code security.

This part of the article was just a brief introduction to code obfuscation and its techniques. In the next part, we will focus on its implementation. So, to find out more about code obfuscation, stay tuned.

Now that you understand code obfuscation, the next step is implementing it, which we discuss in Part 2: Implementing Code Obfuscation (DLT Labs on medium.com).