An attacker discovers the structure, function, and composition of an object, resource, or system by using a variety of analysis techniques to effectively determine how the analyzed entity was constructed or operates. The goal of reverse engineering is often to duplicate the function, or a part of the function, of an object in order to duplicate or "back engineer" some aspect of its functioning. Reverse engineering techniques can be applied to mechanical objects, electronic devices or components, or to software, although the methodology and techniques involved in each type of analysis differ widely.
Software Reverse Engineering
An attacker discovers the structure, function, and composition of a type of computer software by using a variety of analysis techniques to effectively determine how the software functions and operates, or if vulnerabilities or security weakness are present within the implementation. Reverse engineering methods, as applied to software, can utilize a wide number approaches and techniques.
Methodologies for software reverse engineering fall into two broad categories, 'white box' and 'black box.' White box techniques involve methods which can be applied to a piece of software when an executable or some other compiled object can be directly subjected to analysis, revealing at least a portion of its machine instructions that can be observed upon execution. 'Black Box' methods involve interacting with the software indirectly, in the absence of the ability to measure, instrument, or analyze an executable object directly. Such analysis typically involves interacting with the software at the boundaries of where the software interfaces with a larger execution environment, such as input-output vectors, libraries, or APIs.
Reverse Engineer an Executable to Expose Assumed Hidden Functionality or Content
An attacker analyzes a binary file or executable for the purpose of discovering the structure, function, and possibly source-code of the file by using a variety of analysis techniques to effectively determine how the software functions and operates. This type of analysis is also referred to as Reverse Code Engineering, as techniques exist for extracting source code from an executable.
Several techniques are often employed for this purpose, both black box and white box. The use of computer bus analyzers and packet sniffers allows the binary to be studied at a level of interactions with its computing environment, such as a host OS, inter-process communication, and/or network communication. This type of analysis falls into the 'black box' category because it involves behavioral analysis of the software without reference to source code, object code, or protocol specifications.
White box analysis techniques include file or binary analysis, debugging, disassembly, and decompilation, and generally fall into categories referred to as 'static' and 'dynamic' analysis. Static analysis encompasses methods which analyze the binary, or extract its source code or object code without executing the program. Dynamic analysis involves analyzing the program during execution.
Some forms of file analysis tools allow the executable itself to be analyzed, the most basic of which can analyze features of the binary, such as the strings contained within the file. More sophisticated forms of static analysis analyze the binary file and extract assembly code, and possibly source code representations, from analyzing the structure of the file itself. Dynamic analysis tools execute the binary file and monitor its in memory footprint, revealing its execution flow, memory usage, register values, and machine instructions. This type of analysis is most effective for analyzing the execution of binary files whose content has been obfuscated or encrypted in its native executable form.
Debuggers allow the program's execution to be monitored, and depending upon the debugger's sophistication may show relevant source code for each step in execution, or may display and allow interactions with memory, variables, or values generated by the program during run-time operations. Disassemblers operate in reverse of assemblers, allowing assembly code to be extracted from a program as it executes machine code instructions. Disassemblers allow low-level interactions with the program as it executes, such as manipulating the program's run time operations. Decompilers can be utilized to analyze a binary file and extract source code from the compiled executable. Collectively, the tools and methods described are those commonly applied to a binary executable file and provide means for reverse engineering the file by revealing the hidden functions of its operation or composition.
Read Sensitive Strings Within an Executable
An attacker engages in activities to discover any sensitive strings are present within the compiled code of an executable, such as literal ASCII strings within the file itself, or possibly strings hard-coded into particular routines that can be revealed by code refactoring methods including static and dynamic analysis.
One specific example of a sensitive string is a hard-coded password. Typical examples of software with hard-coded passwords include server-side executables which may check for a hard-coded password or key during a user's authentication with the server. Hard-coded passwords can also be present in client-side executables which utilize the password or key when connecting to either a remote component, such as a database server, licensing server, or otherwise, or a processes on the same host that expects a key or password.
When analyzing an executable the attacker may search for the presence of such strings by analyzing the byte-code of the file itself. Example utilities for revealing strings within a file include 'strings,' 'grep,' or other variants of these programs depending upon the type of operating system used. These programs can be used to dump any ASCII or UNICODE strings contained within a program. Strings can also be searched for using a hex editors by loading the binary or object code file and utilizing native search functions such as regular expressions.
More sophisticated methods of searching for sensitive strings within a file involve disassembly or decompiling of the file. One could, for example, utilize disassembly methods on an ISAPI executable or dll to discover a hard-coded password within the code as it executes. This type of analysis usually involves four stages in which first a debugger is attached to the running process, anti-debugging countermeasures are circumvented or bypassed, the program is analyzed step-by-step, and breakpoints are established so that discrete functions and data structures can be analyzed.
Debugging tools such as SoftICE, Ollydbg, or vendor supplied debugging tools are often used. Disassembly tools such as IDA pro, or similar tools, can also be employed. A third strategy for accessing sensitive strings within a binary involves the decompilation of the file itself into source code that reveals the strings. An example of this type of analysis involves extracting source code from a java JAR file and then using functionality within a java IDE to search the source code for sensitive, hard-coded information. In performing this analysis native java tools, such as "jar" are used to extract the compiled class files. Next, a java decompiler such as "DJ" is used to extract java source code from the compiled classes, revealing source code. Finally, the source code is audited to reveal sensitive information, a step that is usually assisted by source code analysis programs.
Protocol Reverse Engineering
An attacker engages in activities to decipher and/or decode protocol information for a network or application communication protocol used for transmitting information between interconnected nodes or systems on a packet-switched data network. While this type of analysis inherently involves the analysis of a networking protocol, it does not require the presence of an actual or physical network. Although certain techniques for protocol analysis benefit from manipulating live 'on-the-wire' interactions between communicating components, static or dynamic analysis techniques applied to executables as well as to device drivers such as network interface drivers, can also be used to reveal the function and characteristics of a communication protocol implementation.
Depending upon the methods used, protocol reverse engineering can involve similar methods as those employed when reverse engineering an executable, or the process may involve observing, interacting, and modifying actual communications occurring between hosts. The goal of protocol reverse engineering is to derive the data transmission syntax, as well as to extract the meaningful content, including packet or content delimiters used by the protocol. This type of analysis is often performed on closed-specification protocols, or proprietary protocols, but is also useful for analyzing publicly available specifications to determine how particular implementations deviate from published specifications.
There are several challenges inherent to protocol reverse engineering depending upon the nature of the protocol being analyzed. There may also be other types of factors which complicate the process such as encryption or ad hoc obfuscation of the protocol. In general there are two kinds of networking protocols, each associated with its own challenges and analysis approaches or methodologies. Some protocols are human-readable, which is to say they are text-based protocols. Examples of these types of protocols include HTTP, SMTP, and SOAP. Additionally, application-layer protocols can be embedded or encapsulated within human-readable protocols in the data portion of the packet. Typically, human-readable protocol implementations are susceptible to automatic decoding by the appropriate tools, such as Wireshark/ethereal, tcpdump, or similar protocol sniffers or analyzers.
The presence of well-known protocol specifications in addition to easily identified protocol delimiters, such as Carriage Return or Line Feed characters (CRLF) result in text-based protocols susceptibility to direct scrutiny through manual processes. Protocol reverse engineering against protocol implementation such as HTTP is often performed to identify idiosyncratic implementations of a protocol by a server or client. In the case of application-layer protocols which are embedded within text-based protocols, analysis techniques typically benefit from the well-known nature of the encapsulating protocols and can focus on discovering the semantic characteristics of the proprietary protocol or API, since the syntax and protocol delimiters of the underlying protocols can be readily identified.
When performing protocol analysis of machine-readable (non-text-based) protocols difficulties emerge as the protocol itself was designed to be read by computing process. Such protocols are typically composed entirely in binary with no apparent syntax, grammar, or structural boundaries. Examples of these types of protocols are IP, UDP, and TCP. Binary protocols with published specifications can be automatically decoded by protocol analyzers, but in the case of proprietary, closed-specification, binary protocols there are no immediate indicators of packet syntax such as packet boundaries, delimiters, or structure, or the presence or absence of encryption or obfuscation. In these cases there is no one technology that can extract or reveal the structure of the packet on the wire, so it is necessary to use trial and error approaches while observing application behavior based on systematic mutations introduced at the packet-level. Tools such as Protocol Debug (PDB) or other packet injection suites are often employed. In cases where the binary executable is available, protocol analysis can be augmented with static and dynamic analysis techniques.
Try Common(default) Usernames and Passwords
An attacker may try certain common (default) usernames and passwords to gain access into the system and perform unauthorized actions. An attacker may try an intelligent brute force using known vendor default credentials as well as a dictionary of common usernames and passwords.
Many vendor products come preconfigured with default (and thus well-known) usernames and passwords that should be deleted prior to usage in a production environment. It is a common mistake to forget to remove these default login credentials. Another problem is that users would pick very simple (common) passwords (e.g. "secret" or "password") that make it easier for the attacker to gain access to the system compared to using a brute force attack or even a dictionary attack using a full dictionary.