Reverse-engineering is the act of dismantling an object to see how it works. It is done primarily to analyze and gain knowledge about the way something works but often is used to duplicate or enhance the object. Many things can be reverse-engineered, including software, physical machines, military technology and even biological functions related to how genes work.
The practice of reverse-engineering as applied to computer hardware and software is taken from older industries. Software reverse-engineering focuses on a program's machine code -- the string of 0s and 1s that is sent to the processor. Programming language statements are then used to express that machine code as source code that approximates the original.
Depending on the technology, the knowledge gained during reverse-engineering can be used to repurpose obsolete objects, do a security analysis, gain a competitive advantage or simply to teach someone about how something works. No matter how the knowledge is used or what it relates to, reverse-engineering is the process of gaining that knowledge from a finished object.
What is the purpose of reverse-engineering?
The purpose of reverse-engineering is to find out how an object or system works. There are a variety of reasons to do this. Reverse-engineering can be used to learn how something works and to recreate the object or to create a similar object with added enhancements.
Often the goal of reverse-engineering software or hardware is to create a similar product, either to make it less expensively or because the original product is no longer available. Reverse-engineering in information technology is also used to address compatibility issues and make the hardware or software work with other hardware, software or operating systems that it wasn't originally compatible with.
Apple's Logic Pro software, which lets musicians compose, record, arrange, edit and mix music, is a good example. Logic Pro is only available for Mac devices, and it is relatively expensive. The program has several proprietary digital instruments. With a bit of investigation, a programmer could reverse-engineer those digital instruments, figure out how they work and customize them for use in Logic Pro or make them interoperable with other music software that is compatible with Windows.
How does the reverse-engineering process work?
The reverse-engineering process is specific to the object on which it's being performed. However, no matter the context, three general steps are common to all reverse-engineering efforts:
- Information extraction. The object being reverse-engineered is studied, information about its design is extracted and that information is examined to determine how the pieces fit together. In software reverse-engineering, this might require gathering source code and related design documents for study. It may also involve the use of tools, such as a disassembler, to break the program into its constituent parts.
- Modeling. The collected information is abstracted into a conceptual model, with each piece of the model explaining its function in the overall structure. The purpose of this step is to take information specific to the original and abstract it into a general model that can be used to guide the design of new objects or systems. In software reverse-engineering, this might take the form of a data flow diagram or a structure chart.
- Review. This involves reviewing the model and testing it in various scenarios to ensure it is a realistic abstraction of the original object or system. In software engineering, this might take the form of software testing. Once it is tested, the model can be implemented to reengineer the original object.
Software reverse-engineering involves the use of several tools. One tool is a hexadecimal dumper, which prints or displays a program's binary contents in hexadecimal notation. By knowing the bit patterns that represent the processor instructions, as well as the instruction lengths, the reverse-engineer can identify portions of a program to see how they work.
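
To make the idea of a hexadecimal dumper concrete, here is a minimal sketch in Python. The function name `hex_dump`, the 16-byte row width and the sample bytes are assumptions chosen for illustration; real tools offer many more options.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as rows of: offset, hex values, printable ASCII."""
    pad = width * 3 - 1  # column width of a full row of hex pairs
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as "." in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  |{ascii_part}|")
    return "\n".join(lines)


# Hypothetical sample: an ELF-style magic number followed by a text string.
sample = b"\x7fELF\x02\x01\x01\x00" + b"Hello, world!\n"
print(hex_dump(sample))
```

In the output, the first four bytes appear as `7f 45 4c 46`, and the ASCII column makes the embedded string easy to spot. Recognizing patterns like these is usually the first step before looking at individual instructions.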
Another software reverse-engineering tool is the disassembler. It reads the binary code and displays each executable instruction as text. A disassembler cannot reliably tell the difference between an executable instruction and the data the program uses, so a debugger is used to keep the disassembler from disassembling the data portions of a program. These tools might also be used by a computer cracker to gain entry to a computer system or cause other harm.
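
The following toy disassembler sketches how bit patterns and instruction lengths turn bytes into text. The instruction set is invented for this example (the opcodes, mnemonics and operand counts are all assumptions) and does not correspond to any real processor.

```python
# Hypothetical 8-bit instruction set, invented for illustration only.
# opcode -> (mnemonic, number of operand bytes)
OPCODES = {
    0x01: ("LOAD", 2),  # LOAD reg, imm8
    0x02: ("ADD", 2),   # ADD reg, reg
    0x03: ("JMP", 1),   # JMP addr8
    0xFF: ("HALT", 0),
}


def disassemble(code: bytes) -> list[str]:
    """Linear sweep: decode from offset 0, one instruction after another."""
    out = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        entry = OPCODES.get(opcode)
        operands = code[pc + 1 : pc + 1 + entry[1]] if entry else b""
        if entry is None or len(operands) < entry[1]:
            # Unknown opcode or truncated instruction: emit a raw byte.
            out.append(f"{pc:04x}: .byte 0x{opcode:02x}")
            pc += 1
            continue
        text = ", ".join(f"0x{b:02x}" for b in operands)
        out.append(f"{pc:04x}: {entry[0]} {text}".rstrip())
        pc += 1 + entry[1]  # the instruction length decides where the next one starts
    return out


program = bytes([
    0x01, 0x00, 0x2A,  # LOAD r0, 42
    0x02, 0x00, 0x01,  # ADD r0, r1
    0xFF,              # HALT
    0x03, 0x01,        # data bytes that happen to look like "JMP 0x01"
])
for line in disassemble(program):
    print(line)
```

The last two bytes are data, yet the linear sweep prints them as `0007: JMP 0x01`. That is the instruction-versus-data ambiguity described above: the disassembler needs outside information, such as what actually executes at run time, to know which bytes are code.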
Computer-aided design (CAD) is a reverse-engineering technique used to recreate a manufactured part when the original blueprint is no longer available. It involves producing 3D images of the part so it can be remanufactured. A coordinate measuring machine measures the part, and as it is measured, a 3D wire frame image is generated using CAD software and displayed on a monitor. After the measuring is complete, the wire frame image is dimensioned. Any part can be reverse-engineered using these methods.
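
As a rough sketch of the dimensioning step, the snippet below computes edge lengths for a wire frame from measured points. The coordinates, vertex labels and the `dimension` helper are hypothetical. A real coordinate measuring machine and CAD package would supply far more points and handle curves, tolerances and measurement error.

```python
import math

# Hypothetical probe readings in millimetres for five corners of a block.
vertices = {
    "A": (0.0, 0.0, 0.0),
    "B": (40.0, 0.0, 0.0),
    "C": (40.0, 25.0, 0.0),
    "D": (0.0, 25.0, 0.0),
    "E": (0.0, 0.0, 10.0),
}

# Wire frame edges, each joining two measured vertices.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "E")]


def dimension(points, wire_edges):
    """Return the straight-line length of every edge in the wire frame."""
    return {
        (start, end): math.dist(points[start], points[end])
        for start, end in wire_edges
    }


for (start, end), length in dimension(vertices, edges).items():
    print(f"{start}-{end}: {length:.1f} mm")
```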
Examples of reverse-engineering
Reverse-engineering varies depending on what it is being applied to and the purpose of reverse-engineering the technology. Common examples include the following.
Software
There are several instances where reverse-engineering is used to disassemble software. A common example is to adapt a program written for use with one microprocessor to another. Other examples include reconstructing lost source code, studying how a program performs certain operations, improving performance and fixing bugs or correcting errors when the source code is not available.
One example is Phoenix, a U.S. software company that created basic input/output system (BIOS) software that was compatible with IBM's proprietary version. To do this, Phoenix reverse-engineered the IBM version in a way that protected it from copyright charges, by recording the steps it followed and not referencing the proprietary code.
Malware is another area where software reverse-engineering is used. Threat actors often use software code obfuscation to keep their malicious code from being discovered or understood. The owners of infected software or systems can use reverse-engineering to identify malicious content, such as a virus. The U.S. Defense Intelligence Agency has said it intended to use these techniques to reverse-engineer enemy malware to create its own offensive cyberweapons. Tools are available to aid in malware reverse-engineering, such as the National Security Agency's Ghidra software, which has been used to reverse-engineer the WannaCry malware, for instance.
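
As a small illustration of undoing obfuscation, the sketch below assumes one very simple scheme: a string hidden by XORing every byte with a single key. This is an assumption for the example, not a description of any particular malware, and the function names, the key and the `example.invalid` URL are all hypothetical.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def find_xor_key(blob: bytes, marker: bytes) -> list[tuple[int, bytes]]:
    """Try all 256 one-byte keys and keep those whose output contains the marker."""
    hits = []
    for key in range(256):
        decoded = xor_bytes(blob, key)
        if marker in decoded:
            hits.append((key, decoded))
    return hits


# Hypothetical obfuscated string, as an analyst might find it inside a sample.
hidden = xor_bytes(b"http://example.invalid/payload", 0x5A)

for key, decoded in find_xor_key(hidden, b"http"):
    print(f"key=0x{key:02x} -> {decoded!r}")
```

Because the key space is only 256 values, trying every key and searching for a plausible marker such as `http` is enough here. Stronger obfuscation calls for the disassembly and debugging tools described earlier.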
Computer parts
If a processor manufacturer wants to see how a competitor's processor works, it can buy one, reverse-engineer it and then use what it learns to make its own processor. This process is illegal in many countries, requires a great deal of expertise and is expensive. Reverse-engineering is often used to create replacement parts when the original parts for legacy equipment are no longer available. Reverse-engineering of computer parts is also done to enhance security. For example, Google's Project Zero identified vulnerabilities in microprocessors using reverse-engineering.
Network security assessments
Companies doing network security assessments also use reverse-engineering as one of their tools. They divide their security group into two teams. One team simulates attacks, and the other team monitors the network and reverse-engineers the other team's attacks. The information gained from these mock attacks is used to strengthen the corporate network.
Legal and ethical challenges with reverse-engineering
In the U.S., reverse-engineering is generally considered a legal way to learn about a product as long as the original version is obtained legally and no other contractual agreements are broken. U.S. trade laws aim to allow for reverse-engineering if it is in the interest of improving the product or creating interoperability with other products that were previously incompatible.
Reverse-engineering a patented product is generally legal under the Defend Trade Secrets Act, but there are situations where its legality is questionable. Patent owners have legal recourse against anyone copying their inventions.
Reverse-engineering software for the purpose of copying or duplicating a program may constitute a copyright law violation. Some software licenses specifically prohibit reverse-engineering. Other contractual agreements can also limit the use of reverse-engineering to gain access to code, including terms of service or use notices and nondisclosure and other types of developer agreements.
Technological protection measures (TPMs), such as passwords, encryption and access control devices, are often used to control access to software and other digital copyrighted content. Circumventing TPMs can raise legal issues.
The various laws pertaining to reverse-engineering include the following:
- patent law;
- copyright and fair use law;
- trade secret law;
- anticircumvention provisions of the Digital Millennium Copyright Act;
- Electronic Communications Privacy Act; and
- any contract law specific to the product in question.
One way to reverse-engineer a product and develop new software without infringing patents or copyrights is to use a clean room, or ethical wall, technique. Two separate groups of programmers work on the project: one studies the original and writes a specification of what it does, and the other builds the new product from that specification alone. This separation ensures that the original is not directly copied.
Reverse-engineering is a complicated area of ethics and law. The proliferation of information technology in many sectors of everyday life is making it even more complicated.
The takeaway
Reverse-engineering has many legitimate uses in IT. It can be both a legal and ethical approach to address compatibility issues, recreate legacy parts, do security assessments, improve upon an existing product or produce it less expensively.
The steps involved are complicated and vary depending upon what is being reverse-engineered. For example, QA professionals looking to address user issues with software products can reverse-engineer a complaint to get to its cause. Identifying the root causes of user problems isn't easy, but reverse-engineering techniques eliminate some of the guesswork.