Friday, October 4, 2019

Malware Analysis - Part III(A) - Basic Static Analysis


Malware Analysis


Hi guys!! In the last part we set up the lab for malware analysis. Before analysing malware, we must know the basic theory of Static Malware Analysis. In this blog, we will go through Static Malware Analysis and the techniques used to perform it.


Static Analysis

Static Analysis is the first step in studying malware. To perform it, you need a malware file to analyze. Moving forward, you will hear two terms most often: Malware Analysts and Malware Writers/Authors. Malware Writers/Authors are the ones who write the malware, and Malware Analysts are the ones who analyze the malware and its functionality. Malware Analysts are mostly given executable files to analyze. In this blog, we will look at multiple ways to extract useful information from these executables. For Basic Static Analysis, the following techniques will be discussed:

  1. Antivirus Scanning
  2. Hashing
  3. Extracting Strings
  4. Packed and Obfuscated Malware
  5. Portable Executable File Format

Each of the above techniques will provide different information, and the ones you will use will depend on your goals. Typically, you'll be using several methods to gather as much information as possible.

Techniques

1. Antivirus Scanning

The first step in analyzing malware is to see whether it has already been identified and is publicly known. A good way to do this is to run it against multiple antivirus programs. Antivirus programs mainly rely on a database of file signatures of known malware, as well as behavioral and pattern-matching analysis, to identify suspicious files. Running the malware against antivirus tools therefore shows us whether they have already identified it. But antivirus tools are certainly not perfect. One of the main problems is that malware writers can easily modify their code, thereby changing their program's signature and bypassing virus scanners. Also, rare malware often goes undetected by antivirus software because it's simply not in the database.

  • VirusTotal - VirusTotal inspects items with over 70 antivirus scanners and URL/domain blacklisting services. It allows you to upload a file for scanning by multiple antivirus engines and generates a report which includes the total number of engines that marked the file as malicious, the malware name, and additional information about the malware if available. 
    • https://www.virustotal.com/
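If you already have a file's hash (see the Hashing section), the lookup can also be scripted instead of done through the website. The sketch below only builds the request for VirusTotal's public v3 API (the files endpoint with an x-apikey header). The function name and the placeholder key are assumptions for illustration, and nothing is sent until you call urlopen yourself.

```python
import urllib.request

VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{file_hash}"
HEX_DIGITS = "0123456789abcdef"


def build_vt_lookup(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a VirusTotal v3 file-report request."""
    file_hash = file_hash.strip().lower()
    # MD5, SHA-1 and SHA-256 digests are 32, 40 and 64 hex characters long.
    if len(file_hash) not in (32, 40, 64):
        raise ValueError("expected an MD5, SHA-1 or SHA-256 hex digest")
    if any(char not in HEX_DIGITS for char in file_hash):
        raise ValueError("digest contains non-hexadecimal characters")
    return urllib.request.Request(
        VT_FILE_URL.format(file_hash=file_hash),
        headers={"x-apikey": api_key},
    )


# Usage (needs a real API key and network access):
# with urllib.request.urlopen(build_vt_lookup(digest, "YOUR_API_KEY")) as resp:
#     report = resp.read()
```

Looking up a hash rather than uploading the sample has the side benefit that the file itself never leaves your lab.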

2. Hashing

Hashing is a common method used to identify malware uniquely. A hashing program takes the file as input and produces a fixed-length hash that serves as a fingerprint for that exact file. The Message-Digest Algorithm 5 (MD5) and Secure Hash Algorithm 1 (SHA-1) hash functions are the ones most commonly used for malware analysis.
  • Md5deep - Available for Linux and Windows. 
  • Md5sum - Available for Linux.
  • WinMD5Free - A utility that computes the MD5 hash value of files. It works with Microsoft Windows 2000, XP, 2003, Vista and Windows 7/8/10.
After you have the unique hash for the malware, it can be used as follows:
  • The hash can be used as a label.
  • Hash can be shared with other analysts to identify malware.
  • Hash can be searched online to see if the file has already been identified.
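If none of the tools above is at hand, the same hashes can be computed with a few lines of Python. This is a minimal sketch using the standard hashlib module. The function name is my own, and SHA-256 is included as an extra because many online services also accept it.

```python
import hashlib


def file_hashes(path: str, chunk_size: int = 65536) -> dict:
    """Return MD5, SHA-1 and SHA-256 hex digests of a file, read in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for large samples.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


# Usage: print(file_hashes("sample.exe"))  # "sample.exe" is a placeholder path
```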

3. Extracting Strings


A program contains strings if it prints a message, connects to a URL, or copies a file to a specific location. A string in a program is a sequence of characters such as "the". Looking through the strings is a simple way to get hints about the functionality of a program.
  • Strings - Strings is a tool available for Linux and Windows. It searches an executable for ASCII and Unicode strings of three or more characters (the Sysinternals default; GNU strings defaults to four). Strings can produce many false positives and leaves it up to the user to filter them out.
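To see what a strings tool does under the hood, here is a rough Python sketch that pulls printable ASCII runs and UTF-16LE ("wide") runs out of a byte buffer. The function name and the default minimum length of 4 are my assumptions, and real tools handle more encodings and edge cases.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as 2-byte little-endian units (char + 0x00).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


# Usage ("sample.exe" is a placeholder path):
# with open("sample.exe", "rb") as handle:
#     for text in extract_strings(handle.read()):
#         print(text)
```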

4. Packed and Obfuscated Malware


Packing and Obfuscation are used by Malware writers to make their files more difficult to detect or analyze. 
  • Obfuscated programs are those whose execution the malware author has attempted to hide. 
  • Packed programs are a subset of obfuscated programs in which the malicious code is compressed and cannot be analyzed statically until it is unpacked.
Note: The diagram below shows the life cycle of a packed program.


Diagram 1

Legitimate programs almost always include many strings. If the malware contains very few strings, it is probably packed or obfuscated. Both techniques limit your ability to analyze the malware statically, so to investigate these kinds of files you'll likely need more than Static Analysis. The functions most commonly found in packed malicious programs are LoadLibrary and GetProcAddress, which are used to load and gain access to additional functions.

  • PEiD - PEiD is a tool used to detect packed files.
  • UPX - UPX can be used to unpack malware that was packed with UPX.
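The "very few strings" hint can be turned into a quick triage check. The sketch below measures what fraction of a file is covered by printable runs and flags files below a threshold. The function names and the 2% threshold are arbitrary assumptions of mine, not an established rule, so treat a hit as a reason to look closer with PEiD rather than as proof of packing.

```python
import re

PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def string_density(data: bytes) -> float:
    """Fraction of bytes that belong to printable ASCII runs of 4+ characters."""
    if not data:
        return 0.0
    covered = sum(len(match.group()) for match in PRINTABLE_RUN.finditer(data))
    return covered / len(data)


def looks_packed(data: bytes, threshold: float = 0.02) -> bool:
    """Flag files with very few strings; the threshold is a placeholder guess."""
    return string_density(data) < threshold


# Usage ("sample.exe" is a placeholder path):
# with open("sample.exe", "rb") as handle:
#     print(looks_packed(handle.read()))
```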

5. Portable Executable File Format


For a malware analyst, the format of a file can reveal a lot about the program's functionality. Portable Executable File Format is a file format used by Windows 32-bit and 64-bit Operating System for executables, DLLs, COM files, .NET executables, Object code, .FON Font files, NT's Kernel-mode drivers, etc. The PE file format contains the information that the Windows OS loader needs to manage the wrapped executable code. Nearly every file with executable code loaded by Windows is in the PE file format, though there may be some exceptions.

Note: For a malware analyst, it is not enough to understand the tools and how they work; you also need to understand the file format in depth. Read about the PE File Format and understand its basic structure.
5.1 Linked Libraries and Functions

The most useful information that can be gathered about an executable during Static Analysis is the list of functions that it imports. Imports are functions used by a program that are actually stored in a different program, such as code libraries that are linked to the main executable. Imports are linked to the programs so that there is no need to re-implement certain functionality in multiple programs. Linking libraries can be done statically, at runtime, or dynamically. 

  • Static Linking - It is the least commonly used method for linking libraries on Windows, although it is common in UNIX and Linux programs. When all the code from the library is copied into the executable, it is referred to as Static Linking. Static Linking increases the size of an executable. It can be difficult to differentiate between statically linked code and the executable's own code, since nothing in the PE file header indicates that the file contains linked code.
  • Dynamic Linking / Runtime Linking - With Dynamic Linking, the most common method, the host OS searches for the required libraries when the program is loaded. Malware that is packed or obfuscated mostly uses Runtime Linking, in which the executable connects to a library only when a function from it is needed, not at program start as with dynamically linked programs.

Common DLLs
  • Kernel32.dll - Core functionality such as access to and manipulation of memory, files, and hardware.
  • Advapi32.dll - Provides access to the Service Manager and Registry.
  • User32.dll - Contains user-interface components such as buttons and scroll bars, and handles user actions.
  • Ntdll.dll - Interface to the Windows Kernel, imported by Kernel32.dll.
  • WSock32.dll & Ws2_32.dll - Networking DLLs.
  • Wininet.dll - Higher-level networking functions that implement protocols like FTP and HTTP.
Functions
  • Imports - An executable may use many different functions. By viewing the PE file header we can gain information about the specific functions used by an executable. This information can be very useful to the Malware Analyst in identifying the functionality of the executable. The Microsoft Developer Network (MSDN) library documents the Windows API that ships with Microsoft products.
  • Exports - DLLs and EXEs export functions to interact with other programs and code. A DLL may implement multiple functions and export them for use by an executable that can then import and use them. Exported functions are most common in DLLs. Exports in an executable can often provide useful information. Exports can be viewed using different tools like Dependency Walker.
Information Revealed in the PE Header -
  • Imports - Functions from other libraries that are used by the malware
  • Exports - Functions in the malware that are meant to be called by other programs or libraries
  • Time Date Stamp - Time when the program was compiled
  • Sections - Names of sections in the file and their sizes on disk and in memory
  • Subsystem - Indicates whether the program is a command-line or GUI application
  • Resources - Strings, icons, menus, and other information included in the file.
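Two of these fields are stored as plain numbers, so they need a little decoding before they mean anything. The sketch below assumes you have already read the raw values with a PE viewer. The helper names are mine, and the subsystem table is only a small subset of the values defined by the PE format. Keep in mind that the compile timestamp is just a field in the file and can be forged.

```python
from datetime import datetime, timezone

# Small subset of the subsystem values defined by the PE format.
SUBSYSTEMS = {1: "native", 2: "Windows GUI", 3: "Windows console (CUI)"}


def describe_timestamp(time_date_stamp: int) -> str:
    """Convert TimeDateStamp (seconds since 1970-01-01 UTC) to ISO 8601."""
    return datetime.fromtimestamp(time_date_stamp, tz=timezone.utc).isoformat()


def describe_subsystem(value: int) -> str:
    """Map the Subsystem field to a readable label."""
    return SUBSYSTEMS.get(value, f"other ({value})")


print(describe_timestamp(1570147200))  # 2019-10-04T00:00:00+00:00
print(describe_subsystem(3))           # Windows console (CUI)
```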
5.2 Tools
Below are some of the tools to gather information about the PE File -
  • PE View
  • Dependency Walker

Conclusion

Static Analysis is a very useful first step, but further analysis is also necessary. Using relatively simple tools, Static Analysis can be performed on malware to gain a certain amount of insight into its function. In the next part, some live examples of static analysis will be shown.

Wednesday, October 2, 2019

Malware Analysis - Part II - Setting up Lab

Malware Analysis



Hi guys! Malware Analysis - Part I - Basics gave you an overview of Malware Analysis. This blog covers everything you need to prepare before we dig into Static and Dynamic Analysis. Let us first understand the advantages and risks of using VMware, and then set up and configure our lab for Malware Analysis. Doing so, we can make sure that the analysis is performed in a safe environment and does not affect the system or network.
Malware Analysis can be performed in an Air-Gapped Network. An Air-Gapped Network allows you to run the Malware in an isolated environment, cut off from the public Internet or an unsecured Local Area Network, without putting other computers at risk. An Air-Gapped Network cannot be used if the Malware needs to interact with the Internet. If Malware Analysis is performed on a physical machine, make sure you use a tool such as Norton Ghost or Clonezilla to manage backup images of its operating system (OS), which can be restored after completion of the analysis.

Requirements

Below are a few software and hardware requirements that need to be set up before moving further.

Software Requirements:

  • VMWare Workstation / VirtualBox (Linux)
  • Windows XP iso
  • Malware Analysis Tools

Hardware Requirements:

  • RAM: 4GB

Setup lab for Malware Analysis

  • Download VMware Workstation or VirtualBox (Linux) and install it on your host machine. If you are not sure about the configuration settings, it is recommended to use the default settings.
  • Next, download and install the guest Operating System. For an in-depth understanding of malware analysis, there is no better option than using Windows XP.
  • After Windows XP is installed, install the required tools/applications for Malware Analysis as we move further.
  • Next, install VMware Tools. Go to VMware Menu → VM → Install VMware Tools.

Configuring Virtual Machine

Now we need to configure the Virtual Machine so that the Malware does not affect the network and host system. VMware and Virtual Box offer several networking options for virtual networking. We will mostly focus on two things - 

Malware Analysis in Air-Gapped Network


Configuring the VM with no network connectivity at all is not generally recommended, since you won’t be able to see whether the Malware performs any malicious network activity. The options are:
  • Disconnect the network adapter from the Virtual Machine or remove the network adapter.
  • Host-only networking is a feature in both VMware and VirtualBox that creates a separate private network between the host OS and the Virtual Machine. A host-only network is not connected to the Internet. Thus the Malware is contained within your Virtual Machine but allowed some network connectivity, while the host is still connected to the Internet or other external networks. Also, ensure that the host machine is fully patched in case the Malware tries to spread. It is possible that the Malware uses a zero-day exploit against the host OS.
  • Multiple Virtual Machines can be disconnected from the Internet and the host machine but linked together on a LAN, so that the Malware is connected to a network, but the network isn’t connected to anything.

Malware Analysis Connecting to the Internet


When performing dynamic analysis, it may sometimes be essential to connect the Virtual Machine running the Malware to the Internet for a more realistic analysis environment, despite all the risks. Before connecting it to the Internet, perform some analysis to determine what the Malware might do once connected.
  • Never connect to the Internet without knowing what the Malware can do. Connecting to the Internet while performing Malware Analysis could also tip off the Malware writers that their malware is being analyzed.
  • Using VMware/VBox with a bridged network adapter allows the VM to be connected to the same network interface as the host machine, thus allowing the malware to connect to the Internet. VMware’s/VBox's Network Address Translation (NAT) mode shares the host’s IP connection to the Internet: the host acts as a router and translates all requests from the virtual machine.
  • Any external device can also be connected to the VM in VMware/VBox. Connecting a USB device while the VM window is active will attach it to the guest machine and not the host machine.

    Features of Virtual Machine


    Snapshots

    VirtualBox and VMware can save a snapshot of the guest VM's state. That simply means time travel is possible: you can go back in time by reverting the virtual machine. Let us understand this with an example - At 10:00, you take a snapshot of the computer, and you run the malware. At 12:00, you revert to the snapshot taken at 10:00. The Operating System, software, and other components of the machine will return to the same state they were in at 10:00, and everything that occurred between 10:00 and 12:00 is erased as though it never happened.

    Transferring Files from VM

    One limitation of using snapshots is that any work undertaken on the virtual machine is lost when you revert to an earlier snapshot. Save your work before loading the earlier snapshot by transferring any files that you want to keep to the host OS using VirtualBox's and VMware’s drag-and-drop feature. You can also transfer your data with VMware’s and VirtualBox’s shared folders, which are accessible from both the host and the guest OS.

    Risks - Using VMware for Malware Analysis 

    When performing Analysis, some malware can detect that it is running in a Virtual Machine and execute differently, which can make analysis very difficult for the Malware Analyst. Many techniques have been published for detecting a Virtual Machine from inside it. VMware does not consider this a vulnerability and does not take specific steps to avoid detection. Vulnerabilities have been found in VMware's shared folders feature, and tools have been released to exploit the drag-and-drop functionality; such bugs can cause the host Operating System to crash, or can even be used to run code on the host Operating System. Make sure to keep your VMware version fully patched. Even after you take all possible precautions, some risk is always present when analyzing Malware. Thus, avoid performing malware analysis on a critical machine.

    Analyzing Malware using Virtual Machine

    Following are the steps to run and analyze the Malware using Virtual machine:
    • Take a clean snapshot of the OS with no malware running on it. 
    • Transfer the Malware to the virtual machine.
    • Perform the Malware Analysis on the Virtual Machine. 
    • Transfer all your data and details that you need to the host machine.
    • Revert the Virtual Machine to the snapshot taken.
    As new and updated Malware Analysis tools are released, install the tools and updates, and then take a fresh clean snapshot. Throughout this series, when we discuss running Malware, we assume that the Malware is running in a Virtual Machine.
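If you use VirtualBox, the snapshot and revert steps above can be scripted with its VBoxManage command-line tool. The sketch below only builds the command lines and leaves running them to you. The VM name "WinXP-Lab" and snapshot name "clean" are placeholders I made up, and restoring a snapshot generally expects the VM to be powered off first.

```python
import subprocess


def take_snapshot_cmd(vm_name: str, snapshot: str) -> list:
    """Command line for: VBoxManage snapshot <vm> take <name>."""
    return ["VBoxManage", "snapshot", vm_name, "take", snapshot]


def restore_snapshot_cmd(vm_name: str, snapshot: str) -> list:
    """Command line for: VBoxManage snapshot <vm> restore <name>."""
    return ["VBoxManage", "snapshot", vm_name, "restore", snapshot]


def run(cmd: list) -> None:
    """Run a command and raise if VBoxManage reports an error."""
    subprocess.run(cmd, check=True)


# Usage (placeholder names, requires VirtualBox to be installed):
# run(take_snapshot_cmd("WinXP-Lab", "clean"))      # before copying the malware in
# run(restore_snapshot_cmd("WinXP-Lab", "clean"))   # after the analysis is done
```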

    References

    https://www.ibm.com/developerworks/community/files/basic/anonymous/api/library/be969f28-ea0a-496c-8736-03038aeea0a7/document/a9c1a42b-6b79-4cc6-b0b3-45cdfb6dcb50/media
    https://www.coursehero.com/file/p22us1n/Sometimes-youll-want-to-connect-your-malware-running-machine-to-the-Internet-to/

    Tuesday, October 1, 2019

    Malware Analysis - Part I - Basics

    Malware Analysis

    Hi guys, this blog is an overview of Malware Analysis. It covers the kinds of questions a Malware Analyst should ask, the types of Malware Analysis, and the general rules that should be followed.

    Introduction

    Malware analysis is the process of learning how malware functions. Any code that performs a malicious action is called malware. New malware is being written every day, and the amount of malware is growing rapidly. Malware code can differ, and it is essential to know that a single piece of malware can have multiple functionalities. Malware may come in the form of viruses, worms, spyware, and Trojan horses. Many types of malware gather information about the infected device without the knowledge or authorization of its users.

    Why should we analyze Malware?

    The answer to this question is straightforward. The goal of Malware Analysis is to protect something or someone. Generally, there are two sets of questions that should be asked by the Malware Analyst, i.e., Business Questions and Technical Questions.

    Business Questions:

    • What is the purpose of malware?
    • How did it get here?    
    • Who is targeting us and how good are they?
    • How can I get rid of it?
    • What did they steal?
    • How long has it been?
    • Does it spread on its own?
    • How can I find it on other machines?
    • How do I prevent this from happening?

    Technical Questions:

    • Network Indicators?
    • Host-Based Indicators?
    • Persistence Mechanism?
    • Date of Compilation?
    • Date of Installation?
    • Language used?
    • Is it packed?
    • Does it have any rootkit functionality?

    Types of Malware Analysis

    The techniques by which malware analysis is performed typically fall under the following two categories:

    • Static Malware Analysis (Examine malware without running it)

    Static Malware Analysis is usually performed without executing the malware and studying each component. It would include:

      • Basic Static Analysis


    In Basic Static Analysis the executable files are examined without viewing the actual instructions. This analysis can confirm whether the file is malicious, give a basic idea of its functionality, and sometimes provide information that will allow you to produce simple network signatures. It is straightforward and quick, but is ineffective against sophisticated malware, and can miss important behaviors.

      • Advanced Static Analysis

    Advanced Static Analysis consists of reverse-engineering malware internals by loading the executable into a disassembler and looking at the program instructions to understand the program's logic. Because these are the instructions the CPU executes, advanced static analysis tells you exactly what the program does. However, Advanced Static Analysis has a steeper learning curve than Basic Static Analysis and needs specialized knowledge of code constructs, disassembly, and Windows OS concepts.

    • Dynamic Malware Analysis (Examine malware by running it)

    Dynamic Malware Analysis is performed by observing the behavior of the malware while it is running on a host system. It would include:

      • Basic Dynamic Analysis

    The Basic Dynamic Analysis technique involves running the malware and observing its behavior on the system in order to remove the infection, produce effective signatures, or both. However, before running malware, a proper environment must be set up which allows the study of running malware without risk of damage to your system or network. Like Basic Static Analysis, Basic Dynamic Analysis techniques can be used by people without in-depth programming knowledge, but they won’t be effective with all malware and can miss important functionality.

      • Advanced Dynamic Analysis

    Advanced Dynamic Analysis uses a debugger to examine the internal state of a running malware executable. This technique provides a different way to extract detailed information from an executable. These techniques are most useful when you’re trying to obtain information that is difficult to gather with other methods.

    Rules for Malware Analysis

    • First, don’t get too caught up in details. Most malware programs are large and complex, and you can’t possibly understand every aspect. Focus instead on the key features. When you run into intricate and complex sections, try to get a general overview before you get stuck.
    • Second, remember that different tools and approaches are available for different jobs. There is no one approach. If a tool doesn't give you the information that you want, try another. If you get stuck, don’t spend too much time on one issue, move onto something else. Try analyzing the malware from a different angle, or try a different approach.
    • Finally, remember malware analysis is like a cat and mouse game. As new malware analysis techniques are developed, malware authors respond with new techniques to thwart analysis. To succeed as a malware analyst, you must be able to recognize, understand, and defeat these techniques, and respond to changes in the art of malware analysis.

    The above details are fundamentals that should be known before moving further. Part 2 of Malware Analysis would contain information on how to set up your lab to perform the Analysis.




    Thursday, September 12, 2019

    Portable Executable File

    Hi techies!! I was recently reading about Static Malware Analysis, and I found the term "PE File" being used most of the time, but wasn't sure of what it actually is. When digging deep into it, there were many interesting things about PE Files, so here I am writing a blog on Portable Executable File for a more clear picture. In this blog, I will be discussing the basics of File Format and will go through the PE File Format, the structure and the tools to view the PE Files. 


    What is File Format?



    A file format is the structure of a file in terms of how the data within the file is organized. A program that uses the data must be able to recognize the layout in order to access the data within the file. For example, a Web browser can process a file in the HTML file format so that it appears as a Web page, but it cannot display a file in a format designed for Microsoft's Word program. The file format is usually indicated by the file name extension, although the extension can be changed without changing the contents.


    A few of the more common file formats are:
    • Word documents (.doc)
    • Executable programs (.exe)
    • Web text pages (.htm or .html)
    • Images (.gif and .jpg)
    • Adobe Acrobat files (.pdf)
    • Multimedia files (.mp3)
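Since an extension is easy to rename, many tools recognize a format by its first few bytes (its "magic number") instead. Here is a small Python sketch of that idea for some of the formats listed above. The table is deliberately tiny and the names are my own, so treat it as an illustration rather than a complete detector.

```python
# (magic bytes at the start of the file, human-readable label)
MAGIC_NUMBERS = [
    (b"MZ", "DOS/PE executable (.exe, .dll)"),
    (b"%PDF-", "PDF document (.pdf)"),
    (b"GIF87a", "GIF image (.gif)"),
    (b"GIF89a", "GIF image (.gif)"),
    (b"\xff\xd8\xff", "JPEG image (.jpg)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy .doc)"),
]


def identify(data: bytes) -> str:
    """Guess a file format from the leading bytes of its content."""
    for magic, label in MAGIC_NUMBERS:
        if data.startswith(magic):
            return label
    return "unknown"


# Usage ("invoice.pdf" is a placeholder path):
# with open("invoice.pdf", "rb") as handle:
#     print(identify(handle.read(16)))
```

A file named invoice.pdf that comes back as a DOS/PE executable is exactly the kind of mismatch a malware analyst wants to notice.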

    What is PE File Format?



    Portable Executable File Format is a file format used by Windows 32-bit and 64-bit Operating System for executables, DLLs, COM files, .NET executables, Object code, .FON Font files, NT's Kernel-mode drivers, etc. The PE file format contains the information that the Windows OS loader needs to manage the wrapped executable code. The PE format is derived from COFF (Common Object File Format), which is still used for Windows object files. The extensions commonly associated with the PE format are: .cpl, .dll, .drv, .efi, .exe, .ocx, .scr and .sys.


    Basic Structure of PE File



    Diagram 1 shows a basic structure of the Portable Executable File Format. You can also use a tool such as PE Viewer to view the basic structure of a PE File.



    Diagram 1

    1. DOS MZ Header


    The first 64 bytes of every PE file hold this header. It is the first check of whether the file is a valid PE file: every valid PE file has 4D 5A ("MZ" in ASCII) as its first two bytes, as shown in Exhibit 1. The letters are the initials of Mark Zbikowski, one of the architects of MS-DOS. The header contains a number of fields. Here, we will discuss the two important ones, i.e., e_magic and e_lfanew.


    • e_magic is the first field, also called the magic number. Its primary purpose is to identify the file as compatible with the MS-DOS executable type. The value for all MS-DOS-compatible executable files is set to 4D 5A, as shown in Exhibit 1, representing the ASCII characters MZ. This is why the MS-DOS header is sometimes referred to as the MZ header.
    • e_lfanew is the file offset of the PE Header. It is a 4-byte value stored at offset 0x3C, the last field of the DOS header. The Windows loader reads this offset to skip the DOS stub and go directly to the PE header.
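Both fields are easy to read by hand. The following Python sketch checks the magic number and returns the e_lfanew offset using the standard struct module. The function name is an assumption of mine, and the code does no validation beyond the two fields discussed here.

```python
import struct


def read_dos_header(data: bytes) -> int:
    """Validate e_magic and return e_lfanew (the file offset of the PE header)."""
    if len(data) < 64:
        raise ValueError("too short for a 64-byte DOS header")
    (e_magic,) = struct.unpack_from("<2s", data, 0)
    if e_magic != b"MZ":
        raise ValueError("e_magic is not 'MZ' (4D 5A)")
    # e_lfanew is the last field of the DOS header: a 4-byte
    # little-endian value at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew


# Usage ("sample.exe" is a placeholder path):
# with open("sample.exe", "rb") as handle:
#     print(hex(read_dos_header(handle.read(64))))
```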


    2. DOS Stub


    The DOS Stub section contains the string "This program cannot be run in DOS mode.". It is a small DOS program that prints this message if the file is run under MS-DOS instead of Windows. It starts just after the 4-byte e_lfanew field, i.e., at offset 64. Its size is not fixed, so always use e_lfanew to find the PE header; in many files the header begins at offset 128 or a little later.

    3. PE File Header


    PE Header is also known as IMAGE_NT_HEADERS and contains three main components, as shown below -

    i. Signature
    • The structure starts with the DWORD value 50h, 45h, 00h, 00h ("PE" followed by two terminating zeros), a signature indicating that the PE header starts here.
    The diagram below illustrates the structure and values of the PE header.


    Exhibit 1
    ii. File Header
    • The next 20 bytes after the Signature represent the file header. It contains information about the physical layout and properties of the file, which includes the following - 
      • Machine - The number identifies the target CPU architecture, for example 0x014C for 32-bit x86 and 0x8664 for x64.
      • NumberOfSections - The number of sections the PE file holds. If the bytes are 04h, 00h (little-endian), the file contains four sections.
      • TimeDateStamp - The time when the linker (or the compiler, for an OBJ file) produced this file.
      • PointerToSymbolTable
      • NumberOfSymbols
      • SizeOfOptionalHeader - As the name suggests, this is the size of the optional header, which is required for an executable file. The value for an object file is set to zero.
      • Characteristics - Flag values that help identify, among other things, whether the file is a DLL or an executable.
    Exhibit 2
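The 20-byte file header maps neatly onto a single struct format string. Below is a hedged Python sketch that checks the PE signature and unpacks the seven fields. The namedtuple, the function name and the two-entry machine table are my own choices for illustration, and pe_offset is the value of e_lfanew from the DOS header.

```python
import struct
from collections import namedtuple

FileHeader = namedtuple(
    "FileHeader",
    "machine number_of_sections time_date_stamp pointer_to_symbol_table "
    "number_of_symbols size_of_optional_header characteristics",
)

IMAGE_FILE_DLL = 0x2000  # Characteristics flag that is set for DLLs
MACHINES = {0x014C: "x86", 0x8664: "x64"}  # small subset of Machine values


def read_file_header(data: bytes, pe_offset: int) -> FileHeader:
    """Parse the 20-byte file header that follows the 4-byte PE signature."""
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Two WORDs, three DWORDs, two WORDs: 2+2+4+4+4+2+2 = 20 bytes.
    fields = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return FileHeader._make(fields)


# Usage, with data holding the file contents and pe_offset taken from e_lfanew:
# header = read_file_header(data, pe_offset)
# print(MACHINES.get(header.machine, "other"), header.number_of_sections)
# print("DLL" if header.characteristics & IMAGE_FILE_DLL else "not a DLL")
```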

    iii. Optional Header
    This header follows the File Header. In a 32-bit (PE32) file it makes up the next 224 bytes (240 bytes in a 64-bit PE32+ file) and contains information about the logical layout of the file. Some of the important fields are:
    • Magic - The unsigned integer that identifies the state of the image file. Exhibit 3 shows that the value is set to 0x10b for a 32-bit executable; 64-bit (PE32+) executables use 0x20b.
    Exhibit 3
    • AddressOfEntryPoint - The address, relative to the image base, where execution of the file starts.
    • SectionAlignment - The alignment of sections when they are loaded into memory. If the value is 4096 (1000h), each section starts on a 4096-byte boundary and occupies a whole number of 4096-byte slots, no matter the actual size of the section.
    • SizeOfImage - The size of the whole image, including all headers and sections, when loaded into memory. It must be a multiple of SectionAlignment.
    • FileAlignment - The alignment of the sections' raw data in the file on disk. It is similar to SectionAlignment. The only difference is in the size of each slot. In this case, it's 512 bytes (200h).
    • DataDirectories - The last 128 bytes represent DataDirectory, an array of 16 IMAGE_DATA_DIRECTORY structures (8 bytes each), each one relating to an important data structure in the PE file, for example, the Import table, Export table, etc.
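To make the alignment fields concrete, here is a short Python sketch that reads a few optional header fields and rounds a size up to an alignment boundary. The function names are mine. The field offsets used (16, 32, 36 and 56 bytes from the start of the optional header) are the same in PE32 and PE32+ files, and offset is where the optional header starts, i.e., 24 bytes after the start of the PE signature.

```python
import struct

PE32_MAGIC, PE32_PLUS_MAGIC = 0x10B, 0x20B


def read_optional_header(data: bytes, offset: int) -> dict:
    """Read a handful of optional header fields starting at the given offset."""
    (magic,) = struct.unpack_from("<H", data, offset)
    if magic not in (PE32_MAGIC, PE32_PLUS_MAGIC):
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    (entry_point,) = struct.unpack_from("<I", data, offset + 16)
    section_alignment, file_alignment = struct.unpack_from("<II", data, offset + 32)
    (size_of_image,) = struct.unpack_from("<I", data, offset + 56)
    return {
        "magic": magic,
        "address_of_entry_point": entry_point,
        "section_alignment": section_alignment,
        "file_alignment": file_alignment,
        "size_of_image": size_of_image,
    }


def align_up(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


# A 0x1234-byte section takes 0x2000 bytes in memory with SectionAlignment
# 0x1000, but only 0x1400 bytes on disk with FileAlignment 0x200.
print(hex(align_up(0x1234, 0x1000)), hex(align_up(0x1234, 0x200)))
```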

    4. Section Table


    This table immediately follows the optional header. It contains information about the sections present in the PE file. The total number of sections is given by NumberOfSections in the File Header. If the number of sections present in a PE file is five, then there must be five IMAGE_SECTION_HEADER structures (40 bytes each) in this table.
    • Name - An 8-byte, null-padded UTF-8 string. This can be null.
    • VirtualSize - The size in bytes of the section when loaded into memory. It may be smaller than the size of the section on disk.
    • SizeOfRawData - The size of the section's data in the file on disk.
    • PointerToRawData - The offset from the beginning of the file to the section's data, which is what lets a tool locate the section's bytes on disk.
    • Characteristics - Flags that describe the characteristics of the section, such as whether it contains code and whether it is readable, writable, or executable.
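Walking the section table is just a loop over fixed-size 40-byte records. In the sketch below, the function name and dictionary keys are my own, VirtualAddress is included even though it is not listed above, and table_offset is assumed to be the PE header offset plus 4 (signature) plus 20 (file header) plus SizeOfOptionalHeader.

```python
import struct

# Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics
SECTION_FORMAT = "<8sIIIIIIHHI"
SECTION_SIZE = struct.calcsize(SECTION_FORMAT)  # 40 bytes


def read_section_table(data: bytes, table_offset: int, count: int) -> list:
    """Parse count section headers starting at table_offset."""
    sections = []
    for index in range(count):
        fields = struct.unpack_from(
            SECTION_FORMAT, data, table_offset + index * SECTION_SIZE
        )
        name, virtual_size, virtual_address, raw_size, raw_pointer = fields[:5]
        sections.append({
            "name": name.rstrip(b"\x00").decode("utf-8", errors="replace"),
            "virtual_size": virtual_size,
            "virtual_address": virtual_address,
            "size_of_raw_data": raw_size,
            "pointer_to_raw_data": raw_pointer,
            "characteristics": fields[-1],
        })
    return sections


# Usage, with count taken from NumberOfSections in the file header:
# for section in read_section_table(data, table_offset, count):
#     print(section["name"], hex(section["virtual_size"]))
```

Section names such as UPX0 and UPX1 in this table are a well-known sign that a file was packed with UPX.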

    5. PE File Section


    PE File section contains the main content of the file, including code, data, resources, and other executable files. Each section has a header and a body.
    • .text - This section, also known as CODE, is where the program's instructions reside. These instructions are executed by the CPU. It is normally the section that contains the "Entry Point" mentioned earlier.
    • .rdata - This section usually holds the import and export information, along with other read-only data used by the program such as literals and constant strings.
    • .data - The .data section consists of the program's global data, which can be accessed from anywhere in the program.
    • .rsrc - The .rsrc section contains resources such as images, icons, menu, etc. used by the executable. ResHacker is a resource editor tool that displays this section in a structured tree format.

    Tools

    PE data can be viewed using various tools. Some of the free tools are listed below-
    • PE View - Available for Windows
    • PE Explorer - Available for Windows
    • FileAlyzer - Available for Windows
    • CFF Explorer - Available for Windows

    Conclusion



    Thus, this is a brief about the Portable Executable File Structure. For starters, this would be enough to get a basic understanding of the PE File Structure. If you want to dig deeper into this, you can definitely use Google to do it. I hope this blog was useful. Comment and share if you like it.

    References

    https://resources.infosecinstitute.com/2-malware-researchers-handbook-demystifying-pe-file/#gref
    https://www.talentcookie.com/2016/06/pe-file-inside-tour/