Reversing C++

IBM Global Services
Reversing C++
Paul Vincent Sabanal, X-Force R&D
Mark Vincent Yason, X-Force R&D
IBM Internet Security Systems™. Ahead of the threat.
© Copyright IBM Corporation 2007

Part I. Introduction

Purpose
* Understand C++ concepts as they are represented in disassemblies
* Get a big-picture view of the major pieces (classes) of the C++ target and how these pieces relate to one another (class relationships)

Focus On…
* (1) Identifying Classes
* (2) Identifying Class Relationships
* (3) Identifying Class Members

Motivation
* Increasing use of C++ code in malware
  – Virtual function calls are difficult to follow in static analysis
  – Examples: Agobot, Mytob, new malcode from our honeypot
* Most modern applications use C++
  – For binary auditing, reversers can expect the target to be a C++-compiled binary
* General lack of publicly available information on C++ reversing
  – The only good information is from Igor Skochinsky
  – https://www.openrce.org/articles/fullview/23

Part II. Manual Approach
Identifying C++ Binaries Constructs

* Heavy use of ecx (this ptr)

    .text:004019E4 mov ecx, esi
    .text:004019E6 push 0BBh
    .text:004019EB call sub_401120

* ecx used without being initialized

    .text:004010D0 sub_4010D0 proc near
    .text:004010D0 push esi
    .text:004010D1 mov esi, ecx
    .text:004010DD mov dword ptr [esi], offset off_40C0D0
    .text:00401101 mov dword ptr [esi+4], 0BBh
    .text:00401108 call sub_401EB0
    .text:0040110D add esp, 18h
    .text:00401110 pop esi
    .text:00401111 retn
    .text:00401111 sub_4010D0 endp

* Parameters on the stack, ecx = this ptr

    .text:00401994 push 0Ch
    .text:00401996 call ??2@YAPAXI@Z ; operator new(uint)
    .text:004019AB mov ecx, eax
    :::
    .text:004019AD call ClassActor

* Virtual function calls (indirect calls)

    .text:00401996 call ??2@YAPAXI@Z ; operator new(uint)
    :::
    .text:004019B2 mov esi, eax
    :::
    .text:004019FF mov eax, [esi] ; EAX = vftable
    .text:00401A01 add esp, 8
    .text:00401A04 mov ecx, esi
    .text:00401A06 push 0CCh
    .text:00401A0B call dword ptr [eax]

* STL code and imported DLLs

    .text:00401201 mov ecx, eax
    .text:00401203 call ds:?sputc@?$basic_streambuf@DU?$char_traits@D@std@@@std@@QAEHD@Z
                   ; std::basic_streambuf<char,std::char_traits<char>>::sputc(char)

Class Instance Layout

    class Ex1
    {
        int var1;
        int var2;
        char var3;
    public:
        int getvar1();
    };

    class Ex1 size(12):
        +---
     0  | var1
     4  | var2
     8  | var3
        | <alignment member> (size=3)
        +---
Class Instance Layout (continued)

* Class with virtual functions

    class Ex2
    {
        int var1;
    public:
        virtual int getsum(int x, int y);
        virtual void resetvalues();
    };

    class Ex2 size(8):
        +---
     0  | {vfptr}
     4  | var1
        +---

    Ex2::$vftable@:
     0  | &Ex2::getsum
     4  | &Ex2::resetvalues

* Single inheritance

    class Ex3: public Ex2
    {
        int var1;
    public:
        void getvalues();
    };

    class Ex3 size(12):
        +---
        | +--- (base class Ex2)
     0  | | {vfptr}
     4  | | var1
        | +---
     8  | var1
        +---

* Multiple inheritance

    class Ex4
    {
        int var1;
        int var2;
    public:
        virtual void func1();
        virtual void func2();
    };

    class Ex5: public Ex2, Ex4
    {
        int var1;
    public:
        void func1();
        virtual void vex5();
    };

    class Ex5 size(24):
        +---
        | +--- (base class Ex2)
     0  | | {vfptr}
     4  | | var1
        | +---
        | +--- (base class Ex4)
     8  | | {vfptr}
    12  | | var1
    16  | | var2
        | +---
    20  | var1
        +---
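The Ex1 to Ex5 layouts can be cross-checked outside the compiler. The following is a minimal sketch using Python's ctypes, assuming 4-byte int members and a 4-byte vfptr as in the 32-bit layouts shown; the field names base_Ex2 and base_Ex4 and the helper offsets() are made up for illustration, and the structures model data members only.

```python
import ctypes

U32 = ctypes.c_uint32   # assumed 32-bit pointer (vfptr)
I32 = ctypes.c_int32    # assumed 4-byte int

class Ex1(ctypes.Structure):
    # No virtual functions: the object holds only its data members.
    _fields_ = [("var1", I32), ("var2", I32), ("var3", ctypes.c_char)]

class Ex2(ctypes.Structure):
    # Virtual functions add a hidden vfptr at offset 0.
    _fields_ = [("vfptr", U32), ("var1", I32)]

class Ex3(ctypes.Structure):
    # Single inheritance: the base subobject comes first.
    _fields_ = [("base_Ex2", Ex2), ("var1", I32)]

class Ex4(ctypes.Structure):
    _fields_ = [("vfptr", U32), ("var1", I32), ("var2", I32)]

class Ex5(ctypes.Structure):
    # Multiple inheritance: base subobjects in declaration order.
    _fields_ = [("base_Ex2", Ex2), ("base_Ex4", Ex4), ("var1", I32)]

def offsets(struct_type):
    """Map each field name of a ctypes structure to its byte offset."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}

sizes = {cls.__name__: ctypes.sizeof(cls) for cls in (Ex1, Ex2, Ex3, Ex5)}
ex1_padding = ctypes.sizeof(Ex1) - (Ex1.var3.offset + Ex1.var3.size)
```

Here sizes and offsets(Ex5) can be compared against the size(...) headers and the offset columns of the layouts; ex1_padding corresponds to the alignment member at the end of Ex1.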
Identifying Classes

Constructor/Destructor Identification

* Global Objects
  – Allocated in the data segment
  – Constructor is called at program startup
  – Destructor is called at program exit
  – this pointer points to a global variable
  – To locate the constructor/destructor, examine cross references to that variable

* Local Objects
  – Allocated on the stack
  – Constructor is called at the point of declaration
  – this pointer points to an uninitialized local variable
  – Destructor is called at block exit

    .text:00401060 sub_401060 proc near
    .text:00401060
    .text:00401060 var_C = dword ptr -0Ch
    .text:00401060 var_8 = dword ptr -8
    .text:00401060 var_4 = dword ptr -4
    .text:00401060
    …(some code)…
    .text:004010A4 add esp, 8
    .text:004010A7 cmp [ebp+var_4], 5
    .text:004010AB jle short loc_4010CE
    .text:004010AB
    .text:004010AB ; block begin
    .text:004010AD lea ecx, [ebp+var_8] ; var_8 is uninitialized
    .text:004010B0 call sub_401000 ; constructor
    .text:004010B5 mov edx, [ebp+var_8]
    .text:004010B8 push edx
    .text:004010B9 push offset strWithinIfX
    .text:004010BE call sub_4010E4
    .text:004010C3 add esp, 8
    .text:004010C6 lea ecx, [ebp+var_8]
    .text:004010C9 call sub_401020 ; destructor
    .text:004010CE ; block end
    .text:004010CE
    .text:004010CE loc_4010CE: ; CODE XREF: sub_401060+4Bj
    .text:004010CE mov [ebp+var_C], 0
    .text:004010D5 lea ecx, [ebp+var_4]
    .text:004010D8 call sub_401020

* Dynamically Allocated Objects
  – Allocated on the heap
  – Created via operator new
      * Allocates memory on the heap
      * Calls the constructor
  – Destroyed via operator delete
      * Calls the destructor
      * Deallocates the object instance

    .text:0040103D main proc near
    .text:0040103D argc = dword ptr 8
    .text:0040103D argv = dword ptr 0Ch
    .text:0040103D envp = dword ptr 10h
    .text:0040103D
    .text:0040103D push esi
    .text:0040103E push 4 ; size_t
    .text:00401040 call ??2@YAPAXI@Z ; operator new(uint)
    .text:00401045 test eax, eax ; eax = address of allocated memory
    .text:00401047 pop ecx
    .text:00401048 jz short loc_401055
    .text:0040104A mov ecx, eax
    .text:0040104C call sub_401000 ; call to constructor
    .text:00401051 mov esi, eax
    .text:00401053 jmp short loc_401057
    .text:00401055 loc_401055: ; CODE XREF: main+Bj
    .text:00401055 xor esi, esi
    .text:00401057 loc_401057: ; CODE XREF: main+16j
    .text:00401057 push 45h
    .text:00401059 mov ecx, esi
    .text:0040105B call sub_401027
    .text:00401060 test esi, esi
    .text:00401062 jz short loc_401072
    .text:00401064 mov ecx, esi
    .text:00401066 call sub_40101B ; call to destructor
    .text:0040106B push esi ; void *
    .text:0040106C call j__free ; call to free thunk function
    .text:00401071 pop ecx
    .text:00401072 loc_401072: ; CODE XREF: main+25j
    .text:00401072 xor eax, eax
    .text:00401074 pop esi
    .text:00401075 retn
    .text:00401075 main endp

Identifying Classes via RTTI

* What is RTTI
  – Run-Time Type Information (RTTI)
  – Used to identify an object's type at run time
  – Generated for polymorphic classes (classes with virtual functions)
  – Used by the typeid and dynamic_cast operators
  – Gives us important information on
      * Class name
          • A rough idea of what the class is all about
      * Class hierarchy
  – Consists of several data structures

* RTTICompleteObjectLocator
  – Contains pointers to two structures that identify
      * Class information (TypeDescriptor)
      * Class hierarchy (RTTIClassHierarchyDescriptor)
  – A pointer to it is stored in the DWORD immediately before the class's vftable

    .rdata:00404128 dd offset ClassARTTICompleteObjectLocator
    .rdata:0040412C ClassAvftable dd offset sub_401000 ; DATA XREF:...
    .rdata:00404130 dd offset sub_401050
    .rdata:00404134 dd offset sub_4010C0
    .rdata:00404138 dd offset ClassBRTTICompleteObjectLocator
    .rdata:0040413C ClassBvftable dd offset sub_4012B0 ; DATA XREF:...
    .rdata:00404140 dd offset sub_401300
    .rdata:00404144 dd offset sub_4010C0

    Offset  Type  Name                        Description
    0x00    DW    signature                   Always 0
    0x04    DW    offset                      Offset of vftable within the class
    0x08    DW    cdOffset
    0x0C    DW    pTypeDescriptor             Class information
    0x10    DW    pClassHierarchyDescriptor   Class hierarchy information

    .rdata:004045A4 ClassBRTTICompleteObjectLocator dd 0 ; COL.signature
    .rdata:004045A8 dd 0 ; COL.offset
    .rdata:004045AC dd 0 ; COL.cdOffset
    .rdata:004045B0 dd offset ClassBTypeDescriptor
    .rdata:004045B4 dd offset ClassBRTTIClassHierarchyDescriptor

* TypeDescriptor
  – Contains the class name, which is an important piece of information
  – For example, CPacketParser and CTCPPacketParser

    Offset  Type  Name       Description
    0x00    DW    pVFTable   Always points to type_info's vftable
    0x04    DW    spare
    0x08    SZ    name       Class name

    .data:0041A098 ClassATypeDescriptor ; DATA XREF: ....
    .data:0041A098 dd offset typeinfovftable ; TypeDescriptor.pVFTable
    .data:0041A09C dd 0 ; TypeDescriptor.spare
    .data:0041A0A0 db '.?AVClassA@@',0 ; TypeDescriptor.name

* RTTIClassHierarchyDescriptor
  – Information about the class hierarchy
  – Includes pointers to BaseClassDescriptors for each base class

    Offset  Type  Name              Description
    0x00    DW    signature         Always 0
    0x04    DW    attributes        Bit 0 – multiple inheritance
                                    Bit 1 – virtual inheritance
    0x08    DW    numBaseClasses    Number of base classes; the count includes the class itself
    0x0C    DW    pBaseClassArray   Array of RTTIBaseClassDescriptor

  – Example class declaration

    class ClassA {…}
    class ClassE {…}
    class ClassG: public virtual ClassA, public virtual ClassE {…}

  – Corresponding RTTIClassHierarchyDescriptor

    .rdata:004178C8 ClassGRTTIClassHierarchyDescriptor ; DATA XREF: ...
    .rdata:004178C8 dd 0 ; signature
    .rdata:004178CC dd 3 ; attributes
    .rdata:004178D0 dd 3 ; numBaseClasses
    .rdata:004178D4 dd offset ClassGpBaseClassArray ; pBaseClassArray
    .rdata:004178D8 ClassGpBaseClassArray dd offset oopreRTTIBaseClassDescriptor4178e8
    .rdata:004178DC dd offset oopreRTTIBaseClassDescriptor417904
    .rdata:004178E0 dd offset oopreRTTIBaseClassDescriptor417920

* RTTIBaseClassDescriptor
  – Information about the base class
  – Contains the TypeDescriptor for the base class

    Offset  Type  Name                Description
    0x00    DW    pTypeDescriptor     TypeDescriptor of this base class
    0x04    DW    numContainedBases   Number of base classes nested under this base class
                                      (the entries that follow it in the BaseClassArray)
    0x08    DW    PMD.mdisp           vftable offset
    0x0C    DW    PMD.pdisp           vbtable offset (-1: vftable is at displacement
                                      PMD.mdisp inside the class)
    0x10    DW    PMD.vdisp           Displacement of the base class vftable pointer
                                      inside the vbtable
    0x14    DW    attributes
    0x18    DW    pClassDescriptor    RTTIClassHierarchyDescriptor of this base class

* vbtable (virtual base class table)
  – Contains the information needed to locate the actual base class within the class
  – Generated for multiple virtual inheritance and used for upcasting (casting to base classes)

    class ClassG size(28):
        +---
     0  | {vfptr}
     4  | {vbptr}
        +---
        +--- (virtual base ClassA)
     8  | {vfptr}
    12  | classavar01
    16  | classavar02
        | <alignment member> (size=3)
        +---
        +--- (virtual base ClassE)
    20  | {vfptr}
    24  | classevar01
        +---

    ClassG::$vbtable@:
     0  | -4
     1  | 4  (ClassGd(ClassG+4)ClassA)
     2  | 16 (ClassGd(ClassG+4)ClassE)

* RTTIBaseClassDescriptor (example), for virtual base ClassE of the ClassG layout above

    .rdata:00418AFC RTTIBaseClassDescriptor418afc ; DATA XREF: ...
    .rdata:00418AFC dd offset oopreClassETypeDescriptor
    .rdata:00418B00 dd 0 ; numContainedBases
    .rdata:00418B04 dd 0 ; PMD.mdisp
    .rdata:00418B08 dd 4 ; PMD.pdisp
    .rdata:00418B0C dd 8 ; PMD.vdisp
    .rdata:00418B10 dd 50h ; attributes
    .rdata:00418B14 dd offset oopreClassERTTIClassHierarchyDescriptor ; pClassDescriptor

  – Here pdisp = 4 locates the vbptr at ClassG+4, and vdisp = 8 selects vbtable entry 2, whose value 16 places ClassE at 4 + 16 = 20, matching the layout.

* RTTI Data Structures Layout

    [Figure: for each of Class A, Class B and Class C, the vftable leads to a CompleteObjectLocator, which points to the class's TypeDescriptor and ClassHierarchyDescriptor; the ClassHierarchyDescriptor points to a BaseClassArray of BaseClassDescriptors (one for Class A, two for Class B, three for Class C). Class B inherits from Class A; Class C inherits from Class B.]
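The pointer chain described in this section (vftable, RTTICompleteObjectLocator, TypeDescriptor, RTTIClassHierarchyDescriptor, BaseClassArray) can be walked with a few DWORD reads. The following is a minimal sketch over a synthetic 32-bit memory image, using the field offsets from the tables above; the Image class, class_info() and build_demo() are hypothetical helpers, and the bytes are hand-made rather than taken from a real binary.

```python
import struct

class Image:
    """A flat 32-bit memory image: `data` is mapped at virtual address `base`."""
    def __init__(self, base, data):
        self.base, self.data = base, bytes(data)

    def u32(self, va):
        return struct.unpack_from("<I", self.data, va - self.base)[0]

    def cstr(self, va):
        start = va - self.base
        return self.data[start:self.data.index(b"\0", start)].decode("ascii")

def class_info(img, vftable_va):
    """Follow vftable[-1] -> COL -> TypeDescriptor / ClassHierarchyDescriptor."""
    col = img.u32(vftable_va - 4)            # COL pointer sits just before the vftable
    p_td = img.u32(col + 0x0C)               # COL.pTypeDescriptor
    p_chd = img.u32(col + 0x10)              # COL.pClassHierarchyDescriptor
    attributes = img.u32(p_chd + 0x04)       # CHD.attributes
    num_bases = img.u32(p_chd + 0x08)        # CHD.numBaseClasses (includes the class)
    p_array = img.u32(p_chd + 0x0C)          # CHD.pBaseClassArray
    bases = []
    for i in range(num_bases):
        bcd = img.u32(p_array + 4 * i)       # pointer to a BaseClassDescriptor
        bases.append(img.cstr(img.u32(bcd) + 8))   # BCD.pTypeDescriptor -> name
    return {"name": img.cstr(p_td + 8), "attributes": attributes, "bases": bases}

def build_demo(base=0x1000):
    """Hand-built RTTI for `class ClassB : public ClassA` (illustrative data only)."""
    buf = bytearray(0x100)

    def put32(off, *values):
        struct.pack_into("<%dI" % len(values), buf, off, *values)

    def put_type_descriptor(off, name):
        put32(off, 0, 0)                     # pVFTable and spare left as 0
        buf[off + 8:off + 8 + len(name) + 1] = name + b"\0"

    put_type_descriptor(0x00, b".?AVClassA@@")
    put_type_descriptor(0x20, b".?AVClassB@@")
    put32(0x40, base + 0x20, 1)              # BCD for ClassB (remaining fields 0)
    put32(0x60, base + 0x00, 0)              # BCD for ClassA
    put32(0x80, base + 0x40, base + 0x60)    # BaseClassArray
    put32(0x90, 0, 0, 2, base + 0x80)        # ClassHierarchyDescriptor
    put32(0xA0, 0, 0, 0, base + 0x20, base + 0x90)   # CompleteObjectLocator
    put32(0xB4, base + 0xA0)                 # DWORD just before the vftable
    return Image(base, buf), base + 0xB8     # vftable entries themselves are unused

demo_image, demo_vftable = build_demo()
info = class_info(demo_image, demo_vftable)
```

For this hand-built image, info names ClassB and lists ClassB followed by ClassA as its BaseClassArray entries, which mirrors the rule that numBaseClasses counts the class itself.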
Identifying Class Relationships

Constructor Analysis

* Single Inheritance
  – The derived class constructor passes its own this pointer, unchanged, to the base class constructor

    .text:00401010 sub_401010 proc near
    .text:00401010
    .text:00401010 var_4 = dword ptr -4
    .text:00401010
    .text:00401010 push ebp
    .text:00401011 mov ebp, esp
    .text:00401013 push ecx
    .text:00401014 mov [ebp+var_4], ecx ; get this ptr to current object
    .text:00401017 mov ecx, [ebp+var_4]
    .text:0040101A call sub_401000 ; call class A constructor
    .text:0040101F mov eax, [ebp+var_4]
    .text:00401022 mov esp, ebp
    .text:00401024 pop ebp
    .text:00401025 retn
    .text:00401025 sub_401010 endp

* Multiple Inheritance
  – Each base class constructor receives a pointer to its own subobject (this plus the subobject's offset)

    .text:00401020 sub_401020 proc near
    .text:00401020
    .text:00401020 var_4 = dword ptr -4
    .text:00401020
    .text:00401020 push ebp
    .text:00401021 mov ebp, esp
    .text:00401023 push ecx
    .text:00401024 mov [ebp+var_4], ecx
    .text:00401027 mov ecx, [ebp+var_4] ; ptr to base class A
    .text:0040102A call sub_401000 ; call class A constructor
    .text:0040102A
    .text:0040102F mov ecx, [ebp+var_4]
    .text:00401032 add ecx, 4 ; ptr to base class C
    .text:00401035 call sub_401010 ; call class C constructor
    .text:00401035
    .text:0040103A mov eax, [ebp+var_4]
    .text:0040103D mov esp, ebp
    .text:0040103F pop ebp
    .text:00401040 retn
    .text:00401040
    .text:00401040 sub_401020 endp

    class A size(4):
        +---
     0  | a1
        +---

    class C size(4):
        +---
     0  | c1
        +---

    class D size(12):
        +---
        | +--- (base class A)
     0  | | a1
        | +---
        | +--- (base class C)
     4  | | c1
        | +---
     8  | d1
        +---

Identifying Relationships via RTTI

* Using RTTIClassHierarchyDescriptor
  – Contains pointers to RTTIBaseClassDescriptors (BCDs) through pBaseClassArray (offset 0x0C)
  – numBaseClasses (offset 0x08) gives the number of entries; the count includes the class itself
  – attributes (offset 0x04): bit 0 – multiple inheritance, bit 1 – virtual inheritance

* Example: C inherits B inherits A

    class ClassA {…}
    class ClassB : public ClassA {…}
    class ClassC : public ClassB {…}

    [Figure: inheritance chain. Class B inherits from Class A; Class C inherits from Class B.]

    [Figure: Class C's vftable leads to its CompleteObjectLocator, which points to the ClassC TypeDescriptor and to the ClassHierarchyDescriptor. The BaseClassArray holds three BaseClassDescriptors whose TypeDescriptors name ClassC, ClassB and ClassA.]
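The BaseClassArray is a flat list, so direct parent/child edges have to be recovered from it. The following is a sketch under two stated assumptions that the slides do not spell out: that the array is ordered depth-first, and that numContainedBases counts every entry nested under a base (the reading given in Igor Skochinsky's article). The function direct_bases() and the sample tuples are hypothetical.

```python
def direct_bases(entries):
    """Rebuild direct inheritance edges from a flattened BaseClassArray.

    `entries` is a list of (class_name, numContainedBases) tuples in array
    order; entry 0 is the class itself. Returns {class_name: [direct bases]}.
    """
    tree = {}

    def parse(index):
        name, contained = entries[index]
        children = tree.setdefault(name, [])
        end = index + 1 + contained          # first entry past this subtree
        cursor = index + 1
        while cursor < end:
            child = entries[cursor][0]
            if child not in children:
                children.append(child)       # entry directly under `name`
            cursor = parse(cursor)           # skip over the child's own bases
        return end

    parse(0)
    return tree

# C inherits B inherits A (the example above).
chain = direct_bases([("ClassC", 2), ("ClassB", 1), ("ClassA", 0)])

# A hypothetical ClassD : ClassB, ClassC where ClassB : ClassA.
multi = direct_bases([("ClassD", 3), ("ClassB", 1), ("ClassA", 0), ("ClassC", 0)])
```

With these inputs, chain links ClassC to ClassB and ClassB to ClassA, while multi gives ClassD two direct bases and keeps ClassA under ClassB only.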
Identifying Class Members

* Class Member Variables

    .text:00401003 push ecx
    .text:00401004 mov [ebp+var_4], ecx
    .text:00401007 mov eax, [ebp+var_4]
    .text:0040100A mov dword ptr [eax+8], 12345h

* Virtual Functions

    .text:00401C21 mov ecx, [ebp+var_1C] ; ecx = this pointer
    .text:00401C24 mov edx, [ecx] ; edx = ptr to vftable
    .text:00401C26 mov ecx, [ebp+var_1C]
    .text:00401C29 mov eax, [edx+4]
    .text:00401C2C call eax ; call virtual function

* Non-virtual Member Functions

    .text:00401AFC push 0CCh
    .text:00401B01 lea ecx, [ebp+var_C] ; ecx = this pointer
    .text:00401B04 call sub_401110

    .text:00401110 push ebp
    .text:00401111 mov ebp, esp
    .text:00401113 push ecx
    .text:00401114 mov [ebp+var_4], ecx ; ecx used

Part III. Automation

OOPRE
* Developed in Python
* Uses the IDAPython platform
* Identifies classes, relationships and members
* Uses static analysis

Why a Static Approach
* Runtime analysis is difficult to perform on some platforms (Symbian)
* Of course, a hybrid approach may produce more exact results
Automated Analysis Strategies

1. Polymorphic Class Identification via RTTI

* Leverage RTTI data to accurately extract:
  – Polymorphic classes
  – Polymorphic class names
  – Polymorphic class hierarchy
  – Polymorphic class virtual function tables and virtual functions
  – Polymorphic class destructors/constructors

* Searching for RTTI-related structures
  – Via virtual function table (vftable) searching. An item is treated as a vftable if:
      * it is a DWORD
      * it is a pointer to code
      * it is referenced from code, and the referencing instruction is a mov (vftable assignment)
  – A pointer to the RTTICompleteObjectLocator sits in the DWORD immediately before the vftable

    .rdata:004165B0 dd offset ClassBRTTICompleteObjectLocator00
    .rdata:004165B4 ClassBvftable
    .rdata:004165B4 dd offset sub_401410 ; DATA XREF:...
    .rdata:004165B8 dd offset sub_401460
    .rdata:004165BC dd offset sub_401230

* Verifying RTTICompleteObjectLocator
  – Verify that the RTTICompleteObjectLocator points to a valid TypeDescriptor
  – A TypeDescriptor is valid if TypeDescriptor.name starts with ".?AV"

    .rdata:00418A28 ClassBRTTICompleteObjectLocator00
    .rdata:00418A28 dd 0 ; signature
    .rdata:00418A2C dd 0 ; offset
    .rdata:00418A30 dd 0 ; cdOffset
    .rdata:00418A34 dd offset ClassBTypeDescriptor
    .rdata:00418A38 dd offset ClassBRTTIClassHierarchyDescriptor

    .data:0041B01C ClassBTypeDescriptor dd offset typeinfovftable
    .data:0041B020 dd 0 ; spare
    .data:0041B024 aavclassb db '.?AVClassB@@',0 ; name
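The search and verification steps can be combined into a single scan. The following is a rough sketch, not OOPRE's actual code: it walks DWORD-aligned data, keeps items that point into code, and accepts one as a vftable when the preceding DWORD points to a COL whose TypeDescriptor name starts with ".?AV". The check that the item is referenced by a mov instruction is omitted because it needs a disassembler. All function names, address ranges and bytes below are made up for illustration.

```python
import struct

def u32(image, base, va):
    """Read a little-endian DWORD at virtual address `va`."""
    return struct.unpack_from("<I", image, va - base)[0]

def valid_col(image, base, col_va):
    """COL is valid if COL.pTypeDescriptor leads to a name starting with '.?AV'."""
    end = base + len(image)
    if not (base <= col_va <= end - 0x14):
        return False
    type_descriptor = u32(image, base, col_va + 0x0C)
    if not (base <= type_descriptor <= end - 12):
        return False
    name_offset = type_descriptor + 8 - base
    return image[name_offset:name_offset + 4] == b".?AV"

def find_vftables(image, base, code_range, data_range):
    """Return addresses of DWORDs that look like the first entry of a vftable."""
    found = []
    for va in range(data_range[0] + 4, data_range[1] - 3, 4):
        first_entry = u32(image, base, va)
        if not (code_range[0] <= first_entry < code_range[1]):
            continue                         # not a pointer to code
        if valid_col(image, base, u32(image, base, va - 4)):
            found.append(va)
    return found

# Synthetic data section: a TypeDescriptor, a COL, then COL pointer + vftable.
BASE = 0x404000
CODE = (0x401000, 0x402000)                  # assumed .text range (never read)
data = bytearray(0x80)
data[0x08:0x15] = b".?AVClassB@@\0"          # TypeDescriptor.name at offset 8
struct.pack_into("<5I", data, 0x20, 0, 0, 0, BASE + 0x00, 0)    # COL
struct.pack_into("<4I", data, 0x40, BASE + 0x20,                # COL pointer
                 0x401410, 0x401460, 0x401230)                  # vftable entries
vftables = find_vftables(bytes(data), BASE, CODE, (BASE + 0x40, BASE + 0x60))
```

Only the first vftable entry is reported, because the later entries are preceded by code pointers rather than by a COL pointer.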
* Class information from RTTI (summary)

    newclass()                Identified from TypeDescriptors
    newclass.classname        Identified from TypeDescriptor.name
    newclass.vftable/vfuncs   Identified from the vftable-to-RTTICompleteObjectLocator relationship
    newclass.ctorsdtors       Identified from functions referencing the vftable
    newclass.baseclasses      Identified from RTTICompleteObjectLocator.pClassHierarchyDescriptor

2. Polymorphic Class Identification (w/o RTTI)

* Polymorphic class identification without RTTI
  – Via vftable searching (previously discussed)
  – Base classes are not yet identified
  – The class name is generated automatically

    newclass()                Identified from vftable
    newclass.classname        Auto-generated (based on the vftable address, etc.)
    newclass.vftable/vfuncs   Identified from vftable
    newclass.ctorsdtors       Identified from functions referencing the vftable

3. Class Identification via Constructor / Destructor Search

* Simple data flow analyzer algorithm
  1. If the variable/register is overwritten, stop tracking it.
  2. If EAX is being tracked and a call is encountered, stop tracking it. (We assume that all calls return values in EAX.)
  3. If a call is encountered, treat the next instruction as a new block.
  4. If a conditional jump is encountered, follow the register/variable in both branches, starting a new block on each branch.
  5. If the register/variable was copied into another variable, start a new block and track both the old variable and the new one starting from this block.
  6. Otherwise, track the next instruction.
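A stripped-down version of these rules can be sketched for straight-line code. The following is only an illustration, not the authors' analyzer: instructions are hypothetical (mnemonic, destination, source) tuples, block bookkeeping (rule 3) and branch following (rule 4) are left out, and the set of destination-writing mnemonics is a small assumed subset.

```python
# Mnemonics assumed to overwrite their destination operand (partial list).
WRITES_DST = {"mov", "lea", "pop", "xor", "add", "sub"}

def track(instrs, start, location):
    """Track the value held in `location` from index `start` onward.

    Returns (index, aliases) pairs: the set of locations holding the tracked
    value just before each instruction executes.
    """
    aliases = {location}
    trace = []
    for index in range(start, len(instrs)):
        if not aliases:
            break                            # everything was overwritten
        mnemonic, dst, src = instrs[index]
        trace.append((index, frozenset(aliases)))
        if mnemonic == "call":
            aliases.discard("eax")           # rule 2: calls return in EAX
        elif mnemonic == "mov" and src in aliases:
            aliases.add(dst)                 # rule 5: value copied, track both
        elif mnemonic in WRITES_DST:
            aliases.discard(dst)             # rule 1: overwritten, stop tracking

    return trace

# Straight-line part of the dynamically allocated object example.
MAIN = [
    ("push", None, "4"),
    ("call", None, "operator new"),
    ("test", "eax", "eax"),
    ("pop", "ecx", None),
    ("mov", "ecx", "eax"),
    ("call", None, "sub_401000"),
    ("mov", "esi", "eax"),
]

trace = track(MAIN, 2, "eax")                # start right after operator new
calls_with_ecx = [MAIN[i][2] for i, held in trace
                  if MAIN[i][0] == "call" and "ecx" in held]
```

In this run the allocation result is copied from EAX into ECX before the call to sub_401000, so that call is the one reached while ECX holds the tracked value; tracking of EAX stops at the call, as rule 2 prescribes.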
* Constructor identification
  – For dynamically allocated objects
      1. Look for calls to new().
      2. Track the value returned in EAX.
      3. When tracking is done, look for the earliest call where the tracked register/variable is ECX. Mark the called function as a constructor.
  – For local objects
      * Do the same thing, but instead of initially tracking the value returned by new(), first locate instructions where the address of a stack variable is written to ECX, then start tracking ECX.

4. Class Relationship Inferencing

* Inheritance identification
  1. Track the this pointer (ECX).
  2. Check blocks with ECX as the tracked variable.
  3. See if there is a call to a constructor.
  4. To handle multiple inheritance, track pointers to offsets relative to the object address.

5. Class Member Identification

* Member variables
  – Track the this pointer from the point where the object is initialized.
  – Note accesses to offsets relative to the this pointer.

* Non-virtual functions
  – Track the this pointer from the point where the object is initialized.
  – Note all blocks where ECX is the tracked variable, then mark the call in that block, if there is any, as a member of the current class.

* Virtual functions
  – To identify virtual functions, we simply have to locate vftables first through constructor analysis.
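Member variable identification can be sketched as a scan over a function's instruction text. The following is a simplified illustration under several assumptions: instructions are plain strings without addresses or comments, only mov copies of the this pointer are followed, and member accesses have the form [reg] or [reg+hex]. The function member_offsets() and its regular expression are hypothetical, not part of OOPRE.

```python
import re

# Matches [reg] or [reg+hexoffset], e.g. [esi], [eax+8], [ecx+0Ch].
MEM_OPERAND = re.compile(r"\[(\w+)(?:\+([0-9A-Fa-f]+)h?)?\]")

def member_offsets(lines, this_location="ecx"):
    """Collect offsets accessed relative to any location holding `this`."""
    aliases = {this_location}
    offsets = set()
    for line in lines:
        mnemonic, _, operand_text = line.partition(" ")
        # Record accesses first, using the aliases valid before this instruction.
        for base, offset in MEM_OPERAND.findall(operand_text):
            if base in aliases:
                offsets.add(int(offset, 16) if offset else 0)
        operands = [part.strip() for part in operand_text.split(",")]
        if mnemonic == "mov" and len(operands) == 2:
            dst, src = operands
            if src in aliases:
                aliases.add(dst)             # this pointer copied
            else:
                aliases.discard(dst)         # location overwritten
    return sorted(offsets)

# Instruction text taken from the member variable example in Part II.
METHOD = [
    "push ecx",
    "mov [ebp+var_4], ecx",
    "mov eax, [ebp+var_4]",
    "mov dword ptr [eax+8], 12345h",
]

# Instruction text taken from the uninitialized-ecx example in Part II.
CONSTRUCTOR = [
    "push esi",
    "mov esi, ecx",
    "mov dword ptr [esi], offset off_40C0D0",
    "mov dword ptr [esi+4], 0BBh",
]

method_offsets = member_offsets(METHOD)
constructor_offsets = member_offsets(CONSTRUCTOR)
```

The first function touches a member at offset 8; the second writes offset 0 (the vftable pointer slot) and offset 4. A fuller version would also follow lea, add and branches, as the tracking rules require.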
After all of this is done, the class is reconstructed from the results of these analyses.

Enhancing Disassembly

* RTTI structures: reconstruction, naming, commenting

  Original

    .rdata:004165A0 dd offset unk_4189E0
    .rdata:004165A4 off_4165A4 dd offset sub_401170 ; DATA XREF:...
    .rdata:004165A8 dd offset sub_4011C0
    .rdata:004165AC dd offset sub_401230
    .rdata:004165B0 dd offset unk_418A28

  Processed

    .rdata:004165A0 dd offset oopreClassARTTICompleteObjectLocator00
    .rdata:004165A4 oopreClassAvftable00 dd offset sub_401170 ; DATA XREF: ...
    .rdata:004165A8 dd offset sub_4011C0
    .rdata:004165AC dd offset sub_401230

* RTTI structures (another example)

  Original

    .rdata:004189E0 dword_4189E0 dd 0 ; DATA XREF:...
    .rdata:004189E4 dd 0
    .rdata:004189E8 dd 0
    .rdata:004189EC dd offset off_41B004
    .rdata:004189F0 dd offset unk_4189F4

  Processed

    .rdata:004189E0 oopreClassARTTICompleteObjectLocator00 dd 0 ; RTTICompleteObjectLocator.signature
    .rdata:004189E4 dd 0 ; RTTICompleteObjectLocator.offset
    .rdata:004189E8 dd 0 ; RTTICompleteObjectLocator.cdOffset
    .rdata:004189EC dd offset oopreClassATypeDescriptor
    .rdata:004189F0 dd offset oopreClassARTTIClassHierarchyDescriptor

* Improving the call graph
  – Add cross references on virtual function calls
  – Results in a more accurate call graph
  – Will improve binary diffing results

Visualization

* UML diagram generation
  – Using pydot
  – Create a node for each class
  – Create an edge from each base class
  – Pretty simple (once you have the data :) and cool too…
  – Very effective if RTTI exists (class names)
  – EXE2UML

* UML diagram examples (w/o RTTI and w/ RTTI), both generated for:

    class ClassA {...}
    class ClassB : public ClassA {...}
    class ClassC {...}
    class ClassD : public ClassB, public ClassC {...}

    [Figure: UML diagram generated without RTTI]

    [Figure: UML diagram generated with RTTI]
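The node-per-class, edge-per-base-class idea can be sketched without any dependency by emitting Graphviz DOT text directly; the slides use pydot, which builds the same kind of graph through an object API. The function to_dot() below is hypothetical, and drawing each edge from the derived class to its base with an empty arrowhead is an assumed UML-style convention rather than something the slides specify.

```python
def to_dot(classes):
    """Render {class_name: [base class names]} as Graphviz DOT text."""
    lines = [
        "digraph classes {",
        "    rankdir=BT;",                   # bases drawn above derived classes
        "    node [shape=record];",
    ]
    for name in classes:
        lines.append('    "%s";' % name)     # one node per class
    for name, bases in classes.items():
        for base in bases:
            # One edge per base class, derived -> base.
            lines.append('    "%s" -> "%s" [arrowhead=empty];' % (name, base))
    lines.append("}")
    return "\n".join(lines)

# The hierarchy from the UML diagram examples.
HIERARCHY = {
    "ClassA": [],
    "ClassB": ["ClassA"],
    "ClassC": [],
    "ClassD": ["ClassB", "ClassC"],
}

dot_text = to_dot(HIERARCHY)
```

The resulting text can be written to a file and rendered with the Graphviz dot tool. Without RTTI the same code applies, with auto-generated names in place of the real class names.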
Demo…

Thank you! Questions?

Paul Vincent Sabanal, X-Force R&D
Mark Vincent Yason, X-Force R&D
Document Information
Category:
Presentations
User Name:
JadenNorton
User Type:
Researcher
Country:
United States
Uploaded Date:
14-07-2017