Dynamic code execution in Writeable+Executable memory alloc
Posted: Tue Nov 24, 2015 12:18 am
				
Hello, today I've been learning all about dynamic code execution, a feature Intel calls self-modifying code. Of course it doesn't have to be self-modifying: it can simply be code copied to a new memory region and run from there. That's copied (or copied+modified) code rather than self-modifying code, but it needs the same environment: memory that is Writeable+Executable, and a CPU instruction cache flush whenever the code is modified.
 
Some uses include compression (packers like UPX decompress your executable code into a writeable+executable memory section and then call it), encryption (protecting code from disassembly), and optimization (for example, detecting a CPU feature and modifying some code to take advantage of it), just to name a few! Endless possibilities, lol. Hopefully Fred will add support for protection + alignment flags to AllocateMemory(), and an option for the main .code section.
 
The method is very simple, whew!
1) Get the system page size so we can allocate memory of an appropriate size. The memory must be page-aligned too, so we need to call an API such as valloc/VirtualAlloc instead of AllocateMemory.
2) Change its protection to Readable+Writeable+Executable. This is the key: all our other program code is just Readable+Executable, not Writeable.
3) Copy the code over. If we want to make any modifications, we can do them now.
4) Flush the instruction cache to ensure the CPU will execute our new code.
5) Call/jmp to the code.
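
For readers who think in C rather than PureBasic, the five steps above can be sketched roughly as follows. This is only an illustration under some assumptions: it targets POSIX systems on x86/x86-64, it uses mmap() to get the page-aligned Read+Write+Exec memory in a single call (instead of valloc followed by mprotect), and the function name run_patched_code is made up for this example. Hardened systems may refuse writable+executable mappings, in which case the sketch just reports failure.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#endif
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Machine code for "mov eax, 0 ; ret" on x86 / x86-64. */
static const uint8_t dyn_code[] = { 0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3 };

/* Hypothetical helper: copies dyn_code into RWX memory, patches the
   immediate to `value`, runs it and stores the returned eax in *out.
   Returns 0 on success, -1 if the CPU is not x86 or the OS refuses
   to hand out writable+executable memory. */
int run_patched_code(int32_t value, int32_t *out)
{
#if defined(__x86_64__) || defined(__i386__)
    /* 1) Page size, so the allocation is a whole number of pages. */
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0) pagesize = 4096;

    /* 2) Page-aligned memory that is readable, writable and executable. */
    void *mem = mmap(NULL, (size_t)pagesize,
                     PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    /* 3) Copy the code over, then patch the 32-bit immediate that
          follows the one-byte B8 opcode. */
    memcpy(mem, dyn_code, sizeof dyn_code);
    memcpy((uint8_t *)mem + 1, &value, sizeof value);

    /* 4) Flush the instruction cache; the builtin takes a start and an
          end address, not a length. */
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof dyn_code);

    /* 5) Call the code; the result comes back in eax. */
    int (*fn)(void) = (int (*)(void))mem;
    *out = fn();

    munmap(mem, (size_t)pagesize);
    return 0;
#else
    (void)value;
    (void)out;
    return -1;
#endif
}
```

On an x86 machine that allows such mappings, calling run_patched_code(123, &result) should return 0 and leave 123 in result, which mirrors what the PureBasic demos print.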
 
Anyway, I have managed to put together code to demonstrate this for Windows + Linux + OSX, 32+64 bit. Maybe somebody will find it useful too; I couldn't find any similar postings. If not, I still had fun learning anyway, heehee.
 
For this demo my dynamic code is simply "mov eax,0 / ret" (bytes: B8 00000000 C3). Pretend it was compressed and I've just decompressed it and copied it into a new Writeable+Executable memory section. I also patch the code so it becomes "mov eax,123" to prove the dynamic ability.
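
To make the byte layout concrete, here is a small C sketch of the patch itself. B8 is the opcode for "mov eax, imm32" and the four bytes after it hold the immediate in little-endian order, so writing 123 (0x7B) into the first of them is enough for small values, while larger values need all four bytes. The helper name patch_mov_eax_imm32 is invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: overwrite the 32-bit immediate of a
   "mov eax, imm32" instruction located at code[0..4].
   Returns 0 on success, -1 if the buffer is too short or does not
   start with the B8 opcode. */
int patch_mov_eax_imm32(uint8_t *code, size_t len, uint32_t value)
{
    if (len < 5 || code[0] != 0xB8) return -1;

    /* x86 stores immediates least significant byte first. */
    code[1] = (uint8_t)(value);
    code[2] = (uint8_t)(value >> 8);
    code[3] = (uint8_t)(value >> 16);
    code[4] = (uint8_t)(value >> 24);
    return 0;
}
```

Applied to the demo bytes with a value of 123, this turns B8 00 00 00 00 C3 into B8 7B 00 00 00 C3, i.e. "mov eax,123 / ret".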
 
Linux & OSX:
Code:
#PROT_NONE  = 0
#PROT_READ  = 1
#PROT_WRITE = 2
#PROT_EXEC  = 4
#PROT_ALL   = #PROT_READ + #PROT_WRITE + #PROT_EXEC
ImportC ""
  mprotect (*pmem, pagesize.i, protect.i)
  CompilerIf #PB_Compiler_OS = #PB_OS_Linux
    clear_cache (*pstart, *pend) As "__clear_cache"   ;Takes a start and an end address, not a length
  CompilerElseIf #PB_Compiler_OS = #PB_OS_MacOS
    sys_icache_invalidate(*pmem, bytes.i)
  CompilerEndIf
EndImport
;1) Get system memory page size so we can allocate in multiples of the correct size
pagesize.i = getpagesize_()
If pagesize <= 0: pagesize = 4096: EndIf
;2) Allocate memory. In this case a single page is big enough
*pmem = valloc_(pagesize)
If *pmem = 0
  MessageRequester("Error","valloc failed"): End
EndIf
;3) Change memory section protection to Read+Write+Exec
If mprotect(*pmem, pagesize, #PROT_ALL) = -1
  MessageRequester("Error","mprotect failed"): End
EndIf
;4) Copy the code over to the new memory region at *pmem
CopyMemory(?StartDynCode, *pmem, ?EndDynCode-?StartDynCode)
;5) It's dynamic code, so here I overwrite the 0 in "mov eax,0" with 123
;PokeA writes only the low byte of the 32-bit immediate, which is enough for values up to 255
PokeA(*pmem+1, 123)   ;Note that we're writing to the new *pmem, not the original at DataSection's StartDynCode
;6) Flush instruction cache to ensure new code is ready to be executed
CompilerIf #PB_Compiler_OS = #PB_OS_Linux
  clear_cache (*pmem, *pmem + pagesize)
CompilerElseIf #PB_Compiler_OS = #PB_OS_MacOS
  sys_icache_invalidate(*pmem, pagesize)
CompilerEndIf
;7) Call the code
Define lret.l
EnableASM 
call *pmem
mov lret, eax
DisableASM
MessageRequester("Result", Str(lret) + " (should be 123)")
End
DataSection
  StartDynCode:   ;Doesn't get executed from here; we copy it to our new memory and execute it there
  ! mov eax, 0    ;We will patch this 0 to 123
  ! ret
  EndDynCode:
EndDataSection
Windows:
Code:
;1) Get system memory page size so we can allocate in multiples of the correct size
Define lpSystemInfo.SYSTEM_INFO
GetSystemInfo_(lpSystemInfo)
pagesize.i = lpSystemInfo\dwPageSize
If pagesize <= 0: pagesize = 4096: EndIf
;2) Allocate memory. In this case a single page is big enough
*pmem = VirtualAlloc_(0, pagesize, #MEM_COMMIT, #PAGE_EXECUTE_READWRITE)
If *pmem = 0
  MessageRequester("Error","VirtualAlloc failed"): End
EndIf
;3) Change memory section protection to Read+Write+Exec
;Not needed though because we already set the protection when calling VirtualAlloc()
;If VirtualProtect_(*pmem, pagesize, #PAGE_EXECUTE_READWRITE, @oldprot) = 0
;  MessageRequester("Error","VirtualProtect failed"): End
;EndIf
;4) Copy the code over to the new memory region at *pmem
CopyMemory(?StartDynCode, *pmem, ?EndDynCode-?StartDynCode)
;5) It's dynamic code, so here I overwrite the 0 in "mov eax,0" with 123
;PokeA writes only the low byte of the 32-bit immediate, which is enough for values up to 255
PokeA(*pmem+1, 123)   ;Note that we're writing to the new *pmem, not the original at DataSection's StartDynCode
;6) Flush instruction cache to ensure new code is ready to be executed
If FlushInstructionCache_(GetCurrentProcess_(), *pmem, pagesize) = 0
  MessageRequester("Warning", "Cache flush failed")
EndIf
;7) Call the code
Define lret.l
EnableASM 
call *pmem
mov lret, eax
DisableASM
MessageRequester("Result", Str(lret) + " (should be 123)")
End
DataSection
  StartDynCode:   ;Doesn't get executed from here; we copy it to our new memory and execute it there
  ! mov eax, 0    ;We will patch this 0 to 123
  ! ret
  EndDynCode:
EndDataSection

“Joo Janta 200 Super-Chromatic Peril Sensitive Sunglasses have been specially designed to help people develop a relaxed attitude to danger. At the first hint of trouble, they turn totally black and thus prevent you from seeing anything that might alarm you.”

