Tuesday, February 20, 2007

Getting the vsyscall address from the ELF auxiliary vector

In a previous post I showed how to make new-style system calls on Linux x86. I also remarked that assuming __kernel_vsyscall always lives at a fixed address is wrong. The proper way to get the function's address would be to declare it as "extern" and link against linux-gate.so.1. However, I don't know how to do that, because linux-gate is a virtual object with no file on disk, so ld can't find it. Here is another way: use the information that the ELF loader passes to the running program. The loader places an auxiliary vector of (key, value) pairs on the stack, right after the usual argc, argv and envp pointers, and one of its entries holds the vsyscall address.
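To make that layout concrete, here is a rough sketch of the stack as a 32-bit Linux process sees it at _start. The struc name "auxv_entry" is made up for illustration; it only mirrors the C type Elf32_auxv_t from elf.h.

```nasm
; Sketch of the initial stack of a 32-bit Linux process, as seen
; at _start (lowest address first).
;
;   esp ->  argc
;           argv[0] ... argv[argc-1]
;           NULL                      ; end of argv
;           envp[0] ... envp[n-1]
;           NULL                      ; end of envp
;           (key, value)              ; first aux vector entry
;           ...
;           (AT_NULL, ignored)        ; end of the aux vector
;
; Each aux vector entry is two dwords. "auxv_entry" is a
; hypothetical name; it mirrors Elf32_auxv_t from elf.h.
struc auxv_entry
     .a_type: resd 1   ; key, e.g. AT_SYSINFO (32)
     .a_val:  resd 1   ; value, e.g. the __kernel_vsyscall address
endstruc
```

NASM also defines auxv_entry_size (8 here) for a struc like this, which matches the 8-byte step between entries.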

First, let's modify our previous hello.asm to call a new function "get_kernel_vsyscall", and set the address returned as the system call address:

; hello.asm
bits 32
global _start
extern get_kernel_vsyscall ; new function in vsyscall.asm
section .data
     hello: db 'Hello World !', 10
     hello_len: equ $-hello
section .bss
     linux_gate: resd 1 ; now, linux gate is variable
section .text
_start:
     ; New code to save the vsyscall address.
     ; At _start, esp still points at argc
     mov eax, esp
     call get_kernel_vsyscall
     mov [linux_gate], eax

     mov eax, 4 ; write()
     mov ebx, 1 ; fd 1 = stdout
     mov ecx, hello
     mov edx, hello_len
     call [linux_gate] ; call with dereference

     mov eax, 1 ; _exit()
     mov ebx, 42 ; exit status
     call [linux_gate] ; call with dereference


So, basically, we changed "linux_gate" into a variable and filled it with the address returned by "get_kernel_vsyscall". This function walks up the stack, beginning at argc, until it finds the correct (key, value) pair in the ELF auxiliary vector:

; vsyscall.asm
bits 32
global get_kernel_vsyscall
%define ELF_VSYSCALL_ID 32 ; defined in elf.h as AT_SYSINFO
%define ELF_EOF_ID 0 ; defined in elf.h as AT_NULL

section .text
get_kernel_vsyscall:
; Returns the kernel vsyscall address from the
; ELF aux vector. The vsyscall address is the
; subroutine used for the new style system call.
; in:
;     eax: the address, in the stack, of "argc"
; out:
;     eax: the kernel vsyscall address, or zero
;     on error

     push ebx

     ; First we skip argc, the argc argv pointers and
     ; argv's NULL terminator: (argc + 2) * 4 bytes
     mov ebx, [eax]
     add ebx, 2
     shl ebx, 2
     add eax, ebx

     ; For envp, we need a loop until NULL is found
envp_loop:
     mov ebx, [eax]
     add eax, 4
     cmp ebx, 0
     jne envp_loop

     ; Now we are in the ELF aux vector. Walk its
     ; 8-byte (key, value) pairs until AT_SYSINFO
     ; or the AT_NULL terminator is found
elf_vector_loop:
     mov ebx, [eax]
     cmp ebx, ELF_VSYSCALL_ID
     je vsyscall_id_found
     add eax, 8
     cmp ebx, ELF_EOF_ID
     jne elf_vector_loop

     ; AT_NULL reached: no vsyscall entry, return zero
     xor eax, eax
     jmp vsyscall_done
vsyscall_id_found:
     mov eax, [eax + 4] ; the value follows the key

vsyscall_done:
     pop ebx
     ret

Weird, eh? We had to look for the AT_SYSINFO entry (ELF_VSYSCALL_ID in the code) in an extra vector provided by the ELF loader. Other executable formats would pass those parameters some other way. We now have two options: use a hardcoded vsyscall address or retrieve it from the ELF auxiliary vector. Assemble and link with "nasm -f elf hello.asm; nasm -f elf vsyscall.asm; ld -o hello hello.o vsyscall.o". On an x86-64 host, ld also needs "-m elf_i386" to produce a 32-bit executable.
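Since get_kernel_vsyscall is documented to return zero when it finds no AT_SYSINFO entry, the caller may want a fallback before calling through linux_gate. Below is a minimal sketch, assuming the old int 0x80 gate is an acceptable fallback. The "legacy_syscall" helper is hypothetical and not part of the original hello.asm.

```nasm
; hello.asm, variant with a fallback (sketch)
bits 32
global _start
extern get_kernel_vsyscall   ; from vsyscall.asm
section .data
     hello: db 'Hello World !', 10
     hello_len: equ $-hello
section .bss
     linux_gate: resd 1
section .text
legacy_syscall:              ; hypothetical helper: old-style system call
     int 0x80                ; takes its arguments in the same registers
     ret
_start:
     mov eax, esp            ; eax = address of argc
     call get_kernel_vsyscall
     test eax, eax           ; zero means AT_SYSINFO was not found
     jnz .store
     mov eax, legacy_syscall ; fall back to the int 0x80 wrapper
.store:
     mov [linux_gate], eax

     mov eax, 4              ; write()
     mov ebx, 1              ; fd 1 = stdout
     mov ecx, hello
     mov edx, hello_len
     call [linux_gate]

     mov eax, 1              ; _exit()
     mov ebx, 42             ; exit status
     call [linux_gate]
```

Both entry points take the system call number in eax and the arguments in ebx, ecx and edx, so the rest of the program does not need to know which one it got.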
