Translating GAS to NASM, part 1

February 15, 2021

Programming from the Ground Up by Jonathan Bartlett is a fantastic introduction to coding in assembly, but the syntax it uses is a bit outdated. While searching for more modern standards, I had trouble finding much related to the GNU Assembler, but plenty for NASM, the Netwide Assembler, which seems to be a lot more common.

So, in the interest of keeping my studies relevant, I decided to translate the coding examples into modern 64-bit NASM. None of this should be taken as authoritative. I’m just a student, learning as I go.

All register names used in Ground Up are for 32-bit registers, while x86-64 CPUs have 64-bit general-purpose registers. The 32-bit names still work for most instructions, but in some situations (like pushing to and popping off the stack, which can’t take a 32-bit register in 64-bit mode) the full register has to be used. Since I don’t yet know best practices for every situation, I’m just going to replace any 32-bit register (eax, edi, ebp, etc.) with its 64-bit equivalent (rax, rdi, rbp, etc.) for now.
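
Here’s a minimal sketch of the stack situation, assuming x86-64 Linux and a file assembled as 64-bit ELF. The value 42 is just something I picked for illustration. It pushes a full 64-bit register, pops it into another, and exits with that value as the status.

```nasm
    section .text
    global _start
_start:
    mov rax, 42
    push rax                    ; pushes all 8 bytes of rax
    pop rdi                     ; rdi now holds 42
    ; push eax                  ; NASM rejects this line in 64-bit mode
    mov eax, 60                 ; writing eax also zeroes the top half of rax
    syscall                     ; exit with status 42
```

The `mov eax, 60` line shows why the 32-bit names often still work: a write to a 32-bit register zero-extends into the full 64-bit register.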

Other key differences between the book and the conventions I’m using for this chapter are:

GAS uses the format instruction source, destination, while NASM uses instruction destination, source
Constants in GAS are preceded with a $ and registers with a %. NASM doesn’t use prefixes for either.
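GAS’s .globl directive is simply global in NASM
NASM doesn’t use suffixes for its instructions. mov will work for anything from a BYTE to a QWORD. When the size can’t be inferred from a register operand, it has to be stated with a size keyword, as in mov byte [address], 1.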
GAS comments start with #, NASM comments start with ;
On x86-64 Linux, int 0x80 has been superseded by the syscall instruction. The syscall number still goes in rax, but the numbering has changed: ‘exit’ is now number 60 instead of 1. The exit status, which is the first argument to the syscall, goes in rdi rather than ebx.
The syntax for indexed addressing is much neater in NASM. Instead of GAS’s weird address(,index,multiplier) thing, NASM dereferences all memory addresses using square brackets, within which the author can write the address as straightforward arithmetic: a base register, plus an index register times 1, 2, 4, or 8, plus a constant offset.
GAS uses a set of identifiers for declaring initialized data, like ‘.long’ and ‘.ascii’. NASM’s identifiers are db, dw, dd, dq, dt, do, dy, and dz, which cover values with 1, 2, 4, 8, 10, 16, 32, and 64 bytes, respectively.
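
To see these differences side by side, here’s a small sketch where each NASM line carries a comment with what I believe is the GAS equivalent. The labels values and flag are made up for this example, and it assumes x86-64 Linux with a statically linked, non-PIE executable.

```nasm
    section .data
values: dd 10, 20, 30               ; GAS: values: .long 10, 20, 30
flag:   db 0                        ; GAS: flag: .byte 0

    section .text
    global _start                   ; GAS: .globl _start
_start:
    mov rsi, 2                      ; GAS: movq $2, %rsi
    mov edi, [values + rsi * 4]     ; GAS: movl values(,%rsi,4), %edi
    mov byte [flag], 1              ; GAS: movb $1, flag
    mov rax, 60                     ; GAS: movq $60, %rax
    syscall                         ; book's 32-bit version: int $0x80, eax = 1, ebx = status
```

The program loads the third dword (30) into edi and exits with it as the status. Note the byte keyword on the store to flag: with no register operand to infer the size from, NASM needs it spelled out, where GAS would use the b suffix.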

Exercise 1 - Exiting a Process

    section .data

    section .text
    global _start
_start:
    mov rax, 60                 ; syscall number for exit
    mov rdi, 0                  ; exit status
    syscall

Exercise 2 - Find max value

    section .data
data_items: dq 3, 67, 34, 222, 45, 75, 54, 34, 44, 33, 22, 11, 0

    section .text
    global _start
_start:
    mov rdi, 0                          ; index into data_items
    mov rax, [data_items + rdi * 8]     ; rax holds the current item
    mov rbx, rax                        ; rbx holds the largest item so far

start_loop:
    cmp rax, 0                  ; Note that cmp doesn't work if the constant
                                ; comes first. 'cmp 0, rax' will throw an error.
    je loop_exit
    inc rdi
    mov rax, [data_items + rdi * 8]

    cmp rax, rbx                ; These two lines jump when rax <= rbx. GAS
    jle start_loop              ; reverses the operands, so the same test is
                                ; 'cmpq %rbx, %rax'. NASM order is more
                                ; intuitive for once.
    mov rbx, rax
    jmp start_loop

loop_exit:
    mov rax, 60                 ; syscall number for exit
    mov rdi, rbx                ; exit status is the largest item (222)
    syscall
