Translating GAS to NASM, part 1
February 15, 2021
Programming from the Ground Up by Jonathan Bartlett is a fantastic introduction to coding in assembly, but the syntax it uses is a bit outdated. While searching for more modern standards, I had trouble finding much related to the GNU Assembler, but plenty for NASM, the Netwide Assembler, which seems to be a lot more common.
So, in the interest of keeping my studies relevant, I decided to translate the coding examples into modern 64-bit NASM. None of this should be taken as authoritative. I’m just a student, and I’m learning as I go.
All register names used in Ground Up are for 32-bit registers. Most modern CPUs use 64-bit registers. The 32-bit version will still often work, but in some situations (like pushing to and popping off the stack) the full register needs to be accounted for. Since I don’t yet know best practices for every situation, I’m just going to replace any 32-bit register (eax, edi, ebp, etc.) with its 64-bit equivalent (rax, rdi, rbp, etc.) for now.
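To make the stack point concrete, here is a minimal sketch, assuming x86-64 Linux and NASM's elf64 output format. The values are made up for illustration and don't come from the book. As far as I can tell, push and pop only accept the full 64-bit register in 64-bit mode, and writing to a 32-bit register such as eax clears the upper half of rax.

```nasm
section .text
global _start
_start:
    mov rax, -1     ; rax = 0xFFFFFFFFFFFFFFFF
    mov eax, 7      ; writing eax zero-extends, so rax is now 7
    push rax        ; pushes all 8 bytes of rax
    ; push eax      ; not encodable in 64-bit mode, NASM rejects it
    pop rdi         ; rdi = 7, used below as the exit status
    mov rax, 60     ; syscall number for exit
    syscall
```

Assembled and linked as a 64-bit program, this should exit with status 7.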
Other key differences between the book and the conventions I’m using for this chapter are:
- GAS uses the format instruction source, destination, while NASM uses instruction destination, source.
- Constants in GAS are preceded with a $ and registers with a %. NASM doesn’t use prefixes for either.
- GAS’s globl keyword is simply global in NASM.
- NASM doesn’t use suffixes for its instructions. mov will work for anything from a BYTE to a QWORD.
- GAS comments start with #, NASM comments start with ;
- The use of int 0x80 has been replaced in the x86_64 standard. The syscall instruction should now be used. The number for selecting the syscall is still placed in rax, but which number issues which call has changed. ‘Exit’ is now number 60. The return value for syscall 60 is placed in rdi, not rbx.
- The syntax for indexed addressing is much neater in NASM. Instead of GAS’s weird address(,index,multiplier) thing, NASM dereferences all memory addresses using square brackets, within which the author can perform straightforward pointer arithmetic.
- GAS uses a set of identifiers for declaring initialized data, like ‘.long’ and ‘.ascii’. NASM’s identifiers are db, dw, dd, dq, dt, do, dy, and dz, which cover values with 1, 2, 4, 8, 10, 16, 32, and 64 bytes, respectively.
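Several of the differences above can be seen side by side in one small program. This is only a sketch, assuming x86-64 Linux and a static, non-PIE link. The labels values and flag are hypothetical names of my own, and the GAS equivalents in the comments are my best translation rather than anything taken from the book.

```nasm
section .data
values: dd 10, 20, 30           ; dd = 4-byte items (GAS: .long 10, 20, 30)
flag:   db 0                    ; db = 1-byte item  (GAS: .byte 0)

section .text
global _start                   ; GAS: .globl _start
_start:
    mov rsi, 2                  ; GAS: movq $2, %rsi
    mov edi, [values + rsi * 4] ; GAS: movl values(,%rsi,4), %edi
    mov byte [flag], 1          ; GAS: movb $1, flag
    mov rax, 60                 ; exit is syscall 60 on x86-64 Linux
    syscall                     ; exits with status 30 (values[2])
```

One wrinkle with the "no suffixes" rule: when neither operand is a register, as in the store to flag, NASM can't infer the size, so a keyword like byte is needed in front of the memory operand.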
Exercise 1 - Exiting a Process
section .data
section .text
global _start
_start:
    mov rax, 60     ; syscall number for exit
    mov rdi, 0      ; exit status
    syscall
Exercise 2 - Find max value
section .data
data_items: dq 3, 67, 34, 222, 45, 75, 54, 34, 44, 33, 22, 11, 0 ; 0 ends the list
section .text
global _start
_start:
    mov rdi, 0                          ; rdi = index of the current item
    mov rax, [data_items + rdi * 8]     ; rax = current item (8 bytes each)
    mov rbx, rax                        ; rbx = largest value seen so far
start_loop:
    cmp rax, 0      ; Note that cmp doesn't work if the constant
                    ; comes first. 'cmp 0, rax' will throw an error.
    je loop_exit
    inc rdi
    mov rax, [data_items + rdi * 8]
    cmp rax, rbx    ; These two lines jump back when rax <= rbx.
    jle start_loop  ; GAS writes the same test with the operands
                    ; swapped (cmp %rbx, %rax). NASM order is more
                    ; intuitive for once.
    mov rbx, rax
    jmp start_loop
loop_exit:
    mov rax, 60
    mov rdi, rbx    ; exit status = largest value found
    syscall