Sunday, February 5, 2012

NASM Docs Chapter 2: Running NASM

Next Chapter | Previous Chapter | Contents | Index

Chapter 2: Running NASM

2.1 NASM Command-Line Syntax

To assemble a file, you issue a command of the form

nasm -f <format> <filename> [-o <output>]

For example,

nasm -f elf myfile.asm

will assemble myfile.asm into an ELF object file myfile.o. And

nasm -f bin myfile.asm -o myfile.com

will assemble myfile.asm into a raw binary file myfile.com.

To produce a listing file, with the hex codes output from NASM displayed on the left of the original sources, use the -l option to give a listing file name, for example:

nasm -f coff myfile.asm -l myfile.lst

To get further usage instructions from NASM, try typing

nasm -h

This will also list the available output file formats, and what they are.

If you use Linux but aren't sure whether your system is a.out or ELF, type

file nasm

(in the directory in which you put the NASM binary when you installed it). If it says something like

nasm: ELF 32-bit LSB executable i386 (386 and up) Version 1

then your system is ELF, and you should use the option -f elf when you want NASM to produce Linux object files. If it says

nasm: Linux/i386 demand-paged executable (QMAGIC)

or something similar, your system is a.out, and you should use -f aout instead.

Like Unix compilers and assemblers, NASM is silent unless it goes wrong: you won't see any output at all, unless it gives error messages.

2.1.1 The -o Option: Specifying the Output File Name

NASM will normally choose the name of your output file for you; precisely how it does this is dependent on the object file format. For Microsoft object file formats (obj and win32), it will remove the .asm extension (or whatever extension you like to use - NASM doesn't care) from your source file name and substitute .obj. For Unix object file formats (aout, coff, elf and as86) it will substitute .o. For rdf, it will use .rdf, and for the bin format it will simply remove the extension, so that myfile.asm produces the output file myfile.

If the output file already exists, NASM will overwrite it, unless it has the same name as the input file, in which case it will give a warning and use nasm.out as the output file name instead.

For situations in which this behaviour is unacceptable, NASM provides the -o command-line option, which allows you to specify your desired output file name. You invoke -o by following it with the name you wish for the output file, either with or without an intervening space. For example:

nasm -f bin program.asm -o program.com 
nasm -f bin driver.asm -odriver.sys

2.1.2 The -f Option: Specifying the Output File Format

If you do not supply the -f option to NASM, it will choose an output file format for you itself. In the distribution versions of NASM, the default is always bin; if you've compiled your own copy of NASM, you can redefine OF_DEFAULT at compile time and choose what you want the default to be.

Like -o, the intervening space between -f and the output file format is optional; so -f elf and -felf are both valid.

A complete list of the available output file formats can be given by issuing the command nasm -h.

2.1.3 The -l Option: Generating a Listing File

If you supply the -l option to NASM, followed (with the usual optional space) by a file name, NASM will generate a source-listing file for you, in which addresses and generated code are listed on the left, and the actual source code, with expansions of multi-line macros (except those which specifically request no expansion in source listings: see section 4.2.9) on the right. For example:

nasm -f elf myfile.asm -l myfile.lst

2.1.4 The -s Option: Send Errors to stdout

Under MS-DOS it can be difficult (though there are ways) to redirect the standard-error output of a program to a file. Since NASM usually produces its warning and error messages on stderr, this can make it hard to capture the errors if (for example) you want to load them into an editor.

NASM therefore provides the -s option, requiring no argument, which causes errors to be sent to standard output rather than standard error. Therefore you can redirect the errors into a file by typing

nasm -s -f obj myfile.asm > myfile.err

2.1.5 The -i Option: Include File Search Directories

When NASM sees the %include directive in a source file (see section 4.5), it will search for the given file not only in the current directory, but also in any directories specified on the command line by the use of the -i option. Therefore you can include files from a macro library, for example, by typing

nasm -ic:\macrolib\ -f obj myfile.asm

(As usual, a space between -i and the path name is allowed, and optional).

NASM, in the interests of complete source-code portability, does not understand the file naming conventions of the OS it is running on; the string you provide as an argument to the -i option will be prepended exactly as written to the name of the include file. Therefore the trailing backslash in the above example is necessary. Under Unix, a trailing forward slash is similarly necessary.

(You can use this to your advantage, if you're really perverse, by noting that the option -ifoo will cause %include "bar.i" to search for the file foobar.i...)

If you want to define a standard include search path, similar to /usr/include on Unix systems, you should place one or more -i directives in the NASM environment variable (see section 2.1.11).

2.1.6 The -p Option: Pre-Include a File

NASM allows you to specify files to be pre-included into your source file, by the use of the -p option. So running

nasm myfile.asm -p myinc.inc

is equivalent to running nasm myfile.asm and placing the directive %include "myinc.inc" at the start of the file.

2.1.7 The -d Option: Pre-Define a Macro

Just as the -p option gives an alternative to placing %include directives at the start of a source file, the -d option gives an alternative to placing a %define directive. You could code

nasm myfile.asm -dFOO=100

as an alternative to placing the directive

%define FOO 100

at the start of the file. You can miss off the macro value, as well: the option -dFOO is equivalent to coding %define FOO. This form of the directive may be useful for selecting assembly-time options which are then tested using %ifdef, for example -dDEBUG.

2.1.8 The -e Option: Preprocess Only

NASM allows the preprocessor to be run on its own, up to a point. Using the -e option (which requires no arguments) will cause NASM to preprocess its input file, expand all the macro references, remove all the comments and preprocessor directives, and print the resulting file on standard output (or save it to a file, if the -o option is also used).

This option cannot be applied to programs which require the preprocessor to evaluate expressions which depend on the values of symbols: so code such as

%assign tablesize ($-tablestart)

will cause an error in preprocess-only mode.

2.1.9 The -a Option: Don't Preprocess At All

If NASM is being used as the back end to a compiler, it might be desirable to suppress preprocessing completely and assume the compiler has already done it, to save time and increase compilation speeds. The -a option, requiring no argument, instructs NASM to replace its powerful preprocessor with a stub preprocessor which does nothing.

2.1.10 The -w Option: Enable or Disable Assembly Warnings

NASM can observe many conditions during the course of assembly which are worth mentioning to the user, but not a sufficiently severe error to justify NASM refusing to generate an output file. These conditions are reported like errors, but come up with the word `warning' before the message. Warnings do not prevent NASM from generating an output file and returning a success status to the operating system.

Some conditions are even less severe than that: they are only sometimes worth mentioning to the user. Therefore NASM supports the -w command-line option, which enables or disables certain classes of assembly warning. Such warning classes are described by a name, for example orphan-labels; you can enable warnings of this class by the command-line option -w+orphan-labels and disable it by -w-orphan-labels.

The suppressible warning classes are:

  • macro-params covers warnings about multi-line macros being invoked with the wrong number of parameters. This warning class is enabled by default; see section 4.2.1 for an example of why you might want to disable it.
  • orphan-labels covers warnings about source lines which contain no instruction but define a label without a trailing colon. NASM does not warn about this somewhat obscure condition by default; see section 3.1 for an example of why you might want it to.
  • number-overflow covers warnings about numeric constants which don't fit in 32 bits (for example, it's easy to type one too many Fs and produce 0x7ffffffff by mistake). This warning class is enabled by default.

2.1.11 The NASM Environment Variable

If you define an environment variable called NASM, the program will interpret it as a list of extra command-line options, which are processed before the real command line. You can use this to define standard search directories for include files, by putting -i options in the NASM variable.

The value of the variable is split up at white space, so that the value -s -ic:\nasmlib will be treated as two separate options. However, that means that the value -dNAME="my name" won't do what you might want, because it will be split at the space and the NASM command-line processing will get confused by the two nonsensical words -dNAME="my and name".

To get round this, NASM provides a feature whereby, if you begin the NASM environment variable with some character that isn't a minus sign, then NASM will treat this character as the separator character for options. So setting the NASM variable to the value !-s!-ic:\nasmlib is equivalent to setting it to -s -ic:\nasmlib, but !-dNAME="my name" will work.

2.2 Quick Start for MASM Users

If you're used to writing programs with MASM, or with TASM in MASM-compatible (non-Ideal) mode, or with a86, this section attempts to outline the major differences between MASM's syntax and NASM's. If you're not already used to MASM, it's probably worth skipping this section.

2.2.1 NASM Is Case-Sensitive

One simple difference is that NASM is case-sensitive. It makes a difference whether you call your label foo, Foo or FOO. If you're assembling to DOS or OS/2 .OBJ files, you can invoke the UPPERCASE directive (documented in section 6.2) to ensure that all symbols exported to other code modules are forced to be upper case; but even then, within a single module, NASM will distinguish between labels differing only in case.

2.2.2 NASM Requires Square Brackets For Memory References

NASM was designed with simplicity of syntax in mind. One of the design goals of NASM is that it should be possible, as far as is practical, for the user to look at a single line of NASM code and tell what opcode is generated by it. You can't do this in MASM: if you declare, for example,

foo       equ 1 
bar       dw 2

then the two lines of code

          mov ax,foo 
          mov ax,bar

generate completely different opcodes, despite having identical-looking syntaxes.

NASM avoids this undesirable situation by having a much simpler syntax for memory references. The rule is simply that any access to the contents of a memory location requires square brackets around the address, and any access to the address of a variable doesn't. So an instruction of the form mov ax,foo will always refer to a compile-time constant, whether it's an EQU or the address of a variable; and to access the contents of the variable bar, you must code mov ax,[bar].

This also means that NASM has no need for MASM's OFFSET keyword, since the MASM code mov ax,offset bar means exactly the same thing as NASM's mov ax,bar. If you're trying to get large amounts of MASM code to assemble sensibly under NASM, you can always code %idefine offset to make the preprocessor treat the OFFSET keyword as a no-op.

This issue is even more confusing in a86, where declaring a label with a trailing colon defines it to be a `label' as opposed to a `variable' and causes a86 to adopt NASM-style semantics; so in a86, mov ax,var has different behaviour depending on whether var was declared as var: dw 0 (a label) or var dw 0 (a word-size variable). NASM is very simple by comparison: everything is a label.

NASM, in the interests of simplicity, also does not support the hybrid syntaxes supported by MASM and its clones, such as mov ax,table[bx], where a memory reference is denoted by one portion outside square brackets and another portion inside. The correct syntax for the above is mov ax,[table+bx]. Likewise, mov ax,es:[di] is wrong and mov ax,[es:di] is right.

2.2.3 NASM Doesn't Store Variable Types

NASM, by design, chooses not to remember the types of variables you declare. Whereas MASM will remember, on seeing var dw 0, that you declared var as a word-size variable, and will then be able to fill in the ambiguity in the size of the instruction mov var,2, NASM will deliberately remember nothing about the symbol var except where it begins, and so you must explicitly code mov word [var],2.

For this reason, NASM doesn't support the LODS, MOVS, STOS, SCAS, CMPS, INS, or OUTS instructions, but only supports the forms such as LODSB, MOVSW, and SCASD, which explicitly specify the size of the components of the strings being manipulated.

2.2.4 NASM Doesn't ASSUME

As part of NASM's drive for simplicity, it also does not support the ASSUME directive. NASM will not keep track of what values you choose to put in your segment registers, and will never automatically generate a segment override prefix.

2.2.5 NASM Doesn't Support Memory Models

NASM also does not have any directives to support different 16-bit memory models. The programmer has to keep track of which functions are supposed to be called with a far call and which with a near call, and is responsible for putting the correct form of RET instruction (RETN or RETF; NASM accepts RET itself as an alternate form for RETN); in addition, the programmer is responsible for coding CALL FAR instructions where necessary when calling external functions, and must also keep track of which external variable definitions are far and which are near.

2.2.6 Floating-Point Differences

NASM uses different names to refer to floating-point registers from MASM: where MASM would call them ST(0), ST(1) and so on, and a86 would call them simply 0, 1 and so on, NASM chooses to call them st0, st1 etc.

As of version 0.96, NASM now treats the instructions with `nowait' forms in the same way as MASM-compatible assemblers. The idiosyncratic treatment employed by 0.95 and earlier was based on a misunderstanding by the authors.

2.2.7 Other Differences

For historical reasons, NASM uses the keyword TWORD where MASM and compatible assemblers use TBYTE.

NASM does not declare uninitialised storage in the same way as MASM: where a MASM programmer might use stack db 64 dup (?), NASM requires stack resb 64, intended to be read as `reserve 64 bytes'. For a limited amount of compatibility, since NASM treats ? as a valid character in symbol names, you can code ? equ 0 and then writing dw ? will at least do something vaguely useful. DUP is still not a supported syntax, however.

In addition to all of this, macros and directives work completely differently to MASM. See chapter 4 and chapter 5 for further details.

Next Chapter | Previous Chapter | Contents | Index

No comments:

Post a Comment