8 _if__(_M680X0__ && !_ALL_ARCH__)
9 @setfilename as-m680x0.info
10 _fi__(_M680X0__ && !_ALL_ARCH__)
11 _if__(_AMD29K__ && !_ALL_ARCH__)
12 @setfilename as-29k.info
13 _fi__(_AMD29K__ && !_ALL_ARCH__)
15 @c NOTE: this manual is marked up for preprocessing with a collection
16 @c of m4 macros called "pretex.m4". If you see <_if__> and <_fi__>
17 @c scattered around the source, you have the full source before
18 @c preprocessing; if you don't, you have the source configured for some
19 @c particular architecture (and you can of course get the full source,
20 @c with all configurations, from wherever you got this). The full
21 @c source needs to be run through m4 before either tex- or info-
22 @c formatting: for example,
23 @c m4 pretex.m4 none.m4 m680x0.m4 as.texinfo >as-680x0.texinfo
24 @c will produce (assuming your path finds either GNU or SysV m4;
25 @c Berkeley won't do) a file suitable for formatting.
26 @c See the text in "pretex.m4" for a fuller explanation (and the macro
31 This file documents the GNU Assembler "as".
33 Copyright (C) 1991 Free Software Foundation, Inc.
35 Permission is granted to make and distribute verbatim copies of
36 this manual provided the copyright notice and this permission notice
37 are preserved on all copies.
40 Permission is granted to process this file through TeX and print the
41 results, provided the printed document carries copying permission
42 notice identical to this one except for the removal of this paragraph
43 (this paragraph not being relevant to the printed manual).
46 Permission is granted to copy and distribute modified versions of this
47 manual under the conditions for verbatim copying, provided also that the
48 section entitled ``GNU General Public License'' is included exactly as
49 in the original, and provided that the entire resulting derived work is
50 distributed under the terms of a permission notice identical to this
53 Permission is granted to copy and distribute translations of this manual
54 into another language, under the above conditions for modified versions,
55 except that the section entitled ``GNU General Public License'' may be
56 included in a translation approved by the author instead of in the
63 @setchapternewpage odd
65 @settitle Using GNU as (680x0)
68 @settitle Using GNU as (AMD 29K)
72 @subtitle{The GNU Assembler}
74 @subtitle{for Motorola 680x0}
77 @subtitle{for the AMD 29K family}
80 @subtitle February 1991
82 The Free Software Foundation Inc. thanks The Nice Computer
83 Company of Australia for loaning Dean Elsner to write the
84 first (Vax) version of @code{as} for Project GNU.
85 The proprietors, management and staff of TNCCA thank FSF for
86 distracting the boss while they got some work
89 @author{Dean Elsner, Jay Fenlason & friends}
90 @author{revised by Roland Pesch for Cygnus Support}
94 \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
95 \xdef\manvers{\$Revision$} % For use in headers, footers too
97 \hfill Cygnus Support\par
99 \hfill \TeX{}info \texinfoversion\par
101 %"boxit" macro for figures:
102 %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
103 \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
104 \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
105 #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
106 \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
109 @vskip 0pt plus 1filll
110 Copyright @copyright{} 1991 Free Software Foundation, Inc.
112 Permission is granted to make and distribute verbatim copies of
113 this manual provided the copyright notice and this permission notice
114 are preserved on all copies.
116 Permission is granted to copy and distribute modified versions of this
117 manual under the conditions for verbatim copying, provided also that the
118 section entitled ``GNU General Public License'' is included exactly as
119 in the original, and provided that the entire resulting derived work is
120 distributed under the terms of a permission notice identical to this
123 Permission is granted to copy and distribute translations of this manual
124 into another language, under the above conditions for modified versions,
125 except that the section entitled ``GNU General Public License'' may be
126 included in a translation approved by the author instead of in the
131 @node Top, Overview, (dir), (dir)
134 * Overview:: Overview
136 * Segments:: Segments and Relocation
138 * Expressions:: Expressions
139 * Pseudo Ops:: Assembler Directives
140 * Maintenance:: Maintaining the Assembler
141 * Retargeting:: Teaching the Assembler about a New Machine
142 * License:: GNU GENERAL PUBLIC LICENSE
144 --- The Detailed Node Listing ---
148 * Invoking:: Invoking @code{as}
149 * Manual:: Structure of this Manual
150 * GNU Assembler:: as, the GNU Assembler
151 * Command Line:: Command Line
152 * Input Files:: Input Files
153 * Object:: Output (Object) File
154 * Errors:: Error and Warning Messages
159 * Filenames:: Input Filenames and Line-numbers
163 * Pre-processing:: Pre-processing
164 * Whitespace:: Whitespace
165 * Comments:: Comments
166 * Symbol Intro:: Symbols
167 * Statements:: Statements
168 * Constants:: Constants
172 * Characters:: Character Constants
173 * Numbers:: Number Constants
180 Segments and Relocation
182 * Segs Background:: Background
183 * ld Segments:: ld Segments
184 * as Segments:: as Internal Segments
185 * Sub-Segments:: Sub-Segments
188 Segments and Relocation
190 * ld Segments:: ld Segments
191 * as Segments:: as Internal Segments
192 * Sub-Segments:: Sub-Segments
198 * Setting Symbols:: Giving Symbols Other Values
199 * Symbol Names:: Symbol Names
200 * Dot:: The Special Dot Symbol
201 * Symbol Attributes:: Symbol Attributes
205 * Local Symbols:: Local Symbol Names
209 * Symbol Value:: Value
211 * Symbol Desc:: Descriptor
212 * Symbol Other:: Other
216 * Empty Exprs:: Empty Expressions
217 * Integer Exprs:: Integer Expressions
221 * Arguments:: Arguments
222 * Operators:: Operators
223 * Prefix Ops:: Prefix Operators
224 * Infix Ops:: Infix Operators
228 * Abort:: The Abort directive causes as to abort
229 * Align:: Pad the location counter to a power of 2
230 * App-File:: Set the logical file name
231 * Ascii:: Fill memory with bytes of ASCII characters
232 * Asciz:: Fill memory with bytes of ASCII characters followed
234 * Byte:: Fill memory with 8-bit integers
235 * Comm:: Reserve public space in the BSS segment
236 * Data:: Change to the data segment
237 * Desc:: Set the n_desc of a symbol
238 * Double:: Fill memory with double-precision floating-point numbers
239 * Else:: @code{.else}
241 * Endif:: @code{.endif}
242 * Equ:: @code{.equ @var{symbol}, @var{expression}}
243 * Extern:: @code{.extern}
244 * Fill:: Fill memory with repeated values
245 * Float:: Fill memory with single-precision floating-point numbers
246 * Global:: Make a symbol visible to the linker
247 * Ident:: @code{.ident}
248 * If:: @code{.if @var{absolute expression}}
249 * Include:: @code{.include "@var{file}"}
250 * Int:: Fill memory with 32-bit integers
251 * Lcomm:: Reserve private space in the BSS segment
252 * Line:: Set the logical line number
253 * Ln:: @code{.ln @var{line-number}}
254 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
255 * Long:: Fill memory with 32-bit integers
256 * Lsym:: Create a local symbol
257 * Octa:: Fill memory with 128-bit integers
258 * Org:: Change the location counter
259 * Quad:: Fill memory with 64-bit integers
260 * Set:: Set the value of a symbol
261 * Short:: Fill memory with 16-bit integers
262 * Single:: @code{.single @var{flonums}}
263 * Stab:: Store debugging information
264 * Text:: Change to the text segment
265 * Word:: Fill memory with 32-bit integers
266 * Deprecated:: Deprecated Directives
267 * Machine Options:: Options
268 * Machine Syntax:: Syntax
269 * Floating Point:: Floating Point
270 * Machine Directives:: Machine Directives
275 * block:: @code{.block @var{size} , @var{fill}}
276 * cputype:: @code{.cputype}
277 * file:: @code{.file}
278 * hword:: @code{.hword @var{expressions}}
279 * line:: @code{.line}
280 * reg:: @code{.reg @var{symbol}, @var{expression}}
281 * sect:: @code{.sect}
282 * use:: @code{.use @var{segment name}}
285 @node Overview, Syntax, Top, Top
288 This manual is a user guide to the GNU assembler @code{as}.
290 This version of the manual describes @code{as} configured to generate
291 code for Motorola 680x0 architectures.
294 This version of the manual describes @code{as} configured to generate
295 code for Advanced Micro Devices' 29K architectures.
299 * Invoking:: Invoking @code{as}
300 * Manual:: Structure of this Manual
301 * GNU Assembler:: as, the GNU Assembler
302 * Command Line:: Command Line
303 * Input Files:: Input Files
304 * Object:: Output (Object) File
305 * Errors:: Error and Warning Messages
309 @node Invoking, Manual, Overview, Overview
310 @section Invoking @code{as}
312 Here is a brief summary of how to invoke GNU @code{as}. For details,
315 @c We don't use @deffn and friends for the following because they seem
316 @c to be limited to one line for the header.
318 as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ]
320 [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
323 @c am29k has no machine-dependent assembler options
325 [ -- | @var{files} @dots{} ]
331 This option is accepted only for script compatibility with calls to
332 other assemblers; it has no effect on GNU @code{as}.
335 ``fast''---skip preprocessing (assume source is compiler output)
338 Add @var{path} to the search list for @code{.include} directives
342 This option is accepted but has no effect on the 29K family.
345 Issue warnings when difference tables are altered for long displacements
349 Keep (in symbol table) local symbols, starting with @samp{L}
351 @item -o @var{objfile}
352 Name the object-file output from @code{as}
355 Fold data segment into text segment
358 Suppress warning messages
362 Shorten references to undefined symbols, to one word instead of two
364 @item -mc68000 | -mc68010 | -mc68020
365 Specify what processor in the 68000 family is the target (default 68020)
368 @item -- | @var{files} @dots{}
369 Source files to assemble, or standard input
372 @node Manual, GNU Assembler, Invoking, Overview
373 @section Structure of this Manual
374 This document is intended to describe what you need to know to use GNU
375 @code{as}. We cover the syntax expected in source files, including
376 notation for symbols, constants, and expressions; the directives that
377 @code{as} understands; and of course how to invoke @code{as}.
379 _if__(_M680X0__ && !_ALL_ARCH__)
380 We also cover special features in the 68000 configuration of @code{as},
381 including pseudo-operations.
382 _fi__(_M680X0__ && !_ALL_ARCH__)
383 _if__(_AMD29K__ && !_ALL_ARCH__)
384 We also cover special features in the AMD 29K configuration of @code{as},
385 including assembler directives.
386 _fi__(_AMD29K__ && !_ALL_ARCH__)
389 This document also describes some of the machine-dependent features of
390 various flavors of the assembler.
393 This document also describes how the assembler works internally, and
394 provides some information that may be useful to people attempting to
395 port the assembler to another machine.
398 On the other hand, this manual is @emph{not} intended as an introduction
399 to programming in assembly language---let alone programming in general!
400 In a similar vein, we make no attempt to introduce the machine
401 architecture; we do @emph{not} describe the instruction set, standard
402 mnemonics, registers or addressing modes that are standard to a
403 particular architecture. You may want to consult the manufacturer's
404 machine architecture manual for this information.
409 Throughout this document, we assume that you are running @dfn{GNU},
410 the portable operating system from the @dfn{Free Software
411 Foundation, Inc.}. This restricts our attention to certain kinds of
412 computer (in particular, the kinds of computers that GNU can run on);
413 once this assumption is granted, examples and definitions need less
416 @code{as} is part of a team of programs that turn a high-level
417 human-readable series of instructions into a low-level
418 computer-readable series of instructions. Different versions of
419 @code{as} are used for different kinds of computer.
422 @c There used to be a section "Terminology" here, which defined
423 @c "contents", "byte", "word", and "long". Defining "word" to any
424 @c particular size is confusing when the .word directive may generate 16
425 @c bits on one machine and 32 bits on another; in general, for the user
426 @c version of this manual, none of these terms seem essential to define.
427 @c They were used very little even in the former draft of the manual;
428 @c this draft makes an effort to avoid them (except in names of
431 @node GNU Assembler, Command Line, Manual, Overview
432 @section as, the GNU Assembler
433 @code{as} is primarily intended to assemble the output of the GNU C
434 compiler @code{gcc} for use by the linker @code{ld}. Nevertheless,
435 we've tried to make @code{as} assemble correctly everything that the native
438 Any exceptions are documented explicitly (@pxref{Machine Dependent}).
440 This doesn't mean @code{as} always uses the same syntax as another
441 assembler for the same architecture; for example, we know of several
442 incompatible versions of 680x0 assembly language syntax.
444 GNU @code{as} is really a family of assemblers. If you use (or have
445 used) GNU @code{as} on another architecture, you should find a fairly
446 similar environment. Each version has much in common with the others,
447 including object file formats, most assembler directives (often called
448 @dfn{pseudo-ops}) and assembler syntax.
450 Unlike older assemblers, @code{as} is designed to assemble a source
451 program in one pass of the source file. This has a subtle impact on the
452 @kbd{.org} directive (@pxref{Org}).
454 @node Command Line, Input Files, GNU Assembler, Overview
455 @section Command Line
457 After the program name @code{as}, the command line may contain
458 options and file names. Options may be in any order, and may be
459 before, after, or between file names. The order of file names is
462 @samp{--} (two hyphens) by itself names the standard input file
463 explicitly, as one of the files for @code{as} to assemble.
465 Except for @samp{--}, any command line argument that begins with a
466 hyphen (@samp{-}) is an option. Each option changes the behavior of
467 @code{as}. No option changes the way another option works. An
468 option is a @samp{-} followed by one or more letters; the case of
469 the letter is important. All options are optional.
471 Some options expect exactly one file name to follow them. The file
472 name may either immediately follow the option's letter (compatible
473 with older assemblers) or it may be the next command argument (GNU
474 standard). These two command lines are equivalent:
477 as -o my-object-file.o mumble
478 as -omy-object-file.o mumble
481 @node Input Files, Object, Command Line, Overview
484 We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
485 describe the program input to one run of @code{as}. The program may
486 be in one or more files; how the source is partitioned into files
487 doesn't change the meaning of the source.
489 @c I added "con" prefix to "catenation" just to prove I can overcome my
491 The source program is a concatenation of the text in all the files, in the
494 Each time you run @code{as} it assembles exactly one source
495 program. The source program is made up of one or more files.
496 (The standard input is also a file.)
498 You give @code{as} a command line that has zero or more input file
499 names. The input files are read (from left file name to right). A
500 command line argument (in any position) that has no special meaning
501 is taken to be an input file name.
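
The command-line rules above can be sketched in a few lines of Python.
This is only an illustration, not code taken from @code{as}: the helper
name @code{split_command_line} and the @code{<stdin>} placeholder are
invented here, and treating @code{-I} like @code{-o} (value attached or
in the next argument) is an assumption.

```python
# Hypothetical helper, not part of as: split an argument list into
# options and input file names, following the rules described above.
def split_command_line(args):
    takes_value = ("-o", "-I")  # assumed to accept exactly one argument
    options, inputs = [], []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Two hyphens alone name the standard input file.
            inputs.append("<stdin>")
        elif arg.startswith("-"):
            flag = arg[:2]
            if flag in takes_value:
                if len(arg) > 2:
                    # Older style: value glued to the option letter.
                    options.append((flag, arg[2:]))
                elif i + 1 < len(args):
                    # GNU style: value is the next argument.
                    i += 1
                    options.append((flag, args[i]))
                else:
                    raise ValueError("missing argument for " + flag)
            else:
                options.append((arg, None))
        else:
            # No special meaning: an input file, kept in left-to-right order.
            inputs.append(arg)
        i += 1
    return options, inputs
```

With this sketch, the argument lists for @samp{as -o my-object-file.o mumble}
and @samp{as -omy-object-file.o mumble} yield the same result.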
503 If @code{as} is given no file names it attempts to read one input file
504 from @code{as}'s standard input, which is normally your terminal. You
505 may have to type @key{ctl-D} to tell @code{as} there is no more program
508 Use @samp{--} if you need to explicitly name the standard input file
509 in your command line.
511 If the source is empty, @code{as} will produce a small, empty object
515 * Filenames:: Input Filenames and Line-numbers
518 @node Filenames, , Input Files, Input Files
519 @subsection Input Filenames and Line-numbers
520 There are two ways of locating a line in the input file (or files) and both
521 are used in reporting error messages. One way refers to a line
522 number in a physical file; the other refers to a line number in a
525 @dfn{Physical files} are those files named in the command line given
528 @dfn{Logical files} are simply names declared explicitly by assembler
529 directives; they bear no relation to physical files. Logical file names
530 help error messages reflect the original source file, when @code{as}
531 source is itself synthesized from other files. @xref{App-File}.
533 @node Object, Errors, Input Files, Overview
534 @section Output (Object) File
535 Every time you run @code{as} it produces an output file, which is
536 your assembly language program translated into numbers. This file
537 is the object file, named @file{a.out} unless you tell @code{as} to
538 give it another name by using the @code{-o} option. Conventionally,
539 object file names end with @file{.o}. The default name of
540 @file{a.out} is used for historical reasons: older assemblers were
541 capable of assembling self-contained programs directly into a
543 @c This may still work, but hasn't been tested.
545 The object file is meant for input to the linker @code{ld}. It contains
546 assembled program code, information to help @code{ld} integrate
547 the assembled program into a runnable file, and (optionally) symbolic
548 information for the debugger.
550 @comment link above to some info file(s) like the description of a.out.
551 @comment don't forget to describe GNU info as well as Unix lossage.
553 @node Errors, Options, Object, Overview
554 @section Error and Warning Messages
556 @code{as} may write warnings and error messages to the standard error
557 file (usually your terminal). This should not happen when @code{as} is
558 run automatically by a compiler. Warnings report an assumption made so
559 that @code{as} could keep assembling a flawed program; errors report a
560 grave problem that stops the assembly.
562 Warning messages have the format
564 file_name:@b{NNN}:Warning Message Text
566 @noindent(where @b{NNN} is a line number). If a logical file name has
567 been given (@pxref{App-File}), it is used for the filename; otherwise the
568 name of the current input file is used. If a logical line number was
576 then it is used to calculate the number printed;
577 otherwise the actual line in the current source file is printed. The
578 message text is intended to be self-explanatory (in the grand Unix
581 Error messages have the format
583 file_name:@b{NNN}:FATAL:Error Message Text
585 The file name and line number are derived as for warning
586 messages. The actual message text may be rather less explanatory,
587 because many of these errors aren't supposed to happen.
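
If you post-process assembler diagnostics in a script, the two message
formats above can be told apart roughly as follows. This is a sketch in
Python, not anything @code{as} provides: the function name
@code{parse_message} is invented, and it assumes the file name contains
no colon.

```python
# Hypothetical helper: split one diagnostic line into its parts.
# Assumes the formats shown above and a file name without ':'.
def parse_message(line):
    file_name, number, text = line.split(":", 2)
    fatal = text.startswith("FATAL:")
    if fatal:
        # Errors carry an extra FATAL: field before the message text.
        text = text[len("FATAL:"):]
    return file_name, int(number), fatal, text
```

The third element of the result is true for errors, which stop the
assembly, and false for warnings.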
590 @node Options, , Errors, Overview
592 @subsection @code{-D}
593 This option has no effect whatsoever, but it is accepted to make it more
594 likely that scripts written for other assemblers will also work with
598 @subsection Work Faster: @code{-f}
599 @samp{-f} should only be used when assembling programs written by a
600 (trusted) compiler. @samp{-f} stops the assembler from pre-processing
601 the input file(s) before assembling them.
603 @emph{Warning:} if the files actually need to be pre-processed (if they
604 contain comments, for example), @code{as} will not work correctly if
608 @subsection Add to @code{.include} search path: @code{-I} @var{path}
609 Use this option to add a @var{path} to the list of directories GNU
610 @code{as} will search for files specified in @code{.include} directives
611 (@pxref{Include}). You may use @code{-I} as many times as necessary to
612 include a variety of paths. The current working directory is always
613 searched first; after that, @code{as} searches any @samp{-I} directories
614 in the same order as they were specified (left to right) on the command
617 @subsection Warn if difference tables altered: @code{-k}
619 On the AMD 29K family, this option is allowed, but has no effect. It is
620 permitted for compatibility with GNU @code{as} on other platforms,
621 where it can be used to warn when @code{as} alters the machine code
622 generated for @samp{.word} directives in difference tables. The AMD 29K
623 family does not have the addressing limitations that sometimes lead to this
624 alteration on other platforms.
628 @code{as} sometimes alters the code emitted for directives of the form
629 @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
630 You can use the @samp{-k} option if you want a warning issued when this
634 @subsection Include Local Labels: @code{-L}
635 Labels beginning with @samp{L} (upper case only) are called @dfn{local
636 labels}. @xref{Symbol Names}. Such labels are intended for the use of
637 programs (like compilers) that compose assembler programs, not for your
638 notice, so both @code{as} and @code{ld} normally discard them and you do
639 not see them when debugging.
642 This option tells @code{as} to retain those @samp{L@dots{}} symbols
643 in the object file. Usually if you do this you also tell the linker
644 @code{ld} to preserve symbols whose names begin with @samp{L}.
646 @subsection Name the Object File: @code{-o}
647 There is always one object file output when you run @code{as}. By
648 default it has the name @file{a.out}. You use this option (which
649 takes exactly one filename) to give the object file a different name.
651 Whatever the object file is called, @code{as} will overwrite any
652 existing file of the same name.
654 @subsection Data Segment into Text Segment: @code{-R}
655 @code{-R} tells @code{as} to write the object file as if all
656 data-segment data lives in the text segment. This is only done at
657 the very last moment: your binary data are the same, but data
658 segment parts are relocated differently. The data segment part of
659 your object file is zero bytes long because all its bytes are
660 appended to the text segment. (@xref{Segments}.)
662 When you specify @code{-R} it would be possible to generate shorter
663 address displacements (because we don't have to cross between text and
664 data segment). We don't do this simply for compatibility with older
665 versions of @code{as}. In future, @code{-R} may work this way.
667 @subsection Suppress Warnings: @code{-W}
668 @code{as} should never give a warning or error message when
669 assembling compiler output. But programs written by people often
670 cause @code{as} to give a warning that a particular assumption was
671 made. All such warnings are directed to the standard error file.
672 If you use this option, no warnings are issued. This option only
673 affects the warning messages: it does not change any other detail of how
674 @code{as} assembles your file. Errors, which stop the assembly, are
677 @node Syntax, Segments, Overview, Top
679 This chapter describes the machine-independent syntax allowed in a
680 source file. @code{as} syntax is similar to what many other assemblers
681 use; it is inspired by the BSD 4.2
686 assembler, except that @code{as} does not assemble Vax bit-fields.
690 * Pre-processing:: Pre-processing
691 * Whitespace:: Whitespace
692 * Comments:: Comments
693 * Symbol Intro:: Symbols
694 * Statements:: Statements
695 * Constants:: Constants
698 @node Pre-processing, Whitespace, Syntax, Syntax
699 @section Pre-processing
704 adjusts and removes extra whitespace. It leaves one space or tab before
705 the keywords on a line, and turns any other whitespace on the line into
709 removes all comments, replacing them with a single space, or an
710 appropriate number of newlines.
713 converts character constants into the appropriate numeric values.
716 Excess whitespace, comments, and character constants
717 cannot be used in the portions of the input text that are not
720 If the first line of an input file is @code{#NO_APP} or the @samp{-f}
721 option is given, the input file will not be pre-processed. Within such
722 an input file, parts of the file can be pre-processed by putting a line
723 that says @code{#APP} before the text that should be pre-processed, and
724 putting a line that says @code{#NO_APP} after it. This feature is
725 mainly intended to support @code{asm} statements in compilers whose output
726 normally does not need to be pre-processed.
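
The @code{#APP} and @code{#NO_APP} rules can be pictured with the
following Python sketch. It is an illustration only, not the actual
pre-processor: the name @code{preprocessed_lines} and the @code{fast}
parameter (standing in for the @samp{-f} option) are invented, and the
sketch assumes the markers are honored only in files that start out
without pre-processing, which is the only case the text describes.

```python
# Hypothetical sketch: mark which lines of a file would be pre-processed.
def preprocessed_lines(lines, fast=False):
    # The markers matter only when the file starts out unprocessed.
    toggles = fast or (bool(lines) and lines[0].strip() == "#NO_APP")
    active = not toggles
    result = []
    for line in lines:
        marker = line.strip()
        if toggles and marker == "#APP":
            active = True       # text after this line is pre-processed
        elif toggles and marker == "#NO_APP":
            active = False      # back to raw compiler output
        result.append((line, active))
    return result
```

Each input line is paired with a flag that is true where pre-processing
would apply.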
728 @node Whitespace, Comments, Pre-processing, Syntax
730 @dfn{Whitespace} is one or more blanks or tabs, in any order.
731 Whitespace is used to separate symbols, and to make programs neater
732 for people to read. Unless within character constants
733 (@pxref{Characters}), any whitespace means the same as exactly one
736 @node Comments, Symbol Intro, Whitespace, Syntax
738 There are two ways of rendering comments to @code{as}. In both
739 cases the comment is equivalent to one space.
741 Anything from @samp{/*} through the next @samp{*/} is a comment.
742 This means you may not nest these comments.
746 The only way to include a newline (@samp{\n}) in a comment
747 is to use this sort of comment.
750 /* This sort of comment does not nest. */
753 Anything from the @dfn{line comment} character to the next newline
754 is considered a comment and is ignored. The line comment character is
759 @samp{|} on the 680x0;
762 @samp{;} for the AMD 29K family;
764 @pxref{Machine Dependent}. @refill
767 On some machines there are two different line comment characters. One
768 will only begin a comment if it is the first non-whitespace character on
769 a line, while the other will always begin a comment.
772 To be compatible with past assemblers a special interpretation is
773 given to lines that begin with @samp{#}. Following the @samp{#} an
774 absolute expression (@pxref{Expressions}) is expected: this will be
775 the logical line number of the @b{next} line. Then a string
776 (@pxref{Strings}) is allowed: if present it is a new logical file
777 name. The rest of the line, if any, should be whitespace.
779 If the first non-whitespace characters on the line are not numeric,
780 the line is ignored. (Just like a comment.)
782 # This is an ordinary comment.
783 # 42-6 "new_file_name" # New logical file name
784 # This is logical line # 36.
786 This feature is deprecated, and may disappear from future versions
789 @node Symbol Intro, Statements, Comments, Syntax
791 A @dfn{symbol} is one or more characters chosen from the set of all
792 letters (both upper and lower case), digits and the three characters
793 @samp{_.$}. No symbol may begin with a digit. Case is significant.
794 There is no length limit: all characters are significant. Symbols are
795 delimited by characters not in that set, or by the beginning of a file
796 (since the source program must end with a newline, the end of a file is
797 not a possible symbol delimiter). @xref{Symbols}.
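
The character rules for symbols can be written as a regular
expression. The Python sketch below is only an illustration: the names
@code{SYMBOL} and @code{is_symbol} are invented, and it assumes
``letters'' means the ASCII letters.

```python
import re

# One or more characters from letters, digits, and the three characters
# underscore, dot and dollar; the first character may not be a digit.
SYMBOL = re.compile(r"[A-Za-z_.$][A-Za-z0-9_.$]*")

def is_symbol(text):
    # fullmatch: every character must belong to the set; no length limit.
    return SYMBOL.fullmatch(text) is not None
```

Because case is significant, two names that differ only in case are
two different symbols.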
799 @node Statements, Constants, Symbol Intro, Syntax
801 A @dfn{statement} ends at a newline character (@samp{\n})
803 or at a semicolon (@samp{;}). The newline or semicolon
806 or an ``at'' sign (@samp{@@}). The newline or at sign
809 of the preceding statement. Newlines
817 character constants are an exception: they don't end statements.
818 It is an error to end any statement with end-of-file: the last
819 character of any input file should be a newline.@refill
821 You may write a statement on more than one line if you put a
822 backslash (@kbd{\}) immediately in front of any newlines within the
823 statement. When @code{as} reads a backslashed newline both
824 characters are ignored. You can even put backslashed newlines in
825 the middle of symbol names without changing the meaning of your
828 An empty statement is allowed, and may include whitespace. It is ignored.
830 @c "key symbol" is not used elsewhere in the document; seems pedantic to
833 A statement begins with zero or more labels, optionally followed by a
834 key symbol which determines what kind of statement it is. The key
835 symbol determines the syntax of the rest of the statement. If the
836 symbol begins with a dot @samp{.} then the statement is an assembler
837 directive: typically valid for any computer. If the symbol begins with
838 a letter the statement is an assembly language @dfn{instruction}: it
839 will assemble into a machine language instruction. Different versions
840 of @code{as} for different computers will recognize different
841 instructions. In fact, the same symbol may represent a different
842 instruction in a different computer's assembly language.
844 A label is a symbol immediately followed by a colon (@code{:}).
845 Whitespace before a label or after a colon is permitted, but you may not
846 have whitespace between a label's symbol and its colon. @xref{Labels}.
849 label: .directive followed by something
850 another$label: # This is an empty statement.
851 instruction operand_1, operand_2, @dots{}
854 @node Constants, , Statements, Syntax
856 A constant is a number, written so that its value is known by
857 inspection, without knowing any context. Like this:
859 .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
860 .ascii "Ring the bell\7" # A string constant.
861 .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum.
862 .float 0f-314159265358979323846264338327\
863 95028841971.693993751E-40 # - pi, a flonum.
867 * Characters:: Character Constants
868 * Numbers:: Number Constants
871 @node Characters, Numbers, Constants, Constants
872 @subsection Character Constants
873 There are two kinds of character constants. A @dfn{character} stands
874 for one character in one byte and its value may be used in
875 numeric expressions. String constants (properly called string
876 @emph{literals}) are potentially many bytes and their values may not be
877 used in arithmetic expressions.
884 @node Strings, Chars, Characters, Characters
885 @subsubsection Strings
886 A @dfn{string} is written between double-quotes. It may contain
887 double-quotes or null characters. The way to get special characters
888 into a string is to @dfn{escape} these characters: precede them with
889 a backslash @samp{\} character. For example @samp{\\} represents
890 one backslash: the first @code{\} is an escape which tells
891 @code{as} to interpret the second character literally as a backslash
892 (which prevents @code{as} from recognizing the second @code{\} as an
893 escape character). The complete list of escapes follows.
897 @c Mnemonic for ACKnowledge; for ASCII this is octal code 007.
899 Mnemonic for backspace; for ASCII this is octal code 010.
901 @c Mnemonic for EOText; for ASCII this is octal code 004.
903 Mnemonic for FormFeed; for ASCII this is octal code 014.
905 Mnemonic for newline; for ASCII this is octal code 012.
907 @c Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
909 Mnemonic for carriage-Return; for ASCII this is octal code 015.
911 @c Mnemonic for space; for ASCII this is octal code 040. Included for compliance with
914 Mnemonic for horizontal Tab; for ASCII this is octal code 011.
916 @c Mnemonic for Vertical tab; for ASCII this is octal code 013.
917 @c @item \x @var{digit} @var{digit} @var{digit}
918 @c A hexadecimal character code. The numeric code is 3 hexadecimal digits.
919 @item \ @var{digit} @var{digit} @var{digit}
920 An octal character code. The numeric code is 3 octal digits.
921 For compatibility with other Unix systems, 8 and 9 are accepted as digits:
922 for example, @code{\008} has the value 010, and @code{\009} the value 011.
924 Represents one @samp{\} character.
926 @c Represents one @samp{'} (accent acute) character.
927 @c This is needed in single character literals
928 @c (@xref{Characters}.) to represent
931 Represents one @samp{"} character. Needed in strings to represent
932 this character, because an unescaped @samp{"} would end the string.
933 @item \ @var{anything-else}
934 Any other character when escaped by @kbd{\} will give a warning, but
935 assemble as if the @samp{\} was not present. The idea is that if
936 you used an escape sequence you clearly didn't want the literal
937 interpretation of the following character. However @code{as} has no
938 other interpretation, so @code{as} knows it is giving you the wrong
939 code and warns you of the fact.
942 Which characters are escapable, and what those escapes represent,
943 varies widely among assemblers. The current set is what we think
944 BSD 4.2 @code{as} recognizes, and is a subset of what most C
945 compilers recognize. If you are in doubt, don't use an escape sequence.
948 @node Chars, , Strings, Characters
949 @subsubsection Characters
950 A single character may be written as a single quote immediately
951 followed by that character. The same escapes apply to characters as
952 to strings. So if you want to write the character backslash, you
953 must write @kbd{'\\} where the first @code{\} escapes the second
954 @code{\}. As you can see, the quote is an acute accent, not a
955 grave accent. A newline
957 (or semicolon @samp{;})
960 (or at sign @samp{@@})
963 following an acute accent is taken as a literal character and does
964 not count as the end of a statement. The value of a character
965 constant in a numeric expression is the machine's byte-wide code for
966 that character. @code{as} assumes your character code is ASCII: @kbd{'A}
967 means 65, @kbd{'B} means 66, and so on. @refill
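
As a brief illustration of the string and character constants described
above, the following sketch uses the @code{.ascii} and @code{.byte}
directives. The text in the strings is arbitrary; the comments assume
that @samp{#} begins a comment and that the character code is ASCII.

@example
        .ascii "one\ttwo\n"     # 8 bytes: o n e TAB t w o NEWLINE
        .ascii "say \"hi\""     # 8 bytes; the inner quotes are escaped
        .ascii "\101\\"         # 2 bytes: octal 101 (the letter A), then one backslash
        .byte 'A                # the value 65
        .byte 'A + 1            # a character value used in an expression: 66
        .byte '\\               # the code for one backslash
@end example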
969 @node Numbers, , Characters, Constants
970 @subsection Number Constants
971 @code{as} distinguishes three kinds of numbers according to how they
972 are stored in the target machine. @emph{Integers} are numbers that
973 would fit into an @code{int} in the C language. @emph{Bignums} are
974 integers, but they are stored in more than 32 bits. @emph{Flonums}
975 are floating point numbers, described below.
977 @subsubsection Integers
978 A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
979 the binary digits @samp{01}.
981 An octal integer is @samp{0} followed by zero or more of the octal
982 digits (@samp{01234567}).
984 A decimal integer starts with a non-zero digit followed by zero or
985 more digits (@samp{0123456789}).
987 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
988 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
990 Integers have the usual values. To denote a negative integer, use
991 the prefix operator @samp{-} discussed under expressions
992 (@pxref{Prefix Ops}).
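
For example, the first four statements below each assemble the value
eleven, written in a different radix; the last uses the prefix operator
to produce a negative value. This is only a sketch: it assumes that the
@code{.long} directive is available and that @samp{#} begins a comment.

@example
        .long 0b1011    # binary
        .long 013       # octal
        .long 11        # decimal
        .long 0xB       # hexadecimal
        .long -11       # negative eleven
@end example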
994 @subsubsection Bignums
995 A @dfn{bignum} has the same syntax and semantics as an integer
996 except that the number (or its negative) takes more than 32 bits to
997 represent in binary. The distinction is made because in some places
998 integers are permitted while bignums are not.
1000 @subsubsection Flonums
1001 A @dfn{flonum} represents a floating point number. The translation is
1002 complex: a decimal floating point number from the text is converted by
1003 @code{as} to a generic binary floating point number of more than
1004 sufficient precision. This generic floating point number is converted
1005 to a particular computer's floating point format (or formats) by a
1006 portion of @code{as} specialized to that computer.
1008 A flonum is written by writing (in order)
1014 One of the letters @samp{DFPRSX} (in upper or lower case), to tell
1015 @code{as} the rest of the number is a flonum.
1018 A letter, to tell @code{as} the rest of the number is a flonum. @kbd{e}
1019 is recommended. Case is not important. (Any otherwise illegal letter
1020 will work here, but that might be changed. The Vax BSD 4.2 assembler seems
1021 to allow any of @samp{defghDEFGH}.)
1024 An optional sign: either @samp{+} or @samp{-}.
1026 An optional @dfn{integer part}: zero or more decimal digits.
1028 An optional @dfn{fraction part}: @samp{.} followed by zero
1029 or more decimal digits.
1031 An optional exponent, consisting of:
1035 An @samp{E} or @samp{e}.
1037 A letter; the exact significance varies according to
1038 the computer that executes the program. @code{as}
1039 accepts any letter for now. Case is not important.
1042 Optional sign: either @samp{+} or @samp{-}.
1044 One or more decimal digits.
1048 At least one of @var{integer part} or @var{fraction part} must be
1049 present. The floating point number has the usual base-10 value.
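
A few flonums written according to these rules are sketched below. The
sketch assumes that a flonum begins with the digit @samp{0}, that the
letter @kbd{e} is accepted in your configuration (substitute one of the
letters listed above if it is not), and that the @code{.float} and
@code{.double} directives are available.

@example
        .float  0e3.14159       # integer part and fraction part
        .double 0e-2.5e+3       # sign, both parts, and an exponent: -2500
        .float  0e.5            # fraction part only
        .float  0e10            # integer part only
@end example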
1051 @code{as} does all processing using integers. Flonums are computed
1052 independently of any floating point hardware in the computer running @code{as}.
1055 @node Segments, Symbols, Syntax, Top
1056 @chapter Segments and Relocation
1058 * Segs Background:: Background
1059 * ld Segments:: ld Segments
1060 * as Segments:: as Internal Segments
1061 * Sub-Segments:: Sub-Segments
1065 @node Segs Background, ld Segments, Segments, Segments
1067 Roughly, a segment is a range of addresses, with no gaps; all data
1068 ``in'' those addresses is treated the same for some particular purpose.
1069 For example there may be a ``read only'' segment.
1071 The linker @code{ld} reads many object files (partial programs) and
1072 combines their contents to form a runnable program. When @code{as}
1073 emits an object file, the partial program is assumed to start at address
1074 0. @code{ld} will assign the final addresses the partial program
1075 occupies, so that different partial programs don't overlap. This is
1076 actually an over-simplification, but it will suffice to explain how
1077 @code{as} uses segments.
1079 @code{ld} moves blocks of bytes of your program to their run-time
1080 addresses. These blocks slide to their run-time addresses as rigid
1081 units; their length does not change and neither does the order of bytes
1082 within them. Such a rigid unit is called a @emph{segment}. Assigning
1083 run-time addresses to segments is called @dfn{relocation}. It includes
1084 the task of adjusting mentions of object-file addresses so they refer to
1085 the proper run-time addresses.
1087 An object file written by @code{as} has three segments, any of which may
1088 be empty. These are named @dfn{text}, @dfn{data} and @dfn{bss}
1089 segments. Within the object file, the text segment starts at
1090 address @code{0}, the data segment follows, and the bss segment
1091 follows the data segment.
1093 To let @code{ld} know which data will change when the segments are
1094 relocated, and how to change that data, @code{as} also writes to the
1095 object file details of the relocation needed. To perform relocation
1096 @code{ld} must know, each time an address in the object
1100 Where in the object file is the beginning of this reference to
1103 How long (in bytes) is this reference?
1105 Which segment does the address refer to? What is the numeric value of
1107 (@var{address}) @minus{} (@var{start-address of segment})?
1110 Is the reference to an address ``Program-Counter relative''?
1113 In fact, every address @code{as} ever uses is expressed as
1114 @code{(@var{segment}) + (@var{offset into segment})}. Further, every
1115 expression @code{as} computes is of this segmented nature.
1116 @dfn{Absolute expression} means an expression with segment ``absolute''
1117 (@pxref{ld Segments}). A @dfn{pass1 expression} means an expression
1118 with segment ``pass1'' (@pxref{as Segments}). In this manual we use the
1119 notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
1122 Apart from text, data and bss segments you need to know about the
1123 @dfn{absolute} segment. When @code{ld} mixes partial programs,
1124 addresses in the absolute segment remain unchanged. That is, address
1125 @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
1126 Although two partial programs' data segments will not overlap addresses
1127 after linking, @emph{by definition} their absolute segments will overlap.
1128 Address @code{@{absolute@ 239@}} in one partial program will always be the same
1129 address when the program is running as address @code{@{absolute@ 239@}} in any
1130 other partial program.
1132 The idea of segments is extended to the @dfn{undefined} segment. Any
1133 address whose segment is unknown at assembly time is by definition
1134 rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
1135 Since numbers are always defined, the only way to generate an undefined
1136 address is to mention an undefined symbol. A reference to a named
1137 common block would be such a symbol: its value is unknown at assembly
1138 time so it has segment @emph{undefined}.
1140 By analogy the word @emph{segment} is used to describe groups of segments in
1141 the linked program. @code{ld} puts all partial programs' text
1142 segments in contiguous addresses in the linked program. It is
1143 customary to refer to the @emph{text segment} of a program, meaning all
1144 the addresses of all partial programs' text segments. Likewise for
1145 data and bss segments.
1147 Some segments are manipulated by @code{ld}; others are invented for
1148 use of @code{as} and have no meaning except during assembly.
1151 * ld Segments:: ld Segments
1152 * as Segments:: as Internal Segments
1153 * Sub-Segments:: Sub-Segments
1157 @node ld Segments, as Segments, Segs Background, Segments
1158 @section ld Segments
1159 @code{ld} deals with just five kinds of segments, summarized below.
1165 These segments hold your program. @code{as} and @code{ld} treat them as
1166 separate but equal segments. Anything you can say of one segment is
1167 true of the other. When the program is running, however, it is
1168 customary for the text segment to be unalterable. The
1169 text segment is often shared among processes: it will contain
1170 instructions, constants and the like. The data segment of a running
1171 program is usually alterable: for example, C variables would be stored
1172 in the data segment.
1175 This segment contains zeroed bytes when your program begins running. It
1176 is used to hold uninitialized variables or common storage. The length of
1177 each partial program's bss segment is important, but because it starts
1178 out containing zeroed bytes there is no need to store explicit zero
1179 bytes in the object file. The bss segment was invented to eliminate
1180 those explicit zeros from object files.
1182 @item absolute segment
1183 Address 0 of this segment is always ``relocated'' to run-time address 0.
1184 This is useful if you want to refer to an address that @code{ld} must
1185 not change when relocating. In this sense we speak of absolute
1186 addresses being ``unrelocatable'': they don't change during relocation.
1188 @item @code{undefined} segment
1189 This ``segment'' is a catch-all for address references to objects not in
1190 the preceding segments.
1191 @c FIXME: ref to some other doc on obj-file formats could go here.
1195 An idealized example of the 3 relocatable segments follows. Memory
1196 addresses are on the horizontal axis.
1201 partial program # 1: |ttttt|dddd|00|
1208 partial program # 2: |TTT|DDD|000|
1211 +--+---+-----+--+----+---+-----+~~
1212 linked program: | |TTT|ttttt| |dddd|DDD|00000|
1213 +--+---+-----+--+----+---+-----+~~
1215 addresses: 0 @dots{}
1219 \halign{\hfil\rm #\quad&#\cr
1221 &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1222 Partial program \#1:
1223 &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
1225 &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1226 Partial program \#2:
1227 &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDDD}\boxit{1cm}{\tt 000}\cr
1229 &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
1231 &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
1232 ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
1233 DDDD}\boxit{2cm}{00000}\ \dots\cr
1239 @node as Segments, Sub-Segments, ld Segments, Segments
1240 @section as Internal Segments
1241 These segments are invented for the internal use of @code{as}. They
1242 have no meaning at run-time. You don't need to know about these
1243 segments except that they might be mentioned in @code{as}' warning
1244 messages. These segments are invented to permit the value of every
1245 expression in your assembly language program to be a segmented
1249 @item absent segment
1250 An expression was expected and none was
1254 An internal assembler logic error has been
1255 found. This means there is a bug in the assembler.
1258 A @dfn{grand number} is a bignum or a flonum, but not an integer. If a
1259 number can't be written as a C @code{int} constant, it is a grand
1260 number. @code{as} has to remember that a flonum or a bignum does not
1261 fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
1262 expression: this is done by making a flonum or bignum be in segment
1263 grand. This is purely for internal @code{as} convenience; grand
1264 segment behaves similarly to absolute segment.
1267 The expression was impossible to evaluate in the first pass. The
1268 assembler will attempt a second pass (second reading of the source) to
1269 evaluate the expression. Your expression mentioned an undefined symbol
1270 in a way that defies the one-pass (segment + offset in segment) assembly
1271 process. No compiler need emit such an expression.
1274 @emph{Warning:} the second pass is currently not implemented. @code{as}
1275 will abort with an error message if one is required.
1278 @item difference segment
1279 As an assist to the C compiler, expressions of the forms
1281 (@var{undefined symbol}) @minus{} (@var{expression})
1282 (@var{something}) @minus{} (@var{undefined symbol})
1283 (@var{undefined symbol}) @minus{} (@var{undefined symbol})
1285 are permitted, and belong to the difference segment. @code{as}
1286 re-evaluates such expressions after the source file has been read and
1287 the symbol table built. If by that time there are no undefined symbols
1288 in the expression then the expression assumes a new segment. The
1289 intention is to permit statements like
1290 @samp{.word label - base_of_table}
1291 to be assembled in one pass where both @code{label} and
1292 @code{base_of_table} are undefined. This is useful for compiling C and
1293 Algol switch statements, Pascal case statements, FORTRAN computed goto
1294 statements and the like.
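
A compiler might lay out such a table along the following lines. This
is a sketch only: the names @code{case_a} and @code{case_b} are made up
for illustration, and @samp{#} is assumed to begin a comment. When each
@code{.word} statement is read, the case label it mentions is still
undefined, so the expression belongs to the difference segment until
the whole source file has been read.

@example
base_of_table:
        .word case_a - base_of_table
        .word case_b - base_of_table
        # ... code that selects an entry from the table ...
case_a:
        # ... code for the first case ...
case_b:
        # ... code for the second case ...
@end example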
1297 @node Sub-Segments, bss, as Segments, Segments
1298 @section Sub-Segments
1299 Assembled bytes fall into two segments: text and data.
1300 Because you may have groups of text or data that you want to end up near
1301 to each other in the object file, @code{as} allows you to use
1302 @dfn{subsegments}. Within each segment, there can be numbered
1303 subsegments with values from 0 to 8192. Objects assembled into the same
1304 subsegment will be grouped with other objects in the same subsegment
1305 when they are all put into the object file. For example, a compiler
1306 might want to store constants in the text segment, but might not want to
1307 have them interspersed with the program being assembled. In this case,
1308 the compiler could issue a @code{.text 0} before each section of code
1309 being output, and a @code{.text 1} before each group of constants being output.
1312 Subsegments are optional. If you don't use subsegments, everything
1313 will be stored in subsegment number zero.
1316 Each subsegment is zero-padded up to a multiple of four bytes.
1317 (Subsegments may be padded a different amount on different flavors
1321 On the AMD 29K family, no particular padding is added to segment sizes;
1322 GNU as forces no alignment on this platform.
1324 Subsegments appear in your object file in numeric order, lowest numbered
1325 to highest. (All this to be compatible with other people's assemblers.)
1326 The object file contains no representation of subsegments; @code{ld} and
1327 other programs that manipulate object files will see no trace of them.
1328 They just see all your text subsegments as a text segment, and all your
1329 data subsegments as a data segment.
1331 To specify which subsegment you want subsequent statements assembled
1332 into, use a @samp{.text @var{expression}} or a @samp{.data
1333 @var{expression}} statement. @var{Expression} should be an absolute
1334 expression. (@xref{Expressions}.) If you just say @samp{.text}
1335 then @samp{.text 0} is assumed. Likewise @samp{.data} means
1336 @samp{.data 0}. Assembly begins in @code{text 0}.
1339 .text 0 # The default subsegment is text 0 anyway.
1340 .ascii "This lives in the first text subsegment. *"
1341 .text 1
1342 .ascii "But this lives in the second text subsegment."
1343 .data 0
1344 .ascii "This lives in the data segment,"
1345 .ascii "in the first data subsegment."
1346 .text 0
1347 .ascii "This lives in the first text segment,"
1348 .ascii "immediately following the asterisk (*)."
1351 Each segment has a @dfn{location counter} incremented by one for every
1352 byte assembled into that segment. Because subsegments are merely a
1353 convenience restricted to @code{as} there is no concept of a subsegment
1354 location counter. There is no way to directly manipulate a location
1355 counter---but the @code{.align} directive will change it, and any label
1356 definition will capture its current value. The location counter of the
1357 segment that statements are being assembled into is said to be the
1358 @dfn{active} location counter.
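
The fragment below sketches how labels capture the active location
counter as bytes are assembled. The label names are arbitrary, and the
amount of padding that @code{.align} inserts depends on its argument
(@pxref{Align}).

@example
        .text
first:  .byte 1, 2, 3   # first labels the location of the byte 1
second: .byte 4         # second is three bytes beyond first
        .align 2        # advance the location counter to a boundary
third:  .long first     # third labels the location after any padding
@end example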
1360 @node bss, , Sub-Segments, Segments
1361 @section bss Segment
1362 The bss segment is used for local common variable storage.
1363 You may allocate address space in the bss segment, but you may
1364 not dictate data to load into it before your program executes. When
1365 your program starts running, all the contents of the bss
1366 segment are zeroed bytes.
1368 Addresses in the bss segment are allocated with special directives;
1369 you may not assemble anything directly into the bss segment. Hence
1370 there are no bss subsegments. @xref{Comm}; @pxref{Lcomm}.
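
For instance, space might be reserved as sketched below. The symbol
names and lengths are arbitrary; the directives are assumed to take a
symbol followed by a length in bytes, as described under their own
headings.

@example
        .lcomm  scratch, 256    # 256 bytes of private storage in bss
        .comm   shared, 64      # 64 bytes of common storage
@end example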
1372 @node Symbols, Expressions, Segments, Top
1374 Symbols are a central concept: the programmer uses symbols to name
1375 things, the linker uses symbols to link, and the debugger uses symbols
1379 @emph{Warning:} @code{as} does not place symbols in the object file in
1380 the same order they were declared. This may break some debuggers.
1385 * Setting Symbols:: Giving Symbols Other Values
1386 * Symbol Names:: Symbol Names
1387 * Dot:: The Special Dot Symbol
1388 * Symbol Attributes:: Symbol Attributes
1391 @node Labels, Setting Symbols, Symbols, Symbols
1393 A @dfn{label} is written as a symbol immediately followed by a colon
1394 @samp{:}. The symbol then represents the current value of the
1395 active location counter, and is, for example, a suitable instruction
1396 operand. You are warned if you use the same symbol to represent two
1397 different locations: the first definition overrides any other definitions.
1400 @node Setting Symbols, Symbol Names, Labels, Symbols
1401 @section Giving Symbols Other Values
1402 A symbol can be given an arbitrary value by writing a symbol, followed
1403 by an equals sign @samp{=}, followed by an expression
1404 (@pxref{Expressions}). This is equivalent to using the @code{.set}
1405 directive. @xref{Set}.
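
As a small sketch, either one of the two statements below gives the
symbol @code{bufsize} (a name chosen only for illustration) the
absolute value 512.

@example
bufsize = 512
        .set bufsize, 512
@end example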
1407 @node Symbol Names, Dot, Setting Symbols, Symbols
1408 @section Symbol Names
1409 Symbol names begin with a letter or with one of @samp{$._}. That
1410 character may be followed by any string of digits, letters,
1411 underscores and dollar signs. Case of letters is significant:
1412 @code{foo} is a different symbol name than @code{Foo}.
1415 For the AMD 29K family, @samp{?} is also allowed in the
1416 body of a symbol name, though not at its beginning.
1419 Each symbol has exactly one name. Each name in an assembly language
1420 program refers to exactly one symbol. You may use that symbol name any
1421 number of times in a program.
1424 * Local Symbols:: Local Symbol Names
1427 @node Local Symbols, , Symbol Names, Symbol Names
1428 @subsection Local Symbol Names
1430 Local symbols help compilers and programmers use names temporarily.
1431 There are ten local symbol names, which are re-used throughout the
1432 program. You may refer to them using the names @samp{0} @samp{1}
1433 @dots{} @samp{9}. To define a local symbol, write a label of the form
1434 @samp{@b{N}:} (where @b{N} represents any digit). To refer to the most
1435 recent previous definition of that symbol write @samp{@b{N}b}, using the
1436 same digit as when you defined the label. To refer to the next
1437 definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
1438 a choice of 10 forward references. The @samp{b} stands for
1439 ``backwards'' and the @samp{f} stands for ``forwards''.
1441 Local symbols are not emitted by the current GNU C compiler.
1443 There is no restriction on how you can use these labels, but
1444 remember that at any point in the assembly you can refer to at most
1445 10 prior local labels and to at most 10 forward local labels.
1447 Local symbol names are only a notation device. They are immediately
1448 transformed into more conventional symbol names before the assembler
1449 uses them. The symbol names stored in the symbol table, appearing in
1450 error messages and optionally emitted to the object file have these
1455 All local labels begin with @samp{L}. Normally both @code{as} and
1456 @code{ld} forget symbols that start with @samp{L}. These labels are
1457 used for symbols you are never intended to see. If you give the
1458 @samp{-L} option then @code{as} will retain these symbols in the
1459 object file. If you also instruct @code{ld} to retain these symbols,
1460 you may use them in debugging.
1463 If the label is written @samp{0:} then the digit is @samp{0}.
1464 If the label is written @samp{1:} then the digit is @samp{1}.
1465 And so on up through @samp{9:}.
1468 This unusual character is included so you don't accidentally invent
1469 a symbol of the same name. The character has ASCII value
1472 @item @emph{ordinal number}
1473 This is a serial number to keep the labels distinct. The first
1474 @samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
1475 number @samp{15}; @emph{etc.} Likewise for the other labels @samp{1:} through @samp{9:}.
1479 For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th
1480 @code{3:} is named @code{L3@ctrl{A}44}.
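
The following fragment sketches how backward and forward references
resolve when the same local label is defined twice. It assumes only
the @code{.long} directive and that @samp{#} begins a comment.

@example
1:
        .long 1f        # the second 1: below
        .long 1b        # the first 1: above
1:
        .long 1b        # the second 1:, now the most recent definition
@end example

Following the naming scheme described above, the first @samp{1:} here
would be recorded as @code{L1@ctrl{A}1} and the second as
@code{L1@ctrl{A}2}, provided no earlier @samp{1:} appears in the file.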
1482 @node Dot, Symbol Attributes, Symbol Names, Symbols
1483 @section The Special Dot Symbol
1485 The special symbol @samp{.} refers to the current address that
1486 @code{as} is assembling into. Thus, the expression @samp{melvin:
1487 .long .} will cause @code{melvin} to contain its own address.
1488 Assigning a value to @code{.} is treated the same as a @code{.org}
1489 directive. Thus, the expression @samp{.=.+4} is the same as saying
1497 @node Symbol Attributes, , Dot, Symbols
1498 @section Symbol Attributes
1499 Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
1501 The detailed definitions are in _0__<a.out.h>_1__.
1504 If you use a symbol without defining it, @code{as} assumes zero for
1505 all these attributes, and probably won't warn you. This makes the
1506 symbol an externally defined symbol, which is generally what you
1510 * Symbol Value:: Value
1511 * Symbol Type:: Type
1512 * Symbol Desc:: Descriptor
1513 * Symbol Other:: Other
1516 @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
1518 The value of a symbol is (usually) 32 bits, the size of one GNU C
1519 @code{int}. For a symbol which labels a location in the
1520 text, data, bss or absolute segments the
1521 value is the number of addresses from the start of that segment to
1522 the label. Naturally for text, data and bss
1523 segments the value of a symbol changes as @code{ld} changes segment
1524 base addresses during linking. Absolute symbols' values do
1525 not change during linking: that is why they are called absolute.
1527 The value of an undefined symbol is treated in a special way. If it is
1528 0 then the symbol is not defined in this assembler source program, and
1529 @code{ld} will try to determine its value from other programs it is
1530 linked with. You make this kind of symbol simply by mentioning a symbol
1531 name without defining it. A non-zero value represents a @code{.comm}
1532 common declaration. The value is how much common storage to reserve, in
1533 bytes (addresses). The symbol refers to the first address of the
1536 @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
1538 The type attribute of a symbol is 8 bits encoded in a devious way.
1539 We kept this coding standard for compatibility with older operating
1545 7 6 5 4 3 2 1 0 bit numbers
1546 +-----+-----+-----+-----+-----+-----+-----+-----+
1548 | N_STAB bits | N_TYPE bits |N_EXT|
1550 +-----+-----+-----+-----+-----+-----+-----+-----+
1558 \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1.1cm}{0}&bit numbers\cr
1559 \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
1560 bits}\boxit{1.1cm}{\tt N\_EXT}\cr
1561 \hfill {\bf Type} byte\hfill\cr
1565 @subsubsection @code{N_EXT} bit
1566 This bit is set if @code{ld} might need to use the symbol's type bits
1567 and value. If this bit is off, then @code{ld} can ignore the
1568 symbol while linking. It is set in two cases. If the symbol is
1569 undefined, then @code{ld} is expected to find the symbol's value
1570 elsewhere in another program module. Otherwise the symbol has the
1571 value given, but this symbol name and value are revealed to any other
1572 programs linked in the same executable program. This second use of
1573 the @code{N_EXT} bit is most often made by a @code{.globl} statement.
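
Both cases appear in the sketch below; the symbol names are made up for
illustration. @code{entry} is defined here and revealed to other
programs by @code{.globl}, while @code{helper} is never defined in this
file, so @code{ld} is expected to find its value elsewhere.

@example
        .globl  entry
entry:
        .long   helper
@end example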
1575 @subsubsection @code{N_TYPE} bits
1576 These establish the symbol's ``type'', which is mainly a relocation
1577 concept. Common values are detailed in the manual describing the
1578 executable file format.
1580 @subsubsection @code{N_STAB} bits
1581 Common values for these bits are described in the manual on the
1582 executable file format.
1584 @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
1585 @subsection Descriptor
1586 This is an arbitrary 16-bit value. You may establish a symbol's
1587 descriptor value by using a @code{.desc} statement (@pxref{Desc}).
1588 A descriptor value means nothing to @code{as}.
1590 @node Symbol Other, , Symbol Desc, Symbol Attributes
1592 This is an arbitrary 8-bit value. It means nothing to @code{as}.
1594 @node Expressions, Pseudo Ops, Symbols, Top
1595 @chapter Expressions
1596 An @dfn{expression} specifies an address or numeric value.
1597 Whitespace may precede and/or follow an expression.
1600 * Empty Exprs:: Empty Expressions
1601 * Integer Exprs:: Integer Expressions
1604 @node Empty Exprs, Integer Exprs, Expressions, Expressions
1605 @section Empty Expressions
1606 An empty expression has no value: it is just whitespace or null.
1607 Wherever an absolute expression is required, you may omit the
1608 expression and @code{as} will assume a value of (absolute) 0. This
1609 is compatible with other assemblers.
1611 @node Integer Exprs, , Empty Exprs, Expressions
1612 @section Integer Expressions
1613 An @dfn{integer expression} is one or more @emph{arguments} delimited
1614 by @emph{operators}.
1617 * Arguments:: Arguments
1618 * Operators:: Operators
1619 * Prefix Ops:: Prefix Operators
1620 * Infix Ops:: Infix Operators
1623 @node Arguments, Operators, Integer Exprs, Integer Exprs
1624 @subsection Arguments
1626 @dfn{Arguments} are symbols, numbers or subexpressions. In other
1627 contexts arguments are sometimes called ``arithmetic operands''. In
1628 this manual, to avoid confusing them with the ``instruction operands'' of
1629 the machine language, we use the term ``argument'' to refer to parts of
1630 expressions only, reserving the word ``operand'' to refer only to machine
1631 instruction operands.
1633 Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1634 @var{segment} is one of text, data, bss, absolute,
1635 or @code{undefined}. @var{NNN} is a signed, 2's complement 32 bit
1638 Numbers are usually integers.
1640 A number can be a flonum or bignum. In this case, you are warned
1641 that only the low order 32 bits are used, and @code{as} pretends
1642 these 32 bits are an integer. You may write integer-manipulating
1643 instructions that act on exotic constants, compatible with other
1646 Subexpressions are a left parenthesis @samp{(} followed by an integer
1647 expression, followed by a right parenthesis @samp{)}; or a prefix
1648 operator followed by an argument.
1650 @node Operators, Prefix Ops, Arguments, Integer Exprs
1651 @subsection Operators
1652 @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}. Prefix
1653 operators are followed by an argument. Infix operators appear
1654 between their arguments. Operators may be preceded and/or followed by
1657 @node Prefix Ops, Infix Ops, Operators, Integer Exprs
1658 @subsection Prefix Operators
1659 @code{as} has the following @dfn{prefix operators}. They each take
1660 one argument, which must be absolute.
1663 @dfn{Negation}. Two's complement negation.
1665 @dfn{Complementation}. Bitwise not.
1668 @node Infix Ops, , Prefix Ops, Integer Exprs
1669 @subsection Infix Operators
1671 @dfn{Infix operators} take two arguments, one on either side. Operators
1672 have precedence, but operations with equal precedence are performed left
1673 to right. Apart from @code{+} or @code{-}, both arguments must be
1674 absolute, and the result is absolute.
1682 @dfn{Multiplication}.
1684 @dfn{Division}. Truncation is the same as the C operator @samp{/}
1689 @dfn{Shift Left}. Same as the C operator @samp{_0__<<_1__}
1692 @dfn{Shift Right}. Same as the C operator @samp{_0__>>_1__}
1696 Intermediate precedence
1699 @dfn{Bitwise Inclusive Or}.
1703 @dfn{Bitwise Exclusive Or}.
1705 @dfn{Bitwise Or Not}.
1712 @dfn{Addition}. If either argument is absolute, the result
1713 has the segment of the other argument.
1714 If either argument is pass1 or undefined, the result is pass1.
1715 Otherwise @code{+} is illegal.
1717 @dfn{Subtraction}. If the right argument is absolute, the
1718 result has the segment of the left argument.
1719 If either argument is pass1 the result is pass1.
1720 If either argument is undefined the result is difference segment.
1721 If both arguments are in the same segment, the result is absolute---provided
1722 that segment is one of text, data or bss.
1723 Otherwise subtraction is illegal.
1727 The sense of the rule for addition is that it's only meaningful to add
1728 the @emph{offsets} in an address; you can only have a defined segment in
1729 one of the two arguments.
1731 Similarly, you can't subtract quantities from two different segments.
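
The rules above can be sketched with a few statements. The label names
are arbitrary, @code{.long} is assumed to assemble 32-bit values, and
the results noted in the comments follow from the precedence and
segment rules described in this section.

@example
        .text
start:  .long 0, 0
finish:
size = finish - start           # both in text, so the result is absolute: 8
        .long start + 4         # text plus absolute: an address in text
        .long 2 + 3 * 4         # multiplication is done first: 14
        .long 1 | 1 + 1         # | is done before +: 2
        .long -(~5 & 7)         # prefix operators on absolute arguments: -2
        # .long start + finish  would be rejected: both arguments have a segment
@end example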
1733 @node Pseudo Ops, Machine Dependent, Expressions, Top
1734 @chapter Assembler Directives
1736 * Abort:: The Abort directive causes as to abort
1737 * Align:: Pad the location counter to a power of 2
1738 * App-File:: Set the logical file name
1739 * Ascii:: Fill memory with bytes of ASCII characters
1740 * Asciz:: Fill memory with bytes of ASCII characters followed
1742 * Byte:: Fill memory with 8-bit integers
1743 * Comm:: Reserve public space in the BSS segment
1744 * Data:: Change to the data segment
1745 * Desc:: Set the n_desc of a symbol
1746 * Double:: Fill memory with double-precision floating-point numbers
1747 * Else:: @code{.else}
1749 * Endif:: @code{.endif}
1750 * Equ:: @code{.equ @var{symbol}, @var{expression}}
1751 * Extern:: @code{.extern}
1752 * Fill:: Fill memory with repeated values
1753 * Float:: Fill memory with single-precision floating-point numbers
1754 * Global:: Make a symbol visible to the linker
1755 * Ident:: @code{.ident}
1756 * If:: @code{.if @var{absolute expression}}
1757 * Include:: @code{.include "@var{file}"}
1758 * Int:: Fill memory with 32-bit integers
1759 * Lcomm:: Reserve private space in the BSS segment
1760 * Line:: Set the logical line number
1761 * Ln:: @code{.ln @var{line-number}}
1762 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1763 * Long:: Fill memory with 32-bit integers
1764 * Lsym:: Create a local symbol
1765 * Octa:: Fill memory with 128-bit integers
1766 * Org:: Change the location counter
1767 * Quad:: Fill memory with 64-bit integers
1768 * Set:: Set the value of a symbol
1769 * Short:: Fill memory with 16-bit integers
1770 * Single:: @code{.single @var{flonums}}
1771 * Stab:: Store debugging information
1772 * Text:: Change to the text segment
1773 * Word:: Fill memory with 32-bit integers
1774 * Deprecated:: Deprecated Directives
1775 * Machine Options:: Options
1776 * Machine Syntax:: Syntax
1777 * Floating Point:: Floating Point
1778 * Machine Directives:: Machine Directives
1782 All assembler directives have names that begin with a period (@samp{.}).
1783 The rest of the name is letters: their case does not matter.
1785 This chapter discusses directives present in all versions of GNU
1786 @code{as}; @pxref{Machine Dependent} for additional directives.
1788 @node Abort, Align, Pseudo Ops, Pseudo Ops
1789 @section @code{.abort}
1790 This directive stops the assembly immediately. It is for
1791 compatibility with other assemblers. The original idea was that the
1792 assembler program would be piped into the assembler. If the sender
1793 of a program quit, it could use this directive to tell @code{as} to
1794 quit also. One day @code{.abort} will not be supported.
1796 @node Align, App-File, Abort, Pseudo Ops
1797 @section @code{.align @var{abs-expression} , @var{abs-expression}}
1798 Pad the location counter (in the current subsegment) to a particular
1799 storage boundary. The first expression (which must be absolute) is the
1800 number of low-order zero bits the location counter will have after
1801 advancement. For example @samp{.align 3} will advance the location
1802 counter until it is a multiple of 8. If the location counter is already a
1803 multiple of 8, no change is needed.
1805 The second expression (also absolute) gives the value to be stored in
1806 the padding bytes. It (and the comma) may be omitted. If it is
1807 omitted, the padding bytes are zero.
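For instance, assuming the data subsegment starts out empty, the
following sketch pads first with the default zero bytes and then with an
explicit fill value.
@example
        .data
        .byte   1               /* location counter is now 1 */
        .align  2               /* three zero bytes; counter is now 4 */
        .byte   2               /* location counter is now 5 */
        .align  3, 0xff         /* three 0xff bytes; counter is now 8 */
@end example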
1809 @node App-File, Ascii, Align, Pseudo Ops
1810 @section @code{.app-file @var{string}}
1811 @code{.app-file} tells @code{as} that we are about to start a new
1812 logical file. @var{String} is the new file name. In general, the
1813 filename is recognized whether or not it is surrounded by quotes @samp{"};
1814 but if you wish to specify an empty file name,
1815 you must give the quotes---@code{""}. This statement may go away in
1816 future: it is only recognized to be compatible with old @code{as}
1819 @node Ascii, Asciz, App-File, Pseudo Ops
1820 @section @code{.ascii "@var{string}"}@dots{}
1821 @code{.ascii} expects zero or more string literals (@pxref{Strings})
1822 separated by commas. It assembles each string (with no automatic
1823 trailing zero byte) into consecutive addresses.
1825 @node Asciz, Byte, Ascii, Pseudo Ops
1826 @section @code{.asciz "@var{string}"}@dots{}
1827 @code{.asciz} is just like @code{.ascii}, but each string is followed by
1828 a zero byte. The ``z'' in @samp{.asciz} stands for ``zero''.
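The difference between the two directives can be seen in a short sketch
(the strings are arbitrary):
@example
        .ascii  "abc", "def"    /* six bytes: a b c d e f */
        .asciz  "abc", "def"    /* eight bytes: a b c 0 d e f 0 */
@end example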
1830 @node Byte, Comm, Asciz, Pseudo Ops
1831 @section @code{.byte @var{expressions}}
1833 @code{.byte} expects zero or more expressions, separated by commas.
1834 Each expression is assembled into the next byte.
1836 @node Comm, Data, Byte, Pseudo Ops
1837 @section @code{.comm @var{symbol} , @var{length} }
1838 @code{.comm} declares a named common area in the bss segment. Normally
1839 @code{ld} reserves memory addresses for it during linking, so no partial
1840 program defines the location of the symbol. Use @code{.comm} to tell
1841 @code{ld} that it must be at least @var{length} bytes long. @code{ld}
1842 will allocate space for each @code{.comm} symbol that is at least as
1843 long as the longest @code{.comm} request in any of the partial programs
1844 linked. @var{length} is an absolute expression.
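A minimal sketch, using the hypothetical symbol name @code{buffer}:
@example
        .comm   buffer, 512     /* at least 512 bytes for the common symbol buffer */
@end example
If another partial program in the same link contained
@samp{.comm buffer, 1024}, @code{ld} would allocate 1024 bytes for
@code{buffer}, since that is the longest request.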
1846 @node Data, Desc, Comm, Pseudo Ops
1847 @section @code{.data @var{subsegment}}
1848 @code{.data} tells @code{as} to assemble the following statements onto the
1849 end of the data subsegment numbered @var{subsegment} (which is an
1850 absolute expression). If @var{subsegment} is omitted, it defaults to zero.
1853 @node Desc, Double, Data, Pseudo Ops
1854 @section @code{.desc @var{symbol}, @var{abs-expression}}
1855 This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
1856 to the low 16 bits of an absolute expression.
1858 @node Double, Else, Desc, Pseudo Ops
1859 @section @code{.double @var{flonums}}
1860 @code{.double} expects zero or more flonums, separated by commas. It assembles
1861 floating point numbers.
1863 The exact kind of floating point numbers emitted depends on how
1864 @code{as} is configured. @xref{Machine Dependent}.
1867 On the AMD 29K family the floating point format used is IEEE.
1870 @node Else, End, Double, Pseudo Ops
1871 @section @code{.else}
1872 @code{.else} is part of the @code{as} support for conditional assembly;
1873 @pxref{If}. It marks the beginning of a section of code to be assembled
1874 if the condition for the preceding @code{.if} was false.
1877 @node End, Endif, Else, Pseudo Ops
1878 @section @code{.end}
1879 This directive currently does nothing. It is not one of the directives
1880 that @code{as} deliberately ignores, however, so it is probably meant to
1881 do something eventually.
1884 @node Endif, Equ, End, Pseudo Ops
1885 @section @code{.endif}
1886 @code{.endif} is part of the @code{as} support for conditional assembly;
1887 it marks the end of a block of code that is only assembled
1888 conditionally. @xref{If}.
1890 @node Equ, Extern, Endif, Pseudo Ops
1891 @section @code{.equ @var{symbol}, @var{expression}}
1893 This directive sets the value of @var{symbol} to @var{expression}.
1894 It is synonymous with @samp{.set}; @pxref{Set}.
1896 @node Extern, Fill, Equ, Pseudo Ops
1897 @section @code{.extern}
1898 @code{.extern} is accepted in the source program---for compatibility
1899 with other assemblers---but it is ignored. GNU @code{as} treats
1900 all undefined symbols as external.
1902 @node Fill, Float, Extern, Pseudo Ops
1903 @section @code{.fill @var{repeat} , @var{size} , @var{value}}
1904 @var{repeat}, @var{size} and @var{value} are absolute expressions.
1905 This emits @var{repeat} copies of @var{size} bytes. @var{Repeat}
1906 may be zero or more. @var{Size} may be zero or more, but if it is
1907 more than 8, then it is deemed to have the value 8, compatible with
1908 other people's assemblers. The contents of each repetition
1909 are taken from an 8-byte number. The highest order 4 bytes are
1910 zero. The lowest order 4 bytes are @var{value} rendered in the
1911 byte-order of an integer on the computer @code{as} is assembling for.
1912 Each @var{size} bytes in a repetition is taken from the lowest order
1913 @var{size} bytes of this number. Again, this bizarre behavior is
1914 compatible with other people's assemblers.
1916 @var{Size} and @var{value} are optional.
1917 If the second comma and @var{value} are absent, @var{value} is
1918 assumed zero. If the first comma and following tokens are absent,
1919 @var{size} is assumed to be 1.
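The defaults and the common cases can be sketched as follows; the values
are arbitrary and chosen so that the result does not depend on byte
order.
@example
        .fill   8               /* eight bytes of zero: size 1, value 0 */
        .fill   4, 1, 0x20      /* four bytes, each 0x20 */
        .fill   2, 4, 1         /* two 4-byte integers, each with value 1 */
@end example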
1921 @node Float, Global, Fill, Pseudo Ops
1922 @section @code{.float @var{flonums}}
1923 This directive assembles zero or more flonums, separated by commas. It
1924 has the same effect as @code{.single}.
1926 The exact kind of floating point numbers emitted depends on how
1927 @code{as} is configured.
1928 @xref{Machine Dependent}.
1931 The floating point format used for the AMD 29K family is IEEE.
1934 @node Global, Ident, Float, Pseudo Ops
1935 @section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
1936 @code{.global} makes the symbol visible to @code{ld}. If you define
1937 @var{symbol} in your partial program, its value is made available to
1938 other partial programs that are linked with it. Otherwise,
1939 @var{symbol} will take its attributes from a symbol of the same name
1940 from another partial program it is linked with.
1942 This is done by setting the @code{N_EXT} bit of that symbol's type byte
1943 to 1. @xref{Symbol Attributes}.
1945 Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1946 compatibility with other assemblers.
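A minimal sketch, assuming a hypothetical symbol named @code{entry} that
is defined in this partial program:
@example
        .global entry           /* make entry visible to ld */
        .text
entry:  .long   0
@end example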
1948 @node Ident, If, Global, Pseudo Ops
1949 @section @code{.ident}
1950 This directive is used by some assemblers to place tags in object files.
1951 GNU @code{as} simply accepts the directive for source-file
1952 compatibility with such assemblers, but does not actually emit anything
1955 @node If, Include, Ident, Pseudo Ops
1956 @section @code{.if @var{absolute expression}}
1957 @code{.if} marks the beginning of a section of code which is only
1958 considered part of the source program being assembled if the argument
1959 (which must be an @var{absolute expression}) is non-zero. The end of
1960 the conditional section of code must be marked by @code{.endif}
1961 (@pxref{Endif}); optionally, you may include code for the
1962 alternative condition, flagged by @code{.else} (@pxref{Else}).
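As a sketch of how the three directives fit together, assume a
hypothetical absolute symbol @code{debug} set earlier in the same file:
@example
        .set    debug, 1        /* hypothetical absolute symbol */
        .if     debug
        .asciz  "debug build"   /* assembled, because debug is non-zero */
        .else
        .asciz  "normal build"  /* skipped */
        .endif
@end example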
1964 The following variants of @code{.if} are also supported:
1966 @item .ifdef @var{symbol}
1967 Assembles the following section of code if the specified @var{symbol} has been defined.
1975 @item .ifndef @var{symbol}
1976 @itemx .ifnotdef @var{symbol}
1977 Assembles the following section of code if the specified @var{symbol}
1978 has not been defined. Both spelling variants are equivalent.
1986 @node Include, Int, If, Pseudo Ops
1987 @section @code{.include "@var{file}"}
1988 This directive provides a way to include supporting files at specified
1989 points in your source program. The code from @var{file} is assembled as
1990 if it followed the point of the @code{.include}; when the end of the
1991 included file is reached, assembly of the original file continues. You
1992 can control the search paths used with the @samp{-I} command-line option
1993 (@pxref{Options}). Quotation marks are required around @var{file}.
1995 @node Int, Lcomm, Include, Pseudo Ops
1996 @section @code{.int @var{expressions}}
1997 Expect zero or more @var{expressions}, of any segment, separated by
1998 commas. For each expression, emit a 32-bit number that will, at run
1999 time, be the value of that expression. The byte order of the
2000 expression depends on what kind of computer will run the program.
2002 @node Lcomm, Line, Int, Pseudo Ops
2003 @section @code{.lcomm @var{symbol} , @var{length}}
2004 Reserve @var{length} (an absolute expression) bytes for a local
2005 common denoted by @var{symbol}. The segment and value of @var{symbol} are
2006 those of the new local common. The addresses are allocated in the
2007 bss segment, so at run-time the bytes will start off zeroed.
2008 @var{Symbol} is not declared global (@pxref{Global}), so is normally
2009 not visible to @code{ld}.
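For example, with the hypothetical symbol name @code{scratch}:
@example
        .lcomm  scratch, 64     /* 64 zeroed bytes in bss, private to this file */
        .text
        .long   scratch         /* the address of scratch, fixed up by ld */
@end example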
2012 @node Line, Ln, Lcomm, Pseudo Ops
2013 @section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2014 @code{.line}, and its alternate spelling @code{.ln}, tell @code{as} to change the logical line number; @pxref{Ln}.
2017 @node Ln, List, Line, Pseudo Ops
2018 @section @code{.ln @var{line-number}}
2021 @code{.ln} tells @code{as} to change the logical line number. @var{line-number} must be
2022 an absolute expression. The next line will have that logical line
2023 number. So any other statements on the current line (after a statement
2031 separator) will be reported as on logical line number
2032 @var{line-number} @minus{} 1.
2033 One day this directive will be unsupported: it is used only
2034 for compatibility with existing assembler programs. @refill
2036 @node List, Long, Ln, Pseudo Ops
2037 @section @code{.list} and related directives
2038 GNU @code{as} ignores the directives @code{.list}, @code{.nolist},
2039 @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}; however,
2040 they're accepted for compatibility with assemblers that use them.
2042 @node Long, Lsym, List, Pseudo Ops
2043 @section @code{.long @var{expressions}}
2044 @code{.long} is the same as @samp{.int}; @pxref{Int}.
2046 @node Lsym, Octa, Long, Pseudo Ops
2047 @section @code{.lsym @var{symbol}, @var{expression}}
2048 @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
2049 the hash table, ensuring it cannot be referenced by name during the
2050 rest of the assembly. This sets the attributes of the symbol to be
2051 the same as the expression value:
2053 @var{other} = @var{descriptor} = 0
2054 @var{type} = @r{(segment of @var{expression})}
2056 @var{value} = @var{expression}
2059 @node Octa, Org, Lsym, Pseudo Ops
2060 @section @code{.octa @var{bignums}}
2061 This directive expects zero or more bignums, separated by commas. For each
2062 bignum, it emits a 16-byte integer.
2064 The term ``octa'' comes from contexts in which a ``word'' is two bytes;
2065 hence @emph{octa}-word for 16 bytes.
2067 @node Org, Quad, Octa, Pseudo Ops
2068 @section @code{.org @var{new-lc} , @var{fill}}
2070 @code{.org} will advance the location counter of the current segment to
2071 @var{new-lc}. @var{new-lc} is either an absolute expression or an
2072 expression with the same segment as the current subsegment. That is,
2073 you can't use @code{.org} to cross segments: if @var{new-lc} has the
2074 wrong segment, the @code{.org} directive is ignored. To be compatible
2075 with former assemblers, if the segment of @var{new-lc} is absolute,
2076 @code{as} will issue a warning, then pretend the segment of @var{new-lc}
2077 is the same as the current subsegment.
2079 @code{.org} may only increase the location counter, or leave it
2080 unchanged; you cannot use @code{.org} to move the location counter backwards.
2083 @c double negative used below "not undefined" because this is a specific
2084 @c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2086 Because @code{as} tries to assemble programs in one pass, @var{new-lc}
2087 may not be undefined. If you really detest this restriction we eagerly await
2088 a chance to share your improved assembler.
2090 Beware that the origin is relative to the start of the segment, not
2091 to the start of the subsegment. This is compatible with other
2092 people's assemblers.
2094 When the location counter (of the current subsegment) is advanced, the
2095 intervening bytes are filled with @var{fill} which should be an
2096 absolute expression. If the comma and @var{fill} are omitted,
2097 @var{fill} defaults to zero.
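The following sketch assumes that @code{.long} emits four bytes and that
the hypothetical label @code{base} falls at offset 0 of the text
segment. The target of @code{.org} is written relative to @code{base} so
that it has the same segment as the current subsegment.
@example
        .text
base:   .long   1               /* occupies offsets 0 through 3 */
        .org    base+16, 0xff   /* offsets 4 through 15 are filled with 0xff */
        .long   2               /* assembled at offset 16 */
@end example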
2099 @node Quad, Set, Org, Pseudo Ops
2100 @section @code{.quad @var{bignums}}
2101 @code{.quad} expects zero or more bignums, separated by commas. For
2102 each bignum, it emits an 8-byte integer. If the bignum won't fit in 8
2103 bytes, it prints a warning message and takes only the lowest order 8
2104 bytes of the bignum.
2106 The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2107 hence @emph{quad}-word for 8 bytes.
2109 @node Set, Short, Quad, Pseudo Ops
2110 @section @code{.set @var{symbol}, @var{expression}}
2112 This directive sets the value of @var{symbol} to @var{expression}. This
2113 will change @var{symbol}'s value and type to conform to
2114 @var{expression}. If @code{N_EXT} is set, it remains set.
2115 (@xref{Symbol Attributes}.)
2117 You may @code{.set} a symbol many times in the same assembly.
2118 If the expression's segment is unknowable during pass 1, a second
2119 pass over the source program will be forced. The second pass is
2120 currently not implemented. @code{as} will abort with an error
2121 message if one is required.
2123 If you @code{.set} a global symbol, the value stored in the object
2124 file is the last value stored into it.
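A short sketch of repeated assignment, using the hypothetical symbol
@code{count}:
@example
        .set    count, 0
        .set    count, count+1  /* a symbol may be set many times */
        .byte   count           /* assembles a byte with value 1 */
@end example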
2126 @node Short, Single, Set, Pseudo Ops
2127 @section @code{.short @var{expressions}}
2128 _if__(! (_SPARC__ || _AMD29K__) )
2129 @code{.short} is the same as @samp{.word}. @xref{Word}.
2130 _fi__(! (_SPARC__ || _AMD29K__) )
2131 _if__(_SPARC__ || _AMD29K__)
2132 This expects zero or more @var{expressions}, and emits
2133 a 16-bit number for each.
2134 _fi__(_SPARC__ || _AMD29K__)
2136 @node Single, Space, Short, Pseudo Ops
2137 @section @code{.single @var{flonums}}
2138 This directive assembles zero or more flonums, separated by commas. It
2139 has the same effect as @code{.float}.
2141 The exact kind of floating point numbers emitted depends on how
2142 @code{as} is configured. @xref{Machine Dependent}.
2145 The floating point format used for the AMD 29K family is IEEE.
2149 @node Space, Stab, Single, Pseudo Ops
2151 @section @code{.space @var{size} , @var{fill}}
2152 This directive emits @var{size} bytes, each of value @var{fill}. Both
2153 @var{size} and @var{fill} are absolute expressions. If the comma
2154 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2158 @section @code{.space}
2159 This directive is ignored; it is accepted for compatibility with other
2163 @emph{Warning:} In other versions of GNU @code{as}, the directive
2164 @code{.space} has the effect of @code{.block}. @xref{Machine Directives}.
2168 @node Stab, Text, Space, Pseudo Ops
2169 @section @code{.stabd, .stabn, .stabs}
2170 There are three directives that begin @samp{.stab}.
2171 All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
2172 The symbols are not entered in @code{as}' hash table: they
2173 cannot be referenced elsewhere in the source file.
2174 Up to five fields are required:
2177 This is the symbol's name. It may contain any character except @samp{\000},
2178 so is more general than ordinary symbol names. Some debuggers used to
2179 code arbitrarily complex structures into symbol names using this field.
2181 An absolute expression. The symbol's type is set to the low 8
2182 bits of this expression.
2183 Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2186 An absolute expression.
2187 The symbol's ``other'' attribute is set to the low 8 bits of this expression.
2189 An absolute expression.
2190 The symbol's descriptor is set to the low 16 bits of this expression.
2192 An absolute expression which becomes the symbol's value.
2195 If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2196 or @code{.stabs} statement, the symbol has probably already been created
2197 and you will get a half-formed symbol in your object file. This is
2198 compatible with earlier assemblers!
2201 @item .stabd @var{type} , @var{other} , @var{desc}
2203 The ``name'' of the symbol generated is not even an empty string.
2204 It is a null pointer, for compatibility. Older assemblers used a
2205 null pointer so they didn't waste space in object files with empty
2208 The symbol's value is set to the location counter,
2209 relocatably. When your program is linked, the value of this symbol
2210 will be where the location counter was when the @code{.stabd} was
2213 @item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
2215 The name of the symbol is set to the empty string @code{""}.
2217 @item .stabs @var{string} , @var{type} , @var{other} , @var{desc} , @var{value}
2219 All five fields are specified.
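The three forms can be compared in the sketch below. The numeric type
codes 36 and 68, the string @samp{main:F1}, and the descriptor 12 are
assumptions chosen for illustration only; a compiler would supply
whatever values its debugger expects.
@example
        .stabs  "main:F1", 36, 0, 12, 0   /* all five fields given */
        .stabn  68, 0, 12, 0              /* name is the empty string */
        .stabd  68, 0, 12                 /* value is the location counter */
@end example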
2222 @node Text, Word, Stab, Pseudo Ops
2223 @section @code{.text @var{subsegment}}
2224 Tells @code{as} to assemble the following statements onto the end of
2225 the text subsegment numbered @var{subsegment}, which is an absolute
2226 expression. If @var{subsegment} is omitted, subsegment number zero
2229 @node Word, Deprecated, Text, Pseudo Ops
2230 @section @code{.word @var{expressions}}
2231 This directive expects zero or more @var{expressions}, of any segment,
2232 separated by commas.
2233 _if__(_SPARC__ || _AMD29K__)
2234 For each expression, @code{as} emits a 32-bit number.
2235 _fi__(_SPARC__ || _AMD29K__)
2236 _if__(! (_SPARC__ || _AMD29K__) )
2237 For each expression, @code{as} emits a 16-bit number.
2238 _fi__(! (_SPARC__ || _AMD29K__) )
2241 The byte order of the expression depends on what kind of computer will
2245 @c on the 29k the "special treatment to support compilers" doesn't
2246 @c happen---32-bit addressability, period; no long/short jumps.
2248 @subsection Special Treatment to support Compilers
2250 In order to assemble compiler output into something that will work,
2251 @code{as} will occasionally do strange things to @samp{.word} directives.
2252 Directives of the form @samp{.word sym1-sym2} are often emitted by
2253 compilers as part of jump tables. Therefore, when @code{as} assembles a
2254 directive of the form @samp{.word sym1-sym2}, and the difference between
2255 @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2256 create a @dfn{secondary jump table}, immediately before the next label.
2257 This @var{secondary jump table} will be preceded by a short-jump to the
2258 first byte after the secondary table. This short-jump prevents the flow
2259 of control from accidentally falling into the new table. Inside the
2260 table will be a long-jump to @code{sym2}. The original @samp{.word}
2261 will contain @code{sym1} minus the address of the long-jump to @code{sym2}.
2264 If there were several occurrences of @samp{.word sym1-sym2} before the
2265 secondary jump table, all of them will be adjusted. If there was a
2266 @samp{.word sym3-sym4}, that also did not fit in sixteen bits, a
2267 long-jump to @code{sym4} will be included in the secondary jump table,
2268 and the @code{.word} directives will be adjusted to contain @code{sym3}
2269 minus the address of the long-jump to @code{sym4}; and so on, for as many
2270 entries in the original jump table as necessary.
2273 @emph{This feature may be disabled by compiling @code{as} with the
2274 @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2275 assembly language programmers.
2279 @node Deprecated, Machine Dependent, Word, Pseudo Ops
2280 @section Deprecated Directives
2281 One day these directives won't work.
2282 They are included for compatibility with older assemblers.
2289 @node Machine Dependent, Machine Dependent, Pseudo Ops, Top
2291 @chapter Machine Dependent Features
2294 _if__(_VAX__ && !_ALL_ARCH__)
2295 @chapter Machine Dependent Features: VAX
2296 _fi__(_VAX__ && !_ALL_ARCH__)
2303 The Vax version of @code{as} accepts any of the following options,
2304 gives a warning message that the option was ignored and proceeds.
2305 These options are for compatibility with scripts designed for other
2306 people's assemblers.
2309 @item @kbd{-D} (Debug)
2310 @itemx @kbd{-S} (Symbol Table)
2311 @itemx @kbd{-T} (Token Trace)
2312 These are obsolete options used to debug old assemblers.
2314 @item @kbd{-d} (Displacement size for JUMPs)
2315 This option expects a number following the @kbd{-d}. Like options
2316 that expect filenames, the number may immediately follow the
2317 @kbd{-d} (old standard) or constitute the whole of the command line
2318 argument that follows @kbd{-d} (GNU standard).
2320 @item @kbd{-V} (Virtualize Interpass Temporary File)
2321 Some other assemblers use a temporary file. This option
2322 commanded them to keep the information in active memory rather
2323 than in a disk file. @code{as} always does this, so this
2324 option is redundant.
2326 @item @kbd{-J} (JUMPify Longer Branches)
2327 Many 32-bit computers permit a variety of branch instructions
2328 to do the same job. Some of these instructions are short (and
2329 fast) but have a limited range; others are long (and slow) but
2330 can branch anywhere in virtual memory. Often there are 3
2331 flavors of branch: short, medium and long. Some other
2332 assemblers would emit short and medium branches, unless told by
2333 this option to emit short and long branches.
2335 @item @kbd{-t} (Temporary File Directory)
2336 Some other assemblers may use a temporary file, and this option
2337 takes a filename being the directory to site the temporary
2338 file. @code{as} does not use a temporary disk file, so this
2339 option makes no difference. @kbd{-t} needs exactly one
2343 The Vax version of the assembler accepts two options when
2344 compiled for VMS. They are @kbd{-h}, and @kbd{-+}. The
2345 @kbd{-h} option prevents @code{as} from modifying the
2346 symbol-table entries for symbols that contain lowercase
2347 characters (I think). The @kbd{-+} option causes @code{as} to
2348 print warning messages if the FILENAME part of the object file,
2349 or any symbol name is longer than 31 characters. The @kbd{-+}
2350 option also inserts some code following the @samp{_main}
2351 symbol so that the object file will be compatible with Vax-11
2354 @subsection Floating Point
2355 Conversion of flonums to floating point is correct, and
2356 compatible with previous assemblers. Rounding is
2357 towards zero if the remainder is exactly half the least significant bit.
2359 @code{D}, @code{F}, @code{G} and @code{H} floating point formats
2362 Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
2363 are rendered correctly. Again, rounding is towards zero in the
2366 The @code{.float} directive produces @code{f} format numbers.
2367 The @code{.double} directive produces @code{d} format numbers.
2369 @subsection Machine Directives
2370 The Vax version of the assembler supports four directives for
2371 generating Vax floating point constants. They are described in the
2376 This expects zero or more flonums, separated by commas, and
2377 assembles Vax @code{d} format 64-bit floating point constants.
2380 This expects zero or more flonums, separated by commas, and
2381 assembles Vax @code{f} format 32-bit floating point constants.
2384 This expects zero or more flonums, separated by commas, and
2385 assembles Vax @code{g} format 64-bit floating point constants.
2388 This expects zero or more flonums, separated by commas, and
2389 assembles Vax @code{h} format 128-bit floating point constants.
2394 All DEC mnemonics are supported. Beware that @code{case@dots{}}
2395 instructions have exactly 3 operands. The dispatch table that
2396 follows the @code{case@dots{}} instruction should be made with
2397 @code{.word} statements. This is compatible with all unix
2398 assemblers we know of.
2400 @subsection Branch Improvement
2401 Certain pseudo opcodes are permitted. They are for branch
2402 instructions. They expand to the shortest branch instruction that
2403 will reach the target. Generally these mnemonics are made by
2404 substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2405 This feature is included both for compatibility and to help
2406 compilers. If you don't need this feature, don't use these
2407 opcodes. Here are the mnemonics, and the code they can expand into.
2411 @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2413 @item (byte displacement)
2415 @item (word displacement)
2417 @item (long displacement)
2422 Unconditional branch.
2424 @item (byte displacement)
2426 @item (word displacement)
2428 @item (long displacement)
2432 @var{COND} may be any one of the conditional branches
2433 @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2434 @var{COND} may also be one of the bit tests
2435 @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2436 @var{NOTCOND} is the opposite condition to @var{COND}.
2438 @item (byte displacement)
2439 @kbd{b@var{COND} @dots{}}
2440 @item (word displacement)
2441 @kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
2442 @item (long displacement)
2443 @kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2446 @var{X} may be one of @code{b d f g h l w}.
2448 @item (word displacement)
2449 @kbd{@var{OPCODE} @dots{}}
2450 @item (long displacement)
2451 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2454 @var{YYY} may be one of @code{lss leq}.
2456 @var{ZZZ} may be one of @code{geq gtr}.
2458 @item (byte displacement)
2459 @kbd{@var{OPCODE} @dots{}}
2460 @item (word displacement)
2461 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2462 @item (long displacement)
2463 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2470 @item (byte displacement)
2471 @kbd{@var{OPCODE} @dots{}}
2472 @item (word displacement)
2473 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2474 @item (long displacement)
2475 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
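As an illustration of the conditional case, here is a sketch in which
the label names @code{done} and @code{foo} are hypothetical. The source
line
@example
        jneq    done
@end example
may be assembled as a single byte-displacement branch,
@example
        bneq    done
@end example
or, when @code{done} is out of byte-displacement range, as the opposite
condition branching around a word-displacement branch:
@example
        beql    foo
        brw     done
foo:
@end example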
2479 @subsection Operands
2480 The immediate character is @samp{$} for Unix compatibility, not
2481 @samp{#} as DEC writes it.
2483 The indirect character is @samp{*} for Unix compatibility, not
2484 @samp{@@} as DEC writes it.
2486 The displacement sizing character is @samp{`} (an accent grave) for
2487 Unix compatibility, not @samp{^} as DEC writes it. The letter
2488 preceding @samp{`} may have either case. @samp{G} is not
2489 understood, but all other letters (@code{b i l s w}) are understood.
2491 Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2492 pc}. Any case of letters will do.
2499 Any expression is permitted in an operand. Operands are comma
2502 @c There is some bug to do with recognizing expressions
2503 @c in operands, but I forget what it is. It is
2504 @c a syntax clash because () is used as an address mode
2505 @c and to encapsulate sub-expressions.
2506 @subsection Not Supported
2507 Vax bit fields cannot be assembled with @code{as}. Someone
2508 can add the required code if they really need it.
2511 _if__(_AMD29K__ && !_ALL_ARCH__)
2512 @chapter Machine Dependent Features: AMD 29K
2513 _fi__(_AMD29K__ && !_ALL_ARCH__)
2515 @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
2517 GNU @code{as} has no additional command-line options for the AMD
2520 @node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2522 @subsection Special Characters
2523 @samp{;} is the line comment character.
2525 @samp{@@} can be used instead of a newline to separate statements.
2527 The character @samp{?} is permitted in identifiers (but may not begin
2530 @subsection Register Names
2531 General-purpose registers are represented by predefined symbols of the
2532 form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2533 (for local registers), where @var{nnn} represents a number between
2534 @code{0} and @code{127}, written with no leading zeros. The leading
2535 letters may be in either upper or lower case; for example, @samp{gr13}
2536 and @samp{LR7} are both valid register names.
2538 You may also refer to general-purpose registers by specifying the
2539 register number as the result of an expression (prefixed with @samp{%%}
2540 to flag the expression as a register number):
2544 @noindent---where @var{expression} must be an absolute expression
2545 evaluating to a number between @code{0} and @code{255}. The range
2546 [0, 127] refers to global registers, and the range [128, 255] to local
2549 In addition, GNU @code{as} understands the following protected
2550 special-purpose register names for the AMD 29K family:
2560 These unprotected special-purpose register names are also recognized:
2568 @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2569 @section Floating Point
2570 The AMD 29K family uses IEEE floating-point numbers.
2572 @node Machine Directives, Opcodes, Floating Point, Machine Dependent
2573 @section Machine Directives
2576 * block:: @code{.block @var{size} , @var{fill}}
2577 * cputype:: @code{.cputype}
2578 * file:: @code{.file}
2579 * hword:: @code{.hword @var{expressions}}
2580 * line:: @code{.line}
2581 * reg:: @code{.reg @var{symbol}, @var{expression}}
2582 * sect:: @code{.sect}
2583 * use:: @code{.use @var{segment name}}
2586 @node block, cputype, Machine Directives, Machine Directives
2587 @subsection @code{.block @var{size} , @var{fill}}
2588 This directive emits @var{size} bytes, each of value @var{fill}. Both
2589 @var{size} and @var{fill} are absolute expressions. If the comma
2590 and @var{fill} are omitted, @var{fill} is assumed to be zero.
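
As a rough model of the behavior just described (not the code @code{as} uses), the bytes emitted by @code{.block} can be sketched in Python. The function name @code{block} is hypothetical, and masking @var{fill} to its low 8 bits is an assumption.

```python
def block(size, fill=0):
    """Return the bytes a directive like `.block size, fill` would emit."""
    if size < 0:
        raise ValueError("size must not be negative")
    # Assumption: only the low 8 bits of fill are used for each byte.
    return bytes([fill & 0xFF]) * size
```

Omitting the second argument models the case where the comma and @var{fill} are left out.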
2592 In other versions of GNU @code{as}, this directive is called
2595 @node cputype, file, block, Machine Directives
2596 @subsection @code{.cputype}
2597 This directive is ignored; it is accepted for compatibility with other
2600 @node file, hword, cputype, Machine Directives
2601 @subsection @code{.file}
2602 This directive is ignored; it is accepted for compatibility with other
2606 @emph{Warning:} in other versions of GNU @code{as}, @code{.file} is
2607 used for the directive called @code{.app-file} in the AMD 29K support.
2610 @node hword, line, file, Machine Directives
2611 @subsection @code{.hword @var{expressions}}
2612 This expects zero or more @var{expressions}, and emits
2613 a 16-bit number for each. (Synonym for @samp{.short}.)
2615 @node line, reg, hword, Machine Directives
2616 @subsection @code{.line}
2617 This directive is ignored; it is accepted for compatibility with other
2620 @node reg, sect, line, Machine Directives
2621 @subsection @code{.reg @var{symbol}, @var{expression}}
2622 @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2624 @node sect, use, reg, Machine Directives
2625 @subsection @code{.sect}
2626 This directive is ignored; it is accepted for compatibility with other
2629 @node use, , sect, Machine Directives
2630 @subsection @code{.use @var{segment name}}
2631 Establishes the segment and subsegment for the following code;
2632 @var{segment name} may be one of @code{.text}, @code{.data},
2633 @code{.data1}, or @code{.lit}. With one of the first three @var{segment
2634 name} options, @samp{.use} is equivalent to the machine directive
2635 @var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2639 @node Opcodes, , Machine Directives, Machine Dependent
2641 GNU @code{as} implements all the standard AMD 29K opcodes. No
2642 additional pseudo-instructions are needed on this family.
2644 For information on the 29K machine instruction set, see @cite{Am29000
2645 User's Manual}, Advanced Micro Devices, Inc.
2649 _if__(_M680X0__ && !_ALL_ARCH__)
2650 @chapter Machine Dependent Features: Motorola 680x0
2651 _fi__(_M680X0__ && !_ALL_ARCH__)
2654 The 680x0 version of @code{as} has two machine dependent options.
2655 One shortens undefined references from 32 to 16 bits, while the
2656 other is used to tell @code{as} what kind of machine it is
2659 You can use the @kbd{-l} option to shorten the size of references to
2660 undefined symbols. If the @kbd{-l} option is not given, references to
2661 undefined symbols will be a full long (32 bits) wide. (Since @code{as}
2662 cannot know where these symbols will end up, @code{as} can only allocate
2663 space for the linker to fill in later. Since @code{as} doesn't know how
2664 far away these symbols will be, it allocates as much space as it can.)
2665 If this option is given, the references will only be one word wide (16
2666 bits). This may be useful if you want the object file to be as small as
2667 possible, and you know that the relevant symbols will be less than 17
2670 The 680x0 version of @code{as} is most frequently used to assemble
2671 programs for the Motorola MC68020 microprocessor. Occasionally it is
2672 used to assemble programs for the mostly similar, but slightly different
2673 MC68000 or MC68010 microprocessors. You can give @code{as} the options
2674 @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2675 @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2680 The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2681 Size modifiers are appended directly to the end of the opcode without an
2682 intervening period. For example, write @samp{movl} rather than
2686 If @code{as} is compiled with @code{SUN_ASM_SYNTAX} defined, it will also allow
2687 Sun-style local labels of the form @samp{1$} through @samp{9$}.
2690 In the following table @dfn{apc} stands for any of the address
2691 registers (@samp{a0} through @samp{a7}), nothing, (@samp{}), the
2692 Program Counter (@samp{pc}), or the zero-address relative to the
2693 program counter (@samp{zpc}).
2695 The following addressing modes are understood:
2698 @samp{#@var{digits}}
2701 @samp{d0} through @samp{d7}
2703 @item Address Register
2704 @samp{a0} through @samp{a7}
2706 @item Address Register Indirect
2707 @samp{a0@@} through @samp{a7@@}
2709 @item Address Register Postincrement
2710 @samp{a0@@+} through @samp{a7@@+}
2712 @item Address Register Predecrement
2713 @samp{a0@@-} through @samp{a7@@-}
2715 @item Indirect Plus Offset
2716 @samp{@var{apc}@@(@var{digits})}
2719 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2720 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2723 @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2724 or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2727 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2728 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2730 @item Memory Indirect
2731 @samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2734 @samp{@var{symbol}}, or @samp{@var{digits}}
2737 @c research before documenting.
2738 , or either of the above followed
2739 by @samp{:b}, @samp{:w}, or @samp{:l}.
2743 @section Floating Point
2744 The floating point code is not too well tested, and may have
2747 Packed decimal (P) format floating literals are not supported.
2748 Feel free to add the code!
2750 The floating point formats generated by directives are these.
2753 @code{Single} precision floating point constants.
2755 @code{Double} precision floating point constants.
2758 There is no directive to produce regions of memory holding
2759 extended precision numbers; however, they can be used as
2760 immediate operands to floating-point instructions. Adding a
2761 directive to create extended precision numbers would not be
2762 hard, but it has not yet seemed necessary.
2764 @section Machine Directives
2765 In order to be compatible with the Sun assembler, the 680x0 assembler
2766 understands the following directives.
2769 This directive is identical to a @code{.data 1} directive.
2771 This directive is identical to a @code{.data 2} directive.
2773 This directive is identical to a @code{.align 1} directive.
2774 @c Is this true? does it work???
2776 This directive is identical to a @code{.space} directive.
2781 @c paragraph. Bugs are bugs; how does saying this
2784 Danger: Several bugs have been found in the opcode table (and
2785 fixed). More bugs may exist. Be careful when using obscure
2789 @subsection Branch Improvement
2791 Certain pseudo opcodes are permitted for branch instructions.
2792 They expand to the shortest branch instruction that will reach the
2793 target. Generally these mnemonics are made by substituting @samp{j} for
2794 @samp{b} at the start of a Motorola mnemonic.
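
The selection rule for one of these pseudo opcodes, @samp{jra}, can be sketched as follows. This is a simplified Python model, not the relaxation code in @code{as}: the function name and both flags are hypothetical, the byte and word forms are assumed to accept any signed 8-bit or 16-bit displacement, and @code{has_long_branch} stands for assembling for a 68020. The mnemonics returned are those listed for @samp{jra} in the table in this section.

```python
def expand_jra(displacement, has_long_branch=True, pc_relative=True):
    """Pick the shortest expansion of `jra` that reaches the target."""
    if not pc_relative:
        return "jmp"                 # target is not PC relative
    if -128 <= displacement <= 127:
        return "bras"                # byte displacement
    if -32768 <= displacement <= 32767:
        return "bra"                 # word displacement
    # Long displacement: a long branch where available, else a jump.
    return "bral" if has_long_branch else "jmp"
```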
2796 The following table summarizes the pseudo-operations. A @code{*} flags
2797 cases that are more fully described after the table:
2801           +-------------------------------------------------------------
2803 Pseudo-Op |BYTE    WORD    LONG (68020) LONG (68000/10)  non-PC relative
2804           +-------------------------------------------------------------
2805      jbsr |bsrs    bsr     bsrl         jsr              jsr
2806       jra |bras    bra     bral         jmp              jmp
2807 *     jXX |bXXs    bXX     bXXl         bNXs;jmpl        bNXs;jmp
2808 *    dbXX |dbXX    dbXX    dbXX; bra; jmpl
2809 *    fjXX |fbXXw   fbXXw   fbXXl                         fbNXw;jmp
2812 NX: negative of condition XX
2815 @center{@code{*}---see full description below}
2820 These are the simplest jump pseudo-operations; they always map to one
2821 particular machine instruction, depending on the displacement to the
2825 Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2826 where @var{XX} is a conditional branch or condition-code test. The full
2827 list of pseudo-ops in this family is:
2829 jhi jls jcc jcs jne jeq jvc
2830 jvs jpl jmi jge jlt jgt jle
2833 For the cases of non-PC relative displacements and long displacements on
2834 the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2835 @var{NX}, the opposite condition to @var{XX}:
2847 The full family of pseudo-operations covered here is
2849 dbhi dbls dbcc dbcs dbne dbeq dbvc
2850 dbvs dbpl dbmi dbge dblt dbgt dble
2854 Other than for word and byte displacements, when the source reads
2855 @samp{db@var{XX} foo}, @code{as} will emit
2864 This family includes
2866 fjne fjeq fjge fjlt fjgt fjle fjf
2867 fjt fjgl fjgle fjnge fjngl fjngle fjngt
2868 fjnle fjnlt fjoge fjogl fjogt fjole fjolt
2869 fjor fjseq fjsf fjsne fjst fjueq fjuge
2870 fjugt fjule fjult fjun
2873 For branch targets that are not PC relative, @code{as} emits
2879 when it encounters @samp{fj@var{XX} foo}.
2883 @subsection Special Characters
2884 The immediate character is @samp{#} for Sun compatibility. The
2885 line-comment character is @samp{|}. If a @samp{#} appears at the
2886 beginning of a line, it is treated as a comment unless it looks like
2887 @samp{# line file}, in which case it is treated normally.
2894 The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
2895 specify that it is assembling for a 32032 processor, or a
2896 @kbd{-m32532} option to specify that it is assembling for a 32532 processor.
2897 The default (if neither is specified) is chosen when the assembler
2901 I don't know anything about the 32x32 syntax assembled by
2902 @code{as}. Someone who understands the processor (I've never seen
2903 one) and the possible syntaxes should write this section.
2905 @subsection Floating Point
2906 The 32x32 uses IEEE floating point numbers, but @code{as} will only
2907 create single or double precision values. I don't know if the 32x32
2908 understands extended precision numbers.
2910 @subsection Machine Directives
2911 The 32x32 has no machine dependent directives.
2916 _if__(_SPARC__ && !_ALL_ARCH__)
2917 @chapter Machine Dependent Features: SPARC
2918 _fi__(_SPARC__ && !_ALL_ARCH__)
2921 The Sparc has no machine dependent options.
2924 I don't know anything about Sparc syntax. Someone who does
2925 will have to write this section.
2927 @subsection Floating Point
2928 The Sparc uses IEEE floating-point numbers.
2930 @subsection Machine Directives
2931 The Sparc version of @code{as} supports the following additional
2936 This must be followed by a symbol name, a positive number, and
2937 @code{"bss"}. This behaves somewhat like @code{.comm}, but the
2938 syntax is different.
2941 This is functionally identical to @code{.globl}.
2944 This is functionally identical to @code{.short}.
2947 This directive is ignored. Any text following it on the same
2948 line is also ignored.
2951 This must be followed by a symbol name, a positive number, and
2952 @code{"bss"}. This behaves somewhat like @code{.lcomm}, but the
2953 syntax is different.
2956 This must be followed by @code{"text"}, @code{"data"}, or
2957 @code{"data1"}. It behaves like @code{.text}, @code{.data}, or
2961 This is functionally identical to the @code{.space} directive.
2964 On the Sparc, the @code{.word} directive produces 32-bit values,
2965 instead of the 16-bit values it produces on every other machine.
2970 _if__(_I80386__ && !_ALL_ARCH__)
2971 @chapter Machine Dependent Features: Intel 80386
2972 _fi__(_I80386__ && !_ALL_ARCH__)
2974 @section Intel 80386
2976 The 80386 has no machine dependent options.
2978 @subsection AT&T Syntax versus Intel Syntax
2979 In order to maintain compatibility with the output of @code{GCC},
2980 @code{as} supports AT&T System V/386 assembler syntax. This is quite
2981 different from Intel syntax. We mention these differences because
2982 almost all 80386 documents use only Intel syntax. Notable differences
2983 between the two syntaxes are:
2986 AT&T immediate operands are preceded by @samp{$}; Intel immediate
2987 operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2988 AT&T register operands are preceded by @samp{%}; Intel register operands
2989 are undelimited. AT&T absolute (as opposed to PC relative) jump/call
2990 operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2993 AT&T and Intel syntax use the opposite order for source and destination
2994 operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax} in AT&T syntax. The
2995 @samp{source, dest} convention is maintained for compatibility with
2996 previous Unix assemblers.
2999 In AT&T syntax the size of memory operands is determined from the last
3000 character of the opcode name. Opcode suffixes of @samp{b}, @samp{w},
3001 and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
3002 memory references. Intel syntax accomplishes this by prefixing memory
3003 operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
3004 @samp{word ptr}, and @samp{dword ptr}. Thus, Intel @samp{mov al, byte
3005 ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
3008 Immediate form long jumps and calls are
3009 @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
3010 Intel syntax is
3011 @samp{call/jmp far @var{segment}:@var{offset}}. Also, the far return
3012 instruction
3013 is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3014 @samp{ret far @var{stack-adjust}}.
3017 The AT&T assembler does not provide support for multiple segment
3018 programs. Unix style systems expect all programs to be single segments.
3021 @subsection Opcode Naming
3022 Opcode names are suffixed with one-character modifiers which specify the
3023 size of operands. The letters @samp{b}, @samp{w}, and @samp{l} specify
3024 byte, word, and long operands. If no suffix is specified by an
3025 instruction and it contains no memory operands then @code{as} tries to
3026 fill in the missing suffix based on the destination register operand
3027 (the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
3028 to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3029 @samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix
3030 assembler which assumes that a missing opcode suffix implies long
3031 operand size. (This incompatibility does not affect compiler output
3032 since compilers always explicitly specify the opcode suffix.)
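
The suffix rule above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the logic in @code{as}: the function names are hypothetical, the mnemonic is assumed to arrive without any size suffix, and any operand that is neither a @samp{%} register nor a @samp{$} immediate is treated as a memory operand.

```python
LONG_REGISTERS = ("eax", "ebx", "ecx", "edx", "edi", "esi", "ebp", "esp")
WORD_REGISTERS = ("ax", "bx", "cx", "dx", "di", "si", "bp", "sp")
BYTE_REGISTERS = ("ah", "al", "bh", "bl", "ch", "cl", "dh", "dl")


def suffix_for_register(operand):
    """Return 'b', 'w' or 'l' for a general register operand, else None."""
    if not operand.startswith("%"):
        return None
    name = operand[1:].lower()
    if name in BYTE_REGISTERS:
        return "b"
    if name in WORD_REGISTERS:
        return "w"
    if name in LONG_REGISTERS:
        return "l"
    return None


def add_missing_suffix(mnemonic, operands):
    """Append the suffix implied by the last (destination) operand.

    operands lists the operand strings in AT&T order, destination last.
    The mnemonic is returned unchanged when a memory operand is present
    or when the destination is not a general register.
    """
    for operand in operands:
        if not operand.startswith(("%", "$")):
            return mnemonic          # memory operand: nothing is inferred
    if not operands:
        return mnemonic
    suffix = suffix_for_register(operands[-1])
    return mnemonic if suffix is None else mnemonic + suffix
```

For the two cases in the text, @code{add_missing_suffix("mov", ["%ax", "%bx"])} and @code{add_missing_suffix("mov", ["$1", "%bx"])} both yield @samp{movw}.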
3034 Almost all opcodes have the same names in AT&T and Intel format. There
3035 are a few exceptions. The sign extend and zero extend instructions need
3036 two sizes to specify them. They need a size to sign/zero extend
3037 @emph{from} and a size to sign/zero extend @emph{to}. This is accomplished
3038 by using two opcode suffixes in AT&T syntax. Base names for sign extend
3039 and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3040 syntax (@samp{movsx} and @samp{movzx} in Intel syntax). The opcode
3041 suffixes are tacked on to this base name, the @emph{from} suffix before
3042 the @emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3043 ``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
3044 thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3045 and @samp{wl} (from word to long).
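
The construction of these two-suffix names can be sketched as follows. The Python helper below is purely illustrative; its name and argument conventions are hypothetical.

```python
SIZE_SUFFIX = {8: "b", 16: "w", 32: "l"}


def extend_mnemonic(signed, from_bits, to_bits):
    """Build the AT&T name of a sign- or zero-extending move."""
    if from_bits not in SIZE_SUFFIX or to_bits not in SIZE_SUFFIX:
        raise ValueError("sizes must be 8, 16 or 32 bits")
    if from_bits >= to_bits:
        raise ValueError("the destination must be wider than the source")
    base = "movs" if signed else "movz"
    # The "from" suffix comes before the "to" suffix.
    return base + SIZE_SUFFIX[from_bits] + SIZE_SUFFIX[to_bits]
```

Only three size pairs pass the checks, which matches the three suffix pairs listed above.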
3047 The Intel syntax conversion instructions
3050 @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3052 @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3054 @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3056 @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3058 are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3059 AT&T naming. @code{as} accepts either naming for these instructions.
3061 Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
3062 AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
3065 @subsection Register Naming
3066 Register operands are always prefixed with @samp{%}. The 80386 registers
3070 the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3071 @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3072 frame pointer), and @samp{%esp} (the stack pointer).
3075 the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3076 @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
3079 the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
3080 @samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (these
3081 are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
3082 @samp{%cx}, and @samp{%dx}).
3085 the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3086 (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs},
3090 the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3094 the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3095 @samp{%db3}, @samp{%db6}, and @samp{%db7}.
3098 the 2 test registers @samp{%tr6} and @samp{%tr7}.
3101 the 8 floating point stack registers @samp{%st} or equivalently
3102 @samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3103 @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
3106 @subsection Opcode Prefixes
3107 Opcode prefixes are used to modify the following opcode. They are used
3108 to repeat string instructions, to provide segment overrides, to perform
3109 bus lock operations, and to give operand and address size (16-bit
3110 operands are specified in an instruction by prefixing what would
3111 normally be 32-bit operands with an ``operand size'' opcode prefix).
3112 Opcode prefixes are usually given as single-line instructions with no
3113 operands, and must directly precede the instruction they act upon. For
3114 example, the @samp{scas} (scan string) instruction is repeated with:
3120 Here is a list of opcode prefixes:
3123 Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
3124 @samp{fs}, @samp{gs}. These are automatically added by
3125 using the @var{segment}:@var{memory-operand} form for memory references.
3128 Operand/Address size prefixes @samp{data16} and @samp{addr16}
3129 change 32-bit operands/addresses into 16-bit operands/addresses. Note
3130 that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3131 are not supported (yet).
3134 The bus lock prefix @samp{lock} inhibits interrupts during
3135 execution of the instruction it precedes. (This is only valid with
3136 certain instructions; see an 80386 manual for details.)
3139 The wait for coprocessor prefix @samp{wait} waits for the
3140 coprocessor to complete the current instruction. This should never be
3141 needed for the 80386/80387 combination.
3144 The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3145 to string instructions to make them repeat @samp{%ecx} times.
3148 @subsection Memory References
3149 An Intel syntax indirect memory reference of the form
3151 @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3153 is translated into the AT&T syntax
3155 @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3157 where @var{base} and @var{index} are the optional 32-bit base and
3158 index registers, @var{disp} is the optional displacement, and
3159 @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3160 to calculate the address of the operand. If no @var{scale} is
3161 specified, @var{scale} is taken to be 1. @var{segment} specifies the
3162 optional segment register for the memory operand, and may override the
3163 default segment register (see an 80386 manual for segment register
3164 defaults). Note that segment overrides in AT&T syntax @emph{must}
3165 be preceded by a @samp{%}. If you specify a segment override which
3166 coincides with the default segment register, @code{as} will @emph{not}
3167 output any segment register override prefixes to assemble the given
3168 instruction. Thus, segment overrides can be specified to emphasize which
3169 segment register is used for a given memory operand.
3171 Here are some examples of Intel and AT&T style memory references:
3174 @item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
3175 @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3176 missing, and the default segment is used (@samp{%ss} for addressing with
3177 @samp{%ebp} as the base register). @var{index}, @var{scale} are both missing.
3179 @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3180 @var{index} is @samp{%eax} (scaled by a @var{scale} of 4); @var{disp} is
3181 @samp{foo}. All other fields are missing. The segment register here
3182 defaults to @samp{%ds}.
3184 @item AT&T: @samp{foo(,1)}, Intel: @samp{[foo]}
3185 This uses the value pointed to by @samp{foo} as a memory operand.
3186 Note that @var{base} and @var{index} are both missing, but there is only
3187 @emph{one} @samp{,}. This is a syntactic exception.
3189 @item AT&T: @samp{%gs:foo}, Intel: @samp{gs:foo}
3190 This selects the contents of the variable @samp{foo} with segment
3191 register @var{segment} being @samp{%gs}.
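
The translation described in this section can be sketched in Python. This is a minimal model rather than the parser in @code{as}: the function names and keyword arguments are hypothetical, registers are passed without the @samp{%} prefix, the @samp{foo(,1)} exception is not modelled, and the address calculation ignores the segment base and assumes 32-bit wraparound.

```python
def att_memory_operand(disp="", base="", index="", scale=1, segment=""):
    """Format segment:disp(base, index, scale) in AT&T syntax."""
    if index and scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    text = str(disp)
    if base or index:
        inner = "%" + base if base else ""
        if index:
            inner += ",%" + index
            if scale != 1:           # a scale of 1 is the default
                inner += "," + str(scale)
        text += "(" + inner + ")"
    if segment:
        text = "%" + segment + ":" + text
    return text


def effective_address(disp=0, base=0, index=0, scale=1):
    """Offset of the operand: base + index * scale + disp, modulo 2**32."""
    return (base + index * scale + disp) & 0xFFFFFFFF
```

Calling @code{att_memory_operand(disp=-4, base="ebp")} gives @samp{-4(%ebp)}, and @code{att_memory_operand(disp="foo", index="eax", scale=4)} gives @samp{foo(,%eax,4)}, matching the first two examples above.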
3195 Absolute (as opposed to PC relative) call and jump operands must be
3196 prefixed with @samp{*}. If no @samp{*} is specified, @code{as} will
3197 always choose PC relative addressing for jump/call labels.
3199 Any instruction that has a memory operand @emph{must} specify its size (byte,
3200 word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3203 @subsection Handling of Jump Instructions
3204 Jump instructions are always optimized to use the smallest possible
3205 displacements. This is accomplished by using byte (8-bit) displacement
3206 jumps whenever the target is sufficiently close. If a byte displacement
3207 is insufficient a long (32-bit) displacement is used. We do not support
3208 word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3209 with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3210 @samp{%eip} to 16 bits after the word displacement is added.
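
The choice between the two displacement sizes can be sketched as below. The Python function is a hypothetical illustration, not code from @code{as}, and it assumes displacements are signed two's complement values.

```python
def jump_displacement_bits(displacement):
    """Return 8 or 32, the displacement width chosen for a jump."""
    if -128 <= displacement <= 127:
        return 8                     # target is close enough for a byte
    if -2**31 <= displacement <= 2**31 - 1:
        return 32                    # otherwise fall back to a long
    raise ValueError("displacement does not fit in 32 bits")
```

There is deliberately no 16-bit case, mirroring the restriction described above.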
3212 Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3213 @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3214 byte displacements, so that it is possible that use of these
3215 instructions (@code{GCC} does not use them) will cause the assembler to
3216 print an error message (and generate incorrect code). The AT&T 80386
3217 assembler tries to get around this problem by expanding @samp{jcxz foo} to
3225 @subsection Floating Point
3226 All 80387 floating point types except packed BCD are supported.
3227 (BCD support may be added without much difficulty). These data
3228 types are 16-, 32-, and 64- bit integers, and single (32-bit),
3229 double (64-bit), and extended (80-bit) precision floating point.
3230 Each supported type has an opcode suffix and a constructor
3231 associated with it. Opcode suffixes specify the operand's data
3232 type. Constructors build these data types into memory.
3236 Floating point constructors are @samp{.float} or @samp{.single},
3237 @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3238 These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
3239 Note that @samp{t} stands for temporary real, and that the 80387 only supports
3240 this format via the @samp{fldt} (load temporary real to stack top) and
3241 @samp{fstpt} (store temporary real and pop stack) instructions.
3244 Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3245 @samp{.quad} for the 16-, 32-, and 64-bit integer formats. The corresponding
3246 opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3247 (quad). As with the temporary real format the 64-bit @samp{q} format is
3248 only present in the @samp{fildq} (load quad integer to stack top) and
3249 @samp{fistpq} (store quad integer and pop stack) instructions.
3252 Register to register operations do not require opcode suffixes,
3253 so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
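
The pairing of constructors and opcode suffixes described above can be summarized in a small lookup, shown here as a Python sketch. The table contents come from the text; the function name is hypothetical, and deciding that a mnemonic takes an integer operand because it begins with @samp{fi} is an assumption made only for this illustration.

```python
# constructor -> (opcode suffix, size in bits), as listed in the text
FLOAT_FORMATS = {
    ".float": ("s", 32),
    ".single": ("s", 32),
    ".double": ("l", 64),
    ".tfloat": ("t", 80),
}
INTEGER_FORMATS = {
    ".word": ("s", 16),
    ".long": ("l", 32),
    ".int": ("l", 32),
    ".quad": ("q", 64),
}


def memory_mnemonic(base, constructor):
    """Append the opcode suffix matching the data built by constructor."""
    # Assumption: integer memory instructions are those starting with "fi".
    table = INTEGER_FORMATS if base.startswith("fi") else FLOAT_FORMATS
    if constructor not in table:
        raise ValueError(constructor + " does not match " + base)
    return base + table[constructor][0]
```

For instance, data built with @samp{.tfloat} is loaded with @code{memory_mnemonic("fld", ".tfloat")}, which returns @samp{fldt}.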
3255 Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
3256 instructions are almost never needed (this is not the case for the
3257 80286/80287 and 8086/8087 combinations). Therefore, @code{as} suppresses
3258 the @samp{fwait} instruction whenever it is implicitly selected by one
3259 of the @samp{fn@dots{}} instructions. For example, @samp{fsave} and
3260 @samp{fnsave} are treated identically. In general, all the @samp{fn@dots{}}
3261 instructions are made equivalent to @samp{f@dots{}} instructions. If
3262 @samp{fwait} is desired it must be explicitly coded.
3265 There is some trickery concerning the @samp{mul} and @samp{imul}
3266 instructions that deserves mention. The 16-, 32-, and 64-bit expanding
3267 multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3268 for @samp{imul}) can be output only in the one operand form. Thus,
3269 @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3270 the expanding multiply would clobber the @samp{%edx} register, and this
3271 would confuse @code{GCC} output. Use @samp{imul %ebx} to get the
3272 64-bit product in @samp{%edx:%eax}.
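
The difference between the two forms can be seen in a small arithmetic model. The Python functions below are hypothetical and only mimic the 32-bit case; they are a sketch of the register results, not of how @code{as} encodes the instructions.

```python
def signed32(value):
    """Interpret the low 32 bits of value as a signed number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def imul_expanding(eax, operand):
    """One-operand form: return (edx, eax) holding the 64-bit product."""
    product = (signed32(eax) * signed32(operand)) & ((1 << 64) - 1)
    return (product >> 32) & 0xFFFFFFFF, product & 0xFFFFFFFF


def imul_two_operand(source, destination):
    """Two-operand form: 32-bit result only; %edx is not involved."""
    return (signed32(source) * signed32(destination)) & 0xFFFFFFFF
```

Multiplying 0x10000 by itself shows the contrast: the expanding form returns 1 in @samp{%edx} and 0 in @samp{%eax}, while the two-operand form returns 0 and leaves @samp{%edx} alone.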
3274 We have added a two operand form of @samp{imul} when the first operand
3275 is an immediate mode expression and the second operand is a register.
3276 This is just a shorthand, so that multiplying @samp{%eax} by 69, for
3277 example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3283 @c changing rapidly. These may need to be moved to another
3284 @c book anyhow, if we adopt the model of user/modifier
3287 @node Maintenance, Retargeting, Machine Dependent, Top
3288 @chapter Maintaining the Assembler
3289 [[this chapter is still being built]]
3292 We had these goals, in descending priority:
3295 For every program composed by a compiler, @code{as} should emit
3296 ``correct'' code. This leaves some latitude in choosing addressing
3297 modes, order of @code{relocation_info} structures in the object
3300 @item Speed, for usual case.
3301 By far the most common use of @code{as} will be assembling compiler
3304 @item Upward compatibility for existing assembler code.
3305 Well @dots{} we don't support Vax bit fields but everything else
3306 seems to be upward compatible.
3309 The code should be maintainable with few surprises. (JF: ha!)
3313 We assumed that disk I/O was slow and expensive while memory was
3314 fast and access to memory was cheap. We expect the in-memory data
3315 structures to be less than 10 times the size of the emitted object
3316 file. (Contrast this with the C compiler where in-memory structures
3317 might be 100 times object file size!)
3321 Try to read the source file from disk only one time. For other
3322 reasons, we keep large chunks of the source file in memory during
3323 assembly so this is not a problem. Also the assembly algorithm
3324 should only scan the source text once if the compiler composed the
3325 text according to a few simple rules.
3327 Emit the object code bytes only once. Don't store values and then
3330 Build the object file in memory and do direct writes to disk of
3334 RMS suggested a one-pass algorithm which seems to work well. By not
3335 parsing text during a second pass considerable time is saved on
3336 large programs (@emph{e.g.} the sort of C program @code{yacc} would
3339 It happened that the data structures needed to emit relocation
3340 information to the object file were neatly subsumed into the data
3341 structures that do backpatching of addresses after pass 1.
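
The idea that one list can serve both backpatching and relocation can be sketched with a toy one-pass assembler. The Python code below is an illustration under invented conventions and bears no relation to the actual data structures in @code{as}: the statement kinds, the 2-byte big-endian address field, and the function name are all assumptions.

```python
def assemble(statements):
    """Toy one-pass assembler: returns (code, relocations).

    statements is a list of (kind, argument) tuples:
      ("label", name)  defines name at the current offset
      ("byte", value)  emits one byte
      ("ref", name)    emits a 2-byte big-endian address of name
    """
    code = bytearray()
    symbols = {}
    fixups = []                      # (offset, name), filled in one pass
    for kind, argument in statements:
        if kind == "label":
            symbols[argument] = len(code)
        elif kind == "byte":
            code.append(argument & 0xFF)
        elif kind == "ref":
            fixups.append((len(code), argument))
            code += b"\x00\x00"      # placeholder for the address
        else:
            raise ValueError("unknown statement: " + kind)
    relocations = []
    for offset, name in fixups:
        if name in symbols:          # known now: backpatch the address
            code[offset:offset + 2] = symbols[name].to_bytes(2, "big")
        else:                        # still unknown: leave to the linker
            relocations.append((offset, name))
    return bytes(code), relocations
```

The source is scanned once; forward references are resolved from the fixup list afterwards, and whatever remains unresolved in that same list becomes the relocation information.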
3343 Many of the functions began life as re-usable modules, loosely
3344 connected. RMS changed this to gain speed. For example, input
3345 parsing routines which used to work on pre-sanitized strings now
3346 must parse raw data. Hence they have to import knowledge of the
3347 assembler's comment conventions @emph{etc}.
3349 @section Deprecated Feature(?)s
3350 We have stopped supporting some features:
3353 @code{.org} statements must have @b{defined} expressions.
3355 Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3358 It might be a good idea to not support these features in a future release:
3361 @kbd{#} should begin a comment, even in column 1.
3363 Why support the logical line & file concept any more?
3365 Subsegments are a good candidate for flushing.
3366 Depends on which compilers need them I guess.
3369 @section Bugs, Ideas, Further Work
3370 Clearly the major improvement is DON'T USE A TEXT-READING
3371 ASSEMBLER for the back end of a compiler. It is much faster to
3372 interpret binary gobbledygook from a compiler's tables than to
3373 ask the compiler to write out human-readable code just so the
3374 assembler can parse it back to binary.
3376 Assuming you use @code{as} for human written programs: here are
3380 Document (here) @code{APP}.
3382 Take advantage of knowing no spaces except after opcode
3383 to speed up @code{as}. (Modify @code{app.c} to flush useless spaces:
3384 only keep space/tabs at begin of line or between 2
3387 Put pointers in this documentation to @file{a.out} documentation.
3389 Split the assembler into parts so it can gobble direct binary
3390 from @emph{e.g.} @code{cc}. It is silly for @code{cc} to compose text
3391 just so @code{as} can parse it back to binary.
3393 Rewrite hash functions: I want a more modular, faster library.
3395 Clean up LOTS of code.
3397 Include all the non-@file{.c} files in the maintenance chapter.
3401 Implement flonum short literals.
3403 Change all talk of expression operands to expression quantities,
3404 or perhaps to expression arguments.
3408 Whenever a @code{.text} or @code{.data} statement is seen, we close
3409 off the current frag with an imaginary @code{.fill 0}. This is
3410 because we only have one obstack for frags, and we can't grow new
3411 frags for a new subsegment, then go back to the old subsegment and
3412 append bytes to the old frag. All this nonsense goes away if we
3413 give each subsegment its own obstack. It makes code simpler in
3414 about 10 places, but nobody has bothered to do it because C compiler
3415 output rarely changes subsegments (compared to ending frags with
3416 relaxable addresses, which is common).
3420 @c The following files in the @file{as} directory
3421 @c are symbolic links to other files, of
3422 @c the same name, in a different directory.
3425 @c @file{atof_generic.c}
3427 @c @file{atof_vax.c}
3429 @c @file{flonum_const.c}
3431 @c @file{flonum_copy.c}
3433 @c @file{flonum_get.c}
3435 @c @file{flonum_multip.c}
3437 @c @file{flonum_normal.c}
3439 @c @file{flonum_print.c}
3442 Here is a list of the source files in the @file{as} directory.
3446 This contains the pre-processing phase, which deletes comments,
3447 handles whitespace, etc. This was recently re-written, since app
3448 used to be a separate program, but RMS wanted it to be inline.
3451 This is a subroutine to append a string to another string returning a
3452 pointer just after the last @code{char} appended. (JF: All these
3453 little routines should probably all be put in one file.)
3456 Here you will find the main program of the assembler @code{as}.
3459 This is a branch office of @file{read.c}. This understands
3460 expressions, arguments. Inside @code{as}, arguments are called
3461 (expression) @emph{operands}. This is confusing, because we also talk
3462 (elsewhere) about instruction @emph{operands}. Also, expression
3463 operands are called @emph{quantities} explicitly to avoid confusion
3464 with instruction operands. What a mess.
3467 This implements the @b{frag} concept. Without frags, finding the
3468 right size for branch instructions would be a lot harder.
3471 This contains the symbol table, opcode table @emph{etc.} hashing
3475 This is a table of values of digits, for use in @code{atoi()}-type
3476 functions. Could probably be flushed by using calls to @code{strtol()}, or
3480 This contains operating system dependent source file reading
3481 routines. Since error messages often say where we are in reading
3482 the source file, they live here too. Since @code{as} is intended to
3483 run under GNU and Unix only, this might be worth flushing. Anyway,
3484 almost all C compilers support stdio.
3487 This deals with calling the pre-processor (if needed) and feeding the
3488 chunks back to the rest of the assembler the right way.
3491 This contains operating system independent parts of fatal and
3492 warning message reporting. See @file{append.c} above.
3495 This contains operating system dependent functions that write an
3496 object file for @code{as}. See @file{input-file.c} above.
3499 This implements all the directives of @code{as}. This also deals
3500 with passing input lines to the machine dependent part of the
3504 This is a C library function that isn't in most C libraries yet.
3505 See @file{append.c} above.
3508 This implements subsegments.
3511 This implements symbols.
3514 This contains the code to perform relaxation, and to write out
3515 the object file. It is mostly operating system independent, but
3516 different OSes have different object file formats in any case.
3519 This implements @code{malloc()} or bust. See @file{append.c} above.
3522 This implements @code{realloc()} or bust. See @file{append.c} above.
3524 @item atof-generic.c
3525 The following files were taken from a machine-independent subroutine
3526 library for manipulating floating point numbers and very large
3529 @file{atof-generic.c} turns a string into a flonum internal format
3530 floating-point number.
3532 @item flonum-const.c
3533 This contains some potentially useful floating point numbers in
3537 This copies a flonum.
3539 @item flonum-multip.c
3540 This multiplies two flonums together.
3543 This copies a bignum.
3547 Here is a table of all the machine-specific files (this includes
3548 both source and header files). Typically, there is a
3549 @var{machine}.c file, a @var{machine}-opcode.h file, and an
3550 atof-@var{machine}.c file. The @var{machine}-opcode.h file should
3551 be identical to the one used by GDB (which uses it for disassembly).
3556 This contains code to turn a flonum into an IEEE literal constant.
3557 This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3560 This is the opcode-table for the i386 version of the assembler.
3563 This contains all the code for the i386 version of the assembler.
3566 This defines constants and macros used by the i386 version of the assembler.
3569 Generic 68020 header file. To be linked to m68k.h on a
3570 non-sun3, non-hpux system.
3573 68010 header file for Sun2 workstations. Not well tested. To be linked
3574 to m68k.h on a sun2. (See also @samp{-DSUN_ASM_SYNTAX} in the
3578 68020 header file for Sun3 workstations. To be linked to m68k.h before
3579 compiling on a Sun3 system. (See also @samp{-DSUN_ASM_SYNTAX} in the
3583 68020 header file for an HPUX (system 5?) box. Which box, which
3584 version of HPUX, etc? I don't know.
3587 A hard- or symbolic- link to one of @file{m-generic.h},
3588 @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3589 680x0 you are assembling for. (See also @samp{-DSUN_ASM_SYNTAX} in the
3593 Opcode table for 68020. This is now a link to the opcode table
3594 in the @code{GDB} source directory.
3597 All the mc680x0 code, in one huge, slow-to-compile file.
3600 This contains the code for the ns32032/ns32532 version of the
3603 @item ns32k-opcode.h
3604 This contains the opcode table for the ns32032/ns32532 version
3608 Vax specific file for describing Vax operands and other Vax-ish things.
3614 Vax specific parts of @code{as}. Also includes the former files
3615 @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3618 Turns a flonum into a Vax constant.
3621 This file contains the special code needed to put out a VMS
3622 style object file for the Vax.
3626 Here is a list of the header files in the source directory.
3627 (Warning: This section may not be very accurate. I didn't
3628 write the header files; I just report them.) Also note that I
3629 think many of these header files could be cleaned up or
3635 This describes the structures used to create the binary header data
3636 inside the object file. Perhaps we should use the one in
3637 @file{/usr/include}?
3640 This defines all the globally useful things, and pulls in _0__<stdio.h>_1__
3641 and _0__<assert.h>_1__.
3644 This defines macros useful for dealing with bignums.
3647 Structure and macros for dealing with expression()
3650 This defines the structure for dealing with floating point
3651 numbers. It #includes @file{bignum.h}.
3654 This contains a macro for appending a byte to the current frag.
3657 Structures and function definitions for the hashing functions.
3660 Function headers for the input-file.c functions.
3663 Structures and function headers for things defined in the
3664 machine dependent part of the assembler.
3667 This is the GNU systemwide include file for manipulating obstacks.
3668 Since nobody is running under real GNU yet, we include this file.
3671 Macros and function headers for reading in source files.
3673 @item struct-symbol.h
3674 Structure definition and macros for dealing with the gas
3675 internal form of a symbol.
3678 Structure definition for dealing with the numbered subsegments
3679 of the text and data segments.
3682 Macros and function headers for dealing with symbols.
3685 Structure for doing segment fixups.
3688 @comment ~subsection Test Directory
3689 @comment (Note: The test directory seems to have disappeared somewhere
3690 @comment along the line. If you want it, you'll probably have to find a
3691 @comment REALLY OLD dump tape~dots{})
3693 @comment The ~file{test/} directory is used for regression testing.
3694 @comment After you modify ~@code{as}, you can get a quick go/nogo
3695 @comment confidence test by running the new ~@code{as} over the source
3696 @comment files in this directory. You use a shell script ~file{test/do}.
3698 @comment The tests in this suite are evolving. They are not comprehensive.
3699 @comment They have, however, caught hundreds of bugs early in the debugging
3700 @comment cycle of ~@code{as}. Most test statements in this suite were naturally
3701 @comment selected: they were used to demonstrate actual ~@code{as} bugs rather
3702 @comment than being written ~i{a priori}.
3704 @comment Another testing suggestion: over 30 bugs have been found simply by
3705 @comment running examples from this manual through ~@code{as}.
3706 @comment Some examples in this manual are selected
3707 @comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3709 @comment ~subsubsection Regression Testing
3710 @comment Each regression test involves assembling a file and comparing the
3711 @comment actual output of ~@code{as} to ``known good'' output files. Both
3712 @comment the object file and the error/warning message file (stderr) are
3713 @comment inspected. Optionally ~@code{as}' exit status may be checked.
3714 @comment Discrepancies are reported. Each discrepancy means either that
3715 @comment you broke some part of ~@code{as} or that the ``known good'' files
3716 @comment are now out of date and should be changed to reflect the new
3717 @comment definition of ``good''.
3719 @comment Each regression test lives in its own directory, in a tree
3720 @comment rooted in the directory ~file{test/}. Each such directory
3721 @comment has a name ending in ~file{.ret}, where `ret' stands for
3722 @comment REgression Test. The ~file{.ret} ending allows ~code{find
3723 @comment (1)} to find all regression tests in the tree, without
3724 @comment needing to list them explicitly.
3726 @comment Any ~file{.ret} directory must contain a file called
3727 @comment ~file{input} which is the source file to assemble. During
3728 @comment testing an object file ~file{output} is created, as well as
3729 @comment a file ~file{stdouterr} which contains the output to both
3730 @comment stdout and stderr. If there is a file ~file{output.good} in
3731 @comment the directory, and if ~file{output} contains exactly the
3732 @comment same data as ~file{output.good}, the file ~file{output} is
3733 @comment deleted. Likewise ~file{stdouterr} is removed if it exactly
3734 @comment matches a file ~file{stdouterr.good}. If file
3735 @comment ~file{status.good} is present, containing a decimal number
3736 @comment before a newline, the exit status of ~@code{as} is compared
3737 @comment to this number. If the status numbers are not equal, a file
3738 @comment ~file{status} is written to the directory, containing the
3739 @comment actual status as a decimal number followed by newline.
3741 @comment Should any of the ~file{*.good} files fail to match their corresponding
3742 @comment actual files, this is noted by a 1-line message on the screen during
3743 @comment the regression test, and you can use ~@code{find (1)} to find any
3744 @comment files named ~file{status}, ~file {output} or ~file{stdouterr}.
3746 @node Retargeting, License, Maintenance, Top
3747 @chapter Teaching the Assembler about a New Machine
3749 This chapter describes the steps required in order to make the
3750 assembler work with another machine's assembly language. This
3751 chapter is not complete, and only describes the steps in the
3752 broadest terms. You should look at the source for the
3753 currently supported machines in order to discover some of the
3754 details that aren't mentioned here.
3756 You should create a new file called @file{@var{machine}.c}, and
3757 add the appropriate lines to the file @file{Makefile} so that
3758 you can compile your new version of the assembler. This should
3759 be straightforward; simply add lines similar to the ones there
3760 for the four current versions of the assembler.
3762 If you want to be compatible with GDB (and the current
3763 machine-dependent versions of the assembler), you should create
3764 a file called @file{@var{machine}-opcode.h} which should
3765 contain all the information about the names of the machine
3766 instructions, their opcodes, and what addressing modes they
3767 support. If you do this right, the assembler and GDB can share
3768 this file, and you'll only have to write it once. Note that
3769 while you're writing @code{as}, you may want to use an
3770 independent program (if you have access to one), to make sure
3771 that @code{as} is emitting the correct bytes. Since @code{as}
3772 and @code{GDB} share the opcode table, an incorrect opcode
3773 table entry may make invalid bytes look OK when you disassemble
3774 them with @code{GDB}.
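As a rough illustration of what such a shared opcode file might hold, here is a minimal sketch in C. Everything in it is hypothetical: the structure @code{toy_opcode}, its fields, the three table entries and the helper @code{toy_lookup} are invented for this example and are not taken from any existing @file{@var{machine}-opcode.h}. It only shows the general shape: one entry per instruction giving its name, its opcode bits and a string describing the operands it accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opcode-table entry; real opcode files differ in detail.  */
struct toy_opcode
{
  const char *name;        /* mnemonic as written in the source file */
  unsigned long opcode;    /* bits emitted for the instruction */
  const char *operands;    /* one letter per operand: r = register, a = address */
};

/* Invented entries for an imaginary machine.  */
static const struct toy_opcode toy_opcodes[] =
{
  { "nop",  0x00, ""   },
  { "move", 0x10, "rr" },
  { "jump", 0x20, "a"  },
};

/* Return the table entry for NAME, or NULL if NAME is not a known
   instruction.  A linear search keeps the sketch short; the assembler
   proper would more likely hash the names once at start-up.  */
const struct toy_opcode *
toy_lookup (const char *name)
{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < sizeof toy_opcodes / sizeof toy_opcodes[0]; i++)
    if (strcmp (toy_opcodes[i].name, name) == 0)
      return &toy_opcodes[i];
  return NULL;
}
```

Because both the assembler and a disassembler would read the same table, a wrong @code{opcode} value in one entry would be encoded and decoded consistently, which is why checking the emitted bytes with an independent program is worthwhile.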
3776 @section Functions You will Have to Write
3778 Your file @file{@var{machine}.c} should contain definitions for
3779 the following functions and variables. It will need to include
3780 some header files in order to use some of the structures
3781 defined in the machine-independent part of the assembler. The
3782 needed header files are mentioned in the descriptions of the
3783 functions that will need them.
3788 This long integer holds the value to place at the beginning of
3789 the @file{a.out} file. It is usually @samp{OMAGIC}, except on
3790 machines that store additional information in the magic-number.
3792 @item char comment_chars[];
3793 This character array holds the values of the characters that
3794 start a comment anywhere in a line. Comments are stripped off
3795 automatically by the machine independent part of the
3796 assembler. Note that the @samp{/*} will always start a
3797 comment, and that only @samp{*/} will end a comment started by @samp{/*}.
3800 @item char line_comment_chars[];
3801 This character array holds the values of the chars that start a
3802 comment only if they are the first (non-whitespace) character
3803 on a line. If the character @samp{#} does not appear in this
3804 list, you may get unexpected results. (Various
3805 machine-independent parts of the assembler treat the comments
3806 @samp{#APP} and @samp{#NO_APP} specially, and assume that lines
3807 that start with @samp{#} are comments.)
3809 @item char EXP_CHARS[];
3810 This character array holds the letters that can separate the
3811 mantissa and the exponent of a floating point number. Typical
3812 values are @samp{e} and @samp{E}.
3814 @item char FLT_CHARS[];
3815 This character array holds the letters that--when they appear
3816 immediately after a leading zero--indicate that a number is a
3817 floating-point number. (Sort of how 0x indicates that a
3818 hexadecimal number follows.)
3820 @item pseudo_typeS md_pseudo_table[];
3821 (@var{pseudo_typeS} is defined in @file{md.h})
3822 This array contains a list of the machine-dependent directives
3823 the assembler must support. It contains the name of each
3824 pseudo op (without the leading @samp{.}), a pointer to a
3825 function to be called when that directive is encountered, and
3826 an integer argument to be passed to that function.
3828 @item void md_begin(void)
3829 This function is called as part of the assembler's
3830 initialization. It should do any initialization required by
3831 any of your other routines.
3833 @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
3834 This routine is called once for each option on the command line
3835 that the machine-independent part of @code{as} does not
3836 understand. This function should return non-zero if the option
3837 pointed to by @var{optionPTR} is a valid option. If it is not
3838 a valid option, this routine should return zero. The variables
3839 @var{argcPTR} and @var{argvPTR} are provided in case the option
3840 requires a filename or something similar as an argument. If
3841 the option is multi-character, @var{optionPTR} should be
3842 advanced past the end of the option, otherwise every letter in
3843 the option will be treated as a separate single-character
3846 @item void md_assemble(char *string)
3847 This routine is called for every machine-dependent
3848 non-directive line in the source file. It does all the real
3849 work involved in reading the opcode, parsing the operands,
3850 etc. @var{string} is a pointer to a null-terminated string,
3851 that comprises the input line, with all excess whitespace and
3854 @item void md_number_to_chars(char *outputPTR,long value,int nbytes)
3855 This routine is called to turn a C long int, short int, or char
3856 into the series of bytes that represents that number on the
3857 target machine. @var{outputPTR} points to an array where the
3858 result should be stored; @var{value} is the value to store; and
3859 @var{nbytes} is the number of bytes in @var{value} that should be
3862 @item void md_number_to_imm(char *outputPTR,long value,int nbytes)
3863 This routine is called to turn a C long int, short int, or char
3864 into the series of bytes that represent an immediate value on
3865 the target machine. It is identical to the function @code{md_number_to_chars},
3866 except on NS32K machines.@refill
3868 @item void md_number_to_disp(char *outputPTR,long value,int nbytes)
3869 This routine is called to turn a C long int, short int, or char
3870 into the series of bytes that represent a displacement value on
3871 the target machine. It is identical to the function @code{md_number_to_chars},
3872 except on NS32K machines.@refill
3874 @item void md_number_to_field(char *outputPTR,long value,int nbytes)
3875 This routine is identical to @code{md_number_to_chars},
3876 except on NS32K machines.
3878 @item void md_ri_to_chars(struct relocation_info *riPTR,ri)
3879 (@code{struct relocation_info} is defined in @file{a.out.h})
3880 This routine emits the relocation info in @var{ri}
3881 in the appropriate bit-pattern for the target machine.
3882 The result should be stored in the location pointed
3883 to by @var{riPTR}. This routine may be a no-op unless you are
3884 attempting to do cross-assembly.
3886 @item char *md_atof(char type,char *outputPTR,int *sizePTR)
3887 This routine turns a series of digits into the appropriate
3888 internal representation for a floating-point number.
3889 @var{type} is a character from @var{FLT_CHARS[]} that describes
3890 what kind of floating point number is wanted; @var{outputPTR}
3891 is a pointer to an array that the result should be stored in;
3892 and @var{sizePTR} is a pointer to an integer where the size (in
3893 bytes) of the result should be stored. This routine should
3894 return an error message, or an empty string (not (char *)0) for success.
3897 @item int md_short_jump_size;
3898 This variable holds the (maximum) size in bytes of a short (16
3899 bit or so) jump created by @code{md_create_short_jump()}. This
3900 variable is used as part of the broken-word feature, and isn't
3901 needed if the assembler is compiled with
3902 @samp{-DWORKING_DOT_WORD}.
3904 @item int md_long_jump_size;
3905 This variable holds the (maximum) size in bytes of a long (32
3906 bit or so) jump created by @code{md_create_long_jump()}. This
3907 variable is used as part of the broken-word feature, and isn't
3908 needed if the assembler is compiled with
3909 @samp{-DWORKING_DOT_WORD}.
3911 @item void md_create_short_jump(char *resultPTR,long from_addr,
3912 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3913 This function emits a jump from @var{from_addr} to @var{to_addr} in
3914 the array of bytes pointed to by @var{resultPTR}. If this creates a
3915 type of jump that must be relocated, this function should call
3916 @code{fix_new()} with @var{frag} and @var{to_symbol}. The jump
3917 emitted by this function may be smaller than @var{md_short_jump_size},
3918 but it must never create a larger one.
3919 (If it creates a smaller jump, the extra bytes of memory will not be
3920 used.) This function is used as part of the broken-word feature,
3921 and isn't needed if the assembler is compiled with
3922 @samp{-DWORKING_DOT_WORD}.@refill
3924 @item void md_create_long_jump(char *ptr,long from_addr,
3925 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3926 This function is similar to the previous function,
3927 @code{md_create_short_jump()}, except that it creates a long
3928 jump instead of a short one. This function is used as part of
3929 the broken-word feature, and isn't needed if the assembler is
3930 compiled with @samp{-DWORKING_DOT_WORD}.
3932 @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
3933 This function does the initial setting up for relaxation. This
3934 includes forcing references to still-undefined symbols to the
3935 appropriate addressing modes.
3937 @item relax_typeS md_relax_table[];
3938 (@var{relax_typeS} is defined in @file{md.h})
3939 This array describes the various machine dependent states a
3940 frag may be in before relaxation. You will need one group of
3941 entries for each type of addressing mode you intend to relax.
3943 @item void md_convert_frag(fragS *fragPTR)
3944 (@var{fragS} is defined in @file{as.h})
3945 This routine does the required cleanup after relaxation.
3946 Relaxation has changed the type of the frag to a type that can
3947 reach its destination. This function should adjust the opcode
3948 of the frag to use the appropriate addressing mode.
3949 @var{fragPTR} points to the frag to clean up.
3951 @item void md_end(void)
3952 This function is called just before the assembler exits. It
3953 need not free up memory unless the operating system doesn't do
3954 it automatically on exit. (In which case you'll also have to
3955 track down all the other places where the assembler allocates
3956 space but never frees it.)
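To make a few of the items above concrete, here is a minimal sketch in C of the character tables and of @code{md_number_to_chars} for an imaginary big-endian target. The characters placed in the tables and the byte order are assumptions chosen for illustration; they are not copied from any existing port, and a real @file{@var{machine}.c} would choose whatever its assembly syntax and target require.

```c
#include <assert.h>

/* Illustrative table contents for an imaginary target (assumptions).  */
char comment_chars[] = "|";       /* starts a comment anywhere in a line */
char line_comment_chars[] = "#";  /* starts a comment only at line start */
char EXP_CHARS[] = "eE";          /* separate mantissa from exponent */
char FLT_CHARS[] = "fFdD";        /* follow a leading zero, as in 0f1.5 */

/* Store the low NBYTES bytes of VALUE at OUTPUTPTR, most significant
   byte first.  A little-endian target would fill the array in the
   opposite order.  */
void
md_number_to_chars (char *outputPTR, long value, int nbytes)
{
  unsigned long bits = (unsigned long) value;
  int i;

  assert (nbytes > 0);
  for (i = nbytes - 1; i >= 0; i--)
    {
      outputPTR[i] = (char) (bits & 0xff);
      bits >>= 8;
    }
}
```

Working on an unsigned copy of @var{value} keeps the shifts well defined for negative numbers, so a two-byte store of @minus{}2 yields the bytes 0xff 0xfe.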
3960 @section External Variables You will Need to Use
3962 You will need to refer to or change the following external variables
3963 from within the machine-dependent part of the assembler.
3966 @item extern char flagseen[];
3967 This array holds non-zero values in locations corresponding to
3968 the options that were on the command line. Thus, if the
3969 assembler was called with @samp{-W}, @var{flagseen['W']} would be non-zero.
3972 @item extern fragS *frag_now;
3973 This pointer points to the current frag--the frag that bytes
3974 are currently being added to. If nothing else, you will need
3975 to pass it as an argument to various machine-independent
3976 functions. It is maintained automatically by the
3977 frag-manipulating functions; you should never have to change it
3980 @item extern LITTLENUM_TYPE generic_bignum[];
3981 (@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
3982 This is where @dfn{bignums}--numbers larger than 32 bits--are
3983 returned when they are encountered in an expression. You will
3984 need to use this if you need to implement directives (or
3985 anything else) that must deal with these large numbers.
3986 @code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
3987 @file{as.h}), and have a positive @code{X_add_number}. The
3988 @code{X_add_number} of a @code{bignum} is the number of
3989 @code{LITTLENUMS} in @var{generic_bignum} that the number takes
3992 @item extern FLONUM_TYPE generic_floating_point_number;
3993 (@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
3994 This is where @dfn{flonums}--floating-point numbers within
3995 expressions--are returned. @code{Flonums} are of @code{segT}
3996 @code{SEG_BIG}, and have a negative @code{X_add_number}.
3997 @code{Flonums} are returned in a generic format. You will have
3998 to write a routine to turn this generic format into the
3999 appropriate floating-point format for your machine.
4001 @item extern int need_pass_2;
4002 If this variable is non-zero, the assembler has encountered an
4003 expression that cannot be assembled in a single pass. Since
4004 the second pass isn't implemented, this flag means that the
4005 assembler is punting, and is only looking for additional syntax
4006 errors. (Or something like that.)
4008 @item extern segT now_seg;
4009 This variable holds the value of the segment the assembler is
4010 currently assembling into.
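The way a bignum occupies @var{generic_bignum} can be sketched as follows. This is a toy model under stated assumptions: littlenums are taken to be 16 bits wide and stored least significant first, the array length is arbitrary, and the helpers @code{toy_store_bignum} and @code{toy_read_bignum} are hypothetical names that do not exist in the assembler. The count returned by the first helper plays the role of the positive @code{X_add_number} described above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t LITTLENUM_TYPE;   /* assumption: 16-bit littlenums */
#define TOY_LITTLENUM_BITS 16      /* hypothetical name */
#define TOY_BIGNUM_LENGTH 20       /* hypothetical array length */

LITTLENUM_TYPE generic_bignum[TOY_BIGNUM_LENGTH];

/* Split VALUE into littlenums, least significant first, and return how
   many littlenums were needed.  */
int
toy_store_bignum (uint64_t value)
{
  int count = 0;

  do
    {
      assert (count < TOY_BIGNUM_LENGTH);
      generic_bignum[count++] = (LITTLENUM_TYPE) (value & 0xffff);
      value >>= TOY_LITTLENUM_BITS;
    }
  while (value != 0);
  return count;
}

/* Reassemble the first COUNT littlenums (at most four, so that the
   result fits in 64 bits) into a single integer.  */
uint64_t
toy_read_bignum (int count)
{
  uint64_t value = 0;

  assert (count >= 0 && count <= 4);
  while (count-- > 0)
    value = (value << TOY_LITTLENUM_BITS) | generic_bignum[count];
  return value;
}
```

A directive that emits large constants would walk the array in the same way as @code{toy_read_bignum}, but would write each littlenum out in the target's byte order instead of packing it into a host integer.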
4014 @section External Functions You will Need
4016 You will find the following external functions useful (or
4017 indispensable) when you're writing the machine-dependent part of the assembler.
4022 @item char *frag_more(int bytes)
4023 This function allocates @var{bytes} more bytes in the current
4024 frag (or starts a new frag, if it can't expand the current frag
4025 any more) for you to store some object-file bytes in. It
4026 returns a pointer to the bytes, ready for you to store data in.
4028 @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
4029 This function stores a relocation fixup to be acted on later.
4030 @var{frag} points to the frag the relocation belongs in;
4031 @var{where} is the location within the frag where the relocation begins;
4032 @var{size} is the size of the relocation, and is usually 1 (a single byte),
4033 2 (sixteen bits), or 4 (a longword).
4034 The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset} is added to the byte(s)
4035 at _0__@var{frag->literal[where]}_1__. If @var{pcrel} is non-zero, the address of the
4036 location is subtracted from the result. A relocation entry is also added
4037 to the @file{a.out} file. @var{add_symbol}, @var{sub_symbol}, and/or
4038 @var{offset} may be NULL.@refill
4040 @item char *frag_var(relax_stateT type, int max_chars, int var,
4041 @code{relax_substateT subtype, symbolS *symbol, char *opcode)}
4042 This function creates a machine-dependent frag of type @var{type}
4043 (usually @code{rs_machine_dependent}).
4044 @var{max_chars} is the maximum size in bytes that the frag may grow by;
4045 @var{var} is the current size of the variable end of the frag;
4046 @var{subtype} is the sub-type of the frag. The sub-type is used to index into
4047 @var{md_relax_table[]} during @code{relaxation}.
4048 @var{symbol} is the symbol whose value should be used when relaxing this frag.
4049 @var{opcode} points into a byte whose value may have to be modified if the
4050 addressing mode used by this frag changes. It typically points into the
4051 @var{fr_literal[]} of the previous frag, and is used to point to a location
4052 that @code{md_convert_frag()} may have to change.@refill
4054 @item void frag_wane(fragS *fragPTR)
4055 This function is useful from within @code{md_convert_frag}. It
4056 changes a frag to type @code{rs_fill}, and sets the variable-sized
4057 piece of the frag to zero. The frag will never change in size
4060 @item segT expression(expressionS *retval)
4061 (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
4062 This function parses the string pointed to by the external char
4063 pointer @var{input_line_pointer}, and returns the segment-type
4064 of the expression. It also stores the results in the
4065 @var{expressionS} pointed to by @var{retval}.
4066 @var{input_line_pointer} is advanced to point past the end of
4067 the expression. (@var{input_line_pointer} is used by other
4068 parts of the assembler. If you modify it, be sure to restore
4069 it to its original value.)
4071 @item as_warn(char *message,@dots{})
4072 If warning messages are disabled, this function does nothing.
4073 Otherwise, it prints out the current file name, and the current
4074 line number, then uses @code{fprintf} to print the
4075 @var{message} and any arguments it was passed.
4077 @item as_bad(char *message,@dots{})
4078 This function should be called when @code{as} encounters
4079 conditions that are bad enough that @code{as} should not
4080 produce an object file, but should continue reading input and
4081 printing warning and bad error messages.
4083 @item as_fatal(char *message,@dots{})
4084 This function prints out the current file name and line number,
4085 prints the word @samp{FATAL:}, then uses @code{fprintf} to
4086 print the @var{message} and any arguments it was passed. Then
4087 the assembler exits. This function should only be used for
4088 serious, unrecoverable errors.
4090 @item void float_const(int float_type)
4091 This function reads floating-point constants from the current
4092 input line, and calls @code{md_atof} to assemble them. It is
4093 useful as the function to call for the directives
4094 @samp{.single}, @samp{.double}, @samp{.float}, etc.
4095 @var{float_type} must be a character from @var{FLT_CHARS}.
4097 @item void demand_empty_rest_of_line(void);
4098 This function can be used by machine-dependent directives to
4099 make sure the rest of the input line is empty. It prints a
4100 warning message if there are additional characters on the line.
4102 @item long int get_absolute_expression(void)
4103 This function can be used by machine-dependent directives to
4104 read an absolute number from the current input line. It
4105 returns the result. If it isn't given an absolute expression,
4106 it prints a warning message and returns zero.
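The arithmetic that @code{fix_new} asks the assembler to perform later can be sketched in isolation. The code below is a toy model, not the assembler's fixup code: @code{struct toy_symbol} and @code{toy_fixup_value} are hypothetical, symbols are reduced to their final values, and @var{location} stands for the address of the bytes being patched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol: only its final value matters here.  */
struct toy_symbol
{
  long value;
};

/* Compute add_symbol - sub_symbol + offset, treating a NULL symbol as
   absent.  If PCREL is non-zero, LOCATION (the address of the patched
   bytes) is subtracted as well, which makes the result relative to the
   place where it is stored.  */
long
toy_fixup_value (const struct toy_symbol *add_symbol,
                 const struct toy_symbol *sub_symbol,
                 long offset, int pcrel, long location)
{
  long result = offset;

  assert (location >= 0);
  if (add_symbol != NULL)
    result += add_symbol->value;
  if (sub_symbol != NULL)
    result -= sub_symbol->value;
  if (pcrel)
    result -= location;
  return result;
}
```

For example, with @var{add_symbol} at 0x120, @var{sub_symbol} at 0x100 and an @var{offset} of 4, the value added to the bytes is 0x24; a pc-relative reference to 0x120 stored at location 0x100 gives 0x20.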
4111 @section The concept of Frags
4113 This assembler works to optimize the size of certain addressing
4114 modes (e.g., branch instructions). This means the size of many
4115 pieces of object code cannot be determined until after assembly
4116 is finished. (This means that the addresses of symbols cannot be
4117 determined until assembly is finished.) In order to do this,
4118 @code{as} stores the output bytes as @dfn{frags}.
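Why addresses have to wait can be seen in a toy model of the idea. The sketch below is an assumption-laden simplification and not the assembler's relaxation code: @code{struct toy_frag}, its field names and @code{toy_relax} are hypothetical, every branch is assumed to have a displacement of either 1 or 4 bytes, and a 1-byte displacement is assumed to reach from @minus{}128 to 127 bytes past the end of its frag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced frag: a run of bytes of known size followed
   by a branch displacement whose size is still open.  */
struct toy_frag
{
  long address;             /* valid only after toy_relax returns */
  int fixed_size;           /* bytes whose size is already known */
  int var_size;             /* displacement size: 0 (none), 1 or 4 */
  struct toy_frag *target;  /* frag the branch aims at, or NULL */
  struct toy_frag *next;    /* next frag in the chain */
};

/* Assign addresses to the chain starting at FIRST, widen every 1-byte
   displacement that cannot reach its target, and repeat until a pass
   changes nothing.  Sizes only grow, so the loop terminates.  */
void
toy_relax (struct toy_frag *first)
{
  int changed;

  assert (first != NULL);
  do
    {
      long address = 0;
      struct toy_frag *f;

      changed = 0;
      for (f = first; f != NULL; f = f->next)
        {
          f->address = address;
          address += f->fixed_size + f->var_size;
        }
      for (f = first; f != NULL; f = f->next)
        if (f->target != NULL && f->var_size == 1)
          {
            long end = f->address + f->fixed_size + f->var_size;
            long distance = f->target->address - end;

            if (distance < -128 || distance > 127)
              {
                f->var_size = 4;
                changed = 1;
              }
          }
    }
  while (changed);
}
```

Widening one branch moves every later frag, which may push another branch out of range; that is why the addresses are recomputed on each pass and are trustworthy only once a pass leaves all sizes unchanged.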
4120 Here is the definition of a frag (from @file{as.h}):
4126 relax_stateT fr_type;
4127 relax_substateT fr_substate;
4128 unsigned long fr_address;
4130 struct symbol *fr_symbol;
4132 struct frag *fr_next;
4139 is the size of the fixed-size piece of the frag.
4142 is the maximum (?) size of the variable-sized piece of the frag.
4145 is the type of the frag.
4150 rs_machine_dependent
4153 This stores the type of machine-dependent frag this is. (what
4154 kind of addressing mode is being used, and what size is being
4158 @var{fr_address} is only valid after relaxation is finished.
4159 Before relaxation, the only way to store an address is (pointer
4160 to frag containing the address) plus (offset into the frag).
4163 This contains a number, whose meaning depends on the type of
4165 For machine_dependent frags, this contains the offset from
4166 fr_symbol that the frag wants to go to. Thus, for branch
4167 instructions it is usually zero. (unless the instruction was
4168 @samp{jba foo+12} or something like that.)
4171 For machine_dependent frags, this points to the symbol the frag
4175 This points to the location in the frag (or in a previous frag)
4176 of the opcode for the instruction that caused this to be a frag.
4177 @var{fr_opcode} is needed if the actual opcode must be changed
4178 in order to use a different form of the addressing mode.
4179 (For example, if a conditional branch only comes in size tiny,
4180 a large-size branch could be implemented by reversing the sense
4181 of the test, and turning it into a tiny branch over a large jump.
4182 This would require changing the opcode.)
4184 @var{fr_literal} is a variable-size array that contains the
4185 actual object bytes. A frag consists of a fixed size piece of
4186 object data, (which may be zero bytes long), followed by a
4187 piece of object data whose size may not have been determined
4188 yet. Other information includes the type of the frag (which
4189 controls how it is relaxed),
4192 This is the next frag in the singly-linked list. This is
4193 usually only needed by the machine-independent part of
4199 @node License, , Retargeting, Top
4200 @unnumbered GNU GENERAL PUBLIC LICENSE
4201 @center Version 1, February 1989
4204 Copyright @copyright{} 1989 Free Software Foundation, Inc.
4205 675 Mass Ave, Cambridge, MA 02139, USA
4207 Everyone is permitted to copy and distribute verbatim copies
4208 of this license document, but changing it is not allowed.
4211 @unnumberedsec Preamble
4213 The license agreements of most software companies try to keep users
4214 at the mercy of those companies. By contrast, our General Public
4215 License is intended to guarantee your freedom to share and change free
4216 software---to make sure the software is free for all its users. The
4217 General Public License applies to the Free Software Foundation's
4218 software and to any other program whose authors commit to using it.
4219 You can use it for your programs, too.
4221 When we speak of free software, we are referring to freedom, not
4222 price. Specifically, the General Public License is designed to make
4223 sure that you have the freedom to give away or sell copies of free
4224 software, that you receive source code or can get it if you want it,
4225 that you can change the software or use pieces of it in new free
4226 programs; and that you know you can do these things.
4228 To protect your rights, we need to make restrictions that forbid
4229 anyone to deny you these rights or to ask you to surrender the rights.
4230 These restrictions translate to certain responsibilities for you if you
4231 distribute copies of the software, or if you modify it.
4233 For example, if you distribute copies of a such a program, whether
4234 gratis or for a fee, you must give the recipients all the rights that
4235 you have. You must make sure that they, too, receive or can get the
4236 source code. And you must tell them their rights.
4238 We protect your rights with two steps: (1) copyright the software, and
4239 (2) offer you this license which gives you legal permission to copy,
4240 distribute and/or modify the software.
4242 Also, for each author's protection and ours, we want to make certain
4243 that everyone understands that there is no warranty for this free
4244 software. If the software is modified by someone else and passed on, we
4245 want its recipients to know that what they have is not the original, so
4246 that any problems introduced by others will not reflect on the original
4247 authors' reputations.
4249 The precise terms and conditions for copying, distribution and
4250 modification follow.
4253 @unnumberedsec TERMS AND CONDITIONS
4256 @center TERMS AND CONDITIONS
4261 This License Agreement applies to any program or other work which
4262 contains a notice placed by the copyright holder saying it may be
4263 distributed under the terms of this General Public License. The
4264 ``Program'', below, refers to any such program or work, and a ``work based
4265 on the Program'' means either the Program or any work containing the
4266 Program or a portion of it, either verbatim or with modifications. Each
4267 licensee is addressed as ``you''.
4270 You may copy and distribute verbatim copies of the Program's source
4271 code as you receive it, in any medium, provided that you conspicuously and
4272 appropriately publish on each copy an appropriate copyright notice and
4273 disclaimer of warranty; keep intact all the notices that refer to this
4274 General Public License and to the absence of any warranty; and give any
4275 other recipients of the Program a copy of this General Public License
4276 along with the Program. You may charge a fee for the physical act of
4277 transferring a copy.
4280 You may modify your copy or copies of the Program or any portion of
4281 it, and copy and distribute such modifications under the terms of Paragraph
4282 1 above, provided that you also do the following:
4286 cause the modified files to carry prominent notices stating that
4287 you changed the files and the date of any change; and
4290 cause the whole of any work that you distribute or publish, that
4291 in whole or in part contains the Program or any part thereof, either
4292 with or without modifications, to be licensed at no charge to all
4293 third parties under the terms of this General Public License (except
4294 that you may choose to grant warranty protection to some or all
4295 third parties, at your option).
4298 If the modified program normally reads commands interactively when
4299 run, you must cause it, when started running for such interactive use
4300 in the simplest and most usual way, to print or display an
4301 announcement including an appropriate copyright notice and a notice
4302 that there is no warranty (or else, saying that you provide a
4303 warranty) and that users may redistribute the program under these
4304 conditions, and telling the user how to view a copy of this General Public License.
4308 You may charge a fee for the physical act of transferring a
4309 copy, and you may at your option offer warranty protection in exchange for a fee.
4313 Mere aggregation of another independent work with the Program (or its
4314 derivative) on a volume of a storage or distribution medium does not bring
4315 the other work under the scope of these terms.
4318 You may copy and distribute the Program (or a portion or derivative of
4319 it, under Paragraph 2) in object code or executable form under the terms of
4320 Paragraphs 1 and 2 above provided that you also do one of the following:
4324 accompany it with the complete corresponding machine-readable
4325 source code, which must be distributed under the terms of
4326 Paragraphs 1 and 2 above; or,
4329 accompany it with a written offer, valid for at least three
4330 years, to give any third party free (except for a nominal charge
4331 for the cost of distribution) a complete machine-readable copy of the
4332 corresponding source code, to be distributed under the terms of
4333 Paragraphs 1 and 2 above; or,
4336 accompany it with the information you received as to where the
4337 corresponding source code may be obtained. (This alternative is
4338 allowed only for noncommercial distribution and only if you
4339 received the program in object code or executable form alone.)
4342 Source code for a work means the preferred form of the work for making
4343 modifications to it. For an executable file, complete source code means
4344 all the source code for all modules it contains; but, as a special
4345 exception, it need not include source code for modules which are standard
4346 libraries that accompany the operating system on which the executable
4347 file runs, or for standard header files or definitions files that
4348 accompany that operating system.
4351 You may not copy, modify, sublicense, distribute or transfer the
4352 Program except as expressly provided under this General Public License.
4353 Any attempt otherwise to copy, modify, sublicense, distribute or transfer
4354 the Program is void, and will automatically terminate your rights to use
4355 the Program under this License. However, parties who have received
4356 copies, or rights to use copies, from you under this General Public
4357 License will not have their licenses terminated so long as such parties
4358 remain in full compliance.
4361 By copying, distributing or modifying the Program (or any work based
4362 on the Program) you indicate your acceptance of this license to do so,
4363 and all its terms and conditions.
4366 Each time you redistribute the Program (or any work based on the
4367 Program), the recipient automatically receives a license from the original
4368 licensor to copy, distribute or modify the Program subject to these
4369 terms and conditions. You may not impose any further restrictions on the
4370 recipients' exercise of the rights granted herein.
4373 The Free Software Foundation may publish revised and/or new versions
4374 of the General Public License from time to time. Such new versions will
4375 be similar in spirit to the present version, but may differ in detail to
4376 address new problems or concerns.
4378 Each version is given a distinguishing version number. If the Program
4379 specifies a version number of the license which applies to it and ``any
4380 later version'', you have the option of following the terms and conditions
4381 either of that version or of any later version published by the Free
4382 Software Foundation. If the Program does not specify a version number of
4383 the license, you may choose any version ever published by the Free Software Foundation.
4387 If you wish to incorporate parts of the Program into other free
4388 programs whose distribution conditions are different, write to the author
4389 to ask for permission. For software which is copyrighted by the Free
4390 Software Foundation, write to the Free Software Foundation; we sometimes
4391 make exceptions for this. Our decision will be guided by the two goals
4392 of preserving the free status of all derivatives of our free software and
4393 of promoting the sharing and reuse of software generally.
4396 @heading NO WARRANTY
4403 BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4404 FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
4405 OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4406 PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4407 OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4408 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
4409 TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
4410 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4411 REPAIR OR CORRECTION.
4414 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4415 ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4416 REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4417 INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4418 ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4419 LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4420 SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4421 WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4422 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4426 @heading END OF TERMS AND CONDITIONS
4429 @center END OF TERMS AND CONDITIONS
4433 @unnumberedsec How to Apply These Terms to Your New Programs
4435 If you develop a new program, and you want it to be of the greatest
4436 possible use to humanity, the best way to achieve this is to make it
4437 free software which everyone can redistribute and change under these terms.
4440 To do so, attach the following notices to the program. It is safest to
4441 attach them to the start of each source file to most effectively convey
4442 the exclusion of warranty; and each file should have at least the
4443 ``copyright'' line and a pointer to where the full notice is found.
4446 @var{one line to give the program's name and a brief idea of what it does.}
4447 Copyright (C) 19@var{yy} @var{name of author}
4449 This program is free software; you can redistribute it and/or modify
4450 it under the terms of the GNU General Public License as published by
4451 the Free Software Foundation; either version 1, or (at your option) any later version.
4454 This program is distributed in the hope that it will be useful,
4455 but WITHOUT ANY WARRANTY; without even the implied warranty of
4456 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4457 GNU General Public License for more details.
4459 You should have received a copy of the GNU General Public License
4460 along with this program; if not, write to the Free Software
4461 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4464 Also add information on how to contact you by electronic and paper mail.
4466 If the program is interactive, make it output a short notice like this
4467 when it starts in an interactive mode:
4470 Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4471 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4472 This is free software, and you are welcome to redistribute it
4473 under certain conditions; type `show c' for details.
4476 The hypothetical commands `show w' and `show c' should show the
4477 appropriate parts of the General Public License. Of course, the
4478 commands you use may be called something other than `show w' and `show
4479 c'; they could even be mouse-clicks or menu items---whatever suits your program.
4482 You should also get your employer (if you work as a programmer) or your
4483 school, if any, to sign a ``copyright disclaimer'' for the program, if
4484 necessary. Here is a sample; alter the names:
4487 Yoyodyne, Inc., hereby disclaims all copyright interest in the
4488 program `Gnomovision' (a program to direct compilers to make passes
4489 at assemblers) written by James Hacker.
4491 @var{signature of Ty Coon}, 1 April 1989
4492 Ty Coon, President of Vice
4495 That's all there is to it!