Git Repo - binutils.git/blame - gas/doc/as.texinfo
1\input texinfo
2@c @tex
3@c \special{twoside}
4@c @end tex
09352a5d
RP
5_if__(_ALL_ARCH__)
6@setfilename as.info
7_fi__(_ALL_ARCH__)
8_if__(_M680X0__ && !_ALL_ARCH__)
9@setfilename as-m680x0.info
10_fi__(_M680X0__ && !_ALL_ARCH__)
11_if__(_AMD29K__ && !_ALL_ARCH__)
12@setfilename as-29k.info
13_fi__(_AMD29K__ && !_ALL_ARCH__)
14@c
15@c NOTE: this manual is marked up for preprocessing with a collection
16@c of m4 macros called "pretex.m4". If you see <_if__> and <_fi__>
17@c scattered around the source, you have the full source before
18@c preprocessing; if you don't, you have the source configured for some
19@c particular architecture (and you can of course get the full source,
20@c with all configurations, from wherever you got this). The full
21@c source needs to be run through m4 before either tex- or info-
22@c formatting: for example,
23@c m4 pretex.m4 none.m4 m680x0.m4 as.texinfo >as-680x0.texinfo
24@c will produce (assuming your path finds either GNU or SysV m4;
25@c Berkeley won't do) a file suitable for formatting.
26@c See the text in "pretex.m4" for a fuller explanation (and the macro
27@c definitions).
28@c
47342e8f
RP
29@synindex ky cp
30@ifinfo
31This file documents the GNU Assembler "as".
32
33Copyright (C) 1991 Free Software Foundation, Inc.
34
35Permission is granted to make and distribute verbatim copies of
36this manual provided the copyright notice and this permission notice
37are preserved on all copies.
38
39@ignore
40Permission is granted to process this file through Tex and print the
41results, provided the printed document carries copying permission
42notice identical to this one except for the removal of this paragraph
43(this paragraph not being relevant to the printed manual).
44
45@end ignore
46Permission is granted to copy and distribute modified versions of this
47manual under the conditions for verbatim copying, provided also that the
48section entitled ``GNU General Public License'' is included exactly as
49in the original, and provided that the entire resulting derived work is
50distributed under the terms of a permission notice identical to this
51one.
52
53Permission is granted to copy and distribute translations of this manual
54into another language, under the above conditions for modified versions,
55except that the section entitled ``GNU General Public License'' may be
56included in a translation approved by the author instead of in the
57original English.
58@end ifinfo
63f5d795
RP
59@tex
60@finalout
61@end tex
f4335d56 62@smallbook
47342e8f 63@setchapternewpage odd
09352a5d
RP
64_if__(_M680X0__)
65@settitle Using GNU as (680x0)
66_fi__(_M680X0__)
67_if__(_AMD29K__)
b50e59fe 68@settitle Using GNU as (AMD 29K)
09352a5d 69_fi__(_AMD29K__)
93b45514 70@titlepage
b50e59fe 71@title{Using GNU as}
47342e8f 72@subtitle{The GNU Assembler}
09352a5d
RP
73_if__(_M680X0__)
74@subtitle{for Motorola 680x0}
75_fi__(_M680X0__)
76_if__(_AMD29K__)
b50e59fe 77@subtitle{for the AMD 29K family}
09352a5d 78_fi__(_AMD29K__)
93b45514 79@sp 1
b50e59fe 80@subtitle February 1991
93b45514
RP
81@sp 13
82The Free Software Foundation Inc. thanks The Nice Computer
83Company of Australia for loaning Dean Elsner to write the
84first (Vax) version of @code{as} for Project GNU.
85The proprietors, management and staff of TNCCA thank FSF for
86distracting the boss while they got some work
87done.
88@sp 3
47342e8f
RP
89@author{Dean Elsner, Jay Fenlason & friends}
90@author{revised by Roland Pesch for Cygnus Support}
91@c [email protected]
92@page
93@tex
94\def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
95\xdef\manvers{\$Revision$} % For use in headers, footers too
96{\parskip=0pt
97\hfill Cygnus Support\par
98\hfill \manvers\par
99\hfill \TeX{}info \texinfoversion\par
100}
b50e59fe
RP
101%"boxit" macro for figures:
102%Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
103\gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
104 \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
105#2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
106\gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
47342e8f 107@end tex
93b45514 108
47342e8f
RP
109@vskip 0pt plus 1filll
110Copyright @copyright{} 1991 Free Software Foundation, Inc.
93b45514
RP
111
112Permission is granted to make and distribute verbatim copies of
113this manual provided the copyright notice and this permission notice
114are preserved on all copies.
115
93b45514 116Permission is granted to copy and distribute modified versions of this
47342e8f
RP
117manual under the conditions for verbatim copying, provided also that the
118section entitled ``GNU General Public License'' is included exactly as
119in the original, and provided that the entire resulting derived work is
120distributed under the terms of a permission notice identical to this
121one.
93b45514
RP
122
123Permission is granted to copy and distribute translations of this manual
47342e8f
RP
124into another language, under the above conditions for modified versions,
125except that the section entitled ``GNU General Public License'' may be
126included in a translation approved by the author instead of in the
127original English.
93b45514 128@end titlepage
47342e8f
RP
129@page
130
b50e59fe 131@node Top, Overview, (dir), (dir)
47342e8f 132
93b45514 133@menu
b50e59fe
RP
134* Overview:: Overview
135* Syntax:: Syntax
136* Segments:: Segments and Relocation
137* Symbols:: Symbols
138* Expressions:: Expressions
139* Pseudo Ops:: Assembler Directives
140* Maintenance:: Maintaining the Assembler
141* Retargeting:: Teaching the Assembler about a New Machine
142* License:: GNU GENERAL PUBLIC LICENSE
143
144 --- The Detailed Node Listing ---
145
146Overview
147
148* Invoking:: Invoking @code{as}
149* Manual:: Structure of this Manual
150* GNU Assembler:: as, the GNU Assembler
151* Command Line:: Command Line
152* Input Files:: Input Files
153* Object:: Output (Object) File
154* Errors:: Error and Warning Messages
155* Options:: Options
156
157Input Files
158
159* Filenames:: Input Filenames and Line-numbers
160
161Syntax
162
163* Pre-processing:: Pre-processing
164* Whitespace:: Whitespace
165* Comments:: Comments
166* Symbol Intro:: Symbols
167* Statements:: Statements
168* Constants:: Constants
169
170Constants
171
172* Characters:: Character Constants
173* Numbers:: Number Constants
174
175Character Constants
176
177* Strings:: Strings
178* Chars:: Characters
179
180Segments and Relocation
181
182* Segs Background:: Background
183* ld Segments:: ld Segments
184* as Segments:: as Internal Segments
185* Sub-Segments:: Sub-Segments
186* bss:: bss Segment
187
195Symbols
196
197* Labels:: Labels
198* Setting Symbols:: Giving Symbols Other Values
199* Symbol Names:: Symbol Names
200* Dot:: The Special Dot Symbol
201* Symbol Attributes:: Symbol Attributes
202
203Symbol Names
204
205* Local Symbols:: Local Symbol Names
206
207Symbol Attributes
208
209* Symbol Value:: Value
210* Symbol Type:: Type
211* Symbol Desc:: Descriptor
212* Symbol Other:: Other
213
214Expressions
215
216* Empty Exprs:: Empty Expressions
217* Integer Exprs:: Integer Expressions
218
219Integer Expressions
220
221* Arguments:: Arguments
222* Operators:: Operators
223* Prefix Ops:: Prefix Operators
224* Infix Ops:: Infix Operators
225
226Assembler Directives
227
228* Abort:: The Abort directive causes as to abort
229* Align:: Pad the location counter to a power of 2
230* App-File:: Set the logical file name
231* Ascii:: Fill memory with bytes of ASCII characters
232* Asciz:: Fill memory with bytes of ASCII characters followed
233 by a null.
234* Byte:: Fill memory with 8-bit integers
235* Comm:: Reserve public space in the BSS segment
236* Data:: Change to the data segment
237* Desc:: Set the n_desc of a symbol
238* Double:: Fill memory with double-precision floating-point numbers
239* Else:: @code{.else}
240* End:: @code{.end}
241* Endif:: @code{.endif}
242* Equ:: @code{.equ @var{symbol}, @var{expression}}
243* Extern:: @code{.extern}
244* Fill:: Fill memory with repeated values
245* Float:: Fill memory with single-precision floating-point numbers
246* Global:: Make a symbol visible to the linker
247* Ident:: @code{.ident}
248* If:: @code{.if @var{absolute expression}}
249* Include:: @code{.include "@var{file}"}
250* Int:: Fill memory with 32-bit integers
251* Lcomm:: Reserve private space in the BSS segment
252* Line:: Set the logical line number
253* Ln:: @code{.ln @var{line-number}}
254* List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
255* Long:: Fill memory with 32-bit integers
256* Lsym:: Create a local symbol
257* Octa:: Fill memory with 128-bit integers
258* Org:: Change the location counter
259* Quad:: Fill memory with 64-bit integers
260* Set:: Set the value of a symbol
261* Short:: Fill memory with 16-bit integers
262* Single:: @code{.single @var{flonums}}
263* Stab:: Store debugging information
264* Text:: Change to the text segment
b50e59fe 265* Word:: Fill memory with 32-bit integers
b50e59fe
RP
266* Deprecated:: Deprecated Directives
267* Machine Options:: Options
268* Machine Syntax:: Syntax
269* Floating Point:: Floating Point
270* Machine Directives:: Machine Directives
271* Opcodes:: Opcodes
272
273Machine Directives
274
275* block:: @code{.block @var{size} , @var{fill}}
276* cputype:: @code{.cputype}
277* file:: @code{.file}
278* hword:: @code{.hword @var{expressions}}
279* line:: @code{.line}
280* reg:: @code{.reg @var{symbol}, @var{expression}}
281* sect:: @code{.sect}
282* use:: @code{.use @var{segment name}}
93b45514 283@end menu
47342e8f 284
b50e59fe
RP
285@node Overview, Syntax, Top, Top
286@chapter Overview
287
47342e8f 288This manual is a user guide to the GNU assembler @code{as}.
09352a5d
RP
289_if__(_M680X0__)
290This version of the manual describes @code{as} configured to generate
291code for Motorola 680x0 architectures.
292_fi__(_M680X0__)
293_if__(_AMD29K__)
b50e59fe
RP
294This version of the manual describes @code{as} configured to generate
295code for Advanced Micro Devices' 29K architectures.
09352a5d 296_fi__(_AMD29K__)
b50e59fe
RP
297
298@menu
299* Invoking:: Invoking @code{as}
300* Manual:: Structure of this Manual
301* GNU Assembler:: as, the GNU Assembler
302* Command Line:: Command Line
303* Input Files:: Input Files
304* Object:: Output (Object) File
305* Errors:: Error and Warning Messages
306* Options:: Options
307@end menu
47342e8f 308
b50e59fe
RP
309@node Invoking, Manual, Overview, Overview
310@section Invoking @code{as}
47342e8f 311
b50e59fe
RP
312Here is a brief summary of how to invoke GNU @code{as}. For details,
313@pxref{Options}.
314
315@c We don't use @deffn and friends for the following because they seem
316@c to be limited to one line for the header.
47342e8f 317@example
 as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ]
09352a5d
RP
319_if__(_M680X0__)
320 [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
321_fi__(_M680X0__)
322_if__(_AMD29K__)
323@c am29k has no machine-dependent assembler options
324_fi__(_AMD29K__)
47342e8f
RP
325 [ -- | @var{files} @dots{} ]
326@end example
327
328@table @code
b50e59fe
RP
329
330@item -D
331This option is accepted only for script compatibility with calls to
332other assemblers; it has no effect on GNU @code{as}.
333
47342e8f
RP
334@item -f
335``fast''---skip preprocessing (assume source is compiler output)
336
b50e59fe
RP
337@item -I @var{path}
338Add @var{path} to the search list for @code{.include} directives
339
47342e8f 340@item -k
09352a5d 341_if__(_AMD29K__)
b50e59fe 342This option is accepted but has no effect on the 29K family.
09352a5d
RP
343_fi__(_AMD29K__)
344_if__(!_AMD29K__)
345Issue warnings when difference tables altered for long displacements
346_fi__(!_AMD29K__)
47342e8f
RP
347
348@item -L
349Keep (in symbol table) local symbols, starting with @samp{L}
350
351@item -o @var{objfile}
352Name the object-file output from @code{as}
353
354@item -R
355Fold data segment into text segment
356
357@item -W
b50e59fe 358Suppress warning messages
47342e8f 359
09352a5d
RP
360_if__(_M680X0__)
361@item -l
362Shorten references to undefined symbols, to one word instead of two
363
364@item -mc68000 | -mc68010 | -mc68020
365Specify what processor in the 68000 family is the target (default 68020)
366_fi__(_M680X0__)
47342e8f
RP
367
368@item -- | @var{files} @dots{}
369Source files to assemble, or standard input
370@end table
371
b50e59fe 372@node Manual, GNU Assembler, Invoking, Overview
47342e8f
RP
373@section Structure of this Manual
374This document is intended to describe what you need to know to use GNU
375@code{as}. We cover the syntax expected in source files, including
376notation for symbols, constants, and expressions; the directives that
377@code{as} understands; and of course how to invoke @code{as}.
378
09352a5d
RP
379_if__(_M680X0__ && !_ALL_ARCH__)
380We also cover special features in the 68000 configuration of @code{as},
381including pseudo-operations.
382_fi__(_M680X0__ && !_ALL_ARCH__)
383_if__(_AMD29K__ && !_ALL_ARCH__)
b50e59fe
RP
384We also cover special features in the AMD 29K configuration of @code{as},
385including assembler directives.
09352a5d 386_fi__(_AMD29K__ && !_ALL_ARCH__)
47342e8f 387
09352a5d
RP
388_if__(_ALL_ARCH__)
389This document also describes some of the machine-dependent features of
390various flavors of the assembler.
391_fi__(_ALL_ARCH__)
392_if__(_INTERNALS__)
93b45514
RP
393This document also describes how the assembler works internally, and
394provides some information that may be useful to people attempting to
395port the assembler to another machine.
09352a5d 396_fi__(_INTERNALS__)
93b45514 397
47342e8f 398On the other hand, this manual is @emph{not} intended as an introduction
b50e59fe
RP
399to programming in assembly language---let alone programming in general!
400In a similar vein, we make no attempt to introduce the machine
47342e8f
RP
401architecture; we do @emph{not} describe the instruction set, standard
402mnemonics, registers or addressing modes that are standard to a
403particular architecture. You may want to consult the manufacturer's
b50e59fe
RP
404machine architecture manual for this information.
405
93b45514 406
47342e8f
RP
407@c I think this is [email protected], 17jan1991
408@ignore
93b45514
RP
409Throughout this document, we assume that you are running @dfn{GNU},
410the portable operating system from the @dfn{Free Software
411Foundation, Inc.}. This restricts our attention to certain kinds of
47342e8f 412computer (in particular, the kinds of computers that GNU can run on);
93b45514
RP
413once this assumption is granted examples and definitions need less
414qualification.
415
93b45514
RP
416@code{as} is part of a team of programs that turn a high-level
417human-readable series of instructions into a low-level
418computer-readable series of instructions. Different versions of
09352a5d 419@code{as} are used for different kinds of computer.
47342e8f 420@end ignore
93b45514 421
b50e59fe
RP
422@c There used to be a section "Terminology" here, which defined
423@c "contents", "byte", "word", and "long". Defining "word" to any
424@c particular size is confusing when the .word directive may generate 16
425@c bits on one machine and 32 bits on another; in general, for the user
426@c version of this manual, none of these terms seem essential to define.
427@c They were used very little even in the former draft of the manual;
428@c this draft makes an effort to avoid them (except in names of
429@c directives).
93b45514 430
b50e59fe 431@node GNU Assembler, Command Line, Manual, Overview
93b45514 432@section as, the GNU Assembler
47342e8f
RP
433@code{as} is primarily intended to assemble the output of the GNU C
434compiler @code{gcc} for use by the linker @code{ld}. Nevertheless,
b50e59fe
RP
435we've tried to make @code{as} assemble correctly everything that the native
436assembler would.
09352a5d 437_if__(_VAX__)
b50e59fe 438Any exceptions are documented explicitly (@pxref{Machine Dependent}).
09352a5d 439_fi__(_VAX__)
b50e59fe
RP
440This doesn't mean @code{as} always uses the same syntax as another
441assembler for the same architecture; for example, we know of several
442incompatible versions of 680x0 assembly language syntax.
47342e8f
RP
443
444GNU @code{as} is really a family of assemblers. If you use (or have
445used) GNU @code{as} on another architecture, you should find a fairly
446similar environment. Each version has much in common with the others,
447including object file formats, most assembler directives (often called
93b45514
RP
@dfn{pseudo-ops}) and assembler syntax.
449
b50e59fe
RP
450Unlike older assemblers, @code{as} is designed to assemble a source
451program in one pass of the source file. This has a subtle impact on the
452@kbd{.org} directive (@pxref{Org}).
93b45514 453
b50e59fe
RP
454@node Command Line, Input Files, GNU Assembler, Overview
455@section Command Line
93b45514
RP
456
457After the program name @code{as}, the command line may contain
458options and file names. Options may be in any order, and may be
459before, after, or between file names. The order of file names is
460significant.
461
47342e8f 462@file{--} (two hyphens) by itself names the standard input file
b50e59fe 463explicitly, as one of the files for @code{as} to assemble.
47342e8f 464
93b45514
RP
465Except for @samp{--} any command line argument that begins with a
466hyphen (@samp{-}) is an option. Each option changes the behavior of
467@code{as}. No option changes the way another option works. An
47342e8f 468option is a @samp{-} followed by one or more letters; the case of
b50e59fe 469the letter is important. All options are optional.
93b45514
RP
470
471Some options expect exactly one file name to follow them. The file
472name may either immediately follow the option's letter (compatible
473with older assemblers) or it may be the next command argument (GNU
474standard). These two command lines are equivalent:
475
476@example
477as -o my-object-file.o mumble
478as -omy-object-file.o mumble
479@end example
480
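As an illustration only, the rules above can be modelled in a few lines
of Python. This is a sketch of the behaviour described in this section,
not the parsing code that @code{as} itself uses. The function name
@code{split_command_line}, the @code{"<stdin>"} placeholder, and the
assumption that @samp{-o} is the only option taking a file name are ours.

```python
def split_command_line(args, takes_filename=("o",)):
    """Split the arguments that follow the program name into options and files.

    Hypothetical helper: `takes_filename` lists the option letters that are
    assumed to expect exactly one file name.
    """
    options, files = [], []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Two hyphens by themselves name the standard input file.
            files.append("<stdin>")
        elif arg.startswith("-") and len(arg) > 1:
            letter, attached = arg[1], arg[2:]
            if letter in takes_filename:
                if not attached:
                    # GNU style: the file name is the next argument.
                    i += 1
                    if i == len(args):
                        raise ValueError("option -%s needs a file name" % letter)
                    attached = args[i]
                options.append((letter, attached))
            else:
                # Any other option is recorded without an argument.
                options.append((arg[1:], None))
        else:
            # An argument with no special meaning is an input file name.
            files.append(arg)
        i += 1
    return options, files
```

Under this model the two command lines in the example above produce the
same result, and options may appear before, after, or between file names
without changing the order of the files.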
b50e59fe 481@node Input Files, Object, Command Line, Overview
47342e8f 482@section Input Files
93b45514 483
47342e8f 484We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
93b45514
RP
485describe the program input to one run of @code{as}. The program may
486be in one or more files; how the source is partitioned into files
487doesn't change the meaning of the source.
488
b50e59fe
RP
489@c I added "con" prefix to "catenation" just to prove I can overcome my
490@c APL training... [email protected]
491The source program is a concatenation of the text in all the files, in the
47342e8f 492order specified.
93b45514
RP
493
494Each time you run @code{as} it assembles exactly one source
47342e8f 495program. The source program is made up of one or more files.
93b45514
RP
496(The standard input is also a file.)
497
498You give @code{as} a command line that has zero or more input file
499names. The input files are read (from left file name to right). A
500command line argument (in any position) that has no special meaning
47342e8f 501is taken to be an input file name.
93b45514 502
47342e8f
RP
503If @code{as} is given no file names it attempts to read one input file
504from @code{as}'s standard input, which is normally your terminal. You
505may have to type @key{ctl-D} to tell @code{as} there is no more program
506to assemble.
93b45514 507
47342e8f
RP
508Use @samp{--} if you need to explicitly name the standard input file
509in your command line.
93b45514 510
b50e59fe 511If the source is empty, @code{as} will produce a small, empty object
47342e8f 512file.
93b45514 513
b50e59fe
RP
514@menu
515* Filenames:: Input Filenames and Line-numbers
516@end menu
517
518@node Filenames, , Input Files, Input Files
93b45514 519@subsection Input Filenames and Line-numbers
47342e8f 520There are two ways of locating a line in the input file (or files) and both
93b45514
RP
521are used in reporting error messages. One way refers to a line
522number in a physical file; the other refers to a line number in a
47342e8f 523``logical'' file.
93b45514
RP
524
525@dfn{Physical files} are those files named in the command line given
526to @code{as}.
527
47342e8f
RP
528@dfn{Logical files} are simply names declared explicitly by assembler
529directives; they bear no relation to physical files. Logical file names
530help error messages reflect the original source file, when @code{as}
b50e59fe 531source is itself synthesized from other files. @xref{App-File}.
93b45514 532
b50e59fe 533@node Object, Errors, Input Files, Overview
93b45514
RP
534@section Output (Object) File
535Every time you run @code{as} it produces an output file, which is
536your assembly language program translated into numbers. This file
47342e8f 537is the object file, named @code{a.out} unless you tell @code{as} to
93b45514
RP
538give it another name by using the @code{-o} option. Conventionally,
539object file names end with @file{.o}. The default name of
47342e8f 540@file{a.out} is used for historical reasons: older assemblers were
93b45514 541capable of assembling self-contained programs directly into a
47342e8f
RP
542runnable program.
543@c This may still work, but hasn't been tested.
93b45514 544
47342e8f 545The object file is meant for input to the linker @code{ld}. It contains
b50e59fe
RP
546assembled program code, information to help @code{ld} integrate
547the assembled program into a runnable file, and (optionally) symbolic
47342e8f 548information for the debugger.
93b45514
RP
549
550@comment link above to some info file(s) like the description of a.out.
551@comment don't forget to describe GNU info as well as Unix lossage.
552
b50e59fe 553@node Errors, Options, Object, Overview
93b45514
RP
554@section Error and Warning Messages
555
b50e59fe
RP
556@code{as} may write warnings and error messages to the standard error
557file (usually your terminal). This should not happen when @code{as} is
558run automatically by a compiler. Warnings report an assumption made so
559that @code{as} could keep assembling a flawed program; errors report a
560grave problem that stops the assembly.
93b45514
RP
561
562Warning messages have the format
563@example
b50e59fe 564file_name:@b{NNN}:Warning Message Text
93b45514 565@end example
b50e59fe
RP
@noindent
(where @b{NNN} is a line number). If a logical file name has
567been given (@pxref{App-File}) it is used for the filename, otherwise the
568name of the current input file is used. If a logical line number was
63f5d795 569given
09352a5d
RP
570_if__(!_AMD29K__)
571(@pxref{Line})
572_fi__(!_AMD29K__)
573_if__(_AMD29K__)
63f5d795 574(@pxref{Ln})
09352a5d 575_fi__(_AMD29K__)
63f5d795 576then it is used to calculate the number printed,
b50e59fe
RP
577otherwise the actual line in the current source file is printed. The
message text is intended to be self-explanatory (in the grand Unix
63f5d795 579tradition). @refill
93b45514
RP
580
581Error messages have the format
582@example
b50e59fe 583file_name:@b{NNN}:FATAL:Error Message Text
93b45514 584@end example
47342e8f 585The file name and line number are derived as for warning
93b45514
RP
messages. The actual message text may be rather less explanatory
because many of these errors aren't supposed to happen.
588
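A script that collects assembler diagnostics could take the two formats
above apart roughly as follows. This is a minimal sketch that assumes
the file name itself contains no colon and that only error messages
carry the @samp{FATAL:} tag; the function name @code{parse_diagnostic}
is hypothetical.

```python
def parse_diagnostic(line):
    """Split one diagnostic line into (file_name, line_number, kind, text)."""
    # Assumption: the file name contains no ':' character.
    file_name, line_number, rest = line.split(":", 2)
    if rest.startswith("FATAL:"):
        # Error messages carry an extra FATAL: field before the text.
        return file_name, int(line_number), "error", rest[len("FATAL:"):]
    return file_name, int(line_number), "warning", rest
```

The file name and line number returned here are whichever ones
@code{as} printed, so they may be logical rather than physical.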
63f5d795 589@group
b50e59fe 590@node Options, , Errors, Overview
47342e8f 591@section Options
b50e59fe
RP
592@subsection @code{-D}
593This option has no effect whatsoever, but it is accepted to make it more
594likely that scripts written for other assemblers will also work with
595GNU @code{as}.
63f5d795 596@end group
b50e59fe
RP
597
598@subsection Work Faster: @code{-f}
93b45514 599@samp{-f} should only be used when assembling programs written by a
47342e8f 600(trusted) compiler. @samp{-f} stops the assembler from pre-processing
b50e59fe
RP
601the input file(s) before assembling them.
602@quotation
603@emph{Warning:} if the files actually need to be pre-processed (if they
604contain comments, for example), @code{as} will not work correctly if
605@samp{-f} is used.
606@end quotation
607
608@subsection Add to @code{.include} search path: @code{-I} @var{path}
609Use this option to add a @var{path} to the list of directories GNU
610@code{as} will search for files specified in @code{.include} directives
611(@pxref{Include}). You may use @code{-I} as many times as necessary to
612include a variety of paths. The current working directory is always
613searched first; after that, @code{as} searches any @samp{-I} directories
614in the same order as they were specified (left to right) on the command
615line.
616
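The search order just described can be sketched as follows. This is an
illustration of the documented order, not the lookup code inside
@code{as}; the function name @code{find_include} and its @code{exists}
parameter are hypothetical.

```python
import os


def find_include(name, include_dirs, cwd=".", exists=os.path.isfile):
    """Return the first matching path for `name`, or None if there is none.

    Hypothetical helper: `include_dirs` holds the -I paths in command-line
    order; `exists` can be replaced to try the search without real files.
    """
    # The current working directory is searched first, then each -I
    # directory from left to right.
    for directory in [cwd, *include_dirs]:
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

A file present both in the working directory and in a @samp{-I}
directory is therefore always taken from the working directory.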
617@subsection Warn if difference tables altered: @code{-k}
09352a5d 618_if__(_AMD29K__)
b50e59fe
RP
619On the AMD 29K family, this option is allowed, but has no effect. It is
620permitted for compatibility with GNU @code{as} on other platforms,
621where it can be used to warn when @code{as} alters the machine code
622generated for @samp{.word} directives in difference tables. The AMD 29K
623family does not have the addressing limitations that sometimes lead to this
624alteration on other platforms.
09352a5d 625_fi__(_AMD29K__)
b50e59fe 626
09352a5d 627_if__(!_AMD29K__)
47342e8f
RP
628@code{as} sometimes alters the code emitted for directives of the form
629@samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
630You can use the @samp{-k} option if you want a warning issued when this
631is done.
09352a5d 632_fi__(!_AMD29K__)
47342e8f 633
b50e59fe
RP
634@subsection Include Local Labels: @code{-L}
Labels beginning with @samp{L} (upper case only) are called @dfn{local
labels}. @xref{Symbol Names}. Such labels are intended for the use of
programs (like compilers) that compose assembler programs, not for your
notice. By default both @code{as} and @code{ld} discard them, so you do
not normally see them when debugging.
93b45514
RP
641
642This option tells @code{as} to retain those @samp{L@dots{}} symbols
643in the object file. Usually if you do this you also tell the linker
644@code{ld} to preserve symbols whose names begin with @samp{L}.
645
b50e59fe 646@subsection Name the Object File: @code{-o}
93b45514
RP
647There is always one object file output when you run @code{as}. By
648default it has the name @file{a.out}. You use this option (which
649takes exactly one filename) to give the object file a different name.
650
651Whatever the object file is called, @code{as} will overwrite any
652existing file of the same name.
653
f4335d56 654@subsection Data Segment into Text Segment: @code{-R}
93b45514
RP
655@code{-R} tells @code{as} to write the object file as if all
656data-segment data lives in the text segment. This is only done at
657the very last moment: your binary data are the same, but data
658segment parts are relocated differently. The data segment part of
your object file is zero bytes long because all its bytes are
660appended to the text segment. (@xref{Segments}.)
661
b50e59fe 662When you specify @code{-R} it would be possible to generate shorter
47342e8f
RP
663address displacements (because we don't have to cross between text and
664data segment). We don't do this simply for compatibility with older
b50e59fe 665versions of @code{as}. In future, @code{-R} may work this way.
93b45514 666
b50e59fe 667@subsection Suppress Warnings: @code{-W}
93b45514
RP
668@code{as} should never give a warning or error message when
669assembling compiler output. But programs written by people often
670cause @code{as} to give a warning that a particular assumption was
671made. All such warnings are directed to the standard error file.
47342e8f
RP
672If you use this option, no warnings are issued. This option only
affects the warning messages: it does not change any detail of how
93b45514
RP
674@code{as} assembles your file. Errors, which stop the assembly, are
675still reported.
676
b50e59fe 677@node Syntax, Segments, Overview, Top
47342e8f
RP
678@chapter Syntax
679This chapter describes the machine-independent syntax allowed in a
680source file. @code{as} syntax is similar to what many other assemblers
use; it is inspired by the BSD 4.2
09352a5d 682_if__(!_VAX__)
b50e59fe 683assembler. @refill
09352a5d
RP
684_fi__(!_VAX__)
685_if__(_VAX__)
686assembler, except that @code{as} does not assemble Vax bit-fields.
687_fi__(_VAX__)
b50e59fe
RP
688
689@menu
690* Pre-processing:: Pre-processing
691* Whitespace:: Whitespace
692* Comments:: Comments
693* Symbol Intro:: Symbols
694* Statements:: Statements
695* Constants:: Constants
696@end menu
93b45514 697
b50e59fe
RP
698@node Pre-processing, Whitespace, Syntax, Syntax
699@section Pre-processing
93b45514 700
b50e59fe
RP
701The pre-processor:
702@itemize @bullet
703@item
704adjusts and removes extra whitespace. It leaves one space or tab before
705the keywords on a line, and turns any other whitespace on the line into
706a single space.
93b45514 707
b50e59fe
RP
708@item
709removes all comments, replacing them with a single space, or an
710appropriate number of newlines.
93b45514 711
b50e59fe
RP
712@item
713converts character constants into the appropriate numeric values.
714@end itemize
715
716Excess whitespace, comments, and character constants
93b45514
RP
717cannot be used in the portions of the input text that are not
718pre-processed.
719
b50e59fe
RP
720If the first line of an input file is @code{#NO_APP} or the @samp{-f}
721option is given, the input file will not be pre-processed. Within such
722an input file, parts of the file can be pre-processed by putting a line
723that says @code{#APP} before the text that should be pre-processed, and
724putting a line that says @code{#NO_APP} after them. This feature is
mainly intended to support @code{asm} statements in compilers whose output
726normally does not need to be pre-processed.
93b45514 727
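One way to picture the effect of @code{#APP} and @code{#NO_APP} is to
mark, for each input line, whether the pre-processor would see it. The
sketch below is a simplified model under two assumptions of ours: the
marker lines themselves are never treated as ordinary text, and a marker
must stand alone on its line. The function name
@code{preprocessed_flags} is hypothetical.

```python
def preprocessed_flags(lines, fast=False):
    """Return one boolean per line: True if the line would be pre-processed.

    `fast` stands for the -f option. Marker lines are reported as False
    (an assumption of this sketch).
    """
    first_is_no_app = bool(lines) and lines[0].strip() == "#NO_APP"
    # Pre-processing starts switched off if -f is given or if the first
    # line of the file is #NO_APP.
    active = not fast and not first_is_no_app
    flags = []
    for line in lines:
        marker = line.strip()
        if marker == "#APP":
            active = True
            flags.append(False)
        elif marker == "#NO_APP":
            active = False
            flags.append(False)
        else:
            flags.append(active)
    return flags
```

For compiler output that starts with @code{#NO_APP}, only the lines
between an @code{#APP} and the following @code{#NO_APP} are flagged.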
b50e59fe 728@node Whitespace, Comments, Pre-processing, Syntax
93b45514
RP
729@section Whitespace
730@dfn{Whitespace} is one or more blanks or tabs, in any order.
731Whitespace is used to separate symbols, and to make programs neater
732for people to read. Unless within character constants
b50e59fe 733(@pxref{Characters}), any whitespace means the same as exactly one
93b45514
RP
734space.
735
b50e59fe 736@node Comments, Symbol Intro, Whitespace, Syntax
93b45514
RP
737@section Comments
738There are two ways of rendering comments to @code{as}. In both
739cases the comment is equivalent to one space.
740
47342e8f
RP
741Anything from @samp{/*} through the next @samp{*/} is a comment.
742This means you may not nest these comments.
93b45514
RP
743
744@example
745/*
746 The only way to include a newline ('\n') in a comment
747 is to use this sort of comment.
748*/
47342e8f 749
93b45514
RP
750/* This sort of comment does not nest. */
751@end example
752
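The rule that a comment runs from @samp{/*} to the @emph{next}
@samp{*/} can be sketched with a non-greedy pattern. This is an
illustration, not the pre-processor's own code: it always substitutes a
single space (it does not preserve newlines), and it takes no account
of @samp{/*} appearing inside string or character constants. The
function name @code{strip_block_comments} is hypothetical.

```python
import re


def strip_block_comments(text):
    """Replace each /* ... */ comment in `text` with one space."""
    # The non-greedy .*? stops at the first */, so comments do not nest;
    # DOTALL lets a comment span several lines.
    return re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
```

Applied to @code{a/*x/*y*/z*/b}, this leaves @code{a z*/b}: the comment
ends at the first @samp{*/}, and the rest is ordinary text.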
753Anything from the @dfn{line comment} character to the next newline
47342e8f 754is considered a comment and is ignored. The line comment character is
09352a5d
RP
755_if__(_VAX__)
756@samp{#} on the Vax;
757_fi__(_VAX__)
758_if__(_M680X0__)
759@samp{|} on the 680x0;
760_fi__(_M680X0__)
761_if__(_AMD29K__)
762@samp{;} for the AMD 29K family;
763_fi__(_AMD29K__)
764@pxref{Machine Dependent}. @refill
765
766_if__(_ALL_ARCH__)
b50e59fe
RP
767On some machines there are two different line comment characters. One
768will only begin a comment if it is the first non-whitespace character on
769a line, while the other will always begin a comment.
09352a5d 770_fi__(_ALL_ARCH__)
93b45514
RP
771
772To be compatible with past assemblers a special interpretation is
773given to lines that begin with @samp{#}. Following the @samp{#} an
774absolute expression (@pxref{Expressions}) is expected: this will be
775the logical line number of the @b{next} line. Then a string
776(@xref{Strings}.) is allowed: if present it is a new logical file
777name. The rest of the line, if any, should be whitespace.
778
779If the first non-whitespace characters on the line are not numeric,
780the line is ignored. (Just like a comment.)
781@example
782 # This is an ordinary comment.
783# 42-6 "new_file_name" # New logical file name
784 # This is logical line # 36.
785@end example
786This feature is deprecated, and may disappear from future versions
787of @code{as}.
788
@node Symbol Intro, Statements, Comments, Syntax
@section Symbols
A @dfn{symbol} is one or more characters chosen from the set of all
letters (both upper and lower case), digits and the three characters
@samp{_.$}. No symbol may begin with a digit. Case is significant.
There is no length limit: all characters are significant. Symbols are
delimited by characters not in that set, or by the beginning of a file
(since the source program must end with a newline, the end of a file is
not a possible symbol delimiter). @xref{Symbols}.

@node Statements, Constants, Symbol Intro, Syntax
@section Statements
A @dfn{statement} ends at a newline character (@samp{\n})
_if__(!_AMD29K__)
or at a semicolon (@samp{;}). The newline or semicolon
_fi__(!_AMD29K__)
_if__(_AMD29K__)
or at an ``at'' sign (@samp{@@}). The newline or at sign
_fi__(_AMD29K__)
is considered part
of the preceding statement. Newlines
_if__(!_AMD29K__)
and semicolons
_fi__(!_AMD29K__)
_if__(_AMD29K__)
and at signs
_fi__(_AMD29K__)
within
character constants are an exception: they don't end statements.
It is an error to end any statement with end-of-file: the last
character of any input file should be a newline.@refill

You may write a statement on more than one line if you put a
backslash (@kbd{\}) immediately in front of any newlines within the
statement. When @code{as} reads a backslashed newline both
characters are ignored. You can even put backslashed newlines in
the middle of symbol names without changing the meaning of your
source program.

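As a hedged illustration of the continuation rule just described (the
labels and values are invented for this sketch), the following two
statements should assemble the same four words, because the
backslashed newline is discarded before the statement is parsed:
@example
table_a: .long 1, 2, 3, 4

table_b: .long 1, 2, \
3, 4
@end example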
An empty statement is allowed, and may include whitespace. It is ignored.

@c "key symbol" is not used elsewhere in the document; seems pedantic to
@c @defn{} it in that case, as was done previously... [email protected],
@c 13feb91.
A statement begins with zero or more labels, optionally followed by a
key symbol which determines what kind of statement it is. The key
symbol determines the syntax of the rest of the statement. If the
symbol begins with a dot @samp{.} then the statement is an assembler
directive: typically valid for any computer. If the symbol begins with
a letter the statement is an assembly language @dfn{instruction}: it
will assemble into a machine language instruction. Different versions
of @code{as} for different computers will recognize different
instructions. In fact, the same symbol may represent a different
instruction in a different computer's assembly language.

A label is a symbol immediately followed by a colon (@code{:}).
Whitespace before a label or after a colon is permitted, but you may not
have whitespace between a label's symbol and its colon. @xref{Labels}.

@example
label:     .directive    followed by something
another$label:           # This is an empty statement.
           instruction   operand_1, operand_2, @dots{}
@end example

@node Constants, , Statements, Syntax
@section Constants
A constant is a number, written so that its value is known by
inspection, without knowing any context. Like this:
@smallexample
.byte  74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
.ascii "Ring the bell\7"                  # A string constant.
.octa  0x123456789abcdef0123456789ABCDEF0 # A bignum.
.float 0f-314159265358979323846264338327\
95028841971.693993751E-40                 # - pi, a flonum.
@end smallexample

@menu
* Characters::                  Character Constants
* Numbers::                     Number Constants
@end menu

@node Characters, Numbers, Constants, Constants
@subsection Character Constants
There are two kinds of character constants. A @dfn{character} stands
for one character in one byte and its value may be used in
numeric expressions. String constants (properly called string
@emph{literals}) are potentially many bytes and their values may not be
used in arithmetic expressions.

@menu
* Strings::                     Strings
* Chars::                       Characters
@end menu

@node Strings, Chars, Characters, Characters
@subsubsection Strings
A @dfn{string} is written between double-quotes. It may contain
double-quotes or null characters. The way to get special characters
into a string is to @dfn{escape} these characters: precede them with
a backslash @samp{\} character. For example @samp{\\} represents
one backslash: the first @code{\} is an escape which tells
@code{as} to interpret the second character literally as a backslash
(which prevents @code{as} from recognizing the second @code{\} as an
escape character). The complete list of escapes follows.

@table @kbd
@c @item \a
@c Mnemonic for ACKnowledge; for ASCII this is octal code 007.
@item \b
Mnemonic for backspace; for ASCII this is octal code 010.
@c @item \e
@c Mnemonic for EOText; for ASCII this is octal code 004.
@item \f
Mnemonic for FormFeed; for ASCII this is octal code 014.
@item \n
Mnemonic for newline; for ASCII this is octal code 012.
@c @item \p
@c Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
@item \r
Mnemonic for carriage-Return; for ASCII this is octal code 015.
@c @item \s
@c Mnemonic for space; for ASCII this is octal code 040. Included for compliance with
@c other assemblers.
@item \t
Mnemonic for horizontal Tab; for ASCII this is octal code 011.
@c @item \v
@c Mnemonic for Vertical tab; for ASCII this is octal code 013.
@c @item \x @var{digit} @var{digit} @var{digit}
@c A hexadecimal character code. The numeric code is 3 hexadecimal digits.
@item \ @var{digit} @var{digit} @var{digit}
An octal character code. The numeric code is 3 octal digits.
For compatibility with other Unix systems, 8 and 9 are accepted as digits:
for example, @code{\008} has the value octal 010, and @code{\009} the
value octal 011.
@item \\
Represents one @samp{\} character.
@c @item \'
@c Represents one @samp{'} (accent acute) character.
@c This is needed in single character literals
@c (@xref{Characters}.) to represent
@c a @samp{'}.
@item \"
Represents one @samp{"} character. Needed in strings to represent
this character, because an unescaped @samp{"} would end the string.
@item \ @var{anything-else}
Any other character when escaped by @kbd{\} will give a warning, but
assemble as if the @samp{\} were not present. The idea is that if
you used an escape sequence you clearly didn't want the literal
interpretation of the following character. However @code{as} has no
other interpretation, so @code{as} knows it is giving you the wrong
code and warns you of the fact.
@end table

Which characters are escapable, and what those escapes represent,
vary widely among assemblers. The current set is what we think
BSD 4.2 @code{as} recognizes, and is a subset of what most C
compilers recognize. If you are in doubt, don't use an escape
sequence.

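A brief sketch of a few of the escapes listed above, assuming an ASCII
target (the strings themselves are invented for illustration):
@example
.ascii "tab\there"        /* \t assembles as octal 011 */
.ascii "say \"hi\"\n"     /* \" keeps the string open; \n is octal 012 */
.ascii "back\\slash"      /* \\ assembles as a single backslash */
.ascii "bell\007"         /* three octal digits: octal 007 */
@end example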
@node Chars, , Strings, Characters
@subsubsection Characters
A single character may be written as a single quote immediately
followed by that character. The same escapes apply to characters as
to strings. So if you want to write the character backslash, you
must write @kbd{'\\} where the first @code{\} escapes the second
@code{\}. As you can see, the quote is an acute accent, not a
grave accent. A newline
_if__(!_AMD29K__)
(or semicolon @samp{;})
_fi__(!_AMD29K__)
_if__(_AMD29K__)
(or at sign @samp{@@})
_fi__(_AMD29K__)
immediately
following an acute accent is taken as a literal character and does
not count as the end of a statement. The value of a character
constant in a numeric expression is the machine's byte-wide code for
that character. @code{as} assumes your character code is ASCII: @kbd{'A}
means 65, @kbd{'B} means 66, and so on. @refill

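For instance, assuming ASCII as stated above, each of the following
statements should assemble the same byte value twice (a sketch; the
characters are chosen only for illustration):
@example
.byte 'A, 65      /* the character A and its ASCII code */
.byte '\\, 92     /* an escaped backslash and its ASCII code */
.byte '\n, 10     /* newline, octal 012 */
@end example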
@node Numbers, , Characters, Constants
@subsection Number Constants
@code{as} distinguishes three kinds of numbers according to how they
are stored in the target machine. @emph{Integers} are numbers that
would fit into an @code{int} in the C language. @emph{Bignums} are
integers, but they are stored in more than 32 bits. @emph{Flonums}
are floating point numbers, described below.

@subsubsection Integers
A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
the binary digits @samp{01}.

An octal integer is @samp{0} followed by zero or more of the octal
digits (@samp{01234567}).

A decimal integer starts with a non-zero digit followed by zero or
more digits (@samp{0123456789}).

A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.

Integers have the usual values. To denote a negative integer, use
the prefix operator @samp{-} discussed under expressions
(@pxref{Prefix Ops}).

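As an illustrative sketch of the four notations (the value is chosen
arbitrarily), each operand on the first line below denotes the integer
ten:
@example
.long 0b1010, 012, 10, 0xA
.long -10         /* negative: the prefix operator applied to 10 */
@end example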
@subsubsection Bignums
A @dfn{bignum} has the same syntax and semantics as an integer
except that the number (or its negative) takes more than 32 bits to
represent in binary. The distinction is made because in some places
integers are permitted while bignums are not.

@subsubsection Flonums
A @dfn{flonum} represents a floating point number. The translation is
complex: a decimal floating point number from the text is converted by
@code{as} to a generic binary floating point number of more than
sufficient precision. This generic floating point number is converted
to a particular computer's floating point format (or formats) by a
portion of @code{as} specialized to that computer.

A flonum is written by writing (in order)
@itemize @bullet
@item
The digit @samp{0}.
@item
_if__(_AMD29K__)
One of the letters @samp{DFPRSX} (in upper or lower case), to tell
@code{as} the rest of the number is a flonum.
_fi__(_AMD29K__)
_if__(!_AMD29K__)
A letter, to tell @code{as} the rest of the number is a flonum. @kbd{e}
is recommended. Case is not important. (Any otherwise illegal letter
will work here, but that might be changed. The Vax BSD 4.2 assembler
seems to allow any of @samp{defghDEFGH}.)
_fi__(!_AMD29K__)
@item
An optional sign: either @samp{+} or @samp{-}.
@item
An optional @dfn{integer part}: zero or more decimal digits.
@item
An optional @dfn{fraction part}: @samp{.} followed by zero
or more decimal digits.
@item
An optional exponent, consisting of:
@itemize @bullet
@item
_if__(_AMD29K__)
An @samp{E} or @samp{e}.
_fi__(_AMD29K__)
_if__(!_AMD29K__)
A letter; the exact significance varies according to
the computer that executes the program. @code{as}
accepts any letter for now. Case is not important.
_fi__(!_AMD29K__)
@item
Optional sign: either @samp{+} or @samp{-}.
@item
One or more decimal digits.
@end itemize
@end itemize

At least one of @var{integer part} or @var{fraction part} must be
present. The floating point number has the usual base-10 value.

@code{as} does all processing using integers. Flonums are computed
independently of any floating point hardware in the computer running
@code{as}.

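A hedged sketch of flonums built from the parts listed earlier in this
subsection, using the recommended letter @samp{e} (the values are
arbitrary; on the AMD 29K family one of the letters @samp{DFPRSX} would
be needed instead):
@example
.float  0e1.5         /* integer part and fraction part */
.float  0e-.25        /* sign, then fraction part only */
.double 0e6.02E23     /* integer part, fraction part and exponent */
@end example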
@node Segments, Symbols, Syntax, Top
@chapter Segments and Relocation
@menu
* Segs Background::             Background
* ld Segments::                 ld Segments
* as Segments::                 as Internal Segments
* Sub-Segments::                Sub-Segments
* bss::                         bss Segment
@end menu

@node Segs Background, ld Segments, Segments, Segments
@section Background
Roughly, a segment is a range of addresses, with no gaps; all data
``in'' those addresses is treated the same for some particular purpose.
For example there may be a ``read only'' segment.

The linker @code{ld} reads many object files (partial programs) and
combines their contents to form a runnable program. When @code{as}
emits an object file, the partial program is assumed to start at address
0. @code{ld} will assign the final addresses the partial program
occupies, so that different partial programs don't overlap. This is
actually an over-simplification, but it will suffice to explain how
@code{as} uses segments.

@code{ld} moves blocks of bytes of your program to their run-time
addresses. These blocks slide to their run-time addresses as rigid
units; their length does not change and neither does the order of bytes
within them. Such a rigid unit is called a @emph{segment}. Assigning
run-time addresses to segments is called @dfn{relocation}. It includes
the task of adjusting mentions of object-file addresses so they refer to
the proper run-time addresses.

An object file written by @code{as} has three segments, any of which may
be empty. These are named @dfn{text}, @dfn{data} and @dfn{bss}
segments. Within the object file, the text segment starts at
address @code{0}, the data segment follows, and the bss segment
follows the data segment.

To let @code{ld} know which data will change when the segments are
relocated, and how to change that data, @code{as} also writes to the
object file details of the relocation needed. To perform relocation
@code{ld} must know, each time an address in the object
file is mentioned:
@itemize @bullet
@item
Where in the object file is the beginning of this reference to
an address?
@item
How long (in bytes) is this reference?
@item
Which segment does the address refer to? What is the numeric value of
@display
(@var{address}) @minus{} (@var{start-address of segment})?
@end display
@item
Is the reference to an address ``Program-Counter relative''?
@end itemize

In fact, every address @code{as} ever uses is expressed as
@code{(@var{segment}) + (@var{offset into segment})}. Further, every
expression @code{as} computes is of this segmented nature.
@dfn{Absolute expression} means an expression with segment ``absolute''
(@pxref{ld Segments}). A @dfn{pass1 expression} means an expression
with segment ``pass1'' (@pxref{as Segments}). In this manual we use the
notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
@var{segname}''.

Apart from text, data and bss segments you need to know about the
@dfn{absolute} segment. When @code{ld} mixes partial programs,
addresses in the absolute segment remain unchanged. That is, address
@code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
Although two partial programs' data segments will not overlap addresses
after linking, @emph{by definition} their absolute segments will overlap.
Address @code{@{absolute@ 239@}} in one partial program will always be the same
address when the program is running as address @code{@{absolute@ 239@}} in any
other partial program.

The idea of segments is extended to the @dfn{undefined} segment. Any
address whose segment is unknown at assembly time is by definition
rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
Since numbers are always defined, the only way to generate an undefined
address is to mention an undefined symbol. A reference to a named
common block would be such a symbol: its value is unknown at assembly
time so it has segment @emph{undefined}.

By analogy the word @emph{segment} is used to describe groups of segments in
the linked program. @code{ld} puts all partial programs' text
segments in contiguous addresses in the linked program. It is
customary to refer to the @emph{text segment} of a program, meaning all
the addresses of all partial programs' text segments. Likewise for
data and bss segments.

Some segments are manipulated by @code{ld}; others are invented for
use of @code{as} and have no meaning except during assembly.

@menu
* ld Segments::                 ld Segments
* as Segments::                 as Internal Segments
* Sub-Segments::                Sub-Segments
* bss::                         bss Segment
@end menu

@node ld Segments, as Segments, Segs Background, Segments
@section ld Segments
@code{ld} deals with just five kinds of segments, summarized below.

@table @strong

@item text segment
@itemx data segment
These segments hold your program. @code{as} and @code{ld} treat them as
separate but equal segments. Anything you can say of one segment is
true of the other. When the program is running, however, it is
customary for the text segment to be unalterable. The
text segment is often shared among processes: it will contain
instructions, constants and the like. The data segment of a running
program is usually alterable: for example, C variables would be stored
in the data segment.

@item bss segment
This segment contains zeroed bytes when your program begins running. It
is used to hold uninitialized variables or common storage. The length of
each partial program's bss segment is important, but because it starts
out containing zeroed bytes there is no need to store explicit zero
bytes in the object file. The bss segment was invented to eliminate
those explicit zeros from object files.

@item absolute segment
Address 0 of this segment is always ``relocated'' to runtime address 0.
This is useful if you want to refer to an address that @code{ld} must
not change when relocating. In this sense we speak of absolute
addresses being ``unrelocatable'': they don't change during relocation.

@item @code{undefined} segment
This ``segment'' is a catch-all for address references to objects not in
the preceding segments.
@c FIXME: ref to some other doc on obj-file formats could go here.

@end table

An idealized example of the 3 relocatable segments follows. Memory
addresses are on the horizontal axis.

@ifinfo
@example
                      +-----+----+--+
partial program # 1:  |ttttt|dddd|00|
                      +-----+----+--+

                      text   data bss
                      seg.   seg. seg.

                      +---+---+---+
partial program # 2:  |TTT|DDD|000|
                      +---+---+---+

                      +--+---+-----+--+----+---+-----+~~
linked program:       |  |TTT|ttttt|  |dddd|DDD|00000|
                      +--+---+-----+--+----+---+-----+~~

    addresses:        0 @dots{}
@end example
@end ifinfo
@tex
\halign{\hfil\rm #\quad&#\cr
\cr
 &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
Partial program \#1:
&\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
\cr
 &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
Partial program \#2:
&\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDD}\boxit{1cm}{\tt 000}\cr
\cr
 &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
linked program:
&\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
DDD}\boxit{2cm}{00000}\ \dots\cr
addresses:
&\dots\cr
}
@end tex

@node as Segments, Sub-Segments, ld Segments, Segments
@section as Internal Segments
These segments are invented for the internal use of @code{as}. They
have no meaning at run-time. You don't need to know about these
segments except that they might be mentioned in @code{as}' warning
messages. These segments are invented to permit the value of every
expression in your assembly language program to be a segmented
address.

@table @b
@item absent segment
An expression was expected and none was
found.

@item goof segment
An internal assembler logic error has been
found. This means there is a bug in the assembler.

@item grand segment
A @dfn{grand number} is a bignum or a flonum, but not an integer. If a
number can't be written as a C @code{int} constant, it is a grand
number. @code{as} has to remember that a flonum or a bignum does not
fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
expression: this is done by making a flonum or bignum be in segment
grand. This is purely for internal @code{as} convenience; grand
segment behaves similarly to absolute segment.

@item pass1 segment
The expression was impossible to evaluate in the first pass. The
assembler will attempt a second pass (second reading of the source) to
evaluate the expression. Your expression mentioned an undefined symbol
in a way that defies the one-pass (segment + offset in segment) assembly
process. No compiler need emit such an expression.

@quotation
@emph{Warning:} the second pass is currently not implemented. @code{as}
will abort with an error message if one is required.
@end quotation

@item difference segment
As an assist to the C compiler, expressions of the forms
@display
   (@var{undefined symbol}) @minus{} (@var{expression})
   (@var{something}) @minus{} (@var{undefined symbol})
   (@var{undefined symbol}) @minus{} (@var{undefined symbol})
@end display
are permitted, and belong to the difference segment. @code{as}
re-evaluates such expressions after the source file has been read and
the symbol table built. If by that time there are no undefined symbols
in the expression then the expression assumes a new segment. The
intention is to permit statements like
@samp{.word label - base_of_table}
to be assembled in one pass where both @code{label} and
@code{base_of_table} are undefined. This is useful for compiling C and
Algol switch statements, Pascal case statements, FORTRAN computed goto
statements and the like.
@end table

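A hedged sketch of the kind of jump table that difference-segment
expressions are meant to permit (all label names are invented for
illustration). Each entry mentions a case label that is not yet defined
when the entry is read, so each difference would be resolved only after
the whole file has been read:
@example
base_of_table:
        .word case_0 - base_of_table
        .word case_1 - base_of_table
case_0: .word 0         /* stand-in for the code of the first case */
case_1: .word 1         /* stand-in for the code of the second case */
@end example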
@node Sub-Segments, bss, as Segments, Segments
@section Sub-Segments
Assembled bytes fall into two segments: text and data.
Because you may have groups of text or data that you want to end up near
to each other in the object file, @code{as} allows you to use
@dfn{subsegments}. Within each segment, there can be numbered
subsegments with values from 0 to 8192. Objects assembled into the same
subsegment will be grouped with other objects in the same subsegment
when they are all put into the object file. For example, a compiler
might want to store constants in the text segment, but might not want to
have them interspersed with the program being assembled. In this case,
the compiler could issue a @code{.text 0} before each section of code
being output, and a @code{.text 1} before each group of constants being
output.

Subsegments are optional. If you don't use subsegments, everything
will be stored in subsegment number zero.

_if__(!_AMD29K__)
Each subsegment is zero-padded up to a multiple of four bytes.
(Subsegments may be padded a different amount on different flavors
of @code{as}.)
_fi__(!_AMD29K__)
_if__(_AMD29K__)
On the AMD 29K family, no particular padding is added to segment sizes;
GNU as forces no alignment on this platform.
_fi__(_AMD29K__)
Subsegments appear in your object file in numeric order, lowest numbered
to highest. (All this to be compatible with other people's assemblers.)
The object file contains no representation of subsegments; @code{ld} and
other programs that manipulate object files will see no trace of them.
They just see all your text subsegments as a text segment, and all your
data subsegments as a data segment.

To specify which subsegment you want subsequent statements assembled
into, use a @samp{.text @var{expression}} or a @samp{.data
@var{expression}} statement. @var{Expression} should be an absolute
expression. (@xref{Expressions}.) If you just say @samp{.text}
then @samp{.text 0} is assumed. Likewise @samp{.data} means
@samp{.data 0}. Assembly begins in @code{text 0}.
For instance:
@example
.text 0     # The default subsegment is text 0 anyway.
.ascii "This lives in the first text subsegment. *"
.text 1
.ascii "But this lives in the second text subsegment."
.data 0
.ascii "This lives in the data segment,"
.ascii "in the first data subsegment."
.text 0
.ascii "This lives in the first text subsegment,"
.ascii "immediately following the asterisk (*)."
@end example

Each segment has a @dfn{location counter} incremented by one for every
byte assembled into that segment. Because subsegments are merely a
convenience restricted to @code{as} there is no concept of a subsegment
location counter. There is no way to directly manipulate a location
counter---but the @code{.align} directive will change it, and any label
definition will capture its current value. The location counter of the
segment that statements are being assembled into is said to be the
@dfn{active} location counter.

@node bss, , Sub-Segments, Segments
@section bss Segment
The bss segment is used for local common variable storage.
You may allocate address space in the bss segment, but you may
not dictate data to load into it before your program executes. When
your program starts running, all the contents of the bss
segment are zeroed bytes.

Addresses in the bss segment are allocated with special directives;
you may not assemble anything directly into the bss segment. Hence
there are no bss subsegments. @xref{Comm}; @pxref{Lcomm}.

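A minimal sketch of such allocations, assuming the @code{.lcomm} and
@code{.comm} directives described in the nodes referenced above (the
symbol names and sizes are invented):
@example
.lcomm scratch, 256     /* reserve 256 zeroed bytes of local bss storage */
.comm  shared, 4        /* declare a 4-byte common symbol */
@end example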
@node Symbols, Expressions, Segments, Top
@chapter Symbols
Symbols are a central concept: the programmer uses symbols to name
things, the linker uses symbols to link, and the debugger uses symbols
to debug.

@quotation
@emph{Warning:} @code{as} does not place symbols in the object file in
the same order they were declared. This may break some debuggers.
@end quotation

@menu
* Labels::                      Labels
* Setting Symbols::             Giving Symbols Other Values
* Symbol Names::                Symbol Names
* Dot::                         The Special Dot Symbol
* Symbol Attributes::           Symbol Attributes
@end menu

@node Labels, Setting Symbols, Symbols, Symbols
@section Labels
A @dfn{label} is written as a symbol immediately followed by a colon
@samp{:}. The symbol then represents the current value of the
active location counter, and is, for example, a suitable instruction
operand. You are warned if you use the same symbol to represent two
different locations: the first definition overrides any other
definitions.

@node Setting Symbols, Symbol Names, Labels, Symbols
@section Giving Symbols Other Values
A symbol can be given an arbitrary value by writing a symbol, followed
by an equals sign @samp{=}, followed by an expression
(@pxref{Expressions}). This is equivalent to using the @code{.set}
directive. @xref{Set}.

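For example, assuming the operand order @var{symbol}, @var{expression}
for @code{.set} (see the node referenced above), each of these two
statements would give a symbol the value 42; the symbol names are
invented:
@example
answer = 42
.set other_answer, 42
@end example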
@node Symbol Names, Dot, Setting Symbols, Symbols
@section Symbol Names
Symbol names begin with a letter or with one of @samp{$._}. That
character may be followed by any string of digits, letters,
underscores and dollar signs. Case of letters is significant:
@code{foo} is a different symbol name than @code{Foo}.

_if__(_AMD29K__)
For the AMD 29K family, @samp{?} is also allowed in the
body of a symbol name, though not at its beginning.
_fi__(_AMD29K__)

Each symbol has exactly one name. Each name in an assembly language
program refers to exactly one symbol. You may use that symbol name any
number of times in a program.

@menu
* Local Symbols::               Local Symbol Names
@end menu

@node Local Symbols, , Symbol Names, Symbol Names
@subsection Local Symbol Names

Local symbols help compilers and programmers use names temporarily.
There are ten local symbol names, which are re-used throughout the
program. You may refer to them using the names @samp{0} @samp{1}
@dots{} @samp{9}. To define a local symbol, write a label of the form
@samp{@b{N}:} (where @b{N} represents any digit). To refer to the most
recent previous definition of that symbol write @samp{@b{N}b}, using the
same digit as when you defined the label. To refer to the next
definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
a choice of 10 forward references. The @samp{b} stands for
``backwards'' and the @samp{f} stands for ``forwards''.

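A sketch of how these references would resolve, using only data
directives so that it does not depend on any particular instruction
set:
@example
1:      .long 1f        /* refers forward to the second 1: */
        .long 1b        /* refers back to the first 1: */
1:      .long 2f        /* refers forward to 2: */
        .long 1b        /* refers back to the second 1: */
2:      .long 1b        /* still the second 1: */
@end example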
1441Local symbols are not emitted by the current GNU C compiler.
93b45514
RP
1442
1443There is no restriction on how you can use these labels, but
1444remember that at any point in the assembly you can refer to at most
144510 prior local labels and to at most 10 forward local labels.
1446
47342e8f 1447Local symbol names are only a notation device. They are immediately
93b45514 1448transformed into more conventional symbol names before the assembler
47342e8f
RP
1449uses them. The symbol names stored in the symbol table, appearing in
1450error messages and optionally emitted to the object file have these
1451parts:
1452
1453@table @code
93b45514
RP
1454@item L
1455All local labels begin with @samp{L}. Normally both @code{as} and
1456@code{ld} forget symbols that start with @samp{L}. These labels are
1457used for symbols you are never intended to see. If you give the
1458@samp{-L} option then @code{as} will retain these symbols in the
b50e59fe 1459object file. If you also instruct @code{ld} to retain these symbols,
93b45514 1460you may use them in debugging.
47342e8f
RP
1461
1462@item @var{digit}
93b45514
RP
1463If the label is written @samp{0:} then the digit is @samp{0}.
1464If the label is written @samp{1:} then the digit is @samp{1}.
1465And so on up through @samp{9:}.
47342e8f
RP
1466
1467@item @ctrl{A}
93b45514
RP
1468This unusual character is included so you don't accidentally invent
1469a symbol of the same name. The character has ASCII value
1470@samp{\001}.
47342e8f
RP
1471
1472@item @emph{ordinal number}
1473This is a serial number to keep the labels distinct. The first
93b45514 1474@samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
47342e8f 1475number @samp{15}; @emph{etc.} Likewise for the other labels @samp{1:}
93b45514
RP
1476through @samp{9:}.
1477@end table
47342e8f
RP
1478
1479For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, and the 44th
1480@code{3:} is named @code{L3@ctrl{A}44}.
93b45514 1481
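To make the naming scheme concrete, here is a hypothetical fragment
annotated with the names that the rules above would give each label.
The fragment itself is an assumption made for illustration only.

@example
1:      .long 0         /* stored as L1@ctrl{A}1 */
2:      .long 0         /* stored as L2@ctrl{A}1 */
1:      .long 0         /* stored as L1@ctrl{A}2 */
        .long 1b        /* refers to L1@ctrl{A}2 */
        .long 2b        /* refers to L2@ctrl{A}1 */
@end example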
b50e59fe 1482@node Dot, Symbol Attributes, Symbol Names, Symbols
93b45514
RP
1483@section The Special Dot Symbol
1484
b50e59fe 1485The special symbol @samp{.} refers to the current address that
93b45514 1486@code{as} is assembling into. Thus, the expression @samp{melvin:
b50e59fe 1487.long .} will cause @code{melvin} to contain its own address.
93b45514
RP
1488Assigning a value to @code{.} is treated the same as a @code{.org}
1489directive. Thus, the expression @samp{.=.+4} is the same as saying
09352a5d
RP
1490_if__(!_AMD29K__)
1491@samp{.space 4}.
1492_fi__(!_AMD29K__)
1493_if__(_AMD29K__)
b50e59fe 1494@samp{.block 4}.
09352a5d 1495_fi__(_AMD29K__)
b50e59fe
RP
1496
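The two uses of @samp{.} described above can be sketched as follows;
@code{melvin} is simply the example name used in the text.

@example
melvin: .long .         /* melvin holds its own address */
        .=.+4           /* advance the location counter by 4 */
@end example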
1497@node Symbol Attributes, , Dot, Symbols
93b45514 1498@section Symbol Attributes
47342e8f 1499Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
09352a5d
RP
1500_if__(_INTERNALS__)
1501The detailed definitions are in _0__<a.out.h>_1__.
1502_fi__(_INTERNALS__)
93b45514
RP
1503
1504If you use a symbol without defining it, @code{as} assumes zero for
1505all these attributes, and probably won't warn you. This makes the
1506symbol an externally defined symbol, which is generally what you
1507would want.
1508
b50e59fe
RP
1509@menu
1510* Symbol Value:: Value
1511* Symbol Type:: Type
1512* Symbol Desc:: Descriptor
1513* Symbol Other:: Other
1514@end menu
1515
1516@node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
93b45514 1517@subsection Value
47342e8f 1518The value of a symbol is (usually) 32 bits, the size of one GNU C
93b45514 1519@code{int}. For a symbol which labels a location in the
b50e59fe 1520text, data, bss or absolute segments the
93b45514 1521value is the number of addresses from the start of that segment to
b50e59fe 1522the label. Naturally for text, data and bss
93b45514 1523segments the value of a symbol changes as @code{ld} changes segment
b50e59fe 1524base addresses during linking. Absolute symbols' values do
93b45514
RP
1525not change during linking: that is why they are called absolute.
1526
b50e59fe
RP
1527The value of an undefined symbol is treated in a special way. If it is
15280 then the symbol is not defined in this assembler source program, and
1529@code{ld} will try to determine its value from other programs it is
1530linked with. You make this kind of symbol simply by mentioning a symbol
1531name without defining it. A non-zero value represents a @code{.comm}
1532common declaration. The value is how much common storage to reserve, in
1533bytes (addresses). The symbol refers to the first address of the
1534allocated storage.
93b45514 1535
b50e59fe 1536@node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
93b45514
RP
1537@subsection Type
1538The type attribute of a symbol is 8 bits encoded in a devious way.
1539We kept this coding standard for compatibility with older operating
1540systems.
1541
b50e59fe 1542@ifinfo
93b45514
RP
1543@example
1544
1545       7     6     5     4     3     2     1     0     bit numbers
1546    +-----+-----+-----+-----+-----+-----+-----+-----+
1547    |                 |                       |     |
1548    |   N_STAB bits   |      N_TYPE bits      |N_EXT|
1549    |                 |                       | bit |
1550    +-----+-----+-----+-----+-----+-----+-----+-----+
1551
b50e59fe 1552                        Type byte
93b45514 1553@end example
b50e59fe
RP
1554@end ifinfo
1555@tex
1556\vskip 1pc
1557\halign{#\quad&#\cr
63f5d795 1558\ibox{3cm}{7}\ibox{4cm}{4}\ibox{1.1cm}{0}&bit numbers\cr
b50e59fe 1559\boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
63f5d795 1560bits}\boxit{1.1cm}{\tt N\_EXT}\cr
b50e59fe
RP
1561\hfill {\bf Type} byte\hfill\cr
1562}
1563@end tex
93b45514 1564
b50e59fe 1565@subsubsection @code{N_EXT} bit
47342e8f
RP
1566This bit is set if @code{ld} might need to use the symbol's type bits
1567and value. If this bit is off, then @code{ld} can ignore the
93b45514
RP
1568symbol while linking. It is set in two cases. If the symbol is
1569undefined, then @code{ld} is expected to find the symbol's value
1570elsewhere in another program module. Otherwise the symbol has the
1571value given, but this symbol name and value are revealed to any other
1572programs linked in the same executable program. This second use of
b50e59fe 1573the @code{N_EXT} bit is most often made by a @code{.globl} statement.
93b45514 1574
b50e59fe 1575@subsubsection @code{N_TYPE} bits
93b45514
RP
1576These establish the symbol's ``type'', which is mainly a relocation
1577concept. Common values are detailed in the manual describing the
1578executable file format.
1579
b50e59fe 1580@subsubsection @code{N_STAB} bits
93b45514
RP
1581Common values for these bits are described in the manual on the
1582executable file format.
1583
b50e59fe 1584@node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
47342e8f 1585@subsection Descriptor
93b45514 1586This is an arbitrary 16-bit value. You may establish a symbol's
47342e8f 1587descriptor value by using a @code{.desc} statement (@pxref{Desc}).
93b45514
RP
1588A descriptor value means nothing to @code{as}.
1589
b50e59fe 1590@node Symbol Other, , Symbol Desc, Symbol Attributes
93b45514
RP
1591@subsection Other
1592This is an arbitrary 8-bit value. It means nothing to @code{as}.
1593
b50e59fe 1594@node Expressions, Pseudo Ops, Symbols, Top
93b45514
RP
1595@chapter Expressions
1596An @dfn{expression} specifies an address or numeric value.
1597Whitespace may precede and/or follow an expression.
1598
b50e59fe
RP
1599@menu
1600* Empty Exprs:: Empty Expressions
1601* Integer Exprs:: Integer Expressions
1602@end menu
1603
1604@node Empty Exprs, Integer Exprs, Expressions, Expressions
93b45514 1605@section Empty Expressions
47342e8f 1606An empty expression has no value: it is just whitespace or null.
93b45514
RP
1607Wherever an absolute expression is required, you may omit the
1608expression and @code{as} will assume a value of (absolute) 0. This
1609is compatible with other assemblers.
1610
b50e59fe 1611@node Integer Exprs, , Empty Exprs, Expressions
93b45514 1612@section Integer Expressions
47342e8f
RP
1613An @dfn{integer expression} is one or more @emph{arguments} delimited
1614by @emph{operators}.
1615
b50e59fe
RP
1616@menu
1617* Arguments:: Arguments
1618* Operators:: Operators
1619* Prefix Ops:: Prefix Operators
1620* Infix Ops:: Infix Operators
1621@end menu
1622
1623@node Arguments, Operators, Integer Exprs, Integer Exprs
47342e8f 1624@subsection Arguments
93b45514 1625
47342e8f
RP
1626@dfn{Arguments} are symbols, numbers or subexpressions. In other
1627contexts arguments are sometimes called ``arithmetic operands''. In
1628this manual, to avoid confusing them with the ``instruction operands'' of
1629the machine language, we use the term ``argument'' to refer to parts of
b50e59fe 1630expressions only, reserving the word ``operand'' to refer only to machine
47342e8f 1631instruction operands.
93b45514 1632
b50e59fe
RP
1633Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1634@var{segment} is one of text, data, bss, absolute,
1635or @code{undefined}. @var{NNN} is a signed, 2's complement 32 bit
93b45514
RP
1636integer.
1637
1638Numbers are usually integers.
1639
1640A number can be a flonum or bignum. In this case, you are warned
1641that only the low order 32 bits are used, and @code{as} pretends
1642these 32 bits are an integer. You may write integer-manipulating
1643instructions that act on exotic constants, compatible with other
1644assemblers.
1645
b50e59fe
RP
1646Subexpressions are a left parenthesis @samp{(} followed by an integer
1647expression, followed by a right parenthesis @samp{)}; or a prefix
47342e8f 1648operator followed by an argument.
93b45514 1649
b50e59fe 1650@node Operators, Prefix Ops, Arguments, Integer Exprs
93b45514 1651@subsection Operators
b50e59fe
RP
1652@dfn{Operators} are arithmetic functions, like @code{+} or @code{%}. Prefix
1653operators are followed by an argument. Infix operators appear
47342e8f 1654between their arguments. Operators may be preceded and/or followed by
93b45514
RP
1655whitespace.
1656
b50e59fe
RP
1657@node Prefix Ops, Infix Ops, Operators, Integer Exprs
1658@subsection Prefix Operators
1659@code{as} has the following @dfn{prefix operators}. They each take
47342e8f 1660one argument, which must be absolute.
b50e59fe 1661@table @code
93b45514 1662@item -
b50e59fe 1663@dfn{Negation}. Two's complement negation.
93b45514 1664@item ~
b50e59fe 1665@dfn{Complementation}. Bitwise not.
93b45514
RP
1666@end table
1667
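A minimal sketch of the prefix operators, applied to absolute arguments
as required; the constants are arbitrary.

@example
        .long -1        /* two's complement negation of 1 */
        .long ~0        /* bitwise not of 0: every bit set */
        .long -(2 + 3)  /* the argument may be a parenthesized subexpression */
@end example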
b50e59fe
RP
1668@node Infix Ops, , Prefix Ops, Integer Exprs
1669@subsection Infix Operators
47342e8f 1670
b50e59fe
RP
1671@dfn{Infix operators} take two arguments, one on either side. Operators
1672have precedence, but operations with equal precedence are performed left
1673to right. Apart from @code{+} or @code{-}, both arguments must be
1674absolute, and the result is absolute.
47342e8f 1675
93b45514 1676@enumerate
47342e8f 1677
93b45514 1678@item
47342e8f 1679Highest Precedence
93b45514
RP
1680@table @code
1681@item *
1682@dfn{Multiplication}.
1683@item /
1684@dfn{Division}. Truncation is the same as the C operator @samp{/}.
93b45514
RP
1685@item %
1686@dfn{Remainder}.
09352a5d
RP
1687@item _0__<_1__
1688@itemx _0__<<_1__
1689@dfn{Shift Left}. Same as the C operator @samp{_0__<<_1__}.
1690@item _0__>_1__
1691@itemx _0__>>_1__
1692@dfn{Shift Right}. Same as the C operator @samp{_0__>>_1__}.
93b45514 1693@end table
47342e8f 1694
93b45514 1695@item
47342e8f
RP
1696Intermediate precedence
1697@table @code
93b45514
RP
1698@item |
1699@dfn{Bitwise Inclusive Or}.
1700@item &
1701@dfn{Bitwise And}.
1702@item ^
1703@dfn{Bitwise Exclusive Or}.
1704@item !
1705@dfn{Bitwise Or Not}.
1706@end table
47342e8f 1707
93b45514 1708@item
47342e8f
RP
1709Lowest Precedence
1710@table @code
93b45514 1711@item +
47342e8f
RP
1712@dfn{Addition}. If either argument is absolute, the result
1713has the segment of the other argument.
1714If either argument is pass1 or undefined, the result is pass1.
1715Otherwise @code{+} is illegal.
93b45514 1716@item -
47342e8f
RP
1717@dfn{Subtraction}. If the right argument is absolute, the
1718result has the segment of the left argument.
1719If either argument is pass1 the result is pass1.
1720If either argument is undefined the result is difference segment.
1721If both arguments are in the same segment, the result is absolute---provided
b50e59fe
RP
1722that segment is one of text, data or bss.
1723Otherwise subtraction is illegal.
93b45514
RP
1724@end table
1725@end enumerate
1726
b50e59fe 1727The sense of the rule for addition is that it's only meaningful to add
47342e8f
RP
1728the @emph{offsets} in an address; you can only have a defined segment in
1729one of the two arguments.
93b45514 1730
47342e8f
RP
1731Similarly, you can't subtract quantities from two different segments.
1732
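The segment and precedence rules above can be illustrated with a short
fragment. The labels @code{start} and @code{end} are hypothetical, and
the results in the comments follow from the rules as stated in this
section, assuming @code{.long} emits 4 bytes.

@example
        .text
start:  .long 0
end:    .long 0
        .long end - start   /* same segment: absolute result, 4 */
        .long start + 8     /* absolute added to text: result is in text */
        .long (2 + 3) * 4   /* absolute arithmetic: 20 */
        .long 1 + 1 & 2     /* & binds tighter than +: 1 + (1 & 2) = 1 */
@end example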
b50e59fe 1733@node Pseudo Ops, Machine Dependent, Expressions, Top
93b45514
RP
1734@chapter Assembler Directives
1735@menu
b50e59fe
RP
1736* Abort:: The Abort directive causes as to abort
1737* Align:: Pad the location counter to a power of 2
1738* App-File:: Set the logical file name
1739* Ascii:: Fill memory with bytes of ASCII characters
1740* Asciz:: Fill memory with bytes of ASCII characters followed
93b45514 1741 by a null.
b50e59fe
RP
1742* Byte:: Fill memory with 8-bit integers
1743* Comm:: Reserve public space in the BSS segment
1744* Data:: Change to the data segment
1745* Desc:: Set the n_desc of a symbol
1746* Double:: Fill memory with double-precision floating-point numbers
1747* Else:: @code{.else}
1748* End:: @code{.end}
1749* Endif:: @code{.endif}
1750* Equ:: @code{.equ @var{symbol}, @var{expression}}
1751* Extern:: @code{.extern}
1752* Fill:: Fill memory with repeated values
1753* Float:: Fill memory with single-precision floating-point numbers
1754* Global:: Make a symbol visible to the linker
1755* Ident:: @code{.ident}
1756* If:: @code{.if @var{absolute expression}}
1757* Include:: @code{.include "@var{file}"}
1758* Int:: Fill memory with 32-bit integers
1759* Lcomm:: Reserve private space in the BSS segment
1760* Line:: Set the logical line number
1761* Ln:: @code{.ln @var{line-number}}
1762* List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1763* Long:: Fill memory with 32-bit integers
1764* Lsym:: Create a local symbol
1765* Octa:: Fill memory with 128-bit integers
1766* Org:: Change the location counter
1767* Quad:: Fill memory with 64-bit integers
1768* Set:: Set the value of a symbol
1769* Short:: Fill memory with 16-bit integers
1770* Single:: @code{.single @var{flonums}}
1771* Stab:: Store debugging information
1772* Text:: Change to the text segment
b50e59fe 1773* Word:: Fill memory with 32-bit integers
b50e59fe
RP
1774* Deprecated:: Deprecated Directives
1775* Machine Options:: Options
1776* Machine Syntax:: Syntax
1777* Floating Point:: Floating Point
1778* Machine Directives:: Machine Directives
1779* Opcodes:: Opcodes
93b45514
RP
1780@end menu
1781
47342e8f
RP
1782All assembler directives have names that begin with a period (@samp{.}).
1783The rest of the name is letters: their case does not matter.
93b45514 1784
b50e59fe
RP
1785This chapter discusses directives present in all versions of GNU
1786@code{as}; @pxref{Machine Dependent} for additional directives.
1787
47342e8f 1788@node Abort, Align, Pseudo Ops, Pseudo Ops
b50e59fe 1789@section @code{.abort}
93b45514
RP
1790This directive stops the assembly immediately. It is for
1791compatibility with other assemblers. The original idea was that the
47342e8f
RP
1792assembler program would be piped into the assembler. If the sender
1793of a program quit, it could use this directive to tell @code{as} to
93b45514
RP
1794quit also. One day @code{.abort} will not be supported.
1795
b50e59fe 1796@node Align, App-File, Abort, Pseudo Ops
f4335d56 1797@section @code{.align @var{abs-expression} , @var{abs-expression}}
b50e59fe 1798Pad the location counter (in the current subsegment) to a particular
f4335d56
RP
1799storage boundary. The first expression (which must be absolute) is the
1800number of low-order zero bits the location counter will have after
1801advancement. For example @samp{.align 3} will advance the location
1802counter until it is a multiple of 8. If the location counter is already a
1803multiple of 8, no change is needed.
93b45514 1804
f4335d56
RP
1805The second expression (also absolute) gives the value to be stored in
1806the padding bytes. It (and the comma) may be omitted. If it is
1807omitted, the padding bytes are zero.
93b45514 1808
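As a sketch of both forms, assume the location counter is 0 at the start
of the fragment below; the byte values are arbitrary.

@example
        .byte 1             /* location counter is now 1 */
        .align 3            /* seven zero bytes; counter is now 8 */
        .byte 2             /* counter is now 9 */
        .align 2, 0xff      /* three bytes of 0xff; counter is now 12 */
@end example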
b50e59fe
RP
1809@node App-File, Ascii, Align, Pseudo Ops
1810@section @code{.app-file @var{string}}
1811@code{.app-file} tells @code{as} that we are about to start a new
1812logical file. @var{String} is the new file name. In general, the
1813filename is recognized whether or not it is surrounded by quotes @samp{"};
1814but if you wish to specify an empty file name,
1815you must give the quotes--@code{""}. This statement may go away in
1816future: it is only recognized to be compatible with old @code{as}
1817programs.
1818
1819@node Ascii, Asciz, App-File, Pseudo Ops
1820@section @code{.ascii "@var{string}"}@dots{}
47342e8f 1821@code{.ascii} expects zero or more string literals (@pxref{Strings})
93b45514
RP
1822separated by commas. It assembles each string (with no automatic
1823trailing zero byte) into consecutive addresses.
1824
47342e8f 1825@node Asciz, Byte, Ascii, Pseudo Ops
b50e59fe
RP
1826@section @code{.asciz "@var{string}"}@dots{}
1827@code{.asciz} is just like @code{.ascii}, but each string is followed by
1828a zero byte. The ``z'' in @samp{.asciz} stands for ``zero''.
93b45514 1829
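The difference between the two directives can be seen by giving each
the same pair of strings; the strings themselves are arbitrary.

@example
        .ascii "ab", "cd"   /* four bytes: a b c d */
        .asciz "ab", "cd"   /* six bytes: a b 0 c d 0 */
@end example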
47342e8f 1830@node Byte, Comm, Asciz, Pseudo Ops
b50e59fe 1831@section @code{.byte @var{expressions}}
93b45514 1832
47342e8f 1833@code{.byte} expects zero or more expressions, separated by commas.
93b45514
RP
1834Each expression is assembled into the next byte.
1835
b50e59fe
RP
1836@node Comm, Data, Byte, Pseudo Ops
1837@section @code{.comm @var{symbol} , @var{length} }
47342e8f
RP
1838@code{.comm} declares a named common area in the bss segment. Normally
1839@code{ld} reserves memory addresses for it during linking, so no partial
1840program defines the location of the symbol. Use @code{.comm} to tell
1841@code{ld} that it must be at least @var{length} bytes long. @code{ld}
1842will allocate space for each @code{.comm} symbol that is at least as
1843long as the longest @code{.comm} request in any of the partial programs
1844linked. @var{length} is an absolute expression.
1845
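For instance, suppose two partial programs both declare the same common
symbol. The file names and the symbol @code{buffer} are hypothetical.

@example
/* first.s */
        .comm buffer, 256

/* second.s */
        .comm buffer, 512

/* after linking, buffer is at least 512 bytes long */
@end example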
1846@node Data, Desc, Comm, Pseudo Ops
b50e59fe 1847@section @code{.data @var{subsegment}}
47342e8f 1848@code{.data} tells @code{as} to assemble the following statements onto the
93b45514
RP
1849end of the data subsegment numbered @var{subsegment} (which is an
1850absolute expression). If @var{subsegment} is omitted, it defaults
1851to zero.
1852
47342e8f 1853@node Desc, Double, Data, Pseudo Ops
f4335d56 1854@section @code{.desc @var{symbol}, @var{abs-expression}}
b50e59fe 1855This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
f4335d56 1856to the low 16 bits of an absolute expression.
93b45514 1857
b50e59fe
RP
1858@node Double, Else, Desc, Pseudo Ops
1859@section @code{.double @var{flonums}}
47342e8f 1860@code{.double} expects zero or more flonums, separated by commas. It assembles
b50e59fe 1861floating point numbers.
09352a5d
RP
1862_if__(_ALL_ARCH__)
1863The exact kind of floating point numbers emitted depends on how
1864@code{as} is configured. @xref{Machine Dependent}.
1865_fi__(_ALL_ARCH__)
1866_if__(_AMD29K__)
b50e59fe 1867On the AMD 29K family the floating point format used is IEEE.
09352a5d 1868_fi__(_AMD29K__)
b50e59fe
RP
1869
1870@node Else, End, Double, Pseudo Ops
1871@section @code{.else}
1872@code{.else} is part of the @code{as} support for conditional assembly;
1873@pxref{If}. It marks the beginning of a section of code to be assembled
1874if the condition for the preceding @code{.if} was false.
1875
1876@ignore
1877@node End, Endif, Else, Pseudo Ops
1878@section @code{.end}
1879This doesn't do anything---but isn't an s_ignore, so I suspect it's
1880meant to do something eventually (which is why it isn't documented here
1881as "for compatibility with blah").
1882@end ignore
1883
1884@node Endif, Equ, End, Pseudo Ops
1885@section @code{.endif}
1886@code{.endif} is part of the @code{as} support for conditional assembly;
1887it marks the end of a block of code that is only assembled
1888conditionally. @xref{If}.
1889
1890@node Equ, Extern, Endif, Pseudo Ops
1891@section @code{.equ @var{symbol}, @var{expression}}
1892
1893This directive sets the value of @var{symbol} to @var{expression}.
1894It is synonymous with @samp{.set}; @pxref{Set}.
1895
1896@node Extern, Fill, Equ, Pseudo Ops
1897@section @code{.extern}
1898@code{.extern} is accepted in the source program---for compatibility
1899with other assemblers---but it is ignored. GNU @code{as} treats
1900all undefined symbols as external.
1901
1902@node Fill, Float, Extern, Pseudo Ops
1903@section @code{.fill @var{repeat} , @var{size} , @var{value}}
93b45514
RP
1904@var{repeat}, @var{size} and @var{value} are absolute expressions.
1905This emits @var{repeat} copies of @var{size} bytes. @var{Repeat}
1906may be zero or more. @var{Size} may be zero or more, but if it is
1907more than 8, then it is deemed to have the value 8, compatible with
1908other people's assemblers. The contents of each repetition
1909are taken from an 8-byte number. The highest order 4 bytes are
1910zero. The lowest order 4 bytes are @var{value} rendered in the
1911byte-order of an integer on the computer @code{as} is assembling for.
1912Each @var{size} bytes in a repetition is taken from the lowest order
1913@var{size} bytes of this number. Again, this bizarre behavior is
1914compatible with other people's assemblers.
1915
1916@var{Size} and @var{value} are optional.
1917If the second comma and @var{value} are absent, @var{value} is
1918assumed zero. If the first comma and following tokens are absent,
1919@var{size} is assumed to be 1.
1920
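A brief sketch of the three argument forms, with arbitrary values; the
byte order within each 2-byte or 4-byte unit depends on the target
machine.

@example
        .fill 3, 2, 0x41    /* three 2-byte units, each with the value 0x41 */
        .fill 2, 4          /* same as .fill 2, 4, 0 */
        .fill 4             /* same as .fill 4, 1, 0 */
@end example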
47342e8f 1921@node Float, Global, Fill, Pseudo Ops
b50e59fe
RP
1922@section @code{.float @var{flonums}}
1923This directive assembles zero or more flonums, separated by commas. It
1924has the same effect as @code{.single}.
09352a5d
RP
1925_if__(_ALL_ARCH__)
1926The exact kind of floating point numbers emitted depends on how
1927@code{as} is configured.
1928@xref{Machine Dependent}.
1929_fi__(_ALL_ARCH__)
1930_if__(_AMD29K__)
b50e59fe 1931The floating point format used for the AMD 29K family is IEEE.
09352a5d 1932_fi__(_AMD29K__)
93b45514 1933
b50e59fe
RP
1934@node Global, Ident, Float, Pseudo Ops
1935@section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
47342e8f 1936@code{.global} makes the symbol visible to @code{ld}. If you define
93b45514
RP
1937@var{symbol} in your partial program, its value is made available to
1938other partial programs that are linked with it. Otherwise,
1939@var{symbol} will take its attributes from a symbol of the same name
1940from another partial program it is linked with.
1941
b50e59fe
RP
1942This is done by setting the @code{N_EXT} bit of that symbol's type byte
1943to 1. @xref{Symbol Attributes}.
1944
1945Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1946compatibility with other assemblers.
1947
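Both cases described above appear in this sketch. The symbol names
@code{main} and @code{helper} are invented for the example.

@example
        .globl main         /* defined below, so visible to other programs */
main:   .long 0
        .globl helper       /* not defined here: ld looks for it elsewhere */
        .long helper
@end example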
1948@node Ident, If, Global, Pseudo Ops
1949@section @code{.ident}
1950This directive is used by some assemblers to place tags in object files.
1951GNU @code{as} simply accepts the directive for source-file
1952compatibility with such assemblers, but does not actually emit anything
1953for it.
1954
1955@node If, Include, Ident, Pseudo Ops
1956@section @code{.if @var{absolute expression}}
1957@code{.if} marks the beginning of a section of code which is only
1958considered part of the source program being assembled if the argument
1959(which must be an @var{absolute expression}) is non-zero. The end of
1960the conditional section of code must be marked by @code{.endif}
1961(@pxref{Endif}); optionally, you may include code for the
1962alternative condition, flagged by @code{.else} (@pxref{Else}).
1963
1964The following variants of @code{.if} are also supported:
1965@table @code
1966@item .ifdef @var{symbol}
1967Assembles the following section of code if the specified @var{symbol}
1968has been defined.
1969
1970@ignore
1971@item ifeqs
1972BOGONS??
1973@end ignore
1974
1975@item .ifndef @var{symbol}
1976@itemx .ifnotdef @var{symbol}
1977Assembles the following section of code if the specified @var{symbol}
1978has not been defined. Both spelling variants are equivalent.
93b45514 1979
b50e59fe
RP
1980@ignore
1981@item ifnes
1982NO bogons, I presume?
1983@end ignore
1984@end table
1985
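A minimal sketch of conditional assembly follows. The symbol
@code{DEBUG} and the strings are assumptions made for the example; only
one of the two @code{.asciz} lines is assembled.

@example
        .set DEBUG, 1
        .if DEBUG
        .asciz "debug build"
        .else
        .asciz "release build"
        .endif
@end example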
1986@node Include, Int, If, Pseudo Ops
1987@section @code{.include "@var{file}"}
1988This directive provides a way to include supporting files at specified
1989points in your source program. The code from @var{file} is assembled as
1990if it followed the point of the @code{.include}; when the end of the
1991included file is reached, assembly of the original file continues. You
1992can control the search paths used with the @samp{-I} command-line option
1993(@pxref{Options}). Quotation marks are required around @var{file}.
1994
1995@node Int, Lcomm, Include, Pseudo Ops
1996@section @code{.int @var{expressions}}
93b45514
RP
1997Expect zero or more @var{expressions}, of any segment, separated by
1998commas. For each expression, emit a 32-bit number that will, at run
1999time, be the value of that expression. The byte order of the
2000expression depends on what kind of computer will run the program.
2001
47342e8f 2002@node Lcomm, Line, Int, Pseudo Ops
b50e59fe 2003@section @code{.lcomm @var{symbol} , @var{length}}
93b45514 2004Reserve @var{length} (an absolute expression) bytes for a local
47342e8f 2005common denoted by @var{symbol}. The segment and value of @var{symbol} are
93b45514 2006those of the new local common. The addresses are allocated in the
b50e59fe 2007bss segment, so at run-time the bytes will start off zeroed.
47342e8f 2008@var{Symbol} is not declared global (@pxref{Global}), so is normally
93b45514
RP
2009not visible to @code{ld}.
2010
09352a5d 2011_if__(!_AMD29K__)
b50e59fe
RP
2012@node Line, Ln, Lcomm, Pseudo Ops
2013@section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2014@code{.line}, and its alternate spelling @code{.ln}, tell
09352a5d
RP
2015_fi__(!_AMD29K__)
2016_if__(_AMD29K__)
b50e59fe
RP
2017@node Ln, List, Line, Pseudo Ops
2018@section @code{.ln @var{line-number}}
2019Tell
09352a5d 2020_fi__(_AMD29K__)
b50e59fe
RP
2021@code{as} to change the logical line number. @var{line-number} must be
2022an absolute expression. The next line will have that logical line
2023number. So any other statements on the current line (after a statement
2024separator character
09352a5d 2025_if__(_AMD29K__)
b50e59fe 2026@samp{@@})
09352a5d
RP
2027_fi__(_AMD29K__)
2028_if__(!_AMD29K__)
2029@code{;})
2030_fi__(!_AMD29K__)
b50e59fe
RP
2031will be reported as on logical line number
2032@var{line-number} @minus{} 1.
2033One day this directive will be unsupported: it is used only
2034for compatibility with existing assembler programs. @refill
2035
2036@node List, Long, Ln, Pseudo Ops
f4335d56
RP
2037@section @code{.list} and related directives
2038GNU @code{as} ignores the directives @code{.list}, @code{.nolist},
2039@code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}; however,
2040they're accepted for compatibility with assemblers that use them.
b50e59fe
RP
2041
2042@node Long, Lsym, List, Pseudo Ops
2043@section @code{.long @var{expressions}}
47342e8f 2044@code{.long} is the same as @samp{.int}, @pxref{Int}.
93b45514 2045
47342e8f 2046@node Lsym, Octa, Long, Pseudo Ops
b50e59fe 2047@section @code{.lsym @var{symbol}, @var{expression}}
47342e8f 2048@code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
93b45514
RP
2049the hash table, ensuring it cannot be referenced by name during the
2050rest of the assembly. This sets the attributes of the symbol to be
47342e8f 2051the same as the expression value:
b50e59fe
RP
2052@example
2053@var{other} = @var{descriptor} = 0
2054@var{type} = @r{(segment of @var{expression})}
2055N_EXT = 0
2056@var{value} = @var{expression}
2057@end example
93b45514 2058
47342e8f 2059@node Octa, Org, Lsym, Pseudo Ops
b50e59fe 2060@section @code{.octa @var{bignums}}
47342e8f 2061This directive expects zero or more bignums, separated by commas. For each
b50e59fe
RP
2062bignum, it emits a 16-byte integer.
2063
2064The term ``octa'' comes from contexts in which a ``word'' was two bytes;
2065hence @emph{octa}-word for 16 bytes.
93b45514 2066
47342e8f 2067@node Org, Quad, Octa, Pseudo Ops
b50e59fe 2068@section @code{.org @var{new-lc} , @var{fill}}
47342e8f
RP
2069
2070@code{.org} will advance the location counter of the current segment to
93b45514 2071@var{new-lc}. @var{new-lc} is either an absolute expression or an
47342e8f
RP
2072expression with the same segment as the current subsegment. That is,
2073you can't use @code{.org} to cross segments: if @var{new-lc} has the
2074wrong segment, the @code{.org} directive is ignored. To be compatible
2075with former assemblers, if the segment of @var{new-lc} is absolute,
2076@code{as} will issue a warning, then pretend the segment of @var{new-lc}
2077is the same as the current subsegment.
2078
2079@code{.org} may only increase the location counter, or leave it
2080unchanged; you cannot use @code{.org} to move the location counter
2081backwards.
2082
b50e59fe
RP
2083@c double negative used below "not undefined" because this is a specific
2084@c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2085@c segment. [email protected] 18feb91
47342e8f 2086Because @code{as} tries to assemble programs in one pass, @var{new-lc}
b50e59fe 2087may not be undefined. If you really detest this restriction, we eagerly await
47342e8f 2088a chance to share your improved assembler.
93b45514
RP
2089
2090Beware that the origin is relative to the start of the segment, not
2091to the start of the subsegment. This is compatible with other
2092people's assemblers.
2093
47342e8f 2094When the location counter (of the current subsegment) is advanced, the
93b45514
RP
2095intervening bytes are filled with @var{fill} which should be an
2096absolute expression. If the comma and @var{fill} are omitted,
2097@var{fill} defaults to zero.
2098
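As a sketch, the fragment below advances the location counter relative
to a label in the current segment, which avoids the warning given for an
absolute @var{new-lc}. The label @code{start} is hypothetical.

@example
        .text
start:  .long 1
        .org start + 16, 0xff   /* the 12 skipped bytes are filled with 0xff */
        .long 2                 /* assembled 16 bytes after start */
@end example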
47342e8f 2099@node Quad, Set, Org, Pseudo Ops
b50e59fe
RP
2100@section @code{.quad @var{bignums}}
2101@code{.quad} expects zero or more bignums, separated by commas. For
2102each bignum, it emits an 8-byte integer. If the bignum won't fit in 8
2103bytes, it prints a warning message; and just takes the lowest order 8
2104bytes of the bignum.
2105
2106The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2107hence @emph{quad}-word for 8 bytes.
93b45514 2108
47342e8f 2109@node Set, Short, Quad, Pseudo Ops
b50e59fe 2110@section @code{.set @var{symbol}, @var{expression}}
93b45514 2111
47342e8f 2112This directive sets the value of @var{symbol} to @var{expression}. This
b50e59fe
RP
2113will change @var{symbol}'s value and type to conform to
2114@var{expression}. If @code{N_EXT} is set, it remains set.
2115(@xref{Symbol Attributes}.)
93b45514 2116
47342e8f 2117You may @code{.set} a symbol many times in the same assembly.
93b45514
RP
2118If the expression's segment is unknowable during pass 1, a second
2119pass over the source program will be forced. The second pass is
2120currently not implemented. @code{as} will abort with an error
2121message if one is required.
2122
2123If you @code{.set} a global symbol, the value stored in the object
2124file is the last value stored into it.
2125
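For example, a symbol may be given an absolute value and then used to
build other absolute expressions. The names @code{width} and
@code{total} are invented for this sketch.

@example
        .set width, 4
        .set total, width * 3   /* total is the absolute value 12 */
        .fill total             /* twelve zero bytes */
@end example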
b50e59fe
RP
2126@node Short, Single, Set, Pseudo Ops
2127@section @code{.short @var{expressions}}
09352a5d
RP
2128_if__(! (_SPARC__ || _AMD29K__) )
2129@code{.short} is the same as @samp{.word}. @xref{Word}.
2130_fi__(! (_SPARC__ || _AMD29K__) )
2131_if__(_SPARC__ || _AMD29K__)
b50e59fe
RP
2132This expects zero or more @var{expressions}, and emits
2133a 16 bit number for each.
09352a5d 2134_fi__(_SPARC__ || _AMD29K__)
b50e59fe
RP
2135
2136@node Single, Space, Short, Pseudo Ops
2137@section @code{.single @var{flonums}}
2138This directive assembles zero or more flonums, separated by commas. It
2139has the same effect as @code{.float}.
09352a5d
RP
2140_if__(_ALL_ARCH__)
2141The exact kind of floating point numbers emitted depends on how
2142@code{as} is configured. @xref{Machine Dependent}.
2143_fi__(_ALL_ARCH__)
2144_if__(_AMD29K__)
b50e59fe 2145The floating point format used for the AMD 29K family is IEEE.
09352a5d 2146_fi__(_AMD29K__)
b50e59fe
RP
2147
2148
2149@node Space, Space, Single, Pseudo Ops
09352a5d 2150_if__(!_AMD29K__)
b50e59fe 2151@section @code{.space @var{size} , @var{fill}}
47342e8f 2152This directive emits @var{size} bytes, each of value @var{fill}. Both
93b45514
RP
2153@var{size} and @var{fill} are absolute expressions. If the comma
2154and @var{fill} are omitted, @var{fill} is assumed to be zero.
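
A minimal sketch of both forms (the sizes and fill value are arbitrary):

@example
        .space  16              /* sixteen bytes, each zero */
        .space  4, 0xff         /* four bytes, each 0xff */
@end example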
09352a5d 2155_fi__(!_AMD29K__)
b50e59fe 2156
09352a5d 2157_if__(_AMD29K__)
b50e59fe
RP
2158@section @code{.space}
2159This directive is ignored; it is accepted for compatibility with other
2160AMD 29K assemblers.
2161
2162@quotation
@emph{Warning:} In other versions of GNU @code{as}, the directive
@code{.space} has the effect of @code{.block}. @xref{Machine Directives}.
2165@end quotation
09352a5d 2166_fi__(_AMD29K__)
93b45514 2167
47342e8f 2168@node Stab, Text, Space, Pseudo Ops
b50e59fe 2169@section @code{.stabd, .stabn, .stabs}
47342e8f 2170There are three directives that begin @samp{.stab}.
b50e59fe 2171All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
93b45514 2172The symbols are not entered in @code{as}' hash table: they
b50e59fe 2173cannot be referenced elsewhere in the source file.
93b45514
RP
2174Up to five fields are required:
2175@table @var
2176@item string
2177This is the symbol's name. It may contain any character except @samp{\000},
2178so is more general than ordinary symbol names. Some debuggers used to
47342e8f 2179code arbitrarily complex structures into symbol names using this field.
93b45514 2180@item type
b50e59fe 2181An absolute expression. The symbol's type is set to the low 8
93b45514
RP
2182bits of this expression.
2183Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2184silly bit patterns.
2185@item other
2186An absolute expression.
b50e59fe 2187The symbol's ``other'' attribute is set to the low 8 bits of this expression.
93b45514
RP
2188@item desc
2189An absolute expression.
b50e59fe 2190The symbol's descriptor is set to the low 16 bits of this expression.
93b45514 2191@item value
b50e59fe 2192An absolute expression which becomes the symbol's value.
93b45514
RP
2193@end table
2194
b50e59fe
RP
2195If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2196or @code{.stabs} statement, the symbol has probably already been created
2197and you will get a half-formed symbol in your object file. This is
2198compatible with earlier assemblers!
93b45514 2199
47342e8f
RP
2200@table @code
2201@item .stabd @var{type} , @var{other} , @var{desc}
93b45514
RP
2202
2203The ``name'' of the symbol generated is not even an empty string.
2204It is a null pointer, for compatibility. Older assemblers used a
2205null pointer so they didn't waste space in object files with empty
2206strings.
2207
b50e59fe 2208The symbol's value is set to the location counter,
93b45514
RP
2209relocatably. When your program is linked, the value of this symbol
2210will be where the location counter was when the @code{.stabd} was
2211assembled.
2212
47342e8f 2213@item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
93b45514
RP
2214
2215The name of the symbol is set to the empty string @code{""}.
2216
47342e8f 2217@item .stabs @var{string} , @var{type} , @var{other} , @var{desc} , @var{value}
93b45514 2218
47342e8f
RP
2219All five fields are specified.
2220@end table
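
The three forms might be used as follows. This is only a sketch: the
name @code{"counter"} is made up, and the type numbers 32 and 68 are
assumed to be the @code{N_GSYM} and @code{N_SLINE} codes of the usual
@code{a.out} stab conventions.

@example
        .stabs  "counter", 32, 0, 0, 0  /* all five fields given */
        .stabn  68, 0, 12, 0            /* name is the empty string */
        .stabd  68, 0, 12               /* value is the location counter */
@end example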
2221
2222@node Text, Word, Stab, Pseudo Ops
b50e59fe 2223@section @code{.text @var{subsegment}}
93b45514
RP
2224Tells @code{as} to assemble the following statements onto the end of
2225the text subsegment numbered @var{subsegment}, which is an absolute
2226expression. If @var{subsegment} is omitted, subsegment number zero
2227is used.
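
A short sketch of switching subsegments (the data values are arbitrary):

@example
        .text   1               /* append to text subsegment 1 */
        .word   0x1234
        .text                   /* back to text subsegment 0 */
        .word   0x5678
@end example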
2228
b50e59fe
RP
2229@node Word, Deprecated, Text, Pseudo Ops
2230@section @code{.word @var{expressions}}
47342e8f 2231This directive expects zero or more @var{expressions}, of any segment,
b50e59fe 2232separated by commas.
09352a5d 2233_if__(_SPARC__ || _AMD29K__)
b50e59fe 2234For each expression, @code{as} emits a 32-bit number.
09352a5d
RP
2235_fi__(_SPARC__ || _AMD29K__)
2236_if__(! (_SPARC__ || _AMD29K__) )
2237For each expression, @code{as} emits a 16-bit number.
2238_fi__(! (_SPARC__ || _AMD29K__) )
2239
2240_if__(_ALL_ARCH__)
2241The byte order of the expression depends on what kind of computer will
2242run the program.
2243_fi__(_ALL_ARCH__)
2244
2245@c on the 29k the "special treatment to support compilers" doesn't
2246@c happen---32-bit addressability, period; no long/short jumps.
2247_if__(!_AMD29K__)
47342e8f
RP
2248@subsection Special Treatment to support Compilers
2249
2250In order to assemble compiler output into something that will work,
@code{as} will occasionally do strange things to @samp{.word} directives.
2252Directives of the form @samp{.word sym1-sym2} are often emitted by
2253compilers as part of jump tables. Therefore, when @code{as} assembles a
2254directive of the form @samp{.word sym1-sym2}, and the difference between
2255@code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2256create a @dfn{secondary jump table}, immediately before the next label.
This secondary jump table will be preceded by a short-jump to the
2258first byte after the secondary table. This short-jump prevents the flow
2259of control from accidentally falling into the new table. Inside the
2260table will be a long-jump to @code{sym2}. The original @samp{.word}
2261will contain @code{sym1} minus the address of the long-jump to
2262@code{sym2}.
2263
2264If there were several occurrences of @samp{.word sym1-sym2} before the
2265secondary jump table, all of them will be adjusted. If there was a
2266@samp{.word sym3-sym4}, that also did not fit in sixteen bits, a
2267long-jump to @code{sym4} will be included in the secondary jump table,
2268and the @code{.word} directives will be adjusted to contain @code{sym3}
2269minus the address of the long-jump to @code{sym4}; and so on, for as many
2270entries in the original jump table as necessary.
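
Schematically, and using placeholder mnemonics rather than real
instructions (@code{short-jump} and @code{long-jump} stand for whatever
the target machine provides, and the labels @code{L1}, @code{L2} and
@code{next} are hypothetical), a source fragment such as

@example
        .word   sym1-sym2
        @dots{}
next:
@end example

@noindent
is treated roughly as if it had been written

@example
        .word   sym1-L1
        @dots{}
        short-jump L2           /* skip over the secondary table */
L1:     long-jump  sym2         /* the secondary jump table */
L2:
next:
@end example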
09352a5d
RP
2271
2272_if__(_INTERNALS__)
47342e8f
RP
2273@emph{This feature may be disabled by compiling @code{as} with the
2274@samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2275assembly language programmers.
09352a5d
RP
2276_fi__(_INTERNALS__)
2277_fi__(!_AMD29K__)
93b45514 2278
b50e59fe 2279@node Deprecated, Machine Dependent, Word, Pseudo Ops
93b45514
RP
2280@section Deprecated Directives
2281One day these directives won't work.
2282They are included for compatibility with older assemblers.
2283@table @t
2284@item .abort
b50e59fe 2285@item .app-file
93b45514
RP
2286@item .line
2287@end table
2288
b50e59fe 2289@node Machine Dependent, Machine Dependent, Pseudo Ops, Top
09352a5d
RP
2290_if__(_ALL_ARCH__)
2291@chapter Machine Dependent Features
2292_fi__(_ALL_ARCH__)
2293
2294_if__(_VAX__ && !_ALL_ARCH__)
2295@chapter Machine Dependent Features: VAX
2296_fi__(_VAX__ && !_ALL_ARCH__)
2297_if__(_ALL_ARCH__)
93b45514 2298@section Vax
09352a5d
RP
2299_fi__(_ALL_ARCH__)
2300_if__(_VAX__)
93b45514
RP
2301@subsection Options
2302
The Vax version of @code{as} accepts any of the following options,
gives a warning message that the option was ignored, and proceeds.
These options are for compatibility with scripts designed for other
people's assemblers.
2307
2308@table @asis
2309@item @kbd{-D} (Debug)
2310@itemx @kbd{-S} (Symbol Table)
2311@itemx @kbd{-T} (Token Trace)
2312These are obsolete options used to debug old assemblers.
2313
2314@item @kbd{-d} (Displacement size for JUMPs)
2315This option expects a number following the @kbd{-d}. Like options
2316that expect filenames, the number may immediately follow the
2317@kbd{-d} (old standard) or constitute the whole of the command line
2318argument that follows @kbd{-d} (GNU standard).
2319
2320@item @kbd{-V} (Virtualize Interpass Temporary File)
2321Some other assemblers use a temporary file. This option
2322commanded them to keep the information in active memory rather
2323than in a disk file. @code{as} always does this, so this
2324option is redundant.
2325
2326@item @kbd{-J} (JUMPify Longer Branches)
2327Many 32-bit computers permit a variety of branch instructions
2328to do the same job. Some of these instructions are short (and
2329fast) but have a limited range; others are long (and slow) but
2330can branch anywhere in virtual memory. Often there are 3
2331flavors of branch: short, medium and long. Some other
2332assemblers would emit short and medium branches, unless told by
2333this option to emit short and long branches.
2334
2335@item @kbd{-t} (Temporary File Directory)
2336Some other assemblers may use a temporary file, and this option
2337takes a filename being the directory to site the temporary
2338file. @code{as} does not use a temporary disk file, so this
2339option makes no difference. @kbd{-t} needs exactly one
2340filename.
2341@end table
2342
2343The Vax version of the assembler accepts two options when
2344compiled for VMS. They are @kbd{-h}, and @kbd{-+}. The
2345@kbd{-h} option prevents @code{as} from modifying the
2346symbol-table entries for symbols that contain lowercase
characters (I think). The @kbd{-+} option causes @code{as} to
print warning messages if the FILENAME part of the object file,
or any symbol name, is longer than 31 characters. The @kbd{-+}
option also inserts some code following the @samp{_main}
symbol so that the object file will be compatible with Vax-11
"C".
2353
2354@subsection Floating Point
2355Conversion of flonums to floating point is correct, and
2356compatible with previous assemblers. Rounding is
2357towards zero if the remainder is exactly half the least significant bit.
2358
2359@code{D}, @code{F}, @code{G} and @code{H} floating point formats
2360are understood.
2361
47342e8f 2362Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
93b45514
RP
2363are rendered correctly. Again, rounding is towards zero in the
2364boundary case.
2365
2366The @code{.float} directive produces @code{f} format numbers.
2367The @code{.double} directive produces @code{d} format numbers.
2368
2369@subsection Machine Directives
2370The Vax version of the assembler supports four directives for
2371generating Vax floating point constants. They are described in the
2372table below.
2373
2374@table @code
2375@item .dfloat
2376This expects zero or more flonums, separated by commas, and
2377assembles Vax @code{d} format 64-bit floating point constants.
2378
2379@item .ffloat
2380This expects zero or more flonums, separated by commas, and
2381assembles Vax @code{f} format 32-bit floating point constants.
2382
2383@item .gfloat
2384This expects zero or more flonums, separated by commas, and
2385assembles Vax @code{g} format 64-bit floating point constants.
2386
2387@item .hfloat
2388This expects zero or more flonums, separated by commas, and
2389assembles Vax @code{h} format 128-bit floating point constants.
2390
2391@end table
2392
2393@subsection Opcodes
2394All DEC mnemonics are supported. Beware that @code{case@dots{}}
2395instructions have exactly 3 operands. The dispatch table that
2396follows the @code{case@dots{}} instruction should be made with
2397@code{.word} statements. This is compatible with all unix
2398assemblers we know of.
2399
2400@subsection Branch Improvement
2401Certain pseudo opcodes are permitted. They are for branch
2402instructions. They expand to the shortest branch instruction that
2403will reach the target. Generally these mnemonics are made by
2404substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2405This feature is included both for compatibility and to help
2406compilers. If you don't need this feature, don't use these
2407opcodes. Here are the mnemonics, and the code they can expand into.
2408
2409@table @code
2410@item jbsb
2411@samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2412@table @asis
2413@item (byte displacement)
2414@kbd{bsbb @dots{}}
2415@item (word displacement)
2416@kbd{bsbw @dots{}}
2417@item (long displacement)
2418@kbd{jsb @dots{}}
2419@end table
2420@item jbr
2421@itemx jr
2422Unconditional branch.
2423@table @asis
2424@item (byte displacement)
2425@kbd{brb @dots{}}
2426@item (word displacement)
2427@kbd{brw @dots{}}
2428@item (long displacement)
2429@kbd{jmp @dots{}}
2430@end table
2431@item j@var{COND}
2432@var{COND} may be any one of the conditional branches
2433@code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2434@var{COND} may also be one of the bit tests
2435@code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2436@var{NOTCOND} is the opposite condition to @var{COND}.
2437@table @asis
2438@item (byte displacement)
2439@kbd{b@var{COND} @dots{}}
2440@item (word displacement)
@kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
@item (long displacement)
@kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2444@end table
2445@item jacb@var{X}
2446@var{X} may be one of @code{b d f g h l w}.
2447@table @asis
2448@item (word displacement)
2449@kbd{@var{OPCODE} @dots{}}
2450@item (long displacement)
2451@kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2452@end table
2453@item jaob@var{YYY}
2454@var{YYY} may be one of @code{lss leq}.
2455@item jsob@var{ZZZ}
2456@var{ZZZ} may be one of @code{geq gtr}.
2457@table @asis
2458@item (byte displacement)
2459@kbd{@var{OPCODE} @dots{}}
2460@item (word displacement)
2461@kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2462@item (long displacement)
2463@kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2464@end table
2465@item aobleq
2466@itemx aoblss
2467@itemx sobgeq
2468@itemx sobgtr
2469@table @asis
2470@item (byte displacement)
2471@kbd{@var{OPCODE} @dots{}}
2472@item (word displacement)
2473@kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2474@item (long displacement)
2475@kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
2476@end table
2477@end table
2478
@subsection Operands
2480The immediate character is @samp{$} for Unix compatibility, not
2481@samp{#} as DEC writes it.
2482
2483The indirect character is @samp{*} for Unix compatibility, not
2484@samp{@@} as DEC writes it.
2485
2486The displacement sizing character is @samp{`} (an accent grave) for
2487Unix compatibility, not @samp{^} as DEC writes it. The letter
2488preceding @samp{`} may have either case. @samp{G} is not
2489understood, but all other letters (@code{b i l s w}) are understood.
2490
2491Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2492pc}. Any case of letters will do.
2493
2494For instance
2495@example
2496tstb *w`$4(r5)
2497@end example
2498
2499Any expression is permitted in an operand. Operands are comma
2500separated.
2501
2502@c There is some bug to do with recognizing expressions
2503@c in operands, but I forget what it is. It is
2504@c a syntax clash because () is used as an address mode
2505@c and to encapsulate sub-expressions.
2506@subsection Not Supported
Vax bit fields cannot be assembled with @code{as}. Someone
can add the required code if they really need it.
09352a5d 2509_fi__(_VAX__)
93b45514 2510
09352a5d
RP
2511_if__(_AMD29K__ && !_ALL_ARCH__)
2512@chapter Machine Dependent Features: AMD 29K
2513_fi__(_AMD29K__ && !_ALL_ARCH__)
2514_if__(_AMD29K__)
b50e59fe
RP
2515@node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
2516@section Options
2517GNU @code{as} has no additional command-line options for the AMD
251829K family.
2519
2520@node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2521@section Syntax
2522@subsection Special Characters
2523@samp{;} is the line comment character.
2524
2525@samp{@@} can be used instead of a newline to separate statements.
2526
2527The character @samp{?} is permitted in identifiers (but may not begin
2528an identifier).
2529
2530@subsection Register Names
2531General-purpose registers are represented by predefined symbols of the
2532form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2533(for local registers), where @var{nnn} represents a number between
2534@code{0} and @code{127}, written with no leading zeros. The leading
2535letters may be in either upper or lower case; for example, @samp{gr13}
2536and @samp{LR7} are both valid register names.
2537
2538You may also refer to general-purpose registers by specifying the
2539register number as the result of an expression (prefixed with @samp{%%}
2540to flag the expression as a register number):
2541@example
2542%%@var{expression}
2543@end example
2544@noindent---where @var{expression} must be an absolute expression
2545evaluating to a number between @code{0} and @code{255}. The range
2546[0, 127] refers to global registers, and the range [128, 255] to local
2547registers.
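
For instance, assuming the usual three-operand form of the 29K
@code{add} instruction, the two statements in this sketch should name
the same registers, since 131 is 128 + 3:

@example
        add     gr96, gr97, lr3
        add     %%96, %%97, %%131
@end example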
2548
2549In addition, GNU @code{as} understands the following protected
2550special-purpose register names for the AMD 29K family:
2551
2552@example
2553 vab chd pc0
2554 ops chc pc1
2555 cps rbp pc2
2556 cfg tmc mmu
2557 cha tmr lru
2558@end example
2559
2560These unprotected special-purpose register names are also recognized:
2561@example
2562 ipc alu fpe
2563 ipa bp inte
2564 ipb fc fps
2565 q cr exop
2566@end example
2567
2568@node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2569@section Floating Point
2570The AMD 29K family uses IEEE floating-point numbers.
2571
2572@node Machine Directives, Opcodes, Floating Point, Machine Dependent
2573@section Machine Directives
2574
2575@menu
2576* block:: @code{.block @var{size} , @var{fill}}
2577* cputype:: @code{.cputype}
2578* file:: @code{.file}
2579* hword:: @code{.hword @var{expressions}}
2580* line:: @code{.line}
2581* reg:: @code{.reg @var{symbol}, @var{expression}}
2582* sect:: @code{.sect}
2583* use:: @code{.use @var{segment name}}
2584@end menu
2585
2586@node block, cputype, Machine Directives, Machine Directives
2587@subsection @code{.block @var{size} , @var{fill}}
2588This directive emits @var{size} bytes, each of value @var{fill}. Both
2589@var{size} and @var{fill} are absolute expressions. If the comma
2590and @var{fill} are omitted, @var{fill} is assumed to be zero.
2591
2592In other versions of GNU @code{as}, this directive is called
2593@samp{.space}.
2594
2595@node cputype, file, block, Machine Directives
2596@subsection @code{.cputype}
2597This directive is ignored; it is accepted for compatibility with other
2598AMD 29K assemblers.
2599
2600@node file, hword, cputype, Machine Directives
2601@subsection @code{.file}
2602This directive is ignored; it is accepted for compatibility with other
2603AMD 29K assemblers.
2604
2605@quotation
2606@emph{Warning:} in other versions of GNU @code{as}, @code{.file} is
2607used for the directive called @code{.app-file} in the AMD 29K support.
2608@end quotation
2609
2610@node hword, line, file, Machine Directives
2611@subsection @code{.hword @var{expressions}}
2612This expects zero or more @var{expressions}, and emits
2613a 16 bit number for each. (Synonym for @samp{.short}.)
2614
2615@node line, reg, hword, Machine Directives
2616@subsection @code{.line}
2617This directive is ignored; it is accepted for compatibility with other
2618AMD 29K assemblers.
2619
2620@node reg, sect, line, Machine Directives
2621@subsection @code{.reg @var{symbol}, @var{expression}}
2622@code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2623
2624@node sect, use, reg, Machine Directives
2625@subsection @code{.sect}
2626This directive is ignored; it is accepted for compatibility with other
2627AMD 29K assemblers.
2628
2629@node use, , sect, Machine Directives
2630@subsection @code{.use @var{segment name}}
2631Establishes the segment and subsegment for the following code;
2632@var{segment name} may be one of @code{.text}, @code{.data},
2633@code{.data1}, or @code{.lit}. With one of the first three @var{segment
2634name} options, @samp{.use} is equivalent to the machine directive
2635@var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2636@samp{.data 200}.
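
A brief sketch of two of these equivalences:

@example
        .use    .text           /* same as .text */
        .use    .lit            /* same as .data 200 */
@end example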
2637
2638
2639@node Opcodes, Opcodes, Machine Directives, Machine Dependent
2640@section Opcodes
2641GNU @code{as} implements all the standard AMD 29K opcodes. No
2642additional pseudo-instructions are needed on this family.
2643
2644For information on the 29K machine instruction set, see @cite{Am29000
2645User's Manual}, Advanced Micro Devices, Inc.
2646
2647
09352a5d
RP
2648_fi__(_AMD29K__)
2649_if__(_M680X0__ && !_ALL_ARCH__)
2650@chapter Machine Dependent Features: Motorola 680x0
2651_fi__(_M680X0__ && !_ALL_ARCH__)
2652_if__(_M680X0__)
47342e8f 2653@section Options
93b45514
RP
2654The 680x0 version of @code{as} has two machine dependent options.
2655One shortens undefined references from 32 to 16 bits, while the
2656other is used to tell @code{as} what kind of machine it is
2657assembling for.
2658
2659You can use the @kbd{-l} option to shorten the size of references to
47342e8f
RP
2660undefined symbols. If the @kbd{-l} option is not given, references to
2661undefined symbols will be a full long (32 bits) wide. (Since @code{as}
2662cannot know where these symbols will end up, @code{as} can only allocate
2663space for the linker to fill in later. Since @code{as} doesn't know how
2664far away these symbols will be, it allocates as much space as it can.)
2665If this option is given, the references will only be one word wide (16
2666bits). This may be useful if you want the object file to be as small as
2667possible, and you know that the relevant symbols will be less than 17
2668bits away.
2669
2670The 680x0 version of @code{as} is most frequently used to assemble
2671programs for the Motorola MC68020 microprocessor. Occasionally it is
2672used to assemble programs for the mostly similar, but slightly different
2673MC68000 or MC68010 microprocessors. You can give @code{as} the options
2674@samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2675@samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2676target.
2677
2678@section Syntax
2679
2680The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2681Size modifiers are appended directly to the end of the opcode without an
2682intervening period. For example, write @samp{movl} rather than
2683@samp{move.l}.
2684
09352a5d 2685_if__(_INTERNALS__)
47342e8f
RP
2686If @code{as} is compiled with SUN_ASM_SYNTAX defined, it will also allow
Sun-style local labels of the form @samp{1$} through @samp{9$}.
09352a5d 2688_fi__(_INTERNALS__)
93b45514
RP
2689
2690In the following table @dfn{apc} stands for any of the address
2691registers (@samp{a0} through @samp{a7}), nothing, (@samp{}), the
2692Program Counter (@samp{pc}), or the zero-address relative to the
2693program counter (@samp{zpc}).
2694
2695The following addressing modes are understood:
2696@table @dfn
2697@item Immediate
2698@samp{#@var{digits}}
2699
2700@item Data Register
2701@samp{d0} through @samp{d7}
2702
2703@item Address Register
2704@samp{a0} through @samp{a7}
2705
2706@item Address Register Indirect
2707@samp{a0@@} through @samp{a7@@}
2708
2709@item Address Register Postincrement
2710@samp{a0@@+} through @samp{a7@@+}
2711
2712@item Address Register Predecrement
2713@samp{a0@@-} through @samp{a7@@-}
2714
2715@item Indirect Plus Offset
2716@samp{@var{apc}@@(@var{digits})}
2717
2718@item Index
2719@samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2720or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2721
2722@item Postindex
2723@samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2724or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2725
2726@item Preindex
2727@samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2728or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2729
2730@item Memory Indirect
2731@samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2732
2733@item Absolute
47342e8f 2734@samp{@var{symbol}}, or @samp{@var{digits}}
09352a5d 2735@ignore
47342e8f
RP
2736@c [email protected]: gnu, rich concur the following needs careful
2737@c research before documenting.
2738 , or either of the above followed
93b45514 2739by @samp{:b}, @samp{:w}, or @samp{:l}.
09352a5d 2740@end ignore
93b45514
RP
2741@end table
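
A few of these modes in use (a sketch only; the registers and offsets
are chosen arbitrarily):

@example
        movl    #4,d0             /* immediate */
        movl    a0@@,d1           /* address register indirect */
        movl    a0@@+,d1          /* postincrement */
        movl    d1,a1@@-          /* predecrement */
        movl    a6@@(8),d0        /* indirect plus offset */
        movl    a0@@(4,d1:l:2),d0 /* index */
@end example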
2742
47342e8f 2743@section Floating Point
93b45514
RP
2744The floating point code is not too well tested, and may have
2745subtle bugs in it.
2746
2747Packed decimal (P) format floating literals are not supported.
47342e8f 2748Feel free to add the code!
93b45514
RP
2749
2750The floating point formats generated by directives are these.
2751@table @code
2752@item .float
2753@code{Single} precision floating point constants.
2754@item .double
2755@code{Double} precision floating point constants.
2756@end table
2757
There is no directive to produce regions of memory holding
extended precision numbers; however, they can be used as
immediate operands to floating-point instructions. Adding a
directive to create extended precision numbers would not be
hard, but it has not yet seemed necessary.
93b45514 2763
47342e8f 2764@section Machine Directives
93b45514
RP
2765In order to be compatible with the Sun assembler the 680x0 assembler
2766understands the following directives.
2767@table @code
2768@item .data1
2769This directive is identical to a @code{.data 1} directive.
2770@item .data2
2771This directive is identical to a @code{.data 2} directive.
2772@item .even
2773This directive is identical to a @code{.align 1} directive.
2774@c Is this true? does it work???
2775@item .skip
2776This directive is identical to a @code{.space} directive.
2777@end table
2778
47342e8f
RP
2779@section Opcodes
2780@c [email protected]: I don't see any point in the following
2781@c paragraph. Bugs are bugs; how does saying this
2782@c help anyone?
09352a5d 2783@ignore
93b45514
RP
2784Danger: Several bugs have been found in the opcode table (and
2785fixed). More bugs may exist. Be careful when using obscure
2786instructions.
09352a5d 2787@end ignore
47342e8f
RP
2788
2789@subsection Branch Improvement
2790
2791Certain pseudo opcodes are permitted for branch instructions.
2792They expand to the shortest branch instruction that will reach the
2793target. Generally these mnemonics are made by substituting @samp{j} for
2794@samp{b} at the start of a Motorola mnemonic.
2795
2796The following table summarizes the pseudo-operations. A @code{*} flags
2797cases that are more fully described after the table:
2798
2799@example
2800 Displacement
2801 +---------------------------------------------------------
2802 | 68020 68000/10
2803Pseudo-Op |BYTE WORD LONG LONG non-PC relative
2804 +---------------------------------------------------------
2805 jbsr |bsrs bsr bsrl jsr jsr
2806 jra |bras bra bral jmp jmp
2807* jXX |bXXs bXX bXXl bNXs;jmpl bNXs;jmp
2808* dbXX |dbXX dbXX dbXX; bra; jmpl
2809* fjXX |fbXXw fbXXw fbXXl fbNXw;jmp
2810
2811XX: condition
2812NX: negative of condition XX
2813
2814@end example
2815@center{@code{*}---see full description below}
2816
2817@table @code
2818@item jbsr
2819@itemx jra
2820These are the simplest jump pseudo-operations; they always map to one
2821particular machine instruction, depending on the displacement to the
2822branch target.
2823
2824@item j@var{XX}
2825Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2826where @var{XX} is a conditional branch or condition-code test. The full
2827list of pseudo-ops in this family is:
2828@example
2829 jhi jls jcc jcs jne jeq jvc
2830 jvs jpl jmi jge jlt jgt jle
2831@end example
93b45514 2832
47342e8f
RP
2833For the cases of non-PC relative displacements and long displacements on
2834the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2835@var{NX}, the opposite condition to @var{XX}:
2836@example
2837 j@var{XX} foo
2838@end example
2839gives
2840@example
2841 b@var{NX}s oof
2842 jmp foo
2843 oof:
2844@end example
93b45514 2845
47342e8f
RP
2846@item db@var{XX}
2847The full family of pseudo-operations covered here is
2848@example
2849 dbhi dbls dbcc dbcs dbne dbeq dbvc
2850 dbvs dbpl dbmi dbge dblt dbgt dble
2851 dbf dbra dbt
2852@end example
2853
2854Other than for word and byte displacements, when the source reads
2855@samp{db@var{XX} foo}, @code{as} will emit
2856@example
2857 db@var{XX} oo1
2858 bra oo2
2859 oo1:jmpl foo
2860 oo2:
2861@end example
2862
2863@item fj@var{XX}
2864This family includes
2865@example
2866 fjne fjeq fjge fjlt fjgt fjle fjf
2867 fjt fjgl fjgle fjnge fjngl fjngle fjngt
2868 fjnle fjnlt fjoge fjogl fjogt fjole fjolt
2869 fjor fjseq fjsf fjsne fjst fjueq fjuge
2870 fjugt fjule fjult fjun
2871@end example
2872
2873For branch targets that are not PC relative, @code{as} emits
2874@example
2875 fb@var{NX} oof
2876 jmp foo
2877 oof:
2878@end example
2879when it encounters @samp{fj@var{XX} foo}.
2880
2881@end table
2882
2883@subsection Special Characters
93b45514
RP
2884The immediate character is @samp{#} for Sun compatibility. The
2885line-comment character is @samp{|}. If a @samp{#} appears at the
2886beginning of a line, it is treated as a comment unless it looks like
2887@samp{# line file}, in which case it is treated normally.
09352a5d 2888_fi__(_M680X0__)
93b45514 2889
09352a5d 2890@c [email protected]: conditionalize, rather than ignore, when filled in.
47342e8f 2891@ignore
93b45514 2892@section 32x32
47342e8f 2893@section Options
93b45514
RP
The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
specify that it is compiling for a 32032 processor, or a
@kbd{-m32532} to specify that it is compiling for a 32532 processor.
2897The default (if neither is specified) is chosen when the assembler
2898is compiled.
2899
2900@subsection Syntax
I don't know anything about the 32x32 syntax assembled by
@code{as}. Someone who understands the processor (I've never seen
one) and the possible syntaxes should write this section.
2904
2905@subsection Floating Point
2906The 32x32 uses IEEE floating point numbers, but @code{as} will only
2907create single or double precision values. I don't know if the 32x32
2908understands extended precision numbers.
2909
2910@subsection Machine Directives
2911The 32x32 has no machine dependent directives.
09352a5d 2912@end ignore
93b45514 2913
09352a5d
RP
2914@c [email protected]: stop ignoring this when "syntax" section filled in
2915@ignore
2916_if__(_SPARC__ && !_ALL_ARCH__)
2917@chapter Machine Dependent Features: SPARC
2918_fi__(_SPARC__ && !_ALL_ARCH__)
93b45514
RP
2919@section Sparc
2920@subsection Options
2921The sparc has no machine dependent options.
2922
2923@subsection syntax
2924I don't know anything about Sparc syntax. Someone who does
2925will have to write this section.
2926
2927@subsection Floating Point
The Sparc uses IEEE floating-point numbers.
2929
2930@subsection Machine Directives
2931The Sparc version of @code{as} supports the following additional
2932machine directives:
2933
2934@table @code
2935@item .common
2936This must be followed by a symbol name, a positive number, and
2937@code{"bss"}. This behaves somewhat like @code{.comm}, but the
2938syntax is different.
2939
2940@item .global
2941This is functionally identical to @code{.globl}.
2942
2943@item .half
2944This is functionally identical to @code{.short}.
2945
2946@item .proc
2947This directive is ignored. Any text following it on the same
2948line is also ignored.
2949
2950@item .reserve
2951This must be followed by a symbol name, a positive number, and
2952@code{"bss"}. This behaves somewhat like @code{.lcomm}, but the
2953syntax is different.
2954
2955@item .seg
2956This must be followed by @code{"text"}, @code{"data"}, or
2957@code{"data1"}. It behaves like @code{.text}, @code{.data}, or
2958@code{.data 1}.
2959
2960@item .skip
This is functionally identical to the @code{.space} directive.
2962
2963@item .word
On the Sparc, the @code{.word} directive produces 32-bit values,
instead of the 16-bit values it produces on every other machine.
2966
2967@end table
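
As a rough sketch, the directives above might be used together as
follows. The symbol names and sizes are hypothetical, and both the comma
separated operand form for @code{.common} and @code{.reserve} and the
@samp{!} comment character are assumptions, not something this section
specifies.
@example
        .seg    "text"                  ! like .text
        .global start                   ! like .globl start
start:
        .proc   4                       ! ignored, with the rest of the line
        .seg    "data"                  ! like .data
        .half   1                       ! like .short 1
        .word   0x12345678              ! a 32-bit value on the Sparc
        .skip   8                       ! like .space 8
        .common shared, 64, "bss"       ! somewhat like .comm
        .reserve scratch, 16, "bss"     ! somewhat like .lcomm
@end example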
@end ignore

2970_if__(_I80386__ && !_ALL_ARCH__)
@chapter Machine Dependent Features: Intel 80386
2972_fi__(_I80386__ && !_ALL_ARCH__)
2973_if__(_I80386__)
2974@section Intel 80386
2975@subsection Options
2976The 80386 has no machine dependent options.
2977
2978@subsection AT&T Syntax versus Intel Syntax
2979In order to maintain compatibility with the output of @code{GCC},
2980@code{as} supports AT&T System V/386 assembler syntax. This is quite
2981different from Intel syntax. We mention these differences because
almost all 80386 documents use only Intel syntax. Notable differences
2983between the two syntaxes are:
2984@itemize @bullet
2985@item
2986AT&T immediate operands are preceded by @samp{$}; Intel immediate
2987operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2988AT&T register operands are preceded by @samp{%}; Intel register operands
2989are undelimited. AT&T absolute (as opposed to PC relative) jump/call
2990operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2991
2992@item
2993AT&T and Intel syntax use the opposite order for source and destination
operands. Intel @samp{add eax, 4} is AT&T @samp{addl $4, %eax}. The
2995@samp{source, dest} convention is maintained for compatibility with
2996previous Unix assemblers.
2997
2998@item
2999In AT&T syntax the size of memory operands is determined from the last
3000character of the opcode name. Opcode suffixes of @samp{b}, @samp{w},
3001and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
memory references. Intel syntax accomplishes this by prefixing memory
3003operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
3004@samp{word ptr}, and @samp{dword ptr}. Thus, Intel @samp{mov al, byte
3005ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
3006
3007@item
3008Immediate form long jumps and calls are
3009@samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
3010Intel syntax is
3011@samp{call/jmp far @var{segment}:@var{offset}}. Also, the far return
3012instruction
3013is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3014@samp{ret far @var{stack-adjust}}.
3015
3016@item
3017The AT&T assembler does not provide support for multiple segment
3018programs. Unix style systems expect all programs to be single segments.
3019@end itemize
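
The differences listed above can be seen side by side in a short sketch.
The symbol @samp{foo} and the segment, offset, and stack-adjust values
are hypothetical, chosen only for illustration; the comments (introduced
here with @samp{#}, assumed to be the comment character) give the Intel
form of each line.
@example
        pushl   $4              # Intel: push 4
        addl    $4, %eax        # Intel: add eax, 4
        movb    foo, %al        # Intel: mov al, byte ptr foo
        jmp     *%eax           # Intel: jmp eax
        lcall   $8, $4096       # Intel: call far 8:4096
        lret    $4              # Intel: ret far 4
@end example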
3020
3021@subsection Opcode Naming
3022Opcode names are suffixed with one character modifiers which specify the
3023size of operands. The letters @samp{b}, @samp{w}, and @samp{l} specify
3024byte, word, and long operands. If no suffix is specified by an
3025instruction and it contains no memory operands then @code{as} tries to
3026fill in the missing suffix based on the destination register operand
3027(the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
3028to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3029@samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix
3030assembler which assumes that a missing opcode suffix implies long
3031operand size. (This incompatibility does not affect compiler output
3032since compilers always explicitly specify the opcode suffix.)
3033
3034Almost all opcodes have the same names in AT&T and Intel format. There
3035are a few exceptions. The sign extend and zero extend instructions need
two sizes to specify them. They need a size to sign/zero extend
@emph{from} and a size to sign/zero extend @emph{to}. This is accomplished
3038by using two opcode suffixes in AT&T syntax. Base names for sign extend
3039and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3040syntax (@samp{movsx} and @samp{movzx} in Intel syntax). The opcode
3041suffixes are tacked on to this base name, the @emph{from} suffix before
3042the @emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3043``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
3044thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3045and @samp{wl} (from word to long).
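
As a rough illustration of the two-suffix convention (the symbol
@samp{foo} is hypothetical, and @samp{#} is assumed to start a comment):
@example
        movsbl  %al, %edx       # sign extend byte %al to long %edx
        movsbw  %al, %dx        # sign extend byte %al to word %dx
        movzwl  %cx, %eax       # zero extend word %cx to long %eax
        movzbl  foo, %eax       # zero extend the byte at foo to long %eax
@end example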
3046
3047The Intel syntax conversion instructions
3048@itemize @bullet
3049@item
3050@samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3051@item
3052@samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3053@item
3054@samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3055@item
3056@samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3057@end itemize
3058are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3059AT&T naming. @code{as} accepts either naming for these instructions.
3060
Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
convention.
3064
3065@subsection Register Naming
Register operands are always prefixed with @samp{%}. The 80386 registers
3067consist of
3068@itemize @bullet
3069@item
3070the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3071@samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3072frame pointer), and @samp{%esp} (the stack pointer).
3073
3074@item
3075the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3076@samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
3077
3078@item
the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
@samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl}. (These
are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
@samp{%cx}, and @samp{%dx}.)
3083
3084@item
3085the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3086(data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs},
3087and @samp{%gs}.
3088
3089@item
3090the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3091@samp{%cr3}.
3092
3093@item
3094the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3095@samp{%db3}, @samp{%db6}, and @samp{%db7}.
3096
3097@item
3098the 2 test registers @samp{%tr6} and @samp{%tr7}.
3099
3100@item
the 8 floating point stack registers @samp{%st} or equivalently
3102@samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3103@samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
3104@end itemize
3105
3106@subsection Opcode Prefixes
3107Opcode prefixes are used to modify the following opcode. They are used
3108to repeat string instructions, to provide segment overrides, to perform
3109bus lock operations, and to give operand and address size (16-bit
3110operands are specified in an instruction by prefixing what would
normally be 32-bit operands with an ``operand size'' opcode prefix).
3112Opcode prefixes are usually given as single-line instructions with no
3113operands, and must directly precede the instruction they act upon. For
3114example, the @samp{scas} (scan string) instruction is repeated with:
3115@example
3116 repne
3117 scas
3118@end example
3119
3120Here is a list of opcode prefixes:
3121@itemize @bullet
3122@item
Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
@samp{fs}, @samp{gs}. These are added automatically when you use
the @var{segment}:@var{memory-operand} form for memory references.
3126
3127@item
3128Operand/Address size prefixes @samp{data16} and @samp{addr16}
3129change 32-bit operands/addresses into 16-bit operands/addresses. Note
3130that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3131are not supported (yet).
3132
3133@item
3134The bus lock prefix @samp{lock} inhibits interrupts during
execution of the instruction it precedes. (This is only valid with
certain instructions; see an 80386 manual for details.)
3137
3138@item
3139The wait for coprocessor prefix @samp{wait} waits for the
3140coprocessor to complete the current instruction. This should never be
3141needed for the 80386/80387 combination.
3142
3143@item
3144The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3145to string instructions to make them repeat @samp{%ecx} times.
3146@end itemize
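
A minimal sketch of how these prefixes might appear in source, assuming
each explicit prefix is written on its own line directly before the
instruction it modifies, as described above. The register choices are
hypothetical and @samp{#} is assumed to start a comment.
@example
        lock
        incl    (%ebx)            # locked increment of the long at (%ebx)
        rep
        movsl                     # repeat the long string move %ecx times
        movl    %es:(%ebx), %eax  # the es override prefix is added by as
@end example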
3147
3148@subsection Memory References
3149An Intel syntax indirect memory reference of the form
3150@example
3151@var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3152@end example
3153is translated into the AT&T syntax
3154@example
3155@var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3156@end example
3157where @var{base} and @var{index} are the optional 32-bit base and
3158index registers, @var{disp} is the optional displacement, and
3159@var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3160to calculate the address of the operand. If no @var{scale} is
3161specified, @var{scale} is taken to be 1. @var{segment} specifies the
3162optional segment register for the memory operand, and may override the
default segment register (see an 80386 manual for segment register
defaults). Note that segment overrides in AT&T syntax @emph{must}
be preceded by a @samp{%}. If you specify a segment override which
3166coincides with the default segment register, @code{as} will @emph{not}
3167output any segment register override prefixes to assemble the given
3168instruction. Thus, segment overrides can be specified to emphasize which
3169segment register is used for a given memory operand.
3170
3171Here are some examples of Intel and AT&T style memory references:
3172@table @asis
3173
3174@item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
3175@var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3176missing, and the default segment is used (@samp{%ss} for addressing with
3177@samp{%ebp} as the base register). @var{index}, @var{scale} are both missing.
3178
3179@item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3180@var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
3181@samp{foo}. All other fields are missing. The segment register here
3182defaults to @samp{%ds}.
3183
@item AT&T: @samp{foo(,1)}, Intel: @samp{[foo]}
3185This uses the value pointed to by @samp{foo} as a memory operand.
3186Note that @var{base} and @var{index} are both missing, but there is only
3187@emph{one} @samp{,}. This is a syntactic exception.
3188
@item AT&T: @samp{%gs:foo}, Intel: @samp{gs:foo}
3190This selects the contents of the variable @samp{foo} with segment
3191register @var{segment} being @samp{%gs}.
3192
3193@end table
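
Used as operands of complete instructions, such references might look
like the following sketch. The registers and the displacement are
hypothetical, and @samp{#} is assumed to start a comment.
@example
        movl    %es:8(%ebx,%esi,4), %eax  # Intel: mov eax, es:[ebx + esi*4 + 8]
        movl    (%ebx), %eax              # Intel: mov eax, [ebx]
        leal    (%eax,%eax,2), %edx       # Intel: lea edx, [eax + eax*2]
@end example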
3194
3195Absolute (as opposed to PC relative) call and jump operands must be
3196prefixed with @samp{*}. If no @samp{*} is specified, @code{as} will
3197always choose PC relative addressing for jump/call labels.
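
For example (a sketch only; the label @samp{foo} is hypothetical and
@samp{#} is assumed to start a comment):
@example
        jmp     foo             # PC relative jump to the label foo
        jmp     *%eax           # absolute jump to the address in %eax
        call    *foo            # absolute call to the address stored at foo
@end example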
3198
3199Any instruction that has a memory operand @emph{must} specify its size (byte,
3200word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3201respectively).
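
For instance, an immediate store gives @code{as} no register from which
to infer a size, so the suffix carries it. This is a sketch with a
hypothetical symbol @samp{foo}; @samp{#} is assumed to start a comment.
@example
        movl    $1, foo         # store a long (32 bits) at foo
        movw    $1, foo         # store a word (16 bits) at foo
        incb    (%ebx)          # increment the byte at (%ebx)
@end example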
3202
3203@subsection Handling of Jump Instructions
3204Jump instructions are always optimized to use the smallest possible
3205displacements. This is accomplished by using byte (8-bit) displacement
3206jumps whenever the target is sufficiently close. If a byte displacement
3207is insufficient a long (32-bit) displacement is used. We do not support
3208word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3209with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3210@samp{%eip} to 16 bits after the word displacement is added.
3211
3212Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3213@samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3214byte displacements, so that it is possible that use of these
3215instructions (@code{GCC} does not use them) will cause the assembler to
3216print an error message (and generate incorrect code). The AT&T 80386
3217assembler tries to get around this problem by expanding @samp{jcxz foo} to
3218@example
3219 jcxz cx_zero
3220 jmp cx_nonzero
3221cx_zero: jmp foo
3222cx_nonzero:
3223@end example
3224
3225@subsection Floating Point
3226All 80387 floating point types except packed BCD are supported.
3227(BCD support may be added without much difficulty). These data
3228types are 16-, 32-, and 64- bit integers, and single (32-bit),
3229double (64-bit), and extended (80-bit) precision floating point.
3230Each supported type has an opcode suffix and a constructor
associated with it. Opcode suffixes specify operands' data
3232types. Constructors build these data types into memory.
3233
3234@itemize @bullet
3235@item
3236Floating point constructors are @samp{.float} or @samp{.single},
3237@samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3238These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
@samp{t} stands for temporary real; note that the 80387 only supports
3240this format via the @samp{fldt} (load temporary real to stack top) and
3241@samp{fstpt} (store temporary real and pop stack) instructions.
3242
3243@item
3244Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3245@samp{.quad} for the 16-, 32-, and 64-bit integer formats. The corresponding
3246opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3247(quad). As with the temporary real format the 64-bit @samp{q} format is
3248only present in the @samp{fildq} (load quad integer to stack top) and
3249@samp{fistpq} (store quad integer and pop stack) instructions.
3250@end itemize
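
The pairing of constructors and opcode suffixes can be sketched as
follows. The labels and values are hypothetical, and @samp{#} is assumed
to start a comment.
@example
        .data
half:   .float  0.5             # 32-bit single, loaded with suffix s
pi:     .double 3.14159265      # 64-bit double, loaded with suffix l
big:    .tfloat 1.0e100         # 80-bit temporary real, suffix t
count:  .long   42              # 32-bit integer, suffix l
total:  .quad   0               # 64-bit integer, suffix q
        .text
        flds    half            # load single
        fldl    pi              # load double
        fldt    big             # load temporary real to stack top
        fildl   count           # load long integer
        fistpq  total           # store quad integer and pop stack
@end example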
3251
3252Register to register operations do not require opcode suffixes,
3253so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
3254
Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
instructions are almost never needed (this is not the case for the
80286/80287 and 8086/8087 combinations). Therefore, @code{as} suppresses
the @samp{fwait} instruction whenever it is implicitly selected by one
3259of the @samp{fn@dots{}} instructions. For example, @samp{fsave} and
3260@samp{fnsave} are treated identically. In general, all the @samp{fn@dots{}}
3261instructions are made equivalent to @samp{f@dots{}} instructions. If
3262@samp{fwait} is desired it must be explicitly coded.
3263
3264@subsection Notes
3265There is some trickery concerning the @samp{mul} and @samp{imul}
3266instructions that deserves mention. The 16-, 32-, and 64-bit expanding
3267multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3268for @samp{imul}) can be output only in the one operand form. Thus,
3269@samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3270the expanding multiply would clobber the @samp{%edx} register, and this
3271would confuse @code{GCC} output. Use @samp{imul %ebx} to get the
327264-bit product in @samp{%edx:%eax}.
3273
3274We have added a two operand form of @samp{imul} when the first operand
3275is an immediate mode expression and the second operand is a register.
This is just a shorthand, so that multiplying @samp{%eax} by 69, for
3277example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3278$69, %eax, %eax}.
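
The forms discussed in these notes, collected in one sketch (@samp{#} is
assumed to start a comment):
@example
        imul    %ebx            # expanding: %edx:%eax = %eax * %ebx
        imul    %ebx, %eax      # %eax = %eax * %ebx; %edx is not clobbered
        imul    $69, %eax       # shorthand for the line below
        imul    $69, %eax, %eax # %eax = %eax * 69
@end example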
3279_fi__(_I80386__)
3280
3281
3282@c [email protected]: we ignore the following chapters, since internals are
3283@c changing rapidly. These may need to be moved to another
3284@c book anyhow, if we adopt the model of user/modifier
3285@c books.
3286@ignore
@node Maintenance, Retargeting, Machine Dependent, Top
3288@chapter Maintaining the Assembler
3289[[this chapter is still being built]]
3290
3291@section Design
3292We had these goals, in descending priority:
3293@table @b
3294@item Accuracy.
3295For every program composed by a compiler, @code{as} should emit
3296``correct'' code. This leaves some latitude in choosing addressing
3297modes, order of @code{relocation_info} structures in the object
file, @emph{etc}.
3299
3300@item Speed, for usual case.
3301By far the most common use of @code{as} will be assembling compiler
3302emissions.
3303
3304@item Upward compatibility for existing assembler code.
3305Well @dots{} we don't support Vax bit fields but everything else
3306seems to be upward compatible.
3307
3308@item Readability.
3309The code should be maintainable with few surprises. (JF: ha!)
3310
3311@end table
3312
3313We assumed that disk I/O was slow and expensive while memory was
3314fast and access to memory was cheap. We expect the in-memory data
3315structures to be less than 10 times the size of the emitted object
3316file. (Contrast this with the C compiler where in-memory structures
3317might be 100 times object file size!)
3318This suggests:
3319@itemize @bullet
3320@item
3321Try to read the source file from disk only one time. For other
3322reasons, we keep large chunks of the source file in memory during
3323assembly so this is not a problem. Also the assembly algorithm
3324should only scan the source text once if the compiler composed the
3325text according to a few simple rules.
3326@item
3327Emit the object code bytes only once. Don't store values and then
3328backpatch later.
3329@item
3330Build the object file in memory and do direct writes to disk of
3331large buffers.
3332@end itemize
3333
3334RMS suggested a one-pass algorithm which seems to work well. By not
3335parsing text during a second pass considerable time is saved on
large programs (@emph{e.g.} the sort of C program @code{yacc} would
3337emit).
3338
3339It happened that the data structures needed to emit relocation
3340information to the object file were neatly subsumed into the data
3341structures that do backpatching of addresses after pass 1.
3342
3343Many of the functions began life as re-usable modules, loosely
3344connected. RMS changed this to gain speed. For example, input
3345parsing routines which used to work on pre-sanitized strings now
3346must parse raw data. Hence they have to import knowledge of the
assemblers' comment conventions @emph{etc}.
3348
3349@section Deprecated Feature(?)s
3350We have stopped supporting some features:
3351@itemize @bullet
3352@item
3353@code{.org} statements must have @b{defined} expressions.
3354@item
3355Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3356@end itemize
3357
3358It might be a good idea to not support these features in a future release:
3359@itemize @bullet
3360@item
3361@kbd{#} should begin a comment, even in column 1.
3362@item
3363Why support the logical line & file concept any more?
3364@item
3365Subsegments are a good candidate for flushing.
3366Depends on which compilers need them I guess.
3367@end itemize
3368
3369@section Bugs, Ideas, Further Work
3370Clearly the major improvement is DON'T USE A TEXT-READING
3371ASSEMBLER for the back end of a compiler. It is much faster to
3372interpret binary gobbledygook from a compiler's tables than to
3373ask the compiler to write out human-readable code just so the
3374assembler can parse it back to binary.
3375
3376Assuming you use @code{as} for human written programs: here are
3377some ideas:
3378@itemize @bullet
3379@item
3380Document (here) @code{APP}.
3381@item
3382Take advantage of knowing no spaces except after opcode
3383to speed up @code{as}. (Modify @code{app.c} to flush useless spaces:
3384only keep space/tabs at begin of line or between 2
3385symbols.)
3386@item
3387Put pointers in this documentation to @file{a.out} documentation.
3388@item
3389Split the assembler into parts so it can gobble direct binary
from @emph{e.g.} @code{cc}. It is silly for @code{cc} to compose text
3391just so @code{as} can parse it back to binary.
3392@item
3393Rewrite hash functions: I want a more modular, faster library.
3394@item
3395Clean up LOTS of code.
3396@item
3397Include all the non-@file{.c} files in the maintenance chapter.
3398@item
3399Document flonums.
3400@item
3401Implement flonum short literals.
3402@item
3403Change all talk of expression operands to expression quantities,
or perhaps to expression arguments.
3405@item
3406Implement pass 2.
3407@item
Whenever a @code{.text} or @code{.data} statement is seen, we close
off the current frag with an imaginary @code{.fill 0}. This is
3410because we only have one obstack for frags, and we can't grow new
3411frags for a new subsegment, then go back to the old subsegment and
3412append bytes to the old frag. All this nonsense goes away if we
3413give each subsegment its own obstack. It makes code simpler in
3414about 10 places, but nobody has bothered to do it because C compiler
3415output rarely changes subsegments (compared to ending frags with
3416relaxable addresses, which is common).
3417@end itemize
3418
3419@section Sources
3420@c The following files in the @file{as} directory
3421@c are symbolic links to other files, of
3422@c the same name, in a different directory.
3423@c @itemize @bullet
3424@c @item
3425@c @file{atof_generic.c}
3426@c @item
3427@c @file{atof_vax.c}
3428@c @item
3429@c @file{flonum_const.c}
3430@c @item
3431@c @file{flonum_copy.c}
3432@c @item
3433@c @file{flonum_get.c}
3434@c @item
3435@c @file{flonum_multip.c}
3436@c @item
3437@c @file{flonum_normal.c}
3438@c @item
3439@c @file{flonum_print.c}
3440@c @end itemize
3441
3442Here is a list of the source files in the @file{as} directory.
3443
3444@table @file
3445@item app.c
3446This contains the pre-processing phase, which deletes comments,
3447handles whitespace, etc. This was recently re-written, since app
3448used to be a separate program, but RMS wanted it to be inline.
3449
3450@item append.c
3451This is a subroutine to append a string to another string returning a
3452pointer just after the last @code{char} appended. (JF: All these
3453little routines should probably all be put in one file.)
3454
3455@item as.c
3456Here you will find the main program of the assembler @code{as}.
3457
3458@item expr.c
3459This is a branch office of @file{read.c}. This understands
3460expressions, arguments. Inside @code{as}, arguments are called
3461(expression) @emph{operands}. This is confusing, because we also talk
3462(elsewhere) about instruction @emph{operands}. Also, expression
3463operands are called @emph{quantities} explicitly to avoid confusion
3464with instruction operands. What a mess.
3465
3466@item frags.c
3467This implements the @b{frag} concept. Without frags, finding the
3468right size for branch instructions would be a lot harder.
3469
3470@item hash.c
This contains the symbol table, opcode table @emph{etc.} hashing
3472functions.
3473
3474@item hex_value.c
3475This is a table of values of digits, for use in atoi() type
3476functions. Could probably be flushed by using calls to strtol(), or
3477something similar.
3478
3479@item input-file.c
3480This contains Operating system dependent source file reading
3481routines. Since error messages often say where we are in reading
3482the source file, they live here too. Since @code{as} is intended to
3483run under GNU and Unix only, this might be worth flushing. Anyway,
3484almost all C compilers support stdio.
3485
3486@item input-scrub.c
3487This deals with calling the pre-processor (if needed) and feeding the
3488chunks back to the rest of the assembler the right way.
3489
3490@item messages.c
3491This contains operating system independent parts of fatal and
3492warning message reporting. See @file{append.c} above.
3493
3494@item output-file.c
3495This contains operating system dependent functions that write an
3496object file for @code{as}. See @file{input-file.c} above.
3497
3498@item read.c
3499This implements all the directives of @code{as}. This also deals
3500with passing input lines to the machine dependent part of the
3501assembler.
3502
3503@item strstr.c
3504This is a C library function that isn't in most C libraries yet.
3505See @file{append.c} above.
3506
3507@item subsegs.c
3508This implements subsegments.
3509
3510@item symbols.c
3511This implements symbols.
3512
3513@item write.c
3514This contains the code to perform relaxation, and to write out
3515the object file. It is mostly operating system independent, but
3516different OSes have different object file formats in any case.
3517
3518@item xmalloc.c
3519This implements @code{malloc()} or bust. See @file{append.c} above.
3520
3521@item xrealloc.c
3522This implements @code{realloc()} or bust. See @file{append.c} above.
3523
3524@item atof-generic.c
3525The following files were taken from a machine-independent subroutine
3526library for manipulating floating point numbers and very large
3527integers.
3528
3529@file{atof-generic.c} turns a string into a flonum internal format
3530floating-point number.
3531
3532@item flonum-const.c
3533This contains some potentially useful floating point numbers in
3534flonum format.
3535
3536@item flonum-copy.c
3537This copies a flonum.
3538
3539@item flonum-multip.c
3540This multiplies two flonums together.
3541
3542@item bignum-copy.c
3543This copies a bignum.
3544
3545@end table
3546
3547Here is a table of all the machine-specific files (this includes
3548both source and header files). Typically, there is a
3549@var{machine}.c file, a @var{machine}-opcode.h file, and an
3550atof-@var{machine}.c file. The @var{machine}-opcode.h file should
3551be identical to the one used by GDB (which uses it for disassembly.)
3552
3553@table @file
3554
3555@item atof-ieee.c
This contains code to turn a flonum into an IEEE literal constant.
This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3558
3559@item i386-opcode.h
3560This is the opcode-table for the i386 version of the assembler.
3561
3562@item i386.c
3563This contains all the code for the i386 version of the assembler.
3564
3565@item i386.h
3566This defines constants and macros used by the i386 version of the assembler.
3567
3568@item m-generic.h
3569generic 68020 header file. To be linked to m68k.h on a
3570non-sun3, non-hpux system.
3571
3572@item m-sun2.h
357368010 header file for Sun2 workstations. Not well tested. To be linked
3574to m68k.h on a sun2. (See also @samp{-DSUN_ASM_SYNTAX} in the
3575@file{Makefile}.)
3576
3577@item m-sun3.h
357868020 header file for Sun3 workstations. To be linked to m68k.h before
3579compiling on a Sun3 system. (See also @samp{-DSUN_ASM_SYNTAX} in the
3580@file{Makefile}.)
3581
3582@item m-hpux.h
358368020 header file for a HPUX (system 5?) box. Which box, which
3584version of HPUX, etc? I don't know.
3585
3586@item m68k.h
3587A hard- or symbolic- link to one of @file{m-generic.h},
3588@file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3589680x0 you are assembling for. (See also @samp{-DSUN_ASM_SYNTAX} in the
3590@file{Makefile}.)
3591
3592@item m68k-opcode.h
3593Opcode table for 68020. This is now a link to the opcode table
3594in the @code{GDB} source directory.
3595
3596@item m68k.c
3597All the mc680x0 code, in one huge, slow-to-compile file.
3598
3599@item ns32k.c
3600This contains the code for the ns32032/ns32532 version of the
3601assembler.
3602
3603@item ns32k-opcode.h
3604This contains the opcode table for the ns32032/ns32532 version
3605of the assembler.
3606
3607@item vax-inst.h
3608Vax specific file for describing Vax operands and other Vax-ish things.
3609
3610@item vax-opcode.h
3611Vax opcode table.
3612
3613@item vax.c
3614Vax specific parts of @code{as}. Also includes the former files
3615@file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3616
3617@item atof-vax.c
3618Turns a flonum into a Vax constant.
3619
3620@item vms.c
3621This file contains the special code needed to put out a VMS
3622style object file for the Vax.
3623
3624@end table
3625
3626Here is a list of the header files in the source directory.
3627(Warning: This section may not be very accurate. I didn't
3628write the header files; I just report them.) Also note that I
3629think many of these header files could be cleaned up or
3630eliminated.
3631
3632@table @file
3633
3634@item a.out.h
3635This describes the structures used to create the binary header data
3636inside the object file. Perhaps we should use the one in
3637@file{/usr/include}?
3638
3639@item as.h
3640This defines all the globally useful things, and pulls in _0__<stdio.h>_1__
3641and _0__<assert.h>_1__.
3642
3643@item bignum.h
3644This defines macros useful for dealing with bignums.
3645
3646@item expr.h
3647Structure and macros for dealing with expression()
3648
3649@item flonum.h
3650This defines the structure for dealing with floating point
3651numbers. It #includes @file{bignum.h}.
3652
3653@item frags.h
3654This contains macro for appending a byte to the current frag.
3655
3656@item hash.h
3657Structures and function definitions for the hashing functions.
3658
3659@item input-file.h
3660Function headers for the input-file.c functions.
3661
3662@item md.h
3663structures and function headers for things defined in the
3664machine dependent part of the assembler.
3665
3666@item obstack.h
3667This is the GNU systemwide include file for manipulating obstacks.
3668Since nobody is running under real GNU yet, we include this file.
3669
3670@item read.h
3671Macros and function headers for reading in source files.
3672
3673@item struct-symbol.h
3674Structure definition and macros for dealing with the gas
3675internal form of a symbol.
3676
3677@item subsegs.h
3678structure definition for dealing with the numbered subsegments
3679of the text and data segments.
3680
3681@item symbols.h
3682Macros and function headers for dealing with symbols.
3683
3684@item write.h
3685Structure for doing segment fixups.
3686@end table
3687
3688@comment ~subsection Test Directory
3689@comment (Note: The test directory seems to have disappeared somewhere
3690@comment along the line. If you want it, you'll probably have to find a
3691@comment REALLY OLD dump tape~dots{})
3692@comment
3693@comment The ~file{test/} directory is used for regression testing.
3694@comment After you modify ~@code{as}, you can get a quick go/nogo
3695@comment confidence test by running the new ~@code{as} over the source
3696@comment files in this directory. You use a shell script ~file{test/do}.
3697@comment
3698@comment The tests in this suite are evolving. They are not comprehensive.
3699@comment They have, however, caught hundreds of bugs early in the debugging
3700@comment cycle of ~@code{as}. Most test statements in this suite were naturally
3701@comment selected: they were used to demonstrate actual ~@code{as} bugs rather
@comment than being written ~i{a priori}.
3703@comment
3704@comment Another testing suggestion: over 30 bugs have been found simply by
@comment running examples from this manual through ~@code{as}.
@comment Some examples in this manual are selected
@comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3708@comment
3709@comment ~subsubsection Regression Testing
3710@comment Each regression test involves assembling a file and comparing the
@comment actual output of ~@code{as} to ``known good'' output files. Both
@comment the object file and the error/warning message file (stderr) are
@comment inspected. Optionally ~@code{as}' exit status may be checked.
@comment Discrepancies are reported. Each discrepancy means either that
@comment you broke some part of ~@code{as} or that the ``known good'' files
3716@comment are now out of date and should be changed to reflect the new
3717@comment definition of ``good''.
3718@comment
3719@comment Each regression test lives in its own directory, in a tree
3720@comment rooted in the directory ~file{test/}. Each such directory
3721@comment has a name ending in ~file{.ret}, where `ret' stands for
3722@comment REgression Test. The ~file{.ret} ending allows ~code{find
3723@comment (1)} to find all regression tests in the tree, without
3724@comment needing to list them explicitly.
3725@comment
3726@comment Any ~file{.ret} directory must contain a file called
3727@comment ~file{input} which is the source file to assemble. During
3728@comment testing an object file ~file{output} is created, as well as
3729@comment a file ~file{stdouterr} which contains the output to both
3730@comment stderr and stderr. If there is a file ~file{output.good} in
3731@comment the directory, and if ~file{output} contains exactly the
3732@comment same data as ~file{output.good}, the file ~file{output} is
3733@comment deleted. Likewise ~file{stdouterr} is removed if it exactly
3734@comment matches a file ~file{stdouterr.good}. If file
3735@comment ~file{status.good} is present, containing a decimal number
b50e59fe 3736@comment before a newline, the exit status of ~@code{as} is compared
93b45514
RP
3737@comment to this number. If the status numbers are not equal, a file
3738@comment ~file{status} is written to the directory, containing the
3739@comment actual status as a decimal number followed by newline.
3740@comment
3741@comment Should any of the ~file{*.good} files fail to match their corresponding
3742@comment actual files, this is noted by a 1-line message on the screen during
b50e59fe 3743@comment the regression test, and you can use ~@code{find (1)} to find any
93b45514
RP
3744@comment files named ~file{status}, ~file {output} or ~file{stdouterr}.
3745@comment
@node Retargeting, License, Maintenance, Top
@chapter Teaching the Assembler about a New Machine

This chapter describes the steps required in order to make the
assembler work with another machine's assembly language. This
chapter is not complete, and only describes the steps in the
broadest terms. You should look at the source for the
currently supported machines in order to discover some of the
details that aren't mentioned here.

You should create a new file called @file{@var{machine}.c}, and
add the appropriate lines to the file @file{Makefile} so that
you can compile your new version of the assembler. This should
be straightforward; simply add lines similar to the ones there
for the four current versions of the assembler.

If you want to be compatible with GDB (and the current
machine-dependent versions of the assembler), you should create
a file called @file{@var{machine}-opcode.h} which should
contain all the information about the names of the machine
instructions, their opcodes, and what addressing modes they
support. If you do this right, the assembler and GDB can share
this file, and you'll only have to write it once. Note that
while you're writing @code{as}, you may want to use an
independent program (if you have access to one) to make sure
that @code{as} is emitting the correct bytes. Since @code{as}
and @code{GDB} share the opcode table, an incorrect opcode
table entry may make invalid bytes look OK when you disassemble
them with @code{GDB}.

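As a rough illustration of what such a shared table might hold, here is
a sketch for a made-up three-instruction machine. The structure
@code{toy_opcode}, its fields, the table contents, and the helper
@code{toy_lookup} are all hypothetical; none of them is taken from an
existing @file{@var{machine}-opcode.h}.

@example
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: one entry per mnemonic. */
struct toy_opcode @{
    const char *name;       /* mnemonic as written in the source */
    unsigned char opcode;   /* first byte of the encoding */
    const char *operands;   /* one letter per operand:
                               'r' register, 'i' immediate */
@};

static const struct toy_opcode toy_opcodes[] = @{
    @{ "nop",  0x00, ""   @},
    @{ "movi", 0x10, "ri" @},
    @{ "add",  0x20, "rr" @},
@};

/* Return the entry for NAME, or NULL if the mnemonic is unknown. */
const struct toy_opcode *toy_lookup(const char *name)
@{
    size_t i;
    for (i = 0; i < sizeof toy_opcodes / sizeof toy_opcodes[0]; i++)
        if (strcmp(toy_opcodes[i].name, name) == 0)
            return &toy_opcodes[i];
    return NULL;
@}
@end example

An assembler would search such an array by name, while a disassembler
would search the same array by opcode byte, which is why one wrong
entry can go unnoticed when the two tools are checked against each other.
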
@section Functions You will Have to Write

Your file @file{@var{machine}.c} should contain definitions for
the following functions and variables. It will need to include
some header files in order to use some of the structures
defined in the machine-independent part of the assembler. The
needed header files are mentioned in the descriptions of the
functions that will need them.

@table @code

@item long omagic;
This long integer holds the value to place at the beginning of
the @file{a.out} file. It is usually @samp{OMAGIC}, except on
machines that store additional information in the magic-number.

@item char comment_chars[];
This character array holds the values of the characters that
start a comment anywhere in a line. Comments are stripped off
automatically by the machine-independent part of the
assembler. Note that @samp{/*} will always start a
comment, and that only @samp{*/} will end a comment started by
@samp{/*}.

@item char line_comment_chars[];
This character array holds the values of the characters that start a
comment only if they are the first (non-whitespace) character
on a line. If the character @samp{#} does not appear in this
list, you may get unexpected results. (Various
machine-independent parts of the assembler treat the comments
@samp{#APP} and @samp{#NO_APP} specially, and assume that lines
that start with @samp{#} are comments.)

@item char EXP_CHARS[];
This character array holds the letters that can separate the
mantissa and the exponent of a floating point number. Typical
values are @samp{e} and @samp{E}.

@item char FLT_CHARS[];
This character array holds the letters that--when they appear
immediately after a leading zero--indicate that a number is a
floating-point number. (This is similar to the way @samp{0x}
indicates that a hexadecimal number follows.)

@item pseudo_typeS md_pseudo_table[];
(@var{pseudo_typeS} is defined in @file{md.h})
This array contains a list of the machine-dependent directives
the assembler must support. It contains the name of each
pseudo op (without the leading @samp{.}), a pointer to a
function to be called when that directive is encountered, and
an integer argument to be passed to that function.

@item void md_begin(void)
This function is called as part of the assembler's
initialization. It should do any initialization required by
any of your other routines.

@item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
This routine is called once for each option on the command line
that the machine-independent part of @code{as} does not
understand. This function should return non-zero if the option
pointed to by @var{optionPTR} is a valid option. If it is not
a valid option, this routine should return zero. The variables
@var{argcPTR} and @var{argvPTR} are provided in case the option
requires a filename or something similar as an argument. If
the option is multi-character, @var{optionPTR} should be
advanced past the end of the option; otherwise every letter in
the option will be treated as a separate single-character
option.

@item void md_assemble(char *string)
This routine is called for every machine-dependent
non-directive line in the source file. It does all the real
work involved in reading the opcode, parsing the operands,
etc. @var{string} is a pointer to a null-terminated string
that comprises the input line, with all excess whitespace and
comments removed.

@item void md_number_to_chars(char *outputPTR,long value,int nbytes)
This routine is called to turn a C long int, short int, or char
into the series of bytes that represents that number on the
target machine. @var{outputPTR} points to an array where the
result should be stored; @var{value} is the value to store; and
@var{nbytes} is the number of bytes in @var{value} that should be
stored.

@item void md_number_to_imm(char *outputPTR,long value,int nbytes)
This routine is called to turn a C long int, short int, or char
into the series of bytes that represents an immediate value on
the target machine. It is identical to the function @code{md_number_to_chars},
except on NS32K machines.@refill

@item void md_number_to_disp(char *outputPTR,long value,int nbytes)
This routine is called to turn a C long int, short int, or char
into the series of bytes that represents a displacement value on
the target machine. It is identical to the function @code{md_number_to_chars},
except on NS32K machines.@refill

@item void md_number_to_field(char *outputPTR,long value,int nbytes)
This routine is identical to @code{md_number_to_chars},
except on NS32K machines.

@item void md_ri_to_chars(struct relocation_info *riPTR,ri)
(@code{struct relocation_info} is defined in @file{a.out.h})
This routine emits the relocation info in @var{ri}
in the appropriate bit-pattern for the target machine.
The result should be stored in the location pointed
to by @var{riPTR}. This routine may be a no-op unless you are
attempting to do cross-assembly.

@item char *md_atof(char type,char *outputPTR,int *sizePTR)
This routine turns a series of digits into the appropriate
internal representation for a floating-point number.
@var{type} is a character from @var{FLT_CHARS[]} that describes
what kind of floating point number is wanted; @var{outputPTR}
is a pointer to an array that the result should be stored in;
and @var{sizePTR} is a pointer to an integer where the size (in
bytes) of the result should be stored. This routine should
return an error message, or an empty string (not @code{(char *)0}) for
success.

@item int md_short_jump_size;
This variable holds the (maximum) size in bytes of a short (16
bit or so) jump created by @code{md_create_short_jump()}. This
variable is used as part of the broken-word feature, and isn't
needed if the assembler is compiled with
@samp{-DWORKING_DOT_WORD}.

@item int md_long_jump_size;
This variable holds the (maximum) size in bytes of a long (32
bit or so) jump created by @code{md_create_long_jump()}. This
variable is used as part of the broken-word feature, and isn't
needed if the assembler is compiled with
@samp{-DWORKING_DOT_WORD}.

@item void md_create_short_jump(char *resultPTR,long from_addr,
@code{long to_addr,fragS *frag,symbolS *to_symbol)}
This function emits a jump from @var{from_addr} to @var{to_addr} in
the array of bytes pointed to by @var{resultPTR}. If this creates a
type of jump that must be relocated, this function should call
@code{fix_new()} with @var{frag} and @var{to_symbol}. The jump
emitted by this function may be smaller than @var{md_short_jump_size},
but it must never be larger.
(If it creates a smaller jump, the extra bytes of memory will not be
used.) This function is used as part of the broken-word feature,
and isn't needed if the assembler is compiled with
@samp{-DWORKING_DOT_WORD}.@refill

@item void md_create_long_jump(char *ptr,long from_addr,
@code{long to_addr,fragS *frag,symbolS *to_symbol)}
This function is similar to the previous function,
@code{md_create_short_jump()}, except that it creates a long
jump instead of a short one. This function is used as part of
the broken-word feature, and isn't needed if the assembler is
compiled with @samp{-DWORKING_DOT_WORD}.

@item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
This function does the initial setting up for relaxation. This
includes forcing references to still-undefined symbols to the
appropriate addressing modes.

@item relax_typeS md_relax_table[];
(@var{relax_typeS} is defined in @file{md.h})
This array describes the various machine-dependent states a
frag may be in before relaxation. You will need one group of
entries for each type of addressing mode you intend to relax.

@item void md_convert_frag(fragS *fragPTR)
(@var{fragS} is defined in @file{as.h})
This routine does the required cleanup after relaxation.
Relaxation has changed the type of the frag to a type that can
reach its destination. This function should adjust the opcode
of the frag to use the appropriate addressing mode.
@var{fragPTR} points to the frag to clean up.

@item void md_end(void)
This function is called just before the assembler exits. It
need not free up memory unless the operating system doesn't do
it automatically on exit. (In which case you'll also have to
track down all the other places where the assembler allocates
space but never frees it.)

@end table

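A few of the simpler definitions above might look roughly like the
following for an imaginary big-endian target. The particular comment
character and floating-point letters are assumptions picked for the
illustration, not values required by the assembler, and the byte order
in @code{md_number_to_chars} would be reversed on a little-endian
machine.

@example
/* Assumed: ';' starts a comment anywhere on a line. */
char comment_chars[] = ";";
/* '#' is kept so that #APP and #NO_APP are still recognized. */
char line_comment_chars[] = "#";
char EXP_CHARS[] = "eE";
/* Assumed: 0f... is single precision, 0d... is double. */
char FLT_CHARS[] = "fd";

/* Store the low NBYTES bytes of VALUE at OUTPUTPTR,
   most significant byte first. */
void md_number_to_chars(char *outputPTR, long value, int nbytes)
@{
    int i;
    for (i = nbytes - 1; i >= 0; i--) @{
        outputPTR[i] = (char) (value & 0xff);
        value >>= 8;
    @}
@}
@end example

With this version, storing the value 0x1234 in two bytes yields the
byte 0x12 followed by the byte 0x34.
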
@section External Variables You will Need to Use

You will need to refer to or change the following external variables
from within the machine-dependent part of the assembler.

@table @code
@item extern char flagseen[];
This array holds non-zero values in locations corresponding to
the options that were on the command line. Thus, if the
assembler was called with @samp{-W}, @var{flagseen['W']} would
be non-zero.

@item extern fragS *frag_now;
This pointer points to the current frag--the frag that bytes
are currently being added to. If nothing else, you will need
to pass it as an argument to various machine-independent
functions. It is maintained automatically by the
frag-manipulating functions; you should never have to change it
yourself.

@item extern LITTLENUM_TYPE generic_bignum[];
(@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
This is where @dfn{bignums}--numbers larger than 32 bits--are
returned when they are encountered in an expression. You will
need to use this if you need to implement directives (or
anything else) that must deal with these large numbers.
@code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
@file{as.h}), and have a positive @code{X_add_number}. The
@code{X_add_number} of a @code{bignum} is the number of
@code{LITTLENUMS} in @var{generic_bignum} that the number takes
up.

@item extern FLONUM_TYPE generic_floating_point_number;
(@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
This is where @dfn{flonums}--floating-point numbers within
expressions--are returned. @code{Flonums} are of @code{segT}
@code{SEG_BIG}, and have a negative @code{X_add_number}.
@code{Flonums} are returned in a generic format. You will have
to write a routine to turn this generic format into the
appropriate floating-point format for your machine.

@item extern int need_pass_2;
If this variable is non-zero, the assembler has encountered an
expression that cannot be assembled in a single pass. Since
the second pass isn't implemented, this flag means that the
assembler is punting, and is only looking for additional syntax
errors. (Or something like that.)

@item extern segT now_seg;
This variable holds the value of the segment the assembler is
currently assembling into.

@end table

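Since bignums and flonums share the segment @code{SEG_BIG}, code that
receives such an expression has to look at the sign of
@code{X_add_number} to tell them apart. The sketch below shows only
that decision. The enumeration @code{toy_kind} and the function
@code{toy_classify} are hypothetical, and a plain flag stands in for the
comparison against @code{SEG_BIG} so that the fragment does not depend
on @file{as.h}. The text above does not say what a zero
@code{X_add_number} means, so the sketch treats it as neither.

@example
enum toy_kind @{ TOY_OTHER, TOY_BIGNUM, TOY_FLONUM @};

/* SEG_IS_BIG is non-zero when the expression's segment is SEG_BIG. */
enum toy_kind toy_classify(int seg_is_big, long x_add_number)
@{
    if (!seg_is_big || x_add_number == 0)
        return TOY_OTHER;
    /* Positive: that many LITTLENUMs are in generic_bignum. */
    /* Negative: the value is in generic_floating_point_number. */
    return x_add_number > 0 ? TOY_BIGNUM : TOY_FLONUM;
@}
@end example
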
@section External Functions You will Need

You will find the following external functions useful (or
indispensable) when you're writing the machine-dependent part
of the assembler.

@table @code

@item char *frag_more(int bytes)
This function allocates @var{bytes} more bytes in the current
frag (or starts a new frag, if it can't expand the current frag
any more) for you to store some object-file bytes in. It
returns a pointer to the bytes, ready for you to store data in.

@item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
This function stores a relocation fixup to be acted on later.
@var{frag} points to the frag the relocation belongs in;
@var{where} is the location within the frag where the relocation begins;
@var{size} is the size of the relocation, and is usually 1 (a single byte),
2 (sixteen bits), or 4 (a longword).
The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset} is added to the byte(s)
at _0__@var{frag->literal[where]}_1__. If @var{pcrel} is non-zero, the address of the
location is subtracted from the result. A relocation entry is also added
to the @file{a.out} file. @var{add_symbol}, @var{sub_symbol}, and/or
@var{offset} may be NULL.@refill

@item char *frag_var(relax_stateT type, int max_chars, int var,
@code{relax_substateT subtype, symbolS *symbol, char *opcode)}
This function creates a machine-dependent frag of type @var{type}
(usually @code{rs_machine_dependent}).
@var{max_chars} is the maximum size in bytes that the frag may grow by;
@var{var} is the current size of the variable end of the frag;
@var{subtype} is the sub-type of the frag. The sub-type is used to index into
@var{md_relax_table[]} during @code{relaxation}.
@var{symbol} is the symbol whose value should be used when relaxing this frag.
@var{opcode} points to a byte whose value may have to be modified if the
addressing mode used by this frag changes. It typically points into the
@var{fr_literal[]} of the previous frag, and is used to point to a location
that @code{md_convert_frag()} may have to change.@refill

@item void frag_wane(fragS *fragPTR)
This function is useful from within @code{md_convert_frag}. It
changes a frag to type @code{rs_fill}, and sets the variable-sized
piece of the frag to zero. The frag will never change in size
again.

@item segT expression(expressionS *retval)
(@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
This function parses the string pointed to by the external char
pointer @var{input_line_pointer}, and returns the segment-type
of the expression. It also stores the results in the
@var{expressionS} pointed to by @var{retval}.
@var{input_line_pointer} is advanced to point past the end of
the expression. (@var{input_line_pointer} is used by other
parts of the assembler. If you modify it, be sure to restore
it to its original value.)

@item as_warn(char *message,@dots{})
If warning messages are disabled, this function does nothing.
Otherwise, it prints out the current file name and the current
line number, then uses @code{fprintf} to print the
@var{message} and any arguments it was passed.

@item as_bad(char *message,@dots{})
This function should be called when @code{as} encounters
conditions that are bad enough that @code{as} should not
produce an object file, but should continue reading input and
printing warning and bad error messages.

@item as_fatal(char *message,@dots{})
This function prints out the current file name and line number,
prints the word @samp{FATAL:}, then uses @code{fprintf} to
print the @var{message} and any arguments it was passed. Then
the assembler exits. This function should only be used for
serious, unrecoverable errors.

@item void float_const(int float_type)
This function reads floating-point constants from the current
input line, and calls @code{md_atof} to assemble them. It is
useful as the function to call for the directives
@samp{.single}, @samp{.double}, @samp{.float}, etc.
@var{float_type} must be a character from @var{FLT_CHARS}.

@item void demand_empty_rest_of_line(void);
This function can be used by machine-dependent directives to
make sure the rest of the input line is empty. It prints a
warning message if there are additional characters on the line.

@item long int get_absolute_expression(void)
This function can be used by machine-dependent directives to
read an absolute number from the current input line. It
returns the result. If it isn't given an absolute expression,
it prints a warning message and returns zero.

@end table


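The usual pattern for emitting object bytes is to ask
@code{frag_more} for room and then let @code{md_number_to_chars} lay the
value out in target byte order. The sketch below shows that pattern in
isolation. So that it compiles on its own, it supplies toy stand-ins
for both functions (a fixed 64-byte buffer and big-endian order are
assumptions); the helper name @code{toy_emit_word} is hypothetical. In
a real port you would call the assembler's own versions instead.

@example
#include <stddef.h>

/* Stand-ins so the sketch compiles alone. */
static char toy_frag[64];
static size_t toy_used;

char *frag_more(int bytes)
@{
    char *p = toy_frag + toy_used;   /* no overflow check in this toy */
    toy_used += (size_t) bytes;
    return p;
@}

void md_number_to_chars(char *outputPTR, long value, int nbytes)
@{
    while (nbytes-- > 0) @{           /* big-endian order assumed */
        outputPTR[nbytes] = (char) (value & 0xff);
        value >>= 8;
    @}
@}

/* Hypothetical helper: append one 16-bit value to the output. */
void toy_emit_word(long value)
@{
    char *where = frag_more(2);
    md_number_to_chars(where, value, 2);
@}
@end example
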
@section The Concept of Frags

This assembler works to optimize the size of certain addressing
modes (e.g., branch instructions). This means the size of many
pieces of object code cannot be determined until after assembly
is finished. (It also means that the addresses of symbols cannot be
determined until assembly is finished.) In order to do this,
@code{as} stores the output bytes as @dfn{frags}.

Here is the definition of a frag (from @file{as.h}):
@example
struct frag
@{
    long int fr_fix;
    long int fr_var;
    relax_stateT fr_type;
    relax_substateT fr_substate;
    unsigned long fr_address;
    long int fr_offset;
    struct symbol *fr_symbol;
    char *fr_opcode;
    struct frag *fr_next;
    char fr_literal[];
@};
@end example

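To see why a frag needs a variable-sized piece at all, consider a
branch that has a short form and a long form. Until the distance to
the target is known, the assembler cannot tell which form fits. The
sketch below picks the smallest form that reaches a given displacement.
The structure @code{toy_relax}, the two table entries, and
@code{toy_relax_pick} are hypothetical and are not the real
@code{relax_typeS} or @code{md_relax_table}; they only illustrate the
idea of stepping to a larger form when the current one cannot reach.

@example
/* Hypothetical relaxation table entry. */
struct toy_relax @{
    long forward;    /* largest positive displacement reached */
    long backward;   /* most negative displacement reached */
    int length;      /* bytes taken by the variable piece */
    int next;        /* index of the next larger form, or -1 */
@};

static const struct toy_relax toy_table[] = @{
    @{ 127, -128, 1, 1 @},        /* 0: byte displacement */
    @{ 32767, -32768, 2, -1 @},   /* 1: word displacement */
@};

/* Starting from STATE, return the first form that can span DISP.
   If no form reaches, the largest one is returned. */
int toy_relax_pick(int state, long disp)
@{
    while (toy_table[state].next >= 0
           && (disp > toy_table[state].forward
               || disp < toy_table[state].backward))
        state = toy_table[state].next;
    return state;
@}
@end example
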
@table @var
@item fr_fix
is the size of the fixed-size piece of the frag.

@item fr_var
is the maximum (?) size of the variable-sized piece of the frag.

@item fr_type
is the type of the frag.
Current types are @code{rs_fill}, @code{rs_align}, @code{rs_org},
and @code{rs_machine_dependent}.

@item fr_substate
This stores the type of machine-dependent frag this is (what
kind of addressing mode is being used, and what size is being
tried/will fit/etc.).

@item fr_address
@var{fr_address} is only valid after relaxation is finished.
Before relaxation, the only way to store an address is (pointer
to frag containing the address) plus (offset into the frag).

@item fr_offset
This contains a number, whose meaning depends on the type of
the frag.
For machine-dependent frags, this contains the offset from
@var{fr_symbol} that the frag wants to go to. Thus, for branch
instructions it is usually zero (unless the instruction was
@samp{jba foo+12} or something like that).

@item fr_symbol
For machine-dependent frags, this points to the symbol the frag
needs to reach.

@item fr_opcode
This points to the location in the frag (or in a previous frag)
of the opcode for the instruction that caused this to be a frag.
@var{fr_opcode} is needed if the actual opcode must be changed
in order to use a different form of the addressing mode.
(For example, if a conditional branch only comes in size tiny,
a large-size branch could be implemented by reversing the sense
of the test, and turning it into a tiny branch over a large jump.
This would require changing the opcode.)

@item fr_literal
@var{fr_literal} is a variable-size array that contains the
actual object bytes. A frag consists of a fixed-size piece of
object data (which may be zero bytes long), followed by a
piece of object data whose size may not have been determined
yet. Other information includes the type of the frag, which
controls how it is relaxed.

@item fr_next
This is the next frag in the singly-linked list. This is
usually only needed by the machine-independent part of
@code{as}.

@end table
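
Once relaxation has settled the size of every frag, addresses can be
handed out by walking the @var{fr_next} chain. The sketch below does
this for a cut-down frag that keeps only the fields it needs. It
assumes that each frag's final size has already been folded into
@var{fr_fix}, so that no variable piece remains; that is a
simplification made for the example, not a description of what
@code{as} does internally. The names @code{toy_frag} and
@code{toy_assign_addresses} are hypothetical.

@example
#include <stddef.h>

/* Reduced stand-in for struct frag. */
struct toy_frag @{
    long fr_fix;
    unsigned long fr_address;
    struct toy_frag *fr_next;
@};

/* Give each frag in the chain its address, starting at BASE,
   and return the address just past the last frag. */
unsigned long toy_assign_addresses(struct toy_frag *head,
                                   unsigned long base)
@{
    struct toy_frag *f;
    for (f = head; f != NULL; f = f->fr_next) @{
        f->fr_address = base;
        base += (unsigned long) f->fr_fix;
    @}
    return base;
@}
@end example

This is also why an address must be kept as a frag pointer plus an
offset before relaxation: the sum computed here changes whenever an
earlier frag grows.
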
@end ignore

@node License, , Retargeting, Top
@unnumbered GNU GENERAL PUBLIC LICENSE
@center Version 1, February 1989

@display
Copyright @copyright{} 1989 Free Software Foundation, Inc.
675 Mass Ave, Cambridge, MA 02139, USA

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
@end display

@unnumberedsec Preamble

  The license agreements of most software companies try to keep users
at the mercy of those companies. By contrast, our General Public
License is intended to guarantee your freedom to share and change free
software---to make sure the software is free for all its users. The
General Public License applies to the Free Software Foundation's
software and to any other program whose authors commit to using it.
You can use it for your programs, too.

  When we speak of free software, we are referring to freedom, not
price. Specifically, the General Public License is designed to make
sure that you have the freedom to give away or sell copies of free
software, that you receive source code or can get it if you want it,
that you can change the software or use pieces of it in new free
programs; and that you know you can do these things.

  To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

  For example, if you distribute copies of a such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must tell them their rights.

  We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

  Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

  The precise terms and conditions for copying, distribution and
modification follow.

@iftex
@unnumberedsec TERMS AND CONDITIONS
@end iftex
@ifinfo
@center TERMS AND CONDITIONS
@end ifinfo

@enumerate
@item
This License Agreement applies to any program or other work which
contains a notice placed by the copyright holder saying it may be
distributed under the terms of this General Public License. The
``Program'', below, refers to any such program or work, and a ``work based
on the Program'' means either the Program or any work containing the
Program or a portion of it, either verbatim or with modifications. Each
licensee is addressed as ``you''.

@item
You may copy and distribute verbatim copies of the Program's source
code as you receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice and
disclaimer of warranty; keep intact all the notices that refer to this
General Public License and to the absence of any warranty; and give any
other recipients of the Program a copy of this General Public License
along with the Program. You may charge a fee for the physical act of
transferring a copy.

@item
You may modify your copy or copies of the Program or any portion of
it, and copy and distribute such modifications under the terms of Paragraph
1 above, provided that you also do the following:

@itemize @bullet
@item
cause the modified files to carry prominent notices stating that
you changed the files and the date of any change; and

@item
cause the whole of any work that you distribute or publish, that
in whole or in part contains the Program or any part thereof, either
with or without modifications, to be licensed at no charge to all
third parties under the terms of this General Public License (except
that you may choose to grant warranty protection to some or all
third parties, at your option).

@item
If the modified program normally reads commands interactively when
run, you must cause it, when started running for such interactive use
in the simplest and most usual way, to print or display an
announcement including an appropriate copyright notice and a notice
that there is no warranty (or else, saying that you provide a
warranty) and that users may redistribute the program under these
conditions, and telling the user how to view a copy of this General
Public License.

@item
You may charge a fee for the physical act of transferring a
copy, and you may at your option offer warranty protection in
exchange for a fee.
@end itemize

Mere aggregation of another independent work with the Program (or its
derivative) on a volume of a storage or distribution medium does not bring
the other work under the scope of these terms.

@item
You may copy and distribute the Program (or a portion or derivative of
it, under Paragraph 2) in object code or executable form under the terms of
Paragraphs 1 and 2 above provided that you also do one of the following:

@itemize @bullet
@item
accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of
Paragraphs 1 and 2 above; or,

@item
accompany it with a written offer, valid for at least three
years, to give any third party free (except for a nominal charge
for the cost of distribution) a complete machine-readable copy of the
corresponding source code, to be distributed under the terms of
Paragraphs 1 and 2 above; or,

@item
accompany it with the information you received as to where the
corresponding source code may be obtained. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form alone.)
@end itemize

Source code for a work means the preferred form of the work for making
modifications to it. For an executable file, complete source code means
all the source code for all modules it contains; but, as a special
exception, it need not include source code for modules which are standard
libraries that accompany the operating system on which the executable
file runs, or for standard header files or definitions files that
accompany that operating system.

@item
You may not copy, modify, sublicense, distribute or transfer the
Program except as expressly provided under this General Public License.
Any attempt otherwise to copy, modify, sublicense, distribute or transfer
the Program is void, and will automatically terminate your rights to use
the Program under this License. However, parties who have received
copies, or rights to use copies, from you under this General Public
License will not have their licenses terminated so long as such parties
remain in full compliance.

@item
By copying, distributing or modifying the Program (or any work based
on the Program) you indicate your acceptance of this license to do so,
and all its terms and conditions.

4365@item
4366Each time you redistribute the Program (or any work based on the
4367Program), the recipient automatically receives a license from the original
4368licensor to copy, distribute or modify the Program subject to these
4369terms and conditions. You may not impose any further restrictions on the
4370recipients' exercise of the rights granted herein.
4371
4372@item
4373The Free Software Foundation may publish revised and/or new versions
4374of the General Public License from time to time. Such new versions will
4375be similar in spirit to the present version, but may differ in detail to
4376address new problems or concerns.
4377
4378Each version is given a distinguishing version number. If the Program
4379specifies a version number of the license which applies to it and ``any
4380later version'', you have the option of following the terms and conditions
4381either of that version or of any later version published by the Free
4382Software Foundation. If the Program does not specify a version number of
4383the license, you may choose any version ever published by the Free Software
4384Foundation.
4385
4386@item
4387If you wish to incorporate parts of the Program into other free
4388programs whose distribution conditions are different, write to the author
4389to ask for permission. For software which is copyrighted by the Free
4390Software Foundation, write to the Free Software Foundation; we sometimes
4391make exceptions for this. Our decision will be guided by the two goals
4392of preserving the free status of all derivatives of our free software and
4393of promoting the sharing and reuse of software generally.
4394
4395@iftex
4396@heading NO WARRANTY
4397@end iftex
4398@ifinfo
4399@center NO WARRANTY
4400@end ifinfo
4401
4402@item
4403BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4404FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
4405OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4406PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4407OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4408MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
4409TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
4410PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4411REPAIR OR CORRECTION.
4412
4413@item
4414IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4415ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4416REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4417INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4418ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4419LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4420SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4421WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4422ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4423@end enumerate
4424
4425@iftex
4426@heading END OF TERMS AND CONDITIONS
4427@end iftex
4428@ifinfo
4429@center END OF TERMS AND CONDITIONS
4430@end ifinfo
4431
4432@page
4433@unnumberedsec How to Apply These Terms to Your New Programs
4434
4435 If you develop a new program, and you want it to be of the greatest
4436possible use to humanity, the best way to achieve this is to make it
4437free software which everyone can redistribute and change under these
4438terms.
4439
4440 To do so, attach the following notices to the program. It is safest to
4441attach them to the start of each source file to most effectively convey
4442the exclusion of warranty; and each file should have at least the
4443``copyright'' line and a pointer to where the full notice is found.
4444
4445@smallexample
4446@var{one line to give the program's name and a brief idea of what it does.}
4447Copyright (C) 19@var{yy} @var{name of author}
4448
4449This program is free software; you can redistribute it and/or modify
4450it under the terms of the GNU General Public License as published by
4451the Free Software Foundation; either version 1, or (at your option)
4452any later version.
4453
4454This program is distributed in the hope that it will be useful,
4455but WITHOUT ANY WARRANTY; without even the implied warranty of
4456MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4457GNU General Public License for more details.
4458
4459You should have received a copy of the GNU General Public License
4460along with this program; if not, write to the Free Software
4461Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4462@end smallexample
4463
4464Also add information on how to contact you by electronic and paper mail.
4465
4466If the program is interactive, make it output a short notice like this
4467when it starts in an interactive mode:
4468
4469@smallexample
4470Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4471Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4472This is free software, and you are welcome to redistribute it
4473under certain conditions; type `show c' for details.
4474@end smallexample
4475
4476The hypothetical commands `show w' and `show c' should show the
4477appropriate parts of the General Public License. Of course, the
4478commands you use may be called something other than `show w' and `show
4479c'; they could even be mouse-clicks or menu items---whatever suits your
4480program.
4481
4482You should also get your employer (if you work as a programmer) or your
4483school, if any, to sign a ``copyright disclaimer'' for the program, if
4484necessary. Here is a sample; alter the names:
4485
4486@smallexample
4487Yoyodyne, Inc., hereby disclaims all copyright interest in the
4488program `Gnomovision' (a program to direct compilers to make passes
4489at assemblers) written by James Hacker.
4490
4491@var{signature of Ty Coon}, 1 April 1989
4492Ty Coon, President of Vice
4493@end smallexample
4494
4495That's all there is to it!
4496
4497
4498@summarycontents
4499@contents
4500@bye