Commit | Line | Data |
---|---|---|
47342e8f RP |
1 | \input texinfo |
2 | @c @tex | |
3 | @c \special{twoside} | |
4 | @c @end tex | |
09352a5d RP |
5 | _if__(_ALL_ARCH__) |
6 | @setfilename as.info | |
7 | _fi__(_ALL_ARCH__) | |
8 | _if__(_M680X0__ && !_ALL_ARCH__) | |
9 | @setfilename as-m680x0.info | |
10 | _fi__(_M680X0__ && !_ALL_ARCH__) | |
11 | _if__(_AMD29K__ && !_ALL_ARCH__) | |
12 | @setfilename as-29k.info | |
13 | _fi__(_AMD29K__ && !_ALL_ARCH__) | |
14 | @c | |
15 | @c NOTE: this manual is marked up for preprocessing with a collection | |
16 | @c of m4 macros called "pretex.m4". If you see <_if__> and <_fi__> | |
17 | @c scattered around the source, you have the full source before | |
18 | @c preprocessing; if you don't, you have the source configured for some | |
19 | @c particular architecture (and you can of course get the full source, | |
20 | @c with all configurations, from wherever you got this). The full | |
21 | @c source needs to be run through m4 before either tex- or info- | |
22 | @c formatting: for example, | |
23 | @c m4 pretex.m4 none.m4 m680x0.m4 as.texinfo >as-680x0.texinfo | |
24 | @c will produce (assuming your path finds either GNU or SysV m4; | |
25 | @c Berkeley won't do) a file suitable for formatting. | |
26 | @c See the text in "pretex.m4" for a fuller explanation (and the macro | |
27 | @c definitions). | |
28 | @c | |
47342e8f RP |
29 | @synindex ky cp |
30 | @ifinfo | |
31 | This file documents the GNU Assembler "as". | |
32 | ||
33 | Copyright (C) 1991 Free Software Foundation, Inc. | |
34 | ||
35 | Permission is granted to make and distribute verbatim copies of | |
36 | this manual provided the copyright notice and this permission notice | |
37 | are preserved on all copies. | |
38 | ||
39 | @ignore | |
40 | Permission is granted to process this file through Tex and print the | |
41 | results, provided the printed document carries copying permission | |
42 | notice identical to this one except for the removal of this paragraph | |
43 | (this paragraph not being relevant to the printed manual). | |
44 | ||
45 | @end ignore | |
46 | Permission is granted to copy and distribute modified versions of this | |
47 | manual under the conditions for verbatim copying, provided also that the | |
48 | section entitled ``GNU General Public License'' is included exactly as | |
49 | in the original, and provided that the entire resulting derived work is | |
50 | distributed under the terms of a permission notice identical to this | |
51 | one. | |
52 | ||
53 | Permission is granted to copy and distribute translations of this manual | |
54 | into another language, under the above conditions for modified versions, | |
55 | except that the section entitled ``GNU General Public License'' may be | |
56 | included in a translation approved by the author instead of in the | |
57 | original English. | |
58 | @end ifinfo | |
63f5d795 RP |
59 | @tex |
60 | @finalout | |
61 | @end tex | |
f4335d56 | 62 | @smallbook |
47342e8f | 63 | @setchapternewpage odd |
09352a5d RP |
64 | _if__(_M680X0__) |
65 | @settitle Using GNU as (680x0) | |
66 | _fi__(_M680X0__) | |
67 | _if__(_AMD29K__) | |
b50e59fe | 68 | @settitle Using GNU as (AMD 29K) |
09352a5d | 69 | _fi__(_AMD29K__) |
93b45514 | 70 | @titlepage |
b50e59fe | 71 | @title{Using GNU as} |
47342e8f | 72 | @subtitle{The GNU Assembler} |
09352a5d RP |
73 | _if__(_M680X0__) |
74 | @subtitle{for Motorola 680x0} | |
75 | _fi__(_M680X0__) | |
76 | _if__(_AMD29K__) | |
b50e59fe | 77 | @subtitle{for the AMD 29K family} |
09352a5d | 78 | _fi__(_AMD29K__) |
93b45514 | 79 | @sp 1 |
b50e59fe | 80 | @subtitle February 1991 |
93b45514 RP |
81 | @sp 13 |
82 | The Free Software Foundation Inc. thanks The Nice Computer | |
83 | Company of Australia for loaning Dean Elsner to write the | |
84 | first (Vax) version of @code{as} for Project GNU. | |
85 | The proprietors, management and staff of TNCCA thank FSF for | |
86 | distracting the boss while they got some work | |
87 | done. | |
88 | @sp 3 | |
47342e8f RP |
89 | @author{Dean Elsner, Jay Fenlason & friends} |
90 | @author{revised by Roland Pesch for Cygnus Support} | |
91 | @c [email protected] | |
92 | @page | |
93 | @tex | |
94 | \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$ | |
95 | \xdef\manvers{\$Revision$} % For use in headers, footers too | |
96 | {\parskip=0pt | |
97 | \hfill Cygnus Support\par | |
98 | \hfill \manvers\par | |
99 | \hfill \TeX{}info \texinfoversion\par | |
100 | } | |
b50e59fe RP |
101 | %"boxit" macro for figures: |
102 | %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3) | |
103 | \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt | |
104 | \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil | |
105 | #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline | |
106 | \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box | |
47342e8f | 107 | @end tex |
93b45514 | 108 | |
47342e8f RP |
109 | @vskip 0pt plus 1filll |
110 | Copyright @copyright{} 1991 Free Software Foundation, Inc. | |
93b45514 RP |
111 | |
112 | Permission is granted to make and distribute verbatim copies of | |
113 | this manual provided the copyright notice and this permission notice | |
114 | are preserved on all copies. | |
115 | ||
93b45514 | 116 | Permission is granted to copy and distribute modified versions of this |
47342e8f RP |
117 | manual under the conditions for verbatim copying, provided also that the |
118 | section entitled ``GNU General Public License'' is included exactly as | |
119 | in the original, and provided that the entire resulting derived work is | |
120 | distributed under the terms of a permission notice identical to this | |
121 | one. | |
93b45514 RP |
122 | |
123 | Permission is granted to copy and distribute translations of this manual | |
47342e8f RP |
124 | into another language, under the above conditions for modified versions, |
125 | except that the section entitled ``GNU General Public License'' may be | |
126 | included in a translation approved by the author instead of in the | |
127 | original English. | |
93b45514 | 128 | @end titlepage |
47342e8f RP |
129 | @page |
130 | ||
b50e59fe | 131 | @node Top, Overview, (dir), (dir) |
47342e8f | 132 | |
93b45514 | 133 | @menu |
b50e59fe RP |
134 | * Overview:: Overview |
135 | * Syntax:: Syntax | |
136 | * Segments:: Segments and Relocation | |
137 | * Symbols:: Symbols | |
138 | * Expressions:: Expressions | |
139 | * Pseudo Ops:: Assembler Directives | |
140 | * Maintenance:: Maintaining the Assembler | |
141 | * Retargeting:: Teaching the Assembler about a New Machine | |
142 | * License:: GNU GENERAL PUBLIC LICENSE | |
143 | ||
144 | --- The Detailed Node Listing --- | |
145 | ||
146 | Overview | |
147 | ||
148 | * Invoking:: Invoking @code{as} | |
149 | * Manual:: Structure of this Manual | |
150 | * GNU Assembler:: as, the GNU Assembler | |
151 | * Command Line:: Command Line | |
152 | * Input Files:: Input Files | |
153 | * Object:: Output (Object) File | |
154 | * Errors:: Error and Warning Messages | |
155 | * Options:: Options | |
156 | ||
157 | Input Files | |
158 | ||
159 | * Filenames:: Input Filenames and Line-numbers | |
160 | ||
161 | Syntax | |
162 | ||
163 | * Pre-processing:: Pre-processing | |
164 | * Whitespace:: Whitespace | |
165 | * Comments:: Comments | |
166 | * Symbol Intro:: Symbols | |
167 | * Statements:: Statements | |
168 | * Constants:: Constants | |
169 | ||
170 | Constants | |
171 | ||
172 | * Characters:: Character Constants | |
173 | * Numbers:: Number Constants | |
174 | ||
175 | Character Constants | |
176 | ||
177 | * Strings:: Strings | |
178 | * Chars:: Characters | |
179 | ||
180 | Segments and Relocation | |
181 | ||
182 | * Segs Background:: Background | |
183 | * ld Segments:: ld Segments | |
184 | * as Segments:: as Internal Segments | |
185 | * Sub-Segments:: Sub-Segments | |
186 | * bss:: bss Segment | |
187 | ||
188 | Segments and Relocation | |
189 | ||
190 | * ld Segments:: ld Segments | |
191 | * as Segments:: as Internal Segments | |
192 | * Sub-Segments:: Sub-Segments | |
193 | * bss:: bss Segment | |
194 | ||
195 | Symbols | |
196 | ||
197 | * Labels:: Labels | |
198 | * Setting Symbols:: Giving Symbols Other Values | |
199 | * Symbol Names:: Symbol Names | |
200 | * Dot:: The Special Dot Symbol | |
201 | * Symbol Attributes:: Symbol Attributes | |
202 | ||
203 | Symbol Names | |
204 | ||
205 | * Local Symbols:: Local Symbol Names | |
206 | ||
207 | Symbol Attributes | |
208 | ||
209 | * Symbol Value:: Value | |
210 | * Symbol Type:: Type | |
211 | * Symbol Desc:: Descriptor | |
212 | * Symbol Other:: Other | |
213 | ||
214 | Expressions | |
215 | ||
216 | * Empty Exprs:: Empty Expressions | |
217 | * Integer Exprs:: Integer Expressions | |
218 | ||
219 | Integer Expressions | |
220 | ||
221 | * Arguments:: Arguments | |
222 | * Operators:: Operators | |
223 | * Prefix Ops:: Prefix Operators | |
224 | * Infix Ops:: Infix Operators | |
225 | ||
226 | Assembler Directives | |
227 | ||
228 | * Abort:: The Abort directive causes as to abort | |
229 | * Align:: Pad the location counter to a power of 2 | |
230 | * App-File:: Set the logical file name | |
231 | * Ascii:: Fill memory with bytes of ASCII characters | |
232 | * Asciz:: Fill memory with bytes of ASCII characters followed | |
233 | by a null. | |
234 | * Byte:: Fill memory with 8-bit integers | |
235 | * Comm:: Reserve public space in the BSS segment | |
236 | * Data:: Change to the data segment | |
237 | * Desc:: Set the n_desc of a symbol | |
238 | * Double:: Fill memory with double-precision floating-point numbers | |
239 | * Else:: @code{.else} | |
240 | * End:: @code{.end} | |
241 | * Endif:: @code{.endif} | |
242 | * Equ:: @code{.equ @var{symbol}, @var{expression}} | |
243 | * Extern:: @code{.extern} | |
244 | * Fill:: Fill memory with repeated values | |
245 | * Float:: Fill memory with single-precision floating-point numbers | |
246 | * Global:: Make a symbol visible to the linker | |
247 | * Ident:: @code{.ident} | |
248 | * If:: @code{.if @var{absolute expression}} | |
249 | * Include:: @code{.include "@var{file}"} | |
250 | * Int:: Fill memory with 32-bit integers | |
251 | * Lcomm:: Reserve private space in the BSS segment | |
252 | * Line:: Set the logical line number | |
253 | * Ln:: @code{.ln @var{line-number}} | |
254 | * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl} | |
255 | * Long:: Fill memory with 32-bit integers | |
256 | * Lsym:: Create a local symbol | |
257 | * Octa:: Fill memory with 128-bit integers | |
258 | * Org:: Change the location counter | |
259 | * Quad:: Fill memory with 64-bit integers | |
260 | * Set:: Set the value of a symbol | |
261 | * Short:: Fill memory with 16-bit integers | |
262 | * Single:: @code{.single @var{flonums}} | |
263 | * Stab:: Store debugging information | |
264 | * Text:: Change to the text segment | |
b50e59fe | 265 | * Word:: Fill memory with 32-bit integers |
b50e59fe RP |
266 | * Deprecated:: Deprecated Directives |
267 | * Machine Options:: Options | |
268 | * Machine Syntax:: Syntax | |
269 | * Floating Point:: Floating Point | |
270 | * Machine Directives:: Machine Directives | |
271 | * Opcodes:: Opcodes | |
272 | ||
273 | Machine Directives | |
274 | ||
275 | * block:: @code{.block @var{size} , @var{fill}} | |
276 | * cputype:: @code{.cputype} | |
277 | * file:: @code{.file} | |
278 | * hword:: @code{.hword @var{expressions}} | |
279 | * line:: @code{.line} | |
280 | * reg:: @code{.reg @var{symbol}, @var{expression}} | |
281 | * sect:: @code{.sect} | |
282 | * use:: @code{.use @var{segment name}} | |
93b45514 | 283 | @end menu |
47342e8f | 284 | |
b50e59fe RP |
285 | @node Overview, Syntax, Top, Top |
286 | @chapter Overview | |
287 | ||
47342e8f | 288 | This manual is a user guide to the GNU assembler @code{as}. |
09352a5d RP |
289 | _if__(_M680X0__) |
290 | This version of the manual describes @code{as} configured to generate | |
291 | code for Motorola 680x0 architectures. | |
292 | _fi__(_M680X0__) | |
293 | _if__(_AMD29K__) | |
b50e59fe RP |
294 | This version of the manual describes @code{as} configured to generate |
295 | code for Advanced Micro Devices' 29K architectures. | |
09352a5d | 296 | _fi__(_AMD29K__) |
b50e59fe RP |
297 | |
298 | @menu | |
299 | * Invoking:: Invoking @code{as} | |
300 | * Manual:: Structure of this Manual | |
301 | * GNU Assembler:: as, the GNU Assembler | |
302 | * Command Line:: Command Line | |
303 | * Input Files:: Input Files | |
304 | * Object:: Output (Object) File | |
305 | * Errors:: Error and Warning Messages | |
306 | * Options:: Options | |
307 | @end menu | |
47342e8f | 308 | |
b50e59fe RP |
309 | @node Invoking, Manual, Overview, Overview |
310 | @section Invoking @code{as} | |
47342e8f | 311 | |
b50e59fe RP |
312 | Here is a brief summary of how to invoke GNU @code{as}. For details, |
313 | @pxref{Options}. | |
314 | ||
315 | @c We don't use @deffn and friends for the following because they seem | |
316 | @c to be limited to one line for the header. | |
47342e8f | 317 | @example |
b50e59fe | 318 | as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ] |
09352a5d RP |
319 | _if__(_M680X0__) |
320 | [ -l ] [ -mc68000 | -mc68010 | -mc68020 ] | |
321 | _fi__(_M680X0__) | |
322 | _if__(_AMD29K__) | |
323 | @c am29k has no machine-dependent assembler options | |
324 | _fi__(_AMD29K__) | |
47342e8f RP |
325 | [ -- | @var{files} @dots{} ] |
326 | @end example | |
327 | ||
328 | @table @code | |
b50e59fe RP |
329 | |
330 | @item -D | |
331 | This option is accepted only for script compatibility with calls to | |
332 | other assemblers; it has no effect on GNU @code{as}. | |
333 | ||
47342e8f RP |
334 | @item -f |
335 | ``fast''---skip preprocessing (assume source is compiler output) | |
336 | ||
b50e59fe RP |
337 | @item -I @var{path} |
338 | Add @var{path} to the search list for @code{.include} directives | |
339 | ||
47342e8f | 340 | @item -k |
09352a5d | 341 | _if__(_AMD29K__) |
b50e59fe | 342 | This option is accepted but has no effect on the 29K family. |
09352a5d RP |
343 | _fi__(_AMD29K__) |
344 | _if__(!_AMD29K__) | |
345 | Issue warnings when difference tables are altered for long displacements |
346 | _fi__(!_AMD29K__) | |
47342e8f RP |
347 | |
348 | @item -L | |
349 | Keep (in symbol table) local symbols, starting with @samp{L} | |
350 | ||
351 | @item -o @var{objfile} | |
352 | Name the object-file output from @code{as} | |
353 | ||
354 | @item -R | |
355 | Fold data segment into text segment | |
356 | ||
357 | @item -W | |
b50e59fe | 358 | Suppress warning messages |
47342e8f | 359 | |
09352a5d RP |
360 | _if__(_M680X0__) |
361 | @item -l | |
362 | Shorten references to undefined symbols, to one word instead of two | |
363 | ||
364 | @item -mc68000 | -mc68010 | -mc68020 | |
365 | Specify what processor in the 68000 family is the target (default 68020) | |
366 | _fi__(_M680X0__) | |
47342e8f RP |
367 | |
368 | @item -- | @var{files} @dots{} | |
369 | Source files to assemble, or standard input | |
370 | @end table | |
371 | ||
b50e59fe | 372 | @node Manual, GNU Assembler, Invoking, Overview |
47342e8f RP |
373 | @section Structure of this Manual |
374 | This document is intended to describe what you need to know to use GNU | |
375 | @code{as}. We cover the syntax expected in source files, including | |
376 | notation for symbols, constants, and expressions; the directives that | |
377 | @code{as} understands; and of course how to invoke @code{as}. | |
378 | ||
09352a5d RP |
379 | _if__(_M680X0__ && !_ALL_ARCH__) |
380 | We also cover special features in the 68000 configuration of @code{as}, | |
381 | including pseudo-operations. | |
382 | _fi__(_M680X0__ && !_ALL_ARCH__) | |
383 | _if__(_AMD29K__ && !_ALL_ARCH__) | |
b50e59fe RP |
384 | We also cover special features in the AMD 29K configuration of @code{as}, |
385 | including assembler directives. | |
09352a5d | 386 | _fi__(_AMD29K__ && !_ALL_ARCH__) |
47342e8f | 387 | |
09352a5d RP |
388 | _if__(_ALL_ARCH__) |
389 | This document also describes some of the machine-dependent features of | |
390 | various flavors of the assembler. | |
391 | _fi__(_ALL_ARCH__) | |
392 | _if__(_INTERNALS__) | |
93b45514 RP |
393 | This document also describes how the assembler works internally, and |
394 | provides some information that may be useful to people attempting to | |
395 | port the assembler to another machine. | |
09352a5d | 396 | _fi__(_INTERNALS__) |
93b45514 | 397 | |
47342e8f | 398 | On the other hand, this manual is @emph{not} intended as an introduction |
b50e59fe RP |
399 | to programming in assembly language---let alone programming in general! |
400 | In a similar vein, we make no attempt to introduce the machine | |
47342e8f RP |
401 | architecture; we do @emph{not} describe the instruction set, standard |
402 | mnemonics, registers or addressing modes that are standard to a | |
403 | particular architecture. You may want to consult the manufacturer's | |
b50e59fe RP |
404 | machine architecture manual for this information. |
405 | ||
93b45514 | 406 | |
47342e8f RP |
407 | @c I think this is [email protected], 17jan1991 |
408 | @ignore | |
93b45514 RP |
409 | Throughout this document, we assume that you are running @dfn{GNU}, |
410 | the portable operating system from the @dfn{Free Software | |
411 | Foundation, Inc.}. This restricts our attention to certain kinds of | |
47342e8f | 412 | computer (in particular, the kinds of computers that GNU can run on); |
93b45514 RP |
413 | once this assumption is granted examples and definitions need less |
414 | qualification. | |
415 | ||
93b45514 RP |
416 | @code{as} is part of a team of programs that turn a high-level |
417 | human-readable series of instructions into a low-level | |
418 | computer-readable series of instructions. Different versions of | |
09352a5d | 419 | @code{as} are used for different kinds of computer. |
47342e8f | 420 | @end ignore |
93b45514 | 421 | |
b50e59fe RP |
422 | @c There used to be a section "Terminology" here, which defined |
423 | @c "contents", "byte", "word", and "long". Defining "word" to any | |
424 | @c particular size is confusing when the .word directive may generate 16 | |
425 | @c bits on one machine and 32 bits on another; in general, for the user | |
426 | @c version of this manual, none of these terms seem essential to define. | |
427 | @c They were used very little even in the former draft of the manual; | |
428 | @c this draft makes an effort to avoid them (except in names of | |
429 | @c directives). | |
93b45514 | 430 | |
b50e59fe | 431 | @node GNU Assembler, Command Line, Manual, Overview |
93b45514 | 432 | @section as, the GNU Assembler |
47342e8f RP |
433 | @code{as} is primarily intended to assemble the output of the GNU C |
434 | compiler @code{gcc} for use by the linker @code{ld}. Nevertheless, | |
b50e59fe RP |
435 | we've tried to make @code{as} assemble correctly everything that the native |
436 | assembler would. | |
09352a5d | 437 | _if__(_VAX__) |
b50e59fe | 438 | Any exceptions are documented explicitly (@pxref{Machine Dependent}). |
09352a5d | 439 | _fi__(_VAX__) |
b50e59fe RP |
440 | This doesn't mean @code{as} always uses the same syntax as another |
441 | assembler for the same architecture; for example, we know of several | |
442 | incompatible versions of 680x0 assembly language syntax. | |
47342e8f RP |
443 | |
444 | GNU @code{as} is really a family of assemblers. If you use (or have | |
445 | used) GNU @code{as} on another architecture, you should find a fairly | |
446 | similar environment. Each version has much in common with the others, | |
447 | including object file formats, most assembler directives (often called | |
93b45514 RP |
448 | @dfn{pseudo-ops}) and assembler syntax. |
449 | ||
b50e59fe RP |
450 | Unlike older assemblers, @code{as} is designed to assemble a source |
451 | program in one pass of the source file. This has a subtle impact on the | |
452 | @kbd{.org} directive (@pxref{Org}). | |
93b45514 | 453 | |
b50e59fe RP |
454 | @node Command Line, Input Files, GNU Assembler, Overview |
455 | @section Command Line | |
93b45514 RP |
456 | |
457 | After the program name @code{as}, the command line may contain | |
458 | options and file names. Options may be in any order, and may be | |
459 | before, after, or between file names. The order of file names is | |
460 | significant. | |
461 | ||
47342e8f | 462 | @file{--} (two hyphens) by itself names the standard input file |
b50e59fe | 463 | explicitly, as one of the files for @code{as} to assemble. |
47342e8f | 464 | |
93b45514 RP |
465 | Except for @samp{--} any command line argument that begins with a |
466 | hyphen (@samp{-}) is an option. Each option changes the behavior of | |
467 | @code{as}. No option changes the way another option works. An | |
47342e8f | 468 | option is a @samp{-} followed by one or more letters; the case of |
b50e59fe | 469 | the letter is important. All options are optional. |
93b45514 RP |
470 | |
471 | Some options expect exactly one file name to follow them. The file | |
472 | name may either immediately follow the option's letter (compatible | |
473 | with older assemblers) or it may be the next command argument (GNU | |
474 | standard). These two command lines are equivalent: | |
475 | ||
476 | @example | |
477 | as -o my-object-file.o mumble | |
478 | as -omy-object-file.o mumble | |
479 | @end example | |
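The command-line rules above can be modelled in a few lines. The following Python sketch is illustrative only and is not the parser used by `as`. The names `classify_arguments` and `OPTIONS_WITH_ARGUMENT` are hypothetical, and the assumption that only `-o` and `-I` take an argument is read off the synopsis in the "Invoking" section.

```python
# Illustrative model of the command-line rules; not the real parser in as.
# Assumption: only -o and -I take an argument (taken from the synopsis).
OPTIONS_WITH_ARGUMENT = {"o", "I"}


def classify_arguments(args):
    """Split the arguments after the program name into options and input files.

    Returns (options, files): options is a list of (name, value) pairs and
    files keeps the command-line order, with "--" standing for standard input.
    """
    options = []
    files = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Two hyphens by themselves name the standard input file.
            files.append(arg)
        elif arg.startswith("-") and len(arg) > 1:
            letter, rest = arg[1], arg[2:]
            if letter in OPTIONS_WITH_ARGUMENT:
                if not rest:
                    # GNU style: the value is the next command argument.
                    i += 1
                    if i == len(args):
                        raise ValueError("option -%s needs an argument" % letter)
                    rest = args[i]
                options.append((letter, rest))
            else:
                options.append((arg[1:], None))
        else:
            # Anything else (including a lone "-" in this sketch) is a file name.
            files.append(arg)
        i += 1
    return options, files
```

Under these assumptions both spellings from the example, `-o my-object-file.o` and `-omy-object-file.o`, yield the same `("o", "my-object-file.o")` pair, and the relative order of the file names is preserved.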
480 | ||
b50e59fe | 481 | @node Input Files, Object, Command Line, Overview |
47342e8f | 482 | @section Input Files |
93b45514 | 483 | |
47342e8f | 484 | We use the phrase @dfn{source program}, abbreviated @dfn{source}, to |
93b45514 RP |
485 | describe the program input to one run of @code{as}. The program may |
486 | be in one or more files; how the source is partitioned into files | |
487 | doesn't change the meaning of the source. | |
488 | ||
b50e59fe RP |
489 | @c I added "con" prefix to "catenation" just to prove I can overcome my |
490 | @c APL training... [email protected] | |
491 | The source program is a concatenation of the text in all the files, in the | |
47342e8f | 492 | order specified. |
93b45514 RP |
493 | |
494 | Each time you run @code{as} it assembles exactly one source | |
47342e8f | 495 | program. The source program is made up of one or more files. |
93b45514 RP |
496 | (The standard input is also a file.) |
497 | ||
498 | You give @code{as} a command line that has zero or more input file | |
499 | names. The input files are read (from left file name to right). A | |
500 | command line argument (in any position) that has no special meaning | |
47342e8f | 501 | is taken to be an input file name. |
93b45514 | 502 | |
47342e8f RP |
503 | If @code{as} is given no file names it attempts to read one input file |
504 | from @code{as}'s standard input, which is normally your terminal. You | |
505 | may have to type @key{ctl-D} to tell @code{as} there is no more program | |
506 | to assemble. | |
93b45514 | 507 | |
47342e8f RP |
508 | Use @samp{--} if you need to explicitly name the standard input file |
509 | in your command line. | |
93b45514 | 510 | |
b50e59fe | 511 | If the source is empty, @code{as} will produce a small, empty object |
47342e8f | 512 | file. |
93b45514 | 513 | |
b50e59fe RP |
514 | @menu |
515 | * Filenames:: Input Filenames and Line-numbers | |
516 | @end menu | |
517 | ||
518 | @node Filenames, , Input Files, Input Files | |
93b45514 | 519 | @subsection Input Filenames and Line-numbers |
47342e8f | 520 | There are two ways of locating a line in the input file (or files) and both |
93b45514 RP |
521 | are used in reporting error messages. One way refers to a line |
522 | number in a physical file; the other refers to a line number in a | |
47342e8f | 523 | ``logical'' file. |
93b45514 RP |
524 | |
525 | @dfn{Physical files} are those files named in the command line given | |
526 | to @code{as}. | |
527 | ||
47342e8f RP |
528 | @dfn{Logical files} are simply names declared explicitly by assembler |
529 | directives; they bear no relation to physical files. Logical file names | |
530 | help error messages reflect the original source file, when @code{as} | |
b50e59fe | 531 | source is itself synthesized from other files. @xref{App-File}. |
93b45514 | 532 | |
b50e59fe | 533 | @node Object, Errors, Input Files, Overview |
93b45514 RP |
534 | @section Output (Object) File |
535 | Every time you run @code{as} it produces an output file, which is | |
536 | your assembly language program translated into numbers. This file | |
47342e8f | 537 | is the object file, named @code{a.out} unless you tell @code{as} to |
93b45514 RP |
538 | give it another name by using the @code{-o} option. Conventionally, |
539 | object file names end with @file{.o}. The default name of | |
47342e8f | 540 | @file{a.out} is used for historical reasons: older assemblers were |
93b45514 | 541 | capable of assembling self-contained programs directly into a |
47342e8f RP |
542 | runnable program. |
543 | @c This may still work, but hasn't been tested. | |
93b45514 | 544 | |
47342e8f | 545 | The object file is meant for input to the linker @code{ld}. It contains |
b50e59fe RP |
546 | assembled program code, information to help @code{ld} integrate |
547 | the assembled program into a runnable file, and (optionally) symbolic | |
47342e8f | 548 | information for the debugger. |
93b45514 RP |
549 | |
550 | @comment link above to some info file(s) like the description of a.out. | |
551 | @comment don't forget to describe GNU info as well as Unix lossage. | |
552 | ||
b50e59fe | 553 | @node Errors, Options, Object, Overview |
93b45514 RP |
554 | @section Error and Warning Messages |
555 | ||
b50e59fe RP |
556 | @code{as} may write warnings and error messages to the standard error |
557 | file (usually your terminal). This should not happen when @code{as} is | |
558 | run automatically by a compiler. Warnings report an assumption made so | |
559 | that @code{as} could keep assembling a flawed program; errors report a | |
560 | grave problem that stops the assembly. | |
93b45514 RP |
561 | |
562 | Warning messages have the format | |
563 | @example | |
b50e59fe | 564 | file_name:@b{NNN}:Warning Message Text |
93b45514 | 565 | @end example |
b50e59fe RP |
566 | @noindent(where @b{NNN} is a line number). If a logical file name has |
567 | been given (@pxref{App-File}) it is used for the filename, otherwise the | |
568 | name of the current input file is used. If a logical line number was | |
63f5d795 | 569 | given |
09352a5d RP |
570 | _if__(!_AMD29K__) |
571 | (@pxref{Line}) | |
572 | _fi__(!_AMD29K__) | |
573 | _if__(_AMD29K__) | |
63f5d795 | 574 | (@pxref{Ln}) |
09352a5d | 575 | _fi__(_AMD29K__) |
63f5d795 | 576 | then it is used to calculate the number printed, |
b50e59fe RP |
577 | otherwise the actual line in the current source file is printed. The |
578 | message text is intended to be self-explanatory (in the grand Unix | |
63f5d795 | 579 | tradition). @refill |
93b45514 RP |
580 | |
581 | Error messages have the format | |
582 | @example | |
b50e59fe | 583 | file_name:@b{NNN}:FATAL:Error Message Text |
93b45514 | 584 | @end example |
47342e8f | 585 | The file name and line number are derived as for warning |
93b45514 RP |
586 | messages. The actual message text may be rather less explanatory |
587 | because many of these errors aren't supposed to happen. | |
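The two message layouts shown in this section differ only in the `FATAL:` marker, so a script that post-processes assembler output could tell them apart as sketched below. This is a minimal Python sketch, not part of `as`. The function name `parse_diagnostic` is hypothetical, and the sketch assumes the file name contains no colon.

```python
def parse_diagnostic(line):
    """Split one diagnostic line into (file_name, line_number, is_fatal, text).

    Assumes the layouts file_name:NNN:text and file_name:NNN:FATAL:text,
    and that file_name itself contains no colon.
    """
    file_name, number, rest = line.split(":", 2)
    is_fatal = rest.startswith("FATAL:")
    text = rest[len("FATAL:"):] if is_fatal else rest
    return file_name, int(number), is_fatal, text
```

The file name and line number returned here are whatever `as` printed, so they may be the logical ones set by directives and not the physical ones.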
588 | ||
63f5d795 | 589 | @group |
b50e59fe | 590 | @node Options, , Errors, Overview |
47342e8f | 591 | @section Options |
b50e59fe RP |
592 | @subsection @code{-D} |
593 | This option has no effect whatsoever, but it is accepted to make it more | |
594 | likely that scripts written for other assemblers will also work with | |
595 | GNU @code{as}. | |
63f5d795 | 596 | @end group |
b50e59fe RP |
597 | |
598 | @subsection Work Faster: @code{-f} | |
93b45514 | 599 | @samp{-f} should only be used when assembling programs written by a |
47342e8f | 600 | (trusted) compiler. @samp{-f} stops the assembler from pre-processing |
b50e59fe RP |
601 | the input file(s) before assembling them. |
602 | @quotation | |
603 | @emph{Warning:} if the files actually need to be pre-processed (if they | |
604 | contain comments, for example), @code{as} will not work correctly if | |
605 | @samp{-f} is used. | |
606 | @end quotation | |
607 | ||
608 | @subsection Add to @code{.include} search path: @code{-I} @var{path} | |
609 | Use this option to add a @var{path} to the list of directories GNU | |
610 | @code{as} will search for files specified in @code{.include} directives | |
611 | (@pxref{Include}). You may use @code{-I} as many times as necessary to | |
612 | include a variety of paths. The current working directory is always | |
613 | searched first; after that, @code{as} searches any @samp{-I} directories | |
614 | in the same order as they were specified (left to right) on the command | |
615 | line. | |
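The search order described above (current working directory first, then each `-I` directory from left to right) can be written down as a short Python sketch. It is an illustration of the stated order only and is not how `as` is implemented. The function name `include_candidates` and the sample directory names are hypothetical.

```python
import posixpath


def include_candidates(file_name, include_dirs, cwd="."):
    """Return the paths to try for an .include file, in search order.

    include_dirs holds the -I paths in the order given on the command line;
    the current working directory is always tried first.
    """
    search_dirs = [cwd] + list(include_dirs)
    return [posixpath.join(directory, file_name) for directory in search_dirs]
```

For example, `include_candidates("defs.s", ["inc", "/usr/lib/asm"])` lists `./defs.s` before `inc/defs.s` and `/usr/lib/asm/defs.s`; the first candidate that exists would be the one used.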
616 | ||
617 | @subsection Warn if difference tables altered: @code{-k} | |
09352a5d | 618 | _if__(_AMD29K__) |
b50e59fe RP |
619 | On the AMD 29K family, this option is allowed, but has no effect. It is |
620 | permitted for compatibility with GNU @code{as} on other platforms, | |
621 | where it can be used to warn when @code{as} alters the machine code | |
622 | generated for @samp{.word} directives in difference tables. The AMD 29K | |
623 | family does not have the addressing limitations that sometimes lead to this | |
624 | alteration on other platforms. | |
09352a5d | 625 | _fi__(_AMD29K__) |
b50e59fe | 626 | |
09352a5d | 627 | _if__(!_AMD29K__) |
47342e8f RP |
628 | @code{as} sometimes alters the code emitted for directives of the form |
629 | @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}. | |
630 | You can use the @samp{-k} option if you want a warning issued when this | |
631 | is done. | |
09352a5d | 632 | _fi__(!_AMD29K__) |
47342e8f | 633 | |
b50e59fe RP |
634 | @subsection Include Local Labels: @code{-L} |
635 | Labels beginning with @samp{L} (upper case only) are called @dfn{local | |
636 | labels}. @xref{Symbol Names}. Normally you don't see such labels when | |
47342e8f | 637 | debugging, because they are intended for the use of programs (like |
b50e59fe | 638 | compilers) that compose assembler programs, not for your notice. |
47342e8f | 639 | Normally both @code{as} and @code{ld} discard such labels, so you don't |
b50e59fe | 640 | normally debug with them. |
93b45514 RP |
641 | |
642 | This option tells @code{as} to retain those @samp{L@dots{}} symbols | |
643 | in the object file. Usually if you do this you also tell the linker | |
644 | @code{ld} to preserve symbols whose names begin with @samp{L}. | |
645 | ||
b50e59fe | 646 | @subsection Name the Object File: @code{-o} |
93b45514 RP |
647 | There is always one object file output when you run @code{as}. By |
648 | default it has the name @file{a.out}. You use this option (which | |
649 | takes exactly one filename) to give the object file a different name. | |
650 | ||
651 | Whatever the object file is called, @code{as} will overwrite any | |
652 | existing file of the same name. | |
653 | ||
f4335d56 | 654 | @subsection Data Segment into Text Segment: @code{-R} |
93b45514 RP |
655 | @code{-R} tells @code{as} to write the object file as if all |
656 | data-segment data lives in the text segment. This is only done at | |
657 | the very last moment: your binary data are the same, but data | |
658 | segment parts are relocated differently. The data segment part of | |
659 | your object file is zero bytes long because all its bytes are |
660 | appended to the text segment. (@xref{Segments}.) | |
661 | ||
b50e59fe | 662 | When you specify @code{-R} it would be possible to generate shorter |
47342e8f RP |
663 | address displacements (because we don't have to cross between text and |
664 | data segment). We don't do this simply for compatibility with older | |
b50e59fe | 665 | versions of @code{as}. In future, @code{-R} may work this way. |
93b45514 | 666 | |
b50e59fe | 667 | @subsection Suppress Warnings: @code{-W} |
93b45514 RP |
668 | @code{as} should never give a warning or error message when |
669 | assembling compiler output. But programs written by people often | |
670 | cause @code{as} to give a warning that a particular assumption was | |
671 | made. All such warnings are directed to the standard error file. | |
47342e8f RP |
672 | If you use this option, no warnings are issued. This option only |
673 | affects the warning messages: it does not change any detail of how | |
93b45514 RP |
674 | @code{as} assembles your file. Errors, which stop the assembly, are |
675 | still reported. | |
676 | ||
b50e59fe | 677 | @node Syntax, Segments, Overview, Top |
47342e8f RP |
678 | @chapter Syntax |
679 | This chapter describes the machine-independent syntax allowed in a | |
680 | source file. @code{as} syntax is similar to what many other assemblers | |
b50e59fe | 681 | use; it is inspired by the BSD 4.2 |
09352a5d | 682 | _if__(!_VAX__) |
b50e59fe | 683 | assembler. @refill |
09352a5d RP |
684 | _fi__(!_VAX__) |
685 | _if__(_VAX__) | |
686 | assembler, except that @code{as} does not assemble Vax bit-fields. | |
687 | _fi__(_VAX__) | |
b50e59fe RP |
688 | |
689 | @menu | |
690 | * Pre-processing:: Pre-processing | |
691 | * Whitespace:: Whitespace | |
692 | * Comments:: Comments | |
693 | * Symbol Intro:: Symbols | |
694 | * Statements:: Statements | |
695 | * Constants:: Constants | |
696 | @end menu | |
93b45514 | 697 | |
b50e59fe RP |
698 | @node Pre-processing, Whitespace, Syntax, Syntax |
699 | @section Pre-processing | |
93b45514 | 700 | |
b50e59fe RP |
701 | The pre-processor: |
702 | @itemize @bullet | |
703 | @item | |
704 | adjusts and removes extra whitespace. It leaves one space or tab before | |
705 | the keywords on a line, and turns any other whitespace on the line into | |
706 | a single space. | |
93b45514 | 707 | |
b50e59fe RP |
708 | @item |
709 | removes all comments, replacing them with a single space, or an | |
710 | appropriate number of newlines. | |
93b45514 | 711 | |
b50e59fe RP |
712 | @item |
713 | converts character constants into the appropriate numeric values. | |
714 | @end itemize | |
715 | ||
716 | Excess whitespace, comments, and character constants | |
93b45514 RP |
717 | cannot be used in the portions of the input text that are not |
718 | pre-processed. | |
719 | ||
b50e59fe RP |
720 | If the first line of an input file is @code{#NO_APP} or the @samp{-f} |
721 | option is given, the input file will not be pre-processed. Within such | |
722 | an input file, parts of the file can be pre-processed by putting a line | |
723 | that says @code{#APP} before the text that should be pre-processed, and | |
724 | putting a line that says @code{#NO_APP} after them. This feature is | |
725 | mainly intended to support @code{asm} statements in compilers whose output | |
726 | normally does not need to be pre-processed. | |
93b45514 | 727 | |
b50e59fe | 728 | @node Whitespace, Comments, Pre-processing, Syntax |
93b45514 RP |
729 | @section Whitespace |
730 | @dfn{Whitespace} is one or more blanks or tabs, in any order. | |
731 | Whitespace is used to separate symbols, and to make programs neater | |
732 | for people to read. Unless within character constants | |
b50e59fe | 733 | (@pxref{Characters}), any whitespace means the same as exactly one |
93b45514 RP |
734 | space. |
735 | ||
b50e59fe | 736 | @node Comments, Symbol Intro, Whitespace, Syntax |
93b45514 RP |
737 | @section Comments |
738 | There are two ways of rendering comments to @code{as}. In both | |
739 | cases the comment is equivalent to one space. | |
740 | ||
47342e8f RP |
741 | Anything from @samp{/*} through the next @samp{*/} is a comment. |
742 | This means you may not nest these comments. | |
93b45514 RP |
743 | |
744 | @example | |
745 | /* | |
746 | The only way to include a newline ('\n') in a comment | |
747 | is to use this sort of comment. | |
748 | */ | |
47342e8f | 749 | |
93b45514 RP |
750 | /* This sort of comment does not nest. */ |
751 | @end example | |
752 | ||
753 | Anything from the @dfn{line comment} character to the next newline | |
47342e8f | 754 | is considered a comment and is ignored. The line comment character is |
09352a5d RP |
755 | _if__(_VAX__) |
756 | @samp{#} on the Vax; | |
757 | _fi__(_VAX__) | |
758 | _if__(_M680X0__) | |
759 | @samp{|} on the 680x0; | |
760 | _fi__(_M680X0__) | |
761 | _if__(_AMD29K__) | |
762 | @samp{;} for the AMD 29K family; | |
763 | _fi__(_AMD29K__) | |
764 | @pxref{Machine Dependent}. @refill | |
765 | ||
766 | _if__(_ALL_ARCH__) | |
b50e59fe RP |
767 | On some machines there are two different line comment characters. One |
768 | will only begin a comment if it is the first non-whitespace character on | |
769 | a line, while the other will always begin a comment. | |
09352a5d | 770 | _fi__(_ALL_ARCH__) |
93b45514 RP |
771 | |
772 | To be compatible with past assemblers a special interpretation is | |
773 | given to lines that begin with @samp{#}. Following the @samp{#} an | |
774 | absolute expression (@pxref{Expressions}) is expected: this will be | |
775 | the logical line number of the @b{next} line. Then a string | |
776 | (@pxref{Strings}) is allowed: if present it is a new logical file | |
777 | name. The rest of the line, if any, should be whitespace. | |
778 | ||
779 | If the first non-whitespace characters on the line are not numeric, | |
780 | the line is ignored. (Just like a comment.) | |
781 | @example | |
782 | # This is an ordinary comment. | |
783 | # 42-6 "new_file_name" # New logical file name | |
784 | # This is logical line # 36. | |
785 | @end example | |
786 | This feature is deprecated, and may disappear from future versions | |
787 | of @code{as}. | |
788 | ||
b50e59fe | 789 | @node Symbol Intro, Statements, Comments, Syntax |
93b45514 RP |
790 | @section Symbols |
791 | A @dfn{symbol} is one or more characters chosen from the set of all | |
792 | letters (both upper and lower case), digits and the three characters | |
b50e59fe RP |
793 | @samp{_.$}. No symbol may begin with a digit. Case is significant. |
794 | There is no length limit: all characters are significant. Symbols are | |
795 | delimited by characters not in that set, or by the beginning of a file | |
796 | (since the source program must end with a newline, the end of a file is | |
797 | not a possible symbol delimiter). @xref{Symbols}. | |
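The symbol rule above can be sketched as a regular expression. This is only an illustration in Python, not part of @code{as}; the names @code{SYMBOL} and @code{is_symbol} are hypothetical.

```python
import re

# Letters (either case), digits and the three characters _ . $
# The first character may not be a digit.  There is no length limit.
SYMBOL = re.compile(r"[A-Za-z_.$][A-Za-z0-9_.$]*")


def is_symbol(text):
    """Return True if the whole of `text` is a symbol by the rule above."""
    return SYMBOL.fullmatch(text) is not None
```

Because case is significant, @code{Loop} and @code{loop} both pass this check but name two different symbols.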
93b45514 | 798 | |
b50e59fe | 799 | @node Statements, Constants, Symbol Intro, Syntax |
93b45514 | 800 | @section Statements |
b50e59fe | 801 | A @dfn{statement} ends at a newline character (@samp{\n}) |
09352a5d RP |
802 | _if__(!_AMD29K__) |
803 | or at a semicolon (@samp{;}). The newline or semicolon | |
804 | _fi__(!_AMD29K__) | |
805 | _if__(_AMD29K__) | |
b50e59fe | 806 | or an ``at'' sign (@samp{@@}). The newline or at sign |
09352a5d | 807 | _fi__(_AMD29K__) |
b50e59fe RP |
808 | is considered part |
809 | of the preceding statement. Newlines | |
09352a5d RP |
810 | _if__(!_AMD29K__) |
811 | and semicolons | |
812 | _fi__(!_AMD29K__) | |
813 | _if__(_AMD29K__) | |
b50e59fe | 814 | and at signs |
09352a5d | 815 | _fi__(_AMD29K__) |
b50e59fe | 816 | within |
93b45514 RP |
817 | character constants are an exception: they don't end statements. |
818 | It is an error to end any statement with end-of-file: the last | |
b50e59fe | 819 | character of any input file should be a newline.@refill |
93b45514 RP |
820 | |
821 | You may write a statement on more than one line if you put a | |
822 | backslash (@kbd{\}) immediately in front of any newlines within the | |
823 | statement. When @code{as} reads a backslashed newline both | |
824 | characters are ignored. You can even put backslashed newlines in | |
825 | the middle of symbol names without changing the meaning of your | |
826 | source program. | |
827 | ||
47342e8f | 828 | An empty statement is allowed, and may include whitespace. It is ignored. |
93b45514 | 829 | |
b50e59fe RP |
830 | @c "key symbol" is not used elsewhere in the document; seems pedantic to |
831 | @c @defn{} it in that case, as was done previously... [email protected], | |
832 | @c 13feb91. | |
47342e8f | 833 | A statement begins with zero or more labels, optionally followed by a |
b50e59fe | 834 | key symbol which determines what kind of statement it is. The key |
93b45514 | 835 | symbol determines the syntax of the rest of the statement. If the |
b50e59fe | 836 | symbol begins with a dot @samp{.} then the statement is an assembler |
47342e8f RP |
837 | directive: typically valid for any computer. If the symbol begins with |
838 | a letter the statement is an assembly language @dfn{instruction}: it | |
839 | will assemble into a machine language instruction. Different versions | |
840 | of @code{as} for different computers will recognize different | |
841 | instructions. In fact, the same symbol may represent a different | |
842 | instruction in a different computer's assembly language. | |
843 | ||
844 | A label is a symbol immediately followed by a colon (@code{:}). | |
845 | Whitespace before a label or after a colon is permitted, but you may not | |
846 | have whitespace between a label's symbol and its colon. @xref{Labels}. | |
93b45514 RP |
847 | |
848 | @example | |
849 | label: .directive followed by something | |
850 | another$label: # This is an empty statement. | |
851 | instruction operand_1, operand_2, @dots{} | |
852 | @end example | |
853 | ||
b50e59fe | 854 | @node Constants, , Statements, Syntax |
93b45514 RP |
855 | @section Constants |
856 | A constant is a number, written so that its value is known by | |
857 | inspection, without knowing any context. Like this: | |
f4335d56 | 858 | @smallexample |
93b45514 RP |
859 | .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value. |
860 | .ascii "Ring the bell\7" # A string constant. | |
861 | .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum. | |
862 | .float 0f-314159265358979323846264338327\ | |
863 | 95028841971.693993751E-40 # - pi, a flonum. | |
f4335d56 | 864 | @end smallexample |
93b45514 | 865 | |
b50e59fe RP |
866 | @menu |
867 | * Characters:: Character Constants | |
868 | * Numbers:: Number Constants | |
869 | @end menu | |
870 | ||
871 | @node Characters, Numbers, Constants, Constants | |
93b45514 | 872 | @subsection Character Constants |
47342e8f RP |
873 | There are two kinds of character constants. A @dfn{character} stands |
874 | for one character in one byte and its value may be used in | |
93b45514 | 875 | numeric expressions. String constants (properly called string |
47342e8f | 876 | @emph{literals}) are potentially many bytes and their values may not be |
93b45514 RP |
877 | used in arithmetic expressions. |
878 | ||
b50e59fe RP |
879 | @menu |
880 | * Strings:: Strings | |
881 | * Chars:: Characters | |
882 | @end menu | |
883 | ||
884 | @node Strings, Chars, Characters, Characters | |
93b45514 RP |
885 | @subsubsection Strings |
886 | A @dfn{string} is written between double-quotes. It may contain | |
47342e8f | 887 | double-quotes or null characters. The way to get special characters |
93b45514 | 888 | into a string is to @dfn{escape} these characters: precede them with |
b50e59fe | 889 | a backslash @samp{\} character. For example @samp{\\} represents |
93b45514 RP |
890 | one backslash: the first @code{\} is an escape which tells |
891 | @code{as} to interpret the second character literally as a backslash | |
892 | (which prevents @code{as} from recognizing the second @code{\} as an | |
893 | escape character). The complete list of escapes follows. | |
894 | ||
895 | @table @kbd | |
93b45514 RP |
896 | @c @item \a |
897 | @c Mnemonic for ACKnowledge; for ASCII this is octal code 007. | |
898 | @item \b | |
899 | Mnemonic for backspace; for ASCII this is octal code 010. | |
900 | @c @item \e | |
901 | @c Mnemonic for EOText; for ASCII this is octal code 004. | |
902 | @item \f | |
903 | Mnemonic for FormFeed; for ASCII this is octal code 014. | |
904 | @item \n | |
905 | Mnemonic for newline; for ASCII this is octal code 012. | |
906 | @c @item \p | |
907 | @c Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}. | |
908 | @item \r | |
909 | Mnemonic for carriage-Return; for ASCII this is octal code 015. | |
910 | @c @item \s | |
911 | @c Mnemonic for space; for ASCII this is octal code 040. Included for compliance with | |
912 | @c other assemblers. | |
913 | @item \t | |
914 | Mnemonic for horizontal Tab; for ASCII this is octal code 011. | |
915 | @c @item \v | |
916 | @c Mnemonic for Vertical tab; for ASCII this is octal code 013. | |
917 | @c @item \x @var{digit} @var{digit} @var{digit} | |
918 | @c A hexadecimal character code. The numeric code is 3 hexadecimal digits. | |
919 | @item \ @var{digit} @var{digit} @var{digit} | |
920 | An octal character code. The numeric code is 3 octal digits. | |
47342e8f RP |
921 | For compatibility with other Unix systems, 8 and 9 are accepted as digits: |
922 | for example, @code{\008} has the value 010, and @code{\009} the value 011. | |
93b45514 RP |
923 | @item \\ |
924 | Represents one @samp{\} character. | |
925 | @c @item \' | |
926 | @c Represents one @samp{'} (accent acute) character. | |
927 | @c This is needed in single character literals | |
928 | @c (@xref{Characters}.) to represent | |
929 | @c a @samp{'}. | |
930 | @item \" | |
931 | Represents one @samp{"} character. Needed in strings to represent | |
932 | this character, because an unescaped @samp{"} would end the string. | |
933 | @item \ @var{anything-else} | |
934 | Any other character when escaped by @kbd{\} will give a warning, but | |
935 | assemble as if the @samp{\} was not present. The idea is that if | |
936 | you used an escape sequence you clearly didn't want the literal | |
937 | interpretation of the following character. However @code{as} has no | |
938 | other interpretation, so @code{as} knows it is giving you the wrong | |
939 | code and warns you of the fact. | |
940 | @end table | |
941 | ||
942 | Which characters are escapable, and what those escapes represent, | |
943 | varies widely among assemblers. The current set is what we think | |
944 | BSD 4.2 @code{as} recognizes, and is a subset of what most C | |
945 | compilers recognize. If you are in doubt, don't use an escape | |
946 | sequence. | |
947 | ||
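The escape table above can be read as a small decoding loop. The following Python sketch is an illustration under stated assumptions, not the code @code{as} uses: the name @code{decode_string_body} is hypothetical, input is assumed to be ASCII, an octal escape is taken to be one to three digits (the earlier @code{"Ring the bell\7"} example uses one), and values are truncated to one byte.

```python
import warnings

# Escapes listed in the table above, with their ASCII codes.
SIMPLE_ESCAPES = {
    "b": 0o10, "f": 0o14, "n": 0o12, "r": 0o15, "t": 0o11,
    "\\": ord("\\"), '"': ord('"'),
}
DIGITS = "0123456789"  # 8 and 9 are accepted as "octal" digits


def decode_string_body(body):
    """Decode the text between the double-quotes into a list of byte values."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\":
            out.append(ord(ch))
            continue
        if i >= len(body):
            raise ValueError("string ends with a lone backslash")
        ch = body[i]
        if ch in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[ch])
            i += 1
        elif ch in DIGITS:
            value = 0
            count = 0
            # Assumption: at most three digits belong to one escape.
            while i < len(body) and count < 3 and body[i] in DIGITS:
                value = value * 8 + int(body[i])
                i += 1
                count += 1
            out.append(value & 0xFF)  # assumption: keep the low byte only
        else:
            # Any other escaped character: warn, then take it literally.
            warnings.warn("unknown escape \\%s" % ch)
            out.append(ord(ch))
            i += 1
    return out
```

With this reading, @code{\008} decodes to 8 (octal 010) and @code{\009} to 9 (octal 011), matching the table.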
b50e59fe | 948 | @node Chars, , Strings, Characters |
93b45514 RP |
949 | @subsubsection Characters |
950 | A single character may be written as a single quote immediately | |
951 | followed by that character. The same escapes apply to characters as | |
952 | to strings. So if you want to write the character backslash, you | |
953 | must write @kbd{'\\} where the first @code{\} escapes the second | |
b50e59fe RP |
954 | @code{\}. As you can see, the quote is an acute accent, not a |
955 | grave accent. A newline | |
09352a5d RP |
956 | _if__(!_AMD29K__) |
957 | (or semicolon @samp{;}) | |
958 | _fi__(!_AMD29K__) | |
959 | _if__(_AMD29K__) | |
b50e59fe | 960 | (or at sign @samp{@@}) |
09352a5d | 961 | _fi__(_AMD29K__) |
b50e59fe RP |
962 | immediately |
963 | following an acute accent is taken as a literal character and does | |
93b45514 RP |
964 | not count as the end of a statement. The value of a character |
965 | constant in a numeric expression is the machine's byte-wide code for | |
966 | that character. @code{as} assumes your character code is ASCII: @kbd{'A} | |
b50e59fe | 967 | means 65, @kbd{'B} means 66, and so on. @refill |
93b45514 | 968 | |
b50e59fe | 969 | @node Numbers, , Characters, Constants |
93b45514 | 970 | @subsection Number Constants |
b50e59fe | 971 | @code{as} distinguishes three kinds of numbers according to how they |
47342e8f RP |
972 | are stored in the target machine. @emph{Integers} are numbers that |
973 | would fit into an @code{int} in the C language. @emph{Bignums} are | |
974 | integers, but they are stored in more than 32 bits. @emph{Flonums} | |
93b45514 RP |
975 | are floating point numbers, described below. |
976 | ||
977 | @subsubsection Integers | |
b50e59fe RP |
978 | A binary integer is @samp{0b} or @samp{0B} followed by zero or more of |
979 | the binary digits @samp{01}. | |
980 | ||
93b45514 RP |
981 | An octal integer is @samp{0} followed by zero or more of the octal |
982 | digits (@samp{01234567}). | |
983 | ||
984 | A decimal integer starts with a non-zero digit followed by zero or | |
985 | more digits (@samp{0123456789}). | |
986 | ||
987 | A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or | |
988 | more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}. | |
989 | ||
47342e8f | 990 | Integers have the usual values. To denote a negative integer, use |
b50e59fe RP |
991 | the prefix operator @samp{-} discussed under expressions |
992 | (@pxref{Prefix Ops}). | |
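As a minimal sketch of the four integer forms just described, the Python function below classifies a constant by its prefix and returns its value. The name @code{parse_integer} is hypothetical and this is not how @code{as} is implemented. It follows the text strictly, so it rejects @samp{092}, although the @code{.byte} example earlier suggests @code{as} itself may be more lenient about the digits 8 and 9.

```python
import re


def parse_integer(text):
    """Return the value of an integer constant written in one of the four forms."""
    if re.fullmatch(r"0[bB][01]*", text):           # binary: zero or more digits
        digits, base = text[2:], 2
    elif re.fullmatch(r"0[xX][0-9a-fA-F]+", text):  # hexadecimal: one or more digits
        digits, base = text[2:], 16
    elif re.fullmatch(r"0[0-7]*", text):            # octal: zero or more digits
        digits, base = text[1:], 8
    elif re.fullmatch(r"[1-9][0-9]*", text):        # decimal: starts with non-zero digit
        digits, base = text, 10
    else:
        raise ValueError("not an integer constant: %r" % text)
    # A prefix followed by zero digits denotes zero.
    return int(digits, base) if digits else 0
```

For example, @samp{74}, @samp{0112}, @samp{0x4A}, @samp{0X4a} and @samp{0b1001010} all yield 74.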
93b45514 RP |
993 | |
994 | @subsubsection Bignums | |
995 | A @dfn{bignum} has the same syntax and semantics as an integer | |
996 | except that the number (or its negative) takes more than 32 bits to | |
997 | represent in binary. The distinction is made because in some places | |
998 | integers are permitted while bignums are not. | |
999 | ||
1000 | @subsubsection Flonums | |
b50e59fe RP |
1001 | A @dfn{flonum} represents a floating point number. The translation is |
1002 | complex: a decimal floating point number from the text is converted by | |
1003 | @code{as} to a generic binary floating point number of more than | |
1004 | sufficient precision. This generic floating point number is converted | |
1005 | to a particular computer's floating point format (or formats) by a | |
1006 | portion of @code{as} specialized to that computer. | |
93b45514 RP |
1007 | |
1008 | A flonum is written by writing (in order) | |
1009 | @itemize @bullet | |
1010 | @item | |
1011 | The digit @samp{0}. | |
1012 | @item | |
09352a5d | 1013 | _if__(_AMD29K__) |
b50e59fe RP |
1014 | One of the letters @samp{DFPRSX} (in upper or lower case), to tell |
1015 | @code{as} the rest of the number is a flonum. | |
09352a5d RP |
1016 | _fi__(_AMD29K__) |
1017 | _if__(!_AMD29K__) | |
b50e59fe RP |
1018 | A letter, to tell @code{as} the rest of the number is a flonum. @kbd{e} |
1019 | is recommended. Case is not important. (Any otherwise illegal letter | |
1020 | will work here, but that might be changed. Vax BSD 4.2 assembler seems | |
1021 | to allow any of @samp{defghDEFGH}.) | |
09352a5d | 1022 | _fi__(!_AMD29K__) |
93b45514 RP |
1023 | @item |
1024 | An optional sign: either @samp{+} or @samp{-}. | |
1025 | @item | |
47342e8f | 1026 | An optional @dfn{integer part}: zero or more decimal digits. |
93b45514 | 1027 | @item |
47342e8f | 1028 | An optional @dfn{fraction part}: @samp{.} followed by zero |
93b45514 RP |
1029 | or more decimal digits. |
1030 | @item | |
1031 | An optional exponent, consisting of: | |
1032 | @itemize @bullet | |
1033 | @item | |
09352a5d | 1034 | _if__(_AMD29K__) |
b50e59fe | 1035 | An @samp{E} or @samp{e}. |
09352a5d | 1036 | _if__(!_AMD29K__) |
93b45514 RP |
1037 | A letter; the exact significance varies according to |
1038 | the computer that executes the program. @code{as} | |
1039 | accepts any letter for now. Case is not important. | |
09352a5d | 1040 | _fi__(!_AMD29K__) |
93b45514 RP |
1041 | @item |
1042 | Optional sign: either @samp{+} or @samp{-}. | |
1043 | @item | |
1044 | One or more decimal digits. | |
1045 | @end itemize | |
1046 | @end itemize | |
1047 | ||
1048 | At least one of @var{integer part} or @var{fraction part} must be | |
47342e8f | 1049 | present. The floating point number has the usual base-10 value. |
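The flonum syntax above can be sketched as a regular expression plus a conversion to an exact decimal value. This Python illustration is not the conversion @code{as} performs; the names @code{FLONUM} and @code{parse_flonum} are hypothetical, any letter is accepted in both letter positions (the more permissive of the two configurations), and ``at least one part present'' is assumed to mean at least one digit in the integer or fraction part.

```python
import re
from decimal import Decimal

FLONUM = re.compile(
    r"0[A-Za-z]"                          # the digit 0, then a letter
    r"(?P<sign>[+-]?)"                    # optional sign
    r"(?P<int>[0-9]*)"                    # optional integer part
    r"(?:\.(?P<frac>[0-9]*))?"            # optional fraction part
    r"(?:[A-Za-z](?P<exp>[+-]?[0-9]+))?"  # optional exponent
)


def parse_flonum(text):
    """Return the base-10 value of a flonum as an exact Decimal."""
    m = FLONUM.fullmatch(text)
    if m is None:
        raise ValueError("not a flonum: %r" % text)
    int_part = m.group("int")
    frac_part = m.group("frac") or ""
    if not int_part and not frac_part:
        # Assumption: a flonum needs at least one digit before the exponent.
        raise ValueError("flonum has neither integer nor fraction digits")
    exponent = m.group("exp") or "0"
    return Decimal("%s%s.%se%s" % (m.group("sign"), int_part or "0",
                                   frac_part or "0", exponent))
```

@code{Decimal} is used here so that the sketch, like @code{as}, does not depend on the host's floating point hardware to read the digits.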
93b45514 | 1050 | |
47342e8f RP |
1051 | @code{as} does all processing using integers. Flonums are computed |
1052 | independently of any floating point hardware in the computer running | |
1053 | @code{as}. | |
93b45514 | 1054 | |
b50e59fe | 1055 | @node Segments, Symbols, Syntax, Top |
47342e8f | 1056 | @chapter Segments and Relocation |
b50e59fe RP |
1057 | @menu |
1058 | * Segs Background:: Background | |
1059 | * ld Segments:: ld Segments | |
1060 | * as Segments:: as Internal Segments | |
1061 | * Sub-Segments:: Sub-Segments | |
1062 | * bss:: bss Segment | |
1063 | @end menu | |
1064 | ||
1065 | @node Segs Background, ld Segments, Segments, Segments | |
1066 | @section Background | |
47342e8f RP |
1067 | Roughly, a segment is a range of addresses, with no gaps; all data |
1068 | ``in'' those addresses is treated the same for some particular purpose. | |
1069 | For example there may be a ``read only'' segment. | |
93b45514 RP |
1070 | |
1071 | The linker @code{ld} reads many object files (partial programs) and | |
1072 | combines their contents to form a runnable program. When @code{as} | |
47342e8f RP |
1073 | emits an object file, the partial program is assumed to start at address |
1074 | 0. @code{ld} will assign the final addresses the partial program | |
1075 | occupies, so that different partial programs don't overlap. This is | |
1076 | actually an over-simplification, but it will suffice to explain how | |
1077 | @code{as} uses segments. | |
93b45514 RP |
1078 | |
1079 | @code{ld} moves blocks of bytes of your program to their run-time | |
1080 | addresses. These blocks slide to their run-time addresses as rigid | |
47342e8f RP |
1081 | units; their length does not change and neither does the order of bytes |
1082 | within them. Such a rigid unit is called a @emph{segment}. Assigning | |
1083 | run-time addresses to segments is called @dfn{relocation}. It includes | |
1084 | the task of adjusting mentions of object-file addresses so they refer to | |
1085 | the proper run-time addresses. | |
93b45514 | 1086 | |
b50e59fe RP |
1087 | An object file written by @code{as} has three segments, any of which may |
1088 | be empty. These are named @dfn{text}, @dfn{data} and @dfn{bss} | |
93b45514 | 1089 | segments. Within the object file, the text segment starts at |
b50e59fe RP |
1090 | address @code{0}, the data segment follows, and the bss segment |
1091 | follows the data segment. | |
93b45514 RP |
1092 | |
1093 | To let @code{ld} know which data will change when the segments are | |
1094 | relocated, and how to change that data, @code{as} also writes to the | |
1095 | object file details of the relocation needed. To perform relocation | |
47342e8f RP |
1096 | @code{ld} must know, each time an address in the object |
1097 | file is mentioned: | |
93b45514 RP |
1098 | @itemize @bullet |
1099 | @item | |
47342e8f RP |
1100 | Where in the object file is the beginning of this reference to |
1101 | an address? | |
93b45514 | 1102 | @item |
47342e8f | 1103 | How long (in bytes) is this reference? |
93b45514 | 1104 | @item |
b50e59fe RP |
1105 | Which segment does the address refer to? What is the numeric value of |
1106 | @display | |
1107 | (@var{address}) @minus{} (@var{start-address of segment})? | |
1108 | @end display | |
93b45514 | 1109 | @item |
b50e59fe | 1110 | Is the reference to an address ``Program-Counter relative''? |
93b45514 RP |
1111 | @end itemize |
1112 | ||
47342e8f | 1113 | In fact, every address @code{as} ever uses is expressed as |
b50e59fe RP |
1114 | @code{(@var{segment}) + (@var{offset into segment})}. Further, every |
1115 | expression @code{as} computes is of this segmented nature. | |
47342e8f | 1116 | @dfn{Absolute expression} means an expression with segment ``absolute'' |
b50e59fe RP |
1117 | (@pxref{ld Segments}). A @dfn{pass1 expression} means an expression |
1118 | with segment ``pass1'' (@pxref{as Segments}). In this manual we use the | |
47342e8f | 1119 | notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment |
b50e59fe | 1120 | @var{segname}''. |
93b45514 RP |
1121 | |
1122 | Apart from text, data and bss segments you need to know about the | |
1123 | @dfn{absolute} segment. When @code{ld} mixes partial programs, | |
47342e8f | 1124 | addresses in the absolute segment remain unchanged. That is, address |
b50e59fe | 1125 | @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}. |
47342e8f | 1126 | Although two partial programs' data segments will not overlap addresses |
b50e59fe RP |
1127 | after linking, @emph{by definition} their absolute segments will overlap. |
1128 | Address @code{@{absolute@ 239@}} in one partial program will always be the same | |
1129 | address when the program is running as address @code{@{absolute@ 239@}} in any | |
47342e8f RP |
1130 | other partial program. |
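The @{@var{segname} @var{N}@} notation can be illustrated with a few lines of Python. This is a sketch only: the function name @code{run_time_address} and the base addresses used below are hypothetical, and real relocation by @code{ld} involves more than this addition.

```python
def run_time_address(address, segment_bases):
    """Map a (segment name, offset) pair to a run-time address.

    `segment_bases` gives the start address assigned to each relocatable
    segment of one partial program (hypothetical values chosen by the caller).
    """
    segment, offset = address
    if segment == "absolute":
        return offset  # absolute addresses are left unchanged
    if segment == "undefined":
        raise ValueError("cannot relocate until the symbol is defined")
    return segment_bases[segment] + offset
```

Given two partial programs with different bases, @code{("absolute", 239)} maps to 239 in both, while @code{("data", 4)} maps to a different run-time address in each.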
1131 | ||
1132 | The idea of segments is extended to the @dfn{undefined} segment. Any | |
1133 | address whose segment is unknown at assembly time is by definition | |
1134 | rendered @{undefined @var{U}@}---where @var{U} will be filled in later. | |
1135 | Since numbers are always defined, the only way to generate an undefined | |
93b45514 RP |
1136 | address is to mention an undefined symbol. A reference to a named |
1137 | common block would be such a symbol: its value is unknown at assembly | |
47342e8f | 1138 | time so it has segment @emph{undefined}. |
93b45514 | 1139 | |
b50e59fe | 1140 | By analogy the word @emph{segment} is used to describe groups of segments in |
47342e8f | 1141 | the linked program. @code{ld} puts all partial programs' text |
93b45514 | 1142 | segments in contiguous addresses in the linked program. It is |
47342e8f | 1143 | customary to refer to the @emph{text segment} of a program, meaning all |
93b45514 RP |
1144 | the addresses of all partial programs' text segments. Likewise for | |
1145 | data and bss segments. | |
1146 | ||
93b45514 RP |
1147 | Some segments are manipulated by @code{ld}; others are invented for |
1148 | use of @code{as} and have no meaning except during assembly. | |
1149 | ||
b50e59fe RP |
1150 | @menu |
1151 | * ld Segments:: ld Segments | |
1152 | * as Segments:: as Internal Segments | |
1153 | * Sub-Segments:: Sub-Segments | |
1154 | * bss:: bss Segment | |
1155 | @end menu | |
47342e8f | 1156 | |
b50e59fe RP |
1157 | @node ld Segments, as Segments, Segs Background, Segments |
1158 | @section ld Segments | |
1159 | @code{ld} deals with just five kinds of segments, summarized below. | |
1160 | ||
1161 | @table @strong | |
47342e8f | 1162 | |
93b45514 RP |
1163 | @item text segment |
1164 | @itemx data segment | |
47342e8f RP |
1165 | These segments hold your program. @code{as} and @code{ld} treat them as |
1166 | separate but equal segments. Anything you can say of one segment is | |
b50e59fe RP |
1167 | true of the other. When the program is running, however, it is |
1168 | customary for the text segment to be unalterable. The | |
1169 | text segment is often shared among processes: it will contain | |
1170 | instructions, constants and the like. The data segment of a running | |
1171 | program is usually alterable: for example, C variables would be stored | |
1172 | in the data segment. | |
47342e8f RP |
1173 | |
1174 | @item bss segment | |
1175 | This segment contains zeroed bytes when your program begins running. It | |
1176 | is used to hold uninitialized variables or common storage. The length of | |
1177 | each partial program's bss segment is important, but because it starts | |
1178 | out containing zeroed bytes there is no need to store explicit zero | |
b50e59fe | 1179 | bytes in the object file. The bss segment was invented to eliminate |
47342e8f RP |
1180 | those explicit zeros from object files. |
1181 | ||
1182 | @item absolute segment | |
1183 | Address 0 of this segment is always ``relocated'' to runtime address 0. | |
1184 | This is useful if you want to refer to an address that @code{ld} must | |
1185 | not change when relocating. In this sense we speak of absolute | |
1186 | addresses being ``unrelocatable'': they don't change during relocation. | |
1187 | ||
b50e59fe | 1188 | @item @code{undefined} segment |
47342e8f RP |
1189 | This ``segment'' is a catch-all for address references to objects not in |
1190 | the preceding segments. | |
1191 | @c FIXME: ref to some other doc on obj-file formats could go here. | |
1192 | ||
93b45514 | 1193 | @end table |
47342e8f | 1194 | |
93b45514 | 1195 | An idealized example of the 3 relocatable segments follows. Memory |
47342e8f | 1196 | addresses are on the horizontal axis. |
93b45514 | 1197 | |
b50e59fe | 1198 | @ifinfo |
93b45514 RP |
1199 | @example |
1200 | +-----+----+--+ | |
1201 | partial program # 1: |ttttt|dddd|00| | |
1202 | +-----+----+--+ | |
1203 | ||
1204 | text data bss | |
1205 | seg. seg. seg. | |
1206 | ||
1207 | +---+---+---+ | |
1208 | partial program # 2: |TTT|DDD|000| | |
1209 | +---+---+---+ | |
1210 | ||
1211 | +--+---+-----+--+----+---+-----+~~ | |
1212 | linked program: | |TTT|ttttt| |dddd|DDD|00000| | |
1213 | +--+---+-----+--+----+---+-----+~~ | |
1214 | ||
1215 | addresses: 0 @dots{} | |
1216 | @end example | |
b50e59fe RP |
1217 | @end ifinfo |
1218 | @tex | |
1219 | \halign{\hfil\rm #\quad&#\cr | |
1220 | \cr | |
1221 | &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr | |
1222 | Partial program \#1: | |
1223 | &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr | |
1224 | \cr | |
1225 | &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr | |
1226 | Partial program \#2: | |
1227 | &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDDD}\boxit{1cm}{\tt 000}\cr | |
1228 | \cr | |
1229 | &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr | |
1230 | linked program: | |
1231 | &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt | |
1232 | ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt | |
1233 | DDDD}\boxit{2cm}{00000}\ \dots\cr | |
1234 | addresses: | |
1235 | &\dots\cr | |
1236 | } | |
1237 | @end tex | |
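The idealized picture above groups all text segments, then all data segments, then all bss segments. A minimal Python sketch of that grouping follows. The function name @code{lay_out} is hypothetical, and the sketch ignores the gaps and the exact ordering shown in the diagram; it is not a description of how @code{ld} chooses addresses.

```python
def lay_out(programs, start=0):
    """Assign a base address to every segment of every partial program.

    `programs` is a list of (name, sizes) pairs in link order, where
    `sizes` maps "text", "data" and "bss" to a length in bytes.
    Returns a dict keyed by (name, kind).
    """
    bases = {}
    address = start
    for kind in ("text", "data", "bss"):   # one group per kind of segment
        for name, sizes in programs:       # partial programs in link order
            bases[(name, kind)] = address
            address += sizes[kind]         # segments are rigid: length is kept
    return bases
```

With the segment lengths from the diagram (3, 3, 3 for one partial program and 5, 4, 2 for the other), the text segments occupy addresses 0 to 7, the data segments 8 to 14, and the bss segments 15 to 19.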
93b45514 | 1238 | |
b50e59fe RP |
1239 | @node as Segments, Sub-Segments, ld Segments, Segments |
1240 | @section as Internal Segments | |
93b45514 RP |
1241 | These segments are invented for the internal use of @code{as}. They |
1242 | have no meaning at run-time. You don't need to know about these | |
1243 | segments except that they might be mentioned in @code{as}' warning | |
1244 | messages. These segments are invented to permit the value of every | |
1245 | expression in your assembly language program to be a segmented | |
1246 | address. | |
1247 | ||
47342e8f RP |
1248 | @table @b |
1249 | @item absent segment | |
1250 | An expression was expected and none was | |
1251 | found. | |
1252 | ||
1253 | @item goof segment | |
1254 | An internal assembler logic error has been | |
1255 | found. This means there is a bug in the assembler. | |
1256 | ||
93b45514 | 1257 | @item grand segment |
47342e8f RP |
1258 | A @dfn{grand number} is a bignum or a flonum, but not an integer. If a |
1259 | number can't be written as a C @code{int} constant, it is a grand | |
1260 | number. @code{as} has to remember that a flonum or a bignum does not | |
b50e59fe | 1261 | fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an |
47342e8f | 1262 | expression: this is done by making a flonum or bignum be in segment |
b50e59fe | 1263 | grand. This is purely for internal @code{as} convenience; grand |
47342e8f RP |
1264 | segment behaves similarly to absolute segment. |
1265 | ||
1266 | @item pass1 segment | |
93b45514 | 1267 | The expression was impossible to evaluate in the first pass. The |
47342e8f RP |
1268 | assembler will attempt a second pass (second reading of the source) to |
1269 | evaluate the expression. Your expression mentioned an undefined symbol | |
1270 | in a way that defies the one-pass (segment + offset in segment) assembly | |
1271 | process. No compiler need emit such an expression. | |
1272 | ||
b50e59fe RP |
1273 | @quotation |
1274 | @emph{Warning:} the second pass is currently not implemented. @code{as} | |
1275 | will abort with an error message if one is required. | |
1276 | @end quotation | |
47342e8f RP |
1277 | |
1278 | @item difference segment | |
93b45514 | 1279 | As an assist to the C compiler, expressions of the forms |
b50e59fe RP |
1280 | @display |
1281 | (@var{undefined symbol}) @minus{} (@var{expression}) | |
1282 | (@var{something}) @minus{} (@var{undefined symbol}) | |
1283 | (@var{undefined symbol}) @minus{} (@var{undefined symbol}) | |
1284 | @end display | |
1285 | are permitted, and belong to the difference segment. @code{as} | |
47342e8f RP |
1286 | re-evaluates such expressions after the source file has been read and |
1287 | the symbol table built. If by that time there are no undefined symbols | |
1288 | in the expression then the expression assumes a new segment. The | |
1289 | intention is to permit statements like | |
1290 | @samp{.word label - base_of_table} | |
1291 | to be assembled in one pass where both @code{label} and | |
1292 | @code{base_of_table} are undefined. This is useful for compiling C and | |
1293 | Algol switch statements, Pascal case statements, FORTRAN computed goto | |
1294 | statements and the like. | |
93b45514 RP |
1295 | @end table |
1296 | ||
b50e59fe | 1297 | @node Sub-Segments, bss, as Segments, Segments |
93b45514 | 1298 | @section Sub-Segments |
b50e59fe RP |
1299 | Assembled bytes fall into two segments: text and data. |
1300 | Because you may have groups of text or data that you want to end up near | |
1301 | to each other in the object file, @code{as} allows you to use | |
93b45514 | 1302 | @dfn{subsegments}. Within each segment, there can be numbered |
b50e59fe RP |
1303 | subsegments with values from 0 to 8192. Objects assembled into the same |
1304 | subsegment will be grouped with other objects in the same subsegment | |
1305 | when they are all put into the object file. For example, a compiler | |
1306 | might want to store constants in the text segment, but might not want to | |
1307 | have them interspersed with the program being assembled. In this case, | |
1308 | the compiler could issue a @code{.text 0} before each section of code | |
1309 | being output, and a @code{.text 1} before each group of constants being | |
1310 | output. | |
1311 | ||
1312 | Subsegments are optional. If you don't use subsegments, everything | |
93b45514 RP |
1313 | will be stored in subsegment number zero. |
1314 | ||
09352a5d RP |
1315 | _if__(!_AMD29K__) |
1316 | Each subsegment is zero-padded up to a multiple of four bytes. | |
1317 | (Subsegments may be padded a different amount on different flavors | |
1318 | of @code{as}.) | |
1319 | _fi__(!_AMD29K__) | |
1320 | _if__(_AMD29K__) | |
b50e59fe RP |
1321 | On the AMD 29K family, no particular padding is added to segment sizes; |
1322 | GNU @code{as} forces no alignment on this platform. | |
09352a5d | 1323 | _fi__(_AMD29K__) |
b50e59fe RP |
1324 | Subsegments appear in your object file in numeric order, lowest numbered |
1325 | to highest. (All this to be compatible with other people's assemblers.) | |
1326 | The object file contains no representation of subsegments; @code{ld} and | |
1327 | other programs that manipulate object files will see no trace of them. | |
1328 | They just see all your text subsegments as a text segment, and all your | |
1329 | data subsegments as a data segment. | |
93b45514 RP |
1330 | |
1331 | To specify which subsegment you want subsequent statements assembled | |
1332 | into, use a @samp{.text @var{expression}} or a @samp{.data | |
1333 | @var{expression}} statement. @var{Expression} should be an absolute | |
1334 | expression. (@xref{Expressions}.) If you just say @samp{.text} | |
1335 | then @samp{.text 0} is assumed. Likewise @samp{.data} means | |
1336 | @samp{.data 0}. Assembly begins in @code{text 0}. | |
1337 | For instance: | |
1338 | @example | |
1339 | .text 0 # The default subsegment is text 0 anyway. | |
1340 | .ascii "This lives in the first text subsegment. *" | |
1341 | .text 1 | |
1342 | .ascii "But this lives in the second text subsegment." | |
1343 | .data 0 | |
1344 | .ascii "This lives in the data segment," | |
1345 | .ascii "in the first data subsegment." | |
1346 | .text 0 | |
1347 | .ascii "This lives in the first text subsegment," | |
1348 | .ascii "immediately following the asterisk (*)." | |
1349 | @end example | |
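The grouping rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described in this section, not the code @code{as} itself uses: the function name @code{layout_segment}, the shape of its input, and the four-byte padding default (which the text says varies between flavors of @code{as}) are assumptions made for illustration.

```python
def layout_segment(chunks, pad_to=4):
    """Concatenate (subsegment_number, data) chunks for ONE segment.

    chunks: list of (int, bytes) pairs in source order.
    pad_to: assumed padding multiple for each subsegment (4 in the text;
            pass 1 to model a flavor that adds no padding).
    """
    groups = {}
    for number, data in chunks:
        # Bytes for the same subsegment are appended in source order.
        groups.setdefault(number, bytearray()).extend(data)

    image = bytearray()
    for number in sorted(groups):  # lowest numbered subsegment first
        body = groups[number]
        body.extend(b"\0" * (-len(body) % pad_to))  # zero-pad this subsegment
        image.extend(body)
    return bytes(image)


# Mirrors the shape of the example: text 0, then text 1, then text 0 again.
text = layout_segment([(0, b"AB*"), (1, b"second"), (0, b"after")])
```

Here the two @code{text 0} chunks end up adjacent, ahead of the @code{text 1} chunk, which is the effect the asterisk in the example is meant to show.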
1350 | ||
b50e59fe RP |
1351 | Each segment has a @dfn{location counter} incremented by one for every |
1352 | byte assembled into that segment. Because subsegments are merely a | |
1353 | convenience restricted to @code{as} there is no concept of a subsegment | |
1354 | location counter. There is no way to directly manipulate a location | |
1355 | counter---but the @code{.align} directive will change it, and any label | |
1356 | definition will capture its current value. The location counter of the | |
1357 | segment that statements are being assembled into is said to be the | |
93b45514 RP |
1358 | @dfn{active} location counter. |
1359 | ||
b50e59fe RP |
1360 | @node bss, , Sub-Segments, Segments |
1361 | @section bss Segment | |
1362 | The bss segment is used for local common variable storage. | |
1363 | You may allocate address space in the bss segment, but you may | |
93b45514 | 1364 | not dictate data to load into it before your program executes. When |
b50e59fe | 1365 | your program starts running, all the contents of the bss |
93b45514 RP |
1366 | segment are zeroed bytes. |
1367 | ||
47342e8f | 1368 | Addresses in the bss segment are allocated with special directives; |
93b45514 | 1369 | you may not assemble anything directly into the bss segment. Hence |
47342e8f | 1370 | there are no bss subsegments. @xref{Comm}; @pxref{Lcomm}. |
93b45514 | 1371 | |
b50e59fe | 1372 | @node Symbols, Expressions, Segments, Top |
93b45514 | 1373 | @chapter Symbols |
47342e8f RP |
1374 | Symbols are a central concept: the programmer uses symbols to name |
1375 | things, the linker uses symbols to link, and the debugger uses symbols | |
1376 | to debug. | |
1377 | ||
b50e59fe RP |
1378 | @quotation |
1379 | @emph{Warning:} @code{as} does not place symbols in the object file in | |
1380 | the same order they were declared. This may break some debuggers. | |
1381 | @end quotation | |
93b45514 | 1382 | |
b50e59fe RP |
1383 | @menu |
1384 | * Labels:: Labels | |
1385 | * Setting Symbols:: Giving Symbols Other Values | |
1386 | * Symbol Names:: Symbol Names | |
1387 | * Dot:: The Special Dot Symbol | |
1388 | * Symbol Attributes:: Symbol Attributes | |
1389 | @end menu | |
1390 | ||
1391 | @node Labels, Setting Symbols, Symbols, Symbols | |
93b45514 RP |
1392 | @section Labels |
1393 | A @dfn{label} is written as a symbol immediately followed by a colon | |
b50e59fe | 1394 | @samp{:}. The symbol then represents the current value of the |
93b45514 RP |
1395 | active location counter, and is, for example, a suitable instruction |
1396 | operand. You are warned if you use the same symbol to represent two | |
1397 | different locations: the first definition overrides any other | |
1398 | definitions. | |
1399 | ||
b50e59fe | 1400 | @node Setting Symbols, Symbol Names, Labels, Symbols |
93b45514 | 1401 | @section Giving Symbols Other Values |
b50e59fe RP |
1402 | A symbol can be given an arbitrary value by writing a symbol, followed |
1403 | by an equals sign @samp{=}, followed by an expression | |
93b45514 | 1404 | (@pxref{Expressions}). This is equivalent to using the @code{.set} |
b50e59fe | 1405 | directive. @xref{Set}. |
93b45514 | 1406 | |
b50e59fe | 1407 | @node Symbol Names, Dot, Setting Symbols, Symbols |
93b45514 RP |
1408 | @section Symbol Names |
1409 | Symbol names begin with a letter or with one of @samp{$._}. That | |
1410 | character may be followed by any string of digits, letters, | |
1411 | underscores and dollar signs. Case of letters is significant: | |
1412 | @code{foo} is a different symbol name than @code{Foo}. | |
1413 | ||
09352a5d | 1414 | _if__(_AMD29K__) |
b50e59fe RP |
1415 | For the AMD 29K family, @samp{?} is also allowed in the |
1416 | body of a symbol name, though not at its beginning. | |
09352a5d | 1417 | _fi__(_AMD29K__) |
b50e59fe | 1418 | |
47342e8f RP |
1419 | Each symbol has exactly one name. Each name in an assembly language |
1420 | program refers to exactly one symbol. You may use that symbol name any | |
1421 | number of times in a program. | |
93b45514 | 1422 | |
b50e59fe RP |
1423 | @menu |
1424 | * Local Symbols:: Local Symbol Names | |
1425 | @end menu | |
1426 | ||
1427 | @node Local Symbols, , Symbol Names, Symbol Names | |
93b45514 RP |
1428 | @subsection Local Symbol Names |
1429 | ||
1430 | Local symbols help compilers and programmers use names temporarily. | |
b50e59fe RP |
1431 | There are ten local symbol names, which are re-used throughout the |
1432 | program. You may refer to them using the names @samp{0} @samp{1} | |
1433 | @dots{} @samp{9}. To define a local symbol, write a label of the form | |
1434 | @samp{@b{N}:} (where @b{N} represents any digit). To refer to the most | |
1435 | recent previous definition of that symbol write @samp{@b{N}b}, using the | |
1436 | same digit as when you defined the label. To refer to the next | |
1437 | definition of a local label, write @samp{@b{N}f}---where @b{N} gives you | |
1438 | a choice of 10 forward references. The @samp{b} stands for | |
1439 | ``backwards'' and the @samp{f} stands for ``forwards''. | |
1440 | ||
1441 | Local symbols are not emitted by the current GNU C compiler. | |
93b45514 RP |
1442 | |
1443 | There is no restriction on how you can use these labels, but | |
1444 | remember that at any point in the assembly you can refer to at most | |
1445 | 10 prior local labels and to at most 10 forward local labels. | |
1446 | ||
47342e8f | 1447 | Local symbol names are only a notation device. They are immediately |
93b45514 | 1448 | transformed into more conventional symbol names before the assembler |
47342e8f RP |
1449 | uses them. The symbol names stored in the symbol table, appearing in |
1450 | error messages and optionally emitted to the object file have these | |
1451 | parts: | |
1452 | ||
1453 | @table @code | |
93b45514 RP |
1454 | @item L |
1455 | All local labels begin with @samp{L}. Normally both @code{as} and | |
1456 | @code{ld} forget symbols that start with @samp{L}. These labels are | |
1457 | used for symbols you are never intended to see. If you give the | |
1458 | @samp{-L} option then @code{as} will retain these symbols in the | |
b50e59fe | 1459 | object file. If you also instruct @code{ld} to retain these symbols, |
93b45514 | 1460 | you may use them in debugging. |
47342e8f RP |
1461 | |
1462 | @item @var{digit} | |
93b45514 RP |
1463 | If the label is written @samp{0:} then the digit is @samp{0}. |
1464 | If the label is written @samp{1:} then the digit is @samp{1}. | |
1465 | And so on up through @samp{9:}. | |
47342e8f RP |
1466 | |
1467 | @item @ctrl{A} | |
93b45514 RP |
1468 | This unusual character is included so you don't accidentally invent |
1469 | a symbol of the same name. The character has ASCII value | |
1470 | @samp{\001}. | |
47342e8f RP |
1471 | |
1472 | @item @emph{ordinal number} | |
1473 | This is a serial number to keep the labels distinct. The first | |
93b45514 | 1474 | @samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the |
47342e8f | 1475 | number @samp{15}; @emph{etc.} Likewise for the other labels @samp{1:} |
93b45514 RP |
1476 | through @samp{9:}. |
1477 | @end table | |
47342e8f RP |
1478 | |
1479 | For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th | |
1480 | @code{3:} is named @code{L3@ctrl{A}44}. | |
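The naming scheme can be sketched as follows. The class @code{LocalLabels} and its method names are hypothetical, invented only to illustrate the four parts listed in the table (@samp{L}, digit, @ctrl{A}, ordinal number); the sketch makes no claim about how @code{as} stores these names internally.

```python
from collections import defaultdict

CTRL_A = "\x01"  # the "unusual character" with ASCII value \001


class LocalLabels:
    """Hypothetical helper that counts definitions of each local label digit."""

    def __init__(self):
        self.count = defaultdict(int)  # digit -> definitions seen so far

    def define(self, digit):
        """Handle 'N:' and return the conventional name it becomes."""
        self.count[digit] += 1
        return self._name(digit, self.count[digit])

    def backward(self, digit):
        """Handle 'Nb': the most recent previous definition of N."""
        if self.count[digit] == 0:
            raise ValueError("no previous definition of %d:" % digit)
        return self._name(digit, self.count[digit])

    def forward(self, digit):
        """Handle 'Nf': the next definition of N, not yet seen."""
        return self._name(digit, self.count[digit] + 1)

    @staticmethod
    def _name(digit, ordinal):
        return "L%d%s%d" % (digit, CTRL_A, ordinal)
```

A reference such as @samp{1f} therefore resolves to a name that does not exist yet, and the following @samp{1:} supplies it.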
93b45514 | 1481 | |
b50e59fe | 1482 | @node Dot, Symbol Attributes, Symbol Names, Symbols |
93b45514 RP |
1483 | @section The Special Dot Symbol |
1484 | ||
b50e59fe | 1485 | The special symbol @samp{.} refers to the current address that |
93b45514 | 1486 | @code{as} is assembling into. Thus, the expression @samp{melvin: |
b50e59fe | 1487 | .long .} will cause @code{melvin} to contain its own address. |
93b45514 RP |
1488 | Assigning a value to @code{.} is treated the same as a @code{.org} |
1489 | directive. Thus, the expression @samp{.=.+4} is the same as saying | |
09352a5d RP |
1490 | _if__(!_AMD29K__) |
1491 | @samp{.space 4}. | |
1492 | _fi__(!_AMD29K__) | |
1493 | _if__(_AMD29K__) | |
b50e59fe | 1494 | @samp{.block 4}. |
09352a5d | 1495 | _fi__(_AMD29K__) |
b50e59fe RP |
1496 | |
1497 | @node Symbol Attributes, , Dot, Symbols | |
93b45514 | 1498 | @section Symbol Attributes |
47342e8f | 1499 | Every symbol has these attributes: Value, Type, Descriptor, and ``Other''. |
09352a5d RP |
1500 | _if__(_INTERNALS__) |
1501 | The detailed definitions are in _0__<a.out.h>_1__. | |
1502 | _fi__(_INTERNALS__) | |
93b45514 RP |
1503 | |
1504 | If you use a symbol without defining it, @code{as} assumes zero for | |
1505 | all these attributes, and probably won't warn you. This makes the | |
1506 | symbol an externally defined symbol, which is generally what you | |
1507 | would want. | |
1508 | ||
b50e59fe RP |
1509 | @menu |
1510 | * Symbol Value:: Value | |
1511 | * Symbol Type:: Type | |
1512 | * Symbol Desc:: Descriptor | |
1513 | * Symbol Other:: Other | |
1514 | @end menu | |
1515 | ||
1516 | @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes | |
93b45514 | 1517 | @subsection Value |
47342e8f | 1518 | The value of a symbol is (usually) 32 bits, the size of one GNU C |
93b45514 | 1519 | @code{int}. For a symbol which labels a location in the |
b50e59fe | 1520 | text, data, bss or absolute segments the |
93b45514 | 1521 | value is the number of addresses from the start of that segment to |
b50e59fe | 1522 | the label. Naturally for text, data and bss |
93b45514 | 1523 | segments the value of a symbol changes as @code{ld} changes segment |
b50e59fe | 1524 | base addresses during linking. Absolute symbols' values do |
93b45514 RP |
1525 | not change during linking: that is why they are called absolute. |
1526 | ||
b50e59fe RP |
1527 | The value of an undefined symbol is treated in a special way. If it is |
1528 | 0 then the symbol is not defined in this assembler source program, and | |
1529 | @code{ld} will try to determine its value from other programs it is | |
1530 | linked with. You make this kind of symbol simply by mentioning a symbol | |
1531 | name without defining it. A non-zero value represents a @code{.comm} | |
1532 | common declaration. The value is how much common storage to reserve, in | |
1533 | bytes (addresses). The symbol refers to the first address of the | |
1534 | allocated storage. | |
93b45514 | 1535 | |
b50e59fe | 1536 | @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes |
93b45514 RP |
1537 | @subsection Type |
1538 | The type attribute of a symbol is 8 bits encoded in a devious way. | |
1539 | We kept this coding standard for compatibility with older operating | |
1540 | systems. | |
1541 | ||
b50e59fe | 1542 | @ifinfo |
93b45514 RP |
1543 | @example |
1544 | ||
1545 | 7 6 5 4 3 2 1 0 bit numbers | |
1546 | +-----+-----+-----+-----+-----+-----+-----+-----+ | |
1547 | | | | | | |
1548 | | N_STAB bits | N_TYPE bits |N_EXT| | |
1549 | | | | bit | | |
1550 | +-----+-----+-----+-----+-----+-----+-----+-----+ | |
1551 | ||
b50e59fe | 1552 | Type byte |
93b45514 | 1553 | @end example |
b50e59fe RP |
1554 | @end ifinfo |
1555 | @tex | |
1556 | \vskip 1pc | |
1557 | \halign{#\quad&#\cr | |
63f5d795 | 1558 | \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1.1cm}{0}&bit numbers\cr |
b50e59fe | 1559 | \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE} |
63f5d795 | 1560 | bits}\boxit{1.1cm}{\tt N\_EXT}\cr |
b50e59fe RP |
1561 | \hfill {\bf Type} byte\hfill\cr |
1562 | } | |
1563 | @end tex | |
93b45514 | 1564 | |
b50e59fe | 1565 | @subsubsection @code{N_EXT} bit |
47342e8f RP |
1566 | This bit is set if @code{ld} might need to use the symbol's type bits |
1567 | and value. If this bit is off, then @code{ld} can ignore the | |
93b45514 RP |
1568 | symbol while linking. It is set in two cases. If the symbol is |
1569 | undefined, then @code{ld} is expected to find the symbol's value | |
1570 | elsewhere in another program module. Otherwise the symbol has the | |
1571 | value given, but this symbol name and value are revealed to any other | |
1572 | programs linked in the same executable program. This second use of | |
b50e59fe | 1573 | the @code{N_EXT} bit is most often made by a @code{.globl} statement. |
93b45514 | 1574 | |
b50e59fe | 1575 | @subsubsection @code{N_TYPE} bits |
93b45514 RP |
1576 | These establish the symbol's ``type'', which is mainly a relocation |
1577 | concept. Common values are detailed in the manual describing the | |
1578 | executable file format. | |
1579 | ||
b50e59fe | 1580 | @subsubsection @code{N_STAB} bits |
93b45514 RP |
1581 | Common values for these bits are described in the manual on the |
1582 | executable file format. | |
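As an illustration of the layout in the diagram, the three fields can be separated with bit masks. The mask values follow directly from the bit positions drawn there (bit 0, bits 1 to 4, bits 5 to 7); the function name @code{split_type_byte} and the dictionary it returns are assumptions made for this sketch, and the fields are returned masked in place rather than shifted.

```python
N_EXT = 0x01   # bit 0
N_TYPE = 0x1E  # bits 1-4
N_STAB = 0xE0  # bits 5-7


def split_type_byte(type_byte):
    """Split an 8-bit symbol type into its N_STAB, N_TYPE and N_EXT parts."""
    if not 0 <= type_byte <= 0xFF:
        raise ValueError("type attribute must fit in 8 bits")
    return {
        "stab": type_byte & N_STAB,       # left in place, not shifted down
        "type": type_byte & N_TYPE,       # left in place, not shifted down
        "ext": bool(type_byte & N_EXT),   # True if ld may need the symbol
    }
```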
1583 | ||
b50e59fe | 1584 | @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes |
47342e8f | 1585 | @subsection Descriptor |
93b45514 | 1586 | This is an arbitrary 16-bit value. You may establish a symbol's |
47342e8f | 1587 | descriptor value by using a @code{.desc} statement (@pxref{Desc}). |
93b45514 RP |
1588 | A descriptor value means nothing to @code{as}. |
1589 | ||
b50e59fe | 1590 | @node Symbol Other, , Symbol Desc, Symbol Attributes |
93b45514 RP |
1591 | @subsection Other |
1592 | This is an arbitrary 8-bit value. It means nothing to @code{as}. | |
1593 | ||
b50e59fe | 1594 | @node Expressions, Pseudo Ops, Symbols, Top |
93b45514 RP |
1595 | @chapter Expressions |
1596 | An @dfn{expression} specifies an address or numeric value. | |
1597 | Whitespace may precede and/or follow an expression. | |
1598 | ||
b50e59fe RP |
1599 | @menu |
1600 | * Empty Exprs:: Empty Expressions | |
1601 | * Integer Exprs:: Integer Expressions | |
1602 | @end menu | |
1603 | ||
1604 | @node Empty Exprs, Integer Exprs, Expressions, Expressions | |
93b45514 | 1605 | @section Empty Expressions |
47342e8f | 1606 | An empty expression has no value: it is just whitespace or null. |
93b45514 RP |
1607 | Wherever an absolute expression is required, you may omit the |
1608 | expression and @code{as} will assume a value of (absolute) 0. This | |
1609 | is compatible with other assemblers. | |
1610 | ||
b50e59fe | 1611 | @node Integer Exprs, , Empty Exprs, Expressions |
93b45514 | 1612 | @section Integer Expressions |
47342e8f RP |
1613 | An @dfn{integer expression} is one or more @emph{arguments} delimited |
1614 | by @emph{operators}. | |
1615 | ||
b50e59fe RP |
1616 | @menu |
1617 | * Arguments:: Arguments | |
1618 | * Operators:: Operators | |
1619 | * Prefix Ops:: Prefix Operators | |
1620 | * Infix Ops:: Infix Operators | |
1621 | @end menu | |
1622 | ||
1623 | @node Arguments, Operators, Integer Exprs, Integer Exprs | |
47342e8f | 1624 | @subsection Arguments |
93b45514 | 1625 | |
47342e8f RP |
1626 | @dfn{Arguments} are symbols, numbers or subexpressions. In other |
1627 | contexts arguments are sometimes called ``arithmetic operands''. In | |
1628 | this manual, to avoid confusing them with the ``instruction operands'' of | |
1629 | the machine language, we use the term ``argument'' to refer to parts of | |
b50e59fe | 1630 | expressions only, reserving the word ``operand'' to refer only to machine |
47342e8f | 1631 | instruction operands. |
93b45514 | 1632 | |
b50e59fe RP |
1633 | Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where |
1634 | @var{segment} is one of text, data, bss, absolute, | |
1635 | or @code{undefined}. @var{NNN} is a signed, 2's complement 32-bit | |
93b45514 RP |
1636 | integer. |
1637 | ||
1638 | Numbers are usually integers. | |
1639 | ||
1640 | A number can be a flonum or bignum. In this case, you are warned | |
1641 | that only the low order 32 bits are used, and @code{as} pretends | |
1642 | these 32 bits are an integer. You may write integer-manipulating | |
1643 | instructions that act on exotic constants, compatible with other | |
1644 | assemblers. | |
1645 | ||
b50e59fe RP |
1646 | Subexpressions are a left parenthesis @samp{(} followed by an integer |
1647 | expression, followed by a right parenthesis @samp{)}; or a prefix | |
47342e8f | 1648 | operator followed by an argument. |
93b45514 | 1649 | |
b50e59fe | 1650 | @node Operators, Prefix Ops, Arguments, Integer Exprs |
93b45514 | 1651 | @subsection Operators |
b50e59fe RP |
1652 | @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}. Prefix |
1653 | operators are followed by an argument. Infix operators appear | |
47342e8f | 1654 | between their arguments. Operators may be preceded and/or followed by |
93b45514 RP |
1655 | whitespace. |
1656 | ||
b50e59fe RP |
1657 | @node Prefix Ops, Infix Ops, Operators, Integer Exprs |
1658 | @subsection Prefix Operators | |
1659 | @code{as} has the following @dfn{prefix operators}. They each take | |
47342e8f | 1660 | one argument, which must be absolute. |
b50e59fe | 1661 | @table @code |
93b45514 | 1662 | @item - |
b50e59fe | 1663 | @dfn{Negation}. Two's complement negation. |
93b45514 | 1664 | @item ~ |
b50e59fe | 1665 | @dfn{Complementation}. Bitwise not. |
93b45514 RP |
1666 | @end table |
1667 | ||
b50e59fe RP |
1668 | @node Infix Ops, , Prefix Ops, Integer Exprs |
1669 | @subsection Infix Operators | |
47342e8f | 1670 | |
b50e59fe RP |
1671 | @dfn{Infix operators} take two arguments, one on either side. Operators |
1672 | have precedence, but operations with equal precedence are performed left | |
1673 | to right. Apart from @code{+} or @code{-}, both arguments must be | |
1674 | absolute, and the result is absolute. | |
47342e8f | 1675 | |
93b45514 | 1676 | @enumerate |
47342e8f | 1677 | |
93b45514 | 1678 | @item |
47342e8f | 1679 | Highest Precedence |
93b45514 RP |
1680 | @table @code |
1681 | @item * | |
1682 | @dfn{Multiplication}. | |
1683 | @item / | |
1684 | @dfn{Division}. Truncation is the same as the C operator @samp{/}. | |
93b45514 RP |
1685 | @item % |
1686 | @dfn{Remainder}. | |
09352a5d RP |
1687 | @item _0__<_1__ |
1688 | @itemx _0__<<_1__ | |
1689 | @dfn{Shift Left}. Same as the C operator @samp{_0__<<_1__}. | |
1690 | @item _0__>_1__ | |
1691 | @itemx _0__>>_1__ | |
1692 | @dfn{Shift Right}. Same as the C operator @samp{_0__>>_1__}. | |
93b45514 | 1693 | @end table |
47342e8f | 1694 | |
93b45514 | 1695 | @item |
47342e8f RP |
1696 | Intermediate precedence |
1697 | @table @code | |
93b45514 RP |
1698 | @item | |
1699 | @dfn{Bitwise Inclusive Or}. | |
1700 | @item & | |
1701 | @dfn{Bitwise And}. | |
1702 | @item ^ | |
1703 | @dfn{Bitwise Exclusive Or}. | |
1704 | @item ! | |
1705 | @dfn{Bitwise Or Not}. | |
1706 | @end table | |
47342e8f | 1707 | |
93b45514 | 1708 | @item |
47342e8f RP |
1709 | Lowest Precedence |
1710 | @table @code | |
93b45514 | 1711 | @item + |
47342e8f RP |
1712 | @dfn{Addition}. If either argument is absolute, the result |
1713 | has the segment of the other argument. | |
1714 | If either argument is pass1 or undefined, the result is pass1. | |
1715 | Otherwise @code{+} is illegal. | |
93b45514 | 1716 | @item - |
47342e8f RP |
1717 | @dfn{Subtraction}. If the right argument is absolute, the |
1718 | result has the segment of the left argument. | |
1719 | If either argument is pass1 the result is pass1. | |
1720 | If either argument is undefined the result is difference segment. | |
1721 | If both arguments are in the same segment, the result is absolute---provided | |
b50e59fe RP |
1722 | that segment is one of text, data or bss. |
1723 | Otherwise subtraction is illegal. | |
93b45514 RP |
1724 | @end table |
1725 | @end enumerate | |
1726 | ||
b50e59fe | 1727 | The sense of the rule for addition is that it's only meaningful to add |
47342e8f RP |
1728 | the @emph{offsets} in an address; you can only have a defined segment in |
1729 | one of the two arguments. | |
93b45514 | 1730 | |
47342e8f RP |
1731 | Similarly, you can't subtract quantities from two different segments. |
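The segment rules for @code{+} and @code{-} can be written out as a small decision procedure. This sketch assumes the rules are tried in the order they are listed above, which the text does not state outright; the function names and the use of plain strings for segment names are illustrative choices only.

```python
ABSOLUTE, UNDEFINED, PASS1, DIFFERENCE = "absolute", "undefined", "pass1", "difference"
RELOCATABLE = ("text", "data", "bss")


def add_segments(left, right):
    """Segment of (left + right), or ValueError if the sum is illegal."""
    if left == ABSOLUTE:
        return right
    if right == ABSOLUTE:
        return left
    if PASS1 in (left, right) or UNDEFINED in (left, right):
        return PASS1
    raise ValueError("illegal addition: %s + %s" % (left, right))


def sub_segments(left, right):
    """Segment of (left - right), or ValueError if the difference is illegal."""
    if right == ABSOLUTE:
        return left
    if PASS1 in (left, right):
        return PASS1
    if UNDEFINED in (left, right):
        return DIFFERENCE
    if left == right and left in RELOCATABLE:
        return ABSOLUTE  # two offsets within one segment give a plain number
    raise ValueError("illegal subtraction: %s - %s" % (left, right))
```

For example, @code{sub_segments("text", "text")} yields @code{"absolute"}, while @code{add_segments("text", "data")} is rejected because both arguments carry a defined segment.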
1732 | ||
b50e59fe | 1733 | @node Pseudo Ops, Machine Dependent, Expressions, Top |
93b45514 RP |
1734 | @chapter Assembler Directives |
1735 | @menu | |
b50e59fe RP |
1736 | * Abort:: The Abort directive causes as to abort |
1737 | * Align:: Pad the location counter to a power of 2 | |
1738 | * App-File:: Set the logical file name | |
1739 | * Ascii:: Fill memory with bytes of ASCII characters | |
1740 | * Asciz:: Fill memory with bytes of ASCII characters followed | |
93b45514 | 1741 | by a null. |
b50e59fe RP |
1742 | * Byte:: Fill memory with 8-bit integers |
1743 | * Comm:: Reserve public space in the BSS segment | |
1744 | * Data:: Change to the data segment | |
1745 | * Desc:: Set the n_desc of a symbol | |
1746 | * Double:: Fill memory with double-precision floating-point numbers | |
1747 | * Else:: @code{.else} | |
1748 | * End:: @code{.end} | |
1749 | * Endif:: @code{.endif} | |
1750 | * Equ:: @code{.equ @var{symbol}, @var{expression}} | |
1751 | * Extern:: @code{.extern} | |
1752 | * Fill:: Fill memory with repeated values | |
1753 | * Float:: Fill memory with single-precision floating-point numbers | |
1754 | * Global:: Make a symbol visible to the linker | |
1755 | * Ident:: @code{.ident} | |
1756 | * If:: @code{.if @var{absolute expression}} | |
1757 | * Include:: @code{.include "@var{file}"} | |
1758 | * Int:: Fill memory with 32-bit integers | |
1759 | * Lcomm:: Reserve private space in the BSS segment | |
1760 | * Line:: Set the logical line number | |
1761 | * Ln:: @code{.ln @var{line-number}} | |
1762 | * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl} | |
1763 | * Long:: Fill memory with 32-bit integers | |
1764 | * Lsym:: Create a local symbol | |
1765 | * Octa:: Fill memory with 128-bit integers | |
1766 | * Org:: Change the location counter | |
1767 | * Quad:: Fill memory with 64-bit integers | |
1768 | * Set:: Set the value of a symbol | |
1769 | * Short:: Fill memory with 16-bit integers | |
1770 | * Single:: @code{.single @var{flonums}} | |
1771 | * Stab:: Store debugging information | |
1772 | * Text:: Change to the text segment | |
b50e59fe | 1773 | * Word:: Fill memory with 32-bit integers |
b50e59fe RP |
1774 | * Deprecated:: Deprecated Directives |
1775 | * Machine Options:: Options | |
1776 | * Machine Syntax:: Syntax | |
1777 | * Floating Point:: Floating Point | |
1778 | * Machine Directives:: Machine Directives | |
1779 | * Opcodes:: Opcodes | |
93b45514 RP |
1780 | @end menu |
1781 | ||
47342e8f RP |
1782 | All assembler directives have names that begin with a period (@samp{.}). |
1783 | The rest of the name is letters: their case does not matter. | |
93b45514 | 1784 | |
b50e59fe RP |
1785 | This chapter discusses directives present in all versions of GNU |
1786 | @code{as}; @pxref{Machine Dependent} for additional directives. | |
1787 | ||
47342e8f | 1788 | @node Abort, Align, Pseudo Ops, Pseudo Ops |
b50e59fe | 1789 | @section @code{.abort} |
93b45514 RP |
1790 | This directive stops the assembly immediately. It is for |
1791 | compatibility with other assemblers. The original idea was that the | |
47342e8f RP |
1792 | assembler program would be piped into the assembler. If the sender |
1793 | of a program quit, it could use this directive to tell @code{as} to | |
93b45514 RP |
1794 | quit also. One day @code{.abort} will not be supported. |
1795 | ||
b50e59fe | 1796 | @node Align, App-File, Abort, Pseudo Ops |
f4335d56 | 1797 | @section @code{.align @var{abs-expression} , @var{abs-expression}} |
b50e59fe | 1798 | Pad the location counter (in the current subsegment) to a particular |
f4335d56 RP |
1799 | storage boundary. The first expression (which must be absolute) is the |
1800 | number of low-order zero bits the location counter will have after | |
1801 | advancement. For example @samp{.align 3} will advance the location | |
1802 | counter until it is a multiple of 8. If the location counter is already a | |
1803 | multiple of 8, no change is needed. | |
93b45514 | 1804 | |
f4335d56 RP |
1805 | The second expression (also absolute) gives the value to be stored in |
1806 | the padding bytes. It (and the comma) may be omitted. If it is | |
1807 | omitted, the padding bytes are zero. | |
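The amount of padding follows from simple arithmetic on the location counter. The sketch below is an assumed model of the description above, not the assembler's own code: @code{align_padding} is a hypothetical name, and keeping only the low 8 bits of the fill value is an assumption the text does not confirm.

```python
def align_padding(location, low_zero_bits, fill=0):
    """Return the bytes '.align low_zero_bits, fill' would add at `location`."""
    boundary = 1 << low_zero_bits   # .align 3 -> boundary of 8
    count = -location % boundary    # 0 when already on the boundary
    return bytes([fill & 0xFF]) * count  # assumption: low byte of fill only
```

With a location counter of 5, @samp{.align 3} needs three padding bytes to reach 8; at 8 it needs none.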
93b45514 | 1808 | |
b50e59fe RP |
1809 | @node App-File, Ascii, Align, Pseudo Ops |
1810 | @section @code{.app-file @var{string}} | |
1811 | @code{.app-file} tells @code{as} that we are about to start a new | |
1812 | logical file. @var{String} is the new file name. In general, the | |
1813 | filename is recognized whether or not it is surrounded by quotes @samp{"}; | |
1814 | but if you wish to specify an empty file name, | |
1815 | you must give the quotes--@code{""}. This statement may go away in | |
1816 | future: it is only recognized to be compatible with old @code{as} | |
1817 | programs. | |
1818 | ||
1819 | @node Ascii, Asciz, App-File, Pseudo Ops | |
1820 | @section @code{.ascii "@var{string}"}@dots{} | |
47342e8f | 1821 | @code{.ascii} expects zero or more string literals (@pxref{Strings}) |
93b45514 RP |
1822 | separated by commas. It assembles each string (with no automatic |
1823 | trailing zero byte) into consecutive addresses. | |
1824 | ||
47342e8f | 1825 | @node Asciz, Byte, Ascii, Pseudo Ops |
b50e59fe RP |
1826 | @section @code{.asciz "@var{string}"}@dots{} |
1827 | @code{.asciz} is just like @code{.ascii}, but each string is followed by | |
1828 | a zero byte. The ``z'' in @samp{.asciz} stands for ``zero''. | |
93b45514 | 1829 | |
47342e8f | 1830 | @node Byte, Comm, Asciz, Pseudo Ops |
b50e59fe | 1831 | @section @code{.byte @var{expressions}} |
93b45514 | 1832 | |
47342e8f | 1833 | @code{.byte} expects zero or more expressions, separated by commas. |
93b45514 RP |
1834 | Each expression is assembled into the next byte. |
1835 | ||
b50e59fe RP |
1836 | @node Comm, Data, Byte, Pseudo Ops |
1837 | @section @code{.comm @var{symbol} , @var{length} } | |
47342e8f RP |
1838 | @code{.comm} declares a named common area in the bss segment. Normally |
1839 | @code{ld} reserves memory addresses for it during linking, so no partial | |
1840 | program defines the location of the symbol. Use @code{.comm} to tell | |
1841 | @code{ld} that it must be at least @var{length} bytes long. @code{ld} | |
1842 | will allocate space for each @code{.comm} symbol that is at least as | |
1843 | long as the longest @code{.comm} request in any of the partial programs | |
1844 | linked. @var{length} is an absolute expression. | |
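The "longest request wins" rule can be pictured as a maximum taken per symbol. This is a simplified sketch: the function @code{common_sizes} and its input format are hypothetical, and it allocates exactly the longest request, whereas the text only promises at least that much.

```python
def common_sizes(requests):
    """Merge .comm requests gathered from all partial programs.

    requests: iterable of (symbol, length) pairs.
    Returns a dict mapping each symbol to the longest length requested.
    """
    sizes = {}
    for symbol, length in requests:
        sizes[symbol] = max(length, sizes.get(symbol, 0))
    return sizes
```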
1845 | ||
1846 | @node Data, Desc, Comm, Pseudo Ops | |
b50e59fe | 1847 | @section @code{.data @var{subsegment}} |
47342e8f | 1848 | @code{.data} tells @code{as} to assemble the following statements onto the |
93b45514 RP |
1849 | end of the data subsegment numbered @var{subsegment} (which is an |
1850 | absolute expression). If @var{subsegment} is omitted, it defaults | |
1851 | to zero. | |
1852 | ||
47342e8f | 1853 | @node Desc, Double, Data, Pseudo Ops |
f4335d56 | 1854 | @section @code{.desc @var{symbol}, @var{abs-expression}} |
b50e59fe | 1855 | This directive sets the descriptor of the symbol (@pxref{Symbol Attributes}) |
f4335d56 | 1856 | to the low 16 bits of an absolute expression. |
93b45514 | 1857 | |
b50e59fe RP |
1858 | @node Double, Else, Desc, Pseudo Ops |
1859 | @section @code{.double @var{flonums}} | |
47342e8f | 1860 | @code{.double} expects zero or more flonums, separated by commas. It assembles |
b50e59fe | 1861 | floating point numbers. |
09352a5d RP |
1862 | _if__(_ALL_ARCH__) |
1863 | The exact kind of floating point numbers emitted depends on how | |
1864 | @code{as} is configured. @xref{Machine Dependent}. | |
1865 | _fi__(_ALL_ARCH__) | |
1866 | _if__(_AMD29K__) | |
b50e59fe | 1867 | On the AMD 29K family the floating point format used is IEEE. |
09352a5d | 1868 | _fi__(_AMD29K__) |
b50e59fe RP |
1869 | |
1870 | @node Else, End, Double, Pseudo Ops | |
1871 | @section @code{.else} | |
1872 | @code{.else} is part of the @code{as} support for conditional assembly; | |
1873 | @pxref{If}. It marks the beginning of a section of code to be assembled | |
1874 | if the condition for the preceding @code{.if} was false. | |
1875 | ||
1876 | @ignore | |
1877 | @node End, Endif, Else, Pseudo Ops | |
1878 | @section @code{.end} | |
1879 | This doesn't do anything---but isn't an s_ignore, so I suspect it's | |
1880 | meant to do something eventually (which is why it isn't documented here | |
1881 | as "for compatibility with blah"). | |
1882 | @end ignore | |
1883 | ||
1884 | @node Endif, Equ, End, Pseudo Ops | |
1885 | @section @code{.endif} | |
1886 | @code{.endif} is part of the @code{as} support for conditional assembly; | |
1887 | it marks the end of a block of code that is only assembled | |
1888 | conditionally. @xref{If}. | |
1889 | ||
1890 | @node Equ, Extern, Endif, Pseudo Ops | |
1891 | @section @code{.equ @var{symbol}, @var{expression}} | |
1892 | ||
1893 | This directive sets the value of @var{symbol} to @var{expression}. | |
1894 | It is synonymous with @samp{.set}; @pxref{Set}. | |
1895 | ||
1896 | @node Extern, Fill, Equ, Pseudo Ops | |
1897 | @section @code{.extern} | |
1898 | @code{.extern} is accepted in the source program---for compatibility | |
1899 | with other assemblers---but it is ignored. GNU @code{as} treats | |
1900 | all undefined symbols as external. | |
1901 | ||
1902 | @node Fill, Float, Extern, Pseudo Ops | |
1903 | @section @code{.fill @var{repeat} , @var{size} , @var{value}} | |
93b45514 RP |
1904 | @var{repeat}, @var{size} and @var{value} are absolute expressions. |
1905 | This emits @var{repeat} copies of @var{size} bytes. @var{Repeat} | |
1906 | may be zero or more. @var{Size} may be zero or more, but if it is | |
1907 | more than 8, then it is deemed to have the value 8, compatible with | |
1908 | other people's assemblers. The contents of each @var{repeat} bytes | |
1909 | is taken from an 8-byte number. The highest order 4 bytes are | |
1910 | zero. The lowest order 4 bytes are @var{value} rendered in the | |
1911 | byte-order of an integer on the computer @code{as} is assembling for. | |
1912 | Each @var{size} bytes in a repetition is taken from the lowest order | |
1913 | @var{size} bytes of this number. Again, this bizarre behavior is | |
1914 | compatible with other people's assemblers. | |
1915 | ||
1916 | @var{Size} and @var{value} are optional. | |
1917 | If the second comma and @var{value} are absent, @var{value} is | |
1918 | assumed zero. If the first comma and following tokens are absent, | |
1919 | @var{size} is assumed to be 1. | |
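
As a small sketch of the defaults described above (the operand values
are chosen only for illustration):
@example
        .fill   4, 1, 0xff      /* four bytes, each 0xff */
        .fill   8               /* eight zero bytes: size 1, value 0 */
@end example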
1920 | ||
47342e8f | 1921 | @node Float, Global, Fill, Pseudo Ops |
b50e59fe RP |
1922 | @section @code{.float @var{flonums}} |
1923 | This directive assembles zero or more flonums, separated by commas. It | |
1924 | has the same effect as @code{.single}. | |
09352a5d RP |
1925 | _if__(_ALL_ARCH__) |
1926 | The exact kind of floating point numbers emitted depends on how | |
1927 | @code{as} is configured. | |
1928 | @xref{Machine Dependent}. | |
1929 | _fi__(_ALL_ARCH__) | |
1930 | _if__(_AMD29K__) | |
b50e59fe | 1931 | The floating point format used for the AMD 29K family is IEEE. |
09352a5d | 1932 | _fi__(_AMD29K__) |
93b45514 | 1933 | |
b50e59fe RP |
1934 | @node Global, Ident, Float, Pseudo Ops |
1935 | @section @code{.global @var{symbol}}, @code{.globl @var{symbol}} | |
47342e8f | 1936 | @code{.global} makes the symbol visible to @code{ld}. If you define |
93b45514 RP |
1937 | @var{symbol} in your partial program, its value is made available to |
1938 | other partial programs that are linked with it. Otherwise, | |
1939 | @var{symbol} will take its attributes from a symbol of the same name | |
1940 | from another partial program it is linked with. | |
1941 | ||
b50e59fe RP |
1942 | This is done by setting the @code{N_EXT} bit of that symbol's type byte |
1943 | to 1. @xref{Symbol Attributes}. | |
1944 | ||
1945 | Both spellings (@samp{.globl} and @samp{.global}) are accepted, for | |
1946 | compatibility with other assemblers. | |
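
A minimal sketch, assuming a hypothetical label @code{entry} defined in
this partial program, that makes the label available to other partial
programs:
@example
        .global entry
entry:
@end example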
1947 | ||
1948 | @node Ident, If, Global, Pseudo Ops | |
1949 | @section @code{.ident} | |
1950 | This directive is used by some assemblers to place tags in object files. | |
1951 | GNU @code{as} simply accepts the directive for source-file | |
1952 | compatibility with such assemblers, but does not actually emit anything | |
1953 | for it. | |
1954 | ||
1955 | @node If, Include, Ident, Pseudo Ops | |
1956 | @section @code{.if @var{absolute expression}} | |
1957 | @code{.if} marks the beginning of a section of code which is only | |
1958 | considered part of the source program being assembled if the argument | |
1959 | (which must be an @var{absolute expression}) is non-zero. The end of | |
1960 | the conditional section of code must be marked by @code{.endif} | |
1961 | (@pxref{Endif}); optionally, you may include code for the | |
1962 | alternative condition, flagged by @code{.else} (@pxref{Else}). | |
1963 | ||
1964 | The following variants of @code{.if} are also supported: | |
1965 | @table @code | |
1966 | @item ifdef @var{symbol} | |
1967 | Assembles the following section of code if the specified @var{symbol} | |
1968 | has been defined. | |
1969 | ||
1970 | @ignore | |
1971 | @item ifeqs | |
1972 | BOGONS?? | |
1973 | @end ignore | |
1974 | ||
1975 | @item ifndef @var{symbol} | |
1976 | @itemx ifnotdef @var{symbol} | |
1977 | Assembles the following section of code if the specified @var{symbol} | |
1978 | has not been defined. Both spelling variants are equivalent. | |
93b45514 | 1979 | |
b50e59fe RP |
1980 | @ignore |
1981 | @item ifnes | |
1982 | NO bogons, I presume? | |
1983 | @end ignore | |
1984 | @end table | |
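
The following sketch shows the general shape of a conditional block. The
symbol @code{debug} is hypothetical; it is given an absolute value with
@code{.set} so that it can serve as the argument of @code{.if}:
@example
        .set    debug, 1
        .if     debug
        .long   1
        .else
        .long   0
        .endif
@end example
@noindent
With @code{debug} set to a non-zero value only the first @code{.long} is
assembled; with zero, only the second.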
1985 | ||
1986 | @node Include, Int, If, Pseudo Ops | |
1987 | @section @code{.include "@var{file}"} | |
1988 | This directive provides a way to include supporting files at specified | |
1989 | points in your source program. The code from @var{file} is assembled as | |
1990 | if it followed the point of the @code{.include}; when the end of the | |
1991 | included file is reached, assembly of the original file continues. You | |
1992 | can control the search paths used with the @samp{-I} command-line option | |
1993 | (@pxref{Options}). Quotation marks are required around @var{file}. | |
1994 | ||
1995 | @node Int, Lcomm, Include, Pseudo Ops | |
1996 | @section @code{.int @var{expressions}} | |
93b45514 RP |
1997 | Expect zero or more @var{expressions}, of any segment, separated by |
1998 | commas. For each expression, emit a 32-bit number that will, at run | |
1999 | time, be the value of that expression. The byte order of the | |
2000 | expression depends on what kind of computer will run the program. | |
2001 | ||
47342e8f | 2002 | @node Lcomm, Line, Int, Pseudo Ops |
b50e59fe | 2003 | @section @code{.lcomm @var{symbol} , @var{length}} |
93b45514 | 2004 | Reserve @var{length} (an absolute expression) bytes for a local |
47342e8f | 2005 | common denoted by @var{symbol}. The segment and value of @var{symbol} are |
93b45514 | 2006 | those of the new local common. The addresses are allocated in the |
b50e59fe | 2007 | bss segment, so at run-time the bytes will start off zeroed. |
47342e8f | 2008 | @var{Symbol} is not declared global (@pxref{Global}), so is normally |
93b45514 RP |
2009 | not visible to @code{ld}. |
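
For example (the name @code{scratch} and the length are only
illustrative), this reserves 64 zeroed bytes in the bss segment:
@example
        .lcomm  scratch, 64
@end example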
2010 | ||
09352a5d | 2011 | _if__(!_AMD29K__) |
b50e59fe RP |
2012 | @node Line, Ln, Lcomm, Pseudo Ops |
2013 | @section @code{.line @var{line-number}}, @code{.ln @var{line-number}} | |
2014 | @code{.line}, and its alternate spelling @code{.ln}, tell | |
09352a5d RP |
2015 | _fi__(!_AMD29K__) |
2016 | _if__(_AMD29K__) | |
b50e59fe RP |
2017 | @node Ln, List, Line, Pseudo Ops |
2018 | @section @code{.ln @var{line-number}} | |
2019 | Tell | |
09352a5d | 2020 | _fi__(_AMD29K__) |
b50e59fe RP |
2021 | @code{as} to change the logical line number. @var{line-number} must be |
2022 | an absolute expression. The next line will have that logical line | |
2023 | number. So any other statements on the current line (after a statement | |
2024 | separator character | |
09352a5d | 2025 | _if__(_AMD29K__) |
b50e59fe | 2026 | @samp{@@}) |
09352a5d RP |
2027 | _fi__(_AMD29K__) |
2028 | _if__(!_AMD29K__) | |
2029 | @code{;}) | |
2030 | _fi__(!_AMD29K__) | |
b50e59fe RP |
2031 | will be reported as on logical line number |
2032 | @var{line-number} @minus{} 1. | |
2033 | One day this directive will be unsupported: it is used only | |
2034 | for compatibility with existing assembler programs. @refill | |
2035 | ||
2036 | @node List, Long, Ln, Pseudo Ops | |
f4335d56 RP |
2037 | @section @code{.list} and related directives |
2038 | GNU @code{as} ignores the directives @code{.list}, @code{.nolist}, | |
2039 | @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}; however, | |
2040 | they're accepted for compatibility with assemblers that use them. | |
b50e59fe RP |
2041 | |
2042 | @node Long, Lsym, List, Pseudo Ops | |
2043 | @section @code{.long @var{expressions}} | |
47342e8f | 2044 | @code{.long} is the same as @samp{.int}. @xref{Int}. |
93b45514 | 2045 | |
47342e8f | 2046 | @node Lsym, Octa, Long, Pseudo Ops |
b50e59fe | 2047 | @section @code{.lsym @var{symbol}, @var{expression}} |
47342e8f | 2048 | @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in |
93b45514 RP |
2049 | the hash table, ensuring it cannot be referenced by name during the |
2050 | rest of the assembly. This sets the attributes of the symbol to be | |
47342e8f | 2051 | the same as the expression value: |
b50e59fe RP |
2052 | @example |
2053 | @var{other} = @var{descriptor} = 0 | |
2054 | @var{type} = @r{(segment of @var{expression})} | |
2055 | N_EXT = 0 | |
2056 | @var{value} = @var{expression} | |
2057 | @end example | |
93b45514 | 2058 | |
47342e8f | 2059 | @node Octa, Org, Lsym, Pseudo Ops |
b50e59fe | 2060 | @section @code{.octa @var{bignums}} |
47342e8f | 2061 | This directive expects zero or more bignums, separated by commas. For each |
b50e59fe RP |
2062 | bignum, it emits a 16-byte integer. |
2063 | ||
2064 | The term ``octa'' comes from contexts in which a ``word'' was two bytes; | |
2065 | hence @emph{octa}-word for 16 bytes. | |
93b45514 | 2066 | |
47342e8f | 2067 | @node Org, Quad, Octa, Pseudo Ops |
b50e59fe | 2068 | @section @code{.org @var{new-lc} , @var{fill}} |
47342e8f RP |
2069 | |
2070 | @code{.org} will advance the location counter of the current segment to | |
93b45514 | 2071 | @var{new-lc}. @var{new-lc} is either an absolute expression or an |
47342e8f RP |
2072 | expression with the same segment as the current subsegment. That is, |
2073 | you can't use @code{.org} to cross segments: if @var{new-lc} has the | |
2074 | wrong segment, the @code{.org} directive is ignored. To be compatible | |
2075 | with former assemblers, if the segment of @var{new-lc} is absolute, | |
2076 | @code{as} will issue a warning, then pretend the segment of @var{new-lc} | |
2077 | is the same as the current subsegment. | |
2078 | ||
2079 | @code{.org} may only increase the location counter, or leave it | |
2080 | unchanged; you cannot use @code{.org} to move the location counter | |
2081 | backwards. | |
2082 | ||
b50e59fe RP |
2083 | @c double negative used below "not undefined" because this is a specific |
2084 | @c reference to "undefined" (as SEG_UNKNOWN is called in this manual) | |
2085 | @c segment. [email protected] 18feb91 | |
47342e8f | 2086 | Because @code{as} tries to assemble programs in one pass, @var{new-lc} |
b50e59fe | 2087 | may not be undefined. If you really detest this restriction we eagerly await |
47342e8f | 2088 | a chance to share your improved assembler. |
93b45514 RP |
2089 | |
2090 | Beware that the origin is relative to the start of the segment, not | |
2091 | to the start of the subsegment. This is compatible with other | |
2092 | people's assemblers. | |
2093 | ||
47342e8f | 2094 | When the location counter (of the current subsegment) is advanced, the |
93b45514 RP |
2095 | intervening bytes are filled with @var{fill} which should be an |
2096 | absolute expression. If the comma and @var{fill} are omitted, | |
2097 | @var{fill} defaults to zero. | |
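
As a sketch (the label @code{start} is hypothetical), the following
advances the location counter to 16 bytes past @code{start}, filling the
12 bytes that follow the @code{.long} with @code{0xff}. The expression
@code{start+16} belongs to the text segment, so no segment is crossed:
@example
        .text
start:
        .long   1
        .org    start+16, 0xff
@end example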
2098 | ||
47342e8f | 2099 | @node Quad, Set, Org, Pseudo Ops |
b50e59fe RP |
2100 | @section @code{.quad @var{bignums}} |
2101 | @code{.quad} expects zero or more bignums, separated by commas. For | |
2102 | each bignum, it emits an 8-byte integer. If the bignum won't fit in 8 | |
2103 | bytes, it prints a warning message and takes just the lowest order 8 | |
2104 | bytes of the bignum. | |
2105 | ||
2106 | The term ``quad'' comes from contexts in which a ``word'' was two bytes; | |
2107 | hence @emph{quad}-word for 8 bytes. | |
93b45514 | 2108 | |
47342e8f | 2109 | @node Set, Short, Quad, Pseudo Ops |
b50e59fe | 2110 | @section @code{.set @var{symbol}, @var{expression}} |
93b45514 | 2111 | |
47342e8f | 2112 | This directive sets the value of @var{symbol} to @var{expression}. This |
b50e59fe RP |
2113 | will change @var{symbol}'s value and type to conform to |
2114 | @var{expression}. If @code{N_EXT} is set, it remains set. | |
2115 | (@xref{Symbol Attributes}.) | |
93b45514 | 2116 | |
47342e8f | 2117 | You may @code{.set} a symbol many times in the same assembly. |
93b45514 RP |
2118 | If the expression's segment is unknowable during pass 1, a second |
2119 | pass over the source program will be forced. The second pass is | |
2120 | currently not implemented. @code{as} will abort with an error | |
2121 | message if one is required. | |
2122 | ||
2123 | If you @code{.set} a global symbol, the value stored in the object | |
2124 | file is the last value stored into it. | |
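
A short sketch of setting a symbol more than once (the name @code{count}
is only illustrative); the @code{.long} here emits the value 5:
@example
        .set    count, 4
        .set    count, count+1
        .long   count
@end example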
2125 | ||
b50e59fe RP |
2126 | @node Short, Single, Set, Pseudo Ops |
2127 | @section @code{.short @var{expressions}} | |
09352a5d RP |
2128 | _if__(! (_SPARC__ || _AMD29K__) ) |
2129 | @code{.short} is the same as @samp{.word}. @xref{Word}. | |
2130 | _fi__(! (_SPARC__ || _AMD29K__) ) | |
2131 | _if__(_SPARC__ || _AMD29K__) | |
b50e59fe RP |
2132 | This expects zero or more @var{expressions}, and emits |
2133 | a 16 bit number for each. | |
09352a5d | 2134 | _fi__(_SPARC__ || _AMD29K__) |
b50e59fe RP |
2135 | |
2136 | @node Single, Space, Short, Pseudo Ops | |
2137 | @section @code{.single @var{flonums}} | |
2138 | This directive assembles zero or more flonums, separated by commas. It | |
2139 | has the same effect as @code{.float}. | |
09352a5d RP |
2140 | _if__(_ALL_ARCH__) |
2141 | The exact kind of floating point numbers emitted depends on how | |
2142 | @code{as} is configured. @xref{Machine Dependent}. | |
2143 | _fi__(_ALL_ARCH__) | |
2144 | _if__(_AMD29K__) | |
b50e59fe | 2145 | The floating point format used for the AMD 29K family is IEEE. |
09352a5d | 2146 | _fi__(_AMD29K__) |
b50e59fe RP |
2147 | |
2148 | ||
2149 | @node Space, Stab, Single, Pseudo Ops | |
09352a5d | 2150 | _if__(!_AMD29K__) |
b50e59fe | 2151 | @section @code{.space @var{size} , @var{fill}} |
47342e8f | 2152 | This directive emits @var{size} bytes, each of value @var{fill}. Both |
93b45514 RP |
2153 | @var{size} and @var{fill} are absolute expressions. If the comma |
2154 | and @var{fill} are omitted, @var{fill} is assumed to be zero. | |
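
For example (values chosen only for illustration), this emits 16 bytes,
each of value @code{0x20}:
@example
        .space  16, 0x20
@end example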
09352a5d | 2155 | _fi__(!_AMD29K__) |
b50e59fe | 2156 | |
09352a5d | 2157 | _if__(_AMD29K__) |
b50e59fe RP |
2158 | @section @code{.space} |
2159 | This directive is ignored; it is accepted for compatibility with other | |
2160 | AMD 29K assemblers. | |
2161 | ||
2162 | @quotation | |
2163 | @emph{Warning:} In other versions of GNU @code{as}, the directive | |
2164 | @code{.space} has the effect of @code{.block}. @xref{Machine Directives}. | |
2165 | @end quotation | |
09352a5d | 2166 | _fi__(_AMD29K__) |
93b45514 | 2167 | |
47342e8f | 2168 | @node Stab, Text, Space, Pseudo Ops |
b50e59fe | 2169 | @section @code{.stabd, .stabn, .stabs} |
47342e8f | 2170 | There are three directives that begin @samp{.stab}. |
b50e59fe | 2171 | All emit symbols (@pxref{Symbols}), for use by symbolic debuggers. |
93b45514 | 2172 | The symbols are not entered in @code{as}' hash table: they |
b50e59fe | 2173 | cannot be referenced elsewhere in the source file. |
93b45514 RP |
2174 | Up to five fields are required: |
2175 | @table @var | |
2176 | @item string | |
2177 | This is the symbol's name. It may contain any character except @samp{\000}, | |
2178 | so is more general than ordinary symbol names. Some debuggers used to | |
47342e8f | 2179 | code arbitrarily complex structures into symbol names using this field. |
93b45514 | 2180 | @item type |
b50e59fe | 2181 | An absolute expression. The symbol's type is set to the low 8 |
93b45514 RP |
2182 | bits of this expression. |
2183 | Any bit pattern is permitted, but @code{ld} and debuggers will choke on | |
2184 | silly bit patterns. | |
2185 | @item other | |
2186 | An absolute expression. | |
b50e59fe | 2187 | The symbol's ``other'' attribute is set to the low 8 bits of this expression. |
93b45514 RP |
2188 | @item desc |
2189 | An absolute expression. | |
b50e59fe | 2190 | The symbol's descriptor is set to the low 16 bits of this expression. |
93b45514 | 2191 | @item value |
b50e59fe | 2192 | An absolute expression which becomes the symbol's value. |
93b45514 RP |
2193 | @end table |
2194 | ||
b50e59fe RP |
2195 | If a warning is detected while reading a @code{.stabd}, @code{.stabn}, |
2196 | or @code{.stabs} statement, the symbol has probably already been created | |
2197 | and you will get a half-formed symbol in your object file. This is | |
2198 | compatible with earlier assemblers! | |
93b45514 | 2199 | |
47342e8f RP |
2200 | @table @code |
2201 | @item .stabd @var{type} , @var{other} , @var{desc} | |
93b45514 RP |
2202 | |
2203 | The ``name'' of the symbol generated is not even an empty string. | |
2204 | It is a null pointer, for compatibility. Older assemblers used a | |
2205 | null pointer so they didn't waste space in object files with empty | |
2206 | strings. | |
2207 | ||
b50e59fe | 2208 | The symbol's value is set to the location counter, |
93b45514 RP |
2209 | relocatably. When your program is linked, the value of this symbol |
2210 | will be where the location counter was when the @code{.stabd} was | |
2211 | assembled. | |
2212 | ||
47342e8f | 2213 | @item .stabn @var{type} , @var{other} , @var{desc} , @var{value} |
93b45514 RP |
2214 | |
2215 | The name of the symbol is set to the empty string @code{""}. | |
2216 | ||
47342e8f | 2217 | @item .stabs @var{string} , @var{type} , @var{other} , @var{desc} , @var{value} |
93b45514 | 2218 | |
47342e8f RP |
2219 | All five fields are specified. |
2220 | @end table | |
2221 | ||
2222 | @node Text, Word, Stab, Pseudo Ops | |
b50e59fe | 2223 | @section @code{.text @var{subsegment}} |
93b45514 RP |
2224 | Tells @code{as} to assemble the following statements onto the end of |
2225 | the text subsegment numbered @var{subsegment}, which is an absolute | |
2226 | expression. If @var{subsegment} is omitted, subsegment number zero | |
2227 | is used. | |
2228 | ||
b50e59fe RP |
2229 | @node Word, Deprecated, Text, Pseudo Ops |
2230 | @section @code{.word @var{expressions}} | |
47342e8f | 2231 | This directive expects zero or more @var{expressions}, of any segment, |
b50e59fe | 2232 | separated by commas. |
09352a5d | 2233 | _if__(_SPARC__ || _AMD29K__) |
b50e59fe | 2234 | For each expression, @code{as} emits a 32-bit number. |
09352a5d RP |
2235 | _fi__(_SPARC__ || _AMD29K__) |
2236 | _if__(! (_SPARC__ || _AMD29K__) ) | |
2237 | For each expression, @code{as} emits a 16-bit number. | |
2238 | _fi__(! (_SPARC__ || _AMD29K__) ) | |
2239 | ||
2240 | _if__(_ALL_ARCH__) | |
2241 | The byte order of the expression depends on what kind of computer will | |
2242 | run the program. | |
2243 | _fi__(_ALL_ARCH__) | |
2244 | ||
2245 | @c on the 29k the "special treatment to support compilers" doesn't | |
2246 | @c happen---32-bit addressability, period; no long/short jumps. | |
2247 | _if__(!_AMD29K__) | |
47342e8f RP |
2248 | @subsection Special Treatment to support Compilers |
2249 | ||
2250 | In order to assemble compiler output into something that will work, | |
2251 | @code{as} will occasionally do strange things to @samp{.word} directives. | |
2252 | Directives of the form @samp{.word sym1-sym2} are often emitted by | |
2253 | compilers as part of jump tables. Therefore, when @code{as} assembles a | |
2254 | directive of the form @samp{.word sym1-sym2}, and the difference between | |
2255 | @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will | |
2256 | create a @dfn{secondary jump table}, immediately before the next label. | |
2257 | This @var{secondary jump table} will be preceded by a short-jump to the | |
2258 | first byte after the secondary table. This short-jump prevents the flow | |
2259 | of control from accidentally falling into the new table. Inside the | |
2260 | table will be a long-jump to @code{sym2}. The original @samp{.word} | |
2261 | will contain @code{sym1} minus the address of the long-jump to | |
2262 | @code{sym2}. | |
2263 | ||
2264 | If there were several occurrences of @samp{.word sym1-sym2} before the | |
2265 | secondary jump table, all of them will be adjusted. If there was a | |
2266 | @samp{.word sym3-sym4}, that also did not fit in sixteen bits, a | |
2267 | long-jump to @code{sym4} will be included in the secondary jump table, | |
2268 | and the @code{.word} directives will be adjusted to contain @code{sym3} | |
2269 | minus the address of the long-jump to @code{sym4}; and so on, for as many | |
2270 | entries in the original jump table as necessary. | |
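
A compiler-generated jump table of the kind described above might look
like the following sketch. The labels @code{table}, @code{case0} and
@code{case1} are hypothetical, and the case labels are assumed to be
defined elsewhere in the same segment:
@example
table:
        .word   case0-table
        .word   case1-table
@end example
@noindent
Only entries whose difference does not fit in 16 bits are redirected
through the secondary jump table.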
09352a5d RP |
2271 | |
2272 | _if__(_INTERNALS__) | |
47342e8f RP |
2273 | @emph{This feature may be disabled by compiling @code{as} with the |
2274 | @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse | |
2275 | assembly language programmers. | |
09352a5d RP |
2276 | _fi__(_INTERNALS__) |
2277 | _fi__(!_AMD29K__) | |
93b45514 | 2278 | |
b50e59fe | 2279 | @node Deprecated, Machine Dependent, Word, Pseudo Ops |
93b45514 RP |
2280 | @section Deprecated Directives |
2281 | One day these directives won't work. | |
2282 | They are included for compatibility with older assemblers. | |
2283 | @table @t | |
2284 | @item .abort | |
b50e59fe | 2285 | @item .app-file |
93b45514 RP |
2286 | @item .line |
2287 | @end table | |
2288 | ||
b50e59fe | 2289 | @node Machine Dependent, Machine Dependent, Pseudo Ops, Top |
09352a5d RP |
2290 | _if__(_ALL_ARCH__) |
2291 | @chapter Machine Dependent Features | |
2292 | _fi__(_ALL_ARCH__) | |
2293 | ||
2294 | _if__(_VAX__ && !_ALL_ARCH__) | |
2295 | @chapter Machine Dependent Features: VAX | |
2296 | _fi__(_VAX__ && !_ALL_ARCH__) | |
2297 | _if__(_ALL_ARCH__) | |
93b45514 | 2298 | @section Vax |
09352a5d RP |
2299 | _fi__(_ALL_ARCH__) |
2300 | _if__(_VAX__) | |
93b45514 RP |
2301 | @subsection Options |
2302 | ||
2303 | The Vax version of @code{as} accepts any of the following options, | |
2304 | gives a warning message that the option was ignored and proceeds. | |
2305 | These options are for compatibility with scripts designed for other | |
2306 | people's assemblers. | |
2307 | ||
2308 | @table @asis | |
2309 | @item @kbd{-D} (Debug) | |
2310 | @itemx @kbd{-S} (Symbol Table) | |
2311 | @itemx @kbd{-T} (Token Trace) | |
2312 | These are obsolete options used to debug old assemblers. | |
2313 | ||
2314 | @item @kbd{-d} (Displacement size for JUMPs) | |
2315 | This option expects a number following the @kbd{-d}. Like options | |
2316 | that expect filenames, the number may immediately follow the | |
2317 | @kbd{-d} (old standard) or constitute the whole of the command line | |
2318 | argument that follows @kbd{-d} (GNU standard). | |
2319 | ||
2320 | @item @kbd{-V} (Virtualize Interpass Temporary File) | |
2321 | Some other assemblers use a temporary file. This option | |
2322 | commanded them to keep the information in active memory rather | |
2323 | than in a disk file. @code{as} always does this, so this | |
2324 | option is redundant. | |
2325 | ||
2326 | @item @kbd{-J} (JUMPify Longer Branches) | |
2327 | Many 32-bit computers permit a variety of branch instructions | |
2328 | to do the same job. Some of these instructions are short (and | |
2329 | fast) but have a limited range; others are long (and slow) but | |
2330 | can branch anywhere in virtual memory. Often there are 3 | |
2331 | flavors of branch: short, medium and long. Some other | |
2332 | assemblers would emit short and medium branches, unless told by | |
2333 | this option to emit short and long branches. | |
2334 | ||
2335 | @item @kbd{-t} (Temporary File Directory) | |
2336 | Some other assemblers may use a temporary file, and this option | |
2337 | takes a filename naming the directory in which to place the temporary | |
2338 | file. @code{as} does not use a temporary disk file, so this | |
2339 | option makes no difference. @kbd{-t} needs exactly one | |
2340 | filename. | |
2341 | @end table | |
2342 | ||
2343 | The Vax version of the assembler accepts two options when | |
2344 | compiled for VMS. They are @kbd{-h}, and @kbd{-+}. The | |
2345 | @kbd{-h} option prevents @code{as} from modifying the | |
2346 | symbol-table entries for symbols that contain lowercase | |
2347 | characters (I think). The @kbd{-+} option causes @code{as} to | |
2348 | print warning messages if the FILENAME part of the object file, | |
2349 | or any symbol name is larger than 31 characters. The @kbd{-+} | |
2350 | option also inserts some code following the @samp{_main} | |
47342e8f | 2351 | symbol so that the object file will be compatible with Vax-11 |
93b45514 RP |
2352 | "C". |
2353 | ||
2354 | @subsection Floating Point | |
2355 | Conversion of flonums to floating point is correct, and | |
2356 | compatible with previous assemblers. Rounding is | |
2357 | towards zero if the remainder is exactly half the least significant bit. | |
2358 | ||
2359 | @code{D}, @code{F}, @code{G} and @code{H} floating point formats | |
2360 | are understood. | |
2361 | ||
47342e8f | 2362 | Immediate floating literals (@emph{e.g.} @samp{S`$6.9}) |
93b45514 RP |
2363 | are rendered correctly. Again, rounding is towards zero in the |
2364 | boundary case. | |
2365 | ||
2366 | The @code{.float} directive produces @code{f} format numbers. | |
2367 | The @code{.double} directive produces @code{d} format numbers. | |
2368 | ||
2369 | @subsection Machine Directives | |
2370 | The Vax version of the assembler supports four directives for | |
2371 | generating Vax floating point constants. They are described in the | |
2372 | table below. | |
2373 | ||
2374 | @table @code | |
2375 | @item .dfloat | |
2376 | This expects zero or more flonums, separated by commas, and | |
2377 | assembles Vax @code{d} format 64-bit floating point constants. | |
2378 | ||
2379 | @item .ffloat | |
2380 | This expects zero or more flonums, separated by commas, and | |
2381 | assembles Vax @code{f} format 32-bit floating point constants. | |
2382 | ||
2383 | @item .gfloat | |
2384 | This expects zero or more flonums, separated by commas, and | |
2385 | assembles Vax @code{g} format 64-bit floating point constants. | |
2386 | ||
2387 | @item .hfloat | |
2388 | This expects zero or more flonums, separated by commas, and | |
2389 | assembles Vax @code{h} format 128-bit floating point constants. | |
2390 | ||
2391 | @end table | |
2392 | ||
2393 | @subsection Opcodes | |
2394 | All DEC mnemonics are supported. Beware that @code{case@dots{}} | |
2395 | instructions have exactly 3 operands. The dispatch table that | |
2396 | follows the @code{case@dots{}} instruction should be made with | |
2397 | @code{.word} statements. This is compatible with all unix | |
2398 | assemblers we know of. | |
2399 | ||
2400 | @subsection Branch Improvement | |
2401 | Certain pseudo opcodes are permitted. They are for branch | |
2402 | instructions. They expand to the shortest branch instruction that | |
2403 | will reach the target. Generally these mnemonics are made by | |
2404 | substituting @samp{j} for @samp{b} at the start of a DEC mnemonic. | |
2405 | This feature is included both for compatibility and to help | |
2406 | compilers. If you don't need this feature, don't use these | |
2407 | opcodes. Here are the mnemonics, and the code they can expand into. | |
2408 | ||
2409 | @table @code | |
2410 | @item jbsb | |
2411 | @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}. | |
2412 | @table @asis | |
2413 | @item (byte displacement) | |
2414 | @kbd{bsbb @dots{}} | |
2415 | @item (word displacement) | |
2416 | @kbd{bsbw @dots{}} | |
2417 | @item (long displacement) | |
2418 | @kbd{jsb @dots{}} | |
2419 | @end table | |
2420 | @item jbr | |
2421 | @itemx jr | |
2422 | Unconditional branch. | |
2423 | @table @asis | |
2424 | @item (byte displacement) | |
2425 | @kbd{brb @dots{}} | |
2426 | @item (word displacement) | |
2427 | @kbd{brw @dots{}} | |
2428 | @item (long displacement) | |
2429 | @kbd{jmp @dots{}} | |
2430 | @end table | |
2431 | @item j@var{COND} | |
2432 | @var{COND} may be any one of the conditional branches | |
2433 | @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}. | |
2434 | @var{COND} may also be one of the bit tests | |
2435 | @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}. | |
2436 | @var{NOTCOND} is the opposite condition to @var{COND}. | |
2437 | @table @asis | |
2438 | @item (byte displacement) | |
2439 | @kbd{b@var{COND} @dots{}} | |
2440 | @item (word displacement) | |
2441 | @kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:} | |
2442 | @item (long displacement) | |
2443 | @kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:} | |
2444 | @end table | |
2445 | @item jacb@var{X} | |
2446 | @var{X} may be one of @code{b d f g h l w}. | |
2447 | @table @asis | |
2448 | @item (word displacement) | |
2449 | @kbd{@var{OPCODE} @dots{}} | |
2450 | @item (long displacement) | |
2451 | @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:} | |
2452 | @end table | |
2453 | @item jaob@var{YYY} | |
2454 | @var{YYY} may be one of @code{lss leq}. | |
2455 | @item jsob@var{ZZZ} | |
2456 | @var{ZZZ} may be one of @code{geq gtr}. | |
2457 | @table @asis | |
2458 | @item (byte displacement) | |
2459 | @kbd{@var{OPCODE} @dots{}} | |
2460 | @item (word displacement) | |
2461 | @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:} | |
2462 | @item (long displacement) | |
2463 | @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: } | |
2464 | @end table | |
2465 | @item aobleq | |
2466 | @itemx aoblss | |
2467 | @itemx sobgeq | |
2468 | @itemx sobgtr | |
2469 | @table @asis | |
2470 | @item (byte displacement) | |
2471 | @kbd{@var{OPCODE} @dots{}} | |
2472 | @item (word displacement) | |
2473 | @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:} | |
2474 | @item (long displacement) | |
2475 | @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:} | |
2476 | @end table | |
2477 | @end table | |
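
As a sketch of how these pseudo opcodes are written (the labels
@code{done} and @code{retry} are hypothetical and assumed to be defined
elsewhere), @code{as} chooses among the expansions listed above
according to how far away each target is:
@example
        jbr     done
        jneq    retry
@end example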
2478 | ||
2479 | @subsection Operands | |
2480 | The immediate character is @samp{$} for Unix compatibility, not | |
2481 | @samp{#} as DEC writes it. | |
2482 | ||
2483 | The indirect character is @samp{*} for Unix compatibility, not | |
2484 | @samp{@@} as DEC writes it. | |
2485 | ||
2486 | The displacement sizing character is @samp{`} (an accent grave) for | |
2487 | Unix compatibility, not @samp{^} as DEC writes it. The letter | |
2488 | preceding @samp{`} may have either case. @samp{G} is not | |
2489 | understood, but all other letters (@code{b i l s w}) are understood. | |
2490 | ||
2491 | Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp | |
2492 | pc}. Any case of letters will do. | |
2493 | ||
2494 | For instance | |
2495 | @example | |
2496 | tstb *w`$4(r5) | |
2497 | @end example | |
2498 | ||
2499 | Any expression is permitted in an operand. Operands are comma | |
2500 | separated. | |
2501 | ||
2502 | @c There is some bug to do with recognizing expressions | |
2503 | @c in operands, but I forget what it is. It is | |
2504 | @c a syntax clash because () is used as an address mode | |
2505 | @c and to encapsulate sub-expressions. | |
2506 | @subsection Not Supported | |
2507 | Vax bit fields cannot be assembled with @code{as}. Someone | |
2508 | can add the required code if they really need it. | |
09352a5d | 2509 | _fi__(_VAX__) |
93b45514 | 2510 | |
09352a5d RP |
2511 | _if__(_AMD29K__ && !_ALL_ARCH__) |
2512 | @chapter Machine Dependent Features: AMD 29K | |
2513 | _fi__(_AMD29K__ && !_ALL_ARCH__) | |
2514 | _if__(_AMD29K__) | |
b50e59fe RP |
2515 | @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent |
2516 | @section Options | |
2517 | GNU @code{as} has no additional command-line options for the AMD | |
2518 | 29K family. | |
2519 | ||
2520 | @node Machine Syntax, Floating Point, Machine Options, Machine Dependent | |
2521 | @section Syntax | |
2522 | @subsection Special Characters | |
2523 | @samp{;} is the line comment character. | |
2524 | ||
2525 | @samp{@@} can be used instead of a newline to separate statements. | |
2526 | ||
2527 | The character @samp{?} is permitted in identifiers (but may not begin | |
2528 | an identifier). | |
2529 | ||
2530 | @subsection Register Names | |
2531 | General-purpose registers are represented by predefined symbols of the | |
2532 | form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}} | |
2533 | (for local registers), where @var{nnn} represents a number between | |
2534 | @code{0} and @code{127}, written with no leading zeros. The leading | |
2535 | letters may be in either upper or lower case; for example, @samp{gr13} | |
2536 | and @samp{LR7} are both valid register names. | |
2537 | ||
2538 | You may also refer to general-purpose registers by specifying the | |
2539 | register number as the result of an expression (prefixed with @samp{%%} | |
2540 | to flag the expression as a register number): | |
2541 | @example | |
2542 | %%@var{expression} | |
2543 | @end example | |
2544 | @noindent---where @var{expression} must be an absolute expression | |
2545 | evaluating to a number between @code{0} and @code{255}. The range | |
2546 | [0, 127] refers to global registers, and the range [128, 255] to local | |
2547 | registers. | |
2548 | ||
2549 | In addition, GNU @code{as} understands the following protected | |
2550 | special-purpose register names for the AMD 29K family: | |
2551 | ||
2552 | @example | |
2553 | vab chd pc0 | |
2554 | ops chc pc1 | |
2555 | cps rbp pc2 | |
2556 | cfg tmc mmu | |
2557 | cha tmr lru | |
2558 | @end example | |
2559 | ||
2560 | These unprotected special-purpose register names are also recognized: | |
2561 | @example | |
2562 | ipc alu fpe | |
2563 | ipa bp inte | |
2564 | ipb fc fps | |
2565 | q cr exop | |
2566 | @end example | |
2567 | ||
2568 | @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent | |
2569 | @section Floating Point | |
2570 | The AMD 29K family uses IEEE floating-point numbers. | |
2571 | ||
2572 | @node Machine Directives, Opcodes, Floating Point, Machine Dependent | |
2573 | @section Machine Directives | |
2574 | ||
2575 | @menu | |
2576 | * block:: @code{.block @var{size} , @var{fill}} | |
2577 | * cputype:: @code{.cputype} | |
2578 | * file:: @code{.file} | |
2579 | * hword:: @code{.hword @var{expressions}} | |
2580 | * line:: @code{.line} | |
2581 | * reg:: @code{.reg @var{symbol}, @var{expression}} | |
2582 | * sect:: @code{.sect} | |
2583 | * use:: @code{.use @var{segment name}} | |
2584 | @end menu | |
2585 | ||
2586 | @node block, cputype, Machine Directives, Machine Directives | |
2587 | @subsection @code{.block @var{size} , @var{fill}} | |
2588 | This directive emits @var{size} bytes, each of value @var{fill}. Both | |
2589 | @var{size} and @var{fill} are absolute expressions. If the comma | |
2590 | and @var{fill} are omitted, @var{fill} is assumed to be zero. | |
2591 | ||
2592 | In other versions of GNU @code{as}, this directive is called | |
2593 | @samp{.space}. | |
2594 | ||
2595 | @node cputype, file, block, Machine Directives | |
2596 | @subsection @code{.cputype} | |
2597 | This directive is ignored; it is accepted for compatibility with other | |
2598 | AMD 29K assemblers. | |
2599 | ||
2600 | @node file, hword, cputype, Machine Directives | |
2601 | @subsection @code{.file} | |
2602 | This directive is ignored; it is accepted for compatibility with other | |
2603 | AMD 29K assemblers. | |
2604 | ||
2605 | @quotation | |
2606 | @emph{Warning:} in other versions of GNU @code{as}, @code{.file} is | |
2607 | used for the directive called @code{.app-file} in the AMD 29K support. | |
2608 | @end quotation | |
2609 | ||
2610 | @node hword, line, file, Machine Directives | |
2611 | @subsection @code{.hword @var{expressions}} | |
2612 | This expects zero or more @var{expressions}, and emits | |
2613 | a 16-bit number for each. (Synonym for @samp{.short}.) | |
2614 | ||
2615 | @node line, reg, hword, Machine Directives | |
2616 | @subsection @code{.line} | |
2617 | This directive is ignored; it is accepted for compatibility with other | |
2618 | AMD 29K assemblers. | |
2619 | ||
2620 | @node reg, sect, line, Machine Directives | |
2621 | @subsection @code{.reg @var{symbol}, @var{expression}} | |
2622 | @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}. | |
2623 | ||
2624 | @node sect, use, reg, Machine Directives | |
2625 | @subsection @code{.sect} | |
2626 | This directive is ignored; it is accepted for compatibility with other | |
2627 | AMD 29K assemblers. | |
2628 | ||
2629 | @node use, , sect, Machine Directives | |
2630 | @subsection @code{.use @var{segment name}} | |
2631 | Establishes the segment and subsegment for the following code; | |
2632 | @var{segment name} may be one of @code{.text}, @code{.data}, | |
2633 | @code{.data1}, or @code{.lit}. With one of the first three @var{segment | |
2634 | name} options, @samp{.use} is equivalent to the machine directive | |
2635 | @var{segment name}; the remaining case, @samp{.use .lit}, is the same as | |
2636 | @samp{.data 200}. | |
2637 | ||
2638 | ||
2639 | @node Opcodes, , Machine Directives, Machine Dependent | |
2640 | @section Opcodes | |
2641 | GNU @code{as} implements all the standard AMD 29K opcodes. No | |
2642 | additional pseudo-instructions are needed on this family. | |
2643 | ||
2644 | For information on the 29K machine instruction set, see @cite{Am29000 | |
2645 | User's Manual}, Advanced Micro Devices, Inc. | |
2646 | ||
2647 | ||
09352a5d RP |
2648 | _fi__(_AMD29K__) |
2649 | _if__(_M680X0__ && !_ALL_ARCH__) | |
2650 | @chapter Machine Dependent Features: Motorola 680x0 | |
2651 | _fi__(_M680X0__ && !_ALL_ARCH__) | |
2652 | _if__(_M680X0__) | |
47342e8f | 2653 | @section Options |
93b45514 RP |
2654 | The 680x0 version of @code{as} has two machine dependent options. |
2655 | One shortens undefined references from 32 to 16 bits, while the | |
2656 | other is used to tell @code{as} what kind of machine it is | |
2657 | assembling for. | |
2658 | ||
2659 | You can use the @kbd{-l} option to shorten the size of references to | |
47342e8f RP |
2660 | undefined symbols. If the @kbd{-l} option is not given, references to |
2661 | undefined symbols will be a full long (32 bits) wide. (Since @code{as} | |
2662 | cannot know where these symbols will end up, @code{as} can only allocate | |
2663 | space for the linker to fill in later. Since @code{as} doesn't know how | |
2664 | far away these symbols will be, it allocates as much space as it can.) | |
2665 | If this option is given, the references will only be one word wide (16 | |
2666 | bits). This may be useful if you want the object file to be as small as | |
2667 | possible, and you know that the relevant symbols will be less than 17 | |
2668 | bits away. | |
2669 | ||
2670 | The 680x0 version of @code{as} is most frequently used to assemble | |
2671 | programs for the Motorola MC68020 microprocessor. Occasionally it is | |
2672 | used to assemble programs for the mostly similar, but slightly different | |
2673 | MC68000 or MC68010 microprocessors. You can give @code{as} the options | |
2674 | @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010}, | |
2675 | @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the | |
2676 | target. | |
2677 | ||
2678 | @section Syntax | |
2679 | ||
2680 | The 680x0 version of @code{as} uses syntax similar to the Sun assembler. | |
2681 | Size modifiers are appended directly to the end of the opcode without an | |
2682 | intervening period. For example, write @samp{movl} rather than | |
2683 | @samp{move.l}. | |
2684 | ||
09352a5d | 2685 | _if__(_INTERNALS__) |
47342e8f RP |
2686 | If @code{as} is compiled with SUN_ASM_SYNTAX defined, it will also allow |
2687 | Sun-style local labels of the form @samp{1$} through @samp{9$}. | |
09352a5d | 2688 | _fi__(_INTERNALS__) |
93b45514 RP |
2689 | |
2690 | In the following table @dfn{apc} stands for any of the address | |
2691 | registers (@samp{a0} through @samp{a7}), nothing (@samp{}), the | |
2692 | Program Counter (@samp{pc}), or the zero-address relative to the | |
2693 | program counter (@samp{zpc}). | |
2694 | ||
2695 | The following addressing modes are understood: | |
2696 | @table @dfn | |
2697 | @item Immediate | |
2698 | @samp{#@var{digits}} | |
2699 | ||
2700 | @item Data Register | |
2701 | @samp{d0} through @samp{d7} | |
2702 | ||
2703 | @item Address Register | |
2704 | @samp{a0} through @samp{a7} | |
2705 | ||
2706 | @item Address Register Indirect | |
2707 | @samp{a0@@} through @samp{a7@@} | |
2708 | ||
2709 | @item Address Register Postincrement | |
2710 | @samp{a0@@+} through @samp{a7@@+} | |
2711 | ||
2712 | @item Address Register Predecrement | |
2713 | @samp{a0@@-} through @samp{a7@@-} | |
2714 | ||
2715 | @item Indirect Plus Offset | |
2716 | @samp{@var{apc}@@(@var{digits})} | |
2717 | ||
2718 | @item Index | |
2719 | @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})} | |
2720 | or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})} | |
2721 | ||
2722 | @item Postindex | |
2723 | @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})} | |
2724 | or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})} | |
2725 | ||
2726 | @item Preindex | |
2727 | @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})} | |
2728 | or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})} | |
2729 | ||
2730 | @item Memory Indirect | |
2731 | @samp{@var{apc}@@(@var{digits})@@(@var{digits})} | |
2732 | ||
2733 | @item Absolute | |
47342e8f | 2734 | @samp{@var{symbol}}, or @samp{@var{digits}} |
09352a5d | 2735 | @ignore |
47342e8f RP |
2736 | @c [email protected]: gnu, rich concur the following needs careful |
2737 | @c research before documenting. | |
2738 | , or either of the above followed | |
93b45514 | 2739 | by @samp{:b}, @samp{:w}, or @samp{:l}. |
09352a5d | 2740 | @end ignore |
93b45514 RP |
2741 | @end table |
2742 | ||
47342e8f | 2743 | @section Floating Point |
93b45514 RP |
2744 | The floating point code is not too well tested, and may have |
2745 | subtle bugs in it. | |
2746 | ||
2747 | Packed decimal (P) format floating literals are not supported. | |
47342e8f | 2748 | Feel free to add the code! |
93b45514 RP |
2749 | |
2750 | The floating point formats generated by directives are these. | |
2751 | @table @code | |
2752 | @item .float | |
2753 | @code{Single} precision floating point constants. | |
2754 | @item .double | |
2755 | @code{Double} precision floating point constants. | |
2756 | @end table | |
2757 | ||
2758 | There is no directive to produce regions of memory holding | |
2759 | extended precision numbers; however, they can be used as | |
2760 | immediate operands to floating-point instructions. Adding a | |
2761 | directive to create extended precision numbers would not be | |
47342e8f | 2762 | hard, but it has not yet seemed necessary. |
93b45514 | 2763 | |
47342e8f | 2764 | @section Machine Directives |
93b45514 RP |
2765 | In order to be compatible with the Sun assembler, the 680x0 assembler | |
2766 | understands the following directives. | |
2767 | @table @code | |
2768 | @item .data1 | |
2769 | This directive is identical to a @code{.data 1} directive. | |
2770 | @item .data2 | |
2771 | This directive is identical to a @code{.data 2} directive. | |
2772 | @item .even | |
2773 | This directive is identical to a @code{.align 1} directive. | |
2774 | @c Is this true? does it work??? | |
2775 | @item .skip | |
2776 | This directive is identical to a @code{.space} directive. | |
2777 | @end table | |
2778 | ||
47342e8f RP |
2779 | @section Opcodes |
2780 | @c [email protected]: I don't see any point in the following | |
2781 | @c paragraph. Bugs are bugs; how does saying this | |
2782 | @c help anyone? | |
09352a5d | 2783 | @ignore |
93b45514 RP |
2784 | Danger: Several bugs have been found in the opcode table (and |
2785 | fixed). More bugs may exist. Be careful when using obscure | |
2786 | instructions. | |
09352a5d | 2787 | @end ignore |
47342e8f RP |
2788 | |
2789 | @subsection Branch Improvement | |
2790 | ||
2791 | Certain pseudo opcodes are permitted for branch instructions. | |
2792 | They expand to the shortest branch instruction that will reach the | |
2793 | target. Generally these mnemonics are made by substituting @samp{j} for | |
2794 | @samp{b} at the start of a Motorola mnemonic. | |
2795 | ||
2796 | The following table summarizes the pseudo-operations. A @code{*} flags | |
2797 | cases that are more fully described after the table: | |
2798 | ||
2799 | @example | |
2800 |                 Displacement | |
2801 |           +--------------------------------------------------------- | |
2802 |           |                68020   68000/10 | |
2803 | Pseudo-Op |BYTE    WORD    LONG    LONG      non-PC relative | |
2804 |           +--------------------------------------------------------- | |
2805 |      jbsr |bsrs    bsr     bsrl    jsr       jsr | |
2806 |       jra |bras    bra     bral    jmp       jmp | |
2807 | *     jXX |bXXs    bXX     bXXl    bNXs;jmpl bNXs;jmp | |
2808 | *    dbXX |dbXX    dbXX        dbXX; bra; jmpl | |
2809 | *    fjXX |fbXXw   fbXXw   fbXXl             fbNXw;jmp | |
2810 | ||
2811 | XX: condition | |
2812 | NX: negative of condition XX | |
2813 | ||
2814 | @end example | |
2815 | @center{@code{*}---see full description below} | |
2816 | ||
2817 | @table @code | |
2818 | @item jbsr | |
2819 | @itemx jra | |
2820 | These are the simplest jump pseudo-operations; they always map to one | |
2821 | particular machine instruction, depending on the displacement to the | |
2822 | branch target. | |
2823 | ||
2824 | @item j@var{XX} | |
2825 | Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations, | |
2826 | where @var{XX} is a conditional branch or condition-code test. The full | |
2827 | list of pseudo-ops in this family is: | |
2828 | @example | |
2829 | jhi jls jcc jcs jne jeq jvc | |
2830 | jvs jpl jmi jge jlt jgt jle | |
2831 | @end example | |
93b45514 | 2832 | |
47342e8f RP |
2833 | For the cases of non-PC relative displacements and long displacements on |
2834 | the 68000 or 68010, @code{as} will issue a longer code fragment in terms of | |
2835 | @var{NX}, the opposite condition to @var{XX}: | |
2836 | @example | |
2837 | j@var{XX} foo | |
2838 | @end example | |
2839 | gives | |
2840 | @example | |
2841 | b@var{NX}s oof | |
2842 | jmp foo | |
2843 | oof: | |
2844 | @end example | |
93b45514 | 2845 | |
47342e8f RP |
2846 | @item db@var{XX} |
2847 | The full family of pseudo-operations covered here is | |
2848 | @example | |
2849 | dbhi dbls dbcc dbcs dbne dbeq dbvc | |
2850 | dbvs dbpl dbmi dbge dblt dbgt dble | |
2851 | dbf dbra dbt | |
2852 | @end example | |
2853 | ||
2854 | Other than for word and byte displacements, when the source reads | |
2855 | @samp{db@var{XX} foo}, @code{as} will emit | |
2856 | @example | |
2857 | db@var{XX} oo1 | |
2858 | bra oo2 | |
2859 | oo1:jmpl foo | |
2860 | oo2: | |
2861 | @end example | |
2862 | ||
2863 | @item fj@var{XX} | |
2864 | This family includes | |
2865 | @example | |
2866 | fjne fjeq fjge fjlt fjgt fjle fjf | |
2867 | fjt fjgl fjgle fjnge fjngl fjngle fjngt | |
2868 | fjnle fjnlt fjoge fjogl fjogt fjole fjolt | |
2869 | fjor fjseq fjsf fjsne fjst fjueq fjuge | |
2870 | fjugt fjule fjult fjun | |
2871 | @end example | |
2872 | ||
2873 | For branch targets that are not PC relative, @code{as} emits | |
2874 | @example | |
2875 | fb@var{NX} oof | |
2876 | jmp foo | |
2877 | oof: | |
2878 | @end example | |
2879 | when it encounters @samp{fj@var{XX} foo}. | |
2880 | ||
2881 | @end table | |
2882 | ||
2883 | @subsection Special Characters | |
93b45514 RP |
2884 | The immediate character is @samp{#} for Sun compatibility. The |
2885 | line-comment character is @samp{|}. If a @samp{#} appears at the | |
2886 | beginning of a line, it is treated as a comment unless it looks like | |
2887 | @samp{# line file}, in which case it is treated normally. | |
09352a5d | 2888 | _fi__(_M680X0__) |
93b45514 | 2889 | |
09352a5d | 2890 | @c [email protected]: conditionalize, rather than ignore, when filled in. |
47342e8f | 2891 | @ignore |
93b45514 | 2892 | @section 32x32 |
47342e8f | 2893 | @section Options |
93b45514 RP |
2894 | The 32x32 version of @code{as} accepts a @kbd{-m32032} option to |
2895 | specify that it is assembling for a 32032 processor, or a | |
2896 | @kbd{-m32532} to specify that it is assembling for a 32532 processor. | |
2897 | The default (if neither is specified) is chosen when the assembler | |
2898 | is compiled. | |
2899 | ||
2900 | @subsection Syntax | |
2901 | I don't know anything about the 32x32 syntax assembled by | |
2902 | @code{as}. Someone who understands the processor (I've never seen | |
2903 | one) and the possible syntaxes should write this section. | |
2904 | ||
2905 | @subsection Floating Point | |
2906 | The 32x32 uses IEEE floating point numbers, but @code{as} will only | |
2907 | create single or double precision values. I don't know if the 32x32 | |
2908 | understands extended precision numbers. | |
2909 | ||
2910 | @subsection Machine Directives | |
2911 | The 32x32 has no machine dependent directives. | |
09352a5d | 2912 | @end ignore |
93b45514 | 2913 | |
09352a5d RP |
2914 | @c [email protected]: stop ignoring this when "syntax" section filled in |
2915 | @ignore | |
2916 | _if__(_SPARC__ && !_ALL_ARCH__) | |
2917 | @chapter Machine Dependent Features: SPARC | |
2918 | _fi__(_SPARC__ && !_ALL_ARCH__) | |
93b45514 RP |
2919 | @section Sparc |
2920 | @subsection Options | |
2921 | The Sparc has no machine dependent options. | |
2922 | |
2923 | @subsection Syntax | |
2924 | I don't know anything about Sparc syntax. Someone who does | |
2925 | will have to write this section. | |
2926 | ||
2927 | @subsection Floating Point | |
2928 | The Sparc uses IEEE floating-point numbers. | |
2929 | ||
2930 | @subsection Machine Directives | |
2931 | The Sparc version of @code{as} supports the following additional | |
2932 | machine directives: | |
2933 | ||
2934 | @table @code | |
2935 | @item .common | |
2936 | This must be followed by a symbol name, a positive number, and | |
2937 | @code{"bss"}. This behaves somewhat like @code{.comm}, but the | |
2938 | syntax is different. | |
2939 | ||
2940 | @item .global | |
2941 | This is functionally identical to @code{.globl}. | |
2942 | ||
2943 | @item .half | |
2944 | This is functionally identical to @code{.short}. | |
2945 | ||
2946 | @item .proc | |
2947 | This directive is ignored. Any text following it on the same | |
2948 | line is also ignored. | |
2949 | ||
2950 | @item .reserve | |
2951 | This must be followed by a symbol name, a positive number, and | |
2952 | @code{"bss"}. This behaves somewhat like @code{.lcomm}, but the | |
2953 | syntax is different. | |
2954 | ||
2955 | @item .seg | |
2956 | This must be followed by @code{"text"}, @code{"data"}, or | |
2957 | @code{"data1"}. It behaves like @code{.text}, @code{.data}, or | |
2958 | @code{.data 1}. | |
2959 | ||
2960 | @item .skip | |
2961 | This is functionally identical to the @code{.space} directive. | |
2962 | ||
2963 | @item .word | |
2964 | On the Sparc, the @code{.word} directive produces 32-bit values, | |
2965 | instead of the 16-bit values it produces on every other machine. | |
2966 | ||
2967 | @end table | |
09352a5d | 2968 | @end ignore |
93b45514 | 2969 | |
09352a5d RP |
2970 | _if__(_I80386__ && !_ALL_ARCH__) |
2971 | @chapter Machine Dependent Features: Intel 80386 | |
2972 | _fi__(_I80386__ && !_ALL_ARCH__) | |
2973 | _if__(_I80386__) | |
93b45514 RP |
2974 | @section Intel 80386 |
2975 | @subsection Options | |
2976 | The 80386 has no machine dependent options. | |
2977 | ||
2978 | @subsection AT&T Syntax versus Intel Syntax | |
2979 | In order to maintain compatibility with the output of @code{GCC}, | |
2980 | @code{as} supports AT&T System V/386 assembler syntax. This is quite | |
2981 | different from Intel syntax. We mention these differences because | |
2982 | almost all 80386 documents use only Intel syntax. Notable differences | |
2983 | between the two syntaxes are: | |
2984 | @itemize @bullet | |
2985 | @item | |
2986 | AT&T immediate operands are preceded by @samp{$}; Intel immediate | |
2987 | operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}). | |
2988 | AT&T register operands are preceded by @samp{%}; Intel register operands | |
2989 | are undelimited. AT&T absolute (as opposed to PC relative) jump/call | |
2990 | operands are prefixed by @samp{*}; they are undelimited in Intel syntax. | |
2991 | ||
2992 | @item | |
2993 | AT&T and Intel syntax use the opposite order for source and destination | |
2994 | operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax}. The | |
2995 | @samp{source, dest} convention is maintained for compatibility with | |
2996 | previous Unix assemblers. | |
2997 | ||
2998 | @item | |
2999 | In AT&T syntax the size of memory operands is determined from the last | |
3000 | character of the opcode name. Opcode suffixes of @samp{b}, @samp{w}, | |
3001 | and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit) | |
3002 | memory references. Intel syntax accomplishes this by prefixing memory | |
3003 | operands (@emph{not} the opcodes themselves) with @samp{byte ptr}, | |
3004 | @samp{word ptr}, and @samp{dword ptr}. Thus, Intel @samp{mov al, byte | |
3005 | ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax. | |
3006 | ||
3007 | @item | |
3008 | Immediate form long jumps and calls are | |
3009 | @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the | |
3010 | Intel syntax is | |
3011 | @samp{call/jmp far @var{segment}:@var{offset}}. Also, the far return | |
3012 | instruction | |
3013 | is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is | |
3014 | @samp{ret far @var{stack-adjust}}. | |
3015 | ||
3016 | @item | |
3017 | The AT&T assembler does not provide support for multiple segment | |
3018 | programs. Unix style systems expect all programs to be single segments. | |
3019 | @end itemize | |
3020 | ||
3021 | @subsection Opcode Naming | |
3022 | Opcode names are suffixed with one character modifiers which specify the | |
3023 | size of operands. The letters @samp{b}, @samp{w}, and @samp{l} specify | |
3024 | byte, word, and long operands. If no suffix is specified by an | |
3025 | instruction and it contains no memory operands then @code{as} tries to | |
3026 | fill in the missing suffix based on the destination register operand | |
3027 | (the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent | |
3028 | to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to | |
3029 | @samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix | |
3030 | assembler which assumes that a missing opcode suffix implies long | |
3031 | operand size. (This incompatibility does not affect compiler output | |
3032 | since compilers always explicitly specify the opcode suffix.) | |
3033 | ||
3034 | Almost all opcodes have the same names in AT&T and Intel format. There | |
3035 | are a few exceptions. The sign extend and zero extend instructions need | |
3036 | two sizes to specify them. They need a size to sign/zero extend | |
3037 | @emph{from} and a size to sign/zero extend @emph{to}. This is accomplished | |
3038 | by using two opcode suffixes in AT&T syntax. Base names for sign extend | |
3039 | and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T | |
3040 | syntax (@samp{movsx} and @samp{movzx} in Intel syntax). The opcode | |
3041 | suffixes are tacked on to this base name, the @emph{from} suffix before | |
3042 | the @emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for | |
3043 | ``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes, | |
3044 | thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word), | |
3045 | and @samp{wl} (from word to long). | |
3046 | ||
3047 | The Intel syntax conversion instructions | |
3048 | @itemize @bullet | |
3049 | @item | |
3050 | @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax}, | |
3051 | @item | |
3052 | @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax}, | |
3053 | @item | |
3054 | @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax}, | |
3055 | @item | |
3056 | @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax}, | |
3057 | @end itemize | |
3058 | are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in | |
3059 | AT&T naming. @code{as} accepts either naming for these instructions. | |
3060 | ||
3061 | Far call/jump instructions are @samp{lcall} and @samp{ljmp} in | |
3062 | AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel | |
3063 | convention. | |
3064 | ||
3065 | @subsection Register Naming | |
3066 | Register operands are always prefixed with @samp{%}. The 80386 registers | |
3067 | consist of | |
3068 | @itemize @bullet | |
3069 | @item | |
3070 | the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx}, | |
3071 | @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the | |
3072 | frame pointer), and @samp{%esp} (the stack pointer). | |
3073 | ||
3074 | @item | |
3075 | the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx}, | |
3076 | @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}. | |
3077 | ||
3078 | @item | |
3079 | the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh}, | |
3080 | @samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (These | |
3081 | are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx}, | |
3082 | @samp{%cx}, and @samp{%dx}) | |
3083 | ||
3084 | @item | |
3085 | the 6 segment registers @samp{%cs} (code segment), @samp{%ds} | |
3086 | (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs}, | |
3087 | and @samp{%gs}. | |
3088 | ||
3089 | @item | |
3090 | the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and | |
3091 | @samp{%cr3}. | |
3092 | ||
3093 | @item | |
3094 | the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2}, | |
3095 | @samp{%db3}, @samp{%db6}, and @samp{%db7}. | |
3096 | ||
3097 | @item | |
3098 | the 2 test registers @samp{%tr6} and @samp{%tr7}. | |
3099 | ||
3100 | @item | |
3101 | the 8 floating point stack registers @samp{%st} or equivalently | |
3102 | @samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)}, | |
3103 | @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}. | |
3104 | @end itemize | |
3105 | ||
3106 | @subsection Opcode Prefixes | |
3107 | Opcode prefixes are used to modify the following opcode. They are used | |
3108 | to repeat string instructions, to provide segment overrides, to perform | |
3109 | bus lock operations, and to give operand and address size (16-bit | |
3110 | operands are specified in an instruction by prefixing what would | |
3111 | normally be 32-bit operands with an ``operand size'' opcode prefix). | |
3112 | Opcode prefixes are usually given as single-line instructions with no | |
3113 | operands, and must directly precede the instruction they act upon. For | |
3114 | example, the @samp{scas} (scan string) instruction is repeated with: | |
3115 | @example | |
3116 | repne | |
3117 | scas | |
3118 | @end example | |
3119 | ||
3120 | Here is a list of opcode prefixes: | |
3121 | @itemize @bullet | |
3122 | @item | |
3123 | Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es}, | |
3124 | @samp{fs}, @samp{gs}. These are added automatically by | |
3125 | using the @var{segment}:@var{memory-operand} form for memory references. | |
3126 | ||
3127 | @item | |
3128 | Operand/Address size prefixes @samp{data16} and @samp{addr16} | |
3129 | change 32-bit operands/addresses into 16-bit operands/addresses. Note | |
3130 | that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes) | |
3131 | are not supported (yet). | |
3132 | ||
3133 | @item | |
3134 | The bus lock prefix @samp{lock} inhibits interrupts during | |
3135 | execution of the instruction it precedes. (This is only valid with | |
3136 | certain instructions; see an 80386 manual for details). | |
3137 | ||
3138 | @item | |
3139 | The wait for coprocessor prefix @samp{wait} waits for the | |
3140 | coprocessor to complete the current instruction. This should never be | |
3141 | needed for the 80386/80387 combination. | |
3142 | ||
3143 | @item | |
3144 | The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added | |
3145 | to string instructions to make them repeat @samp{%ecx} times. | |
3146 | @end itemize | |
3147 | ||
3148 | @subsection Memory References | |
3149 | An Intel syntax indirect memory reference of the form | |
3150 | @example | |
3151 | @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}] | |
3152 | @end example | |
3153 | is translated into the AT&T syntax | |
3154 | @example | |
3155 | @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale}) | |
3156 | @end example | |
3157 | where @var{base} and @var{index} are the optional 32-bit base and | |
3158 | index registers, @var{disp} is the optional displacement, and | |
3159 | @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index} | |
3160 | to calculate the address of the operand. If no @var{scale} is | |
3161 | specified, @var{scale} is taken to be 1. @var{segment} specifies the | |
3162 | optional segment register for the memory operand, and may override the | |
3163 | default segment register (see an 80386 manual for segment register | |
3164 | defaults). Note that segment overrides in AT&T syntax @emph{must} | |
3165 | be preceded by a @samp{%}. If you specify a segment override which | |
3166 | coincides with the default segment register, @code{as} will @emph{not} | |
3167 | output any segment register override prefixes to assemble the given | |
3168 | instruction. Thus, segment overrides can be specified to emphasize which | |
3169 | segment register is used for a given memory operand. | |
3170 | ||
3171 | Here are some examples of Intel and AT&T style memory references: | |
3172 | @table @asis | |
3173 | ||
3174 | @item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]} | |
3175 | @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is | |
3176 | missing, and the default segment is used (@samp{%ss} for addressing with | |
3177 | @samp{%ebp} as the base register). @var{index}, @var{scale} are both missing. | |
3178 | ||
3179 | @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]} | |
3180 | @var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is | |
3181 | @samp{foo}. All other fields are missing. The segment register here | |
3182 | defaults to @samp{%ds}. | |
3183 | ||
3184 | @item AT&T: @samp{foo(,1)}; Intel @samp{[foo]} | |
3185 | This uses the value pointed to by @samp{foo} as a memory operand. | |
3186 | Note that @var{base} and @var{index} are both missing, but there is only | |
3187 | @emph{one} @samp{,}. This is a syntactic exception. | |
3188 | ||
3189 | @item AT&T: @samp{%gs:foo}; Intel @samp{gs:foo} | |
3190 | This selects the contents of the variable @samp{foo} with segment | |
3191 | register @var{segment} being @samp{%gs}. | |
3192 | ||
3193 | @end table | |
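The way @var{base}, @var{index}, @var{scale} and @var{disp} combine can be sketched in C. This is only an illustration of the arithmetic described above, not code taken from @code{as}; the function name @code{effective_offset} is hypothetical, a missing @var{base} or @var{index} is modelled as zero, and the segment register is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of as): the 32-bit offset denoted by
   disp(base,index,scale).  Pass 0 for a missing base or index and 1 for
   a missing scale.  Segment selection is not modelled here.  */
static uint32_t
effective_offset (uint32_t base, uint32_t index, unsigned scale, int32_t disp)
{
  /* Only these four scale factors are allowed.  */
  assert (scale == 1 || scale == 2 || scale == 4 || scale == 8);

  /* Unsigned arithmetic wraps modulo 2^32, like 32-bit addressing.  */
  return base + index * scale + (uint32_t) disp;
}
```

With these assumptions, @samp{-4(%ebp)} corresponds to @code{effective_offset (ebp, 0, 1, -4)} and @samp{foo(,%eax,4)} to @code{effective_offset (0, eax, 4, foo)}.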
3194 | ||
3195 | Absolute (as opposed to PC relative) call and jump operands must be | |
3196 | prefixed with @samp{*}. If no @samp{*} is specified, @code{as} will | |
3197 | always choose PC relative addressing for jump/call labels. | |
3198 | ||
3199 | Any instruction that has a memory operand @emph{must} specify its size (byte, | |
3200 | word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l}, | |
3201 | respectively). | |
3202 | ||
3203 | @subsection Handling of Jump Instructions | |
3204 | Jump instructions are always optimized to use the smallest possible | |
3205 | displacements. This is accomplished by using byte (8-bit) displacement | |
3206 | jumps whenever the target is sufficiently close. If a byte displacement | |
3207 | is insufficient, a long (32-bit) displacement is used. We do not support | |
3208 | word (16-bit) displacement jumps (i.e. prefixing the jump instruction | |
3209 | with the @samp{addr16} opcode prefix), since the 80386 insists upon masking | |
3210 | @samp{%eip} to 16 bits after the word displacement is added. | |
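The size choice described above amounts to a range check. The following C sketch is a simplification under stated assumptions: the name @code{jump_disp_size} is hypothetical, and the displacement is taken as already known, whereas in a real assembler it depends on the sizes chosen for the other jumps in between.

```c
#include <stdint.h>

/* Hypothetical helper (not part of as): number of bytes used for a jump
   displacement.  A byte displacement is used when the value fits in a
   signed 8-bit field; otherwise a long (32-bit) displacement is used.
   Word (16-bit) displacements are never chosen.  */
static int
jump_disp_size (int32_t disp)
{
  if (disp >= INT8_MIN && disp <= INT8_MAX)
    return 1;
  return 4;
}
```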
3211 | ||
3212 | Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz}, | |
3213 | @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in | |
3214 | byte displacements, so using these instructions (@code{GCC} does not | |
3215 | use them) with a target that is out of byte range may cause the assembler to | |
3216 | print an error message (and generate incorrect code). The AT&T 80386 | |
3217 | assembler tries to get around this problem by expanding @samp{jcxz foo} to | |
3218 | @example | |
3219 | jcxz cx_zero | |
3220 | jmp cx_nonzero | |
3221 | cx_zero: jmp foo | |
3222 | cx_nonzero: | |
3223 | @end example | |
3224 | ||
3225 | @subsection Floating Point | |
3226 | All 80387 floating point types except packed BCD are supported. | |
3227 | (BCD support may be added without much difficulty). These data | |
3228 | types are 16-, 32-, and 64-bit integers, and single (32-bit), | |
3229 | double (64-bit), and extended (80-bit) precision floating point. | |
3230 | Each supported type has an opcode suffix and a constructor | |
3231 | associated with it. Opcode suffixes specify the operands' data | |
3232 | types. Constructors build these data types into memory. | |
3233 | ||
3234 | @itemize @bullet | |
3235 | @item | |
3236 | Floating point constructors are @samp{.float} or @samp{.single}, | |
3237 | @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats. | |
3238 | These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}. | |
3239 | @samp{t} stands for temporary real; the 80387 supports | |
3240 | this format only via the @samp{fldt} (load temporary real to stack top) and | |
3241 | @samp{fstpt} (store temporary real and pop stack) instructions. | |
3242 | ||
3243 | @item | |
3244 | Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and | |
3245 | @samp{.quad} for the 16-, 32-, and 64-bit integer formats. The corresponding | |
3246 | opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q} | |
3247 | (quad). As with the temporary real format, the 64-bit @samp{q} format is | |
3248 | only present in the @samp{fildq} (load quad integer to stack top) and | |
3249 | @samp{fistpq} (store quad integer and pop stack) instructions. | |
3250 | @end itemize | |
3251 | ||
3252 | Register to register operations do not require opcode suffixes, | |
3253 | so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}. | |
3254 | ||
3255 | Since the 80387 automatically synchronizes with the 80386, @samp{fwait} | |
3256 | instructions are almost never needed (this is not the case for the | |
b50e59fe | 3257 | 80286/80287 and 8086/8087 combinations). Therefore, @code{as} suppresses |
93b45514 RP |
3258 | the @samp{fwait} instruction whenever it is implicitly selected by one |
3259 | of the @samp{fn@dots{}} instructions. For example, @samp{fsave} and | |
3260 | @samp{fnsave} are treated identically. In general, all the @samp{fn@dots{}} | |
3261 | instructions are made equivalent to @samp{f@dots{}} instructions. If | |
3262 | @samp{fwait} is desired, it must be explicitly coded. | |
3263 | ||
3264 | @subsection Notes | |
3265 | There is some trickery concerning the @samp{mul} and @samp{imul} | |
3266 | instructions that deserves mention. The 16-, 32-, and 64-bit expanding | |
3267 | multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5 | |
3268 | for @samp{imul}) can be output only in the one operand form. Thus, | |
3269 | @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply; | |
3270 | the expanding multiply would clobber the @samp{%edx} register, and this | |
3271 | would confuse @code{GCC} output. Use @samp{imul %ebx} to get the | |
3272 | 64-bit product in @samp{%edx:%eax}. | |
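The difference between the expanding (one operand) multiply and the two operand form can be modelled in C for the 32-bit case. This is an illustrative sketch only; the names @code{imul_expanding}, @code{imul_truncating} and @code{struct edx_eax} are hypothetical, and processor flags are not modelled.

```c
#include <stdint.h>

/* Hypothetical model of the register pair written by `imul %ebx'.  */
struct edx_eax
{
  uint32_t edx;                 /* high 32 bits of the product */
  uint32_t eax;                 /* low 32 bits of the product */
};

/* One operand form: the full signed 64-bit product of %eax and the
   operand lands in %edx:%eax, so %edx is clobbered.  */
static struct edx_eax
imul_expanding (int32_t eax, int32_t ebx)
{
  int64_t product = (int64_t) eax * (int64_t) ebx;
  struct edx_eax r;

  r.eax = (uint32_t) product;
  r.edx = (uint32_t) ((uint64_t) product >> 32);
  return r;
}

/* Two operand form (`imul %ebx, %eax'): only the low 32 bits of the
   product are kept, and %edx is left alone.  */
static uint32_t
imul_truncating (int32_t eax, int32_t ebx)
{
  return (uint32_t) ((int64_t) eax * (int64_t) ebx);
}
```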
3273 | ||
3274 | We have added a two operand form of @samp{imul} when the first operand | |
3275 | is an immediate mode expression and the second operand is a register. | |
3276 | This is just a shorthand, so that multiplying @samp{%eax} by 69, for | |
3277 | example, can be done with @samp{imul $69, %eax} rather than @samp{imul | |
3278 | $69, %eax, %eax}. | |
09352a5d RP |
3279 | _fi__(_I80386__) |
3280 | ||
3281 | ||
3282 | @c [email protected]: we ignore the following chapters, since internals are | |
3283 | @c changing rapidly. These may need to be moved to another | |
47342e8f RP |
3284 | @c book anyhow, if we adopt the model of user/modifier |
3285 | @c books. | |
3286 | @ignore | |
b50e59fe | 3287 | @node Maintenance, Retargeting, Machine Dependent, Top |
93b45514 RP |
3288 | @chapter Maintaining the Assembler |
3289 | [[this chapter is still being built]] | |
3290 | ||
3291 | @section Design | |
3292 | We had these goals, in descending priority: | |
3293 | @table @b | |
3294 | @item Accuracy. | |
3295 | For every program composed by a compiler, @code{as} should emit | |
3296 | ``correct'' code. This leaves some latitude in choosing addressing | |
3297 | modes, order of @code{relocation_info} structures in the object | |
47342e8f | 3298 | file, @emph{etc}. |
93b45514 RP |
3299 | |
3300 | @item Speed, for usual case. | |
3301 | By far the most common use of @code{as} will be assembling compiler | |
3302 | emissions. | |
3303 | ||
3304 | @item Upward compatibility for existing assembler code. | |
3305 | Well @dots{} we don't support Vax bit fields but everything else | |
3306 | seems to be upward compatible. | |
3307 | ||
3308 | @item Readability. | |
3309 | The code should be maintainable with few surprises. (JF: ha!) | |
3310 | ||
3311 | @end table | |
3312 | ||
3313 | We assumed that disk I/O was slow and expensive while memory was | |
3314 | fast and access to memory was cheap. We expect the in-memory data | |
3315 | structures to be less than 10 times the size of the emitted object | |
3316 | file. (Contrast this with the C compiler where in-memory structures | |
3317 | might be 100 times object file size!) | |
3318 | This suggests: | |
3319 | @itemize @bullet | |
3320 | @item | |
3321 | Try to read the source file from disk only one time. For other | |
3322 | reasons, we keep large chunks of the source file in memory during | |
3323 | assembly so this is not a problem. Also the assembly algorithm | |
3324 | should only scan the source text once if the compiler composed the | |
3325 | text according to a few simple rules. | |
3326 | @item | |
3327 | Emit the object code bytes only once. Don't store values and then | |
3328 | backpatch later. | |
3329 | @item | |
3330 | Build the object file in memory and do direct writes to disk of | |
3331 | large buffers. | |
3332 | @end itemize | |
3333 | ||
3334 | RMS suggested a one-pass algorithm which seems to work well. By not | |
3335 | parsing text during a second pass considerable time is saved on | |
47342e8f | 3336 | large programs (@emph{e.g.} the sort of C program @code{yacc} would |
93b45514 RP |
3337 | emit). |
3338 | ||
3339 | It happened that the data structures needed to emit relocation | |
3340 | information to the object file were neatly subsumed into the data | |
3341 | structures that do backpatching of addresses after pass 1. | |
3342 | ||
3343 | Many of the functions began life as re-usable modules, loosely | |
3344 | connected. RMS changed this to gain speed. For example, input | |
3345 | parsing routines which used to work on pre-sanitized strings now | |
3346 | must parse raw data. Hence they have to import knowledge of the | |
47342e8f | 3347 | assembler's comment conventions @emph{etc}. |
93b45514 RP |
3348 | |
3349 | @section Deprecated Feature(?)s | |
3350 | We have stopped supporting some features: | |
3351 | @itemize @bullet | |
3352 | @item | |
3353 | @code{.org} statements must have @b{defined} expressions. | |
3354 | @item | |
3355 | Vax Bit fields (@kbd{:} operator) are entirely unsupported. | |
3356 | @end itemize | |
3357 | ||
3358 | It might be a good idea to not support these features in a future release: | |
3359 | @itemize @bullet | |
3360 | @item | |
3361 | @kbd{#} should begin a comment, even in column 1. | |
3362 | @item | |
3363 | Why support the logical line & file concept any more? | |
3364 | @item | |
3365 | Subsegments are a good candidate for flushing. | |
3366 | Depends on which compilers need them I guess. | |
3367 | @end itemize | |
3368 | ||
3369 | @section Bugs, Ideas, Further Work | |
3370 | Clearly the major improvement is DON'T USE A TEXT-READING | |
3371 | ASSEMBLER for the back end of a compiler. It is much faster to | |
3372 | interpret binary gobbledygook from a compiler's tables than to | |
3373 | ask the compiler to write out human-readable code just so the | |
3374 | assembler can parse it back to binary. | |
3375 | ||
3376 | Assuming you use @code{as} for human written programs: here are | |
3377 | some ideas: | |
3378 | @itemize @bullet | |
3379 | @item | |
3380 | Document (here) @code{APP}. | |
3381 | @item | |
3382 | Take advantage of knowing no spaces except after opcode | |
3383 | to speed up @code{as}. (Modify @code{app.c} to flush useless spaces: | |
3384 | only keep space/tabs at begin of line or between 2 | |
3385 | symbols.) | |
3386 | @item | |
3387 | Put pointers in this documentation to @file{a.out} documentation. | |
3388 | @item | |
3389 | Split the assembler into parts so it can gobble direct binary | |
47342e8f | 3390 | from @emph{e.g.} @code{cc}. It is silly for @code{cc} to compose text |
93b45514 RP |
3391 | just so @code{as} can parse it back to binary. |
3392 | @item | |
3393 | Rewrite hash functions: I want a more modular, faster library. | |
3394 | @item | |
3395 | Clean up LOTS of code. | |
3396 | @item | |
3397 | Include all the non-@file{.c} files in the maintenance chapter. | |
3398 | @item | |
3399 | Document flonums. | |
3400 | @item | |
3401 | Implement flonum short literals. | |
3402 | @item | |
3403 | Change all talk of expression operands to expression quantities, | |
47342e8f | 3404 | or perhaps to expression arguments. |
93b45514 RP |
3405 | @item |
3406 | Implement pass 2. | |
3407 | @item | |
3408 | Whenever a @code{.text} or @code{.data} statement is seen, we close | |
3409 | off the current frag with an imaginary @code{.fill 0}. This is | |
3410 | because we only have one obstack for frags, and we can't grow new | |
3411 | frags for a new subsegment, then go back to the old subsegment and | |
3412 | append bytes to the old frag. All this nonsense goes away if we | |
3413 | give each subsegment its own obstack. It makes code simpler in | |
3414 | about 10 places, but nobody has bothered to do it because C compiler | |
3415 | output rarely changes subsegments (compared to ending frags with | |
3416 | relaxable addresses, which is common). | |
3417 | @end itemize | |
3418 | ||
3419 | @section Sources | |
3420 | @c The following files in the @file{as} directory | |
3421 | @c are symbolic links to other files, of | |
3422 | @c the same name, in a different directory. | |
3423 | @c @itemize @bullet | |
3424 | @c @item | |
3425 | @c @file{atof_generic.c} | |
3426 | @c @item | |
3427 | @c @file{atof_vax.c} | |
3428 | @c @item | |
3429 | @c @file{flonum_const.c} | |
3430 | @c @item | |
3431 | @c @file{flonum_copy.c} | |
3432 | @c @item | |
3433 | @c @file{flonum_get.c} | |
3434 | @c @item | |
3435 | @c @file{flonum_multip.c} | |
3436 | @c @item | |
3437 | @c @file{flonum_normal.c} | |
3438 | @c @item | |
3439 | @c @file{flonum_print.c} | |
3440 | @c @end itemize | |
3441 | ||
3442 | Here is a list of the source files in the @file{as} directory. | |
3443 | ||
3444 | @table @file | |
3445 | @item app.c | |
3446 | This contains the pre-processing phase, which deletes comments, | |
3447 | handles whitespace, etc. This was recently re-written, since app | |
3448 | used to be a separate program, but RMS wanted it to be inline. | |
3449 | ||
3450 | @item append.c | |
3451 | This is a subroutine to append a string to another string returning a | |
3452 | pointer just after the last @code{char} appended. (JF: All these | |
3453 | little routines should probably all be put in one file.) | |
3454 | ||
3455 | @item as.c | |
3456 | Here you will find the main program of the assembler @code{as}. | |
3457 | ||
3458 | @item expr.c | |
3459 | This is a branch office of @file{read.c}. This understands | |
47342e8f RP |
3460 | expressions and arguments. Inside @code{as}, arguments are called | |
3461 | (expression) @emph{operands}. This is confusing, because we also talk | |
3462 | (elsewhere) about instruction @emph{operands}. Also, expression | |
3463 | operands are called @emph{quantities} explicitly to avoid confusion | |
93b45514 RP |
3464 | with instruction operands. What a mess. |
3465 | ||
3466 | @item frags.c | |
3467 | This implements the @b{frag} concept. Without frags, finding the | |
3468 | right size for branch instructions would be a lot harder. | |
3469 | ||
3470 | @item hash.c | |
47342e8f | 3471 | This contains the symbol table, opcode table @emph{etc.} hashing |
93b45514 RP |
3472 | functions. |
3473 | ||
3474 | @item hex_value.c | |
3475 | This is a table of values of digits, for use in atoi() type | |
3476 | functions. Could probably be flushed by using calls to strtol(), or | |
3477 | something similar. | |
3478 | ||
3479 | @item input-file.c | |
3480 | This contains operating system dependent source file reading | |
3481 | routines. Since error messages often say where we are in reading | |
3482 | the source file, they live here too. Since @code{as} is intended to | |
3483 | run under GNU and Unix only, this might be worth flushing. Anyway, | |
3484 | almost all C compilers support stdio. | |
3485 | ||
3486 | @item input-scrub.c | |
3487 | This deals with calling the pre-processor (if needed) and feeding the | |
3488 | chunks back to the rest of the assembler the right way. | |
3489 | ||
3490 | @item messages.c | |
3491 | This contains operating system independent parts of fatal and | |
3492 | warning message reporting. See @file{append.c} above. | |
3493 | ||
3494 | @item output-file.c | |
3495 | This contains operating system dependent functions that write an | |
3496 | object file for @code{as}. See @file{input-file.c} above. | |
3497 | ||
3498 | @item read.c | |
3499 | This implements all the directives of @code{as}. This also deals | |
3500 | with passing input lines to the machine dependent part of the | |
3501 | assembler. | |
3502 | ||
3503 | @item strstr.c | |
3504 | This is a C library function that isn't in most C libraries yet. | |
3505 | See @file{append.c} above. | |
3506 | ||
3507 | @item subsegs.c | |
3508 | This implements subsegments. | |
3509 | ||
3510 | @item symbols.c | |
3511 | This implements symbols. | |
3512 | ||
3513 | @item write.c | |
3514 | This contains the code to perform relaxation, and to write out | |
3515 | the object file. It is mostly operating system independent, but | |
3516 | different OSes have different object file formats in any case. | |
3517 | ||
3518 | @item xmalloc.c | |
3519 | This implements @code{malloc()} or bust. See @file{append.c} above. | |
3520 | ||
3521 | @item xrealloc.c | |
3522 | This implements @code{realloc()} or bust. See @file{append.c} above. | |
3523 | ||
3524 | @item atof-generic.c | |
3525 | The following files were taken from a machine-independent subroutine | |
3526 | library for manipulating floating point numbers and very large | |
3527 | integers. | |
3528 | ||
3529 | @file{atof-generic.c} turns a string into a flonum internal format | |
3530 | floating-point number. | |
3531 | ||
3532 | @item flonum-const.c | |
3533 | This contains some potentially useful floating point numbers in | |
3534 | flonum format. | |
3535 | ||
3536 | @item flonum-copy.c | |
3537 | This copies a flonum. | |
3538 | ||
3539 | @item flonum-multip.c | |
3540 | This multiplies two flonums together. | |
3541 | ||
3542 | @item bignum-copy.c | |
3543 | This copies a bignum. | |
3544 | ||
3545 | @end table | |
3546 | ||
3547 | Here is a table of all the machine-specific files (this includes | |
3548 | both source and header files). Typically, there is a | |
3549 | @var{machine}.c file, a @var{machine}-opcode.h file, and an | |
3550 | atof-@var{machine}.c file. The @var{machine}-opcode.h file should | |
3551 | be identical to the one used by GDB (which uses it for disassembly.) | |
3552 | ||
3553 | @table @file | |
3554 | ||
3555 | @item atof-ieee.c | |
3556 | This contains code to turn a flonum into an IEEE literal constant. | |
3557 | This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}. | |
3558 | ||
3559 | @item i386-opcode.h | |
3560 | This is the opcode-table for the i386 version of the assembler. | |
3561 | ||
3562 | @item i386.c | |
3563 | This contains all the code for the i386 version of the assembler. | |
3564 | ||
3565 | @item i386.h | |
3566 | This defines constants and macros used by the i386 version of the assembler. | |
3567 | ||
3568 | @item m-generic.h | |
3569 | Generic 68020 header file. To be linked to m68k.h on a | |
3570 | non-sun3, non-hpux system. | |
3571 | ||
3572 | @item m-sun2.h | |
3573 | 68010 header file for Sun2 workstations. Not well tested. To be linked | |
3574 | to m68k.h on a sun2. (See also @samp{-DSUN_ASM_SYNTAX} in the | |
3575 | @file{Makefile}.) | |
3576 | ||
3577 | @item m-sun3.h | |
3578 | 68020 header file for Sun3 workstations. To be linked to m68k.h before | |
3579 | compiling on a Sun3 system. (See also @samp{-DSUN_ASM_SYNTAX} in the | |
3580 | @file{Makefile}.) | |
3581 | ||
3582 | @item m-hpux.h | |
3583 | 68020 header file for an HPUX (system 5?) box. Which box, which | |
3584 | version of HPUX, etc? I don't know. | |
3585 | ||
3586 | @item m68k.h | |
3587 | A hard- or symbolic- link to one of @file{m-generic.h}, | |
3588 | @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of | |
3589 | 680x0 you are assembling for. (See also @samp{-DSUN_ASM_SYNTAX} in the | |
3590 | @file{Makefile}.) | |
3591 | ||
3592 | @item m68k-opcode.h | |
3593 | Opcode table for 68020. This is now a link to the opcode table | |
3594 | in the @code{GDB} source directory. | |
3595 | ||
3596 | @item m68k.c | |
3597 | All the mc680x0 code, in one huge, slow-to-compile file. | |
3598 | ||
3599 | @item ns32k.c | |
3600 | This contains the code for the ns32032/ns32532 version of the | |
3601 | assembler. | |
3602 | ||
3603 | @item ns32k-opcode.h | |
3604 | This contains the opcode table for the ns32032/ns32532 version | |
3605 | of the assembler. | |
3606 | ||
3607 | @item vax-inst.h | |
3608 | Vax specific file for describing Vax operands and other Vax-ish things. | |
3609 | ||
3610 | @item vax-opcode.h | |
3611 | Vax opcode table. | |
3612 | ||
3613 | @item vax.c | |
3614 | Vax specific parts of @code{as}. Also includes the former files | |
3615 | @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}. | |
3616 | ||
3617 | @item atof-vax.c | |
3618 | Turns a flonum into a Vax constant. | |
3619 | ||
3620 | @item vms.c | |
3621 | This file contains the special code needed to put out a VMS | |
3622 | style object file for the Vax. | |
3623 | ||
3624 | @end table | |
3625 | ||
3626 | Here is a list of the header files in the source directory. | |
3627 | (Warning: This section may not be very accurate. I didn't | |
3628 | write the header files; I just report them.) Also note that I | |
3629 | think many of these header files could be cleaned up or | |
3630 | eliminated. | |
3631 | ||
3632 | @table @file | |
3633 | ||
3634 | @item a.out.h | |
3635 | This describes the structures used to create the binary header data | |
3636 | inside the object file. Perhaps we should use the one in | |
3637 | @file{/usr/include}? | |
3638 | ||
3639 | @item as.h | |
09352a5d RP |
3640 | This defines all the globally useful things, and pulls in _0__<stdio.h>_1__ |
3641 | and _0__<assert.h>_1__. | |
93b45514 RP |
3642 | |
3643 | @item bignum.h | |
3644 | This defines macros useful for dealing with bignums. | |
3645 | ||
3646 | @item expr.h | |
3647 | Structure and macros for dealing with expression(). | |
3648 | ||
3649 | @item flonum.h | |
3650 | This defines the structure for dealing with floating point | |
3651 | numbers. It #includes @file{bignum.h}. | |
3652 | ||
3653 | @item frags.h | |
3654 | This contains a macro for appending a byte to the current frag. | |
3655 | ||
3656 | @item hash.h | |
3657 | Structures and function definitions for the hashing functions. | |
3658 | ||
3659 | @item input-file.h | |
3660 | Function headers for the input-file.c functions. | |
3661 | ||
3662 | @item md.h | |
3663 | Structures and function headers for things defined in the | |
3664 | machine dependent part of the assembler. | |
3665 | ||
3666 | @item obstack.h | |
3667 | This is the GNU systemwide include file for manipulating obstacks. | |
3668 | Since nobody is running under real GNU yet, we include this file. | |
3669 | ||
3670 | @item read.h | |
3671 | Macros and function headers for reading in source files. | |
3672 | ||
3673 | @item struct-symbol.h | |
3674 | Structure definition and macros for dealing with the gas | |
3675 | internal form of a symbol. | |
3676 | ||
3677 | @item subsegs.h | |
3678 | Structure definition for dealing with the numbered subsegments | |
3679 | of the text and data segments. | |
3680 | ||
3681 | @item symbols.h | |
3682 | Macros and function headers for dealing with symbols. | |
3683 | ||
3684 | @item write.h | |
3685 | Structure for doing segment fixups. | |
3686 | @end table | |
3687 | ||
3688 | @comment ~subsection Test Directory | |
3689 | @comment (Note: The test directory seems to have disappeared somewhere | |
3690 | @comment along the line. If you want it, you'll probably have to find a | |
3691 | @comment REALLY OLD dump tape~dots{}) | |
3692 | @comment | |
3693 | @comment The ~file{test/} directory is used for regression testing. | |
b50e59fe RP |
3694 | @comment After you modify ~@code{as}, you can get a quick go/nogo |
3695 | @comment confidence test by running the new ~@code{as} over the source | |
93b45514 RP |
3696 | @comment files in this directory. You use a shell script ~file{test/do}. |
3697 | @comment | |
3698 | @comment The tests in this suite are evolving. They are not comprehensive. | |
3699 | @comment They have, however, caught hundreds of bugs early in the debugging | |
b50e59fe RP |
3700 | @comment cycle of ~@code{as}. Most test statements in this suite were naturally |
3701 | @comment selected: they were used to demonstrate actual ~@code{as} bugs rather | |
93b45514 RP |
3702 | @comment than being written ~i{a priori}. | |
3703 | @comment | |
3704 | @comment Another testing suggestion: over 30 bugs have been found simply by | |
b50e59fe | 3705 | @comment running examples from this manual through ~@code{as}. |
93b45514 | 3706 | @comment Some examples in this manual are selected |
b50e59fe | 3707 | @comment to distinguish boundary conditions; they are good for testing ~@code{as}. |
93b45514 RP |
3708 | @comment |
3709 | @comment ~subsubsection Regression Testing | |
3710 | @comment Each regression test involves assembling a file and comparing the | |
b50e59fe | 3711 | @comment actual output of ~@code{as} to ``known good'' output files. Both |
93b45514 | 3712 | @comment the object file and the error/warning message file (stderr) are |
b50e59fe | 3713 | @comment inspected. Optionally ~@code{as}' exit status may be checked. |
93b45514 | 3714 | @comment Discrepancies are reported. Each discrepancy means either that |
b50e59fe | 3715 | @comment you broke some part of ~@code{as} or that the ``known good'' files |
93b45514 RP |
3716 | @comment are now out of date and should be changed to reflect the new |
3717 | @comment definition of ``good''. | |
3718 | @comment | |
3719 | @comment Each regression test lives in its own directory, in a tree | |
3720 | @comment rooted in the directory ~file{test/}. Each such directory | |
3721 | @comment has a name ending in ~file{.ret}, where `ret' stands for | |
3722 | @comment REgression Test. The ~file{.ret} ending allows ~code{find | |
3723 | @comment (1)} to find all regression tests in the tree, without | |
3724 | @comment needing to list them explicitly. | |
3725 | @comment | |
3726 | @comment Any ~file{.ret} directory must contain a file called | |
3727 | @comment ~file{input} which is the source file to assemble. During | |
3728 | @comment testing an object file ~file{output} is created, as well as | |
3729 | @comment a file ~file{stdouterr} which contains the output to both | |
3730 | @comment stdout and stderr. If there is a file ~file{output.good} in | |
3731 | @comment the directory, and if ~file{output} contains exactly the | |
3732 | @comment same data as ~file{output.good}, the file ~file{output} is | |
3733 | @comment deleted. Likewise ~file{stdouterr} is removed if it exactly | |
3734 | @comment matches a file ~file{stdouterr.good}. If file | |
3735 | @comment ~file{status.good} is present, containing a decimal number | |
b50e59fe | 3736 | @comment before a newline, the exit status of ~@code{as} is compared |
93b45514 RP |
3737 | @comment to this number. If the status numbers are not equal, a file |
3738 | @comment ~file{status} is written to the directory, containing the | |
3739 | @comment actual status as a decimal number followed by newline. | |
3740 | @comment | |
3741 | @comment Should any of the ~file{*.good} files fail to match their corresponding | |
3742 | @comment actual files, this is noted by a 1-line message on the screen during | |
b50e59fe | 3743 | @comment the regression test, and you can use ~@code{find (1)} to find any |
93b45514 RP |
3744 | @comment files named ~file{status}, ~file{output} or ~file{stdouterr}. | |
3745 | @comment | |
b50e59fe | 3746 | @node Retargeting, License, Maintenance, Top |
93b45514 RP |
3747 | @chapter Teaching the Assembler about a New Machine |
3748 | ||
3749 | This chapter describes the steps required in order to make the | |
3750 | assembler work with another machine's assembly language. This | |
3751 | chapter is not complete, and only describes the steps in the | |
3752 | broadest terms. You should look at the source for the | |
3753 | currently supported machine in order to discover some of the | |
3754 | details that aren't mentioned here. | |
3755 | ||
3756 | You should create a new file called @file{@var{machine}.c}, and | |
3757 | add the appropriate lines to the file @file{Makefile} so that | |
3758 | you can compile your new version of the assembler. This should | |
3759 | be straightforward; simply add lines similar to the ones there | |
3760 | for the four current versions of the assembler. | |
3761 | ||
47342e8f | 3762 | If you want to be compatible with GDB (and the current |
93b45514 RP |
3763 | machine-dependent versions of the assembler), you should create |
3764 | a file called @file{@var{machine}-opcode.h} which should | |
3765 | contain all the information about the names of the machine | |
3766 | instructions, their opcodes, and what addressing modes they | |
3767 | support. If you do this right, the assembler and GDB can share | |
3768 | this file, and you'll only have to write it once. Note that | |
3769 | while you're writing @code{as}, you may want to use an | |
3770 | independent program (if you have access to one) to make sure | |
3771 | that @code{as} is emitting the correct bytes. Since @code{as} | |
3772 | and @code{GDB} share the opcode table, an incorrect opcode | |
3773 | table entry may make invalid bytes look OK when you disassemble | |
3774 | them with @code{GDB}. | |
3775 | ||
3776 | @section Functions You will Have to Write | |
3777 | ||
3778 | Your file @file{@var{machine}.c} should contain definitions for | |
3779 | the following functions and variables. It will need to include | |
3780 | some header files in order to use some of the structures | |
3781 | defined in the machine-independent part of the assembler. The | |
3782 | needed header files are mentioned in the descriptions of the | |
3783 | functions that will need them. | |
3784 | ||
3785 | @table @code | |
3786 | ||
3787 | @item long omagic; | |
3788 | This long integer holds the value to place at the beginning of | |
3789 | the @file{a.out} file. It is usually @samp{OMAGIC}, except on | |
3790 | machines that store additional information in the magic-number. | |
3791 | ||
3792 | @item char comment_chars[]; | |
3793 | This character array holds the values of the characters that | |
3794 | start a comment anywhere in a line. Comments are stripped off | |
3795 | automatically by the machine independent part of the | |
3796 | assembler. Note that the @samp{/*} will always start a | |
3797 | comment, and that only @samp{*/} will end a comment started by | |
3798 | @samp{/*}. | |
3799 | ||
3800 | @item char line_comment_chars[]; | |
3801 | This character array holds the values of the chars that start a | |
3802 | comment only if they are the first (non-whitespace) character | |
3803 | on a line. If the character @samp{#} does not appear in this | |
3804 | list, you may get unexpected results. (Various | |
3805 | machine-independent parts of the assembler treat the comments | |
3806 | @samp{#APP} and @samp{#NO_APP} specially, and assume that lines | |
3807 | that start with @samp{#} are comments.) | |
3808 | ||
3809 | @item char EXP_CHARS[]; | |
3810 | This character array holds the letters that can separate the | |
3811 | mantissa and the exponent of a floating point number. Typical | |
3812 | values are @samp{e} and @samp{E}. | |
3813 | ||
3814 | @item char FLT_CHARS[]; | |
3815 | This character array holds the letters that--when they appear | |
3816 | immediately after a leading zero--indicate that a number is a | |
3817 | floating-point number. (Sort of how 0x indicates that a | |
3818 | hexadecimal number follows.) | |
3819 | ||
3820 | @item pseudo_typeS md_pseudo_table[]; | |
3821 | (@var{pseudo_typeS} is defined in @file{md.h}) | |
3822 | This array contains a list of the machine-dependent directives | |
3823 | the assembler must support. It contains the name of each | |
3824 | pseudo op (without the leading @samp{.}), a pointer to a | |
3825 | function to be called when that directive is encountered, and | |
3826 | an integer argument to be passed to that function. | |
3827 | ||
3828 | @item void md_begin(void) | |
3829 | This function is called as part of the assembler's | |
3830 | initialization. It should do any initialization required by | |
3831 | any of your other routines. | |
3832 | ||
3833 | @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR) | |
3834 | This routine is called once for each option on the command line | |
3835 | that the machine-independent part of @code{as} does not | |
3836 | understand. This function should return non-zero if the option | |
3837 | pointed to by @var{optionPTR} is a valid option. If it is not | |
3838 | a valid option, this routine should return zero. The variables | |
3839 | @var{argcPTR} and @var{argvPTR} are provided in case the option | |
3840 | requires a filename or something similar as an argument. If | |
3841 | the option is multi-character, @var{optionPTR} should be | |
3842 | advanced past the end of the option, otherwise every letter in | |
3843 | the option will be treated as a separate single-character | |
3844 | option. | |
3845 | ||
3846 | @item void md_assemble(char *string) | |
3847 | This routine is called for every machine-dependent | |
3848 | non-directive line in the source file. It does all the real | |
3849 | work involved in reading the opcode, parsing the operands, | |
3850 | etc. @var{string} is a pointer to a null-terminated string | |
3851 | that comprises the input line, with all excess whitespace and | |
3852 | comments removed. | |
3853 | ||
3854 | @item void md_number_to_chars(char *outputPTR,long value,int nbytes) | |
3855 | This routine is called to turn a C long int, short int, or char | |
3856 | into the series of bytes that represents that number on the | |
3857 | target machine. @var{outputPTR} points to an array where the | |
3858 | result should be stored; @var{value} is the value to store; and | |
3859 | @var{nbytes} is the number of bytes in @var{value} that should be | |
3860 | stored. | |
3861 | ||
3862 | @item void md_number_to_imm(char *outputPTR,long value,int nbytes) | |
3863 | This routine is called to turn a C long int, short int, or char | |
3864 | into the series of bytes that represent an immediate value on | |
3865 | the target machine. It is identical to the function @code{md_number_to_chars}, | |
3866 | except on NS32K machines.@refill | |
3867 | ||
3868 | @item void md_number_to_disp(char *outputPTR,long value,int nbytes) | |
3869 | This routine is called to turn a C long int, short int, or char | |
3870 | into the series of bytes that represent a displacement value on | |
3871 | the target machine. It is identical to the function @code{md_number_to_chars}, | |
3872 | except on NS32K machines.@refill | |
3873 | ||
3874 | @item void md_number_to_field(char *outputPTR,long value,int nbytes) | |
3875 | This routine is identical to @code{md_number_to_chars}, | |
3876 | except on NS32K machines. | |
3877 | ||
3878 | @item void md_ri_to_chars(struct relocation_info *riPTR,ri) | |
3879 | (@code{struct relocation_info} is defined in @file{a.out.h}) | |
3880 | This routine emits the relocation info in @var{ri} | |
3881 | in the appropriate bit-pattern for the target machine. | |
3882 | The result should be stored in the location pointed | |
3883 | to by @var{riPTR}. This routine may be a no-op unless you are | |
3884 | attempting to do cross-assembly. | |
3885 | ||
3886 | @item char *md_atof(char type,char *outputPTR,int *sizePTR) | |
3887 | This routine turns a series of digits into the appropriate | |
3888 | internal representation for a floating-point number. | |
3889 | @var{type} is a character from @var{FLT_CHARS[]} that describes | |
3890 | what kind of floating point number is wanted; @var{outputPTR} | |
3891 | is a pointer to an array that the result should be stored in; | |
3892 | and @var{sizePTR} is a pointer to an integer where the size (in | |
3893 | bytes) of the result should be stored. This routine should | |
3894 | return an error message, or an empty string (not (char *)0) for | |
3895 | success. | |
3896 | ||
3897 | @item int md_short_jump_size; | |
3898 | This variable holds the (maximum) size in bytes of a short (16 | |
3899 | bit or so) jump created by @code{md_create_short_jump()}. This | |
3900 | variable is used as part of the broken-word feature, and isn't | |
3901 | needed if the assembler is compiled with | |
3902 | @samp{-DWORKING_DOT_WORD}. | |
3903 | ||
3904 | @item int md_long_jump_size; | |
3905 | This variable holds the (maximum) size in bytes of a long (32 | |
3906 | bit or so) jump created by @code{md_create_long_jump()}. This | |
3907 | variable is used as part of the broken-word feature, and isn't | |
3908 | needed if the assembler is compiled with | |
3909 | @samp{-DWORKING_DOT_WORD}. | |
3910 | ||
3911 | @item void md_create_short_jump(char *resultPTR,long from_addr, | |
3912 | @code{long to_addr,fragS *frag,symbolS *to_symbol)} | |
3913 | This function emits a jump from @var{from_addr} to @var{to_addr} in | |
3914 | the array of bytes pointed to by @var{resultPTR}. If this creates a | |
3915 | type of jump that must be relocated, this function should call | |
3916 | @code{fix_new()} with @var{frag} and @var{to_symbol}. The jump | |
3917 | emitted by this function may be smaller than @var{md_short_jump_size}, | |
3918 | but it must never create a larger one. | |
3919 | (If it creates a smaller jump, the extra bytes of memory will not be | |
3920 | used.) This function is used as part of the broken-word feature, | |
3921 | and isn't needed if the assembler is compiled with | |
3922 | @samp{-DWORKING_DOT_WORD}.@refill | |
3923 | ||
3924 | @item void md_create_long_jump(char *ptr,long from_addr, | |
3925 | @code{long to_addr,fragS *frag,symbolS *to_symbol)} | |
3926 | This function is similar to the previous function, | |
3927 | @code{md_create_short_jump()}, except that it creates a long | |
3928 | jump instead of a short one. This function is used as part of | |
3929 | the broken-word feature, and isn't needed if the assembler is | |
3930 | compiled with @samp{-DWORKING_DOT_WORD}. | |
3931 | ||
3932 | @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type) | |
3933 | This function does the initial setting up for relaxation. This | |
3934 | includes forcing references to still-undefined symbols to the | |
3935 | appropriate addressing modes. | |
3936 | ||
3937 | @item relax_typeS md_relax_table[]; | |
3938 | (@var{relax_typeS} is defined in @file{md.h}) | |
3939 | This array describes the various machine-dependent states a | |
3940 | frag may be in before relaxation. You will need one group of | |
3941 | entries for each type of addressing mode you intend to relax. | |
3942 | ||
3943 | @item void md_convert_frag(fragS *fragPTR) | |
3944 | (@var{fragS} is defined in @file{as.h}) | |
3945 | This routine does the required cleanup after relaxation. | |
3946 | Relaxation has changed the type of the frag to a type that can | |
3947 | reach its destination. This function should adjust the opcode | |
3948 | of the frag to use the appropriate addressing mode. | |
3949 | @var{fragPTR} points to the frag to clean up. | |
3950 | ||
3951 | @item void md_end(void) | |
3952 | This function is called just before the assembler exits. It | |
3953 | need not free up memory unless the operating system doesn't do | |
3954 | it automatically on exit. (In which case you'll also have to | |
3955 | track down all the other places where the assembler allocates | |
3956 | space but never frees it.) | |
3957 | ||
3958 | @end table | |
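
As an illustration of the byte-conversion routines in the table above, here is a minimal sketch of what @code{md_number_to_chars} might look like for a big-endian target. The name @code{example_number_to_chars} is hypothetical, and the byte order is an assumption made for the example; a little-endian target would fill the array in the opposite direction.

```c
/* Hypothetical stand-in for md_number_to_chars() on a big-endian
   target (an assumption): store the low NBYTES bytes of VALUE in
   OUTPUTPTR, most significant byte first.  */
void
example_number_to_chars (char *outputPTR, long value, int nbytes)
{
  unsigned long bits = (unsigned long) value;
  int i;

  /* Fill from the last byte backwards so the least significant
     byte ends up at the highest address.  */
  for (i = nbytes - 1; i >= 0; i--)
    {
      outputPTR[i] = (char) (bits & 0xff);
      bits >>= 8;
    }
}
```

Working on an unsigned copy of the value keeps the right shift well defined for negative numbers.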
3959 | ||
3960 | @section External Variables You will Need to Use | |
3961 | ||
3962 | You will need to refer to or change the following external variables | |
3963 | from within the machine-dependent part of the assembler. | |
3964 | ||
3965 | @table @code | |
3966 | @item extern char flagseen[]; | |
3967 | This array holds non-zero values in locations corresponding to | |
3968 | the options that were on the command line. Thus, if the | |
3969 | assembler was called with @samp{-W}, @var{flagseen['W']} would | |
3970 | be non-zero. | |
3971 | ||
3972 | @item extern fragS *frag_now; | |
3973 | This pointer points to the current frag--the frag that bytes | |
3974 | are currently being added to. If nothing else, you will need | |
3975 | to pass it as an argument to various machine-independent | |
3976 | functions. It is maintained automatically by the | |
3977 | frag-manipulating functions; you should never have to change it | |
3978 | yourself. | |
3979 | ||
3980 | @item extern LITTLENUM_TYPE generic_bignum[]; | |
3981 | (@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.) | |
3982 | This is where @dfn{bignums}--numbers larger than 32 bits--are | |
3983 | returned when they are encountered in an expression. You will | |
3984 | need to use this if you need to implement directives (or | |
3985 | anything else) that must deal with these large numbers. | |
3986 | @code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in | |
3987 | @file{as.h}), and have a positive @code{X_add_number}. The | |
3988 | @code{X_add_number} of a @code{bignum} is the number of | |
3989 | @code{LITTLENUMS} in @var{generic_bignum} that the number takes | |
3990 | up. | |
3991 | ||
3992 | @item extern FLONUM_TYPE generic_floating_point_number; | |
3993 | (@var{FLONUM_TYPE} is defined in @file{flonum.h}.) | |
3994 | This is where @dfn{flonums}--floating-point numbers within | |
3995 | expressions--are returned. @code{Flonums} are of @code{segT} | |
3996 | @code{SEG_BIG}, and have a negative @code{X_add_number}. | |
3997 | @code{Flonums} are returned in a generic format. You will have | |
3998 | to write a routine to turn this generic format into the | |
3999 | appropriate floating-point format for your machine. | |
4000 | ||
4001 | @item extern int need_pass_2; | |
4002 | If this variable is non-zero, the assembler has encountered an | |
4003 | expression that cannot be assembled in a single pass. Since | |
4004 | the second pass isn't implemented, this flag means that the | |
4005 | assembler is punting, and is only looking for additional syntax | |
4006 | errors. (Or something like that.) | |
4007 | ||
4008 | @item extern segT now_seg; | |
4009 | This variable holds the value of the segment the assembler is | |
4010 | currently assembling into. | |
4011 | ||
4012 | @end table | |
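
The way bignums and flonums share @code{SEG_BIG} and are told apart by the sign of @code{X_add_number} can be sketched as follows. This is only an illustration: the enum and function names are hypothetical, the segment test is reduced to a flag, and the text above does not say what a zero @code{X_add_number} means, so the sketch treats it as neither kind.

```c
/* Hypothetical classification of a SEG_BIG expression.  */
enum example_big_kind
{
  EXAMPLE_NOT_BIG,
  EXAMPLE_BIGNUM,   /* result is in generic_bignum[] */
  EXAMPLE_FLONUM    /* result is in generic_floating_point_number */
};

/* IS_SEG_BIG is non-zero when the expression's segment is SEG_BIG.
   A positive X_ADD_NUMBER is the count of littlenums used by a
   bignum; a negative one marks a flonum.  */
enum example_big_kind
example_classify_big (int is_seg_big, long x_add_number)
{
  if (!is_seg_big || x_add_number == 0)
    return EXAMPLE_NOT_BIG;   /* zero is not covered by the text */
  return x_add_number > 0 ? EXAMPLE_BIGNUM : EXAMPLE_FLONUM;
}
```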
4013 | ||
4014 | @section External Functions You Will Need | |
4015 | ||
4016 | You will find the following external functions useful (or | |
4017 | indispensable) when you're writing the machine-dependent part | |
4018 | of the assembler. | |
4019 | ||
4020 | @table @code | |
4021 | ||
4022 | @item char *frag_more(int bytes) | |
4023 | This function allocates @var{bytes} more bytes in the current | |
4024 | frag (or starts a new frag, if it can't expand the current frag | |
4025 | any more) for you to store some object-file bytes in. It | |
4026 | returns a pointer to the bytes, ready for you to store data in. | |
4027 | ||
4028 | @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel) | |
4029 | This function stores a relocation fixup to be acted on later. | |
4030 | @var{frag} points to the frag the relocation belongs in; | |
4031 | @var{where} is the location within the frag where the relocation begins; | |
4032 | @var{size} is the size of the relocation, and is usually 1 (a single byte), | |
4033 | 2 (sixteen bits), or 4 (a longword). | |
4034 | The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset} is added to the byte(s) | |
09352a5d | 4035 | at _0__@var{frag->literal[where]}_1__. If @var{pcrel} is non-zero, the address of the |
93b45514 RP |
4036 | location is subtracted from the result. A relocation entry is also added |
4037 | to the @file{a.out} file. @var{add_symbol} and @var{sub_symbol} may be | |
4038 | NULL, and @var{offset} may be zero.@refill | |
4039 | ||
4040 | @item char *frag_var(relax_stateT type, int max_chars, int var, | |
4041 | @code{relax_substateT subtype, symbolS *symbol, char *opcode)} | |
4042 | This function creates a machine-dependent frag of type @var{type} | |
4043 | (usually @code{rs_machine_dependent}). | |
4044 | @var{max_chars} is the maximum size in bytes that the frag may grow by; | |
4045 | @var{var} is the current size of the variable end of the frag; | |
4046 | @var{subtype} is the sub-type of the frag. The sub-type is used to index into | |
4047 | @var{md_relax_table[]} during @code{relaxation}. | |
4048 | @var{symbol} is the symbol whose value should be used when relaxing this frag. | |
4049 | @var{opcode} points into a byte whose value may have to be modified if the | |
4050 | addressing mode used by this frag changes. It typically points into the | |
4051 | @var{fr_literal[]} of the previous frag, and is used to point to a location | |
4052 | that @code{md_convert_frag()} may have to change.@refill | |
4053 | ||
4054 | @item void frag_wane(fragS *fragPTR) | |
4055 | This function is useful from within @code{md_convert_frag}. It | |
4056 | changes a frag to type @code{rs_fill}, and sets the variable-sized | |
4057 | piece of the frag to zero. The frag will never change in size | |
4058 | again. | |
4059 | ||
4060 | @item segT expression(expressionS *retval) | |
4061 | (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h}) | |
4062 | This function parses the string pointed to by the external char | |
4063 | pointer @var{input_line_pointer}, and returns the segment-type | |
4064 | of the expression. It also stores the results in the | |
4065 | @var{expressionS} pointed to by @var{retval}. | |
4066 | @var{input_line_pointer} is advanced to point past the end of | |
4067 | the expression. (@var{input_line_pointer} is used by other | |
4068 | parts of the assembler. If you modify it, be sure to restore | |
4069 | it to its original value.) | |
4070 | ||
4071 | @item as_warn(char *message,@dots{}) | |
4072 | If warning messages are disabled, this function does nothing. | |
4073 | Otherwise, it prints out the current file name, and the current | |
4074 | line number, then uses @code{fprintf} to print the | |
4075 | @var{message} and any arguments it was passed. | |
4076 | ||
4077 | @item as_bad(char *message,@dots{}) | |
4078 | This function should be called when @code{as} encounters | |
4079 | conditions that are bad enough that @code{as} should not | |
4080 | produce an object file, but should continue reading input and | |
4081 | printing warning and bad error messages. | |
4082 | ||
4083 | @item as_fatal(char *message,@dots{}) | |
4084 | This function prints out the current file name and line number, | |
4085 | prints the word @samp{FATAL:}, then uses @code{fprintf} to | |
4086 | print the @var{message} and any arguments it was passed. Then | |
4087 | the assembler exits. This function should only be used for | |
4088 | serious, unrecoverable errors. | |
4089 | ||
4090 | @item void float_const(int float_type) | |
4091 | This function reads floating-point constants from the current | |
4092 | input line, and calls @code{md_atof} to assemble them. It is | |
4093 | useful as the function to call for the directives | |
4094 | @samp{.single}, @samp{.double}, @samp{.float}, etc. | |
4095 | @var{float_type} must be a character from @var{FLT_CHARS}. | |
4096 | ||
4097 | @item void demand_empty_rest_of_line(void); | |
4098 | This function can be used by machine-dependent directives to | |
4099 | make sure the rest of the input line is empty. It prints a | |
4100 | warning message if there are additional characters on the line. | |
4101 | ||
4102 | @item long int get_absolute_expression(void) | |
4103 | This function can be used by machine-dependent directives to | |
4104 | read an absolute number from the current input line. It | |
4105 | returns the result. If it isn't given an absolute expression, | |
4106 | it prints a warning message and returns zero. | |
4107 | ||
4108 | @end table | |
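
The arithmetic described for @code{fix_new} can be sketched with plain numbers standing in for symbol values. The function @code{example_fixup_value} and its parameters are hypothetical; in the assembler the symbol values come from @code{symbolS} structures, and a NULL symbol is assumed here to contribute zero.

```c
/* Hypothetical model of the value a fixup adds to the bytes at the
   relocation site: add_symbol - sub_symbol + offset, with the
   address of the location subtracted when PCREL is non-zero.  */
long
example_fixup_value (long add_value, long sub_value, long offset,
                     int pcrel, long location)
{
  long result = add_value - sub_value + offset;

  if (pcrel)
    result -= location;   /* make the value relative to the site */
  return result;
}
```

For example, a reference to a symbol at 0x1000 with an offset of 12 gives 0x100c when absolute, and 0x80c when it is pc-relative from a location at 0x800.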
4109 | ||
4110 | ||
4111 | @section The concept of Frags | |
4112 | ||
4113 | This assembler works to optimize the size of certain addressing | |
4114 | modes (e.g. branch instructions). This means the size of many | |
4115 | pieces of object code cannot be determined until after assembly | |
4116 | is finished. (It also means that the addresses of symbols cannot be | |
4117 | determined until assembly is finished.) In order to do this, | |
4118 | @code{as} stores the output bytes as @dfn{frags}. | |
4119 | ||
4120 | Here is the definition of a frag (from @file{as.h}): | |
4121 | @example | |
4122 | struct frag | |
4123 | @{ | |
4124 | long int fr_fix; | |
4125 | long int fr_var; | |
4126 | relax_stateT fr_type; | |
4127 | relax_substateT fr_substate; | |
4128 | unsigned long fr_address; | |
4129 | long int fr_offset; | |
4130 | struct symbol *fr_symbol; | |
4131 | char *fr_opcode; | |
4132 | struct frag *fr_next; | |
4133 | char fr_literal[]; | |
4134 | @}; | |
4135 | @end example | |
4136 | ||
4137 | @table @var | |
4138 | @item fr_fix | |
4139 | is the size of the fixed-size piece of the frag. | |
4140 | ||
4141 | @item fr_var | |
4142 | is the maximum (?) size of the variable-sized piece of the frag. | |
4143 | ||
4144 | @item fr_type | |
4145 | is the type of the frag. | |
4146 | Current types are: | |
4147 | rs_fill | |
4148 | rs_align | |
4149 | rs_org | |
4150 | rs_machine_dependent | |
4151 | ||
4152 | @item fr_substate | |
4153 | This stores the type of machine-dependent frag this is (what | |
4154 | kind of addressing mode is being used, and what size is being | |
4155 | tried/will fit/etc.). | |
4156 | ||
4157 | @item fr_address | |
4158 | @var{fr_address} is only valid after relaxation is finished. | |
4159 | Before relaxation, the only way to store an address is (pointer | |
4160 | to frag containing the address) plus (offset into the frag). | |
4161 | ||
4162 | @item fr_offset | |
4163 | This contains a number, whose meaning depends on the type of | |
4164 | the frag. | |
4165 | For machine-dependent frags, this contains the offset from | |
4166 | @var{fr_symbol} that the frag wants to go to. Thus, for branch | |
4167 | instructions it is usually zero (unless the instruction was | |
4168 | @samp{jba foo+12} or something like that). | |
4169 | ||
4170 | @item fr_symbol | |
4171 | For machine-dependent frags, this points to the symbol the frag | |
4172 | needs to reach. | |
4173 | ||
4174 | @item fr_opcode | |
4175 | This points to the location in the frag (or in a previous frag) | |
4176 | of the opcode for the instruction that caused this to be a frag. | |
4177 | @var{fr_opcode} is needed if the actual opcode must be changed | |
4178 | in order to use a different form of the addressing mode. | |
4179 | (For example, if a conditional branch only comes in size tiny, | |
4180 | a large-size branch could be implemented by reversing the sense | |
4181 | of the test, and turning it into a tiny branch over a large jump. | |
4182 | This would require changing the opcode.) | |
4183 | ||
4184 | @var{fr_literal} is a variable-size array that contains the | |
4185 | actual object bytes. A frag consists of a fixed-size piece of | |
4186 | object data (which may be zero bytes long), followed by a | |
4187 | piece of object data whose size may not have been determined | |
4188 | yet. Other information includes the type of the frag (which | |
4189 | controls how it is relaxed). | |
4190 | ||
4191 | @item fr_next | |
4192 | This is the next frag in the singly-linked list. This is | |
4193 | usually only needed by the machine-independent part of | |
4194 | @code{as}. | |
4195 | ||
4196 | @end table | |
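
The note under @var{fr_address} says that, before relaxation, an address can only be stored as a pointer to a frag plus an offset into it. A minimal sketch of that idea follows. The types @code{example_frag} and @code{example_location} are hypothetical, cut-down stand-ins (the real @code{struct frag} has the additional fields listed above), and the sketch assumes relaxation has already filled in @code{fr_address}.

```c
/* Hypothetical, reduced frag: only the fields this sketch needs.  */
struct example_frag
{
  unsigned long fr_address;       /* valid only after relaxation */
  struct example_frag *fr_next;
};

/* A location recorded before relaxation: which frag, and how far
   into it.  */
struct example_location
{
  struct example_frag *frag;
  long offset;
};

/* Turn a (frag, offset) pair into a final address.  Only
   meaningful once relaxation has assigned fr_address.  */
unsigned long
example_resolve (const struct example_location *loc)
{
  return loc->frag->fr_address + (unsigned long) loc->offset;
}
```

Storing the pair instead of a number is what lets earlier frags grow or shrink without invalidating locations recorded in later ones.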
47342e8f RP |
4197 | @end ignore |
4198 | ||
b50e59fe | 4199 | @node License, , Retargeting, Top |
47342e8f RP |
4200 | @unnumbered GNU GENERAL PUBLIC LICENSE |
4201 | @center Version 1, February 1989 | |
4202 | ||
4203 | @display | |
4204 | Copyright @copyright{} 1989 Free Software Foundation, Inc. | |
4205 | 675 Mass Ave, Cambridge, MA 02139, USA | |
4206 | ||
4207 | Everyone is permitted to copy and distribute verbatim copies | |
4208 | of this license document, but changing it is not allowed. | |
4209 | @end display | |
4210 | ||
4211 | @unnumberedsec Preamble | |
4212 | ||
4213 | The license agreements of most software companies try to keep users | |
4214 | at the mercy of those companies. By contrast, our General Public | |
4215 | License is intended to guarantee your freedom to share and change free | |
4216 | software---to make sure the software is free for all its users. The | |
4217 | General Public License applies to the Free Software Foundation's | |
4218 | software and to any other program whose authors commit to using it. | |
4219 | You can use it for your programs, too. | |
4220 | ||
4221 | When we speak of free software, we are referring to freedom, not | |
4222 | price. Specifically, the General Public License is designed to make | |
4223 | sure that you have the freedom to give away or sell copies of free | |
4224 | software, that you receive source code or can get it if you want it, | |
4225 | that you can change the software or use pieces of it in new free | |
4226 | programs; and that you know you can do these things. | |
4227 | ||
4228 | To protect your rights, we need to make restrictions that forbid | |
4229 | anyone to deny you these rights or to ask you to surrender the rights. | |
4230 | These restrictions translate to certain responsibilities for you if you | |
4231 | distribute copies of the software, or if you modify it. | |
4232 | ||
4233 | For example, if you distribute copies of a such a program, whether | |
4234 | gratis or for a fee, you must give the recipients all the rights that | |
4235 | you have. You must make sure that they, too, receive or can get the | |
4236 | source code. And you must tell them their rights. | |
4237 | ||
4238 | We protect your rights with two steps: (1) copyright the software, and | |
4239 | (2) offer you this license which gives you legal permission to copy, | |
4240 | distribute and/or modify the software. | |
4241 | ||
4242 | Also, for each author's protection and ours, we want to make certain | |
4243 | that everyone understands that there is no warranty for this free | |
4244 | software. If the software is modified by someone else and passed on, we | |
4245 | want its recipients to know that what they have is not the original, so | |
4246 | that any problems introduced by others will not reflect on the original | |
4247 | authors' reputations. | |
4248 | ||
4249 | The precise terms and conditions for copying, distribution and | |
4250 | modification follow. | |
4251 | ||
4252 | @iftex | |
4253 | @unnumberedsec TERMS AND CONDITIONS | |
4254 | @end iftex | |
4255 | @ifinfo | |
4256 | @center TERMS AND CONDITIONS | |
4257 | @end ifinfo | |
4258 | ||
4259 | @enumerate | |
4260 | @item | |
4261 | This License Agreement applies to any program or other work which | |
4262 | contains a notice placed by the copyright holder saying it may be | |
4263 | distributed under the terms of this General Public License. The | |
4264 | ``Program'', below, refers to any such program or work, and a ``work based | |
4265 | on the Program'' means either the Program or any work containing the | |
4266 | Program or a portion of it, either verbatim or with modifications. Each | |
4267 | licensee is addressed as ``you''. | |
4268 | ||
4269 | @item | |
4270 | You may copy and distribute verbatim copies of the Program's source | |
4271 | code as you receive it, in any medium, provided that you conspicuously and | |
4272 | appropriately publish on each copy an appropriate copyright notice and | |
4273 | disclaimer of warranty; keep intact all the notices that refer to this | |
4274 | General Public License and to the absence of any warranty; and give any | |
4275 | other recipients of the Program a copy of this General Public License | |
4276 | along with the Program. You may charge a fee for the physical act of | |
4277 | transferring a copy. | |
4278 | ||
4279 | @item | |
4280 | You may modify your copy or copies of the Program or any portion of | |
4281 | it, and copy and distribute such modifications under the terms of Paragraph | |
4282 | 1 above, provided that you also do the following: | |
4283 | ||
4284 | @itemize @bullet | |
4285 | @item | |
4286 | cause the modified files to carry prominent notices stating that | |
4287 | you changed the files and the date of any change; and | |
4288 | ||
4289 | @item | |
4290 | cause the whole of any work that you distribute or publish, that | |
4291 | in whole or in part contains the Program or any part thereof, either | |
4292 | with or without modifications, to be licensed at no charge to all | |
4293 | third parties under the terms of this General Public License (except | |
4294 | that you may choose to grant warranty protection to some or all | |
4295 | third parties, at your option). | |
4296 | ||
4297 | @item | |
4298 | If the modified program normally reads commands interactively when | |
4299 | run, you must cause it, when started running for such interactive use | |
4300 | in the simplest and most usual way, to print or display an | |
4301 | announcement including an appropriate copyright notice and a notice | |
4302 | that there is no warranty (or else, saying that you provide a | |
4303 | warranty) and that users may redistribute the program under these | |
4304 | conditions, and telling the user how to view a copy of this General | |
4305 | Public License. | |
4306 | ||
4307 | @item | |
4308 | You may charge a fee for the physical act of transferring a | |
4309 | copy, and you may at your option offer warranty protection in | |
4310 | exchange for a fee. | |
4311 | @end itemize | |
4312 | ||
4313 | Mere aggregation of another independent work with the Program (or its | |
4314 | derivative) on a volume of a storage or distribution medium does not bring | |
4315 | the other work under the scope of these terms. | |
4316 | ||
4317 | @item | |
4318 | You may copy and distribute the Program (or a portion or derivative of | |
4319 | it, under Paragraph 2) in object code or executable form under the terms of | |
4320 | Paragraphs 1 and 2 above provided that you also do one of the following: | |
4321 | ||
4322 | @itemize @bullet | |
4323 | @item | |
4324 | accompany it with the complete corresponding machine-readable | |
4325 | source code, which must be distributed under the terms of | |
4326 | Paragraphs 1 and 2 above; or, | |
4327 | ||
4328 | @item | |
4329 | accompany it with a written offer, valid for at least three | |
4330 | years, to give any third party free (except for a nominal charge | |
4331 | for the cost of distribution) a complete machine-readable copy of the | |
4332 | corresponding source code, to be distributed under the terms of | |
4333 | Paragraphs 1 and 2 above; or, | |
4334 | ||
4335 | @item | |
4336 | accompany it with the information you received as to where the | |
4337 | corresponding source code may be obtained. (This alternative is | |
4338 | allowed only for noncommercial distribution and only if you | |
4339 | received the program in object code or executable form alone.) | |
4340 | @end itemize | |
4341 | ||
4342 | Source code for a work means the preferred form of the work for making | |
4343 | modifications to it. For an executable file, complete source code means | |
4344 | all the source code for all modules it contains; but, as a special | |
4345 | exception, it need not include source code for modules which are standard | |
4346 | libraries that accompany the operating system on which the executable | |
4347 | file runs, or for standard header files or definitions files that | |
4348 | accompany that operating system. | |
4349 | ||
4350 | @item | |
4351 | You may not copy, modify, sublicense, distribute or transfer the | |
4352 | Program except as expressly provided under this General Public License. | |
4353 | Any attempt otherwise to copy, modify, sublicense, distribute or transfer | |
4354 | the Program is void, and will automatically terminate your rights to use | |
4355 | the Program under this License. However, parties who have received | |
4356 | copies, or rights to use copies, from you under this General Public | |
4357 | License will not have their licenses terminated so long as such parties | |
4358 | remain in full compliance. | |
4359 | ||
4360 | @item | |
4361 | By copying, distributing or modifying the Program (or any work based | |
4362 | on the Program) you indicate your acceptance of this license to do so, | |
4363 | and all its terms and conditions. | |
4364 | ||
4365 | @item | |
4366 | Each time you redistribute the Program (or any work based on the | |
4367 | Program), the recipient automatically receives a license from the original | |
4368 | licensor to copy, distribute or modify the Program subject to these | |
4369 | terms and conditions. You may not impose any further restrictions on the | |
4370 | recipients' exercise of the rights granted herein. | |
4371 | ||
4372 | @item | |
4373 | The Free Software Foundation may publish revised and/or new versions | |
4374 | of the General Public License from time to time. Such new versions will | |
4375 | be similar in spirit to the present version, but may differ in detail to | |
4376 | address new problems or concerns. | |
4377 | ||
4378 | Each version is given a distinguishing version number. If the Program | |
4379 | specifies a version number of the license which applies to it and ``any | |
4380 | later version'', you have the option of following the terms and conditions | |
4381 | either of that version or of any later version published by the Free | |
4382 | Software Foundation. If the Program does not specify a version number of | |
4383 | the license, you may choose any version ever published by the Free Software | |
4384 | Foundation. | |
4385 | ||
4386 | @item | |
4387 | If you wish to incorporate parts of the Program into other free | |
4388 | programs whose distribution conditions are different, write to the author | |
4389 | to ask for permission. For software which is copyrighted by the Free | |
4390 | Software Foundation, write to the Free Software Foundation; we sometimes | |
4391 | make exceptions for this. Our decision will be guided by the two goals | |
4392 | of preserving the free status of all derivatives of our free software and | |
4393 | of promoting the sharing and reuse of software generally. | |
93b45514 | 4394 | |
93b45514 | 4395 | @iftex |
47342e8f | 4396 | @heading NO WARRANTY |
93b45514 | 4397 | @end iftex |
47342e8f RP |
4398 | @ifinfo |
4399 | @center NO WARRANTY | |
4400 | @end ifinfo | |
4401 | ||
4402 | @item | |
4403 | BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY | |
4404 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN | |
4405 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES | |
4406 | PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED | |
4407 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF | |
4408 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS | |
4409 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE | |
4410 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, | |
4411 | REPAIR OR CORRECTION. | |
4412 | ||
4413 | @item | |
4414 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL | |
4415 | ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR | |
4416 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, | |
4417 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES | |
4418 | ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT | |
4419 | LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES | |
4420 | SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE | |
4421 | WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN | |
4422 | ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. | |
4423 | @end enumerate | |
4424 | ||
4425 | @iftex | |
4426 | @heading END OF TERMS AND CONDITIONS | |
4427 | @end iftex | |
4428 | @ifinfo | |
4429 | @center END OF TERMS AND CONDITIONS | |
4430 | @end ifinfo | |
4431 | ||
4432 | @page | |
f4335d56 | 4433 | @unnumberedsec How to Apply These Terms to Your New Programs |
47342e8f RP |
4434 | |
4435 | If you develop a new program, and you want it to be of the greatest | |
4436 | possible use to humanity, the best way to achieve this is to make it | |
4437 | free software which everyone can redistribute and change under these | |
4438 | terms. | |
4439 | ||
4440 | To do so, attach the following notices to the program. It is safest to | |
4441 | attach them to the start of each source file to most effectively convey | |
4442 | the exclusion of warranty; and each file should have at least the | |
4443 | ``copyright'' line and a pointer to where the full notice is found. | |
4444 | ||
4445 | @smallexample | |
4446 | @var{one line to give the program's name and a brief idea of what it does.} | |
4447 | Copyright (C) 19@var{yy} @var{name of author} | |
4448 | ||
4449 | This program is free software; you can redistribute it and/or modify | |
4450 | it under the terms of the GNU General Public License as published by | |
4451 | the Free Software Foundation; either version 1, or (at your option) | |
4452 | any later version. | |
4453 | ||
4454 | This program is distributed in the hope that it will be useful, | |
4455 | but WITHOUT ANY WARRANTY; without even the implied warranty of | |
4456 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
4457 | GNU General Public License for more details. | |
4458 | ||
4459 | You should have received a copy of the GNU General Public License | |
4460 | along with this program; if not, write to the Free Software | |
4461 | Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. | |
4462 | @end smallexample | |
4463 | ||
4464 | Also add information on how to contact you by electronic and paper mail. | |
4465 | ||
4466 | If the program is interactive, make it output a short notice like this | |
4467 | when it starts in an interactive mode: | |
4468 | ||
4469 | @smallexample | |
4470 | Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author} | |
4471 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. | |
4472 | This is free software, and you are welcome to redistribute it | |
4473 | under certain conditions; type `show c' for details. | |
4474 | @end smallexample | |
4475 | ||
4476 | The hypothetical commands `show w' and `show c' should show the | |
4477 | appropriate parts of the General Public License. Of course, the | |
4478 | commands you use may be called something other than `show w' and `show | |
4479 | c'; they could even be mouse-clicks or menu items---whatever suits your | |
4480 | program. | |
4481 | ||
4482 | You should also get your employer (if you work as a programmer) or your | |
4483 | school, if any, to sign a ``copyright disclaimer'' for the program, if | |
b50e59fe | 4484 | necessary. Here is a sample; alter the names: |
47342e8f | 4485 | |
f4335d56 | 4486 | @smallexample |
47342e8f RP |
4487 | Yoyodyne, Inc., hereby disclaims all copyright interest in the |
4488 | program `Gnomovision' (a program to direct compilers to make passes | |
4489 | at assemblers) written by James Hacker. | |
4490 | ||
4491 | @var{signature of Ty Coon}, 1 April 1989 | |
4492 | Ty Coon, President of Vice | |
f4335d56 | 4493 | @end smallexample |
47342e8f RP |
4494 | |
4495 | That's all there is to it! | |
4496 | ||
4497 | ||
93b45514 RP |
4498 | @summarycontents |
4499 | @contents | |
4500 | @bye |