Commit | Line | Data |
---|---|---|
c074abee DHW |
1 | \input texinfo |
2 | @parindent=0pt | |
3 | @setfilename gld | |
4 | @c @@setchapternewpage odd | |
5 | @settitle GLD, The GNU linker | |
6 | @titlepage | |
7 | @title{gld} | |
8 | @subtitle{The gnu loader} | |
9 | @sp 1 | |
10 | @subtitle Second Edition---gld version 2.0 | |
11 | @subtitle January 1991 | |
12 | @vskip 0pt plus 1filll | |
13 | Copyright @copyright{} 1991 Free Software Foundation, Inc. | |
14 | ||
15 | Permission is granted to make and distribute verbatim copies of | |
16 | this manual provided the copyright notice and this permission notice | |
17 | are preserved on all copies. | |
18 | ||
19 | Permission is granted to copy and distribute modified versions of this | |
20 | manual under the conditions for verbatim copying, provided also that | |
21 | the entire resulting derived work is distributed under the terms of a | |
22 | permission notice identical to this one. | |
23 | ||
24 | Permission is granted to copy and distribute translations of this manual | |
25 | into another language, under the above conditions for modified versions. | |
26 | ||
27 | @author {Steve Chamberlain} | |
28 | @author {Cygnus Support} | |
29 | @author {steve@@cygnus.com} | |
30 | @end titlepage | |
31 | ||
32 | @node Top,,, | |
33 | @comment node-name, next, previous, up | |
34 | @ifinfo | |
35 | This file documents the GNU linker gld. | |
36 | @end ifinfo | |
37 | ||
38 | @c chapter What does a linker do ? | |
39 | @c chapter Command Language | |
40 | @noindent | |
41 | @chapter Overview | |
42 | ||
43 | ||
44 | The @code{gld} command combines a number of object and archive files, | |
45 | relocates their data and ties up symbol references. Often the last | |
46 | step in building a new compiled program to run is a call to @code{gld}. | |
47 | ||
48 | The @code{gld} command accepts Linker Command Language files in | |
49 | a superset of AT+T's Link Editor Command Language syntax, | |
50 | to provide explicit and total control over the linking process. | |
51 | ||
52 | This version of @code{gld} uses the general purpose @code{bfd} libraries | |
53 | to operate on object files. This allows @code{gld} to read and | |
54 | write any of the formats supported by @code{bfd}; different | |
55 | formats may be linked together to produce output in any of the supported object file formats. | |
56 | ||
57 | Supported formats: | |
58 | @itemize @bullet | |
59 | @item | |
60 | Sun3 68k a.out | |
61 | @item | |
62 | IEEE-695 68k Object Module Format | |
63 | @item | |
64 | Oasys 68k Binary Relocatable Object File Format | |
65 | @item | |
66 | Sun4 sparc a.out | |
67 | @item | |
68 | 88k bcs coff | |
69 | @item | |
70 | i960 coff little endian | |
71 | @item | |
72 | i960 coff big endian | |
73 | @item | |
74 | i960 b.out little endian | |
75 | @item | |
76 | i960 b.out big endian | |
77 | @item | |
78 | s-records | |
79 | @end itemize | |
80 | ||
81 | When linking similar formats, @code{gld} maintains all debugging | |
82 | information. | |
83 | ||
84 | @chapter Command line options | |
85 | ||
86 | @example | |
87 | gld [ -Bstatic ] [ -D @var{datasize} ] | |
88 | [ -c @var{commandfile} ] | |
89 | [ -d ] | [ -dc ] | [ -dp ] | |
90 | [ -i ] | |
91 | [ -e @var{entry} ] [ -l @var{arch} ] [ -L @var{searchdir} ] [ -M ] | |
92 | [ -N | -n | -z ] [ -noinhibit-exec ] [ -r ] [ -S ] [ -s ] | |
93 | [ -f @var{fill} ] | |
94 | [ -T @var{textorg} ] [ -Tdata @var{dataorg} ] [ -t ] [ -u @var{sym}] | |
95 | [ -X ] [ -x ] | |
96 | [-o @var{output} ] @var{objfiles}@dots{} | |
97 | @end example | |
98 | ||
99 | Command-line options to GNU @code{gld} may be specified in any order, and | |
100 | may be repeated at will. For the most part, repeating an option with a | |
101 | different argument will either have no further effect, or override prior | |
102 | occurrences (those further to the left on the command line) of an | |
103 | option. | |
104 | ||
105 | The exceptions which may meaningfully be present several times | |
106 | are @code{-L}, @code{-l}, and @code{-u}. | |
107 | ||
108 | @var{objfiles} may follow, precede, or be mixed in with | |
109 | command-line options; save that an @var{objfiles} argument may not be | |
110 | placed between an option flag and its argument. | |
111 | ||
112 | Option arguments must follow the option letter without intervening | |
113 | whitespace, or be given as separate arguments immediately following the | |
114 | option that requires them. | |
115 | ||
116 | @table @code | |
117 | @item @var{objfiles}@dots{} | |
118 | The object files @var{objfiles} to be linked; at least one must be specified. | |
119 | ||
120 | @item -Bstatic | |
121 | This flag is accepted for command-line compatibility with the SunOS linker, | |
122 | but has no effect on @code{gld}. | |
123 | ||
124 | @item -c @var{commandfile} | |
125 | Directs @code{gld} to read linkage commands from the file @var{commandfile}. | |
126 | ||
127 | @item -D @var{datasize} | |
128 | Use this option to specify a target size for the @code{data} segment of | |
129 | your linked program. The option is only obeyed if @var{datasize} is | |
130 | larger than the natural size of the program's @code{data} segment. | |
131 | ||
132 | @var{datasize} must be an integer specified in hexadecimal. | |
133 | ||
134 | @code{gld} will simply increase the size of the @code{data} segment, | |
135 | padding the created gap with zeros, and reduce the size of the | |
136 | @code{bss} segment to match. | |
137 | ||
138 | @item -d | |
139 | Force @code{gld} to assign space to common symbols | |
140 | even if a relocatable output file is specified (@code{-r}). | |
141 | ||
142 | @item -dc | -dp | |
143 | These flags are accepted for command-line compatibility with the SunOS linker, | |
144 | but has no effect on @code{gld}. | |
145 | ||
146 | @item -e @var{entry} | |
147 | Use @var{entry} as the explicit symbol for beginning execution of your | |
148 | program, rather than the default entry point. If this symbol is | |
149 | not specified, the symbol @code{start} is used as the entry address. | |
150 | If there is no symbol called @code{start}, then the entry address | |
151 | is set to the first address in the first output section | |
152 | (usually the @samp{text} section). | |
153 | ||
154 | @item -f @var{fill} | |
155 | Sets the default fill pattern for ``holes'' in the output file to | |
156 | the lowest two bytes of the expression specified. | |
157 | ||
158 | @item -i | |
159 | Produce an incremental link (same as option @code{-r}). | |
160 | ||
161 | @item -l @var{arch} | |
162 | Add an archive file @var{arch} to the list of files to link. This | |
163 | option may be used any number of times. @code{gld} will search its | |
164 | path-list for occurrences of @code{lib@var{arch}.a} for every @var{arch} | |
165 | specified. | |
166 | ||
167 | @c This also has a side effect of using the "c++ demangler" if we happen | |
168 | @c to specify -llibg++. Document? pesch@@cygnus.com, 24jan91 | |
169 | ||
170 | @item -L @var{searchdir} | |
171 | This option adds path @var{searchdir} to the | |
172 | list of paths that @code{gld} will search for archive libraries. You | |
173 | may use this option any number of times. | |
174 | ||
175 | @c Should we make any attempt to list the standard paths searched | |
176 | @c without listing? When hacking on a new system I often want to know | |
177 | @c this, but this may not be the place... it's not constant across | |
178 | @c systems, of course, which is what makes it interesting. | |
179 | @c pesch@@cygnus.com, 24jan91. | |
180 | ||
181 | @item -M | |
182 | @itemx -m | |
183 | Print (to the standard output file) a link map---diagnostic information | |
184 | about where symbols are mapped by @code{gld}, and information on global | |
185 | common storage allocation. | |
186 | ||
187 | @item -N | |
188 | Specifies readable and writable @code{text} and @code{data} sections. If | |
189 | the output format supports Unix style magic numbers, then @code{OMAGIC} is set. | |
190 | ||
191 | @item -n | |
192 | Sets the text segment to be read only, and @code{NMAGIC} is written | |
193 | if possible. | |
194 | ||
195 | @item -o @var{output} | |
196 | @var{output} is a name for the program produced by @code{gld}; if this | |
197 | option is not specified, the name @samp{a.out} is used by default. | |
198 | ||
199 | @item -r | |
200 | Generates relocatable output---i.e., generate an output file that can in | |
201 | turn serve as input to @code{gld}. As a side effect, this option also | |
202 | sets the output file's magic number to @code{OMAGIC}; see @samp{-N}. If this | |
203 | option is not specified, an absolute file is produced. | |
204 | ||
205 | @item -S | |
206 | Omits debugger symbol information (but not all symbols) from the output file. | |
207 | ||
208 | @item -s | |
209 | Omits all symbol information from the output file. | |
210 | ||
211 | @item -T @var{textorg} | |
212 | @itemx -Ttext @var{textorg} | |
213 | Use @var{textorg} as the starting address for the @code{text} segment of the | |
214 | output file. Both forms of this option are equivalent. The option | |
215 | argument must be a hexadecimal integer. | |
216 | ||
217 | @item -Tdata @var{dataorg} | |
218 | Use @var{dataorg} as the starting address for the @code{data} segment of | |
219 | the output file. The option argument must be a hexadecimal integer. | |
220 | ||
221 | @item -t | |
222 | Prints names of input files as @code{gld} processes them. | |
223 | ||
224 | @item -u @var{sym} | |
225 | Forces @var{sym} to be entered in the output file as an undefined symbol. | |
226 | This may, for example, trigger linking of additional modules from | |
227 | standard libraries. @code{-u} may be repeated with different option | |
228 | arguments to enter additional undefined symbols. This option is equivalent | |
229 | to the @code{EXTERN} linker command. | |
230 | ||
231 | @item -X | |
232 | If @code{-s} or @code{-S} is also specified, delete only local symbols | |
233 | beginning with @samp{L}. | |
234 | ||
235 | @item -z | |
236 | @code{-z} sets @code{ZMAGIC}, the default: the @code{text} segment is | |
237 | read-only, demand pageable, and shared. | |
238 | ||
239 | Specifying a relocatable output file (@code{-r}) will also set the magic | |
240 | number to @code{OMAGIC}. | |
241 | ||
242 | See description of @samp{-N}. | |
243 | ||
244 | ||
245 | @end table | |
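The library search performed for @code{-l} and @code{-L} can be illustrated with a short Python sketch. It only builds the list of file names that would be tried, one @code{lib@var{arch}.a} per search directory; the directory names in the usage note are hypothetical, and the real search order and default path-list of @code{gld} are not specified in this manual.

```python
import posixpath


def candidate_archives(search_dirs, arch):
    """Return the archive paths that a search for -l ARCH would consider.

    search_dirs stands for the path-list built up with -L options
    (an assumption: the default directories are system dependent).
    """
    return [posixpath.join(d, "lib" + arch + ".a") for d in search_dirs]
```

For instance, with the hypothetical path-list @code{/lib} and @code{/usr/lib}, the option @code{-lc} yields the candidates @code{/lib/libc.a} and @code{/usr/lib/libc.a}.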
246 | @chapter Command Language | |
247 | ||
248 | ||
249 | The command language allows explicit control over the linkage process, allowing | |
250 | specification of: | |
251 | @table @bullet | |
252 | @item input files | |
253 | @item file formats | |
254 | @item output file format | |
255 | @item addresses of sections | |
256 | @item placement of common blocks | |
257 | @item and more | |
258 | @end table | |
259 | ||
260 | A command file may be supplied to the linker, either explicitly through the | |
261 | @code{-c} option, or implicitly as an ordinary file. If the linker opens | |
262 | a file which does not have a reasonable object or archive format, it tries | |
263 | to read the file as if it were a command file. | |
264 | @section Structure | |
265 | To be added | |
266 | ||
267 | @section Expressions | |
268 | The syntax for expressions in the command language is identical to that of | |
269 | C expressions, with the following features: | |
270 | @table @bullet | |
271 | @item All expressions are evaluated as integers and | |
272 | are of ``long'' or ``unsigned long'' type. | |
273 | @item All constants are integers. | |
274 | @item All of the C arithmetic operators are provided. | |
275 | @item Global variables may be referenced, defined and created. | |
276 | @item Built in functions may be called. | |
277 | @end table | |
278 | ||
279 | @section Evaluation of Expressions | |
280 | ||
281 | The linker has a practice of ``lazy evaluation'' for expressions; it only | |
282 | calculates an expression when absolutely necessary. For instance, | |
283 | when the linker reads in the command file it has to know the values | |
284 | of the start address and the length of the memory regions for linkage to continue, so these | |
285 | values are worked out, but other values (such as symbol values) are not | |
286 | known or needed until after storage allocation. | |
287 | They are evaluated later, when the other | |
288 | information, such as the sizes of output sections, is available for use in | |
289 | the symbol assignment expression. | |
290 | ||
291 | When a linker expression is evaluated and assigned to a variable it is given | |
292 | either an absolute or a relocatable type. An absolute expression type | |
293 | is one in which the symbol contains the value that it will have in the | |
294 | output file; a relocatable expression type is one in which the value | |
295 | is expressed as a fixed offset from the base of a section. | |
296 | ||
297 | The type of the expression is controlled by its position in the script | |
298 | file. A symbol assigned within a @code{SECTION} specification is | |
299 | created relative to the base of the section; a symbol assigned in any | |
300 | other place is created as an absolute symbol. Since a symbol created | |
301 | within a @code{SECTION} specification is relative to the base of the | |
302 | section, it will remain relocatable if relocatable output is requested. | |
303 | A symbol may be created with an absolute value even when assigned to | |
304 | within a @code{SECTION} specification by using the absolute assignment | |
305 | function @code{ABSOLUTE}. For example, to create an absolute symbol | |
306 | whose value is the address of the end of the output section @code{.data}: | |
307 | @example | |
308 | .data : | |
309 | @{ | |
310 | *(.data) | |
311 | _edata = ABSOLUTE(.) ; | |
312 | @} | |
313 | @end example | |
314 | ||
315 | Unless quoted, symbol names start with a letter, underscore, point or | |
316 | minus sign and may include any letters, underscores, digits, points, | |
317 | and minus signs. Unquoted symbol names must not conflict with any | |
318 | keywords. To specify a symbol which contains odd characters or has | |
319 | the same name as a keyword surround it in double quotes: | |
320 | @example | |
321 | "SECTION" = 9; | |
322 | "with a space" = "also with a space" + 10; | |
323 | @end example | |
324 | ||
325 | @subsection Integers | |
326 | An octal integer is @samp{0} followed by zero or more of the octal | |
327 | digits (@samp{01234567}). | |
328 | ||
329 | A decimal integer starts with a non-zero digit followed by zero or | |
330 | more digits (@samp{0123456789}). | |
331 | ||
332 | A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or | |
333 | more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}. | |
334 | ||
335 | Integers have the usual values. To denote a negative integer, use | |
336 | the unary operator @samp{-} discussed under expressions. | |
337 | ||
338 | Additionally the suffixes @code{K} and @code{M} may be used to multiply the | |
339 | previous constant by 1024 or | |
340 | @tex | |
341 | $1024^2$ | |
342 | @end tex | |
343 | respectively. | |
344 | ||
345 | @example | |
346 | _as_decimal = 57005; | |
347 | _as_hex = 0xdead; | |
348 | _as_octal = 0157255; | |
349 | ||
350 | _4k_1 = 4K; | |
351 | _4k_2 = 4096; | |
352 | _4k_3 = 0x1000; | |
353 | @end example | |
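The rules for integer constants can be checked with a small Python sketch. It is an illustration of the notation described above, not the parser used by @code{gld}; the function name @code{parse_integer} is hypothetical, and the sketch assumes that a @code{K} or @code{M} suffix may follow a constant in any of the three bases.

```python
def parse_integer(text):
    """Parse an integer constant written in the notation described above."""
    multiplier = 1
    # An optional K or M suffix scales the constant by 1024 or 1024*1024.
    if text.endswith("K"):
        multiplier, text = 1024, text[:-1]
    elif text.endswith("M"):
        multiplier, text = 1024 * 1024, text[:-1]

    if text[:2] in ("0x", "0X"):
        value = int(text[2:], 16)  # hexadecimal: 0x or 0X prefix
    elif text.startswith("0"):
        value = int(text, 8)       # octal: leading 0
    else:
        value = int(text, 10)      # decimal: leading non-zero digit
    return value * multiplier
```

Applied to the constants in the example, @code{57005}, @code{0xdead} and @code{0157255} all parse to the same value, as do @code{4K}, @code{4096} and @code{0x1000}.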
354 | @subsection Operators | |
355 | The linker provides the standard C set of arithmetic operators, with | |
356 | the standard bindings and precedence levels: | |
357 | @example | |
358 | ||
359 | @end example | |
360 | @tex | |
361 | ||
362 | \vbox{\offinterlineskip | |
363 | \hrule | |
364 | \halign | |
365 | {\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#\cr | |
366 | height2pt&&&&&\cr | |
367 | &Level&& associativity &&Operators&\cr | |
368 | height2pt&&&&&\cr | |
369 | \noalign{\hrule} | |
370 | height2pt&&&&&\cr | |
371 | &highest&&&&&&\cr | |
372 | &1&&left&&$ ! - ~$&\cr | |
373 | height2pt&&&&&\cr | |
374 | &2&&left&&* / \%&\cr | |
375 | height2pt&&&&&\cr | |
376 | &3&&left&&+ -&\cr | |
377 | height2pt&&&&&\cr | |
378 | &4&&left&&$>> <<$&\cr | |
379 | height2pt&&&&&\cr | |
380 | &5&&left&&$== != > < <= >=$&\cr | |
381 | height2pt&&&&&\cr | |
382 | &6&&left&&\&&\cr | |
383 | height2pt&&&&&\cr | |
384 | &7&&left&&|&\cr | |
385 | height2pt&&&&&\cr | |
386 | &8&&left&&{\&\&}&\cr | |
387 | height2pt&&&&&\cr | |
388 | &9&&left&&||&\cr | |
389 | height2pt&&&&&\cr | |
390 | &10&&right&&? :&\cr | |
391 | height2pt&&&&&\cr | |
392 | &11&&right&&$${\&= += -= *= /=}&\cr | |
393 | &lowest&&&&&&\cr | |
394 | height2pt&&&&&\cr} | |
395 | \hrule} | |
396 | @end tex | |
397 | ||
398 | @section Built in Functions | |
399 | The command language provides built in functions for use in | |
400 | expressions in linkage scripts. | |
401 | @table @bullet | |
402 | @item @code{ALIGN(@var{exp})} | |
403 | returns the result of the current location counter (@code{dot}) | |
404 | aligned to the next @var{exp} boundary, where @var{exp} is a power of | |
405 | two. This is equivalent to @code{(. + @var{exp} -1) & ~(@var{exp}-1)}. | |
406 | As an example, to align the output @code{.data} section to the | |
407 | next 0x2000 byte boundary after the preceding section and to set a | |
408 | variable within the section to the next 0x8000 boundary after the | |
409 | input sections: | |
410 | @example | |
411 | .data ALIGN(0x2000) :@{ | |
412 | *(.data) | |
413 | variable = ALIGN(0x8000); | |
414 | @} | |
415 | @end example | |
416 | ||
417 | @item @code{ADDR(@var{section name})} | |
418 | returns the absolute address of the named section if the section has | |
419 | already been bound. In the following example, @code{symbol_1} and | |
420 | @code{symbol_2} are assigned identical values: | |
421 | @example | |
422 | .output1: | |
423 | @{ | |
424 | start_of_output_1 = .; | |
425 | ... | |
426 | @} | |
427 | .output: | |
428 | @{ | |
429 | symbol_1 = ADDR(.output1); | |
430 | symbol_2 = start_of_output_1; | |
431 | @} | |
432 | @end example | |
433 | ||
434 | @item @code{SIZEOF(@var{section name})} | |
435 | returns the size in bytes of the named section, if the section has | |
436 | been allocated. In the following example the @code{symbol_1} and | |
437 | @code{symbol_2} are assigned identical values: | |
438 | @example | |
439 | .output : @{ | |
440 | .start = . ; | |
441 | ... | |
442 | .end = .; | |
443 | @} | |
444 | symbol_1 = .end - .start; | |
445 | symbol_2 = SIZEOF(.output); | |
446 | @end example | |
447 | ||
448 | @item @code{DEFINED(@var{symbol name})} | |
449 | Returns 1 if the symbol is in the linker global symbol table and is | |
450 | defined, otherwise it returns 0. This example shows the setting of a | |
451 | global symbol @code{begin} to the first location in the @code{.text} | |
452 | section, only if there is no other symbol | |
453 | called @code{begin} already: | |
454 | @example | |
455 | .text: @{ | |
456 | begin = DEFINED(begin) ? begin : . ; | |
457 | ... | |
458 | @} | |
459 | @end example | |
460 | @end table | |
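The arithmetic behind @code{ALIGN(@var{exp})} can be tried out in Python. The sketch below applies the formula given above, @code{(. + @var{exp} - 1) & ~(@var{exp} - 1)}, to an explicit location counter value; the function name @code{align} and the power-of-two check are additions made for illustration.

```python
def align(dot, exp):
    """Round dot up to the next multiple of exp, where exp is a power of two."""
    if exp <= 0 or exp & (exp - 1):
        raise ValueError("exp must be a power of two")
    return (dot + exp - 1) & ~(exp - 1)
```

With a location counter of @code{0x2080}, @code{align(0x2080, 0x2000)} gives @code{0x4000}; a value that is already on a boundary is returned unchanged.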
461 | @page | |
462 | @section MEMORY Directive | |
463 | The linker's default configuration is for all memory to be | |
464 | allocatable. This state may be overridden by using the @code{MEMORY} | |
465 | directive. The @code{MEMORY} directive describes the location and | |
466 | size of blocks of memory in the target. Careful use can describe | |
467 | memory regions which may or may not be used by the linker. The linker | |
468 | does not shuffle sections to fit into the available regions, but does | |
469 | move the requested sections into the correct regions and issue errors | |
470 | when the regions become too full. The syntax is: | |
471 | ||
472 | @example | |
473 | MEMORY | |
474 | @{ | |
475 | @tex | |
476 | $\bigl\lbrace {\it name_1} ({\it attr_1}):$ ORIGIN = ${\it origin_1},$ LENGTH $= {\it len_1} \bigr\rbrace $ | |
477 | @end tex | |
478 | ||
479 | @} | |
480 | @end example | |
481 | @table @code | |
482 | @item @var{name} | |
483 | is a name used internally by the linker to refer to the region. Any | |
484 | symbol name may be used. The region names are stored in a separate | |
485 | name space, and will not conflict with symbols, filenames or section | |
486 | names. | |
487 | @item @var{attr} | |
488 | is an optional list of attributes, parsed for compatibility with the | |
489 | AT+T linker | |
490 | but ignored by both the AT+T and the GNU linker. | |
491 | @item @var{origin} | |
492 | is the start address of the region in physical memory, expressed as a | |
493 | standard linker expression which must evaluate to a constant before | |
494 | memory allocation is performed. The keyword @code{ORIGIN} may be | |
495 | abbreviated to @code{org} or @code{o}. | |
496 | @item @var{len} | |
497 | is the size in bytes of the region as a standard linker expression. | |
498 | The keyword @code{LENGTH} may be abbreviated to @code{len} or @code{l}. | |
499 | @end table | |
500 | ||
501 | For example, to specify that memory has two regions available for | |
502 | allocation; one starting at 0 for 256k, and the other starting at | |
503 | 0x40000000 for four megabytes: | |
504 | ||
505 | @example | |
506 | MEMORY | |
507 | @{ | |
508 | rom : ORIGIN= 0, LENGTH = 256K | |
509 | ram : ORIGIN= 0x40000000, LENGTH = 4M | |
510 | @} | |
511 | ||
512 | @end example | |
513 | ||
514 | If the combined output sections directed to a region are too big for | |
515 | the region the linker will emit an error message. | |
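The overflow check can be pictured with the following Python sketch, which uses the two regions from the example above. It is only a model of the bookkeeping: the function name @code{check_region}, the section sizes and the wording of the error are hypothetical and are not taken from @code{gld}.

```python
K = 1024
M = 1024 * 1024

# Regions from the MEMORY example above: name -> (origin, length in bytes).
regions = {"rom": (0, 256 * K), "ram": (0x40000000, 4 * M)}


def check_region(name, section_sizes):
    """Return the free bytes left in a region, or raise if it overflows."""
    origin, length = regions[name]
    used = sum(section_sizes)
    if used > length:
        raise OverflowError(f"region {name} overflows by {used - length} bytes")
    return length - used
```

Placing sections of 200K and 40K into @code{rom} leaves 16K free, while sections of 200K and 100K would exceed the 256K length and raise the error.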
516 | @page | |
517 | @section SECTIONS Directive | |
518 | The @code{SECTIONS} directive | |
519 | controls exactly where input sections are placed into output sections, their | |
520 | order and to which output sections they are allocated. | |
521 | ||
522 | When no @code{SECTIONS} directives are specified, the default action | |
523 | of the linker is to place each input section into an identically named | |
524 | output section in the order that the sections appear in the first | |
525 | file, and then the order of the files. | |
526 | ||
527 | The syntax of the @code{SECTIONS} directive is: | |
528 | ||
529 | @example | |
530 | SECTIONS | |
531 | @{ | |
532 | @tex | |
533 | $\bigl\lbrace {\it name_n}\bigl[options\bigr]\colon$ $\bigl\lbrace {\it statements_n} \bigr\rbrace \bigl[ = {\it fill expression } \bigr] \bigl[ > mem spec \bigr] \bigr\rbrace $ | |
534 | @end tex | |
535 | @} | |
536 | @end example | |
537 | ||
538 | @table @code | |
539 | @item @var{name} | |
540 | controls the name of the output section. In formats which only support | |
541 | a limited number of sections, such as @code{a.out}, the name must be | |
542 | one of the names supported by the format (in the case of a.out, | |
543 | @code{.text}, @code{.data} or @code{.bss}). If the output format | |
544 | supports any number of sections, but with numbers and not names (in | |
545 | the case of IEEE), the name should be supplied as a quoted numeric | |
546 | string. A section name may consist of any sequence of characters, but | |
547 | any name which does not conform to the standard @code{gld} symbol name | |
548 | syntax must be quoted. To copy sections 1 through 4 from an Oasys file | |
549 | into the @code{.text} section of an @code{a.out} file, and sections 13 | |
550 | and 14 into the @code{.data} section: | |
551 | @example | |
552 | ||
553 | SECTIONS @{ | |
554 | .text :@{ | |
555 | *("1" "2" "3" "4") | |
556 | @} | |
557 | ||
558 | .data :@{ | |
559 | *("13" "14") | |
560 | @} | |
561 | @} | |
562 | @end example | |
563 | ||
564 | @item @var{fill expression} | |
565 | If present this | |
566 | expression sets the fill value. Any unallocated holes in the current output | |
567 | section when written to the output file will | |
568 | be filled with the two least significant bytes of the value, repeated as | |
569 | necessary. | |
570 | @page | |
571 | @item @var{options} | |
572 | The @var{options} parameter is a list of optional arguments specifying | |
573 | attributes of the output section. They may be taken from the following | |
574 | list: | |
575 | @table @bullet{} | |
576 | @item @var{addr expression} | |
577 | forces the output section to be loaded at a specified address. The | |
578 | address is specified as a standard linker expression. The following | |
579 | example generates section @var{output} at location | |
580 | @code{0x40000000}: | |
581 | @example | |
582 | SECTIONS @{ | |
583 | output 0x40000000: @{ | |
584 | ... | |
585 | @} | |
586 | @} | |
587 | @end example | |
588 | Since the built in function @code{ALIGN} references the location | |
589 | counter implicitly, a section may be located on a certain boundary by | |
590 | using the @code{ALIGN} function in the expression. For example, to | |
591 | locate the @code{.data} section on the next 8k boundary after the end | |
592 | of the @code{.text} section: | |
593 | @example | |
594 | SECTIONS @{ | |
595 | .text : @{ | |
596 | ... | |
597 | @} | |
598 | .data ALIGN(8K) : @{ | |
599 | ... | |
600 | @} | |
601 | @} | |
602 | @end example | |
603 | @end table | |
604 | @item @var{statements} | |
605 | is a list of file names, input sections and assignments. These statements control what is placed into the | |
606 | output section. | |
607 | The syntax of a single @var{statement} is one of: | |
608 | @table @bullet | |
609 | ||
610 | @item @var{symbol} [ = | += | -= | *= | /= ] @var{expression} @code{;} | |
611 | ||
612 | Global symbols may be created and have their values (addresses) | |
613 | altered using the assignment statement. The linker tries to put off | |
614 | the evaluation of an assignment until all the terms in the source | |
615 | expression are known; for instance the sizes of sections cannot be | |
616 | known until after allocation, so assignments dependent upon these are | |
617 | not performed until after allocation. Some expressions, such as those | |
618 | depending upon the location counter @code{dot}, @samp{.}, must be | |
619 | evaluated during allocation. If the result of an expression is | |
620 | required, but the value is not available, then an error results. For example: | |
621 | @example | |
622 | SECTIONS @{ | |
623 | text 9+this_isnt_constant: | |
624 | @{ | |
625 | @} | |
626 | @} | |
627 | testscript:21: Non constant expression for initial address | |
628 | @end example | |
629 | ||
630 | @item @code{CREATE_OBJECT_SYMBOLS} | |
631 | causes the linker to create a symbol for each input file and place it | |
632 | into the specified section, set to the address of the first byte of | |
633 | data written from the input file. For instance, with @code{a.out} | |
634 | files it is conventional to have a symbol for each input file. | |
635 | @example | |
636 | SECTIONS @{ | |
637 | .text 0x2020 : | |
638 | @{ | |
639 | CREATE_OBJECT_SYMBOLS | |
640 | *(.text) | |
641 | _etext = ALIGN(0x2000); | |
642 | @} | |
643 | @} | |
644 | @end example | |
645 | Supplied with four object files, @code{a.o}, @code{b.o}, @code{c.o}, | |
646 | and @code{d.o} a run of | |
647 | @code{gld} could create a map: | |
648 | @example | |
649 | From functions like : | |
650 | a.c: | |
651 | afunction() @{ @} | |
652 | int adata=1; | |
653 | int abss; | |
654 | ||
655 | 00000000 A __DYNAMIC | |
656 | 00004020 B _abss | |
657 | 00004000 D _adata | |
658 | 00002020 T _afunction | |
659 | 00004024 B _bbss | |
660 | 00004008 D _bdata | |
661 | 00002038 T _bfunction | |
662 | 00004028 B _cbss | |
663 | 00004010 D _cdata | |
664 | 00002050 T _cfunction | |
665 | 0000402c B _dbss | |
666 | 00004018 D _ddata | |
667 | 00002068 T _dfunction | |
668 | 00004020 D _edata | |
669 | 00004030 B _end | |
670 | 00004000 T _etext | |
671 | 00002020 t a.o | |
672 | 00002038 t b.o | |
673 | 00002050 t c.o | |
674 | 00002068 t d.o | |
675 | ||
676 | @end example | |
677 | ||
678 | @item @var{filename} @code{(} @var{section name list} @code{)} | |
679 | This command allocates all the named sections from the input object | |
680 | file supplied into the output section at the current point. Sections | |
681 | are written in the order they appear in the list, so: | |
682 | @example | |
683 | SECTIONS @{ | |
684 | .text 0x2020 : | |
685 | @{ | |
686 | a.o(.data) | |
687 | b.o(.data) | |
688 | *(.text) | |
689 | @} | |
690 | .data : | |
691 | @{ | |
692 | *(.data) | |
693 | @} | |
694 | .bss : | |
695 | @{ | |
696 | *(.bss) | |
697 | COMMON | |
698 | @} | |
699 | @} | |
700 | @end example | |
701 | will produce a map: | |
702 | @example | |
703 | ||
704 | insert here | |
705 | @end example | |
706 | @item @code{* (} @var{section name list} @code{)} | |
707 | This command causes all sections from all input files which have not | |
708 | yet been assigned output sections to be assigned the current output | |
709 | section. | |
710 | ||
711 | @item @var{filename} @code{[COMMON]} | |
712 | This allocates all the common symbols from the specified file and places | |
713 | them into the current output section. | |
714 | ||
715 | @item @code{* [COMMON]} | |
716 | This allocates all the common symbols from the files which have not | |
717 | yet had their common symbols allocated and places them into the current | |
718 | output section. | |
719 | ||
720 | @item @var{filename} | |
721 | A filename alone within a @code{SECTIONS} statement will cause all the | |
722 | input sections from the file to be placed into the current output | |
723 | section at the current location. If the file name has been mentioned | |
724 | before with a section name list then only those | |
725 | sections which have not yet been allocated are placed. | |
726 | ||
727 | The following example reads all of the sections from file all.o and | |
728 | places them at the start of output section @code{outputa} which starts | |
729 | at location @code{0x10000}. All of the data from section @code{.input1} from | |
730 | file foo.o is placed next into the same output section. All of | |
731 | section @code{.input2} is read from foo.o and placed into output | |
732 | section @code{outputb}. Next all of section @code{.input1} is read | |
733 | from foo1.o. All of the remaining @code{.input1} and @code{.input2} | |
734 | sections from any files are written to output section @code{outputc}. | |
735 | ||
736 | @example | |
737 | SECTIONS | |
738 | @{ | |
739 | outputa 0x10000 : | |
740 | @{ | |
741 | all.o | |
742 | foo.o (.input1) | |
743 | @} | |
744 | outputb : | |
745 | @{ | |
746 | foo.o (.input2) | |
747 | foo1.o (.input1) | |
748 | @} | |
749 | outputc : | |
750 | @{ | |
751 | *(.input1) | |
752 | *(.input2) | |
753 | @} | |
754 | @} | |
755 | ||
756 | @end example | |
757 | @end table | |
758 | @end table | |
759 | @section Using the Location Counter | |
760 | The special linker variable @code{dot}, @samp{.} always contains the | |
761 | current output location counter. Since the @code{dot} always refers to | |
762 | a location in an output section, it must always appear in an | |
763 | expression within a @code{SECTIONS} directive. The @code{dot} symbol | |
764 | may appear anywhere that an ordinary symbol may appear in an | |
765 | expression, but its assignments have a side effect. Assigning a value | |
766 | to the @code{dot} symbol will cause the location counter to be moved. | |
767 | This may be used to create holes in the output section. The location | |
768 | counter may never be moved backwards. | |
769 | @example | |
770 | SECTIONS | |
771 | @{ | |
772 | output : | |
773 | @{ | |
774 | file1(.text) | |
775 | . = . + 1000; | |
776 | file2(.text) | |
777 | . += 1000; | |
778 | file3(.text) | |
779 | . -= 32; | |
780 | file4(.text) | |
781 | @} = 0x1234; | |
782 | @} | |
783 | @end example | |
784 | In the previous example, @code{file1} is located at the beginning of | |
785 | the output section, then there is a 1000 byte gap, filled with 0x1234. | |
786 | Then @code{file2} appears, also with a 1000 byte gap following before | |
787 | @code{file3} is loaded. Then the first 32 bytes of @code{file4} are | |
788 | placed over the last 32 bytes of @code{file3}. | |
789 | @section Command Language Syntax | |
790 | @section The Entry Point | |
791 | The linker chooses the first executable instruction in an output file from a list | |
792 | of possibilities, in order: | |
793 | @itemize @bullet | |
794 | @item | |
795 | The value of the symbol given on the command line with the @code{-e} option, if | |
796 | present. | |
797 | @item | |
798 | The value of the symbol provided in the @code{ENTRY} directive, | |
799 | if present. | |
800 | @item | |
801 | The value of the symbol @code{start}, if present. | |
802 | @item | |
803 | The value of the symbol @code{_main}, if present. | |
804 | @item | |
805 | The address of the first byte of the @code{.text} section, if present. | |
806 | @item | |
807 | The value 0. | |
808 | @end itemize | |
809 | If the symbol @code{start} is not defined within the set of input | |
810 | files to a link, it may be generated by a simple assignment | |
811 | expression, for example: | |
812 | @example | |
813 | start = 0x2020; | |
814 | @end example | |
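The search order above can be sketched in C. This is only an illustration of the documented order, not code from @code{gld}: the names @code{struct sym}, @code{find_sym} and @code{choose_entry} are hypothetical, and the sketch assumes that a symbol named with @code{-e} or @code{ENTRY} but not defined anywhere is skipped in favour of the next candidate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table entry; not a gld data structure. */
struct sym {
    const char *name;
    unsigned long value;
};

/* Look up a symbol by name; returns NULL if name is NULL or not found. */
static const struct sym *find_sym(const struct sym *tab, size_t n,
                                  const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Try each candidate in the documented order; the first match wins.
   dash_e:     symbol named with -e, or NULL if the option was not given
   entry_dir:  symbol named in an ENTRY directive, or NULL
   text_start: address of the first byte of .text, or NULL if no .text
   Assumption: a named but undefined symbol falls through to the next
   candidate. */
unsigned long choose_entry(const struct sym *tab, size_t n,
                           const char *dash_e, const char *entry_dir,
                           const unsigned long *text_start)
{
    const char *names[4];
    size_t i;

    assert(tab != NULL || n == 0);
    names[0] = dash_e;
    names[1] = entry_dir;
    names[2] = "start";
    names[3] = "_main";
    for (i = 0; i < 4; i++) {
        const struct sym *s = find_sym(tab, n, names[i]);
        if (s != NULL)
            return s->value;
    }
    if (text_start != NULL)
        return *text_start;
    return 0; /* last resort: the value 0 */
}
```

With a table that defines both @code{start} and @code{_main} and no @code{-e} or @code{ENTRY} symbol, the sketch returns the value of @code{start}.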
815 | @section Section Attributes | |
816 | @section Allocation of Sections into Memory | |
817 | @section Defining Symbols | |
818 | @chapter Examples of operation | |
819 | The simplest case is linking standard Unix object files on a standard | |
820 | Unix system supported by the linker. To link a file hello.o: | |
821 | @example | |
822 | $ gld -o output /lib/crt0.o hello.o -lc | |
823 | @end example | |
824 | This tells @code{gld} to produce a file called @code{output} after linking | |
825 | the file @code{/lib/crt0.o} with @code{hello.o} and the library | |
826 | @code{libc.a} which will come from the standard search directories. | |
827 | @chapter Partial Linking | |
828 | Specifying the @code{-r} option on the command line causes @code{gld} to | |
829 | perform a partial link. | |
830 | ||
831 | ||
832 | @chapter BFD | |
833 | ||
834 | The linker accesses object and archive files using the @code{bfd} | |
835 | libraries. These libraries allow the linker to use the same routines | |
836 | to operate on object files whatever the object file format. | |
837 | ||
838 | A different object file format can be supported simply by creating a | |
839 | new @code{bfd} back end and adding it to the library. | |
840 | ||
841 | Formats currently supported: | |
842 | @itemize @bullet | |
843 | @item | |
844 | Sun3 68k a.out | |
845 | @item | |
846 | IEEE-695 68k Object Module Format | |
847 | @item | |
848 | Oasys 68k Binary Relocatable Object File Format | |
849 | @item | |
850 | Sun4 sparc a.out | |
851 | @item | |
852 | 88k bcs coff | |
853 | @item | |
854 | i960 coff little endian | |
855 | @item | |
856 | i960 coff big endian | |
857 | @item | |
858 | i960 b.out little endian | |
859 | @item | |
860 | i960 b.out big endian | |
861 | @end itemize | |
862 | ||
863 | As with most implementations, @code{bfd} is a compromise between | |
864 | several conflicting requirements. The major factor influencing | |
865 | @code{bfd} design was efficiency: any time spent converting between | |
866 | formats is time which would not have been spent had @code{bfd} not | |
867 | been involved. This is partly offset by abstraction payback; since | |
868 | @code{bfd} simplifies applications and back ends, more time and care | |
869 | may be spent optimizing algorithms for greater speed. | |
870 | ||
871 | One minor artifact of the @code{bfd} solution which the | |
872 | user should be aware of is information loss. | |
873 | There are two places where useful information can be lost using the | |
874 | @code{bfd} mechanism: during conversion and during output. | |
875 | ||
876 | @section How it works | |
877 | When an object file is opened, @code{bfd} | |
878 | tries to automatically determine the format of the input object file. A | |
879 | descriptor is then built in memory with pointers to routines to access | |
880 | elements of the object file's data structures. | |
881 | ||
882 | As different information from the object files is required, | |
883 | @code{bfd} reads from different sections of the file and processes | |
884 | them. For example, a very common operation for the linker is processing | |
885 | symbol tables. Each @code{bfd} back end provides a routine for | |
886 | converting between the object file's representation of symbols and an | |
887 | internal canonical format. When the linker asks for the symbol table | |
888 | of an object file, it calls through the memory pointer to the relevant | |
889 | @code{bfd} back end routine which reads and converts the table into | |
890 | the canonical form. The linker then operates upon the canonical form. When | |
891 | the link is finished and the linker writes the symbol table of the | |
892 | output file, another @code{bfd} back end routine is called which takes | |
893 | the newly created symbol table and converts it into the output format. | |
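
The call through a per-file descriptor can be sketched in C. This is a minimal illustration, not the actual @code{bfd} interface: the two external symbol layouts, @code{struct backend}, @code{struct file_desc} and @code{get_symbols} are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical canonical symbol; the real bfd structures are richer. */
struct canon_sym {
    const char *name;
    unsigned long value;
};

/* Two made-up external symbol layouts standing in for real formats. */
struct fmt_a_sym { unsigned long value; const char *name; };
struct fmt_b_sym { const char *name; unsigned short hi, lo; };

/* Per-format routine table, reached through the open file's descriptor. */
struct backend {
    const char *format_name;
    size_t (*read_symbols)(const void *raw, size_t n, struct canon_sym *out);
};

/* Convert format A symbols: the fields map across directly. */
static size_t read_a(const void *raw, size_t n, struct canon_sym *out)
{
    const struct fmt_a_sym *in = raw;
    size_t i;

    for (i = 0; i < n; i++) {
        out[i].name = in[i].name;
        out[i].value = in[i].value;
    }
    return n;
}

/* Convert format B symbols: the value is split into two 16-bit halves. */
static size_t read_b(const void *raw, size_t n, struct canon_sym *out)
{
    const struct fmt_b_sym *in = raw;
    size_t i;

    for (i = 0; i < n; i++) {
        out[i].name = in[i].name;
        out[i].value = ((unsigned long)in[i].hi << 16) | in[i].lo;
    }
    return n;
}

const struct backend backend_a = { "format-a", read_a };
const struct backend backend_b = { "format-b", read_b };

/* Descriptor built when a file is opened. */
struct file_desc {
    const struct backend *be;
    const void *raw_syms;
    size_t nsyms;
};

/* The caller never inspects the format; it calls through the descriptor
   and receives canonical symbols back. */
size_t get_symbols(const struct file_desc *fd, struct canon_sym *out)
{
    assert(fd != NULL && fd->be != NULL);
    return fd->be->read_symbols(fd->raw_syms, fd->nsyms, out);
}
```

The point of the design is that the caller works only with @code{struct canon_sym}; supporting another format means supplying another @code{struct backend}, with no change to the caller.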
894 | ||
895 | @section Information Leaks | |
896 | @table @bullet{} | |
897 | @item Information lost during output. | |
898 | The output formats supported by @code{bfd} do not provide identical | |
899 | facilities, and information which may be described in one form | |
900 | has nowhere to go in another format. One example of this would be | |
901 | alignment information in @code{b.out}. There is nowhere in an @code{a.out} | |
902 | format file to store alignment information on the contained data, so when | |
903 | a file is linked from @code{b.out} and an @code{a.out} image is produced, | |
904 | alignment information is lost. (Note that in this case the linker has the | |
905 | alignment information internally, so the link is performed correctly). | |
906 | ||
907 | Another example is COFF section names. COFF files may contain an | |
908 | unlimited number of sections, each one with a textual section name. If | |
909 | the target of the link is a format which does not have many sections | |
910 | (such as @code{a.out}) or has sections without names (such as the Oasys format), | |
911 | the link cannot be done simply. It is possible to circumvent this | |
912 | problem by describing the desired input section to output section | |
913 | mapping with the command language. | |
914 | ||
915 | @item Information lost during canonicalization. | |
916 | The @code{bfd} | |
917 | internal canonical form of the external formats is not exhaustive; | |
918 | there are structures in input formats for which there is no direct | |
919 | representation internally. This means that the @code{bfd} back ends | |
920 | cannot maintain all the data richness through the transformation | |
921 | from external to internal and back to external formats. | |
922 | ||
923 | This limitation is only a problem when using the linker to read one | |
924 | format and write another. Each @code{bfd} back end is responsible for | |
925 | maintaining as much data as possible, and the internal @code{bfd} | |
926 | canonical form has structures which are opaque to the @code{bfd} core, | |
927 | and exported only to the back ends. When a file is read in one format, | |
928 | the canonical form is generated for @code{bfd} and the linker. At the | |
929 | same time, the back end saves away any information which may otherwise | |
930 | be lost. If the data is then written back to the same back end, the | |
931 | back end routine will be able to use the canonical form provided by | |
932 | the @code{bfd} core as well as the information it prepared earlier. | |
933 | Since there is a great deal of commonality between back ends, this | |
934 | mechanism is very useful. There is no information lost when linking | |
935 | big endian COFF to little endian COFF, or from @code{a.out} to @code{b.out}. When a | |
936 | mixture of formats is linked, information is only lost from the | |
937 | files whose format differs from that of the destination. | |
938 | @end table | |
939 | @section Mechanism | |
940 | The least information is preserved when there | |
941 | is little overlap between the information provided by the source | |
942 | format, that stored by the canonical format and the information needed | |
943 | by the destination format. A brief description of the canonical form | |
944 | will help the user appreciate what can be maintained | |
945 | between conversions. | |
946 | ||
947 | @table @bullet | |
948 | @item file level | |
949 | Information on target machine architecture, particular implementation | |
950 | and format type is stored on a per-file basis. Other information | |
951 | includes a demand pageable bit and a write protected bit. Note that | |
952 | information like Unix magic numbers is not stored here, only the magic | |
953 | number's meaning, so a ZMAGIC file would have both the demand pageable | |
954 | bit and the write protected text bit set. | |
955 | ||
956 | The byte order of the target is stored on a per file basis, so that | |
957 | both big and little endian object files may be linked together at the | |
958 | same time. | |
959 | @item section level | |
960 | Each section in the input file contains the name of the section, the | |
961 | original address in the object file, various flags, size and alignment | |
962 | information and pointers into other @code{bfd} data structures. | |
963 | @item symbol level | |
964 | Each symbol contains a pointer to the object file which originally | |
965 | defined it, its name, value and various flag bits. When a symbol | |
966 | table is read in, all symbols are relocated to make them relative to | |
967 | the base of the section they were defined in, so each symbol points to | |
968 | the containing section. Each symbol also has a varying amount of | |
969 | hidden data to contain private data for the back end. Since the symbol | |
970 | points to the original file, the symbol private data format is | |
971 | accessible. Operations may be done to a list of symbols of wildly | |
972 | different formats without problems. | |
973 | ||
974 | Normal global and simple local symbols are maintained on output, so an | |
975 | output file, no matter the format, will retain symbols pointing to | |
976 | functions, globals, statics and commons. Some symbol information is | |
977 | not worth retaining; in @code{a.out}, type information is stored in the | |
978 | symbol table as long symbol names. This information would be useless | |
979 | to most COFF debuggers and may be thrown away with appropriate command | |
980 | line switches. (Note that @code{gdb} does support stabs in COFF.) | |
981 | ||
982 | There is one word of type information within the symbol, so if the | |
983 | format supports symbol type information within symbols (for example COFF, | |
984 | IEEE, Oasys) and the type is simple enough to fit within one word | |
985 | (nearly everything but aggregates), the information will be preserved. | |
986 | ||
987 | @item relocation level | |
988 | Each canonical relocation record contains a pointer to the symbol to | |
989 | relocate to, the offset of the data to relocate, the section the data | |
990 | is in and a pointer to a relocation type descriptor. Relocation is | |
991 | performed effectively by message passing through the relocation type | |
992 | descriptor and symbol pointer. It allows relocations to be performed | |
993 | on output data using a relocation method only available in one of the | |
994 | input formats. For instance, Oasys provides a byte relocation format. | |
995 | A relocation record requesting this relocation type would point | |
996 | indirectly to a routine to perform this, so the relocation may be | |
997 | performed on a byte being written to a COFF file, even though 68k COFF | |
998 | has no such relocation type. | |
999 | ||
1000 | @item line numbers | |
1001 | Line numbers have to be relocated along with the symbol information. | |
1002 | Each symbol with an associated list of line number records points to | |
1003 | the first record of the list. The head of a line number list consists | |
1004 | of a pointer to the symbol, which allows the address of the function | |
1005 | whose line numbers are being described to be determined. The rest of | |
1006 | the list consists of tuples of section offsets and line indexes. Any | |
1007 | format which can simply derive this information can pass it without | |
1008 | loss between formats (COFF, IEEE and Oasys). | |
1009 | @end table | |
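
The relocation level described above, where each record carries a pointer to a relocation type descriptor, can be sketched in C. This is a deliberately simplified illustration, not the @code{bfd} implementation: all structure and function names are hypothetical, and the routines simply store the symbol value rather than combining it with any existing contents of the section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output section: a buffer of bytes being written. */
struct section {
    unsigned char *data;
    size_t size;
};

struct symbol {
    const char *name;
    unsigned long value;
};

struct reloc;

/* Relocation type descriptor: the record carries its own patch routine. */
struct reloc_type {
    const char *name;
    size_t width;                         /* number of bytes patched */
    void (*apply)(const struct reloc *r); /* routine that does the patch */
};

/* Canonical relocation record, following the fields listed in the text. */
struct reloc {
    const struct symbol *sym;       /* symbol to relocate to */
    size_t offset;                  /* offset of the data to relocate */
    struct section *sec;            /* section the data is in */
    const struct reloc_type *type;  /* relocation type descriptor */
};

/* Byte relocation: store the low 8 bits of the symbol value. */
static void apply_byte(const struct reloc *r)
{
    r->sec->data[r->offset] = (unsigned char)(r->sym->value & 0xff);
}

/* 32-bit big endian relocation: store the value most significant byte first. */
static void apply_word32_be(const struct reloc *r)
{
    unsigned char *p = r->sec->data + r->offset;
    unsigned long v = r->sym->value;

    p[0] = (unsigned char)((v >> 24) & 0xff);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >> 8) & 0xff);
    p[3] = (unsigned char)(v & 0xff);
}

const struct reloc_type reloc_byte = { "byte", 1, apply_byte };
const struct reloc_type reloc_word32_be = { "word32-be", 4, apply_word32_be };

/* The output side needs no knowledge of the relocation type; it only
   checks bounds and calls through the descriptor. */
void perform_reloc(const struct reloc *r)
{
    assert(r != NULL && r->type != NULL && r->sec != NULL);
    assert(r->offset + r->type->width <= r->sec->size);
    r->type->apply(r);
}
```

Because the patch routine travels with the record, a byte relocation that originated in one input format can still be applied to data destined for an output format that has no such relocation type of its own.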
1010 | ||
1011 | ||
1012 | @bye | |
1013 | ||
1014 |