1\input texinfo
2@setfilename bfdint.info
3
4@settitle BFD Internals
5@iftex
6@title{BFD Internals}
7@author{Ian Lance Taylor}
8@author{Cygnus Solutions}
9@end iftex
10
11@node Top
12@top BFD Internals
13@raisesections
14@cindex bfd internals
15
16This document describes some BFD internal information which may be
17helpful when working on BFD. It is very incomplete.
18
19This document is not updated regularly, and may be out of date. It was
20last modified on $Date$.
21
22The initial version of this document was written by Ian Lance Taylor
23@email{ian@@cygnus.com}.
24
25@menu
26* BFD glossary:: BFD glossary
27* BFD guidelines:: BFD programming guidelines
28* BFD target vector:: BFD target vector
29* BFD generated files:: BFD generated files
30* BFD multiple compilations:: Files compiled multiple times in BFD
31* BFD relocation handling:: BFD relocation handling
32* BFD ELF support:: BFD ELF support
33* Index:: Index
34@end menu
35
36@node BFD glossary
37@section BFD glossary
38@cindex glossary for bfd
39@cindex bfd glossary
40
41This is a short glossary of some BFD terms.
42
43@table @asis
44@item a.out
45The a.out object file format. The original Unix object file format.
46Still used on SunOS, though not Solaris. Supports only three sections.
47
48@item archive
49A collection of object files produced and manipulated by the @samp{ar}
50program.
51
52@item BFD
53The BFD library itself. Also, each object file, archive, or executable
54opened by the BFD library has the type @samp{bfd *}, and is sometimes
55referred to as a bfd.
56
57@item COFF
58The Common Object File Format. Used on Unix SVR3. Used by some
59embedded targets, although ELF is normally better.
60
61@item DLL
62A shared library on Windows.
63
64@item dynamic linker
65When a program linked against a shared library is run, the dynamic
66linker will locate the appropriate shared library and arrange to somehow
67include it in the running image.
68
69@item dynamic object
70Another name for an ELF shared library.
71
72@item ECOFF
73The Extended Common Object File Format. Used on Alpha Digital Unix
74(formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF.
75
76@item ELF
77The Executable and Linking Format. The object file format used on most
78modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also
79used on many embedded systems.
80
81@item executable
82A program, with instructions and symbols, and perhaps dynamic linking
83information. Normally produced by a linker.
84
85@item NLM
86NetWare Loadable Module. Used to describe the format of an object which
87may be loaded into NetWare, which is some kind of PC based network server
88program.
89
90@item object file
91A binary file including machine instructions, symbols, and relocation
92information. Normally produced by an assembler.
93
94@item object file format
95The format of an object file. Typically object files and executables
96for a particular system are in the same format, although executables
97will not contain any relocation information.
98
99@item PE
100The Portable Executable format. This is the object file format used for
101Windows (specifically, Win32) object files. It is based closely on
102COFF, but has a few significant differences.
103
104@item PEI
105The Portable Executable Image format. This is the object file format
106used for Windows (specifically, Win32) executables. It is very similar
107to PE, but includes some additional header information.
108
109@item relocations
110Information used by the linker to adjust section contents. Also called
111relocs.
112
113@item section
114Object files and executables are composed of sections. Sections have
115optional data and optional relocation information.
116
117@item shared library
118A library of functions which may be used by many executables without
119actually being linked into each executable. There are several different
120implementations of shared libraries, each having slightly different
121features.
122
123@item symbol
124Each object file and executable may have a list of symbols, often
125referred to as the symbol table. A symbol is basically a name and an
126address. There may also be some additional information like the type of
127symbol, although the type of a symbol is normally something simple like
128function or object, and should not be confused with the more complex C
129notion of type. Typically every global function and variable in a C
130program will have an associated symbol.
131
132@item Win32
133The current Windows API, implemented by Windows 95 and later and Windows
134NT 3.51 and later, but not by Windows 3.1.
135
136@item XCOFF
137The eXtended Common Object File Format. Used on AIX. A variant of
138COFF, with a completely different symbol table implementation.
139@end table
140
141@node BFD guidelines
142@section BFD programming guidelines
143@cindex bfd programming guidelines
144@cindex programming guidelines for bfd
145@cindex guidelines, bfd programming
146
147There is a lot of poorly written and confusing code in BFD. New BFD
148code should be written to a higher standard. Merely because some BFD
149code is written in a particular manner does not mean that you should
150emulate it.
151
152Here are some general BFD programming guidelines:
153
154@itemize @bullet
155@item
156Follow the GNU coding standards.
157
158@item
159Avoid global variables. We ideally want BFD to be fully reentrant, so
160that it can be used in multiple threads. All uses of global or static
161variables interfere with that. Initialized constant variables are OK,
162and they should be explicitly marked with const. Instead of global
163variables, use data attached to a BFD or to a linker hash table.
164
165@item
166All externally visible functions should have names which start with
167@samp{bfd_}. All such functions should be declared in some header file,
168typically @file{bfd.h}. See, for example, the various declarations near
169the end of @file{bfd-in.h}, which mostly declare functions required by
170specific linker emulations.
171
172@item
173All functions which need to be visible from one file to another within
174BFD, but should not be visible outside of BFD, should start with
175@samp{_bfd_}. Although external names beginning with @samp{_} are
176prohibited by the ANSI standard, in practice this usage will always
177work, and it is required by the GNU coding standards.
178
179@item
180Always remember that people can compile using @samp{--enable-targets} to build
181several, or all, targets at once. It must be possible to link together
182the files for all targets.
183
184@item
185BFD code should compile with few or no warnings using @samp{gcc -Wall}.
186Some warnings are OK, like the absence of certain function declarations
187which may or may not be declared in system header files. Warnings about
188ambiguous expressions and the like should always be fixed.
189@end itemize
190
191@node BFD target vector
192@section BFD target vector
193@cindex bfd target vector
194@cindex target vector in bfd
195
196BFD supports multiple object file formats by using the @dfn{target
197vector}. This is simply a set of function pointers which implement
198behaviour that is specific to a particular object file format.
199
200In this section I list all of the entries in the target vector and
201describe what they do.
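
As a rough illustration of the idea, the following sketch models a target
vector as a structure holding a name and a function pointer, with one
instance per format. Everything here (@samp{toy_target},
@samp{toy_find_target}, and the two recognizer functions) is hypothetical
and far simpler than the real @samp{bfd_target} structure; it only shows
how format specific behaviour can be selected through a table of function
pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature "target vector": a name plus a function
   pointer implementing format specific behaviour.  This is not the
   real bfd_target structure.  */
struct toy_target
{
  const char *name;
  /* Return nonzero if the first bytes look like this format.  */
  int (*recognize) (const unsigned char *buf, size_t len);
};

static int
toy_elf_recognize (const unsigned char *buf, size_t len)
{
  /* ELF files start with the bytes 0x7f 'E' 'L' 'F'.  */
  return len >= 4 && memcmp (buf, "\177ELF", 4) == 0;
}

static int
toy_srec_recognize (const unsigned char *buf, size_t len)
{
  /* S-record lines start with 'S' followed by a digit.  */
  return len >= 2 && buf[0] == 'S' && buf[1] >= '0' && buf[1] <= '9';
}

static const struct toy_target toy_targets[] = {
  { "toy-elf", toy_elf_recognize },
  { "toy-srec", toy_srec_recognize },
};

/* Try each target in turn and return the first one whose recognizer
   accepts the data, or NULL if none does.  */
static const struct toy_target *
toy_find_target (const unsigned char *buf, size_t len)
{
  size_t i;

  assert (buf != NULL);
  for (i = 0; i < sizeof toy_targets / sizeof toy_targets[0]; i++)
    if (toy_targets[i].recognize (buf, len))
      return &toy_targets[i];
  return NULL;
}
```

Generic code never needs to know which format it is handling; it only
calls through whichever vector matched.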
202
203@menu
204* BFD target vector miscellaneous:: Miscellaneous constants
205* BFD target vector swap:: Swapping functions
206* BFD target vector format:: Format type dependent functions
207* BFD_JUMP_TABLE macros:: BFD_JUMP_TABLE macros
208* BFD target vector generic:: Generic functions
209* BFD target vector copy:: Copy functions
210* BFD target vector core:: Core file support functions
211* BFD target vector archive:: Archive functions
212* BFD target vector symbols:: Symbol table functions
213* BFD target vector relocs:: Relocation support
214* BFD target vector write:: Output functions
215* BFD target vector link:: Linker functions
216* BFD target vector dynamic:: Dynamic linking information functions
217@end menu
218
219@node BFD target vector miscellaneous
220@subsection Miscellaneous constants
221
222The target vector starts with a set of constants.
223
224@table @samp
225@item name
226The name of the target vector. This is an arbitrary string. This is
227how the target vector is named in command line options for tools which
228use BFD, such as the @samp{-oformat} linker option.
229
230@item flavour
231A general description of the type of target. The following flavours are
232currently defined:
233@table @samp
234@item bfd_target_unknown_flavour
235Undefined or unknown.
236@item bfd_target_aout_flavour
237a.out.
238@item bfd_target_coff_flavour
239COFF.
240@item bfd_target_ecoff_flavour
241ECOFF.
242@item bfd_target_elf_flavour
243ELF.
244@item bfd_target_ieee_flavour
245IEEE-695.
246@item bfd_target_nlm_flavour
247NLM.
248@item bfd_target_oasys_flavour
249OASYS.
250@item bfd_target_tekhex_flavour
251Tektronix hex format.
252@item bfd_target_srec_flavour
253Motorola S-record format.
254@item bfd_target_ihex_flavour
255Intel hex format.
256@item bfd_target_som_flavour
257SOM (used on HP-UX).
258@item bfd_target_os9k_flavour
259os9000.
260@item bfd_target_versados_flavour
261VERSAdos.
262@item bfd_target_msdos_flavour
263MS-DOS.
264@item bfd_target_evax_flavour
265OpenVMS.
266@end table
267
268@item byteorder
269The byte order of data in the object file. One of
270@samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or
271@samp{BFD_ENDIAN_UNKNOWN}. The last would be used for a format such
272as S-records which do not record the architecture of the data.
273
274@item header_byteorder
275The byte order of header information in the object file. Normally the
276same as the @samp{byteorder} field, but there are certain cases where it
277may be different.
278
279@item object_flags
280Flags which may appear in the @samp{flags} field of a BFD with this
281format.
282
283@item section_flags
284Flags which may appear in the @samp{flags} field of a section within a
285BFD with this format.
286
287@item symbol_leading_char
288A character which the C compiler normally puts before a symbol. For
289example, an a.out compiler will typically generate the symbol
290@samp{_foo} for a function named @samp{foo} in the C source, in which
291case this field would be @samp{_}. If there is no such character, this
292field will be @samp{0}.
293
294@item ar_pad_char
295The padding character to use at the end of an archive name. Normally
296@samp{/}.
297
298@item ar_max_namelen
299The maximum length of a short name in an archive. Normally @samp{14}.
300
301@item backend_data
302A pointer to constant backend data. This is used by backends to store
303whatever additional information they need to distinguish similar target
304vectors which use the same sets of functions.
305@end table
306
307@node BFD target vector swap
308@subsection Swapping functions
309
310Every target vector has function pointers used for swapping information
311in and out of the target representation. There are two sets of
312functions: one for data information, and one for header information.
313Each set has three sizes: 64-bit, 32-bit, and 16-bit. Each size has
314three actual functions: put, get unsigned, and get signed.
315
316These 18 functions are used to convert data between the host and target
317representations.
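
The following sketch shows what one size of one such set might look like:
a put, a get unsigned, and a get signed function for 32-bit values, in
both big endian and little endian versions, collected in a structure of
function pointers. The names (@samp{toy_swap}, @samp{toy_getb32}, and so
on) are invented for this illustration and are not the names used in BFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit swapping functions.  Each reads or writes a
   value byte by byte, so the result does not depend on the byte
   order of the host.  */
static uint32_t
toy_getb32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

static uint32_t
toy_getl32 (const unsigned char *p)
{
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}

/* Sign extend a 32-bit value without relying on implementation
   defined conversions: flip the sign bit, then subtract it.  */
static int32_t
toy_sign_extend_32 (uint32_t v)
{
  return (int32_t) (((int64_t) v ^ INT64_C (0x80000000))
                    - INT64_C (0x80000000));
}

static int32_t
toy_getb_signed_32 (const unsigned char *p)
{
  return toy_sign_extend_32 (toy_getb32 (p));
}

static int32_t
toy_getl_signed_32 (const unsigned char *p)
{
  return toy_sign_extend_32 (toy_getl32 (p));
}

static void
toy_putb32 (uint32_t v, unsigned char *p)
{
  assert (p != NULL);
  p[0] = (v >> 24) & 0xff;
  p[1] = (v >> 16) & 0xff;
  p[2] = (v >> 8) & 0xff;
  p[3] = v & 0xff;
}

static void
toy_putl32 (uint32_t v, unsigned char *p)
{
  assert (p != NULL);
  p[3] = (v >> 24) & 0xff;
  p[2] = (v >> 16) & 0xff;
  p[1] = (v >> 8) & 0xff;
  p[0] = v & 0xff;
}

/* One size of one set: put, get unsigned, and get signed.  */
struct toy_swap
{
  uint32_t (*get32) (const unsigned char *);
  int32_t (*get_signed_32) (const unsigned char *);
  void (*put32) (uint32_t, unsigned char *);
};

static const struct toy_swap toy_big_swap =
  { toy_getb32, toy_getb_signed_32, toy_putb32 };
static const struct toy_swap toy_little_swap =
  { toy_getl32, toy_getl_signed_32, toy_putl32 };
```

A target would point at @samp{toy_big_swap} or @samp{toy_little_swap}
according to its byte order, and generic code would call through the
structure without testing the byte order itself.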
318
319@node BFD target vector format
320@subsection Format type dependent functions
321
322Every target vector has three arrays of function pointers which are
323indexed by the BFD format type. The BFD format types are as follows:
324@table @samp
325@item bfd_unknown
326Unknown format. Not used for anything useful.
327@item bfd_object
328Object file.
329@item bfd_archive
330Archive file.
331@item bfd_core
332Core file.
333@end table
334
335The three arrays of function pointers are as follows:
336@table @samp
337@item bfd_check_format
338Check whether the BFD is of a particular format (object file, archive
339file, or core file) corresponding to this target vector. This is called
340by the @samp{bfd_check_format} function when examining an existing BFD.
341If the BFD matches the desired format, this function will initialize any
342format specific information such as the @samp{tdata} field of the BFD.
343This function must be called before any other BFD target vector function
344on a file opened for reading.
345
346@item bfd_set_format
347Set the format of a BFD which was created for output. This is called by
348the @samp{bfd_set_format} function after creating the BFD with a
349function such as @samp{bfd_openw}. This function will initialize format
350specific information required to write out an object file or other file of
351the given format. This function must be called before any other BFD
352target vector function on a file opened for writing.
353
354@item bfd_write_contents
355Write out the contents of the BFD in the given format. This is called
356by the @samp{bfd_close} function for a BFD opened for writing. This really
357should not be an array selected by format type, as the
358@samp{bfd_set_format} function provides all the required information.
359In fact, BFD will fail if a different format is used when calling
360through the @samp{bfd_set_format} and the @samp{bfd_write_contents}
361arrays; fortunately, since @samp{bfd_close} gets it right, this is a
362difficult error to make.
363@end table
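
The indexing scheme can be sketched as follows. This is a simplified
model under stated assumptions, not BFD code: the @samp{toy_format}
enumeration mirrors the format types listed above, while
@samp{toy_file}, its @samp{magic} field, and the @samp{OBJ} magic string
are invented for the example. Only the @samp{!<arch>} archive magic
string is taken from the real archive format.

```c
#include <assert.h>
#include <string.h>

/* Format types, in the same order as the list above.  */
enum toy_format
{
  toy_unknown = 0,
  toy_object,
  toy_archive,
  toy_core,
  toy_type_end
};

/* Hypothetical stand-in for a BFD opened for reading.  */
struct toy_file
{
  const char *magic;   /* First bytes of the file.  */
  int initialized;     /* Set once format specific data is set up.  */
};

typedef int (*toy_check_fn) (struct toy_file *);

/* Used for formats this target never matches.  */
static int
toy_no_match (struct toy_file *f)
{
  (void) f;
  return 0;
}

static int
toy_object_p (struct toy_file *f)
{
  if (strncmp (f->magic, "OBJ", 3) != 0)
    return 0;
  f->initialized = 1;
  return 1;
}

static int
toy_archive_p (struct toy_file *f)
{
  if (strncmp (f->magic, "!<arch>\n", 8) != 0)
    return 0;
  f->initialized = 1;
  return 1;
}

/* One function per format type, indexed by enum toy_format.  */
static const toy_check_fn toy_check_format[toy_type_end] = {
  toy_no_match,   /* toy_unknown */
  toy_object_p,   /* toy_object */
  toy_archive_p,  /* toy_archive */
  toy_no_match    /* toy_core: no core file support here */
};

/* Generic entry point: dispatch on the requested format.  */
static int
toy_bfd_check_format (struct toy_file *f, enum toy_format format)
{
  assert ((unsigned) format < (unsigned) toy_type_end);
  return toy_check_format[format] (f);
}
```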
364
365@node BFD_JUMP_TABLE macros
366@subsection @samp{BFD_JUMP_TABLE} macros
367@cindex @samp{BFD_JUMP_TABLE}
368
369Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros.
370These macros take a single argument, which is a prefix applied to a set
371of functions. The macros are then used to initialize the fields in the
372target vector.
373
374For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three
375functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc},
376and @samp{_bfd_reloc_type_lookup}. A reference like
377@samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions
378prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc. The
379@samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three
380functions initialize the appropriate fields in the BFD target vector.
381
382This is done because it turns out that many different target vectors can
383share certain classes of functions. For example, archives are similar
384on most platforms, so most target vectors can use the same archive
385functions. Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE}
386with the same argument, calling a set of functions which is defined in
387@file{archive.c}.
388
389Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with
390the description of the function pointers which it defines. The function
391pointers will be described using the name without the prefix which the
392@samp{BFD_JUMP_TABLE} macro defines. This name is normally the same as
393the name of the field in the target vector structure. Any differences
394will be noted.
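
The mechanism can be sketched with a small analogue of
@samp{BFD_JUMP_TABLE_RELOCS}. The macro @samp{TOY_JUMP_TABLE_RELOCS},
the structure @samp{toy_vector}, and the three @samp{foo_} functions
below are assumptions made for the illustration; the real macro and the
real function signatures differ. The sketch uses the C token pasting
operator to attach the prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice of a target vector holding the three
   relocation entries.  */
struct toy_vector
{
  long (*get_reloc_upper_bound) (long count);
  long (*canonicalize_reloc) (long count);
  const char *(*reloc_type_lookup) (int code);
};

/* Paste PREFIX onto each function name, producing the initializers
   for the three fields in order.  */
#define TOY_JUMP_TABLE_RELOCS(PREFIX) \
  PREFIX##_get_reloc_upper_bound, \
  PREFIX##_canonicalize_reloc, \
  PREFIX##_bfd_reloc_type_lookup

/* Placeholder implementations for a backend named foo.  */
static long
foo_get_reloc_upper_bound (long count)
{
  assert (count >= 0);
  /* Room for COUNT pointers plus a trailing NULL.  */
  return (count + 1) * (long) sizeof (void *);
}

static long
foo_canonicalize_reloc (long count)
{
  return count;
}

static const char *
foo_bfd_reloc_type_lookup (int code)
{
  return code == 0 ? "FOO_NONE" : NULL;
}

/* The macro expands to the three foo_ functions.  */
static const struct toy_vector foo_vec = {
  TOY_JUMP_TABLE_RELOCS (foo)
};
```

A second backend that wanted the same relocation functions would write
@samp{TOY_JUMP_TABLE_RELOCS (foo)} in its own vector, which is how
sharing between target vectors falls out of the scheme.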
395
396@node BFD target vector generic
397@subsection Generic functions
398@cindex @samp{BFD_JUMP_TABLE_GENERIC}
399
400The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch all
401functions which don't easily fit into other categories.
402
403@table @samp
404@item _close_and_cleanup
405Free any target specific information associated with the BFD. This is
406called when any BFD is closed (the @samp{bfd_write_contents} function
407mentioned earlier is only called for a BFD opened for writing). Most
408targets use @samp{bfd_alloc} to allocate all target specific
409information, and therefore don't have to do anything in this function.
410This function pointer is typically set to
411@samp{_bfd_generic_close_and_cleanup}, which simply returns true.
412
413@item _bfd_free_cached_info
414Free any cached information associated with the BFD which can be
415recreated later if necessary. This is used to reduce the memory
416consumption required by programs using BFD. This is normally called via
417the @samp{bfd_free_cached_info} macro. It is used by the default
418archive routines when computing the archive map. Most targets do not
419do anything special for this entry point, and just set it to
420@samp{_bfd_generic_free_cached_info}, which simply returns true.
421
422@item _new_section_hook
423This is called from @samp{bfd_make_section_anyway} whenever a new
424section is created. Most targets use it to initialize section specific
425information. This function is called whether or not the section
426corresponds to an actual section in an actual BFD.
427
428@item _get_section_contents
429Get the contents of a section. This is called from
430@samp{bfd_get_section_contents}. Most targets set this to
431@samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
432based on the section's @samp{filepos} field and a @samp{bfd_read}. The
433corresponding field in the target vector is named
434@samp{_bfd_get_section_contents}.
435
436@item _get_section_contents_in_window
437Set a @samp{bfd_window} to hold the contents of a section. This is
438called from @samp{bfd_get_section_contents_in_window}. The
439@samp{bfd_window} idea never really caught in, and I don't think this is
440ever called. Pretty much all targets implement this as
441@samp{bfd_generic_get_section_contents_in_window}, which uses
442@samp{bfd_get_section_contents} to do the right thing. The
443corresponding field in the target vector is named
444@samp{_bfd_get_section_contents_in_window}.
445@end table
446
447@node BFD target vector copy
448@subsection Copy functions
449@cindex @samp{BFD_JUMP_TABLE_COPY}
450
451The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are
452called when copying BFDs, and for a couple of functions which deal with
453internal BFD information.
454
455@table @samp
456@item _bfd_copy_private_bfd_data
457This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}.
458If the input and output BFDs have the same format, this will copy any
459private information over. This is called after all the section contents
460have been written to the output file. Only a few targets do anything in
461this function.
462
463@item _bfd_merge_private_bfd_data
464This is called when linking, via @samp{bfd_merge_private_bfd_data}. It
465gives the backend linker code a chance to set any special flags in the
466output file based on the contents of the input file. Only a few targets
467do anything in this function.
468
469@item _bfd_copy_private_section_data
470This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called
471for each section, via @samp{bfd_copy_private_section_data}. This
472function is called before any section contents have been written. Only
473a few targets do anything in this function.
474
475@item _bfd_copy_private_symbol_data
476This is called via @samp{bfd_copy_private_symbol_data}, but I don't
477think anything actually calls it. If it were defined, it could be used
478to copy private symbol data from one BFD to another. However, most BFDs
479store extra symbol information by allocating space which is larger than
480the @samp{asymbol} structure and storing private information in the
481extra space. Since @samp{objcopy} and other programs copy symbol
482information by copying pointers to @samp{asymbol} structures, the
483private symbol information is automatically copied as well. Most
484targets do not do anything in this function.
485
486@item _bfd_set_private_flags
487This is called via @samp{bfd_set_private_flags}. It is basically a hook
488for the assembler to set magic information. For example, the PowerPC
489ELF assembler uses it to set flags which appear in the e_flags field of
490the ELF header. Most targets do not do anything in this function.
491
492@item _bfd_print_private_bfd_data
493This is called by @samp{objdump} when the @samp{-p} option is used. It
494is called via @samp{bfd_print_private_data}. It prints any interesting
495information about the BFD which can not be otherwise represented by BFD
496and thus can not be printed by @samp{objdump}. Most targets do not do
497anything in this function.
498@end table
499
500@node BFD target vector core
501@subsection Core file support functions
502@cindex @samp{BFD_JUMP_TABLE_CORE}
503
504The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal
505with core files. Obviously, these functions only do something
506interesting for targets which have core file support.
507
508@table @samp
509@item _core_file_failing_command
510Given a core file, this returns the command which was run to produce the
511core file.
512
513@item _core_file_failing_signal
514Given a core file, this returns the signal number which produced the
515core file.
516
517@item _core_file_matches_executable_p
518Given a core file and a BFD for an executable, this returns whether the
519core file was generated by the executable.
520@end table
521
522@node BFD target vector archive
523@subsection Archive functions
524@cindex @samp{BFD_JUMP_TABLE_ARCHIVE}
525
526The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal
527with archive files. Most targets use COFF style archive files
528(including ELF targets), and these use @samp{_bfd_archive_coff} as the
529argument to @samp{BFD_JUMP_TABLE_ARCHIVE}. Some targets use BSD/a.out
530style archives, and these use @samp{_bfd_archive_bsd}. (The main
531difference between BSD and COFF archives is the format of the archive
532symbol table). Targets with no archive support use
533@samp{_bfd_noarchive}. Finally, a few targets have unusual archive
534handling.
535
536@table @samp
537@item _slurp_armap
538Read in the archive symbol table, storing it in private BFD data. This
539is normally called from the archive @samp{check_format} routine. The
540corresponding field in the target vector is named
541@samp{_bfd_slurp_armap}.
542
543@item _slurp_extended_name_table
544Read in the extended name table from the archive, if there is one,
545storing it in private BFD data. This is normally called from the
546archive @samp{check_format} routine. The corresponding field in the
547target vector is named @samp{_bfd_slurp_extended_name_table}.
548
549@item _construct_extended_name_table
550Build and return an extended name table if one is needed to write out
551the archive. This also adjusts the archive headers to refer to the
552extended name table appropriately. This is normally called from the
553archive @samp{write_contents} routine. The corresponding field in the
554target vector is named @samp{_bfd_construct_extended_name_table}.
555
556@item _truncate_arname
557This copies a file name into an archive header, truncating it as
558required. It is normally called from the archive @samp{write_contents}
559routine. This function is more interesting in targets which do not
560support extended name tables, but I think the GNU @samp{ar} program
561always uses extended name tables anyhow. The corresponding field in the
562target vector is named @samp{_bfd_truncate_arname}.
563
564@item _write_armap
565Write out the archive symbol table using calls to @samp{bfd_write}.
566This is normally called from the archive @samp{write_contents} routine.
567The corresponding field in the target vector is named @samp{write_armap}
568(no leading underscore).
569
570@item _read_ar_hdr
571Read and parse an archive header. This handles expanding the archive
572header name into the real file name using the extended name table. This
573is called by routines which read the archive symbol table or the archive
574itself. The corresponding field in the target vector is named
575@samp{_bfd_read_ar_hdr_fn}.
576
577@item _openr_next_archived_file
578Given an archive and a BFD representing a file stored within the
579archive, return a BFD for the next file in the archive. This is called
580via @samp{bfd_openr_next_archived_file}. The corresponding field in the
581target vector is named @samp{openr_next_archived_file} (no leading
582underscore).
583
584@item _get_elt_at_index
585Given an archive and an index, return a BFD for the file in the archive
586corresponding to that entry in the archive symbol table. This is called
587via @samp{bfd_get_elt_at_index}. The corresponding field in the target
588vector is named @samp{_bfd_get_elt_at_index}.
589
590@item _generic_stat_arch_elt
591Do a stat on an element of an archive, returning information read from
592the archive header (modification time, uid, gid, file mode, size). This
593is called via @samp{bfd_stat_arch_elt}. The corresponding field in the
594target vector is named @samp{_bfd_stat_arch_elt}.
595
596@item _update_armap_timestamp
597After the entire contents of an archive have been written out, update
598the timestamp of the archive symbol table to be newer than that of the
599file. This is required for a.out style archives. This is normally
600called by the archive @samp{write_contents} routine. The corresponding
601field in the target vector is named @samp{_bfd_update_armap_timestamp}.
602@end table
603
604@node BFD target vector symbols
605@subsection Symbol table functions
606@cindex @samp{BFD_JUMP_TABLE_SYMBOLS}
607
608The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal
609with symbols.
610
611@table @samp
612@item _get_symtab_upper_bound
613Return a sensible upper bound on the amount of memory which will be
614required to read the symbol table. In practice most targets return the
615amount of memory required to hold @samp{asymbol} pointers for all the
616symbols plus a trailing @samp{NULL} entry, and store the actual symbol
617information in BFD private data. This is called via
618@samp{bfd_get_symtab_upper_bound}. The corresponding field in the
619target vector is named @samp{_bfd_get_symtab_upper_bound}.
620
621@item _get_symtab
622Read in the symbol table. This is called via
623@samp{bfd_canonicalize_symtab}. The corresponding field in the target
624vector is named @samp{_bfd_canonicalize_symtab}.
625
626@item _make_empty_symbol
627Create an empty symbol for the BFD. This is needed because most targets
628store extra information with each symbol by allocating a structure
629larger than an @samp{asymbol} and storing the extra information at the
630end. This function will allocate the right amount of memory, and return
631what looks like a pointer to an empty @samp{asymbol}. This is called
632via @samp{bfd_make_empty_symbol}. The corresponding field in the target
633vector is named @samp{_bfd_make_empty_symbol}.
634
635@item _print_symbol
636Print information about the symbol. This is called via
637@samp{bfd_print_symbol}. One of the arguments indicates what sort of
638information should be printed:
639@table @samp
640@item bfd_print_symbol_name
641Just print the symbol name.
642@item bfd_print_symbol_more
643Print the symbol name and some interesting flags. I don't think
644anything actually uses this.
645@item bfd_print_symbol_all
646Print all information about the symbol. This is used by @samp{objdump}
647when run with the @samp{-t} option.
648@end table
649The corresponding field in the target vector is named
650@samp{_bfd_print_symbol}.
651
652@item _get_symbol_info
653Return a standard set of information about the symbol. This is called
654via @samp{bfd_symbol_info}. The corresponding field in the target
655vector is named @samp{_bfd_get_symbol_info}.
656
657@item _bfd_is_local_label_name
658Return whether the given string would normally represent the name of a
659local label. This is called via @samp{bfd_is_local_label} and
660@samp{bfd_is_local_label_name}. Local labels are normally discarded by
661the assembler. In the linker, this defines the difference between the
662@samp{-x} and @samp{-X} options.
663
664@item _get_lineno
665Return line number information for a symbol. This is only meaningful
666for a COFF target. This is called when writing out COFF line numbers.
667
668@item _find_nearest_line
669Given an address within a section, use the debugging information to find
670the matching file name, function name, and line number, if any. This is
671called via @samp{bfd_find_nearest_line}. The corresponding field in the
672target vector is named @samp{_bfd_find_nearest_line}.
673
674@item _bfd_make_debug_symbol
675Make a debugging symbol. This is only meaningful for a COFF target,
676where it simply returns a symbol which will be placed in the
677@samp{N_DEBUG} section when it is written out. This is called via
678@samp{bfd_make_debug_symbol}.
679
680@item _read_minisymbols
681Minisymbols are used to reduce the memory requirements of programs like
682@samp{nm}. A minisymbol is a cookie pointing to internal symbol
683information which the caller can use to extract complete symbol
684information. This permits BFD to not convert all the symbols into
685generic form, but to instead convert them one at a time. This is called
686via @samp{bfd_read_minisymbols}. Most targets do not implement this,
687and just use generic support which is based on using standard
688@samp{asymbol} structures.
689
690@item _minisymbol_to_symbol
691Convert a minisymbol to a standard @samp{asymbol}. This is called via
692@samp{bfd_minisymbol_to_symbol}.
693@end table
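
Two of the conventions described in this table can be sketched together:
the upper bound that leaves room for a trailing @samp{NULL} pointer, and
the backend symbol structure that is larger than the generic one. The
types @samp{toy_symbol} and @samp{toy_coff_symbol} and all three
functions are hypothetical stand-ins, assuming only that the generic
symbol is the first member of the backend structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical generic symbol, standing in for asymbol.  */
struct toy_symbol
{
  const char *name;
  unsigned long value;
};

/* Hypothetical backend symbol.  The generic part comes first, so a
   pointer to it is also a pointer to the whole structure.  */
struct toy_coff_symbol
{
  struct toy_symbol symbol;
  int line_count;        /* Extra, backend private information.  */
};

/* Memory needed for SYMCOUNT symbol pointers plus a trailing NULL.  */
static long
toy_get_symtab_upper_bound (long symcount)
{
  assert (symcount >= 0);
  return (symcount + 1) * (long) sizeof (struct toy_symbol *);
}

/* Allocate the larger backend structure, but hand back what looks
   like a pointer to an empty generic symbol.  */
static struct toy_symbol *
toy_make_empty_symbol (void)
{
  struct toy_coff_symbol *s = calloc (1, sizeof *s);

  return s == NULL ? NULL : &s->symbol;
}

/* Fill LOCATION with pointers to the generic part of each symbol,
   add the trailing NULL, and return the number of symbols.  */
static long
toy_canonicalize_symtab (struct toy_coff_symbol *syms, long symcount,
                         struct toy_symbol **location)
{
  long i;

  for (i = 0; i < symcount; i++)
    location[i] = &syms[i].symbol;
  location[symcount] = NULL;
  return symcount;
}
```

Because callers only copy @samp{toy_symbol} pointers around, the private
@samp{line_count} field travels with each symbol without the caller
knowing it exists.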
694
695@node BFD target vector relocs
696@subsection Relocation support
697@cindex @samp{BFD_JUMP_TABLE_RELOCS}
698
699The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal
700with relocations.
701
702@table @samp
703@item _get_reloc_upper_bound
704Return a sensible upper bound on the amount of memory which will be
705required to read the relocations for a section. In practice most
706targets return the amount of memory required to hold @samp{arelent}
707pointers for all the relocations plus a trailing @samp{NULL} entry, and
708store the actual relocation information in BFD private data. This is
709called via @samp{bfd_get_reloc_upper_bound}.
710
711@item _canonicalize_reloc
712Return the relocation information for a section. This is called via
713@samp{bfd_canonicalize_reloc}. The corresponding field in the target
714vector is named @samp{_bfd_canonicalize_reloc}.
715
716@item _bfd_reloc_type_lookup
717Given a relocation code, return the corresponding howto structure
718(@pxref{BFD relocation codes}). This is called via
719@samp{bfd_reloc_type_lookup}. The corresponding field in the target
720vector is named @samp{reloc_type_lookup}.
721@end table
722
723@node BFD target vector write
724@subsection Output functions
725@cindex @samp{BFD_JUMP_TABLE_WRITE}
726
727The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal
728with writing out a BFD.
729
730@table @samp
731@item _set_arch_mach
732Set the architecture and machine number for a BFD. This is called via
733@samp{bfd_set_arch_mach}. Most targets implement this by calling
734@samp{bfd_default_set_arch_mach}. The corresponding field in the target
735vector is named @samp{_bfd_set_arch_mach}.
736
737@item _set_section_contents
738Write out the contents of a section. This is called via
739@samp{bfd_set_section_contents}. The corresponding field in the target
740vector is named @samp{_bfd_set_section_contents}.
741@end table
742
743@node BFD target vector link
744@subsection Linker functions
745@cindex @samp{BFD_JUMP_TABLE_LINK}
746
747The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the
748linker.
749
750@table @samp
751@item _sizeof_headers
752Return the size of the header information required for a BFD. This is
753used to implement the @samp{SIZEOF_HEADERS} linker script function. It
754is normally used to align the first section at an efficient position on
755the page. This is called via @samp{bfd_sizeof_headers}. The
756corresponding field in the target vector is named
757@samp{_bfd_sizeof_headers}.
758
759@item _bfd_get_relocated_section_contents
760Read the contents of a section and apply the relocation information.
761This handles both a final link and a relocateable link; in the latter
762case, it adjusts the relocation information as well. This is called via
763@samp{bfd_get_relocated_section_contents}. Most targets implement it by
764calling @samp{bfd_generic_get_relocated_section_contents}.
765
766@item _bfd_relax_section
767Try to use relaxation to shrink the size of a section. This is called
768by the linker when the @samp{-relax} option is used. This is called via
769@samp{bfd_relax_section}. Most targets do not support any sort of
770relaxation.
771
772@item _bfd_link_hash_table_create
773Create the symbol hash table to use for the linker. This linker hook
774permits the backend to control the size and information of the elements
775in the linker symbol hash table. This is called via
776@samp{bfd_link_hash_table_create}.
777
778@item _bfd_link_add_symbols
779Given an object file or an archive, add all symbols into the linker
780symbol hash table. Use callbacks to the linker to include archive
781elements in the link. This is called via @samp{bfd_link_add_symbols}.
782
783@item _bfd_final_link
784Finish the linking process. The linker calls this hook after all of the
785input files have been read, when it is ready to finish the link and
786generate the output file. This is called via @samp{bfd_final_link}.
787
788@item _bfd_link_split_section
789I don't know what this is for. Nothing seems to call it. The only
790non-trivial definition is in @file{som.c}.
791@end table
792
793@node BFD target vector dynamic
794@subsection Dynamic linking information functions
795@cindex @samp{BFD_JUMP_TABLE_DYNAMIC}
796
797The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read
798dynamic linking information.
799
800@table @samp
801@item _get_dynamic_symtab_upper_bound
802Return a sensible upper bound on the amount of memory which will be
803required to read the dynamic symbol table. In practice most targets
804return the amount of memory required to hold @samp{asymbol} pointers for
805all the symbols plus a trailing @samp{NULL} entry, and store the actual
806symbol information in BFD private data. This is called via
807@samp{bfd_get_dynamic_symtab_upper_bound}. The corresponding field in
808the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}.
809
810@item _canonicalize_dynamic_symtab
811Read the dynamic symbol table. This is called via
812@samp{bfd_canonicalize_dynamic_symtab}. The corresponding field in the
813target vector is named @samp{_bfd_canonicalize_dynamic_symtab}.
814
815@item _get_dynamic_reloc_upper_bound
816Return a sensible upper bound on the amount of memory which will be
817required to read the dynamic relocations. In practice most targets
818return the amount of memory required to hold @samp{arelent} pointers for
819all the relocations plus a trailing @samp{NULL} entry, and store the
820actual relocation information in BFD private data. This is called via
821@samp{bfd_get_dynamic_reloc_upper_bound}. The corresponding field in
822the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}.
823
824@item _canonicalize_dynamic_reloc
825Read the dynamic relocations. This is called via
826@samp{bfd_canonicalize_dynamic_reloc}. The corresponding field in the
827target vector is named @samp{_bfd_canonicalize_dynamic_reloc}.
828@end table
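
The upper bound convention described above, space for one pointer per entry plus a trailing @samp{NULL}, can be sketched as follows. The names @samp{struct example_symbol} and @samp{example_symtab_upper_bound} are hypothetical, and returning @samp{-1} for a bad count is an assumption of this sketch rather than a statement about any particular backend.

```c
#include <stddef.h>

/* Hypothetical stand-in for asymbol; only the pointer size matters.  */
struct example_symbol;

/* Return the number of bytes needed for an array of SYMCOUNT symbol
   pointers plus the trailing NULL entry, or -1 if SYMCOUNT is
   negative.  */
static long
example_symtab_upper_bound (long symcount)
{
  if (symcount < 0)
    return -1;
  return (symcount + 1) * (long) sizeof (struct example_symbol *);
}
```

A caller would allocate that many bytes and then ask the matching canonicalize routine to fill in the pointers. The same arithmetic applies to relocations, with @samp{arelent} pointers in place of symbol pointers.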
829
c91a48dd
ILT
830@node BFD generated files
831@section BFD generated files
832@cindex generated files in bfd
833@cindex bfd generated files
834
835BFD contains several automatically generated files. This section
836describes them. Some files are created at configure time, when you
837configure BFD. Some files are created at make time, when you build
838BFD. Some files are automatically rebuilt at make time, but only if
839you configure with the @samp{--enable-maintainer-mode} option. Some
840files live in the object directory---the directory from which you run
841configure---and some live in the source directory. All files that live
842in the source directory are checked into the CVS repository.
843
844@table @file
845@item bfd.h
846@cindex @file{bfd.h}
847@cindex @file{bfd-in3.h}
848Lives in the object directory. Created at make time from
849@file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at
850configure time from @file{bfd-in2.h}. There are automatic dependencies
851to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
852changes, so you can normally ignore @file{bfd-in3.h}, and just think
853about @file{bfd-in2.h} and @file{bfd.h}.
854
855@file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
856To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly
857control whether BFD is built for a 32 bit target or a 64 bit target.
858
859@item bfd-in2.h
860@cindex @file{bfd-in2.h}
861Lives in the source directory. Created from @file{bfd-in.h} and several
862other BFD source files. If you configure with the
863@samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
864automatically when a source file changes.
865
866@item elf32-target.h
867@itemx elf64-target.h
868@cindex @file{elf32-target.h}
869@cindex @file{elf64-target.h}
870Live in the object directory. Created from @file{elfxx-target.h}.
871These files are versions of @file{elfxx-target.h} customized for either
872a 32 bit ELF target or a 64 bit ELF target.
873
874@item libbfd.h
875@cindex @file{libbfd.h}
876Lives in the source directory. Created from @file{libbfd-in.h} and
877several other BFD source files. If you configure with the
878@samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
879automatically when a source file changes.
880
881@item libcoff.h
882@cindex @file{libcoff.h}
883Lives in the source directory. Created from @file{libcoff-in.h} and
884@file{coffcode.h}. If you configure with the
885@samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
886automatically when a source file changes.
887
888@item targmatch.h
889@cindex @file{targmatch.h}
890Lives in the object directory. Created at make time from
891@file{config.bfd}. This file is used to map configuration triplets into
892BFD target vector variable names at run time.
893@end table
894
895@node BFD multiple compilations
896@section Files compiled multiple times in BFD
897Several files in BFD are compiled multiple times. By this I mean that
898there are header files which contain function definitions. These header
899files are included by other files, and thus the functions are compiled
900once per file which includes them.
901
902Preprocessor macros are used to control the compilation, so that each
903time the files are compiled the resulting functions are slightly
904different. Naturally, if they weren't different, there would be no
905reason to compile them multiple times.
906
907This is not a particularly good programming technique, and future BFD
908work should avoid it.
909
910@itemize @bullet
911@item
912Since this technique is rarely used, even experienced C programmers find
913it confusing.
914
915@item
916It is difficult to debug programs which use BFD, since there is no way
917to describe which version of a particular function you are looking at.
918
919@item
920Programs which use BFD wind up incorporating two or more slightly
921different versions of the same function, which wastes space in the
922executable.
923
924@item
925This technique is never required nor is it especially efficient. It is
926always possible to use statically initialized structures holding
927function pointers and magic constants instead.
928@end itemize
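
The alternative mentioned in the last item, a statically initialized structure holding function pointers and magic constants, might look like the following sketch. Everything here is hypothetical (@samp{struct example_format}, the two format descriptions, and the text address rules); it is meant only to show one shared routine, compiled once, whose behaviour varies through a descriptor.

```c
/* Hypothetical per-format description: a magic constant plus a hook.  */
struct example_format
{
  unsigned int word_bytes;      /* 4 for a 32 bit format, 8 for 64 bit */
  unsigned long (*text_address) (unsigned long header_size);
};

/* One format places text immediately after the header.  */
static unsigned long
text_after_header (unsigned long header_size)
{
  return header_size;
}

/* Another format always places text at a fixed page boundary.  */
static unsigned long
text_at_page (unsigned long header_size)
{
  (void) header_size;
  return 0x1000;
}

static const struct example_format format_a = { 4, text_after_header };
static const struct example_format format_b = { 8, text_at_page };

/* The shared routine exists once in the executable.  The differences
   that preprocessor macros would otherwise select come from FMT.  */
static unsigned long
example_text_end (const struct example_format *fmt,
                  unsigned long header_size, unsigned long nwords)
{
  return fmt->text_address (header_size) + nwords * fmt->word_bytes;
}
```

With this arrangement a debugger sees a single @samp{example_text_end}, and the variant in use is identified by which descriptor was passed in.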
929
accf488e 930The following is a list of the files which are compiled multiple times.
c91a48dd
ILT
931
932@table @file
933@item aout-target.h
934@cindex @file{aout-target.h}
935Describes a few functions and the target vector for a.out targets. This
936is used by individual a.out targets with different definitions of
937@samp{N_TXTADDR} and similar a.out macros.
938
939@item aoutf1.h
940@cindex @file{aoutf1.h}
941Implements standard SunOS a.out files. In principle it supports 64 bit
942a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
943since all known a.out targets are 32 bits, this code may or may not
944work. This file is only included by a few other files, and it is
945difficult to justify its existence.
946
947@item aoutx.h
948@cindex @file{aoutx.h}
949Implements basic a.out support routines. This file can be compiled for
950either 32 or 64 bit support. Since all known a.out targets are 32 bits,
951the 64 bit support may or may not work. I believe the original
952intention was that this file would only be included by @samp{aout32.c}
953and @samp{aout64.c}, and that other a.out targets would simply refer to
954the functions it defined. Unfortunately, some other a.out targets
955started including it directly, leading to a somewhat confused state of
956affairs.
957
958@item coffcode.h
959@cindex @file{coffcode.h}
960Implements basic COFF support routines. This file is included by every
961COFF target. It implements code which handles COFF magic numbers as
962well as various hook functions called by the generic COFF functions in
963@file{coffgen.c}. This file is controlled by a number of different
964macros, and more are added regularly.
965
966@item coffswap.h
967@cindex @file{coffswap.h}
968Implements COFF swapping routines. This file is included by
969@file{coffcode.h}, and thus by every COFF target. It implements the
970routines which swap COFF structures between internal and external
971format. The main control for this file is the external structure
972definitions in the files in the @file{include/coff} directory. A COFF
973target file will include one of those files before including
974@file{coffcode.h} and thus @file{coffswap.h}. There are a few other
975macros which affect @file{coffswap.h} as well, mostly describing whether
976certain fields are present in the external structures.
977
978@item ecoffswap.h
979@cindex @file{ecoffswap.h}
980Implements ECOFF swapping routines. This is like @file{coffswap.h}, but
981for ECOFF. It is included by the ECOFF target files (of which there are
982only two). The control is the preprocessor macro @samp{ECOFF_32} or
983@samp{ECOFF_64}.
984
985@item elfcode.h
986@cindex @file{elfcode.h}
987Implements ELF functions that use external structure definitions. This
988file is included by two other files: @file{elf32.c} and @file{elf64.c}.
989It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
990@samp{32} or @samp{64} before including it. The @samp{NAME} macro is
991used internally to give the functions different names for the two target
992sizes.
993
994@item elfcore.h
995@cindex @file{elfcore.h}
996Like @file{elfcode.h}, but for functions that are specific to ELF core
997files. This is included only by @file{elfcode.h}.
998
999@item elflink.h
1000@cindex @file{elflink.h}
1001Like @file{elfcode.h}, but for functions used by the ELF linker. This
1002is included only by @file{elfcode.h}.
1003
1004@item elfxx-target.h
1005@cindex @file{elfxx-target.h}
1006This file is the source for the generated files @file{elf32-target.h}
1007and @file{elf64-target.h}, one of which is included by every ELF target.
1008It defines the ELF target vector.
1009
1010@item freebsd.h
1011@cindex @file{freebsd.h}
1012Presumably intended to be included by all FreeBSD targets, but in fact
1013there is only one such target, @samp{i386-freebsd}. This defines a
1014function used to set the right magic number for FreeBSD, as well as
1015various macros, and includes @file{aout-target.h}.
1016
1017@item netbsd.h
1018@cindex @file{netbsd.h}
1019Like @file{freebsd.h}, except that there are several files which include
1020it.
1021
1022@item nlm-target.h
1023@cindex @file{nlm-target.h}
1024Defines the target vector for a standard NLM target.
1025
1026@item nlmcode.h
1027@cindex @file{nlmcode.h}
1028Like @file{elfcode.h}, but for NLM targets. This is only included by
1029@file{nlm32.c} and @file{nlm64.c}, both of which define the macro
1030@samp{ARCH_SIZE} to an appropriate value. There are no 64 bit NLM
1031targets anyhow, so this is sort of useless.
1032
1033@item nlmswap.h
1034@cindex @file{nlmswap.h}
1035Like @file{coffswap.h}, but for NLM targets. This is included by each
1036NLM target, but I think it winds up compiling to the exact same code for
1037every target, and as such is fairly useless.
1038
1039@item peicode.h
1040@cindex @file{peicode.h}
1041Provides swapping routines and other hooks for PE targets.
1042@file{coffcode.h} will include this rather than @file{coffswap.h} for a
1043PE target. This defines PE specific versions of the COFF swapping
1044routines, and also defines some macros which control @file{coffcode.h}
1045itself.
1046@end table
1047
508fa296
ILT
1048@node BFD relocation handling
1049@section BFD relocation handling
1050@cindex bfd relocation handling
1051@cindex relocations in bfd
1052
1053The handling of relocations is one of the more confusing aspects of BFD.
1054Relocation handling has been implemented in various different ways, all
1055somewhat incompatible, none perfect.
1056
1057@menu
accf488e
ILT
1058* BFD relocation concepts:: BFD relocation concepts
1059* BFD relocation functions:: BFD relocation functions
d1d5d252 1060* BFD relocation codes:: BFD relocation codes
accf488e 1061* BFD relocation future:: BFD relocation future
508fa296
ILT
1062@end menu
1063
1064@node BFD relocation concepts
1065@subsection BFD relocation concepts
1066
1067A relocation is an action which the linker must take when linking. It
1068describes a change to the contents of a section. The change is normally
1069based on the final value of one or more symbols. Relocations are
1070created by the assembler when it creates an object file.
1071
1072Most relocations are simple. A typical simple relocation is to set 32
1073bits at a given offset in a section to the value of a symbol. This type
1074of relocation would be generated for code like @code{int *p = &i;} where
1075@samp{p} and @samp{i} are global variables. A relocation for the symbol
1076@samp{i} would be generated such that the linker would initialize the
1077area of memory which holds the value of @samp{p} to the value of the
1078symbol @samp{i}.
1079
1080Slightly more complex relocations may include an addend, which is a
1081constant to add to the symbol value before using it. In some cases a
1082relocation will require adding the symbol value to the existing contents
1083of the section in the object file. In others the relocation will simply
1084replace the contents of the section with the symbol value. Some
1085relocations are PC relative, so that the value to be stored in the
1086section is the difference between the value of a symbol and the final
1087address of the section contents.
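
The cases above can be sketched in a few lines of C. This is a simplified model, not BFD code: @samp{struct example_reloc} and @samp{example_apply_reloc} are hypothetical, the field is assumed to be 32 bits wide and stored little-endian, and overflow checking is omitted.

```c
#include <stdint.h>

/* Hypothetical, simplified description of one relocation.  */
struct example_reloc
{
  uint32_t offset;      /* where in the section the 32 bit field lives */
  int32_t addend;       /* constant added to the symbol value */
  int pc_relative;      /* nonzero: subtract the address of the field */
  int add_to_contents;  /* nonzero: add to the bits already present */
};

/* Read a little-endian 32 bit value.  */
static uint32_t
get32 (const unsigned char *p)
{
  return ((uint32_t) p[0] | ((uint32_t) p[1] << 8)
          | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24));
}

/* Write a little-endian 32 bit value.  */
static void
put32 (unsigned char *p, uint32_t v)
{
  p[0] = (unsigned char) (v & 0xff);
  p[1] = (unsigned char) ((v >> 8) & 0xff);
  p[2] = (unsigned char) ((v >> 16) & 0xff);
  p[3] = (unsigned char) ((v >> 24) & 0xff);
}

/* Apply R to CONTENTS, a section which will be loaded at
   SECTION_ADDRESS, using SYMBOL_VALUE as the final symbol value.  */
static void
example_apply_reloc (unsigned char *contents, uint32_t section_address,
                     const struct example_reloc *r, uint32_t symbol_value)
{
  uint32_t value = symbol_value + (uint32_t) r->addend;

  if (r->pc_relative)
    value -= section_address + r->offset;
  if (r->add_to_contents)
    value += get32 (contents + r->offset);
  put32 (contents + r->offset, value);
}
```

For the @code{int *p = &i;} case, the relocation would have a zero addend and both flags clear, so the field simply receives the value of @samp{i}.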
1088
1089In general, relocations can be arbitrarily complex. For
1090example, relocations used in dynamic linking systems often require the
1091linker to allocate space in a different section and use the offset
1092within that section as the value to store. In the IEEE object file
1093format, relocations may involve arbitrary expressions.
1094
1095When doing a relocateable link, the linker may or may not have to do
1096anything with a relocation, depending upon the definition of the
1097relocation. Simple relocations generally do not require any special
1098action.
1099
1100@node BFD relocation functions
1101@subsection BFD relocation functions
1102
1103In BFD, each section has an array of @samp{arelent} structures. Each
1104structure has a pointer to a symbol, an address within the section, an
1105addend, and a pointer to a @samp{reloc_howto_struct} structure. The
1106howto structure has a bunch of fields describing the reloc, including a
1107type field. The type field is specific to the object file format
1108backend; none of the generic code in BFD examines it.
1109
1110Originally, the function @samp{bfd_perform_relocation} was supposed to
1111handle all relocations. In theory, many relocations would be simple
1112enough to be described by the fields in the howto structure. For those
1113that weren't, the howto structure included a @samp{special_function}
1114field to use as an escape.
1115
1116While this seems plausible, a look at @samp{bfd_perform_relocation}
1117shows that it failed. The function has odd special cases. Some of the
1118fields in the howto structure, such as @samp{pcrel_offset}, were not
1119adequately documented.
1120
1121The linker uses @samp{bfd_perform_relocation} to do all relocations when
1122the input and output file have different formats (e.g., when generating
1123S-records). The generic linker code, which is used by all targets which
1124do not define their own special purpose linker, uses
1125@samp{bfd_get_relocated_section_contents}, which for most targets turns
1126into a call to @samp{bfd_generic_get_relocated_section_contents}, which
1127calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation}
1128is still widely used, which makes it difficult to change, since it is
1129difficult to test all possible cases.
1130
1131The assembler used @samp{bfd_perform_relocation} for a while. This
1132turned out to be the wrong thing to do, since
1133@samp{bfd_perform_relocation} was written to handle relocations on an
1134existing object file, while the assembler needed to create relocations
1135in a new object file. The assembler was changed to use the new function
1136@samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation}
1137was created as a copy of @samp{bfd_perform_relocation}.
1138
1139Unfortunately, the work did not progress any farther, so
1140@samp{bfd_install_relocation} remains a simple copy of
1141@samp{bfd_perform_relocation}, with all the odd special cases and
1142confusing code. This again is difficult to change, because again any
1143change can affect any assembler target, and so is difficult to test.
1144
1145The new linker, when using the same object file format for all input
1146files and the output file, does not convert relocations into
1147@samp{arelent} structures, so it can not use
1148@samp{bfd_perform_relocation} at all. Instead, users of the new linker
1149are expected to write a @samp{relocate_section} function which will
1150handle relocations in a target specific fashion.
1151
1152There are two helper functions for target specific relocation:
1153@samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
1154These functions use a howto structure, but they @emph{do not} use the
1155@samp{special_function} field. Since the functions are normally called
1156from target specific code, the @samp{special_function} field adds
1157little; any relocations which require special handling can be handled
1158without calling those functions.
1159
1160So, if you want to add a new target, or add a new relocation to an
1161existing target, you need to do the following:
1162@itemize @bullet
1163@item
1164Make sure you clearly understand what the contents of the section should
1165look like after assembly, after a relocateable link, and after a final
1166link. Make sure you clearly understand the operations the linker must
1167perform during a relocateable link and during a final link.
1168
1169@item
1170Write a howto structure for the relocation. The howto structure is
1171flexible enough to represent any relocation which should be handled by
1172setting a contiguous bitfield in the destination to the value of a
1173symbol, possibly with an addend, possibly adding the symbol value to the
1174value already present in the destination.
1175
1176@item
1177Change the assembler to generate your relocation. The assembler will
1178call @samp{bfd_install_relocation}, so your howto structure has to be
1179able to handle that. You may need to set the @samp{special_function}
1180field to handle assembly correctly. Be careful to ensure that any code
1181you write to handle the assembler will also work correctly when doing a
1182relocateable link. For example, see @samp{bfd_elf_generic_reloc}.
1183
1184@item
1185Test the assembler. Consider the cases of relocation against an
1186undefined symbol, a common symbol, a symbol defined in the object file
1187in the same section, and a symbol defined in the object file in a
1188different section. These cases may not all be applicable for your
1189reloc.
1190
1191@item
1192If your target uses the new linker, which is recommended, add any
1193required handling to the target specific relocation function. In simple
1194cases this will just involve a call to @samp{_bfd_final_link_relocate}
1195or @samp{_bfd_relocate_contents}, depending upon the definition of the
1196relocation and whether the link is relocateable or not.
1197
1198@item
1199Test the linker. Test the case of a final link. If the relocation can
1200overflow, use a linker script to force an overflow and make sure the
1201error is reported correctly. Test a relocateable link, whether the
1202symbol is defined or undefined in the relocateable output. For both the
1203final and relocateable link, test the case when the symbol is a common
1204symbol, when the symbol looked like a common symbol but became a defined
1205symbol, when the symbol is defined in a different object file, and when
1206the symbol is defined in the same object file.
1207
1208@item
1209In order for linking to another object file format, such as S-records,
1210to work correctly, @samp{bfd_perform_relocation} has to do the right
1211thing for the relocation. You may need to set the
1212@samp{special_function} field to handle this correctly. Test this by
1213doing a link in which the output object file format is S-records.
1214
1215@item
1216Using the linker to generate relocateable output in a different object
1217file format is impossible in the general case, so you generally don't
1218have to worry about that. Linking input files of different object file
1219formats together is quite unusual, but if you're really dedicated you
1220may want to consider testing this case, both when the output object file
1221format is the same as your format, and when it is different.
1222@end itemize
1223
d1d5d252
ILT
1224@node BFD relocation codes
1225@subsection BFD relocation codes
1226
1227BFD has another way of describing relocations besides the howto
1228structures described above: the enum @samp{bfd_reloc_code_real_type}.
1229
1230Every known relocation type can be described as a value in this
1231enumeration. The enumeration contains many target specific relocations,
1232but where two or more targets have the same relocation, a single code is
1233used. For example, the single value @samp{BFD_RELOC_32} is used for all
1234simple 32 bit relocation types.
1235
1236The main purpose of this relocation code is to give the assembler some
1237mechanism to create @samp{arelent} structures. In order for the
1238assembler to create an @samp{arelent} structure, it has to be able to
1239obtain a howto structure. The function @samp{bfd_reloc_type_lookup},
1240which simply calls the target vector entry point
1241@samp{reloc_type_lookup}, takes a relocation code and returns a howto
1242structure.
1243
1244The function @samp{bfd_get_reloc_code_name} returns the name of a
1245relocation code. This is mainly used in error messages.
1246
1247Using both howto structures and relocation codes can be somewhat
1248confusing. There are many processor specific relocation codes.
1249However, the relocation is only fully defined by the howto structure.
1250The same relocation code will map to different howto structures in
1251different object file formats. For example, the addend handling may be
1252different.
1253
1254Most of the relocation codes are not really general. The assembler can
1255not use them without already understanding what sorts of relocations can
1256be used for a particular target. It might be possible to replace the
1257relocation codes with something simpler.
1258
508fa296
ILT
1259@node BFD relocation future
1260@subsection BFD relocation future
1261
1262Clearly the current BFD relocation support is in bad shape. A
1263wholesale rewrite would be very difficult, because it would require
1264thorough testing of every BFD target. So some sort of incremental
1265change is required.
1266
1267My vague thoughts on this would involve defining a new, clearly defined,
1268howto structure. Some mechanism would be used to determine which type
1269of howto structure was being used by a particular format.
1270
1271The new howto structure would clearly define the relocation behaviour in
1272the case of an assembly, a relocateable link, and a final link. At
1273least one special function would be defined as an escape, and it might
1274make sense to define more.
1275
1276One or more generic functions similar to @samp{bfd_perform_relocation}
1277would be written to handle the new howto structure.
1278
1279This should make it possible to write a generic version of the relocate
1280section functions used by the new linker. The target specific code
1281would provide some mechanism (a function pointer or an initial
1282conversion) to convert target specific relocations into howto
1283structures.
1284
1285Ideally it would be possible to use this generic relocate section
1286function for the generic linker as well. That is, it would replace the
1287@samp{bfd_generic_get_relocated_section_contents} function which is
1288currently normally used.
1289
1290For the special case of ELF dynamic linking, more consideration needs to
1291be given to writing ELF specific but ELF target generic code to handle
1292special relocation types such as GOT and PLT.
1293
d1d5d252
ILT
1294@node BFD ELF support
1295@section BFD ELF support
1296@cindex elf support in bfd
1297@cindex bfd elf support
1298
1299The ELF object file format is defined in two parts: a generic ABI and a
1300processor specific supplement. The ELF support in BFD is split in a
1301similar fashion. The processor specific support is largely kept within
1302a single file. The generic support is provided by several other files.
1303The processor specific support provides a set of function pointers and
1304constants used by the generic support.
1305
1306@menu
1307* BFD ELF generic support:: BFD ELF generic support
1308* BFD ELF processor specific support:: BFD ELF processor specific support
1309* BFD ELF future:: BFD ELF future
1310@end menu
1311
1312@node BFD ELF generic support
1313@subsection BFD ELF generic support
1314
1315In general, functions which do not read external data from the ELF file
1316are found in @file{elf.c}. They operate on the internal forms of the
1317ELF structures, which are defined in @file{include/elf/internal.h}. The
1318internal structures are defined in terms of @samp{bfd_vma}, and so may
1319be used for both 32 bit and 64 bit ELF targets.
1320
1321The file @file{elfcode.h} contains functions which operate on the
1322external data. @file{elfcode.h} is compiled twice, once via
1323@file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
1324@file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
1325@file{elfcode.h} includes functions to swap the ELF structures in and
1326out of external form, as well as a few more complex functions.
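
The effect of compiling one body for two sizes can be imitated in a single file, as in the sketch below. It swaps in a different mechanism for illustration: a macro stamps out the body once per size, whereas the real code includes @file{elfcode.h} after defining @samp{ARCH_SIZE}. The names @samp{EXAMPLE_NAME} and @samp{EXAMPLE_DEFINE_GET_WORD} are hypothetical, and little-endian data is assumed.

```c
#include <stdint.h>

/* Paste X, SIZE, an underscore and Y into one identifier, in the
   manner of a NAME style macro.  */
#define EXAMPLE_NAME(x, size, y) x##size##_##y

/* Stand-in for the shared body.  Each expansion defines a function
   which reads one little-endian word of SIZE bits.  */
#define EXAMPLE_DEFINE_GET_WORD(size)                                   \
  static uint64_t                                                       \
  EXAMPLE_NAME (example_elf, size, get_word) (const unsigned char *p)   \
  {                                                                     \
    uint64_t v = 0;                                                     \
    int i;                                                              \
                                                                        \
    for (i = size / 8 - 1; i >= 0; i--)                                 \
      v = (v << 8) | p[i];                                              \
    return v;                                                           \
  }

EXAMPLE_DEFINE_GET_WORD (32)    /* defines example_elf32_get_word */
EXAMPLE_DEFINE_GET_WORD (64)    /* defines example_elf64_get_word */
```

The two resulting functions differ only in the number of bytes they read, which is the kind of difference that @samp{ARCH_SIZE} selects.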
1327
1328Linker support is found in @file{elflink.c} and @file{elflink.h}. The
1329latter file is compiled twice, for both 32 and 64 bit support. The
1330linker support is only used if the processor specific file defines
1331@samp{elf_backend_relocate_section}, which is required to relocate the
1332section contents. If that macro is not defined, the generic linker code
1333is used, and relocations are handled via @samp{bfd_perform_relocation}.
1334
1335The core file support is in @file{elfcore.h}, which is compiled twice,
1336for both 32 and 64 bit support. The more interesting cases of core file
1337support only work on a native system which has the @file{sys/procfs.h}
1338header file. Without that file, the core file support does little more
1339than read the ELF program segments as BFD sections.
1340
1341The BFD internal header file @file{elf-bfd.h} is used for communication
1342among these files and the processor specific files.
1343
1344The default entries for the BFD ELF target vector are found mainly in
1345@file{elf.c}. Some functions are found in @file{elfcode.h}.
1346
1347The processor specific files may override particular entries in the
1348target vector, but most do not, with one exception: the
1349@samp{bfd_reloc_type_lookup} entry point is always processor specific.
1350
1351@node BFD ELF processor specific support
1352@subsection BFD ELF processor specific support
1353
1354By convention, the processor specific support for a particular processor
1355will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
1356either 32 or 64, and @var{cpu} is the name of the processor.
1357
1358@menu
1359* BFD ELF processor required:: Required processor specific support
1360* BFD ELF processor linker:: Processor specific linker support
1361* BFD ELF processor other:: Other processor specific support options
1362@end menu
1363
1364@node BFD ELF processor required
1365@subsubsection Required processor specific support
1366
1367When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the
1368following:
1369@itemize @bullet
1370@item
1371Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
1372both, to a unique C name to use for the target vector. This name should
1373appear in the list of target vectors in @file{targets.c}, and will also
1374have to appear in @file{config.bfd} and @file{configure.in}. Define
1375@samp{TARGET_BIG_SYM} for a big-endian processor,
1376@samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
1377for a bi-endian processor.
1378@item
1379Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
1380both, to a string used as the name of the target vector. This is the
1381name which a user of the BFD tool would use to specify the object file
1382format. It would normally appear in a linker emulation parameters
1383file.
1384@item
1385Define @samp{ELF_ARCH} to the BFD architecture (an element of the
1386@samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
1387@item
1388Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
1389in the @samp{e_machine} field of the ELF header. As of this writing,
1390these magic numbers are assigned by SCO; if you want to get a magic
1391number for a particular processor, try sending a note to
1392@email{registry@@sco.com}. In the BFD sources, the magic numbers are
1393found in @file{include/elf/common.h}; they have names beginning with
1394@samp{EM_}.
1395@item
1396Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
1397memory. This can normally be found at the start of chapter 5 in the
1398processor specific supplement. For a processor which will only be used
1399in an embedded system, or which has no memory management hardware, this
1400can simply be @samp{1}.
1401@item
1402If the format should use @samp{Rel} rather than @samp{Rela} relocations,
1403define @samp{USE_REL}. This is normally defined in chapter 4 of the
1404processor specific supplement. In the absence of a supplement, it's
1405usually easier to work with @samp{Rela} relocations, although they will
1406require more space in object files (but not in executables, except when
1407using dynamic linking). It is possible, though somewhat awkward, to
1408support both @samp{Rel} and @samp{Rela} relocations for a single target;
1409@file{elf64-mips.c} does it by overriding the relocation reading and
1410writing routines.
@item
Define howto structures for all the relocation types.
@item
Define a @samp{bfd_reloc_type_lookup} routine. This must be named
@samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
function or a macro. It must translate a BFD relocation code into a
howto structure. This is normally a table lookup or a simple switch.
@item
If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
Either way, this is a macro defined as the name of a function which
takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
sets the @samp{howto} field of the @samp{arelent} based on the
@samp{Rel} or @samp{Rela} structure. This function normally uses
@samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
an index into a table of howto structures.
@end itemize

You must also add the magic number for this processor to the
@samp{prep_headers} function in @file{elf.c}.
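
As a rough illustration of the steps above, the relevant part of a
backend for a hypothetical 32 bit big endian processor @samp{foo} using
@samp{Rela} relocations might look like the following. This is a sketch
under assumptions, not code from any real backend: @samp{bfd_arch_foo},
@samp{EM_FOO}, @samp{R_FOO_NONE}, @samp{R_FOO_32}, and every name
beginning with @samp{elf32_foo_} or @samp{bfd_elf32_foo_} are invented,
the page size is a placeholder, and the howto entries are left out. The
function prototypes, the @samp{TARGET_BIG_SYM} macro, and the final
@samp{#include} are assumed from the usual layout of an
@file{elf@var{nn}-@var{cpu}.c} file, and may differ in your version of
BFD.

@example
/* Sketch only: foo, R_FOO_*, and the elf32_foo_ names are invented,
   and exact prototypes differ between BFD versions.  */

/* One howto structure per relocation type, indexed by ELF type.  */
static reloc_howto_type elf32_foo_howto_table[] =
@{
  /* HOWTO (R_FOO_NONE, ...), HOWTO (R_FOO_32, ...), and so on.  */
@};

/* Translate a BFD relocation code into a howto structure.  */
static reloc_howto_type *
elf32_foo_reloc_type_lookup (bfd *abfd, bfd_reloc_code_real_type code)
@{
  switch (code)
    @{
    case BFD_RELOC_NONE:
      return &elf32_foo_howto_table[R_FOO_NONE];
    case BFD_RELOC_32:
      return &elf32_foo_howto_table[R_FOO_32];
    default:
      return NULL;
    @}
@}

/* Set the howto field of an arelent from a Rela structure.  */
static void
elf32_foo_info_to_howto (bfd *abfd, arelent *cache_ptr,
                         Elf_Internal_Rela *dst)
@{
  cache_ptr->howto = &elf32_foo_howto_table[ELF32_R_TYPE (dst->r_info)];
@}

#define TARGET_BIG_SYM      bfd_elf32_foo_vec
#define TARGET_BIG_NAME     "elf32-foo"
#define ELF_ARCH            bfd_arch_foo
#define ELF_MACHINE_CODE    EM_FOO
#define ELF_MAXPAGESIZE     0x1000      /* placeholder value */

/* Rela relocations are used, so USE_REL is not defined.  */
#define bfd_elf32_bfd_reloc_type_lookup  elf32_foo_reloc_type_lookup
#define elf_info_to_howto                elf32_foo_info_to_howto

#include "elf32-target.h"
@end example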

@node BFD ELF processor linker
@subsubsection Processor specific linker support

The linker will be much more efficient if you define a relocate section
function. This will permit BFD to use the ELF specific linker support.

If you do not define a relocate section function, BFD must use the
generic linker support, which requires converting all symbols and
relocations into BFD @samp{asymbol} and @samp{arelent} structures. In
this case, relocations will be handled by calling
@samp{bfd_perform_relocation}, which will use the howto structures you
have defined. @xref{BFD relocation handling}.

In order to support linking into a different object file format, such as
S-records, @samp{bfd_perform_relocation} must work correctly with your
howto structures, so you can't skip that step. However, if you define
the relocate section function, then in the normal case of linking into
an ELF file the linker will not need to convert symbols and relocations,
and will be much more efficient.

To use a relocate section function, define the macro
@samp{elf_backend_relocate_section} as the name of a function which will
take the contents of a section, as well as relocation, symbol, and other
information, and modify the section contents according to the relocation
information. In simple cases, this is little more than a loop over the
relocations which computes the value of each relocation and calls
@samp{_bfd_final_link_relocate}. The function must check for a
relocatable link, and in that case normally needs to do nothing other
than adjust the addend for relocations against a section symbol.

The complex cases generally have to do with dynamic linker support. GOT
and PLT relocations must be handled specially, and the linker normally
arranges to set up the GOT and PLT sections while handling relocations.
When generating a shared library, arbitrary relocations must normally be
copied into the shared library, or converted to RELATIVE relocations
when possible.
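
The simple case described above can be outlined as follows for a
hypothetical 32 bit processor @samp{foo} using @samp{Rela} relocations.
This is a sketch under assumptions, not a definitive implementation:
the @samp{elf32_foo_} names are invented, the prototype and the
@samp{boolean} type follow the style of backends from the time this
document was written and differ in later BFD versions, global symbol
lookup and all GOT, PLT, and shared library handling are omitted, and
error reporting is reduced to returning @samp{false}.

@example
static boolean
elf32_foo_relocate_section (bfd *output_bfd,
                            struct bfd_link_info *info,
                            bfd *input_bfd, asection *input_section,
                            bfd_byte *contents,
                            Elf_Internal_Rela *relocs,
                            Elf_Internal_Sym *local_syms,
                            asection **local_sections)
@{
  Elf_Internal_Shdr *symtab_hdr = &elf_tdata (input_bfd)->symtab_hdr;
  Elf_Internal_Rela *rel = relocs;
  Elf_Internal_Rela *relend = relocs + input_section->reloc_count;

  for (; rel < relend; rel++)
    @{
      unsigned long r_symndx = ELF32_R_SYM (rel->r_info);
      reloc_howto_type *howto
        = &elf32_foo_howto_table[ELF32_R_TYPE (rel->r_info)];
      bfd_vma relocation = 0;

      /* This field is spelled `relocatable', or is tested through a
         macro, in later BFD versions.  */
      if (info->relocateable)
        @{
          /* Only adjust the addend of relocations against a local
             section symbol; leave everything else alone.  */
          if (r_symndx < symtab_hdr->sh_info)
            @{
              Elf_Internal_Sym *sym = local_syms + r_symndx;

              if (ELF_ST_TYPE (sym->st_info) == STT_SECTION)
                rel->r_addend += (local_sections[r_symndx]->output_offset
                                  + sym->st_value);
            @}
          continue;
        @}

      if (r_symndx < symtab_hdr->sh_info)
        @{
          /* Local symbol: its value is relative to its section.  */
          Elf_Internal_Sym *sym = local_syms + r_symndx;
          asection *sec = local_sections[r_symndx];

          relocation = (sec->output_section->vma + sec->output_offset
                        + sym->st_value);
        @}
      else
        @{
          /* Global symbol: look it up in the linker hash table and
             report undefined symbols (omitted in this sketch).  */
        @}

      if (_bfd_final_link_relocate (howto, input_bfd, input_section,
                                    contents, rel->r_offset, relocation,
                                    rel->r_addend) != bfd_reloc_ok)
        return false;
    @}

  return true;
@}

#define elf_backend_relocate_section  elf32_foo_relocate_section
@end example

The howto table indexed here is the same one that
@samp{bfd_perform_relocation} uses in the generic linker, which is why
the howto structures must be correct even when a relocate section
function is defined.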

@node BFD ELF processor other
@subsubsection Other processor specific support options

There are many other macros which may be defined in
@file{elf@var{nn}-@var{cpu}.c}. These macros may be found in
@file{elfxx-target.h}.

Macros may be used to override some of the generic ELF target vector
functions.

There are several processor specific hook functions which may be
defined as macros. These functions are found as function pointers in
the @samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In
general, a hook function is set by defining a macro
@samp{elf_backend_@var{name}}.

There are a few processor specific constants which may also be defined.
These are again found in the @samp{elf_backend_data} structure.

I will not define the various functions and constants here; see the
comments in @file{elf-bfd.h}.

Normally any odd characteristic of a particular ELF processor is handled
via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON}
section number found in MIPS ELF is handled via the hooks
@samp{section_from_bfd_section}, @samp{symbol_processing},
@samp{add_symbol_hook}, and @samp{output_symbol_hook}.

Dynamic linking support, which involves processor specific relocations
requiring special handling, is also implemented via hook functions.
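
For instance, a backend for a hypothetical processor @samp{foo} might
select hooks and constants as sketched below, placing the definitions
before the point where it includes @file{elf@var{nn}-target.h}. The
@samp{foo_} function names are invented for illustration. The macro
names follow the @samp{elf_backend_@var{name}} pattern, but the set of
hooks and constants changes between BFD versions, so treat the ones
shown as assumptions and check @file{elfxx-target.h} for those that
exist in your sources.

@example
/* Hook functions: each macro names a function defined in this file.  */
#define elf_backend_section_from_bfd_section \
  foo_section_from_bfd_section
#define elf_backend_symbol_processing  foo_symbol_processing
#define elf_backend_add_symbol_hook    foo_add_symbol_hook

/* Constants stored in the elf_backend_data structure.  */
#define elf_backend_want_got_plt       1
#define elf_backend_plt_readonly       1
@end example

Any hook or constant which is left undefined takes the default supplied
by @file{elfxx-target.h}.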

@node BFD ELF future
@subsection BFD ELF future

The current dynamic linking support has too much code duplication.
While each processor has particular differences, much of the dynamic
linking support is quite similar for each processor. The GOT and PLT
are handled in fairly similar ways, the details of @samp{-Bsymbolic}
linking are generally similar, etc. This code should be reworked to use
more generic functions, eliminating the duplication.

Similarly, the relocation handling has too much duplication. Many of
the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
quite similar. The relocate section functions are also often quite
similar, both in the standard linker handling and the dynamic linker
handling. Many of the COFF processor specific backends share a single
relocate section function (@samp{_bfd_coff_generic_relocate_section}),
and it should be possible to do something like this for the ELF targets
as well.

The appearance of the processor specific magic number in
@samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be
possible to add support for a new processor without changing the generic
support.

The processor function hooks and constants are ad hoc and need better
documentation.

@node Index
@unnumberedsec Index
@printindex cp

@contents
@bye