]>
Commit | Line | Data |
---|---|---|
83d290c5 | 1 | # SPDX-License-Identifier: GPL-2.0+ |
b2e16a85 SG |
2 | # |
3 | # Copyright (c) 2013 The Chromium OS Authors. | |
b2e16a85 SG |
4 | |
5 | Tracing in U-Boot | |
6 | ================= | |
7 | ||
8 | U-Boot supports a simple tracing feature which allows a record of excecution | |
9 | to be collected and sent to a host machine for analysis. At present the | |
10 | main use for this is to profile boot time. | |
11 | ||
12 | ||
13 | Overview | |
14 | -------- | |
15 | ||
16 | The trace feature uses GCC's instrument-functions feature to trace all | |
17 | function entry/exit points. These are then recorded in a memory buffer. | |
18 | The memory buffer can be saved to the host over a network link using | |
19 | tftpput or by writing to an attached memory device such as MMC. | |
20 | ||
21 | On the host, the file is first converted with a tool called 'proftool', | |
22 | which extracts useful information from it. The resulting trace output | |
23 | resembles that emitted by Linux's ftrace feature, so can be visually | |
24 | displayed by pytimechart. | |
25 | ||
26 | ||
27 | Quick-start using Sandbox | |
28 | ------------------------- | |
29 | ||
30 | Sandbox is a build of U-Boot that can run under Linux so it is a convenient | |
31 | way of trying out tracing before you use it on your actual board. To do | |
32 | this, follow these steps: | |
33 | ||
34 | Add the following to include/configs/sandbox.h (if not already there) | |
35 | ||
36 | #define CONFIG_TRACE | |
37 | #define CONFIG_CMD_TRACE | |
38 | #define CONFIG_TRACE_BUFFER_SIZE (16 << 20) | |
39 | #define CONFIG_TRACE_EARLY_SIZE (8 << 20) | |
40 | #define CONFIG_TRACE_EARLY | |
41 | #define CONFIG_TRACE_EARLY_ADDR 0x00100000 | |
42 | ||
43 | Build sandbox U-Boot with tracing enabled: | |
44 | ||
45 | $ make FTRACE=1 O=sandbox sandbox_config | |
46 | $ make FTRACE=1 O=sandbox | |
47 | ||
48 | Run sandbox, wait for a bit of trace information to appear, and then capture | |
49 | a trace: | |
50 | ||
51 | $ ./sandbox/u-boot | |
52 | ||
53 | ||
54 | U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) | |
55 | ||
56 | DRAM: 128 MiB | |
57 | trace: enabled | |
58 | Using default environment | |
59 | ||
60 | In: serial | |
61 | Out: serial | |
62 | Err: serial | |
63 | =>trace stats | |
93e14596 WD |
64 | 671,406 function sites |
65 | 69,712 function calls | |
66 | 0 untracked function calls | |
67 | 73,373 traced function calls | |
68 | 16 maximum observed call depth | |
69 | 15 call depth limit | |
70 | 66,491 calls not traced due to depth | |
b2e16a85 | 71 | =>trace stats |
93e14596 | 72 | 671,406 function sites |
b2e16a85 | 73 | 1,279,450 function calls |
93e14596 WD |
74 | 0 untracked function calls |
75 | 950,490 traced function calls (333217 dropped due to overflow) | |
76 | 16 maximum observed call depth | |
77 | 15 call depth limit | |
b2e16a85 SG |
78 | 1,275,767 calls not traced due to depth |
79 | =>trace calls 0 e00000 | |
80 | Call list dumped to 00000000, size 0xae0a40 | |
81 | ||
82 | baudrate=115200 | |
83 | profbase=0 | |
84 | profoffset=ae0a40 | |
85 | profsize=e00000 | |
86 | stderr=serial | |
87 | stdin=serial | |
88 | stdout=serial | |
89 | ||
90 | Environment size: 117/8188 bytes | |
6d07d63d | 91 | =>host save host 0 trace 0 ${profoffset} |
b2e16a85 SG |
92 | 11405888 bytes written in 10 ms (1.1 GiB/s) |
93 | =>reset | |
94 | ||
95 | ||
96 | Then run proftool to convert the trace information to ftrace format. | |
97 | ||
98 | $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt | |
99 | ||
100 | Finally run pytimechart to display it: | |
101 | ||
102 | $ pytimechart trace.txt | |
103 | ||
104 | Using this tool you can zoom and pan across the trace, with the function | |
105 | calls on the left and little marks representing the start and end of each | |
106 | function. | |
107 | ||
108 | ||
109 | CONFIG Options | |
110 | -------------- | |
111 | ||
112 | - CONFIG_TRACE | |
113 | Enables the trace feature in U-Boot. | |
114 | ||
115 | - CONFIG_CMD_TRACE | |
116 | Enables the trace command. | |
117 | ||
118 | - CONFIG_TRACE_BUFFER_SIZE | |
119 | Size of trace buffer to allocate for U-Boot. This buffer is | |
120 | used after relocation, as a place to put function tracing | |
121 | information. The address of the buffer is determined by | |
122 | the relocation code. | |
123 | ||
124 | - CONFIG_TRACE_EARLY | |
125 | Define this to start tracing early, before relocation. | |
126 | ||
127 | - CONFIG_TRACE_EARLY_SIZE | |
128 | Size of 'early' trace buffer. Before U-Boot has relocated | |
129 | it doesn't have a proper trace buffer. On many boards | |
130 | you can define an area of memory to use for the trace | |
131 | buffer until the 'real' trace buffer is available after | |
132 | relocation. The contents of this buffer are then copied to | |
133 | the real buffer. | |
134 | ||
135 | - CONFIG_TRACE_EARLY_ADDR | |
136 | Address of early trace buffer | |
137 | ||
138 | ||
139 | Building U-Boot with Tracing Enabled | |
140 | ------------------------------------ | |
141 | ||
142 | Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. | |
143 | This is kept as a separate option so that it is easy to enable/disable | |
144 | instrumenting from the command line instead of having to change board | |
145 | config files. | |
146 | ||
147 | ||
148 | Collecting Trace Data | |
149 | --------------------- | |
150 | ||
151 | When you run U-Boot on your board it will collect trace data up to the | |
152 | limit of the trace buffer size you have specified. Once that is exhausted | |
153 | no more data will be collected. | |
154 | ||
155 | Collecting trace data has an affect on execution time/performance. You | |
156 | will notice this particularly with trvial functions - the overhead of | |
157 | recording their execution may even exceed their normal execution time. | |
158 | In practice this doesn't matter much so long as you are aware of the | |
159 | effect. Once you have done your optimisations, turn off tracing before | |
160 | doing end-to-end timing. | |
161 | ||
162 | The best time to start tracing is right at the beginning of U-Boot. The | |
163 | best time to stop tracing is right at the end. In practice it is hard | |
164 | to achieve these ideals. | |
165 | ||
166 | This implementation enables tracing early in board_init_f(). This means | |
167 | that it captures most of the board init process, missing only the | |
168 | early architecture-specific init. However, it also misses the entire | |
169 | SPL stage if there is one. | |
170 | ||
171 | U-Boot typically ends with a 'bootm' command which loads and runs an | |
172 | OS. There is useful trace data in the execution of that bootm | |
173 | command. Therefore this implementation provides a way to collect trace | |
174 | data after bootm has finished processing, but just before it jumps to | |
175 | the OS. In practical terms, U-Boot runs the 'fakegocmd' environment | |
176 | variable at this point. This variable should have a short script which | |
177 | collects the trace data and writes it somewhere. | |
178 | ||
179 | Trace data collection relies on a microsecond timer, accesed through | |
180 | timer_get_us(). So the first think you should do is make sure that | |
181 | this produces sensible results for your board. Suitable sources for | |
182 | this timer include high resolution timers, PWMs or profile timers if | |
183 | available. Most modern SOCs have a suitable timer for this. Make sure | |
184 | that you mark this timer (and anything it calls) with | |
185 | __attribute__((no_instrument_function)) so that the trace library can | |
186 | use it without causing an infinite loop. | |
187 | ||
188 | ||
189 | Commands | |
190 | -------- | |
191 | ||
192 | The trace command has variable sub-commands: | |
193 | ||
194 | - stats | |
195 | Display tracing statistics | |
196 | ||
197 | - pause | |
198 | Pause tracing | |
199 | ||
200 | - resume | |
201 | Resume tracing | |
202 | ||
203 | - funclist [<addr> <size>] | |
204 | Dump a list of functions into the buffer | |
205 | ||
206 | - calls [<addr> <size>] | |
207 | Dump function call trace into buffer | |
208 | ||
209 | If the address and size are not given, these are obtained from environment | |
210 | variables (see below). In any case the environment variables are updated | |
211 | after the command runs. | |
212 | ||
213 | ||
214 | Environment Variables | |
215 | --------------------- | |
216 | ||
217 | The following are used: | |
218 | ||
219 | - profbase | |
220 | Base address of trace output buffer | |
221 | ||
222 | - profoffset | |
223 | Offset of first unwritten byte in trace output buffer | |
224 | ||
225 | - profsize | |
226 | Size of trace output buffer | |
227 | ||
228 | All of these are set by the 'trace calls' command. | |
229 | ||
230 | These variables keep track of the amount of data written to the trace | |
231 | output buffer by the 'trace' command. The trace commands which write data | |
232 | to the output buffer can use these to specify the buffer to write to, and | |
233 | update profoffset each time. This allows successive commands to append data | |
234 | to the same buffer, for example: | |
235 | ||
236 | trace funclist 10000 e00000 | |
237 | trace calls | |
238 | ||
239 | (the latter command appends more data to the buffer). | |
240 | ||
241 | ||
242 | - fakegocmd | |
243 | Specifies commands to run just before booting the OS. This | |
244 | is a useful time to write the trace data to the host for | |
245 | processing. | |
246 | ||
247 | ||
248 | Writing Out Trace Data | |
249 | ---------------------- | |
250 | ||
251 | Once the trace data is in an output buffer in memory there are various ways | |
252 | to transmit it to the host. Notably you can use tftput to send the data | |
253 | over a network link: | |
254 | ||
255 | fakegocmd=trace pause; usb start; set autoload n; bootp; | |
256 | trace calls 10000000 1000000; | |
257 | tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls | |
258 | ||
259 | This starts up USB (to talk to an attached USB Ethernet dongle), writes | |
260 | a trace log to address 10000000 and sends it to a host machine using | |
261 | TFTP. After this, U-Boot will boot the OS normally, albeit a little | |
262 | later. | |
263 | ||
264 | ||
265 | Converting Trace Output Data | |
266 | ---------------------------- | |
267 | ||
268 | The trace output data is kept in a binary format which is not documented | |
269 | here. To convert it into something useful, you can use proftool. | |
270 | ||
271 | This tool must be given the U-Boot map file and the trace data received | |
272 | from running that U-Boot. It produces a text output file. | |
273 | ||
274 | Options | |
275 | -m <map_file> | |
276 | Specify U-Boot map file | |
277 | ||
278 | -p <trace_file> | |
279 | Specifiy profile/trace file | |
280 | ||
281 | Commands: | |
282 | ||
283 | - dump-ftrace | |
284 | Write a text dump of the file in Linux ftrace format to stdout | |
285 | ||
286 | ||
287 | Viewing the Trace Data | |
288 | ---------------------- | |
289 | ||
290 | You can use pytimechart for this (sudo apt-get pytimechart might work on | |
291 | your Debian-style machine, and use your favourite search engine to obtain | |
292 | documentation). It expects the file to have a .txt extension. The program | |
293 | has terse user interface but is very convenient for viewing U-Boot | |
294 | profile information. | |
295 | ||
296 | ||
297 | Workflow Suggestions | |
298 | -------------------- | |
299 | ||
300 | The following suggestions may be helpful if you are trying to reduce boot | |
301 | time: | |
302 | ||
303 | 1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get | |
304 | you are helpful overall snapshot of the boot time. | |
305 | ||
306 | 2. Build U-Boot with tracing and run it. Note the difference in boot time | |
307 | (it is common for tracing to add 10% to the time) | |
308 | ||
309 | 3. Collect the trace information as descibed above. Use this to find where | |
310 | all the time is being spent. | |
311 | ||
312 | 4. Take a look at that code and see if you can optimise it. Perhaps it is | |
313 | possible to speed up the initialisation of a device, or remove an unused | |
314 | feature. | |
315 | ||
316 | 5. Rebuild, run and collect again. Compare your results. | |
317 | ||
318 | 6. Keep going until you run out of steam, or your boot is fast enough. | |
319 | ||
320 | ||
321 | Configuring Trace | |
322 | ----------------- | |
323 | ||
324 | There are a few parameters in the code that you may want to consider. | |
325 | There is a function call depth limit (set to 15 by default). When the | |
326 | stack depth goes above this then no tracing information is recorded. | |
327 | The maximum depth reached is recorded and displayed by the 'trace stats' | |
328 | command. | |
329 | ||
330 | ||
331 | Future Work | |
332 | ----------- | |
333 | ||
334 | Tracing could be a little tidier in some areas, for example providing | |
335 | run-time configuration options for trace. | |
336 | ||
337 | Some other features that might be useful: | |
338 | ||
339 | - Trace filter to select which functions are recorded | |
340 | - Sample-based profiling using a timer interrupt | |
341 | - Better control over trace depth | |
342 | - Compression of trace information | |
343 | ||
344 | ||
345 | Simon Glass <[email protected]> | |
346 | April 2013 |