]>
Commit | Line | Data |
---|---|---|
b2e16a85 SG |
1 | # |
2 | # Copyright (c) 2013 The Chromium OS Authors. | |
3 | # | |
1a459660 | 4 | # SPDX-License-Identifier: GPL-2.0+ |
b2e16a85 SG |
5 | # |
6 | ||
7 | Tracing in U-Boot | |
8 | ================= | |
9 | ||
10 | U-Boot supports a simple tracing feature which allows a record of excecution | |
11 | to be collected and sent to a host machine for analysis. At present the | |
12 | main use for this is to profile boot time. | |
13 | ||
14 | ||
15 | Overview | |
16 | -------- | |
17 | ||
18 | The trace feature uses GCC's instrument-functions feature to trace all | |
19 | function entry/exit points. These are then recorded in a memory buffer. | |
20 | The memory buffer can be saved to the host over a network link using | |
21 | tftpput or by writing to an attached memory device such as MMC. | |
22 | ||
23 | On the host, the file is first converted with a tool called 'proftool', | |
24 | which extracts useful information from it. The resulting trace output | |
25 | resembles that emitted by Linux's ftrace feature, so can be visually | |
26 | displayed by pytimechart. | |
27 | ||
28 | ||
29 | Quick-start using Sandbox | |
30 | ------------------------- | |
31 | ||
32 | Sandbox is a build of U-Boot that can run under Linux so it is a convenient | |
33 | way of trying out tracing before you use it on your actual board. To do | |
34 | this, follow these steps: | |
35 | ||
36 | Add the following to include/configs/sandbox.h (if not already there) | |
37 | ||
38 | #define CONFIG_TRACE | |
39 | #define CONFIG_CMD_TRACE | |
40 | #define CONFIG_TRACE_BUFFER_SIZE (16 << 20) | |
41 | #define CONFIG_TRACE_EARLY_SIZE (8 << 20) | |
42 | #define CONFIG_TRACE_EARLY | |
43 | #define CONFIG_TRACE_EARLY_ADDR 0x00100000 | |
44 | ||
45 | Build sandbox U-Boot with tracing enabled: | |
46 | ||
47 | $ make FTRACE=1 O=sandbox sandbox_config | |
48 | $ make FTRACE=1 O=sandbox | |
49 | ||
50 | Run sandbox, wait for a bit of trace information to appear, and then capture | |
51 | a trace: | |
52 | ||
53 | $ ./sandbox/u-boot | |
54 | ||
55 | ||
56 | U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) | |
57 | ||
58 | DRAM: 128 MiB | |
59 | trace: enabled | |
60 | Using default environment | |
61 | ||
62 | In: serial | |
63 | Out: serial | |
64 | Err: serial | |
65 | =>trace stats | |
93e14596 WD |
66 | 671,406 function sites |
67 | 69,712 function calls | |
68 | 0 untracked function calls | |
69 | 73,373 traced function calls | |
70 | 16 maximum observed call depth | |
71 | 15 call depth limit | |
72 | 66,491 calls not traced due to depth | |
b2e16a85 | 73 | =>trace stats |
93e14596 | 74 | 671,406 function sites |
b2e16a85 | 75 | 1,279,450 function calls |
93e14596 WD |
76 | 0 untracked function calls |
77 | 950,490 traced function calls (333217 dropped due to overflow) | |
78 | 16 maximum observed call depth | |
79 | 15 call depth limit | |
b2e16a85 SG |
80 | 1,275,767 calls not traced due to depth |
81 | =>trace calls 0 e00000 | |
82 | Call list dumped to 00000000, size 0xae0a40 | |
83 | ||
84 | baudrate=115200 | |
85 | profbase=0 | |
86 | profoffset=ae0a40 | |
87 | profsize=e00000 | |
88 | stderr=serial | |
89 | stdin=serial | |
90 | stdout=serial | |
91 | ||
92 | Environment size: 117/8188 bytes | |
93 | =>sb save host 0 trace 0 ${profoffset} | |
94 | 11405888 bytes written in 10 ms (1.1 GiB/s) | |
95 | =>reset | |
96 | ||
97 | ||
98 | Then run proftool to convert the trace information to ftrace format. | |
99 | ||
100 | $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt | |
101 | ||
102 | Finally run pytimechart to display it: | |
103 | ||
104 | $ pytimechart trace.txt | |
105 | ||
106 | Using this tool you can zoom and pan across the trace, with the function | |
107 | calls on the left and little marks representing the start and end of each | |
108 | function. | |
109 | ||
110 | ||
111 | CONFIG Options | |
112 | -------------- | |
113 | ||
114 | - CONFIG_TRACE | |
115 | Enables the trace feature in U-Boot. | |
116 | ||
117 | - CONFIG_CMD_TRACE | |
118 | Enables the trace command. | |
119 | ||
120 | - CONFIG_TRACE_BUFFER_SIZE | |
121 | Size of trace buffer to allocate for U-Boot. This buffer is | |
122 | used after relocation, as a place to put function tracing | |
123 | information. The address of the buffer is determined by | |
124 | the relocation code. | |
125 | ||
126 | - CONFIG_TRACE_EARLY | |
127 | Define this to start tracing early, before relocation. | |
128 | ||
129 | - CONFIG_TRACE_EARLY_SIZE | |
130 | Size of 'early' trace buffer. Before U-Boot has relocated | |
131 | it doesn't have a proper trace buffer. On many boards | |
132 | you can define an area of memory to use for the trace | |
133 | buffer until the 'real' trace buffer is available after | |
134 | relocation. The contents of this buffer are then copied to | |
135 | the real buffer. | |
136 | ||
137 | - CONFIG_TRACE_EARLY_ADDR | |
138 | Address of early trace buffer | |
139 | ||
140 | ||
141 | Building U-Boot with Tracing Enabled | |
142 | ------------------------------------ | |
143 | ||
144 | Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. | |
145 | This is kept as a separate option so that it is easy to enable/disable | |
146 | instrumenting from the command line instead of having to change board | |
147 | config files. | |
148 | ||
149 | ||
150 | Collecting Trace Data | |
151 | --------------------- | |
152 | ||
153 | When you run U-Boot on your board it will collect trace data up to the | |
154 | limit of the trace buffer size you have specified. Once that is exhausted | |
155 | no more data will be collected. | |
156 | ||
157 | Collecting trace data has an affect on execution time/performance. You | |
158 | will notice this particularly with trvial functions - the overhead of | |
159 | recording their execution may even exceed their normal execution time. | |
160 | In practice this doesn't matter much so long as you are aware of the | |
161 | effect. Once you have done your optimisations, turn off tracing before | |
162 | doing end-to-end timing. | |
163 | ||
164 | The best time to start tracing is right at the beginning of U-Boot. The | |
165 | best time to stop tracing is right at the end. In practice it is hard | |
166 | to achieve these ideals. | |
167 | ||
168 | This implementation enables tracing early in board_init_f(). This means | |
169 | that it captures most of the board init process, missing only the | |
170 | early architecture-specific init. However, it also misses the entire | |
171 | SPL stage if there is one. | |
172 | ||
173 | U-Boot typically ends with a 'bootm' command which loads and runs an | |
174 | OS. There is useful trace data in the execution of that bootm | |
175 | command. Therefore this implementation provides a way to collect trace | |
176 | data after bootm has finished processing, but just before it jumps to | |
177 | the OS. In practical terms, U-Boot runs the 'fakegocmd' environment | |
178 | variable at this point. This variable should have a short script which | |
179 | collects the trace data and writes it somewhere. | |
180 | ||
181 | Trace data collection relies on a microsecond timer, accesed through | |
182 | timer_get_us(). So the first think you should do is make sure that | |
183 | this produces sensible results for your board. Suitable sources for | |
184 | this timer include high resolution timers, PWMs or profile timers if | |
185 | available. Most modern SOCs have a suitable timer for this. Make sure | |
186 | that you mark this timer (and anything it calls) with | |
187 | __attribute__((no_instrument_function)) so that the trace library can | |
188 | use it without causing an infinite loop. | |
189 | ||
190 | ||
191 | Commands | |
192 | -------- | |
193 | ||
194 | The trace command has variable sub-commands: | |
195 | ||
196 | - stats | |
197 | Display tracing statistics | |
198 | ||
199 | - pause | |
200 | Pause tracing | |
201 | ||
202 | - resume | |
203 | Resume tracing | |
204 | ||
205 | - funclist [<addr> <size>] | |
206 | Dump a list of functions into the buffer | |
207 | ||
208 | - calls [<addr> <size>] | |
209 | Dump function call trace into buffer | |
210 | ||
211 | If the address and size are not given, these are obtained from environment | |
212 | variables (see below). In any case the environment variables are updated | |
213 | after the command runs. | |
214 | ||
215 | ||
216 | Environment Variables | |
217 | --------------------- | |
218 | ||
219 | The following are used: | |
220 | ||
221 | - profbase | |
222 | Base address of trace output buffer | |
223 | ||
224 | - profoffset | |
225 | Offset of first unwritten byte in trace output buffer | |
226 | ||
227 | - profsize | |
228 | Size of trace output buffer | |
229 | ||
230 | All of these are set by the 'trace calls' command. | |
231 | ||
232 | These variables keep track of the amount of data written to the trace | |
233 | output buffer by the 'trace' command. The trace commands which write data | |
234 | to the output buffer can use these to specify the buffer to write to, and | |
235 | update profoffset each time. This allows successive commands to append data | |
236 | to the same buffer, for example: | |
237 | ||
238 | trace funclist 10000 e00000 | |
239 | trace calls | |
240 | ||
241 | (the latter command appends more data to the buffer). | |
242 | ||
243 | ||
244 | - fakegocmd | |
245 | Specifies commands to run just before booting the OS. This | |
246 | is a useful time to write the trace data to the host for | |
247 | processing. | |
248 | ||
249 | ||
250 | Writing Out Trace Data | |
251 | ---------------------- | |
252 | ||
253 | Once the trace data is in an output buffer in memory there are various ways | |
254 | to transmit it to the host. Notably you can use tftput to send the data | |
255 | over a network link: | |
256 | ||
257 | fakegocmd=trace pause; usb start; set autoload n; bootp; | |
258 | trace calls 10000000 1000000; | |
259 | tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls | |
260 | ||
261 | This starts up USB (to talk to an attached USB Ethernet dongle), writes | |
262 | a trace log to address 10000000 and sends it to a host machine using | |
263 | TFTP. After this, U-Boot will boot the OS normally, albeit a little | |
264 | later. | |
265 | ||
266 | ||
267 | Converting Trace Output Data | |
268 | ---------------------------- | |
269 | ||
270 | The trace output data is kept in a binary format which is not documented | |
271 | here. To convert it into something useful, you can use proftool. | |
272 | ||
273 | This tool must be given the U-Boot map file and the trace data received | |
274 | from running that U-Boot. It produces a text output file. | |
275 | ||
276 | Options | |
277 | -m <map_file> | |
278 | Specify U-Boot map file | |
279 | ||
280 | -p <trace_file> | |
281 | Specifiy profile/trace file | |
282 | ||
283 | Commands: | |
284 | ||
285 | - dump-ftrace | |
286 | Write a text dump of the file in Linux ftrace format to stdout | |
287 | ||
288 | ||
289 | Viewing the Trace Data | |
290 | ---------------------- | |
291 | ||
292 | You can use pytimechart for this (sudo apt-get pytimechart might work on | |
293 | your Debian-style machine, and use your favourite search engine to obtain | |
294 | documentation). It expects the file to have a .txt extension. The program | |
295 | has terse user interface but is very convenient for viewing U-Boot | |
296 | profile information. | |
297 | ||
298 | ||
299 | Workflow Suggestions | |
300 | -------------------- | |
301 | ||
302 | The following suggestions may be helpful if you are trying to reduce boot | |
303 | time: | |
304 | ||
305 | 1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get | |
306 | you are helpful overall snapshot of the boot time. | |
307 | ||
308 | 2. Build U-Boot with tracing and run it. Note the difference in boot time | |
309 | (it is common for tracing to add 10% to the time) | |
310 | ||
311 | 3. Collect the trace information as descibed above. Use this to find where | |
312 | all the time is being spent. | |
313 | ||
314 | 4. Take a look at that code and see if you can optimise it. Perhaps it is | |
315 | possible to speed up the initialisation of a device, or remove an unused | |
316 | feature. | |
317 | ||
318 | 5. Rebuild, run and collect again. Compare your results. | |
319 | ||
320 | 6. Keep going until you run out of steam, or your boot is fast enough. | |
321 | ||
322 | ||
323 | Configuring Trace | |
324 | ----------------- | |
325 | ||
326 | There are a few parameters in the code that you may want to consider. | |
327 | There is a function call depth limit (set to 15 by default). When the | |
328 | stack depth goes above this then no tracing information is recorded. | |
329 | The maximum depth reached is recorded and displayed by the 'trace stats' | |
330 | command. | |
331 | ||
332 | ||
333 | Future Work | |
334 | ----------- | |
335 | ||
336 | Tracing could be a little tidier in some areas, for example providing | |
337 | run-time configuration options for trace. | |
338 | ||
339 | Some other features that might be useful: | |
340 | ||
341 | - Trace filter to select which functions are recorded | |
342 | - Sample-based profiling using a timer interrupt | |
343 | - Better control over trace depth | |
344 | - Compression of trace information | |
345 | ||
346 | ||
347 | Simon Glass <[email protected]> | |
348 | April 2013 |