Commit | Line | Data |
---|---|---|
a2681707 WD |
1 | The most frequent cause of problems when porting U-Boot to new |
2 | hardware, or when using a sloppy port on some board, is memory errors. | |
3 | In most cases these are not caused by failing hardware, but by | |
4 | incorrect initialization of the memory controller. It is therefore | |
5 | a good idea to always test whether the memory is working correctly | |
6 | before looking for any other potential causes of a problem. | |
7 | ||
8 | U-Boot implements three different approaches to memory testing: | |
9 | ||
10 | 1. The get_ram_size() function (see "common/memsize.c"). | |
11 | ||
12 | This function is supposed to be used in each and every U-Boot port | |
13 | to determine the presence and actual size of each of the potential | |
14 | memory banks on this piece of hardware. The code is supposed to be | |
15 | very fast, so running it for each reboot does not hurt. It is a | |
16 | little known and generally underrated fact that this code will also | |
17 | catch 99% of hardware-related (i.e. reliably reproducible) memory | |
18 | errors. It is strongly recommended to always use this function, in | |
19 | each and every port of U-Boot. | |
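
To illustrate why a size probe also catches many wiring faults, here is a
minimal sketch of the power-of-two probing idea, run against a simulated
bank instead of real hardware. It is not the code from "common/memsize.c".
The names sim_bank, sim_read, sim_write, probe_size_words and
SIM_MAX_WORDS are hypothetical, and the simulation assumes that missing
memory shows up as address wrap-around (aliasing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_WORDS 1024u	/* hypothetical maximum bank size, in words */

/* Simulated bank: only real_words words exist, higher offsets wrap around. */
struct sim_bank {
	uint32_t cells[SIM_MAX_WORDS];
	size_t real_words;	/* power of two, <= SIM_MAX_WORDS */
};

static uint32_t sim_read(const struct sim_bank *b, size_t idx)
{
	return b->cells[idx & (b->real_words - 1)];
}

static void sim_write(struct sim_bank *b, size_t idx, uint32_t val)
{
	b->cells[idx & (b->real_words - 1)] = val;
}

/* Returns the detected size in words, or 0 if word 0 does not hold data. */
size_t probe_size_words(struct sim_bank *b, size_t max_words)
{
	uint32_t saved[64], saved0;
	size_t cnt, size = max_words;
	int i = 0;

	/* Assumption of this sketch: max_words is a power of two. */
	assert(max_words != 0 && (max_words & (max_words - 1)) == 0);

	/* Mark every power-of-two offset, highest first, saving old data. */
	for (cnt = max_words >> 1; cnt > 0; cnt >>= 1) {
		saved[i++] = sim_read(b, cnt);
		sim_write(b, cnt, (uint32_t)~(uint32_t)cnt);
	}
	saved0 = sim_read(b, 0);
	sim_write(b, 0, 0);

	if (sim_read(b, 0) != 0) {
		size = 0;
	} else {
		/* The first offset whose marker is gone aliases lower memory. */
		for (cnt = 1; cnt < max_words; cnt <<= 1) {
			if (sim_read(b, cnt) != (uint32_t)~(uint32_t)cnt) {
				size = cnt;
				break;
			}
		}
	}

	/* Restore in reverse order of the writes so aliased cells recover. */
	sim_write(b, 0, saved0);
	for (cnt = 1; cnt < max_words; cnt <<= 1)
		sim_write(b, cnt, saved[--i]);

	return size;
}
```

Because every power-of-two offset exercises exactly one address line, a
missing or miswired line tends to show up as an unexpected size. In the
simulation, a bank with real_words set to 256 probed with max_words of
1024 is reported as 256 words, and the previous contents are restored.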
20 | ||
21 | 2. The "mtest" command. | |
22 | ||
23 | This is probably the best known memory test utility in U-Boot. | |
24 | Unfortunately, it is also the most problematic, and the most | |
25 | useless one. | |
26 | ||
27 | There are a number of serious problems with this command: | |
28 | ||
29 | - It is terribly slow. Running "mtest" on the whole system RAM | |
30 | takes a _long_ time before there is any significance in the fact | |
31 | that no errors have been found so far. | |
32 | ||
33 | - It is difficult to configure and to use, and any errors here | |
34 | will reliably crash or hang your system. "mtest" is dumb and has | |
35 | no knowledge about memory ranges that may be in use for other | |
36 | purposes, like exception code, U-Boot code and data, stack, | |
37 | malloc arena, video buffer, log buffer, etc. If you let it, it | |
38 | will happily "test" all such areas, which of course will cause | |
39 | some problems. | |
40 | ||
41 | - Because it is not easy to configure, a large number of | |
42 | systems are seriously misconfigured. The original idea was to | |
43 | test basically the whole system RAM, exempting only the | |
44 | areas used by U-Boot itself - on most systems these are the areas | |
45 | used for the exception vectors (usually at the very low end of | |
46 | system memory) and for U-Boot (code, data, etc. - see above; | |
47 | these are usually at the very upper end of system memory). But | |
48 | experience has shown that a very large number of ports use | |
49 | pretty much bogus settings of CONFIG_SYS_MEMTEST_START and | |
50 | CONFIG_SYS_MEMTEST_END; this results in useless tests (because | |
51 | the range is too small and/or badly located) or in critical | |
52 | failures (system crashes). | |
53 | ||
54 | Because of these issues, the "mtest" command is considered | |
55 | deprecated. It should not be enabled in most normal ports of | |
56 | U-Boot, especially not in production. If you really need a | |
57 | memory test, then see items 1 (above) and 3 (below). | |
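
The range problem described above can be made concrete with a small
check that "mtest" itself does not perform. The following is only a
sketch: the struct mem_region type and the range_is_safe() helper are
hypothetical and not part of U-Boot, and the list of reserved regions
would have to come from knowledge of the specific board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a region that must not be overwritten. */
struct mem_region {
	uintptr_t start;	/* first address of the region */
	uintptr_t end;		/* one past the last address of the region */
};

/*
 * Returns true if the test range [start, end) is non-empty and does not
 * overlap any of the n reserved regions.
 */
bool range_is_safe(uintptr_t start, uintptr_t end,
		   const struct mem_region *reserved, size_t n)
{
	size_t i;

	assert(reserved != NULL || n == 0);

	if (start >= end)
		return false;

	for (i = 0; i < n; i++) {
		/* Two half-open ranges overlap if each starts before the other ends. */
		if (start < reserved[i].end && reserved[i].start < end)
			return false;
	}
	return true;
}
```

For example, on an imaginary board with exception vectors in the first
0x3000 bytes and U-Boot relocated to the top 1 MiB of 128 MiB of RAM, a
test range of 0x00100000 to 0x07f00000 passes this check, while a range
covering all of RAM does not.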
58 | ||
59 | 3. The most thorough memory test facility is available as part of the | |
60 | POST (Power-On Self Test) sub-system, see "post/drivers/memory.c". | |
61 | ||
62 | If you really need to perform memory tests (for example, because | |
63 | it is a mandatory part of your requirement specification), then | |
64 | enable this test, which is generic and should work on all | |
65 | architectures. | |
66 | ||
67 | WARNING: | |
68 | ||
69 | It should be pointed out that _all_ these memory tests have one | |
70 | fundamental, unfixable design flaw: they are based on the assumption | |
71 | that memory errors can be found by writing to and reading from memory. | |
72 | Unfortunately, this is only true for the relatively harmless, usually | |
73 | static errors like shorts between data or address lines, unconnected | |
74 | pins, etc. All the really nasty errors which will first turn your | |
75 | hair gray, only to make you tear it out later, are dynamic errors, | |
76 | which usually happen not with simple read or write cycles on the bus, | |
77 | but when performing back-to-back data transfers in burst mode. Such | |
78 | accesses usually happen only for certain DMA operations, or for heavy | |
79 | cache use (instruction fetching, cache flushing). So far I am not | |
80 | aware of any freely available code that implements a generic, and | |
81 | efficient, memory test like that. The best known test case to stress | |
82 | a system like that is to boot Linux with the root file system mounted over | |
83 | NFS, and then build some larger software package natively (say, | |
84 | compile a Linux kernel on the system) - this will cause enough context | |
85 | switches, network traffic (and thus DMA transfers from the network | |
86 | controller), varying RAM use, etc. to trigger any weak spots in this | |
87 | area. | |
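
To make the "static" class of errors concrete, here is a minimal sketch
of a walking-ones data line check of the kind such write/read tests are
built on. The function name data_bus_walking_ones is hypothetical and
the code is not taken from U-Boot. It uses single, isolated write and
read cycles only, which is exactly why a test like this can find a
shorted or unconnected data line but says nothing about burst-mode
behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walking-ones test of the data lines at one address.
 * Returns 0 if every bit could be set on its own, else the first
 * pattern that did not read back correctly.
 */
uint32_t data_bus_walking_ones(volatile uint32_t *addr)
{
	uint32_t pattern;

	assert(addr != NULL);

	/* Shift a single 1 bit through all 32 positions. */
	for (pattern = 1; pattern != 0; pattern <<= 1) {
		*addr = pattern;
		if (*addr != pattern)
			return pattern;
	}
	return 0;
}
```

On real hardware such a check would normally have to run with the data
cache disabled for the tested address, since otherwise the read may be
served from the cache rather than from the memory chips.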
88 | ||
89 | Note: An attempt was made once to implement such a test to catch | |
90 | memory problems on a specific board. The code is pretty much board | |
91 | specific (for example, it includes setting specific GPIO signals to | |
92 | provide triggers for an attached logic analyzer), but you can get an | |
93 | idea of how it works: see "examples/standalone/test_burst*". | |
94 | ||
95 | Note 2: Ironically enough, the "test_burst" code did not catch any RAM | |
96 | errors, not a single one ever. The problems this code was supposed | |
97 | to catch did not happen when accessing the RAM, but when reading from | |
98 | NOR flash. |