Commit | Line | Data |
---|---|---|
2d6fff63 DH |
1 | ========================== |
2 | General Filesystem Caching | |
3 | ========================== | |
4 | ||
5 | ======== | |
6 | OVERVIEW | |
7 | ======== | |
8 | ||
9 | This facility is a general purpose cache for network filesystems, though it | |
10 | could be used for caching other things such as ISO9660 filesystems too. | |
11 | ||
12 | FS-Cache mediates between cache backends (such as CacheFS) and network | |
13 | filesystems: | |
14 | ||
15 | +---------+ | |
16 | | | +--------------+ | |
17 | | NFS |--+ | | | |
18 | | | | +-->| CacheFS | | |
19 | +---------+ | +----------+ | | /dev/hda5 | | |
20 | | | | | +--------------+ | |
21 | +---------+ +-->| | | | |
22 | | | | |--+ | |
23 | | AFS |----->| FS-Cache | | |
24 | | | | |--+ | |
25 | +---------+ +-->| | | | |
26 | | | | | +--------------+ | |
27 | +---------+ | +----------+ | | | | |
28 | | | | +-->| CacheFiles | | |
29 | | ISOFS |--+ | /var/cache | | |
30 | | | +--------------+ | |
31 | +---------+ | |
32 | ||
33 | Or to look at it another way, FS-Cache is a module that provides a caching | |
34 | facility to a network filesystem such that the cache is transparent to the | |
35 | user: | |
36 | ||
37 | +---------+ | |
38 | | | | |
39 | | Server | | |
40 | | | | |
41 | +---------+ | |
42 | | NETWORK | |
43 | ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
44 | | | |
45 | | +----------+ | |
46 | V | | | |
47 | +---------+ | | | |
48 | | | | | | |
49 | | NFS |----->| FS-Cache | | |
50 | | | | |--+ | |
51 | +---------+ | | | +--------------+ +--------------+ | |
52 | | | | | | | | | | |
53 | V +----------+ +-->| CacheFiles |-->| Ext3 | | |
54 | +---------+ | /var/cache | | /dev/sda6 | | |
55 | | | +--------------+ +--------------+ | |
56 | | VFS | ^ ^ | |
57 | | | | | | |
58 | +---------+ +--------------+ | | |
59 | | KERNEL SPACE | | | |
60 | ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~ | |
61 | | USER SPACE | | | |
62 | V | | | |
63 | +---------+ +--------------+ | |
64 | | | | | | |
65 | | Process | | cachefilesd | | |
66 | | | | | | |
67 | +---------+ +--------------+ | |
68 | ||
69 | ||
70 | FS-Cache does not follow the idea of completely loading every netfs file | |
71 | opened in its entirety into a cache before permitting it to be accessed and | |
72 | then serving the pages out of that cache rather than the netfs inode because: | |
73 | ||
74 | (1) It must be practical to operate without a cache. | |
75 | ||
76 | (2) The size of any accessible file must not be limited to the size of the | |
77 | cache. | |
78 | ||
79 | (3) The combined size of all opened files (this includes mapped libraries) | |
80 | must not be limited to the size of the cache. | |
81 | ||
82 | (4) The user should not be forced to download an entire file just to do a | |
83 | one-off access of a small portion of it (such as might be done with the | |
84 | "file" program). | |
85 | ||
86 | It instead serves the cache out in PAGE_SIZE chunks as and when requested by | |
87 | the netfs('s) using it. | |
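As a rough illustration of the PAGE_SIZE granularity described above, the page index that covers a given byte offset is the offset divided by the page size. This is only a sketch in shell arithmetic: the helper name page_index is hypothetical, and the page size is read from `getconf PAGESIZE` unless one is passed explicitly.

```shell
# Hypothetical helper: map a byte offset within a file to the index of the
# PAGE_SIZE chunk that covers it.
# $1 = byte offset, $2 = page size in bytes (defaults to the system page size).
page_index() {
    offset=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( offset / page_size ))
}

# Example with an assumed 4096-byte page size:
page_index 10000 4096
```

A one-off read of a few bytes at that offset would therefore only involve that one page, not the whole file.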
88 | ||
89 | ||
90 | FS-Cache provides the following facilities: | |
91 | ||
92 | (1) More than one cache can be used at once. Caches can be selected | |
93 | explicitly by use of tags. | |
94 | ||
95 | (2) Caches can be added / removed at any time. | |
96 | ||
97 | (3) The netfs is provided with an interface that allows either party to | |
98 | withdraw caching facilities from a file (required for (2)). | |
99 | ||
100 | (4) The interface to the netfs returns as few errors as possible, preferring | |
101 | rather to let the netfs remain oblivious. | |
102 | ||
103 | (5) Cookies are used to represent indices, files and other objects to the | |
104 | netfs. The simplest cookie is just a NULL pointer - indicating nothing | |
105 | cached there. | |
106 | ||
107 | (6) The netfs is allowed to propose - dynamically - any index hierarchy it | |
108 | desires, though it must be aware that the index search function is | |
109 | recursive, stack space is limited, and indices can only be children of | |
110 | indices. | |
111 | ||
112 | (7) Data I/O is done direct to and from the netfs's pages. The netfs | |
113 | indicates that page A is at index B of the data-file represented by cookie | |
114 | C, and that it should be read or written. The cache backend may or may | |
115 | not start I/O on that page, but if it does, a netfs callback will be | |
116 | invoked to indicate completion. The I/O may be either synchronous or | |
117 | asynchronous. | |
118 | ||
119 | (8) Cookies can be "retired" upon release. At this point FS-Cache will mark | |
120 | them as obsolete and the index hierarchy rooted at that point will get | |
121 | recycled. | |
122 | ||
123 | (9) The netfs provides a "match" function for index searches. In addition to | |
124 | saying whether a match was made or not, this can also specify that an | |
125 | entry should be updated or deleted. | |
126 | ||
127 | (10) As much as possible is done asynchronously. | |
128 | ||
129 | ||
130 | FS-Cache maintains a virtual indexing tree in which all indices, files, objects | |
131 | and pages are kept. Bits of this tree may actually reside in one or more | |
132 | caches. | |
133 | ||
134 | FSDEF | |
135 | | | |
136 | +------------------------------------+ | |
137 | | | | |
138 | NFS AFS | |
139 | | | | |
140 | +--------------------------+ +-----------+ | |
141 | | | | | | |
142 | homedir mirror afs.org redhat.com | |
143 | | | | | |
144 | +------------+ +---------------+ +----------+ | |
145 | | | | | | | | |
146 | 00001 00002 00007 00125 vol00001 vol00002 | |
147 | | | | | | | |
148 | +---+---+ +-----+ +---+ +------+------+ +-----+----+ | |
149 | | | | | | | | | | | | | | | |
150 | PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak | |
151 | | | | |
152 | PG0 +-------+ | |
153 | | | | |
154 | 00001 00003 | |
155 | | | |
156 | +---+---+ | |
157 | | | | | |
158 | PG0 PG1 PG2 | |
159 | ||
160 | In the example above, you can see two netfs's being backed: NFS and AFS. These | |
161 | have different index hierarchies: | |
162 | ||
163 | (*) The NFS primary index contains per-server indices. Each server index is | |
164 | indexed by NFS file handles to get data file objects. Each data file | |
165 | object can have an array of pages, but may also have further child | |
166 | objects, such as extended attributes and directory entries. Extended | |
167 | attribute objects themselves have page-array contents. | |
168 | ||
169 | (*) The AFS primary index contains per-cell indices. Each cell index contains | |
170 | per-logical-volume indices. Each volume index contains up to three | |
171 | indices for the read-write, read-only and backup mirrors of those volumes. | |
172 | Each of these contains vnode data file objects, each of which contains an | |
173 | array of pages. | |
174 | ||
175 | The very top index is the FS-Cache master index in which individual netfs's | |
176 | have entries. | |
177 | ||
178 | Any index object may reside in more than one cache, provided it only has index | |
179 | children. Any index with non-index object children will be assumed to only | |
180 | reside in one cache. | |
181 | ||
182 | ||
183 | The netfs API to FS-Cache can be found in: | |
184 | ||
185 | Documentation/filesystems/caching/netfs-api.txt | |
186 | ||
187 | The cache backend API to FS-Cache can be found in: | |
188 | ||
189 | Documentation/filesystems/caching/backend-api.txt | |
190 | ||
36c95590 DH |
191 | A description of the internal representations and object state machine can be |
192 | found in: | |
193 | ||
194 | Documentation/filesystems/caching/object.txt | |
195 | ||
2d6fff63 DH |
196 | |
197 | ======================= | |
198 | STATISTICAL INFORMATION | |
199 | ======================= | |
200 | ||
201 | If FS-Cache is compiled with the following options enabled: | |
202 | ||
2d6fff63 DH |
203 | CONFIG_FSCACHE_STATS=y |
204 | CONFIG_FSCACHE_HISTOGRAM=y | |
205 | ||
206 | then it will gather certain statistics and display them through a number of | |
207 | proc files. | |
208 | ||
209 | (*) /proc/fs/fscache/stats | |
210 | ||
211 | This shows counts of a number of events that can happen in FS-Cache: | |
212 | ||
213 | CLASS EVENT MEANING | |
214 | ======= ======= ======================================================= | |
215 | Cookies idx=N Number of index cookies allocated | |
216 | dat=N Number of data storage cookies allocated | |
217 | spc=N Number of special cookies allocated | |
218 | Objects alc=N Number of objects allocated | |
219 | nal=N Number of object allocation failures | |
220 | avl=N Number of objects that reached the available state | |
221 | ded=N Number of objects that reached the dead state | |
222 | ChkAux non=N Number of objects that didn't have a coherency check | |
223 | ok=N Number of objects that passed a coherency check | |
224 | upd=N Number of objects that needed a coherency data update | |
225 | obs=N Number of objects that were declared obsolete | |
226 | Pages mrk=N Number of pages marked as being cached | |
227 | unc=N Number of uncache page requests seen | |
228 | Acquire n=N Number of acquire cookie requests seen | |
229 | nul=N Number of acq reqs given a NULL parent | |
230 | noc=N Number of acq reqs rejected due to no cache available | |
231 | ok=N Number of acq reqs succeeded | |
232 | nbf=N Number of acq reqs rejected due to error | |
233 | oom=N Number of acq reqs failed on ENOMEM | |
234 | Lookups n=N Number of lookup calls made on cache backends | |
235 | neg=N Number of negative lookups made | |
236 | pos=N Number of positive lookups made | |
237 | crt=N Number of objects created by lookup | |
fee096de | 238 | tmo=N Number of lookups timed out and requeued |
2d6fff63 DH |
239 | Updates n=N Number of update cookie requests seen |
240 | nul=N Number of upd reqs given a NULL parent | |
241 | run=N Number of upd reqs granted CPU time | |
242 | Relinqs n=N Number of relinquish cookie requests seen | |
243 | nul=N Number of rlq reqs given a NULL parent | |
244 | wcr=N Number of rlq reqs waited on completion of creation | |
245 | AttrChg n=N Number of attribute changed requests seen | |
246 | ok=N Number of attr changed requests queued | |
247 | nbf=N Number of attr changed rejected -ENOBUFS | |
248 | oom=N Number of attr changed failed -ENOMEM | |
249 | run=N Number of attr changed ops given CPU time | |
250 | Allocs n=N Number of allocation requests seen | |
251 | ok=N Number of successful alloc reqs | |
252 | wt=N Number of alloc reqs that waited on lookup completion | |
253 | nbf=N Number of alloc reqs rejected -ENOBUFS | |
5753c441 | 254 | int=N Number of alloc reqs aborted -ERESTARTSYS |
2d6fff63 DH |
255 | ops=N Number of alloc reqs submitted |
256 | owt=N Number of alloc reqs waited for CPU time | |
60d543ca | 257 | abt=N Number of alloc reqs aborted due to object death |
2d6fff63 DH |
258 | Retrvls n=N Number of retrieval (read) requests seen |
259 | ok=N Number of successful retr reqs | |
260 | wt=N Number of retr reqs that waited on lookup completion | |
261 | nod=N Number of retr reqs returned -ENODATA | |
262 | nbf=N Number of retr reqs rejected -ENOBUFS | |
263 | int=N Number of retr reqs aborted -ERESTARTSYS | |
264 | oom=N Number of retr reqs failed -ENOMEM | |
265 | ops=N Number of retr reqs submitted | |
266 | owt=N Number of retr reqs waited for CPU time | |
60d543ca | 267 | abt=N Number of retr reqs aborted due to object death |
2d6fff63 DH |
268 | Stores n=N Number of storage (write) requests seen |
269 | ok=N Number of successful store reqs | |
270 | agn=N Number of store reqs on a page already pending storage | |
271 | nbf=N Number of store reqs rejected -ENOBUFS | |
272 | oom=N Number of store reqs failed -ENOMEM | |
273 | ops=N Number of store reqs submitted | |
274 | run=N Number of store reqs granted CPU time | |
1bccf513 DH |
275 | pgs=N Number of pages given store req processing time |
276 | rxd=N Number of store reqs deleted from tracking tree | |
277 | olm=N Number of store reqs over store limit | |
201a1542 DH |
278 | VmScan nos=N Number of release reqs against pages with no pending store |
279 | gon=N Number of release reqs against pages stored by time lock granted | |
280 | bsy=N Number of release reqs ignored due to in-progress store | |
281 | can=N Number of page stores cancelled due to release req | |
2d6fff63 DH |
282 | Ops pend=N Number of times async ops added to pending queues |
283 | run=N Number of times async ops given CPU time | |
284 | enq=N Number of times async ops queued for processing | |
5753c441 | 285 | can=N Number of async ops cancelled |
e3d4d28b | 286 | rej=N Number of async ops rejected due to object lookup/create failure |
2d6fff63 DH |
287 | dfr=N Number of async ops queued for deferred release |
288 | rel=N Number of async ops released | |
289 | gc=N Number of deferred-release async ops garbage collected | |
52bd75fd DH |
290 | CacheOp alo=N Number of in-progress alloc_object() cache ops |
291 | luo=N Number of in-progress lookup_object() cache ops | |
292 | luc=N Number of in-progress lookup_complete() cache ops | |
293 | gro=N Number of in-progress grab_object() cache ops | |
294 | upo=N Number of in-progress update_object() cache ops | |
295 | dro=N Number of in-progress drop_object() cache ops | |
296 | pto=N Number of in-progress put_object() cache ops | |
297 | syn=N Number of in-progress sync_cache() cache ops | |
298 | atc=N Number of in-progress attr_changed() cache ops | |
299 | rap=N Number of in-progress read_or_alloc_page() cache ops | |
300 | ras=N Number of in-progress read_or_alloc_pages() cache ops | |
301 | alp=N Number of in-progress allocate_page() cache ops | |
302 | als=N Number of in-progress allocate_pages() cache ops | |
303 | wrp=N Number of in-progress write_page() cache ops | |
304 | ucp=N Number of in-progress uncache_page() cache ops | |
305 | dsp=N Number of in-progress dissociate_pages() cache ops | |
2d6fff63 DH |
306 | |
307 | ||
308 | (*) /proc/fs/fscache/histogram | |
309 | ||
310 | cat /proc/fs/fscache/histogram | |
7394daa8 | 311 | JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS |
2d6fff63 DH |
312 | ===== ===== ========= ========= ========= ========= ========= |
313 | ||
314 | This shows, for a variety of tasks, how many times each task took a given | |
315 | amount of time to run, from 0 jiffies up to HZ-1 jiffies. The | |
316 | columns are as follows: | |
317 | ||
318 | COLUMN TIME MEASUREMENT | |
319 | ======= ======================================================= | |
320 | OBJ INST Length of time to instantiate an object | |
321 | OP RUNS Length of time a call to process an operation took | |
322 | OBJ RUNS Length of time a call to process an object event took | |
323 | RETRV DLY Time between requesting a read and lookup completing | |
324 | RETRIEVLS Time between beginning and end of a retrieval | |
325 | ||
326 | Each row shows the number of events that took a particular range of times. | |
7394daa8 DH |
327 | Each step is 1 jiffy in size. The JIFS column indicates the particular |
328 | jiffy range covered, and the SECS field the equivalent number of seconds. | |
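The relationship between the JIFS and SECS columns can be sketched as a division by HZ. The helper name jiffies_to_secs is hypothetical, and HZ has to be supplied by the caller; the value 250 below is an assumption for illustration only, not something the histogram file reports.

```shell
# Hypothetical helper: convert a jiffy count to seconds for a given HZ.
# $1 = jiffies, $2 = HZ (kernel tick rate, supplied by the caller; it is not
# the same thing as the userspace CLK_TCK value).
jiffies_to_secs() {
    LC_ALL=C awk -v j="$1" -v hz="$2" 'BEGIN { printf "%.3f\n", j / hz }'
}

# With an assumed HZ of 250, the rows cover 0..249 jiffies:
jiffies_to_secs 1 250     # a single 1-jiffy step
jiffies_to_secs 249 250   # the last row, HZ-1 jiffies
```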
2d6fff63 DH |
329 | |
330 | ||
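The counters listed for /proc/fs/fscache/stats can be flattened into "Class.event value" pairs for scripting. This is a minimal sketch, not a definitive parser: the helper name fscache_stats_flat is hypothetical, and the line layout (class name, optional colon, then event=N fields) is an assumption, since the table above gives only the class and event names.

```shell
# Flatten stats text into "Class.event value" pairs, one per line.
# Reads stdin so it can be tried on sample text as well as the real file.
# ASSUMPTION: each line starts with the class name (optionally followed by
# a colon) and then carries whitespace-separated event=N fields.
fscache_stats_flat() {
    awk '{
        class = $1
        sub(/:$/, "", class)
        for (i = 2; i <= NF; i++) {
            n = index($i, "=")
            if (n > 0)
                print class "." substr($i, 1, n - 1), substr($i, n + 1)
        }
    }'
}

# Usage on a live system (requires CONFIG_FSCACHE_STATS=y):
#   fscache_stats_flat </proc/fs/fscache/stats
# Usage on made-up sample text:
printf 'Cookies: idx=2 dat=5 spc=0\nPages  : mrk=7 unc=7\n' | fscache_stats_flat
```

Lines without any event=N field, such as a title line, produce no output.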
4fbf4291 DH |
331 | =========== |
332 | OBJECT LIST | |
333 | =========== | |
334 | ||
335 | If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a | |
336 | list of all the objects currently allocated and allow them to be viewed | |
337 | through: | |
338 | ||
339 | /proc/fs/fscache/objects | |
340 | ||
341 | This will look something like: | |
342 | ||
343 | [root@andromeda ~]# head /proc/fs/fscache/objects | |
344 | OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA | |
345 | ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================ | |
8b8edefa TH |
346 | 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a |
347 | 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a | |
4fbf4291 DH |
348 | |
349 | where the first set of columns before the '|' describe the object: | |
350 | ||
351 | COLUMN DESCRIPTION | |
352 | ======= =============================================================== | |
353 | OBJECT Object debugging ID (appears as OBJ%x in some debug messages) | |
354 | PARENT Debugging ID of parent object | |
355 | STAT Object state | |
356 | CHLDN Number of child objects of this object | |
357 | OPS Number of outstanding operations on this object | |
358 | OOP Number of outstanding child object management operations | |
359 | IPR | |
360 | EX Number of outstanding exclusive operations | |
361 | READS Number of outstanding read operations | |
362 | EM Object's event mask | |
363 | EV Events raised on this object | |
364 | F Object flags | |
8b8edefa | 365 | S Object work item busy state mask (1:pending 2:running) |
4fbf4291 DH |
366 | |
367 | and the second set of columns describe the object's cookie, if present: | |
368 | ||
369 | COLUMN DESCRIPTION | |
370 | =============== ======================================================= | |
371 | NETFS_COOKIE_DEF Name of netfs cookie definition | |
372 | TY Cookie type (IX - index, DT - data, hex - special) | |
373 | FL Cookie flags | |
374 | NETFS_DATA Netfs private data stored in the cookie | |
375 | OBJECT_KEY Object key } 1 column, with separating comma | |
376 | AUX_DATA Object aux data } presence may be configured | |
377 | ||
378 | The data shown may be filtered by attaching a key to an appropriate keyring | |
379 | before viewing the file. Something like: | |
380 | ||
381 | keyctl add user fscache:objlist <restrictions> @s | |
382 | ||
383 | where <restrictions> are a selection of the following letters: | |
384 | ||
385 | K Show hexdump of object key (don't show if not given) | |
386 | A Show hexdump of object aux data (don't show if not given) | |
387 | ||
388 | and the following paired letters: | |
389 | ||
390 | C Show objects that have a cookie | |
391 | c Show objects that don't have a cookie | |
392 | B Show objects that are busy | |
393 | b Show objects that aren't busy | |
394 | W Show objects that have pending writes | |
395 | w Show objects that don't have pending writes | |
396 | R Show objects that have outstanding reads | |
397 | r Show objects that don't have outstanding reads | |
8b8edefa TH |
398 | S Show objects that have work queued |
399 | s Show objects that don't have work queued | |
4fbf4291 DH |
400 | |
401 | If neither side of a letter pair is given, then both are implied. For example: | |
402 | ||
403 | keyctl add user fscache:objlist KB @s | |
404 | ||
405 | shows objects that are busy, and lists their object keys, but does not dump | |
406 | their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is | |
407 | not implied. | |
408 | ||
409 | By default all objects and all fields will be shown. | |
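The pairing rule above can be sketched as a small shell function that expands a restriction string into the full set of letters in effect. The helper name objlist_effective is hypothetical and is not part of keyctl or FS-Cache; it only models the case where a key is attached, and simply restates the rule that a pair with neither side given implies both sides.

```shell
# Hypothetical helper: expand an fscache:objlist restriction string into the
# full set of letters in effect when a key carrying that string is attached.
objlist_effective() {
    given=$1
    result=""
    # K and A are standalone: in effect only when given explicitly.
    for letter in K A; do
        case $given in *"$letter"*) result="$result$letter" ;; esac
    done
    # Paired letters: keep whichever sides were given, else imply both.
    for pair in Cc Bb Ww Rr Ss; do
        upper=${pair%?}
        lower=${pair#?}
        case $given in
            *"$upper"*"$lower"*|*"$lower"*"$upper"*) result="$result$pair" ;;
            *"$upper"*) result="$result$upper" ;;
            *"$lower"*) result="$result$lower" ;;
            *) result="$result$pair" ;;
        esac
    done
    echo "$result"
}

# The "KB" example from the text: K and B as given, plus the implied
# "CcWwRrSs" pairs, with neither A nor b.
objlist_effective KB
```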
410 | ||
411 | ||
2d6fff63 DH |
412 | ========= |
413 | DEBUGGING | |
414 | ========= | |
415 | ||
7394daa8 DH |
416 | If CONFIG_FSCACHE_DEBUG is enabled, the FS-Cache facility can have runtime |
417 | debugging enabled by adjusting the value in: | |
2d6fff63 DH |
418 | |
419 | /sys/module/fscache/parameters/debug | |
420 | ||
421 | This is a bitmask of debugging streams to enable: | |
422 | ||
423 | BIT VALUE STREAM POINT | |
424 | ======= ======= =============================== ======================= | |
425 | 0 1 Cache management Function entry trace | |
426 | 1 2 Function exit trace | |
427 | 2 4 General | |
428 | 3 8 Cookie management Function entry trace | |
429 | 4 16 Function exit trace | |
430 | 5 32 General | |
431 | 6 64 Page handling Function entry trace | |
432 | 7 128 Function exit trace | |
433 | 8 256 General | |
434 | 9 512 Operation management Function entry trace | |
435 | 10 1024 Function exit trace | |
436 | 11 2048 General | |
437 | ||
438 | The appropriate set of values should be OR'd together and the result written to | |
439 | the control file. For example: | |
440 | ||
441 | echo $((1|8|64|512)) >/sys/module/fscache/parameters/debug | |
442 | ||
443 | will turn on all function entry debugging. |
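The mask can also be built from the BIT column rather than the VALUE column. The following is a minimal sketch; the helper name fscache_debug_mask is hypothetical, and the bit numbers are taken from the table above, where function entry trace is bits 0, 3, 6 and 9.

```shell
# Hypothetical helper: OR together the values for a list of bit numbers from
# the debug table and print the resulting mask.
fscache_debug_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    echo "$mask"
}

# Function entry trace for all four streams in the table: bits 0, 3, 6, 9.
fscache_debug_mask 0 3 6 9

# To apply it (as root, with CONFIG_FSCACHE_DEBUG enabled):
#   fscache_debug_mask 0 3 6 9 >/sys/module/fscache/parameters/debug
```

Writing 0 to the control file clears every stream, since no bits are then set.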