Partial Parity Log

Partial Parity Log (PPL) is a feature available for RAID5 arrays. The issue
addressed by PPL is that after a dirty shutdown, parity of a particular stripe
may become inconsistent with data on other member disks. If the array is also
in a degraded state, there is no way to recalculate parity, because one of the
disks is missing. This can lead to silent data corruption when rebuilding the
array or using it as degraded - data calculated from parity for array blocks
that have not been touched by a write request during the unclean shutdown can
be incorrect. This condition is known as the RAID5 Write Hole. Because of
this, md by default does not allow starting a dirty degraded array.

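The write hole can be sketched in a few lines. This is an illustration only,
assuming one-byte chunks and made-up data values; it shows how rebuilding a
missing disk from stale parity silently produces wrong data:

```python
def xor(*chunks):
    out = 0
    for c in chunks:
        out ^= c
    return out

# Clean stripe of 4 data chunks; parity is the XOR of all of them.
d = [0x11, 0x22, 0x33, 0x44]
parity = xor(*d)

# A write updates chunk 0. The new data reaches its disk, but the
# machine crashes before the matching parity update is written.
d[0] = 0x55                  # new data on disk, parity is now stale

# The array is also degraded: disk 2 is missing. Rebuilding its
# contents from the stale parity yields garbage - silently.
rebuilt_d2 = xor(parity, d[0], d[1], d[3])
assert rebuilt_d2 != 0x33    # not the data that was actually there
```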
Partial parity for a write operation is the XOR of stripe data chunks not
modified by this write. It is just enough data for recovering from the write
hole. XORing partial parity with the modified chunks produces parity for the
stripe, consistent with its state before the write operation, regardless of
which chunk writes have completed. If one of the unmodified data disks of
this stripe is missing, this updated parity can be used to recover its
contents. PPL recovery is also performed when starting an array after an
unclean shutdown with all disks available, eliminating the need to resync the
array. Because of this, using a write-intent bitmap and PPL together is not
supported.

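The recovery described above can be sketched the same way, again with one-byte
chunks and illustrative values. The write modifies chunks 0 and 1; only one of
the two writes completes before the crash, yet the logged partial parity still
yields valid parity for the on-disk data:

```python
def xor(*chunks):
    out = 0
    for c in chunks:
        out ^= c
    return out

d = [0x11, 0x22, 0x33, 0x44]

# Before dispatching the write, log the XOR of the chunks the write
# does NOT modify (chunks 2 and 3) - the partial parity.
partial_parity = xor(d[2], d[3])

# Crash mid-write: the new chunk 0 made it to disk, chunk 1 did not.
d[0] = 0x55

# Recovery: XOR the logged partial parity with the modified chunks as
# found on disk. The result is valid parity for the on-disk data, no
# matter which of the chunk writes completed.
parity = xor(partial_parity, d[0], d[1])

# With that parity, a missing unmodified disk (say disk 2) can be
# rebuilt correctly even after the unclean shutdown.
rebuilt_d2 = xor(parity, d[0], d[1], d[3])
assert rebuilt_d2 == 0x33
```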
When handling a write request, PPL writes partial parity before new data and
parity are dispatched to disks. PPL is a distributed log - it is stored on
array member drives in the metadata area, on the parity drive of a particular
stripe. It does not require a dedicated journaling drive. Write performance is
reduced by up to 30%-40%, but it scales with the number of drives in the array
and the journaling drive does not become a bottleneck or a single point of
failure.

Unlike raid5-cache, the other solution in md for closing the write hole, PPL
is not a true journal. It does not protect from losing in-flight data, only
from silent data corruption. If a dirty disk of a stripe is lost, no PPL
recovery is performed for this stripe (parity is not updated), so it is
possible to have arbitrary data in the written part of a stripe if that disk
is lost. In such a case the behavior is the same as in plain raid5.

PPL is available for md version-1 metadata and external (specifically IMSM)
metadata arrays. It can be enabled using the mdadm option
--consistency-policy=ppl.

Currently, the volatile write-back cache should be disabled on all member
drives when using PPL. Otherwise PPL cannot guarantee consistency in case of
a power failure.