On x86, we *do* still use the non-NOP rmb()/wmb() for IO barriers,
but even that is generally questionable.
Leave them around for historical reasons unless somebody can point to a
case where they care about the performance, but tweak the
comment so people don't think they are strictly required in all
cases.