pglazyfreed (npn)
Amount of reclaimed lazyfree pages
+ swpin_zero
+ Number of pages swapped into memory and filled with zero, where I/O
+ was optimized out because the page content was detected to be zero
+ during swapout.
+
+ swpout_zero
+ Number of zero-filled pages swapped out with I/O skipped due to the
+ content being detected as zero.
+
zswpin
Number of pages moved in to memory from zswap.
a queue (device) has been associated with the bio and
before submission.
- wbc_account_cgroup_owner(@wbc, @page, @bytes)
+ wbc_account_cgroup_owner(@wbc, @folio, @bytes)
Should be called for each data segment being written out.
While this function doesn't care exactly when it's called
during the writeback session, it's the easiest and most
S: Maintained
F: drivers/net/ethernet/alteon/acenic*
-ACER ASPIRE 1 EMBEDDED CONTROLLER DRIVER
-S: Maintained
-F: Documentation/devicetree/bindings/platform/acer,aspire1-ec.yaml
-F: drivers/platform/arm64/acer-aspire1-ec.c
-
ACER ASPIRE ONE TEMPERATURE AND FAN DRIVER
ALLWINNER DMIC DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/allwinner,sun50i-h6-dmic.yaml
F: sound/soc/sunxi/sun50i-dmic.c
ALPHA PORT
S: Odd Fixes
F: drivers/hid/amd-sfh-hid/
AMD SPI DRIVER
-S: Maintained
+S: Supported
F: drivers/spi/spi-amd.c
AMD XGBE DRIVER
ANALOG DEVICES INC ASOC CODEC DRIVERS
S: Supported
W: http://wiki.analog.com/
W: https://ez.analog.com/linux-software-drivers
AOA (Apple Onboard Audio) ALSA DRIVER
S: Maintained
F: sound/aoa/
ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS)
S: Maintained
P: Documentation/process/maintainer-soc.rst
C: irc://irc.libera.chat/armlinux
ARM/Amlogic Meson SoC Sound Drivers
S: Maintained
F: Documentation/devicetree/bindings/sound/amlogic*
F: sound/soc/meson/
ARM/APPLE MACHINE SOUND DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/adi,ssm3515.yaml
F: Documentation/devicetree/bindings/sound/apple,*
S: Maintained
F: arch/arm/mach-ep93xx/ts72xx.c
-ARM/CIRRUS LOGIC CLPS711X ARM ARCHITECTURE
-S: Odd Fixes
-N: clps711x
-
ARM/CIRRUS LOGIC EP93XX ARM ARCHITECTURE
F: Documentation/devicetree/bindings/bus/qcom*
F: Documentation/devicetree/bindings/cache/qcom,llcc.yaml
F: Documentation/devicetree/bindings/firmware/qcom,scm.yaml
-F: Documentation/devicetree/bindings/reserved-memory/qcom
+F: Documentation/devicetree/bindings/reserved-memory/qcom*
F: Documentation/devicetree/bindings/soc/qcom/
F: arch/arm/boot/dts/qcom/
F: arch/arm/configs/qcom_defconfig
AXENTIA ASOC DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/axentia,*
F: sound/soc/atmel/tse850-pcm5142.c
AXI PWM GENERATOR
S: Supported
W: https://ez.analog.com/linux-software-drivers
F: include/linux/backlight.h
F: include/linux/pwm_backlight.h
-BAIKAL-T1 PVT HARDWARE MONITOR DRIVER
-S: Supported
-F: Documentation/devicetree/bindings/hwmon/baikal,bt1-pvt.yaml
-F: Documentation/hwmon/bt1-pvt.rst
-F: drivers/hwmon/bt1-pvt.[ch]
-
BARCO P50 GPIO DRIVER
BT87X AUDIO DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: Documentation/sound/cards/bt87x.rst
C-MEDIA CMI8788 DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: sound/pci/oxygen/
DESIGNWARE EDMA CORE IP DRIVER
S: Maintained
F: drivers/dma/dw-edma/
DRM GPU SCHEDULER
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
EDIROL UA-101/UA-1000 DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: sound/usb/misc/ua101.c
FIREWIRE AUDIO DRIVERS and IEC 61883-1/6 PACKET STREAMING ENGINE
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: include/uapi/sound/firewire.h
FOCUSRITE SCARLETT2 MIXER DRIVER (Scarlett Gen 2+ and Clarett)
S: Maintained
W: https://github.com/geoffreybennett/scarlett-gen2
B: https://github.com/geoffreybennett/scarlett-gen2/issues
F: lib/fortify_kunit.c
F: lib/memcpy_kunit.c
F: lib/test_fortify/*
+K: \bunsafe_memcpy\b
K: \b__NO_FORTIFY\b
FPGA DFL DRIVERS
S: Maintained
F: sound/soc/fsl/fsl*
S: Maintained
F: Documentation/devicetree/bindings/sound/nxp,lpc3220-i2s.yaml
FREESCALE SOC SOUND QMC DRIVER
S: Maintained
F: Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
F: include/linux/gpio.h
F: include/linux/gpio/
F: include/linux/of_gpio.h
+K: (devm_)?gpio_(request|free|direction|get|set)
GPIO UAPI
F: include/uapi/linux/gpio.h
F: tools/gpio/
-GRE DEMULTIPLEXER DRIVER
-S: Maintained
-F: include/net/gre.h
-F: net/ipv4/gre_demux.c
-F: net/ipv4/gre_offload.c
-
GRETH 10/100/1G Ethernet MAC device driver
F: drivers/bus/hisi_lpc.c
HISILICON NETWORK SUBSYSTEM 3 DRIVER (HNS3)
-M: Yisen Zhuang <yisen.zhuang@huawei.com>
+M: Jian Shen <shenjian15@huawei.com>
F: drivers/net/ethernet/hisilicon/hns3/
HISILICON NETWORK SUBSYSTEM DRIVER
-M: Yisen Zhuang <yisen.zhuang@huawei.com>
+M: Jian Shen <shenjian15@huawei.com>
S: Maintained
F: Documentation/mm/vmemmap_dedup.rst
F: fs/hugetlbfs/
F: include/linux/hugetlb.h
+ F: include/trace/events/hugetlbfs.h
F: mm/hugetlb.c
F: mm/hugetlb_vmemmap.c
F: mm/hugetlb_vmemmap.h
INFINEON PEB2466 ASoC CODEC
S: Maintained
F: Documentation/devicetree/bindings/sound/infineon,peb2466.yaml
F: sound/soc/codecs/peb2466.c
F: security/integrity/ima/
INTEGRITY POLICY ENFORCEMENT (IPE)
-M: Fan Wu <wufan@linux.microsoft.com>
+M: Fan Wu <wufan@kernel.org>
S: Supported
-T: git https://github.com/microsoft/ipe.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe.git
F: Documentation/admin-guide/LSM/ipe.rst
F: Documentation/security/ipe.rst
F: scripts/ipe/
S: Supported
F: sound/soc/intel/
INTEL IN FIELD SCAN (IFS) DEVICE
-R: Ashok Raj <ashok.raj@intel.com>
+R: Ashok Raj <ashok.raj.linux@gmail.com>
S: Maintained
F: drivers/platform/x86/intel/ifs
F: drivers/crypto/intel/keembay/ocs-hcu.c
F: drivers/crypto/intel/keembay/ocs-hcu.h
+INTEL LA JOLLA COVE ADAPTER (LJCA) USB I/O EXPANDER DRIVERS
+S: Maintained
+F: drivers/gpio/gpio-ljca.c
+F: drivers/i2c/busses/i2c-ljca.c
+F: drivers/spi/spi-ljca.c
+F: drivers/usb/misc/usb-ljca.c
+F: include/linux/usb/ljca.h
+
INTEL MANAGEMENT ENGINE (mei)
IRON DEVICE AUDIO CODEC DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/irondevice,*
F: sound/soc/codecs/sma*
S: Maintained
+B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
F: Documentation/dev-tools/kasan.rst
F: arch/*/include/asm/*kasan.h
F: arch/*/mm/kasan_init*
S: Maintained
+B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
F: Documentation/dev-tools/kcov.rst
F: include/linux/kcov.h
F: include/uapi/linux/kcov.h
F: kernel/configs/hardening.config
F: lib/usercopy_kunit.c
F: mm/usercopy.c
+F: security/Kconfig.hardening
K: \b(add|choose)_random_kstack_offset\b
K: \b__check_(object_size|heap_object)\b
K: \b__counted_by\b
KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)
-R: James Morse <james.morse@arm.com>
+R: Joey Gouly <joey.gouly@arm.com>
S: Maintained
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
F: drivers/ata/pata_arasan_cf.c
F: include/linux/pata_arasan_cf_data.h
-LIBATA PATA DRIVERS
-F: drivers/ata/ata_*.c
-F: drivers/ata/pata_*.c
-
LIBATA PATA FARADAY FTIDE010 AND GEMINI SATA BRIDGE DRIVERS
S: Maintained
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
F: drivers/ata/pata_ftide010.c
F: drivers/ata/sata_gemini.c
F: drivers/ata/sata_gemini.h
LIBATA SATA AHCI PLATFORM devices support
S: Maintained
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
F: drivers/ata/ahci_platform.c
F: drivers/ata/libahci_platform.c
F: include/linux/ahci_platform.h
-LIBATA SATA AHCI SYNOPSYS DWC CONTROLLER DRIVER
-S: Maintained
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata.git
-F: Documentation/devicetree/bindings/ata/baikal,bt1-ahci.yaml
-F: Documentation/devicetree/bindings/ata/snps,dwc-ahci.yaml
-F: drivers/ata/ahci_dwc.c
-
LIBATA SATA PROMISE TX2/TX4 CONTROLLER DRIVER
S: Maintained
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
F: drivers/ata/sata_promise.*
LIBATA SUBSYSTEM (Serial and Parallel ATA drivers)
MAX9860 MONO AUDIO VOICE CODEC DRIVER
S: Maintained
F: Documentation/devicetree/bindings/sound/max9860.txt
F: sound/soc/codecs/max9860.*
F: drivers/media/platform/nxp/imx-pxp.[ch]
MEDIA DRIVERS FOR ASCOT2E
S: Supported
W: https://linuxtv.org
F: drivers/media/dvb-frontends/cxd2099*
MEDIA DRIVERS FOR CXD2841ER
S: Supported
W: https://linuxtv.org
F: drivers/media/platform/nxp/imx8mq-mipi-csi2.c
MEDIA DRIVERS FOR HELENE
-M: Abylay Ospan <aospan@netup.ru>
+M: Abylay Ospan <aospan@amazon.com>
S: Supported
W: https://linuxtv.org
F: drivers/media/dvb-frontends/helene*
MEDIA DRIVERS FOR HORUS3A
S: Supported
W: https://linuxtv.org
F: drivers/media/dvb-frontends/horus3a*
MEDIA DRIVERS FOR LNBH25
S: Supported
W: https://linuxtv.org
F: drivers/media/dvb-frontends/mxl5xx*
MEDIA DRIVERS FOR NETUP PCI UNIVERSAL DVB devices
S: Supported
W: https://linuxtv.org
MEMORY MAPPING
-R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R: Jann Horn <jannh@google.com>
S: Maintained
W: http://www.linux-mm.org
F: include/linux/mtd/
F: include/uapi/mtd/
-MEMSENSING MICROSYSTEMS MSA311 DRIVER
-S: Maintained
-F: Documentation/devicetree/bindings/iio/accel/memsensing,msa311.yaml
-F: drivers/iio/accel/msa311.c
-
MEN A21 WATCHDOG DRIVER
MICROCHIP AUDIO ASOC DRIVERS
S: Supported
F: Documentation/devicetree/bindings/sound/atmel*
F: Documentation/devicetree/bindings/sound/axentia,tse850-pcm5142.txt
MICROCHIP MCP16502 PMIC DRIVER
S: Supported
F: Documentation/devicetree/bindings/regulator/microchip,mcp16502.yaml
MICROCHIP POLARFIRE FPGA DRIVERS
S: Supported
F: Documentation/devicetree/bindings/fpga/microchip,mpf-spi-fpga-mgr.yaml
MICROCHIP SSC DRIVER
S: Supported
F: Documentation/devicetree/bindings/misc/atmel-ssc.txt
F: drivers/platform/mips/
F: include/dt-bindings/mips/
-MIPS BAIKAL-T1 PLATFORM
-S: Supported
-F: Documentation/devicetree/bindings/bus/baikal,bt1-*.yaml
-F: Documentation/devicetree/bindings/clock/baikal,bt1-*.yaml
-F: drivers/bus/bt1-*.c
-F: drivers/clk/baikal-t1/
-F: drivers/memory/bt1-l2-ctl.c
-F: drivers/mtd/maps/physmap-bt1-rom.[ch]
-
MIPS BOSTON DEVELOPMENT BOARD
MIPS CORE DRIVERS
S: Supported
F: drivers/bus/mips_cdmm.c
NATIVE INSTRUMENTS USB SOUND INTERFACE DRIVER
S: Maintained
W: http://www.native-instruments.com
F: sound/usb/caiaq/
F: net/core/drop_monitor.c
NETWORKING DRIVERS
NETWORKING [DSA]
S: Maintained
F: Documentation/devicetree/bindings/net/dsa/
S: Maintained
P: Documentation/process/maintainer-netdev.rst
F: lib/net_utils.c
F: lib/random32.c
F: net/
+F: samples/pktgen/
F: tools/net/
F: tools/testing/selftests/net/
+X: Documentation/networking/mac80211-injection.rst
+X: Documentation/networking/mac80211_hwsim/
+X: Documentation/networking/regulatory.rst
+X: include/net/cfg80211.h
+X: include/net/ieee80211_radiotap.h
+X: include/net/iw_handler.h
+X: include/net/mac80211.h
+X: include/net/wext.h
X: net/9p/
X: net/bluetooth/
+X: net/mac80211/
+X: net/rfkill/
+X: net/wireless/
NETWORKING [IPSEC]
F: include/linux/ntb_transport.h
F: tools/testing/selftests/ntb/
-NTB IDT DRIVER
-S: Supported
-F: drivers/ntb/hw/idt/
-
NTB INTEL DRIVER
NXP SGTL5000 DRIVER
S: Maintained
F: Documentation/devicetree/bindings/sound/fsl,sgtl5000.yaml
F: sound/soc/codecs/sgtl5000*
NXP TFA9879 DRIVER
S: Maintained
F: Documentation/devicetree/bindings/sound/nxp,tfa9879.yaml
F: sound/soc/codecs/tfa9879*
NXP/Goodix TFA989X (TFA1) DRIVER
S: Maintained
F: Documentation/devicetree/bindings/sound/nxp,tfa989x.yaml
F: sound/soc/codecs/tfa989x.c
OMAP AUDIO SUPPORT
S: Maintained
F: sound/soc/ti/n810.c
OPL4 DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: sound/drivers/opl4/
F: include/linux/pps*.h
F: include/uapi/linux/pps.h
-PPTP DRIVER
-S: Maintained
-W: http://sourceforge.net/projects/accel-pptp
-F: drivers/net/ppp/pptp.c
-
PRESSURE STALL INFORMATION (PSI)
QCOM AUDIO (ASoC) DRIVERS
S: Supported
F: Documentation/devicetree/bindings/soc/qcom/qcom,apr*
F: Documentation/tools/rtla/
F: tools/tracing/rtla/
+Real-time Linux (PREEMPT_RT)
+S: Supported
+K: PREEMPT_RT
+
REALTEK AUDIO CODECS
S: Maintained
F: drivers/i2c/busses/i2c-emev2.c
RENESAS ETHERNET AVB DRIVER
+S: Supported
F: Documentation/devicetree/bindings/net/renesas,etheravb.yaml
F: drivers/net/ethernet/renesas/Kconfig
F: drivers/net/ethernet/renesas/Makefile
RENESAS IDT821034 ASoC CODEC
S: Maintained
F: Documentation/devicetree/bindings/sound/renesas,idt821034.yaml
F: sound/soc/codecs/idt821034.c
F: drivers/i2c/busses/i2c-sh_mobile.c
RENESAS R-CAR SATA DRIVER
S: Supported
F: drivers/i2c/busses/i2c-rzv2m.c
RENESAS SUPERH ETHERNET DRIVER
+S: Supported
F: Documentation/devicetree/bindings/net/renesas,ether.yaml
F: drivers/net/ethernet/renesas/Kconfig
F: drivers/net/ethernet/renesas/Makefile
S: Maintained
Q: https://patchwork.kernel.org/project/linux-riscv/list/
T: git https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/
-F: Documentation/devicetree/bindings/riscv/
-F: arch/riscv/boot/dts/
-X: arch/riscv/boot/dts/allwinner/
-X: arch/riscv/boot/dts/renesas/
-X: arch/riscv/boot/dts/sophgo/
-X: arch/riscv/boot/dts/thead/
+F: arch/riscv/boot/dts/canaan/
+F: arch/riscv/boot/dts/microchip/
+F: arch/riscv/boot/dts/sifive/
+F: arch/riscv/boot/dts/starfive/
RISC-V PMU DRIVERS
SAMSUNG AUDIO (ASoC) DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/samsung*
SERIAL LOW-POWER INTER-CHIP MEDIA BUS (SLIMbus)
S: Maintained
F: Documentation/devicetree/bindings/slimbus/
F: drivers/slimbus/
F: drivers/i2c/busses/i2c-synquacer.c
SOCIONEXT UNIPHIER SOUND DRIVER
S: Orphan
F: sound/soc/uniphier/
SOUND - COMPRESSED AUDIO
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: Documentation/sound/designs/compress-offload.rst
W: https://github.com/thesofproject/linux/
F: sound/soc/sof/
+SOUND - GENERIC SOUND CARD (Simple-Audio-Card, Audio-Graph-Card)
+S: Supported
+F: sound/soc/generic/
+F: include/sound/simple_card*
+F: Documentation/devicetree/bindings/sound/simple-card.yaml
+F: Documentation/devicetree/bindings/sound/audio-graph*.yaml
+
SOUNDWIRE SUBSYSTEM
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire.git
F: Documentation/driver-api/soundwire/
SPEAR PLATFORM/CLOCK/PINCTRL SUPPORT
S: Maintained
W: http://www.st.com/spear
F: arch/arm/boot/dts/st/spear*
STI AUDIO (ASoC) DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/st,sti-asoc-card.txt
F: sound/soc/sti/
STM32 AUDIO (ASoC) DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.yaml
F: Documentation/devicetree/bindings/sound/st,stm32-*.yaml
SYNOPSYS DESIGNWARE APB GPIO DRIVER
S: Maintained
F: Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
F: drivers/gpio/gpio-dwapb.c
-SYNOPSYS DESIGNWARE APB SSI DRIVER
-S: Supported
-F: Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
-F: drivers/spi/spi-dw*
-
SYNOPSYS DESIGNWARE AXI DMAC DRIVER
S: Maintained
TEXAS INSTRUMENTS ASoC DRIVERS
S: Maintained
F: Documentation/devicetree/bindings/sound/davinci-mcasp-audio.yaml
F: sound/soc/ti/
S: Maintained
F: Documentation/devicetree/bindings/sound/tas2552.txt
F: Documentation/devicetree/bindings/sound/ti,tas2562.yaml
TI LM49xxx FAMILY ASoC CODEC DRIVERS
S: Maintained
F: sound/soc/codecs/isabelle*
F: sound/soc/codecs/lm49453*
F: drivers/iio/adc/ti-lmp92064.c
TI PCM3060 ASoC CODEC DRIVER
-M: Kirill Marinushkin <kmarinushkin@birdec.com>
+M: Kirill Marinushkin <k.marinushkin@gmail.com>
S: Maintained
F: Documentation/devicetree/bindings/sound/pcm3060.txt
F: sound/soc/codecs/pcm3060*
TI TAS571X FAMILY ASoC CODEC DRIVER
S: Odd Fixes
F: sound/soc/codecs/tas571x*
TI TWL4030 SERIES SOC CODEC DRIVER
S: Maintained
F: sound/soc/codecs/twl4030*
S: Maintained
F: drivers/hid/hid-udraw-ps3.c
-UFS FILESYSTEM
-S: Maintained
-F: Documentation/admin-guide/ufs.rst
-F: fs/ufs/
-
UHID USERSPACE HID IO DRIVER
USB MIDI DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
F: sound/usb/midi.*
S: Maintained
+B: https://github.com/xairy/raw-gadget/issues
F: Documentation/usb/raw-gadget.rst
F: drivers/usb/gadget/legacy/raw_gadget.c
F: include/uapi/linux/usb/raw_gadget.h
USER DATAGRAM PROTOCOL (UDP)
S: Maintained
F: include/linux/udp.h
+F: include/net/udp.h
+F: include/trace/events/udp.h
+F: include/uapi/linux/udp.h
F: net/ipv4/udp.c
F: net/ipv6/udp.c
S: Maintained
F: include/uapi/linux/virtio_snd.h
F: sound/virtio/*
VMA
-R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R: Jann Horn <jannh@google.com>
S: Maintained
W: https://www.linux-mm.org
XEN SOUND FRONTEND DRIVER
S: Supported
F: sound/xen/*
F: include/xen/swiotlb-xen.h
XFS FILESYSTEM
S: Supported
for (i = 0; i < found_folios; i++) {
struct folio *folio = fbatch.folios[i];
- u32 len = end + 1 - start;
+ u64 range_start;
+ u32 range_len;
if (folio == locked_folio)
continue;
- if (btrfs_folio_start_writer_lock(fs_info, folio, start,
- len))
- goto out;
-
+ folio_lock(folio);
if (!folio_test_dirty(folio) || folio->mapping != mapping) {
- btrfs_folio_end_writer_lock(fs_info, folio, start,
- len);
+ folio_unlock(folio);
goto out;
}
+ range_start = max_t(u64, folio_pos(folio), start);
+ range_len = min_t(u64, folio_pos(folio) + folio_size(folio),
+ end + 1) - range_start;
+ btrfs_folio_set_writer_lock(fs_info, folio, range_start, range_len);
- processed_end = folio_pos(folio) + folio_size(folio) - 1;
+ processed_end = range_start + range_len - 1;
}
folio_batch_release(&fbatch);
cond_resched();
}
if (bio_ctrl->wbc)
- wbc_account_cgroup_owner(bio_ctrl->wbc, &folio->page,
+ wbc_account_cgroup_owner(bio_ctrl->wbc, folio,
len);
size -= len;
free_extent_map(em);
em = NULL;
+ /*
+ * Although the PageDirty bit is cleared before entering this
+ * function, subpage dirty bit is not cleared.
+ * So clear subpage dirty bit here so next time we won't submit
+ * a folio for a range already written to disk.
+ */
+ btrfs_folio_clear_dirty(fs_info, folio, filepos, sectorsize);
btrfs_set_range_writeback(inode, filepos, filepos + sectorsize - 1);
/*
* Above call should set the whole folio with writeback flag, even
*/
ASSERT(folio_test_writeback(folio));
- /*
- * Although the PageDirty bit is cleared before entering this
- * function, subpage dirty bit is not cleared.
- * So clear subpage dirty bit here so next time we won't submit
- * folio for range already written to disk.
- */
- btrfs_folio_clear_dirty(fs_info, folio, filepos, sectorsize);
submit_extent_folio(bio_ctrl, disk_bytenr, folio,
sectorsize, filepos - folio_pos(folio));
return 0;
ret = bio_add_folio(&bbio->bio, folio, eb->len,
eb->start - folio_pos(folio));
ASSERT(ret);
- wbc_account_cgroup_owner(wbc, folio_page(folio, 0), eb->len);
+ wbc_account_cgroup_owner(wbc, folio, eb->len);
folio_unlock(folio);
} else {
int num_folios = num_extent_folios(eb);
folio_start_writeback(folio);
ret = bio_add_folio(&bbio->bio, folio, eb->folio_size, 0);
ASSERT(ret);
- wbc_account_cgroup_owner(wbc, folio_page(folio, 0),
- eb->folio_size);
+ wbc_account_cgroup_owner(wbc, folio, eb->folio_size);
wbc->nr_to_write -= folio_nr_pages(folio);
folio_unlock(folio);
}
#include <linux/migrate.h>
#include <linux/sched/mm.h>
#include <linux/iomap.h>
-#include <asm/unaligned.h>
+#include <linux/unaligned.h>
#include <linux/fsverity.h>
#include "misc.h"
#include "ctree.h"
clear_bits |= EXTENT_CLEAR_DATA_RESV;
extent_clear_unlock_delalloc(inode, start, end, locked_folio,
&cached, clear_bits, page_ops);
- btrfs_qgroup_free_data(inode, NULL, start, cur_alloc_size, NULL);
+ btrfs_qgroup_free_data(inode, NULL, start, end - start + 1, NULL);
}
return ret;
}
* need full accuracy. Just account the whole thing
* against the first page.
*/
- wbc_account_cgroup_owner(wbc, &locked_folio->page,
+ wbc_account_cgroup_owner(wbc, locked_folio,
cur_end - start);
async_chunk[i].locked_folio = locked_folio;
locked_folio = NULL;
ret = btrfs_update_inode_fallback(trans, inode);
if (ret) /* -ENOMEM or corruption */
btrfs_abort_transaction(trans, ret);
+
+ ret = btrfs_insert_raid_extent(trans, ordered_extent);
+ if (ret)
+ btrfs_abort_transaction(trans, ret);
+
goto out;
}
*/
if (btrfs_ino(inode) == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID) {
di = btrfs_search_dir_index_item(root, path, dir_ino, &fname.disk_name);
- if (IS_ERR_OR_NULL(di)) {
- if (!di)
- ret = -ENOENT;
- else
- ret = PTR_ERR(di);
+ if (IS_ERR(di)) {
+ ret = PTR_ERR(di);
btrfs_abort_transaction(trans, ret);
goto out;
}
#include <linux/mutex.h>
#include <linux/buffer_head.h>
#include <linux/blkdev.h>
+#include <linux/fs_context.h>
#include "hfsplus_raw.h"
#define DBG_BNODE_REFS 0x00000001
/* Runtime variables */
u32 blockoffset;
+ u32 min_io_size;
sector_t part_start;
sector_t sect_count;
int fs_shift;
*/
static inline unsigned short hfsplus_min_io_size(struct super_block *sb)
{
- return max_t(unsigned short, bdev_logical_block_size(sb->s_bdev),
+ return max_t(unsigned short, HFSPLUS_SB(sb)->min_io_size,
HFSPLUS_SECTOR_SIZE);
}
/* options.c */
void hfsplus_fill_defaults(struct hfsplus_sb_info *opts);
-int hfsplus_parse_options_remount(char *input, int *force);
-int hfsplus_parse_options(char *input, struct hfsplus_sb_info *sbi);
+int hfsplus_parse_param(struct fs_context *fc, struct fs_parameter *param);
int hfsplus_show_options(struct seq_file *seq, struct dentry *root);
/* part_tbl.c */
#include <linux/fs.h>
#include <linux/blkdev.h>
#include <linux/cdrom.h>
-#include <asm/unaligned.h>
+#include <linux/unaligned.h>
#include "hfsplus_fs.h"
#include "hfsplus_raw.h"
if (!blocksize)
goto out;
+ sbi->min_io_size = blocksize;
+
if (hfsplus_get_last_session(sb, &part_start, &part_size))
goto out;
#include <linux/list_lru.h>
#include <linux/iversion.h>
#include <linux/rw_hint.h>
+#include <linux/seq_file.h>
+#include <linux/debugfs.h>
#include <trace/events/writeback.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/timestamp.h>
+
#include "internal.h"
/*
return nr_dirty > 0 ? nr_dirty : 0;
}
+#ifdef CONFIG_DEBUG_FS
+static DEFINE_PER_CPU(long, mg_ctime_updates);
+static DEFINE_PER_CPU(long, mg_fine_stamps);
+static DEFINE_PER_CPU(long, mg_ctime_swaps);
+
+static unsigned long get_mg_ctime_updates(void)
+{
+ unsigned long sum = 0;
+ int i;
+
+ for_each_possible_cpu(i)
+ sum += data_race(per_cpu(mg_ctime_updates, i));
+ return sum;
+}
+
+static unsigned long get_mg_fine_stamps(void)
+{
+ unsigned long sum = 0;
+ int i;
+
+ for_each_possible_cpu(i)
+ sum += data_race(per_cpu(mg_fine_stamps, i));
+ return sum;
+}
+
+static unsigned long get_mg_ctime_swaps(void)
+{
+ unsigned long sum = 0;
+ int i;
+
+ for_each_possible_cpu(i)
+ sum += data_race(per_cpu(mg_ctime_swaps, i));
+ return sum;
+}
+
+#define mgtime_counter_inc(__var) this_cpu_inc(__var)
+
+static int mgts_show(struct seq_file *s, void *p)
+{
+ unsigned long ctime_updates = get_mg_ctime_updates();
+ unsigned long ctime_swaps = get_mg_ctime_swaps();
+ unsigned long fine_stamps = get_mg_fine_stamps();
+ unsigned long floor_swaps = timekeeping_get_mg_floor_swaps();
+
+ seq_printf(s, "%lu %lu %lu %lu\n",
+ ctime_updates, ctime_swaps, fine_stamps, floor_swaps);
+ return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(mgts);
+
+static int __init mg_debugfs_init(void)
+{
+ debugfs_create_file("multigrain_timestamps", S_IFREG | S_IRUGO, NULL, NULL, &mgts_fops);
+ return 0;
+}
+late_initcall(mg_debugfs_init);
+
+#else /* ! CONFIG_DEBUG_FS */
+
+#define mgtime_counter_inc(__var) do { } while (0)
+
+#endif /* CONFIG_DEBUG_FS */
+
/*
* Handle nr_inode sysctl
*/
}
/**
- * inode_init_always - perform inode structure initialisation
+ * inode_init_always_gfp - perform inode structure initialisation
* @sb: superblock inode belongs to
* @inode: inode to initialise
+ * @gfp: allocation flags
*
* These are initializations that need to be done on every inode
* allocation as the fields are not initialised by slab allocation.
+ * If there are additional allocations required @gfp is used.
*/
-int inode_init_always(struct super_block *sb, struct inode *inode)
+int inode_init_always_gfp(struct super_block *sb, struct inode *inode, gfp_t gfp)
{
static const struct inode_operations empty_iops;
static const struct file_operations no_open_fops = {.open = no_open};
inode->i_opflags = 0;
if (sb->s_xattr)
inode->i_opflags |= IOP_XATTR;
+ if (sb->s_type->fs_flags & FS_MGTIME)
+ inode->i_opflags |= IOP_MGTIME;
i_uid_write(inode, 0);
i_gid_write(inode, 0);
atomic_set(&inode->i_writecount, 0);
#endif
inode->i_flctx = NULL;
- if (unlikely(security_inode_alloc(inode)))
+ if (unlikely(security_inode_alloc(inode, gfp)))
return -ENOMEM;
this_cpu_inc(nr_inodes);
return 0;
}
-EXPORT_SYMBOL(inode_init_always);
+EXPORT_SYMBOL(inode_init_always_gfp);
void free_inode_nonrcu(struct inode *inode)
{
* ___wait_var_event() either sees the bit cleared or
* waitqueue_active() check in wake_up_var() sees the waiter.
*/
- smp_mb();
+ smp_mb__after_spinlock();
inode_wake_up_bit(inode, __I_NEW);
BUG_ON(inode->i_state != (I_FREEING | I_CLEAR));
spin_unlock(&inode->i_lock);
* @data: opaque data pointer to pass to @test and @set
*
* Search for the inode specified by @hashval and @data in the inode cache,
- * and if present it is return it with an increased reference count. This is
- * a variant of iget5_locked() for callers that don't want to fail on memory
- * allocation of inode.
+ * and if present return it with an increased reference count. This is a
+ * variant of iget5_locked() that doesn't allocate an inode.
*
- * If the inode is not in cache, insert the pre-allocated inode to cache and
+ * If the inode is not present in the cache, insert the pre-allocated inode and
* return it locked, hashed, and with the I_NEW flag set. The file system gets
* to fill it in before unlocking it via unlock_new_inode().
*
- * Note both @test and @set are called with the inode_hash_lock held, so can't
- * sleep.
+ * Note that both @test and @set are called with the inode_hash_lock held, so
+ * they can't sleep.
*/
struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
int (*test)(struct inode *, void *),
* @data: opaque data pointer to pass to @test and @set
*
* Search for the inode specified by @hashval and @data in the inode cache,
- * and if present it is return it with an increased reference count. This is
- * a generalized version of iget_locked() for file systems where the inode
+ * and if present return it with an increased reference count. This is a
+ * generalized version of iget_locked() for file systems where the inode
* number is not sufficient for unique identification of an inode.
*
- * If the inode is not in cache, allocate a new inode and return it locked,
- * hashed, and with the I_NEW flag set. The file system gets to fill it in
- * before unlocking it via unlock_new_inode().
+ * If the inode is not present in the cache, allocate and insert a new inode
+ * and return it locked, hashed, and with the I_NEW flag set. The file system
+ * gets to fill it in before unlocking it via unlock_new_inode().
*
- * Note both @test and @set are called with the inode_hash_lock held, so can't
- * sleep.
+ * Note that both @test and @set are called with the inode_hash_lock held, so
+ * they can't sleep.
*/
struct inode *iget5_locked(struct super_block *sb, unsigned long hashval,
int (*test)(struct inode *, void *),
}
EXPORT_SYMBOL(file_remove_privs);
+/**
+ * current_time - Return FS time (possibly fine-grained)
+ * @inode: inode.
+ *
+ * Return the current time truncated to the time granularity supported by
+ * the fs, as suitable for a ctime/mtime change. If the ctime is flagged
+ * as having been QUERIED, get a fine-grained timestamp, but don't update
+ * the floor.
+ *
+ * For a multigrain inode, this is effectively an estimate of the timestamp
+ * that a file would receive. An actual update must go through
+ * inode_set_ctime_current().
+ */
+struct timespec64 current_time(struct inode *inode)
+{
+ struct timespec64 now;
+ u32 cns;
+
+ ktime_get_coarse_real_ts64_mg(&now);
+
+ if (!is_mgtime(inode))
+ goto out;
+
+ /* If nothing has queried it, then coarse time is fine */
+ cns = smp_load_acquire(&inode->i_ctime_nsec);
+ if (cns & I_CTIME_QUERIED) {
+ /*
+ * If there is no apparent change, then get a fine-grained
+ * timestamp.
+ */
+ if (now.tv_nsec == (cns & ~I_CTIME_QUERIED))
+ ktime_get_real_ts64(&now);
+ }
+out:
+ return timestamp_truncate(now, inode);
+}
+EXPORT_SYMBOL(current_time);
+
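+/*
+ * Illustration only (not part of the patch): the comment above distinguishes
+ * current_time() as an estimate from inode_set_ctime_current() as the real
+ * update. A minimal sketch of the intended call pattern, with a hypothetical
+ * helper name:
+ *
+ *	static void myfs_update_cmtime(struct inode *inode)
+ *	{
+ *		struct timespec64 ctime = inode_set_ctime_current(inode);
+ *
+ *		// Keep mtime identical to the ctime that was actually assigned.
+ *		inode_set_mtime_to_ts(inode, ctime);
+ *	}
+ */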
static int inode_needs_update_time(struct inode *inode)
{
+ struct timespec64 now, ts;
int sync_it = 0;
- struct timespec64 now = current_time(inode);
- struct timespec64 ts;
/* First try to exhaust all avenues to not sync */
if (IS_NOCMTIME(inode))
return 0;
+ now = current_time(inode);
+
ts = inode_get_mtime(inode);
if (!timespec64_equal(&ts, &now))
- sync_it = S_MTIME;
+ sync_it |= S_MTIME;
ts = inode_get_ctime(inode);
if (!timespec64_equal(&ts, &now))
}
EXPORT_SYMBOL(inode_nohighmem);
+struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timespec64 ts)
+{
+ trace_inode_set_ctime_to_ts(inode, &ts);
+ set_normalized_timespec64(&ts, ts.tv_sec, ts.tv_nsec);
+ inode->i_ctime_sec = ts.tv_sec;
+ inode->i_ctime_nsec = ts.tv_nsec;
+ return ts;
+}
+EXPORT_SYMBOL(inode_set_ctime_to_ts);
+
/**
* timestamp_truncate - Truncate timespec to a granularity
* @t: Timespec
EXPORT_SYMBOL(timestamp_truncate);
/**
- * current_time - Return FS time
- * @inode: inode.
+ * inode_set_ctime_current - set the ctime to current_time
+ * @inode: inode
*
- * Return the current time truncated to the time granularity supported by
- * the fs.
+ * Set the inode's ctime to the current value for the inode. Returns the
+ * current value that was assigned. If this is not a multigrain inode, then we
+ * set it to the later of the coarse time and floor value.
*
- * Note that inode and inode->sb cannot be NULL.
- * Otherwise, the function warns and returns time without truncation.
+ * If it is multigrain, then we first see if the coarse-grained timestamp is
+ * distinct from what is already there. If so, then use that. Otherwise, get a
+ * fine-grained timestamp.
+ *
+ * After that, try to swap the new value into i_ctime_nsec. Accept the
+ * resulting ctime, regardless of the outcome of the swap. If it has
+ * already been replaced, then that timestamp is later than the earlier
+ * unacceptable one, and is thus acceptable.
*/
-struct timespec64 current_time(struct inode *inode)
+struct timespec64 inode_set_ctime_current(struct inode *inode)
{
struct timespec64 now;
+ u32 cns, cur;
- ktime_get_coarse_real_ts64(&now);
- return timestamp_truncate(now, inode);
+ ktime_get_coarse_real_ts64_mg(&now);
+ now = timestamp_truncate(now, inode);
+
+ /* Just return that if this is not a multigrain fs */
+ if (!is_mgtime(inode)) {
+ inode_set_ctime_to_ts(inode, now);
+ goto out;
+ }
+
+ /*
+ * A fine-grained time is only needed if someone has queried
+ * for timestamps, and the current coarse grained time isn't
+ * later than what's already there.
+ */
+ cns = smp_load_acquire(&inode->i_ctime_nsec);
+ if (cns & I_CTIME_QUERIED) {
+ struct timespec64 ctime = { .tv_sec = inode->i_ctime_sec,
+ .tv_nsec = cns & ~I_CTIME_QUERIED };
+
+ if (timespec64_compare(&now, &ctime) <= 0) {
+ ktime_get_real_ts64_mg(&now);
+ now = timestamp_truncate(now, inode);
+ mgtime_counter_inc(mg_fine_stamps);
+ }
+ }
+ mgtime_counter_inc(mg_ctime_updates);
+
+ /* No need to cmpxchg if it's exactly the same */
+ if (cns == now.tv_nsec && inode->i_ctime_sec == now.tv_sec) {
+ trace_ctime_xchg_skip(inode, &now);
+ goto out;
+ }
+ cur = cns;
+retry:
+ /* Try to swap the nsec value into place. */
+ if (try_cmpxchg(&inode->i_ctime_nsec, &cur, now.tv_nsec)) {
+ /* If swap occurred, then we're (mostly) done */
+ inode->i_ctime_sec = now.tv_sec;
+ trace_ctime_ns_xchg(inode, cns, now.tv_nsec, cur);
+ mgtime_counter_inc(mg_ctime_swaps);
+ } else {
+ /*
+ * Was the change due to someone marking the old ctime QUERIED?
+ * If so then retry the swap. This can only happen once since
+ * the only way to clear I_CTIME_QUERIED is to stamp the inode
+ * with a new ctime.
+ */
+ if (!(cns & I_CTIME_QUERIED) && (cns | I_CTIME_QUERIED) == cur) {
+ cns = cur;
+ goto retry;
+ }
+ /* Otherwise, keep the existing ctime */
+ now.tv_sec = inode->i_ctime_sec;
+ now.tv_nsec = cur & ~I_CTIME_QUERIED;
+ }
+out:
+ return now;
}
-EXPORT_SYMBOL(current_time);
+EXPORT_SYMBOL(inode_set_ctime_current);
/**
- * inode_set_ctime_current - set the ctime to current_time
- * @inode: inode
+ * inode_set_ctime_deleg - try to update the ctime on a delegated inode
+ * @inode: inode to update
+ * @update: timespec64 to set the ctime
*
- * Set the inode->i_ctime to the current value for the inode. Returns
- * the current value that was assigned to i_ctime.
+ * Attempt to atomically update the ctime on behalf of a delegation holder.
+ *
+ * The nfs server can call back the holder of a delegation to get updated
+ * inode attributes, including the mtime. When updating the mtime, update
+ * the ctime to a value at least equal to that.
+ *
+ * This can race with concurrent updates to the inode, in which
+ * case the update is skipped.
+ *
+ * Note that this works even when multigrain timestamps are not enabled,
+ * so it is used in either case.
*/
-struct timespec64 inode_set_ctime_current(struct inode *inode)
+struct timespec64 inode_set_ctime_deleg(struct inode *inode, struct timespec64 update)
{
- struct timespec64 now = current_time(inode);
+ struct timespec64 now, cur_ts;
+ u32 cur, old;
- inode_set_ctime_to_ts(inode, now);
- return now;
+ /* pairs with try_cmpxchg below */
+ cur = smp_load_acquire(&inode->i_ctime_nsec);
+ cur_ts.tv_nsec = cur & ~I_CTIME_QUERIED;
+ cur_ts.tv_sec = inode->i_ctime_sec;
+
+ /* If the update is older than the existing value, skip it. */
+ if (timespec64_compare(&update, &cur_ts) <= 0)
+ return cur_ts;
+
+ ktime_get_coarse_real_ts64_mg(&now);
+
+ /* Clamp the update to "now" if it's in the future */
+ if (timespec64_compare(&update, &now) > 0)
+ update = now;
+
+ update = timestamp_truncate(update, inode);
+
+ /* No need to update if the values are already the same */
+ if (timespec64_equal(&update, &cur_ts))
+ return cur_ts;
+
+ /*
+ * Try to swap the nsec value into place. If it fails, that means
+ * it raced with an update due to a write or similar activity. That
+ * stamp takes precedence, so just skip the update.
+ */
+retry:
+ old = cur;
+ if (try_cmpxchg(&inode->i_ctime_nsec, &cur, update.tv_nsec)) {
+ inode->i_ctime_sec = update.tv_sec;
+ mgtime_counter_inc(mg_ctime_swaps);
+ return update;
+ }
+
+ /*
+ * Was the change due to another task marking the old ctime QUERIED?
+ *
+ * If so, then retry the swap. This can only happen once since
+ * the only way to clear I_CTIME_QUERIED is to stamp the inode
+ * with a new ctime.
+ */
+ if (!(old & I_CTIME_QUERIED) && (cur == (old | I_CTIME_QUERIED)))
+ goto retry;
+
+ /* Otherwise, it was a new timestamp. */
+ cur_ts.tv_sec = inode->i_ctime_sec;
+ cur_ts.tv_nsec = cur & ~I_CTIME_QUERIED;
+ return cur_ts;
}
-EXPORT_SYMBOL(inode_set_ctime_current);
+EXPORT_SYMBOL(inode_set_ctime_deleg);
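+/*
+ * Illustration only (not part of the patch): a sketch of the delegation-holder
+ * callback path described above. nfsd_apply_deleg_mtime() is hypothetical and
+ * does not reflect the actual nfsd call sites:
+ *
+ *	static void nfsd_apply_deleg_mtime(struct inode *inode,
+ *					   struct timespec64 mtime)
+ *	{
+ *		inode_set_mtime_to_ts(inode, mtime);
+ *		// Ensure the ctime is at least as new as the delegated mtime;
+ *		// a racing local update simply wins.
+ *		inode_set_ctime_deleg(inode, mtime);
+ *	}
+ */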
/**
* in_group_or_capable - check whether caller is CAP_FSETID privileged
* @inode: inode to check
* @vfsgid: the new/current vfsgid of @inode
*
- * Check wether @vfsgid is in the caller's group list or if the caller is
+ * Check whether @vfsgid is in the caller's group list or if the caller is
* privileged with CAP_FSETID over @inode. This can be used to determine
* whether the setgid bit can be kept or must be dropped.
*
}
/*
+ * When a short write occurs, the filesystem might need to use ->iomap_end
+ * to remove space reservations created in ->iomap_begin.
+ *
+ * For filesystems that use delayed allocation, there can be dirty pages over
+ * the delalloc extent outside the range of a short write but still within the
+ * delalloc extent allocated for this iomap if the write raced with page
+ * faults.
+ *
* Punch out all the delalloc blocks in the range given except for those that
* have dirty data still pending in the page cache - those are going to be
* written and so must still retain the delalloc backing for writeback.
*
+ * The punch() callback *must* only punch delalloc extents in the range passed
+ * to it. It must skip over all other types of extents in the range and leave
+ * them completely unchanged. It must do this punch atomically with respect to
+ * other extent modifications.
+ *
+ * The punch() callback may be called with a folio locked to prevent writeback
+ * extent allocation racing at the edge of the range we are currently punching.
+ * The locked folio may or may not cover the range being punched, so it is not
+ * safe for the punch() callback to lock folios itself.
+ *
+ * Lock order is:
+ *
+ * inode->i_rwsem (shared or exclusive)
+ * inode->i_mapping->invalidate_lock (exclusive)
+ * folio_lock()
+ * ->punch
+ * internal filesystem allocation lock
+ *
* As we are scanning the page cache for data, we don't need to reimplement the
* wheel - mapping_seek_hole_data() does exactly what we need to identify the
* start and end of data ranges correctly even for sub-folio block sizes. This
* require sprinkling this code with magic "+ 1" and "- 1" arithmetic and expose
* the code to subtle off-by-one bugs....
*/
-static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
+void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
loff_t end_byte, unsigned flags, struct iomap *iomap,
iomap_punch_t punch)
{
loff_t scan_end_byte = min(i_size_read(inode), end_byte);
/*
- * Lock the mapping to avoid races with page faults re-instantiating
- * folios and dirtying them via ->page_mkwrite whilst we walk the
- * cache and perform delalloc extent removal. Failing to do this can
- * leave dirty pages with no space reservation in the cache.
+ * The caller must hold invalidate_lock to avoid races with page faults
+ * re-instantiating folios and dirtying them via ->page_mkwrite whilst
+ * we walk the cache and perform delalloc extent removal. Failing to do
+ * this can leave dirty pages with no space reservation in the cache.
*/
- filemap_invalidate_lock(inode->i_mapping);
+ lockdep_assert_held_write(&inode->i_mapping->invalidate_lock);
+
while (start_byte < scan_end_byte) {
loff_t data_end;
if (start_byte == -ENXIO || start_byte == scan_end_byte)
break;
if (WARN_ON_ONCE(start_byte < 0))
- goto out_unlock;
+ return;
WARN_ON_ONCE(start_byte < punch_start_byte);
WARN_ON_ONCE(start_byte > scan_end_byte);
data_end = mapping_seek_hole_data(inode->i_mapping, start_byte,
scan_end_byte, SEEK_HOLE);
if (WARN_ON_ONCE(data_end < 0))
- goto out_unlock;
+ return;
/*
* If we race with post-direct I/O invalidation of the page cache,
if (punch_start_byte < end_byte)
punch(inode, punch_start_byte, end_byte - punch_start_byte,
iomap);
-out_unlock:
- filemap_invalidate_unlock(inode->i_mapping);
}
-
-/*
- * When a short write occurs, the filesystem may need to remove reserved space
- * that was allocated in ->iomap_begin from it's ->iomap_end method. For
- * filesystems that use delayed allocation, we need to punch out delalloc
- * extents from the range that are not dirty in the page cache. As the write can
- * race with page faults, there can be dirty pages over the delalloc extent
- * outside the range of a short write but still within the delalloc extent
- * allocated for this iomap.
- *
- * This function uses [start_byte, end_byte) intervals (i.e. open ended) to
- * simplify range iterations.
- *
- * The punch() callback *must* only punch delalloc extents in the range passed
- * to it. It must skip over all other types of extents in the range and leave
- * them completely unchanged. It must do this punch atomically with respect to
- * other extent modifications.
- *
- * The punch() callback may be called with a folio locked to prevent writeback
- * extent allocation racing at the edge of the range we are currently punching.
- * The locked folio may or may not cover the range being punched, so it is not
- * safe for the punch() callback to lock folios itself.
- *
- * Lock order is:
- *
- * inode->i_rwsem (shared or exclusive)
- * inode->i_mapping->invalidate_lock (exclusive)
- * folio_lock()
- * ->punch
- * internal filesystem allocation lock
- */
-void iomap_file_buffered_write_punch_delalloc(struct inode *inode,
- loff_t pos, loff_t length, ssize_t written, unsigned flags,
- struct iomap *iomap, iomap_punch_t punch)
-{
- loff_t start_byte;
- loff_t end_byte;
- unsigned int blocksize = i_blocksize(inode);
-
- if (iomap->type != IOMAP_DELALLOC)
- return;
-
- /* If we didn't reserve the blocks, we're not allowed to punch them. */
- if (!(iomap->flags & IOMAP_F_NEW))
- return;
-
- /*
- * start_byte refers to the first unused block after a short write. If
- * nothing was written, round offset down to point at the first block in
- * the range.
- */
- if (unlikely(!written))
- start_byte = round_down(pos, blocksize);
- else
- start_byte = round_up(pos + written, blocksize);
- end_byte = round_up(pos + length, blocksize);
-
- /* Nothing to do if we've written the entire delalloc extent */
- if (start_byte >= end_byte)
- return;
-
- iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
- punch);
-}
-EXPORT_SYMBOL_GPL(iomap_file_buffered_write_punch_delalloc);
+EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
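A minimal sketch of the calling convention described in the comment above; the
punch helper and wrapper names are hypothetical, and real filesystem callers
differ in detail. The caller, not iomap_write_delalloc_release(), takes the
invalidate_lock:

static void my_fs_punch_delalloc(struct inode *inode, loff_t offset,
				 loff_t length, struct iomap *iomap)
{
	/* Filesystem-specific removal of delalloc blocks in this range only. */
}

static void my_fs_release_short_write(struct inode *inode, loff_t start_byte,
				      loff_t end_byte, unsigned flags,
				      struct iomap *iomap)
{
	filemap_invalidate_lock(inode->i_mapping);
	iomap_write_delalloc_release(inode, start_byte, end_byte, flags,
				     iomap, my_fs_punch_delalloc);
	filemap_invalidate_unlock(inode->i_mapping);
}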
static loff_t iomap_unshare_iter(struct iomap_iter *iter)
{
loff_t length = iomap_length(iter);
loff_t written = 0;
- /* Don't bother with blocks that are not shared to start with. */
- if (!(iomap->flags & IOMAP_F_SHARED))
- return length;
-
- /*
- * Don't bother with holes or unwritten extents.
- *
- * Note that we use srcmap directly instead of iomap_iter_srcmap as
- * unsharing requires providing a separate source map, and the presence
- * of one is a good indicator that unsharing is needed, unlike
- * IOMAP_F_SHARED which can be set for any data that goes into the COW
- * fork for XFS.
- */
- if (iter->srcmap.type == IOMAP_HOLE ||
- iter->srcmap.type == IOMAP_UNWRITTEN)
+ if (!iomap_want_unshare_iter(iter))
return length;
do {
struct iomap_iter iter = {
.inode = inode,
.pos = pos,
- .len = len,
.flags = IOMAP_WRITE | IOMAP_UNSHARE,
};
+ loff_t size = i_size_read(inode);
int ret;
+ if (pos < 0 || pos >= size)
+ return 0;
+
+ iter.len = min(len, size - pos);
while ((ret = iomap_iter(&iter, ops)) > 0)
iter.processed = iomap_unshare_iter(&iter);
return ret;
if (ifs)
atomic_add(len, &ifs->write_bytes_pending);
wpc->ioend->io_size += len;
- wbc_account_cgroup_owner(wbc, &folio->page, len);
+ wbc_account_cgroup_owner(wbc, folio, len);
return 0;
}
}
new_ns->ns.ops = &mntns_operations;
if (!anon)
- new_ns->seq = atomic64_add_return(1, &mnt_ns_seq);
+ new_ns->seq = atomic64_inc_return(&mnt_ns_seq);
refcount_set(&new_ns->ns.count, 1);
refcount_set(&new_ns->passive, 1);
new_ns->mounts = RB_ROOT;
new = copy_tree(old, old->mnt.mnt_root, copy_flags);
if (IS_ERR(new)) {
namespace_unlock();
- free_mnt_ns(new_ns);
+ ns_free_inum(&new_ns->ns);
+ dec_mnt_namespaces(new_ns->ucounts);
+ mnt_ns_release(new_ns);
return ERR_CAST(new);
}
if (user_ns != ns->user_ns) {
return 0;
}
+ static void statmount_fs_subtype(struct kstatmount *s, struct seq_file *seq)
+ {
+ struct super_block *sb = s->mnt->mnt_sb;
+
+ if (sb->s_subtype)
+ seq_puts(seq, sb->s_subtype);
+ }
+
+ static int statmount_sb_source(struct kstatmount *s, struct seq_file *seq)
+ {
+ struct super_block *sb = s->mnt->mnt_sb;
+ struct mount *r = real_mount(s->mnt);
+
+ if (sb->s_op->show_devname) {
+ size_t start = seq->count;
+ int ret;
+
+ ret = sb->s_op->show_devname(seq, s->mnt->mnt_root);
+ if (ret)
+ return ret;
+
+ if (unlikely(seq_has_overflowed(seq)))
+ return -EAGAIN;
+
+ /* Unescape the result */
+ seq->buf[seq->count] = '\0';
+ seq->count = start;
+ seq_commit(seq, string_unescape_inplace(seq->buf + start, UNESCAPE_OCTAL));
+ } else if (r->mnt_devname) {
+ seq_puts(seq, r->mnt_devname);
+ }
+ return 0;
+ }
+
static void statmount_mnt_ns_id(struct kstatmount *s, struct mnt_namespace *ns)
{
s->sm.mask |= STATMOUNT_MNT_NS_ID;
return 0;
}
+ static inline int statmount_opt_unescape(struct seq_file *seq, char *buf_start)
+ {
+ char *buf_end, *opt_start, *opt_end;
+ int count = 0;
+
+ buf_end = seq->buf + seq->count;
+ *buf_end = '\0';
+ for (opt_start = buf_start + 1; opt_start < buf_end; opt_start = opt_end + 1) {
+ opt_end = strchrnul(opt_start, ',');
+ *opt_end = '\0';
+ buf_start += string_unescape(opt_start, buf_start, 0, UNESCAPE_OCTAL) + 1;
+ if (WARN_ON_ONCE(++count == INT_MAX))
+ return -EOVERFLOW;
+ }
+ seq->count = buf_start - 1 - seq->buf;
+ return count;
+ }
+
+ static int statmount_opt_array(struct kstatmount *s, struct seq_file *seq)
+ {
+ struct vfsmount *mnt = s->mnt;
+ struct super_block *sb = mnt->mnt_sb;
+ size_t start = seq->count;
+ char *buf_start;
+ int err;
+
+ if (!sb->s_op->show_options)
+ return 0;
+
+ buf_start = seq->buf + start;
+ err = sb->s_op->show_options(seq, mnt->mnt_root);
+ if (err)
+ return err;
+
+ if (unlikely(seq_has_overflowed(seq)))
+ return -EAGAIN;
+
+ if (seq->count == start)
+ return 0;
+
+ err = statmount_opt_unescape(seq, buf_start);
+ if (err < 0)
+ return err;
+
+ s->sm.opt_num = err;
+ return 0;
+ }
+
+ static int statmount_opt_sec_array(struct kstatmount *s, struct seq_file *seq)
+ {
+ struct vfsmount *mnt = s->mnt;
+ struct super_block *sb = mnt->mnt_sb;
+ size_t start = seq->count;
+ char *buf_start;
+ int err;
+
+ buf_start = seq->buf + start;
+
+ err = security_sb_show_options(seq, sb);
+ if (err)
+ return err;
+
+ if (unlikely(seq_has_overflowed(seq)))
+ return -EAGAIN;
+
+ if (seq->count == start)
+ return 0;
+
+ err = statmount_opt_unescape(seq, buf_start);
+ if (err < 0)
+ return err;
+
+ s->sm.opt_sec_num = err;
+ return 0;
+ }
+
static int statmount_string(struct kstatmount *s, u64 flag)
{
- int ret;
+ int ret = 0;
size_t kbufsize;
struct seq_file *seq = &s->seq;
struct statmount *sm = &s->sm;
+ u32 start = seq->count;
switch (flag) {
case STATMOUNT_FS_TYPE:
- sm->fs_type = seq->count;
+ sm->fs_type = start;
ret = statmount_fs_type(s, seq);
break;
case STATMOUNT_MNT_ROOT:
- sm->mnt_root = seq->count;
+ sm->mnt_root = start;
ret = statmount_mnt_root(s, seq);
break;
case STATMOUNT_MNT_POINT:
- sm->mnt_point = seq->count;
+ sm->mnt_point = start;
ret = statmount_mnt_point(s, seq);
break;
case STATMOUNT_MNT_OPTS:
- sm->mnt_opts = seq->count;
+ sm->mnt_opts = start;
ret = statmount_mnt_opts(s, seq);
break;
+ case STATMOUNT_OPT_ARRAY:
+ sm->opt_array = start;
+ ret = statmount_opt_array(s, seq);
+ break;
+ case STATMOUNT_OPT_SEC_ARRAY:
+ sm->opt_sec_array = start;
+ ret = statmount_opt_sec_array(s, seq);
+ break;
+ case STATMOUNT_FS_SUBTYPE:
+ sm->fs_subtype = start;
+ statmount_fs_subtype(s, seq);
+ break;
+ case STATMOUNT_SB_SOURCE:
+ sm->sb_source = start;
+ ret = statmount_sb_source(s, seq);
+ break;
default:
WARN_ON_ONCE(true);
return -EINVAL;
}
+ /*
+ * If nothing was emitted, return to avoid setting the flag
+ * and terminating the buffer.
+ */
+ if (seq->count == start)
+ return ret;
if (unlikely(check_add_overflow(sizeof(*sm), seq->count, &kbufsize)))
return -EOVERFLOW;
if (kbufsize >= s->bufsize)
if (!err && s->mask & STATMOUNT_MNT_OPTS)
err = statmount_string(s, STATMOUNT_MNT_OPTS);
+ if (!err && s->mask & STATMOUNT_OPT_ARRAY)
+ err = statmount_string(s, STATMOUNT_OPT_ARRAY);
+
+ if (!err && s->mask & STATMOUNT_OPT_SEC_ARRAY)
+ err = statmount_string(s, STATMOUNT_OPT_SEC_ARRAY);
+
+ if (!err && s->mask & STATMOUNT_FS_SUBTYPE)
+ err = statmount_string(s, STATMOUNT_FS_SUBTYPE);
+
+ if (!err && s->mask & STATMOUNT_SB_SOURCE)
+ err = statmount_string(s, STATMOUNT_SB_SOURCE);
+
if (!err && s->mask & STATMOUNT_MNT_NS_ID)
statmount_mnt_ns_id(s, ns);
}
#define STATMOUNT_STRING_REQ (STATMOUNT_MNT_ROOT | STATMOUNT_MNT_POINT | \
- STATMOUNT_FS_TYPE | STATMOUNT_MNT_OPTS)
+ STATMOUNT_FS_TYPE | STATMOUNT_MNT_OPTS | \
+ STATMOUNT_FS_SUBTYPE | STATMOUNT_SB_SOURCE | \
+ STATMOUNT_OPT_ARRAY | STATMOUNT_OPT_SEC_ARRAY)
static int prepare_kstatmount(struct kstatmount *ks, struct mnt_id_req *kreq,
struct statmount __user *buf, size_t bufsize,
destroy_unhashed_deleg(dp);
}
+/**
+ * revoke_delegation - perform nfs4 delegation structure cleanup
+ * @dp: pointer to the delegation
+ *
+ * This function assumes that it's called either from the administrative
+ * interface (nfsd4_revoke_states()) that's revoking a specific delegation
+ * stateid or it's called from a laundromat thread (nfsd4_laundromat()) that
+ * determined that this specific state has expired and needs to be revoked
+ * (both mark state with the appropriate stid sc_status mode). It is also
+ * assumed that a reference was taken on the @dp state.
+ *
+ * If this function finds that the @dp state is SC_STATUS_FREED, it means
+ * that a FREE_STATEID operation for this stateid has been processed and
+ * we can proceed to removing it from the recall list. However, if the @dp state
+ * isn't marked SC_STATUS_FREED, it means we need to place it on the cl_revoked
+ * list and wait for the FREE_STATEID to arrive from the client. At the same
+ * time, we need to mark it as SC_STATUS_FREEABLE to indicate to the
+ * nfsd4_free_stateid() function that this stateid has already been added
+ * to the cl_revoked list and that nfsd4_free_stateid() is now responsible
+ * for removing it from the list. Inspection of where the delegation is in
+ * the revocation process is protected by clp->cl_lock.
+ */
static void revoke_delegation(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_stid.sc_client;
WARN_ON(!list_empty(&dp->dl_recall_lru));
+ WARN_ON_ONCE(!(dp->dl_stid.sc_status &
+ (SC_STATUS_REVOKED | SC_STATUS_ADMIN_REVOKED)));
trace_nfsd_stid_revoke(&dp->dl_stid);
- if (dp->dl_stid.sc_status &
- (SC_STATUS_REVOKED | SC_STATUS_ADMIN_REVOKED)) {
- spin_lock(&clp->cl_lock);
- refcount_inc(&dp->dl_stid.sc_count);
- list_add(&dp->dl_recall_lru, &clp->cl_revoked);
- spin_unlock(&clp->cl_lock);
+ spin_lock(&clp->cl_lock);
+ if (dp->dl_stid.sc_status & SC_STATUS_FREED) {
+ list_del_init(&dp->dl_recall_lru);
+ goto out;
}
+ list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+ dp->dl_stid.sc_status |= SC_STATUS_FREEABLE;
+out:
+ spin_unlock(&clp->cl_lock);
destroy_unhashed_deleg(dp);
}
mutex_unlock(&stp->st_mutex);
break;
case SC_TYPE_DELEG:
+ refcount_inc(&stid->sc_count);
dp = delegstateid(stid);
spin_lock(&state_lock);
if (!unhash_delegation_locked(
dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
if (!state_expired(&lt, dp->dl_time))
break;
+ refcount_inc(&dp->dl_stid.sc_count);
unhash_delegation_locked(dp, SC_STATUS_REVOKED);
list_add(&dp->dl_recall_lru, &reaplist);
}
switch (s->sc_type) {
case SC_TYPE_DELEG:
if (s->sc_status & SC_STATUS_REVOKED) {
+ s->sc_status |= SC_STATUS_CLOSED;
spin_unlock(&s->sc_lock);
dp = delegstateid(s);
- list_del_init(&dp->dl_recall_lru);
+ if (s->sc_status & SC_STATUS_FREEABLE)
+ list_del_init(&dp->dl_recall_lru);
+ s->sc_status |= SC_STATUS_FREED;
spin_unlock(&cl->cl_lock);
nfs4_put_stid(s);
ret = nfs_ok;
if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
return status;
- status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG, 0, &s, nn);
+ status = nfsd4_lookup_stateid(cstate, stateid, SC_TYPE_DELEG,
+ SC_STATUS_REVOKED | SC_STATUS_FREEABLE,
+ &s, nn);
if (status)
goto out;
dp = delegstateid(s);
fp = lock_stp->st_stid.sc_file;
switch (lock->lk_type) {
case NFS4_READW_LT:
- if (nfsd4_has_session(cstate) ||
- exportfs_lock_op_is_async(sb->s_export_op))
- flags |= FL_SLEEP;
fallthrough;
case NFS4_READ_LT:
spin_lock(&fp->fi_lock);
type = F_RDLCK;
break;
case NFS4_WRITEW_LT:
- if (nfsd4_has_session(cstate) ||
- exportfs_lock_op_is_async(sb->s_export_op))
- flags |= FL_SLEEP;
fallthrough;
case NFS4_WRITE_LT:
spin_lock(&fp->fi_lock);
goto out;
}
- /*
- * Most filesystems with their own ->lock operations will block
- * the nfsd thread waiting to acquire the lock. That leads to
- * deadlocks (we don't want every nfsd thread tied up waiting
- * for file locks), so don't attempt blocking lock notifications
- * on those filesystems:
- */
- if (!exportfs_lock_op_is_async(sb->s_export_op))
- flags &= ~FL_SLEEP;
+ if (lock->lk_type & (NFS4_READW_LT | NFS4_WRITEW_LT) &&
+ nfsd4_has_session(cstate) &&
+ locks_can_async_lock(nf->nf_file->f_op))
+ flags |= FL_SLEEP;
nbl = find_or_allocate_block(lock_sop, &fp->fi_fhandle, nn);
if (!nbl) {
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
shrinker_free(nn->nfsd_client_shrinker);
- cancel_work(&nn->nfsd_shrinker_work);
+ cancel_work_sync(&nn->nfsd_shrinker_work);
cancel_delayed_work_sync(&nn->laundromat_work);
locks_end_grace(&nn->nfsd4_manager);
trace_ocfs2_setattr(inode, dentry,
(unsigned long long)OCFS2_I(inode)->ip_blkno,
dentry->d_name.len, dentry->d_name.name,
- attr->ia_valid, attr->ia_mode,
- from_kuid(&init_user_ns, attr->ia_uid),
- from_kgid(&init_user_ns, attr->ia_gid));
+ attr->ia_valid,
+ attr->ia_valid & ATTR_MODE ? attr->ia_mode : 0,
+ attr->ia_valid & ATTR_UID ?
+ from_kuid(&init_user_ns, attr->ia_uid) : 0,
+ attr->ia_valid & ATTR_GID ?
+ from_kgid(&init_user_ns, attr->ia_gid) : 0);
/* ensuring we don't even attempt to truncate a symlink */
if (S_ISLNK(inode->i_mode))
return 0;
if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) {
+ int id_count = ocfs2_max_inline_data_with_xattr(inode->i_sb, di);
+
+ if (byte_start > id_count || byte_start + byte_len > id_count) {
+ ret = -EINVAL;
+ mlog_errno(ret);
+ goto out;
+ }
+
ret = ocfs2_truncate_inline(inode, di_bh, byte_start,
byte_start + byte_len, 0);
if (ret) {
.splice_write = iter_file_splice_write,
.fallocate = ocfs2_fallocate,
.remap_file_range = ocfs2_remap_file_range,
+ .fop_flags = FOP_ASYNC_LOCK,
};
WRAP_DIR_ITER(ocfs2_readdir) // FIXME!
#endif
.lock = ocfs2_lock,
.flock = ocfs2_flock,
+ .fop_flags = FOP_ASYNC_LOCK,
};
/*
#define IOP_NOFOLLOW 0x0004
#define IOP_XATTR 0x0008
#define IOP_DEFAULT_READLINK 0x0010
+#define IOP_MGTIME 0x0020
/*
* Keep mostly read-only and often accessed (especially for
struct timespec64 current_time(struct inode *inode);
struct timespec64 inode_set_ctime_current(struct inode *inode);
+struct timespec64 inode_set_ctime_deleg(struct inode *inode,
+ struct timespec64 update);
static inline time64_t inode_get_atime_sec(const struct inode *inode)
{
return inode_set_mtime_to_ts(inode, ts);
}
+/*
+ * Multigrain timestamps
+ *
+ * Conditionally use fine-grained ctime and mtime timestamps when there
+ * are users actively observing them via getattr. The primary use-case
+ * for this is NFS clients that use the ctime to distinguish between
+ * different states of the file, and that are often fooled by multiple
+ * operations that occur in the same coarse-grained timer tick.
+ */
+#define I_CTIME_QUERIED ((u32)BIT(31))
+
static inline time64_t inode_get_ctime_sec(const struct inode *inode)
{
return inode->i_ctime_sec;
static inline long inode_get_ctime_nsec(const struct inode *inode)
{
- return inode->i_ctime_nsec;
+ return inode->i_ctime_nsec & ~I_CTIME_QUERIED;
}
static inline struct timespec64 inode_get_ctime(const struct inode *inode)
return ts;
}
-static inline struct timespec64 inode_set_ctime_to_ts(struct inode *inode,
- struct timespec64 ts)
-{
- inode->i_ctime_sec = ts.tv_sec;
- inode->i_ctime_nsec = ts.tv_nsec;
- return ts;
-}
+struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timespec64 ts);
/**
* inode_set_ctime - set the ctime in the inode
#define FOP_HUGE_PAGES ((__force fop_flags_t)(1 << 4))
/* Treat loff_t as unsigned (e.g., /dev/mem) */
#define FOP_UNSIGNED_OFFSET ((__force fop_flags_t)(1 << 5))
+ /* Supports asynchronous lock callbacks */
+ #define FOP_ASYNC_LOCK ((__force fop_flags_t)(1 << 6))
/* Wrap a directory iterator that needs exclusive inode access */
int wrap_directory_iterator(struct file *, struct dir_context *,
#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */
#define FS_DISALLOW_NOTIFY_PERM 16 /* Disable fanotify permission events */
#define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vfs idmappings. */
+#define FS_MGTIME 64 /* FS uses multigrain timestamps */
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
int (*init_fs_context)(struct fs_context *);
const struct fs_parameter_spec *parameters;
#define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME)
+/**
+ * is_mgtime: is this inode using multigrain timestamps
+ * @inode: inode to test for multigrain timestamps
+ *
+ * Return true if the inode uses multigrain timestamps, false otherwise.
+ */
+static inline bool is_mgtime(const struct inode *inode)
+{
+ return inode->i_opflags & IOP_MGTIME;
+}
+
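+/*
+ * Illustration only (not part of the patch): a filesystem opts in to
+ * multigrain timestamps by setting FS_MGTIME in its file_system_type.
+ * inode_init_always() then marks each new inode with IOP_MGTIME, so
+ * is_mgtime() returns true for it. "myfs" and myfs_init_fs_context()
+ * are hypothetical:
+ *
+ *	static int myfs_init_fs_context(struct fs_context *fc);
+ *
+ *	static struct file_system_type myfs_fs_type = {
+ *		.owner		 = THIS_MODULE,
+ *		.name		 = "myfs",
+ *		.init_fs_context = myfs_init_fs_context,
+ *		.fs_flags	 = FS_MGTIME,
+ *	};
+ */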
extern struct dentry *mount_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int));
extern loff_t vfs_llseek(struct file *file, loff_t offset, int whence);
-extern int inode_init_always(struct super_block *, struct inode *);
+extern int inode_init_always_gfp(struct super_block *, struct inode *, gfp_t);
+static inline int inode_init_always(struct super_block *sb, struct inode *inode)
+{
+ return inode_init_always_gfp(sb, inode, GFP_NOFS);
+}
+
extern void inode_init_once(struct inode *);
extern void address_space_init_once(struct address_space *mapping);
extern struct inode * igrab(struct inode *);
extern int page_symlink(struct inode *inode, const char *symname, int len);
extern const struct inode_operations page_symlink_inode_operations;
extern void kfree_link(void *);
+void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *inode);
void generic_fillattr(struct mnt_idmap *, u32, struct inode *, struct kstat *);
void generic_fill_statx_attr(struct inode *inode, struct kstat *stat);
void generic_fill_statx_atomic_writes(struct kstat *stat,
loff_t isize, end_offset;
loff_t last_pos = ra->prev_pos;
+ if (unlikely(iocb->ki_pos < 0))
+ return -EINVAL;
if (unlikely(iocb->ki_pos >= inode->i_sb->s_maxbytes))
return 0;
if (unlikely(!iov_iter_count(iter)))
return 0;
- iov_iter_truncate(iter, inode->i_sb->s_maxbytes);
+ iov_iter_truncate(iter, inode->i_sb->s_maxbytes - iocb->ki_pos);
folio_batch_init(&fbatch);
do {