description: Shenzen Chuangsiqi Technology Co.,Ltd.
"^ctera,.*":
description: CTERA Networks Intl.
+ "^ctu,.*":
+ description: Czech Technical University in Prague
"^cubietech,.*":
description: Cubietech, Ltd.
"^cui,.*":
description: Sensirion AG
"^sensortek,.*":
description: Sensortek Technology Corporation
+ "^sercomm,.*":
+ description: Sercomm (Suzhou) Corporation
"^sff,.*":
description: Small Form Factor Committee
"^sgd,.*":
F: drivers/net/ethernet/amd/xgbe/
AMD SENSOR FUSION HUB DRIVER
S: Maintained
AQUACOMPUTER D5 NEXT PUMP SENSOR DRIVER
S: Maintained
F: Documentation/hwmon/aquacomputer_d5next.rst
F: Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml
F: drivers/mmc/host/sdhci-of-aspeed*
+ASPEED SMC SPI DRIVER
+S: Maintained
+F: Documentation/devicetree/bindings/spi/aspeed,ast2600-fmc.yaml
+F: drivers/spi/spi-aspeed-smc.c
+
ASPEED VIDEO ENGINE DRIVER
F: drivers/phy/phy-can-transceiver.c
F: include/linux/can/bittiming.h
F: include/linux/can/dev.h
- F: include/linux/can/led.h
F: include/linux/can/length.h
F: include/linux/can/platform/
F: include/linux/can/rx-offload.h
S: Maintained
F: Documentation/admin-guide/module-signing.rst
F: certs/
+F: scripts/check-blacklist-hashes.awk
F: scripts/sign-file.c
+F: tools/certs/
CFAG12864B LCD DRIVER
CHINESE DOCUMENTATION
S: Maintained
F: Documentation/translations/zh_CN/
F: Documentation/hwmon/corsair-psu.rst
F: drivers/hwmon/corsair-psu.c
- COSA/SRP SYNC SERIAL DRIVER
- S: Maintained
- W: http://www.fi.muni.cz/~kas/cosa/
- F: drivers/net/wan/cosa*
-
COUNTER SUBSYSTEM
F: Documentation/devicetree/bindings/media/allwinner,sun6i-a31-csi.yaml
F: drivers/media/platform/sunxi/sun6i-csi/
+ CTU CAN FD DRIVER
+ S: Maintained
+ F: Documentation/devicetree/bindings/net/can/ctu,ctucanfd.yaml
+ F: drivers/net/can/ctucanfd/
+
CW1200 WLAN driver
S: Maintained
S: Maintained
F: Documentation/translations/it_IT
+DOCUMENTATION/JAPANESE
+S: Maintained
+F: Documentation/translations/ja_JP
+
DONGWOON DW9714 LENS VOICE COIL DRIVER
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/execve
F: arch/alpha/kernel/binfmt_loader.c
-F: arch/x86/ia32/ia32_aout.c
F: fs/*binfmt_*.c
F: fs/exec.c
F: include/linux/binfmts.h
F: drivers/iio/*/hid-*
F: include/linux/hid-sensor-*
+HID WACOM DRIVER
+S: Maintained
+F: drivers/hid/wacom.h
+F: drivers/hid/wacom_*
+
HIGH-RESOLUTION TIMERS, CLOCKEVENTS
HIGH-SPEED SCC DRIVER FOR AX.25
S: Orphan
- F: drivers/net/hamradio/dmascc.c
F: drivers/net/hamradio/scc.c
HIGHPOINT ROCKETRAID 3xxx RAID DRIVER
T: git git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git
F: drivers/idle/intel_idle.c
+INTEL IN FIELD SCAN (IFS) DEVICE
+S: Maintained
+F: drivers/platform/x86/intel/ifs
+F: include/trace/events/intel_ifs.h
+
INTEL INTEGRATED SENSOR HUB DRIVER
F: include/keys/trusted_tee.h
F: security/keys/trusted-keys/trusted_tee.c
+KEYS-TRUSTED-CAAM
+S: Maintained
+F: include/keys/trusted_caam.h
+F: security/keys/trusted-keys/trusted_caam.c
+
KEYS/KEYRINGS
S: Supported
-F: Documentation/devicetree/bindings/mmc/marvell,xenon-sdhci.txt
+F: Documentation/devicetree/bindings/mmc/marvell,xenon-sdhci.yaml
F: drivers/mmc/host/sdhci-xenon*
+ MARVELL OCTEON ENDPOINT DRIVER
+ S: Supported
+ F: drivers/net/ethernet/marvell/octeon_ep
+
MATROX FRAMEBUFFER DRIVER
S: Orphan
F: Documentation/admin-guide/media/imx7.rst
F: Documentation/devicetree/bindings/media/nxp,imx-mipi-csi2.yaml
F: Documentation/devicetree/bindings/media/nxp,imx7-csi.yaml
-F: drivers/media/platform/imx/imx-mipi-csis.c
+F: drivers/media/platform/nxp/imx-mipi-csis.c
F: drivers/staging/media/imx/imx7-media-csi.c
MEDIA DRIVERS FOR HELENE
S: Maintained
T: git git://linuxtv.org/media_tree.git
-F: Documentation/devicetree/bindings/media/nvidia,tegra-vde.txt
+F: Documentation/devicetree/bindings/media/nvidia,tegra-vde.yaml
F: drivers/media/platform/nvidia/tegra-vde/
MEDIA DRIVERS FOR RENESAS - CEU
F: include/dt-bindings/memory/mt*-port.h
MEDIATEK JPEG DRIVER
S: Supported
-F: Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.txt
+F: Documentation/devicetree/bindings/media/mediatek-jpeg-*.yaml
F: drivers/media/platform/mediatek/jpeg/
MEDIATEK MDP DRIVER
S: Supported
-F: Documentation/devicetree/bindings/media/mediatek-vcodec.txt
+F: Documentation/devicetree/bindings/media/mediatek,vcodec*.yaml
F: Documentation/devicetree/bindings/media/mediatek-vpu.txt
F: drivers/media/platform/mediatek/vcodec/
F: drivers/media/platform/mediatek/vpu/
F: drivers/net/dsa/mt7530.*
F: net/dsa/tag_mtk.c
+ MEDIATEK T7XX 5G WWAN MODEM DRIVER
+ S: Supported
+ F: drivers/net/wwan/t7xx/
+
MEDIATEK USB3 DRD IP DRIVER
F: include/linux/platform_data/microchip-ksz.h
F: net/dsa/tag_ksz.c
+ MICROCHIP LAN87xx/LAN937x T1 PHY DRIVER
+ S: Maintained
+ F: drivers/net/phy/microchip_t1.c
+
MICROCHIP LAN743X ETHERNET DRIVER
S: Maintained
F: net/ncsi/
-NCT6775 HARDWARE MONITOR DRIVER
+NCT6775 HARDWARE MONITOR DRIVER - CORE & PLATFORM DRIVER
S: Maintained
F: Documentation/hwmon/nct6775.rst
-F: drivers/hwmon/nct6775.c
+F: drivers/hwmon/nct6775-core.c
+F: drivers/hwmon/nct6775-platform.c
+F: drivers/hwmon/nct6775.h
+
+NCT6775 HARDWARE MONITOR DRIVER - I2C DRIVER
+S: Maintained
+F: Documentation/devicetree/bindings/hwmon/nuvoton,nct6775.yaml
+F: drivers/hwmon/nct6775-i2c.c
NETDEVSIM
F: include/trace/events/mptcp.h
F: include/uapi/linux/mptcp.h
F: net/mptcp/
+ F: tools/testing/selftests/bpf/*/*mptcp*.c
F: tools/testing/selftests/net/mptcp/
NETWORKING [TCP]
S: Maintained
F: Documentation/devicetree/bindings/media/nxp,imx8-jpeg.yaml
-F: drivers/media/platform/imx-jpeg
+F: drivers/media/platform/nxp/imx-jpeg
NZXT-KRAKEN2 HARDWARE MONITORING DRIVER
F: include/linux/padata.h
F: kernel/padata.c
+PAGE CACHE
+S: Supported
+T: git git://git.infradead.org/users/willy/pagecache.git
+F: Documentation/filesystems/locking.rst
+F: Documentation/filesystems/vfs.rst
+F: include/linux/pagemap.h
+F: mm/filemap.c
+F: mm/page-writeback.c
+F: mm/readahead.c
+F: mm/truncate.c
+
PAGE POOL
PRINTK INDEXING
S: Maintained
+F: Documentation/core-api/printk-index.rst
F: kernel/printk/index.c
+K: printk_index
PROC FILESYSTEM
F: Documentation/admin-guide/media/pulse8-cec.rst
F: drivers/media/cec/usb/pulse8/
+ PURELIFI PLFXLC DRIVER
+ S: Supported
+ F: drivers/net/wireless/purelifi/plfxlc/
+
PVRUSB2 VIDEO4LINUX DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core
SILICON LABS WIRELESS DRIVERS (for WFxxx series)
S: Supported
- F: Documentation/devicetree/bindings/staging/net/wireless/silabs,wfx.yaml
- F: drivers/staging/wfx/
+ F: Documentation/devicetree/bindings/net/wireless/silabs,wfx.yaml
+ F: drivers/net/wireless/silabs/wfx/
SILICON MOTION SM712 FRAME BUFFER DRIVER
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git
S: Maintained
F: drivers/net/ethernet/dlink/sundance.c
+ SUNPLUS ETHERNET DRIVER
+ S: Maintained
+ W: https://sunplus.atlassian.net/wiki/spaces/doc/overview
+ F: Documentation/devicetree/bindings/net/sunplus,sp7021-emac.yaml
+ F: drivers/net/ethernet/sunplus/
+
SUNPLUS OCOTP DRIVER
S: Maintained
F: include/linux/cpu_cooling.h
F: include/linux/thermal.h
F: include/uapi/linux/thermal.h
+F: tools/lib/thermal/
F: tools/thermal/
THERMAL DRIVER FOR AMLOGIC SOCS
TMIO/SDHI MMC DRIVER
S: Supported
F: drivers/mmc/host/renesas_sdhi*
F: drivers/mmc/host/tmio_mmc*
S: Maintained
+F: Documentation/devicetree/bindings/hwmon/ti,tmp401.yaml
F: Documentation/hwmon/tmp401.rst
F: drivers/hwmon/tmp401.c
USB VIDEO CLASS
S: Maintained
W: http://www.ideasonboard.org/uvc/
XDP SOCKETS (AF_XDP)
S: Maintained
- F: Documentation/devicetree/bindings/net/can/xilinx_can.txt
+ F: Documentation/devicetree/bindings/net/can/xilinx,can.yaml
F: drivers/net/can/xilinx_can.c
XILINX GPIO DRIVER
}
}
-#if defined(CONFIG_RETPOLINE) && defined(CONFIG_STACK_VALIDATION)
+#if defined(CONFIG_RETPOLINE) && defined(CONFIG_OBJTOOL)
/*
* CALL/JMP *%\reg
}
}
-#else /* !RETPOLINES || !CONFIG_STACK_VALIDATION */
+#else /* !CONFIG_RETPOLINE || !CONFIG_OBJTOOL */
void __init_or_module noinline apply_retpolines(s32 *start, s32 *end) { }
-#endif /* CONFIG_RETPOLINE && CONFIG_STACK_VALIDATION */
+#endif /* CONFIG_RETPOLINE && CONFIG_OBJTOOL */
#ifdef CONFIG_X86_KERNEL_IBT
__ro_after_init struct mm_struct *poking_mm;
__ro_after_init unsigned long poking_addr;
- static void *__text_poke(void *addr, const void *opcode, size_t len)
+ static void text_poke_memcpy(void *dst, const void *src, size_t len)
+ {
+ memcpy(dst, src, len);
+ }
+
+ static void text_poke_memset(void *dst, const void *src, size_t len)
+ {
+ int c = *(const int *)src;
+
+ memset(dst, c, len);
+ }
+
+ typedef void text_poke_f(void *dst, const void *src, size_t len);
+
+ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t len)
{
bool cross_page_boundary = offset_in_page(addr) + len > PAGE_SIZE;
struct page *pages[2] = {NULL};
prev = use_temporary_mm(poking_mm);
kasan_disable_current();
- memcpy((u8 *)poking_addr + offset_in_page(addr), opcode, len);
+ func((u8 *)poking_addr + offset_in_page(addr), src, len);
kasan_enable_current();
/*
(cross_page_boundary ? 2 : 1) * PAGE_SIZE,
PAGE_SHIFT, false);
- /*
- * If the text does not match what we just wrote then something is
- * fundamentally screwy; there's nothing we can really do about that.
- */
- BUG_ON(memcmp(addr, opcode, len));
+ if (func == text_poke_memcpy) {
+ /*
+ * If the text does not match what we just wrote then something is
+ * fundamentally screwy; there's nothing we can really do about that.
+ */
+ BUG_ON(memcmp(addr, src, len));
+ }
local_irq_restore(flags);
pte_unmap_unlock(ptep, ptl);
{
lockdep_assert_held(&text_mutex);
- return __text_poke(addr, opcode, len);
+ return __text_poke(text_poke_memcpy, addr, opcode, len);
}
/**
*/
void *text_poke_kgdb(void *addr, const void *opcode, size_t len)
{
- return __text_poke(addr, opcode, len);
+ return __text_poke(text_poke_memcpy, addr, opcode, len);
}
/**
s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched);
- __text_poke((void *)ptr, opcode + patched, s);
+ __text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, s);
+ patched += s;
+ }
+ mutex_unlock(&text_mutex);
+ return addr;
+ }
+
+ /**
+ * text_poke_set - memset into (an unused part of) RX memory
+ * @addr: address to modify
+ * @c: the byte to fill the area with
+ * @len: length to copy, could be more than 2x PAGE_SIZE
+ *
+ * This is useful to overwrite unused regions of RX memory with illegal
+ * instructions.
+ */
+ void *text_poke_set(void *addr, int c, size_t len)
+ {
+ unsigned long start = (unsigned long)addr;
+ size_t patched = 0;
+
+ if (WARN_ON_ONCE(core_kernel_text(start)))
+ return NULL;
+
+ mutex_lock(&text_mutex);
+ while (patched < len) {
+ unsigned long ptr = start + patched;
+ size_t s;
+
+ s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched);
+
+ __text_poke(text_poke_memset, (void *)ptr, (void *)&c, s);
patched += s;
}
mutex_unlock(&text_mutex);
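Taken together with the kernel-doc above, a caller that wants to clobber a released region with trapping instructions would look roughly like the sketch below; the address and length names are hypothetical, and on x86 the fill byte would typically be 0xCC (INT3):

/* editor's sketch, not part of the patch */
void *ret;

ret = text_poke_set(freed_image, 0xcc, freed_image_size);	/* fill with INT3 */
if (!ret)
	pr_warn("text_poke_set: refused to touch core kernel text\n");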
* prepare to perform part of a write to a page
*/
int afs_write_begin(struct file *file, struct address_space *mapping,
- loff_t pos, unsigned len, unsigned flags,
+ loff_t pos, unsigned len,
struct page **_page, void **fsdata)
{
struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
* file. We need to do this before we get a lock on the page in case
* there's more than one writer competing for the same cache block.
*/
- ret = netfs_write_begin(file, mapping, pos, len, flags, &folio, fsdata);
+ ret = netfs_write_begin(file, mapping, pos, len, &folio, fsdata);
if (ret < 0)
return ret;
case -EKEYEXPIRED:
case -EKEYREJECTED:
case -EKEYREVOKED:
+ case -ENETRESET:
afs_redirty_pages(wbc, mapping, start, len);
mapping_set_error(mapping, ret);
break;
#include <linux/slab.h>
#include <linux/percpu-refcount.h>
#include <linux/bpfptr.h>
+ #include <linux/btf.h>
struct bpf_verifier_env;
struct bpf_verifier_log;
int (*map_push_elem)(struct bpf_map *map, void *value, u64 flags);
int (*map_pop_elem)(struct bpf_map *map, void *value);
int (*map_peek_elem)(struct bpf_map *map, void *value);
+ void *(*map_lookup_percpu_elem)(struct bpf_map *map, void *key, u32 cpu);
/* funcs called by prog_array and perf_event_array map */
void *(*map_fd_get_ptr)(struct bpf_map *map, struct file *map_file,
bpf_callback_t callback_fn,
void *callback_ctx, u64 flags);
- /* BTF name and id of struct allocated by map_alloc */
- const char * const map_btf_name;
+ /* BTF id of struct allocated by map_alloc */
int *map_btf_id;
/* bpf_iter info used to open a seq_file */
const struct bpf_iter_seq_info *iter_seq_info;
};
+ enum {
+ /* Support at most 8 pointers in a BPF map value */
+ BPF_MAP_VALUE_OFF_MAX = 8,
+ BPF_MAP_OFF_ARR_MAX = BPF_MAP_VALUE_OFF_MAX +
+ 1 + /* for bpf_spin_lock */
+ 1, /* for bpf_timer */
+ };
+
+ enum bpf_kptr_type {
+ BPF_KPTR_UNREF,
+ BPF_KPTR_REF,
+ };
+
+ struct bpf_map_value_off_desc {
+ u32 offset;
+ enum bpf_kptr_type type;
+ struct {
+ struct btf *btf;
+ struct module *module;
+ btf_dtor_kfunc_t dtor;
+ u32 btf_id;
+ } kptr;
+ };
+
+ struct bpf_map_value_off {
+ u32 nr_off;
+ struct bpf_map_value_off_desc off[];
+ };
+
+ struct bpf_map_off_arr {
+ u32 cnt;
+ u32 field_off[BPF_MAP_OFF_ARR_MAX];
+ u8 field_sz[BPF_MAP_OFF_ARR_MAX];
+ };
+
struct bpf_map {
/* The first two cachelines with read-mostly members of which some
* are also accessed in fast-path (e.g. ops, max_entries).
u64 map_extra; /* any per-map-type extra fields */
u32 map_flags;
int spin_lock_off; /* >=0 valid offset, <0 error */
+ struct bpf_map_value_off *kptr_off_tab;
int timer_off; /* >=0 valid offset, <0 error */
u32 id;
int numa_node;
struct mem_cgroup *memcg;
#endif
char name[BPF_OBJ_NAME_LEN];
- bool bypass_spec_v1;
- bool frozen; /* write-once; write-protected by freeze_mutex */
- /* 14 bytes hole */
-
+ struct bpf_map_off_arr *off_arr;
/* The 3rd and 4th cacheline with misc members to avoid false sharing
* particularly with refcounting.
*/
bool jited;
bool xdp_has_frags;
} owner;
+ bool bypass_spec_v1;
+ bool frozen; /* write-once; write-protected by freeze_mutex */
};
static inline bool map_value_has_spin_lock(const struct bpf_map *map)
return map->timer_off >= 0;
}
+ static inline bool map_value_has_kptrs(const struct bpf_map *map)
+ {
+ return !IS_ERR_OR_NULL(map->kptr_off_tab);
+ }
+
static inline void check_and_init_map_value(struct bpf_map *map, void *dst)
{
if (unlikely(map_value_has_spin_lock(map)))
memset(dst + map->spin_lock_off, 0, sizeof(struct bpf_spin_lock));
if (unlikely(map_value_has_timer(map)))
memset(dst + map->timer_off, 0, sizeof(struct bpf_timer));
+ if (unlikely(map_value_has_kptrs(map))) {
+ struct bpf_map_value_off *tab = map->kptr_off_tab;
+ int i;
+
+ for (i = 0; i < tab->nr_off; i++)
+ *(u64 *)(dst + tab->off[i].offset) = 0;
+ }
}
/* copy everything but bpf_spin_lock and bpf_timer. There could be one of each. */
static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
{
- u32 s_off = 0, s_sz = 0, t_off = 0, t_sz = 0;
+ u32 curr_off = 0;
+ int i;
- if (unlikely(map_value_has_spin_lock(map))) {
- s_off = map->spin_lock_off;
- s_sz = sizeof(struct bpf_spin_lock);
- }
- if (unlikely(map_value_has_timer(map))) {
- t_off = map->timer_off;
- t_sz = sizeof(struct bpf_timer);
+ if (likely(!map->off_arr)) {
+ memcpy(dst, src, map->value_size);
+ return;
}
- if (unlikely(s_sz || t_sz)) {
- if (s_off < t_off || !s_sz) {
- swap(s_off, t_off);
- swap(s_sz, t_sz);
- }
- memcpy(dst, src, t_off);
- memcpy(dst + t_off + t_sz,
- src + t_off + t_sz,
- s_off - t_off - t_sz);
- memcpy(dst + s_off + s_sz,
- src + s_off + s_sz,
- map->value_size - s_off - s_sz);
- } else {
- memcpy(dst, src, map->value_size);
+ for (i = 0; i < map->off_arr->cnt; i++) {
+ u32 next_off = map->off_arr->field_off[i];
+
+ memcpy(dst + curr_off, src + curr_off, next_off - curr_off);
+ curr_off += map->off_arr->field_sz[i];
}
+ memcpy(dst + curr_off, src + curr_off, map->value_size - curr_off);
}
void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
bool lock_src);
*/
MEM_PERCPU = BIT(4 + BPF_BASE_TYPE_BITS),
- __BPF_TYPE_LAST_FLAG = MEM_PERCPU,
+ /* Indicates that the argument will be released. */
+ OBJ_RELEASE = BIT(5 + BPF_BASE_TYPE_BITS),
+
+ /* PTR is not trusted. This is only used with PTR_TO_BTF_ID, to mark
+ * unreferenced and referenced kptr loaded from map value using a load
+ * instruction, so that they can only be dereferenced but not escape the
+ * BPF program into the kernel (i.e. cannot be passed as arguments to
+ * kfunc or bpf helpers).
+ */
+ PTR_UNTRUSTED = BIT(6 + BPF_BASE_TYPE_BITS),
+
+ MEM_UNINIT = BIT(7 + BPF_BASE_TYPE_BITS),
+
+ /* DYNPTR points to memory local to the bpf program. */
+ DYNPTR_TYPE_LOCAL = BIT(8 + BPF_BASE_TYPE_BITS),
+
+ /* DYNPTR points to a ringbuf record. */
+ DYNPTR_TYPE_RINGBUF = BIT(9 + BPF_BASE_TYPE_BITS),
+
+ __BPF_TYPE_FLAG_MAX,
+ __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1,
};
+ #define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF)
+
/* Max number of base types. */
#define BPF_BASE_TYPE_LIMIT (1UL << BPF_BASE_TYPE_BITS)
ARG_CONST_MAP_PTR, /* const argument used as pointer to bpf_map */
ARG_PTR_TO_MAP_KEY, /* pointer to stack used as map key */
ARG_PTR_TO_MAP_VALUE, /* pointer to stack used as map value */
- ARG_PTR_TO_UNINIT_MAP_VALUE, /* pointer to valid memory used to store a map value */
- /* the following constraints used to prototype bpf_memcmp() and other
- * functions that access data on eBPF program stack
+ /* Used to prototype bpf_memcmp() and other functions that access data
+ * on eBPF program stack
*/
ARG_PTR_TO_MEM, /* pointer to valid memory (stack, packet, map value) */
- ARG_PTR_TO_UNINIT_MEM, /* pointer to memory does not need to be initialized,
- * helper function must fill all bytes or clear
- * them in error case.
- */
ARG_CONST_SIZE, /* number of bytes accessed from memory */
ARG_CONST_SIZE_OR_ZERO, /* number of bytes accessed from memory or 0 */
ARG_PTR_TO_STACK, /* pointer to stack */
ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */
ARG_PTR_TO_TIMER, /* pointer to bpf_timer */
+ ARG_PTR_TO_KPTR, /* pointer to referenced kptr */
+ ARG_PTR_TO_DYNPTR, /* pointer to bpf_dynptr. See bpf_type_flag for dynptr type */
__BPF_ARG_TYPE_MAX,
/* Extended arg_types. */
ARG_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_SOCKET,
ARG_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_ALLOC_MEM,
ARG_PTR_TO_STACK_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_STACK,
+ ARG_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_BTF_ID,
+ /* pointer to memory does not need to be initialized, helper function must fill
+ * all bytes or clear them in error case.
+ */
+ ARG_PTR_TO_UNINIT_MEM = MEM_UNINIT | ARG_PTR_TO_MEM,
/* This must be the last entry. Its purpose is to ensure the enum is
* wide enough to hold the higher bits reserved for bpf_type_flag.
RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
+ RET_PTR_TO_DYNPTR_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
/* This must be the last entry. Its purpose is to ensure the enum is
#define BPF_TRAMP_F_RET_FENTRY_RET BIT(4)
/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
- * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
+ * bytes on x86.
*/
- #define BPF_MAX_TRAMP_PROGS 38
+ #define BPF_MAX_TRAMP_LINKS 38
- struct bpf_tramp_progs {
- struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS];
- int nr_progs;
+ struct bpf_tramp_links {
+ struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
+ int nr_links;
};
+ struct bpf_tramp_run_ctx;
+
/* Different use cases for BPF trampoline:
* 1. replace nop at the function entry (kprobe equivalent)
* flags = BPF_TRAMP_F_RESTORE_REGS
struct bpf_tramp_image;
int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_tramp_progs *tprogs,
+ struct bpf_tramp_links *tlinks,
void *orig_call);
/* these two functions are called from generated trampoline */
- u64 notrace __bpf_prog_enter(struct bpf_prog *prog);
- void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);
- u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog);
- void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start);
+ u64 notrace __bpf_prog_enter(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
+ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_run_ctx *run_ctx);
+ u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
+ void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
+ struct bpf_tramp_run_ctx *run_ctx);
void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
{
return bpf_func(ctx, insnsi);
}
+
#ifdef CONFIG_BPF_JIT
- int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr);
- int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr);
+ int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
+ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
struct bpf_trampoline *bpf_trampoline_get(u64 key,
struct bpf_attach_target_info *tgt_info);
void bpf_trampoline_put(struct bpf_trampoline *tr);
void bpf_jit_uncharge_modmem(u32 size);
bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
#else
- static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
+ static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr)
{
return -ENOTSUPP;
}
- static inline int bpf_trampoline_unlink_prog(struct bpf_prog *prog,
+ static inline int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr)
{
return -ENOTSUPP;
bool tail_call_reachable;
bool xdp_has_frags;
bool use_bpf_prog_pack;
- struct hlist_node tramp_hlist;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
/* function name for valid attach_btf_id */
struct bpf_link_info *info);
};
+ struct bpf_tramp_link {
+ struct bpf_link link;
+ struct hlist_node tramp_hlist;
+ u64 cookie;
+ };
+
+ struct bpf_tracing_link {
+ struct bpf_tramp_link link;
+ enum bpf_attach_type attach_type;
+ struct bpf_trampoline *trampoline;
+ struct bpf_prog *tgt_prog;
+ };
+
struct bpf_link_primer {
struct bpf_link *link;
struct file *file;
void bpf_struct_ops_put(const void *kdata);
int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, void *key,
void *value);
- int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_progs *tprogs,
- struct bpf_prog *prog,
+ int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
+ struct bpf_tramp_link *link,
const struct btf_func_model *model,
void *image, void *image_end);
static inline bool bpf_try_module_get(const void *data, struct module *owner)
/* an array of programs to be executed under rcu_lock.
*
* Typical usage:
- * ret = BPF_PROG_RUN_ARRAY(&bpf_prog_array, ctx, bpf_prog_run);
+ * ret = bpf_prog_run_array(rcu_dereference(&bpf_prog_array), ctx, bpf_prog_run);
*
* the structure returned by bpf_prog_array_alloc() should be populated
* with program pointers and the last pointer must be NULL.
u64 bpf_cookie;
};
+ struct bpf_tramp_run_ctx {
+ struct bpf_run_ctx run_ctx;
+ u64 bpf_cookie;
+ struct bpf_run_ctx *saved_run_ctx;
+ };
+
static inline struct bpf_run_ctx *bpf_set_run_ctx(struct bpf_run_ctx *new_ctx)
{
struct bpf_run_ctx *old_ctx = NULL;
typedef u32 (*bpf_prog_run_fn)(const struct bpf_prog *prog, const void *ctx);
- static __always_inline int
- BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
- const void *ctx, bpf_prog_run_fn run_prog,
- int retval, u32 *ret_flags)
- {
- const struct bpf_prog_array_item *item;
- const struct bpf_prog *prog;
- const struct bpf_prog_array *array;
- struct bpf_run_ctx *old_run_ctx;
- struct bpf_cg_run_ctx run_ctx;
- u32 func_ret;
-
- run_ctx.retval = retval;
- migrate_disable();
- rcu_read_lock();
- array = rcu_dereference(array_rcu);
- item = &array->items[0];
- old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
- while ((prog = READ_ONCE(item->prog))) {
- run_ctx.prog_item = item;
- func_ret = run_prog(prog, ctx);
- if (!(func_ret & 1) && !IS_ERR_VALUE((long)run_ctx.retval))
- run_ctx.retval = -EPERM;
- *(ret_flags) |= (func_ret >> 1);
- item++;
- }
- bpf_reset_run_ctx(old_run_ctx);
- rcu_read_unlock();
- migrate_enable();
- return run_ctx.retval;
- }
-
- static __always_inline int
- BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
- const void *ctx, bpf_prog_run_fn run_prog,
- int retval)
- {
- const struct bpf_prog_array_item *item;
- const struct bpf_prog *prog;
- const struct bpf_prog_array *array;
- struct bpf_run_ctx *old_run_ctx;
- struct bpf_cg_run_ctx run_ctx;
-
- run_ctx.retval = retval;
- migrate_disable();
- rcu_read_lock();
- array = rcu_dereference(array_rcu);
- item = &array->items[0];
- old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
- while ((prog = READ_ONCE(item->prog))) {
- run_ctx.prog_item = item;
- if (!run_prog(prog, ctx) && !IS_ERR_VALUE((long)run_ctx.retval))
- run_ctx.retval = -EPERM;
- item++;
- }
- bpf_reset_run_ctx(old_run_ctx);
- rcu_read_unlock();
- migrate_enable();
- return run_ctx.retval;
- }
-
static __always_inline u32
- BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
+ bpf_prog_run_array(const struct bpf_prog_array *array,
const void *ctx, bpf_prog_run_fn run_prog)
{
const struct bpf_prog_array_item *item;
const struct bpf_prog *prog;
- const struct bpf_prog_array *array;
struct bpf_run_ctx *old_run_ctx;
struct bpf_trace_run_ctx run_ctx;
u32 ret = 1;
- migrate_disable();
- rcu_read_lock();
- array = rcu_dereference(array_rcu);
+ RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "no rcu lock held");
+
if (unlikely(!array))
- goto out;
+ return ret;
+
+ migrate_disable();
old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
item = &array->items[0];
while ((prog = READ_ONCE(item->prog))) {
item++;
}
bpf_reset_run_ctx(old_run_ctx);
- out:
- rcu_read_unlock();
migrate_enable();
return ret;
}
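With the RCU_LOCKDEP_WARN() above, the RCU read-side critical section is now the caller's responsibility, matching the updated "Typical usage" comment; a call site would look roughly like this (my_prog_array is a hypothetical name):

/* editor's sketch, not part of the patch */
rcu_read_lock();
ret = bpf_prog_run_array(rcu_dereference(my_prog_array), ctx, bpf_prog_run);
rcu_read_unlock();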
- /* To be used by __cgroup_bpf_run_filter_skb for EGRESS BPF progs
- * so BPF programs can request cwr for TCP packets.
- *
- * Current cgroup skb programs can only return 0 or 1 (0 to drop the
- * packet. This macro changes the behavior so the low order bit
- * indicates whether the packet should be dropped (0) or not (1)
- * and the next bit is a congestion notification bit. This could be
- * used by TCP to call tcp_enter_cwr()
- *
- * Hence, new allowed return values of CGROUP EGRESS BPF programs are:
- * 0: drop packet
- * 1: keep packet
- * 2: drop packet and cn
- * 3: keep packet and cn
- *
- * This macro then converts it to one of the NET_XMIT or an error
- * code that is then interpreted as drop packet (and no cn):
- * 0: NET_XMIT_SUCCESS skb should be transmitted
- * 1: NET_XMIT_DROP skb should be dropped and cn
- * 2: NET_XMIT_CN skb should be transmitted and cn
- * 3: -err skb should be dropped
- */
- #define BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(array, ctx, func) \
- ({ \
- u32 _flags = 0; \
- bool _cn; \
- u32 _ret; \
- _ret = BPF_PROG_RUN_ARRAY_CG_FLAGS(array, ctx, func, 0, &_flags); \
- _cn = _flags & BPF_RET_SET_CN; \
- if (_ret && !IS_ERR_VALUE((long)_ret)) \
- _ret = -EFAULT; \
- if (!_ret) \
- _ret = (_cn ? NET_XMIT_CN : NET_XMIT_SUCCESS); \
- else \
- _ret = (_cn ? NET_XMIT_DROP : _ret); \
- _ret; \
- })
-
#ifdef CONFIG_BPF_SYSCALL
DECLARE_PER_CPU(int, bpf_prog_active);
extern struct mutex bpf_stats_enabled_mutex;
void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock);
void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock);
+ struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset);
+ void bpf_map_free_kptr_off_tab(struct bpf_map *map);
+ struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map);
+ bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b);
+ void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
+
struct bpf_map *bpf_map_get(u32 ufd);
struct bpf_map *bpf_map_get_with_uref(u32 ufd);
struct bpf_map *__bpf_map_get(struct fd f);
int bpf_link_new_fd(struct bpf_link *link);
struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
struct bpf_link *bpf_link_get_from_fd(u32 ufd);
+ struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
int bpf_obj_get_user(const char __user *pathname, int flags);
u32 *next_btf_id, enum bpf_type_flag *flag);
bool btf_struct_ids_match(struct bpf_verifier_log *log,
const struct btf *btf, u32 id, int off,
- const struct btf *need_btf, u32 need_type_id);
+ const struct btf *need_btf, u32 need_type_id,
+ bool strict);
int btf_distill_func_proto(struct bpf_verifier_log *log,
struct btf *btf,
struct net_device *netdev);
bool bpf_offload_dev_match(struct bpf_prog *prog, struct net_device *netdev);
+void unpriv_ebpf_notify(int new_state);
+
#if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL)
int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr);
extern const struct bpf_func_proto bpf_map_push_elem_proto;
extern const struct bpf_func_proto bpf_map_pop_elem_proto;
extern const struct bpf_func_proto bpf_map_peek_elem_proto;
+ extern const struct bpf_func_proto bpf_map_lookup_percpu_elem_proto;
extern const struct bpf_func_proto bpf_get_prandom_u32_proto;
extern const struct bpf_func_proto bpf_get_smp_processor_id_proto;
extern const struct bpf_func_proto bpf_ringbuf_submit_proto;
extern const struct bpf_func_proto bpf_ringbuf_discard_proto;
extern const struct bpf_func_proto bpf_ringbuf_query_proto;
+ extern const struct bpf_func_proto bpf_ringbuf_reserve_dynptr_proto;
+ extern const struct bpf_func_proto bpf_ringbuf_submit_dynptr_proto;
+ extern const struct bpf_func_proto bpf_ringbuf_discard_dynptr_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp6_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_unix_sock_proto;
+ extern const struct bpf_func_proto bpf_skc_to_mptcp_sock_proto;
extern const struct bpf_func_proto bpf_copy_from_user_proto;
extern const struct bpf_func_proto bpf_snprintf_btf_proto;
extern const struct bpf_func_proto bpf_snprintf_proto;
extern const struct bpf_func_proto bpf_loop_proto;
extern const struct bpf_func_proto bpf_strncmp_proto;
extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
+ extern const struct bpf_func_proto bpf_kptr_xchg_proto;
const struct bpf_func_proto *tracing_prog_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
void *addr1, void *addr2);
void *bpf_arch_text_copy(void *dst, void *src, size_t len);
+ int bpf_arch_text_invalidate(void *dst, size_t len);
struct btf_id_set;
bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
u32 **bin_buf, u32 num_args);
void bpf_bprintf_cleanup(void);
+ /* the implementation of the opaque uapi struct bpf_dynptr */
+ struct bpf_dynptr_kern {
+ void *data;
+ /* Size represents the number of usable bytes of dynptr data.
+ * If for example the offset is at 4 for a local dynptr whose data is
+ * of type u64, the number of usable bytes is 4.
+ *
+ * The upper 8 bits are reserved, with the full 32-bit layout as follows:
+ * Bits 0 - 23 = size
+ * Bits 24 - 30 = dynptr type
+ * Bit 31 = whether dynptr is read-only
+ */
+ u32 size;
+ u32 offset;
+ } __aligned(8);
+
+ enum bpf_dynptr_type {
+ BPF_DYNPTR_TYPE_INVALID,
+ /* Points to memory that is local to the bpf program */
+ BPF_DYNPTR_TYPE_LOCAL,
+ /* Underlying data is a ringbuf record */
+ BPF_DYNPTR_TYPE_RINGBUF,
+ };
+
+ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
+ enum bpf_dynptr_type type, u32 offset, u32 size);
+ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
+ int bpf_dynptr_check_size(u32 size);
+
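The bit layout described in the comment above suggests accessors along the following lines; the mask and bit names/values here are illustrative assumptions by the editor, not taken from the patch:

/* editor's sketch, not part of the patch */
#define DYNPTR_SIZE_MASK	0x00FFFFFF	/* bits 0-23: usable size */
#define DYNPTR_RDONLY_BIT	BIT(31)		/* bit 31: read-only flag */

static inline u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
{
	return ptr->size & DYNPTR_SIZE_MASK;
}

static inline bool bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
{
	return ptr->size & DYNPTR_RDONLY_BIT;
}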
#endif /* _LINUX_BPF_H */
#include <linux/binfmts.h>
#include <linux/sched/sysctl.h>
#include <linux/kexec.h>
- #include <linux/bpf.h>
#include <linux/mount.h>
#include <linux/userfaultfd_k.h>
#include <linux/latencytop.h>
#endif /* CONFIG_SYSCTL */
- #if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_SYSCTL)
- static int bpf_stats_handler(struct ctl_table *table, int write,
- void *buffer, size_t *lenp, loff_t *ppos)
- {
- struct static_key *key = (struct static_key *)table->data;
- static int saved_val;
- int val, ret;
- struct ctl_table tmp = {
- .data = &val,
- .maxlen = sizeof(val),
- .mode = table->mode,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- };
-
- if (write && !capable(CAP_SYS_ADMIN))
- return -EPERM;
-
- mutex_lock(&bpf_stats_enabled_mutex);
- val = saved_val;
- ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
- if (write && !ret && val != saved_val) {
- if (val)
- static_key_slow_inc(key);
- else
- static_key_slow_dec(key);
- saved_val = val;
- }
- mutex_unlock(&bpf_stats_enabled_mutex);
- return ret;
- }
-
- void __weak unpriv_ebpf_notify(int new_state)
- {
- }
-
- static int bpf_unpriv_handler(struct ctl_table *table, int write,
- void *buffer, size_t *lenp, loff_t *ppos)
- {
- int ret, unpriv_enable = *(int *)table->data;
- bool locked_state = unpriv_enable == 1;
- struct ctl_table tmp = *table;
-
- if (write && !capable(CAP_SYS_ADMIN))
- return -EPERM;
-
- tmp.data = &unpriv_enable;
- ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
- if (write && !ret) {
- if (locked_state && unpriv_enable != 1)
- return -EPERM;
- *(int *)table->data = unpriv_enable;
- }
-
- unpriv_ebpf_notify(unpriv_enable);
-
- return ret;
- }
- #endif /* CONFIG_BPF_SYSCALL && CONFIG_SYSCTL */
-
/*
* /proc/sys support
*/
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
},
- #ifdef CONFIG_BPF_SYSCALL
- {
- .procname = "unprivileged_bpf_disabled",
- .data = &sysctl_unprivileged_bpf_disabled,
- .maxlen = sizeof(sysctl_unprivileged_bpf_disabled),
- .mode = 0644,
- .proc_handler = bpf_unpriv_handler,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_TWO,
- },
- {
- .procname = "bpf_stats_enabled",
- .data = &bpf_stats_enabled_key.key,
- .maxlen = sizeof(bpf_stats_enabled_key),
- .mode = 0644,
- .proc_handler = bpf_stats_handler,
- },
- #endif
- #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
- {
- .procname = "timer_migration",
- .data = &sysctl_timer_migration,
- .maxlen = sizeof(unsigned int),
- .mode = 0644,
- .proc_handler = timer_migration_handler,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
- #endif
#if defined(CONFIG_TREE_RCU)
{
.procname = "panic_on_rcu_stall",
#include <linux/prandom.h>
#include <linux/once_lite.h>
+ #include "dev.h"
#include "net-sysfs.h"
if (WARN_ON_ONCE(last_dev == ctx.dev))
return -1;
}
+
+ if (!ctx.dev)
+ return ret;
+
path = dev_fwd_path(stack);
if (!path)
return -1;
}
EXPORT_SYMBOL(netif_set_real_num_queues);
+ /**
+ * netif_set_tso_max_size() - set the max size of TSO frames supported
+ * @dev: netdev to update
+ * @size: max skb->len of a TSO frame
+ *
+ * Set the limit on the size of TSO super-frames the device can handle.
+ * Unless explicitly set the stack will assume the value of
+ * %GSO_LEGACY_MAX_SIZE.
+ */
+ void netif_set_tso_max_size(struct net_device *dev, unsigned int size)
+ {
+ dev->tso_max_size = min(GSO_MAX_SIZE, size);
+ if (size < READ_ONCE(dev->gso_max_size))
+ netif_set_gso_max_size(dev, size);
+ }
+ EXPORT_SYMBOL(netif_set_tso_max_size);
+
+ /**
+ * netif_set_tso_max_segs() - set the max number of segs supported for TSO
+ * @dev: netdev to update
+ * @segs: max number of TCP segments
+ *
+ * Set the limit on the number of TCP segments the device can generate from
+ * a single TSO super-frame.
+ * Unless explicitly set the stack will assume the value of %GSO_MAX_SEGS.
+ */
+ void netif_set_tso_max_segs(struct net_device *dev, unsigned int segs)
+ {
+ dev->tso_max_segs = segs;
+ if (segs < READ_ONCE(dev->gso_max_segs))
+ netif_set_gso_max_segs(dev, segs);
+ }
+ EXPORT_SYMBOL(netif_set_tso_max_segs);
+
+ /**
+ * netif_inherit_tso_max() - copy all TSO limits from a lower device to an upper
+ * @to: netdev to update
+ * @from: netdev from which to copy the limits
+ */
+ void netif_inherit_tso_max(struct net_device *to, const struct net_device *from)
+ {
+ netif_set_tso_max_size(to, from->tso_max_size);
+ netif_set_tso_max_segs(to, from->tso_max_segs);
+ }
+ EXPORT_SYMBOL(netif_inherit_tso_max);
+
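For context, a driver would typically call the new TSO helpers once from its setup path; the device pointer and the limits below are hypothetical:

/* editor's sketch, not part of the patch */
netif_set_tso_max_size(netdev, 64 * 1024);	/* cap TSO super-frames at 64 KB */
netif_set_tso_max_segs(netdev, 32);		/* and at 32 TCP segments each */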
/**
* netif_get_num_default_rss_queues - default number of RSS queues
*
}
offset = skb_checksum_start_offset(skb);
- BUG_ON(offset >= skb_headlen(skb));
+ ret = -EINVAL;
+ if (WARN_ON_ONCE(offset >= skb_headlen(skb))) {
+ DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
+ goto out;
+ }
csum = skb_checksum(skb, offset, skb->len - offset, 0);
offset += skb->csum_offset;
- BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
-
+ if (WARN_ON_ONCE(offset + sizeof(__sum16) > skb_headlen(skb))) {
+ DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
+ goto out;
+ }
ret = skb_ensure_writable(skb, offset + sizeof(__sum16));
if (ret)
goto out;
dev_queue_xmit_nit(skb, dev);
len = skb->len;
- PRANDOM_ADD_NOISE(skb, dev, txq, len + jiffies);
trace_net_dev_start_xmit(skb, dev);
rc = netdev_start_xmit(skb, dev, txq, more);
trace_net_dev_xmit(skb, rc, dev, len);
return skb;
}
+
+ static struct netdev_queue *
+ netdev_tx_queue_mapping(struct net_device *dev, struct sk_buff *skb)
+ {
+ int qm = skb_get_queue_mapping(skb);
+
+ return netdev_get_tx_queue(dev, netdev_cap_txqueue(dev, qm));
+ }
+
+ static bool netdev_xmit_txqueue_skipped(void)
+ {
+ return __this_cpu_read(softnet_data.xmit.skip_txqueue);
+ }
+
+ void netdev_xmit_skip_txqueue(bool skip)
+ {
+ __this_cpu_write(softnet_data.xmit.skip_txqueue, skip);
+ }
+ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
#endif /* CONFIG_NET_EGRESS */
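The skip-txqueue pair above is meant to be driven from the egress hook: code that has already chosen a queue can set skb->queue_mapping and ask the core not to re-pick one. A hedged sketch of such a caller (chosen_queue is a hypothetical variable):

/* editor's sketch, not part of the patch */
skb_set_queue_mapping(skb, chosen_queue);
netdev_xmit_skip_txqueue(true);	/* __dev_queue_xmit() will honour the mapping */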
#ifdef CONFIG_XPS
}
/**
- * __dev_queue_xmit - transmit a buffer
- * @skb: buffer to transmit
- * @sb_dev: suboordinate device used for L2 forwarding offload
- *
- * Queue a buffer for transmission to a network device. The caller must
- * have set the device and priority and built the buffer before calling
- * this function. The function can be called from an interrupt.
+ * __dev_queue_xmit() - transmit a buffer
+ * @skb: buffer to transmit
* @sb_dev: subordinate device used for L2 forwarding offload
*
- * A negative errno code is returned on a failure. A success does not
- * guarantee the frame will be transmitted as it may be dropped due
- * to congestion or traffic shaping.
+ * Queue a buffer for transmission to a network device. The caller must
+ * have set the device and priority and built the buffer before calling
+ * this function. The function can be called from an interrupt.
*
- * -----------------------------------------------------------------------------------
- * I notice this method can also return errors from the queue disciplines,
- * including NET_XMIT_DROP, which is a positive value. So, errors can also
- * be positive.
+ * When calling this method, interrupts MUST be enabled. This is because
+ * the BH enable code must have IRQs enabled so that it will not deadlock.
*
- * Regardless of the return value, the skb is consumed, so it is currently
- * difficult to retry a send to this method. (You can bump the ref count
- * before sending to hold a reference for retry if you are careful.)
+ * Regardless of the return value, the skb is consumed, so it is currently
+ * difficult to retry a send to this method. (You can bump the ref count
+ * before sending to hold a reference for retry if you are careful.)
*
- * When calling this method, interrupts MUST be enabled. This is because
- * the BH enable code must have IRQs enabled so that it will not deadlock.
- * --BLG
+ * Return:
+ * * 0 - buffer successfully transmitted
+ * * positive qdisc return code - NET_XMIT_DROP etc.
+ * * negative errno - other errors
*/
- static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
+ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
{
struct net_device *dev = skb->dev;
- struct netdev_queue *txq;
+ struct netdev_queue *txq = NULL;
struct Qdisc *q;
int rc = -ENOMEM;
bool again = false;
if (!skb)
goto out;
}
+
+ netdev_xmit_skip_txqueue(false);
+
nf_skip_egress(skb, true);
skb = sch_handle_egress(skb, &rc, dev);
if (!skb)
goto out;
nf_skip_egress(skb, false);
+
+ if (netdev_xmit_txqueue_skipped())
+ txq = netdev_tx_queue_mapping(dev, skb);
}
#endif
/* If device/qdisc don't need skb->dst, release it right now while
else
skb_dst_force(skb);
- txq = netdev_core_pick_tx(dev, skb, sb_dev);
+ if (!txq)
+ txq = netdev_core_pick_tx(dev, skb, sb_dev);
+
q = rcu_dereference_bh(txq->qdisc);
trace_net_dev_queue(skb);
if (!skb)
goto out;
- PRANDOM_ADD_NOISE(skb, dev, txq, jiffies);
HARD_TX_LOCK(dev, txq, cpu);
if (!netif_xmit_stopped(txq)) {
rcu_read_unlock_bh();
return rc;
}
-
- int dev_queue_xmit(struct sk_buff *skb)
- {
- return __dev_queue_xmit(skb, NULL);
- }
- EXPORT_SYMBOL(dev_queue_xmit);
-
- int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev)
- {
- return __dev_queue_xmit(skb, sb_dev);
- }
- EXPORT_SYMBOL(dev_queue_xmit_accel);
+ EXPORT_SYMBOL(__dev_queue_xmit);
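Since __dev_queue_xmit() is now exported directly, the removed dev_queue_xmit()/dev_queue_xmit_accel() wrappers presumably become static inlines in a header; a sketch of what that would look like:

/* editor's sketch, not part of the patch */
static inline int dev_queue_xmit(struct sk_buff *skb)
{
	return __dev_queue_xmit(skb, NULL);
}

static inline int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev)
{
	return __dev_queue_xmit(skb, sb_dev);
}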
int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
{
skb_set_queue_mapping(skb, queue_id);
txq = skb_get_tx_queue(dev, skb);
- PRANDOM_ADD_NOISE(skb, dev, txq, jiffies);
local_bh_disable();
EXPORT_SYMBOL(netdev_max_backlog);
int netdev_tstamp_prequeue __read_mostly = 1;
+ unsigned int sysctl_skb_defer_max __read_mostly = 64;
int netdev_budget __read_mostly = 300;
/* Must be at least 2 jiffes to guarantee 1 jiffy timeout */
unsigned int __read_mostly netdev_budget_usecs = 2 * USEC_PER_SEC / HZ;
#endif /* CONFIG_RPS */
+ /* Called from hardirq (IPI) context */
+ static void trigger_rx_softirq(void *data)
+ {
+ struct softnet_data *sd = data;
+
+ __raise_softirq_irqoff(NET_RX_SOFTIRQ);
+ smp_store_release(&sd->defer_ipi_scheduled, 0);
+ }
+
/*
* Check if this softnet_data structure is another cpu one
* If yes, queue it to our IPI list and return 1
*ppt_prev = pt_prev;
} else {
drop:
- if (!deliver_exact) {
+ if (!deliver_exact)
dev_core_stats_rx_dropped_inc(skb->dev);
- kfree_skb_reason(skb, SKB_DROP_REASON_PTYPE_ABSENT);
- } else {
+ else
dev_core_stats_rx_nohandler_inc(skb->dev);
- kfree_skb(skb);
- }
+ kfree_skb_reason(skb, SKB_DROP_REASON_UNHANDLED_PROTO);
/* Jamal, now you will not able to escape explaining
* me how you were going to use this. :-)
*/
}
EXPORT_SYMBOL(dev_set_threaded);
- void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
- int (*poll)(struct napi_struct *, int), int weight)
+ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
+ int (*poll)(struct napi_struct *, int), int weight)
{
if (WARN_ON(test_and_set_bit(NAPI_STATE_LISTED, &napi->state)))
return;
if (dev->threaded && napi_kthread_create(napi))
dev->threaded = 0;
}
- EXPORT_SYMBOL(netif_napi_add);
+ EXPORT_SYMBOL(netif_napi_add_weight);
void napi_disable(struct napi_struct *n)
{
return 0;
}
+ static void skb_defer_free_flush(struct softnet_data *sd)
+ {
+ struct sk_buff *skb, *next;
+ unsigned long flags;
+
+ /* Paired with WRITE_ONCE() in skb_attempt_defer_free() */
+ if (!READ_ONCE(sd->defer_list))
+ return;
+
+ spin_lock_irqsave(&sd->defer_lock, flags);
+ skb = sd->defer_list;
+ sd->defer_list = NULL;
+ sd->defer_count = 0;
+ spin_unlock_irqrestore(&sd->defer_lock, flags);
+
+ while (skb != NULL) {
+ next = skb->next;
+ napi_consume_skb(skb, 1);
+ skb = next;
+ }
+ }
+
static __latent_entropy void net_rx_action(struct softirq_action *h)
{
struct softnet_data *sd = this_cpu_ptr(&softnet_data);
for (;;) {
struct napi_struct *n;
+ skb_defer_free_flush(sd);
+
if (list_empty(&list)) {
if (!sd_has_rps_ipi_waiting(sd) && list_empty(&repoll))
- return;
+ goto end;
break;
}
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
net_rps_action_and_irq_enable(sd);
+ end:;
}
struct netdev_adjacent {
{
dev->group = new_group;
}
- EXPORT_SYMBOL(dev_set_group);
/**
* dev_pre_changeaddr_notify - Call NETDEV_PRE_CHANGEADDR.
return -ENODEV;
return ops->ndo_change_carrier(dev, new_carrier);
}
- EXPORT_SYMBOL(dev_change_carrier);
/**
* dev_get_phys_port_id - Get device physical port ID
return -EOPNOTSUPP;
return ops->ndo_get_phys_port_id(dev, ppid);
}
- EXPORT_SYMBOL(dev_get_phys_port_id);
/**
* dev_get_phys_port_name - Get device physical port name
}
return devlink_compat_phys_port_name_get(dev, name, len);
}
- EXPORT_SYMBOL(dev_get_phys_port_name);
/**
* dev_get_port_parent_id - Get the device's port parent identifier
dev->proto_down = proto_down;
return 0;
}
- EXPORT_SYMBOL(dev_change_proto_down);
/**
* dev_change_proto_down_reason - proto down reason
}
}
}
- EXPORT_SYMBOL(dev_change_proto_down_reason);
struct bpf_xdp_link {
struct bpf_link link;
}
/* Delayed registration/unregisteration */
- static LIST_HEAD(net_todo_list);
+ LIST_HEAD(net_todo_list);
DECLARE_WAIT_QUEUE_HEAD(netdev_unregistering_wq);
static void net_set_todo(struct net_device *dev)
EXPORT_SYMBOL(netif_tx_stop_all_queues);
/**
- * register_netdevice - register a network device
- * @dev: device to register
- *
- * Take a completed network device structure and add it to the kernel
- * interfaces. A %NETDEV_REGISTER message is sent to the netdev notifier
- * chain. 0 is returned on success. A negative errno code is returned
- * on a failure to set up the device, or if the name is a duplicate.
+ * register_netdevice() - register a network device
+ * @dev: device to register
*
- * Callers must hold the rtnl semaphore. You may want
- * register_netdev() instead of this.
- *
- * BUGS:
- * The locking appears insufficient to guarantee two parallel registers
- * will not get the same name.
+ * Take a prepared network device structure and make it externally accessible.
+ * A %NETDEV_REGISTER message is sent to the netdev notifier chain.
+ * Callers must hold the rtnl lock - you may want register_netdev()
+ * instead of this.
*/
-
int register_netdevice(struct net_device *dev)
{
int ret;
storage->rx_dropped += READ_ONCE(core_stats->rx_dropped);
storage->tx_dropped += READ_ONCE(core_stats->tx_dropped);
storage->rx_nohandler += READ_ONCE(core_stats->rx_nohandler);
+ storage->rx_otherhost_dropped += READ_ONCE(core_stats->rx_otherhost_dropped);
}
}
return storage;
dev_net_set(dev, &init_net);
- dev->gso_max_size = GSO_MAX_SIZE;
+ dev->gso_max_size = GSO_LEGACY_MAX_SIZE;
dev->gso_max_segs = GSO_MAX_SEGS;
- dev->gro_max_size = GRO_MAX_SIZE;
+ dev->gro_max_size = GRO_LEGACY_MAX_SIZE;
+ dev->tso_max_size = TSO_LEGACY_MAX_SIZE;
+ dev->tso_max_segs = TSO_MAX_SEGS;
dev->upper_level = 1;
dev->lower_level = 1;
#ifdef CONFIG_LOCKDEP
INIT_CSD(&sd->csd, rps_trigger_softirq, sd);
sd->cpu = i;
#endif
+ INIT_CSD(&sd->defer_csd, trigger_rx_softirq, sd);
+ spin_lock_init(&sd->defer_lock);
init_gro_hash(&sd->backlog);
sd->backlog.poll = process_backlog;
return ret;
}
- if (!(ifa->ifa_flags & IFA_F_SECONDARY)) {
- prandom_seed((__force u32) ifa->ifa_local);
+ if (!(ifa->ifa_flags & IFA_F_SECONDARY))
ifap = last_primary;
- }
rcu_assign_pointer(ifa->ifa_next, *ifap);
rcu_assign_pointer(*ifap, ifa);
struct devinet_sysctl_table *t;
char path[sizeof("net/ipv4/conf/") + IFNAMSIZ];
- t = kmemdup(&devinet_sysctl, sizeof(*t), GFP_KERNEL);
+ t = kmemdup(&devinet_sysctl, sizeof(*t), GFP_KERNEL_ACCOUNT);
if (!t)
goto out;
{
int i;
- idev->stats.ipv6 = alloc_percpu(struct ipstats_mib);
+ idev->stats.ipv6 = alloc_percpu_gfp(struct ipstats_mib, GFP_KERNEL_ACCOUNT);
if (!idev->stats.ipv6)
goto err_ip;
if (!idev->stats.icmpv6dev)
goto err_icmp;
idev->stats.icmpv6msgdev = kzalloc(sizeof(struct icmpv6msg_mib_device),
- GFP_KERNEL);
+ GFP_KERNEL_ACCOUNT);
if (!idev->stats.icmpv6msgdev)
goto err_icmpmsg;
if (dev->mtu < IPV6_MIN_MTU && dev != blackhole_netdev)
return ERR_PTR(-EINVAL);
- ndev = kzalloc(sizeof(struct inet6_dev), GFP_KERNEL);
+ ndev = kzalloc(sizeof(*ndev), GFP_KERNEL_ACCOUNT);
if (!ndev)
return ERR_PTR(err);
{
struct net_device *dev;
struct inet6_ifaddr *ifa;
+ LIST_HEAD(tmp_addr_list);
if (!idev)
return;
}
}
+ read_lock_bh(&idev->lock);
list_for_each_entry(ifa, &idev->addr_list, if_list) {
if (ifa->flags&IFA_F_TENTATIVE)
continue;
+ list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
+ }
+ read_unlock_bh(&idev->lock);
+
+ while (!list_empty(&tmp_addr_list)) {
+ ifa = list_first_entry(&tmp_addr_list,
+ struct inet6_ifaddr, if_list_aux);
+ list_del(&ifa->if_list_aux);
if (idev->cnf.forwarding)
addrconf_join_anycast(ifa);
else
addrconf_leave_anycast(ifa);
}
+
inet6_netconf_notify_devconf(dev_net(dev), RTM_NEWNETCONF,
NETCONFA_FORWARDING,
dev->ifindex, &idev->cnf);
unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
struct net *net = dev_net(dev);
struct inet6_dev *idev;
- struct inet6_ifaddr *ifa, *tmp;
+ struct inet6_ifaddr *ifa;
+ LIST_HEAD(tmp_addr_list);
bool keep_addr = false;
bool was_ready;
int state, i;
write_lock_bh(&idev->lock);
}
- list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
+ list_for_each_entry(ifa, &idev->addr_list, if_list)
+ list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
+ write_unlock_bh(&idev->lock);
+
+ while (!list_empty(&tmp_addr_list)) {
struct fib6_info *rt = NULL;
bool keep;
+ ifa = list_first_entry(&tmp_addr_list,
+ struct inet6_ifaddr, if_list_aux);
+ list_del(&ifa->if_list_aux);
+
addrconf_del_dad_work(ifa);
keep = keep_addr && (ifa->flags & IFA_F_PERMANENT) &&
!addr_is_local(&ifa->addr);
- write_unlock_bh(&idev->lock);
spin_lock_bh(&ifa->lock);
if (keep) {
addrconf_leave_solict(ifa->idev, &ifa->addr);
}
- write_lock_bh(&idev->lock);
if (!keep) {
+ write_lock_bh(&idev->lock);
list_del_rcu(&ifa->if_list);
+ write_unlock_bh(&idev->lock);
in6_ifa_put(ifa);
}
}
- write_unlock_bh(&idev->lock);
-
/* Step 5: Discard anycast and multicast list */
if (unregister) {
ipv6_ac_destroy_dev(idev);
addrconf_join_solict(dev, &ifp->addr);
- prandom_seed((__force u32) ifp->addr.s6_addr32[3]);
-
read_lock_bh(&idev->lock);
spin_lock(&ifp->lock);
if (ifp->state == INET6_IFADDR_STATE_DEAD)
send_rs = send_mld &&
ipv6_accept_ra(ifp->idev) &&
ifp->idev->cnf.rtr_solicits != 0 &&
- (dev->flags&IFF_LOOPBACK) == 0;
+ (dev->flags & IFF_LOOPBACK) == 0 &&
+ (dev->type != ARPHRD_TUNNEL);
read_unlock_bh(&ifp->idev->lock);
/* While dad is in progress mld report's source address is in6_addrany.
array[DEVCONF_IOAM6_ID] = cnf->ioam6_id;
array[DEVCONF_IOAM6_ID_WIDE] = cnf->ioam6_id_wide;
array[DEVCONF_NDISC_EVICT_NOCARRIER] = cnf->ndisc_evict_nocarrier;
+ array[DEVCONF_ACCEPT_UNSOLICITED_NA] = cnf->accept_unsolicited_na;
}
static inline size_t inet6_ifla6_size(void)
.extra1 = (void *)SYSCTL_ZERO,
.extra2 = (void *)SYSCTL_ONE,
},
+ {
+ .procname = "accept_unsolicited_na",
+ .data = &ipv6_devconf.accept_unsolicited_na,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = (void *)SYSCTL_ZERO,
+ .extra2 = (void *)SYSCTL_ONE,
+ },
{
/* sentinel */
}
struct ctl_table *table;
char path[sizeof("net/ipv6/conf/") + IFNAMSIZ];
- table = kmemdup(addrconf_sysctl, sizeof(addrconf_sysctl), GFP_KERNEL);
+ table = kmemdup(addrconf_sysctl, sizeof(addrconf_sysctl), GFP_KERNEL_ACCOUNT);
if (!table)
goto out;
if (params->deliver) {
KUNIT_EXPECT_EQ(test, rc, 0);
- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
+ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
KUNIT_EXPECT_NOT_ERR_OR_NULL(test, skb2);
KUNIT_EXPECT_EQ(test, skb->len, 1);
} else {
KUNIT_EXPECT_NE(test, rc, 0);
- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
+ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
- KUNIT_EXPECT_PTR_EQ(test, skb2, NULL);
+ KUNIT_EXPECT_NULL(test, skb2);
}
__mctp_route_test_fini(test, dev, rt, sock);
rc = mctp_route_input(&rt->rt, skb);
}
- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
+ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
if (params->rx_len) {
KUNIT_EXPECT_NOT_ERR_OR_NULL(test, skb2);
skb_free_datagram(sock->sk, skb2);
} else {
- KUNIT_EXPECT_PTR_EQ(test, skb2, NULL);
+ KUNIT_EXPECT_NULL(test, skb2);
}
__mctp_route_test_fini(test, dev, rt, sock);
rc = mctp_route_input(&rt->rt, skb);
/* (potentially) receive message */
- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
+ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
if (params->deliver)
KUNIT_EXPECT_NOT_ERR_OR_NULL(test, skb2);
struct socket *sock_from_file(struct file *file)
{
if (file->f_op == &socket_file_ops)
- return file->private_data; /* set in sock_map_fd */
+ return file->private_data; /* set in sock_alloc_file */
return NULL;
}
{
u8 flags = *tx_flags;
- if (tsflags & SOF_TIMESTAMPING_TX_HARDWARE)
+ if (tsflags & SOF_TIMESTAMPING_TX_HARDWARE) {
flags |= SKBTX_HW_TSTAMP;
+ /* PTP hardware clocks can provide a free running cycle counter
+ * as a time base for virtual clocks. Tell driver to use the
+ * free running cycle counter for timestamp if socket is bound
+ * to virtual clock.
+ */
+ if (tsflags & SOF_TIMESTAMPING_BIND_PHC)
+ flags |= SKBTX_HW_TSTAMP_USE_CYCLES;
+ }
+
if (tsflags & SOF_TIMESTAMPING_TX_SOFTWARE)
flags |= SKBTX_SW_TSTAMP;
return skb->tstamp && !false_tstamp && skb_is_err_queue(skb);
}
- static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
+ static ktime_t get_timestamp(struct sock *sk, struct sk_buff *skb, int *if_index)
+ {
+ bool cycles = sk->sk_tsflags & SOF_TIMESTAMPING_BIND_PHC;
+ struct skb_shared_hwtstamps *shhwtstamps = skb_hwtstamps(skb);
+ struct net_device *orig_dev;
+ ktime_t hwtstamp;
+
+ rcu_read_lock();
+ orig_dev = dev_get_by_napi_id(skb_napi_id(skb));
+ if (orig_dev) {
+ *if_index = orig_dev->ifindex;
+ hwtstamp = netdev_get_tstamp(orig_dev, shhwtstamps, cycles);
+ } else {
+ hwtstamp = shhwtstamps->hwtstamp;
+ }
+ rcu_read_unlock();
+
+ return hwtstamp;
+ }
+
+ static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb,
+ int if_index)
{
struct scm_ts_pktinfo ts_pktinfo;
struct net_device *orig_dev;
memset(&ts_pktinfo, 0, sizeof(ts_pktinfo));
- rcu_read_lock();
- orig_dev = dev_get_by_napi_id(skb_napi_id(skb));
- if (orig_dev)
- ts_pktinfo.if_index = orig_dev->ifindex;
- rcu_read_unlock();
+ if (!if_index) {
+ rcu_read_lock();
+ orig_dev = dev_get_by_napi_id(skb_napi_id(skb));
+ if (orig_dev)
+ if_index = orig_dev->ifindex;
+ rcu_read_unlock();
+ }
+ ts_pktinfo.if_index = if_index;
ts_pktinfo.pkt_length = skb->len - skb_mac_offset(skb);
put_cmsg(msg, SOL_SOCKET, SCM_TIMESTAMPING_PKTINFO,
int empty = 1, false_tstamp = 0;
struct skb_shared_hwtstamps *shhwtstamps =
skb_hwtstamps(skb);
+ int if_index;
ktime_t hwtstamp;
/* Race occurred between timestamp enabling and packet
if (shhwtstamps &&
(sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
!skb_is_swtx_tstamp(skb, false_tstamp)) {
- if (sk->sk_tsflags & SOF_TIMESTAMPING_BIND_PHC)
- hwtstamp = ptp_convert_timestamp(shhwtstamps,
- sk->sk_bind_phc);
+ if_index = 0;
+ if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP_NETDEV)
+ hwtstamp = get_timestamp(sk, skb, &if_index);
else
hwtstamp = shhwtstamps->hwtstamp;
+ if (sk->sk_tsflags & SOF_TIMESTAMPING_BIND_PHC)
+ hwtstamp = ptp_convert_timestamp(&hwtstamp,
+ sk->sk_bind_phc);
+
if (ktime_to_timespec64_cond(hwtstamp, tss.ts + 2)) {
empty = 0;
if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&
!skb_is_err_queue(skb))
- put_ts_pktinfo(msg, skb);
+ put_ts_pktinfo(msg, skb, if_index);
}
}
if (!empty) {
sizeof(__u32), &SOCK_SKB_CB(skb)->dropcount);
}
- void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
- struct sk_buff *skb)
+ static void sock_recv_mark(struct msghdr *msg, struct sock *sk,
+ struct sk_buff *skb)
+ {
+ if (sock_flag(sk, SOCK_RCVMARK) && skb)
+ put_cmsg(msg, SOL_SOCKET, SO_MARK, sizeof(__u32),
+ &skb->mark);
+ }
+
+ void __sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
+ struct sk_buff *skb)
{
sock_recv_timestamp(msg, sk, skb);
sock_recv_drops(msg, sk, skb);
+ sock_recv_mark(msg, sk, skb);
}
- EXPORT_SYMBOL_GPL(__sock_recv_ts_and_drops);
+ EXPORT_SYMBOL_GPL(__sock_recv_cmsgs);
INDIRECT_CALLABLE_DECLARE(int inet_recvmsg(struct socket *, struct msghdr *,
size_t, int));
}
EXPORT_SYMBOL(sock_create_kern);
-int __sys_socket(int family, int type, int protocol)
+static struct socket *__sys_socket_create(int family, int type, int protocol)
{
- int retval;
struct socket *sock;
- int flags;
+ int retval;
/* Check the SOCK_* constants for consistency. */
BUILD_BUG_ON(SOCK_CLOEXEC != O_CLOEXEC);
BUILD_BUG_ON(SOCK_CLOEXEC & SOCK_TYPE_MASK);
BUILD_BUG_ON(SOCK_NONBLOCK & SOCK_TYPE_MASK);
- flags = type & ~SOCK_TYPE_MASK;
- if (flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
- return -EINVAL;
+ if ((type & ~SOCK_TYPE_MASK) & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
+ return ERR_PTR(-EINVAL);
type &= SOCK_TYPE_MASK;
+ retval = sock_create(family, type, protocol, &sock);
+ if (retval < 0)
+ return ERR_PTR(retval);
+
+ return sock;
+}
+
+struct file *__sys_socket_file(int family, int type, int protocol)
+{
+ struct socket *sock;
+ struct file *file;
+ int flags;
+
+ sock = __sys_socket_create(family, type, protocol);
+ if (IS_ERR(sock))
+ return ERR_CAST(sock);
+
+ flags = type & ~SOCK_TYPE_MASK;
if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
- retval = sock_create(family, type, protocol, &sock);
- if (retval < 0)
- return retval;
+ file = sock_alloc_file(sock, flags, NULL);
+ if (IS_ERR(file))
+ sock_release(sock);
+
+ return file;
+}
+
+int __sys_socket(int family, int type, int protocol)
+{
+ struct socket *sock;
+ int flags;
+
+ sock = __sys_socket_create(family, type, protocol);
+ if (IS_ERR(sock))
+ return PTR_ERR(sock);
+
+ flags = type & ~SOCK_TYPE_MASK;
+ if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
+ flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
return sock_map_fd(sock, flags & (O_CLOEXEC | O_NONBLOCK));
}
* so that no locks are necessary.
*/
- skb = skb_recv_datagram(sk, 0, flags&O_NONBLOCK, &err);
+ skb = skb_recv_datagram(sk, (flags & O_NONBLOCK) ? MSG_DONTWAIT : 0,
+ &err);
if (!skb) {
/* This means receive shutdown. */
if (err == 0)
static bool unix_skb_scm_eq(struct sk_buff *skb,
struct scm_cookie *scm)
{
- const struct unix_skb_parms *u = &UNIXCB(skb);
-
- return u->pid == scm->pid &&
- uid_eq(u->uid, scm->creds.uid) &&
- gid_eq(u->gid, scm->creds.gid) &&
+ return UNIXCB(skb).pid == scm->pid &&
+ uid_eq(UNIXCB(skb).uid, scm->creds.uid) &&
+ gid_eq(UNIXCB(skb).gid, scm->creds.gid) &&
unix_secdata_eq(scm, skb);
}
const struct proto *prot = READ_ONCE(sk->sk_prot);
if (prot != &unix_dgram_proto)
- return prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
- flags & ~MSG_DONTWAIT, NULL);
+ return prot->recvmsg(sk, msg, size, flags, NULL);
#endif
return __unix_dgram_recvmsg(sk, msg, size, flags);
}
int used, err;
mutex_lock(&u->iolock);
- skb = skb_recv_datagram(sk, 0, 1, &err);
+ skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);
mutex_unlock(&u->iolock);
if (!skb)
return err;
const struct proto *prot = READ_ONCE(sk->sk_prot);
if (prot != &unix_stream_proto)
- return prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
- flags & ~MSG_DONTWAIT, NULL);
+ return prot->recvmsg(sk, msg, size, flags, NULL);
#endif
return unix_stream_read_generic(&state, true);
}