L: Mailing list that is relevant to this area
W: Web-page with status/info
Q: Patchwork web based patch tracking system site
- T: SCM tree type and location. Type is one of: git, hg, quilt, stgit, topgit.
+ T: SCM tree type and location.
+ Type is one of: git, hg, quilt, stgit, topgit
S: Status, one of the following:
Supported: Someone is actually paid to look after this.
Maintained: Someone actually looks after it.
AGPGART DRIVER
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git
+T: git git://people.freedesktop.org/~airlied/linux (part of drm maint)
S: Maintained
F: drivers/char/agp/
F: include/linux/agp*
ALTERA UART/JTAG UART SERIAL DRIVERS
-L: nios2-dev@sopc.et.ntust.edu.tw (moderated for non-subscribers)
+L: nios2-dev@lists.rocketboards.org (moderated for non-subscribers)
S: Maintained
F: drivers/tty/serial/altera_uart.c
F: drivers/tty/serial/altera_jtaguart.c
F: arch/arm/mach-footbridge/
ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
-M: Shawn Guo <shawn.guo@linaro.org>
+M: Shawn Guo <shawn.guo@freescale.com>
S: Maintained
-T: git git://git.linaro.org/people/shawnguo/linux-2.6.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git
F: arch/arm/mach-imx/
F: arch/arm/boot/dts/imx*
F: arch/arm/configs/imx*_defconfig
S: Supported
F: arch/arm/mach-u300/
+ F: drivers/clocksource/timer-u300.c
F: drivers/i2c/busses/i2c-stu300.c
F: drivers/rtc/rtc-coh901331.c
F: drivers/watchdog/coh901327_wdt.c
F: drivers/net/wireless/atmel*
ATTO EXPRESSSAS SAS/SATA RAID SCSI DRIVER
-W: http://www.attotech.com
-S: Supported
-F: drivers/scsi/esas2r
+W: http://www.attotech.com
+S: Supported
+F: drivers/scsi/esas2r
AUDIT SUBSYSTEM
BLACKFIN ARCHITECTURE
+T: git git://git.code.sf.net/p/adi-linux/code
W: http://blackfin.uclinux.org
S: Supported
F: arch/blackfin/
F: include/net/bluetooth/
BONDING DRIVER
-M: Veaceslav Falico <vfalico@redhat.com>
+M: Jay Vosburgh <j.vosburgh@gmail.com>
+M: Veaceslav Falico <vfalico@gmail.com>
W: http://sourceforge.net/projects/bonding/
BROADCOM BCM281XX/BCM11XXX ARM ARCHITECTURE
T: git git://git.github.com/broadcom/bcm11351
S: Maintained
CHIPIDEA USB HIGH SPEED DUAL ROLE CONTROLLER
-T: git://github.com/hzpeterchen/linux-usb.git
+T: git git://github.com/hzpeterchen/linux-usb.git
S: Maintained
F: drivers/usb/chipidea/
F: drivers/net/ethernet/cisco/enic/
CISCO VIC LOW LATENCY NIC DRIVER
-S: Supported
-F: drivers/infiniband/hw/usnic
+S: Supported
+F: drivers/infiniband/hw/usnic
CIRRUS LOGIC EP93XX ETHERNET DRIVER
CPU FREQUENCY DRIVERS - ARM BIG LITTLE
W: http://www.arm.com/products/processors/technologies/biglittleprocessing.php
F: drivers/cpufreq/arm_big_little_dt.c
CPUIDLE DRIVER - ARM BIG LITTLE
-T: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
-S: Maintained
-F: drivers/cpuidle/cpuidle-big_little.c
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
+S: Maintained
+F: drivers/cpuidle/cpuidle-big_little.c
CPUIDLE DRIVERS
S: Maintained
-T: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
F: drivers/cpuidle/*
F: include/linux/cpuidle.h
CPUSETS
W: http://www.bullopensource.org/cpuset/
W: http://oss.sgi.com/projects/cpusets/
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
S: Maintained
F: Documentation/cgroups/cpusets.txt
F: include/linux/cpuset.h
F: sound/pci/cs5535audio/
CW1200 WLAN driver
-S: Maintained
-F: drivers/net/wireless/cw1200/
+S: Maintained
+F: drivers/net/wireless/cw1200/
CX18 VIDEO4LINUX DRIVER
-W: http://twibble.org/dist/dc395x/
-L: http://lists.twibble.org/mailman/listinfo/dc395x/
+W: http://twibble.org/dist/dc395x/
+W: http://lists.twibble.org/mailman/listinfo/dc395x/
S: Maintained
F: Documentation/scsi/dc395x.txt
F: drivers/scsi/dc395x.*
F: drivers/acpi/dock.c
DOCUMENTATION
-T: TBD
+T: quilt http://www.infradead.org/~rdunlap/Doc/patches/
S: Maintained
F: Documentation/
DRM DRIVERS
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git
+T: git git://people.freedesktop.org/~airlied/linux
S: Maintained
F: drivers/gpu/drm/
F: include/drm/
F: include/uapi/drm/
+RADEON DRM DRIVERS
+T: git git://people.freedesktop.org/~agd5f/linux
+S: Supported
+F: drivers/gpu/drm/radeon/
+F: include/drm/radeon*
+F: include/uapi/drm/radeon*
+
INTEL DRM DRIVERS (excluding Poulsbo, Moorestown and derivative chipsets)
Q: http://patchwork.freedesktop.org/project/intel-gfx/
-T: git git://people.freedesktop.org/~danvet/drm-intel
+T: git git://anongit.freedesktop.org/drm-intel
S: Supported
F: drivers/gpu/drm/i915/
F: include/drm/i915*
EDAC-CORE
W: bluesmoke.sourceforge.net
S: Supported
F: include/linux/netfilter_bridge/
F: net/bridge/
+ETHERNET PHY LIBRARY
+S: Maintained
+F: include/linux/phy.h
+F: include/linux/phy_fixed.h
+F: drivers/net/phy/
+F: Documentation/networking/phy.txt
+F: drivers/of/of_mdio.c
+F: drivers/of/of_net.c
+
EXT2 FILE SYSTEM
W: http://www.intel.com/support/feedback.htm
W: http://e1000.sourceforge.net/
F: Documentation/networking/i40e.txt
F: Documentation/networking/i40evf.txt
F: drivers/net/ethernet/intel/
+F: drivers/net/ethernet/intel/*/
INTEL-MID GPIO DRIVER
KCONFIG
-T: git://gitorious.org/linux-kconfig/linux-kconfig
+T: git git://gitorious.org/linux-kconfig/linux-kconfig
S: Maintained
F: Documentation/kbuild/kconfig-language.txt
F: scripts/kconfig/
F: drivers/media/tuners/m88ts2022*
MA901 MASTERKIT USB FM RADIO DRIVER
-T: git git://linuxtv.org/media_tree.git
-S: Maintained
-F: drivers/media/radio/radio-ma901.c
+T: git git://linuxtv.org/media_tree.git
+S: Maintained
+F: drivers/media/radio/radio-ma901.c
MAC80211
S: Maintained
+MARVELL ARMADA DRM SUPPORT
+S: Maintained
+F: drivers/gpu/drm/armada/
+
MARVELL GIGABIT ETHERNET DRIVERS (skge/sky2)
MELLANOX ETHERNET DRIVER (mlx4_en)
S: Supported
W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
F: include/uapi/mtd/
MEN A21 WATCHDOG DRIVER
S: Supported
F: drivers/watchdog/mena21_wdt.c
W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
Q: http://patchwork.kernel.org/project/linux-rdma/list/
-T: git://openfabrics.org/~eli/connect-ib.git
+T: git git://openfabrics.org/~eli/connect-ib.git
S: Supported
F: drivers/net/ethernet/mellanox/mlx5/core/
F: include/linux/mlx5/
Mellanox MLX5 IB driver
-W: http://www.mellanox.com
-Q: http://patchwork.kernel.org/project/linux-rdma/list/
-T: git://openfabrics.org/~eli/connect-ib.git
-S: Supported
-F: include/linux/mlx5/
-F: drivers/infiniband/hw/mlx5/
+W: http://www.mellanox.com
+Q: http://patchwork.kernel.org/project/linux-rdma/list/
+T: git git://openfabrics.org/~eli/connect-ib.git
+S: Supported
+F: include/linux/mlx5/
+F: drivers/infiniband/hw/mlx5/
MODULE SUPPORT
F: include/uapi/linux/in.h
F: include/uapi/linux/net.h
F: include/uapi/linux/netdevice.h
+F: tools/net/
+F: tools/testing/selftests/net/
+F: lib/random32.c
NETWORKING [IPv4/IPv6]
F: drivers/block/nvme*
F: include/linux/nvme.h
+NXP TDA998X DRM DRIVER
+S: Supported
+F: drivers/gpu/drm/i2c/tda998x_drv.c
+F: include/drm/i2c/tda998x.h
+
OMAP SUPPORT
F: drivers/net/ethernet/rdc/r6040.c
RDS - RELIABLE DATAGRAM SOCKETS
-M: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
+M: Chien Yen <chien.yen@oracle.com>
S: Supported
F: net/rds/
S: Supported
F: arch/s390/
F: drivers/s390/
-F: block/partitions/ibm.c
F: Documentation/s390/
F: Documentation/DocBook/s390*
+S390 COMMON I/O LAYER
+W: http://www.ibm.com/developerworks/linux/linux390/
+S: Supported
+F: drivers/s390/cio/
+
+S390 DASD DRIVER
+W: http://www.ibm.com/developerworks/linux/linux390/
+S: Supported
+F: drivers/s390/block/dasd*
+F: block/partitions/ibm.c
+
S390 NETWORK DRIVERS
S: Supported
F: drivers/s390/net/
+S390 PCI SUBSYSTEM
+W: http://www.ibm.com/developerworks/linux/linux390/
+S: Supported
+F: arch/s390/pci/
+F: drivers/pci/hotplug/s390_pci_hpc.c
+
S390 ZCRYPT DRIVER
-L: http://groups.google.com/group/linux-iscsi-target-dev
W: http://www.linux-iscsi.org
+W: http://groups.google.com/group/linux-iscsi-target-dev
T: git git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git master
S: Supported
F: drivers/target/
F: drivers/media/radio/radio-raremono.c
THERMAL
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal.git
-Q: https://patchwork.kernel.org/project/linux-pm/list/
-S: Supported
-F: drivers/thermal/
-F: include/linux/thermal.h
-F: include/linux/cpu_cooling.h
-F: Documentation/devicetree/bindings/thermal/
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal.git
+Q: https://patchwork.kernel.org/project/linux-pm/list/
+S: Supported
+F: drivers/thermal/
+F: include/linux/thermal.h
+F: include/linux/cpu_cooling.h
+F: Documentation/devicetree/bindings/thermal/
THINGM BLINK(1) USB RGB LED DRIVER
XFS FILESYSTEM
P: Silicon Graphics Inc
W: http://oss.sgi.com/projects/xfs
W: http://mjpeg.sourceforge.net/driver-zoran/
-T: Mercurial http://linuxtv.org/hg/v4l-dvb
+T: hg http://linuxtv.org/hg/v4l-dvb
S: Odd Fixes
F: drivers/media/pci/zoran/
ranges;
emac: ethernet@01c0b000 {
- compatible = "allwinner,sun4i-emac";
+ compatible = "allwinner,sun4i-a10-emac";
reg = <0x01c0b000 0x1000>;
interrupts = <55>;
clocks = <&ahb_gates 17>;
};
mdio@01c0b080 {
- compatible = "allwinner,sun4i-mdio";
+ compatible = "allwinner,sun4i-a10-mdio";
reg = <0x01c0b080 0x14>;
status = "disabled";
#address-cells = <1>;
};
timer@01c20c00 {
- compatible = "allwinner,sun4i-timer";
+ compatible = "allwinner,sun4i-a10-timer";
reg = <0x01c20c00 0x90>;
interrupts = <22>;
clocks = <&osc24M>;
};
rtp: rtp@01c25000 {
- compatible = "allwinner,sun4i-ts";
+ compatible = "allwinner,sun4i-a10-ts";
reg = <0x01c25000 0x100>;
interrupts = <29>;
};
ranges;
emac: ethernet@01c0b000 {
- compatible = "allwinner,sun4i-emac";
+ compatible = "allwinner,sun4i-a10-emac";
reg = <0x01c0b000 0x1000>;
interrupts = <55>;
clocks = <&ahb_gates 17>;
};
mdio@01c0b080 {
- compatible = "allwinner,sun4i-mdio";
+ compatible = "allwinner,sun4i-a10-mdio";
reg = <0x01c0b080 0x14>;
status = "disabled";
#address-cells = <1>;
};
timer@01c20c00 {
- compatible = "allwinner,sun4i-timer";
+ compatible = "allwinner,sun4i-a10-timer";
reg = <0x01c20c00 0x90>;
interrupts = <22>;
clocks = <&osc24M>;
};
rtp: rtp@01c25000 {
- compatible = "allwinner,sun4i-ts";
+ compatible = "allwinner,sun4i-a10-ts";
reg = <0x01c25000 0x100>;
interrupts = <29>;
};
};
timer@01c20c00 {
- compatible = "allwinner,sun4i-timer";
+ compatible = "allwinner,sun4i-a10-timer";
reg = <0x01c20c00 0x90>;
interrupts = <22>;
clocks = <&osc24M>;
};
rtp: rtp@01c25000 {
- compatible = "allwinner,sun4i-ts";
+ compatible = "allwinner,sun4i-a10-ts";
reg = <0x01c25000 0x100>;
interrupts = <29>;
};
ranges;
emac: ethernet@01c0b000 {
- compatible = "allwinner,sun4i-emac";
+ compatible = "allwinner,sun4i-a10-emac";
reg = <0x01c0b000 0x1000>;
interrupts = <0 55 4>;
clocks = <&ahb_gates 17>;
};
mdio@01c0b080 {
- compatible = "allwinner,sun4i-mdio";
+ compatible = "allwinner,sun4i-a10-mdio";
reg = <0x01c0b080 0x14>;
status = "disabled";
#address-cells = <1>;
};
timer@01c20c00 {
- compatible = "allwinner,sun4i-timer";
+ compatible = "allwinner,sun4i-a10-timer";
reg = <0x01c20c00 0x90>;
interrupts = <0 22 4>,
<0 23 4>,
rtc: rtc@01c20d00 {
compatible = "allwinner,sun7i-a20-rtc";
reg = <0x01c20d00 0x20>;
- interrupts = <0 24 1>;
+ interrupts = <0 24 4>;
};
sid: eeprom@01c23800 {
};
rtp: rtp@01c25000 {
- compatible = "allwinner,sun4i-ts";
+ compatible = "allwinner,sun4i-a10-ts";
reg = <0x01c25000 0x100>;
interrupts = <0 29 4>;
};
hstimer@01c60000 {
compatible = "allwinner,sun7i-a20-hstimer";
reg = <0x01c60000 0x1000>;
- interrupts = <0 81 1>,
- <0 82 1>,
- <0 83 1>,
- <0 84 1>;
+ interrupts = <0 81 4>,
+ <0 82 4>,
+ <0 83 4>,
+ <0 84 4>;
clocks = <&ahb_gates 28>;
};
select CPU_V7
select GENERIC_CLOCKEVENTS
select HAVE_ARM_SCU if SMP
- select HAVE_ARM_TWD if LOCAL_TIMERS
+ select HAVE_ARM_TWD if SMP
select HAVE_SMP
select ARM_GIC
select MIGHT_HAVE_CACHE_L2X0
config ARCH_EMEV2
bool "Emma Mobile EV2"
+ select SYS_SUPPORTS_EM_STI
config ARCH_R7S72100
bool "RZ/A1H (R7S72100)"
+ select SYS_SUPPORTS_SH_MTU2
config ARCH_R8A7790
bool "R-Car H2 (R8A77900)"
select RENESAS_IRQC
+ select SYS_SUPPORTS_SH_CMT
config ARCH_R8A7791
bool "R-Car M2 (R8A77910)"
select RENESAS_IRQC
+ select SYS_SUPPORTS_SH_CMT
comment "Renesas ARM SoCs Board Type"
select ARM_CPU_SUSPEND if PM || CPU_IDLE
select CPU_V7
select SH_CLK_CPG
+ select SYS_SUPPORTS_SH_CMT
+ select SYS_SUPPORTS_SH_TMU
config ARCH_SH73A0
bool "SH-Mobile AG5 (R8A73A00)"
select I2C
select SH_CLK_CPG
select RENESAS_INTC_IRQPIN
+ select SYS_SUPPORTS_SH_CMT
+ select SYS_SUPPORTS_SH_TMU
config ARCH_R8A73A4
bool "R-Mobile APE6 (R8A73A40)"
select RENESAS_IRQC
select ARCH_HAS_CPUFREQ
select ARCH_HAS_OPP
+ select SYS_SUPPORTS_SH_CMT
+ select SYS_SUPPORTS_SH_TMU
config ARCH_R8A7740
bool "R-Mobile A1 (R8A77400)"
select CPU_V7
select SH_CLK_CPG
select RENESAS_INTC_IRQPIN
+ select SYS_SUPPORTS_SH_CMT
+ select SYS_SUPPORTS_SH_TMU
config ARCH_R8A7778
bool "R-Car M1A (R8A77781)"
select ARM_GIC
select USB_ARCH_HAS_EHCI
select USB_ARCH_HAS_OHCI
+ select SYS_SUPPORTS_SH_TMU
config ARCH_R8A7779
bool "R-Car H1 (R8A77790)"
select USB_ARCH_HAS_EHCI
select USB_ARCH_HAS_OHCI
select RENESAS_INTC_IRQPIN
+ select SYS_SUPPORTS_SH_TMU
config ARCH_R8A7790
bool "R-Car H2 (R8A77900)"
select MIGHT_HAVE_PCI
select SH_CLK_CPG
select RENESAS_IRQC
+ select SYS_SUPPORTS_SH_CMT
config ARCH_R8A7791
bool "R-Car M2 (R8A77910)"
select MIGHT_HAVE_PCI
select SH_CLK_CPG
select RENESAS_IRQC
+ select SYS_SUPPORTS_SH_CMT
config ARCH_EMEV2
bool "Emma Mobile EV2"
select MIGHT_HAVE_PCI
select USE_OF
select AUTO_ZRELADDR
+ select SYS_SUPPORTS_EM_STI
config ARCH_R7S72100
bool "RZ/A1H (R7S72100)"
select ARM_GIC
select CPU_V7
select SH_CLK_CPG
+ select SYS_SUPPORTS_SH_MTU2
comment "Renesas ARM SoCs Board Type"
want to select a HZ value such as 128 that can evenly divide RCLK.
A HZ value that does not divide evenly may cause timer drift.
- config SH_TIMER_CMT
- bool "CMT timer driver"
- default y
- help
- This enables build of the CMT timer driver.
-
- config SH_TIMER_TMU
- bool "TMU timer driver"
- default y
- help
- This enables build of the TMU timer driver.
-
- config EM_TIMER_STI
- bool "STI timer driver"
- default y
- help
- This enables build of the STI timer driver.
-
endmenu
endif
#include <linux/of_irq.h>
#include <linux/of_platform.h>
#include <linux/of.h>
+#include <linux/memblock.h>
#include <linux/irqchip.h>
#include <linux/irqchip/arm-gic.h>
void __iomem *zynq_scu_base;
+/**
+ * zynq_memory_init - Initialize special memory
+ *
+ * We need to stop things allocating the low memory as DMA can't work in
+ * the 1st 512K of memory.
+ */
+static void __init zynq_memory_init(void)
+{
+ if (!__pa(PAGE_OFFSET))
+ memblock_reserve(__pa(PAGE_OFFSET), __pa(swapper_pg_dir));
+}
+
static struct platform_device zynq_cpuidle_device = {
.name = "cpuidle-zynq",
};
*/
static void __init zynq_init_machine(void)
{
+ struct platform_device_info devinfo = { .name = "cpufreq-cpu0", };
+
/*
* 64KB way size, 8-way associativity, parity disabled
*/
of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
platform_device_register(&zynq_cpuidle_device);
+ platform_device_register_full(&devinfo);
}
static void __init zynq_timer_init(void)
.init_machine = zynq_init_machine,
.init_time = zynq_timer_init,
.dt_compat = zynq_dt_match,
+ .reserve = zynq_memory_init,
.restart = zynq_system_reset,
MACHINE_END
#include <uapi/linux/sched.h>
+#include <linux/sched/prio.h>
+
struct sched_param {
int sched_priority;
#include <asm/page.h>
#include <asm/ptrace.h>
-#include <asm/cputime.h>
+#include <linux/cputime.h>
#include <linux/smp.h>
#include <linux/sem.h>
struct fs_struct;
struct perf_event_context;
struct blk_plug;
+struct filename;
/*
* List of flags we want to share for kernel threads,
#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
extern void nohz_balance_enter_idle(int cpu);
extern void set_cpu_sd_state_idle(void);
- extern int get_nohz_timer_target(void);
+ extern int get_nohz_timer_target(int pinned);
#else
static inline void nohz_balance_enter_idle(int cpu) { }
static inline void set_cpu_sd_state_idle(void) { }
+ static inline int get_nohz_timer_target(int pinned)
+ {
+ return smp_processor_id();
+ }
#endif
/*
#endif
#ifdef CONFIG_FAIR_GROUP_SCHED
+ int depth;
struct sched_entity *parent;
/* rq on which this entity is (to be) queued: */
struct cfs_rq *cfs_rq;
struct mutex perf_event_mutex;
struct list_head perf_event_list;
#endif
+#ifdef CONFIG_DEBUG_PREEMPT
+ unsigned long preempt_disable_ip;
+#endif
#ifdef CONFIG_NUMA
struct mempolicy *mempolicy; /* Protected by alloc_lock */
short il_next;
unsigned int numa_scan_period;
unsigned int numa_scan_period_max;
int numa_preferred_nid;
- int numa_migrate_deferred;
unsigned long numa_migrate_retry;
u64 node_stamp; /* migration stamp */
+ u64 last_task_numa_placement;
+ u64 last_sum_exec_runtime;
struct callback_head numa_work;
struct list_head numa_entry;
* Scheduling placement decisions are made based on the these counts.
* The values remain static for the duration of a PTE scan
*/
- unsigned long *numa_faults;
+ unsigned long *numa_faults_memory;
unsigned long total_numa_faults;
/*
* numa_faults_buffer records faults per node during the current
- * scan window. When the scan completes, the counts in numa_faults
- * decay and these values are copied.
+ * scan window. When the scan completes, the counts in
+ * numa_faults_memory decay and these values are copied.
+ */
+ unsigned long *numa_faults_buffer_memory;
+
+ /*
+ * Track the nodes the process was running on when a NUMA hinting
+ * fault was incurred.
*/
- unsigned long *numa_faults_buffer;
+ unsigned long *numa_faults_cpu;
+ unsigned long *numa_faults_buffer_cpu;
/*
* numa_faults_locality tracks if faults recorded during the last
extern pid_t task_numa_group_id(struct task_struct *p);
extern void set_numabalancing_state(bool enabled);
extern void task_numa_free(struct task_struct *p);
-
-extern unsigned int sysctl_numa_balancing_migrate_deferred;
+extern bool should_numa_migrate_memory(struct task_struct *p, struct page *page,
+ int src_nid, int dst_cpu);
#else
static inline void task_numa_fault(int last_node, int node, int pages,
int flags)
static inline void task_numa_free(struct task_struct *p)
{
}
+static inline bool should_numa_migrate_memory(struct task_struct *p,
+ struct page *page, int src_nid, int dst_cpu)
+{
+ return true;
+}
#endif
static inline struct pid *task_pid(struct task_struct *task)
extern bool yield_to(struct task_struct *p, bool preempt);
extern void set_user_nice(struct task_struct *p, long nice);
extern int task_prio(const struct task_struct *p);
-extern int task_nice(const struct task_struct *p);
+/**
+ * task_nice - return the nice value of a given task.
+ * @p: the task in question.
+ *
+ * Return: The nice value [ -20 ... 0 ... 19 ].
+ */
+static inline int task_nice(const struct task_struct *p)
+{
+ return PRIO_TO_NICE((p)->static_prio);
+}
extern int can_nice(const struct task_struct *p, const int nice);
extern int task_curr(const struct task_struct *p);
extern int idle_cpu(int cpu);
extern int allow_signal(int);
extern int disallow_signal(int);
-extern int do_execve(const char *,
+extern int do_execve(struct filename *,
const char __user * const __user *,
const char __user * const __user *);
extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *);
#define DECLARE_DEFERRABLE_WORK(n, f) \
struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, TIMER_DEFERRABLE)
-/*
- * initialize a work item's function pointer
- */
-#define PREPARE_WORK(_work, _func) \
- do { \
- (_work)->func = (_func); \
- } while (0)
-
-#define PREPARE_DELAYED_WORK(_work, _func) \
- PREPARE_WORK(&(_work)->work, (_func))
-
#ifdef CONFIG_DEBUG_OBJECTS_WORK
extern void __init_work(struct work_struct *work, int onstack);
extern void destroy_work_on_stack(struct work_struct *work);
+ extern void destroy_delayed_work_on_stack(struct delayed_work *work);
static inline unsigned int work_static(struct work_struct *work)
{
return *work_data_bits(work) & WORK_STRUCT_STATIC;
#else
static inline void __init_work(struct work_struct *work, int onstack) { }
static inline void destroy_work_on_stack(struct work_struct *work) { }
+ static inline void destroy_delayed_work_on_stack(struct delayed_work *work) { }
static inline unsigned int work_static(struct work_struct *work) { return 0; }
#endif
(_work)->data = (atomic_long_t) WORK_DATA_INIT(); \
lockdep_init_map(&(_work)->lockdep_map, #_work, &__key, 0); \
INIT_LIST_HEAD(&(_work)->entry); \
- PREPARE_WORK((_work), (_func)); \
+ (_work)->func = (_func); \
} while (0)
#else
#define __INIT_WORK(_work, _func, _onstack) \
__init_work((_work), _onstack); \
(_work)->data = (atomic_long_t) WORK_DATA_INIT(); \
INIT_LIST_HEAD(&(_work)->entry); \
- PREPARE_WORK((_work), (_func)); \
+ (_work)->func = (_func); \
} while (0)
#endif
* Documentation/workqueue.txt.
*/
enum {
- /*
- * All wqs are now non-reentrant making the following flag
- * meaningless. Will be removed.
- */
- WQ_NON_REENTRANT = 1 << 0, /* DEPRECATED */
-
WQ_UNBOUND = 1 << 1, /* not bound to any cpu */
WQ_FREEZABLE = 1 << 2, /* freeze during suspend */
WQ_MEM_RECLAIM = 1 << 3, /* may be used for memory reclaim */
WQ_HIGHPRI = 1 << 4, /* high priority */
- WQ_CPU_INTENSIVE = 1 << 5, /* cpu instensive workqueue */
+ WQ_CPU_INTENSIVE = 1 << 5, /* cpu intensive workqueue */
WQ_SYSFS = 1 << 6, /* visible in sysfs, see wq_sysfs_register() */
/*
static struct lock_class_key __key; \
const char *__lock_name; \
\
- if (__builtin_constant_p(fmt)) \
- __lock_name = (fmt); \
- else \
- __lock_name = #fmt; \
+ __lock_name = #fmt#args; \
\
__alloc_workqueue_key((fmt), (flags), (max_active), \
&__key, __lock_name, ##args); \
return system_wq != NULL;
}
-/*
- * Like above, but uses del_timer() instead of del_timer_sync(). This means,
- * if it returns 0 the timer function may be running and the queueing is in
- * progress.
- */
-static inline bool __deprecated __cancel_delayed_work(struct delayed_work *work)
-{
- bool ret;
-
- ret = del_timer(&work->timer);
- if (ret)
- work_clear_pending(&work->work);
- return ret;
-}
-
/* used to be different but now identical to flush_work(), deprecated */
static inline bool __deprecated flush_work_sync(struct work_struct *work)
{
* selecting an idle cpu will add more delays to the timers than intended
* (as that cpu's timer base may not be uptodate wrt jiffies etc).
*/
- int get_nohz_timer_target(void)
+ int get_nohz_timer_target(int pinned)
{
int cpu = smp_processor_id();
int i;
struct sched_domain *sd;
+ if (pinned || !get_sysctl_timer_migration() || !idle_cpu(cpu))
+ return cpu;
+
rcu_read_lock();
for_each_domain(cpu, sd) {
for_each_cpu(i, sched_domain_span(sd)) {
#endif
#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
if (static_key_false((¶virt_steal_rq_enabled))) {
- u64 st;
-
steal = paravirt_steal_clock(cpu_of(rq));
steal -= rq->prev_steal_time_rq;
if (unlikely(steal > delta))
steal = delta;
- st = steal_ticks(steal);
- steal = st * TICK_NSEC;
-
rq->prev_steal_time_rq += steal;
-
delta -= steal;
}
#endif
p->numa_scan_seq = p->mm ? p->mm->numa_scan_seq : 0;
p->numa_scan_period = sysctl_numa_balancing_scan_delay;
p->numa_work.next = &p->numa_work;
- p->numa_faults = NULL;
- p->numa_faults_buffer = NULL;
+ p->numa_faults_memory = NULL;
+ p->numa_faults_buffer_memory = NULL;
+ p->last_task_numa_placement = 0;
+ p->last_sum_exec_runtime = 0;
INIT_LIST_HEAD(&p->numa_entry);
p->numa_group = NULL;
{
struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
- u64 period = attr->sched_period;
+ u64 period = attr->sched_period ?: attr->sched_deadline;
u64 runtime = attr->sched_runtime;
u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
int cpus, err = -1;
if (mm)
mmdrop(mm);
if (unlikely(prev_state == TASK_DEAD)) {
- task_numa_free(prev);
-
if (prev->sched_class->task_dead)
prev->sched_class->task_dead(prev);
#ifdef CONFIG_SMP
-/* assumes rq->lock is held */
-static inline void pre_schedule(struct rq *rq, struct task_struct *prev)
-{
- if (prev->sched_class->pre_schedule)
- prev->sched_class->pre_schedule(rq, prev);
-}
-
/* rq->lock is NOT held, but preemption is disabled */
static inline void post_schedule(struct rq *rq)
{
#else
-static inline void pre_schedule(struct rq *rq, struct task_struct *p)
-{
-}
-
static inline void post_schedule(struct rq *rq)
{
}
DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
PREEMPT_MASK - 10);
#endif
- if (preempt_count() == val)
- trace_preempt_off(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
+ if (preempt_count() == val) {
+ unsigned long ip = get_parent_ip(CALLER_ADDR1);
+#ifdef CONFIG_DEBUG_PREEMPT
+ current->preempt_disable_ip = ip;
+#endif
+ trace_preempt_off(CALLER_ADDR0, ip);
+ }
}
EXPORT_SYMBOL(preempt_count_add);
print_modules();
if (irqs_disabled())
print_irqtrace_events(prev);
+#ifdef CONFIG_DEBUG_PREEMPT
+ if (in_atomic_preempt_off()) {
+ pr_err("Preemption disabled at:");
+ print_ip_sym(current->preempt_disable_ip);
+ pr_cont("\n");
+ }
+#endif
dump_stack();
add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
}
schedstat_inc(this_rq(), sched_count);
}
-static void put_prev_task(struct rq *rq, struct task_struct *prev)
-{
- if (prev->on_rq || rq->skip_clock_update < 0)
- update_rq_clock(rq);
- prev->sched_class->put_prev_task(rq, prev);
-}
-
/*
* Pick up the highest-prio task:
*/
static inline struct task_struct *
-pick_next_task(struct rq *rq)
+pick_next_task(struct rq *rq, struct task_struct *prev)
{
- const struct sched_class *class;
+ const struct sched_class *class = &fair_sched_class;
struct task_struct *p;
/*
* Optimization: we know that if all tasks are in
* the fair class we can call that function directly:
*/
- if (likely(rq->nr_running == rq->cfs.h_nr_running)) {
- p = fair_sched_class.pick_next_task(rq);
- if (likely(p))
+ if (likely(prev->sched_class == class &&
+ rq->nr_running == rq->cfs.h_nr_running)) {
+ p = fair_sched_class.pick_next_task(rq, prev);
+ if (likely(p && p != RETRY_TASK))
return p;
}
+again:
for_each_class(class) {
- p = class->pick_next_task(rq);
- if (p)
+ p = class->pick_next_task(rq, prev);
+ if (p) {
+ if (unlikely(p == RETRY_TASK))
+ goto again;
return p;
+ }
}
BUG(); /* the idle class will always have a runnable task */
switch_count = &prev->nvcsw;
}
- pre_schedule(rq, prev);
-
- if (unlikely(!rq->nr_running))
- idle_balance(cpu, rq);
+ if (prev->on_rq || rq->skip_clock_update < 0)
+ update_rq_clock(rq);
- put_prev_task(rq, prev);
- next = pick_next_task(rq);
+ next = pick_next_task(rq, prev);
clear_tsk_need_resched(prev);
clear_preempt_need_resched();
rq->skip_clock_update = 0;
* This function changes the 'effective' priority of a task. It does
* not touch ->normal_prio like __setscheduler().
*
- * Used by the rt_mutex code to implement priority inheritance logic.
+ * Used by the rt_mutex code to implement priority inheritance
+ * logic. Call site only calls if the priority of the task changed.
*/
void rt_mutex_setprio(struct task_struct *p, int prio)
{
unsigned long flags;
struct rq *rq;
- if (TASK_NICE(p) == nice || nice < -20 || nice > 19)
+ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
return;
/*
* We have to be careful, if called from sys_setpriority(),
if (increment > 40)
increment = 40;
- nice = TASK_NICE(current) + increment;
- if (nice < -20)
- nice = -20;
- if (nice > 19)
- nice = 19;
+ nice = task_nice(current) + increment;
+ if (nice < MIN_NICE)
+ nice = MIN_NICE;
+ if (nice > MAX_NICE)
+ nice = MAX_NICE;
if (increment < 0 && !can_nice(current, nice))
return -EPERM;
return p->prio - MAX_RT_PRIO;
}
-/**
- * task_nice - return the nice value of a given task.
- * @p: the task in question.
- *
- * Return: The nice value [ -20 ... 0 ... 19 ].
- */
-int task_nice(const struct task_struct *p)
-{
- return TASK_NICE(p);
-}
-EXPORT_SYMBOL(task_nice);
-
/**
* idle_cpu - is a given cpu idle currently?
* @cpu: the processor in question.
dl_se->dl_new = 1;
}
-/* Actually do priority change: must hold pi & rq lock. */
-static void __setscheduler(struct rq *rq, struct task_struct *p,
- const struct sched_attr *attr)
+static void __setscheduler_params(struct task_struct *p,
+ const struct sched_attr *attr)
{
int policy = attr->sched_policy;
* getparam()/getattr() don't report silly values for !rt tasks.
*/
p->rt_priority = attr->sched_priority;
-
p->normal_prio = normal_prio(p);
- p->prio = rt_mutex_getprio(p);
+ set_load_weight(p);
+}
+
+/* Actually do priority change: must hold pi & rq lock. */
+static void __setscheduler(struct rq *rq, struct task_struct *p,
+ const struct sched_attr *attr)
+{
+ __setscheduler_params(p, attr);
+
+ /*
+ * If we get here, there was no pi waiters boosting the
+ * task. It is safe to use the normal prio.
+ */
+ p->prio = normal_prio(p);
if (dl_prio(p->prio))
p->sched_class = &dl_sched_class;
p->sched_class = &rt_sched_class;
else
p->sched_class = &fair_sched_class;
-
- set_load_weight(p);
}
static void
const struct sched_attr *attr,
bool user)
{
+ int newprio = dl_policy(attr->sched_policy) ? MAX_DL_PRIO - 1 :
+ MAX_RT_PRIO - 1 - attr->sched_priority;
int retval, oldprio, oldpolicy = -1, on_rq, running;
int policy = attr->sched_policy;
unsigned long flags;
*/
if (user && !capable(CAP_SYS_NICE)) {
if (fair_policy(policy)) {
- if (attr->sched_nice < TASK_NICE(p) &&
+ if (attr->sched_nice < task_nice(p) &&
!can_nice(p, attr->sched_nice))
return -EPERM;
}
return -EPERM;
}
+ /*
+ * Can't set/change SCHED_DEADLINE policy at all for now
+ * (safest behavior); in the future we would like to allow
+ * unprivileged DL tasks to increase their relative deadline
+ * or reduce their runtime (both ways reducing utilization)
+ */
+ if (dl_policy(policy))
+ return -EPERM;
+
/*
* Treat SCHED_IDLE as nice 20. Only allow a switch to
* SCHED_NORMAL if the RLIMIT_NICE would normally permit it.
*/
if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) {
- if (!can_nice(p, TASK_NICE(p)))
+ if (!can_nice(p, task_nice(p)))
return -EPERM;
}
}
/*
- * If not changing anything there's no need to proceed further:
+ * If not changing anything there's no need to proceed further,
+ * but store a possible modification of reset_on_fork.
*/
if (unlikely(policy == p->policy)) {
- if (fair_policy(policy) && attr->sched_nice != TASK_NICE(p))
+ if (fair_policy(policy) && attr->sched_nice != task_nice(p))
goto change;
if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
goto change;
if (dl_policy(policy))
goto change;
+ p->sched_reset_on_fork = reset_on_fork;
task_rq_unlock(rq, p, &flags);
return 0;
}
return -EBUSY;
}
+ p->sched_reset_on_fork = reset_on_fork;
+ oldprio = p->prio;
+
+ /*
+ * Special case for priority boosted tasks.
+ *
+ * If the new priority is lower or equal (user space view)
+ * than the current (boosted) priority, we just store the new
+ * normal parameters and do not touch the scheduler class and
+ * the runqueue. This will be done when the task deboost
+ * itself.
+ */
+ if (rt_mutex_check_prio(p, newprio)) {
+ __setscheduler_params(p, attr);
+ task_rq_unlock(rq, p, &flags);
+ return 0;
+ }
+
on_rq = p->on_rq;
running = task_current(rq, p);
if (on_rq)
if (running)
p->sched_class->put_prev_task(rq, p);
- p->sched_reset_on_fork = reset_on_fork;
-
- oldprio = p->prio;
prev_class = p->sched_class;
__setscheduler(rq, p, attr);
if (running)
p->sched_class->set_curr_task(rq);
- if (on_rq)
- enqueue_task(rq, p, 0);
+ if (on_rq) {
+ /*
+ * We enqueue to tail when the priority of a task is
+ * increased (user space view).
+ */
+ enqueue_task(rq, p, oldprio <= p->prio ? ENQUEUE_HEAD : 0);
+ }
check_class_changed(rq, p, prev_class, oldprio);
task_rq_unlock(rq, p, &flags);
* XXX: do we want to be lenient like existing syscalls; or do we want
* to be strict and return an error on out-of-bounds values?
*/
- attr->sched_nice = clamp(attr->sched_nice, -20, 19);
+ attr->sched_nice = clamp(attr->sched_nice, MIN_NICE, MAX_NICE);
out:
return ret;
* @pid: the pid in question.
* @uattr: structure containing the extended parameters.
*/
-SYSCALL_DEFINE2(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr)
+SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
+ unsigned int, flags)
{
struct sched_attr attr;
struct task_struct *p;
int retval;
- if (!uattr || pid < 0)
+ if (!uattr || pid < 0 || flags)
return -EINVAL;
if (sched_copy_attr(uattr, &attr))
attr->size = usize;
}
- ret = copy_to_user(uattr, attr, usize);
+ ret = copy_to_user(uattr, attr, attr->size);
if (ret)
return -EFAULT;
* @uattr: structure containing the extended parameters.
* @size: sizeof(attr) for fwd/bwd comp.
*/
-SYSCALL_DEFINE3(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
- unsigned int, size)
+SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
+ unsigned int, size, unsigned int, flags)
{
struct sched_attr attr = {
.size = sizeof(struct sched_attr),
int retval;
if (!uattr || pid < 0 || size > PAGE_SIZE ||
- size < SCHED_ATTR_SIZE_VER0)
+ size < SCHED_ATTR_SIZE_VER0 || flags)
return -EINVAL;
rcu_read_lock();
else if (task_has_rt_policy(p))
attr.sched_priority = p->rt_priority;
else
- attr.sched_nice = TASK_NICE(p);
+ attr.sched_nice = task_nice(p);
rcu_read_unlock();
rcu_read_unlock();
rq->curr = rq->idle = idle;
+ idle->on_rq = 1;
#if defined(CONFIG_SMP)
idle->on_cpu = 1;
#endif
BUG_ON(cpu_online(smp_processor_id()));
- if (mm != &init_mm)
+ if (mm != &init_mm) {
switch_mm(mm, &init_mm, current);
+ finish_arch_post_lock_switch();
+ }
mmdrop(mm);
}
atomic_long_add(delta, &calc_load_tasks);
}
+static void put_prev_task_fake(struct rq *rq, struct task_struct *prev)
+{
+}
+
+static const struct sched_class fake_sched_class = {
+ .put_prev_task = put_prev_task_fake,
+};
+
+static struct task_struct fake_task = {
+ /*
+ * Avoid pull_{rt,dl}_task()
+ */
+ .prio = MAX_PRIO + 1,
+ .sched_class = &fake_sched_class,
+};
+
/*
* Migrate all tasks from the rq, sleeping tasks will be migrated by
* try_to_wake_up()->select_task_rq().
if (rq->nr_running == 1)
break;
- next = pick_next_task(rq);
+ next = pick_next_task(rq, &fake_task);
BUG_ON(!next);
next->sched_class->put_prev_task(rq, next);
static struct ctl_table *
sd_alloc_ctl_domain_table(struct sched_domain *sd)
{
- struct ctl_table *table = sd_alloc_ctl_entry(13);
+ struct ctl_table *table = sd_alloc_ctl_entry(14);
if (table == NULL)
return NULL;
sizeof(int), 0644, proc_dointvec_minmax, false);
set_table_entry(&table[10], "flags", &sd->flags,
sizeof(int), 0644, proc_dointvec_minmax, false);
- set_table_entry(&table[11], "name", sd->name,
+ set_table_entry(&table[11], "max_newidle_lb_cost",
+ &sd->max_newidle_lb_cost,
+ sizeof(long), 0644, proc_doulongvec_minmax, false);
+ set_table_entry(&table[12], "name", sd->name,
CORENAME_MAX_SIZE, 0444, proc_dostring, false);
- /* &table[12] is terminator */
+ /* &table[13] is terminator */
return table;
}
rq->rt.rt_runtime = def_rt_bandwidth.rt_runtime;
#ifdef CONFIG_RT_GROUP_SCHED
- INIT_LIST_HEAD(&rq->leaf_rt_rq_list);
init_tg_rt_entry(&root_task_group, &rq->rt, NULL, i, NULL);
#endif
static unsigned long prev_jiffy; /* ratelimiting */
rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
- if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
+ if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
+ !is_idle_task(current)) ||
system_state != SYSTEM_RUNNING || oops_in_progress)
return;
if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
debug_show_held_locks(current);
if (irqs_disabled())
print_irqtrace_events(current);
+#ifdef CONFIG_DEBUG_PREEMPT
+ if (!preempt_count_equals(preempt_offset)) {
+ pr_err("Preemption disabled at:");
+ print_ip_sym(current->preempt_disable_ip);
+ pr_cont("\n");
+ }
+#endif
dump_stack();
}
EXPORT_SYMBOL(__might_sleep);
* Renice negative nice level userspace
* tasks back to 0:
*/
- if (TASK_NICE(p) < 0 && p->mm)
+ if (task_nice(p) < 0 && p->mm)
set_user_nice(p, 0);
continue;
}
u64 period = global_rt_period();
u64 new_bw = to_ratio(period, runtime);
int cpu, ret = 0;
+ unsigned long flags;
/*
* Here we want to check the bandwidth not being set to some
for_each_possible_cpu(cpu) {
struct dl_bw *dl_b = dl_bw_of(cpu);
- raw_spin_lock(&dl_b->lock);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
if (new_bw < dl_b->total_bw)
ret = -EBUSY;
- raw_spin_unlock(&dl_b->lock);
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);
if (ret)
break;
{
u64 new_bw = -1;
int cpu;
+ unsigned long flags;
def_dl_bandwidth.dl_period = global_rt_period();
def_dl_bandwidth.dl_runtime = global_rt_runtime();
for_each_possible_cpu(cpu) {
struct dl_bw *dl_b = dl_bw_of(cpu);
- raw_spin_lock(&dl_b->lock);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
dl_b->bw = new_bw;
- raw_spin_unlock(&dl_b->lock);
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);
}
}
if (sysctl_sched_rt_period <= 0)
return -EINVAL;
- if (sysctl_sched_rt_runtime > sysctl_sched_rt_period)
+ if ((sysctl_sched_rt_runtime != RUNTIME_INF) &&
+ (sysctl_sched_rt_runtime > sysctl_sched_rt_period))
return -EINVAL;
return 0;
return (dev && tick_broadcast_device.evtdev == dev);
}
+ int tick_broadcast_update_freq(struct clock_event_device *dev, u32 freq)
+ {
+ int ret = -ENODEV;
+
+ if (tick_is_broadcast_device(dev)) {
+ raw_spin_lock(&tick_broadcast_lock);
+ ret = __clockevents_update_freq(dev, freq);
+ raw_spin_unlock(&tick_broadcast_lock);
+ }
+ return ret;
+ }
+
+
static void err_broadcast(const struct cpumask *mask)
{
pr_crit_once("Failed to broadcast timer tick. Some CPUs may be unresponsive.\n");
*/
static void tick_do_periodic_broadcast(void)
{
- raw_spin_lock(&tick_broadcast_lock);
-
cpumask_and(tmpmask, cpu_online_mask, tick_broadcast_mask);
tick_do_broadcast(tmpmask);
-
- raw_spin_unlock(&tick_broadcast_lock);
}
/*
{
ktime_t next;
+ raw_spin_lock(&tick_broadcast_lock);
+
tick_do_periodic_broadcast();
/*
* The device is in periodic mode. No reprogramming necessary:
*/
if (dev->mode == CLOCK_EVT_MODE_PERIODIC)
- return;
+ goto unlock;
/*
* Setup the next period for devices, which do not have
next = ktime_add(next, tick_period);
if (!clockevents_program_event(dev, next, false))
- return;
+ goto unlock;
tick_do_periodic_broadcast();
}
+ unlock:
+ raw_spin_unlock(&tick_broadcast_lock);
}
/*
raw_spin_unlock(&tick_broadcast_lock);
}
+ static int broadcast_needs_cpu(struct clock_event_device *bc, int cpu)
+ {
+ if (!(bc->features & CLOCK_EVT_FEAT_HRTIMER))
+ return 0;
+ if (bc->next_event.tv64 == KTIME_MAX)
+ return 0;
+ return bc->bound_on == cpu ? -EBUSY : 0;
+ }
+
+ static void broadcast_shutdown_local(struct clock_event_device *bc,
+ struct clock_event_device *dev)
+ {
+ /*
+ * For hrtimer based broadcasting we cannot shutdown the cpu
+ * local device if our own event is the first one to expire or
+ * if we own the broadcast timer.
+ */
+ if (bc->features & CLOCK_EVT_FEAT_HRTIMER) {
+ if (broadcast_needs_cpu(bc, smp_processor_id()))
+ return;
+ if (dev->next_event.tv64 < bc->next_event.tv64)
+ return;
+ }
+ clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
+ }
+
+ static void broadcast_move_bc(int deadcpu)
+ {
+ struct clock_event_device *bc = tick_broadcast_device.evtdev;
+
+ if (!bc || !broadcast_needs_cpu(bc, deadcpu))
+ return;
+ /* This moves the broadcast assignment to this cpu */
+ clockevents_program_event(bc, bc->next_event, 1);
+ }
+
/*
* Powerstate information: The system enters/leaves a state, where
* affected devices might stop
+ * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
*/
- void tick_broadcast_oneshot_control(unsigned long reason)
+ int tick_broadcast_oneshot_control(unsigned long reason)
{
struct clock_event_device *bc, *dev;
struct tick_device *td;
unsigned long flags;
ktime_t now;
- int cpu;
+ int cpu, ret = 0;
/*
* Periodic mode does not care about the enter/exit of power
* states
*/
if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
- return;
+ return 0;
/*
* We are called with preemtion disabled from the depth of the
dev = td->evtdev;
if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
- return;
+ return 0;
bc = tick_broadcast_device.evtdev;
if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ENTER) {
if (!cpumask_test_and_set_cpu(cpu, tick_broadcast_oneshot_mask)) {
WARN_ON_ONCE(cpumask_test_cpu(cpu, tick_broadcast_pending_mask));
- clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
+ broadcast_shutdown_local(bc, dev);
/*
* We only reprogram the broadcast timer if we
* did not mark ourself in the force mask and
dev->next_event.tv64 < bc->next_event.tv64)
tick_broadcast_set_event(bc, cpu, dev->next_event, 1);
}
+ /*
+ * If the current CPU owns the hrtimer broadcast
+ * mechanism, it cannot go deep idle and we remove the
+ * CPU from the broadcast mask. We don't have to go
+ * through the EXIT path as the local timer is not
+ * shutdown.
+ */
+ ret = broadcast_needs_cpu(bc, cpu);
+ if (ret)
+ cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
} else {
if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_oneshot_mask)) {
clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
}
out:
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
+ return ret;
}
/*
static void tick_broadcast_clear_oneshot(int cpu)
{
cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
+ cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
}
static void tick_broadcast_init_next_event(struct cpumask *mask,
cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
cpumask_clear_cpu(cpu, tick_broadcast_force_mask);
+ broadcast_move_bc(cpu);
+
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}
#define CREATE_TRACE_POINTS
#include <trace/events/timer.h>
-u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
+__visible u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
EXPORT_SYMBOL(jiffies_64);
unsigned long timer_jiffies;
unsigned long next_timer;
unsigned long active_timers;
+ unsigned long all_timers;
struct tvec_root tv1;
struct tvec tv2;
struct tvec tv3;
}
EXPORT_SYMBOL_GPL(set_timer_slack);
+ /*
+ * If the list is empty, catch up ->timer_jiffies to the current time.
+ * The caller must hold the tvec_base lock. Returns true if the list
+ * was empty and therefore ->timer_jiffies was updated.
+ */
+ static bool catchup_timer_jiffies(struct tvec_base *base)
+ {
+ if (!base->all_timers) {
+ base->timer_jiffies = jiffies;
+ return true;
+ }
+ return false;
+ }
+
static void
__internal_add_timer(struct tvec_base *base, struct timer_list *timer)
{
static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
{
+ (void)catchup_timer_jiffies(base);
__internal_add_timer(base, timer);
/*
* Update base->active_timers and base->next_timer
*/
if (!tbase_get_deferrable(timer->base)) {
- if (time_before(timer->expires, base->next_timer))
+ if (!base->active_timers++ ||
+ time_before(timer->expires, base->next_timer))
base->next_timer = timer->expires;
- base->active_timers++;
}
+ base->all_timers++;
}
#ifdef CONFIG_TIMER_STATS
detach_timer(timer, true);
if (!tbase_get_deferrable(timer->base))
base->active_timers--;
+ base->all_timers--;
+ (void)catchup_timer_jiffies(base);
}
static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,
if (timer->expires == base->next_timer)
base->next_timer = base->timer_jiffies;
}
+ base->all_timers--;
+ (void)catchup_timer_jiffies(base);
return 1;
}
debug_activate(timer, expires);
- cpu = smp_processor_id();
-
- #if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
- if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
- cpu = get_nohz_timer_target();
- #endif
+ cpu = get_nohz_timer_target(pinned);
new_base = per_cpu(tvec_bases, cpu);
if (base != new_base) {
* with the timer by holding the timer base lock. This also
* makes sure that a CPU on the way to stop its tick can not
* evaluate the timer wheel.
+ *
+ * Spare the IPI for deferrable timers on idle targets though.
+ * The next busy ticks will take care of it. Except full dynticks
+ * require special care against races with idle_cpu(), lets deal
+ * with that later.
*/
- wake_up_nohz_cpu(cpu);
+ if (!tbase_get_deferrable(timer->base) || tick_nohz_full_cpu(cpu))
+ wake_up_nohz_cpu(cpu);
+
spin_unlock_irqrestore(&base->lock, flags);
}
EXPORT_SYMBOL_GPL(add_timer_on);
struct timer_list *timer;
spin_lock_irq(&base->lock);
+ if (catchup_timer_jiffies(base)) {
+ spin_unlock_irq(&base->lock);
+ return;
+ }
while (time_after_eq(jiffies, base->timer_jiffies)) {
struct list_head work_list;
struct list_head *head = &work_list;
!cascade(base, &base->tv4, INDEX(2)))
cascade(base, &base->tv5, INDEX(3));
++base->timer_jiffies;
- list_replace_init(base->tv1.vec + index, &work_list);
+ list_replace_init(base->tv1.vec + index, head);
while (!list_empty(head)) {
void (*fn)(unsigned long);
unsigned long data;
if (!base)
return -ENOMEM;
- /* Make sure that tvec_base is 2 byte aligned */
- if (tbase_get_deferrable(base)) {
- WARN_ON(1);
+ /* Make sure tvec_base has TIMER_FLAG_MASK bits free */
+ if (WARN_ON(base != tbase_get_base(base))) {
kfree(base);
return -ENOMEM;
}
base->timer_jiffies = jiffies;
base->next_timer = base->timer_jiffies;
base->active_timers = 0;
+ base->all_timers = 0;
return 0;
}
err = timer_cpu_notify(&timers_nb, (unsigned long)CPU_UP_PREPARE,
(void *)(long)smp_processor_id());
- init_timer_stats();
-
BUG_ON(err != NOTIFY_OK);
+
+ init_timer_stats();
register_cpu_notifier(&timers_nb);
open_softirq(TIMER_SOFTIRQ, run_timer_softirq);
}
}
EXPORT_SYMBOL_GPL(destroy_work_on_stack);
+ void destroy_delayed_work_on_stack(struct delayed_work *work)
+ {
+ destroy_timer_on_stack(&work->timer);
+ debug_object_free(&work->work, &work_debug_descr);
+ }
+ EXPORT_SYMBOL_GPL(destroy_delayed_work_on_stack);
+
#else
static inline void debug_work_activate(struct work_struct *work) { }
static inline void debug_work_deactivate(struct work_struct *work) { }
if (worker->flags & WORKER_IDLE)
pool->nr_idle--;
+ /*
+ * Once WORKER_DIE is set, the kworker may destroy itself at any
+ * point. Pin to ensure the task stays until we're done with it.
+ */
+ get_task_struct(worker->task);
+
list_del_init(&worker->entry);
worker->flags |= WORKER_DIE;
spin_unlock_irq(&pool->lock);
kthread_stop(worker->task);
+ put_task_struct(worker->task);
kfree(worker);
spin_lock_irq(&pool->lock);
return -ENOMEM;
if (sscanf(buf, "%d", &attrs->nice) == 1 &&
- attrs->nice >= -20 && attrs->nice <= 19)
+ attrs->nice >= MIN_NICE && attrs->nice <= MAX_NICE)
ret = apply_workqueue_attrs(wq, attrs);
else
ret = -EINVAL;