Peter Ujfalusi [Tue, 20 Oct 2020 07:32:43 +0000 (10:32 +0300)]
irqchip/ti-sci-inta: Add support for unmapped event handling
The DMA (BCDMA/PKTDMA and their rings/flows) events are under the INTA's
supervision as unmapped events in AM64.
In order to keep the current SW stack working, the INTA driver must replace
the dev_id with it's own when a request comes for BCDMA or PKTDMA
resources.
Implement parsing of the optional "ti,unmapped-event-sources" phandle array
to get the sci-dev-ids of the devices where the unmapped events originate.
Peter Ujfalusi [Tue, 20 Oct 2020 07:32:42 +0000 (10:32 +0300)]
dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling
The new DMA architecture introduced with AM64 introduced new event types:
unampped events.
These events are mapped within INTA in contrast to other K3 devices where
the events with similar function was originating from the UDMAP or ringacc.
The ti,unmapped-event-sources should contain phandle array to the devices
in the system (typically DMA controllers) from where the unmapped events
originate.
irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm
Get rid of the separate flag to indicate if the IRLM bit is present in
the INTC/Interrupt Control Register 0, by considering -1 an invalid
irlm_bit value.
An interrupt submitted to an affinity change will always be left enabled
after plic_set_affinity() has been called, while the expectation is that
it should stay in whatever state it was before the call.
Preserving the configuration fixes a PWM hang issue on the Unleashed
board.
[ 919.015783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 919.020922] rcu: 0-...0: (0 ticks this GP)
idle=7d2/1/0x4000000000000002 softirq=1424/1424 fqs=105807
[ 919.030295] (detected by 1, t=225825 jiffies, g=1561, q=3496)
[ 919.036109] Task dump for CPU 0:
[ 919.039321] kworker/0:1 R running task 0 30 2 0x00000008
[ 919.046359] Workqueue: events set_brightness_delayed
[ 919.051302] Call Trace:
[ 919.053738] [<ffffffe000930d92>] __schedule+0x194/0x4de
[ 982.035783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 982.040923] rcu: 0-...0: (0 ticks this GP)
idle=7d2/1/0x4000000000000002 softirq=1424/1424 fqs=113325
[ 982.050294] (detected by 1, t=241580 jiffies, g=1561, q=3509)
[ 982.056108] Task dump for CPU 0:
[ 982.059321] kworker/0:1 R running task 0 30 2 0x00000008
[ 982.066359] Workqueue: events set_brightness_delayed
[ 982.071302] Call Trace:
[ 982.073739] [<ffffffe000930d92>] __schedule+0x194/0x4de
[..]
Marc Zyngier [Fri, 16 Oct 2020 08:28:23 +0000 (09:28 +0100)]
irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY
Now that GENERIC_IRQ_IPI selects IRQ_DOMAIN_HIERARCHY, there is no
need to have this conditional select for IRQ_MIPS_CPU. Similarily,
MIPS_GIC only needs selecting GENERIC_IRQ_IPI.
irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7
The MStar interrupt controller is only found on MStar, SigmaStar, and
Mediatek SoCs. Hence add dependencies on ARCH_MEDIATEK and
ARCH_MSTARV7, to prevent asking the user about the MStar interrupt
controller driver when configuring a kernel without support for MStar,
SigmaStar, and Mediatek SoCs.
The Tegra PMC driver does ungodly things with the interrupt hierarchy,
repeatedly corrupting it by pulling hwirq numbers out of thin air,
overriding existing IRQ mappings and changing the handling flow
of unsuspecting users.
All of this is done in the name of preserving the interrupt hierarchy
even when these levels do not exist in the HW. Together with the use
of proper IRQs for IPIs, this leads to an unbootable system as the
rescheduling IPI gets repeatedly repurposed for random drivers...
Instead, let's simply mark the level from which the hierarchy does
not make sense for the HW, and let the core code trim the usused
levels from the hierarchy.
Marc Zyngier [Tue, 6 Oct 2020 09:10:20 +0000 (10:10 +0100)]
genirq/irqdomain: Allow partial trimming of irq_data hierarchy
It appears that some HW is ugly enough that not all the interrupts
connected to a particular interrupt controller end up with the same
hierarchy depth (some of them are terminated early). This leaves
the irqchip hacker with only two choices, both equally bad:
- create discrete domain chains, one for each "hierarchy depth",
which is very hard to maintain
- create fake hierarchy levels for the shallow paths, leading
to all kind of problems (what are the safe hwirq values for these
fake levels?)
Implement the ability to cut short a single interrupt hierarchy
from a level marked as being disconnected by using the new
irq_domain_disconnect_hierarchy() helper.
The irqdomain allocation code will then perform the trimming
Maulik Shah [Mon, 28 Sep 2020 04:32:04 +0000 (10:02 +0530)]
irqchip/qcom-pdc: Reset PDC interrupts during init
Kexec can directly boot into a new kernel without going to complete
reboot. This can leave the previous kernel's configuration for PDC
interrupts as is.
Clear previous kernel's configuration during init by setting interrupts
in enable bank to zero. The IRQs specified in qcom,pdc-ranges property
are the only ones that can be used by the new kernel so clear only those
IRQs. The remaining ones may be in use by a different kernel and should
not be set by new kernel.
Maulik Shah [Mon, 28 Sep 2020 04:32:01 +0000 (10:02 +0530)]
genirq/PM: Introduce IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag
An interrupt that is disabled/masked but set for wakeup may still need to
be able to wake up the system from sleep states like "suspend to RAM".
To that effect, introduce the IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag.
If the irqchip have this flag set, the irq PM code will enable/unmask
the irqs that are marked for wakeup, but that are in a disabled state.
On resume, such irqs will be restored back to their disabled state.
Maulik Shah [Mon, 28 Sep 2020 04:32:00 +0000 (10:02 +0530)]
pinctrl: qcom: Use return value from irq_set_wake() call
msmgpio irqchip was not using return value of irq_set_irq_wake() callback
since previously GIC-v3 irqchip neither had IRQCHIP_SKIP_SET_WAKE flag nor
it implemented .irq_set_wake callback. This lead to irq_set_irq_wake()
return error -ENXIO.
However from 'commit 4110b5cbb014 ("irqchip/gic-v3: Allow interrupt to be
configured as wake-up sources")' GIC irqchip has IRQCHIP_SKIP_SET_WAKE
flag.
Use return value from irq_set_irq_wake() and irq_chip_set_wake_parent()
instead of always returning success.
Maulik Shah [Mon, 28 Sep 2020 04:31:59 +0000 (10:01 +0530)]
pinctrl: qcom: Set IRQCHIP_SET_TYPE_MASKED and IRQCHIP_MASK_ON_SUSPEND flags
Both IRQCHIP_SET_TYPE_MASKED and IRQCHIP_MASK_ON_SUSPEND flags are already
set for msmgpio's parent PDC irqchip but GPIO interrupts do not get masked
during suspend or during setting irq type since genirq checks irqchip flag
of msmgpio irqchip which forwards these calls to its parent PDC irqchip.
Add irqchip specific flags for msmgpio irqchip to mask non wakeirqs during
suspend and mask before setting irq type. Masking before changing type make
sures any spurious interrupt is not detected during this operation.
This interrupt controller is found in the Actions Semi Owl SoCs (S500,
S700 and S900) and provides support for handling up to 3 external
interrupt lines.
Each line can be independently configured as interrupt and triggers on
either of the edges or either of the levels. Additionally, each line
can also be masked individually.
Actions Semi Owl SoCs SIRQ interrupt controller is found in S500, S700
and S900 SoCs and provides support for handling up to 3 external
interrupt lines.
Zhen Lei [Thu, 24 Sep 2020 07:17:49 +0000 (15:17 +0800)]
genirq: Add stub for set_handle_irq() when !GENERIC_IRQ_MULTI_HANDLER
In order to avoid compilation errors when a driver references set_handle_irq(),
but that the architecture doesn't select GENERIC_IRQ_MULTI_HANDLER,
add a stub function that will just WARN_ON_ONCE() if ever used.
Marc Zyngier [Tue, 15 Sep 2020 13:03:51 +0000 (14:03 +0100)]
irqchip/gic: Cleanup Franken-GIC handling
Introduce a static key identifying Samsung's unique creation, allowing
to replace the indirect call to compute the base addresses with
a simple test on the static key.
Marc Zyngier [Mon, 14 Sep 2020 16:21:16 +0000 (17:21 +0100)]
irqchip/bcm2836: Provide mask/unmask dummy methods for IPIs
Although it doesn't seem possible to disable individual mailbox
interrupts, we still need to provide some callbacks.
Fixes: 09eb672ce4fb ("irqchip/bcm2836: Configure mailbox interrupts as standard interrupts") Reported-by: Marek Szyprowski <[email protected]> Tested-by: Marek Szyprowski <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
Marc Zyngier [Mon, 22 Jun 2020 20:23:36 +0000 (21:23 +0100)]
irqchip/armada-370-xp: Configure IPIs as standard interrupts
To introduce IPIs as standard interrupts to the Armada 370-XP
driver, let's allocate a completely separate irqdomain and
irqchip combo that lives parallel to the "standard" one.
This effectively should be modelled as a chained interrupt
controller, but the code is in such a state that it is
pretty hard to shoehorn, as it would require the rewrite
of the MSI layer as well.
Marc Zyngier [Sat, 20 Jun 2020 19:02:18 +0000 (20:02 +0100)]
irqchip/hip04: Configure IPIs as standard interrupts
In order to switch the hip04 driver to provide standard interrupts
for IPIs, rework the way interrupts are allocated, making sure
the irqdomain covers the SGIs as well as the rest of the interrupt
range.
The driver is otherwise so old-school that it creates all interrupts
upfront (duh!), so there is hardly anything else to change, apart
from communicating the IPIs to the arch code.
Marc Zyngier [Sat, 25 Apr 2020 14:24:01 +0000 (15:24 +0100)]
irqchip/gic: Configure SGIs as standard interrupts
Change the way we deal with GIC SGIs by turning them into proper
IRQs, and calling into the arch code to register the interrupt range
instead of a callback.
Marc Zyngier [Sat, 25 Apr 2020 14:24:01 +0000 (15:24 +0100)]
irqchip/gic-v3: Configure SGIs as standard interrupts
Change the way we deal with GICv3 SGIs by turning them into proper
IRQs, and calling into the arch code to register the interrupt range
instead of a callback.
Suman Anna [Wed, 16 Sep 2020 16:36:38 +0000 (18:36 +0200)]
irqchip/irq-pruss-intc: Add support for ICSSG INTC on K3 SoCs
The K3 AM65x and J721E SoCs have the next generation of the PRU-ICSS IP,
commonly called ICSSG. The PRUSS INTC present within the ICSSG supports
more System Events (160 vs 64), more Interrupt Channels and Host Interrupts
(20 vs 10) compared to the previous generation PRUSS INTC instances. The
first 2 and the last 10 of these host interrupt lines are used by the
PRU and other auxiliary cores and sub-modules within the ICSSG, with 8
host interrupts connected to MPU. The host interrupts 5, 6, 7 are also
connected to the other ICSSG instances within the SoC and can be
partitioned as per system integration through the board dts files.
Enhance the PRUSS INTC driver to add support for this ICSSG INTC
instance.
This implements the irq_get_irqchip_state and irq_set_irqchip_state
callbacks for the TI PRUSS INTC driver. The set callback can be used
by drivers to "kick" a PRU by injecting a PRU system event.
Suman Anna [Wed, 16 Sep 2020 16:36:36 +0000 (18:36 +0200)]
irqchip/irq-pruss-intc: Add logic for handling reserved interrupts
The PRUSS INTC has a fixed number of output interrupt lines that are
connected to a number of processors or other PRUSS instances or other
devices (like DMA) on the SoC. The output interrupt lines 2 through 9
are usually connected to the main Arm host processor and are referred
to as host interrupts 0 through 7 from ARM/MPU perspective.
All of these 8 host interrupts are not always exclusively connected
to the Arm interrupt controller. Some SoCs have some interrupt lines
not connected to the Arm interrupt controller at all, while a few others
have the interrupt lines connected to multiple processors in which they
need to be partitioned as per SoC integration needs. For example, AM437x
and 66AK2G SoCs have 2 PRUSS instances each and have the host interrupt 5
connected to the other PRUSS, while AM335x has host interrupt 0 shared
between MPU and TSC_ADC and host interrupts 6 & 7 shared between MPU and
a DMA controller.
Add logic to the PRUSS INTC driver to ignore both these shared and
invalid interrupts.
irqchip/irq-pruss-intc: Add a PRUSS irqchip driver for PRUSS interrupts
The Programmable Real-Time Unit Subsystem (PRUSS) contains a local
interrupt controller (INTC) that can handle various system input events
and post interrupts back to the device-level initiators. The INTC can
support upto 64 input events with individual control configuration and
hardware prioritization. These events are mapped onto 10 output interrupt
lines through two levels of many-to-one mapping support. Different
interrupt lines are routed to the individual PRU cores or to the host
CPU, or to other devices on the SoC. Some of these events are sourced
from peripherals or other sub-modules within that PRUSS, while a few
others are sourced from SoC-level peripherals/devices.
The PRUSS INTC platform driver manages this PRUSS interrupt controller
and implements an irqchip driver to provide a Linux standard way for
the PRU client users to enable/disable/ack/re-trigger a PRUSS system
event. The system events to interrupt channels and output interrupts
relies on the mapping configuration provided either through the PRU
firmware blob (for interrupts routed to PRU cores) or via the PRU
application's device tree node (for interrupt routed to the main CPU).
In the first case the mappings will be programmed on PRU remoteproc
driver demand (via irq_create_fwspec_mapping) during the boot of a PRU
core and cleaned up after the PRU core is stopped.
Reference counting is used to allow multiple system events to share a
single channel and to allow multiple channels to share a single host
event.
The PRUSS INTC module is reference counted during the interrupt
setup phase through the irqchip's irq_request_resources() and
irq_release_resources() ops. This restricts the module from being
removed as long as there are active interrupt users.
The driver currently supports and can be built for OMAP architecture
based AM335x, AM437x and AM57xx SoCs; Keystone2 architecture based
66AK2G SoCs and Davinci architecture based OMAP-L13x/AM18x/DA850 SoCs.
All of these SoCs support 64 system events, 10 interrupt channels and
10 output interrupt lines per PRUSS INTC with a few SoC integration
differences.
NOTE:
Each PRU-ICSS's INTC on AM57xx SoCs is preceded by a Crossbar that
enables multiple external events to be routed to a specific number
of input interrupt events. Any non-default external interrupt event
directed towards PRUSS needs this crossbar to be setup properly.
The Programmable Real-Time Unit and Industrial Communication Subsystem
(PRU-ICSS or simply PRUSS) contains an interrupt controller (INTC) that
can handle various system input events and post interrupts back to the
device-level initiators. The INTC can support up to 64 input events on
most SoCs with individual control configuration and h/w prioritization.
These events are mapped onto 10 output interrupt lines through two levels
of many-to-one mapping support. Different interrupt lines are routed to
the individual PRU cores or to the host CPU or to other PRUSS instances.
The K3 AM65x and J721E SoCs have the next generation of the PRU-ICSS IP,
commonly called ICSSG. The ICSSG interrupt controller on K3 SoCs provide
a higher number of host interrupts (20 vs 10) and can handle an increased
number of input events (160 vs 64) from various SoC interrupt sources.
Add the bindings document for these interrupt controllers on all the
applicable SoCs. It covers the OMAP architecture SoCs - AM33xx, AM437x
and AM57xx; the Keystone 2 architecture based 66AK2G SoC; the Davinci
architecture based OMAPL138 SoCs, and the K3 architecture based AM65x
and J721E SoCs.
Alexandru Elisei [Sat, 12 Sep 2020 15:37:07 +0000 (16:37 +0100)]
irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0
The GIC's internal view of the priority mask register and the assigned
interrupt priorities are based on whether GIC security is enabled and
whether firmware routes Group 0 interrupts to EL3. At the moment, we
support priority masking when ICC_PMR_EL1 and interrupt priorities are
either both modified by the GIC, or both left unchanged.
Trusted Firmware-A's default interrupt routing model allows Group 0
interrupts to be delivered to the non-secure world (SCR_EL3.FIQ == 0).
Unfortunately, this is precisely the case that the GIC driver doesn't
support: ICC_PMR_EL1 remains unchanged, but the GIC's view of interrupt
priorities is different from the software programmed values.
Support pseudo-NMIs when SCR_EL3.FIQ == 0 by using a different value to
mask regular interrupts. All the other values remain the same.
Alexandru Elisei [Sat, 12 Sep 2020 15:37:06 +0000 (16:37 +0100)]
irqchip/gic-v3: Spell out when pseudo-NMIs are enabled
When NMIs cannot be enabled, the driver prints a message stating that
unambiguously. When they are enabled, the only feedback we get is a message
regarding the use of synchronization for ICC_PMR_EL1 writes, which is not
as useful for a user who is not intimately familiar with how NMIs are
implemented.
Let's make it obvious that pseudo-NMIs are enabled. Keep the message about
using a barrier for ICC_PMR_EL1 writes, because it has a non-negligible
impact on performance.
Marc Zyngier [Tue, 23 Jun 2020 19:38:41 +0000 (20:38 +0100)]
ARM: Allow IPIs to be handled as normal interrupts
In order to deal with IPIs as normal interrupts, let's add
a new way to register them with the architecture code.
set_smp_ipi_range() takes a range of interrupts, and allows
the arch code to request them as if the were normal interrupts.
A standard handler is then called by the core IRQ code to deal
with the IPI.
This means that we don't need to call irq_enter/irq_exit, and
that we don't need to deal with set_irq_regs either. So let's
move the dispatcher into its own function, and leave handle_IPI()
as a compatibility function.
On the sending side, let's make use of ipi_send_mask, which
already exists for this purpose.
One of the major difference is that we end up, in some cases
(such as when performing IRQ time accounting on the scheduler
IPI), end up with nested irq_enter()/irq_exit() pairs.
Other than the (relatively small) overhead, there should be
no consequences to it (these pairs are designed to nest
correctly, and the accounting shouldn't be off).
Marc Zyngier [Sat, 25 Apr 2020 14:03:47 +0000 (15:03 +0100)]
arm64: Allow IPIs to be handled as normal interrupts
In order to deal with IPIs as normal interrupts, let's add
a new way to register them with the architecture code.
set_smp_ipi_range() takes a range of interrupts, and allows
the arch code to request them as if the were normal interrupts.
A standard handler is then called by the core IRQ code to deal
with the IPI.
This means that we don't need to call irq_enter/irq_exit, and
that we don't need to deal with set_irq_regs either. So let's
move the dispatcher into its own function, and leave handle_IPI()
as a compatibility function.
On the sending side, let's make use of ipi_send_mask, which
already exists for this purpose.
One of the major difference is that we end up, in some cases
(such as when performing IRQ time accounting on the scheduler
IPI), end up with nested irq_enter()/irq_exit() pairs.
Other than the (relatively small) overhead, there should be
no consequences to it (these pairs are designed to nest
correctly, and the accounting shouldn't be off).
Marc Zyngier [Tue, 19 May 2020 13:58:13 +0000 (14:58 +0100)]
genirq: Allow interrupts to be excluded from /proc/interrupts
A number of architectures implement IPI statistics directly,
duplicating the core kstat_irqs accounting. As we move IPIs to
being actual IRQs, we would end-up with a confusing display
in /proc/interrupts (where the IPIs would appear twice).
In order to solve this, allow interrupts to be flagged as
"hidden", which excludes them from /proc/interrupts.
Marc Zyngier [Tue, 19 May 2020 09:41:00 +0000 (10:41 +0100)]
genirq: Add fasteoi IPI flow
For irqchips using the fasteoi flow, IPIs are a bit special.
They need to be EOI'd early (before calling the handler), as
funny things may happen in the handler (they do not necessarily
behave like a normal interrupt).
Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block
Pull more io_uring fixes from Jens Axboe:
"Two followup fixes. One is fixing a regression from this merge window,
the other is two commits fixing cancelation of deferred requests.
Both have gone through full testing, and both spawned a few new
regression test additions to liburing.
- Don't play games with const, properly store the output iovec and
assign it as needed.
- Deferred request cancelation fix (Pavel)"
* tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
io_uring: fix linked deferred ->files cancellation
io_uring: fix cancel of deferred reqs with ->files
io_uring: fix explicit async read/write mapping for large segments
Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
pointer dereference bug and serialize a hardware register access as
required by the VT-d spec.
- two patches for AMD IOMMU to force AMD GPUs into translation mode
when memory encryption is active and disallow using IOMMUv2
functionality. This makes the AMDGPU driver work when memory
encryption is active.
- two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
Table Entries.
- MAINTAINERS file update for the Qualcom IOMMU driver.
* tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Handle 36bit addressing for x86-32
iommu/amd: Do not use IOMMUv2 functionality when SME is active
iommu/amd: Do not force direct mapping when SME is active
iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
iommu/vt-d: Serialize IOMMU GCMD register modifications
MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- more generic entry code ABI fallout
- debug register handling bugfixes
- fix vmalloc mappings on 32-bit kernels
- kprobes instrumentation output fix on 32-bit kernels
- fix over-eager WARN_ON_ONCE() on !SMAP hardware
- NUMA debugging fix
- fix Clang related crash on !RETPOLINE kernels
* tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry: Unbreak 32bit fast syscall
x86/debug: Allow a single level of #DB recursion
x86/entry: Fix AC assertion
tracing/kprobes, x86/ptrace: Fix regs argument order for i386
x86, fakenuma: Fix invalid starting node ID
x86/mm/32: Bring back vmalloc faulting on x86_32
x86/cmdline: Disable jump tables for cmdline.c
The GIC irqchips can now use a HW resend when a retrigger is invoked by
check_irq_resend(). However, should the HW resend fail, check_irq_resend()
will still attempt to trigger a SW resend, which is still a bad idea for
the GICs.
Prevent this from happening by setting IRQD_HANDLE_ENFORCE_IRQCTX on all
GIC IRQs. Technically per-cpu IRQs do not need this, as their flow handlers
never set IRQS_PENDING, but this aligns all IRQs wrt context enforcement:
this also forces all GIC IRQ handling to happen in IRQ context (as defined
by in_irq()).
While digging around IRQCHIP_EOI_IF_HANDLED and irq/resend.c, it has come
to my attention that the IRQ resend situation seems a bit precarious for
the GIC(s).
When marking an IRQ with IRQS_PENDING, handle_fasteoi_irq() will bail out
and issue an irq_eoi(). Should the IRQ in question be re-enabled,
check_irq_resend() will trigger a SW resend, which will go through the flow
handler again and issue *another* irq_eoi() on the *same* IRQ
activation. This is something the GIC spec clearly describes as a bad idea:
any EOI must match a previous ACK.
Implement irq_chip.irq_retrigger() for the GIC chips by setting the GIC
pending bit of the relevant IRQ. After being called by check_irq_resend(),
this will eventually trigger a *new* interrupt which we will handle as usual.
Marc Zyngier [Wed, 26 Aug 2020 17:37:50 +0000 (18:37 +0100)]
genirq: Walk the irq_data hierarchy when resending an interrupt
On resending an interrupt, we only check the outermost irqchip for
a irq_retrigger callback. However, this callback could be implemented
at an inner level. Use irq_chip_retrigger_hierarchy() in this case.
Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux
Pull misc fixes from Miguel Ojeda:
"A trivial patch for auxdisplay:
- Replace HTTP links with HTTPS ones (Alexander A. Klimov)
The usual clang-format trivial update:
- Update with the latest for_each macro list (Miguel Ojeda)
And Luc requested me to pick a sparse fix on my queue, so here it goes
along with other two trivial Compiler Attributes ones (also from Luc).
- sparse: use static inline for __chk_{user,io}_ptr() (Luc Van
Oostenryck)
- Compiler Attributes: remove comment about sparse not supporting
__has_attribute (Luc Van Oostenryck)"
* tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
auxdisplay: Replace HTTP links with HTTPS ones
* tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
clang-format: Update with the latest for_each macro list
* tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
sparse: use static inline for __chk_{user,io}_ptr()
Compiler Attributes: fix comment concerning GCC 4.6
Compiler Attributes: remove comment about sparse not supporting __has_attribute
Merge tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- HSDK-4xd Dev system: perf driver updates for sampling interrupt
- HSDK* Dev System: Ethernet broken [Evgeniy Didin]
- HIGHMEM broken (2 memory banks) [Mike Rapoport]
- show_regs() rewrite once and for all
- Other minor fixes
* tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id
arc: fix memory initialization for systems with two memory banks
irqchip/eznps: Fix build error for !ARC700 builds
ARC: show_regs: fix r12 printing and simplify
ARC: HSDK: wireup perf irq
ARC: perf: don't bail setup if pct irq missing in device-tree
ARC: pgalloc.h: delete a duplicated word + other fixes
Subsystems affected by this patch series: MAINTAINERS, ipc, fork,
checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration,
hugetlb)"
* emailed patches from Andrew Morton <[email protected]>:
include/linux/log2.h: add missing () around n in roundup_pow_of_two()
mm/khugepaged.c: fix khugepaged's request size in collapse_file
mm/hugetlb: fix a race between hugetlb sysctl handlers
mm/hugetlb: try preferred node first when alloc gigantic page from cma
mm/migrate: preserve soft dirty in remove_migration_pte()
mm/migrate: remove unnecessary is_zone_device_page() check
mm/rmap: fixup copying of soft dirty and uffd ptes
mm/migrate: fixup setting UFFD_WP flag
mm: madvise: fix vma user-after-free
checkpatch: fix the usage of capture group ( ... )
fork: adjust sysctl_max_threads definition to match prototype
ipc: adjust proc_ipc_sem_dointvec definition to match prototype
mm: track page table modifications in __apply_to_page_range()
MAINTAINERS: IA64: mark Status as Odd Fixes only
MAINTAINERS: add LLVM maintainers
MAINTAINERS: update Cavium/Marvell entries
mm: slub: fix conversion of freelist_corrupted()
mm: memcg: fix memcg reclaim soft lockup
memcg: fix use-after-free in uncharge_batch
David Howells [Fri, 4 Sep 2020 23:36:16 +0000 (16:36 -0700)]
mm/khugepaged.c: fix khugepaged's request size in collapse_file
collapse_file() in khugepaged passes PAGE_SIZE as the number of pages to
be read to page_cache_sync_readahead(). The intent was probably to read
a single page. Fix it to use the number of pages to the end of the
window instead.
Muchun Song [Fri, 4 Sep 2020 23:36:13 +0000 (16:36 -0700)]
mm/hugetlb: fix a race between hugetlb sysctl handlers
There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.
Fix this by duplicating the `table`, and only update the duplicate of
it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to
simplify the code.
Li Xinhai [Fri, 4 Sep 2020 23:36:10 +0000 (16:36 -0700)]
mm/hugetlb: try preferred node first when alloc gigantic page from cma
Since commit cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic
hugepages using cma"), the gigantic page would be allocated from node
which is not the preferred node, although there are pages available from
that node. The reason is that the nid parameter has been ignored in
alloc_gigantic_page().
Besides, the __GFP_THISNODE also need be checked if user required to
alloc only from the preferred node.
After this patch, the preferred node is tried first before other allowed
nodes, and don't try to allocate from other nodes if __GFP_THISNODE is
specified. If user don't specify the preferred node, the current node
will be used as preferred node, which makes sure consistent behavior of
allocating gigantic and non-gigantic hugetlb page.
Ralph Campbell [Fri, 4 Sep 2020 23:36:07 +0000 (16:36 -0700)]
mm/migrate: preserve soft dirty in remove_migration_pte()
The code to remove a migration PTE and replace it with a device private
PTE was not copying the soft dirty bit from the migration entry. This
could lead to page contents not being marked dirty when faulting the page
back from device private memory.
Patch series "mm/migrate: preserve soft dirty in remove_migration_pte()".
I happened to notice this from code inspection after seeing Alistair
Popple's patch ("mm/rmap: Fixup copying of soft dirty and uffd ptes").
This patch (of 2):
The check for is_zone_device_page() and is_device_private_page() is
unnecessary since the latter is sufficient to determine if the page is a
device private page. Simplify the code for easier reading.
mm/rmap: fixup copying of soft dirty and uffd ptes
During memory migration a pte is temporarily replaced with a migration
swap pte. Some pte bits from the existing mapping such as the soft-dirty
and uffd write-protect bits are preserved by copying these to the
temporary migration swap pte.
However these bits are not stored at the same location for swap and
non-swap ptes. Therefore testing these bits requires using the
appropriate helper function for the given pte type.
Unfortunately several code locations were found where the wrong helper
function is being used to test soft_dirty and uffd_wp bits which leads to
them getting incorrectly set or cleared during page-migration.
Fix these by using the correct tests based on pte type.
Commit f45ec5ff16a75 ("userfaultfd: wp: support swap and page migration")
introduced support for tracking the uffd wp bit during page migration.
However the non-swap PTE variant was used to set the flag for zone device
private pages which are a type of swap page.
This leads to corruption of the swap offset if the original PTE has the
uffd_wp flag set.
Yang Shi [Fri, 4 Sep 2020 23:35:55 +0000 (16:35 -0700)]
mm: madvise: fix vma user-after-free
The syzbot reported the below use-after-free:
BUG: KASAN: use-after-free in madvise_willneed mm/madvise.c:293 [inline]
BUG: KASAN: use-after-free in madvise_vma mm/madvise.c:942 [inline]
BUG: KASAN: use-after-free in do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145
Read of size 8 at addr ffff8880a6163eb0 by task syz-executor.0/9996
checkpatch: fix the usage of capture group ( ... )
The usage of "capture group (...)" in the immediate condition after `&&`
results in `$1` being uninitialized. This issues a warning "Use of
uninitialized value $1 in regexp compilation at ./scripts/checkpatch.pl
line 2638".
I noticed this bug while running checkpatch on the set of commits from
v5.7 to v5.8-rc1 of the kernel on the commits with a diff content in
their commit message.
This bug was introduced in the script by commit e518e9a59ec3
("checkpatch: emit an error when there's a diff in a changelog"). It
has been in the script since then.
The author intended to store the match made by capture group in variable
`$1`. This should have contained the name of the file as `[\w/]+`
matched. However, this couldn't be accomplished due to usage of capture
group and `$1` in the same regular expression.
Fix this by placing the capture group in the condition before `&&`.
Thus, `$1` can be initialized to the text that capture group matches
thereby setting it to the desired and required value.
fork: adjust sysctl_max_threads definition to match prototype
Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
definition of sysctl_max_threads to match its prototype in
linux/sysctl.h which fixes the following sparse error/warning:
kernel/fork.c:3050:47: warning: incorrect type in argument 3 (different address spaces)
kernel/fork.c:3050:47: expected void *
kernel/fork.c:3050:47: got void [noderef] __user *buffer
kernel/fork.c:3036:5: error: symbol 'sysctl_max_threads' redeclared with different type (incompatible argument 3 (different address spaces)):
kernel/fork.c:3036:5: int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )
kernel/fork.c: note: in included file (through include/linux/key.h, include/linux/cred.h, include/linux/sched/signal.h, include/linux/sched/cputime.h):
include/linux/sysctl.h:242:5: note: previously declared as:
include/linux/sysctl.h:242:5: int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )