libperf: Use sys/types.h to get ssize_t, not unistd.h
The sys/types.h header looks more sensible, from its name we can gather
it should be there because of some needed typedef, and it is much
smaller than unistd.h, so use it and fix up the fallout in places where
it was being used for something else entirely but being obtained by
sheer luck, indirectly.
perf tools: No need to include internal/lib.h from util/util.h
That was done just to have users of writen() and readn(), that before
had their prototypes in util/util.h to get it without having to add an
include for internal/lib.h, but the right way is to add it and by now
all places already do it.
Fix a fallout were readlink() was used but unistd.h was being obtained
by luck thru util.h -> internal/lib.h, now to check why unistd.h is
being included there...
Add perf_evsel__alloc_id()/perf_evsel__free_id() functions to libperf as
internal functions.
Move 'struct perf_sample_id' to internal/evsel.h header and change
'struct perf_sample_id::evsel' to 'struct perf_evsel' and the related
code that touches it.
perf evlist: Adopt backwards ring buffer state enum
As this isn't used at all in mmap.h but in evlist.h, so to cut down the
header dependency tree, move it to where it is used.
Also add mmap.h to the places using it but previously getting it
indirectly via evlist.h.
Add missing pthread.h to evlist.h, as it has a pthread_t struct member
and was getting the header via mmap.h.
Noticed while processing a Jiri's libperf batch touching mmap.h, where
almost everything gets rebuilt because evlist.h is so popular, so cut
down't this rebuild the world party.
Jiri Olsa [Thu, 12 Sep 2019 08:57:18 +0000 (10:57 +0200)]
tools: Add missing stdio.h include to asm/bug.h header
We have a direct fprintf() call in the header, so we need stdio.h
include, otherwise it could fail compilation if there's no prior stdio.h
include directive.
libtraceevent: Move traceevent plugins in its own subdirectory
All traceevent plugins code is moved to tools/lib/traceevent/plugins
subdirectory. It makes traceevent implementation in trace-cmd and in
kernel tree consistent. There is no changes in the way libtraceevent and
plugins are compiled and installed.
Committer notes:
Applied fixup provided by Steven, fixing the tools/perf/Makefile.perf
target for the plugin dynamic list file. Problem noticed when cross
building to aarch64 from a Ubuntu 19.04 container.
libtraceevent: Add tep_get_event() in event-parse.h
The tep_get_event() function is an official libtracevent API, described
in the library man pages. However, it cannot be used by the library users because
it is not declared in the event-parse.h file, where all libtracevent APIs are.
The function declaration is added in event-parse.h file.
libtraceevent: Man pages fix, changes in event printing APIs
APIs for printing various trace event information were redesigned to be
more simple. However, the main libtraceevent man page was not updated
with those changes. The documentation is updated to describe the new
event print API.
libtraceevent: Man pages fix, rename tep_ref_get() to tep_get_ref()
The tep_ref_get() was renamed to tep_get_ref(), to be more consistent
with the other tep_ref_* APIs. However, in the man pages the API is
still with the old name. The documentation is fixed to reflect the
actual name of the API.
libtraceevent: Round up in tep_print_event() time precision
When testing the output of the old trace-cmd compared to the one that
uses the updated tep_print_event() logic, it was different in that the
time stamp precision in the old format would round up to the nearest
precision, where as the new logic truncates. Bring back the old method
of rounding up.
Jose Abreu [Mon, 23 Sep 2019 07:49:08 +0000 (09:49 +0200)]
net: stmmac: selftests: Flow Control test can also run with ASYM Pause
The Flow Control selftest is also available with ASYM Pause. Lets add
this check to the test and fix eventual false positive failures.
Fixes: 091810dbded9 ("net: stmmac: Introduce selftests support") Signed-off-by: Jose Abreu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
This series includes two fixes. The first improves reset code to allow
linkwatch_event to proceed during reset. The second ensures that no more
than one thread runs in reset at a time.
v2:
- Separate change param reset from do_reset()
- Return IBMVNIC_OPEN_FAILED if __ibmvnic_open fails
- Remove setting wait_for_reset to false from __ibmvnic_reset(), this
is done in wait_for_reset()
- Move the check for force_reset_recovery from patch 1 to patch 2
v3:
- Restore reset’s successful return in open failure case
v4:
- Change resetting flag access to atomic
====================
Juliet Kim [Fri, 20 Sep 2019 20:11:23 +0000 (16:11 -0400)]
net/ibmvnic: prevent more than one thread from running in reset
The current code allows more than one thread to run in reset. This can
corrupt struct adapter data. Check adapter->resetting before performing
a reset, if there is another reset running delay (100 msec) before trying
again.
Juliet Kim [Fri, 20 Sep 2019 20:11:22 +0000 (16:11 -0400)]
net/ibmvnic: unlock rtnl_lock in reset so linkwatch_event can run
Commit a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset")
made the change to hold the RTNL lock during a reset to avoid deadlock
but linkwatch_event is fired during the reset and needs the RTNL lock.
That keeps linkwatch_event process from proceeding until the reset
is complete. The reset process cannot tolerate the linkwatch_event
processing after reset completes, so release the RTNL lock during the
process to allow a chance for linkwatch_event to run during reset.
This does not guarantee that the linkwatch_event will be processed as
soon as link state changes, but is an improvement over the current code
where linkwatch_event processing is always delayed, which prevents
transmissions on the device from being deactivated leading transmit
watchdog timer to time-out.
Release the RTNL lock before link state change and re-acquire after
the link state change to allow linkwatch_event to grab the RTNL lock
and run during the reset.
Fixes: a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset") Signed-off-by: Juliet Kim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
tracing/probe: Fix same probe event argument matching
Commit fe60b0ce8e73 ("tracing/probe: Reject exactly same probe event")
tries to reject a event which matches an already existing probe.
However it currently continues to match arguments and rejects adding a
probe even when the arguments don't match. Fix this by only rejecting a
probe if and only if all the arguments match.
netfilter: nf_tables: bogus EBUSY when deleting flowtable after flush
The deletion of a flowtable after a flush in the same transaction
results in EBUSY. This patch adds an activation and deactivation of
flowtables in order to update the _use_ counter.
netfilter: ebtables: use __u8 instead of uint8_t in uapi header
When CONFIG_UAPI_HEADER_TEST=y, exported headers are compile-tested to
make sure they can be included from user-space.
Currently, linux/netfilter_bridge/ebtables.h is excluded from the test
coverage. To make it join the compile-test, we need to fix the build
errors attached below.
For a case like this, we decided to use __u{8,16,32,64} variable types
in this discussion:
https://lkml.org/lkml/2019/6/5/18
Build log:
CC usr/include/linux/netfilter_bridge/ebtables.h.s
In file included from <command-line>:32:0:
./usr/include/linux/netfilter_bridge/ebtables.h:126:4: error: unknown type name ‘uint8_t’
uint8_t revision;
^~~~~~~
./usr/include/linux/netfilter_bridge/ebtables.h:139:4: error: unknown type name ‘uint8_t’
uint8_t revision;
^~~~~~~
./usr/include/linux/netfilter_bridge/ebtables.h:152:4: error: unknown type name ‘uint8_t’
uint8_t revision;
^~~~~~~
Wanpeng Li [Mon, 9 Sep 2019 01:40:28 +0000 (09:40 +0800)]
Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"
This patch reverts commit 75437bb304b20 (locking/pvqspinlock: Don't
wait if vCPU is preempted). A large performance regression was caused
by this commit. on over-subscription scenarios.
The test was run on a Xeon Skylake box, 2 sockets, 40 cores, 80 threads,
with three VMs of 80 vCPUs each. The score of ebizzy -M is reduced from
13000-14000 records/s to 1700-1800 records/s:
Exit from aggressive wait-early mechanism can result in premature yield
and extra scheduling latency.
Actually, only 6% of wait_early events are caused by vcpu_is_preempted()
being true. However, when one vCPU voluntarily releases its vCPU, all
the subsequently waiters in the queue will do the same and the cascading
effect leads to bad performance.
kvm optimizations:
[1] commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts)
[2] commit 266e85a5ec9 (KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption)
Sam Shih [Thu, 19 Sep 2019 22:49:03 +0000 (06:49 +0800)]
pwm: mediatek: Remove the has_clks field
We can use fixed clocks to repair mt7628 PWM during configure from
userspace. The SoC is legacy MIPS and has no complex clock tree. Because
we can get the clock frequency for period calculation from fixed clocks
specified in DT, we can remove the has_clock field, and directly use
devm_clk_get() and clk_get_rate().
Linus Walleij [Wed, 28 Aug 2019 13:03:20 +0000 (15:03 +0200)]
thermal: db8500: Rewrite to be a pure OF sensor
This patch rewrites the DB8500 thermal sensor to be a
pure OF sensor, so that it can be used with thermal zones
defined in the device tree.
This driver was initially merged before we had generic
thermal zone device tree bindings, and now it gets
modernized to the way we do things these days.
The old driver depended on a set of trigger points
provided in the device tree or platform data to
interpolate the current temperature between trigger
points depending on whether the trend was rising or
falling. This was bad because the trigger points should
be used for defining temperature zone policies and
bind to cooling devices.
As the PRCMU (power reset control management unit) can
only issue IRQs when we pass temperature trigger points
upward or downward We instead define a number of
temperature points inside the driver ranging from
15 to 100 degrees celsius. The effect is that when
we register the device we quickly trigger 15, 20 ... up
to the room temperature in succession and then we
get continous event IRQs also under normal operating
conditions, and the temperature of the system is now
reported more accurately (+/- 2.5 degrees celsius)
while in the past the first trigger point was at 70
degrees and the average temperature was simply reported
as 35 degrees celsius (between 70 degrees and 0) until
we passed 70 degrees which didn't accurately represent
the temperature of the system.
As a result of dropping all the trigger points from the
driver and reusing the core DT thermal zone management
code we reduce the code footprint quite a bit.
Linus Walleij [Wed, 28 Aug 2019 13:03:18 +0000 (15:03 +0200)]
thermal: db8500: Finalize device tree conversion
At some point there was an attempt to convert the DB8500
thermal sensor to device tree: a probe path was added
and the device tree was augmented for the Snowball board.
The switchover was never completed: instead the thermal
devices came from from the PRCMU MFD device and the probe
on the Snowball was confused as another set of configuration
appeared from the device tree.
Move over to a device-tree only approach, as we fixed up
the device trees.
Steve French [Wed, 25 Sep 2019 04:27:34 +0000 (23:27 -0500)]
smb3: Add missing reparse tags
Additional reparse tags were described for WSL and file sync.
Add missing defines for these tags. Some will be useful for
POSIX extensions (as discussed at Storage Developer Conference).
- the slave EEPROM backend gained 16 bit address support
- and lots of regular driver updates and reworks
* 'i2c/for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (52 commits)
i2c: tegra: Move suspend handling to NOIRQ phase
i2c: imx: ACPI support for NXP i2c controller
i2c: uniphier(-f): remove all dev_dbg()
i2c: uniphier(-f): use devm_platform_ioremap_resource()
i2c: slave-eeprom: Add comment about address handling
i2c: exynos5: Remove IRQF_ONESHOT
i2c: stm32f7: Make structure stm32f7_i2c_algo constant
i2c: cht-wc: drop check because i2c_unregister_device() is NULL safe
i2c-eeprom_slave: Add support for more eeprom models
i2c: fsi: Add of_put_node() before break
i2c: synquacer: Make synquacer_i2c_ops constant
i2c: hix5hd2: Remove IRQF_ONESHOT
i2c: i801: Use iTCO version 6 in Cannon Lake PCH and beyond
watchdog: iTCO: Add support for Cannon Lake PCH iTCO
i2c: iproc: Make bcm_iproc_i2c_quirks constant
i2c: iproc: Add full name of devicetree node to adapter name
i2c: piix4: Add ACPI support
i2c: piix4: Fix probing of reserved ports on AMD Family 16h Model 30h
i2c: ocores: use request_any_context_irq() to register IRQ handler
i2c: designware: Fix optional reset error handling
...
Merge tag 'sound-fix-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few small remaining wrap-up for this merge window.
Most of patches are device-specific (HD-audio and USB-audio quirks,
FireWire, pcm316a, fsl, rsnd, Atmel, and TI fixes), while there is a
simple fix (actually two commits) for ASoC core"
* tag 'sound-fix-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Add DSD support for EVGA NU Audio
ALSA: hda - Add laptop imic fixup for ASUS M9V laptop
ASoC: ti: fix SND_SOC_DM365_VOICE_CODEC dependencies
ASoC: pcm3168a: The codec does not support S32_LE
ASoC: core: use list_del_init and move it back to soc_cleanup_component
ALSA: hda/realtek - PCI quirk for Medion E4254
ALSA: hda - Apply AMD controller workaround for Raven platform
ASoC: rsnd: do error check after rsnd_channel_normalization()
ASoC: atmel_ssc_dai: Remove wrong spinlock usage
ASoC: core: delete component->card_list in soc_remove_component only
ASoC: fsl_sai: Fix noise when using EDMA
ALSA: usb-audio: Add Hiby device family to quirks for native DSD support
ALSA: hda/realtek - Fix alienware headset mic
ALSA: dice: fix wrong packet parameter for Alesis iO26
Jarkko Sakkinen [Mon, 16 Sep 2019 08:38:34 +0000 (11:38 +0300)]
tpm: Wrap the buffer from the caller to tpm_buf in tpm_send()
tpm_send() does not give anymore the result back to the caller. This
would require another memcpy(), which kind of tells that the whole
approach is somewhat broken. Instead, as Mimi suggested, this commit
just wraps the data to the tpm_buf, and thus the result will not go to
the garbage.
Obviously this assumes from the caller that it passes large enough
buffer, which makes the whole API somewhat broken because it could be
different size than @buflen but since trusted keys is the only module
using this API right now I think that this fix is sufficient for the
moment.
In the near future the plan is to replace the parameters with a tpm_buf
created by the caller.
Denis Efremov [Thu, 15 Aug 2019 22:12:00 +0000 (01:12 +0300)]
MAINTAINERS: keys: Update path to trusted.h
Update MAINTAINERS record to reflect that trusted.h
was moved to a different directory in commit 22447981fc05
("KEYS: Move trusted.h to include/keys [ver #2]").
KEYS: trusted: correctly initialize digests and fix locking issue
Commit 0b6cf6b97b7e ("tpm: pass an array of tpm_extend_digest structures to
tpm_pcr_extend()") modifies tpm_pcr_extend() to accept a digest for each
PCR bank. After modification, tpm_pcr_extend() expects that digests are
passed in the same order as the algorithms set in chip->allocated_banks.
This patch fixes two issues introduced in the last iterations of the patch
set: missing initialization of the TPM algorithm ID in the tpm_digest
structures passed to tpm_pcr_extend() by the trusted key module, and
unreleased locks in the TPM driver due to returning from tpm_pcr_extend()
without calling tpm_put_ops().
Merge tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block
Pull more io_uring updates from Jens Axboe:
"A collection of later fixes and additions, that weren't quite ready
for pushing out with the initial pull request.
This contains:
- Fix potential use-after-free of shadow requests (Jackie)
- Fix potential OOM crash in request allocation (Jackie)
- kmalloc+memcpy -> kmemdup cleanup (Jackie)
- Fix poll crash regression (me)
- Fix SQ thread not being nice and giving up CPU for !PREEMPT (me)
- Add support for timeouts, making it easier to do epoll_wait()
conversions, for instance (me)
- Ensure io_uring works without f_ops->read_iter() and
f_ops->write_iter() (me)"
* tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block:
io_uring: correctly handle non ->{read,write}_iter() file_operations
io_uring: IORING_OP_TIMEOUT support
io_uring: use cond_resched() in sqthread
io_uring: fix potential crash issue due to io_get_req failure
io_uring: ensure poll commands clear ->sqe
io_uring: fix use-after-free of shadow_req
io_uring: use kmemdup instead of kmalloc and memcpy
Merge tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"Some later additions that weren't quite done for the first pull
request, and also a few fixes that have arrived since.
This contains:
- Kill silly pktcdvd warning on attempting to register a non-scsi
passthrough device (me)
- Use symbolic constants for the block t10 protection types, and
switch to handling it in core rather than in the drivers (Max)
- libahci platform missing node put fix (Nishka)
- Small series of fixes for BFQ (Paolo)
- Fix possible nbd crash (Xiubo)"
* tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block:
block: drop device references in bsg_queue_rq()
block: t10-pi: fix -Wswitch warning
pktcdvd: remove warning on attempting to register non-passthrough dev
ata: libahci_platform: Add of_node_put() before loop exit
nbd: fix possible page fault for nbd disk
nbd: rename the runtime flags as NBD_RT_ prefixed
block, bfq: push up injection only after setting service time
block, bfq: increase update frequency of inject limit
block, bfq: reduce upper bound for inject limit to max_rq_in_driver+1
block, bfq: update inject limit only after injection occurred
block: centralize PI remapping logic to the block layer
block: use symbolic constants for t10_pi type
* emailed patches from Andrew Morton <[email protected]>: (132 commits)
mm/zsmalloc.c: fix a -Wunused-function warning
zswap: do not map same object twice
zswap: use movable memory if zpool support allocate movable memory
zpool: add malloc_support_movable to zpool_driver
shmem: fix obsolete comment in shmem_getpage_gfp()
mm/madvise: reduce code duplication in error handling paths
mm: mmap: increase sockets maximum memory size pgoff for 32bits
mm/mmap.c: refine find_vma_prev() with rb_last()
riscv: make mmap allocation top-down by default
mips: use generic mmap top-down layout and brk randomization
mips: replace arch specific way to determine 32bit task with generic version
mips: adjust brk randomization offset to fit generic version
mips: use STACK_TOP when computing mmap base address
mips: properly account for stack randomization and stack guard gap
arm: use generic mmap top-down layout and brk randomization
arm: use STACK_TOP when computing mmap base address
arm: properly account for stack randomization and stack guard gap
arm64, mm: make randomization selected by generic topdown mmap layout
arm64, mm: move generic mmap layout functions to mm
arm64: consider stack randomization for mmap base only when necessary
...
zswap_writeback_entry() maps a handle to read swpentry first, and
then in the most common case it would map the same handle again.
This is ok when zbud is the backend since its mapping callback is
plain and simple, but it slows things down for z3fold.
Since there's hardly a point in unmapping a handle _that_ fast as
zswap_writeback_entry() does when it reads swpentry, the
suggestion is to keep the handle mapped till the end.
zswap: use movable memory if zpool support allocate movable memory
This is the third version that was updated according to the comments from
Sergey Senozhatsky https://lkml.org/lkml/2019/5/29/73 and Shakeel Butt
https://lkml.org/lkml/2019/6/4/973
zswap compresses swap pages into a dynamically allocated RAM-based memory
pool. The memory pool should be zbud, z3fold or zsmalloc. All of them
will allocate unmovable pages. It will increase the number of unmovable
page blocks that will bad for anti-fragment.
zsmalloc support page migration if request movable page:
handle = zs_malloc(zram->mem_pool, comp_len,
GFP_NOIO | __GFP_HIGHMEM |
__GFP_MOVABLE);
And commit "zpool: Add malloc_support_movable to zpool_driver" add
zpool_malloc_support_movable check malloc_support_movable to make sure if
a zpool support allocate movable memory.
This commit let zswap allocate block with gfp
__GFP_HIGHMEM | __GFP_MOVABLE if zpool support allocate movable memory.
Following part is test log in a pc that has 8G memory and 2G swap.
As a zpool_driver, zsmalloc can allocate movable memory because it support
migate pages. But zbud and z3fold cannot allocate movable memory.
Add malloc_support_movable to zpool_driver. If a zpool_driver support
allocate movable memory, set it to true. And add
zpool_malloc_support_movable check malloc_support_movable to make sure if
a zpool support allocate movable memory.
Miles Chen [Mon, 23 Sep 2019 22:39:34 +0000 (15:39 -0700)]
shmem: fix obsolete comment in shmem_getpage_gfp()
Replace "fault_mm" with "vmf" in code comment because commit cfda05267f7b
("userfaultfd: shmem: add userfaultfd hook for shared memory faults") has
changed the prototpye of shmem_getpage_gfp() - pass vmf instead of
fault_mm to the function.
Ivan Khoronzhuk [Mon, 23 Sep 2019 22:39:28 +0000 (15:39 -0700)]
mm: mmap: increase sockets maximum memory size pgoff for 32bits
The AF_XDP sockets umem mapping interface uses XDP_UMEM_PGOFF_FILL_RING
and XDP_UMEM_PGOFF_COMPLETION_RING offsets. These offsets are
established already and are part of the configuration interface.
But for 32-bit systems, using AF_XDP socket configuration, these values
are too large to pass the maximum allowed file size verification. The
offsets can be tuned off, but instead of changing the existing
interface, let's extend the max allowed file size for sockets.
No one has been using this until this patch with 32 bits as without
this fix af_xdp sockets can't be used at all, so it unblocks af_xdp
socket usage for 32bit systems.
All list of mmap cbs for sockets was verified for side effects and all
of them contain dummy cb - sock_no_mmap() at this moment, except the
following:
xsk_mmap() - it's what this fix is needed for.
tcp_mmap() - doesn't have obvious issues with pgoff - no any references on it.
packet_mmap() - return -EINVAL if it's even set.
Wei Yang [Mon, 23 Sep 2019 22:39:25 +0000 (15:39 -0700)]
mm/mmap.c: refine find_vma_prev() with rb_last()
When addr is out of range of the whole rb_tree, pprev will point to the
right-most node. rb_tree facility already provides a helper function,
rb_last(), to do this task. We can leverage this instead of
reimplementing it.
This patch refines find_vma_prev() with rb_last() to make it a little
nicer to read.
mips: use generic mmap top-down layout and brk randomization
mips uses a top-down layout by default that exactly fits the generic
functions, so get rid of arch specific code and use the generic version by
selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
As ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT selects ARCH_HAS_ELF_RANDOMIZE,
use the generic version of arch_randomize_brk since it also fits. Note
that this commit also removes the possibility for mips to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.
mips: replace arch specific way to determine 32bit task with generic version
Mips uses TASK_IS_32BIT_ADDR to determine if a task is 32bit, but this
define is mips specific and other arches do not have it: instead, use
!IS_ENABLED(CONFIG_64BIT) || is_compat_task() condition.
mips: properly account for stack randomization and stack guard gap
This commit takes care of stack randomization and stack guard gap when
computing mmap base address and checks if the task asked for
randomization. This fixes the problem uncovered and not fixed for arm
here: https://lkml.kernel.org/r/20170622200033[email protected]
arm: use generic mmap top-down layout and brk randomization
arm uses a top-down mmap layout by default that exactly fits the generic
functions, so get rid of arch specific code and use the generic version by
selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
As ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT selects ARCH_HAS_ELF_RANDOMIZE,
use the generic version of arch_randomize_brk since it also fits. Note
that this commit also removes the possibility for arm to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.
Note that it is safe to remove STACK_RND_MASK since it matches the default
value.
arm: properly account for stack randomization and stack guard gap
This commit takes care of stack randomization and stack guard gap when
computing mmap base address and checks if the task asked for
randomization. This fixes the problem uncovered and not fixed for arm
here: https://lkml.kernel.org/r/20170622200033[email protected]
arm64, mm: make randomization selected by generic topdown mmap layout
This commits selects ARCH_HAS_ELF_RANDOMIZE when an arch uses the generic
topdown mmap layout functions so that this security feature is on by
default.
Note that this commit also removes the possibility for arm64 to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.
arm64, mm: move generic mmap layout functions to mm
arm64 handles top-down mmap layout in a way that can be easily reused by
other architectures, so make it available in mm. It then introduces a new
config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT that can be set by other
architectures to benefit from those functions. Note that this new config
depends on MMU being enabled, if selected without MMU support, a warning
will be thrown.
arm64: consider stack randomization for mmap base only when necessary
Do not offset mmap base address because of stack randomization if current
task does not want randomization. Note that x86 already implements this
behaviour.
arm64: make use of is_compat_task instead of hardcoding this test
Each architecture has its own way to determine if a task is a compat task,
by using is_compat_task in arch_mmap_rnd, it allows more genericity and
then it prepares its moving to mm/.
Patch series "Provide generic top-down mmap layout functions", v6.
This series introduces generic functions to make top-down mmap layout
easily accessible to architectures, in particular riscv which was the
initial goal of this series. The generic implementation was taken from
arm64 and used successively by arm, mips and finally riscv.
Note that in addition the series fixes 2 issues:
- stack randomization was taken into account even if not necessary.
- [1] fixed an issue with mmap base which did not take into account
randomization but did not report it to arm and mips, so by moving arm64
into a generic library, this problem is now fixed for both
architectures.
This work is an effort to factorize architecture functions to avoid code
duplication and oversights as in [1].