Git Repo - linux.git/log

sata_rcar: convert to SPDX identifiers

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

ubd: fix missing initialization of io_req

The SYNC path doesn't initialize io_req->error, which can cause
random errors. Before the conversion to blk-mq, we always
completed requests with BLK_STS_OK status, but now we actually
look at the error field and this issue becomes apparent.

Signed-off-by: Anton Ivanov <[email protected]>
[axboe: fixed up commit message to explain what is actually going on]

Signed-off-by: Jens Axboe <[email protected]>

arch/alpha, termios: implement BOTHER, IBSHIFT and termios2

Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags
using arbitrary flags. Because BOTHER is not defined, the general
Linux code doesn't allow setting arbitrary baud rates, and because
CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in
drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037.

Resolve both problems by #defining BOTHER to 037 on Alpha.

However, userspace still needs to know if setting BOTHER is actually
safe given legacy kernels (does anyone actually care about that on
Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even
though they use the same structure. Define struct termios2 just for
compatibility; it is the exact same structure as struct termios. In a
future patchset, this will be cleaned up so the uapi headers are
usable from libc.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Kate Stewart <[email protected]>
Cc: Philippe Ombredanne <[email protected]>
Cc: Eugene Syromiatnikov <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: Johan Hovold <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge tag 'compiler-attributes-for-linus-v4.20-rc2' of https://github.com/ojeda/linux

Pull compiler attribute fixlets from Miguel Ojeda:
"Small improvements to Compiler Attributes:

   - Define asm_volatile_goto for non-gcc compilers (Nick Desaulniers)

   - Improve the explanation of compiler_attributes.h"

* tag 'compiler-attributes-for-linus-v4.20-rc2' of https://github.com/ojeda/linux:
  Compiler Attributes: improve explanation of header
  include/linux/compiler*.h: define asm_volatile_goto

Merge tag 'mtd/fixes-for-4.20-rc2' of git://git.infradead.org/linux-mtd

Pull MTD fixes from Boris Brezillon:
"MTD changes:
   - Kill a VLA in sa1100

  SPI NOR changes:
   - Make sure ->addr_width is restored when SFDP parsing fails
   - Propate errors happening in cqspi_direct_read_execute()

  NAND changes:
   - Fix kernel-doc mismatch
   - Fix nanddev_neraseblocks() to return the correct value
   - Avoid selection of BCH_CONST_PARAMS when some users require dynamic
     BCH settings"

* tag 'mtd/fixes-for-4.20-rc2' of git://git.infradead.org/linux-mtd:
  mtd: nand: Fix nanddev_pos_next_page() kernel-doc header
  mtd: sa1100: avoid VLA in sa1100_setup_mtd
  mtd: spi-nor: Reset nor->addr_width when SFDP parsing failed
  mtd: spi-nor: cadence-quadspi: Return error code in cqspi_direct_read_execute()
  mtd: nand: Fix nanddev_neraseblocks()
  mtd: nand: drop kernel-doc notation for a deleted function parameter
  mtd: docg3: don't set conflicting BCH_CONST_PARAMS option

termios, tty/tty_baudrate.c: fix buffer overrun

On architectures with CBAUDEX == 0 (Alpha and PowerPC), the code in tty_baudrate.c does
not do any limit checking on the tty_baudrate[] array, and in fact a
buffer overrun is possible on both architectures. Add a limit check to
prevent that situation.

This will be followed by a much bigger cleanup/simplification patch.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Requested-by: Cc: Johan Hovold <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Kate Stewart <[email protected]>
Cc: Philippe Ombredanne <[email protected]>
Cc: Eugene Syromiatnikov <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

vt: fix broken display when running aptitude

If you run aptitude on framebuffer console, the display is corrupted. The
corruption is caused by the commit d8ae7242. The patch adds "offset" to
"start" when calling scr_memsetw, but it forgets to do the same addition
on a subsequent call to do_update_region.

Signed-off-by: Mikulas Patocka <[email protected]>
Fixes: d8ae72427187 ("vt: preserve unicode values corresponding to screen characters")
Reviewed-by: Nicolas Pitre <[email protected]>
Cc: [email protected] # 4.19
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Compiler Attributes: improve explanation of header

Explain better what "optional" attributes are, and avoid calling
them so to avoid confusion. Simply retain "Optional" as a word
to look for in the comments.

Moreover, add a couple sentences to explain a bit more the intention
and the documentation links.

Signed-off-by: Miguel Ojeda <[email protected]>

mount: Prevent MNT_DETACH from disconnecting locked mounts

Timothy Baldwin <[email protected]> wrote:
> As per mount_namespaces(7) unprivileged users should not be able to look under mount points:
>
>   Mounts that come as a single unit from more privileged mount are locked
>   together and may not be separated in a less privileged mount namespace.
>
> However they can:
>
> 1. Create a mount namespace.
> 2. In the mount namespace open a file descriptor to the parent of a mount point.
> 3. Destroy the mount namespace.
> 4. Use the file descriptor to look under the mount point.
>
> I have reproduced this with Linux 4.16.18 and Linux 4.18-rc8.
>
> The setup:
>
> $ sudo sysctl kernel.unprivileged_userns_clone=1
> kernel.unprivileged_userns_clone = 1
> $ mkdir -p A/B/Secret
> $ sudo mount -t tmpfs hide A/B
>
>
> "Secret" is indeed hidden as expected:
>
> $ ls -lR A
> A:
> total 0
> drwxrwxrwt 2 root root 40 Feb 12 21:08 B
>
> A/B:
> total 0
>
>
> The attack revealing "Secret":
>
> $ unshare -Umr sh -c "exec unshare -m ls -lR /proc/self/fd/4/ 4<A"
> /proc/self/fd/4/:
> total 0
> drwxr-xr-x 3 root root 60 Feb 12 21:08 B
>
> /proc/self/fd/4/B:
> total 0
> drwxr-xr-x 2 root root 40 Feb 12 21:08 Secret
>
> /proc/self/fd/4/B/Secret:
> total 0

I tracked this down to put_mnt_ns running passing UMOUNT_SYNC and
disconnecting all of the mounts in a mount namespace.  Fix this by
factoring drop_mounts out of drop_collected_mounts and passing
0 instead of UMOUNT_SYNC.

There are two possible behavior differences that result from this.
- No longer setting UMOUNT_SYNC will no longer set MNT_SYNC_UMOUNT on
  the vfsmounts being unmounted.  This effects the lazy rcu walk by
  kicking the walk out of rcu mode and forcing it to be a non-lazy
  walk.
- No longer disconnecting locked mounts will keep some mounts around
  longer as they stay because the are locked to other mounts.

There are only two users of drop_collected mounts: audit_tree.c and
put_mnt_ns.

In audit_tree.c the mounts are private and there are no rcu lazy walks
only calls to iterate_mounts. So the changes should have no effect
except for a small timing effect as the connected mounts are disconnected.

In put_mnt_ns there may be references from process outside the mount
namespace to the mounts.  So the mounts remaining connected will
be the bug fix that is needed.  That rcu walks are allowed to continue
appears not to be a problem especially as the rcu walk change was about
an implementation detail not about semantics.

Cc: [email protected]
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Reported-by: Timothy Baldwin <[email protected]>
Tested-by: Timothy Baldwin <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>

s390/perf: Change CPUM_CF return code in event init function

The function perf_init_event() creates a new event and
assignes it to a PMU. This a done in a loop over all existing
PMUs. For each listed PMU the event init function is called
and if this function does return any other error than -ENOENT,
the loop is terminated the creation of the event fails.

If the event is invalid, return -ENOENT to try other PMUs.

Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

posix-cpu-timers: Remove useless call to check_dl_overrun()

check_dl_overrun() is used to send a SIGXCPU to users that asked to be
informed when a SCHED_DEADLINE runtime overruns occur.

The function is called by check_thread_timers() already, so the call in
check_process_timers() is redundant/wrong (even though harmless).

Remove it.

Fixes: 34be39305a77 ("sched/deadline: Implement "runtime overrun signal" support")
Signed-off-by: Juri Lelli <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Mathieu Poirier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Claudio Scordino <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

qlcnic: remove assumption that vlan_tci != 0

VLAN.TCI == 0 is perfectly valid (802.1p), so allow it to be accelerated.

Signed-off-by: Michał Mirosław <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: fix accelerated VLAN handling

Don't request tag insertion when it isn't present in outgoing skb.

Signed-off-by: Michał Mirosław <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts

Jonathan Calmels from NVIDIA reported that he's able to bypass the
mount visibility security check in place in the Linux kernel by using
a combination of the unbindable property along with the private mount
propagation option to allow a unprivileged user to see a path which
was purposefully hidden by the root user.

Reproducer:
  # Hide a path to all users using a tmpfs
  root@castiana:~# mount -t tmpfs tmpfs /sys/devices/
  root@castiana:~#

  # As an unprivileged user, unshare user namespace and mount namespace
  stgraber@castiana:~$ unshare -U -m -r

  # Confirm the path is still not accessible
  root@castiana:~# ls /sys/devices/

  # Make /sys recursively unbindable and private
  root@castiana:~# mount --make-runbindable /sys
  root@castiana:~# mount --make-private /sys

  # Recursively bind-mount the rest of /sys over to /mnnt
  root@castiana:~# mount --rbind /sys/ /mnt

  # Access our hidden /sys/device as an unprivileged user
  root@castiana:~# ls /mnt/devices/
  breakpoint cpu cstate_core cstate_pkg i915 intel_pt isa kprobe
  LNXSYSTM:00 msr pci0000:00 platform pnp0 power software system
  tracepoint uncore_arb uncore_cbox_0 uncore_cbox_1 uprobe virtual

Solve this by teaching copy_tree to fail if a mount turns out to be
both unbindable and locked.

Cc: [email protected]
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Reported-by: Jonathan Calmels <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>

mount: Retest MNT_LOCKED in do_umount

It was recently pointed out that the one instance of testing MNT_LOCKED
outside of the namespace_sem is in ksys_umount.

Fix that by adding a test inside of do_umount with namespace_sem and
the mount_lock held. As it helps to fail fails the existing test is
maintained with an additional comment pointing out that it may be racy
because the locks are not held.

Cc: [email protected]
Reported-by: Al Viro <[email protected]>
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Signed-off-by: "Eric W. Biederman" <[email protected]>

Merge branch 'FDDI-defza-Fix-a-bunch-of-small-issues'

Maciej W. Rozycki says:

====================
FDDI: defza: Fix a bunch of small issues

Here is a bunch of small fixes addressing issues that I missed in my
final round of testing. None of these affect run-time behaviour. One was
actually found by the kbuild bot, which turned out to be more pedantic
than my compiler. See individual change descriptions for details.
====================

Signed-off-by: David S. Miller <[email protected]>

FDDI: defza: Make the driver version string constant

The driver version string is obviously not meant to be changed at run
time, so mark it `const'.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

FDDI: defza: Move SMT Tx data buffer declaration next to its skb

Move the temporary data buffer used when tapping into the SMT Tx queue
from the outer function level into the conditional block it's actually
used in and its containing skb is also declared, making the structure of
code better.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

FDDI: defza: Add missing comment closing

Fix:

drivers/net/fddi/defza.h:238:1: warning: "/*" within comment [-Wcomment]

by adding a missing comment closing.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

FDDI: defza: Fix SPDX annotation

The SPDX annotation for this driver does not match the license text,
which specifies GNU GPL 2 or later. Make the two match by correcting
the SPDX tag.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

userns: also map extents in the reverse map to kernel IDs

The current logic first clones the extent array and sorts both copies, then
maps the lower IDs of the forward mapping into the lower namespace, but
doesn't map the lower IDs of the reverse mapping.

This means that code in a nested user namespace with >5 extents will see
incorrect IDs. It also breaks some access checks, like
inode_owner_or_capable() and privileged_wrt_inode_uidgid(), so a process
can incorrectly appear to be capable relative to an inode.

To fix it, we have to make sure that the "lower_first" members of extents
in both arrays are translated; and we have to make sure that the reverse
map is sorted *after* the translation (since otherwise the translation can
break the sorting).

This is CVE-2018-18955.

Fixes: 6397fac4915a ("userns: bump idmap limits to 340")
Cc: [email protected]
Signed-off-by: Jann Horn <[email protected]>
Tested-by: Eric W. Biederman <[email protected]>
Reviewed-by: Eric W. Biederman <[email protected]>
Signed-off-by: Eric W. Biederman <[email protected]>

ext4: fix buffer leak in __ext4_read_dirblock() on error path

Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.9

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2018-11-07

This series contains fixes to igb, i40e and ice drivers.

Anirudh fixes an issue during rebuild of the ice driver, where we need
to set the carrier state, as well as start or stop the queues all based
on the link status.  Removed functions that were duplicating current
functionality in the VSI rebuild/replay framework.

Dave fixes a potential resource collision during the remove path, so add
a check to see if we are in the middle of a reset.  Fixed the remove
path to ensure we call netif_napi_del() to free vectors before we set
vsi->netdev to NULL.

Akeem fixes an issue when the receive or transmit pause parameter is
set, results in link loss on the interface.  Fixed the spelling of
"Enabling" in error message.

Victor fixes potential memory leak by also freeing the related VSI
contexts in the unload path.

Md Fahad fixes a flag during port VLAN insertion, which was not being
set properly.

Brett fixes a transmit timeout during stress due to the hardware tail
and software tail were incorrectly out of sync.

Miroslav Lichvar fixes the igb PHC timecounter update interval to be
sure the timecounter is updated in time.

Chinh fixes the req_speeds variable to be u16 instead of u8 so that it
can handle all the link speeds.

Jake fixes i40e to add back the missing feature flags, which was causing
IP-in-IP offloads to be reported as not supported.
====================

Signed-off-by: David S. Miller <[email protected]>

drm/amd/amdgpu/dm: Fix dm_dp_create_fake_mst_encoder()

[why]
Removing connector reusage from DM to match the rest of the tree ended
up revealing an issue that was surprisingly subtle. The original amdgpu
code for DC that was submitted appears to have left a chunk in
dm_dp_create_fake_mst_encoder() that tries to find a "master encoder",
the likes of which isn't actually used or stored anywhere. It does so at
the wrong time as well by trying to access parts of the drm_connector
from the encoder init before it's actually been initialized. This
results in a NULL pointer deref on MST hotplugs:

[  160.696613] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  160.697234] PGD 0 P4D 0
[  160.697814] Oops: 0010 [#1] SMP PTI
[  160.698430] CPU: 2 PID: 64 Comm: kworker/2:1 Kdump: loaded Tainted: G           O      4.19.0Lyude-Test+ #2
[  160.699020] Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.22 05/17/2018
[  160.699672] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[  160.700322] RIP: 0010:          (null)
[  160.700920] Code: Bad RIP value.
[  160.701541] RSP: 0018:ffffc9000029fc78 EFLAGS: 00010206
[  160.702183] RAX: 0000000000000000 RBX: ffff8804440ed468 RCX: ffff8804440e9158
[  160.702778] RDX: 0000000000000000 RSI: ffff8804556c5700 RDI: ffff8804440ed000
[  160.703408] RBP: ffff880458e21800 R08: 0000000000000002 R09: 000000005fca0a25
[  160.704002] R10: ffff88045a077a3d R11: ffff88045a077a3c R12: ffff8804440ed000
[  160.704614] R13: ffff880458e21800 R14: ffff8804440e9000 R15: ffff8804440e9000
[  160.705260] FS:  0000000000000000(0000) GS:ffff88045f280000(0000) knlGS:0000000000000000
[  160.705854] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  160.706478] CR2: ffffffffffffffd6 CR3: 000000000200a001 CR4: 00000000003606e0
[  160.707124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  160.707724] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  160.708372] Call Trace:
[  160.708998]  ? dm_dp_add_mst_connector+0xed/0x1d0 [amdgpu]
[  160.709625]  ? drm_dp_add_port+0x2fa/0x470 [drm_kms_helper]
[  160.710284]  ? wake_up_q+0x54/0x70
[  160.710877]  ? __mutex_unlock_slowpath.isra.18+0xb3/0x110
[  160.711512]  ? drm_dp_dpcd_access+0xe7/0x110 [drm_kms_helper]
[  160.712161]  ? drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
[  160.712762]  ? drm_dp_check_and_send_link_address+0xa3/0xd0 [drm_kms_helper]
[  160.713408]  ? drm_dp_mst_link_probe_work+0x4b/0x80 [drm_kms_helper]
[  160.714013]  ? process_one_work+0x1a1/0x3a0
[  160.714667]  ? worker_thread+0x30/0x380
[  160.715326]  ? wq_update_unbound_numa+0x10/0x10
[  160.715939]  ? kthread+0x112/0x130
[  160.716591]  ? kthread_create_worker_on_cpu+0x70/0x70
[  160.717262]  ? ret_from_fork+0x35/0x40
[  160.717886] Modules linked in: amdgpu(O) vfat fat snd_hda_codec_generic joydev i915 chash gpu_sched ttm i2c_algo_bit drm_kms_helper snd_hda_codec_hdmi hp_wmi syscopyarea iTCO_wdt sysfillrect sparse_keymap sysimgblt fb_sys_fops snd_hda_intel usbhid wmi_bmof drm snd_hda_codec btusb snd_hda_core intel_rapl btrtl x86_pkg_temp_thermal btbcm btintel coretemp snd_pcm crc32_pclmul bluetooth psmouse snd_timer snd pcspkr i2c_i801 mei_me i2c_core soundcore mei tpm_tis wmi tpm_tis_core hp_accel ecdh_generic lis3lv02d tpm video rfkill acpi_pad input_polldev hp_wireless pcc_cpufreq crc32c_intel serio_raw tg3 xhci_pci xhci_hcd [last unloaded: amdgpu]
[  160.720141] CR2: 0000000000000000

Somehow the connector reusage DM was using for MST connectors managed to
paper over this issue entirely; hence why this was never caught until
now.

[how]
Since this code isn't used anywhere and seems useless anyway, we can
just drop it entirely. This appears to fix the issue on my HP ZBook with
an AMD WX4150.

Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Drop reusing drm connector for MST

[why]
It is not safe to keep existing connector while entire topology
has been removed. Could lead potential impact to uapi.
Entirely unregister all the connectors on the topology,
and use a new set of connectors when the topology is plugged back
on.

[How]
Remove the drm connector entirely each time when the
corresponding MST topology is gone.
When hotunplug a connector (e.g., DP2)
1. Remove connector from userspace.
2. Drop it's reference.
When hotplug back on:
1. Detect new topology, and create new connectors.
2. Notify userspace with sysfs hotplug event.
3. Reprobe new connectors, and reassign CRTC from old (e.g., DP2)
to new (e.g., DP3) connector.

Signed-off-by: Jerry (Fangzhi) Zuo <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Cleanup MST non-atomic code workaround

[why]
It is not correct to touch aconnector within atomic_check.

[How]
It was added as workaround before, and no longer needed.

Signed-off-by: Jerry (Fangzhi) Zuo <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: always use fast UCLK switching when UCLK DPM enabled

With UCLK DPM enabled, slow switching is not supported any more.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: set a default fclk/gfxclk ratio

Otherwise big gap between these two clocks may causes
some hangs.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

block: Clear kernel memory before copying to user

If the kernel allocates a bounce buffer for user read data, this memory
needs to be cleared before copying it to the user, otherwise it may leak
kernel memory to user space.

Laurence Oberman <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

MAINTAINERS: Fix remaining pointers to obsolete libata.git

libata.git no longer exists. Replace the remaining pointers to it by
pointers to the block tree, which is where all libata development
happens now.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

ubd: fix missing lock around request issue

We need to hold the device lock (and disable interrupts) while
writing new commands, or we could be interrupted while that
is happening and read invalid requests in the completion path.

Fixes: 4e6da0fe8058 ("um: Convert ubd driver to blk-mq")
Tested-by: Richard Weinberger <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Documentation: ABI: led-trigger-pattern: Fix typos

- Spelling s/brigntess/brightness/,
- Double "use".

Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Pavel Machek <[email protected]>
Signed-off-by: Jacek Anaszewski <[email protected]>

leds: trigger: Fix sleeping function called from invalid context

We will meet below issue due to mutex_lock() is called in interrupt context.
The mutex lock is used to protect the pattern trigger data, but before changing
new pattern trigger data (pattern values or repeat value) by users, we always
cancel the timer firstly to clear previous patterns' performance. That means
there is no race in pattern_trig_timer_function(), so we can drop the mutex
lock in pattern_trig_timer_function() to avoid this issue.

Moreover we can move the timer cancelling into mutex protection, since there
is no deadlock risk if we remove the mutex lock in pattern_trig_timer_function().

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:254
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted
4.20.0-rc1-koelsch-00841-ga338c8181013c1a9 #171
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
[<c020f19c>] (unwind_backtrace) from [<c020aecc>] (show_stack+0x10/0x14)
[<c020aecc>] (show_stack) from [<c07affb8>] (dump_stack+0x7c/0x9c)
[<c07affb8>] (dump_stack) from [<c02417d4>] (___might_sleep+0xf4/0x158)
[<c02417d4>] (___might_sleep) from [<c07c92c4>] (mutex_lock+0x18/0x60)
[<c07c92c4>] (mutex_lock) from [<c067b28c>] (pattern_trig_timer_function+0x1c/0x11c)
[<c067b28c>] (pattern_trig_timer_function) from [<c027f6fc>] (call_timer_fn+0x1c/0x90)
[<c027f6fc>] (call_timer_fn) from [<c027f944>] (expire_timers+0x94/0xa4)
[<c027f944>] (expire_timers) from [<c027fc98>] (run_timer_softirq+0x108/0x15c)
[<c027fc98>] (run_timer_softirq) from [<c02021cc>] (__do_softirq+0x1d4/0x258)
[<c02021cc>] (__do_softirq) from [<c0224d24>] (irq_exit+0x64/0xc4)
[<c0224d24>] (irq_exit) from [<c0268dd0>] (__handle_domain_irq+0x80/0xb4)
[<c0268dd0>] (__handle_domain_irq) from [<c045e1b0>] (gic_handle_irq+0x58/0x90)
[<c045e1b0>] (gic_handle_irq) from [<c02019f8>] (__irq_svc+0x58/0x74)
Exception stack(0xeb483f60 to 0xeb483fa8)
3f60: 00000000 00000000 eb9afaa0 c0217e80 00000000 ffffe000 00000000 c0e06408
3f80: 00000002 c0e0647c c0c6a5f0 00000000 c0e04900 eb483fb0 c0207ea8 c0207e98
3fa0: 60020013 ffffffff
[<c02019f8>] (__irq_svc) from [<c0207e98>] (arch_cpu_idle+0x1c/0x38)
[<c0207e98>] (arch_cpu_idle) from [<c0247ca8>] (do_idle+0x138/0x268)
[<c0247ca8>] (do_idle) from [<c0248050>] (cpu_startup_entry+0x18/0x1c)
[<c0248050>] (cpu_startup_entry) from [<402022ec>] (0x402022ec)

Fixes: 5fd752b6b3a2 ("leds: core: Introduce LED pattern trigger")
Signed-off-by: Baolin Wang <[email protected]>
Reported-by: Geert Uytterhoeven <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Jacek Anaszewski <[email protected]>

block: respect virtual boundary mask in bvecs

With drivers that are settting a virtual boundary constrain, we are
seeing a lot of bio splitting and smaller I/Os being submitted to the
driver.

This happens because the bio gap detection code does not account cases
where PAGE_SIZE - 1 is bigger than queue_virt_boundary() and thus will
split the bio unnecessarily.

Cc: Jan Kara <[email protected]>
Cc: Bart Van Assche <[email protected]>
Cc: Ming Lei <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Johannes Thumshirn <[email protected]>
Acked-by: Keith Busch <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Merge tag 'hwmon-for-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- Remove bogus __init annotations in ibmpowernv driver

- Fix double-free in error handling of __hwmon_device_register()

* tag 'hwmon-for-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ibmpowernv) Remove bogus __init annotations
hwmon: (core) Fix double-free in __hwmon_device_register()

Btrfs: fix missing delayed iputs on unmount

There's a race between close_ctree() and cleaner_kthread().
close_ctree() sets btrfs_fs_closing(), and the cleaner stops when it
sees it set, but this is racy; the cleaner might have already checked
the bit and could be cleaning stuff. In particular, if it deletes unused
block groups, it will create delayed iputs for the free space cache
inodes. As of "btrfs: don't run delayed_iputs in commit", we're no
longer running delayed iputs after a commit. Therefore, if the cleaner
creates more delayed iputs after delayed iputs are run in
btrfs_commit_super(), we will leak inodes on unmount and get a busy
inode crash from the VFS.

Fix it by parking the cleaner before we actually close anything. Then,
any remaining delayed iputs will always be handled in
btrfs_commit_super(). This also ensures that the commit in close_ctree()
is really the last commit, so we can get rid of the commit in
cleaner_kthread().

The fstest/generic/475 followed by 476 can trigger a crash that
manifests as a slab corruption caused by accessing the freed kthread
structure by a wake up function. Sample trace:

[ 5657.077612] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
[ 5657.079432] PGD 1c57a067 P4D 1c57a067 PUD da10067 PMD 0
[ 5657.080661] Oops: 0000 [#1] PREEMPT SMP
[ 5657.081592] CPU: 1 PID: 5157 Comm: fsstress Tainted: G        W         4.19.0-rc8-default+ #323
[ 5657.083703] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
[ 5657.086577] RIP: 0010:shrink_page_list+0x2f9/0xe90
[ 5657.091937] RSP: 0018:ffffb5c745c8f728 EFLAGS: 00010287
[ 5657.092953] RAX: 0000000000000074 RBX: ffffb5c745c8f830 RCX: 0000000000000000
[ 5657.094590] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9a8747fdf3d0
[ 5657.095987] RBP: ffffb5c745c8f9e0 R08: 0000000000000000 R09: 0000000000000000
[ 5657.097159] R10: ffff9a8747fdf5e8 R11: 0000000000000000 R12: ffffb5c745c8f788
[ 5657.098513] R13: ffff9a877f6ff2c0 R14: ffff9a877f6ff2c8 R15: dead000000000200
[ 5657.099689] FS:  00007f948d853b80(0000) GS:ffff9a877d600000(0000) knlGS:0000000000000000
[ 5657.101032] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5657.101953] CR2: 00000000000000cc CR3: 00000000684bd000 CR4: 00000000000006e0
[ 5657.103159] Call Trace:
[ 5657.103776]  shrink_inactive_list+0x194/0x410
[ 5657.104671]  shrink_node_memcg.constprop.84+0x39a/0x6a0
[ 5657.105750]  shrink_node+0x62/0x1c0
[ 5657.106529]  try_to_free_pages+0x1a4/0x500
[ 5657.107408]  __alloc_pages_slowpath+0x2c9/0xb20
[ 5657.108418]  __alloc_pages_nodemask+0x268/0x2b0
[ 5657.109348]  kmalloc_large_node+0x37/0x90
[ 5657.110205]  __kmalloc_node+0x236/0x310
[ 5657.111014]  kvmalloc_node+0x3e/0x70

Fixes: 30928e9baac2 ("btrfs: don't run delayed_iputs in commit")
Signed-off-by: Omar Sandoval <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add trace ]
Signed-off-by: David Sterba <[email protected]>

i40e: enable NETIF_F_NTUPLE and NETIF_F_HW_TC at driver load

The assignment of the feature flag NETIF_F_NTUPLE and NETIF_F_HW_TC
occurs prior to the initial setup of the local hw_features variable.

This means the features are set as user-changeable, but are not set in
the currently active feature list. This results in the features being
disabled at the driver's initial load.

Move the assignment after the initial assignment of hw_features, and
assign to the local variable. This ensures that NETIF_F_NTUPLE and
NETIF_F_HW_TC are marked as user-changeable, and also enables them by
default when the driver loads.

Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: restore NETIF_F_GSO_IPXIP[46] to netdev features

Since commit bacd75cfac8a ("i40e/i40evf: Add capability exchange for
outer checksum", 2017-04-06) the i40e driver has not reported support
for IP-in-IP offloads. This likely occurred due to a bad rebase, as the
commit extracts hw_enc_features into its own variable. As part of this
change, it dropped the NETIF_F_FSO_IPXIP flags from the
netdev->hw_enc_features. This was unfortunately not caught during code
review.

Fix this by adding back the missing feature flags.

For reference, NETIF_F_GSO_IPXIP4 was added in commit 7e13318daa4a
("net: define gso types for IPx over IPv4 and IPv6", 2016-05-20),
replacing NETIF_F_GSO_IPIP and NETIF_F_GSO_SIT.

NETIF_F_GSO_IPXIP6 was added in commit bf2d1df39502 ("intel: Add support
for IPv6 IP-in-IP offload", 2016-05-20).

Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Change req_speeds to be u16

Since the req_speeds field in struct ice_link_status is a u8,
req_speeds & ICE_AQ_LINK_SPEED_40GB always returns 0. This was caught
by a coverity scan.

Fix this by changing req_speeds to be u16.

Reported-by: Bruce Allan <[email protected]>
Signed-off-by: Chinh T Cao <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"A few more fixes that have come in, and one revert of a previous fix.

  I was a bit too trigger happy to enable PREEMPT on multi_v7_defconfig,
  and it ended up regressing at least BeagleBone XM boards. While we get
  that debugged for next merge window, let's disable it again.

  Beyond that:

   - Stratix change to fix multicast filtering

   - Minor DT fixes for Renesas and i.MX

   - Ethernet fix for a Renesas board (switching main interfaces)

   - Ethernet phy regulator fix for i.MX6SX"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: dts: stratix10: fix multicast filtering
  ARM: defconfig: Disable PREEMPT again on  multi_v7
  arm64: dts: renesas: condor: switch from EtherAVB to GEther
  dt-bindings: arm: Fix RZ/G2E part number
  arm64: dts: renesas: r8a7795: add missing dma-names on hscif2
  ARM: dts: imx6sx-sdb: Fix enet phy regulator
  ARM: dts: fsl: Fix improperly quoted stdout-path values
  ARM: dts: imx6sll: fix typo for fsl,imx6sll-i2c node

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- hid.git is moving towards group maintainership (where group is myself
   and Benjamin Tissoires), therefore this pull request updates
   MAINTAINERS accordingly

- fix for hid-asus config dependency from Arnd Bergmann

- two device-specific quirks for i2c-hid from Julian Sax and Kai-Heng
   Feng

- other few small assorted fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: fix up .raw_event() documentation
  HID: asus: fix build warning wiht CONFIG_ASUS_WMI disabled
  HID: i2c-hid: add Direkt-Tek DTLAPY133-1 to descriptor override
  HID: moving to group maintainership model
  HID: alps: allow incoming reports when only the trackstick is opened
  Revert "HID: add NOGET quirk for Eaton Ellipse MAX UPS"
  HID: i2c-hid: Add a small delay after sleep command for Raydium touchpanel
  HID: hiddev: fix potential Spectre v1

ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error path

Fixes: de05ca852679 ("ext4: move call to ext4_error() into ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.17

ext4: fix buffer leak in ext4_xattr_move_to_block() on error path

Fixes: 3f2571c1f91f ("ext4: factor out xattr moving")
Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...")
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.23

Merge tag 'stratix10_dts_fix_for_v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into fixes

ARM: dts: stratix10: fix multicast filtering

On Stratix 10, the EMAC has 256 hash buckets for multicast filtering. This
needs to be specified in DTS, otherwise the stmmac driver defaults to 64
buckets and initializes the filter incorrectly. As a result, e.g. valid
IPv6 multicast traffic ends up being dropped.

* tag 'stratix10_dts_fix_for_v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
arm64: dts: stratix10: fix multicast filtering

Signed-off-by: Olof Johansson <[email protected]>

ext4: release bs.bh before re-using in ext4_xattr_block_find()

bs.bh was taken in previous ext4_xattr_block_find() call,
it should be released before re-using

Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.26

ext4: fix buffer leak in ext4_xattr_get_block() on error path

Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.13

Merge tag 'renesas-fixes-for-v4.20' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Renesas ARM Based SoC Fixes for v4.20

* R-Car V3H (r8a77980) based Condor board
  - Switch from EtherAVB to GEther to match offical boards

* RZ/G2E (ra8774c0) SoC: correct documentation of part number

* R-Car H3 (r8a7795) SoC: reinstate all DMA channels on HSCIF2

* tag 'renesas-fixes-for-v4.20' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  arm64: dts: renesas: condor: switch from EtherAVB to GEther
  dt-bindings: arm: Fix RZ/G2E part number
  arm64: dts: renesas: r8a7795: add missing dma-names on hscif2

Signed-off-by: Olof Johansson <[email protected]>

ext4: fix possible leak of s_journal_flag_rwsem in error path

Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.7

resource/docs: Complete kernel-doc style function documentation

Add the missing kernel-doc style function parameters documentation.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: b69c2e20f6e4 ("resource: Clean it up a bit")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

ext4: fix possible leak of sbi->s_group_desc_leak in error path

Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock")
Reported-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.18

Merge tag 'gvt-fixes-2018-11-07' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2018-11-07

- Fix invalidate of old ggtt entry (Hang)
- Fix partial ggtt entry update in any order (Hang)
- Fix one mask setting for chicken reg (Xinyun)
- Fix eDP warning in guest (Longhe)

Signed-off-by: Joonas Lahtinen <[email protected]>
From: Zhenyu Wang <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'exynos-drm-fixes-for-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

Three regressions
- Revert frame counter support
  . This patch fixes a issue which doesn't work extension and clone
    mode because some CRTC devices don't provide frame counter value
    properly.
- Fix lack of fbdev on Rinato and trats boards.
  . This patch considers for connector to be registered by DSI after
    DRM device is registered, and also it makes fbdev initializaion
    to be done even if no connector at the moment.
- Check for dsi->panel object correctly
  . This patch fixes checking for dsi->panel. of_drm_find_panel
    function returns panel object or error value so error value
    should be checked using IS_ERR macro.

Signed-off-by: Dave Airlie <[email protected]>
From: Inki Dae <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes

Single etnaviv fence fix for GPU recovery.

Signed-off-by: Dave Airlie <[email protected]>
From: Lucas Stach <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

ext4: remove unneeded brelse call in ext4_xattr_inode_update_ref()

Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: avoid possible double brelse() in add_new_gdb() on error path

Fixes: b40971426a83 ("ext4: add error checking to calls to ...")
Reported-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.38

ext4: avoid buffer leak in ext4_orphan_add() after prior errors

Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock")
Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link")
Cc: Dmitry Monakhov <[email protected]>
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.34

ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty()

ext4_mark_iloc_dirty() callers expect that it releases iloc->bh
even if it returns an error.

Fixes: 0db1ff222d40 ("ext4: add shutdown bit and check for it")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.11

drm/amdgpu/display/dce11: only enable FBC when selected

Causes a black screen on a Stoney laptop.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=108577
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/display/dm: handle FBC dc feature parameter

Set the dc_config properly when the option is enabled.

Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/display/dc: add FBC to dc_config

Add FBC to the list of features that can be enabled from the DM.

Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add DC feature mask module parameter

Similar to ppfeaturemask. Allows you to selectively enable/disable
DC features.

Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/display: check if fbc is available in set_static_screen_control (v2)

The value is dependent on whether fbc is available.

v2: only check if num_pipes is valid

Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/vega20: add CLK base offset

In case we need to access CLK registers.

Acked-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Stop leaking planes

[Why]
drm_plane_cleanup does not free the plane.

[How]
Call drm_primary_helper_destroy which will also free the plane.

Signed-off-by: Harry Wentland <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix misleading buffer information

RETIMER_REDRIVER_INFO shows the buffer as a decimal value with a '0x'
prefix, which is somewhat misleading.

Fix it to print hexadecimal, as was intended.

Fixes: 2f14bc89("drm/amd/display: add retimer log for HWQ tuning use.")
Cc: Charlene Liu <[email protected]>
Cc: Dmytro Laktyushkin <[email protected]>
Signed-off-by: Shaokun Zhang <[email protected]>
Reviewed-by: Leo Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Revert "drm/amd/display: set backlight level limit to 1"

This reverts commit 0cafc82fae41531b0162150f9a97f2c74f97118f.

This breaks some apps that assume 0 is minimum brightness.

Revert for 4.20. This is fixed properly for drm-next/4.21 in:
"drm/amd: Don't fail on backlight = 0"
However, that patch depends on more extensive changes to the
backlight interface which are too invasive for -fixes.

Fixes: Bugzilla: https://bugs.freedesktop.org/108668
Signed-off-by: Alex Deucher <[email protected]>

ext4: fix possible inode leak in the retry loop of ext4_resize_fs()

Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.7

ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing

Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.7

watchdog/core: Add missing prototypes for weak functions

The split out of the hard lockup detector exposed two new weak functions,
but no prototypes for them, which triggers the build warning:

kernel/watchdog.c:109:12: warning: no previous prototype for ‘watchdog_nmi_enable’ [-Wmissing-prototypes]
kernel/watchdog.c:115:13: warning: no previous prototype for ‘watchdog_nmi_disable’ [-Wmissing-prototypes]

Add the prototypes.

Fixes: 73ce0511c436 ("kernel/watchdog.c: move hardlockup detector to separate file")
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Babu Moger <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

igb: shorten maximum PHC timecounter update interval

The timecounter needs to be updated at least once per ~550 seconds in
order to avoid a 40-bit SYSTIM timestamp to be misinterpreted as an old
timestamp.

Since commit 500462a9de65 ("timers: Switch to a non-cascading wheel"),
scheduling of delayed work seems to be less accurate and a requested
delay of 540 seconds may actually be longer than 550 seconds. Also, the
PHC may be adjusted to run up to 6% faster than real time and the system
clock up to 10% slower. Shorten the delay to 360 seconds to be sure the
timecounter is updated in time.

This fixes an issue with HW timestamps on 82580/I350/I354 being off by
~1100 seconds for few seconds every ~9 minutes.

Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Miroslav Lichvar <[email protected]>
Acked-by: Jacob Keller <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix the bytecount sent to netdev_tx_sent_queue

Currently if the driver does a TSO offload the bytecount sent to
netdev_tx_sent_queue will be incorrect. This is because in ice_tso we
overwrite the initial value that we set in ice_tx_map. This creates a
mismatch between the Tx and Tx clean flow. In the Tx clean flow we
calculate the bytecount (called total_bytes) as we clean the
descriptors so the value used in the Tx clean path is correct. Fix this
by using += in ice_tso instead of =. This fixes the mismatch in
bytecount mentioned above.

Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix tx_timeout in PF driver

Prior to this commit the driver was running into tx_timeouts when a
queue was stressed enough. This was happening because the HW tail
and SW tail (NTU) were incorrectly out of sync. Consequently this was
causing the HW head to collide with the HW tail, which to the hardware
means that all descriptors posted for Tx have been processed.

Due to the Tx logic used in the driver SW tail and HW tail are allowed
to be out of sync. This is done as an optimization because it allows the
driver to write HW tail as infrequently as possible, while still
updating the SW tail index to keep track. However, there are situations
where this results in the tail never getting updated, resulting in Tx
timeouts.

Tx HW tail write condition:
if (netif_xmit_stopped(txring_txq(tx_ring) || !skb->xmit_more)
writel(sw_tail, tx_ring->tail);

An issue was found in the Tx logic that was causing the afore mentioned
condition for updating HW tail to never happen, causing tx_timeouts.

In ice_xmit_frame_ring we calculate how many descriptors we need for the
Tx transaction based on the skb the kernel hands us. This is then passed
into ice_maybe_stop_tx along with some extra padding to determine if we
have enough descriptors available for this transaction. If we don't then
we return -EBUSY to the stack, otherwise we move on and eventually
prepare the Tx descriptors accordingly in ice_tx_map and set
next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx
with a value of MAX_SKB_FRAGS + 4. The key here is that this value is
possibly less than the value we sent in the first call to
ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused
descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first
call to ice_maybe_stop_tx in ice_xmit_frame_ring then we do not update
the HW tail because of the "Tx HW tail write condition" above. This is
because in ice_maybe_stop_tx we return success from ice_maybe_stop_tx
instead of calling __ice_maybe_stop_tx and subsequently calling
netif_stop_subqueue, which sets the __QUEUE_STATE_DEV_XOFF bit. This
bit is then checked in the "Tx HW tail write condition" by calling
netif_xmit_stopped and subsequently updating HW tail if the
afore mentioned bit is set.

In ice_clean_tx_irq, if next_to_watch is not NULL, we end up cleaning
the descriptors that HW sets the DD bit on and we have the budget. The
HW head will eventually run into the HW tail in response to the
description in the paragraph above.

The next time through ice_xmit_frame_ring we make the initial call to
ice_maybe_stop_tx with another skb from the stack. This time we do not
have enough descriptors available and we return NETDEV_TX_BUSY to the
stack and end up setting next_to_watch to NULL.

This is where we are stuck. In ice_clean_tx_irq we never clean anything
because next_to_watch is always NULL and in ice_xmit_frame_ring we never
update HW tail because we already return NETDEV_TX_BUSY to the stack and
eventually we hit a tx_timeout.

This issue was fixed by making sure that the second call to
ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value
that was used on the initial call to ice_maybe_stop_tx in
ice_xmit_frame_ring. This was done by adding the following defines to
make the logic more clear and to reduce the chance of mucking this up
again:

ICE_CACHE_LINE_BYTES 64
ICE_DESCS_PER_CACHE_LINE (ICE_CACHE_LINE_BYTES / \
sizeof(struct ice_tx_desc))
ICE_DESCS_FOR_CTX_DESC 1
ICE_DESCS_FOR_SKB_DATA_PTR 1

The ICE_CACHE_LINE_BYTES being 64 is an assumption being made so we
don't have to figure this out on every pass through the Tx path. Instead
I added a sanity check in ice_probe to verify cache line size and print
a message if it's not 64 Bytes. This will make it easier to file issues
if they are seen when the cache line size is not 64 Bytes when reading
from the GLPCI_CNF2 register.

Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix napi delete calls for remove

In the remove path, the vsi->netdev is being set to NULL before the call
to free vectors. This is causing the netif_napi_del call to never be made.

Add a call to ice_napi_del to the same location as the calls to
unregister_netdev and just prior to them. This will use the reverse flow
as the register and netif_napi_add calls.

Signed-off-by: Dave Ertman <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix typo in error message

Print should say "Enabling" instead of "Enaabling"

Signed-off-by: Akeem G Abodunrin <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix flags for port VLAN

According to the spec, whenever insert PVID field is set, the VLAN
driver insertion mode should be set to 01b which isn't done currently.
Fix it.

Signed-off-by: Md Fahad Iqbal Polash <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Remove duplicate addition of VLANs in replay path

ice_restore_vlan and active_vlans were originally put in place to
reprogram VLAN filters in the replay path. This is now done as part
of the much broader VSI rebuild/replay framework. So remove both
ice_restore_vlan and active_vlans

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Free VSI contexts during for unload

In the unload path, all VSIs are freed. Also free the related VSI
contexts to prevent memory leaks.

Signed-off-by: Victor Raj <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix dead device link issue with flow control

Setting Rx or Tx pause parameter currently results in link loss on the
interface, requiring the platform/host to be cold power cycled. Fix it.

Signed-off-by: Akeem G Abodunrin <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Check for reset in progress during remove

The remove path does not currently check to see if a
reset is in progress before proceeding. This can cause
a resource collision resulting in various types of errors.

Check for reset in progress and wait for a reasonable
amount of time before allowing the remove to progress.

Signed-off-by: Dave Ertman <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Set carrier state and start/stop queues in rebuild

Set the carrier state post rebuild by querying the link status. Also
start/stop queues based on link status.

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

drm/amd: Update atom_smu_info_v3_3 structure

Mainly adding the WAFL spread spectrum info, for adjusting display
clocks when XGMI is enabled.

Signed-off-by: Leo Li <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

x86/vsmp: Remove dependency on pv_irq_ops

vSMP dependency on pv_irq_ops has been removed some years ago, but the code
still deals with pv_irq_ops.

In short, "cap & ctl & (1 << 4)" is always returning 0, so all
PARAVIRT/PARAVIRT_XXL code related to that can be removed.

However, the rest of the code depends on CONFIG_PCI, so fix it accordingly.

Rename set_vsmp_pv_ops to set_vsmp_ctl as the original name does not make
sense anymore.

Signed-off-by: Eial Czerwacki <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Shai Fultheim <[email protected]>
Cc: Juergen Gross <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

x86/ldt: Remove unused variable in map_ldt_struct()

Splitting out the sanity check in map_ldt_struct() moved page table syncing
into a separate function, which made the pgd variable unused. Remove it.

[ tglx: Massaged changelog ]

Fixes: 9bae3197e15d ("x86/ldt: Split out sanity check in map_ldt_struct()")
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

x86/ldt: Unmap PTEs for the slot before freeing LDT pages

modify_ldt(2) leaves the old LDT mapped after switching over to the new
one. The old LDT gets freed and the pages can be re-used.

Leaving the mapping in place can have security implications. The mapping is
present in the userspace page tables and Meltdown-like attacks can read
these freed and possibly reused pages.

It's relatively simple to fix: unmap the old LDT and flush TLB before
freeing the old LDT memory.

This further allows to avoid flushing the TLB in map_ldt_struct() as the
slot is unmapped and flushed by unmap_ldt_struct() or has never been mapped
at all.

[ tglx: Massaged changelog and removed the needless line breaks ]

Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

x86/mm: Move LDT remap out of KASLR region on 5-level paging

On 5-level paging the LDT remap area is placed in the middle of the KASLR
randomization region and it can overlap with the direct mapping, the
vmalloc or the vmap area.

The LDT mapping is per mm, so it cannot be moved into the P4D page table
next to the CPU_ENTRY_AREA without complicating PGD table allocation for
5-level paging.

The 4 PGD slot gap just before the direct mapping is reserved for
hypervisors, so it cannot be used.

Move the direct mapping one slot deeper and use the resulting gap for the
LDT remap area. The resulting layout is the same for 4 and 5 level paging.

[ tglx: Massaged changelog ]

Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

net: phy: Allow BCM54616S PHY to setup internal TX/RX clock delay

This patch allows users to enable/disable internal TX and/or RX clock
delay for BCM54616S PHYs so as to satisfy RGMII timing specifications.

On a particular platform, whether TX and/or RX clock delay is required
depends on how PHY connected to the MAC IP. This requirement can be
specified through "phy-mode" property in the platform device tree.

The patch is inspired by commit 733336262b28 ("net: phy: Allow BCM5481x
PHYs to setup internal TX/RX clock delay").

Signed-off-by: Tao Ren <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'perf-urgent-for-mingo-4.20-20181106' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent improvements and fixes from Arnaldo Carvalho de Melo:

Intel PT SQL viewer: (Adrian Hunter)

- Fall back to /usr/local/lib/libxed.so
- Add Selected branches report
- Add help window
- Fix table find when table re-ordered

Intel PT debug log (Adrian Hunter)

- Add more event information
- Add MTC and CYC timestamps

perf record: (Andi Kleen)

- Support weak groups, just like with 'perf stat'

perf trace: (Arnaldo Carvalho de Melo)

- Start augmenting raw_syscalls:{sys_enter,sys_exit}: goal is to have a
  generic, arch independent eBPF kernel component that is programmed with
  syscall table details, what to copy, how many bytes, pid, arg filters from the
  userspace via eBPF maps by the 'perf trace' tool that continues to use all its
  argument beautifiers, just taking advantage of the extra pointer contents.

JVMTI: (Gustavo Romero)

- Fix undefined symbol scnprintf in libperf-jvmti.so

perf top: (Jin Yao)

- Display the LBR stats in callchain entries

perf stat: (Thomas Richter)

- Handle different PMU names with common prefix

arm64: Will (Deacon)

- Fix arm64 tools build failure wrt smp_load_{acquire,release}.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

acpi/nfit, x86/mce: Validate a MCE's address before using it

The NFIT machine check handler uses the physical address from the mce
structure, and compares it against information in the ACPI NFIT table
to determine whether that location lies on an NVDIMM. The mce->addr
field however may not always be valid, and this is indicated by the
MCI_STATUS_ADDRV bit in the status field.

Export mce_usable_address() which already performs validation for the
address, and use it in the NFIT handler.

Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: Robert Elliott <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
CC: Arnd Bergmann <[email protected]>
Cc: Dan Williams <[email protected]>
CC: Dave Jiang <[email protected]>
CC: [email protected]
CC: "H. Peter Anvin" <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Len Brown <[email protected]>
CC: [email protected]
CC: linux-edac <[email protected]>
CC: [email protected]
CC: Qiuxu Zhuo <[email protected]>
CC: "Rafael J. Wysocki" <[email protected]>
CC: Ross Zwisler <[email protected]>
CC: stable <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Tony Luck <[email protected]>
CC: x86-ml <[email protected]>
CC: Yazen Ghannam <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

acpi/nfit, x86/mce: Handle only uncorrectable machine checks

The MCE handler for nfit devices is called for memory errors on a
Non-Volatile DIMM and adds the error location to a 'badblocks' list.
This list is used by the various NVDIMM drivers to avoid consuming known
poison locations during IO.

The MCE handler gets called for both corrected and uncorrectable errors.
Until now, both kinds of errors have been added to the badblocks list.
However, corrected memory errors indicate that the problem has already
been fixed by hardware, and the resulting interrupt is merely a
notification to Linux.

As far as future accesses to that location are concerned, it is
perfectly fine to use, and thus doesn't need to be included in the above
badblocks list.

Add a check in the nfit MCE handler to filter out corrected mce events,
and only process uncorrectable errors.

Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: Omar Avelar <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
CC: Arnd Bergmann <[email protected]>
CC: Dan Williams <[email protected]>
CC: Dave Jiang <[email protected]>
CC: [email protected]
CC: "H. Peter Anvin" <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Len Brown <[email protected]>
CC: [email protected]
CC: linux-edac <[email protected]>
CC: [email protected]
CC: Qiuxu Zhuo <[email protected]>
CC: "Rafael J. Wysocki" <[email protected]>
CC: Ross Zwisler <[email protected]>
CC: stable <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Tony Luck <[email protected]>
CC: x86-ml <[email protected]>
CC: Yazen Ghannam <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

lib/raid6: Fix arm64 test build

The lib/raid6/test fails to build the neon objects
on arm64 because the correct machine type is 'aarch64'.

Once this is correctly enabled, the neon recovery objects
need to be added to the build.

Reviewed-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Jeremy Linton <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

mtd: nand: Fix nanddev_pos_next_page() kernel-doc header

Function name is wrong in the kernel-doc header.

Fixes: 9c3736a3de21 ("mtd: nand: Add core infrastructure to deal with NAND devices")
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Miquel Raynal <[email protected]>

clk: fixed-factor: fix of_node_get-put imbalance

When the fixed factor clock is created by devicetree,
of_clk_add_provider is called. Add a call to
of_clk_del_provider in the remove function to balance
it out.

Reported-by: Alan Tull <[email protected]>
Fixes: 971451b3b15d ("clk: fixed-factor: Convert into a module platform driver")
Signed-off-by: Ricardo Ribalda Delgado <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>

drm/i915: Fix ilk+ watermarks when disabling pipes

We're no longer programming any watermarks when we're disabling
a pipe. That means ilk_wm_merge() & co. will keep considering
the any pipe that is getting disabled as still enabled. Thus we
either get no LP1+ watermakrs (ilk-ivb), or we get suboptimal
ones (hsw-bdw).

This seems to have been broken by commit b6b178a77210 ("drm/i915:
Calculate ironlake intermediate watermarks correctly, v2."). Before
that we apparently had some difference between the intermediate
and optimal watermarks and so we would program the optiomal ones.
Now intermediate and optimal are identical for disabled pipes
and so we don't program either.

Fix this by programming the intermediate watermarks even for
disabled pipes. We were already doing that for skl+. We'll
leave out gmch platforms for now since those do the merging
in a different manner and should work as is. We'll want to
unify this eventually, but play it safe for now and just put
in a FIXME.

Cc: [email protected]
Cc: Matt Roper <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Fixes: b6b178a77210 ("drm/i915: Calculate ironlake intermediate watermarks correctly, v2.")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Maarten Lankhorst <[email protected]> #irc
(cherry picked from commit a748faea3bfd7fd1d1485bc1c426c7d460cc6503)
Signed-off-by: Joonas Lahtinen <[email protected]>

Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"Masami found a slight bug in his code where he transposed the
  arguments of a call to strpbrk.

  The reason this wasn't detected in our tests is that the only way this
  would transpire is when a kprobe event with a symbol offset is
  attached to a function that belongs to a module that isn't loaded yet.
  When the kprobe trace event is added, the offset would be truncated
  after it was parsed, and when the module is loaded, it would use the
  symbol without the offset (as the nul character added by the parsing
  would not be replaced with the original character)"

* tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/kprobes: Fix strpbrk() argument order

Merge branch 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fix from Russell King:
"Ard spotted a typo in one of the assembly files which leads to a
kernel oops when that code path is executed. Fix this"

* 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8809/1: proc-v7: fix Thumb annotation of cpu_v7_hvc_switch_mm

drm/sun4i: tcon: prevent tcon->panel dereference if NULL

If tcon->panel pointer is NULL, trying to dereference from it
(i.e. tcon->panel->connector) will cause a null pointer dereference.

Add tcon->panel null pointer check before calling
sun4i_tcon0_mode_set_dithering().

Signed-off-by: Giulio Benetti <[email protected]>
Fixes: f11adcecbd5f ("drm/sun4i: tcon: Add dithering support for
RGB565/RGB666 LCD panels")
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/sun4i: tcon: fix check of tcon->panel null pointer

Since tcon->panel is a pointer returned by of_drm_find_panel() need to
check if it is not NULL, hence a valid pointer.
IS_ERR() instead checks return error values, not NULL pointers.

Substitute "if (!IS_ERR(tcon->panel))" with "if (tcon->panel)".

Signed-off-by: Giulio Benetti <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

xfs: fix overflow in xfs_attr3_leaf_verify

generic/070 on 64k block size filesystems is failing with a verifier
corruption on writeback or an attribute leaf block:

[   94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480
[   94.975623] XFS (pmem0): Unmount and run xfs_repair
[   94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer:
[   94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
[   94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00  ................
[   94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6  "{\.-\GL.1.7....
[   94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00  ................
[   94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00  .......h........
[   94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00  .-2.. ...-Xe....
[   94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00  .-[f.....-q5.|..
[   94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00  .-r7.....-......
[   94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3

This is failing this check:

                end = ichdr.freemap[i].base + ichdr.freemap[i].size;
                if (end < ichdr.freemap[i].base)
>>>>>                   return __this_address;
                if (end > mp->m_attr_geo->blksize)
                        return __this_address;

And from the buffer output above, the freemap array is:

freemap[0].base = 0x00a0
freemap[0].size = 0xdcf4 end = 0xdd94
freemap[1].base = 0xfe98
freemap[1].size = 0x0168 end = 0x10000
freemap[2].base = 0xf0d8
freemap[2].size = 0x07e0 end = 0xf8b8

These all look valid - the block size is 0x10000 and so from the
last check in the above verifier fragment we know that the end
of freemap[1] is valid. The problem is that end is declared as:

uint16_t end;

And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a
corruption. Fix the verifier to use uint32_t types for the check and
hence avoid the overflow.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: print buffer offsets when dumping corrupt buffers

Use DUMP_PREFIX_OFFSET when printing hex dumps of corrupt buffers
because modern Linux now prints a 32-bit hash of our 64-bit pointer when
using DUMP_PREFIX_ADDRESS:

00000000b4bb4297: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
00000005ec77e26: 00 00 00 00 02 d0 5a 00 00 00 00 00 00 00 00 00  ......Z.........
000000015938018: 21 98 e8 b4 fd de 4c 07 bc ea 3c e5 ae b4 7c 48  !.....L...<...|H

This is totally worthless for a sequential dump since we probably only
care about tracking the buffer offsets and afaik there's no way to
recover the actual pointer from the hashed value.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

xfs: Fix error code in 'xfs_ioc_getbmap()'

In this function, once 'buf' has been allocated, we unconditionally
return 0.
However, 'error' is set to some error codes in several error handling
paths.
Before commit 232b51948b99 ("xfs: simplify the xfs_getbmap interface")
this was not an issue because all error paths were returning directly,
but now that some cleanup at the end may be needed, we must propagate the
error code.

Fixes: 232b51948b99 ("xfs: simplify the xfs_getbmap interface")
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>