Chenbo Feng [Tue, 20 Mar 2018 00:57:27 +0000 (17:57 -0700)]
bpf: skip unnecessary capability check
The current check statement in BPF syscall will do a capability check
for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
code path will trigger unnecessary security hooks on capability checking
and cause false alarms on unprivileged process trying to get CAP_SYS_ADMIN
access. This can be resolved by simply switch the order of the statement
and CAP_SYS_ADMIN is not required anyway if unprivileged bpf syscall is
allowed.
Yonghong Song [Tue, 20 Mar 2018 18:19:17 +0000 (11:19 -0700)]
trace/bpf: remove helper bpf_perf_prog_read_value from tracepoint type programs
Commit 4bebdc7a85aa ("bpf: add helper bpf_perf_prog_read_value")
added helper bpf_perf_prog_read_value so that perf_event type program
can read event counter and enabled/running time.
This commit, however, introduced a bug which allows this helper
for tracepoint type programs. This is incorrect as bpf_perf_prog_read_value
needs to access perf_event through its bpf_perf_event_data_kern type context,
which is not available for tracepoint type program.
This patch fixed the issue by separating bpf_func_proto between tracepoint
and perf_event type programs and removed bpf_perf_prog_read_value
from tracepoint func prototype.
test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other arches
Function bpf_fill_maxinsns11 is designed to not be able to be JITed on
x86_64. So, it fails when CONFIG_BPF_JIT_ALWAYS_ON=y, and
commit 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when
CONFIG_BPF_JIT_ALWAYS_ON=y") makes sure that failure is detected on that
case.
However, it does not fail on other architectures, which have a different
JIT compiler design. So, test_bpf has started to fail to load on those.
After this fix, test_bpf loads fine on both x86_64 and ppc64el.
Fixes: 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y") Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Reviewed-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Linus Torvalds [Tue, 20 Mar 2018 19:16:59 +0000 (12:16 -0700)]
kvm/x86: fix icebp instruction handling
The undocumented 'icebp' instruction (aka 'int1') works pretty much like
'int3' in the absense of in-circuit probing equipment (except,
obviously, that it raises #DB instead of raising #BP), and is used by
some validation test-suites as such.
But Andy Lutomirski noticed that his test suite acted differently in kvm
than on bare hardware.
The reason is that kvm used an inexact test for the icebp instruction:
it just assumed that an all-zero VM exit qualification value meant that
the VM exit was due to icebp.
That is not unlike the guess that do_debug() does for the actual
exception handling case, but it's purely a heuristic, not an absolute
rule. do_debug() does it because it wants to ascribe _some_ reasons to
the #DB that happened, and an empty %dr6 value means that 'icebp' is the
most likely casue and we have no better information.
But kvm can just do it right, because unlike the do_debug() case, kvm
actually sees the real reason for the #DB in the VM-exit interruption
information field.
So instead of relying on an inexact heuristic, just use the actual VM
exit information that says "it was 'icebp'".
Right now the 'icebp' instruction isn't technically documented by Intel,
but that will hopefully change. The special "privileged software
exception" information _is_ actually mentioned in the Intel SDM, even
though the cause of it isn't enumerated.
Leon Romanovsky [Tue, 20 Mar 2018 15:05:13 +0000 (17:05 +0200)]
RDMA/ucma: Ensure that CM_ID exists prior to access it
Prior to access UCMA commands, the context should be initialized
and connected to CM_ID with ucma_create_id(). In case user skips
this step, he can provide non-valid ctx without CM_ID and cause
to multiple NULL dereferences.
Also there are situations where the create_id can be raced with
other user access, ensure that the context is only shared to
other threads once it is fully initialized to avoid the races.
Stefano Brivio [Mon, 19 Mar 2018 10:24:58 +0000 (11:24 +0100)]
ipv6: old_dport should be a __be16 in __ip6_datagram_connect()
Fixes: 2f987a76a977 ("net: ipv6: keep sk status consistent after datagram connect failure") Signed-off-by: Stefano Brivio <[email protected]> Acked-by: Paolo Abeni <[email protected]> Acked-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Tue, 20 Mar 2018 16:42:36 +0000 (12:42 -0400)]
Merge tag 'linux-can-fixes-for-4.16-20180319' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2018-03-19
this is a pull reqeust of one patch for net/master.
The patch is by Andri Yngvason and fixes a potential use-after-free bug
in the cc770 driver introduced in the previous pull-request.
====================
net: ethernet: arc: Fix a potential memory leak if an optional regulator is deferred
If the optional regulator is deferred, we must release some resources.
They will be re-allocated when the probe function will be called again.
Fixes: 6eacf31139bf ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: David S. Miller <[email protected]>
The current code performs unneeded free. Remove the redundant skb freeing
during the error path.
Fixes: 1555d204e743 ("devlink: Support for pipeline debug (dpipe)") Signed-off-by: Arkadi Sharshevsky <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
There were only a few Pentium Pro multiprocessors systems where this
errata applied. They are more than 20 years old now, and we've slowly
dropped places which put the workarounds in and discouraged anyone
from enabling the workaround.
At the same time, some syslog utilities (like rsyslog by default) don't
like the additional newlines control characters, saving lines like this
to /var/log/messages:
Mar 16 16:02:29 localhost kernel: #012runnable tasks:#012 S task PID tree-key ...
^^^^ ^^^^
Clean these up by moving newline characters to their own SEQ_printf
invocation. This leaves the /proc/sched_debug unchanged, but brings the
entire output into alignment when prefixed:
Joe Lawrence [Mon, 19 Mar 2018 18:35:54 +0000 (14:35 -0400)]
sched/debug: Fix per-task line continuation for console output
When the SEQ_printf() macro prints to the console, it runs a simple
printk() without KERN_CONT "continued" line printing. The result of
this is oddly wrapped task info, for example:
Kan Liang [Tue, 13 Mar 2018 18:51:34 +0000 (11:51 -0700)]
perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers
The number of CHAs is miscalculated on multi-domain PCI Skylake server systems,
resulting in an uncore driver initialization error.
Gary Kroening explains:
"For systems with a single PCI segment, it is sufficient to look for the
bus number to change in order to determine that all of the CHa's have
been counted for a single socket.
However, for multi PCI segment systems, each socket is given a new
segment and the bus number does NOT change. So looking only for the
bus number to change ends up counting all of the CHa's on all sockets
in the system. This leads to writing CPU MSRs beyond a valid range and
causes an error in ivbep_uncore_msr_init_box()."
To fix this bug, query the number of CHAs from the CAPID6 register:
it should read bits 27:0 in the CAPID6 register located at
Device 30, Function 3, Offset 0x9C. These 28 bits form a bit vector
of available LLC slices and the CHAs that manage those slices.
Josh Poimboeuf [Mon, 19 Mar 2018 18:18:57 +0000 (13:18 -0500)]
jump_label: Disable jump labels in __exit code
With the following commit:
333522447063 ("jump_label: Explicitly disable jump labels in __init code")
... we explicitly disabled jump labels in __init code, so they could be
detected and not warned about in the following commit:
dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt")
In-kernel __exit code has the same issue. It's never used, so it's
freed along with the rest of initmem. But jump label entries in __exit
code aren't explicitly disabled, so we get the following warning when
enabling pr_debug() in __exit code:
can't patch jump_label at dmi_sysfs_exit+0x0/0x2d
WARNING: CPU: 0 PID: 22572 at kernel/jump_label.c:376 __jump_label_update+0x9d/0xb0
Fix the warning by disabling all jump labels in initmem (which includes
both __init and __exit code).
Dan Carpenter [Sat, 17 Mar 2018 11:52:16 +0000 (14:52 +0300)]
perf/x86/intel: Don't accidentally clear high bits in bdw_limit_period()
We intended to clear the lowest 6 bits but because of a type bug we
clear the high 32 bits as well. Andi says that periods are rarely more
than U32_MAX so this bug probably doesn't have a huge runtime impact.
When the PEBS interrupt threshold is larger than one, there is no way
to get exact auto-reload times and value for userspace RDPMC. Disable
the userspace RDPMC usage when large PEBS is enabled.
The only exception is when the PEBS interrupt threshold is 1, in which
case user-space RDPMC works well even with auto-reload events.
Matthew Wilcox [Thu, 15 Mar 2018 11:58:12 +0000 (04:58 -0700)]
locking/mutex: Improve documentation
On Wed, Mar 14, 2018 at 01:56:31PM -0700, Andrew Morton wrote:
> My memory is weak and our documentation is awful. What does
> mutex_lock_killable() actually do and how does it differ from
> mutex_lock_interruptible()?
Add kernel-doc for mutex_lock_killable() and mutex_lock_io(). Reword the
kernel-doc for mutex_lock_interruptible().
H.J. Lu [Mon, 19 Mar 2018 20:57:46 +0000 (13:57 -0700)]
x86/build/64: Force the linker to use 2MB page size
Binutils 2.31 will enable -z separate-code by default for x86 to avoid
mixing code pages with data to improve cache performance as well as
security. To reduce x86-64 executable and shared object sizes, the
maximum page size is reduced from 2MB to 4KB. But x86-64 kernel must
be aligned to 2MB. Pass -z max-page-size=0x200000 to linker to force
2MB page size regardless of the default page size used by linker.
David S. Miller [Tue, 20 Mar 2018 01:14:27 +0000 (21:14 -0400)]
Merge branch 'phy-relax-error-checking'
Grygorii Strashko says:
====================
net: phy: relax error checking when creating sysfs link netdev->phydev
Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY.
As result, second CPSW external port will became unusable.
This regression was introduced by commits: 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev" a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"
Patch 1: exports sysfs_create_link_nowarn() function as preparation for Patch 2.
Patch 2: relaxes error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppresses warning by using
sysfs_create_link_nowarn() and adds error message instead, so links creation
failure is not fatal any more and system can continue working,
which fixes TI CPSW issue and makes boot logs accessible
in case of NFS boot, for example.
net: phy: relax error checking when creating sysfs link netdev->phydev
Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY. As result, second CPSW external
port will became unusable.
Fix it by relaxing error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppressing warning by using
sysfs_create_link_nowarn() and adding error message instead.
After this change links (phy->netdev and netdev->phy) creation failure is not
fatal any more and system can continue working, which fixes TI CPSW issue.
The sysfs_create_link_nowarn() is going to be used in phylib framework in
subsequent patch which can be built as module. Hence, export
sysfs_create_link_nowarn() to avoid build errors.
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
If bios sets up an MST output and hardware state readout code sees this is
an SST configuration, when disabling the encoder we end up calling
->post_disable_dp() hook instead of the MST version. Consequently, we write
to the DP_SET_POWER dpcd to set it D3 state. Further along when we try
enable the encoder in MST mode, POWER_UP_PHY transaction fails to power up
the MST hub. This results in continuous link training failures which keep
the system busy delaying boot. We could identify bios MST boot discrepancy
and handle it accordingly but a simple way to solve this is to write to the
DP_SET_POWER dpcd for MST too.
Takashi Iwai [Mon, 19 Mar 2018 13:51:49 +0000 (14:51 +0100)]
ACPI / watchdog: Fix off-by-one error at resource assignment
The resource allocation in WDAT watchdog has off-one-by error, it sets
one byte more than the actual end address. This may eventually lead
to unexpected resource conflicts.
Linus Torvalds [Mon, 19 Mar 2018 22:13:04 +0000 (15:13 -0700)]
Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"Two low-impact workqueue commits.
One fixes workqueue creation error path and the other removes the
unused cancel_work()"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: remove unused cancel_work()
workqueue: use put_device() instead of kfree()
Linus Torvalds [Mon, 19 Mar 2018 21:48:35 +0000 (14:48 -0700)]
Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fixes from Tejun Heo:
"Late percpu pull request for v4.16-rc6.
- percpu allocator pool replenishing no longer triggers OOM or
warning messages.
Also, the alloc interface now understands __GFP_NORETRY and
__GFP_NOWARN. This is to allow avoiding OOMs from userland
triggered actions like bpf map creation.
Also added cond_resched() in alloc loop.
- perpcu allocation now can be interrupted by kill sigs to avoid
deadlocking OOM killer.
- Added Dennis Zhou as a co-maintainer.
He has rewritten the area map allocator, understands most of the
code base and has been responsive for all bug reports"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()
percpu: include linux/sched.h for cond_resched()
percpu: add a schedule point in pcpu_balance_workfn()
percpu: allow select gfp to be passed to underlying allocators
percpu: add __GFP_NORETRY semantics to the percpu balancing path
percpu: match chunk allocator declarations with definitions
percpu: add Dennis Zhou as a percpu co-maintainer
Linus Torvalds [Mon, 19 Mar 2018 21:23:30 +0000 (14:23 -0700)]
Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"I sat on them too long and it's quite a few this late, but nothing has
a wide blast area. The changes are...
- Fix corner cases in SG command handling.
- Recent introduction of default powersaving mode config option
exposed several devices with broken powersaving behaviors. A number
of patches to update the blacklist accordingly.
- Fix a kernel panic on SAS hotplug.
- Other misc and device specific updates"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version
libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions
libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs
libata: Enable queued TRIM for Samsung SSD 860
PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L
ahci: Add PCI-id for the Highpoint Rocketraid 644L card
ata: do not schedule hot plug if it is a sas host
libata: disable LPM for Crucial BX100 SSD 500GB drive
libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs
libata: update documentation for sysfs interfaces
ata: sata_rcar: Remove unused variable in sata_rcar_init_controller()
libata: transport: cleanup documentation of sysfs interface
sata_rcar: Reset SATA PHY when Salvator-X board resumes
libata: don't try to pass through NCQ commands to non-NCQ devices
libata: remove WARN() for DMA or PIO command without data
libata: fix length validation of ATAPI-relayed SCSI commands
ata: libahci: fix comment indentation
ahci: Add check for device presence (PCIe hot unplug) in ahci_stop_engine()
libata: Fix compile warning with ATA_DEBUG enabled
Leon Romanovsky [Mon, 19 Mar 2018 10:21:43 +0000 (12:21 +0200)]
RDMA/verbs: Remove restrack entry from XRCD structure
XRCD object is not implemented in the restrack, so lets remove it.
Fixes: 02d8883f520e ("RDMA/restrack: Add general infrastructure to track RDMA resources") Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
Leon Romanovsky [Mon, 19 Mar 2018 12:20:15 +0000 (14:20 +0200)]
RDMA/ucma: Fix use-after-free access in ucma_close
The error in ucma_create_id() left ctx in the list of contexts belong
to ucma file descriptor. The attempt to close this file descriptor causes
to use-after-free accesses while iterating over such list.
Tejun Heo [Wed, 14 Mar 2018 19:45:12 +0000 (12:45 -0700)]
percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
percpu_ref internally uses sched-RCU to implement the percpu -> atomic
mode switching and the documentation suggested that this could be
depended upon. This doesn't seem like a good idea.
* percpu_ref uses sched-RCU which has different grace periods regular
RCU. Users may combine percpu_ref with regular RCU usage and
incorrectly believe that regular RCU grace periods are performed by
percpu_ref. This can lead to, for example, use-after-free due to
premature freeing.
* percpu_ref has a grace period when switching from percpu to atomic
mode. It doesn't have one between the last put and release. This
distinction is subtle and can lead to surprising bugs.
* percpu_ref allows starting in and switching to atomic mode manually
for debugging and other purposes. This means that there may not be
any grace periods from kill to release.
This patch makes it clear that the grace periods are percpu_ref's
internal implementation detail and can't be depended upon by the
users.
Kirill Tkhai [Mon, 19 Mar 2018 15:32:10 +0000 (18:32 +0300)]
mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()
In case of memory deficit and low percpu memory pages,
pcpu_balance_workfn() takes pcpu_alloc_mutex for a long
time (as it makes memory allocations itself and waits
for memory reclaim). If tasks doing pcpu_alloc() are
choosen by OOM killer, they can't exit, because they
are waiting for the mutex.
The patch makes pcpu_alloc() to care about killing signal
and use mutex_lock_killable(), when it's allowed by GFP
flags. This guarantees, a task does not miss SIGKILL
from OOM killer.
CM_PLLx and A2W_XOSC_CTRL registers are accessed by different clock
handlers and must be accessed with ->regs_lock held.
Update the sections where this protection is missing.
Boris Brezillon [Thu, 8 Feb 2018 13:43:35 +0000 (14:43 +0100)]
clk: bcm2835: Fix ana->maskX definitions
ana->maskX values are already '~'-ed in bcm2835_pll_set_rate(). Remove
the '~' in the definition to fix ANA setup.
Note that this commit fixes a long standing bug preventing one from
using an HDMI display if it's plugged after the FW has booted Linux.
This is because PLLH is used by the HDMI encoder to generate the pixel
clock.
ALSA: usb-audio: Fix parsing descriptor of UAC2 processing unit
Currently, the offsets in the UAC2 processing unit descriptor are
calculated incorrectly. It causes an issue when connecting the device which
provides such a feature:
~~~~
[84126.724420] usb 1-1.3.1: invalid Processing Unit descriptor (id 18)
~~~~
After this patch is applied, the UAC2 processing unit inits w/o this error.
Hans de Goede [Mon, 19 Mar 2018 15:34:00 +0000 (16:34 +0100)]
libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version
When commit 9c7be59fc519af ("libata: Apply NOLPM quirk to Crucial MX100
512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk
from the existing "Crucial_CT*MX100*" entry, but that entry sets model_rev
to "MU01", where as the entry adding the NOLPM quirk sets it to NULL.
This means that after this commit we no apply the NO_NCQ_TRIM quirk to
all "Crucial_CT512MX100*" SSDs even if they have the fixed "MU02"
firmware. This commit splits the "Crucial_CT512MX100*" quirk into 2
quirks, one for the "MU01" firmware and one for all other firmware
versions, so that we once again only apply the NO_NCQ_TRIM quirk to the
"MU01" firmware version.
Hans de Goede [Mon, 19 Mar 2018 15:33:59 +0000 (16:33 +0100)]
libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions
Commit b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB
drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs
but limited this to the MU02 firmware version, according to:
http://www.crucial.com/usa/en/support-ssd-firmware
MU02 is the last version, so there are no newer possibly fixed versions
and if the MU02 version has broken LPM then the MU01 almost certainly
also has broken LPM, so this commit changes the quirk to apply to all
firmware versions.
Hans de Goede [Mon, 19 Mar 2018 15:33:58 +0000 (16:33 +0100)]
libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs
There have been reports of the Crucial M500 480GB model not working
with LPM set to min_power / med_power_with_dipm level.
It has not been tested with medium_power, but that typically has no
measurable power-savings.
Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03
and there is a MU05 update available, but that update does not mention any
LPM fixes in its changelog, so the quirk matches all firmware versions.
In my experience the LPM problems with (older) Crucial SSDs seem to be
limited to higher capacity versions of the SSDs (different firmware?),
so this commit adds a NOLPM quirk for the 480 and 960GB versions of the
M500, to avoid LPM causing issues with these SSDs.
Dan Carpenter [Mon, 19 Mar 2018 11:07:45 +0000 (14:07 +0300)]
staging: ncpfs: memory corruption in ncp_read_kernel()
If the server is malicious then *bytes_read could be larger than the
size of the "target" buffer. It would lead to memory corruption when we
do the memcpy().
Merge tag 'iio-fixes-for-4.16b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
Second set of IIO fixes for the 4.16 cycle.
A slightly fiddly revert then fix pair in here as the bug lead to
an unused local variable that was then removed without us noticing
the bug. The revert should only be needed on 4.16 - the fix
goes back futher.
* ccs811
- Fix the transition from 'boot' to 'application' mode. Fixes the case
where the power is not cut between boot cycles.
* meson-saradc
- Fix missing mutex_unlock in an error path.
* sd-modulator
- Fix bindings doc to have the right value of io-channel-cells to reflect
that this device type only ever outputs one channel.
* st-accel
- Revert drop of redundant pointer patch.
- Use the now available pointer to avoid overwriting the platform data
pointer and causing trouble on reprobing the driver.
* st-pressure
- Use local copy of the platform data pointer to avoid overwriting the
one associated with the device, which would cause issues on reprobing
the driver.
* stm32-dfsdm
- Use the right regmap_cfg for the type of device.
- Correct the ID passed to stop channel to be the channel one.
- Correct which clock is used to allow for the 'audio' clock.
- Fix allocation of channels when more than one is enabled.
OuYang ZhiZhong [Sun, 11 Mar 2018 07:59:07 +0000 (15:59 +0800)]
mtdchar: fix usage of mtd_ooblayout_ecc()
Section was not properly computed. The value of OOB region definition is
always ECC section 0 information in the OOB area, but we want to get all
the ECC bytes information, so we should call
mtd_ooblayout_ecc(mtd, section++, &oobregion) until it returns -ERANGE.
Fixes: c2b78452a9db ("mtd: use mtd_ooblayout_xxx() helpers where appropriate") Cc: <[email protected]> Signed-off-by: OuYang ZhiZhong <[email protected]> Signed-off-by: Boris Brezillon <[email protected]>
Daniel Drake [Wed, 14 Mar 2018 08:42:17 +0000 (16:42 +0800)]
Revert "ACPI / battery: Add quirk for Asus GL502VSK and UX305LA"
Revert commit c68f0676ef7d ("ACPI / battery: Add quirk for Asus
GL502VSK and UX305LA") and commit 4446823e2573 ("ACPI / battery: Add
quirk for Asus UX360UA and UX410UAK").
On many many Asus products, the battery is sometimes reported as
charging or discharging even when it is full and you are on AC power.
This change quirked the kernel to avoid advertising the discharging
state when this happens on 4 laptop models, under the belief that
this was incorrect information. I presume it originates from user
reports who are confused that their battery status icon says that it
is discharging.
However, the reported information is indeed correct, and the quirk
approach taken is inadequate and more thought is needed first.
Specifically:
1. It only quirks discharging state, not charging
2. There are so many different Asus products and DMI naming variants
within those product families that behave this way; Linux could
grow to quirk hundreds of products and still not even be close at
"winning" this battle.
3. Asus previously clarified that this behaviour is intentional. The
platform will periodically do a partial discharge/charge cycle
when the battery is full, because this is one way to extend the
lifetime of the battery (leaving a battery at 100% charge and
unused will decrease its usable capacity over time).
My understanding is that any decent consumer product will have
this behaviour, but it appears that Asus is different in that
they expose this info through ACPI.
However, the behaviour seems correct. The ACPI spec does not
suggest in that the platform should hide the truth. It lets you
report that the battery is full of charge, and discharging, and
with external power connected; and Asus does this.
4. In terms of not confusing the user, this seems like something that
could/should be handled by userspace, which can also detect these
same (accurate) conditions in the general case.
Revert this quirk before it gets included in a release, while we look
for better approaches.
Thierry Reding [Sun, 18 Mar 2018 00:13:39 +0000 (01:13 +0100)]
drm/tegra: Shutdown on driver unbind
Since commit 846c7dfc1193 ("drm/atomic: Try to preserve the crtc enabled
state in drm_atomic_remove_fb, v2."), removing the last framebuffer will
no longer disable the corresponding pipeline, which causes the KMS core
to complain about leaked connectors on driver unbind.
Fix this by calling drm_atomic_helper_shutdown() on driver unbind, which
will cause all display pipelines to be shut down and therefore drop the
extra references on the connectors.
Thierry Reding [Sat, 17 Mar 2018 00:52:38 +0000 (01:52 +0100)]
drm/tegra: dc: Detach IOMMU group from domain only once
Detaching from an IOMMU group multiple times can lead to a crash. This
could potentially be fixed in the IOMMU driver, but it's easy to avoid
the subsequent detach operations in this driver, so do that as well.
Andy Lutomirski [Sat, 17 Mar 2018 15:25:07 +0000 (08:25 -0700)]
selftests/x86/ptrace_syscall: Fix for yet more glibc interference
glibc keeps getting cleverer, and my version now turns raise() into
more than one syscall. Since the test relies on ptrace seeing an
exact set of syscalls, this breaks the test. Replace raise(SIGSTOP)
with syscall(SYS_tgkill, ...) to force glibc to get out of our way.
Linus Torvalds [Sun, 18 Mar 2018 19:03:15 +0000 (12:03 -0700)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner:
"Another set of melted spectrum updates:
- Iron out the last late microcode loading issues by actually
checking whether new microcode is present and preventing the CPU
synchronization to run into a timeout induced hang.
- Remove Skylake C2 from the microcode blacklist according to the
latest Intel documentation
- Fix the VM86 POPF emulation which traps if VIP is set, but VIF is
not. Enhance the selftests to catch that kind of issue
- Annotate indirect calls/jumps for objtool on 32bit. This is not a
functional issue, but for consistency sake its the right thing to
do.
- Fix a jump label build warning observed on SPARC64 which uses 32bit
storage for the code location which is casted to 64 bit pointer w/o
extending it to 64bit first.
- Add two new cpufeature bits. Not really an urgent issue, but
provides them for both x86 and x86/kvm work. No impact on the
current kernel"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Fix CPU synchronization routine
x86/microcode: Attempt late loading only when new microcode is present
x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
jump_label: Fix sparc64 warning
x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels
x86/vm86/32: Fix POPF emulation
selftests/x86/entry_from_vm86: Add test cases for POPF
selftests/x86/entry_from_vm86: Exit with 1 if we fail
x86/cpufeatures: Add Intel PCONFIG cpufeature
x86/cpufeatures: Add Intel Total Memory Encryption cpufeature
Linus Torvalds [Sun, 18 Mar 2018 19:01:14 +0000 (12:01 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"A single fix for vmalloc_fault() which uses p*d_huge() unconditionally
whether CONFIG_HUGETLBFS is set or not. In case of CONFIG_HUGETLBFS=n
this results in a crash as p*d_huge() returns 0 in that case"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix vmalloc_fault to use pXd_large
Linus Torvalds [Sun, 18 Mar 2018 18:56:53 +0000 (11:56 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Thomas Gleixner:
"A single fix to prevent partially initialized pointers in mixed mode
(64bit kernel on 32bit UEFI)"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub/tpm: Initialize pointer variables to zero for mixed mode
Linus Torvalds [Sun, 18 Mar 2018 18:23:12 +0000 (11:23 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"PPC:
- fix bug leading to lost IPIs and smp_call_function_many() lockups
on POWER9
ARM:
- locking fix
- reset fix
- GICv2 multi-source SGI injection fix
- GICv2-on-v3 MMIO synchronization fix
- make the console less verbose.
x86:
- fix device passthrough on AMD SME"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix device passthrough when SME is active
kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
KVM: arm/arm64: Reduce verbosity of KVM init log
KVM: arm/arm64: Reset mapped IRQs on VM reset
KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN
KVM: arm/arm64: vgic: Add missing irq_lock to vgic_mmio_read_pending
KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry
Sven Eckelmann [Fri, 16 Mar 2018 20:14:32 +0000 (21:14 +0100)]
batman-adv: Fix skbuff rcsum on packet reroute
batadv_check_unicast_ttvn may redirect a packet to itself or another
originator. This involves rewriting the ttvn and the destination address in
the batadv unicast header. These field were not yet pulled (with skb rcsum
update) and thus any change to them also requires a change in the receive
checksum.
Ronak Doshi [Fri, 16 Mar 2018 21:49:19 +0000 (14:49 -0700)]
vmxnet3: use correct flag to indicate LRO feature
'Commit 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2
(fwd)")' introduced a flag "lro" in structure vmxnet3_adapter which is
used to indicate whether LRO is enabled or not. However, the patch
did not set the flag and hence it was never exercised.
So, when LRO is enabled, it resulted in poor TCP performance due to
delayed acks. This issue is seen with packets which are larger than
the mss getting a delayed ack rather than an immediate ack, thus
resulting in high latency.
This patch removes the lro flag and directly uses device features
against NETIF_F_LRO to check if lro is enabled.
Ronak Doshi [Fri, 16 Mar 2018 21:47:54 +0000 (14:47 -0700)]
vmxnet3: avoid xmit reset due to a race in vmxnet3
The field txNumDeferred is used by the driver to keep track of the number
of packets it has pushed to the emulation. The driver increments it on
pushing the packet to the emulation and the emulation resets it to 0 at
the end of the transmit.
There is a possibility of a race either when (a) ESX is under heavy load or
(b) workload inside VM is of low packet rate.
This race results in xmit hangs when network coalescing is disabled. This
change creates a local copy of txNumDeferred and uses it to perform ring
arithmetic.
David S. Miller [Sat, 17 Mar 2018 23:53:29 +0000 (19:53 -0400)]
Merge branch 'tcf_foo_init-NULL-deref'
Davide Caratti says:
====================
net/sched: fix NULL dereference in the error path of .init()
with several TC actions it's possible to see NULL pointer dereference,
when the .init() function calls tcf_idr_alloc(), fails at some point and
then calls tcf_idr_release(): this series fixes all them introducing
non-NULL tests in the .cleanup() function.
====================
Davide Caratti [Thu, 15 Mar 2018 23:00:57 +0000 (00:00 +0100)]
net/sched: fix NULL dereference on the error path of tcf_skbmod_init()
when the following command
# tc action replace action skbmod swap mac index 100
is run for the first time, and tcf_skbmod_init() fails to allocate struct
tcf_skbmod_params, tcf_skbmod_cleanup() calls kfree_rcu(NULL), thus
causing the following error:
Davide Caratti [Thu, 15 Mar 2018 23:00:56 +0000 (00:00 +0100)]
net/sched: fix NULL dereference in the error path of tcf_sample_init()
when the following command
# tc action add action sample rate 100 group 100 index 100
is run for the first time, and psample_group_get(100) fails to create a
new group, tcf_sample_cleanup() calls psample_group_put(NULL), thus
causing the following error:
Davide Caratti [Thu, 15 Mar 2018 23:00:55 +0000 (00:00 +0100)]
net/sched: fix NULL dereference in the error path of tunnel_key_init()
when the following command
# tc action add action tunnel_key unset index 100
is run for the first time, and tunnel_key_init() fails to allocate struct
tcf_tunnel_key_params, tunnel_key_release() dereferences NULL pointers.
This causes the following error:
Davide Caratti [Thu, 15 Mar 2018 23:00:54 +0000 (00:00 +0100)]
net/sched: fix NULL dereference in the error path of tcf_csum_init()
when the following command
# tc action add action csum udp continue index 100
is run for the first time, and tcf_csum_init() fails allocating struct
tcf_csum, tcf_csum_cleanup() calls kfree_rcu(NULL,...). This causes the
following error:
fix this in tcf_csum_cleanup(), ensuring that kfree_rcu(param, ...) is
called only when param is not NULL.
Fixes: 9c5f69bbd75a ("net/sched: act_csum: don't use spinlock in the fast path") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Davide Caratti [Thu, 15 Mar 2018 23:00:53 +0000 (00:00 +0100)]
net/sched: fix NULL dereference in the error path of tcf_vlan_init()
when the following command
# tc actions replace action vlan pop index 100
is run for the first time, and tcf_vlan_init() fails allocating struct
tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes
the following error:
SZ Lin (林上智) [Thu, 15 Mar 2018 16:56:01 +0000 (00:56 +0800)]
net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface
According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and
DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only
available when PHY is configured in RGMII mode with 10Mbps speed. It will
cause some networking issues without RGMII mode, such as carrier sense
errors and low throughput. TI also mentioned this issue in their forum[4].
This patch adds the check mechanism for PHY interface with RGMII interface
type, the in-band mode can only be set in RGMII mode with 10Mbps speed.
Matthias Brugger [Thu, 15 Mar 2018 16:54:20 +0000 (17:54 +0100)]
net: hns: Fix ethtool private flags
The driver implementation returns support for private flags, while
no private flags are present. When asked for the number of private
flags it returns the number of statistic flag names.
Fix this by returning EOPNOTSUPP for not implemented ethtool flags.
Takashi Iwai [Sat, 17 Mar 2018 21:40:18 +0000 (22:40 +0100)]
ALSA: hda/realtek - Always immediately update mute LED with pin VREF
Some HP laptops have a mute mute LED controlled by a pin VREF. The
Realtek codec driver updates the VREF via vmaster hook by calling
snd_hda_set_pin_ctl_cache().
This works fine as long as the driver is running in a normal mode.
However, when the VREF change happens during the codec being in
runtime PM suspend, the regmap access will skip and postpone the
actual register change. This ends up with the unchanged LED status
until the next runtime PM resume even if you change the Master mute
switch. (Interestingly, the machine keeps the LED status even after
the codec goes into D3 -- but it's another story.)
For improving this usability, let the driver temporarily powering up /
down only during the pin VREF change. This can be achieved easily by
wrapping the call with snd_hda_power_up_pm() / *_down_pm().
Ido Schimmel [Thu, 15 Mar 2018 12:49:56 +0000 (14:49 +0200)]
mlxsw: spectrum_buffers: Set a minimum quota for CPU port traffic
In commit 9ffcc3725f09 ("mlxsw: spectrum: Allow packets to be trapped
from any PG") I fixed a problem where packets could not be trapped to
the CPU due to exceeded shared buffer quotas. The mentioned commit
explains the problem in detail.
The problem was fixed by assigning a minimum quota for the CPU port and
the traffic class used for scheduling traffic to the CPU.
However, commit 117b0dad2d54 ("mlxsw: Create a different trap group list
for each device") assigned different traffic classes to different
packet types and rendered the fix useless.
Fix the problem by assigning a minimum quota for the CPU port and all
the traffic classes that are currently in use.
parisc: Handle case where flush_cache_range is called with no context
Just when I had decided that flush_cache_range() was always called with
a valid context, Helge reported two cases where the
"BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd:
This indicates that we need to handle the no context case in
flush_cache_range() as we do in flush_cache_mm().
In thinking about this, I realized that we don't need to flush the TLB
when there is no context. So, I added context checks to the large flush
cases in flush_cache_mm() and flush_cache_range(). The large flush case
occurs frequently in flush_cache_mm() and the change should improve fork
performance.
The v2 version of this change removes the BUG_ON from flush_cache_page()
by skipping the TLB flush when there is no context. I also added code
to flush the TLB in flush_cache_mm() and flush_cache_range() when we
have a context that's not current. Now all three routines handle TLB
flushes in a similar manner.
Linus Torvalds [Fri, 16 Mar 2018 20:37:42 +0000 (13:37 -0700)]
Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"There's an important revert in this pull request that needs to go to
stable as it causes a corruption on big endian machines.
The other fix is for FIEMAP incorrectly reporting shared extents
before a sync and one fix for a crash in raid56.
So far we got only one report about the BE corruption, the stable
kernels were out for like a week, so hopefully the scope of the damage
is low"
* tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Revert "btrfs: use proper endianness accessors for super_copy"
btrfs: add missing initialization in btrfs_check_shared
btrfs: Fix NULL pointer exception in find_bio_stripe
Borislav Petkov [Wed, 14 Mar 2018 18:36:15 +0000 (19:36 +0100)]
x86/microcode: Fix CPU synchronization routine
Emanuel reported an issue with a hang during microcode update because my
dumb idea to use one atomic synchronization variable for both rendezvous
- before and after update - was simply bollocks:
microcode: microcode_reload_late: late_cpus: 4
microcode: __reload_late: cpu 2 entered
microcode: __reload_late: cpu 1 entered
microcode: __reload_late: cpu 3 entered
microcode: __reload_late: cpu 0 entered
microcode: __reload_late: cpu 1 left
microcode: Timeout while waiting for CPUs rendezvous, remaining: 1
CPU1 above would finish, leave and the others will still spin waiting for
it to join.
So do two synchronization atomics instead, which makes the code a lot more
straightforward.
Also, since the update is serialized and it also takes quite some time per
microcode engine, increase the exit timeout by the number of CPUs on the
system.
That's ok because the moment all CPUs are done, that timeout will be cut
short.
Furthermore, panic when some of the CPUs timeout when returning from a
microcode update: we can't allow a system with not all cores updated.
Also, as an optimization, do not do the exit sync if microcode wasn't
updated.
nouveau:
- two backlight fixes
- fix for some lockups
Pretty quiet week, seems like everyone was fixing backlights"
* tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau/bl: fix backlight regression
drm/nouveau/bl: Fix oops on driver unbind
drm/nouveau/mmu: ALIGN_DOWN correct variable
drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field
drm/i915/gvt: Correct the privilege shadow batch buffer address
drm/amdgpu/dce: Don't turn off DP sink when disconnected
drm/amdgpu: save/restore backlight level in legacy dce code
drm/radeon: fix prime teardown order
drm/amdgpu: fix prime teardown order
drm/i915: Kick the rps worker when changing the boost frequency
drm/i915: Only prune fences after wait-for-all
drm/i915: Enable VBT based BL control for DP
drm/i915/gvt: keep oa config in shadow ctx
drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio
batman-adv: fix header size check in batadv_dbg_arp()
Checking for 0 is insufficient: when an SKB without a batadv header, but
with a VLAN header is received, hdr_size will be 4, making the following
code interpret the Ethernet header as a batadv header.
Fixes: be1db4f6615b ("batman-adv: make the Distributed ARP Table vlan aware") Signed-off-by: Matthias Schiffer <[email protected]> Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]>
batadv_check_unicast_ttvn() calls skb_cow(), so pointers into the SKB data
must be (re)set after calling it. The ethhdr variable is dropped
altogether.
Andrew Lunn [Fri, 16 Mar 2018 17:03:46 +0000 (18:03 +0100)]
net: dsa: mv88e6xxx: Fix binding documentation for MDIO busses
The MDIO busses are switch properties and so should be inside the
switch node. Fix the examples in the binding document.
Reported-by: 尤晓杰 <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Fixes: a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") Signed-off-by: David S. Miller <[email protected]>
Nicolas Dichtel [Wed, 14 Mar 2018 20:10:23 +0000 (21:10 +0100)]
netlink: avoid a double skb free in genlmsg_mcast()
nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.
Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()") Reported-by: Ben Hutchings <[email protected]> Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Michal Kalderon [Wed, 14 Mar 2018 12:56:53 +0000 (14:56 +0200)]
qede: Fix qedr link update
Link updates were not reported to qedr correctly.
Leading to cases where a link could be down, but qedr
would see it as up.
In addition, once qede was loaded, link state would be up,
regardless of the actual link state.
Michal Kalderon [Wed, 14 Mar 2018 12:49:27 +0000 (14:49 +0200)]
qed: Fix MPA unalign flow in case header is split across two packets.
There is a corner case in the MPA unalign flow where a FPDU header is
split over two tcp segments. The length of the first fragment in this
case was not initialized properly and should be '1'
Fixes: c7d1d839 ("qed: Add support for MPA header being split over two tcp packets") Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
There is no need for complex checking between the last consumed index
and current consumed index, a simple subtraction will do.
This also eliminates the possibility of a permanent transmit queue stall
under the following conditions:
- one CPU bursts ring->size worth of traffic (up to 256 buffers), to the
point where we run out of free descriptors, so we stop the transmit
queue at the end of bcm_sysport_xmit()
- because of our locking, we have the transmit process disable
interrupts which means we can be blocking the TX reclamation process
- when TX reclamation finally runs, we will be computing the difference
between ring->c_index (last consumed index by SW) and what the HW
reports through its register
- this register is masked with (ring->size - 1) = 0xff, which will lead
to stripping the upper bits of the index (register is 16-bits wide)
- we will be computing last_tx_cn as 0, which means there is no work to
be done, and we never wake-up the transmit queue, leaving it
permanently disabled
A practical example is e.g: ring->c_index aka last_c_index = 12, we
pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12,
so last_tx_cn == 0, nothing happens.
Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>