Linus Torvalds [Sat, 12 Jan 2019 18:39:43 +0000 (10:39 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"Minor fixes for new code, corner cases, and documentation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86/kvm/nVMX: don't skip emulated instruction twice when vmptr address is not backed
Documentation/virtual/kvm: Update URL for AMD SEV API specification
KVM/VMX: Avoid return error when flush tlb successfully in the hv_remote_flush_tlb_with_range()
kvm: sev: Fail KVM_SEV_INIT if already initialized
KVM: validate userspace input in kvm_clear_dirty_log_protect()
KVM: x86: Fix bit shifting in update_intel_pt_cfg
Linus Torvalds [Sat, 12 Jan 2019 18:30:43 +0000 (10:30 -0800)]
Merge tag 'drm-fixes-2019-01-11-1' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Daniel Vetter:
"Dave sends out his pull, everybody remembers holidays are over :-)
Since Dave's already in weekend mode and it was quite a few patches I
figured better to apply all the pulls and forward them to you. Hence
here 2nd part of bugfixes for -rc2.
i915:
- Disable PSR for Apple panels
- Broxton ERR_PTR error state fix
- Kabylake VECS workaround fix
- Unwind failure on pinning the gen7 ppgtt
- GVT workload request allocation fix
core:
- Fix fb-helper to work correctly with SDL 1.2 bugs
- Fix lockdep warning in the atomic ioctl and setproperty"
* tag 'drm-fixes-2019-01-11-1' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/falcon: avoid touching registers if engine is off
drm/nouveau: Don't disable polling in fallback mode
drm/nouveau: register backlight on pascal and newer
drm: Fix documentation generation for DP_DPCD_QUIRK_NO_PSR
drm/i915: init per-engine WAs for all engines
drm/i915: Unwind failure on pinning the gen7 ppgtt
drm/i915: Skip the ERR_PTR error state
drm/i915: Disable PSR in Apple panels
gpu/drm: Fix lock held when returning to user space.
drm/fb-helper: Ignore the value of fb_var_screeninfo.pixclock
drm/fb-helper: Partially bring back workaround for bugs of SDL 1.2
drm/i915/gvt: Fix workload request allocation before request add
Varun Prakash [Thu, 10 Jan 2019 17:59:28 +0000 (23:29 +0530)]
scsi: cxgb4i: add wait_for_completion()
In case of ->set_param() and ->bind_conn() cxgb4i driver does not wait for
cmd completion, this can create race conditions, to avoid this add
wait_for_completion().
After Commit 54aed4dd3526 ("MIPS: IP27: use dma_direct_ops") qla1280 driver
failed on SGI IP27 machines with
qla1280: QLA1040 found on PCI bus 0, dev 0
qla1280 0000:00:00.0: enabling device (0006 -> 0007)
qla1280: Failed to get request memory
qla1280: probe of 0000:00:00.0 failed with error -12
Reason is that SGI IP27 always generates 64bit DMA addresses and has no
fallback mode for 32bit DMA addresses implemented. QLA1280 supports 64bit
addressing for all DMA accesses so setting coherent mask to 64bit fixes the
issue.
Avri Altman [Thu, 10 Jan 2019 11:31:26 +0000 (13:31 +0200)]
scsi: ufs: Fix geometry descriptor size
Albeit we no longer rely on those hard-coded descriptor sizes, we still use
them as our defaults, so better get it right. While adding its sysfs
entries, we forgot to update the geometry descriptor size. It is 0x48
according to UFS2.1, and wasn't changed in UFS3.0.
[mkp: typo]
Fixes: c720c091222e (scsi: ufs: sysfs: geometry descriptor) Signed-off-by: Avri Altman <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
Shivasharan S [Wed, 9 Jan 2019 13:08:37 +0000 (05:08 -0800)]
scsi: megaraid_sas: Retry reads of outbound_intr_status reg
commit 272652fcbf1a ("scsi: megaraid_sas: add retry logic in megasas_readl")
missed changing readl to megasas_readl in megasas_clear_intr_fusion(). For
Aero controllers, reads of outbound_intr_status register needs to be
retried.
scsi: qedi: Add ep_state for login completion on un-reachable targets
When the driver finds invalid destination MAC for the first un-reachable
target, and before completes the PATH_REQ operation, set new ep_state to
OFFLDCONN_NONE so that as part of driver ep_poll mechanism, the upper
open-iscsi layer is notified to complete the login process on the first
un-reachable target and thus proceed login to other reachable targets.
Stanley Chu [Mon, 7 Jan 2019 14:19:34 +0000 (22:19 +0800)]
scsi: ufs: Fix system suspend status
hba->is_sys_suspended is set after successful system suspend but
not clear after successful system resume.
According to current behavior, hba->is_sys_suspended will not be set if
host is runtime-suspended but not system-suspended. Thus we shall aligh the
same policy: clear this flag even if host remains runtime-suspended after
ufshcd_system_resume is successfully returned.
Ming Lei [Fri, 11 Jan 2019 17:40:47 +0000 (09:40 -0800)]
scsi: qla2xxx: Use correct number of vectors for online CPUs
When SCSI-MQ is enabled, in some case system would present
nr_possible_cpus() which is greater than requested vectors by the
driver. This results into driver being able to get larger number of MSI-X
vectors than actual online CPUs. Driver then uses
pci_alloc_irq_vectors_affinity() to assign 1:1 mapping and affinity for
each MSI-x vector to CPUs. When the command is submitted using MSI-x
vector, assigned to offline CPU, it results in an ABTS and system
hang. This hang is result of a driver not being able to process interrupt
on a vector assigned to an Off-line CPUs
This patch fixes this issue by setting irq_offset value for the
blk_mq_pci_map_queues() to use only those CPUs which has CPU mask affinity
assigned and are online. By using the irq_offset value, driver will allow
online cpumask to decide which vectors are used in blk_mq_pci_map_queues().
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:
John Garry [Thu, 10 Jan 2019 13:32:41 +0000 (21:32 +0800)]
scsi: hisi_sas: Set protection parameters prior to adding SCSI host
Currently we set the protection parameters after calling scsi_add_host()
for v3 hw.
They should be set beforehand, so make this change.
Appearantly this fixes our DIX issue (not mainline yet) also, but more
testing required.
Fixes: d6a9000b81be ("scsi: hisi_sas: Add support for DIF feature for v2 hw") Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
Paolo Abeni [Tue, 8 Jan 2019 17:45:05 +0000 (18:45 +0100)]
net: clear skb->tstamp in bridge forwarding path
Matteo reported forwarding issues inside the linux bridge,
if the enslaved interfaces use the fq qdisc.
Similar to commit 8203e2d844d3 ("net: clear skb->tstamp in
forwarding paths"), we need to clear the tstamp field in
the bridge forwarding path.
Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.") Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Reported-and-tested-by: Matteo Croce <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Sat, 12 Jan 2019 02:05:41 +0000 (18:05 -0800)]
Merge branch 'bpfilter-fixes'
Taehee Yoo says:
====================
net: bpfilter: fix two bugs in bpfilter
This patches fix two bugs in the bpfilter_umh which are related in
iptables command.
The first patch adds an exit code for UMH process.
This provides an opportunity to cleanup members of the umh_info
to modules which use the UMH.
In order to identify UMH processes, a new flag PF_UMH is added.
The second patch makes the bpfilter_umh use UMH cleanup callback.
The third patch adds re-start routine for the bpfilter_umh.
The bpfilter_umh does not re-start after error occurred.
because there is no re-start routine in the module.
The fourth patch ensures that the bpfilter.ko module will not removed while
it's being used.
The bpfilter.ko is not protected by locks or module reference counter.
Therefore that can be removed while module is being used.
In order to protect that, mutex is used.
The first and second patch are preparation patches for the third and
fourth patch.
TEST #1
while :
do
modprobe bpfilter
kill -9 <pid of the bpfilter_umh>
iptables -vnL
done
TEST #2
while :
do
iptables -I FORWARD -m string --string ap --algo kmp &
iptables -F &
modprobe -rv bpfilter &
done
TEST #3
while :
do
modprobe bpfilter &
modprobe -rv bpfilter &
done
The TEST1 makes a failure of iptables command.
This is fixed by the third patch.
The TEST2 makes a panic because of a race condition in the bpfilter_umh
module.
This is fixed by the fourth patch.
The TEST3 makes a double-create UMH process.
This is fixed by the third and fourth patch.
v4 :
- declare the exit_umh() as static inline
- check stop flag in the load_umh() to avoid a double-create UMH
v3 :
- Avoid unnecessary list lookup for non-UMH processes
- Add a new PF_UMH flag
v2 : add the first and second patch
v1 : Initial patch
====================
Taehee Yoo [Tue, 8 Jan 2019 17:25:10 +0000 (02:25 +0900)]
net: bpfilter: disallow to remove bpfilter module while being used
The bpfilter.ko module can be removed while functions of the bpfilter.ko
are executing. so panic can occurred. in order to protect that, locks can
be used. a bpfilter_lock protects routines in the
__bpfilter_process_sockopt() but it's not enough because __exit routine
can be executed concurrently.
Now, the bpfilter_umh can not run in parallel.
So, the module do not removed while it's being used and it do not
double-create UMH process.
The members of the umh_info and the bpfilter_umh_ops are protected by
the bpfilter_umh_ops.lock.
test commands:
while :
do
iptables -I FORWARD -m string --string ap --algo kmp &
modprobe -rv bpfilter &
done
Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Taehee Yoo [Tue, 8 Jan 2019 17:24:53 +0000 (02:24 +0900)]
net: bpfilter: restart bpfilter_umh when error occurred
The bpfilter_umh will be stopped via __stop_umh() when the bpfilter
error occurred.
The bpfilter_umh() couldn't start again because there is no restart
routine.
The section of the bpfilter_umh_{start/end} is no longer .init.rodata
because these area should be reused in the restart routine. hence
the section name is changed to .bpfilter_umh.
The bpfilter_ops->start() is restart callback. it will be called when
bpfilter_umh is stopped.
The stop bit means bpfilter_umh is stopped. this bit is set by both
start and stop routine.
Before this patch,
Test commands:
$ iptables -vnL
$ kill -9 <pid of bpfilter_umh>
$ iptables -vnL
[ 480.045136] bpfilter: write fail -32
$ iptables -vnL
All iptables commands will fail.
After this patch,
Test commands:
$ iptables -vnL
$ kill -9 <pid of bpfilter_umh>
$ iptables -vnL
$ iptables -vnL
Now, all iptables commands will work.
Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Taehee Yoo [Tue, 8 Jan 2019 17:24:34 +0000 (02:24 +0900)]
net: bpfilter: use cleanup callback to release umh_info
Now, UMH process is killed, do_exit() calls the umh_info->cleanup callback
to release members of the umh_info.
This patch makes bpfilter_umh's cleanup routine to use the
umh_info->cleanup callback.
Taehee Yoo [Tue, 8 Jan 2019 17:23:56 +0000 (02:23 +0900)]
umh: add exit routine for UMH process
A UMH process which is created by the fork_usermode_blob() such as
bpfilter needs to release members of the umh_info when process is
terminated.
But the do_exit() does not release members of the umh_info. hence module
which uses UMH needs own code to detect whether UMH process is
terminated or not.
But this implementation needs extra code for checking the status of
UMH process. it eventually makes the code more complex.
The new PF_UMH flag is added and it is used to identify UMH processes.
The exit_umh() does not release members of the umh_info.
Hence umh_info->cleanup callback should release both members of the
umh_info and the private data.
Xiubo Li [Fri, 23 Nov 2018 01:15:30 +0000 (09:15 +0800)]
scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes
Currently there is one cmd timeout timer and one qfull timer for each udev,
and whenever any new command is coming in we will update the cmd timer or
qfull timer. For some corner cases the timers are always working only for
the ringbuffer's and full queue's newest cmd. That's to say the timer won't
be fired even if one cmd has been stuck for a very long time and the
deadline is reached.
This fix will keep the cmd/qfull timers to be pended for the oldest cmd in
ringbuffer and full queue, and will update them with the next cmd's
deadline only when the old cmd's deadline is reached or removed from the
ringbuffer and full queue.
Jia-Ju Bai [Tue, 8 Jan 2019 13:04:48 +0000 (21:04 +0800)]
isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
The functions isdn_tty_tiocmset() and isdn_tty_set_termios() may be
concurrently executed.
isdn_tty_tiocmset
isdn_tty_modem_hup
line 719: kfree(info->dtmf_state);
line 721: kfree(info->silence_state);
line 723: kfree(info->adpcms);
line 725: kfree(info->adpcmr);
isdn_tty_set_termios
isdn_tty_modem_hup
line 719: kfree(info->dtmf_state);
line 721: kfree(info->silence_state);
line 723: kfree(info->adpcms);
line 725: kfree(info->adpcmr);
Thus, some concurrency double-free bugs may occur.
These possible bugs are found by a static tool written by myself and
my manual code review.
To fix these possible bugs, the mutex lock "modem_info_mutex" used in
isdn_tty_tiocmset() is added in isdn_tty_set_termios().
The vsock core only supports 32bit CID, but the Virtio-vsock spec define
CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as
zero. This inconsistency causes one bug in vhost vsock driver. The
scenarios is:
0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock
object. And hash_min() is used to compute the hash key. hash_min() is
defined as:
(sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)).
That means the hash algorithm has dependency on the size of macro
argument 'val'.
0. In function vhost_vsock_set_cid(), a 64bit CID is passed to
hash_min() to compute the hash key when inserting a vsock object into
the hash table.
0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min()
to compute the hash key when looking up a vsock for an CID.
Because the different size of the CID, hash_min() returns different hash
key, thus fails to look up the vsock object for an CID.
To fix this bug, we keep CID as u64 in the IOCTLs and virtio message
headers, but explicitly convert u64 to u32 when deal with the hash table
and vsock core.
Jose Abreu [Wed, 9 Jan 2019 09:05:59 +0000 (10:05 +0100)]
net: stmmac: Fix the logic of checking if RX Watchdog must be enabled
RX Watchdog can be disabled by platform definitions but currently we are
initializing the descriptors before checking if Watchdog must be
disabled or not.
Fix this by checking earlier if user wants Watchdog disabled or not.
Jose Abreu [Wed, 9 Jan 2019 09:05:56 +0000 (10:05 +0100)]
net: stmmac: Fix PCI module removal leak
Since commit b7d0f08e9129, the enable / disable of PCI device is not
managed which will result in IO regions not being automatically unmapped.
As regions continue mapped it is currently not possible to remove and
then probe again the PCI module of stmmac.
Fix this by manually unmapping regions on remove callback.
Reported-by: Jane Chu <[email protected]> Fixes: 23222f8f8dce ("acpi, nfit: Add function to look up nvdimm...") Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Dan Williams <[email protected]>
Miquel Raynal [Tue, 4 Dec 2018 19:28:29 +0000 (20:28 +0100)]
ata: ahci: mvebu: request PHY suspend/resume for Armada 3700
A feature has been added in the libahci driver: the possibility to set
a new flag in hpriv->flags to let the core handle PHY suspend/resume
automatically. Make use of this feature to make suspend to RAM work
with SATA drives on A3700.
Miquel Raynal [Tue, 4 Dec 2018 19:28:28 +0000 (20:28 +0100)]
ata: ahci: mvebu: add Armada 3700 initialization needed for S2RAM
A3700 comphy initialization is done in the firmware (TF-A). Looking at
the SATA PHY initialization routine, there is a comment about "vendor
specific" registers. Two registers are mentioned. They are not
initialized there in the firmware because they are AHCI related, while
the firmware at this location does only PHY configuration. The
solution to avoid doing such initialization is relying on U-Boot.
While this work at boot time, U-Boot is definitely not going to run
during a resume after suspending to RAM.
Two possible solutions were considered:
* Fixing the firmware.
* Fixing the kernel driver.
The first solution would take ages to propagate, while the second
solution is easy to implement as the driver as been a little bit
reworked to prepare for such platform configuration. Hence, this patch
adds an Armada 3700 configuration function to set these two registers
both at boot time (in the probe) and after a suspend (in the resume
path).
Miquel Raynal [Tue, 4 Dec 2018 19:28:27 +0000 (20:28 +0100)]
ata: ahci: mvebu: do Armada 38x configuration only on relevant SoCs
At the beginning, only Armada 38x SoCs where supported by the
ahci_mvebu.c driver. Commit 15d3ce7b63bd ("ata: ahci_mvebu: add
support for Armada 3700 variant") introduced Armada 3700 support. As
opposed to Armada 38x SoCs, the 3700 variants do not have to configure
mbus and the regret option. This patch took care of avoiding such
configuration when not needed in the probe function, but failed to do
the same in the resume path. While doing so looks harmless by
experience, let's clean the driver logic and avoid doing this useless
configuration with Armada 3700 SoCs.
Because the logic is very similar between these two places, it has
been decided to factorize this code and put it in a "Armada 38x
configuration function". This function is part of a new
(per-compatible) platform data structure, so that the addition of such
configuration function for Armada 3700 will be eased.
Fixes: 15d3ce7b63bd ("ata: ahci_mvebu: add support for Armada 3700 variant") Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
Miquel Raynal [Tue, 4 Dec 2018 19:28:26 +0000 (20:28 +0100)]
ata: ahci: mvebu: remove stale comment
For Armada-38x (32-bit) SoCs, PM platform support has been added since:
commit 32f9494c9dfd ("ARM: mvebu: prepare pm-board.c for the
introduction of Armada 38x support")
commit 3cbd6a6ca81c ("ARM: mvebu: Add standby support")
For Armada 64-bit SoCs, like the A3700 also using this AHCI driver, PM
platform support has always existed.
There are even suspend/resume hooks in this driver since:
commit d6ecf15814888 ("ata: ahci_mvebu: add suspend/resume support")
Remove the stale comment at the end of this driver stating that all
the above does not exist yet.
Miquel Raynal [Tue, 4 Dec 2018 19:28:25 +0000 (20:28 +0100)]
ata: libahci_platform: comply to PHY framework
Current implementation of the libahci does not take into account the
new PHY framework. Correct the situation by adding a call to
phy_set_mode() before phy_power_on().
PHYs should also be handled at suspend/resume time. For this, call
ahci_platform_enable/disable_phys() at suspend/resume_host() time. These
calls are guarded by a HFLAG (AHCI_HFLAG_SUSPEND_PHYS) that the user of
the libahci driver must set manually in hpriv->flags at probe time. This
is to avoid breaking users that have not been tested with this change.
Linus Torvalds [Fri, 11 Jan 2019 20:28:01 +0000 (12:28 -0800)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has one core and one driver bugfix for you"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: tegra: Fix Maximum transfer size
i2c: dev: prevent adapter retries and timeout being set as minus value
Linus Torvalds [Fri, 11 Jan 2019 20:25:40 +0000 (12:25 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Another handful of arm64 fixes here. Most of the complication comes
from improving our kpti code to avoid lengthy pauses (30+ seconds)
during boot when we rewrite the page tables. There are also a couple
of IORT fixes that came in via Lorenzo.
Summary:
- Don't error in kexec_file_load if kaslr-seed is missing in
device-tree
- Fix incorrect argument type passed to iort_match_node_callback()
- Fix IORT build failure when CONFIG_IOMMU_API=n
- Fix kpti performance regression with new rodata default option
- Typo fix"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kexec_file: return successfully even if kaslr-seed doesn't exist
ACPI/IORT: Fix rc_dma_get_range()
arm64: kpti: Avoid rewriting early page tables when KASLR is enabled
arm64: asm-prototypes: Fix fat-fingered typo in comment
ACPI/IORT: Fix build when CONFIG_IOMMU_API=n
Linus Torvalds [Fri, 11 Jan 2019 20:17:30 +0000 (12:17 -0800)]
Merge tag 'ceph-for-5.0-rc2' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"A patch to allow setting abort_on_full and a fix for an old "rbd
unmap" edge case, marked for stable"
* tag 'ceph-for-5.0-rc2' of git://github.com/ceph/ceph-client:
rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set
ceph: use vmf_error() in ceph_filemap_fault()
libceph: allow setting abort_on_full for rbd
Arnd Bergmann [Thu, 10 Jan 2019 16:24:31 +0000 (17:24 +0100)]
mips: fix n32 compat_ipc_parse_version
While reading through the sysvipc implementation, I noticed that the n32
semctl/shmctl/msgctl system calls behave differently based on whether
o32 support is enabled or not: Without o32, the IPC_64 flag passed by
user space is rejected but calls without that flag get IPC_64 behavior.
As far as I can tell, this was inadvertently changed by a cleanup patch
but never noticed by anyone, possibly nobody has tried using sysvipc
on n32 after linux-3.19.
Linus Torvalds [Fri, 11 Jan 2019 17:44:05 +0000 (09:44 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling updates from Ingo Molnar:
"Tooling changes only: fixes and a few stray improvements.
Most of the diffstat is dominated by a PowerPC related fix of system
call trace output beautification that allows us to (again) use the
UAPI header version and sync up with the kernel's version of PowerPC
system call names in the arch/powerpc/kernel/syscalls/syscall.tbl
header"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
tools headers powerpc: Remove unistd.h
perf powerpc: Rework syscall table generation
perf symbols: Add 'arch_cpu_idle' to the list of kernel idle symbols
tools include uapi: Sync linux/if_link.h copy with the kernel sources
tools include uapi: Sync linux/vhost.h with the kernel sources
tools include uapi: Sync linux/fs.h copy with the kernel sources
perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h
tools include uapi: Grab a copy of linux/mount.h
perf top: Lift restriction on using callchains without "sym" in --sort
tools lib traceevent: Remove tep_data_event_from_type() API
tools lib traceevent: Rename tep_is_file_bigendian() to tep_file_bigendian()
tools lib traceevent: Changed return logic of tep_register_event_handler() API
tools lib traceevent: Changed return logic of trace_seq_printf() and trace_seq_vprintf() APIs
tools lib traceevent: Rename struct cmdline to struct tep_cmdline
tools lib traceevent: Initialize host_bigendian at tep_handle allocation
tools lib traceevent: Introduce new libtracevent API: tep_override_comm()
perf tests: Add a test for the ARM 32-bit [vectors] page
perf tools: Make find_vdso_map() more modular
perf trace: Fix alignment for [continued] lines
perf trace: Fix ')' placement in "interrupted" syscall lines
...
x86/kvm/nVMX: don't skip emulated instruction twice when vmptr address is not backed
Since commit 09abb5e3e5e50 ("KVM: nVMX: call kvm_skip_emulated_instruction
in nested_vmx_{fail,succeed}") nested_vmx_failValid() results in
kvm_skip_emulated_instruction() so doing it again in handle_vmptrld() when
vmptr address is not backed is wrong, we end up advancing RIP twice.
Lan Tianyu [Fri, 4 Jan 2019 07:20:44 +0000 (15:20 +0800)]
KVM/VMX: Avoid return error when flush tlb successfully in the hv_remote_flush_tlb_with_range()
The "ret" is initialized to be ENOTSUPP. The return value of
__hv_remote_flush_tlb_with_range() will be Or with "ret" when ept
table potiners are mismatched. This will cause return ENOTSUPP even if
flush tlb successfully. This patch is to fix the issue and set
"ret" to 0.
Fixes: a5c214dad198 ("KVM/VMX: Change hv flush logic when ept tables are mismatched.") Signed-off-by: Lan Tianyu <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
Tomas Bortoli [Wed, 2 Jan 2019 17:29:37 +0000 (18:29 +0100)]
KVM: validate userspace input in kvm_clear_dirty_log_protect()
The function at issue does not fully validate the content of the
structure pointed by the log parameter, though its content has just been
copied from userspace and lacks validation. Fix that.
Moreover, change the type of n to unsigned long as that is the type
returned by kvm_dirty_bitmap_bytes().
ctl_bitmask in pt_desc is of type u64. When an integer like 0xf is
being left shifted more than 32 bits, the behavior is undefined.
Fix this by adding suffix ULL to integer 0xf.
Addresses-Coverity-ID: 1476095 ("Bad bit shift operation") Fixes: 6c0f0bba85a0 ("KVM: x86: Introduce a function to initialize the PT configuration") Signed-off-by: Gustavo A. R. Silva <[email protected]> Reviewed-by: Wei Yang <[email protected]> Reviewed-by: Luwei Kang <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
Linus Torvalds [Fri, 11 Jan 2019 17:07:19 +0000 (09:07 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A 32-bit build fix, CONFIG_RETPOLINE fixes and rename CONFIG_RESCTRL
to CONFIG_X86_RESCTRL"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE
x86/cache: Rename config option to CONFIG_X86_RESCTRL
samples/seccomp: Fix 32-bit build
Linus Torvalds [Fri, 11 Jan 2019 17:04:36 +0000 (09:04 -0800)]
Merge tag 'acpi-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Fix a build failure introduced recently, fix the xpower PMIC ACPI
driver, clean up the handling of duplicate entries in _PRx power
resource lists and fix addresses in NUMA-related messages on 32-bit
with PAE.
Specifics:
- Fix build failures with both CONFIG_NLS and CONFIG_PCI unset that
can occur since ACPI can be built without PCI now (Sinan Kaya).
- Clean up the handling of duplicate entries in power resource lists
returned by _PRx evaluation to avoid triggering WARN_ON() on
attempts to add duplicate symlinks in sysfs (Hans de Goede).
- Fix issues with the TS current-source switching on systems using
the xpower PMIC by avoiding to update unrelated bits in the TS
pin-ctrl register and avoiding to unconditionally enable TS
current-source on systems where it is not used (Hans de Goede).
- Fix addresses in NUMA-related messages on 32-bit with PAE which can
be truncated due to integer type conversions (Chao Fan)"
* tag 'acpi-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PMIC: xpower: Fix TS-pin current-source handling
ACPI: NUMA: Use correct type for printing addresses on i386-PAE
ACPI: power: Skip duplicate power resource references in _PRx
ACPI: Fix build failure when CONFIG_NLS is set to 'n'
Linus Torvalds [Fri, 11 Jan 2019 17:01:43 +0000 (09:01 -0800)]
Merge tag 'pm-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These fix fallout after starting to use hrtimers in the runtime PM
framework, fix a few cpufreq issues, fix a recently broken reference
to cpuidle documentation, update MAINTAINERS entries for cpufreq and
cpuidle and make the recently added system suspend and resume support
in devfreq actually work.
Specifics:
- Prevent integer overflows from occurring on 32-bit when converting
milliseconds to nanoseconds in the runtime PM framework and update
comments that still refer to jiffies in it (Vincent Guittot,
Ladislav Michl).
- Fix the SCMI cpufreq driver to always use the same frequency units
for arch_set_freq_scale() and make the scale-invariant load
tracking acutally work with this driver (Quentin Perret).
- Fix freeing of dynamic OPPs in the SCPI and SCMI cpufreq drivers
broken during the 4.20 defelopment cycle (Viresh Kumar).
- Prevent the cpufreq core from attempting to return the current
frequency of offline CPUs (Sudeep Holla).
- Add devfreq suspend and resume hooks (missed previously) to the PM
core to make the recently added system suspend and resume support
in devfreq actually work (Lukasz Luba).
- Update MAINTAINERS entries for cpufreq and cpuidle, mostly to add
references to new/current documentation to them (Rafael Wysocki).
- Fix a recently broken reference to cpuidle documentation (Otto
Sabart)"
* tag 'pm-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM-runtime: Fix autosuspend_delay on 32bits arch
PM-runtime: Fix 'jiffies' in comments after switch to hrtimers
cpufreq: scmi: Fix frequency invariance in slow path
doc: trace: fix reference to cpuidle documentation file
cpufreq: check if policy is inactive early in __cpufreq_get()
cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
cpuidle / Documentation: Update cpuidle MAINTAINERS entry
cpufreq / Documentation: Update cpufreq MAINTAINERS entry
PM: sleep: call devfreq suspend/resume
Linus Torvalds [Fri, 11 Jan 2019 16:58:02 +0000 (08:58 -0800)]
Merge tag 'drm-fixes-2019-01-11' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Not a huge amount for rc2, assume the usual quiet period, and rc3 will
be most of it.
amdgpu:
- Powerplay fixes
- Virtual display pinning fixes
- Golden register updates for Vega
- Pitch and gem size validation fixes
- SR-IOV init error fix
- Pagetables in system RAM disable for some Raven system
- DP-MST resume fixes
tc358767 bridge:
- fix to work with displayport connector"
* tag 'drm-fixes-2019-01-11' of git://anongit.freedesktop.org/drm/drm: (26 commits)
drm/amdgpu: disable system memory page tables for now
drm/amdgpu: set WRITE_BURST_LENGTH to 64B to workaround SDMA1 hang
drm/amdgpu: fix CPDMA hang in PRT mode for VEGA20
drm/bridge: tc358767: use DP connector if no panel set
drm/bridge: tc358767: fix output H/V syncs
drm/bridge: tc358767: reject modes which require too much BW
drm/bridge: tc358767: fix initial DP0/1_SRCCTRL value
drm/bridge: tc358767: fix single lane configuration
drm/bridge: tc358767: add defines for DP1_SRCCTRL & PHY_2LANE
drm/bridge: tc358767: add bus flags
drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()
drm/amdgpu: Don't fail resume process if resuming atomic state fails
drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()
drm/amdgpu: validate user GEM object size
drm/amdgpu: validate user pitch alignment
drm/amd/powerplay: drop the unnecessary uclk hard min setting
drm/amd/powerplay: avoid possible buffer overflow
drm/amd/powerplay: create pp_od_clk_voltage device file under OD support
drm/amd/powerplay: update OD support flag for SKU with no OD capabilities
drm/amdgpu: make gfx9 enter into rlc safe mode when set MGCG
...
Dmitry Safonov [Wed, 9 Jan 2019 01:17:40 +0000 (01:17 +0000)]
tty: Don't hold ldisc lock in tty_reopen() if ldisc present
Try to get reference for ldisc during tty_reopen().
If ldisc present, we don't need to do tty_ldisc_reinit() and lock the
write side for line discipline semaphore.
Effectively, it optimizes fast-path for tty_reopen(), but more
importantly it won't interrupt ongoing IO on the tty as no ldisc change
is needed.
Fixes user-visible issue when tty_reopen() interrupted login process for
user with a long password, observed and reported by Lukas.
Fixes: c96cf923a98d ("tty: Don't block on IO when ldisc change is pending") Fixes: 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()") Cc: Jiri Slaby <[email protected]> Reported-by: Lukas F. Hartmann <[email protected]> Tested-by: Lukas F. Hartmann <[email protected]> Cc: stable <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mmc: core: don't override the CD GPIO level when "cd-inverted" is set
Since commit 89a5e15bcba87d ("gpio/mmc/of: Respect polarity in the device
tree") gpiolib-of parses the "cd-gpios" property and flips the polarity
if "cd-inverted" is also set. This results in the "cd-inverted" property
being evaluated twice, which effectively makes it a no-op:
- first in drivers/gpio/gpiolib-of.c (of_xlate_and_get_gpiod_flags) when
setting up the CD GPIO
- then again in drivers/mmc/core/slot-gpio.c (mmc_gpio_get_cd) when
reading the CD GPIO value at runtime
On boards which are using device-tree with the "cd-inverted" property
being set any inserted card are not detected anymore. This is due to the
MMC core treating the CD GPIO with the wrong polarity.
Disable "override_cd_active_level" for the card detection GPIO which is
parsed using mmc_of_parse. This fixes SD card detection on the boards
which are currently using the "cd-inverted" device-tree property (tested
on Meson8b Odroid-C1 and Meson8b EC-100).
This does not remove the CD GPIO inversion logic from the MMC core
because there's at least one driver (sdhci-pci-core for Intel BayTrail
based boards) which still passes "override_cd_active_level = true" to
mmc_gpiod_request_cd(). Due to lack of hardware for testing this is left
untouched.
In the future the GPIO inversion logic for both, card and read-only
detection can be removed once no driver is using it anymore.
Rob Herring [Fri, 11 Jan 2019 13:34:39 +0000 (14:34 +0100)]
fbdev: offb: Fix OF node name handling
Commit 5c63e407aaab ("fbdev: Convert to using %pOFn instead of
device_node.name") changed how the OF FB driver handles the OF node
name. This missed the case where the node name is passed to
offb_init_palette_hacks(). This results in a NULL ptr dereference
in strncmp and breaks any system except ones using bootx with no display
node.
Fix this by making offb_init_palette_hacks() use the OF node pointer and
use of_node_name_prefix() helper function instead for node name
comparisons. This helps in moving all OF node name accesses to helper
functions in preparation to remove struct device_node.name pointer.
Vlad Tsyrklevich [Fri, 11 Jan 2019 13:34:38 +0000 (14:34 +0100)]
omap2fb: Fix stack memory disclosure
Using [1] for static analysis I found that the OMAPFB_QUERY_PLANE,
OMAPFB_GET_COLOR_KEY, OMAPFB_GET_DISPLAY_INFO, and OMAPFB_GET_VRAM_INFO
cases could all leak uninitialized stack memory--either due to
uninitialized padding or 'reserved' fields.
Fix them by clearing the shared union used to store copied out data.
Pavel Shilovsky [Tue, 8 Jan 2019 19:15:28 +0000 (11:15 -0800)]
CIFS: Fix error paths in writeback code
This patch aims to address writeback code problems related to error
paths. In particular it respects EINTR and related error codes and
stores and returns the first error occurred during writeback.
Pavel Shilovsky [Thu, 3 Jan 2019 23:53:10 +0000 (15:53 -0800)]
CIFS: Move credit processing to mid callbacks for SMB3
Currently we account for credits in the thread initiating a request
and waiting for a response. The demultiplex thread receives the response,
wakes up the thread and the latter collects credits from the response
buffer and add them to the server structure on the client. This approach
is not accurate, because it may race with reconnect events in the
demultiplex thread which resets the number of credits.
Fix this by moving credit processing to new mid callbacks that collect
credits granted by the server from the response in the demultiplex thread.
Pavel Shilovsky [Fri, 4 Jan 2019 00:45:27 +0000 (16:45 -0800)]
CIFS: Fix credits calculation for cancelled requests
If a request is cancelled, we can't assume that the server returns
1 credit back. Instead we need to wait for a response and process
the number of credits granted by the server.
Create a separate mid callback for cancelled request, parse the number
of credits in a response buffer and add them to the client's credits.
If the didn't get a response (no response buffer available) assume
0 credits granted. The latter most probably happens together with
session reconnect, so the client's credits are adjusted anyway.
Ross Lagerwall [Tue, 8 Jan 2019 18:30:56 +0000 (18:30 +0000)]
cifs: Limit memory used by lock request calls to a page
The code tries to allocate a contiguous buffer with a size supplied by
the server (maxBuf). This could fail if memory is fragmented since it
results in high order allocations for commonly used server
implementations. It is also wasteful since there are probably
few locks in the usual case. Limit the buffer to be no larger than a
page to avoid memory allocation failures due to fragmentation.
Pavel Shilovsky [Thu, 10 Jan 2019 19:27:28 +0000 (11:27 -0800)]
CIFS: Do not hide EINTR after sending network packets
Currently we hide EINTR code returned from sock_sendmsg()
and return 0 instead. This makes a caller think that we
successfully completed the network operation which is not
true. Fix this by properly returning EINTR to callers.
Michael Ellerman [Fri, 11 Jan 2019 12:53:46 +0000 (23:53 +1100)]
powerpc/4xx/ocm: Fix fix for phys_addr_t printf warnings
My recent commit to fix the printf warnings in ocm.c got the format
specifier wrong, because I copied it from the documentation without
realising the square brackets are not meant as literals.
This results in the address being suffixed with a literal "[p]".
An opencapi device is using a device PE, so the current code breaks
because pe->pbus is not defined.
More generally, there's no need to define an IOMMU group for opencapi,
as the device sends real addresses directly (admittedly, the
virtualization story is yet to be written). So let's fix it by
skipping the IOMMU group setup for opencapi PHBs.
Commit e1c3743e1a20 ("powerpc/tm: Set MSR[TS] just prior to recheckpoint")
moved a code block around and this block uses a 'msr' variable outside of
the CONFIG_PPC_TRANSACTIONAL_MEM, however the 'msr' variable is declared
inside a CONFIG_PPC_TRANSACTIONAL_MEM block, causing a possible error when
CONFIG_PPC_TRANSACTION_MEM is not defined.
error: 'msr' undeclared (first use in this function)
This is not causing a compilation error in the mainline kernel, because
'msr' is being used as an argument of MSR_TM_ACTIVE(), which is defined as
the following when CONFIG_PPC_TRANSACTIONAL_MEM is *not* set:
#define MSR_TM_ACTIVE(x) 0
This patch just fixes this issue avoiding the 'msr' variable usage outside
the CONFIG_PPC_TRANSACTIONAL_MEM block, avoiding trusting in the
MSR_TM_ACTIVE() definition.
powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool.
Commit 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer
dereference") moved the loading of r6 earlier in the code. As some
functions are called inbetween, r6 needs to be loaded again with the
address of swapper_pg_dir in order to set PTE pointers for
the Abatron BDI.
Fixes: 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
ARM: integrator: impd1: use struct_size() in devm_kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
AKASHI Takahiro [Fri, 11 Jan 2019 07:40:21 +0000 (16:40 +0900)]
arm64: kexec_file: return successfully even if kaslr-seed doesn't exist
In kexec_file_load, kaslr-seed property of the current dtb will be deleted
any way before setting a new value if possible. It doesn't matter whether
it exists in the current dtb.
When executed for a PCI_ROOT_COMPLEX type, iort_match_node_callback()
expects the opaque pointer argument to be a PCI bus device. At the
moment rc_dma_get_range() passes the PCI endpoint instead of the bus,
and we've been lucky to have pci_domain_nr(ptr) return 0 instead of
crashing. Pass the bus device to iort_scan_node().
Daniel Borkmann [Fri, 11 Jan 2019 09:40:55 +0000 (10:40 +0100)]
Merge branch 'bpf-fix-bitfield-printing'
Yonghong Song says:
====================
The previous BTF kind_flag support patch set introduced a bug
for kernel bpffs pretty printing and another bug for bpftool
map pretty printing. If a bitfield struct member offset is
greater than 256 bits, printed value for that struct
member will be incorrect.
- Patch #1 fixed the bug in kernel bpffs pretty printing.
- Patch #2 enhanced the test_btf test case to cover the
issue exposed by patch #1.
- Patch #3 fixed the bug in bpftool map pretty printing.
====================
Yonghong Song [Thu, 10 Jan 2019 19:14:02 +0000 (11:14 -0800)]
tools/bpf: fix bpftool map dump with bitfields
Commit 8772c8bc093b ("tools: bpftool: support pretty print
with kind_flag set") added bpftool map dump with kind_flag
support. When bitfield_size can be retrieved directly from
btf_member, function btf_dumper_bitfield() is called to
dump the bitfield. The implementation passed the
wrong parameter "bit_offset" to the function. The excepted
value is the bit_offset within a byte while the passed-in
value is the struct member offset.
This commit fixed the bug with passing correct "bit_offset"
with adjusted data pointer.
Fixes: 8772c8bc093b ("tools: bpftool: support pretty print with kind_flag set") Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
With the kernel fix, the modified test will succeed:
$ test_btf -p
......
BTF pretty print array(#1)......OK
BTF pretty print array(#2)......OK
PASS:8 SKIP:0 FAIL:0
Fixes: 9d5f9f701b18 ("bpf: btf: fix struct/union/fwd types with kind_flag") Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Yonghong Song [Thu, 10 Jan 2019 19:14:00 +0000 (11:14 -0800)]
bpf: fix bpffs bitfield pretty print
Commit 9d5f9f701b18 ("bpf: btf: fix struct/union/fwd types
with kind_flag") introduced kind_flag and used bitfield_size
in the btf_member to directly pretty print member values.
The commit contained a bug where the incorrect parameters could be
passed to function btf_bitfield_seq_show(). The bits_offset
parameter in the function expects a value less than 8.
Instead, the member offset in the structure is passed.
The below is btf_bitfield_seq_show() func signature:
void btf_bitfield_seq_show(void *data, u8 bits_offset,
u8 nr_bits, struct seq_file *m)
both bits_offset and nr_bits are u8 type. If the bitfield
member offset is greater than 256, incorrect value will
be printed.
This patch fixed the issue by calculating correct proper
data offset and bits_offset similar to non kind_flag case.
Fixes: 9d5f9f701b18 ("bpf: btf: fix struct/union/fwd types with kind_flag") Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Daniel Vetter [Fri, 11 Jan 2019 09:26:21 +0000 (10:26 +0100)]
Merge tag 'drm-intel-fixes-2019-01-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
i915 fixes for v5.0-rc2:
- Disable PSR for Apple panels
- Broxton ERR_PTR error state fix
- Kabylake VECS workaround fix
- Unwind failure on pinning the gen7 ppgtt
- GVT workload request allocation fix
Daniel Vetter [Fri, 11 Jan 2019 09:25:05 +0000 (10:25 +0100)]
Merge tag 'drm-misc-fixes-2019-01-10-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Second pull request, drm-misc-fixes for v5.0-rc2:
- Fix fb-helper to work correctly with SDL 1.2 bugs.
- Fix lockdep warning in the atomic ioctl and setproperty.
* pm-cpufreq:
cpufreq: scmi: Fix frequency invariance in slow path
cpufreq: check if policy is inactive early in __cpufreq_get()
cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
cpufreq / Documentation: Update cpufreq MAINTAINERS entry
Ingo Molnar [Fri, 11 Jan 2019 07:12:09 +0000 (08:12 +0100)]
Merge tag 'perf-core-for-mingo-5.0-20190110' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core fixes and improvements from Arnaldo Carvalho de Melo:
perf trace:
Ravi Bangoria:
- Rework PowerPC syscall table generation, now using a .tbl file just like
x86_64 and S/390, also silencing a tools build warning about headers out of
sync with the kernel sources.
tools include uapi:
Arnaldo Carvalho de Melo:
- Sync linux/if_link.h copy with the kernel sources, silencing a build warning.
perf top:
Arnaldo Carvalho de Melo:
- Add 'arch_cpu_idle' to the list of kernel idle symbols, noticed on a Orange
Pi Zero ARM board, just like with other symbols in other arches.
drm/nouveau: Don't disable polling in fallback mode
When a fan is controlled via linear fallback without cstate, we
shouldn't stop polling. Otherwise it won't be adjusted again and
keeps running at an initial crazy pace.
Fixes: 800efb4c2857 ("drm/nouveau/drm/therm/fan: add a fallback if no fan control is specified in the vbios")
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1103356
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107447 Reported-by: Thomas Blume <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Reviewed-by: Martin Peres <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
Stephen Smalley [Wed, 9 Jan 2019 15:55:10 +0000 (10:55 -0500)]
selinux: fix GPF on invalid policy
levdatum->level can be NULL if we encounter an error while loading
the policy during sens_read prior to initializing it. Make sure
sens_destroy handles that case correctly.
Leon Romanovsky [Thu, 10 Jan 2019 06:15:45 +0000 (08:15 +0200)]
RDMA/mthca: Clear QP objects during their allocation
As part of audit process to update drivers to use rdma_restrack_add()
ensure that QP objects is cleared before access. Such change fixes the
crash observed with uninitialized non zero sgid attr accessed by
ib_destroy_qp().
wenxu [Thu, 10 Jan 2019 06:51:35 +0000 (14:51 +0800)]
netfilter: nft_flow_offload: fix interaction with vrf slave device
In the forward chain, the iif is changed from slave device to master vrf
device. Thus, flow offload does not find a match on the lower slave
device.
This patch uses the cached route, ie. dst->dev, to update the iif and
oif fields in the flow entry.
After this patch, the following example works fine:
# ip addr add dev eth0 1.1.1.1/24
# ip addr add dev eth1 10.0.0.1/24
# ip link add user1 type vrf table 1
# ip l set user1 up
# ip l set dev eth0 master user1
# ip l set dev eth1 master user1
# nft add table firewall
# nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; }
# nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; }
# nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1
# nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1
Shakeel Butt [Thu, 3 Jan 2019 03:14:31 +0000 (19:14 -0800)]
netfilter: ebtables: account ebt_table_info to kmemcg
The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
memory is already accounted to kmemcg. Do the same for ebtables. The
syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
whole system from a restricted memcg, a potential DoS.
By accounting the ebt_table_info, the memory used for ebt_table_info can
be contained within the memcg of the allocating process. However the
lifetime of ebt_table_info is independent of the allocating process and
is tied to the network namespace. So, the oom-killer will not be able to
relieve the memory pressure due to ebt_table_info memory. The memory for
ebt_table_info is allocated through vmalloc. Currently vmalloc does not
handle the oom-killed allocating process correctly and one large
allocation can bypass memcg limit enforcement. So, with this patch,
at least the small allocations will be contained. For large allocations,
we need to fix vmalloc.
Yi Zeng [Wed, 9 Jan 2019 07:33:07 +0000 (15:33 +0800)]
i2c: dev: prevent adapter retries and timeout being set as minus value
If adapter->retries is set to a minus value from user space via ioctl,
it will make __i2c_transfer and __i2c_smbus_xfer skip the calling to
adapter->algo->master_xfer and adapter->algo->smbus_xfer that is
registered by the underlying bus drivers, and return value 0 to all the
callers. The bus driver will never be accessed anymore by all users,
besides, the users may still get successful return value without any
error or information log print out.
If adapter->timeout is set to minus value from user space via ioctl,
it will make the retrying loop in __i2c_transfer and __i2c_smbus_xfer
always break after the the first try, due to the time_after always
returns true.
Fabio Estevam [Wed, 26 Dec 2018 12:06:19 +0000 (10:06 -0200)]
qcom-scm: Include <linux/err.h> header
Since commit e6f6d63ed14c ("drm/msm: add headless gpu device for imx5")
the DRM_MSM symbol can be selected by SOC_IMX5 causing the following
error when building imx_v6_v7_defconfig:
In file included from ../drivers/gpu/drm/msm/adreno/a5xx_gpu.c:17:0:
../include/linux/qcom_scm.h: In function 'qcom_scm_set_cold_boot_addr':
../include/linux/qcom_scm.h:73:10: error: 'ENODEV' undeclared (first use in this function)
return -ENODEV;
Include the <linux/err.h> header file to fix this problem.
Jens Axboe [Thu, 10 Jan 2019 22:29:57 +0000 (15:29 -0700)]
Merge branch 'nvme-5.0' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph.
* 'nvme-5.0' of git://git.infradead.org/nvme:
nvme: don't initlialize ctrl->cntlid twice
nvme: introduce NVME_QUIRK_IGNORE_DEV_SUBNQN
nvme: pad fake subsys NQN vid and ssvid with zeros
nvme-multipath: zero out ANA log buffer
nvme-fabrics: unset write/poll queues for discovery controllers
nvme-tcp: don't ask if controller is fabrics
nvme-tcp: remove dead code
nvme-pci: fix out of bounds access in nvme_cqe_pending
nvme-pci: rerun irq setup on IO queue init errors
nvme-pci: use the same attributes when freeing host_mem_desc_bufs.
nvme-pci: fix the wrong setting of nr_maps
Yuchung Cheng [Wed, 9 Jan 2019 02:14:28 +0000 (18:14 -0800)]
tcp: change txhash on SYN-data timeout
Previously upon SYN timeouts the sender recomputes the txhash to
try a different path. However this does not apply on the initial
timeout of SYN-data (active Fast Open). Therefore an active IPv6
Fast Open connection may incur one second RTO penalty to take on
a new path after the second SYN retransmission uses a new flow label.
This patch removes this undesirable behavior so Fast Open changes
the flow label just like the regular connections. This also helps
avoid falsely disabling Fast Open on the sender which triggers
after two consecutive SYN timeouts on Fast Open.
A network device stack with multiple layers of bonding devices can
trigger a false positive lockdep warning. Adding lockdep nest levels
fixes this. Update the level on both enslave and unlink, to avoid the
following series of events ..
ip netns add test
ip netns exec test bash
ip link set dev lo addr 00:11:22:33:44:55
ip link set dev lo down
ip link add dev bond1 type bond
ip link add dev bond2 type bond
ip link set dev lo master bond1
ip link set dev bond1 master bond2
ip link set dev bond1 nomaster
ip link set dev bond2 master bond1
.. from still generating a splat:
[ 193.652127] ======================================================
[ 193.658231] WARNING: possible circular locking dependency detected
[ 193.664350] 4.20.0 #8 Not tainted
[ 193.668310] ------------------------------------------------------
[ 193.674417] ip/15577 is trying to acquire lock:
[ 193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290
[ 193.687851]
but task is already holding lock:
[ 193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290
Fixes: 7e2556e40026 ("bonding: avoid lockdep confusion in bond_get_stats()") Reported-by: syzbot <[email protected]> Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>