Linus Torvalds [Wed, 4 Jan 2017 17:03:37 +0000 (09:03 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"A set of fixes for the current series, one fixing a regression with
block size < page cache size in the alias series from Jan. Outside of
that, two small cleanups for wbt from Bart, a nvme pull request from
Christoph, and a few small fixes of documentation updates"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix up io_poll documentation
block: Avoid that sparse complains about context imbalance in __wbt_wait()
block: Make wbt_wait() definition consistent with declaration
clean_bdev_aliases: Prevent cleaning blocks that are not in block range
genhd: remove dead and duplicated scsi code
block: add back plugging in __blkdev_direct_IO
nvmet/fcloop: remove some logically dead code performing redundant ret checks
nvmet: fix KATO offset in Set Features
nvme/fc: simplify error handling of nvme_fc_create_hw_io_queues
nvme/fc: correct some printk information
nvme/scsi: Remove START STOP emulation
nvme/pci: Delete misleading queue-wrap comment
nvme/pci: Fix whitespace problem
nvme: simplify stripe quirk
nvme: update maintainers information
Linus Torvalds [Wed, 4 Jan 2017 17:00:57 +0000 (09:00 -0800)]
Merge tag 'fbdev-v4.10-rc2' of git://github.com/bzolnier/linux
Pull fbdev fixes from Bartlomiej Zolnierkiewicz:
- bring fbdev subsystem back into Maintained mode
- add missing devm_ioremap() error checking to cobalt_lcdfb driver
* tag 'fbdev-v4.10-rc2' of git://github.com/bzolnier/linux:
video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
MAINTAINERS: add myself as maintainer of fbdev
Linus Torvalds [Wed, 4 Jan 2017 16:56:05 +0000 (08:56 -0800)]
Merge tag 'gcc-plugins-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc-plugins fixes from Kees Cook:
"Small fixes for gcc-plugins when using certain gcc versions:
- update gcc-common.h for gcc 7 (Emese Revfy)
- fix latent_entropy type for early gcc on ARM (PaX Team)"
* tag 'gcc-plugins-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: update gcc-common.h for gcc-7
latent_entropy: fix ARM build error on earlier gcc
Arvind Yadav [Tue, 13 Dec 2016 08:20:52 +0000 (13:50 +0530)]
video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.
xfs: fix crash and data corruption due to removal of busy COW extents
There is a race window between write_cache_pages calling
clear_page_dirty_for_io and XFS calling set_page_writeback, in which
the mapping for an inode is tagged neither as dirty, nor as writeback.
If the COW shrinker hits in exactly that window we'll remove the delayed
COW extents and writepages trying to write it back, which in release
kernels will manifest as corruption of the bmap btree, and in debug
kernels will trip the ASSERT about now calling xfs_bmapi_write with the
COWFORK flag for holes. A complex customer load manages to hit this
window fairly reliably, probably by always having COW writeback in flight
while the cow shrinker runs.
This patch adds another check for having the I_DIRTY_PAGES flag set,
which is still set during this race window. While this fixes the problem
I'm still not overly happy about the way the COW shrinker works as it
still seems a bit fragile.
Darrick J. Wong [Wed, 4 Jan 2017 02:39:33 +0000 (18:39 -0800)]
xfs: use the actual AG length when reserving blocks
We need to use the actual AG length when making per-AG reservations,
since we could otherwise end up reserving more blocks out of the last
AG than there are actual blocks.
Yotam Gigi [Tue, 3 Jan 2017 17:20:24 +0000 (19:20 +0200)]
net/sched: cls_matchall: Fix error path
Fix several error paths in matchall:
- Release reference to actions in case the hardware fails offloading
(relevant to skip_sw only)
- Fix error path in case tcf_exts initialization/validation fail
Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier") Signed-off-by: Yotam Gigi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Tue, 3 Jan 2017 21:23:01 +0000 (16:23 -0500)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2017-01-03
This series contains updates to ixgbe and ixgbevf only.
Emil fixes ixgbe to use the NVM settings for FEC, so do not override the
settings. Fixed the indirection table for x550, where newer devices can
support up to 64 RSS queues. Extends the rtnl_lock() to protect the call
to netif_device_detach() and ixgbe_clear_interrupt_scheme() to avoid
against a double free WARN and/or a BUG in free_msi_irqs(). Fixed AER
error handling by making sure that the driver frees the IRQs in
ixgbe_io_error_detected() when responding to a PCIe AER error, and to
restore them when the interface recovers.
Tony updates the driver to report the driver version to the firmware using
the host interface command for x550 devices. Fixed the PHY reset check
for x550em_ext_t PHY type. Fixed bounds checking for x540 devices to
ensure the index is valid for the LED function. Fixed the BaseT adapters
which support 100Mb capability and were not reporting the capability.
Ken Cox adds a missing check for the trusted bit before trying to set the
MACVLAN MAC address.
Yusuke Suzuki fixes an issue with 82599 and x540 devices where receive
timestamps were not working becase the bitwise operation for RX_HWSTAMP
falg was incorrect.
Don ensures that x553 KR/KX devices correctly advertise link speeds.
Adds the mailbox message to allow for VF promiscuous mode support.
Mark fixes two issues with EEPROM access, where the semaphore was not
being held until the entire response was read and the acquiring/releasing
of the semaphore is slow. Cleaned up firmware version method and
functions which are no longer used. Added new interfaces for firmware
commands to access some new PHYs.
v2: fixed tab indentation in patch 12 and mis-spelled words in patch 15
based on feedback from Sergei Shtylyov and Rami Rosen.
====================
Don Skidmore [Fri, 16 Dec 2016 02:18:32 +0000 (21:18 -0500)]
ixgbe: Add PF support for VF promiscuous mode
This patch extends the xcast mailbox message to include support for
unicast promiscuous mode. To allow a VF to enter this mode the PF
must be in promiscuous mode.
A later patch will add the support needed in the VF driver (ixgbevf)
Mark Rustad [Wed, 14 Dec 2016 19:02:00 +0000 (11:02 -0800)]
ixgbe: Fix issues with EEPROM access
There are two problems with EEPROM access. One is that it needs to
hold the semaphore until the entire response is read or else the
response can be corrupted by other firmware accesses. The second
problem is that acquiring and releasing the semaphore is slow, so
it should be taken and released once when multiple EEPROM accesses
will be done.
Both of these issues can be solved by adding a new function,
ixgbe_hic_unlocked, to issue firmware commands that will assume
that the caller has acquired the needed semaphore.
Don Skidmore [Wed, 14 Dec 2016 01:34:51 +0000 (20:34 -0500)]
ixgbe: Configure advertised speeds correctly for KR/KX backplane
This patch ensures that the advertised link speeds are configured
for X553 KR/KX backplane. Without this patch the link remains at
1G when resuming from low power after being downshifted by LPLU.
Yusuke Suzuki [Mon, 21 Nov 2016 06:48:45 +0000 (06:48 +0000)]
ixgbe: Fix incorrect bitwise operations of PTP Rx timestamp flags
Rx timestamp does not work on 82599 and X540 because bitwise operation
of RX_HWTSTAMP flags is incorrect and ixgbe_ptp_rx_hwtstamp() is never
called. This patch fixes it to enable Rx timestamp on 82599 and X540.
Without this fix:
ptp4l[278.730]: selected /dev/ptp8 as PTP clock
ptp4l[278.733]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[278.733]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[278.834]: port 1: received SYNC without timestamp
ptp4l[278.835]: port 1: new foreign master 1c3947.fffe.60f9cc-1
ptp4l[279.834]: port 1: received SYNC without timestamp
ptp4l[280.834]: port 1: received SYNC without timestamp
ptp4l[281.834]: port 1: received SYNC without timestamp
ptp4l[282.834]: port 1: received SYNC without timestamp
ptp4l[282.835]: selected best master clock 1c3947.fffe.60f9cc
ptp4l[282.835]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[283.834]: port 1: received SYNC without timestamp
With this fix:
ptp4l[239.154]: selected /dev/ptp8 as PTP clock
ptp4l[239.157]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[239.157]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[240.989]: port 1: new foreign master 1c3947.fffe.60f9cc-1
ptp4l[244.989]: selected best master clock 1c3947.fffe.60f9cc
ptp4l[244.989]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[246.977]: master offset -899583339542096 s0 freq +0 path delay 16222
ptp4l[247.977]: master offset -899583339617265 s1 freq -75169 path delay 16177
ptp4l[248.977]: master offset -130 s2 freq -75299 path delay 16177
ptp4l[248.977]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[249.977]: master offset -9 s2 freq -75217 path delay 16177
ptp4l[250.977]: master offset 88 s2 freq -75123 path delay 16132
Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Yusuke Suzuki <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
Emil Tantilov [Wed, 16 Nov 2016 19:25:34 +0000 (11:25 -0800)]
ixgbevf: fix AER error handling
Make sure that we free the IRQs in ixgbevf_io_error_detected() when
responding to an PCIe AER error and also restore them when the
interface recovers from it.
Previously it was possible to trigger BUG_ON() check in free_msix_irqs()
in the case where we call ixgbevf_remove() after a failed recovery from
AER error because the interrupts were not freed.
Also moved the down and free functions into ixgbevf_close_suspend()
same as with ixgbe.
Emil Tantilov [Wed, 16 Nov 2016 17:48:02 +0000 (09:48 -0800)]
ixgbe: fix AER error handling
Make sure that we free the IRQs in ixgbe_io_error_detected() when
responding to an PCIe AER error and also restore them when the
interface recovers from it.
Previously it was possible to trigger BUG_ON() check in free_msix_irqs()
in the case where we call ixgbe_remove() after a failed recovery from
AER error because the interrupts were not freed.
Ken Cox [Tue, 15 Nov 2016 19:00:37 +0000 (13:00 -0600)]
ixgbe: test for trust in macvlan adjustments for VF
There are two methods for setting mac addresses in a Macvlan, that
differentiate themselves in the function macvlan_set_mac_Address.
If the macvlan mode is passthru, then we use the dev_set_mac_address
method, otherwise we use the dev_uc api via macvlan_sync_addresses.
The latter method (which would stem from using any non-passthru mode,
like bridge, or vepa), calls down into the driver in a path that terminates
in ixgbevf_set_uc_addr_vf, which sends a IXGBE_VF_SET_MACVLAN message,
which causes the pf to spawn the noted error message. This occurs because
it appears that the guest is trying to delete the mac address of the macvlan
before adding another.
The other path in macvlan_set_mac_address uses dev_set_mac_address, which
calls into ixgbevf_set_mac which uses the IXGBE_VF_SET_MAC_ADDR to the
pf to set the macvlan mac address.
The discrepancy here is in the handlers. The handler function for
IXGBE_VF_SET_MAC_ADDR (ixgbe_set_vf_mac_addr) has a check for
the vfinfo[].trusted bit to allow the operation if the vf is trusted.
In comparison, the IXGBE_VF_SET_MACVLAN message handler
(ixgbe_set_vf_macvlan_msg) has no such check of the trusted bit.
Emil Tantilov [Fri, 11 Nov 2016 18:12:51 +0000 (10:12 -0800)]
ixgbevf: handle race between close and suspend on shutdown
When an interface is part of a namespace it is possible that
ixgbevf_close() may be called while ixgbevf_suspend() is running
which ends up in a double free WARN and/or a BUG in free_msi_irqs()
To handle this situation we extend the rtnl_lock() to protect the
call to netif_device_detach() and check for !netif_device_present()
to avoid entering close while in suspend.
Also added rtnl locks to ixgbevf_queue_reset_subtask().
Emil Tantilov [Fri, 11 Nov 2016 18:07:47 +0000 (10:07 -0800)]
ixgbe: handle close/suspend race with netif_device_detach/present
When an interface is part of a namespace it is possible that
ixgbe_close() may be called while __ixgbe_shutdown() is running
which ends up in a double free WARN and/or a BUG in free_msi_irqs().
To handle this situation we extend the rtnl_lock() to protect the
call to netif_device_detach() and ixgbe_clear_interrupt_scheme()
in __ixgbe_shutdown() and check for netif_device_present()
to avoid clearing the interrupts second time in ixgbe_close();
Also extend the rtnl lock in ixgbe_resume() to netif_device_attach().
Tony Nguyen [Fri, 11 Nov 2016 00:00:33 +0000 (16:00 -0800)]
ixgbe: Fix reporting of 100Mb capability
BaseT adapters that are capable of supporting 100Mb are not reporting this
capability. This patch corrects the reporting so that 100Mb is shown as
supported on those adapters.
Tony Nguyen [Thu, 10 Nov 2016 17:57:29 +0000 (09:57 -0800)]
ixgbe: Reduce I2C retry count on X550 devices
A retry count of 10 is likely to run into problems on X550 devices that
have to detect and reset unresponsive CS4227 devices. So, reduce the I2C
retry count to 3 for X550 and above. This should avoid any possible
regressions in existing devices.
Tony Nguyen [Wed, 9 Nov 2016 18:48:48 +0000 (10:48 -0800)]
ixgbe: Add bounds check for x540 LED functions
This is an extension of commit 003287e0f087 ("ixgbevf: Correct parameter
sent to LED function"); add bounds checking to x540 functions to ensure the
index is valid.
Tony Nguyen [Mon, 31 Oct 2016 19:11:58 +0000 (12:11 -0700)]
ixgbe: Fix check for ixgbe_phy_x550em_ext_t reset
The generic PHY reset check we had previously is not sufficient for the
ixgbe_phy_x550em_ext_t PHY type. Check 1.CC02.0 instead - same as
ixgbe_init_ext_t_x550().
Tony Nguyen [Wed, 26 Oct 2016 23:25:18 +0000 (16:25 -0700)]
ixgbe: Report driver version to firmware for x550 devices
Some x550 devices require the driver version reported to its firmware; this
patch sends the driver version string to the firmware through the host
interface command for x550 devices.
Linus Torvalds [Tue, 3 Jan 2017 18:50:05 +0000 (10:50 -0800)]
Merge branch 'parisc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
- limit usage of processor-internal cr16 clocksource to UP systems only
- segfault info lines in syslog were too long, split those up
- drop own TIF_RESTORE_SIGMASK flag and switch to generic code
* 'parisc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Add line-break when printing segfault info
parisc: Drop TIF_RESTORE_SIGMASK and switch to generic code
parisc: Mark cr16 clocksource unstable on SMP systems
We fix a very real starvation problem that may occur when a link
encounters send buffer congestion. At the same time we make the
interaction between the socket and link layer simpler and more
consistent.
====================
Jon Paul Maloy [Tue, 3 Jan 2017 15:55:11 +0000 (10:55 -0500)]
tipc: reduce risk of user starvation during link congestion
The socket code currently handles link congestion by either blocking
and trying to send again when the congestion has abated, or just
returning to the user with -EAGAIN and let him re-try later.
This mechanism is prone to starvation, because the wakeup algorithm is
non-atomic. During the time the link issues a wakeup signal, until the
socket wakes up and re-attempts sending, other senders may have come
in between and occupied the free buffer space in the link. This in turn
may lead to a socket having to make many send attempts before it is
successful. In extremely loaded systems we have observed latency times
of several seconds before a low-priority socket is able to send out a
message.
In this commit, we simplify this mechanism and reduce the risk of the
described scenario happening. When a message is attempted sent via a
congested link, we now let it be added to the link's backlog queue
anyway, thus permitting an oversubscription of one message per source
socket. We still create a wakeup item and return an error code, hence
instructing the sender to block or stop sending. Only when enough space
has been freed up in the link's backlog queue do we issue a wakeup event
that allows the sender to continue with the next message, if any.
The fact that a socket now can consider a message sent even when the
link returns a congestion code means that the sending socket code can
be simplified. Also, since this is a good opportunity to get rid of the
obsolete 'mtu change' condition in the three socket send functions, we
now choose to refactor those functions completely.
Jon Paul Maloy [Tue, 3 Jan 2017 15:55:10 +0000 (10:55 -0500)]
tipc: modify struct tipc_plist to be more versatile
During multicast reception we currently use a simple linked list with
push/pop semantics to store port numbers.
We now see a need for a more generic list for storing values of type
u32. We therefore make some modifications to this list, while replacing
the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new
functions which will come to use in the next commits.
Jon Paul Maloy [Tue, 3 Jan 2017 15:55:09 +0000 (10:55 -0500)]
tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() functions
The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very
similar. The latter function is also called from two locations, and
there will be more in the coming commits, which will all need to test on
different conditions.
Instead of making yet another duplicates of the function, we now
introduce a new macro tipc_wait_for_cond() where the wakeup condition
can be stated as an argument to the call. This macro replaces all
current and future uses of the two functions, which can now be
eliminated.
David S. Miller [Tue, 3 Jan 2017 16:00:28 +0000 (11:00 -0500)]
Merge branch 'TPACKET_V3-TX_RING-support'
Sowmini Varadhan says:
====================
TPACKET_V3 TX_RING support
This patch series allows an application to use a single PF_PACKET
descriptor and leverage the best implementations of TX_RING
and RX_RING that exist today.
Patch 1 adds the kernel/Documentation changes for TX_RING
support and patch2 adds the associated test case in selftests.
Changes since v2: additional sanity checks for setsockopt
input for TX_RING/TPACKET_V3. Refactored psock_tpacket.c
test code to avoid code duplication from V2.
====================
Although TPACKET_V3 Rx has some benefits over TPACKET_V2 Rx, *_v3
does not currently have TX_RING support. As a result an application
that wants the best perf for Tx and Rx (e.g. to handle request/response
transacations) ends up needing 2 sockets, one with *_v2 for Tx and
another with *_v3 for Rx.
This patch enables TPACKET_V2 compatible Tx features in TPACKET_V3
so that an application can use a single descriptor to get the benefits
of _v3 RX_RING and _v2 TX_RING. An application may do a block-send by
first filling up multiple frames in the Tx ring and then triggering a
transmit. This patch only support fixed size Tx frames for TPACKET_V3,
and requires that tp_next_offset must be zero.
Sabrina Dubroca [Tue, 3 Jan 2017 15:26:04 +0000 (16:26 +0100)]
benet: stricter vxlan offloading check in be_features_check
When VXLAN offloading is enabled, be_features_check() tries to check if
an encapsulated packet is indeed a VXLAN packet. The check is not strict
enough, and considers any UDP-encapsulated ethernet frame with a 8-byte
tunnel header as being VXLAN. Unfortunately, both GENEVE and VXLAN-GPE
have a 8-byte header, so they get through this check.
Force the UDP destination port to be the one that has been offloaded to
hardware.
Without this, GENEVE-encapsulated packets can end up having an incorrect
checksum when both a GENEVE and a VXLAN (offloaded) tunnel are
configured.
This is similar to commit a547224dceed ("mlx4e: Do not attempt to
offload VXLAN ports that are unrecognized").
ipmr, ip6mr: add RTNH_F_UNRESOLVED flag to unresolved cache entries
While working with ipmr, we noticed that it is impossible to determine
if an entry is actually unresolved or its IIF interface has disappeared
(e.g. virtual interface got deleted). These entries look almost
identical to user-space when dumping or receiving notifications. So in
order to recognize them add a new RTNH_F_UNRESOLVED flag which is set when
sending an unresolved cache entry to user-space.
Some devices, such as the mv88e6097 do have ADDR[0] external and so it
is possible to configure the device to use SMI address 0x1. Remove the
restriction, as there are boards using this address.
David S. Miller [Tue, 3 Jan 2017 14:45:44 +0000 (09:45 -0500)]
Merge branch 'for_4.11/net-next/rds_v3' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux
Santosh Shilimkar says:
====================
net: RDS updates
v2->v3:
- Re-based against latest net-next head.
- Dropped a user visible change after discussing with David Miller.
It needs some more work to fully support old/new tools matrix.
- Addressed Dave's comment about bool usage in patch
"RDS: IB: track and log active side..."
v1->v2:
Re-aligned indentation in patch 'RDS: mark few internal functions..."
Series consist of:
- RDMA transport fixes for map failure, listen sequence, handler panic and
composite message notification.
- Couple of sparse fixes.
- Message logging improvements for bind failure, use once mr semantics
and connection remote address, active end point.
- Performance improvement for RDMA transport by reducing the post send
pressure on the queue and spreading the CQ vectors.
- Useful statistics for socket send/recv usage and receive cache usage.
- Additional RDS CMSG used by application to track the RDS message
stages for certain type of traffic to find out latency spots.
Can be enabled/disabled per socket.
====================
Alexander Duyck [Mon, 2 Jan 2017 21:32:54 +0000 (13:32 -0800)]
ipv4: Do not allow MAIN to be alias for new LOCAL w/ custom rules
In the case of custom rules being present we need to handle the case of the
LOCAL table being intialized after the new rule has been added. To address
that I am adding a new check so that we can make certain we don't use an
alias of MAIN for LOCAL when allocating a new table.
Fixes: 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse") Reported-by: Oliver Brunel <[email protected]> Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Bartosz Folta [Mon, 2 Jan 2017 12:41:50 +0000 (12:41 +0000)]
net: macb: Updated resource allocation function calls to new version of API.
Changed function calls of resource allocation to new API. Changed way
of setting DMA mask. Removed unnecessary sanity check.
This patch is sent in regard to recently applied patch
Commit 83a77e9ec4150ee4acc635638f7dedd9da523a26
net: macb: Added PCI wrapper for Platform Driver.
David S. Miller [Tue, 3 Jan 2017 14:33:00 +0000 (09:33 -0500)]
Merge branch 'dwmac-oxnas-leaks'
Johan Hovold says:
====================
net: stmmac: dwmac-oxnas: fix leaks and simplify pm
These patches fixes of-node and fixed-phydev leaks in the recently added
dwmac-oxnas driver, and ultimately switches over to using the generic pm
implementation as the required callbacks are now in place.
Note that this series has only been compile tested.
====================
Make sure to deregister and free any fixed-link phy registered during
probe on probe errors and on driver unbind by calling the new glue
helper function.
For driver unbind, use the generic stmmac-platform remove implementation
and add an exit callback to disable the clock.
Fixes: 5ed7414062e7 ("net: stmmac: Add OXNAS Glue Driver") Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Johan Hovold [Mon, 2 Jan 2017 11:56:02 +0000 (12:56 +0100)]
net: stmmac: dwmac-oxnas: fix of-node leak
Use the syscon lookup-by-phandle helper so that the reference taken by
of_parse_phandle() is released when done with the node.
Fixes: 5ed7414062e7 ("net: stmmac: Add OXNAS Glue Driver") Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Linus Torvalds [Tue, 3 Jan 2017 02:32:59 +0000 (18:32 -0800)]
Merge tag 'fscrypt-for-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt
Pull fscrypt fixes from Ted Ts'o:
"Two fscrypt bug fixes, one of which was unmasked by an update to the
crypto tree during the merge window"
* tag 'fscrypt-for-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
fscrypt: fix renaming and linking special files
fscrypt: fix the test_dummy_encryption mount option
RDS: add receive message trace used by application
Socket option to tap receive path latency in various stages
in nano seconds. It can be enabled on selective sockets using
using SO_RDS_MSG_RXPATH_LATENCY socket option. RDS will return
the data to application with RDS_CMSG_RXPATH_LATENCY in defined
format. Scope is left to add more trace points for future
without need of change in the interface.
Avinash Repaka [Mon, 29 Feb 2016 23:30:57 +0000 (15:30 -0800)]
RDS: make message size limit compliant with spec
RDS support max message size as 1M but the code doesn't check this
in all cases. Patch fixes it for RDMA & non-RDMA and RDS MR size
and its enforced irrespective of underlying transport.
RDS: IB: fix panic due to handlers running post teardown
Shutdown code reaping loop takes care of emptying the
CQ's before they being destroyed. And once tasklets are
killed, the hanlders are not expected to run.
But because of core tasklet code issues, tasklet handler could
still run even after tasklet_kill,
RDS IB shutdown code already reaps the CQs before freeing
cq/qp resources so as such the handlers have nothing left
to do post shutdown.
On other hand any handler running after teardown and trying
to access already freed qp/cq resources causes issues
Patch fixes this race by makes sure that handlers returns
without any action post teardown.
RDS: RDMA: Fix the composite message user notification
When application sends an RDS RDMA composite message consist of
RDMA transfer to be followed up by non RDMA payload, it expect to
be notified *only* when the full message gets delivered. RDS RDMA
notification doesn't behave this way though.
Thanks to Venkat for debug and root casuing the issue
where only first part of the message(RDMA) was
successfully delivered but remainder payload delivery failed.
In that case, application should not be notified with
a false positive of message delivery success.
Fix this case by making sure the user gets notified only after
the full message delivery.
In absence of extension headers, message log will keep
flooding the console. As such even without use_once we can
clean up the MRs so its not really an error case message
so make it debug message
RDS: IB: split the mr registration and invalidation path
MR invalidation in RDS is done in background thread and not in
data path like registration. So break the dependency between them
which helps to remove the performance bottleneck.
RDS: RDMA: return appropriate error on rdma map failures
The first message to a remote node should prompt a new
connection even if it is RDMA operation. For RDMA operation
the MR mapping can fail because connections is not yet up.
Since the connection establishment is asynchronous,
we make sure the map failure because of unavailable
connection reach to the user by appropriate error code.
Before returning to the user, lets trigger the connection
so that its ready for the next retry.
RDS: mark few internal functions static to make sparse build happy
Fixes below warnings:
warning: symbol 'rds_send_probe' was not declared. Should it be static?
warning: symbol 'rds_send_ping' was not declared. Should it be static?
warning: symbol 'rds_tcp_accept_one_path' was not declared. Should it be static?
warning: symbol 'rds_walk_conn_path_info' was not declared. Should it be static?
Philippe Reynes [Sun, 1 Jan 2017 19:49:26 +0000 (20:49 +0100)]
net: dlink: dl2k: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
The previous implementation of set_settings was modifying
the value of speed and duplex, but with the new API, it's not
possible. The structure ethtool_link_ksettings is defined
as const.
David S. Miller [Mon, 2 Jan 2017 20:51:21 +0000 (15:51 -0500)]
Merge branch 'mlx5-odp'
Saeed Mahameed says:
====================
Mellanox mlx5 core and ODP updates 2017-01-01
The following eleven patches mainly come from Artemy Kovalyov
who expanded mlx5 on-demand-paging (ODP) support. In addition
there are three cleanup patches which don't change any functionality,
but are needed to align codebase prior accepting other patches.
Memory region (MR) in IB can be huge and ODP (on-demand paging)
technique allows to use unpinned memory, which can be consumed and
released on demand. This allows to applications do not pin down
the underlying physical pages of the address space, and save from them
need to track the validity of the mappings.
Rather, the HCA requests the latest translations from the OS when pages
are not present, and the OS invalidates translations which are no longer
valid due to either non-present pages or mapping changes.
In existing ODP implementation applications is needed to register
memory buffers for communication, though registered memory regions
need not have valid mappings at registration time.
This patch set performs the following steps to expand
current ODP implementation:
1. It refactors UMR to support large regions, by introducing generic
function to perform HCA translation table modifications. This
function supports both atomic and process contexts and is not limited
by number of modified entries.
This function allows to enable reallocated memory regions of
arbitrary size, so adding MR cache buckets to support up to 16GB MRs.
2. It changes page fault event format and refactor page faults logic
together with addition of atomic support.
3. It prepares mlx5 core code to support implicit registration with
simplified and relaxed semantics.
Implicit ODP semantics allows to applications provide special memory
key that represents their complete address space. Thus all IO accesses
referencing to this key (with proper access rights associated with the key)
wouldn't need not register any virtual address range.
Thanks,
Artemy, Ilya and Leon
v1->v2:
- Don't use 'inline' in .c files
====================
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:46 +0000 (11:37 +0200)]
{net,IB}/mlx5: Refactor page fault handling
* Update page fault event according to last specification.
* Separate code path for page fault EQ, completion EQ and async EQ.
* Move page fault handling work queue from mlx5_ib static variable
into mlx5_core page fault EQ.
* Allocate memory to store ODP event dynamically as the
events arrive, since in atomic context - use mempool.
* Make mlx5_ib page fault handler run in process context.
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:45 +0000 (11:37 +0200)]
net/mlx5: Update PAGE_FAULT_RESUME layout
Update PAGE_FAULT_RESUME command layout.
Three bit fields describing page fault: rdma, rdma_write, req_res gave 8
possible combinations, while only a few were legal. Now they
are interpreted as three-bit type field, where former legal
combinations turns into corresponding types and unused were added as new
types.
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:44 +0000 (11:37 +0200)]
IB/mlx5: Add MR cache for large UMR regions
In this change we turn mlx5_ib_update_mtt() into generic
mlx5_ib_update_xlt() to perfrom HCA translation table modifiactions
supporting both atomic and process contexts and not limited by number
of modified entries.
Using this function we increase preallocated MRs up to 16GB.
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:42 +0000 (11:37 +0200)]
IB/mlx5: Refactor UMR post send format
* Update struct mlx5_wqe_umr_ctrl_seg.
* Currenlty UMR send_flags aim only certain use cases: enabled/disable
cached MR, modifying XLT for ODP. By making flags independent make UMR
more flexible allowing arbitrary manipulations.
* Since different UMR formats have different entry sizes UMR request
should receive exact size of translation table update instead of
number of entries. Rename field npages to xlt_size in struct mlx5_umr_wr
and update relevant code accordingly.
* Add support of length64 bit.
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:41 +0000 (11:37 +0200)]
net/mlx5: Support new MR features
This patch adds the following items to IFC file.
1. MLX5_MKC_ACCESS_MODE_KSM enum value for creating KSM memory keys.
KSM access mode used when indirect MKey associated with fixed memory
size entries.
2. null_mkey field that is used to indicate non-present KLM/KSM
entries, where it causes the device to generate page fault event
when trying to access it.
3. struct mlx5_ifc_cmd_hca_cap_bits capability bits indicating
related value/field is supported:
* fixed_buffer_size - MLX5_MKC_ACCESS_MODE_KSM
* umr_extended_translation_offset - translation_offset_42_16
in UMR ctrl segment
* null_mkey - null_mkey in QUERY_SPECIAL_CONTEXTS
Binoy Jayan [Mon, 2 Jan 2017 09:37:40 +0000 (11:37 +0200)]
IB/mlx5: Add helper mlx5_ib_post_send_wait
Clean up the following common code (to post a list of work requests to the
send queue of the specified QP) at various places and add a helper function
'mlx5_ib_post_send_wait' to implement the same.
- Initialize 'mlx5_ib_umr_context' on stack
- Assign "mlx5_umr_wr:wr:wr_cqe to umr_context.cqe
- Acquire the semaphore
- call ib_post_send with a single ib_send_wr
- wait_for_completion()
- Check for umr_context.status
- Release the semaphore