Paul Durrant [Fri, 28 Mar 2014 11:39:07 +0000 (11:39 +0000)]
xen-netback: BUG_ON in xenvif_rx_action() not catching overflow
The BUG_ON to catch ring overflow in xenvif_rx_action() makes the assumption
that meta_slots_used == ring slots used. This is not necessarily the case
for GSO packets, because the non-prefix GSO protocol consumes one more ring
slot than meta-slot for the 'extra_info'. This patch changes the test to
actually check ring slots.
Paul Durrant [Fri, 28 Mar 2014 11:39:06 +0000 (11:39 +0000)]
xen-netback: worse-case estimate in xenvif_rx_action is underestimating
The worse-case estimate for skb ring slot usage in xenvif_rx_action()
fails to take fragment page_offset into account. The page_offset does,
however, affect the number of times the fragmentation code calls
start_new_rx_buffer() (i.e. consume another slot) and the worse-case
should assume that will always return true. This patch adds the page_offset
into the DIV_ROUND_UP for each frag.
Unfortunately some frontends aggressively limit the number of requests
they post into the shared ring so to avoid an estimate that is 'too'
pessimal it is capped at MAX_SKB_FRAGS.
Paul Durrant [Fri, 28 Mar 2014 11:39:05 +0000 (11:39 +0000)]
xen-netback: remove pointless clause from if statement
This patch removes a test in start_new_rx_buffer() that checks whether
a copy operation is less than MAX_BUFFER_OFFSET in length, since
MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.
Wang Yufen [Fri, 28 Mar 2014 04:07:03 +0000 (12:07 +0800)]
ipv6: fix checkpatch errors of brace and trailing statements
ERROR: open brace '{' following enum go on the same line
ERROR: open brace '{' following struct go on the same line
ERROR: trailing statements should be on next line
Wang Yufen [Fri, 28 Mar 2014 04:07:02 +0000 (12:07 +0800)]
ipv6: fix checkpatch errors comments and space
WARNING: please, no space before tabs
WARNING: please, no spaces at the start of a line
ERROR: spaces required around that ':' (ctx:VxW)
ERROR: spaces required around that '>' (ctx:VxV)
ERROR: spaces required around that '>=' (ctx:VxV)
Linus Torvalds [Sat, 29 Mar 2014 22:01:09 +0000 (15:01 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"A late breaking fix from John. (The bug fixed has a hard lockup
potential, but that was not observed, warnings were)"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Revert to calling clock_was_set_delayed() while in irq context
David S. Miller [Sat, 29 Mar 2014 22:00:37 +0000 (18:00 -0400)]
Merge branch 'netpoll-next'
Eric W. Biederman says:
====================
netpoll: Cleanups and fixes
This should be a small set of safe cleanups and fixes to netpoll.
The fixes are vlan headers are now always inserted when needed, and
napi polling is always avoided when network devices are closed.
There are a bunch of little cleanups removing unnecessary code, fixing
function naming, not taking unnecessary locks and removing general
silliness.
====================
Stop taking the transmit lock when a network device has specified
NETIF_F_LLTX.
If no locks needed to trasnmit a packet this is the ideal scenario for
netpoll as all packets can be trasnmitted immediately.
Even if some locks are needed in ndo_start_xmit skipping any unnecessary
serialization is desirable for netpoll as it makes it more likely a
debugging packet may be trasnmitted immediately instead of being
deferred until later.
netpoll: Remove strong unnecessary assumptions about skbs
Remove the assumption that the skbs that make it to
netpoll_send_skb_on_dev are allocated with find_skb, such that
skb->users == 1 and nothing is attached that would prevent the skbs from
being freed from hard irq context.
Remove this assumption by replacing __kfree_skb on error paths with
dev_kfree_skb_irq (in hard irq context) and kfree_skb (in process
context).
netpoll: Move rx enable/disable into __dev_close_many
Today netpoll_rx_enable and netpoll_rx_disable are called from
dev_close and and __dev_close, and not from dev_close_many.
Move the calls into __dev_close_many so that we have a single call
site to maintain, and so that dev_close_many gains this protection as
well. Which importantly makes batched network device deletes safe.
netpoll: Only call ndo_start_xmit from a single place
Factor out the code that needs to surround ndo_start_xmit
from netpoll_send_skb_on_dev into netpoll_start_xmit.
It is an unfortunate fact that as the netpoll code has been maintained
the primary call site ndo_start_xmit learned how to handle vlans
and timestamps but the second call of ndo_start_xmit in queue_process
did not.
With the introduction of netpoll_start_xmit this associated logic now
happens at both call sites of ndo_start_xmit and should make it easy
for that to continue into the future.
netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup()
slave_enable_netpoll() and __netpoll_setup() may be called
with read_lock() held, so should use GFP_ATOMIC to allocate
memory. Eric suggested to pass gfp flags to __netpoll_setup().
bonding: don't call slave_xxx_netpoll under spinlocks
The slave_xxx_netpoll will call synchronize_rcu_bh(),
so the function may schedule and sleep, it should't be
called under spinlocks.
bond_netpoll_setup() and bond_netpoll_cleanup() are always
protected by rtnl lock, it is no need to take the read lock,
as the slave list couldn't be changed outside rtnl lock.
Signed-off-by: Ding Tianhong <[email protected]> Cc: Jay Vosburgh <[email protected]> Cc: Andy Gospodarek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Nothing else that calls __netpoll_setup or ndo_netpoll_setup
requires a gfp paramter, so remove the gfp parameter from both
of these functions making the code clearer.
Dmitry Torokhov [Thu, 6 Mar 2014 20:57:24 +0000 (12:57 -0800)]
Input: mousedev - fix race when creating mixed device
We should not be using static variable mousedev_mix in methods that can be
called before that singleton gets assigned. While at it let's add open and
close methods to mousedev structure so that we do not need to test if we
are dealing with multiplexor or normal device and simply call appropriate
method directly.
This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=71551
Thomas Petazzoni [Thu, 27 Mar 2014 10:39:29 +0000 (11:39 +0100)]
net: mvneta: use devm_ioremap_resource() instead of of_iomap()
The mvneta driver currently uses of_iomap(), which has two drawbacks:
it doesn't request the resource, and it isn't devm-style so some error
handling is needed.
This commit switches to use devm_ioremap_resource() instead, which
automatically requests the resource (so the I/O registers region shows
up properly in /proc/iomem), and also is devm-style, which allows to
get rid of some error handling to unmap the I/O registers region.
Input: don't modify the id of ioctl-provided ff effect on upload failure
If a new (id == -1) ff effect was uploaded from userspace,
ff-core.c::input_ff_upload() will have assigned a positive number to the
new effect id. Currently, evdev.c::evdev_do_ioctl() will save this new id
to userspace, regardless of whether the upload succeeded or not.
On upload failure, this can be confusing because the dev->ff->effects[]
array will not contain an element at the index of that new effect id.
This patch fixes this by leaving the id unchanged after upload fails.
Note: Unfortunately applications should still expect changed effect id for
quite some time.
This has been discussed on:
http://www.mail-archive.com/[email protected]/msg08513.html
("ff-core effect id handling in case of a failed effect upload")
Alex Elder [Tue, 25 Mar 2014 13:36:02 +0000 (15:36 +0200)]
rbd: drop an unsafe assertion
Olivier Bonvalet reported having repeated crashes due to a failed
assertion he was hitting in rbd_img_obj_callback():
Assertion failure in rbd_img_obj_callback() at line 2165:
rbd_assert(which >= img_request->next_completion);
With a lot of help from Olivier with reproducing the problem
we were able to determine the object and image requests had
already been completed (and often freed) at the point the
assertion failed.
There was a great deal of discussion on the ceph-devel mailing list
about this. The problem only arose when there were two (or more)
object requests in an image request, and the problem was always
seen when the second request was being completed.
The problem is due to a race in the window between setting the
"done" flag on an object request and checking the image request's
next completion value. When the first object request completes, it
checks to see if its successor request is marked "done", and if
so, that request is also completed. In the process, the image
request's next_completion value is updated to reflect that both
the first and second requests are completed. By the time the
second request is able to check the next_completion value, it
has been set to a value *greater* than its own "which" value,
which caused an assertion to fail.
Fix this problem by skipping over any completion processing
unless the completing object request is the next one expected.
Test only for inequality (not >=), and eliminate the bad
assertion.
Jianyu Zhan [Fri, 28 Mar 2014 12:55:21 +0000 (20:55 +0800)]
percpu: renew the max_contig if we merge the head and previous block
During pcpu_alloc_area(), we might merge the current head with the
previous block. Since we have calculated the max_contig using the
size of previous block before we skip it, and now we update the size
of previous block, so we should renew the max_contig.
Axel Lin [Fri, 28 Mar 2014 15:37:54 +0000 (23:37 +0800)]
spi: mpc52xx: Convert to use bits_per_word_mask
This controller only supports 8-bit word length.
Set bits_per_word_mask so spi core will reject transfers that attempt to use
an unsupported bits_per_word value.
Also remove the duplicate code to test spi->mode, it is done by spi core.
Paul Mackerras [Mon, 24 Mar 2014 23:47:08 +0000 (10:47 +1100)]
KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
Currently we save the host PMU configuration, counter values, etc.,
when entering a guest, and restore it on return from the guest.
(We have to do this because the guest has control of the PMU while
it is executing.) However, we missed saving/restoring the SIAR and
SDAR registers, as well as the registers which are new on POWER8,
namely SIER and MMCR2.
This adds code to save the values of these registers when entering
the guest and restore them on exit. This also works around the bug
in POWER8 where setting PMAE with a counter already negative doesn't
generate an interrupt.
Commit c7699822bc21 ("KVM: PPC: Book3S HV: Make physical thread 0 do
the MMU switching") reordered the guest entry/exit code so that most
of the guest register save/restore code happened in guest MMU context.
A side effect of that is that the timebase still contains the guest
timebase value at the point where we compute and use vcpu->arch.dec_expires,
and therefore that is now a guest timebase value rather than a host
timebase value. That in turn means that the timeouts computed in
kvmppc_set_timer() are wrong if the timebase offset for the guest is
non-zero. The consequence of that is things such as "sleep 1" in a
guest after migration may sleep for much longer than they should.
This fixes the problem by converting between guest and host timebase
values as necessary, by adding or subtracting the timebase offset.
This also fixes an incorrect comment.
Paul Mackerras [Mon, 24 Mar 2014 23:47:06 +0000 (10:47 +1100)]
KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
With HV KVM, some high-frequency hypercalls such as H_ENTER are handled
in real mode, and need to access the memslots array for the guest.
Accessing the memslots array is safe, because we hold the SRCU read
lock for the whole time that a guest vcpu is running. However, the
checks that kvm_memslots() does when lockdep is enabled are potentially
unsafe in real mode, when only the linear mapping is available.
Furthermore, kvm_memslots() can be called from a secondary CPU thread,
which is an offline CPU from the point of view of the host kernel,
and is not running the task which holds the SRCU read lock.
To avoid false positives in the checks in kvm_memslots(), and to avoid
possible side effects from doing the checks in real mode, this replaces
kvm_memslots() with kvm_memslots_raw() in all the places that execute
in real mode. kvm_memslots_raw() is a new function that is like
kvm_memslots() but uses rcu_dereference_raw_notrace() instead of
kvm_dereference_check().
Paul Mackerras [Mon, 24 Mar 2014 23:47:05 +0000 (10:47 +1100)]
KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
If an attempt is made to load the kvm-hv module on a machine which
doesn't have hypervisor mode available, return an ENODEV error,
which is the conventional thing to return to indicate that this
module is not applicable to the hardware of the current machine,
rather than EIO, which causes a warning to be printed.
Paul Mackerras [Mon, 24 Mar 2014 23:47:04 +0000 (10:47 +1100)]
KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
The in-kernel emulation of RTAS functions needs to read the argument
buffer from guest memory in order to find out what function is being
requested. The guest supplies the guest physical address of the buffer,
and on a real system the code that reads that buffer would run in guest
real mode. In guest real mode, the processor ignores the top 4 bits
of the address specified in load and store instructions. In order to
emulate that behaviour correctly, we need to mask off those bits
before calling kvm_read_guest() or kvm_write_guest(). This adds that
masking.
Michael Neuling [Mon, 24 Mar 2014 23:47:02 +0000 (10:47 +1100)]
KVM: PPC: Book3S HV: Add transactional memory support
This adds saving of the transactional memory (TM) checkpointed state
on guest entry and exit. We only do this if we see that the guest has
an active transaction.
It also adds emulation of the TM state changes when delivering IRQs
into the guest. According to the architecture, if we are
transactional when an IRQ occurs, the TM state is changed to
suspended, otherwise it's left unchanged.
Jiri Kosina [Sat, 29 Mar 2014 01:40:42 +0000 (18:40 -0700)]
HID: hyperv: fix _raw_request() prototype
The 3rd argument is pointer to the buffer, not a single __u8.
This has no bad sideeffect, as the stub is not using any of its
argument, but better have it correct.
Kees Cook [Sat, 15 Mar 2014 20:11:18 +0000 (13:11 -0700)]
[IA64] Keep format strings from leaking into printk
The buffer being sent to printk has already had format strings
resolved. The string should not be reinterpreted again to avoid any
unintended format strings from leaking into printk.
1) We've discovered a common error in several networking drivers, they
put VLAN offload features into ->vlan_features, which would suggest
that they support offloading 2 or more levels of VLAN encapsulation.
Not only do these devices not do that, but we don't have the
infrastructure yet to handle that at all.
Fixes from Vlad Yasevich.
2) Fix tcpdump crash with bridging and vlans, also from Vlad.
3) Some MAINTAINERS updates for random32 and bonding.
4) Fix late reseeds of prandom generator, from Sasha Levin.
6) Fix deadlock in openvswitch, from Flavio Leitner.
7) get_timewait4_sock() doesn't report delay times correctly, fix from
Eric Dumazet.
8) Duplicate address detection and addrconf verification need to run in
contexts where RTNL can be obtained. Move them to run from a
workqueue. From Hannes Frederic Sowa.
9) Fix route refcount leaking in ip tunnels, from Pravin B Shelar.
10) Don't return -EINTR from non-blocking recvmsg() on AF_UNIX sockets,
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
vlan: Warn the user if lowerdev has bad vlan features.
veth: Turn off vlan rx acceleration in vlan_features
ifb: Remove vlan acceleration from vlan_features
qlge: Do not propaged vlan tag offloads to vlans
bridge: Fix crash with vlan filtering and tcpdump
net: Account for all vlan headers in skb_mac_gso_segment
MAINTAINERS: bonding: change email address
MAINTAINERS: bonding: change email address
ipv6: move DAD and addrconf_verify processing to workqueue
tcp: fix get_timewait4_sock() delay computation on 64bit
openvswitch: fix a possible deadlock and lockdep warning
bridge: Fix handling stacked vlan tags
bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled
vhost: validate vhost_get_vq_desc return value
vhost: fix total length when packets are too short
random32: avoid attempt to late reseed if in the middle of seeding
random32: assign to network folks in MAINTAINERS
net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset
core, nfqueue, openvswitch: Orphan frags in skb_zerocopy and handle errors
vlan: Set hard_header_len according to available acceleration
...
Vlad Yasevich [Fri, 28 Mar 2014 02:14:49 +0000 (22:14 -0400)]
vlan: Warn the user if lowerdev has bad vlan features.
Some drivers incorrectly assign vlan acceleration features to
vlan_features thus causing issues for Q-in-Q vlan configurations.
Warn the user of such cases.
Vlad Yasevich [Fri, 28 Mar 2014 02:14:46 +0000 (22:14 -0400)]
qlge: Do not propaged vlan tag offloads to vlans
qlge driver turns off NETIF_F_HW_CTAG_FILTER, but forgets to
turn off HW_CTAG_TX and HW_CTAG_RX on vlan devices. With the
current settings, q-in-q will only generate a single vlan header.
Remember to mask off CTAG_TX and CTAG_RX features in vlan_features.
Vlad Yasevich [Fri, 28 Mar 2014 01:51:18 +0000 (21:51 -0400)]
bridge: Fix crash with vlan filtering and tcpdump
When the vlan filtering is enabled on the bridge, but
the filter is not configured on the bridge device itself,
running tcpdump on the bridge device will result in a
an Oops with NULL pointer dereference. The reason
is that br_pass_frame_up() will bypass the vlan
check because promisc flag is set. It will then try
to get the table pointer and process the packet based
on the table. Since the table pointer is NULL, we oops.
Catch this special condition in br_handle_vlan().
Vlad Yasevich [Thu, 27 Mar 2014 21:26:18 +0000 (17:26 -0400)]
net: Account for all vlan headers in skb_mac_gso_segment
skb_network_protocol() already accounts for multiple vlan
headers that may be present in the skb. However, skb_mac_gso_segment()
doesn't know anything about it and assumes that skb->mac_len
is set correctly to skip all mac headers. That may not
always be the case. If we are simply forwarding the packet (via
bridge or macvtap), all vlan headers may not be accounted for.
A simple solution is to allow skb_network_protocol to return
the vlan depth it has calculated. This way skb_mac_gso_segment
will correctly skip all mac headers.
Linus Torvalds [Fri, 28 Mar 2014 20:57:13 +0000 (13:57 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge two fixes from Andrew Morton:
"The x86 fix should come from x86 guys but they appear to be
conferencing or otherwise distracted.
The ocfs2 fix is a bit of a mess - the code runs into an immediate
NULL deref and we're trying to work out how this got through test and
review, but we haven't heard from Goldwyn in the past few days.
Sasha's patch fixes the oops, but the feature as a whole is probably
broken. So this is a stopgap for 3.14 - I'll aim to get the real
fixes into 3.14.x"
* emailed patches from Andrew Morton [email protected]>:
x86: fix boot on uniprocessor systems
ocfs2: check if cluster name exists before deref
Artem Fetishev [Fri, 28 Mar 2014 20:33:39 +0000 (13:33 -0700)]
x86: fix boot on uniprocessor systems
On x86 uniprocessor systems topology_physical_package_id() returns -1
which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized
which leads to GPF in rapl_pmu_init().
See arch/x86/kernel/cpu/perf_event_intel_rapl.c.
It turns out that physical_package_id and core_id can actually be
retreived for uniprocessor systems too. Enabling them also fixes
rapl_pmu code.
Sasha Levin [Fri, 28 Mar 2014 20:33:38 +0000 (13:33 -0700)]
ocfs2: check if cluster name exists before deref
Commit c74a3bdd9b52 ("ocfs2: add clustername to cluster connection") is
trying to strlcpy a string which was explicitly passed as NULL in the
very same patch, triggering a NULL ptr deref.
ipv6: move DAD and addrconf_verify processing to workqueue
addrconf_join_solict and addrconf_join_anycast may cause actions which
need rtnl locked, especially on first address creation.
A new DAD state is introduced which defers processing of the initial
DAD processing into a workqueue.
To get rtnl lock we need to push the code paths which depend on those
calls up to workqueues, specifically addrconf_verify and the DAD
processing.
(v2)
addrconf_dad_failure needs to be queued up to the workqueue, too. This
patch introduces a new DAD state and stop the DAD processing in the
workqueue (this is because of the possible ipv6_del_addr processing
which removes the solicited multicast address from the device).
addrconf_verify_lock is removed, too. After the transition it is not
needed any more.
As we are not processing in bottom half anymore we need to be a bit more
careful about disabling bottom half out when we lock spin_locks which are also
used in bh.
Eric Dumazet [Thu, 27 Mar 2014 15:45:56 +0000 (08:45 -0700)]
net: net: add a core netdev->tx_dropped counter
Dropping packets in __dev_queue_xmit() when transmit queue
is stopped (NIC TX ring buffer full or BQL limit reached) currently
outputs a syslog message.
It would be better to get a precise count of such events available in
netdevice stats so that monitoring tools can have a clue.
This extends the work done in caf586e5f23ce
("net: add a core netdev->rx_dropped counter")
Daniel Borkmann [Thu, 27 Mar 2014 15:38:30 +0000 (16:38 +0100)]
packet: respect devices with LLTX flag in direct xmit
Quite often it can be useful to test with dummy or similar
devices as a blackhole sink for skbs. Such devices are only
equipped with a single txq, but marked as NETIF_F_LLTX as
they do not require locking their internal queues on xmit
(or implement locking themselves). Therefore, rather use
HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.
trafgen mmap/TX_RING example against dummy device with config
foo: { fill(0xff, 64) } results in the following performance
improvements for such scenarios on an ordinary Core i7/2.80GHz:
Daniel Borkmann [Thu, 27 Mar 2014 15:34:59 +0000 (16:34 +0100)]
net: nlmon: flag nlmon devs with LLTX/SG
As in xmit path we merely update statistics and free the skb, we
can mark the device with LLTX feature, so that upper layers can
avoid taking the single txq lock on xmit. While at it, also add
missing NETIF_F_SG.
Eric Dumazet [Thu, 27 Mar 2014 14:19:19 +0000 (07:19 -0700)]
tcp: fix get_timewait4_sock() delay computation on 64bit
It seems I missed one change in get_timewait4_sock() to compute
the remaining time before deletion of IPV4 timewait socket.
This could result in wrong output in /proc/net/tcp for tm->when field.
Fixes: 96f817fedec4 ("tcp: shrink tcp6_timewait_sock by one cache line") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Flavio Leitner [Thu, 27 Mar 2014 14:05:34 +0000 (11:05 -0300)]
openvswitch: fix a possible deadlock and lockdep warning
There are two problematic situations.
A deadlock can happen when is_percpu is false because it can get
interrupted while holding the spinlock. Then it executes
ovs_flow_stats_update() in softirq context which tries to get
the same lock.
The second sitation is that when is_percpu is true, the code
correctly disables BH but only for the local CPU, so the
following can happen when locking the remote CPU without
disabling BH:
Toshiaki Makita [Thu, 27 Mar 2014 12:46:56 +0000 (21:46 +0900)]
bridge: Fix handling stacked vlan tags
If a bridge with vlan_filtering enabled receives frames with stacked
vlan tags, i.e., they have two vlan tags, br_vlan_untag() strips not
only the outer tag but also the inner tag.
br_vlan_untag() is called only from br_handle_vlan(), and in this case,
it is enough to set skb->vlan_tci to 0 here, because vlan_tci has already
been set before calling br_handle_vlan().
Toshiaki Makita [Thu, 27 Mar 2014 12:46:55 +0000 (21:46 +0900)]
bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled
Bridge vlan code (br_vlan_get_tag()) assumes that all frames have vlan_tci
if they are tagged, but if vlan tx offload is manually disabled on bridge
device and frames are sent from vlan device on the bridge device, the tags
are embedded in skb->data and they break this assumption.
Extract embedded vlan tags and move them to vlan_tci at ingress.
David S. Miller [Fri, 28 Mar 2014 20:30:05 +0000 (16:30 -0400)]
Merge branch 'mlx4_vxlan'
Or Gerlitz says:
====================
Implement vxlan ndo calls
This short series adds support for the vxlan ndo calls, the udp
port is programmed to the firmware using a new command we introduce
here which is called "config device".
====================
Or Gerlitz [Thu, 27 Mar 2014 12:02:02 +0000 (14:02 +0200)]
net/mlx4: USe one wrapper that returns -EPERM
When a VF issues a firmware command which is disallowed for them, the PF
rerturns -EPERM from that command wrapper. Move to use one such wrapper
instance, instead of repeating the same code on such commands.
vhost fails to validate negative error code
from vhost_get_vq_desc causing
a crash: we are using -EFAULT which is 0xfffffff2
as vector size, which exceeds the allocated size.
Sasha Levin [Fri, 28 Mar 2014 16:38:42 +0000 (17:38 +0100)]
random32: avoid attempt to late reseed if in the middle of seeding
Commit 4af712e8df ("random32: add prandom_reseed_late() and call when
nonblocking pool becomes initialized") has added a late reseed stage
that happens as soon as the nonblocking pool is marked as initialized.
This fails in the case that the nonblocking pool gets initialized
during __prandom_reseed()'s call to get_random_bytes(). In that case
we'd double back into __prandom_reseed() in an attempt to do a late
reseed - deadlocking on 'lock' early on in the boot process.
Instead, just avoid even waiting to do a reseed if a reseed is already
occuring.
Fixes: 4af712e8df99 ("random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized") Signed-off-by: Sasha Levin <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Linus Torvalds [Fri, 28 Mar 2014 20:03:00 +0000 (13:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"Updates to Synaptics touchpad to better cope with devices in Lenovo
laptops, and a couple more fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics - add manual min/max quirk for ThinkPad X240
Input: synaptics - add manual min/max quirk
Input: cypress_ps2 - don't report as a button pads
Input: da9052_onkey - use correct register bit for key status
Input: adp5588-keys - get value from data out when dir is out
Ben Dooks [Fri, 21 Mar 2014 11:09:14 +0000 (12:09 +0100)]
sh_eth: ensure pm_runtime cannot suspend the device during init
The pm_rumtime work queue is causing the device to be suspended during
initialisation, thus the initialisation may not be able to access registers
properly. As the code is called from a work queue, it is possible that this
is not seen from certain configurations/builds due to the asynchronos
nature of the code.
Another issue has also been found where the network device registration
calls back into the driver thus causing further pm_runtime calls that
also caused issues with the MDIO bus code. This has now been checked
and is the only place the MDIO can be called without the device open.
Use pm_runtime_get_sync() and pm_runtime_put() to ensure that the
pm system does not suspend it during the probe() call and remove the
now unnecessary pm_runtime_resume() call. Also add a call in the error
path to call pm_runtime_disable().
This fixes the external abort that can cause /sbin/init or other such
init processed to die.
David S. Miller [Fri, 28 Mar 2014 18:46:34 +0000 (14:46 -0400)]
Merge branch 'tipc-next'
Erik Hugne says:
====================
tipc: fix handling of NETDEV_CHANGEADDR event
Aside from manual reconfiguration of the netdevice hwaddr, this can also
be changed automatically for an interface bond in active-backup mode
if fail_over_mac is enabled. This patchset fixes the handling of this
event in TIPC by properly updating the l2 media address for the bearer,
followed by a reinitialization of the node discovery mechanism.
====================
Erik Hugne [Fri, 28 Mar 2014 09:32:09 +0000 (10:32 +0100)]
tipc: make discovery domain a bearer attribute
The node discovery domain is assigned when a bearer is enabled.
In the previous commit we reflect this attribute directly in the
bearer structure since it's needed to reinitialize the node
discovery mechanism after a hardware address change.
There's no need to replicate this attribute anywhere else, so we
remove it from the tipc_link_req structure.
Erik Hugne [Fri, 28 Mar 2014 09:32:08 +0000 (10:32 +0100)]
tipc: fix neighbor detection problem after hw address change
If the hardware address of a underlying netdevice is changed, it is
not enough to simply reset the bearer/links over this device. We
also need to reflect this change in the TIPC bearer and node
discovery structures aswell.
This patch adds the necessary reinitialization of the node disovery
mechanism following a hardware address change so that the correct
originating media address is advertised in the discovery messages.
David S. Miller [Fri, 28 Mar 2014 18:44:06 +0000 (14:44 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to e1000e, igb, i40e and i40evf
Anjali provides i40e fix to remove the ATR filter on RST as well as FIN
packets. Cleans up add_del_fdir() because it was used and implemented
only for the add, so change the name and drop a parameter. Adds the
ability to drop a flow if we wanted to and adds a flow director
message level to be used for flow director specific messages.
Mitch fixes an issue on i40evf where the Tx watchdog handler was causing
an oops when sending an admin queue message to request a reset because
the admin queue functions use spinlocks.
Greg provides a change to i40e to make the alloc and free queue vector
calls orthogonal.
Shannon fixes i40e to verify the eeprom checksum and firmware CRC status
bits, and shutdown the driver if they fail. This change stops the
processing of traffic, but does not kill the PF netdev so that the
NVMUpdate process still has a chance at fixing the image. Also provides
a fix to make sure the VSI has a netdev before trying to use it in
the debugfs netdev_ops commands.
Jakub Kicinski provides patches for e1000e and igb to fix a number issues
found in the PTP code.
v2:
- drop patch 11 "i40e: Add a fallback debug flow for the driver" from the
series based on feedback from David Miller
====================
Byungho An [Fri, 28 Mar 2014 17:57:36 +0000 (10:57 -0700)]
net: sxgbe: fix sparse warnings about static declaration
This fixes followings:
sparse warnings: (new ones prefixed by >>)
>> drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c:197:5:
sparse: symbol 'sxgbe_platform_freeze' was not declared. Should it be static?
>> drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c:204:5:
sparse: symbol 'sxgbe_platform_restore' was not declared. Should it be static?
>> drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c:228:24:
sparse: symbol 'sxgbe_platform_driver' was not declared. Should it be static?
>> drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c:1795:6:
sparse: symbol 'sxgbe_get_ops' was not declared. Should it be static?
David S. Miller [Fri, 28 Mar 2014 18:25:02 +0000 (14:25 -0400)]
Merge branch 'be2net-next'
Sathya Perla says:
====================
be2net: add vxlan offload support
The first patch adds the FW cmds needed to configure the Skyhawk-R
chip for supporting VxLAN offloads. The second patch implements the
ndo_add/del_vxlan_port() methods and the plumbing for supporting
RX/TX csum, TSO and RSS steering offloads for VxLAN traffic.
v2 changes:
NETIF_F_SG need not be set for hw_enc_features by the driver as it is
done by the stack.
v3 changes:
* Defer FW cmds needed for VxLAN offloads to a workqueue
* Reset FW to VxLAN offloads disabled state in the unload path
v4 changes:
* Revert the usage of workqueue (introduced in v3) to implement
ndo_add/del_vxlan_port() as it is currently not needed (none of the
FW cmd calls sleep.) Suggested by David M.
====================
Sathya Perla [Thu, 27 Mar 2014 05:16:18 +0000 (10:46 +0530)]
be2net: add FW cmds needed for VxLAN offloads
This patch adds support for the FW cmds needed for VxLAN offloads
on Skyhawk-R:
1) The VxLAN UDP port needs to be configured via the port-desc of
SET_PROFILE_CONFIG_v1 cmd.
This patch re-factors the be_set_profile_config() code (used so far
only for setting VF QoS) to be used to set any type of descriptor.
2) The MANAGE_IFACE_FILTERS cmds is needed to convert a normal interface
into a tunnel interface. This allows for RSS to work even on the inner
TCP/UDP headers of VxLAN traffic.
Linus Torvalds [Fri, 28 Mar 2014 17:58:10 +0000 (10:58 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"I didn't want these to wait for stable cycle.
The nouveau and radeon ones are the same problem, where the runtime pm
stuff broke non-runtime pm managed secondary GPUs.
The udl fix is for an oops on unplug, and the i915 fix is for a
regression on Sandybridge even though it may break haswell (regression
wins)"
Daniel Vetter comments:
"My apologies for the i915 regression fumble, that thing somehow fell
through the cracks here for almost half a year :( Imo that's more than
enough flailing to just go ahead with the revert, and the re-broken
hsw should get peoples attention ..."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Undo gtt scratch pte unmapping again
drm/radeon: fix runtime suspend breaking secondary GPUs
drm/nouveau: fail runtime pm properly.
drm/udl: take reference to device struct for dma-bufs
Linus Torvalds [Fri, 28 Mar 2014 17:52:05 +0000 (10:52 -0700)]
Merge tag 'stable/for-linus-3.14-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen bugfixes from David Vrabel:
"Fix two bugs that cause x86 PV guest crashes.
1. Ballooning a 32-bit guest would eventually crash it.
2. Revert a broken fix for a regression with NUMA_BALACING. The bad
fix caused PV guests to crash after migration. This is not ideal
but unpicking the madness that is _PAGE_NUMA == _PAGE_PROTNONE will
take a while longer"
* tag 'stable/for-linus-3.14-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
Revert "xen: properly account for _PAGE_NUMA during xen pte translations"
xen/balloon: flush persistent kmaps in correct position
The new Lenovo Haswell series (-40's) contains a new Synaptics touchpad.
However, these new Synaptics devices report bad axis ranges.
Under Windows, it is not a problem because the Windows driver uses RMI4
over SMBus to talk to the device. Under Linux, we are using the PS/2
fallback interface and it occurs the reported ranges are wrong.
Of course, it would be too easy to have only one range for the whole
series, each touchpad seems to be calibrated in a different way.
We can not use SMBus to get the actual range because I suspect the firmware
will switch into the SMBus mode and stop talking through PS/2 (this is the
case for hybrid HID over I2C / PS/2 Synaptics touchpads).
So as a temporary solution (until RMI4 land into upstream), start a new
list of quirks with the min/max manually set.
Grant Likely [Fri, 28 Mar 2014 00:11:23 +0000 (17:11 -0700)]
of: Add support for ePAPR "stdout-path" property
ePAPR 1.1 defines the "stdout-path" property for specifying the console
device, but Linux currently only handles the older "linux,stdout-path"
property. This patch adds parsing for the new property name.
Jakub Kicinski [Sat, 15 Mar 2014 14:55:32 +0000 (14:55 +0000)]
igb: fix race conditions on queuing skb for HW time stamp
igb has a single set of TX time stamping resources per NIC.
Use a simple bit lock to avoid race conditions and leaking skbs
when multiple TX rings try to claim time stamping.
Jakub Kicinski [Sat, 15 Mar 2014 14:55:26 +0000 (14:55 +0000)]
igb: never generate both software and hardware timestamps
skb_tx_timestamp() does not report software time stamp
if SKBTX_IN_PROGRESS is set. According to timestamping.txt
software time stamps are a fallback and should not be
generated if hardware time stamp is provided.
Move call to skb_tx_timestamp() after setting
SKBTX_IN_PROGRESS.