Git Repo - linux.git/log


Catalin Marinas [Tue, 25 Jul 2017 13:53:03 +0000 (14:53 +0100)]

arm64: Fix potential race with hardware DBM in ptep_set_access_flags()

In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
   (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
   needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
   marking the entry as young (setting PTE_AF) and dirty (clearing
   PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state.

This patch fixes the race by setting PTE_RDONLY to the most permissive
(lowest value) of the current entry and the new one.
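
The merge rule can be modelled in a few lines. This is only an illustrative sketch in Python, not the kernel code: the function name is hypothetical, the bit positions are assumptions made for the example, and the atomic update the real code needs in order to race safely with hardware is not modelled.

```python
# Assumed bit positions, for illustration only.
PTE_RDONLY = 1 << 7
PTE_AF = 1 << 10
PTE_WRITE = 1 << 51


def merge_access_flags(current, new):
    """Hypothetical model: OR in the new flags, but keep PTE_RDONLY set
    only if BOTH the current entry and the new one have it set."""
    rdonly = current & new & PTE_RDONLY          # most permissive = lowest value
    merged = (current | new) & ~PTE_RDONLY       # every other flag is simply OR-ed
    return merged | rdonly


# Scenario from the commit message:
old_clean = PTE_WRITE | PTE_RDONLY               # (a) writable, clean, old
wanted = PTE_WRITE | PTE_RDONLY | PTE_AF         # (b) writable, clean, young
hw_updated = PTE_WRITE | PTE_AF                  # (c) hardware marked it young and dirty

no_race = merge_access_flags(old_clean, wanted)
with_race = merge_access_flags(hw_updated, wanted)
```

In the racing case the result keeps PTE_RDONLY clear, so the dirty state recorded by the DBM-capable agent survives.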

Fixes: 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Acked-by: Steve Capper <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

Arnd Bergmann [Fri, 4 Aug 2017 11:22:33 +0000 (13:22 +0200)]

Merge tag 'davinci-fixes-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into fixes

Pull "DaVinci fixes for v4.13" from Sekhar Nori:

Drop unused VPIF endpoints from device-tree.
They should be used only when an actual
remote-endpoint is connected.

* tag 'davinci-fixes-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: dts: da850-lcdk: drop unused VPIF endpoints
ARM: dts: da850-evm: drop unused VPIF endpoints

Arnd Bergmann [Fri, 4 Aug 2017 11:04:42 +0000 (13:04 +0200)]

Merge tag 'sunxi-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes

Pull "Allwinner fixes for 4.13" from Chen-Yu Tsai:

Two fixes correct the EMAC block's memory region size to match the
datasheet. One converts raw A83T clock indices to macros from the
clk dt-binding header, completing the A83T sunxi-ng clk driver.

* tag 'sunxi-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun8i: a83t: Switch to CCU device tree binding macros
  arm64: allwinner: sun50i-a64: Correct emac register size
  ARM: dts: sunxi: h3/h5: Correct emac register size

Arnd Bergmann [Fri, 4 Aug 2017 11:03:24 +0000 (13:03 +0200)]

Merge tag 'qcom-arm64-defconfig-fixes-for-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into fixes

Pull "Qualcomm ARM64 based defconfig Fixes for v4.13-rc2" from Andy Gross:

* Enable missing HWSPINLOCK

* tag 'qcom-arm64-defconfig-fixes-for-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
arm64: defconfig: enable missing HWSPINLOCK

Marc Gonzalez [Fri, 28 Jul 2017 13:27:49 +0000 (15:27 +0200)]

ARM: dts: tango4: Request RGMII RX and TX clock delays

RX and TX clock delays are required. Request them explicitly.

Fixes: cad008b8a77e6 ("ARM: dts: tango4: Initial device trees")
Cc: [email protected]
Signed-off-by: Marc Gonzalez <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

Masahiro Yamada [Mon, 31 Jul 2017 05:49:25 +0000 (14:49 +0900)]

bus: uniphier-system-bus: set up registers when resuming

When resuming, set up registers that have been lost in the sleep state.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

Arnd Bergmann [Fri, 4 Aug 2017 10:54:41 +0000 (12:54 +0200)]

Merge tag 'renesas-fixes3-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Pull "Third Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:

Fix deadlock in regulator quirk for R-Car Gen 2 SoCs

The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus
notifier, and unregisters the notifier when it is no longer needed.
However, a notifier must not be unregistered from within the call chain.

This bug went unnoticed, as blocking_notifier_chain_unregister() didn't
take the semaphore during early boot. This is no longer the case as of
upstream commit 1c3c5eab171590f8 ("sched/core: Enable might_sleep() and
smp_processor_id() checks early") and a deadlock occurs.

* tag 'renesas-fixes3-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: rcar-gen2: Fix deadlock in regulator quirk

Arnd Bergmann [Fri, 4 Aug 2017 10:53:21 +0000 (12:53 +0200)]

Merge tag 'mvebu-fixes-4.13-2' of git://git.infradead.org/linux-mvebu into fixes

Pull "mvebu fixes for 4.13 (part 2)" from Gregory CLEMENT:

All the fixes are for ARM64 mvebu:

- Fix the RTC interrupt on A7K/A8K which was missed when switching
   from GIC to ICU
- Mark the A7K/A8K crypto engine as dma coherent
- Fix the number of GPIO on south bridge on Armada 3700

* tag 'mvebu-fixes-4.13-2' of git://git.infradead.org/linux-mvebu:
  ARM64: dts: marvell: armada-37xx: Fix the number of GPIO on south bridge
  arm64: dts: marvell: mark the cp110 crypto engine as dma coherent
  arm64: dts: marvell: use ICU for the CP110 slave RTC

Arnd Bergmann [Fri, 4 Aug 2017 10:50:52 +0000 (12:50 +0200)]

Merge tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into fixes

Pull "Amlogic fixes for v4.13-rc" from Kevin Hilman:

- 2 minor DT fixes

* tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
ARM64: dts: meson-gxl-s905x-libretech-cc: fixup board definition
ARM64: dts: meson-gx: use specific compatible for the AO pwms

Arnd Bergmann [Fri, 4 Aug 2017 10:48:46 +0000 (12:48 +0200)]

Merge tag 'v4.13-rockchip-dts32fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes

Pull "Rockchip dts32 fixes for 4.13" from Heiko Stübner:

Fix for the recently added mali dt support. The example
showed a wrong value, so fix it before it gets copy-pasted
too much.

* tag 'v4.13-rockchip-dts32fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: fix mali gpu node on rk3288
dt-bindings: gpu: drop wrong compatible from midgard binding example

David S. Miller [Fri, 4 Aug 2017 04:37:30 +0000 (21:37 -0700)]

Merge branch 'socket-sendmsg-zerocopy'

Willem de Bruijn says:

====================
socket sendmsg MSG_ZEROCOPY

Introduce zerocopy socket send flag MSG_ZEROCOPY. This extends the
shared page support (SKBTX_SHARED_FRAG) from sendpage to sendmsg.
Implement the feature for TCP initially, as large writes benefit
most.

On a send call with MSG_ZEROCOPY, the kernel pins user pages and
links these directly into the skbuff frags[] array.

Each send call with MSG_ZEROCOPY that transmits data will eventually
queue a completion notification on the error queue: a per-socket u32
incremented on each such call. A request may have to revert to copy
to succeed, for instance when a device cannot support scatter-gather
IO. In that case a flag is passed along to notify that the operation
succeeded without zerocopy optimization.

The implementation extends the existing zerocopy infra for tuntap,
vhost and xen with features needed for TCP, notably reference
counting to handle cloning on retransmit and GSO.

For more details, see also the netdev 2.1 paper and presentation at
https://netdevconf.org/2.1/session.html?debruijn

Changelog:

  v3 -> v4:
    - dropped UDP, RAW and PF_PACKET for now
        Without loopback support, datagrams are usually smaller than
        the ~8KB size threshold needed to benefit from zerocopy.
    - style: a few reverse christmas tree
    - minor: SO_ZEROCOPY returns ENOTSUPP on unsupported protocols
    - minor: squashed SO_EE_CODE_ZEROCOPY_COPIED patch
    - minor: rebased on top of net-next with kmap_atomic fix

  v2 -> v3:
    - fix rebase conflict: SO_ZEROCOPY 59 -> 60

  v1 -> v2:
    - fix (kbuild-bot): do not remove uarg until patch 5
    - fix (kbuild-bot): move zerocopy_sg_from_iter doc with function
    - fix: remove unused extern in header file

  RFCv2 -> v1:
    - patch 2
        - review comment: in skb_copy_ubufs, always allocate order-0
          page, also when replacing compound source pages.
    - patch 3
        - fix: always queue completion notification on MSG_ZEROCOPY,
          also if revert to copy.
        - fix: on syscall abort, correctly revert notification state
        - minor: skip queue notification on SOCK_DEAD
        - minor: replace BUG_ON with WARN_ON in recoverable error
    - patch 4
        - new: add socket option SOCK_ZEROCOPY.
          only honor MSG_ZEROCOPY if set, ignore for legacy apps.
    - patch 5
        - fix: clear zerocopy state on skb_linearize
    - patch 6
        - fix: only coalesce if prev errqueue elem is zerocopy
        - minor: try coalescing with list tail instead of head
        - minor: merge bytelen limit patch
    - patch 7
        - new: signal when data had to be copied
    - patch 8 (tcp)
        - optimize: avoid setting PSH bit when exceeding max frags.
          that limits GRO on the client. do not goto new_segment.
        - fix: fail on MSG_ZEROCOPY | MSG_FASTOPEN
        - minor: do not wait for memory: does not work for optmem
        - minor: simplify alloc
    - patch 9 (udp)
        - new: add PF_INET6
        - fix: attach zerocopy notification even if revert to copy
        - minor: simplify alloc size arithmetic
    - patch 10 (raw hdrinc)
        - new: add PF_INET6
    - patch 11 (pf_packet)
        - minor: simplify slightly
    - patch 12
        - new msg_zerocopy regression test: use veth pair to test
          all protocols: ipv4/ipv6/packet, tcp/udp/raw, cork
          all relevant ethtool settings: rx off, sg off
          all relevant packet lengths: 0, <MAX_HEADER, max size

  RFC -> RFCv2:
    - review comment: do not loop skb with zerocopy frags onto rx:
          add skb_orphan_frags_rx to orphan even refcounted frags
          call this in __netif_receive_skb_core, deliver_skb and tun:
          same as commit 1080e512d44d ("net: orphan frags on receive")
    - fix: hold an explicit sk reference on each notification skb.
          previously relied on the reference (or wmem) held by the
          data skb that would trigger notification, but this breaks
          on skb_orphan.
    - fix: when aborting a send, do not inc the zerocopy counter
          this caused gaps in the notification chain
    - fix: in packet with SOCK_DGRAM, pull ll headers before calling
          zerocopy_sg_from_iter
    - fix: if sock_zerocopy_realloc does not allow coalescing,
          do not fail, just allocate a new ubuf
    - fix: in tcp, check return value of second allocation attempt
    - chg: allocate notification skbs from optmem
          to avoid affecting tcp write queue accounting (TSQ)
    - chg: limit #locked pages (ulimit) per user instead of per process
    - chg: grow notification ids from 16 to 32 bit
          - pass range [lo, hi] through 32 bit fields ee_info and ee_data
    - chg: rebased to davem-net-next on top of v4.10-rc7
    - add: limit notification coalescing
          sharing ubufs limits overhead, but delays notification until
          the last packet is released, possibly unbounded. Add a cap.
    - tests: add snd_zerocopy_lo pf_packet test
    - tests: two bugfixes (add do_flush_tcp, ++sent not only in debug)

Limitations / Known Issues:
    - TCP may build slightly smaller than max TSO packets due to
      exceeding MAX_SKB_FRAGS frags when zerocopy pages are unaligned.
    - All SKBTX_SHARED_FRAG may require additional __skb_linearize or
      skb_copy_ubufs calls in u32, skb_find_text, similar to
      skb_checksum_help.

Notification skbuffs are allocated from optmem. For sockets that
cannot effectively coalesce notifications, the optmem max may need
to be increased to avoid hitting -ENOBUFS:

  sysctl -w net.core.optmem_max=1048576

In application load, copy avoidance shows a roughly 5% systemwide
reduction in cycles when streaming large flows and a 4-8% reduction in
wall clock time on early tensorflow test workloads.

For the single-machine veth tests to succeed, loopback support has to
be temporarily enabled by making skb_orphan_frags_rx map to
skb_orphan_frags.

* Performance

The below table shows cycles reported by perf for a netperf process
sending a single 10 Gbps TCP_STREAM. The first three columns show
Mcycles spent in the netperf process context. The second three columns
show time spent systemwide (-a -C A,B) on the two cpus that run the
process and interrupt handler. Reported is the median of at least 3
runs. std is a standard netperf, zc uses zerocopy and % is the ratio.
Netperf is pinned to cpu 2, network interrupts to cpu 3, rps and rfs
are disabled and the kernel is booted with idle=halt.

NETPERF=./netperf -t TCP_STREAM -H $host -T 2 -l 30 -- -m $size

perf stat -e cycles $NETPERF
perf stat -C 2,3 -a -e cycles $NETPERF

        --process cycles--      ----cpu cycles----
           std      zc   %      std         zc   %
4K      27,609  11,217  41      49,217  39,175  79
16K     21,370   3,823  18      43,540  29,213  67
64K     20,557   2,312  11      42,189  26,910  64
256K    21,110   2,134  10      43,006  27,104  63
1M      20,987   1,610   8      42,759  25,931  61

Perf record indicates the main source of these differences. Process
cycles only at 1M writes (perf record; perf report -n):

std:
Samples: 42K of event 'cycles', Event count (approx.): 21258597313
 79.41%         33884  netperf  [kernel.kallsyms]  [k] copy_user_generic_string
  3.27%          1396  netperf  [kernel.kallsyms]  [k] tcp_sendmsg
  1.66%           694  netperf  [kernel.kallsyms]  [k] get_page_from_freelist
  0.79%           325  netperf  [kernel.kallsyms]  [k] tcp_ack
  0.43%           188  netperf  [kernel.kallsyms]  [k] __alloc_skb

zc:
Samples: 1K of event 'cycles', Event count (approx.): 1439509124
 30.36%           584  netperf.zerocop  [kernel.kallsyms]  [k] gup_pte_range
 14.63%           284  netperf.zerocop  [kernel.kallsyms]  [k] __zerocopy_sg_from_iter
  8.03%           159  netperf.zerocop  [kernel.kallsyms]  [k] skb_zerocopy_add_frags_iter
  4.84%            96  netperf.zerocop  [kernel.kallsyms]  [k] __alloc_skb
  3.10%            60  netperf.zerocop  [kernel.kallsyms]  [k] kmem_cache_alloc_node

* Safety

The number of pages that can be pinned on behalf of a user with
MSG_ZEROCOPY is bound by the locked memory ulimit.
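
A process can inspect the limit that applies to it before relying on zerocopy for large sends. The following is a rough sketch in Python using the standard `resource` module; the helper name is hypothetical, and because the kernel accounts pinned pages per user rather than per process, the result should be read as an upper bound only.

```python
import resource


def max_pinnable_pages():
    """Hypothetical helper: soft RLIMIT_MEMLOCK expressed in pages.

    Returns None when the limit is unlimited.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft == resource.RLIM_INFINITY:
        return None
    # Integer division: a partial page cannot be pinned.
    return soft // resource.getpagesize()


pages = max_pinnable_pages()
```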

While the kernel holds process memory pinned, a process cannot safely
reuse those pages for other purposes. Packets looped onto the receive
stack and queued to a socket can be held indefinitely. Avoid unbounded
notification latency by restricting user pages to egress paths only.
skb_orphan_frags_rx() will create a private copy of pages even for
refcounted packets when these are looped, as did skb_orphan_frags for
the original tun zerocopy implementation.

Pages are not remapped read-only. Processes can modify packet contents
while packets are in flight in the kernel path. Bytes on which kernel
control flow depends (headers) are copied to avoid TOCTTOU attacks.
Datapath integrity does not otherwise depend on payload, with three
exceptions: checksums, optional sk_filter/tc u32/.. and device +
driver logic. The effect of wrong checksums is limited to the
misbehaving process. TC filters that access contents may have to be
excluded by adding an skb_orphan_frags_rx.

Processes can also safely avoid OOM conditions by bounding the number
of bytes passed with MSG_ZEROCOPY and by removing shared pages after
transmission from their own memory map.
====================

Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:45 +0000 (16:29 -0400)]

test: add msg_zerocopy test

Introduce regression test for msg_zerocopy feature. Send traffic from
one process to another with and without zerocopy.

Evaluate tcp, udp, raw and packet sockets, including variants
- udp: corking and corking with mixed copy/zerocopy calls
- raw: with and without hdrincl
- packet: at both raw and dgram level

Test on both ipv4 and ipv6, optionally with ethtool changes to
disable scatter-gather, tx checksum or tso offload. All of these
can affect zerocopy behavior.

The regression test can be run on a single machine if over a veth
pair. Then skb_orphan_frags_rx must be modified to be identical to
skb_orphan_frags to allow forwarding zerocopy locally.

The msg_zerocopy.sh script will set up the veth pair in network
namespaces and run all tests.

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:44 +0000 (16:29 -0400)]

tcp: enable MSG_ZEROCOPY

Enable support for MSG_ZEROCOPY to the TCP stack. TSO and GSO are
both supported. Only data sent to remote destinations is sent without
copying. Packets looped onto a local destination have their payload
copied to avoid unbounded latency.

Tested:
  A 10x TCP_STREAM between two hosts showed a reduction in netserver
  process cycles by up to 70%, depending on packet size. Systemwide,
  savings are of course much less pronounced, at up to 20% best case.

  msg_zerocopy.sh 4 tcp:

  without zerocopy
    tx=121792 (7600 MB) txc=0 zc=n
    rx=60458 (7600 MB)

  with zerocopy
    tx=286257 (17863 MB) txc=286257 zc=y
    rx=140022 (17863 MB)

  This test opens a pair of sockets over veth; one calls send with
  64KB and optionally MSG_ZEROCOPY, and the other reads the initial
  bytes. The receiver truncates, so this is strictly an upper bound on
  what is achievable. It is more representative of sending data out of
  a physical NIC (when payload is not touched, either).

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:43 +0000 (16:29 -0400)]

sock: ulimit on MSG_ZEROCOPY pages

Bound the number of pages that a user may pin.

Follow the lead of perf tools to maintain a per-user bound on memory
locked pages; see commit 789f90fcf6b0 ("perf_counter: per user mlock gift").

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:42 +0000 (16:29 -0400)]

sock: MSG_ZEROCOPY notification coalescing

In the simple case, each sendmsg() call generates data and eventually
a zerocopy ready notification N, where N indicates the Nth successful
invocation of sendmsg() with the MSG_ZEROCOPY flag on this socket.

TCP and corked sockets can cause send() calls to append new data to an
existing sk_buff and, thus, ubuf_info. In that case the notification
must hold a range. Modify ubuf_info to store an inclusive range [N..N+m]
and add skb_zerocopy_realloc() to optionally extend an existing range.

Also coalesce notifications in this common case: if a notification
[1, 1] is about to be queued while [0, 0] is the queue tail, just modify
the head of the queue to read [0, 1].

Coalescing is limited to a few TSO frames worth of data to bound
notification latency.
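
The coalescing rule can be sketched as follows. This is a simplified Python model, not the kernel implementation: the function name and the `max_ids` cap are hypothetical (the commit bounds coalescing by the amount of data, not by a count of ids), and u32 wraparound of the counter is ignored.

```python
def queue_notification(queue, lo, hi, max_ids=None):
    """Append the inclusive range [lo, hi], merging it into the queue tail
    when it directly continues that range and the cap allows it."""
    if queue:
        tail = queue[-1]
        contiguous = tail[1] + 1 == lo
        within_cap = max_ids is None or (hi - tail[0] + 1) <= max_ids
        if contiguous and within_cap:
            tail[1] = hi          # extend the existing range in place
            return
    queue.append([lo, hi])


# [0, 0] followed by [1, 1] collapses to [0, 1]; a gap starts a new entry.
q = []
queue_notification(q, 0, 0)
queue_notification(q, 1, 1)
queue_notification(q, 3, 3)

# With a cap of two ids per entry, four sends yield two notifications.
capped = []
for n in range(4):
    queue_notification(capped, n, n, max_ids=2)
```

The cap is what keeps a long run of sends from delaying the first completion indefinitely.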

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:41 +0000 (16:29 -0400)]

sock: enable MSG_ZEROCOPY

Prepare the datapath for refcounted ubuf_info. Clone ubuf_info with
skb_zerocopy_clone() wherever needed due to skb split, merge, resize
or clone.

Split skb_orphan_frags into two variants. The split, merge, .. paths
support reference counted zerocopy buffers, so do not do a deep copy.
Add skb_orphan_frags_rx for paths that may loop packets to receive
sockets. That is not allowed, as it may cause unbounded latency.
Deep copy all zerocopy buffers, ref-counted or not, in this path.

The exact locations to modify were chosen by exhaustively searching
through all code that might modify skb_frag references and/or the
SKBTX_DEV_ZEROCOPY tx_flags bit.

The changes err on the safe side, in two ways.

(1) legacy ubuf_info paths virtio and tap are not modified. They keep
    a 1:1 ubuf_info to sk_buff relationship. Calls to skb_orphan_frags
    still call skb_copy_ubufs and thus copy frags in this case.

(2) not all copies deep in the stack are addressed yet. skb_shift,
    skb_split and skb_try_coalesce can be refined to avoid copying.
    These are not in the hot path and this patch is hairy enough as
    is, so that is left for future refinement.

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:40 +0000 (16:29 -0400)]

sock: add SOCK_ZEROCOPY sockopt

The send call ignores unknown flags. Legacy applications may already
unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a
socket opts in to zerocopy.

Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing.
Processes can also query this socket option to detect kernel support
for the feature. Older kernels will return ENOPROTOOPT.
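
From user space, the probe and the opt-in might look like the sketch below. It is written in Python for brevity; the helper names are hypothetical, and the fallback value 60 is an assumption taken from the "SO_ZEROCOPY 59 -> 60" changelog entry, so it may differ on architectures with their own socket option numbering.

```python
import socket

# Assumption: use the interpreter's constant if it has one, else 60.
SO_ZEROCOPY = getattr(socket, "SO_ZEROCOPY", 60)


def kernel_supports_zerocopy(sock):
    """Return True if querying SO_ZEROCOPY succeeds on this socket."""
    try:
        sock.getsockopt(socket.SOL_SOCKET, SO_ZEROCOPY)
    except OSError:
        # Older kernels are expected to fail here with ENOPROTOOPT.
        return False
    return True


def enable_zerocopy(sock):
    """Opt the socket in; return False if the kernel or protocol refuses."""
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_ZEROCOPY, 1)
    except OSError:
        return False
    return True
```

A caller would typically fall back to ordinary copying sends when either helper returns False.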

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:39 +0000 (16:29 -0400)]

sock: add MSG_ZEROCOPY

The kernel supports zerocopy sendmsg in virtio and tap. Expand the
infrastructure to support other socket types. Introduce a completion
notification channel over the socket error queue. Notifications are
returned with ee_origin SO_EE_ORIGIN_ZEROCOPY. ee_errno is 0 to avoid
blocking the send/recv path on receiving notifications.

Add reference counting, to support the skb split, merge, resize and
clone operations possible with SOCK_STREAM and other socket types.

The patch does not yet modify any datapaths.

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:38 +0000 (16:29 -0400)]

sock: skb_copy_ubufs support for compound pages

Refine skb_copy_ubufs to support compound pages. With upcoming TCP
zerocopy sendmsg, such fragments may appear.

The existing code replaces each page one for one. Splitting each
compound page into an independent number of regular pages can result
in exceeding limit MAX_SKB_FRAGS if data is not exactly page aligned.

Instead, fill all destination pages but the last to PAGE_SIZE.
Split the existing alloc + copy loop into separate stages:
1. compute bytelength and minimum number of pages to store this.
2. allocate
3. copy, filling each page except the last to PAGE_SIZE bytes
4. update skb frag array
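
Stage 1 and the fill rule of stage 3 amount to simple arithmetic, sketched here in Python. The function name and the 4096-byte default page size are assumptions for illustration; allocation and the actual copy are not modelled.

```python
def plan_copy(frag_lengths, page_size=4096):
    """Return (number of pages, bytes stored in each destination page).

    Every page except the last is filled to page_size, so the page count
    is the minimum needed for the total byte length.
    """
    total = sum(frag_lengths)
    n_pages = -(-total // page_size)      # ceiling division
    if n_pages == 0:
        return 0, []
    last = total - page_size * (n_pages - 1)
    return n_pages, [page_size] * (n_pages - 1) + [last]


# Two 6000-byte fragments pack into three pages rather than being split
# fragment by fragment.
n, sizes = plan_copy([6000, 6000])
```

Packing by total length is what keeps the destination frag count from exceeding MAX_SKB_FRAGS when the source data is not page aligned.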

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Willem de Bruijn [Thu, 3 Aug 2017 20:29:37 +0000 (16:29 -0400)]

sock: allocate skbs from optmem

Add sock_omalloc and sock_ofree to be able to allocate control skbs,
for instance for looping errors onto sk_error_queue.

The transmit budget (sk_wmem_alloc) is involved in transmit skb
shaping, most notably in TCP Small Queues. Using this budget for
control packets would impact transmission.

Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Nicholas Piggin [Tue, 1 Aug 2017 13:59:28 +0000 (23:59 +1000)]

powerpc/64: Fix __check_irq_replay missing decrementer interrupt

If the decrementer wraps again and de-asserts the decrementer
exception while hard-disabled, __check_irq_replay() has a test to
notice the wrap when interrupts are re-enabled.

The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS
flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked
because the decrementer interrupt was always the first one checked
after clearing the hard disable flag, but HMI check was moved ahead of
that, which introduced this bug.

This can cause a missed decrementer interrupt if we soft-disable
interrupts then take an HMI which is recorded in irq_happened, then
hard-disable interrupts for > 4s to wrap the decrementer.

Fixes: e0e0d6b7390b ("powerpc/64: Replay hypervisor maintenance interrupt first")
Cc: [email protected] # v4.9+
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

Nicholas Piggin [Thu, 20 Jul 2017 01:53:22 +0000 (11:53 +1000)]

powerpc/perf: POWER9 PMU stops after idle workaround

POWER9 DD2 PMU can stop after a state-loss idle in some conditions.

A solution is to set then clear MMCRA[60] after wake from state-loss
idle. MMCRA[60] is a non-architected bit, see the user manual for
details.

Signed-off-by: Nicholas Piggin <[email protected]>
Acked-by: Madhavan Srinivasan <[email protected]>
Reviewed-by: Vaidyanathan Srinivasan <[email protected]>
Acked-by: Anton Blanchard <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

Dave Airlie [Fri, 4 Aug 2017 01:43:14 +0000 (11:43 +1000)]

Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Just a few small fixes for 4.13.

* 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Use list_del_init in amdgpu_mn_unregister
  drm/amdgpu: Fix undue fallthroughs in golden registers initialization
  drm/amdgpu: fix header on gfx9 clear state

David S. Miller [Thu, 3 Aug 2017 22:36:01 +0000 (15:36 -0700)]

Merge branch 'mlxsw-Support-for-IPv6-UC-router'

Jiri Pirko says:

====================
mlxsw: Support for IPv6 UC router

Ido says:

This set adds support for IPv6 unicast routes offload. The first four
patches make the FIB notification chain generic so that it could be used
by address families other than IPv4. This is done by having each address
family register its callbacks with the common code, so that its FIB tables
and rules could be dumped upon registration to the chain, while ensuring
the integrity of the dump. The exact mechanics are explained in detail in
the first patch.

The next six patches build upon this work and add the necessary callbacks
in IPv6 code. This allows listeners of the chain to receive notifications
about IPv6 routes addition, deletion and replacement as well as FIB rules
notifications.

Unlike user space notifications for IPv6 multipath routes, the FIB
notification chain notifies these on a per-nexthop basis. This allows
us to keep the common code lean. Grouping the nexthops into a single
notification is also unnecessary, as notifications are serialized by
each table's lock, whereas applications maintaining netlink caches may
suffer from concurrent dumps and deletions / additions of routes.

The next five patches audit the different code paths reading the route's
reference count (rt6i_ref) and remove assumptions regarding its meaning.
This is needed since non-FIB users need to be able to hold a reference on
the route and a non-zero reference count no longer means the route is in
the FIB.

The last six patches enable the mlxsw driver to offload IPv6 unicast
routes to the Spectrum ASIC. Without resorting to ACLs, lookup is done
solely based on the destination IP, so the abort mechanism is invoked
upon the addition of source-specific routes.

Follow-up patch sets will increase the scale of gatewayed routes by
consolidating identical nexthop groups to one adjacency entry in the
device's adjacency table (as in IPv4), as well as add support for
NH_{ADD,DEL} events which enable support for the
'ignore_routes_with_linkdown' sysctl.

Changes in v2:
* Provide offload indication for individual nexthops (David Ahern).
* Use existing route reference count instead of adding another one.
  This resulted in several new patches to remove assumptions regarding
  current semantics of the existing reference count (David Ahern).
* Add helpers to allow non-FIB users to take a reference on route.
* Remove use of tb6_lock in mlxsw (David Ahern).
* Add IPv6 dependency to mlxsw.
====================

Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:31 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Don't ignore IPv6 notifications

We now have all the necessary IPv6 infrastructure in place, so stop
ignoring these notifications.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:30 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Abort on source-specific routes

Without resorting to ACLs, the device performs route lookup solely based
on the destination IP address.

In case source-specific routing is needed, an error is returned and the
abort mechanism is activated, thus allowing the kernel to take over
forwarding decisions.

Instead of aborting, we can trap specific destination prefixes where
source-specific routes are present, but this will result in a lot more
code that is unlikely to ever be used.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:29 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Add support for route replace

In case we got a replace event, then the replaced route must exist. If
the route isn't capable of multipath, then replace the first matching
non-multipath capable route.

If the route is capable of multipath and a matching multipath capable
route is found, then replace it. Otherwise, replace the first matching
non-multipath capable route.

The new route is inserted before the replaced one. In case the replaced
route is currently offloaded, then it's overwritten in the device's table
by the new route and later deleted, thus not impacting routed traffic.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:28 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Add support for IPv6 routes addition / deletion

Allow directly connected and remote unicast IPv6 routes to be programmed
to the device's tables.

As with IPv4, identical routes - sharing the same destination prefix -
are ordered in a FIB node according to their table ID and then the
metric. While the kernel doesn't share the same trie for the local and
main table, this does happen in the device, so ordering according to
table ID is needed.

Since individual nexthops can be added and deleted in IPv6, each FIB
entry stores a linked list of the rt6_info structs it represents. Upon
the addition or deletion of a nexthop, a new nexthop group is allocated
according to the new configuration and the old one is destroyed.
Identical groups aren't currently consolidated, but will be in a
follow-up patchset.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:27 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Sanitize IPv6 FIB rules

We only allow FIB offload in the presence of default rules or an l3mdev
rule. In a similar fashion to IPv4 FIB rules, sanitize IPv6 rules.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:26 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Demultiplex FIB event based on family

The FIB notification block currently only handles IPv4 events, but we
want to start handling IPv6 events soon, so lay the groundwork now.

Do that by preparing the work item and processing it according to the
notified address family.
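
The shape of that demultiplexing, as a userspace Python sketch. The
handler and function names are hypothetical; AF_INET and AF_INET6 carry
their Linux values. Only IPv4 is registered, as in this patch.

```python
AF_INET = 2     # Linux value
AF_INET6 = 10   # Linux value


def fib4_event_work(event, info):
    return ("ipv4", event, info)


# Family -> work function. An IPv6 handler would be one more entry.
WORK_BY_FAMILY = {AF_INET: fib4_event_work}


def fib_event(family, event, info, workqueue):
    handler = WORK_BY_FAMILY.get(family)
    if handler is None:
        return False                     # unsupported family: nothing queued
    workqueue.append((handler, event, info))
    return True


def run_work(workqueue):
    results = [handler(event, info) for handler, event, info in workqueue]
    workqueue.clear()
    return results
```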

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:25 +0000 (13:28 +0200)]

ipv6: fib: Add helpers to hold / drop a reference on rt6_info

Similar to commit 1c677b3d2828 ("ipv4: fib: Add fib_info_hold() helper")
and commit b423cb10807b ("ipv4: fib: Export free_fib_info()"), add a
helper to hold a reference on rt6_info and export rt6_release() to drop
it and potentially release the route.

This is needed so that drivers capable of FIB offload could hold a
reference on the route before queueing it for offload and drop it after
the route has been programmed to the device's tables.
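
The intended usage pattern, sketched in userspace Python. route_hold()
and route_release() are hypothetical stand-ins for the helpers this
patch adds and exports; the workqueue is a plain list.

```python
class Route:
    def __init__(self, prefix):
        self.prefix = prefix
        self.refcnt = 1        # reference owned by the FIB
        self.freed = False


def route_hold(rt):
    rt.refcnt += 1


def route_release(rt):
    rt.refcnt -= 1
    if rt.refcnt == 0:
        rt.freed = True        # last reference gone: route can be freed


def driver_notify(rt, workqueue):
    route_hold(rt)             # keep the route alive while it is queued
    workqueue.append(rt)


def driver_work(workqueue, device_table):
    while workqueue:
        rt = workqueue.pop(0)
        device_table.append(rt.prefix)   # "program" the route
        route_release(rt)                # drop the driver's reference
```

If the FIB drops its own reference while the route is still queued, the
driver's reference keeps the object valid until the work item has run.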

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:24 +0000 (13:28 +0200)]

ipv6: Regenerate host route according to node pointer upon interface up

When an interface is brought back up, the kernel tries to restore the
host routes tied to its permanent addresses.

However, if the host route was removed from the FIB, then we need to
reinsert it. This is done by releasing the current dst and allocating a
new one, so as to not reuse a dst with obsolete values.

Since this function is called under RTNL and using the same explanation
from the previous patch, we can test if the route is in the FIB by
checking its node pointer instead of its reference count.

Tested using the following script and Andrey's reproducer mentioned
in commit 8048ced9beb2 ("net: ipv6: regenerate host route if moved to gc
list") and linked below:

$ ip link set dev lo up
$ ip link add dummy1 type dummy
$ ip -6 address add cafe::1/64 dev dummy1
$ ip link set dev lo down # cafe::1/128 is removed
$ ip link set dev dummy1 up
$ ip link set dev lo up

The host route is correctly regenerated.

Signed-off-by: Ido Schimmel <[email protected]>
Link: http://lkml.kernel.org/r/CAAeHK+zSe82vc5gCRgr_EoUwiALPnWVdWJBPwJZBpbxYz=kGJw@mail.gmail.com
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:23 +0000 (13:28 +0200)]

ipv6: Regenerate host route according to node pointer upon loopback up

When the loopback device is brought back up we need to check if the host
route attached to the address is still in the FIB and regenerate one in
case it's not.

Host routes using the loopback device are always inserted into and
removed from the FIB under RTNL (under which this function is called),
so we can test their node pointer instead of the reference count in
order to check if the route is in the FIB or not.

Tested using the following script from Nicolas mentioned in
commit a220445f9f43 ("ipv6: correctly add local routes when lo goes up"):

$ ip link add dummy1 type dummy
$ ip link set dummy1 up
$ ip link set lo down ; ip link set lo up

The host route is correctly regenerated.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:22 +0000 (13:28 +0200)]

ipv6: fib: Unlink replaced routes from their nodes

When a route is deleted its node pointer is set to NULL to indicate it's
no longer linked to its node. Do the same for routes that are replaced.

This will later allow us to test if a route is still in the FIB by
checking its node pointer instead of its reference count.
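
Why the node pointer is the more reliable test can be shown with a small
userspace Python model. All names are hypothetical; one route per node
is assumed for brevity.

```python
class Node:
    def __init__(self):
        self.leaf = None


class Route:
    def __init__(self, name):
        self.name = name
        self.node = None
        self.refcnt = 0


def link(node, rt):
    node.leaf = rt
    rt.node = node
    rt.refcnt += 1


def unlink(rt):
    rt.node.leaf = None
    rt.node = None             # marks the route as no longer in the FIB
    rt.refcnt -= 1


def replace(node, new):
    old = node.leaf
    unlink(old)                # replaced routes are unlinked like deleted ones
    link(node, new)
    return old


def in_fib(rt):
    return rt.node is not None
```

A replaced route that some other user still holds keeps a non-zero
reference count, yet in_fib() correctly reports it as gone.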

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:21 +0000 (13:28 +0200)]

ipv6: fib: Don't assume only nodes hold a reference on routes

The code currently assumes that only FIB nodes can hold a reference on
routes. Therefore, after fib6_purge_rt() has run and the route is no
longer present in any intermediate nodes, it's assumed that its
reference count would be 1 - taken by the node where it's currently
stored.

However, we're going to allow users other than the FIB to take a
reference on a route, so this assumption is no longer valid and the
BUG_ON() needs to be removed.

Note that purging only takes place if the initial reference count is
different than 1. I've left that check intact, as in the majority of
systems (where routes are only referenced by the FIB), it does actually
mean the route is present in intermediate nodes.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:20 +0000 (13:28 +0200)]

ipv6: fib: Add offload indication to routes

Allow user space applications to see which routes are offloaded and
which aren't by setting the RTNH_F_OFFLOAD flag when dumping them.

To be consistent with IPv4, offload indication is provided on a
per-nexthop basis.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:19 +0000 (13:28 +0200)]

ipv6: fib: Dump tables during registration to FIB chain

Dump all the FIB tables in each net namespace upon registration to the
FIB notification chain so that the callee will have a complete view of
the tables.

The integrity of the dump is ensured by a per-table sequence counter
that is incremented (under write lock) whenever a route is added or
deleted from the table.

All the sequence counters are read (under each table's read lock) and
summed, before and after the dump. If the sums differ, then the
dump is either restarted or the registration fails.

While it's possible for a table to be modified after its counter has
been read, this isn't really a problem. In case it happened before it
was read the second time, then the comparison at the end will fail. If
it happened afterwards, then we're guaranteed to be notified about the
change, as the notification block is registered prior to the second
read.
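
The scheme can be modelled in userspace Python as below. This is a
single-threaded sketch under stated assumptions: locking is only noted
in comments, the retry limit is arbitrary, the after_dump hook exists
purely to simulate a racing writer, and all names are hypothetical.

```python
class Table:
    def __init__(self, routes=()):
        self.routes = list(routes)
        self.seq = 0

    def add(self, route):
        self.routes.append(route)
        self.seq += 1          # kernel: incremented under the write lock


def seq_sum(tables):
    # kernel: each counter is read under its table's read lock
    return sum(t.seq for t in tables)


def register_with_dump(tables, chain, listener, max_retries=5, after_dump=None):
    for attempt in range(max_retries):
        before = seq_sum(tables)
        for t in tables:
            for route in t.routes:
                listener.append(route)   # replay existing state
        if after_dump:
            after_dump(attempt)          # test hook: racing modification
        chain.append(listener)           # register before the second read
        if seq_sum(tables) == before:
            return True
        chain.remove(listener)           # inconsistent view: undo and retry
        listener.clear()
    return False
```

Because the listener joins the chain before the second read, a change
made after that read would reach it as a regular notification.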

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:18 +0000 (13:28 +0200)]

ipv6: fib_rules: Dump rules during registration to FIB chain

Allow users of the FIB notification chain to receive a complete view of
the IPv6 FIB rules upon registration to the chain.

The integrity of the dump is ensured by a per-family sequence counter
that is incremented (under RTNL) whenever a rule is added or deleted.

All the sequence counters are read (under RTNL) and summed, before and
after the dump. If the sums differ, then the dump is either
restarted or the registration fails.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:17 +0000 (13:28 +0200)]

ipv6: fib: Add in-kernel notifications for route add / delete

As with IPv4, allow listeners of the FIB notification chain to receive
notifications whenever a route is added, replaced or deleted. This is
done by placing calls to the FIB notification chain in the two lowest
level functions that end up performing these operations - namely,
fib6_add_rt2node() and fib6_del_route().

Unlike IPv4, APPEND notifications aren't sent as the kernel doesn't
distinguish between "append" (NLM_F_CREATE|NLM_F_APPEND) and "prepend"
(NLM_F_CREATE). If NLM_F_EXCL isn't set, duplicate routes are always
added after the existing duplicate routes.
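
A simplified userspace Python model of that insertion rule. The NLM_F_*
values are the netlink flag values; treating "duplicate" as "same
metric" and storing routes as (metric, name) tuples are simplifications
made for this sketch.

```python
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NLM_F_APPEND = 0x800


def add_route(siblings, new, flags):
    """Insert new into siblings, a list of (metric, name) sorted by metric."""
    metric = new[0]
    if flags & NLM_F_EXCL and any(m == metric for m, _ in siblings):
        raise FileExistsError("duplicate route")
    pos = 0
    # Skip past every entry with a lower or equal metric, so that a
    # duplicate always lands after the existing duplicates.
    while pos < len(siblings) and siblings[pos][0] <= metric:
        pos += 1
    siblings.insert(pos, new)
    return pos
```

NLM_F_APPEND never influences the position, which is why a separate
APPEND notification would carry no extra information.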

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:16 +0000 (13:28 +0200)]

ipv6: fib: Add FIB notifiers callbacks

We're about to add IPv6 FIB offload support, so implement the necessary
callbacks in IPv6 code, which will later allow us to add routes and
rules notifications.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:15 +0000 (13:28 +0200)]

ipv6: fib_rules: Check if rule is a default rule

As explained in commit 3c71006d15fd ("ipv4: fib_rules: Check if rule is
a default rule"), drivers supporting IPv6 FIB offload need to be able to
sanitize the rules they don't support and potentially flush their
tables.

Add an IPv6 helper to check if a FIB rule is a default rule.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:14 +0000 (13:28 +0200)]

net: fib_rules: Implement notification logic in core

Unlike the routing tables, the FIB rules share a common core, so instead
of replicating the same logic for each address family we can simply dump
the rules and send notifications from the core itself.

To protect the integrity of the dump, a rules-specific sequence counter
is added for each address family and incremented whenever a rule is
added or deleted (under RTNL).

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:13 +0000 (13:28 +0200)]

rocker: Ignore address families other than IPv4

As in the previous patch, ignore IPv6 notifications since the driver doesn't
support these.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:12 +0000 (13:28 +0200)]

mlxsw: spectrum_router: Ignore address families other than IPv4

We're about to add IPv6 notifications in the FIB notification chain, but
the driver currently doesn't support these, so ignore them.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Ido Schimmel [Thu, 3 Aug 2017 11:28:11 +0000 (13:28 +0200)]

net: core: Make the FIB notification chain generic

The FIB notification chain is currently solely used by IPv4 code.
However, we're going to introduce IPv6 FIB offload support, which
requires these notifications as well.

As explained in commit c3852ef7f2f8 ("ipv4: fib: Replay events when
registering FIB notifier"), upon registration to the chain, the callee
receives a full dump of the FIB tables and rules by traversing all the
net namespaces. The integrity of the dump is ensured by a per-namespace
sequence counter that is incremented whenever a change to the tables or
rules occurs.

In order to allow more address families to use the chain, each family is
expected to register its fib_notifier_ops in its pernet init. These
operations allow the common code to read the family's sequence counter
as well as dump its tables and rules in the given net namespace.

Additionally, a 'family' parameter is added to sent notifications, so
that listeners could distinguish between the different families.

Implement the common code that allows listeners to register to the chain
and for address families to register their fib_notifier_ops. Subsequent
patches will implement these operations in IPv6.
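
A rough userspace Python model of the per-family operations described
above. The class and field names are illustrative rather than the
kernel's definitions, and the sequence counter is faked from the route
count.

```python
from dataclasses import dataclass
from typing import Callable, Dict

AF_INET, AF_INET6 = 2, 10   # Linux values


@dataclass
class FibNotifierOps:
    family: int
    seq_read: Callable[[], int]
    dump: Callable[[Callable], None]


class Net:
    """One network namespace holding the ops of each registered family."""

    def __init__(self):
        self.ops: Dict[int, FibNotifierOps] = {}

    def register_ops(self, ops):
        # Called from the family's pernet init.
        self.ops[ops.family] = ops

    def seq_sum(self):
        return sum(o.seq_read() for o in self.ops.values())

    def dump(self, callback):
        for o in self.ops.values():
            o.dump(callback)


def make_family(family, routes):
    def dump(cb):
        for r in routes:
            cb(family, "ENTRY_ADD", r)   # every notification carries the family
    # Stand-in sequence counter: the number of routes.
    return FibNotifierOps(family, lambda: len(routes), dump)
```

The common code never needs to know how a family stores its tables; it
only calls seq_read() and dump() through the registered ops.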

In the future, ipmr and ip6mr will be extended to provide these
notifications as well.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Linus Torvalds [Thu, 3 Aug 2017 22:25:14 +0000 (15:25 -0700)]

Merge tag 'vfio-v4.13-rc4' of git://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

- SPAPR/EEH config build fix (Murilo Opsfelder Araujo)

- Fix possible device lock deadlock (Alex Williamson)

- Correctly size integrated endpoint PCIe capabilities (Alex
   Williamson)

* tag 'vfio-v4.13-rc4' of git://github.com/awilliam/linux-vfio:
  vfio/pci: Fix handling of RC integrated endpoint PCIe capability size
  vfio/pci: Use pci_try_reset_function() on initial open
  include/linux/vfio.h: Guard powerpc-specific functions with CONFIG_VFIO_SPAPR_EEH

David S. Miller [Thu, 3 Aug 2017 22:16:09 +0000 (15:16 -0700)]

Merge branch 'mvpp2-add-TX-interrupts-support'

Thomas Petazzoni says:

====================
net: mvpp2: add TX interrupts support

So far, the mvpp2 driver was using an hrtimer to handle TX
completion. This patch series adds support for using TX interrupts
(for each CPU) on PPv2.2, the variant of the IP used on Marvell Armada
7K/8K.

Dave: this version can be applied right away, it no longer depends on
Antoine's patch series. Antoine's series had some comments, so he will
have to respin later on. Therefore, let's merge this smaller patch
series first.

Changes since v1:

- Rebased on top of net-next, instead of on top of Antoine's series.

- Removed the Device Tree patch, as it shouldn't go through the net
tree.
====================

Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:42:01 +0000 (10:42 +0200)]

dt-bindings: net: marvell-pp2: update interrupt-names with TX interrupts

The PPv2.2 unit has several interrupts used for TX completion
notification. This commit updates the Device Tree binding describing
this HW block to mention such interrupts.

While at it, we update the example to use a recent Device Tree
example, that uses interrupts going through the ICU, and not to the
GIC directly.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:42:00 +0000 (10:42 +0200)]

net: mvpp2: add support for TX interrupts and RX queue distribution modes

This commit adds the support for two related features:

- Support for TX interrupts, with one interrupt for each CPU

- Support for different RX queue distribution modes
   MVPP2_QDIST_SINGLE_MODE where a single interrupt, shared by all
   CPUs, receives the RX events, and MVPP2_QDIST_MULTI_MODE, where the
   per-CPU interrupts used for TX events are also used for RX events.

Since additional interrupts are needed, an update to the Device Tree
binding is needed. However, backward compatibility is preserved with
the old Device Tree binding, by gracefully degrading to the original
behavior, with only one RX interrupt, and TX completion being handled
by an hrtimer.
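
How the interrupt layout could follow from the binding and the mode,
sketched in userspace Python. The interrupt names ("tx-cpuN",
"rx-shared") and the dictionary layout are assumptions for illustration;
only the two mode names come from this commit.

```python
MVPP2_QDIST_SINGLE_MODE = "single"   # values are placeholders
MVPP2_QDIST_MULTI_MODE = "multi"


def plan_vectors(interrupt_names, ncpus, mode):
    have_tx_irqs = all(f"tx-cpu{c}" in interrupt_names for c in range(ncpus))
    if not have_tx_irqs:
        # Old binding: one RX interrupt, TX completion via hrtimer.
        return {"tx_done": "hrtimer",
                "vectors": [{"irq": interrupt_names[0], "rx": True, "tx": False}]}
    multi = mode == MVPP2_QDIST_MULTI_MODE
    # One vector per CPU for TX; in multi mode it also takes RX events.
    vectors = [{"irq": f"tx-cpu{c}", "rx": multi, "tx": True}
               for c in range(ncpus)]
    if not multi:
        # Single mode: one extra interrupt, shared by all CPUs, for RX.
        vectors.append({"irq": "rx-shared", "rx": True, "tx": False})
    return {"tx_done": "interrupt", "vectors": vectors}
```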

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:41:59 +0000 (10:41 +0200)]

net: mvpp2: introduce queue_vector concept

In preparation for the introduction of TX interrupts and improved RX
queue distribution, this commit introduces the concept of "queue
vector". A queue vector represents a number of RX and/or TX queues,
and an associated NAPI instance and interrupt.

This commit currently only creates a single queue_vector, so there are
no changes in behavior, but it paves the way for additional
queue_vectors in the next commits.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:41:58 +0000 (10:41 +0200)]

net: mvpp2: move from cpu-centric naming to "software thread" naming

The PPv2.2 IP has a concept of "software thread", with all registers
of the PPv2.2 mapped 8 times, for concurrent accesses by 8 "software
threads". In addition, interrupts on RX queues are associated to such
"software thread".

For most cases, we map a "software thread" to the more conventional
concept of CPU, but we will soon have one exception: we will have a
model where we have one TX interrupt per CPU (each using one software
thread), and all RX events mapped to another software thread
(associated to another interrupt).

In preparation for this change, it makes sense to change the naming
from MVPP2_MAX_CPUS to MVPP2_MAX_THREADS, and plan for 8 software
threads instead of 4 currently.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:41:57 +0000 (10:41 +0200)]

net: mvpp2: introduce per-port nrxqs/ntxqs variables

Currently, the global variables rxq_number and txq_number hold the
number of per-port TXQs and RXQs. Until now, such numbers were
constant regardless of the driver configuration. As we are going to
introduce different modes for TX and RX queues, these numbers will
depend on the configuration (PPv2.1 vs. PPv2.2, exact queue
distribution logic).

Therefore, as a preparation, we move the number of RXQs and TXQs in
the 'struct mvpp2_port' structure, next to the RXQs and TXQs
descriptor arrays.

For now, they remain initialized to the same default values as
rxq_number/txq_number used to be initialized, but this will change in
future commits.

The only non-mechanical change in this patch is that the check to
verify hardware constraints on the number of RXQs and TXQs is moved
from mvpp2_probe() to mvpp2_port_probe(), since it's now in
mvpp2_port_probe() that we initialize the per-port count of RXQ and
TXQ.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:41:56 +0000 (10:41 +0200)]

net: mvpp2: remove RX queue group reset code

The RX queue group allocation is anyway re-done later in
mvpp2_port_init(), so resetting it in mvpp2_init() is not very useful,
and will be annoying as we are going to rework the RX queue group
allocation logic.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Thomas Petazzoni [Thu, 3 Aug 2017 08:41:55 +0000 (10:41 +0200)]

net: mvpp2: fix MVPP21_ISR_RXQ_GROUP_REG definition

The MVPP21_ISR_RXQ_GROUP_REG register is not indexed by rxq, but by
port, so we fix the parameter name accordingly. There are no
functional changes.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Romain Perier [Thu, 3 Aug 2017 07:49:03 +0000 (09:49 +0200)]

net: arc_emac: Add support for ndo_do_ioctl net_device_ops operation

This operation is required for handling ioctl commands like SIOCGMIIREG,
when debugging MDIO registers from userspace.

This commit adds support for this operation.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

David S. Miller [Thu, 3 Aug 2017 22:08:18 +0000 (15:08 -0700)]

Merge branch 'hns3-ethernet-driver'

Salil Mehta says:

====================
Hisilicon Network Subsystem 3 Ethernet Driver

This patch-set adds support for the HNS3 (Hisilicon Network Subsystem 3)
Ethernet driver for the hip08 family of SoCs and upcoming SoCs.

Hisilicon's new hip08 SoCs have integrated Ethernet based on PCI Express, so
a new driver is needed instead of the previous HNS driver, which is
already part of the Linux mainline. This new driver is NOT backward
compatible with HNS.

The current driver is meant to control the Physical Function. Support for a
separate Virtual Function driver will follow once this base PF
driver has been accepted. This driver is also ongoing development work, and
the HNS3 Ethernet driver will be incrementally enhanced with new features.

High Level Architecture:

        [ Ethtool ]
           ^  |
           |  |
     [Ethernet Client]  [ODP/UIO Client] . . . [ RoCE Client ]
                         |                            |
                   [ HNAE Device ]                    |
                         |                            |
    ---------------------------------------------     |
                         |                            |
     [ HNAE3 Framework (Register/unregister) ]        |
                         |                            |
    ---------------------------------------------     |
                         |                            |
                   [ HCLGE Layer]                     |
         ________________|_________________           |
        |                |                 |          |
    [ MDIO ]    [ Scheduler/Shaper ]  [ Debugfs* ]    |
        |                |                 |          |
        |________________|_________________|          |
                         |                            |
             [ IMP command Interface ]                |
    ---------------------------------------------     |
              HIP08  H A R D W A R E                  *

Current patch-set broadly adds the support of the following PF functionality:
1. Basic Rx and Tx functionality
2. TSO support
3. Ethtool support
4. * Debugfs support -> this patch for now has been taken off.
5. HNAE framework and hardware compatibility layer
6. Scheduler and Shaper support in transmit function
7. MDIO support

Change Log:
V5->V6: Addressed below comments:
        * Andrew Lunn: Comments on MDIO and ethtool link mode
        * Leon Romanovsky: Some comments on HNAE layer tidy-up
        * Internal comments on redundant code removal, fixing error types etc.
V4->V5: Addressed below concerns:
        * Florian Fainelli: Miscellaneous comments on ethtool & enet layer
        * Stephen Hemminger: comment on netdev stats in ethtool layer
        * Leon Romanovsky: Comments on Driver Version String, naming & Kconfig
        * Richard Cochran: Redundant function prototype
V3->V4: Addressed below comments:
        * Andrew Lunn: Various comments on MDIO, ethtool, ENET driver etc.
        * Stephen Hemminger: change access and update to 64-bit statistics
        * Bo You: some spelling mistakes and checkpatch.pl errors.
V2->V3: Addressed comments
        * Yuval Mintz: Removal of redundant userprio-to-tc code
        * Stephen Hemminger: Ethtool & interrupt enable
        * Andrew Lunn: On C45/C22 PHY support, HNAE, ethtool
        * Florian Fainelli: C45/C22 and phy_connect/attach
        * Intel kbuild errors
V1->V2: Addressed some comments by kbuild, Yuval Mintz, Andrew Lunn &
        Florian Fainelli in the following patches:
        * Add support of HNS3 Ethernet Driver for hip08 SoC
        * Add MDIO support to HNS3 Ethernet driver for hip08 SoC
        * Add support of debugfs interface to HNS3 driver
====================

Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:52 +0000 (16:59 +0100)]

net: hns3: Add HNS3 driver to kernel build framework & MAINTAINERS

This patch updates the MAINTAINERS file with HNS3 Ethernet driver
maintainers names and other details. This also introduces the new
Makefiles required to build the HNS3 Ethernet driver and updates
the existing Kconfig file in the hisilicon folder.

Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:51 +0000 (16:59 +0100)]

net: hns3: Add Ethtool support to HNS3 driver

This patch adds the support of the Ethtool interface to
the HNS3 Ethernet driver. Various commands to read the
statistics, configure the offloading, loopback selftest etc.
are supported.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:50 +0000 (16:59 +0100)]

net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC

This patch adds the support of MDIO bus interface for HNS3 driver.
Code provides various interfaces to start and stop the PHY layer
and to read and write the MDIO bus or PHY.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:49 +0000 (16:59 +0100)]

net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver

This patch adds the support of the Scheduling and Shaping
functionalities during the transmit leg. This also adds the
support of Pause at MAC level. (Pause at per-priority level
shall be added later along with the DCB feature).

The hardware consists of two types of configuration of 6-level
schedulers. Algorithms vary according to the level and type
of scheduler being used. The current patch initializes
the mapping, algorithms (like SP, DWRR etc.) and shapers (CIR, PIR etc.)
being used.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: Wei Hu (Xavier) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:48 +0000 (16:59 +0100)]

net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support

This patch adds the support of Hisilicon Network Subsystem Acceleration
Engine and common operations to access it. This layer provides access to the
hardware configuration and hardware statistics. This layer is also
responsible for triggering the initialization of the PHY layer through
the MDIO layer below it.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: Wei Hu (Xavier) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:47 +0000 (16:59 +0100)]

net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support

This patch adds the support of IMP (Integrated Management Processor)
command interface to the HNS3 driver.

Each PF/VF supports a CQP (Command Queue Pair) ring interface.
Each CQP consists of a send queue (CSQ) and a receive queue (CRQ).
A PF/VF may support various commands, such as Flow Table
manipulation, Device management, Packet buffer allocation, Forwarding,
VLAN config, Tunneling/Overlays etc.

This patch contains code to initialize the command queue, manage the
command queue descriptors and Rx/Tx protocol with the command processor
in the form of various commands/results and acknowledgements.
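
As a rough mental model only, a queue pair built from two descriptor
rings might look like the userspace Python sketch below. It is not the
hardware's descriptor layout or protocol: the names are hypothetical,
firmware_step() merely stands in for the command processor, and the
sketch assumes replies come back on the CRQ, which this message does not
specify.

```python
class Ring:
    def __init__(self, size):
        self.slots = [None] * size
        self.next_to_use = 0
        self.next_to_clean = 0

    def free_slots(self):
        used = (self.next_to_use - self.next_to_clean) % len(self.slots)
        return len(self.slots) - 1 - used    # one slot is always kept empty

    def push(self, desc):
        if self.free_slots() == 0:
            raise BufferError("ring full")
        self.slots[self.next_to_use] = desc
        self.next_to_use = (self.next_to_use + 1) % len(self.slots)

    def pop(self):
        if self.next_to_clean == self.next_to_use:
            return None                      # ring empty
        desc = self.slots[self.next_to_clean]
        self.next_to_clean = (self.next_to_clean + 1) % len(self.slots)
        return desc


class CommandQueuePair:
    def __init__(self, size=4):
        self.csq = Ring(size)   # commands from the driver
        self.crq = Ring(size)   # assumed: replies to the driver

    def send(self, opcode, data=None):
        self.csq.push({"opcode": opcode, "data": data})

    def firmware_step(self):
        # Stand-in for the command processor consuming one command.
        cmd = self.csq.pop()
        if cmd is not None:
            self.crq.push({"opcode": cmd["opcode"], "retval": 0})

    def receive(self):
        return self.crq.pop()
```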

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:46 +0000 (16:59 +0100)]

net: hns3: Add support of the HNAE3 framework

This patch adds support for the HNAE3 (Hisilicon Network
Acceleration Engine 3) framework to the HNS3 driver.

The framework allows clients like ENET (HNS3 Ethernet Driver), RoCE
and user-space Ethernet drivers (like ODP etc.) to register with HNAE3
devices and their associated operations.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Salil [Wed, 2 Aug 2017 15:59:45 +0000 (16:59 +0100)]

net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

This patch adds the support of Hisilicon Network Subsystem 3
Ethernet driver to hip08 family of SoCs.

This driver includes basic Rx/Tx functionality. It also includes
the client registration code with the HNAE3(Hisilicon Network
Acceleration Engine 3) framework.

This work provides the initial support for the hip08 SoC;
features and enhancements will be added incrementally.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: lipeng <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: Wei Hu (Xavier) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Linus Torvalds [Thu, 3 Aug 2017 21:58:13 +0000 (14:58 -0700)]

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"15 fixes"

[ This does not merge the "fortify: use WARN instead of BUG for now"
  patch, which needs a bit of extra work to build cleanly with all
  configurations. Arnd is on it.   - Linus ]

* emailed patches from Andrew Morton <[email protected]>:
  ocfs2: don't clear SGID when inheriting ACLs
  mm: allow page_cache_get_speculative in interrupt context
  userfaultfd: non-cooperative: flush event_wqh at release time
  ipc: add missing container_of()s for randstruct
  cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
  userfaultfd_zeropage: return -ENOSPC in case mm has gone
  mm: take memory hotplug lock within numa_zonelist_order_handler()
  mm/page_io.c: fix oops during block io poll in swapin path
  zram: do not free pool->size_class
  kthread: fix documentation build warning
  kasan: avoid -Wmaybe-uninitialized warning
  userfaultfd: non-cooperative: notify about unmap of destination during mremap
  mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
  pid: kill pidhash_size in pidhash_init()
  mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors

Linus Torvalds [Thu, 3 Aug 2017 19:37:12 +0000 (12:37 -0700)]

Merge tag 'acpi-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"These fix two issues in the ACPI SoC drivers (Intel LPSS and AMD APD),
  a crash in the PCC mailbox initialization code and a WDAT watchdog
  initialization failure.

  Specifics:

   - Fix a device ID of Hisilicon Hip07/08 in the ACPI APD (AMD SoC)
     driver (Hanjun Guo).

   - Fix list corruption (introduced during the 4.11 cycle) in the ACPI
     LPSS (Intel SoC) driver (Hans de Goede).

   - Fix PCC mailbox handling code crash during initialization when PCCT
     is not present and PCC channel 0 is requested (Hoan Tran).

   - Fix a WDAT watchdog initialization issue causing platform device
     creation to fail due to partially overlapping address ranges in
     resources (Ryan Kennedy)"

* tag 'acpi-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: APD: Fix HID for Hisilicon Hip07/08
  mailbox: pcc: Fix crash when request PCC channel 0
  ACPI / watchdog: Fix init failure with overlapping register regions
  ACPI / LPSS: Only call pwm_add_table() for the first PWM controller

Linus Torvalds [Thu, 3 Aug 2017 19:32:49 +0000 (12:32 -0700)]

Merge tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix two cpufreq issues, one introduced recently and one related
  to recent changes, fix cpufreq documentation, fix up recently added
  code in the Thunderbolt driver and update runtime PM framework
  documentation.

  Specifics:

   - Fix the handling of the scaling_cur_freq cpufreq policy attribute
     on x86 systems with the MPERF/APERF registers present to make it
     behave more as expected after recent changes (Rafael Wysocki).

   - Drop a leftover callback from the intel_pstate driver which also
     prevents the cpuinfo_cur_freq cpufreq policy attribute from being
     incorrectly exposed when intel_pstate works in the active mode
     (Rafael Wysocki).

   - Add a missing piece describing the cpuinfo_cur_freq policy
     attribute to cpufreq documentation (Rafael Wysocki).

   - Fix up a recently added part of the Thunderbolt driver to avoid
     aborting system suspends if its mailbox commands time out (Rafael
     Wysocki).

   - Update device runtime PM framework documentation to reflect the
     current behavior of the code (Johan Hovold)"

* tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thunderbolt: icm: Ignore mailbox errors in icm_suspend()
  cpufreq: x86: Make scaling_cur_freq behave more as expected
  PM / runtime: Document new pm_runtime_set_suspended() constraint
  cpufreq: docs: Add missing cpuinfo_cur_freq description
  cpufreq: intel_pstate: Drop ->get from intel_pstate structure

Rafael J. Wysocki [Thu, 3 Aug 2017 18:30:18 +0000 (20:30 +0200)]

Merge branches 'acpi-soc', 'acpi-wdat' and 'acpi-cppc'

* acpi-soc:
  ACPI: APD: Fix HID for Hisilicon Hip07/08
  ACPI / LPSS: Only call pwm_add_table() for the first PWM controller

* acpi-wdat:
  ACPI / watchdog: Fix init failure with overlapping register regions

* acpi-cppc:
  mailbox: pcc: Fix crash when request PCC channel 0

Rafael J. Wysocki [Thu, 3 Aug 2017 18:29:45 +0000 (20:29 +0200)]

Merge branches 'pm-core' and 'pm-misc'

* pm-core:
  PM / runtime: Document new pm_runtime_set_suspended() constraint

* pm-misc:
  thunderbolt: icm: Ignore mailbox errors in icm_suspend()

Rafael J. Wysocki [Thu, 3 Aug 2017 18:29:24 +0000 (20:29 +0200)]

Merge branches 'pm-cpufreq-x86', 'pm-cpufreq-docs' and 'intel_pstate'

* pm-cpufreq-x86:
  cpufreq: x86: Make scaling_cur_freq behave more as expected

* pm-cpufreq-docs:
  cpufreq: docs: Add missing cpuinfo_cur_freq description

* intel_pstate:
  cpufreq: intel_pstate: Drop ->get from intel_pstate structure

David S. Miller [Thu, 3 Aug 2017 16:45:48 +0000 (09:45 -0700)]

Merge branch 'sctp-remove-typedefs-from-structures-part-4'

Xin Long says:

====================
sctp: remove typedefs from structures part 4

The kernel coding style discourages typedefs, and checkpatch.pl warns
about them, yet sctp still uses them for many structures.

All of these typedefs should be removed. This patchset is part 4 of that
work and removes them for another 14 basic structures in linux/sctp.h.
After this patchset, no typedefs remain in linux/sctp.h.

As with parts 1-3, these patches change no code logic; they are cleanup
only.
====================

Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:22 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_auth_chunk_t

Remove the typedef sctp_auth_chunk_t and replace it with
struct sctp_auth_chunk wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
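The two cleanups named in this series (dropping a typedef in favour of the
plain struct name, and preferring sizeof(variable) over sizeof(type)) can be
sketched in userspace C. This is only an illustration: struct demo_authhdr and
demo_parse_authhdr() are hypothetical names, and the field layout is an
assumption, not the real definition from linux/sctp.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout for illustration only; the real
 * definitions live in linux/sctp.h and may differ from this one. */
struct demo_authhdr {
	uint16_t shkey_id;
	uint16_t hmac_id;
};

/* Before the cleanup a caller would have written "demo_authhdr_t *hdr"
 * (typedef struct demo_authhdr { ... } demo_authhdr_t;).
 * After it, the struct tag is spelled out directly. */

/* Copy a header out of a raw buffer.
 * Returns the number of bytes consumed, or 0 if the buffer is too short. */
size_t demo_parse_authhdr(const uint8_t *buf, size_t len,
			  struct demo_authhdr *hdr)
{
	assert(buf != NULL && hdr != NULL);

	/* sizeof(*hdr) follows the variable's type automatically, so a
	 * later type change cannot leave a stale sizeof(type) behind. */
	if (len < sizeof(*hdr))
		return 0;

	memcpy(hdr, buf, sizeof(*hdr));
	return sizeof(*hdr);
}
```

A caller passes a received buffer and a struct demo_authhdr to fill in; the
size check and the copy both stay correct if the struct later gains a field.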

Xin Long [Thu, 3 Aug 2017 07:42:21 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_authhdr_t

Remove the typedef sctp_authhdr_t and replace it with
struct sctp_authhdr wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:20 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_addip_chunk_t

Remove the typedef sctp_addip_chunk_t and replace it with
struct sctp_addip_chunk wherever the typedef is used.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:19 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_addiphdr_t

Remove the typedef sctp_addiphdr_t and replace it with
struct sctp_addiphdr wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:18 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_addip_param_t

Remove the typedef sctp_addip_param_t and replace it with
struct sctp_addip_param wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type), and fix some
indentation problems.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:17 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_cwr_chunk_t

Remove this typedef together with its struct; nothing uses either of
them.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:16 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_cwrhdr_t

Remove the typedef sctp_cwrhdr_t and replace it with
struct sctp_cwrhdr wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:15 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_ecne_chunk_t

Remove the typedef sctp_ecne_chunk_t and replace it with
struct sctp_ecne_chunk wherever the typedef is used.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:14 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_ecnehdr_t

Remove the typedef sctp_ecnehdr_t and replace it with
struct sctp_ecnehdr wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:13 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_error_t

Remove the typedef sctp_error_t and replace it with
enum sctp_error wherever the typedef is used.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:12 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_operr_chunk_t

Remove the typedef sctp_operr_chunk_t and replace it with
struct sctp_operr_chunk wherever the typedef is used.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:11 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_errhdr_t

Remove the typedef sctp_errhdr_t and replace it with
struct sctp_errhdr wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:10 +0000 (15:42 +0800)]

sctp: fix the name of struct sctp_shutdown_chunk_t

Fix the name of struct sctp_shutdown_chunk_t by renaming it to
struct sctp_shutdown_chunk wherever it is used.

Also fix some indentation problems.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Xin Long [Thu, 3 Aug 2017 07:42:09 +0000 (15:42 +0800)]

sctp: remove the typedef sctp_shutdownhdr_t

Remove the typedef sctp_shutdownhdr_t and replace it with
struct sctp_shutdownhdr wherever the typedef is used.

Also use sizeof(variable) instead of sizeof(type).

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

David S. Miller [Thu, 3 Aug 2017 16:33:06 +0000 (09:33 -0700)]

Merge branch 'ibmvnic-ethtool'

John Allen says:

====================
ibmvnic: Improve ethtool functionality

This patch series improves ibmvnic ethtool functionality by adding support
for ethtool -l and -g options, correcting existing statistics reporting,
and augmenting the existing statistics with counters for each tx and rx
queue.
====================

Signed-off-by: David S. Miller <[email protected]>

John Allen [Wed, 2 Aug 2017 21:47:17 +0000 (16:47 -0500)]

ibmvnic: Implement .get_channels

Implement .get_channels (ethtool -l) functionality

Signed-off-by: John Allen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

John Allen [Wed, 2 Aug 2017 21:46:30 +0000 (16:46 -0500)]

ibmvnic: Implement .get_ringparam

Implement .get_ringparam (ethtool -g) functionality

Signed-off-by: John Allen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

John Allen [Wed, 2 Aug 2017 21:45:28 +0000 (16:45 -0500)]

ibmvnic: Convert vnic server reported statistics to cpu endian

The vnic server reports the statistics buffer in big endian format, so
the buffer must be converted to cpu endian in order to be displayed
correctly on little endian lpars.

Signed-off-by: John Allen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
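As a rough illustration of the conversion described above, the sketch below
assembles big endian 64-bit counters into host-order values in portable
userspace C. In kernel code the be64_to_cpu() helper does this job; the
function names and the assumption that every counter is 64 bits wide are
hypothetical and not taken from the ibmvnic driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble one big endian 64-bit value from its bytes.
 * Works the same on little endian and big endian hosts. */
uint64_t demo_be64_to_cpu(const uint8_t *p)
{
	uint64_t v = 0;

	assert(p != NULL);
	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | p[i];	/* most significant byte first */
	return v;
}

/* Convert a raw buffer of n big endian counters into host-order values. */
void demo_stats_to_cpu(const uint8_t *raw, uint64_t *out, size_t n)
{
	assert(raw != NULL && out != NULL);
	for (size_t i = 0; i < n; i++)
		out[i] = demo_be64_to_cpu(raw + 8 * i);
}
```

Without this step, a little endian host that read the raw bytes directly as a
64-bit integer would report byte-swapped, wildly wrong statistics.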

John Allen [Wed, 2 Aug 2017 21:44:14 +0000 (16:44 -0500)]

ibmvnic: Implement per-queue statistics reporting

Add counters to report number of packets, bytes, and dropped packets for
each transmit queue and number of packets, bytes, and interrupts for each
receive queue. Modify ethtool callbacks to report the new statistics.

Signed-off-by: John Allen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
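The counters listed above could be laid out roughly as follows. This is a
sketch under stated assumptions: the struct and function names are
hypothetical, and the accounting points (one call per transmit attempt, one
call per receive interrupt) are guesses rather than the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue counters mirroring the commit description. */
struct demo_tx_queue_stats {
	uint64_t packets;
	uint64_t bytes;
	uint64_t dropped_packets;
};

struct demo_rx_queue_stats {
	uint64_t packets;
	uint64_t bytes;
	uint64_t interrupts;
};

/* Account for one transmit attempt of len bytes on a queue. */
void demo_tx_account(struct demo_tx_queue_stats *s, size_t len, int sent)
{
	assert(s != NULL);
	if (!sent) {
		s->dropped_packets++;
		return;
	}
	s->packets++;
	s->bytes += len;
}

/* Account for one receive interrupt that delivered npkts packets. */
void demo_rx_account(struct demo_rx_queue_stats *s, size_t npkts,
		     size_t nbytes)
{
	assert(s != NULL);
	s->interrupts++;
	s->packets += npkts;
	s->bytes += nbytes;
}
```

Keeping one such struct per queue lets a statistics callback walk the queues
and emit each counter under a per-queue label.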

Neal Cardwell [Wed, 2 Aug 2017 19:59:58 +0000 (15:59 -0400)]

tcp: remove extra POLL_OUT added for finished active connect()

Commit 45f119bf936b ("tcp: remove header prediction") introduced a
minor bug: the sk_state_change() and sk_wake_async() notifications for
a completed active connection happen twice: once in this new spot
inside tcp_finish_connect() and once in the existing code in
tcp_rcv_synsent_state_process() immediately after it calls
tcp_finish_connect(). This commit removes the duplicate POLL_OUT
notifications.

Fixes: 45f119bf936b ("tcp: remove header prediction")
Signed-off-by: Neal Cardwell <[email protected]>
Cc: Florian Westphal <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Vivien Didelot [Wed, 2 Aug 2017 19:48:25 +0000 (15:48 -0400)]

net: dsa: bcm_sf2: dst in not an array

For a while now, ds->dst has no longer been an array but a simple
pointer to a dsa_switch_tree.

Fortunately, SF2 does not support multi-chip and thus ds->index is
always 0.

This patch substitutes 'ds->dst[ds->index].' with 'ds->dst->'.

Signed-off-by: Vivien Didelot <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
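Why the old expression happened to work can be shown with a small sketch:
indexing a pointer with 0 and dereferencing it name the same object. The
struct names and the cpu_port field here are hypothetical stand-ins, not the
real dsa_switch or dsa_switch_tree definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real DSA structures. */
struct demo_tree {
	int cpu_port;
};

struct demo_switch {
	struct demo_tree *dst;	/* a single tree, not an array */
	unsigned int index;	/* always 0 on single-chip hardware */
};

/* Old form: treats dst as an array; only valid while index == 0. */
int demo_cpu_port_old(const struct demo_switch *ds)
{
	assert(ds != NULL && ds->index == 0);
	return ds->dst[ds->index].cpu_port;
}

/* New form: says what is meant and does not depend on index. */
int demo_cpu_port_new(const struct demo_switch *ds)
{
	assert(ds != NULL);
	return ds->dst->cpu_port;
}
```

With index fixed at 0 both functions return the same value; any other index
would make the old form read past the single tree.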

Bhumika Goyal [Wed, 2 Aug 2017 17:57:14 +0000 (23:27 +0530)]

qlcnic: add const to bin_attribute structure

Add const to bin_attribute structure as it is only passed to the
functions sysfs_{remove/create}_bin_file. The corresponding
arguments are of type const, so declare the structure to be const.

Signed-off-by: Bhumika Goyal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Sowmini Varadhan [Wed, 2 Aug 2017 17:34:31 +0000 (10:34 -0700)]

rds: reduce memory footprint for RDS when transport is RDMA

RDS over IB does not use multipath RDS, so the array
of additional rds_conn_path structures is always superfluous
in this case. Reduce the memory footprint of the rds module
by making this a dynamic allocation predicated on whether
the transport is mp_capable.

Signed-off-by: Sowmini Varadhan <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Tested-by: Efrain Galaviz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
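The idea of sizing the path array by transport capability can be sketched as
below. Everything here is an assumption made for illustration: the names, the
DEMO_MAX_PATHS value and the use of calloc() are hypothetical and do not
reflect the actual rds code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_PATHS 8	/* hypothetical multipath fan-out */

struct demo_path {
	int index;
};

struct demo_transport {
	bool mp_capable;	/* does the transport use multipath? */
};

struct demo_conn {
	struct demo_path *paths;
	size_t npaths;
};

/* Allocate one path, or DEMO_MAX_PATHS when the transport is mp_capable.
 * Returns 0 on success and -1 if the allocation fails. */
int demo_conn_init(struct demo_conn *conn, const struct demo_transport *t)
{
	assert(conn != NULL && t != NULL);

	conn->npaths = t->mp_capable ? DEMO_MAX_PATHS : 1;
	conn->paths = calloc(conn->npaths, sizeof(*conn->paths));
	if (!conn->paths) {
		conn->npaths = 0;
		return -1;
	}
	for (size_t i = 0; i < conn->npaths; i++)
		conn->paths[i].index = (int)i;
	return 0;
}

void demo_conn_destroy(struct demo_conn *conn)
{
	assert(conn != NULL);
	free(conn->paths);
	conn->paths = NULL;
	conn->npaths = 0;
}
```

Compared with embedding a fixed array of DEMO_MAX_PATHS entries in every
connection, a transport that is not mp_capable pays for a single path only.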

Tonghao Zhang [Wed, 2 Aug 2017 16:34:15 +0000 (09:34 -0700)]

ipv4: Introduce ipip_offload_init helper function.

Add a helper that makes it convenient to initialize ipip offload. The
return value is checked, and a KERN_CRIT message is printed on failure.

Signed-off-by: Tonghao Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

William Tu [Wed, 2 Aug 2017 15:43:52 +0000 (08:43 -0700)]

bpf: fix the printing of ifindex

Save the ifindex before it gets zeroed so the invalid
ifindex can be printed out.

Signed-off-by: William Tu <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
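The ordering problem behind this fix can be sketched in a few lines: the
offending value has to be copied before the field is cleared, otherwise the
message always shows 0. The context struct, function name and message text
below are hypothetical and are not the code touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical context holding the interface index being validated. */
struct demo_ctx {
	unsigned int ifindex;
};

/* Reject an invalid ifindex: clear it in the context, but report the
 * value it had. Returns the length of the formatted message. */
int demo_report_bad_ifindex(struct demo_ctx *ctx, char *msg, size_t len)
{
	unsigned int bad;

	assert(ctx != NULL && msg != NULL);

	bad = ctx->ifindex;	/* save before it gets zeroed */
	ctx->ifindex = 0;

	/* Printing ctx->ifindex here would always show 0. */
	return snprintf(msg, len, "invalid ifindex %u", bad);
}
```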

David S. Miller [Thu, 3 Aug 2017 16:24:06 +0000 (09:24 -0700)]

Merge tag 'batadv-next-for-davem-20170802' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

- bump version strings, by Simon Wunderlich

- Remove unnecessary length qualifier, by Joe Perches

- Remove too short %pM field width, by Sven Eckelmann

- Remove return value handling from skb_put_data, by Sven Eckelmann

- Spelling fixes, by Colin Ian King

- Convert batman-adv.txt to reStructuredText, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <[email protected]>

Lin Yun Sheng [Wed, 2 Aug 2017 09:57:37 +0000 (17:57 +0800)]

net: hns: Add self-adaptive interrupt coalesce support in hns driver

When dealing with both low and high throughput, it is hard to achieve
both high performance and low latency. To achieve that, this patch
calculates the rx rate and adjusts the interrupt coalesce parameter
accordingly.

Signed-off-by: Yunsheng Lin <[email protected]>
Tested-by: Weiwei Deng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
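A rate-driven coalescing policy of the kind described above might look like
the sketch below. The thresholds, the microsecond values and the function
names are all hypothetical; the real hns driver may sample and step its
parameters quite differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical thresholds; not taken from the hns driver. */
#define DEMO_LOW_RATE_PPS	20000u
#define DEMO_HIGH_RATE_PPS	200000u

/* Received packets per second over an interval of elapsed_ms. */
uint64_t demo_rx_rate_pps(uint64_t pkts_now, uint64_t pkts_prev,
			  uint32_t elapsed_ms)
{
	assert(elapsed_ms > 0 && pkts_now >= pkts_prev);
	return (pkts_now - pkts_prev) * 1000u / elapsed_ms;
}

/* Pick an interrupt coalescing delay in microseconds: a short delay
 * keeps latency low at low rates, a long one batches work at high
 * rates. */
uint32_t demo_pick_coalesce_usecs(uint64_t rate_pps)
{
	if (rate_pps < DEMO_LOW_RATE_PPS)
		return 5;
	if (rate_pps < DEMO_HIGH_RATE_PPS)
		return 30;
	return 80;
}
```

A periodic task would sample the receive counter, feed the two samples to
demo_rx_rate_pps() and program the returned delay into the hardware.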

Julia Lawall [Wed, 2 Aug 2017 09:35:00 +0000 (11:35 +0200)]

X25: constify null_x25_address

null_x25_address is only used to access the string it contains, so it can
be const.

Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Radim Krčmář [Thu, 3 Aug 2017 15:59:58 +0000 (17:59 +0200)]

Merge tag 'kvm-arm-for-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/ARM Fixes for v4.13-rc4

- Yet another race with VM destruction plugged
- A set of small vgic fixes

