Git Repo - linux.git/log

bridge: vlan: fix possible null vlgrp deref while registering new port

While a new port is being initialized the rx_handler gets set, but the
vlans get initialized later in br_add_if() and in that window if we
receive a frame with a link-local address we can try to dereference
p->vlgrp in:
br_handle_frame() -> br_handle_local_finish() -> br_should_learn()

Fix this by checking vlgrp before using it.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: vlan: adjust rhashtable initial size and hash locks size

As Stephen pointed out the default initial size is more than we need, so
let's start small (4 elements, thus nelem_hint = 3). Also limit the hash
locks to the number of CPUs as we don't need any write-side scaling and
this looks like the minimum.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"(Relatively) a lot of reverts, mostly.

  Bugs have trickled in for a new feature in 4.2 (MTRR support in
  guests) so I'm reverting it all; let's not make this -rc period busier
  for KVM than it's been so far.  This covers the four reverts from me.

  The fifth patch is being reverted because Radim found a bug in the
  implementation of stable scheduler clock, *but* also managed to
  implement the feature entirely without hypervisor support.  So instead
  of fixing the hypervisor side we can remove it completely; 4.4 will
  get the new implementation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS
  Update KVM homepage Url
  Revert "KVM: SVM: use NPT page attributes"
  Revert "KVM: svm: handle KVM_X86_QUIRK_CD_NW_CLEARED in svm_get_mt_mask"
  Revert "KVM: SVM: Sync g_pat with guest-written PAT value"
  Revert "KVM: x86: apply guest MTRR virtualization on host reserved pages"
  Revert "KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR"

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/ipoib: increase the max mcast backlog queue
  IB/ipoib: Make sendonly multicast joins create the mcast group
  IB/ipoib: Expire sendonly multicast joins
  IB/mlx5: Remove pa_lkey usages
  IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY
  IB/iser: Add module parameter for always register memory
  xprtrdma: Replace global lkey with lkey local to PD

Merge branches 'pm-cpuidle', 'pm-opp' and 'pm-tools'

* pm-cpuidle:
  intel_idle: Skylake Client Support - updated

* pm-opp:
  PM / OPP: Fix typo modifcation -> modification
  PM / OPP: of_property_count_u32_elems() can return errors

* pm-tools:
  tools/power turbosat: update version number
  tools/power turbostat: SKL: Adjust for TSC difference from base frequency
  tools/power turbostat: KNL workaround for %Busy and Avg_MHz
  tools/power turbostat: IVB Xeon: fix --debug regression

Merge branch 'acpi-ec'

* acpi-ec:
ACPI / EC: Fix a memory leak issue in acpi_ec_query()

Merge branches 'pm-pci' and 'acpi-pci'

* pm-pci:
  PCI / PM: Update runtime PM documentation for PCI devices

* acpi-pci:
  ACPI / PCI: Remove duplicated penalty on SCI IRQ
  ACPI, PCI, irq: Do not share PCI IRQ with ISA IRQ

Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS

The cpu feature flags are not ever going to change, so warning
everytime can cause a lot of kernel log spam
(in our case more than 10GB/hour).

The warning seems to only occur when nested virtualization is
enabled, so it's probably triggered by a KVM bug. This is a
sensible and safe change anyway, and the KVM bug fix might not
be suitable for stable releases anyway.

Cc: [email protected]
Signed-off-by: Dirk Mueller <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Update KVM homepage Url

The old one appears to be a generic catch all page, which
is unhelpful.

Signed-off-by: Dirk Mueller <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge tag 'upstream-4.3-rc4' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS fixes from Richard Weinberger:
"This contains three bug fixes for both UBI and UBIFS"

* tag 'upstream-4.3-rc4' of git://git.infradead.org/linux-ubifs:
  UBI: return ENOSPC if no enough space available
  UBI: Validate data_size
  UBIFS: Kill unneeded locking in ubifs_init_security

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull key signing fixes from James Morris:
"Keyrings and modsign fixes from David Howells"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  MODSIGN: Change from CMS to PKCS#7 signing if the openssl is too old
  X.509: Don't strip leading 00's from key ID when constructing key description
  KEYS: Remove unnecessary header #inclusions from extract-cert.c
  KEYS: Fix race between key destruction and finding a keyring by name

Revert "KVM: SVM: use NPT page attributes"

This reverts commit 3c2e7f7de3240216042b61073803b61b9b3cfb22.
Initializing the mapping from MTRR to PAT values was reported to
fail nondeterministically, and it also caused extremely slow boot
(due to caching getting disabled---bug 103321) with assigned devices.

Reported-by: Markus Trippelsdorf <[email protected]>
Reported-by: Sebastian Schuette <[email protected]>
Cc: [email protected] # 4.2+
Signed-off-by: Paolo Bonzini <[email protected]>

Revert "KVM: svm: handle KVM_X86_QUIRK_CD_NW_CLEARED in svm_get_mt_mask"

This reverts commit 5492830370171b6a4ede8a3bfba687a8d0f25fa5.
It builds on the commit that is being reverted next.

Cc: [email protected] # 4.2+
Signed-off-by: Paolo Bonzini <[email protected]>

Revert "KVM: SVM: Sync g_pat with guest-written PAT value"

This reverts commit e098223b789b4a618dacd79e5e0dad4a9d5018d1,
which has a dependency on other commits being reverted.

Cc: [email protected] # 4.2+
Signed-off-by: Paolo Bonzini <[email protected]>

Revert "KVM: x86: apply guest MTRR virtualization on host reserved pages"

This reverts commit fd717f11015f673487ffc826e59b2bad69d20fe5.
It was reported to cause Machine Check Exceptions (bug 104091).

Reported-by: [email protected]
Cc: [email protected] # 4.2+
Signed-off-by: Paolo Bonzini <[email protected]>

Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:
"This fixes:

   - module autoload for 3 OF platform drivers
   - poweroff behaviour on bcm2835 watchdog device
   - I2C dependencies for iTCO_wdt.c"

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: iTCO: Fix dependencies on I2C
  watchdog: bcm2835: Fix poweroff behaviour
  watchdog: Fix module autoload for OF platform driver

Merge tag 'hwmon-for-linus-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmin fixes from Guenter Roeck:
"Fix module autoload for various drivers"

* tag 'hwmon-for-linus-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pwm-fan) Fix module autoload for OF platform driver
  hwmon: (gpio-fan) Fix module autoload for OF platform driver
  hwmon: (abx500) Fix module autoload for OF platform driver

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU fixes from Ingo Molnar:
"Two RCU fixes:

   - work around bug with recent GCC versions.

   - fix false positive lockdep splat"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Suppress lockdep false positive for rcp->exp_funnel_mutex
  rcu: Change _wait_rcu_gp() to work around GCC bug 67055

Initialize msg/shm IPC objects before doing ipc_addid()

As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
having initialized the IPC object state. Yes, we initialize the IPC
object in a locked state, but with all the lockless RCU lookup work,
that IPC object lock no longer means that the state cannot be seen.

We already did this for the IPC semaphore code (see commit e8577d1f0329:
"ipc/sem.c: fully initialize sem_array before making it visible") but we
clearly forgot about msg and shm.

Reported-by: Dmitry Vyukov <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mmc: core: fix dead loop of mmc_retune

When get a CRC error, start the mmc_retune, it will issue CMD19/CMD21
to do tune, assume there were 10 clock phase need to try, phase 0 to
phase 6 is ok, phase 7 to phase 9 is NG, we try it from 0 to 9, so
the last CMD19/CMD21 will get CRC error, host->need_retune was set and
cause mmc_retune was called, then dead loop of mmc_retune

Signed-off-by: Chaotian Jing <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Fixes: bd11e8bd03ca ("mmc: core: Flag re-tuning is needed on CRC errors")
Signed-off-by: Ulf Hansson <[email protected]>

i40e: fix 32 bit build warnings

Sparse found some issues with 32 bit compilation, which probably should
at least work without warning.  Not only that, but the code was wrong.
Thanks sparse!!

And thanks to the kbuild robot zero day testing for finding this issue.

$ make ARCH=i386 M=drivers/net/ethernet/intel/i40e C=2 CF="-D__CHECK_ENDIAN__"
  CHECK   drivers/net/ethernet/intel/i40e/i40e_main.c
  include/linux/etherdevice.h:79:32: warning: restricted __be16 degrades to integer
  drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (32) for type unsigned long
  drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (42) for type unsigned long
  drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (39) for type unsigned long
  drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (40) for type unsigned long

CC: [email protected]
Signed-off-by: Jesse Brandeburg <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: fix kbuild warnings

The 0day build infrastructure found some issues in i40e, this
removes the warnings by adding a harmless cast to a dev_info.

CC: [email protected]
Signed-off-by: Jesse Brandeburg <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40evf: tweak init timing

This patch tweaks the init timing of the driver just a little bit to
increase stability on load/unload and SR-IOV enable/disable cycles.

First, run the init_task loop a little quicker in order to reduce
overall init time.

Second, stagger the start of the init task based on the device's
PCIe function ID. This lessens the impact on the firmware when a
whole bunch of VFs are initialized simultaneously, e.g. enabling
SR-IOV without the VF driver blacklisted. For single VFs assigned
to VMs this will have no effect as the function ID will always be 0.

Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: warn on double free

Down was requesting queue disables, but then exited immediately without
waiting for the queues to actually disable. This could allow any
function called after i40evf_down to run immediately, including
i40evf_up, and causes a memory leak.

This issue has been fixed in a recent refactor of the reset code, but
add a couple WARN_ONs in the slow path to help us recognize if we
reintroduce this issue or if we missed any cases.

Change-ID: I27b6b5c9a79c1892f0ba453129f116bc32647dd0
Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: refactor interrupt enable

The interrupt enable function was always making the caller add
the base_vector from the VSI struct which is already passed to
the function. Just collapse the math into the helper function.

Change-ID: I54ef33aa7ceebc3231c3cc48f7b39fd0c3ff5806
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Strip VEB stats if they are disabled in HW

Due to performance reasons, VEB stats have been disabled in the hw. This
patch adds code to check for that condition before accumulating these
stats.

Change-ID: I7d805669476fedabb073790403703798ae5d878e
Signed-off-by: Anjali Singhai Jain <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: add new device id 1588

Add new device id and support for another 20Gb device.

Change-ID: Ib1b61e5bb6201d84953f97cade39a6e3369c2cf2
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Remove useless message

Remove a useless message that blathers on whenever a vxlan port is deleted.

Change-ID: If63fb8cf38e56cf433b68e498f11389de51919ba
Signed-off-by: Greg Rose <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: limit debugfs io ops

Don't let the debugfs register read and write commands try to access
outside of the ioremapped space. While we're at it, remove the use of
a misleading constant.

Change-ID: Ifce2893e232c65c7a76c23532c658f298218a81b
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: use QOS field consistently

In i40e_ndo_set_vf_port_vlan, we were using the QOS value
inconsistently, sometimes shifting it, sometimes not. Do the shift-and-
or operation correctly, once, and use the result consistently everywhere
in the function.

Change-ID: I46f062f3edc90a8a017ecec9137f4d1ab0ab9e41
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: count drops in netstat interface

The i40e rx_dropped counter was not showing up in netstat -i.
Add the right counter to be updated with the stats.

Change-ID: I4dd552e9995836099184f9d9a08e90edb591155f
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: fix Tx hang workaround code

The arm writeback (arm_wb) code is used for kicking the Tx ring to
make sure any pending work is completed even if interrupts are
disabled. It was running when it didn't need to, and not clearing
the ring->arm_wb state after it was set. This caused Tx hangs
to still occur occasionally when there really was no hang.
Fix this by resetting the variable right after it was used.

Change-ID: I7bf75d552ba9c4bd203d40615213861a24bb5594
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: fixup padding issue in get_cee_dcb_cfg_v1_resp

The struct i40e_aqc_get_cee_dcb_cfg_v1_resp was originally defined with
word boundary layout issues, which most compilers deal with by silently
adding padding, making the actual struct larger than designed.
This patch adds an extra byte in fields reserved3 and reserved4 to directly
acknowledge that padding.

Because the struct doesn't actually change in size or layout, this doesn't
constitute a change in the API.

Change-ID: I53fa4741b73fa255621232a85fba000b0e223015
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Fix a port VLAN configuration bug

If a port VLAN is set for a given virtual function (VF) before the VF
driver is loaded then a configuration error results in which the port
VLAN is ignored when the VF driver is subsequently loaded. This causes
the VF's MAC/VLAN filters to not use the correct VLAN filter. This
patch ensures that the port VLAN filter is considered at the right time
during configuration of the VF's MAC/VLAN filters.

Change-ID: I28f404cbc21a4c6d70a7980b87c77f13f06685a4
Signed-off-by: Greg Rose <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: fix up type clash in i40e_aq_rc_to_posix conversion

The error code sent into i40e_aq_rc_to_posix() are signed values, so we
really need to treat them as such.

Change-ID: I3d1ae0ee9ae0b1b6f5fc424f8b8cc58b0ea93203
Reported-by: Helin Zhang <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: rtnl_lock called twice in i40e_pci_error_resume()

Signed-off-by: Vasily Averin <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40evf: missing rtnl_unlock in i40evf_resume()

Signed-off-by: Vasily Averin <[email protected]>
Tested-by: Andrews Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

net: Initialize flow flags in input path

The fib_table_lookup tracepoint found 2 places where the flowi4_flags is
not initialized.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following pull request contains Netfilter/IPVS updates for net-next
containing 90 patches from Eric Biederman.

The main goal of this batch is to avoid recurrent lookups for the netns
pointer, that happens over and over again in our Netfilter/IPVS code. The idea
consists of passing netns pointer from the hook state to the relevant functions
and objects where this may be needed.

You can find more information on the IPVS updates from Simon Horman's commit
merge message:

c3456026adc0 ("Merge tag 'ipvs2-for-v4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next").

Exceptionally, this time, I'm not posting the patches again on netdev, Eric
already Cc'ed this mailing list in the original submission. If you need me to
make, just let me know.
====================

Signed-off-by: David S. Miller <[email protected]>

net: dsa: fix preparation of a port STP update

Because of the default 0 value of ret in dsa_slave_port_attr_set, a
driver may return -EOPNOTSUPP from the commit phase of a STP state,
which triggers a WARN() from switchdev.

This happened on a 6185 switch which does not support hardware bridging.

Fixes: 3563606258cf ("switchdev: convert STP update to switchdev attr set")
Reported-by: Andrew Lunn <[email protected]>
Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: fix preparation of a port STP update

Because of the default 0 value of ret in dsa_slave_port_attr_set, a
driver may return -EOPNOTSUPP from the commit phase of a STP state,
which triggers a WARN() from switchdev.

This happened on a 6185 switch which does not support hardware bridging.

Reported-by: Andrew Lunn <[email protected]>
Signed-off-by: Vivien Didelot <[email protected]>
Acked-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Add support for filtering neigh dump by master device

Add support for filtering neighbor dumps by master device by adding
the NDA_MASTER attribute to the dump request. A new netlink flag,
NLM_F_DUMP_FILTERED, is added to indicate the kernel supports the
request and output is filtered as requested.

Signed-off-by: David Ahern <[email protected]>
Acked-by: Roopa Prabhu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: fix tcp_v6_md5_do_lookup prototype

tcp_v6_md5_do_lookup() now takes a const socket, even if
CONFIG_TCP_MD5SIG is not set.

Fixes: b83e3deb974c ("tcp: md5: constify tcp_md5_do_lookup() socket argument")
From: Eric Dumazet <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'switchdev-callback'

Vivien Didelot says:

====================
net: switchdev: use specific switchdev_obj_*

This patchset changes switchdev add, del, dump operations from this:

    int     (*switchdev_port_obj_add)(struct net_device *dev,
                                      struct switchdev_obj *obj,
                                      struct switchdev_trans *trans);
    int     (*switchdev_port_obj_del)(struct net_device *dev,
                                      struct switchdev_obj *obj);
    int     (*switchdev_port_obj_dump)(struct net_device *dev,
                                      struct switchdev_obj *obj);

to something similar to the notifier_call callback of a notifier_block:

    int     (*switchdev_port_obj_add)(struct net_device *dev,
                                      enum switchdev_obj_id id,
                                      const void *obj,
                                      struct switchdev_trans *trans);
    int     (*switchdev_port_obj_del)(struct net_device *dev,
                                      enum switchdev_obj_id id,
                                      const void *obj);
    int     (*switchdev_port_obj_dump)(struct net_device *dev,
                                       enum switchdev_obj_id id, void *obj,
                                       int (*cb)(void *obj));

This allows the caller to pass and expect back a specific switchdev_obj_*
structure (e.g. switchdev_obj_fdb) instead of the generic switchdev_obj one.

This will simplify pushing the callback function down to the drivers.

The first 3 patches get rid of the dev parameter of the dump callback, since it
is not always neeeded (e.g. vlan_dump) and some drivers (such as DSA drivers)
may not have easy access to it.

Patches 4 and 5 implement the change in the switchdev operations and its users.

Patch 6 extracts the inner switchdev_obj_* structures from switchdev_obj and
removes this last one.

v2: fix error spotted by kbuild (extra ';' inline switchdev_port_obj_dump).
====================

Signed-off-by: David S. Miller <[email protected]>

net: switchdev: extract struct switchdev_obj_*

Now that switchdev and its drivers directly use specific switchdev_obj_*
structures, move them out of the switchdev_obj union and get rif of this
outer structure.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: switchdev: abstract object in add/del ops

Similar to the notifier_call callback of a notifier_block, change the
function signature of switchdev add and del operations to:

int switchdev_port_obj_add/del(struct net_device *dev,
enum switchdev_obj_id id, void *obj);

This allows the caller to pass a specific switchdev_obj_* structure
instead of the generic switchdev_obj one.

Drivers implementation of these operations and switchdev have been
changed accordingly.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: switchdev: pass callback to dump operation

Similar to the notifier_call callback of a notifier_block, change the
function signature of switchdev dump operation to:

    int switchdev_port_obj_dump(struct net_device *dev,
                                enum switchdev_obj_id id, void *obj,
                                int (*cb)(void *obj));

This allows the caller to pass and expect back a specific
switchdev_obj_* structure instead of the generic switchdev_obj one.

Drivers implementation of dump operation can now expect this specific
structure and call the callback with it. Drivers have been changed
accordingly.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: switchdev: remove dev from switchdev_obj cb

The net_device associated to a dump operation does not have to be passed
to the callback. switchdev stores it in a superset struct, if needed.

Also some drivers (such as DSA drivers) may not have easy access to it.

This will simplify pushing the callback function down to the drivers.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: switchdev: move dev in switchdev_fdb_dump

The FDB dump callback requires the related net_device so move it to the
struct switchdev_fdb_dump superset instead of using a callback param.

With this done, it'll be simpler to change the dump function signature.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: switchdev: remove dev in port_vlan_dump_put

The static switchdev_port_vlan_dump_put function does not need the
net_device parameter, so remove it.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

testptp: Silence compiler warnings on ppc64

When compiling Documentation/ptp/testptp.c the following compiler
warnings are printed out:

Documentation/ptp/testptp.c: In function ‘main’:
Documentation/ptp/testptp.c:367:11: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 3 has type ‘__s64’ [-Wformat=]
           event.t.sec, event.t.nsec);
           ^
Documentation/ptp/testptp.c:505:5: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 2 has type ‘__s64’ [-Wformat=]
     (pct+2*i)->sec, (pct+2*i)->nsec);
     ^
Documentation/ptp/testptp.c:507:5: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 2 has type ‘__s64’ [-Wformat=]
     (pct+2*i+1)->sec, (pct+2*i+1)->nsec);
     ^
Documentation/ptp/testptp.c:509:5: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 2 has type ‘__s64’ [-Wformat=]
     (pct+2*i+2)->sec, (pct+2*i+2)->nsec);

This happens because __s64 is by default defined as "long" on ppc64,
not as "long long". However, to fix these warnings, it's possible to
define the __SANE_USERSPACE_TYPES__ so that __s64 gets defined to
"long long" on ppc64, too.

Signed-off-by: Thomas Huth <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4: Handle return codes in mlx4_qp_attach_common

Both new_steering_entry() and existing_steering_entry() return values
based on their success or failure, but currently they fall through
silently. This can make troubleshooting difficult, as we were unable
to tell which one of these two functions returned errors or
specifically what code was returned. This patch remedies that
situation by passing the return codes to err, which is returned by
mlx4_qp_attach_common() itself.

This also addresses a leak in the call to mlx4_bitmap_free() as well.

Signed-off-by: Robb Manes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'm68k-netdev-modular'

Geert Uytterhoeven says:

====================
net: m68k: Allow modular build

This patch series makes the remaining m68k Ethernet drivers modular.
It's an alternative to the last 3 patches of Paul Gortmaker's series
"[PATCH net-next 0/6] make non-modular code explicitly non-modular".

Note that "[PATCH 5/5] net: macmace: Allow modular build" depends on
"[PATCH 4/5] m68k/mac: Export Peripheral System Controller (PSC) base
address to modules". Feel free to take the dependency through the netdev
tree to avoid modular build breakage.

This was compile-tested only (mac_defconfig + allmodconfig) due to lack
of hardware.
====================

Signed-off-by: David S. Miller <[email protected]>

net: macmace: Allow modular build

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

m68k/mac: Export Peripheral System Controller (PSC) base address to modules

If CONFIG_MACMACE=m:

ERROR: psc [drivers/net/ethernet/apple/macmace.ko] undefined!

Add the missing export to fix this.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hplance: Allow modular build

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: 7990: Export lance_poll() to modules

If CONFIG_HPLANCE=m and CONFIG_NET_POLL_CONTROLLER=y:

ERROR: "lance_poll" [drivers/net/ethernet/amd/hplance.ko] undefined!

Add the missing export to fix this.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mac8390: Allow modular build

The modular driver supports only one card, just like the built-in
driver.

Note that this limitation is a problem which affects all Nubus card
drivers, because they have to do all their own bus matching, because
Nubus still lacks the necessary driver model support.

Suggested-by: Finn Thain <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dsa: mv88e6xxx: Fix unsigned/signed issue

commit dea870242a9c ("dsa: mv88e6xxx: Allow speed/duplex of port to be
configured") leads to the following static checker warning:

        drivers/net/dsa/mv88e6xxx.c:585 mv88e6xxx_adjust_link()
        warn: unsigned 'ret' is never less than zero.

drivers/net/dsa/mv88e6xxx.c
   573  void mv88e6xxx_adjust_link(struct dsa_switch *ds, int port,
   574                             struct phy_device *phydev)
   575  {
   576          struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
   577          u32 ret, reg;
   578
   579          if (!phy_is_pseudo_fixed_link(phydev))
   580                  return;
   581
   582          mutex_lock(&ps->smi_mutex);
   583
   584          ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL);
   585          if (ret < 0)

Make ret an int, which is the return type for _mv88e6xxx_reg_read()

Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dsa: mv88e6xxx: Enable forwarding for unknown to the CPU port

Frames destined to an unknown address must be forwarded to the CPU
port. Otherwise incoming ARP, dhcp leases, etc, do not work.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'L3_master_device'

David Ahern says:

====================
net: L3 master device

The VRF device is essentially a Layer 3 master device used to associate
netdevices with a specific routing table and to influence FIB lookups
via 'ip rules' and controlling the oif/iif used for the lookup.

This series generalizes the VRF into L3 master device, l3mdev. Similar
to switchdev it has a Kconfig option and separate set of operations
in net_device allowing it to be completely compiled out if not wanted.
The l3mdev methods rely on the 'master' aspect and use of
netdev_master_upper_dev_get_rcu to retrieve the master device from a
given netdevice if it is enslaved to an L3_MASTER.

The VRF device is converted to use the l3mdev operations. At the end the
vrf_ptr is no longer and removed, as are all direct references to VRF.
The end result is a much simpler implementation for VRF.

Thanks to Nikolay for suggestions (eg., use of the master linkage which
is the key to making this work) and to Roopa, Andy and Shrijeet for
early reviews.

v3
- added license header to l3mdev.c

- export symbols in l3mdev.c for use with GPL modules

- removed netdevice header from l3mdev.h (not needed) and fixed
typo in comment

v2
- rebased to top of net-next
- addressed Niks comments (checking master, removing extra lines, and
flipping the order of patches 1 and 2)
====================

Signed-off-by: David S. Miller <[email protected]>

net: Move netif_index_is_l3_master to l3mdev.h

Change CONFIG dependency to CONFIG_NET_L3_MASTER_DEV as well.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Remove vrf header file

Move remaining structs to VRF driver and delete the vrf header file.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Remove the now unused vrf_ptr

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Replace calls to vrf_dev_get_rth

Replace calls to vrf_dev_get_rth with l3mdev_get_rtable.
The check on the flow flags is handled in the l3mdev operation.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Replace vrf_dev_table and friends

Replace calls to vrf_dev_table and friends with l3mdev_fib_table
and kin.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Replace vrf_master_ifindex{, _rcu} with l3mdev equivalents

Replace calls to vrf_master_ifindex_rcu and vrf_master_ifindex with either
l3mdev_master_ifindex_rcu or l3mdev_master_ifindex.

The pattern:
oif = vrf_master_ifindex(dev) ? : dev->ifindex;
is replaced with
oif = l3mdev_fib_oif(dev);

And remove the now unused vrf macros.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Add support for l3mdev ops to VRF driver

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Introduce L3 Master device abstraction

L3 master devices allow users of the abstraction to influence FIB lookups
for enslaved devices. Current API provides a means for the master device
to return a specific FIB table for an enslaved device, to return an
rtable/custom dst and influence the OIF used for fib lookups.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER

Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER and update the name of the
netif_is_vrf and netif_index_is_vrf macros.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'listener-refactoring-preparations'

Eric Dumazet says:

====================
tcp: listener refactoring preparations

This patch series makes changes to TCP/DCCP stacks so that
we can switch listener code to lockless mode.

This is done by marking const the listener socket in all
appropriate paths.

FastOpen code had to be changed to not dynamically allocate
a very small structure to make code simpler for following changes.
====================

Signed-off-by: David S. Miller <[email protected]>

tcp: prepare fastopen code for upcoming listener changes

While auditing TCP stack for upcoming 'lockless' listener changes,
I found I had to change fastopen_init_queue() to properly init the object
before publishing it.

Otherwise an other cpu could try to lock the spinlock before it gets
properly initialized.

Instead of adding appropriate barriers, just remove dynamic memory
allocations :
- Structure is 28 bytes on 64bit arches. Using additional 8 bytes
for holding a pointer seems overkill.
- Two listeners can share same cache line and performance would suffer.

If we really want to save few bytes, we would instead dynamically allocate
whole struct request_sock_queue in the future.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: constify tcp_syn_flood_action() socket argument

tcp_syn_flood_action() will soon be called with unlocked socket.
In order to avoid SYN flood warning being emitted multiple times,
use xchg().
Extend max_qlen_log and synflood_warned fields in struct listen_sock
to u32

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: constify tcp_v{4|6}_route_req() sock argument

These functions do not change the listener socket.
Goal is to make sure tcp_conn_request() is not messing with
listener in a racy way.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: cookie_init_sequence() cleanups

Some common IPv4/IPv6 code can be factorized.
Also constify cookie_init_sequence() socket argument.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp/dccp: constify syn_recv_sock() method sock argument

We'll soon no longer hold listener socket lock, these
functions do not modify the socket in any way.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: constify tcp_create_openreq_child() socket argument

This method does not touch the listener socket.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: constify dccp_create_openreq_child() sock argument

socket no longer needs to be read/write

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: constify sk_gfp_atomic() sock argument

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

inet: constify __inet_inherit_port() sock argument

socket is not touched, make it const.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

inet: constify inet_csk_route_child_sock() socket argument

The socket points to the (shared) listener.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: use inet6_csk_route_req() helper

Before changing dccp_v6_request_recv_sock() sock argument
to const, we need to get rid of security_sk_classify_flow(),
and it seems doable by reusing inet6_csk_route_req() helper.

We need to add a proto parameter to inet6_csk_route_req(),
not assume it is TCP.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: remove tcp_rcv_state_process() tcp_hdr argument

Factorize code to get tcp header from skb. It makes no sense
to duplicate code in callers.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: remove unused len argument from tcp_rcv_state_process()

Once we realize tcp_rcv_synsent_state_process() does not use
its 'len' argument and we get rid of it, then it becomes clear
this argument is no longer used in tcp_rcv_state_process()

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp/dccp: constify send_synack and send_reset socket argument

None of these functions need to change the socket, make it
const.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

skbuff: Fix skb checksum partial check.

Earlier patch 6ae459bda tried to detect void ckecksum partial
skb by comparing pull length to checksum offset. But it does
not work for all cases since checksum-offset depends on
updates to skb->data.

Following patch fixes it by validating checksum start offset
after skb-data pointer is updated. Negative value of checksum
offset start means there is no need to checksum.

Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin <[email protected]>
Signed-off-by: Pravin B Shelar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ipv4-routing-cleanups'

Alexander Duyck says:

====================
Minor IPv4 routing cleanups

These patches just contain some minor cleanups to address a few minor
issues. The first and the third mostly just improve readability. The
second patch should improve the performance for multicast destination
addresses that do not have a localhost source IP address by avoiding some
unnecessary dereferences.
====================

Signed-off-by: David S. Miller <[email protected]>

net: Remove martian_source_keep_err goto label

err is initialized to -EINVAL when it is declared. It is not reset until
fib_lookup which is well after the 3 users of the martian_source jump. So
resetting err to -EINVAL at martian_source label is not needed.

Removing that line obviates the need for the martian_source_keep_err label
so delete it.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Swap ordering of tests in ip_route_input_mc

This patch just swaps the ordering of one of the conditional tests in
ip_route_input_mc. Specifically it swaps the testing for the source
address to see if it is loopback, and the test to see if we allow a
loopback source address.

The reason for swapping these two tests is because it is much faster to
test if an address is loopback than it is to dereference several pointers
to get at the net structure to see if the use of loopback is allowed.

Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ipv4: Pass proto as u8 instead of u16 in ip_check_mc_rcu

This patch updates ip_check_mc_rcu so that protocol is passed as a u8
instead of a u16.

The motivation is just to avoid any unneeded type transitions since some
systems will require an instruction to zero extend a u8 field to a u16.
Also it makes it a bit more readable as to the fact that protocol is a u8
so there are no byte ordering changes needed to pass it.

Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set

Wolfgang reported that IPv6 stack is ignoring oif in output route lookups:

    With ipv6, ip -6 route get always returns the specific route.

    $ ip -6 r
    2001:db8:e2::1 dev enp2s0  proto kernel  metric 256
    2001:db8:e2::/64 dev enp2s0  metric 1024
    2001:db8:e3::1 dev enp3s0  proto kernel  metric 256
    2001:db8:e3::/64 dev enp3s0  metric 1024
    fe80::/64 dev enp3s0  proto kernel  metric 256
    default via 2001:db8:e3::255 dev enp3s0  metric 1024

    $ ip -6 r get 2001:db8:e2::100
    2001:db8:e2::100 from :: dev enp2s0  src 2001:db8:e3::1  metric 0
        cache

    $ ip -6 r get 2001:db8:e2::100 oif enp3s0
    2001:db8:e2::100 from :: dev enp2s0  src 2001:db8:e3::1  metric 0
        cache

The stack does consider the oif but a mismatch in rt6_device_match is not
considered fatal because RT6_LOOKUP_F_IFACE is not set in the flags.

Cc: Wolfgang Nothdurft <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

RESEND: [PATCH v3 net-next] sky2: use random address if EEPROM is bad

On some embedded systems the EEPROM does not contain a valid MAC address.
In that case it is better to fallback to a generated mac address and
let init scripts fix the value later.

Reported-by: Liviu Dudau <[email protected]>
Signed-off-by: Stephen Hemminger <[email protected]>
[Changed handcoded setup to use eth_hw_addr_random() and to save new address into HW]
Signed-off-by: Liviu Dudau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netpoll: Drop budget parameter from NAPI polling call hierarchy

For some reason we were carrying the budget value around between the
various calls to napi->poll.  If for example one of the drivers called had
a bug in which it returned a non-zero value for work this could result in
the budget value becoming negative.

Rather than carry around a value of budget that is 0 or less we can instead
just loop through and pass 0 to each napi->poll call.  If any driver
returns a value for work done that is non-zero then we can report that
driver and continue rather than allowing a bad actor to make the budget
value negative and pass that negative value to napi->poll.

Note, the only actual change here is that instead of letting budget become
negative we are keeping it at 0 regardless of the value returned for work
since it should not be possible for the polling routine to do any actual
work with a budget of 0.  So if the polling routine returns a non-0 value
we are just reporting it and continuing with a budget of 0 rather than
letting that work value be subtracted from the budget of 0.

Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net sysfs: Print link speed as signed integer

Otherwise 4294967295 (MBit/s) (-1) will be printed when there is no link.
Documentation/ABI/testing/sysfs-class-net does not state if this shall be
signed or unsigned.
Also remove the now unused variable fmt_udec.

Signed-off-by: Alexander Stein <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bna: fix error handling

Several functions can return negative value in case of error,
so their return type should be fixed as well as type of variables
to which this value is assigned.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'af_unix_MSG_PEEK'

Aaron Conole says:

====================
af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag

This patch set implements a bugfix for kernel.org bugzilla #12323, allowing
MSG_PEEK to return all queued data on the unix domain socket, not just the
data contained in a single SKB.

This is the v3 version of this patch, which includes a suggested modification
by Eric Dumazet to convert the unix_sk() conversion macro to a static inline
function. These patches are independent and can be applied separately.

This set was tested over a 24-hour period, utilizing a loop continually
executing the bugzilla issue attached python code. It was instrumented with
a pr_err_once() ([ 13.798683] unix: went there at least one time).

v2->v3:
- Added Eric Dumazet's suggestion for #define to static inline
- Fixed an issue calling unix_state_lock() with an invalid argument

v3->v4:
- Eliminated an XXX comment
- Changed from goto unlock to explicit unix_state_unlock() and break
====================

Signed-off-by: David S. Miller <[email protected]>

af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag

AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
is set.

This is referenced in kernel bugzilla #12323 @
https://bugzilla.kernel.org/show_bug.cgi?id=12323

As described both in the BZ and lkml thread @
http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
AF_UNIX socket only reads a single skb, where the desired effect is
to return as much skb data has been queued, until hitting the recv
buffer size (whichever comes first).

The modified MSG_PEEK path will now move to the next skb in the tree
and jump to the again: label, rather than following the natural loop
structure. This requires duplicating some of the loop head actions.

This was tested using the python socketpair python code attached to
the bugzilla issue.

Signed-off-by: Aaron Conole <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

af_unix: Convert the unix_sk macro to an inline function for type safety

As suggested by Eric Dumazet this change replaces the
#define with a static inline function to enjoy
complaints by the compiler when misusing the API.

Signed-off-by: Aaron Conole <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: vlan: add per-vlan struct and move to rhashtables

This patch changes the bridge vlan implementation to use rhashtables
instead of bitmaps. The main motivation behind this change is that we
need extensible per-vlan structures (both per-port and global) so more
advanced features can be introduced and the vlan support can be
extended. I've tried to break this up but the moment net_port_vlans is
changed and the whole API goes away, thus this is a larger patch.
A few short goals of this patch are:
- Extensible per-vlan structs stored in rhashtables and a sorted list
- Keep user-visible behaviour (compressed vlans etc)
- Keep fastpath ingress/egress logic the same (optimizations to come
later)

Here's a brief list of some of the new features we'd like to introduce:
- per-vlan counters
- vlan ingress/egress mapping
- per-vlan igmp configuration
- vlan priorities
- avoid fdb entries replication (e.g. local fdb scaling issues)

The structure is kept single for both global and per-port entries so to
avoid code duplication where possible and also because we'll soon introduce
"port0 / aka bridge as port" which should simplify things further
(thanks to Vlad for the suggestion!).

Now we have per-vlan global rhashtable (bridge-wide) and per-vlan port
rhashtable, if an entry is added to a port it'll get a pointer to its
global context so it can be quickly accessed later. There's also a
sorted vlan list which is used for stable walks and some user-visible
behaviour such as the vlan ranges, also for error paths.
VLANs are stored in a "vlan group" which currently contains the
rhashtable, sorted vlan list and the number of "real" vlan entries.
A good side-effect of this change is that it resembles how hw keeps
per-vlan data.
One important note after this change is that if a VLAN is being looked up
in the bridge's rhashtable for filtering purposes (or to check if it's an
existing usable entry, not just a global context) then the new helper
br_vlan_should_use() needs to be used if the vlan is found. In case the
lookup is done only with a port's vlan group, then this check can be
skipped.

Things tested so far:
- basic vlan ingress/egress
- pvids
- untagged vlans
- undef CONFIG_BRIDGE_VLAN_FILTERING
- adding/deleting vlans in different scenarios (with/without global ctx,
while transmitting traffic, in ranges etc)
- loading/removing the module while having/adding/deleting vlans
- extracting bridge vlan information (user ABI), compressed requests
- adding/deleting fdbs on vlans
- bridge mac change, promisc mode
- default pvid change
- kmemleak ON during the whole time

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mvneta_percpu_irq'

Gregory CLEMENT says:

====================
net: mvneta: Switch to per-CPU irq and make rxq_def useful

As stated in the first version: "this patchset reworks the Marvell
neta driver in order to really support its per-CPU interrupts, instead
of faking them as SPI, and allow the use of any RX queue instead of
the hardcoded RX queue 0 that we have currently."

Following the review which has been done, Maxime started adding the
CPU hotplug support. I continued his work a few weeks ago and here is
the result.

Since the 1st version the main change is this CPU hotplug support, in
order to validate it I powered up and down the CPUs while performing
iperf. I ran the tests during hours: the kernel didn't crash and the
network interfaces were still usable. Of course it impacted the
performance, but continuously power down and up the CPUs is not
something we usually do.

I also reorganized the series, the 3 first patches should go through
the irq subsystem, whereas the 4 others should go to the network
subsystem.

However, there is a runtime dependency between the two parts. Patch 5
depend on the patch 3 to be able to use the percpu irq.

Thanks,

Gregory

PS: Thanks to Willy who gave me some pointers on how to deal with the
NAPI.
====================

Signed-off-by: David S. Miller <[email protected]>