]> Git Repo - linux.git/log
linux.git
3 years agonet: dsa: sja1105: unregister the MDIO buses during teardown
Vladimir Oltean [Wed, 11 Aug 2021 11:59:45 +0000 (14:59 +0300)]
net: dsa: sja1105: unregister the MDIO buses during teardown

The call to sja1105_mdiobus_unregister is present in the error path but
absent from the main driver unbind path.

Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: dsa: mt7530: fix VLAN traffic leaks again
DENG Qingfang [Wed, 11 Aug 2021 09:50:43 +0000 (17:50 +0800)]
net: dsa: mt7530: fix VLAN traffic leaks again

When a port leaves a VLAN-aware bridge, the current code does not clear
other ports' matrix field bit. If the bridge is later set to VLAN-unaware
mode, traffic in the bridge may leak to that port.

Remove the VLAN filtering check in mt7530_port_bridge_leave.

Fixes: 474a2ddaa192 ("net: dsa: mt7530: fix VLAN traffic leaks")
Fixes: 83163f7dca56 ("net: dsa: mediatek: add VLAN support for MT7530")
Signed-off-by: DENG Qingfang <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: phy: nxp-tja11xx: log critical health state
Oleksij Rempel [Wed, 11 Aug 2021 06:37:12 +0000 (08:37 +0200)]
net: phy: nxp-tja11xx: log critical health state

TJA1102 provides interrupt notification for the critical health states
like overtemperature and undervoltage.

The overtemperature bit is set if package temperature is beyond 155C°.
This functionality was tested by heating the package up to 200C°

The undervoltage bit is set if supply voltage drops beyond some critical
threshold. Currently not tested.

In a typical use case, both of this events should be logged and stored
(or send to some remote system) for further investigations.

Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'pktgen-imix'
David S. Miller [Thu, 12 Aug 2021 09:50:33 +0000 (10:50 +0100)]
Merge branch 'pktgen-imix'

Nick Richardson says:

====================
pktgen: Add IMIX mode

Adds internet mix (IMIX) mode to pktgen. Internet mix is
included in many user-space network perf testing tools. It allows
for the user to specify a distribution of discrete packet sizes to be
generated. This type of test is common among vendors when perf testing
their devices.
link: https://datatracker.ietf.org/doc/html/rfc2544#section-9.1]
This allows users to get a
more complete picture of how their device will perform in the
real-world.

This feature adds a command that allows users to specify an imix
distribution in the following format:
  imix_weights size_1,weight_1 size_2,weight_2 ... size_n,weight_n

The distribution of packets with size_i will be
(weight_i / total_weights) where
total_weights = weight_1 + weight_2 + ... + weight_n

For example:
  imix_weights 40,7 576,4 1500,1

The pkt_size "40" will account for 7 / (7 + 4 + 1) = ~58% of the total
packets sent.

This patch was tested with the following:
1. imix_weights = 40,7 576,4 1500,1
2. imix_weights = 0,7 576,4 1500,1
  - Packet size of 0 is resized to the minimum, 42
3. imix_weights = 40,7 576,4 1500,1 count = 0
  - Zero count.
  - Runs until user stops pktgen.
Invalid Configurations
1. clone_skb = 200 imix_weights = 40,7 576,4 1500,1
    - Returns error code -524 (-ENOTSUPP) when setting imix_weights
2. len(imix_weights) > MAX_IMIX_ENTRIES
    - Returns -7 (-E2BIG)

This patch is split into three parts, each provide different aspects of
required functionality:
  1. Parse internet mix input.
  2. Add IMIX Distribution representation.
  3. Process and output IMIX results.

Changes in v2:
* Remove __ prefix outside of uAPI.
* Use seq_puts instead of seq_printf where necessary.
* Reorder variable declaration.
* Return -EINVAL instead of -ENOTSUPP when using IMIX with clone_skb > 0
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agopktgen: Add output for imix results
Nick Richardson [Tue, 10 Aug 2021 19:01:55 +0000 (19:01 +0000)]
pktgen: Add output for imix results

The bps for imix mode is calculated by:
sum(imix_entry.size) / time_elapsed

The actual counts of each imix_entry are displayed under the
"Current:" section of the interface output in the following format:
imix_size_counts: size_1,count_1 size_2,count_2 ... size_n,count_n

Example (count = 200000):
imix_weights: 256,1 859,3 205,2
imix_size_counts: 256,32082 859,99796 205,68122
Result: OK: 17992362(c17964678+d27684) usec, 200000 (859byte,0frags)
  11115pps 47Mb/sec (47977140bps) errors: 0

Summary of changes:
Calculate bps based on imix counters when in IMIX mode.
Add output for IMIX counters.

Signed-off-by: Nick Richardson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agopktgen: Add imix distribution bins
Nick Richardson [Tue, 10 Aug 2021 19:01:54 +0000 (19:01 +0000)]
pktgen: Add imix distribution bins

In order to represent the distribution of imix packet sizes, a
pre-computed data structure is used. It features 100 (IMIX_PRECISION)
"bins". Contiguous ranges of these bins represent the respective
packet size of each imix entry. This is done to avoid the overhead of
selecting the correct imix packet size based on the corresponding weights.

Example:
imix_weights 40,7 576,4 1500,1
total_weight = 7 + 4 + 1 = 12

pkt_size 40 occurs 7/total_weight = 58% of the time
pkt_size 576 occurs 4/total_weight = 33% of the time
pkt_size 1500 occurs 1/total_weight = 9% of the time

We generate a random number between 0-100 and select the corresponding
packet size based on the specified weights.
Eg. random number = 358723895 % 100 = 65
Selects the packet size corresponding to index:65 in the pre-computed
imix_distribution array.
An example of the  pre-computed array is below:

The imix_distribution will look like the following:
0        ->  0 (index of imix_entry.size == 40)
1        ->  0 (index of imix_entry.size == 40)
2        ->  0 (index of imix_entry.size == 40)
[...]    ->  0 (index of imix_entry.size == 40)
57       ->  0 (index of imix_entry.size == 40)
58       ->  1 (index of imix_entry.size == 576)
[...]    ->  1 (index of imix_entry.size == 576)
90       ->  1 (index of imix_entry.size == 576)
91       ->  2 (index of imix_entry.size == 1500)
[...]    ->  2 (index of imix_entry.size == 1500)
99       ->  2 (index of imix_entry.size == 1500)

Create and use "bin" representation of the imix distribution.

Signed-off-by: Nick Richardson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agopktgen: Parse internet mix (imix) input
Nick Richardson [Tue, 10 Aug 2021 19:01:53 +0000 (19:01 +0000)]
pktgen: Parse internet mix (imix) input

Adds "imix_weights" command for specifying internet mix distribution.

The command is in this format:
"imix_weights size_1,weight_1 size_2,weight_2 ... size_n,weight_n"
where the probability that packet size_i is picked is:
weight_i / (weight_1 + weight_2 + .. + weight_n)

The user may provide up to 100 imix entries (size_i,weight_i) in this
command.

The user specified imix entries will be displayed in the "Params"
section of the interface output.

Values for clone_skb > 0 is not supported in IMIX mode.

Summary of changes:
Add flag for enabling internet mix mode.
Add command (imix_weights) for internet mix input.
Return -ENOTSUPP when clone_skb > 0 in IMIX mode.
Display imix_weights in Params.
Create data structures to store imix entries and distribution.

Signed-off-by: Nick Richardson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoRevert "tipc: Return the correct errno code"
Hoang Le [Wed, 11 Aug 2021 01:22:09 +0000 (08:22 +0700)]
Revert "tipc: Return the correct errno code"

This reverts commit 0efea3c649f0 because of:
- The returning -ENOBUF error is fine on socket buffer allocation.
- There is side effect in the calling path
tipc_node_xmit()->tipc_link_xmit() when checking error code returning.

Fixes: 0efea3c649f0 ("tipc: Return the correct errno code")
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: Hoang Le <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: mscc: Fix non-GPL export of regmap APIs
Mark Brown [Tue, 10 Aug 2021 12:37:48 +0000 (13:37 +0100)]
net: mscc: Fix non-GPL export of regmap APIs

The ocelot driver makes use of regmap, wrapping it with driver specific
operations that are thin wrappers around the core regmap APIs. These are
exported with EXPORT_SYMBOL, dropping the _GPL from the core regmap
exports which is frowned upon. Add _GPL suffixes to at least the APIs that
are doing register I/O.

Signed-off-by: Mark Brown <[email protected]>
Acked-by: Alexandre Belloni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 12 Aug 2021 06:00:55 +0000 (20:00 -1000)]
Merge tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull orphan section linker fix from Kees Cook:

 - Handle changes to Clang's Sanitizer section layout (Nathan
   Chancellor)

* tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  vmlinux.lds.h: Handle clang's module.{c,d}tor sections

3 years agoMerge tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 12 Aug 2021 05:56:10 +0000 (19:56 -1000)]
Merge tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp fixes from Kees Cook:

 - Fix typo in user notification documentation (Rodrigo Campos)

 - Fix userspace counter report when using TSYNC (Hsuan-Chi Kuo, Wiktor
   Garbacz)

* tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Fix setting loaded filter count during TSYNC
  Documentation: seccomp: Fix typo in user notification

3 years agoMerge tag 'amd-drm-fixes-5.14-2021-08-11' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 12 Aug 2021 03:38:12 +0000 (13:38 +1000)]
Merge tag 'amd-drm-fixes-5.14-2021-08-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.14-2021-08-11:

amdgpu:
- Yellow carp update
- RAS EEPROM fixes
- BACO/BOCO fixes
- Fix a memory leak in an error path
- Freesync fix
- VCN harvesting fix
- Display fixes

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
3 years agonet: bridge: vlan: fix global vlan option range dumping
Nikolay Aleksandrov [Tue, 10 Aug 2021 09:21:39 +0000 (12:21 +0300)]
net: bridge: vlan: fix global vlan option range dumping

When global vlan options are equal sequentially we compress them in a
range to save space and reduce processing time. In order to have the
proper range end id we need to update range_end if the options are equal
otherwise we get ranges with the same end vlan id as the start.

Fixes: 743a53d9636a ("net: bridge: vlan: add support for dumping global vlan options")
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agomctp: Specify route types, require rtm_type in RTM_*ROUTE messages
Jeremy Kerr [Tue, 10 Aug 2021 02:38:34 +0000 (10:38 +0800)]
mctp: Specify route types, require rtm_type in RTM_*ROUTE messages

This change adds a 'type' attribute to routes, which can be parsed from
a RTM_NEWROUTE message. This will help to distinguish local vs. peer
routes in a future change.

This means userspace will need to set a correct rtm_type in RTM_NEWROUTE
and RTM_DELROUTE messages; we currently only accept RTN_UNICAST.

Signed-off-by: Jeremy Kerr <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: igmp: increase size of mr_ifc_count
Eric Dumazet [Wed, 11 Aug 2021 19:57:15 +0000 (12:57 -0700)]
net: igmp: increase size of mr_ifc_count

Some arches support cmpxchg() on 4-byte and 8-byte only.
Increase mr_ifc_count width to 32bit to fix this problem.

Fixes: 4a2b285e7e10 ("net: igmp: fix data-race in igmp_ifc_timer_expire()")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: hns3: add support for triggering reset by ethtool
Yufeng Mo [Tue, 10 Aug 2021 13:28:48 +0000 (21:28 +0800)]
net: hns3: add support for triggering reset by ethtool

Currently, four reset types are supported for the HNS3 ethernet
driver: IMP reset, global reset, function reset, and FLR. Only
FLR can now be triggered by the user. To restore the device when
an exception occurs, add support for triggering reset by ethtool.

Run the "ethtool --reset DEVNAME mgmt | all | dedicated" to
trigger the IMP | global | function reset manually.

In addition, VF can only trigger function reset.

Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Guangbin Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agoMAINTAINERS: switch to my OMP email for Renesas Ethernet drivers
Sergey Shtylyov [Tue, 10 Aug 2021 20:17:12 +0000 (23:17 +0300)]
MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers

I'm still going to continue looking after the Renesas Ethernet drivers and
device tree bindings. Now my new employer, Open Mobile Platform (OMP), will
pay for all my upstream work. Let's switch to my OMP email for the reviews.

Signed-off-by: Sergey Shtylyov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agotcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
Neal Cardwell [Wed, 11 Aug 2021 02:40:56 +0000 (22:40 -0400)]
tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets

Currently if BBR congestion control is initialized after more than 2B
packets have been delivered, depending on the phase of the
tp->delivered counter the tracking of BBR round trips can get stuck.

The bug arises because if tp->delivered is between 2^31 and 2^32 at
the time the BBR congestion control module is initialized, then the
initialization of bbr->next_rtt_delivered to 0 will cause the logic to
believe that the end of the round trip is still billions of packets in
the future. More specifically, the following check will fail
repeatedly:

  !before(rs->prior_delivered, bbr->next_rtt_delivered)

and thus the connection will take up to 2B packets delivered before
that check will pass and the connection will set:

  bbr->round_start = 1;

This could cause many mechanisms in BBR to fail to trigger, for
example bbr_check_full_bw_reached() would likely never exit STARTUP.

This bug is 5 years old and has not been observed, and as a practical
matter this would likely rarely trigger, since it would require
transferring at least 2B packets, or likely more than 3 terabytes of
data, before switching congestion control algorithms to BBR.

This patch is a stable candidate for kernels as far back as v4.9,
when tcp_bbr.c was added.

Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <[email protected]>
Reviewed-by: Yuchung Cheng <[email protected]>
Reviewed-by: Kevin Yang <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agoMerge branch 'bonding-cleanup-header-file-and-error-msgs'
Jakub Kicinski [Wed, 11 Aug 2021 21:57:33 +0000 (14:57 -0700)]
Merge branch 'bonding-cleanup-header-file-and-error-msgs'

Jonathan Toppins says:

====================
bonding: cleanup header file and error msgs

Two small patches removing unreferenced symbols and unifying error
messages across netlink and printk.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agobonding: combine netlink and console error messages
Jonathan Toppins [Wed, 11 Aug 2021 02:53:31 +0000 (22:53 -0400)]
bonding: combine netlink and console error messages

There seems to be no reason to have different error messages between
netlink and printk. It also cleans up the function slightly.

Signed-off-by: Jonathan Toppins <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agobonding: remove extraneous definitions from bonding.h
Jonathan Toppins [Wed, 11 Aug 2021 02:53:30 +0000 (22:53 -0400)]
bonding: remove extraneous definitions from bonding.h

All of the symbols either only exist in bond_options.c or nowhere at
all. These symbols were verified to not exist in the code base by
using `git grep` and their removal was verified by compiling bonding.ko.

Signed-off-by: Jonathan Toppins <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: pcs: xpcs: fix error handling on failed to allocate memory
Wong Vee Khee [Tue, 10 Aug 2021 08:58:12 +0000 (16:58 +0800)]
net: pcs: xpcs: fix error handling on failed to allocate memory

Drivers such as sja1105 and stmmac that call xpcs_create() expects an
error returned by the pcs-xpcs module, but this was not the case on
failed to allocate memory.

Fixed this by returning an -ENOMEM instead of a NULL pointer.

Fixes: 3ad1d171548e ("net: dsa: sja1105: migrate to xpcs for SGMII")
Signed-off-by: Wong Vee Khee <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: mscc: Fix non-GPL export of regmap APIs
Mark Brown [Tue, 10 Aug 2021 12:37:48 +0000 (13:37 +0100)]
net: mscc: Fix non-GPL export of regmap APIs

The ocelot driver makes use of regmap, wrapping it with driver specific
operations that are thin wrappers around the core regmap APIs. These are
exported with EXPORT_SYMBOL, dropping the _GPL from the core regmap
exports which is frowned upon. Add _GPL suffixes to at least the APIs that
are doing register I/O.

Signed-off-by: Mark Brown <[email protected]>
Acked-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: linkwatch: fix failure to restore device state across suspend/resume
Willy Tarreau [Mon, 9 Aug 2021 16:06:28 +0000 (18:06 +0200)]
net: linkwatch: fix failure to restore device state across suspend/resume

After migrating my laptop from 4.19-LTS to 5.4-LTS a while ago I noticed
that my Ethernet port to which a bond and a VLAN interface are attached
appeared to remain up after resuming from suspend with the cable unplugged
(and that problem still persists with 5.10-LTS).

It happens that the following happens:

  - the network driver (e1000e here) prepares to suspend, calls e1000e_down()
    which calls netif_carrier_off() to signal that the link is going down.
  - netif_carrier_off() adds a link_watch event to the list of events for
    this device
  - the device is completely stopped.
  - the machine suspends
  - the cable is unplugged and the machine brought to another location
  - the machine is resumed
  - the queued linkwatch events are processed for the device
  - the device doesn't yet have the __LINK_STATE_PRESENT bit and its events
    are silently dropped
  - the device is resumed with its link down
  - the upper VLAN and bond interfaces are never notified that the link had
    been turned down and remain up
  - the only way to provoke a change is to physically connect the machine
    to a port and possibly unplug it.

The state after resume looks like this:
  $ ip -br li | egrep 'bond|eth'
  bond0            UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP>
  eth0             DOWN           e8:6a:64:64:64:64 <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP>
  eth0.2@eth0      UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>

Placing an explicit call to netdev_state_change() either in the suspend
or the resume code in the NIC driver worked around this but the solution
is not satisfying.

The issue in fact really is in link_watch that loses events while it
ought not to. It happens that the test for the device being present was
added by commit 124eee3f6955 ("net: linkwatch: add check for netdevice
being present to linkwatch_do_dev") in 4.20 to avoid an access to
devices that are not present.

Instead of dropping events, this patch proceeds slightly differently by
postponing their handling so that they happen after the device is fully
resumed.

Fixes: 124eee3f6955 ("net: linkwatch: add check for netdevice being present to linkwatch_do_dev")
Link: https://lists.openwall.net/netdev/2018/03/15/62
Cc: Heiner Kallweit <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Florian Fainelli <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agovmlinux.lds.h: Handle clang's module.{c,d}tor sections
Nathan Chancellor [Sat, 31 Jul 2021 02:31:08 +0000 (19:31 -0700)]
vmlinux.lds.h: Handle clang's module.{c,d}tor sections

A recent change in LLVM causes module_{c,d}tor sections to appear when
CONFIG_K{A,C}SAN are enabled, which results in orphan section warnings
because these are not handled anywhere:

ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_ctor) is being placed in '.text.asan.module_ctor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_dtor) is being placed in '.text.asan.module_dtor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.tsan.module_ctor) is being placed in '.text.tsan.module_ctor'

Fangrui explains: "the function asan.module_ctor has the SHF_GNU_RETAIN
flag, so it is in a separate section even with -fno-function-sections
(default)".

Place them in the TEXT_TEXT section so that these technologies continue
to work with the newer compiler versions. All of the KASAN and KCSAN
KUnit tests continue to pass after this change.

Cc: [email protected]
Link: https://github.com/ClangBuiltLinux/linux/issues/1432
Link: https://github.com/llvm/llvm-project/commit/7b789562244ee941b7bf2cefeb3fc08a59a01865
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Reviewed-by: Fangrui Song <[email protected]>
Acked-by: Marco Elver <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
3 years agoseccomp: Fix setting loaded filter count during TSYNC
Hsuan-Chi Kuo [Thu, 4 Mar 2021 23:37:08 +0000 (17:37 -0600)]
seccomp: Fix setting loaded filter count during TSYNC

The desired behavior is to set the caller's filter count to thread's.
This value is reported via /proc, so this fixes the inaccurate count
exposed to userspace; it is not used for reference counting, etc.

Signed-off-by: Hsuan-Chi Kuo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Co-developed-by: Wiktor Garbacz <[email protected]>
Signed-off-by: Wiktor Garbacz <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Kees Cook <[email protected]>
Cc: [email protected]
Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status")
3 years agonet/mlx5e: Make use of netdev_warn()
Cai Huoqing [Tue, 10 Aug 2021 02:08:22 +0000 (10:08 +0800)]
net/mlx5e: Make use of netdev_warn()

to replace printk(KERN_WARNING ...) with netdev_warn() kindly

Signed-off-by: Cai Huoqing <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix variable type to match 64bit
Eran Ben Elisha [Tue, 10 Aug 2021 18:15:05 +0000 (21:15 +0300)]
net/mlx5: Fix variable type to match 64bit

Fix the following smatch warning:
wait_func_handle_exec_timeout() warn: should '1 << ent->idx' be a 64 bit type?

Use 1ULL, to have a 64 bit type variable.

Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Eran Ben Elisha <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Initialize numa node for all core devices
Parav Pandit [Wed, 16 Jun 2021 19:23:23 +0000 (22:23 +0300)]
net/mlx5: Initialize numa node for all core devices

Subsequent patches make use of numa node affinity for memory
allocations. Initialize it for PCI PF, VF and SF devices.

Signed-off-by: Parav Pandit <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Allocate individual capability
Parav Pandit [Tue, 13 Jul 2021 11:17:03 +0000 (14:17 +0300)]
net/mlx5: Allocate individual capability

Currently mlx5_core_dev contains array of capabilities. It contains 19
valid capabilities of the device, 2 reserved entries and 12 holes.
Due to this for 14 unused entries, mlx5_core_dev allocates 14 * 8K = 112K
bytes of memory which is never used. Due to this mlx5_core_dev structure
size is 270Kbytes odd. This allocation further aligns to next power of 2
to 512Kbytes.

By skipping non-existent entries,
(a) 112Kbyte is saved,
(b) mlx5_core_dev reduces to 8KB with alignment
(c) 350KB saved in alignment

In future individual capability allocation can be used to skip its
allocation when such capability is disabled at the device level. This
patch prepares mlx5_core_dev to hold capability using a pointer instead
of inline array.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: Shay Drory <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Reorganize current and maximal capabilities to be per-type
Parav Pandit [Tue, 13 Jul 2021 09:36:05 +0000 (12:36 +0300)]
net/mlx5: Reorganize current and maximal capabilities to be per-type

In the current code, the current and maximal capabilities are
maintained in separate arrays which are both per type. In order to
allow the creation of such a basic structure as a dynamically
allocated array, we move curr and max fields to a unified
structure so that specific capabilities can be allocated as one unit.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: Shay Drory <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: SF, use recent sysfs api
Parav Pandit [Tue, 18 May 2021 05:50:04 +0000 (08:50 +0300)]
net/mlx5: SF, use recent sysfs api

Use sysfs_emit() which is aware of PAGE_SIZE buffer.

Signed-off-by: Parav Pandit <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Refcount mlx5_irq with integer
Shay Drory [Tue, 22 Jun 2021 11:20:16 +0000 (14:20 +0300)]
net/mlx5: Refcount mlx5_irq with integer

Currently, all access to mlx5 IRQs are done undere a lock. Hance, there
isn't a reason to have kref in struct mlx5_irq.
Switch it to integer.

Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Change SF missing dedicated MSI-X err message to dbg
Shay Drory [Tue, 29 Jun 2021 11:47:30 +0000 (14:47 +0300)]
net/mlx5: Change SF missing dedicated MSI-X err message to dbg

When MSI-X vectors allocated are not enough for SFs to have dedicated,
MSI-X, kernel log buffer has too many entries.
Hence only enable such log with debug level.

Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Align mlx5_irq structure
Shay Drory [Wed, 16 Jun 2021 15:58:26 +0000 (18:58 +0300)]
net/mlx5: Align mlx5_irq structure

mlx5_irq structure have holes due to incorrect position of fields in it.
Make them naturally align.

pahole output after alignment:
struct mlx5_irq {
        struct atomic_notifier_head nh;                  /*     0    72 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        cpumask_var_t              mask;                 /*    72     8 */
        char                       name[32];             /*    80    32 */
        struct mlx5_irq_pool *     pool;                 /*   112     8 */
        struct kref                kref;                 /*   120     4 */
        u32                        index;                /*   124     4 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        int                        irqn;                 /*   128     4 */

        /* size: 136, cachelines: 3, members: 7 */
        /* padding: 4 */
        /* last cacheline: 8 bytes */

};

Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Delete impossible dev->state checks
Leon Romanovsky [Sun, 1 Aug 2021 08:37:57 +0000 (11:37 +0300)]
net/mlx5: Delete impossible dev->state checks

New mlx5_core device structure is allocated through devlink_alloc
with\ kzalloc and that ensures that all fields are equal to zero
and it includes ->state too.

That means that checks of that field in the mlx5_init_one() is
completely redundant, because that function is called only once
in the begging of mlx5_core_dev lifetime.

PCI:
 .probe()
  -> probe_one()
   -> mlx5_init_one()

The recovery flow can't run at that time or before it, because relevant
work initialized later in mlx5_init_once().

Such initialization flow ensures that dev->state can't be
MLX5_DEVICE_STATE_UNINITIALIZED at all, so remove such impossible
checks.

Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix inner TTC table creation
Maor Gottlieb [Mon, 9 Aug 2021 12:12:45 +0000 (15:12 +0300)]
net/mlx5: Fix inner TTC table creation

Fix typo of the cited commit that calls to mlx5_create_ttc_table, instead
of mlx5_create_inner_ttc_table.

Fixes: f4b45940e9b9 ("net/mlx5: Embed mlx5_ttc_table")
Signed-off-by: Maor Gottlieb <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix typo in comments
Cai Huoqing [Fri, 30 Jul 2021 03:03:00 +0000 (11:03 +0800)]
net/mlx5: Fix typo in comments

Fix typo:
*vectores  ==> vectors
*realeased  ==> released
*erros  ==> errors
*namepsace  ==> namespace
*trafic  ==> traffic
*proccessed  ==> processed
*retore  ==> restore
*Currenlty  ==> Currently
*crated  ==> created
*chane  ==> change
*cannnot  ==> cannot
*usuallly  ==> usually
*failes  ==> fails
*importent  ==> important
*reenabled  ==> re-enabled
*alocation  ==> allocation
*recived  ==> received
*tanslation  ==> translation

Signed-off-by: Cai Huoqing <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agoMerge branch 'dsa-tagger-helpers'
David S. Miller [Wed, 11 Aug 2021 13:44:59 +0000 (14:44 +0100)]
Merge branch 'dsa-tagger-helpers'

Vladimir Oltean says:

====================
DSA tagger helpers

The goal of this series is to minimize the use of memmove and skb->data
in the DSA tagging protocol drivers. Unfiltered access to this level of
information is not very friendly to drive-by contributors, and sometimes
is also not the easiest to review.

For starters, I have converted the most common form of DSA tagging
protocols: the DSA headers which are placed where the EtherType is.

The helper functions introduced by this series are:
- dsa_alloc_etype_header
- dsa_strip_etype_header
- dsa_etype_header_pos_rx
- dsa_etype_header_pos_tx

This series is just a resend as non-RFC of v1.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet: dsa: create a helper for locating EtherType DSA headers on TX
Vladimir Oltean [Tue, 10 Aug 2021 13:13:56 +0000 (16:13 +0300)]
net: dsa: create a helper for locating EtherType DSA headers on TX

Create a similar helper for locating the offset to the DSA header
relative to skb->data, and make the existing EtherType header taggers to
use it.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: dsa: create a helper for locating EtherType DSA headers on RX
Vladimir Oltean [Tue, 10 Aug 2021 13:13:55 +0000 (16:13 +0300)]
net: dsa: create a helper for locating EtherType DSA headers on RX

It seems that protocol tagging driver writers are always surprised about
the formula they use to reach their EtherType header on RX, which
becomes apparent from the fact that there are comments in multiple
drivers that mention the same information.

Create a helper that returns a void pointer to skb->data - 2, as well as
centralize the explanation why that is the case.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: dsa: create a helper which allocates space for EtherType DSA headers
Vladimir Oltean [Tue, 10 Aug 2021 13:13:54 +0000 (16:13 +0300)]
net: dsa: create a helper which allocates space for EtherType DSA headers

Hide away the memmove used by DSA EtherType header taggers to shift the
MAC SA and DA to the left to make room for the header, after they've
called skb_push(). The call to skb_push() is still left explicit in
drivers, to be symmetric with dsa_strip_etype_header, and because not
all callers can be refactored to do it (for example, brcm_tag_xmit_ll
has common code for a pre-Ethernet DSA tag and an EtherType DSA tag).

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: dsa: create a helper that strips EtherType DSA headers on RX
Vladimir Oltean [Tue, 10 Aug 2021 13:13:53 +0000 (16:13 +0300)]
net: dsa: create a helper that strips EtherType DSA headers on RX

All header taggers open-code a memmove that is fairly not all that
obvious, and we can hide the details behind a helper function, since the
only thing specific to the driver is the length of the header tag.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'devlink-aux-devices'
David S. Miller [Wed, 11 Aug 2021 13:34:22 +0000 (14:34 +0100)]
Merge branch 'devlink-aux-devices'

Parav Pandit says:

====================
devlink: Control auxiliary devices

Currently, for mlx5 multi-function device, a user is not able to control
which functionality to enable/disable. For example, each PCI
PF, VF, SF function by default has netdevice, RDMA and vdpa-net
devices always enabled.

Hence, enable user to control which device functionality to enable/disable.

This is achieved by using existing devlink params [1] to
enable/disable eth, rdma and vdpa net functionality control knob.

For example user interested in only vdpa device function: performs,

$ devlink dev param set pci/0000:06:00.0 name enable_rdma value false \
                   cmode driverinit
$ devlink dev param set pci/0000:06:00.0 name enable_eth value false \
                   cmode driverinit
$ devlink dev param set pci/0000:06:00.0 name enable_vnet value true \
                   cmode driverinit

$ devlink dev reload pci/0000:06:00.0

Reload command honors parameters set, initializes the device that user
has composed using devlink dev params and resources.
Devices before reload:

            mlx5_core.sf.4
         (subfunction device)
                  /\
                 /| \
                / |  \
               /  |   \
 mlx5_core.eth.4  |  mlx5_core.rdma.4
(SF eth aux dev)  |  (SF rdma aux dev)
    |             |        |
    |             |        |
 enp6s0f0s88      |      mlx5_0
 (SF netdev)      |  (SF rdma device)
                  |
         mlx5_core.vnet.4
         (SF vnet aux dev)
                 |
                 |
        auxiliary/mlx5_core.sf.4
        (vdpa net mgmt device)

Above example reconfigures the device with only VDPA functionality.
Devices after reload:

            mlx5_core.sf.4
         (subfunction device)
                  /\
                 /  \
                /    \
               /      \
 mlx5_core.vnet.4     no eth, no rdma aux devices
 (SF vnet aux dev)

Above parameters enable user to compose the device as needed based
on the use case.

Since devlink params are done on the devlink instance, these
knobs are uniformly usable for PCI PF, VF and SF devices.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet/mlx5: Support enable_vnet devlink dev param
Parav Pandit [Tue, 10 Aug 2021 13:24:24 +0000 (16:24 +0300)]
net/mlx5: Support enable_vnet devlink dev param

Enable user to disable VDPA net auxiliary device so that when it is not
required, user can disable it.

For example,

$ devlink dev param set pci/0000:06:00.0 \
              name enable_vnet value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device
mlx5_core.vnet.2 for the VDPA net functionality.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet/mlx5: Support enable_rdma devlink dev param
Parav Pandit [Tue, 10 Aug 2021 13:24:23 +0000 (16:24 +0300)]
net/mlx5: Support enable_rdma devlink dev param

Enable user to disable RDMA auxiliary device so that when it is not
required, user can disable it.

For example,

$ devlink dev param set pci/0000:06:00.0 \
              name enable_rdma value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device
mlx5_core.rdma.2 for the RDMA functionality.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet/mlx5: Support enable_eth devlink dev param
Parav Pandit [Tue, 10 Aug 2021 13:24:22 +0000 (16:24 +0300)]
net/mlx5: Support enable_eth devlink dev param

Enable user to disable Ethernet auxiliary device so that when it is not
required, user can disable it.

For example,

$ devlink dev param set pci/0000:06:00.0 \
              name enable_eth value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create mlx5_core.eth.2 auxiliary
device for the Ethernet functionality.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet/mlx5: Fix unpublish devlink parameters
Parav Pandit [Tue, 10 Aug 2021 13:24:21 +0000 (16:24 +0300)]
net/mlx5: Fix unpublish devlink parameters

Cleanup routine missed to unpublish the parameters. Add it.

Fixes: e890acd5ff18 ("net/mlx5: Add devlink flow_steering_mode parameter")
Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodevlink: Add APIs to publish, unpublish individual parameter
Parav Pandit [Tue, 10 Aug 2021 13:24:20 +0000 (16:24 +0300)]
devlink: Add APIs to publish, unpublish individual parameter

Enable drivers to publish/unpublish individual parameter.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodevlink: Add API to register and unregister single parameter
Parav Pandit [Tue, 10 Aug 2021 13:24:19 +0000 (16:24 +0300)]
devlink: Add API to register and unregister single parameter

Currently device configuration parameters can be registered as an array.
Due to this a constant array must be registered. A single driver
supporting multiple devices each with different device capabilities end
up registering all parameters even if it doesn't support it.

One possible workaround a driver can do is, it registers multiple single
entry arrays to overcome such limitation.

Better is to provide a API that enables driver to register/unregister a
single parameter. This also further helps in two ways.
(1) to reduce the memory of devlink_param_entry by avoiding in registering
parameters which are not supported by the device.
(2) avoid generating multiple parameter add, delete, publish, unpublish,
init value notifications for such unsupported parameters

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodevlink: Create a helper function for one parameter registration
Parav Pandit [Tue, 10 Aug 2021 13:24:18 +0000 (16:24 +0300)]
devlink: Create a helper function for one parameter registration

Create and use a helper function for one parameter registration.
Subsequent patch also will reuse this for driver facing routine to
register a single parameter.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodevlink: Add new "enable_vnet" generic device param
Parav Pandit [Tue, 10 Aug 2021 13:24:17 +0000 (16:24 +0300)]
devlink: Add new "enable_vnet" generic device param

Add new device generic parameter to enable/disable creation of
VDPA net auxiliary device and associated device functionality
in the devlink instance.

User who prefers to disable such functionality can disable it using below
example.

$ devlink dev param set pci/0000:06:00.0 \
              name enable_vnet value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device for the
VDPA net functionality.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodevlink: Add new "enable_rdma" generic device param
Parav Pandit [Tue, 10 Aug 2021 13:24:16 +0000 (16:24 +0300)]
devlink: Add new "enable_rdma" generic device param

Add new device generic parameter to enable/disable creation of
RDMA auxiliary device and associated device functionality
in the devlink instance.

User who prefers to disable such functionality can disable it using below
example.

$ devlink dev param set pci/0000:06:00.0 \
              name enable_rdma value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device for the
RDMA functionality.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodevlink: Add new "enable_eth" generic device param
Parav Pandit [Tue, 10 Aug 2021 13:24:15 +0000 (16:24 +0300)]
devlink: Add new "enable_eth" generic device param

Add new device generic parameter to enable/disable creation of
Ethernet auxiliary device and associated device functionality
in the devlink instance.

User who prefers to disable such functionality can disable it using below
example.

$ devlink dev param set pci/0000:06:00.0 \
              name enable_eth value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device for the
Ethernet functionality.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'bridge-global-mcast'
David S. Miller [Wed, 11 Aug 2021 12:34:41 +0000 (13:34 +0100)]
Merge branch 'bridge-global-mcast'

Nikolay Aleksandrov says:

====================
net: bridge: vlan: add global mcast options

This is the first follow-up set after the support for per-vlan multicast
contexts which extends global vlan options to support bridge's multicast
config per-vlan, it enables user-space to change and dump the already
existing bridge vlan multicast context options. The global option patches
(01 - 09 and 12-13) follow a similar pattern of changing current mcast
functions to take multicast context instead of a port/bridge directly.
Option equality checks have been added for dumping vlan range compression.
The last 2 patches extend the mcast router dump support so it can be
re-used when dumping vlan config.

patches 01 - 09: add support for various mcast options
patches 10 - 11: prepare for per-vlan querier control
patches 12 - 13: add support for querier control and router control
patches 14 - 15: add support for dumping per-vlan router ports

Next patch-sets:
 - per-port/vlan router option config
 - iproute2 support for all new vlan options
 - selftests
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: use br_rports_fill_info() to export mcast router ports
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:33 +0000 (18:29 +0300)]
net: bridge: vlan: use br_rports_fill_info() to export mcast router ports

Embed the standard multicast router port export by br_rports_fill_info()
into a new global vlan attribute BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS.
In order to have the same format for the global bridge mcast context and
the per-vlan mcast context we need a double-nesting:
 - BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS
   - MDBA_ROUTER

Currently we don't compare router lists, if any router port exists in
the bridge mcast contexts we consider their option sets as different and
export them separately.

In addition we export the router port vlan id when dumping similar to
the router port notification format.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: mcast: use the proper multicast context when dumping router ports
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:32 +0000 (18:29 +0300)]
net: bridge: mcast: use the proper multicast context when dumping router ports

When we are dumping the router ports of a vlan mcast context we need to
use the bridge/vlan and port/vlan's multicast contexts to check if
IPv4/IPv6 router port is present and later to dump the vlan id.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast router global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:31 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast router global option

Add support to change and retrieve global vlan multicast router state
which is used for the bridge itself. We just need to pass multicast context
to br_multicast_set_router instead of bridge device and the rest of the
logic remains the same.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast querier global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:30 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast querier global option

Add support to change and retrieve global vlan multicast querier state.
We just need to pass multicast context to br_multicast_set_querier
instead of bridge device and the rest of the logic remains the same.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: mcast: querier and query state affect only current context type
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:29 +0000 (18:29 +0300)]
net: bridge: mcast: querier and query state affect only current context type

It is a minor optimization and better behaviour to make sure querier and
query sending routines affect only the matching multicast context
depending if vlan snooping is enabled (vlan ctx vs bridge ctx).
It also avoids sending unnecessary extra query packets.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: mcast: move querier state to the multicast context
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:28 +0000 (18:29 +0300)]
net: bridge: mcast: move querier state to the multicast context

We need to have the querier state per multicast context in order to have
per-vlan control, so remove the internal option bit and move it to the
multicast context. Also annotate the lockless reads of the new variable.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast startup query interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:27 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast startup query interval global option

Add support to change and retrieve global vlan multicast startup query
interval option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast query response interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:26 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast query response interval global option

Add support to change and retrieve global vlan multicast query response
interval option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast query interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:25 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast query interval global option

Add support to change and retrieve global vlan multicast query interval
option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast querier interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:24 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast querier interval global option

Add support to change and retrieve global vlan multicast querier interval
option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast membership interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:23 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast membership interval global option

Add support to change and retrieve global vlan multicast membership
interval option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast last member interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:22 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast last member interval global option

Add support to change and retrieve global vlan multicast last member
interval option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast startup query count global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:21 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast startup query count global option

Add support to change and retrieve global vlan multicast startup query
count option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast last member count global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:20 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast last member count global option

Add support to change and retrieve global vlan multicast last member
count option.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: bridge: vlan: add support for mcast igmp/mld version global options
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:19 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast igmp/mld version global options

Add support to change and retrieve global vlan IGMP/MLD versions.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'ipa-runtime-pm'
David S. Miller [Wed, 11 Aug 2021 12:31:56 +0000 (13:31 +0100)]
Merge branch 'ipa-runtime-pm'

Alex Elder says:

====================
net: ipa: use runtime PM reference counting

This series does further rework of the IPA clock code so that we
rely on some of the core runtime power management code (including
its referencing counting) instead.

The first patch makes ipa_clock_get() act like pm_runtime_get_sync().

The second patch makes system suspend occur regardless of the
current reference count value, which is again more like how the
runtime PM core code behaves.

The third patch creates functions to encapsulate all hardware
suspend and resume activity.  The fourth uses those functions as
the ->runtime_suspend and ->runtime_resume power callbacks.  With
that in place, ipa_clock_get() and ipa_clock_put() are changed to
use runtime PM get and put functions when needed.

The fifth patch eliminates an extra clock reference previously used
to control system suspend.  The sixth eliminates the "IPA clock"
reference count and mutex.

The final patch replaces the one call to ipa_clock_get_additional()
with a call to pm_runtime_get_if_active(), making the former
unnecessary.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: kill ipa_clock_get_additional()
Alex Elder [Tue, 10 Aug 2021 19:27:04 +0000 (14:27 -0500)]
net: ipa: kill ipa_clock_get_additional()

Now that ipa_clock_get_additional() is a trivial wrapper around
pm_runtime_get_if_active(), just open-code it in its only caller
and delete the function.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: kill IPA clock reference count
Alex Elder [Tue, 10 Aug 2021 19:27:03 +0000 (14:27 -0500)]
net: ipa: kill IPA clock reference count

The runtime power management core code maintains a usage count.  This
count mirrors the IPA clock reference count, and there's no need to
maintain both.  So get rid of the IPA clock reference count and just
rely on the runtime PM usage count to determine when the hardware
should be suspended or resumed.

Use pm_runtime_get_if_active() in ipa_clock_get_additional().  We
care whether power is active, regardless of whether it's in use, so
pass true for its ign_usage_count argument.

The IPA clock mutex is just used to make enabling/disabling the
clock and updating the reference count occur atomically.  Without
the reference count, there's no need for the mutex, so get rid of
that too.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: get rid of extra clock reference
Alex Elder [Tue, 10 Aug 2021 19:27:02 +0000 (14:27 -0500)]
net: ipa: get rid of extra clock reference

Suspending the IPA hardware is now managed by the runtime PM core
code.  The ->runtime_idle callback returns a non-zero value, so it
will never suspend except when forced.  As a result, there's no need
to take an extra "do not suspend" clock reference.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: use runtime PM core
Alex Elder [Tue, 10 Aug 2021 19:27:01 +0000 (14:27 -0500)]
net: ipa: use runtime PM core

Use the runtime power management core to cause hardware suspend and
resume to occur.  Enable it in ipa_clock_init() (without autosuspend),
and disable it in ipa_clock_exit().

Use ipa_runtime_suspend() as the ->runtime_suspend power operation,
and arrange for it to be called by having ipa_clock_get() call
pm_runtime_get_sync() when the first clock reference is taken.
Similarly, use ipa_runtime_resume() as the ->runtime_resume power
operation, and pm_runtime_put() when the last IPA clock reference
is dropped.

Introduce ipa_runtime_idle() as the ->runtime_idle power operation,
and have it return a non-zero value; this way suspend will never
occur except when forced.

Use pm_runtime_force_suspend() and pm_runtime_force_resume() as the
system suspend and resume callbacks, and remove ipa_suspend() and
ipa_resume().

Store a pointer to the device structure passed to ipa_clock_init(),
so it can be used by ipa_clock_exit() to disable runtime power
management.

For now we preserve IPA clock reference counting.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: resume in ipa_clock_get()
Alex Elder [Tue, 10 Aug 2021 19:27:00 +0000 (14:27 -0500)]
net: ipa: resume in ipa_clock_get()

Introduce ipa_runtime_suspend() and ipa_runtime_resume(), which
encapsulate the activities necessary for suspending and resuming
the IPA hardware.  Call these functions from ipa_clock_get() and
ipa_clock_put() when the first reference is taken or last one is
dropped.

When the very first clock reference is taken (for ipa_config()),
setup isn't complete yet, so (as before) only the core clock gets
enabled.

When the last clock reference is dropped (after ipa_deconfig()),
ipa_teardown() will have made the setup_complete flag false, so
there too, the core clock will be stopped without affecting GSI
or the endpoints.

Otherwise these new functions will perform the desired suspend and
resume actions once setup is complete.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: disable clock in suspend
Alex Elder [Tue, 10 Aug 2021 19:26:59 +0000 (14:26 -0500)]
net: ipa: disable clock in suspend

Disable the IPA clock rather than dropping a reference to it in the
system suspend callback.  This forces the suspend to occur without
affecting existing references.

Similarly, enable the clock rather than taking a reference in
ipa_resume(), forcing a resume without changing the reference count.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ipa: have ipa_clock_get() return a value
Alex Elder [Tue, 10 Aug 2021 19:26:58 +0000 (14:26 -0500)]
net: ipa: have ipa_clock_get() return a value

We currently assume no errors occur when enabling or disabling the
IPA core clock and interconnects.  And although this commit exposes
errors that could occur, we generally assume this won't happen in
practice.

This commit changes ipa_clock_get() and ipa_clock_put() so each
returns a value.  The values returned are meant to mimic what the
runtime power management functions return, so we can set up error
handling here before we make the switch.  Have ipa_clock_get()
increment the reference count even if it returns an error, to match
the behavior of pm_runtime_get().

More details follow.

When taking a reference in ipa_clock_get(), return 0 for the first
reference, 1 for subsequent references, or a negative error code if
an error occurs.  Note that if ipa_clock_get() returns an error, we
must not touch hardware; in some cases such errors now cause entire
blocks of code to be skipped.

When dropping a reference in ipa_clock_put(), we return 0 or an
error code.  The error would come from ipa_clock_disable(), which
now returns what ipa_interconnect_disable() returns (either 0 or a
negative error code).  For now, callers ignore the return value;
if an error occurs, a message will have already been logged, and
little more can actually be done to improve the situation.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Wed, 11 Aug 2021 09:22:26 +0000 (10:22 +0100)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Use nfnetlink_unicast() instead of netlink_unicast() in nft_compat.

2) Remove call to nf_ct_l4proto_find() in flowtable offload timeout
   fixup.

3) CLUSTERIP registers ARP hook on demand, from Florian.

4) Use clusterip_net to store pernet warning, also from Florian.

5) Remove struct netns_xt, from Florian Westphal.

6) Enable ebtables hooks in initns on demand, from Florian.

7) Allow to filter conntrack netlink dump per status bits,
   from Florian Westphal.

8) Register x_tables hooks in initns on demand, from Florian.

9) Remove queue_handler from per-netns structure, again from Florian.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'mediatek-drm-fixes-5.14' of https://git.kernel.org/pub/scm/linux/kernel...
Dave Airlie [Wed, 11 Aug 2021 04:11:44 +0000 (14:11 +1000)]
Merge tag 'mediatek-drm-fixes-5.14' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes

Mediatek DRM Fixes for Linux 5.14

1. Fix dpi bridge bug.
2. Fix cursor plane no update.

Signed-off-by: Dave Airlie <[email protected]>
From: Chun-Kuang Hu <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
3 years agoMerge tag 'arc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Linus Torvalds [Wed, 11 Aug 2021 02:34:34 +0000 (16:34 -1000)]
Merge tag 'arc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - Fix FPU_STATUS update

 - Update my email address

 - Other spellos and fixes

* tag 'arc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  MAINTAINERS: update Vineet's email address
  ARC: fp: set FPU_STATUS.FWE to enable FPU_STATUS update on context switch
  ARC: Fix CONFIG_STACKDEPOT
  arc: Fix spelling mistake and grammar in Kconfig
  arc: Prefer unsigned int to bare use of unsigned

3 years agonet: Support filtering interfaces on no master
Lahav Schlesinger [Tue, 10 Aug 2021 09:06:58 +0000 (09:06 +0000)]
net: Support filtering interfaces on no master

Currently there's support for filtering neighbours/links for interfaces
which have a specific master device (using the IFLA_MASTER/NDA_MASTER
attributes).

This patch adds support for filtering interfaces/neighbours dump for
interfaces that *don't* have a master.

Signed-off-by: Lahav Schlesinger <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet/sched: cls_api, reset flags on replay
Mark Bloch [Tue, 10 Aug 2021 03:43:05 +0000 (03:43 +0000)]
net/sched: cls_api, reset flags on replay

tc_new_tfilter() can replay a request if it got EAGAIN. The cited commit
didn't account for this when it converted TC action ->init() API
to use flags instead of parameters. This can lead to passing stale flags
down the call chain which results in trying to lock rtnl when it's
already locked, deadlocking the entire system.

Fix by making sure to reset flags on each replay.

============================================
WARNING: possible recursive locking detected
5.14.0-rc3-custom-49011-g3d2bbb4f104d #447 Not tainted
--------------------------------------------
tc/37605 is trying to acquire lock:
ffffffff841df2f0 (rtnl_mutex){+.+.}-{3:3}, at: tc_setup_cb_add+0x14b/0x4d0

but task is already holding lock:
ffffffff841df2f0 (rtnl_mutex){+.+.}-{3:3}, at: tc_new_tfilter+0xb12/0x22e0

other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0
       ----
  lock(rtnl_mutex);
  lock(rtnl_mutex);

 *** DEADLOCK ***
 May be due to missing lock nesting notation
1 lock held by tc/37605:
 #0: ffffffff841df2f0 (rtnl_mutex){+.+.}-{3:3}, at: tc_new_tfilter+0xb12/0x22e0

stack backtrace:
CPU: 0 PID: 37605 Comm: tc Not tainted 5.14.0-rc3-custom-49011-g3d2bbb4f104d #447
Hardware name: Mellanox Technologies Ltd. MSN2010/SA002610, BIOS 5.6.5 08/24/2017
Call Trace:
 dump_stack_lvl+0x8b/0xb3
 __lock_acquire.cold+0x175/0x3cb
 lock_acquire+0x1a4/0x4f0
 __mutex_lock+0x136/0x10d0
 fl_hw_replace_filter+0x458/0x630 [cls_flower]
 fl_change+0x25f2/0x4a64 [cls_flower]
 tc_new_tfilter+0xa65/0x22e0
 rtnetlink_rcv_msg+0x86c/0xc60
 netlink_rcv_skb+0x14d/0x430
 netlink_unicast+0x539/0x7e0
 netlink_sendmsg+0x84d/0xd80
 ____sys_sendmsg+0x7ff/0x970
 ___sys_sendmsg+0xf8/0x170
 __sys_sendmsg+0xea/0x1b0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7b93b6c0a7
Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48>
RSP: 002b:00007ffe365b3818 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b93b6c0a7
RDX: 0000000000000000 RSI: 00007ffe365b3880 RDI: 0000000000000003
RBP: 00000000610a75f6 R08: 0000000000000001 R09: 0000000000000000
R10: fffffffffffff3a9 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 00007ffe365b7b58 R15: 00000000004822c0

Fixes: 695176bfe5de ("net_sched: refactor TC action init API")
Signed-off-by: Mark Bloch <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: bridge: fix memleak in br_add_if()
Yang Yingliang [Mon, 9 Aug 2021 13:20:23 +0000 (21:20 +0800)]
net: bridge: fix memleak in br_add_if()

I got a memleak report:

BUG: memory leak
unreferenced object 0x607ee521a658 (size 240):
comm "syz-executor.0", pid 955, jiffies 4294780569 (age 16.449s)
hex dump (first 32 bytes, cpu 1):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000d830ea5a>] br_multicast_add_port+0x1c2/0x300 net/bridge/br_multicast.c:1693
[<00000000274d9a71>] new_nbp net/bridge/br_if.c:435 [inline]
[<00000000274d9a71>] br_add_if+0x670/0x1740 net/bridge/br_if.c:611
[<0000000012ce888e>] do_set_master net/core/rtnetlink.c:2513 [inline]
[<0000000012ce888e>] do_set_master+0x1aa/0x210 net/core/rtnetlink.c:2487
[<0000000099d1cafc>] __rtnl_newlink+0x1095/0x13e0 net/core/rtnetlink.c:3457
[<00000000a01facc0>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488
[<00000000acc9186c>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5550
[<00000000d4aabb9c>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504
[<00000000bc2e12a3>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<00000000bc2e12a3>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340
[<00000000e4dc2d0e>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929
[<000000000d22c8b3>] sock_sendmsg_nosec net/socket.c:654 [inline]
[<000000000d22c8b3>] sock_sendmsg+0x139/0x170 net/socket.c:674
[<00000000e281417a>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350
[<00000000237aa2ab>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404
[<000000004f2dc381>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433
[<0000000005feca6c>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47
[<000000007304477d>] entry_SYSCALL_64_after_hwframe+0x44/0xae

On error path of br_add_if(), p->mcast_stats allocated in
new_nbp() need be freed, or it will be leaked.

Fixes: 1080ab95e3c7 ("net: bridge: add support for IGMP/MLD stats and export them via netlink")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Yang Yingliang <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers...
Vladimir Oltean [Tue, 10 Aug 2021 11:50:24 +0000 (14:50 +0300)]
net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge

The blamed commit added a new field to struct switchdev_notifier_fdb_info,
but did not make sure that all call paths set it to something valid.
For example, a switchdev driver may emit a SWITCHDEV_FDB_ADD_TO_BRIDGE
notifier, and since the 'is_local' flag is not set, it contains junk
from the stack, so the bridge might interpret those notifications as
being for local FDB entries when that was not intended.

To avoid that now and in the future, zero-initialize all
switchdev_notifier_fdb_info structures created by drivers such that all
newly added fields to not need to touch drivers again.

Fixes: 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications")
Reported-by: Ido Schimmel <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: Karsten Graul <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Jakub Kicinski [Tue, 10 Aug 2021 20:19:16 +0000 (13:19 -0700)]
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
pull-request: mlx5-next 2020-08-9

This pulls mlx5-next branch which includes patches already reviewed on
net-next and rdma mailing lists.

1) mlx5 single E-Switch FDB for lag

2) IB/mlx5: Rename is_apu_thread_cq function to is_apu_cq

3) Add DCS caps & fields support

[1] https://patchwork.kernel.org/project/netdevbpf/cover/20210803231959[email protected]/

[2] https://patchwork.kernel.org/project/netdevbpf/patch/0e3364dab7e0e4eea5423878b01aa42470be8d36.1626609184[email protected]/

[3] https://patchwork.kernel.org/project/netdevbpf/patch/55e1d69bef1fbfa5cf195c0bfcbe35c8019de35e.1624258894[email protected]/

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Lag, Create shared FDB when in switchdev mode
  net/mlx5: E-Switch, add logic to enable shared FDB
  net/mlx5: Lag, move lag destruction to a workqueue
  net/mlx5: Lag, properly lock eswitch if needed
  net/mlx5: Add send to vport rules on paired device
  net/mlx5: E-Switch, Add event callback for representors
  net/mlx5e: Use shared mappings for restoring from metadata
  net/mlx5e: Add an option to create a shared mapping
  net/mlx5: E-Switch, set flow source for send to uplink rule
  RDMA/mlx5: Add shared FDB support
  {net, RDMA}/mlx5: Extend send to vport rules
  RDMA/mlx5: Fill port info based on the relevant eswitch
  net/mlx5: Lag, add initial logic for shared FDB
  net/mlx5: Return mdev from eswitch
  IB/mlx5: Rename is_apu_thread_cq function to is_apu_cq
  net/mlx5: Add DCS caps & fields support
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agonet: bridge: fix flags interpretation for extern learn fdb entries
Nikolay Aleksandrov [Tue, 10 Aug 2021 11:00:10 +0000 (14:00 +0300)]
net: bridge: fix flags interpretation for extern learn fdb entries

Ignore fdb flags when adding port extern learn entries and always set
BR_FDB_LOCAL flag when adding bridge extern learn entries. This is
closest to the behaviour we had before and avoids breaking any use cases
which were allowed.

This patch fixes iproute2 calls which assume NUD_PERMANENT and were
allowed before, example:
$ bridge fdb add 00:11:22:33:44:55 dev swp1 extern_learn

Extern learn entries are allowed to roam, but do not expire, so static
or dynamic flags make no sense for them.

Also add a comment for future reference.

Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
Fixes: 0541a6293298 ("net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry")
Reviewed-by: Ido Schimmel <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agoMerge tag 'platform-drivers-x86-v5.14-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 10 Aug 2021 16:46:33 +0000 (09:46 -0700)]
Merge tag 'platform-drivers-x86-v5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:
 "Small set of pdx86 fixes for 5.14"

* tag 'platform-drivers-x86-v5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: pcengines-apuv2: Add missing terminating entries to gpio-lookup tables
  platform/x86: Make dual_accel_detect() KIOX010A + KIOX020A detect more robust
  platform/x86: Add and use a dual_accel_detect() helper

3 years agoMerge tag 'ovl-fixes-5.14-rc6-v2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 10 Aug 2021 16:40:09 +0000 (09:40 -0700)]
Merge tag 'ovl-fixes-5.14-rc6-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "Fix several bugs in overlayfs"

* tag 'ovl-fixes-5.14-rc6-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: prevent private clone if bind mount is not allowed
  ovl: fix uninitialized pointer read in ovl_lookup_real_one()
  ovl: fix deadlock in splice write
  ovl: skip stale entries in merge dir cache iteration

3 years agonetfilter: nf_queue: move hookfn registration out of struct net
Florian Westphal [Thu, 5 Aug 2021 10:02:43 +0000 (12:02 +0200)]
netfilter: nf_queue: move hookfn registration out of struct net

This was done to detect when the pernet->init() function was not called
yet, by checking if net->nf.queue_handler is NULL.

Once the nfnetlink_queue module is active, all struct net pointers
contain the same address.  So place this back in nf_queue.c.

Handle the 'netns error unwind' test by checking nfnl_queue_net for a
NULL pointer and add a comment for this.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
3 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Jakub Kicinski [Tue, 10 Aug 2021 14:27:09 +0000 (07:27 -0700)]
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next 2021-08-10

We've added 31 non-merge commits during the last 8 day(s) which contain
a total of 28 files changed, 3644 insertions(+), 519 deletions(-).

1) Native XDP support for bonding driver & related BPF selftests, from Jussi Maki.

2) Large batch of new BPF JIT tests for test_bpf.ko that came out as a result from
   32-bit MIPS JIT development, from Johan Almbladh.

3) Rewrite of netcnt BPF selftest and merge into test_progs, from Stanislav Fomichev.

4) Fix XDP bpf_prog_test_run infra after net to net-next merge, from Andrii Nakryiko.

5) Follow-up fix in unix_bpf_update_proto() to enforce socket type, from Cong Wang.

6) Fix bpf-iter-tcp4 selftest to print the correct dest IP, from Jose Blanquicet.

7) Various misc BPF XDP sample improvements, from Niklas Söderlund, Matthew Cover,
   and Muhammad Falak R Wani.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (31 commits)
  bpf, tests: Add tail call test suite
  bpf, tests: Add tests for BPF_CMPXCHG
  bpf, tests: Add tests for atomic operations
  bpf, tests: Add test for 32-bit context pointer argument passing
  bpf, tests: Add branch conversion JIT test
  bpf, tests: Add word-order tests for load/store of double words
  bpf, tests: Add tests for ALU operations implemented with function calls
  bpf, tests: Add more ALU64 BPF_MUL tests
  bpf, tests: Add more BPF_LSH/RSH/ARSH tests for ALU64
  bpf, tests: Add more ALU32 tests for BPF_LSH/RSH/ARSH
  bpf, tests: Add more tests of ALU32 and ALU64 bitwise operations
  bpf, tests: Fix typos in test case descriptions
  bpf, tests: Add BPF_MOV tests for zero and sign extension
  bpf, tests: Add BPF_JMP32 test cases
  samples, bpf: Add an explict comment to handle nested vlan tagging.
  selftests/bpf: Add tests for XDP bonding
  selftests/bpf: Fix xdp_tx.c prog section name
  net, core: Allow netdev_lower_get_next_private_rcu in bh context
  bpf, devmap: Exclude XDP broadcast to master device
  net, bonding: Add XDP support to the bonding driver
  ...
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Tue, 10 Aug 2021 14:52:09 +0000 (07:52 -0700)]
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
bpf 2021-08-10

We've added 5 non-merge commits during the last 2 day(s) which contain
a total of 7 files changed, 27 insertions(+), 15 deletions(-).

1) Fix missing bpf_read_lock_trace() context for BPF loader progs, from Yonghong Song.

2) Fix corner case where BPF prog retrieves wrong local storage, also from Yonghong Song.

3) Restrict availability of BPF write_user helper behind lockdown, from Daniel Borkmann.

4) Fix multiple kernel-doc warnings in BPF core, from Randy Dunlap.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, core: Fix kernel-doc notation
  bpf: Fix potentially incorrect results with bpf_get_local_storage()
  bpf: Add missing bpf_read_[un]lock_trace() for syscall program
  bpf: Add lockdown check for probe_write_user helper
  bpf: Add _kernel suffix to internal lockdown_bpf_read
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
3 years agodrm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work
Anson Jacob [Fri, 30 Jul 2021 23:46:20 +0000 (19:46 -0400)]
drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work

Replace GFP_KERNEL with GFP_ATOMIC as amdgpu_dm_irq_schedule_work
can't sleep.

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:196
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 253, name: kworker/6:1H
CPU: 6 PID: 253 Comm: kworker/6:1H Tainted: G        W  OE     5.11.0-promotion_2021_06_07-18_36_28_prelim_revert_retrain #8
Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 3405 02/01/2021
Workqueue: events_highpri dm_irq_work_func [amdgpu]
Call Trace:
 <IRQ>
 dump_stack+0x5e/0x74
 ___might_sleep.cold+0x87/0x98
 __might_sleep+0x4b/0x80
 kmem_cache_alloc_trace+0x390/0x4f0
 amdgpu_dm_irq_handler+0x171/0x230 [amdgpu]
 amdgpu_irq_dispatch+0xc0/0x1e0 [amdgpu]
 amdgpu_ih_process+0x81/0x100 [amdgpu]
 amdgpu_irq_handler+0x26/0xa0 [amdgpu]
 __handle_irq_event_percpu+0x49/0x190
 ? __hrtimer_get_next_event+0x4d/0x80
 handle_irq_event_percpu+0x33/0x80
 handle_irq_event+0x33/0x60
 handle_edge_irq+0x82/0x190
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 common_interrupt+0xbb/0x140
 asm_common_interrupt+0x1e/0x40
RIP: 0010:amdgpu_device_rreg.part.0+0x44/0xf0 [amdgpu]
Code: 53 48 89 fb 4c 3b af c8 08 00 00 73 6d 83 e2 02 75 0d f6 87 40 62 01 00 10 0f 85 83 00 00 00 4c 03 ab d0 08 00 00 45 8b 6d 00 <8b> 05 3e b6 52 00 85 c0 7e 62 48 8b 43 08 0f b7 70 3e 65 8b 05 e3
RSP: 0018:ffffae7740fff9e8 EFLAGS: 00000286
RAX: ffffffffc05ee610 RBX: ffff8aaf8f620000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000005430 RDI: ffff8aaf8f620000
RBP: ffffae7740fffa08 R08: 0000000000000001 R09: 000000000000000a
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000005430
R13: 0000000071000000 R14: 0000000000000001 R15: 0000000000005430
 ? amdgpu_cgs_write_register+0x20/0x20 [amdgpu]
 amdgpu_device_rreg+0x17/0x20 [amdgpu]
 amdgpu_cgs_read_register+0x14/0x20 [amdgpu]
 dm_read_reg_func+0x38/0xb0 [amdgpu]
 generic_reg_wait+0x80/0x160 [amdgpu]
 dce_aux_transfer_raw+0x324/0x7c0 [amdgpu]
 dc_link_aux_transfer_raw+0x43/0x50 [amdgpu]
 dm_dp_aux_transfer+0x87/0x110 [amdgpu]
 drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
 drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
 drm_dp_get_one_sb_msg+0x349/0x480 [drm_kms_helper]
 drm_dp_mst_hpd_irq+0xc5/0xe40 [drm_kms_helper]
 ? drm_dp_mst_hpd_irq+0xc5/0xe40 [drm_kms_helper]
 dm_handle_hpd_rx_irq+0x184/0x1a0 [amdgpu]
 ? dm_handle_hpd_rx_irq+0x184/0x1a0 [amdgpu]
 handle_hpd_rx_irq+0x195/0x240 [amdgpu]
 ? __switch_to_asm+0x42/0x70
 ? __switch_to+0x131/0x450
 dm_irq_work_func+0x19/0x20 [amdgpu]
 process_one_work+0x209/0x400
 worker_thread+0x4d/0x3e0
 ? cancel_delayed_work+0xa0/0xa0
 kthread+0x124/0x160
 ? kthread_park+0x90/0x90
 ret_from_fork+0x22/0x30

Reviewed-by: Aurabindo Jayamohanan Pillai <[email protected]>
Acked-by: Anson Jacob <[email protected]>
Signed-off-by: Anson Jacob <[email protected]>
Cc: [email protected]
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
3 years agodrm/amd/display: Remove invalid assert for ODM + MPC case
Eric Bernstein [Mon, 26 Jul 2021 19:53:18 +0000 (15:53 -0400)]
drm/amd/display: Remove invalid assert for ODM + MPC case

Reviewed-by: Dmytro Laktyushkin <[email protected]>
Acked-by: Anson Jacob <[email protected]>
Signed-off-by: Eric Bernstein <[email protected]>
Cc: [email protected]
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
3 years agodrm/amd/pm: bug fix for the runtime pm BACO
Kenneth Feng [Fri, 6 Aug 2021 02:28:04 +0000 (10:28 +0800)]
drm/amd/pm: bug fix for the runtime pm BACO

In some systems only MACO is supported. This is to fix the problem
that runtime pm is enabled but BACO is not supported. MACO will be
handled seperately.

Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Jack Gui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
3 years agodrm/amdgpu: handle VCN instances when harvesting (v2)
Alex Deucher [Mon, 9 Aug 2021 15:22:20 +0000 (11:22 -0400)]
drm/amdgpu: handle VCN instances when harvesting (v2)

There may be multiple instances and only one is harvested.

v2: fix typo in commit message

Fixes: 83a0b8639185 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <[email protected]>
Reviewed-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
3 years agoMerge tag 'gvt-fixes-2021-08-10' of https://github.com/intel/gvt-linux into drm-intel...
Rodrigo Vivi [Tue, 10 Aug 2021 13:49:14 +0000 (09:49 -0400)]
Merge tag 'gvt-fixes-2021-08-10' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2021-08-10

- Fix windows VM hang issue for atomics workaround (Zhenyu)

Signed-off-by: Rodrigo Vivi <[email protected]>
From: Zhenyu Wang <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
3 years agoALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC
Jeremy Szu [Tue, 10 Aug 2021 10:08:45 +0000 (18:08 +0800)]
ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC

The HP ProBook 650 G8 Notebook PC is using ALC236 codec which is
using 0x02 to control mute LED and 0x01 to control micmute LED.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
3 years agoMerge branch 'fdb-backpressure-fixes'
David S. Miller [Tue, 10 Aug 2021 12:17:22 +0000 (13:17 +0100)]
Merge branch 'fdb-backpressure-fixes'

Vladimir Oltean says:

====================
Fix broken backpressure during FDB dump in DSA drivers

rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

DSA is one of the few switchdev drivers that have an .ndo_fdb_dump
implementation, because of the assumption that the hardware and software
FDBs cannot be efficiently kept in sync via SWITCHDEV_FDB_ADD_TO_BRIDGE.
Other drivers with a home-cooked .ndo_fdb_dump implementation are
ocelot and dpaa2-switch. These appear to do the correct thing, as do the
other DSA drivers, so nothing else appears to need fixing.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet: dsa: sja1105: fix broken backpressure in .port_fdb_dump
Vladimir Oltean [Tue, 10 Aug 2021 11:19:56 +0000 (14:19 +0300)]
net: dsa: sja1105: fix broken backpressure in .port_fdb_dump

rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

Fix the broken backpressure by propagating the "cb" return code and
allow rtnl_fdb_dump() to restart the FDB dump with a new skb.

Fixes: 291d1e72b756 ("net: dsa: sja1105: Add support for FDB and MDB management")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
This page took 0.134531 seconds and 4 git commands to generate.