RDS: RDMA: Fix the composite message user notification
When an application sends an RDS RDMA composite message consisting of
an RDMA transfer followed by a non-RDMA payload, it expects to
be notified *only* when the full message has been delivered. RDS RDMA
notification doesn't behave this way, though.
Thanks to Venkat for debugging and root-causing the issue
where only the first part of the message (RDMA) was
successfully delivered but delivery of the remaining payload failed.
In that case, the application should not be notified with
a false positive of message delivery success.
Fix this case by making sure the user gets notified only after
the full message delivery.
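The intended behaviour can be sketched with a toy model. This is not the
RDS code; the class and method names below are hypothetical and only
illustrate "notify once, and only after both parts are done":

```python
from dataclasses import dataclass, field


@dataclass
class CompositeMessage:
    """Toy model: an RDMA transfer followed by a non-RDMA payload."""
    rdma_done: bool = False
    data_done: bool = False
    failed: bool = False
    notifications: list = field(default_factory=list)

    def complete_rdma(self):
        # The RDMA part finished; on its own this must not notify the user.
        self.rdma_done = True
        self._maybe_notify()

    def complete_data(self, ok=True):
        # The trailing payload either arrived or its delivery failed.
        if ok:
            self.data_done = True
        else:
            self.failed = True
        self._maybe_notify()

    def _maybe_notify(self):
        if self.notifications:
            return  # notify at most once
        if self.failed:
            self.notifications.append("failure")
        elif self.rdma_done and self.data_done:
            self.notifications.append("success")
```

With this model, completing only the RDMA part leaves the notification
list empty, which is the false positive the fix is meant to avoid.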
In the absence of extension headers, this log message keeps
flooding the console. Even without use_once we can
clean up the MRs, so it is not really an error case;
make it a debug message instead.
RDS: IB: split the mr registration and invalidation path
MR invalidation in RDS is done in a background thread and not in the
data path like registration. So break the dependency between them,
which helps to remove the performance bottleneck.
RDS: RDMA: return appropriate error on rdma map failures
The first message to a remote node should prompt a new
connection even if it is an RDMA operation. For an RDMA operation
the MR mapping can fail because the connection is not yet up.
Since connection establishment is asynchronous,
we make sure a map failure caused by an unavailable
connection reaches the user with an appropriate error code.
Before returning to the user, let's trigger the connection
so that it's ready for the next retry.
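A minimal sketch of that flow, assuming the error reported is EAGAIN
(the text above does not name the code) and using hypothetical names
for the connection object and the map function:

```python
import errno


class Connection:
    """Toy connection; establishment is asynchronous in the real stack."""

    def __init__(self):
        self.up = False
        self.connect_requested = False

    def request_connect(self):
        # Only records the request; nothing is brought up synchronously.
        self.connect_requested = True


def rdma_map(conn):
    """Return 0 on success, or a negative errno if the mapping cannot be done."""
    if not conn.up:
        # Kick off the connection so a later retry can succeed.
        conn.request_connect()
        return -errno.EAGAIN  # assumed error code, not taken from the patch
    return 0
```

The caller sees a retryable error on the first attempt and can simply
call again once the connection is up.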
RDS: mark few internal functions static to make sparse build happy
Fixes below warnings:
warning: symbol 'rds_send_probe' was not declared. Should it be static?
warning: symbol 'rds_send_ping' was not declared. Should it be static?
warning: symbol 'rds_tcp_accept_one_path' was not declared. Should it be static?
warning: symbol 'rds_walk_conn_path_info' was not declared. Should it be static?
David S. Miller [Mon, 2 Jan 2017 20:51:21 +0000 (15:51 -0500)]
Merge branch 'mlx5-odp'
Saeed Mahameed says:
====================
Mellanox mlx5 core and ODP updates 2017-01-01
The following eleven patches mainly come from Artemy Kovalyov
who expanded mlx5 on-demand-paging (ODP) support. In addition
there are three cleanup patches which don't change any functionality,
but are needed to align the codebase prior to accepting other patches.
A memory region (MR) in IB can be huge, and the ODP (on-demand paging)
technique allows the use of unpinned memory, which can be consumed and
released on demand. This means applications do not need to pin down
the underlying physical pages of the address space, and saves them
the need to track the validity of the mappings.
Rather, the HCA requests the latest translations from the OS when pages
are not present, and the OS invalidates translations which are no longer
valid due to either non-present pages or mapping changes.
In the existing ODP implementation, applications still need to register
memory buffers for communication, though registered memory regions
need not have valid mappings at registration time.
This patch set performs the following steps to expand
current ODP implementation:
1. It refactors UMR to support large regions by introducing a generic
function to perform HCA translation table modifications. This
function supports both atomic and process contexts and is not limited
by the number of modified entries.
This function makes it possible to enable preallocated memory regions of
arbitrary size, so MR cache buckets are added to support MRs of up to 16GB.
2. It changes the page fault event format and refactors the page fault
logic, together with the addition of atomic support.
3. It prepares mlx5 core code to support implicit registration with
simplified and relaxed semantics.
Implicit ODP semantics allow applications to provide a special memory
key that represents their complete address space. Thus all IO accesses
referencing this key (with proper access rights associated with the key)
need not register any virtual address range.
Thanks,
Artemy, Ilya and Leon
v1->v2:
- Don't use 'inline' in .c files
====================
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:46 +0000 (11:37 +0200)]
{net,IB}/mlx5: Refactor page fault handling
* Update page fault event according to the latest specification.
* Separate code path for page fault EQ, completion EQ and async EQ.
* Move page fault handling work queue from mlx5_ib static variable
into mlx5_core page fault EQ.
* Allocate memory to store ODP events dynamically as the
events arrive; since this happens in atomic context, use a mempool.
* Make mlx5_ib page fault handler run in process context.
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:45 +0000 (11:37 +0200)]
net/mlx5: Update PAGE_FAULT_RESUME layout
Update PAGE_FAULT_RESUME command layout.
Three bit fields describing a page fault (rdma, rdma_write, req_res) gave 8
possible combinations, of which only a few were legal. Now they
are interpreted as a three-bit type field, where the formerly legal
combinations turn into the corresponding types and the unused ones were
added as new types.
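The counting argument can be checked with a short sketch. The bit order
used here (rdma as the most significant bit) is an assumption made for
illustration only and is not taken from the command layout:

```python
def pack_flags(rdma, rdma_write, req_res):
    """Pack three one-bit flags into a single 3-bit value (assumed bit order)."""
    return (rdma << 2) | (rdma_write << 1) | req_res


# Every combination of the three flags maps to a distinct 3-bit type value.
all_types = {
    pack_flags(a, b, c)
    for a in (0, 1)
    for b in (0, 1)
    for c in (0, 1)
}
```

Since the three flags and the 3-bit type occupy the same bits, old
encodings keep their numeric value and the previously illegal
combinations become free type numbers.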
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:44 +0000 (11:37 +0200)]
IB/mlx5: Add MR cache for large UMR regions
In this change we turn mlx5_ib_update_mtt() into the generic
mlx5_ib_update_xlt() to perform HCA translation table modifications,
supporting both atomic and process contexts and not limited by the number
of modified entries.
Using this function we increase preallocated MRs up to 16GB.
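To give a feel for the scale, here is a rough sketch of updating a large
translation table in buffer-sized pieces. The page size, entry size and
staging buffer size are assumptions for illustration, and
chunked_updates() is a hypothetical helper, not the driver function:

```python
PAGE_SIZE = 4096         # assumed 4 KiB pages
ENTRY_SIZE = 8           # assumed bytes per translation entry
CHUNK_BYTES = PAGE_SIZE  # assumed size of the staging buffer per update


def chunked_updates(first_entry, n_entries):
    """Yield (start, count) pairs covering the range in buffer-sized pieces."""
    per_chunk = CHUNK_BYTES // ENTRY_SIZE
    done = 0
    while done < n_entries:
        count = min(per_chunk, n_entries - done)
        yield first_entry + done, count
        done += count


# Number of page entries needed to describe a 16 GiB region.
entries_16g = (16 << 30) // PAGE_SIZE
```

Under these assumptions a 16 GiB region needs 4194304 entries, far more
than fit in one staging buffer, which is why an update path without an
entry-count limit matters.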
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:42 +0000 (11:37 +0200)]
IB/mlx5: Refactor UMR post send format
* Update struct mlx5_wqe_umr_ctrl_seg.
* Currently UMR send_flags target only certain use cases: enabling/disabling
a cached MR, modifying XLT for ODP. Making the flags independent makes UMR
more flexible, allowing arbitrary manipulations.
* Since different UMR formats have different entry sizes, a UMR request
should receive the exact size of the translation table update instead of
the number of entries. Rename field npages to xlt_size in struct mlx5_umr_wr
and update relevant code accordingly.
* Add support for the length64 bit.
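Why a byte size is less ambiguous than an entry count can be shown with
a small sketch. The two entry sizes below are assumptions chosen for
illustration, and xlt_size() here is a hypothetical helper that merely
shares its name with the renamed field:

```python
SMALL_ENTRY_SIZE = 8   # assumed: entry holding only a 64-bit address
LARGE_ENTRY_SIZE = 16  # assumed: entry holding byte count, key and address


def xlt_size(n_entries, entry_size):
    """Size in bytes of a translation table update."""
    return n_entries * entry_size


# The same entry count means different amounts of data per format.
small = xlt_size(512, SMALL_ENTRY_SIZE)
large = xlt_size(512, LARGE_ENTRY_SIZE)
```

Passing the byte size lets the request describe the update exactly,
whatever entry format is in use.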
Artemy Kovalyov [Mon, 2 Jan 2017 09:37:41 +0000 (11:37 +0200)]
net/mlx5: Support new MR features
This patch adds the following items to the IFC file.
1. MLX5_MKC_ACCESS_MODE_KSM enum value for creating KSM memory keys.
The KSM access mode is used when an indirect MKey is associated with
fixed memory size entries.
2. null_mkey field that is used to indicate non-present KLM/KSM
entries, where it causes the device to generate page fault event
when trying to access it.
3. struct mlx5_ifc_cmd_hca_cap_bits capability bits indicating
related value/field is supported:
* fixed_buffer_size - MLX5_MKC_ACCESS_MODE_KSM
* umr_extended_translation_offset - translation_offset_42_16
in UMR ctrl segment
* null_mkey - null_mkey in QUERY_SPECIAL_CONTEXTS
Binoy Jayan [Mon, 2 Jan 2017 09:37:40 +0000 (11:37 +0200)]
IB/mlx5: Add helper mlx5_ib_post_send_wait
Clean up the following common code (to post a list of work requests to the
send queue of the specified QP) at various places and add a helper function
'mlx5_ib_post_send_wait' to implement the same.
- Initialize 'mlx5_ib_umr_context' on stack
- Assign 'mlx5_umr_wr:wr:wr_cqe' to umr_context.cqe
- Acquire the semaphore
- Call ib_post_send with a single ib_send_wr
- wait_for_completion()
- Check for umr_context.status
- Release the semaphore
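The sequence of steps above can be modelled in user space as follows.
This is only an analogy built on Python threading primitives; the
UmrContext class, the dictionary used as a work request and
fake_post_send() are hypothetical stand-ins, not the kernel API:

```python
import threading


class UmrContext:
    """Per-call context holding the completion and its final status."""

    def __init__(self):
        self.done = threading.Event()
        self.status = None


def post_send_wait(sem, post_send, wr):
    ctx = UmrContext()   # created per call, like an on-stack context
    wr["cqe"] = ctx      # the completion handler finds the context here
    with sem:            # acquire; released on every exit path
        err = post_send(wr)
        if err:
            return err
        ctx.done.wait()  # block until the completion fires
        if ctx.status != "success":
            return -1
    return 0


def fake_post_send(wr):
    """Stand-in for posting a work request that completes immediately."""
    wr["cqe"].status = "success"
    wr["cqe"].done.set()
    return 0
```

Keeping the acquire and release in one helper means no caller can
forget to release the semaphore on an error path.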
David S. Miller [Mon, 2 Jan 2017 20:23:34 +0000 (15:23 -0500)]
Merge tag 'wireless-drivers-next-for-davem-2017-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.11
The most notable change here is the inclusion of airtime fairness
scheduling to ath9k. It prevents slow clients from hogging all the
airtime and unfairly slowing down faster clients.
For core revision 3.x Address-Aligned Beats is available in two registers.
The DT property snps,aal was created for AAL in the DMA bus register,
which is a read/write bit.
The DT property snps,axi_all was created for AXI_AAL in the AXI bus mode
register, which is a read only bit that reflects the value of AAL in the
DMA bus register.
Since the value of snps,axi_all is never used in the driver,
and since the property was created for a bit that is read only,
it should be safe to remove the property.
David S. Miller [Mon, 2 Jan 2017 02:02:21 +0000 (21:02 -0500)]
Merge branch 'qed-driver-updates'
Yuval Mintz says:
====================
qed*: Driver updates
The more interesting changes in this series include:
- Restructuring of the qede files - qede_main.c has grown big and this
series splits it into 3 parts [patches #2 and #3].
- Some significant changes in the API through which RSS indirection
table gets configured [#8].
- Support for ndo_set_vf_trust() [#9] which would regulate which VFs
are allowed to use promisc/multi-promisc mode.
It also contains various minor changes to qed/qede, as well as
non-functional changes [#1, #12] to complement other changes.
====================
Mintz, Yuval [Sun, 1 Jan 2017 11:57:07 +0000 (13:57 +0200)]
qed*: RSS indirection based on queue-handles
A step toward having qede agnostic to the queue configurations
in firmware/hardware - let the RSS indirections use queue handles
instead of actual queue indices.
Mintz, Yuval [Sun, 1 Jan 2017 11:57:04 +0000 (13:57 +0200)]
qede: Postpone reallocation until NAPI end
During the Rx flow the driver allocates a replacement buffer each time
it consumes an Rx buffer. Failing to do so, it would consume the
currently processed buffer and re-post it on the ring.
As a result, the Rx ring is always completely full [from the driver's POV].
We now allow the Rx ring to shorten by doing the re-allocations
at the end of the NAPI run. The only limitation is that we still want to
make sure each time we reallocate that we'd still have sufficient
elements in the Rx ring to guarantee that FW would be able to post
additional data and trigger an interrupt.
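A toy model of the deferred refill, assuming a hypothetical floor
(min_filled) below which the ring is never allowed to shrink; the class
and its methods are illustrative and do not mirror the driver's code:

```python
class RxRing:
    """Counts filled buffers only; packets themselves are not modelled."""

    def __init__(self, size, min_filled):
        self.size = size
        self.filled = size
        self.min_filled = min_filled  # hypothetical safety threshold

    def consume(self, can_alloc=True):
        """Consume one buffer; return True if the packet is kept."""
        if self.filled > self.min_filled:
            self.filled -= 1  # lazy path: replacement postponed
            return True
        if can_alloc:
            return True       # at the floor: buffer replaced right away
        return False          # allocation failed: buffer is re-posted

    def end_of_poll(self, available):
        """Refill at the end of the poll run; `available` allocations succeed."""
        while self.filled < self.size and available > 0:
            self.filled += 1
            available -= 1
```

The ring may run short during a poll run, but it never drops below the
floor, so there is always room for new data to arrive.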
Mintz, Yuval [Sun, 1 Jan 2017 11:57:03 +0000 (13:57 +0200)]
qed*: Change maximal number of queues
Today qede requests contexts that would suffice for 64 'whole'
combined queues [192 meant for 64 rx, tx and xdp tx queues],
but registers netdev and limits the number of queues based on
information received by qed. In turn, qed doesn't take context
into account when informing qede how many queues it can support.
This would lead to a configuration problem in case the user tries
configuring >64 combined queues on an interface [or >96 in case
xdp isn't enabled]. Since we don't have a management firmware
that actually provides so many interrupt lines to a single
device we're currently safe, but that's about to change soon.
The new maximum is hence changed:
- For RoCE devices, the limit would remain 64.
- For non-RoCE devices, the limit might be higher [depending
on the actual configuration of the device].
qed would start enforcing that limit in both scenarios.
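The 64 and 96 figures follow from dividing the 192 requested contexts
by the number of queues each combined queue needs. A small sketch, with
a hypothetical function name:

```python
def max_combined_queues(contexts, xdp_enabled):
    """Each combined queue uses rx + tx contexts, plus xdp tx when enabled."""
    per_queue = 3 if xdp_enabled else 2
    return contexts // per_queue


with_xdp = max_combined_queues(192, xdp_enabled=True)
without_xdp = max_combined_queues(192, xdp_enabled=False)
```

Anything above these values would exceed the contexts that were
requested, which is the mismatch the new limit is meant to prevent.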
Mintz, Yuval [Sun, 1 Jan 2017 11:57:00 +0000 (13:57 +0200)]
qed*: Update to dual-license
Since the submission of the qedr driver, there's inconsistency
in the licensing of the various qed/qede files - some are GPLv2
and some are dual-license.
Since qedr requires dual-license and it's dependent on both,
we're updating the licensing of all qed/qede source files.
Thomas Preisner [Fri, 30 Dec 2016 02:37:54 +0000 (03:37 +0100)]
net: 3com: typhoon: typhoon_init_one: make return values more specific
In some cases the return value of a failing function is not being used
and the function typhoon_init_one() returns another negative error code
instead.
In a few cases the err-variable is not set to a negative error code if a
function call in typhoon_init_one() fails and thus 0 is returned
instead.
It may be better to set err to the appropriate negative error
code before returning.
Relax the check in setsockopt to allow setting mc_index to an L3 slave if
sk_bound_dev_if points to an L3 master.
Make a similar change for IPv6. In this case change the device lookup to
take the rcu_read_lock avoiding a refcnt. The rcu lock is also needed for
the lookup of a potential L3 master device.
This really only silences a setsockopt failure since uses of mc_index are
secondary to sk_bound_dev_if if it is set. In both cases, if either index
is an L3 slave or master, lookups are directed to the same FIB table so
relaxing the check at setsockopt time causes no harm.
Patch is based on a suggested change by Darwin for a problem noted in
their code base.
Ping-Ke Shih [Wed, 28 Dec 2016 21:40:04 +0000 (15:40 -0600)]
rtlwifi: Fix alignment issues
The addresses of Wlan NIC registers are naturally aligned, but some
drivers have bugs. These are evident on platforms that need natural
alignment to access registers. This change contains the following:
1. Function _rtl8821ae_dbi_read() is used to read one byte from DBI,
thus it should use rtl_read_byte().
2. Register 0x4C7 of 8192ee is single byte.
Bhumika Goyal [Sat, 17 Dec 2016 22:57:24 +0000 (04:27 +0530)]
libertas: constify cfg80211_ops structures
cfg80211_ops structures are only passed as an argument to the function
wiphy_new. This argument is of type const, so cfg80211_ops structures
having this property can be declared as const.
Done using Coccinelle.
Larry Finger [Thu, 15 Dec 2016 18:23:10 +0000 (12:23 -0600)]
rtlwifi: Remove some redundant code
The symbol DBG_EMERG is no longer used and is removed.
In a number of places, the code has redundant messages. For example, if
the failure for the firmware to run is logged, it is not necessary to
log that the firmware has been started. In addition, extraneous braces are
removed.
Larry Finger [Thu, 15 Dec 2016 18:23:09 +0000 (12:23 -0600)]
rtlwifi: rtl8188ee: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:08 +0000 (12:23 -0600)]
rtlwifi: rtl8192c-common: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:07 +0000 (12:23 -0600)]
rtlwifi: rtl8192ce: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:06 +0000 (12:23 -0600)]
rtlwifi: rtl8192cu: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:05 +0000 (12:23 -0600)]
rtlwifi: rtl8192de: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:04 +0000 (12:23 -0600)]
rtlwifi: rtl8192se: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:03 +0000 (12:23 -0600)]
rtlwifi: rtl8723-common: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:02 +0000 (12:23 -0600)]
rtlwifi: rtl8192ee: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:01 +0000 (12:23 -0600)]
rtlwifi: rtl8723ae: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:23:00 +0000 (12:23 -0600)]
rtlwifi: rtl8723be: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
Larry Finger [Thu, 15 Dec 2016 18:22:59 +0000 (12:22 -0600)]
rtlwifi: rtl8821ae: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().
rt2800: replace msleep() with usleep_range() on channel switch
msleep(1) can sleep much longer than the requested 1ms, which is not good
on channel switch, which we want to be performed fast (i.e. to make scan
faster). Replace msleep() with usleep_range(), which has a much smaller
upper bound on sleeping time.
We need to perform different actions (AGC and VCO calibrations and VGC
tuning) periodically at different intervals. We don't need separate
works for those, we can use link tuner work and just check for proper
interval on it.
This fixes performing AGC and VCO calibration when scanning on STA
mode. We need to be on-channel to perform those calibrations.
rt2800: set MAX_PSDU len according to remote STAs capabilities
MAX_LEN_CFG_MAX_PSDU specifies the maximum AMPDU length transmitted by HW
(0 - 8kB, 1 - 16kB, 2 - 32kB, 3 - 64kB). Set this option according to
remote stations' capabilities (based on HT ampdu_factor). However, limit
the value based on our hardware TX capabilities, as some chips cannot send
more than 16kB (factor 1). The limit for all chips is currently 32kB
(factor 2), but perhaps for some chips this could be increased
to 64kB by setting drv_data->max_psdu to 3.
Since MAX_LEN_CFG_MAX_PSDU is a global setting, in multi-station modes
(AP, IBSS, mesh) we limit according to the least capable remote STA. We
cannot set a bigger value to speed up communication with some stations
without breaking communication with slow stations.
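The selection rule can be sketched as below. The function names are
hypothetical, and falling back to the hardware limit when no station is
connected is an assumption, since the text does not cover that case:

```python
def ampdu_len_kb(factor):
    """Map the factor to a length in kB: 0 -> 8, 1 -> 16, 2 -> 32, 3 -> 64."""
    return 8 << factor


def max_psdu(sta_factors, hw_max=2):
    """Pick the factor of the least capable station, capped by the hardware."""
    if not sta_factors:
        return hw_max  # assumption: no stations, use the hardware limit
    return min(min(sta_factors), hw_max)
```

For example, with stations advertising factors 3, 1 and 2 the result is
factor 1 (16kB), so the slowest station can still receive every
aggregate.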
Johannes Berg [Wed, 7 Dec 2016 06:36:46 +0000 (07:36 +0100)]
iwlegacy: make il3945_mac_ops __ro_after_init
There's no need for this to be only __read_mostly, since
it's only written in a single way depending on the module
parameter, so that can be moved into the module's __init
function, and the ops can be __ro_after_init.
This is a little bit safer since it means the ops can't
be overwritten (accidentally or otherwise), which would
otherwise cause an arbitrary function or bad pointer to
be called.
mwifiex: sdio: fix use after free issue for save_adapter
If we have sdio work requests received when sdio card reset is
happening, we may end up accessing older save_adapter pointer
later which is already freed during card reset.
This patch solves the problem by cancelling those pending requests.
ath10k: enable advertising support for channel 169, 5Ghz
Enable advertising support for channel 169, 5 GHz, so that
based on the regulatory domain (country code) this channel
can be active for use. For example, in countries like India
this channel will be available for use with the latest regulatory updates.
Ryan Hsu [Thu, 22 Dec 2016 23:02:37 +0000 (15:02 -0800)]
ath10k: ignore configuring the incorrect board_id
When the command to get the board_id from otp returns the following:
boot get otp board id result 0x00000000 board_id 0 chip_id 0
boot using board name 'bus=pci,bmi-chip-id=0,bmi-board-id=0"
...
failed to fetch board data for bus=pci,bmi-chip-id=0,bmi-board-id=0 from
ath10k/QCA6174/hw3.0/board-2.bin
The invalid board_id=0 will be used as index to search in the board-2.bin.
Ignore the case with board_id=0, as it means the otp is not carrying
the board id information.
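A sketch of the check, using the board name format visible in the log
above. The function name is hypothetical, and returning None so that
the caller can fall back to another lookup is an assumption about how
the result would be consumed:

```python
def board_name(bus, chip_id, board_id):
    """Build the lookup key for board-2.bin, or None if the id is unusable."""
    if board_id == 0:
        # board_id 0 means the otp carries no board id information.
        return None
    return f"bus={bus},bmi-chip-id={chip_id},bmi-board-id={board_id}"
```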
Ryan Hsu [Thu, 22 Dec 2016 22:31:46 +0000 (14:31 -0800)]
ath10k: recal the txpower when removing interface
The txpower is recalculated when adding an interface to make sure
txpower won't overshoot the spec, and when removing the interface,
the txpower should be recalculated again to restore the correct value
from the active interface list.
The following is one such scenario:
vdev0 is created as STA and connected: txpower:23
vdev1 is created as P2P_DEVICE for control interface: txpower:0
vdev2 is created as p2p go/gc interface: txpower is 21
So vdev2@txpower:21 will be set to firmware when vdev2 is created.
When we tear down vdev2, the txpower needs to be recalculated to
re-set it to vdev0@txpower:23, as vdev0/vdev1 are the active interfaces.
ath10k_pci mac vdev 0 peer create 8c:fd:f0:01:62:98
ath10k_pci mac vdev_id 0 txpower 23
... (adding interface)
ath10k_pci mac vdev create 2 (add interface) type 1 subtype 3
ath10k_pci mac vdev_id 2 txpower 21
ath10k_pci mac txpower 21
... (removing interface)
ath10k_pci mac vdev 2 delete (remove interface)
ath10k_pci vdev 1 txpower 0
ath10k_pci vdev 0 txpower 23
ath10k_pci mac txpower 23
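The scenario in the log can be replayed with a toy model. The Radio
class is hypothetical, and skipping interfaces whose txpower is 0 is an
assumption made so that the model reproduces the values logged above:

```python
class Radio:
    """Tracks per-interface txpower and the single value given to firmware."""

    def __init__(self):
        self.vifs = {}
        self.txpower = None

    def _recalc(self):
        # Assumed rule: smallest txpower among interfaces that have one set.
        valid = [p for p in self.vifs.values() if p > 0]
        self.txpower = min(valid) if valid else None

    def add_interface(self, vdev_id, txpower):
        self.vifs[vdev_id] = txpower
        self._recalc()

    def remove_interface(self, vdev_id):
        del self.vifs[vdev_id]
        self._recalc()  # the step this change adds
```

Adding vdev 0 (23), vdev 1 (0) and vdev 2 (21) yields 21; removing
vdev 2 brings the value back to 23 only because removal recalculates.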
Arun Khandavalli [Wed, 21 Dec 2016 12:19:21 +0000 (14:19 +0200)]
ath10k: support dev_coredump for crash dump
Whenever firmware crashes, and both CONFIG_ATH10K_DEBUGFS and
CONFIG_ALLOW_DEV_COREDUMP are enabled, dump information about the crash via a
devcoredump device. Dump can be read from userspace for further analysis from:
/sys/class/devcoredump/devcd*/data
Since until now we have provided the firmware crash dump file via the
fw_crash_dump debugfs file, keep it available but deprecated, and print a
warning that the user should switch to using dev_coredump.
Future improvement would be not to depend on CONFIG_ATH10K_DEBUGFS, as there
might be systems which want to get the firmware crash dump but not enable
debugfs. How to handle memory consumption is also something which needs to be
taken into account.
Ryan Hsu [Tue, 13 Dec 2016 22:55:19 +0000 (14:55 -0800)]
ath10k: fix incorrect txpower set by P2P_DEVICE interface
Ath10k reports the phy capability that supports P2P_DEVICE interface.
When we use the P2P supported wpa_supplicant to start connection, it'll
create two interfaces, one is wlan0 (vdev_id=0) and one is P2P_DEVICE
p2p-dev-wlan0 which is for p2p control channel (vdev_id=1).
ath10k_pci mac vdev create 0 (add interface) type 2 subtype 0
ath10k_add_interface: vdev_id: 0, txpower: 0, bss_power: 0
...
ath10k_pci mac vdev create 1 (add interface) type 2 subtype 1
ath10k_add_interface: vdev_id: 1, txpower: 0, bss_power: 0
And the txpower in the per-vif bss_conf will only be set to a valid tx
power when the interface is assigned a channel_ctx.
But this P2P_DEVICE interface will never be used for any connection, so
the uninitialized bss_conf.txpower=0 is assigned to
arvif->txpower when the interface is created.
The txpower configuration in firmware is per physical interface,
so the smallest txpower of all vifs will be the one that limits the tx power
of the physical device, which causes the low txpower issue on other
active interfaces.
wlan0: Limiting TX power to 21 (24 - 3) dBm
ath10k_pci mac vdev_id 0 txpower 21
ath10k_mac_txpower_recalc: vdev_id: 1, txpower: 0
ath10k_mac_txpower_recalc: vdev_id: 0, txpower: 21
ath10k_pci mac txpower 0
This issue only happens when we use the wpa_supplicant that supports
P2P or if we use the iw tool to create the control P2P_DEVICE interface.
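The difference between the faulty and the intended result can be shown
with two small functions. Both names are hypothetical, and treating 0
as "not set" is an assumption about the shape of the fix rather than a
description of the actual patch:

```python
def naive_recalc(vif_txpowers):
    """Smallest value wins, so an unset 0 drags the whole device down."""
    return min(vif_txpowers)


def recalc_ignoring_unset(vif_txpowers):
    """Skip interfaces that never got a valid txpower (assumed to be 0)."""
    valid = [p for p in vif_txpowers if p > 0]
    return min(valid) if valid else None
```

With the values from the log, wlan0 at 21 and the P2P_DEVICE interface
at 0, the first function returns 0 and the second returns 21.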
ath10k: fix potential memory leak in ath10k_wmi_tlv_op_pull_fw_stats()
ath10k_wmi_tlv_op_pull_fw_stats() uses tb = ath10k_wmi_tlv_parse_alloc(...)
function, which allocates memory. If any of the three error-paths are
taken, this tb needs to be freed.
Felix Manlunas [Fri, 30 Dec 2016 01:04:47 +0000 (17:04 -0800)]
liquidio: optimize reads from Octeon PCI console
Reads from Octeon PCI console are inefficient because before each read
operation, a dynamic mapping to Octeon DRAM is set up. This patch replaces
the repeated setup of a dynamic mapping with a one-time setup of a static
mapping.
Often, introducing side effects on packet processing on the other half
of the stack by adjusting one of TX/RX via sysctl is not desirable.
There are cases of demand for asymmetric, orthogonal configurability.
This holds true especially for nodes where RPS for RFS usage on top is
configured and therefore use the 'old dev_weight'. This is quite a
common base configuration setup nowadays, even with NICs of superior
processing support (e.g. aRFS).
A good example use case is nodes acting as noSQL databases with a
large number of tiny requests and rather fewer but large packets as
responses. It's affordable to have a large budget and rx dev_weights for
the requests. But as a side effect, having this large a number on TX
processed in one run can overwhelm drivers.
This patch therefore introduces independent configurability via sysctl
to userland.
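One way to picture independent configurability is a shared base weight
with a separate multiplier per direction. The parameter names and the
multiplicative form below are assumptions for illustration, since the
text above does not spell out the new sysctls:

```python
def budgets(dev_weight, rx_bias=1, tx_bias=1):
    """Return (rx_budget, tx_budget) derived from one base weight."""
    return dev_weight * rx_bias, dev_weight * tx_bias


# Raise only the receive side; transmit keeps the base weight.
rx_budget, tx_budget = budgets(64, rx_bias=4)
```

With both multipliers left at 1 the two budgets stay equal, which keeps
the old single-knob behaviour as the default.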
David S. Miller [Thu, 29 Dec 2016 19:37:25 +0000 (14:37 -0500)]
Merge branch 'bnxt_en-updates'
Michael Chan says:
====================
bnxt_en: updates for net-next.
This patch series for net-next contains cleanups, new features and minor
fixes. The driver specific busy polling code is removed to use busy
polling support in core networking. Hardware RFS support is enhanced with
added ipv6 flows support and VF support. A new scheme to allocate TX
rings from the firmware is implemented for newer chips and firmware. Plus
some misc. cleanups, minor fixes, and the addition of the maintainer entry.
Please review.
====================
Michael Chan [Thu, 29 Dec 2016 17:13:43 +0000 (12:13 -0500)]
bnxt_en: Handle no aggregation ring gracefully.
The current code assumes that we will always have at least 2 rx rings, of
which 1 will be used as an aggregation ring for TPA and jumbo page placements.
However, it is possible, especially on a VF, that there is only 1 rx
ring available. In this scenario, the current code will fail to initialize.
To handle it, we need to properly set up only 1 ring without aggregation.
Set a new flag BNXT_FLAG_NO_AGG_RINGS for this condition and add logic to
set up the chip to place RX data linearly into a single buffer per packet.
Michael Chan [Thu, 29 Dec 2016 17:13:42 +0000 (12:13 -0500)]
bnxt_en: Set default completion ring for async events.
With the added support for the bnxt_re RDMA driver, both drivers can be
allocating completion rings in any order. The firmware does not know
which completion ring should be receiving async events. Add an
extra step to tell firmware the completion ring number for receiving
async events after bnxt_en allocates the completion rings.
Michael Chan [Thu, 29 Dec 2016 17:13:41 +0000 (12:13 -0500)]
bnxt_en: Implement new scheme to reserve tx rings.
In order to properly support TX rate limiting in SRIOV VF functions or
NPAR functions, firmware needs better control over tx ring allocations.
The new scheme requires the driver to reserve the number of tx rings
and to query to see if the requested number of tx rings is reserved.
The driver will use the new scheme when the firmware interface spec is
1.6.1 or newer.
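The reserve-then-query flow and the version gate can be sketched as
follows. FakeFirmware and the function names are hypothetical stand-ins
and do not correspond to the real firmware interface:

```python
class FakeFirmware:
    """Stand-in firmware that grants at most `available` tx rings."""

    def __init__(self, available):
        self.available = available
        self.reserved = 0

    def reserve_tx_rings(self, n):
        self.reserved = min(n, self.available)

    def query_tx_rings(self):
        return self.reserved


def uses_reservation_scheme(spec_version):
    """The new scheme applies to interface spec 1.6.1 or newer."""
    return tuple(spec_version) >= (1, 6, 1)


def setup_tx_rings(fw, spec_version, wanted):
    if not uses_reservation_scheme(spec_version):
        return wanted  # older firmware: no reservation step
    fw.reserve_tx_rings(wanted)
    return fw.query_tx_rings()  # may be fewer than requested
```

Comparing the version as a tuple avoids the string comparison pitfall
where "1.10.0" would sort before "1.6.1".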