Roland Dreier [Wed, 20 Jul 2011 09:28:56 +0000 (09:28 +0000)]
target: Make se_tmr_lock IRQ-safe
transport_lookup_tmr_lun() can be called from interrupt context and
therefore needs to use IRQ-safe spinlock functions. Fix this up, and
to make the locking work, convert the other uses of se_tmr_lock to be
IRQ-disabling.
Roland Dreier [Tue, 14 Jun 2011 03:55:06 +0000 (20:55 -0700)]
target: Make se_dev_check_online() locking IRQ-safe
se_dev_check_online() is called from transport_lookup_cmd_lun(), which
as discussed before may be called from interrupt context. So it needs
to use spin_lock_irqsave() instead of spin_lock_irq() to avoid
enabling interrupts at the wrong time.
Roland Dreier [Wed, 20 Jul 2011 09:09:10 +0000 (09:09 +0000)]
target: Make transport_lookup_cmd_lun() locking IRQ-safe
transport_lookup_cmd_lun() may be called from interrupt context (e.g.
tcm_loop_allocate_core_cmd() calls it, and it has a comment that says,
"Can be called from interrupt context"), so it needs to use
spin_lock_irqsave() instead of spin_lock_irq() to avoid enabling
interrupts at the wrong time.
(And indeed the last set of lock operations, on lun_cmd_lock, were
already using spin_lock_irqsave(), so we just need to fix the other
two locks we take)
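
The difference between the two lock variants can be illustrated with a small userspace model. This is only a sketch: a single boolean stands in for the CPU's interrupt-enable state, and the model_* helpers are hypothetical stand-ins, not the kernel's spinlock implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the CPU's IRQ-enable bit. */
static bool irqs_enabled = true;

/* Models spin_lock_irq()/spin_unlock_irq(): unlock always re-enables IRQs. */
static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }

/* Models spin_lock_irqsave(): remember the previous state, then disable. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): put back whatever state was saved. */
static void model_unlock_irqrestore(bool flags)
{
	assert(!irqs_enabled);	/* the lock is held with IRQs off */
	irqs_enabled = flags;
}

/* IRQ state left behind when a caller entered with IRQs already disabled. */
static bool state_after_irq_variant(void)
{
	irqs_enabled = false;	/* caller is in interrupt context */
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;	/* IRQs were enabled behind the caller's back */
}

static bool state_after_irqsave_variant(void)
{
	bool flags;

	irqs_enabled = false;	/* caller is in interrupt context */
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	return irqs_enabled;	/* caller's disabled state is preserved */
}
```

In this model the _irq variant leaves interrupts enabled even though the caller had them disabled, while the _irqsave variant hands back the state it found.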
This patch adds SCF_EMULATE_QUEUE_FULL support using -EAGAIN failures
via transport_handle_queue_full() to signal queue full in completion
path TFO->queue_data_in() and TFO->queue_status() callbacks.
This is done using a new se_cmd->transport_qf_callback() to handle
the following queue full exception cases within target core:
*) transport_send_check_condition_and_sense() failure paths in
transport_generic_request_failure() and transport_generic_complete_ok()
All logic is driven using se_device->qf_work_queue -> target_qf_do_work()
to requeue outstanding se_cmd at the head of se_dev->queue_obj->qobj_list
for transport_processing_thread() execution.
Tested using tcm_qla2xxx with MAX_OUTSTANDING_COMMANDS=128 for FCP READ
to trigger the TRANSPORT_COMPLETE_OK queue full cases, and a simulated
TFO->write_pending() -EAGAIN failure to trigger TRANSPORT_COMPLETE_QF_WP.
This patch adds a transport_handle_cdb_direct() optimization for mapping
and queueing tasks directly from within fabric processing context by calling
the newly exported transport_generic_new_cmd(). This currently expects to
be called from process context only, and will fail if called within interrupt
context.
This patch also leaves transport_generic_handle_cdb() unmodified for the
moment to function as expected with existing tcm_fc and ib_srpt fabrics,
and will be removed once these have been converted and tested with v4.1
code using transport_handle_cdb_direct().
Based on Andy's original patch here:
[PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
This patch contains a squashed version to remove unused SCF_* flags:
target: remove the unused SCF_SE_DISABLE_ONLINE_CHECK flag
target: remove the unused SCF_CMD_PASSTHROUGH_NOALLOC flag
target: remove the unused SCF_EMULATE_SYNC_UNMAP flag
target: remove the unused SCF_EMULATE_SYNC_CACHE flag
Andy Grover [Tue, 3 May 2011 00:12:10 +0000 (17:12 -0700)]
target: Updates from AGrover and HCH (round 3)
This patch contains a squashed version of third round series cleanups,
improvements, and simplifications from Andy and Christoph ahead of the
heavy lifting between round 3 -> 4 for the target core SGL conversion.
This includes cleanups to the main target I/O path and other miscellaneous
updates.
target: Replace custom sg<->buf functions with lib funcs
target: Simplify sector limiting code
target: get_cdb should never return NULL
target: Simplify transport_memcpy_se_mem_read_contig
target: Use assignment rather than increment for t_task_cdbs
target: Don't pass dma_size to generic_get_mem
target: Pass sg with type scatterlist in transport_map_sg_to_mem
target: Move task_sg_num next to task_sg in struct se_task
target: inline struct se_transport_task into struct se_cmd
target: Change name & semantics of transport_get_sectors()
target: Remove unused members of se_cmd
target: Rename se_cmd.t_task_cdbs to t_task_list_num
target: Fix some spelling
target: Remove unused var from transport_generic_do_tmr
target: map_sg_to_mem: return sg_count in return value
target/pscsi: Use min_t for sector limits
target/pscsi: Unused param for pscsi_get_bio()
target: Rename get_cdb_count to allocate_tasks
target: Make transport_generic_new_cmd() available for iscsi-target
target: Remove fabric callback to allocate iovecs
target: Fix transport_generic_new_cmd WRITE comment
(hch: Use __GFP_ZERO for alloc_pages() usage)
target: Fix WRITE_SAME_[16,32] number of blocks=0 case
This patch fixes the handling of WRITE_SAME_[16,32] emulation where a
WRITE_SAME_* CDB with number of blocks=0 was being rejected by SCSI
expected data transfer length overflow checking in target core.
It changes both CDB cases in transport_generic_cmd_sequencer() to use
dev->se_sub_dev->se_dev_attrib.block_size to match what sg_write_same
is sending us with --num=0. It also fixes target_emulate_write_same()
to properly determine the num_blocks with --num=0 case to determine the
remaining range for dev->transport->do_discard().
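
A minimal sketch of the range calculation described above, assuming that a block count of zero means "from the starting LBA to the end of the device". The helper name and parameters are hypothetical and are not the target core's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many blocks a WRITE_SAME should cover.
 * total_blocks is the device capacity in blocks, lba the starting block,
 * num_blocks the count taken from the CDB.
 */
static uint64_t write_same_range(uint64_t total_blocks, uint64_t lba,
				 uint32_t num_blocks)
{
	assert(lba < total_blocks);

	if (num_blocks != 0)
		return num_blocks;

	/* Assumed meaning of blocks=0: everything from lba to the end. */
	return total_blocks - lba;
}
```

With a 1000-block device and a starting LBA of 100, a count of 8 yields a range of 8, and a count of 0 yields the remaining 900 blocks.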
Andy Grover [Tue, 19 Jul 2011 10:26:37 +0000 (10:26 +0000)]
target: More core cleanups from AGrover (round 2)
This patch contains the squashed version of second round of target core
cleanups and simplifications from Andy and Co. It also contains a handful
of fixes to address bugs in the original series and other minor cleanups.
Here is the condensed shortlog:
target: Remove unneeded casts to void*
target: Rename get_lun_for_{cmd,tmr} to lookup_{cmd,tmr}_lun
target: Make t_task a member of se_cmd, not a pointer
target: Handle functions returning "-2"
target: Use cmd->se_dev over cmd->se_lun->lun_se_dev
target: Embed qr in struct se_cmd
target: Replace embedded struct se_queue_req with a list_head
target: Rename list_heads that are nodes in struct se_cmd to "*_node"
target: Fold transport_device_setup_cmd() into lookup_{tmr,cmd}_lun()
target: Make t_mem_list and t_mem_list_bidi members of t_task
target: Add comment & cleanup transport_map_sg_to_mem()
target: Remove unneeded checks in transport_free_pages()
Andy Grover [Tue, 19 Jul 2011 08:55:10 +0000 (08:55 +0000)]
target: Core cleanups from AGrover (round 1)
This patch contains the squashed version of a number of cleanups and
minor fixes from Andy's initial series (round 1) for target core this
past spring. The condensed log looks like:
target: use errno values instead of returning -1 for everything
target: Rename transport_calc_sg_num to transport_init_task_sg
target: Fix leak in error path in transport_init_task_sg
target/pscsi: Remove pscsi_get_sh() usage
target: Make two runtime checks into WARN_ONs
target: Remove hba queue depth and convert to spin_lock_irq usage
target: dev->dev_status_queue_obj is unused
target: Make struct se_queue_req.cmd type struct se_cmd *
target: Remove __transport_get_qr_from_queue()
target: Rename se_dev->g_se_dev_list to se_dev_node
target: Remove struct se_global
target: Simplify scsi mib index table code
target: Make dev_queue_obj a member of se_device instead of a pointer
target: remove extraneous returns at end of void functions
target: Ensure transport_dump_vpd_ident_type returns null-terminated str
target: Function pointers don't need to use '&' to be assigned
target: Fix comment in __transport_execute_tasks()
target: Misc style cleanups
target: rename struct pr_reservation_template to pr_reservation
target: Remove #defines that just perform indirection
target: Inline transport_get_task_from_execute_queue()
target: Minor header comment fixes
target: Remove unused su_group usage in fabric register/deregister
This patch removes two instances of left over v3.x code performing local
scope access to struct target_core_fabric_ops->tf_subsys->su_group in
target_fabric_configfs_register() and target_fabric_configfs_deregister().
This patch removes the now unnecessary 'unsigned char *cdb' function
parameter from transport_get_lun_for_cmd(). This also includes updating
lio-target, tcm_loop and tcm_fc usage of transport_get_lun_for_cmd().
In commit c225150b "slab: fix DEBUG_SLAB build",
"if ((unsigned long)objp & (ARCH_SLAB_MINALIGN-1))" is always true if
ARCH_SLAB_MINALIGN == 0. Do not print warning if ARCH_SLAB_MINALIGN == 0.
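
Why the check misfires can be seen in a small standalone sketch. The function names are hypothetical, and the guarded form is one possible shape of the fix, not necessarily the exact change made to slab.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Original-style check. With minalign == 0, (minalign - 1) wraps to an
 * all-ones mask, so every non-zero address looks misaligned.
 */
static bool misaligned_unguarded(uintptr_t addr, uintptr_t minalign)
{
	return (addr & (minalign - 1)) != 0;
}

/* Guarded check: a minimum alignment of 0 means there is nothing to test. */
static bool misaligned_guarded(uintptr_t addr, uintptr_t minalign)
{
	/* minalign is expected to be 0 or a power of two */
	assert((minalign & (minalign - 1)) == 0);

	return minalign != 0 && (addr & (minalign - 1)) != 0;
}
```

For a page-aligned address such as 0x1000 and a minimum alignment of 0, the unguarded form reports a misalignment while the guarded form does not.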
Don Skidmore [Thu, 21 Jul 2011 05:55:00 +0000 (05:55 +0000)]
ixgbe: convert to ndo_fix_features
Private rx_csum flags are now duplicate of netdev->features &
NETIF_F_RXCSUM. We remove those duplicates and now use the net_device_ops
ndo_set_features. This was based on the original patch submitted by
Michal Miroslaw. I also removed the special
case not requiring a reset for X540 hardware. It is needed just as it is
in 82599 hardware.
Andy Gospodarek [Sat, 16 Jul 2011 07:31:33 +0000 (07:31 +0000)]
ixgbe: only enable WoL for magic packet by default
Martin Wilck reported that systems using
the ixgbe-driver that were capable of WoL were rebooting almost as soon
as they were shut down. This is because the default WoL settings
enabled magic packet, broadcast, unicast, and multicast.
Other Intel devices seem to use the stored eeprom value for initial WoL
capabilities. The 82578DM (e1000e) and 82576 (igb) devices I looked
at had only the magic packet enabled in the eeprom, so that seems
appropriate on ixgbe-based devices as well. I set the WoL options on my
82578DM to be the same default as the ixgbe devices (umbg) and saw the
same as Martin -- almost as soon as my box shut down, it booted again.
This patch changes the default to only be the magic packet. This is the
same as the default for most Intel and non-Intel hardware currently
upstream.
Alexander Duyck [Sat, 11 Jun 2011 01:45:13 +0000 (01:45 +0000)]
ixgbe: Pass staterr instead of re-reading status and error bits from descriptor
This change is meant to address possible race conditions from the status
and error bits on the RX descriptors being re-read by multiple functions in
the RX cleanup path. To resolve this I have added code that will pass the
staterr value to those functions.
Alexander Duyck [Sat, 11 Jun 2011 01:45:08 +0000 (01:45 +0000)]
ixgbe: Move interrupt related values out of ring and into q_vector
This change moves work_limit, total_packets, and total_bytes into the ring
container struct of the q_vector. The advantage of this is that it should
reduce the size of memory used in the event of multiple rings being
assigned to a single q_vector. In addition it should help to reduce the
total workload for calculating itr since now total_packets and total_bytes
will be the total work done of the interrupt instead of for the ring.
Alexander Duyck [Sat, 11 Jun 2011 01:45:03 +0000 (01:45 +0000)]
ixgbe: add structure for containing RX/TX rings to q_vector
This patch adds support for a ring container structure to be used within
the q_vector. The basic idea is to provide a means of separating the RX
and TX rings while maintaining a common structure for their containment.
The advantage to this is that later we should be able to pass this
structure to the update_itr functions without needing to pass individual
rings.
Alexander Duyck [Sat, 11 Jun 2011 01:44:58 +0000 (01:44 +0000)]
ixgbe: inline the ixgbe_maybe_stop_tx function
The ixgbe_maybe_stop_tx function is only a few lines long and is called
multiple times through the xmit hotpath. In order to streamline things it
makes sense to just inline it.
Alexander Duyck [Sat, 11 Jun 2011 01:44:53 +0000 (01:44 +0000)]
ixgbe: Update ATR to use recorded TX queues instead of CPU for routing
This change is meant to update ATR so that it will use the recorded RX
queue instead of the CPU in the case of routing. This change is meant to
help ixgbe default behavior to more closely match that of the kernel.
e1000: always call e1000_check_for_link() on e1000_ce4100 MACs.
Interrupts about link lost or rx sequence errors are not reported by
the ce4100 hardware, leading to transitions from link UP to link DOWN
never being reported.
Rusty Russell [Fri, 22 Jul 2011 05:09:49 +0000 (14:39 +0930)]
lguest: Simplify device initialization.
We used to notify the Host every time we updated a device's status. However,
it only really needs to know when we're resetting the device, or failed to
initialize it, or when we've finished our feature negotiation.
In particular, we used to wait for VIRTIO_CONFIG_S_DRIVER_OK in the
status byte before starting the device service threads. But this
corresponds to the successful finish of device initialization, which
might (like virtio_blk's partition scanning) use the device. So we
had a hack: if they used the device before we expected, we started the
threads anyway.
Now we hook into the finalize_features hook in the Guest: at that
point we tell the Launcher that it can rely on the features we have
acked. On the Launcher side, we look at the status at that point, and
start servicing the device.
Rusty Russell [Fri, 22 Jul 2011 05:09:48 +0000 (14:39 +0930)]
lguest: use a special 1:1 linear pagetable mode until first switch.
The Host used to create some page tables for the Guest to use at the
top of Guest memory; it would then tell the Guest where this was. In
particular, it created linear mappings for 0 and 0xC0000000 addresses
because lguest used to switch to its real page tables quite late in
boot.
However, since d50d8fe19 Linux initialized boot page tables in
head_32.S even before the "are we lguest?" boot jump. So, now we can
simplify things: the Host pagetable code assumes 1:1 linear mapping
until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do
before we reach C code.
This also means that the Host doesn't need to know anything about the
Guest's PAGE_OFFSET. (Non-Linux guests might not even have such a
thing).
Sakari Ailus [Sun, 26 Jun 2011 16:36:46 +0000 (19:36 +0300)]
lguest: Do not exit on non-fatal errors
Do not exit on some non-fatal errors:
- writev() fails in net_output(). The result is a lost packet or packets.
- writev() fails in console_output(). The result is partially lost console
output.
- readv() fails in net_input(). The result is a lost packet or packets.
Rather than bringing the guest down, this patch ignores e.g. an allocation
failure on the host side. Example:
o Minimum fw version supported for P3 chip is 4.0.505
o File Fw > 4.0.554 is not supported if flash fw < 4.0.554.
o In mn firmware case, file fw older than flash fw is allowed.
o Change variable names for readability
o Update driver version 4.0.76
be2net: request native mode each time the card is reset
Currently be3-native mode is requested only in probe(). It must be requested
each time the card is reset, either after an EEH error or after
sleep/hibernation.
Also, be_cmd_check_native_mode() is better named be_cmd_req_native_mode().
Bill Sommerfeld [Tue, 19 Jul 2011 15:22:33 +0000 (15:22 +0000)]
ipv4: Constrain UFO fragment sizes to multiples of 8 bytes
Because the ip fragment offset field counts 8-byte chunks, ip
fragments other than the last must contain a multiple of 8 bytes of
payload. ip_ufo_append_data wasn't respecting this constraint and,
depending on the MTU and ip option sizes, could create malformed
non-final fragments.
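
The constraint can be written as a one-line calculation. This is a sketch with a hypothetical helper name: it rounds the payload room down to a multiple of 8 bytes and adds the header length back.

```c
#include <assert.h>

/*
 * Largest total length of a non-final fragment such that its payload
 * (everything after the IP header and options) is a multiple of 8 bytes.
 */
static unsigned int max_frag_len(unsigned int mtu, unsigned int frag_hdr_len)
{
	assert(mtu > frag_hdr_len);

	return ((mtu - frag_hdr_len) & ~7u) + frag_hdr_len;
}
```

With an MTU of 1500 and a plain 20-byte header, the payload room of 1480 is already a multiple of 8. With 4 bytes of IP options (a 24-byte header), the payload room of 1476 is rounded down to 1472, giving a fragment of 1496 bytes.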
Fix a panic in virtnet_remove. unregister_netdev has already
freed up the netdev (and virtnet_info) due to dev->destructor
being set, while virtnet_info is still required. Remove
virtnet_free altogether, and move the freeing of the per-cpu
statistics from virtnet_free to virtnet_remove.
Eric Dumazet [Fri, 22 Jul 2011 04:25:58 +0000 (21:25 -0700)]
ipv6: make fragment identifications less predictable
IPv6 fragment identification generation is far weaker than what we use for
IPv4: it uses a single generator. It is not scalable and allows DoS
attacks.
Now that inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide).
This patch:
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter
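
The idea of per-destination state can be sketched as follows. Everything here is hypothetical: the fixed-size table stands in for inetpeer, and the start value uses a plain FNV-1a hash where a real generator would use a keyed secret, so this illustrates only the per-destination counters, not the security properties.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PEERS 8

struct peer {
	uint8_t  addr[16];	/* IPv6 destination address */
	uint32_t next_id;	/* next fragment identification for this peer */
	int      used;
};

static struct peer peers[MAX_PEERS];

/* Placeholder start value; a real generator would use a keyed secret hash. */
static uint32_t initial_id(const uint8_t addr[16])
{
	uint32_t h = 2166136261u;	/* FNV-1a, for illustration only */

	for (int i = 0; i < 16; i++)
		h = (h ^ addr[i]) * 16777619u;
	return h;
}

/* Return the next ident for addr; each destination has its own counter. */
static uint32_t select_ident(const uint8_t addr[16])
{
	struct peer *free_slot = NULL;

	for (int i = 0; i < MAX_PEERS; i++) {
		if (peers[i].used && memcmp(peers[i].addr, addr, 16) == 0)
			return peers[i].next_id++;
		if (!peers[i].used && !free_slot)
			free_slot = &peers[i];
	}

	assert(free_slot != NULL);	/* sketch: no eviction when full */
	memcpy(free_slot->addr, addr, 16);
	free_slot->used = 1;
	free_slot->next_id = initial_id(addr);
	return free_slot->next_id++;
}
```

Traffic to one destination advances only that destination's counter, so an observer of one flow learns nothing about the identifiers used for another.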
stmmac: unify MAC and PHY configuration parameters (V2)
Prior to this change, most PHY configuration parameters were passed
into the STMMAC device as a separate PHY device. As well as being
unusual, this made it difficult to make changes to the MAC/PHY
relationship.
This patch moves all the PHY parameters into the MAC configuration
structure, mainly as a separate structure. This allows us to completely
ignore the MDIO bus attached to a stmmac if desired, and not create
the PHY bus. It also allows the stmmac driver to use a different PHY
from the one it is connected to, for example a fixed PHY or bit banging
PHY.
Also derive the stmmac/PHY connection type (MII/RMII etc) from the
mode that can be passed into <platf>_configure_ethernet.
STLinux kernel at git://git.stlinux.com/stm/linux-sh4-2.6.32.y.git
provides several examples of how to use this new infrastructure (that
actually is easier to maintain and clearer).
stmmac: remove warning when compile as built-in (V2)
The patch removes the following series of warnings
when the driver is compiled as built-in:
drivers/net/stmmac/stmmac_main.c: In function stmmac_cmdline_opt:
drivers/net/stmmac/stmmac_main.c:1855:12: warning: ignoring return
value of kstrtoul, declared with attribute warn_unused_result
[snip]
Ben Hutchings [Thu, 21 Jul 2011 22:25:30 +0000 (15:25 -0700)]
ethtool: Allow zero-length register dumps again
Some drivers (ab)use the ethtool_ops::get_regs operation to expose
only a hardware revision ID. Commit a77f5db361ed9953b5b749353ea2c7fed2bf8d93 ('ethtool: Allocate register
dump buffer with vmalloc()') had the side-effect of breaking these, as
vmalloc() returns a null pointer for size=0 whereas kmalloc() did not.
For backward-compatibility, allow zero-length dumps again.
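
One way to tolerate zero-length requests is to skip the allocation entirely and treat a null buffer as an error only when a non-zero size was asked for. The sketch below is a userspace analogue using malloc(); the helper name is hypothetical and this is not the ethtool core's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate a dump buffer of len bytes.
 * Returns 0 on success and -ENOMEM on allocation failure.
 * For len == 0 it succeeds and leaves *buf set to NULL.
 */
static int alloc_dump_buffer(size_t len, void **buf)
{
	assert(buf != NULL);

	*buf = NULL;
	if (len == 0)
		return 0;	/* nothing to allocate, and not an error */

	*buf = malloc(len);
	return *buf ? 0 : -ENOMEM;
}
```

A caller can then pass the (possibly null) buffer on to the driver callback, which for a zero-length dump fills in only the header fields such as the revision ID.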
This patch adds the missing dma_unmap(),
which solves the critical issue of system freezes under heavy load.
Michal Miroslaw's rejected patch:
[PATCH v2 10/46] net: jme: convert to generic DMA API
also pointed out the issue. Thank you, Michal.
But that fix was incorrect: it would unmap a still-needed address
when memory was low.
Got lots of feedback from end users and the Gentoo Bugzilla:
https://bugs.gentoo.org/show_bug.cgi?id=373109
Thank you all. :)
Dan Carpenter [Tue, 19 Jul 2011 22:51:49 +0000 (22:51 +0000)]
skbuff: fix error handling in pskb_copy()
There are two problems:
1) "n" was allocated with alloc_skb() so we should free it with
kfree_skb() instead of regular kfree().
2) We return the freed pointer instead of NULL.
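
Both problems can be shown in a small userspace analogue. The struct and helpers below are hypothetical stand-ins for sk_buff, alloc_skb() and kfree_skb(); the fail_expand flag simulates a failure in a later setup step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an sk_buff with separately allocated data. */
struct buf {
	unsigned char *data;
	size_t len;
};

static int live_bufs;	/* allocation counter, used to check for leaks */

static struct buf *buf_alloc(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = calloc(len ? len : 1, 1);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->len = len;
	live_bufs++;
	return b;
}

/* Matching destructor: releases the payload as well as the struct. */
static void buf_free(struct buf *b)
{
	if (!b)
		return;
	free(b->data);
	free(b);
	live_bufs--;
}

/* Copy src; fail_expand simulates a failure after n has been allocated. */
static struct buf *buf_copy(const struct buf *src, int fail_expand)
{
	struct buf *n;

	assert(src != NULL);

	n = buf_alloc(src->len);
	if (!n)
		goto out;

	if (fail_expand) {
		buf_free(n);	/* not plain free(n): that would leak n->data */
		n = NULL;	/* never hand back a freed pointer */
		goto out;
	}

	memcpy(n->data, src->data, src->len);
out:
	return n;
}
```

On the simulated failure path the copy is released with its matching destructor and the caller receives NULL, so nothing leaks and no dangling pointer escapes.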