If a userspace application performes a "delete_controller" command
early during the ctrl initialization, the delete operation
may race against the init code and the kernel will crash.
nvme nvme5: Connect command failed: host path error
nvme nvme5: failed to connect queue: 0 ret=880
PF: supervisor write access in kernel mode
PF: error_code(0x0002) - not-present page
blk_mq_quiesce_queue+0x18/0x90
nvme_tcp_delete_ctrl+0x24/0x40 [nvme_tcp]
nvme_do_delete_ctrl+0x7f/0x8b [nvme_core]
nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
kernfs_fop_write_iter+0x124/0x1b0
new_sync_write+0xff/0x190
vfs_write+0x1ef/0x280
Fix the crash by checking the NVME_CTRL_STARTED_ONCE bit;
if it's not set it means that the nvme controller is still
in the process of getting initialized and the kernel
will return an -EBUSY error to userspace.
Set the NVME_CTRL_STARTED_ONCE later in the nvme_start_ctrl()
function, after the controller start operation is completed.
Signed-off-by: Maurizio Lombardi <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
{
struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+ if (!test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags))
+ return -EBUSY;
+
if (device_remove_file_self(dev, attr))
nvme_delete_ctrl_sync(ctrl);
return count;
* that were missed. We identify persistent discovery controllers by
* checking that they started once before, hence are reconnecting back.
*/
- if (test_and_set_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
+ if (test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
nvme_discovery_ctrl(ctrl))
nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
}
nvme_change_uevent(ctrl, "NVME_EVENT=connected");
+ set_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags);
}
EXPORT_SYMBOL_GPL(nvme_start_ctrl);