Bjorn Helgaas [Fri, 5 Nov 2021 16:28:48 +0000 (11:28 -0500)]
Merge branch 'pci/host/apple'
- Make of_phandle_args_to_fwspec() generally available (Marc Zyngier)
- Allow matching of interrupt-maps local to interrupt controller or PCI
device (Marc Zyngier)
- Add Apple SoC (e.g., M1) PCIe host controller driver, which enables
access to USB type-A, Ethernet, Wi-Fi, Bluetooth devices; these require
additional drivers of their own (Alyssa Rosenzweig)
- Add apple INTx, per-port, and MSI interrupt support (Marc Zyngier)
- Configure apple Requester-ID-to-Stream-ID mapper for IOMMU (DART) support
(Marc Zyngier)
* pci/host/apple:
PCI: apple: Configure RID to SID mapper on device addition
iommu/dart: Exclude MSI doorbell from PCIe device IOVA range
PCI: apple: Implement MSI support
PCI: apple: Add INTx and per-port interrupt support
PCI: apple: Set up reference clocks when probing
PCI: apple: Add initial hardware bring-up
PCI: of: Allow matching of an interrupt-map local to a PCI device
of/irq: Allow matching of an interrupt-map local to an interrupt controller
irqdomain: Make of_phandle_args_to_fwspec() generally available
- Fix reporting of Data Link Layer Link Active (Pali Rohár)
- Fix emulation of W1C bits (Marek Behún)
- Fix MSI domain .alloc() method to return zero on success (Marek Behún)
- Read entire 16-bit MSI vector in MSI handler, not just low 8 bits (Marek
Behún)
- Clear Root Port I/O Space, Memory Space, and Bus Master Enable bits at
startup; PCI core will set those as necessary (Pali Rohár)
- When operating as a Root Port, set class code to "PCI Bridge" instead of
the default "Mass Storage Controller" (Pali Rohár)
- Add emulation for PCI_BRIDGE_CTL_BUS_RESET since aardvark doesn't
implement this per spec (Pali Rohár)
- Add emulation of option ROM BAR since aardvark doesn't implement this per
spec (Pali Rohár)
* remotes/lorenzo/pci/aardvark:
PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge
PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
PCI: aardvark: Fix return value of MSI domain .alloc() method
PCI: pci-bridge-emul: Fix emulation of W1C bits
PCI: aardvark: Fix reporting Data Link Layer Link Active
PCI: aardvark: Fix checking for link up via LTSSM state
PCI: aardvark: Fix link training
PCI: aardvark: Simplify initialization of rootcap on virtual bridge
PCI: aardvark: Implement re-issuing config requests on CRS response
PCI: aardvark: Deduplicate code in advk_pcie_rd_conf()
PCI: aardvark: Do not unmask unused interrupts
PCI: aardvark: Do not clear status bits of masked interrupts
PCI: aardvark: Fix configuring Reference clock
PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated bridge
PCI: aardvark: Don't spam about PIO Response Status
PCI: aardvark: Fix PCIe Max Payload Size setting
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
- Remove unused pci_pool wrappers, which have been replaced by dma_pool
(Cai Huoqing)
- Remove a redundant initialization in __pci_reset_function_locked() (Colin
Ian King)
- Use 'unsigned int' instead of 'unsigned' (Krzysztof Wilczyński)
- Update PCI subsystem information in MAINTAINERS (Krzysztof Wilczyński)
- Include generic <linux/> headers instead of <asm/> for cpqphp and vmd
(Krzysztof Wilczyński)
* pci/misc:
PCI: vmd: Drop redundant includes of <asm/device.h>, <asm/msi.h>
PCI: cpqphp: Use <linux/io.h> instead of <asm/io.h>
MAINTAINERS: Update PCI subsystem information
PCI: Prefer 'unsigned int' over bare 'unsigned'
PCI: Remove redundant 'rc' initialization
PCI: Remove unused pci_pool wrappers
PCI: cpqphp: Format if-statement code block correctly
PCI: Use unsigned to match sscanf("%x") in pci_dev_str_match_path()
PCI: hv: Remove unnecessary use of %hx
PCI: Correct misspelled and remove duplicated words
PCI: Tidy comments
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:47 +0000 (11:28 -0500)]
Merge branch 'pci/vpd'
- Add pci_read_vpd_any(), pci_write_vpd_any() to access VPD at arbitrary
offsets (Heiner Kallweit)
- Use VPD API to replace custom code in cxgb3 driver (Heiner Kallweit)
* pci/vpd:
cxgb3: Remove seeprom_write and use VPD API
cxgb3: Use VPD API in t3_seeprom_wp()
cxgb3: Remove t3_seeprom_read and use VPD API
PCI/VPD: Use pci_read_vpd_any() in pci_vpd_size()
PCI/VPD: Add pci_read/write_vpd_any()
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:46 +0000 (11:28 -0500)]
Merge branch 'pci/sysfs'
- Check for CAP_SYS_ADMIN before validating sysfs user input, not after
(Krzysztof Wilczyński)
- Always return -EINVAL from sysfs "store" functions for invalid user input
instead of -EINVAL sometimes and -ERANGE others (Krzysztof Wilczyński)
- Use kstrtobool() directly instead of the strtobool() wrapper (Krzysztof
Wilczyński)
* pci/sysfs:
PCI: Use kstrtobool() directly, sans strtobool() wrapper
PCI/sysfs: Return -EINVAL consistently from "store" functions
PCI/sysfs: Check CAP_SYS_ADMIN before parsing user input
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:45 +0000 (11:28 -0500)]
Merge branch 'pci/switchtec'
- Return error to application when command execution fails because an
out-of-band reset has cleared the device BARs, Memory Space Enable, etc
(Kelvin Cao)
- Fix MRPC error status handling issue (Kelvin Cao)
- Mask out other bits when reading of management VEP instance ID (Kelvin
Cao)
- Return EOPNOTSUPP instead of ENOTSUPP from sysfs show functions (Kelvin
Cao)
- Add check of event support (Logan Gunthorpe)
* pci/switchtec:
PCI/switchtec: Add check of event support
PCI/switchtec: Replace ENOTSUPP with EOPNOTSUPP
PCI/switchtec: Update the way of getting management VEP instance ID
PCI/switchtec: Fix a MRPC error status handling issue
PCI/switchtec: Error out MRPC execution when MMIO reads fail
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:44 +0000 (11:28 -0500)]
Merge branch 'pci/msi'
- Document sysfs "irq" attribute, which contains either the INTx IRQ (the
intended behavior) or the first MSI IRQ (historical mistake retained for
backwards compatibility) (Barry Song)
- Rework "irq" sysfs show function to explicitly fetch first MSI IRQ
instead of depending on core to put it in dev->irq, to enable future core
cleanup (Barry Song)
* pci/msi:
PCI/sysfs: Explicitly show first MSI IRQ for 'irq'
PCI: Document /sys/bus/pci/devices/.../irq
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:43 +0000 (11:28 -0500)]
Merge branch 'pci/driver'
- Drop the struct pci_dev.driver pointer, which is redundant with the
struct device.driver pointer (Uwe Kleine-König)
* pci/driver:
PCI: Remove struct pci_dev->driver
PCI: Use to_pci_driver() instead of pci_dev->driver
x86/pci/probe_roms: Use to_pci_driver() instead of pci_dev->driver
perf/x86/intel/uncore: Use to_pci_driver() instead of pci_dev->driver
powerpc/eeh: Use to_pci_driver() instead of pci_dev->driver
usb: xhci: Use to_pci_driver() instead of pci_dev->driver
cxl: Use to_pci_driver() instead of pci_dev->driver
cxl: Factor out common dev->driver expressions
xen/pcifront: Use to_pci_driver() instead of pci_dev->driver
xen/pcifront: Drop pcifront_common_process() tests of pcidev, pdrv
nfp: use dev_driver_string() instead of pci_dev->driver->name
mlxsw: pci: Use dev_driver_string() instead of pci_dev->driver->name
net: marvell: prestera: use dev_driver_string() instead of pci_dev->driver->name
net: hns3: use dev_driver_string() instead of pci_dev->driver->name
crypto: hisilicon - use dev_driver_string() instead of pci_dev->driver->name
powerpc/eeh: Use dev_driver_string() instead of struct pci_dev->driver->name
ssb: Use dev_driver_string() instead of pci_dev->driver->name
bcma: simplify reference to driver name
crypto: qat - simplify adf_enable_aer()
scsi: message: fusion: Remove unused mpt_pci driver .probe() 'id' parameter
PCI/ERR: Factor out common dev->driver expressions
PCI: Drop pci_device_probe() test of !pci_dev->driver
PCI: Drop pci_device_remove() test of pci_dev->driver
PCI: Return NULL for to_pci_driver(NULL)
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:42 +0000 (11:28 -0500)]
Merge branch 'pci/aspm'
- Re-enable LTR in Downstream Ports after it has been disabled by reset or
hotplug to allow use of ASPM L1.2 again and prevent Unsupported Request
errors when Endpoint sends LTR messages (Mingchuang Qiao)
* pci/aspm:
PCI: Re-enable Downstream Port LTR after reset or hotplug
Bjorn Helgaas [Fri, 5 Nov 2021 16:28:42 +0000 (11:28 -0500)]
Merge branch 'pci/acpi'
- Simplify _OSC negotiation with platform for control of PCIe features
(Joerg Roedel)
* pci/acpi:
PCI/ACPI: Check for _OSC support in acpi_pci_osc_control_set()
PCI/ACPI: Move _OSC query checks to separate function
PCI/ACPI: Move supported and control calculations to separate functions
PCI/ACPI: Remove OSC_PCI_SUPPORT_MASKS and OSC_PCI_CONTROL_MASKS
The Pericom PI7C9X2G404/PI7C9X2G304/PI7C9X2G303 PCIe switches have an
erratum for ACS P2P Request Redirect behaviour when used in the cut-through
forwarding mode. The recommended work around for this issue is to use the
switch in store and forward mode. The erratum results in packets being
queued and not being delivered upstream, which can be observed as very poor
downstream device performance and/or dropped device-generated
data/interrupts.
Add a fixup so that when enabling or resuming the downstream port we check
if it has enabled ACS P2P Request Redirect, and if so, change the device
(via the upstream port) to use the store and forward operating mode.
Marc Zyngier [Wed, 29 Sep 2021 16:38:42 +0000 (17:38 +0100)]
PCI: apple: Configure RID to SID mapper on device addition
The Apple PCIe controller doesn't directly feed the endpoint's Requester ID
to the IOMMU (DART), but instead maps RIDs onto Stream IDs (SIDs). The DART
and the PCIe controller must thus agree on the SIDs that are used for
translation (by using the 'iommu-map' property).
For this purpose, parse the 'iommu-map' property each time a device gets
added, and use the resulting translation to configure the PCIe RID-to-SID
mapper. Similarly, remove the translation if/when the device gets removed.
This is all driven from a bus notifier which gets registered at probe time.
Hopefully this is the only PCI controller driver in the whole system.
Marc Zyngier [Wed, 29 Sep 2021 16:38:41 +0000 (17:38 +0100)]
iommu/dart: Exclude MSI doorbell from PCIe device IOVA range
The MSI doorbell on Apple HW can be any address in the low 4GB range.
However, the MSI write is matched by the PCIe block before hitting the
iommu. It must thus be excluded from the IOVA range that is assigned to any
PCIe device.
Marc Zyngier [Wed, 29 Sep 2021 16:38:39 +0000 (17:38 +0100)]
PCI: apple: Add INTx and per-port interrupt support
Add support for the per-port interrupt controller that deals with both INTx
signalling and management interrupts.
This allows the Link-up/Link-down interrupts to be wired, allowing the
bring-up to be synchronised (and provide debug information). The framework
can further be used to handle the rest of the per port events if and when
necessary.
Likewise, INTx signalling is implemented so that end-points can actually be
used.
Add a minimal driver to bring up the PCIe bus on Apple system-on-chips,
particularly the Apple M1. This driver exposes the internal bus used for
the USB type-A ports, Ethernet, Wi-Fi, and Bluetooth. Bringing up the
radios requires additional drivers beyond what's necessary for PCIe itself.
Marc Zyngier [Wed, 29 Sep 2021 16:38:36 +0000 (17:38 +0100)]
PCI: of: Allow matching of an interrupt-map local to a PCI device
Just as we now allow an interrupt map to be parsed when part of an
interrupt controller, there is no reason to ignore an interrupt map that
would be part of a pci device node such as a root port since we already
allow interrupt specifiers.
Allow the matching of such property when local to the node of a PCI
device, which allows the device itself to use the interrupt map for for
its own purpose.
Marc Zyngier [Wed, 29 Sep 2021 16:38:35 +0000 (17:38 +0100)]
of/irq: Allow matching of an interrupt-map local to an interrupt controller
of_irq_parse_raw() has a baked assumption that if a node has an
interrupt-controller property, it cannot possibly also have an
interrupt-map property (the latter being ignored).
This seems to be an odd behaviour, and there is no reason why we should
avoid supporting this use case. This is specially useful when a PCI root
port acts as an interrupt controller for PCI endpoints, such as this:
Handle it by detecting that we have an interrupt-map early in the parsing,
and special case the situation where the phandle in the interrupt map
refers to the current node (which is the interesting case here).
Host crashes when pci_enable_atomic_ops_to_root() is called for VFs with
virtual buses. The virtual buses added to SR-IOV have bus->self set to NULL
and host crashes due to this.
Pali Rohár [Thu, 28 Oct 2021 18:56:58 +0000 (20:56 +0200)]
PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
Aardvark supports PCIe Hot Reset via PCIE_CORE_CTRL1_REG.
Use it for implementing PCI_BRIDGE_CTL_BUS_RESET bit of PCI_BRIDGE_CONTROL
register on emulated bridge.
With this, the function pci_reset_secondary_bus() starts working and can
reset connected PCIe card. Custom userspace script [1] which uses setpci
can trigger PCIe Hot Reset and reset the card manually.
Pali Rohár [Thu, 28 Oct 2021 18:56:57 +0000 (20:56 +0200)]
PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
Aardvark controller has something like config space of a Root Port
available at offset 0x0 of internal registers - these registers are used
for implementation of the emulated bridge.
The default value of Class Code of this bridge corresponds to a RAID Mass
storage controller, though. (This is probably intended for when the
controller is used as Endpoint.)
Change the Class Code to correspond to a PCI Bridge.
Pali Rohár [Thu, 28 Oct 2021 18:56:56 +0000 (20:56 +0200)]
PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge
From very vague, ambiguous and incomplete information from Marvell we
deduced that the 32-bit Aardvark register at address 0x4
(PCIE_CORE_CMD_STATUS_REG), which is not documented for Root Complex mode
in the Functional Specification (only for Endpoint mode), controls two
16-bit PCIe registers: Command Register and Status Registers of PCIe Root
Port.
This means that bit 2 controls bus mastering and forwarding of memory and
I/O requests in the upstream direction. According to PCI specifications
bits [0:2] of Command Register, this should be by default disabled on
reset. So explicitly disable these bits at early setup of the Aardvark
driver.
Remove code which unconditionally enables all 3 bits and let kernel code
(via pci_set_master() function) to handle bus mastering of Root PCIe
Bridge via emulated PCI_COMMAND on emulated bridge.
Marek Behún [Thu, 28 Oct 2021 18:56:55 +0000 (20:56 +0200)]
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
The PCIE_MSI_PAYLOAD_REG contains 16-bit MSI number, not only lower
8 bits. Fix reading content of this register and add a comment
describing the access to this register.
Marek Behún [Thu, 28 Oct 2021 18:56:54 +0000 (20:56 +0200)]
PCI: aardvark: Fix return value of MSI domain .alloc() method
MSI domain callback .alloc() (implemented by advk_msi_irq_domain_alloc()
function) should return zero on success, since non-zero value indicates
failure.
When the driver was converted to generic MSI API in commit f21a8b1b6837
("PCI: aardvark: Move to MSI handling using generic MSI support"), it
was converted so that it returns hwirq number.
Marek Behún [Thu, 28 Oct 2021 18:56:53 +0000 (20:56 +0200)]
PCI: pci-bridge-emul: Fix emulation of W1C bits
The pci_bridge_emul_conf_write() function correctly clears W1C bits in
cfgspace cache, but it does not inform the underlying implementation
about the clear request: the .write_op() method is given the value with
these bits cleared.
This is wrong if the .write_op() needs to know which bits were requested
to be cleared.
Fix the value to be passed into the .write_op() method to have requested
W1C bits set, so that it can clear them.
Both pci-bridge-emul users (mvebu and aardvark) are compatible with this
change.
Update the following information related to the PCI subsystem which
includes the PCI drivers, PCI native host bridge and endpoint drivers,
and the PCI endpoint sub-system:
- Sort fields as per preferred order
- Sort files in the alphabetical order
- Update old Patchwork URLs
- Update Git repository for the PCI endpoint subsystem
- Add Bugzilla link
- Add link to the official IRC channel
- Add files "drivers/pci/pci-bridge-emul.{c,h}" to the right
section so that proper ownership is returned for both files
from the get_maintainer.pl script
Use standard VPD API to replace t3_seeprom_write(), this prepares for
removing this function. Chelsio T3 maps the EEPROM write protect flag
to an arbitrary place in VPD address space, therefore we have to use
pci_write_vpd_any().
Using the VPD API allows to simplify the code and completely get rid
of t3_seeprom_read(). Note that we don't have to use pci_read_vpd_any()
here because a VPD quirk sets dev->vpd.len to the full EEPROM size.
Mingchuang Qiao [Tue, 12 Oct 2021 07:56:14 +0000 (15:56 +0800)]
PCI: Re-enable Downstream Port LTR after reset or hotplug
Per PCIe r5.0, sec 7.5.3.16, Downstream Ports must disable LTR if the link
goes down (the Port goes DL_Down status). This is a problem because the
Downstream Port's dev->ltr_path is still set, so we think LTR is still
enabled, and we enable LTR in the Endpoint. When it sends LTR messages,
they cause Unsupported Request errors at the Downstream Port.
This happens in the reset path, where we may enable LTR in
pci_restore_pcie_state() even though the Downstream Port disabled LTR
because the reset caused a link down event.
It also happens in the hot-remove and hot-add path, where we may enable LTR
in pci_configure_ltr() even though the Downstream Port disabled LTR when
the hot-remove took the link down.
In these two scenarios, check the upstream bridge and restore its LTR
enable if appropriate.
The Unsupported Request may be logged by AER as follows:
In addition, if LTR is not configured correctly, the link cannot enter the
L1.2 state, which prevents some machines from entering the S0ix low power
state.
Barry Song [Wed, 25 Aug 2021 10:26:35 +0000 (18:26 +0800)]
PCI/sysfs: Explicitly show first MSI IRQ for 'irq'
The sysfs "irq" file contains the legacy INTx IRQ. Or, if the device has
MSI enabled, it contains the first MSI IRQ instead.
Previously this file showed the pci_dev.irq value directly. But we'd
prefer to use pci_dev.irq only for the INTx IRQ and decouple that from any
MSI or MSI-X IRQs.
If the device has MSI enabled, explicitly look up and show the first MSI
IRQ in the sysfs "irq" file. Otherwise, show the INTx IRQ.
This removes the requirement that msi_capability_init() set pci_dev.irq to
the first MSI IRQ when enabling MSI and pci_msi_shutdown() restore the INTx
IRQ when disabling MSI.
Barry Song [Wed, 25 Aug 2021 10:26:34 +0000 (18:26 +0800)]
PCI: Document /sys/bus/pci/devices/.../irq
Document /sys/bus/pci/devices/.../irq.
This file contains the IRQ of the INTx interrupt (or zero if the device
doesn't support INTx interrupts).
If the device has enabled MSI (not MSI-X), it contains the first MSI IRQ
instead. This is a historical mistake because devices may support several
MSI or MSI-X vectors, and this file can't contain them all. But we
preserve this behavior to avoid breaking userspace.
Uwe Kleine-König [Tue, 12 Oct 2021 21:11:24 +0000 (16:11 -0500)]
PCI: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pci_dev->driver with to_pci_driver(). This is a step toward
removing pci_dev->driver.
Uwe Kleine-König [Tue, 12 Oct 2021 21:05:47 +0000 (16:05 -0500)]
x86/pci/probe_roms: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pdev->driver with to_pci_driver(). This is a step toward removing
pci_dev->driver.
Uwe Kleine-König [Tue, 12 Oct 2021 21:38:32 +0000 (16:38 -0500)]
perf/x86/intel/uncore: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pdev->driver with to_pci_driver(). This is a step toward removing
pci_dev->driver.
Uwe Kleine-König [Tue, 12 Oct 2021 21:37:55 +0000 (16:37 -0500)]
powerpc/eeh: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pdev->driver with to_pci_driver(). This is a step toward removing
pci_dev->driver.
Uwe Kleine-König [Tue, 12 Oct 2021 21:36:05 +0000 (16:36 -0500)]
usb: xhci: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pdev->driver with to_pci_driver(). This is a step toward removing
pci_dev->driver.
Uwe Kleine-König [Tue, 12 Oct 2021 21:03:39 +0000 (16:03 -0500)]
cxl: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pdev->driver with to_pci_driver(). This is a step toward removing
pci_dev->driver.
Bjorn Helgaas [Tue, 12 Oct 2021 20:51:32 +0000 (15:51 -0500)]
cxl: Factor out common dev->driver expressions
Save the struct pci_driver and struct pci_error_handlers pointers from
pdev->driver instead of chasing the pointers several times. No functional
change intended.
The sole non-static function in err.c, pcie_do_recovery(), is only
called from:
* aer.c (if CONFIG_PCIEAER=y)
* dpc.c (if CONFIG_PCIE_DPC=y, which depends on CONFIG_PCIEAER)
* edr.c (if CONFIG_PCIE_EDR=y, which depends on CONFIG_PCIE_DPC)
Thus, err.c need not be compiled if CONFIG_PCIEAER=n.
Also, pci_uevent_ers() and pcie_clear_device_status(), which are called
from err.c, can be #ifdef'ed away unless CONFIG_PCIEAER=y.
Since x86_64_defconfig doesn't enable CONFIG_PCIEAER, this change may
slightly reduce compile time for anyone doing a test build with that
config.
Commit c6c889d932bb ("PCI/portdrv: Remove pcie_port_bus_type link order
dependency") removed pcie_port_bus_{,un}register() but erroneously
retained their declarations in portdrv.h. Remove them as well.
Commit 3e41a317ae45 ("PCI/AER: Remove unused aer_error_resume()")
removed the resume err_handler from AER. Since no other port service
implements the callback, support for it can be removed from portdrv.
It can be revived later if need be, preferably by re-using the
pcie_port_device_iter() iterator.
PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset
Stuart Hayes reports that an error handled by DPC at a Root Port results
in pciehp gratuitously bringing down a subordinate hotplug port:
RP -- UP -- DP -- UP -- DP (hotplug) -- EP
pciehp brings the slot down because the Link to the Endpoint goes down.
That is caused by a Hot Reset being propagated as a result of DPC.
Per PCIe Base Spec 5.0, section 6.6.1 "Conventional Reset":
For a Switch, the following must cause a hot reset to be sent on all
Downstream Ports: [...]
* The Data Link Layer of the Upstream Port reporting DL_Down status.
In Switches that support Link speeds greater than 5.0 GT/s, the
Upstream Port must direct the LTSSM of each Downstream Port to the
Hot Reset state, but not hold the LTSSMs in that state. This permits
each Downstream Port to begin Link training immediately after its
hot reset completes. This behavior is recommended for all Switches.
* Receiving a hot reset on the Upstream Port.
Once DPC recovers, pcie_do_recovery() walks down the hierarchy and
invokes pcie_portdrv_slot_reset() to restore each port's config space.
At that point, a hotplug interrupt is signaled per PCIe Base Spec r5.0,
section 6.7.3.4 "Software Notification of Hot-Plug Events":
If the Port is enabled for edge-triggered interrupt signaling using
MSI or MSI-X, an interrupt message must be sent every time the logical
AND of the following conditions transitions from FALSE to TRUE: [...]
* The Hot-Plug Interrupt Enable bit in the Slot Control register is
set to 1b.
* At least one hot-plug event status bit in the Slot Status register
and its associated enable bit in the Slot Control register are both
set to 1b.
Prevent pciehp from gratuitously bringing down the slot by clearing the
error-induced Data Link Layer State Changed event before restoring
config space. Afterwards, check whether the link has unexpectedly
failed to retrain and synthesize a DLLSC event if so.
Allow each pcie_port_service_driver (one of them being pciehp) to define
a slot_reset callback and re-use the existing pm_iter() function to
iterate over the callbacks.
Thereby, the Endpoint driver remains bound throughout error recovery and
may restore the device to working state.
Surprise removal during error recovery is detected through a Presence
Detect Changed event. The hotplug port is expected to not signal that
event as a result of a Hot Reset.
The issue isn't DPC-specific, it also occurs when an error is handled by
AER through aer_root_reset(). So while the issue was noticed only now,
it's been around since 2006 when AER support was first introduced.
Uwe Kleine-König [Tue, 12 Oct 2021 21:35:34 +0000 (16:35 -0500)]
xen/pcifront: Use to_pci_driver() instead of pci_dev->driver
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pdev->driver with to_pci_driver(). This is a step toward removing
pci_dev->driver.
Logan Gunthorpe [Thu, 14 Oct 2021 14:18:59 +0000 (14:18 +0000)]
PCI/switchtec: Add check of event support
Not all events are supported by every gen/variant of the Switchtec
firmware. To solve this, since Gen4, a new bit in each event header
is introduced to indicate if an event is supported by the firmware.
Kelvin Cao [Thu, 14 Oct 2021 14:18:57 +0000 (14:18 +0000)]
PCI/switchtec: Update the way of getting management VEP instance ID
Gen4 firmware adds DMA VEP and NVMe VEP support in VEP (virtual EP)
instance ID register in addtion to management EP, update the way of
getting management VEP instance ID.
Kelvin Cao [Thu, 14 Oct 2021 14:18:56 +0000 (14:18 +0000)]
PCI/switchtec: Fix a MRPC error status handling issue
If an error is encountered when executing a MRPC command, the firmware
will set the status register to SWITCHTEC_MRPC_STATUS_ERROR and return
the error code in the return value register.
Add handling of SWITCHTEC_MRPC_STATUS_ERROR on status register when
completing a MRPC command.
Kelvin Cao [Thu, 14 Oct 2021 14:18:55 +0000 (14:18 +0000)]
PCI/switchtec: Error out MRPC execution when MMIO reads fail
A firmware hard reset may be initiated by various mechanisms including a
UART interface, TWI sideband interface from BMC, MRPC command from
userspace, etc. The switchtec management driver is unaware of these
resets.
The reset clears PCI state including the BARs and Memory Space Enable
bits, so the device no longer responds to the MMIO accesses the driver
uses to operate it.
MMIO reads to the device will fail with a PCIe error. When the root
complex handles that error, it typically fabricates ~0 data to complete
the CPU read.
Check for this sort of error by reading the device ID from MMIO space.
This ID can never be ~0, so if we see that value, it probably means the
PCIe Memory Read failed and we should return an error indication to the
application using the switchtec driver.
ssb: Use dev_driver_string() instead of pci_dev->driver->name
All drivers that use ssb_pcihost_probe(), i.e., b43_pci_bridge_driver and
b44_pci_driver, set the pci_driver.name, and __pci_register_driver() sets
the struct driver.name member to the same value.
Replace dev->driver_name() by dev_driver_string() for the corresponding
struct device. This is a step toward removing pci_dev->driver.
When bcma_host_pci_probe() is called, the PCI driver core has already
assigned the device's driver in local_pci_probe(). So dev->driver is always
true, and here it points to bcma_pci_bridge_driver, which has .name set to
"bcma-pci-bridge". Simplify accordingly.
A struct pci_driver is shared across all device instances, so assigning
pci_driver.err_handler once per device isn't really sensible.
Set adf_driver.err_handler statically instead of in adf_enable_aer().
This removes a use of pci_dev->driver, which is a step toward removing
pci_dev->driver altogether.
Since adf_enable_aer() returns zero unconditionally, make it a void
function.
"dom_req" is a u16 but varargs automatically promotes it to int, so there's
no point in using the %h modifier. Drop it.
See cbacb5ab0aa0 ("docs: printk-formats: Stop encouraging use of
unnecessary %h[xudi] and %hh[xudi]") and 70eb2275ff8e ("checkpatch: add
warning for unnecessary use of %h[xudi] and %hh[xudi]").
Pali Rohár [Tue, 5 Oct 2021 18:09:52 +0000 (20:09 +0200)]
PCI: aardvark: Fix reporting Data Link Layer Link Active
Add support for reporting PCI_EXP_LNKSTA_DLLLA bit in Link Control register
on emulated bridge via current LTSSM state. Also correctly indicate DLLLA
capability via PCI_EXP_LNKCAP_DLLLARC bit in Link Control Capability
register.
Pali Rohár [Tue, 5 Oct 2021 18:09:50 +0000 (20:09 +0200)]
PCI: aardvark: Fix link training
Fix multiple link training issues in aardvark driver. The main reason of
these issues was misunderstanding of what certain registers do, since their
names and comments were misleading: before commit 96be36dbffac ("PCI:
aardvark: Replace custom macros by standard linux/pci_regs.h macros"), the
pci-aardvark.c driver used custom macros for accessing standard PCIe Root
Bridge registers, and misleading comments did not help to understand what
the code was really doing.
After doing more tests and experiments I've come to the conclusion that the
SPEED_GEN register in aardvark sets the PCIe revision / generation
compliance and forces maximal link speed. Both GEN3 and GEN2 values set the
read-only PCI_EXP_FLAGS_VERS bits (PCIe capabilities version of Root
Bridge) to value 2, while GEN1 value sets PCI_EXP_FLAGS_VERS to 1, which
matches with PCI Express specifications revisions 3, 2 and 1 respectively.
Changing SPEED_GEN also sets the read-only bits PCI_EXP_LNKCAP_SLS and
PCI_EXP_LNKCAP2_SLS to corresponding speed.
(Note that PCI Express rev 1 specification does not define PCI_EXP_LNKCAP2
and PCI_EXP_LNKCTL2 registers and when SPEED_GEN is set to GEN1 (which
also sets PCI_EXP_FLAGS_VERS set to 1), lspci cannot access
PCI_EXP_LNKCAP2 and PCI_EXP_LNKCTL2 registers.)
Changing PCIe link speed can be done via PCI_EXP_LNKCTL2_TLS bits of
PCI_EXP_LNKCTL2 register. Armada 3700 Functional Specifications says that
the default value of PCI_EXP_LNKCTL2_TLS is based on SPEED_GEN value, but
tests showed that the default value is always 8.0 GT/s, independently of
speed set by SPEED_GEN. So after setting SPEED_GEN, we must also set value
in PCI_EXP_LNKCTL2 register via PCI_EXP_LNKCTL2_TLS bits.
Triggering PCI_EXP_LNKCTL_RL bit immediately after setting LINK_TRAINING_EN
bit actually doesn't do anything. Tests have shown that a delay is needed
after enabling LINK_TRAINING_EN bit. As triggering PCI_EXP_LNKCTL_RL
currently does nothing, remove it.
Commit 43fc679ced18 ("PCI: aardvark: Improve link training") introduced
code which sets SPEED_GEN register based on negotiated link speed from
PCI_EXP_LNKSTA_CLS bits of PCI_EXP_LNKSTA register. This code was added to
fix detection of Compex WLE900VX (Atheros QCA9880) WiFi GEN1 PCIe cards, as
otherwise these cards were "invisible" on PCIe bus (probably because they
crashed). But apparently more people reported the same issues with these
cards also with other PCIe controllers [1] and I was able to reproduce this
issue also with other "noname" WiFi cards based on Atheros QCA9890 chip
(with the same PCI vendor/device ids as Atheros QCA9880). So this is not an
issue in aardvark but rather an issue in Atheros QCA98xx chips. Also, this
issue only exists if the kernel is compiled with PCIe ASPM support, and a
generic workaround for this is to change PCIe Bridge to 2.5 GT/s link speed
via PCI_EXP_LNKCTL2_TLS_2_5GT bits in PCI_EXP_LNKCTL2 register [2], before
triggering PCI_EXP_LNKCTL_RL bit. This workaround also works when SPEED_GEN
is set to value GEN2 (5 GT/s). So remove this hack completely in the
aardvark driver and always set SPEED_GEN to value from 'max-link-speed' DT
property. Fix for Atheros QCA98xx chips is handled separately by patch [2].
These two things (code for triggering PCI_EXP_LNKCTL_RL bit and changing
SPEED_GEN value) also explain why commit 6964494582f5 ("PCI: aardvark:
Train link immediately after enabling training") somehow fixed detection of
those problematic Compex cards with Atheros chips: if triggering link
retraining (via PCI_EXP_LNKCTL_RL bit) was done immediately after enabling
link training (via LINK_TRAINING_EN), it did nothing. If there was a
specific delay, aardvark HW already initialized PCIe link and therefore
triggering link retraining caused the above issue. Compex cards triggered
link down event and disappeared from the PCIe bus.
Commit f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready before
training link") added 100ms sleep before calling 'Start link training'
command and explained that it is a requirement of PCI Express
specification. But the code after this 100ms sleep was not doing 'Start
link training', rather it triggered PCI_EXP_LNKCTL_RL bit via PCIe Root
Bridge to put link into Recovery state.
The required delay after fundamental reset is already done in function
advk_pcie_wait_for_link() which also checks whether PCIe link is up.
So after removing the code which triggers PCI_EXP_LNKCTL_RL bit on PCIe
Root Bridge, there is no need to wait 100ms again. Remove the extra
msleep() call and update comment about the delay required by the PCI
Express specification.
According to Marvell Armada 3700 Functional Specifications, Link training
should be enabled via aardvark register LINK_TRAINING_EN after selecting
PCIe generation and x1 lane. There is no need to disable it prior resetting
card via PERST# signal. This disabling code was introduced in commit 5169a9851daa ("PCI: aardvark: Issue PERST via GPIO") as a workaround for
some Atheros cards. It turns out that this also is Atheros specific issue
and affects any PCIe controller, not only aardvark. Moreover this Atheros
issue was triggered by juggling with PCI_EXP_LNKCTL_RL, LINK_TRAINING_EN
and SPEED_GEN bits interleaved with sleeps. Now, after removing triggering
PCI_EXP_LNKCTL_RL, there is no need to explicitly disable LINK_TRAINING_EN
bit. So remove this code too. The problematic Compex cards described in
previous git commits are correctly detected in advk_pcie_train_link()
function even after applying all these changes.
Note that with this patch, and also prior this patch, some NVMe disks which
support PCIe GEN3 with 8 GT/s speed are negotiated only at the lowest link
speed 2.5 GT/s, independently of SPEED_GEN value. After manually triggering
PCI_EXP_LNKCTL_RL bit (e.g. from userspace via setpci), these NVMe disks
change link speed to 5 GT/s when SPEED_GEN was configured to GEN2. This
issue first needs to be properly investigated. I will send a fix in the
future.
On the other hand, some other GEN2 PCIe cards with 5 GT/s speed are
autonomously by HW autonegotiated at full 5 GT/s speed without need of any
software interaction.
Armada 3700 Functional Specifications describes the following steps for
link training: set SPEED_GEN to GEN2, enable LINK_TRAINING_EN, poll until
link training is complete, trigger PCI_EXP_LNKCTL_RL, poll until signal
rate is 5 GT/s, poll until link training is complete, enable ASPM L0s.
The requirement for triggering PCI_EXP_LNKCTL_RL can be explained by the
need to achieve 5 GT/s speed (as changing link speed is done by throw to
recovery state entered by PCI_EXP_LNKCTL_RL) or maybe as a part of enabling
ASPM L0s (but in this case ASPM L0s should have been enabled prior
PCI_EXP_LNKCTL_RL).
It is unknown why the original pci-aardvark.c driver was triggering
PCI_EXP_LNKCTL_RL bit before waiting for the link to be up. This does not
align with neither PCIe base specifications nor with Armada 3700 Functional
Specification. (Note that in older versions of aardvark, this bit was
called incorrectly PCIE_CORE_LINK_TRAINING, so this may be the reason.)
It is also unknown why Armada 3700 Functional Specification says that it is
needed to trigger PCI_EXP_LNKCTL_RL for GEN2 mode, as according to PCIe
base specification 5 GT/s speed negotiation is supposed to be entirely
autonomous, even if initial speed is 2.5 GT/s.
Pali Rohár [Tue, 5 Oct 2021 18:09:48 +0000 (20:09 +0200)]
PCI: aardvark: Implement re-issuing config requests on CRS response
Commit 43f5c77bcbd2 ("PCI: aardvark: Fix reporting CRS value") fixed
handling of CRS response and when CRSSVE flag was not enabled it marked CRS
response as failed transaction (due to simplicity).
But pci-aardvark.c driver is already waiting up to the PIO_RETRY_CNT count
for PIO config response and so we can with a small change implement
re-issuing of config requests as described in PCIe base specification.
This change implements re-issuing of config requests when response is CRS.
Set upper bound of wait cycles to around PIO_RETRY_CNT, afterwards the
transaction is marked as failed and an all-ones value is returned as
before.
We do this by returning appropriate error codes from function
advk_pcie_check_pio_status(). On CRS we return -EAGAIN and caller then
reissues transaction.
Pali Rohár [Tue, 5 Oct 2021 18:09:46 +0000 (20:09 +0200)]
PCI: aardvark: Do not unmask unused interrupts
There are lot of undocumented interrupt bits. To prevent unwanted
spurious interrupts, fix all *_ALL_MASK macros to define all interrupt
bits, so that driver can properly mask all interrupts, including those
which are undocumented.
Pali Rohár [Tue, 5 Oct 2021 18:09:45 +0000 (20:09 +0200)]
PCI: aardvark: Do not clear status bits of masked interrupts
The PCIE_ISR1_REG says which interrupts are currently set / active,
including those which are masked.
The driver currently reads this register and looks if some unmasked
interrupts are active, and if not, it clears status bits of _all_
interrupts, including the masked ones.
This is incorrect, since, for example, some drivers may poll these bits.
Remove this clearing, and also remove this early return statement
completely, since it does not change functionality in any way.
Pali Rohár [Tue, 5 Oct 2021 18:09:44 +0000 (20:09 +0200)]
PCI: aardvark: Fix configuring Reference clock
Commit 366697018c9a ("PCI: aardvark: Add PHY support") introduced
configuration of PCIe Reference clock via PCIE_CORE_REF_CLK_REG register,
but did it incorrectly.
PCIe Reference clock differential pair is routed from system board to
endpoint card, so on CPU side it has output direction. Therefore it is
required to enable transmitting and disable receiving.
Default configuration according to Armada 3700 Functional Specifications is
enabled receiver part and disabled transmitter.
We need this change because otherwise PCIe Reference clock is configured to
some undefined state when differential pair is used for both transmitting
and receiving.
Pali Rohár [Tue, 5 Oct 2021 18:09:43 +0000 (20:09 +0200)]
PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated bridge
Commit 43f5c77bcbd2 ("PCI: aardvark: Fix reporting CRS value") started
using CRSSVE flag for handling CRS responses.
PCI_EXP_RTCTL_CRSSVE flag is stored only in emulated config space buffer
and there is handler for PCI_EXP_RTCTL register. So every read operation
from config space automatically clears CRSSVE flag as it is not defined in
PCI_EXP_RTCTL read handler.
Fix this by reading current CRSSVE bit flag from emulated space buffer and
appending it to PCI_EXP_RTCTL read response.