Bjorn Helgaas [Wed, 21 Oct 2020 14:58:40 +0000 (09:58 -0500)]
Merge branch 'remotes/lorenzo/pci/iproc'
- Set affinity mask on MSI interrupts (Mark Tomlinson)
- Simplify by using module_bcma_driver (Liu Shixin)
- Fix 'using integer as NULL pointer' warning (Krzysztof Wilczyński)
* remotes/lorenzo/pci/iproc:
PCI: iproc: Fix using plain integer as NULL pointer in iproc_pcie_pltfm_probe
PCI: iproc: Use module_bcma_driver to simplify the code
PCI: iproc: Set affinity mask on MSI interrupts
Bjorn Helgaas [Wed, 21 Oct 2020 14:58:40 +0000 (09:58 -0500)]
Merge branch 'remotes/lorenzo/pci/imx6'
- Use "fallthrough" pseudo-keyword (Gustavo A. R. Silva)
- Drop redundant error messages after devm_clk_get() (Anson Huang)
* remotes/lorenzo/pci/imx6:
PCI: imx6: Do not output error message when devm_clk_get() failed with -EPROBE_DEFER
PCI: imx6: Use fallthrough pseudo-keyword
- Add common iATU register support (Kunihiko Hayashi)
- Remove keystone iATU register mapping in favor of generic dwc support
(Kunihiko Hayashi)
- Skip PCIE_MSI_INTR0* programming if MSI is disabled (Jisheng Zhang)
- Fix MSI page leakage in suspend/resume (Jisheng Zhang)
- Check whether link is up before attempting config access (best-effort fix
even though it's racy) (Hou Zhiqiang)
* remotes/lorenzo/pci/dwc:
PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()
PCI: dwc: Fix MSI page leakage in suspend/resume
PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabled
PCI: keystone: Remove iATU register mapping
PCI: dwc: Add common iATU register support
dt-bindings: PCI: uniphier-ep: Add iATU register description
dt-bindings: PCI: uniphier: Add iATU register description
PCI: dwc: Fix 'cast truncates bits from constant value'
misc: pci_endpoint_test: Add driver data for Layerscape PCIe controllers
misc: pci_endpoint_test: Add LS1088a in pci_device_id table
PCI: layerscape: Add EP mode support for ls1088a and ls2088a
PCI: layerscape: Modify the MSIX to the doorbell mode
PCI: layerscape: Modify the way of getting capability with different PEX
PCI: layerscape: Fix some format issue of the code
dt-bindings: pci: layerscape-pci: Add compatible strings for ls1088a and ls2088a
PCI: designware-ep: Modify MSI and MSIX CAP way of finding
PCI: designware-ep: Move the function of getting MSI capability forward
PCI: designware-ep: Add the doorbell mode of MSI-X in EP mode
PCI: designware-ep: Add multiple PFs support for DWC
PCI: dwc: Use DBI accessors
PCI: dwc: Move N_FTS setup to common setup
PCI: dwc/intel-gw: Drop unused max_width
PCI: dwc/intel-gw: Move getting PCI_CAP_ID_EXP offset to intel_pcie_link_setup()
PCI: dwc/intel-gw: Drop unnecessary checking of DT 'device_type' property
PCI: dwc: Set PORT_LINK_DLL_LINK_EN in common setup code
PCI: dwc: Centralize link gen setting
PCI: dwc: Make ATU accessors private
PCI: dwc: Remove read_dbi2 code
PCI: dwc/tegra: Use common Designware port logic register definitions
PCI: dwc: Remove hardcoded PCI_CAP_ID_EXP offset
PCI: dwc/qcom: Use common PCI register definitions
PCI: dwc/imx6: Use common PCI register definitions
PCI: dwc/meson: Rework PCI config and DW port logic register accesses
PCI: dwc/meson: Drop unnecessary RC config space initialization
PCI: dwc/meson: Drop the duplicate number of lanes setup
PCI: dwc: Ensure FAST_LINK_MODE is cleared
PCI: dwc: Add a 'num_lanes' field to struct dw_pcie
PCI: dwc/imx6: Remove duplicate define PCIE_LINK_WIDTH_SPEED_CONTROL
PCI: dwc: Check CONFIG_PCI_MSI inside dw_pcie_msi_init()
PCI: dwc/keystone: Drop duplicated 'num-viewport'
PCI: dwc: Simplify config space handling
PCI: dwc: Remove storing of PCI resources
PCI: dwc: Remove root_bus pointer
PCI: dwc: Convert to use pci_host_probe()
PCI: dwc: keystone: Convert .scan_bus() callback to use add_bus
PCI: Also call .add_bus() callback for root bus
PCI: dwc: Use generic config accessors
PCI: dwc: Remove dwc specific config accessor ops
PCI: dwc: histb: Use pci_ops for root config space accessors
PCI: dwc: exynos: Use pci_ops for root config space accessors
PCI: dwc: kirin: Use pci_ops for root config space accessors
PCI: dwc: meson: Use pci_ops for root config space accessors
PCI: dwc: tegra: Use pci_ops for root config space accessors
PCI: dwc: keystone: Use pci_ops for config space accessors
PCI: dwc: al: Use pci_ops for child config space accessors
PCI: dwc: Add a default pci_ops.map_bus for root port
PCI: dwc: Allow overriding bridge pci_ops
PCI: dwc: Use DBI accessors instead of own config accessors
PCI: Allow root and child buses to have different pci_ops
PCI: designware-ep: Fix the Header Type check
Bjorn Helgaas [Wed, 21 Oct 2020 14:58:38 +0000 (09:58 -0500)]
Merge branch 'remotes/lorenzo/pci/brcmstb'
- Make PCIE_BRCMSTB depend on and default to ARCH_BRCMSTB (Jim Quinlan)
- Add DT bindings for 7278, 7216, 7211, and new properties (Jim Quinlan)
- Add bcm7278 register info (Jim Quinlan)
- Add suspend and resume pm_ops (Jim Quinlan)
- Add bcm7278 PERST# support (Jim Quinlan)
- Add control of RESCAL reset (Jim Quinlan)
- Set additional internal memory DMA viewport sizes (Jim Quinlan)
- Accommodate MSI for older chips (Jim Quinlan)
- Set bus max burst size by chip type (Jim Quinlan)
- Add bcm7211, bcm7216, bcm7445, bcm7278 to match list (Jim Quinlan)
* remotes/lorenzo/pci/brcmstb:
PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list
PCI: brcmstb: Set bus max burst size by chip type
PCI: brcmstb: Accommodate MSI for older chips
PCI: brcmstb: Set additional internal memory DMA viewport sizes
PCI: brcmstb: Add control of rescal reset
PCI: brcmstb: Add bcm7278 PERST# support
PCI: brcmstb: Add suspend and resume pm_ops
PCI: brcmstb: Add bcm7278 register info
dt-bindings: PCI: Add bindings for more Brcmstb chips
PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
- Fix initialization with old Marvell's Arm Trusted Firmware (Pali Rohár)
* remotes/lorenzo/pci/aardvark:
PCI: aardvark: Fix initialization with old Marvell's Arm Trusted Firmware
phy: marvell: comphy: Convert internal SMCC firmware return codes to errno
PCI: aardvark: Move PCIe reset card code to advk_pcie_train_link()
PCI: aardvark: Implement driver 'remove' function and allow to build it as module
PCI: pci-bridge-emul: Export API functions
PCI: aardvark: Check for errors from pci_bridge_emul_init() call
PCI: aardvark: Fix compilation on s390
Bjorn Helgaas [Wed, 21 Oct 2020 14:58:36 +0000 (09:58 -0500)]
Merge branch 'remotes/lorenzo/pci/apei'
- Add ACPI APEI notifier chain for unknown (vendor) CPER records (Shiju
Jose)
- Add handling of HiSilicon HIP PCIe controller errors (Yicong Yang)
* remotes/lorenzo/pci/apei:
PCI: hip: Add handling of HiSilicon HIP PCIe controller errors
ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records
- Drop double zeroing for P2PDMA sg_init_table() (Julia Lawall)
* pci/misc:
PCI: v3-semi: Remove unneeded break
PCI/P2PDMA: Drop double zeroing for sg_init_table()
PCI: Simplify bool comparisons
PCI: endpoint: Use "NULL" instead of "0" as a NULL pointer
PCI: Simplify pci_dev_reset_slot_function()
PCI: Update mmap-related #ifdef comments
PCI/LINK: Print IRQ number used by port
PCI/IOV: Simplify pci-pf-stub with module_pci_driver()
PCI: Use scnprintf(), not snprintf(), in sysfs "show" functions
x86/PCI: Fix intel_mid_pci.c build error when ACPI is not enabled
PCI: Remove unnecessary header includes
Bjorn Helgaas [Wed, 21 Oct 2020 14:58:34 +0000 (09:58 -0500)]
Merge branch 'pci/enumeration'
- Tone down message about missing optional MCFG (Jeremy Linton)
- Add schedule point in pci_read_config() (Jiang Biao)
- Add Ampere Altra SOC MCFG quirk (Tuan Phan)
- Add Kconfig options for MPS/MRRS strategy (Jim Quinlan)
* pci/enumeration:
PCI: Add Kconfig options for MPS/MRRS strategy
PCI/ACPI: Add Ampere Altra SOC MCFG quirk
PCI: Add schedule point in pci_read_config()
PCI/ACPI: Tone down missing MCFG message
PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()
NXP Layerscape (ls1028a, ls2088a), dra7xxx and imx6 platforms are either
programmed or statically configured to forward the error triggered by a
link-down state (eg no connected endpoint device) on the system bus for
PCI configuration transactions; these errors are reported as an SError
at system level, which is fatal.
Enumerating a PCI tree when the PCIe link is down is not sensible
either, so even if the link-up check is racy (link can go down after
map_bus() is called) add a link-up check in map_bus() to prevent issuing
configuration transactions when the link is down.
Previously we computed L1.2 parameters in the enumeration path, saved them
in struct pcie_link_state.l1ss, and programmed them into the devices
whenever we enabled or disabled L1.2 on the link. But these parameters are
constant and don't need to be updated when enabling/disabling L1.2.
Compute and program the L1.2 parameters once during enumeration and remove
the struct pcie_link_state.l1ss member. No functional change intended.
Previously we stored the L1SS Capabilities value in the struct
aspm_register_info.
We only need this information in one place, so read it there and remove
struct aspm_register_info completely, since it's now empty. No functional
change intended.
Bjorn Helgaas [Thu, 15 Oct 2020 19:30:37 +0000 (14:30 -0500)]
PCI/ASPM: Pass L1SS Capabilities value, not struct aspm_register_info
aspm_calc_l1ss_info() needs only the L1SS Capabilities. It doesn't need
anything else from struct aspm_register_info, so pass only the Capabilities
value. No functional change intended.
Save the L1 Substates Capability pointer in struct pci_dev. Then we don't
have to keep track of it in the struct aspm_register_info and struct
pcie_link_state, which makes the code easier to read. No functional change
intended.
Previously we stored L0s and L1 Exit Latency information from the Link
Capabilities register in the struct aspm_register_info.
We only need these latencies when we already have the Link Capabilities
values, so use those directly and remove the latencies from struct
aspm_register_info. No functional change intended.
Previously we stored the "ASPM Control" bits from the Link Control register
in the struct aspm_register_info.
Read PCI_EXP_LNKCTL directly when needed. This means we can use the
PCI_EXP_LNKCTL_ASPM_* bits directly instead of the similar but different
PCIE_LINK_STATE_* bits. No functional change intended.
Bjorn Helgaas [Thu, 15 Oct 2020 19:30:30 +0000 (14:30 -0500)]
PCI/ASPM: Use 'parent' and 'child' for readability
Other users of link->pdev and link->downstream, e.g., pcie_aspm_cap_init(),
pcie_config_aspm_l1ss(), and pcie_config_aspm_link(), use "parent" and
"child" as local names.
Do the same in aspm_calc_l1ss_info() for readability. No functional change
intended.
Bjorn Helgaas [Thu, 15 Oct 2020 19:30:29 +0000 (14:30 -0500)]
PCI/ASPM: Move LTR path check to where it's used
pcie_get_aspm_reg() mostly reads ASPM-related registers, but in some cases
it also updates the value read from PCI_L1SS_CAP based on LTR properties.
Move this update to the point where the value is used to make the code more
readable.
No functional change intended, although previously we could clear
PCI_L1SS_CAP_ASPM_L1_2 for both ends of the link, and now we'll only do it
for the downstream end of a link. This shouldn't matter because we always
test that bit by ANDing l1ss_cap for the upstream and downstream ends.
Jisheng Zhang [Fri, 9 Oct 2020 07:55:05 +0000 (15:55 +0800)]
PCI: dwc: Fix MSI page leakage in suspend/resume
Currently, dw_pcie_msi_init() allocates and maps page for msi, then
program the PCIE_MSI_ADDR_LO and PCIE_MSI_ADDR_HI. The Root Complex
may lose power during suspend-to-RAM, so when we resume, we want to
redo the latter but not the former. If designware based driver (for
example, pcie-tegra194.c) calls dw_pcie_msi_init() in resume path, the
msi page will be leaked.
As pointed out by Rob and Ard, there's no need to allocate a page for
the MSI address, we could use an address in the driver data.
To avoid map the MSI msg again during resume, we move the map MSI msg
from dw_pcie_msi_init() to dw_pcie_host_init().
This gets iATU register area from reg property that has reg-names "atu".
In Synopsys DWC version 4.80 or later, since iATU register area is
separated from core register area, this area is necessary to get from
DT independently.
PCI: iproc: Fix using plain integer as NULL pointer in iproc_pcie_pltfm_probe
Fix sparse build warning:
drivers/pci/controller/pcie-iproc-platform.c:102:33: warning: Using plain integer as NULL pointer
The map_irq member of the struct iproc_pcie takes a function pointer
serving as a callback to map interrupts, therefore we should pass a NULL
pointer to it rather than a integer in the iproc_pcie_pltfm_probe()
function.
Related:
commit b64aa11eb2dd ("PCI: Set bridge map_irq and swizzle_irq to
default functions")
For arches that do not select CONFIG_GENERIC_IOMAP, the current
pci_iounmap() function does nothing causing obvious memory leaks
for mapped regions that are backed by MMIO physical space.
In order to detect if a mapped pointer is IO vs MMIO, a check must made
available to the pci_iounmap() function so that it can actually detect
whether the pointer has to be unmapped.
In configurations where CONFIG_HAS_IOPORT_MAP && !CONFIG_GENERIC_IOMAP,
a mapped port is detected using an ioport_map() stub defined in
asm-generic/io.h.
Use the same logic to implement a stub (ie __pci_ioport_unmap()) that
detects if the passed in pointer in pci_iounmap() is IO vs MMIO to
iounmap conditionally and call it in pci_iounmap() fixing the issue.
Leave __pci_ioport_unmap() as a NOP for all other config options.
Driver ->power_on and ->power_off callbacks leaks internal SMCC firmware
return codes to phy caller. This patch converts SMCC error codes to
standard linux errno codes. Include file linux/arm-smccc.h already provides
defines for SMCC error codes, so use them instead of custom driver defines.
Note that return value is signed 32bit, but stored in unsigned long type
with zero padding.
Jim Quinlan [Fri, 11 Sep 2020 17:52:29 +0000 (13:52 -0400)]
PCI: brcmstb: Set bus max burst size by chip type
The proper value of the parameter SCB_MAX_BURST_SIZE varies per chip. The
2711 family requires 128B whereas other devices can employ 512. The
assignment is complicated by the fact that the values for this two-bit
field have different meanings;
Jim Quinlan [Fri, 11 Sep 2020 17:52:28 +0000 (13:52 -0400)]
PCI: brcmstb: Accommodate MSI for older chips
Older BrcmSTB chips do not have a separate register for MSI interrupts; the
MSIs are in a register that also contains unrelated interrupts. In
addition, the interrupts lie in bits [31..24] for these legacy chips. This
commit provides common code for both legacy and non-legacy MSI interrupt
registers.
Jim Quinlan [Fri, 11 Sep 2020 17:52:27 +0000 (13:52 -0400)]
PCI: brcmstb: Set additional internal memory DMA viewport sizes
The Raspberry Pi (RPI) is currently the only chip using this driver
(pcie-brcmstb.c). There, only one memory controller is used, without an
extension region, and the SCB0 viewport size is set to the size of the
first and only dma-range region. Other BrcmSTB SOCs have more complicated
memory configurations that require setting additional viewport sizes.
BrcmSTB PCIe controllers are intimately connected to the memory
controller(s) on the SOC. The SOC may have one to three memory
controllers; they are indicated by the term SCBi. Each controller has a
base region and an optional extension region. In physical memory, the base
and extension regions of a controller are not adjacent, but in PCIe-space
they are.
There is a "viewport" for each memory controller that allows DMA from
endpoint devices. Each viewport's size must be set to a power of two, and
that size must be equal to or larger than the amount of memory each
controller supports which is the sum of base region and its optional
extension. Further, the 1-3 viewports are also adjacent in PCIe-space.
Unfortunately the viewport sizes cannot be ascertained from the
"dma-ranges" property so they have their own property, "brcm,scb-sizes".
This is because dma-range information does not indicate what memory
controller it is associated. For example, consider the following case
where the size of one dma-range is 2GB and the second dma-range is 1GB:
/* Case 1: SCB0 size set to 4GB */
dma-range0: 2GB (from memc0-base)
dma-range1: 1GB (from memc0-extension)
/* Case 2: SCB0 size set to 2GB, SCB1 size set to 1GB */
dma-range0: 2GB (from memc0-base)
dma-range1: 1GB (from memc0-extension)
By just looking at the dma-ranges information, one cannot tell which
situation applies. That is why an additional property is needed. Its
length indicates the number of memory controllers being used and each value
indicates the viewport size.
Note that the RPI DT does not have a "brcm,scb-sizes" property value,
as it is assumed that it only requires one memory controller and no
extension. So the optional use of "brcm,scb-sizes" will be backwards
compatible.
One last layer of complexity exists: all of the viewports sizes must be
added and rounded up to a power of two to determine what the "BAR" size is.
Further, an offset must be given that indicates the base PCIe address of
this "BAR". The use of the term BAR is typically associated with endpoint
devices, and the term is used here because the PCIe HW may be used as an RC
or an EP. In the former case, all of the system memory appears in a single
"BAR" region in PCIe memory. As it turns out, BrcmSTB PCIe HW is rarely
used in the EP role and its system of mapping memory is an artifact that
requires multiple dma-ranges regions.
Jim Quinlan [Fri, 11 Sep 2020 17:52:26 +0000 (13:52 -0400)]
PCI: brcmstb: Add control of rescal reset
Some STB chips have a special purpose reset controller named RESCAL (reset
calibration). The PCIe HW can now control RESCAL to start and stop its
operation. On probe(), the RESCAL is deasserted and the driver goes
through the sequence of setting registers and reading status in order to
start the internal PHY that is required for the PCIe.
Dexuan Cui [Fri, 2 Oct 2020 08:51:58 +0000 (01:51 -0700)]
PCI: hv: Fix hibernation in case interrupts are not re-created
pci_restore_msi_state() directly writes the MSI/MSI-X related registers
via MMIO. On a physical machine, this works perfectly; for a Linux VM
running on a hypervisor, which typically enables IOMMU interrupt remapping,
the hypervisor usually should trap and emulate the MMIO accesses in order
to re-create the necessary interrupt remapping table entries in the IOMMU,
otherwise the interrupts can not work in the VM after hibernation.
Hyper-V is different from other hypervisors in that it does not trap and
emulate the MMIO accesses, and instead it uses a para-virtualized method,
which requires the VM to call hv_compose_msi_msg() to notify the hypervisor
of the info that would be passed to the hypervisor in the case of the
trap-and-emulate method. This is not an issue to a lot of PCI device
drivers, which destroy and re-create the interrupts across hibernation, so
hv_compose_msi_msg() is called automatically. However, some PCI device
drivers (e.g. the in-tree GPU driver nouveau and the out-of-tree Nvidia
proprietary GPU driver) do not destroy and re-create MSI/MSI-X interrupts
across hibernation, so hv_pci_resume() has to call hv_compose_msi_msg(),
otherwise the PCI device drivers can no longer receive interrupts after
the VM resumes from hibernation.
Hyper-V is also different in that chip->irq_unmask() may fail in a
Linux VM running on Hyper-V (on a physical machine, chip->irq_unmask()
can not fail because unmasking an MSI/MSI-X register just means an MMIO
write): during hibernation, when a CPU is offlined, the kernel tries
to move the interrupt to the remaining CPUs that haven't been offlined
yet. In this case, hv_irq_unmask() -> hv_do_hypercall() always fails
because the vmbus channel has been closed: here the early "return" in
hv_irq_unmask() means the pci_msi_unmask_irq() is not called, i.e. the
desc->masked remains "true", so later after hibernation, the MSI interrupt
always remains masked, which is incorrect. Refer to cpu_disable_common()
-> fixup_irqs() -> irq_migrate_all_off_this_cpu() -> migrate_one_irq():
Fix the issue by calling pci_msi_unmask_irq() unconditionally in
hv_irq_unmask(). Also suppress the error message for hibernation because
the hypercall failure during hibernation does not matter (at this time
all the devices have been frozen). Note: the correct affinity info is
still updated into the irqdata data structure in migrate_one_irq() ->
irq_do_set_affinity() -> hv_set_affinity(), so later when the VM
resumes, hv_pci_restore_msi_state() is able to correctly restore
the interrupt with the correct affinity.
Jim Quinlan [Mon, 28 Sep 2020 19:46:51 +0000 (15:46 -0400)]
PCI: Add Kconfig options for MPS/MRRS strategy
Add Kconfig options for changing the default pcie_bus_config, i.e., the
strategy for configuration MPS and MRRS, in the same manner as the
CONFIG_PCIEASPM_XXXX choice. The pci_bus_config setting may still be
overridden by kernel command-line parameters, e.g.,
"pci=pcie_bus_tune_off".
7e24bc347e57 was based on PCIe r5.0, sec 5.9, which claims we need a 200 ms
delay when transitioning to or from D2. However, sec 5.3.1.3 states the
delay as 200 μs (microseconds), as does the table in PCIe r4.0, sec 5.9.1.
This looks like a typo in the r5.0 spec, so revert back to a 200 μs delay
instead of a 200 ms delay.
Fixes: 7e24bc347e57 ("PCI/PM: Apply D2 delay as milliseconds, not microseconds") Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Rafael J. Wysocki <[email protected]>
PCI devices support two variants of the D3 power state: D3hot (main power
present) D3cold (main power removed). Previously struct pci_dev contained:
unsigned int d3_delay; /* D3->D0 transition time in ms */
unsigned int d3cold_delay; /* D3cold->D0 transition time in ms */
"d3_delay" refers specifically to the D3hot state. Rename it to
"d3hot_delay" to avoid ambiguity and align with the ACPI "_DSM for
Specifying Device Readiness Durations" in the PCI Firmware spec r3.2,
sec 4.6.9.
The "struct dev_pm_ops pcibios_pm_ops", declared in include/linux/pci.h and
defined in drivers/pci/pci-driver.c, provided arch-specific hooks when a
PCI device was doing a hibernate transition.
394216275c7d ("s390: remove broken hibernate / power management support")
removed the last use of pcibios_pm_ops, so remove it completely.
Bean Huo [Fri, 18 Sep 2020 12:38:00 +0000 (14:38 +0200)]
PCI: kirin: Return -EPROBE_DEFER in case the gpio isn't ready
PCI host bridge driver can be probed before the gpiochip it requires,
so, of_get_named_gpio() can return -EPROBE_DEFER. Current code lets the
kirin_pcie_probe() directly return -ENODEV, which results in the PCI
host controller driver probe failure; with this error code the PCI host
controller driver will not be probed again when the gpiochip driver is
loaded.
Fix the above issue by letting kirin_pcie_probe() return -EPROBE_DEFER in
such a case.
misc: pci_endpoint_test: Add driver data for Layerscape PCIe controllers
The commit 0a121f9bc3f5 ("misc: pci_endpoint_test: Use streaming DMA
APIs for buffer allocation") changed to use streaming DMA APIs, however,
dma_map_single() might not return a 4KB aligned address, so add the
default_data as driver data for Layerscape PCIe controllers to make it
4KB aligned.
Xiaowei Bao [Fri, 18 Sep 2020 08:00:20 +0000 (16:00 +0800)]
PCI: layerscape: Modify the MSIX to the doorbell mode
dw_pcie_ep_raise_msix_irq was never called in the exisitng driver
before, because the ls1046a platform don't support the MSIX feature
and msix_capable was always set to false.
Now that add the ls1088a platform with MSIX support, use the doorbell
method to support the MSIX feature.
Xiaowei Bao [Fri, 18 Sep 2020 08:00:19 +0000 (16:00 +0800)]
PCI: layerscape: Modify the way of getting capability with different PEX
The different PCIe controller in one board may be have different
capability of MSI or MSIX, so change the way of getting the MSI
capability, make it more flexible.
Xiaowei Bao [Fri, 18 Sep 2020 08:00:16 +0000 (16:00 +0800)]
PCI: designware-ep: Modify MSI and MSIX CAP way of finding
Each PF of EP device should have its own MSI or MSIX capabitily
struct, so create a dw_pcie_ep_func struct and move the msi_cap
and msix_cap to this struct from dw_pcie_ep, and manage the PFs
via a list.
Xiaowei Bao [Fri, 18 Sep 2020 08:00:15 +0000 (16:00 +0800)]
PCI: designware-ep: Move the function of getting MSI capability forward
Move the function of getting MSI capability to the front of init
function, because the init function of the EP platform driver will use
the return value by the function of getting MSI capability.
Xiaowei Bao [Fri, 18 Sep 2020 08:00:13 +0000 (16:00 +0800)]
PCI: designware-ep: Add multiple PFs support for DWC
Add multiple PFs support for DWC, due to different PF have different
config space, we use func_conf_select callback function to access
the different PF's config space, the different chip company need to
implement this callback function when use the DWC IP core and intend
to support multiple PFs feature.
The msi_ctrl, io_optional and align_resource fields in struct hw_pci are
currently unused by arm/mach PCI host controller drivers and we won't
be adding any new users.
PCI: endpoint: Use "NULL" instead of "0" as a NULL pointer
When returning a NULL pointer, use "NULL" instead of "0".
Fixes sparse warning given by executing "make C=2 drivers/pci/":
CHECK drivers/pci/endpoint/pci-epc-core.c
drivers/pci/endpoint/pci-epc-core.c: note: in included file:
./include/linux/pci-ep-cfs.h:22:16: warning:
Using plain integer as NULL pointer
CHECK drivers/pci/endpoint/pci-epf-core.c
drivers/pci/endpoint/pci-epf-core.c: note: in included file:
./include/linux/pci-ep-cfs.h:31:16: warning:
Using plain integer as NULL pointer
pci_dev_reset_slot_function() refuses to reset a hotplug slot if it is
shared by multiple pci_devs. That's the case if and only if the slot is
occupied by a multifunction device.
Simplify the function to check the device's multifunction flag instead
of iterating over the devices on the bus. (Iterating over the devices
requires holding pci_bus_sem, which the function erroneously does not
acquire.)
When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of up to 244 msec between the two events with Thunderbolt.
After pciehp has brought down the slot in response to the first event, the
other bit may still be set. It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear. So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity. If the slot is no longer
occupied, the messages are false positives and annoy users.
Stuart Hayes reports the following splat on hot removal:
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status
Dongdong Liu complains about a similar splat:
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status
Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs. In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change. In reality however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f991424
("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1. On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat). To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see 6c35a1ac3da6 ("PCI: pciehp: Tolerate initially unstable link").
pciehp is designed to work with all of these variants. Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all. Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.
But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:
* Drop the "Timeout waiting for Presence Detect" message in
pcie_wait_for_presence(). The sole caller of that function,
pciehp_check_link_status(), ignores the timeout and carries on. It emits
error messages of its own and I don't think this particular message adds
much value.
* There's a single error condition in pciehp_check_link_status() which
does not emit a message. Adding one allows dropping the "Failed to check
link status" message emitted by board_added() if
pciehp_check_link_status() returns a non-zero integer.
* Tone down all messages in pciehp_check_link_status() to KERN_INFO
severity and rephrase them to look as innocuous as possible. To this
end, move the message emitted by pcie_wait_for_link_delay() to its
callers.
As a result, Stuart Hayes' splat becomes:
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001
Dongdong Liu's splat becomes:
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link
The messages now merely serve as information that presence or link bits
were set a little longer than expected. Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.
Clint Sbisa [Fri, 21 Aug 2020 15:51:21 +0000 (15:51 +0000)]
PCI: Update mmap-related #ifdef comments
f719582435af ("PCI: Add pci_mmap_resource_range() and use it for ARM64")
changed the #ifdef condition around pci_create_resource_files(),
pci_remove_resource_files(), and related functions, but did not update
comments at the #else and #ifdef.
Dongdong Liu [Thu, 10 Sep 2020 11:24:15 +0000 (19:24 +0800)]
PCI/LINK: Print IRQ number used by port
Print the IRQ used by PCIe Link Bandwidth Notification services port as
AER, PME and DPC do. It provides convenience to track PCIe BW notification
interrupt counts of certain port from /proc/interrupts.
The dmesg log is as below:
pcieport 0000:00:00.0: bw_notification: enabled with IRQ 1166
Jiang Biao [Mon, 24 Aug 2020 05:20:25 +0000 (13:20 +0800)]
PCI: Add schedule point in pci_read_config()
The PCI sysfs "config" file allows large reads, and the resulting PCI
config reads can take several milliseconds to complete. Testing with the
cyclictest [1] benchmark showed 5ms+ latencies.
Add a schedule point in pci_read_config() to reduce the maximum latency.
Jim Quinlan [Fri, 11 Sep 2020 17:52:25 +0000 (13:52 -0400)]
PCI: brcmstb: Add bcm7278 PERST# support
The PERST# bit was moved to a different register in 7278-type STB chips.
In addition, the polarity of the bit was also changed; for other chips
writing a 1 specified assert; for 7278-type chips, writing a 0 specifies
assert. Of course, PERST# is a PCIe asserted-low signal.
While we are here, also change the bridge_sw_init_set() functions so like
the perst_set() functions they are chip specific and we no longer rely on
data wrt chip specific field mask and shift values.
Jim Quinlan [Fri, 11 Sep 2020 17:52:23 +0000 (13:52 -0400)]
PCI: brcmstb: Add bcm7278 register info
Add in compatibility strings and code for three Broadcom STB chips. Some
of the register locations, shifts, and masks are different for certain
chips, requiring the use of different constants based on of_id.
We would like to add the following at this time to the match list but we
need to wait until the end of this patchset so that everything works.
Jim Quinlan [Fri, 11 Sep 2020 17:52:22 +0000 (13:52 -0400)]
dt-bindings: PCI: Add bindings for more Brcmstb chips
- Add compatible strings for three more Broadcom STB chips: 7278, 7216,
7211 (STB version of RPi4).
- Add new property 'brcm,scb-sizes'.
- Add new property 'resets'.
- Add new property 'reset-names' for 7216 only.
- Allow 'ranges' and 'dma-ranges' to have more than one item and update
the example to show this.
Rajat Jain [Tue, 7 Jul 2020 22:46:04 +0000 (15:46 -0700)]
PCI/ACS: Enable Translation Blocking for external devices
Translation Blocking is a required feature for Downstream Ports (Root
Ports or Switch Downstream Ports) that implement ACS. When enabled, the
Port checks the Address Type (AT) of each upstream Memory Request it
receives.
The default AT (00b) means "untranslated" and the IOMMU can decide whether
to treat the address as I/O virtual or physical.
If AT is not the default, i.e., if the Memory Request contains an
already-translated (physical) address, the Port blocks the request and
reports an ACS error.
When enabling ACS, enable Translation Blocking for external-facing ports
and untrusted (external) devices. This is to help prevent attacks from
external devices that initiate DMA with physical addresses that bypass the
IOMMU.
Yicong Yang [Thu, 3 Sep 2020 12:34:56 +0000 (13:34 +0100)]
PCI: hip: Add handling of HiSilicon HIP PCIe controller errors
The HiSilicon HIP PCIe controller is capable of handling errors
on root port and performing port reset separately at each root port.
Add error handling driver for HIP PCIe controller to log
and report recoverable errors. Perform root port reset and restore
link status after the recovery.
Following are some of the PCIe controller's recoverable errors
1. completion transmission timeout error.
2. CRS retry counter over the threshold error.
3. ECC 2 bit errors
4. AXI bresponse/rresponse errors etc.
The driver placed in the drivers/pci/controller/ because the
HIP PCIe controller does not use DWC IP.
Shiju Jose [Thu, 3 Sep 2020 12:34:55 +0000 (13:34 +0100)]
ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records
CPER records describing a firmware-first error are identified by GUID.
The ghes driver currently logs, but ignores any unknown CPER records.
This prevents describing errors that can't be represented by a standard
entry, that would otherwise allow a driver to recover from an error.
The UEFI spec calls these 'Non-standard Section Body' (N.2.3 of
version 2.8).
Add a notifier chain for these non-standard/vendor-records. Callers
must identify their type of records by GUID.
Record data is copied to memory from the ghes_estatus_pool to allow
us to keep it until after the notifier has run.
Jeremy Linton [Tue, 8 Sep 2020 21:03:59 +0000 (16:03 -0500)]
PCI/ACPI: Tone down missing MCFG message
MCFG is an optional ACPI table. Given there are machines without PCI (or
it is hidden) we have been receiving queries/complaints about what this
message means given it's being presented as an error.
Reduce the message severity. The ACPI table list printed at boot will
continue to provide another way to detect when the table is missing.
Rob Herring [Fri, 21 Aug 2020 03:54:19 +0000 (21:54 -0600)]
PCI: dwc: Move N_FTS setup to common setup
The Designware controller has common registers to set number of fast
training sequence ordered sets. The Artpec6, Intel, and Tegra driver
initialize these register fields. Let's move the initialization to the
common setup code and drivers just have to provide the value.
There's a slight change in that the common clock mode N_FTS field is
now initialized. Previously only the Intel driver set this. It's not
clear from the code if common clock mode is used in the Artpec6 or Tegra
driver. It depends on the DWC configuration. Given the field is not
initialized while the others are, it seems unlikely common clock mode
is used.