path: root/drivers/pci
Age  Commit message  Author
2020-10-13  mm/memremap_pages: convert to 'struct range'  Dan Williams
The 'struct resource' in 'struct dev_pagemap' is only used for holding resource span information. The other fields, 'name', 'flags', 'desc', 'parent', 'sibling', and 'child' are all unused wasted space. This is in preparation for introducing a multi-range extension of devm_memremap_pages(). The bulk of this change is unwinding all the places internal to libnvdimm that used 'struct resource' unnecessarily, and replacing instances of 'struct dev_pagemap'.res with 'struct dev_pagemap'.range. P2PDMA had a minor usage of the resource flags field, but only to report failures with "%pR". That is replaced with an open coded print of the range. [dan.carpenter@oracle.com: mm/hmm/test: use after free in dmirror_allocate_chunk()] Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen] Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: kernel test robot <lkp@intel.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
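To make the 'struct range' idea in the commit above concrete, here is a minimal userspace sketch of an inclusive start/end span with a length helper and an open-coded print. The two-field layout and the "end - start + 1" length follow the kernel's range helpers; the format_range() helper and its "[mem ...]" output format are illustrative assumptions, not the format the patch actually prints.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace model of an inclusive [start, end] span. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Length in bytes of an inclusive range. */
static uint64_t range_len(const struct range *r)
{
	assert(r->end >= r->start);
	return r->end - r->start + 1;
}

/*
 * Hypothetical open-coded print standing in for "%pR".
 * Returns the number of characters written (snprintf semantics).
 */
static int format_range(char *buf, size_t len, const struct range *r)
{
	return snprintf(buf, len, "[mem %#010" PRIx64 "-%#010" PRIx64 "]",
			r->start, r->end);
}
```

Compared with 'struct resource', nothing but the span survives, which is the space saving the commit message describes.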
2020-10-13  PCI: dwc: Fix MSI page leakage in suspend/resume  Jisheng Zhang
Currently, dw_pcie_msi_init() allocates and maps a page for MSI, then programs PCIE_MSI_ADDR_LO and PCIE_MSI_ADDR_HI. The Root Complex may lose power during suspend-to-RAM, so on resume we want to redo the latter but not the former. If a DesignWare-based driver (for example, pcie-tegra194.c) calls dw_pcie_msi_init() in its resume path, the MSI page is leaked. As pointed out by Rob and Ard, there's no need to allocate a page for the MSI address; we can use an address in the driver data. To avoid mapping the MSI message again during resume, move the mapping of the MSI message from dw_pcie_msi_init() to dw_pcie_host_init(). Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20201009155505.5a580ef5@xhacker.debian Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-10-13  PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabled  Jisheng Zhang
If MSI is disabled, there's no need to program PCIE_MSI_INTR0_MASK and PCIE_MSI_INTR0_ENABLE registers. Link: https://lore.kernel.org/r/20201009155436.27e67238@xhacker.debian Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2020-10-13  PCI: keystone: Remove iATU register mapping  Kunihiko Hayashi
After applying "PCI: dwc: Add common iATU register support", there is no need to set up the iATU register mapping in the Keystone driver itself. Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1601444167-11316-5-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2020-10-13  PCI: dwc: Add common iATU register support  Kunihiko Hayashi
This gets the iATU register area from the reg property that has reg-names "atu". In Synopsys DWC version 4.80 or later, the iATU register area is separate from the core register area, so it must be obtained from DT independently. Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1601444167-11316-4-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2020-10-12  Merge tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds
Pull x86 irq updates from Thomas Gleixner:
"Surgery of the MSI interrupt handling to prepare the support of upcoming devices which require non-PCI based MSI handling:
- Cleanup historical leftovers all over the place
- Rework the code to utilize more core functionality
- Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain assignment to PCI devices possible.
- Assign irqdomains to PCI devices at initialization time which allows to utilize the full functionality of hierarchical irqdomains.
- Remove arch_.*_msi_irq() functions from X86 and utilize the irqdomain which is assigned to the device for interrupt management.
- Make the arch_.*_msi_irq() support conditional on a config switch and let the last few users select it"
* tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS
  x86/apic/msi: Unbreak DMAR and HPET MSI
  iommu/amd: Remove domain search for PCI/MSI
  iommu/vt-d: Remove domain search for PCI/MSI[X]
  x86/irq: Make most MSI ops XEN private
  x86/irq: Cleanup the arch_*_msi_irqs() leftovers
  PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
  x86/pci: Set default irq domain in pcibios_add_device()
  iommm/amd: Store irq domain in struct device
  iommm/vt-d: Store irq domain in struct device
  x86/xen: Wrap XEN MSI management into irqdomain
  irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
  x86/xen: Consolidate XEN-MSI init
  x86/xen: Rework MSI teardown
  x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
  PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
  PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI
  irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
  x86/irq: Initialize PCI/MSI domain at PCI init time
  x86/pci: Reducde #ifdeffery in PCI init code
  ...
2020-10-09  PCI: iproc: Fix using plain integer as NULL pointer in iproc_pcie_pltfm_probe  Krzysztof Wilczyński
Fix sparse build warning:

  drivers/pci/controller/pcie-iproc-platform.c:102:33: warning: Using plain integer as NULL pointer

The map_irq member of struct iproc_pcie takes a function pointer serving as a callback to map interrupts, therefore we should pass a NULL pointer to it rather than an integer in the iproc_pcie_pltfm_probe() function. Related: commit b64aa11eb2dd ("PCI: Set bridge map_irq and swizzle_irq to default functions") Link: https://lore.kernel.org/r/20200922194932.465925-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
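The pattern sparse complains about can be shown outside the kernel. The sketch below uses a hypothetical struct demo_pcie with a map_irq callback; none of these names come from the iproc driver. The point is only that a function-pointer member is cleared with NULL, not with the integer 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a controller struct with an IRQ-mapping callback. */
struct demo_pcie {
	int (*map_irq)(unsigned int slot, unsigned int pin);
};

/* Illustrative mapping only; real drivers derive this from hardware or DT. */
static int demo_map_irq(unsigned int slot, unsigned int pin)
{
	return (int)(slot * 4 + pin);
}

/* Clear the callback with NULL; writing "= 0" here is what sparse flags. */
static void demo_pcie_clear_map_irq(struct demo_pcie *pcie)
{
	assert(pcie != NULL);
	pcie->map_irq = NULL;
}

/* Use the callback if one is installed; otherwise report -1. */
static int demo_resolve_irq(const struct demo_pcie *pcie,
			    unsigned int slot, unsigned int pin)
{
	if (pcie->map_irq == NULL)
		return -1;
	return pcie->map_irq(slot, pin);
}
```

Both spellings compile to the same code; the change is about type clarity for readers and static checkers.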
2020-10-06  PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI  Lukas Wunner
Recent laptops with dual AMD GPUs fail to suspend the discrete GPU, thus causing lockups on system sleep and high power consumption at runtime. The discrete GPU would normally be suspended to D3cold by turning off ACPI _PR3 Power Resources of the Root Port above the GPU. However on affected systems, the Root Port is hotplug-capable and pci_bridge_d3_possible() only allows hotplug ports to go to D3 if they belong to a Thunderbolt device or if the Root Port possesses a "HotPlugSupportInD3" ACPI property. Neither is the case on affected laptops. The reason for whitelisting only specific, known to work hotplug ports for D3 is that there have been reports of SkyLake Xeon-SP systems raising Hardware Error NMIs upon suspending their hotplug ports: https://lore.kernel.org/linux-pci/20170503180426.GA4058@otc-nc-03/ But if a hotplug port is power manageable by ACPI (as can be detected through presence of Power Resources and corresponding _PS0 and _PS3 methods) then it ought to be safe to suspend it to D3. To this end, amend acpi_pci_bridge_d3() to whitelist such ports for D3. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1222 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1252 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1304 Reported-and-tested-by: Arthur Borsboom <arthurborsboom@gmail.com> Reported-and-tested-by: matoro <matoro@airmail.cc> Reported-by: Aaron Zakhrov <aaron.zakhrov@gmail.com> Reported-by: Michal Rostecki <mrostecki@suse.com> Reported-by: Shai Coleman <git@shaicoleman.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-06  dma-mapping: move dma-debug.h to kernel/dma/  Christoph Hellwig
Most of dma-debug.h is not required by anything outside of kernel/dma. Move the four declarations needed by dma-mapping.h or dma-ops providers into dma-mapping.h and dma-map-ops.h, and move the remainder of the file to kernel/dma/debug.h. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06  dma-mapping: split <linux/dma-mapping.h>  Christoph Hellwig
Split out all the bits that are purely for dma_map_ops implementations and related code into a new <linux/dma-map-ops.h> header so that they don't get pulled into all the drivers. That also means the architecture-specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h> any more, which leads to missing includes in a few not overly portable drivers that relied on headers pulled in by the x86 or arm versions. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-05  PCI: meson: Build as module by default  Kevin Hilman
Enable pci-meson to build as a module whenever ARCH_MESON is enabled. Link: https://lore.kernel.org/r/20200918181251.32423-1-khilman@baylibre.com Signed-off-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Cc: Yue Wang <yue.wang@amlogic.com>
2020-10-05  Merge 5.9-rc8 into usb-next  Greg Kroah-Hartman
We need the USB fixes in here as well for testing. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02  Merge tag 'pci-v5.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci  Linus Torvalds
Pull PCI fixes from Bjorn Helgaas:
- Fix rockchip regression in rockchip_pcie_valid_device() (Lorenzo Pieralisi)
- Add Pali Rohár as aardvark PCI maintainer (Pali Rohár)
* tag 'pci-v5.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Add Pali Rohár as aardvark PCI maintainer
  PCI: rockchip: Fix bus checks in rockchip_pcie_valid_device()
2020-10-02  PCI: aardvark: Fix initialization with old Marvell's Arm Trusted Firmware  Pali Rohár
Old ATF automatically powers on the PCIe PHY and does not provide an SMC call for PHY power-on, which leads to aardvark initialization failure:

  [ 0.330134] mvebu-a3700-comphy d0018300.phy: unsupported SMC call, try updating your firmware
  [ 0.338846] phy phy-d0018300.phy.1: phy poweron failed --> -95
  [ 0.344753] advk-pcie d0070000.pcie: Failed to initialize PHY (-95)
  [ 0.351160] advk-pcie: probe of d0070000.pcie failed with error -95

This patch fixes the above failure by ignoring the 'not supported' error in the aardvark driver. In this case it is expected that the PHY is already powered on. Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Link: https://lore.kernel.org/r/20200902144344.16684-3-pali@kernel.org Fixes: 366697018c9a ("PCI: aardvark: Add PHY support") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: <stable@vger.kernel.org> # 5.8+: ea17a0f153af: phy: marvell: comphy: Convert internal SMCC firmware return codes to errno
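The "ignore 'not supported'" approach can be sketched in userspace as follows. Error -95 in the log above corresponds to EOPNOTSUPP on Linux. The function names and the callback type are hypothetical and only model the control flow; they are not the aardvark driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical PHY power-on hook: returns 0 or a negative errno. */
typedef int (*demo_phy_power_on_fn)(void);

/* Models firmware that lacks the power-on call (PHY is assumed already on). */
static int demo_fw_old(void)    { return -EOPNOTSUPP; }
/* Models firmware that implements the call. */
static int demo_fw_new(void)    { return 0; }
/* Models a genuine failure that must still be reported. */
static int demo_fw_broken(void) { return -EIO; }

/*
 * Treat "not supported" as success, on the assumption that such firmware
 * has already powered the PHY; propagate every other error unchanged.
 */
static int demo_enable_phy(demo_phy_power_on_fn power_on)
{
	int ret;

	assert(power_on != NULL);
	ret = power_on();
	if (ret == -EOPNOTSUPP)
		return 0;
	return ret;
}
```

Only the one specific error is swallowed, so real power-on failures still abort the probe.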
2020-10-02  PCI: xgene: Remove unused assignment to variable msi_val  Krzysztof Wilczyński
The value assigned to msi_val after the inner loop finishes its run is never used for anything, and it is also immediately overridden in the line that follows with the return value from the xgene_msi_int_read() function. Since the value of msi_val following the inner loop completion is never used in any meaningful way the assignment can be removed. Addresses-Coverity-ID: 1437183 ("Unused value") Link: https://lore.kernel.org/r/20200922030257.459898-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-02  PCI: loongson: Simplify loongson_pci_probe() return expression  Qinglang Miao
Simplify the return expression. Link: https://lore.kernel.org/r/20200921131054.92797-1-miaoqinglang@huawei.com Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-10-02  PCI: cadence: Simplify cdns_pcie_host_init_address_translation() return expression  Qinglang Miao
Simplify the return expression. Link: https://lore.kernel.org/r/20200921131053.92752-1-miaoqinglang@huawei.com Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-10-02  PCI: mobiveil: Simplify mobiveil_pcie_init_irq_domain() return expression  Liu Shixin
Simplify the return expression by removing useless code. Link: https://lore.kernel.org/r/20200921082447.2591877-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-10-02  PCI: iproc: Use module_bcma_driver to simplify the code  Liu Shixin
module_bcma_driver() makes the code simpler by eliminating boilerplate code. Link: https://lore.kernel.org/r/20200918030829.3946025-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Ray Jui <ray.jui@broadcom.com>
2020-10-02  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list  Jim Quinlan
Now that the support is in place with previous commits, we add several chips that use the BrcmSTB driver. Link: https://lore.kernel.org/r/20200911175232.19016-11-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Rob Herring <robh@kernel.org>
2020-10-02  PCI: brcmstb: Set bus max burst size by chip type  Jim Quinlan
The proper value of the parameter SCB_MAX_BURST_SIZE varies per chip. The 2711 family requires 128B whereas other devices can employ 512. The assignment is complicated by the fact that the values for this two-bit field have different meanings:

  Value  Type_Generic  Type_7278
  00     Reserved      128B
  01     128B          256B
  10     256B          512B
  11     512B          Reserved

Link: https://lore.kernel.org/r/20200911175232.19016-10-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <jquinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
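The two encodings of the two-bit field above can be expressed as a small lookup. This is a sketch based only on the table in the commit message; the enum and function names are hypothetical, and the real driver may structure this differently.

```c
#include <assert.h>

/* Hypothetical chip classes matching the two table columns. */
enum demo_chip_type { DEMO_TYPE_GENERIC, DEMO_TYPE_7278 };

/*
 * Encode a burst size in bytes into the two-bit field value.
 * Returns -1 if the size has no encoding for the given chip type.
 */
static int demo_burst_field(enum demo_chip_type type, unsigned int bytes)
{
	int generic;

	assert(type == DEMO_TYPE_GENERIC || type == DEMO_TYPE_7278);

	switch (bytes) {
	case 128: generic = 1; break;	/* 01 on generic chips */
	case 256: generic = 2; break;	/* 10 on generic chips */
	case 512: generic = 3; break;	/* 11 on generic chips */
	default:  return -1;
	}

	/* Per the table, the 7278 encoding is the generic one minus one. */
	return type == DEMO_TYPE_7278 ? generic - 1 : generic;
}
```

For example, 128B encodes as 01 on a generic chip but as 00 on a 7278-type chip, which is exactly the mismatch the commit has to handle.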
2020-10-02  PCI: brcmstb: Accommodate MSI for older chips  Jim Quinlan
Older BrcmSTB chips do not have a separate register for MSI interrupts; the MSIs are in a register that also contains unrelated interrupts. In addition, the interrupts lie in bits [31..24] for these legacy chips. This commit provides common code for both legacy and non-legacy MSI interrupt registers. Link: https://lore.kernel.org/r/20200911175232.19016-9-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <jquinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
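A rough sketch of what "common code for both legacy and non-legacy MSI interrupt registers" could look like: on legacy chips the MSI status sits in bits [31..24] of a shared register, so it is shifted and masked out. The names are hypothetical, and treating the whole 32-bit register as MSI status on non-legacy chips is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LEGACY_MSI_SHIFT 24u	/* MSIs occupy bits [31..24] on legacy chips */
#define DEMO_LEGACY_MSI_MASK  0xffu	/* eight MSI bits after shifting */

/*
 * Extract the pending-MSI bitmap from a raw status register value.
 * Legacy: only bits [31..24] are MSIs; the rest are unrelated interrupts.
 * Non-legacy (assumed): the register is dedicated to MSIs.
 */
static uint32_t demo_msi_status(uint32_t reg, int legacy)
{
	if (legacy)
		return (reg >> DEMO_LEGACY_MSI_SHIFT) & DEMO_LEGACY_MSI_MASK;
	return reg;
}
```

With the bitmap normalized this way, the same handler loop can walk the set bits for both chip generations.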
2020-10-02  PCI: brcmstb: Set additional internal memory DMA viewport sizes  Jim Quinlan
The Raspberry Pi (RPI) is currently the only chip using this driver (pcie-brcmstb.c). There, only one memory controller is used, without an extension region, and the SCB0 viewport size is set to the size of the first and only dma-range region. Other BrcmSTB SOCs have more complicated memory configurations that require setting additional viewport sizes. BrcmSTB PCIe controllers are intimately connected to the memory controller(s) on the SOC. The SOC may have one to three memory controllers; they are indicated by the term SCBi. Each controller has a base region and an optional extension region. In physical memory, the base and extension regions of a controller are not adjacent, but in PCIe-space they are. There is a "viewport" for each memory controller that allows DMA from endpoint devices. Each viewport's size must be set to a power of two, and that size must be equal to or larger than the amount of memory each controller supports which is the sum of base region and its optional extension. Further, the 1-3 viewports are also adjacent in PCIe-space. Unfortunately the viewport sizes cannot be ascertained from the "dma-ranges" property so they have their own property, "brcm,scb-sizes". This is because dma-range information does not indicate what memory controller it is associated. For example, consider the following case where the size of one dma-range is 2GB and the second dma-range is 1GB: /* Case 1: SCB0 size set to 4GB */ dma-range0: 2GB (from memc0-base) dma-range1: 1GB (from memc0-extension) /* Case 2: SCB0 size set to 2GB, SCB1 size set to 1GB */ dma-range0: 2GB (from memc0-base) dma-range1: 1GB (from memc0-extension) By just looking at the dma-ranges information, one cannot tell which situation applies. That is why an additional property is needed. Its length indicates the number of memory controllers being used and each value indicates the viewport size. 
Note that the RPI DT does not have a "brcm,scb-sizes" property value, as it is assumed that it only requires one memory controller and no extension. So the optional use of "brcm,scb-sizes" will be backwards compatible. One last layer of complexity exists: all of the viewports sizes must be added and rounded up to a power of two to determine what the "BAR" size is. Further, an offset must be given that indicates the base PCIe address of this "BAR". The use of the term BAR is typically associated with endpoint devices, and the term is used here because the PCIe HW may be used as an RC or an EP. In the former case, all of the system memory appears in a single "BAR" region in PCIe memory. As it turns out, BrcmSTB PCIe HW is rarely used in the EP role and its system of mapping memory is an artifact that requires multiple dma-ranges regions. Link: https://lore.kernel.org/r/20200911175232.19016-8-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
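The sizing rules in this commit message (each viewport is a power of two at least as large as base plus extension, and the "BAR" size is the sum of the viewports rounded up to a power of two) can be worked through in a short sketch. The helper names are hypothetical, and choosing the smallest sufficient power of two for a viewport is an assumption; the text only requires "equal to or larger".

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v must be non-zero and <= 2^63). */
static uint64_t demo_roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (UINT64_C(1) << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Smallest valid viewport for one controller: base plus optional extension. */
static uint64_t demo_viewport_size(uint64_t base, uint64_t extension)
{
	return demo_roundup_pow2(base + extension);
}

/* "BAR" size: sum of the 1..3 viewport sizes, rounded up to a power of two. */
static uint64_t demo_bar_size(const uint64_t *viewports, unsigned int count)
{
	uint64_t total = 0;
	unsigned int i;

	assert(count >= 1 && count <= 3);
	for (i = 0; i < count; i++)
		total += viewports[i];
	return demo_roundup_pow2(total);
}
```

With a 2GB base and a 1GB extension on one controller, the viewport comes out as 4GB, matching Case 1 in the message; with separate 2GB and 1GB viewports as in Case 2, the 3GB sum also rounds up to a 4GB "BAR".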
2020-10-02  PCI: brcmstb: Add control of rescal reset  Jim Quinlan
Some STB chips have a special purpose reset controller named RESCAL (reset calibration). The PCIe HW can now control RESCAL to start and stop its operation. On probe(), the RESCAL is deasserted and the driver goes through the sequence of setting registers and reading status in order to start the internal PHY that is required for the PCIe. Link: https://lore.kernel.org/r/20200911175232.19016-7-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <jquinlan@broadcom.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
2020-10-02  PCI: hv: Fix hibernation in case interrupts are not re-created  Dexuan Cui
pci_restore_msi_state() directly writes the MSI/MSI-X related registers via MMIO. On a physical machine, this works perfectly; for a Linux VM running on a hypervisor, which typically enables IOMMU interrupt remapping, the hypervisor usually should trap and emulate the MMIO accesses in order to re-create the necessary interrupt remapping table entries in the IOMMU, otherwise the interrupts can not work in the VM after hibernation. Hyper-V is different from other hypervisors in that it does not trap and emulate the MMIO accesses, and instead it uses a para-virtualized method, which requires the VM to call hv_compose_msi_msg() to notify the hypervisor of the info that would be passed to the hypervisor in the case of the trap-and-emulate method. This is not an issue to a lot of PCI device drivers, which destroy and re-create the interrupts across hibernation, so hv_compose_msi_msg() is called automatically. However, some PCI device drivers (e.g. the in-tree GPU driver nouveau and the out-of-tree Nvidia proprietary GPU driver) do not destroy and re-create MSI/MSI-X interrupts across hibernation, so hv_pci_resume() has to call hv_compose_msi_msg(), otherwise the PCI device drivers can no longer receive interrupts after the VM resumes from hibernation. Hyper-V is also different in that chip->irq_unmask() may fail in a Linux VM running on Hyper-V (on a physical machine, chip->irq_unmask() can not fail because unmasking an MSI/MSI-X register just means an MMIO write): during hibernation, when a CPU is offlined, the kernel tries to move the interrupt to the remaining CPUs that haven't been offlined yet. In this case, hv_irq_unmask() -> hv_do_hypercall() always fails because the vmbus channel has been closed: here the early "return" in hv_irq_unmask() means the pci_msi_unmask_irq() is not called, i.e. the desc->masked remains "true", so later after hibernation, the MSI interrupt always remains masked, which is incorrect. 
Refer to cpu_disable_common() -> fixup_irqs() -> irq_migrate_all_off_this_cpu() -> migrate_one_irq():

    static bool migrate_one_irq(struct irq_desc *desc)
    {
    ...
        if (maskchip && chip->irq_mask)
            chip->irq_mask(d);
    ...
        err = irq_do_set_affinity(d, affinity, false);
    ...
        if (maskchip && chip->irq_unmask)
            chip->irq_unmask(d);

Fix the issue by calling pci_msi_unmask_irq() unconditionally in hv_irq_unmask(). Also suppress the error message for hibernation because the hypercall failure during hibernation does not matter (at this time all the devices have been frozen). Note: the correct affinity info is still updated into the irqdata data structure in migrate_one_irq() -> irq_do_set_affinity() -> hv_set_affinity(), so later when the VM resumes, hv_pci_restore_msi_state() is able to correctly restore the interrupt with the correct affinity. Link: https://lore.kernel.org/r/20201002085158.9168-1-decui@microsoft.com Fixes: ac82fc832708 ("PCI: hv: Add hibernation support") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Jake Oshins <jakeo@microsoft.com>
2020-09-30  PCI: Add Kconfig options for MPS/MRRS strategy  Jim Quinlan
Add Kconfig options for changing the default pcie_bus_config, i.e., the strategy for configuring MPS and MRRS, in the same manner as the CONFIG_PCIEASPM_XXXX choice. The pcie_bus_config setting may still be overridden by kernel command-line parameters, e.g., "pci=pcie_bus_tune_off". [bhelgaas: depend on EXPERT, tweak help texts] Link: https://lore.kernel.org/r/20200928194651.5393-2-james.quinlan@broadcom.com Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-30  PCI/PM: Revert "PCI/PM: Apply D2 delay as milliseconds, not microseconds"  Bjorn Helgaas
This reverts commit 7e24bc347e57992d532bc2ed700209b0fc0a4bf5. 7e24bc347e57 was based on PCIe r5.0, sec 5.9, which claims we need a 200 ms delay when transitioning to or from D2. However, sec 5.3.1.3 states the delay as 200 μs (microseconds), as does the table in PCIe r4.0, sec 5.9.1. This looks like a typo in the r5.0 spec, so revert back to a 200 μs delay instead of a 200 ms delay. Fixes: 7e24bc347e57 ("PCI/PM: Apply D2 delay as milliseconds, not microseconds") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-30  PCI/PM: Remove unused PCI_PM_BUS_WAIT  Bjorn Helgaas
476e7faefc43 ("PCI PM: Do not wait for buses in B2 or B3 during resume") removed the last use of PCI_PM_BUS_WAIT. Remove the definition as well. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-30  PCI/P2PDMA: Drop double zeroing for sg_init_table()  Julia Lawall
sg_init_table() zeroes its first argument, so the allocation of that argument doesn't have to. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)

  // <smpl>
  @@
  expression x;
  @@

  x =
  - kzalloc
  + kmalloc
   (...)
  ...
  sg_init_table(x,...)
  // </smpl>

Link: https://lore.kernel.org/r/1600601186-7420-15-git-send-email-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-30  PCI: Simplify bool comparisons  Krzysztof Wilczyński
Address Coccinelle warnings:

  drivers/pci/pci.c:6008:6-12: WARNING: Comparison to bool
  drivers/pci/pci.c:6024:7-13: WARNING: Comparison to bool

No change to functionality intended. Link: https://lore.kernel.org/r/20200925224555.1752460-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
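For readers unfamiliar with this class of warning, the following sketch contrasts the flagged style with the simplified one. The functions are hypothetical and are not the code at the pci.c lines quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Flagged style: comparing a bool against a bool constant. */
static int demo_check_verbose(bool enabled)
{
	if (enabled == true)
		return 1;
	return 0;
}

/* Simplified style: test the bool directly; behavior is identical. */
static int demo_check_simple(bool enabled)
{
	if (enabled)
		return 1;
	return 0;
}
```

Because both forms evaluate the same condition, such a cleanup changes readability only, which matches the "no change to functionality" note.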
2020-09-30  Merge tag 'thunderbolt-for-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next  Greg Kroah-Hartman
Mika writes: thunderbolt: Changes for v5.10 merge window
This includes following Thunderbolt/USB4 changes for v5.10 merge window:
* A couple of optimizations around Tiger Lake force power logic and NHI (Native Host Interface) LC (Link Controller) mailbox command processing
* Power management improvements for Software Connection Manager
* Debugfs support
* Allow KUnit tests to be enabled also when Thunderbolt driver is configured as module.
* Few minor cleanups and fixes
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (37 commits)
  thunderbolt: Capitalize comment on top of QUIRK_FORCE_POWER_LINK_CONTROLLER
  thunderbolt: Correct tb_check_quirks() kernel-doc
  thunderbolt: Log correct zeroX entries in decode_error()
  thunderbolt: Handle ERR_LOCK notification
  thunderbolt: Use "if USB4" instead of "depends on" in Kconfig
  thunderbolt: Allow KUnit tests to be built also when CONFIG_USB4=m
  thunderbolt: Only stop control channel when entering freeze
  thunderbolt: debugfs: Fix uninitialized return in counters_write()
  thunderbolt: Add debugfs interface
  thunderbolt: No need to warn in TB_CFG_ERROR_INVALID_CONFIG_SPACE
  thunderbolt: Introduce tb_switch_is_tiger_lake()
  thunderbolt: Introduce tb_switch_is_ice_lake()
  thunderbolt: Check for Intel vendor ID when identifying controller
  thunderbolt: Introduce tb_port_is_nhi()
  thunderbolt: Introduce tb_switch_next_cap()
  thunderbolt: Introduce tb_port_next_cap()
  thunderbolt: Move struct tb_cap_any to tb_regs.h
  thunderbolt: Add runtime PM for Software CM
  thunderbolt: Create device links from ACPI description
  ACPI: Export acpi_get_first_physical_node() to modules
  ...
2020-09-29  PCI/PM: Rename pci_dev.d3_delay to d3hot_delay  Krzysztof Wilczyński
PCI devices support two variants of the D3 power state:

  D3hot (main power present)
  D3cold (main power removed)

Previously struct pci_dev contained:

  unsigned int d3_delay;       /* D3->D0 transition time in ms */
  unsigned int d3cold_delay;   /* D3cold->D0 transition time in ms */

"d3_delay" refers specifically to the D3hot state. Rename it to "d3hot_delay" to avoid ambiguity and align with the ACPI "_DSM for Specifying Device Readiness Durations" in the PCI Firmware spec r3.2, sec 4.6.9. There is no change to the functionality. Link: https://lore.kernel.org/r/20200730210848.1578826-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-29  PCI/PM: Remove unused pcibios_pm_ops  Vaibhav Gupta
The "struct dev_pm_ops pcibios_pm_ops", declared in include/linux/pci.h and defined in drivers/pci/pci-driver.c, provided arch-specific hooks when a PCI device was doing a hibernate transition. 394216275c7d ("s390: remove broken hibernate / power management support") removed the last use of pcibios_pm_ops, so remove it completely. [bhelgaas: drop unused "error"] Link: https://lore.kernel.org/r/20200730194416.1029509-1-vaibhavgupta40@gmail.com Reported-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-28  PCI: shpchp: Remove unused 'rc' assignment  Krzysztof Wilczyński
The value of the constant POWER_FAILURE assigned to the variable rc after the power fault check is never used for anything, so remove it. Addresses-Coverity-ID: 1226899 ("Unused value") Link: https://lore.kernel.org/r/20200923025225.471459-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-28  PCI: kirin: Return -EPROBE_DEFER in case the gpio isn't ready  Bean Huo
PCI host bridge driver can be probed before the gpiochip it requires, so, of_get_named_gpio() can return -EPROBE_DEFER. Current code lets the kirin_pcie_probe() directly return -ENODEV, which results in the PCI host controller driver probe failure; with this error code the PCI host controller driver will not be probed again when the gpiochip driver is loaded. Fix the above issue by letting kirin_pcie_probe() return -EPROBE_DEFER in such a case. Link: https://lore.kernel.org/r/20200918123800.19983-1-huobean@gmail.com Signed-off-by: Bean Huo <beanhuo@micron.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
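The difference between the old and fixed behavior can be modeled in a few lines. In this sketch the GPIO lookup result is passed in as a plain integer, and the function names are hypothetical. EPROBE_DEFER is a kernel-internal code that does not appear in userspace <errno.h>, so its value (517) is defined locally here.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal "retry probe later" code; defined locally for this model. */
#define DEMO_EPROBE_DEFER 517

/*
 * 'gpio' models the lookup result: >= 0 is a GPIO number,
 * < 0 is a negative error code.
 */

/* Old behavior: every lookup failure becomes -ENODEV, so probing never retries. */
static int demo_probe_old(int gpio)
{
	if (gpio < 0)
		return -ENODEV;
	return 0;
}

/* Fixed behavior: pass the deferral through so the probe is retried later. */
static int demo_probe_fixed(int gpio)
{
	if (gpio == -DEMO_EPROBE_DEFER)
		return -DEMO_EPROBE_DEFER;
	if (gpio < 0)
		return -ENODEV;
	return 0;
}
```

Only the deferral case changes; other lookup failures are still reported as -ENODEV.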
2020-09-28  PCI: dwc: Fix 'cast truncates bits from constant value'  Gustavo Pimentel
Fix the warning reported by "make C=2 drivers/pci/". Sparse output:

  CHECK drivers/pci/controller/dwc/pcie-designware.c
  drivers/pci/controller/dwc/pcie-designware.c:432:52: warning: cast truncates bits from constant value (ffffffff7fffffff becomes 7fffffff)

Link: https://lore.kernel.org/r/7ea7f7d342f97c758949a17b870012f52ce5b3f5.1600767645.git.gustavo.pimentel@synopsys.com Reported-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Joao Pinto <jpinto@synopsys.com>
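The values in the sparse message (ffffffff7fffffff becoming 7fffffff) are what one gets when a bit-31 mask is inverted in 64 bits and then stored in a 32-bit value. The sketch below reproduces that arithmetic with a hypothetical DEMO_ENABLE_BIT constant; it does not claim to show the exact expression at the reported line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit-31 flag, held in a 64-bit type as BIT(31) would be on 64-bit builds. */
#define DEMO_ENABLE_BIT (UINT64_C(1) << 31)

/* Inverting in 64 bits sets the upper 32 bits as well. */
static uint64_t demo_wide_mask(void)
{
	return ~DEMO_ENABLE_BIT;
}

/* Narrow to 32 bits first, then invert: no bits are silently discarded. */
static uint32_t demo_narrow_mask(void)
{
	return ~(uint32_t)DEMO_ENABLE_BIT;
}
```

The low 32 bits are the same either way; the warning is about the implicit loss of the upper half, which an explicit 32-bit computation avoids.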
2020-09-28PCI: tegra: Convert to use DEFINE_SEQ_ATTRIBUTE macroLiu Shixin
Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code. Link: https://lore.kernel.org/r/20200916025025.3992783-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com>
2020-09-28PCI: hv: Document missing hv_pci_protocol_negotiation() parameterKrzysztof Wilczyński
Add missing documentation for the parameters "version" and "num_version" of the hv_pci_protocol_negotiation() function and resolve build time kernel-doc warnings: drivers/pci/controller/pci-hyperv.c:2535: warning: Function parameter or member 'version' not described in 'hv_pci_protocol_negotiation' drivers/pci/controller/pci-hyperv.c:2535: warning: Function parameter or member 'num_version' not described in 'hv_pci_protocol_negotiation' No change to functionality intended. Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20200925234753.1767227-1-kw@linux.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-09-21PCI/IOV: Mark VFs as not implementing PCI_COMMAND_MEMORYMatthew Rosato
For VFs, the Memory Space Enable bit in the Command Register is hard-wired to 0. Add a new bit to signify devices where the Command Register Memory Space Enable bit does not control the device's response to MMIO accesses. Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
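How a consumer might use such a bit can be sketched as below. The struct, field, and function names are hypothetical and chosen for illustration only; the one value taken from the PCI specification is the Memory Space Enable bit of the Command Register (0x2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PCI_COMMAND_MEMORY 0x2 /* Memory Space Enable bit in the Command Register */

/* Hypothetical, reduced device descriptor. */
struct sketch_dev {
	uint16_t command;                 /* cached Command Register value */
	unsigned int no_command_memory:1; /* Memory Space Enable does not gate MMIO (e.g. a VF) */
};

/*
 * Decide whether MMIO accesses to the device can be expected to work.
 * For devices flagged with no_command_memory the Command Register bit
 * reads as 0 regardless, so it must not be used as the criterion.
 */
static bool sketch_memory_enabled(const struct sketch_dev *dev)
{
	assert(dev != NULL);
	return dev->no_command_memory ||
	       (dev->command & SKETCH_PCI_COMMAND_MEMORY);
}
```

Without the extra bit, a check based on the Command Register alone would always treat a VF as having memory disabled.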
2020-09-21PCI: layerscape: Add EP mode support for ls1088a and ls2088aXiaowei Bao
Add PCIe EP mode support for ls1088a and ls2088a. There are some differences between the LS1 and LS2 platforms, so refactor the code of the EP driver. Link: https://lore.kernel.org/r/20200918080024.13639-10-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-09-21PCI: layerscape: Modify the MSIX to the doorbell modeXiaowei Bao
dw_pcie_ep_raise_msix_irq was never called in the existing driver before, because the ls1046a platform doesn't support the MSIX feature and msix_capable was always set to false. Now that the ls1088a platform with MSIX support is added, use the doorbell method to support the MSIX feature. Link: https://lore.kernel.org/r/20200918080024.13639-9-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com>
2020-09-21PCI: layerscape: Modify the way of getting capability with different PEXXiaowei Bao
Different PCIe controllers on one board may have different MSI or MSIX capabilities, so change the way of getting the MSI capability to make it more flexible. Link: https://lore.kernel.org/r/20200918080024.13639-8-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-09-21PCI: layerscape: Fix some format issue of the codeXiaowei Bao
Fix some formatting issues in the code of the EP driver. Link: https://lore.kernel.org/r/20200918080024.13639-7-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com>
2020-09-21PCI: designware-ep: Modify MSI and MSIX CAP way of findingXiaowei Bao
Each PF of an EP device should have its own MSI or MSIX capability struct, so create a dw_pcie_ep_func struct and move the msi_cap and msix_cap to this struct from dw_pcie_ep, and manage the PFs via a list. Link: https://lore.kernel.org/r/20200918080024.13639-5-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2020-09-21PCI: designware-ep: Move the function of getting MSI capability forwardXiaowei Bao
Move the function that gets the MSI capability ahead of the init function, because the init function of the EP platform driver uses the value returned by the function that gets the MSI capability. Link: https://lore.kernel.org/r/20200918080024.13639-4-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com>
2020-09-21PCI: designware-ep: Add the doorbell mode of MSI-X in EP modeXiaowei Bao
Add the doorbell mode of MSI-X in DWC EP driver. Link: https://lore.kernel.org/r/20200918080024.13639-3-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com>
2020-09-21PCI: designware-ep: Add multiple PFs support for DWCXiaowei Bao
Add multiple PFs support for DWC. Because different PFs have different config spaces, use the func_conf_select callback function to access each PF's config space. Chip vendors need to implement this callback function when they use the DWC IP core and intend to support the multiple PFs feature. Link: https://lore.kernel.org/r/20200918080024.13639-2-Zhiqiang.Hou@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
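The idea of an optional per-vendor callback that selects a PF's config space can be sketched as below. Only the callback name func_conf_select comes from the commit message; the struct layout, the wrapper sketch_func_offset(), the vendor implementation, and the 0x1000 spacing between functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_ep;

/* Hypothetical ops table; a vendor driver may leave the callback unset. */
struct sketch_ep_ops {
	/* Returns the config space offset for the given function number. */
	unsigned int (*func_conf_select)(struct sketch_ep *ep, uint8_t func_no);
};

struct sketch_ep {
	const struct sketch_ep_ops *ops;
};

/*
 * Common helper: ask the vendor callback where the PF's registers live,
 * and fall back to offset 0 for controllers with a single function.
 */
static unsigned int sketch_func_offset(struct sketch_ep *ep, uint8_t func_no)
{
	assert(ep != NULL);
	if (ep->ops && ep->ops->func_conf_select)
		return ep->ops->func_conf_select(ep, func_no);
	return 0;
}

/* Hypothetical vendor implementation: PFs spaced 0x1000 bytes apart. */
static unsigned int sketch_vendor_select(struct sketch_ep *ep, uint8_t func_no)
{
	(void)ep;
	return (unsigned int)func_no * 0x1000u;
}
```

Every register access for a given PF would then add the returned offset to the register address, so single-function controllers keep working unchanged.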
2020-09-17PCI: Simplify pci_dev_reset_slot_function()Lukas Wunner
pci_dev_reset_slot_function() refuses to reset a hotplug slot if it is shared by multiple pci_devs. That's the case if and only if the slot is occupied by a multifunction device. Simplify the function to check the device's multifunction flag instead of iterating over the devices on the bus. (Iterating over the devices requires holding pci_bus_sem, which the function erroneously does not acquire.) Link: https://lore.kernel.org/r/c6aab5af096f7b1b3db57f6335cebba8f0fcca89.1595330431.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Alex Williamson <alex.williamson@redhat.com>
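The simplified check can be sketched in user space as follows. The struct and function are hypothetical reductions for illustration; in particular in_hotplug_slot and the resets counter are invented fields, and returning -ENOTTY for an unsupported reset is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced device descriptor. */
struct sketch_pci_dev {
	unsigned int multifunction:1; /* device implements more than one function */
	bool in_hotplug_slot;         /* device sits in a hotplug-capable slot */
	int resets;                   /* counts simulated slot resets */
};

/*
 * Refuse the slot reset when the slot is shared. Because a slot is
 * shared exactly when it holds a multifunction device, testing the
 * flag replaces a walk over the bus's device list (and the locking
 * such a walk would need).
 */
static int sketch_reset_slot_function(struct sketch_pci_dev *dev, bool probe)
{
	assert(dev != NULL);

	if (dev->multifunction || !dev->in_hotplug_slot)
		return -ENOTTY;
	if (probe)
		return 0; /* caller only asked whether a reset is possible */

	dev->resets++;    /* stand-in for the actual slot reset */
	return 0;
}
```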
2020-09-17PCI: pciehp: Reduce noisiness on hot removalLukas Wunner
When a PCIe card is hot-removed, the Presence Detect State and Data Link Layer Link Active bits often do not clear simultaneously. I've seen delays of up to 244 msec between the two events with Thunderbolt.

After pciehp has brought down the slot in response to the first event, the other bit may still be set. It's not discernible whether it's set because a new card is already in the slot or if it will soon clear. So pciehp tries to bring up the slot and in the latter case fails with a bunch of messages, some of them at KERN_ERR severity. If the slot is no longer occupied, the messages are false positives and annoy users.

Stuart Hayes reports the following splat on hot removal:

  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
  KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
  KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
  KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status

Dongdong Liu complains about a similar splat:

  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
  KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
  KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status

Users are particularly irritated to see a bringup attempt even though the slot was explicitly brought down via sysfs. In a perfect world, we could avoid this by setting Link Disable on slot bringdown and re-enabling it upon a Presence Detect State change. In reality however, there are broken hotplug ports which hardwire Presence Detect to zero, see 80696f991424 ("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely, PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active Reporting wasn't specified before PCIe r1.1. On unplug, some ports first clear Presence then Link (see Stuart Hayes' splat) whereas others use the inverse order (see Dongdong Liu's splat). To top it off, there are hotplug ports which flap the Presence and Link bits on slot bringup, see 6c35a1ac3da6 ("PCI: pciehp: Tolerate initially unstable link").

pciehp is designed to work with all of these variants. Surplus attempts at slot bringup are a lesser evil than not being able to bring up slots at all. Although we could try to perfect the behavior for specific hotplug controllers, we'd risk breaking others or increasing code complexity.

But we can certainly minimize annoyance by emitting only a single message with KERN_INFO severity if bringup is unsuccessful:

* Drop the "Timeout waiting for Presence Detect" message in pcie_wait_for_presence(). The sole caller of that function, pciehp_check_link_status(), ignores the timeout and carries on. It emits error messages of its own and I don't think this particular message adds much value.

* There's a single error condition in pciehp_check_link_status() which does not emit a message. Adding one allows dropping the "Failed to check link status" message emitted by board_added() if pciehp_check_link_status() returns a non-zero integer.

* Tone down all messages in pciehp_check_link_status() to KERN_INFO severity and rephrase them to look as innocuous as possible. To this end, move the message emitted by pcie_wait_for_link_delay() to its callers.

As a result, Stuart Hayes' splat becomes:

  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001

Dongdong Liu's splat becomes:

  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link

The messages now merely serve as information that presence or link bits were set a little longer than expected. Bringup failures which are not false positives are still reported, albeit no longer at KERN_ERR severity.

Link: https://lore.kernel.org/linux-pci/20200310182100.102987-1-stuart.w.hayes@gmail.com/ Link: https://lore.kernel.org/linux-pci/1547649064-19019-1-git-send-email-liudongdong3@huawei.com/ Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com> Reported-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2020-09-17PCI: rpadlpar: Use for_each_child_of_node() and for_each_node_by_name()Qinglang Miao
Use for_each_child_of_node() and for_each_node_by_name() macros instead of open coding them. Link: https://lore.kernel.org/r/20200916062128.190819-1-miaoqinglang@huawei.com Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
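What replacing an open-coded loop with an iteration macro looks like can be sketched in user space as below. The node type, sketch_next_child(), and sketch_for_each_child() are hypothetical analogues written for this illustration; unlike the kernel's device-tree iterators, this sketch does no reference counting on the nodes it visits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node: first child plus next sibling. */
struct sketch_node {
	struct sketch_node *child;
	struct sketch_node *sibling;
};

/* Return the first child when prev is NULL, otherwise the child after prev. */
static struct sketch_node *sketch_next_child(const struct sketch_node *parent,
					     struct sketch_node *prev)
{
	assert(parent != NULL);
	return prev ? prev->sibling : parent->child;
}

/* Iteration macro that hides the init/test/advance boilerplate. */
#define sketch_for_each_child(parent, c)                   \
	for ((c) = sketch_next_child((parent), NULL);      \
	     (c) != NULL;                                  \
	     (c) = sketch_next_child((parent), (c)))

/* Example user: count the direct children of a node. */
static int sketch_count_children(const struct sketch_node *parent)
{
	struct sketch_node *c;
	int n = 0;

	sketch_for_each_child(parent, c)
		n++;
	return n;
}
```

The open-coded equivalent repeats the three clauses of the for statement at every call site, which is the duplication such macros remove.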