From c20eaf44113eac090b0d77fa2036143a836b9f56 Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Fri, 8 Mar 2024 14:59:29 -0700 Subject: cxl/region: Add sysfs attribute for locality attributes of CXL regions Add read/write latencies and bandwidth sysfs attributes for the enabled CXL region. The bandwidth is the aggregated bandwidth of all devices that contribute to the CXL region. The latency is the worst latency of the device amongst all the devices that contribute to the CXL region. Reviewed-by: Jonathan Cameron Tested-by: Jonathan Cameron Signed-off-by: Dave Jiang Link: https://lore.kernel.org/r/20240308220055.2172956-11-dave.jiang@intel.com Signed-off-by: Dan Williams --- Documentation/ABI/testing/sysfs-bus-cxl | 34 +++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index fff2581b8033..3f5627a1210a 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -552,3 +552,37 @@ Description: attribute is only visible for devices supporting the capability. The retrieved errors are logged as kernel events when cxl_poison event tracing is enabled. + + +What: /sys/bus/cxl/devices/regionZ/accessY/read_bandwidth + /sys/bus/cxl/devices/regionZ/accessY/write_banwidth +Date: Jan, 2024 +KernelVersion: v6.9 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) The aggregated read or write bandwidth of the region. The + number is the accumulated read or write bandwidth of all CXL memory + devices that contributes to the region in MB/s. It is + identical data that should appear in + /sys/devices/system/node/nodeX/accessY/initiators/read_bandwidth or + /sys/devices/system/node/nodeX/accessY/initiators/write_bandwidth. + See Documentation/ABI/stable/sysfs-devices-node. access0 provides + the number to the closest initiator and access1 provides the + number to the closest CPU. + + +What: /sys/bus/cxl/devices/regionZ/accessY/read_latency + /sys/bus/cxl/devices/regionZ/accessY/write_latency +Date: Jan, 2024 +KernelVersion: v6.9 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) The read or write latency of the region. The number is + the worst read or write latency of all CXL memory devices that + contributes to the region in nanoseconds. It is identical data + that should appear in + /sys/devices/system/node/nodeX/accessY/initiators/read_latency or + /sys/devices/system/node/nodeX/accessY/initiators/write_latency. + See Documentation/ABI/stable/sysfs-devices-node. access0 provides + the number to the closest initiator and access1 provides the + number to the closest CPU. -- cgit v1.2.3 From a0563f58300360ef2a00b8fcfea91711594d70be Mon Sep 17 00:00:00 2001 From: Ben Cheatham Date: Mon, 11 Mar 2024 09:25:08 -0500 Subject: EINJ, Documentation: Update EINJ kernel doc Update EINJ kernel document to include how to inject CXL protocol error types, build the kernel to include CXL error types, and give an example injection. Reviewed-by: Jonathan Cameron Signed-off-by: Ben Cheatham Link: https://lore.kernel.org/r/20240311142508.31717-5-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams --- Documentation/firmware-guide/acpi/apei/einj.rst | 34 +++++++++++++++++++++++++ 1 file changed, 34 insertions(+) (limited to 'Documentation') diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index d6b61d22f525..c52b9da08fa9 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -32,6 +32,10 @@ configuration:: CONFIG_ACPI_APEI CONFIG_ACPI_APEI_EINJ +...and to (optionally) enable CXL protocol error injection set:: + + CONFIG_ACPI_APEI_EINJ_CXL + The EINJ user interface is in /apei/einj. The following files belong to it: @@ -118,6 +122,24 @@ The following files belong to it: this actually works depends on what operations the BIOS actually includes in the trigger phase. +CXL error types are supported from ACPI 6.5 onwards (given a CXL port +is present). The EINJ user interface for CXL error types is at +/cxl. The following files belong to it: + +- einj_types: + + Provides the same functionality as available_error_types above, but + for CXL error types + +- $dport_dev/einj_inject: + + Injects a CXL error type into the CXL port represented by $dport_dev, + where $dport_dev is the name of the CXL port (usually a PCIe device name). + Error injections targeting a CXL 2.0+ port can use the legacy interface + under /apei/einj, while CXL 1.1/1.0 port injections + must use this file. + + BIOS versions based on the ACPI 4.0 specification have limited options in controlling where the errors are injected. Your BIOS may support an extension (enabled with the param_extension=1 module parameter, or boot @@ -181,6 +203,18 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +A CXL error injection example with $dport_dev=0000:e0:01.1:: + + # cd /sys/kernel/debug/cxl/ + # ls + 0000:e0:01.1 0000:0c:00.0 + # cat einj_types # See which errors can be injected + 0x00008000 CXL.mem Protocol Correctable + 0x00010000 CXL.mem Protocol Uncorrectable non-fatal + 0x00020000 CXL.mem Protocol Uncorrectable fatal + # cd 0000:e0:01.1 # Navigate to dport to inject into + # echo 0x8000 > einj_inject # Inject error + Special notes for injection into SGX enclaves: There may be a separate BIOS setup option to enable SGX injection. -- cgit v1.2.3 From 8039804cfa7314ad50085a779923aa5469889f88 Mon Sep 17 00:00:00 2001 From: Ben Cheatham Date: Mon, 11 Mar 2024 09:25:07 -0500 Subject: cxl/core: Add CXL EINJ debugfs files Export CXL helper functions in einj-cxl.c for getting/injecting available CXL protocol error types to sysfs under kernel/debug/cxl. The kernel/debug/cxl/einj_types file will print the available CXL protocol errors in the same format as the available_error_types file provided by the einj module. The kernel/debug/cxl/$dport_dev/einj_inject file is functionally the same as the error_type and error_inject files provided by the EINJ module, i.e.: writing an error type into $dport_dev/einj_inject will inject said error type into the CXL dport represented by $dport_dev. Reviewed-by: Jonathan Cameron Signed-off-by: Ben Cheatham Link: https://lore.kernel.org/r/20240311142508.31717-4-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams --- Documentation/ABI/testing/debugfs-cxl | 30 +++++++++++++++++++++++++ drivers/cxl/core/port.c | 41 +++++++++++++++++++++++++++++++++++ 2 files changed, 71 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl index fe61d372e3fa..4c0f62f881ca 100644 --- a/Documentation/ABI/testing/debugfs-cxl +++ b/Documentation/ABI/testing/debugfs-cxl @@ -33,3 +33,33 @@ Description: device cannot clear poison from the address, -ENXIO is returned. The clear_poison attribute is only visible for devices supporting the capability. + +What: /sys/kernel/debug/cxl/einj_types +Date: January, 2024 +KernelVersion: v6.9 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) Prints the CXL protocol error types made available by + the platform in the format "0x ". + The possible error types are (as of ACPI v6.5): + 0x1000 CXL.cache Protocol Correctable + 0x2000 CXL.cache Protocol Uncorrectable non-fatal + 0x4000 CXL.cache Protocol Uncorrectable fatal + 0x8000 CXL.mem Protocol Correctable + 0x10000 CXL.mem Protocol Uncorrectable non-fatal + 0x20000 CXL.mem Protocol Uncorrectable fatal + + The can be written to einj_inject to inject + into a chosen dport. + +What: /sys/kernel/debug/cxl/$dport_dev/einj_inject +Date: January, 2024 +KernelVersion: v6.9 +Contact: linux-cxl@vger.kernel.org +Description: + (WO) Writing an integer to this file injects the corresponding + CXL protocol error into $dport_dev ($dport_dev will be a device + name from /sys/bus/pci/devices). The integer to type mapping for + injection can be found by reading from einj_types. If the dport + was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise + a CXL 2.0 error is injected. diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index e59d9d37aa65..01e64b83a78c 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -3,6 +3,7 @@ #include #include #include +#include #include #include #include @@ -793,6 +794,40 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport, return rc; } +DEFINE_SHOW_ATTRIBUTE(einj_cxl_available_error_type); + +static int cxl_einj_inject(void *data, u64 type) +{ + struct cxl_dport *dport = data; + + if (dport->rch) + return einj_cxl_inject_rch_error(dport->rcrb.base, type); + + return einj_cxl_inject_error(to_pci_dev(dport->dport_dev), type); +} +DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, + "0x%llx\n"); + +static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport) +{ + struct dentry *dir; + + if (!einj_cxl_is_initialized()) + return; + + /* + * dport_dev needs to be a PCIe port for CXL 2.0+ ports because + * EINJ expects a dport SBDF to be specified for 2.0 error injection. + */ + if (!dport->rch && !dev_is_pci(dport->dport_dev)) + return; + + dir = cxl_debugfs_create_dir(dev_name(dport->dport_dev)); + + debugfs_create_file("einj_inject", 0200, dir, dport, + &cxl_einj_inject_fops); +} + static struct cxl_port *__devm_cxl_add_port(struct device *host, struct device *uport_dev, resource_size_t component_reg_phys, @@ -1149,6 +1184,8 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev, if (dev_is_pci(dport_dev)) dport->link_latency = cxl_pci_get_latency(to_pci_dev(dport_dev)); + cxl_debugfs_create_dport_dir(dport); + return dport; } @@ -2221,6 +2258,10 @@ static __init int cxl_core_init(void) cxl_debugfs = debugfs_create_dir("cxl", NULL); + if (einj_cxl_is_initialized()) + debugfs_create_file("einj_types", 0400, cxl_debugfs, NULL, + &einj_cxl_available_error_type_fops); + cxl_mbox_init(); rc = cxl_memdev_init(); -- cgit v1.2.3 From edc1243437e75ea019ba264d38b2cd793ae83ed0 Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Thu, 14 Mar 2024 17:34:26 -0700 Subject: Documentation/ABI/testing/debugfs-cxl: Fix "Unexpected indentation" Stephen reported that an htmldocs build hit: Documentation/ABI/testing/debugfs-cxl:38: ERROR: Unexpected indentation. It turns out that line was fine but the tool was unhappy about some line breaks in the table of values to error types. It turns out that: make V=1 SPHINXDIRS="admin-guide" htmldocs ...can not be used to get more info about what is behind a documentation build error. It was only pure luck that reflowing the text resulted in an error message that seemed a imply a problem later on with line breaks around the table. Fixes: 8039804cfa73 ("cxl/core: Add CXL EINJ debugfs files") Cc: Jonathan Corbet Reported-by: Stephen Rothwell Closes: http://lore.kernel.org/r/20240314141313.7ba04aff@canb.auug.org.au Cc: Ben Cheatham Signed-off-by: Dan Williams --- Documentation/ABI/testing/debugfs-cxl | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl index 4c0f62f881ca..c61f9b813973 100644 --- a/Documentation/ABI/testing/debugfs-cxl +++ b/Documentation/ABI/testing/debugfs-cxl @@ -40,8 +40,12 @@ KernelVersion: v6.9 Contact: linux-cxl@vger.kernel.org Description: (RO) Prints the CXL protocol error types made available by - the platform in the format "0x ". + the platform in the format: + + 0x + The possible error types are (as of ACPI v6.5): + 0x1000 CXL.cache Protocol Correctable 0x2000 CXL.cache Protocol Uncorrectable non-fatal 0x4000 CXL.cache Protocol Uncorrectable fatal -- cgit v1.2.3