summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide/reporting-issues.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide/reporting-issues.rst')
-rw-r--r--Documentation/admin-guide/reporting-issues.rst1117
1 files changed, 620 insertions, 497 deletions
diff --git a/Documentation/admin-guide/reporting-issues.rst b/Documentation/admin-guide/reporting-issues.rst
index 07879d01fe68..48b4d0ef2b09 100644
--- a/Documentation/admin-guide/reporting-issues.rst
+++ b/Documentation/admin-guide/reporting-issues.rst
@@ -9,25 +9,6 @@
(for example by the kernel's build system) might contain content taken from
files which use a more restrictive license.
-.. important::
-
- This document is being prepared to replace
- 'Documentation/admin-guide/reporting-bugs.rst'. The main work is done and
- you are already free to follow its instructions when reporting issues to the
- Linux kernel developers. But keep in mind, below text still needs a few
- finishing touches and review. It was merged to the Linux kernel sources at
- this stage to make this process easier and increase the text's visibility.
-
- Any improvements for the text or other feedback is thus very much welcome.
- Please send it to 'Thorsten Leemhuis <linux@leemhuis.info>' and 'Jonathan
- Corbet <corbet@lwn.net>', ideally with 'Linux kernel mailing list (LKML)
- <linux-kernel@vger.kernel.org>' and the 'Linux Kernel Documentation List
- <linux-doc@vger.kernel.org>' in CC.
-
- Areas in the text that still need work or discussion contain a hint like this
- which point out the remaining issues; all of them start with the word "FIXME"
- to make them easy to find.
-
Reporting issues
++++++++++++++++
@@ -36,46 +17,43 @@ Reporting issues
The short guide (aka TL;DR)
===========================
-If you're facing multiple issues with the Linux kernel at once, report each
-separately to its developers. Try your best guess which kernel part might be
-causing the issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its
-developers expect to be told about issues. Note, it's rarely
-`bugzilla.kernel.org <https://bugzilla.kernel.org/>`_, as in almost all cases
-the report needs to be sent by email!
-
-Check the destination thoroughly for existing reports; also search the LKML
-archives and the web. Join existing discussion if you find matches. If you
-don't find any, install `the latest Linux mainline kernel
-<https://kernel.org/>`_. Make sure it's vanilla, thus is not patched or using
-add-on kernel modules. Also ensure the kernel is running in a healthy
-environment and is not already tainted before the issue occurs.
-
-If you can reproduce your issue with the mainline kernel, send a report to the
-destination you determined earlier. Make sure it includes all relevant
-information, which in case of a regression should mention the change that's
-causing it which can often can be found with a bisection. Also ensure the
-report reaches all people that need to know about it, for example the security
-team, the stable maintainers or the developers of the patch that causes a
-regression. Once the report is out, answer any questions that might be raised
-and help where you can. That includes keeping the ball rolling: every time a
-new rc1 mainline kernel is released, check if the issue is still happening
-there and attach a status update to your initial report.
-
-If you can not reproduce the issue with the mainline kernel, consider sticking
-with it; if you'd like to use an older version line and want to see it fixed
-there, first make sure it's still supported. Install its latest release as
-vanilla kernel. If you cannot reproduce the issue there, try to find the commit
-that fixed it in mainline or any discussion preceding it: those will often
-mention if backporting is planed or considered too complex. If backporting was
-not discussed, ask if it's in the cards. In case you don't find any commits or
-a preceding discussion, see the Linux-stable mailing list archives for existing
-reports, as it might be a regression specific to the version line. If it is,
-report it like you would report a problem in mainline (including the
-bisection).
-
-If you reached this point without a solution, ask for advice one the
-subsystem's mailing list.
-
+Are you facing a regression with vanilla kernels from the same stable or
+longterm series? One still supported? Then search the `LKML
+<https://lore.kernel.org/lkml/>`_ and the `Linux stable mailing list
+<https://lore.kernel.org/stable/>`_ archives for matching reports to join. If
+you don't find any, install `the latest release from that series
+<https://kernel.org/>`_. If it still shows the issue, report it to the stable
+mailing list (stable@vger.kernel.org) and CC the regressions list
+(regressions@lists.linux.dev).
+
+In all other cases try your best guess which kernel part might be causing the
+issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its developers
+expect to be told about problems, which most of the time will be by email with a
+mailing list in CC. Check the destination's archives for matching reports;
+search the `LKML <https://lore.kernel.org/lkml/>`_ and the web, too. If you
+don't find any to join, install `the latest mainline kernel
+<https://kernel.org/>`_. If the issue is present there, send a report.
+
+The issue was fixed there, but you would like to see it resolved in a still
+supported stable or longterm series as well? Then install its latest release.
+If it shows the problem, search for the change that fixed it in mainline and
+check if backporting is in the works or was discarded; if it's neither, ask
+those who handled the change for it.
+
+**General remarks**: When installing and testing a kernel as outlined above,
+ensure it's vanilla (IOW: not patched and not using add-on modules). Also make
+sure it's built and running in a healthy environment and not already tainted
+before the issue occurs.
+
+If you are facing multiple issues with the Linux kernel at once, report each
+separately. While writing your report, include all information relevant to the
+issue, like the kernel and the distro used. In case of a regression, CC the
+regressions mailing list (regressions@lists.linux.dev) to your report; also try
+to include the commit-id of the change causing it, which a bisection can find.
+
+Once the report is out, answer any questions that come up and help where you
+can. That includes keeping the ball rolling by occasionally retesting with newer
+releases and sending a status update afterwards.
Step-by-step guide how to report issues to the kernel maintainers
=================================================================
@@ -94,28 +72,23 @@ early if an issue that looks like a Linux kernel problem is actually caused by
something else. These steps thus help to ensure the time you invest in this
process won't feel wasted in the end:
- * Stop reading this document and report the problem to your vendor instead,
- unless you are running the latest mainline kernel already or are willing to
- install it. This kernel must not be modified or enhanced in any way, and
- thus be considered 'vanilla'.
+ * Are you facing an issue with a Linux kernel a hardware or software vendor
+ provided? Then in almost all cases you are better off to stop reading this
+ document and reporting the issue to your vendor instead, unless you are
+ willing to install the latest Linux version yourself. Be aware the latter
+ will often be needed anyway to hunt down and fix issues.
+
+ * Perform a rough search for existing reports with your favorite internet
+ search engine; additionally, check the archives of the `Linux Kernel Mailing
+ List (LKML) <https://lore.kernel.org/lkml/>`_. If you find matching reports,
+ join the discussion instead of sending a new one.
* See if the issue you are dealing with qualifies as regression, security
issue, or a really severe problem: those are 'issues of high priority' that
need special handling in some steps that are about to follow.
- * Check if your kernel was 'tainted' when the issue occurred, as the event
- that made the kernel set this flag might be causing the issue you face.
-
- * Locate the driver or kernel subsystem that seems to be causing the issue.
- Find out how and where its developers expect reports. Note: most of the
- time this won't be bugzilla.kernel.org, as issues typically need to be sent
- by mail to a maintainer and a public mailing list.
-
- * Search the archives of the bug tracker or mailing list in question
- thoroughly for reports that might match your issue. Also check if you find
- something with your favorite internet search engine or in the Linux Kernel
- Mailing List (LKML) archives. If you find anything, join the discussion
- instead of sending a new report.
+ * Make sure it's not the kernel's surroundings that are causing the issue
+ you face.
* Create a fresh backup and put system repair and restore tools at hand.
@@ -123,8 +96,8 @@ process won't feel wasted in the end:
kernel modules on-the-fly, which solutions like DKMS might be doing locally
without your knowledge.
- * Make sure it's not the kernel's surroundings that are causing the issue
- you face.
+ * Check if your kernel was 'tainted' when the issue occurred, as the event
+ that made the kernel set this flag might be causing the issue you face.
* Write down coarsely how to reproduce the issue. If you deal with multiple
issues at once, create separate notes for each of them and make sure they
@@ -132,20 +105,35 @@ process won't feel wasted in the end:
needs to get reported to the kernel developers separately, unless they are
strongly entangled.
+ * If you are facing a regression within a stable or longterm version line
+ (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
+ 'Dealing with regressions within a stable and longterm kernel line'.
+
+ * Locate the driver or kernel subsystem that seems to be causing the issue.
+ Find out how and where its developers expect reports. Note: most of the
+ time this won't be bugzilla.kernel.org, as issues typically need to be sent
+ by mail to a maintainer and a public mailing list.
+
+ * Search the archives of the bug tracker or mailing list in question
+ thoroughly for reports that might match your issue. If you find anything,
+ join the discussion instead of sending a new report.
+
After these preparations you'll now enter the main part:
- * Install the latest Linux mainline kernel: that's where all issues get
- fixed first, because it's the version line the kernel developers mainly
- care about. Testing and reporting with the latest Linux stable kernel can
- be an acceptable alternative in some situations, for example during the
- merge window; but during that period you might want to suspend your efforts
- till its end anyway.
+ * Unless you are already running the latest 'mainline' Linux kernel, better
+ go and install it for the reporting process. Testing and reporting with
+ the latest 'stable' Linux can be an acceptable alternative in some
+ situations; during the merge window that actually might be even the best
+ approach, but in that development phase it can be an even better idea to
+ suspend your efforts for a few days anyway. Whatever version you choose,
+ ideally use a 'vanilla' build. Ignoring these advices will dramatically
+ increase the risk your report will be rejected or ignored.
* Ensure the kernel you just installed does not 'taint' itself when
running.
* Reproduce the issue with the kernel you just installed. If it doesn't show
- up there, head over to the instructions for issues only happening with
+ up there, scroll down to the instructions for issues only happening with
stable and longterm kernels.
* Optimize your notes: try to find and write the most straightforward way to
@@ -154,8 +142,8 @@ After these preparations you'll now enter the main part:
that hear about it for the first time. And if you learned something in this
process, consider searching again for existing reports about the issue.
- * If the failure includes a stack dump, like an Oops does, consider decoding
- it to find the offending line of code.
+ * If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
+ decoding the kernel log to find the line of code that triggered the error.
* If your problem is a regression, try to narrow down when the issue was
introduced as much as possible.
@@ -184,28 +172,54 @@ After these preparations you'll now enter the main part:
help yourself, if you don't get any help or if it's unsatisfying.
+Reporting regressions within a stable and longterm kernel line
+--------------------------------------------------------------
+
+This subsection is for you, if you followed above process and got sent here at
+the point about regression within a stable or longterm kernel version line. You
+face one of those if something breaks when updating from 5.10.4 to 5.10.5 (a
+switch from 5.9.15 to 5.10.5 does not qualify). The developers want to fix such
+regressions as quickly as possible, hence there is a streamlined process to
+report them:
+
+ * Check if the kernel developers still maintain the Linux kernel version
+ line you care about: go to the `front page of kernel.org
+ <https://kernel.org/>`_ and make sure it mentions
+ the latest release of the particular version line without an '[EOL]' tag.
+
+ * Check the archives of the `Linux stable mailing list
+ <https://lore.kernel.org/stable/>`_ for existing reports.
+
+ * Install the latest release from the particular version line as a vanilla
+ kernel. Ensure this kernel is not tainted and still shows the problem, as
+ the issue might have already been fixed there. If you first noticed the
+ problem with a vendor kernel, check a vanilla build of the last version
+ known to work performs fine as well.
+
+ * Send a short problem report to the Linux stable mailing list
+ (stable@vger.kernel.org) and CC the Linux regressions mailing list
+ (regressions@lists.linux.dev). Roughly describe the issue and ideally
+ explain how to reproduce it. Mention the first version that shows the
+ problem and the last version that's working fine. Then wait for further
+ instructions.
+
+The reference section below explains each of these steps in more detail.
+
+
Reporting issues only occurring in older kernel version lines
-------------------------------------------------------------
-This section is for you, if you tried the latest mainline kernel as outlined
+This subsection is for you, if you tried the latest mainline kernel as outlined
above, but failed to reproduce your issue there; at the same time you want to
-see the issue fixed in older version lines or a vendor kernel that's regularly
-rebased on new stable or longterm releases. If that case follow these steps:
+see the issue fixed in a still supported stable or longterm series or vendor
+kernels regularly rebased on those. If that the case, follow these steps:
* Prepare yourself for the possibility that going through the next few steps
might not get the issue solved in older releases: the fix might be too big
or risky to get backported there.
- * Check if the kernel developers still maintain the Linux kernel version
- line you care about: go to the front page of kernel.org and make sure it
- mentions the latest release of the particular version line without an
- '[EOL]' tag.
-
- * Check the archives of the Linux stable mailing list for existing reports.
-
- * Install the latest release from the particular version line as a vanilla
- kernel. Ensure this kernel is not tainted and still shows the problem, as
- the issue might have already been fixed there.
+ * Perform the first three steps in the section "Dealing with regressions
+ within a stable and longterm kernel line" above.
* Search the Linux kernel version control system for the change that fixed
the issue in mainline, as its commit message might tell you if the fix is
@@ -215,22 +229,13 @@ rebased on new stable or longterm releases. If that case follow these steps:
deemed unsuitable for backporting. If backporting was not considered at
all, join the newest discussion, asking if it's in the cards.
- * Check if you're dealing with a regression that was never present in
- mainline by installing the first release of the version line you care
- about. If the issue doesn't show up with it, you basically need to report
- the issue with this version like you would report a problem with mainline
- (see above). This ideally includes a bisection followed by a search for
- existing reports on the net; with the help of the subject and the two
- relevant commit-ids. If that doesn't turn up anything, write the report; CC
- or forward the report to the stable maintainers, the stable mailing list,
- and those who authored the change. Include the shortened commit-id if you
- found the change that causes it.
-
* One of the former steps should lead to a solution. If that doesn't work
out, ask the maintainers for the subsystem that seems to be causing the
issue for advice; CC the mailing list for the particular subsystem as well
as the stable mailing list.
+The reference section below explains each of these steps in more detail.
+
Reference section: Reporting issues to the kernel maintainers
=============================================================
@@ -276,54 +281,103 @@ issues to the Linux kernel developers.
Make sure you're using the upstream Linux kernel
------------------------------------------------
- *Stop reading this document and report the problem to your vendor instead,
- unless you are running the latest mainline kernel already or are willing to
- install it. This kernel must not be modified or enhanced in any way, and
- thus be considered 'vanilla'.*
+ *Are you facing an issue with a Linux kernel a hardware or software vendor
+ provided? Then in almost all cases you are better off to stop reading this
+ document and reporting the issue to your vendor instead, unless you are
+ willing to install the latest Linux version yourself. Be aware the latter
+ will often be needed anyway to hunt down and fix issues.*
Like most programmers, Linux kernel developers don't like to spend time dealing
-with reports for issues that don't even happen with the source code they
-maintain: it's just a waste everybody's time, yours included. That's why you
-later will have to test your issue with the latest 'vanilla' kernel: a kernel
-that was build using the Linux sources taken straight from `kernel.org
-<https://kernel.org/>`_ and not modified or enhanced in any way.
-
-Almost all kernels used in devices (Computers, Laptops, Smartphones, Routers,
-…) and most kernels shipped by Linux distributors are ancient from the point of
-kernel development and heavily modified. They thus do not qualify for reporting
-an issue to the Linux kernel developers: the issue you face with such a kernel
-might be fixed already or caused by the changes or additions, even if they look
-small or totally unrelated. That's why issues with such kernels need to be
-reported to the vendor that distributed it. Its developers should look into the
+with reports for issues that don't even happen with their current code. It's
+just a waste everybody's time, especially yours. Unfortunately such situations
+easily happen when it comes to the kernel and often leads to frustration on both
+sides. That's because almost all Linux-based kernels pre-installed on devices
+(Computers, Laptops, Smartphones, Routers, …) and most shipped by Linux
+distributors are quite distant from the official Linux kernel as distributed by
+kernel.org: these kernels from these vendors are often ancient from the point of
+Linux development or heavily modified, often both.
+
+Most of these vendor kernels are quite unsuitable for reporting issues to the
+Linux kernel developers: an issue you face with one of them might have been
+fixed by the Linux kernel developers months or years ago already; additionally,
+the modifications and enhancements by the vendor might be causing the issue you
+face, even if they look small or totally unrelated. That's why you should report
+issues with these kernels to the vendor. Its developers should look into the
report and, in case it turns out to be an upstream issue, fix it directly
-upstream or report it there. In practice that sometimes does not work out. If
-that the case, you might want to circumvent the vendor by installing the latest
-mainline kernel yourself and reporting the issue as outlined in this document;
-just make sure to use really fresh kernel (see below).
-
-
-.. note::
-
- FIXME: Should we accept reports for issues with kernel images that are pretty
- close to vanilla? But when are they close enough and how to put that line in
- words? Maybe something like this?
+upstream or forward the report there. In practice that often does not work out
+or might not what you want. You thus might want to consider circumventing the
+vendor by installing the very latest Linux kernel core yourself. If that's an
+option for you move ahead in this process, as a later step in this guide will
+explain how to do that once it rules out other potential causes for your issue.
+
+Note, the previous paragraph is starting with the word 'most', as sometimes
+developers in fact are willing to handle reports about issues occurring with
+vendor kernels. If they do in the end highly depends on the developers and the
+issue in question. Your chances are quite good if the distributor applied only
+small modifications to a kernel based on a recent Linux version; that for
+example often holds true for the mainline kernels shipped by Debian GNU/Linux
+Sid or Fedora Rawhide. Some developers will also accept reports about issues
+with kernels from distributions shipping the latest stable kernel, as long as
+its only slightly modified; that for example is often the case for Arch Linux,
+regular Fedora releases, and openSUSE Tumbleweed. But keep in mind, you better
+want to use a mainline Linux and avoid using a stable kernel for this
+process, as outlined in the section 'Install a fresh kernel for testing' in more
+detail.
+
+Obviously you are free to ignore all this advice and report problems with an old
+or heavily modified vendor kernel to the upstream Linux developers. But note,
+those often get rejected or ignored, so consider yourself warned. But it's still
+better than not reporting the issue at all: sometimes such reports directly or
+indirectly will help to get the issue fixed over time.
+
+
+Search for existing reports, first run
+--------------------------------------
+
+ *Perform a rough search for existing reports with your favorite internet
+ search engine; additionally, check the archives of the Linux Kernel Mailing
+ List (LKML). If you find matching reports, join the discussion instead of
+ sending a new one.*
+
+Reporting an issue that someone else already brought forward is often a waste of
+time for everyone involved, especially you as the reporter. So it's in your own
+interest to thoroughly check if somebody reported the issue already. At this
+step of the process it's okay to just perform a rough search: a later step will
+tell you to perform a more detailed search once you know where your issue needs
+to be reported to. Nevertheless, do not hurry with this step of the reporting
+process, it can save you time and trouble.
+
+Simply search the internet with your favorite search engine first. Afterwards,
+search the `Linux Kernel Mailing List (LKML) archives
+<https://lore.kernel.org/lkml/>`_.
- *Note: Some Linux kernel developers accept reports from vendor kernels that
- are known to be close to upstream. That for example is often the case for
- the kernels that Debian GNU/Linux Sid or Fedora Rawhide ship, which are
- normally following mainline closely and carry only a few patches. So a
- report with one of these might be accepted by the developers that need to
- handle it. But if they do, depends heavily on the individual developers and
- the issue at hand. That's why installing a mainline vanilla kernel is the
- safe bet.*
+If you get flooded with results consider telling your search engine to limit
+search timeframe to the past month or year. And wherever you search, make sure
+to use good search terms; vary them a few times, too. While doing so try to
+look at the issue from the perspective of someone else: that will help you to
+come up with other words to use as search terms. Also make sure not to use too
+many search terms at once. Remember to search with and without information like
+the name of the kernel driver or the name of the affected hardware component.
+But its exact brand name (say 'ASUS Red Devil Radeon RX 5700 XT Gaming OC')
+often is not much helpful, as it is too specific. Instead try search terms like
+the model line (Radeon 5700 or Radeon 5000) and the code name of the main chip
+('Navi' or 'Navi10') with and without its manufacturer ('AMD').
- *Arch Linux, other Fedora releases, and openSUSE Tumbleweed often use quite
- recent stable kernels that are pretty close to upstream, too. Some
- developers accept bugs from them as well. But note that you normally should
- avoid stable kernels for reporting issues and use a mainline kernel instead
- (see below).*
+In case you find an existing report about your issue, join the discussion, as
+you might be able to provide valuable additional information. That can be
+important even when a fix is prepared or in its final stages already, as
+developers might look for people that can provide additional information or
+test a proposed fix. Jump to the section 'Duties after the report went out' for
+details on how to get properly involved.
- Are there any other major Linux distributions that should be mentioned here?
+Note, searching `bugzilla.kernel.org <https://bugzilla.kernel.org/>`_ might also
+be a good idea, as that might provide valuable insights or turn up matching
+reports. If you find the latter, just keep in mind: most subsystems expect
+reports in different places, as described below in the section "Check where you
+need to report your issue". The developers that should take care of the issue
+thus might not even be aware of the bugzilla ticket. Hence, check the ticket if
+the issue already got reported as outlined in this document and if not consider
+doing so.
Issue of high priority?
@@ -365,6 +419,75 @@ fatal error where the kernel stop itself) with a 'Oops' (a recoverable error),
as the kernel remains running after the latter.
+Ensure a healthy environment
+----------------------------
+
+ *Make sure it's not the kernel's surroundings that are causing the issue
+ you face.*
+
+Problems that look a lot like a kernel issue are sometimes caused by build or
+runtime environment. It's hard to rule out that problem completely, but you
+should minimize it:
+
+ * Use proven tools when building your kernel, as bugs in the compiler or the
+ binutils can cause the resulting kernel to misbehave.
+
+ * Ensure your computer components run within their design specifications;
+ that's especially important for the main processor, the main memory, and the
+ motherboard. Therefore, stop undervolting or overclocking when facing a
+ potential kernel issue.
+
+ * Try to make sure it's not faulty hardware that is causing your issue. Bad
+ main memory for example can result in a multitude of issues that will
+ manifest itself in problems looking like kernel issues.
+
+ * If you're dealing with a filesystem issue, you might want to check the file
+ system in question with ``fsck``, as it might be damaged in a way that leads
+ to unexpected kernel behavior.
+
+ * When dealing with a regression, make sure it's not something else that
+ changed in parallel to updating the kernel. The problem for example might be
+ caused by other software that was updated at the same time. It can also
+ happen that a hardware component coincidentally just broke when you rebooted
+ into a new kernel for the first time. Updating the systems BIOS or changing
+ something in the BIOS Setup can also lead to problems that on look a lot
+ like a kernel regression.
+
+
+Prepare for emergencies
+-----------------------
+
+ *Create a fresh backup and put system repair and restore tools at hand.*
+
+Reminder, you are dealing with computers, which sometimes do unexpected things,
+especially if you fiddle with crucial parts like the kernel of its operating
+system. That's what you are about to do in this process. Thus, make sure to
+create a fresh backup; also ensure you have all tools at hand to repair or
+reinstall the operating system as well as everything you need to restore the
+backup.
+
+
+Make sure your kernel doesn't get enhanced
+------------------------------------------
+
+ *Ensure your system does not enhance its kernels by building additional
+ kernel modules on-the-fly, which solutions like DKMS might be doing locally
+ without your knowledge.*
+
+The risk your issue report gets ignored or rejected dramatically increases if
+your kernel gets enhanced in any way. That's why you should remove or disable
+mechanisms like akmods and DKMS: those build add-on kernel modules
+automatically, for example when you install a new Linux kernel or boot it for
+the first time. Also remove any modules they might have installed. Then reboot
+before proceeding.
+
+Note, you might not be aware that your system is using one of these solutions:
+they often get set up silently when you install Nvidia's proprietary graphics
+driver, VirtualBox, or other software that requires a some support from a
+module not part of the Linux kernel. That why your might need to uninstall the
+packages with such software to get rid of any 3rd party kernel module.
+
+
Check 'taint' flag
------------------
@@ -433,9 +556,52 @@ three things:
the name of the module in question).
-Locate kernel area that causes the issue
+Document how to reproduce issue
+-------------------------------
+
+ *Write down coarsely how to reproduce the issue. If you deal with multiple
+ issues at once, create separate notes for each of them and make sure they
+ work independently on a freshly booted system. That's needed, as each issue
+ needs to get reported to the kernel developers separately, unless they are
+ strongly entangled.*
+
+If you deal with multiple issues at once, you'll have to report each of them
+separately, as they might be handled by different developers. Describing
+various issues in one report also makes it quite difficult for others to tear
+it apart. Hence, only combine issues in one report if they are very strongly
+entangled.
+
+Additionally, during the reporting process you will have to test if the issue
+happens with other kernel versions. Therefore, it will make your work easier if
+you know exactly how to reproduce an issue quickly on a freshly booted system.
+
+Note: it's often fruitless to report issues that only happened once, as they
+might be caused by a bit flip due to cosmic radiation. That's why you should
+try to rule that out by reproducing the issue before going further. Feel free
+to ignore this advice if you are experienced enough to tell a one-time error
+due to faulty hardware apart from a kernel issue that rarely happens and thus
+is hard to reproduce.
+
+
+Regression in stable or longterm kernel?
----------------------------------------
+ *If you are facing a regression within a stable or longterm version line
+ (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
+ 'Dealing with regressions within a stable and longterm kernel line'.*
+
+Regression within a stable and longterm kernel version line are something the
+Linux developers want to fix badly, as such issues are even more unwanted than
+regression in the main development branch, as they can quickly affect a lot of
+people. The developers thus want to learn about such issues as quickly as
+possible, hence there is a streamlined process to report them. Note,
+regressions with newer kernel version line (say something broke when switching
+from 5.9.15 to 5.10.5) do not qualify.
+
+
+Check where you need to report your issue
+-----------------------------------------
+
*Locate the driver or kernel subsystem that seems to be causing the issue.
Find out how and where its developers expect reports. Note: most of the
time this won't be bugzilla.kernel.org, as issues typically need to be sent
@@ -526,26 +692,6 @@ example above does not have such a line. That is the case for most sections, as
Linux kernel development is completely driven by mail. Very few subsystems use
a bug tracker, and only some of those rely on bugzilla.kernel.org.
-
-.. note::
-
- FIXME: The old text took a totally different approach to bugzilla.kernel.org,
- as it mentions it as the place to file issue for people that don't known how
- to contact the appropriate people. The new one mentions it rarely; and when
- it does like here, it warns users that it's often the wrong place to go.
-
- This approach was chosen as the main author of this document noticed quite a
- few users (or even a lot?) get no reply to the bugs they file in bugzilla.
- That's kind of expected, as quite a few (many? most?) of the maintainers
- don't even get notified when reports for their subsystem get filed there. And
- not getting a single reply to report is something that is just annoying for
- users and might make them angry. Improving bugzilla.k.o would be an option,
- but on the kernel and maintainers summit 2017 it was agreed on to first go
- this route (sorry it took so long): it's easier to achieve and less
- controversial, as putting additional burden on already overworked maintainers
- is unlikely to get well received.
-
-
In this and many other cases you thus have to look for lines starting with
'Mail:' instead. Those mention the name and the email addresses for the
maintainers of the particular code. Also look for a line starting with 'Mailing
@@ -558,21 +704,6 @@ and might leave some work for other developers on the subsystem specific list;
and LKML is important to have one place where all issue reports can be found.
-.. note::
-
- FIXME: Above section tells users to always CC LKML. These days it's a kind of
- "catch-all" list anyway, which nearly nobody seems to follow closely. So it
- seems appropriate to go "all in" and make people send their reports here,
- too, as everything (reports, fixes, ...) then can be found in one place (at
- least for all reports sent by mail and all subsystems that CC LKML).
-
- Related: Should we create mailing list like 'linux-issues@vger.kernel.org'
- and tell users above to always CC it when reporting issues? Then there would
- be one central place reporters could search for existing reports (at least
- for issues reported by mail) without getting regular LKML traffic mixed into
- the results.
-
-
Finding the maintainers with the help of a script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -611,205 +742,87 @@ modified during tree-wide cleanups by developers that do not care about the
particular driver at all.
-Search for existing reports
----------------------------
+Search for existing reports, second run
+---------------------------------------
*Search the archives of the bug tracker or mailing list in question
- thoroughly for reports that might match your issue. Also check if you find
- something with your favorite internet search engine or in the Linux Kernel
- Mailing List (LKML) archives. If you find anything, join the discussion
- instead of sending a new report.*
-
-Reporting an issue that someone else already brought forward is often a waste
-of time for everyone involved, especially you as the reporter. So it's in your
-own interest to thoroughly check if somebody reported the issue already. Thus
-do not hurry with this step of the reporting process. Spending 30 to 60 minutes
-or even more time can save you and others quite a lot of time and trouble.
-
-The best place to search is the bug tracker or the mailing list where your
-report needs to be filed. You'll find quite a few of those lists on
-`lore.kernel.org <https://lore.kernel.org/>`_, but some are hosted in
-different places. That for example is the case for the ath10k WiFi driver used
-as example in the previous step. But you'll often find the archives for these
-lists easily on the net. Searching for 'archive ath10k@lists.infradead.org' for
-example will quickly lead you to the `Info page for the ath10k mailing list
-<https://lists.infradead.org/mailman/listinfo/ath10k>`_, which at the top links
-to its `list archives <https://lists.infradead.org/pipermail/ath10k/>`_.
-
-Sadly this and quite a few other lists miss a way to search the archives. In
-those cases use a regular internet search engine and add something like
+ thoroughly for reports that might match your issue. If you find anything,
+ join the discussion instead of sending a new report.*
+
+As mentioned earlier already: reporting an issue that someone else already
+brought forward is often a waste of time for everyone involved, especially you
+as the reporter. That's why you should search for existing report again, now
+that you know where they need to be reported to. If it's mailing list, you will
+often find its archives on `lore.kernel.org <https://lore.kernel.org/>`_.
+
+But some list are hosted in different places. That for example is the case for
+the ath10k WiFi driver used as example in the previous step. But you'll often
+find the archives for these lists easily on the net. Searching for 'archive
+ath10k@lists.infradead.org' for example will lead you to the `Info page for the
+ath10k mailing list <https://lists.infradead.org/mailman/listinfo/ath10k>`_,
+which at the top links to its
+`list archives <https://lists.infradead.org/pipermail/ath10k/>`_. Sadly this and
+quite a few other lists miss a way to search the archives. In those cases use a
+regular internet search engine and add something like
'site:lists.infradead.org/pipermail/ath10k/' to your search terms, which limits
the results to the archives at that URL.
-Additionally, search the internet and the `Linux Kernel Mailing List (LKML)
-archives <https://lore.kernel.org/lkml/>`_, as maybe the real culprit might be
-in some other subsystem. Searching in `bugzilla.kernel.org
-<https://bugzilla.kernel.org/>`_ might also be a good idea, but if you find
-anything there keep in mind: most subsystems expect reports in different
-places, hence those you find there might have not even reached the people
-responsible for the subsystem in question. Nevertheless, the data there might
-provide valuable insights.
-
-If you get flooded with results consider telling your search engine to limit
-search timeframe to the past month or year. And wherever you search, make sure
-to use good search terms; vary them a few times, too. While doing so try to
-look at the issue from the perspective of someone else: that will help you to
-come up with other words to use as search terms. Also make sure not to use too
-many search terms at once. Remember to search with and without information like
-the name of the kernel driver or the name of the affected hardware component.
-But its exact brand name (say 'ASUS Red Devil Radeon RX 5700 XT Gaming OC')
-often is not much helpful, as it is too specific. Instead try search terms like
-the model line (Radeon 5700 or Radeon 5000) and the code name of the main chip
-('Navi' or 'Navi10') with and without its manufacturer ('AMD').
-
-In case you find an existing report about your issue, join the discussion, as
-you might be able to provide valuable additional information. That can be
-important even when a fix is prepared or in its final stages already, as
-developers might look for people that can provide additional information or
-test a proposed fix. Jump to the section 'Duties after the report went out' for
-details on how to get properly involved.
-
-
-Prepare for emergencies
------------------------
-
- *Create a fresh backup and put system repair and restore tools at hand.*
-
-Reminder, you are dealing with computers, which sometimes do unexpected things,
-especially if you fiddle with crucial parts like the kernel of its operating
-system. That's what you are about to do in this process. Thus, make sure to
-create a fresh backup; also ensure you have all tools at hand to repair or
-reinstall the operating system as well as everything you need to restore the
-backup.
-
-
-Make sure your kernel doesn't get enhanced
-------------------------------------------
-
- *Ensure your system does not enhance its kernels by building additional
- kernel modules on-the-fly, which solutions like DKMS might be doing locally
- without your knowledge.*
-
-Your kernel must be 'vanilla' when reporting an issue, but stops being pure as
-soon as it loads a kernel module not built from the sources used to compile the
-kernel image itself. That's why you need to ensure your Linux kernel stays
-vanilla by removing or disabling mechanisms like akmods and DKMS: those might
-build additional kernel modules automatically, for example when your boot into
-a newly installed Linux kernel the first time. Reboot after removing them and
-any modules they installed.
-
-Note, you might not be aware that your system is using one of these solutions:
-they often get set up silently when you install Nvidia's proprietary graphics
-driver, VirtualBox, or other software that requires a some support from a
-module not part of the Linux kernel. That why your might need to uninstall the
-packages with such software to get rid of any 3rd party kernel module.
-
-
-Ensure a healthy environment
-----------------------------
-
- *Make sure it's not the kernel's surroundings that are causing the issue
- you face.*
-
-Problems that look a lot like a kernel issue are sometimes caused by build or
-runtime environment. It's hard to rule out that problem completely, but you
-should minimize it:
-
- * Use proven tools when building your kernel, as bugs in the compiler or the
- binutils can cause the resulting kernel to misbehave.
-
- * Ensure your computer components run within their design specifications;
- that's especially important for the main processor, the main memory, and the
- motherboard. Therefore, stop undervolting or overclocking when facing a
- potential kernel issue.
-
- * Try to make sure it's not faulty hardware that is causing your issue. Bad
- main memory for example can result in a multitude of issues that will
- manifest itself in problems looking like kernel issues.
-
- * If you're dealing with a filesystem issue, you might want to check the file
- system in question with ``fsck``, as it might be damaged in a way that leads
- to unexpected kernel behavior.
-
- * When dealing with a regression, make sure it's not something else that
- changed in parallel to updating the kernel. The problem for example might be
- caused by other software that was updated at the same time. It can also
- happen that a hardware component coincidentally just broke when you rebooted
- into a new kernel for the first time. Updating the systems BIOS or changing
- something in the BIOS Setup can also lead to problems that on look a lot
- like a kernel regression.
-
+It's also wise to check the internet, LKML and maybe bugzilla.kernel.org again
+at this point.
-Document how to reproduce issue
--------------------------------
+For details how to search and what to do if you find matching reports see
+"Search for existing reports, first run" above.
- *Write down coarsely how to reproduce the issue. If you deal with multiple
- issues at once, create separate notes for each of them and make sure they
- work independently on a freshly booted system. That's needed, as each issue
- needs to get reported to the kernel developers separately, unless they are
- strongly entangled.*
-
-If you deal with multiple issues at once, you'll have to report each of them
-separately, as they might be handled by different developers. Describing
-various issues in one report also makes it quite difficult for others to tear
-it apart. Hence, only combine issues in one report if they are very strongly
-entangled.
-
-Additionally, during the reporting process you will have to test if the issue
-happens with other kernel versions. Therefore, it will make your work easier if
-you know exactly how to reproduce an issue quickly on a freshly booted system.
-
-Note: it's often fruitless to report issues that only happened once, as they
-might be caused by a bit flip due to cosmic radiation. That's why you should
-try to rule that out by reproducing the issue before going further. Feel free
-to ignore this advice if you are experienced enough to tell a one-time error
-due to faulty hardware apart from a kernel issue that rarely happens and thus
-is hard to reproduce.
+Do not hurry with this step of the reporting process: spending 30 to 60 minutes
+or even more time can save you and others quite a lot of time and trouble.
Install a fresh kernel for testing
----------------------------------
- *Install the latest Linux mainline kernel: that's where all issues get
- fixed first, because it's the version line the kernel developers mainly
- care about. Testing and reporting with the latest Linux stable kernel can
- be an acceptable alternative in some situations, for example during the
- merge window; but during that period you might want to suspend your efforts
- till its end anyway.*
-
-Reporting an issue to the Linux kernel developers they fixed weeks or months
-ago is annoying for them and wasting their and your time. That's why it's in
-everybody's interest to check if the issue occurs with the latest codebase
-before reporting it.
-
-In the scope of the Linux kernel the term 'latest' means: a kernel version
-recently created from the main line of development, as this 'mainline' tree is
-where developers first apply fixes; only after that are they are allowed to get
-backported to older, still supported version lines called 'stable' and
-'longterm' kernels. That's why you should check a recent mainline kernel, even
-if you deal with an issue you only want to see fixed in an older version line.
-Another reason: some fixes are only applied to mainline or recent version
-lines, as it's too hard or risky to backport them to older versions. If that
-the case, reporting the issue again is unlikely to change anything.
-
-Longterm kernels (sometimes called "LTS kernels") are therefore unsuitable for
-testing; they simply are too distant from current development. Even the latest
-Linux 'stable' kernel is a significant bit behind and thus better avoided. At
-least most of the time, as sometimes a stable kernel can the best choice; but
-in those situations you might want to wait a few days anyway:
-
-Choosing between mainline, stable and waiting
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Head over to `kernel.org <https://kernel.org/>`_ to decide which version to
-use. Ignore the big yellow button that says 'Latest release' and look a little
-lower for a table. At its top you'll see a line starting with 'mainline', which
-most of the time will point to a pre-release with a version number like
-'5.8-rc2'. If that's the case, you'll want to use this mainline kernel for
-testing. Do not let that 'rc' scare you, these 'development kernels' are pretty
-reliable — and you made a backup, as you were instructed above, didn't you?
-
-In about two out of every nine to ten weeks, 'mainline' might point you to a
+ *Unless you are already running the latest 'mainline' Linux kernel, better
+ go and install it for the reporting process. Testing and reporting with
+ the latest 'stable' Linux can be an acceptable alternative in some
+ situations; during the merge window that actually might be even the best
+ approach, but in that development phase it can be an even better idea to
+ suspend your efforts for a few days anyway. Whatever version you choose,
+ ideally use a 'vanilla' built. Ignoring these advices will dramatically
+ increase the risk your report will be rejected or ignored.*
+
+As mentioned in the detailed explanation for the first step already: Like most
+programmers, Linux kernel developers don't like to spend time dealing with
+reports for issues that don't even happen with the current code. It's just a
+waste everybody's time, especially yours. That's why it's in everybody's
+interest that you confirm the issue still exists with the latest upstream code
+before reporting it. You are free to ignore this advice, but as outlined
+earlier: doing so dramatically increases the risk that your issue report might
+get rejected or simply ignored.
+
+In the scope of the kernel "latest upstream" normally means:
+
+ * Install a mainline kernel; the latest stable kernel can be an option, but
+ most of the time is better avoided. Longterm kernels (sometimes called 'LTS
+ kernels') are unsuitable at this point of the process. The next subsection
+ explains all of this in more detail.
+
+ * The over next subsection describes way to obtain and install such a kernel.
+ It also outlines that using a pre-compiled kernel are fine, but better are
+ vanilla, which means: it was built using Linux sources taken straight `from
+ kernel.org <https://kernel.org/>`_ and not modified or enhanced in any way.
+
+Choosing the right version for testing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Head over to `kernel.org <https://kernel.org/>`_ to find out which version you
+want to use for testing. Ignore the big yellow button that says 'Latest release'
+and look a little lower at the table. At its top you'll see a line starting with
+mainline, which most of the time will point to a pre-release with a version
+number like '5.8-rc2'. If that's the case, you'll want to use this mainline
+kernel for testing, as that where all fixes have to be applied first. Do not let
+that 'rc' scare you, these 'development kernels' are pretty reliable — and you
+made a backup, as you were instructed above, didn't you?
+
+In about two out of every nine to ten weeks, mainline might point you to a
proper release with a version number like '5.7'. If that happens, consider
suspending the reporting process until the first pre-release of the next
version (5.8-rc1) shows up on kernel.org. That's because the Linux development
@@ -830,45 +843,79 @@ case mainline for some reason does currently not work for you. An in general:
using it for reproducing the issue is also better than not reporting it issue
at all.
+Better avoid using the latest stable kernel outside merge windows, as all fixes
+must be applied to mainline first. That's why checking the latest mainline
+kernel is so important: any issue you want to see fixed in older version lines
+needs to be fixed in mainline first before it can get backported, which can
+take a few days or weeks. Another reason: the fix you hope for might be too
+hard or risky for backporting; reporting the issue again hence is unlikely to
+change anything.
+
+These aspects are also why longterm kernels (sometimes called "LTS kernels")
+are unsuitable for this part of the reporting process: they are to distant from
+the current code. Hence go and test mainline first and follow the process
+further: if the issue doesn't occur with mainline it will guide you how to get
+it fixed in older version lines, if that's in the cards for the fix in question.
+
How to obtain a fresh Linux kernel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-You can use pre-built or self-compiled kernel for testing; if you choose the
-latter approach, you can either obtain the source code using git or download it
-as tar archive.
-
-Using a pre-compiled kernel for testing is often the quickest, easiest, and
-safest way – especially is you are unfamiliar with the Linux kernel. But it
-needs to be a vanilla kernel, which can be hard to come buy. You are in luck if
-you are using a popular Linux distribution: for quite a few of them you'll find
-repositories on the net that contain packages with the latest mainline or
-stable kernels in vanilla fashion. It's totally okay to use these, just make
-sure from the repository's documentation they are really vanilla. And ensure
-the packages contain the latest versions as offered on kernel.org; they are
-likely unsuitable if the package is older than a week, as new mainline and
-stable kernels typically get released at least once a week. And be aware that
-you might need to get build your own kernel later anyway when it comes to
-helping test fixes, as described later in this document.
-
-Developers and experienced Linux users familiar with git are often best served
-by obtaining the latest Linux kernel sources straight from the `official
-development repository on kernel.org
+**Using a pre-compiled kernel**: This is often the quickest, easiest, and safest
+way for testing — especially is you are unfamiliar with the Linux kernel. The
+problem: most of those shipped by distributors or add-on repositories are build
+from modified Linux sources. They are thus not vanilla and therefore often
+unsuitable for testing and issue reporting: the changes might cause the issue
+you face or influence it somehow.
+
+But you are in luck if you are using a popular Linux distribution: for quite a
+few of them you'll find repositories on the net that contain packages with the
+latest mainline or stable Linux built as vanilla kernel. It's totally okay to
+use these, just make sure from the repository's description they are vanilla or
+at least close to it. Additionally ensure the packages contain the latest
+versions as offered on kernel.org. The packages are likely unsuitable if they
+are older than a week, as new mainline and stable kernels typically get released
+at least once a week.
+
+Please note that you might need to build your own kernel manually later: that's
+sometimes needed for debugging or testing fixes, as described later in this
+document. Also be aware that pre-compiled kernels might lack debug symbols that
+are needed to decode messages the kernel prints when a panic, Oops, warning, or
+BUG occurs; if you plan to decode those, you might be better off compiling a
+kernel yourself (see the end of this subsection and the section titled 'Decode
+failure messages' for details).
+
+**Using git**: Developers and experienced Linux users familiar with git are
+often best served by obtaining the latest Linux kernel sources straight from the
+`official development repository on kernel.org
<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_.
Those are likely a bit ahead of the latest mainline pre-release. Don't worry
about it: they are as reliable as a proper pre-release, unless the kernel's
development cycle is currently in the middle of a merge window. But even then
they are quite reliable.
-People unfamiliar with git are often best served by downloading the sources as
-tarball from `kernel.org <https://kernel.org/>`_.
+**Conventional**: People unfamiliar with git are often best served by
+downloading the sources as tarball from `kernel.org <https://kernel.org/>`_.
-How to actually build a kernel isnot described here, as many websites explain
+How to actually build a kernel is not described here, as many websites explain
the necessary steps already. If you are new to it, consider following one of
those how-to's that suggest to use ``make localmodconfig``, as that tries to
pick up the configuration of your current kernel and then tries to adjust it
somewhat for your system. That does not make the resulting kernel any better,
but quicker to compile.
+Note: If you are dealing with a panic, Oops, warning, or BUG from the kernel,
+please try to enable CONFIG_KALLSYMS when configuring your kernel.
+Additionally, enable CONFIG_DEBUG_KERNEL and CONFIG_DEBUG_INFO, too; the
+latter is the relevant one of those two, but can only be reached if you enable
+the former. Be aware CONFIG_DEBUG_INFO increases the storage space required to
+build a kernel by quite a bit. But that's worth it, as these options will allow
+you later to pinpoint the exact line of code that triggers your issue. The
+section 'Decode failure messages' below explains this in more detail.
+
+But keep in mind: Always keep a record of the issue encountered in case it is
+hard to reproduce. Sending an undecoded report is better than not reporting
+the issue at all.
+
Check 'taint' flag
------------------
@@ -888,7 +935,7 @@ Reproduce issue with the fresh kernel
-------------------------------------
*Reproduce the issue with the kernel you just installed. If it doesn't show
- up there, head over to the instructions for issues only happening with
+ up there, scroll down to the instructions for issues only happening with
stable and longterm kernels.*
Check if the issue occurs with the fresh Linux kernel version you just
@@ -923,31 +970,55 @@ instead you can join.
Decode failure messages
-----------------------
-.. note::
+ *If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
+ decoding the kernel log to find the line of code that triggered the error.*
- FIXME: The text in this section is a placeholder for now and quite similar to
- the old text found in 'Documentation/admin-guide/reporting-bugs.rst'
- currently. It and the document it references are known to be outdated and
- thus need to be revisited. Thus consider this note a request for help: if you
- are familiar with this topic, please write a few lines that would fit here.
- Alternatively, simply outline the current situation roughly to the main
- authors of this document (see intro), as they might be able to write
- something then.
+When the kernel detects an internal problem, it will log some information about
+the executed code. This makes it possible to pinpoint the exact line in the
+source code that triggered the issue and shows how it was called. But that only
+works if you enabled CONFIG_DEBUG_INFO and CONFIG_KALLSYMS when configuring
+your kernel. If you did so, consider to decode the information from the
+kernel's log. That will make it a lot easier to understand what lead to the
+'panic', 'Oops', 'warning', or 'BUG', which increases the chances that someone
+can provide a fix.
- This section in the end should answer questions like "when is this actually
- needed", "what .config options to ideally set earlier to make this step easy
- or unnecessary?" (likely CONFIG_UNWINDER_ORC when it's available, otherwise
- CONFIG_UNWINDER_FRAME_POINTER; but is there anything else needed?).
+Decoding can be done with a script you find in the Linux source tree. If you
+are running a kernel you compiled yourself earlier, call it like this::
-..
+ [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux
+
+If you are running a packaged vanilla kernel, you will likely have to install
+the corresponding packages with debug symbols. Then call the script (which you
+might need to get from the Linux sources if your distro does not package it)
+like this::
+
+ [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh \
+ /usr/lib/debug/lib/modules/5.10.10-4.1.x86_64/vmlinux /usr/src/kernels/5.10.10-4.1.x86_64/
+
+The script will work on log lines like the following, which show the address of
+the code the kernel was executing when the error occurred::
+
+ [ 68.387301] RIP: 0010:test_module_init+0x5/0xffa [test_module]
+
+Once decoded, these lines will look like this::
- *If the failure includes a stack dump, like an Oops does, consider decoding
- it to find the offending line of code.*
+ [ 68.387301] RIP: 0010:test_module_init (/home/username/linux-5.10.5/test-module/test-module.c:16) test_module
-When the kernel detects an error, it will print a stack dump that allows to
-identify the exact line of code where the issue happens. But that information
-sometimes needs to get decoded to be readable, which is explained in
-admin-guide/bug-hunting.rst.
+In this case the executed code was built from the file
+'~/linux-5.10.5/test-module/test-module.c' and the error occurred by the
+instructions found in line '16'.
+
+The script will similarly decode the addresses mentioned in the section
+starting with 'Call trace', which show the path to the function where the
+problem occurred. Additionally, the script will show the assembler output for
+the code section the kernel was executing.
+
+Note, if you can't get this to work, simply skip this step and mention the
+reason for it in the report. If you're lucky, it might not be needed. And if it
+is, someone might help you to get things going. Also be aware this is just one
+of several ways to decode kernel stack traces. Sometimes different steps will
+be required to retrieve the relevant details. Don't worry about that, if that's
+needed in your case, developers will tell you what to do.
Special care for regressions
@@ -1000,8 +1071,7 @@ In the whole process keep in mind: an issue only qualifies as regression if the
older and the newer kernel got built with a similar configuration. The best way
to archive this: copy the configuration file (``.config``) from the old working
kernel freshly to each newer kernel version you try. Afterwards run ``make
-oldnoconfig`` to adjust it for the needs of the new version without enabling
-any new feature, as those are allowed to cause regressions.
+olddefconfig`` to adjust it for the needs of the new version.
Write and send the report
@@ -1166,17 +1236,26 @@ Special handling for high priority issues
Reports for high priority issues need special handling.
-**Severe bugs**: make sure the subject or ticket title as well as the first
+**Severe issues**: make sure the subject or ticket title as well as the first
paragraph makes the severeness obvious.
-**Regressions**: If the issue is a regression add [REGRESSION] to the mail's
-subject or the title in the bug-tracker. If you did not perform a bisection
-mention at least the latest mainline version you tested that worked fine (say
-5.7) and the oldest where the issue occurs (say 5.8). If you did a successful
-bisection mention the commit id and subject of the change that causes the
-regression. Also make sure to add the author of that change to your report; if
-you need to file your bug in a bug-tracker forward the report to him in a
-private mail and mention where your filed it.
+**Regressions**: make the report's subject start with '[REGRESSION]'.
+
+In case you performed a successful bisection, use the title of the change that
+introduced the regression as the second part of your subject. Make the report
+also mention the commit id of the culprit. In case of an unsuccessful bisection,
+make your report mention the latest tested version that's working fine (say 5.7)
+and the oldest where the issue occurs (say 5.8-rc1).
+
+When sending the report by mail, CC the Linux regressions mailing list
+(regressions@lists.linux.dev). In case the report needs to be filed to some web
+tracker, proceed to do so; once filed, forward the report by mail to the
+regressions list. Make sure to inline the forwarded report, hence do not attach
+it. Also add a short note at the top where you mention the URL to the ticket.
+
+When mailing or forwarding the report, in case of a successful bisection add the
+author of the culprit to the recipients; also CC everyone in the signed-off-by
+chain, which you find at the end of its commit message.
**Security issues**: for these issues your will have to evaluate if a
short-term risk to other users would arise if details were publicly disclosed.
@@ -1255,7 +1334,7 @@ you never have heard of yet; or you might be asked to apply a patch to the
Linux kernel sources to test if it helps. In some cases it will be fine sending
a reply asking for instructions how to do that. But before going that route try
to find the answer own your own by searching the internet; alternatively
-consider asking in other places for advice. For example ask a fried or post
+consider asking in other places for advice. For example ask a friend or post
about it to a chatroom or forum you normally hang out.
**Be patient**: If you are really lucky you might get a reply to your report
@@ -1390,32 +1469,11 @@ easier. And with a bit of luck there might be someone in the team that knows a
bit about programming and might be able to write a fix.
-Details about reporting issues only occurring in older kernel version lines
----------------------------------------------------------------------------
+Reference for "Reporting regressions within a stable and longterm kernel line"
+------------------------------------------------------------------------------
-This subsection provides details for steps you need to take if you could not
-reproduce your issue with a mainline kernel, but want to see it fixed in older
-version lines (aka stable and longterm kernels).
-
-Some fixes are too complex
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- *Prepare yourself for the possibility that going through the next few steps
- might not get the issue solved in older releases: the fix might be too big
- or risky to get backported there.*
-
-Even small and seemingly obvious code-changes sometimes introduce new and
-totally unexpected problems. The maintainers of the stable and longterm kernels
-are very aware of that and thus only apply changes to these kernels that are
-within rules outlined in 'Documentation/process/stable-kernel-rules.rst'.
-
-Complex or risky changes for example do not qualify and thus only get applied
-to mainline. Other fixes are easy to get backported to the newest stable and
-longterm kernels, but too risky to integrate into older ones. So be aware the
-fix you are hoping for might be one of those that won't be backported to the
-version line your care about. In that case you'll have no other choice then to
-live with the issue or switch to a newer Linux version, unless you want to
-patch the fix into your kernels yourself.
+This subsection provides details for the steps you need to perform if you face
+a regression within a stable and longterm kernel line.
Make sure the particular version line still gets support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1431,7 +1489,7 @@ chosen and gets supported for at least two years (often six). That's why you
need to check if the kernel developers still support the version line you care
for.
-Note, if kernel.org lists two 'stable' version lines on the front page, you
+Note, if kernel.org lists two stable version lines on the front page, you
should consider switching to the newer one and forget about the older one:
support for it is likely to be abandoned soon. Then it will get a "end-of-life"
(EOL) stamp. Version lines that reached that point still get mentioned on the
@@ -1454,12 +1512,103 @@ Reproduce issue with the newest release
*Install the latest release from the particular version line as a vanilla
kernel. Ensure this kernel is not tainted and still shows the problem, as
- the issue might have already been fixed there.*
+ the issue might have already been fixed there. If you first noticed the
+ problem with a vendor kernel, check a vanilla build of the last version
+ known to work performs fine as well.*
Before investing any more time in this process you want to check if the issue
was already fixed in the latest release of version line you're interested in.
This kernel needs to be vanilla and shouldn't be tainted before the issue
-happens, as detailed outlined already above in the process of testing mainline.
+happens, as detailed outlined already above in the section "Install a fresh
+kernel for testing".
+
+Did you first notice the regression with a vendor kernel? Then changes the
+vendor applied might be interfering. You need to rule that out by performing
+a recheck. Say something broke when you updated from 5.10.4-vendor.42 to
+5.10.5-vendor.43. Then after testing the latest 5.10 release as outlined in
+the previous paragraph check if a vanilla build of Linux 5.10.4 works fine as
+well. If things are broken there, the issue does not qualify as upstream
+regression and you need switch back to the main step-by-step guide to report
+the issue.
+
+Report the regression
+~~~~~~~~~~~~~~~~~~~~~
+
+ *Send a short problem report to the Linux stable mailing list
+ (stable@vger.kernel.org) and CC the Linux regressions mailing list
+ (regressions@lists.linux.dev). Roughly describe the issue and ideally
+ explain how to reproduce it. Mention the first version that shows the
+ problem and the last version that's working fine. Then wait for further
+ instructions.*
+
+When reporting a regression that happens within a stable or longterm kernel
+line (say when updating from 5.10.4 to 5.10.5) a brief report is enough for
+the start to get the issue reported quickly. Hence a rough description is all
+it takes.
+
+But note, it helps developers a great deal if you can specify the exact version
+that introduced the problem. Hence if possible within a reasonable time frame,
+try to find that version using vanilla kernels. Lets assume something broke when
+your distributor released a update from Linux kernel 5.10.5 to 5.10.8. Then as
+instructed above go and check the latest kernel from that version line, say
+5.10.9. If it shows the problem, try a vanilla 5.10.5 to ensure that no patches
+the distributor applied interfere. If the issue doesn't manifest itself there,
+try 5.10.7 and then (depending on the outcome) 5.10.8 or 5.10.6 to find the
+first version where things broke. Mention it in the report and state that 5.10.9
+is still broken.
+
+What the previous paragraph outlines is basically a rough manual 'bisection'.
+Once your report is out your might get asked to do a proper one, as it allows to
+pinpoint the exact change that causes the issue (which then can easily get
+reverted to fix the issue quickly). Hence consider to do a proper bisection
+right away if time permits. See the section 'Special care for regressions' and
+the document 'Documentation/admin-guide/bug-bisect.rst' for details how to
+perform one.
+
+
+Reference for "Reporting issues only occurring in older kernel version lines"
+-----------------------------------------------------------------------------
+
+This section provides details for the steps you need to take if you could not
+reproduce your issue with a mainline kernel, but want to see it fixed in older
+version lines (aka stable and longterm kernels).
+
+Some fixes are too complex
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ *Prepare yourself for the possibility that going through the next few steps
+ might not get the issue solved in older releases: the fix might be too big
+ or risky to get backported there.*
+
+Even small and seemingly obvious code-changes sometimes introduce new and
+totally unexpected problems. The maintainers of the stable and longterm kernels
+are very aware of that and thus only apply changes to these kernels that are
+within rules outlined in 'Documentation/process/stable-kernel-rules.rst'.
+
+Complex or risky changes for example do not qualify and thus only get applied
+to mainline. Other fixes are easy to get backported to the newest stable and
+longterm kernels, but too risky to integrate into older ones. So be aware the
+fix you are hoping for might be one of those that won't be backported to the
+version line your care about. In that case you'll have no other choice then to
+live with the issue or switch to a newer Linux version, unless you want to
+patch the fix into your kernels yourself.
+
+Common preparations
+~~~~~~~~~~~~~~~~~~~
+
+ *Perform the first three steps in the section "Reporting issues only
+ occurring in older kernel version lines" above.*
+
+You need to carry out a few steps already described in another section of this
+guide. Those steps will let you:
+
+ * Check if the kernel developers still maintain the Linux kernel version line
+ you care about.
+
+ * Search the Linux stable mailing list for exiting reports.
+
+ * Check with the latest release.
+
Check code history and search for existing discussions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1513,41 +1662,6 @@ discussions abound it.
join the discussion: mention the version where you face the issue and that
you would like to see it fixed, if suitable.
-Check if it's a regression specific to stable or longterm kernels
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- *Check if you're dealing with a regression that was never present in
- mainline by installing the first release of the version line you care
- about. If the issue doesn't show up with it, you basically need to report
- the issue with this version like you would report a problem with mainline
- (see above). This ideally includes a bisection followed by a search for
- existing reports on the net; with the help of the subject and the two
- relevant commit-ids. If that doesn't turn up anything, write the report; CC
- or forward the report to the stable maintainers, the stable mailing list,
- and those who authored the change. Include the shortened commit-id if you
- found the change that causes it.*
-
-Sometimes you won't find anything in the previous step: the issue you face
-might have never occurred in mainline, as it is caused by some change that is
-incomplete or not correctly applied. To check this, install the first release
-from version line you care about, e.g., if you care about 5.4.x, install 5.4.
-
-If the issue doesn't show itself there, it's a regression specific to the
-particular version line. In that case you need to report it like an issue
-happening in mainline, like the last few steps in the main section in the above
-outline.
-
-One of them suggests doing a bisection, which you are strongly advised to do in
-this case. After finding the culprit, search the net for existing reports
-again: not only search for the exact subject and the commit-id (proper and
-shortened to twelve characters) of the change, but also for the commit-id
-(proper and shortened) mentioned as 'Upstream commit' in the commit message.
-
-Write the report; just keep a few specialties in mind: CC or forward the report
-to the stable maintainers, the stable mailing list, which the :ref:`MAINTAINERS
-<maintainers>` file mentions in the section "STABLE BRANCH". If you performed a
-successful bisection, CC the author of the change and include its subject and
-the shortened commit-id.
Ask for advice
~~~~~~~~~~~~~~
@@ -1560,8 +1674,7 @@ Ask for advice
If the previous three steps didn't get you closer to a solution there is only
one option left: ask for advice. Do that in a mail you sent to the maintainers
for the subsystem where the issue seems to have its roots; CC the mailing list
-for the subsystem as well as the stable mailing list the :ref:`MAINTAINERS
-<maintainers>` file mention in the section "STABLE BRANCH".
+for the subsystem as well as the stable mailing list (stable@vger.kernel.org).
Why some issues won't get any reaction or remain unfixed after being reported
@@ -1629,3 +1742,13 @@ issues to the Linux kernel developers: the length and complexity of this
document and the implications between the lines illustrate that. But that's how
it is for now. The main author of this text hopes documenting the state of the
art will lay some groundwork to improve the situation over time.
+
+
+..
+ This text is maintained by Thorsten Leemhuis <linux@leemhuis.info>. If you
+ spot a typo or small mistake, feel free to let him know directly and he'll
+ fix it. You are free to do the same in a mostly informal way if you want
+ to contribute changes to the text, but for copyright reasons please CC
+ linux-doc@vger.kernel.org and "sign-off" your contribution as
+ Documentation/process/submitting-patches.rst outlines in the section "Sign
+ your work - the Developer's Certificate of Origin".