Working out a guideline for troubleshooting crashes

Hi,

I have been experiencing a peculiar problem with AlmaLinux release 8.5 (Arctic Sphynx) on a Raspberry Pi 4B. All of a sudden the screen freezes, the mouse and keyboard stop responding, the machine cannot be pinged from remote machines, and the only option left is to switch the power off and on again. After the restart, nothing suspicious shows up in dmesg or syslog.

This happens on the GUI desktop version (1157 packages) as well as the command-line server version (378 packages) of the OS. The machines are kept up to date by running “dnf update” several times a day. After about 30 to 40 such incidents I have a feeling that, at least on the desktop GUI version, this might be related to Firefox, because the incidents occurred more frequently when I was using the browser there. It is probably not a hardware issue, because I see this on three independent machines which otherwise run fine with, e.g., Debian buster. Also, it makes no difference whether I use an SD card or USB.

So now I am looking into configuring the generation of dump files in order to analyze them and find the root cause.

The questions that come up are:

  1. Has anyone else observed such behavior? What was the solution?
  2. What is the best way to configure and use core / kernel dumps?
  3. Would it be worthwhile to start compiling a generic troubleshooting guideline if it does not exist already?

Any help appreciated. Many thanks!

Maniaak