Today, for no apparent reason, my home server crashed and stopped responding to any commands. The only option was a hardware reset from the button!
The server is an HTPC with remote access (a headless server), based on a Ryzen 2400G processor with 16 GB RAM, 1x NVMe, 1x SSD, 4x HDDs and a GeForce® GTX 1050 Ti video card. It is used for transcoding and for internet access through two providers (100 + 1000 Mbps). It has run non-stop for almost 4 years; I only shut it down manually to add or change HDDs, the video card, network hardware, etc. A UPS with an autonomy of approximately 12 hours ensures that it stays on all the time.
After rebooting, I checked the logs and found the following errors:
/var/log/syslog
Feb 26 19:11:19 zen kernel: [706943.754304] ahci 0000:02:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0x4055abe000608040 flags=0x0030]
Feb 26 19:11:19 zen kernel: [706943.754363] ahci 0000:02:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0x4055abe000608080 flags=0x0030]
Feb 26 19:11:19 zen kernel: [706943.754416] ahci 0000:02:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0x4055abe0006080c0 flags=0x0030]
Feb 26 19:11:19 zen kernel: [706943.754468] ahci 0000:02:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0x4055abe000608024 flags=0x0030]
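For anyone who wants to check their own logs for the same events, here is a minimal Python sketch that pulls the IO_PAGE_FAULT entries out of log lines and counts them per device. The regular expression is written against the format of the lines above only, and the function and field names (find_page_faults, count_by_device, "driver", "device") are my own labels, not anything defined by the kernel.

```python
import re
from collections import Counter

# Matches lines shaped like the AMD-Vi entries quoted above.
# The group names are illustrative labels, not kernel terminology.
FAULT_RE = re.compile(
    r"(?P<driver>\S+) "
    r"(?P<device>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT "
    r"domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)


def find_page_faults(lines):
    """Return one dict per IO_PAGE_FAULT event found in the given lines."""
    return [m.groupdict() for line in lines for m in FAULT_RE.finditer(line)]


def count_by_device(events):
    """Count events per (driver, PCI address) pair."""
    return Counter((e["driver"], e["device"]) for e in events)
```

To run it against a real log, open the file (for example with open("/var/log/syslog", errors="replace")) and pass the file object to find_page_faults, then feed the result to count_by_device. Reading that file usually needs root or membership in the adm group.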
I searched online for a solution and apparently there is no simple one. In the end I found someone stating (on Bugzilla, in 2019) that the problem would no longer exist in kernel version 5.5, so without thinking twice I upgraded to kernel 5.5.0. I hope that got rid of the problem!
After upgrading the kernel I ran into another problem: the server would enter sleep mode (standby) for no reason. I solved that quickly, and I hope there are no more!
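A quick way to confirm that the machine really booted into the new kernel is to compare the running release against 5.5. The sketch below is only an illustration: kernel_at_least is a hypothetical helper name, and it assumes the release string starts with "major.minor", as Linux release strings normally do.

```python
import re


def kernel_at_least(release, minimum=(5, 5)):
    """True if a release string such as '5.5.0-050500-generic' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # Tuples compare element by element, so (5, 10) > (5, 5) as intended.
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

On the server itself, the release string can be taken from platform.release() in the standard library (the same value that uname -r prints) and passed to kernel_at_least.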