About

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as zed
  & handle of notzed

Wednesday, 18 December 2019, 13:59

amdgpu / x570 / sea islands / IO_PAGE_FAULT

After turning on the amdgpu driver on my new system (radeon.si_support=0 amdgpu.si_support=1) so I could play with Vulkan, it seemed nice and stable.

But I was only playing around in emacs/netbeans/firefox and wasn't doing much with the graphics system. When I ran a JavaFX thing I've been playing with (the genetic art), it eventually crashed the graphics driver. So I went back to radeon for the time being until I decided to look at it again.

Anyway today I had another look, and whilst logged on remotely I got the error log:

[  993.135454] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe05000 flags=0x0000]
[  993.135461] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe0a040 flags=0x0000]
[  993.135466] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe338c0 flags=0x0000]
[  993.135471] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe1d100 flags=0x0000]
[  993.135476] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe38900 flags=0x0000]
[  993.135481] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe5dd80 flags=0x0000]
[  993.135486] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe23580 flags=0x0000]
[  993.135491] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe25c00 flags=0x0000]
[  993.135496] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe4ed00 flags=0x0000]
[  993.135501] amdgpu 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0003 address=0xffffe07a80 flags=0x0000]
[  993.135507] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe52a00 flags=0x0000]
[  993.135512] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe55200 flags=0x0000]
[  993.135517] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe5a200 flags=0x0000]
[  993.135521] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe5f2c0 flags=0x0000]
[  993.135526] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe10700 flags=0x0000]
[  993.135531] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe51600 flags=0x0000]
[  993.135535] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe606c0 flags=0x0000]
[  993.135540] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe4c7c0 flags=0x0000]
[  993.135544] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe53e00 flags=0x0000]
[  993.135549] AMD-Vi: Event logged [IO_PAGE_FAULT device=09:00.0 domain=0x0003 address=0xffffe58e00 flags=0x0000]
[ 1003.712058] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=144954, emitted seq=144956
[ 1003.712116] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1301 thread Xorg:cs0 pid 1303
[ 1003.712118] [drm] GPU recovery disabled.
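A wall of faults like that is easier to read once it is boiled down. The following is only a rough sketch of how the lines could be summarised, assuming they keep the two shapes seen above (device either in the line prefix or as device=... inside the brackets). The names FAULT_RE and summarise_faults are made up for this example and are not part of any tool.

```python
import re
from collections import Counter

# Matches both forms seen in the log: the device may appear as
# "device=09:00.0" inside the brackets, or only in the line prefix.
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT\s+(?:device=(?P<dev>\S+)\s+)?"
    r"domain=(?P<domain>0x[0-9a-fA-F]+)\s+"
    r"address=(?P<addr>0x[0-9a-fA-F]+)\s+"
    r"flags=(?P<flags>0x[0-9a-fA-F]+)"
)


def summarise_faults(lines):
    """Return the fault count, address range and per-domain counts, or None."""
    addrs = []
    domains = Counter()
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            addrs.append(int(m.group("addr"), 16))
            domains[m.group("domain")] += 1
    if not addrs:
        return None
    return {
        "count": len(addrs),
        "lowest": hex(min(addrs)),
        "highest": hex(max(addrs)),
        "domains": dict(domains),
    }
```

Feeding it the saved output of dmesg (one string per line) gives a single dictionary instead of twenty near-identical lines, which makes it obvious when all the faults sit in one small address window.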

There wasn't a lot to be found on the internet regarding amdgpu but there were some other similar problems. The suggested fix was to add iommu=soft to the kernel arguments. I'm trying this now and so far it looks good!
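Since the parameters only take effect if they actually reach the kernel, it can be worth checking the running command line after a reboot. This is a minimal sketch, assuming the three settings mentioned in this post and a command line without quoted values containing spaces. WANTED, parse_cmdline and check_params are names invented for the example.

```python
# Settings mentioned in the post; adjust to taste.
WANTED = {
    "radeon.si_support": "0",
    "amdgpu.si_support": "1",
    "iommu": "soft",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None.

    Assumption: no quoted values with embedded spaces.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def check_params(cmdline, wanted=WANTED):
    """Report, per wanted parameter, whether it is set to the wanted value."""
    params = parse_cmdline(cmdline)
    return {name: params.get(name) == value for name, value in wanted.items()}
```

On a running Linux system the string to pass in is the contents of /proc/cmdline, for example check_params(open("/proc/cmdline").read()). Any entry that comes back False was not picked up by the boot loader.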

I have been tracking the 5.4.x kernel releases and thought that might have something to do with it, but fortunately not. When it crashed initially I'd just done a slackpkg upgrade-all and all sorts of things broke horribly. But that was my fault for overzealously removing packages that didn't look important: polkit uses a lot of snot unfortunately, so I broke that. Fucking firefox lost all my settings though!

Tagged hacking.
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!