What if you encounter an issue with a kernel module that doesn’t give feedback on what went wrong in the kernel log?
An example of a problem I faced along that line was when I got an error while trying to unload a module.
$ sudo modprobe -r vkms modprobe: FATAL: Module vkms is in use.
Usually, this would have been easily resolved with lsmod by checking the Used by column. Unfortunately, that’s not possible all the time.
$ lsmod | grep vkms vkms 16384 1 drm_kms_helper 172032 1 vkms drm 405504 4 drm_kms_helper,vkms
For example, from above, we can see that Used by in the third column shows that one instance of drm_kms_helper is used by vkms, but it doesn’t give more details on what’s using vkms.
I have been told that it’s probably used by a part of the GUI manager. To understand what exactly was using the module, I have stumbled upon an answer in this thread [1] which explained how Ftrace utility could be used to shed light on the modules dependencies.
Ftrace
Ftrace (Function Tracer) is a linux kernel utility that enables tracing kernel function calls. Ftrace is not only helpful for debugging but also to understand how a module works by following the order of function calls.
Reading the documentation file [2]
and this guide [3], as well as exploring the directory
/sys/kernel/debug/tracing
was helpful to understand how ftrace works and how to use it for debugging
(to access the directory, super-user privileges needed).
The following use cases give some overview on how to navigate ftrace files:
1) To check if tracing is enabled:
$ cat /sys/kernel/debug/tracing/tracing_on
2) By default, tracing is enabled in my system. To disable it:
$ echo 0 > /sys/kernel/debug/tracing/tracing_on
3) To check available tracers:
$ cat /sys/kernel/debug/tracing/available_tracers hwlat blk mmiotrace function_graph wakeup_dl wakeup_rt wakeup function nop
Other files with available prefix such as available_[events, filter_functions] is helpful to learn about the available events and functions that can be traced.
4) By default nop is used as the current_tracer, to change the tracing function:
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
6) To clean trace buffer:
$ echo > /sys/kernel/debug/tracing/trace
Ftrace through trace-cmd Tool
Since dealing with Ftrace through the debugfs filesystem is not that convenient, there exists a user-space command-line tool for Ftrace that may be easier to use.
$ sudo apt-get install trace-cmd
For more details on trace-cmd, this guide [4] might be helpful.
Moreover, checking the help manual for a specific command such as: $ trace-cmd record --help
would give more details on the
proper usage of the command.
Inspecting module dependncies with trace-cmd:
As [1] mentioned, try_module_get && module_put are the ones responsible for managing a module usage count (refcount).
We can see the available events related to module load and unload by inspecting available_events
file in the Ftrace directory:
$ cat /sys/kernel/debug/tracing/available_events | grep module_ module:module_request module:module_put module:module_get module:module_free module:module_load
Use Case:
So now we can use trace-cmd to understand the module dependencies for vkms:
1- Start recording all events associated with module keyword (ex: module_request, module_get, etc.)
$ sudo trace-cmd record -e module
2- In another terminal, install the module again. Since the vkms module depends on other modules, I would use modprobe.
$ modprobe vkms
3- To inspect the results of the event tracing, we can stop recording and run trace-cmd again with the report command.
$ trace-cmd report cpus=1 modprobe-1687 [000] 102.001424: module_load: sysimgblt modprobe-1687 [000] 102.001519: module_put: sysimgblt call_site=do_init_module refcnt=1 modprobe-1687 [000] 102.013795: module_load: sysfillrect modprobe-1687 [000] 102.013812: module_put: sysfillrect call_site=do_init_module refcnt=1 modprobe-1687 [000] 102.023779: module_load: syscopyarea modprobe-1687 [000] 102.023792: module_put: syscopyarea call_site=do_init_module refcnt=1 modprobe-1687 [000] 102.030766: module_load: fb_sys_fops modprobe-1687 [000] 102.030776: module_put: fb_sys_fops call_site=do_init_module refcnt=1 modprobe-1687 [000] 102.117198: module_load: drm modprobe-1687 [000] 102.118399: module_put: drm call_site=do_init_module refcnt=1 modprobe-1687 [000] 102.145601: module_get: drm call_site=ref_module refcnt=2 modprobe-1687 [000] 102.145674: module_get: fb_sys_fops call_site=ref_module refcnt=2 modprobe-1687 [000] 102.145680: module_get: sysimgblt call_site=ref_module refcnt=2 modprobe-1687 [000] 102.145687: module_get: sysfillrect call_site=ref_module refcnt=2 modprobe-1687 [000] 102.145754: module_get: syscopyarea call_site=ref_module refcnt=2 modprobe-1687 [000] 102.149619: module_load: drm_kms_helper modprobe-1687 [000] 102.150797: module_put: drm_kms_helper call_site=do_init_module refcnt=1 modprobe-1687 [000] 102.160087: module_get: drm call_site=ref_module refcnt=3 modprobe-1687 [000] 102.160090: module_get: drm_kms_helper call_site=ref_module refcnt=2 modprobe-1687 [000] 102.161679: module_load: vkms modprobe-1687 [000] 102.161689: module_get: drm call_site=get_filesystem refcnt=4 modprobe-1687 [000] 102.163020: module_put: vkms call_site=do_init_module refcnt=1 systemd-logind-545 [000] 102.168295: module_get: drm call_site=cdev_get refcnt=5 systemd-logind-545 [000] 102.168296: module_get: drm call_site=chrdev_open refcnt=6 systemd-logind-545 [000] 102.168297: module_get: vkms call_site=drm_stub_open refcnt=2 systemd-logind-545 [000] 102.168298: module_put: drm call_site=drm_stub_open refcnt=5
We can see from the tracing output that some other modules were loaded first (sysimgblt, sysfillrect, etc.).
That’s because modprobe checks first the module directory in /lib/modules/`uname -r`
and load first a set of modules that the module to be installed depends on.
$ cat /lib/modules/$(uname -r)/modules.dep | grep vkms kernel/drivers/gpu/drm/vkms/vkms.ko: kernel/drivers/gpu/drm/drm_kms_helper.ko kernel/drivers/gpu/drm/drm.ko kernel/drivers/video/fbdev/core/fb_sys_fops.ko kernel/drivers/video/fbdev/core/syscopyarea.ko kernel/drivers/video/fbdev/core/sysfillrect.ko kernel/drivers/video/fbdev/core/sysimgblt.ko
Resolving the Problem
It appears that, systemd-login (which manages user logins) [5] has been using an instance of vkms and drm modules as can be shown bellow:
systemd-logind-545 [000] 102.168295: module_get: drm call_site=cdev_get refcnt=5 systemd-logind-545 [000] 102.168296: module_get: drm call_site=chrdev_open refcnt=6 systemd-logind-545 [000] 102.168297: module_get: vkms call_site=drm_stub_open refcnt=2 systemd-logind-545 [000] 102.168298: module_put: drm call_site=drm_stub_open refcnt=5
To stop systemd-login from using these modules we need to disable the graphical login manager.
Since my machine uses systemd instead of init [6] I can switch to a text mode by changing the runlevel or its equivelant keyword: target.
$ cat /proc/1/status | grep Name Name:systemd
To inspect the current target get-default
can be used:
$ systemctl get-default graphical.target
We can disable the graphical login manager permenantly by changing the default target to multi-user as follows:
$ sudo systemctl set-default multi-user.target
To understand the difference between graphical and multiuser targets, pstree can be used to give some context on both of them:
$ systemctl isolate multiuser.target $ pstree systemd─┬─ModemManager───2*[{ModemManager}] ├─NetworkManager─┬─2*[dhclient] │ └─2*[{NetworkManager}] ├─accounts-daemon───3*[{accounts-daemon}] ├─agetty ├─avahi-daemon───avahi-daemon ├─cron ├─cups-browsed───2*[{cups-browsed}] ├─cupsd ├─dbus-daemon ├─dbus-daemon───dbus-daemon ├─evolution-addre───5*[{evolution-addre}] ├─evolution-calen───5*[{evolution-calen}] ├─evolution-sourc───4*[{evolution-sourc}] ├─gnome-keyring-d───3*[{gnome-keyring-d}] ├─goa-daemon───5*[{goa-daemon}] ├─goa-identity-se───3*[{goa-identity-se}] ├─gvfsd───3*[{gvfsd}] ├─gvfsd-fuse───5*[{gvfsd-fuse}] ├─2*[kerneloops] ├─networkd-dispat───{networkd-dispat} ├─packagekitd───3*[{packagekitd}] ├─polkitd───3*[{polkitd}] ├─rsyslogd───3*[{rsyslogd}] ├─snapd───6*[{snapd}] ├─ssh-agent ├─sshd───sshd───sshd───bash───pstree ├─systemd-journal ├─systemd-logind ├─systemd-resolve ├─systemd-timesyn───{systemd-timesyn} ├─systemd-udevd ├─whoopsie───2*[{whoopsie}] └─wpa_supplicant
$ systemctl isolate graphical.target $ pstree systemd─┬─ModemManager───2*[{ModemManager}] ├─NetworkManager─┬─2*[dhclient] │ └─2*[{NetworkManager}] ├─accounts-daemon───2*[{accounts-daemon}] ├─at-spi-bus-laun─┬─dbus-daemon │ └─4*[{at-spi-bus-laun}] ├─at-spi2-registr───2*[{at-spi2-registr}] ├─avahi-daemon───avahi-daemon ├─cron ├─cups-browsed───2*[{cups-browsed}] ├─cupsd ├─dbus-daemon ├─dbus-daemon───dbus-daemon ├─evolution-sourc───3*[{evolution-sourc}] ├─gdm3─┬─gdm-session-wor─┬─gdm-x-session─┬─Xorg───{Xorg} │ │ │ ├─dbus-daemon │ │ │ ├─gnome-session-b─┬─gnome-shell─┬─ibus-daemon─┬─ibus-dconf───{ibu+ │ │ │ │ │ │ └─3*[{ibus-daemon}] │ │ │ │ │ └─11*[{gnome-shell}] │ │ │ │ ├─ssh-agent │ │ │ │ └─4*[{gnome-session-b}] │ │ │ └─2*[{gdm-x-session}] │ │ └─2*[{gdm-session-wor}] │ └─3*[{gdm3}] ├─gnome-keyring-d───3*[{gnome-keyring-d}] ├─goa-daemon───4*[{goa-daemon}] ├─goa-identity-se───3*[{goa-identity-se}] ├─gvfsd───2*[{gvfsd}] ├─gvfsd───3*[{gvfsd}] ├─gvfsd-fuse───5*[{gvfsd-fuse}] ├─2*[kerneloops] ├─networkd-dispat───{networkd-dispat} ├─plymouth ├─plymouthd ├─polkitd───3*[{polkitd}] ├─pulseaudio───2*[{pulseaudio}] ├─rsyslogd───3*[{rsyslogd}] ├─rtkit-daemon───2*[{rtkit-daemon}] ├─snapd───6*[{snapd}] ├─sshd───sshd───sshd───bash───pstree ├─systemd-journal ├─systemd-logind ├─systemd-resolve ├─systemd-timesyn───{systemd-timesyn} ├─systemd-udevd ├─udisksd───5*[{udisksd}] ├─upowerd───3*[{upowerd}] ├─whoopsie───2*[{whoopsie}] └─wpa_supplicant
If we reboot now and test again modprobe with trace-cmd, we can see that systemd-login no longer uses drm or vkms.
$ trace-cmd report cpus=1 modprobe-1101 [000] 523.667393: module_load: sysimgblt modprobe-1101 [000] 523.667459: module_put: sysimgblt call_site=do_init_module refcnt=1 modprobe-1101 [000] 523.674594: module_load: sysfillrect modprobe-1101 [000] 523.674605: module_put: sysfillrect call_site=do_init_module refcnt=1 modprobe-1101 [000] 523.680929: module_load: syscopyarea modprobe-1101 [000] 523.680937: module_put: syscopyarea call_site=do_init_module refcnt=1 modprobe-1101 [000] 523.686194: module_load: fb_sys_fops modprobe-1101 [000] 523.686203: module_put: fb_sys_fops call_site=do_init_module refcnt=1 modprobe-1101 [000] 523.771319: module_load: drm modprobe-1101 [000] 523.772248: module_put: drm call_site=do_init_module refcnt=1 modprobe-1101 [000] 523.799666: module_get: drm call_site=ref_module refcnt=2 modprobe-1101 [000] 523.799739: module_get: fb_sys_fops call_site=ref_module refcnt=2 modprobe-1101 [000] 523.799745: module_get: sysimgblt call_site=ref_module refcnt=2 modprobe-1101 [000] 523.799751: module_get: sysfillrect call_site=ref_module refcnt=2 modprobe-1101 [000] 523.799818: module_get: syscopyarea call_site=ref_module refcnt=2 modprobe-1101 [000] 523.803201: module_load: drm_kms_helper modprobe-1101 [000] 523.804125: module_put: drm_kms_helper call_site=do_init_module refcnt=1 modprobe-1101 [000] 523.813720: module_get: drm call_site=ref_module refcnt=3 modprobe-1101 [000] 523.813723: module_get: drm_kms_helper call_site=ref_module refcnt=2 modprobe-1101 [000] 523.815070: module_load: vkms modprobe-1101 [000] 523.815081: module_get: drm call_site=get_filesystem refcnt=4 modprobe-1101 [000] 523.817306: module_put: vkms call_site=do_init_module refcnt=1
And the Used by entry for vkms is zero now, so we can unload the module vkms without any complains \o/
$ lsmod | grep vkms vkms 16384 0 drm_kms_helper 172032 1 vkms drm 405504 3 drm_kms_helper,vkms
Refrences:
[1] Stack Overflow: Is there a way to figure out what is using a Linux kernel module?
[2] Kernel Documentation: FTrace - Function Tracer
[3] Debugging the kernel using Ftrace - part 1
[4] trace-cmd: A front-end for Ftrace
[5] Systemd-logind