Consider this a draft. I’ll update it as I have time, but I’m posting now because it may help someone.
Splunk 7.2.2 brought along new features (which previously didn’t happen in a “maintenance release”, but that’s another topic for another time). One of the new features is “systemd support”. It didn’t take long before folks were on Splunk Answers wondering where their cheese had been moved to. Some workarounds were provided, some of which work in some cases but not others. So, @automine and I dug into it a little more late today. (Not done yet, though.)
When Splunk 7.2.2 is installed on a systemd-compatible system and you use splunk enable boot-start to create the systemd unit file, the Splunk CLI changes its mode of operation for the start, stop, and restart commands. Specifically, it passes them through as calls to systemctl. Below is a snippet of an strace capture of me running splunk stop as the splunk user.
29384 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc4f84fb000
29384 write(1, "Stopping splunkd...\n", 20) = 20
29384 write(1, "Shutting down. Please wait, as "..., 61) = 61
29384 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fc4f84f2a10) = 29417
29384 wait4(29417, <unfinished ...>
29417 set_robust_list(0x7fc4f84f2a20, 24) = 0
29417 execve("/opt/splunk/bin/systemctl", ["systemctl", "stop", "Splunkd"], [/* 30 vars */]) = -1 ENOENT (No such file or directory)
29417 execve("/usr/local/bin/systemctl", ["systemctl", "stop", "Splunkd"], [/* 30 vars */]) = -1 ENOENT (No such file or directory)
29417 execve("/bin/systemctl", ["systemctl", "stop", "Splunkd"], [/* 30 vars */]) = 0
29417 brk(NULL) = 0x55c9c4485000
We see it fork a new child and exec “systemctl stop Splunkd”, trying each directory in the PATH until it finds the binary in /bin. Notice there is no call to sudo or anything like it here. In a lot of customer environments I see/work in, the “Splunk Team” and the “OS team” exist on opposite sides of an organizational wall. In Splunk 7.2.1, you could easily use the splunk user as a service account and issue stop/start/restart commands to your heart’s content, and it mostly just worked. In 7.2.2, those commands no longer work for you because Splunk MUST ask systemd to handle the stops and starts for it, so that systemd knows what is happening and can do process restarts and so forth.
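If you want to look at this on your own system, a filter like the one below pulls just the exec attempts out of a saved capture. Treat it as a sketch: the function name exec_attempts and the trace file name are made up for illustration, and it assumes the capture was written one syscall per line using strace's -f (follow children) and -o (output file) options.

```shell
#!/bin/sh
# Hypothetical helper, not part of Splunk or strace.
# Assumes a capture made with something like:
#   strace -f -o splunk-stop.trace /opt/splunk/bin/splunk stop
exec_attempts() {
    # With '"' as the field separator, $2 is the path handed to execve.
    # Everything after the last " = " on the line is the syscall result.
    awk -F'"' '/execve\(/ {
        n = split($0, parts, " = ")
        print $2, "->", parts[n]
    }' "$1"
}
```

Running exec_attempts splunk-stop.trace prints one line per attempt, with the path tried followed by the result, which makes the walk through the PATH easy to see.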
One reasonable workaround here is adding sudo rules, and retraining the Splunk Team to use them. Some sudo rules like these (courtesy of automine) make it possible for the splunk service account to issue the needful commands to systemd in order to stop/start/restart splunk:
splunk ALL=(root) NOPASSWD: /usr/bin/systemctl restart Splunkd.service
splunk ALL=(root) NOPASSWD: /usr/bin/systemctl stop Splunkd.service
splunk ALL=(root) NOPASSWD: /usr/bin/systemctl start Splunkd.service
splunk ALL=(root) NOPASSWD: /usr/bin/systemctl status Splunkd.service
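To make the new habit a little easier to pick up, a small wrapper in the splunk user's shell profile can stand in for the old commands. This is only a sketch: splunkctl and DRY_RUN are names I made up, and it assumes the same systemctl path and unit name as the sudo rules. Since those rules list arguments, sudo wants the command line to match them, so the wrapper spells the path and unit name the same way the rules do.

```shell
#!/bin/sh
# Hypothetical helper: splunkctl and DRY_RUN are illustrative names, not part of Splunk.
# Assumes the systemctl path and unit name used in the sudo rules.
SYSTEMCTL=/usr/bin/systemctl
UNIT=Splunkd.service

splunkctl() {
    case "$1" in
        start|stop|restart|status)
            if [ -n "${DRY_RUN:-}" ]; then
                # Dry run: print the command instead of running it.
                echo "sudo $SYSTEMCTL $1 $UNIT"
            else
                sudo "$SYSTEMCTL" "$1" "$UNIT"
            fi
            ;;
        *)
            echo "usage: splunkctl {start|stop|restart|status}" >&2
            return 2
            ;;
    esac
}
```

With that loaded, splunkctl restart runs sudo /usr/bin/systemctl restart Splunkd.service, and setting DRY_RUN=1 first shows what would be run without touching the service.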
These don’t help without retraining though! If your Splunk Admins continue to try to use the classic bin/splunk restart command that worked before, they will continue to be asked to authenticate as a wheel user each time.
This workaround may work amazingly on other distributions; I’ve not tried them all yet.
Things you can do
- Use the sudo rules and retrain yourself to always use systemctl to manage your splunk processes
- Harass Splunk to add a capability to have their behind-the-scenes calls to systemd be prefaced w/ sudo
- Harass RHEL to backport the needful systemd chunks to their version of systemd.
- Harass Ubuntu to adopt a more modern polkit
- Use some other Linux distribution
- Stay on the Splunk 7.1 release train for the foreseeable future
I would not advise getting to 7.2.0 or 7.2.1 and “parking” there. Any future 7.2 maintenance release is going to have this in it (unless Splunk takes it out further down the road and I hope they don’t).