I recently encountered a very weird problem with an RPM package. This particular RPM package is automatically built by a GitLab pipeline and is built for all active Enterprise Linux (EL) versions: 7, 8, 9, 10.
Multiple services are bundled in this RPM, all defined as Systemd services (units). The processes themselves pass through the pipeline and are verified to be working. However, the pipeline uses Docker containers; Systemd doesn't run in them, so the services themselves cannot be tested using systemctl.
After deployment of this RPM across a bunch of servers, problems with RHEL 7 started to come up.
One particular service refused to start on EL 7. This was tested and verified on a CentOS 7 machine. Even root was unable to start the service:
[root@centos7 ~]# systemctl start myservice
Unfortunately no output is shown right after the start, which gives the false impression that everything is OK. However, the status of the service shows that the start failed:
[root@centos7 ~]# systemctl status myservice
myservice.service - SFS Super Fancy Service.
Loaded: loaded (/usr/lib/systemd/system/myservice.service; disabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Tue 2025-09-16 14:07:55 CEST; 1s ago
Process: 2243 ExecStart=/opt/myservice/bin/myservice --config.file="/opt/myservice/config/myservice.yaml" --web.listen-address=${WEB_ADDR} --log.level=${LOG_LEVEL} (code=exited, status=1/FAILURE)
Main PID: 2243 (code=exited, status=1/FAILURE)
Sep 16 14:07:54 centos7.localdomain systemd[1]: myservice.service: main process exited, code=exited, status=1/FAILURE
Sep 16 14:07:54 centos7.localdomain systemd[1]: Unit myservice.service entered failed state.
Sep 16 14:07:54 centos7.localdomain systemd[1]: myservice.service failed.
Sep 16 14:07:55 centos7.localdomain systemd[1]: myservice.service holdoff time over, scheduling restart.
Sep 16 14:07:55 centos7.localdomain systemd[1]: Stopped SFS Super Fancy Service..
Sep 16 14:07:55 centos7.localdomain systemd[1]: start request repeated too quickly for myservice.service
Sep 16 14:07:55 centos7.localdomain systemd[1]: Failed to start SFS Super Fancy Service..
Sep 16 14:07:55 centos7.localdomain systemd[1]: Unit myservice.service entered failed state.
Sep 16 14:07:55 centos7.localdomain systemd[1]: myservice.service failed.
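Since systemctl start stays silent here (presumably because the unit uses the default Type=simple, where the start job counts as done once the process has been spawned), a deployment script cannot rely on the start command alone. The following is a minimal sketch of a start-then-verify helper. The function names and the three second grace period are assumptions made for this example; only systemctl start and systemctl show -p are real commands.

```python
import subprocess
import time


def parse_show_output(text):
    """Turn the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def unit_is_running(props):
    """True only if the unit reports ActiveState=active."""
    return props.get("ActiveState") == "active"


def start_and_verify(unit, settle_seconds=3.0):
    """Start a unit, wait briefly, then ask systemd for its real state.

    settle_seconds is an arbitrary grace period chosen for this sketch.
    """
    subprocess.run(["systemctl", "start", unit], check=False)
    time.sleep(settle_seconds)
    result = subprocess.run(
        ["systemctl", "show", "-p", "ActiveState", "-p", "Result", unit],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
    props = parse_show_output(result.stdout)
    return unit_is_running(props), props
```

Calling start_and_verify("myservice") on the affected CentOS 7 machine would be expected to report the unit as not running, together with the Result property that systemd recorded.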
Neither the status nor the unit log (journalctl -u myservice) revealed why the service failed to start.
What made this even more head-scratch-worthy was the fact that the process itself (as defined in ExecStart) started successfully when launched manually:
-bash-4.2$ /opt/myservice/bin/myservice --config.file="/opt/myservice/config/myservice.yaml" --web.listen-address=127.0.0.1:9427 --log.level=debug
time=2025-09-16T12:13:20.476Z level=INFO source=main.go:60 msg=msg "Starting myservice"=version !BADKEY=1.7.10
time=2025-09-16T12:13:20.476Z level=INFO source=main.go:62 msg=msg !BADKEY="Loading config"
time=2025-09-16T12:13:20.477Z level=INFO source=main.go:163 msg=msg !BADKEY="Configured default DNS resolver"
time=2025-09-16T12:13:20.477Z level=INFO source=main.go:148 msg=msg "Starting myservice"=version !BADKEY=1.7.10
time=2025-09-16T12:13:20.477Z level=INFO source=main.go:149 msg=msg !BADKEY="Listening for /metrics on [127.0.0.1:9427]"
time=2025-09-16T12:13:20.479Z level=INFO source=tls_config.go:347 msg="Listening on" address=127.0.0.1:9427
time=2025-09-16T12:13:20.479Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=127.0.0.1:9427
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_ping.go:60 msg="Current Targets" type=ICMP func=AddTargets count=0 configured=0
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_ping.go:81 msg="Target names to add" type=ICMP func=AddTargets targets=[]
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_mtr.go:62 msg="Current Targets" type=MTR func=AddTargets count=0 configured=0
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_mtr.go:77 msg="Target names to add" type=MTR func=AddTargets targets=[]
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_tcp.go:55 msg="Current Targets" type=TCP func=AddTargets count=0 configured=0
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_tcp.go:81 msg="Target names to add" type=TCP func=AddTargets targets=[]
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_http.go:54 msg="Current Targets" type=HTTPGet func=AddTargets count=0 configured=0
time=2025-09-16T12:13:20.479Z level=DEBUG source=monitor_http.go:69 msg="Target names to add" type=HTTPGet func=AddTargets targets=[]
The first thought was that the EnvironmentFile, defined in the service unit file, was not read correctly. This would lead to empty variables ${WEB_ADDR} and ${LOG_LEVEL}. It would make sense that the service wouldn't start in this situation.
However the unit file defines the EnvironmentFile correctly and it exists (and the variables are correctly set):
[root@centos7 ~]# cat /usr/lib/systemd/system/myservice.service
[Unit]
Description=SFS Super Fancy Service
Wants=network-online.target
After=network-online.target
[Service]
User=app
Group=app
EnvironmentFile=/opt/myservice/config/myservice.env
ExecStart=/opt/myservice/bin/myservice \
--config.file="/opt/myservice/config/myservice.yaml" \
--web.listen-address=${WEB_ADDR} \
--log.level=${LOG_LEVEL}
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
SyslogIdentifier=myservice
[Install]
WantedBy=multi-user.target
[root@centos7 ~]# ls -la /opt/myservice/config/myservice.env
-rw-r----- 1 app app 332 Sep 3 00:00 /opt/myservice/config/myservice.env
[root@centos7 ~]# cat /opt/myservice/config/myservice.env
WEB_ADDR=127.0.0.1:9427
LOG_LEVEL=debug
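The manual check above (does every variable used in ExecStart exist in the EnvironmentFile?) can also be scripted. This is only a sketch: it handles plain KEY=VALUE lines and # or ; comments, not every quoting rule systemd accepts in an environment file, and the function names are invented for this example.

```python
import re

# Sample data copied from the unit file and environment file shown above.
ENV_TEXT = "WEB_ADDR=127.0.0.1:9427\nLOG_LEVEL=debug\n"
EXEC_START = (
    "/opt/myservice/bin/myservice "
    '--config.file="/opt/myservice/config/myservice.yaml" '
    "--web.listen-address=${WEB_ADDR} --log.level=${LOG_LEVEL}"
)


def parse_env_file(text):
    """Parse simple KEY=VALUE lines; blank lines and comments are skipped."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            env[key.strip()] = value.strip()
    return env


def referenced_variables(exec_line):
    """Return the names used as ${NAME} or $NAME in a command line."""
    return set(re.findall(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?", exec_line))


def unset_or_empty(exec_line, env):
    """List referenced variables that are missing or empty in env."""
    return sorted(n for n in referenced_variables(exec_line) if not env.get(n))
```

For the files shown above, unset_or_empty(EXEC_START, parse_env_file(ENV_TEXT)) returns an empty list, which matches the conclusion that the environment file is not the culprit.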
The service unit file was checked for weird (invisible) characters and was verified using systemd-analyze, too:
[root@centos7 ~]# systemd-analyze verify /usr/lib/systemd/system/myservice.service
[root@centos7 ~]# echo $?
0
A syntax error or something similar would have produced an error.
The fact that the exact same service unit file is in place on the other EL releases, and the service can be started there, also hints that there is no error in that file...
After I shifted the focus to the (syslog) log files, I was surprised to find the following entry right after another attempted start of the service:
[root@centos7 ~]# tail -f /var/log/messages /var/log/secure
[...]
Sep 16 14:11:05 centos7 myservice: time=2025-09-16T12:11:05.311Z level=ERROR source=main.go:64 msg=msg "Loading config"=err !BADKEY="reading config file: open \"/opt/myservice/config/myservice.yaml\": no such file or directory"
Sep 16 14:11:05 centos7 systemd: myservice.service: main process exited, code=exited, status=1/FAILURE
The major hint here is: no such file or directory on the given configuration file. The file is definitely there, so what is the problem?
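After a second look at the error message and the path of the config yaml I finally saw it: the path in the error message is wrapped in (escaped) double-quotes! The backslashes are most likely just the logger escaping quote characters inside its own quoted message, which means Systemd seems to have passed the double-quotes on to the process as a literal part of the file name.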
Could that be true? Let's try it out and modify the systemd service unit file - to use the path to the configuration file without double-quotes:
[root@centos7 ~]# diff /usr/lib/systemd/system/myservice.service /tmp/myservice.service
12c12
< --config.file=/opt/myservice/config/myservice.yaml \
---
> --config.file="/opt/myservice/config/myservice.yaml" \
Tell Systemd that the unit file has changed:
[root@centos7 ~]# systemctl daemon-reload
And now let's do another attempt to start the service using systemctl:
[root@centos7 ~]# systemctl start myservice
[root@centos7 ~]# systemctl status myservice
myservice.service - SFS Super Fancy Service.
Loaded: loaded (/usr/lib/systemd/system/myservice.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2025-09-16 14:38:05 CEST; 2s ago
Main PID: 2408 (myservice)
CGroup: /system.slice/myservice.service
|- 2408 /opt/myservice/bin/myservice --config.file=/opt/myservice/config/myservice.yaml --web.listen-address=127.0.0.1:9427 --log.level=debug
Eureka! The service successfully started!
The service start problem is therefore definitely caused by the quotes around the path of the configuration yaml, inside the Systemd service unit file!
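This also explains why the manual start from bash worked: a shell interprets and removes the quotes before the process ever sees its arguments, while the service on EL 7 apparently received them literally. The sketch below only illustrates that difference. It is not Systemd's parser; plain whitespace splitting stands in for the behaviour observed on EL 7, and all function names are made up for this example.

```python
import os
import shlex
import tempfile

# The argument exactly as written in the unit file.
UNIT_ARGUMENT = '--config.file="/opt/myservice/config/myservice.yaml"'


def shell_like_words(command):
    """Split like a POSIX shell: quotes are interpreted and removed."""
    return shlex.split(command)


def literal_words(command):
    """Split on whitespace only: quote characters stay part of the word.

    Stand-in for the observed EL 7 behaviour, not Systemd's real parser.
    """
    return command.split()


def config_path(word):
    """Return the VALUE part of a --config.file=VALUE argument."""
    return word.split("=", 1)[1]


def path_exists_with_and_without_quotes():
    """Create a real file, then look it up by its plain and its quoted name."""
    with tempfile.TemporaryDirectory() as tmp:
        real = os.path.join(tmp, "myservice.yaml")
        open(real, "w").close()
        quoted = '"' + real + '"'
        return os.path.exists(real), os.path.exists(quoted)
```

With shell_like_words the process would get the plain path, with literal_words the path still carries both quote characters, and a file name including those quotes does not exist on disk. That is consistent with the "no such file or directory" error in the log.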
But as mentioned before, the same service unit file was working fine on the other (newer) EL releases. This was successfully verified on Rocky Linux 8, 9 and 10.
Let's have a look at the Systemd versions on each Enterprise Linux release:
[ck@centos7 ~]$ systemd-analyze --version
systemd 219
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN
ck@rocky8 ~ $ systemd-analyze --version
systemd 239 (239-74.el8_8.5)
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy
ck@rocky9 ~ $ systemd-analyze --version
systemd 252 (252-51.el9_6.1)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified
ck@rocky10 ~ # systemd-analyze --version
systemd 257 (257-9.el10_0.1-g27e50c7)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +IPE +SMACK +SECCOMP -GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBCRYPTSETUP_PLUGINS +LIBFDISK +PCRE2 +PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +BTF +XKBCOMMON +UTMP +SYSVINIT +LIBARCHIVE
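To compare many hosts at a glance, the version number can be pulled out of the first line of that output. A small sketch, assuming the first line always has the form "systemd NNN" as in the four outputs above; the function name is invented for this example.

```python
import re

# First lines of the four outputs shown above.
SAMPLES = {
    "centos7": "systemd 219",
    "rocky8": "systemd 239 (239-74.el8_8.5)",
    "rocky9": "systemd 252 (252-51.el9_6.1)",
    "rocky10": "systemd 257 (257-9.el10_0.1-g27e50c7)",
}


def systemd_version(output):
    """Extract the numeric version from `systemd-analyze --version` output."""
    match = re.search(r"^systemd (\d+)", output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no systemd version line found")
    return int(match.group(1))


def versions_by_host(samples):
    """Map each host name to its parsed Systemd version."""
    return {host: systemd_version(text) for host, text in samples.items()}
```

Feeding the real command output (for example collected over SSH) into versions_by_host gives a simple table of which hosts run which Systemd release.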
It seems that this double-quote issue has been fixed in newer releases; at least Systemd v239 and later are not affected. I went through the official changelogs and release notes of Systemd, but unfortunately I could not find a relevant entry or bug fix that would show in which version exactly this behaviour changed.
But as the path doesn't need double-quotes, the fix (removing the quotes around the file path) works for all Systemd versions and therefore all Linux distribution releases.
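Because the pipeline cannot start the services inside Docker containers, a static check of the unit files is one way to catch this earlier. The following sketch flags words in Exec* lines where a quote shows up after the word has already begun, which is the pattern that failed on EL 7. It is a simplified scanner written for this article, not Systemd's own parsing logic, and the function names are invented.

```python
# Excerpt of the original unit file, including the line continuations.
UNIT_TEXT = r"""[Service]
EnvironmentFile=/opt/myservice/config/myservice.env
ExecStart=/opt/myservice/bin/myservice \
--config.file="/opt/myservice/config/myservice.yaml" \
--web.listen-address=${WEB_ADDR} \
--log.level=${LOG_LEVEL}
Restart=on-failure
"""


def joined_lines(unit_text):
    """Yield logical lines, merging lines continued with a trailing backslash."""
    buffer = ""
    for raw in unit_text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            buffer += line[:-1] + " "
            continue
        yield buffer + line
        buffer = ""
    if buffer:
        yield buffer


def embedded_quote_words(command):
    """Return words in which a quote appears after the word has started."""
    findings = []
    word, quote, flagged = "", None, False
    for ch in command + " ":
        if quote:                      # inside a word that began with a quote
            word += ch
            if ch == quote:
                quote = None
        elif ch.isspace():             # word boundary
            if word and flagged:
                findings.append(word)
            word, flagged = "", False
        elif ch in "\"'":
            if word:                   # quote in the middle of a word
                flagged = True
            else:                      # quote opens the word: fine
                quote = ch
            word += ch
        else:
            word += ch
    return findings


def lint_unit(unit_text):
    """Collect suspicious words from all Exec* settings of a unit file."""
    findings = []
    for line in joined_lines(unit_text):
        key, sep, value = line.partition("=")
        if sep and key.strip().startswith("Exec"):
            findings.extend(embedded_quote_words(value))
    return findings
```

Run against the original unit file, lint_unit reports the --config.file argument; after the quotes are removed it reports nothing. A pipeline job could fail the build whenever the returned list is not empty.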