System administrators want services to run continuously and to be started or restarted automatically after an outage, so that services and information do not become unavailable.
Linux services can be configured to self-heal through init systems, or service managers. Service managers start, restart, and terminate services according to specified commands, dependencies and runlevels.
The available solutions depend on the Linux distribution used and the default system manager that comes with it, as shown in the following list:
- System V is the older init system:
  - Debian 7 and earlier
  - Ubuntu 9.04 and earlier
  - CentOS 5 and earlier
- Upstart:
  - Ubuntu 9.10 to Ubuntu 14.10
  - CentOS 6
- systemd is the init system for the most recent distributions featured here:
  - Debian 8 and newer
  - Ubuntu 15.04 and newer
  - CentOS 7
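To find out which of these init systems a given host is running, a minimal sketch such as the following may help. The function name `init_system` is made up for this example; the check for the `/run/systemd/system` directory is the one documented for `sd_booted()`, and the fallback simply prints the name of the PID 1 process.

```shell
#!/bin/sh
# Report which init system is running as PID 1.
init_system() {
    # systemd creates this directory early at boot; its presence is the
    # check documented for sd_booted().
    if [ -d /run/systemd/system ]; then
        echo "systemd"
    elif [ -r /proc/1/comm ]; then
        # Otherwise fall back to the name of the PID 1 process,
        # typically "init" on System V or Upstart hosts.
        cat /proc/1/comm
    else
        echo "unknown"
    fi
}

init_system
```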
Systemd is a Linux system manager and the default manager for Debian distributions starting from Debian Jessie. In systemd, tasks are organized into units, each described by an individual configuration file. The most common units are services (.service), mount points (.mount), devices (.device), sockets (.socket), and timers (.timer).
For instance, the ssh.service unit is used to start the SSH daemon.
Systemd places each service in a dedicated control group (cgroup) named after the service. Modern kernels support process isolation and resource allocation based on cgroups.
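The cgroup a process belongs to can be read from `/proc/<pid>/cgroup` on Linux. The sketch below is illustrative only: the helper names `cgroup_unit` and `show_cgroup` are invented here, and the sample line is a hypothetical entry in the format used on a cgroup v1 host.

```shell
#!/bin/sh
# Print the last path component of a cgroup line. For a process started by
# a systemd service this is the unit name.
cgroup_unit() {
    printf '%s\n' "${1##*/}"
}

# Show the cgroup lines of a process (defaults to the current shell).
show_cgroup() {
    pid="${1:-$$}"
    if [ -r "/proc/$pid/cgroup" ]; then
        cat "/proc/$pid/cgroup"
    else
        echo "no cgroup information for PID $pid" >&2
        return 1
    fi
}

# Hypothetical sample line for a process of the SSH daemon.
cgroup_unit "1:name=systemd:/system.slice/ssh.service"
```

On a systemd host, `show_cgroup <pid>` with the PID of a service process should list a path ending in the service's unit name.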
To learn more about systemd, check the links at the end.
The following steps were performed on a 64-bit Debian 9 system.
The automatic installation via dcrinstall documented in Installing dcrd suggests that Decred executables could be installed in the /opt directory, that a symlink to /opt/decred could be created and that a regular user named dcrduser could be used to start dcrd. These settings are reflected in the following dcrd service configuration file. The installation via dcrinstall also installs dcrctl in the same structure.
Create a new file named dcrd.service and paste the following example into it:
    # dcrd termination depends on dcrctl to send the command via RPC.
    #
    # Author: Marcelo Martins (https://stakey.club)
    # dcrd reference doc:
    # https://docs.decred.org/advanced/manual-cli-install/
    #
    [Unit]
    Description=The full node blockchain server of Decred network
    Documentation=http://decred.org/
    After=network.target

    [Service]
    Type=simple
    ExecStart=/opt/decred/dcrd --addrindex
    ExecStop=/opt/decred/dcrctl stop
    TimeoutStopSec=5
    KillMode=mixed
    Restart=on-abnormal
    User=dcrduser
    Group=users

    [Install]
    WantedBy=multi-user.target
Read the explanation of each directive:
[Unit] Section Directives
Description: Used to describe the name and basic functionality of the unit.
Documentation: Location for a list of URIs for documentation. These can be either internally available man pages or web accessible URLs.
After: The units listed in this directive are started before the current unit. This only sets the ordering and does not imply a dependency relationship; if a dependency is needed, it must be established through the Requires directive.
[Install] Section Directives
WantedBy: Specifies a dependency between units. For example, if the current unit has the WantedBy=multi-user.target directive and is enabled, the system starts the current unit when it goes into multi-user mode, because there is a dependency relationship between Linux multi-user mode and the dcrd service.
[Service] Section Directives
Type: “simple”: The main process of the service is the one specified in ExecStart. This is the default if the Type directive is not set but ExecStart is. Other options include “forking”, for daemons that fork and let the parent process exit, and “oneshot”, which tells systemd to wait for a short-lived process to exit before starting follow-up units.
ExecStart: Indicates the full path and arguments of the command that starts the service. If the path is preceded by “-”, an exit code other than zero is accepted without marking the service as failed. Specify the path of your dcrd installation. The --addrindex option is needed to search for addresses using the dcrdata interface. If you don’t intend to use dcrdata you may remove the option.
ExecStop: Indicates the full path and arguments of the command used to stop the service. If no command is specified or the parameter is left out, the process is terminated immediately when the service is stopped. To terminate dcrd gracefully it is necessary to use dcrctl with the stop command. Specify the path of your dcrctl installation.
Restart: Configures whether the service shall be restarted when the service process exits, is killed, or a timeout is reached. The options are “no” (the default), “always”, “on-success”, “on-failure”, “on-abnormal”, “on-abort”, or “on-watchdog”.
Exit causes and the effect of the Restart setting on them:
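| Exit cause / Restart setting | no | always | on-success | on-failure | on-abnormal | on-abort | on-watchdog |
|---|---|---|---|---|---|---|---|
| Clean exit code or signal | | X | X | | | | |
| Unclean exit code | | X | | X | | | |
| Unclean signal | | X | | X | X | X | |
| Timeout | | X | | X | X | | |
| Watchdog | | X | | X | X | | X |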
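The restart rules from the systemd.service manual page can also be written as a small decision function, which may make the effect of Restart=on-abnormal easier to see. This is only a sketch of the documented behaviour, not how systemd implements it; the function name `should_restart` and the cause labels (`clean`, `unclean-exit`, `unclean-signal`, `timeout`, `watchdog`) are invented for this example.

```shell
#!/bin/sh
# should_restart SETTING CAUSE
# Returns 0 if a service with Restart=SETTING would be restarted after CAUSE.
# CAUSE is one of: clean, unclean-exit, unclean-signal, timeout, watchdog.
should_restart() {
    case "$1:$2" in
        always:*)                   return 0 ;;
        on-success:clean)           return 0 ;;
        on-failure:clean)           return 1 ;;
        on-failure:*)               return 0 ;;
        on-abnormal:unclean-signal) return 0 ;;
        on-abnormal:timeout)        return 0 ;;
        on-abnormal:watchdog)       return 0 ;;
        on-abort:unclean-signal)    return 0 ;;
        on-watchdog:watchdog)       return 0 ;;
        *)                          return 1 ;;   # includes Restart=no
    esac
}

# With Restart=on-abnormal, a crash by signal leads to a restart,
# while a non-zero exit code does not.
if should_restart on-abnormal unclean-signal; then
    echo "unclean signal: restart"
fi
if ! should_restart on-abnormal unclean-exit; then
    echo "unclean exit code: no restart"
fi
```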
TimeoutStopSec: Configures the time systemd will wait before marking the service as failed or forcefully terminating it when it wasn’t stopped by the command in ExecStop.
KillMode: Configures how processes of this unit should be terminated. The options are “control-group”, “process”, “mixed” and “none”.
- control-group: all remaining processes in the control group of this unit will be killed on unit stop (for services: after the stop command is executed, as configured with ExecStop).
- process: only the main process itself is killed.
- mixed: the SIGTERM signal is sent to the main process while the subsequent SIGKILL signal is sent to all remaining processes of the unit’s control group.
- none: no process is killed. In this case, only the stop command is executed on unit stop, and no process is killed otherwise. Processes remaining alive after stop are left in their control group, and the control group continues to exist after stop unless it is empty.
User: The user impersonated by the service. This parameter may be left out.
Group: The group impersonated by the service. This parameter may be left out.
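Since ExecStart and ExecStop must point to the real dcrd and dcrctl paths, a quick check of the unit file before installing it can catch typos. The following is a rough sketch under the assumption that each directive sits on its own line as `Key=value`; the helper names `unit_get` and `check_unit_binaries` are hypothetical and not part of systemd.

```shell
#!/bin/sh
# Print the value of a directive from a unit file.
# $1 = unit file, $2 = directive name (for example ExecStart).
unit_get() {
    sed -n "s/^[[:space:]]*$2=//p" "$1"
}

# Check that the command named in ExecStart and ExecStop is executable.
check_unit_binaries() {
    status=0
    for key in ExecStart ExecStop; do
        cmd=$(unit_get "$1" "$key")
        bin=${cmd%% *}   # first word: the binary path
        bin=${bin#-}     # drop the optional "-" prefix
        if [ -x "$bin" ]; then
            echo "$key: $bin ok"
        else
            echo "$key: '$bin' is not set or not executable"
            status=1
        fi
    done
    return "$status"
}

# Run the check only if the file from this guide is in the current directory.
if [ -f dcrd.service ]; then
    check_unit_binaries dcrd.service
fi
```

The account named in User can be checked in a similar way with `id "$(unit_get dcrd.service User)"`.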
2.1. Enabling the service
Move the dcrd.service file to the system folder that contains other service configuration files like this one. This step is required for the next command to locate the file and enable the service.
    $ sudo mv dcrd.service /etc/systemd/system/
    $ sudo systemctl enable dcrd.service
    Created symlink /etc/systemd/system/multi-user.target.wants/dcrd.service → /etc/systemd/system/dcrd.service.
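The "Created symlink" message shows what enabling amounts to for a unit with WantedBy=multi-user.target: a link placed in the target's `.wants` directory. The sketch below reproduces that effect in a scratch directory so it can be tried without touching the system; `sketch_enable` and `sketch_disable` are invented names, and the real `systemctl enable` does more than this (for example, it reloads the manager configuration).

```shell
#!/bin/sh
# sketch_enable ROOT UNIT TARGET: link UNIT into ROOT/TARGET.wants/,
# mirroring the symlink that "systemctl enable" reports for WantedBy=TARGET.
sketch_enable() {
    mkdir -p "$1/$3.wants"
    ln -s "$1/$2" "$1/$3.wants/$2"
}

# sketch_disable ROOT UNIT TARGET: remove the link again.
sketch_disable() {
    rm -f "$1/$3.wants/$2"
}

root=$(mktemp -d)          # scratch stand-in for /etc/systemd/system
: > "$root/dcrd.service"   # empty placeholder unit file
sketch_enable "$root" dcrd.service multi-user.target
readlink "$root/multi-user.target.wants/dcrd.service"
sketch_disable "$root" dcrd.service multi-user.target
rm -r "$root"
```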
Whenever the system is restarted, the dcrd service will be required by ‘multi-user’ mode (WantedBy) after starting the network service (After) and it will be started (ExecStart) with the predetermined user (User) and group (Group) privileges, if specified.
After the service is enabled using systemctl, it is already possible to verify its status:
    $ sudo systemctl status dcrd.service
    ● dcrd.service - The full node blockchain server of Decred network
       Loaded: loaded (/etc/systemd/system/dcrd.service; enabled; vendor preset: enabled)
       Active: inactive (dead)
         Docs: http://decred.org/
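For scripts, the same information is available in a more compact form through `systemctl is-enabled` and `systemctl is-active`. The helper below is a small sketch that combines both; the name `unit_summary` and the `enabled/active` output format are choices made for this example.

```shell
#!/bin/sh
# Summarize a unit as "<enabled-state>/<active-state>".
unit_summary() {
    enabled=$(systemctl is-enabled "$1" 2>/dev/null)
    active=$(systemctl is-active "$1" 2>/dev/null)
    echo "${enabled:-unknown}/${active:-unknown}"
}

# Only query systemd when systemctl is available on this host.
if command -v systemctl >/dev/null 2>&1; then
    unit_summary dcrd.service
fi
```

Right after enabling and before the first start, this would be expected to report the unit as enabled and inactive, matching the status output above.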
Start the service and verify that its status changed, as shown on the line starting with ‘Active: active (running)’:
    $ sudo systemctl start dcrd.service
    $ sudo systemctl status dcrd.service
    ● dcrd.service - The full node blockchain server of Decred network
       Loaded: loaded (/etc/systemd/system/dcrd.service; enabled; vendor preset: enabled)
       Active: active (running) since Fri 2018-05-04 21:36:47 WEST; 2s ago
         Docs: http://decred.org/
     Main PID: 18525 (dcrd)
        Tasks: 8 (limit: 19660)
       CGroup: /system.slice/dcrd.service
               └─18525 /opt/decred/dcrd
    May 04 21:36:47 fullnode systemd: Started The full node blockchain server of Decred network.
    May 04 21:36:47 fullnode dcrd: 2018-05-04 21:36:47.655 [INF] DCRD: Version 1.1.2+release (Go version go1.9.2)
    May 04 21:36:47 fullnode dcrd: 2018-05-04 21:36:47.689 [INF] DCRD: Home dir: /home/dcrduser/.dcrd
    May 04 21:36:47 fullnode dcrd: 2018-05-04 21:36:47.699 [INF] DCRD: Loading block database from '/opt/blockchain/dcrd_data/mainnet/blockchain'
    May 04 21:36:48 fullnode dcrd: 2018-05-04 21:36:48.229 [INF] DCRD: Block database loaded
    May 04 21:36:48 fullnode dcrd: 2018-05-04 21:36:48.253 [INF] INDX: Exists address index is enabled
    May 04 21:36:49 fullnode dcrd: 2018-05-04 21:36:49.107 [INF] STKE: Stake database version 1 loaded
    May 04 21:36:49 fullnode dcrd: 2018-05-04 21:36:49.184 [INF] CHAN: Blockchain database version 2 loaded
    May 04 21:36:49 fullnode dcrd: 2018-05-04 21:36:49.184 [INF] CHAN: Chain state: height 232705, hash 00000000000000000b831418b8beff661c
To confirm that the dcrd service is running as the regular user:
    $ ps aux | grep dcrd
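The same check can be scripted by looking at the owner of the process directory under `/proc`. This is a Linux-specific sketch that assumes GNU `stat`; the function name `proc_uid` is made up for this example.

```shell
#!/bin/sh
# Print the numeric user ID that owns a running process (Linux /proc).
proc_uid() {
    stat -c %u "/proc/$1"
}

# Self-check: the current shell should be owned by the current user.
if [ "$(proc_uid $$)" = "$(id -u)" ]; then
    echo "PID $$ runs as UID $(id -u)"
fi

# For dcrd, the PID is the "Main PID" shown by systemctl status, and the
# expected value is the output of: id -u dcrduser
```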
Stop the service and verify that its status changed, as shown on the line starting with ‘Active: inactive (dead)’:
    $ sudo systemctl stop dcrd.service
    $ sudo systemctl status dcrd.service
    ● dcrd.service - The full node blockchain server of Decred network
       Loaded: loaded (/etc/systemd/system/dcrd.service; enabled; vendor preset: enabled)
       Active: inactive (dead) since Sat 2018-05-05 17:42:18 WEST; 1s ago
         Docs: http://decred.org/
      Process: 22041 ExecStop=/opt/decred/dcrctl stop (code=exited, status=0/SUCCESS)
      Process: 18525 ExecStart=/opt/decred/dcrd (code=exited, status=0/SUCCESS)
     Main PID: 18525 (code=exited, status=0/SUCCESS)
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.752 [WRN] SRVR: Server shutting down
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.759 [WRN] RPCS: RPC server shutting down
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.770 [INF] RPCS: RPC server shutdown complete
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.771 [INF] BMGR: Block manager shutting down
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.771 [INF] AMGR: Address manager shutting down
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.776 [INF] SRVR: Server shutdown complete
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.776 [INF] DCRD: Gracefully shutting down the database...
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.797 [INF] DCRD: Received signal (terminated). Already shutting down...
    May 05 17:42:18 fullnode dcrd: 2018-05-05 17:42:18.812 [INF] DCRD: Shutdown complete
    May 05 17:42:18 fullnode systemd: Stopped The full node blockchain server of Decred network.
If dcrd is terminated right after it is started, while it is still loading the blockchain, a timeout may occur at termination, causing the command sudo systemctl stop dcrd.service to report that dcrd failed. This is because dcrd is briefly unresponsive while loading the blockchain and does not answer RPC commands. It then does not process the command sent by dcrctl and is terminated via SIGKILL by the system.
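One way to avoid stopping dcrd before it answers RPC calls is to wait until a harmless RPC command succeeds. The retry helper below is a generic sketch; `wait_until` is an invented name, and using `dcrctl getblockcount` as the probe is an assumption that presumes dcrctl is already configured with working RPC credentials for the calling user.

```shell
#!/bin/sh
# wait_until ATTEMPTS COMMAND...: run COMMAND up to ATTEMPTS times, one
# second apart, and return 0 as soon as it succeeds.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        # Pause between attempts, but not after the last one.
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Assumed usage: wait up to about 60 seconds for dcrd to answer, then stop it.
#   wait_until 60 /opt/decred/dcrctl getblockcount && sudo systemctl stop dcrd.service
```

Raising TimeoutStopSec in the unit file is another option if the default of this guide (5 seconds) proves too short.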
To make changes to the dcrd.service file, disable the service first and re-enable it after saving the changes, so that systemd reloads the unit file.
    $ sudo systemctl disable dcrd.service
    (...)
    $ sudo systemctl enable dcrd.service
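As an alternative sketch, systemd can be told to reread its unit files with `systemctl daemon-reload`, followed by a restart of the service so that a running dcrd picks up the new settings. The wrapper name `apply_unit_changes` is made up for this example, and the calls need root privileges.

```shell
#!/bin/sh
# Reload unit files and restart one unit so edited settings take effect.
# Needs root privileges when run against the system manager.
apply_unit_changes() {
    systemctl daemon-reload && systemctl restart "$1"
}

# Usage (as root, or prefix the systemctl calls with sudo):
#   apply_unit_changes dcrd.service
```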