ISC's dhcpd uses this code to check for an already-running daemon:

    /* Read previous pid file. */
    if ((i = open (path_dhcpd_pid, O_RDONLY)) >= 0) {
        status = read (i, pbuf, (sizeof pbuf) - 1);
        close (i);
        if (status > 0) {
            pbuf [status] = 0;
            pid = atoi (pbuf);

            /* If the previous server process is not still running,
               write a new pid file immediately. */
            if (pid && (pid == getpid() || kill (pid, 0) < 0)) {
                unlink (path_dhcpd_pid);
                if ((i = open (path_dhcpd_pid,
                               O_WRONLY | O_CREAT, 0644)) >= 0) {
                    sprintf (pbuf, "%d\n", (int)getpid ());
                    write (i, pbuf, strlen (pbuf));
                    close (i);
                    pidfilewritten = 1;
                }
            } else
                log_fatal ("There's already a DHCP server running.");
        }
    }

The problem with this strategy is that, if the box dies, there's a stale
pid file left in /var/run/dhcpd.pid. This wouldn't be so bad -- the code
above checks [using kill(pid, 0)] to see if there's a process running
with that pid. But when the box is restarting, there will be a bunch of
processes all starting in similar sequence each time. So on one boot,
you might see dhcpd with a pid of 1001 and ntpd with a pid of 1002. If
the box dies violently (e.g. power cut), the dhcpd pid file will contain
1001. On the second boot, assume ntpd starts first and gets a pid of
1001 and dhcpd is 1002. Now, the kill(pid, 0) will succeed, making it
appear that dhcpd is already running, and dhcpd will exit.
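
To make the failure concrete, here is a minimal sketch of the liveness test in isolation. The helper name stale_pid_looks_alive is hypothetical (it is not in dhcpd), and it simplifies the excerpt above by treating a zero or negative pid as "not running". The point is that nothing in it asks which program owns the pid:

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper mirroring the check in the dhcpd excerpt:
 * a pid read from the pid file counts as "still running" when it is
 * not our own pid and kill(pid, 0) succeeds. Signal 0 delivers
 * nothing; it only reports whether we could signal that pid. */
static int stale_pid_looks_alive(pid_t stored_pid)
{
    if (stored_pid <= 0)
        return 0;               /* simplification: ignore bogus pids */
    if (stored_pid == getpid())
        return 0;               /* the file names us, so it is stale */
    return kill(stored_pid, 0) == 0;
}
```

If ntpd now happens to own pid 1001, stale_pid_looks_alive(1001) returns 1 when called as root, exactly as it would if the old dhcpd were still there.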
How to fix this?
- Explicitly put the pid file under /tmp, which is normally emptied at boot (check that this holds on your system). Getting this right is fussy -- make sure you avoid the race conditions associated with creating temp files. Use dhcpd's "-pf" flag to tell it where to put the pid file. This avoids spurious "already running" messages, because after a reboot dhcpd will never read a pid from a leftover pid file. [You could also just remove the /var/run/dhcpd.pid file before starting dhcpd, but I'd rather explicitly provide the path in my startup script in case some dim bulb decides to change the compiled-in default.]
- Be careful in your restart code to kill any existing dhcpd (assuming
you really want a new dhcpd), or avoid trying to start a new one
(assuming you want to use an already running dhcpd).
pgrep(1) and pkill(1) will be useful here.
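
As an illustration of the race-condition warning in the first option, here is a rough sketch of creating a file under /tmp safely. The helper name write_pidfile_exclusive is made up for this example, and this is not how dhcpd itself opens its pid file (the excerpt above uses O_WRONLY | O_CREAT without O_EXCL). It only shows the kind of exclusive create I have in mind:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` and write our pid into it.
 * With O_CREAT | O_EXCL, open() fails with EEXIST if anything,
 * including a symlink, already exists at that name, so we never
 * write through a file someone else planted in /tmp.
 * Returns 0 on success, -1 on failure with errno set. */
static int write_pidfile_exclusive(const char *path)
{
    char buf[32];
    int fd, len;

    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (write(fd, buf, (size_t)len) != (ssize_t)len) {
        int saved = errno;      /* keep the write error for the caller */
        close(fd);
        unlink(path);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

A second call with the same path fails rather than silently reusing the file, which is the behaviour you want in a world-writable directory.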
In researching this, I saw this bit of wisdom from Henning Brauer: "pid files are useless."
I heartily agree...