Check DNS Reverse Lookups (check-dns)
September 16, 2010
Issue
OpenNMS stores DNS reverse lookups for the IP addresses it discovers (iphostname column of the ipInterface table). It appears that OpenNMS only performs the DNS reverse lookup once and stores the result. Therefore, OpenNMS may never be aware of a DNS change. The DNS reverse lookup is visible in the OpenNMS web interface on the node page as “IP Host Name”. An outdated entry here can cause confusion. A way is needed to keep this information up to date.
More serious problems can occur when OpenNMS uses DNS reverse lookups for node labels (see How are node labels determined? for more information). In the worst case, a node can have the wrong name and outages will appear to be on a different node than they actually are. Here’s one way this can happen:
- OpenNMS discovers a node with IP address 192.168.1.20 and DNS reverse lookup of server20. It sets the node label to server20. This is correct and everything is fine.
- Now the IP address 192.168.1.10 is added to server20. 192.168.1.10 has the reverse DNS lookup of devserver10 since it was just removed from a server with that name. On the next rescan OpenNMS sees the new IP and changes the label of server20 to devserver10! This follows the node naming convention described in the link above. There are now two nodes named devserver10.
- Shortly after, the DNS administrator updates the DNS reverse lookup of 192.168.1.10 to server20. OpenNMS has already stored the old name and it never updates it. If the OpenNMS administrator was not involved then there is no reason anyone would know something is wrong.
- At some point in the future a failure on 192.168.1.10 or 192.168.1.20 will cause an outage to be created with the node label devserver10 but it should be server20!
The above scenario may seem obscure, but it happened to me twice before I found the solution below.
Solution
For each IP address in the OpenNMS database that has an associated name, the check-dns script compares the stored name against a current DNS reverse lookup. Mismatches can optionally be removed from the database, which causes OpenNMS to refresh the information on the next node scan. The script can be automated and run on a regular basis.
Synopsis
$NOCBASE/bin/check-dns [ -y | -n ]
Description
Without options, the script scans the iphostname column of the ipInterface table in the OpenNMS database and finds any entries that do not match the current DNS reverse lookup. For each mismatch, the script asks the user whether to remove the entry from the database. Removal is done by setting the name equal to the IP address (iphostname=ipaddr). The next time OpenNMS scans the associated node, a DNS reverse lookup will be performed to update the name (iphostname).
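The comparison and reset described above could be sketched roughly as follows. This is not the actual check-dns source: the function names (normalize_name, is_mismatch, reset_sql) are made up for illustration, and treating names as equal regardless of case or a trailing dot is an assumption about what should count as a match.

```shell
# Hypothetical helpers; names are illustrative, not taken from check-dns.

# Lowercase a DNS name and strip one trailing dot so that
# "Server20.Example.COM." and "server20.example.com" compare equal.
normalize_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# Succeed (exit 0) when the stored name and the live DNS name differ.
is_mismatch() {
    [ "$(normalize_name "$1")" != "$(normalize_name "$2")" ]
}

# Print the SQL that resets the stored name to the IP address
# (iphostname=ipaddr). Only dotted-quad IPv4 input is accepted, so
# nothing unexpected can end up inside the SQL string.
reset_sql() {
    case "$1" in
        ""|*[!0-9.]*)
            echo "reset_sql: not an IPv4 address: $1" >&2
            return 1 ;;
    esac
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || {
        echo "reset_sql: not an IPv4 address: $1" >&2
        return 1
    }
    printf "UPDATE ipInterface SET iphostname = ipaddr WHERE ipaddr = '%s';\n" "$1"
}
```

The generated statement could then be piped to psql, for example `reset_sql 192.168.1.124 | psql -U opennms opennms`, assuming the database and its user are both named opennms on your install.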
If the -y or -n option is used, every prompt is answered automatically: -y answers yes (remove the mismatched name from the database) and -n answers no (leave the mismatched name in the database).
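One way the -y / -n behaviour could be wired up is sketched below. Again, this is an illustration and not the real script: parse_answer_flag, confirm_removal, and the ANSWER variable are hypothetical names. If both flags are given, this sketch simply lets the last one win.

```shell
# Hypothetical sketch of -y / -n handling.
# Sets ANSWER to "y", "n", or "" (empty means: prompt for every mismatch).
parse_answer_flag() {
    OPTIND=1      # restart option parsing on every call
    ANSWER=""
    while getopts "yn" opt; do
        case "$opt" in
            y) ANSWER="y" ;;
            n) ANSWER="n" ;;
            *) echo "usage: check-dns [ -y | -n ]" >&2; return 2 ;;
        esac
    done
}

# Return 0 if the mismatched name should be removed, 1 otherwise.
# Falls back to an interactive prompt when no flag was given.
confirm_removal() {
    reply="$ANSWER"
    if [ -z "$reply" ]; then
        printf 'Remove name from DB so OpenNMS will refresh on next scan? [y/n] > '
        read -r reply
    fi
    [ "$reply" = "y" ]
}
```

A main loop would call `parse_answer_flag "$@"` once at startup and then `confirm_removal` for each mismatch it finds.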
Implementation
Prerequisites
- Setting up Scripts
- The nslookup command must be installed and able to complete DNS reverse lookups for IP addresses in the OpenNMS database.
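To confirm the nslookup prerequisite and see what a reverse lookup returns, something like the following could be used. It is a sketch under assumptions: the helper names are hypothetical, and it assumes nslookup prints PTR answers as lines containing `name = host.example.com.`, which is the usual format but may differ on older versions. Lines containing `canonical name =` are skipped, since nslookup prints those when the reverse zone is delegated through a CNAME and they are not the host name itself.

```shell
# Hypothetical helpers for the nslookup prerequisite; names are illustrative.

# Fail early with a clear message if nslookup is not on the PATH.
require_nslookup() {
    command -v nslookup >/dev/null 2>&1 || {
        echo "check-dns: nslookup not found; install it first" >&2
        return 1
    }
}

# Read nslookup output on stdin and print the first PTR target,
# without the trailing dot. Prints nothing if there is no PTR record.
parse_ptr() {
    awk '
        /canonical name = / { next }
        /name = /           { sub(/\.$/, "", $NF); print $NF; exit }
    '
}

# Reverse lookup for one IP address; prints the name or nothing.
reverse_lookup() {
    nslookup "$1" 2>/dev/null | parse_ptr
}
```

Running `require_nslookup && reverse_lookup 192.168.1.20` against a few addresses from the database is a quick way to check that the DNS server answers reverse queries before installing the script.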
Install
Download the check-dns file and copy it to $NOCBASE/bin/check-dns. Make sure to enable the execute bit with chmod as shown below.
/bin/bash
source /etc/noc.conf
cd $NOCBASE/bin/
wget http://opennms.dougbakewell.ca/downloads/bin/check-dns
chmod a+x $NOCBASE/bin/check-dns
Example
[doug@ubuntu ~]$ ./check-dns
Checking 393 reverse DNS names that OpenNMS has stored in it's database.
Mismatch Found
IP: 192.168.1.129
DB Name: server1.dougbakewell.ca
DNS Name: 129.128-255.1.168.192.in-addr.arpa
Remove name from DB so OpenNMS will refresh on next scan? [y/n] > y
Removing server1.dougbakewell.ca from Database where IP is 192.168.1.129.
Mismatch Found
IP: 192.168.1.124
DB Name: dev.dougbakewell.ca
DNS Name: olddev.dougbakewell.ca
Remove name from DB so OpenNMS will refresh on next scan? [y/n] > y
Removing dev.dougbakewell.ca from Database where IP is 192.168.1.124.
Mismatch Found
IP: 192.168.1.214
DB Name: dev2.test.dougbakewell.ca
DNS Name: dev2.dougbakewell.ca
Remove name from DB so OpenNMS will refresh on next scan? [y/n] > y
Removing dev2.test.dougbakewell.ca from Database where IP is 192.168.1.214.
Done.
[doug@ubuntu ~]$
Automate
Once you trust the script is doing what you expect, it can be run as a cron job with the -y option. I run this script every Friday at 22:00 to ensure the DNS names are up to date.
0 22 * * 5 <your username or root> /opt/noc/bin/check-dns -y > /dev/null 2>&1