Friday, September 14, 2012

LINUX MANAGING /etc/hosts WITH LDAP

Managing the /etc/hosts File Across Multiple Servers With LDAP (Microsoft Windows Active Directory 2008 R2)

 :: PURPOSE ::


Managing the /etc/hosts file across more than a handful of servers can quickly become more trouble than it’s worth.  There are many solutions that could be leveraged to manage this, such as Puppet, Chef, or Cfengine.  A cron job tied to rsync could be made to work with little effort.  So could a SysAdmin with a lot of time on his hands who enjoys performing repetitive tasks better suited to shell scripts…

There are also many problems associated with trying to maintain identical yet separate copies of the /etc/hosts file across multiple servers.  With all of the solutions listed above, a change only reaches a server the next time the solution is scheduled to run, so servers stay out of date until then.

But, if you’re like me, and you are already leveraging an LDAP (Microsoft Windows 2008 R2 Active Directory) environment for user and group information as well as Kerberos authentication, why not leverage it to provide /etc/hosts information as well?  For starters, changes to /etc/hosts entries in LDAP are available across all servers as soon as they are made (barring replication times between Domain Controllers).  A solid second reason is that LDAP-provided hosts data can serve as a fail-safe in the event that your DNS becomes unavailable.  Or, you may have a situation like mine where you want only a couple of entries in the local /etc/hosts file, then the majority of static host entries available from LDAP, and then, finally, some critical servers and services provided by DNS.  You might also be operating in an environment with less than stellar DNS management and you’d like to circumvent a problem you cannot resolve directly…not that I’ve ever heard of such a place.

A word of thanks to Kar Ellen and Karellen’s Unix Blog for providing the clearest information on this problem.  This post is almost wholly inspired by a similar one on Kar Ellen's blog; I've just adapted it to Microsoft Windows Active Directory 2008 R2.  Lots of really good information there.

I'd like to point out that sometimes we do things simply because we can.  This is one of those cases.  Do I have a driving need to provide /etc/hosts information from LDAP?  No.  But it's cool, so I did it.  Also, the Active Directory admins are fond of telling me what can and cannot be done with LDAP.  I am fond of proving them wrong.

The best thing about this is that only those individual servers that are configured to query LDAP for hosts information will be affected.  So you can have one or two servers running this just to try it out without impacting name resolution for the other servers in your domain.

:: EXAMINE CURRENT /etc/hosts FILE ::


Let’s take a look at the current /etc/hosts file and see what we can work with.

As root first make a back-up of the existing /etc/hosts file:
# cp /etc/hosts /etc/hosts.backup
Now, cat the contents to see what we have:
# cat /etc/hosts
127.0.0.1       lux001.test.internal lux001 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.0.0.50 dc01.test.internal dc01 # ACTIVE DIRECTORY DOMAIN CONTROLLER
10.0.0.51 dc02.test.internal dc02 # ACTIVE DIRECTORY DOMAIN CONTROLLER
10.0.0.60 lux001.test.internal lux001 # TEST.INTERNAL RHEL 5
10.0.0.61 lux002.test.internal lux002 # TEST.INTERNAL RHEL 5 HOME DIRECTORY SERVER

Comment out the entries for lux001.test.internal and lux002.test.internal and save /etc/hosts:
# vim /etc/hosts
127.0.0.1       lux001.test.internal lux001 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.0.0.50 dc01.test.internal dc01 # ACTIVE DIRECTORY DOMAIN CONTROLLER
10.0.0.51 dc02.test.internal dc02 # ACTIVE DIRECTORY DOMAIN CONTROLLER
#10.0.0.60 lux001.test.internal lux001 # TEST.INTERNAL RHEL 5
#10.0.0.61 lux002.test.internal lux002 # TEST.INTERNAL RHEL 5 HOME DIRECTORY SERVER
 

:: CREATING THE LDIF FILE ::


We’re going to create the entries for lux001, lux001.test.internal, lux002, and lux002.test.internal in LDAP.  We’re also going to create a new OU called HOSTS to store these objects. 

NOTE:  You can either create an LDIF file to perform these steps or you can manually create the OU, then use ADSIEdit to create the device object and, finally, add the objectClass “ipHost” and description through the Attribute Editor tab of the new device object.  I have found that trying to add the objectClass “ipHost” while creating the initial object manually seems to fail.  I do not know why.  It does not fail when being created by LDIF. 

NOTE:  Please remove the top-most section of this LDIF file if you have manually created the OU into which you will be storing these device entries.

On your Windows Domain Controller, create a file called “hosts.ldif” and save it.
Edit the “hosts.ldif” file with Notepad and add the following (remember to change this to suit your requirements…):

dn: ou=HOSTS,ou=servers,dc=test,dc=internal
objectClass: organizationalUnit
objectClass: top
ou: HOSTS

dn: cn=lux001,ou=hosts,ou=servers,dc=test,dc=internal
ipHostNumber: 10.0.0.60
objectClass: top
objectClass: ipHost
objectClass: device
description: /ETC/HOSTS SERVER IP: 10.0.0.60
cn: lux001

dn: cn=lux001.test.internal,ou=hosts,ou=servers,dc=test,dc=internal
ipHostNumber: 10.0.0.60
objectClass: top
objectClass: ipHost
objectClass: device
description: /ETC/HOSTS SERVER IP: 10.0.0.60
cn: lux001.test.internal

dn: cn=lux002,ou=hosts,ou=servers,dc=test,dc=internal
ipHostNumber: 10.0.0.61
objectClass: top
objectClass: ipHost
objectClass: device
description: /ETC/HOSTS SERVER IP: 10.0.0.61
cn: lux002

dn: cn=lux002.test.internal,ou=hosts,ou=servers,dc=test,dc=internal
ipHostNumber: 10.0.0.61
objectClass: top
objectClass: ipHost
objectClass: device
description: /ETC/HOSTS SERVER IP: 10.0.0.61
cn: lux002.test.internal

NOTE:  If you want your HOSTS OU to be lowercase, or just not all uppercase, then you must change it to the appearance you want on both the “ou:” and “dn:” lines of the OU entry.
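
If you have more than a few hosts, typing these entries by hand gets old fast.  Here is a minimal sketch, not something from the original procedure, that turns hosts-format lines into entries shaped like the ones above.  The function name hosts_to_ldif and the BASE_DN variable are my own inventions for illustration, and the default BASE_DN simply assumes the example OU used in this post.

```shell
#!/bin/sh
# hosts_to_ldif: hypothetical helper (not part of any package).
# Reads hosts-format lines on stdin and prints one device/ipHost
# LDIF entry per name, mirroring the hand-written entries above.
# BASE_DN is an assumption; change it to match your own OU.
BASE_DN="${BASE_DN:-ou=hosts,ou=servers,dc=test,dc=internal}"

hosts_to_ldif() {
    awk -v base="$BASE_DN" '
        { sub(/#.*/, "") }      # drop comments (also drops commented-out lines)
        NF < 2 { next }         # skip blank lines and lines without a name
        {
            # Field 1 is the IP address; every later field is a name.
            for (i = 2; i <= NF; i++) {
                printf "dn: cn=%s,%s\n", $i, base
                printf "ipHostNumber: %s\n", $1
                print  "objectClass: top"
                print  "objectClass: ipHost"
                print  "objectClass: device"
                printf "description: /ETC/HOSTS SERVER IP: %s\n", $1
                printf "cn: %s\n\n", $i
            }
        }'
}
```

Something like grep '^10\.0\.0\.6' /etc/hosts.backup | hosts_to_ldif > hosts.ldif would generate the four entries for lux001 and lux002.  Filter out the loopback lines first, as shown, because a name that appears on more than one line would produce two entries with the same DN.  The OU entry at the top of the file still has to be added by hand.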

Process the LDIF from the Domain Controller as a user with administrator rights for the domain as follows:
<PATH TO HOST.LDIF>:\> ldifde -i -f hosts.ldif

The ldifde command is pretty good about telling you what part of the command failed; most of the time it’s a syntax error in your file.  Double-check your work, remove any objects that ldifde may have created before failing, and try running the command again.

Once it’s complete, you should see a new OU called “HOSTS” and the device objects you created in it.  If not, refresh Active Directory Users and Computers (ADUC) and they will appear.


The next steps will happen on the RHEL server.

:: CONFIGURING THE /etc/ldap.conf FILE ::


The first thing we’re going to do is modify our existing /etc/ldap.conf file to look to LDAP for hosts information.
As root, create a back-up of your existing /etc/ldap.conf file:
# cp /etc/ldap.conf /etc/ldap.conf.backup

Next, find, uncomment and update the following line to match your LDAP information:
# vim /etc/ldap.conf

nss_base_hosts          ou=servers,dc=test,dc=internal?sub?objectClass=device

Save and exit the /etc/ldap.conf file.
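
If you would rather script this edit than make it in vim on every server, here is a rough sketch of one way to do it.  The function name set_nss_base_hosts is hypothetical, and the sketch assumes the directive appears at most once in the file, either active or commented out; only the first matching line is rewritten, and the directive is appended if no line matches.

```shell
#!/bin/sh
# set_nss_base_hosts: hypothetical helper (not part of nss_ldap).
# Usage: set_nss_base_hosts FILE VALUE
# Rewrites the first nss_base_hosts line in FILE (commented out or not)
# to use VALUE, or appends the directive if FILE has no such line.
set_nss_base_hosts() {
    conf=$1
    value=$2
    tmp=$(mktemp) || return 1
    awk -v val="$value" '
        # Match the directive whether or not it is commented out.
        /^[ \t]*#?[ \t]*nss_base_hosts[ \t]/ && !seen {
            print "nss_base_hosts          " val
            seen = 1
            next
        }
        { print }
        END { if (!seen) print "nss_base_hosts          " val }
    ' "$conf" > "$tmp" && cat "$tmp" > "$conf"   # cat keeps the file mode and owner
    rm -f "$tmp"
}
```

For the example domain in this post the call would be set_nss_base_hosts /etc/ldap.conf 'ou=servers,dc=test,dc=internal?sub?objectClass=device'.  Take the back-up shown above first; the sketch does not make one for you.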

Check to see if the nscd daemon service is running…:

[root@lux001 /]# service nscd status
nscd (pid 13587) is running...


…invalidate its hosts cache, then stop the nscd service (nscd -i talks to the running daemon, so it has to be run before the stop):

[root@lux001 skel]# nscd -i hosts
[root@lux001 skel]# service nscd stop
Stopping nscd:                                             [  OK  ]

As long as the nscd service is running and caching host information (which it does by default), you will not be able to properly test LDAP host resolution.

Next, update the /etc/nsswitch.conf file to tell it to use LDAP as a source of hosts information.

As root, back-up the existing /etc/nsswitch.conf file:

# cp /etc/nsswitch.conf /etc/nsswitch.conf.backup

Next, update the following line:

# vim /etc/nsswitch.conf

hosts:      files ldap

Save and exit the /etc/nsswitch.conf file. 
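
A quick way to confirm what the resolver will actually consult, and in what order, is to print the active hosts line.  This is only a sketch; the function name hosts_sources is hypothetical, and it reads /etc/nsswitch.conf unless you pass another file.

```shell
#!/bin/sh
# hosts_sources: hypothetical helper (not part of glibc).
# Prints the sources listed on the first active "hosts:" line of an
# nsswitch.conf-style file.  Commented-out lines are ignored because
# their first field is "#hosts:" rather than "hosts:".
hosts_sources() {
    awk '$1 == "hosts:" {
        $1 = ""             # drop the database name
        sub(/^ +/, "")      # and the separator left behind
        print
        exit
    }' "${1:-/etc/nsswitch.conf}"
}
```

With the line shown above in place, hosts_sources prints "files ldap", which means /etc/hosts is consulted first and LDAP second.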

NOTE:  You can remove the “dns” entry if you are 100% sure that the name resolution for these servers will work as normal just using local files name resolution with /etc/hosts and LDAP.  I keep the “dns” entry in place to act as a final fall-back and for servers and services that I consider critical and dynamic.  For testing, though, to ensure that LDAP host resolution is working, remove “dns” as shown above.  Once you know that it is working with no “dns” entry in /etc/nsswitch.conf and no nscd service running, you can put “dns” back in at the end as you see fit.

NOTE:  If you previously had Microsoft Windows DNS entries for servers you no longer wish to be represented by DNS and only by LDAP hosts device entries, then make sure you remove them and their associated pointer (PTR) records as necessary.

:: TESTING LDAP HOST NAME RESOLUTION ::


Now that we have updated /etc/hosts, /etc/ldap.conf, and /etc/nsswitch.conf, stopped the nscd daemon service, invalidated the nscd daemon hosts cache, and ensured that no DNS entries exist for lux001, lux001.test.internal, lux002, or lux002.test.internal, we can test LDAP host name resolution from the RHEL server:

[root@lux001 /]# getent hosts | grep ^10.0      
10.0.0.60       lux001
10.0.0.60       lux001.test.internal
10.0.0.61       lux002
10.0.0.61       lux002.test.internal

You will notice that, unlike a traditional /etc/hosts file entry, we aren’t getting back a FQDN and a hostname per IP address line.  I consider this a minor trade-off for not having to maintain this in a /etc/hosts file across hundreds of servers.
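
If you miss the one-line-per-address view, you can get it back for display purposes by regrouping the getent output.  A minimal sketch, with the hypothetical function name group_by_ip; it only changes how the output is printed, not what is stored in LDAP.

```shell
#!/bin/sh
# group_by_ip: hypothetical display helper.
# Reads "IP name..." lines on stdin (such as getent hosts output) and
# prints one line per IP with all of its names, keeping the order in
# which each IP was first seen.
group_by_ip() {
    awk '
        NF >= 2 {
            if (!($1 in names)) order[++n] = $1    # remember first-seen order
            for (i = 2; i <= NF; i++) names[$1] = names[$1] " " $i
        }
        END { for (k = 1; k <= n; k++) print order[k] names[order[k]] }
    '
}
```

Fed the four lines above, as in getent hosts | grep ^10.0 | group_by_ip, it prints "10.0.0.60 lux001 lux001.test.internal" followed by "10.0.0.61 lux002 lux002.test.internal".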

And now a ping test against both lux001 and lux001.test.internal:

[root@lux001 /]# ping lux001 -c4
PING lux001 (10.0.0.60) 56(84) bytes of data.
64 bytes from lux001 (10.0.0.60): icmp_seq=1 ttl=62 time=1.12 ms
64 bytes from lux001 (10.0.0.60): icmp_seq=2 ttl=62 time=1.47 ms
64 bytes from lux001 (10.0.0.60): icmp_seq=3 ttl=62 time=1.38 ms
64 bytes from lux001 (10.0.0.60): icmp_seq=4 ttl=62 time=1.20 ms
--- lux001 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 1.122/1.298/1.475/0.139 ms

[root@lux001 /]# ping lux001.test.internal -c 4
PING lux001.test.internal (10.0.0.60) 56(84) bytes of data.
64 bytes from lux001.test.internal (10.0.0.60): icmp_seq=1 ttl=62 time=2.01 ms
64 bytes from lux001.test.internal (10.0.0.60): icmp_seq=2 ttl=62 time=1.54 ms
64 bytes from lux001.test.internal (10.0.0.60): icmp_seq=3 ttl=62 time=1.53 ms
64 bytes from lux001.test.internal (10.0.0.60): icmp_seq=4 ttl=62 time=1.58 ms
--- lux001.test.internal ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 1.534/1.668/2.013/0.203 ms

:: FINAL STEPS ::


Now that we are certain that LDAP name resolution works, you can turn the nscd service back on.
[root@lux001 /]# service nscd start
Starting nscd:                                             [  OK  ]

You can also add the “dns” option back to /etc/nsswitch.conf if you so desire:
# vim /etc/nsswitch.conf

hosts:      files ldap dns

This completes the process for managing the /etc/hosts files with LDAP (Microsoft Windows 2008 R2 Active Directory).

2 comments:

  1. if you want the fqdn and hostname per ip line like in etc hosts you can change your ldif to something like this:

    dn: cn=lux001.test.internal,ou=hosts,ou=servers,dc=test,dc=internal
    objectClass: top
    objectClass: iphost
    cn: lux001.test.internal
    cn: lux001
    ipHostNumber: 10.0.0.60

    thats how I have mine setup but I'm using dual openldap dedicated servers with pacemaker failover on linux instead of active directory

  2. Jacqueline,

    Thanks for the note. I tried it out and, unfortunately, no success. I will file this under "things that work in OpenLDAP and should work in AD but don't". It would have been nice to have one entry per server! AD throws an error about CN only taking one argument ... :

    Add error on entry starting on line 1: Constraint Violation
    The server side error is: 0x2081 Multiple values were specified for an attribute
    that can have only one value.
    The extended server error is:
    00002081: AtrErr: DSID-031513A5, #1:
    0: 00002081: DSID-031513A5, problem 1005 (CONSTRAINT_ATT_TYPE), data 0,
    Att 3 (cn)

    ... if anyone knows a solution, please let me know.

    Thanks again.
