Introduction To LDAP

By Brad Marshall
bradm@linux.com
October 2000

So, you've heard about LDAP and want to know what it is. Maybe your boss has asked you to LDAP-enable a mission-critical application, or maybe you just want to keep up with all these acronyms flying about. This article will start at the basics, covering exactly what LDAP is and what it can be used for, without getting bogged down in too much technical detail. There's a lot to learn, so here we go.

LDAP stands for Lightweight Directory Access Protocol, and is defined in RFC 1777. In its most basic form it is a protocol for accessing directories. It was originally designed to act as a gateway to other directories such as X.500, but more recently has evolved into a fully fledged directory of its own.

Now, a directory is a lot like the everyday directories we use in real life - a telephone book, an address book, or even a restaurant menu. It stores information about an item in attributes, with a special piece of information to uniquely identify it. LDAP stores these entries in a hierarchical structure (called the Directory Information Tree or DIT), based on the unique identifier (which is called the Distinguished Name or DN).
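
To make the relationship between a DN and the DIT a little more concrete, here is a minimal Python sketch that splits a DN into its components and walks up the tree from an entry towards the root. The DN is the one from the LDIF example later in this article. The helper names split_dn and parent_dn are made up for illustration, and the code assumes no value contains an escaped comma, which real DNs are allowed to have.

```python
def split_dn(dn):
    """Split a DN into its relative components, leaf first.

    Simplification: assumes no escaped commas inside values.
    """
    return [part.strip() for part in dn.split(",")]


def parent_dn(dn):
    """Return the DN one level up the tree, or "" at the top."""
    return ",".join(split_dn(dn)[1:])


dn = "uid=bmarshal,ou=People,dc=pisoftware,dc=com"

# Walk from the entry up towards the root of the tree.
path = []
current = dn
while current:
    path.append(current)
    current = parent_dn(current)

# Print the path root-first, indented by depth. Not every level
# has to exist as a real entry on a given server.
for depth, name in enumerate(reversed(path)):
    print("  " * depth + name)
```

Each step to the right in a DN is a step up the tree, which is why an entry's position in the DIT can be read straight off its name.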

The information stored inside this directory can be just about anything, ranging from information for system authentication to a simple address book. One thing to remember, however, is that LDAP is designed to be read much, much more often than it's written to - it doesn't generally have the feature set of a higher-end database system, such as transactions or rollback. Another design goal of LDAP was that it didn't matter if there were temporary inconsistencies in the data across servers, as long as they got back into sync in a reasonable time.

LDAP uses a client-server model, with clients sending LDAP requests over TCP/IP to the server. Under OpenLDAP and related servers, there are two servers - slapd, the LDAP daemon, and slurpd, the replication daemon.

Slapd provides the server that clients talk to - queries are sent to it, and it talks to the backend database and returns the results to the client. It can have many backend databases, each covering a different part of the LDAP tree. There are also a few different options for what the database is - LDBM, the lightweight Berkeley database, is generally used, but there are a few others that are distributed with OpenLDAP.

To prevent unauthorised people from accessing and modifying data they shouldn't see, OpenLDAP also provides a rich variety of access control lists, or ACLs. Using these ACLs, you can restrict who can write or read individual attributes. This lets you, for example, permit only administrators and the user concerned to change a password, while stopping anybody else from seeing the crypted password.

Slurpd provides replication - replication is a way of taking the data from one server and pushing it out to one or more slave servers. By having multiple servers hosting the same data, you can increase reliability, scalability, and availability. It allows you to have servers close to where clients are, which decreases response time, and it removes a single point of failure. Having a separate daemon for replication frees slapd from worrying about hosts being down, so it can get on with its job of answering LDAP queries.

Under OpenLDAP, replication works in the following manner - slapd takes a modification, and performs it on the directory. This modification is then written out to a replication log file, which slurpd periodically reads in and parses. It then connects to all the defined LDAP slaves, and performs the same modification on their directories.
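
The flow just described can be sketched as a toy model in Python. This is not how slapd or slurpd are actually implemented; the Directory class, the modify and push_replog functions, and the in-memory list standing in for the replication log file are all invented for illustration.

```python
class Directory:
    """A toy directory: maps a DN to a dict of attributes."""

    def __init__(self):
        self.entries = {}

    def apply(self, change):
        dn, attr, value = change
        self.entries.setdefault(dn, {})[attr] = value


master = Directory()
replicas = [Directory(), Directory()]
replog = []  # stands in for the replication log file


def modify(change):
    """Master side: perform the change, then write it to the log."""
    master.apply(change)
    replog.append(change)


def push_replog():
    """Replication side: replay each logged change on every replica."""
    while replog:
        change = replog.pop(0)
        for replica in replicas:
            replica.apply(change)


modify(("uid=bmarshal,ou=People,dc=pisoftware,dc=com", "loginshell", "/bin/zsh"))

# Until the log is pushed, the replicas are temporarily out of date.
stale = replicas[0].entries == {}
push_replog()
in_sync = all(r.entries == master.entries for r in replicas)
print(stale, in_sync)
```

The window between modify and push_replog is the "temporary inconsistency" mentioned earlier: the master has already answered the client, and the replicas only catch up afterwards.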

To facilitate exchanging and modifying the data stored in the directory, LDIF was created. LDIF, which stands for LDAP Data Interchange Format, is a human-readable version of the information stored in the directory. Following is an example LDIF entry for a user account:


dn: uid=bmarshal,ou=People,dc=pisoftware,dc=com
uid: bmarshal
cn: Brad Marshall
objectclass: account
objectclass: posixAccount
objectclass: top
loginshell: /bin/bash
uidnumber: 500
gidnumber: 120
homedirectory: /mnt/home/bmarshal
gecos: Brad Marshall,,,,
userpassword: {crypt}KDnOoUYN7Neac

So, as you can see, each attribute is stored as a key-value pair on its own line, and attributes that have multiple values simply appear on multiple lines. This simple format is easy to read and can also be parsed easily by scripting languages, which makes bulk modifications straightforward.
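
As a rough illustration of how little code that parsing takes, here is a minimal Python sketch that reads part of the entry above into a dictionary of lists. The function name parse_ldif_entry is made up for this example, and the parser is deliberately naive: it does not handle comments, lines folded across several lines, or base64-encoded values, all of which real LDIF permits.

```python
from collections import defaultdict

LDIF_TEXT = """\
dn: uid=bmarshal,ou=People,dc=pisoftware,dc=com
uid: bmarshal
cn: Brad Marshall
objectclass: account
objectclass: posixAccount
objectclass: top
loginshell: /bin/bash
uidnumber: 500
"""


def parse_ldif_entry(text):
    """Parse one LDIF entry into {attribute: [values]}.

    Naive on purpose: no comments, folded lines or base64 values.
    """
    entry = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        entry[name.strip().lower()].append(value.strip())
    return dict(entry)


entry = parse_ldif_entry(LDIF_TEXT)
print(entry["dn"][0])
print(entry["objectclass"])
```

Storing every attribute as a list keeps single-valued and multi-valued attributes uniform, so objectclass with its three values needs no special handling.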

Now we come to possibly the most important part of the protocol - how to view or retrieve the data. The basic way this is done is with search filters, be they used with a command line utility or through a graphical interface. A search consists of criteria on attributes that must be fulfilled for an entry to be returned, and a base DN that the search is performed against. The standard for search filters is defined in RFC 1960: LDAP String Representation of Search Filters and RFC 2254: LDAPv3 Search Filters. There is a wide variety of operators that can be used, including & (and), | (or), ! (not), = (equality), ~= (approximate match), >= and <= (ordering), and * (wildcard, or on its own a test that the attribute is present).

So, for example, if we wanted to search for all users whose userid, which is stored in the uid attribute, starts with d and ends with l, the filter would be something like:

(&(uid=d*)(uid=*l))
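
To show what this filter means, here is a small Python sketch that applies the same logic to a handful of in-memory entries. It does not talk to an LDAP server or parse the filter string; the sample entries under dc=example,dc=com and the function name uid_filter are invented for illustration, and case handling is ignored.

```python
# Hypothetical sample entries; each attribute holds a list of values.
entries = [
    {"dn": "uid=daniel,ou=People,dc=example,dc=com", "uid": ["daniel"]},
    {"dn": "uid=dave,ou=People,dc=example,dc=com", "uid": ["dave"]},
    {"dn": "uid=bmarshal,ou=People,dc=example,dc=com", "uid": ["bmarshal"]},
]


def uid_filter(entry):
    """Python equivalent of (&(uid=d*)(uid=*l)) for one entry.

    Each part is true if any uid value satisfies it; the & operator
    then requires both parts to be true.
    """
    values = entry.get("uid", [])
    starts_with_d = any(v.startswith("d") for v in values)  # (uid=d*)
    ends_with_l = any(v.endswith("l") for v in values)      # (uid=*l)
    return starts_with_d and ends_with_l


hits = [e["dn"] for e in entries if uid_filter(e)]
print(hits)
```

Only the daniel entry is selected: dave fails the second condition and bmarshal fails the first.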

As you can imagine, filters like this can quickly be combined into a powerful method of retrieving data from a directory, and can be put to a wide variety of uses.