Indexing Attribute Values
OpenDJ provides several indexing schemes to speed up searches.
When a client requests a directory search operation, the client sends
the server a filter expression such as
(&(uid=*jensen*)(l=Stavanger)). The server then uses
applicable indexes to find entries with attribute values likely to match
the search. If no indexes are applicable, then the server potentially has
to go through all entries to look for candidate matches.
Looking through all entries is resource-intensive for large directories.
For this reason, the unindexed-search privilege, allowing
users to request searches for which no applicable index exists, is reserved
for the directory root user by default.
Rather than granting the unindexed-search privilege
to more users and client applications, you configure indexes to correspond
to the searches that clients need to perform.
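To make the role of an index concrete, the following Python sketch models
an equality index as a map from attribute value to entry IDs. It is an
illustrative toy under stated assumptions, not a description of how OpenDJ
stores indexes internally. The sample entries and the names
build_equality_index and search are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample entries: entry ID -> attributes.
ENTRIES = {
    1: {"uid": "bjensen", "l": "Stavanger"},
    2: {"uid": "ajensen", "l": "Oslo"},
    3: {"uid": "charvey", "l": "Stavanger"},
}


def build_equality_index(entries, attr):
    """Map each lowercased value of attr to the set of entry IDs holding it."""
    index = defaultdict(set)
    for entry_id, attrs in entries.items():
        if attr in attrs:
            index[attrs[attr].lower()].add(entry_id)
    return index


def search(entries, attr, value, index=None):
    """Return (matching entry IDs, number of entries examined)."""
    if index is not None:
        candidates = index.get(value.lower(), set())
    else:
        candidates = entries.keys()  # unindexed: every entry is a candidate
    examined = 0
    matches = []
    for entry_id in sorted(candidates):
        examined += 1
        if entries[entry_id].get(attr, "").lower() == value.lower():
            matches.append(entry_id)
    return matches, examined


l_index = build_equality_index(ENTRIES, "l")
print(search(ENTRIES, "l", "Stavanger", l_index))  # ([1, 3], 2)
print(search(ENTRIES, "l", "Stavanger"))           # ([1, 3], 3)
```

Both calls find the same entries, but the unindexed call examines every
entry, which is the cost that grows with directory size.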
This chapter first describes index types, then demonstrates how to
index attribute values. This chapter also lists the default indexing
configuration for OpenDJ directory server.
Index Types & What Each Does
OpenDJ provides several different index types, each corresponding
to a different type of search.
Approximate Index
An approximate index is used to match values that "sound like" those
provided in the filter. An approximate index on cn
allows clients to find people even when they misspell names as in the
following example.
$ ldapsearch -b dc=example,dc=com "(cn~=Babs Jansen)" cn
dn: uid=bjensen,ou=People,dc=example,dc=com
cn: Barbara Jensen
cn: Babs Jensen
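The idea of "sounds like" matching can be sketched by reducing each word
to a phonetic key and comparing keys rather than raw values. The example
below uses a simplified Soundex purely as a stand-in. It is not claimed to
be the algorithm OpenDJ applies, and soundex and phonetic_keys are
hypothetical helper names.

```python
# Simplified Soundex digit groups (a stand-in phonetic algorithm).
GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def soundex(word):
    """Return a four-character phonetic key for a single word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != previous:
            key += code
        if ch not in "hw":  # h and w do not separate repeated codes
            previous = code
    return (key + "000")[:4]


def phonetic_keys(value):
    """Phonetic keys for each whitespace-separated word in value."""
    return [soundex(word) for word in value.split()]


print(phonetic_keys("Babs Jansen"))  # ['B120', 'J525']
print(phonetic_keys("Babs Jensen"))  # ['B120', 'J525']
```

Because the misspelled and the stored value reduce to the same keys, a
lookup keyed on phonetic codes finds the entry despite the spelling error.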
Equality Index
An equality index is used to match values that correspond exactly
(though generally without case sensitivity) to the value provided in
the search filter. An equality index requires clients to match values
without wildcards or misspellings.
$ ldapsearch -b dc=example,dc=com "(uid=bjensen)" mail
dn: uid=bjensen,ou=People,dc=example,dc=com
mail: bjensen@example.com
Ordering Index
An ordering index is used to match values for a filter that
specifies a range. The ds-sync-hist attribute has an ordering
index by default because searches on that attribute often seek entries
with changes more recent than the last time a search was performed.
The following example shows a search that specifies ranges.
$ ldapsearch -b dc=example,dc=com "(&(uidNumber>=1120)(roomNumber>=4500))" uid
dn: uid=charvey,ou=People,dc=example,dc=com
uid: charvey
dn: uid=eward,ou=People,dc=example,dc=com
uid: eward
dn: uid=mvaughan,ou=People,dc=example,dc=com
uid: mvaughan
dn: uid=pchassin,ou=People,dc=example,dc=com
uid: pchassin
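A range lookup benefits from keys kept in sorted order, because the server
can jump to the lower bound instead of comparing every value. The sketch
below illustrates that idea with Python's bisect module. The uidNumber
values, entry IDs, and the ids_at_least helper are hypothetical and do not
reflect OpenDJ's storage format.

```python
import bisect

# Hypothetical (uidNumber, entry ID) pairs, kept sorted by value.
ordering_index = sorted([(1076, 7), (1120, 3), (1098, 5), (1155, 9), (1130, 2)])
keys = [value for value, _ in ordering_index]


def ids_at_least(lower_bound):
    """Entry IDs whose indexed value is >= lower_bound."""
    start = bisect.bisect_left(keys, lower_bound)
    return [entry_id for _, entry_id in ordering_index[start:]]


print(ids_at_least(1120))  # [3, 2, 9]
```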
Presence Index
A presence index is used to match the fact that an attribute is
present on the entry, regardless of the value. The aci
attribute is indexed for presence by default to allow quick retrieval
of entries with ACIs.
$ ldapsearch -b dc=example,dc=com "(aci=*)" -
dn: dc=example,dc=com
dn: ou=People,dc=example,dc=com
Substring Index
A substring index is used to match values specified with wildcards
in the filter. Substring indexes can be expensive to maintain, especially
for large attribute values.
$ ldapsearch -b dc=example,dc=com "(cn=Barb*)" cn
dn: uid=bfrancis,ou=People,dc=example,dc=com
cn: Barbara Francis
dn: uid=bhal2,ou=People,dc=example,dc=com
cn: Barbara Hall
dn: uid=bjablons,ou=People,dc=example,dc=com
cn: Barbara Jablonski
dn: uid=bjensen,ou=People,dc=example,dc=com
cn: Barbara Jensen
cn: Babs Jensen
dn: uid=bmaddox,ou=People,dc=example,dc=com
cn: Barbara Maddox
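One way to see why substring indexes are expensive for large values is to
count the keys a single value can contribute. The sketch below assumes a
scheme in which every fixed-length window of the value becomes a key. The
window length of 6 and the substring_keys helper are assumptions chosen
for illustration, not a statement of OpenDJ's implementation.

```python
def substring_keys(value, length=6):
    """Return the set of substrings of up to `length` chars at each offset."""
    value = value.lower()
    return {value[start:start + length] for start in range(len(value))}


name_keys = substring_keys("Barbara Jensen")
print(len(name_keys))         # 14 keys for one 14-character value
print("jensen" in name_keys)  # True

# An equality index would hold a single key for the same value.
```

Under this scheme the number of keys grows with the length of the value,
so a long description attribute costs far more to index than a short uid.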
Virtual List View (Browsing) Index
VLV or browsing indexes are designed to help the server respond to
client applications that need virtual list view results, for example to
browse through a long list in a GUI. They also help the server respond
to clients that request server-side sorting of the search results.
VLV indexes correspond to particular searches. Configure your
VLV indexes using the Control Panel, and copy the command-line
equivalent from the Details pane for the operation, if necessary.
Configuring & Rebuilding Indexes
You modify index configurations using the dsconfig
command. The configuration changes take effect after you rebuild the
index according to the new configuration, using the
rebuild-index command.
Indexes are per directory backend rather than per suffix. To maintain
separate indexes for different suffixes on the same directory server, put
the suffixes in different backends.
Configuring a Standard Index
You can configure standard indexes from the Control Panel, and also
on the command line using the dsconfig command. After
you finish configuring the index, you must rebuild the index for the changes
to take effect.
Create a New Index
The following example creates a new substring index for
description.
$ dsconfig \
 -p 4444 \
 -h `hostname` \
 -D "cn=Directory Manager" \
 -w password \
 create-local-db-index \
 --backend-name userRoot \
 --index-name description \
 --set index-type:substring \
 -n
Configure an Approximate Index
The following example configures an approximate index for
cn (common name).
$ dsconfig \
 -p 4444 \
 -h `hostname` \
 -D "cn=Directory Manager" \
 -w password \
 set-local-db-index-prop \
 --backend-name userRoot \
 --index-name cn \
 --set index-type:approximate \
 -n
Configuring a Virtual List View Index
In the OpenDJ Control Panel, select Manage Indexes >
New VLV Index..., and then set up your VLV index using the New VLV
Index window.
After you finish configuring your index and click OK, the Control
Panel prompts you to make the additional changes necessary to complete the
VLV index configuration, and then to build the index.
Rebuilding Indexes
After you change an index configuration, or when you find that
an index is corrupt, you can rebuild the index. If you rebuild the index
while the server is online, then you must schedule the rebuild process
as a task.
Rebuild Index
The following example rebuilds the cn index
immediately with the server online.
$ rebuild-index \
 -p 4444 \
 -h `hostname` \
 -D "cn=Directory Manager" \
 -w password \
 -b dc=example,dc=com \
 -i cn \
 -t 0
Rebuild Index task 20110607171639867 scheduled to start Jun 7, 2011 5:16:39 PM
Changing Index Entry Limits
Indexing data makes sense when maintaining the index is quicker and
cheaper than searching through all entries.
As the number of entries in your directory grows, it can make sense
not to maintain indexes for particular values. For example, every entry
in the directory has the value top for the
objectClass attribute, so maintaining a list of entries
that match the filter (objectClass=top) is not a
reasonable use of resources. In a very large directory, the same can
be true for common values such as (givenName=John) and
(sn=Smith).
In an index, each index key points to a list of entries that
are candidates to match. For the objectClass index key
that corresponds to =top, the list of entries can
include every entry in the directory.
OpenDJ directory server therefore defines an index entry limit. When
the number of entries that an index key points to exceeds the index entry
limit, OpenDJ stops maintaining the list of entries for that index key.
The default index entry limit value is 4000, which is usually large
enough for all index keys except those in the objectClass index.
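The effect of the entry limit on a single index key can be sketched as
follows. This is a toy model under stated assumptions: the IndexKey class
and its methods are hypothetical, and a limit of 3 is used instead of 4000
only to keep the example short.

```python
class IndexKey:
    """Entry ID list for one index key, subject to an entry limit."""

    def __init__(self, limit=4000):
        self.limit = limit
        self.entry_ids = set()  # becomes None once the limit is exceeded

    def add(self, entry_id):
        if self.entry_ids is None:
            return  # the list is no longer maintained for this key
        self.entry_ids.add(entry_id)
        if len(self.entry_ids) > self.limit:
            self.entry_ids = None

    def candidates(self):
        """Return the entry ID set, or None if the key cannot narrow a search."""
        return self.entry_ids


top = IndexKey(limit=3)
for entry_id in range(1, 6):
    top.add(entry_id)
print(top.candidates())   # None

rare = IndexKey(limit=3)
rare.add(42)
print(rare.candidates())  # {42}
```

A key that has exceeded its limit no longer helps narrow a search, which
is why combining it with a more selective indexed filter matters.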
If you have clients performing searches with filters such as
(objectClass=person), you might suggest that they adjust
the search to be more specific, such as
(&(mail=username@maildomain.net)(objectClass=person)),
so that the server can use an index, in this case equality for mail, to
limit the number of candidate entries to check for matches.
You can change the index entry limit on a per-index basis.
Change Index Entry Limit
The following example changes the index entry limit for the
objectClass index, and then rebuilds the index for the
configuration change to take effect.
$ dsconfig \
 -p 4444 \
 -h `hostname` \
 -D "cn=Directory Manager" \
 -w password \
 set-local-db-index-prop \
 --backend-name userRoot \
 --index-name objectClass \
 --set index-entry-limit:5000 \
 -n
$ rebuild-index \
 -p 4444 \
 -h `hostname` \
 -D "cn=Directory Manager" \
 -w password \
 -b dc=example,dc=com \
 -i objectClass \
 -t 0
Rebuild Index task 20110607160349596 scheduled to start Jun 7, 2011 4:03:49 PM
Verifying Indexes
You can use the verify-index command to check that indexes
correspond to current directory data, and that they do not
contain errors.
Verify Index
The following example verifies the cn (common
name) index for completeness and for errors.
$ verify-index -b dc=example,dc=com -i cn --clean --countErrors
[07/Jun/2011:16:06:50 +0200] category=BACKEND severity=INFORMATION
msgID=9437595 msg=Local DB backend userRoot does not specify the number of
lock tables: defaulting to 97
[07/Jun/2011:16:06:50 +0200] category=BACKEND severity=INFORMATION
msgID=9437594 msg=Local DB backend userRoot does not specify the number of
cleaner threads: defaulting to 24 threads
[07/Jun/2011:16:06:51 +0200] category=JEB severity=NOTICE msgID=8847461
msg=Checked 1316 records and found 0 error(s) in 0 seconds
(average rate 2506.7/sec)
[07/Jun/2011:16:06:51 +0200] category=JEB severity=INFORMATION
msgID=8388710 msg=Number of records referencing more than one entry: 315
[07/Jun/2011:16:06:51 +0200] category=JEB severity=INFORMATION
msgID=8388711 msg=Number of records that exceed the entry limit: 0
[07/Jun/2011:16:06:51 +0200] category=JEB severity=INFORMATION
msgID=8388712 msg=Average number of entries referenced is 1.58/record
[07/Jun/2011:16:06:51 +0200] category=JEB severity=INFORMATION
msgID=8388713 msg=Maximum number of entries referenced by any
record is 32
Checking Indexes For a Search
When searching, you can improve performance by making sure your search
is indexed as you expect. One way of checking is to request the
debugsearchindex attribute in your results.
$ ldapsearch -p 1389 -b dc=example,dc=com "(uid=bjensen)" debugsearchindex
dn: cn=debugsearch
debugsearchindex: filter=(uid=bjensen)[INDEX:uid.equality][COUNT:1]
final=[COUNT:1]
When you request the debugsearchindex attribute,
instead of performing the search, OpenDJ returns debug information indicating
how it would process the search operation. In the example above, OpenDJ
uses the equality index for uid right away.
A less exact search requires more work from OpenDJ. In the following
example, OpenDJ finds no applicable index for the presence filter, and
would have to go through 160 candidate entries.
$ ldapsearch -p 1389 -b dc=example,dc=com "(uid=*)" debugsearchindex
dn: cn=debugsearch
debugsearchindex: filter=(uid=*)[NOT-INDEXED] scope=wholeSubtree[COUNT:160]
final=[COUNT:160]
By default, OpenDJ rejects unindexed searches when the number of
candidate entries goes beyond the search or look-through limit.
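When checking many filters, it can help to summarize the
debugsearchindex value programmatically. The following Python sketch
relies only on the bracketed markers visible in the two sample outputs
above. The full output format is not specified here, so treat the patterns
and the parse_debugsearchindex helper as assumptions.

```python
import re


def parse_debugsearchindex(value):
    """Summarize index usage from a debugsearchindex attribute value."""
    final = re.search(r"final=\[COUNT:(\d+)\]", value)
    return {
        "indexes": re.findall(r"\[INDEX:([^\]]+)\]", value),
        "unindexed": "[NOT-INDEXED]" in value,
        "final_count": int(final.group(1)) if final else None,
    }


indexed = parse_debugsearchindex(
    "filter=(uid=bjensen)[INDEX:uid.equality][COUNT:1] final=[COUNT:1]"
)
unindexed = parse_debugsearchindex(
    "filter=(uid=*)[NOT-INDEXED] scope=wholeSubtree[COUNT:160] final=[COUNT:160]"
)
print(indexed)
print(unindexed)
```

A result with unindexed set to True and a large final_count flags a filter
that deserves either a new index or a more specific search.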
Default Indexes
When you first install OpenDJ directory server and import your
data from LDIF, the following indexes are configured.
Default Indexes
Index             Approximate  Equality  Ordering  Presence  Substring  Entry Limit
aci               -            -         -         Yes       -          4000
cn                -            Yes       -         -         Yes        4000
dn2id             Non-configurable internal index
ds-sync-conflict  -            Yes       -         -         -          4000
ds-sync-hist      -            -         Yes       -         -          4000
entryUUID         -            Yes       -         -         -          4000
givenName         -            Yes       -         -         Yes        4000
id2children       Non-configurable internal index
id2subtree        Non-configurable internal index
mail              -            Yes       -         -         Yes        4000
member            -            Yes       -         -         -          4000
objectClass       -            Yes       -         -         -          4000
sn                -            Yes       -         -         Yes        4000
telephoneNumber   -            Yes       -         -         Yes        4000
uid               -            Yes       -         -         -          4000
uniqueMember      -            Yes       -         -         -          4000