mirror of https://github.com/OpenIdentityPlatform/OpenDJ.git

matthew_swift
13.02.2008 a95c9a7c31a17817c05c9112837b5106a7443d97
author matthew_swift <matthew_swift@localhost>
Wednesday, February 13, 2008 11:00 +0100
committer matthew_swift <matthew_swift@localhost>
Wednesday, February 13, 2008 11:00 +0100
commit a95c9a7c31a17817c05c9112837b5106a7443d97
tree 2d485c7cf760e40fdb7c562712d95972ea912688
parent 5020d258f47a89e57074bbe8e5fa9e760528e58f
Fix relating to issues 2813 and 2578. Make DN string representations more user-friendly when they contain non-ASCII characters.

This change is a flag day due to the potential for database format incompatibilities introduced by the change in DN normalized form.

Currently the DN and RDN implementations are very conservative regarding the string representation of DNs that they construct. Any non-ASCII characters are escaped using backslashes, one escaped hex pair per UTF-8 byte. For example, the DN:

uid=Météo.0,ou=People,dc=example,dc=com

Is encoded as:

uid=M\c3\a9t\c3\a9o.0,ou=People,dc=example,dc=com

This is not very readable in LDAP client applications. It is also much less space efficient, something we should consider if we wish to have non-western users of OpenDS who will be heavy users of multi-byte UTF-8 sequences. For example, a single Chinese character would be encoded in UTF-8 as 3 or 4 bytes IIRC, which would equate to 9-12 bytes once each byte is escaped, a 3X increase. This would have implications for database performance (substrings) and space efficiency.
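To make the size cost concrete, here is a minimal sketch of the conservative escaping described above. It is not the actual RDN.java code: the class and method names are hypothetical, and it only handles non-ASCII characters (it ignores the other characters that need escaping in a DN).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the OpenDS sources.
class DnEscapeSketch {

    // Writes each non-ASCII UTF-8 byte as a backslash followed by two hex
    // digits; ASCII bytes are copied through unchanged.
    static String escapeNonAscii(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0x80) == 0) {
                sb.append((char) b);
            } else {
                sb.append('\\').append(String.format("%02x", b & 0xff));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is the "e acute" used in the example DN above.
        String dn = "uid=M\u00e9t\u00e9o.0,ou=People,dc=example,dc=com";
        String escaped = escapeNonAscii(dn);
        System.out.println(escaped);

        // Compare the raw UTF-8 size with the escaped size.
        int rawBytes = dn.getBytes(StandardCharsets.UTF_8).length;
        System.out.println(rawBytes + " bytes raw, " + escaped.length() + " bytes escaped");
    }
}
```

Each escaped byte costs three characters instead of one, which is where the 3X figure comes from.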

The change is not without its minor problems, however:

1. LDIF cannot contain non-ASCII characters so any DNs or attribute
values must be base-64 encoded in order for the LDIF to be valid.
This is not very user-friendly, but it's easier for inquiring
users to decode base-64 than to manually decode UTF-8 byte
sequences. A future change could improve this behavior by making
our LDIF generation tools (e.g. ldapsearch, ldif-export) output
comments before each base-64 encoded DN / value containing the DN
/ value in the client's native character set. This is something
that OpenLDAP clients do and I think it is a nice usability feature.

2. The dn2id index and any DN / RDN syntax attribute indexes will be
potentially invalid due to the modified DN / RDN normalization
(hence this change is a flag-day).

3. DNs returned to LDAPv2 clients will potentially contain non-T.61
characters (LDAPv3 uses UTF-8 and LDAPv2 uses T.61). However, I
don't think we are bothered by this because we already break
compatibility for LDAPv2 clients for directory string based
attribute values which we also return using UTF-8.
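
The LDIF behavior in point 1, together with the suggested future improvement, could look roughly like the sketch below. This is an assumption-laden illustration rather than the actual LDIFWriter code: the class and method names are hypothetical, the safety check is a simplification of the LDIF rules, and line folding is omitted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration only; not taken from the OpenDS sources.
class LdifDnSketch {

    // Simplified check for whether a value can be written to LDIF as-is:
    // ASCII only, no NUL/CR/LF, no leading space, colon or '<', and no
    // trailing space.
    static boolean isLdifSafe(String s) {
        if (s.isEmpty()) {
            return true;
        }
        char first = s.charAt(0);
        if (first == ' ' || first == ':' || first == '<') {
            return false;
        }
        if (s.charAt(s.length() - 1) == ' ') {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == 0 || c == '\n' || c == '\r' || c > 127) {
                return false;
            }
        }
        return true;
    }

    // Returns "dn: <value>" for safe DNs. Otherwise returns a readable
    // comment line followed by the base-64 encoded "dn:: <value>" line.
    static String formatDnLines(String dn) {
        if (isLdifSafe(dn)) {
            return "dn: " + dn;
        }
        String b64 = Base64.getEncoder()
                .encodeToString(dn.getBytes(StandardCharsets.UTF_8));
        return "# dn: " + dn + "\n" + "dn:: " + b64;
    }

    public static void main(String[] args) {
        System.out.println(formatDnLines("dc=example,dc=com"));
        System.out.println(formatDnLines("uid=M\u00e9t\u00e9o.0,ou=People,dc=example,dc=com"));
    }
}
```

The comment line is purely for human readers; LDIF parsers ignore it and use only the base-64 encoded line.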

5 files modified
282 lines changed
opendj-sdk/opends/src/server/org/opends/server/types/AttributeValue.java 117
opendj-sdk/opends/src/server/org/opends/server/types/RDN.java 140
opendj-sdk/opends/tests/unit-tests-testng/src/server/org/opends/server/types/TestDN.java 8
opendj-sdk/opends/tests/unit-tests-testng/src/server/org/opends/server/types/TestRDN.java 8
opendj-sdk/opends/tests/unit-tests-testng/src/server/org/opends/server/util/TestLDIFWriter.java 9