Details

Reviewers

theraven

Group Reviewers

manpages

Summary

Developed for GSoC 2014 by ghostmansd@
Mentored by pfg@

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Event Timeline

pfg updated this revision to Diff 5955.Jun 4 2015, 9:28 PM

pfg retitled this revision from to Initial collation support..

pfg updated this object.

pfg edited the test plan for this revision. (Show Details)

Herald added a reviewer: manpages. · View Herald TranscriptJun 4 2015, 9:28 PM

Herald added a subscriber: imp. · View Herald Transcript

There are still build issues, but I am uploading this in the hope that it will be easier to fix/review.

Update build: with help from bapt@

Herald added a subscriber: bdrewery. · View Herald TranscriptJun 4 2015, 10:06 PM

I had to change the third parameter in __collate_lookup to
avoid a dumb cast in strxfrm_l(). This also triggered the
corresponding type update in strcoll_l().

I think with these changes the code builds on all platforms
(tinderbox still running).

Also fixed the Python lint issues while here.

Minor type tidy-up.

wblock added a subscriber: wblock.Jun 5 2015, 2:33 AM

wblock added inline comments.

lib/libcolldb/colldb.3
108	Needs a serial comma after "key". Also, start new sentences on new lines.
109	New sentence on new line. Usually we would say "already be" rather than the other way around.
110	Needs an article: "a value"? "the value"? Also, s/database/the database/. Finally, start a new sentence on a new line.
115	Needs a comma after "number".
117	Start a new sentence on a new line. "In order to" can usually just be "To"
118	This sentence is kind of confusing. "may" implies permission, but "might" in that same place is still confusing. It's not clear how the second half of the sentence relates to the first. Not clear what "exotic databases" means, either. Does this have the right meaning: For some exotic databases, the weights buffer might be too small. In this case, .Fn colldb_get will return -1 and set errno to .Er ERANGE to indicate that struct colldb_value must allocate a larger weights buffer.
123	Serial comma needed after the first "key". Also, start a new sentence on a new line.
129	s/handle/a handle/ s/collation/the collation/ Also, start a new sentence on a new line.
130	Typo: s/Berkley/Berkeley/. Also, start a new sentence on a new line. Also, s/caller/the caller/
usr.sbin/colldb/colldb.1
40	Needs articles: s/Unicode/a Unicode/ s/to/to a/
41	s/database/databases/
42	s/Berkley/Berkeley/ probably should have a "the" before that, too.
43	Not sure what this is saying. It's missing some articles, but seems to be saying both that root collation is the fallback method, and also here is an example. "is expected" implies that it might not work that way. Maybe better as two sentences, one explaining that root collation is the fallback (collation method?), and a second that introduces the example.

This is still not handling endianness portably.

lib/libc/string/strcoll.c
82	This is ugly. Do we have to initialize all this?
148	Gratuitous space.
lib/libc/string/strxfrm.c
136–137	Failure on MIPS: /scratch/tmp/pfg/head/lib/libc/string/strxfrm.c: In function 'strxfrm_l': /scratch/tmp/pfg/head/lib/libc/string/strxfrm.c:141: warning: passing argument 3 of '__collate_lookup' from incompatible pointer type
lib/libc/string/wcsxfrm.c
88	Breaks SPARC64 after changing 3rd parameter to size_t cc1: warnings being treated as errors /scratch/tmp/pfg/head/lib/libc/string/strcoll.c: In function 'strcoll_l': /scratch/tmp/pfg/head/lib/libc/string/strcoll.c:127: warning: passing argument 3 of 'collate_lookup' from incompatible pointer type /scratch/tmp/pfg/head/lib/libc/string/strcoll.c:131: warning: passing argument 3 of 'collate_lookup' from incompatible pointer type
lib/libcolldb/colldb.c
243	Build broken in ARM: /scratch/tmp/pfg/head/lib/libcolldb/colldb.c: In function 'colldb_get': /scratch/tmp/pfg/head/lib/libcolldb/colldb.c:242: error: 'BYTE_ORDER' undeclared (first use in this function) /scratch/tmp/pfg/head/lib/libcolldb/colldb.c:242: error: (Each undeclared identifier is reported only once /scratch/tmp/pfg/head/lib/libcolldb/colldb.c:242: error: for each function it appears in.) /scratch/tmp/pfg/head/lib/libcolldb/colldb.c:242: error: 'ORDER_BIG_ENDIAN' undeclared (first use in this function) /scratch/tmp/pfg/head/lib/libcolldb/colldb.c: In function 'colldb_put': /scratch/tmp/pfg/head/lib/libcolldb/colldb.c:311: error: 'BYTE_ORDER' undeclared (first use in this function) /scratch/tmp/pfg/head/lib/libcolldb/colldb.c:311: error: 'ORDER_BIG_ENDIAN' undeclared (first use in this function)

Update manpages. Thanks Warren for the review!
Add some $FreeBSD$ tags.
This is all still pretty much WIP.

Fix endinanness issue using <sys/endian.h>.
It's not clear to me if we are DTRT for the big endian case.
Clean some initializations in the new code.

Fix remaining build issues.

This passes the tinderbox build so it's basically ready for review/testing.

Phabricator is "helping" by putting previous comments on the wrong lines. Please ignore that.

I still think the last sentence of colldb.1 is confusing, but maybe it makes sense in context.

lib/libcolldb/colldb.3
107	Remove semicolon.
124	No semicolon, new sentence goes on new line. Originally, I read these as accepting three parameters, but it appears they really only accept two, key and value. It might be clearer to leave out the word "database". So: accepts a key and value. If necessary, it converts the key and value
125	Not really "into", just "to". And "the" is not needed here, so: to network byte order before writing.
131	Three "e"s in "Berkeley". Database should probably not be capitalized: All other functions behave exactly like their Berkeley database ancestors.
usr.sbin/colldb/colldb.1
39	utility converts Unicode collation files to *BSD collation databases.
40	s/database/databases/

Update the man pages according to Warren's feedback.
I think we still need some technical review on the manpages
(along with the code). It may be convenient to add a reference to
the Berkeley database documentation dbopen(3).

pfg added a subscriber: bapt.Jun 28 2015, 7:21 PM

marino added a subscriber: marino.Jul 19 2015, 5:05 PM

shouldn't there be modifications to share/colldef/Makefile and maybe an addition to share/colldef ? It seems to be after this commit is entered, the collate locate for UTF-8 codesets are still going to be pointing at la_LN.US-ASCII.

or are the UTF-8 codeset collations going to be fixed in a way I don't see?

In D2736#62123, @marino wrote:

shouldn't there be modifications to share/colldef/Makefile and maybe an addition to share/colldef ? It seems to be after this commit is entered, the collate locate for UTF-8 codesets are still going to be pointing at la_LN.US-ASCII.

or are the UTF-8 codeset collations going to be fixed in a way I don't see?

I asked Dmitry to document this but he hasn't done it yet.

I recall a new set of collation databases has to be downloaded from the unicode site, generated with colldef(1), and loaded in /share/colldb/. The pre-existing code is only kept as a fallback method in case the collation databases are not available.

Baptiste is working on the Illumos/Dragonfly collation so leave this aside, at least for now, and hope the best for that effort.

Initial collation support.AbandonedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 5996

lib/Makefile

lib/libc/Makefile

lib/libc/locale/collate.h

lib/libc/locale/collate.c

lib/libc/locale/unicode.h

lib/libc/locale/xlocale.c

lib/libc/locale/xlocale_private.h

lib/libc/string/strcoll.c

lib/libc/string/strxfrm.c

lib/libc/string/wcscoll.c

lib/libc/string/wcsxfrm.c

lib/libc/unicode/

lib/libc/unicode/Makefile.inc

lib/libc/unicode/ccclass.h

lib/libc/unicode/coll.h

lib/libc/unicode/norm.h

lib/libc/unicode/uccclass.c

lib/libc/unicode/ucscanon.c

lib/libc/unicode/ucscoll.c

lib/libc/unicode/ucsnorm.c

lib/libc/unicode/ucsxfrm.c

lib/libcolldb/

lib/libcolldb/Makefile

lib/libcolldb/colldb.h

lib/libcolldb/colldb.3

lib/libcolldb/colldb.c

share/Makefile

share/colldb/

share/colldb/Makefile

share/colldb/root.src

share/examples/Makefile

share/examples/colldb/

share/examples/colldb/colldb.py

share/mk/src.libnames.mk

usr.sbin/Makefile

usr.sbin/colldb/

usr.sbin/colldb/Makefile

usr.sbin/colldb/colldb.1

usr.sbin/colldb/colldb.c

Initial collation support.
AbandonedPublic
Actions

Revision Contents
Changeset List