Developed for GSoC 2014 by ghostmansd@
Mentored by pfg@
Details
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Passed - Unit
No Test Coverage
Event Timeline
There are still build issues, but I am uploading this in the hope that it will be easier to fix/review.
I had to change the third parameter in __collate_lookup to
avoid a dumb cast in strxfrm_l(). This also triggered the
corresponding type update in strcoll_l().
I think with these changes the code builds on all platforms
(tinderbox still running).
Also fixed the Python lint issues while here.
lib/libcolldb/colldb.3 | ||
---|---|---|
107 | Needs a serial comma after "key". Also, start new sentences on new lines. | |
108 | New sentence on new line. Usually we would say "already be" rather than the other way around. | |
109 | Needs an article: "a value"? "the value"? Also, s/database/the database/. Finally, start a new sentence on a new line. | |
114 | Needs a comma after "number". | |
116 | Start a new sentence on a new line. "In order to" can usually just be "To" | |
117 | This sentence is kind of confusing. "may" implies permission, but "might" in that same place is still confusing. It's not clear how the second half of the sentence relates to the first. Not clear what "exotic databases" means, either. Does this have the right meaning: For some exotic databases, the weights buffer might be too small. | |
122 | Serial comma needed after the first "key". Also, start a new sentence on a new line. | |
128 | s/handle/a handle/ | |
129 | Typo: s/Berkley/Berkeley/. | |
usr.sbin/colldb/colldb.1 | ||
39 | Needs articles: s/Unicode/a Unicode/ | |
40 | s/database/databases/ | |
41 | s/Berkley/Berkeley/ | |
42 | Not sure what this is saying. It's missing some articles, but seems to be saying both that root collation is the fallback method, and also here is an example. "is expected" implies that it might not work that way. Maybe better as two sentences, one explaining that root collation is the fallback (collation method?), and a second that introduces the example. |
This is still not handling endianness portably.
lib/libc/string/strcoll.c | ||
---|---|---|
82 | This is ugly. Do we have to initialize all this? | |
150 | Gratuitous space. | |
lib/libc/string/strxfrm.c | ||
136–137 | Failure on MIPS: /scratch/tmp/pfg/head/lib/libc/string/strxfrm.c: In function 'strxfrm_l': | |
lib/libc/string/wcsxfrm.c | ||
88 | Breaks SPARC64 after changing 3rd parameter to size_t cc1: warnings being treated as errors | |
lib/libcolldb/colldb.c | ||
242 | Build broken in ARM: /scratch/tmp/pfg/head/lib/libcolldb/colldb.c: In function 'colldb_get': |
Update manpages. Thanks Warren for the review!
Add some $FreeBSD$ tags.
This is all still pretty much WIP.
Fix endinanness issue using <sys/endian.h>.
It's not clear to me if we are DTRT for the big endian case.
Clean some initializations in the new code.
Fix remaining build issues.
This passes the tinderbox build so it's basically ready for review/testing.
Phabricator is "helping" by putting previous comments on the wrong lines. Please ignore that.
I still think the last sentence of colldb.1 is confusing, but maybe it makes sense in context.
lib/libcolldb/colldb.3 | ||
---|---|---|
108 | Remove semicolon. | |
125 | No semicolon, new sentence goes on new line. Originally, I read these as accepting three parameters, but it appears they really only accept two, key and value. It might be clearer to leave out the word "database". So: accepts a key and value. | |
126 | Not really "into", just "to". And "the" is not needed here, so: to network byte order before writing. | |
132 | Three "e"s in "Berkeley". Database should probably not be capitalized: All other functions behave exactly like their Berkeley database ancestors. | |
usr.sbin/colldb/colldb.1 | ||
40 | utility converts Unicode collation files to *BSD collation databases. | |
41 | s/database/databases/ |
Update the man pages according to Warren's feedback.
I think we still need some technical review on the manpages
(along with the code). It may be convenient to add a reference to
the Berkeley database documentation dbopen(3).
shouldn't there be modifications to share/colldef/Makefile and maybe an addition to share/colldef ? It seems to be after this commit is entered, the collate locate for UTF-8 codesets are still going to be pointing at la_LN.US-ASCII.
or are the UTF-8 codeset collations going to be fixed in a way I don't see?
I asked Dmitry to document this but he hasn't done it yet.
I recall a new set of collation databases has to be downloaded from the unicode site, generated with colldef(1), and loaded in /share/colldb/. The pre-existing code is only kept as a fallback method in case the collation databases are not available.
Baptiste is working on the Illumos/Dragonfly collation so leave this aside, at least for now, and hope the best for that effort.