od(1): Fix wrong output for some corner cases in multibyte locales.
Restore the original character to print if we used the look-ahead
buffer, but that didn't help -- we either got an illegal sequence
or still can't complete.
Differential D13963
Bug 224552 - 'od -c' show wrong char when it is a non-printable yuripv_icloud.com on Jan 18 2018, 1:28 AM. Authored by Tags None Referenced Files
Subscribers
Details od(1): Fix wrong output for some corner cases in multibyte locales. Restore the original character to print if we used the look-ahead
Diff Detail
Event TimelineComment Actions Hi, Thanks! Can you please also either: a.) Provide a sample input, preferably a small/simple input, that breaks it, or I would appreciate it- either one is fine by me, just as long as we can get a test case written by one of us to demonstrate. =) Comment Actions I thought about adding a test case and will add one, there are some questions though:
Comment Actions Excellent, thanks!
Indeed. We should probably also add some other od test cases since it's not really covered in hexdump_test, but that can be a "later" problem unless you're willing and wanting to add some. We basically just need some simple tests to make sure od is still behaving sanely for each of the arguments, then we can expand that as other bugs pop up. We also lack multibyte coverage on the hexdump side, so that's other room for improvement. The other stuff can definitely wait, though- they're completely irrelevant to this problem in front of us. =)
Sure- NetBSD does this kind of thing in the cut(1) test [1]. I wanted to say it would be a good idea to do something like: locale -a | grep -q 'en_US.UTF-8' so you can fail gracefully (atf_skip) if the locale isn't on the system, but we already have some places where we apparently didn't care so I guess we didn't deem it likely to matter. [1] https://svnweb.freebsd.org/base/head/contrib/netbsd-tests/usr.bin/cut/t_cut.sh?view=markup#l113 Comment Actions quick check (before/after the fix): loki:yuri:~/ws/pr224552/usr.bin/hexdump/tests$ atf-sh od_test.sh c_flag od_test.sh: WARNING: Running test cases outside of kyua(1) is unsupported od_test.sh: WARNING: No isolation nor timeout control is being applied; you may get unexpected failures; see atf-test-case(4) Executing command [ env LC_CTYPE=en_US.UTF-8 od -c /home/yuri/ws/pr224552/usr.bin/hexdump/tests/./d_od_a.in ] Fail: stdout does not match golden output --- /home/yuri/ws/pr224552/usr.bin/hexdump/tests/./d_od_cflag_a.out 2018-01-18 06:06:35.567676000 +0300 +++ /tmp/check.C5luBh/stdout 2018-01-18 06:29:17.777017000 +0300 @@ -1,3 +1,3 @@ -0000000 T e s t T e s t T e s t T e s 345 -0000020 T e s t 345 +0000000 T e s t T e s t T e s t T e s 124 +0000020 T e s t 360 0000025 failed: atf-check failed; see the output of the test for details loki:yuri:~/ws/pr224552/usr.bin/hexdump/tests$ PATH=/usr/obj/home/yuri/ws/pr224552/amd64.amd64/usr.bin/hexdump/:$PATH atf-sh od_test.sh c_flag od_test.sh: WARNING: Running test cases outside of kyua(1) is unsupported od_test.sh: WARNING: No isolation nor timeout control is being applied; you may get unexpected failures; see atf-test-case(4) Executing command [ env LC_CTYPE=en_US.UTF-8 od -c /home/yuri/ws/pr224552/usr.bin/hexdump/tests/./d_od_a.in ] Executing command [ env LC_CTYPE=en_US.UTF-8 od -c /home/yuri/ws/pr224552/usr.bin/hexdump/tests/./d_od_b.in ] passed
Comment Actions AFAIK, there's no technical reason it can't be assigned to "The FreeBSD Project" -- it's generally the author's (your) name, but I would assume that that's not strictly necessary if you'd rather not. I think this looks good, minus the $(atf_get_srcdir) occurrences in the two atf_checks, but I can fix this prior to commit; no need for another round of review based on that. I'll take one last look-over in the morning (~9 hours from this post) and commit it unless you have further adjustments. Comment Actions Hi, Sorry, this got stalled while I poked others about the copyright assignment. I'm told that we don't currently have a process for assigning copyright to a general entity like this, but that one is being developed. The way I interpreted it being explained to me, assigning copyright to "The FreeBSD Project" isn't valid because this isn't a legal entity (IIRC); we only get away with it in /COPYRIGHT because there it represents all of the individual contributors to the project, and this is a special case because it's not over a specific work but the project in general. The only step forward at this point is if we commit the file with something along the lines of Copyright (c) 2018 Yuri Pankov, since "anonymous" (in a way) contributions are not currently accepted until we have some process for assignment of copyright to the FreeBSD Foundation (IIRC this is how the process is going to work). Is this OK? Ditto these comments for the other review you've added me to, as well. Comment Actions Thanks for explanation, copyright updated; removed atf_get_srcdir you mentioned as well. |