Details

Reviewers

markj
emaste

Summary

From https://bugs.freebsd.org/244226 :

syslogd has code to convert all logged messages to "safe" strings.

At the moment, the code converts control characters to "^x" sequences and ALL 8-bit characters to "M-x" sequences. This means that printable characters in character sets other than ASCII are converted and so do not display as expected when viewing the logs.

This patch adds LC_CTYPE locale support to syslogd and changes the "safe" conversion code to examine the logged characters using mbrtoc32() and to use iswgraph() to test if a character needs converting to safe sequences.

It also uses vis() to do the conversion which is similar to OpenBSD but which means control chars become \^x and non-graphical 8-bit chars become \M-x.

Test Plan

simplest way is this:

Add to /etc/rc.conf: syslogd_env="LC_CTYPE=C.UTF-8"

Restart syslogd.

Run: echo '\xe0\xb8\xaa\xe0\xb8\xa7\xe0\xb8\xb1\xe0\xb8\xaa\xe0\xb8\x94\xe0\xb8\xb5' | logger

Look at /var/log/messages. You should see "สวัสดี" (hello in Thai).

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

yzhong_freebsdfoundation.org requested review of this revision.Sep 16 2020, 5:36 PM

yzhong_freebsdfoundation.org created this revision.

emaste added a subscriber: markj.Sep 16 2020, 5:45 PM

emaste added a subscriber: emaste.

The only change I made to the original patch was to re-order some of the #include directives, as more had been added after the original patch was submitted.

markj added inline comments.Sep 16 2020, 6:57 PM

usr.sbin/syslogd/syslogd.c
969	`mask_C1` is now a write-only variable. What does `vis()` do with C1 control characters?
1002	Is the cast to size_t actually required? Don't we need to limit that parameter to min(MB_LEN_MAX, <# bytes left>)?
1005	When can mbrtoc32() return a value greater than `n`? It's not clear to me from reading the man page that that's permitted.
1017	This might as well be a memcpy(). strncpy() doesn't guarantee nul-termination here, the assignment after the loop does.

I tested it more thoroughly and found an additional problem (on line 984, it used to test if a size_t was less than 0), along with the ones markj pointed out. They should all be fixed now.

yzhong_freebsdfoundation.org edited the summary of this revision. (Show Details)Sep 17 2020, 4:48 PM

yzhong_freebsdfoundation.org added inline comments.Sep 17 2020, 5:06 PM

usr.sbin/syslogd/syslogd.c
969	From my testing, vis() automatically converts C1 characters. I added back the old behaviour for when mask_C1 is false.

markj added inline comments.Sep 17 2020, 5:09 PM

usr.sbin/syslogd/syslogd.c
980	`end_q - q` measures the number of bytes remaining in the output buffer, but we're passing the input buffer to `mbrtoc32()`. In particular, `mbrtoc32()` assumes that it can access up to `n` bytes of the input buffer, which is not always true. `max_len` should be of type `size_t`.

Fixed another issue: mbrtoc32() returns the error value (size_t)-2 if it reads a C1 control character. This wouldn't be a problem by itself, as the rest of the function takes care of it, but this screws up mbrtoc32()'s internal state for all of the characters afterwards. Switching to the non-restartable mbtowc() resolves that issue. However, I am not sure if this has any bad implications.

yzhong_freebsdfoundation.org marked 4 inline comments as done.Sep 17 2020, 6:04 PM

emaste added reviewers: markj, emaste.Sep 21 2020, 3:25 PM

markj added inline comments.Sep 21 2020, 3:28 PM

usr.sbin/syslogd/syslogd.c
980	It's better to write `end_p = p + strlen(p)` and check `end_p - p` rather than recomputing the entire string's length for each wide character.

emaste added inline comments.Sep 21 2020, 3:36 PM

usr.sbin/syslogd/syslogd.c
792	Just `setlocale(LC_CTYPE, "")` probably. Only three locales are defined by default, the empty string "" which denotes the native environment, and the "C" and "POSIX" locales, which denote the C language environment. A locale argument of NULL causes setlocale() to return the current locale. If `LC_CTYPE` is unset getenv returns null, which causes setlocale() to just return the current setting. The empty string sets it according to environment variables automatically.

yzhong_freebsdfoundation.org updated this revision to Diff 77303.Sep 21 2020, 3:57 PM

yzhong_freebsdfoundation.org marked 2 inline comments as done.Sep 21 2020, 4:05 PM

Seems reasonable to me.

usr.sbin/syslogd/syslogd.c
980	You could rewrite this as `while (p < end_p && q < end_q)` I think.

This revision is now accepted and ready to land.Sep 22 2020, 1:39 PM

yzhong_freebsdfoundation.org updated this revision to Diff 77366.Sep 22 2020, 6:48 PM

This revision now requires review to proceed.Sep 22 2020, 6:48 PM

yzhong_freebsdfoundation.org marked an inline comment as done.Sep 22 2020, 6:48 PM

(I plan to commit this, but I would like to finish some work on sandboxing syslogd first - all of this string parsing code makes me a bit nervous.)

What's the status of this? We bumped into exactly this problem at $WORK.

In D26456#1069306, @arrowd wrote:

What's the status of this? We bumped into exactly this problem at $WORK.

Last year, @jfree wrote patches to capsicumize syslogd. Most of that work has landed, but a few patches are left. We will try to get through them by the end of next week, and then I think this patch can land.

Any progress on this?

In D26456#1087672, @arrowd wrote:

Any progress on this?

Yes, I’ve been working on it. I will try to get them in this weekend. Sorry for the wait

In D26456#1087673, @jfree wrote:

In D26456#1087672, @arrowd wrote:

Any progress on this?

Yes, I’ve been working on it. I will try to get them in this weekend. Sorry for the wait

Ok, all of my syslogd patches are in. So sorry about the delay!

syslogd: Support locale when removing unsafe characters
Needs ReviewPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 77366

usr.sbin/syslogd/syslogd.c

syslogd: Support locale when removing unsafe charactersNeeds ReviewPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 77366

usr.sbin/syslogd/syslogd.c

syslogd: Support locale when removing unsafe characters
Needs ReviewPublic
Actions

Revision Contents
Changeset List