grep_printline() was flushing stdout after every output line,
causing one write(2) syscall per line. Remove the per-line flush
and let stdio(3) buffer output normally. --line-buffered continues
to work via setlinebuf(3).
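
For illustration only, a minimal sketch of the two behaviors. The names
print_line() and set_output_mode() are hypothetical and are not the
actual grep code; setvbuf(3) with _IOLBF is used here as the standard C
counterpart of setlinebuf(3).

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a per-line output routine; this is not the
 * real grep_printline(). A nonzero flush_each mimics the old behavior
 * of flushing after every line; zero leaves buffering to stdio.
 */
int
print_line(FILE *out, const char *line, int flush_each)
{
	if (fputs(line, out) == EOF || putc('\n', out) == EOF)
		return (-1);
	if (flush_each && fflush(out) == EOF)
		return (-1);
	return (0);
}

/*
 * Choose the buffering mode once, before any output is written.
 * Line buffering is requested only when asked for (the assumed
 * equivalent of --line-buffered); otherwise the stdio default stays.
 */
int
set_output_mode(FILE *out, int line_buffered)
{
	if (line_buffered)
		return (setvbuf(out, NULL, _IOLBF, 0));
	return (0);
}
```

With flush_each set to 0 and the default mode, output redirected to a
file is written in buffer-sized chunks rather than one write(2) per
line.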

Benchmark on 1M lines (25MB output to file):

  real time: -65.3% (2.17s -> 0.75s)
  sys time:  -99.9% (1.45s -> 0.00s)

With --line-buffered: no significant difference.