Page MenuHomeFreeBSD

sh: implement persistent history storage
ClosedPublic

Authored by bapt on Mar 30 2021, 9:18 AM.

Details

Summary

First shot at implementing persistent storage.
When the shell is started in interactive mode, it will load the history for the existing .sh_history if any and truncate it to fit HISTSIZE
When the shell exits the last HISTSIZE line of the history are dumped into the .sh_history file.

Note the behaviour is the same as the one which can be found in bash [1] or many other bourne-like shell.

[1]: https://www.gnu.org/software/bash/manual/html_node/Bash-History-Facilities.html

"When the shell starts up, the history is initialized from the file named by the HISTFILE variable (default ~/.bash_history). The file named by the value of HISTFILE is truncated, if necessary, to contain no more than the number of lines specified by the value of the HISTFILESIZE variable. When a shell with history enabled exits, the last $HISTSIZE lines are copied from the history list to the file named by $HISTFILE."

to avoid sh to creating (or loading) the history file, set HISTSIZE to 0

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

bapt requested review of this revision.Mar 30 2021, 9:18 AM

Use histsizeval() to get the proper size of the history

Address first comments from pstef via irc

bapt retitled this revision from sh: implement persistent storage to sh: implement persistent history storage.Mar 30 2021, 10:07 AM
bapt edited the summary of this revision. (Show Details)
bin/sh/histedit.c
97

5 is the suffix length, but the suffix means a constant character string and not the Xs. To quote the manual page, The template should be of the form /tmp/tmpXXXXXXsuffix. There is no mkostempat so I suggest leaving histfile as it is now and just passing slen of 0 here.

I would also put spaces around the or operator.

bin/sh/histedit.c
104

I think this is backwards; I did H_FIRST here and H_PREV in the line below and it worked well for me.

104

I think this is backwards; I did H_FIRST here and H_PREV in the line below and it worked well for me.

Sorry, I meant H_LAST.

142

I saw a weird issue where some commands got pushed into a single line. Removing this and the next line helped.

jilles requested changes to this revision.Mar 30 2021, 9:18 PM

Another nice feature, but I expect that making it good enough to replace bigger shells will take more work. For example, the save-all/load-all approach may be rather slow with the number of history entries I personally like to keep.

bin/sh/histedit.c
86

For the benefit of people who copy this code and use it in untrusted directories, it'd be better to have more X's, like 10.

92

This unnecessarily activates a lot of machinery. A variable can be accessed without going through the shell parser and expansions, using lookupvar() or bltinlookup(). Compared to lookupvar(), bltinlookup() will ensure that a variable assignment before a builtin applies.

Alternatively, do something like home = expandstr("${HOME-}/.sh_history.XXXXXXXXXX"); to put the machinery to use, and reduce code below.

The return value from expandstr() is an "ungrabbed" stack string, so it will be reused automatically later and it is OK to have mkostemp() or a similar call overwrite it.

The - in ${HOME-} is indeed required to avoid an error when HOME is not set and set -u is in effect.

95

There should be INTOFF/INTON bracketing somewhere so that we do not leak file descriptors and memory or corrupt stdio state if a SIGINT comes in.

106

Suggest fdopen() and fwrite() to cut down on system calls.

Might also consider a huge writev() but I think stdio is more conventional.

143

spacing

This revision now requires changes to proceed.Mar 30 2021, 9:18 PM
bapt marked 9 inline comments as done.Mar 30 2021, 9:42 PM

@jilles do you think we should add an option to disable the load and save so people willing to have a giant history while not suffering from the load/save can do it? if yes do you have any options to suggest?

jilles requested changes to this revision.Mar 30 2021, 10:38 PM
In D29493#661261, @bapt wrote:

@jilles do you think we should add an option to disable the load and save so people willing to have a giant history while not suffering from the load/save can do it? if yes do you have any options to suggest?

Some people want to disable history saving for privacy reasons or to reduce writes to flash, so I think that option should exist. However, a giant history does not seem a good reason for that. I think a giant history is only really useful if it is saved. I have increased zsh's history size and use it to recall commands I issued months ago.

One option is to implement HISTFILE as specified by POSIX to allow having the history file at another location (or /dev/null to disable it). Changes to HISTFILE after the ENV file has been read need not be effective.

Any such configuration will require some changes to shell initialization, since as it is now the shell will read the history file immediately after processing argv, so before reading the profiles (if any) and ENV file. In some sense, this is even already wrong: if the ENV file contains set +o emacs, the shell will already have uselessly read the history file.

The ${HOME-}/.sh_history implemented here is the default HISTFILE specified by POSIX. This means that this is the same file used by ksh93 (but of course incompatible).

bin/sh/histedit.c
135

lookupvar and null check here as well.

This revision now requires changes to proceed.Mar 30 2021, 10:38 PM
bin/sh/histedit.c
111

sizeof(char) has bikeshedding potential, so I suggest using 1 to avoid it

bin/sh/histedit.c
91
108
bin/sh/histedit.c
91

Sorry, not woken up fully yet.
if (hist == NULL || histsizeval() == 0)

bapt marked 5 inline comments as done.

Respect HISTFILES, address comments

I tested the save/restore history capabilities that are built into libedit and so far it looks good to me.

void
histsave(void)
{
	HistEvent he;
	char *histtmpname = NULL;
	const char *histfile;
	int fd;
	FILE *f;

	/* don't try to save if the history size is 0 */
	if (hist == NULL || histsizeval() == 0)
		return;
	histfile = expandstr("${HISTFILE-${HOME-}/.ash_hist}");
	asprintf(&histtmpname, "%s.XXXXXXXXXX", histfile);
	if (histtmpname == NULL)
		return;
	INTOFF;
	fd = mkstemp(histtmpname);
	if (fd < 0 || (f = fdopen(fd, "w")) == NULL) {
		free(histtmpname);
		INTON;
		return;
	}
	if (history(hist, &he, H_SAVE_FP, f) < 1 ||
		rename(histtmpname, histfile) == -1)
		unlink(histtmpname);
	fclose(f);
	free(histtmpname);
	INTON;
}

static void
histload(void)
{
	const char *histfile;
	HistEvent he;

	/* don't try to load if the history size is 0 */
	if (hist == NULL || histsizeval() == 0)
		return;
	histfile = expandstr("${HISTFILE-${HOME-}/.ash_hist}");
	history(hist, &he, H_LOAD, histfile);
}

I use .ash_hist instead of .sh_history since the latter is used by other shells, as Jilles said.

Merge @pstef proposal, keep on .sh_history as default variable to respect
POSIX definition.

if (fd == -1 || (f = fdopen(fd, "w")) == NULL) and it will LGTM.

There should be documentation about this in the man page.

bin/sh/histedit.c
93

Perhaps extract this line to a function so it is not duplicated.

94

Even though it requires additional lines of code, the asprintf call should be under INTOFF protection since longjmp'ing out of it is not safe.

163

This is called just after processing argv, before reading the profile and/or ENV file. So it is not possible to configure the history file there.

The right fix is probably to ensure histedit() is not called until the profile and/or ENV file have been read.

jilles requested changes to this revision.May 2 2021, 6:40 PM
This revision now requires changes to proceed.May 2 2021, 6:40 PM
bapt marked 2 inline comments as done.

Add manpage updates
Invoke the load of history only when profiles have been read

bapt marked an inline comment as done.May 3 2021, 3:28 AM
pstef added inline comments.
bin/sh/histedit.c
91

It's probably safe to rewrite strlen(histfile) == 0 as histfile[0] == '\0'

bcr added a subscriber: bcr.

OK from manpages.

Address 2 comments from pstef

@jilles are the changes ok to get pushed in the tree for you?

Some people may get negatively surprised if they first encounter this change in a git update. A post to -current@ and Relnotes: yes are probably a good idea.

bin/sh/sh.1
1361

This can be written more clearly.

This revision is now accepted and ready to land.May 6 2021, 8:41 PM
This revision was automatically updated to reflect the committed changes.

3 months later I realized why my zsh history file keeps being truncated. Simply running sh and exiting does it. How am I supposed to avoid sh blowing away $HISTFILE when zsh uses that same var name along with HISTSIZE?

By the way it does not build with -DNO_HISTORY.

/usr/src/bin/sh/histedit.c:703:13: error: unknown type name 'EditLine'
sh_complete(EditLine *sel, int ch __unused)
            ^
/usr/src/bin/sh/histedit.c:705:24: error: implicit declaration of function 'fn_complete2' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        return (unsigned char)fn_complete2(sel, NULL, sh_matches,
                              ^
/usr/src/bin/sh/histedit.c:705:24: note: did you mean 'sh_complete'?
/usr/src/bin/sh/histedit.c:703:1: note: 'sh_complete' declared here
sh_complete(EditLine *sel, int ch __unused)
^
/usr/src/bin/sh/histedit.c:707:35: error: use of undeclared identifier 'FN_QUOTE_MATCH'
                NULL, &((int) {0}), NULL, NULL, FN_QUOTE_MATCH);
                                                ^
3 errors generated.
*** [histedit.o] Error code 1

By the way it does not build with -DNO_HISTORY.

That's because of D29361, not this one. Will work on it, thank you for the report.

By the way it does not build with -DNO_HISTORY.

That's because of D29361, not this one. Will work on it, thank you for the report.

Should be fixed in 35b253d9d238, thanks again.

3 months later I realized why my zsh history file keeps being truncated. Simply running sh and exiting does it. How am I supposed to avoid sh blowing away $HISTFILE when zsh uses that same var name along with HISTSIZE?

Well this is standard but POSIX to use the HISTFILE variable, I do have a specific HISTFILE definition in .zshrc and same in .shrc to avoid collision

I thought about this problem more. I think it would satisfy all scenarios if we made the history save dependent on whether the load was successful or not. Something similar to this (compiled, untested):

diff --git a/bin/sh/histedit.c b/bin/sh/histedit.c
index bd82016c33c..fae67e3a81d 100644
--- a/bin/sh/histedit.c
+++ b/bin/sh/histedit.c
@@ -71,6 +71,7 @@ History *hist;        /* history cookie */
 EditLine *el;  /* editline cookie */
 int displayhist;
 static FILE *el_in, *el_out;
+static int history_loaded;

 static char *fc_replace(const char *, char *, char *);
 static int not_fcnumber(const char *);
@@ -103,7 +104,7 @@ histsave(void)
        int fd;
        FILE *f;

-       if ((histfile = get_histfile()) == NULL)
+       if (history_loaded == -1 || (histfile = get_histfile()) == NULL)
                return;
        INTOFF;
        asprintf(&histtmpname, "%s.XXXXXXXXXX", histfile);
@@ -134,7 +135,7 @@ histload(void)

        if ((histfile = get_histfile()) == NULL)
                return;
-       history(hist, &he, H_LOAD, histfile);
+       history_loaded = history(hist, &he, H_LOAD, histfile);
 }

 /*