
Eliminate the last MI difference in AT_* definitions (for powerpc).
Needs ReviewPublic

Authored by brooks on Jun 28 2019, 8:35 PM.

Details

Reviewers
jhibbits
Summary

As a transition aid, implement an alternative elfN_freebsd_fixup which
is called for old powerpc binaries. Similarly, add a translation to rtld to
convert old values to new ones (as expected by a new rtld).

Translation of old<->new values is incomplete, but sufficient to allow an
installworld of a new userspace from an old one when a new kernel is running.

Test Plan

Someone needs to see how a new kernel/rtld/libc works with an old
binary. If it works we can probably ship this. If not we probably need
some more compat bits.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 25466
Build 24086: arc lint + arc unit

Event Timeline

brooks created this revision.Jun 28 2019, 8:35 PM

elfv1, new kernel, old userland:

Enter full pathname of shell or RETURN for /bin/sh:
ld-elf.so.1: morepages: cannot mmap anonymous memory: Invalid argument
Out of memory
ld-elf.so.1: morepages: cannot mmap anonymous memory: Invalid argument
Out of memory

Hmm, additionally, it looks like AT_EXECPATH is damaged, because I can't use /rescue/sh either; it just prints the usage message.

Enter full pathname of shell or RETURN for /bin/sh: /rescue/sh
usage: rescue <prog> <args> ..., where <prog> is one of:
 cat chflags chio chmod cp date dd df echo ed red expr getfacl hostname kenv
 kill ln link ls mkdir mv pkill pgrep ps pwd realpath rm unlink rmdir setfacl
 sh -sh sleep stty sync test [ csh -csh tcsh -tcsh camcontrol clri devfs dmesg
 dump rdump dumpfs dumpon fsck fsck_ffs fsck_4.2bsd fsck_ufs fsck_msdosfs fsdb
 fsirand gbde geom glabel gpart ifconfig init kldconfig kldload kldstat
 kldunload ldconfig md5 mdconfig mdmfs mknod mount mount_cd9660 mount_msdosfs
 mount_nfs mount_nullfs mount_udf mount_unionfs newfs newfs_msdos nos-tun ping
 reboot fastboot halt fasthalt restore rrestore rcorder route savecore shutdown
 poweroff spppcontrol swapon sysctl tunefs umount ccdconfig ping6 rtsol ipf
 routed rtquery bectl zfs zpool dhclient head mt sed tail tee gzip gunzip gzcat
 zcat bzip2 bunzip2 bzcat less more xz unxz lzma unlzma xzcat lzcat zstd unzstd
 zstdcat zstdmt tar nc vi ex id groups whoami iscsictl zdb chroot chown chgrp
 iscsid rescue
ld-elf.so.1: morepages: cannot mmap anonymous memory: Invalid argument
Out of memory

Did you apply a manual bump of __FreeBSD_version and P_OSREL_POWERPC_NEW_AUX_ARGS? That part of the patch is now out of date.

Yeah, I had. Bumped both in sys/sys/param.h. Getting back into the machine to double check that I didn't do something silly with that now.

Yeah, I had bumped both of the sys/sys/param.h lines to 1300037.

My userland on the machine is 1300030.

I built a test program to examine auxv (or at least libc's idea of it): http://drop.rtk0.net/auxv.c

Output from my normal netboot kernel:

root@blackbird:~ # /auxv
AT_PHDR: 0x10000040
AT_PHENT: 56
AT_PHNUM: 5
AT_PAGESZ: 4096
AT_FLAGS: 0x0
AT_ENTRY: 0x100a53f8
AT_BASE: 0x0
AT_EHDRFLAGS: 0x0
AT_EXECPATH: /auxv
AT_OSRELDATE: 1300031
AT_CANARY: 0x3fffffffffffdf98
AT_CANARYLEN: 64
AT_NCPUS: 32
AT_PAGESIZES: 0x3fffffffffffdf90
AT_PAGESIZESLEN: 8
AT_STACKPROT: 0x3
AT_HWCAP: 0xdc007182
AT_HWCAP2: 0xffe00000

Output from kernel prompt:

/auxv
AT_PHDR: 0x10000040
AT_PHENT: 56
AT_PHNUM: 5
AT_PAGESZ: 4096
AT_FLAGS: 0x0
AT_ENTRY: 0x100a53f8
AT_BASE: 0x0
AT_EHDRFLAGS: 0x0
AT_EXECPATH: /auxv
AT_OSRELDATE: 1300037
AT_CANARY: 0x3fffffffffffdf98
AT_CANARYLEN: 64
AT_NCPUS: 32
AT_PAGESIZES: 0x3fffffffffffdf90
AT_PAGESIZESLEN: 8
AT_STACKPROT: 0x3
AT_HWCAP: 0xdc007182
AT_HWCAP2: 0xffe00000
ld-elf.so.1: morepages: cannot mmap anonymous memory: Invalid argument
Out of memory
Enter full pathname of shell or RETURN for /bin/sh:

So it looks like it's at least making it into a static program okay. (I believe the errors afterwards might be from it trying the default again?)

I do note that exec_copyout_strings() decreases a vector by AT_COUNT, which doesn't match the memory allocation anymore. Checking to see if bumping that instance to 32 helps.

Yep, that did it. exec_copyout_strings() needs to use the correct size for the aux vector or it will screw things up somehow.

bdragon added inline comments.Jul 23 2019, 2:10 AM
sys/kern/imgact_elf.c
1345

Was there a reason for the choice of 32 in the first place btw, given that increasing it over AT_COUNT is what causes the issue in exec_copyout_strings()?

brooks added inline comments.Jul 23 2019, 2:47 AM
sys/kern/imgact_elf.c
1345

Dumb mistake that for whatever reason didn't break on my mips branch... I honestly don't know what I was thinking.

brooks updated this revision to Diff 60038.Jul 23 2019, 2:56 AM
  • Use the correct AT_COUNT value, not the CheriBSD one.
brooks marked an inline comment as done.Jul 23 2019, 2:57 AM
brooks added inline comments.
sys/kern/imgact_elf.c
1345

It's a transposition from my test tree. We have 5 more AT_ values there.

OK, with OLD_AT_COUNT dropped down to 27, it continues working whether I use 32 or 27 in exec_copyout_strings(), so it looks like it will continue to work even if the normal AT_COUNT is incremented in the future.

There are two main problems currently.

The first is that calls to elf_aux_info(3) will need to be dependent on the version of the executable.

The second (more immediate) problem is that _rtld() will need to know how to deal with both kinds of auxv, as what it gets passed is dependent on p_osrel of the program it is interpreting. Lack of this makes installworld crash out immediately after installing the new rtld.

Additionally, when direct-executing (/libexec/ld-elf.so.1 /path/to/program), it will be using rtld's version, not the target program's.

I'm not sure we can get away with this without a flag day.

OK, I wasn't sure exactly how the bits in rtld were going to go. It doesn't look like this is going to be as much help as I'd hoped.

Would this survive the following?

chflags -R noschg /
tar xf base.txz

If so, that could be a path forward for a flag day. Awkward, but at least not requiring a reinstall from scratch.

Actually, we might be able to determine which format the aux vector is in with a heuristic.

If aux value 23 exists in the vector, it's definitely the new format, as aux value 23 cannot exist in the old format and AT_STACKPROT is *always* added to the vector.
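The heuristic above could be sketched roughly as follows. This is only an illustration, not the patch that was actually applied: `struct aux_entry` is a simplified stand-in for the real `Elf_Auxinfo`, and the `SKETCH_*` constants and `auxv_is_new_format()` are hypothetical names. The only facts taken from the discussion are that type 23 cannot appear in an old-format vector and that a new-format vector always contains it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for Elf_Auxinfo (assumption, not the real type). */
struct aux_entry {
	uint64_t a_type;
	uint64_t a_val;
};

#define SKETCH_AT_NULL     0	/* terminator of the aux vector */
#define SKETCH_AT_NEW_ONLY 23	/* per the thread: only present in the new layout */

/*
 * Hypothetical helper: walk the vector up to the terminator and report
 * whether the entry type that only exists in the new numbering is present.
 * Returns 1 for a new-format vector, 0 otherwise.
 */
int
auxv_is_new_format(const struct aux_entry *auxv)
{
	const struct aux_entry *p;

	assert(auxv != NULL);
	for (p = auxv; p->a_type != SKETCH_AT_NULL; p++) {
		if (p->a_type == SKETCH_AT_NEW_ONLY)
			return (1);
	}
	return (0);
}
```

A caller in the position of rtld would presumably run a check like this once, before interpreting any entry, and translate the old numbering only when it returns 0.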

My patch seemed to fix things. I was able to buildworld and installworld (after rolling base back to a non crashy version).

A second makeworld / installworld on top of this also worked.

elf_aux_info(3) is still likely going to give wrong answers though when the new libc is dynamically loaded into an old program. Also, __elf_aux_vector is being set to the unmodified stack version of the auxv, so libc is also being exposed to both versions depending on the program's header. Not sure what the cleanest way to sort this out would be.

Other code that does its own initialization from scratch might need to be adapted too, although, surprisingly, golang won't be affected, as it only uses entries that didn't end up getting renumbered. It's really only problematic in the dynamic linking case though, and I don't know what code other than rtld and libc is in a place to be dynamically loaded into legacy programs.

brooks updated this revision to Diff 60097.Jul 24 2019, 5:02 PM
  • Detect old auxargs constants and translate them (from bdragon@).
  • Put freebsd_fixup_old_auxargs under #ifdef powerpc.

This looks like it provides a viable update path. As you say, libc may have some edge case issues, but it does look like it does the job of letting people upgrade without a reinstall from media. What do you think about timing this change? It seems like it should happen close to the ELFv2 switch so people get the whole thing out of the way at once.

The other question would be, when to remove the compatibility bits. I don't know the powerpc community well enough to have an opinion.

brooks edited the summary of this revision.Jul 24 2019, 5:25 PM

That's up to jhibbits.

For 32 bit powerpc at least, where we don't have the ABI switch excuse, I think it's gonna have to be kept around as COMPAT_FREEBSD12 for the foreseeable future.