
arm64: Fixing user space boundary checking in copyinout.S
Closed, Public

Authored by der_semihalf.com on Mar 23 2016, 2:55 PM.

Details

Summary

A big buffer size could cause an integer overflow and, as a result, an attempt to copy beyond VM_USERMAX_ADDRESS.

This also fixes copyinstr boundary checking, where the compared value was accidentally overwritten when setting the fault handler.

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

der_semihalf.com retitled this revision from to arm64: Fixing user space boundary checking in copyinout.S.
der_semihalf.com updated this object.
der_semihalf.com edited the test plan for this revision. (Show Details)
der_semihalf.com added reviewers: wma, zbb, andrew, kib.
der_semihalf.com set the repository for this revision to rS FreeBSD src repository - subversion.

Yes, the issue is there, but I would prefer a different solution. I think the proposed change works because the user/kernel split is at half of the address space, and max-min appears to be less than half. I.e., by checking that the length is <= max-min, you actually ensure that there is no overflow when calculating the end address for the copy.

So why not check that there is no unsigned overflow when calculating the end address instead? Like

adds x3, x1, x2
b.cs copyio_fault_nopcb

for copyout, etc.
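For readers less familiar with arm64, a C analogue of the suggested sequence may help. On arm64, `adds` sets the carry flag when the unsigned sum overflows and `b.cs` branches (here, to the fault handler); in C, the same wrap-around shows up as the sum being smaller than the first operand. A sketch, not the actual kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/* compute_end() is a C analogue of `adds x3, x1, x2` followed by
 * `b.cs copyio_fault_nopcb`: it computes the end address and reports
 * whether the unsigned addition carried out of 64 bits. */
static bool compute_end(uint64_t uaddr, uint64_t len, uint64_t *end)
{
    *end = uaddr + len;       /* may wrap modulo 2^64 */
    return *end >= uaddr;     /* false = carry set, take the fault path */
}
```

A normal request (small len) passes; a wrapping one is rejected before any copying starts.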

I agree that the solution you propose is more elegant, and it takes fewer instructions.

How would this be triggered?

In D5719#122133, @kib wrote:

Yes, the issue is there, but I would prefer a different solution. I think the proposed change works because the user/kernel split is at half of the address space, and max-min appears to be less than half. I.e., by checking that the length is <= max-min, you actually ensure that there is no overflow when calculating the end address for the copy.

The user address space is at most 2^48 bytes; there is then an unusable region that neither userspace nor the kernel can use, followed by up to 2^48 bytes for the kernel. The two regions are at the extremes of the address space, i.e. userspace starts at the bottom of the VA space and the kernel finishes at the top.

Because of this I don't see how this would be anything other than an optimisation. If we were asked to copy such a large buffer, assuming the first user address is valid, we will eventually hit the middle region where there are no valid virtual addresses. We will then fault & return an error. Depending on the start address this may be after a large amount of data has been copied, but the point is we will hit this invalid address well before getting to a kernel address or wrapping.

How would this be triggered?

In D5719#122133, @kib wrote:

Yes, the issue is there, but I would prefer a different solution. I think the proposed change works because the user/kernel split is at half of the address space, and max-min appears to be less than half. I.e., by checking that the length is <= max-min, you actually ensure that there is no overflow when calculating the end address for the copy.

The user address space is at most 2^48 bytes; there is then an unusable region that neither userspace nor the kernel can use, followed by up to 2^48 bytes for the kernel. The two regions are at the extremes of the address space, i.e. userspace starts at the bottom of the VA space and the kernel finishes at the top.

Because of this I don't see how this would be anything other than an optimisation. If we were asked to copy such a large buffer, assuming the first user address is valid, we will eventually hit the middle region where there are no valid virtual addresses. We will then fault & return an error. Depending on the start address this may be after a large amount of data has been copied, but the point is we will hit this invalid address well before getting to a kernel address or wrapping.

Suppose that a malicious user passes a valid kernel address as the buffer address, and a huge length that causes overflow so that the sum of addr+length falls into the valid user address range. Then e.g. copyout would successfully copy some data to userspace before faulting.
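A hedged numeric sketch of that scenario; the addresses below are illustrative, not the real arm64 layout:

```c
#include <stdint.h>

/* copy_end() shows the wrap being described: starting from a (made-up)
 * kernel address with an attacker-chosen length, the 64-bit sum wraps
 * past 2^64 and lands back in the low, user half of the VA space, so
 * a check on the end address alone would not stop the copy. */
static uint64_t copy_end(uint64_t addr, uint64_t len)
{
    return addr + len;  /* wraps modulo 2^64 */
}
```

For example, a high kernel-half address of 0xffff000000001000 plus a length of 2^48 wraps to 0x1000, a perfectly plausible user address.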

Because of this I don't see how this would be anything other than an optimisation.

Do we not want this kind of optimization? I do not see why it is worth spending CPU cycles on an action that is certain to fail. That is sloppy on our side.

In D5719#122140, @kib wrote:

Suppose that a malicious user passes a valid kernel address as the buffer address, and a huge length that causes overflow so that the sum of addr+length falls into the valid user address range. Then e.g. copyout would successfully copy some data to userspace before faulting.

This is an argument for using the ldtr & sttr instructions, or for checking that the user address is valid and then relying on the unmapped middle section.

Because of this I don't see how this would be anything other than an optimisation.

Do we not want this kind of optimization? I do not see why it is worth spending CPU cycles on an action that is certain to fail. That is sloppy on our side.

Not when it's an optimisation for an uncommon case at the cost of the common case.

This is an argument for using the ldtr & sttr instructions, or for checking that the user address is valid and then relying on the unmapped middle section.

Using ldtr is harder than it is worth; you are aware of D3617, and the instruction still requires the fault handler to understand those failures.

I also do not like relying on the hole; it creates a dependency on a hardware or implementation detail that could change and silently reintroduce the issue.

Other arches do the correct thing: they check that the end is computed as a sum without an overflow, and then check that the end does not extend beyond the start of the kernel portion. It is my bug, and I am not even sure why I managed to do the right thing on arm but not on arm64, almost simultaneously.
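The two-part check described here can be sketched in C. USER_MAX is a hypothetical stand-in for the real user address limit:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user address limit, for illustration only. */
#define USER_MAX 0x0000ffffffffffffULL

/* range_ok() follows the scheme used on other arches: first make sure
 * the end address is computed without unsigned overflow, then make
 * sure it does not extend into the kernel portion of the VA space. */
static bool range_ok(uint64_t uaddr, uint64_t len, uint64_t *end)
{
    *end = uaddr + len;
    if (*end < uaddr)           /* unsigned overflow while summing */
        return false;
    return *end <= USER_MAX;    /* must stay below the kernel range */
}
```

Both failure modes are caught: a sum that wraps, and a non-wrapping sum that reaches past the user limit.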

Because of this I don't see how this would be anything other than an optimisation.

Do we not want this kind of optimization? I do not see why it is worth spending CPU cycles on an action that is certain to fail. That is sloppy on our side.

Not when it's an optimisation for an uncommon case at the cost of the common case.

The above is a good argument.

In D5719#122151, @kib wrote:

I also do not like relying on the hole; it creates a dependency on a hardware or implementation detail that could change and silently reintroduce the issue.

But this is an even better argument to do the checking.

In D5719#122151, @kib wrote:

Other arches do the correct thing: they check that the end is computed as a sum without an overflow, and then check that the end does not extend beyond the start of the kernel portion

I will look into your solution for arm and do it similarly.

I will look into your solution for arm and do it similarly.

I already explained it: the adds/b.cs check during the calculation of the buffer end.

der_semihalf.com edited edge metadata.

Implemented fixes suggested by kib.

kib edited edge metadata.

The copyinstr fix is a shame.

This revision is now accepted and ready to land. Mar 24 2016, 9:49 AM
This revision was automatically updated to reflect the committed changes.