Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: use platform specific Pointer sizes #4594

Conversation

ysbaddaden
Copy link
Contributor

The current implementation uses UInt64 for pointer allocation sizes, but UInt32 for operating on pointer sizes (memcpy, memmove, memset) despite using the 64 bit LLVM intrinsics; and was illogical: if we allocate a large section of memory, we must be able to zero and copy/move it, whereas 32bit platforms should deal with UInt32, the maximum possible value, not larger ones.

This patch proposes to constraint to a platform specific size only (i.e. LibC::SizeT) in order to always use UInt32 on 32 bit platforms, and UInt64 on 64 bit platforms.

We cannot clamp sizes to Int32::MAX, because some allocations (e.g. Array(Int32).new(Int32::MAX)) can be larger than allowed.

Alternatively we could disallow such allocations, thus allowing to clamp the allocation sizes? I mean, that's 2 million entries; the example array above with T=Int32 would be 8GB; with T=Int32|Int64 that's 32GB... Yet, since it could end up being an issue with external C programs, I (reluctantly) admit we should allow a platform specific sizes.

This patch also fixes the use of 64bit variants of LLVM intrinsics on AArch64 (memcpy, memmove, memset).

Fixes #4589

@@ -1,6 +1,6 @@
lib Intrinsics
fun debugtrap = "llvm.debugtrap"
{% if flag?(:x86_64) %}
{% if flag?(:x86_64) || flag(:aarch64) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not make more sense to check the type of LibC::SizeT here? It's beyond the scope of this PR because I don't think there's a way because of how aliases are handled in macros right now, but it might be a good idea as we add more 64bit platforms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of adding the :abi64 and :abi32 flags, based on the architecture, to solve this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that too. Then I thought that maybe SizeT could possibly be the wrong type if the platform's libc is weird. But I think that's such an abstract worry that adding flags is probably the best solution.

src/pointer.cr Outdated
@@ -405,7 +405,7 @@ struct Pointer(T)
# ptr.address # => 5678
# ```
def self.new(address : Int)
new address.to_u64
new LibC::SizeT.new(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we introduce ::SizeT type because it's very common one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And ::OffsetT too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be this way: #4594 (comment) ?

@funny-falcon
Copy link
Contributor

Still, why not expose LibC::SizeT?

struct Pointer
   alias UIntPtr = LibC::SizeT # used for allocation size
   alias IntPtr = LibC::SSizeT # used for pointer arithmetic
   # note: looks like there is no need in SSizeT/PtrDiffT/IntPtrT distinction like in C,
   # cause Crystal is supported only on platforms where they are equal
end
struct Int
   def to_uptr
      Pointer::UIntPtr.new(self)
   end
   def to_iptr
      Pointer::IntPtr.new(self)
   end
end

@RX14
Copy link
Contributor

RX14 commented Jun 19, 2017

We could put these types under Crystal::System since they're platform specific. Not sure if we want them instead of using SizeT though.

@akzhan
Copy link
Contributor

akzhan commented Jun 19, 2017

No, pointers and pointer arithmetic are different use cases.

First, we already have Pointer type. So we need only SizeT to manipulate sizes, and OffsetT to manipulate offsets.

OffsetT is always signed because may be negative. http://www.cplusplus.com/reference/cstddef/ptrdiff_t/

SizeT is unsigned by design. http://www.cplusplus.com/reference/cstddef/size_t/

@funny-falcon
Copy link
Contributor

funny-falcon commented Jun 19, 2017

Pointer#address have to return some integer, and best name for is Pointer::UIntPtr.

There could be issue on 32bit platform when allocated array is on the 2GB border: ie array start is below
border, and array end is above. It breaks end-to-start comparison with pointer address if address is signed.
Also there is PowerPC platform with full 64bit pointers (in contrast to 48bit on x86_64). Although I'm not convenient with how different OS allocates memory on PowerPC.

So Pointer#address have to be unsigned. And then it matches SizeT (signed). And expected name for is Pointer::UIntPtr.
Then signed variant is expected to be named Pointer::IntPtr, imho.

Note, names could be shorter: Pointer::Uint and Pointer::Int. But I think, it is a bit uglier, and could lead to confusion with ::Int. I could be mistaken, though.

@akzhan
Copy link
Contributor

akzhan commented Jun 19, 2017

Pointer::Address and Pointer::Diff?

@funny-falcon
Copy link
Contributor

I like Pointer::Diff.
But Pointer::Address serves both address and allocation amount.
On the other hand, Pointer::Diff could serve allocation amount well, if allocation of >= 2GB on 32bit platform will be forbidden. And then Diff is a diff of total allocated memory :-)

With Pointer::IntPtr and Pointer::UIntPtr methods Int#to_iptr and Int#to_uptr will have clear meaning.
But I could agree that this methods are not strictly necessary.

@ysbaddaden , what do you think about?

@ysbaddaden
Copy link
Contributor Author

No.

@ysbaddaden
Copy link
Contributor Author

Some aparte notes:

  • LibC::SizeT is unsigned, as per C functions / LLVM intrinsics:
  • Pointer#address is already unsigned. This may be the one place where an unsigned integer is expected:
typeof(Pointer(UInt8).null.address) # => UInt64 (on x86_64)
typeof(Pointer(UInt8).null.address) # => UInt32 (on ARM)
  • Pointer#address says nothing about allocation amount, it's a mere pointer to an address in memory. The allocated size must be retained separately —hence why one must use Slice not Pointer.

Again: Crystal isn't C nor Rust and doesn't need the numerous C types.

@ysbaddaden ysbaddaden force-pushed the fix-platform-specific-pointer-sizes branch 2 times, most recently from 1aa790e to 3a9d2b2 Compare June 19, 2017 13:22
@funny-falcon
Copy link
Contributor

funny-falcon commented Jun 19, 2017

Again: Crystal isn't C nor Rust and doesn't need the numerous C types.

No, it is not.
But since it is used (and not C nor Rust) to implement itself and its standard library, it have to have all convenient facilities for low-level.

The current implementation uses UInt64 for pointer allocation sizes,
but UInt32 for operating on pointer sizes (memcpy, memmove, memset)
despite using the 64 bit LLVM intrinsics; and is illogical: if we
allocate a large section of memory, we must be able to zero and
copy/move it, whereas 32bit platforms should deal with UInt32, the
maximum possible value, not larger ones.

This patch proposes to constraint to a platform specific size only
(i.e. LibC::SizeT) in order to always use UInt32 on 32 bit
platforms, and UInt64 on 64 bit platforms.

We cannot clamp sizes to Int32::MAX, because some allocations (e.g.
Array(Int32).new(Int32::MAX)) can be larger than allowed.

Also fixes usage of 64bit variant of LLVM intrinsics on AArch64.
@ysbaddaden ysbaddaden force-pushed the fix-platform-specific-pointer-sizes branch from 3a9d2b2 to fd84715 Compare June 19, 2017 13:44
@ysbaddaden
Copy link
Contributor Author

I should refrain from touching such internals. This is broken.

@ysbaddaden ysbaddaden closed this Jun 19, 2017
@sdogruyol
Copy link
Member

Now, this was a weird PR 😕

@@ -52,6 +52,12 @@ struct Pointer(T)
end
end

{% if flag?(:x86_64) || flag?(:aarch64) %}
private alias SizeT = UInt64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why private? and why not alias SizeT = LibC::SizeT ?

@RX14
Copy link
Contributor

RX14 commented Jun 19, 2017

@ysbaddaden I don't really understand why you closed this.

@akzhan
Copy link
Contributor

akzhan commented Jun 19, 2017

AFAIS Simple update of OffT to LongLong to support 64bit offset in files require a lot of changes in stdlib, and so on.

It should be a major move of Crystal to 64bit.

@ysbaddaden ysbaddaden deleted the fix-platform-specific-pointer-sizes branch June 20, 2017 13:06
@ysbaddaden
Copy link
Contributor Author

@akzhan OffT is libc dependent and has nothing to do with this PR.

@RX14 compiler intrinsics seem to have constraints on pointer address and allocation size handlings, and that goes beyond my knowledge. Only the to_u32 calls in pointer.cr should be changed to be platform specific (maybe).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pointer's copy_from|copy_to|move_from|move_to are inherently broken.
5 participants