Replace loop in ensureCapacity with a std.mem.max for the new capacity. #1233

BarabasGitHub · 2018-07-13T21:20:08Z

The loop just looked really weird to me and I couldn't just not fix it.

isaachier · 2018-07-13T21:57:22Z

std/array_list.zig

-                if (better_capacity >= new_capacity) break;
-            }
+            better_capacity += better_capacity / 2 + 8;
+            better_capacity = std.mem.max(usize, []usize{better_capacity, new_capacity});


This isn't what you want. You need to keep multiplying better_capacity by two and adding 8 until you are greater than new_capacity. The loop actually accomplishes this mathematical recurrence: f(n) = f(n - 1) + f(n - 1) / 2 + 8. The new code does not yield the same result.

Edit: thanks @tiehuis for correcting recurrence.

Just clarifying, but the current factor is 1.5x f(n) = f(n-1) + f(n-1) / 2 + 8 which is arguably a better ratio than 2. We want to do this in a loop because it preserves the property that a resize will always slightly overallocate on the assumption that more items beyond the point will eventually be added.

It could be worthwhile in some cases to have an ensureCapacityExact which resizes to the exact amount if it is less than.

I know the new code doesn't do the same as what the old code does. But I don't think what the old code does makes any sense. I can see that you might want slightly more than asked for, but there are simpler ways to achieve that than having a loop.

ps. Why is 1.5 better than 2? If you just append elements in a tight loop 2 is better than 1.5 (and probably >2 is even better) because you just do less allocations and copies.

pps. Just for reference. For std::vector libstdc++ has 2 and the microsoft one has 1.5. And both just take the greater of the equivalent of better_capacity and new_capacity.

https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

Reasoning for the 2 vs 1.5 discussion

Makes sense. I wonder how much that affects real life programs. However, they don't always use 1.5.
https://github.com/facebook/folly/blob/f6f9d7ead19c34c033c4b673cc7ecb2942a6cb4a/folly/FBVector.h#L1186

By the way the +8 also pushes back the point of reusing memory significantly when starting with small sizes.

Reading that piece also makes me wonder about the allocators (and realloc). Maybe they could return a larger size than you requested if that suits them better.

andrewrk · 2018-07-14T13:54:28Z

It's a curious problem, isn't it?

Thanks for the FBVector link, @monouser7dig.

As far as I'm concerned these kinds of decisions are up for debate until 1.0.0. I'm intrigued by the idea of supporting a more sophisticated allocator interface so that allocators can provide (optional?) introspection into their state and the most efficient way to utilize them.

For example, an ArenaAllocator could hint that a reallocation would unconditionally have to do a memcpy, and DirectAllocator could hint that if an allocation used less than an entire page, it could realloc to gain the rest of the page for free.

ghost · 2018-07-14T14:02:53Z

@andrewrk
there is also things like http://llvm.org/doxygen/classllvm_1_1SmallVector.html#details but you've probably seen/ used that already.
allocating on the stack first and then growing in the heap after some threshold.

andrewrk · 2018-07-14T14:21:45Z

I generalized this concept with a "stack fallback allocator":

Example usage:

zig/std/os/index.zig

Line 2802 in 2a719ee

    
           const allocator = std.heap.stackFallback(usize_count * @sizeOf(usize), fallback_allocator).get();

Implementation:

zig/std/heap.zig

Line 364 in 2a719ee

    
           pub fn stackFallback(comptime size: usize, fallback_allocator: *Allocator) StackFallbackAllocator(size) {

It's probably not faster, but if the goal is to avoid allocation entirely for allocations below some threshold then I think it makes sense to use. For example, probably we will use it here:

zig/std/os/index.zig

Lines 320 to 325 in 2a719ee

    
           pub fn posixOpen(allocator: *Allocator, file_path: []const u8, flags: u32, perm: usize) PosixOpenError!i32 { 
        
               const path_with_null = try cstr.addNullByte(allocator, file_path); 
        
               defer allocator.free(path_with_null); 
        
               return posixOpenC(path_with_null.ptr, flags, perm); 
        
           }

Replace loop in ensureCapacity with a std.mem.max for the new capacity.

20530fc

isaachier reviewed Jul 13, 2018

View reviewed changes

andrewrk closed this Jul 14, 2018

BarabasGitHub deleted the remove-weird-loop-in-array-list branch July 14, 2018 16:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace loop in ensureCapacity with a std.mem.max for the new capacity. #1233

Replace loop in ensureCapacity with a std.mem.max for the new capacity. #1233

BarabasGitHub commented Jul 13, 2018

isaachier Jul 13, 2018 •

edited

tiehuis Jul 13, 2018

BarabasGitHub Jul 14, 2018

ghost Jul 14, 2018

BarabasGitHub Jul 14, 2018 •

edited

andrewrk commented Jul 14, 2018

ghost commented Jul 14, 2018

andrewrk commented Jul 14, 2018

Replace loop in ensureCapacity with a std.mem.max for the new capacity. #1233

Replace loop in ensureCapacity with a std.mem.max for the new capacity. #1233

Conversation

BarabasGitHub commented Jul 13, 2018

isaachier Jul 13, 2018 • edited

Choose a reason for hiding this comment

tiehuis Jul 13, 2018

Choose a reason for hiding this comment

BarabasGitHub Jul 14, 2018

Choose a reason for hiding this comment

ghost Jul 14, 2018

Choose a reason for hiding this comment

BarabasGitHub Jul 14, 2018 • edited

Choose a reason for hiding this comment

andrewrk commented Jul 14, 2018

ghost commented Jul 14, 2018

andrewrk commented Jul 14, 2018

isaachier Jul 13, 2018 •

edited

BarabasGitHub Jul 14, 2018 •

edited