Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

external memory allocation management #4964

Merged
merged 9 commits into from Jun 18, 2013
Merged

external memory allocation management #4964

merged 9 commits into from Jun 18, 2013

Conversation

trevnorris
Copy link

Figured I'd get this out there early. There are a massive number of todo's, but wanted feedback as I moved along.

The general point of these changes are:

  • Create single location for all external memory allocation
  • Manage single location for improved performance and memory management
  • Remove Fast/Slow Buffer logic
  • Hopefully remove need for SlabAllocator (streams), SlabBuffer (tls) and allocNewPool (fs)

Everything here is up for debate. These changes will need to be ridiculously well tested and thought though.

EDIT: Anyone looking for the removal of the SlabAllocator should check trevnorris/node/no-slaballocator. That is built on top of this PR, but doesn't belong here. As soon as this is accepted we'll begin discussing the other.

@isaacs
Copy link

isaacs commented Mar 8, 2013

Just in case it is unclear to anyone: Under no circumstances should any of this go into 0.10.

Exciting stuff, though :)

@trevnorris
Copy link
Author

To help demonstrate one of the reasons for this change, take the following script:

var a = [];
for (var i = 0; i < 6e5; i++) {
  a.push(new Buffer(1));
  new Buffer(Buffer.poolSize - 2);
}
var rss = process.memoryUsage().rss;
console.log(((rss / 1024 / 1024)|0) + ' MB');
// output: 2639 MB

Where as the following similar script has very different results:

var a = [];
for (var i = 0; i < 6e5; i++) {
  a.push(new Buffer(1));
}
var rss = process.memoryUsage().rss;
console.log(((rss / 1024 / 1024)|0) + ' MB');
// output: 72 MB

What's happening here is a single byte Buffer is being stored in an Array. Which by itself doesn't cost much. Though when you allocate enough of the remainder of the buffer pool, telling it to allocate a new one, a lot of unused memory is left hanging out in the open.

Unfortunately v8 doesn't currently have a performant enough way to track when a js Object has been gc'd (right using Persistent objects and MakeWeak callbacks are the only way). So as a trade off node makes the assumption that most small buffers will be short lived.

@trevnorris
Copy link
Author

@isaacs @bnoordhuis going to say this is ready for initial review. Still waiting for a decision on #5323 to determine how to handle test-buffer.js. Other than that everything passes.

@rvagg
Copy link
Member

rvagg commented May 16, 2013

don't forget to increment NODE_MODULE_VERSION when this lands please.

@trevnorris
Copy link
Author

Rebased off latest crypto changes. All tests pass, and working fine.

review please

/cc @bnoordhuis @isaacs (and anyone else who cares)


No pooling is performed for these allocations. So there's no form of memory
leak.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean "no form of memory management"? This area needs some giant warnings indicating that these alloc'ed objects need to be manually tracked and disposed of.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better yet, just don't document it at all, and put a _ in front of the functions. Then put the scary warnings in the source code comments :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. Didn't make it clear. These are completely managed by gc, with the
option to dispose at will. My perf tests show if the user is good about
disposing even Persisted objects performance is much better. Seems the
biggest hit is from gc needing to transverse and look for them.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhh... ok. So these aren't thinbuffers, they're just "stick some allocated stuff onto an object"?

So, just to be clear, what happens if you Buffer.alloc(1024, {})? Does the memory get free'd when the object is GC'ed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. With the perk of being able to manually delete the external memory and
transform it to a zero length buffer

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well. I should say zero length allocated object. This is mainly an api for
devs like @TooTallNate who are likely to find it useful to attach external
memory to any given object, and as a launch pad for my next patch.

There I added a dispose method to Buffers as well (which I'll show has some
awesome uses), but couldn't be done without removing the SlabAllocator.
Though all that definitely doesn't belong in this PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@@ -137,6 +137,19 @@ inline bool IsBigEndian() {
return GetEndianness() == kBigEndian;
}

// parse index for external array data
inline static size_t ParseArrayIndex(v8::Handle<v8::Value> arg, size_t def) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the v8:: is unnecessary here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not in node_internals.h. no using directive in there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't have to be static. Inline implies static (in C++ at least.)

#define MIN(a, b) ((a) < (b) ? (a) : (b))

#define CHECK_OOB(r) \
if (r) return ThrowRangeError("out of range index");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap in a do { ... } while (0) block without a trailing semi-colon.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@bnoordhuis
Copy link
Member

Nearly there, Trevor. :-)

@trevnorris
Copy link
Author

@bnoordhuis thanks dude. those 3 things are fixed. :)

assert(length <= dest_length);
// now we can guarantee these will catch oob access and *_start overflow
assert(source_start + length <= source_length);
assert(dest_start + length <= dest_length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a logic bug here. Assuming size_t has 32 bits, then if dest_start=0xfffffffe, length=10 and dest_length=20 then (length < dest_length) and (dest_start + length <= dest_length) both hold but dest_data + dest_start will point outside the buffer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the two necessary checks. thanks for pointing this out.

smalloc is a simple utility for quickly allocating external memory onto
js objects. This will be used to centralize how memory is managed in
node, and will become the backer for Buffers. So in the future crypto's
SlabBuffer, stream's SlabAllocator will be removed.

Note on the js API: because no arguments are optional the order of
arguments have been placed to match their cc counterparts as closely as
possible.
If the user knows the allocation is no longer needed then the memory can
be manually released.

Currently this will not ClearWeak the Persistent, so the callback will
still run.

If the user passed a ClearWeak callback, and then disposed the object,
the buffer callback argument will == NULL.
Memory allocations are now done through smalloc. The Buffer cc class has
been removed completely, but for backwards compatibility have left the
namespace as Buffer.

The .parent attribute is only set if the Buffer is a slice of an
allocation. Which is then set to the alloc object (not a Buffer).

The .offset attribute is now a ReadOnly set to 0, for backwards
compatibility. I'd like to remove it in the future (pre v1.0).

A few alterations have been made to how arguments are either coerced or
thrown. All primitives will now be coerced to their respective values,
and (most) all out of range index requests will throw.

The indexes that are coerced were left for backwards compatibility. For
example: Buffer slice operates more like Array slice, and coerces
instead of throwing out of range indexes. This may change in the future.

The reason for wanting to throw for out of range indexes is because
giving js access to raw memory has high potential risk. To mitigate that
it's easier to make sure the developer is always quickly alerted to the
fact that their code is attempting to access beyond memory bounds.

Because SlowBuffer will be deprecated, and simply returns a new Buffer
instance, all tests on SlowBuffer have been removed.

Heapdumps will now show usage under "smalloc" instead of "Buffer".

ParseArrayIndex was added to node_internals to support proper uint
argument checking/coercion for external array data indexes.

SlabAllocator had to be updated since handle_ no longer exists.
While the new Buffer implementation is much faster we still have the
necessity of using Buffer pools. This is undesirable because it may
still lead to unwanted memory retention, but for the time being this is
the best solution.

Because of this re-introduction, and since there is no more SlowBuffer
type, the SlowBuffer method has been re-purposed to return a non-pooled
Buffer instance. This will be helpful for developers to store data for
indeterminate lengths of time without introducing a memory leak.

Another change to Buffer pools was that they are only allocated if the
requested chunk is < poolSize / 2. This was done because allocations are
much quicker now, and it's a better use of the pool.
Expose the ability for users to allocate and manually dispose data on
any object. These are user-safe versions of internal smalloc functions.
Several things are now no longer necessary. These have been deprecated,
and will be removed in v0.13.
Old fill would take the char code of the first character and wrap around
the int to fit in the 127 range. Now fill will duplicate whatever string
is given through the entirety of the buffer.

Note: There is one bug around ending on a partial fill of any character
outside the ASCII range.
So that Windows users can properly include smalloc and node_buffer,
NODE_EXTERN was added to the headers that export this functionality.
@bnoordhuis
Copy link
Member

LGTM

@edhemphill
Copy link

I'm pretty late to the party and was hoping someone could catch me up... What's the correct way to wrap existing memory with a Buffer? I can't seem to get the free_callback to call with Buffer. I know there was some chatter on IRC quite a while back on getting rid of this. We're on the 0.10.x series but can move if necessary...

void free_test_cb(char *m,void *hint) {
    DBG_OUT("FREEING MEMORY.");
    free(m);
}

Handle<Value> WrapMemBufferTest(const Arguments& args) {
    HandleScope scope;
    char *mem = (char *) ::malloc(100);
    memset(mem,'A',100);
    node::Buffer *buf = node::Buffer::New(mem,100,free_test_cb,0);
    return scope.Close(buf->handle_);
}

But the free_test_cb() is just not getting called in a simple test program.
...and then I even tried throwing these in there:

void weak_cb(Persistent<Value> object, void* parameter) {
    object.Dispose();
}

Handle<Value> WrapMemBufferTest(const Arguments& args) {
    HandleScope scope;
    char *mem = (char *) ::malloc(100);
    memset(mem,'A',100);
    node::Buffer *buf = node::Buffer::New(mem,100,free_test_cb,0);
    buf->handle_.MakeWeak(NULL, weak_cb);  // new
    return scope.Close(buf->handle_);
}

Any advice appreciated.

@bnoordhuis
Copy link
Member

@edhemphill Finalizers are deferred, they don't run inmmediately when the last reference to the object goes away. If your program is short-lived, they may not run. Theoretically, even in longer-lived programs, they may not run ever. If your resource needs a deterministic life cycle, you have to manage it yourself.

@edhemphill
Copy link

@bnoordhuis Interesting... thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants