Enumerator out of memory error #2577

iconara · 2015-02-07T20:00:18Z

I've found that it's possible to stress JRuby into crashing with an out of memory error with the following code.

The code zips the bytes of two strings together. Their lengths don't matter at all; a single character is sufficient. It then checks whether any of the bytes are nil, which should be impossible but happens anyway. Originally I didn't check for nil; the first indication of a problem was errors from calling methods on nil in places where nothing could be nil. When I put a begin…rescue around it to see what was nil, I got an out of memory error instead.

s1 = 'a'
s2 = 'b'

100000.times do
  b1 = s1.each_byte
  b2 = s2.each_byte
  bytes = b1.zip(b2).flatten
  if bytes.any? { |b| b.nil? }
    puts('this can never happen')
  end
end

This prints the following in JRuby 1.7.18, 1.7.19 and HEAD (probably all other versions too):

this can never happen
this can never happen
this can never happen
this can never happen
this can never happen
Error: Your application used more memory than the safety cap of 500M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

The exact number of "this can never happen" lines differs from run to run.

The full stack trace of the OutOfMemoryError is:

java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:713)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
    at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
    at org.jruby.RubyEnumerator$ThreadedNexter.ensureStarted(RubyEnumerator.java:700)
    at org.jruby.RubyEnumerator$ThreadedNexter.next(RubyEnumerator.java:654)
    at org.jruby.RubyEnumerator.next(RubyEnumerator.java:461)
    at org.jruby.RubyEnumerator$INVOKER$i$0$0$next.call(RubyEnumerator$INVOKER$i$0$0$next.gen)
    at org.jruby.RubyClass.finvoke(RubyClass.java:616)
    at org.jruby.runtime.Helpers.invoke(Helpers.java:593)
    at org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:359)
    at org.jruby.RubyEnumerable.zipEnumNext(RubyEnumerable.java:1679)
    at org.jruby.RubyEnumerable$50.call(RubyEnumerable.java:1635)
    at org.jruby.runtime.CallBlock.doYield(CallBlock.java:80)
    at org.jruby.runtime.BlockBody.yield(BlockBody.java:82)
    at org.jruby.runtime.Block.yield(Block.java:147)
    at org.jruby.RubyString.enumerateBytes(RubyString.java:5468)
    at org.jruby.RubyString.each_byte19(RubyString.java:5275)
    at org.jruby.RubyString$INVOKER$i$0$0$each_byte19.call(RubyString$INVOKER$i$0$0$each_byte19.gen)
    at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodZeroBlock.call(JavaMethod.java:472)
    at org.jruby.RubyClass.finvoke(RubyClass.java:541)
    at org.jruby.runtime.Helpers.invoke(Helpers.java:589)
    at org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:394)
    at org.jruby.RubyEnumerator.each(RubyEnumerator.java:294)
    at org.jruby.RubyEnumerator$INVOKER$i$each.call(RubyEnumerator$INVOKER$i$each.gen)
    at org.jruby.RubyClass.finvoke(RubyClass.java:520)
    at org.jruby.runtime.Helpers.invoke(Helpers.java:577)
    at org.jruby.RubyEnumerable.callEach(RubyEnumerable.java:96)
    at org.jruby.RubyEnumerable.zipCommonEnum(RubyEnumerable.java:1626)
    at org.jruby.RubyEnumerable.zipCommon19(RubyEnumerable.java:1547)
    at org.jruby.RubyEnumerable.zip19(RubyEnumerable.java:1491)
    at org.jruby.RubyEnumerable$INVOKER$s$0$0$zip19.call(RubyEnumerable$INVOKER$s$0$0$zip19.gen)
    at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)
    at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:161)
    at tmp.jruby_issue.invokeOther3:zip(tmp/jruby_issue.rb)
    at tmp.jruby_issue.\=tmp\|jruby_issue\,rb_CLOSURE_1__tmp\|jruby_issue\,rb_0(tmp/jruby_issue.rb:7)
    at org.jruby.runtime.CompiledIRBlockBody.commonYieldPath(CompiledIRBlockBody.java:66)
    at org.jruby.runtime.IRBlockBody.yieldSpecific(IRBlockBody.java:84)
    at org.jruby.runtime.Block.yieldSpecific(Block.java:116)
    at org.jruby.RubyFixnum.times(RubyFixnum.java:300)
    at org.jruby.RubyFixnum$INVOKER$i$0$0$times.call(RubyFixnum$INVOKER$i$0$0$times.gen)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:303)
    at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:141)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:145)
    at tmp.jruby_issue.invokeOther13:times(tmp/jruby_issue.rb)
    at tmp.jruby_issue.__script__(tmp/jruby_issue.rb:4)
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:636)
    at org.jruby.ir.Compiler$1.load(Compiler.java:112)
    at org.jruby.Ruby.runScript(Ruby.java:827)
    at org.jruby.Ruby.runScript(Ruby.java:820)
    at org.jruby.Ruby.runNormally(Ruby.java:750)
    at org.jruby.Ruby.runFromMain(Ruby.java:572)
    at org.jruby.Main.doRunFromMain(Main.java:404)
    at org.jruby.Main.internalRun(Main.java:299)
    at org.jruby.Main.run(Main.java:226)
    at org.jruby.Main.main(Main.java:198)


headius · 2015-03-12T22:30:25Z

This may just be a downside of our having to use threads for all Enumerator#next logic.

You are creating two enumerators and then zipping the one against the other. This will probably create at least one thread, and possibly two. Once those threads reach the end of the data, they should shut down. If you walk away from them before they're complete, they should also shut down. What you're seeing here is that too many threads have been created and not cleaned up (perhaps due to GC delays) and so we can't create any more.
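A possible workaround on the Ruby side, sketched under an assumption: the stack trace above shows zip reaching Enumerator#next through zipEnumNext when it is handed an enumerator, so materializing both sides into arrays first should keep it off the threaded path. The helper name `zip_bytes` is hypothetical, and whether this avoids thread creation on a given JRuby version is not confirmed here.

```ruby
# Hypothetical helper: materialize both byte sequences with to_a (which
# iterates internally via each) before zipping, so that zip only ever
# sees plain arrays and has no reason to call Enumerator#next.
def zip_bytes(s1, s2)
  s1.each_byte.to_a.zip(s2.each_byte.to_a).flatten
end

100_000.times do
  bytes = zip_bytes('a', 'b')
  puts('this can never happen') if bytes.any? { |b| b.nil? }
end
```

The trade-off is that both byte sequences are fully copied into arrays, which is fine for short strings but costs memory for large ones.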

headius · 2015-03-12T22:37:57Z

I attempted to make it force a GC when it fails to create a new thread, but it doesn't seem to help here.

headius · 2015-03-12T22:49:55Z

It looks like our best bet would be to finally start making non-threaded enumerator logic similar to what we already have for Array#each (RubyEnumerator.ArrayNexter for example). That should make it possible for us to handle more core-class cases without threads, which should make your case work.

iconara · 2015-03-13T07:07:40Z

Thanks for looking into this. From my casual understanding of the problem it feels like as long as the underlying collection is sequential or in some other way externally enumerable there should be no need to use threads for enumeration.

It looks, for example, like RubyString#each_byte creates a RubyEnumerator and passes a size function, so could it also pass a function that enumerates the string (essentially a RubyEnumerator.Nexter)? I'm basically just trying to see whether I understand the underlying code correctly; I'm not familiar enough with it to see the whole picture or the downsides of a solution like that.

iconara · 2015-03-13T07:09:56Z

What I proposed is kind of what happens for RubyArray, but instead of having RubyEnumerator check the type of the underlying collection and decide on the best "nexter" strategy, the collection creates the strategy when it creates the RubyEnumerator.
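To make the idea concrete, here is a rough Ruby sketch of what a thread-free "nexter" for string bytes could look like. It is only an illustration of the concept: the class name `ByteNexter` is hypothetical, and JRuby's real Nexter classes are written in Java and are not reproduced here.

```ruby
# Hypothetical illustration: external enumeration over a string's bytes
# using nothing but an index, so no thread is needed to suspend and
# resume the iteration between calls to next.
class ByteNexter
  def initialize(string)
    @string = string
    @index = 0
  end

  # Returns the next byte, or raises StopIteration at the end,
  # mirroring the contract of Enumerator#next.
  def next
    raise StopIteration, 'iteration reached an end' if @index >= @string.bytesize
    byte = @string.getbyte(@index)
    @index += 1
    byte
  end

  # Restarts the enumeration from the first byte.
  def rewind
    @index = 0
    self
  end
end
```

This only works because the bytes of a string can be addressed by position; a collection that can only be walked through a block-taking each has no such shortcut.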

The JRuby enumerator uses a thread per next object in an enumerator, which proves costly. Hundreds of threads are created (tested with yourkit) when batch-creating evidence due to the "each_slice(500)" of the enumerator. This issue is logged in JRuby: jruby/jruby#2577. The solution employed was to yield each evidence directly to the block and batch 500 into an array at a time. This should avoid the OOM exception received: java.lang.OutOfMemoryError: unable to create new native thread. Indeed, the thread count was observed to be lower in yourkit.
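The batching approach described in that commit message might look roughly like the sketch below. This is not the actual openbel-api code; the method name `each_batch` and its parameters are hypothetical, and it only assumes that the source responds to a block-taking each.

```ruby
# Hypothetical sketch: collect items yielded by each into fixed-size
# arrays by hand, instead of going through an Enumerator object.
def each_batch(source, batch_size = 500)
  batch = []
  source.each do |item|
    batch << item
    if batch.size == batch_size
      yield batch
      batch = []
    end
  end
  # Hand over the final, possibly smaller, batch.
  yield batch unless batch.empty?
end

each_batch(1..1200) { |batch| puts batch.size }
```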



Analyzing stacktraces indicated many threads were being created with calls to Enumerator.next on JRuby. These threads did not complete and ultimately resulted in an out of memory error. The solution employed is to process all lines yielded to the block but expand, in a stateful manner, when a line continuator is encountered. JRuby bug: jruby/jruby#2577. Remove line_continuator mixin (orphaned). Closes OpenBEL/bel.rb#126.
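A stateful approach of that kind could be sketched as follows. This is an illustration rather than the bel.rb implementation: the method name `each_logical_line` is hypothetical, and a trailing backslash is assumed as the line continuator, which the commit message does not specify.

```ruby
# Hypothetical sketch: join continued lines using a buffer carried
# between iterations of each, so Enumerator#next is never needed to
# peek at the following line.
def each_logical_line(lines, continuator = '\\')
  buffer = String.new
  lines.each do |line|
    line = line.chomp
    if line.end_with?(continuator)
      # Drop the continuator and keep accumulating.
      buffer << line[0...-continuator.length]
    else
      yield buffer + line
      buffer = String.new
    end
  end
  # Emit whatever is left if the input ended on a continued line.
  yield buffer unless buffer.empty?
end
```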

commit c984f8a Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Tue Feb 2 15:08:44 2016 -0500 bump version dependencies for bel-rdf-jena / rdf rdf bumped to 1.99.1 bel-rdf-jena bumped to 0.4.2 commit e4eb5dd Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Mon Feb 1 14:50:34 2016 -0500 dataset serialization to all bel.rb translators updated dependencies to support all bel.rb translators refs #99 commit b1243d8 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Tue Jan 26 15:57:16 2016 -0500 aggregate on full-text search; avoids Mongo limits A full-text search filter to /api/evidence with a sort on bel_statement only used the text index. This means that the bel_statement sort had to be done in memory. This reaches the 32 MB sort limit with only several tens of thousands of documents. The solution employed here was to use cursored aggregation allowing disk use for sort stages. The solution was introduced as an alternative code path if a FTS filter was included in the HTTP request. Although this did minimize the risk of regression there is a fair bit of to clean up in the mongo access layer. closes #96 commit 5d44fd0 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Mon Jan 25 21:48:12 2016 -0500 return annotation/namespace defs in BEL Script removed normalization of experiment_context annotation keywords. The normalized names were in inconsistent with references.annotations definitions. integrate next version of bel.rb (0.4.3) to get fixes for annotation/namespace formats. 
refs #95 commit 92f7e7e Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Mon Jan 25 15:51:14 2016 -0500 require MongoDB 3.2; closes #98 commit 0507714 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Mon Jan 25 14:57:28 2016 -0500 added 0.6.0 mongo migration helper, details follow The clear_evidence_facets_cache.rb mongo migration will clear out new evidence facet cache storage in case searches were built before migrating all documents in the "evidence" collection. commit 7707a92 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Thu Jan 14 14:16:24 2016 -0500 fix /api/datasets/{id}/evidence for facet changes Now facets correctly in light of evidence facet changes and respects "max_values_per_facet". commit 19eedef Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Thu Jan 14 13:10:57 2016 -0500 add scripts for Mongo data migrations in 0.6.0 - Drops evidence_facets since it has been replaced by evidence_facet_cache plus individual "evidence_facet_cache_{UUID}" collections. - Updates each evidence document to have "facets" field contain JSON objects instead of JSON strings. commit 21a7bc4 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Thu Jan 14 13:08:32 2016 -0500 bumped next version to 0.6.0 Minor release looking to include: - New evidence facet storage in mongo. - Improve dataset import for large documents (occasional OOM). - Evidence streaming. - Evidence export to multiple formats. commit bb2ac16 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Wed Jan 13 16:44:47 2016 -0500 facet cache collection creation and removal This design builds individual facet_cache collections based on the filters applied to the evidence collection. Each filtered evidence collection will get it's own "evidence_facet_cache_{UUID}" mongo collection. The facets values are grouped by category, name so it's trivial to cursor out the facets (still need to set the filter string though). 
This alleviates the max document size issue for large evidence collections. A max of 1000 facet values can be added to each category, name pair in order to stay within the size limit. Facet cache eviction isn't great here: - Individual evidence changes require removal of facet caches for the empty filter search as well as any overlapping filter/facet. - Creation or removal of a dataset will remove all facet caches. The thought is that for large dataset imports it is more effective to regenerate than cache vs. trying to synchronize it with new data. This includes a breaking change to evidence document schema. The evidence "facets" array stores the full category, name, value json objects instead of flat strings. This is done to make it possible to separate values into category, name groupings. We should include an upgrade note for this and possibly a script. commit f5a08a3 Merge: f038be2 a515587 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Wed Jan 13 16:42:24 2016 -0500 Merge branch 'master' into next commit f038be2 Author: Anthony Bargnesi <abargnesi@selventa.com> Date: Mon Jan 11 22:58:47 2016 -0500 batch evidence to an array, avoid JRuby enumerator The JRuby enumerator uses a thread per next object in an enumerator which proves costly. Hundreds of threads are created (tested with yourkit) when batch-creating evidence due to the "each_slice(500)" of the enumerator. This issue is logged in JRuby: jruby/jruby#2577 The solution employed was to yield each evidence directly to the block and batch 500 into an array at a time. This should avoid the OOM exception received: ava.lang.OutOfMemoryError: unable to create new native thread Indeed the thread count was observed to be lower in yourkit.
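
The batching approach described in commit f038be2 can be sketched roughly as follows. This is a minimal illustration under assumptions, not the OpenBEL API code: the method name `each_batch`, its parameters, and the sample range are hypothetical. It uses only internal iteration (`each` with a block), so it never calls `Enumerator#next`, the call that appears in the stack trace reported in this issue.

```ruby
# Hypothetical helper (not from the OpenBEL API source): collects items from
# +source+ into arrays of at most +batch_size+ and yields each array.
# Only source.each with a block is used, so no external enumeration
# (Enumerator#next) is involved.
def each_batch(source, batch_size = 500)
  batch = []
  source.each do |item|
    batch << item
    if batch.size == batch_size
      yield batch
      batch = []          # start a fresh array for the next batch
    end
  end
  yield batch unless batch.empty?  # flush the final, partial batch
end

# Usage: record the size of every batch produced from 1200 items.
batch_sizes = []
each_batch(1..1200, 500) { |batch| batch_sizes << batch.size }
p batch_sizes
```

With 1200 items and a batch size of 500, the block receives batches of 500, 500, and 200 items.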

yousuketto mentioned this issue Apr 13, 2015

fail to check CSRF at Rails4.2 #2824

Closed

dzjuck mentioned this issue May 14, 2015

Fix List#permutation spec for JRuby hamstergem/hamster#176

Closed

jmalves mentioned this issue Dec 7, 2017

ThreadedNexter shutdown can in some situations interrupt a thread executing a different task #4887

Closed

headius mentioned this issue Mar 27, 2018

JRuby 9.2 Projects #5119

Closed

15 tasks

