It appears that in some cases, Java class proxies can be made available before they're fully bound. We are using the Stanford NLP libraries to perform entity extraction on text, with the process:
Create an Annotator
Pass text to the Annotator, which produces an Annotation instance
Call .get(TokensAnnotation) to get a List<CoreLabel>
Iterate the List<CoreLabel> and for each CoreLabel call #original_text
    class EntityExtractor < Component::Service
      java_import "edu.stanford.nlp.ling.CoreAnnotations$TokensAnnotation"
      include_package "edu.stanford.nlp.pipeline"
      java_import "java.util.Properties"
      java_import "intoxicant.analytics.coreNlp.StopwordAnnotator"

      # Binds an RMQ consumer that receives jobs
      process do |msg|
        if msg["data"]["type"] == "html"
          page = Page.id_or_url(msg["id"], msg["url"])
          process msg, page if page
        end
      end

      @semaphore = Mutex.new

      def self.pipeline
        @semaphore.synchronize do
          @pipeline ||= begin
            props = Properties.new.tap do |p|
              p.set_property "annotators", "tokenize, ssplit, stopword, pos, lemma, ner"
              p.set_property "ner.applyNumericClassifiers", "false"
              p.set_property "ner.useSUTime", "false"
              p.set_property "parse.maxlen", "80"
              # See https://mailman.stanford.edu/pipermail/java-nlp-user/2014-November/006483.html
              p.set_property "customAnnotatorClass.stopword", "intoxicant.analytics.coreNlp.StopwordAnnotator"
            end
            StanfordCoreNLP.new props
          end
        end
      end

      def process(msg, page)
        entities = extract_entities annotate(msg["data"]["article"])
      end

      def annotate(article)
        Annotation.new(article).tap { |doc| self.class.pipeline.annotate(doc) }
      end

      def extract_entities(document)
        document.get(TokensAnnotation).map { |e| [e.original_text, e.ner] } # <-- undefined method `original_text' for #<Java::EduStanfordNlpLing::CoreLabel:0x412ce5ad>
        # ...
      end
    end
However, once per app restart (across several dozen instances of the application) 1-2 instances produce an error like:
undefined method `original_text' for #<Java::EduStanfordNlpLing::CoreLabel:0x412ce5ad>
Future invocations work just fine.
Per IRC discussion, this sounds like an error with the proxy class being made available before it's fully bound.
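Since later invocations succeed, one possible stopgap while the root cause is open is to retry the call once or twice. The sketch below is only that: `with_proxy_retry` and `FlakyLabel` are hypothetical names, not part of the service above or of JRuby. The stand-in class simulates the first-call failure so the snippet runs without CoreNLP.

```ruby
# Hypothetical stopgap: re-run the block when NoMethodError is raised, on the
# assumption that the proxy finishes binding shortly afterwards.
def with_proxy_retry(attempts: 3, delay: 0.05)
  tries = 0
  begin
    yield
  rescue NoMethodError
    tries += 1
    raise if tries >= attempts # give up and surface the original error
    sleep delay
    retry
  end
end

# Stand-in for a CoreLabel whose first call fails, as in the report.
class FlakyLabel
  def initialize
    @calls = 0
  end

  def original_text
    @calls += 1
    raise NoMethodError, "undefined method `original_text'" if @calls == 1
    "Stanford"
  end
end

label = FlakyLabel.new
text = with_proxy_retry { label.original_text }
puts text
```

Note that a blanket retry also delays genuine `NoMethodError` bugs, so this would only make sense around the specific call that fails.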
These are tricky to fix. A reproduction helps, but it seems we won't be able to get one in this case ;(
... I assume all instances in the List<CoreLabel> are actually of class CoreLabel (no subclasses)?
... also, for the record, which JRuby version are you using? If it isn't the latest 1.7.x (1.7.21), please try that.
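One way to answer the subclass question would be to log, at the point of failure, which runtime classes are in the list and whether their instances respond to the method. A minimal sketch follows; `proxy_report` is a hypothetical helper, and the `Label`/`Other` structs are plain-Ruby stand-ins so it runs without JRuby or CoreNLP.

```ruby
# Hypothetical diagnostic helper: groups elements by runtime class and records
# whether every instance of that class responds to the expected method.
def proxy_report(items, method_name)
  items.group_by(&:class).map do |klass, members|
    {
      class_name: klass.name,
      count: members.size,
      responds: members.all? { |m| m.respond_to?(method_name) }
    }
  end
end

# Stand-ins for the real token objects.
Label = Struct.new(:original_text)
Other = Struct.new(:word)

report = proxy_report([Label.new("a"), Label.new("b"), Other.new("c")], :original_text)
report.each { |row| puts row.inspect }
```

In the real service this could be called from a `rescue NoMethodError` around the `map` in `extract_entities`, passing the token list and `:original_text`.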