named entity recognition - Why does the Stanford NER demo convert 'this year' to 2017, whereas my CoreNLP server does not?
I have set up a CoreNLP server, and I am using Stanford NER to extract time periods from sentences.
If I use the online interactive demo at corenlp.run to parse the sentence
'last year something happened.'
it shows 'DATE' and '2016'. However, my own server, which I set up with the latest release of CoreNLP, only shows 'DATE'. What's more, when I use Python requests to query the server's API with the same sentence, the first 2 tokens in the response contain the fields 'timex': {'type': 'DATE', 'tid': 't1', 'altValue': 'THIS P1Y OFFSET P-1Y'}, 'normalizedNER': 'THIS P1Y OFFSET P-1Y'.
If I have to deal with the fact that my output is not the same as the demo's, is there any Stanford NER or TIMEX3 documentation explaining what THIS P1Y OFFSET P-1Y means, or describing the other possible responses I might get in the normalizedNER field?
Here is the entire API response:
[
 {'word': 'last', 'after': ' ', 'originalText': 'last', 'timex': {'type': 'DATE', 'tid': 't1', 'altValue': 'THIS P1Y OFFSET P-1Y'}, 'pos': 'JJ', 'ner': 'DATE', 'lemma': 'last', 'normalizedNER': 'THIS P1Y OFFSET P-1Y', 'before': '', 'index': 1, 'characterOffsetBegin': 0, 'characterOffsetEnd': 4},
 {'word': 'year', 'after': ' ', 'originalText': 'year', 'timex': {'type': 'DATE', 'tid': 't1', 'altValue': 'THIS P1Y OFFSET P-1Y'}, 'pos': 'NN', 'ner': 'DATE', 'lemma': 'year', 'normalizedNER': 'THIS P1Y OFFSET P-1Y', 'before': ' ', 'index': 2, 'characterOffsetBegin': 5, 'characterOffsetEnd': 9},
 {'word': 'something', 'before': ' ', 'originalText': 'something', 'ner': 'O', 'lemma': 'something', 'after': ' ', 'characterOffsetEnd': 19, 'index': 3, 'characterOffsetBegin': 10, 'pos': 'NN'},
 {'word': 'happened', 'before': ' ', 'originalText': 'happened', 'ner': 'O', 'lemma': 'happen', 'after': '', 'characterOffsetEnd': 28, 'index': 4, 'characterOffsetBegin': 20, 'pos': 'VBD'},
 {'word': '.', 'before': '', 'originalText': '.', 'ner': 'O', 'lemma': '.', 'after': '', 'characterOffsetEnd': 29, 'index': 5, 'characterOffsetBegin': 28, 'pos': '.'}
]
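For anyone reading such a response programmatically, here is a minimal Python sketch that groups tokens by their timex id and reports one entry per time expression. It assumes the tokens are already parsed into dicts using CoreNLP's JSON key names (`timex`, `tid`, `altValue`); the `value` key for resolved dates is an assumption about the output, and the helper name `collect_timex` is made up for illustration.

```python
def collect_timex(tokens):
    """Return one summary dict per time expression found in a token list."""
    spans = {}
    for tok in tokens:
        timex = tok.get("timex")
        if timex is None:
            continue  # token is not part of any time expression
        entry = spans.setdefault(
            timex["tid"],
            {
                "words": [],
                "type": timex.get("type"),
                # Assumption: resolved expressions carry 'value',
                # unresolved relative ones carry 'altValue' instead.
                "value": timex.get("value"),
                "altValue": timex.get("altValue"),
            },
        )
        entry["words"].append(tok["word"])
    return [
        {
            "text": " ".join(e["words"]),
            "type": e["type"],
            "value": e["value"],
            "altValue": e["altValue"],
        }
        for e in spans.values()
    ]


# Trimmed-down sample with only the keys the helper reads.
_TIMEX = {"type": "DATE", "tid": "t1", "altValue": "THIS P1Y OFFSET P-1Y"}
sample_tokens = [
    {"word": "last", "ner": "DATE", "timex": dict(_TIMEX)},
    {"word": "year", "ner": "DATE", "timex": dict(_TIMEX)},
    {"word": "something", "ner": "O"},
    {"word": "happened", "ner": "O"},
    {"word": ".", "ner": "O"},
]

timex_spans = collect_timex(sample_tokens)
print(timex_spans)
```

With a response like the one above, the single entry has no `value` and only an `altValue`, which is how the unresolved case shows up.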
Hi, I have added a new feature that lets you tell the pipeline to use the present date as the docdate when running; the missing docdate is the main source of your issue. To get this feature you have to use the latest version of Stanford CoreNLP available on GitHub.
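To illustrate why the docdate matters: THIS P1Y OFFSET P-1Y reads as a relative expression ("this year, shifted by minus one year"), so it can only turn into 2016 once some reference date is known. The following is a toy Python sketch of that idea, not SUTime's actual logic. The function name `resolve_year` is hypothetical, and it only handles the year-level pattern seen in the question (the offset-free form `THIS P1Y` is an assumption).

```python
import re
from datetime import date

# Hypothetical, year-level only: "THIS P1Y" with an optional "OFFSET P<n>Y".
_RELATIVE_YEAR = re.compile(r"^THIS P1Y(?: OFFSET P(-?\d+)Y)?$")


def resolve_year(expr, doc_date=None):
    """Resolve a relative year expression against a document date.

    Without a document date there is nothing to anchor to, so the
    expression is returned unchanged.
    """
    if doc_date is None:
        return expr
    match = _RELATIVE_YEAR.match(expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    offset = int(match.group(1)) if match.group(1) else 0
    return str(doc_date.year + offset)


# No docdate: stays relative, like the server output in the question.
print(resolve_year("THIS P1Y OFFSET P-1Y"))
# Docdate somewhere in 2017: resolves to the previous year.
print(resolve_year("THIS P1Y OFFSET P-1Y", date(2017, 6, 1)))
```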
Also, when you start the server you have to use the -serverProperties option and supply a .properties file with these properties:
annotators = tokenize,ssplit,pos,lemma,ner,entitymentions
ner.usePresentDateForDocDate = true
If you do that, it should work and list 2016.
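As a rough sketch of the setup in Python: the snippet below renders the two properties and assembles a launch command for the server. The file name `server.properties`, the port 9000, the `-mx4g` heap setting and the `"*"` classpath (which assumes you launch from the CoreNLP directory) are all assumptions, so adjust them to your installation. Nothing is written or started until you call the helpers yourself.

```python
from pathlib import Path

SERVER_CLASS = "edu.stanford.nlp.pipeline.StanfordCoreNLPServer"


def properties_text():
    """Render the two properties from the answer as .properties content."""
    return (
        "annotators = tokenize,ssplit,pos,lemma,ner,entitymentions\n"
        "ner.usePresentDateForDocDate = true\n"
    )


def write_properties(path):
    """Write the properties file to the given path (not called automatically)."""
    Path(path).write_text(properties_text(), encoding="utf-8")


def server_command(properties_path, port=9000):
    """Build the argument list for launching the server with the file."""
    return [
        "java", "-mx4g",          # heap size is an assumption
        "-cp", "*",               # assumes the CoreNLP jars are in the working directory
        SERVER_CLASS,
        "-port", str(port),
        "-serverProperties", str(properties_path),
    ]


print(properties_text())
print(" ".join(server_command("server.properties")))
```

The list returned by `server_command` can be passed to `subprocess.Popen` once the file has been written with `write_properties("server.properties")`.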