How to recognize named entities from a Python list using the Stanford NERTagger

I am a beginner in NLP and am using StanfordNERTagger for the first time, for learning purposes. I have a Python list of country names:

['France', 'India', 'Bangladesh', 'England', 'Germany', 'Brazil', 'Egypt', 'Bhutan', 'Srilanka']

I want the tagger to return the 'LOCATION' entity for each name, but I am getting the 'ORGANIZATION' entity instead:

[('France', 'ORGANIZATION'),
('India', 'ORGANIZATION'),
('Bangladesh', 'ORGANIZATION'),
('England', 'ORGANIZATION'),
('Germany', 'ORGANIZATION'),
('Brazil', 'ORGANIZATION'),
('Egypt', 'ORGANIZATION'),
('Bhutan', 'ORGANIZATION'),
('Srilanka', 'ORGANIZATION')]

Maybe I am missing something here.

asked Nov 9 at 5:21

Kalyan

46221030

It's very difficult for an NER tagger to work with single words since the tagging is context-dependent. If you give it full sentences containing a country name, I would expect it to work much better.
– Proyag
Nov 9 at 10:32

You are asking for entities in a list of countries. NER tags entities in tokenized sentences.
– Nathan McCoy
Nov 9 at 15:21
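
To illustrate the point made in the comments, here is a minimal sketch that places each name in a carrier sentence before tagging, so the tagger sees some context. The function name `tag_names_in_context` and the template sentence are hypothetical, chosen only for illustration. The `tagger` argument is assumed to be any object with a `tag_sents` method that takes a list of token lists, as NLTK taggers provide. Whether a given model then returns 'LOCATION' for every name still depends on the model and the sentence.

```python
def tag_names_in_context(tagger, names, template="I visited {} last year ."):
    """Tag each name inside a carrier sentence and return (name, tag) pairs.

    The template is a hypothetical carrier sentence; "{}" marks the slot
    where the name is placed. Tokens are separated by whitespace.
    """
    parts = template.split()
    slot = parts.index("{}")  # position of the name within the sentence

    # Build one token list per name; multi-word names occupy several tokens.
    sentences = [parts[:slot] + name.split() + parts[slot + 1:] for name in names]

    # Tag all sentences in one call.
    tagged_sentences = tagger.tag_sents(sentences)

    # Report the tag assigned to the first token of each name.
    return [(name, tagged[slot][1]) for name, tagged in zip(names, tagged_sentences)]
```

With a configured tagger `st`, this could be called as `tag_names_in_context(st, ['France', 'India', 'Bangladesh'])`.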

nlp nltk stanford-nlp

1 Answer

First you need to install Stanford NER on your computer. The procedure depends on your OS; see how to configure the Stanford NER tagger for both procedures.

Now take a look at this sample code

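import os

import nltk
from nltk.tokenize.toktok import ToktokTokenizer
from nltk.tag import StanfordNERTagger

# Use the first ':'-separated entry of STANFORD_MODELS as the classifier model
stanford_classifier = os.environ.get('STANFORD_MODELS').split(':')[0]

# Use the first ':'-separated entry of CLASSPATH as the Stanford NER jar
stanford_ner_path = os.environ.get('CLASSPATH').split(':')[0]

st = StanfordNERTagger(stanford_classifier, stanford_ner_path, encoding='utf-8')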

Check st

<nltk.tag.stanford.StanfordNERTagger at 0x7f897c44e6d8>

My sentence

sentence = u'France is the biggest county in EU'

words = nltk.word_tokenize(sentence)

st.tag(words)

Result

[('France', 'LOCATION'),
 ('is', 'O'),
 ('the', 'O'),
 ('biggest', 'O'),
 ('county', 'O'),
 ('in', 'O'),
 ('EU', 'LOCATION')]
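
To pull only the locations out of a result like the one above, one option is to group consecutive tokens that share the wanted tag. This is a sketch rather than part of the original answer. The helper name `extract_entities` is hypothetical, and it assumes the tagger output is a list of `(token, tag)` pairs with plain tags such as 'LOCATION' and 'O'.

```python
from itertools import groupby


def extract_entities(tagged_tokens, wanted="LOCATION"):
    """Return the entities with the wanted tag, joining adjacent tokens."""
    entities = []
    # groupby yields one group per run of consecutive pairs with the same tag.
    for tag, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag == wanted:
            entities.append(" ".join(token for token, _ in group))
    return entities


tagged = [('France', 'LOCATION'), ('is', 'O'), ('the', 'O'), ('biggest', 'O'),
          ('county', 'O'), ('in', 'O'), ('EU', 'LOCATION')]
print(extract_entities(tagged))  # ['France', 'EU']
```

Because the tags carry no begin/inside markers, two different entities of the same type that stand directly next to each other would be merged into one string by this approach.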

answered Nov 9 at 10:40

Richard Rublev

3,00841932
