How to recognize Named Entity from a python list using Stanford NERTagger























I am a beginner in NLP and am using StanfordNERTagger for the first time, for learning purposes. I have a Python list of country names:



['France', 'India', 'Bangladesh', 'England', 'Germany', 'Brazil', 'Egypt', 'Bhutan', 'Srilanka']


I want to get the 'LOCATION' entity from the NER tagger, but I am getting the 'ORGANIZATION' entity instead:



[('France', 'ORGANIZATION'),
('India', 'ORGANIZATION'),
('Bangladesh', 'ORGANIZATION'),
('England', 'ORGANIZATION'),
('Germany', 'ORGANIZATION'),
('Brazil', 'ORGANIZATION'),
('Egypt', 'ORGANIZATION'),
('Bhutan', 'ORGANIZATION'),
('Srilanka', 'ORGANIZATION')]



Maybe I am missing something here.
































  • It's very difficult for an NER tagger to work with single words since the tagging is context-dependent. If you give it full sentences containing a country name, I would expect it to work much better.
    – Proyag
    Nov 9 at 10:32










  • You are asking for entities in a list of countries. NER tags entities in tokenized sentences.
    – Nathan McCoy
    Nov 9 at 15:21















nlp nltk stanford-nlp
















asked Nov 9 at 5:21









Kalyan













1 Answer






























First, you need to install Stanford NER on your computer. The setup depends on your OS; both procedures are described in "how to configure Stanford ner tagger".



Now take a look at this sample code:



import os

import nltk
from nltk.tag import StanfordNERTagger

# Take the first entry of each path-list variable; here they are expected to
# point to the classifier model file and the Stanford NER jar.
stanford_classifier = os.environ.get('STANFORD_MODELS').split(os.pathsep)[0]
stanford_ner_path = os.environ.get('CLASSPATH').split(os.pathsep)[0]
st = StanfordNERTagger(stanford_classifier, stanford_ner_path, encoding='utf-8')
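The snippet above reads STANFORD_MODELS and CLASSPATH with os.environ.get, which returns None when a variable is unset, so the .split call then fails with a rather unhelpful AttributeError. A minimal sketch of a more defensive lookup could look like this; the helper name first_path_entry and its environ parameter are hypothetical and only serve the illustration:

```python
import os


def first_path_entry(var_name, environ=os.environ):
    """Return the first entry of a path-list environment variable.

    `environ` defaults to os.environ; any dict-like mapping can be passed
    instead, which makes the helper easy to try out without touching the
    real environment.
    """
    value = environ.get(var_name)
    if not value:
        # Fail early with a message that names the missing variable.
        raise EnvironmentError(
            "{} is not set; point it at your Stanford NER files".format(var_name)
        )
    # os.pathsep is ':' on Linux/macOS and ';' on Windows.
    return value.split(os.pathsep)[0]
```

With this in place, the two lookups would become first_path_entry('STANFORD_MODELS') and first_path_entry('CLASSPATH'), assuming the variables are set up as described above.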


Check st:



<nltk.tag.stanford.StanfordNERTagger at 0x7f897c44e6d8>


My sentence:



sentence = u'France is the biggest county in EU'
words = nltk.word_tokenize(sentence)
st.tag(words)


Result



[('France', 'LOCATION'),
('is', 'O'),
('the', 'O'),
('biggest', 'O'),
('county', 'O'),
('in', 'O'),
('EU', 'LOCATION')]
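To apply the same idea to the list from the question, one option is to wrap each name in a short carrier sentence so the tagger has some context, and then read back the label assigned to the name. The following is only a sketch under assumptions: the function name tag_names_in_context and the carrier sentence "{} is a country ." are made up for illustration, each name is assumed to be a single token, and the labels you actually get depend on the model, so this is a heuristic rather than a guarantee.

```python
def tag_names_in_context(names, tag_sents, template="{} is a country ."):
    """Tag each name inside a carrier sentence; return (name, label) pairs.

    `tag_sents` is any callable that takes a list of token lists and returns
    one list of (token, label) pairs per sentence, for example the
    `tag_sents` method of NLTK's StanfordNERTagger.
    """
    # Build one pre-tokenized carrier sentence per name.
    sentences = [template.format(name).split() for name in names]
    tagged_sentences = tag_sents(sentences)

    results = []
    for name, tagged in zip(names, tagged_sentences):
        # Pick the label of the token equal to the name; fall back to 'O'
        # (no entity) if it is not found, e.g. for multi-word names.
        label = next((lab for tok, lab in tagged if tok == name), "O")
        results.append((name, label))
    return results


# Usage with the tagger built above (requires Java and the Stanford NER files):
# countries = ['France', 'India', 'Bangladesh']
# print(tag_names_in_context(countries, st.tag_sents))
```

Passing all sentences to tag_sents in one call also avoids starting the Java process once per name, which st.tag in a loop would do.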





        answered Nov 9 at 10:40









        Richard Rublev
