Run Template on the Cloud Dataflow service

I developed a template locally and am trying to run it on Google Cloud Dataflow.

The problem is that when I run it in Google Cloud Shell with:

python -m dataflow.py --project poc-cloud-209212 --temp_location gs://<...>


I get this error:

/usr/bin/python: No module named apache_beam

So I tried a simpler example, wordcount.

Following Google's instructions, I ran:



python -m wordcount.py --input gs://dataflow-samples/shakespeare/kinglear.txt --output gs://<...> --runner DataflowRunner --project <project> --temp_location gs://<...>


And I got this error:

/usr/bin/python: No module named past.builtins

If I run it without .py:

python -m wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt --output gs://<...> --runner DataflowRunner --project <project> --temp_location gs://<...>


I get the same error again, but with more information:



Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/<...>/wordcount.py", line 26, in <module>
    from past.builtins import unicode
ImportError: No module named past.builtins


What is happening? How can I run these templates on Google Cloud Dataflow?

Do I need to set up the environment in Google Cloud as I did locally, or is that done by default?

  • I think you should specify the setup.py file in the root directory. Then pass it using --setup_file=/path/to/setup.py. The problem is that the workers don't see the imports. You can import locally within functions.
    – GRS
    Nov 8 at 10:35
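
The second suggestion in this comment, importing inside the function body, can be illustrated without Beam. This is only a sketch of the pattern: `parse_record` is a hypothetical helper, and `json` stands in for whatever module the workers fail to find. In a real pipeline such a function might be passed to a transform like `beam.Map`; that part is left out here so the snippet runs with the standard library alone.

```python
def parse_record(line):
    """Parse one JSON-encoded line into a Python object."""
    # The import lives inside the function body, so it is executed
    # wherever the function itself is executed (for example on a worker),
    # not only in the module that launched the job.
    import json
    return json.loads(line)


if __name__ == "__main__":
    # Hypothetical sample input, used only to exercise the function.
    print(parse_record('{"word": "king", "count": 3}'))
```

Note that a local import only changes where the import statement runs. The package still has to be installed on the workers, which is what the `--setup_file` suggestion addresses.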

python google-cloud-dataflow apache-beam

edited Nov 8 at 10:32
asked Nov 8 at 10:27
IoT user

1 Answer

accepted

I finally got it working.

This is how:

Create a virtualenv with Python 2.7 in Google Cloud Shell (Python 3.5 was installed by default, and Dataflow cannot use Python 3):

virtualenv env --python=python2

After activating this virtualenv, you can run the pipeline inside it.
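
As a quick way to confirm that the activated virtualenv is the interpreter actually being used, and that the modules named in the errors can be imported from it, something like the following may help. It is a sketch rather than part of the original fix: `missing_modules` is a hypothetical helper, and it assumes the required packages (for instance `apache-beam[gcp]`, which provides `apache_beam`) have already been installed into the virtualenv with pip.

```python
from __future__ import print_function  # keeps print() consistent on Python 2.7

import importlib
import sys


def missing_modules(names):
    """Return the module names that this interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing top-level package and a missing submodule.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Inside an activated virtualenv this path should point into env/bin.
    print("Interpreter:", sys.executable)
    print("Missing modules:", missing_modules(["apache_beam", "past.builtins"]))
```

If the list is empty and the interpreter path points into the virtualenv, the `No module named ...` errors from the question should no longer appear when the pipeline is launched from the same shell.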

edited 2 days ago
answered Nov 8 at 11:23
IoT user