using data.table to replace multiple columns on single condition












1














I want to change the default value (which is 255) to NA.



dt <- data.table(x = c(1,5,255,0,NA), y = c(1,7,255,0,0), z = c(4,2,7,8,255))
coords <- c('x', 'y')


Which gives the following code:



     x   y   z
1: 1 1 4
2: 5 7 2
3: 255 255 7
4: 0 0 8
5: NA 0 255


I the furthest I came up with is this:



dt[.SD == 255, (.SD) := NA, .SDcols = coords]


Please note that column z stays the same. So just the columns which are specified and not all columns.



But that doesn't help me to get the sollution:



     x   y   z
1: 1 1 4
2: 5 7 2
3: NA NA 7
4: 0 0 8
5: NA 0 255


I am looking for a sustainable solution because the original dataset is a couple of million rows.



EDIT:



I have found a solution but it is quite ugly and is definately too slow as it takes almost 10 seconds to get through a dataframe of 22009 x 86. Does anyone have a better solution?



The code:



dt[, replace(.SD, .SD == 255, NA), .SDcols = coords, by = c(colnames(dt)[!colnames(dt) %in% coords])]










share|improve this question
























  • You can try dt[, replace(.SD, .SD == 255, NA)]
    – Sotos
    Nov 13 at 13:46












  • Thank you for your reply, Sotos. I edited my post. I am looking for a solution that is easily upscalable when the amount of rows heavily increase. I am not sure if the function replace is that friendly.
    – Tunder250
    Nov 13 at 13:51










  • couple of million rows is not very big. replace will do just fine
    – Sotos
    Nov 13 at 13:55










  • okay, thank you. But it doesn't include the other columns.
    – Tunder250
    Nov 13 at 13:59






  • 2




    You can do this when you read in the table: fread("path/to/file", na.strings=c("NA", "255"))
    – Scott Ritchie
    Nov 13 at 14:02
















1














I want to change the default value (which is 255) to NA.



dt <- data.table(x = c(1,5,255,0,NA), y = c(1,7,255,0,0), z = c(4,2,7,8,255))
coords <- c('x', 'y')


Which gives the following code:



     x   y   z
1: 1 1 4
2: 5 7 2
3: 255 255 7
4: 0 0 8
5: NA 0 255


I the furthest I came up with is this:



dt[.SD == 255, (.SD) := NA, .SDcols = coords]


Please note that column z stays the same. So just the columns which are specified and not all columns.



But that doesn't help me to get the sollution:



     x   y   z
1: 1 1 4
2: 5 7 2
3: NA NA 7
4: 0 0 8
5: NA 0 255


I am looking for a sustainable solution because the original dataset is a couple of million rows.



EDIT:



I have found a solution but it is quite ugly and is definately too slow as it takes almost 10 seconds to get through a dataframe of 22009 x 86. Does anyone have a better solution?



The code:



dt[, replace(.SD, .SD == 255, NA), .SDcols = coords, by = c(colnames(dt)[!colnames(dt) %in% coords])]










share|improve this question
























  • You can try dt[, replace(.SD, .SD == 255, NA)]
    – Sotos
    Nov 13 at 13:46












  • Thank you for your reply, Sotos. I edited my post. I am looking for a solution that is easily upscalable when the amount of rows heavily increase. I am not sure if the function replace is that friendly.
    – Tunder250
    Nov 13 at 13:51










  • couple of million rows is not very big. replace will do just fine
    – Sotos
    Nov 13 at 13:55










  • okay, thank you. But it doesn't include the other columns.
    – Tunder250
    Nov 13 at 13:59






  • 2




    You can do this when you read in the table: fread("path/to/file", na.strings=c("NA", "255"))
    – Scott Ritchie
    Nov 13 at 14:02














1












1








1







I want to change the default value (which is 255) to NA.



dt <- data.table(x = c(1,5,255,0,NA), y = c(1,7,255,0,0), z = c(4,2,7,8,255))
coords <- c('x', 'y')


Which gives the following code:



     x   y   z
1: 1 1 4
2: 5 7 2
3: 255 255 7
4: 0 0 8
5: NA 0 255


I the furthest I came up with is this:



dt[.SD == 255, (.SD) := NA, .SDcols = coords]


Please note that column z stays the same. So just the columns which are specified and not all columns.



But that doesn't help me to get the sollution:



     x   y   z
1: 1 1 4
2: 5 7 2
3: NA NA 7
4: 0 0 8
5: NA 0 255


I am looking for a sustainable solution because the original dataset is a couple of million rows.



EDIT:



I have found a solution but it is quite ugly and is definately too slow as it takes almost 10 seconds to get through a dataframe of 22009 x 86. Does anyone have a better solution?



The code:



dt[, replace(.SD, .SD == 255, NA), .SDcols = coords, by = c(colnames(dt)[!colnames(dt) %in% coords])]










share|improve this question















I want to change the default value (which is 255) to NA.



dt <- data.table(x = c(1,5,255,0,NA), y = c(1,7,255,0,0), z = c(4,2,7,8,255))
coords <- c('x', 'y')


Which gives the following code:



     x   y   z
1: 1 1 4
2: 5 7 2
3: 255 255 7
4: 0 0 8
5: NA 0 255


I the furthest I came up with is this:



dt[.SD == 255, (.SD) := NA, .SDcols = coords]


Please note that column z stays the same. So just the columns which are specified and not all columns.



But that doesn't help me to get the sollution:



     x   y   z
1: 1 1 4
2: 5 7 2
3: NA NA 7
4: 0 0 8
5: NA 0 255


I am looking for a sustainable solution because the original dataset is a couple of million rows.



EDIT:



I have found a solution but it is quite ugly and is definately too slow as it takes almost 10 seconds to get through a dataframe of 22009 x 86. Does anyone have a better solution?



The code:



dt[, replace(.SD, .SD == 255, NA), .SDcols = coords, by = c(colnames(dt)[!colnames(dt) %in% coords])]







r data.table






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 13 at 14:29

























asked Nov 13 at 13:44









Tunder250

1618




1618












  • You can try dt[, replace(.SD, .SD == 255, NA)]
    – Sotos
    Nov 13 at 13:46












  • Thank you for your reply, Sotos. I edited my post. I am looking for a solution that is easily upscalable when the amount of rows heavily increase. I am not sure if the function replace is that friendly.
    – Tunder250
    Nov 13 at 13:51










  • couple of million rows is not very big. replace will do just fine
    – Sotos
    Nov 13 at 13:55










  • okay, thank you. But it doesn't include the other columns.
    – Tunder250
    Nov 13 at 13:59






  • 2




    You can do this when you read in the table: fread("path/to/file", na.strings=c("NA", "255"))
    – Scott Ritchie
    Nov 13 at 14:02


















  • You can try dt[, replace(.SD, .SD == 255, NA)]
    – Sotos
    Nov 13 at 13:46












  • Thank you for your reply, Sotos. I edited my post. I am looking for a solution that is easily upscalable when the amount of rows heavily increase. I am not sure if the function replace is that friendly.
    – Tunder250
    Nov 13 at 13:51










  • couple of million rows is not very big. replace will do just fine
    – Sotos
    Nov 13 at 13:55










  • okay, thank you. But it doesn't include the other columns.
    – Tunder250
    Nov 13 at 13:59






  • 2




    You can do this when you read in the table: fread("path/to/file", na.strings=c("NA", "255"))
    – Scott Ritchie
    Nov 13 at 14:02
















You can try dt[, replace(.SD, .SD == 255, NA)]
– Sotos
Nov 13 at 13:46






You can try dt[, replace(.SD, .SD == 255, NA)]
– Sotos
Nov 13 at 13:46














Thank you for your reply, Sotos. I edited my post. I am looking for a solution that is easily upscalable when the amount of rows heavily increase. I am not sure if the function replace is that friendly.
– Tunder250
Nov 13 at 13:51




Thank you for your reply, Sotos. I edited my post. I am looking for a solution that is easily upscalable when the amount of rows heavily increase. I am not sure if the function replace is that friendly.
– Tunder250
Nov 13 at 13:51












couple of million rows is not very big. replace will do just fine
– Sotos
Nov 13 at 13:55




couple of million rows is not very big. replace will do just fine
– Sotos
Nov 13 at 13:55












okay, thank you. But it doesn't include the other columns.
– Tunder250
Nov 13 at 13:59




okay, thank you. But it doesn't include the other columns.
– Tunder250
Nov 13 at 13:59




2




2




You can do this when you read in the table: fread("path/to/file", na.strings=c("NA", "255"))
– Scott Ritchie
Nov 13 at 14:02




You can do this when you read in the table: fread("path/to/file", na.strings=c("NA", "255"))
– Scott Ritchie
Nov 13 at 14:02












1 Answer
1






active

oldest

votes


















2














Here is how you can keep the columns outside .SDcols,



library(data.table)
dt[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]


which gives,




    x  y   z
1: 1 1 4
2: 5 7 2
3: NA NA 7
4: 0 0 8
5: NA 0 255






share|improve this answer





















    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282404%2fusing-data-table-to-replace-multiple-columns-on-single-condition%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    2














    Here is how you can keep the columns outside .SDcols,



    library(data.table)
    dt[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]


    which gives,




        x  y   z
    1: 1 1 4
    2: 5 7 2
    3: NA NA 7
    4: 0 0 8
    5: NA 0 255






    share|improve this answer


























      2














      Here is how you can keep the columns outside .SDcols,



      library(data.table)
      dt[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]


      which gives,




          x  y   z
      1: 1 1 4
      2: 5 7 2
      3: NA NA 7
      4: 0 0 8
      5: NA 0 255






      share|improve this answer
























        2












        2








        2






        Here is how you can keep the columns outside .SDcols,



        library(data.table)
        dt[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]


        which gives,




            x  y   z
        1: 1 1 4
        2: 5 7 2
        3: NA NA 7
        4: 0 0 8
        5: NA 0 255






        share|improve this answer












        Here is how you can keep the columns outside .SDcols,



        library(data.table)
        dt[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]


        which gives,




            x  y   z
        1: 1 1 4
        2: 5 7 2
        3: NA NA 7
        4: 0 0 8
        5: NA 0 255







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 13 at 14:49









        Sotos

        28k51640




        28k51640






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282404%2fusing-data-table-to-replace-multiple-columns-on-single-condition%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Guess what letter conforming each word

            Run scheduled task as local user group (not BUILTIN)

            Port of Spain