Parse Dataframe and store output in a single file [duplicate]

This question already has an answer here:

  • Spark split a column value into multiple rows (1 answer)
I have a DataFrame, built with Spark SQL in Scala, with columns A and B holding the following values:



A | B
1 | a|b|c
2 | b|d
3 | d|e|f


I need to store the output in a single text file in the following format:



1 a
1 b
1 c
2 b
2 d
3 d
3 e
3 f


How can I do that?










marked as duplicate by user6910411 Nov 10 at 10:56

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
      scala apache-spark apache-spark-sql






edited Nov 10 at 9:41 by SCouto
asked Nov 10 at 8:59 by Nick
2 Answers

Accepted answer (score 2), answered Nov 10 at 9:47 by SCouto
You can get the desired DataFrame with an explode and a split:



import org.apache.spark.sql.functions.{explode, split}

// the pipe must be escaped twice: once for the Scala string, once for the regex
val resultDF = df.withColumn("B", explode(split($"B", "\\|")))


          Result



          +---+---+
          | A| B|
          +---+---+
          | 1| a|
          | 1| b|
          | 1| c|
          | 2| b|
          | 2| d|
          | 3| d|
          | 3| e|
          | 3| f|
          +---+---+


Then you can save it to a single file with coalesce(1):



// map each Row to a space-separated string, otherwise saveAsTextFile
// writes Row.toString output such as [1,a] instead of "1 a"
resultDF.coalesce(1).rdd.map(_.mkString(" ")).saveAsTextFile("desiredPath")





• explode function is not recognized in my code. What dependency do I need to add? – Nick, Nov 10 at 10:17

• this should be enough: import org.apache.spark.sql.functions._ – SCouto, Nov 10 at 10:20
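For readers without a Spark shell at hand, the split-then-explode step can be sketched in plain Python. This is only an illustration of the semantics on the sample data above, not the Spark API:

```python
# Plain-Python sketch of what split on "|" followed by explode does:
# each (A, "x|y|z") row becomes one (A, value) row per pipe-separated value.
rows = [(1, "a|b|c"), (2, "b|d"), (3, "d|e|f")]

exploded = [(a, v) for a, b in rows for v in b.split("|")]

# Joining each pair with a space mirrors the desired single-file output.
lines = ["%d %s" % (a, v) for a, v in exploded]
print("\n".join(lines))
```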


















Answer (score 0), answered Nov 10 at 9:47 by Chitral Verma













You can do something like:



import org.apache.spark.sql.functions.{col, explode, split}

val df = ??? // your input DataFrame
val resDF = df.withColumn("B", explode(split(col("B"), "\\|")))

resDF.coalesce(1).write.option("delimiter", " ").csv("path/to/file")





• explode(split(col : this part of your code is not recognized – Nick, Nov 10 at 10:15

• col comes from org.apache.spark.sql.functions – Chitral Verma, Nov 10 at 11:16

















