Java multi-threading for text file processing [duplicate]












-1
















This question already has an answer here:




  • How can I pass a parameter to a Java Thread?

    18 answers




I have a Java program that iterates over every text file in a directory, builds a word index for each one (each word mapped to the pages it appears on), and writes the index for each file into an output directory. I would like to convert it to use multithreading, starting a new thread for each file. I am fairly new to Java and completely new to multithreading in Java. The program is invoked as:
java Index inputFolder outputFolder pageLength



Here is my working code without multi-threading:



import java.io.File;
import java.io.IOException;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;
import java.io.PrintStream;

public class Index {
    public static void main(String[] args) {
        long startTime = System.nanoTime();
        PrintStream stdout = System.out;
        try {
            File folder = new File(args[0]);
            File[] files = folder.listFiles();
            for (File file : files) {
                // strip the extension to build the output file name
                String name = file.getName();
                int pos = name.lastIndexOf(".");
                if (pos > 0) {
                    name = name.substring(0, pos);
                }
                Scanner sc;
                sc = new Scanner(file);
                Map<String, String> wordCount = new TreeMap<String, String>();
                int count = 0;
                while (sc.hasNext()) {
                    String word = sc.next();
                    word = word.trim().toLowerCase();
                    int len = word.length();
                    count = (int) count + len;
                    // page number = characters read so far / page length, rounded up
                    int pageNumber = (int) Math.ceil(count / Float.valueOf(args[2]));
                    if (!wordCount.containsKey(word))
                        wordCount.put(word, Integer.toString(pageNumber));
                    else
                        wordCount.put(word, wordCount.get(word) + ", " + Integer.toString(pageNumber));
                }

                // show results
                sc.close();
                PrintStream outputFile = new PrintStream(args[1] + "/" + name + "_output.txt");
                System.setOut(outputFile);
                for (String word : wordCount.keySet())
                    System.out.println(word + " " + wordCount.get(word));
            }
        }
        catch (IOException e) {
            System.out.println("Unable to read from file.");
        }
        long endTime = System.nanoTime();
        long totalTime = endTime - startTime;
        System.setOut(stdout);
        System.out.println(totalTime / 1000000);
    }
}


To reiterate, I would like to adapt this so that each file iteration starts a new thread.










java multithreading
















asked Nov 19 '18 at 17:14









Daniel Gizzi




marked as duplicate by Andreas, Nov 19 '18 at 17:38


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
























  • So in your research into how to code with multiple threads, you couldn't find a single example of how to start threads or use thread pools? --- idownvotedbecau.se/noresearch

    – Andreas
    Nov 19 '18 at 17:20













  • I did, but they are all too simple (print name of thread, etc... ) for me to figure out how to apply multithreading to the problem at hand. I know I need a class that implements Runnable, and a public void run() for the processing, but I'm stumped how to connect it all together so that it works in this context.

    – Daniel Gizzi
    Nov 19 '18 at 17:28































1 Answer


















2














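If you're using Java 8 or later, you could use the Streams API.

.parallelStream() processes the elements in parallel. It does not create one thread per element; by default the work is spread over the worker threads of the common ForkJoinPool, whose size is typically tied to the number of available processors.

A List is a convenient way to get a parallel stream, since every Collection provides parallelStream():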



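import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// listFiles() returns null if args[0] is not a readable directory
File[] found = new File(args[0]).listFiles();
List<File> files = found == null ? Collections.<File>emptyList() : Arrays.asList(found);

files.parallelStream()
     .forEach(file -> {
         // per-file logic goes here; it may run on several threads at once,
         // so it should not rely on shared mutable state such as System.setOut
     });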
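
To make that concrete for the indexing program in the question, here is a minimal sketch, not a definitive implementation. The class name ParallelIndex and the helper indexFile are made up for illustration. It assumes pageLength is a whole number, and it writes each index straight to its own file instead of redirecting System.out, because System.setOut is process-wide and would be shared by all worker threads.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class ParallelIndex {

    // Hypothetical helper: builds the word index for one file and writes it
    // to outputDir. All state is local, so calls can run concurrently.
    static void indexFile(File file, File outputDir, int pageLength) throws IOException {
        String name = file.getName();
        int pos = name.lastIndexOf('.');
        if (pos > 0) {
            name = name.substring(0, pos);
        }
        Map<String, String> index = new TreeMap<>();
        int count = 0;
        try (Scanner sc = new Scanner(file)) {
            while (sc.hasNext()) {
                String word = sc.next().toLowerCase();
                count += word.length();
                int page = (int) Math.ceil(count / (double) pageLength);
                // first occurrence stores the page, later ones append ", page"
                index.merge(word, Integer.toString(page), (a, b) -> a + ", " + b);
            }
        }
        // Write to the file directly rather than through System.setOut.
        try (PrintStream out = new PrintStream(new File(outputDir, name + "_output.txt"))) {
            index.forEach((word, pages) -> out.println(word + " " + pages));
        }
    }

    public static void main(String[] args) {
        File[] found = new File(args[0]).listFiles(File::isFile);
        if (found == null) {
            System.err.println("Not a readable directory: " + args[0]);
            return;
        }
        List<File> files = Arrays.asList(found);
        File outputDir = new File(args[1]);
        int pageLength = Integer.parseInt(args[2]); // assumption: integer page length
        files.parallelStream().forEach(file -> {
            try {
                indexFile(file, outputDir, pageLength);
            } catch (IOException e) {
                // forEach cannot throw checked exceptions, so wrap it
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

It is run the same way as the original: java ParallelIndex inputFolder outputFolder pageLength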


Example Repl.it



Documentation about parallelism
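
If you specifically want one thread per file, as the question asks, a rough sketch with plain Thread and Runnable could look like the following. This is an alternative to the stream approach, not part of it. FileTask, processAll and the work callback are hypothetical names. The file is passed to the task through its constructor, and join() makes the caller wait until every thread has finished.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ThreadPerFile {

    // Hypothetical task: receives its file (and the work to do) via the constructor.
    static class FileTask implements Runnable {
        private final File file;
        private final Consumer<File> work;

        FileTask(File file, Consumer<File> work) {
            this.file = file;
            this.work = work;
        }

        @Override
        public void run() {
            work.accept(file); // the per-file indexing would go here
        }
    }

    // Starts one thread per file, then waits for all of them to finish.
    static void processAll(File[] files, Consumer<File> work) throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        for (File file : files) {
            Thread t = new Thread(new FileTask(file, work));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        File[] files = new File(args[0]).listFiles(File::isFile);
        if (files == null) {
            System.err.println("Not a readable directory: " + args[0]);
            return;
        }
        // Placeholder work: print which thread handled which file.
        processAll(files, file ->
                System.out.println(Thread.currentThread().getName() + " -> " + file.getName()));
    }
}
```

For a directory with many files, a fixed-size pool from java.util.concurrent.Executors is usually a better fit than one thread per file.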
















answered Nov 19 '18 at 17:28









Cheloide













  • thanks, this is much simpler than what I was originally trying to do

    – Daniel Gizzi
    Nov 19 '18 at 18:27


















