Java multi-threading for text file processing [duplicate]




























This question already has an answer here:




  • How can I pass a parameter to a Java Thread?

    18 answers




I have a Java program that reads each text file in a directory, builds a word index (each word mapped to the pages it appears on), and writes the output for each file into an output directory. I would like to convert it to use multithreading, starting a new thread for each file. I am pretty new to Java and completely new to multithreading in Java. The program is invoked as:
java Index inputFolder outputFolder pageLength



Here is my working code without multi-threading:



import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class Index {
    public static void main(String[] args) {
        long startTime = System.nanoTime();
        float pageLength = Float.parseFloat(args[2]); // characters per page
        try {
            File folder = new File(args[0]);
            File[] files = folder.listFiles();
            for (File file : files) {
                // Strip the extension to build the output file name.
                String name = file.getName();
                int pos = name.lastIndexOf(".");
                if (pos > 0) {
                    name = name.substring(0, pos);
                }
                // Map each word to the comma-separated pages it appears on.
                Map<String, String> wordCount = new TreeMap<String, String>();
                int count = 0; // characters seen so far in this file
                try (Scanner sc = new Scanner(file)) {
                    while (sc.hasNext()) {
                        String word = sc.next().trim().toLowerCase();
                        count += word.length();
                        int pageNumber = (int) Math.ceil(count / pageLength);
                        if (!wordCount.containsKey(word))
                            wordCount.put(word, Integer.toString(pageNumber));
                        else
                            wordCount.put(word, wordCount.get(word) + ", " + pageNumber);
                    }
                }
                // Write the results straight to this file's output stream
                // (no System.setOut redirection) and close it when done.
                try (PrintStream outputFile = new PrintStream(args[1] + "/" + name + "_output.txt")) {
                    for (String word : wordCount.keySet())
                        outputFile.println(word + " " + wordCount.get(word));
                }
            }
        } catch (IOException e) {
            System.out.println("Unable to read from file.");
        }
        long totalTime = System.nanoTime() - startTime;
        System.out.println(totalTime / 1000000); // elapsed time in milliseconds
    }
}


To reiterate, I would like to adapt this so that each file iteration starts a new thread.























Marked as duplicate by Andreas on Nov 19 '18 at 17:38.


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.



















  • So in your research into how to code with multiple threads, you couldn't find a single example of how to start threads or use thread pools? --- idownvotedbecau.se/noresearch

    – Andreas
    Nov 19 '18 at 17:20













  • I did, but they are all too simple (print name of thread, etc... ) for me to figure out how to apply multithreading to the problem at hand. I know I need a class that implements Runnable, and a public void run() for the processing, but I'm stumped how to connect it all together so that it works in this context.

    – Daniel Gizzi
    Nov 19 '18 at 17:28
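
The wiring described in the comment above (a class that implements Runnable, with the processing in run()) could look roughly like the following. This is only a sketch under stated assumptions: the names ThreadedIndex and IndexTask are hypothetical, the indexing logic is copied from the question's loop body, and each task writes to its own PrintStream because System.setOut is global and would be shared by every thread.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class ThreadedIndex {

    // Hypothetical task class: everything one file needs is passed in through
    // the constructor, so the threads share no mutable state.
    static class IndexTask implements Runnable {
        private final File input;
        private final File output;
        private final float pageLength;

        IndexTask(File input, File output, float pageLength) {
            this.input = input;
            this.output = output;
            this.pageLength = pageLength;
        }

        @Override
        public void run() {
            Map<String, String> pages = new TreeMap<>();
            int count = 0; // characters seen so far in this file
            try (Scanner sc = new Scanner(input);
                 PrintStream out = new PrintStream(output)) {
                while (sc.hasNext()) {
                    String word = sc.next().trim().toLowerCase();
                    count += word.length();
                    String page = Integer.toString((int) Math.ceil(count / pageLength));
                    // Append the page to the word's existing list, or start a new one.
                    pages.merge(word, page, (old, p) -> old + ", " + p);
                }
                for (Map.Entry<String, String> entry : pages.entrySet()) {
                    out.println(entry.getKey() + " " + entry.getValue());
                }
            } catch (IOException e) {
                System.err.println("Unable to process " + input + ": " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        File[] files = new File(args[0]).listFiles(File::isFile);
        if (files == null) {
            System.err.println("Not a directory: " + args[0]);
            return;
        }
        float pageLength = Float.parseFloat(args[2]);
        List<Thread> threads = new ArrayList<>();
        for (File file : files) {
            String name = file.getName();
            int pos = name.lastIndexOf('.');
            if (pos > 0) {
                name = name.substring(0, pos);
            }
            File output = new File(args[1], name + "_output.txt");
            Thread t = new Thread(new IndexTask(file, output, pageLength));
            threads.add(t);
            t.start(); // start() runs the task on a new thread; run() alone would not
        }
        for (Thread t : threads) {
            t.join(); // wait for every file to finish before exiting
        }
    }
}
```

It would be run the same way as the original, for example `java ThreadedIndex inputFolder outputFolder pageLength`. One thread per file is what the question asks for; with a very large number of files, a fixed-size thread pool is probably the safer choice.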






































java multithreading
















asked Nov 19 '18 at 17:14 by Daniel Gizzi
















1 Answer
































If you're using Java 8 or later, you could use the Streams API.

.parallelStream() runs the work in parallel on a shared pool of worker threads (by default the common ForkJoinPool), rather than starting one new thread per element.

You'll need a Collection, such as a List, to call parallelStream():

    List<File> files = new ArrayList<>(); // initialization

    // populate the list here

    files.parallelStream()
         .forEach(x -> {
             // logic goes here
         });
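
Applied to the word index from the question, the skeleton above might be filled in as follows. This is a minimal sketch, not the answerer's code: the class name ParallelIndex and the helper methods indexFile and processFile are assumptions made for illustration, and the indexing logic is taken from the question's loop body.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class ParallelIndex {

    // Hypothetical helper: builds the word -> pages index for a single file.
    static Map<String, String> indexFile(File file, float pageLength) throws IOException {
        Map<String, String> pages = new TreeMap<>();
        int count = 0; // characters seen so far in this file
        try (Scanner sc = new Scanner(file)) {
            while (sc.hasNext()) {
                String word = sc.next().trim().toLowerCase();
                count += word.length();
                String page = Integer.toString((int) Math.ceil(count / pageLength));
                pages.merge(word, page, (old, p) -> old + ", " + p);
            }
        }
        return pages;
    }

    // Hypothetical helper: indexes one file and writes <name>_output.txt into outputDir.
    static void processFile(File file, String outputDir, float pageLength) {
        String name = file.getName();
        int pos = name.lastIndexOf('.');
        if (pos > 0) {
            name = name.substring(0, pos);
        }
        // Each call writes to its own stream; nothing global is redirected.
        try (PrintStream out = new PrintStream(new File(outputDir, name + "_output.txt"))) {
            indexFile(file, pageLength).forEach((word, p) -> out.println(word + " " + p));
        } catch (IOException e) {
            // Lambdas passed to forEach cannot throw checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File[] found = new File(args[0]).listFiles(File::isFile);
        if (found == null) {
            System.err.println("Not a directory: " + args[0]);
            return;
        }
        List<File> files = Arrays.asList(found);
        float pageLength = Float.parseFloat(args[2]);
        files.parallelStream().forEach(f -> processFile(f, args[1], pageLength));
    }
}
```

The per-file work shares no state between files, which is what makes it safe to hand to a parallel stream. The original System.setOut approach would not be, since every worker would be redirecting the same global stream.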


Example Repl.it



Documentation about parallelism






























  • thanks, this is much simpler than what I was originally trying to do

    – Daniel Gizzi
    Nov 19 '18 at 18:27




























answered Nov 19 '18 at 17:28 by Cheloide

















