Java multi-threading for text file processing [duplicate]

This question already has an answer here:

  • How can I pass a parameter to a Java Thread? (18 answers)

I have a Java program that iterates over each text file in a directory, builds a word index (each word mapped to the pages it appears on), and writes the output for each file into an output directory. I would like to convert it to use multithreading, starting a new thread for each file. I am pretty new to Java and completely new to multithreading in Java. The program is invoked as:
java Index inputFolder outputFolder pageLength



Here is my working code without multi-threading:



import java.io.File;
import java.io.IOException;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;
import java.io.PrintStream;

public class Index {
    public static void main(String[] args) {
        long startTime = System.nanoTime();
        PrintStream stdout = System.out;
        try {
            File folder = new File(args[0]);
            File[] files = folder.listFiles();
            for (File file : files) {
                String name = file.getName();
                int pos = name.lastIndexOf(".");
                if (pos > 0) {
                    name = name.substring(0, pos);
                }
                Scanner sc;
                sc = new Scanner(file);
                Map<String, String> wordCount = new TreeMap<String, String>();
                int count = 0;
                while (sc.hasNext()) {
                    String word = sc.next();
                    word = word.trim().toLowerCase();
                    int len = word.length();
                    count = (int) count + len;
                    int pageNumber = (int) Math.ceil(count / Float.valueOf(args[2]));
                    if (!wordCount.containsKey(word))
                        wordCount.put(word, Integer.toString(pageNumber));
                    else
                        wordCount.put(word, wordCount.get(word) + ", " + Integer.toString(pageNumber));
                }

                // show results
                sc.close();
                PrintStream outputFile = new PrintStream(args[1] + "/" + name + "_output.txt");
                System.setOut(outputFile);
                for (String word : wordCount.keySet())
                    System.out.println(word + " " + wordCount.get(word));
            }
        }
        catch (IOException e) {
            System.out.println("Unable to read from file.");
        }
        long endTime = System.nanoTime();
        long totalTime = endTime - startTime;
        System.setOut(stdout);
        System.out.println(totalTime / 1000000);
    }
}


To reiterate, I would like to adapt this so that each file iteration starts a new thread.

java multithreading

asked Nov 19 '18 at 17:14 by Daniel Gizzi

marked as duplicate by Andreas, Nov 19 '18 at 17:38

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

  • So in your research into how to code with multiple threads, you couldn't find a single example of how to start threads or use thread pools? --- idownvotedbecau.se/noresearch

    – Andreas
    Nov 19 '18 at 17:20

  • I did, but they are all too simple (printing the name of the thread, etc.) for me to figure out how to apply multithreading to the problem at hand. I know I need a class that implements Runnable, and a public void run() for the processing, but I'm stumped on how to connect it all together so that it works in this context.

    – Daniel Gizzi
    Nov 19 '18 at 17:28

1 Answer
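If you're using Java 8 or later, you could use the Streams API.

.parallelStream() runs the tasks in parallel on a shared pool of worker threads (by default the common ForkJoinPool), rather than creating one dedicated thread per task.

You'll need a Collection such as a List to call parallelStream(); an array like the one returned by listFiles() can be wrapped with Arrays.asList(...).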



List<File> files = new ArrayList<>(); // initialization

// populate list here

files.parallelStream()
     .forEach(file -> {
         // logic goes here
     });
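
To make the "logic goes here" part concrete, here is a minimal sketch of how the question's indexing loop might be moved into that lambda. The class and method names (ParallelIndex, buildIndex, indexFile) are hypothetical, and the sketch assumes an integer page length and the platform default charset. Each task writes to its own PrintStream instead of calling System.setOut, because System.setOut changes the output stream for the whole process and would be shared by all worker threads.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name; a sketch, not the original poster's final code.
class ParallelIndex {

    // Builds the word -> "page, page, ..." index with the question's page rule:
    // the page is the running character count divided by pageLength, rounded up.
    static Map<String, String> buildIndex(Scanner sc, int pageLength) {
        Map<String, String> index = new TreeMap<>();
        int count = 0;
        while (sc.hasNext()) {
            String word = sc.next().toLowerCase();
            count += word.length();
            String page = Integer.toString((int) Math.ceil(count / (double) pageLength));
            // Append the page to the existing entry, or create the entry.
            index.merge(word, page, (old, p) -> old + ", " + p);
        }
        return index;
    }

    // Indexes one file and writes <name>_output.txt into outDir.
    static void indexFile(Path input, Path outDir, int pageLength) {
        String name = input.getFileName().toString();
        int pos = name.lastIndexOf('.');
        if (pos > 0) {
            name = name.substring(0, pos);
        }
        Path output = outDir.resolve(name + "_output.txt");
        // Each task owns its Scanner and PrintStream, so no state is shared.
        try (Scanner sc = new Scanner(input);
             PrintStream out = new PrintStream(output.toFile())) {
            buildIndex(sc, pageLength).forEach((word, pages) -> out.println(word + " " + pages));
        } catch (IOException e) {
            // Lambdas passed to forEach cannot throw checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path inDir = Paths.get(args[0]);
        Path outDir = Files.createDirectories(Paths.get(args[1]));
        int pageLength = Integer.parseInt(args[2]);
        List<Path> files;
        try (Stream<Path> listing = Files.list(inDir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        // Files are processed on the stream's worker threads.
        files.parallelStream().forEach(f -> indexFile(f, outDir, pageLength));
    }
}
```

It would be run the same way as the original program, for example `java ParallelIndex inputFolder outputFolder 100`.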


Example Repl.it



Documentation about parallelism
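
If you specifically want one thread per file, as the question asks, the explicit version is not much longer. The following is only a sketch: ThreadPerFile, FileTask, and processAll are hypothetical names, and the per-file work is passed in as a Consumer so that the threading part stays separate from the indexing logic. The file reaches the thread through the Runnable's constructor, and the main thread calls join() on every thread so that it only continues (for example to print the total time) once all files are done.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical class name; illustrates one thread per file with a Runnable.
class ThreadPerFile {

    // A Runnable that receives the file it should process via its constructor.
    static class FileTask implements Runnable {
        private final File file;
        private final Consumer<File> work;

        FileTask(File file, Consumer<File> work) {
            this.file = file;
            this.work = work;
        }

        @Override
        public void run() {
            work.accept(file);
        }
    }

    // Starts one thread per file, then waits for all of them to finish.
    static void processAll(List<File> files, Consumer<File> work) {
        List<Thread> threads = new ArrayList<>();
        for (File file : files) {
            Thread t = new Thread(new FileTask(file, work), "index-" + file.getName());
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the failure to the caller.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
    }

    public static void main(String[] args) {
        File[] found = new File(args[0]).listFiles(File::isFile);
        List<File> files = new ArrayList<>();
        if (found != null) {
            files.addAll(Arrays.asList(found));
        }
        // Placeholder work: replace the lambda body with the indexing logic.
        processAll(files, f ->
                System.out.println(Thread.currentThread().getName() + " would index " + f));
    }
}
```

For a directory with a very large number of files, one thread per file can be wasteful, which is one reason the parallel stream or a thread pool is often preferred.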






answered Nov 19 '18 at 17:28 by Cheloide

  • thanks, this is much simpler than what I was originally trying to do

    – Daniel Gizzi
    Nov 19 '18 at 18:27