How to Read a File in Java (Text, Binary, PDF, CSV, etc.)

Are you looking for an article that answers the question "How do I read a file in Java?" Reading files is one of the most common tasks for a developer or data scientist, something most of us do at least once a week. So why not learn it the easy way?

PDF, CSV, text, and binary are among the most common file formats. If you are a Java developer or data scientist, this article is a must-read.

How to read a text file in Java:


import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class TextFileReadingExample {
    public static void main(String[] args) {

        // Name of the file to read, with its absolute path
        String fileName = "fileNameWithAbsolutePath.txt";

        // Wrap the FileReader in a BufferedReader;
        // try-with-resources closes the file even if reading fails
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader(fileName))) {
            String line;
            // readLine() returns null when no lines are left
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (FileNotFoundException e) {
            System.out.println("System is unable to open the file: " + fileName);
            e.printStackTrace();
        } catch (IOException e) {
            System.out.println("System is unable to read the file: " + fileName);
            e.printStackTrace();
        }
    }
}

Description –

Reading a text file needs only two standard file-handling classes:

  1. FileReader
  2. BufferedReader

Both belong to the java.io package, so you can import and use them directly. FileReader opens the file as a stream of characters, and BufferedReader wraps it so that the text can be read one line at a time. The only real logic sits in the while loop, around this call:

bufferedReader.readLine()

It returns the next complete line as a String, and returns null when no lines are left to read. See the comments in the code example for more detail.
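
As an alternative to the FileReader and BufferedReader pair, the same line-by-line loop could be sketched with java.nio.file.Files, which lets you name the character encoding explicitly. This is only a sketch, assuming Java 8 or later and a UTF-8 encoded file. The class name NioTextReadSketch and the helper readLines are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class NioTextReadSketch {

    // Hypothetical helper: reads every line of a UTF-8 text file into a list
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader automatically
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null once no lines are left
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file path is passed as the first command-line argument
        if (args.length == 0) {
            System.out.println("Usage: java NioTextReadSketch <file>");
            return;
        }
        for (String line : readLines(Paths.get(args[0]))) {
            System.out.println(line);
        }
    }
}
```

Returning the lines as a list, instead of printing inside the loop, keeps the reading logic reusable. For very large files you may prefer to process each line as it is read.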

 

How to create a text file in Java:


import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class TextFileWriting {
    public static void main(String[] args) {
        // Name of the file to write, with absolute path
        String fileName = "FileName.txt";

        // Wrap the FileWriter in a BufferedWriter;
        // try-with-resources closes the file automatically
        try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(fileName))) {
            // Add text to the first line of the file
            bufferedWriter.write("My First Code for Text File Creation in java ");
            bufferedWriter.write(" It is so easy ");

            // Use this method when you need to start a new line
            bufferedWriter.newLine();

            bufferedWriter.write("Started from the next Line");
            bufferedWriter.write(" Appended Text in second Line ");
        } catch (IOException ex) {
            System.out.println("System is unable to create the file '" + fileName + "'");
            // Or we could just do this:
            // ex.printStackTrace();
        }
    }
}

Description –

Writing a file in Java differs only slightly from reading one. In place of the FileReader and BufferedReader classes, we use the FileWriter and BufferedWriter classes from the same java.io package. write() adds text to the current line, and newLine() starts a new one. If anything is unclear, please ask in the comment box.
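
One detail worth knowing: new FileWriter(fileName) replaces any content the file already has. To add to an existing file instead, FileWriter accepts a second constructor argument that turns on append mode. The sketch below is illustrative only. The class name AppendTextSketch and the helper appendLine are hypothetical, and "FileName.txt" is a placeholder path.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

class AppendTextSketch {

    // Hypothetical helper: adds one line to the end of the file,
    // creating the file first if it does not exist yet
    static void appendLine(String fileName, String text) throws IOException {
        // The second argument 'true' opens the file in append mode
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(fileName, true))) {
            writer.write(text);
            writer.newLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: "FileName.txt" is a placeholder; use your own path
        appendLine("FileName.txt", "First appended line");
        appendLine("FileName.txt", "Second appended line");
    }
}
```

Running main twice leaves four lines in the file, because each call adds to what is already there.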

How to read a binary file in Java –

package com.practice.check.concept;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadBinaryFile {

    public static void main(String[] args) {

        // Name of the file to open, with absolute path. It can be a binary file as well.
        String fileName = "FileName.txt";

        // Buffer that holds one chunk of the file at a time
        byte[] buffer = new byte[1000];

        // try-with-resources closes the stream even if reading fails
        try (FileInputStream inputStream = new FileInputStream(fileName)) {

            int bytesRead;

            // Read the file in chunks of up to the buffer size;
            // read() returns -1 at the end of the stream
            while ((bytesRead = inputStream.read(buffer)) != -1) {

                // Convert only the bytes actually read into a String
                System.out.print(new String(buffer, 0, bytesRead));
            }
        } catch (FileNotFoundException ex) {
            System.out.println("File cannot be opened: '" + fileName + "'");
        } catch (IOException ex) {
            System.out.println("Unable to read file: '" + fileName + "'");
        }
    }
}

Description –

The code above works for binary files as well as text files stored in the system's default encoding. As the code comment says, give the file name with its absolute path. If you have followed the earlier examples, this one is easy to understand. There are only a few differences:

  1. Instead of a FileReader object, create a FileInputStream object. It gives you the file as a stream of bytes.
  2. Define a buffer, which sets how many bytes are read at a time.
  3. Read the stream into the buffer repeatedly until read() returns -1, which marks the end of the stream.
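
The three steps above can be sketched as a small helper that collects the whole file into a byte array. The point to note is that read(buffer) may fill only part of the buffer, especially on the last pass, so only the first bytesRead bytes are valid each time. This is a minimal sketch. The class name ReadBytesSketch and the helper readAllBytes are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;

class ReadBytesSketch {

    // Hypothetical helper: returns the full content of a file as a byte array
    static byte[] readAllBytes(String fileName) throws IOException {
        // Step 2: the buffer sets how many bytes are read at a time
        byte[] buffer = new byte[1000];
        ByteArrayOutputStream collected = new ByteArrayOutputStream();

        // Step 1: open the file as a byte stream
        try (FileInputStream inputStream = new FileInputStream(fileName)) {
            int bytesRead;
            // Step 3: read() returns the number of bytes placed in the buffer,
            // or -1 at the end of the stream
            while ((bytesRead = inputStream.read(buffer)) != -1) {
                // Copy only the bytes that were actually read in this pass
                collected.write(buffer, 0, bytesRead);
            }
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file path is passed as the first command-line argument
        if (args.length == 0) {
            System.out.println("Usage: java ReadBytesSketch <file>");
            return;
        }
        byte[] data = readAllBytes(args[0]);
        System.out.println("Read " + data.length + " bytes");
    }
}
```

Keeping the data as bytes, instead of converting each chunk to a String, avoids damaging content that is not text.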

How to read a CSV file in Java –

CSV stands for comma-separated values: each row is a line of text, and a comma separates one value from the next. The main challenge is reading a file in which a value itself contains a comma, since the comma is also the separator.

For example –

Suppose a row of a CSV file is:

"2,3", "25,", ...

Here both values, (2,3) and (25,), contain a comma, and the separator is a comma too. So how do we solve this parsing problem?

There are two solutions. The first is to write the parsing logic yourself with Java's built-in libraries. For example, you can ignore every comma that falls between a pair of double quotes. That rule is not the complete logic, though, and there are several ways to approach it.
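
To make the "ignore commas between quotes" idea concrete, here is a rough sketch of a hand-written splitter for a single line. It is not a complete CSV parser: it assumes a value never spans more than one line and that no spaces sit between a comma and an opening quote. The class name QuotedCsvSplitSketch and the helper splitCsvLine are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvSplitSketch {

    // Hypothetical helper: splits one CSV line on commas that are outside double quotes
    static List<String> splitCsvLine(String line) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (insideQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    // A doubled quote inside a quoted value stands for one literal quote
                    current.append('"');
                    i++;
                } else {
                    // Opening or closing quote: flip the state, do not keep the quote itself
                    insideQuotes = !insideQuotes;
                }
            } else if (c == ',' && !insideQuotes) {
                // A comma outside quotes ends the current value
                values.add(current.toString());
                current.setLength(0);
            } else {
                // Any other character, including a comma inside quotes, belongs to the value
                current.append(c);
            }
        }
        // The last value has no trailing comma, so add it after the loop
        values.add(current.toString());
        return values;
    }

    public static void main(String[] args) {
        // Each value is printed in brackets so embedded commas are easy to see
        for (String value : splitCsvLine("\"2,3\",\"25,\",7")) {
            System.out.println("[" + value + "]");
        }
    }
}
```

Even this short version needs a special case for doubled quotes, which hints at why hand-written CSV parsing grows complicated quickly.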

The other way is to use a third-party library. That is what we will do here with opencsv, which handles such cases for us automatically.

  1. Create a Maven project and add the opencsv dependency to pom.xml.
    <dependency>
        <groupId>com.opencsv</groupId>
        <artifactId>opencsv</artifactId>
        <version>4.0</version>
    </dependency>

  2. Here is the complete code –

import java.io.FileReader;
import java.io.IOException;

import com.opencsv.CSVReader;

public class CsvFileReaderExample {

    public static void main(String[] args) {

        // File name of the CSV with absolute path
        String csvFileToRead = "C:\\Users\\DSL\\Documents\\Folder\\SAMPLE.csv";

        // try-with-resources closes the reader when reading is done
        try (CSVReader reader = new CSVReader(new FileReader(csvFileToRead))) {
            String[] row;

            // Iterate over each row; readNext() returns null at the end of the file
            while ((row = reader.readNext()) != null) {
                // To access a single element of the row, use row[index].
                // Here all values of the row are printed, separated by a space.
                System.out.println(String.join(" ", row));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

How to read a PDF file in Java –

PDF (Portable Document Format) is a form of unstructured data, and it is one of the most important formats for Java developers and data scientists to learn to handle. The reason is straightforward: most reports, bank statements, and similar documents come as PDFs.

There are many external APIs for working with PDF in Java. We are going to use PDFBox.

If you want to know more about Java PDF libraries, see the article: 5 Best Java PDF Libraries : Must Read for every Data Scientist

Before we jump into the Java code, we need the Maven dependency for the PDFBox library. Copy it into pom.xml between the tags:

<dependencies> ____your dependency ____  </dependencies>

Here is the Maven dependency for the PDFBox library:

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.9</version>
</dependency>

Here is the complete code –

import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class ReadingPDF {

    public static void main(String[] args) throws IOException {

        // Create a File object and pass it to PDDocument.load()
        File fileObj = new File("C:/Folder/sample.pdf");

        // try-with-resources closes the document when done
        try (PDDocument pdfDocument = PDDocument.load(fileObj)) {

            // Create the PDFTextStripper object
            PDFTextStripper pdfStripper = new PDFTextStripper();

            // Extract the text from the PDF
            String textPdf = pdfStripper.getText(pdfDocument);
            System.out.println(textPdf);
        }
    }
}

How to transform your career from Java developer to data scientist?

Python, R, and Julia are the most popular languages for data science, but Java is also powerful and capable of handling data science work. Admittedly, performance sometimes varies between them. Here is the detailed article on career transition from Java developer to data scientist.

Conclusion –

In this article we have walked through Java file handling for four file types: text, binary, CSV, and PDF. In data science, CSV is the format data scientists work with most of the time. If you want to go deeper into the basic operations for any particular file format, subscribe to Data Science Learner, and you will be notified when an article on it is published. Till then, keep reading Data Science Learner.

Thanks

Data Science Learner Team 


Meet Abhishek (Chief Editor), a data scientist with major expertise in NLP and text analytics. He has worked on various projects involving text data and has been able to achieve great results. He currently manages Datasciencelearner.com, where he and his team share knowledge and help others learn more about data science.
 