Numpy Genfromtxt: How to Use It with Examples


NumPy is a very popular Python module. It provides several functions for creating an array from tabular data, and numpy.genfromtxt() is one of them. In this tutorial, I will show you what the genfromtxt() function does and how to use it, with examples.

 

How does the Numpy Genfromtxt method work?

numpy.genfromtxt() runs two main loops. The first loop converts each line of the file into a sequence of strings, and the second converts each string to the appropriate data type. This makes it slower than a single-loop reader, but more flexible: in particular, genfromtxt() can take missing values in the data into account.
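To make the missing-value handling concrete, here is a minimal sketch. The sample string is made up for illustration; each row has one empty field. By default an empty field in a float column comes back as nan, and the filling_values argument lets you substitute another value.

```python
from io import StringIO

import numpy as np

# Hypothetical sample data: each row has one empty (missing) field.
data = "1,,3\n4,5,"

# With the default float dtype, a missing entry becomes nan.
with_nan = np.genfromtxt(StringIO(data), delimiter=",")

# filling_values replaces missing entries with a value of your choice.
filled = np.genfromtxt(StringIO(data), delimiter=",", filling_values=0)

print(with_nan)
print(filled)
```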

Input for Numpy Genfromtxt

The only mandatory input for numpy.genfromtxt() is the source of the data. It can be a file name, a file-like object, a list of strings, etc. A single string is treated as a file name, so text held in memory has to be wrapped in StringIO first. If you provide the URL of a remote file, the file is downloaded to the current working directory and opened from there.

You can also pass a compressed file. The archive type is determined from the file extension: .gz for gzip and .bz2 for bzip2.
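As a rough illustration of these input types, the sketch below reads the same rows once from a list of strings and once from a gzip-compressed file. The file name sample.csv.gz and the temporary directory are assumptions made only for this example.

```python
import gzip
import os
import tempfile

import numpy as np

# Hypothetical sample rows, used for both inputs.
lines = ["1,2,3", "4,5,6"]

# A list of strings can be passed directly as the data source.
from_list = np.genfromtxt(lines, delimiter=",")

# A path ending in .gz is treated as a gzip archive.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv.gz")
    with gzip.open(path, "wt") as fh:
        fh.write("\n".join(lines) + "\n")
    from_gzip = np.genfromtxt(path, delimiter=",")

print(from_list)
print(from_gzip)
```

Both calls should produce the same two-row array, since only the container of the text differs.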

Numpy Genfromtxt Examples

Example 1: Use of delimiter argument to split the lines.

You can pass delimiter as an argument to numpy.genfromtxt() to split each line into columns. For example, to split the strings at each comma, set delimiter=",". Run the code below.

import numpy as np
from io import StringIO
data = "10, 20, 30\n40, 50, 60"
np.genfromtxt(StringIO(data), delimiter=",")

Here, \n in the data variable is the newline escape character: everything after it belongs to a new line. genfromtxt() therefore returns a two-dimensional array with two rows, one for the text before \n and one for the text after it. The values in each row are split at the comma delimiter.

Output

Split the line using the delimiter

In the same way, you can use delimiter="\t" to split values at tab characters.
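A short sketch of the tab-separated case, using a made-up sample string in which \t separates the values:

```python
from io import StringIO

import numpy as np

# Hypothetical sample data: values separated by tab characters.
data = "10\t20\t30\n40\t50\t60"

# delimiter="\t" splits each line at the tabs.
arr = np.genfromtxt(StringIO(data), delimiter="\t")

print(arr)
```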

Example 2: Use of a fixed-width delimiter to split the lines.

Instead of a delimiter character, you can pass an integer that gives the width of each field. Every line is then cut into fixed-width columns of that size. Look at the code below, where I use a field width of 3.

data = u"  1  2  3\n  4  5 67\n890123  4"
np.genfromtxt(StringIO(data), delimiter=3)

Output

Split the line using the delimiter width
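The fields do not all have to be the same width. As a sketch, passing a sequence of integers to delimiter gives each column its own width; the sample string here is chosen so that the columns are 4, 3 and 2 characters wide.

```python
from io import StringIO

import numpy as np

# Sample data laid out in columns of width 4, 3 and 2.
data = "123456789\n   4  7 9\n   4567 9"

# A tuple of widths cuts every line into columns of those sizes.
arr = np.genfromtxt(StringIO(data), delimiter=(4, 3, 2))

print(arr)
```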

Example 3: Use of autostrip argument.

By default, when a line is split at the delimiter, the resulting strings keep any leading or trailing whitespace. If you pass autostrip=True, that whitespace is stripped from each string. Execute the code below and compare the outputs.

data = u"1, abc , 2\n 3, xxx, 4"
# Without autostrip
np.genfromtxt(StringIO(data), delimiter=",", dtype="|U5")
Genfromtxt() Output without autostrip argument 
# With autostrip
np.genfromtxt(StringIO(data), delimiter=",", dtype="|U5", autostrip=True)
Genfromtxt() Output with autostrip argument 

Example 4: Use of comments argument

Suppose your data contains comments that you want to leave out of the parsed values. The comments argument defines the string that marks the start of a comment; everything from that marker to the end of the line is ignored. Just execute the code below.

data = u"""#
# Skip me !
# Skip me too !
1, 2
3, 4
5, 6 #This is the third line of the data
7, 8
# And here comes the last line
9, 0
"""

np.genfromtxt(StringIO(data), comments="#", delimiter=",")

Output

Genfromtxt() Output with comments argument 

End Notes

numpy.genfromtxt() is a convenient way to read a sequence of strings and convert them into a NumPy array, which you can then manipulate as you need. These examples cover only the basics of NumPy genfromtxt.

If you still have any doubts about it, you can contact us for more information.

Source:

Official Numpy genfromtxt Method Documentation

 


Meet Sukesh ( Chief Editor ), a passionate and skilled Python programmer with a deep fascination for data science, NumPy, and Pandas. His journey in the world of coding began as a curious explorer and has evolved into a seasoned data enthusiast.
 