Using .data files from the UCI repository
In this brief article, we will learn how to use the UCI data sets that are distributed as .data files.
Downloading data from UCI repository:
There are some interesting data sets available for free in the UCI repository. You can use them to hone your analytical abilities.
To begin, we will download the .data file from the University of California, Irvine repository. We will download the iris dataset for this article. You may use the same procedure to download any dataset.
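After downloading it, you can open it with a text editor such as Notepad, or with Microsoft Excel. The file is plain text with comma-separated values and no header row.
Now, we will use it in a Jupyter notebook. First, we will import pandas, and then we will use read_csv() to read the data into a DataFrame. Because the file has no header row, pass header=None so that the first record is not treated as column names.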
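The reading step can be sketched roughly as follows. To keep the example self-contained, three rows in the iris.data layout are read from an in-memory string. In a notebook you would pass the path of the downloaded file instead, and the file name "iris.data" in the comment is an assumption about where you saved it.

```python
import io

import pandas as pd

# Three sample rows in the same layout as iris.data (no header row).
# With the real download you would call, for example,
# pd.read_csv("iris.data", header=None) -- the path is an assumption.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)

# header=None tells pandas that the first line is data, not column names,
# so the columns are labelled 0, 1, 2, 3, 4.
df = pd.read_csv(sample, header=None)

print(df.shape)
print(df.head())
```

Without header=None, pandas would use the first record as the header, and that row would be missing from the data.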
However, the DataFrame has no meaningful column names yet, so we will add them. To do that, we will copy the attribute names from the "Attribute Information" section of the data set description.
Now, we will add the column names to the DataFrame by assigning a list of names to its .columns property.
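A minimal sketch of the renaming step, again using a few in-memory sample rows in place of the downloaded file. The iris description lists sepal length, sepal width, petal length, and petal width (all in cm) plus the class. The short snake_case names below are our own choice for illustration, not names taken from the file.

```python
import io

import pandas as pd

# Sample rows standing in for the downloaded iris.data file.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)
df = pd.read_csv(sample, header=None)

# Illustrative names based on the attribute information; spelling is our choice.
column_names = [
    "sepal_length",
    "sepal_width",
    "petal_length",
    "petal_width",
    "class",
]

# The list must have exactly one name per column, in file order.
df.columns = column_names

print(df.head())
```

An alternative is to pass the names while reading, with pd.read_csv(path, header=None, names=column_names), which produces the same result in one step.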
Here is the full code from the Jupyter notebook if you want to try it yourself.
https://gist.github.com/da785d826e1c0f43b1ec4608c6528ccd.git