Previously, we learned what a Series and a DataFrame are and how to create them.
Now we will learn how to import a dataset into a DataFrame.
First, we need to decide whether we are importing from an external source (such as a URL) or a local one. Then we need to know what type of file we want to import.
For now, we will import a CSV file called "netflix_titles.csv".
If we are importing from an external source, we can pass the file's URL directly to the pandas read_csv() function.
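A minimal sketch of the external import is shown here. The URL is a hypothetical placeholder, not the real location of the dataset, and the helper name load_from_url is introduced only for illustration.

```python
import pandas as pd

# Hypothetical placeholder URL; replace it with the real location of the CSV file.
DATA_URL = "https://example.com/netflix_titles.csv"


def load_from_url(url: str) -> pd.DataFrame:
    """Fetch a CSV file from a URL and parse it into a DataFrame."""
    return pd.read_csv(url)


# Uncomment once DATA_URL points to a real file:
# df = load_from_url(DATA_URL)
# print(df.head())
```

The call is left commented out so that the cell does not fail while the placeholder URL is still in place.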
On the other hand, if we are importing from a local source, we first have to make sure the file is in our current working directory[1].
Once it is there, we can import it into our project by passing the file name to read_csv().
💡 Add this code to the same notebook we created in the previous workout.
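The local import might look like the sketch below. It assumes the file is named "netflix_titles.csv" and sits in the current working directory; the existence check is an addition for safety, not part of the original lesson.

```python
from pathlib import Path

import pandas as pd

# A relative path is resolved against the current working directory.
csv_path = Path("netflix_titles.csv")

if csv_path.exists():
    df = pd.read_csv(csv_path)
    print(df.head())  # preview the first five rows
else:
    print(f"{csv_path} was not found in {Path.cwd()}")
```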
If you want to use the same dataset, you can download it using the link in the Learn More section.
Footnotes
[1: Working Directory]
To check your current working directory, first import the os module and then call its getcwd() function.
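A short sketch of that check, where the variable name cwd is chosen only for illustration:

```python
import os

# getcwd() returns the absolute path of the current working directory.
cwd = os.getcwd()
print(cwd)
```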
The getcwd() function returns the path of the current working directory.
Locate that folder using the returned path and move the CSV file there.
To check whether a file is in a specific folder, we can use the listdir() function from the same module.
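As a sketch, assuming the file we are looking for is "netflix_titles.csv" and the folder is the current working directory (the variable names are illustrative):

```python
import os

# listdir() returns the names of the entries in the given folder.
files = os.listdir(os.getcwd())

# True if the CSV file is among them, False otherwise.
is_present = "netflix_titles.csv" in files
print(is_present)
```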
We can print the file names as a single list.
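For example, a minimal version that lists the current working directory:

```python
import os

# With no argument, listdir() lists the current working directory.
files = os.listdir()
print(files)
```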
Alternatively, we can print them one per line with a for loop.
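A sketch of the loop version follows. Sorting the names is an optional addition that makes the output easier to scan.

```python
import os

# Sort the names so the output is in a predictable order.
names = sorted(os.listdir())

for name in names:
    print(name)
```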
Learn More