You can read CSV (comma-separated values) files in Python without importing anything.
It’s good to have an idea of what’s happening when you use a library such as csv or pandas. It’s also not too complicated to write your own code to do this; some people might even find it easier to understand than using a library.
Here’s some example CSV data showing the ten highest-grossing movies of all time as of October 2024, courtesy of Wikipedia. Each row holds a rank, name, gross, and year. Put this into a file named “movies.csv” and save the file.
1,Avatar,2923706026,2009
2,Avengers: Endgame,2797501328,2019
3,Avatar: The Way of Water,2320250281,2022
4,Titanic,2257844554,1997
5,Star Wars: The Force Awakens,2068223624,2015
6,Avengers: Infinity War,2048359754,2018
7,Spider-Man: No Way Home,1922598800,2021
8,Inside Out 2,1696447226,2024
9,Jurassic World,1671537444,2015
10,The Lion King,1656943394,2019
Now you can read the data with the following code:
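# Read the whole file into one big string
with open('movies.csv') as f:
    data_string = f.read()

# Split the string into rows, then split each row into its values
rows = data_string.split('\n')
movies = []
for row in rows:
    if row:  # skip blank lines, such as the one left by a trailing newline
        info = row.split(',')
        movies.append(info)

print(movies[0])     # the entire first row
print(movies[0][1])  # the name from the first row
print(movies[0][2])  # the gross from the first row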
This makes a list of lists (a 2D array) out of the movie data. Every value in it is still a string at this point, including the numbers.
Explanation of the code:
1) Open the file and load it as one big string using f.read()
2) Split the big string into a list of strings, one for each row.
The split() function turns a string into a list. You have to tell it what character is separating the items; in this case the rows are separated by a line break character, "\n".
3) Iterate (loop) through the rows list and split each row into individual pieces of info (rank, name, gross, year). These values are separated by commas, so we use row.split(',') to make each row into a list of strings.
4) Add each of these lists of strings to a list called movies.
5) Access an entire row of data using something like movies[0].
6) Access one item in a row using something like movies[0][1].
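To see what each step produces, here is a minimal sketch that runs the same steps on a string instead of a file. The string holds the first three rows of the data above; using a string in place of movies.csv is just an assumption to keep the example self-contained.

```python
# Sample data: the first three rows of the movie data, held in one string
data_string = (
    '1,Avatar,2923706026,2009\n'
    '2,Avengers: Endgame,2797501328,2019\n'
    '3,Avatar: The Way of Water,2320250281,2022'
)

# Step 2: split the big string into one string per row
rows = data_string.split('\n')
print(rows[0])       # 1,Avatar,2923706026,2009

# Steps 3 and 4: split each row on commas and collect the lists in movies
movies = []
for row in rows:
    movies.append(row.split(','))

# Step 5: an entire row
print(movies[0])     # ['1', 'Avatar', '2923706026', '2009']

# Step 6: one item in a row
print(movies[0][1])  # Avatar
```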
Having the data set up this way is much more useful than having one big string, or even a list of strings that each contain an entire row. It allows you to pick the data apart and use only the pieces you want.
For example, you could use this to calculate the total money made, filter the data to only list movies made after 2010, or print only the movie titles.
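Here is one way those three ideas might look. This is a sketch, not the only approach: the movies list is typed in by hand (the first three rows of the data) so the example runs on its own, and the variable names total, recent, and titles are made up for illustration. The list comprehensions are shorthand for a for loop that appends to a list.

```python
# Assumed sample: the first three rows, with every value still a string
movies = [
    ['1', 'Avatar', '2923706026', '2009'],
    ['2', 'Avengers: Endgame', '2797501328', '2019'],
    ['3', 'Avatar: The Way of Water', '2320250281', '2022'],
]

# Total money made: convert the gross (index 2) to int before adding
total = sum(int(movie[2]) for movie in movies)
print(f'{total:,}')  # 8,041,457,635

# Only movies made after 2010: convert the year (index 3) to int to compare
recent = [movie for movie in movies if int(movie[3]) > 2010]
for movie in recent:
    print(movie[1], movie[3])

# Only the movie titles (index 1)
titles = [movie[1] for movie in movies]
print(titles)
```

The int() calls matter because split() always produces strings; adding or comparing the raw values would not treat them as numbers.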
Here’s an example using a for loop to iterate through the data:
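gross = []
for movie in movies:
    name = movie[1]
    amount = int(movie[2])  # convert the gross from a string to an int
    year = movie[3]
    gross.append(amount)
    print(f'{name} ({year}) grossed {amount:,}')

total = sum(gross)
print(f'Total gross: {total:,}')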
In this case, gross is a list of int data. We convert item 2 from each row to int and append it to the gross list. At the end, this list is used to calculate the sum.
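One limit of the hand-written approach is worth knowing about: split(',') treats every comma as a separator, so it breaks on a value that contains a comma and is wrapped in quotes. The sketch below compares it with the standard library’s csv module on a single row. The row is hypothetical (it is not part of the movie data), and the names line, naive, and parsed are made up for illustration.

```python
import csv
import io

# Hypothetical row: the name contains a comma, so it is wrapped in quotes
line = '11,"Example, The Movie",2024'

# Splitting on commas cuts the quoted name in two
naive = line.split(',')
print(naive)   # ['11', '"Example', ' The Movie"', '2024']

# csv.reader understands the quotes and keeps the name in one piece
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)  # ['11', 'Example, The Movie', '2024']
```

None of the ten rows above contain quoted commas, so the simple split works fine for this file.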