Let’s create a basic program that we can run as a file on the command line. We’ll start with a basic framework using a main()
function.
def main():
pass
if __name__ == "__main__":
main()
Save your file as file_exercise.py
and run it from the command line using python file_exercise.py
. Note: we are concentrating on Python 3 for this class, so if you have Python 2 installed, you may need to explicitly use python3 file_exercise.py
.
What happened? Because you ran the file directly, the file’s __name__
variable is set to __main__
, which triggers the if
statement to run the main()
function. This is a common pattern that you’ll see in Python programs, and it comes in handy for being able to write programs that work both on their own and when imported into other programs. The pass
keyword does nothing, it’s just there to keep the empty main()
function from throwing a syntax error.
Let’s start filling in our main()
function. We have a json file named cities.json
which contains the top five cities in the US, sorted by population. You can download cities.json
here. Let’s open the file and load in the data.
import json
def main():
cities_file = open("cities.json")
cities_data = json.load(cities_file)
print(cities_data)
if __name__ == "__main__":
main()
First, we imported the built-in json
library to help us decode the json file. Then, we opened the file using the open()
function, and passed the open file handle to the json.load()
function. The load()
function read our data in and spit it out as a Python representation - in this case, a list of dictionaries. We then print this list.
This list is a little hard to make sense of in its raw form, let’s print it a little nicer. Use enumerate()
to go through the list and print it nicely:
import json
def main():
cities_file = open("cities.json")
cities_data = json.load(cities_file)
print("Largest cities in the US by population:")
for index, entry in enumerate(cities_data):
print(f"{index + 1}: {entry['name']} - {entry['pop']}")
if __name__ == "__main__":
main()
A few new things here: first, remember that enumerate()
outputs a tuple of (index, entry), so we use index
and entry
variables to capture those. Then, for every item in the list, we print the index (+ 1, because zero-indexed lists are sometimes hard to read), and we pull the name and population out of each entry dictionary using the dictionary []
syntax.
One more thing to clean up - using the open()
keyword on its own is frowned upon, because it won’t automatically close any resources you might open. Even if you call the close()
keyword yourself, there’s no guarantee your program won’t crash, leaving important resources dangling. It’s safer to open files inside a context using the with
keyword. Once your code exits the scope of the context, your file is automatically closed. Note: our reading and formatting code has shifted to the right because of the change in scope.
import json
def main():
with open("cities.json") as cities_file:
cities_data = json.load(cities_file)
print("Largest cities in the US by population:")
for index, entry in enumerate(cities_data):
print(f"{index + 1}: {entry['name']} - {entry['pop']}")
print("The file is now closed.")
if __name__ == "__main__":
main()
Parsing files - especially if you didn’t create them - is often tricky, and you’re going to have to deal with less-than-perfect data. For example, go into your cities.json
file and delete the last ]
character. Run your program again.
Helpfully, the library told you (on the last line) approximately what is wrong and where. It also provides a Traceback to help you see what happened, starting with your main()
function, which called json.load(cities_file)
, and into the functions used internally to the json
library. This will become more useful once you start writing your own libraries, so practice reading and understanding your Tracebacks.
But let’s say we’re writing a web app or user-facing app and don’t want our users to see Tracebacks (they can be scary if you’re not a programmer, as well as risk your security by leaking information about your software). Let’s catch that JSONDecodeError
and return something prettier.
import json
def main():
with open("cities.json") as cities_file:
try:
cities_data = json.load(cities_file)
print("Largest cities in the US by population:")
for index, entry in enumerate(cities_data):
print(f"{index + 1}: {entry['name']} - {entry['pop']}")
except json.decoder.JSONDecodeError as error:
print("Sorry, there was an error decoding that json file:")
print(f"\t {error}")
print("The file is now closed.")
if __name__ == "__main__":
main()
Here, we’ve wrapped our business logic in another scope - the try - except
block. For the except
, we reach into the json
library and reference the JSONDecodeError
that’s part of the decoder
module. We assign it to error
so that we can reference it later. We then print out the entire error, prefixed with a tab character \t
to make it a little easier to read. Voilà, we’ve caught our error and reported it to the user with (hopefully) helpful information (but not too much information).