Practice

Let’s review what we learned today and put it all together.

For today’s final exercise, we’re going to write a small program that requests the top repositories from GitHub, ordered by the number of stars each repository has, and prints the results to our terminal. Create a new file called day_one.py.

You may need to install the requests library using python -m pip install requests. You may see pip used directly, but using python -m pip is recommended by Python.

Let’s start with our key function, the one that gets the data from the GitHub API. Use the requests library to do a GET request on the GitHub search API URL (“https://api.github.com/search/repositories”). Use if __name__ == "__main__" to make sure we’re running the file directly, and call our function from there. Don’t forget to import requests.

You should have something like this:
Here's what you should have seen on your command line:

Getting a Response

Looks like we got a response from the GitHub API! It also looks like we hit an error: we’re missing a search parameter. Checking the documentation_url that GitHub helpfully provides, we can see that we’re missing the parameter q, which contains search keywords. Let’s hardcode a query string to find repos with more than 50,000 stars and try again. We’ll add our query string to the parameters dict as q, and pass it to the params argument of requests.get().

You should have something like this:
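A minimal sketch of this step, assuming the function is named repos_with_most_stars() and the URL is stored in gh_api_repo_search_url (both names appear later in this section). Only the function is shown, so nothing is sent over the network until you call it.

```python
import requests

gh_api_repo_search_url = "https://api.github.com/search/repositories"


def repos_with_most_stars():
    """Ask the GitHub search API for repositories with many stars."""
    # Hardcoded for now: repositories with more than 50,000 stars.
    query = "stars:>50000"
    parameters = {"q": query}
    # requests turns the dict into the URL's query string for us.
    response = requests.get(gh_api_repo_search_url, params=parameters)
    return response
```

Call repos_with_most_stars() from your if __name__ == "__main__" block as before, and print response.status_code and response.text to see what came back.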
Here's what you should have seen on your command line:

Response Parsing

Whoa, we got a huge response from GitHub, including metadata for 33 repos. Let’s parse it so we can make better sense of what we have: use response.json() to parse the returned JSON into Python data structures. GitHub returns a list called items in its response, so let’s return that. Then, in your main function, loop through it and print out the important bits.

You should have something like this:
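To try the parsing step without hitting the API, here is a sketch that runs on a made-up payload shaped like GitHub’s search response. The repository names and star counts are invented for illustration, and parse_items() is a hypothetical helper; in the tutorial this logic lives inside repos_with_most_stars().

```python
import json

# Made-up payload shaped like a GitHub repository search response.
# The repository names and star counts are invented for illustration.
SAMPLE_RESPONSE_TEXT = """
{
  "total_count": 2,
  "incomplete_results": false,
  "items": [
    {"name": "example-repo-one", "language": "Python", "stargazers_count": 61000},
    {"name": "example-repo-two", "language": "Ruby", "stargazers_count": 52000}
  ]
}
"""


def parse_items(response_json):
    """Return the list of repositories from a parsed search response."""
    return response_json["items"]


# With requests, response.json() would hand us this parsed dict directly.
response_json = json.loads(SAMPLE_RESPONSE_TEXT)
results = parse_items(response_json)

if __name__ == "__main__":
    for result in results:
        name = result["name"]
        language = result["language"]
        stars = result["stargazers_count"]
        print(f"-> {name} is a {language} repo with {stars} stars.")
```

The keys name, language, and stargazers_count are the ones GitHub uses for each repository in items, so the same loop should work on the real response.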
Here's what you should have seen on your command line:

Narrowing it Down

We should now have a much more readable list of 33 or so repos, along with their number of stars. Let’s narrow down our search a bit. To use multiple search keywords, we’ll have to programmatically construct our query string. Using the GitHub API documentation, let’s make a new function to construct a query string for the repository search endpoint that searches for any number of languages, and limits our query to repos with more than 50,000 stars:

You should have something like this:
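Here is a minimal sketch of such a function. The min_stars parameter and its default are assumptions added for illustration. Each term is followed by a space, which is why the query string shown later in this section ends with a trailing +.

```python
def create_query(languages, min_stars=50000):
    """Build a GitHub repository search query string.

    min_stars is an assumed parameter, added so the star threshold
    is not buried inside the string.
    """
    # Search qualifiers are separated by spaces.
    query = f"stars:>{min_stars} "
    for language in languages:
        query += f"language:{language} "
    return query


if __name__ == "__main__":
    print(create_query(["python", "javascript", "ruby"]))
```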

Now, let’s call our new create_query() function from repos_with_most_stars(), replacing our hardcoded query string. Add a languages argument so that we can pass in a list of languages to use to create our query. Also add sort and order parameters, which we’ll hardcode to “stars” and “desc” for now.

You should have something like this:

Finally, let’s add a languages list to limit which languages we’re interested in, and pass it to repos_with_most_stars(). Now, when we call repos_with_most_stars() with ["python", "javascript", "ruby"] as our languages, create_query() builds the search query, and together with the sort and order parameters the resulting query string looks like q=stars:>50000+language:python+language:javascript+language:ruby+&sort=stars&order=desc. Because this is a simple GET request, the query string gets appended to our gh_api_repo_search_url, so our actual request URL is https://api.github.com/search/repositories?q=stars:>50000+language:python+language:javascript+language:ruby+&sort=stars&order=desc.
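If you want to check how a parameters dict becomes that URL, the standard library can show you without sending a request. This sketch uses urllib.parse.urlencode as a stand-in for what requests does internally, which is an assumption about its behavior rather than a quote from its code. Expect the default form to percent-encode characters such as : and >.

```python
from urllib.parse import urlencode

gh_api_repo_search_url = "https://api.github.com/search/repositories"

parameters = {
    "q": "stars:>50000 language:python language:javascript language:ruby ",
    "sort": "stars",
    "order": "desc",
}

# safe=":>" leaves ":" and ">" unescaped so the result is easy to read.
readable = urlencode(parameters, safe=":>")

# The default escapes them (":" becomes %3A, ">" becomes %3E).
encoded = urlencode(parameters)

if __name__ == "__main__":
    print(f"{gh_api_repo_search_url}?{readable}")
    print(f"{gh_api_repo_search_url}?{encoded}")
```

In both forms the spaces in the query become +, including the trailing one.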

Run your program.

You should have something like this:
Here's what you should have seen on your command line:

Cleaning Up and Handling Errors

Looking good! We now have a sorted list of the top Python, JavaScript, and Ruby repos. Let’s do a little bit of cleanup and error handling. We might not always want to sort by “stars” or order by “desc”, so move those to keyword arguments. That way, they’ll be good defaults, but if someone calling our repos_with_most_stars function wants to override them, they can.

You should have something like this:
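The idea of keyword arguments with defaults can be sketched in isolation. In the tutorial the defaults live on repos_with_most_stars() itself; build_parameters() is a hypothetical helper used here only so the example runs without a network call.

```python
def build_parameters(query, sort="stars", order="desc"):
    """Hypothetical helper: assemble the search parameters dict.

    sort and order are keyword arguments, so callers get sensible
    defaults but can still override them.
    """
    return {"q": query, "sort": sort, "order": order}


# Using the defaults.
default_parameters = build_parameters("stars:>50000")

# Overriding them: sort by forks, smallest first.
custom_parameters = build_parameters("stars:>50000", sort="forks", order="asc")

if __name__ == "__main__":
    print(default_parameters)
    print(custom_parameters)
```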

Next, we should handle any errors we might run into with the API. Maybe you’ve gotten one already. Let’s add some basic error handling on the response’s HTTP status code. We’ll check for a 403, a common error that GitHub uses to tell you that you’re hitting their API too quickly, and raise an error. We’ll also raise an error if the status code is anything but 200 (success).

You should have something like this:
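Putting the steps together, the program at this point might look like the sketch below. It is a reconstruction under assumptions, not the author’s final code: the error messages, the min_stars parameter, the timeout value, and the try/except in the main block are all additions made for illustration.

```python
import requests

gh_api_repo_search_url = "https://api.github.com/search/repositories"


def create_query(languages, min_stars=50000):
    """Build a search query such as 'stars:>50000 language:python '."""
    query = f"stars:>{min_stars} "
    for language in languages:
        query += f"language:{language} "
    return query


def repos_with_most_stars(languages, sort="stars", order="desc"):
    """Return the list of matching repositories from the GitHub search API."""
    parameters = {"q": create_query(languages), "sort": sort, "order": order}
    # timeout is an addition for this sketch so a stalled connection cannot hang.
    response = requests.get(gh_api_repo_search_url, params=parameters, timeout=10)

    status_code = response.status_code
    if status_code == 403:
        raise RuntimeError("Rate limit reached. Please wait a minute and try again.")
    if status_code != 200:
        raise RuntimeError(f"An error occurred. HTTP status code was {status_code}.")

    return response.json()["items"]


if __name__ == "__main__":
    languages = ["python", "javascript", "ruby"]
    try:
        results = repos_with_most_stars(languages)
    except (requests.exceptions.RequestException, RuntimeError) as error:
        # Catching here is a choice made for this sketch; letting the
        # error propagate and stop the program would also be reasonable.
        print(f"Request failed: {error}")
    else:
        for result in results:
            name = result["name"]
            language = result["language"]
            stars = result["stargazers_count"]
            print(f"-> {name} is a {language} repo with {stars} stars.")
```

Checking for 403 before the general check means the rate-limit case gets its own, more helpful message.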

There, your code should do the same thing as before, but it should now handle errors much better.

The final code, with additional comments, can be found here: