Anja's Projects

City of Ottawa Collisions Analysis

The City of Ottawa publishes open data on collisions from 2017 to 2022, covering accident type, weather, road conditions, location, and injury severity. We used it to identify which features of a collision increase the likelihood of an injury, through both quantitative analysis and predictive models (mainly decision trees and kNN).
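As a rough illustration of the kNN idea only (not the project's code), the sketch below classifies a collision by majority vote among its nearest neighbours. The three binary features and every row of data are made up for the example; the real dataset's columns and encoding are not shown here.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (feature_vector, label) pairs.
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (wet_road, dark, at_intersection). Toy values only.
train = [
    ((0, 0, 0), "no injury"),
    ((0, 0, 1), "no injury"),
    ((0, 1, 0), "no injury"),
    ((1, 1, 1), "injury"),
    ((1, 1, 0), "injury"),
    ((1, 0, 1), "injury"),
]

prediction = knn_predict(train, (1, 1, 1), k=3)
print(prediction)
```

With real data, the categorical columns would need to be encoded and scaled before distances between collisions mean anything.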

Exploring Various City of Ottawa Parks and their Features

Check it out →

On the City of Ottawa's open data website, I found seven datasets describing different features of Ottawa parks. I combined them into an interactive dashboard for exploring the parks. It is a Tableau story with four pages: plan which parks to visit based on the features you want, explore the parks close to you and the features they offer, compare how well equipped the parks in various neighbourhoods are, and see how park-friendly Ottawa is overall.

City of Ottawa Red-light Data Analysis

Check it out →

I used the City of Ottawa's open data on red-light violations from 2016 to 2020. The analysis centred on ROI questions for the City: which months and locations have the highest and lowest numbers of violations, and whether the number of violations changed by a statistically significant amount in 2020 compared with 2019. I later re-ran the code with the newest dataset for 2021, and an updated report covers 2016-2021. I used Pandas to analyze the data, Matplotlib and NumPy to visualize it, Folium to create map visualizations, and SciPy to run t-tests for the year-over-year comparison.
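To show what a year-over-year t-test computes, here is a minimal standard-library sketch of Welch's t statistic. It assumes the two years are treated as independent samples of monthly counts, which may not match the test used in the report, and the counts below are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    # Standard error of the difference in means, using sample variances.
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up monthly violation counts, NOT the City of Ottawa figures.
counts_2019 = [120, 135, 150, 160, 170, 180]
counts_2020 = [118, 130, 90, 60, 80, 110]

t_stat = welch_t(counts_2019, counts_2020)
print(f"t = {t_stat:.2f}")
```

This only yields the statistic. SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test and also returns the p-value needed to judge significance.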

Costco Deal Hunter

Check it out →

Part 1

I made a web scraper that pulls the information from Costco’s coupon section and looks for keywords. I used Selenium because the page requires the user to enter a location before the coupons load (the coupons depend on the location). I also used Pandas to build the data frame that stores all the coupon information.
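The keyword step can be sketched without a browser. The function below is a hypothetical stand-in for that part of the scraper: it assumes each coupon has already been scraped into a dictionary with a `name` field, and the sample coupons are invented.

```python
def match_keywords(coupons, keywords):
    """Return the coupons whose name contains any keyword, ignoring case."""
    wanted = [keyword.lower() for keyword in keywords]
    return [
        coupon
        for coupon in coupons
        if any(keyword in coupon["name"].lower() for keyword in wanted)
    ]


# Invented sample rows standing in for scraped coupon data.
coupons = [
    {"name": "Organic Coffee Beans 1 kg", "savings": 4.00},
    {"name": "Paper Towels 12-pack", "savings": 5.00},
    {"name": "Dark Roast COFFEE Pods", "savings": 6.00},
]

hits = match_keywords(coupons, ["coffee", "detergent"])
for coupon in hits:
    print(coupon["name"])
```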

Part 2

I compared prices of items at Costco and Real Canadian Superstore to find which store is cheaper. Using Selenium and Pandas, I automated entering the coupon item names into the Real Canadian Superstore search and scraping the results into a dataframe. The two dataframes (from the different sources) were then combined and cleaned for the price analysis. Prices were compared by unit price, and the overall distribution of price differences was displayed using the average quantity. The price distribution was graphed with Seaborn.
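Comparing by unit price matters because the two stores sell different package sizes. A minimal sketch of that comparison follows, assuming both quantities are already expressed in the same unit; the function names and the prices are illustrative, not taken from the project.

```python
def unit_price(price, quantity):
    """Price per single unit (for example per 100 g or per item)."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return price / quantity


def cheaper_store(costco, superstore):
    """Compare two (price, quantity) offers and name the cheaper store."""
    costco_unit = unit_price(*costco)
    superstore_unit = unit_price(*superstore)
    if costco_unit < superstore_unit:
        return "Costco"
    if superstore_unit < costco_unit:
        return "Real Canadian Superstore"
    return "tie"


# Invented example: a 3-unit pack for 12.00 versus a single unit for 5.00.
print(cheaper_store((12.00, 3), (5.00, 1)))
```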

Ontario School Boards Progress/Achievement

Check it out →

I downloaded a CSV from the Ontario government's open data that contains achievement on the standardized reading tests in grades 6 and 10, progress in credit accumulation for grades 10 and 11, and graduation rates (four-year and five-year). I then loaded it into BigQuery and used SQL queries to answer questions I had from an educator's and a parent's perspective. I used Looker Studio to summarize my findings in an interactive dashboard.
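As an example of the kind of question SQL can answer here, the sketch below ranks boards by how much the graduation rate rises between four and five years. It runs against an in-memory SQLite database as a local stand-in for BigQuery, and the table name, column names, and values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE boards ("
    "board_name TEXT, grad_rate_4yr INTEGER, grad_rate_5yr INTEGER)"
)
# Placeholder boards with made-up graduation rates, in percent.
conn.executemany(
    "INSERT INTO boards VALUES (?, ?, ?)",
    [("Board A", 82, 88), ("Board B", 75, 86), ("Board C", 90, 92)],
)

rows = conn.execute(
    """
    SELECT board_name, grad_rate_5yr - grad_rate_4yr AS fifth_year_gain
    FROM boards
    ORDER BY fifth_year_gain DESC
    """
).fetchall()
conn.close()

for board_name, gain in rows:
    print(board_name, gain)
```

The query itself uses only standard SQL, so the same statement should carry over to BigQuery once the table reference is adjusted.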

Creating My Website

Check it out →

This website is built on a template from a free templates website. I really liked its colour scheme and basic layout. I wrote the HTML for each page's layout and content, and changed some of the CSS to style certain elements to my liking.

Indeed Web Scraper

Check it out →

I followed a web tutorial that scraped the Monster job website. It took a lot of extra research because I was just starting to learn about HTML containers and attributes. I then tried out what I had learned on a similar job search site, Indeed.ca. In this project, I used the requests library to fetch pages and BeautifulSoup to parse the HTML.

IMDb Web Scraper

Check it out →

I used an article as a guide to better understand how to store scraped data in a usable form with Pandas. I scraped the first 50 movies from IMDb’s top 100 list and created a data frame with the correct data type for each column, which I saved to a CSV file. This project was my introduction to cleaning up scraped data.
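Scraped values arrive as strings, so getting the data types right means converting each field. The sketch below shows that step with the standard `csv` module instead of Pandas; the raw formats (a year in parentheses, votes with thousands separators) are assumptions about what a scraped page might yield, and the movie titles are placeholders.

```python
import csv
import io


def clean_row(raw):
    """Convert one scraped row of strings into properly typed values."""
    return {
        "title": raw["title"].strip(),
        "year": int(raw["year"].strip("() ")),         # "(1994)" -> 1994
        "rating": float(raw["rating"]),                # "9.3" -> 9.3
        "votes": int(raw["votes"].replace(",", "")),   # "2,345,678" -> 2345678
    }


# Placeholder rows in the assumed raw format.
raw_rows = [
    {"title": " Example Movie A ", "year": "(1994)", "rating": "9.3", "votes": "2,345,678"},
    {"title": "Example Movie B", "year": "(1972)", "rating": "9.2", "votes": "1,800,000"},
]
cleaned = [clean_row(row) for row in raw_rows]

# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "year", "rating", "votes"])
writer.writeheader()
writer.writerows(cleaned)
csv_text = buffer.getvalue()
print(csv_text)
```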

Hangman Terminal Game

Check it out →

This was my very first Python project, made while I was learning the basics of the language. It is a hangman game that runs in the terminal, without graphics.

Anja Wu
