  • Project 19: Exploring evolution of Linux

    1. Introduction Version control repositories like CVS, Subversion or Git can be a real gold mine for software developers. They contain every change to the source code including the date (the “when”), the responsible developer (the “who”), as well as a little message that describes the intention (the “what”) of a change. In this notebook,…


  • Project 17: Generating Keywords for Google Ads: Low cost furniture store

    1. The brief Imagine working for a digital marketing agency that is approached by a massive online furniture retailer. They want to test our skills at creating large campaigns across their entire website. We are tasked with creating a prototype set of keywords for search campaigns for their sofas section. The…


  • Indonesia Salary by region from 1997-2022 – Visually by running a bar chart

    How do I create a running bar chart using Google Colab? Although there are many ways to create a similar visualization, I would like to show you how I did it using Google Colab. In case you are not familiar with Google Colab: if you have ever used a word document on the…


  • Project 12: Optimizing Online Sports Retail Revenue

    1. Counting missing values Sports clothing and athleisure attire is a huge industry, worth approximately $193 billion in 2021 with a strong growth forecast over the next decade! In this notebook, we play the role of a product analyst for an online sports clothing company. The company is specifically interested in how it can improve revenue. We…


  • Project 10: Golden Age of Video Games

    1. The ten best-selling video games Photo by Dan Schleusser on Unsplash. Video games are big business: the global gaming market is projected to be worth more than $300 billion by 2027 according to Mordor Intelligence. With so much money at stake, the major game publishers are hugely incentivized to create the next big hit. But are games getting…


  • Project 9: Analysis of NYC Public School Test Results for SAT

    1. Inspecting the data Photo by Jannis Lucas on Unsplash. Every year, American high school students take SATs, which are standardized tests intended to measure literacy, numeracy, and writing skills. There are three sections – reading, math, and writing, each with a maximum score of 800 points. These tests are extremely important for students and colleges, as they…


  • Project 8: Netflix Movie Data – Analysis

    1. Loading your friend’s data into a dictionary Netflix! What started in 1997 as a DVD rental service has since exploded into the largest entertainment/media company by market capitalization, boasting over 200 million subscribers as of January 2021. Given the large number of movies and series available on the platform, it is a perfect opportunity to flex…


  • Project 7: The NYC Airbnb Market – Analysis

    1. Importing the Data Welcome to New York City (NYC), one of the most-visited cities in the world. As a result, there are many Airbnb listings to meet the high demand for temporary lodging, for stays ranging from a few nights to many months. In this notebook, we will take a look at the NYC Airbnb market by…


  • Project 6: Scala Programming History on GitHub – Analysis

    1. Scala’s real-world project repository data With almost 30k commits and a history spanning over ten years, Scala is a mature programming language. It is a general-purpose programming language that has recently become another prominent language for data scientists. Scala is also an open source project. Open source projects have the advantage that their entire…


  • Project 5: Exploring FIFA World Cup Data

    This dataset (source) includes 44,066 results of international football matches starting from the very first official match in 1872 up to 2022. The matches range from FIFA World Cup to FIFI Wild Cup to regular friendly matches. The matches are strictly men’s full internationals and the data does not include Olympic Games or matches where at least…