-

Project 8: Netflix Movie Data – Analysis
1. Loading your friend’s data into a dictionary
Netflix! What started in 1997 as a DVD rental service has since exploded into the largest entertainment/media company by market capitalization, boasting over 200 million subscribers as of January 2021. Given the large number of movies and series available on the platform, it is a perfect opportunity to flex…
-

Project 6: Scala Programming History on GitHub – Analysis
1. Scala’s real-world project repository data
With almost 30k commits and a history spanning over ten years, Scala is a mature programming language. It is a general-purpose programming language that has recently become another prominent language for data scientists. Scala is also an open-source project. Open-source projects have the advantage that their entire…
-

Project 5: Exploring FIFA World Cup Data
This dataset (source) includes 44,066 results of international football matches starting from the very first official match in 1872 up to 2022. The matches range from FIFA World Cup to FIFI Wild Cup to regular friendly matches. The matches are strictly men’s full internationals and the data does not include Olympic Games or matches where at least…
-

Project 4: Search for World’s Oldest Businesses
1. The oldest businesses in the world
This is Staffelter Hof Winery, Germany’s oldest business, which was established in 862 under the Carolingian dynasty. It has continued to serve customers through dramatic changes in Europe such as the Holy Roman Empire, the Ottoman Empire, and both world wars. What characteristics enable a business to stand…
-

Project 3: Reducing Traffic Mortality in the USA
1. The raw data files and their format
While the rate of fatal road accidents has been decreasing steadily since the 80s, the past ten years have seen a stagnation in this reduction. Coupled with the increase in the number of miles driven in the nation, the total number of traffic-related fatalities has now reached a…