Problem Statement 💡

Climate change is a contentious issue, with some believing it to be a critical threat and others dismissing it as a myth based on flawed science. In this project, we present a dataset that allows you to draw your own conclusions.

This dataset, compiled by the Berkeley Earth Surface Temperature Study, contains over 1.6 billion temperature reports from 16 archives and provides insights into long-term climate trends. The dataset includes global land and ocean temperatures, as well as temperatures by country, state, and city.

However, collecting this data has not been easy. Early temperature readings were taken with mercury thermometers, so measurements varied with the time of each visit. In the 1940s, the construction of airports forced many weather stations to be relocated. In the 1980s, stations moved to electronic thermometers, which are said to have a cooling bias.

Despite these challenges, several organizations collate and publish climate trend data. Three well-known land and ocean temperature datasets are NOAA’s MLOST, NASA’s GISTEMP, and the UK’s HadCRUT. We have repackaged the data from the Berkeley Earth study, which is cleaner and better organized and is published together with its source code and transformations.

In this project, we seek to explore the following questions using open-source data from Our World in Data and Kaggle:

Which countries have the highest temperature anomalies?
Which countries are the biggest emitters of CO2?
How can we sensibly rank "biggest polluters"?
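
The last question depends heavily on the metric chosen. As a minimal sketch (not the project's actual analysis), the Python snippet below ranks the same countries by total emissions and by emissions per person. The country names, the numbers, and the helper functions `rank_by_total` and `rank_by_per_capita` are all hypothetical and exist only to show that the two rankings can disagree.

```python
# Toy records: (country, total CO2 in million tonnes, population in millions).
# All names and numbers are made up for illustration; they are not real data.
records = [
    ("Country A", 1000.0, 500.0),
    ("Country B", 400.0, 50.0),
    ("Country C", 90.0, 5.0),
]


def rank_by_total(rows):
    """Return country names ordered by total emissions, largest first."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    return [name for name, _, _ in ordered]


def rank_by_per_capita(rows):
    """Return country names ordered by emissions per person, largest first."""
    ordered = sorted(rows, key=lambda r: r[1] / r[2], reverse=True)
    return [name for name, _, _ in ordered]


total_ranking = rank_by_total(records)
per_capita_ranking = rank_by_per_capita(records)

print("By total emissions:", total_ranking)
print("By emissions per person:", per_capita_ranking)
```

With these toy values the order reverses between the two metrics, which is why a ranking of "biggest polluters" should state the metric it uses.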

Data Sources

We have used the following open-source datasets to answer the project's questions:
