Project 4: Natural Language Processing of 25 Years of EU Climate Policy
This project expands on the work by Sewerin, S., Kaack, L.H., Küttel, J. et al. Towards understanding policy design through text-as-data approaches: The policy design annotations (POLIANNA) dataset. Sci Data 10, 896 (2023). https://doi.org/10.1038/s41597-023-02801-z.
POLIANNA is a dataset of European Union (EU) policy texts annotated according to theoretical concepts of policy design. It can be used to develop supervised machine learning approaches for scaling policy analysis. The dataset consists of 20,577 annotated spans drawn from 18 EU climate change mitigation and renewable energy policies. This project uses the existing annotations and training set and applies the resulting models to all EU legislation and directives from the 25-year period from 2000 to 2024.
Tasks involved the following:
- Highlighting requirements/focus (annotations)
- Modifying existing code and creating more code to collect more data
- Fine-tuning hyperparameters
- Adding models where necessary
- Data visualization/communicating findings
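The visualization step can be sketched roughly as follows: once a model has labelled spans in each legal act, the labels are counted per year and turned into shares that can be plotted over time. This is a minimal sketch under stated assumptions, not the project's actual code. The `PredictedSpan` schema, the `label_shares_by_year` helper, the document IDs, and the label names are all hypothetical placeholders, and the toy spans are invented purely for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedSpan:
    """One labelled span predicted by a model (hypothetical schema)."""
    doc_id: str   # identifier of the legal act (made up here)
    year: int     # year the act was adopted
    label: str    # predicted annotation label (placeholder names below)
    text: str     # the span text itself


def label_shares_by_year(spans):
    """Return {year: {label: share of that year's spans}}, sorted by year."""
    counts = defaultdict(Counter)
    for span in spans:
        counts[span.year][span.label] += 1

    shares = {}
    for year, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[year] = {label: n / total for label, n in counter.items()}
    return shares


# Toy input, invented for illustration only (not project results).
toy_spans = [
    PredictedSpan("doc-a", 2009, "Addressee", "Member States"),
    PredictedSpan("doc-a", 2009, "Authority", "the Commission"),
    PredictedSpan("doc-b", 2021, "Addressee", "operators"),
    PredictedSpan("doc-b", 2021, "Addressee", "Member States"),
    PredictedSpan("doc-b", 2021, "Authority", "the Commission"),
    PredictedSpan("doc-b", 2021, "Addressee", "suppliers"),
]

shares = label_shares_by_year(toy_spans)
for year, by_label in shares.items():
    print(year, by_label)
```

Normalizing to shares rather than raw counts keeps years with many long acts from dominating the trend; whether that is the right choice depends on the question being asked of the data.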
Legislation Changes
Change in Actors Over Time
Red line represents the “Fit-for-55” package
