This is the story of how we made it possible for data scientists to track local school holidays in Python, using workalendar (code example below).
But let’s start at the beginning:
When we write models that interpret and forecast meter data, we often encounter demand patterns that change drastically during holiday periods.
When this is the case, a good model really needs to take holidays into account. Facebook’s Prophet, for example, lets you do that. It relies on the python-holidays package to account for national holidays (it currently supports 69 countries).
But in many cases, keeping track of national holidays alone doesn’t cut it. In particular, school holidays can have a huge influence on demand patterns. Just look at how many more days in the year are school holidays (red) than national holidays (blue).
That got us asking a simple question:
Is there a Python library that lists Dutch school holidays?
Generally, Dutch school holidays are set by the Dutch government, although some holidays are just recommendations.
We tried the following two well-maintained libraries, but neither of them contained school holidays:
- python-holidays
- workalendar
We wanted to make school holidays for the Python community happen, first for our local Dutch case, but in principle for all countries in the world. But in which of these two libraries?
Python-holidays has the most repositories depending on it. This includes prominent members, such as Prophet (as mentioned above), Home Assistant and GluonTS. Unfortunately, python-holidays isn’t accepting contributions for school holidays, but luckily workalendar does.
We decided to make two contributions:
- Get school holidays into the Dutch calendar (as an option). For calendars in other countries, there is now a clear, low-effort way to do the same.
- Get Carnival holidays into the Dutch calendar (it has a huge effect on demand patterns in the south of the Netherlands).
We settled on incorporating a history of school holidays since 2016, and making it very easy to expand to other years. In truth, the future dates published by the Dutch government up to 2025 are not all set in stone yet. And especially with a global pandemic in progress, planning may change. For example, Carnival is not expected to be held in 2021, but I’m unsure what that means for the holiday itself.
Anyway, the Netherlands is now the first country to have school holidays easily accessible in Python. So how do you get it to work?
A code example
    from datetime import date
    from workalendar.europe import NetherlandsWithSchoolHolidays as NL

    calendar = NL(region='south', carnival_instead_of_spring=True)

    # Get a list of holidays
    holiday_list = calendar.holidays(2020)

    # Make a dictionary with a list of holidays for each date entry
    holiday_dict = {}
    for h in holiday_list:
        holiday_dict.setdefault(h[0], []).append(h[1])

    # Try out some dates
    print(holiday_dict[date(2020, 1, 1)])  # ['Christmas holiday', 'New year']
    print(holiday_dict[date(2020, 2, 22)])  # ['Carnival holiday']
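For a forecasting model you typically want one value per day rather than a list of holidays. The sketch below shows one way to turn a list of (date, label) tuples, the shape used in the example above, into a daily 0/1 flag. The helper name `holiday_flags` is our own for illustration and is not part of workalendar, and the sample list only contains the three entries printed above, so days outside that list are flagged 0 here even if they are holidays in reality.

```python
from datetime import date, timedelta


def holiday_flags(holiday_list, start, end):
    """Return {day: 1 or 0} for every day from start to end (inclusive).

    holiday_list is any iterable of (date, label) tuples.
    This helper is a hypothetical illustration, not a workalendar function.
    """
    # Several labels can share one date, so collapse to a set of dates
    holiday_dates = {day for day, _label in holiday_list}
    flags = {}
    current = start
    while current <= end:
        flags[current] = 1 if current in holiday_dates else 0
        current += timedelta(days=1)
    return flags


# Partial sample input, limited to the entries printed in the example above
sample_holidays = [
    (date(2020, 1, 1), "Christmas holiday"),
    (date(2020, 1, 1), "New year"),
    (date(2020, 2, 22), "Carnival holiday"),
]

flags = holiday_flags(sample_holidays, date(2020, 1, 1), date(2020, 2, 22))
print(flags[date(2020, 2, 22)])  # 1
```

In practice you would pass in the full `holiday_list` from the calendar instead of the sample, and join the resulting flags onto your meter data by date.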
But where are the school holidays for my country?
Sorry, we just needed Dutch school holidays for now. If you are based outside of the Netherlands, we invite you to create a school holiday calendar for your country or local region, too.
To help you do that, part of our contribution was to make it possible to pass class options to a calendar. Instead of making a new calendar class for each region, you can set the region when initializing the calendar. The library maintainers immediately recognized this feature as a sweet extension, and it led to a new major release, with relevant documentation.
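To give a feel for the pattern, here is a minimal, self-contained sketch of a calendar configured through options at initialization. It does not use workalendar and is not how the library implements it: the class `RegionalSchoolCalendar`, the method `spring_break_label`, the region names other than 'south' and the "Spring holiday" label are all assumptions made for illustration.

```python
class RegionalSchoolCalendar:
    """Hypothetical calendar configured through options at initialization."""

    # Assumed region names; only 'south' appears in the example in this post
    REGIONS = ("north", "middle", "south")

    def __init__(self, region, carnival_instead_of_spring=False):
        if region not in self.REGIONS:
            raise ValueError(
                f"region must be one of {self.REGIONS}, got {region!r}"
            )
        self.region = region
        self.carnival_instead_of_spring = carnival_instead_of_spring

    def spring_break_label(self):
        """Return the label for the February break, depending on the option."""
        if self.carnival_instead_of_spring:
            return "Carnival holiday"
        return "Spring holiday"  # assumed label, for illustration only


# One class serves every region; no subclass per region is needed
south = RegionalSchoolCalendar(region="south", carnival_instead_of_spring=True)
north = RegionalSchoolCalendar(region="north")
print(south.region, south.spring_break_label())  # south Carnival holiday
print(north.region, north.spring_break_label())  # north Spring holiday
```

The point of the design is that adding a region or a variant becomes a matter of accepting another option value, rather than adding another class to the library.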
This is our second blog post about contributions we made to popular data science libraries. If you’re an open source enthusiast like we are, check out how we fixed a bug in Pandas.