Introducing a Python package that helps with machine learning feature engineering during the pandemic: lockdowndates

If you were training machine learning models with any kind of time element during the pandemic, you were probably scrambling to re-engineer them. I am confident the pandemic pushed your forecasts very far from the truth.

Consumer habits changed completely and became so unpredictable that they were hard to forecast. Everyone was left in the dark as to whether we would be thrown into a complete or partial lockdown.

As a machine learning engineer, I had to completely change my approach to training machine learning models during the pandemic. It took a lot of tweaking. The hardest part was training models for consumers in different countries around the world, where restrictions were completely different. This is still a pain for some countries at the moment, with restrictions coming and going.

So I decided that, instead of trying to alleviate the damage the pandemic did to our models, why not embrace the restrictions? Why not embed them as features for our models to learn from? That's why I decided to create lockdowndates.

lockdowndates provides all current and past restrictions imposed by governments in over 100 countries worldwide during the pandemic! To get started:

pip3 install lockdowndates

(Side note: currently we only support Python 3.8 and above. This will change soon to support 3.6 and above.)

After installing, we import:

from lockdowndates.core import LockdownDates


To get restrictions for a single country:

ld = LockdownDates("Aruba", "2022-01-01", "2022-01-08")
lockdown_dates = ld.dates()
lockdown_dates


            aruba_country_code  aruba_stay_at_home
timestamp
2022-01-01                 ABW                 2.0
2022-01-02                 ABW                 2.0
2022-01-03                 ABW                 2.0
2022-01-04                 ABW                 2.0
2022-01-05                 ABW                 2.0
2022-01-06                 ABW                 2.0
2022-01-07                 ABW                 2.0
2022-01-08                 ABW                 2.0

To get restrictions for multiple countries:

ld2 = LockdownDates(["Canada", "Denmark"], "2022-01-01", "2022-01-08")
lockdown_dates = ld2.dates()
lockdown_dates


            canada_country_code  denmark_country_code  canada_stay_at_home  denmark_stay_at_home
timestamp
2022-01-01                  CAN                   DNK                  1.0                   0.0
2022-01-02                  CAN                   DNK                  1.0                   0.0
2022-01-03                  CAN                   DNK                  1.0                   0.0
2022-01-04                  CAN                   DNK                  1.0                   0.0
2022-01-05                  CAN                   DNK                  1.0                   0.0
2022-01-06                  CAN                   DNK                  1.0                   0.0
2022-01-07                  CAN                   DNK                  1.0                   0.0
2022-01-08                  CAN                   DNK                  1.0                   0.0
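Once you have a frame like the ones above, the usual next step is to attach it to your own data by date. Here is a minimal sketch of that, not something taken from the lockdowndates docs. To keep it runnable offline, the lockdown frame is rebuilt by hand from the Aruba output shown earlier, and it is assumed that the timestamp index is a datetime index (convert it with pd.to_datetime if yours turns out to be strings). The sales table and its column names are made up for illustration.

```python
import pandas as pd

# Stand-in for the frame returned by LockdownDates(...).dates(), built by hand
# so the sketch runs without network access. Columns and values copy the
# Aruba output shown earlier; the datetime index is an assumption.
dates = pd.date_range("2022-01-01", "2022-01-08", freq="D", name="timestamp")
lockdown = pd.DataFrame(
    {"aruba_country_code": "ABW", "aruba_stay_at_home": 2.0},
    index=dates,
)

# Hypothetical daily sales data, invented purely for illustration.
sales = pd.DataFrame(
    {
        "date": pd.to_datetime(["2022-01-02", "2022-01-05", "2022-01-07"]),
        "units_sold": [120, 95, 143],
    }
)

# Left join: every sales row is kept and gains the restriction level
# recorded for its date (NaN if the date is not in the lockdown frame).
features = sales.join(lockdown[["aruba_stay_at_home"]], on="date")
print(features)
```

A left join is used so that rows in your own data are never dropped when a date has no restriction record. They simply end up with NaN, which you can then handle like any other missing feature.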

The legend for stay_at_home is as follows:

  • NaN – no data available for that date.
  • 0.0 – no measures.
  • 1.0 – recommend not leaving house.
  • 2.0 – require not leaving house with exceptions for daily exercise, grocery shopping, and ‘essential’ trips.
  • 3.0 – require not leaving house with minimal exceptions (e.g. allowed to leave once a week, or only one person can leave at a time, etc.).
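Because stay_at_home is an ordinal code with a possible missing value, it can help to derive a couple of explicit flags from it before feeding it to a model. The sketch below is one way to do that based on the legend above. The function name and the feature names are my own choices for illustration and are not part of lockdowndates.

```python
import math


def stay_at_home_features(code):
    """Turn one stay_at_home value into a small dict of model features.

    Hypothetical helper for illustration; not part of lockdowndates.
    """
    # NaN (or None) means no data was available for that date.
    missing = code is None or math.isnan(code)
    return {
        "stay_at_home_missing": int(missing),
        # Per the legend, 2.0 and 3.0 are requirements, while 1.0 is
        # only a recommendation.
        "stay_at_home_required": int(not missing and code >= 2.0),
    }


for value in (1.0, 2.0, 3.0, float("nan")):
    print(value, stay_at_home_features(value))
```

Keeping a separate "missing" flag lets the model tell "no data" apart from "no requirement", which would otherwise both look like a zero.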

lockdowndates contains up-to-date data thanks to Oxford University and their open source data!

I will keep updating lockdowndates to contain more restrictions, including restrictions for vaccinated vs non-vaccinated, school restrictions, work from home restrictions and many more! Track our issues here.

So next time you are doing feature engineering on a machine learning project and have tabular data from the pandemic period, consider lockdowndates to make your life easier and to improve your metrics!

Documentation: lockdowndates

Github Repo: lockdowndates
