Managing e-statements for fun and profit


Unless you’ve been living under a rock for the last 5 years, you should have noticed the imminent extinction of paper statements from most of the consumer services. Whether we like it or not, e-bills are the way of the future.

One problem with e-bills is that they still need to be stored – preferably, in a way that makes them easy to find. One also cannot rely on the service owners to archive old bills, as they rarely keep them for longer than 1 year.

A typical process managing such bills involves multiple steps:

About 3 years ago, I’ve written a relatively simple tool that attempts to automate the latter two of those 3 steps (automatically downloading is notoriously difficult, given different website organization for each provider and the fact that they love to tweak the website layout, frequently breaking the scraping procedure).

The tool is called classify_bills and is available on GitHub. When run on a bunch on PDF files, it tries to figure out what account each file belongs to and what time period it is for. It then renames it appropriately and moves it into an appropriate destination.

The tool is configuration-driven, it basically runs off a bunch of regular expressions (hat tip to jwz), and configuring a new account is strictly speaking, not pleasant. Still, once configured, it eliminates the chore of laboriously doing those actions manually, bill by bill, month by month, making the process faster and pleasanter.

Powered by Jekyll and Moonwalk