Analyzing Your Spending with Elasticsearch and Kibana
How to export your financial data from Mint into Elasticsearch
[2018-04-05] #development #data visualization
I love Mint; it’s a great way to track your finances across all your accounts and make budgets. However, its visualizations (“trends”) leave something to be desired: for instance, you can graph spending over time or by category, but not both at once. Bucket sizes and time ranges are hard-coded. Filtering is not too bad, but it’s limited to inclusive filters on account and category.
Faced with this lack of visualization options, I often wish I had the power of analytics in Kibana… so let’s just go ahead and load the transaction data into Elasticsearch.
First we need an Elasticsearch “cluster”; I’m just using a single node on my local machine, which is fine for development or other non-critical work. Download and install the latest version of Elasticsearch and start it; for local dev I just run it in a tmux session: bin/elasticsearch. Then download and install the latest version of Kibana and start it: bin/kibana. After a couple of minutes you should see both report “green”, and you’ll be able to access Kibana at localhost:5601.
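If you’d rather check the status from a script than click around, here’s a quick sanity check. This is just a sketch: it assumes the requests library, the default ports, and Kibana’s status API as it behaves in the 6.x series.

import requests

# Elasticsearch cluster health: "green" (or "yellow" on a single node once
# indices with replicas exist) means it's ready to accept data.
print(requests.get('http://localhost:9200/_cluster/health').json()['status'])

# Kibana status API: should report "green" overall once it connects to Elasticsearch.
print(requests.get('http://localhost:5601/api/status').json()['status']['overall']['state'])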
Mint doesn’t offer a public API, but it does let you easily export your transactions in CSV format. The link is at the bottom of the “transactions” list. Here’s what that looks like:
"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"3/08/2018","Trader Joe's","TRADER JOE'S","65.56","debit","Groceries","Blue Cash Everyday","",""
"3/07/2018","Costco Gas","COSTCO GAS","19.40","debit","Gas & Fuel","Visa","",""
Before we can import the data we should set up an index and mapping for it. Since the shape of the data is known up front, we’ll just declare a static mapping, with field names changed to my preferred snake_case. I’ll do that in the “dev tools” console in Kibana:
PUT transactions1
{
  "mappings": {
    "transaction": {
      "properties": {
        "description": {
          "type": "text"
        },
        "original_description": {
          "type": "text"
        },
        "amount": {
          "type": "scaled_float",
          "scaling_factor": 100
        },
        "transaction_type": {
          "type": "keyword"
        },
        "category": {
          "type": "keyword"
        },
        "account_name": {
          "type": "keyword"
        },
        "labels": {
          "type": "keyword"
        },
        "notes": {
          "type": "text"
        },
        "date": {
          "type": "date",
          "format": "M/d/y"
        }
      }
    }
  }
}
This is pretty straightforward, but there are a couple of interesting points here:
- amount is a scaled_float, which is actually backed by a long but scaled by 100, so we get one-cent precision (what we have in the source CSV) with no floating-point rounding
- date format is explicitly specified, to match the format seen in the CSV
- text fields will be parsed into individual word tokens for filtering and aggregating, while keyword fields will be treated as a single token
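To see that last point in action, you can ask Elasticsearch how it tokenizes a value using the _analyze API. This is just a quick illustration with the Python client we’ll install in a moment; the exact tokens depend on the analyzer, and the standard analyzer shown here is what text fields use by default.

import elasticsearch

e = elasticsearch.Elasticsearch()

# The standard analyzer splits a text value into lowercase word tokens;
# keyword fields skip this step and keep the whole value as one token.
res = e.indices.analyze(index='transactions1',
                        body={'analyzer': 'standard', 'text': "TRADER JOE'S"})
print([t['token'] for t in res['tokens']])  # roughly: ['trader', "joe's"]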
Let’s also go ahead and make an alias, since we’ll probably have to reindex this and I don’t want to have to change index names in any clients:
POST transactions1/_alias/transactions
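If we do reindex later, say into a hypothetical transactions2, the alias lets us repoint clients in one atomic step. A sketch with the Python client (the transactions2 name is just an example):

import elasticsearch

e = elasticsearch.Elasticsearch()

# After reindexing into transactions2, move the alias in a single request so
# anything reading "transactions" never sees a gap or a mix of old and new.
e.indices.update_aliases(body={
    'actions': [
        {'remove': {'index': 'transactions1', 'alias': 'transactions'}},
        {'add': {'index': 'transactions2', 'alias': 'transactions'}},
    ]
})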
Now we can add transactions (or transactions1, since aliases aren’t quite working right now in Kibana v6.2.2) as an index pattern in Kibana, in the “management” menu.
There are a number of options for importing CSV data into Elasticsearch, including using an ingest node, but I’m going to use a little Python script:
#!/usr/bin/env python
import csv
import uuid

import elasticsearch

e = elasticsearch.Elasticsearch()
keys = ["date", "description", "original_description", "amount", "transaction_type",
        "category", "account_name", "labels", "notes"]

with open('transactions.csv') as f:
    r = csv.reader(f)
    first = True
    for row in r:
        # skip the header line
        if first:
            first = False
            continue
        # map the CSV columns onto our chosen field names
        doc = dict(zip(keys, row))
        doc['amount'] = float(doc['amount'])
        e.create(index='transactions1', doc_type='transaction', id=str(uuid.uuid1()), body=doc)
This parses the CSV, skips the header line, turns each remaining line into a dictionary with our chosen field names, and sends it to Elasticsearch using the low-level Python client (pip install elasticsearch) with a unique ID. We really should fingerprint each transaction and use that as the ID, so we can update this data later without creating duplicates (sounds like a good topic for a follow-up).
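As a rough preview of that idea, the fingerprint could be as simple as hashing the fields that identify a transaction. This is only a sketch against the doc dictionary from the script above; whether these particular fields are unique enough is a judgment call.

import hashlib

def transaction_id(doc):
    # Hash the fields that (hopefully) identify a transaction. Two identical
    # purchases on the same day would still collide, so treat this as a sketch
    # rather than a guaranteed-unique fingerprint.
    key = '|'.join(str(doc[k]) for k in ('date', 'original_description', 'amount', 'account_name'))
    return hashlib.sha1(key.encode('utf-8')).hexdigest()

With an ID like this you’d call e.index() instead of e.create(), so re-running the import would overwrite existing documents rather than failing with conflicts.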
Now that we have the data in Elasticsearch, let’s take a look at it. Load the “discover” view in Kibana, and we can immediately start filtering and exploring our transactions.
Haven’t used Kibana before? You may want to start with a tutorial to get the basics; I’d suggest the official getting started guide.
Before we go further, let’s make a little quality-of-life change in how amounts are displayed. Go back to the management menu, select the transactions index pattern, and then edit the amount field so it displays as dollars and cents.
From the “visualizations” module we can of course (mostly) recreate the graphs from Mint, but we can do so much more. Let’s start with the graph I really wanted in the first place: spending by category over time.
Now this is something I can start diving into and analyzing! If it looks like a bit of a mess, please note that Kibana visualizations are meant to be interacted with; static screenshots won’t really do them justice.
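For reference, that chart is roughly a date_histogram bucketed by month with a terms sub-aggregation on category, summing amount in each bucket. Here’s what it might look like from the Python client; the aggregation names and the debit-only filter are my own choices for this sketch, not anything Kibana requires.

import elasticsearch

e = elasticsearch.Elasticsearch()

# Monthly spending per category: monthly buckets, split by category,
# summing the transaction amounts (debits only, i.e. money going out).
res = e.search(index='transactions', body={
    'size': 0,
    'query': {'term': {'transaction_type': 'debit'}},
    'aggs': {
        'per_month': {
            'date_histogram': {'field': 'date', 'interval': 'month'},
            'aggs': {
                'by_category': {
                    'terms': {'field': 'category'},
                    'aggs': {'spent': {'sum': {'field': 'amount'}}},
                }
            }
        }
    }
})
for month in res['aggregations']['per_month']['buckets']:
    for cat in month['by_category']['buckets']:
        print(month['key_as_string'], cat['key'], cat['spent']['value'])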
To really explore multi-dimensional data it’s usually best to have a dashboard that combines several different views into the data.
From here you can easily filter down to the category, account, or time range that you’re interested in. If you want to see the individual transactions, just “pin” the filters and head back over to the “discover” view.
Once the data is in Elasticsearch there are few limits on what you can do with it. I could keep posting screenshots of Kibana, but it’s really something best experienced for yourself.