Getting an invoice out the door
One of the things we do at Woolpert is service Google Cloud Platform (GCP) customers. Part of that is sending them a monthly bill for their cloud spend. And part of that means assembling various bits of data from the GCP billing backend into a sensible report that customers can consume either by reading it, or by ingesting the data into their own systems.
Not inconsequentially, we don’t get paid until customers have an invoice.
The specific need
As part of our internal process we have to get a rather specific slice of data from the Billing API:
- a list of billing accounts under our management
- a list of projects under those accounts, and
- the ‘human friendly’ name of those projects.
That helps our customers to understand which projects they’re generating expenses on, and for our internal billing system to reconcile against.
To restate in simpler terms: we need a list of accounts, project IDs, and human readable project names as a CSV.
gcloud
to the rescue
Since this felt experimental up front, I decided to use gcloud
SDK command line interface in a shell script to interrogate the API and see if I could shape it to my needs. Because the data are billing records for all customers, I had to authenticate first using my corporate credentials. Let’s assume you need to do something similar using gcloud auth login
:
- Make sure
gcloud
is configured with an identity that has permission to read all all the billing data in GCP under the Woolpert master billing account. - Make sure the identity you’re using has read access to all of the projects under those billing accounts.
Validate the you’re on the right track:
|
|
So we’re rolling! I have the master billing ID, the name of the billing account, the fact that it’s active (True
), and the sub-billing account ID. I need to use the sub-billing account ID to go a query the project list for that account. Meaning: I need just a part of that output, and in a way that’s easier to consume programmatically. The format
argument for gcloud
lets me do that. Changing the initial attempt:
|
|
Now we see the sub-account ID that I’m after. It’s the second part of the "name"
parameter. How do I get to that? Let’s adapt a little bit from the standard --format="json"
approach and use a more refined filter. This took a little bit of trying; I found this blog post on formatting helpful.
|
|
This is where I found it had been useful to first look at the JSON output, because extracting a value
has a slight odd syntax at first glance. But with the JSON in mind, it seems pretty simple. In plain language:
- Get the value of
name
. - Get the
billing_account_id
value. - Extract only the second part after the forward slash (zero-based is segement 1)
Throw in a filter, and I have exactly what I need: a list of billing account IDs. I drop these in a text file for further use.
Looping
Our billing system also needs the familiar project name. The only way I could easily see to get that was looping through the list of accounts from the previous step, and calling the API thus:
|
|
Now, this may be blindingly obvious to you, but I didn’t know before this how to do a while...do...done
loop through lines in a text file to execute a command for each line. sed
, wc
, and others, yes, with pipes. But explicit looping was new to me.
I’ve highlight line 5 because we see a handy new --format
to export CSV data. gcloud
CSV export does append a trailing comma to each line–which feels unecessary–and I’ll deal with that later. But for scripting this is a great approach. So now I’ve got a text file $ACCOUNT_PROJECTS
containing a comman-delimeted row for each project in each billing account containing: the billing ID and the project ID.
Next I need to get that friendly name, like “my cool cloud project” vs. “my-cool-project-04329834”. Unfortunately, the gcloud beta billing projects list
command cannot access that; it’s an attribute of the project itself called name
. So, time to loop again and ask each project for that final bit of data, this time using the gcloud projects describe
command:
|
|
Easy enough. We have all the data we need for our ‘billing record’ data for our internal process. Let’s write the final usable bit of data for each of these projects into a CSV.
Putting it all together
Here’s what I actually wrote up as a top-to-bottom procedural script:
|
|
Probably the only thing worth calling out that is new is the getProjectId
function. There’s a(n un)suprising number of answers on StackOverflow about how to pull a specific ‘column’ or field out of a CSV file. You probably immediately thought of cut
with various options; me too. But getting that to function on a single string vs. a line from a file led to a lot of gymnastics in approach. Instead I went with the elegant solution proposed here which did the trick nicely.
So there you have it: a few new tidbits to learn, and a couple of csv files that are ready on a monthly cron basis to enrich customer bills via the Cloud Billing API.