EDIT: This piece has gone much bigger than expected. I’m blown away. I was editing during the day to add clarifications onto the end, but I’ve gone back and worked them into the body of the text.
The Treasury data breach has been a shitshow. I don’t think I’ve ever seen a bigger disconnect between the experts and the pundits, and I don’t say that lightly. I’m not a security guy, for what it’s worth: I’m a writer at a tech firm, but I’m fascinated by security and over the last few days I’ve been talking to people who actually know their stuff. Almost unanimously they’re calling this a breach. Almost unanimously, the pundits are off shouting that it’s “not a hack!”.
Right from the start, I’m setting a rule: we’re not going to talk about “hacking”. It means totally different things to the IT sector (anything from coding at all to randomly kludged spaghetti code that really shouldn’t work) and the public (a man in a trenchcoat saying “I’m in!”), and most InfoSec types shy away from it anyway. I’m not going to bore you with the whole hacking vs cracking debate, but we’re going to call this thing what it is: a data breach.
So what happened?*¹ This is a web server:
Its job is to display web content. Every time you go online, you’re accessing content from web servers. Simple enough? This is a staging server:
It serves as a testing environment. Content intended for the public but not yet released goes on the staging server to make sure it runs smoothly for when the time comes to make it public. Some staging server content never goes live: it either didn’t work as expected or it wasn’t meant to be there, or something changed and it got pulled.
Treasury cloned their web server, put the clone on the staging server, then added the budget to it for testing. The problem is, they also cloned the index configuration: the instructions that tell the search where to store its search data for later use. Both the web server and the staging server stored their search information in the same place, and SOLR (the program running the search function) wasn’t properly instructed to avoid the staging server. That gave the web server’s search bar access to the search information about documents on the staging server, though not to the staging documents themselves.
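If it helps to see the shape of the problem, here is a toy sketch in Python. It is not SOLR and it is not Treasury’s setup: the `Site` class, the dict standing in for the index, and the example title are all made up for illustration. The only point is that two servers writing to one index means the public search can see entries for documents it doesn’t hold.

```python
# Toy model only: a plain dict stands in for the search index.
# Nothing here is SOLR's real API; every name is hypothetical.
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    body: str


class Site:
    """A server that holds documents and writes search entries to an index."""

    def __init__(self, name, index):
        self.name = name
        self.index = index      # where this site stores its search entries
        self.documents = {}     # the documents themselves stay on the site

    def publish(self, doc):
        self.documents[doc.title] = doc
        # Only the title and a short snippet end up in the index.
        self.index[doc.title.lower()] = doc.body[:40]

    def search(self, query):
        return self.index.get(query.lower())


shared_index = {}                        # the cloned config points both sites here
web = Site("web", shared_index)
staging = Site("staging", shared_index)

staging.publish(Document("Example Heading", "Unreleased text that is much longer than a snippet."))

snippet = web.search("example heading")  # public search, staging entry
print(snippet)
```

The web server never holds the staging document (`web.documents` stays empty), but its search still returns the snippet, because both sites were told to use the same index.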
To illustrate, here’s the Spinoff today:
See how you get the title and the first few lines? Using the exploit on the Treasury’s site, somebody pulled snippets of the budget like that from the staging server. Critically, to do this, you would need to know the title of the section. You search for a specific heading in the web server, and it comes up with the title and the first 4-5 lines. It was, all things considered, a pretty small hole:
- It required the attacker to know the content was on the staging server
- It required the attacker to know the specific wording on the staging server
- Even then, it only gave them snippets
So what happened? Well, a leak. The actual leak. The budget didn’t leak: the budget’s search index leaked. That’s essentially a table of contents. The budget ToC being out in the open covered points 1 and 2 above: the fact the budget was ready to go public (thus, probably on the staging server) and a list of searchable titles and subtitles.
“Leak” is a strong word, too: it used the same headings as the 2018 budget. I’m still a little fuzzy on whether the actual index leaked (as in, got sent to the wrong place/got left out somewhere irresponsible/got made public too early) or whether somebody just heard it was the same as last year’s via the Thorndon grapevine and started punching in queries.
What about the third point? Well, that’s why there were 2000 searches. They pulled 2000 snippets and put the budget together like a jigsaw. It’s not “just a search”: it’s using a leaked search index to perform 2000 searches, to take advantage of an exploit that pulled small pieces of content from a staging server, then stitching that content together in post. It’s not something Johnny Q Public could do by accident. It’s not an “open door” at all. That’s also why National got some details wrong: they didn’t have a complete picture. They had a very good outline, though. All the titles and subtitles, and the first few lines after each.
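As a rough illustration of the jigsaw step, here is a Python sketch. Everything in it is hypothetical: the index contents, the headings, and the function names are placeholders, and nobody outside the people involved knows what tooling (if any) was actually used. It only shows the pattern described above: one search per known heading, then glue the snippets together.

```python
# Hypothetical stand-in for the shared search index: heading -> first few lines.
shared_index = {
    "example overview": "First few lines of the overview section",
    "example spending": "First few lines of the spending section",
}


def search_snippet(index, heading):
    """Return the snippet for an exact heading match, or None if nothing matches."""
    return index.get(heading.lower())


def stitch_outline(index, headings):
    """Run one search per known heading and join whatever comes back."""
    pieces = []
    for heading in headings:
        snippet = search_snippet(index, heading)
        if snippet is not None:
            pieces.append(f"{heading}\n{snippet} ...")
    return "\n\n".join(pieces)


# The list of headings is the part that has to come from somewhere else.
known_headings = ["Example Overview", "Example Spending", "Example Heading That Misses"]
outline = stitch_outline(shared_index, known_headings)
print(outline)
```

Without the list of headings the loop has nothing to search for, and even with it the result is an outline with gaps rather than the full document.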
It’s all a bit rubbish but, to quote InfoSec luminary Adam Boileau, “it’s not rubbish if it works”.
Metaphors about the door being unlocked do us no favours, unless we really want pundits to be better-equipped to twist the actual events. Whether or not it’s a “hack” doesn’t really matter: it’s an intentional attempt to gain access to private data. It utilised an exploit to pull content that wasn’t meant to be public. It’s a breach. More than that, there are established protocols for what happens if somebody finds an exploit in government software. These rules were written by the National Party in 2014, and National failed to follow them. Their failure to follow protocol merits investigation: they let the particular use of an exploit go undetected for their own political gain. Even if the content was delivered to them anonymously by a no-good samaritan, they bear at least partial responsibility for this because they went public instead of reporting it.
Where did the Treasury fuck up?
- They should’ve considered their SOLR configuration when they cloned their data to the staging server.
- They probably shouldn’t have cloned their web server to begin with—making a staging server from scratch with the same dependencies might have been a pain in the ass (I’m honestly not sure: I don’t know what their dependencies look like) but it would’ve been a lot safer.
- They could’ve been jazzier about this year’s subtitles.
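For the first point, one general way to keep environments apart is to tag every index entry with the environment it came from and have each site’s search filter on its own tag. The Python below is a toy sketch of that idea only. The `SearchIndex` class and its fields are invented for illustration, and it makes no claim about how Treasury’s SOLR or Drupal configuration should actually have been written.

```python
# Toy sketch: entries carry an environment tag, and searches filter on it.
# All names are hypothetical; this is not SOLR configuration.
class SearchIndex:
    def __init__(self):
        self.entries = []  # list of (environment, title, snippet)

    def add(self, environment, title, snippet):
        self.entries.append((environment, title, snippet))

    def search(self, query, environment):
        """Return (title, snippet) pairs from the given environment only."""
        q = query.lower()
        return [
            (title, snippet)
            for env, title, snippet in self.entries
            if env == environment and q in title.lower()
        ]


index = SearchIndex()
index.add("production", "Example public page", "Already released text")
index.add("staging", "Example draft page", "Not yet released text")

public_hits = index.search("example", environment="production")
draft_hits = index.search("draft", environment="production")
print(public_hits, draft_hits)
```

The other obvious option is a completely separate index per environment, which avoids relying on every query remembering its filter.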
Where did the National Party fuck up?
- They identified an exploit but—instead of following CERT protocol—they used it for their own personal gain.
I’m not gonna lie, it’s bad. Somebody dropped the ball, and somebody else put a knife into it.
Still, I do not believe Simon Bridges has committed a crime, nor has he committed Breach of Confidence. He has violated his CERT obligations, which at worst means he’ll get a strongly-worded nonbinding letter from MBIE telling him not to do it again. He did a bad thing, but not all bad things result in him being removed from Parliament in a paddy wagon. To quote one of my anonymous sources: “he’s an asshole, not a criminal.”
It’s still ridiculous that pundits are calling for heads to roll. At the end of the day, it wasn’t a big deal. Grant Robertson shrugged and moved on. The Treasury were right: what harm could somebody actually do by using that exploit? Release a half-complete version of the document a day early?
By the by, it’s not dodgy or extreme that anybody called it a ‘hack’. If there’s a problem with the word, it’s not that it doesn’t mean this, it’s that it does mean this because it’s a vague word that means wildly different things to different people. Not all hacking is a man in a trenchcoat typing into a green/black Linux CLI then saying “I’m in!” It’s not rubbish if it works. Makhlouf and Robertson could’ve maybe been more precise with their language but that’s not a crime either.
And then, of course, the pundits got to it. Either the Treasury were little angels who did no wrong, or they were cringing fools who dropped a box of printed budgets off at the top of Lambton Quay. What we actually have here is a pattern pretty typical of data breaches: a small screwup like improper SOLR config gave an attacker access to data they shouldn’t have had. I’m sure somebody is going to shout at me that it wasn’t a small mistake, but unless they can explain how to correctly configure Apache SOLR in a Drupal installation so it doesn’t allow partial read access to cloned data in a staging server then they can fuck right off with their piety and condescension. It’s a screwup for sure, but the people talking about “open doors” need to pull their heads in.
What’s really happening is that the pundits smell blood in the water, and they don’t care what actually happened—they just want an excuse to sink their teeth in.
Same old NZPol, I guess.
If you like what you’re reading, stick around and check out some of my fiction, or follow me @understatesmen on Twitter.
*¹ most of this is coming through various DMs and actually talking to people. I am willing to admit I might’ve muddied the details, though I’ve done my best and at the very least—talking to actual experts and having a tech background—I’m doing a better job than the lukewarm tech reckons of blokes who struggle to operate a washing machine.
Credit for assistance to Sana Oshika, and the others who preferred to go unnamed.
Thanks for this.
It’s helped me, who has no technical skills, understand where the ethics in this is.
We’re told that exemplary values and ethics are vital in the coming world of AI as the human ability is slowly removed.
This has made it very clear that Principles above Personalities/Politics were set aside in this instance.
For me, denying the Principles you yourself set in place, in response to what appears to be an admin error, leaves a deficit of trust with the former. For the other, it’s a case of “you’ve got some systems to sort”, a daily if not weekly occurrence in all good businesses.
Your technical explanation is brilliant but your application of law, ethics and everything else seems a little skewed.
1. I don’t know anything about the CERT protocol but it looks very much like a notification process and not at all like something that binds the National Party.
2. You recognise the public has a particular understanding of “hacked”. That understanding is likely shared by and was definitely exploited by Makhlouf and Robertson for political and personal gain.
3. Makhlouf went to the Police. It remains clear that nothing criminal occurred and it is unlikely any civil legal wrong occurred either.
4. As you acknowledge, Treasury failed to follow best practice in securing its data. Data which in this case is usually highly guarded for political (but also economic) reasons.
5. Our Courts (and those overseas) have been pretty consistent in saying things you make publicly available on the internet (even in limited form and with restricted access) are not private.
As I said, your technical explanation is good. Your conclusions are not.
@T L Steele: Your own reasoning is flawed.
The National party conducted a deliberate, systematic exploit of a technical vulnerability to obtain information they knew they weren’t supposed to have. It’s entirely within the common sense understanding of ‘hacking’ and could be argued in good faith to fit within s249 of the Crimes Act, though I don’t think it meets it. Makhlouf’s actions in going to the Police weren’t unreasonable, because it’s their job to decide if something is a crime.
It’s similar to the time the former Prime Minister went to the Police when he was inadvertently recorded by a journalist (the ‘teapot tapes’) and the Police decided not to prosecute; appropriate, if not necessarily wise.
Your point about Courts treating information available on the internet as ‘not private’ is vague, but I’d direct you to Hammond v NZCU Baywide, where the Human Rights Review Tribunal determined that a photo was collected from Facebook in breach of the Privacy Act and imposed a $168,000 fine on the offending company.
I think we should expect a higher standard than ‘not technically illegal’ from Her Majesty’s Opposition.
A very helpful piece of analysis, Alexander. It makes sense of the behaviour, and what people said when. Much appreciated!
Good work. It’s nice to see that there still is a place for good old fashioned IT journalism.