HPE embarks on internal investigation after ATO outage
- 21 December, 2016 12:08
Hewlett Packard Enterprise (HPE) has confirmed that it has launched an internal investigation into the cause of the hardware failure behind the unprecedented system outage that hit the Australian Taxation Office (ATO) on 12 December.
The move comes just days after the Australian Commissioner of Taxation, Chris Jordan, said the government agency would commission an independent investigation into what he called the “worst unplanned system outage in recent memory”.
Now, HPE has outlined its own plans to get to the bottom of the hardware issues that caused days of outages and more than a week of system problems for the ATO.
“In addition to the ATO’s independent review, HPE has initiated its own root cause analysis investigation to determine why storage hardware went offline, preceding a series of events that led to the broader system outage experienced by the ATO,” an HPE representative told ARN in a statement.
“We refrain from speculation on possible causes while the investigation is underway and at this time, HPE does not believe that other customers are at risk,” the company said.
While HPE has yet to disclose further details about the hardware troubles behind the outage, the company has reiterated that it is working closely with the ATO to restore systems and rectify the issues that arose in its wake.
“A team of Hewlett Packard Enterprise (HPE) engineers continue to work closely with the [ATO] to help restore all systems to functionality as soon as possible. HPE is completely committed to helping to resolve the system issues which have impacted the ATO’s online services, portals and website,” HPE said.
The ATO’s problems began when storage hardware that had been upgraded in November 2015 by HPE experienced an “unprecedented” failure.
The systems outage was reportedly caused by the failure of two new HPE 3PAR storage area network (SAN) units acquired by the ATO in 2015, which resulted in the temporary loss of up to one petabyte of data. The issue was compounded by the subsequent failure of the agency’s back-up systems.
HPE has since indicated the problem does not appear to be systemic in nature, according to reports.
This claim mirrors comments by ATO acting chief information officer, Steve Hamilton, who suggested the problem was the first of its kind.
"Our primary back-up systems, that should have kicked in immediately, were also affected. We understand this is the first time this problem has been encountered anywhere in the world and we are working with HPE to determine the underlying cause," Hamilton said in a statement.
Immediately following the outage, Hamilton issued a statement saying that the ATO was working closely with HPE to resolve issues around online services, portals and the website.
The storage issues knocked out some of the ATO’s core internal systems and public-facing services for days.
On 21 December, a full nine days after the initial outage, the ATO confirmed that all of its business-critical systems were live, with performance returning to normal as systems stabilised.
However, at the time of writing, the ATO indicated it was still in the process of bringing some services back online.
“We are hoping to have our other online services fully functional soon but we don't have a definite time-frame yet,” the agency said in a tweet on 21 December.