CIO

Long weekend comes to an end for HPE and the ATO

The agency begins bringing services back online after marathon weekend of work

The Australian Taxation Office (ATO) has told Australians that many of its services are back online following a weekend-long marathon effort working with Hewlett Packard Enterprise (HPE) engineers to replace failed hardware and restore systems.

The agency said early on 6 February that a number of core services, including the Tax Agent, Business and BAS Agent Portals, ATO Online services, and Standard Business Reporting (SBR) services were back up and running, with other services expected to be brought online “shortly”.

“Our clients may experience some slowness as further work is undertaken to improve the overall performance of our systems,” the ATO said in a statement.

“Our focus will now turn to building system resilience to best ensure the stability of our services to the community.”

The latest update follows days and nights of work in which ATO staff joined forces with HPE technicians to replace the storage hardware and get systems back up and running. The agency has been working to restore its systems since a massive outage caused by a storage hardware failure on 12 December 2016.

“Engineers from Hewlett Packard Enterprise and the ATO continued system restoration efforts overnight and this morning,” the ATO said on 5 February.

“Good progress has been made as we work towards having services available for clients on Monday. This will however be subject to ongoing testing of the integrity and stability of the system.”

“Restoration efforts are progressing well and engineers are now undertaking technical verification of our systems,” the agency later said. “This will continue throughout the night with a view to services being available tomorrow.

“Work has also been completed to ensure we have appropriate contingencies in place which is giving us increasing confidence that we will at least have our core client services available tomorrow [6 February],” it said.

The ATO flagged on 4 February that HPE would work over the weekend to restore systems, with support from the agency’s own technicians. At the time, it indicated that it did not expect services to be available until after 5 February.

“While we are working as quickly as we can, we cannot make services available to the community until the integrity of the system is confirmed,” the agency said.

The agency also revealed that it had commissioned a new Storage Area Network (SAN) to “provide better services in the future,” saying that the new hardware had arrived at ATO premises.

“This new system will contain vast amounts of ATO data and it is currently being configured to provide more reliability and stability,” the ATO said at the time. “The nature and scope of the ATO’s SAN means that this process of replacing the affected hardware will take some time.”

According to the ATO, once the integrity of the new system is confirmed, it intends to implement three follow-on priorities: making services available to key impacted stakeholders; building system resilience to best ensure the stability of services to the community; and increasing the capacity of systems to deliver the services the community expects.

At the same time, the agency confirmed that hardware issues were behind both the initial systems outage in early December and the most recent outage, which hit at the beginning of February and continued for a number of days following its initial efforts to swap out the affected hardware.

“While recognising that PwC [PricewaterhouseCoopers] are undertaking an independent review into the outages, we can confirm that hardware faults caused both the December outage and this outage, although we understand that the exact nature of the respective faults is different,” the ATO said.

“We recognise the ongoing impact this is having on our stakeholders and the broader community and thank them for their ongoing patience,” it said.

The agency has spent the better part of the past seven weeks working to restore its systems to full capacity, retrieve temporarily lost data, and replace affected hardware, following the “unprecedented” failure of storage hardware that had been upgraded in November 2015 by HPE.

It is understood that the outage involved two new HPE 3PAR storage area network (SAN) units acquired by the ATO in 2015.

Since the problems first hit, the restoration work has seen the ATO intermittently take its systems offline, over two weekends in January and during the Christmas holiday period.

On 16 December, soon after trouble first struck, Australia’s Commissioner of Taxation, Chris Jordan, referred to the incident as the “worst unplanned system outage in recent memory”.

“This was an extremely unusual and unfortunate event with the outage caused by a significant and unprecedented failure of storage hardware,” Jordan said at the time.

While the ATO has called in PwC to investigate the cause of the initial hardware failure and the resulting outage, HPE confirmed in late December that it had launched its own internal investigation into the cause of the hardware failure.