CIO

Easy risk management can prevent costly errors (ask United Airlines)

United Airlines just can't win. Last week a data center hardware failure caused nine cancellations and 580 delays. Recently, too, the carrier's computer systems let members of its Mileage Plus frequent traveler program book reservations involving either a connection or a final destination in Hong Kong, for travel in the first-class cabin (where purchased tickets can go for more than $15,000), for as little as four--yes, four--miles roundtrip.

This programming glitch remained active for many hours on Sunday, June 17. While no one outside of United and the United States Department of Transportation (DoT) knows for sure how many tickets were booked under this discount, well-placed sources told NBC News that the number was in the thousands. Indeed, on many flight dates, the entire first-class cabin (up to 12 seats on some routes) showed booked.

Consider the revenue implications of this glitch. Even by the most conservative measurement standards, United stands to miss out on hundreds of thousands, if not millions, of dollars in sales opportunities if these tickets are allowed to stand. A costly error, any way you slice it.


As part of the airline's cleanup process, the lion's share of these tickets were unilaterally cancelled by the airline about a week after the glitch was corrected, although some individuals who booked reservations travelled within the next few days, and their reservations were honored.

(Full disclosure: I booked four of these tickets, but they were later cancelled by the airline, so I have not made any sort of gain on this transaction. If the tickets are re-instated as part of a future settlement, that may change, but as of this writing the transaction has been unwound. I simply illustrate it here as a teachable moment.)

Either way, United Airlines did not set itself up for success. The carrier let a gaping hole emerge in its booking procedures, created a regulatory compliance problem under the DoT's new mistake fare rules, left some of its most profitable products unavailable to paying customers for days because of this error and risked a public relations firestorm by unilaterally unwinding transactions it considered invalid. It was not a good week at headquarters, for sure.

Of course, not all of us run airlines, nor do we run airline IT departments. However, this public example can certainly be instructive of what can happen when good systems do bad things. Furthermore, watching United's situation and its response can teach us ways to avoid the same consequences in our own businesses. Here are three points to consider:

1. Don't Forget the Simple Checks

In United's case, there is no excuse for the ticketing routine not to have a check built into it--something along the lines of, say, if the number of miles paid for a ticket is less than 1,000, then don't issue the ticket. This simple move would have prevented the ensuing disaster.
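As a rough illustration only, a check of that kind might look like the Python sketch below. The 1,000-mile floor comes from the example above; the function name, the exception class and the "ISSUED" return value are hypothetical and do not reflect how United's ticketing systems actually work.

```python
# Hypothetical sanity floor, taken from the article's 1,000-mile example.
MIN_AWARD_MILES = 1_000


class SuspiciousFareError(Exception):
    """Raised when an award price falls below the sanity floor."""


def issue_award_ticket(miles_charged: int, min_miles: int = MIN_AWARD_MILES) -> str:
    """Refuse to issue an award ticket priced below the floor.

    A real system would more likely route the booking to a human
    reviewer than reject it outright; that choice is an assumption here.
    """
    if miles_charged < min_miles:
        raise SuspiciousFareError(
            f"{miles_charged} miles is below the {min_miles}-mile floor; "
            "holding ticket for review"
        )
    return "ISSUED"
```

A four-mile booking would raise SuspiciousFareError before any ticket was issued, while a normally priced award would pass straight through.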

Indeed, while researching and reading about mistake fare cases, I found information from just three years ago indicating that the various airline IT vendors had only at that point installed logic in their fare-loading systems to detect fat-finger mistakes and other airfare errors that deviated widely from the norm. These global distribution systems have been around for more than four decades now, and yet only recently did airlines place a priority on having computers check for obvious errors. How did we get into the late 2000s before we decided to at least start trying to prevent simple mistakes?


To get more mileage from this idea, consider all of the ranges of values for transactions and study the feasibility of checking the ranges of those values before transactions are committed, approved or otherwise confirmed. For a bank, for example, how likely is it that a retail customer will make a deposit more than 100 times his average deposit amount over the last few years? For online merchants, how likely is it you really want a transaction for just pennies to complete? (Ignore 0.00 and 1.00 transactions, as these may have legitimate purposes.) Consider amounts, quantities and other places where simple checks can pause what might turn out to be a runaway process that could require costly cleanup.
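The bank and merchant examples above could be sketched along these lines. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the 100x multiple and the exemption for 0.00 and 1.00 transactions come from the paragraph above, and treating anything under 1.00 as "just pennies" is an assumption made for the example.

```python
from decimal import Decimal
from statistics import mean

# Amounts the article says may be legitimate despite being tiny.
EXEMPT_TOTALS = {Decimal("0.00"), Decimal("1.00")}


def deposit_needs_review(amount, past_deposits, multiple=100):
    """Flag a deposit larger than `multiple` times the customer's average."""
    if not past_deposits:
        # Assumption: accounts with no history are handled by other controls.
        return False
    return amount > multiple * mean(past_deposits)


def order_needs_review(total, floor=Decimal("1.00")):
    """Flag an order that totals only pennies, unless it is an exempt amount."""
    if total in EXEMPT_TOTALS:
        return False
    return total < floor


history = [Decimal("100.00"), Decimal("200.00")]
print(deposit_needs_review(Decimal("20000.00"), history))
print(order_needs_review(Decimal("0.04")))
```

Both functions only flag a transaction for a pause and a second look; what happens next (hold, escalate, reject) is a business decision left outside the sketch.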

2. Don't Neglect a Thorough Risk Point Inventory

It goes against human nature to want to spend a lot of time thinking about failure, catastrophe and negative events, but to protect well against all sorts of eventualities, you have to first identify and consider their presence and potential impacts. This point may cause you to scratch your head in the context of the United example. "Did they not think about a booking engine error like this?" you might ask. The answer: if the firm did, it didn't do much about it.


Consider all of the steps of a transaction in your business. You should have workflow and process maps available for all systems. At what point can problems occur? What types of problems occur there? What are some cost-effective ways to mitigate some of those concerns?

Better yet, make this type of review an ongoing concern. Build these types of checks into the approval process for new systems and workflows, and meet at regular intervals to keep your analysis up to date.

3. Don't Forget About Synchronizing Teams

The effects of mistakes in IT go beyond IT, yet many IT shops forget this fact or ignore it altogether. Integrating well with all teams in the organization that face any customer is a must, a mandatory minimum.


Let's go back to the airline example for a moment and look at how PR works. The premerger United Airlines and Continental Airlines both publicly confirmed that they would honor any price they sold, whether or not it was a mistake. Premerger United Airlines spokeswoman Robin Urbanski even told The Wall Street Journal in 2010, "That is the right thing to do."

In 2007, United honored a business-class fare from Los Angeles or San Francisco to destinations in New Zealand that was missing one zero, selling at $1,062 plus taxes and fees instead of $10,620. Once it cancelled those tickets to Hong Kong, the carrier essentially reversed itself on that public position; in effect, it had written a public relations goodwill check that it later declined to honor. Then, United's customer service team had to reach out to each affected customer and explain the action. This costly, time-consuming initiative did nothing to raise revenue or cut expenses and, from a financial perspective, was nothing but a cost.

Get together with customer service teams, public relations teams, your finance group, an executive liaison, your compliance group and others. Call it a "Systems Issue Working Group." Come up with a way of explaining how your systems work that is accessible and understandable to this group. Implement a plan for both avoiding glitches and responding smoothly to them in the event they happen. Solicit feedback from other teams on what pain points there are during glitches. Take this feedback to your organization to develop processes to check, pause and otherwise mitigate these points.

It's unlikely that you'll never have a glitch that costs you time and money. With some analysis and preparation, though, you can avoid the dumb mistakes that make everyone's skies a little less friendly.

Jonathan Hassell runs 82 Ventures, a consulting firm based out of Charlotte. He's also an editor with Apress Media LLC.