Saturday, November 02, 2013

Exclusive: Insider Explains Fiasco

November 2, 2013, 4:00 p.m.

From 'Integration Testing' to 'Full End-to-End Testing'

Unless you've spent the last month in a cave with your mountain-dwelling guru, you're aware of the fiasco in the Obama Administration's rollout of the Web page that was supposed to provide the gateway for Americans' path to near-universal health insurance. [Cartoon credit: Steve Sack, Minneapolis Star-Tribune, Oct. 23, 2013.]

It's been hard to get the details on how such a thing could happen -- especially when Obamacare ("The Affordable Care Act") has been the one major accomplishment and showpiece of President Obama's Administration.

Of course, a part of the cause may have been CMS' [U.S. Centers for Medicare and Medicaid Services] failure to fully investigate the record of their prime contractor:
Canadian provincial health officials last year fired the parent company of CGI Federal, the prime contractor for the problem-plagued Obamacare health exchange websites . . . after the firm missed three years of deadlines and failed to deliver the province’s flagship online medical registry. . . . The CMS officials refused to say if federal officials knew of its parent company’s IT failure in Canada when awarding the six contracts.
Richard Pollock, "Canadian officials fired IT firm behind troubled Obamacare website," Washington Examiner, Oct. 10, 2013.

There is a project management method appropriate to this task. It is called PERT (the "program evaluation and review technique"), something developed in part for, and used by, the Polaris submarine project as I observed it in the 1960s. Polaris required the ability to manage a project involving tens of thousands of sub-contractors under an extremely tight schedule. Surely PERT could have helped, with or without a failed CGI Federal.
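For readers who want to see what PERT actually computes, here is a minimal sketch in Python. It assumes the standard three-point estimate, (optimistic + 4 x most likely + pessimistic) / 6, and finds the longest chain of dependent tasks (the "critical path"). The task names and week counts are invented for illustration and are not figures from the actual project.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks: (optimistic, most likely, pessimistic) weeks, prerequisites.
# These numbers are illustrative assumptions, not data from any real project.
TASKS = {
    "build_frontend": ((4, 6, 14), []),
    "build_backend": ((6, 8, 16), []),
    "integrate": ((2, 4, 12), ["build_frontend", "build_backend"]),
    "end_to_end_test": ((8, 16, 30), ["integrate"]),
}


def expected_duration(optimistic, likely, pessimistic):
    """Classic PERT three-point estimate."""
    return (optimistic + 4 * likely + pessimistic) / 6


def critical_path(tasks):
    """Return (longest chain of task names, its total expected duration)."""
    graph = {name: deps for name, (_, deps) in tasks.items()}
    finish = {}    # earliest expected finish time of each task
    previous = {}  # the prerequisite that determines each task's start
    for name in TopologicalSorter(graph).static_order():
        estimate, deps = tasks[name]
        start = max((finish[d] for d in deps), default=0.0)
        previous[name] = max(deps, key=lambda d: finish[d]) if deps else None
        finish[name] = start + expected_duration(*estimate)

    last = max(finish, key=finish.get)
    total = finish[last]
    path = []
    while last is not None:
        path.append(last)
        last = previous[last]
    return list(reversed(path)), total


path, total_weeks = critical_path(TASKS)
print(" -> ".join(path), f"({total_weeks:.0f} weeks expected)")
```

With these made-up numbers the critical path runs through the back end, integration, and end-to-end testing, and testing alone accounts for more than half of the expected schedule.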

Thus, the scenario has been a dramatic case study in both (1) the consequences of a failure to understand some basic principles of Management 101, and (2) how not to roll out a massive, new, and complex bit of software.

Although I did some computer programming in the simple BASIC language over 30 years ago, since then I've limited myself to some easily mastered DOS and HTML commands and left the real programming to others. But I've experienced enough to agree with the observation of the head of a university's computer science department when told there were over 100 million lines of code in President Reagan's "Star Wars" program: "I've never seen a computer program that was more than three lines long that ran the first time it was tried."

So I asked a very reliable source whether I could share with you some professional insights which they have provided. This source, in no way affiliated with the effort, has so many years' experience in the business, in a variety of contexts, that to help to maintain his or her anonymity I have substituted "nn" in the following at the spot where they reveal that number. I found what was sent to be helpfully informative, and it's shared here in the hope and expectation you will find it so as well.

# # #

I've been closely following the developments in the debacle. I'm not a politician nor a healthcare expert, so I really can't comment on whether the Affordable Care Act will achieve its goals or fatally undermine the American Dream. That sort of pontification I leave to Democrats and Republicans, respectively.

What I am, though, is a veteran software engineer with nn years of experience dealing with large projects in both the public and private sector.

Point blank: we have been lied to, and we are being lied to, about the future of the site.

I could write five thousand words on precisely how many deceits are on display. I'll try to keep it under a few hundred and just focus on the one whopper of a lie that I believe even non-programmers can understand.

The contractors who originally delivered the site advised the White House that full end-to-end integration testing had not been completed -- and, in fact, had not even started until a few days before the October 1 rollout. That's a technical term, "full end-to-end integration testing," so let me explain what we mean by that. Integration testing means "we're putting the pieces together to see if they work well." End-to-end integration testing means "we're putting *all* the pieces together to see if they work well." And finally, full end-to-end integration testing means, "we're putting *all* the pieces together and testing them exhaustively to ensure they work well."

To put things in terms of cars: when the team building the tires meets with the team building the hubcaps and the team building the rims and the team building the axles, and they make sure the tires fit on the axles and the hubcaps look nice, that's integration testing. When all the teams come together to assemble a complete car, that's end-to-end integration testing. And when they put a test driver behind the wheel and send the car out for a five hundred mile drive at the local track, that's full end-to-end integration testing.
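For readers who program, the three levels in the car analogy can be sketched in Python. This is an illustration only: the classes, the measurements, and the 300-mile tire rating are all hypothetical, chosen so that the defect shows up only at the third level.

```python
from dataclasses import dataclass


@dataclass
class Tire:
    bore_mm: int      # hypothetical: size of the hole that fits the hub
    rated_miles: int  # hypothetical: how far the tire lasts


@dataclass
class Axle:
    hub_mm: int


@dataclass
class Car:
    tires: list
    axle: Axle
    odometer: int = 0

    def drive(self, miles):
        # A tire fails if the trip would take it past its rated distance.
        if any(t.rated_miles < self.odometer + miles for t in self.tires):
            raise RuntimeError("tire failed during drive")
        self.odometer += miles


def integration_test(tire, axle):
    """Two pieces together: does the tire fit the axle?"""
    return tire.bore_mm == axle.hub_mm


def end_to_end_test(tires, axle):
    """All the pieces together: can a complete car be assembled?"""
    if len(tires) != 4 or not all(integration_test(t, axle) for t in tires):
        return None
    return Car(tires=tires, axle=axle)


def full_end_to_end_test(car, laps=50, miles_per_lap=10):
    """All the pieces together, exercised at length: 500 miles at the track."""
    try:
        for _ in range(laps):
            car.drive(miles_per_lap)
    except RuntimeError:
        return False
    return True


axle = Axle(hub_mm=100)
tires = [Tire(bore_mm=100, rated_miles=300) for _ in range(4)]

print(integration_test(tires[0], axle))  # the parts fit
car = end_to_end_test(tires, axle)
print(car is not None)                   # the car assembles
print(full_end_to_end_test(car))         # the long drive exposes the weak tires
```

In this sketch the first two checks pass and only the long drive fails, which is the point of the analogy: the earlier levels of testing cannot stand in for the last one.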

Any engineer will tell you that full end-to-end integration testing is a headache and a half. Things always go wrong, and they're never the things you expect. As a result, full end-to-end integration testing takes a long time -- oftentimes measured in months.

Would you buy a car if the vendor said, "We only started test laps at the track a couple of days ago and it had some serious problems we haven't been able to fix"?

Of course not. But that's exactly what happened with the site when it rolled out on October 1.

Secretary Sebelius has been publicly humiliated over the defects in the site. She and the President have promised the website will be fixed and will be reliable no later than December 1.

My question is, where will she find the time to do full end-to-end integration testing? Even if all the changes to the infrastructure are completed November 1, that still leaves only a month for testing to make sure the site works. The contractors who originally developed the site were adamant that testing it properly would require months. From my own experience, I am inclined to think four months of testing is about the minimum required.

I'm not a politician and I'm not a healthcare expert -- but I'm a very good software engineer.

The system needs at least four months of testing and it's not going to get it. That means that, come December 1, the best we can hope for is that we will be delivered a new site which will not have received any significant testing. Rather than being able to point to a record of successful tests, we will instead be asked to take Secretary Sebelius at her word. "Trust me! It works fine now."

Software that has not been thoroughly tested, or has not passed its thorough testing, is fundamentally incomplete. The site will be fundamentally incomplete on December 1.

There's simply not enough time for it to be tested, and that means there's not enough time for it to be completed.

# # #


Trish Nelson said...

Nick, thanks for sharing. I am wondering if the person who wrote this can clarify this statement: "The contractors who originally delivered the site advised the White House that full end-to-end integration testing had not been completed -- and, in fact, had not even started until a few days before the October 1 rollout." Are they saying the White House was not advised until just before October 1 or that the end-to-end integration testing had not been started until just before October 1? Is there a link/citation for this statement?

Nick said...

I passed Trish Nelson's comment/question along to my source, who responded (with permission that I could post it here):

"On September 27 Marilyn Tavenner, Administrator of CMS, received a memorandum detailing critical, high-risk problems related to the testing regimen. That same day three senior CMS executives signed a response memorandum stating,

'We acknowledge the level of risk the Agency is accepting in the Federally Facilitated Marketplace. The mitigation plan does not reduce the risk to the FFM system itself going into operation on October 1, 2013. However, the added protections do reduce the risk to the overall Marketplace operations and will ensure that the FFM system is completely tested within the next six months.'

CMS knew the system wouldn't be completely tested for another six months, and yet promised everything was ready for an October 1 rollout.

The most generous interpretation I have is that CMS is staggeringly, pervasively incompetent. What ought be done as a result of this incompetence is a political question, and I wish to leave that alone.

The URL for the memoranda (hosted by CNN, who vouches for its authenticity):"

From Nick:

From the above I was able to track down a report in the National Journal, which links to the CNN report noted above.

Sophie Novack, "CMS Memo Warned of Security Threat Before Obamacare Site Launch," National Journal, Oct. 31, 2013;

which links to Joe Johns, "Government Memo Warned of High Security Risk at Health Care Website," CNN, Oct. 30, 2013.

The link in the quoted response comment, above, goes to a pdf of the actual memo.

Steve Groenewold said...

This whole fiasco has been disappointing and exasperating.

When Democrats finally pass the legislation they've worked three generations to create, and the administration has 4 years (four!) to get it ready, they show such an extreme level of incompetence that they've fulfilled each and every stereotype Republicans ascribe to government in spades.

This is a level of failure that's deep enough that Obama should worry about his job.* Sebelius should have been gone months ago.

*Seriously. This should have been the most memorable accomplishment of his presidency. He's a good man, sure, a serious man in a town full of clowns. As a chief executive, he's sorely lacking.