
SpaceX’s Philosophy: Reliability Through Continual Upgrades

By Doug Messier
Parabolic Arc
July 7, 2015

Remains of a Falcon 9 rocket fall to Earth.

By Douglas Messier
Managing Editor

To succeed in the launch business, you need to be very, very good and more than a little bit lucky. Eventually, there comes a day when you are neither.

That is what happened to SpaceX on June 28. A string of 18 successful Falcon 9 launches was snapped as the company’s latest rocket broke up in the clear blue skies over the Atlantic Ocean. A Dragon supply ship headed for the International Space Station was lost, SpaceX’s crowded manifest was thrown into confusion, and the company’s reputation for reliability was shattered.

It was quite a nasty little shock. But, in another sense, the timing was the only real surprise. One might have expected an accident to occur much earlier in Falcon 9’s history as SpaceX worked out bugs in the launch vehicle. But failures don’t follow any set schedule. They arrive when they arrive, often with little advance warning.

Even with 19 flights, Falcon 9 has not flown enough times for anyone to gain a real understanding of how the launch vehicle will perform over the long run. You need scores of flights to get a really good handle on reliability. This process is even more difficult in the case of SpaceX, which doesn’t operate like most launch providers.

Hardware as Software

The launch industry tends to be very conservative. A launch provider will build a rocket, test it, and make changes as necessary based on those results. The company matures the design, and then puts it on an assembly line staffed with workers who are skilled at doing the exact same things over and over again, day in and day out, for years on end.

Changes are made very carefully and only after thorough testing. Experience has shown that while upgrades can improve a rocket’s performance, they also can cause problems. Given the high cost of launches, there are not a lot of opportunities to fully test out upgrades.

By contrast, SpaceX has treated Falcon 9 as something akin to software: a system designed to be regularly upgraded as engineers learn from flight experience. The original version of Falcon 9 flew five times before it was retired in favor of the Falcon 9 v.1.1, which included higher-performance engines, longer fuel tanks, landing legs for first stage recovery, and a host of other significant upgrades.

Falcon 9 lifts off from Vandenberg Air Force Base. (Credit: SpaceX)

SpaceX boasted that the Falcon 9 v.1.1 was virtually a new launch vehicle. And, to some degree, it was. The larger rocket could launch communications and military satellites that need to go higher than low Earth orbit (LEO). The earlier version of the rocket performed all its missions in LEO.

Falcon 9 v.1.1 flew successfully 13 times before failing on its 14th flight, giving it a reliability of 92.86 percent. If you include the five launches of the retired version of the rocket, the reliability increases to 94.74 percent. But again, even 19 flights is not a very large number.
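To make the arithmetic concrete, here is a minimal sketch that recomputes those success rates and then puts a rough uncertainty band around the 19-flight figure. The Wilson score interval is our own choice of illustration, not a method SpaceX or the Air Force is known to use, and the function names are hypothetical.

```python
from math import sqrt


def success_rate(successes: int, flights: int) -> float:
    """Observed success rate, in percent."""
    return 100.0 * successes / flights


def wilson_interval(successes: int, flights: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for the true success rate, in percent.

    Assumption: each launch is treated as an independent trial with the same
    underlying probability of success. That is a simplification, since the
    vehicle changed between flights.
    """
    p = successes / flights
    denom = 1 + z * z / flights
    center = (p + z * z / (2 * flights)) / denom
    half_width = z * sqrt(p * (1 - p) / flights + z * z / (4 * flights * flights)) / denom
    return 100 * (center - half_width), 100 * (center + half_width)


# Falcon 9 v.1.1 alone: 13 successes in 14 flights.
v11_rate = success_rate(13, 14)

# All Falcon 9 flights, including the five by the retired version: 18 of 19.
all_rate = success_rate(18, 19)
low, high = wilson_interval(18, 19)

print(f"v.1.1 only:  {v11_rate:.2f}%")
print(f"all flights: {all_rate:.2f}%")
print(f"plausible range for all flights: {low:.1f}% to {high:.1f}%")
```

Under these assumptions the plausible range for the combined record runs from roughly 75 percent to about 99 percent, which is another way of saying that 19 flights cannot distinguish a mediocre rocket from an excellent one.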

SpaceX even made changes to the launch vehicle during Falcon 9 v.1.1’s brief flight history. Last year, for example, the company decided to bring the production of helium bottles used to pressurize the liquid oxygen (LOX) tanks in house. Previously, the bottles had been supplied by an outside contractor.

It’s not clear why the change was made. Perhaps there were problems with the supplier’s bottles or prices, or SpaceX simply thought it could do a better job at a lower cost. However, it fits a pattern. The company likes to build as much of its rockets in house as possible. SpaceX also is known in the industry to use supplier relationships as a way of identifying and hiring away a company’s best personnel.

Whatever the reason for the change, SpaceX ended up experiencing helium leaks that caused launch delays in 2014 [Orbcomm’s Elusive Falcon 9 Launch Date TBD] and 2015 [SpaceX Puts Off Next Falcon 9 Launch]. The helium problems were one of the reasons SpaceX conducted only six launches in 2014, far short of the 12 the company had hoped to accomplish.

It must be emphasized that the root cause of the “overpressure event” in the upper stage LOX tank that resulted in the Falcon 9’s destruction is yet to be determined. The accident could well, in fact, have no connection to the helium system used to pressurize the tank. It could be unrelated to the decision to bring production in house. The point here is that changes to a launch vehicle can cause unexpected problems.

SpaceX CEO Elon Musk has said the cause of the accident appears to be complex; engineers have spent a lot of time trying to understand exactly what went wrong.

“Obviously, this is a huge blow to SpaceX, and we take these missions incredibly seriously,” Musk said on Tuesday. “Everyone that can engage in the investigation at SpaceX is very, very focused on that. In this case, the data does seem to be quite difficult to interpret. Whatever happened is clearly not a simple, straightforward thing, so we want to spend as much time as possible just reviewing the data.”

Musk said via Twitter that the company will have preliminary results of its investigation by the end of this week. SpaceX will brief the Federal Aviation Administration and key customers before posting the conclusions on its website, he said.

Another Falcon 9 Upgrade

As the investigation continues, SpaceX engineers are working on yet another upgrade to the launch vehicle. The Falcon 9 v.1.2, which had been set to debut late this year, will feature super-chilled propellant and a 10 percent increase in the volume of the second-stage tank. The improvements are designed to increase thrust by 15 percent and help offset the performance hit the rocket took when landing legs and other systems were added to allow for the recovery of the first stage.

Even more changes are likely once SpaceX succeeds in recovering first stage boosters for reuse. Engineers will examine every inch of the booster looking for wear and tear and for any changes that can be made to improve reliability. New versions of the Falcon 9 will undoubtedly emerge.

How one feels about the constant upgrading of the Falcon 9 depends upon where one sits. SpaceX has found plenty of commercial customers willing to take risks on its ever-evolving rocket despite its scant launch history. SpaceX’s low prices are a big factor.

NASA was not that upset by the loss of the Dragon capsule. The space agency entered into commercial cargo agreements with both SpaceX and Orbital ATK fully expecting to lose some supply ships along the way. The payloads the agency places on these vehicles are largely low risk, nothing that can’t be replaced.

Both companies have lost supply ships over the past eight months, with Orbital ATK losing a Cygnus freighter last October when its Antares rocket blew up. The multiple failures — and the loss of a Russian Progress freighter in April — have strained ISS supply lines, but not to the breaking point.

There’s a major difference between the SpaceX and Orbital accidents. Antares is not a major factor in the international launch market. NASA is the rocket’s only user, having booked nine Cygnus supply flights to the space station. Orbital ATK has announced no other launch contracts.

SpaceX, on the other hand, has roughly 50 launches on its manifest. The company is in the critical path for a number of major players: communications satellite fleet operators whose schedules and revenue models have been thrown into uncertainty; NASA, which in addition to cargo flights is expecting the company to launch astronauts to the International Space Station within two years; and the U.S. Air Force, which is looking to bring down its high launch costs by awarding contracts to SpaceX.

The Stakes Get Higher

Lt Gen Ellen Pawlikowski, Space and Missile Systems Center commander, signed agreements with Space-X CEO Elon Musk, Jun 7, 2013 at the Space-X facility in Hawthorne, Calif. (Credit: USAF/Joe Juarez)

The Falcon 9 accident came at a time when the stakes for SpaceX launches had been raised significantly. NASA had just certified SpaceX to launch payloads more crucial than ISS cargo. The company will be launching the space agency’s Jason-3 remote sensing satellite, a mission that had been scheduled for August prior to the accident.

Far more significant is the recently completed U.S. Air Force certification of Falcon 9, which will allow SpaceX to compete with United Launch Alliance (ULA) for defense launch contracts. SpaceX had pushed the Air Force to complete the certification process faster. The company unsuccessfully sued the service in an attempt to void a large launch contract given to ULA even before certification was completed.

Musk’s argument was that SpaceX was able to deliver launches that are just as reliable as ULA’s and much less expensive. The accident doesn’t void the certification, but it certainly raises questions about the reliability claim.

Atlas V liftoff (Credit: ULA)

Gen. William Shelton, who was commander of the U.S. Air Force Space Command until his retirement last August, pointed out ULA’s excellent record in a recent op-ed piece in The Wall Street Journal.

Current U.S. space policy is implemented by buying both the Atlas V and Delta IV rockets from the United Launch Alliance, a joint venture of Lockheed Martin and Boeing. Both rockets have a 100% success record—83 launches without failure.

ULA critics will quibble with that statement; like the Falcon 9, both the Atlas V and Delta IV have experienced anomalies during flights. But, the company has not suffered the type of catastrophic failure that SpaceX experienced last month.

Shelton also pointed out that there’s already a problem with the certification the Air Force just awarded for the Falcon 9:

SpaceX is the first company to complete the certification process for its Falcon 9 Version 1.1 rocket—the one that failed on Sunday. But the company is also developing a “Full Thrust” Falcon 9—capable of carrying all but the heaviest satellites—and that is the rocket it intends to use to bid on national-security contracts.

The Falcon 9 Full Thrust version hasn’t gone through certification, indeed it has never been launched. Nevertheless, SpaceX lobbyists last year convinced key congressional leaders that their rocket is ready to launch national-security missions.

In other words, the Air Force will be launching on yet another version of the Falcon 9 with an even shorter launch history than the one that just failed. That can be handled with some additional certification work. However, it’s an unnerving prospect for an organization whose primary focus is on mission assurance, not cost.

The Air Force does not like taking a lot of risks with its launches. And with good reason. The satellites it launches are crucial to national security, and many of them are very costly.  That makes any launch accidents doubly expensive.  If the Air Force saves money on a cheaper launch vehicle but it ends up losing a very expensive satellite (or two), exactly what has it gained?

The service has gone through periods during which launch vehicles failed on a regular basis. It decided to revamp its processes to emphasize reliability. The Air Force was closely involved with Boeing and Lockheed Martin when the companies developed the Atlas V and Delta IV boosters.  Those efforts haven’t come cheap, but they have paid off.

It helped that the technology used in the boosters had long histories. The Centaur upper stage is an evolved version of the one that first flew in 1963. Russia’s RD-180 engine, which powers the first stage of the Atlas V, is extremely reliable and can trace its roots back to the 1980’s. Falcon 9 just doesn’t have the same legacy yet.

Despite their reliability, both the Atlas V and Delta IV will be phased out. One reason is competition from SpaceX. Both of ULA’s rockets are too expensive to compete on the commercial market; they are almost totally reliant on military and NASA payloads, for which they now have tough competition. The other reason is the decaying relationship between the United States and Russia, which has made continued use of the RD-180 engine unsustainable.

Instead of developing a new engine for the Atlas V, ULA has elected to develop a brand new launch vehicle called Vulcan. The new rocket won’t be ready for flight until 2019, and it won’t be certified to carry defense payloads until several years after its inaugural launch. This will leave SpaceX with the very type of monopoly the company has criticized ULA for having on military launches.

Lives at Risk

Dragon Version 2. (Credit: SpaceX)

Meanwhile, the stakes are about to be raised even higher for SpaceX on the civilian side. The company is in the final stretch of NASA’s commercial crew program, under which it is set to fly astronauts to the International Space Station within the next two years. This is a much more costly and risky endeavor than cargo; the consequences of failure are much higher.

The employees at SpaceX are acutely aware of this reality. As they watched Falcon 9 break up, they were undoubtedly thinking: this time it was only cargo, but what if this happens when there are astronauts aboard? The accident was a very sobering reminder of how much more will soon be at stake.

On the plus side, it’s good to have these failures now before crewed flights begin. The company can learn from them and fix what went wrong. On the other hand, there’s not a whole lot of time remaining with a 2017 deadline for crewed flights looming. And what if the rocket has other problems that haven’t surfaced yet?

There was talk after the Falcon 9 accident about whether the Dragon cargo ship could have been saved if it had the abort system planned for the human-rated Dragon V2. The general consensus was that the capsule could have been rocketed away and parachuted to safety. However, the discussion misses a key point.

An abort system is like an inflatable aircraft slide: an essential safety feature that you never, ever want to actually use. It’s an option of last resort when all else has failed and there’s no other way to save the lives of the crew.

The ultimate goal is to build a rocket that is so reliable that you never have to use the escape system. SpaceX believes the way to accomplish that goal is through constant innovation, not by the traditional method of flying the same design over and over again.

The Human Cost

Merlin 1D engines undergoing checks. (Credit: SpaceX)

There’s another area that SpaceX needs to address as the stakes involving its launches become higher: the use of its workforce.

Musk has adopted a Silicon Valley approach to working hours: hire young employees, contractors and interns and work them to the bone. Sixty- to 80-hour weeks are the norm. “The best thing about working at SpaceX is the flexibility,” one intern joked. “You can work whatever 80 hours a week you want.”

There are some real benefits to the SpaceX model: the esprit de corps it produces, the invaluable experience of working on real space hardware, the prestige that comes with working for a world-famous boss, and the sense of mission embodied in Musk’s goal of colonizing Mars and becoming a multi-planet species. Throw in stock options for when the company goes public, an awesome array of food in the cafeteria, and various other perks, and it’s easy to see why a lot of people are eager to work there.

The negatives of this approach are a workforce prone to exhaustion, burnout, high turnover and mistakes. And that should raise some rather serious questions. If you’re sitting in a Dragon spacecraft waiting for tons of propellant to explode underneath you, how comfortable are you going to be knowing the rocket, the capsule, and its escape system were built by people who have been working insane hours for months or even years? Not very.

A related issue is who is on the assembly line building these vehicles. SpaceX tends to attract a lot of very driven, Type A personalities with sharp elbows. Those are not necessarily the folks who are best on an assembly line, where the ability to perform the same tasks over and over again with great precision is an extremely valuable skill.

The solutions to these issues are relatively straightforward. One is to cut back on hours worked and to hire more employees to pick up the slack. The other is an evolution in the workforce with a greater focus on production and (eventually) the refurbishment of recovered first-stage boosters.

Cutting back on hours and hiring more employees could end up raising costs a lot. SpaceX’s launch prices are already the lowest in the industry. They are so low that the company’s competitors can’t figure out how SpaceX is making any money. The Chinese can’t figure it out. Nobody at Orbital ATK can either. Either final costs are much higher when you add in payload processing services and other factors, or SpaceX’s profit margins are razor thin.

It’s possible that reusing the Falcon 9 first stage is not just a technological breakthrough but essential to the company’s long-term viability. Getting 10 flights out of a stage, even at a reduced launch price, would bring in more money than a single launch, and it would be the most efficient use of the output of the workforce.

In the wake of the accident, SpaceX is now even further behind on a manifest that has consistently slipped to the right for years now. The loss of the Falcon 9 has scrambled an already tight schedule. The first demonstration of the Falcon Heavy is likely running about three years behind schedule. NASA wants crew flights in two years. Comsat operators want their satellites launched.

Meanwhile, Musk will want to fix whatever went wrong with the Falcon 9 and resume flights as soon as possible. That means employees are going to have to work harder to recover from the failure and to get caught up on the schedule even as the stakes for what they’re doing get higher.

The Falcon 9 failure was a nasty little wake-up call. How SpaceX recovers from it, and what it does moving forward, has consequences not only for SpaceX but for the entire American space industry. The global space industry, in fact. It will be interesting to see how things play out.

28 responses to “SpaceX’s Philosophy: Reliability Through Continual Upgrades”

  1. James says:

    Build a little, Test a lot. Build a little, test a lot.

    This was the design philosophy that made our things great. Now... not so much. It’s test per this and that and ship. Then something else entirely new.

  2. Emmet Ford says:

    Great article. What an adventure this guy is leading us on.

  3. Aerospike says:

    While I certainly do not agree with all points that got raised (razor thin profit margins) this was another well written piece!

  4. TimR says:

    Yes indeed, a very nice article on the present state of SpaceX. Only thing I’d add is that SpaceX has avoided (maybe more than) a few catastrophic losses because of the static test firing on the launch pad and their ability to shut down an initiated launch on the pad. Now other LVs can do this but it appears SpaceX’s method is a significant step further. Faults detected during static fire and initial seconds while still on the pad have likely gone a long way to correct design, manufacture and processing issues. Without it, their launch record would be incredibly different. Kudos to SpaceX but 2nd stages stand far more idle during static fires and checkout at the moment of launch.

    • Douglas Messier says:

      Good points.

      I think having 10 engines on each rocket has probably helped. A lot of data from engines in actual flight. That probably helped them to do the Merlin engine upgrades faster than if they were flying two engines per flight.

      • TimR says:

        On the topic of money, depending on how costly the fix is, in sheer cash and in time, it will affect what they have on their plate. Something might have to fall off and the funds diverted to resolve this matter and keep everything else on schedule. Maybe the worst case for Elon is having to go public. It would raise a lot of money but he’d lose control and his ability to focus X on Mars.

        What I think he could do is put this internet satellite constellation on hold and divert those funds. The project would jack up demand for F9s and effectively lower their cost to everyone but it won’t reach such a point unless everything else on their plate reaches light of day (on time).

        Perhaps Larry Page could help out more. He said that he could bequeath his fortune to Elon tomorrow if he was run over by a bus because of how much he admires and believes in Elon’s vision.

        Maybe more private funds will arrive. It’s interesting that in broad terms, many billionaires today are taking visionary roles in society and I think it’s in large part because of how dysfunctional the US government has become.

        • Douglas Messier says:

          Those are good points.

          One of the things that I should have touched on in the piece was the $1 billion in investment that SpaceX received, primarily from Google. That’s a big boost to the bottom line. However, it came as SpaceX was embarking on its 4,000 satellite global Internet constellation project. That program will suck up a lot of cash.

          The low launch prices really do confound SpaceX’s competitors. They’re not sure how SpaceX can make money at those prices.

  5. CY says:

    Nice article. One minor note: “nonplussed” means “taken aback”, not “unfazed”.

  6. DougSpace says:

    A very thoughtful article. Very thorough.

    I wonder if SpaceX sort of dodged a bullet. If they had had this failure prior to their recent certifications, would they have received them?

  7. Larry J says:

    The US government doesn’t buy insurance for its satellites. They are self-insured, meaning the taxpayers eat the loss when there’s a failure.

  8. Larry J says:

    Good article. Up until the mid to late 1980s, US rocket companies used to routinely slip upgrades into their designs. The Air Force objected and forced them to slow down the changes. The result was slower innovation but an improvement in launch reliability and better cost control. The old Soviet Union locked in their designs early and only upgraded them when absolutely necessary. This was especially true in their Soyuz series boosters, the most widely produced space launch vehicles ever.

    Reading about the working conditions, it reminds me of what I read last weekend in Mike Mullane’s astronaut autobiography, “Riding Rockets.” He described just those same conditions among NASA’s workforce leading up to the Challenger accident. Fatigue and rockets are often a bad combination.
    As to the cause of the Falcon 9 failure, none of us have access to the detailed design information or the telemetry stream so all we can do is speculate. Like software, it could’ve been a rare and unpredicted set of circumstances based on the particular load it was carrying, such as some form of harmonic vibrations. It’s possible the oxygen tank overpressure was a symptom, not a cause. We simply don’t know. Whenever there’s an accident like a plane crash or rocket failure, everyone expects answers at Internet speeds but quite often that isn’t possible.

  9. Douglas Messier says:

    You didn’t mention the rest of what I wrote:

    “Changes are made very carefully and only after thorough testing. Experience has shown that while upgrades can improve a rocket’s performance, they also can cause problems. Given the high cost of launches, there are not a lot of opportunities to fully test out upgrades.”

    “It helped that the technology used in the boosters had long histories. The Centaur upper stage is an evolved version of the one that first flew in 1963. Russia’s RD-180 engine, which powers the first stage of the Atlas V, is extremely reliable and can trace its roots back to the 1980’s. Falcon 9 just doesn’t have the same legacy yet.”

    The RS-68A was introduced on the 20th Delta IV flight. That came 10 years after the booster first flew. The change over also was not immediate. It won’t be until later this month on the 30th Delta IV flight that the next flight with the RS-68A will occur.

    The RL10C-1 is the latest iteration of an engine that dates back to 1963. The engine is well understood and has a lengthy history.

    By having 10 engines on each flight, SpaceX also has gained a lot of expertise with its engines, so upgrades can probably come faster. So there is a real advantage to that design in terms of in-flight engine experience.

  10. Douglas Messier says:

    I concede that I overstated the conservative nature of changes on more traditional launch systems. Clearly, there are continual upgrades and changes on rockets. So, you’re right there.

    However, the degree of changes is the key point. You see changes on launch vehicles with lengthy flight histories with some components and elements that go back 50 years. The leaps SpaceX is making tend to be faster and more aggressive, with fewer launches under its belt.

    Getting back to our software analogy, the leap from Falcon 9 v.1.0 to v.1.1 was more akin to introducing a new operating system than to upgrading an existing one. SpaceX sold it as being almost a new launch vehicle. They should probably have called it Falcon 9 v.2.0. The next upgrade is probably more akin to v.2.5.

  11. Douglas Messier says:

    I didn’t say ULA did MORE testing. I think they’re more conservative in how they implement upgrades.

    One of the issues with bringing the helium tank production in house was the delays it caused in flights. And I imagine there’s always a trade off there. Looking at it from the company’s side. It’s hey, we can make this change and we’ve integrated our supply chain and got control of the process and we don’t have to pay an outside contractor. Sounds good. And then it’s not so good because there’s things you didn’t learn from the contractor that are causing problems. And you get further behind on the manifest.

    If I’m a customer, I’m thinking, I want my satellite launched. And it’s already slipped x number of months. And your need to control everything and constantly change things has delayed that even further.

    • windbourne says:

      True, but so far, the helium tank is the only thing that has caused multiple issues (excluding the explosion). Generally, SpaceX has managed change fairly decently.

  12. mivenho says:

    Great article. As for the picture caption, they are Merlin 1D engines, not “Marlin”.

  13. Douglas Messier says:

    Uhhh noo…

  14. Vladislaw says:

    No, his point was that talking about the changes routinely made by SpaceX and the changes to heritage EELV hardware, and the speed and complexity of those changes, is like talking about apples and oranges.

  15. Douglas Messier says:

    Routine, Mr. Rnonymous? I’m not sure changes in launch vehicles are really routine. But here again, I must bow to Jimbo’s superior knowledge and ego. Not necessarily in that order.

    In watching the clips, I saw probably as much or more discussion about the process (how changes are evaluated) as the quantitative nature of the changes.

    I’m not changing the post. We’ve had a discussion about it. Mr. Rnonymous has contributed valuable input into it. He won an argument. Pity Jimbo doesn’t seem to be able to win with any degree of grace. But, I have a pretty good idea why that is and where this is coming from.

  16. Douglas Messier says:

    This last paragraph is bs. You at first demanded corrections then insult me and denigrate the story when u dont get what u want. Sorry jimbo. You are changing your argument so you can sneer some more. Not cool.

    The whole story was about the scale and speed of change and how that affects different customers and the impact on how reliability is calculated. The article would improve but it would hardly collapse if i made changes.

    SpaceX sold the v.1.1 as practically a new launch vehicle. That was their PR. USAF process is different.

  17. Douglas Messier says:

    Take yes for an answer. You’ve made your points. You won the argument. WE GET IT. OK? You don’t know when to stop. But when it just continues like this with that tone. You’re not content to hammer a nail flush. You have to keep banging on it til all the surrounding wood is splintered in.

    I don’t mind being corrected on things. But when you demand corrections then criticize me for not making them and then say they were impossible to make anyway. That’s insulting. I’m perfectly willing to let the piece stand on its own and take my lumps in the discussion. That’s typically how these things are done. I agreed you were right and I was wrong. Give it a rest.

  18. Mkg says:

    Great article! I agree he (Elon) could possibly take us where we never imagined in our lifetimes. But, I also want to mention that he needs to start hiring people with experience that aren’t as young and “scrappy” (word used by hiring mgr.) as the majority of the inexperienced ones he has hired at this time. I know someone who has 30+ years of experience who applied for positions on 3 teams on the production side and was turned down not because of his impressive resume and work experience but because he wasn’t as energetic and ass kissing as it seems the younger ones are. He would be hired to do the task at hand and get it done correctly! SpaceX needs to re-think hiring practices. You have 40 rockets that need to be made and only 14 of 50 welders hired will not get the job done! Good luck SpaceX!!!
