Construction and the Toyota Production System

Folks have tried over and over again to make construction more efficient by applying lessons from manufacturing. Typically this means producing buildings in factories, instead of on-site by hand. And while it’s possible to build a building, and a profitable business, using factory methods, it hasn’t resulted in a quantum leap in construction efficiency - it often yields no cost improvement at all. My takeaway from this is that if we hope to achieve the kind of productivity improvements in construction that manufacturing has seen, we need to go beyond surface-level understanding (“factories make things cheaper, so we’ll build buildings in factories”), and drill down to the exact mechanisms at work in an improving manufacturing process.

One of the most recent techniques for improving manufacturing efficiency is the Toyota Production System and its descendants (Just-in-Time, Lean, etc.). Despite being one of the most publicized and discussed manufacturing techniques in existence, I’ve found that many explanations of the Toyota Production System seem to miss the heart of what it is and (critically) why it works. So let’s take a look at exactly how the Toyota Production System works, and why it was an improvement over previous systems. Hopefully that will help us understand whether the methods might be applied to construction.

Ford vs Toyota

The Toyota Production System was developed as a response to the mass production methods of Ford and other US car manufacturers in the 1950s. Toyota, originally a manufacturer of looms and textile equipment, wanted to produce cars, but felt that US-style manufacturing methods were unsuitable to the Japanese market. The US methods were designed around producing cars by the millions - in 1950, Ford’s Rouge plant alone produced 7,000 cars a day. By contrast, the Japanese auto market was tiny and highly fragmented, requiring small volumes of many different types of cars [1]. Even if the market could absorb the huge production volumes of US-style factories, post-war Japan had little capital available for building such enormous production facilities. The Toyota Production System was developed around these constraints, as a way of efficiently producing lower volumes of a wider variety of cars.

Built upon a foundation of interchangeable parts and the assembly line, mass production is designed around achieving economies of scale by producing things in huge quantities. The greater your production volume, the more you can afford the high fixed costs of specialized machines and equipment, the more it’s worth it to capture small efficiencies by vertically integrating, the more you can take advantage of labor specialization by having workers do extremely narrowly defined jobs, and the greater volume discounts you can secure from your suppliers. 

This strategy, it should be clear, works. Mass production methods are responsible for huge gains in efficiency, and huge drops in cost. But continuous pursuit of scale eventually becomes counterproductive.

One problem with mass production methods is that they generate huge inventories. Products are produced by increasingly specialized (and often increasingly expensive) machines and workers that need huge production volumes to spread their costs over. If I have a $5 million press, I’m going to be damn sure to get my money's worth and keep it running 24/7 stamping out parts. If I have a heat-treat that can run batches of 400 parts, I’m going to try like hell to run 400 parts at a time. If I can shave a few seconds off assembly time by having a worker who does nothing but attach part A to widget B, I’m going to do it, and run enough parts through the line to justify their cost.
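The pull toward volume is easy to see with a little arithmetic. Here is a minimal sketch, assuming the only costs are the machine's fixed price and a flat per-part variable cost. The $2 variable cost and the volumes are made-up numbers for illustration.

```python
def unit_cost(fixed_cost, volume, variable_cost_per_part):
    """Cost per part when a fixed machine cost is spread over `volume` parts."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return fixed_cost / volume + variable_cost_per_part


# Hypothetical: a $5 million press and $2 of material and labor per part.
for volume in (100_000, 1_000_000, 10_000_000):
    print(f"{volume:>10,} parts -> ${unit_cost(5_000_000, volume, 2.0):.2f} each")
```

At 100,000 parts the press dominates the cost of each part; at 10 million it nearly vanishes. This is why the owner of the press wants it running constantly, whether or not anyone downstream is ready for the parts.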

The result is that mass production methods have a lot of partially completed work (parts made but not yet used, partially completed assemblies, etc.) floating around the system waiting to be processed. This is known as work-in-process, or WIP. In some ways, large volumes of WIP are good. If each workstation has a stack of parts waiting to be processed, it’s buffered from failures or problems that might slow down production. If a machine breaks down, the rest of the production line can keep going, because they’ll still have plenty of parts in the queue. If a workstation upstream produces a bad part, you can just toss it out and grab another one from the pile.

But large amounts of WIP are costly. Parts and assemblies that have been fabricated but not yet sold represent potentially millions of dollars that could be deployed elsewhere if they weren’t tied up in your production line. All that extra material means extra space to store it, extra people and equipment to move it around and keep track of it, and extra time to navigate around it. And the larger the queues in front of workstations, the longer it takes any particular piece of material to move through your whole process (that is, the higher the cycle time). For uniform or commodity-like products this might not be a problem, but if you’re producing specific orders for specific customers, long cycle times translate to long wait times for the buyer.

Large amounts of WIP also make it hard to diagnose problems - long queues mean that if a workstation produces a bad part, it will be a long time before the next station notices. At best, this makes it difficult to trace the defect back to its cause - at worst, it means a machine produces many bad parts before anyone notices.

The Toyota Production System and inventory

The fundamental thesis of the Toyota Production System is that the majority of this WIP is waste - it costs money but doesn’t actually add any value. What’s worse, this waste is concealing other waste (machine downtime, poor quality parts, unnecessary labor, etc.) that is only allowed to persist because of the huge inventory buffers.

The Toyota Production System is usually described in terms of eliminating waste, and a series of strategies for doing it (u-shaped cells, single-minute-exchange-of-die, kanban cards, kaizen events, 5 whys, etc.). But another, simpler way to think about it is that it’s a method for limiting how much WIP can accumulate in the system. WIP-limited systems are often called pull-based systems, as work is only “pulled” into the system when authorized by some downstream process. Conventional production methods, on the other hand, constantly put material into the system at a predefined rate, and are thus referred to as push-based systems. 

Pull systems reverse the logic of a conventional production system. A conventional system defines the flow rate through the system, and gets a particular level of WIP as a result. A pull system, on the other hand, defines the level of WIP, and gets a particular flow rate as a result. (Explained like this, it’s easy to see why a) it took so long for this system to be invented, as it’s unintuitive, and b) so many companies have trouble implementing pull-based systems, and often fall back on previous ‘push’ habits.)

The WIP control system that the Toyota Production System uses is called ‘kanban’, which is a set of cards that travel between workstations. A station can only produce something if it has a kanban card for it - once an item gets produced, the kanban card gets passed along with the item to the buffer for the next workstation. When that station takes a part out of the buffer, the kanban card goes back to the previous station. A station can thus only do work if the station downstream from it is ready to receive it.
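The card mechanics are simple enough to sketch in a few lines. This is a toy model of a single card loop between one upstream and one downstream station, not a description of Toyota's actual implementation. The class name `KanbanLoop` and its methods are invented for illustration.

```python
from collections import deque


class KanbanLoop:
    """One upstream station feeding one downstream station through a card-limited buffer."""

    def __init__(self, num_cards):
        self.free_cards = num_cards  # cards currently held by the upstream station
        self.buffer = deque()        # finished items, each carrying one card

    def produce(self, item):
        """Upstream makes an item only if it holds a card. Returns True if it did."""
        if self.free_cards == 0:
            return False             # no card means no authorization to produce
        self.free_cards -= 1
        self.buffer.append(item)     # the card travels with the item into the buffer
        return True

    def consume(self):
        """Downstream takes an item, which sends that item's card back upstream."""
        if not self.buffer:
            return None              # downstream is starved
        self.free_cards += 1
        return self.buffer.popleft()


loop = KanbanLoop(num_cards=2)
print([loop.produce(f"pin-{n}") for n in range(3)])  # [True, True, False]
print(loop.consume())                                # pin-0
print(loop.produce("pin-3"))                         # True
```

The third `produce` call is refused because both cards are sitting in the buffer. Only after the downstream station takes a pin does a card come back and free the upstream station to work again. The number of cards is the WIP limit for that loop.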

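A kanban system is actually just one of many possible arrangements for limiting WIP. Another is “CONWIP” (constant work-in-process), which caps the total WIP in the whole line rather than at each individual station. I’ll use CONWIP below as shorthand for any system that caps WIP.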

Why does controlling WIP work?

The immediate benefit of limiting WIP is obvious: by putting a cap on WIP, the system can’t accumulate large volumes of inventory. Once the WIP limit is reached, material stops entering the system until the backlog is cleared. And since less inventory means no huge queues of parts at each workstation, cycle time is reduced as well. Lowering inventory also means your labor and equipment requirements go down, since there’s less stuff that needs to be handled or reworked.
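The link between inventory and cycle time is Little's Law, a standard queueing result: average WIP equals average throughput times average cycle time. A minimal sketch with made-up numbers (the 0.5 pins per second throughput is an assumption for illustration, not a measurement from any real line):

```python
def cycle_time(wip, throughput):
    """Little's Law rearranged: average cycle time = average WIP / average throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical line that finishes 0.5 pins per second at every WIP level shown.
for wip in (5000, 500, 50):
    print(f"{wip} pins of WIP -> {cycle_time(wip, 0.5):.0f} seconds per pin")
```

With throughput held fixed, cutting WIP by a factor of ten cuts the time each pin spends in the system by the same factor.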

Other benefits stem from the effort required to get the system to work. For instance, the Toyota Production System is considered a method for improving quality (even today Toyota still dominates quality rankings of automakers). This is because CONWIP systems require high quality in order to function. Without large material buffers, any process failure or delay (machine downtime, large setup time, bad parts that must be discarded or repaired) quickly stops production, as downstream processes are immediately starved of parts. The only way to prevent this is to prevent failures from happening in the first place (this is often described in Lean lingo as “lowering the water level to reveal the rocks”). The Toyota Production System achieves these improvements by tasking line workers (who are cross-trained in various tasks) with noticing problems and suggesting solutions. Low levels of WIP make this process easier - without a large inventory buffer, any failure is immediately noticed.

Reducing failures and increasing quality, in turn, reduces the variability of process times, which allows for higher levels of throughput with the same equipment (or the same throughput with less equipment and fewer workers).

A CONWIP system also reduces cycle time variability. This is because queue times at each station are negatively correlated - a large queue at one station means lower queues elsewhere, as there is only so much WIP to go around. This tends to dampen fluctuations in cycle time. A push system, on the other hand, can have either very high or very low cycle time depending on how much WIP is in the system at any given time. This makes the output of a pull system more predictable than a push system.

Note that in some models of production, the volume of WIP negatively affects the production rate of other parts of the process. This is what lethain uses in his “Why limiting WIP works” model, where every project has a chance of interfering with other projects. And it’s observable in the real world, such as when a port gets so full of containers that it’s no longer possible to unload more. This sort of feedback makes push systems perform even worse compared to pull systems. But even in the absence of this sort of feedback, pull-systems can still be shown to be superior.

A Simple Production Model

To see this in action, let’s return to our pin factory example. In this simple model, raw wire enters the factory and goes through a four-step process - cutting, straightening, adding the head, and sharpening, with finished pins leaving the factory. Unlike the previous example, this time each step has a slightly different process time attached to it.

Let’s run this model with different amounts of WIP, and see what we get. We’ll start with a typical “push” system that has no WIP limits - material is continuously fed into the system at a constant rate of 1 pin per second.

With these parameters, the system makes on average about 3500 pins in 10,000 seconds. As expected with a push system, we see huge volumes of WIP have accumulated, and our pins are taking huge amounts of time to move through the system. And because we have queues at every step in the process, it’s not easy to tell where we should focus any improvement efforts.

What happens if we establish a WIP limit, and only allow new material into the system when there’s fewer than 500 pins worth of WIP?

We get a similar level of throughput (if we average over multiple runs, it's in fact nearly identical), but much lower WIP, and much lower cycle times. This is a key feature of CONWIP systems - for a given level of throughput, a CONWIP system will have less inventory and lower cycle times than a push system. It will also be more robust - in a CONWIP system, the amount of allowable WIP can be far from optimal and still achieve near-optimal performance. In a push system, on the other hand, small changes to the throughput level can have large effects on overall profitability.

Limiting WIP also makes it much clearer where the bottleneck is in this process - we see that the vast majority of the time, almost all the allowable WIP has accumulated at the “add head” station.

What happens if we limit WIP even more, and restrict it to just 50 pins worth of inventory?

We’ve reduced WIP enough that we’re starting to affect our overall production rate (it’s dropped by about 10%). This is typical - firms who try to introduce lean methods often initially see production rates decline as they struggle to deal with the process failures that are now stopping production. But we’ve massively reduced both inventory and cycle time.

It still seems as if the “add head” station is the bottleneck - if we look closely, we can see that not only is the queue in front of the “add head” station most of the time, but the times when it isn’t are times when no pins are being produced. Now we’re ready to start “removing the rocks” - what happens if we improve the add head process so it only stops 1% of the time rather than 3%?

Now not only have we reduced our cycle time and our WIP level, but we’ve improved our throughput by more than 20% over the initial “push” system! Through further cycles of improvement (“kaizen events”), we could continue to improve this system using the tools in the Toyota Production System toolbox - moving equipment closer together to reduce travel distances, reducing downtime from machine failure, reducing setup times, producing fewer bad parts - each one allowing us to pull more and more inventory out of the system.
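For readers who want to experiment, here is a rough sketch of this kind of push-versus-WIP-limit comparison. It is not the model behind the numbers above. It assumes each station finishes a pin with a fixed probability each second instead of modeling explicit machine stoppages, and the probabilities in `STATIONS` are invented, so it will not reproduce the figures quoted in this section.

```python
import random
from collections import deque

# Hypothetical per-second completion probabilities for each station.
# These are illustrative assumptions, not the parameters used in the article.
STATIONS = [("cut", 0.9), ("straighten", 0.8), ("add head", 0.4), ("sharpen", 0.7)]


def simulate(wip_limit=None, seconds=10_000, seed=0):
    """Run the line for `seconds` ticks. wip_limit=None means an uncapped push system."""
    rng = random.Random(seed)
    queues = [deque() for _ in STATIONS]  # each entry is the tick its pin was released
    finished = 0
    total_cycle_time = 0
    peak_wip = 0

    for tick in range(seconds):
        wip = sum(len(q) for q in queues)
        # Push: release one pin every second. Pull: release only while below the cap.
        if wip_limit is None or wip < wip_limit:
            queues[0].append(tick)
            wip += 1
        peak_wip = max(peak_wip, wip)

        # Visit stations downstream-first so a pin advances at most one step per tick.
        for i in reversed(range(len(STATIONS))):
            if queues[i] and rng.random() < STATIONS[i][1]:
                released_at = queues[i].popleft()
                if i == len(STATIONS) - 1:
                    finished += 1
                    total_cycle_time += tick - released_at
                else:
                    queues[i + 1].append(released_at)

    return {
        "finished": finished,
        "final_wip": sum(len(q) for q in queues),
        "peak_wip": peak_wip,
        "avg_cycle_time": total_cycle_time / finished if finished else 0.0,
        "queue_lengths": {name: len(q) for (name, _), q in zip(STATIONS, queues)},
    }


if __name__ == "__main__":
    for label, limit in [("push", None), ("WIP limit 500", 500), ("WIP limit 50", 50)]:
        print(label, simulate(wip_limit=limit))
```

Even in this crude version, the qualitative pattern should show up. The capped runs finish a similar number of pins with far less WIP and much shorter cycle times, and `queue_lengths` shows the capped WIP piling up in front of the slowest station. Because there are no stoppages in this sketch, tightening the cap to 50 is unlikely to cost the throughput it does in the article's model.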

Drawbacks of the Toyota Production System

The Toyota Production System has some drawbacks. Because its efficiency gains come from pulling the slack out of the system, it has little ability to absorb disruptions to the process. Unreliable supply deliveries, sudden variation in product demand, and lack of control over the production environment are all difficult for the Toyota Production System to accommodate. The system works best when the production environment is as repetitive and predictable as possible, and many of the tools (production smoothing, structuring processes around takt time, standardizing operations) are ways of achieving that. I suspect this is part of the reason for the focus on continuous improvement - since any amount of entropy will degrade performance, a constant focus on improving performance prevents decay from sneaking in.

The Toyota Production System and construction

Can we apply the Toyota Production System to construction? Obviously in a factory-based construction environment the methods would apply, but what about more conventional, site-built buildings? After all, even site-built construction feels similar to a factory-based process, where the building (metaphorically) moves through different stages in the assembly process, each one adding value by performing some particular task, until a finished building pops out at the end.

But applying the Toyota Production System to construction isn’t straightforward. The biggest stumbling block is that for a single building, there’s no excess work-in-process to pull out. Just like a factory that builds a single car won’t accumulate any excess inventory, a construction site with a single building will always have “one building” worth of WIP. Even if we were able to, say, rearrange the construction process so we produced one finished room at a time (moving from framing to services to finishes, then moving on to the next room), this wouldn’t change.

You can compress the cycle time by decreasing the time any given task in the process takes. The project above, for instance, has a week budgeted for rough framing, but maybe through clever time savings we can get it done in 3 days. But on a single project, compressing the cycle time doesn’t lower your WIP, it increases your throughput - you’re not operating with less inventory, you’re building the building faster. It’s still valuable to build the building faster of course, but this is different from the gains achieved by the Toyota Production System, which are independent of the throughput rate.

Gains from reducing work-in-process can be achieved on larger construction projects, either where multiple buildings are being built (say a large housing development), or a high-rise with enough floors that the bottom levels can be finished and occupied while work is still proceeding on the upper floors.

It also seems like it might be possible to achieve gains from limiting WIP on a smaller scale - perhaps by, say, ordering lumber in a just-in-time fashion rather than upfront all at once. Depending on how individual portions of the building are financed this could conceivably be doable, but it doesn’t strike me as a potential source of large efficiency gains.

There are other difficulties as well. For one, in construction it’s difficult to create the controlled, repetitive environment the Toyota Production System requires. Each new construction project is like setting up a new factory from scratch, with new workers, new processes, and new constraints, that will get (metaphorically) torn down in a few months or years when the project is complete. But it often takes manufacturing operations years to successfully implement Toyota Production/Lean methods, and many of them don’t succeed.

There’s a clash of labor models as well. The Toyota Production System relies on cross-trained workers who understand the entire process, can be allocated where needed, and know where the points of leverage are to address problems. Construction, on the other hand, is largely performed by specialty subcontractors who perform just a single trade. This is analogous to having specialized equipment that can only do one job, because the die can’t be changed.

So it’s not initially obvious to me whether there’s much potential benefit in trying to apply the Toyota Production System to construction - it seems more tailored to the long-term, repetitive processes of manufacturing than the short-term, one-off processes of construction.

There is a Lean Construction Institute, dedicated to applying the principles of lean manufacturing to the construction process. I haven’t yet read their literature (I wanted to understand the original system thoroughly first), but I’ll be interested in seeing how their methods address these difficulties.

[1] - It’s interesting how manufacturing systems seem to be built around the constraints of their environment. American mass production was also a response to its unique constraints.