Monday, June 12, 2023

Papers: Dew-Becker on Networks

I've been reading a lot of macro lately. In part, I'm just catching up from a few years of book writing. In part,  I want to understand inflation dynamics, the quest set forth in "expectations and the neutrality of interest rates," and an  obvious next step in the fiscal theory program. Perhaps blog readers might find interesting some summaries of recent papers, when there is a great idea that can be summarized without a huge amount of math. So, I start a series on cool papers I'm reading. 

Today: "Tail risk in production networks" by Ian Dew-Becker, a beautiful paper. A "production network" approach recognizes that each firm buys from others, and models this interconnection. It's a hot topic for lots of reasons, below.  I'm interested because prices cascading through production networks might induce a better model of inflation dynamics. 

(This post uses Mathjax equations. If you're seeing garbage like [\alpha = \beta] then come back to the source  here.) 

To Ian's paper: Each firm uses other firms' outputs as inputs. Now, hit the economy with a vector of productivity shocks. Some firms get more productive, some get less productive. The more productive ones will expand and lower prices, but that changes everyone's input prices too. Where does it all settle down? This is the fun question of network economics. 

Ian's central idea: The problem simplifies a lot for large shocks. Usually when problems are complicated we look at first or second order approximations, i.e. for small shocks, obtaining linear or quadratic ("simple") approximations. 

On the x axis, take a vector of productivity shocks for each firm, and scale it up or down. The x axis represents this overall scale. The y axis is GDP. The right hand graph is Ian's point: for large shocks, log GDP becomes linear in log productivity -- really simple. 

Why? Because for large enough shocks, all the networky stuff disappears. Each firm's output moves up or down depending only on one critical input. 

To see this, we have to dig deeper to complements vs. substitutes. Suppose the price of an input goes up 10%. The firm tries to use less of this input. If the best it can do is to cut use 5%, then the firm ends up paying 5% more overall for this input, the "expenditure share" of this input rises. That is the case of "complements." But if the firm can cut use of the input 15%, then it pays 5% less overall for the input, even though the price went up. That is the case of "substitutes." This is the key concept for the whole question: when an input's price goes up, does its share of overall expenditure go up (complements) or down (substitutes)? 

Suppose inputs are complements. Again, this vector of technology shocks hits the economy. As the size of the shock gets bigger, the expenditure of each firm, and thus the price it charges for its output, becomes more and more dominated by the one input whose price grows the most. In that sense, all the networkiness simplifies enormously. Each firm is only "connected" to one other firm. 

Turn the shock around. Each firm that was getting a productivity boost now gets a productivity reduction. Each price that was going up now goes down. Again, in the large shock limit, our firm's price becomes dominated by the price of its most expensive input. But it's a different input.  So, naturally, the economy's response to this technology shock is linear, but with a different slope in one direction vs. the other. 

Suppose instead that inputs are substitutes. Now, as prices change, the firm expands more and more its use of the cheapest input, and its costs and price become dominated by that input instead. Again, the network collapsed to one link.  

Ian: "negative productivity shocks propagate downstream through parts of the production process that are complementary (\(\sigma_i < 1\)), while positive productivity shocks propagate through parts that are substitutable (\(\sigma_i > 1\)). ...every sector’s behavior ends up driven by a single one of its inputs....there is a tail network, which depends on \(\theta\) and in which each sector has just a single upstream link."

Equations: Each firm's production function is (somewhat simplifying Ian's (1)) \[Y_i = Z_i L_i^{1-\alpha} \left( \sum_j A_{ij}^{1/\sigma} X_{ij}^{(\sigma-1)/\sigma} \right)^{\alpha \sigma/(\sigma-1)}.\]Here \(Y_i\) is output, \(Z_i\) is productivity, \(L_i\) is labor input, \(X_{ij}\) is how much good j firm i  uses as an input, and \(A_{ij}\) captures how important each input is in production. \(\sigma>1\) are substitutes, \(\sigma<1\) are complements. 

Firms are competitive, so price equals marginal cost, and each firm's price is \[ p_i = -z_i + \frac{\alpha}{1-\sigma}\log\left(\sum_j A_{ij}e^{(1-\sigma)p_j}\right).\; \; \; (1)\]Small letters are logs of big letters.  Each price depends on the prices of all the inputs, plus the firm's own productivity.  Log GDP, plotted in the above figure is \[gdp = -\beta'p\] where \(p\) is the vector of prices and \(\beta\) is a vector of how important each good is to the consumer. 

In the case \(\sigma=1\) (1)  reduces to a linear formula. We can easily solve for prices and then gdp as a function of the technology shocks: \[p_i = - z_i + \sum_j A_{ij} p_j\] and hence \[p=-(I-\alpha A)^{-1}z,\]where the letters represent vectors and matrices across \(i\) and \(j\). This expression shows some of the point of networks, that the pattern of prices and output reflects the whole network of production, not just individual firm productivity. But with \(\sigma \neq 1\) (1) is nonlinear without a known closed form solution. Hence approximations. 

You can see Ian's central point directly from (1). Take the \(\sigma<1\) case, complements. Parameterize the size of the technology shocks by a fixed vector \(\theta = [\theta_1, \ \theta_2, \ ...\theta_i,...]\) times a scalar \(t>0\), so that \(z_i=\theta_i \times t\). Then let \(t\) grow keeping the pattern of shocks \(\theta\) the same. Now, as the \(\{p_i\}\) get larger in absolute value, the term with the greatest \(p_i\) has the greatest value of \( e^{(1-\sigma)p_j} \). So, for large technology shocks \(z\), only that largest term matters, the log and e cancel, and \[p_i \approx -z_i + \alpha \max_{j} p_j.\] This is linear, so we can also write prices as a pattern \(\phi\) times the scale \(t\), in the large-t limit \(p_i = \phi_i t\),  and  \[\phi_i =  -\theta_i + \alpha \max_{j} \phi_j.\;\;\; (2)\] With substitutes, \(\sigma<1\), the firm's costs, and so its price, will be driven by the smallest (most negative) upstream price, in the same way. \[\phi_i \approx -\theta_i + \alpha \min_{j} \phi_j.\] 
To express gdp scaling with \(t\), write \(gdp=\lambda t\), or when you want to emphasize the dependence on the vector of technology shocks, \(\lambda(\theta)\). Then we find gdp by \(\lambda =-\beta'\phi\). 

In this big price limit, the \(A_{ij}\) contribute a constant term, which also washes out. Thus the actual "network" coefficients stop mattering at all so long as they are not zero -- the max and min are taken over all non-zero inputs. Ian: 
...the limits for prices, do not depend on the exact values of any \(\sigma_i\) or \(A_{i,j}.\) All that matters is whether the elasticities are above or below 1 and whether the production weights are greater than zero. In the example in Figure 2, changing the exact values of the production parameters (away from \(\sigma_i = 1\) or \(A_{i,j} = 0\)) changes...the levels of the asymptotes, and it can change the curvature of GDP with respect to productivity, but the slopes of the asymptotes are unaffected.
...when thinking about the supply-chain risks associated with large shocks, what is important is not how large a given supplier is on average, but rather how many sectors it supplies...
For a full solution, look at the (more interesting) case of complements, and suppose every firm uses a little bit of every other firm's output, so all the \(A_{ij}>0\). The largest input  price in (2) is the same for each firm \(i\), and you can quickly see then that the biggest price will be the smallest technology shock. Now we can solve the model for prices and GDP as a function of technology shocks: \[\phi_i \approx -\theta_i - \frac{\alpha}{1-\alpha} \theta_{\min},\] \[\lambda \approx  \beta'\theta + \frac{\alpha}{1-\alpha}\theta_{\min}.\] We have solved the large-shock approximation for prices and GDP as a function of technology shocks. (This is Ian's example 1.) 

The graph is concave when inputs are complements, and convex when they are substitutes. Let's do complements. We do the graph to the left of the kink by changing the sign of \(\theta\).  If the identity of \(\theta_{\min}\) did not change, \(\lambda(-\theta)=-\lambda(\theta)\) and the graph would be linear; it would go down on the left of the kink by the same amount it goes up on the right of the kink. But now a different \(j\) has the largest price and the worst technology shock. Since this must be a worse technology shock than the one driving the previous case, GDP is lower and the graph is concave.  \[-\lambda(-\theta) = \beta'\theta + \frac{\alpha}{1-\alpha}\theta_{\max} \ge\beta'\theta + \frac{\alpha}{1-\alpha}\theta_{\min} = \lambda(\theta).\] Therefore  \(\lambda(-\theta)\le-\lambda(\theta),\) the left side falls by more than the right side rises. 

Does all of this matter? Well, surely more for questions when there might be a big shock, such as the big shocks we saw in a pandemic, or big shocks we might see in a war. One of the big questions that network theory asks is, how much does GDP change if there is a technology shock in a particular industry? The \(\sigma=1\) case in which expenditure shares are constant gives a standard and fairly reassuring result: the effect on GDP of a shock in industry i is given by the ratio of i's output to total GDP. ("Hulten's theorem.") Industries that are small relative to GDP don't affect GDP that much if they get into trouble. 

You can intuit that constant expenditure shares are important for this result. If an industry has a negative technology shock, raises its prices, and others can't reduce use of its inputs, then its share of expenditure will rise, and it will all of a sudden be important to GDP. Continuing our example, if one firm has a negative technology shock, then it is the minimum technology, and [(d gdp/dz_i = \beta_i + \frac{\alpha}{1-\alpha}.\] For small firms (industries) the latter term is likely to be the most important.  All the A and \(\sigma\) have disappeared, and basically the whole economy is driven by this one unlucky industry and labor.   

...what determines tail risk is not whether there is granularity on average, but whether there can ever be granularity – whether a single sector can become pivotal if shocks are large enough.
For example, take electricity and restaurants. In normal times, those sectors are of similar size, which in a linear approximation would imply that they have similar effects on GDP. But one lesson of Covid was that shutting down restaurants is not catastrophic for GDP, [Consumer spending on food services and accommodations fell by 40 percent, or $403 billion between 2019Q4 and 2020Q2. Spending at movie theaters fell by 99 percent.] whereas one might expect that a significant reduction in available electricity would have strongly negative effects – and that those effects would be convex in the size of the decline in available power. Electricity is systemically important not because it is important in good times, but because it would be important in bad times. 
Ben Moll turned out to be right and Germany was able to substitute away from Russian Gas a lot more than people had thought, but even that proves the rule: if it is hard to substitute away from even a small input, then large shocks to that input imply larger expenditure shares and larger impacts on the economy than its small output in normal times would suggest.

There is an enormous amount more in the paper and voluminous appendices, but this is enough for a blog review. 


Now, a few limitations, or really thoughts on where we go next. (No more in this paper, please, Ian!) Ian does a nice illustrative computation of the sensitivity to large shocks:

Ian assumes \(\sigma>1\), so the main ingredients are how many downstream firms use your products and a bit their labor shares. No surprise, trucks, and energy have big tail impacts. But so do lawyers and insurance. Can we really not do without lawyers? Here I hope the next step looks hard at substitutes vs. complements.

That raises a bunch of issues. Substitutes vs. complements surely depends on time horizon and size of shocks. It might be easy to use a little less water or electricity initially, but then really hard to reduce more than, say, 80%. It's usually easier to substitute in the long run than the short run. 

The analysis in this literature is "static," meaning it describes the economy when everything has settled down.  The responses -- you charge more, I use less, I charge more, you use less of my output, etc. -- all happen instantly, or equivalently the model studies a long run where this has all settled down. But then we talk about responses to shocks, as in the pandemic.  Surely there is a dynamic response here, not just including capital accumulation (which Ian studies). Indeed, my hope was to see prices spreading out through a production network over time, but this structure would have all price adjustments instantly. Mixing production networks with sticky prices is an obvious idea, which some of the papers below are working on. 

In the theory and data handling, you see a big discontinuity. If a firm uses any inputs at all from another firm,  if \(A_{ij}>0\), that input can take over and drive everything. If it uses no inputs at all, then there is no network link and the upstream firm can't have any effect. There is a big discontinuity at \(A_{ij}=0.\) We would prefer a theory that does not jump from zero to everything when the firm buys one stick of chewing gum. Ian had to drop small but nonzero elements of the input-output matrix to produces sensible results. Perhaps we should regard very small inputs as always substitutes? 

How important is the network stuff anyway? We tend to use industry categorizations, because we have an industry input-output table. But how much of the US industry input-output is simply vertical: Loggers sell trees to mills who sell wood to lumberyards who sell lumber to Home Depot who sells it to contractors who put up your house? Energy and tools feed each stage, but don't use a whole lot of wood to make those. I haven't looked at an input-output matrix recently, but just how "vertical" is it? 


The literature on networks in macro is vast. One approach is to pick a recent paper like Ian's and work back through the references. I started to summarize, but gave up in the deluge. Have fun. 

One way to think of a branch of economics is not just "what tools does it use?" but "what questions is it asking?  Long and Plosser "Real Business Cycles," a classic, went after idea that the central defining feature of business cycles (since Burns and Mitchell) is comovement. States and industries all go up and down together to a remarkable degree. That pointed to "aggregate demand" as a key driving force. One would think that "technology shocks" whatever they are would be local or industry specific. Long and Plosser showed that an input output structure led idiosyncratic shocks to produce business cycle common movement in output. Brilliant. 

Macro went in another way, emphasizing time series -- the idea that recessions are defined, say, by two quarters of aggregate GDP decline, or by the greater decline of investment and durable goods than consumption -- and in the aggregate models of Kydland and Prescott, and the stochastic growth model as pioneered by King, Plosser and Rebelo, driven by a single economy-wide technology shock.  Part of this shift is simply technical: Long and Plosser used analytical tools, and were thereby stuck in a model without capital, plus they did not inaugurate matching to data. Kydland and Prescott brought numerical model solution and calibration to macro, which is what macro has done ever since.  Maybe it's time to add capital, solve numerically, and calibrate Long and Plosser (with up to date frictions and consumer heterogeneity too, maybe). 

Xavier Gabaix (2011)  had a different Big Question in mind: Why are business cycles so large? Individual firms and industries have large shocks, but \(\sigma/\sqrt{N}\) ought to dampen those at the aggregate level. Again, this was a classic argument for aggregate "demand" as opposed to "supply."  Gabaix notices that the US has a fat-tailed firm distribution with a few large firms, and those firms have large shocks. He amplifies his argument via the Hulten mechanism, a bit of networkyiness, since the impact of a firm on the economy is sales / GDP,  not value added / GDP. 

The enormous literature since then  has gone after a variety of questions. Dew-Becker's paper is about the effect of big shocks, and obviously not that useful for small shocks. Remember which question you're after.

My quest for a new Phillips curve in production networks is better represented by Elisa Rubbo's "Networks, Phillips curves and Monetary Policy," and Jennifer La'o and  Alireza Tahbaz-Salehi's “Optimal Monetary Policy in Production Networks,” If I can boil those down for the blog, you'll hear about it eventually.  

The "what's the question" question is doubly important for this branch of macro that explicitly models heterogeneous agents and heterogenous firms. Why are we doing this? One can always represent the aggregates with a social welfare function and an aggregate production function. You might be interested in how aggregates affect individuals, but that doesn't change your model of aggregates. Or, you might be interested in seeing what the aggregate production or utility function looks like -- is it consistent with what we know about individual firms and people? Does the size of the aggregate production function shock make sense? But still, you end up with just a better (hopefully) aggregate production and utility function. Or, you might want models that break the aggregation theorems in a significant way; models for which distributions matter for aggregate dynamics, theoretically and (harder) empirically. But don't forget you need a reason to build disaggregated models. 

Expression (1) is not easy to get to. I started reading Ian's paper in my usual way:  to learn a literature start with the latest paper and work backward. Alas, this literature has evolved to the point that authors plop results down that "everybody knows" and will take you a day or so of head-scratching to reproduce. I complained to Ian, and he said he had the same problem when he was getting in to the literature! Yes, journals now demand such overstuffed papers that it's hard to do, but it would be awfully nice for everyone to start including ground up algebra for major results in one of the endless internet appendices.  I eventually found Jonathan Dingel's notes on Dixit Stiglitz tricks, which were helpful. 


Chase Abram's University of Chicago Math Camp notes here  are also a fantastic resource. See Appendix B starting p. 94 for  production network math. The rest of the notes are also really good. The first part goes a little deeper into more abstract material than is really necessary for the second part and applied work, but it is a wonderful and concise review of that material as well. 



  1. Reviews like this with pointers to other papers (and math tricks!) are an enormously valuable public good. Many thanks. Please keep them coming.

  2. German end users of Russian gas may have substituted but didn't large industrial users shut down or move production out of Germany?

  3. From José Luis de la Fuente.
    Hi, you mention at the end of this post "Chase Abram's University of Chicago Math Camp notes here are also a fantastic resource. See Appendix B starting p. 94 for production network math." Where is that Appendix B?

  4. If you intend to pursue this topic further, you may want to read N. Patterson's 2012 working paper titled "Elasticities of substitution in computable equilibrium models", Working Paper 2012-02, from the department of finance, government of Canada (Ottawa). The paper is available via this link:

    Patterson describes the difficulties in estimating the elasticity of substitution, σ, for application in computable general equilibrium model production functions. He discusses the two principal methods of estimating σ -- the Allen method and the Morishima method, and the error in each. He discusses CES vs. "trans-log" vs. linear-logit forms of the production function. He notes that the Cobb-Douglas production function and the Leontief production function are special instances of the CES production function.

    With respect to Dew-Becker, Richard Bellman's 'curse of dimensionality' comes into play. The model for the physical product, Yᵢ , i = 1, 2, ..., N, requires identification of 2N² + 2N + 2 parameters and variables. Patterson notes in his paper that identifying the production function for industry sectors in the relevant range of production variation is a daunting task that few modellers undertake. Instead, he states, identification efforts are focused on determining σ from marginal cost analysis.

    Patterson defines the elasticity of substitution between two inputs as σₖₗ = d(ln(xₗ/xₖ))/d(ln(fₖ(x̃)/fₗ( x̃)), where fₖ(x̃) is the production function output of node k in state x̃, etc.

    Overall, this is a highly developed subject that is dependent on the correct choice of models to achieve useful results. Identification is a large challenge.

    The asymptotic behavior depends on the chosen model. Dimensionality will defeat the purpose in the end. Patterson's research focus is the trade-off made between energy and capital in the production activity. This is probably feasible to undertake. Extending the effort to the full range of economic activity is not likely to be viable for a single researcher to undertake alone.

    The question: Where do you see yourself taking this to?

    1. It is often useful to delve deeper into the subject matter presented in a blog post and go beyond the published paper featured in the posting. In this case, the student lacking a thorough grounding in the theory and practice of 'production functions' is afforded a high-level overview of the subject without being necessarily au courant with the higher levels of mathematical theory to appreciate the strengths, limitations, and weaknesses of the more popular economic models of the production function. One such overview is that written by S. K. Mishra of the Dept. of Economics, North-Eastern Hill University, Shillong (India). His review is comprehensive and provides useful insights into the nature of theoretical production functions used in neo-classical economics. He sets out the pros and cons, and limitations of most, if not all, production functions introduced in the 20th century in Europe and North America. Cobb-Douglas, Leontief, and CES production functions are discussed extensively and their limitations are exposed. Mishra concludes with the critique of the neo-classical economics penchant for algebraic production functions that fail to conform to reality at the aggregate economic level--the so-called "Cambridge (U.K.) versus Cambridge (U.S.A.)" debate. Mishra's concluding remarks are quoted below in their entirety.

      "A Brief History of Production Functions", Mishra, SK, (2007). Munich Personal RePEc Archive, MPRA Paper No. 5254, posted 10 Oct 2007 UTC.

      "IV. Concluding Remarks: It is said that once Stanislaw Ulam very earnestly sought for an example of a theory in social sciences that is both true and nontrivial. Paul Samuelson, after several years supplied one: the Ricardian theory of comparative advantages. If this is true then it speaks enough about the position of ‘production function’ in economics. Anwar Shaikh almost conclusively proved that the properties of the Cobb-Douglas production function are devoid of any economic content; they stem from the algebraic properties of the function. Measurement of capital was put in jeopardy by the capital controversy, not only for aggregate production function, but also for any production function even at the firm level. If one has interacted with engineers and the managers who are decision makers at certain level, one might know their view of the utility of production functions. Possibly, Mrs. Joan Robinson was right to comment that the production function has been a powerful instrument of miseducation. The student of economic theory is taught to write Q= f (L, K ) … and that is its sole utility. The era of classical economics had ended (sometime in 1870’s) with the criticism of it by Marx and the birth of neoclassicism. The era of neoclassical economics possibly ended with the capital controversy sometime in 1970’s."

    2. In modelling the national economy of any large or medium size open economy, the assumption of a production function of a manufactory (producer) that produces multiple product lines (yₗ, y₂, ..., yₘ) from multiple productive factor inputs (x₁, x₂, ..., xₙ) the definition of the appropriate theoretical production function is not clear. When a productive sector of the economy includes multiple manufactories each producing multiple product lines consuming multiple productive factor inputs in doing so, aggregation of the production functions of the individual manufactories into an aggregate production function for the sector is said to be problematic insofar as each individual manufacturer may be operating sub-optimally according to its own preferences and under its own idiosyncratic constraints, and none producing an identical product or using an identical process. Assumption of a particular aggregate production function for ease of computation does not guarantee that the result will conform to practice (which some might refer to as 'reality').

      Theoretical work in generating models of actual economies are only useful to the extent that the model answers specific questions of some importance faithfully, i.e., the answers are reasonable analogues of what the economy modelled would likely do if the model's inputs and constraints are based on conditions that the economy might face under similar circumstances.

      To say that such-and-such a production network model when a scalar quantity, say a coefficient of proportionality, " t ", is increased indefinitely towards infinity, will behave like so-and-so, in the limit, is only a curiosity if it is physically impossible for such a circumstance to arise in the first place. One generally looks to applied mathematics (as opposed to theoretical mathematics) for guidance in such instances.

      Resources are fixed in the short-term; reconfiguration of networks feeding finished products and intermediate products into the U.S. economy take time to achieve, and may not all be achievable; CES production functions cannot faithfully model the elasticity of substitution between more than two productive factor inputs. Mishra (2007), states: "Uzawa (1962) and McFadden (1962, 1963) proved that it is impossible to obtain a functional form for a production function that has an arbitrary set of constant elasticities of substitution if the number of inputs (factors of production) is greater than two. Mathematical enunciations of these assertions are now known as the impossibility theorems of Uzawa and McFadden."

      The student, reading the literature, is cautioned that neo-classical economics with its emphasis on mathematical models of economic systems is fraught with inconsistencies. Practitioners continue to improve the models, with more mathematics. But, mathematics alone will not make macroeconomics more accurate or improve the fidelity of the models used to predict future economic behavior.

      We live in interesting times, in more ways than one.

  5. In an article for the BBVA lecture of the European Economic Association and the BBVA Foundation, David Baqaee and Emmanuel Farhi set down aggregate production functions for networks.  The mathematical treatment is dense, but not abstruse to the point of being incomprehensible. P. Samuelson's 1966 concession to J. Robinson et al., re: aggregation of capital productive input factors to conclude the Cambridge-Cambridge debate is discussed in the context of networked production functions.  The article is the complement to the Dew-Becker paper.

    D. Baqaee and E. Farhi, "The Microeconomic Foundations of Aggregate Production Functions", Journal of the European Economic Association, July 22, 2022.

    Aggregate production functions are reduced-form relationships that emerge endogenously from input-output interactions between heterogeneous producers and factors in general equilibrium. We provide a general methodology for analyzing such aggregate production functions by deriving their first- and second-order properties. Our aggregation formulas provide non-parameteric characterizations of the macro elasticities of substitution between factors and of the macro bias of technical change in terms of micro sufficient statistics. They allow us to generalize existing aggregation theorems and to derive new ones. We relate our results to the famous Cambridge-Cambridge controversy


Comments are welcome. Keep it short, polite, and on topic.

Thanks to a few abusers I am now moderating comments. I welcome thoughtful disagreement. I will block comments with insulting or abusive language. I'm also blocking totally inane comments. Try to make some sense. I am much more likely to allow critical comments if you have the honesty and courage to use your real name.