If cash-flow risk-neutral dynamics are affine (or exponentially affine), you can test this "law of iterated expectations" using a particular variance ratio statistic. 

Obviously, there are a lot of assumptions involved here, but the nice thing about the paper is that they try various ways of "explaining away" the discrepancies they find, using nonlinear dynamics, long memory dynamics, omitted factors, measurement errors, etc., and they systematically fail to reproduce the patterns they see in the data for their test statistics. The only explanation they find that reproduces the patterns in the data involves modifying the expectation operator -- i.e., arguing that people implicitly exaggerate the persistence of the cash flow process. And it works, specifically, by breaking the law of iterated expectations (the new operator doesn't have this property). If I remember correctly, they also showed that reasonable transaction costs easily eat up arbitrage profits.

Obviously, that Q-expectation is essentially the same equation you use here, although those derivative securities have a finite maturity and some of them only provide the owner with a single cash flow at maturity. How would you reconcile their failure to save the expected value equation with your comments here?

The behavior of least-squares estimates in linear time series models hinges on the behavior of the noise term. Regressing prices on dividends by OLS would make sense if both processes are (linearly) cointegrated. If you have p(t) = a + b*d(t) + e(t), your assumption is that (p(t) - a - b*d(t)) is I(0). If you let these be logarithms, i.e. p(t) = ln[ P(t) ], d(t) = ln[ D(t) ] and pd(t) = p(t) - d(t), you should deduce that Cochrane here says a=0 and b=1 works. Why? He says p(t) and d(t) are I(1), but pd(t) = p(t) - d(t) is I(0), hence (1,-1) must be a cointegrating vector between log prices and log dividends. In that case, regressions of the form

p(t) = a + b*d(t) + c'z(t) + e(t)

make sense, as long as c'z(t) is I(0). On the other hand, why would you do that? If you believe Cochrane and much of the literature, impose the restriction and run

pd(t) := p(t) - d(t) = c'z(t) + e(t).

That's going to be more statistically efficient, assuming you're interested in the coefficients in the vector c. The only question these regressions can ever answer is whether price volatility is due to cash flow volatility (or in this case, primary surplus volatility) or due to return volatility.
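To make the cointegration argument above concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not taken from the post or from data: log dividends d(t) are a pure random walk, the log price-dividend ratio pd(t) is a stationary AR(1), and the parameter values (phi, the shock sizes, the sample length) are arbitrary. The helper names `ols` and `simulate` are hypothetical. If (1,-1) really is a cointegrating vector, the unrestricted OLS slope of p(t) on d(t) should land very close to 1 and the residual should look stationary.

```python
import random


def ols(y, x):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(T=5000, phi=0.9, seed=0):
    """Simulate log dividends d (I(1)) and log price-dividend ratio pd (I(0))."""
    rng = random.Random(seed)
    d, pd = [0.0], [0.0]
    for _ in range(T - 1):
        d.append(d[-1] + rng.gauss(0.0, 1.0))          # random walk: I(1)
        pd.append(phi * pd[-1] + rng.gauss(0.0, 0.1))  # stationary AR(1): I(0)
    p = [di + pdi for di, pdi in zip(d, pd)]           # log price = d + pd
    return p, d, pd


p, d, pd = simulate()

# Unrestricted regression p(t) = a + b*d(t) + e(t)
a_hat, b_hat = ols(p, d)

# Persistence of the residual: AR(1) coefficient well below 1 suggests I(0)
resid = [pi - a_hat - b_hat * di for pi, di in zip(p, d)]
_, rho_hat = ols(resid[1:], resid[:-1])

print("estimated slope b:", round(b_hat, 4))
print("residual AR(1) coefficient:", round(rho_hat, 4))
```

In this setup the unrestricted regression just rediscovers b = 1, which is the sense in which imposing the restriction and working with pd(t) directly loses nothing and saves a parameter. A serious check would use a proper unit root or cointegration test rather than eyeballing the residual AR(1) coefficient.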

I've seen even Richard Thaler make the mistake of thinking that the value premium is somehow equivalent to a willingness on the part of investors to pay ten five dollar bills for a single twenty dollar bill. The distinction between "arbitrage opportunity" and "systematic risk premium" is not clear in the minds of many people, likely because they operate with some implicit assumption that a rational investor is risk-neutral with a fixed time preference.

Maybe more needs to be done to spread more widely the change in the common way of thinking about asset pricing, Fama's observational equivalence theorem, etc. One hedge fund director gave me the following rule:
"Dividends are the worst way for a company to distribute liquidity to its shareholders. If shareholders need liquidity, a stock buy-back is a much better tool."

Is that a reason that a regression of price on dividends is a mess? What it measures is a mix of inefficient distributions with better ones.

Any way you could give the intuition in a simple verbal form? It seems like there is a more general lesson here ("your test should allow people in the economy to have information we don't include in our forecasts" – Hayek's lesson, I guess) that might be more broadly useful if it were expressed in more general terms, with asset pricing & gov't debt as a specific example. I'm not math-phobic (eng background), but I don't play with these equations daily.

John can skip this comment if he wishes; he is the one I am talking to, since he is writing the book.

The reason the investor wants to convert returns into matched binomials of a fair coin is that it resembles a Gaussian, and he can treat the investment opportunities as independent arrivals, which is another way of saying risk equalized. It also boils down to one outcome: the investor never wants to wait too long in the congested line to trade. In the balance, he gets either first or second place in the trade pit, always, and under that condition he samples enough to check market size.

https://www.bloomberg.com/news/articles/2020-07-08/bond-market-tourists-threaten-to-bolt-with-200-billion-at-risk

Bond-Market Tourists Threaten to Bolt With $200 Billion at Risk
-----
Check out the chart in this discussion. What the authors note is the peak to peak ten year bounce in returns, over 40 coin tosses. That is the finite sequence of di in the first equation for net present value, a finite binomial series. So one can see that if one compares the total return of this asset to some 'safe rate', then the interest charges accumulated must generate a balanced binomial with a fair coin toss over the terms. This is the moment matching function, and that is the process of risk equalization. And from the chart, this seems to be the longest level of tolerance unless one is betting the 40 year generational period, or further. One good solid balanced investment in corporate bonds, managed over ten years, gets one through the recession cycles with fair returns.

Isn't the rational bubble more or less equivalent to the idea that there is some other value that bond holders derive from holding debt besides the PV of primary surpluses? One source of that is that Treasury debt has money-like properties (especially at the interest rates we have now). The value fluctuates with policies like QE and liquidity requirements. The "t goes to infinity" part is cute, but try explaining that in polite company.
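For what it's worth, here is what the "t goes to infinity" step looks like in symbols. This is a sketch under simplifying assumptions that are mine, not the post's: a constant discount factor \beta, with P_t standing for the price (or the real value of the debt) and D_t for the cash flow (dividends or primary surpluses). Iterating the one-period pricing equation forward T periods gives

\[
P_t = \sum_{j=1}^{T} \beta^{j}\, E_t\!\left[D_{t+j}\right] + \beta^{T} E_t\!\left[P_{t+T}\right].
\]

The usual present value formula drops the last term by assuming

\[
\lim_{T \to \infty} \beta^{T} E_t\!\left[P_{t+T}\right] = 0,
\]

and a rational bubble is the case where that limit is not zero. As far as I can tell, liquidity or money-like services would more typically be modeled as an extra flow inside the sum (or as a lower discount rate) rather than as a nonzero terminal term, so the two ideas are related but not quite the same thing.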
Nice post!

As you say, the resolutions came with Campbell and Shiller (RFS 1988). In fact, the precursor (Campbell and Shiller, JPE 1987) already had these insights, describing in detail the implications of non-stationarity, limited information by the econometrician, the importance of including prices in the VAR, etc., when 'testing' present value models. And they pointed to the danger of relying too much on statistical tests when evaluating economic models.

My favorite piece from the paper: "a statistical rejection of the model ... may not have much economic significance. It is entirely possible that the model explains most of the variation in [prices] even if it is rejected at a 5% level." (Campbell and Shiller, JPE 1987, p. 1063).
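A toy illustration of that point, with the caveat that it is a generic regression example and not the Campbell-Shiller test itself: all numbers below are made up, `model_value` and `observed` are hypothetical series, and the true slope of 0.95 is an arbitrary choice. The "model" says the slope should be exactly 1. With enough data the test rejects that decisively, yet the model still accounts for the bulk of the variation.

```python
import math
import random

rng = random.Random(1)
T = 2000

# Hypothetical model-implied values and observed values (true slope 0.95, not 1)
model_value = [rng.gauss(0.0, 1.0) for _ in range(T)]
observed = [0.95 * m + rng.gauss(0.0, 0.2) for m in model_value]

# OLS of observed on model_value
mean_x = sum(model_value) / T
mean_y = sum(observed) / T
sxx = sum((x - mean_x) ** 2 for x in model_value)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(model_value, observed))
syy = sum((y - mean_y) ** 2 for y in observed)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual variance, standard error of the slope, t-statistic for H0: slope = 1
sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(model_value, observed))
se_slope = math.sqrt(sse / (T - 2) / sxx)
t_stat = (slope - 1.0) / se_slope

# Share of variation in observed values explained by the model
r_squared = 1.0 - sse / syy

print("slope:", round(slope, 3))
print("t-stat for slope = 1:", round(t_stat, 2))
print("R-squared:", round(r_squared, 3))
```

The rejection here says the slope is not exactly 1; it says nothing about whether the gap is economically large.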
I can't do math on web, never learned.

But the present value equation, the first one, does not go to infinity, and the di terms are the terms of a binomial with the coin weighted to generate the observed of the asset taking a dump.

The investor is estimating the distribution of returns as an unfair coin, over the number of coin tosses at the rate in which he updates the portfolio. The investor is looking for the complete probability distribution as some resolution less than perfect.

The interest rate borrowed for each term converts the distribution into a fair coin so it can be risk equalized with other investments in the portfolio. The investor is setting his adjustment rate to a finite sufficient to track changes in N, market size, the unknown fuzzy constant. The binomials are risk adjusted because the total probability of blundering is about the same across the portfolio.

This gets us back to the two peak theory (Jim Hamilton): seeing two peaks in a price is the deja vu moment in which the investor can estimate the unfairness of the coin flip and create the approximate binomial, then set the amount of borrowings to center it.