Cultural Caviar

Moneyball for Real Estate

May 13, 2015

Where’s the best place to move for your children’s sake?

For several years now, a Harvard Big Data project has been crunching confidential IRS numbers to discover the enduring secrets of why some parts of the country nurture more prosperous young people than other places. Is it public transportation? Is it integration?

Or have the Harvard analysts failed to understand their own numbers due to ideological and academic prejudices?

As I pointed out last week, the analysis of baseball statistics has made historic progress over the last 40 years because of the influx of amateur kibitzers. In contrast, in more important fields such as medicine and social policy, our society hasn’t gotten as far for reasons such as an excessive respect for professional titles and the fear of discovering some truth that is politically off-limits.

This week I’ll offer an amateur Moneyball analysis of the giant database created by The Equality of Opportunity Project’s staff at Harvard under celebrated economics professor Raj Chetty, a domestic policy adviser to Mrs. Clinton. I’ll demonstrate some tricks I’ve learned over the decades for how to do reality checks on academic analyses.

Chetty’s goal is to find out where in America you should relocate to so your children could grow up to enjoy the American Dream. Then, the rest of the country should copy whatever policies Dreamville, USA is getting right.

This is a great topic for a Big Data exploration. Young couples talk about where to move incessantly in private, and many will spend huge amounts of money for more desirable locations (which are summarized and euphemized with the shorthand phrase “good schools”).

Chetty explained to an interviewer:

“We’re trying to understand the determinants of intergenerational mobility. A simple way to think about it is ‘your odds of moving up in the income distribution,’ [which is] kind of the core ideal of the American dream. We want to investigate what factors seem to increase kids’ chances of moving up in the income distribution and what we can do to promote the outcomes of disadvantaged youth.”

Chetty talked the IRS into letting his research team look at several tens of millions of people’s tax returns from 1996-2000. Then they looked at ten million of their original sample’s dependents, teenagers born in 1980-82, and tracked down their 1040s from 2011-2012 when the second generation was around age 30. (This data was, hopefully, anonymized.)

On the other hand, while Chetty has been making intermittent progress since 2013 at figuring out how to think about his vast pile of numbers, he clearly doesn’t have much of a knack for understanding his adopted country’s complicated social landscape. Imagine if you had moved with your family at age nine to India. How well even today would you understand its baffling social geography?

“In other words, one of Chetty’s big lessons is that if you are a blue collar worker, you should move to a county that will be booming a decade and a half from now for reasons you can’t possibly anticipate.”

Back in 2013 and 2014, Chetty made a big splash with his now-retracted findings that teens living in the Southeast in the late 1990s, especially the sad denizens of Charlotte, North Carolina, had the least equality of opportunity. There was much gnashing of teeth in the media about the long, vicious legacy of Jim Crow.

But, as was pointed out to Chetty, it seems backwards to look at where kids lived in 1996-2000 when the bigger influence on their income as adults in 2011-12 is likely where they were living in 2011-12. So, as of May 2015, Chetty has compromised and unveiled a new methodology for ranking 2,478 counties by where young people moved to while they were still dependents of their parents. From this, Chetty has made up a list of Good Counties and Bad Counties.

Chetty’s new theory that it only benefits children economically in the long term to be moved from a Bad County to a Good County while still early in life echoes Londoner Samuel Johnson’s observation to Scottish immigrant James Boswell: 

“Much may be made of a Scotchman, if he be caught young.”

Some of Chetty’s latest findings are discombobulating to The Narrative: for example, immigrant-rich big liberal cities, such as Manhattan, turned out to be bad for Americans to move to.

A close reading of the new 2015 paper by Chetty and Nathaniel Hendren, “The Impacts of Neighborhoods on Intergenerational Mobility: Childhood Exposure Effects and County-Level Estimates,” reveals much that is plausible. For example, the effect of local culture, such as gangs, can be different on boys and girls. Chetty and Hendren write:

This suggests that there are pockets of places across the U.S., like Baltimore MD, Pima AZ [Tucson], Wayne County (Detroit) MI, Fresno CA, Hillsborough FL [Tampa], and New Haven CT, which seem to produce especially poor outcomes for boys.

New Haven County is a fine place to live if you have daughters and you are a Tiger Mother professor at Yale Law School, but it’s a terrible place to move to if you have poor black sons. Chetty has no data on what percentage of boys who were moved to Baltimore, Detroit, or New Haven weren’t earning much in 2011-12 because they were in jail, but it’s obviously a considerable risk.

In contrast, Tucson, Fresno, and Tampa were all home construction boomtowns that got wiped out by the bursting of the Housing Bubble in 2008, a memorable cataclysm whose effects on his data Chetty doesn’t seem to have pondered.

Conversely, girls whose parents moved them when they were teens in the 1990s to now booming and low crime Manhattan are likely to pay a penalty in terms of lower family income in 2011-2012 because they are less likely to be married than if they had been moved to Salt Lake City.

Fertility is actually a promising avenue for Chetty to pursue in the future. As we’ll see below, his income calculations are riddled with problems, but he appears to have the data to estimate the answers to questions such as: where should you move if you want your child to present you with a legitimate grandchild by the time you are, say, 70? That is the kind of thing you aren’t supposed to discuss in public these days, but I’d be surprised if Mr. and Mrs. Chetty don’t worry about it.

Unfortunately, Chetty’s attempts to get a grip on income inequality are still inadequate due to shortcomings in his methodology and analysis.

To help you see what some of the problems with Chetty’s work are, let’s walk through the top and bottom of his new rankings of 2,478 counties. When thinking about Big Data, I’ve long found it extremely useful to look at the highest and lowest examples in detail to see what kind of patterns leap out. It’s extremely easy these days to look up facts about outliers, so more people should do it. This doesn’t seem to be a common practice among academic data analysts, however, who evidently fear contamination by bias and stereotypes. Instead, they wind up suffering from ignorance, which is worse.
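For readers who want to try this kind of reality check themselves, here is a minimal Python sketch of pulling the two ends of a ranking for closer inspection. The record type, the field name pct_income_effect, and the five sample rows are hypothetical placeholders invented for illustration; they are not Chetty’s actual file layout or figures.

```python
from typing import NamedTuple


class CountyEffect(NamedTuple):
    county: str
    pct_income_effect: float  # hypothetical field: % change in adult income


def extremes(rows, n=25):
    """Return (bottom n, top n) rows, each listed most extreme first."""
    ranked = sorted(rows, key=lambda r: r.pct_income_effect)
    # Never let the two lists overlap on a short table.
    n = max(0, min(n, len(ranked) // 2))
    if n == 0:
        return [], []
    bottom = ranked[:n]
    top = ranked[len(ranked) - n:][::-1]
    return bottom, top


# Invented placeholder rows, NOT real county estimates.
sample = [
    CountyEffect("County A", -12.0),
    CountyEffect("County B", 3.5),
    CountyEffect("County C", 8.0),
    CountyEffect("County D", -30.0),
    CountyEffect("County E", 0.5),
]

bottom, top = extremes(sample, n=2)
for row in bottom + top:
    print(row.county, row.pct_income_effect)
```

The point of the exercise is what comes after the sort: look up each outlier by name and ask what it has in common with its neighbors on the list.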

From Chetty’s website you can download his rankings of upward income mobility for all the counties in America. One of his tables focuses on families that averaged in the lower half in 1996-2000 and the other in the upper half. The median family in the bottom half was, of course, at the 25th percentile in income in the 1990s, while, due to regression toward the mean, their typical offspring is by 2011-2012 at the 41st percentile for his or her age group.

The teenage dependents of families in the top half (i.e., at the 75th percentile) in the later 1990s have typically regressed as thirtysomethings by 2011-2012 to the 56th percentile. But in a few places, such as Fairfax County, Virginia, the affluent have done a striking job of staying affluent across the generations.
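The two pairs of percentiles quoted above (25th to 41st, 75th to 56th) are enough for some back-of-the-envelope arithmetic on how strong the regression toward the mean is. The sketch below simply draws a straight line through those two points. Treating the relationship as linear is my simplifying assumption, and the function name is made up for illustration; this is not Chetty’s estimation method.

```python
def implied_rank_line(parent_lo, child_lo, parent_hi, child_hi):
    """Slope and intercept of the straight line through two (parent, child) points."""
    slope = (child_hi - child_lo) / (parent_hi - parent_lo)
    intercept = child_lo - slope * parent_lo
    return slope, intercept


# Percentiles quoted in the text: 25th -> 41st and 75th -> 56th.
slope, intercept = implied_rank_line(25, 41, 75, 56)

# How far each group's children ended up from each other, versus their parents.
parent_gap = 75 - 25
child_gap = 56 - 41

print(f"slope: {slope:.2f}")
print(f"intercept: {intercept:.1f}")
print(f"parent gap: {parent_gap} points, child gap: {child_gap} points")
```

On these numbers, a 50-point gap between the parents shrinks to a 15-point gap between their children, so roughly 30 percent of the parental difference in rank carries over to the next generation.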

In Chetty’s ranking of income impact on below-average families, the single worst county for kids’ future income is Shannon, South Dakota. Raising your kids in Shannon County in the late 1990s would likely drive down their income by 2011-12 by 35% relative to the average county in America.

What’s so bad about Shannon County? Well, a quick glance at Wikipedia shows that since 2014, it’s been called Oglala Lakota County. This American Indian county is entirely within the notoriously tragic Pine Ridge Indian Reservation, font of all sorts of bad news since the Wounded Knee Massacre in 1890. It was the home of American Indian radicalism in the 1970s and is notorious today for its horrific alcoholism.

Hence, Chetty’s system passes this first reality check well: If you’d asked me to name the Worst Place in America, Pine Ridge likely would have been among the first half dozen guesses I would have come up with.

On the other hand, the example of Pine Ridge calls into question a key assumption in Chetty’s new methodology: that people moving between counties comprise a representative, random sample. But who in the world would move their children to the Pine Ridge Indian Reservation, where 103 young people between ages 12 and 24 attempted suicide this winter? The Sioux who move away from Pine Ridge are likely the more determined and sober, while the ones who slink home, children in tow, are probably those defeated by life in the outside world.

In total, six of Chetty’s Bottom 25 worst counties are majority aboriginal (five Indian reservations plus Nome, Alaska).

Another nine of Chetty’s Bottom 25 are majority black (and the remaining ten are all above the national mean in percentage black).

The sheer blackness of Chetty’s Worst Places Lists is so obvious that Chetty has to admit it:

One of the salient findings in Chetty et al. (2014) is that areas with a higher fraction of African Americans have much lower observed rates of upward mobility.

But this result has been a recurrent embarrassment to him. He really doesn’t want to bring up the Occam’s Razor explanation: that while every family regresses over the generations toward the mean, blacks regress toward a lower mean income than do whites.

Race is such a powerful factor influencing how much money the next generation earns that Chetty takes some pains to obscure this fact. Maryland sociologist Philip N. Cohen pointed out how Chetty’s 2014 paper tries to euphemize the role of blackness behind related factors like de facto “segregation:”

Instead, they drop percent Black for racial segregation. I have no idea why, especially considering … [I]n these normalized correlations, fraction Black has a stronger relationship to mobility than racial segregation or economic segregation! In fact, it’s just about the strongest relationship on the whole long table (except for single mothers, with which it is of course highly correlated).