June 3, 2025

Custom NYC Home Price Index: A Flexible Alternative to Zillow

Live updating price index

Source code

This Julia script builds a custom home price index using New York City property sales data from 2003 through recent months. It differs from Zillow’s price indices in both methodology and flexibility, allowing for tailored insights at the borough, neighborhood, and property-type level.

What the Code Does

  1. Data Ingestion & Cleaning: It reads in historical sales data across boroughs, standardizes column names, categorizes property types (e.g., Coop, Condo, Single-Family Home), and filters out implausible sale prices and duplicates.
  2. UID Generation: Unique identifiers for properties are created using block-lot-apartment logic to track repeat sales over time.
  3. Outlier Detection: The model flags and removes outlier transactions based on excessive price changes adjusted for time between sales.
  4. Regression-Based Index Calculation: Using a repeat-sales regression weighted by time-between-sales residuals, the code estimates period-over-period price changes. The result is a time series of normalized price indices.
  5. Segmented Indexing: It calculates indices not just overall, but also segmented by price tier (top/bottom deciles/thirds), borough, property type, and neighborhood.
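
To make steps 2 and 4 concrete, here is a minimal sketch of a repeat-sales index in Julia using only the standard library. It is an illustration under stated assumptions, not the script itself: the `Sale` struct, its field names, and the function names are hypothetical, periods are calendar years for brevity, and the regression is unweighted ordinary least squares, so the weighting by time-between-sales residuals described above is left out.

```julia
using Dates, LinearAlgebra

# Hypothetical record layout; the real script's column names are not shown in this post.
struct Sale
    borough::Int
    block::Int
    lot::Int
    apt::String
    date::Date
    price::Float64
end

# Block-lot-apartment identifier used to recognize repeat sales of the same unit.
uid(s::Sale) = string(s.borough, "-", s.block, "-", s.lot, "-", s.apt)

# Period number of a sale, counted in calendar years from base_year (1 = base year).
period(s::Sale, base_year::Int) = year(s.date) - base_year + 1

# Pair each sale with the next sale of the same property.
function repeat_pairs(sales::Vector{Sale})
    groups = Dict{String,Vector{Sale}}()
    for s in sales
        push!(get!(groups, uid(s), Sale[]), s)
    end
    pairs = Tuple{Sale,Sale}[]
    for g in values(groups)
        sort!(g, by = s -> s.date)
        for i in 1:length(g)-1
            push!(pairs, (g[i], g[i+1]))
        end
    end
    return pairs
end

# Unweighted repeat-sales regression:
#   log(p2 / p1) = β[t2] - β[t1], with the base period's β fixed at 0.
# Returns an index normalized to 100 in the base period.
function repeat_sales_index(sales::Vector{Sale};
                            base_year::Int = minimum(year(s.date) for s in sales))
    # Pairs sold twice within the same period carry no information about period changes.
    pairs = [(a, b) for (a, b) in repeat_pairs(sales)
             if period(a, base_year) != period(b, base_year)]
    nperiods = maximum(period(s, base_year) for s in sales)
    X = zeros(length(pairs), nperiods - 1)
    y = zeros(length(pairs))
    for (i, (a, b)) in enumerate(pairs)
        t1, t2 = period(a, base_year), period(b, base_year)
        t1 > 1 && (X[i, t1 - 1] = -1.0)   # first sale enters with -1
        t2 > 1 && (X[i, t2 - 1] = 1.0)    # second sale enters with +1
        y[i] = log(b.price / a.price)
    end
    β = X \ y                             # least-squares fit
    return 100 .* exp.([0.0; β])
end
```

Calling `repeat_sales_index(sales)` on a vector of `Sale` records returns one index value per year, starting at 100 in the first year. Segmented indices (step 5) would follow from running the same function on a filtered subset of the sales, for example one borough or one property type.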

How It’s Different from Zillow’s Price Index

Feature           | This Code                                           | Zillow Home Value Index (ZHVI)
------------------|-----------------------------------------------------|--------------------------------------
Data Source       | NYC public sales transaction data                   | Proprietary data, including listings
Methodology       | Repeat-sales regression (like Case-Shiller)         | AVM (automated valuation model)
Outlier Filtering | Custom logic using log price change                 | Internal filters (opaque)
Segmentation      | Fine-grained (borough, neighborhood, price tiers)   | Limited segmentation via dashboards
Transparency      | Fully auditable and modifiable                      | Black-box methodology

Why It’s More Customizable

  • Open & Extensible: Easily adjust filters (e.g., sale price ranges, property types), time resolution, or regression methods.
  • Fine Control Over Segmentation: Custom groupings like “top third” or specific neighborhoods enable hyper-local insights.
  • Tailored Outlier Detection: Change thresholds for what counts as an anomalous price movement.
  • Reproducibility: Every transformation is explicit and reproducible, unlike opaque third-party indices.
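
As a rough illustration of what adjustable thresholds and custom price tiers can look like, the Julia sketch below exposes both as keyword arguments. The function names, the annualized log-change rule, and the default values (`threshold = 0.5`, `min_years = 0.5`, `ntiers = 3`) are assumptions chosen for the example, not the values or logic used in the actual script.

```julia
using Statistics

# Absolute log price change per year between two sales of the same property.
# min_years (assumed) keeps very short holding periods from inflating the rate.
function annualized_log_change(p1, p2, years; min_years = 0.5)
    return abs(log(p2 / p1)) / max(years, min_years)
end

# Flag a repeat-sale pair as an outlier when its annualized change exceeds
# the threshold. Raising the threshold keeps more pairs in the regression.
is_outlier(p1, p2, years; threshold = 0.5, min_years = 0.5) =
    annualized_log_change(p1, p2, years; min_years = min_years) > threshold

# Assign each price to a tier (1 = cheapest) using evenly spaced quantile cuts.
# ntiers = 3 gives thirds; ntiers = 10 gives deciles.
function price_tiers(prices::AbstractVector{<:Real}; ntiers::Int = 3)
    cuts = quantile(prices, [k / ntiers for k in 1:ntiers-1])
    return [1 + count(c -> p > c, cuts) for p in prices]
end
```

With this shape, changing what counts as an anomalous price movement is a one-argument change, such as `is_outlier(p1, p2, years; threshold = 1.0)`, and switching from thirds to deciles is `price_tiers(prices; ntiers = 10)`.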

Use Cases

  • Policy analysis by borough or income level
  • Neighborhood-specific investment tracking
  • Academic housing studies
  • Custom reporting for real estate professionals

Conclusion

This code provides a transparent, flexible, and research-grade alternative to Zillow’s ZHVI, ideal for users needing precise control over methodology and data scope.