NeighborhoodGuru

Methodology

Last updated: April 2026

This page explains how the numbers on NeighborhoodGuru are produced, how current they are, and what they can and cannot tell you. Our goal is that every chart, tile, or sentence on the platform can be traced back to a well-defined data class and a reasonable freshness window.

How our analytics work

Our analytics combine three broad categories of input:

  • Real estate research data — established market indices that describe home values, rents, inventory, days on market, and related signals at the ZIP, metro, and state levels.
  • Federal statistics — public economic data from federal agencies, including interest rates, housing starts, and demographic and income distributions.
  • Public records and listing data — address-level attributes (bed/bath/sqft, assessed value, sale history) for property-level scans and deep dives.

We normalize these inputs into a single internal schema, join them on common geographies (ZIP, county, metro, state), and compute derived metrics (momentum, buy-score, rent-to-value ratios, cash-on-cash return, and so on). Every metric is labeled in the UI with its data class, so you can tell observed values from modeled estimates.
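To make the pipeline concrete, here is a minimal sketch of the normalize-join-derive flow in Python. The column names (zip_code, median_rent, median_value) and the sample figures are illustrative assumptions, not our actual schema or data:

    import pandas as pd

    # Hypothetical normalized inputs, already keyed on a shared geography.
    rents = pd.DataFrame({"zip_code": ["30301", "75201"],
                          "median_rent": [1850, 1620]})
    values = pd.DataFrame({"zip_code": ["30301", "75201"],
                           "median_value": [342000, 295000]})

    # Join on the common geography key.
    zips = rents.merge(values, on="zip_code", how="inner")

    # One derived metric: annualized rent-to-value ratio (gross yield).
    zips["rent_to_value"] = (zips["median_rent"] * 12) / zips["median_value"]

    print(zips[["zip_code", "rent_to_value"]])

The real pipeline joins at several geography levels and computes many more metrics, but every derived column follows this pattern: observed or modeled inputs in, a labeled derived value out.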

Data source classes

To avoid false precision, every number on the platform falls into one of five classes. When a tile, tooltip, or chart omits the class, assume the value is a modeled estimate.

Observed

A directly measured value — e.g. a closed sale price, a listed rent, a reported loan balance. Observed values are as accurate as the underlying record. They may still be stale if the report lags the event.

Public record

A value sourced from a government record — assessments, parcel attributes, tax rolls, deed transfers. Accuracy depends on the jurisdiction's recording practices. Some fields (e.g. square footage) can drift from reality between reassessments.

Modeled estimate

A value produced by a statistical model from one or more observations — e.g. an automated valuation, a rent estimate, an occupancy forecast. Modeled estimates are always approximate and come with a confidence range. They are not appraisals.
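Where a modeled estimate appears in the UI, think of it as a point plus a range rather than a single number. A minimal sketch of that shape, with illustrative field names and figures (not our schema):

    from dataclasses import dataclass

    @dataclass
    class ModeledEstimate:
        point: float  # the headline number shown in the UI
        low: float    # lower bound of the confidence range
        high: float   # upper bound of the confidence range

    rent = ModeledEstimate(point=1850.0, low=1700.0, high=2000.0)
    # Read point as a midpoint, not a guarantee (see Known limitations).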

Area-level proxy

A value that describes an area (ZIP, tract, metro) rather than a specific address, used as the best available stand-in when no address-level value exists. Examples: median household income by ZIP used as an indicator for an individual block, or ZIP-wide rent growth used to project a specific unit's rent. Useful for context, but it should not be read as property-specific.

Derived

A value computed inside the platform from two or more of the above — e.g. buy-score, gross yield, momentum indicators, PITI payments. Derived values inherit the uncertainty of their worst input and should be compared relatively, not read as precise absolutes.
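The "worst input wins" rule can be sketched directly. The enum and function below are illustrative, not our internals; the ordering mirrors the taxonomy on this page, from most to least certain:

    from enum import IntEnum

    class DataClass(IntEnum):
        # Higher value = less certain.
        OBSERVED = 1
        PUBLIC_RECORD = 2
        MODELED_ESTIMATE = 3
        AREA_LEVEL_PROXY = 4

    def derived_class(*inputs: DataClass) -> DataClass:
        """A derived value is only as reliable as its least reliable input."""
        return max(inputs)

    # Gross yield built from an observed rent and a modeled valuation is
    # itself no better than a modeled estimate.
    assert derived_class(DataClass.OBSERVED,
                         DataClass.MODELED_ESTIMATE) == DataClass.MODELED_ESTIMATE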

Freshness commitments

Different classes of data refresh on different cadences. When a data point's underlying provider lags, so will we.

  • Market indices (home values, rents, inventory): refreshed monthly as providers publish updates.
  • Federal rate and macro data (mortgage rates, housing starts, macro indicators): refreshed weekly.
  • Property-level data (listings, attributes, comps): pulled on demand when you run a scan or deep dive; results are cached for a short window so repeat lookups don't re-bill against your allowance (see the caching sketch after this list).
  • AI analyses are generated at request time; they reference whatever the freshest underlying data is at that moment.
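For the property-level caching behavior, here is a minimal sketch of the idea. The 15-minute TTL, the key shape, and the provider call are assumptions for illustration, not our actual settings:

    import time

    CACHE_TTL_SECONDS = 15 * 60  # assumed window, not our real setting
    _cache: dict[str, tuple[float, dict]] = {}

    def pull_from_provider(address: str) -> dict:
        # Stand-in for the real on-demand pull, which bills your allowance.
        return {"address": address, "pulled_at": time.time()}

    def fetch_property(address: str) -> dict:
        """Serve cached scan results while fresh; otherwise re-pull."""
        now = time.time()
        hit = _cache.get(address)
        if hit and now - hit[0] < CACHE_TTL_SECONDS:
            return hit[1]  # cache hit: no new pull, no allowance charge
        result = pull_from_provider(address)
        _cache[address] = (now, result)
        return result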

Where we know a specific number's "as of" date, we surface it alongside the number in the UI. When you see a date stamp like [metric, 2026-03], that is the underlying observation's month.

Known limitations

  • ZIP-level estimates are not property-specific. A good ZIP does not guarantee a good street; a weak ZIP does not rule out a strong block. Always cross-check an address against current comps.
  • Confidence varies across geographies. Dense urban ZIPs usually have thousands of observations behind a metric. Rural ZIPs or small counties can have very thin coverage — the same metric label can mean very different statistical confidence in different places.
  • Reporting lag. Some jurisdictions publish sales and assessments months after the fact. Markets can move materially within that window.
  • Model error. Valuation and rent models are statistical approximations, not appraisals. We try to show confidence ranges; when we can't, treat a single modeled number as a midpoint, not a guarantee.
  • Structural change. Models trained on historical data can underreact to regime shifts (a rate spike, a policy change, a new employer moving into or out of a region). Use AI narratives and macro context to sanity-check model outputs.

Interpreting confidence

Several metrics on the platform show a confidence badge or range. Here is how to read them (a sketch of one possible badge assignment follows the list):

  • High confidence — the metric is backed by direct observations with a recent as-of date and a large sample size. Safe to use as a quantitative anchor for a decision.
  • Medium confidence — modeled from a reasonable sample, but thin or older underlying data, or meaningful model error. Use for comparative / directional judgments.
  • Low confidence — either the underlying sample is very small, the data is stale, or we are using an area-level proxy in place of a missing address-level value. Treat as a hint, not an answer.
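As a rough illustration only, a badge assignment along these lines might look like the sketch below. The thresholds are invented for the example and are not the platform's actual cutoffs:

    def confidence_badge(is_direct: bool, sample_size: int,
                         months_old: int, is_area_proxy: bool) -> str:
        if is_area_proxy or sample_size < 10 or months_old > 18:
            return "low"     # proxy stand-in, very thin sample, or stale data
        if is_direct and sample_size >= 100 and months_old <= 3:
            return "high"    # direct observations, recent, large sample
        return "medium"      # modeled or thinner/older data: directional use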

A confidence label is about statistical support, not about whether the number "feels right." A high-confidence number can still be unhelpful if the underlying geography isn't the one you care about.

Report an inaccuracy

Spotted a number that looks wrong? Please flag it — this is how our numbers improve. Signed-in users can submit a specific metric via Settings → Data issues. Everyone else can email us at support@neighborhoodguru.org. Please include the URL, the specific metric, and what you believe the correct value to be — every report lands in our admin queue and is reviewed.