2026-06-10 / 7 min read

How CrimeScore Turns Raw Crime Data Into Neighborhood Safety Scores

A plain-language walkthrough of how CrimeScore combines normalized crime data, Census predictors, and model outputs into percentile-based scores.

methodology, crime risk score, crime data model

Crime data is messy before it becomes useful. Different cities publish different fields, use different offense labels, update on different schedules, and draw boundaries in different ways. If you want a national product, you cannot just stack those files together and call it a score.

CrimeScore starts by normalizing incident data into a shared schema. Source records are mapped into generic offense categories, geocoded or spatially joined to Census geography, and prepared for model-compatible reads.
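As a rough illustration of that normalization step, the sketch below maps source-specific offense labels into generic categories and reshapes a record into one shared structure. The `OFFENSE_MAP` entries, the `Incident` fields, and the per-source field mapping are all hypothetical; the actual CrimeScore schema is not described here.

```python
from dataclasses import dataclass

# Hypothetical lookup from source-specific labels to generic categories.
OFFENSE_MAP = {
    "burglary - residential": "burglary",
    "burg/forced entry": "burglary",
    "theft from vehicle": "larceny",
    "larceny-theft": "larceny",
    "agg assault": "assault",
}


@dataclass(frozen=True)
class Incident:
    """Shared schema (illustrative field names, not the real one)."""
    source: str
    category: str
    occurred_on: str   # ISO date string as published by the source
    block_group: str   # Census block group GEOID from geocoding or a spatial join


def normalize_label(raw_label: str) -> str:
    """Lowercase, collapse whitespace, then map to a generic category."""
    key = " ".join(raw_label.lower().split())
    return OFFENSE_MAP.get(key, "other")


def normalize_record(source: str, record: dict, fields: dict) -> Incident:
    """Build an Incident using a per-source mapping of field names."""
    return Incident(
        source=source,
        category=normalize_label(record[fields["offense"]]),
        occurred_on=record[fields["date"]],
        block_group=record[fields["geoid"]],
    )
```

Keeping the field mapping per source means a new city can be added by describing its columns rather than by writing new parsing code.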

The model looks at more than incident counts. It also uses Census-derived location predictors, such as rate-based socioeconomic and built-environment variables. Counts are converted to rates with explicit denominators, so a more populous area does not score better or worse simply because it has more people.
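A minimal sketch of the count-to-rate idea, assuming a per-1,000 scale and a population denominator. Both are illustrative choices; the denominators CrimeScore actually uses are not specified here.

```python
from typing import Optional


def rate_per_1000(count: float, denominator: float) -> Optional[float]:
    """Convert a raw count to a rate per 1,000 units of the denominator.

    Returns None when the denominator is missing or zero, so an
    undefined rate is never silently treated as zero.
    """
    if denominator is None or denominator <= 0:
        return None
    return 1000.0 * count / denominator


# Two areas with very different counts can share the same rate.
small_area = rate_per_1000(count=30, denominator=1_500)
large_area = rate_per_1000(count=300, denominator=15_000)
```

Here both areas come out at 20 incidents per 1,000 residents, even though one has ten times as many incidents as the other.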

Race and ethnicity are blocked from model features. CrimeScore also hides any contributor with a protected prefix from customer-facing score details. That does not make the model perfect, and it does not remove every fairness concern, but it is a core boundary in the product design.
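One way to picture this boundary is a name-prefix filter applied both when features are selected and when contributors are shown. The prefixes and feature names below are hypothetical examples, not CrimeScore's actual list.

```python
# Hypothetical protected prefixes; the real list is not published here.
PROTECTED_PREFIXES = ("race_", "ethnicity_")


def is_protected(feature_name: str) -> bool:
    """True when a feature name starts with any protected prefix."""
    return feature_name.lower().startswith(PROTECTED_PREFIXES)


def allowed_features(feature_names: list) -> list:
    """Drop protected features, preserving the original order."""
    return [name for name in feature_names if not is_protected(name)]


candidates = ["median_income_rate", "race_share_a", "vacancy_rate", "Ethnicity_share_b"]
usable = allowed_features(candidates)
```

Applying the same check in two places (model inputs and customer-facing contributors) means a feature that slips past one filter is still caught by the other.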

The current production model uses estimates that are computed ahead of time. The API does not run a model from scratch on every request. Instead, the deployed score table contains precomputed records that can be resolved quickly by location.
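The lookup pattern can be sketched as a keyed read against a precomputed table. Everything in this example is an assumption made for illustration: the in-memory dictionary, the record fields, the made-up scores, and the fallback from block group to county. The only fixed fact it relies on is that the first five digits of a Census block group GEOID are the state and county FIPS codes.

```python
from typing import Optional

# Hypothetical precomputed table keyed by Census GEOID.
# 12-digit keys are block groups; 5-digit keys are counties.
SCORE_TABLE = {
    "170318391001": {"score": 62, "level": "block_group"},
    "17031": {"score": 55, "level": "county"},
}


def resolve_score(block_group_geoid: str, table: dict) -> Optional[dict]:
    """Return the precomputed record for a block group, else its county."""
    record = table.get(block_group_geoid)
    if record is not None:
        return record
    # First five digits of a block group GEOID = state + county FIPS.
    return table.get(block_group_geoid[:5])
```

No model code runs at request time in this pattern; the cost of a request is a key lookup, which is what keeps responses fast.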

The raw model output is converted into a 0 to 100 score with a blend of county-relative and national percentile ranking. That matters because local context and national context both have value. A score should help compare neighborhoods inside a county, but it should not lose sight of broader national standing.
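To make the blend concrete, here is a small sketch that ranks a raw value against a county population and a national population, then averages the two percentiles. The equal 50/50 weight and the mid-rank tie handling are assumptions, and the sketch does not settle whether a higher score means safer or riskier; none of these details are stated for the production model.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value: float, population: list) -> float:
    """Mid-rank percentile (0 to 100) of value within population."""
    if not population:
        raise ValueError("population must not be empty")
    ordered = sorted(population)
    below = bisect_left(ordered, value)
    equal = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * equal) / len(ordered)


def blended_score(raw: float, county_values: list, national_values: list,
                  county_weight: float = 0.5) -> float:
    """Weighted blend of county-relative and national percentiles."""
    county_pct = percentile_rank(raw, county_values)
    national_pct = percentile_rank(raw, national_values)
    return county_weight * county_pct + (1.0 - county_weight) * national_pct


# Toy data: the same raw value ranks high locally but low nationally.
score = blended_score(3, [1, 2, 3, 4], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

In the toy data, the value 3 sits at the 62.5th percentile within its county and the 25th percentile nationally, so the equal-weight blend is 43.75.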

The final score response is deliberately simple: a score, a grade, resolved geography, component scores, and metadata. Teams on the Starter plan or higher can also request score details with contributor direction and impact.
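The five parts of the response could be modeled roughly as follows. The field names, component names, and values are placeholders chosen for illustration and are not the documented API schema.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ScoreResponse:
    """Illustrative response shape; not the actual API contract."""
    score: int
    grade: str
    geography: dict    # resolved geography, e.g. level and GEOID
    components: dict   # component scores by (hypothetical) name
    metadata: dict = field(default_factory=dict)


example = ScoreResponse(
    score=62,
    grade="C",
    geography={"level": "block_group", "geoid": "170318391001"},
    components={"violent": 58, "property": 65},
    metadata={"model_version": "example"},
)

# Plain dict form, ready to serialize as JSON.
payload = asdict(example)
```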

There are real limitations. Police incident data reflects reporting and enforcement patterns, not every crime that happened. Census data is lagged. Some geographies have weaker open-data coverage. Block group estimates are more local, but they can also be noisier than county-level estimates.

That is why CrimeScore should be treated as a location intelligence signal. It is useful for context, comparison, maps, and analytics. It is not a policing tool, not a personal score, and not a reason to make an adverse decision about an individual person.

FAQ

Common questions

Does CrimeScore run the model live for every API request?

No. The production API uses precomputed score records so responses can be resolved quickly by location.

Does CrimeScore use race as a feature?

No. Race and ethnicity are blocked from model features and hidden from customer-facing contributors.