Naming in the Age of Algorithms: How Apps and Data Shape What We Call Our Kids

Jack Lin · Founder & Editor-in-Chief · 10 min read
Naming Trend Analysis · SSA & Open Data

I built a baby name database. In doing so, I became one of the people responsible for shaping what names parents encounter when they search — and therefore, at least in some small way, what names get considered. I am not entirely sure what to do with that. The algorithm decides what comes first. What comes first gets clicked. What gets clicked gets weighted. The loop is quiet and invisible and probably real.

This piece is an attempt to think honestly about what I built, what it does, and what the proliferation of baby name apps and AI generators means for one of the most private decisions parents make.

How People Actually Use Baby Name Apps and Databases

The usage patterns on NamesPop tell a story I suspect is similar across baby name tools. People do not arrive with an open mind and an empty notebook, ready to discover whatever the data contains. They arrive with pre-existing leanings — a shortlist of maybe five to fifteen names — and they use the tool to validate, rank, and compare within that shortlist. They search names they already know. They explore "similar names" suggestions to find names adjacent to ones they already like. They check trend data to confirm or challenge their intuitions about whether a name is rising or falling.

What they rarely do, in my observation, is genuinely discover names they had never encountered. The tool functions as a validation and comparison mechanism more than a discovery mechanism. This sounds modest and possibly harmless. But there is a feedback structure embedded in it that deserves examination.

The Gravitational Pull of Rankings

Every tool that surfaces a "top 100" or "most popular" list is doing something with real consequences: it is making the most popular names more visible than the names that are slightly less popular. A name that ranks 45th is surfaced frequently, clicked frequently, considered frequently. A name that ranks 145th gets far less exposure, even if the quality difference between a 45th-rank name and a 145th-rank name is essentially zero.

The result is a gravitational pull toward the center of the distribution. Parents who are uncertain — which is most parents — tend to anchor on what they see most often, and what they see most often is what is most popular. The tool, by surfacing rankings, creates a selection pressure toward popular names that is independent of any individual parent's preference.

The Algorithm's Feedback Loop

Eli Pariser's The Filter Bubble: What the Internet Is Hiding from You (2011, Penguin Press) introduced the concept of filter bubbles in the context of news and social media, but the underlying mechanism is general: when algorithms surface content based on popularity or predicted relevance, they create a feedback loop that makes the popular more popular and the rare more rare. The algorithm is not neutral; it is a system with a bias built into its structure.

In baby naming, the feedback loop works like this: popular names appear prominently on naming sites → they get clicked and searched more → their metrics improve → they appear even more prominently → they get considered by more parents → they become more popular → the cycle continues. Cathy O'Neil, in Weapons of Math Destruction (2016, Crown), documented how algorithmic feedback loops in various social systems systematically amplify initial advantages. The baby naming version is relatively low-stakes compared to the cases O'Neil examines — credit scoring, criminal justice, hiring — but the mechanism is structurally similar.
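The loop described above can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions, not a model of any real naming site: every name starts with one click, and each simulated visitor clicks a name with probability proportional to its current click count. The function names and parameters (`simulate_clicks`, `top_share`, `n_names`, `n_parents`) are hypothetical and chosen only for illustration.

```python
import random

def simulate_clicks(n_names=50, n_parents=5000, feedback=True, seed=0):
    """Return click counts per name after n_parents simulated visits.

    Assumed model: with feedback on, a visitor clicks a name with
    probability proportional to its current click count (a stand-in for
    "sorted by popularity"). With feedback off, every name is equally
    likely to be clicked.
    """
    rng = random.Random(seed)
    clicks = [1] * n_names              # every name starts on equal footing
    name_ids = list(range(n_names))
    for _ in range(n_parents):
        weights = clicks if feedback else [1] * n_names
        chosen = rng.choices(name_ids, weights=weights, k=1)[0]
        clicks[chosen] += 1             # the click feeds back into future weights
    return clicks

def top_share(clicks, k=5):
    """Fraction of all clicks captured by the k most-clicked names."""
    return sum(sorted(clicks, reverse=True)[:k]) / sum(clicks)

with_loop = top_share(simulate_clicks(feedback=True))
without_loop = top_share(simulate_clicks(feedback=False))
print(f"top-5 share with feedback:    {with_loop:.2f}")
print(f"top-5 share without feedback: {without_loop:.2f}")
```

In this sketch the names are identical in "quality" by construction, so any concentration at the top in the feedback run comes from early random luck being amplified, which is the mechanism the paragraph describes.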

SSA data is at least consistent with this, though it does not prove it. The diversity index, a measure of how many distinct names are in use that also accounts for how evenly births are spread across them, has been rising steadily since the 1970s, reflecting the general drift toward more individualistic naming. But the rate of increase has not continued to accelerate in the post-2010 smartphone era, which is when name apps became ubiquitous. One possible interpretation: the long tail of naming diversity has grown, but the very top of the distribution has become stickier, because those names benefit most from algorithmic amplification. The names that are already very popular become even more durable, even as the names in the middle and bottom of the distribution continue to diversify.
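The article does not specify how the diversity index is calculated. One common way to express "distinct names, weighted for distribution" is the effective number of names, the exponential of the Shannon entropy of the name shares. The sketch below assumes that definition and pairs it with a simple top-k share, so the long tail and the stickiness at the top can be looked at separately. The counts are made up for illustration and are not SSA figures.

```python
import math

def effective_name_count(counts):
    """Effective number of names: exp(Shannon entropy of the name shares).

    Equals the number of distinct names when all are equally common and
    shrinks as births concentrate on a few names.
    """
    total = sum(counts.values())
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )
    return math.exp(entropy)

def top_k_share(counts, k=10):
    """Share of all births going to the k most common names."""
    total = sum(counts.values())
    return sum(sorted(counts.values(), reverse=True)[:k]) / total

# Illustrative, invented counts: same four names, different concentration.
even = {"Ava": 100, "Mia": 100, "Zoe": 100, "Isla": 100}
skewed = {"Ava": 370, "Mia": 10, "Zoe": 10, "Isla": 10}

print(effective_name_count(even))     # four equally common names
print(effective_name_count(skewed))   # same four names, far less diverse
print(top_k_share(skewed, k=1))       # share held by the single top name
```

Tracking both numbers per birth year would be one way to check the interpretation offered above: the effective count can keep rising while the top-k share stops falling.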

AI Name Generators: A New Layer

Large language model-based baby name generators are now abundant, and they add a different and in some ways more concerning layer to this story.

An LLM trained on baby name content (Nameberry articles, BabyCenter forums, SSA name pages) learns a probability distribution over names that reflects what is already popular and already written about. When it generates name suggestions, it is in effect sampling from that distribution. This means AI generators are, at a structural level, conservative: they recommend names that are already popular, already associated with positive coverage, already embedded in the naming discourse. The names you are unlikely to encounter in AI generator output are the names that have not yet been discovered, the names from communities underrepresented in the training data, the names at the genuine edge of the distribution.
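A rough way to see why popularity-weighted sampling is conservative is to count how many distinct names ever surface across many requests. The sketch below is a toy model, not a description of how any particular generator works: it assumes name frequencies follow a Zipf-like curve (weight 1/rank) and compares that with sampling every name equally. The helper `distinct_names_seen` and all parameters are hypothetical.

```python
import random

def distinct_names_seen(weights, n_requests=200, per_request=10, seed=0):
    """Count distinct names surfaced across repeated suggestion requests.

    Each request draws `per_request` names (with replacement) according to
    `weights`; the return value is how many different names appeared at
    least once over all requests.
    """
    rng = random.Random(seed)
    name_ids = list(range(len(weights)))
    seen = set()
    for _ in range(n_requests):
        seen.update(rng.choices(name_ids, weights=weights, k=per_request))
    return len(seen)

n_names = 5000
popularity_weighted = [1 / rank for rank in range(1, n_names + 1)]  # assumed Zipf-like
uniform = [1] * n_names

print("distinct names, popularity-weighted:", distinct_names_seen(popularity_weighted))
print("distinct names, uniform:            ", distinct_names_seen(uniform))
```

Under these assumptions the popularity-weighted sampler keeps returning to the head of the list, so the same number of requests reaches fewer distinct names than uniform sampling does.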

Cass Sunstein, in #Republic: Divided Democracy in the Age of Social Media (2017, Princeton University Press), writes about information homogenization — how algorithmic curation of information diets can narrow what people encounter, even as the total available information expands. The same paradox applies here. There are more baby names available than ever before, in the sense that more names have been given and more naming traditions are accessible to parents. But AI generators, by sampling from a popularity-weighted distribution, produce recommendations that are systematically narrower than the possibility space they nominally draw from.

The Training Data Problem

Motahhare Eslami and colleagues, in their 2015 CHI conference paper "I Always Assumed That I Wasn't Really That Close to [Her]: Reasoning about Invisible Algorithms in News Feeds," documented that users are often unaware of how algorithmic curation shapes what they see, and that this unawareness leads them to attribute greater neutrality to algorithmic outputs than is warranted. The same dynamic applies to AI name generators. A parent who asks an AI for "unique baby names" receives a list that feels personal and specific but is actually a prediction based on what is popular in the existing naming discourse. The uniqueness is, in most cases, relative rather than absolute.

Filter Bubbles in Name Searches

There is a personalization dimension to this too. Naming tools that use search history or demographic signals to tailor their recommendations — and many do, implicitly through their filtering interfaces — create personalized name landscapes that can differ substantially across demographic groups.

A parent who searches primarily for Irish names will receive recommendations weighted toward Irish names. A parent who indicates a preference for "unique" names will be shown names from the long tail of the distribution. These filters feel like helpful personalization, and they are — they surface more relevant content. But they also narrow the scope of what the parent encounters. The Irish-name-focused parent does not see the Hebrew name that might have been perfect. The unique-name-seeking parent does not see the currently-popular name that their community might find appealing.

I am not arguing against personalization — a tool that shows you everything is unusable. But the narrowing is real, and users should understand that the names they encounter in a personalized tool are a curated slice, not the full landscape.

The Case for Data-Informed (Not Data-Determined) Naming

In building NamesPop, I made some choices that I think about more now that I have been writing about this.

I chose to show trend data prominently — to let parents see whether a name is rising or falling in popularity, because I think that information is genuinely useful and helps parents make more informed decisions. But showing trend data also functions as an amplifier: a name with a "rising" label gets more clicks, more consideration, more social proof. I am aware that in showing the trend, I may be contributing to the trend.

I try to surface the long tail alongside the top names, to make sure that names outside the top 100 are findable through filtering and search. But the default sort is by popularity, because that is what most users want. The defaults are not neutral — they shape behavior in the direction of the default.

What would it mean to build a genuinely neutral naming tool? I am not sure it is possible. Every design choice is a choice about what to surface and how. The interface is a curatorial act even when it tries not to be. The honest version of this is not to pretend neutrality but to be transparent about the choices being made and their likely effects.

What Would Responsible Naming Technology Look Like?

A few things I think would make naming tools more honest about what they do:

Explicit disclosure that rankings are self-reinforcing — that using a popularity ranking as a decision input will, in aggregate, make popular names more popular. Not a warning, but an acknowledgment.

Random discovery features alongside ranking-based surfaces — a genuine "I have no preference and want to see something unexpected" mode that samples from outside the popularity distribution.

Transparency about AI-generated content when it appears, including an honest note that AI recommendations are probability-weighted toward existing popular names.

Demographic breadth in naming databases — ensuring that names from communities underrepresented in English-language naming media are findable, not buried in a long tail that most users never reach.
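The random discovery mode suggested above could be sketched as follows. This is one possible reading, not an existing NamesPop feature: it assumes "outside the popularity distribution" means dropping the top-ranked names and then picking uniformly from the rest, so a name's count no longer affects its chance of being shown. The function `discovery_sample`, its parameters, and the sample data are hypothetical.

```python
import random

def discovery_sample(popularity, k=5, skip_top=100, seed=None):
    """Pick k names uniformly at random from outside the top `skip_top`.

    `popularity` maps name -> count. Names are ranked by count, the head
    of the ranking is discarded, and the remainder is sampled without
    regard to popularity.
    """
    rng = random.Random(seed)
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    tail = ranked[skip_top:]
    return rng.sample(tail, min(k, len(tail)))

# Invented data: 500 placeholder names with strictly decreasing counts.
popularity = {f"name{i}": 1000 - i for i in range(500)}
print(discovery_sample(popularity, k=5, skip_top=100, seed=1))
```

Uniform sampling is a deliberate design choice here: any weighting by clicks or counts would reintroduce the feedback the feature is meant to sidestep.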

None of this solves the fundamental problem, which is that any tool that helps you choose will also, by virtue of what it surfaces and how it surfaces it, influence the choice. The algorithm is not neutral. The interface is not neutral. The data is not neutral — it is a record of what people have done, which is itself shaped by what previous generations of tools and media made visible.

I built NamesPop because I wanted to make it easier for parents to explore name data. I still think that is a worthwhile thing to do. But I try to hold it alongside the awareness that making something easier to find can also, in ways that are hard to trace, make certain things harder to see.

Data source: U.S. Social Security Administration. Analysis by NamesPop.
