Glitch alert: Why are there gaps in the recent New York baby name data?

The baby name Esty (a diminutive of Esther) is primarily used in the state of New York, thanks to the large Jewish community in New York City.

But the name was also featured in the Emmy-winning Netflix series Unorthodox a couple of years ago. So, last year, I checked the Esty data (both the national data and the New York data) to see if the show had influenced the name’s usage.

It may have — Esty did indeed see its highest-ever usage both nationally and in New York in 2020. Even more intriguingly, though, I noticed what seemed to be gaps in the recent NY data. Specifically, New York had no data on the name Esty for the years 2016, 2018, and 2019.

Check it out:

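| Year | Esty usage in the U.S. | Esty usage in New York |
|------|------|------|
| 2021 | 63 | 57 |
| 2020 | 68 | 60 |
| 2019 | 59 |  |
| 2018 | 41 |  |
| 2017 | 36 | 36 |
| 2016 | 43 |  |
| 2015 | 39 | 37 |
| 2014 | 37 | 35 |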

It’s possible that the New York usage of Esty simply dropped below the 5-baby minimum during those particular years. As per the SSA:

To safeguard privacy, we exclude from our tabulated lists of names those that would indicate, or would allow the ability to determine, names with fewer than 5 occurrences in any geographic area.

If that were the case, though, you’d expect to see corresponding dips in the national usage. And we don’t see that here.

It seems more likely to me that some of the New York data is simply…missing.
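To make that reasoning concrete, here is a small arithmetic sketch in Python using the Esty figures from the table above. It assumes that every New York birth is also counted in the national total, so the national count minus the New York count gives the births outside New York. The names `esty`, `SSA_MINIMUM`, and `min_outside_ny` are illustrative only.

```python
# Esty counts transcribed from the table above: year -> (U.S., New York).
# None marks a year with no New York entry.
esty = {
    2021: (63, 57),
    2020: (68, 60),
    2019: (59, None),
    2018: (41, None),
    2017: (36, 36),
    2016: (43, None),
    2015: (39, 37),
    2014: (37, 35),
}

SSA_MINIMUM = 5  # names with fewer than 5 occurrences are withheld


def min_outside_ny(us_count, ny_count):
    """Lower bound on births outside New York for one year.

    If New York is listed, the difference is exact (under the assumption
    that national totals include all New York births). If New York is not
    listed, suppose the privacy rule applied, so New York had at most 4.
    """
    if ny_count is None:
        return us_count - (SSA_MINIMUM - 1)
    return us_count - ny_count


outside = {year: min_outside_ny(us, ny) for year, (us, ny) in esty.items()}

for year in sorted(outside, reverse=True):
    flag = "" if esty[year][1] is not None else "  <- New York missing"
    print(year, outside[year], flag)
```

In the years where New York is listed, no more than 8 babies named Esty were born elsewhere. For the privacy rule to explain the gap years, at least 37 to 55 would have had to be born outside New York in those years, which is hard to square with the rest of the table.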

So the next question is: Are there gaps in the NY data for other names as well?

To check, I grabbed all the names with heavy New York usage listed in the 2021 state-by-state post and the 2020 state-by-state post — 34 names in total — and looked at the data.
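A check like this can be sketched in a few lines of Python. This is only an illustration, not the exact procedure used for this post. It assumes the state file rows are laid out as state, sex, year, name, count (for example `NY,F,2021,Esty,57`), and the helper names `state_counts` and `gap_years` are made up for the example. The sample rows are built from the Esty table above rather than copied from a real file.

```python
import csv
import io


def state_counts(lines, state, name, sex):
    """Collect year -> count for one name from state-file rows.

    Assumes each row looks like "NY,F,2021,Esty,57"
    (state, sex, year, name, count).
    """
    counts = {}
    for row in csv.reader(lines):
        if len(row) != 5:
            continue  # skip blank or malformed rows
        row_state, row_sex, year, row_name, count = row
        if (row_state, row_sex, row_name) == (state, sex, name):
            counts[int(year)] = int(count)
    return counts


def gap_years(national, in_state, first_year, last_year):
    """Years where the name is listed nationally but absent from the state data."""
    return [
        year
        for year in range(first_year, last_year + 1)
        if year in national and year not in in_state
    ]


# Sample rows built from the Esty table above (not a real file excerpt).
sample = io.StringIO(
    "NY,F,2014,Esty,35\n"
    "NY,F,2015,Esty,37\n"
    "NY,F,2017,Esty,36\n"
    "NY,F,2020,Esty,60\n"
    "NY,F,2021,Esty,57\n"
)
esty_us = {2014: 37, 2015: 39, 2016: 43, 2017: 36,
           2018: 41, 2019: 59, 2020: 68, 2021: 63}

esty_ny = state_counts(sample, "NY", "Esty", "F")
print(gap_years(esty_us, esty_ny, 2014, 2021))  # [2016, 2018, 2019]
```

A gap only counts here when the name is present nationally that year, so a name that vanished everywhere would not be flagged.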

The result? Exactly half had similar gaps.

Here’s what I found…

The boy name Cheskel (a form of Chatzkel, which is based on Ezekiel) didn’t appear in the New York state data for 5 years straight:

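| Year | Cheskel usage in the U.S. | Cheskel usage in New York |
|------|------|------|
| 2021 | 29 | 29 |
| 2020 | 18 |  |
| 2019 | 27 |  |
| 2018 | 30 |  |
| 2017 | 23 |  |
| 2016 | 27 |  |
| 2015 | 22 | 21 |
| 2014 | 25 | 23 |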

The girl name Chany (a diminutive of Channah) didn’t appear in the New York state data for 4 years straight:

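| Year | Chany usage in the U.S. | Chany usage in New York |
|------|------|------|
| 2021 | 65 | 58 |
| 2020 | 56 |  |
| 2019 | 60 |  |
| 2018 | 55 |  |
| 2017 | 56 |  |
| 2016 | 55 | 55 |
| 2015 | 44 | 43 |
| 2014 | 42 | 41 |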

The boy name Naftuli (based on the Biblical name Naphtali) didn’t appear in the New York state data for 4 years straight:

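| Year | Naftuli usage in the U.S. | Naftuli usage in New York |
|------|------|------|
| 2021 | 29 | 29 |
| 2020 | 33 |  |
| 2019 | 33 |  |
| 2018 | 27 |  |
| 2017 | 24 |  |
| 2016 | 33 | 33 |
| 2015 | 24 | 22 |
| 2014 | 29 | 25 |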

The girl name Idy didn’t appear in the New York state data for 4 years:

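| Year | Idy usage in the U.S. | Idy usage in New York |
|------|------|------|
| 2021 | 46 |  |
| 2020 | 47 | 47 |
| 2019 | 31 | 26 |
| 2018 | 29 |  |
| 2017 | 26 |  |
| 2016 | 25 |  |
| 2015 | 17 | 16 |
| 2014 | 15 | 13 |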

The boy name Shmiel (a form of Shmuel, which is based on Samuel) didn’t appear in the New York state data for 4 years:

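| Year | Shmiel usage in the U.S. | Shmiel usage in New York |
|------|------|------|
| 2021 | 40 | 40 |
| 2020 | 45 |  |
| 2019 | 38 | 38 |
| 2018 | 31 |  |
| 2017 | 35 |  |
| 2016 | 44 |  |
| 2015 | 44 | 44 |
| 2014 | 38 | 37 |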

The girl name Yides (a diminutive of Yehudit, which is a form of Judith) didn’t appear in the New York state data for 4 years:

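| Year | Yides usage in the U.S. | Yides usage in New York |
|------|------|------|
| 2021 | 39 |  |
| 2020 | 34 | 34 |
| 2019 | 51 |  |
| 2018 | 32 | 32 |
| 2017 | 39 |  |
| 2016 | 35 |  |
| 2015 | 42 | 42 |
| 2014 | 38 | 38 |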

The boy name Berl didn’t appear in the New York state data for 4 years:

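| Year | Berl usage in the U.S. | Berl usage in New York |
|------|------|------|
| 2021 | 19 |  |
| 2020 | 17 | 17 |
| 2019 | 23 | 23 |
| 2018 | 18 |  |
| 2017 | 16 |  |
| 2016 | 22 |  |
| 2015 | 21 | 21 |
| 2014 | 19 | 18 |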

The girl name Frady (a diminutive of Freyde) didn’t appear in the New York state data for 3 years straight:

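| Year | Frady usage in the U.S. | Frady usage in New York |
|------|------|------|
| 2021 | 25 | 25 |
| 2020 | 22 |  |
| 2019 | 23 |  |
| 2018 | 21 |  |
| 2017 | 21 | 21 |
| 2016 | 20 | 20 |
| 2015 | 17 | 14 |
| 2014 | 19 | 19 |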

The girl name Pessy (a diminutive of Batya, which is a form of the Biblical name Bithiah) didn’t appear in the New York state data for 3 years:

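| Year | Pessy usage in the U.S. | Pessy usage in New York |
|------|------|------|
| 2021 | 63 | 51 |
| 2020 | 62 |  |
| 2019 | 41 |  |
| 2018 | 54 | 46 |
| 2017 | 41 | 33 |
| 2016 | 34 |  |
| 2015 | 46 | 45 |
| 2014 | 42 | 40 |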

The boy name Lipa (a short form of Lipman, which is based on the name Liberman) didn’t appear in the New York state data for 3 years:

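| Year | Lipa usage in the U.S. | Lipa usage in New York |
|------|------|------|
| 2021 | 50 | 44 |
| 2020 | 48 | 43 |
| 2019 | 53 |  |
| 2018 | 44 | 38 |
| 2017 | 37 |  |
| 2016 | 42 |  |
| 2015 | 43 | 40 |
| 2014 | 50 | 50 |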

The boy name Usher (a form of Asher) didn’t appear in the New York state data for 3 years:

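| Year | Usher usage in the U.S. | Usher usage in New York |
|------|------|------|
| 2021 | 41 | 36 |
| 2020 | 37 |  |
| 2019 | 58 |  |
| 2018 | 36 | 29 |
| 2017 | 34 |  |
| 2016 | 41 | 35 |
| 2015 | 45 | 40 |
| 2014 | 31 | 28 |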

The boy name Avrum (a form of Abraham) didn’t appear in the New York state data for 3 years:

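| Year | Avrum usage in the U.S. | Avrum usage in New York |
|------|------|------|
| 2021 | 42 | 34 |
| 2020 | 37 | 28 |
| 2019 | 24 |  |
| 2018 | 29 | 24 |
| 2017 | 27 |  |
| 2016 | 25 |  |
| 2015 | 17 | 16 |
| 2014 | 23 | 22 |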

The boy name Lazer (a form of Eliezer) didn’t appear in the New York state data for 3 years:

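| Year | Lazer usage in the U.S. | Lazer usage in New York |
|------|------|------|
| 2021 | 40 |  |
| 2020 | 37 | 31 |
| 2019 | 45 | 39 |
| 2018 | 29 |  |
| 2017 | 28 |  |
| 2016 | 43 | 35 |
| 2015 | 29 | 28 |
| 2014 | 33 | 31 |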

The boy name Yossi (a diminutive of Yosef) didn’t appear in the New York state data for 3 years:

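| Year | Yossi usage in the U.S. | Yossi usage in New York |
|------|------|------|
| 2021 | 35 | 29 |
| 2020 | 30 |  |
| 2019 | 23 | 18 |
| 2018 | 30 | 24 |
| 2017 | 21 |  |
| 2016 | 29 |  |
| 2015 | 20 | 19 |
| 2014 | 25 | 19 |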

The girl name Goldy (a diminutive of Golda) didn’t appear in the New York state data for 2 years:

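| Year | Goldy usage in the U.S. | Goldy usage in New York |
|------|------|------|
| 2021 | 69 | 57 |
| 2020 | 63 | 53 |
| 2019 | 51 | 44 |
| 2018 | 62 | 54 |
| 2017 | 56 |  |
| 2016 | 46 |  |
| 2015 | 48 | 42 |
| 2014 | 28 | 22 |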

And, finally, the boy name Nachman didn’t appear in the New York state data for 2 years:

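| Year | Nachman usage in the U.S. | Nachman usage in New York |
|------|------|------|
| 2021 | 27 | 18 |
| 2020 | 23 | 17 |
| 2019 | 18 |  |
| 2018 | 20 | 12 |
| 2017 | 21 |  |
| 2016 | 21 | 16 |
| 2015 | 28 | 24 |
| 2014 | 27 | 20 |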

If the gap years matched up more closely with one another — as with the glitch of 1989, for instance — I could chalk it up to a few incomplete batches of data.

But they don’t, so…I don’t know what to make of this.
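One way to see how poorly the gap years line up is to tally them across all 17 names. The sketch below does that in Python; the gap years are read off the tables above, and the variable names `gaps` and `per_year` are illustrative.

```python
from collections import Counter

# Years with no New York entry for each name, read off the tables above.
gaps = {
    "Esty": [2016, 2018, 2019],
    "Cheskel": [2016, 2017, 2018, 2019, 2020],
    "Chany": [2017, 2018, 2019, 2020],
    "Naftuli": [2017, 2018, 2019, 2020],
    "Idy": [2016, 2017, 2018, 2021],
    "Shmiel": [2016, 2017, 2018, 2020],
    "Yides": [2016, 2017, 2019, 2021],
    "Berl": [2016, 2017, 2018, 2021],
    "Frady": [2018, 2019, 2020],
    "Pessy": [2016, 2019, 2020],
    "Lipa": [2016, 2017, 2019],
    "Usher": [2017, 2019, 2020],
    "Avrum": [2016, 2017, 2019],
    "Lazer": [2017, 2018, 2021],
    "Yossi": [2016, 2017, 2020],
    "Goldy": [2016, 2017],
    "Nachman": [2017, 2019],
}

# Count how many names are missing from the New York data in each year.
per_year = Counter(year for years in gaps.values() for year in years)

for year in sorted(per_year):
    print(year, per_year[year], "of", len(gaps), "names missing")
```

By this count, 2017 is the most affected year (14 of the 17 names) and 2021 the least (4), but no single year is missing for every name.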

Do you guys have any thoughts, or theories?

(If you’d like to examine the New York data for yourself, download the “State-specific data” file from the SSA website.)

Sources: Behind the Name, SSA
Image by Michael Dziedzic from Unsplash

7 thoughts on “Glitch alert: Why are there gaps in the recent New York baby name data?”

  1. Wow! I have so many questions. When did this start? Are those the only names missing or are there others? (much harder to tell with names that aren’t exclusively in the NY data anyway…). I looked at the file and didn’t see a significant dip in the number of unique names listed per year, so I’d guess there are only a small number of names missing overall. What a crazy thing to stumble across!

    I would almost be tempted to try and email the SSA and ask, although I don’t know if they would be able to answer or not.

  2. That’s really mysterious. I have a collection of namesbystate.zip files going back almost a decade; the lacunae are consistent over time and also appear in the older versions of that file.

    My best guess is that they somehow managed to mis-code the state where the babies were born as something nonexistent (or as something denoting birth outside the USA). I checked whether the missing babies turned up in another state. New Jersey would be a good candidate, but I looked at all states and the territories and did not find them.

  3. @elbowin – Only births in the 50 states and DC are figured into the general national SSA list. Babies born in the territories and outside of the US (even though they may be US citizens) are not included in the master list.

  4. @k8eshore – It seems to have started during the 2nd half of the 2010s. Here are a few more names with NY gaps:

    Girl names: Brucha, Frimy, Nechuma, Rifka
    Boy names: Chesky, Pinchus, Shaul, Yida

    I don’t know how many more there might be. (I’ve checked 70 or 80 names so far.)

    I’ve tried to get in touch with the SSA several times over the years, but, unfortunately, they’ve never replied to any of my name-related questions.

  5. @elbowin – Very smart of you to keep all those older files! Thanks for checking them.

    Given the location, and the fact that all the affected names we’ve seen so far have been Jewish, I have to wonder if the missing/mis-coded data can’t be traced back to a particular NYC hospital or health system that caters to the Orthodox Jewish community.

  6. I wonder which hospital, though. If that were the case, almost all of those babies would have probably been born in one hospital, which is a little odd.

  7. Yes, that would be odd. I think a health care system (with multiple hospitals, but centralized data-processing) would make more sense.

    Or, maybe the issue is on the government level. Maybe the hospitals’ data is correct, but the local government office collecting that data is (inadvertently) altering it.
