Problems with the SSA’s baby name data

The SSA’s baby name data isn’t perfect. In fact, when it comes to the earliest years on record, some of the data is downright misleading.

I think most of us who write about names regularly are aware of the issues with the data set, but we don’t mention it as often as we should. (Me included, of course.)

So I’m grateful that blogger David Taylor has created graphics to illustrate the problematic aspects of the U.S. Baby Names dataset. I can’t improve upon his explanations, so I’ll just embed his slideshow below and recommend that everyone go check out the original post.


4 thoughts on “Problems with the SSA’s baby name data”

Ellie says:

September 22, 2014 at 3:02 pm

Interesting on the history of the data, but I expected numbers on more modern records as well, like counting how many babies are given names below the top 1000 (and the names given to fewer than five babies), combining name spellings, etc. I don’t care too much about the data anyway and think people overthink it. But I’m curious about more matters.
Brooke Cussans says:

September 22, 2014 at 5:03 pm

What an interesting read – thanks for re-posting it :)
Diane says:

September 24, 2014 at 7:49 am

That was really interesting and included some new info about the data as well as confirming some ideas I’ve had for a long time. One is the use of nicknames as full names in the earlier years–it’s easy to get the impression from the SSA data that a large number of men were officially named Joe, Tom, Bill, Bob, etc. I’ve found it helpful to compare the SSA lists to counts that were made in earlier, pre-Internet years (for example, in the books by Dunkling & Gosling). There you don’t find Joe at all, though it’s #28 on the SSA list for 1890 and #34 in 1925. That leads me to conclude that most of those Joes were officially Josephs. On the other hand, these lists DO indicate that Harry was considerably more popular than Harold in the 1880s and 1890s, but Harold had taken the lead by the 1920s.
Thanks for posting this!
Pingback: US Baby Name Popularity Visualizer - Engaging Data


