Read and format project data
import pandas as pd
import plotly.express as px

df = pd.read_csv("https://github.com/byuidatascience/data4names/raw/master/data-raw/names_year/names_year.csv")
Course DS 250
Juan Zurita
Analyzing the names_year.csv dataset using Pandas and Plotly Express reveals valuable insights into the evolution of names over time. By plotting the data, we can observe trends in naming preferences and popularity across different years. Key insights include identifying the emergence and decline of certain names, tracking the overall popularity of specific names over time, and potentially uncovering cultural or societal influences on naming trends. Additionally, visualizing the data allows the comparison of naming patterns between genders or regions, providing an understanding of how names have evolved throughout history.
How does your name at your birth year compare to its use historically?
Juan is a name that has been used often throughout the years. Its popularity was increasing, and it was definitely popular around 2001, the year I was born. After 2006, however, we can see a drop in the use of this name.
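The comparison above can also be checked numerically. The following is a minimal sketch, assuming the data has the `name`, `year`, and `Total` columns used elsewhere in this report; `compare_to_peak` is a hypothetical helper name, not part of the original analysis.

```python
import pandas as pd


def compare_to_peak(df: pd.DataFrame, name: str, birth_year: int) -> dict:
    """Compare a name's count in one year with its all-time peak.

    Assumes `df` has the columns `name`, `year`, and `Total`.
    """
    one_name = df[df["name"] == name]
    if one_name.empty:
        raise ValueError(f"no rows found for name {name!r}")

    # Row with the highest yearly count for this name.
    peak_row = one_name.loc[one_name["Total"].idxmax()]

    # Count in the birth year (0 if that year is missing from the data).
    birth_total = one_name.loc[one_name["year"] == birth_year, "Total"].sum()

    return {
        "peak_year": int(peak_row["year"]),
        "peak_total": float(peak_row["Total"]),
        "birth_total": float(birth_total),
        "share_of_peak": float(birth_total) / float(peak_row["Total"]),
    }


# Hypothetical usage with the `df` loaded at the top of this report:
# compare_to_peak(df, "Juan", 2001)
```

A `share_of_peak` close to 1 would support the reading that the birth year sits near the name's peak. A much smaller value would suggest the peak came earlier or later.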
If you talked to someone named Brittany on the phone, what is your guess of his or her age? What ages would you not guess?
My first guess is that Brittany would be young, because that is just the impression I get from the name. After analyzing the data, I realized that she could be between 24 and 36 years old. This chart shows how many people were named Brittany.
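One possible way to turn the name counts into an age guess is sketched below. It treats a birth year as likely when its count is at least some share of the peak count, then converts those years to ages. The helper name `likely_ages`, the 0.5 cutoff, and the reference year in the usage comment are all illustrative assumptions, not the method behind the range quoted above.

```python
def likely_ages(years, totals, as_of_year, share=0.5):
    """Return (youngest, oldest) ages for a name's busiest birth years.

    A birth year counts as "busy" when its total is at least `share`
    of the peak total. `share=0.5` is an arbitrary illustrative cutoff.
    """
    pairs = list(zip(years, totals))
    if not pairs:
        raise ValueError("no data")

    # Highest yearly count for the name.
    peak = max(total for _, total in pairs)

    # Birth years whose count reaches the cutoff.
    busy_years = [year for year, total in pairs if total >= share * peak]

    # The latest busy birth year gives the youngest age, the earliest the oldest.
    return as_of_year - max(busy_years), as_of_year - min(busy_years)


# Hypothetical usage with the `df` loaded at the top of this report
# (2024 is only an example reference year):
# brittany = df.query("name == 'Brittany'")
# likely_ages(brittany["year"], brittany["Total"], as_of_year=2024)
```

Ages outside the returned range are the ones to avoid guessing. Raising `share` narrows the range and lowering it widens the range.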
Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names. What trends do you notice?
From this analysis I realized that all four names are being used less, especially after the 1950s. Culture is slowly leaving some Christian customs behind, and this is one of them.
# Keep only the four names, restricted to 1920-2000
names = (
    df
    .query("name in ['Mary', 'Martha', 'Peter', 'Paul']")
    .query("year >= 1920 and year <= 2000")
)
# Draw one line per name
names_chart = px.line(names, x='year', y='Total', color='name', title='Mary, Martha, Peter and Paul Through Time')
names_chart.update_layout(xaxis_title='Year', yaxis_title='Total Count', legend_title=' ')
names_chart.show()
Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage?
For this question I chose “Forrest” from the movie “Forrest Gump,” which came out in 1994. This movie really impacted society, and I would call it a hit because I loved it. The movie definitely had a big impact on the usage of the name, and after that year the name was used much less.
# Filter to the name "Forrest"
Forrest = df.query("name == 'Forrest'")
chart4 = px.line(Forrest, x='year', y='Total', title='Forrest Through Time')
# Mark the 1994 release of "Forrest Gump"; x is a number to match the numeric year axis
chart4.add_vline(x=1994, line_dash="dash", line_color="yellow")
chart4.update_layout(xaxis_title='Year', yaxis_title='Total Count')
chart4.show()