Exploring Words in PoKi Poems!

Education Poetry

This post explores language data from poems written by students from first to twelfth grade.

Will Stevens, Robin Hardwick, Ryan Bruder
2021-03-07

Introduction

PoKi Poems is a data set compiled by whipson on GitHub. It contains poems written by students from elementary school through high school, and it also tracks the emotions expressed in each poem. Having read a few of the poems ourselves, we wondered how poems change and differ from grade to grade. From there we began exploring the emotional growth of students over time. Namely, we asked: How great a change occurs in their range of emotions? What causes these changes? And what shows us these changes?

But why use PoKi Poems in particular?

Initial Wrangling

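# Packages used throughout this post
library(dplyr)     # pipes, joins, grouping, summaries
library(tidyr)     # replace_na(), pivot_longer(), separate_rows()
library(ggplot2)   # plots
library(gt)        # tables
library(tidytext)  # stop_words (assumed source of the stop-word list)

# Poems with metadata; id is stored as character so all three tables join on the same type
poki <- read.csv("https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids/master/poki.csv") %>%
  transform(id = as.character(id))

# Lemmatized text of each poem
poki.lem <- read.csv("https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids/master/poki-lem.csv") %>%
  transform(id = as.character(id)) %>%
  rename(lem = text) %>%
  select(id, lem)

# Per-poem analysis variables; grade and author are dropped because poki already has them
poki.analysis <- read.csv("https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids/master/poki-analysis.csv") %>%
  transform(id = as.character(id)) %>%
  select(-grade, -author)

# Combine the three tables by poem id
poki.full <- right_join(poki, poki.lem, by = "id")
poki.full <- right_join(poki.full, poki.analysis, by = "id")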

Here, we joined the three data sets published by the GitHub user whipson. With everything in one table, we can easily look up any variable and compare variables against each other.
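To make the join step concrete, here is a minimal sketch on made-up data. The toy tables and their values are hypothetical and not taken from PoKi. It shows that right_join() keeps every row of the right-hand table, so an id that is missing from the left table comes through with NA in the left table's columns.

```r
library(dplyr)

# Hypothetical toy tables standing in for poki and poki.analysis
poems <- data.frame(id = c("1", "2"), title = c("Hi", "Waiting"))
scores <- data.frame(id = c("2", "3"), joy = c(0.4, 0.7))

# right_join() keeps all rows of `scores`; id "3" has no title, so title is NA there
joined <- right_join(poems, scores, by = "id")
print(joined)
```

If the goal were instead to keep every poem regardless of whether it has scores, left_join() would be the counterpart.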

poki.emotion <- poki.full %>%
  group_by(grade) %>% 
  replace_na(list(anger = 0, fear = 0, sadness = 0, joy = 0)) %>%
  summarize(anger = mean(anger), 
            fear = mean(fear), 
            sadness = mean(sadness), 
            joy = mean(joy))  %>%
  pivot_longer(!grade, 
   names_to = "emotion", values_to = "value") %>%
  mutate(emotion = case_when(emotion == "anger" ~ "Anger",
                             emotion == "fear" ~ "Fear",
                             emotion == "joy" ~ "Joy",
                             emotion == "sadness" ~ "Sadness"))

After joining the tables, we computed the mean value of each emotion for every grade and reshaped the result so that each row holds one grade, one emotion, and its mean!
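The summarize-then-reshape step can be hard to picture, so here is a small sketch of the same pattern on invented numbers. The toy data frame is hypothetical; only the sequence of replace_na(), summarize(), and pivot_longer() mirrors the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical scores for four poems across two grades
toy <- data.frame(
  grade = c(1, 1, 2, 2),
  joy = c(0.5, NA, 0.2, 0.4),
  sadness = c(0.1, 0.3, NA, 0.2)
)

toy_long <- toy %>%
  replace_na(list(joy = 0, sadness = 0)) %>%  # missing scores count as 0
  group_by(grade) %>%
  summarize(joy = mean(joy), sadness = mean(sadness)) %>%
  pivot_longer(!grade, names_to = "emotion", values_to = "value")  # one row per grade and emotion

print(toy_long)
```

One design choice worth noticing: replacing missing values with 0 pulls each mean toward zero. Using mean(x, na.rm = TRUE) instead would average only the poems that have a score, which may or may not be what an analysis wants.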

Graphing!

When exploring the data, we plot each of our variables on the y-axis and the grade level on the x-axis.

This first graph shows the number of characters (letters and spaces) per poem!

poki.full %>%
  group_by(grade) %>%
  summarise(avg_char = mean(char)) %>%
  ggplot(aes(x = grade, y = avg_char)) +
  geom_col(fill = "aquamarine2", color = "white") +
  robins_ggplot_theme() +  # authors' custom theme; its definition is not shown in this post
  labs(x = "Grade Level",
       y = "Average Number of Characters",
       title = "Comparison of Grade Level to Length of Poem",
       subtitle = "Data from Poki poems (written by kids grades 1—12)",
       caption = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids")

While this is a fairly plain graph, it gives a baseline for poem length per grade. Poem length tends to increase as students get older. There is steady change from 1st to 8th grade, and then at 9th grade poem length jumps. This could mean that teachers are asking for longer poems, but it could also reveal a greater mastery of words. 1st graders may simply have more to say, while 2nd graders perhaps say it in a more concise manner. The later grades may also bring an expanded vocabulary with longer, more complicated words, or more complicated sentence structure.

Also worth noting is the lack of change after the 9th grade. The 10th grade has the same length, no steady change occurs across high school, and poem length actually drops between 11th and 12th grade. This could indicate a plateau at this age, where students reach a maximum length and don’t collectively exceed it. Alternatively, this could be the extent to which the educational system requires them to write.

The site Scribbr (https://www.scribbr.com/academic-essay/length/) shows how long an essay should be at different levels of schooling, and it accounts for growth beyond the 12th grade. Meanwhile, the Alliance for Excellent Education (AEE) site (https://all4ed.org/webinars-events/challenges-confronting-high-schools-adolescent-literacy/) has an article detailing how the literacy levels of students from 8th to 12th grade fall below the basic level. Perhaps this is why the growth in character count begins to slow from 9th to 12th grade.

Here we have a graph of graphs! Each grade gets its own panel showing the average value of each emotion in that grade's poems.

# Faceted
ggplot(poki.emotion, aes(x = emotion, y = value)) +
  geom_col(fill = "aquamarine2", color = "white") +
  facet_wrap(vars(grade)) +
  robins_ggplot_theme() +
  labs(x = "Emotion",
       y = "Average Number of Poems",
       title = "Comparison of Grade Level to Emotional Content of Poems",
       subtitle = "Data from PoKi poems (written by kids grades 1—12)",
       caption = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids") +
  theme(strip.background = element_rect(color="white", 
                                        fill="gray85", 
                                        size=1.5, 
                                        linetype="solid"))

This graph details how the different grades distribute their emotions. We observe that as students’ grade levels increase, so too does their capacity for feeling and writing about more varied emotions. Joy has the highest average across all grade levels, and it is most clearly dominant in grades one through five, suggesting that elementary school students express the most joy and much less fear and sadness. Going into middle school, kids’ poems begin to encompass anger, fear, and sadness more and more, with sadness averaging about 0.2 in sixth grade and about 0.3 in eighth grade. High school shows the largest jump in average reported emotions, with every emotion averaging at least 0.3 and joy measuring about 0.45. Thus, we observe an increase in the variety and overall amount of emotion expressed in the poems as students age, which may indicate higher emotional maturity in later grade levels.

The article “Positive and Negative Emotion Regulation in Adolescence: Links to Anxiety and Depression” (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6523365/) offers a possible reason why.

The article states that youths of high school age are in a time of inner strife as they develop strategies for coping in interpersonal relationships at school, at work, and in their personal lives, with less dependence on their parents. As our data shows, it appears that with age and less shelter from their parents, students embrace and convey negative emotions more often.

This graph charts the averages for each grade as a single stacked bar, separated by emotion.

# By count
ggplot(poki.emotion, aes(x = grade, y = value, fill = emotion)) +
  geom_col() +
  robins_ggplot_theme() +
  labs(x = "Grade Level",
       y = "Average Number of Poems",
       fill = "Name of Emotion",
       title = "Comparison of Grade Level to Emotional Content of Poems",
       subtitle = "Data from PoKi poems (written by kids grades 1—12)",
       caption = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids") +
  scale_fill_brewer(type = "qual", palette = 4)

Like the last graph, it shows that joy, the sole positive emotion, does not make up the majority of emotions. It also reveals that the overall average goes up with grade level, but that the upward trend stops in high school. This is perhaps to be expected given the AEE’s material concerning literacy in high school. Also useful here is their point about the longevity of literacy and its compounding effect: the well read go on to read more and in greater amounts, while the poorly read read less and struggle to catch up. This would account for how high schoolers settle into this flatter trend.

Did someone say another graph for emotions? Good! Cause we have it!

# By proportion
ggplot(poki.emotion, aes(x = grade, y = value, fill = emotion)) +
  geom_col(position = "fill") +
  robins_ggplot_theme() +
  labs(x = "Grade Level",
       y = "Proportion of Poems",
       fill = "Name of Emotion",
       title = "Comparison of Grade Level to Emotional Content of Poems",
       subtitle = "Data from PoKi poems (written by kids grades 1—12)",
       caption = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids") +
  scale_fill_brewer(type = "qual", palette = 4)

While on the surface this graph may look like the others, it has a distinct use: it shows each emotion’s share of the total by grade level. As we can see through the gradual shrinking of green toward the right end of the graph, the proportion of poems that demonstrate joy shrinks as grade level increases, while the other colors (especially sadness and anger) grow in prevalence. As the NCBI article noted, the older an adolescent gets, the more stresses they deal with on their own. Our graph and the NCBI research suggest that as kids age, the emotions they express shift from mostly joy to a more even distribution across the spectrum.
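For readers wondering what position = "fill" does to the bars, it rescales each stack so its total height is 1. A rough sketch of the same arithmetic, done by hand on hypothetical values (the numbers and the toy_emotion name are invented for illustration):

```r
library(dplyr)

# Hypothetical per-grade averages in the same long shape as poki.emotion
toy_emotion <- data.frame(
  grade = c(1, 1, 12, 12),
  emotion = c("Joy", "Sadness", "Joy", "Sadness"),
  value = c(0.3, 0.1, 0.4, 0.4)
)

# Within each grade, divide by the grade total so the shares sum to 1
toy_share <- toy_emotion %>%
  group_by(grade) %>%
  mutate(share = value / sum(value)) %>%
  ungroup()

print(toy_share)
```

In this toy example joy has a larger raw value in grade 12 than in grade 1, yet its share is smaller, which is the kind of pattern a proportion chart can surface and a stacked count chart can hide.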

Overall, these graphs detail the story of how with age people develop a wider range of complex expression that no longer hinges solely on positivity.

Most Common Words!

Since the data set includes the full poems, we are able to pick out the most used words across all grades. To make the data useful to us, we first separated out the set of words we wanted. Our first pass used the lemmatized words within the data. Lemmatizing is the process of putting all words that share the same root under one category. For example, “love” and “loving” get pooled together as “love.” By separating out words and finding the most common, we can begin to look at what topics spur the range of emotions felt by students.

lem.words <- poki.full %>%
  select(lem) %>%
  separate_rows(lem, sep = " ") %>%
  group_by(lem) %>%
  count() %>%
  arrange(desc(n))

This code pulls out the lemmatized text, splits it into individual words, and arranges the words by count. Following this, we removed common stop words and put the ten most used words into a table.
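As a small illustration of the split-and-count idea, here is a sketch on two made-up "poems". The toy strings are hypothetical, and count(sort = TRUE) is used here as a compact stand-in for the group_by(), count(), and arrange() sequence.

```r
library(dplyr)
library(tidyr)

# Two hypothetical lemmatized poems
toy_poems <- data.frame(lem = c("love dog love", "dog eat"))

toy_counts <- toy_poems %>%
  separate_rows(lem, sep = " ") %>%  # one row per word
  count(lem, sort = TRUE)            # occurrences per word, most frequent first

print(toy_counts)
```

Splitting on a single space is a simple approach; it treats punctuation attached to a word as part of that word, so real text may need extra cleaning first.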

#Lemmatized Word Dist Table
lem.words %>% 
  rename(Number_of_Occurences = n,
         word = lem) %>%
  anti_join(stop_words, by = "word") %>% 
  filter(!word %in% c("n't","'m")) %>%
  head(10) %>%
  gt() %>%
  cols_label(Number_of_Occurences = "Number of Occurrences",
             word = "Word") %>%
  tab_header(title = md("**Lemmatized Distribution of Words**"),
             subtitle = "Data from PoKi Poems") %>%
  tab_source_note(source_note = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids")
Lemmatized Distribution of Words
Data from PoKi Poems
Word      Number of Occurrences
's        21385
love      18255
day       11917
eat       9022
friend    7914
time      7348
play      7174
dog       6826
life      6581
feel      5923
Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids

Oddly enough, the most used word is not a word at all, but rather “’s,” which marks contractions or possession. Since our focus is emotions, possession is an interesting source of feelings, as it implies a relation between the writer and another person or being. Similarly, the second most used word is love. These relational words fit well with the study cited earlier, particularly where adolescents struggle through the years of coming into their own individuality as they change who they rely on and grow their circle to include new significant others. Love in this context could be rather confusing and fill the negative emotional spaces. In the younger ages, it could also show a childish joy for parents or a favorite item. So love, while a mostly positive word, does not simply imply positive emotional expression. The words friend, play, and dog also support this relational aspect of growth.

To further understand the topics explored emotionally by students, we created a graph of the most common lemmatized words distributed across grades.

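# lem.words.grade is not defined in this post. The definition below is an
# assumption: lemmatized word counts per grade, stop words removed, grouped by
# grade so that slice_max() keeps the top five words within each grade.
lem.words.grade <- poki.full %>%
  select(grade, lem) %>%
  separate_rows(lem, sep = " ") %>%
  count(grade, word = lem) %>%
  anti_join(stop_words, by = "word") %>%
  group_by(grade)

lem.words.grade %>%
  arrange(desc(n), grade) %>% 
  filter(!word %in% c("'s", "n't", "'m")) %>%
  slice_max(n, n = 5) %>%
  ggplot(aes(x = word, y = n))+
  geom_col(show.legend = FALSE, fill = "aquamarine2", color = "white") + 
  coord_flip() +
  facet_wrap(~grade, ncol = 3, scales = "free") +
  robins_ggplot_theme() +
  theme(strip.background = element_rect(color="white", 
                                        fill="gray85", 
                                        size=1.5, 
                                        linetype="solid")) +
  labs(x = "Lemmatized Words",
       y = "Number of Occurrences",
       title = "Most Common Lemmatized Words, Distributed Across Grades",
       subtitle = "Data from PoKi poems (written by kids grades 1-12)",
       caption = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids")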

This graph shows, for each grade, the most common words and the number of times each occurs. Love is the only word that appears in every grade level. Words like friend, eat, play, and dog are the most common in the elementary grades. In elementary school, the greatest concerns appear to be favorite things, which explains the fixation on food and dogs. This shows a rather basic emotional complexity, as these ideas are not very self-referential, which can even be seen in our curated list of favorites at the end of the blog. “Fantasy Soup” is markedly about something tangible, while “Now That I Have You” refers to the self and the author's own complex emotions. In high school, the words feel and heart show greater reference to the self and to the emotions expressed by the authors. By talking about what specifically they feel and where they might feel it, these students show familiarity with their emotions in their struggles of self and other. These words, in the lower grades, the higher grades, and the transitional space of the middle grades, reveal how word choice is a strong indicator of the emotional expansion and change that students experience as they grow up.
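A per-grade chart like this one depends on picking the top words within each grade rather than across the whole table. The sketch below, on hypothetical counts, shows how grouping changes what slice_max() returns; the toy_words data and its numbers are invented for illustration.

```r
library(dplyr)

# Hypothetical word counts for two grades
toy_words <- data.frame(
  grade = c(1, 1, 1, 12, 12, 12),
  word = c("dog", "eat", "play", "feel", "heart", "time"),
  n = c(30, 20, 10, 50, 40, 5)
)

# Grouping first makes slice_max() pick the top words within each grade
top_by_grade <- toy_words %>%
  group_by(grade) %>%
  slice_max(n, n = 2) %>%
  ungroup()

# Without grouping, the top rows come from the whole table
top_overall <- toy_words %>%
  slice_max(n, n = 2)

print(top_by_grade)
print(top_overall)
```

Without the group_by() step, the grades with the most text would crowd out the others, and some facets would end up empty.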

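For comparison, we also counted ten common words in the original, un-lemmatized text.

# normal.words is not defined in this post. The definition below is an
# assumption: word counts from the original (un-lemmatized) text, most frequent first.
normal.words <- poki.full %>%
  select(text) %>%
  separate_rows(text, sep = " ") %>%
  count(text, sort = TRUE)

normal.words %>%
  filter(text %in% c("all", "have", "love", "will", "one", "go", "day", "know", "never", "always")) %>%
  rename(Number_of_Occurrences = n,
         Word = text) %>%
  head(10) %>%
  gt() %>%
  cols_label(Word = "Word", Number_of_Occurrences = "Number of Occurrences") %>%
  tab_header(title = md("**Un-lemmatized Distribution of Words**"),
             subtitle = "Data from PoKi Poems") %>%
  tab_source_note(source_note = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids")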
Un-lemmatized Distribution of Words
Data from PoKi Poems
Word      Number of Occurrences
all       16494
have      14585
love      13729
will      12490
one       11481
go        7940
day       7615
know      7020
never     6960
always    6662
Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids

To recap, we asked: How great a change occurs in students' range of emotions? What causes these changes? And what shows us these changes?

Conclusion

In the end, we found that students from 1st to 12th grade gradually come to express a much wider range of emotions, including negative ones, a pattern consistent with the NCBI article. That article also details reasons for expressing this wider emotional range, including relational stresses. Meanwhile, our data showed the words and topics that accompanied these changes. The PoKi Poems data, alongside studies of developing youths, tells the story not only of how adolescents begin to embrace more negative emotions, but also of how these emotions are expressed, explored, and talked about through writing.

poki.full[c(13, 116, 124, 22302, 22337, 34915, 35059,54438, 60184, 60385, 61295, 61265, 61269), c(2,3,4,5)] %>%
  gt() %>%
  cols_label(title = "Title of Poem",
             author = "Author's Name",
             grade = "Grade Level",
             text = "Poem") %>%
  tab_header(title = md("**Our Favorite Stand-Out Poems**"),
             subtitle = "Taken from PoKi Poems") %>%
  tab_source_note(source_note = "Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids")
Our Favorite Stand-Out Poems
Taken from PoKi Poems
Title of Poem Author's Name Grade Level Poem
LADYBUG! abby 1 i saw a ladybug, her name was sue she tried my glue. i told her not to.
New Born barrie 1 who am i? a mistake? yet here i am. questioning those around me. although i say nothing. ask nothing. i look into their eyes and see their inability to answer. it seems they would like to respond to my questioning but are unable to. their only reply is a gentle cooing. why? could it be they are unaware of their true selves? ten days old and already i pity their existence. who am i? i could be your purgatory or purger. the choice is not yours. it is my life. my choice.
Hi bill 1 bye
Fantasy Soup alyssa 5 the soup that my mom made of dragon, it seems to have unicorn, too. it's like she added a bit of meat to what seemed like a regular stew. but it also had faries & elves, she stirred in a couple of gnomes, whom of which she must have taken right out of their foresty homes. she added wood & water nymphs, she also added sprites. giants, centaurs, talking beasts, who only come out at night. she tossed in a couple of ogres, they tasted a bit like spice. then she put in a couple of mermaids' tails, which made it taste rather nice. now all the things you've heard of, i know you've heard of half the group. their extinction, i'm afraid, is because of my mother's soup.
Waiting amanda 5 waiting, waiting, waiting, and waiting, waiting for someone to find me waiting, waiting for the moon to go, waiting for the sun to come, waiting for the stars to shine, waiting for the world to be mine, waiting, waiting, waiting, and waiting, waiting for someone to find me waiting, waiting for this poem to end, waiting for my cut to mend, waiting for it to rain, waiting for someone to brake this chain, waiting, waiting, waiting, and waiting, waiting for someone to find me waiting, waiting for the world to end, waiting for the river to bend, waiting for the clouds to come, waiting for a waiting someone.
Bren and gabes awesome adventures bren 6 me and gabe have many adventures some of them are just amazing gabe has saved my life from many things sometimes i wonder why im still alive we've seen many many awesome things like big fat teddy bears big foot ayden in his olden days man eating chocolate bunnys and many many more if you like adventures then u should contact gabe well were getting older now so we cant do as much so i guess we'll have to lay a little low but at least we still have great great memories
Jesse McCartney brittany 6 he is my favorite singer hot from his toes to his fingers i'm putting posters all over my wall if i met him i would fall down to the floor that would be all he is so dreamy now i think i'm getting sleepy his hair is blonde his eyes are green now can't you see i'm his number one fan i want to be in his band.
When my brother comes in from playing katelyn 8 when my brother comes in from playing his hair looks like a porcupine, his shoes a big mud hole, but when you look him in the eye you see his face looks like a rotten apple pie, as soon as he has had a shower his hair smells so sweet, but then you look at him and smile and say you're still a little geek.
Untitled elona 12 i'll fade out of this life. when? it's for sure. i'll sink with titanic. transition. how will i be? me, me, me. . . infinity, eternity. will i be able to stand on my feet? yes, infinity. desired or hated. deserved. i have no control. the hand works like the heart. genuine retro. better than the mirror. yes, this awaits me. what will it be? today, i decide and sign.
Do jenny 12 do your best your works. do think your mind purely and truely. do your acting cute and wildly. do speak your word lovely and pacifically. just do something as a child. just do.
Sonnet 130 william 12 my mistress' eyes are nothing like the sun; coral is far more red than her lips' red; if snow be white, why then her breasts are dun; if hairs be wires, black wires grow on her head. i have seen roses damask'd, red and white, but no such roses see i in her cheeks; and in some perfumes is there more delight than in the breath that from my mistress reeks. i love to hear her speak, yet well i know that music hath a far more pleasing sound; i grant i never saw a goddess go; my mistress, when she walks, treads on the ground. and yet, by heaven, i think my love as rare as any she belied with false compare.
the pizza roller coaster victor 12 the pizza is round it goes up and down
Now That I Have You victoria 12 i dreamed of you, i prayed for you. i looked for you but seemed to never be able to find you. all this time you were right there. every heartache and pain and tears leading up to finding you was worth it because now i have you. it's like an angel put you on earth for me. you stole my way to breathe you stole my heart, but that doesn't seem to bother me because i want you to make me feel like i'm the only one in the world. i want you to wrap me up and kiss me and show me what i'm missing. i couldn't take a step without you; i couldn't breathe without you, i couldn't live without you, because boy you're all i need. youre the reason i wake up in the mornings with a smile on my face. you're the reason my heart still beats. boy you make me fall in love with you every time you look at me.
Source: https://raw.githubusercontent.com/whipson/PoKi-Poems-by-Kids

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".