Methodologies

What were the questions that drove the research?

Two recurrent questions with this project: firstly, on a linguistic level, how were these writers defying and working concurrently to the broader historical trends of the period? Second, how was the language, structure, and themes evolving—or not doing so—over time, even after Japanese colonization had ended?

How was this done?

To utilize a digital humanities approach, I applied the tools in KH Coder, a text mining software, to break down the translated texts. This program assists with quantitative data analysis, allowing the user to not only find relationships between words and their frequencies, but also broader phrases and categories. Throughout the course of this research, I specifically looked at the co-occurrence network graphs created by KH Coder for each individual short story, as well as the frequency lists in order to determine the kind of words I should be narrowing down on.

Co-occurrence networks create multiple subgraphs demonstrating the relationships between words and people throughout each of the short stories, allowing us to dig deeper into how, on a linguistic level, these stories reflect and subvert broader historical themes. Through these networks, I examined the core themes of these stories on a micro level, and, upon running each of the stories through KH Coder, came to broader conclusions about the extended period, women’s literature throughout the colonial period and Korean War, and how these women were defying–and meeting–the expectations of a patriarchal system during that time.

How were these graphs calculated?

The Jaccard index, also known as the Jacquard coefficient, was utilized to create these coefficients. Within its formula, Jaccard index ranges from 0 to 1, with 1 symbolizing a complete overlap between variables. However, with smaller data sets and numbers, such as only working with six short stories, it becomes more unlikely the Jaccard index would compute a number closer to 1. The Jaccard index calculates a coefficient based on the number of times a word appears within a sentence or paragraph.

I am working with short stories, so the frequency of the words will be lower, ultimately making the generated coefficient lower as well in the mathematical calculations. For example, if a word only appears four times within a short story, how it corresponds to another word would produce a higher coefficient than if it were mentioned five times. With this in mind, for the purposes of this project, a coefficient of greater than 0.30 indicates a stronger connection, 0.20-0.29 implies a correlation between the words, and anything less than 0.20 is there, but a weaker connection.

What short stories were used for this?

This research utilizes three different stories from the colonial period: Kim Myeong-sun’s “A Girl of Mystery,” Na Hye-seok’s “Kyonghui,” and Kim Won-ju’s “Awakening.” These three stories were published between 1917 and 1926. The three postcolonial stories are Han Moo-sook’s “Hydrangeas,” Kang Sin-jae’s “The Mist,” and Song Won-hui’s “When Autumn Leaves Fall.” These stories were published in 1949, 1953, and 1961 respectively, although all three women were born and educated during the colonial period. Each of the six stories were sourced from Kim Young-hee’s collection Questioning Minds in order to maintain consistency with translation styles.

Co-occurrence networks specifically draw out the connections between words, especially those with high appearance patterns corresponding with each other.

Why is this on the Internet in a different, nontraditional format?

With the systemic problems plaguing academic research and its access to broader communities, creating free, digital networks can help break down barriers to access such information. In other words, I wanted this research to be accessible for a broader audience, especially as I run a blog focused largely on Asian literature and film here on this website.