A Digital Humanities Project for History Scholars: How to Analyze & Visualize Research Trends using Academic Journal Data in Korean History
What is Hanguksa Yeongu Hwibo (the Bulletin of Korean Historical Research)?
Hanguksa Yeongu Hwibo (The Bulletin of Korean Historical Research) is a Korean history database and biannual periodical that compiles a list of publications on Korean history research. It provides the most extensive and detailed information on Korean history research topics by offering the original text of important historical materials, organizing research trends in Korean history across academia, and introducing a catalogue of literature titles.
It is a collection of books and papers, classified by topic and period (e.g., general theories, prehistoric times, ancient times, Goryeo, Joseon, modern period, contemporary history, etc.). It provides up-to-date bibliographical information on almost all research output concerning Korean history, both inside and outside Korea. As of 2019, the database contains 45,230 books and 188,214 theses. These contents are available online (http://db.history.go.kr/item/level.do?itemId=hb).
It is maintained by the National Institute of Korean History (NIKH), a South Korean national organization in charge of researching, collecting, compiling, and promoting the study of historical materials on Korean history. It was established in March 1946, one year after the liberation of Korea, as Guksagwan, a national organization that systematically researches, collects, preserves, compiles, and distributes historical materials that record important events in Korean history and are necessary for Korean history research. In 1949, the name was changed to the current one. To date, the NIKH publishes studies and sourcebooks on Korean history to encourage research in the field.
Download this sample dataset and follow the step-by-step tutorials below to practice data analysis and visualization.
This test dataset includes 11,352 publication entries and their metadata, created between April 2017 and April 2020.
Our preprocessed dataset can help you start your analysis more easily.
You can also try the tutorials with your own dataset to explore your data.
Below you will find a list of tutorials for using different toolkits.
Before you move on to the tutorials, we will walk you through the three basic steps you need to take before starting analysis.
Since the original dataset is a continuous text string in a single column of the worksheet, we need to separate it into individual columns by category. To do that, we split the text on a delimiter character (e.g., comma, tab, space, or semicolon) in Excel. On the Excel ribbon, open the Data tab and click the Text to Columns icon in the Data Tools group. Then select the Delimited option to split the text into separate columns.
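For readers who prefer a script to Excel, the same split can be sketched in Python. This is a minimal sketch, assuming each record is a single comma-delimited string; the column names and the sample record are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Hypothetical column names; the real dataset may name or order these differently.
COLUMNS = ["material_id", "period", "author", "title", "publication_date"]

def split_records(lines, delimiter=","):
    """Split delimited text lines into dictionaries keyed by column name."""
    reader = csv.reader(
        io.StringIO("\n".join(lines)),
        delimiter=delimiter,
        skipinitialspace=True,  # ignore the space that follows each delimiter
    )
    return [dict(zip(COLUMNS, (field.strip() for field in row))) for row in reader]

# Made-up sample record, for illustration only.
sample = ['hb_001, Joseon, Kim, "Trade, Tribute, and Diplomacy", 2019-06-30']
records = split_records(sample)
print(records[0]["title"])
```

As with Text to Columns in Excel, a quoted field such as the title above stays in one column even though it contains the delimiter.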
We extracted relevant data needed for our analysis (e.g., material ID, classification number, period, author(s), title, periodical (with volume), publisher, publication date, and URL).
After removing unnecessary data values, we sorted the list by material ID and corrected entries whose formatting was inconsistent.
For our analysis, we will only need publication years. Thus, we converted the publication date to year format.
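The date-to-year conversion can also be scripted. The sketch below assumes that each publication date is a text value containing a four-digit year somewhere in it; the sample formats are guesses, not the dataset's documented format.

```python
import re

# Matches a standalone four-digit year between 1800 and 2099.
YEAR_PATTERN = re.compile(r"(?<!\d)(1[89]\d{2}|20\d{2})(?!\d)")

def extract_year(date_text):
    """Return the first four-digit year found in date_text, or None if there is none."""
    match = YEAR_PATTERN.search(date_text)
    return int(match.group(1)) if match else None

# Hypothetical date formats, for illustration only.
for value in ["2019-06-30", "2017.04", "2018년 12월", "n.d."]:
    print(value, extract_year(value))
```

Values without a recognizable year return None, so they can be reviewed by hand rather than silently dropped.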
Well done, it looks much better! Now our dataset is ready to be analyzed.
WordCloud
A WordCloud can be created with various toolkits (e.g., MonkeyLearn WordCloud Generator, WordArt.com, Wordclouds.com, TagCrowd, Tagxedo, and Python). This WordCloud was generated using Jason Davies's word cloud generator. It is a Wordle-inspired word cloud generator written in JavaScript and available on GitHub under an open source license as d3-cloud.
The layout algorithm itself is incredibly simple.
For each word, starting with the most important, it attempts to place the word at some starting point, usually near the middle or somewhere on a central horizontal line. If the word intersects any previously placed word, it moves the word one step along an increasing spiral. It repeats this until no intersections are found.
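The placement loop described above can be sketched in a few lines of Python. This is an illustration of the idea only, not d3-cloud's actual code: it assumes each word can be approximated by a rectangle, and the sizing rule and step size are hypothetical choices.

```python
import math

def boxes_overlap(a, b):
    """Check two rectangles given as (x, y, width, height), with x, y the top-left corner."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place_words(words, step=0.1, max_steps=100_000):
    """Place (text, weight) pairs on a plane and return {text: (x, y, width, height)}."""
    placed = {}
    # Most important (heaviest) words are placed first.
    for text, weight in sorted(words, key=lambda item: item[1], reverse=True):
        # Hypothetical sizing rule: a crude stand-in for rendered text size.
        width, height = weight * len(text) * 0.6, weight
        for i in range(max_steps):
            angle = step * i
            radius = angle  # increasing spiral: radius grows with the angle
            center_x = radius * math.cos(angle)
            center_y = radius * math.sin(angle)
            box = (center_x - width / 2, center_y - height / 2, width, height)
            if not any(boxes_overlap(box, other) for other in placed.values()):
                placed[text] = box
                break
    return placed

layout = place_words([("조선", 40), ("한국", 30), ("연구", 20), ("백제", 10)])
for text, box in layout.items():
    print(text, [round(value, 1) for value in box])
```

The first word lands at the center because nothing has been placed yet; every later word walks outward along the spiral until its rectangle is free of collisions.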
From this WordCloud visualization, we can see that the most frequently mentioned words are:
Chosun (조선), Korea (한국), Research (연구), Investigation (검토), Baekje (백제)*, Silla (신라)**, Excavation (출토), Goguryeo (고구려), Liberation (해방), Movement (운동), and Japanese Colonial Period (일제강점기).
*Baekje was one of Korea's so-called "Three Kingdoms," along with Goguryeo to the north and Silla to the east. It ruled over the southwestern part of the Korean Peninsula from 18 BCE to 660 CE.
**Silla, also romanized as Shilla (57 BCE – 935 CE), was a Korean kingdom located in the southern and central parts of the Korean Peninsula.
Mapping
Tableau can serve as a free, easy-to-use mapping tool, and it can also produce interactive maps. Tableau divides your data into dimensions (independent variables) and measures (numeric values), and then classifies them into data types (text, number, date, and so on). Depending on the data source, Tableau can automatically geocode the geographic dimensions. Sometimes, however, it fails to recognize a geographic field correctly.
For this exercise, I first extracted the latitude and longitude of the publishers' cities from Google Maps, and then assigned the Geographic Role to these values. I created the map by putting the longitude data into Columns and the latitude data into Rows. To create a map that shows a point for the location of each publisher, I dragged the Location dimension onto the Detail button in the Marks pane. To show how many publishers are located in each city, I dragged the Publisher dimension onto the Size button in the Marks pane.
Since I have many publishers in my dataset, I used different colors for different publishers by dragging the Publisher dimension onto the Color button in the Marks pane. I also added a legend to show the color of each publisher.
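If you want to prepare the location table before loading it into Tableau, the lookup and count can be sketched in Python. The coordinate table, publisher names, and city assignments below are hypothetical placeholders; in practice the coordinates would come from a manual lookup such as the Google Maps step described above.

```python
from collections import defaultdict

# Hypothetical lookup table of (latitude, longitude) per city, filled in by hand.
CITY_COORDINATES = {
    "Seoul": (37.5665, 126.9780),
    "Busan": (35.1796, 129.0756),
}

def publishers_per_city(rows, coordinates):
    """Return one record per city: latitude, longitude, and number of distinct publishers."""
    publishers = defaultdict(set)
    for publisher, city in rows:
        publishers[city].add(publisher)
    return [
        {
            "city": city,
            "latitude": coordinates[city][0],
            "longitude": coordinates[city][1],
            "publishers": len(names),
        }
        for city, names in sorted(publishers.items())
        if city in coordinates  # cities without coordinates are skipped
    ]

# Made-up (publisher, city) pairs, for illustration only.
rows = [("Publisher A", "Seoul"), ("Publisher B", "Seoul"), ("Publisher C", "Busan")]
for record in publishers_per_city(rows, CITY_COORDINATES):
    print(record)
```

The resulting table has one row per city, which matches the fields used in the Tableau steps: longitude for Columns, latitude for Rows, and the publisher count for Size.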
From this map, we can see that most publishers, including International Journal of Korean History, 건지인문학, and Acta Koreana, are located in Seoul, South Korea.
We can also see that most publications come from 건축역사연구, The Review of Korean Studies, Sungkyun Journal of East Asian Studies, 강원사학, and 강좌미술사.
Bar Chart (Periods)
Chosun > Contemporary > Modern > Ancient > Goryeo > Prehistory
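A ranking like the one above comes from counting how many entries fall into each period. The sketch below shows one way to compute and print such a ranking in Python; the sample values are made up and do not reflect the dataset's real counts.

```python
from collections import Counter

def rank_periods(periods):
    """Return (period, count) pairs sorted from most to least frequent."""
    return Counter(periods).most_common()

def text_bar_chart(ranked, width=40):
    """Render ranked counts as text bars scaled to the largest count."""
    largest = ranked[0][1]
    return [
        f"{name:<14}{'#' * max(1, round(count / largest * width))} {count}"
        for name, count in ranked
    ]

# Made-up period labels, for illustration only.
sample = ["Chosun", "Chosun", "Modern", "Chosun", "Modern", "Goryeo"]
for line in text_bar_chart(rank_periods(sample)):
    print(line)
```

The same counts can then be passed to any charting tool to draw the bars.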
https://editor.p5js.org/
You can check this reference.
https://www.library.ucla.edu/location/east-asian-library-richard-c-rudolph/korean-studies
seul@g.ucla.edu