Computational Methods for Chinese History: A “Digging into Data Challenge” Training Workshop

Harvard University

Date: October 17th, 2015 (Saturday) 9:00 am - 5:00 pm

Venue: Science Center Room B09, Harvard University

Organizer: China Biographical Database Project

Sponsor: Automating Data Extraction from Chinese Texts, Digging into Data Challenge

Introduction:

Do you know how to look up and visualize information in Chinese historical sources? Nearly every day, there are news articles about how big data and computational methods such as mapping and network analysis are changing our world. They are transforming the study of Chinese history as well; scholars can no longer ignore the potential of digital tools.

What sorts of questions about Chinese history can be asked and answered using computational methods? What are the main tools that scholars can use? This one-day workshop, featuring experts from Harvard and beyond, will provide an overview and practical training.

We will first introduce two main tools, CBDB and MARKUS. The China Biographical Database (CBDB) is a relational database with biographical information about more than 360,000 individuals, primarily from the 7th through 19th centuries. The data can be used for statistical, social network, and spatial analysis, and also serves as a kind of biographical reference. The standalone version of CBDB in Microsoft Access format enables many functions that are not available in the online version. MARKUS is an open-source platform that allows users to apply sophisticated text-mining techniques to a wide variety of Chinese historical and literary texts. You can tag and extract personal names, dates, place names, official titles and postings, and other content for analysis and visualization. When reading texts on the MARKUS platform, you can also consult language and biographical dictionaries, as well as other reference sources.

We will then demonstrate the uses of spatial analysis for historical GIS data from China. We will also cover social network analysis (SNA) as a methodological approach, its basic concepts, and the use of software for simple visualization and analysis of network data on Chinese history. The day will conclude with presentations of case studies that came out of digital projects.

This workshop is part of the Automating Data Extraction from Chinese Texts (DID-ACTE) Project, which aims to provide humanists and social scientists with means of transforming historical Chinese sources into structured data. The project is funded by the Digging into Data Challenge, an international research initiative to develop big data analysis methods for the humanities and social sciences.

Signing up:

Space is limited. Please register by October 10th, 2015 at:

http://goo.gl/ElVNnz

Speakers:

Peter K. Bol, Harvard University

Song Chen, Bucknell University

Michael A. Fuller, UC Irvine

Donald Sturgeon, Harvard University

Lik Hang Tsui, Harvard University

Hongsu Wang, Harvard University

Weichu Wang, Harvard University

Xin Wen, Harvard University

Hang Yin, Peking University

Contact Information:

Please send all enquiries to: cbdb.harvard@gmail.com