By: Alisha Kirchoff, PhD Candidate & RSW Fellow
This is the second post in the RSW Research Series, a special series on IU graduate students and their research. It is an opportunity for RSW colleagues and other readers to learn more about our students’ research projects. If you are interested in learning more about this research or connecting with one of these contributors, please let us know at email@example.com.
As we are in the midst of a global pandemic, many researchers have had to leave their field sites, cease data collection, and reorient their research agendas. COVID-19 has certainly impacted my ability to continue my research as originally planned. This has left me to wonder about the implications of the pandemic for area studies scholars and area studies scholarship. The current pandemic is poised to indelibly change the way institutions of higher education operate, but what could be the long-term implications for research? Could enhanced online offerings at colleges and universities create flexibility for scholars to go abroad to collect data while fulfilling teaching and administrative duties remotely? Could pandemic-related considerations stall area studies research in certain arenas? While there are phenomenal resources online and stateside (and specifically at IU!) for historians as well as humanities and culture scholars to access vernacular source material, social scientists have fewer options for data collection when travel to the region is limited. Additionally, junior scholars with fewer and less durable contacts in-country than their more senior colleagues face greater barriers when access to the field is limited. Could big data be the answer?
In the past decade or so, there has been an ongoing debate about the use of big data and computational methods in the social sciences. The benefits are relatively clear: social media and web data are abundant and can contain information in massive quantities. As a result, researchers can create and collect their own data for quantitative analysis more easily than by working with a partner to field a study. Collecting big data is often more cost-effective than other approaches to data collection. However, this approach to social science research also has its critics and challenges. Using computational methods runs the risk of taking the “social” out of social science. The richness of detail that one might get even from a survey is very difficult to capture using big data methods. The nuance associated with interviews and ethnographies is nonexistent in big data. There is a lot of valuable information out there, but quantity often comes at the cost of quality. This presents particular challenges when we are engaged in area studies research, where deep, contextualized understandings of the cases and places we study can make a substantial difference in outcomes.
I am a sociologist and a law and society scholar. In 2018 I began a study of the Russian notary profession, with an interest not only in studying legal culture through this group but also in better understanding the demographic patterns in the Russian legal profession. Most of the law and society work on legal professionals in Russia is focused on the work of judges or defense lawyers, and there is emergent research on law schools as well. The majority of the scholarly work in the North American canon in law and society is focused on common law systems, where the concept of a notary is quite different from its counterpart in civil law contexts. Notaries in civil law systems are involved in contract negotiations over major financial and property transactions (e.g., wills and real estate). In many ways, Russian notaries are the practitioners of everyday law, but the secondary literature on this group is extremely limited.
When I began this study, there was no resource available that provided even basic demographic information about the profession. This is where the benefits of big data come to the fore. With some internet sleuthing, I was able to identify a couple of online directories that provided some information about notaries and their practice. After learning some Python (a versatile coding language) and receiving some tech support, I was able to run a program that scrapes information off these directories and deposits it into a spreadsheet database for analysis. By doing this, I was able to get the lay of the land. I could begin to answer questions like: Who are Russia’s notaries? Where do they practice? I was able to pull information into the database that can help me determine things like approximately how long a given notary has been practicing, and their gender. Thanks to some help from a colleague in Saint Petersburg, I was able to access aggregate information from the Ministry of Justice website about the number of practicing professionals in the field, which was a helpful frame of reference against which I could validate the information I had collected on my own.
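To make the scraping step concrete, here is a minimal sketch in Python of pulling directory entries into spreadsheet-style rows. It is not the script used in this study. The page markup, the class names (`notary`, `name`, `region`), the sample entries, and the helper names `parse_directory` and `to_csv` are all assumptions made for illustration, and the sketch parses an inline sample instead of downloading a real page.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: the structure of the real directories is not described
# in this post, so the tags, class names, and entries below are invented.
SAMPLE_HTML = """
<ul>
  <li class="notary"><span class="name">Ivanova A. A.</span><span class="region">Moscow</span></li>
  <li class="notary"><span class="name">Petrov B. B.</span><span class="region">Kazan</span></li>
</ul>
"""

FIELDS = ("name", "region")


class DirectoryParser(HTMLParser):
    """Collect one dict per <li class="notary"> entry."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # entry currently being read, if any
        self._field = None  # field name of the <span> currently open

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "notary":
            self._row = {}
        elif tag == "span" and self._row is not None and cls in FIELDS:
            self._field = cls
            self._row[cls] = ""

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._field is not None:
            self._row[self._field] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            self._row[self._field] = self._row[self._field].strip()
            self._field = None
        elif tag == "li" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_directory(html):
    """Return a list of {"name": ..., "region": ...} dicts found in the page."""
    parser = DirectoryParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_directory(SAMPLE_HTML)
print(to_csv(rows))
```

In practice the HTML would come from a downloaded page, and each directory would need its own parsing rules. Before scraping any site, it is also worth checking its terms of use.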
While a helpful start, this process raised a number of questions that I could not answer. After this first round of data collection, it became clear that I needed to continue. So, for about a year and a half, I have been running my Python script on a quarterly basis. I quickly learned that the value of this endeavor was not in having a single snapshot in time, but rather in having snapshots at multiple points in time. Each time I collect another round, I have new information about exit, entry, and movement within the profession. Even though I increase the richness of my dataset with each round of collection, I am still missing critical information. Web scraping tells me nothing about the lived experience of notarial practice. A massive dataset tells me little about the process of becoming a notary or the scope of a notary’s work. This is why traditional research methodologies are critical to quality area studies research and why programs like RSW, the Department of State Title VIII program, or Title VI NRCs remain so important: we cannot tell the whole story without talking to those who are living it. With support from the Carnegie Corporation’s investment in social science research on Russia, I was able to go and conduct a series of interviews with notaries in the Moscow region and connect with the chamber of notaries there as well. It was through these discussions that I was able to make sense of the data I was collecting from my home in Indiana and better understand not only what I had, but more importantly, what I did not have in my spreadsheets. If not for a trip to the field, I could have made critical errors or misjudgments. Using Big Data helped me get started and helped me figure out which questions to ask, but it did not (and could not!) replace the process of engaging with research subjects.
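The comparison of quarterly snapshots described above can be sketched in a few lines of Python. This is an illustration only, not the analysis used in this study. It assumes each snapshot is a mapping from some stable notary identifier to a region. The identifiers, the regions, and the function name `compare_snapshots` are invented for the example, and how notaries are actually matched across rounds is not specified in this post.

```python
def compare_snapshots(earlier, later):
    """Compare two snapshots, each mapping a notary identifier to a region.

    Returns the identifiers that appear only in the later snapshot (entry),
    only in the earlier one (exit), and in both but with a different
    region (movement).
    """
    entered = sorted(later.keys() - earlier.keys())
    exited = sorted(earlier.keys() - later.keys())
    moved = sorted(
        key for key in earlier.keys() & later.keys()
        if earlier[key] != later[key]
    )
    return {"entered": entered, "exited": exited, "moved": moved}


# Invented example data: two consecutive collection rounds.
round_1 = {"N001": "Moscow", "N002": "Kazan", "N003": "Perm"}
round_2 = {"N001": "Moscow", "N002": "Samara", "N004": "Omsk"}

changes = compare_snapshots(round_1, round_2)
print(changes)
```

Matching on an identifier is a design choice: names alone can collide or change, so a comparison keyed on names would need extra checks before an apparent exit or entry is taken at face value.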
So, what can my experience tell us about the space for computational or Big Data methods in area studies research? Ultimately, I believe that these technologically enhanced research techniques are a way to add value to traditional methods but should not supplant them. There is no data set or AI mechanism that can replace an ethnographic study. There is no web form or administrative data set that can replace an interview. But we can use these tools to supplement and enhance the traditional methods in our disciplines. Computational research methods can help open new avenues for research, can offer mechanisms to assess the feasibility of study in a given area, and can help clear a path to further research. There is a lot we can learn from deploying big data methods, but the deep, specialized, contextual knowledge that comes from area studies training is essential to make sense of what we find.