Programming skills are not only increasingly in demand in industry jobs; they are also becoming a requirement in academia. Programming is now used in almost every discipline for tasks such as data collection, organization, and analysis. In this post, I’m going to demonstrate how some basic programming in Python can be used to conduct research in experimental archaeology. All the data files that I use and the Python script are provided at the bottom so you can try it yourself!
What are we researching?
For this particular experiment we are going to explore the handedness of our prehistoric ancestors through what they left behind in the archaeological record. There is a long history of research linking handedness to hemispheric lateralization in the brain, and specifically to the brain’s language processing centers (Read more about this here!).
How do we determine handedness in the archaeological record?
There are a few ways that you can determine the handedness of an individual using their bones. However, this becomes difficult in the archaeological record because these methods require both sides of the skeleton, and skeletons don’t preserve perfectly. Sometimes only one leg is found; other times you’ll only find a hand and some rib fragments, but I’ll save the taphonomy lesson for another blog post. Methods that use endocasts (molds of the inside of the skull) run into similar preservation issues. Other methods, such as using tooth wear from paramastication (using your teeth as tools) in Neanderthals or using cave paintings, don’t provide much insight, since it has already been determined that by the time these show up in the archaeological record our human ancestors were already predominantly right-handed [See references].
This leaves us with tools used by our human ancestors. Stone tools are the most abundant artifact type in the archaeological record, so there is a lot of data to work with. A lot of research has been done trying to see if features on the stone tools themselves could be indicative of handedness, but there has been a lot of disagreement in that discussion. So the problem we’re going to focus on here is the spatial distribution of stone tools at archaeological sites [1].
How a stone tool is made:
What is our hypothesis?
If our goal is to see if we can distinguish right-handed toolmakers from left-handed toolmakers, then our hypothesis would be: If right-handers and left-handers make stone tools differently based on their handedness, then we would see a statistically significant difference in the spatial distribution of the flakes.
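To make "statistically significant difference" a little more concrete, here is a minimal sketch of one possible test: a permutation test on the difference in mean x-coordinate between the two sets of flakes. This is an illustration only, not the analysis used in this post. The function name `permutation_test_mean_x`, the choice of statistic, and the synthetic coordinates are all assumptions made for the example.

```python
import numpy as np


def permutation_test_mean_x(right_x, left_x, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in mean x-coordinate.

    Hypothetical helper for illustration; the statistic (difference in means)
    is an assumption, not the test used in the original study.
    """
    rng = np.random.default_rng(seed)
    right_x = np.asarray(right_x, dtype=float)
    left_x = np.asarray(left_x, dtype=float)

    observed = right_x.mean() - left_x.mean()
    pooled = np.concatenate([right_x, left_x])
    n_right = right_x.size

    # Count how often a random relabelling gives a difference at least as large
    extreme = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        diff = shuffled[:n_right].mean() - shuffled[n_right:].mean()
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic stand-in data (NOT the real flake coordinates)
rng = np.random.default_rng(42)
fake_right_x = rng.normal(100.3, 0.3, size=200)
fake_left_x = rng.normal(99.9, 0.3, size=200)

observed, p_value = permutation_test_mean_x(fake_right_x, fake_left_x)
print(f"difference in mean x: {observed:.3f} m, p = {p_value:.4f}")
```

A small p-value here would only say that the two point clouds are shifted relative to each other along x. With one knapper per hand, it could not by itself separate handedness from individual habit.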
How do we do this?
A couple of summers ago, I collected the exact data we would need to do a pilot test for this hypothesis (learn more about data collection in the footnote). Since the data is already collected [2], there are two main steps required to visualize the spatial spread made by the toolmakers.
1. Getting the flakes onto the computer screen:
There are many spatial data software programs available to archaeologists for running statistical analyses. However, some are very expensive, and the free ones tend to offer a limited choice of analyses. Even programs like QGIS require some basic programming knowledge when you want to run analyses that aren’t already built into the software. Writing new code in a widely used programming language, such as Python, allows the analysis to be customized to fit the data. For this study, I wrote a Python script that demonstrates the beginning steps of a hot-spot analysis (a spatial analysis and mapping technique for identifying clusters of spatial phenomena) to visualize the spatial data I collected.
2. What to write in Python:
Two libraries were imported into the script: 1) matplotlib.pyplot, a plotting library with a wide range of plot types for visualizing your data, and 2) Pandas, a library of data analysis tools commonly used in statistical analyses. A 2D histogram from matplotlib.pyplot was chosen as the format to plot the X and Y coordinates. The maximum and minimum coordinates for X and Y were pulled from the Excel sheet to create a boundary for the plot, and labeled in the code as ‘x_max,’ ‘x_min,’ ‘y_max,’ and ‘y_min.’ Once the X and Y coordinates are plotted, the number of flakes in each bin (10 cm² areas) is reflected by the color bar. The more populated areas show as closer to yellow, whereas the areas without flakes show as violet.
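The plotting step described above can be sketched roughly as follows. This is not the exact script from the bottom of the post: the helper name `plot_flake_density`, the 10 cm bin edge (`BIN_SIZE`), and the synthetic coordinates are assumptions for illustration. The boundary values are the ones listed in the script.

```python
import numpy as np
import matplotlib.pyplot as plt

# Site boundary in meters, copied from the script at the end of the post
x_min, x_max = 98.761, 101.517
y_min, y_max = 97.798, 100.000

BIN_SIZE = 0.10  # assumed bin edge length in meters (10 cm)


def plot_flake_density(x, y, title):
    """Draw a 2D histogram of flake coordinates inside the site boundary."""
    # Bin edges spaced BIN_SIZE apart; the last edge lands at or past the maximum
    x_edges = np.arange(x_min, x_max + BIN_SIZE, BIN_SIZE)
    y_edges = np.arange(y_min, y_max + BIN_SIZE, BIN_SIZE)

    fig, ax = plt.subplots()
    # viridis runs from violet (low counts) to yellow (high counts)
    counts, _, _, image = ax.hist2d(x, y, bins=[x_edges, y_edges], cmap="viridis")
    fig.colorbar(image, ax=ax, label="flakes per bin")
    ax.set_xlabel("x (meters)")
    ax.set_ylabel("y (meters)")
    ax.set_title(title)
    ax.set_aspect("equal")
    return fig, counts


# Synthetic stand-in data (NOT the real flake coordinates)
rng = np.random.default_rng(0)
demo_x = rng.normal(100.2, 0.3, size=500).clip(x_min, x_max)
demo_y = rng.normal(99.0, 0.3, size=500).clip(y_min, y_max)

fig, counts = plot_flake_density(demo_x, demo_y, "Synthetic flakes (demo)")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. Passing explicit bin edges instead of a bin count ties the bin size to a real-world distance, so the color bar can be read as flakes per 10 cm square.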
The second part of the script imports the data points from the Excel spreadsheets. Each spreadsheet was first opened in the script using pd.ExcelFile, which makes the data available to the Pandas data analysis tools. The Pandas library was then used to create data frames for both the right- and left-hander data files. Finally, the X and Y columns from each file were turned into lists for the code to use for plotting. Now we can visualize the spatial distribution of the flakes made by the left- and right-hander!
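As a rough sketch of the import step, the snippet below does the same thing with `pd.read_excel`, which opens and parses a sheet in one call. The helper names `read_flake_table` and `coordinates_as_lists` are hypothetical, and the small in-memory table stands in for a real spreadsheet so the example runs without the data files.

```python
import pandas as pd


def read_flake_table(path, sheet_name="Sheet1"):
    """Read one sheet of an Excel workbook into a DataFrame.

    Hypothetical helper; reading .xlsx files requires the openpyxl package.
    """
    return pd.read_excel(path, sheet_name=sheet_name)


def coordinates_as_lists(df):
    """Return the 'x' and 'y' columns as plain Python lists."""
    return df["x"].tolist(), df["y"].tolist()


# In-memory stand-in for a spreadsheet (NOT the real flake coordinates)
demo = pd.DataFrame({"x": [99.10, 100.25, 100.90], "y": [98.40, 99.05, 99.70]})
demo_x, demo_y = coordinates_as_lists(demo)
print(demo_x, demo_y)

# With the real files from this post, the call would look like:
# right_x, right_y = coordinates_as_lists(read_flake_table("Basalt Right.xlsx"))
```

Turning the columns into lists is optional for plotting, since matplotlib also accepts Pandas columns directly.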
These results are consistent with the hot-spot cluster plot I generated using ArcGIS, a high-end (read: expensive) geographic information software, suggesting that this code is heading in the right direction if I wanted to produce a hot-spot analysis script in Python. Luckily for us though, QGIS (the free geographic information software) can be scripted in Python and has a hot-spot analysis plugin that can run this data for us. However, learning how to use and edit code can be incredibly helpful for customizing analyses, especially in free tools such as QGIS and Python. For example, what if my archaeological site wasn’t a perfect square like my experimental site? I could edit the Python plugin in QGIS so that it properly fits the boundary of my site. What if my archaeological site was on a slope and not on a flat piece of land like my experimental site? I can customize the analysis to add in Z-coordinates and inform the code that there is a slope. If I wanted to change the colors on the plot, I can get very specific by writing it out in the script. More importantly, I can do this on a free, accessible, and accurate platform that replicates the functions of a high-end, expensive program, and still conduct data analysis as a student with a tight budget.
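To give a feel for what a next step toward a hot-spot script might look like, here is a deliberately simplified sketch: count flakes per grid cell with NumPy and flag cells whose count sits well above the average. This is not the statistic a GIS hot-spot tool computes (those typically use a neighborhood statistic such as Getis-Ord Gi*). The function name `flag_dense_cells`, the cell size, the threshold, and the synthetic data are all assumptions.

```python
import numpy as np


def flag_dense_cells(x, y, x_range, y_range, cell_size=0.10, z_threshold=2.0):
    """Count points per grid cell and flag unusually dense cells.

    A cell is flagged when its count is more than z_threshold standard
    deviations above the mean count of all cells. Hypothetical helper.
    """
    # Number of cells per axis, so each cell is at most cell_size wide
    n_x = int(np.ceil((x_range[1] - x_range[0]) / cell_size))
    n_y = int(np.ceil((y_range[1] - y_range[0]) / cell_size))

    counts, _, _ = np.histogram2d(x, y, bins=[n_x, n_y], range=[x_range, y_range])

    spread = counts.std()
    if spread == 0:
        # Every cell has the same count, so nothing stands out
        return counts, np.zeros(counts.shape, dtype=bool)

    z_scores = (counts - counts.mean()) / spread
    return counts, z_scores > z_threshold


# Synthetic stand-in data: a sparse background plus one tight cluster
rng = np.random.default_rng(1)
x_range, y_range = (98.761, 101.517), (97.798, 100.000)
demo_x = np.concatenate([rng.uniform(*x_range, size=200), rng.normal(100.0, 0.05, size=300)])
demo_y = np.concatenate([rng.uniform(*y_range, size=200), rng.normal(99.0, 0.05, size=300)])

counts, flagged = flag_dense_cells(demo_x, demo_y, x_range, y_range)
print(f"{int(flagged.sum())} of {flagged.size} cells flagged as dense")
```

Because the grid, threshold, and boundary are ordinary function arguments, adapting this to a non-square site or a different cell size is a matter of changing the inputs rather than the software.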
Acknowledgements
I would like to thank Oumeyma Ben Brahim for her assistance with data collection and analysis as a student at the Koobi Fora Field School. See her poster on this project!
References for Background
Uomini, N. T. The prehistory of handedness: Archaeological data and comparative ethology. J. Hum. Evol. 57, 411–419 (2009).
References Used to Create Python Script
- Hunter, J., Dale, D., Firing, D., Droettboom, M., Matplotlib development team. May 10, 2017, “pylab_examples example code: hist2d_log_demo.py.” Matplotlib, https://matplotlib.org/examples/pylab_examples/hist2d_log_demo.html
- Willems, Karlijn. Jan 31, 2017, “Python Excel Tutorial: The Definitive Guide.” DataCamp, https://www.datacamp.com/community/tutorials/python-excel-tutorial.
- “Read Xls with Pandas.” Python Tutorials, pythonspot.com/read-xls-with-pandas/.
Edited by Riddhi Sood and Benjamin Greulich
FOOTNOTES
[1] How do you make stone tools?
Without getting into the physics, the earliest stone tools were created by hitting two rocks together. One rock, the core or cobble, is made of a softer material and has a geological structure (cryptocrystalline) that makes it easier to break using hard hammer percussion. The second rock, the hammerstone, is made of a harder material so when it is hit against the first rock, it doesn’t break. When the core breaks, it creates rock flakes. These sharp flakes are what were used as cutting and chopping tools by our earliest ancestors. The flakes are also what makes up the majority of the artifacts found at various archaeological sites.
[2] I had one left-hander and one right-hander make stone tools on an experimental site. The site was set up so that we had a grid laid out on top of it, and markers were placed on the mat so we knew the exact place the toolmaker was sitting while making tools. We took overhead photos of the site after the left-hander made tools and after the right-hander made tools, then uploaded the photos into a program called QGIS. QGIS is a free geographic information software, which was used to record the x and y coordinates of the flakes from each toolmaker and export them into an Excel spreadsheet.
Try it yourself; below is the Python code that I used!
Python Code
"""
Created on Mon Nov 12 09:37:02 2018
@author: chloeholden
Final Project: Cluster Plot of Left- and Right-Handed Produced Flakes
"""
import pandas as pd
import matplotlib.pyplot as plt
# Import excel files with the X and Y coordinates for flake distributions
# x and y coordinates were created in QGIS and exported into an excel file
r_basalt = pd.ExcelFile('Basalt Right.xlsx') # x and y coordinates for flakes knapped by a right hander
l_basalt = pd.ExcelFile('Basalt Left.xlsx') # x and y coordinates for flakes knapped by a left hander
# Creating dataframes for Sheet 1 in left and right hander flake coordinates
df_right = r_basalt.parse('Sheet1')
df_left = l_basalt.parse('Sheet1')
# Identifying individual X and Y columns and turning them into lists
# Right
right_x = df_right['x'].tolist()
right_y = df_right['y'].tolist()
# Left
left_x = df_left['x'].tolist()
left_y = df_left['y'].tolist()
# Create empty grid
S = 200 # dimension of grid, 200 square centimeters
N = 10 # dimension of the grid lines, 10 centimeters apart
# min and max from nail coordinates recorded by the total station
x_max = 101.517
x_min = 98.761
y_max = 100.000
y_min = 97.798
# Right Hander Plot
plt.hist2d(right_x, right_y, bins=S)
plt.colorbar()
plt.xlabel('x (meters)')
plt.ylabel('y (meters)')
plt.title('Right-Hander Flakes')
plt.show()
# The left-hander plot is wrapped in a triple-quoted string so only one plot is drawn per run.
# To draw it instead, remove the two ''' lines below and wrap the right-hander plot in them.
'''
# Left Hander Plot
plt.hist2d(left_x, left_y, bins=S)
plt.colorbar()
plt.xlabel('x (meters)')
plt.ylabel('y (meters)')
plt.title('Left-Hander Flakes')
plt.show()
'''
Excel files: Basalt Left, Basalt Right
Deepak Kumar
Interesting work, I hope more archaeologists get introduced to python. Looking forward to see more work like this. Wish you all the best.
Andrew
Here’s an optimised version.
import pandas as pd
import matplotlib.pyplot as plt

# Function to create a 2D histogram plot (function name kept from the original comment)
def create_scatter_plot(x, y, title):
    plt.figure()  # start a new figure so the two plots are not drawn on top of each other
    plt.hist2d(x, y, bins=200)
    plt.colorbar()
    plt.xlabel('x (meters)')
    plt.ylabel('y (meters)')
    plt.title(title)

# Import excel files with the X and Y coordinates for flake distributions
r_basalt = pd.ExcelFile('Basalt Right.xlsx')
l_basalt = pd.ExcelFile('Basalt Left.xlsx')
# Creating dataframes for Sheet 1 in left and right-hander flake coordinates
df_right = r_basalt.parse('Sheet1')
df_left = l_basalt.parse('Sheet1')
# Identify individual X and Y columns and turn them into lists
right_x, right_y = df_right['x'].tolist(), df_right['y'].tolist()
left_x, left_y = df_left['x'].tolist(), df_left['y'].tolist()
# Create empty grid
S, N = 200, 10 # dimension of grid, 200 square centimeters, and dimension of the grid lines, 10 centimeters apart
# Min and max from nail coordinates recorded by the total station
x_max, x_min = 101.517, 98.761
y_max, y_min = 100.000, 97.798
# Right Hander Plot
create_scatter_plot(right_x, right_y, 'Right-Hander Flakes')
# Left Hander Plot
create_scatter_plot(left_x, left_y, 'Left-Hander Flakes')
plt.show()
Andrew
Oh. What if you could process data from the images more efficiently?
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import numpy as np

# Function to create a 2D histogram plot (function name kept from the previous comment)
def create_scatter_plot(x, y, title):
    plt.figure()
    plt.hist2d(x, y, bins=200)
    plt.colorbar()
    plt.xlabel('x (meters)')
    plt.ylabel('y (meters)')
    plt.title(title)

# Function to process image and outline artifacts
def process_image(image_path):
    # Load in color so the outlines can be drawn in green
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Fixed threshold of 127; this may need tuning for a given photo
    _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    # Two return values is the OpenCV 4.x signature
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Iterate through contours and outline artifacts
    for contour in contours:
        # Calculate centroid and area of each artifact
        M = cv2.moments(contour)
        if M["m00"] != 0:  # skip degenerate contours with zero area
            cX = int(M["m10"] / M["m00"])
            cY = int(M["m01"] / M["m00"])
            area = cv2.contourArea(contour)
            # Outline artifact on the original image
            cv2.drawContours(image, [contour], -1, (0, 255, 0), 2)
            # Display centroid and area
            print(f"Centroid: ({cX}, {cY}), Area: {area} square pixels")
    # Display the processed image with outlined artifacts (OpenCV stores BGR, matplotlib expects RGB)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Outlined Artifacts')
    plt.show()

# Example usage
image_path = 'path/to/your/photo.jpg'
process_image(image_path)