Last week, I spent most of my internship hours learning how to gather the correct information for batch uploading image files to the IU Libraries server that hosts the Hoagy Carmichael Omeka site. We began with the ICO (Image Collections Online) images, as these will be the easiest to upload (none of the ICO subseries has items with multiple image files). Essentially, this meant creating the type of file that Omeka expects for batch uploads: CSV (comma-separated values) files. However, it wasn’t as simple as creating one CSV file. I first had to extract the ICO image PURLs by running an XQuery on the EAD Excel file, then align the correct file names with the extracted PURLs in a new Excel file.
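For readers who would rather script the extraction step, here is a minimal Python sketch of the same idea. It is not the XQuery I actually ran. It assumes the finding aid is available as EAD XML and that each PURL sits in the `href` attribute of a `dao` element; the sample record and the `purl.example.org` addresses are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(name: str) -> str:
    """Strip an XML namespace, so '{urn:...}dao' becomes 'dao'."""
    return name.rsplit("}", 1)[-1]


def extract_purls(ead_xml: str) -> list[str]:
    """Collect the href value of every <dao> element, in document order.

    Assumption: PURLs are stored as href (or xlink:href) on <dao>.
    """
    root = ET.fromstring(ead_xml)
    purls = []
    for element in root.iter():
        if local_name(element.tag) != "dao":
            continue
        for attr_name, value in element.attrib.items():
            if local_name(attr_name) == "href":
                purls.append(value)
    return purls


# Hypothetical, heavily simplified EAD fragment (not real collection data).
SAMPLE_EAD = """
<ead>
  <archdesc>
    <dsc>
      <c01><did>
        <unittitle>Photograph 1</unittitle>
        <dao href="http://purl.example.org/ico/item-0001"/>
      </did></c01>
      <c01><did>
        <unittitle>Photograph 2</unittitle>
        <dao href="http://purl.example.org/ico/item-0002"/>
      </did></c01>
    </dsc>
  </archdesc>
</ead>
"""

if __name__ == "__main__":
    for purl in extract_purls(SAMPLE_EAD):
        print(purl)
```

Matching on the local name keeps the sketch working whether or not the EAD file declares a namespace.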
After the information was in the new Excel file, I had to rename all the URLs to designate what size image we wanted to grab and upload into the Omeka site. I decided I wanted the Large files, so I used several Excel tools to rewrite all the URLs with ‘large’ in the path (the ‘right’ formula and the ‘find and replace’ feature helped in reformatting the text in the URL column). I also had to attach ‘.jpg’ to all the file names in the file name column of the sheet, as Omeka expects image files to have this extension in the name. For this I used another Excel formula, CONCAT, to combine the file name column with a column containing ‘.jpg’, then copied and pasted the newly created file name column over the old one.
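The same clean-up could also be scripted instead of done by hand in Excel. The sketch below is only an illustration under some assumptions: that the image size is the last segment of each URL path, and that the CSV needs just two columns. The column headers, the `example.org` address, and the function names are placeholders of mine, not the real ICO URL pattern or a required Omeka layout.

```python
import csv
import io


def with_size(url: str, size: str = "large") -> str:
    """Swap the final path segment of a URL for the requested size.

    Assumption: the size is the last segment of the path.
    """
    base, _, _ = url.rstrip("/").rpartition("/")
    return f"{base}/{size}"


def with_jpg(name: str) -> str:
    """Append '.jpg' unless the file name already ends with it."""
    return name if name.lower().endswith(".jpg") else name + ".jpg"


def build_csv(rows) -> str:
    """Turn (file name, URL) pairs into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["filename", "url"])  # placeholder column names
    for name, url in rows:
        writer.writerow([with_jpg(name), with_size(url)])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical rows standing in for the Excel sheet.
    sample_rows = [
        ("hc001", "http://example.org/images/hc001/thumbnail"),
        ("hc002.jpg", "http://example.org/images/hc002/thumbnail"),
    ]
    print(build_csv(sample_rows), end="")
```

Checking for an existing ‘.jpg’ before appending means the script can be rerun on a sheet that has already been partly fixed without producing names like ‘hc002.jpg.jpg’.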
Since all this work in Excel was new for me, it took me a few hours, over the course of the week, to get all the small moving pieces together. I was, in the end, able to successfully deliver a properly formatted .csv file to Nick, who was then able to upload all the images to the IU server using the wget command line tool.
I am now in the beginning stages of creating several new .csv files for batch uploading the server images as items in the Omeka site. I will describe this process more in the next blog post.