Today we started to work on data mining. Data. That is a very big word. You will see… there is something you must know about me. I have a secret to confess, a reality to share, a burden to be lifted with help. This is what my doctoral research files looked like just a few years ago:
My folders, not so long ago.
They do not look that different now, truth be told. Those are the files I have on paper. There is another disturbing reality: the folders on my computer.
Oh, yeah. Some of my folders.
These two images reflect the two complexities of my research. I have to deal with (1) a huge amount of data that (2) I have collected through research carried out in libraries and private archives. I have not found relevant materials in digitised collections. Period. This makes the whole data mining process… elusive, to say the least.
However, this does not mean that there is no relevant data in digital collections. There may be. But I do not know, since most digital repositories are not free and simply waiting to be read: university projects require subscriptions that are beyond the reach of my university, and the oldest primary sources on Google Books cannot be read from Argentina (yes, it is true). This situation means that the sources I am interested in are not available in a proper format.
And those are the challenges I face as a Latin America-based scholar. So far, data mining seems like a distant dream.
The problem is language. But it is not that I want a particular tool to offer help in my language. That seems to me neither important nor possible.
Sites like Opecalais.com or Voyant bring me back to a crucial question for those of us whose research focuses on non-central areas (this is not a euphemism; it is just that decades of political correctness have left their mark on me). The sources, the people, the places, the events I want to know more about… are simply in another language and in another cultural universe. It is that simple.
I am not claiming that they are absolutely useless (though I am not asserting the opposite either). I am still thinking about it. What I can say for certain is that I will not have an answer by Friday.
My ideal self
So, today I worked with Omeka, which seemed more attractive than the other content management systems. It is, to put it in a way I can understand, a platform designed to help people create collections, whatever that means. It seems the perfect tool for my project, since what I want to create is basically a collection of artists’ biographies.
With the very kind help of the instructors, I was able to create a collection and even an item type. This allowed me to have the fields in Spanish, which is clearly important for my project. I had considered the possibility of making the project bilingual, but that is clearly both difficult and useless (whoever wants to search the database I hope to create will understand Spanish).
What I am disappointed about is my inability to act freely and carelessly. I kept thinking: “I will break Omeka”. I wonder if this fear will dissolve into a more assertive attitude in the future.
After having heard the “Drupal” participants, I feel I should try to use it. It looks a little more difficult (I do not need that, to be honest). But I will try it now.
Wow. Zotero. Is it a dream or a nightmare?
On the surface, Zotero is a research tool that allows you to incorporate pretty much everything you want into a personal library. This means (very, very basically) that you can have all the stuff you have gathered on a particular topic together and easily accessible.
Wow. Sounds great? Even fun? Well, it is not. It may be, perhaps even in the near future. But right now Zotero poses more problems for me than it solves, basically because I have done most of the research for my doctoral dissertation and it would be time-consuming to transfer that data to Zotero.
However, that is not the reason why I feel Zotero is a nightmare. Zotero is the elephant in the room.
The elephant in the room, lurking in my bookshelf…
Until today I had not realised that I have organised my primary sources (and even my secondary sources for that matter) in a disastrous way. Let me explain the situation: for the past four years I have collected a large volume of information on women artists active in Buenos Aires in the long 19th century. I am talking about 130 GB of information. Images, words and even sound files that are basically uncategorised. It is, to put it bluntly, a messy system.
I need to finish my dissertation, and not only because I have to. You see, I really want to use Zotero, the tool that is dreamy and nightmarish at once.
My project for Rebuilding the portfolio is based upon my experience as a graduate student dealing with an uncommon subject for Argentine art historians: the careers of women artists in Buenos Aires between 1880 and 1920. There is little trustworthy information on the large group of active women artists, and their stories are little known. My aim is to create a database with biographies of these remarkable women, allowing visitors to comment on and correct the information available. I would like each biography to have a set of tags that connects the data and makes it easy to see unforeseen links (for example, the school they attended, the places they exhibited at, the genres they worked in).
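To make the tag idea a little more concrete for myself, here is a minimal sketch in Python of how tagged biographies could surface shared links. It is only an illustration under my own assumptions: the `Biography` class, the tag categories and the placeholder names ("Artist A", "School X", "Salon Y") are all hypothetical, and none of this is how Omeka or any other platform actually stores its data.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Biography:
    """A hypothetical biography record: a name plus tags grouped by category."""
    name: str
    tags: dict[str, set[str]] = field(default_factory=dict)


def build_tag_index(biographies):
    """Map each (category, value) tag to the set of artists who carry it."""
    index = defaultdict(set)
    for bio in biographies:
        for category, values in bio.tags.items():
            for value in values:
                index[(category, value)].add(bio.name)
    return index


def shared_links(index):
    """Keep only the tags that connect two or more artists."""
    return {tag: names for tag, names in index.items() if len(names) > 1}


# Placeholder data, invented purely for illustration.
biographies = [
    Biography("Artist A", {"school": {"School X"}, "genre": {"portrait"}}),
    Biography("Artist B", {"school": {"School X"}, "genre": {"still life"}}),
    Biography("Artist C", {"venue": {"Salon Y"}, "genre": {"portrait"}}),
]

links = shared_links(build_tag_index(biographies))
for (category, value), names in sorted(links.items()):
    print(f"{category}: {value} -> {', '.join(sorted(names))}")
```

Grouping tags by category (school, venue, genre) rather than keeping one flat list is a choice I made for the sketch, so that a school and an exhibition venue with the same name would not be confused.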
Hello, this is me, from the pretty GMU campus.