The Data Model

About the Data Model

Building a data model is never a neutral act. Behind every field and category lies a set of choices about what to include, what to leave out, and how to make sense of evidence that rarely fits neatly into any box. For this project, we have spent considerable time conceptualizing, iterating on, and refining our model, not simply as an exercise in database construction but as a way to frame existing research questions and formulate new ones. Scholars in the digital humanities have noted that data modeling requires judgment about selection, finite decisions about classification, and careful attentiveness to bias and value. As Johanna Drucker writes, data modeling involves determining "what will be identified as a feature, how it will be made explicit, and what format it will have" (Drucker, 22). In many ways, this is both the first step in a digital project and the most consequential one.

To build our model, we asked ourselves a series of foundational questions: What was inherent in the documents and collections we wanted to incorporate? What did we risk flattening or omitting in the process? What structure would be flexible and sustainable enough to support tools and modes of analysis we couldn't yet envision? How could we create a model that would not only answer existing research questions but generate new ones? These questions do not have easy answers, and our model has evolved — and continues to evolve — as we incorporate data from new archives and collections.

Two concrete examples illustrate the complexity involved. As historians, we routinely track the placement of documents within archival collections. But should our model capture not just archival metadata but also the organizational logic (or disorganization) of the collections themselves, including the administrative structure of court cases? Similarly, should we record the date a case was reported, the date it was initiated, or the date the incident occurred? These are not always the same. Ultimately, we opted to record all three dates, anticipating that future researchers will want to analyze and visualize cases across time in ways we have not yet imagined.
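As a rough illustration of what recording all three dates could look like, here is a minimal Python sketch. The class name CaseDates, its field names, and the sample dates are hypothetical and invented for this example; they are not the project's actual schema. Each date is optional on the assumption that a given source may not supply all three.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CaseDates:
    """Hypothetical container for the three dates attached to one case."""
    incident: Optional[date] = None   # when the incident occurred
    reported: Optional[date] = None   # when the case was reported
    initiated: Optional[date] = None  # when the case was initiated

    def days_from_incident_to_report(self) -> Optional[int]:
        """Return the lag in days, or None if either date is unknown."""
        if self.incident is None or self.reported is None:
            return None
        return (self.reported - self.incident).days


# Invented sample values, for illustration only.
example = CaseDates(
    incident=date(1600, 3, 1),
    reported=date(1600, 3, 3),
    initiated=date(1600, 3, 10),
)
print(example.days_from_incident_to_report())  # 2
```

Keeping the dates as separate fields, rather than choosing one as "the" date of a case, leaves later users free to plot cases by whichever moment suits their question.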

These decisions illustrate a core challenge of this project: transforming court cases (documents with dozens or even hundreds of possible data points) into coherent, analyzable datasets. A single summary entry from the filze of the Council of Ten's criminal deliberations, for example, might include the given name and cognome (surname) of both assailant and victim, the weapon used, the neighborhood, and more qualitative information such as descriptions of motive or identifying features of a suspect. In constructing our data model, we have tried to be as capacious as possible, recognizing that users may find value in the data in ways we cannot fully anticipate.

Our data model is organized around the following core elements:

  • Violence Events — the moment at which a single act of violence took place, identified by date, time of day, and ritual occasion where applicable.
  • Persons — all individuals connected to a case, including assailants, victims, witnesses, and judicial officials.
  • Relationship Types — one of the most complex features of the model. We track relationships documented in the sources, including enmity and its inverse (amity), family bonds, and other social ties that shaped the dynamics of violence.
  • Spatial Data — including precise geographic coordinates as well as prepositional and descriptive location data drawn directly from the sources (e.g., next to the church, outside the door of the Barberini palazzo). We also capture how early modern Italians themselves categorized space — for instance, as sacred, domestic, or public.
  • Weapon Type — invariably recorded in the sources and significant in its own right, both legally and culturally.
  • Motive — not always documented, but critical to capture when it appears. Notably, the phrase mortal hatred appears consistently in Venetian sources but rarely in Bolognese ones; tracking these semantic differences across corpora is itself an analytical opportunity.
  • Procedural Stage — given that different document types survive from different points in a criminal proceeding, and that the crime and the court case are not always the same thing, we record where in the judicial process each document originates. This allows us, over time, to reconstruct the arc of individual cases across the archive.
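The core elements above can be sketched as a handful of linked record types. The following Python is only an illustration under stated assumptions: every class name, field name, and sample value (including the placeholder people and the "denunciation" stage label) is hypothetical, and the project's real schema and controlled vocabularies may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Role(Enum):
    ASSAILANT = "assailant"
    VICTIM = "victim"
    WITNESS = "witness"
    JUDICIAL_OFFICIAL = "judicial official"


@dataclass
class Person:
    name: str
    role: Role


@dataclass
class Relationship:
    source: str  # name of one person
    target: str  # name of the other person
    kind: str    # e.g. "enmity", "amity", "family"


@dataclass
class Location:
    description: str                                   # wording drawn from the source
    coordinates: Optional[Tuple[float, float]] = None  # (lat, lon) if geocoded
    space_category: Optional[str] = None               # e.g. "sacred", "domestic", "public"


@dataclass
class ViolenceEvent:
    date: Optional[str] = None  # kept as text here; a real model might parse it
    time_of_day: Optional[str] = None
    ritual_occasion: Optional[str] = None
    persons: List[Person] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)
    location: Optional[Location] = None
    weapon_type: Optional[str] = None
    motive: Optional[str] = None  # optional: not always documented

    def persons_with_role(self, role: Role) -> List[Person]:
        """Return everyone attached to this event in the given role."""
        return [p for p in self.persons if p.role is role]


@dataclass
class SourceDocument:
    procedural_stage: str  # where in the judicial process the document originates
    event: ViolenceEvent   # the crime and the court record are kept distinct


# Invented placeholder data, for illustration only.
event = ViolenceEvent(
    persons=[Person("Person A", Role.ASSAILANT), Person("Person B", Role.VICTIM)],
    relationships=[Relationship("Person A", "Person B", "enmity")],
    location=Location("next to the church", space_category="sacred"),
    motive="mortal hatred",
)
document = SourceDocument(procedural_stage="denunciation", event=event)
print([p.name for p in event.persons_with_role(Role.VICTIM)])  # ['Person B']
```

In this sketch most fields are optional because the sources do not always supply them, and the procedural stage sits on the document rather than on the event, so that several documents from different stages could point to the same act of violence.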

A Note on the Data

This data model is necessarily a work in progress. As we incorporate material from additional archives and collections, our categories and classifications will continue to be refined.

It is also important to be transparent about how the data was collected. Each contributor to this project made decisions shaped by their own research interests, expertise, and the particular collections they worked with. Some data was gathered with the project's spatial goals explicitly in mind, which meant combing through archival folders and volumes for sources that offered precise locational information — specific streets, buildings, or landmarks that could be reliably geocoded. This selectivity was intentional: rather than digitizing entire folders, researchers prioritized documents that would yield the most spatially granular data.

Other datasets in the collection were originally assembled as internal reference resources for quantitative rather than spatial analysis, and therefore contain incomplete or imprecise location information. No dataset in this collection represents the entirety of any single archival series; the folders and volumes we draw from often contain hundreds of pages each, and comprehensive digitization would exceed the resources available to any individual researcher or project team.

These limitations are not incidental — they are constitutive of what this dataset is and what it can do. We encourage users to engage with the data critically and to consult the Data Vocabulary for fuller documentation of our categories, decisions, and known gaps.

To contribute data to the project, please contact Amanda Madden at amadden8@gmu.edu.

References

Drucker, Johanna. "Data Modeling and Use." In The Digital Humanities Coursebook: An Introduction to Digital Methods for Research and Scholarship, 19–33. Routledge, 2021.

Flanders, Julia, and Fotis Jannidis. "Data Modeling." In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 229–237. Wiley-Blackwell, 2015.