Structured Missingness (2023)

The Hauntological Noise Experiments are a series of attempts at reconciling analog and digital diffusion. In diffusion models, noise is removed from a field of static, step by step, until an image emerges. Asked to create images of Gaussian noise patterns, diffusion models become trapped in feedback loops that create abstract visual patterns devoid of information: noise being written into noise.

Photographs are a technology for remembering, and most of our archives contain images of desired memories. This creates a signal within the archive: these are the things which are here. If there is a presence, it stands to reason that there is also an absence. An archive can only preserve what exists. Between what exists, particularly in photographs, is the absence of information.

Most archives strive to preserve information. But to be preserved, a document first has to exist. The person we preserve has to be photographed. The poem has to be written down.

Sometimes, physics or politics intervenes. Film decays. People are erased. The archive catches on fire, or the archive is burned down. Most of us recoil at erasure, no matter the source. It is violent, melancholy. We rely on images to remind us of the past, and the loss of a photograph is the loss of that memory. In the loss of a memory, we’re left with a vast darkness behind us, one that offers little solace to the vast mystery ahead of us.

The decay and erasure of analog images seem more profound than the deletion of digital ones, because images have become cheap currency. Photographic negatives are a scarce luxury, mostly artifacts of earlier times. What is present in the film archive is lineage. We can trace an image’s origins to the negative. The negative was in the room with the subject it depicts. Even if we don’t know who the person is, we know something about them: that they stood in front of a camera, smiling or blank-faced, posing for an ID or caught by a reporter.

When information is stripped from these images, what remains loses that connection to the subject of the picture. It’s explicitly gloomy: this is what remains of the things which were here. The decay patterns of film, especially by mold or decomposition, often follow traces of the image. Films may deteriorate faster or slower where certain chemicals reacted to light; mold may be less attracted to bright sections of the image, but find things in the shadows to feed on.

Disconnected Visual Lineage

Obliterated data also shapes the contours of diffusion-based images. Looking for patterns in noise, these models rely entirely on tracings. They invoke, and then reverse, a digital version of deterioration. They strip information out of images to see how the noise is distributed. They memorize the distribution of noise, and then they reverse it. Much as the remnants of the past shape the decay patterns of film stock, digital decay preserves its traces through dense mathematical abstractions.
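The stripping-away described above can be sketched in a few lines. This is a minimal illustration, not the code of any particular model: it assumes the commonly used formulation in which an image is blended with Gaussian noise according to a linear schedule, and the names `alpha_bar` and `add_noise`, the schedule values, and the sine-wave stand-in for a row of pixels are all hypothetical choices made for the example. It shows only the forward, decaying half of the process; the learned reversal is what a trained network supplies.

```python
import math
import random


def alpha_bar(t, steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal variance left after t noising steps.

    Assumes a linear schedule of per-step noise amounts (betas); the
    start and end values here are illustrative defaults.
    """
    remaining = 1.0
    for i in range(t):
        beta = beta_start + (beta_end - beta_start) * i / (steps - 1)
        remaining *= 1.0 - beta
    return remaining


def add_noise(pixels, t, steps, rng):
    """Forward diffusion: blend each pixel with fresh Gaussian noise at step t."""
    a = alpha_bar(t, steps)
    return [
        math.sqrt(a) * p + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for p in pixels
    ]


rng = random.Random(0)
steps = 1000
image = [math.sin(i / 5.0) for i in range(64)]  # stand-in for one row of pixels

# How much of the original survives at a few points along the decay.
for t in (0, 250, 500, 1000):
    noisy = add_noise(image, t, steps, rng)
    print(f"step {t:4d}: signal retained = {alpha_bar(t, steps):.4f}")
```

At step 0 the image is untouched; by the final step almost nothing of it is left, and what remains is statistically close to pure noise. The traces the essay describes live in the intermediate steps, where the image and the noise still overlap.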

A diffusion model eats its way through our digital images much like mold on film. It breaks them down, ingesting them, eventually to become something else. The digital is resurrected as new images. The analog is resurrected as spores, or else repaired to restore the original memory of what it once depicted.

There’s a distinction between the act of remembering that is embedded into our personal images, and the act of regenerating which comes about from the diffusion of those images into data. Diffusion doesn’t remember; it analyzes and re-represents.

Michael Newman, writing in Analog, Chance & Memory, claims that we re-represent events through the image, using them to bring the past into our minds and enter into it through imaginary means.

"Remembering, rather than simply repeating something fixed, yields the contingency of the trace, attributing potentiality to it once again. We don't necessarily have to know the origin of the trace, since memory is also a reconstruction." -- Michael Newman, Analog, Chance & Memory (2011)

Pierre Nora adds:

"Hallucinatory re-creations of the past are conceivable only in terms of discontinuity. The whole dynamic of our relation to the past is shaped by the subtle interplay between the inaccessible and the non-existent. If the old ideal was to resurrect the past, the new ideal is to create a representation of it. … We find ourselves in a fragmented universe. At the same time, nothing is too humble, improbable or inaccessible to aspire to the dignity of historical mystery. We used to know whose children we were; now we are the children of no one and everyone." -- Pierre Nora, "Realms of Memory."

Though it was written long before diffusion models, I think the passage powerfully depicts the severance of lineage from the images we generate through algorithmic means. The training data has no connection to the past. Training data is a series of isolated images, clustered together by no form of cultural logic aside from similar keyword pairs. That these pairs sometimes align is almost coincidental, and so we see pictures of Mt. Fuji when we prompt for a Fuji Camera. This is what happens when we re-represent images without remembering: we hallucinate.

The contours of algorithmic images are structured by data. But they are also structured by data’s absence. In machine learning, this is a known problem with a simple name: “missingness.” Missingness describes the absence of data in a training library. The absence of data (for example, missing values in a spreadsheet) nonetheless shapes the connective contours of a neural network. Robin Mitra et al. describe “Structured Missingness” as a situation “in which missing values exhibit an association or structure, either explicitly or implicitly.” For example, you might have a dataset of surveys. If your survey had a complex or annoying question, some people might skip it. If a bunch of people skip it, there will be a cluster of absences within the dataset, centered on that question.
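The survey example can be made concrete with a toy simulation. This is only a sketch under invented assumptions: the question names, the skip rates in `SKIP_RATE`, and the 1-to-5 answer scale are hypothetical, chosen so that one "annoying" question (`q3`) is skipped far more often than the rest.

```python
import random

rng = random.Random(42)

QUESTIONS = ["q1", "q2", "q3", "q4"]
# Hypothetical skip rates: q3 stands in for the complex or annoying question.
SKIP_RATE = {"q1": 0.02, "q2": 0.02, "q3": 0.40, "q4": 0.02}


def simulate(n):
    """Generate n survey responses; a skipped answer is recorded as None."""
    rows = []
    for _ in range(n):
        row = {}
        for q in QUESTIONS:
            skipped = rng.random() < SKIP_RATE[q]
            row[q] = None if skipped else rng.randint(1, 5)
        rows.append(row)
    return rows


def missing_rate(rows):
    """Share of missing answers per question."""
    return {q: sum(r[q] is None for r in rows) / len(rows) for q in QUESTIONS}


responses = simulate(2000)
rates = missing_rate(responses)
for q in QUESTIONS:
    print(f"{q}: {rates[q]:.1%} missing")

# A naive analysis simply drops the gaps, so any conclusion about q3
# rests only on the people who were willing to answer it.
answered_q3 = [r["q3"] for r in responses if r["q3"] is not None]
print(f"q3 analysed from {len(answered_q3)} of {len(responses)} responses")
```

The gaps are not scattered evenly across the table; they pile up in one column. Anything trained or inferred from that column inherits the shape of the absence along with the answers that were given.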

In medical systems, for example, a lack of data collected from non-native speakers could be carried into a predictive model, which would then base its decisions about immigrants on less accurate data: essentially, the biases of the 1990s health care system haunting the present. This is not a data-in, data-out issue, but a data-absent, data-out issue. The absence of information creates an abstract structure; and then we draw upon that abstraction to understand the present. We do so without being aware of the missing, the ignored, or the neglected. If we disconnect our memory from the active process of remembering, whatever was missing before will be doomed to be missed again.

I have been interested lately in the structured missingness of visual archives, perhaps in a poetic more than literal sense. Diffusion models collect data by applying structured missingness: removing enough information from an original image in a predictable way, in order to identify the most salient elements of what remains. Archives, on the other hand, are forced to grapple with the missingness of a vast number of things: Images not taken; people not born.

When Mark Fisher described Sonic Hauntology, he spoke about it as a form of stuckness. The haunt is the presence of a spirit without a body, and the links to photographic archives are clear: photographs are a disembodied likeness, and the decay of a photograph disturbs us in ways that evoke ghosts, or bring to mind the spectre of death, in ways that pristine images do not.

The decay of a photograph is the decay of memory, the decay of a memory is forgetting, and forgetting inspires haunting more than memory. If memory is the fulfilment of a promise to the past, then forgetting is a kind of neglect.

What is not remembered is missing. What is missing still structures our models of the present. It structures the inferences that we make, whether we draw them from archives, memories, or datasets. The more we automate these inferences, extrapolating patterns from data without regard to its gaps, the more the missing haunts the present.

Photographs are stored in archives, but photographs also are archives. They organize information about the world within their own information storage system: patterns of light and shadow inscribed into chemicals. The structure of the image is the information. When mold grows, it can be confined by the absence or presence of certain compounds in the inks or film, and can wind its way along the contours of the image’s structure.

Likewise, we can think of algorithmic decision making as a form of enacted archive, with the structures of missingness and presence creating their own kind of image. It’s a visualization of social and economic systems, rendered not in light and shadow but in presence and absence of information, 1s and 0s. Yet we, the living, move through the contours of these enacted archives, and significant aspects of our lives are shaped by those boundaries.

Gaussian Decay

Diffusion models work through accelerated decay: images are stripped of information at hyper speeds, analyzed, and the data is categorized into structures of text. By contrast, historical images decay over decades. Both involve patterns of decay that follow the structure of the starting image.

Films engage with our memory: they were remembered, once. Diffusion breakdown does not. It does not see or remember the original image in any human sense. At the end of an analysis by an AI system, the original digital image remains.

To grapple with this, I began to create a series of works blending Gaussian Noise with archival images of decaying film. Gaussian noise feedback loops confuse the system into creating Gaussian noise from Gaussian noise, essentially obliterating any reference to the dataset beneath them. Writing noise is an oxymoron — it is like drawing an illustration by erasing it. Luckily, the computer attempts this anyway, creating abstract images as a result of simultaneously adding and removing noise in order to simulate an image of noise. On its own, this double-binding of the AI generates an image of its own breakdown.

I took the resulting images and blended them with images of film breakdown. The Costică Acsinte archive has scanned images in which barely any trace of the original remains.

I’ve created a series of images as a way to visualize my thinking-through of this weird conceptual space. By combining these barest of analog traces with Gaussian noise, I hoped to see how the system might grapple with two forms of absence. Of course, it treats it as a pattern to study. By blending the images, the machine is tasked with finding the patterns that can be exploited between both: to draw a new image, from noise, from images of analog and digital decay. Abstract visualizations of synthetic missingness that are, genuinely, representations of missingness and absence.

The images on this page are a result of those experiments.

Eryk Salvaggio