DREAMATLAS brings together four established fields (empirical dream content research, anthropology, digital humanities, and computational text analysis) and applies contemporary natural language processing (NLP) and large language models to questions that none of these disciplines could answer alone.
The empirical study of dream content has a quantitative tradition extending from the Hall and Van de Castle content analysis system through the corpus-based work developed around the Sleep and Dream Database (SDDb) and DreamBank. These frameworks have produced robust findings on the consistency of dream content within individuals and populations, but their evidentiary base remains predominantly Western, contemporary, and English-language. The interpretive traditions of anthropology, religious studies, and the history of culture document the centrality of dreams across civilizations and historical periods, but until recently they lacked the analytical infrastructure to operate at a comparable scale.
DREAMATLAS addresses this gap by bringing quantitative methods to a culturally and historically broad corpus. The project compiles and analyses dream accounts across continents and historical periods, drawing on primary sources, ethnographic records, manuscripts, religious texts, and oral traditions. Collaboration with monastic, national, and private libraries is integral to the design, with explicit priority given to non-digitized materials that fall outside standard digital humanities infrastructures.
Recent advances in NLP and large language models permit the detection of semantic structures and emergent patterns at a scale and resolution earlier methods could not achieve. The project uses these capacities to test whether the recurrent structures identified in dream content — understood here as cross-culturally observable motif clusters — are temporally stable and culturally widespread, context-dependent, or some combination of both.
This analytical ambition carries epistemological responsibilities. AI methods produce particular kinds of knowledge: pattern-based, probabilistic, and sensitive to corpus composition. These properties are treated as methodological objects in their own right rather than as background assumptions. The interpretive complexity of cross-cultural material — including the risk of imposing contemporary categorical frameworks onto historically distant sources — is addressed in corpus design and analytical protocols.
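The sensitivity to corpus composition noted above can be made concrete with a small illustration. The following is a minimal sketch, not the project's protocol: the culture labels, counts, and function names are hypothetical and invented for this example. It compares two ways of estimating how widespread a motif is: a pooled rate, which is dominated by whichever sub-corpus is largest, and a macro-average, which weights each cultural group equally.

```python
from collections import defaultdict

# Hypothetical toy records: (culture label, whether a given motif was tagged).
# Labels and counts are invented for illustration; they are not project data.
records = (
    [("A", True)] * 80 + [("A", False)] * 20   # 100 accounts, motif in 80%
    + [("B", True)] * 2 + [("B", False)] * 8   # 10 accounts, motif in 20%
    + [("C", True)] * 3 + [("C", False)] * 7   # 10 accounts, motif in 30%
)


def prevalence_by_group(rows):
    """Return the share of accounts containing the motif, per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [hits, total]
    for group, has_motif in rows:
        counts[group][0] += int(has_motif)
        counts[group][1] += 1
    return {g: hits / total for g, (hits, total) in counts.items()}


def pooled_prevalence(rows):
    """Share of all accounts containing the motif, ignoring group labels."""
    return sum(has_motif for _, has_motif in rows) / len(rows)


def macro_prevalence(rows):
    """Unweighted mean of the per-group shares (each group counts once)."""
    per_group = prevalence_by_group(rows)
    return sum(per_group.values()) / len(per_group)


per_group = prevalence_by_group(records)
pooled = pooled_prevalence(records)   # 85 / 120, driven by group A
macro = macro_prevalence(records)     # (0.8 + 0.2 + 0.3) / 3
```

In this toy corpus the pooled estimate suggests the motif appears in roughly seven of ten accounts, while the macro-average puts it well under half. Neither figure is the correct one in general; the gap between them shows why corpus composition has to be reported and controlled rather than assumed away.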