Researchers’ needs and options for collaborative synthesis

2022-11-13

This essay examines the needs of researchers in relation to collaborative research synthesis, compares them to some of the software solutions that exist at the time of writing, and discusses potential ways forward.

General disclaimers:

Introduction

The “Background” section of the Synthesis Infrastructures Workshop page asks the following:

“How do researchers, scholars, and scientists share, reuse, and synthesize knowledge? […] scholars today rely heavily on a central infrastructure of scholarly publishing […] However, this scholarly communication infrastructure is not acting as infrastructure for synthesis […] how might we design scholarly communication infrastructures that are actually optimized for sharing, reusing, and synthesizing knowledge?”

Synthesis here means either of two quite distinct things:

  1. summarized knowledge (as in this sentence, taken from the same page: “If researchers are lucky, they might come across a published synthesis that is both on topic, with sufficient coverage, and up to date”);
  2. the process of constructing new knowledge (as in the word ideation).

I will refer to the first thing as “synthesis-as-thing” and the second as “synthesis-as-process.” (These “x-as-y” names are a callback to Buckland, “Information as thing,” 1991.)
This is important to clarify from the beginning, as discussions can branch out in very different directions depending on which understanding of “synthesis” they are based on. For instance, a discussion could be about improving on the format of the literature review to create a new, better kind of “synthesis-as-thing” (one that stays automatically up-to-date for example). But it could just as well be about improving the way we take notes on various materials and craft new statements, which can involve reading literature reviews but is not about them.

The way this essay is framed assumes that “synthesis-as-process” is the larger question within which “synthesis-as-thing” plays a part: to do synthesis, we use many things including syntheses (in the sense of summarized knowledge). And whenever “synthesis” is mentioned on its own on this page, it is meant in the sense of “synthesis-as-process”.

1 Needs

In this section, I examine our needs in relation to collaborative research synthesis.

1.1 We need ways to work with materials

Starting point: we seek to shift from publications as the main unit of work for synthesis.

Proposal: shift to a finer level of granularity (which necessarily broadens the scope for what a “synthesis unit” is) by considering research materials (of which publications are an important part).

Disclaimer: research practices are diverse and difficult to model. There are many ways to think about the research process, which in turn means that there are many ways to describe what “research materials” entails. Epistemology is not uniform across time and space. Even the list below is just one way to look at this issue. (There is relevant literature about this in science and technology studies: Hackett, Amsterdamska, Lynch and Wajcman (eds.), The handbook of science and technology studies, 2008; Felt, Fouché, Miller and Smith-Doerr (eds.), The handbook of science and technology studies, 2017.)

Working with materials can mean:

  • Gathering things (collecting)
  • Breaking them down (analyzing)
  • Combining them to create new things (synthesizing)

In a first draft of this text, I added “Organizing them” between “Breaking them down” and “Combining them”. However, upon reflection, the distinction seems less clear-cut. Knowledge organization processes (KOPs) blur the line between analysis and synthesis. KOPs include classifying, indexing, tagging and linking. (Considering linking as a KOP is a theoretical proposal from my dissertation. For an overview of KO and the first three KOPs, see Hjørland, “Knowledge organization,” 2016.)
For example, classifying materials is not a purely analytical process: if you study concepts, classifying them means establishing relations between them (as in a thesaurus or an ontology); because the same process is used to synthesize new concepts or concept-based statements, organizing concepts can be said to imply both analytic and synthetic work. My hypothesis is that this statement can be generalized: organizing research materials implies both analytic and synthetic work.

Hypothesis: synthesis is relational, so working with materials in ways that enable synthesis requires relational tools. I define these simply as tools that enable us to link materials together. These include:

  • relational databases
  • graph databases
  • hypertext (e.g. HTML, Markdown with wikilinks, word processors with hyperlinks…)
  • (…others?)

This also includes tools that are based on a tree-like data structure (since a tree is just a hierarchical graph), including:

  • hierarchical markup formats (e.g. XML)
  • outliners
  • mind mapping software
  • (…others?)
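
The remark that a tree is just a hierarchical graph can be illustrated with a minimal Python sketch. The outline content and the `tree_to_edges` helper are hypothetical and only serve the illustration; they are not taken from any of the tools listed above.

```python
from typing import Dict, List, Tuple

# A hypothetical outline, as an outliner or mind map might store it:
# each key is a node, each value is the dict of its children.
Outline = Dict[str, "Outline"]

outline: Outline = {
    "Synthesis": {
        "Collecting": {},
        "Analyzing": {"Tagging": {}},
        "Synthesizing": {},
    }
}


def tree_to_edges(tree: Outline) -> List[Tuple[str, str]]:
    """Flatten a nested outline into a list of (parent, child) edges."""
    edges: List[Tuple[str, str]] = []
    for parent, children in tree.items():
        for child in children:
            edges.append((parent, child))
        # Recurse into the children to collect deeper levels.
        edges.extend(tree_to_edges(children))
    return edges


edges = tree_to_edges(outline)
# Once the tree is an edge list, a non-hierarchical link is just one more edge.
edges.append(("Tagging", "Synthesizing"))
```

Once flattened, the hierarchy and the cross-link live in the same structure, which is one way to read the claim that tree-based tools are a special case of relational tools.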

1.2 We need to write in a scientific way

Starting point: a small (maybe the smallest) common denominator for research collaboration is to write together. As with the word “synthesis”, I will use “scientific writing” in the sense of “writing-as-process” and not “writing-as-thing” (as in “writings”).

Hypothesis: the lack of support for the specific requirements of scientific writing in existing interfaces explains in part the difficulty of scaling up collaborative research synthesis.

Disclaimer: just as it is difficult to model “the research process” because of the diversity of scientific practices, it is difficult to draw up a list of universal requirements for scientific writing. So, below, I discuss these requirements starting with the ones that I think are most agreed upon and ending with the ones that are more debatable.

First, scientific writing uses specific forms of writing beyond prose:

  • Citations and bibliographies
  • Math
  • Figures
  • Tables
  • Code listings
  • Notes (footnotes, endnotes, margin notes)
  • (…others?)

(A good heuristic for drawing up this particular list: examine writing software and markup languages that are aimed at scientists, and compare their functionality with non-specialized ones. For example, the differences between Markdown and Pandoc’s Markdown say much about the specific requirements of scientific writing.)

Second, the goal of scientific writing is to contribute to scientific knowledge across time and space, which adds other requirements:

  • We need to be able to write in different languages and scripts, i.e. alphabets, syllabaries and logographies; Latin, Arabic, Cyrillic; dead scripts; etc. And these need to be able to coexist in the same writing space or document. This makes Unicode and UTF-8 important standards.
  • At some point, we need to produce publishable output, i.e. data, documents (articles, books, presentations, lectures, recordings, data papers…) or collections (datasets, databases, corpora, websites…). There are many ways to define “publishable” (e.g. “structured”, “accessible”…) so this is one of the most complex aspects of the writing issue.

And third, we have many things to do and very little time, so we may require some of these things:

  • Automation (e.g. processing citations and generating bibliographies, cross-referencing figure labels, creating indexes…)
  • Simultaneous editing, for easier onboarding
  • Editorial workflow management (e.g. roles and permissions…)

(Others?)
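
As an illustration of the automation item above (processing citations and generating bibliographies), here is a minimal Python sketch. It assumes Pandoc-style bracketed citation keys such as `[@goody1977]`, handles only the single-key form, and is not how Pandoc itself processes citations. The citation keys, the `BIBLIOGRAPHY` dictionary and the `process_citations` function are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical bibliography, keyed by citation key.
BIBLIOGRAPHY: Dict[str, str] = {
    "buckland1991": "Buckland, Michael. “Information as thing.” 1991.",
    "goody1977": "Goody, Jack. The Domestication of the Savage Mind. 1977.",
}

# Matches single-key bracketed citations such as [@goody1977].
CITATION = re.compile(r"\[@([\w-]+)\]")


def process_citations(
    text: str, bibliography: Dict[str, str]
) -> Tuple[str, List[str]]:
    """Replace [@key] with numbered markers and build the reference list."""
    order: List[str] = []  # keys in order of first citation

    def number(match: re.Match) -> str:
        key = match.group(1)
        if key not in bibliography:
            return match.group(0)  # leave unknown keys untouched
        if key not in order:
            order.append(key)
        return f"[{order.index(key) + 1}]"

    body = CITATION.sub(number, text)
    references = [
        f"[{i}] {bibliography[key]}" for i, key in enumerate(order, start=1)
    ]
    return body, references


body, references = process_citations(
    "Writing is a technology of the intellect [@goody1977]. "
    "The naming follows [@buckland1991]; see also [@goody1977].",
    BIBLIOGRAPHY,
)
```

The point of the sketch is that the author only writes keys; numbering and the reference list are derived, so they stay consistent when the text is reordered.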

1.3 We need ways to navigate complexity

Starting point: breaking things down (analysis) and combining them (synthesis) both increase the complexity of our work, making it difficult to keep track of everything.

Proposal: we can imagine ways of navigating this complexity by drawing from the experience of the various fields that study humans in relation with technique. This includes anthropology (I’m especially thinking about writing as a technology of the intellect, see Goody, The Domestication of the Savage Mind, 1977), human-computer interaction (Jacko (ed.), The human-computer interaction handbook, 2012), design, information science…

Useful ways of presenting complex relational information include: lists, tables, graphs…

Useful features of relational software include: overview, sectors, fish-eye view, focus of interest, summary, history, miniatures, trails and paths, backlinks. (Most of this list comes from Hofmann and Langendörfer, “Browsing as Incremental Access of Information in the Hypertext system CONCORDE,” 1990; for a more recent account of possibilities surrounding “tools for thought,” see Matuschak and Nielsen, How can we develop transformative tools for thought?, 2019.)
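
Of the features listed above, backlinks are perhaps the simplest to sketch: they amount to inverting the map of outgoing links. The following Python sketch assumes a hypothetical set of notes stored as a dictionary of forward links; the note names and the `backlinks` function are illustrative only.

```python
from typing import Dict, List

# Hypothetical forward links: each note lists the notes it links to.
links: Dict[str, List[str]] = {
    "synthesis": ["analysis", "materials"],
    "analysis": ["materials"],
    "materials": [],
}


def backlinks(forward: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a forward-link map: for each note, list the notes pointing to it."""
    # Start with every known note so that unlinked notes still appear.
    incoming: Dict[str, List[str]] = {note: [] for note in forward}
    for source, targets in forward.items():
        for target in targets:
            incoming.setdefault(target, []).append(source)
    return incoming


incoming = backlinks(links)
```

With this data, `incoming["materials"]` lists both `synthesis` and `analysis`, which is the information a backlinks panel would display when reading the `materials` note.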

2 Options

In this section, I examine some of the current options for collaborative research synthesis via digital tools.

2.1 Multiple interoperable tools

This can be:

  • Several applications exchanging data via API. Examples?
  • Decentralized plain text-based solutions, with collaboration enabled by version control software (e.g. Git). (Like Manubot but not publication centric.) Examples?

2.2 All-in-one solutions

This can be:

  • Relational-capable (often dubbed “semantic”) versions of time-tested software types such as wikis and content management systems. Examples: Semantic MediaWiki, Omeka-S.
  • (…others?)

3 Ways forward

I design and oversee the development of Cosma, a visualization program that reads a directory of interlinked Markdown files and generates an HTML file containing a graph view. I imagined Cosma as complementing Zettlr, a scientific Markdown editor that allows internal linking via wikilinks. My dream tool for collaborative research synthesis would merge a Zettlr-like writing experience and a Cosma-like navigation experience with collaboration infrastructure, similar to what the team behind Stylo is working towards regarding manuscripts. I do not know if such a thing exists. If not, I am keen to work on it.
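
To make the idea of reading interlinked Markdown files into a graph more concrete, here is a minimal Python sketch. It is not Cosma’s or Zettlr’s actual implementation: it assumes that notes are identified by their file name and that links use the `[[target]]` or `[[target|label]]` wikilink forms. The names `WIKILINK`, `extract_wikilinks` and `build_graph`, as well as the sample notes, are hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple

# Matches [[target]] and [[target|label]]; only the target is captured.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_wikilinks(text: str) -> List[str]:
    """Return the targets of all wikilinks found in a Markdown string."""
    return [target.strip() for target in WIKILINK.findall(text)]


def build_graph(notes: Dict[str, str]) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Build (nodes, edges) from a mapping of note name to Markdown text."""
    nodes = set(notes)
    edges: Set[Tuple[str, str]] = set()
    for name, text in notes.items():
        for target in extract_wikilinks(text):
            if target in nodes:  # ignore links to notes that do not exist
                edges.add((name, target))
    return nodes, edges


# Hypothetical notes, standing in for the contents of a directory of files.
notes = {
    "analysis": "Breaking down [[materials]].",
    "synthesis": "Combines [[analysis]] and [[materials|research materials]]; see [[missing]].",
    "materials": "No links here.",
}
nodes, edges = build_graph(notes)
```

To run this on real files, the `notes` dictionary could be filled from a directory, for instance with `{p.stem: p.read_text(encoding="utf-8") for p in pathlib.Path("notes").glob("*.md")}`, where the `notes` directory name is an assumption. The resulting node and edge sets are the kind of data a graph view or a backlinks list would be built from.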

The converse approach is to take existing collaborative solutions that are compatible with a graph-based approach and adapt or expand them to meet some scientific requirements regarding writing and navigation. I know a couple of examples:

The goal of this essay is to kick-start a discussion on these ways forward, including the ones I cannot see on my own.

References

Buckland, Michael. “Information as thing.” Journal of the American Society for Information Science. 1991, Vol. 42, no. 5, p. 351–360. https://doi.org/10.1002/(SICI)1097-4571(199106)42:5<351::AID-ASI5>3.0.CO;2-3.
Felt, Fouché, Miller and Smith-Doerr (eds.). The handbook of science and technology studies. 4th ed. MIT Press, 2017. ISBN 978-0-262-03568-2.
Goody, Jack. The Domestication of the Savage Mind. Cambridge University Press, 1977. ISBN 978-0-521-21726-2.
Hackett, Amsterdamska, Lynch and Wajcman (eds.). The handbook of science and technology studies. 3rd ed. MIT Press, 2008. ISBN 978-0-262-08364-5.
Hjørland, Birger. “Knowledge organization.” Knowledge Organization. 2016, Vol. 43, no. 6, p. 475–484. https://doi.org/10.5771/0943-7444-2016-6-475.
Hofmann, Martin and Langendörfer, Horst. “Browsing as Incremental Access of Information in the Hypertext system CONCORDE.” In: H2PTM’89: Communication interactive. 1990. https://wicri-demo.istex.fr/Wicri/Sic/H2PTM/fr/index.php/H2PTM_(1989)_Hofmann.
Jacko (ed.). The human-computer interaction handbook: fundamentals, evolving technologies, and emerging applications. 3rd ed. CRC Press, Taylor & Francis, 2012. ISBN 978-1-4398-2944-8.
Matuschak, Andy and Nielsen, Michael. How can we develop transformative tools for thought? 2019. https://numinous.productions/ttft/.