Introduction
Partnerships are central to data services work, whether between researchers and service providers or between library data services and Research IT. In the research computing and data (RCD¹) literature, these partnerships are sometimes linked to the concept of boundary spanning (cf. Broude Geva et al. (2020, 397); Koshoffer and Latessa (2023, 309)). We briefly review the connection between RCD collaborations and the boundary spanning concept below.
Boundary spanning is a term that arises from the social sciences literature and connotes an ability to mediate information between familiar and unfamiliar contexts. Tushman and Scanlan (1981, 303) note the emergent (rather than intentional) nature of boundary activity and assert that “[B]oundary spanners must be able to translate across communication boundaries and be aware of contextual information on both sides of the boundary” (1981, 300). Certain conditions give rise to emergent boundary activity: the higher the task uncertainty in innovative projects, the higher the information-processing needs and, to a degree, the greater the number of emergent boundary activities (Tushman 1977, 601). Given the similarity of those conditions to some of our more innovative research data services contexts, it is not surprising that we see instances of successful boundary-spanning library and research IT collaborations across federally funded projects such as the Campus Research Computing Consortium (CaRCC, n.d.) and the Machine Actionable Plans (MAP) Pilot (ARL, n.d.), as well as in the literature (cf. Broude Geva et al. (2020); Koshoffer and Latessa (2023)). If boundaries, boundary activities, cross-disciplinary partnerships, and related communication challenges are a common element of data services, how might we incorporate the boundary spanning concept and related theory into data services planning?
In our positions as RCD professionals, we have built partnerships across service boundaries and appreciate both the benefits and the challenges of crossing jargon barriers, intermingling service approaches, and navigating diverse funding models. Here we share our experiences conversing across professional silos, in the spirit of contributing to an ongoing conversation about the evolution and resilience of data services.
Collaborations, context, and language
“Two countries divided by a common language” – George Bernard Shaw
Notably, even when data stakeholders want to work together, conversations can be difficult for a variety of reasons: differing contexts, vantage points, values, expertise, and language can all create a dissonance that hinders collaboration, producing what Tushman, an organizational behavior theorist, calls “communication impedance” (Tushman 1977, 591). Sometimes we need a spark to shift conversations onto a different track. What follows is a set of ideas to help partnerships move forward and scale.
Differing contexts, vantage points, values, and expertise: the Venn diagram of data stakeholder overlap, and implications for collaboration
Figure 1: Research computing professionals and data librarians frequently operate on the same boundary objects (sensu Levina and Vaast (2005, 339)), namely research data, but those data are produced by a third party (researchers) and at different stages of the research lifecycle.
In Figure 1, data professionals and research computing staff occupy largely separate parts of the research lifecycle. Research data are what they have in common, making data what Levina and Vaast refer to as “boundary objects” (2005, 339). While this figure vastly oversimplifies the positions of different data stakeholders, it asks us to consider the bridging aspect of data services work: how do we support researchers whose data management needs transcend the boundaries of our respective silos? Are handoffs and referrals the most seamless way to treat integrated data storage, transfer, and curation questions, if our goal is to integrate curation practices throughout the research project? What if instead we were to ensure that early conversations, at minimum, involved representatives from both data librarian and research IT units?
To facilitate cross-disciplinary collaborations, we need to address semantic disconnects. But how? Returning to the boundary spanning literature, we see that Tushman and Scanlan (1981, 291) unpack communication differences when they state,
[W]ords take on a richness of meaning that extends beyond simple associations to the full set of rules, habits, and conventions for the word's use…. Members of subunits also develop their own local social constructions of reality to help them define and interpret their social world …. These collective beliefs provide rules (e.g., premises of decision making, levels of satisficing) by which actors can define and manage the unique and recurring problems of organizational life. These shared beliefs provide the context and rules with which information is processed. The meaning of an event or an expression, then, can only be inferred when both its context and content have been both enacted and encoded….
How do we use knowledge of our shared context to fortify partnerships and bridge the gaps of research data facilitation? One way both to start this conversation and to gather data to support the discussion is for institutions to employ a service assessment, such as the CaRCC Capabilities Model (CaRCC, n.d.). Data from these assessments can be used to build shared service portfolios that highlight areas of overlap as well as identify gaps. Capabilities assessments, shared service portfolios, and shared projects are all excellent ways to reflect on and explore differing contexts, values, perspectives, language, and expertise while also considering ways to evolve data services to be more resilient to technological uncertainty.
Two [disciplines] divided by a common language
Differences in language usage create misunderstandings. Circumventing these pitfalls in our collaborations starts with having an explicit common goal and acknowledging the need to speak the same language. Understanding the priorities and needs of our colleagues and framing what we want to do in terms of how it helps both sides is a great way to get started. Maimone (2019) makes several project management suggestions in the context of supporting researchers, such as developing a project summary and tracking project progress, that complement these recommendations. Factoring in time to co-produce mutually agreeable working definitions for common “friction” words at the beginning of a project is another front-end loading process with lasting benefits. These processes may be lopsided at first, with one party exerting more effort than the other. Nevertheless, it is important to define, or ask others to define, ambiguous terms in the context of the conversation, and agree on a common vocabulary. A resource like CODATA’s RDM Terminology defines “key terms” in the research process, which may provide the foundation needed to support a shared understanding of terms amongst cross-disciplinary partnerships (CODATA RDM TWG 2024).
Below, we highlight some data management terms that reveal the different perspectives of research computing staff, data professionals, and researchers, and show how these terms can easily be misunderstood.
Archive
Are we talking about “sending data away,” as in the desire to move data from active computing infrastructure to a “slower” and less expensive data storage system? Or are we discussing how to “put it on display,” publicly preserving the dataset so that it is available for reuse in the future? If the priority for archiving is to put data away “just in case,” with little concern for accessibility, the work involved is very different from carefully documenting data for future reuse.
Metadata
Metadata is another “classic” example of a term whose definitions differ between Research IT staff and librarians. Scope is a key element to consider when defining this term across the two groups. Is “metadata” being used in reference to system- or platform-generated information, like file creation dates and sizes, or in reference to descriptive elements characterizing the research outputs?
One type of metadata is generated automatically by the filesystem, whereas the other is written by someone who understands the dataset well enough to show someone else its intended purpose and how to use it. This definitional disconnect makes it hard to understand why creating metadata might require a significant time investment. Distinguishing the two types of metadata is particularly important when working with researchers, who may otherwise develop a very different expectation of the time investment required of them.
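To make the distinction concrete, here is a minimal Python sketch (the file name and all descriptive fields are hypothetical) contrasting filesystem-generated metadata with researcher-written descriptive metadata:

```python
import json
import os
import time

# Create a small example data file (name and contents are hypothetical).
path = "results.csv"
with open(path, "w") as f:
    f.write("sample_id,value\nA1,0.42\n")

# System-generated metadata: produced automatically by the filesystem,
# with no human effort or understanding of the data required.
info = os.stat(path)
system_metadata = {
    "size_bytes": info.st_size,
    "modified": time.ctime(info.st_mtime),
}

# Descriptive metadata: written by someone who understands the dataset
# well enough to explain its purpose and use to others.
descriptive_metadata = {
    "title": "Example assay measurements",
    "creator": "A. Researcher (hypothetical)",
    "description": "Normalized assay values, one row per sample.",
    "variables": {
        "sample_id": "unique sample identifier",
        "value": "normalized instrument reading",
    },
}

print(json.dumps(system_metadata, indent=2))
print(json.dumps(descriptive_metadata, indent=2))
```

The first dictionary costs essentially nothing to produce; the second requires the time investment discussed above, which is why conflating the two senses of “metadata” leads to mismatched expectations.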
Storage and backup
With storage and backup, what is the context? An apt analogy may be to think in terms of dwellings. When research computing staff think about storage, the image that comes to mind might be a tent. Researchers in the thick of a project might be thinking about medium-term timelines, as with a starter home. Librarians, at the other end of the spectrum, may be thinking of the fortress needed for long-term preservation. If we are speaking with research computing staff, who may be thinking in the short-term context of local or private cloud high-performance input/output rates and capacity versus cost, it is critical to clarify the timeframe of our intended storage goals before discussing “active” storage, interim backups, and/or preservation solutions. Library research data management staff, in contrast, tend to think about the “lifecycle” of storage, focusing on data protection, preservation, and curation in the near and long term.
Sharing
Along those same lines, sharing is sometimes a question of “internal” versus “external.” Is it a question of sharing within a research team, meaning that, from a computing perspective, there are considerations around permissions and potential requirements for how sharing may be allowed within a given system? Sharing in this sense is quite different from a librarian’s or funding agency’s use of the word, which typically refers to the deposit of data in long-term public access repositories.
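The two senses can be sketched in Python (the file name, repository name, and deposit fields are all hypothetical, not any real repository’s API): internal sharing is a permissions question on a shared system, while external sharing is a deposit record for a public repository.

```python
import os
import stat

# "Internal" sharing: granting a research team read access on a shared system.
path = "results.csv"  # hypothetical data file
with open(path, "w") as f:
    f.write("sample_id,value\n")
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)  # owner rw-, group r--

mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))  # on POSIX systems: team members in the file's group can read

# "External" sharing: describing a deposit to a long-term public repository
# (field names are illustrative only).
deposit = {
    "repository": "example-institutional-repository",
    "access": "public",
    "license": "CC-BY-4.0",
    "embargo_until": None,
}
print(deposit["access"])
```

The first half is the system administrator’s concern (who can read a file today); the second is the librarian’s and funder’s concern (who can find and reuse the data indefinitely).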
Conclusion
Two excellent resources for starting semantic discussions are the CODATA RDM Terminology (CODATA RDM TWG 2024) and the CaRCC Capabilities Model (CaRCC, n.d.) discussed previously. Although the CaRCC tool does not contain definitions, its explicit naming of services will help both data librarians and research IT professionals see themselves in the scope of the defined activities, while also highlighting gaps for activities, such as data curation, that are not well documented in the assessment tool but are still critical to project scope. The CODATA resource provides dual definitions for some contested terms, often from both a computer science and a library perspective.
Ultimately, taking the time to listen to our colleagues and explore connections and differences in terminology, processes, and expectations will help develop shared goals and vocabulary, and open doors to productive future collaborations. The boundary spanning literature is also an excellent resource for furthering our knowledge of how to work across silos, and we invite our data services colleagues to join us in exploring and interrogating more of the boundary spanning literature in the context of data access and preservation. Slowing down enough to establish meaning in the shared research data service space yields deeper collaborations and understanding between researchers, research IT, and data professionals despite the pressures of uncertainty and rapid change.
References
ARL (Association of Research Libraries). n.d. “Machine Actionable Plans (MAP) Pilot: Building a Scalable Data-Management Infrastructure for Strategic Institutional Coordination.” Archived December 15, 2025, at https://web.archive.org/web/20251215083119/https://www.arl.org/building-a-scalable-data-management-infrastructure-for-strategic-institutional-coordination/#About.
Broude Geva, Sharon, Dana Brunson, Thomas Cheatham III, et al. 2020. “Fostering Collaboration Among Organizations in the Research Computing and Data Ecosystem.” In Practice and Experience in Advanced Research Computing 2020: Catch the Wave (PEARC '20). Association for Computing Machinery. https://doi.org/10.1145/3311790.3396645.
CaRCC. n.d. “Campus Research Computing Consortium.” Accessed January 8, 2026. https://carcc.org.
CaRCC. n.d. “CaRCC Capabilities Model.” Accessed January 8, 2026. https://carcc.org/products_resources/tools-capsmodel.
CODATA RDM TWG (CODATA Research Data Management Terminology Working Group). 2024. “CODATA RDM Terminology (2023, v0001).” Zenodo. https://doi.org/10.5281/zenodo.10626170.
Koshoffer, Amy, and Amy Latessa. 2023. "Playing in the Same Sandbox: Collaborations on Data Management, Research Technologies, and Research Computing." In Cases on Establishing Effective Collaborations in Academic Libraries, edited by Mary E. Piorun and Regina Fisher Raboin. IGI Global Scientific Publishing. https://doi.org/10.4018/978-1-6684-2515-2.ch015.
Levina, Natalia, and Emmanuelle Vaast. 2005. “The Emergence of Boundary Spanning Competence in Practice: Implications for Implementation and Use of Information Systems.” MIS Quarterly 29 (2): 335–363. https://doi.org/10.2307/25148682.
Maimone, Christina. 2019. “Good Enough Project Management Practices for Researcher Support Projects.” In Practice and Experience in Advanced Research Computing 2019: Rise of the Machines (learning) (PEARC '19). Association for Computing Machinery. https://doi.org/10.1145/3332186.3332198.
Tushman, Michael L. 1977. “Special Boundary Roles in the Innovation Process.” Administrative Science Quarterly 22 (4): 587–605. https://doi.org/10.2307/2392402.
Tushman, Michael, and Thomas J. Scanlan. 1981. "Boundary Spanning Individuals: Their Role in Information Transfer and Their Antecedents." Academy of Management Journal 24 (2): 289–305. https://doi.org/10.2307/255842.
¹ In itself a jargon term, “Research Computing and Data” is inclusive of both research computing and library specializations, including data librarians, and is commonly used in the Campus Research Computing Consortium (CaRCC).
