Introduction
In research, data is currency, so preventing data loss is critical. However, the method of keeping research data safe is not always straightforward. Often, researchers refer to “backing up” their data, but what do they mean? The definition of “backup” is nebulous: it can be so many different things, from syncing data to the cloud, to making a copy on an external hard drive, to using redundant disk arrays. All of these technical approaches protect against distinct threats to research data. A shared understanding of terms is essential to discussing the best way to protect research data.
Many data management workshops, technical documents, and guides reference the 3-2-1 rule for backing up actively used files (Darragh 2024; Godfrey 2025; Gonzalez-Espinoza 2025; Slevin 2025); note, however, that data protection for active data differs from preservation requirements; the latter is out of scope for this paper. The 3-2-1 rule is a mnemonic device for creating a robust active data backup strategy:
Store three copies of any important file: a primary and two backups.
Store the file on two different media types, such as a hard drive and optical media, to protect against hazards related to the physical characteristics of the media.
One copy should be stored off-site, or at least offline, in case something happens to the physical location of the primary backup.
This rule originated from a 2009 book about digital asset management for photographers (Krogh 2009). However, the author states upfront that “The structure described in this chapter is geared to people who want to create and maintain an archive, but who do not have an IT department to handle the data management. If you have an IT department that can handle the data that your digital photography creates, then some of what is described here is not necessary” (Chapter 4).
Given this caveat, does the 3-2-1 rule apply well to research data? Many storage systems that researchers use (e.g., cloud file storage platforms or central IT-run systems) already have some redundancy built in, such as replication to an off-site data center. Applying the 3-2-1 rule to a file stored on such a platform creates duplicated effort and wasted storage space.
Several other backup methods build upon the 3-2-1 rule:
Here, Near, Far recommends a working copy of your data located on the device you are using to manipulate it (here), a second copy on a separate device, such as an external hard drive or network drive (near), and a third copy in a distinct location, such as the cloud or a secondary data center (far) (Bueter 2022).
3-2-1-1-0 expands on the 3-2-1 rule by adding a copy in a location that is not connected to the internet to protect against online ransomware and malware, as well as monitoring and integrity checks to identify corruption (as in, 0 errors) (Search Data Backup, n.d.; Rabinov 2021).
4-3-2 increases the redundancy of the 3-2-1 rule by adding another copy at each level described in the 3-2-1 rule (Rabinov 2021).
These methods still implicitly assume that the individual has no IT support and that the data owner is responsible for the backup solution. Going forward, “the 3-2-1 rule” will be used to refer to both the original rule and these variations.
Efficiently implementing the 3-2-1 rule has become more challenging as newer data storage technologies integrate data loss protection into their platforms. More practically, operationalizing the 3-2-1 rule has become a challenge in research because of the growth in the size of data and the frequency of changes to it, which compounds the perennial challenges of cost and complexity. Newer research methods, such as cryo-EM (Baldwin et al. 2018) and sequence assembly (Oxford Nanopore Technologies, n.d.; Illumina Support Center, n.d.), generate such quantities of data that recomputing an analysis is often cheaper than storing the files it generates. For data that changes frequently, each retained change consumes additional space. This raises questions like:
What data should be kept?
What is raw data?
What data can be discarded?
Who is responsible for what, when using automation and/or cloud services?
Who can be asked for advice on these questions?
Workshop Motivations and Goals
The authors concluded that the 3-2-1 rule should evolve, but into what? Current guidance around research data backup, like the 3-2-1 rule, often does not cover the nuance and complexity of data protection. Presenting the various backup options and considerations to researchers is challenging, especially in the context of increasingly large and complex data. Practitioners can struggle with the terminology used in discussions about backups — for example, replication and redundancy — and with understanding what elements people focus on when discussing backup. Thus, the authors proposed a workshop for the 2025 RDAP Summit (“RDAP Summit 2025 Public Materials” 2025) to gather community input and brainstorm resources supporting these conversations. The goals of the workshop were to:
Create a better shared understanding of what “backup” means and create a vocabulary to communicate it.
Communicate better with the teams that run storage infrastructure to ensure that data protection is in place.
Develop resources for learning, both for researchers and research data management professionals.
The authors — all of whom have IT backgrounds as well as research data management backgrounds — provided their knowledge to workshop participants in the hope that the community could help synthesize it into valuable materials. The information is repeated here with the same call to action.
This paper discusses the many methods of data protection, what they protect against, and which methods should be employed to protect research data during the active research project. Immutable public data preservation after the project is complete will not be discussed, because the data protection methods for the two situations differ. The authors aim to create a better understanding of what the word backup could refer to in conversation and to create a shared vocabulary that facilitates communication about data protection. For our purposes, we offer a working definition of backup as any of various data protection modes implemented to counter an array of failure modes with the intention of preventing data loss. This shared vocabulary will improve communication with the teams that run storage infrastructure, as well as with researchers and other data professionals. This knowledge can be used to develop resources for data management professionals to use for professional development, for delivering instruction or presentations, and for creating instructional resources (e.g., LibGuides) for local stakeholders at institutions.
Failure Modes and Mitigations
To understand how to prevent data loss, we must first understand the causes of data loss. IT professionals will refer to these causes as “failure modes.” Common failure modes include:
Natural disasters (e.g., fires) physically destroying the storage devices.
Accidental deletion or modification of a file by human error.
Failure of the hardware/media.
Cyber attacks destroying or removing access to data.
File corruption.
Network or power outages preventing access to the data (temporary loss of data).
The 3-2-1 rule does take these failure modes into account and suggests ways to mitigate them; however, working with a team at an organization introduces a new type of failure mode that an independent photographer does not encounter: lack of file ownership. Files can be deleted due to the user account that owns them being deleted, or lost because the institution does not own the storage device.
Table 1 summarizes the failure modes and maps each to the protection method that mitigates it. This table distills the essential information that must be considered in any evolution or replacement of the 3-2-1 rule.
Table 1: A summary of the data failure modes and the corresponding protections to mitigate them.

| Failure Mode | Protection |
|---|---|
| Disaster | Offsite replication |
| Human error | Versioning, distinct copies |
| Hardware failure | Redundancy |
| Cyber attacks | One-way replication (not syncing) + versions |
| Data corruption | Replication (not syncing) |
| Data unavailability (e.g., network outage) | Local copy |
| Lack of file ownership | File ownership by an institutional account |
A note on redundancy: Redundancy means having a second set of hardware with a second copy of the data to serve as a backup in case the primary hardware fails. Any storage in the commercial cloud or on enterprise storage devices has redundancy built in. In all other situations, replication will also create redundancy. In short, researchers typically will not have to worry about redundancy, so it will not be discussed further. The remaining protections are defined and discussed below.
Replication
Replication is what it sounds like: a copy of the data on a different device. An IT unit may do this by replicating the data on its primary storage device to a secondary storage device in a completely different data center (University of Michigan, n.d.; Harvard University 2020; University of Chicago, n.d.). This mitigates the risk of disasters at the primary data center, such as a building fire or water damage. With replication in place, data can be restored from the secondary device, unaffected by the disaster. Additionally, if the data centers are in different regions of the world with different natural disaster patterns (a requirement for some regulations), the data could survive an earthquake or a tornado. This is the case when working in commercial cloud services, like Amazon Web Services, which typically offer replication between geographic regions (Amazon Web Services, n.d.).
The best practice for replication is to make the action one-way, meaning data only goes from the primary storage location to the secondary storage location. Secondary data does not return to the primary except during the recovery process, which creates a clear source of truth for what data is original. The data should also be immutable, meaning that it cannot be altered once it is sent to the secondary storage. This protects the replicated data itself from human errors, corruption, and cyberattacks, so that the replication can be trusted during the recovery process (Synology, n.d.). An example of a solution that uses one-way, immutable replication is CrashPlan (CrashPlan 2025). This service is a solution for desktops and laptops that copies the files on the device and sends an immutable and one-way copy of the data to the CrashPlan servers, which can then be accessed via a web portal if recovery is needed.
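To make the one-way, immutable pattern concrete, here is a minimal sketch in Python. It is illustrative only (the function names are ours, and real tools like CrashPlan are far more sophisticated): data flows only from primary to secondary, existing replicas are never overwritten, and each transfer is checksum-verified.

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    """Checksum a file in chunks so large files do not exhaust memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def replicate_one_way(primary: Path, secondary: Path) -> list[str]:
    """Copy files from primary to secondary only. Data never flows back,
    and files already replicated are never overwritten, approximating
    immutability on the secondary side."""
    copied = []
    for src in primary.rglob("*"):
        if not src.is_file():
            continue
        dst = secondary / src.relative_to(primary)
        if dst.exists():
            continue  # immutable: never overwrite an existing replica
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        if sha256(src) != sha256(dst):  # verify the transfer itself
            raise IOError(f"checksum mismatch replicating {src}")
        copied.append(str(dst))
    return copied
```

Note the trade-off this simplification makes: because existing replicas are never overwritten, a changed primary file is not re-replicated; production tools resolve this by pairing immutability with versioning, as discussed in the Versioning section.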
RAID is not a backup (Boniface, n.d.)
A popular, but misinformed, strategy for backup is to use a RAID device. RAID stands for Redundant Array of Inexpensive Disks (Patterson et al. 1988). Its purpose is in the name: redundancy. A RAID controller presents multiple physical hard drives as one virtual hard drive, and the array should be considered one device. Thus, RAID is not replication. Furthermore, the RAID controller adds a new, higher-risk failure mode. RAID’s original purpose was higher performance, achieved by dividing a file among several drives to allow multiple read points. The RAID controller manages this process. If the controller itself fails…you cannot put Humpty together again (the file cannot be reassembled).
RAID can still be helpful in research — and even part of a backup strategy — when used as intended. A RAID incorporated into an instrument computer has faster I/O than network drives, so it can serve as a temporary cache for instruments that produce data faster than the network can transmit. In the event of outages, like network or power outages, it can provide a local offline copy. This type of setup can work well for researchers doing field research in areas where the network is unstable.
Automation
Automation is something to consider when creating a replication solution, most notably because humans forget to do manual backups. Some technologists even assert that backups must be automated for this reason (Jude and Percival 2025). However, automation is not a panacea; it has its own failure modes, such as encountering an error and aborting the replication, or performing an incomplete replication. When deciding whether to automate, it is helpful to consider the frequency and criticality of the replication. A nightly backup where most of the data has not changed is a good candidate for automation, while a semi-annual disaster recovery replication might be better completed manually for greater control and integrity verification. Even manual replications should be easy for the operator to complete; operators will forget the details of what they must do if the task is not incorporated into some sort of script or app with an “easy” button.
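One way to mitigate automation's own failure modes (aborted or partial runs) is to have the scheduled job verify its own completeness and record the outcome, so a silent failure becomes a logged one. A minimal sketch, with names of our own invention, that could run at the end of a nightly cron or Task Scheduler job:

```python
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO)

def verify_completeness(primary: Path, secondary: Path) -> bool:
    """After an automated replication run, confirm that every file on
    the primary has a counterpart of the same size on the secondary.
    An aborted or partial run is logged rather than silently trusted."""
    missing = []
    for src in (f for f in primary.rglob("*") if f.is_file()):
        dst = secondary / src.relative_to(primary)
        if not dst.exists() or dst.stat().st_size != src.stat().st_size:
            missing.append(str(src))
    if missing:
        logging.error("replication incomplete: %d file(s) missing or short", len(missing))
        return False
    logging.info("replication verified: all files present")
    return True
```

A size comparison is a cheap first pass; a stronger check would compare checksums, at the cost of reading every file.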
Versioning
Versioning preserves the state of data at a point in time and comes in a few different flavors. Version control software includes familiar tools like Git/GitHub/GitLab (GitHub, n.d.; GitLab, n.d.), Bitbucket (Atlassian, n.d.), and Apache Subversion (Apache Software Foundation, n.d.); these support deliberate preservation of versions and include a documentation aspect. Snapshots are often encountered in the context of network storage devices (UC Berkeley, n.d.; Northwestern University, n.d.; San Diego Supercomputer Center, n.d.); file changes are stored locally for quick restores and to correct human errors, including deletion. Snapshots are designed specifically for restoration and do not hold much value otherwise. Cloud file storage often includes version history, which is similarly designed for restoration. Unlike snapshots, version history may or may not include deletions; that depends on the cloud subscription or lack thereof. Finally, the manual method of versioning is file duplication and naming, such as paper-final-version2-final-final.docx. This works as a last resort, but is not recommended at scale.
Versioning is often combined with one-way, immutable replication to guard against replication of deletions, corrupted files, and ransomware: the data can be restored from a time point before loss occurred. Time Machine for macOS (Apple, n.d.) is an example of snapshot versioning. If the backups are stored on a drive external to the macOS device, then it is also replication, though not one-way and immutable.
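The combination of versioning with one-way replication can be sketched as follows (a toy illustration under our own naming, not how Time Machine or any particular product is implemented): each run lands in a new timestamped folder, so earlier states remain restorable even if a deletion or corrupted file is later replicated.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path

def versioned_backup(primary: Path, backup_root: Path) -> Path:
    """Copy the primary into a fresh timestamped folder under
    backup_root. Old versions are never touched, so a later
    deletion or ransomware event cannot erase them."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
    dest = backup_root / stamp
    shutil.copytree(primary, dest)
    return dest

def restore(backup_root: Path, target: Path, version: str) -> None:
    """Restore a chosen point-in-time copy; this is the only path by
    which backup data flows back toward the primary side."""
    shutil.copytree(backup_root / version, target)
```

Real tools deduplicate unchanged files between versions instead of copying everything each run; the sketch trades that efficiency for clarity.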
Syncing
Syncing is not data protection. Two-way file synchronization is the process of ensuring that computer files in two or more locations are kept identical. Syncing is used for collaborating, switching between devices such as an instrument computer and your primary computing device, and for convenience. However, syncing is not replication or versioning. If you accidentally delete a file, that file will be deleted on all the other synced devices. If a file gets corrupted, the corruption is synced to the other devices. If a computer gets ransomware, that ransomware is synced to other devices.
Verification
The ultimate goal of every backup is the ability to restore preserved data in its entirety and identically to the original. Therefore, a suitable restoration procedure needs to be adopted, documented, and practiced for each employed backup technology and procedure.
To ensure the restoration goal, one must ensure that veracity is maintained at every stage of the backup lifecycle. This lifecycle is illustrated in Figure 1.

Figure 1: The backup lifecycle. All copies must be verified to be the same, even the backup copy over time, due to the potential for data loss at each stage.
A copy of the original data is created on the backup media of choice and restored later from that media. Veracity must be ensured at every stage of this process — including verifying the backup over time. Transfer errors can and do occur upon copying to or from the backup (represented by the solid arrows in Figure 1). Backup tools can mitigate this risk by verifying checksums of each transferred file on both ends of the transfer and re-copying those with discrepancies until completion. Thus, the backup and restore strategy is often reduced to verification while data is being moved.
An often overlooked aspect is that a backup copy “at rest” — meaning it is sitting on a storage platform untouched — can change over time for a variety of reasons (a phenomenon often colloquially referred to as “bit rot”). Cuneiform tablets break, magnetic tapes tear or demagnetize, etc. These factors need to be evaluated for every backup technology in use. They could lead to restored data that is identical to its aged backup copy, but different from the original that was preserved. This aspect of the backup lifecycle is generally in the purview of storage administrators and far removed from researchers or data stewards. However, the latter need to be cognizant of it. A common strategy is to create a “file manifest” (i.e., a list of every file with its size and checksum value at backup creation) that can be stored with the backup and serve as a litmus test for verification against the original. The process of verifying that the backup remains unchanged is called fixity checking.
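The manifest-and-fixity idea can be illustrated in a few lines of Python (function names here are ours; dedicated tools such as BagIt-style packagers provide the same capability more robustly):

```python
import hashlib
import json
from pathlib import Path

def _sha256(path: Path) -> str:
    """Checksum a file in chunks to handle large files."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(backup_root: Path, manifest: Path) -> None:
    """Record every file's relative path, size, and checksum at backup time."""
    entries = {
        str(f.relative_to(backup_root)): {"size": f.stat().st_size,
                                          "sha256": _sha256(f)}
        for f in backup_root.rglob("*") if f.is_file()
    }
    manifest.write_text(json.dumps(entries, indent=2))

def check_fixity(backup_root: Path, manifest: Path) -> list[str]:
    """Return the files whose current state no longer matches the manifest."""
    expected = json.loads(manifest.read_text())
    failed = []
    for rel, info in expected.items():
        f = backup_root / rel
        if (not f.exists() or f.stat().st_size != info["size"]
                or _sha256(f) != info["sha256"]):
            failed.append(rel)
    return failed
```

Running the fixity check on a schedule, and after every restore, is what turns a stored copy into a trustworthy one.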
Ownership
The protections described thus far have one thing missing: the user. These protections implicitly assume that only one user is involved. However, research is often done in teams as part of a larger institution. This adds an additional failure mode that must be considered: ownership. There are two types of ownership: data ownership and file ownership. Data ownership is the more conceptual of the two: which entities own the data from a property standpoint? When setting up accounts for various services, it is important to read the terms of service. Those terms often include a clause stating that the entity providing the service will own and/or use any data uploaded into the service (Google, n.d.; Meta, n.d.; OpenAI, n.d.). It is often important to leverage the purchasing department at an institution to help create contracts and agreements that retain data ownership, since your institution likely has an ownership stake in the data. Data ownership at research institutions is often complicated and shared between researchers and their institution (Arizona State University 2025).
File ownership is more practical. Computers and systems assign an owner to all files corresponding to a user account. When a research group member uploads research data to their institutional Dropbox account, their user account owns the files. If their user account is deleted, say, because they graduated and left the university, the files will get deleted along with the account, regardless of how many other accounts the file has been shared with. This also happens when using personal devices, such as photos stored on personal phones. When the phone leaves the university, the data leave with it. From a technical standpoint, leveraging group or institutional ownership when using services is important. For example, Google Shared Drives instead of My Drive; Dropbox Team Folders instead of Dropbox individual accounts; Microsoft SharePoint instead of OneDrive. Box, unfortunately, does not have built-in group ownership. The institution must provide some sort of group account to access Box. Similarly, computers and mobile devices owned and managed by the institution should be used as often as possible. This wrinkle is not usually considered as part of contingency planning.
Communicating About Data Protection
The whole conversation around data protection at a university involves several perspectives: researchers, storage administrators, and leadership. Researchers have a practical perspective; they want to know where to store their data; the lower the cost, the better. Storage administrators have a technical perspective. They are interested in the service's current storage configuration and continuity, meaning the service needs to keep working. Leadership has a high-level perspective. They are concerned about needs being addressed amongst the various stakeholders in the institution and have strategic priorities and budgets that must be considered. Where is the overlap between researchers, storage administrators, and leadership from the perspective of a research data management support specialist?
To be most effective in conversation about backup, inquire with storage administrators about the data protection features employed on these systems. The terminology used in this paper is the terminology used by storage administrators. Ask them: Who is responsible for replication? What are possible replication targets? What snapshot schedules are available? For cloud file storage services at an institution, the storage administrators are usually on a larger team called something like “collaboration” or “productivity.” Ask them: What subscription and corresponding features does the institution have? Are deleted files recoverable? The goal is to learn all the possibilities researchers have for tailoring workflows. Do not be afraid to brainstorm with these storage administrators about solutions for backup or building out workflows — they have valuable expertise.
Talking to researchers about the institution's storage systems, along with data management best practices, is important. In particular, ask questions to understand the researchers’ workflows, and make suggestions that balance the risk of data loss with workflow compatibility and cost. In discussions with researchers related to storage and backup, keep track of patterns and pain points in aggregate, as this is invaluable data for identifying gaps and escalating research needs in these areas to leadership.
It is also important to acknowledge that change is hard when talking to leadership. Do not assume they know why your work is important. You must tie your requests to the University's strategic priorities, compliance issues, or other motivations that drive leadership decisions. Data storage and backup are ongoing challenges that face many institutions. If possible, leading with a solution (or part of a solution) that addresses these challenges — and ties to concerns about data protection — will be welcomed by leadership.
Examples in the Wild
Here are some examples of data protection workflows from US-based institutions that touch on backup approaches that mitigate potential failures, but also address increasing requirements around data protection that can exist at institutional, state, and federal levels.
The first example involves “small” data, defined here as data that do not exceed the limits of cloud file storage quotas at an institution (e.g., <150GB total). Institution 1 had a Box contract, but it became too expensive, and the institution declined to renew it. SharePoint was designated as the replacement for “small” data and collaborative file storage. Migrations are always terrible, but this one was further complicated by the fact that platforms differ in nuanced, but significant, ways (Magle and McCaffrey 2023). This means migrations are often not 1-to-1 transfers. Unlike Box, SharePoint is intended to be more of an intranet — sharing within an institution, not between institutions. Multi-institutional collaborations at Institution 1 experienced challenges because of this change. With SharePoint as the only institutionally contracted cloud solution, other services are not options from a compliance standpoint, even when SharePoint is not suitable for research workflows. A key reason these other platforms are not options is that National Security Presidential Memorandum-33 (NSPM-33) — a federal memorandum that includes cybersecurity requirements — requires institutional oversight for data storage, which is only feasible with institutional contracts (Whitman 2022; Biden 2021).
In the context of data protection, SharePoint data are versioned, not replicated, and versions extend back only 30 days. Researchers may be unaware of this and fail to recognize the need for an independent backup plan that initiates replication. As an alternative to SharePoint, high-performance storage is available at Institution 1 for small research data, despite being intended primarily for “big” data storage (“big” data physically cannot go in SharePoint because of limits on file size, number of files, and total storage (Magle and McCaffrey 2023)). However, this service is relatively unknown on campus and presents a sizable learning curve for researchers. Without appropriate awareness, researchers with “small” data may opt for alternative, less robust data protection strategies, or no strategy at all. At this point, data protection becomes as much about outreach and education as about the availability of alternatives. This example likely resonates with many institutions dealing with similar situations.
As a second use case, Institution 2 provides an example of a complex workflow employing many tools to address backup needs in order to protect the data gathered during the course of research. For background, the institution has a research office on a remote island that does not get reliable internet service. Data collection occurs on a ship and must be sent to central storage on the mainland for analysis; however, collaborative storage is also needed to view files easily from computers on the island. The workflow devised was to designate Dropbox as the primary source of data given these requirements. The Dropbox contract includes protection against accidental deletion, so it serves as appropriate versioning. The data are synced to the analysis server, but still available in Dropbox for easy viewing of file contents. The data are replicated to Institution 2’s inexpensive tape storage as a primary backup and also to a commercial network attached storage (NAS) system (meant for average customers as opposed to enterprises) on the island as an offline backup. In this way, the workflow not only protects against data loss, but also ensures effective collaboration between researchers and availability of the data.
Workshop Results & Next Steps
Given that the 3-2-1 rule does not provide a wholesale solution to researchers' (varied) backup requirements, how do we, as data professionals and educators, move forward regarding backup for active research data? Shifting away from the nomenclature of 3-2-1 will involve changes to the ways many of us discuss backup in instruction, documentation, and campus discussions on the subject. At the workshop held at RDAP 2025, following the presentation of the above content, attendees were asked to participate in breakout sessions that focused on developing learning materials to support the work of data professionals in this area moving forward.
Two of the breakout sessions focused on brainstorming content updates to static resources (e.g., LibGuides) as well as “live” materials (e.g., presentations). For example, what is important for researchers to consider when making data storage and protection decisions, and, given the discussion around the nuances of backup, how can that be distilled into a slide (or two) to be used in a data management presentation? Table 1 emerged as a resource from the workshop that can be utilized to help clarify what failure modes are being mitigated and which approaches to backup are being used. The idea of developing researcher personas also emerged as a way to tailor LibGuide storage and backup recommendations across the continuum of small to big data, and other scenarios of interest.
Another breakout session focused on resources that might be helpful in “talking to IT” about storage services and any mitigations that these storage services might have in place. Given the interdisciplinary nature of such discussions (involving researchers, librarians, and IT professionals), a common understanding of often divergent terminology is important. The CODATA Research Data Management Terminology (RDMT) Working Group’s RDMT (CODATA RDM Terminology Working Group 2024) is an excellent starting point for clarifying some terminology in this space. The RDMT observes an existing practice of disciplinary disambiguation for some terms, such as in the definitions for “Container (digital archiving)” and “Container (computing).” Submitting entries for ambiguous IT-and-curation-related terms such as “archive,” “backup,” “data protection,” “metadata,” and “data sharing” could be a productive next step. One idea that developed in this breakout room was the potential for an “IT office hour,” potentially to be hosted by an organization such as RDAP. This kind of space can also offer the opportunity to develop resources such as sets of defined questions to ask storage administrators or example workflows for storage and backup that can then be brought by a data librarian, for example, to discussions on this topic at their respective institutions.
The final breakout session was focused on “train the trainer” and discussed what information, training, or resources were needed for librarians, or other data professionals, to talk with more familiarity about this topic at their institutions. Some of the discussion was similar to that described in the other breakout rooms — the need to define “jargon-y” terms, how to troubleshoot common questions related to backup — but a distinct topic that emerged was that of assessing risk and how that influences the approach to backing up research data. One workshop participant noted that researchers or research groups are likely not able to implement a complete backup solution at once. Given that reality, what are ways to approach advising researchers on assessing the most critical point(s) of “failure,” which can lead them to choose an appropriate backup method (i.e., see Table 1)? Having an IT- and curation-informed risk rubric or matrix akin to Caldrone's substantive risk assessment activity (Caldrone 2022) to reference would help as a resource to provide this kind of guidance.
Returning to the motivations of the workshop, the event had mixed success. As for creating understanding and providing vocabulary, attendees were highly engaged with the material and provided numerous compliments about how digestible and illuminating the content was. Given this feedback, the presentation itself is highly valuable and should be repeated. Regarding the last goal of tangible deliverables, the breakout sessions following the workshop presentation encouraged great conversation and idea generation, but did not result in the creation of learning or training resources as the authors desired. In retrospect, structuring the breakout sessions with more defined exercises or drafted materials to start may have made it easier for workshop participants to engage and create outputs that resulted from the information shared in the presentation. The workshop slides, breakout room notes, and current and future resources for community use have been collected in an OSF project that is available to the RDAP community (McCaffrey et al. 2025). Interested contributors are invited to reach out directly to the authors with ideas on resource development. One participant has already incorporated the workshop material into their presentations and contributed the tailored presentation back to the project. The authors are also currently investigating avenues toward hosting follow-up sessions on resource development and invite the whole community to contribute to collective efforts.
References
Amazon Web Services. n.d. “Replicate Data within and between AWS Regions Using Amazon S3 Replication.” Amazon Web Services, Inc. Accessed August 9, 2025. https://aws.amazon.com/getting-started/hands-on/replicate-data-using-amazon-s3-replication.
Apache Software Foundation. n.d. “Apache Subversion.” Apache Subversion. Accessed August 9, 2025. https://subversion.apache.org.
Apple. n.d. “Back up Your Mac with Time Machine.” Apple Support. Accessed August 7, 2025. https://support.apple.com/en-us/104984.
Arizona State University. 2025. Ownership of Research Data and Materials & Intellectual Property Management Implementation Policy. RSP 604. March 1. https://public.powerdms.com/ASU/documents/1559312.
Atlassian. n.d. “Bitbucket | Git Solution for Teams Using Jira.” Bitbucket. Accessed August 9, 2025. https://bitbucket.org/product.
Baldwin, Philip R., Yong Zi Tan, Edward T. Eng, et al. 2018. “Big Data in cryoEM: Automated Collection, Processing and Accessibility of EM Data.” Current Opinion in Microbiology 43 (June): 1–8. https://doi.org/10.1016/j.mib.2017.10.005.
Biden, Joseph. 2021. Presidential Memorandum on United States Government-Supported Research and Development National Security Policy – The White House. Presidential Memorandum NSPM-33. The White House. https://trumpwhitehouse.archives.gov/presidential-actions/presidential-memorandum-united-states-government-supported-research-development-national-security-policy.
Boniface, Joshua. n.d. “RAID Is NOT a Backup!” Accessed August 6, 2025. https://www.raidisnotabackup.com.
Bueter, Ruth. 2022. “File Storage and Backup Best Practices.” The Rotation: A Himmelfarb Library Blog. November 23. https://blogs.gwu.edu/himmelfarb/2022/11/23/file-storage-and-backup-best-practices.
Caldrone, Sandi. 2022. What If You Lost Your Data? A Risk Assessment Activity. August 29. https://hdl.handle.net/2142/114425.
CODATA RDM Terminology Working Group. 2024. CODATA RDM Terminology (2023, V0001): Overview. January 30. https://doi.org/10.5281/zenodo.10626170.
CrashPlan. 2025. “CrashPlan: Enterprise Data Resilience.” CrashPlan. https://www.crashplan.com/home.
Darragh, Jen. 2024. “LibGuides: Research Data Management: Storage and Backup.” October 30. https://guides.library.duke.edu/c.php?g=633433&p=4429284.
GitHub. n.d. “About GitHub and Git.” GitHub Docs. Accessed August 9, 2025. https://docs.github.com/en/get-started/start-your-journey/about-github-and-git.
GitLab. n.d. “The Most-Comprehensive AI-Powered DevSecOps Platform.” GitLab. Accessed August 9, 2025. https://about.gitlab.com.
Godfrey, Krista. 2025. “LibGuides: Research Data Services: Storage and Security.” LibGuide. July 21. https://libguides.uvic.ca/researchdata/planning/storage_security.
Gonzalez-Espinoza, Alfredo. 2025. “CMU LibGuides: Data Management for Research: Data Security and Backup.” August 7. https://guides.library.cmu.edu/researchdatamanagement/security.
Google. n.d. “Google Terms of Service.” Privacy & Terms – Google. Accessed August 9, 2025. https://policies.google.com/terms.
Harvard University. 2020. “Data Storage.” FAS Research Computing, November 16. https://www.rc.fas.harvard.edu/services/data-storage.
Illumina Support Center. n.d. “Data Output and Storage.” Accessed August 8, 2025. https://support-docs.illumina.com/IN/NovaSeq6000Dx_HTML/Content/IN/NovaSeqDx/DataOutputStorage.htm.
Jude, Allan, and Colin Percival, dirs. 2025. RAID Is NOT a Backup and Other Hard Truths About Disaster Recovery. Klara Inc. YouTube, 54:00. https://www.youtube.com/watch?v=4H6AWMyDnlY.
Krogh, Peter. 2009. “6. Backing Up and Validating Data.” In The DAM Book: Digital Asset Management for Photographers, 2nd ed. A Digital Photography Ecosystem. O’Reilly. https://learning.oreilly.com/library/view/the-dam-book/9780596803353/ch06.html.
Magle, Tobin, and Deb McCaffrey. 2023. “Why Can’t I Just Use Dropbox? A Comparison of Cloud File Storage Platforms Used for Research.” Journal of eScience Librarianship 12 (3): e763. https://doi.org/10.7191/jeslib.763.
McCaffrey, Deb, Erin D. Foster, Lev Gorenstein, Tobin Magle, and Venice Bayrd. 2025. “Evolving the 3-2-1 Backup Rule for More Resilient Data.” OSF, February 24. https://osf.io/4vtmc.
Meta. n.d. “Meta Terms of Service.” Meta. Accessed August 9, 2025. https://www.facebook.com/terms.
Northwestern University. n.d. “Research Data Storage Service (RDSS).” Northwestern University - Services. Accessed August 9, 2025. https://services.northwestern.edu/TDClient/30/Portal/Requests/ServiceDet?ID=96.
OpenAI. n.d. “Terms of Use.” OpenAI. Accessed August 9, 2025. https://openai.com/policies/row-terms-of-use.
Oxford Nanopore Technologies. n.d. “How Much Storage Space Is Required for PromethION Sequencing Data?” Accessed August 8, 2025. https://nanoporetech.com/support/devices/PromethION-24-and-48/how-much-storage-space-is-required-for-promethion-sequencing-data.
Patterson, David A., Garth Gibson, and Randy H. Katz. 1988. “A Case for Redundant Arrays of Inexpensive Disks (RAID).” Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data - SIGMOD ’88, 109–116. https://doi.org/10.1145/50202.50214.
Rabinov, Natasha. 2021. “What’s the Diff: 3-2-1 vs. 3-2-1-1-0 vs. 4-3-2.” Backblaze Blog | Cloud Storage & Cloud Backup, July 21. https://www.backblaze.com/blog/whats-the-diff-3-2-1-vs-3-2-1-1-0-vs-4-3-2.
“RDAP Summit 2025 Public Materials.” 2025. In The Research Data Access and Preservation Association (RDAP) Documents. RDAP Summit 2025. OSF. https://osf.io/uydpe/wiki/RDAP%20Summit%202025%20Public%20materials.
San Diego Supercomputer Center. n.d. “Storage.” San Diego Supercomputer Center. Accessed August 9, 2025. https://www.sdsc.edu/services/storage.html.
Search Data Backup. n.d. “How the 3-2-1-1-0 Backup Rule Reflects Modern Needs | TechTarget.” Accessed January 24, 2025. https://www.techtarget.com/searchdatabackup/tip/How-the-3-2-1-1-0-backup-rule-reflects-modern-needs.
Slevin, Aisling. 2025. “Library Guides: Research Data Management: Writing a Data Management Plan: 3) Storing, Back-up & Security.” July 23. https://tus.libguides.com/researchdatamanagement/storing-backup-security.
Synology. n.d. Synology WriteOnce (WORM) White Paper. White Paper. Synology Knowledge Center. Accessed August 9, 2025. https://kb.synology.com/en-us/WP/WriteOnce_White_Paper/1.
UC Berkeley. n.d. “Backup Implementation.” Research IT: Advancing Research@Berkeley. Accessed August 9, 2025. https://docs-research-it.berkeley.edu/services/research-data/data-storage-and-backup/backup-implementation.
University of Chicago. n.d. “Storage and Backup.” Research Computing Center. Accessed August 9, 2025. https://rcc.uchicago.edu/resources/storage-and-backup.
University of Michigan. n.d. “Snapshots and Replication.” ITS Documentation. Accessed August 9, 2025. https://documentation.its.umich.edu/node/5042.
Whitman, Lloyd J. 2022. Guidance for Implementing National Security Presidential Memorandum 33 (NSPM-33) on National Security Strategy for United States Government-Supported Research and Development. US National Science and Technology Council. https://bidenwhitehouse.archives.gov/wp-content/uploads/2022/01/010422-NSPM-33-Implementation-Guidance.pdf.