Introduction
Publication of research data alongside journal articles has increased over the past 20 years (Tedersoo et al. 2021). This shift has been fueled by the open science movement and is reflected in both journal guidelines and funding agency policies that mandate public data sharing (Nelson 2022; National Institutes of Health 2020; Howard Hughes Medical Institute 2025). The promises of open data include improved transparency and reproducibility of research and the potential of data reuse to accelerate scientific discoveries (Kaiser and Brainard 2023). These possibilities are only attainable if the shared data is FAIR: findable, accessible, interoperable, and reusable (Wilkinson et al. 2016). However, poor depositing practices and limitations of data repositories can make it difficult to find published data (Johnston et al. 2024). To enhance the discoverability of research data outputs, some institutions have built dataset catalogs (Yee et al. 2023; Mannheimer et al. 2021; Sheridan et al. 2021). A dataset catalog is a searchable directory of records about datasets produced by researchers at an institution. The catalog does not host the data itself; rather, it contains the metadata needed to discover and access the data. A dataset catalog is an opportunity to showcase the breadth of research conducted at an institution, improve the discoverability and reuse of datasets, and provide a mechanism to monitor data sharing for compliance purposes.
The University of Alabama at Birmingham (UAB) is a research-intensive (R1) university with over 20,000 undergraduate, graduate, and professional students (Indiana University Center for Postsecondary Research 2021; UAB 2024a). UAB researchers rely heavily on federal funding, primarily from the National Institutes of Health, which provides more than 400 million dollars in research funding annually (UAB Office for Finance and Administration 2024). The UAB Libraries is a central unit staffed by approximately 70 employees. Within the library, the Office of Scholarly Communication is a specialized unit of four faculty members who provide support across the publishing life cycle in areas such as copyright, open access, research data services, and management of our institutional repository, the UAB Digital Commons.
In 2024, the UAB Office of the President established an interdisciplinary Research Data Management Working Group as part of a broader Research Strategic Initiative (UAB 2023) to address issues affecting the production, management, and sharing of research data on campus. This Working Group, which includes representatives from the library, research staff and faculty, and university leadership, identified the need for a central access point to research data originating from the UAB community. The development of the UAB Research Data Catalog (RDC), spearheaded by the library, was proposed to meet the need for centralized access to UAB datasets, and to make these datasets more FAIR for the benefit of the campus community (UAB 2024b).
The design of any catalog must be informed by the information seeking activities of its users. We considered five main prospective user groups within UAB when developing the catalog: (i) researchers, who may submit their data records to the catalog, use the catalog to identify collaborators, and maintain a centralized record of their own data outputs; (ii) administrators, who may track and report on data outputs by searching the catalog; (iii) educators, who can find relevant, real-world data to enrich their classroom activities in the catalog; (iv) current and prospective students, who can identify research opportunities and leaders in their field of interest; and (v) information professionals, who may use the catalog to improve services for data management and data sharing.
In this paper, we describe the construction of the UAB Research Data Catalog as a collection within an existing institutional repository platform (UAB 2024b). We outline the customizations made to the metadata fields and user interface, the workflow used to semi-automate the harvesting of dataset records, and the measures we have taken to allow ready adoption of the process by other institutions.
Planning and Design
UAB’s institutional repository is hosted on Digital Commons (an Elsevier product). Collections within the UAB Digital Commons include campus news, journals published on campus, and scholarly works by students, faculty, and staff. The platform is managed by one dedicated faculty member in the library. By hosting the RDC in Digital Commons, we sought to make its content more accessible by leveraging existing familiarity with the interface and the other campus-related content it contains. Because of this, the RDC may be more readily adopted by the community and increase the overall value of the repository. This option has the benefit of introducing no new platform management overhead, which made it more financially appealing than purchasing a separate commercial data catalog product.
Several institutions have built fully customizable data catalogs available as stand-alone applications. While this approach has many benefits, including full control of the catalog records and user experience, it also requires time and financial commitments, as well as personnel with expertise in software maintenance, security, and infrastructure. Hosting the RDC on Digital Commons allows us to sidestep many of these concerns by relying on existing solutions within the commercially developed product. Digital Commons provides a secure location for records, performs regular platform maintenance and updates, and offers consultations with expert staff to assist in repository management. Since one major purpose of the dataset catalog is to serve as an avenue for the discovery of UAB datasets, it is important that the catalog is indexed by search engines. While this is possible with a self-hosted platform, it requires technical knowledge and regular maintenance (Mannheimer et al. 2021). Digital Commons provides search engine optimization with Google for a user-friendly route to discovering UAB datasets. Lastly, Digital Commons tracks usage metrics, which enables our team to assess the RDC's uptake within the community.
Benchmarking to establish essential platform and record features
The default metadata categories and collection settings in Digital Commons are not sufficient to adequately describe research datasets. We reviewed the interfaces and metadata fields of several other data catalogs to inform the design of ours. We investigated commercial and open-source platforms such as Digital Commons Datasets, CKAN, and InvenioRDM. We also reviewed data catalogs within the Harvard Dataverse data repository system (https://dataverse.org) and dedicated self-hosted catalogs such as those from New York University Langone Health (Yee et al. 2023), Memorial Sloan Kettering (2024), and Montana State University (Mannheimer et al. 2021). We observed common features of the user interfaces, as well as typical metadata elements used to document research datasets.
Metadata development
As we developed our metadata schema, we were informed by earlier work that reviewed common metadata fields in biomedical dataset catalogs (Read 2015). We reviewed the metadata schemas of the data catalogs listed above, the beta version of the NLM dataset catalog (National Library of Medicine 2024), and multiple generalist repositories, including Dryad and Zenodo. We prioritized metadata fields that would integrate with common data documentation schemas such as DataCite (DataCite Metadata Working Group 2024), Data Documentation Initiative (DDI Alliance 2012), the World Wide Web Consortium's Data Catalog Vocabulary (DCAT) (Albertoni et al. 2024), and schema.org (https://schema.org), which is used by Google Dataset Search. The finalized set of metadata fields used standard Digital Commons fields when possible while remaining discipline-agnostic, allowing the catalog to represent datasets across all fields of research.
Customization of the repository collection for dataset records
Our customized metadata schema was implemented in a Digital Commons collection created for the RDC (see Figure 1 and the full metadata schema (Warner 2024, Warner 2025)). We simplified the collection's user interface and removed default metadata fields that were unnecessary or potentially confusing for users. We removed the Recommended Citation field to encourage users to cite the dataset itself rather than the RDC entry for the dataset. We also added a metadata field for author ORCIDs, which are hyperlinked to the authors' ORCID profiles. Lastly, we added other dataset-specific metadata fields to provide links to related works and information about funding, data access, and licensing.
Figure 1: Comparison of the default record page with the customized data catalog page to improve user experience with metadata fields altered (orange), removed (red), and added (green). Underlined metadata fields include hyperlinks to external resources. Metadata fields marked by an asterisk are searchable using the Digital Commons basic or advanced search tool. A-B) The “Download” button is renamed “Link to Dataset” and “Description” changed to “Abstract” for clarity. C-D) Unnecessary metadata fields for document type and recommended citation are removed. E-I) Metadata fields for author ORCIDs, related items, repository, access instructions, licensing information, and funder information are added.
Scope of the Research Data Catalog
We focused our efforts on datasets housed in generalist repositories rather than subject-specific repositories. The rationale for this was as follows: (i) datasets in generalist repositories are generally less findable and would benefit more from being included in the RDC, (ii) researchers in fields with subject-specific repositories may already know where to go to find data, and (iii) subject-specific repositories often have specialized metadata fields which could be hard to accommodate in a general catalog. We furthermore predict an increase in the number of datasets shared in generalist repositories following the 2023 update to NIH Data Management and Sharing Policy and a corresponding need for these datasets to be recorded in the RDC (National Institutes of Health 2020).
Criteria for dataset inclusion
We formalized inclusion criteria for a dataset in the RDC. First, the dataset must constitute research data that can be used for one or more of the following purposes: reproducibility studies, secondary analyses, community resource development, or education. For instance, a "dataset" in a repository consisting of supplementary figures for a journal article, but not the underlying data, would not be included. Second, the dataset must be authored by a researcher affiliated with the university (i.e., faculty, staff, or student) at the time the dataset was published. Third, it must include data that is aligned with the academic and research goals of the university. These inclusion criteria are applied during the initial discovery of datasets and the manual curation of prospective catalog entries.
By the end of the planning phase, we had selected Digital Commons as a platform, developed metadata fields, and built a customized collection to house the RDC. We had also outlined a clear scope for the RDC and defined its inclusion criteria. The benchmarking process was heavily informed by the previous and current work of the Data Discovery Collaboration (DDC), a community committed to advancing research data catalog practices (Sheridan et al. 2021). This deliberate planning process forced us to think critically about each design and metadata element prior to building and populating the RDC.
Populating the RDC
We developed a workflow that could reliably identify UAB datasets, extract their metadata, and reformat them into the Digital Commons batch upload spreadsheet format (Figure 2). Our workflow accommodates the limited size and coding experience of our team and allows us to allocate more time to the manual curation and enhancement of records, rather than to locating datasets and extracting metadata. We developed Python code that calls repository APIs to search for UAB-affiliated datasets, extracts their metadata records, and reformats the metadata as required.
Figure 2: Necessary steps to harvest metadata records from data repositories and upload them to the UAB RDC. The primary file format of the dataset records for each stage is written above the stage.
Utilizing the Zenodo API to harvest dataset records
The first stage of the workflow uses APIs to locate UAB-affiliated datasets and extract their metadata. Part of our benchmarking process involved selecting generalist repositories with user-friendly APIs. We began with the Zenodo API for multiple reasons: (i) the API is easy to use, (ii) the API can access datasets in both Zenodo and Dryad due to a past mirroring agreement between the two repositories, and (iii) a significant number of UAB datasets are shared in these repositories. Later, we developed code that would perform an equivalent search using the DataCite API. This API accesses a broader range of datasets with DOIs minted by DataCite, including those housed in Figshare, Zenodo, and Dryad.
Python code to reformat API output
The code executes an API call to search for items with the type “dataset” and an affiliation of “University of Alabama at Birmingham”. The response is returned in JSON format. The JSON response is flattened into a DataFrame object using Pandas, a Python library specialized for manipulating tabular data (McKinney 2010). Once in tabular format, the records are reformatted to the batch upload spreadsheet format, where each column corresponds to a metadata field in the RDC and each row corresponds to a record in the RDC. This reformatting includes adding HTML hyperlinking to ORCID profiles, related items, and license information (see underlined metadata fields in Figure 1). For records retrieved using the DataCite API, different reformatting may be required depending on the repository the dataset is stored in. The records are then exported in spreadsheet format as an .xlsx file. Pandas does not support export to .xls, which is the format accepted by Digital Commons for batch uploads, but this is resolved by manually converting the spreadsheet to the proper format. This spreadsheet then undergoes manual inspection, curation, and enhancement by our team (see Figure 3).
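As a minimal sketch of this flattening-and-reformatting step: the nesting below follows the shape of Zenodo's JSON responses (records under `hits.hits`), but the record values and the output column names are illustrative, not the actual UAB batch-upload schema.

```python
import pandas as pd

# Example response shaped like Zenodo's /api/records output. Illustrative
# values; a real response would come from requests.get(...).json().
response = {
    "hits": {
        "hits": [
            {
                "doi": "10.5281/zenodo.0000000",
                "metadata": {
                    "title": "Example UAB dataset",
                    "publication_date": "2024-01-15",
                    "creators": [
                        {"name": "Doe, Jane",
                         "orcid": "0000-0000-0000-0000",
                         "affiliation": "University of Alabama at Birmingham"}
                    ],
                },
            }
        ]
    }
}

# Flatten the nested JSON into one row per dataset record.
df = pd.json_normalize(response["hits"]["hits"])

# Rename flattened keys to batch-upload-style column headers (illustrative).
df = df.rename(columns={
    "metadata.title": "title",
    "metadata.publication_date": "publication_date",
})

def orcid_links(creators):
    """Build HTML hyperlinks to each author's ORCID profile (cf. Figure 1)."""
    return "; ".join(
        f'<a href="https://orcid.org/{c["orcid"]}">{c["name"]}</a>'
        for c in creators if c.get("orcid")
    )

df["author_orcids"] = df["metadata.creators"].apply(orcid_links)

# Export for batch upload; the .xlsx file is then converted to .xls manually.
# df.to_excel("rdc_batch_upload.xlsx", index=False)
```

The same pattern applies to DataCite responses, with different key names in the `rename` mapping.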
Figure 3: Checklist for manual inspection, curation, and enhancement.
First, we inspect the entries to verify that (i) all records pulled by the API are included in the output spreadsheet, (ii) no characters were corrupted during data manipulation, and (iii) no entries are duplicated, whether within the output spreadsheet itself (as a result of multiple versions of the same dataset) or against records already included in the RDC. For version duplicates, the code consolidates explicitly versioned DOIs (i.e., those ending in ".v#"), but a final manual inspection is still recommended. For existing records, we developed additional Python code to compare the DOIs in the output spreadsheet to those already in the RDC and remove duplicated DOIs from the output spreadsheet. The inspection stage serves as a valuable checkpoint to flag scenarios where the automated metadata reformatting does not work as intended so that the code can be refined to account for these cases.
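The two de-duplication passes can be sketched as follows. The DOIs are illustrative, and a real run would load the already-cataloged DOIs from an export of the RDC rather than a hard-coded set.

```python
import re

# DOIs returned by an API search (illustrative values).
harvested = [
    "10.6084/m9.figshare.123456.v1",  # explicitly versioned
    "10.6084/m9.figshare.123456.v2",  # another version of the same dataset
    "10.5281/zenodo.2000000",
    "10.5061/dryad.abc123",
]

# DOIs already present in the catalog (would come from an RDC export).
already_cataloged = {"10.5061/dryad.abc123"}

def base_doi(doi):
    """Strip an explicit '.v#' version suffix, if present."""
    return re.sub(r"\.v\d+$", "", doi)

# Consolidate versioned DOIs, then drop anything already in the catalog,
# preserving the order of first appearance.
seen, to_upload = set(), []
for doi in harvested:
    root = base_doi(doi)
    if root not in seen and root not in already_cataloged:
        seen.add(root)
        to_upload.append(root)
```

A final manual check remains advisable, since duplicates can also arise from re-deposits that do not share a DOI stem.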
Next, the output spreadsheet is curated to harmonize the entries. This process largely depends on the catalog entries in question and can be adapted to fit the needs of the adopting institution. Possible curation avenues include unifying institution names (ex: University of Alabama, Birmingham to University of Alabama at Birmingham), expanding abbreviations (ex: UAB to University of Alabama at Birmingham), and removing extraneous details (ex: California Digital Library, Oakland, United States of America to California Digital Library). The goal of this curation is to optimize the metadata for consistency and future searchability. The other aspect of curation involves ensuring that all of the datasets are appropriate for inclusion in the RDC, following the criteria outlined previously. Some of this is done automatically by the Python code, by targeting the API search at “datasets” and by removing any Figshare records which have been submitted directly by publishers, as those datasets have insufficient metadata for reuse and often contain just the tables and figures of a publication. In the manual curation stage, any records which may not satisfy our requirements for inclusion in the RDC can be investigated and removed if necessary.
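A minimal sketch of this harmonization, using a substitution table built from the examples above. A production version might match whole fields rather than substrings to avoid accidental replacements.

```python
# Substitution table mapping observed variants to preferred forms
# (entries taken from the examples in the text).
HARMONIZE = {
    "University of Alabama, Birmingham": "University of Alabama at Birmingham",
    "UAB": "University of Alabama at Birmingham",
    "California Digital Library, Oakland, United States of America":
        "California Digital Library",
}

def harmonize(value):
    """Replace known variants in a metadata string with preferred forms,
    longest variant first so short abbreviations cannot clobber longer matches."""
    for variant, preferred in sorted(HARMONIZE.items(), key=lambda kv: -len(kv[0])):
        value = value.replace(variant, preferred)
    return value
```

Applied across the affiliation and publisher columns, this keeps the metadata consistent and searchable.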
The enhancement stage provides an opportunity for information professionals to supplement the metadata provided by the initial depositor. Specific enhancements may vary based on institution, but could include adding keywords for increased findability, applying the Digital Commons controlled vocabulary to manually enter the Subject Area metadata field, adding ORCIDs or grant numbers which were not included in the original metadata, or adding institution-specific information such as departments or research groups. Time saved during the initial ingestion can be spent curating and enhancing records.
The finalized output spreadsheet is uploaded to the RDC through the Digital Commons batch upload process. The batch feature can also be used to edit existing catalog entries en masse.
Development of custom records for datasets
The RDC also indexes datasets generated from UAB research projects which are not deposited into any repository, including the REasons for Geographic and Racial Differences in Stroke (REGARDS) program (Howard et al. 2005), the National Spinal Cord Injury Statistical Center (NSCISC) (DeVivo, Go, and Jackson 2002), and the Center for AIDS Research (CFAR) public health data (Kitahata et al. 2008). These studies have produced large, high-quality datasets that have supported thousands of publications. The datasets are not publicly available and are otherwise discoverable only through university websites or the scholarly literature. To promote the visibility and reuse of these valuable datasets, we worked with the project leaders to develop entries for their programs in the RDC. The entries include instructions and contact information for requesting access to the data.
Workarounds for limitations of Digital Commons
In many ways, the Digital Commons institutional repository has served as an ideal platform for the RDC. However, there are several features that limit its performance. Unlike some dedicated data catalog software, Digital Commons is not fully customizable. For example, it does not have the option to filter records. Searches for datasets are possible through text input in the basic or advanced search interfaces, or by manually scrolling through entries. Conscious of this drawback, we were deliberate when assigning which metadata fields were searchable to improve data discovery (see Figure 1). Additionally, Digital Commons only permits 50 authors in its Author field (and only 33 authors for bulk uploads). While this is suitable for most datasets, we have encountered several datasets from international research collaborations with hundreds of authors. Currently, there is no way to faithfully represent these entries in the RDC. For the time being, our code flags and records the DOIs of such datasets when they are found by the API but does not add them to the spreadsheet for uploading.
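The flagging behavior described above might be sketched as follows, assuming each harvested record is a dictionary with `doi` and `authors` keys (an assumed shape, not the actual code).

```python
AUTHOR_LIMIT = 33  # Digital Commons bulk-upload author limit noted above

def partition_by_author_count(records, limit=AUTHOR_LIMIT):
    """Split records into uploadable ones and flagged DOIs whose author
    lists exceed the platform limit."""
    uploadable, flagged_dois = [], []
    for rec in records:
        if len(rec["authors"]) > limit:
            flagged_dois.append(rec["doi"])  # record for later, skip upload
        else:
            uploadable.append(rec)
    return uploadable, flagged_dois

# Illustrative records: one large collaboration, one small team.
records = [
    {"doi": "10.0001/big-collab", "authors": ["Author"] * 200},
    {"doi": "10.0001/small-team", "authors": ["Doe, Jane", "Roe, Rick"]},
]
uploadable, flagged_dois = partition_by_author_count(records)
```

The flagged DOIs can then be revisited if the platform's author limit is ever raised.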
Furthermore, the Author field within Digital Commons permits only one author affiliation. It is common for authors of datasets to list multiple affiliations, and including only a single affiliation could reduce the findability of an entry, as well as compromise its accuracy. Our current solution is to use the first affiliation listed, prioritizing a UAB affiliation if applicable. This is done during the manual curation stage. Currently, there is no direct ORCID integration, which would be an ideal feature for a research data catalog. Our current method is to provide hyperlinks to the authors' ORCID profiles, when available.
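The affiliation-selection rule can be expressed as a small helper; the function name and list input are our own illustration of the manual-curation rule.

```python
UAB_NAME = "University of Alabama at Birmingham"

def pick_affiliation(affiliations):
    """Return a single affiliation for the Digital Commons Author field:
    the UAB affiliation if one is listed, otherwise the first affiliation."""
    if not affiliations:
        return ""
    for aff in affiliations:
        if UAB_NAME in aff:
            return aff
    return affiliations[0]
```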
Lastly, Digital Commons provides user affiliation information for downloads, but that information is unavailable for catalog record page accesses or external link uses. It does, however, provide total numbers of accesses per page over time. In our view, these drawbacks do not outweigh the benefit of utilizing our existing institutional repository as a data catalog platform.
Current status of the dataset catalog
Over the course of one year, we developed the workflow to populate the catalog, created the catalog infrastructure, and seeded the catalog with records from Zenodo and DataCite, as well as custom records. We are now maintaining the catalog with monthly ingestions. As of the publication of this article, the RDC has 280 records (43 from Dryad, 113 from Zenodo, 117 from Figshare, and 6 custom records). In the ten months since its launch, the catalog has averaged 128 visits per month, with between 36 and 85 individual records viewed each month.
Adapting the RDC Workflow to Your Institution
Institutional preparations
Promotion and adoption of the Research Data Catalog was made easier because the program was strongly supported by an institution-wide research strategic initiative and Working Group. Gaining institutional backing is an important factor when introducing an institution-wide resource. If you are preparing to adopt the UAB model at your own institution, the first step is to coordinate with stakeholders and prospective users within your institution. Plan and initiate outreach to relevant units across your campus community, including researchers, administrators, educators, and institutional leadership. To promote the RDC, we published articles in multiple campus newsletters, presented to upper-level research administration, and hosted an institution-wide webinar. Table 1 outlines specific goals for communicating with the various stakeholder groups on campus. This outreach, and the dialogue it generates, may help you locate datasets that should be included, like the UAB custom records, and adapt the metadata fields to best serve your users. Lastly, for effective adoption of the resource, promotion should be integrated into other services across the research cycle, such as research consultations and data management training.
Table 1: Outreach goals for various stakeholder groups.
| Stakeholder Group | Outreach Goal(s) |
|---|---|
| Researchers (students, staff and faculty) | |
| Administrators (grant office, research integrity office, or similar) | |
| Institutional Leadership | |
| Educators | |
| Other institutional information professionals (library or otherwise) | |
Library preparations
In conjunction with institution-level preparation, you will need to assemble a data catalog team. First, assess the availability and readiness of the staff who will work on this project. Our team consisted of a coordinator familiar with the research data corpus at the institution, a repository expert, and a developer with coding experience in Python. Ensure that your team members have enough experience with coding and the chosen institutional repository platform to be comfortable following this workflow. Note that Digital Commons itself is not a prerequisite: many other institutional repositories with a spreadsheet-based batch upload feature can serve as a dataset catalog platform. If you are using a non-Digital Commons product, you will need to assess the metadata fields and user interface, and you may need to adjust the Python code to ensure compatibility with your repository's batch upload requirements. If you are using Digital Commons, the UAB RDC's settings can be replicated exactly in your Digital Commons instance. We estimate that following this protocol requires approximately one week to set up the collection and one to two hours per month for maintenance.
Code-level considerations
As a small team, we were determined to develop a resource which could be maintained with limited personnel, and with code that could be easily adapted to a dataset catalog at a different institution. The Python code used to harvest and process records via the Zenodo and DataCite APIs, and to flag duplicate records, is provided in Jupyter Notebook (Kluyver et al. 2016) format on the project's GitHub page, with a static version preserved in the Zenodo repository (Warner 2024, Warner 2025). The Jupyter Notebook format allows for interactive code execution and inspection of the results at various stages of data manipulation. It also enabled us to integrate extensive explanation and documentation directly with the code, making it more accessible to non-expert Python users. Those interested in adopting this workflow should speak to their institutional repository provider to replicate our catalog parameters and revise a copy of the code to update the API search parameters with their institutional affiliation(s).
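Adapting the search is largely a matter of changing the affiliation string in the API query. As a hedged sketch, a DataCite request URL might be built as below; the parameter names follow the public DataCite REST API, but adopters should verify them against its current documentation before relying on them.

```python
from urllib.parse import urlencode

# Replace with your own institution's name (and any known variants).
AFFILIATION = "University of Alabama at Birmingham"

# Query the DataCite REST API for datasets whose creators list the affiliation.
params = {
    "query": f'creators.affiliation.name:"{AFFILIATION}"',
    "resource-type-id": "dataset",  # restrict results to datasets
    "page[size]": 100,              # records per page of results
}
url = "https://api.datacite.org/dois?" + urlencode(params)

# The URL would then be fetched, e.g. with requests.get(url).json(),
# and the response flattened as described earlier in the workflow.
```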
Conclusion
At its core, the purpose of a data catalog is to improve data discovery. This task requires a confluence of dedicated information professionals, institution-specific knowledge, tools that improve the efficiency of routine tasks, and strategies to enhance metadata records. Our approach leveraged the institutional repository platform we had in place and populated it using an API-assisted process to help realize the full potential of an institutional data catalog.
The UAB RDC is an ongoing and evolving project. We plan to develop code to access more datasets, including from Dryad and the Inter-university Consortium for Political and Social Research (ICPSR). On campus, we will continue to solicit feedback and pursue engagement with users and other institutional stakeholders. We now advertise a self-submission form for researchers to submit their own datasets to the RDC, and we have begun to incorporate the cataloging of datasets into the standard research data management consultation processes the library conducts with researchers. We will use input from other institutions that adopt our methods and workflows, as well as feedback from users, to improve our API search strategies, record processing and curation, and catalog interface.
References
Albertoni, Riccardo, David Browning, Simon J D Cox, Alejandra Gonzalez Beltran, Andrea Perego, and Peter Winstanley. 2024. Data Catalog Vocabulary (DCAT) - Version 3. https://www.w3.org/TR/2024/REC-vocab-dcat-3-20240822.
DataCite Metadata Working Group. 2024. DataCite Metadata Schema for the Publication and Citation of Research Data and Other Research Outputs. Version 4.6. https://doi.org/10.14454/mzv1-5b55.
DDI Alliance. 2012. DDI Codebook 2.5. https://ddialliance.org/ddi-codebook_v2.5.
DeVivo, Michael J., Bette K. Go, and Amie B. Jackson. 2002. "Overview of the national spinal cord injury statistical center database." The Journal of Spinal Cord Medicine 25 (4): 335-358. https://doi.org/10.1080/10790268.2002.11753637.
Howard Hughes Medical Institute. 2025. "Sharing Published Materials/Responsibilities of HHMI Authors." https://hhmicdn.blob.core.windows.net/policies/Sharing-Published-Materials-Responsibilities-of-HHMI-Authors.pdf.
Howard, Virginia J., Mary Cushman, LeaVonne Pulley, Camilo R. Gomez, Rodney C. Go, Ronald J. Prineas, Andra Graham, Claudia S. Moy, and George Howard. 2005. "The reasons for geographic and racial differences in stroke study: objectives and design." Neuroepidemiology 25 (3): 135-143. https://doi.org/10.1159/000086678.
Indiana University Center for Postsecondary Research. 2021. The Carnegie Classification of Institutions of Higher Education. https://carnegieclassifications.acenet.edu.
Johnston, Lisa R., Alicia Hofelich Mohr, Joel Herndon, Shawna Taylor, Jake R. Carlson, Lizhao Ge, Jennifer Moore, Jonathan Petters, Wendy Kozlowski, and Cynthia Hudson Vitale. 2024. "Seek and you may (not) find: A multi-institutional analysis of where research data are shared." PLOS ONE 19 (4): e0302426. https://doi.org/10.1371/journal.pone.0302426.
Kaiser, Jocelyn, and Jeffrey Brainard. 2023. "Ready, set, share!" Science 379 (6630): 322-325. https://doi.org/10.1126/science.adg8142.
Kitahata, Mari M., Benigno Rodriguez, Richard Haubrich, Stephen Boswell, W. Christopher Mathews, Michael M. Lederman, William B. Lober, Stephen E. Van Rompaey, Heidi M. Crane, Richard D. Moore, Michael Bertram, James O. Kahn, and Michael S. Saag. 2008. "Cohort profile: the Centers for AIDS Research Network of Integrated Clinical Systems." International Journal of Epidemiology 37 (5): 948-955. https://doi.org/10.1093/ije/dym231.
Kluyver, Thomas, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Jason Grout, Sylvain Corlay, Paul Ivanov, Damián Avila, Safia Abdalla, Carol Willing, and Jupyter Development Team. 2016. "Jupyter Notebooks - a publishing format for reproducible computational workflows." In Positioning and Power in Academic Publishing: Players, Agents and Agendas. IOS Press. https://doi.org/10.3233/978-1-61499-649-1-87.
Mannheimer, Sara, Jason Clark, Kyle Hagerman, Jakob Schultz, and James Espeland. 2021. "Dataset Search: A lightweight, community-built tool to support research data discovery." Journal of eScience Librarianship 10 (1): e1189. https://doi.org/10.7191/jeslib.2021.1189.
Memorial Sloan Kettering. 2024. MSK Data Catalog. https://datacatalog.mskcc.org.
National Institutes of Health. 2020. NOT-OD-21-013 Final NIH Policy for Data Management and Sharing. https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html.
National Library of Medicine. 2024. National Library of Medicine Dataset Catalog (Beta version). https://datasetcatalog.nlm.nih.gov.
Nelson, Alondra. 2022. Ensuring Free, Immediate, and Equitable Access to Federally Funded Research. Edited by Office of Science and Technology Policy. https://bidenwhitehouse.archives.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf.
Read, Kevin. 2015. Common Metadata Elements for Cataloging Biomedical Datasets. Figshare. https://doi.org/10.6084/m9.figshare.1496573.
Sheridan, Helenmary, Anthony J. Dellureficio, Melissa A. Ratajeski, Sara Mannheimer, and Terrie R. Wheeler. 2021. "Data Curation through Catalogs: A Repository-Independent Model for Data Discovery." Journal of eScience Librarianship 10 (3): e1203. https://doi.org/10.7191/jeslib.2021.1203.
Tedersoo, Leho, Rainer Küngas, Ester Oras, Kajar Köster, Helen Eenmaa, Äli Leijen, Margus Pedaste, Marju Raju, Anastasiya Astapova, Heli Lukner, Karin Kogermann, and Tuul Sepp. 2021. "Data sharing practices and data availability upon request differ across scientific disciplines." Scientific Data 8 (1): 192. https://doi.org/10.1038/s41597-021-00981-0.
UAB. 2023. "Research Strategic Initiative." https://www.uab.edu/research1b.
UAB. 2024a. 2024 UAB Headcount Enrollment Report. https://www.uab.edu/institutionaleffectiveness/images/documents/headcount/2024-Fall-Headcount-Enrollment.pdf.
UAB. 2024b. "UAB Research Data Catalog." https://digitalcommons.library.uab.edu/datasets.
UAB Office for Finance and Administration. 2024. 2024 UAB Financial Report. University of Alabama at Birmingham. https://www.uab.edu/financialaffairs/images/documents/reporting/2024_financial_report.pdf.
Warner, Claire. 2024. UAB dataset catalog GitHub repository. GitHub. https://github.com/markuslibrary/uab-dataset-catalog.
Warner, Claire. 2025. markuslibrary/uab-dataset-catalog: Dataset Catalog Code 1.0.0. Zenodo. https://doi.org/10.5281/zenodo.17573142.
Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, et al. 2016. "The FAIR Guiding Principles for scientific data management and stewardship." Scientific Data 3: 160018. https://doi.org/10.1038/sdata.2016.18.
Yee, Michelle, Alisa Surkis, Ian Lamb, and Nicole Contaxis. 2023. "The NYU Data Catalog: a modular, flexible infrastructure for data discovery." Journal of the American Medical Informatics Association 30 (10): 1693-1700. https://doi.org/10.1093/jamia/ocad125.