Journal of eScience Librarianship
Building an eScience Thesaurus for Librarians: A Collaboration Between the National Network of Libraries of Medicine, New England Region and an Associate Fellow at the National Library of Medicine

Objective: In response to the growing interest in and adoption of eScience roles by librarians, librarians from the National Network of Libraries of Medicine, New England Region (NN/LM NER) and an Associate Fellow from the National Library of Medicine collaborated to build an eScience Thesaurus. The Thesaurus will introduce librarians to terminology and concepts in eScience, point to relevant literature and resources on data and digital research topics, and provide links to interviews with librarians and experts working in eScience-related roles. The eScience Thesaurus is a starting place for librarians to find the vocabulary to research the background, resources, and tools necessary for developing their capacity to provide eScience-related services.
Methods: The Associate Fellow completed a review of eScience-related literature to identify the seminal publications for the origins of terms and concepts as they apply to libraries. Next, the Associate Fellow worked with the NN/LM NER to compile an environmental scan of resources that would be useful and applicable for librarians, and created a scope document and record structure. The team interviewed prominent librarians working in eScience roles and experts who have created digital tools and services used by the library community. Finally, the team sent the Thesaurus records out to five members of the advisory and editorial review boards from the eScience Portal for New England Librarians for evaluation.
Results: The eScience Thesaurus is now hosted on the eScience Portal for New England Librarians’ website. It provides a comprehensive list of more than 50 different terminologies and concepts, with links to seminal and relevant literature, resources, grants, and interviews on a variety of eScience-related topics.
Conclusion: The eScience Thesaurus is an evolving resource; as the field expands and more eScience-related terms are adopted by the library and information science community, the Portal will enable its users to electronically submit new vocabulary and records to the Thesaurus, thus preserving it as a go-to eScience resource for librarians.


Introduction
The National Science Foundation (NSF) (National Science Foundation 2011), National Institutes of Health (NIH) (National Institutes of Health 2012), and the Office of Science and Technology Policy (OSTP) (Holdren 2013) have called upon researchers to manage and share their data. The federal-level prioritization of cyberinfrastructure, data management, digital curation, and sharing data has forged new pathways for librarians to utilize their skills in information management, metadata description, and knowledge of publishing practices in the context of scientific data. EScience (data-intensive, collaborative, and digital research) has been growing rapidly; it now bridges researchers from the sciences, social sciences, and humanities. In response, there have been several prominent publications by the Association of Research Libraries (ARL) calling for libraries and librarians to explore how to engage researchers and participate in the digital research enterprise (eScience Portal for New England Librarians 2013). At the same time, library science scholars have been calling upon the library community to "upskill" (Cox, Verbaan and Sen 2012) and "re-skill" (Auckland 2012) to provide relevant services to these researchers. However, many librarians have found it difficult to stay informed about eScience concepts, the types of roles they can play in these areas, what skills they need, and what this means for their current position (Alvaro et al. 2011; Hernon and Schwartz 2007). At the regional level, in 2009 the National Network of Libraries of Medicine, New England Region (NN/LM NER) began receiving requests from its regional members for professional development opportunities and guidance on eScience, particularly on how to apply these new concepts to their professional practice and how to develop eScience-related library services (Creamer et al. 2012). In 2010, NN/LM NER Director Elaine Martin spearheaded the creation of the eScience Portal to support her region's members. One of her early proposals was for the creation of a controlled vocabulary for eScience that could enhance searching and organizing information on the Portal (Martin 2010). Based on this perceived interest and the documented need and desire for librarians to learn more about eScience and its many facets, the NN/LM NER and an Associate Fellow at the National Library of Medicine (NLM) who had experience working on several projects compiling data-related bibliographies began a collaboration to create an eScience Thesaurus. This resource would provide librarians with information about the concepts from both a theoretical and practical perspective.
As librarians have engaged more with research data in terms of building digital collections and teaching data management, terms from computer science and data science have been adopted by the library science community. The Thesaurus represents a starting point for librarians lacking fluency in these terms to research their original contexts, see how they relate to existing taxonomies, understand how they have been "co-opted" (Whitmire 2013a) by librarians, and learn how they are now being applied in an information-specific context. For example, related terms like data management, data curation, and digital curation have distinct meanings and uses within and outside an information context.
The Thesaurus serves to capture the point where its vocabulary terms diverge from other domains into library science. This will provide librarians who have a cursory understanding of these terms with an opportunity to follow the terms' evolution and master how they are being used within and outside the library and information context.
The partnering of an Associate Fellow with a Regional Medical Library is the first project of its kind taken on by the NLM. This effort provides an opportunity for the eScience Thesaurus to be promoted at a national level via the National Network of Libraries of Medicine, whose goal is to make this resource available to librarians in all eight of its regions. The Thesaurus will take shape in the form of eScience vocabulary terms accompanied by a definition, links to related literature and web resources, grant information, and transcripts from interviews with librarians currently working in eScience roles and experts who have created digital resources and tools being used by librarians to provide researcher services.
To provide a context and scope for this project, the Associate Fellow researched definitions of eScience to design a framework for the creation of the Thesaurus. Definitions of eScience vary widely; Youngseek et al. (2011) provided a definition that summarizes one key characteristic of eScience: it describes collaborative and digital research that exploits "the availability of large datasets and the capability of productively sharing these datasets among international teams of researchers." Similarly, Richard Luce (2012) highlighted how eScience functions from the perspective of a scientist or researcher: "eScience is typically conducted by a multidisciplinary team working on problems that have only become solvable in recent years with improved data collection and data analysis capabilities." On the broadest level, eScience provides opportunities for researchers and scientists to digitally manipulate and exploit "Big" and "Small" data; share their data; discuss new avenues for research and discovery; and work together to produce better results. A wide network of activities, including everything from data management to policy development, comprises eScience. This definition of eScience (a collaborative, broad network of digital research activities) has served as our theoretical foundation for selecting the terminology and concepts included in the Thesaurus.
Keeping the above definition of eScience in mind, this paper will address the process taken to create the eScience Thesaurus for librarians. There are new roles for librarians to play in a collaborative, scientific, data-driven research environment, and a learning tool like the Thesaurus will provide them with a starting place to explore the background and application of emerging terms and concepts. It will point users to literature, resources, and tools needed to build and provide eScience-related services, and inform and improve their professional practice.

Methods
The Associate Fellow worked with the NN/LM NER librarians to develop an outline, an editorial workflow, and a scope and record structure for the Thesaurus. The Associate Fellow conducted a literature review of eScience articles in library, technology, and translational science journals; conducted an environmental scan of digital resources related to eScience that are being adopted and used by librarians; reviewed several eScience-related grants from the Institute of Museum and Library Services (IMLS); and conducted interviews, by email and in person, with prominent librarians and experts working in the field of eScience. Lastly, the Associate Fellow and NN/LM NER librarians divided the records into five sections and sent these to five members of the Portal's advisory and editorial boards for peer review, to ensure the records' adherence to the scope and relevance to the Portal community.
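As an illustration only, the record structure developed for the Thesaurus (detailed under Developing the Record Structure) could be modeled roughly as follows. This is a minimal sketch, not the Portal's actual implementation: the class name `ThesaurusRecord`, the field names, the `has_interview` helper, and the placeholder values are all assumptions chosen to mirror the record sections named in this paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ThesaurusRecord:
    """Hypothetical model of one eScience Thesaurus record."""

    term: str        # "Term/Concept" section
    definition: str  # "Definition" section
    related_literature: List[str] = field(default_factory=list)
    related_resources: List[str] = field(default_factory=list)
    see_also: List[str] = field(default_factory=list)
    # "In their own words" link; present only when an interview relates to the term
    interview_url: Optional[str] = None

    def has_interview(self) -> bool:
        """Return True when the record links to an interview transcript."""
        return self.interview_url is not None


# Placeholder values for illustration; only the term and URL appear in the paper.
record = ThesaurusRecord(
    term="Altmetrics",
    definition="(definition text goes here)",
    interview_url="http://esciencelibrary.umassmed.edu/thesaurus/interview_priem",
)
```

Keeping the interview link optional reflects the rule that the interview section is added to a record only when a compiled interview relates to the term.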

Literature Review
Scholarly literature about eScience is spread widely across a number of different sources such as journals, blogs, reports, and white papers. To complete a literature review that would gather the highest-quality literature about these topics, the Associate Fellow looked at each of these potential avenues. The first step for his literature review was to identify a) literature databases geared towards librarians and b) specific journals that focused on library and eScience-related issues. Based on the criteria above, he searched for literature within the following sources:
• Journal of the American Medical Informatics Association
• Journal of the American Society for Information Science and Technology
• Journal of the Medical Library Association
• Library and Information Science Abstracts (LISA)
• Library, Information Science and Technology Abstracts (LISTA)
The Associate Fellow also searched Google Scholar and created Google Alerts to ensure that all literature covering eScience or any other data-related services in the realm of libraries would be accounted for. This provided the Thesaurus team with an opportunity to stay abreast of new developments in the literature, and to retrieve the most current information available. He searched the journals and resources using the following strategy:
[Search Strategy: eScience for Librarians (search terms not recovered in this copy)]
The Associate Fellow eliminated the library-focused portion of this search when few results were retrieved in any of the databases; this measure provided a wealth of relevant literature, as often libraries were not mentioned within articles about data-related activities, even though the content was directly applicable to the role of a librarian. The Associate Fellow reviewed results from the literature searches to isolate seminal articles about concepts in eScience and data-related library topics and to identify key scholars in this area of study. He also utilized citation chaining as another step towards identifying key articles that discussed eScience activities. To validate his search results, the Associate Fellow compared the articles selected for the literature review against established bibliographies (Szigeti & Wheeler 2011; Westra et al. 2010) and two ongoing efforts to compile a comprehensive bibliography for librarians on data-related topics. The first resource he used was the Data Management for Librarians Mendeley® group (Data Management for Librarians 2012). The Associate Fellow compared the literature review results to the Mendeley® group bibliography to fill in any gaps in the literature. He then compared the results to Charles W. Bailey Jr.'s Digital Curation Bibliography (Bailey Jr. 2012). Bailey Jr.'s bibliography offers 650 English-language articles, books, and technical reports on a variety of data-related topics such as data curation, management, and preservation. This resource was instrumental in ensuring that the Associate Fellow retrieved the most relevant results possible for the purpose of informing librarians about eScience.
The Associate Fellow compiled all citations using reference management software, and he read and tagged each article to describe the most prominent topics discussed; these emergent tags became the vocabulary for the Thesaurus. Articles were tagged based on the keywords provided by the journal or author, and coded according to the most prominent topics that were discussed. Coding the articles served the purpose of ensuring that the provided keywords from the journal or author were accurate. For example, many articles used "data curation" or "data management" as keywords because they represent the broadest topic being discussed, but coding provided an opportunity to tag articles on a more granular level where the major topics of the articles were actually related to aspects of data curation and management such as education, preservation, and implementation. Once all of the articles were read and tagged, he grouped them into categories based on their major topics; these categories represented the term or concept that would be described in the Thesaurus (see Terms).
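The tagging and grouping steps described above can be sketched in a few lines. This is an illustrative sketch only, not the Associate Fellow's actual workflow (which used reference management software): the function names `group_by_tag` and `missing_from`, the article titles, and the tags are all hypothetical, and the sketch assumes an article may carry more than one tag.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_tag(articles: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each tag to the sorted list of article titles that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for title, tags in articles.items():
        for tag in tags:
            groups[tag].append(title)
    # Sort tags and titles so the resulting term list is alphabetical.
    return {tag: sorted(titles) for tag, titles in sorted(groups.items())}


def missing_from(found: List[str], bibliography: List[str]) -> List[str]:
    """Titles listed in an established bibliography that the search did not find."""
    return sorted(set(bibliography) - set(found))


# Hypothetical tagged articles (titles and tags are placeholders).
tagged = {
    "Article A": ["data curation", "education"],
    "Article B": ["data curation", "preservation"],
    "Article C": ["altmetrics"],
}
categories = group_by_tag(tagged)
gaps = missing_from(list(tagged), ["Article B", "Article D"])
```

Here `categories` plays the role of the emergent vocabulary (one entry per candidate term), and `gaps` corresponds to the step of comparing search results against an existing bibliography to fill in missing literature.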
Comparing the Associate Fellow's initial results (156 articles) with the two bibliographies provided a strong collection of formative literature on the topic of eScience (187 articles). The Associate Fellow and NN/LM NER librarians curated from the resulting articles those that were considered to be seminal and those that prioritized health sciences literature and applied contexts when possible to support the NN/LM community looking for a place to gain a foothold in eScience. The terminology and concepts that emerged from the literature covered a wide range of topics from altmetrics to linked data, thus showing the complexity and breadth of eScience-related tools and activities. The complete list of terms derived from the literature is provided below.

Terms
• Altmetrics
• Cyberinfrastructure
• Data
• Data Citation
• Data Curation
• Data Curation Profiles Toolkit
• Data Dictionary
• Data Interview
• Data Lifecycle
• Data Literacy
• Data Management Checklist
• Data Management Plan
• Data Management Policy
• Data Migration
• Data Preservation
• Data Privacy
• Data Provenance
• Data Publication
• Data Repository
• Data Reuse
• Data Science
• Data Security
• Data Sharing
• Data Transformation
• Data Visualization
• DataCite
• Data Set
• Digital Curation
• Digital Curation Lifecycle Model
• Data Management Planning Tool (DMPTool)
• Electronic Lab Notebook (ELN)
• EScience
• Fourth Paradigm
• Graduate Student Lifecycle
• Informationist
• Linked Data
• MANTRA Data Management Course
• OpenDOAR
• Research Data Management
• Research Lifecycle

eScience Resources-An Environmental Scan
Providing access to eScience and data-related resources was another essential feature of the Thesaurus. Taking the time to retrieve the most relevant literature on these topics was paramount, but finding resources that librarians can use to implement eScience-related services was equally important. To gather useful resources for librarians, the Associate Fellow reviewed resources and subject guides from libraries that had published on their work building eScience capacity.
The library subject guides reviewed are as follows:
• California Digital Library
• Massachusetts Institute of Technology

Conducting Interviews with Librarians Working in eScience Roles "In Their Own Words"
Some of the eScience Thesaurus records include interviews with prominent librarians and experts working in eScience and data-related roles. The goal of this effort is to provide guidance and direction to librarians on how to use a tool or implement a service by hearing the information from a colleague firsthand. Using the literature and the NN/LM NER's knowledge of institutions working in eScience areas, the Associate Fellow and NN/LM NER librarians identified a select group of librarians and experts. The Associate Fellow completed two interviews in person and the rest were completed via email.
The Associate Fellow worked with the NN/LM NER librarians to design the interview questions. The aim was to encourage participants to describe what they do with eScience, how they garnered interest from their institution's research community to support the library's role in helping with eScience and data-related activities, and what skill sets they have that allow them to work in these roles. Non-librarians were asked to describe their service or tool, how they work with libraries, and how librarians use their tools.

Interview
[Interview question table not recovered in this copy; one partial question survives:] "...ment and can you provide examples of the skills and services that you or your other staff members have in this particular area?"
The interview question portion of the Thesaurus is titled "In their own words" and is located alongside any Thesaurus record whose term relates to the expertise of a librarian or expert whom the Associate Fellow interviewed (http://esciencelibrary.umassmed.edu/thesaurus/interview_priem). For example, in the Thesaurus record for 'Altmetrics,' the Associate Fellow's interview with Impact Story's Jason Priem is embedded within the record to provide readers with a link to an interview with an expert working in that particular field. Jason discusses specifically how libraries and librarians can apply altmetrics to their researcher services and how they can use Impact Story as a resource. This addition to the Thesaurus provides applied and real-world examples of librarians putting theory into practice, and offers a unique and additional level of information for Thesaurus users.

Developing the Record Structure
After compiling all of the literature, resources, and interviews that were applicable to activities and terminology related to eScience, the Associate Fellow worked with the NN/LM NER librarians to develop a structured Thesaurus record, which is outlined below:

Term/Concept
This is the name of the term or concept being described.

Definition
This is a clear and concise definition of the term or concept. The NN/LM NER librarians curated and reviewed the definitions, which the Associate Fellow developed by reviewing the literature and endeavoring to provide an accurate and thorough description emphasizing applications that would be geared specifically towards librarians new to eScience. If the Associate Fellow was describing a term that was a resource or tool, he took the definition from its website, and added library-related contextual information from the literature to make it applicable to librarians.

Related Literature
This is a list of literature that pertains to the term or concept being defined. The publications included in this section address putting the theory of the term or concept into practice, and serve to further inform the user about a topic. The record only includes links to the literature if they are freely available.

Related Resources
This is a list of web resources that pertain to the term or concept being defined. The links in this section provide resources that a librarian can use to implement the term or concept being defined.

See Also
This part of the record provides links out to other related terms or concepts that are described within the eScience Thesaurus.

Interview-"In their own words"
This is the link to the transcripts of librarians and experts answering interview questions that relate to the term or concept in the Thesaurus record. This section is only added to the record if the interviews compiled relate to the term or concept being described.

Peer Review
After the Associate Fellow created the first iteration of the records, the NN/LM NER librarians reviewed the Thesaurus. The Associate Fellow implemented changes and then he and the NN/LM NER librarians divided up the eScience Thesaurus into five sections. They sent one section to each of five members of the eScience Portal for New England Librarians' advisory and editorial boards. They selected board members who worked at libraries offering eScience and data-related library services. Once the reviews were returned, the Associate Fellow made the appropriate changes, and passed the records along to the eScience Portal web developer for implementation.

Results
At present, the eScience Thesaurus is hosted on the eScience Portal for New England Librarians. This website is hosted by the NN/LM NER, and currently provides a wealth of information about eScience, data management, digital scholarship, and data literacy. The Thesaurus is embedded into the homepage of the eScience Portal for maximum exposure. The Associate Fellow unveiled the resource in a #medlibs Twitter chat on August 22, 2013, where an entire hour was devoted to its different components.
The Medical Library Association (MLA) provided some chat participants with Continuing Education (CE) credit for attending the Twitter chat based on their Summary Recommendations for Action of the Competencies for Lifelong Learning and Professional Success for health sciences librarians (MLA 2013). This chat was one of five #medlibs-hosted Twitter chats about eScience that were required for participants to receive credit (#medlibs chat 2013).

Navigating the Thesaurus
The Thesaurus has a landing page with a search bar and a hyperlinked alphabet with direct links to terminology and concepts that begin with that letter running across the top of the screen (Figures 1 and 2).
Providing an alphabetical list of each term or concept will afford users an opportunity to easily retrieve results. The search bar will remain a constant feature throughout each page within the Thesaurus, so that a user can search the full text at any time.
The visual representation of a Thesaurus record follows the structure outlined in the Methods section of the paper, providing a definition of the term or concept; links to related literature, resources, and grants; and a See Also section that points a user to related terminology within the Thesaurus (Figure 3). Notice in Figure 3 the "In their own words" button in purple beside the Thesaurus term; this feature links the user to an interview with a librarian working in this particular area of eScience. The interview feature provides users with real-world context for a term or concept in the Thesaurus by hearing directly from a librarian about how they are putting theory into practice.

Discussion
The eScience Thesaurus was born from concrete connections among the NN/LM NER and the wider library community. The growing vocabulary linked to data-driven science created a need to provide librarians with consistent definitions to align the theoretical understanding of these terms from the perspective of traditional science with the applicable use by librarians in a research and information context.
For example, health sciences librarians investigating eScience and data-related services have contacted the NN/LM NER to inquire about the meanings of particular eScience terms or concepts and how these were being used in an information context or health sciences context. In a forthcoming chapter on research data management and the health sciences librarian, the NN/LM NER authors identified a need for a Thesaurus to provide clear, consistent definitions of eScience terminology. Finally, as the NN/LM libraries have begun to offer more webinars and train-the-trainer classes on the topic of data management, data management plans, and data-driven research, the NN/LM NER identified the need for consistent definitions of this vocabulary for developing these educational materials. The eScience Thesaurus intends to fill this need from the library community with a comprehensive list of terms that are clearly defined, and examples of how this vocabulary can be implemented by librarians throughout the research process.
EScience requires library professionals to learn a new research methodology. Librarians wishing to "reskill" and "upskill" for data-intensive research need support and educational tools and professional development opportunities in order to feel comfortable and proficient providing data-specific services at their institution. Academic and research libraries are constantly evolving; eScience is providing new opportunities for librarians to meet the needs of patrons in the form of specialist research support services, training, and management for data-related activities (Corrall et al. 2012; Covert-Vail and Collard 2012), all of which are becoming vital to preserving the profession. The eScience Thesaurus is designed to help ease this anxiety and serve as a resource that offers high quality and authoritative material that will assist librarians in adapting to these new roles. Cox and Pinfield (2013) state that libraries are currently offering limited research data management services and that there are significant challenges in terms of skill gaps and cultural change within institutions. This challenge is to be expected when information professionals are faced with unfamiliar disciplines where new concepts or terminologies are being used. Stanton et al. (2011) confirm findings from Cox and stress that librarians need to develop a range of capabilities before they venture into the realm of providing eScience and data-related services to their patrons. Stanton et al. emphasize the importance of acquiring the background knowledge, skills, classroom experience, and familiarity with new tools in order to manage the complexities that data can present. Finally, Lyon (2012) specifies the need for librarians to understand research workflows and technical standards as they relate to specific scientific domains. Developing these skills can be daunting for a librarian interested in learning about and contributing to eScience activities within their institution. The creation of the eScience Thesaurus will hopefully ease the burden on librarians seeking out reliable information sources to learn about activities and terminology related to eScience and data management.
Librarians with a cursory understanding of eScience may utilize the Thesaurus in a variety of ways; the Thesaurus provides a bridge for vocabulary terms that may have different meanings in the context of library science versus traditional scientific research. For example, the term "data life cycle" would resonate with researchers as the different steps they are required to take in order to collect, preserve, and provide access to their data. In the world of a librarian, the "data life cycle" refers to the different stages of the research process where the librarian can provide support and information to researchers to make their workflow more efficient. The Thesaurus affords librarians the opportunity to understand the divergence between how eScience terminology is interpreted within and outside a library context. Whitmire (2013b) stresses this divergence as a concern in her recent article, where as a transplant from traditional science to library science she raises a valid point that there is a distinct difference between a scientist's and a librarian's perception of data-driven research, and its accompanying terminology. For example, she points out that the term "eScience" is seldom used by researchers; rather it is predominantly used by the library community. The eScience Thesaurus aims to align these differences and provide librarians with a better overall understanding of how this terminology can be interpreted in the context of scientific research and subsequently applied in a library context.
While a venture into eScience and data-driven science can require a great deal of time and resources, it should not deter librarians from taking an interest in this particular area of study and providing data-related services. Dorothea Salo (2010) echoes this statement when she points to a number of areas of expertise a librarian already has that can be applied to the data challenge, including our strong understanding of metadata, digital preservation, public service, and technology. The areas that librarians are familiar with are directly applicable to describing and managing data as well as providing data education for researchers. P. Bryan Heidorn (2011) expands on this by pointing out a number of skills that librarians can use to meet this new role, such as the librarian's understanding of appraisal and selection of materials, classification schemas and vocabularies, and the ability to connect with research communities. Finally, Jake Carlson (2011) has shown how traditional tools such as the reference interview can be applied to the new context of data management. The foundation of knowledge in librarianship is directly applicable to providing eScience-related services. Librarians already have many of the tools to deal with data; they just need to apply these tools in new ways.

Conclusion
The goal of the eScience Thesaurus is to introduce librarians to new terminology and concepts in this emerging field, and to provide insights from librarians and experts (via interviews) working in eScience roles. It is an attempt to bridge the knowledge gap for librarians by providing an opportunity to learn about the skills and experience other librarians are currently using to put eScience into practice. It also attempts to support the health sciences librarian's transition to providing eScience and data-related services and to help him or her to implement these services in their own professional practice by providing clear definitions of eScience terms; linking to relevant and up-to-date literature; pointing to web resources; and linking to grants that demonstrate eScience in action. Finally, the eScience Thesaurus is geared towards providing high-quality, evidence-based information resources about eScience, and will continue to do so as new terminology, activities, and tools develop that will assist librarians in understanding and applying this new knowledge in their practice.

References
[…] Librarianship, 66 (2011): 1-16, http://dx.doi.org/10.5062/F46Q1V55