The eScience Program is moving into a new contract period with the NN/LM in which we will be focusing on new initiatives. However, we realize the eScience Portal for Librarians is a valuable resource and communication tool for all librarians, both in New England and beyond. We would like to continue to keep the portal fresh and updated with weekly content provided by community users, YOU!
If you would like to contribute to the eScience Portal Community Blog, please email Julie.Goldman@umassmed.edu with your name, contact information and topic! Julie will coordinate with you on your proposed topic and timeline for submission to the blog.
The following list includes suggestions for topics for the e-Science Community blog. Please note that this is by no means an exhaustive list! If you peruse the e-Science Community blog, you’ll see that the posts cover a very broad range of relevant topics.
- Reflections/reporting about a conference or a session within a conference
- Review of recent article, paper, book
- Data services at your library (e.g. the current services your library offers, description of data service workflows, lessons learned, planning data services)
- Findings from institutional environmental scan—what kind of research is being conducted at your workplace, how are your researchers storing their data, what data issues do they find challenging?
- Results of needs assessments
- Description of working partnerships between your library and other departments, such as research groups, IT, your institution’s Office of Research, IACUC/IRBs
- Is there not much new to write about regarding your own work/experiences or your library’s services? Conduct an informal interview with someone else— e.g. librarian, researcher, IT person—about an e-Science related topic and write about it.
- Connecting with researchers
- Scholarly Communication issues: Open Access, Open Data, funders and journal requirements for sharing or publishing data, policies, advocacy, altmetrics, trends (e.g. social media)
- A day in the life of a researcher—maybe not a day exactly, but a description of a researcher’s project, their background, the tasks they do in the course of the day, funding, reporting, publishing, tenure and promotion, differences between academic and industrial research
- Do you wear many hats at your library in addition to the data hat? How do you manage it?
- Information on tools and technologies
- Reflecting on science librarianship over the years—what’s changed and what’s stayed the same?
- Building data literacy skills: teaching data management best practices, observations of students/faculty/colleagues understanding of data management, pedagogical approaches, marketing library instruction
- Planning a data repository: needs assessments, software options, using in house expertise or outsourcing
- How to build data related competencies; experiences you’ve had taking courses, MOOCs, workshops, CE classes
- How are library schools preparing students?
- Professional development suggestions
- Data management in the context of non-STEM disciplines—what can STEM librarians learn?
- New findings in a STEM field
- Research methods: evaluating research methodologies, research design
- Issues related to protecting confidentiality with human research subjects
- Who owns the data? Discussion of IP rights at your institution
- Lab notebooks (paper vs. electronic formats)
- Managing the print and digital divide (e.g. paper lab notebooks, slides, specimens, field notes)
- Data cleaning
- Data mining
- Data carpentry or Software Carpentry
- Business models for data repositories
- Librarians working outside the library or in non-traditional roles—research informationists, librarians leaving the library to pursue other information science roles
If you have any questions or suggestions for the portal, please let us know.
We look forward to your submissions and continuing this collaborative, open project!
Submitted by Regina Raboin, Associate Director for Library Education and Research at the Lamar Soutter Library, University of Massachusetts Medical School. Contact her at Regina.Raboin@umassmed.edu or on Twitter @RegRab77
Research Data Access and Preservation (RDAP) Summit 2016, May 4-5, 2016 Atlanta, GA
This is my “RDAP 2016 Story and I’m Sticking to It!”
I ALWAYS look forward to attending RDAP; for my areas of responsibility and interest, it’s the most focused, important conference I can attend. The topics are always spot-on, the presentations thoughtful and timely, and throughout the conference the discussions and people are engaging, lively and stimulating. Each time I attend, I learn something new (and I’ll write more about this later in the post) and leave with at least one idea I’d like to implement at my institution. This year was no different.
RDAP 2016 had several foci: ‘Engaging Liaisons’, ‘Sustainability’, ‘Building the Research Data Community of Practice’, ‘Measuring Up: How Are We Defining Success for Research Data Services?’, ‘DMPs and Public Access: Agency and Data Service Experiences’, ‘Crowdsourcing Guidelines for a Successful Data Event: An Un(conference) Panel’. And slotted into the agenda were the poster sessions, lightning talks and roundtable discussions – it was a successful, albeit ambitious two days!
What I liked about all of the sessions (conference content is available online) is that you could be charged with building a new research data management service at your institution and the ‘How-To-Do-This-in-Four-Easy-Steps-Guide’ (well maybe Five!) was here for the taking! Like most services libraries offer, research data management is a complex entity – it involves advocacy, learning, teaching, networking, knowledge-building, management, collaboration, tenacity, vision – but most of all patience (have I used enough adjectives?).
What I’m trying to convey is that this conference, with its people, its content and its vision, enables attendees to return to their institutions with ideas, and perhaps a planned path towards implementing data management services. One such idea, like Ryan Clement’s at Middlebury of focusing on undergraduates, is to first assess the institution’s needs and then tailor the service to meet them.
Or is your service established, but you need to assess in order to grow? Then paying close attention to Jake Carlson’s talk about University of Michigan’s data management services assessment plans could give you the impetus and ideas needed to do this at your institution. And what about the real concerns over the changing landscape of liaison librarianship, as reported in the Final Report from the ARL Pilot Library Liaison Institute? How does a data management services librarian engage and collaborate with their liaison colleagues? Ideas, possible solutions and how they work through these issues are discussed by Goben, Griffin, Scheib and Martin.
And post-RDAP 2016, when you’ve returned to your workplace, implemented the service, but find you need support, encouragement, or an ear-to-bend, you can turn to one of the research data communities of practice you discovered.
It was all there, small ideas, large ideas; collaborations, solo work; federal, non-federal; repository, no repository – it was all in Atlanta and all of it good.
I learned that we can’t do it alone; that we need this community in order to thrive, to question, to continue these services – to continue a great renaissance that has come to libraries (thanks T. Scott Plutchak!).
As those who tweet and follow me know, I do a lot of tweeting at conferences, so thank goodness I don’t go to these all the time! I tweeted like mad at #RDAP16, and what I like about doing this is that I end up with many good notes, from myself and others, that I can use to talk to my colleagues about what I learned and experienced. But not all my tweets are good, so I was thrilled to learn about Storify and how I can choose tweets and organize them in a way that makes sense and maybe, just maybe, tells a decent story.
So this is my RDAP 2016 story, and I’m sticking to it!
Colby College seeks a Science Librarian who is innovative and curious, adaptive and collaborative. The successful candidate will be fearless about engaging with sciences faculty, students and other campus partners to shape a dynamic 21st century teaching, learning and research environment.
This is a full-time, 12-month faculty position without rank, with the opportunity for sabbaticals. Colby librarians collaborate with our CBB consortial colleagues at the Bowdoin and Bates College libraries. They are encouraged to serve on college-wide committees, and to participate in professional development activities.
To express interest, please send a cover letter/statement of interest, curriculum vitae, separate statements detailing teaching philosophy and research interests, and names and contact information for three references as .pdf attachments to Stephanie Frost, Administrative Assistant, Colby College Libraries at firstname.lastname@example.org.
For complete information see the full job posting: https://www.colby.edu/administration_cs/humanresources/employment/scilibrarian.cfm
The University of Rhode Island seeks an experienced data specialist to lead the Libraries’ efforts to support work with data across our University and within the University Libraries. The use of data and statistics is growing quickly as computational methods expand across many disciplines and as better tools make statistical analysis more widely available. The Libraries will serve several roles in enhancing use of data across disciplines on campus, and the incumbent will direct efforts at supporting the use of data throughout the scholarly life cycle. The Data Services Librarian plays a key role in developing services and policies to collect, manage, curate, provide access to, and instruct in the use of data throughout the data life cycle. The ideal candidate will have proven experience supporting applied research with statistical methods and a strong public services orientation.
This position is a 12-month tenure-track faculty appointment at the Associate Professor level. The position will be a calendar year appointment with an expected start date of July 2016. This position reports to the Chair, Public Services and Dean, University Libraries.
For a full job description and application process, please visit https://jobs.uri.edu/postings/1360
The Lamar Soutter Library at the University of Massachusetts has the following open employment opportunities in the NN/LM New England Regional Office:
Education and Outreach Coordinator (NN/LM New England Region)
Under the general direction of the Associate Director or designee, the Education and Outreach Coordinator will develop, plan, and provide services in specified NER programs while assuming broad areas of responsibility related to Outreach, Networking, Education, Librarianship, Consumer Health, Health Literacy, and Exhibiting. The Education and Outreach Coordinator supports librarians in the New England Region by planning and implementing programs and resources that build competencies in outreach to health professionals, consumers, public health workers, and other users of NLM’s resources.
Full Job Application: http://www.ummsjobs.com/job/1037/
Technology and Communications Coordinator (NN/LM New England Region and the NPHCO)
Under the general direction of the Associate Director or designee, the Technology/Communications Coordinator is responsible for planning and implementing the National Network of Libraries of Medicine, New England Region (NN/LM NER) and National Public Health Coordination Office (NPHCO) marketing and communications plan, and plans and coordinates the NER and NPHCO Technology Program. The Coordinator serves as the technical liaison for all areas of the NN/LM NER and NPHCO and is responsible for maintaining the web presence for these two programs. The Coordinator also troubleshoots technology questions from NPHCO partners, including state Public Health Departments (PHDs), libraries, NER Network members, and others.
Full Job Application: http://www.ummsjobs.com/job/1311/
Manager, National Public Health Coordination Office (NPHCO)
Under the general direction of the Director, Library Services or designee, the National Public Health Coordination Office (NPHCO) Manager administratively and independently manages all aspects of the NPHCO. The Manager hires and manages staff, monitors and controls budgetary expenditures, recruits partners, oversees program evaluation and serves as the liaison from the NPHCO to the National Library of Medicine, National Network Office (NNO). Additional information on the current Public Health Information Access Program at the Lamar Soutter Library can be found here: http://nnlm.gov/ner/initiatives/phia.html
Full Job Application: http://www.ummsjobs.com/job/1312/
Coordinator, Reference & Education National Public Health Coordination Office (NPHCO)
Under the general direction of the Associate Director or designee, the Reference and Education Coordinator provides public health reference and instructional services to the state Public Health Departments and partner libraries participating in the National Public Health Coordination Office (NPHCO) strategic alliance. The Coordinator will work in concert with the National Network of Libraries of Medicine (NN/LM) and the NN/LM National Training Center (NTC) to design, develop and implement the evidence-based public health curriculum. The Coordinator will also respond to partner reference questions, including assistance with DOCLINE, LinkOut, and questions related to searches on public health topics using Digital Library (DL) resources. Additional information on the current Public Health Information Access Program at the Lamar Soutter Library can be found here: http://nnlm.gov/ner/initiatives/phia.html
Full Job Application: http://www.ummsjobs.com/job/1310/
All applications can be submitted online.
Please direct questions to Mary.Piorun@umassmed.edu
The Journal of eScience Librarianship will be publishing two eScience Symposium reflection interviews very soon! The interviews feature Christopher Erdmann and Jian Qin sharing their experiences attending and presenting at the 2016 eScience Symposium. Watch the clips of their interviews now and stay tuned for their full articles available soon!
Submitted by Stacy Konkiel, Outreach & Engagement Manager at Altmetric, @skonkiel
What do you do, as a data librarian, if your university doesn’t offer its own data storage or preservation services?
In September 2015, I emailed with Dan Valen (Figshare), Meredith Morovati and Todd Vision (Dryad), and Tim Smith (Zenodo) to learn more about their repositories: how they differ, what features they offer to researchers, and more. Following are their answers, edited for length and flow.

Tell me about your repository: what does it do, why’d you create it, and what sort of services do you offer?
Figshare: At its core, Figshare is an open data repository and a place where researchers can get credit for non-traditional research outputs (read: data). But since its creation, it’s grown into much, much more. We’ve built tools on top of figshare.com to support researchers, publishers, and institutions and aid in the storing, sharing, and discoverability of all different types of research outputs. Our ultimate goal is transparency in research to aid in the reproducibility, replication, and reuse of data. We really want to provide a tool for the academic community that helps research and science be more efficient. We currently offer both researcher- and institution-oriented services (as well as services for publishers, but that’s a story for another day!).
Dryad: Dryad provides the means for making the data underlying scientific publications discoverable, freely reusable, and citable. It was developed at the behest of a consortium of journals that were jointly adopting a policy requiring that research data be archived at the time of publication. They saw a need for a shared repository that can do all sorts of things: accept all kinds of research datatypes and formats, make data deposit easy by integrating manuscript and data submission, preserve the linkage between the publication and the data, have professional curators for quality control and user support, be non-profit (but sustainable!), be governed by the community, make data free for reuse but ensure the original researchers get credit, and so on. Dryad was built to that long wish list, with the ultimate goal of really promoting the uptake of data archiving by individual researchers.
Zenodo: Zenodo was launched to make the sharing, curation and publication of data and software a reality for all researchers. The digital revolution has necessitated a re-tooling of scholarly processes to handle data and software, but this is proceeding at varying speeds across different communities, disciplines, and nations. To ensure no one is left behind through lack of access to the necessary tools and resources, Zenodo was conceived as the catch-all for science. In doing so we also had the goal to ensure that tools developed for Big Science could be effectively shared with the long tail of science. The catalyst for Zenodo’s launch was the European Commission (EC), which commissioned the OpenAIRE project to support its nascent Open Data policy, and CERN provided the capability. But it was rapidly evident from the global feedback and encouragement that the need for such a service was not restricted to EC projects!

How does your repository fit in with researchers’ existing workflows?
Dryad: Dryad aims to make available the research data in its “final form” just prior to publication, and to support whatever format the researchers feel is most appropriate for managing their data. So our submission system is format agnostic, although our curators are happy to advise about formats that are suitable for preservation and reuse. We are partners in a number of efforts to educate researchers about good data management practices throughout the research lifecycle, because that will benefit the researchers themselves in the long run and it will optimize the reuse potential of the data in Dryad. We are always looking at ways to foreground positive incentives, like data citation, so that researchers will care more about the reusability of their data and make good data management choices as a result. We also support researchers who wish to make corrections or deposit new versions of their data after publication. With so much high-quality, valuable data available now, “get data from Dryad” is becoming a reasonable starting point for many research projects.
Zenodo: Zenodo is being used both in classic and next generation workflows. For those that simply wish to add data or software to an existing publication workflow, many publishers recommend or accept supplementary material be placed in Zenodo and simply link to it. For those that wish to add it to the review process, content is restricted in Zenodo and protected links are shared with the reviewers. Content can be embargoed and automatically opened when the associated paper is published. For those that wish to safely store versions long before publication, content can be closed access and then go through a release cycle, opening up later in the research workflow. To support all these use cases, the simple web interface is supplemented by a rich API which allows third-party tools and services to use Zenodo as a backend in their workflow, such as PyRDM, Scientific Protocols, CiteULike and many more. With GitHub, we worked together to create closer interoperability, enabling GitHub login and inter-service trust for automated passing of payloads.
Figshare: At Figshare, we’re all about interoperability and understand there are a number of tools in the researcher toolkit and steps to their workflows. We have a public REST API (more info at docs.figshare.com) that helps developers build tools on top of Figshare to aid in their respective research workflows. This allows researchers to pull and push data into Figshare and assign appropriate metadata tags automatically, all the while adding as few extra steps into their workflows as possible.
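To make the push side concrete, here is a minimal Python sketch of creating a private item through a REST deposit API of this kind. This is an illustration, not official Figshare code; the endpoint path, metadata fields, and token handling are assumptions based on the public docs at docs.figshare.com, which remain the authoritative reference.

```python
import json

API_BASE = "https://api.figshare.com/v2"  # assumed base URL; see docs.figshare.com


def build_article_payload(title, description, tags):
    """Assemble metadata for a new deposit.

    Field names here are illustrative; consult the API docs for the
    authoritative schema before relying on them.
    """
    return {
        "title": title,
        "description": description,
        "tags": list(tags),
    }


def create_article(token, payload, session=None):
    """POST the metadata to create a private article (sketch only).

    Nothing is sent over the network until this is called with real
    credentials; a requests.Session (or a stub) can be injected for testing.
    """
    import requests  # third-party; pip install requests

    sess = session or requests.Session()
    resp = sess.post(
        f"{API_BASE}/account/articles",
        headers={
            "Authorization": f"token {token}",  # personal token from account settings
            "Content-Type": "application/json",
        },
        data=json.dumps(payload),
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    # Build (but do not send) a deposit payload.
    payload = build_article_payload(
        "Example dataset", "Demo metadata only", ["ecology", "csv"]
    )
    print(payload["title"])
```

A tool wired into a lab workflow would call `create_article` after each analysis run, which is the "as few extra steps as possible" idea the interview describes.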
Figshare for researchers provides an easy way for researchers to upload, store, collaborate on, and make their research data openly available. All data that’s made public is assigned a DataCite DOI to aid in permanence, citability and discovery. We also have a “Projects” area for both private and public collaboration around data and have numerous tie-ins with other research tools (such as rOpenSci, The Open Science Framework, plot.ly, Overleaf, and more) to fit into researcher workflows.

What are your top three favorite things about your platform and/or community?
Zenodo: For Zenodo itself, the fact that it is Open in all senses is really important: not just open source and open content, but also open to ideas and contributions! For the community, I really appreciate the trust and the encouragement to continue and to expand; it doesn’t feel like “just another service”, it feels like a good cause.
Figshare: One of my favorite things about the platform today is how our three offerings (to individual researchers, publishers, and institutions) are starting to intersect. This past December, we completed an 18-month project to merge our new platform with the old figshare.com codebase, and with it came a suite of new features as well as a cross-section of how publisher, institutional, and researcher content overlaps. We have such a vibrant community, who have been super open throughout this transition and really helped us prioritize new features and functionality. It’s incredibly rewarding to be part of a nimble company that can respond to the needs of its users, and the dev team really deserves a ton of recognition there.
Dryad: It is hard to pick three, but I really enjoy the enthusiasm from the community. It is particularly rewarding when researchers show pride in their data and promote it to their peers, which we see a lot of on Twitter. I also love the cool reuse examples and variety of data types we get – scientists study so many different topics. Researchers often come up to our booth at conferences to thank us and tell us how they have used data from Dryad. But, I have to say that my favorite thing at Dryad is my team. I am lucky to be working with a group of such committed professionals. Our curators, especially, deserve a lot of credit.

Is your repository intended to replace or complement institutional repositories? Is it possible (or advisable) to have both?
Figshare: It’s absolutely possible to have both! We’re really supportive of institutional repositories as a way to showcase and preserve university outputs. Figshare is not intended to replace an existing institutional repository but rather to complement its features and functionality. We were built to handle data first in all file formats, version active data projects, and visualize different types of data in the browser, and that’s what we do best. Since we’re built on top of our own API, we’d like to build connectors to existing IR infrastructure so institutions can better manage internal workflows alongside active research data, with the university playing a key role in the archiving and preservation of appropriate content.
Dryad: The important thing to our publishing partners is that authors can submit to Dryad no matter what their institutional affiliation. Institutional repositories, by contrast, are so varied in what they do, and few researchers are at institutions with an IR that provides the same level of service for publication-related data. That said, all institutions have an indispensable role as stewards of the potentially valuable research products that do not get published, and many institutions have an increasing role as providers of protected access to data that cannot be openly released. It is certainly possible for a researcher to deposit data in both an IR and Dryad, and we are happy to see it happen when the researcher is motivated to do so. But if that requires multiple independent deposits of the same material, then we’d see that as a failure of the infrastructure of which we are a part. So we encourage IRs to harvest or link out to the material in Dryad, and we’d be happy to see researchers work with their librarians to carefully document their data in their IR before depositing it in Dryad. I don’t think either of those things is very frequent now, but we can see them becoming more common as enabling interoperability technologies, like ORCID, and institutional data library services become better established.
Zenodo: Zenodo is offered to complement, not to replace! There are many reasons why a researcher may wish to, or be constrained to, use a community, institute or national repository. But many do not find a natural home, or do not find the necessary services in the options at hand, and for those Zenodo is there to assure a service. When researchers are faced with restrictions on type or size of content elsewhere, they turn to Zenodo to store the extra content to complement their existing collections. Others create new collections in Zenodo and gather unique content as well as some that exists elsewhere (since Zenodo does not insist on minting new DOIs, you can declare an already existing DOI). In this sense Zenodo neither prescribes nor advises any particular course of action; it is offered to support all those avenues being explored by the research community. And as an Open resource, Zenodo does not lock in any content, so if a researcher later changes their mind, we offer easy interfaces to synchronize or download the content.

What sort of data preservation options does your repository offer?
Dryad: Dryad considers the original file provided by the author to be the primary object of preservation. We scan the contents at submission, keep multiple copies in different systems, compare checksums, and do other basic operations to ensure the integrity of the bitstreams corresponding to those files. But the reusability of the data also depends on knowing about the data collection and processing methods. One reason that Dryad data must always be associated with a publication is to ensure that there will be at least that minimal contextual information. When the publication does not provide enough details, then having good metadata, for example in a ReadMe file, can be critical. For that reason, we encourage data to be reviewed alongside the manuscript so that metadata deficiencies can be spotted and rectified before release. Reusability also assumes the file format is still readable, which is sometimes a problem with the proprietary formats used by scientific equipment and software. Our curators have a lot of experience in what formats are best for preservation and reusability and can offer advice to researchers. We are also studying under what circumstances we should be migrating files to preservation formats, for instance when formats are known to be obsolete but can be migrated with minimal loss. We don’t overwrite files, but we may add migrated files to data packages as a special kind of “versioning”. Finally, we are talking to a preservation partner to ensure long-term archival backup of all of Dryad’s contents, so that we can be sure our DOIs will still point to the data as long as the internet, or some replacement for it, is still working.
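The fixity checks Dryad mentions (multiple copies, compared checksums) boil down to a pattern like the following. This is a generic illustration using Python's standard library, not Dryad's actual code:

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large datasets never sit fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicas_intact(paths, algorithm="sha256"):
    """True when every stored copy of a file yields the same checksum,
    i.e. no replica has suffered bit rot or a bad transfer."""
    return len({file_checksum(p, algorithm) for p in paths}) == 1
```

A repository would record the checksum at submission and rerun the comparison on a schedule, flagging any replica whose digest drifts from the recorded value.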
Zenodo: The goal is to treat all deposited data in the same way as we do CERN’s Large Hadron Collider data, with which we share the same data centres. This means to keep it “forever” on tape, and moreover to keep it online and accessible for as long as possible, resorting to the tape copy only in exceptional cases for the most dusty large data sets. This area of the service is under active development, to expose the preservation actions more clearly and to certify the processes more explicitly.
Figshare: We are starting to work closely with DuraCloud to provide a cloud-based storage solution for institutional data. Because DuraCloud also ties directly into Amazon Glacier and the Digital Preservation Network (all end-user figshare.com content is backed up to DPN) through the DuraCloud Vault node, institutions can leverage DuraCloud as a service for additional archiving and preservation of all Figshare content as they see fit. This is a really exciting development on our end because there’s been a lot of thought put into the archiving and preservation capabilities of DuraCloud.
As a result, the academic community will be able to take advantage of this functionality and decide what data to preserve and how they want it preserved. All data can be easily pushed from Figshare to DuraCloud, with archiving options managed through DuraCloud itself. We handle a lot of the integrity checks of uploaded content and link to publications where appropriate to ensure the data has enough context for reuse. This all happens behind the scenes, and the library can ensure that all data capture adheres to the researcher’s DMP/grant requirements or any existing internal policies on archiving and preservation.

Thanks, all!
It’s been interesting to hear about all that these three repositories have been up to, and I think that our readers will agree that each service is an excellent potential home for their institution’s research!
For complete information see http://classguides.lib.uconn.edu/RDMR
Morning Event: Open Science Framework (10:00 am – 12:00 pm)
Speaker: April Clyburne-Sherin, Center for Open Science
Afternoon Event: The 3rd Research Data Management Roundtable (1:20 pm – 4:00 pm)
Topic: “Data Management Instruction: Teaching One-Shot Workshops”
Location for both events: Boston College Theology and Ministry Library, Auditorium
(Room 113) Boston College Brighton Campus (map & directions; free parking is available)
Registration will begin at 9:30 am
Morning and afternoon refreshments will be provided.
Open Science Framework 10:00 am – 12:00 pm
“Demonstration and Discussion”
April Clyburne-Sherin is an epidemiologist and research methodologist working to open research through education. She works as the Reproducible Research Evangelist at the Center for Open Science where she conducts workshops on open and reproducible research practices for scientists and students. She will talk about reproducibility of research, open workflows, and The Open Science Framework (OSF), a free, open source web application that connects and supports the research workflow, enabling scientists to increase the efficiency and effectiveness of their research.
Lunch 12:00 pm – 1:20 pm
Participants will have lunch on their own; options include campus dining locations and nearby off-campus restaurants. Please be aware that all of these options require a short walk, so participants may want to bring their own lunch.
Research Data Management Roundtable 1:20 pm – 4:00 pm
“Data Management Instruction: Teaching One-Shot Workshops”
This is the third in a series of informal roundtable discussions on specific research data management topics. In this set of discussions, participants will focus on instruction, specifically on how we are teaching “one-shot” workshops. New at this roundtable will be a group discussion in which attendees are encouraged to share anecdotes (of any kind) about their experiences and also share the tools they are using in their research data management instruction. Come join the discussion and share!
Sign up on the registration form!
For questions about registration, contact Tom Hohenstein (email@example.com).
Both events are sponsored by Boston College and the New England e-Science Program.
The video recordings of the presentations from the April eScience Symposium are now available on the New England Region eScience Program YouTube page.
Kendall Roark, Assistant Professor, Research Data Specialist, Purdue University Libraries, gave the keynote address on “Data Work: Research Data Services in Canada & the U.S.”
Lisa Johnston’s poster, “Curating Research Data in DRUM: A workflow and distributed staffing model for institutional data repositories” was awarded “Best Poster Overall” at the symposium.
Check out the great materials from the eScience Symposium!
The University of New Hampshire Library seeks a dynamic, innovative librarian for the newly created position of Health & Human Services Librarian. This librarian will serve as liaison to the College of Health & Human Services, a college with numerous nationally accredited programs, growing enrollments, the recent addition of a doctoral program, and services that impact northern New England. Situated in UNH’s main library, Dimond Library, the Health & Human Services Librarian will provide reference, research, instruction, and collection development support for the disciplines located within CHHS, and will participate in general reference and instruction activities working closely with members of the reference staff and other library units.
For a full job description and application process, please visit http://jobs.usnh.edu/postings/21198. To receive full consideration for this position, in addition to completing the required online application form, please be prepared to submit a resume, a cover letter, and contact information for three (3) professional references.
Submitted on behalf of Elaine Martin, DA
Director of Library Services, Lamar Soutter Library
Director, National Network of Libraries of Medicine, New England Region
University of Massachusetts Medical School
In the newest issue of Journal of eScience Librarianship, Elaine Martin discusses the role of librarians in data science, and encourages us to take action.
Elaine’s advocacy and leadership in the NN/LM NER eScience Program has built a strong community of librarians, in New England and beyond, who are truly forward-thinking in the expanding discipline of data science.
However, we are seeing some hesitancy on the part of librarians to participate in the data movement, even as money and involvement in data initiatives from a range of other professions and academic disciplines have increased. Elaine sees this as a critical moment for librarians to actively plan and implement strategies collectively.
In her editorial, Elaine proposes a framework for the librarian’s role in data science, arguing that the principles and values of library and information science that form the core of our profession need to be part of this new discipline, and that we can add unique perspectives and roles. This “User-Centered Framework” consists of five buckets: data services, data management practices, data literacy, data archives and preservation, and data policy.
This is just a taste of Elaine’s vision for the future of librarianship. Please read the article in its entirety (http://dx.doi.org/10.7191/jeslib.2015.1092) and add to the discussion! We encourage all to get actively involved and encourage your comments below.
And be sure to check out all the articles in the newest issue of JeSLIB available now!
Submitted by e-Science Portal Editor Daina Bouquin, Assistant Head Librarian, Harvard-Smithsonian Center for Astrophysics, firstname.lastname@example.org
One substantial challenge librarians face as they work with scientists and students in data-intensive fields is the reality that data management is greatly impacted by geographically distributed research teams. Scientists, researchers, and students must constantly consider what approaches are being taken by their peers and colleagues around the world. From the librarian’s standpoint, I believe it’s important to do the same when we consider how we tackle the conversation about the library’s role in supporting those communities’ work. It can be difficult to keep up to date with everything going on in the quickly changing data librarianship landscape, and sometimes I need to remind myself to take a global perspective as I stay in touch with what other libraries are doing. Through Twitter, conference proceedings, meetings with international attendance, and one-on-one idea sharing, I think librarians around the world should be forming relationships and working together as much as possible to expand the role librarians play in fostering new knowledge creation in modern scholarship.
Below I list just a few attention-grabbing efforts to approach topics like research data management, metadata, cyberinfrastructure, and open science outside of the United States, in hopes of inspiring readers to keep their eyes open to initiatives abroad.
- Australian National Data Service’s 23 RDM Things
- Research Bazaar (ResBaz) or #resbaz on Twitter
- The Data Labs at Copenhagen University Library
- LIBER Europe
- Exploration of Decentralized Autonomous Collections
- Zenodo software citation
- ORCID Ambassadors
Even though the e-Science Portal and Community Blog feature “New England Librarians,” that doesn’t mean we’re sticking to perspectives just from New England! What sort of global or international projects have caught your eye lately?
The theme for this year’s e-Science Symposium was Library Research Data Services: Putting Ideas Into Action. If you’ve attended one of the e-Science symposiums before, you know how much is packed into one day! There was too much content to subject you to in one blog post, so I thought I’d focus on just a few highlights.
This is the first time I can recall breakout sessions being held at the symposium. These sessions split attendees into smaller, more interactive groups across four tracks: compliance, data information literacy, data repositories, and informationists.
In his data information literacy session, Jake Carlson talked about his experiences with a couple of different courses. His approach emphasizes the student’s role as data producer and manager, and has students use their own data for the course as a way to increase their investment in the sessions. The DIL course currently underway at the University of Michigan’s College of Engineering draws upon guest speakers for part of its curriculum, including an IT specialist, data visualization expert, copyright officer, and director of compliance from the College of Engineering. Classes last for 2 hours, and they’ve found it most productive to typically split this allotted time into thirds for each session: first presentations, then discussion, and finally hands-on time for students to work with data. Jake mentioned that student feedback led them to switch the flow of curriculum a bit so that big-picture ideas and hands-on, detailed work both happen throughout the semester rather than starting with large-scale topics early on and funneling down into details later.
The other breakout I attended was Margaret Henderson & Hillary Miller’s session on compliance, which was one of the high points of the symposium for me. Their presentation was chock-full of great resources and tips. One of those tips noted the value of Retraction Watch as a source of cautionary tales for researchers who may be more concerned with the possibility of retraction than compliance. I also loved their suggestion to put together a README file template to help give researchers a leg up on creating good documentation for their datasets. Dryad has some nice guidance and examples to draw from.
The only downside to the breakout sessions was missing out on the other two offerings – I heard great things about them too. The poster session was also very lively this year, with over 20 presenters speaking to a variety of topics. Check out their work here!
That’s just a taste of some of the topics offered up at this year’s symposium. Thanks to UMass Medical and co-sponsors (Lamar Soutter Library, NN/LM New England, and the BLC) for organizing another great event.
Registration is now open for the 2016 New England Science Boot Camp! For further information and to register, visit the Science Boot Camp 2016 Lib Guide at http://guides.library.umass.edu/BootCamp2016. Scholarships are available for current Library School students!
This year’s Science Boot Camp will be held at the University of Massachusetts Dartmouth June 15-17th. Now in its eighth year, Science Boot Camp provides a fun and casual setting where New England science faculty present educational sessions on their respective science domains to librarians. Science topics for this year’s boot camp include Nursing, Physics and Civil & Environmental Engineering. There will also be a special evening presentation, “Dolphin Politics in Shark Bay” by UMass Dartmouth Professor of Biology Dr. Richard Connor on Wednesday, June 15th. The Capstone on Friday June 17th will feature a hands-on session, “Science Literacy.”
Prior to the official start of the boot camp program, Science Boot Campers can opt to take tours on Wednesday, June 15:
- Campus tour featuring the architecture of Paul Rudolph. Tour leader: Bruce Barnes, retired, former head of UMass Dartmouth Library Technical Services
- Visit the renovated Claire T. Carney Library that has received awards from ALA and the American Institute of Architects. Tour leader: Catherine Fortier Barnes, Assistant Director, Claire T. Carney Library.
Science Boot Campers are also invited to participate in an optional Friday afternoon trip to the New Bedford Whaling Museum. Campers will be responsible for the admission fee. More information on this will be provided as it becomes available!
Science Boot Camp provides librarians with valuable continuing education at a low cost, and offers three options for attendees: full registration with overnight lodging, commuter registration, or one-day registration.
This year, UMass Dartmouth is offering overnight accommodations for Tuesday, June 14th and/or Friday, June 18th, at additional cost to campers. Campers who would like to stay Tuesday and/or Friday evening will pay a separate fee directly to the UMass Dartmouth Library. Details about this option can be found on the registration page.
If you’ve never been to Science Boot Camp, visit the e-Science Portal’s Science Boot Camp page at http://esciencelibrary.umassmed.edu/science_bootcamp where you’ll find descriptions, links to past SBC LibGuides, and links to SBC videos!
Are you curious about what you can expect to learn at Science Boot Camp 2016? Here are the learning objectives for the 2016 Science Boot Camp science sessions:
For each of the focus topics covered at Science Boot Camp’s science sessions, Science Boot Campers will be able to:
- Explain the structure of the field and its foundational ideas
- Understand and be able to use terminologies for the field
- Identify the big questions that this field is exploring
- Discuss new directions for research in this field
- Discuss what questions research in this field is addressing
- Understand how research is conducted, what instrumentation is used, and how data is captured
- Identify how researchers share information within their fields beyond publications
- Share insights into what current research in the field is discovering and implications of these discoveries
- Share insights into how researchers in specific fields collaborate with librarian subject specialists now and how they might collaborate in the future.
- Identify new ways that librarians can support their research communities
Submitted by e-Science Portal Editor Daina Bouquin, Assistant Head Librarian, Harvard-Smithsonian Center for Astrophysics, email@example.com
Note for transparency: Daina’s library has a project entered into the challenge discussed below.
The Knight Foundation recently finished accepting submissions to its annual News Challenge, which this year focuses on libraries. Each year, the foundation holds a “challenge” that supports what it sees as “transformational ideas” to promote democracy by ensuring that communities are informed and engaged. Winners of the challenge receive a share of $3 million in funding and support to help advance their ideas (check out the challenge’s previous winners here). This challenge is great for creative librarians across the spectrum, especially librarians interested in supporting computationally intensive science like members of the e-Science portal community; the prompt this year is: “How might libraries serve 21st century information needs?”
Over 600 ideas were submitted to address this question, and they are fascinating. Just a quick glance will tell you there are a lot of libraries out there interested in building out digital technologies that support data-intensive research and education. Some themes I’ve seen among those entries seem to be:
- Improving accessibility to digital collections (including research data)
- Developing novel data visualization tools
- Building tools for creating digital media (databases, music, e-books, art etc.)
- Leveraging/aggregating data sources from communities and external sources
- Maker spaces of all kinds
- Literacy initiatives focused on data and data science skill sets
What many of these ideas have in common, though, is that they rely heavily on libraries’ ability to use skills like web development, data wrangling, and preservation to help their communities. This in itself is a sign that librarians are focusing on the need to continually develop new skill sets and retool when necessary, which to me is encouraging for librarians everywhere looking for support in continuing their professional development (it’s a sign that you’re not alone if you’re trying to re-tool!).
If you haven’t entered the challenge, you can still support your favorite ideas by giving “applause” to libraries’ proposals (clicking on the heart), commenting to show your support, and sharing the links you like the most with friends. Reading through these ideas will hopefully inspire you to try something new and support members of the community who are doing the same!
Upcoming Webinar: Complying with the NSF’s New Public Access Policy and Depositing a Manuscript in NSF-PAR
Hosted by the University of Massachusetts Medical School, National Network of Libraries of Medicine, New England Region
Presented by Hope Lappen and Andrew Creamer from Brown University
Tuesday, April 19th, 2016, 12:00-1:30 PM EDT
In 2016 the National Science Foundation (NSF) rolled out its new online public access repository, NSF-PAR, for investigators funded by the NSF to deposit their manuscripts in compliance with its new Public Access Policy. The NSF’s policy and its new publications repository differ in several key ways from the National Institutes of Health’s (NIH) public access policy and PMC, particularly in terms of requirements for compliance and procedures for deposit. While NIH grants may make up the majority of biomedical institutions’ research funds, the NSF is also an important source of biomedical funding, especially for career awards, research training grants, and translational research. In this webinar we will walk participants through the requirements for compliance and the process for deposit, and share insights provided by the NSF Policy Office.
Hope Lappen is the Biomedical and Life Sciences Librarian at Brown University. Prior to coming to Brown, Hope served as the Science and Engineering Librarian at George Washington University and the Eugene Garfield Resident in Science Librarianship at University of Pennsylvania.
Andrew Creamer is the Science Data Management Librarian at Brown University. He helps students and faculty researchers with NSF data management plans and digital curation projects. Prior to Brown, Andrew worked on research data management initiatives at the NN/LM NER.
Being a data librarian shouldn’t just involve helping with a DMP, discussing policy, or helping set up an organization system. Data librarians, really any research librarians, should have an understanding and awareness of the multiple factors that impact the researchers they are trying to help. In the realm of the biomedical sciences, most of our patrons have, or are trying to get, NIH grants, so it is helpful to keep up with things at the NIH that might have an impact on them. The Extramural Nexus blog is good to follow because it is a nice mix of NIH news and grant information. You can also check Notices of NIH Policy Changes for grant specifics and the NIH News & Events page for more extensive coverage.
For instance, one of the recent notices NOT-OD-16-081 covers NIH and AHRQ grant application changes for due dates on or after May 25, 2016. This includes changes to biosketches, which many librarians help with, a new “Data Safety Monitoring Plan” attachment that will need to be included for clinical trials, and new requirements in the area of rigor and transparency.
A quick look at the biosketch change shows that it is a clarification of the biosketch instructions; it includes the note that research products can include conference proceedings such as meeting abstracts, posters, or other presentations, but the only URL to a publication list has to be to a .gov site like My Bibliography (so ORCID is out).
The Data Safety Monitoring Plan is a bit more problematic to research. I have had somebody ask about it recently though, so I took the time to follow some links. There has been a requirement for a plan since 1998, but the notice seems to be indicating that it is now a supplement to the research plan that can be attached as an additional pdf. Instructions vary from institute to institute, but links have been collected. Librarians can’t help with all the details of these plans, but when gathering information about data security for data management plans (DMPs), learning about the secure systems used for patient data and asking for boilerplate language to describe the system can be helpful for this requirement. Knowledge of who to contact in IT or the research office, especially those helping with the Institutional Review Board (IRB), can also be of help to a grant writer.
The NIH has a large FAQ page on Rigor and Transparency, but the big question is space: grant writers only have 12 pages to write about the research, so they want to know where they have to address these new requirements. So I found this:
“Three elements of the policy (scientific premise, scientific rigor, and relevant biological variables such as sex) should be addressed within the Research Strategy section, as these elements are integral to the research plan. Since scientific premise will be reviewed and scored as part of the Significance review criterion, it is suggested that applicants address premise as part of their corresponding Significance section in the research strategy. Scientific rigor and relevant biological variables will be reviewed and scored as part of the Approach review criterion.
Authentication of key resources will be addressed in a separate attachment, not to exceed one page in length.”
The rigor and transparency requirements are based on the Rigor and Reproducibility training and research being done at the NIH. Because these new requirements and goals touch on analysis, experimental design, methodology, and reporting of results, I feel it is important for data librarians to be aware of what is going on. Our work might not make it into the actual grant, but making sure the raw data is always saved, encouraging researchers to keep copies of analysis code, and making sure data doesn’t get lost all contribute to better research.
In light of the many retractions coming after the five-year preservation limit for most grants, and the OSTP memo suggesting that digital data must be made public, I’m recommending that the researchers I help make a special folder for each publication they have. In that folder they should include the article PDF (if they are allowed to share it under their contract), the final manuscript copy (in case the publisher doesn’t get the article into PubMed Central), and all the data backing up that article. Along with the data files should be a README file describing what the data files contain, plus any extra methods that might not have made it into the article. Software code should also be included. I’ve told them to make sure the folder has everything somebody else might need to reproduce the results in the paper.
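The folder-per-publication idea above can be sketched as a small script. This is only an illustration: the file names, README fields, and folder layout below are my own invention, not a prescribed standard.

```python
import os

# Illustrative README layout only; adapt the fields to your own practice.
README_TEMPLATE = """\
Publication: {title}
Data files:
{file_list}
Methods notes: (anything that did not make it into the article)
Software/code: (scripts needed to reproduce the analyses)
"""

def make_publication_folder(base_dir, title, data_files):
    """Create a per-publication archive folder with placeholder files and a README."""
    folder = os.path.join(base_dir, title.replace(" ", "_"))
    os.makedirs(folder, exist_ok=True)
    # Placeholders stand in for the real article PDF, final manuscript, and data.
    for name in ["article.pdf", "final_manuscript.docx"] + list(data_files):
        open(os.path.join(folder, name), "w").close()
    file_list = "\n".join("  - " + f for f in data_files)
    with open(os.path.join(folder, "README.txt"), "w") as fh:
        fh.write(README_TEMPLATE.format(title=title, file_list=file_list))
    return folder

folder = make_publication_folder(".", "Example Study 2016", ["trial_data.csv"])
```

Even a script this simple makes the habit easier to keep: the README is generated alongside the data, so documentation is never an afterthought.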
It does take some extra time to learn about things that are often tangentially related to what you think is your job, but I have found that having that extra bit of information to help a researcher can help change perceptions and make you more of a faculty colleague.
Here’s a quick recap of JupyterDays Boston, held at Harvard Law School on March 17-18 and organized by Project Jupyter, O’Reilly Media, and the Harvard-Smithsonian Center for Astrophysics.
What is Jupyter? From jupyter.org, “The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.” Think of them as interactive notebooks that can include a variety of different kinds of content – everything from data visualizations to code to audiovisuals to text. The really nifty thing is that notebooks can have cells that act as interactive widgets – such as executable code – that when shared with others, allows them to run a bit of code and make something happen. Data can be pulled in, parameters set, analyses run, and interactive data visualizations produced – all within the notebook.
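As a tiny illustration of the “pull data in, run an analysis, see the result” pattern described above, here is the kind of self-contained cell you might run in a notebook. The readings are made-up sample numbers for the example, not real data.

```python
import statistics

# A notebook cell typically mixes data, a computation, and its printed output.
# These values are invented sample readings for illustration only.
readings = [72, 68, 75, 80, 71, 69, 77]

mean_val = statistics.mean(readings)
spread = statistics.stdev(readings)
print(f"mean={mean_val:.1f}, stdev={spread:.1f}")
```

In a real notebook, a cell like this would sit between explanatory text cells, and a reader could edit the values and re-run it immediately, which is exactly the low-barrier interactivity the presenters praised.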
The first day’s sessions devoted considerable time to uses of Jupyter Notebooks – and related tools like Binder and Docker – in classroom settings. Several presenters mentioned that they liked using notebooks in part because these browser-based solutions avoid the inevitable headaches and time-sinks that can occur with software installations, especially in BYOM environments. So rather than wrestling with the mechanics of downloads and installs, students can instead jump quickly into working directly with code and data. This reflected a common thread that seemed to run through many sessions; rather than trying to force non-coders into learning how to code in order to get things done, the emphasis instead is on lowering barriers to coding so that people can get on with doing their work and telling their stories.
These points were driven home by some presentations from folks using Jupyter notebooks in their teaching, including a wearable signal processing project from Harvard that has students working with their own physiological data such as heart rate, accelerometer data, and electro-dermal activity, as captured by Empatica E4 wearables and processed in Jupyter Notebooks.
The student perspective was represented as well. A breakout session on texts and other educational materials in Jupyter included a student from Olin College talking about how he liked using Jupyter notebooks written in textbook style, as they intersperse traditional text explaining how to do something with ‘now write this code’ cells that immediately have the student apply what they read. During a panel discussion, a Wellesley professor talked about surveying her students on their use of Jupyter Notebooks in her computer science course. She said that most responded positively to working in Jupyter Notebooks, including one student who said she couldn’t imagine learning the content without them. Some students did feel that the presentation of material was a bit overwhelming when viewed in the notebook, and thought it might be more digestible if presented in smaller chunks.
JupyterDays also afforded the opportunity to play with some systems, like running JupyterHub on Docker, running data analysis (on Cambridge city pothole data) with Python and R, and data mining and network analysis using Wikipedia data. All in all, it was a worthwhile event featuring some really interesting, potentially disruptive technologies – many thanks to the host and organizers for putting it together.
Want to learn more?
See here for examples of IPython notebooks in a variety of disciplines. My favorite is the section on reproducible academic publications that couple journal articles with notebooks “…that enable (even if only partially) readers to reproduce the results of the publication.” You can imagine how powerful something like this can be as a record of a research process – and in fact, several participants touted the ability of Jupyter Notebooks to make the research narrative more transparent and reproducible.
Check out this Jupyter notebook that generates an interactive Hans Rosling bubble chart using Plotly.
Webinar hosted by Simmons School of Library and Information Science Continuing Education
Presented by Elaine Martin, DA, Director of UMass Medical School Lamar Soutter Library & Director of the National Network of Libraries of Medicine, New England Region
Tuesday, March 29, 12:00-1:00 pm EDT
This webinar will introduce attendees to the exciting career of medical librarianship. The webinar will present an overview of the field, including a discussion of what medical librarians do, who they serve, the settings in which they work, and the types of resources they use. Medical librarians play an important role as members of the patient care team, support the educational mission of students in the health professions and partner with researchers in advancing scientific discoveries from the bench to the bedside. New roles for medical librarians such as: conducting systematic reviews to inform patient care decision making, designing health literacy programs and services, teaching the principles of evidence-based medicine, and creating research data management services will be emphasized.
For more information: https://slis.simmons.edu/ce/node/362
This year’s Science Boot Camp will be held June 15-17, 2016 on the campus of the University of Massachusetts Dartmouth, in Dartmouth, Massachusetts. Science Boot Camp is a fun and affordable 2 ½ day immersion into science topics, offering opportunities for librarians and library students interested in science, health sciences, and technology to learn, meet, and network in a laid-back atmosphere. Now in its eighth year, the New England Science Boot Camp has been hosted on multiple New England campuses and has been attended by librarians and library students from various regions of the US and beyond—and inspired the development of other Science Boot Camps in the West, Southeast, and Canada!
The topics for this year’s SBC science sessions include:
Each science session will include one scientist presenting an overview of the field and a second scientist discussing research applications within the field. The Capstone session will feature Science Literacy.
Please Save the Date for 2016 New England Science Boot Camp June 15-17, 2016 at the University of Massachusetts Dartmouth!
For further questions, please contact Barbara Merolli at firstname.lastname@example.org.