Feed aggregator

Science Boot Camp Registration is open!

e-Science Portal Blog - Tue, 04/28/2015 - 14:25

Registration is now open for the 2015 New England Science Boot Camp! For further information and to register, visit the Science Boot Camp 2015 Lib Guide at http://classguides.lib.uconn.edu/SBC2015

This year’s Science Boot Camp will be held at Bowdoin College, June 17–19. Now in its seventh year, Science Boot Camp provides a fun and casual setting where New England science faculty present educational sessions on their respective science domains to librarians. Science topics for this year’s boot camp include Cognitive Neuroscience, Marine Science, and Ornithology. There will also be a special evening presentation, “History of Diabetes,” on Wednesday, June 17. The Capstone on Friday, June 19, will feature a hands-on session, “Writing and Reviewing Data Management Plans.”

Prior to the official start of the boot camp program, Science Boot Campers can opt to take tours of the Bowdoin College Museum of Art and/or the George J. Mitchell Department of Special Collections & Archives on Wednesday morning, June 17. Science Boot Campers are also invited to participate in an optional tour of Bowdoin’s Coastal Studies Center Marine Lab on Orr’s Island on Friday afternoon, after the conclusion of Science Boot Camp. (SBC registrants are required to indicate whether they will participate in one or more of these activities on the SBC registration form: https://webapps.umassd.edu/events/library/?ek=539.)

Science Boot Camp provides librarians with valuable continuing education at a low cost, and offers three options for attendees: full registration with overnight lodging, commuter registration, or one-day registration.

This year, Bowdoin College is offering overnight accommodations for Tuesday, June 16 and/or Friday, June 19, at additional cost to campers. (Campers who would like to stay Tuesday and/or Friday evening will pay a separate fee directly to Bowdoin College. Details about this option can be found at http://classguides.lib.uconn.edu/content.php?pid=665848&sid=5513558.) Getting to Bowdoin is easy by car, bus, or the Amtrak Downeaster train. For further details, see http://www.bowdoin.edu/about/visiting/directions.shtml

If you’ve never been to Science Boot Camp, visit the e-Science Portal’s Science Boot Camp page at http://esciencelibrary.umassmed.edu/science_bootcamp where you’ll find descriptions, links to past SBC LibGuides, and links to SBC videos!

Are you curious about what you can expect to learn at Science Boot Camp 2015? Here are the learning objectives for the 2015 Science Boot Camp science and Capstone sessions:

For each of the focus topics covered at Science Boot Camp’s science sessions, Science Boot Campers will be able to:

  • Explain the structure of the field and its foundational ideas
  • Understand and be able to use the terminology of the field
  • Identify the big questions that this field is exploring
  • Discuss new directions for research in this field
  • Discuss what questions research in this field is addressing
  • Understand how research is conducted, what instrumentation is used, and how data is captured
  • Identify how researchers share information within their fields beyond publications
  • Share insights into what current research in the field is discovering and implications of these discoveries
  • Share insights into how researchers in specific fields collaborate with librarian subject specialists now and how they might collaborate in the future.
  • Identify new ways that librarians can support their research communities

Additionally, following the “Hands-On Writing and Reviewing Data Management Plans” Capstone, Science Boot Campers will be able to:

  • Understand how actions by the OSTP and funders have led to requirements for data management plans
  •  Write a basic data management plan based on an actual research case
  •  Identify gaps in a data management plan requiring additional information from researcher(s)
  • Review and critique a data management plan written by others
  • Recognize the importance of disciplinary terminology when writing or reviewing a data management plan





Heather Coates is building a data management curriculum at her institution, one partnership at a time

e-Science Portal Blog - Mon, 04/27/2015 - 08:09

In the four short years since joining IUPUI’s University Library, Heather Coates has built a data management program from scratch, forged partnerships across campus (and throughout the larger university), and also served the school’s scholarly communication needs overall.

What can we learn from her experiences?

Recently, I emailed with Heather to learn more about her recent successful data management course, offered in partnership with her campus’ Clinical and Translational Sciences Institute (CTSI), as well as how she’s managed to balance her data management outreach and education efforts with her many other duties–an experience that many of us understand all too well!

Tell me about your current role at IUPUI.

My title (Digital Scholarship & Data Management Librarian) is a bit of a mouthful, but it is a fairly accurate description of what I do. My primary role is to provide data services to the IUPUI campus. As part of the Center for Digital Scholarship, I also work with the Scholarly Outreach, Scholarly Communication, and Digital Humanities Librarians to provide a range of research support services. Primarily, this includes education and advocacy for altmetrics and other sources of evidence demonstrating research impact.

I am also the liaison to the Fairbanks School of Public Health. All Center librarians, myself included, hold liaison roles to keep us connected to information literacy and reference services. It can be really challenging to balance all of these roles, but each one helps me do the others better.

You didn’t start out as a librarian, correct? What drew you to data librarianship?

It was completely serendipitous. I finished the MLIS program expecting to become a medical librarian or a subject librarian in psychology. Finding jobs for my husband and me in the same city after the housing crash proved difficult, so when I saw the posting for my current position, I pounced. The more I learned about data curation, the more interested I became. While preparing my interview presentation, all the hard-learned lessons from my years in research and my coursework in health informatics suddenly made sense. It was really exciting to finally feel like I had found my niche.

You recently presented at ACRL 2015 on the data management class you created at IUPUI in partnership with the NIH-funded Indiana Clinical and Translational Sciences Institute (CTSI). Tell me about that course.

I came across the Data Management Team at the Indiana CTSI during my environmental scan of the campus. I can’t remember exactly how I stumbled across the field of clinical data management, but once I was aware of their expertise, I had to reach out.

The team leader, Bob Davis, was very open, willing to talk, and eager to hear that the Library was interested in providing data management training. Bob’s team is very experienced, but they are funded on a cost-recovery model. Although they offer several workshops, they have very little time to provide the in-depth training that I wanted to develop. They also have a much deeper and narrower focus on research data management than I could take.

My goal was to develop a broad curriculum that would meet the needs of researchers in the social, life, and physical sciences, as well as the health sciences. Bob and his team graciously shared their training materials and expertise, which hugely shaped the program. They also attended the pilot lab and provided very helpful feedback that continues to shape how we offer training. Homing in on the instructional design has been a combination of applying the evidence and trial and error, but their perspective on the balance of content has been instrumental.

What was the most successful aspect of that course?

Attendees really enjoyed the discussions, especially with researchers from other disciplines. The activity that has been the most popular is the data outcomes mapping exercise, even though most novice researchers struggle with it.

What parts of the course would you change going forward?

Oh, so many things! There is so much room for improvement, especially the instructional design and delivery. I’m still figuring out how to talk less, let the students lead the discussion, and get them more engaged with the activities.

Why did you decide to offer the course outside of the libraries instead of as a library-based workshop series?

Although we have a strong liaison program, many of the strong relationships between the library and faculty across campus are based on instruction and collection development. I felt that it was really important to acknowledge that no one person or group can know everything necessary to teach data management or provide data services. So building relationships with other research support units across campus has been a consistent effort since I started. Research takes a village!

You founded IUPUI’s library data management service in 2011. What’s been your biggest success (and your biggest challenge) since then?

The biggest challenge has been getting the attention of our faculty, staff, and students long enough to tell them what the library has to offer. No answers to that one yet, except that word of mouth is powerful. So I focus on making one connection at a time.

For librarians just getting started on designing and offering data management workshops, what resources would you recommend?

Many of us are the only data librarian on our campuses, so building an external network outside your institution is crucial. The institutional network is necessary to get your job done; the external network is really important for peer support. It’s so nice to have a group of dedicated, brilliant peers who care as much about data as I do!

For more information on Heather’s work, visit her blog, the IUPUI Data Services website, and follow Heather on Twitter. You can also find copies of much of her formally published work on the IUPUIScholarWorks repository.

Highlighting Resources from the New England e-Science Program

e-Science Portal Blog - Fri, 04/24/2015 - 17:48

Submitted by Donna Kafel, Project Coordinator for the E-Science Portal and New England e-Science Program

Staying on top of conference proceedings is challenging, whether you’re physically attending a conference, or wistfully following attendees’ tweets and links to presentations from afar.  The number of conferences, symposia, workshops, camps, and national conference sessions featuring RDM related topics has surged, and keeping abreast of all the great output from these events is a challenge.

Over the years, the New England e-Science Program team, based at the Lamar Soutter Library, University of Massachusetts Medical School, has made a concerted effort to capture the rich content from its two key conferences: the annual University of Massachusetts and New England Area Librarian e-Science Symposium and the New England Science Boot Camp. The e-Science Symposium conference pages include detailed agendas, links to presentation slides, and posters. Recordings of presentations from the past three e-Science Symposia are available for viewing on the UMass Medical School/New England Area Librarians e-Science Symposium YouTube channel. The Science Boot Camp page on the e-Science Portal is one of the most heavily trafficked content areas of the portal, featuring descriptions of each of the seven NE Science Boot Camps, LibGuides, and videos of Science Boot Camp presentations.

ACRL online course: What you need to know about writing data management plans

e-Science Portal Blog - Thu, 04/23/2015 - 10:54

Registration is open for an upcoming ACRL e-Learning online course, “What You Need to Know about Writing Data Management Plans,” which will be offered April 27–May 15, 2015.



Big Data and the Collaborative Web Shaping NIH’s Vision and Future Programs

e-Science Portal Blog - Mon, 04/13/2015 - 14:48

Submitted by guest contributor, Katie Houk, Health & Life Sciences Librarian, San Diego State University

I’d like to recap and present a few of my thoughts on the first presentation of the 7th Annual e-Science Symposium, but first I need to congratulate the Lamar Soutter Library at the University of Massachusetts Medical School, the National Network of Libraries of Medicine New England Region, and the Boston Library Consortium for the best Symposium that they have held so far. Now in its seventh iteration, the symposium presented a cohesive and interesting schedule with excellent speakers, a range of posters, and much food for thought on the state of data and its management in the sciences.

Our first speaker of the morning was Dr. Philip Bourne, newly appointed Associate Director for Data Science at the National Institutes of Health. Dr. Bourne was kind enough to Skype in from California, quite early in the morning, in order to present to our group. Dr. Bourne’s talk covered many topics on his mind, and even included some of his personal ideas about the direction of the National Library of Medicine after the current director retires. He began by recommending two books that are currently shaping his thoughts and outlook on technology and data: The Second Machine Age and BOLD. Some of his more interesting statements, in my personal opinion, concerned the NIH’s creation of a genomic data sharing policy and the prospect that data sharing plans will soon be required for all NIH awards, not just those over a certain amount. The NIH is currently looking at how to enforce these data management plans, and is concerned that DMPs are not machine readable. The agency is wondering whether it should standardize the plans in some way, which could also help them become machine readable in the future. Dr. Bourne is also very interested in legitimizing data as a form of scholarship. He gave an anecdote about a paper of his that has been cited over 19,000 times, yet he knows almost nobody has read it, because it is about data. Meanwhile, nobody cites the actual data, because data citation is not yet standardized, nor is it considered as prestigious as citing a published paper.

Dr. Bourne spent the bulk of his talk on the Big Data to Knowledge (BD2K) program at the NIH. This program was created after he came into his new position, and it focuses mainly on accelerating discovery and making experiments replicable. It aims to bring together communities, policies, and infrastructure in order to create research and outcomes that are efficient, sustainable, and collaborative. While Dr. Bourne realizes that individual labs are not much concerned with big policies, I believe he hopes that successes like the ENIGMA project will show the usefulness of standardized protocols and homogenized data in bringing together data from separate sites to uncover population-level discoveries. Dr. Bourne spoke many times about the community involvement aspect of building policies and new programs at the NIH. He pays homage to the idea of open access, using openness and community involvement to build better programs and services and to make more health discoveries. Bourne’s big ideas for future changes at the National Library of Medicine include the library being more open and more collaborative with the community in developing programs. He feels it should be more effective in its use of open access materials, and should function more like a digital public library. I find it interesting that when I explored some of the projects in the Big Data to Knowledge program, many of them are librarian-ish in nature, in that they are mainly discussing solutions to the struggles of defining, describing, collecting, and organizing research data into something that is discoverable, useful, and possibly reusable.

I wonder whether Dr. Bourne’s vision for the projects within the NIH’s BD2K includes librarians as part of the community involved in these projects. It seems unlikely to me: when asked how or whether librarians could benefit from and contribute to the new focus of the Data Science section, Dr. Bourne was at a bit of a loss. This conundrum is nothing new to our profession, but in my opinion now is an opportune moment for intrepid librarians and database programmers to come forward and point out that our professional expertise in organizing, describing, and accessing knowledge is an excellent reason to be involved in many of the larger projects occurring at the NIH.

Part of the BD2K program is the development of something called the Commons, which is essentially a repository hosted in the cloud. The Commons will have a set of interoperability guidelines, and the design is meant to push individual companies to create Commons-compliant software by providing extra funding to researchers who use such products. The unique aspect of the Commons is that it will also provide cloud supercomputing, research tools, and APIs that labs can use to conduct their data analyses. The third speaker of the morning, Dr. Kuo, made me wonder whether the Commons will be as successful as its planners are dreaming, mainly because most researchers on the ground are doing such complex work that they need cheap, fast, and flexible options that meet their very specific needs. Compliance with outside mandates often increases the cost and decreases the flexibility of products. Perhaps the question is: is the NIH big enough and powerful enough to impose mandates on individual researchers that will be followed and can be enforced?

As later symposium speakers pointed out, many of the ideas that Dr. Bourne touched on have been attempted before, so it will be interesting to see how these projects pan out in the future.

A few of my personal takeaway ponderings:

  1. Time to read up!

  2. What would more community involvement in developing programs at the NLM look like?

  3. Are librarians being included on the data projects that are essentially concerned with the types of issues we’ve been dealing with as a profession for centuries – just with mostly physical materials so far? Should they be?

  4. Will the Commons be able to provide the affordability and flexibility researchers need to conduct their varied projects?

  5. Will there be a standardized form, and a requirement for DMPs to be in XML, when the NIH finally mandates that all proposals must include one?

  6. If scientists are being mandated to use standards and think about interoperability, where will they find out about which standards are available and the best to use? (Or rather, where do librarians go to find this out, and how can this scattered information be collected and accessed more efficiently?)
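Points 5 and 6 above imagine DMPs as standardized, machine-readable documents that a funder could validate automatically. As a purely illustrative sketch of that idea, here is a tiny Python check against an invented JSON layout; no NIH or OSTP schema exists yet, so every field name below is a hypothetical assumption, not an actual requirement:

```python
import json

# Hypothetical required fields for a machine-readable DMP.
# These names are invented for illustration; no funder mandates them.
REQUIRED_FIELDS = {"project_id", "data_types", "standards",
                   "repository", "access_policy"}

def validate_dmp(dmp_json):
    """Return a sorted list of required fields missing from a DMP document."""
    dmp = json.loads(dmp_json)
    return sorted(REQUIRED_FIELDS - dmp.keys())

example_dmp = json.dumps({
    "project_id": "R01-EXAMPLE-0001",
    "data_types": ["survey responses", "MRI images"],
    "standards": ["DICOM"],
    "repository": "institutional",
    # "access_policy" deliberately omitted to show gap detection
})

print(validate_dmp(example_dmp))  # -> ['access_policy']
```

A standardized structure like this is what would let an agency flag incomplete plans at submission time rather than during manual peer review, which is exactly the enforcement gap the NIH comments describe.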

What do you think?


Link 1: http://www.worldcat.org/title/second-machine-age-work-progress-and-prosperity-in-a-time-of-brilliant-technologies/oclc/867423744&referer=brief_results

Link 2:


ACRL 2015: Sounds familiar

e-Science Portal Blog - Tue, 04/07/2015 - 12:55

This was my first ACRL national conference, but will hopefully not be my last. Attendees really were spoilt for choice at this conference – there were far too many sessions on a wide variety of topics to do justice to them all. For this blog post, I thought I’d do a quick recap of a few presentations that stuck with me.

I attended a few sessions related to library and institutional data that were interesting for their analogies to data in the e-science world. Common themes included the notion that a lot of data is collected, but very little is acted upon. The data that is acted upon mostly relates to compliance issues rather than being analyzed with an eye toward strategic decision-making. Culture, money, talent, political tensions, and time were noted as barriers to putting library and institutional data to best use. There are privacy concerns, as well as uncertainty over how the data may be used. Sounds familiar, right?

For the panel session called Getting Started With Library Value, librarians from five institutions described their strategies for demonstrating library value. Speakers noted the need to focus, prioritize, organize, and simplify; to align efforts carefully with various institutional cycles such as those related to academics, budgets, and evaluation; and to be mindful of the strategic vs. operational tensions that exist within libraries. Granted, none of this was explicitly about e-science per se, but the ideas certainly resonated, and…sounded familiar!

Poster sessions were one highlight of this conference for me. I particularly remember these, on topics of potential interest to our blog audience:

Sprouting STEMs: Brooklyn College Library staff described their Science Information Internship program to expose STEM and health sciences majors to science librarianship.

Nurturing a Data Management Community: This poster detailed the efforts of UIUC’s Research Data Services Interest Group, particularly their meeting series on a range of data-related topics, involving both library staff and external speakers.

If I have a complaint about ACRL it’s that there was so much to see in relatively little time that it felt like I missed out on a lot of sessions I very much wanted to attend. I guess that’s a good problem to have. And of course, it’s always nice to see friends and former colleagues from near and far, including several of our very own e-science portal editors and staff. (Yes, sometimes it takes a conference many miles away to get us in the same room!)

Hope to see you in Baltimore for ACRL 2017!

An Inside View of the OSTP Memo Responses on Research Data Management

e-Science Portal Blog - Fri, 04/03/2015 - 14:11

Submitted by guest contributor Jonathan Petters, data management consultant in Johns Hopkins Data Management Services. Prior to Hopkins, Jon was an AAAS Science and Technology Policy Fellow in the Department of Energy’s Office of Science. Before that he did atmospheric science research.

By now we’ve seen most of the Public Access Plans issued in response to the February 2013 OSTP Memo, “Increasing Access to the Results of Federally Funded Scientific Research.” I was a little disappointed to see that, though the memo was written more than two years ago, US research funding agencies will (in general) be meeting only the minimal requirements specified with respect to digital research data*. I had hoped the funding agencies would move more quickly to assist and encourage researchers in changing their data management and sharing behaviors.

Though I’m disappointed, I shouldn’t be surprised; I was a science policy fellow in one of these funding agencies for two years. In addition to learning how funding agencies go about their business, I learned quite a bit about the development of, and response to, this particular memo. Considering what I learned as a fellow, it’s not all that surprising we are where we are. I outline some of the possible reasons below, from my own myopic view. Let’s call it ‘informed speculation’.

First thing to note: it’s NOT because the federal government is full of lazy folks waiting for retirement. I have the utmost respect for the government employees I had the pleasure of working with. The federal government is full of bright and hard-working individuals, and my experiences learning from them are what led me to my current position.

Possible reasons….

(In)ability to track research data

Robust infrastructure to track extramural research data doesn’t always exist. The value of improved research data management and sharing has been widely recognized only recently**. Funding agencies have been comprehensively tracking proposals and technical reports on extramural research for many years, but not necessarily the data produced by that research. In some disciplines a funding agency might have detailed knowledge of the research data produced; in others, little to none.

In some cases, the inclusion of data management plans in grant and contract proposals will be the first time an agency receives specific, comprehensive information about the extramural research data produced through its funding.*** We can’t expect a funding agency to plan to divert resources to new data infrastructure or management when it doesn’t even know what data exists, where, and in what state.

Agencies know something about data produced through intramural research, since they administer it directly, and this could explain why more specific digital data guidance is given with respect to the OSTP M-13-13 memo.

Program Officers = Reformed Researchers

The funding agencies creating these Public Access Plans are staffed largely by former (or current) researchers. Just like academic researchers, these staff members (and administrators) have their own impressions of the place of research data and data sharing in the research enterprise. Sure, there are several research disciplines where data management and sharing are valued highly (e.g., astronomy, environmental science, genomics). However, we shouldn’t expect funding agency staff from disciplines that haven’t valued these practices to suddenly embrace a whole new view of digital research data, any more than we can expect the researchers they fund to.

Advisory Committees

Advisory committee input could provide some important background for Public Access Plan development. Funding agencies receive trusted guidance on research directions from these advisory committees, which are made up of leading researchers in their fields. If these advisory committees don’t push for more data infrastructure or data sharing in a particular research discipline because they think resources are better allocated elsewhere, it’s less likely the respective program officers will fight for research data management either.

How the OSTP Memo is/was communicated

After a memo is released, each affected funding agency communicates the memo and its effects through its hierarchy. In general, this communication starts at the top of the organization and trickles down to the program officers several layers below. As you can imagine, how a memo is communicated throughout the hierarchy (e.g., its importance, its impacts, its directives) helps create the environment for the agency’s response. If the purpose of and reasons for focusing on research data management are lost along the way, we could end up with agency staff members, or even supervisors, who don’t understand “why we’re doing this”****.

Note that this scenario is NOT unique to the government; it describes the kind of thing that might happen in any large organization, and maybe happens in yours. Effective communication in a large organization takes time and effort.

Lack of Budget

NO NEW MONEY. Funding agencies love unfunded mandates as much as the rest of us do. How does an agency prioritize meeting these new OSTP requirements with no additional budget, in the midst of many competing priorities germane to its mission?

Put all of these reasons together and we can gain some understanding of why funders here might not have gone as far as we research data management folks would like. I’m confident we’ll all get there, though. The currency of the research enterprise has been manuscripts for many generations. Other research products, like data and code, will find their place of importance. Eventually.

Hey, how do impactful executive memos like the February 2013 OSTP Memo get developed, anyway?

In this case, a small working group with representatives from the funding agencies and OSTP discussed and hammered out a draft over several months. In many cases these representatives were digital data and publication experts, who may have communicated these discussions to others in their agencies during the drafting process. The draft was circulated to all the funding agencies (and their administrators) for comment, and each agency circulated the memo through its organization as deemed appropriate. This circulation is generally wide, ensuring that all relevant agency parties get a chance to weigh in. The agencies’ comments are then acted upon by OSTP, and the memo is finalized and released.


*I’m not talking about publications in this blog post.

**It’s probably because of that Internet thing (which might just be a fad anyway).

***NSF is a possible exception since they’ve been gathering data management plans for four years now.  I thought they’d be in the best position to move forward with refined guidelines.

****There’s a great Dilbert cartoon that exemplifies this hierarchical telephone game but I couldn’t find it.

The Diversity of Data Management: Practical Approaches for Health Sciences Librarianship Webcast

e-Science Portal Blog - Thu, 03/19/2015 - 15:04

The Lamar Soutter Library at the University of Massachusetts Medical School in Worcester, MA is hosting a viewing of the MLA webcast, The Diversity of Data Management: Practical Approaches for Health Sciences Librarianship, on Wednesday, April 22, from 2–3:30 pm.

As noted by the Medical Library Association, this webcast is designed to provide health sciences librarians with an introduction to data management, including how data are used within the research landscape and the current climate around data management in biomedical research. Three librarians working with data management at their institutions will present case studies and examples of products and services they have implemented, and provide strategies for and success stories about what has worked to get data management services up and running at their libraries.

Attending the webcast is free of charge, but space is limited, so advance registration is required. If you would like to register to attend the webcast in Worcester, click here.

HHS responds to the 2013 OSTP Memo: NIH and Data Management Plans

e-Science Portal Blog - Tue, 03/10/2015 - 17:32

Submitted by guest contributor Daina Bouquin, Data & Metadata Services Librarian, Weill Cornell Medical College of Cornell University, dab2058@med.cornell.edu

In response to the Office of Science and Technology Policy (OSTP) 2013 memo regarding public access to federally funded research, five Health and Human Services agencies released their long-awaited implementation plans at the end of February 2015. The OSTP memo, released two years ago in February 2013, instructed federal agencies with research and development budgets of (or exceeding) $100 million to develop strategies to make the results of federally funded research freely available to the public within a year of publication; this directive includes research data as a research result to be shared with the public. The recent updates came from HHS’s National Institutes of Health (NIH), Centers for Disease Control and Prevention (CDC), Food and Drug Administration (FDA), Agency for Healthcare Research and Quality (AHRQ), and Office of the Assistant Secretary for Preparedness and Response (ASPR), whose plans all address scientific publications and research data, with corresponding discovery and access points in PubMed Central and eventually healthdata.gov. However, for this post I will focus primarily on NIH’s “evolving” data policies and point out the RFI that librarians can contribute to in order to help shape that evolving process.

For the last few years, I have regularly heard statements from librarians and others that seem to equate the NSF General Data Management Plan Policy with the NIH Data Sharing Policy and Public Access Policy. However, these funder requirements are drastically different in both implementation and result, and the above-described announcements make this all the more clear. Namely, the NSF has robust requirements for all researchers to submit Data Management Plans as part of their grant applications, whereas the NIH does not. Rather, the NIH requires public access to funded manuscripts, as well as a statement addressing how, or whether, data will be shared in a section of the agency’s grant applications; this second requirement applies only to researchers requesting $500,000 or more in direct costs from the NIH in any single year. The NIH does not require a formal DMP, nor is there any process in place by which the NIH ensures that data is actually being shared by the researchers it funds, though data sharing is actively encouraged. I have personally found that this situation makes it a challenge to illustrate to NIH-funded researchers the importance of writing a DMP: when the funder is not asking for more robust planning, it can be difficult (though not impossible) to convince researchers to put in the necessary effort to plan thoroughly.

The NIH's recently released response to the OSTP memo makes very few updates regarding Data Management Plans, as the HHS agencies see data policies as "evolving" and recognize that much of the agencies' funded research data resides outside the agencies themselves. As of right now, HHS has no shared repository for deposit of its agencies' research data, nor a catalog of associated metadata. The plan notes that an internal HHS Enterprise Data Inventory will serve as the catalog for all HHS data products and will eventually be linked to HealthData.gov. The NIH announcement did, however, specifically note the following in its "Further Steps Under Consideration" section on Data Management Plans:

“NIH is supporting an Institute of Medicine study of clinical trial data sharing… In an interim report on this topic, the IOM noted that a cultural change has occurred in discussions about clinical data sharing. Rather than exploring whether it should occur, the focus is on how it should be accomplished”

“NIH will explore the development of policies to require NIH-funded researchers to make the data underlying the conclusions of peer-reviewed scientific research publications freely available in public repositories at the time of publication in machine readable formats… NIH is taking steps to ensure all NIH-funded researchers develop data management plans…. As a first step, the 2003 NIH Data Sharing Policy will be modified to require that all NIH-funded researchers develop data management plans.”


Much of the recently released NIH response therefore gestures at what is being planned but gives little detail on execution: specifically, which DMP requirements will be enforced and when enforcement is anticipated to begin. The NIH stance seems to be summed up by the statement that "NIH will determine the additional steps needed to ensure that the merits of digital data management plans are considered during the peer review process for extramural research grants and contracts," yet much remains unclear about what will be expected.

Librarians working in biomedical research environments should continue to advocate that researchers write robust DMPs regardless of whether their funders require them, and should be aware of the following NIH requirements and resources:

  • The NIH Data Sharing Policy
  • The new sharing policy for genomic data
  • The separate data policies by NIH institute
  • The list of the NIH's preferred data sharing repositories
  • The NIH data sharing FAQ
  • The "data sharing workbook"

Librarians can also refer researchers to biology DMP examples like those gathered by the New England Collaborative Data Management Curriculum.

Furthermore, I encourage librarians to consider responding to the following Request for Information to help shape NIH data resources developed through the National Library of Medicine:


The National Library of Medicine needs input on the Library’s future in a Big Data world!

This is your chance to influence how some of the NIH’s most prominent data and information resources will be developed and envisioned in the future! 

Respond to the RFI at: www.nlm.gov/RFI

Deadline: March 13, 2015

Topic: NLM seeks input regarding the strategic vision for the NLM to ensure that it remains an international leader in biomedical data and health information. In particular, comments are being sought regarding the current value of and future need for NLM programs, resources, research and training efforts and services (e.g., databases, software, collections). Your comments can include but are not limited to the following topics:

1 – Current NLM elements that are of the most, or least, value to the research community (including biomedical, clinical, behavioral, health services, public health and historical researchers) and future capabilities that will be needed to support evolving scientific and technological activities and needs.

2 – Current NLM elements that are of the most, or least, value to health professionals (e.g., those working in health care, emergency response, toxicology, environmental health and public health) and future capabilities that will be needed to enable health professionals to integrate data and knowledge from biomedical research into effective practice.

3 – Current NLM elements that are of most, or least, value to patients and the public (including students, teachers and the media) and future capabilities that will be needed to ensure a trusted source for rapid dissemination of health knowledge into the public domain.

4 – Current NLM elements that are of most, or least, value to other libraries, publishers, organizations, companies and individuals who use NLM data, software tools and systems in developing and providing value-added or complementary services and products and future capabilities that would facilitate the development of products and services that make use of NLM resources.

5 – How NLM could be better positioned to help address the broader and growing challenges associated with: Biomedical informatics, “big data” and data science; Electronic health records; Digital publications; or Other emerging challenges/elements warranting special consideration.

IDCC 15 – Part 2 (It’s a big conference)

e-Science Portal Blog - Mon, 03/02/2015 - 11:38

Last week in her blog post, Margaret discussed the Twitter feed from the International Digital Curation Conference (IDCC), which took place February 9th to 12th. I was fortunate enough to attend and participate this year, and as it is a premier event for data professionals, I'd like to add a bit more about the conference.

The theme this year was "Ten years back, ten years forward: achievements, lessons and the future for digital curation". Tony Hey, formerly of Microsoft Research and now a Fellow at the University of Washington, gave the opening keynote. He did a very nice job of illustrating how far we have come in the past ten years: data management and curation are now recognized as important issues and discussed in high-profile venues like Science and Nature. However, he also noted that we still have some very serious problems to address. Funding for curation is often local, but use of digital data is global. More and more data repositories and tools are coming online, but support for these initiatives is still quite fragile, and we have already lost some important resources (RIP Arts & Humanities Data Service).

This tension between how far we have come vs. how far we have yet to go was echoed in a panel session titled “Why is it taking so long?” moderated by Carly Strasser from DataCite. Some of the panelists pointed to a lack of incentives, infrastructure and support as barriers to progress. However, others noted that actually quite a lot of progress had been made when one considers the scope of the changes in culture and practice that we are championing.

Presentations on Data Education struck a similar tone. Liz Lyon, from the School of Information Studies at the University of Pittsburgh, noted that roles for Data Professionals are becoming more prominent and defined, but the educational path to prepare oneself to perform these roles is still unclear. iSchools at Pitt and the University of North Carolina, whose program was described by Helen Tibbo, are seeking to position themselves as the places to fill this need.

Though awareness of curation has increased, we still have a ways to go in training academics to curate. Research by Daisy Abbott from the University of Glasgow demonstrated a gap between graduate students' perception that curating their work is important and their reported lack of expertise to curate it effectively. Fortunately, we have Aleksandra Pawlik and others from the Software Sustainability Institute offering Data Carpentry workshops to help raise researchers' data literacy.

The program with presentation slides is available on the IDCC15 website, and the papers will soon be published in the International Journal of Digital Curation. The location of IDCC16 has yet to be announced, but I highly recommend attending if you get the chance.

IDCC15 – I Couldn’t Go But I Followed on Twitter

e-Science Portal Blog - Fri, 02/20/2015 - 10:58

I enjoy going to conferences. I love learning new things and getting new ideas. I really love the way I'm inspired by the people I meet. But I can't go to every conference: like most people's, my university library's budget is limited, and so is my own. However, as more people in libraries and data services take to Twitter and other social media, I can attend conferences vicariously.

From February 9-12, the 10th International Digital Curation Conference took place in London, England. While I wish I had been there, it is possible I would have been so tempted by the sights of London that I might have skipped the meeting.

There is a Storify of the conference available if you want to have a look at all the events and photos and comments. Watching the #idcc15 feed each day made me envious but also excited, as I read about the successes and new ideas that were being discussed during the various programs. Great morning coffee and lunchtime reading. A few highlights you might want to check out:

While there are differences between US and UK regulations, we can learn from programs that work at any institution. Presentations by Imperial College London, Oxford Brookes University, and University of Edinburgh are summarized here, with links to some good resources.

It is also helpful to learn from the researcher's viewpoint. Purdue's Data Curation Profiles were the focus of one talk, which dealt with the Technology Acceptance Model; a second talk examined whether research supervisors are prepared to provide advice and guidance. Slides and papers for both talks are linked from this summary.

The Edinburgh group, mentioned above, has a great blog, and a couple of posts there discuss IDCC15: one covering the first day, and another looking at how the 80/20 rule applies to RDM tools (if you haven't heard of the 80/20 rule, also known as the Pareto principle, check out the Wikipedia article).

A useful Storify covers RDM training for librarians. There are slides embedded in the page, so have a look at the various curricula that were presented.

While we focus on eScience here at the portal, there are also data things going on in other subjects. If you've always wanted to learn a bit about digital humanities, try this video, "The stuff we forget: Digital Humanities, digital data, and the academic cycle," by Melissa Terras, Director of the University College London Centre for Digital Humanities.

This final blog post recommendation gives you an idea of some of the other subjects covered at the meeting, with links to the talks.

By the way, I use TweetDeck to keep track of multiple things on Twitter; there are some basic instructions here. I keep the regular stream of people and organizations I follow in the first column, and after that I have columns for hashtags I'm interested in, such as #idcc15 for meeting tweets or #medlibs for medical librarians. When a conference is over and I have favorited the tweets I want to follow up on later, I can delete its column. Favorites is another column in my TweetDeck.

7 Recommended Resources for E-Science Newbies

e-Science Portal Blog - Tue, 02/17/2015 - 19:52

Submitted by Donna Kafel, Project Coordinator for the e-Science Portal and the New England e-Science Program for Librarians.

During a recent meeting of the e-Science Portal’s Editorial Board, portal editors suggested that we create a downloadable document, perhaps titled “An Introduction to e-Science” that would provide an annotated list of the best overviews and introductory resources for librarians and library students new to the concept of e-Science and library based data services.  The e-Science Portal team  thought this was a great idea and we have it on our action item list for after the portal redesign is completed this spring.

In the meantime, there are a lot of e-Science newbies out there right now who are at a loss as to where to begin and who might like some of this information a little sooner. Looking at all the content packed into library guides on data management, hundreds of journal articles, and data webinars can be overwhelming for those just starting out. Here are seven resources that can help newbies start figuring out what is meant by the term e-Science and how it impacts scholarly communication, library roles in e-Science, the structure of the scientific research environment, data types, and data management.

1.  The Fourth Paradigm: Don't be intimidated; I'm not recommending that people read the entire book in one sitting! (But it's worth going back to read individual chapters.) The Fourth Paradigm's foreword and first chapter, "Jim Gray on eScience: A Transformed Scientific Method," nicely illustrate how the integration of computers and evolving technologies has revolutionized the way science is conducted.

2.  The e-Science Thesaurus is a great place for newbies to learn terms, concepts, and related references. Some entries include interviews with librarians who are actively engaged in e-Science (for some interesting interviews, check out Data Curation Profiles Toolkit, Implementing a Data Sharing/Management Policy, and Informationist).

3.  What is e-Science and How Should it be Managed? :  captures the essence of e-science, critical roles for librarians, and the importance of open data sharing.

4.  A nice overview of e-Science and roles for librarians:

a)      Cyberinfrastructure, Data, and Libraries, Part 1. A Cyberinfrastructure Primer for Librarians (2007) – Part one of a primer for librarians on the major issues and terminologies of e-Science.

b)      Cyberinfrastructure, Data, and Libraries, Part 2 – Part two: the role of libraries in data management and how librarians can participate in the downstream and upstream phases of the research cycle.

5.  Data Types (4 min YouTube video)—describes the diverse entities that come under the umbrella term data and the different ways data is captured.

6.  A Day in the Life of an Academic Researcher Part 1 (7 minute YouTube video) and A Day in the Life of an Academic Researcher Part 2 (5 minute YouTube video) explains the research environment and the different roles played by members of a research team.

7.  The Journal of eScience Librarianship (JeSLIB) :  specifically dedicated to the advancement of e-Science librarianship, JeSLIB  includes peer-reviewed research  and “e-Science in Action” articles on topics such as research data management, librarians embedded on research teams, data services, data curation, and data sharing and re-use.

Glitter on the Highway: Data on the Website

e-Science Portal Blog - Wed, 02/11/2015 - 12:09

By Andrew Creamer, Scientific Data Management Specialist, Brown University

Glitter on the mattress
Glitter on the highway
Glitter on the front porch
Glitter on the hallway 
Love Shack, The B-52s, Pierson, Schneider, Strickland, EMI (1989).

Recently I was reading through drafts of the Data Management Plans (DMPs) and Broader Impacts sections submitted with faculty NSF proposals through our data management plan service for 2014-2015. As I reviewed these plans, one commonality I noticed was the ubiquity of statements that data would be linked from the project website or a personal website. Listed as a tool for dissemination, for post-project archiving and access, or in some cases both, there was data on the website. In a few cases the website was conspicuously the only option listed for dissemination or post-project archiving; most often it was nested among other options. For example, for dissemination, investigators would say they would share the data on their personal or project websites, deposit it in some type of data sharing repository, publish the results in academic journals, and present them at scientific meetings. Looking over these drafts, I could see that at each occurrence I had left a comment asking the investigators what they meant by putting data on a personal or project website, and to please have a conversation with me about this option.

The opaque "data on the website" issue comes up in almost every conversation I have had with faculty using our DMP service: "So, you say here that you have a website. How exactly are you storing and making your data available on your website? Who is responsible for doing and maintaining this?" The conversation can go many ways, of course. Some faculty mean that they will deposit the data in a repository and place a persistent link to it on their personal or project website; others mean that they have a personal server, or in some alarming cases a web server, where they will place and link to the data. While the former intention also leads to important questions about suitability and sustainability (which repository? what kind of persistent link?), it is the latter scenario, of course, that concerns us research data management librarians the most.

In their article published in PLOS ONE last summer, "How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers," Pepe et al. (2014) provide evidence we can use in conversations with investigators about alternatives to storing data on web or personal servers. Their findings showed that putting data on a personal or project-based website was the third- and fourth-most-popular data sharing practice among astronomers, after emailing data or placing it on an FTP-style site. The authors then looked through the external links to data published in a defined period of the astronomy literature and found:

“This exploratory analysis reveals three key findings. First, since the inception of the web in the early 1990′s, astronomers have increasingly used links in articles to cite datasets and other resources which do not fit in the traditional referencing schemes for bibliographic materials. Second, as for nearly every resource on the web, availability of linked material decays with time: old links to astronomical materials are more likely to be broken than more recent ones. Third, links to “personal datasets”, i.e., links to potential data hosted on astronomers’ personal websites, become unreachable much faster than links to curated “institutional datasets”. (Pepe et. al 2014)

The practice of placing data on a website may be entrenched in the data sharing practices of certain scientific communities, but as research data management librarians we need to be sure we do not become numb to its ubiquity. Instead, we must continue to ask researchers what they mean, and offer ways we can still help make data accessible from their websites while mitigating the myriad issues of storing data on web or personal servers: lack of backup, no persistent identifiers, no long-term preservation strategy, insufficient metadata, link rot, diminished discoverability, and the access risks when a single person is solely responsible for making the data accessible.

On the publisher side, last spring PLOS added this text to its Data Availability policy: "Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access." This one sentence has also helped me dissuade researchers from stating that anyone who needs their data can simply contact them, or from offering their personal website as the sole means of data dissemination. So let us hope that research funders will also begin pushing back on researchers who want to use personal or project websites, and personal or web servers, as the sole means of data dissemination or post-project storage.

Citation: Pepe A, Goodman A, Muench A, Crosas M, Erdmann C (2014) How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers. PLoS ONE 9(8): e104798. doi:10.1371/journal.pone.0104798

A gentle introduction to Docker for reproducible research

e-Science Portal Blog - Thu, 01/29/2015 - 16:27

Submitted by guest contributor Stacy Konkiel, Director of Marketing & Research, Impactstory, stacy.konkiel@gmail.com.

By now, many data management librarians are familiar with the concept of reproducible research. We know why it's important and how to (theoretically) make it happen: thorough documentation, putting data and code online, writing an excellent Methods section in a journal article, and so on.

But if a scientist asked you for a single recommended reading on how to make their computational research reproducible, what would you send them?

I’d suggest “Using docker for reproducible computational publications” by Melissa Gymrek (a Bioinformatics PhD student at Harvard/MIT).

In her post, Gymrek introduces Docker, a "lightweight virtual machine" that lets a researcher package a complete computing environment and share it (for example, through a public image registry) so that other researchers can run it and reproduce results using the original researcher's code and data.

No need to download and install R packages, or to figure out how to make someone else's code play well with their operating system. They just install Docker, enter a single command, and, boom, they have a container running on their computer in which they can reproduce someone else's findings.

Docker is already popular in the software development world, and is gaining popularity with bioinformaticians and other computational researchers. Learn more about Docker and how it can work for reproducible research on Melissa Gymrek's blog.
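To make that concrete, here is a minimal sketch of what such a setup could look like. This is an illustration, not Gymrek's actual configuration: the file names are invented, and rocker/r-base is assumed here only as one community-maintained R image available on Docker Hub.

```dockerfile
# Hypothetical Dockerfile describing a complete environment for an R analysis.
# Start from a community-maintained image with R preinstalled.
FROM rocker/r-base

# Install the exact packages the analysis needs, inside the image.
RUN Rscript -e 'install.packages("ggplot2", repos = "https://cloud.r-project.org")'

# Copy the original researcher's code and data into the image.
COPY analysis.R data.csv /home/project/
WORKDIR /home/project

# Running the container reruns the analysis.
CMD ["Rscript", "analysis.R"]
```

A reader with Docker installed could then rebuild and rerun the whole analysis with `docker build -t myanalysis .` followed by `docker run myanalysis`, without installing R or any packages on their own machine.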

Winter is the perfect time for a virtual conference or webinar!

e-Science Portal Blog - Wed, 01/28/2015 - 15:19

There's been a flurry of upcoming virtual conferences and webinars springing up, providing educational opportunities while obviating the need for travel in wintry weather. In a previous post, I noted the upcoming DataONE webinar series that begins on Feb. 10th with the webinar "Open Data and Science: Towards Optimizing the Research Process."

NISO is sponsoring a six-hour (11 am – 5 pm EST) virtual conference on Feb. 18th: "Scientific Data Management: Caring for Your Institution and Its Intellectual Wealth." Hosted by Todd Carpenter, Executive Director of NISO, the program includes speakers from the Dept. of Energy, Emory, Tufts, Oregon State University, UIUC, the Center for Open Science, and the RMap project. The final session will be a roundtable discussion. Program topics for the conference include:

  • Data management practice meets policy
  • Uses for the data management plan
  • Building data management capacity and functionality
  • Citing and curating datasets
  • Connecting datasets with other products of scholarship
  • Changing researchers’ practices
  • Teaching data management techniques

Finally (although I suspect I'll soon be adding to this snowballing list), Elsevier is sponsoring the webinar "Institutional & Research Repositories: Characteristics, Relationships and Roles" on Feb. 26th from 11 am to 12:15 pm (EST).


DataONE is launching a new webinar series

e-Science Portal Blog - Thu, 01/22/2015 - 09:51

DataONE is launching a new Webinar Series (www.dataone.org/webinars) focused on open science, the role of the data lifecycle, and achieving innovative science through shared data and ground-breaking tools.

The first in the series is a presentation and discussion led by Dr. Jean-Claude Guédon of the Université de Montréal, titled:

"Open Data and Science: Towards Optimizing the Research Process."

Tuesday, February 10th, 9 am Pacific / 10 am Mountain / 11 am Central / 12 noon Eastern

The abstract for the talk and registration details can be found at: www.dataone.org/upcoming-webinar.

Webinars will be held on the second Tuesday of each month at 12 noon Eastern Time. They will be recorded and made available for viewing later the same day. A Q&A forum will also be available to attendees and later viewers alike.

More information on the DataONE Webinar Series can be found at: www.dataone.org/webinars.


R: Addressing the Intimidation Factor

e-Science Portal Blog - Wed, 01/21/2015 - 16:22

Submitted by guest contributor Daina Bouquin, Data & Metadata Services Librarian, Weill Cornell Medical College of Cornell University, dab2058@med.cornell.edu

Working with students and researchers to help them better manage and work with their research data is a big part of the librarian's role in a data-intensive setting. Much of the time, though, the librarian also needs to think critically about, and advise on, tools used in other parts of the data life cycle, including the pre-processing and analysis phases of a research project. Increasingly, I find myself dealing with this sort of situation in my library. For example, a student comes to the library asking about access to tools for working with her data; this might mean she needs help restructuring some spreadsheets or another data manipulation task, but more often than not she is also seeking statistical software and tools for data visualization. In my experience, this situation has been more common than requests for help with data management plans or research documentation. This type of reference interaction is also where many librarians and information professionals begin to have discussions about, and encounters with, R programming.

R is a free statistical programming language with a notorious learning curve, but students and researchers increasingly see the value in tackling that curve. I was fortunate enough to take some advanced statistics courses during my education and learned R in a trial-by-fire setting. I also co-instructed an introductory course on computational health informatics this past summer, in which we taught introductory R functionality. So when patrons come to the library looking for help getting started with R, I feel confident helping them. However, when R comes up in discussions with my colleagues, they do not always feel confident assessing whether it is worthwhile to advise a student to learn R or to just run some statistical tests in Excel. My colleagues are also often intimidated by R because they are not sure they can troubleshoot it or find resources for students just getting started with the program. Having witnessed this situation on many occasions, I present here my attempt at lowering the intimidation factor surrounding R for librarians. You do not need to become an R programmer to approach it critically and help others get started.

I should start by saying that I almost always encourage students to pursue learning R rather than pushing them toward Excel or a statistics program they would need to purchase, since our library does not offer regular access to statistics software on our computers. R is also much more robust for working with data than Excel. However, I realize that some students and researchers just want to get their work done and want nothing to do with learning a new programming language. Even then, I generally still point R out briefly, in case the student ever decides it might be useful to learn. If a student is unsure whether R is what she is looking for, I ask the following questions:

  • Have you ever worked with data using code? (e.g. Stata, SAS)
  • Would you be willing to spend some time learning how to use a new tool?
  • Are the statistical tests you need to run somewhat complex?
  • Will you need to repeat the steps for how you cleaned up your data?
  • Will you need to repeat the steps for how you analyzed your data?
  • Will visualizing your data be very important to you on this project?
  • Is your data in more than one format?

If the student answers "yes" to a few of these questions, I strongly encourage her to use R rather than a tool like Excel. Check out Chris Leonard's discussion on the R Blog for more information on the Excel vs. R question. And with the resources and jargon below under your belt, you will feel more comfortable approaching R programming if you and your patron decide that R is a good choice.
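As a small illustration of why "yes" answers to the checklist point toward R, here is a tiny base-R sketch (the data frame is invented for the example) in which a clean-up step and an analysis step are both recorded as code, the kind of sequence that is tedious to repeat by hand in Excel:

```r
# Invented example data: one measurement is missing
dat <- data.frame(weight = c(2.1, NA, 3.5, 2.8),
                  group  = c("a", "b", "a", "b"))

# Step 1: drop incomplete rows (the "cleaning" step, recorded as code)
dat <- dat[!is.na(dat$weight), ]

# Step 2: compute the mean weight per group (the "analysis" step)
means <- tapply(dat$weight, dat$group, mean)
print(means)
```

Because every step is written down, rerunning the script on a corrected or expanded data file repeats the cleaning and the analysis identically.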

One of the first resources I usually point new users to is Quick R. Quick R provides new learners and experienced users alike with "a roadmap and the code necessary to get started quickly, and orient yourself for future learning" with R. I encourage librarians and patrons to look through its "Data Types" section if you are unfamiliar with the concept of data types; understanding how R users talk about data will make unfamiliar terms less intimidating right off the bat.

There is some other basic jargon you should be aware of when talking about R with patrons. For instance, if you are using R, you will likely need to use R packages. "Packages are collections of R functions, data, and compiled code in a well-defined format" (Quick R). The place where packages are stored is called the library. R comes with a standard set of packages when you install it, but others are available for download and installation. You can install a package by running the following command with the name of the package you need:

> install.packages("name_of_package")

Once installed, a package must be loaded into the session before it can be used. This can be done with the command:

> library(name_of_package)

There are also buttons in interfaces like RStudio that can help you install and load packages without writing commands.

"Function" is another term you'll likely hear if you get questions about R. Functions let you write commands once and store them under a single, easy-to-read name. For example, this is how one could write a function, here called f1, that subtracts one number from another:

> f1 <- function(x,y) {x-y}

There are entire packages of functions written by others to help users accomplish complicated tasks. For example, if a researcher decides she needs to run some regression diagnostics, there are pre-written functions for this in the package called "car". When the researcher installs the car package and loads it from her library, she will be able to access functions to run her diagnostics. You can view this and many other statistical analysis examples on Quick R.
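Putting those pieces together, a sketch of the whole install/load/use cycle might look like the following. This is an illustration under a few assumptions: outlierTest is one of the regression diagnostics the car package provides, mtcars is a dataset that ships with R, and the one-time install step needs an internet connection.

```r
# One-time download of the contributed package from CRAN
install.packages("car")

# Load it from the library into the current session
library(car)

# Fit a regression on a built-in dataset...
fit <- lm(mpg ~ wt + hp, data = mtcars)

# ...then run one of car's pre-written diagnostic functions
outlierTest(fit)
```

After the first install, only the library() call is needed in future sessions.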

I also tend to point researchers and my colleagues toward general reference material if they are looking for more granular help getting started with R programming. The following have been very useful in the past:

Additionally, the following tutorial resources are usually very well received:

And one should never neglect the help available through the R Community:

The resources noted above, and many others, are listed in a resource guide I developed on R and data mining, which can be found here.

In summary, you do not need to read all of these resources to help others work with R. By going through some of the material above and familiarizing yourself with the terminology and resources associated with R, you will be well equipped to help with common R problems. R is challenging, but like all new things, exposure is the only way to get used to it. Start small, with terminology and basic documentation; in this way you will gain the confidence and knowledge necessary to begin handling reference transactions that involve R programming.


A Model of Collaborative Education Efforts in Data Management: the Virginia Data Management Boot Camp

e-Science Portal Blog - Thu, 01/15/2015 - 12:42

Submitted by guest contributor Yasmeen Shorish, Physical & Life Sciences Librarian at James Madison University.


Question: How do you deliver the same data management training to graduate students, faculty, and staff simultaneously? How do you deliver that content not just at your own institution, but also to six other institutions across the state?

Answer: Very carefully, with a lot of cooperation, collaboration, and some technical wizardry thrown in as well. This is the story of seven Virginia institutions that stopped repeating content individually and started getting real: real collaborative.

In January 2013, the libraries at the University of Virginia (UVA) and Virginia Tech (VT) teamed up to produce a "Data Management Bootcamp" for graduate students on their campuses. Utilizing telepresence technology, speakers could interact with participants at either school in large, virtual sessions rather than in discrete events at each venue. Librarian interest in this event led three more institutions to join in 2014: James Madison University (JMU), George Mason University (GMU), and Old Dominion University (ODU). UVA, VT, JMU, and GMU share an existing telepresence set-up called 4-VA, and it was not difficult, technology-wise, to add ODU as a full participant as well. Librarians from these five institutions, including myself, formed a planning group to produce the "2014 Virginia Data Management Bootcamp."

However, expanding a program from two locations to five does present some complications. Can everyone connect simultaneously? Do the screens get too cluttered when everyone is connected? How do we decide what content is most appropriate for five very different institutions? Planning for the 2014 Bootcamp began in the summer of 2013. A series of virtual meetings among the planning group resulted in an agenda that included understanding research data, operational data management, data documentation and metadata, file formats and transformations, storage and security, DMPTool and funding agencies, rights and licensing, protection and privacy, and preservation and sharing. It was a lot to cover in two full days, with a third half-day for local discussion. The full agenda can be found on this LibGuide.

The group debriefed after the 2014 event and discussed what 2015 should look like. We knew that the next event should be less dense, as that much content in two days was somewhat overwhelming. The College of William & Mary (WM) and Virginia Commonwealth University (VCU) both expressed a desire to participate. With some technological work involving bridges, WebEx, and patience, the Virginia Data Management Bootcamp was able to expand to include these universities. Happily, increasing the number of participating institutions did not increase the complexity very much. The change that may have had the most impact was the planning group's decision to add more in-person meetings to work through curriculum ideas. We found that as a group, we could accomplish more in a shorter amount of time when we were gathered around one table, discussing ideas.

Using pre- and post-assessment surveys helped us zero in on some areas for change. We wanted to build in more interactivity and limit the amount of lecture for each area. We also wanted to engage the audience in the research cycle more intentionally than we had been. We redesigned the three-day event into smaller chunks, with more local discussion and more hands-on activities. A full schedule can be found on this LibGuide.

Can other states or groups of libraries produce a cross-institutional data management outreach program? Yes!

What if they lack a fancy telepresence room? Still, yes! There are viable alternatives that may have a different look and feel, but can still accomplish the same goal.

Want to launch a cross-institutional program of your own?

The best way to get started is to first get a sense of who would want to participate. Propose the workshop and form a planning group. The number of participating venues will shape what technology you use to bring it all together. WebEx may be appropriate, or even a Google Hangout (although image quality could be a concern).

How much time can you set aside for the workshop? One day? Three days? That will determine what gets covered and how. The more hands-on engagement that you can work into the program, the more likely you are to keep interest across sites.

Determine a meeting schedule for the planning group and decide which meeting method (virtual vs. in person) will be more effective. Individually, each site will need to coordinate with its own campus partners to make it as big an event as they wish. Assessment of some kind is necessary to determine what could change if you do it all over again.

Collaborative education efforts such as these can help institutions leverage the expertise that is naturally distributed. Setting a foundational learning outcome for data management is an achievable goal and a good way to build a community of practice in your local region.





Just published: Journal of eScience Librarianship special issue on data literacy

e-Science Portal Blog - Mon, 01/12/2015 - 14:34

The latest issue of the Journal of eScience Librarianship (JESLIB) has just been published! It is available at http://escholarship.umassmed.edu/jeslib/vol3/iss1/

 Table of Contents

Volume 3, Issue 1 (2014)


What is Data Literacy?
Elaine R. Martin

Full-Length Papers

Planning Data Management Education Initiatives: Process, Feedback, and Future Directions
Christopher Eaker

A Spider, an Octopus, or an Animal Just Coming into Existence? Designing a Curriculum for Librarians to Support Research Data Management
Andrew M. Cox, Eddy Verbaan, and Barbara Sen

An Analysis of Data Management Plans in University of Illinois National Science Foundation Grant Proposals
William H. Mischo, Mary C. Schlembach, and Megan N. O’Donnell

Initiating Data Management Instruction to Graduate Students at the University of Houston Using the New England Collaborative Data Management Curriculum
Christie Peters and Porcia Vaughn

eScience in Action

Research Data MANTRA: A Labour of Love
Robin Rice

Building Data Services From the Ground Up: Strategies and Resources
Heather L. Coates

Building the New England Collaborative Data Management Curriculum
Donna Kafel, Andrew T. Creamer, and Elaine R. Martin

Lessons Learned From a Research Data Management Pilot Course at an Academic Library
Jennifer Muilenburg, Mahria Lebow, and Joanne Rich

Gaining Traction in Research Data Management Support: A Case Study
Donna L. O’Malley

The New England Collaborative Data Management Curriculum Pilot at the University of Manitoba: A Canadian Experience
Mayu Ishida

Are you interested in submitting to JESLIB? Please refer to author guidelines at http://escholarship.umassmed.edu/jeslib/styleguide.html

Call for Participation for Content Editors for the e-Science Portal

Blog: Current Projects - Thu, 12/12/2013 - 11:48

The Editorial Board of the e-Science Portal for New England Librarians is looking for librarians who are passionate about emerging trends in science librarianship and interested in working as part of an editorial team to become Content Editors for the e-Science Portal for New England Librarians. Launched in 2011, the e-Science Portal is a resource for librarians, library students, information professionals, and interested individuals to learn about and discuss:

  • Library roles in e-Science
  • Fundamentals of domain sciences
  • Emerging trends in supporting networked scientific research

Currently the Editorial Board is reorganizing its content and expanding coverage to better serve the information needs of librarians interested in e-Science, new trends in science librarianship and scholarly communication, and ways that libraries are addressing the issues of the networked data age. The e-Science portal is built on a Drupal platform.

Content editors are needed for the following e-Science portal content areas:

  • Data Information Literacy: resources, courses, information needs of researchers
  • Emerging Trends & Technologies: new roles, emerging technologies, repository tools
  • Scholarly Communication: publishing data (including peer review, journal policies), sharing, altmetrics, citing data, identifiers, Open Data, Open Science, Open Access
  • Professional Development and Continuing Education: competencies, courses, e-Science symposia, related professional associations and conferences, recommended websites and blogs

This call for participation is not restricted to New England librarians. Requirements for Content Editor positions include a time commitment of 3 hours per month for the following activities:

  • Identifying, annotating, and posting links to relevant resources on the content area page
  • Reviewing the content page to ensure functioning links and current information
  • Communicating via an e-mail discussion list with other members of the Editorial Board
  • Attending Editorial Board meetings: while in-person attendance is preferred, arrangements can be made for Content Editors outside the NE region to attend meetings remotely

Content Editors can refer to the e-Science Portal's Selection Criteria for guidelines on selecting resources. The e-Science Portal for New England Librarians is funded by the National Network of Libraries of Medicine New England Region. Stipends will be paid to appointed Content Editors.

For further details about the Content Editor positions, please contact me at Donna.Kafel@umassmed.edu
