Feed aggregator

e-Science Portal Users–we need you!

e-Science Portal Blog - 8 hours 27 min ago

The e-Science Portal design team has been conducting a series of online Optimal Workshop user studies of the portal over the past few months. In May the team issued a Call for Participation for Usability Testing of the e-Science Portal, and we were happy to receive over ninety volunteers! With these volunteers’ participation, we’ve conducted three separate tests and gleaned valuable information from their responses. We’ll use this information to “tweak” the design of the portal, but before we do so, we need further input from a new pool of participants.

Whether or not you’re familiar with e-Science and/or the e-Science Portal for New England Librarians,  we need you! On average the test takes 12-15 minutes to complete. You do not need to be a web design expert or have previous experience in user testing, and the instructions are easy.

To volunteer, please complete the following e-Science Portal Usability Testing form at  https://docs.google.com/forms/d/1Wb6kk4QYtfvi4bZuVMnRoZxF_KQmdYUTsQrnK1VDWDE/viewform by Monday, October 27th.

Thank you for participating,

Donna Kafel, Coordinator for the e-Science Portal


Nov. 25 workshop: Improving integrity in scientific research

e-Science Portal Blog - Fri, 10/17/2014 - 09:01

Posted on behalf of Chris Erdmann, Head Librarian, Harvard-Smithsonian Center for Astrophysics, Harvard.

Workshop:  Improving integrity in scientific research: How openness can facilitate reproducibility

Time: 3:00pm – 5:30pm
Date: Tuesday, November 25th
Location: Center for Astrophysics, Phillips Auditorium


  “Using Zenodo to share and safely store your research data”

Lars Holm Nielsen, CERN

Is your 10-year-old dataset stored safely? Is it openly accessible? In the workshop, you will learn how to preserve, share and receive credit for your research data using Zenodo (https://zenodo.org/), created by OpenAIRE and CERN, and supported by the European Commission. We will explore the different aspects and issues related to research data and software publishing, why preservation is important, how to link it up and make your research data discoverable. We will also see how research software hosted by GitHub can be automatically preserved with just a few clicks. In addition, we will look at how research communities can be created in Zenodo to support a variety of publication activities.

Requirements: None, but it’s highly preferable to bring your own laptop and an example research output (dataset, software, presentation, poster, publication, …) you would like to share to be able to follow the interactive part of the workshop.

Improving integrity in scientific research: How openness can facilitate reproducibility

Courtney Soderberg, COS

Have you heard about the reproducibility crisis in science (e.g., in AAAS and The Economist), and do you worry about false positive results? Ever wondered how you could increase the reproducibility of your own work and help the accumulation of scientific knowledge? Join us for a workshop on reproducible research, hosted by the Center for Open Science.

This presentation will briefly review the evidence and challenges for reproducibility and discuss how greater transparency and openness across the entire scientific workflow (from project inception, to data sets and analysis, to publication and beyond) can increase levels of reproducibility. It will also include a hands-on demonstration of the Open Science Framework (http://osf.io/), a free, open source web application developed to help researchers connect, document, and share all aspects of their scientific workflow to increase the reproducibility of their work.

Attendees are encouraged to bring laptops and research materials (stimuli, analysis scripts, data sets, etc.) they would like to share so they can follow along with the hands-on section of the presentation.

Please RSVP

Tracking the impacts of data – beyond citations

e-Science Portal Blog - Thu, 10/16/2014 - 13:29

How can you tell if data has been useful to other researchers?

Tracking how often data has been cited (and by whom) is one way, but data citations only tell part of the story, part of the time. (The part that gets published in academic journals, if and when those data are cited correctly.) What about the impact that data has elsewhere?

We’re now able to mine the Web for evidence of diverse impacts (bookmarks, shares, discussions, citations, and so on) for diverse scholarly outputs, including data sets. And that’s great news, because it means that we now can track who’s reusing our data, and how.

All of this is still fairly new, however, which means that you likely need a primer on data metrics beyond citations. So, here you go.

In this post, I’ll give an overview of the different types of data metrics (including citations and altmetrics), the “flavors” of data impact, and specific examples of data metric indicators.

What do data metrics look like?

There are two main types of data metrics: data citations and altmetrics for data. Each of these types of metrics is important for its own reasons, and each offers the ability to understand different dimensions of impact.

Data citations

Much like traditional, publication-based citations, data citations are an attempt to track data’s influence and reuse in scholarly literature.

The reason why we want to track scholarly data influence and reuse? Because “rewards” in academia are traditionally counted in the form of formal citations to works, printed in the reference list of a publication.

There are two ways to cite data: cite the data package directly (often by pointing to where the data is hosted in a repository), or cite a “data paper” that describes the dataset. A data paper functions primarily as detailed metadata and offers the added benefit of being in a format that’s much more appealing to many publishers.

In the rest of this post, I’m going to mostly focus on metrics other than citations, which are being written about extensively elsewhere. But first, here’s some basic information on data citations that can help you understand how data’s scholarly impacts can be tracked.

How data packages are cited

Much like how citations to publications differ depending on whether you’re using Chicago style or APA style formatting, citations to data tend to differ according to the community of practice and the recommended citation style of the repository that hosts the data. But there is a core set of minimums for what should be included in a citation. Jon Kratz has compiled these “core elements” (as well as “common elements”) over on the DataPub blog. The core elements include:

  • Creator(s): Essential, of course, to publicly credit the researchers who did the work. One complication here is that datasets can have large (into the hundreds) numbers of authors, in which case an organizational name might be used.

  • Date: The year of publication or, occasionally, when the dataset was finalized.

  • Title: As is the case with articles, the title of a dataset should help the reader decide whether your dataset is potentially of interest. The title might contain the name of the organization responsible, or information such as the date range covered.

  • Publisher: Many standards split the publisher into separate producer and distributor fields. Sometimes the physical location (City, State) of the organization is included.

  • Identifier: A Digital Object Identifier (DOI), Archival Resource Key (ARK), or other unique and unambiguous label for the dataset.

Arguably the most important principle? The use of a persistent identifier (PID) like a DOI, ARK, or Handle. PIDs are important for two reasons: even if the data’s URL changes, others will still be able to access it; and PIDs give citation aggregators like the Data Citation Index and Impactstory.org an easy, unambiguous way to parse out “mentions” in online forums and journals.
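As a rough illustration of how the core elements listed above fit together, here is a minimal Python sketch that assembles them into a single citation string. The class name, the field names, the output layout, and the sample dataset (including its DOI) are all hypothetical; a real citation should follow whatever style the hosting repository recommends.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """Holds the five core elements of a data citation (illustrative only)."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    identifier: str  # a persistent identifier, e.g. "doi:10.1234/..."

    def format(self) -> str:
        authors = "; ".join(self.creators)
        ident = self.identifier
        # Render a "doi:" identifier as a resolvable https://doi.org/ link.
        if ident.lower().startswith("doi:"):
            ident = "https://doi.org/" + ident[4:].strip()
        return f"{authors} ({self.year}). {self.title}. {self.publisher}. {ident}"


# Hypothetical dataset, made up for this example.
example = DataCitation(
    creators=["Doe, J.", "Roe, R."],
    year=2014,
    title="Example field survey data, 2010-2013",
    publisher="Example Repository",
    identifier="doi:10.1234/example.5678",
)
print(example.format())
```

Keeping the identifier as its own field, rather than burying it in free text, is what makes it easy for an aggregator to match mentions of the same dataset.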

It’s worth noting, however, that as few as 25% of journal articles tend to formally cite data. (Sad, considering that so many major publishers have signed on to FORCE11’s data citation principles, which include the need to cite data packages in the same manner as publications.) Instead, many scholars reference data packages in their Methods section, forgoing formal citations, making text mining necessary to retrieve mentions of those data.
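To give a sense of what that text mining can look like, here is a minimal Python sketch that scans a passage for DOI-like strings. The regular expression is a simplified assumption about what DOIs look like, not an official grammar, and the sample Methods text and its DOIs are made up.

```python
import re

# Simplified pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until whitespace, a quote, or an angle bracket. Real DOIs can be
# messier, so treat this as a heuristic.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_doi_mentions(text):
    """Return the unique DOI-like strings in text, in order of appearance."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing punctuation picked up from the sentence. This would
        # also clip a DOI that legitimately ends in ")" or ".".
        doi = match.group(0).rstrip(".,;:)")
        if doi not in found:
            found.append(doi)
    return found


# Hypothetical Methods-section text.
methods = (
    "Occurrence records were downloaded from the repository "
    "(doi:10.1234/example.5678). A second dataset, "
    "https://doi.org/10.1234/other.0001, was used for validation."
)
print(find_doi_mentions(methods))
```

A pattern like this only catches data that is referenced by its identifier; datasets mentioned by name alone would need a different approach.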

How to track citations to data packages

When you want to track citations to your data packages, the best option is the Data Citation Index (DCI). The DCI functions similarly to Web of Science. If your institution has a subscription, you can search the Index for citations in the literature that reference data from a number of well-known repositories, including ICPSR, ANDS, and PANGAEA.

Here’s how: log in to the DCI, then head to the home screen. In the Search box, type in your name or the dataset’s DOI. Find the dataset in the search results, then click on it to be taken to the item record page. On the item record, find and click the “Create Citation Alert” button on the right-hand side of the page, where you’ll also find a list of articles that reference that dataset. Now you have a list of the articles that reference your data to date, and you’ll also receive automated email alerts whenever someone new references your data.

Another option comes from CrossRef Search. This experimental search tool works for any dataset that has a DataCite DOI and is referenced in the scholarly literature that’s indexed by CrossRef. (DataCite issues DOIs for Figshare, Dryad, and a number of other repositories.) Right now, the search is a very rough one: you’ll need to view the entire list of DOIs, then use your browser search (often accessed by hitting CTRL + F or Command + F) to check the list for your specific DOI. It’s not perfect–in fact, sometimes it’s entirely broken–but it does provide a view into your data citations not entirely available elsewhere.

How data papers are cited

Data papers tend to be cited like any other paper: by recording the authors, title, journal of publication, and any other information that’s required by the citation style you’re using. Data papers are also often cited using permanent identifiers like DOIs, which are assigned by publishers.

How to find citations for data papers

To find citations to data papers, search databases like Scopus and Web of Science like you’d search for any traditional publication. Here’s how to track citations in Scopus and Web of Science.

There’s no guarantee that your data paper is included in their database, though, since data paper journals are still a niche publication type in some fields, and thus aren’t tracked by some major databases. You’ll be smart to follow up your database search with a Google Scholar search, too.

Altmetrics for data

Citations are good for tracking the impact of your data in the scholarly literature, but what about other types of impact, among other audiences like the public and practitioners?

Altmetrics are indicators of the reuse, discussion, sharing, and other interactions humans can have with a scholarly object. These interactions tend to leave traces on the scholarly web.

Altmetrics are so broadly defined that they include pretty much any type of indicator sourced from a web service. For the purposes of this post, we’ll separate out citations from our definition of altmetrics, but note that many altmetrics aggregators tend to include citation data.

There are two main types of altmetrics for data: repository-sourced metrics (which often measure not only researchers’ impacts, but also repositories’ and curators’ impacts), and social web metrics (which more often measure other scholars’ and the public’s use and other interactions with data).

First, let’s discuss the nuts and bolts of data altmetrics. Then, we’ll talk about services you can use to find altmetrics for data.

Altmetrics for how data is used on the social web

Data packages can be shared, discussed, bookmarked, viewed, and reused using many of the same services that researchers use for journal articles: blogs, Twitter, social bookmarking sites like Mendeley and CiteULike, and so on. There are also a number of services that are specific to data, and these tend to be repositories with altmetric “indicators” particular to that platform.

For an in-depth look into data metrics and altmetrics, I recommend that you read Costas et al’s report, “The Value of Research Data” (2013). Below, I’ve created a basic chart of various altmetrics for data and what they can likely tell us about the use of data.

Quick caveat: aside from the Costas et al report, there’s been little research done into altmetrics for data. (DataONE, PLOS, and California Digital Library are in fact the first organizations to do major work in this area, and they were recently awarded a grant to do proper research that will likely confirm or negate much of the below list. Keep an eye out for future news from them.) The metrics and their meanings listed below are, at best, estimations based on experience with both research data and altmetrics.

Repository- and publisher-based indicators

Note that some of the repositories below are primarily used for software, but can sometimes be used to host data, as well.


| Web Service | Indicator | What it might tell us | Reported on |
|---|---|---|---|
| GitHub | Stars | Akin to “favoriting” a tweet or underlining a favorite passage in a book, GitHub stars may indicate that someone who has viewed your dataset wants to remember it for later reference. | GitHub, Impactstory |
| GitHub | Watched repositories | A user is interested enough in your dataset (stored in a “repository” on GitHub) that they want to be informed of any updates. | GitHub, PlumX |
| GitHub | Forks | A user has adapted your code for their own uses, meaning they likely find it useful or interesting. | GitHub, Impactstory, PlumX |
| SourceForge | Ratings & Recommendations | What do others think of your data? And do they like it enough to recommend it to others? | SourceForge, PlumX |
| Dryad, Figshare, and most institutional and subject repositories | Views & Downloads | Is there interest in your work, such that others are searching for and viewing descriptions of it? And are they interested enough to download it for further examination and possible future use? | Dryad, Figshare, and IR platforms; Impactstory (for Dryad & Figshare); PlumX (for Dryad, Figshare, and some IRs) |
| Figshare | Shares | Implicit endorsement. Do others like your data enough to share it with others? | Figshare, Impactstory, PlumX |
| PLOS | Supplemental data views, figure views | Are readers of your article interested in the underlying data? | PLOS, Impactstory, PlumX |
| | | A user is interested enough in your dataset that they want to be informed of any updates. | |



Social web-based indicators


| Web Service | Indicator | What it might tell us | Reported on |
|---|---|---|---|
| Twitter | Tweets that include links to your product | Others are discussing your data–maybe for good reasons, maybe for bad ones. (You’ll have to read the tweets to find out.) | PlumX, Altmetric.com, Impactstory |
| Delicious, CiteULike, Mendeley | Bookmarks | Bookmarks may indicate that someone who has viewed your dataset wants to remember it for later reference. | Impactstory, PlumX; Altmetric.com (CiteULike & Mendeley only) |
| Wikipedia | Mentions (sometimes also called “citations”) | Do others think your data is relevant enough to include it in Wikipedia articles? | Impactstory, PlumX |
| ResearchBlogging, Science Seeker | Blog post mentions | Is your data being discussed in your community? | Altmetric.com, PlumX, Impactstory |


How to find altmetrics for data packages and papers

Aside from looking at each platform that offers altmetrics indicators, consider using an aggregator, which will compile them from across the web. Most altmetrics aggregators can track altmetrics for any dataset that’s either got a DOI or is included in a repository that’s connected to the aggregator. Each aggregator tracks slightly different metrics, as we discussed above. For a full list of metrics, visit each aggregator’s site.

Impactstory (full disclosure: my current employer) easily tracks altmetrics for data uploaded to Figshare, GitHub, Dryad, and PLOS journals. Connect your Impactstory account to Figshare and GitHub and it will auto-import your products stored there and find altmetrics for them. To find metrics for Dryad datasets and PLOS supplementary data, provide DOIs when adding products one-by-one to your profile, and the associated altmetrics will be imported. Here’s an example of what altmetrics for a dataset stored on Dryad look like on Impactstory.

PlumX tracks similar metrics, and offers the added benefit of tracking altmetrics for data stored on institutional repositories, as well. If your university subscribes to PlumX, contact the PlumX team about getting your data included in your researcher profile. Here’s what altmetrics for a dataset stored on Figshare look like on PlumX.

Altmetric.com can track metrics for any dataset that has a DOI or Handle. To track metrics for your dataset, you’ll either need an institutional subscription to Altmetric or the Altmetric bookmarklet, which you can use when on the item page for your dataset on a website like Figshare or in your institutional repository. Here’s what altmetrics for a dataset stored on Figshare look like on Altmetric.com.

Flavors of data impact

While scholarly impact is very important, it’s far from the only type of impact one’s research can have. Both data citations and altmetrics can be useful in illustrating these flavors. Take the following scenarios for example.

Useful for teaching

What if your field notebook data were used to teach undergraduates how to use and maintain their own field notebooks? Or if a longitudinal dataset you created were used to help graduate students learn the programming language R? These examples are fairly common in practice, and yet they’re often not counted when considering impacts. Potential impact metrics could include full-text mentions in syllabi, views & downloads in Open Educational Resource repositories, and GitHub forks.

Reuse for new discoveries

Researcher and open data advocate Heather Piwowar (full disclosure: the co-founder of Impactstory and my boss) once noted, “the potential benefits of data sharing are impressive:  less money spent on duplicate data collection, reduced fraud, diverse contributions, better tuned methods, training, and tools, and more efficient and effective research progress.” If those outcomes aren’t indicative of impact, I don’t know what is! Potential impact metrics could include data citations in the scholarly literature, GitHub forks, and blog post and Wikipedia mentions.

Curator-related metrics

Could a view-to-download ratio be an indicator of how well a dataset has been described and how usable a repository’s UI is? Or of the overall appropriateness of the dataset for inclusion in the repository? Weber et al (2013) recently proposed a number of indicators that could get at these and other curatorial impacts upon research data, indicators that are closely related to previously-proposed indicators by Ingwersen and Chavan (2011) at the GBIF repository. Potential impact metrics could include those proposed by Weber et al and Ingwersen & Chavan, as well as a repository-based view-to-download ratio.
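For readers who want to try the repository-based view-to-download ratio mentioned above, here is a minimal Python sketch. The function name, the choice to return no value when there are no downloads, and the sample counts are all assumptions made for illustration; none of them come from Weber et al or Ingwersen & Chavan.

```python
def view_to_download_ratio(views, downloads):
    """Return views divided by downloads, or None when there are no downloads.

    A high ratio means many people looked at the dataset's description but
    few went on to download it.
    """
    if downloads == 0:
        return None  # the ratio is undefined without at least one download
    return views / downloads


# Hypothetical usage counts: (views, downloads) per dataset.
usage = {
    "dataset-a": (400, 100),
    "dataset-b": (250, 0),
}

for name, (views, downloads) in usage.items():
    ratio = view_to_download_ratio(views, downloads)
    if ratio is None:
        print(f"{name}: no downloads yet")
    else:
        print(f"{name}: {ratio:.1f} views per download")
```

How to interpret the number is still an open question: a high ratio could point to a poor description, a hard-to-use interface, or simply a dataset that many people are curious about but few need.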

Ultimately, more research is needed into altmetrics for datasets before these flavors–and others–are fully understood.

Now that you know about data metrics, how will you use them?

Some options: include them in grant applications and in your tenure and promotion dossier, or use them to demonstrate the impacts of your repository to administrators and funders. I’d love to talk more about this on Twitter.

Recommended reading
  • CODATA-ICSTI Task Group. (2013). Out of Cite, Out of Mind: The current state of practice, policy, and technology for the citation of data [report]. doi:10.2481/dsj.OSOM13-043

  • Costas, R., Meijer, I., Zahedi, Z., & Wouters, P. (2013). The Value of research data: Metrics for datasets from a cultural and technical point of view [report]. Copenhagen, Denmark. Knowledge Exchange. www.knowledge-exchange.info/datametrics

Current Science & Science Data Librarian job postings

e-Science Portal Blog - Thu, 10/16/2014 - 12:12

Submitted by Donna Kafel, Project Coordinator for the e-Science Portal.

Here are some recent job postings for science, health sciences, and data librarians at various institutions across the US and Canada.

California State University, East Bay Library, Health Sciences and Scholarly Communications Librarian: https://csucareers.calstate.edu/Detail.aspx?pid=41475

Carilion Clinic, Roanoke, VA, Clinical Research Librarian:  https://www.healthcaresource.com/carilion/index.cfm?fuseaction=search.jobDetails&template=dsp_job_details.cfm&cJobId=734201&fromCarilion=true

Lewis & Clark College (Portland, OR):  Science & Data Services Librarian:  https://jobs.lclark.edu/postings/4720

New York University Health Sciences Libraries, Knowledge Management Librarian http://hsl.med.nyu.edu/content/knowledge-management-librarian

Life Sciences Librarian, New York University:  http://library.nyu.edu/about/jobs.html#sciences

Research Data Services Librarian, New York University:  http://library.nyu.edu/about/jobs.html#RDM

McGill University:  Data Reference Services Librarian:  http://joblist.ala.org/modules/jobseeker/Data-Reference-Services-Librarian/27493.cfm

University of Cincinnati, Digital Metadata Librarian:  http://www.libraries.uc.edu/about/employment.html

University of Connecticut, Sciences Librarian:  http://joblist.ala.org/modules/jobseeker/Sciences-Librarian/27501.cfm

University of Delaware, Science Liaison Librarian:  http://www2.lib.udel.edu/personnel/employment/102465ScienceLiaisonLibrarian.pdf

University of Kentucky:  Head of Science Library and e-Science Initiatives:  http://www.diglib.org/archives/6865/

University of Massachusetts Medical School,  Assoc. Director of Library Education and Research https://careers-umms.icims.com/jobs/23818/assoc-dir%2c-lib-education-%26-research/job?mobile=false&width=1837&height=500&bga=true&needsRedirect=false

Call for Papers IASSIST 2015

e-Science Portal Blog - Wed, 10/01/2014 - 15:49

IASSIST (International Association of Social Science Information Services and Technology) announces a Call for Papers for IASSIST 2015, which will be held June 2-5 in Minneapolis, MN.


Challenges of Self Learning

e-Science Portal Blog - Fri, 09/26/2014 - 16:20

Submitted by Donna Kafel,  e-Science Coordinator,  University of Massachusetts Medical School

Data Visualization, Research Methods in Information, How to Think Like a Computer Scientist, Interactive Web Design, Blindspot, the Harvard edX course “Introduction to Computer Science.” These are just a few examples of the many topics and items on my to-read and to-learn list. I want to learn about Python scripting and R, be better versed in research methodology, develop self-paced educational modules, be more aware of hidden biases, and develop proficiency in data science techniques. Knowing these things would be very useful for me professionally. And I’m sure I’d enjoy learning some of them if I could find the time.

The following picture depicts my typical daily dilemma.  During the course of my workday, I come across a book, or a new tool, or an online course, or something that I want to learn about. And I think to myself, when I go home tonight I’m going to delve into reading about this topic and learn something. Or I’m going to set aside an hour every night for a week and learn the basics of Python.  I’m going to learn the ins and outs of a new database.  These ideas seem very do-able in the light of the workday.  Yet after work, other demands and tasks take over, and I let these great aspirations fall by the wayside, night after night.

Reflecting on this vicious cycle, I got to wondering about how my colleagues approach self-learning.  Do they set specific goals for themselves?  Do they set aside work time to learn a new technology? Do they ever sleep?

I decided to interview two librarians whom I admire for their creativity, unique skills, and passion for learning:  Sally Gore and Chris Erdmann.  I work with Sally at the Lamar Soutter Library at UMass Medical School. Sally works as an Embedded Research Informationist and is involved in some very interesting projects with faculty researchers who are investigating things like patient compliance in mammogram screening and developing a system for citing neuroimages.  Sally is a thoughtful and articulate writer who regularly shares her insights about her experiences working as a librarian in the research environment and emerging trends in librarianship in her blog A Librarian by any Other Name.   Chris is the Head Librarian at the Harvard-Smithsonian Center for Astrophysics.  Much of his work there focuses on astrophysics data and developing library data services that support the needs of astrophysics researchers. Chris works directly with researchers doing data processing and analysis, assisting them with data citation and publishing and exploring new approaches for repository systems that support access to huge astrophysics data sets. What’s particularly striking about Chris is his passion for teaching other librarians data science techniques in his DST4L (Data Science Training For Librarians) class, which is now in its third iteration. In this class, Chris and his associates have taught librarians programming skills and technologies through hands-on activities and group projects.

I interviewed Sally and Chris individually but both of their responses are noted below each question.

How do you find time to “teach” yourself new things?

Sally:  I set aside one morning a week, usually Friday mornings, for professional reading and writing my blog posts. Making this a weekly practice is a good habit. I strongly believe that librarians need to make an active effort to stay informed, and to do that, we need to set aside some work time for reading and learning. In my spare time I also take the opportunity to attend seminars, and learning events, like Science Café Woo for example. I also try to meet new people at such events, by sitting with people I don’t know and talking with them about their interests and the work they do.

Chris:  When things are quieter at work, I seize moments to focus on learning a new skill. One of my fears is that I won’t be able to keep up with the rapid pace of changing technologies. It’s a huge challenge to find this time, but that’s how I learned a lot of computer programming, during breaks.

I do encourage librarians who are interested in the DST4L class to advocate for professional development time to take the class by pointing out to their administrators the usefulness of the skills they’ll learn. I have thought about teaching the class online but it wouldn’t work well that way.  One of the key factors to successfully sticking with a class is being involved in group projects in which your classmates are counting on your participation. No one wants to let their group down, so they consistently attend the classes.

Did your educational background prior to library school help you with your work now?

Sally:  I have a B.A. in Philosophy, a Master’s in Divinity, and a Master’s in Exercise Physiology. These are all very different fields from the research disciplines that I’m involved with right now.  I do think having worked in a research environment while studying physiology has been a huge plus.  It gave me a sound background in research methods and familiarity with research work and environments.

Chris:  My background is a B.A. in History with a minor in Agriculture and Managerial Economics. Very different from computer science! But several years back, I really wanted to get a job as a programmer and was pretty sure that I could teach myself the basics.  I learned programming initially by picking up a C++ book years ago and studying it. It wasn’t easy but I was determined to learn programming so I could work in a software company.  I did get hired as a programmer. The first week on the job was a bit shaky, but I persevered and learned as I went along.

The thing I missed, though, in working as a programmer was working directly with users. I enjoy working with people. I did a consulting gig for a while and was able to work more with users then. As I thought more about wanting to work with people, I started to consider library school.

What has inspired you?

Sally:  I have been working at UMass Medical School for nine years now, but it’s only been during the last two years that I’ve worked directly with researchers. I know much more now about the research work that is being done at the school, yet the more I know the more I realize how much more is being done here that I know nothing about. I’m inspired by the incredibly bright researchers with whom I have the opportunity to work.  I enjoy the work I’m doing now as an informationist on a neuroscience project. I like looking at the big picture: understanding the project activities, the project design, and the data management challenges. In this project, we’re trying to explore new ways for effectively citing individual neuroimages that are part of a “dataset” that is basically a collection of neuroimages.

Chris:  I was inspired by an internship I did at the Smithsonian in which I worked on DigiLab, an interactive exhibit of digital materials. I enjoyed the work I did there so much, it inspired me to continue learning everything I could about programming for the web.

Another thing that I’ve found motivating is going to user forums to learn new coding skills. Ideally, they’re places where you can informally learn from a community of other users. When I started learning, forums were intimidating spaces, but they have improved dramatically. Now the users are generally more welcoming, though there is still work to be done to improve the culture so it is less male dominated. I’ve been favorably impressed by Software Carpentry.  They’ve been great to work with, and I always recommend the bootcamps they run to students.

From there my conversations with both Sally and Chris veered to new roles for libraries, data repositories, and revamping library school curricula to include data science courses—topics for other blog posts.  However, I did come away from the interviews with a few do-able approaches for self-learning:

  • Be single-minded. Identify one topic or skill you want to learn and focus on mastering it.

  • Seize opportunities to attend lectures, seminars, and poster presentations on research topics (there are many of these in an academic institution).

  • Enroll in a face-to-face class with required projects.

I found these helpful and hope they’ll help others who are paralyzed by the “so much to learn, so little time” conundrum. I’ll let you know how my self-learning proceeds in a future post!


Dr. Lawrence Tabak: Enhancing Reproducibility and Transparency of Research Findings

e-Science Portal Blog - Mon, 09/22/2014 - 16:12

Enhancing Reproducibility and Transparency of Research Findings.
Lecture given by Lawrence Tabak, Principal Deputy Director of the NIH
Part of the Sanger Series at Virginia Commonwealth University, Richmond, VA

The starting point of Dr. Tabak’s lecture was the editorial he and Dr. Francis Collins published in Nature, January 30, 2014, on NIH plans to enhance the reproducibility of research.

One of the things they noted in the article was that the NIH can’t make the changes to research alone.  Scientists, in their roles as reviewers of grants and articles, editors of journals, and members of tenure panels, can help with the process.

 Science has always been viewed as self-correcting, and it generally is over the long term, but the checks and balances for reproducibility in the short and medium term are a problem. Tabak discussed several problems with current research publications. Journals want exciting articles, “Cartoon biology” according to Tabak, and so the methods sections are shrinking – “more like method tweets”. Add to this issues with;

  • poor research design, e.g., not using blinding or randomizing, using a small sample,
  • incorrectly identified materials, e.g., not verifying cell strains or antibodies,
  • variability of animals,
  • contamination of cells,
  • sex differences

Along with methods issues, Tabak identified problems with poor training in experimental design, poor evaluation leading to more errata and retractions, the difficulty of publishing negative findings, and the “perverse reward incentives” in the US biomedical research system.

What is the NIH doing?

In addition to speaking at many venues outside of the NIH (such as this lecture at VCU), the NIH is working with editors, industry, and other groups to improve research.

Editors of journals with the most NIH researcher publications were invited to a workshop in June 2014, and a set of principles for journal publication was drawn up.  Science and Nature will run editorials in November with the finalized principles.  The principles will include encouraging rigorous statistical analysis, transparency in reporting, data and materials sharing, and establishing best practices guidelines.

NIH is working with industry through PhRMA to make training materials on research design available to everyone. And there will be some training films developed at the NIH for use around the country.

Tabak mentioned a couple of projects that should help with the validation of high-quality experimental results. The Reproducibility Initiative is a collaboration between Science Exchange, PLOS, figshare and Mendeley, and the Open Science Framework from the Center for Open Science allows researchers to register materials and methods before research, similar to a clinical trials register.

Tabak also discussed a checklist of core elements that might be used when reviewing grants.  Included was the idea that researchers need to make sure their background articles are reproducible and of high-quality. He mentioned that some of the false hopes of patients for a cure for devastating illnesses, such as ALS or cancer, are based on poorly designed animal studies that should have never progressed to clinical trials.

Post-publication review of papers, in forums like PubMed Commons, is one way to ensure transparency.  As well as discussing and clarifying the research, some authors have linked to data sets, including negative data sets, which increases the usefulness of the Commons model.

There was also a discussion of funding for replication studies, and alternative funding methods to increase stability for mid-career researchers. Tabak concluded by mentioning that these new funding models would need to be evaluated and it may be that different institutes will use different models.

What can librarians do?

Throughout the lecture I thought of things librarians could be doing to support scientific transparency and reproducibility. We can encourage the best possible background searching for research, which means training students as well as working with researchers to refine their searching.  We can encourage citation searching and show researchers how to follow up on errata and retractions so they know what others think of the research they are reading. We can encourage the use of social media for informal communications about research. And be sure to keep an eye out for the principles the journals will be sharing in November.

As a data librarian, I can encourage proper data management and documentation for reliable reporting.  I can suggest sharing data in various venues and linking that data to their articles.  I can suggest that data management training be part of the research design training and I can offer to do it.

And libraries can invite lectures such as this one – VCU Libraries sponsored this lecture with our Office of Research and Innovation.  And it was very well attended.

I’m sure there are ideas I’ve missed.  I would love to hear any ideas you might have for ways librarians can help research transparency and reproducibility.

Video of the lecture is available at http://youtu.be/E06QJTZ6LUw

Two job opportunities at Brown University Library

e-Science Portal Blog - Mon, 09/22/2014 - 09:44

Brown University has two new job opportunities:

  • Physical Sciences Librarian
  • Digital Humanities Librarian

Detailed job announcements for these positions are located on the Employment Opportunities page of the Brown University Library website.

Save the Date for RDAP 2015

e-Science Portal Blog - Mon, 09/22/2014 - 09:19

The 2015 RDAP (Research Data Access & Preservation) Summit will be held April 22-24 in Minneapolis, Minnesota. See the RDAP website for further details.

Teaching Research Data Management at University of Tennessee Knoxville

e-Science Portal Blog - Mon, 09/15/2014 - 14:48

Check out the newly published article in the Journal of eScience Librarianship, “Planning Data Management Education Initiatives: Process, Feedback, and Future Directions.” In the article, Christopher Eaker, Data Curation Librarian at the University of Tennessee Libraries, discusses a one-day Data Management Workshop that he taught to graduate science and engineering students using modules from the New England Collaborative Data Management Curriculum. As part of the workshop, Eaker asked students to take a pre-workshop survey and a series of seven post-module surveys throughout the day.  In the article, Eaker discusses findings from the surveys and how they are shaping his plans for future research data management training.


Planning underway for 2015 National Digital Stewardship Residency Program

e-Science Portal Blog - Mon, 09/15/2014 - 13:11

The following announcement was made by George Coulbourne, Supervisory Program Specialist, Library of Congress, Office of Strategic Initiatives:

The Library of Congress Office of Strategic Initiatives, in partnership with the Institute of Museum and Library Services (IMLS), is planning for another year of the National Digital Stewardship Residency program (NDSR) to be held in the Washington, DC Metro area, starting in June 2015. As you may know, this program is designed for recent master’s and doctoral graduates interested in the field of digital stewardship.  This will be the fourth class of residents for this program overall: the first, in 2013, was held in Washington, DC, and the second and third, which started earlier this month, are being held concurrently in New York and Boston.

The 2015 DC Residents will each be paired with an affiliated host institution for a 12-month program that will provide them with an opportunity to develop, apply, and advance their digital stewardship knowledge and skills in real-world settings. The participating hosts and projects for the 2015 cohort will be announced in early December, and the application period will open shortly after.  News and updates will be posted to the NDSR webpage (www.digitalpreservation.gov/ndsr) and The Signal blog (http://blogs.loc.gov/digitalpreservation/).

In addition to providing great career benefits for the residents, the success of the NDSR program also provides benefits to the institutions involved as well as the library and archives field in general.

Please help us spread the word about this program, and forward this information to student groups and other organizations who might be interested.  We appreciate your help very much.

To learn more about the NDSR, please visit our website at: www.digitalpreservation.gov/ndsr.




In honor of Labor Day–some recent job postings

e-Science Portal Blog - Thu, 08/28/2014 - 14:40

According to the US Labor Department, Labor Day “is dedicated to the social and economic achievements of American workers. It constitutes a yearly national tribute to the contributions workers have made to the strength, prosperity, and well-being of our country.”

And what better way to celebrate Labor Day on e-Science Community than to share a few recent job announcements:

Arizona State University:  2 positions:  Health Sciences Librarian, Digital Projects Librarian

Boston University:  Research Data Management Librarian (Science & Engineering Library)

Medical University of South Carolina:  Research and Education Informationist/Librarian  (2 positions)

New York University:  Data Curator

Santa Clara University:  Science Librarian and Scholarly Communication Coordinator

Tufts University:  Science Collections Librarian

University of Florida:  Data Management Services Librarian



Institute for Research Design in Librarianship: Raising the Bar in Library & Information Science Research

e-Science Portal Blog - Mon, 08/25/2014 - 15:15

Submitted by guest contributors: Daina Bouquin, Data & Metadata Services Librarian, Weill Cornell Medical College of Cornell University, dab2058@med.cornell.edu; Chris Eaker, Data Curation Librarian, University of Tennessee Libraries, ceaker@utk.edu

Why do librarians need to do research? Or rather, why does anyone need to do research? Librarians conduct research to better understand the communities they serve and to develop responses that reflect their needs. Whether it be biomedical research, engineering, art history, or library science, research is imperative to developing the skills necessary to execute on innovative ideas and support decisions with data. Publication allows researchers to share their findings with the wider scholarly community and to build upon the findings of others. Research in the library and information science fields also helps increase receptivity to change in established environments; improves management skills through systematic study and data driven decision making; and helps researchers provide better service to and empathy for faculty researchers within their institutions (Black & Leysen, 1994; Montanelli & Stenstrom, 1986). Librarians who engage in research may also be better equipped to initiate new services that meet the specific needs of their communities. Furthermore, research in the academic library environment is not only useful, but expected for many academic librarians.  Librarians who produce comprehensive research are better able to progress toward promotion, tenure, higher salaries, advancement in the profession, and well-warranted recognition. However, many librarians are confronted with barriers to pursuing research. Many of these obstacles have been documented in the literature and include lack of time to conduct research, unfamiliarity with the research process, lack of support for research, lack of confidence, and inadequate education in research methods (Koufogiannakis & Crumley, 2006, 333; Powell, Baker, & Mika, 2002, 50; McNicol & Nankivell, 2003). In response to these barriers, librarian researchers at Loyola Marymount University developed the Institute for Research Design in Librarianship (IRDL). 
The IRDL is a 9-day continuing education program designed to mitigate these obstacles and train world-class library and information science researchers.

And so this past June, starting June 16 and running through June 26, twenty-five academic librarians and information professionals participated in the first-ever Institute for Research Design in Librarianship (IRDL) at Loyola Marymount University in Los Angeles, California. The IRDL is funded for three years by the Institute of Museum and Library Services to train a total of 75 professionals (25 per year) in research methods and to support them in developing professional research networks as they embark on their first attempts at comprehensive research and publishing in peer-reviewed journals. The first set of 25 IRDL Scholars (including the two authors of this article) were chosen in a competitive application process out of 86 applicants. To apply for the IRDL, applicants had to submit a proposal for a research project they would like to conduct once IRDL was over.

During IRDL, scholars received comprehensive training in the nuts and bolts of the research process. Topics included creating research questions and hypotheses, using qualitative methods (e.g. in-depth interviews and focus groups) and quantitative methods (e.g. surveys), along with mixed-methods research. Scholars were also given hands-on training with both quantitative and qualitative data analysis techniques and software, such as SPSS and NVivo. By studying these aspects of the research process, and consulting with peers and instructors, scholars were able to start developing skills to help them become more critical consumers of published research — this skillset is key when trying to not only produce quality research, but also contribute to meaningful discussion and criticism of research in information science. Scholars were also introduced to issues regarding realistic approaches to publishing to better prepare them to share their prospective research findings in the future.

The IRDL program also reflected an emphasis on the importance of having a supportive learning environment, mentorship opportunities, and tools to jump-start a new research agenda. Additionally, the Institute gave scholars access to both qualitative and quantitative methods experts both inside and outside of library and information science fields to help address the need to improve the quality of Library and Information Science research. An article published in The Journal of Academic Librarianship analyzed the contents of 1,880 articles in library and information science journals. Of those, they found that only 16% “qualified as research,” which they defined as  “an inquiry which is carried out, at least to some degree, by a systematic method with the purpose of eliciting some new facts, concepts, or ideas” (Turcios, Agarwal & Watkins, 2014). This study also found that surveys were the most commonly used research method among the studies published in the reviewed journals. These results could suggest that although there is research being done, librarians may not be making full use of all the methods they have available to them, and may not be producing as much “research” as they suspect. The goals of the IRDL are reflective of this sentiment.

During IRDL, scholars had to refine their initial proposals based on the new skills and concepts they were learning. Now that the IRDL Scholars have returned to their respective institutions, the real work begins. Scholars are finalizing their research design and submitting IRB applications to begin conducting their research. Over the next several months, institute scholars will be conducting interviews and focus groups, administering surveys, and maybe even using our new favorite research method: garbology! Over the next year, keep a watch in the library and information science journals for articles from all the IRDL scholars’ many and varied research projects.

If you’re a new librarian, or a librarian who is still unsure of the research process, we encourage you to apply for next year’s IRDL. The IMLS has funded IRDL for three years, but they are working on plans to make it sustainable so many more cohorts of librarians can be trained in sound research methods and techniques. You can find out more about IRDL at http://irdlonline.org/ or on Twitter @IRDLonline and #IRDL. You will be overwhelmed with information, but that’s the price we must pay to move our research to the next level.


Black, W. K., & Leysen, J. M. (May 1994). Scholarship and the academic librarian. College & Research Libraries, 55, 229-241.

Montanelli, D. S., & Stenstrom, P. F. (September 1986). The benefits of research for academic librarians and the institutions they serve. College & Research Libraries, 47, 482-485.

Koufogiannakis, D., & Crumley, E. (2006). Research in librarianship: issues to consider. Library Hi Tech, 24(3), 324-340. doi:10.1108/07378830610692109

McNicol, S., & Nankivell, C. (2003). The LIS research landscape: A review and prognosis. Centre for Information Research. Retrieved from http://www.researchgate.net/publication/228392587_The_LIS_research_landscape_a_review_and_prognosis.

Powell, R. R., Baker, L. M., & Mika, J. J. (2002). Library and information science practitioners and research. Library & Information Science Research, 24(1), 49-72. doi:10.1016/S0740-8188(01)00104-9

Turcios, M. E., Agarwal, N. K., & Watkins, L. (2014). How much of library and information science literature qualifies as research? The Journal of Academic Librarianship. doi:10.1016/j.acalib.2014.06.003


ICPSR Managing and Curating Data Workshop

e-Science Portal Blog - Tue, 08/19/2014 - 11:19

Submitted by guest contributor Willow Dressel, Plasma Physics/E-Science Librarian, Princeton University, wdressel@princeton.edu

The last week of July I attended ICPSR’s workshop Curating and Managing Research Data for Reuse at the University of Michigan in Ann Arbor.  The workshop is part of ICPSR’s summer program and was started three years ago. I was interested in this workshop to try to get a firmer grasp on managing research data and begin to develop a deeper understanding of what is involved in curating.

The workshop was presented by curators from both ICPSR and the UK Data Archive and followed the ICPSR Pipeline Process for curation, with each day progressing through the issues and actions associated with Deposit, Processing, Delivery, and Access. There was a healthy mix of lecture and hands-on activities. The roughly twenty participants were international and from diverse backgrounds including social science research, other data repositories, and libraries, which provided unique perspectives that greatly enhanced class discussions.

Like many of my colleagues, I am a science librarian who has been tasked with developing services and resources to help science researchers manage their research data. Over the last couple of years I have attended various workshops and conferences to try to get up to speed.  In this time, I have learned a lot about the different issues around managing and preserving scientific research data, as well as what other libraries are doing. As a result, I have managed to put together some really basic services such as data management plan consultation and assistance depositing in disciplinary repositories.

However, as I begin to put together a data management workshop and libguide, I can feel my knowledge gaps in this area. I understand the need for things like documentation, stable file formats, storage and back-up, file cleaning, and confidentiality, but I don’t have a deep understanding of how to do these things. I am still reading and learning as I go. As an undergraduate physics and astronomy major, I worked with only a little bit of spreadsheet data, and that was ten years ago. It’s hard to feel confident in giving people advice on how to manage their data when I have worked with so little data myself. This workshop offered a lot of hands-on exercises, including actually working with both quantitative and qualitative data. Prior to attending the workshop, I had been concerned that the heavily social science perspective of the workshop might not be as relevant to me as a science librarian. Now I believe this is a benefit. Who better to learn from than a field with established disciplinary repositories and a long culture of managing, curating, and reusing its data?

As for the curation aspect of the workshop, I don’t currently have data curation in my job description and my institution doesn’t currently offer data curation services. Nevertheless, it seems that this is an important aspect of dealing with research data and I believe having an understanding of the process and issues associated with data curation will help me assist researchers to deposit in a repository as well as inform the possibility of developing these services.

Registration now open for Data Scientist Training for Librarians (DST4L)

e-Science Portal Blog - Tue, 08/19/2014 - 07:35

Data Scientist Training for Librarians or DST4L (http://altbibl.io/dst4l) is an experimental course being offered by the Harvard-Smithsonian Center for Astrophysics John G. Wolbach Library and the Harvard Library to train librarians to respond to the growing data needs of their communities. Data science techniques are becoming increasingly important to all fields of scholarship. In this hands-on course, librarians learn the latest tools for extracting, wrangling, storing, analyzing, and visualizing data. By experiencing the research data lifecycle themselves, librarians develop the data savvy skills that can help transform the services they offer.

The DST4L course is free and open to beginners. Registration opens on August 15th and closes on August 22nd. A maximum of 40 participants will be accepted into the program and it is open to librarians outside of Harvard University. A tentative course outline can be found on the Current Course page (http://altbibl.io/dst4l/current-course/) of the DST4L website. Please review the provisional schedule to see if you can commit to the program first before registering. If you cannot attend the course, material will be made available via the DST4L website as it progresses. The course will not be live streamed or recorded. You must be physically present for the course.

Registration form: http://goo.gl/FtffdX

In addition to the course sessions, there will also be monthly Data Savvy Librarians meetups to work on projects together, share discoveries, and hone our skills using real-world problems. Meetups will be announced via the DST4L Google Group:

https://groups.google.com/forum/#!forum/dst4l (sign up required)

You do not need to be enrolled in DST4L to join the group or attend the meetups, though we recommend that you have some familiarity with data-related tools to participate in the meetups.

Highlights of NN/LM MAR Symposium on Approaches to RDM for Libraries

e-Science Portal Blog - Mon, 08/18/2014 - 14:45

Submitted by Kate McNeil, Social Science Data Services and Economics Librarian, MIT

Highlights of: National Network of Libraries of Medicine Symposium: Doing It Your Way: Approaches to Research Data Management for Libraries
Rockefeller University, NY, April 28-29, 2014

In late April, the National Network of Libraries of Medicine, Middle Atlantic Region (NN/LM MAR) hosted a two-day symposium on research data management (RDM).  The event garnered well over 100 participants from the mid-Atlantic and beyond, professionals both from medical libraries and a variety of other settings who are providing or exploring RDM services.

Keynote Speakers

The initial keynote speaker was Paul Harris, Director, Office of Research Informatics, Vanderbilt University School of Medicine.  He encouraged participants to seek out opportunities to develop tools and services immediately useful to their local researchers which also would further the goals of RDM.  He profiled several tools that they provide locally, including:

  • Project REDCap (Research Electronic Data Capture): This system enables the collection of metadata about active biomedical projects and associated collected data at one’s institution. It was created at Vanderbilt and since has been deployed at other institutions via the REDCap Consortium.
  • StarBRITE CMS Researcher Portal: This Vanderbilt-specific platform enables the centralized collection of information about research: news, pilot funding, project information, researcher profiles.

The second keynote speaker was Keith Webster, the Dean of Libraries for Carnegie Mellon University.  He provided a general overview of the importance of RDM for academic libraries (in the light of changes in the way that science is done and evolving roles for academic libraries).  He then spent time situating the attendees’ work within important trends in the broader, international, professional context.  He encouraged participants to develop their skills in this field and to stay aware of the very significant progress and initiatives happening internationally, particularly in Europe and Australia.  He noted some key reports and articles to read (listed at the end of this posting).

The final keynote speaker was Jared Lyle, Director of Curation Services of the Inter-university Consortium for Political and Social Research (ICPSR).  ICPSR, the large, long-standing social science data archive based at the University of Michigan, has a tradition of working with data producers to acquire data and then curate it for optimal re-use by secondary researchers.  ICPSR tools highlighted include the data catalog (which enables discovery of datasets and granular information including variables) and the Bibliography of Data-related Literature (which links ICPSR studies to resulting publications based on the data in the archive).  With ICPSR’s history of supporting data re-use, he pointed out that a well-prepared data collection should be complete and self-explanatory.  However, researchers (many of whom may have a high willingness to share) rarely have sufficient time, money, or resources to prepare and document their data well for re-use.  He also pointed out that many professionals in the field are trying to better understand this landscape and develop new services in order to improve the sharing and quality of research data.  For example, one Stanford librarian works with local researchers to curate, redistribute, and archive their research data.  The Stanford Social Science Data Collection is a type of intermediary repository; staff members work with researchers to capture their datasets, later moving them to a more long-term repository.

University Service Models

In the afternoon, attendees heard from practicing professionals on overviews of their RDM services.  Following are highlights of the services of a selection of universities:

University of Minnesota:

The Libraries have a staff member dedicated to RDM, the Research Data Management/Curation Lead, who provides services and coordinates the work of other staff.  Their RDM service is overseen by a campus advisory group with members from various stakeholder departments; the Libraries are working with this group to develop a campus-wide referral network.  One significant effort of the Libraries is a pilot to have staff actively curate and upload the data associated with 30 researcher projects into their institutional repository (IR).  They also have worked with researchers to self-deposit their datasets into the IR, instructing them on practices in realms such as metadata. They use DSpace for their IR and are finding that the newer version (i.e., 4.x) provides more flexible features for research data, including metadata elements beyond Dublin Core.

University of North Carolina:

This university has an RDM service group co-led by two librarians (each of whose primary focus is on other service areas). Staff members provide a range of services in cooperation with other stakeholder departments on campus to whom they reached out over time.  For example, they collaboratively conducted a series of information sessions on data management for researchers.  The Libraries partnered with campus stakeholders, each teaching components on different topics (DMPs, repository options, sensitive research data, data security), including areas of expertise outside of the Libraries.  These popular sessions, in addition to being provided in-person, were live streamed (to a large audience) and recorded for later viewing.  Looking towards the future, the Libraries are in the process of actively reorganizing for improved research lifecycle support.

NYU Health Sciences Libraries:

They have worked at developing RDM services, working in partnerships with staff both inside the library (e.g., subject liaisons) and throughout the university.  A core challenge for this institution has been helping to change perceptions about the scope of a library and demonstrate to researchers the library’s role in RDM services.  To that end, they collaborated with various staff members to develop and distribute several quickly-popular YouTube videos on the significance of RDM.  These videos are used on their own and as part of library instruction (not only at NYU but, as the symposium illustrated, by many other universities as well).


Their dedicated Scientific Data Curation Specialist coordinates services and the work of other staff.  She manages a collaborative consulting team, consisting of two groups: 1) a core of staff members (mostly in the Libraries) and 2) additional second-level team members from departments campus-wide who are called upon as needed (e.g., staff members from IT and legal).  In addition, their service is overseen by an upper-level management council (with membership across several university departments).

Case Study

On the second day, Sherry Lake, Senior Data Consultant, and Andrea Horne Denton, Research and Data Services Manager, of the University of Virginia educated attendees on some key RDM best practices via a case study that they use in their workshops, based on a case from the Digital Curation Profiles Directory.  Participants examined the profiled research group’s practices in the realms of: data collection and organization, documentation and metadata, storage and backup, and preservation/sharing/licensing. In doing so, they learned about common issues which researchers might face and how to assist them.

Regarding RDM services, UVA has two different approaches:

  • operational: helping to improve researcher efficiency and good organization and documentation practices throughout the life cycle
  • sharing: helping researchers to be aware of requirements and plan for downstream data sharing

UVA provides many services similar to other institutions, and like some others does a series of workshops (dubbed “Research Data Management Boot Camp”) with contributing instructors from departments across the university.

Lastly, the presenter shared two lists of resources that she maintains for keeping up to date on the field of RDM.

Principles for RDM Work

Over the two days, presentations highlighted various strategies that professionals utilize in providing RDM services:

  • Promote curation rather than sharing; the former is more salient for researchers, and must precede the latter.
  • A well-prepared data collection should be complete and self-explanatory; help researchers to meet this standard.
  • Encourage best practices yet support people where they are.  Even if a researcher’s method of sharing data (e.g., storing it on one’s hard drive and responding to requests) has significant drawbacks, help them to execute their selected method in an optimal way (in this example, by establishing appropriate backups), while gently sharing concerns about their method and being available to help them consider other methods when the time is right.
  • Continue outreach efforts on a regular basis; people don’t always see ads even if you do a great one-time campaign.
  • Once researchers have shared their data, tell them about the ways to track use of their data.
  • Develop services based on the hypothesis that researchers will do the right thing (maintain their information securely, track metadata, maintain audit trails, etc.) if provided an easy way to do it with needed tools and services.
  • When developing partnerships or services, the technology is the easiest part.  Relationships take time to build; be prepared to slow down to work with diverse needs.
  • Frame one’s services within the data curation lifecycle for staff and stakeholders with whom one communicates or partners.
  • In planning collaborative services with senior administrators/department heads, make sure they are communicating plans and expectations down to the PIs.
  • Track your work for assessment.
  • Stay aware of RDM requirements/regulations around the world, both for professional awareness and given the fact that U.S. researchers likely are collaborating across borders.

In summary, while symposium attendees were largely focused on medical library settings, the lessons learned apply to research and libraries in all disciplinary contexts.

Suggested Reports/Articles to Read

Wellcome Trust; Guidance for Researchers: Developing a Data Management and Sharing Plan

Opening for Science Collections Librarian at Tufts University

e-Science Portal Blog - Tue, 08/12/2014 - 11:29

The Tisch Library at Tufts University has recently announced a job opportunity for a Science Collections Librarian. For further details:  http://tufts.taleo.net/careersection/jobdetail.ftl?job=14000636&lang=en&sns_id=mailto


Application deadline for OpenCon 2014 is August 25th

e-Science Portal Blog - Tue, 08/12/2014 - 10:25

OpenCon 2014 will be held in Washington, DC, November 15-17. This workshop is geared to students of all levels, early career researchers, and young professionals in fields related to scholarly and scientific research (e.g. librarians, professional advocates, etc.). OpenCon 2014’s theme is Open Access, Open Education and Open Data.

To attend OpenCon 2014, there is an application process instead of open registration, as a large number of participants will be offered full or partial travel scholarships.  See OpenCon 2014’s application page.

Developing a for-credit course on data management

e-Science Portal Blog - Tue, 08/12/2014 - 08:51

Submitted by Sarah Wright, Life Sciences Librarian for Research at Cornell University’s Albert R. Mann Library

We all know that instruction is a major part of librarians’ jobs, but more specialized instruction opportunities, like educating students about data and research techniques, are often less recognized. Furthermore, librarians rarely expect to offer instruction for credit. But over the course of the last three years, I had the opportunity to pursue just this type of specialized instruction.

It began in 2012 with a grant from the Institute of Museum and Library Services (IMLS) which enabled me, in collaboration with Camille Andrews, learning technologies and assessment librarian at Cornell’s Mann Library, and Cliff Kraft, associate professor in the department of natural resources, to explore needs and develop a course to help graduate students learn to manage their data. The IMLS-funded collaboration also included Purdue University, the University of Minnesota and the University of Oregon, and focused on developing data management instruction in several different STEM fields. The experiences of the collaborative effort are collected on our wiki (datainfolit.org), and explained in more depth in a book due to be published by Purdue University Press this fall: Data Information Literacy: Librarians, Data and the Education of a New Generation of Researchers.

At Cornell, data management instruction has taken the form of a for-credit course, offered experimentally in spring 2013, and then as a formally approved course in 2014 (NTRES 6600: Managing Data to Facilitate Your Research). Cliff and I share teaching duties; I introduce research techniques and best practices, and Cliff contributes the research context. I also take advantage of the expertise of others, calling on colleagues to teach classes on metadata and relational database design.

The course is offered for 1 credit, spanning 6 sessions early in the spring, and introduces the students to a range of best practices, from initial steps like using consistent file names, to creating “readme” directory files and managing complex data documentation. The students also have the opportunity to discuss hot topics like data sharing, something that is often a topic of conversation, but rarely covered in courses.

The faculty collaborator, Cliff, was instrumental in making this course a reality. His interest in co-teaching the course came from his experience managing data in his own research group, where he had long been interested in introducing training, but didn’t feel comfortable with all the topics he felt needed to be addressed. After offering the course experimentally, Cliff took the step of submitting the course to the curriculum committee and asking that it be made a permanent part of the natural resources curriculum. It helped that the students who took the experimentally offered course were enthusiastic – when surveyed, the students said that they would recommend the course to others, reported substantial gains in confidence around the data management topics covered, and one even went so far as to say that the course filled a very important hole in their education.

We plan to continue offering (and improving) the course as long as interest from students continues. Every semester offers an opportunity for fine-tuning. We survey the students to make sure we’re focusing on important topics and the depth of coverage is appropriate. The reaction the second time the course was offered was better than the first – evidence that our choices to cut some content and to provide deeper instruction and opportunities for hands-on practice around other content were well received.

We’re also thinking of branching out into other subjects: both iterations of the course so far have drawn some social scientists, and the natural resources examples we use aren’t a great fit for them. They still get a lot out of the class, but they’d probably get even more out of it if we were able to offer a class targeted at their needs. The only limits are time and energy – it takes a lot of both to develop courses like this, along with subject expertise, so it will require more librarians and faculty who are willing and able to collaborate to develop instruction. The payoff is tremendous though. Not only was I able to develop stronger relationships with graduate students, giving me added insight into their needs, but at the end of the course students enthusiastically thanked us for covering such an important topic and for making them aware of a much wider variety of resources and help available at Cornell than they had realized existed. The library benefits from greater embeddedness in the research process, and the students benefit from having us there.

I learned lessons from the process, and include a few of those here:

  • Limit class size (or develop content that can easily scale up). We had prepared for a small class size, developing content around active learning and discussion. When almost 30 students showed up, we had trouble adapting. Good problem to have, but still a problem. (Colleagues at the University of Minnesota developed nice online content targeted to their engineering students that allowed them to scale up.)
  • Make clear from the outset at what level you’ll teach the course. Very advanced students taking a beginning-level class will be frustrated, and vice versa.
  • Know your graduate students. We timed the class to occur in early spring since many of the students in natural resources are wrapping up field work in the fall and then starting up again as the weather improves in the spring.
  • Context is important. Real-life examples are a key component of the class, and Cliff pulls data from his research that resonates much more than a generic example might.
  • Don’t underestimate your ability to contribute. What I considered very basic data management instruction was some of the most well-received; students also requested more best practices guidance than I expected. (Example: file-naming and organization strategies)
  • If you can’t offer a for-credit course, offer what you can. Working with other liaison librarians, I’ve adapted content and offered workshops for graduate students in other fields, including engineering, physical sciences, and astronomy.


NECDMC goes to Las Vegas: ALA 2014 and ‘Dry Heat/Blue Legs’

e-Science Portal Blog - Mon, 08/04/2014 - 11:31

Submitted by Regina Fisher Raboin, Data Management Services Group Coordinator,  Science Research & Instruction Librarian, Tisch Library, Tufts University

You know the oft-quoted phrase, “Whatever happens in Vegas stays in Vegas”? Well, I did hear this line many, many times while in Las Vegas for the American Library Association’s annual convention, but I can assure you that NECDMC didn’t stay in Vegas!

At this year’s convention, amidst Elvis impersonators, side-by-side hotel wedding chapels and tattoo parlors, and the ever-present din of gambling, I had the pleasure of representing the New England Collaborative Data Management Curriculum’s project coordinators at the Association for Library Collections and Technical Services (ALCTS) Scholarly Interest Group (SIG) forum. Despite the June 28, post-lunch time-slot, there were approximately 100 in attendance.

The presentation was entitled “New England Collaborative Data Management Curriculum (NECDMC): An educational program and service for best practices in research data management (RDM)” and focused on the impetus for the development of the curriculum – the need to instruct faculty/researchers, students and librarians in best practices surrounding research data management. The talk covered development and piloting of the open source curriculum, information on how the curriculum materials can be used and customized, along with how building institutional and regional partnerships leads to successful curriculum implementation, compliance with federal mandates and highlighting best practices in research data management at an institutional level. Additionally, the presentation featured the recent “Train-the-Trainer” workshops and current/future pilots of the curriculum. The presentation was well-received and resulted in questions both during and after the session, along with emails from librarians interested in implementing NECDMC.

Preceding the NECDMC presentation was one by Sherri L. Barnes, the Scholarly Communication Program Coordinator, University of California, Santa Barbara, who spoke about their new Scholarly Communication Program, Scholarly Communication Express, a service that allows campus departments to request 15-minute presentations that are delivered at department meetings. The Express offerings include altmetrics, creating data management plans for the social sciences and sciences, Creative Commons licenses, eScholarship, UC’s institutional repository, EZID accounts, the NIH Public Access Policy, UC Open Access Policy, and understanding article publication agreements. Ms. Barnes commented that the web site is designed to reach an audience that rarely has time to think about, let alone change, the way they navigate the scholarly communication system and manage their intellectual property.

Later that day I spoke to the Numeric and Geospatial Data Services in Academic Libraries Interest Group (Association of College and Research Libraries) annual meeting where I presented, “Tisch Library’s Data Management Services Group: Accomplishments, Strategic Initiatives & Sustainability”.  The presentation and following round-table discussion focused on the development and launching of Tisch’s research data management services, strategic partnerships and initiatives, how the services were marketed, and the library’s plans to sustain these services. NECDMC was also highlighted in this presentation, as it is the anchor for Tisch’s Research Data Management Group’s best practices in research data management initiative.

While I attended many sessions at ALA Annual 2014, I’d like to highlight a few that I found interesting and informative. “Electronic Lab Notebooks: Managing Research from Data Collection to Publication”, offered by LITA (Library and Information Technology Association), looked at how Yale and Cornell implemented LabArchives and how this software fit into their broader data management support programs. A discussion group sponsored by the College and Research Libraries Interest Group and moderated by Buddy Pennington, Director of Collections and Access Management, University of Missouri, Kansas City and Doralyn Rossman, Head of Collection Development, Associate Professor, Montana State University Library, focused on supporting payment of author publishing fees and negotiating author’s rights, demonstrating the value of OA publications to faculty and graduate students, understanding granting agencies’ requirements for making data and findings publicly available, publishing in OA journals, ensuring preservation of and access to OA publications, and advocating value beyond Impact Factor, such as altmetrics. The discussion was well organized, allowing attendees to explore and go beyond the presented discussion topics.

A “why didn’t I think of that?” moment came during the Science and Technology Library Research Forum (ACRL STS) when Uta Hussong-Christian, Instruction & Science Librarian, Oregon State University and Rick Stoddart, Assessment Librarian, Oregon State University, presented a short paper, “STEM Learning in the Library Learning Commons: Examining Whiteboards for Evidence of Learning through Student-Generated Visualizations”. While they had their library’s learning commons usage statistics, they also wanted some type of substantive assessment to complement the data. So every Monday morning they would photograph all of the whiteboards (a ‘cognitive artifact’) in their library and then analyze the types of student-generated visualizations. The analysis coded the content by board, types of content (subjects/disciplines), drawing types (matrix, chart, diagram, etc.), and visualization-skill scores (i.e. how well drawn). They discovered the majority of whiteboards supported STEM student-learning, providing low-tech, high impact learning. Based on their research findings, Oregon State University library is going to be purchasing graphing whiteboards and checking out this new-fangled technology!

So I bet you’re wondering what I mean by “Dry Heat/Blue Legs” in the title. Don’t believe anyone who tells you that 110 degree temperatures (for 4 days I tell you!) aren’t uncomfortable since Las Vegas heat is ‘dry heat’. Well, I tested that hypothesis by wearing new dark blue capri jeans to ALA Annual 2014 – I think I’ll change my Twitter handle to “Blue Legs Raboin”.

