Below are some recent job opportunities for data librarians. If you would like to disseminate news about job opportunities at your institution on e-Science Community, please contact me at firstname.lastname@example.org.
University of Virginia: Senior Research Data Scientist (MLS not required, MS or PhD in Science/Engineering preferred)
Submitted by guest contributor Daina Bouquin, Data & Metadata Services Librarian, Weill Cornell Medical College of Cornell University, email@example.com
When do you use a database, and when do you use a spreadsheet? This simple question is key when designing a data management strategy. Neither tool is always the better choice; rather, you need to determine which tool is best suited for the task at hand. Spreadsheets get a bad name because they are so easy to misuse, but following some best practices for tabular data will keep you from making the most common mistakes. Meanwhile, databases are typically seen as overly complicated and requiring difficult-to-learn skills, but there are more than a few resources online to get you started with them (like Stanford’s Database course). This brief walkthrough will help you determine whether to go with a spreadsheet or a database on your next project.
First, it’s important to recognize that both spreadsheets and databases can be useful in manipulating data. Where these two tools differ is in how they store and manipulate data.
In spreadsheets, data values are stored in cells, with many cells making up an array of rows and columns. Cells can “refer” to each other and carry out processes on other cell values. Spreadsheets have taken over functions that, in the past, were carried out with paper and pencil in ledgers and worksheets (like financial record keeping) because they enable data to be recalculated much more quickly and efficiently than by hand. When one value in a column changes, totals and other formulas entered into cells are automatically recalculated. It’s important to note, though, that spreadsheets are not ideal for long-term data storage and only offer relatively simple query options. They also do not easily guard data integrity and offer little protection from data corruption, so while spreadsheets are great for tracking simple lists, they have real limitations. Many people think of MS Excel when they think of spreadsheets, but other platforms like Google Docs have spreadsheet applications as well.
With databases, data are usually stored in multiple tables. Each table is given a name and has columns and rows. Each row in the table is called a record, and each record typically has a value for each column in the table. Database tables are typically used to store raw data, meaning that data in rows are not the result of some manipulation or function like in a spreadsheet. Databases also allow you to enforce relationships between records in different tables so that the data can then be retrieved through querying. Querying is like asking questions of the data to pull information into formatted reports (e.g. an invoice). In this way databases easily manage large amounts of data and maintain data integrity better than spreadsheets typically do. Likewise, databases are better for long-term storage of records that may change and also have a much larger storage capacity than spreadsheets. Some database tools include MySQL, SQL Server, Oracle, MS Access, and REDCap.
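To make the idea of tables, records, and enforced relationships a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is not one of the tools named above; it is used here only because it needs no setup. The table names, column names, and sample values are all made up for illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Two related tables (hypothetical names): each visit record must
# point at an existing patient record.
conn.execute(
    "CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE visits (
        visit_id   INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
        visit_date TEXT NOT NULL,
        weight_kg  REAL
    )
""")

conn.execute("INSERT INTO patients VALUES (1, 'Patient A')")
conn.execute("INSERT INTO visits VALUES (1, 1, '2014-01-15', 70.5)")

# A visit that refers to a patient who does not exist is refused.
try:
    conn.execute("INSERT INTO visits VALUES (2, 99, '2014-01-20', 80.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A query "asks a question" across both tables at once.
rows = conn.execute("""
    SELECT p.name, v.visit_date, v.weight_kg
    FROM visits AS v
    JOIN patients AS p ON p.patient_id = v.patient_id
""").fetchall()

print(rejected)
print(rows)
```

The rejected insert is the point of the sketch: the database itself refuses a record that would break the relationship, which is the kind of integrity check a spreadsheet does not readily provide.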
Some questions to ask when deciding between the two:
- If you use a spreadsheet, would changes in one spreadsheet require you to make changes in others?
- Would the amount of data be manageable in a spreadsheet?
- Would you need several spreadsheets to contain related data?
- Would the data you are looking for be easy to find in a spreadsheet?
- Do multiple people need to access the data?
Answering yes to a few of these questions means you may want to consider going with a database over a spreadsheet for your project.
In summary, you may want to go with a database if:
- Multiple people will need to access the file
- The data is subject to change
- You want to store data long term
- You need a lot of storage space
- You need to generate multiple reports based on the same data. For example: a clinical researcher wants to see a group of patients’ average weight by month, another researcher only wants to see that measure for a certain subset of patients, and yet another researcher may be interested in seeing the median weight grouped by age instead. Rather than build three spreadsheets with different views, it would be easier and more efficient to make a database that would allow for queries to generate all three reports from one source.
Both spreadsheets and databases have their place; just try to avoid forcing a spreadsheet to do the work of a database. It’ll save you a headache.
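As a rough illustration of the three-reports example above, the sketch below loads one small table and produces all three views from it. It uses Python's built-in sqlite3 and statistics modules. The table layout, the "study_arm" column used to pick a subset, and the numbers are invented for the example. The median is computed in Python rather than in SQL, which is a simplification chosen for this sketch.

```python
import sqlite3
from collections import defaultdict
from statistics import median

conn = sqlite3.connect(":memory:")
# One source table (hypothetical layout) holding every measurement.
conn.execute("""
    CREATE TABLE weights (
        patient_id  INTEGER,
        age         INTEGER,
        study_arm   TEXT,
        measured_on TEXT,
        weight_kg   REAL
    )
""")
conn.executemany(
    "INSERT INTO weights VALUES (?, ?, ?, ?, ?)",
    [
        (1, 34, "A", "2014-01-10", 70.0),
        (2, 34, "B", "2014-01-12", 80.0),
        (3, 51, "A", "2014-01-20", 90.0),
        (1, 34, "A", "2014-02-10", 72.0),
        (2, 34, "B", "2014-02-11", 78.0),
        (3, 51, "A", "2014-02-18", 87.0),
    ],
)

# Report 1: average weight by month for the whole group.
avg_by_month = conn.execute("""
    SELECT substr(measured_on, 1, 7) AS month, AVG(weight_kg)
    FROM weights
    GROUP BY month
    ORDER BY month
""").fetchall()

# Report 2: the same measure, restricted to a subset of patients.
avg_by_month_arm_a = conn.execute("""
    SELECT substr(measured_on, 1, 7) AS month, AVG(weight_kg)
    FROM weights
    WHERE study_arm = ?
    GROUP BY month
    ORDER BY month
""", ("A",)).fetchall()

# Report 3: median weight grouped by age, computed in Python.
weights_by_age = defaultdict(list)
for age, weight in conn.execute("SELECT age, weight_kg FROM weights"):
    weights_by_age[age].append(weight)
median_by_age = {age: median(ws) for age, ws in sorted(weights_by_age.items())}

print(avg_by_month)
print(avg_by_month_arm_a)
print(median_by_age)
```

All three reports read from the same table, so a corrected measurement only has to be fixed in one place.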
A friend recently shared this article. It’s an interesting read that suggests that – education and experience being equal – women are less confident than their male peers, and that this lack of confidence in women negatively impacts their careers.
As I read the piece it triggered a memory – an old but vivid one.
Years ago I worked at a biotech company. My coworkers were almost all male. We came from similar backgrounds; most of us were recent college grads with degrees in biology or chemistry. We had plenty of chances to talk and bond over what was often monotonous work, so we got to know each other quite well. Sometimes we’d chat about science to pass the time.
One day I was talking science with a coworker while we worked on a large batch of samples. (He was the odd man out in our little group, since he’d majored in physics.) He made a pronouncement on some arcane biology topic that now escapes me. I knew what he said wasn’t right, and after a pause I jokingly called him on it.
He laughed and said: “OK, you got me. I’m really not sure – you’re probably right. I’m just the physics guy, remember?”
I said: “Well, you sure sounded like you were positive about it!”
Him: “Yeah, but that’s 90% of life, isn’t it? Sounding like you know what you’re talking about.”
Lighthearted as this conversation was, I can still remember being dumbstruck by his comment – the ol’ light bulb moment. I realized how self-assured my male coworkers sounded when they talked – about science, and just about everything else. I was much more tentative, sprinkling my speech with loopholes and deflections. It was a revelation – maybe they really DIDN’T know a lot more than I did; they just sounded like they did!
Per the 2013 Demographics Survey of ALA members, over 80% of librarians are female. IF librarians are mostly women, and IF women tend to be less confident than men, and IF a lack of self-assurance hurts women in their careers – what does that mean for libraries? And in particular, what does that mean in situations where libraries are pushing boundaries, reinventing themselves, and working to insert themselves and their librarians into the research enterprise in new ways?
There are, of course, many factors that contribute to the success of new initiatives. Much can rise and fall on environmental factors, like the support of library administration, institutional aspirations, and budgetary pressures. Subject matter background certainly helps librarians work closely with their departments, though I’ve argued before that I don’t see discipline knowledge as crucial. After reading this article, I’m wondering if part of what that subject matter expertise gives librarians is confidence. An extra dose of confidence is helpful for any of us, and may be even more welcome in situations where libraries and librarians are forging new paths.
What do you think? Does this article resonate with you, or not? Do you see any connections with our library work? Please comment here and/or on Twitter – #NERescience.
Simmons GSLIS is offering a Scientific Research Data Management class, 532G-01, this upcoming fall semester at Simmons’ Boston and Simmons West (Mt. Holyoke) campuses simultaneously via videocasting on Saturday mornings from 9 am to 12 pm. The class will be taught by a team of librarians that includes Dr. Elaine Martin, Rebecca Reznik-Zellen, and Donna Kafel from UMass Medical School, Andrew Creamer from Brown University, and Regina Raboin from Tufts University. The instructors will alternate between the two campuses, one week teaching at Simmons’ Boston campus and videocasting to the students at Mt. Holyoke, and the alternate week vice versa. The class is open to enrolled Simmons students as well as interested librarians. (Registration for non-Simmons students begins in August.) The first class begins on Sept. 6th.
This course, LIS 532G, Scientific Research Data Management, uses the case study method to prepare students from all academic backgrounds for roles in scientific research data management. It explores the current and emerging roles for information professionals in managing large or small volumes of research data sets. The course provides students with the skill set relevant to that of a data librarian whose job involves helping researchers manage and curate research data sets. The course examines the data practices of researchers in scientific fields such as biomedicine and engineering as examples of how researchers produce data and how they use these data for purposes of inquiry. Students learn about the purposes and tools of research data production and data reuse, data lifecycles and data reference interviews, data management practices, and the strategies of offering data consultancy services to researchers. Current issues regarding citing datasets, Open Access policies, and embedding the librarian as a member of a research team will also be addressed. The course will feature guest lectures by data scientists, data librarians and data archivists. Assignments include a series of readings, case study assignments, data reference interviews with researchers, and the development of data reference interview tools and data management plans for real research projects.
Full tuition for this 3 credit class is $3486, plus a $50 activity fee. Auditing the class (no credit or grade earned) is a less expensive option and is half the current tuition ($1743 for non-Simmons grads, $400 for any Simmons grads). For further details, see Simmons Forms and Policies.
If you’re interested in the class, please contact the Admissions Office at Simmons at firstname.lastname@example.org and discuss with the Admissions Office your preference for enrolling for credit or auditing. The office will send you an application. An official copy of your master’s degree transcript is a required component of the application.
New England librarians (or those outside of NE who don’t mind a long drive!) who are interested in learning more about e-Science librarianship and the management of scientific research data may want to consider enrolling in or auditing this class.
This summer ACRL is sponsoring an online e-learning course, “What You Need to Know about Writing Data Management Plans,” from July 14 to August 1, 2014. The course teaches participants the elements of a comprehensive data management plan and is taught by Dee Ann Allison and Kiyomi Deards of the University of Nebraska-Lincoln. See ACRL announcement for more details.
The National Digital Stewardship Residency (NDSR) model was “designed to develop the next generation of stewards to collect, manage, preserve, and make accessible our digital assets.” NDSR was developed by the Library of Congress and was initially piloted in Washington DC. Beginning in September, the NDSR will be piloted in Boston at Harvard, MIT, Northeastern, Tufts, and WGBH. See NDSR announcement for further details!
Submitted by Donna Kafel, Project Coordinator for the NE Librarians’ e-Science Program, University of Massachusetts Medical School
The comment “Well done!!” is music to my ears. Last Friday I had the pleasure of hearing this music from many of the Science Boot Campers at the conclusion of the sixth annual New England Librarians’ Science Boot Camp hosted by Carolyn Mills and her colleagues at the University of Connecticut. After months and months of careful planning, it was very rewarding to hear Boot Campers’ positive feedback and realize that they value the unique science-immersive learning opportunity that the New England Science Boot Camp Planning Group strives to provide each year.
My colleagues in the New England Science Boot Camp Planning Group and I meet regularly throughout the year to plan the next Science Boot Camp. In the course of our work planning boot camps, group members have developed specific expertise and project management skills. One of the most fulfilling aspects of participating in the group is seeing a new Science Boot Camp come to fruition after our initial brainstorming sessions and multiple planning meetings, e-mail discussions, and the massive team efforts led by the librarian who is hosting boot camp at her campus. Each year the group works out the details to plan this 2 ½ day event: reviewing attendees’ evaluations of past boot camps, planning topics for future camps, selecting and inviting faculty speakers, working with the next campus host to plan budget, facilities and logistics, setting up a new Science Boot Camp LibGuide and registration, and making sure we broadly disseminate boot camp announcements to librarians and library students interested in science librarianship— just to name a few.
In planning each boot camp, the Planning Group sticks to the original New England Science Boot Camp mission: to provide science, health sciences, and engineering librarians and interested library students a science-immersive and affordable continuing education event with opportunities to network and share ideas in a fun, laid-back setting. Keeping the event affordable has been made possible through the sponsorships of these organizations: the National Network of Libraries of Medicine New England Region, the Boston Library Consortium, the University of Massachusetts Amherst, University of Massachusetts Boston, University of Massachusetts Dartmouth, University of Massachusetts Medical School, College of the Holy Cross, Tufts University, University of Connecticut, and Worcester Polytechnic Institute.
It’s inspiring to see the Librarians’ Boot Camp model being adapted around the country and in Canada. (For details about the other Science Boot Camps, please see Margaret Henderson’s April 22nd e-Science Community post “Science Boot Camps for Librarians 2014”). Each of our groups can learn a lot from each other, our group members, our faculty presenters, and our current and future boot campers.
In closing, I’d like to acknowledge the following members of the New England Science Boot Camp Planning Group:
Mary Adams and Elizabeth Winiarz of the University of Massachusetts Dartmouth; Paulina Borrego, Naka Ishii, and Maxine Schmidt of the University of Massachusetts Amherst; Tina Mullins of the University of Massachusetts Boston; Andrew Creamer (now at Brown University), Sally Gore, Elaine Martin, and myself from the University of Massachusetts Medical School; Bijan Esfahani of Worcester Polytechnic Institute, Regina Raboin of Tufts University, Barbara Merolli of the College of the Holy Cross, and Carolyn Mills of the University of Connecticut…..
….And give a big shout out to our gracious colleagues at the University of Connecticut who hosted this year’s Science Boot Camp. Many thanks to Carolyn Mills, Sharon Giovenale, Valori Banfi, and the rest of the UCONN Science Boot Camp 2014 hosting team!
…And be sure to check out Sally Gore’s post “Hello Mudder, Hello Fadduh” and see her notes and sketches from boot camp….
…And stay tuned—I will be announcing when the Science Boot Camp videorecordings are available!
Submitted by: Jake Carlson, Associate Professor of Library Science/Data Services Specialist, Purdue University, email@example.com, @jrcarlso
Recently I had the pleasure of co-teaching a group of graduate students in a semester-long data information literacy program. Amongst their many interests was learning how to organize their data files and folders in a logical fashion so that they can easily find what they need, when they need it. Locating a specific component of their data often devolved into a needle-in-a-haystack search because they had named their files based on whatever thoughts were going through their heads at the moment. This problem was compounded when they had to find files from or share files with their advisor, peers or collaborators.
We spent a class session discussing naming conventions for files and folders as a means to alleviate this situation. Naming conventions are means of communicating descriptive and useful information through the name given to a particular file. These names are generated through the consistent application of articulated rules that have been vetted and agreed upon by participating individuals. Well-chosen naming conventions make it easier to not only identify the content of a file at a glance, but also to understand how any given file relates to other files in the collection.
Generating an effective naming convention is an investment of time and effort. Naturally every naming convention will be unique to the environment in which it was created, but we covered some common considerations for getting started. They are:
- Identify the commonalities and important distinctions between the data files. This may include things like author, date, type of experiment, procedure, etc. Naming conventions are usually comprised of multiple elements. Ideally, these elements should be meaningful to the intended audience and significant enough to include as a part of the file or folder name. One way to approach this exercise would be to consider the stages of the data lifecycle and what happens in each stage.
- Find the right number of elements and characters. Including too many elements in a naming convention weighs it down and reduces its usefulness; too few elements create ambiguity. Four to five elements are usually sufficient. Similarly, too many characters can cause problems in transferring files. Consider using meaningful abbreviations where possible and err on the side of brevity.
- Define the elements and acceptable entries. Be sure that these decisions are documented and accessible. A naming convention will break down if not followed consistently and so a reference document will need to be made available to all. You may want to include a “keyword” element that could accommodate a free text description to further convey the content of a file to a user to allow for some flexibility in the naming convention.
- Decide upon the order of the elements. The order of the elements in the naming convention will determine how they are listed and how they are grouped together. Consider what is important to your audience. Do they want files organized primarily by chronology, by author, or some other means? Start out by listing the general elements first and then move towards the more specific ones.
- Versioning. If a versioning number is included, be sure to define what constitutes a new version of the file and how lesser revisions will be accommodated in the documentation. Avoid using words like “final”, “update”, or “new” in the file name as they lose meaning over time.
Using these and other guidelines, we had our students develop their own naming conventions to apply to the data they were working on. This assignment was well received by students as it was something they could apply right away. Several students reported sharing what they learned with their peers and making an effort to develop a naming convention for their lab.
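For readers who would like to see these considerations in a concrete form, here is a small Python sketch of one possible convention. It is not the convention any of our students developed. The element order (project, experiment, date, author, version), the list of allowed experiment codes, the 50-character limit, and the function name build_name are all assumptions made up for this example.

```python
import re
from datetime import date

# Hypothetical convention: project_experiment_YYYYMMDD_author_vNN.ext
# In practice these rules would live in a shared reference document.
ALLOWED_EXPERIMENTS = {"pcr", "elisa", "wblot"}  # made-up experiment codes
FORBIDDEN_WORDS = {"final", "update", "new"}     # words that lose meaning over time
MAX_LENGTH = 50                                  # arbitrary limit that favors brevity


def build_name(project, experiment, run_date, author, version, ext):
    """Assemble a file name, ordering elements from general to specific."""
    if experiment not in ALLOWED_EXPERIMENTS:
        raise ValueError(f"unknown experiment code: {experiment!r}")

    parts = [
        project.lower(),
        experiment,
        run_date.strftime("%Y%m%d"),  # year-first dates sort chronologically
        author.lower(),
        f"v{version:02d}",            # zero-padded so v02 sorts before v10
    ]
    for part in parts:
        # Only lowercase letters and digits inside an element.
        if not re.fullmatch(r"[a-z0-9]+", part):
            raise ValueError(f"element not allowed: {part!r}")
        if part in FORBIDDEN_WORDS:
            raise ValueError(f"avoid {part!r}; use a version number instead")

    name = "_".join(parts) + "." + ext
    if len(name) > MAX_LENGTH:
        raise ValueError(f"name is longer than {MAX_LENGTH} characters")
    return name


example = build_name("soilph", "pcr", date(2014, 6, 3), "jc", 2, "csv")
print(example)
```

Having a function refuse names that break the rules is one way to keep a convention applied consistently, though a written reference document that everyone can consult matters just as much.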
References and More Information on Naming Conventions:
North Carolina Dept of Cultural Resources (2008) “Best Practices for File-Naming” http://www.ncdcr.gov/Portals/26/PDF/guidelines/filenaming.pdf
Santaguida, V. (2010) “Folder and File Naming Convention – 10 Rules for Best Practice” http://www.exadox.com/en/articles/file-naming-convention-ten-rules-best-practice
Smith, E. (2011) “Folder Hierarchy Best Practices for Digital Asset Management” http://www.damlearningcenter.com/resources/articles/best-practices-for-folder-organization/
Tomorrow is the opening day for the 6th annual New England Science Boot Camp which is being hosted on the beautiful University of Connecticut campus in Storrs, CT. All the Science and Capstone sessions will be videorecorded and posted on the New England Area Librarians Science Boot Camp YouTube page within a few weeks.
Be sure to follow SBC happenings on Twitter at #sciboot14!
Submitted by guest contributor, Katie Houk, Research & Instruction Librarian at Tufts University Hirsh Health Sciences Library. firstname.lastname@example.org
If you have not read my previous blog, “Ask and You Shall Receive” I recommend you go give it a brief read before diving into this follow-up post.
I recently returned from Savannah, GA, where I was honored to be invited to give a 2-hour interactive presentation based on the first module of the New England Collaborative Data Management Curriculum (NECDMC). Being the new and experimental presenter for the conference committee, I was placed in the very last time slot of the entire conference. As we know from our own conference experiences, this usually doesn’t bode well for attendance, unless you happen to be a celebrity or a hot topic. Fortunately, my champion vocally advocated for attending my session throughout the entire conference and there ended up being about 16 attendees. Since it was advertised as a presentation and not necessarily a workshop, many were surprised that I made them do work, talk to each other, and speak up throughout the session. Thankfully, the general reaction from those who stayed for the majority of the presentation was highly positive and I am enormously thankful for the opportunity.
My overall impressions of the conference and group of people were also very positive. Compared to the librarian conferences I typically attend, it was an intimate experience – I don’t think the number of participants even filled the entire hotel it was hosted in. The intimacy meant that I felt more comfortable attending the business meetings, planning committees and other activities of the society because I saw the same people over and over. However, this is how the phrase “be careful what you wish for” suddenly popped into my head at the end of the program planning committee meeting – after I had committed to joining the society and being a co-convener of a day’s worth of programming – all before I’d even given my presentation! While I am very excited to be able to take on this leadership role, it was kind of surprising how quickly it happened – and that it isn’t even in an organization within my expertise.
Some thoughts I’m mulling over since the conference:
- People are naturally wary of the unknown, but scientists tend to be curious and experimental in nature, so I think they reacted more favorably to my workshop-style presentation than I expected.
- Seize opportunities boldly and enthusiastically. Others seem to respond more openly and helpfully to those who show enthusiasm and passion rather than act too reserved and shy.
- Attending, volunteering and being involved in science or health societies instead of just library associations may be a better way to be seen and heard, as well as become actively involved in promoting the worth of libraries on a national level.
- Scientists seem to know more about what their favorite product companies can do for them than their own institutions – especially librarians (they still have little idea what we do aside from deliver them articles).
- Small conferences seem to be much harder to fund and organize.
- Never be afraid to ask, but be careful what you wish for.
Forwarding the following job announcement from the National Library of Medicine:
The National Library of Medicine (NLM), located on the National Institutes of Health (NIH) campus, in Bethesda, Maryland is recruiting recent library science graduates to fill entry level librarian positions. The positions offer a unique opportunity to work at the world’s largest biomedical library, with a mission of acquiring, organizing, and disseminating the biomedical knowledge for the benefit of the public’s health.
Positions are available in:
Web Site Development and Social Media
- Support site development, or new responsive web design for MedlinePlus
- Contribute to social media initiatives of NLM
- Support development and maintenance of NLM web sites by assisting with content management, usability, accessibility, information architecture, plain language, navigation and mobile access
- Acquire materials for the NLM collection and support the licensing of electronic resources
- Create and maintain serial records which serve as the underlying data for various systems throughout NLM; provide quality assurance of NLM serial records in local and national databases to ensure accurate journal citations in databases such as PubMed and PMC (PubMed Central)
Preservation; Digital Preservation
- Provide proper management, preservation and care of historical and non-historical collections, including monographs, serials, archives, manuscripts, oral histories, prints, photographs, posters, ephemera, motion pictures, video recordings, sound recordings, and other materials
- Participate in digital technology, digital imaging and preservation of analog and digital formats
- Organize consumer health information about diseases, conditions, and wellness, in both English and Spanish through MedlinePlus, the NLM consumer health web site
Data and Literature Management
- Design qualitative and quantitative assessments of tools and processes used in the indexing of biomedical literature
- Provide technical and research support for automated (machine-assisted) indexing initiatives involving biomedical literature
- Assist with data content review and editing of bibliographic citations and Web pages, including HTML or XML tagging and metadata application, to ensure data quality and consistency
- Test and evaluate NLM search systems, including the content in the systems and the interfaces used to access the systems
- Participate in customer service, training and outreach services for NLM systems, such as PubMed
Health Services Research, Public Health and Health Information Technology
- Engage with the public health and health services research communities in order to create and manage health information resources that serve their needs
- Support development of knowledge and information resources to promote interoperable exchange of data and information using standardized vocabularies and codesets, standardized survey tools and assessment instruments, and common data elements and measures
Data Science and Big Data
- Assist with initiatives to enhance access to biomedical data sets resulting from publicly funded research
- Analyze and develop guidance related to emerging policies that promote data sharing and open science
- Participate in projects to engage science communities of practice in standards efforts, including common data elements initiatives
Pay: GS-9 level with a pay rate of $52,146
Benefits: health insurance, and other benefits
Eligibility: Must have a library degree from an accredited school; must have a cumulative GPA of 3.0 or higher; must have graduated on or after 12/27/10 and be a citizen of the United States
Apply for NLM positions through the NIH Pathways for Recent Graduates (Librarian) Program of USAJobs: https://www.usajobs.gov/GetJob/ViewDetails/371420100 from June 2 – June 6, 2014
We encourage the submission of a cover letter identifying the area(s) you are most interested in working in, so that we can determine the area best suited in our organization.
NLM and NIH are dedicated to building a workforce that reflects diversity. NLM hires, promotes, trains, and provides career development based on merit, without regard to race, color, religion, national origin, sex (including gender identity), parental status, marital status, sexual orientation, age, disability, genetic information, or political affiliation.
In addition to an interesting, challenging work environment, NLM has a great location on the campus of the National Institutes of Health in Bethesda, Maryland. It is a short Metro ride from Washington D.C. and a short walk from Bethesda’s thriving restaurant and retail district.
For questions regarding these positions, please contact Kathel Dunn, Associate Fellowship Coordinator, National Library of Medicine, Kathel.email@example.com, ph 301.435.4083
Posted at the request of Susan Cole, Assistant Director for Scholarly Resources & Services, Science Librarian, Colby College.
Social Sciences Data Librarian
Colby College Libraries invites applications for a Social Sciences Data Librarian, a new position in the Scholarly Resources & Services (SRS) group. The Colby Libraries seeks a candidate with knowledge and enthusiasm to raise campus awareness of data literacy (data curation, management, and preservation) with the potential to build library services for faculty and student research. The data librarian may assist faculty with development of data management plans for grant applications, assist with general data stewardship, as well as serve as a resource to library colleagues for data and statistical support. The librarian will serve as liaison primarily to departments in the social sciences or interdisciplinary subject areas, providing information literacy and research instruction, individual consultations, and collection development.
● Graduate degree in library or information science from an accredited institution or equivalent is preferred; alternate education and experience may be considered
● Undergraduate or advanced degree in the social sciences or sciences
● Knowledge of data management, curation, and preservation principles and practices
● Experience teaching information literacy and/or data literacy in an academic library
● Experience with statistical software as well as data from governmental and private
● Familiarity with geospatial analysis
● Excellent analytical, oral and written communication and presentation skills
● Commitment to service in a liberal arts setting
● Commitment to professional development
● Flexibility, creativity, energy, and ability to work in a changing environment, and to work collaboratively as a member of a goal-oriented team
Position open July 1, 2014.
Applicants should address their materials to the chair of the Search Committee, Lisa McDaniels, and send the following electronically in PDF format to Stephanie Frost (firstname.lastname@example.org).
● Cover letter
● Curriculum vitae
● Statement of teaching philosophy
● Graduate transcripts
● Three letters of recommendation
Founded in 1813, Colby is the 12th-oldest private liberal arts college in the country. Highly selective, the college serves 1800 students. The 714-acre Mayflower Hill campus located in central Maine is near inland lakes, an hour from the coast, and three hours from Boston. Waterville and surrounding areas offer a reasonable cost of living in a beautiful setting. The Colby College Libraries are central to scholarship and a key part of the Colby academic program. There are three libraries with a professional staff of 13 librarians. A significant staff reorganization in 2012 resulted in the Libraries being poised for transformational change in the provision of services, instruction, and collections. The mission of the Scholarly Resources & Services group of seven librarians is to support faculty and student research in an innovative environment. Colby librarians are faculty without rank, eligible for sabbaticals and are expected to contribute to creative, scholarly, and professional activities, and to participate in library-wide and campus-wide service. For more information about the Libraries, visit www.colby.edu/library
Colby is an Equal Opportunity/Affirmative Action employer, committed to excellence through
diversity, and strongly encourages applications and nominations of persons of color, women,
and members of other underrepresented groups. For more information about the College,
please visit the Colby Web site: www.colby.edu
Posted on behalf of Chris Eaker, Vice Chair of the DataONE User Group
Registration is now open for the 2014 DataONE Users Group Meeting: http://www.dataone.org/dataone-users-group
In 2009 DataONE was established following a successful application to the “Sustainable Digital Data Preservation and Access Network Partners (DataNet)” Solicitation from NSF. The goal of DataONE was to “enable new science and knowledge creation through universal access to data about life on Earth and the environment that sustains it”. Through DataONE, participants have designed, developed and deployed a robust cyberinfrastructure (CI) with innovative services, and directly engaged and educated a broad stakeholder community. Five years later we have reached the end of that award and are excited to communicate our achievements, technologies and plans for Phase II that will take us through to 2019.
Join the DataONE group in Frisco, CO, July 6-7 to learn more about its planned activities, provide feedback on development, and network with other DataONE users. There will be a number of break-out sessions (including one focused on the DataONE Member Node network), community-led round table discussions, and a poster reception where community members can highlight projects relevant to DataONE. On Monday, July 7, there will be a half-day session on the DMPTool, version 2.
The DataONE Users Group meeting is conveniently co-located with the Summer ESIP meeting, so head out a few days early to enjoy Colorado and learn more about DataONE.
There have been several recent job postings related to e-Science librarianship. Here is a list of the ones I’ve come across. If you would like to disseminate news about job opportunities at your institution on e-Science Community, please contact me at email@example.com.
Boston College, Chestnut Hill, MA: Digital Scholarship Librarian for the Sciences and Social Sciences. Full job description available at http://www.bc.edu/content/bc/libraries/about/jobs/staff.html
Hampshire College, Amherst, MA: Interdisciplinary Science Librarian. Full job description available at https://jobs.hampshire.edu/index.cgi?&JA_m=JASDET&JA_s=344
Columbia University, NY: Data Services & Emerging Technologies Librarian. Full job description available at http://www.arl.org/leadership-recruitment/job-listings/record/a0Id000000EV1plEAD#.U3PRP3ZLqec
Case Western Reserve University, Cleveland, OH: 2 open positions: Digital Research Services Librarian for the Sciences, and Digital Learning & Scholarship Librarian. The full job descriptions and application information are available at http://www.case.edu/finadmin/humres/employment/career.html
Carnegie Mellon, Pittsburgh, PA: Research Data Services Librarian. Full job description available at https://cmu.taleo.net/careersection/2/jobdetail.ftl?job=100763&src=JB-10246
Virginia Tech, Blacksburg, VA: Research Data Consultant. Full job description available at https://listings.jobs.vt.edu/postings/48197
Argonne National Laboratory, Argonne, IL: Science Librarian 1. Full job description available at http://www.aplitrak.com/?adid=bXBzdWxsaXZhbi40NDE2Mi4xMzUyQGFubC5hcGxpdHJhay5jb20
Indiana State University, Terre Haute, IN: Data Curation Librarian. Job description available at http://lib.indstate.edu/about/jobs/DataCurationLibrarian.pdf
University of Florida, Gainesville, FL: Agricultural Sciences and Digital Initiatives Librarian. Full job description available at http://www.uflib.ufl.edu/pers/documents/AgriSciLibrarianSearchComm.pdf
University of California, San Diego, CA: Director of Metadata Services. Full job description available at http://academicaffairs.ucsd.edu/aps/adeo/recruitment/jobDetails.asp?PositionNumber=10-739
If you would like to disseminate news about job opportunities at your institution, let e-Science Community know!
As you might have noticed, the e-Science Portal has been going through a period of transition. We’ve welcomed many new content editors in recent months, and now we’re starting to tackle the design and usability of the portal.
For this, dear readers, we’re asking for your help.
We’d like to invite you to participate in online usability studies of the e-Science portal. It’s a simple, low-intensity way of contributing to the project if you don’t have a lot of time to spare. Not in New England? Still in library school? Not a regular user of the portal? None of that matters! All we’re asking for is about 30 minutes of your time to do some usability testing from the comfort of your own, er, computer.
If you’re interested in participating in the usability studies, we ask that you fill out a (super-short, really!) application by Friday, May 23rd.
Thanks for your consideration, and for your support of the portal.
The upcoming MLA annual meeting is a great time to learn more about e-Science and data. There are some excellent panels and posters, and some group meetings that will allow you to connect with lots of people who share your interest in helping researchers. Be sure to introduce yourself and ask questions. If you need more hints about attending MLA, see the compilation of ideas at the end of this #medlibs chat blog post. And don’t hesitate to introduce yourself to me, Margaret Henderson, if you see me there. I love to talk science and data (and embroidery, if you want a break from the other stuff).
SIG business meetings (Special Interest Groups)
- Informationist – Sunday, May 18, 7-8:55 am
- Molecular Biology and Genomics – Tuesday, May 20, 7-8:55 am
- Institutional Animal Care and Use – Tuesday, May 20, 11:30 am-12:25 pm
Section Business meetings
- Medical Informatics – Tuesday, May 20, 4:30-5:55 pm
- Information Building Blocks: Open Data Initiatives and Trends – Sunday, May 18, 4:30-5:55 pm
- Evolution of the Librarian: New and Changing Roles – Monday, May 19, 10:30-11:55 am (2 talks on data; final speaker covers working with research teams)
- Librarian’s Role in the Translational Science Research Team – Monday, May 19, 2-3:25 pm
- Top Technology Trends VII – Tuesday, May 20, 6-7:30 pm
- Plenary Session 4: MLA ’14 Panel – Professional Identity Reshaped
There are lots of posters on data and science when you search the keywords. In many cases you can view the posters online right now, so you can be ready to ask questions during the poster session. Here is a sample:
Poster Session 1 Time: Sunday, May 18, 3:30 PM – 4:25 PM
(5) A Guide for Open Researcher and Contributor ID (ORCID). by Merle Rosenzweig, Caitlin Kelley, and Mari Monosoff-Richards
(19) An Assessment of Doctoral Biomedical Student Research Data Management Needs. by Kate Thornhill and Lisa Palmer.
Poster Session 2 Time: Monday, May 19, 3:30 PM – 4:25 PM
(82) Developing a Model, Library-Based Research Data Management and Curation Service to Help Scientists Archive and Share Research Data. by Richard Jizba and Rose Fredrick
(116) Improving Data Management in Academic Research: Assessment Results for a Pilot Lab. by Heather Coates
Poster Session 3 Time: Tuesday, May 20, 1:00 PM – 1:55 PM
(152) New Measures of Success: Altmetrics and the Changing Face of Scholarly Impact. by Kimberley R. Barker
(171) Putting the I in Team: Informationists on the Inside. by Linda Hasman and Scott McIntosh
Submitted by: Amanda Whitmire, Assistant Professor / Data Management Specialist at Oregon State University Libraries. firstname.lastname@example.org | @AWhitTwit
In my last blog post, I introduced my co-editor for the data information literacy (DIL) section of the portal, Jake Carlson, in the form of an interview. In order to be fair, I now turn the tables on myself. Jake and I thought it would be useful to share a bit about ourselves and how we came to be data specialists because 1) neither of us ever imagined that we would be doing this, and 2) we want to encourage the widest possible audience to consider joining our community of practice. Maybe you have wondered if you could specialize in data services, but aren’t sure where to start. We are here to tell you – just dive in!
What is your current position and title, and how long have you been at Oregon State?
I am an Assistant Professor / Data Management Specialist, and I’ve been in this position at OSU since September 2012. I’ve actually been at OSU since September 2001, when I started graduate school in the College of Earth, Ocean and Atmospheric Sciences. After I finished my degree, I also took a postdoctoral position here.
What led you to a career in library-based data services?
It was a combination of things, including a hostile funding environment in my former field (oceanography); a desire to stay in our current location (lovely Corvallis, OR); a desire for job stability (no more soft money); and random chance. I happened to be browsing the OSU jobs site and saw the posting for my position. I had never even heard of such a thing before, and was intrigued. After reading the job description and doing some Googling to figure out how academic libraries were getting into the data game, I was even more intrigued. I showed the job posting to a few people, and all of them said something like, “You would be so perfect for that!” I emailed a few colleagues across campus to gauge their perception of the necessity of the position, and all of them thought it was a great idea for the libraries to be pursuing it. After meeting everyone at my job interview, I was convinced: OSU Libraries was where I wanted to start a new career. When I gave up my “old life” I was scared, and really excited. I haven’t regretted it for even a second.
How did you become interested in DIL?
Having been a graduate student and postdoc who had to slog through teaching herself everything from data storage and backup to file-naming conventions and metadata, I had a very tangible sense that there was a need for training and education in data management. When I discovered Jake’s IMLS-funded DIL Project, the goal of the effort made sense to me: compile a set of core competencies that all researchers need in order to manage their data effectively. If we can all agree on what researchers need to know, then we can work together to start building a common set of learning outcomes, teaching materials and talking points, assessment tools, and so on. It seemed important that the community of practitioners, those of us teaching or supporting data management, have a common vocabulary to draw from, and I think that the DIL Project and competencies provide that.
What is your main goal in being an editor for the DIL section of the eScience Portal?
Like Jake, I was interested in being a co-editor of the DIL section of the eScience Portal as a means to stay up to date in the field. We are all extremely busy, and I’m no exception. This position enables me to block off time on my calendar to dedicate toward keeping up to date with the activities and scholarly outcomes in our field. Plus, I get to work with Jake on a regular basis, which is a huge bonus.
I also strongly agree with Jake that being engaged in DIL is a huge opportunity for librarians to broaden and strengthen our relevance to our campus stakeholders. But, I’ve heard time and again that librarians don’t know where to start with getting a foot in the door with data. DIL and the eScience Portal provide a framework and a platform for getting started, so being involved in the Portal in this way made perfect sense.
Something personal: hobbies, favorite vacation destination, favorite food, etc.
I love growing things, especially fruits and vegetables that I can cook, can, share and enjoy. I love to cook, mostly because I really love to eat. Being an Oregonian, it is my duty to be an unapologetic beer snob, and I’ll admit to that. I enjoy all crafts related to fiber and textiles: knitting, crochet, spinning, weaving and quilting. I’m not an oceanographer any more, but I still have plans to weasel my way onto another research cruise someday.
If a researcher came to you tomorrow asking you how to make her data and code openly available, would you recommend GitHub?
More to the point: do you know enough about GitHub and its potential usefulness to computational researchers?
Chances are, you don’t. And you’re not alone.
Most data management librarians who don’t have a computational science or coding background have heard of GitHub, but don’t know enough about it to confidently recommend it to researchers in need of a place to store open source code and open data.
Here are resources that will help you gain a better understanding of what GitHub is, why it’s so useful to computational researchers, and how it works. By the time you finish the final reading, you should be able to set up your own GitHub repository and perform some basic tasks on the site.
1. Learn that GitHub is useful because: version control, collaboration, transparency
“Git/Github: A primer for researchers,” by Carly Strasser on the DataPub blog (May 2014)
Carly–herself once a practicing scientist–breaks down why researchers find GitHub to be so useful. In short, it’s because the platform allows version control, makes collaboration easy, and encourages transparency–a big bonus for researchers who believe in Open Science.
2. Learn that GitHub is useful because: citeability of code and data
“Mozilla Science Lab, GitHub and Figshare team up to fix the citation of code in academia” by Nick Summers on TheNextWeb (March 2014)
Most of us know by now the challenges that face researchers who want to make their code and data openly available. Chief among them is a lack of incentive to do so, and no real way of tracking the impact of the scholarship they open up.
The “Code as research object” initiative aims to solve that problem. Here’s how: researchers who use GitHub to build open source software (OSS) or release Open Data can push that code to Figshare, which will issue a DOI for that package. And any code or data currently in GitHub that a user wants to cite, but that doesn’t already have a DOI, can be assigned one by the end user through a browser bookmarklet developed by Mozilla Science Lab.
DOIs make it easier to cite data and code by establishing a permanent identifier that can be used over time. Citations to the code can be tracked, too, when DOIs are used, thanks to third-party apps like Altmetric.com and Impactstory*.
3. OK, so how does GitHub work?
“GitHub For Beginners: Don’t Get Scared, Get Started” by Lauren Orsini on ReadWrite (September 2013)
In this first post of a two-part series, Lauren delves into more detail than Carly, explaining the basics of how version control systems work and what GitHub does (it’s basically a pretty layer on top of a utilitarian version control system called Git).
4. Wonderful, let’s practice!
“Hello World” on GitHub Guides (May 2014)
Written by the GitHub team for beginners, this how-to guide offers a solid high-level view (complete with animated GIFs!) of what it means to “commit,” “merge,” “pull,” and more on the platform.
“GitHub For Beginners: Commit, Push And Go” by Lauren Orsini on ReadWrite (October 2013)
In part two of her series, Lauren describes how to download the desktop software and get started coding using GitHub. You should have a bit of comfort working on the command line before starting this tutorial, but then again few of us in libraryland nowadays are command line n00bs.
Now are you ready to Git to it?
Sorry, I couldn’t help the pun.
Now that you’ve read these guides and put in some time practicing, you can confidently recommend GitHub and all its awesome functionalities to researchers who want to openly share code and data.
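If the vocabulary in those guides still feels abstract, a toy model may help. The Python sketch below is not how Git or GitHub is actually implemented, and every name in it (TinyRepo, Commit, commit, log, checkout) is hypothetical. It only illustrates the basic idea behind version control: each commit is a saved snapshot of your files plus a pointer to the snapshot that came before it, so you can read back the history or recover an older version.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """One saved snapshot: file contents, a message, and a link to the previous snapshot."""
    files: Dict[str, str]
    message: str
    parent: Optional[str]  # ID of the previous commit, or None for the first one

    @property
    def commit_id(self) -> str:
        # Toy ID derived from the content (an assumption for illustration only).
        payload = repr((sorted(self.files.items()), self.message, self.parent))
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:8]


class TinyRepo:
    """A hypothetical, in-memory stand-in for a repository."""

    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}
        self.head: Optional[str] = None  # ID of the most recent commit

    def commit(self, files: Dict[str, str], message: str) -> str:
        """Save a snapshot of the files and make it the newest point in history."""
        snapshot = Commit(files=dict(files), message=message, parent=self.head)
        self.commits[snapshot.commit_id] = snapshot
        self.head = snapshot.commit_id
        return snapshot.commit_id

    def log(self) -> List[str]:
        """Walk backwards through the parent links, newest message first."""
        messages = []
        current = self.head
        while current is not None:
            snapshot = self.commits[current]
            messages.append(snapshot.message)
            current = snapshot.parent
        return messages

    def checkout(self, commit_id: str) -> Dict[str, str]:
        """Return a copy of the files exactly as they were at the given commit."""
        return dict(self.commits[commit_id].files)


repo = TinyRepo()
first = repo.commit({"analysis.py": "print('v1')"}, "Add analysis script")
second = repo.commit({"analysis.py": "print('v2')"}, "Fix plot labels")
history = repo.log()
old_files = repo.checkout(first)
```

Here history lists the two messages with the newest first, and old_files holds the script as it was at the first commit. Real Git adds branching, merging, and sharing between computers on top of this idea, which is what the tutorials above walk through.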
* Full disclosure: I am employed by the non-profit, Impactstory.org.
Submitted by guest contributor Daina Bouquin, Data & Metadata Services Librarian, Weill Cornell Medical College of Cornell University, email@example.com
In my experience, making metadata part of the conversation is one of the hardest things about being a data librarian. That is to say, working data information literacy and data management into the conversation with anyone who isn’t already concerned with these topics can be incredibly difficult, even more so when words and phrases like “metadata,” “version control,” “data integration,” “semantic data structures,” “repositories,” and even “e-Science” are so foreign. Data librarians need to learn how to navigate both the jargon associated with their field and the need to communicate these issues to their patrons, without alienating them or losing their interest from the start.
As a relatively new data librarian in a biomedical research setting, I have come to understand that I need to strategize how I introduce these topics– especially to those patrons who come to me hunting for analysis resources and aren’t as focused on the other issues inherent in data management and curation. Based on my own experience, these are some of my tips for getting around the jargon and getting things done.
First, leverage the discussions you are already having. Whether it’s talking about bibliographic management (managing metadata associated with literature) or finding literature to support someone’s research interests (talk a bit about their study design if you can), see how you can introduce e-Science topics into the conversation. For example, I was asked to teach a class on using Prezi to the post-docs association at my institution, and I used that opportunity to integrate data visualization and presentation basics into the topics I covered. The students found it valuable, and the next time I taught the Prezi class I spent half the time talking about data vis and the value of having clean, well-managed data to make data visualization simpler and more effective (this was at the request of those organizing the class). I have many more instances like this where reframing the conversation just a little led to a lot of data-related outreach.
Second, try to avoid a lot of jargon in your constructive criticism of a researcher’s current data management practices. Reframing the discussion to be relevant to the researcher is key. It’s very easy to confuse someone who isn’t familiar with the terms and concepts you’re discussing and it can come off as alienating and long-winded. Focus on asking questions and making consultations a constructive conversation– consultations are as much about learning about the researcher’s needs and how best to address them as they are about anything else. Ask them what their plans are if a lead investigator leaves, or if they have a secure backup strategy, or if they would like to explore more options for making their research more efficient– you don’t have to necessarily talk about “metadata” much, instead you can focus on data organization and research replicability which may be more straightforward.
Which brings me to my last point: make the jargon secondary as often as you can. “Planning a data organization and collection strategy” and “discussing workflows and long-term storage” are more straightforward phrases than “data management planning”, “data collection instrument selection”, “data validation and audit capability”, “data citation” and “archiving”. Literacy is incredibly important, but literacy goes way beyond just knowing the vocabulary. Try focusing on the strategies employed in the Data Curation Profiles Toolkit when doing consultations and interviews, and familiarize yourself with the Glossaries of Data Management Terms and the DMPTool so you are sure you can explain what terms and policies mean when you need to. But focus mostly on making positive changes and being approachable. We all know change is hard, so try making it friendlier.
This was an hour very well spent! Yesterday I attended the “Practical Data Management” webinar presented by Dr. Kristin Briney, Data Services Librarian at the Golda Meir Library, University of Wisconsin-Milwaukee. With a PhD in chemistry and years of scientific research experience, Kristin clearly knows firsthand the value of good research data management practices. In the webinar, she succinctly addresses the why, when, what, and how of specific data management practices such as file naming conventions, data documentation, README.txt files, storage, backup, file migration, and planning for future file usability. The recording of the webinar is now available on the ACRL Digital Curation Interest Group’s ALA Connect page at http://connect.ala.org/node/220603
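To make one of those practices concrete, here is a small Python sketch of a file naming convention. It is not taken from the webinar: the pattern (ISO date, project, short description, zero-padded version number) and the function name build_filename are assumptions chosen for illustration, and your own convention may reasonably differ.

```python
import re
from datetime import date


def build_filename(project: str, description: str, version: int,
                   day: date, extension: str) -> str:
    """Build a name like 2014-05-14_project_description_v02.csv (hypothetical pattern)."""

    def slug(text: str) -> str:
        # Lowercase, then replace any run of characters other than
        # letters and digits with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{day.isoformat()}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension.lstrip('.')}"
    )


example = build_filename("Ocean Optics", "CTD cast 3", 2, date(2014, 5, 14), ".csv")
# example is "2014-05-14_ocean-optics_ctd-cast-3_v02.csv"
```

Putting the date first in year-month-day order means files sort chronologically in a folder listing, and avoiding spaces and special characters keeps the names safe across operating systems.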