Personal information management in academia

December 9th, 2010

I’ve been thinking about Personal Information Management (PIM) for the last few weeks as I’ve been wrapping up my semester course work. For my class on Human Information Interactions, I developed a short annotated bibliography for research on how faculty and researchers organize information. I initially had some trouble locating articles that dealt specifically with PIM in academia: most research examines information workers outside of the university. However, there were a handful of useful studies and I thought I would share those in case anyone else needed a good starting point.

Introduction

While scholarly communication has received significant attention from researchers in the field of human information behavior, less attention has been given to how scholars actually organize their files in the pre- and post-publication stages of research. As the world of academic research becomes increasingly digital, networked, and transparent, information scientists should turn their attention to the underlying structures, methodologies, habits, and perceptions of personal archiving in a university environment. Not only is it easier in a digital environment to track the scholarly communication process, but by focusing on these activities, we will see how digital networks are changing the ways scholars create, store, and disseminate information at all stages of research, from planning to publication and beyond.

The field of Personal Information Management (PIM) provides a theoretical and practical framework for discussing the technical details of the research process. Unfortunately, even though there are numerous PIM studies on engineers, travel agencies, financial firms, legal firms, etc., researchers have rarely turned a critical eye upon their own practices. Perhaps, as many of the works below suggest, this is due to the realization that PIM is uniquely tailored by each individual: no one system works for everyone. Those studies that do exist are fairly limited in scope, usually focusing on a single tool (e.g. email, bookmarks) or a single user group (e.g. computer scientists, graduate students). Few studies broadly discuss PIM in a university environment.

The following works were chosen because, in part or in whole, they deal with PIM in a university environment by faculty and researchers. Together, they provide a rough outline of the major concerns for PIM in academia: How much information should be saved? How will it be organized? Who should be responsible for its organization and preservation? What motivations drive information storage? What barriers exist and what are the implications for scholarly communication? For more information on PIM in general, I recommend the works of William Jones and Jaime Teevan, especially their Personal Information Management (Seattle: University of Washington Press, 2007), as well as Peter Williams, Jeremy John, and Ian Rowlands’ 2009 article “The personal curation of digital objects: A lifecycle approach” (Aslib Proceedings, 61(4), 340-363).

Bibliography

Boardman, R. & Sasse, M.A. (2004). “Stuff goes into the computer and doesn’t come out”: A cross-tool study of personal information management. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 583-590). New York: Association for Computing Machinery.

Boardman and Sasse are frequently cited in the literature that exists on how faculty and researchers organize personal information. Their research provides data and a methodology for creating an empirical foundation for PIM. In the study, information about how users in an academic setting organize information was collected across multiple tools (email, files, and bookmarks) and over time. All the participants except for one were from the university community and the majority of the participants were researchers. Using interviews, observations of the work environment, and long-term observations of file management, the authors examined the structures, maintenance, and retrieval preferences of the participants.

This research provides useful information for understanding how some individuals organize information and how they feel about their personal organizational methods. For example, the authors discovered that when users had similar hierarchies of file folders and hierarchies of email folders (termed “overlap”), users did so according to their roles (e.g. teacher) or projects (e.g. research proposal). Additionally, the users who filed items more frequently (daily) and had established organizational systems exhibited a sense of pride at their ability to organize their files over the years, even while simultaneously recognizing flaws in their system. This confirms what other studies have suggested: that the best PIM system is a highly personalized one.

Most importantly, the authors conclude that the categories used to describe information organizers in previous studies, such as Whittaker and Sidner’s “pilers” and “filers” (Whittaker, S. & Sidner, C. (1996). Email overload: Exploring personal information management of email. Proceedings CHI 1996, 276-283.), were not granular enough to describe all users. The participants in this study used multiple PIM strategies across multiple tools and did not fit easily into the previously established categories. This study provides a broader framework, based on previous research but adapted to describe the results of this experiment, for discussing the various PIM strategies.

Foster, N.F. & Gibbons, S. (2005). Understanding faculty to improve content recruitment for institutional repositories. D-Lib Magazine 11(1). Retrieved November 22, 2010, from http://www.dlib.org/dlib/january05/foster/01foster.html

In this year-long study funded by a 2003 Institute of Museum and Library Services grant, Nancy Foster and Susan Gibbons of the University of Rochester River Campus Libraries system sought to understand how faculty manage information. The purpose of their research was to find innovative ways to market and adapt institutional repository (IR) systems to meet faculty needs, ultimately increasing participation. The article’s goal is not to explore PIM, but its findings provide insight into how faculty manage personal information and the information needs of individuals in a research environment.

The authors asked faculty members what they expected from an IR system. The majority of faculty indicated that they wanted tools for authoring, archiving, disseminating, locating, and reading research. They also expressed a desire for tools to control versioning, access information anywhere, and control access by other users. Faculty want their research to be archived with similar materials (related by subject), which suggests how they conceptualize the context of their personal information in a networked environment. In many cases, faculty had already created systems and methods that met these needs without specialized software, for example by emailing files to themselves or to family members as a form of version control. The broad array of responses indicates the wide range of information needs.

The observations and documentation of the faculty at work were based on anthropological participant observation. The data was gathered and analyzed by a diverse team that included reference librarians, computer scientists, an anthropologist, a programmer, a cataloger, and a graphic designer: an aspect that makes the research particularly insightful. The latter half of the article is primarily concerned with how to use this information to market buy-in for IR systems. For the purposes of this bibliography, it illustrates one practical benefit of understanding how faculty organize information.

Gandel, P.B., Katz, R.N. & Metros, S.E. (2004). The “weariness of the flesh”: Reflections on the life of the mind in an era of abundance. EDUCAUSE Review, 39(2), 40-5.

The authors of this commentary on the current state of knowledge management in higher education offer a CIO’s perspective on the future of personal information organization. Gandel, Vice-Provost for Information Services and Dean of University Libraries at the University of Rhode Island; Katz, Vice-President of EDUCAUSE; and Metros, Deputy CIO and Executive Director for eLearning at Ohio State University, combine their extensive experience working with various stakeholders in the information landscape of universities to offer simple solutions to the problem of information abundance and recommend ways to encourage faculty buy-in on institutional repositories.

The authors claim that before the age of the computer, there was a fairly stable equilibrium between the demand for information and the supply of people to teach that information, but that now we live in an era of information abundance. The shift from an industrial to a knowledge economy, the falling cost of computer processors, the rapid adoption of information systems for all aspects of operations, and the growing acceptance of education as a life-long process have all contributed to a growing dependence on information resources in higher education. The future promises to be an age of abundance as individuals discover and utilize their ability to archive any and all aspects of human life in digital form. This includes the production of scholarly works.

The authors suggest that we think of the information landscape in terms of “ecologies” and of individuals as the organisms within that ecosphere. How will we study these organisms? How will we adapt our ecosystem to meet the needs of these individuals? What necessities will this ecosystem require? These questions, though not asked explicitly, are suggested as the authors discuss the roles that administrators, librarians, archivists, and publishers play in this new ecosystem. Gandel, Katz, and Metros conclude by recommending that institutional repositories be easy to use and seamlessly integrated with [faculty] desktop systems to encourage use and provide a stress-free way of incorporating tacit and explicit institutional knowledge into the networked ecosystem of information. Their image of the future calls to mind a great university-run Memex, both individual and institutional in its scope. For the purposes of this bibliography, this article provides an institution-wide perspective on the implications of PIM when integrated into a networked environment.

Kaye, J., Vertesi, J., Avery, S., Dafoe, A., David, S., Onaga, L., Rosero, I., et al. (2006). To have and to hold. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 275-284). New York: Association for Computing Machinery.

Kaye et al. set out to discover how academics at one Ivy League university organize and archive their information and to understand the values inherent in their organizational system. The authors posed a set of questions to 48 academics, took pictures of their information spaces, and qualitatively analyzed the results. They discovered five principal reasons for personal archiving: (1) retrieval, (2) legacy building, (3) resource sharing, (4) fear of loss, and (5) identity construction. While the organizational systems varied from one individual to the next, each system tended to utilize one particular medium (e.g. bookshelves, boxes, file folders, digital bookmarks) that was influenced by the organizer’s principal values (the five stated above) and work lifestyle (e.g. single office vs. multiple office).

Kaye et al.’s study suggests that the need to retrieve information is neither the only nor the most important reason for personal archiving among academics. Additionally, the study states that no one system was significantly more effective at information retrieval than any other. Academics archive material for reasons that are not always rational (e.g. fear of loss) or immediately transparent (e.g. identity construction). Based on this knowledge, system designers should develop information systems that reflect the values inherent in personal archiving. Systems currently in use can be judged according to these values. The authors also suggest studying the relationship between personal identity and the customization of desktops, blogs, and personal websites when designing digital archiving tools.

Marshall, C.C. (2008). From writing and analysis to the repository: Taking the scholars’ perspective on scholarly archiving. In Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 251-260). New York: Association for Computing Machinery.

Catherine Marshall, of Microsoft Research and the Center for the Study of Digital Libraries at Texas A&M University, studied the information behaviors of 14 computer scientists with a significant number of publications in their field in order to understand how they organized information related to their research. The participants in this study were more familiar with computing environments and thus illustrated more complex PIM practices. Marshall used semi-structured, open-ended interviews and observations over the course of six months to gather her data.

The participants in the study typically made an effort to archive six types of materials: (1) paper sources of their publications, (2) digital copies of the same, (3) research codes, (4) data sets and logs, (5) bibliographies of related work, and (6) email. These files existed in various forms of completion, across multiple tools, and among multiple collaborators, illustrating the complex nature of scholarly communication in a digital, networked environment. Of particular note, Marshall discovered that personal archiving is more a side effect of collaboration and publication than a unique, intended process. If files are shared with colleagues via email, then email becomes the tool used for version control and storage. In her words, personal archiving is at once both “opportunistic” and “social.”

This study also raised a number of interesting questions about PIM, including: if two or more authors are collaborating on a single publication, who has the authoritative version? At what point do data sets become archive-worthy: as raw data or after the data has been worked on? Do citations stored in BibTeX files need to be complete or just complete enough to be recognizable? Marshall ends by offering implications for collaborative information management, for personal scholarly archives, and for institutional and disciplinary repositories.

Winget, M.A., Chang, K. & Tibbo, H. (2006). Personal email management on the University Digital Desktop: User behaviors vs. archival best practices. Proceedings of the American Society for Information Science and Technology, 43(1), 1–13.

This article offers a summary of the findings of a three-year project that examined the records management behaviors, particularly email management, of faculty and staff at two North Carolina universities. In-depth interviews were used to collect information about the subjects’ organization methods, retention habits, and concerns about digital information. While the majority of the article discusses the practice of record retention in the legal context of a state-supported university, it does provide some useful data for understanding how faculty and staff at a university manage their email, including how important emails are stored, how emails are organized, and how attachments are stored.

Winget, Chang, and Tibbo discovered a variety of behaviors when it came to how important messages were stored, including saving them to a hard drive or network drive, printing them out, moving them to a sub-folder, flagging them, moving them to another format (e.g. Microsoft Word), and leaving them in the inbox. The majority of respondents (88%) used a folder system to organize emails, most ranging from 11 to 50 folders, and 89% of the respondents saved attachments outside the email program. Like other studies, this shows the variety of methods university faculty and staff use to organize information. While there are certainly strong tendencies to organize information in a particular way, no one system is shown to be more effective than another.

Winget, M.A. & Ramirez, M. (2006). Developing a meaningful digital self-archiving model: Archival theory vs. natural behavior in the Minds of Carolina Research Project. Proceedings of the American Society for Information Science and Technology, 43(1), 1–12.

The goal of this paper was to examine how users, specifically university faculty, might choose to self-archive digital objects. The authors interviewed two faculty members, one scientist and one humanities scholar, and asked them to consider and collect what they would submit to a digital archive and discuss how they would organize it. The two faculty members took two very different approaches. The scientist intentionally excluded lab notebooks (an item the authors considered to be of great academic value), created a lengthy narrative of his career to accompany the materials that he did include, and mostly referenced his publications by providing links to PubMed citations rather than submitting the actual documents themselves. The humanities scholar provided materials related to the development of a single monograph. These included documents that illustrated the creative and iterative process of translation (of poetry) and contextualized the monograph within the scholar’s work and professional connections. For example, he included pre-prints of the work that contained notes from other colleagues.

Winget and Ramirez spend much of the article making recommendations for future developments of digital archives. Concerning personal information management, they discovered that the desire to self-archive at the early stage of one’s career is inhibited by (1) lack of need to reflect and “look back” and (2) the hesitation to publish mistakes, especially in light of a rigorous tenure process. The article also illustrates how two people can choose two radically different approaches to organizing information and deciding what information is worthy of preservation. Additionally, Winget and Ramirez point out that these approaches were contrary to archival best practices.

Zimmerman, E. (2009). PIM @ academia: How e-mail is used by scholars. Online Information Review, 33(1), 22-42.

In this study, Eric Zimmerman, Vice-Provost for Academic Affairs and Director of Research at Interdisciplinary Center Herzlia in Israel, assesses the relationships between email use and scholarly work. While not an original research question, this study, performed decades after the introduction of email, is unique in that it is undertaken at a time when it is understood, based on previous studies, that the vast majority of scholars today are comfortable using email technology.

Zimmerman surveyed 390 faculty members of the humanities, social sciences, and sciences at Bar-Ilan University in Israel. The surveys were distributed via email and paper formats and asked faculty members a number of questions regarding email use, level of comfort, skill level, and the application of email for scholarly communication. Of 17 predefined uses, faculty mostly used email for proposal development, manuscript submission, research collaboration, and participation in committees.

Other important findings include: (1) a negative correlation between age and self-described email skill: older users expressed lower levels of comfort using email; (2) 45% of those surveyed feel overloaded, but almost 65% expressed little difficulty in organizing email; and (3) scholars with more publications tended to use email more frequently. Additionally, Zimmerman found that while respondents view email as a benefit to scholarly work (rated on a Likert scale), when the results are broken down by school, humanities faculty generally rate its benefit lower than social sciences or sciences faculty.

The results of this study suggest that email is perhaps the most widely used tool in the scholarly communication process, serving the processes of communication, collaboration, drafting, peer-review, manuscript submission, versioning, and archiving in the publication process.

I hope this information is helpful. If you have additional resources on Personal Information Management in universities, please share in the comments!