5.1 Introduction: Social Science Data Archives as Cyberinfrastructure

Now officially 30 years old, today’s World Wide Web represents not only a vast informational terrain of e-commerce sites, social media platforms, natural-language search engines, and political action organizations but also an equally vast ecosystem of advertising, surveillance, and activity-tracking systems. For decades now, our aggregate movements over this Web have generated greater amounts and more varied types of quantitative social science data than ever before, resulting in repeated calls for new investment in “cyberinfrastructure” (as a 2005 National Science Foundation report termed it) for enabling the collaborative and interdisciplinary study of the “big data” of social science research (Berman and Brady 2005; Bowker et al. 2010). A 2009 Science article even heralded the coming of a new kind of “computational social science,” years before “machine learning” became a media buzzword (Lazer et al. 2009). But such calls for collecting, preserving, and repurposing huge stores of digital social science data are not entirely new. Similar sentiments decades ago motivated the development of what came to be known by the late 1960s as “social science data archives” or SSDAs (Shankar et al. 2016). SSDAs may be defined as “distributed large-scale information infrastructure[s] that have been influential in shaping the development of the social sciences, quantitative methods, data standards, and international relationships among data institutions in the latter half of the twentieth century” (Eschenfelder et al. 2018). Just as with the World Wide Web, these information infrastructures promised a systematized solution to the problem of making social activity visible and intelligible to social science researchers, while at the same time inevitably relying on the long work hours, creative insights, and collegial collaboration of a hidden network of social data curators (Downey 2014).
This chapter describes how some of these data curators came together in the late 1970s to form a new professional organization called the International Association for Social Science Information Service and Technology, or IASSIST—playfully pronounced as “I assist!”—not only to make their own collective data curation work more visible but also to make the social science data archives themselves more sustainable. Building this professional identity and peer network was a crucial, voluntary, and undervalued labor challenge, essential to advertising the existence, circulating the products, disseminating the best practices, and realizing the value proposition of the SSDAs themselves.

5.2 Social Science Data Archives and the Metadata Crisis of the 1970s

The roots of SSDAs stretch back to the development of punched card, public opinion sample survey research in the 1930s. By the late 1940s, sociologist Paul Lazarsfeld had developed an empirical social science laboratory at Columbia, social psychologist Rensis Likert had established a survey research center at Michigan, and pioneering pollster Elmo Roper had deposited the original punched cards from his firm’s well-known public opinion surveys with Williams College (Hastings 1961; Barton 1979; Featherman 2004). By the late 1950s, data processing in sociology, political science, area studies, and related fields motivated two social scientists, one from the United States and one from Europe, to jointly publish a report through the Columbia University School of Library Science, focused on using library techniques to make the new digital data of social science more widely available. Titled A Library Center of Survey Research Data (1957), authored by York Lucci (Bureau of Applied Social Research at Columbia University, New York) and Stein Rokkan (Institute of Social Research, Oslo), the report urged colleagues not only to preserve today’s quantitative social science data for the future but also to circulate it across a global geography for greater peer review, comparative study, and interdisciplinary combination (Lucci and Rokkan 1957; Nasatir 1973; Scheuch 2003).

Many of the large social science data archives that remain today, such as the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan and the Zentralarchiv für Empirische Sozialforschung (ZA) at the University of Cologne, were founded in the 1960s in the wake of the Lucci and Rokkan report. The “quantitative turn” in the social sciences was by then unavoidable, including new methods for conducting sample surveys, new troves of data in machine-readable form, and new tools for digital processing and model building that could enable what sociologist Herbert Hyman dubbed “secondary analysis”: the efficient revisiting of old research data by new scholars with new questions (Hyman 1991). Advocates of secondary analysis today list essentially the same potential benefits to archiving social science data as they did in the late 1960s: increasing the accessibility of costly (and often publicly funded) data to a wider community of scholars beyond the original research team; enhancing the standardization of data collection and variable construction methods by making individual research choices visible to other scholars; avoiding needless duplication in data-gathering efforts by providing a way to see if a proposed study was truly unique; providing the means for direct critical replication of scientific findings in order to test the robustness of conclusions; bringing together similar data from different states or nations to do comparative research; and offering tools for the education of new undergraduate and graduate students using real-world examples (Bisco 1970; ICPSR 2018). In this way, social science data archives were promoted as material and organizational expressions of supposedly universal norms of scholarship, collegiality, and progress in the social sciences.

The digital data situation of the early 1970s, then, seemed revolutionary to social scientists of the era. An inventory of the largest SSDA organizations operating at the time would include the ICPSR at Michigan; the ZA at Cologne, Germany; the Roper Center at Williams College; the National Opinion Research Center (NORC) at Chicago; the Political Data Program at Yale; the Survey Research Center and the Institute for International Studies at Berkeley; the UK Data Archive in Essex, England; and about a dozen others. As the Director of the International Data Library and Reference Service at UC Berkeley, David Nasatir, put it in a 1973 UNESCO report, “Unlike the conventional library which loans or gives access to original copies of the information desired […] the data archive produces a new copy of the data (or subset of the data) which the user may then take away to his own analytical facilities” (Nasatir 1973). For example, around this time the Roper Center counted “more than 10 million IBM cards of information: raw data from over six thousand studies from twenty-two American suppliers and seventy-one other organizations located in 43 countries” (Bisco 1967). Similarly, the ICPSR had “grown to ninety-six members and expended over a million dollars” per year from a combination of NSF grants, university funds, and its organizational annual membership fees of $2500 apiece (Johnson 2008). The data infrastructure required investment; while “these new social science organizations were naively viewed in the beginning simply as warehouses of information that should be preserved,” according to Ralph Bisco, “data archives necessarily must become complex organizations, with staffs that include specialists in computer operations, programming, and data processing techniques, as well as administrators and professional research personnel” (Bisco 1967).

Such costs seemed a small price to pay for the dream of universal data available on request. Scholarly assessments of SSDAs appeared throughout the 1970s, such as Ralph Bisco’s Data Bases, Computers, and the Social Sciences (1970), David Nasatir’s Data Archives for the Social Sciences (1973), and Howard White’s Reader in Machine-Readable Social Data (1977), telling a celebratory story where new scientific methods and new digital technology had triumphed over the small-scale, idiosyncratic, and isolated interpretive social science investigations of the past. The scientific benefit from the new infrastructure—the value—was attractive: “A new data collection may cost $75,000 or more; a copy of an appropriate computer-processable data collection may cost as little as $5” (Bisco 1970). The only remaining question worth study, as put forth in a 1980 UW-Madison doctoral dissertation, seemed to be why, despite the original suggestions from Lucci and Rokkan back in 1957, academic libraries and archives themselves hadn’t been more centrally involved in this revolution in the 1960s and 1970s—an oversight seemingly addressed by 1982, when the same writer edited a special issue of Library Trends formally introducing the rest of the library profession to these social science data archives (Heim 1980; Heim ed. 1982).

But asking why libraries hadn’t been more involved in the early development of this social science cyberinfrastructure was the wrong question. Even if most libraries were slow to take responsibility for the new “machine-readable data files” (MRDF) of the social science data archives, many librarians themselves were crucial to the development and daily operation of the SSDAs from the start. After all, “data” does not magically appear from a research study in a form that is preservable, transportable, findable, or even understandable by anyone else other than the original gatherer without first expending significant labor to evaluate, organize, clean, classify, catalog, tag, or otherwise describe and transform the data in a way that other people—and other technological systems—can deal with (see for example Plantin 2018). Much of this work results in what information professionals refer to today as “metadata”—information about the myriad books and magazines, reports and theses, music and video, and multimedia and hypermedia of all sorts, which libraries collect, organize, store, and circulate. Research on schemas and strategies for producing metadata has been a staple of library and information science for decades, with the concept enjoying a renaissance in the World Wide Web era as the “Dublin Core” emerged as a sort of universal standard for networked digital data projects, while the SSDA community developed its own standard known as DDI through a later cooperative initiative in the 1990s and 2000s. Yet even in today’s cyberinfrastructure, “Metadata creation is often an unfunded mandate” (Mayernik 2008).

And quality metadata is only part of that mandate; many forms of “metadata knowledge” which library and archive professionals develop, refine, and impart to both depositors and patrons are necessary—such as practices and norms for choosing what materials are preserved and what materials are circulated in the first place, choices that are inevitably related to speculative value judgments about the cost and benefit of being able to use such materials in the future. This kind of metadata “curation” work is never complete, because the societies within which libraries and archives function, and the expectations of the patrons which they serve, are constantly changing. Old categories must be rethought, renamed, or reassigned based on the most recent discoveries of academic scholarship, the newest need for interdisciplinary translation, or the latest citizen claims concerning social justice. By following the production and reproduction of all of this metadata and the knowledge surrounding it—how metadata is made visible and invisible, valued and devalued, rendered in both physical and virtual forms—scholars of technology and society can analytically connect practices of librarianship across vastly different institutional, functional, social, and technological contexts (Downey 2010).

Metadata production and reproduction is thus a huge labor challenge within social science information infrastructures. Scholars of technological information and communication networks have long pointed to the ways that human work—in what has been called “virtual labor,” “digital labor,” “immaterial labor,” or, in a more general sense, “information labor”—is not only absolutely necessary to realizing the value of moving information through such networks but also often the least visible feature of such networks (Blok and Downey eds. 2004). This is especially true of libraries and archives, where the basic curation and circulation challenge that happens largely behind the scenes is as much temporal as geographical: to take an informational product produced in the past (say, a social science data set) and describe and define it using the tools and terminology of the present, all in a way that will presumably make sense to a potential scholar seeking it in the future (Plantin 2018). All such decisions must be made imperfectly: there is never enough time or money or even storage space to perfectly catalog and safely store every possible item available today for every possible audience of tomorrow (Downey 2014).

Conceptualizing the metadata practices of social science data curators as a special kind of information labor—“metadata labor”—reveals that the question of who performs, and who pays for, such labor is still important in today’s “big data” era. Writers in the MIT Press volume World Wide Research: Reshaping the Sciences and Humanities (Dutton and Jeffreys 2010) agreed, “In order for data to be reusable by researchers not involved with the original data collection, representations of the data (such as metadata, data dictionaries, or ontologies) need to be created. This process can be expensive in terms of the person power required to clean and annotate the data, even in the research areas where data curation is semiautomated” (Meyer et al. 2010). Back in 1957, York Lucci had hoped that once basic financial support for a single central social science data archive was secured, the main tasks of “selection and screening of studies” and “the development of appropriate archival procedures” for “developing wide and efficient utilization” would only take “several years” (Lucci 1957). Yet such metadata issues vexed the SSDAs for decades—and still do.

For example, even at the height of the SSDA revolution in the 1970s, actually using these archives was a challenge, even for social scientists who knew of their existence. Unlike research library holdings of books and periodicals, or print archival holdings of documents and correspondence, SSDAs shared no cross-institutional finding aid to reveal the availability of research data on a particular topic—there was no “union catalog” for social science data, either in North America or in Europe. As a result, the same authors who praised the creation of these resources often lamented that they were underutilized and underfunded. Ralph Bisco noted, “Users are now confronted with a time-consuming, inefficient, and costly means for determining what specific data holdings match their immediate research needs. They must first identify which of the several score archival organizations are likely to maintain the kinds of data they might need, and then they must call, visit, or write each of the likely sources” (Bisco 1970). Thus any data archive which wanted its resources used more widely needed to bear the cost and effort of circulating information about those resources itself—just as the ICPSR and the Roper Center did with their regular lists of new holdings sent out to their own lists of paid members. European data archives, which served whole nations and did not have paid members, occasionally advertised holdings descriptions in political science journals or other publications of the scholarly community.

How might these data be more widely and systematically circulated to achieve the transformation in collaborative, secondary, and comparative analysis that social scientists hoped for? It wouldn’t be enough to simply list datasets by name, topic, and date. What scholars looking to extend, replicate, or even critique a past study really needed to know were what kinds of questions were asked of research subjects, with what range of answers, collated into what kind of variable categories, for use with what kind of statistical measurements and breakdowns. Scholars needed detailed information from a study’s “codebook” in order to know whether it would be worthwhile to have an SSDA ship boxes of punched cards or magnetic tapes across the country or across national borders—and whether it would be worthwhile for a new institution to mobilize its own scarce data processing equipment to redigest and reanalyze the study data. As MIT social scientist Ithiel de sola Pool and his colleagues described it at the time, “A codebook is largely meta-data” (Pool et al. 1969).

Two different approaches to handling this data and metadata discovery problem emerged. The first was to set up a funded organization that would have the power to work with all the SSDAs to produce a centralized catalog. This was one of the reasons for the 1962 founding of the US-based Council on Social Science Data Archives (CSSDA or “Council”) (and later the 1976 founding of the similarly named Council of European Social Science Data Archives or CESSDA). From the start, the US-based Council had set an ambitious agenda of rationally organizing the world’s academic social science data archives to avoid “duplication or competition” (Alford 1969). But by 1969 the Council had disbanded. One of the projects left unfinished when the Council folded was a promised inventory of secondary studies and data available through all of its member archives internationally. The group ended up with an incomplete listing of some 2000 studies (many of them from a single archive anyway, the Roper Center), which was “eight feet tall” and too unwieldy to actually publish either on paper or in punched card format for distribution to potential users. A professor from SUNY Stony Brook who helped develop the unreleased inventory, Raymond Maurice, described some of the challenges (Maurice 1969): “They got some money and sent out the inventory format forms to all the universities. There, assistants, some who didn’t know anything about the studies and some who did, filled out the forms.” Maurice said “it’s just like pulling blood out of stone to get a clerk to go through the codebook and tell what is in the data.” Yet surprisingly, in a conference discussion about the failed project, Maurice revealed that the Council feared disseminating even these results would create too much demand for data reuse! “Let’s say I get this inventory out and it goes to five thousand people. All of a sudden we will create a system where people working on Masters’ or PhD theses will be doing data analysis. This may start inundating the data archives” (Maurice 1969). Efforts in Europe to develop a means to share information about data holdings across national borders, language, and cultural barriers, such as those managed by a subgroup of the International Social Science Council (ISSC) called the Standing Committee on Social Science Data Archives (SCSSDA or “Standing Committee”), were also of limited success.

Within the United States, the second approach to archival data discovery, following the failure of the Council to produce a workable union catalog, was to use a newsletter to provide a regular update of new and interesting holdings at participating data archives—prioritizing timely notice over comprehensive coverage. Dubbed “s s data” when it debuted in 1971, this quarterly periodical was edited by social science professor John Kolp at the Laboratory for Political Research of the University of Iowa and funded by a two-year National Science Foundation grant, “to collect and communicate at regular intervals information on data acquired by archives” (Kolp 1971). At first, the foundation funding allowed SSDA metadata to freely circulate (through the mail) to any interested North American users. Two years later, when federal funding expired, Kolp listed “35 archives which contribute regularly to the newsletter and approximately 1200 readers” (Kolp 1973). But even under a subscription model—with individual social scientists, their departments, or, crucially, their college/university libraries paying for this work to continue—the experiment proved financially unsustainable and ended by 1981. Kolp admitted the small staff at Iowa simply could not keep up with all the changes in the data landscape: “it was never possible at any one point in time to know which data archives were in existence and which ones were not” and “the degree of cooperativeness [by the data archives] varied a great deal” (Kolp 1980). But another change was in the readers of s s data itself, which Kolp said “serves the data reference community and not primarily the individual researcher, social scientist, or community planner” (Kolp 1980). In other words, the newsletter had become a resource not for social scientists, but for librarians.

5.3 Linking Data Archives to Data Libraries with Metadata Labor

Who were these North American librarians who subscribed to s s data in the 1970s, mediating the metadata circulation between the archivists at the big SSDAs and the students, staff, and faculty of their local university social science departments and survey research centers? Many of them were traditional reference librarians working with either campus or disciplinary communities, helping their user communities understand how a particular data set had been put to use as part of a published research study. However, an increasing number worked in new units where they interfaced not only with library staff and social science researchers but with data processing and computer center personnel as well, given their role in helping users to acquire sets of punched cards or magnetic tapes full of research data and statistical analysis programs. Little by little, the many local sites for this kind of work came to be known as “social science data libraries”—in contrast to “social science data archives” which in Europe often served an entire nation (e.g., the ZA in Germany), and within the United States existed as annual fee paying member organizations (e.g., Roper, ICPSR). Thus the persons who staffed these decentralized data libraries became referred to as “social science data librarians” or simply “data librarians.”

The University of Wisconsin-Madison offered the first and clearest example of this trend. In September 1966 the UW Data and Program Library Service (DPLS) was founded by sociologists Michael Aiken and David Elesh with six data files—it was later claimed to be the “oldest general [data] archive in the United States.” Three years later, in 1969, it counted “about two hundred” data files, had a budget of about $50,000 per year, had affiliated as a member of both the ICPSR and the Roper data archives, and employed two full-time staff besides its faculty directors: Margaret (Peggy) O’Neill Adams (Assistant Director) and Alice Robbin (Data Librarian). The DPLS was governed by social science faculty but was “not a part of an academic department”; instead, it was administered by a faculty steering committee “made up of representatives of various social science departments,” with funding committed centrally at the college level. Faculty seemed to value this resource, since DPLS staff reported that “we are on the standard tour for all prospective new faculty members in the social sciences.” And data librarian Robbin, who received her own Master’s in library science from UW, reported that “As a graduate student I had used DPLS myself” (Adams et al. 1969b).

The social science data librarians of the DPLS performed three crucial networking services for students, staff, and faculty who might be interested in what the national SSDAs like ICPSR and Roper had to offer:

  1. Downloading national data to local users. Data was acquired for users from the big SSDAs (ICPSR, Roper), as well as from state and federal government producers, on punched card or paper tape through the mail. In most cases, a “cached” copy was also made and kept in the data library before passing the original materials on to the patron, so the library would have a safe version available in case of another request. This meant the DPLS kept on hand “several keypunches—some with interpreting mechanisms, verifiers, cardcounters and sorters, a reproducer, and a card reader to one of the university computers” (Adams et al. 1969b).

  2. Uploading local data to national users. Data libraries became the conduit for any local social science research that might be valuable enough to be submitted to a national archive for permanent storage. And even though data libraries weren’t intending to compete with the national SSDAs in terms of data holdings, some locally produced datasets were archived locally as well, especially if they were of such narrow focus or dubious quality as to not be desired by a national repository. Such data would even be provided to other campuses on occasion, “sent out at cost” (Adams et al. 1969b).

  3. Building an interdisciplinary community of practice. The DPLS promoted the existence of the SSDAs, the availability of government process-produced datasets, and the use of new secondary analysis research techniques, through regular training sessions and individual mentoring interactions. Data librarians also worked with university computer programmers—the librarians were not expected to code solutions themselves, but they needed to be conversant with computing services colleagues in a way that many faculty and graduate students might not have been. And just as a print library or print archive would monitor how their materials are used, the data librarians monitored how their datasets were used—especially noting whenever errors in those datasets were uncovered. Unlike a print library or archive, however, this library made its data users visible to each other so they could learn from each other. “We keep a record of all errors and the next researcher is warned about them. We also keep a record of all the people who use a file. Each subsequent researcher can go back and talk with another to find out what happened to him when he analyzed the data” (Adams et al. 1969b).

What this pioneering social science data library accomplished in practice, then, was to serve as a meeting place for various university constituencies who each had an interest in the new social science quantitative data movement. Social science faculty used the DPLS resources (and sometimes contributed their hand-rolled data to the DPLS as well). Social science graduate students from all over campus found their way to the DPLS as a training and learning opportunity that they would carry forth to new institutions throughout the 1970s when they themselves were later hired as faculty or researchers. Social science software developers became part of the conversation and actually served on staff at the DPLS with the librarians. And the librarians staffing the DPLS were important conduits back to the local library school for training the next generation of information professionals.

This new forum for technological translation between librarians, scholars, students, and programmers was an important development. After all, only a few years earlier, the well-known library historian Jesse Shera, Dean of the School of Library Science at Case Western Reserve University, had edited “a kind of Intelligent Woman’s Guide to Automation in the Library” for the May 1964 Wilson Library Bulletin. He characterized librarian resistance to automation as rooted in “fear” and “anxiety,” arguing that “being traditionally humanistic, librarians doubt their capacity even to utilize anything that is scientifically derived” (Shera 1964). The apparatus of “library automation” would soon be impossible for the “intelligent woman” of the field to ignore; first with time-sharing mainframe technology and later with desktop microcomputer technology, the 1970s would see a widespread discussion about the proper place of computation in this feminized profession. The MARC project to create a standard format for electronic catalog records enabled the Ohio College Library Center (OCLC) project to connect participating library workers’ cataloging computers together over space and time, which in turn inspired the Online Public Access Catalog (OPAC) projects of the early 1980s, to make those networked electronic catalogs directly available to patrons. Social science data librarians were thus at the forefront of a technological discussion that offered some hope that the gender stereotypes of librarianship could finally be overturned, and the occupational status of librarianship could finally be upgraded, through the widespread adoption of digital library technology (Downey 2010).

The Wisconsin DPLS showed that for a data library to be sustainable during this period, its participants needed to constantly adapt, relearn, and retrain—not only because data processing technologies and social investigation methodologies were always changing but also because new individuals were always entering the campus and experiencing their first exposure to this new infrastructure. Local data libraries trained both their own staff and the social science scholars they served, not necessarily on the detailed statistical methods needed to evaluate the data files (which was handled by research faculty and staff), or even on the detailed computational skills needed to manipulate the data files (which was handled by computer center staff), but on two additional forms of metadata knowledge: the awareness of what kind of data of what level of quality was available from what sources in the data archive community, and the norms of eventually resharing one’s own data with this community in the same way that one has benefited from the data of others. In this period, however, data librarians were largely self-taught, as few formal resources existed for professional development. For example, in the United States, ICPSR provided training through its annual meeting of organizational representatives and its summer training program as part of its mission to foster the quantitative social sciences. But during the late 1960s and early 1970s, ICPSR aimed to recruit (mostly male) senior social science faculty members as representatives. At that point, ICPSR saw quantitative social scientists, or their students, as its target audience for developing the ICPSR community; a scan of the 1970 representative list shows only two people not using the title of professor or doctor, and the vast majority of the names are clearly gendered male.

These issues were a constant topic of concern among the academic advocates of SSDAs, given the structures they created to build data awareness and enforce norms of sharing—first the Council and then the Standing Committee—but the social science data librarians were often more effective at addressing these issues. For example, in 1969 the DPLS noted that while many faculty “do not seem particularly prone to disseminating their data or to expending the effort needed so that their data can be deposited in a data library,” they found “the users of the data library, particularly the graduate students, do develop this type of commitment” (Adams et al. 1969a, b). Promoting such awareness and normative behavior was, according to these librarians, “the primary means for keeping these data alive—to put them continually, without delay, and at minimal cost into the hands of potential users beyond their originators” (Adams and Dennis 1970).

5.4 Linking Data Librarians to Each Other Through IASSIST

The Wisconsin DPLS may have been an early case, but the notion grew through the 1970s that enterprising librarians might retool and reskill to help “keep data alive” (and thus realize its value) in the computer age. For example, Judith Rowe, a leading data librarian herself as Associate Director for Social Science User Services at Princeton University Computer Center, argued that the new availability of the US Census on data tapes was the “thin end of the wedge” which would motivate traditional libraries to take machine-readable social science data seriously: “Every ALA national conference since 1972 has had at least one well-attended program on data resources” (Rowe 1974). It was in this environment that the idea for IASSIST first took hold.

While the history of computing and information literature includes some recollections of the origins and impact of IASSIST as written by the participants themselves—from a first anniversary conference paper (Geda 1977) to a 25th anniversary essay (Adams 2007)—there has been no systematic study of this organization or the role it played in the larger history of social science data archives. The following account uses correspondence from the IASSIST papers (UMIP), deposited at the University of Minnesota Twin Cities Libraries, to demonstrate that the survival of SSDAs in this period involved not only changes in the technological infrastructure of social science data—moving from mainframe-oriented magnetic tapes to personal computers and online access—but also profound changes in the spatial, social, technological, and gender division of labor necessary to preserve the value and utility of social science data itself. Whereas the centralized social science data archives of the 1960s had been instituted by prominent social science researchers and built by data processing professionals—both of whom were largely male—the subsequent decentralized social science data libraries of the 1970s and 1980s were developed and sustained largely by college and university librarians—most of whom were female. IASSIST was an organization modeled after the societies, conferences, and journals of the academic social science and technology professions but intended less as a vehicle for disciplinary knowledge production and more as a vehicle for occupational solidarity and professional advancement.

The idea for IASSIST emerged at an international social science research conference in 1974, where, somewhat unusually, both scholars whose data filled the SSDAs and librarians whose metadata “kept the data alive” were invited to attend. With funding provided by International Social Science Council president and Norwegian social scientist Stein Rokkan—the longtime data archives evangelist and coauthor of the original 1957 manifesto calling for data archives to be created in conjunction with academic libraries—the “Conference on Data Archives and Program Library Services” was held in August 1974 in Toronto, in conjunction with the World Congress of Sociology. The conference was largely organized by a leading social scientist in the international SSDA movement: Erwin Scheuch, Director of the Zentralarchiv SSDA at the University of Cologne and chair of the ISSC’s Standing Committee on Social Science Data Archives (the only remaining organizing group for social scientists who worked with data archives, after the folding of the US-based Council in 1969). Normally this might have been another in a long line of conferences where quantitative social scientists gathered to informally network and trade insights on new research methods, new data sets, and new data analysis programs. But as one of the attendees later described, what resulted instead was “a ‘floor-level-uproar’ […] claiming that no activity was going on with the Standing Committee” (Adams 2007; UMIP 1975-03-19 Rowe to Challener; UMIP 1976-06-15 Nielsen).

This “uproar” came about because the 65 attendees at this conference, from 16 different countries, “differed from participants in previous ISSC activities, wherein social science researchers prevailed” (Adams 2007; UMIP 1975-03-19 Rowe to Challener). With the conference title specifically inviting “Program Library Services” attendees, one of the American co-organizers, Wisconsin sociology professor (and DPLS director) Michael Aiken, “sent more than 300 invitations to the conference” that went to “staff members of census agencies, research institutes, and social science data archives (data banks, data libraries)” (Adams 2007; Robbin 1975). As a result, the August 1974 conference included library and computing professionals who made both central data archives and decentralized data libraries work behind the scenes to acquire, clean, transfer, store, search, and deliver that data. It was this shift in the division of labor that made all the difference: “The group identified professionalization and training of data archivists, the people on whose work social science research depended, as the first means of accomplishing their goals” (Adams 2007). Researcher David Nasatir (Berkeley) described the idea as “a grassroots effort among professionals engaged in the daily operations of social science data archives” (UMIP 1975-05-21 Nasatir to Adams). But just who were these professionals? Social scientist Hesung Koh (Yale) termed them “information intermediaries”: “experts who can understand and work well with both information specialists and scholar-users of information […] their specialization involves understanding the interface between these areas, and serving as mediators, helping both information specialists and scholarly users to arrive at more effective reciprocal accommodations, and developing workable structures to accommodate their mutual interests and contributions” (UMIP 1975-05-22 Koh to Adams). In other words, these were social science data librarians. 
And, hoped sociologist Michael Aiken, these librarians would get things done: “an association of professionals in the data archive field who will define projects of mutual concern [and] set up task forces to carry out these objectives” (UMIP 1974-10-08 Aiken to Geda).

However, this new organization would not simply differ from previous social scientist groups like the Council and the Standing Committee in its focus on the professional division of labor. From the founding ad hoc committee, it was clear that it would also differ in terms of a gendered division of labor; key leadership positions were, for the first time, occupied by women. In addition to the organization efforts of Peggy Adams and Alice Robbin of Wisconsin’s DPLS, two more women in particular became central to the story: Carolyn Geda, of Michigan’s ICPSR, was chosen as chair; and Judith Rowe, of Princeton University, was selected as the “US Secretariat” (there was one representative from each major global region). These four would form a leadership team which remained largely in place, with slightly shifting roles, throughout the first decade of the organization’s history (UMIP 1974-12 Geda).

Importantly, Europe at the time had no alternative professional organization for data librarians. The data archiving conversation in Europe was also driven largely by male social scientists, through meetings of political researchers who had become interested in data. For example, at a 1973 meeting of the European Consortium for Political Research data exchange group (which in 1977 morphed into an early formulation of CESSDA), political science attendees led by Rokkan discussed familiar problems like how best to facilitate exchanges of political data within Europe, how to build social networks for data dissemination, and how to effectively advertise data holdings to promote reuse (ECPH DEG 1973).

This dual technical and gendered shift in leadership—from the male social scientists to the female library professionals—was crucial to IASSIST’s mission. Geda and the ad hoc committee imagined that IASSIST would serve the field of social science by actually accomplishing the kinds of metadata tasks whose neglect the social scientists themselves had lamented for years (and which the Council had failed to deliver upon in the 1960s): classification, cataloging, indexing, and all of the standardization required to make that happen. As Danish sociologist Per Nielsen would note later, “The whole IASSIST matter was, in my perception, started in part as a reaction against an authoritarian structure and low-level activity within existing professional settings” (UMIP 1975-11-07 Nielsen). Or as one prospective member put it upon being informed of the new organization, “I hope this organization can be more than the idealistic talk and lousy permanence which seems to characterize most such efforts in this field” (UMIP 1975 anonymous).

5.5 Negotiating the Purpose and Power of IASSIST

However, the question of how to actually organize and fund IASSIST—and which side should hold real power in the organization, the social scientists or the librarians—proved problematic from the beginning. The library-based IASSIST organizers desired real autonomy of action from the faculty-led Standing Committee; however, they also wanted to have a voice in the decision-making of the Standing Committee (and, if possible, benefit from the funding opportunities that the Standing Committee had access to, such as its ties to UNESCO). Aiken wrote to Geda that “many of the people on the Standing Committee were internationally known social scientists, which gave a certain visibility to the committee,” which he clearly thought would be important to building IASSIST’s legitimacy. Thus Aiken proposed that “the task force chairmen of [IASSIST]” should also “occupy the position of task force chairmen in the Standing Committee. From a technical point of view, this would mean that each task force chairman would wear two hats.” Aiken even listed several existing Standing Committee groups headed by prominent social scientists—dealing with computing issues, content analysis, historical data, and archival development in the “Third World”—which he imagined would simply become IASSIST task forces themselves. Aiken believed this was crucial for IASSIST “to have legitimacy in the international social science community” (UMIP 1974-10-08 Aiken to Geda).

But Aiken’s proposal would have ensured that the academic “chairmen” of those Standing Committee task forces, if mirrored in the IASSIST task force structure as well, had a clear place of power and control within the supposedly professional IASSIST from the start. Fellow social scientist David Nasatir from the Berkeley data archive (who had published a UNESCO report on SSDAs in 1973) was clear about this in a handwritten letter to Geda sent shortly after the Toronto meeting. Nasatir admitted to Geda, “I’m sorry to say that somewhere […] some developments took place that might be misinterpreted as an exploitative ripoff”—meaning that if the IASSIST action groups were chaired by Standing Committee men and, as Nasatir noted, “no women?!” that would clearly be problematic. Nasatir apologized for this and offered his own advice to Geda: that IASSIST should be organized “first as an independent organization, then an alliance (from strength) with the Standing Committee.” In other words, whether it ever interfaced with the social scientists or not, “IASSIST should grow and thrive on its own (considerable) merits” (UMIP 1974-09-11 Nasatir to Geda).

Carolyn Geda responded tactfully in her mailing to the ad hoc organizing committee in December 1974, making her case for the structure and purpose of the new organization (UMIP 1975-03-05 Geda to Nielsen). Looming behind this pitch was the recent memory of the failed US Council on Social Science Data Archives. Geda did not want IASSIST to follow the Council’s path—winning great funding and attention upon its founding, but unable to produce any changes or systems of lasting impact—but at the same time she knew that the success of IASSIST would be measured against its ability to achieve some of the goals originally imagined (and abandoned) by the Council. Geda’s pitch was, as a consequence, carefully crafted. It started with a summary history of where the field had been—“a list of some of the major archival meetings occurring between 1962–1969”—a time period exactly spanning the previous Council’s existence. The implication was clear: Much has been said but little has been done. Next it provided some examples of organization constitutions and bylaws, again using the Council as an example. Third came Geda’s summary of the Toronto meeting, including a tentative list of task forces. Fourth came a list of suggested journals, newsletters, and individuals to contact in drumming up membership and publicity for the new organization. A questionnaire that Geda proposed sending to any prospective members rounded out the packet. Importantly, both the questionnaire and the wide-ranging mailing list indicated that this was to be a “bottom up” organization of working professionals across a wide range of social science data production, storage, and use sites—and not simply a “top down” organization of prominent social science faculty (UMIP 1974-12 Geda).

This grassroots emphasis was clear from Geda’s list of tentative IASSIST task forces, which differed substantially from the list of Standing Committee task forces that Aiken had proposed a month before. Geda’s task forces were designed to “improve the quality of research data, improve data archive and data library management and services, increase the amount of use and enhance the quality of the use of data for secondary analysis and aid communications among data archives people”—with “data archives people” broadly defined to include not just researchers, but students, policymakers, and especially, library professionals (UMIP 1974-10 Geda). In the end she proposed eight task forces, each charged with a particular technical deliverable (some more easily attainable than others) (1974-12 Geda C):

  1. Data Archive Registry: “produce a directory containing the names, addresses, phone numbers, types of holding, dissemination policies, etc., of existing data archives and data libraries throughout the world”
  2. Data Archive Development: prepare “a bibliography of all existing materials, including fugitive papers, on the establishment and administration of data archives and data libraries”
  3. Data Archive Policies: “establish guidelines in such areas as acquisition, ownership, diffusion, dearchiving, and confidentiality”
  4. Data Documentation: generate “minimum standards or guidelines of documentation, e.g., directories of holding, library catalog cards, data abstracts, and codebooks”
  5. Classification and Inventory: “deal with major information schemes such as library cataloging and bibliographic information systems” to incorporate data sets into these tools
  6. Process Produced Data: consider the “special problems inherent in the acquisition, documentation and use of data not initially collected for research purposes,” such as US government census or budget data
  7. Professionalization of Data Archivists: recommend best practices for “job descriptions, job titles, training programs, aptitude tests, etc., which relate to the functions people now perform or could perform in data archives or data libraries”
  8. Extension of Traditional Library Reference Services: push to alter normal library practice “to include information available in machine-readable form”

Only one of these eight task forces, “Data Archive Development,” matched Aiken’s original list of topics more appropriate to social science researchers.

Besides sending this summary to everyone on the organizing committee (including her librarian colleagues Rowe and Robbin, as well as professors Aiken and Nasatir), Geda copied this initial sketch for IASSIST directly to European faculty Stein Rokkan and Erwin Scheuch at the Standing Committee. Rather than accepting Aiken’s suggestion that IASSIST be woven into the existing task force structure of the Standing Committee, Geda left the question of affiliation open to discussion (even suggesting that “an appropriate library association” might be a better partner than the social scientists’ group). She concluded by suggesting that the name IASSIST might be too closely tied to the social sciences—“too exclusive”—and invited input on alternatives (UMIP 1974-12 Geda).

Replies to Geda’s outline for IASSIST rolled in during the early months of 1975. Nasatir responded that Geda had done “a splendid job of putting things together” (UMIP 1975-01-14 Nasatir to Geda). John McCarthy, the new head of the Berkeley International Data Library and Reference Service (from where David Nasatir had just stepped down), answered that “the idea […] is an excellent one,” addressing “the need for greater communication between Data Archives and the people who run them.” McCarthy even offered that “The problem is that at this point some archives are run by librarians, while others are run by professional social scientists” (UMIP 1975-01-02 McCarthy to Geda). The new director of the SSRC Survey Archive at the University of Essex, Ivor Crewe, replied “I am all in favour. […] International agreement on cataloguing conventions, the exchange of administrative and technical knowhow, the compilation of a register, minimum standards of data documentation etc all need to be done urgently” (UMIP 1975-02-26 Crewe to Geda). And Hesung Koh (Yale) agreed that: “unless there are efficient information intermediaries who can effectively explore and utilize these highly developed information systems and aid the users, it may become impossible for some researchers and practitioners to benefit from these complicated tools” (UMIP 1975-05-22 Koh to Adams).

Finances were the greatest limitation that the prospective organization faced. Geda lamented that even holding a meeting of the organizing committee would “require nearly $5000 in travel alone” given the international scope of the group. And as for a journal, “I think this is almost impossible. I’m told that a publisher expects a circulation of 1200 individuals at a minimum and an additional $5000” (UMIP 1975-01-08 Geda to Rowe). Questions like these forced Geda, Rowe, and Robbin to better clarify what IASSIST was designed to accomplish—as Robbin put it, “any organization exists to be more than a social agency for its membership,” so “a principal reason for establishing this international organization is to solve problems” (UMIP 1975-01-08 Robbin). Robbin was unflinching in her assessment of how well the academic social scientists had done on solving those problems over the previous decade, through organizations like the Council: “Scholars of the international social science community have done an admirable job of delineating the problems. But, it is obvious that problems described by scholars of the research community more than 10 years ago are the same problems which archive/library personnel continue to face on a daily basis” (UMIP 1975-01-08 Robbin). Thus, funding for communication, whether through meetings or newsletters, was imperative if (in these pre-email days) such action was to be organized and carried out: “While scholars have had multiple mechanisms for expressing quite clearly their needs because their communication networks are well established, personnel of the repositories have had limited access to each other, largely because communication networks in the form of journals and organizations do not exist. 
In my opinion, the strongest raison d’etre for the establishment of an international organization composed of individuals and institutions engaged in data repository activities (taken in their broadest sense) is to organize a more rational and efficient means for dissemination of information” (UMIP 1975-01-08 Robbin). Thus a funding structure was created to maximize participation from information professionals, with individual dues set at only $15 (UMIP 1976-01 Rowe).

During this time Geda, Rowe, and Robbin faced considerable communication challenges themselves. They worked together at a distance to pull IASSIST together, from Michigan, Princeton, and Wisconsin. In turn, Geda relied more and more on advice from Nasatir as a friendly member of the academic social science community (UMIP 1975-01-10 Geda to Nasatir). Together, these four spent considerable effort discussing Scheuch’s response to Geda’s IASSIST outline, since he represented the official voice of the social scientists on the Standing Committee (UMIP 1975-01-08 Geda to Rowe). Scheuch did agree on the basic idea of IASSIST as “an independent organization based on individual membership,” where people might “participate regardless of their place in official hierarchies.” However, Scheuch saw IASSIST not as a network of professional experts coming together to solve longstanding data archive problems that social scientists had ignored but as a network of technical service providers coming together to better support the newest cooperative and comparative research schemes of the social scientists. Scheuch was blunt in this assessment; he felt that “Inter-archive cooperation as far as organizations are concerned appears to function satisfactorily,” but that IASSIST task forces could be mobilized by the Standing Committee “for an integrated program of research” (UMIP 1975-01-02 Scheuch to Geda). Scheuch’s view was clear: IASSIST might be nice for its members, but it would never replace the agenda of the social scientists themselves.

Reaction from the IASSIST organizers was swift; Geda called it “our first confrontation,” and Robbin urged a quick reply to Scheuch “so that he understands that we are not as naive as he thinks we are.” Robbin’s view was “it is clear that he wants to maintain the power in his group’s hands” (UMIP 1975-01-10 Geda to Nasatir; UMIP 1975-01-13 Robbin to Geda and Rowe). Geda read Scheuch’s response as a turf battle over funding: “Somehow, I feel that he has concluded that we are or will attempt to fund ourselves at his expense” (UMIP 1975-03-05 Geda to Nielsen). David Nasatir feared that this burgeoning split between the Standing Committee and IASSIST would be fatal; he wrote at the time, “Coordinating the needs of the international research community for machine readable social science data with the sources of such data and the repositories of it is a task that currently is not being accomplished, in part, due to potential conflicts between the partisans of IASSIST and those of the [Standing Committee]” (UMIP 1975-05-21 Nasatir to Adams). He agreed with Geda, though, that while Scheuch “acknowledges the basic reason for IASSIST […] to provide a basis for direct communication among practicing data archivists,” Scheuch “fails, however, to pick up on the other major purpose—i.e. to provide a basis for professional identity, growth and recognition.” He also agreed that “It simply isn’t true that [the Standing Committee] can do what IASSIST proposes to do at the level it proposes.” Thus Nasatir offered encouragement to the IASSIST organizers: “Shuech is afraid, and I think rightly so, that if IASSIST working groups get going, [the Standing Committee] will be shown up as the relatively do-nothing group that it has been.” Nasatir advised Geda, “don’t be put off by Erwin’s letter. Rather, let’s keep rolling with the effort to get IASSIST going as a stand alone organization—open to those who want it” (UMIP 1975-01-20 Nasatir to Geda).

After so much back-and-forth over the mail, an in-person meeting was necessary to finalize many of the ideas. Piggybacking on the annual gathering of the European Consortium for Political Research in London, Geda and Rowe pulled together as many of the IASSIST organizers as they could in April 1975. Many of the European archive leaders like Per Nielsen (Denmark), Cees Middendorp (Amsterdam), Philippe Laurent (Belgium), and Ivor Crewe (England) attended; however, rather than the contentious Erwin Scheuch of the ISSC Standing Committee, Stein Rokkan, at the time head of the ISSC himself, was there to give his blessing to the project, confirming that “informal relationships were quite satisfactory at this point and that [IASSIST] would work closely with the ISSC and [the Standing Committee]”—without having to harmonize each other’s task forces under a single leadership (UMIP 1975-03-19 Rowe to Challener; UMIP 1975-08-01 Geda).

5.6 Setting a Metadata Labor Agenda

By September 1975, about a year after the idea had been hatched, the official IASSIST announcement was ready to go out to a mailing list of about 1000 prospects (over half of whom were in the United States) (UMIP 1975-08-01 Geda; UMIP 1975-10-02 Rowe). The pitch, for “an international association for individuals managing, operating, or utilizing machine-readable data archives, data libraries, and program libraries,” included a several-page-long historical background, written by Robbin. Her summary located the start of the data archive movement with the 1957 manifesto from Stein Rokkan and York Lucci; called out the unique contribution of the Roper Center, Zentralarchiv, and ICPSR data archives; and even cited the value of the short-lived Council in the United States and the continuing Standing Committee in Europe. But most importantly, Robbin described the main challenge for SSDA success as the split between the academic world and the practitioner world, with academics interested in “analytic problems of the data base” and practitioners interested in “facilitated access to […] the data base.” IASSIST was intended to address the latter problem, because “Although social science scholars had developed multiple mechanisms for expressing their needs through historically established communications networks, personnel of the data base repositories had limited access to each other” (UMIP 1975-07-08 Robbin to Rowe; UMIP 1975-09-01 Rowe). One of the main reasons for such a comprehensive announcement was to convince prospective members that IASSIST would not duplicate the mistakes (or inaction) of the now-defunct Council: “there is a group of people who will continue to invest their time to increase the membership, fulfill the objectives and see that it is sustained,” wrote Geda (UMIP 1975-03-05 Geda to Nielsen). 
Another reason was to make clear that this new organization was open widely—as Rowe put it earlier that year, “First of all, our interests are not limited to survey data and second of all, we would hope to attract data library as well as data archive members” (UMIP 1975-03-18 Rowe to Geda).

One important aspect of the IASSIST proposal had changed over the 6 months of planning: the “task force” structure. The agenda for action moved away from the Standing Committee’s academic priorities (like producing computational tools and harmonizing historical data across different countries) and toward the IASSIST professionals’ more pragmatic, service-oriented priorities (like harmonizing classification, cataloging, indexing, and other metadata standards across data archives). Even the language changed: In March, Geda wrote that the term “Committees of Correspondence” should replace the term “task force,” because “the [IASSIST] Committees will solve problems, not formulate policy” (UMIP 1975-03-05 Geda to Nielsen). Although this “Committees of Correspondence” language was used in the draft IASSIST constitution, it was finally replaced at the April 1975 meeting in London with the simpler and more direct term, “action group” (UMIP 1975-03-28 draft IASSIST constitution). And each “action group” was chartered with a specific published, deliverable product, since, as Geda put it, “If we could not readily conceptualize a relevant product, given restricted resources, we deleted the area from consideration” (UMIP 1975-08-01 Geda).

Not all of the originally proposed topics for action groups made it into the final list. For example, the training function for information intermediaries had been the subject of not one but two of the initial IASSIST “task force” ideas—the “Professionalization of Data Archivists” group, which was to recommend best practices for “job descriptions, job titles, training programs, aptitude tests, etc., which relate to the functions people now perform or could perform in data archives or data libraries,” and the “Extension of Traditional Library Reference Services” group, intended to push to alter normal library practice “to include information available in machine-readable form” (UMIP 1974-12 Geda). However, those two goals—the ones most important to the professional development of data librarians themselves—were dropped from the final list of six “action groups” that the IASSIST organizing committee (made up of both data librarians and academic social scientists) eventually agreed upon. The final six action groups (and their chairs) were:

  1. Data Archive Registry (David Nasatir, American University): “A [machine-readable] directory containing names, addresses, types of holdings, and dissemination policies of existing data archives and libraries throughout the world will be compiled.” This group argued that “the lack of a controlled vocabulary for descriptions of categories or holdings of data, was a major factor in the lack of good subject access to data archives” (IN 1:3 1977).
  2. Data Acquisition (Donald Harrison, National Archives): “Recommended procedures for the acquisition of data would be developed with the intent of assisting researchers at critical points during the data collection process to ensure and promote the transfer of high quality data to the public domain for further academic investigation” (IN 1:1 1977).
  3. Data Documentation (John Grasso, West Virginia University): “Standards will be developed for ‘simple background variables’ used in surveys, i.e., educational level, age, head of household, as well as constructs such as job satisfaction, anomia, political interest (i.e., to be measured by a scale or index). Thus, the work of this group will be closely linked to that which is going on regarding the development of social indicators. The codes will be incorporated into source books to provide researchers with a resource tool for coding and organizing their data consistently” (IN 1:1 1977).
  4. Classification (Sue Dodd, University of North Carolina Chapel Hill): “the library cataloguing of machine-readable data files in public multi-media catalogues” (IN 1:1 1977). This was necessary because “there are no rules or a standard format for citing data in the published literature,” making it impossible “to identify a data file, or its source, or data elements on which the published analysis has relied” (IN 1:2 1977).
  5. Process-Produced Data (Michael Leavitt, Brookings Institution): This group would study government-produced data and “the merging of such data with data from sample surveys” (IN 1:1 1977). This group noted that “We developed a listing of minimally required elements of information, which we hope each entry in a Catalogue of Data Files would provide” (IN 1:3 1977).
  6. Data Archive Development (Alice Robbin, UW-Madison): This group promised “A procedures manual consolidating current archival organizational, administrative, and personnel structures, procedures, and policies,” as well as workshops “to provide professional training in the skills necessary for effective operation of a data library, data archive or social science information center” (IN 1:1 1977). This final action group was, in fact, meant to cover the area of the abandoned “Professionalization of Data Archivists” group. They would soon announce development of “A Guide to Providing Social Science Data Services” (IN 1:2 1977).

So out of the six action groups, half were chaired by academic social science researchers, and half were chaired by data archivists and data librarians (Robbin, Dodd, and Harrison). But one thing was clear: Each of the action groups was meant to address a clear metadata challenge that could only be met by drawing on the expertise of the archival and library professions.

5.7 “Off We Go!”: From Action Groups to Mutual Assistance

With the action group agenda set, the new organization was finally launched—“OFF WE GO!” enthused Per Nielsen in November 1975 (UMIP 1975-11-07 Nielsen). “A series of IASSIST meetings were held on August 16-20, 1976, in conjunction with the International Political Science Association World Congress in Edinburgh, Scotland”; this is where IASSIST was formally established (IASSIST 1:1 1977 3). But the group still faced the daunting challenge of building, and sustaining, their new vision. The initial mailing only had a 20% response rate, which worked out to about 130 US members (UMIP 1975-11-26 Rowe; UMIP 1976-01 Rowe). As Per Nielsen wrote, “The main problem of the IASSIST will be that of finding people with an enthusiasm and energy which is far above normal standards!” (UMIP 1975-11-07 Nielsen). And from an international perspective, the group also faced challenges related to language, long-distance communications (pre-email!), and travel costs.

The first IASSIST newsletter—twenty pages, hand-typed with a do-it-yourself “zine” aesthetic—went out to members in 1977, almost 3 years after the original idea had been raised in Toronto. It proudly declared that IASSIST represented “an international cooperative effort on the part of individuals managing, operating or utilizing machine-readable data archives, data libraries and data services.” In February, the group held its first North American meeting (in Cocoa Beach, Florida). By the time the second issue of the newsletter went out, the organizers could tout their first successes: “The United States response to the first IASSIST Newsletter has been very gratifying. Fifty people are already on the list of paid members and 22 attended the February conference.” Individuals could become members for $15/year, sending payment directly to Judith Rowe at the Princeton University Computer Center (IN 1:1 1977; IN 1:2 1977).

But the origin story for IASSIST doesn’t end with the first mailing and the first gathering. Through the end of the 1970s—a period of economic “stagflation,” taxpayer revolts that cut funding to public agencies like libraries, and rapid technological development in both microcomputers and dial-up networking—IASSIST leaders and members worked to figure out how to bring their carefully designed plan into practice. The landscape for SSDAs by this time was still troubled in the United States and Europe. The newsletter s s data was in its last years of publication, starved for subscribers and frustrated with the lack of collaboration from even some of the largest SSDAs. (Subscriptions from IASSIST members would help it last until 1981.) And similar issues affected IASSIST’s own membership; Per Nielsen, IASSIST co-chair, mentioned his own fundraising problems at his home institution, the Danish Data Archive: “In DDA, we are still fighting for our lives, and that fight takes a lot of time and energy; if we fail (i.e. get no funds or get conditions we can’t accept) we shall be out of business as of April, 1977,” he admitted (UMIP 1976-07-06 Nielsen to Robbin). In this fraught environment, IASSIST hoped it could help—but it also demanded significant time and effort among its leaders and members to do so.

The first order of business was following through on the promises of the “action groups.” Through 1976 and 1977, progress on the US action group agenda varied—and several of the most crucial action groups effectively folded as their original leaders bailed out. In January 1977, David Nasatir, newly employed at California State College where he was now “without support for the activities germane to IASSIST,” pled “Mea culpa!” revealing that “I have not kept up my end of the IASSIST activities, and, as a matter of fact, have done nothing in this regard since last August” (UMIP 1977-01-13 Nasatir to Rowe). Nasatir soon turned over leadership of the Data Archive Registry action group to Iowa’s John Kolp, editor of the s s data journal (UMIP 1977-03-16 Nasatir to Rowe). But Kolp himself bailed out of the role less than a year later, noting that his own data laboratory at Iowa was under threat of budget elimination: “We are currently on a temporary budget from the University until the Dean makes a decision,” and “the University will no longer support any conference trips for individuals in my type of position” (UMIP 1977-11-15 Kolp to Rowe).

The lack of follow-through on the action groups persisted as a problem for years—precisely because IASSIST was composed of both academics and professionals working to keep their own institutions (and careers) afloat in tough budget times; these same people unsurprisingly were forced to let the ambitious collaborative projects of IASSIST fall to the side in favor of local crises. In 1980, William Gammell of the Roper Center resigned as coordinator of the Data Organization and Management action group saying “the demands of my position—from helping develop proposals to making sure a user’s dataset was copied correctly—are such that I can not do a good job as DOMAG Coordinator” (UMIP 1980-02-11 Gammell to Rowe). In the summer of 1983, the administrative committee reported that the Inventory of Data Archives and Libraries project “had been started several years ago, however it had not progressed very far” (UMIP 1983-05-18 IAC). Even IASSIST stalwart Alice Robbin fell short of her action group promises. By January 1977, Robbin’s Data Archive Development group had produced a draft outline for “A Guide to Data Archive Organization, Management and Servicing” (UMIP 1977-01-05 Robbin to Rowe). But less than a year later in October 1977, she wrote to one of her action group colleagues that “I find it impossible to work 200%; thus, IASSIST has gone by the wayside, in terms of tangible output,” and she decided to resign as coordinator of the Data Archive Development action group (UMIP 1977-10-06 Robbin to Ruus).

Writing decades later, Peggy Adams reflected that “Perhaps the best known and most influential product to emerge from the early IASSIST years was the Working Manual for Cataloging Machine-Readable Data Files, prepared by Sue A. Dodd, the U.S. chair of the Classification Action Group” (Adams 2007). However, this was a project that Dodd herself had already been engaged in for years—in a sense it was merely rebranded under the IASSIST banner. Her success, though, illustrates that none of the IASSIST action groups ever attempted to reproduce the failed “union catalog” effort of the Council from a decade before—nor was there a separate “data archive updates” group to try to take over from the now-defunct s s data. Arguably via the IASSIST social infrastructure, the community was able to achieve a standard for cataloging of data holdings in library catalogs. The process of discovering resources at the data archives would be handled through library catalogs: by bringing standardization to the metadata describing those archives’ holdings (the Data Documentation and Classification groups) and then cataloging these, along with the data archives themselves (the Data Archive Registry group) in traditional library cataloging systems (which were themselves moving at that time to new online, networked infrastructures). And ensuring this all worked smoothly would be the responsibility of the two training groups—one for social science researchers themselves (the Data Acquisitions group) and one for the new data librarians who would be tasked to work with them (the Data Archive Development group). This was a classic library science response: create metadata standards, teach those metadata standards, and enforce those metadata standards in order to keep information circulating.

Yet no matter what the plan of the action groups may have been at the start, many founding IASSIST members—especially those with careers in librarianship—continued to push for professional development as a key focus of the new organization (the one focus that had been deleted from the initial list of six action groups). For example, in 1976, as one of the first official IASSIST-sponsored activities, Robbin, Rowe, and Geda organized a 2-week summer workshop on “Data Management, Data Library Activities, and Data User Services” at the ICPSR, which was a big departure from this SSDA’s normal summer program offerings directed at social science faculty and graduate students. With 32 attendees, including “many individuals from ‘traditional’ libraries,” Robbin reported that “the excellent response of the participants has led to the instituting of the Workshop as a permanent part of the ICPSR Summer Program” (UMIP 1976-09-20 Robbin). Sue Dodd wrote to Geda that such an ongoing workshop was a great idea: “I have spoken with three librarians recently […] who have inherited data from Political Science Departments or from one active faculty member, and which have now grown too large to be maintained by one person. […] I personally feel that librarians could be trained to handle data files and thus the workshop would be of tremendous value. I also feel that a background in the Social Science, plus some exposure to quantitative research methods is more important than a background in computer science” (UMIP 1976-05-07 Dodd to Geda). Soon such activities were officially folded back into the “Data Archive Development” action group, even splitting out a new action group specifically on “Data Organization and Management” which would focus on “better teaching tools, workshops, etc., to teach people appropriate techniques for data cleaning organization and management” (UMIP 1976-09-20 Rowe).

By the end of 1976, Alice Robbin was urging her IASSIST action group colleagues that their success on individual IASSIST projects would translate directly into greater status and visibility for their new profession as a whole. “I realize that for many years we have viewed this type of job as a temporary/transient one, in which we spend a few years before we move on to another job. It is probably difficult for a number of us to think in terms of the professionalization of this area—but it has arrived. IASSIST was created as a communications mechanism for data services people, just as the American Sociological Association and American Political Science Association were created for the sociologists and political scientists. I hope you will consider yourself a member of this profession” (UMIP 1976-11-23 Robbin). So IASSIST seemed to be succeeding in its role of bringing together a correspondence network of data librarians and data archivists, especially in the United States—uniting the “information intermediaries” into a community of practice with a growing professional identity. But this success also drew IASSIST farther away from the social science researchers who initially supported it.

The heavy (and evidently unrealistic) workload that IASSIST had set for itself in its action groups motivated subtle changes in IASSIST’s focus as it entered the 1980s. The founding documents had specified a 3-year term for the group’s president, so given a rough start date of 1976 for IASSIST when Geda began her time in the role, an election was held for a new president for the 1979–1982 period. Upon Alice Robbin’s election as the new president of IASSIST, her administrative committee meeting of May 1979 detailed new ideas to address the “loss of direction in the action groups.” The first change was to formally prioritize and improve the communication function—after all, the newsletter was the largest expense of the organization, and “the only visible manifestation of the association.” The second change was to charter a subcommittee for “the maintenance of the present membership and the expansion of membership” (UMIP 1979-05-06 IAC). Action groups were still part of the IASSIST mission, but more as a way to attract members who were already engaged in productive activities that could be shared with a wider audience—especially through affordable but vibrant conferences. (This was the model of their one clearly successful action group, which had brought Sue Dodd’s work under the IASSIST umbrella and given it wider exposure.) Wrote Robbin in July 1979, “We must think of some ways to improve the activities and participation of the IASSIST membership or this organization is going to go down the tubes. One conference a year is not enough to sustain an organization” (UMIP 1979-07-30 Robbin). So while action groups would nominally remain, IASSIST was going to concentrate more on reaching and representing its front-line data archive and library members, relying on its newsletter and conference to showcase and, hopefully, inspire real-world progress in data services.

This vision for IASSIST was lauded by Robbin in her October 1979 letter to the administrative committee, where she described the way IASSIST was seen by participants in a UNESCO-sponsored roundtable on social science information held at the University of Minnesota: “it appeared to all participants […] that IASSIST was one of the two organizations represented there who was actually doing something about social science information problems. We are unaffiliated with any government; we are composed of individuals; and we are working in the ‘trenches’ (to use a phrase that cropped up for two solid days) on a daily basis and are thus aware of the real problems. Finally, we are not hung up with beautiful policy recommendations, but concerned with implementation” (UMIP 1979-10-28 Robbin).

This pattern persisted into the 1980s, rounding out IASSIST’s first decade as a professional association. The organization ran conferences every year—with the larger ones dutifully cycling between US, Canadian, and European locations, and smaller ones organized by local and regional data libraries as appropriate—always featuring a slate of hands-on training workshops (UMIP 1980-06-16 Robbin). In fact, the only substantive new venture to emerge once the original “action groups” fell away was a new “Standing Committee on Education” chartered “to develop, establish, and maintain educational programs and professional standards appropriate to those managing machine-readable data files” (UMIP 1980-11-10 IAC). Professional development of data archive staff remained a top priority in the 1989 IASSIST 5-year plan which included proposing a curriculum and running short courses (UMIP 1989-11-02 Future Directions report). These conferences and workshops ended up being both successful recruiting events for new members, and gentle moneymakers for the IASSIST treasury. Although proceedings of presentation articles and abstracts were assembled and published out of the 1980 and 1981 conferences, these products were eventually folded into the IASSIST newsletter—renamed as the more official-sounding “IASSIST Quarterly” in 1982 (UMIP 1982-10-13 IAC).

IASSIST was also making a conscious choice not to become a research-based society. Starting in 1980, they engaged in several years of negotiations with North Holland Press to create an “IASSIST journal” that would be called Computers and the Social Sciences—a companion to an already-existing journal called Computers and the Humanities. Judith Rowe had been in favor of this venture: “North Holland has done a marketing study and they think—in spite of our reservations about computing journals—there is a market for a single journal catering to the needs of producers, distributors, and users of data, software, systems, and hardware of interest to social scientists.” She admitted, however, that it would take “some arm-twisting” to find enough authors among the IASSIST membership to fill such a journal, especially the debut issue: “We really need stars for those. After that we deal with our relatively small constituency, few of whom are motivated to publish” (UMIP 1980-06-11 Rowe). The journal idea was debated for a year but ultimately abandoned in favor of simply continuing the newly renamed “IASSIST Quarterly” (UMIP 1981-04-03 White). In this way, IASSIST was providing a professional outlet for reports on practitioner projects, but not a commitment to generalizable quantitative social science research. As their own membership committee put it in 1981, “IASSIST needs to have a working membership, working toward assisting one another” (UMIP 1981 IAR).

5.8 Conclusion: Success Through “Assisting One Another” in Metadata Labor

“Assisting one another” turned out to be the IASSIST goal that sustained the organization. Its 1989 5-year plan described the organization as “in a position to advance the interests of […] data professionals, promote professional development of this new career” (UMIP 1989-11-02 Future Directions report). In another example of its professional social infrastructure building function, IASSIST established awards that helped formalize the profession. The IASSIST Achievement Award, which recognizes “contributions of an individual to the organization and to the profession,” was first given out in 1990. The proposal for the creation of the award explained, “the profession has matured during the past twenty-five years even to the point that people who have contributed much to [the] field or to IASSIST are beginning to retire” (UMIP 1990-05-29 IAC). Establishment of the award was a declaration that the field existed and that its most valued activities were not social science research projects, but the work involved in managing data, data libraries, and data archives.

IASSIST still survives today—with a Web presence at http://iassistdata.org—describing itself as “an international organization of professionals working in and with information technology and data services to support research and teaching in the social sciences” counting some 300 members “from a variety of workplaces, including data archives, statistical agencies, research centers, libraries, academic departments, government departments, and non-profit organizations” (http://iassistdata.org 2018-11-25). And founders Adams, Geda, Robbin, and Rowe continued to influence the field for years to come. (In a way, Alice Robbin even crossed over into the camp of the academic social science researchers, after earning her doctorate at UW-Madison in political science in 1984 and pursuing a second career as a professor.)

The founders’ goal of building a far-flung and professional organization of information intermediaries, able to adapt to changes in computer and archival technology over the long term, was realized. IASSIST has persisted in its focus on social science data services, through the rise of the World Wide Web and the current excitement over “big data.” For observers writing at the turn of the twenty-first century, the role of the “data librarian” in the knowledge ecosystem of data archives no longer needed to be explained and justified. The profession had its own professional society, with a standing conference, a standing journal, and a career award. For example, University of Connecticut sociologist (and former data archivist) Richard C. Rockwell wrote “When ICPSR was formed over 30 years ago, it usually dealt with departments of political science and young professors in those departments. Since its formation, a new profession—that of data librarian—has arisen, and increasingly these professionals find their homes in libraries rather than in departments or research institutes” (Rockwell 1997).

Why did these information professionals spend so much time and effort, over so many years, building this professional organization, especially in the face of so much resistance from the very academic constituency which they were hoping to serve? The professional network building represented by IASSIST was certainly timely in the trajectory of library professionals transitioning into information experts along with the deployment of new technological infrastructures. This was a practice visible in other facets of library work where those with expertise in online communications and cataloging, or “machine-readable data,” or even basic microcomputer application literacy, hoped for a path to upgrade their skills and status—as well as the status of their profession. And all of these efforts took place within a profoundly gendered division of labor, with service work gendered female and technological expertise gendered male (as it often still is today) (Downey 2010).

But something else was at work as well—something more fundamental to the demands of data reuse in the social sciences, where constant negotiation between different disciplines, different survey methodologies, and different policy agendas lurked behind every supposedly independent set of data cards and tapes, every variable and coverage entry in a data codebook. For these librarians to succeed as social science data librarians, they needed to constantly negotiate for access and information, constantly trade and produce and correct and cross-reference a myriad of metadata structures, constantly explain and train and justify their very position in the nexus between competing social science faculty, professionalizing social science graduate students, and changing technical services and computing colleagues. Coalition building, peer learning, and negotiated information exchange were built into the job of a social science data librarian in order to make the entire technological infrastructure work—even in the absence of an organization like IASSIST. Or, another way to put it, the social science data archive was itself a socio-technical knowledge infrastructure, depending as much on interpersonal trust and coordination as it did on computational hardware and record formats. Both the work that IASSIST members talked about and the work of keeping IASSIST running itself as a forum for those conversations were crucial forms of metadata labor within this socio-technical knowledge infrastructure.

Thus IASSIST’s most important contribution in these early years may have been fostering an information infrastructure of expertise and learning that allowed data archivists and data librarians to better “assist” each other across a global geography fraught with disciplinary and institutional debates among the academic and national sponsors of the large-scale data archives. IASSIST, through its routines of regular meetings, interest groups and a publication, incubated and supported the human connections of information infrastructure by providing means of socialization and means of professional recognition and achievement in the field through leadership positions and organizational achievement awards. Through the 1980s, IASSIST only ever counted around 250 active, dues-paying members at a time (although more tended to attend their regular conferences and workshops, and the IASSIST Quarterly went out to about 400 different addresses) (UMIP 1988-05-25 IAC; UMIP 1990-06-02 IAG). Yet the impact of IASSIST stretched through the professional and social networks of all those working across the data archive and library landscape as it evolved through a key period of transition, from a still largely experimental practice of only the largest and most elite research universities and social science departments in the mid-1970s to an accepted facet of social science research and education at schools and colleges small and large at the end of the 1980s. As Robbin herself had written in 1975, “the data library/archive cannot, nor should not exist independent of other information centers. Rather, the data library should be viewed as one node in a data information network and the focus of members of the data library should be on the formalization of contacts with other nodes in the network” (Robbin 1975). 
Collectively, IASSIST was unable to realize many of its original hopes—it did not, for example, become a standards-setting body for all social science metadata, although many of the individuals involved with the development of IASSIST were also involved in early social science metadata work, and the IASSIST working group on codebook documentation and data cataloging helped foster the later DDI metadata standard. Yet IASSIST was a crucial socio-technical knowledge infrastructure—in a pre-Web, pre-email world—for a new category of interdisciplinary “information intermediaries” who needed to negotiate a landscape of high-status academics who often knew much less about technology and metadata than they did. In this way, the IASSIST community produced and reproduced great value for its members. As one of them put it, “IASSIST has helped me develop a professional identity—it’s nice to know that others are just as crazy” (UMIP 1980-02-11 Gammell to Rowe).