Introduction

Digital means for sharing knowledge abound. Anyone with an interest and an Internet connection can add to the collective understanding of a plethora of topics through online forums, social media, blogs, and more. As such, researchers have sought to codify, understand, and act upon such communication by exploring knowledge-sharing in online communities (Kimmerle et al., 2015; Matthews, 2016; Scardamalia & Bereiter, 1994). While teachers, students, and office workers have been heavily studied in online communities (Hmelo-Silver & Barrows, 2008; Karam et al., 2017; Macià & García, 2016; Oberländer et al., 2019; Raković et al., 2020; Teo et al., 2017; Tseng & Kuo, 2014; van Aalst, 2009), there is less emphasis on how communities bounded by shared interests share knowledge.

Given that 80% of Internet users have indicated that they belong to or participate in these communities (Horrigan et al., 2001; Horrigan, 2016) and with most social media users reporting everyday use (Auxier & Anderson, 2021), this represents a substantial gap in our understanding of how diverse people use these spaces for informal learning. In the case of communities that support public participation in science, which is the focus of this case study, this also represents a potential loss of capacity for an enterprise that serves a critical function for society. To contribute to current understandings of community practices and dynamics, our research aims to operationalize the communities of practice (CoP) theoretical framework, which is oft-applied for understanding learning in online, informal environments.

The single case study presented here contextualizes an online, scientific community of practice (CoP) whose diverse participants exhibit knowledge-sharing as domain-specific and social practice. Wenger et al. (2009), writing about learning in such digital spaces, indirectly define their conceptualization of practice as “…learning how to be a certain kind of person with all of the experiential complexity this implies: how to live knowledge, not just acquiring it in the abstract” (p. 7). Such vague but seemingly powerful notions of the complexities of social participation are common in CoP research. They are fruitful for describing the potential of the theory but limiting for more prescriptive or evaluative applications. Our objective was to evaluate whether the behaviors in an online scientific CoP were representative of the seven high-level groups of activities that Wenger et al. (2009) proposed as expressions of practice. In doing so, we add to the theoretical conceptualizations of social practice in interest-based communities while addressing the noted absence of measures to evaluate knowledge sharing within such virtual CoPs (Ferreira da Silva et al., 2020; Hafeez et al., 2019). Such results would be transferable to similar interest-based groups in other contexts, such as Facebook groups for discussing specific bands or sports teams or an app like Nextdoor for staying in touch with neighbors. We framed the study with the following three research questions: (1) What forms of practice existed within an online community focused on social paleontology? (2) How are these forms of practice related to community member attributes? (3) In what ways can practice be traced and identified as a social network?

Theoretical Framework and Literature Review

Social learning is defined as competence within a domain of knowledge, having experiences related to phenomena in the world, and sharing that competence and experience with others (Wenger & Snyder, 2000). Accordingly, we frame this study within the theoretical framework of CoPs, specifically the conceptualization described in Wenger et al. (2002) where the CoP consists of three elements: the domain, the practice, and the people. Additionally, we build on theoretical concepts for virtual CoPs proposed in Wenger et al. (2009), specifically, the range of learning activities that may occur within virtual CoPs. Lastly, we draw on evidence that online communities can provide affordances for learning science and mathematics, including access to distributed networks of expertise and forums for public discourse (Martinez & Peters Burton, 2011).

Within CoPs, the domain encompasses the shared interests of the community at hand. Researchers have explored the CoP theoretical framework from a domain perspective in many fields including citizen science (Brossard et al., 2005; Herodotou et al., 2020, 2022; Liberatore et al., 2018), occupational therapy (Majeski & Schefkind, 2021), and teacher development (Bannister, 2015; Tseng & Kuo, 2014). Researchers who examine domain-specific CoPs emphasize how research efforts can lead to effective domain-specific interventions (Watkins et al., 2018). For example, Liberatore et al. (2018) describe how members of a virtual birding CoP based on Facebook supported one another and developed rudimentary ornithology practices. Although in a different field, Majeski and Schefkind (2021) found similar examples of community support and domain-specific problem-solving exhibited by occupational therapists who formed a CoP. Lastly, Bannister (2015) indicates that teachers came up with solutions to domain-specific problems while meeting as a CoP. The connecting thread among these studies is that CoP members tend to emphasize the development of the community first and the development of their practices second. Research that focuses on a CoP’s domain is important because the domain provides the reason for CoPs to coalesce; other researchers, however, have focused on the community aspect, that is, on who comes together in relation to the domain.

Community is the second element of a CoP. Community is defined as people who engage in activities related to the CoP: having conversations, participating in practice, and helping one another learn about the domain (Wenger & Snyder, 2000). Current research into people and their engagement in CoPs indicates that learners value experiences, problem-solving, and lifelong learning (Abedini et al., 2021). With regard to citizen scientists who participate in large-scale communities of practice, such as Zooniverse, researchers focus heavily on demographics (Ibrahim et al., 2021) or the scientific results provided by participants (Jackson et al., 2020). Additional empirical research into the element of community within CoPs concluded that community can sometimes prevent people from participating, particularly if trust is not established effectively (Eberle et al., 2014; García-Monge et al., 2018). The community aspect itself can become the focal point for researchers, with some seeking to understand why the social fabric of the community supports, nurtures, or otherwise sustains interaction (e.g., Malik & Haidar, 2020; Sbrocchi et al., 2022). Others determined that some CoPs fail to contribute to domain-specific work, instead working only towards providing community-based support (Aldana & Martinez, 2018; Karam et al., 2017; Pontual et al., 2018). Other researchers discovered similar patterns: some CoPs focus on creating community in lieu of knowledge creation (e.g., Bondy et al., 2017; Carrol, 2005; De Cindio, 2012; Wenger et al., 2002). In these studies, researchers imply that if you build mechanisms for community activity, knowledge generation will follow.

The last element is that of practice, the development of shared elements, both explicit and tacit, with which participation and contribution within an area are identified. While practice has been defined and used as a basis of understanding within many studies of teachers (e.g., Campbell et al., 2022; Thompson et al., 2019) and studies of the nature of science (e.g., García-Carmona & Acevedo-Díaz, 2018), practice within the theoretical community of practice framework remains ill-defined. Within multiple learning environments, practice has been considered as a proxy for developing proficiency in a domain (e.g., Alexander, 2003; Sadler, 2009). For example, Novakovich et al. (2017) discuss advancing student use and implementation of social media by theorizing and testing CoP design principles around social media and online communities. Their findings suggest that certain practices within a virtual CoP can be surfaced by scaffolding community participation, while others cannot. Practice development is also theorized to facilitate social community in addition to developing domain-specific proficiency (Gray, 2004). Gray (2004) also demonstrated that within an online, scientific community, practice can be both social and scientific.

Wenger and colleagues (2009) further conceptualized practice in online communities as nested activities introduced through formal and informal events that occur when people learn from and with one another (Table 1). Wenger and colleagues proposed seven high-level activities for CoPs: exchanges, productive inquiries, building shared understanding, producing assets, creating standards, formal access to knowledge, and visits. Each of these in turn contains “learning activities” (Wenger et al., 2009, p. 6). Both concepts (high-level activities and learning activities) are synonymous with practice. For example, the high-level activity of exchanges contained the learning activities (i.e., practices) of stories, news, information, pointers to resources, tips, and document sharing. These high-level learning activities can be traced to Wenger and colleagues’ (2002) creation of seven design principles for distributed communities: “design for evolution, open a dialogue between inside and outside perspectives, invite different levels of participation, develop both public and private community spaces, focus on value, combine familiarity and excitement, and create a rhythm for the community” (p. 57). We emphasize that these are theoretically conjectured design principles and forms of practice. Wenger and colleagues (2009) described such forms of practice and conjectured they were indicative of CoPs but did not operationalize them. In our study, we sought to advance theory by testing and refining Wenger and colleagues’ conceptualization of practices in the context of an established online science community.

Table 1 High-level activities and nested practices theorized by Wenger et al. (2009)

While other researchers have explored the conceptualization of CoPs as envisioned by Wenger (1998) and colleagues (2002, 2009), these explorations focus on singular elements of the CoP framework that are ill-defined or are tangentially related to digital technology. Shifting the focus of research to emphasize the development of practices that lead to participation in and contribution to the domain allows for a greater understanding of the online, social learning landscape as well as provides opportunities for researchers to establish for whom and under what conditions CoPs meet success. Additionally, while work has revolved around online CoPs, much of this work is centered in formal education (e.g., Gunawardena et al., 2009; Xue et al., 2019), not scientific, interest-based communities.

Our focus is the field of paleontology, a charismatic science that enthralls people of all ages through discoveries of extinct flora and fauna and has a rich history of citizen scientists (i.e., amateur paleontologists) helping bring about scientific discovery (e.g., Boessenecker, 2022; Corin et al., 2015; Hartshorn et al., 2014). We use the term social paleontology to refer to our emphasis on the people and the ways they learn, develop, and generate knowledge (Crippen et al., 2016; MacFadden et al., 2016). We seek to add to burgeoning research concerning social paleontology’s landscape of online, social learning practice (Lundgren et al., 2021; Lundgren et al., 2022a; Smith et al., 2021). People and their practices are at the heart of social paleontology, yet further understanding participation, contribution, and social learning (i.e., practice) within the domain is necessary to understand digital practice.

Methodology

This was a single case study (Creswell & Plano Clark, 2011) of an online community called myFOSSIL. myFOSSIL was designed and developed with funding from the US National Science Foundation by a team of interdisciplinary researchers, which included the authors, in order to unite paleontologists from across the continuum of expertise in social paleontology (Crippen et al., 2016; Bex et al., 2019a; Lundgren et al., 2021). The case was bounded by our intent to measure and document the digital practice of science in a diverse, interest-based community that included professionals as well as members of the public and to relate its expression to the attributes of community members.

Study Context

This research is framed within the scientific discipline of paleontology, which can be described as the study of the evolution of Earth’s species and their ecologies through the practices of collection, preparation, curation, and digitization of fossils (Crippen et al., 2016). The great majority of these practices occur in the real world and are well-documented (Twitchett et al., 2015; Catalani, 2014). However, there is emerging evidence for digital forms of paleontological practice (Lam et al., 2019) on social media sites such as Facebook (Lundgren et al., 2022b), Twitter (Bex et al., 2019a), and Instagram (Ocon et al., 2021) as well as dedicated websites (Soul et al., 2018; Lundgren et al., 2021). We focus on digital practices and the community where they emerge as an open, interest-based activity as this affords the inclusion of diverse members and accounts for varied experiences and expertises in the domain.

Thus, myFOSSIL was created as a web-based community as a part of a design-based research project to encourage the enactment of paleontological practice in both online and offline spaces by people from across the continuum of paleontological expertise. myFOSSIL was designed with the CoP framework in mind, specifically employing design principles that were derived from a needs assessment survey and meshed with Hoadley and Kilner’s (2005) elements for knowledge generation within online CoPs (Crippen et al., 2016). The community was formally engaged in the design of the site through a needs assessment survey, a large-scale in-person meeting, rounds of usability testing, and continuous procurement of informal feedback (Crippen et al., 2016). The content of the site is open and accessible without the need for a login, but anyone can become a community member by creating a personal account that is authenticated with a valid email. Activity at the site is then recorded along with the member’s participant ID. The site is currently available via the web or a mobile app, but at the time of this study, it was only accessible via the web (Fig. 1).

Fig. 1

myFOSSIL website main page and digital trace data. a The home page that members see when they navigate to the site. b A message exchange. c A user’s activity feed. d An exchange within the forums

Within this online community, members were able to engage with paleontological topics using three website features: forum posts, direct messages, and activity posts, which are described as follows:

  • Forum posts: topic-specific posts created by myFOSSIL members that could be easily seen and interacted with by other members. Members could mark others’ posts as important to them (favorite), mark them as something to keep up to date with (follow), or add information to them (reply).

  • Direct messages: private correspondences that originated from one myFOSSIL member and were sent to one or more additional members. The message receivers had the option to respond to the message.

  • Activity posts: original content created by a myFOSSIL member that was situated within an area of the website that was not a forum or a direct message. Activity posts were stand-alone content or replies to other members’ posts.

Methods

This section will focus on describing our process of recruiting participants, explaining the inclusion and exclusion criteria for participation, developing and refining an analytical framework for data collection, and detailing our data analysis approach (Fig. 2).

Fig. 2

A chronological description of our research process

Participants

We recruited potential participants through social media posts, personal communication with fossil club members, and community discussions at geological and paleontological conferences. For additional information on the initial formation of the community, see Crippen et al. (2016) and MacFadden et al. (2016). People became study participants after agreeing to an informed consent form and completing an intake survey as part of creating a personal account. In this intake survey, members indicated their past experiences with paleontology, provided basic demographic information, and described their interest in joining the community site (Supplemental Material). Additionally, members expressed how they discovered the site (i.e., through internet searches, social media click-throughs, or word of mouth). Data were collected for a 2-year period (October 2015–2017), during which time membership included nearly 1000 people.

We applied three inclusion criteria: consent to participate, as given on the intake survey; an indication on the intake survey of being over 18 years old; and active membership. Website members who did not consent to be included in the study (n = 63) and members who were under 18 (n = 80) were removed. To set a threshold for active membership, we applied aspects of the Pareto Principle. The Pareto Principle indicates “that for many phenomena, about 80% of the consequences are produced by 20% of the causes” (Dunford et al., 2014, p. 140). This principle has been applied to online communities, in which few participants (usually less than 20%) make the majority of contributions (Serradell-Lopez et al., 2023). With the Pareto Principle in mind, we derived a threshold for active membership, based on Wenger et al. (2009) and Malinen (2015), to determine study participants. Active membership was defined as contributing at least one piece of digital trace data during the study period. Two hundred sixty-three members each contributed at least one message, activity post, or forum post during the study period and were included as study participants (Fig. 3, Table 2). This represents 31% of myFOSSIL members, which is aligned with the Pareto Principle conceptualization of few participants making the majority of contributions.
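The inclusion criteria described above amount to a simple filter over member records. The following is a minimal sketch in Python, not the procedure the authors used; the record fields (consented, adult, n_contributions) and the sample members are hypothetical and serve only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class Member:
    member_id: str        # hypothetical pseudonymous ID
    consented: bool       # consent given on the intake survey
    adult: bool           # indicated being over 18 on the intake survey
    n_contributions: int  # messages + activity posts + forum posts


def select_participants(members):
    """Keep members who consented, are adults, and contributed at least once."""
    return [
        m for m in members
        if m.consented and m.adult and m.n_contributions >= 1
    ]


# Hypothetical records, for illustration only.
members = [
    Member("m1", True, True, 12),
    Member("m2", True, True, 0),   # consented adult, but never posted
    Member("m3", False, True, 4),  # did not consent
    Member("m4", True, False, 2),  # under 18
    Member("m5", True, True, 1),
]
participants = select_participants(members)
participant_ids = [m.member_id for m in participants]
```

With these hypothetical records, only m1 and m5 satisfy all three criteria.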

Fig. 3

Distribution of posting frequencies for all digital trace data types. The horizontal line for each PIT category indicates the mean for that category. Note: Outliers, as determined by the ROUT method, were removed

Table 2 Descriptive statistics of posting frequencies for each digital trace data type

Through analyzing participants’ responses to the intake survey and through perusal of their user profiles (similar to a Facebook profile), the researchers identified demographics and paleontological interests. Participants were classified using the Paleontological Identity Taxonomy (PIT), an instrument that allows researchers to classify community members based on their expressed identity with paleontology (Lundgren et al., 2018; Bex et al., 2019a). In short, this three-tiered instrument provided classification information at a coarse-grain (i.e., structure), medium-grain (i.e., category), and fine-grain (i.e., type) scale. For this study, we focus on the categorical level in which participants were divided into four categories: public, education and outreach, scientist, or commercial collectors. Each participant was assigned to exactly one category (i.e., participants could not be coded as both scientist and public). While we recognize that participants could encompass multiple identities, using participant responses from surveys and information from participants’ site profiles helped us to identify participants’ primary identities (i.e., what they themselves focused on when describing themselves). Distinguishing participant categories allowed us to describe and define who was a part of the community as well as ascertain through whom information flowed within the community. The other two levels are not discussed for this study as structure was too coarse a classification and type was too fine.

Data created by participants were extracted as five tables from the developer interface. Data from these tables were joined to link participant IDs, names, and textual exchanges. Additionally, we used OpenRefine (Verborgh & De Wilde, 2013) to clean data including changing participant names to pseudonyms and removing duplicate posts. Afterwards, we imported data into HyperResearch for coding (version 3.75).

While Wenger and colleagues’ (2009) theoretical framework provided the basis for this research, to empirically study the community and its practices, the theoretical framework was operationalized for myFOSSIL through a process of iterative coding that was subject to interpretation and refinement as data analysis progressed (Creswell & Plano Clark, 2011). In essence, Wenger and colleagues’ (2009) high-level learning activities and their practices (Table 1) needed to be connected to social paleontology-specific activities.

The unit of analysis for coding ranged from sentences to paragraphs and included the application of only one code (i.e., our analysis did not allow for double coding), as we had “a clear and focused research purpose and thus a clear lens and filter for analyzing the data,” i.e., understanding what forms of practice existed within this online learning environment (Saldana, 2016, p. 94). An initial coding iteration led to the discovery that much data remained uncoded due to the undefined nature of the high-level learning activities and their practices. Thus, the authors used an iterative process of reviewing and identifying meaningful collections of trace data as examples of nested practice (i.e., in Wenger and colleagues’ description, specific learning activities were subsumed under learning activity categories), constructing operational definitions for the high-level learning activities, verifying these collections with additional data, and discussing to consensus. Our process of discussing to consensus was especially useful when member posts seemingly could fit into more than one coding category. When this occurred, the authors met and closely examined the text and the definitions of practice until we agreed on the outcome.

Following an initial round of constructing operational definitions, digital trace data were re-examined and coded over a 1-month period with the addition of the new high-level learning activity of ungrouped and its nested practices of support and field trip planning. We chose to name the group of codes support after reflecting on the concept of CoPs, and how community members attempted to be social, thus adding a layer of encouragement to the community. Field trip planning was named to encompass the idea of creating field-based learning expeditions for site members. Our process resulted in the empirical communities of practice (ECoP) framework for coding data; for additional information and examples of its use, see Lundgren et al. (2021) (Table 3). As an additional check on coding consistency, we employed interrater reliability using the kappa statistic in which a second coder used the ECoP framework to code 10% of the data (Creswell, 2009), with kappa values that ranged from moderate (0.57, forums; 0.61, messages) to substantial (0.70, activity posts) (McHugh, 2012).
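The interrater check described above can be illustrated with a short computation. The sketch below assumes the kappa statistic in question is Cohen's kappa for two coders, which the text does not state explicitly; the code labels are hypothetical and are not drawn from the study data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assign one code per unit."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of units.")
    n = len(coder_a)
    # Proportion of units on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal code frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six units, for illustration only.
coder_a = ["support", "tips", "tips", "stories", "support", "problem-solving"]
coder_b = ["support", "tips", "stories", "stories", "support", "tips"]
kappa = cohens_kappa(coder_a, coder_b)  # 7/13, roughly 0.54
```

In this toy example the coders agree on four of six units (observed agreement 0.67), chance agreement is 10/36, and kappa is therefore 7/13.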

Table 3 ECoP analytical framework of domain-specific learning activities and practices. Learning activity categories and specific learning activities (practices) based on the CoP conceptual framework found in Wenger et al. (2009)

Using the ECoP framework, we then coded all data within the site including the forums (n = 1858), activity posts (n = 1300), and direct messages (n = 667). To determine statistical differences between knowledge-sharing discourses within website features, we performed a chi-square statistical test and pairwise comparisons using the statistical software program SPSS Statistics (v. 25). To conduct chi-square tests of independence to determine the statistical differences in practice use and data type, the specific practices were collapsed into Wenger et al.’s (2009) seven broader learning activity categories (Table 3). These categories were based on the original conceptualization of learning activities as described by Wenger et al. (2009).
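For readers unfamiliar with the chi-square test of independence, the statistic can be computed directly from a contingency table of website feature by learning activity category. The sketch below is illustrative only: it is not the SPSS procedure used in the study, the counts are hypothetical, and it returns the statistic and degrees of freedom without a p-value.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are website features (forums, activity posts,
# messages); columns are two collapsed learning activity categories.
observed_counts = [
    [30, 10],
    [20, 20],
    [10, 30],
]
statistic, dof = chi_square_independence(observed_counts)  # 20.0 with 2 df
```

Every expected count in this toy table is 20, so the statistic is 20.0 with 2 degrees of freedom. Pairwise comparisons repeat the same calculation on each pair of rows, typically with an adjustment for multiple tests.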

After collecting and coding all data, we examined the types of knowledge-sharing discourse, focusing on differences and similarities in practices created between members of different PIT categories. Member site activity was matched with coded instances and counted; we then performed a chi-square statistical test and pairwise comparisons.

While it was useful to understand the general practices that occurred, it was imperative to understand how they developed and were related to one another, in other words, understanding the chains of practice that emerged. Thus, we applied social network analysis to data collected from the forums. Social network analysis algorithmically maps the connections between entities (Wasserman & Faust, 1994). These algorithms can show who or what is connected to others, the ways in which they are connected, and what they are connecting about. Others have used social network analysis to map knowledge sharing discourse in health communities (Sharma & Land, 2018) as well as for showing patterns of discourse for online courses (e.g., Sharma et al., 2021, 2023). Social network analysis was conducted on the forums as they contained the greatest amount of data and such chains of activity were discernible on the forums.

For this study, connections between practices were determined by sequential sets of messages. When conducting the social network analysis, we first created an adjacency matrix in which the number of connections between each pair of practices was tallied (Supplemental Material). The adjacency matrix was used to create an edge table, which was imported into NodeXL, a network extraction, analysis, and visualization software add-in for Microsoft Excel (Hansen et al., 2011). The forum data can be classified in network analysis terms as directed, meaning there was a clear flow of information in which researchers could identify where the information originated (i.e., flows from) and to whom the information was directed. We followed the methods of Himelboim and colleagues (2017) in that the network was visualized using the Harel-Koren fast multiscale graph and groups were determined with the Clauset-Newman-Moore clustering algorithm (Clauset et al., 2004). We collected and examined density (i.e., overall connectedness) and centrality measures (i.e., betweenness centrality, closeness centrality, and eigenvector centrality) to determine information flow. Graph density is measured from 0 to 1 and indicates how many potential connections are actually present; in a graph with a density of 1, every entity connects to every other entity. Centrality measures are numeric calculations that indicate varied aspects of relationships between network members. Closeness centrality measures the average distance between one practice and all others in the network. Lower closeness centralities (e.g., less than one) indicate higher connectivity, meaning there is less “distance” that one would need to travel between practices. Two other centrality measures, betweenness and eigenvector, do not have normalized scores and vary network to network. Betweenness centrality measures how information flows through a network. Eigenvector centrality measures the connectedness of connected vertices (Himelboim et al., 2017).
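The step from coded forum threads to an edge table and a density value can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the NodeXL workflow: it assumes that each pair of consecutive posts in a thread contributes one directed connection between their practice codes, it uses the standard directed-graph density (distinct non-loop edges divided by n(n - 1)), and the threads are hypothetical.

```python
from collections import Counter


def transition_counts(threads):
    """Tally directed practice-to-practice connections between consecutive posts."""
    counts = Counter()
    for codes in threads:
        for source, target in zip(codes, codes[1:]):
            counts[(source, target)] += 1
    return counts


def directed_density(counts):
    """Share of possible directed connections (ignoring self-loops) that occur."""
    nodes = {node for edge in counts for node in edge}
    distinct_edges = {(s, t) for (s, t) in counts if s != t}
    n = len(nodes)
    return len(distinct_edges) / (n * (n - 1)) if n > 1 else 0.0


# Hypothetical threads: each list holds the practice codes of successive posts.
threads = [
    ["problem-solving", "tips", "support"],
    ["stories", "support"],
    ["tips", "support", "tips"],
]
counts = transition_counts(threads)
# Edge table rows: (source practice, target practice, weight).
edge_table = sorted((s, t, w) for (s, t), w in counts.items())
density = directed_density(counts)  # 4 distinct edges among 4 practices: 4/12
```

The nonzero cells of the adjacency matrix correspond one-to-one with the rows of the edge table, which is the form that network tools generally import.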

Results

Social Paleontology Practices

To answer our first research question (what forms of practice existed within an online community focused on social paleontology?), we coded forum posts, activity posts, and messages using the ECoP framework. Our analysis indicates that more than half of the practices theorized by Wenger et al. (2009), plus two additional practices that were specific to myFOSSIL (i.e., field trip planning and support), existed on the website, although some were more prevalent than others.

Across all three website features, the learning activities of support, tips, problem-solving, and stories occurred more frequently than the other learning activities (Table 4). The most coded practice was support, followed by tips, problem-solving, and stories. Overall, this shows that the forms of social paleontological practice that exist on myFOSSIL were personal, related to sharing advice, and concerned with producing or at least exploring new ideas related to social paleontology. Next, we provide examples and descriptions of the most common practices within myFOSSIL.

Table 4 Total numbers and percentages of practices within myFOSSIL

Regardless of the website features, the main social paleontological practice that existed on myFOSSIL took the form of support. Support, which entailed members thanking others for contributing, acknowledging a contribution, or being otherwise social without adding to knowledge per se, was added following preliminary data analysis. There were 693 instances of support expressed by myFOSSIL members. Support often took the form of an expression of gratitude to other members for posting, such as this activity post created by a member who fit the PIT category of scientist: “It is helpful! Thanks for sharing, [@member!]” (activity post ID #17,880). In data that was coded as support, the focus was on social niceties.

The second most frequently observed practice was tips, which occurred when members provided advice or best practice information to other members concerning social paleontology. An illustrative example of tips came from a commercial member who responded to a query concerning uploading fossil specimens, “Hey [member]. Yeah but I don’t have any of them, and the industry photos are business confidential until released in an EIR [Environmental Impact Report] or other final environmental report” (forum post ID #2827). In this post, the commercial member explained why they could not provide photos of fossil specimens, explaining a part of commercial paleontology that was perhaps unknown to the member who had asked to see such photos. After this tip, the other member responded, “thanks!” which was coded as the practice of support.

The third most frequently coded practice was that of problem-solving. Problem-solving, whose definition was communication concerning solutions related to the domain, occurred regardless of website features. An example of problem-solving was from a scientist who was discussing the identification of a trilobite (an extinct arthropod similar in shape and look to modern pillbugs) with another member who had found one in southwestern Wisconsin. In his post, the scientist wrote,

I was assuming the entire fossil to be a pygidium, but now I take your point that it comprises the whole thorax plus the pygidium. And that makes it an excellent match for the thoracopygidium of Thaleops ovata. It also helps explain the outline that appears anterior to the thorax. It’s a cross-section through the cephalon. I note the especially robust right gena. (forum post ID #17132)

In this forum post, the scientist described specifics for trilobite identification, creating a solution to a domain-specific problem—the identification of a trilobite. After his explanation of the identification, the person who found the specimen thanked him for his identification, an example of the practice of support.

Another frequently seen practice was that of stories, defined as person-centered accounts of social paleontological practice. Stories were often long-form accounts of members’ interest in paleontology, or reminiscence concerning fossil hunting. An example of this was from a scientist who posted on the activity feed about some fossilized specimens that she appreciated, writing “Very adorable inarticulate [brachiopod]! We don’t get the variety around here that you find in the Cincinnatian. I think I have only found Pseudolingula here, I prefer the encrusting ones” (activity post ID #18,099). This quote shows the personal connection this scientist had with paleontological specimens. In calling a fossil specimen adorable and relating them to her personal collecting experience by saying that they had low variety where she was, this scientist was personalizing paleontology to her experiences with it. This highlights the ways in which members created personalized narratives regarding paleontology, specifically describing specimens or trips that were taken to collect fossils.

Across all data, the least frequently coded practice was that of external benchmarks, defined as information concerning best practices of digitization of specimens, which made up 0.1% of all coded data. The practices of formal practice transfer and boundary crossing were also rare, each accounting for 1.2% of all codes on myFOSSIL. The scarcity of these practices indicated that myFOSSIL members either did not use the website to discuss these activities or were perhaps unaware of external benchmarks for social paleontology. In terms of social paleontological identity, both formal practice transfer and boundary crossing, which entailed members either transferring information in which they had expertise or crossing the bounds of their specific identity, were rare occurrences on the website, with members tending to display their PIT identities rather than expanding beyond them.

Forms of Practices and Knowledge-Sharing Discourse Within Website Features

Identifying the forms of social paleontological practice that existed on myFOSSIL provides a holistic view; however, it was also important to understand the ways in which different practices emerged on features of the website, and what kinds of discourse existed within the textual practices. Therefore, we examined data within each website feature individually. We illustrate the results of analyzing the practices within each website feature by giving an overview of what was found using descriptive statistics and presenting the results of the chi-square test of independence to highlight differences within the features (Table 5).

Table 5 R-squared and chi-squared values for forms of practice and PIT identities and practice

While all digital trace data contained the same spread of practices (support, tips, problem-solving, stories), there were statistically significant differences in the ways the practices were used. Indeed, there was a significant association between data type and practice on myFOSSIL. As it was important to determine the differences between data types, pairwise comparisons were then performed. The pairwise comparisons show significant differences in the ways that learning activities were used between forums and activity, between forums and messages, and between activity and messages. That is to say, community members enacted social paleontological practices in different ways depending on the website feature. These results indicate that the knowledge-sharing discourse that existed on myFOSSIL differed by website feature. For example, the ways in which members enacted the broader learning activity category of exchange might have differed depending on whether the members created messages or forum posts. These results contrast with the descriptive statistics, in which practices were generally similar across forums, activity, and messages. This means that, from a quantitative perspective, the discourse varied depending on the features of myFOSSIL. However, we must caution that, given the relatively low R-squared values in our analysis, the amount of variance in discourse across myFOSSIL features was slight.

Practices and Their Relationship to Community Member Attributes

After analyzing the total number of practices and knowledge-sharing discourse variance within the website features, we turned to our second research question, how are forms of practice related to community member attributes?, and sought to determine how knowledge-sharing discourse varied among members (Fig. 4, Supplemental Material). Within this section, we report on the most frequent practices created by members from specific PIT categories.

Fig. 4

Sankey diagram depicting the PIT categories and their total number of practices (left side) and the breakout of practices (right side)

With regard to the different PIT categories on myFOSSIL, most participants were in the category of public, followed by education and outreach and then scientist. Only five commercial collectors were present.

Public members contributed the most to myFOSSIL, with 1658 coded instances of practice. They most frequently used the website to talk about concepts that included the practices of support, stories, and problem-solving. For public members, instances of support were often responses to other members posting photos of their fossils or fossil collecting experiences, or to members documenting their curation procedures. In terms of stories, public members often described their trips to fossil-collecting locations, such as one public member who shared a memorable experience looking for fossils in the Gobi desert (Forum post ID #16,191). Members of the public also often created knowledge-sharing discourse that was coded as problem-solving, with members communicating about domain-related solutions within the forums. Public members can thus be described as problem-solvers and advisors whose knowledge-sharing discourse focused on supplying a digital record of real-world experiences.

Scientists often created data that included the practices of support, tips, and problem-solving. Similar to public members, scientists’ most frequent knowledge-sharing discourse took the form of support, such as thanking others for surfacing a problem within a paleontology-themed lesson plan (forum post ID #3383). Additionally, scientists often shared tips, mostly helping others (often public and education and outreach members) identify fossil specimens (forum post ID #3405). Lastly, scientists’ third most frequently coded practice was problem-solving, such as explaining how scale bars, which are rulers or other standardized measurements (such as a coin) to show the length and width of a specimen, work (forum post ID #2550). The knowledge-sharing discourse of scientists thus can be described as socially supporting others while seeking to solve domain-specific problems.

Education and outreach members most often created data that included the practice of support, news and information, and help desk. When talking to other members, education and outreach members sought to thank others for their help. Education and outreach members also created the most posts of all members that were classified as news and information, including external links to news stories that might interest other members. Lastly, education and outreach members often asked other members for help identifying specimens, such as a teacher who wanted to know what her student had found (Forum post ID #20,219). The knowledge-creating discourse of education and outreach members thus can be described as being interested in spreading social- and research-specific information while seeking social- and research-specific support.

Commercial members most often created data that included the practices of support, stories, and problem-solving. While their contributions made up less than one percent of overall contributions on the website, commercial members still interacted with many members within the forums. One commercial member thanked others after seeing members of the public describe in detail their methods for curating their personal collections (Forum post ID #11,120). Similar to others on myFOSSIL, commercial members seemed to enjoy regaling others with stories of their paleontological experiences. For instance, in a forum about figuring out how to distinguish between real and fake fossils, one commercial member described how he informed another collector that they had a fake fossil (Forum post ID #4367). Commercial members were also interested in coming up with solutions to domain-specific problems (i.e., problem-solving), such as one commercial member who wrote about the lack of women who were paleontologists and ways to grow the numbers (Forum post ID #9736). Thus, similar to public members, commercial members can be described as problem-solvers and advisors whose knowledge-sharing discourse focused on supplying a digital record of real-world experiences.

To quantify the qualitative data, we conducted a chi-square test of independence (Table 5). There was a significant association between PIT category and practice on myFOSSIL. Pairwise comparisons show that, specifically, there were differences in the ways that education and outreach members implemented practices versus public members and education and outreach members versus scientists. There were no significant differences in the ways that public and scientist members used practices on myFOSSIL. This means that members in the public and scientist categories implemented practices on myFOSSIL in similar ways, whereas members categorized as education and outreach enacted practices differently.
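As an illustration of the kind of test reported above, the following minimal sketch computes a chi-square test of independence on a small contingency table and then repeats it for each pair of PIT categories. The counts are invented for illustration and are not the study's data, the function names are hypothetical, and the Bonferroni-adjusted threshold is an assumption of this sketch, since the correction used for the pairwise comparisons is not stated here. The closed-form p-value applies only to even degrees of freedom, which holds for the tables in this example.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical counts (NOT the study's data): rows are PIT categories,
# columns are the practices support, tips, and problem-solving.
counts = {
    "public": [60, 30, 40],
    "scientist": [30, 16, 19],
    "education and outreach": [40, 5, 5],
}

def chi_square(table):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof

def p_value_even_dof(stat, dof):
    """Upper-tail chi-square probability; this closed form is valid only for even dof."""
    if dof % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = stat / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(dof // 2))

# Omnibus test across all three categories (3 x 3 table, dof = 4).
overall_stat, overall_dof = chi_square(list(counts.values()))
overall_p = p_value_even_dof(overall_stat, overall_dof)

# Pairwise 2 x 3 comparisons (dof = 2), judged against a Bonferroni-adjusted
# threshold; the choice of correction is an assumption of this sketch.
alpha = 0.05 / 3
pairwise_p = {}
for a, b in combinations(counts, 2):
    stat, dof = chi_square([counts[a], counts[b]])
    pairwise_p[(a, b)] = p_value_even_dof(stat, dof)

for pair, p in pairwise_p.items():
    print(pair, "different" if p < alpha else "similar")
```

With these invented counts, the public and scientist rows have nearly proportional distributions, so only the comparisons involving education and outreach fall below the threshold, mirroring the pattern described above.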

Social Network Analysis of Practices

In addition to identifying the relationship between practices and community members, we addressed our third research question: in what ways can practice be traced and identified as a social network? Regardless of the centrality measure, four practices were prevalent: exploring ideas, problem-solving, news and information, and support.

Betweenness centrality, a measure of how often a node lies on the shortest paths between other nodes within a social network, was used to determine through which practices information flowed on myFOSSIL, or rather, which practices served as bridges (Hansen et al., 2011) (Table 6). Within this network, we found high betweenness centrality for the practices of exploring ideas, problem-solving, and news and information. These results indicate that being able to brainstorm about the domain (i.e., exploring ideas) and communicating about domain-specific solutions (i.e., problem-solving) were imperative to knowledge-creating discourse within myFOSSIL’s forum. This is important to recognize because, when we quantified practices, the most common singular practices were stories and tips, both of which are more socially focused than domain-specific.

Table 6 Centrality measures of social paleontological practices

Another centrality measure, closeness centrality, indicated the average distance between one practice and all others in the network. On myFOSSIL, all practices had closeness centralities that were less than one, meaning that the average distance between all practices was short. These results indicate that although some practices were more centrally chained, the ways in which all practices were connected were fairly equal.

The last centrality measure, that of eigenvector centrality, indicated the connectedness of connected vertices. For instance, a practice could have a high eigenvector centrality if the practices it was connected to were highly connected. On the myFOSSIL forums, eigenvector centrality ranged from 0.036 to 0.074, with external benchmarks featuring the lowest eigenvector centrality (0.036) and exploring ideas representing the highest (0.074). The practice of exploring ideas was highly connected to other connected practices, meaning that this practice and the practices it connected to were often a part of knowledge-sharing discourse within myFOSSIL.
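The three centrality measures can be sketched on a toy network. The example below is a minimal illustration under stated assumptions: the graph is undirected and unweighted, its edges are invented rather than taken from the myFOSSIL forums, and the function names are hypothetical. It is not the software or the network used in the study. Betweenness follows the standard shortest-path counting approach for unweighted graphs, closeness is the number of other nodes divided by the summed distance to them, and eigenvector centrality is approximated by repeated multiplication.

```python
from collections import deque

# Toy undirected, unweighted network of practices (invented edges, not the study's data).
graph = {
    "exploring ideas": ["problem-solving", "news and information", "support"],
    "problem-solving": ["exploring ideas"],
    "news and information": ["exploring ideas"],
    "support": ["exploring ideas", "stories"],
    "stories": ["support"],
}

def distances(graph, source):
    """Breadth-first search: edges on the shortest path from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(graph):
    """(n - 1) / summed distance to all other nodes; assumes a connected graph."""
    result = {}
    for node in graph:
        dist = distances(graph, node)
        result[node] = (len(dist) - 1) / sum(dist.values())
    return result

def betweenness(graph):
    """Number of shortest paths between other node pairs that pass through each node."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order, preds = [], {v: [] for v in graph}
        paths = dict.fromkeys(graph, 0)
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward the source.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # In an undirected graph every pair is visited from both ends.
    return {v: s / 2 for v, s in score.items()}

def eigenvector(graph, iterations=200):
    """Approximate eigenvector centrality by repeated multiplication."""
    score = dict.fromkeys(graph, 1.0)
    for _ in range(iterations):
        # Adding each node's own score keeps the iteration from oscillating
        # on bipartite graphs such as this toy example.
        updated = {n: score[n] + sum(score[m] for m in graph[n]) for n in graph}
        norm = sum(v * v for v in updated.values()) ** 0.5
        score = {n: v / norm for n, v in updated.items()}
    return score

between = betweenness(graph)
close = closeness(graph)
eigen = eigenvector(graph)
print(max(between, key=between.get), max(eigen, key=eigen.get))
```

In this toy graph, exploring ideas sits between most other pairs of practices, so it receives the highest score on all three measures. The values are unnormalized and are not comparable to those in Table 6.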

In addition to quantifying the social network of myFOSSIL, we created a visual depiction in the form of a sociogram (Fig. 5). Sociograms highlight relationships and showcase aspects that might be hidden when depicted in tabular form. We used a metric called edge weight, in which higher numbers of connections between practices resulted in thicker lines between the practices on a sociogram. An example of a relationship with increased edge weight was that of help desk leading to tips. This relationship occurred 87 unique times. Other learning activities which had high edge weights included stories-stories (i.e., one forum post that included the practice of stories often led to another forum post that included the practice of stories), help desk-support, and exploring ideas-exploring ideas. Relationships that occurred fewer times (e.g., models of practice-exploring ideas), had lower edge weights. These connections and the structure of the learning activities on the myFOSSIL forums therefore illustrated the knowledge-creating discourse that existed. Indeed, the forums showed high connectedness between the practices, as well as high numbers of connections between certain practices, which showed the way that people created knowledge on myFOSSIL.
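The edge-weight idea can be sketched in a few lines. The example assumes, for illustration only, that each forum post carries a single practice code and that an edge links each post to the next post in the same thread. The threads and the function name `edge_weights` are hypothetical and do not reproduce the study's coding or data.

```python
from collections import Counter

# Hypothetical threads (NOT the study's data): each thread is the ordered
# list of practice codes assigned to its posts.
threads = [
    ["help desk", "tips", "support"],
    ["help desk", "tips"],
    ["stories", "stories", "support"],
]

def edge_weights(threads):
    """Count how often one practice is directly followed by another within a thread."""
    weights = Counter()
    for thread in threads:
        for source, target in zip(thread, thread[1:]):
            weights[(source, target)] += 1
    return weights

weights = edge_weights(threads)
for (source, target), weight in weights.most_common():
    print(f"{source} -> {target}: {weight}")
```

The pair with the largest count would be drawn with the thickest line in a sociogram. A self-loop such as stories-stories arises when two consecutive posts share the same code.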

Fig. 5

myFOSSIL forum sociogram. Size and width of lines and arrows (edges) indicate practices with higher edge weights; practices with fewer connections are grayed out. Size of nodes (vertices) indicates betweenness centrality; larger nodes indicate higher betweenness centralities

In addition to showing general connectedness, determining if practices were grouped was important; thus, we applied the Clauset-Newman-Moore clustering algorithm (Clauset et al., 2004). Clustering revealed three subgroups of highly connected practices. These subgroups were visualized in a social network sociogram in which associated vertices were given similar shapes (Fig. 6). Group 1, shown in the shape of a disk, included the practices of boundary crossing; joint events; tips; pointers to resources and document sharing; formal practice transfer, trainings, workshops, and invited speakers; help desk; models of practice; and documenting practice. Group 2, depicted as a diamond, included news and information, support, collaboration, stories, and field trip planning. Group 3, depicted as a square, included exploring ideas, external benchmarks, and problem-solving.
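The general idea behind this kind of clustering can be illustrated with a naive greedy modularity procedure in the spirit of Clauset et al. (2004): start with every practice in its own group and repeatedly merge the two connected groups whose union most increases modularity, stopping when no merge helps. The sketch below is an assumption-laden simplification. It ignores edge weights and direction, omits the efficient data structures of the published algorithm, uses invented edges, and the function name `greedy_modularity` is hypothetical.

```python
from collections import Counter

# Toy undirected edges between practices (invented, not the study's network):
# two triangles joined by a single bridge.
edges = [
    ("exploring ideas", "problem-solving"),
    ("exploring ideas", "external benchmarks"),
    ("problem-solving", "external benchmarks"),
    ("support", "stories"),
    ("support", "news and information"),
    ("stories", "news and information"),
    ("problem-solving", "support"),
]

def greedy_modularity(edges):
    """Merge groups greedily while some merge still increases modularity."""
    m = len(edges)
    degree = Counter(node for edge in edges for node in edge)
    communities = {node: {node} for node in degree}
    label = {node: node for node in degree}  # node -> id of its current group
    while True:
        # Count the edges that run between each pair of distinct groups.
        between = Counter()
        for u, v in edges:
            if label[u] != label[v]:
                between[tuple(sorted((label[u], label[v])))] += 1
        best_gain, best_pair = 0.0, None
        for (ci, cj), count in between.items():
            a_i = sum(degree[n] for n in communities[ci]) / (2 * m)
            a_j = sum(degree[n] for n in communities[cj]) / (2 * m)
            # Change in modularity if groups ci and cj were merged.
            gain = 2 * (count / (2 * m) - a_i * a_j)
            if gain > best_gain:
                best_gain, best_pair = gain, (ci, cj)
        if best_pair is None:
            return list(communities.values())
        keep, absorb = best_pair
        communities[keep].update(communities.pop(absorb))
        for node in communities[keep]:
            label[node] = keep

groups = greedy_modularity(edges)
for group in groups:
    print(sorted(group))
```

On this toy network the procedure recovers the two triangles as separate groups, because merging across the single bridge would lower modularity.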

Fig. 6

Clauset-Newman-Moore cluster algorithm revealed a shift from seven higher-level practices as theorized by Wenger et al. (2009) to three collections of practices, depicted here. Group 1 consists of pink disks; group 2 is depicted as dark blue diamonds; group 3 is represented as orange squares

We examined these groupings to determine if the included practices were divided into their higher-level learning activity categories (Wenger et al., 2009) and found there was little, if any, association. For example, Wenger and colleagues theorized that the higher-level learning activity of exchange contained the specific learning activities of news and information, pointers to resources and document sharing, stories, and tips (Table 3), yet our analysis showed that tips and pointers to resources and document sharing were clustered in group 1 and news and information and stories were clustered in group 2. Similar separations occurred for the specific learning activities nested in producing assets and formal access of knowledge. This shows that the associations between higher-level learning activities and their specific learning activities were not as strong within this online community as originally conceptualized by Wenger et al. (2009). Therefore, the ECoP and conceptual framework determined by Wenger et al. (2009) were modified further by changing the higher-level learning activities from seven to three higher-level categories: community practices (group 1), skill development practices (group 2), and knowledge practices (group 3), as determined by the social network analysis grouping on myFOSSIL’s forums (Table 7).

Table 7 Revised CoP conceptual framework based on grouping algorithm from social network analysis

Discussion

The results of this study add to the theoretical conceptualizations of social practice in interest-based communities while allowing for the measurement of knowledge sharing within online communities using a new tool: the ECoP (Hafeez et al., 2019). Four main findings emerged: practices on myFOSSIL were both domain- and community-specific; website features influenced the development of practices; some PIT categories used practices similarly; and social network analysis of practices revealed different groupings of practices from those conceptualized by Wenger et al. (2009). We situate each of these main findings within the literature and wider theoretical discussions of CoPs.

myFOSSIL’s Community- and Domain-Specific Practices Inform Our Understanding of Previous CoP Literature

We showed that the forms of practice that existed on myFOSSIL were personal, related to sharing advice, and concerned with producing or at least exploring new ideas related to social paleontology. We found that within this online community, participants often used domain-specific practices such as tips as well as community-specific practices such as support and stories. Finding that members used community-specific practices that support fellow community members reflects previous studies of CoPs in which practices are condensed into the notions of mutual engagement, joint enterprise, and developing shared repertoires (Wenger-Trayner et al., 2015). Abedini and colleagues (2021) indicated that current research lacks understanding of adult learner characteristics within online CoPs; we sought to address this paucity by applying categories based on members’ self-identity with the domain of social paleontology (i.e., the PIT). In applying PIT categories and analyzing their website contributions, we found that some members tended to fulfill similar roles in spite of differences in PIT categories. Aldana and Martinez (2018) found that within a school environment, support-specific discussions among community members allowed for better support of English as a second language students. Within the context of an entrepreneurship community, Hafeez et al. (2019) described how engagement could be measured to determine personal learning. Contextualizing our findings in relation to these previous studies that have emphasized community-building shows how the myFOSSIL online community fulfilled certain CoP tenets that have been previously emphasized.

However, we also found that the online community employed practices that were domain-specific such as problem-solving and tips. Previous studies have explained how domain-specific practices could be enacted regardless of the domain, with community members writing, discussing, and commenting on one another (Liberatore et al., 2018; Wenger et al., 2009). While such descriptions can be useful as an overview, they are insufficient as they do not explicate specifics of CoP member contributions. The results of this study provide clear evidence for domain-specific practices that can occur within online, scientific communities.

Website Features and the Development of Practices

The different features available on the site afforded different practices. These findings tie to the literature on the nature of affordances (Gibson, 1977; Kaptelinin & Nardi, 2013; Martinez & Peters Burton, 2011; Norman, 2013) as well as to one of the seven theoretically-conjectured design principles for CoPs derived by Wenger et al. (2002): developing public and private community spaces. We focus on this theoretically-conjectured design principle as the myFOSSIL online community included both public forums and an activity feed as well as a private messaging function. The public forums afforded the development of different practices than were seen in messages. Interestingly, Wenger and colleagues (2002) emphasize private community spaces as integral to the success of CoPs. While this was somewhat true for myFOSSIL, the “well-orchestrated, lively” public forums could have fostered further connections within the paleontological community.

Affordances are possibilities for interaction that are provided to people by the environment (Gibson, 1977). Martinez and Peters Burton (2011) have taken this concept further, theorizing that there are six key affordances that apply to online environments. Two of the six, distributed expert networks (i.e., forming social connections across time and space) and forums for public discourse (i.e., spaces for building common intellectual goals), are closely related to the CoP aspects we studied. The forums on myFOSSIL allowed for members, especially those in the categories of public and scientist, to use the domain-specific practice of problem-solving frequently to attend to domain-specific issues.

Additionally, Martinez and Peters Burton (2011) indicate the creation of forums for public discourse within online environments showcases “the iterative nature of the scientific enterprise through social discourse” (p. 23). Within myFOSSIL forums, the discourse between members demonstrated the nature of the scientific enterprise; however, it was limited. This is especially apparent when examining the most common practice relationship: help desk-tips. Little iterative discourse occurred in this practice; rather, one member indicated that they had a question, and another member responded with the answer. Creating an environment that moves beyond this dichotomous discourse to reach Martinez and Peters Burton’s (2011) cognitive affordance of forums for public discourse would be a fruitful future research endeavor.

When members interacted with different features of myFOSSIL, they enacted different practices. Forum posts were used to provide support, tell stories, and problem-solve. When members interacted on messages, they most often used the practices of support and tips, and on the activity feed, members most often explored ideas, solved problems, and supported one another. Significant differences were found in the practices that were enacted on each, which indicates that each feature afforded members a different interaction experience.

Practices and Expertise Within Online Communities

When considering who is learning and who is engaging in knowledge-sharing discourse within CoPs, some researchers tend to amplify community member divisions. For instance, Dowthwaite and Sprinks (2019) contrast scientists and members of the public in their study, with scientists prescribing protocols and standards for collaborations within the context of a citizen science website. Divisions between community members have been emphasized in other studies of online, scientific communities (Lundgren et al., 2021; Corin et al., 2015; Forbes & Skamp, 2013); these interpretations are limited as they only account for ways in which members differ. Our research emphasized that for some community members (i.e., scientists and members of the public), the development of practices led to similar means of legitimate participation in and contribution to the domain. Additional work into expertise and practice within online CoPs has shown that members from across the continuum of expertise develop practices in similar ways that led to legitimate participation in and contribution to the domain (Lundgren et al., 2022b). In our study, members used their identity-based expertise to build community within the domain of paleontology while enacting scientific practices.

How Social Network Analysis Changed Our Understanding of Previous Conceptualizations of Practice Within Online CoPs

The use of social network analysis provided empirical evidence for revising the element of practice within the CoP theoretical framework (Wenger et al., 2002, 2009). In the conceptualization envisioned by Wenger et al. (2009), seven high-level activities were grouped and theorized to be positioned by their proximity to working with other people. Within this study, social network analysis showed a different grouping of practices based on an algorithm that searches for groups of densely clustered vertices (Hansen et al., 2011). Wenger and colleagues (2009) described seven groups of practices (i.e., high-level activities) as a “range of activities that communities of practices have been known to engage in” (p. 7), but provided little empirical evidence for this. We modify previous descriptions of such groups of practice based on empirical evidence presented in this study.

Within our study, we found three empirically-based groups that we rename from high-level activities to collections of practices. A collection of practices includes two or more observed practices that occur in relation to one another. We further this line of evidence by expanding on and naming the empirically-based groups. The first collection of practices (group 1) included stories, support, collaboration, tips, and news and information. This collection, community practices, includes those that relate to aspects of the element of the community as originally envisioned by Wenger and colleagues (2002) and explored shallowly by previous researchers. The second collection (group 2) included formal practice transfer, trainings, workshops, and invited speakers; help desk; models of practice; pointers to resources and document sharing; and boundary crossing. We describe this collection as skill development practices, as each of the identified practices describes a way of gaining additional expertise or expertise related to the domain. The third collection (group 3) included exploring ideas, problem-solving, and external benchmarks. This collection, knowledge practices, emphasizes a wider examination of situations, either domain- or community-specific. This empirically-evidenced grouping of practices offers a new starting point for others who wish to further study and refine the CoP theory.

Implications for Teaching, Learning, and Public Engagement with Science

We studied an online, science-focused CoP in which members with varied expertise and experience could share knowledge. Our work provides insight into digital design features that can elicit community-building, knowledge-building, and practice development. One of the seven design principles of CoPs is “to develop both private and public community spaces” (Wenger et al., 2002, p. 59). On myFOSSIL, there were established areas for public interaction (i.e., forums and activity) as well as for private interaction (i.e., messages). We suggest that people who operate or seek to develop or improve online, science-based CoPs continue to provide members places where they can have private and public conversations. Educators who might be seeking spaces to build their own knowledge or provide free enrichment activities for students could benefit from becoming members of online, science-based CoPs. Such spaces can help educators and students to build domain-specific practices and encourage interaction with domain experts. For other researchers studying science-based CoPs, we suggest that CoP practices can and should be revised as skill development practices, knowledge practices, and community practices. By envisioning practice in this way, CoP researchers could re-design platforms to better engage people from varied backgrounds and interests in their specific communities of practice.

Limitations

Within this study, we had to account for two main issues associated with social network analysis and using digital trace data: using found and event-based data (Howison et al., 2011). The first issue, that these data were found as opposed to produced data, meant that the data were a by-product of activities. To alleviate validity issues associated with found data, this research addressed non-links as both a limitation and as a potential avenue for further research. Such non-links were forms of paleontological practice or practice development that did not seem to occur on myFOSSIL and/or did not seem to be enacted by certain categories of PIT members. Additionally, some data on myFOSSIL were not coded as these posts were created by members who were under the age of 18 or myFOSSIL members who did not consent to participate in the study (n = 43).

The second issue, that the data were event-based, was a concern because, within the literature, digital trace data are often presented as dichotomous (Howison et al., 2011; Lampe, 2013). The dichotomy is most often represented as high interaction versus low interaction; for example, more than five interactions with another member of the network equals a strong relationship and fewer than five interactions equal a weak relationship. Because the emphasis of the study was on describing the practices broadly, understanding what practices were present was generally more important than how often members interacted. The dichotomy of high versus low interactions was therefore minimized by focusing on the holistic characteristics of members’ practices.

An additional limitation of our study is online platforms’ and communities’ penchant for rapid change. Wenger and colleagues (2002) and Iriberri and Leroy (2009) indicate that CoPs have lifespans in which critical mass is reached, the community solves their problem of practice, or new communities are created from original ones. myFOSSIL has existed as an online community since 2014, and thus has most likely evolved as per the CoP lifecycle. An offshoot community of myFOSSIL exists on an app (Bex et al., 2019b) and myFOSSIL had a robust social media following in which learning occurred (Lundgren et al., 2022b). As this study took place from 2015 to 2017, the way the community operates now may differ from how it functioned during the study. An analysis of the lifecycle of myFOSSIL could shed light on how it evolved in relation to the CoP lifecycle.

Conclusion

This study was framed by a need to determine who is engaging in interest-based, knowledge-creating discourse and to provide a more grounded, empirical description of online, informal learning environments. Accordingly, we explored these issues for a group of members from the myFOSSIL community. Our findings highlighted the practices used on myFOSSIL, namely ones that promoted community cohesion as well as scientific knowledge generation. This work also characterized the ways that members contributed differently depending on the website feature and its affordances, as well as on PIT categories, with the PIT categories of scientist and public having more similarities with each other than with education and outreach. In addition, this study contributes to the body of literature concerning CoPs, in that the seven conceptual categories developed by Wenger et al. (2009) were collapsed into three empirical categories, offering new ways to explore CoPs that are based on evidence. Lastly, we see potential for results from this case study to be transferable to other interest-based online communities, as we provided evidence for theoretical conceptualizations of social practice within such settings.