Introduction

This study investigates the relationships between peer-to-peer interactions and (1) group formation among students, (2) choice of research topics for end-of-semester projects (term papers), and (3) course performance in an online ecology course at a research university. Researchers asking employers to rank twenty-first century workplace skills reported that peer-to-peer interaction and collaboration were ranked among the most necessary skills for workers to bring to the workplace (Finch et al., 2013; Jang, 2016; Lievens & Sackett, 2012); however, employers believe that graduating university students lack these skills and are ill-prepared to work collaboratively (Gray et al., 2005; Hart Research Associates, 2015; Hora et al., 2016). In response, scientific societies and science educators initiated calls for reforms in curricula and expectations to promote graduating students’ workplace skills, such as collaboration and communication (e.g., American Chemical Society Committee on Professional Training, 2015; Heron et al., 2016). A major approach to promoting collaboration skills is to include peer-to-peer interactions and group work activities in and outside the classroom. Past work has shown that partaking in social interactions improves students’ educational experience, motivates them, and promotes their higher-level thinking skills (Cohen, 1991; Crone, 1997; Daggett, 1997; Junn, 1994; Oztok, 2016; Rocca, 2010). However, these experiences should be carefully planned and executed, as a growing body of research suggests that the value students place on skills is shaped by their experiences (Marbach-Ad et al., 2016, 2019; Demaria et al., 2018; Gilmore et al., 2015; Lavi et al., 2021; McGunagle & Zizka, 2020). Previous work found that students in science disciplines, especially students with high GPA, ranked working in groups as the least important skill to acquire during their undergraduate studies (Marbach-Ad et al., 2016, 2019). In interviews, students explained that negative experiences with group work (e.g., unequal workload distribution) shaped their unfavorable views about the importance of learning to work collaboratively.

To change students’ experience and views about group work, research has focused on exploring how to plan well-crafted peer-to-peer interaction assignments. While most of the research focused on in-person settings, it is also important to explore peer-to-peer interactions in online and computer-based settings (Bento & Schuster, 2003; Ouyang & Scharber, 2017; Oztok, 2016; Saqr et al., 2018; Traxler et al., 2018). Such educational settings are becoming more and more relevant, given the increase in online course offerings at universities. Almost all courses were transformed into some mode of online learning during the pandemic (Pokhrel & Chhetri, 2021).

Peer-to-Peer Interactions in Online Settings and Social Network Analysis

Previous studies on online interactions among students (and sometimes with their instructor) showed associations between levels or types of online interactions and different course outcomes such as grades/performance (Bernard et al., 2009; Saqr et al., 2018; Shea et al., 2001), satisfaction (Hartman & Truman-Davis, 2001; Pozón-López et al., 2021; Wei & Chou, 2020), and self-perceived learning experience (Dziuban & Moskal, 2001; Wei & Chou, 2020). Based on large-scale systematic reviews and meta-analyses, online interactions have long been emphasized as an integral and essential part of online learning (Bernard et al., 2009; Borokhovski et al., 2016; Saqr et al., 2018; Wanstreet, 2006).

Research methods to explore peer-to-peer and other social interactions in online learning have mainly focused on qualitative data collected through open-ended questions on surveys and interviews, which required the use of qualitative data analysis methods (e.g., content analysis and use of reliable coding themes) (Azer & Azer, 2015). However, in recent years, network analyses to tease apart relational aspects of peer-to-peer interactions have led to novel insights on understanding online engagement and academic performance (Ouyang & Scharber, 2017; Saqr et al., 2018; Traxler et al., 2018). Social network analysis (hereafter called network analysis) has sought out patterns of relationships and interactions between people and investigated the structure of these patterns and their effect on social phenomena (Martínez et al., 2003). These networks represent interactions among people, and the representation consists of nodes, which represent individuals, and edges between nodes, which represent the interactions between those nodes. Such a representation affords looking at the complex structure and interdependencies of interactions among individuals and how they are impacted by various factors in a social or educational setting, a way of looking that is seldom afforded by simple statistical methods (Martínez et al., 2003).

For example, studies of students’ participation in courses have shown that interaction network properties such as degree (number of connections associated with an individual) and centrality (importance of an individual with respect to the structure of the network) impacted the probability of passing a course (Romero et al., 2013). Moreover, centrality was correlated with student learning more strongly than academic motivation, prior performance, and social integration. Other work has shown that the personal interaction networks of students were associated with their course performance (Gašević et al., 2013; Joksimović et al., 2016). Many of these studies have faced barriers, mostly due to the lack of reproducibility across multiple courses, and some even report conflicting results (Saqr et al., 2018), mainly because interactions are structured differently under varying course conditions.

In our study, we addressed these barriers by collecting data about peer-to-peer interactions, course performance, and student demographics from an intermediate-level, asynchronous, online ecology course over six semesters. It is noteworthy that most network-based studies in online settings that explored the relationships between peer-to-peer interactions and student performance/productivity were performed in large class sizes or introductory-level courses (Bettinger et al., 2016; Ouyang & Scharber, 2017; Traxler et al., 2018), and have seldom been applied to upper-level courses with a specified content area. In the online format of our course, there are various opportunities for peer-to-peer interactions, including commenting on research papers posted on a discussion board, and group projects synthesizing important peer-reviewed literature. We use network science methods to examine the extent to which peer-to-peer interactions (during online discussions and homework assignments) are associated with group formation, choice of research topic, and course performance. Instead of only focusing on course performance over an entire course-period, and to identify temporal differences in performance as a function of network structure of interactions, we applied fine grain analyses to the dataset.

Group Formation and Composition

Group size and composition have recently received a lot of attention in the literature (Wilson et al., 2018). Regarding group size, most studies suggest that small groups of about 3–5 students are more effective and cooperative (Aggarwal & O’Brien, 2008; Lou et al., 2001), and have better group performance, such as ability to solve problems (Heller & Hollabaugh, 1992), and higher satisfaction (Aggarwal & O’Brien, 2008) than groups of other sizes. However, in a recent meta-analysis that summarized data from 24 studies on students’ chemistry understanding, Apugliese and Lewis (2017) found no difference in performance between groups of four or fewer and groups of five or more.

Beyond optimal group size, research has also been conducted on effective group composition, in terms of academic performance (Jensen & Lawson, 2011), and demographic characteristics, such as gender (Takeda & Homberg, 2014; Woolley et al., 2010) and ethnicity (Watson et al., 1993). However, recommendations, especially regarding building heterogeneous vs. homogeneous groups, are mixed. In terms of academic performance, a meta-analysis found that low-achieving students demonstrated stronger outcomes when placed in heterogeneous groups, whereas mid-achieving students demonstrated stronger outcomes when working in homogeneous groups (Lou et al., 2001). In terms of demographics (either ethnic or gender), although it may seem important to evenly distribute minority students among groups (Rosser, 1998), it is also important not to let them feel isolated and different from other members in the group. Freeman and colleagues (2017) argued that when students were allowed to self-select into groups, they tended to choose group members of the same gender and ethnicity.

Reviewing these studies shows that it is difficult to draw general conclusions about the best group composition (Donovan et al., 2018), since most studies are very specific in terms of the discipline, the type of assignment, and the measured outcomes (e.g., performance, satisfaction, communication, and collaboration skills). For the same reasons, it is also difficult to determine the impact of group selection criteria (e.g., random vs. self-selected) on students’ overall success (Jensen & Lawson, 2011). For example, in a study with 16 upper-level undergraduate business courses comparing self-selection and random groups, end-of-semester surveys showed that the students in self-selected groups had better learning outcomes (e.g., communicating, enthusiastic, comfortable, and interested in working with each other) than students in the randomly selected groups. On the other hand, students in the randomly selected groups reported that members used time in group meetings more efficiently and that the group was more task-oriented than students in the self-selected groups (Jensen & Lawson, 2011). With strengths and weaknesses presented for several options, it becomes difficult to draw conclusions on ideal group compositions.

In our course, we asked students to form their own groups. It is noteworthy that the course is taught completely asynchronously, so the whole class does not meet in person or virtually, and students choose their peers based on a detailed introduction (with photos and professional summary) and subsequent interactions through the course platform. A few students (about 1–3 per semester) were not able to form a group on their own, and the graduate teaching assistant of the course either assigned them to a previously formed group that had only 2 members (when only 1 student was unassigned) or placed them together in a group of their own. We performed network-based regression analysis (which is akin to a linear regression but also considers the connections in a network) on whether the interactions and/or demographics played any role in group formation. We also performed regression analysis to explore relationships between group composition (ethnicity, year in school, and gender) and performance in the group project/paper.

Overall, in our work we use a network-based framework to explore patterns and correlative structures in the student peer-to-peer interaction, choice of research, and performance data to look at the following research questions:

  1. To what extent are peer-to-peer interactions in in-class discussions and homework assignments associated with group formation, choice of research, and course performance?

  2. Are demographic variables (gender and ethnicity) associated with group composition and performance?

Methods

Context of the Study

This research took place at the University of Maryland, a research-intensive university on the east coast of the USA. To explore what role online peer-to-peer interactions might have over the timeline of a course, we used peer-to-peer interactions, performance, and demographic data collected from BSCI 361 (Principles of Ecology), a small intermediate-level ecology course (average class size is 20–25), over six different semesters (Winter 2018; Winter 2019; Winter 2020; Summer 2020; Winter 2021; Summer 2021), with the same instructor, same teaching assistant (TA), and an unchanged course structure (Supplemental Material A). Principles of Ecology introduces basic concepts of ecology and the use of these principles to predict possible consequences. The course covers topics in the areas of organism, population, community, and ecosystem ecology, as well as human effects on global systems.

The course is online, asynchronous, and hosted on the university’s Enterprise Learning Management System (ELMS) platform through Canvas. The four-credit course has lectures and course participation through a weekly or bi-weekly discussion, depending upon the semester (the summer course runs for 12 weeks, whereas the winter course covers the same content in 3 weeks). The discussions center around research papers that are geared to the topics being discussed in the lectures. Every discussion has a component where each student proposes a possible experimental framework to explore further the paper in focus, including new analyses, hypotheses, and critical testing of the original work (see Supplemental Material B). These prompts showcase different perspectives on the same material and foster discussions, which are the source of the peer-to-peer interactions that we record. Students’ grades consist of the following (see Supplemental Material C): (1) their comments on the posted papers and question prompts, and their responses to at least two other student comments from 10 discussion sessions, (2) three exams, and (3) a term paper (submitted as a group project). The grades for the discussions are the equivalent of one exam grade. By the end of discussion 3, small groups are formed from relationships built and interests developed within the discussion setting. The main goal of group formation is to work together on the end-of-course research paper project (see Supplemental Material D for rubric). Over the next week or two, the students come up with a group term paper topic (although they can change the exact topic later in the course if they wish) and survey related research content before writing up their final paper. If students do not select group members by the deadline, they are assigned to groups by the instructors. The discussion board serves as the “classroom” environment for this online course and is designed to encourage student engagement with each other and with the teaching assistant.

The learning outcomes for the course state that by the end of the course students should be able to do the following:

  1. Design a simple experiment to test hypotheses and to identify a good experimental design.

  2. Create and interpret tables and graphics.

  3. Use mathematical and critical thinking skills to describe ecological processes.

  4. Work with others (team and cooperative learning).

Study Participants

Study participants consist of all the students in the six courses (N = 132, with about 20 students per course) who finished the course and did not drop out. They were incentivized with a small (5 points) bonus in the course to participate in the data gathering process. The cohort of students was diverse in ethnicity (33% Caucasian, 28% Asian, 20% Black, 9% Hispanic, and 10% others), and primarily comprised of students who major in General Biology (49% General Biology, 15% Environmental Science, and 36% others, including Law and Music). More students identified themselves as females than as males (56% females, 39% males, 5% others or not identified). The course caters to all levels of college students although it tilted more towards advanced students (14% freshmen, 16% sophomores, 32% juniors, and 38% seniors).

Data Collection

For each of the six semesters, we collected students’ performance data (grades and points scored by each student in different assignments, tests, and discussion postings including comments from peers over the course; Fig. 1), along with course working group membership records, and research term paper contents. Data regarding interactions in discussion boards, student performance reports, and group membership of individuals was exported from the ELMS page, and in cases where automated data archiving was not possible, the data was recorded manually on a spreadsheet.

Fig. 1

A schematic for the course structure—units used for data analysis. “D” here refers to a discussion

Data Analysis

To represent the interaction between students based on discussion posts and replies, we used an undirected weighted interaction network analysis for each discussion (Fig. 2). As such, each node represents one individual, and an edge implies undirected interaction (we did not distinguish between who started a comment and who replied on the discussion board). We chose an undirected network format in our case to simplify our analysis. The weight of each edge denotes the number of interactions between the two specific nodes for a given discussion. These networks were constructed for each discussion (separately for every course) and were used to look at changes in interaction structure of discussions over the span of the semester. For each network, we calculated the node degree (number of edges connected to a given node) and total weight for each node (total sum of interactions per node) (Fig. 2). Because the discussions were assigned in sequence, we assumed them to be our temporal axes for each semester. Accordingly, we averaged the values over the six semesters along this time axis to calculate the mean node degree and node weight. We used R statistical software (v3.6) for this analysis and all subsequent analyses with the package igraph (Csardi & Nepusz, 2006).

Fig. 2

An example representation of peer-to-peer interaction networks used in this study. The network consists of nodes (red circles) and edges between them (gray lines). Nodes represent individual students and edges represent interactions among them. The major metrics used in this study include the degree, which is the number of edges associated with a node (and is a node attribute), and edge weight, which represents the total number of interactions between two nodes (and is an attribute of the edge connecting them)
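The network construction described above can be illustrated with a minimal sketch. The analyses in this study were run in R with igraph; the standard-library Python below is only an illustration, and the student names, the interaction log, and the choice to ignore self-replies are hypothetical assumptions rather than details taken from the course data.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected weighted network from (student, student) pairs.

    Each pair is one post/reply exchange. Direction is ignored, so
    ("ana", "ben") and ("ben", "ana") add weight to the same edge.
    """
    edge_weight = defaultdict(int)
    for a, b in interactions:
        if a == b:
            continue  # assumption: a reply to one's own post is not an interaction
        edge_weight[tuple(sorted((a, b)))] += 1
    return dict(edge_weight)


def node_metrics(edge_weight):
    """Return (degree, weight) dictionaries keyed by node."""
    degree = defaultdict(int)   # number of distinct peers
    weight = defaultdict(int)   # total number of interactions
    for (a, b), w in edge_weight.items():
        for node in (a, b):
            degree[node] += 1
            weight[node] += w
    return dict(degree), dict(weight)


# Hypothetical interaction log for a single discussion
interactions = [
    ("ana", "ben"), ("ben", "ana"), ("ana", "cy"),
    ("ben", "dee"), ("cy", "ana"),
]
edges = build_network(interactions)
degree, weight = node_metrics(edges)
mean_degree = sum(degree.values()) / len(degree)
mean_weight = sum(weight.values()) / len(weight)
```

Here degree counts distinct peers while weight counts exchanges, so a student who trades many replies with one classmate has a low degree but a high weight.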

The exams incorporate multiple choice, and open-ended short answer questions (that were developed by the instructor and the TA based on the materials learned in the discussions, textbook, and lectures). The open-ended questions were graded based on a rubric provided by the instructor’s exam key (see Supplemental Material E for examples). We first divided the discussions into three sets—depending upon the timing of the three course exams (i.e., the discussions between two exams were termed as a set, as the contents of these discussions were important for the upcoming exam) (Fig. 1). For each of the discussion sets (per course), we created a combined undirected network in the same way we had done for individual discussions.

We then used the constructed networks (see Fig. 2) for each set of discussions and calculated the mean change in degree and weight between subsequent sets of discussions (e.g., degree of node interaction (i) in set 2 minus degree of node interaction in set 1). To understand the relationship between student performance in a set of discussions and the exams’ scores, we regressed these values against the change in grade/points separately for positive and negative changes. The regressions were performed using simple linear and mixed effects models in R with a fixed intercept and a random slope. The fixed intercept assumes, as a null model, that no change in interaction corresponds to no change in performance. We included the random slope to examine possible differences among course cohorts and exams, and account for those sources of variation. We also performed a simple linear regression between the number of total interactions in a given discussion (per course) and the number of attributable projects that result from it.
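As a rough illustration of this step, the sketch below computes per-student changes in degree between two discussion sets and fits a least-squares slope separately for positive and negative changes. It assumes that the fixed intercept is zero (no change in interaction, no change in performance) and it omits the random slopes for cohort and exam; all student names and values are hypothetical.

```python
def degree_change(before, after):
    """Per-student change in degree between two successive discussion sets."""
    students = set(before) | set(after)
    return {s: after.get(s, 0) - before.get(s, 0) for s in students}


def slope_through_origin(xs, ys):
    """Least-squares slope for y = b * x, with the intercept fixed at zero."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


# Hypothetical degrees in discussion sets 1 and 2, and change in exam points
set1 = {"ana": 2, "ben": 3, "cy": 4, "dee": 1}
set2 = {"ana": 4, "ben": 2, "cy": 2, "dee": 2}
grade_change = {"ana": 6.0, "ben": 0.5, "cy": -1.0, "dee": 2.0}

delta = degree_change(set1, set2)

# Split into students whose degree rose and students whose degree fell
positive = [(d, grade_change[s]) for s, d in sorted(delta.items()) if d > 0]
negative = [(d, grade_change[s]) for s, d in sorted(delta.items()) if d < 0]

slope_pos = slope_through_origin(*zip(*positive))
slope_neg = slope_through_origin(*zip(*negative))
```

Fitting the two groups separately allows an increase in interactions to relate to performance differently than a decrease does.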

We calculated differences in these metrics across semesters to identify notable differences. We used two different tests: (1) a two-sided t-test where the data are the differences between the time series of network metrics for two different semesters (say, winter 2018 = X and winter 2019 = Y), and (2) a one-sided F-test to compare the variances of X − Y and X. The first test tells us whether the differences in the time series are significantly different from 0. For the second, if time series X is similar to time series Y, then the variance of X − Y should be less than the variance of X (i.e., if the ratio var(X − Y)/var(X) is significantly less than one, then Y explains a significant proportion of the variance of X).
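The two test statistics can be sketched as follows, reading the variance comparison as one between the difference series X − Y and X (an assumption about the notation). The sketch stops at the statistics themselves; p-values would come from the t and F distributions, which are not computed here. The two series are hypothetical mean-degree values over five discussions.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t_statistic(x, y):
    """t statistic for the null hypothesis that the mean of x - y is zero."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def variance_ratio(x, y):
    """Ratio var(x - y) / var(x); values well below one suggest y tracks x."""
    diffs = [a - b for a, b in zip(x, y)]
    return variance(diffs) / variance(x)


# Hypothetical mean degree per discussion in two semesters
x = [2.0, 3.0, 4.0, 3.5, 3.0]
y = [2.5, 2.5, 4.5, 3.0, 3.0]

t_stat = paired_t_statistic(x, y)
f_ratio = variance_ratio(x, y)
```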

To investigate whether the number of course interactions between two students was indicative of them being a part of the same (project) group, we performed a logistic regression of pairwise edge weights (between students) for a given course (cohort) against an indicator of mutual group membership (1 if two students were in the same group, 0 otherwise). This exercise was repeated separately for edge weights before and after the group formation was announced to ensure that we are not just detecting excess communication between group members after group formation. We also explored whether the number of interactions in the collated networks over the whole course (separate for each course) was based on demographics (gender and ethnicity) and future choice of groups, by using a weighted Exponential Random Graph Model (wERGM) using the R packages “statnet” (Handcock et al., 2008) and “ergm.count” (Krivitsky et al., 2012) and a Poisson distribution as the reference model for interactions between individuals. ERGMs are used to investigate whether the network structure is affected by a given variable, and work like a regression in some way. We collated the significance values using Stouffer’s method (Heard & Rubin-Delanchy, 2018).
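For readers unfamiliar with Stouffer’s method, the sketch below combines per-course p-values into a single value. It assumes the unweighted form applied to one-sided p-values, which may differ from the variant used in the analysis, and the six p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def stouffer(p_values):
    """Combine one-sided p-values with the unweighted Stouffer method.

    Each p-value is converted to a z-score, the z-scores are summed and
    divided by sqrt(k), and the result is converted back to a p-value.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(z_scores) / sqrt(len(z_scores))
    return z_combined, 1.0 - std_normal.cdf(z_combined)


# Hypothetical p-values, one per course cohort
p_per_course = [0.20, 0.08, 0.15, 0.30, 0.05, 0.12]
z, p_combined = stouffer(p_per_course)
```

Several individually weak results that point in the same direction can yield a small combined p-value, which is why the method suits collating evidence across the six cohorts.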

To explore whether more interactions in a discussion resulted in more end-of-semester group projects that related to the discussion topics, we matched each of the projects to discussion keywords. To do so, ecological concept keywords (3–5) were attributed to each discussion paper by four volunteers (in addition to the first author) independently. They came to a consensus on the final set, so that each discussion can be attributed to a specific set of conceptual ideas. For example, the keywords for discussion 1 included the term phenology (see Supplemental Material F for list of papers). The same process was used to assign keywords to the group term papers, which were then connected to the discussions they most closely matched.
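The matching itself was done by people reaching consensus, but the attribution step can be pictured as a keyword-overlap lookup. The sketch below is a hypothetical illustration of that idea: only the keyword phenology for discussion 1 comes from the text, and every other keyword and the tie-handling rule are assumptions.

```python
def closest_discussions(project_keywords, discussion_keywords):
    """Return the discussion(s) sharing the most keywords with a project."""
    project = {k.lower() for k in project_keywords}
    overlap = {
        name: len(project & {k.lower() for k in keywords})
        for name, keywords in discussion_keywords.items()
    }
    best = max(overlap.values(), default=0)
    if best == 0:
        return []  # no shared keyword, so nothing is attributed
    # Ties are kept, so a project may be attributed to several discussions
    return sorted(name for name, n in overlap.items() if n == best)


# Hypothetical keyword sets (3 per discussion)
discussion_keywords = {
    "D1": ["phenology", "climate change", "migration"],
    "D2": ["competition", "niche", "coexistence"],
    "D3": ["predation", "trophic cascade", "food web"],
}
project_keywords = ["Phenology", "migration", "food web"]
matches = closest_discussions(project_keywords, discussion_keywords)
```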

Grades on the end-of-semester project were calculated out of a total of 60 points using a rubric (for a description of each variable, see Supplemental Material D). The rubric comprises the following variables: content (total points 15), synthesis (total points 10), good use of sources (total points 5), structure and organization (total points 15), quality of writing (total points 5), and nuts and bolts (total points 10).

We also did a regression between the normalized performance and (1) proportion of students in a group from underrepresented minority (Latino/Hispanic, Black, Native American), (2) proportion of students who identified themselves as female or non-binary, and (3) number of group members.
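The three group-level predictors can be derived as in the sketch below. It assumes that normalized performance means project points divided by the 60-point maximum, which is not stated explicitly, and the group members and their attributes are hypothetical.

```python
# Categories listed in the text as underrepresented minority
URM = {"Latino/Hispanic", "Black", "Native American"}


def group_predictors(members, points, max_points=60):
    """Summarize one project group for the regression.

    Assumption: normalized performance = points / max_points.
    """
    n = len(members)
    return {
        "normalized_score": points / max_points,
        "prop_urm": sum(m["ethnicity"] in URM for m in members) / n,
        "prop_female_nb": sum(
            m["gender"] in {"female", "non-binary"} for m in members
        ) / n,
        "size": n,
    }


# Hypothetical group of four students scoring 48 of 60 points
group = [
    {"ethnicity": "Black", "gender": "female"},
    {"ethnicity": "Asian", "gender": "female"},
    {"ethnicity": "Caucasian", "gender": "male"},
    {"ethnicity": "Asian", "gender": "non-binary"},
]
row = group_predictors(group, points=48)
```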

Results

Below, we present the results according to the two research questions.

  1. To what extent are peer-to-peer interactions in in-class discussions and homework assignments associated with group formation, choice of research, and course performance?

    After combining the results from the six semesters, we found that over the timeline of the semester, the average number of people that one individual interacted with (average degree) first increased (from discussion sessions 1 to 4) and then decreased (Fig. 3A). In contrast, the average total number of interactions (average weight per node) mainly increased (until discussion 7) and then stabilized (Fig. 3B). The error cloud in the figures denotes the range of values per semester per discussion for degree (in blue, Fig. 3A) and weight (in red, Fig. 3B). We found no differences in the two metrics among any pairs of semesters (through two different tests: paired t-tests of differences, and a one-sided F-test comparing relative variances; p > 0.1 in both cases; see “Methods”) and therefore, we propose that the pattern was global. We assume that the overall growth in interactions (average weight) throughout the semester is due to the increase in intensive work around term papers. However, the interactions became more specific to a smaller subset of people as the semesters went on, leading to the decrease of degree after discussion 4. We assume that because the students needed to finalize the small group choice by discussion 3, they started to communicate mainly with their group mates.

    When we compared networks constructed from collated sets of discussion (each set is symbolic of an exam, see “Methods”), we found that an increase in interactions between two successive sets of aggregated discussion interaction networks was associated with better performance in related exams (light red dots), but a decrease in interactions between the same (purple dots) did not necessarily seem linked to a decrease in performance (Fig. 4A). This might mean that peer-to-peer interactions in research-based discussions have the potential to raise one’s performance, even though fewer interactions did not necessarily mean lower performance in the exam grade, which may be more individually driven. We found that this effect is significant in all the three regressions we did: simple standard regression, one with courses as random effects, and one with exams as random effects (p < 0.05 in all cases).

    For node edge weights (which reflected the intensity of peer-to-peer interactions), we found a different result. Specifically, change in edge weight was not correlated with change in performance (Fig. 4B). This might mean that the number of people with which students interacted was not necessarily as important as the quality of those interactions, even if those interactions were limited to fewer people.

    We used a logistic regression to examine the relationship between the cumulative number of pairwise interactions over a course and (the probability of) two students being in the same group (for term papers). We found a significant effect (p < 0.001; intercept: 5.7721 ± 0.7812, which makes the odds ratio of having higher interactions and being in a group to be 321.21 with 95% CI: [147.07, 701.56]), which suggests that individuals who interacted more among themselves tended to form working groups for term papers. The results were significant even when the data were divided into discussions before (p < 0.001; intercept: 4.9528 ± 0.3315, which makes the odds ratio of having higher interactions and being in a group to be 141.57 with 95% CI: [101.62, 197.21]) and after the group formation (p < 0.001 intercept: 6.5630 ± 0.7812, which makes the odds ratio of having higher interactions and being in a group to be 708.39 with 95% CI: [327.54, 1532.11]). Note that the pattern was stronger after group formation, meaning that after students formed groups, they tended to interact more with their group members on normal discussion posts. These results show that peer-to-peer interactions, which are random in the beginning and are further shaped by mutual interests, can predict the creation of amicable working groups in an emergent fashion. We found the same pattern of consolidation of interactions across all six courses. Anecdotally, we observed that people with low overall degree and overall weight had to be assigned to groups by the instructor/TA. We found no relation between network interaction structure across courses and demographics. No results were significant at a p value of 0.05 after multiple comparison corrections, but we confirmed that the group identity was a good indicator of past interactions (corrected p = 0.0371).

    The end-of-semester projects were attributed to one or more discussions/per course, based on the identified keywords. The number of these attributable projects (per discussion) was found to correlate significantly with the total number of interactions in that discussion (Fig. 5) (simple linear model p < 0.05). One reason could be that a higher number of interactions in a given discussion created increased levels of interest and more engagement among the students. This phenomenon can lead to a variance in interest accrued among the students for each topic, creating differentials in favor of the projects that they eventually chose to work on. These observations can help us understand the importance of peer-to-peer interactions in structuring the interests of students in a course.

  2. Are demographic variables (gender and ethnicity) associated with group composition and performance?

    We found no relation between group performance and the proportion of either underrepresented minority or female/non-binary students in a group in the weighted Exponential Random Graph Model (wERGM) after collating the significance values using Stouffer’s method (Heard & Rubin-Delanchy, 2018; we found p > 0.05 in both cases). The same was true (p value > 0.05) for each of the regressions we performed between the normalized performance and (1) proportion of students in a group from underrepresented minority groups (Latino/Hispanic, Black, Native American), (2) proportion of students who identified themselves as female or non-binary, and (3) number of group members.
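The odds ratios reported for the logistic regressions under research question 1 follow from exponentiating the estimates on the log-odds scale. The sketch below reproduces that arithmetic for the whole-course and pre-formation estimates, treating the value after the ± sign simply as the quoted uncertainty; the function name and dictionary labels are hypothetical.

```python
from math import exp


def odds_ratio_with_bounds(estimate, uncertainty):
    """Exponentiate a log-odds estimate and the estimate +/- its uncertainty."""
    return (
        exp(estimate),
        exp(estimate - uncertainty),
        exp(estimate + uncertainty),
    )


# Estimates and quoted uncertainties as reported in the text
reported = {
    "whole course": (5.7721, 0.7812),
    "before group formation": (4.9528, 0.3315),
}

results = {
    label: odds_ratio_with_bounds(est, unc)
    for label, (est, unc) in reported.items()
}
```

Rounded to two decimals, the whole-course values are 321.21 with bounds of 147.07 and 701.56, in line with the figures in the text.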

Fig. 3

The evolution of the average number of people that one individual interacted with (average degree, A) and the average total number of interactions (edge weights, B) throughout the discussion sessions. The black line denotes the mean value and the error cloud around it denotes the range of values per semester per discussion for degree (in blue, A) and weight (in red, B)

Fig. 4

Relative change in grades as a function of change in node degree (A) and edge weight (B). Each point denotes an exam and collated discussion network comparison for every student. Red points denote a positive change in degree (A) or weight (B), and purple colors denote their negative counterparts, between two successive sets of aggregated discussion interaction networks

Fig. 5

Scatterplot of cumulative interactions vs. projects, where each point depicts the interactions and projects related to a discussion in a given course session

Discussion

In their recent paper, Lavi and colleagues (2021) reported that students’ active learning experiences (including group work and peer-to-peer interactions) impacted the development of students’ appreciation for soft skills (e.g., collaboration and communication) and STEM-specific skills (e.g., STEM knowledge application). They stressed that the four active learning methods that they identified (project, course assignment, research, and laboratory lesson) “all share one, and only one, form of teaching and learning: working with others” (p. 9). These findings reinforce previous research results (Dori & Belcher, 2005; Freeman et al., 2014; Prince, 2004), suggesting that peer-to-peer interaction and group work are at the core of active learning, and therefore, it is important to understand its impact on student performance and other aspects of a course.

The effect of peer-to-peer interactions on undergraduate course performance in online platforms has been studied with network tools for some time (Ouyang & Scharber, 2017; Oztok, 2016; Saqr et al., 2018; Traxler et al., 2018). However, few studies have examined the downstream effects of early-semester interactions on students’ interests in course topics over the rest of the course. We observed pronounced temporal variation in peer-to-peer interactions (Fig. 1), a topic that is usually studied only in the context of large class sizes. These results provide important clues about how peer-to-peer relationships evolve. We explored the research questions in a medium-sized online ecology classroom that was replicated six times using the same syllabus and instructors.

The effect of peer-to-peer interactions on performance is an often-explored question in the online course and education literature (Gašević et al., 2013; Joksimović et al., 2016; Ouyang & Scharber, 2017; Oztok, 2016; Saqr et al., 2018; Traxler et al., 2018). In this work, we found that increases in interactions among peers had a positive effect on performance, but decreases did not have any effect (Fig. 4A). This suggests that a subsequent decrease in interactions may not erode one’s existing knowledge of a subject (which might be individually benchmarked), whereas an increase in interactions can raise awareness of additional ideas (and understanding of materials) and hence improve course performance.

In past work, researchers have observed that the social structure in online courses forms early on as interacting clusters (groups) and eventually concentrates in smaller subgroups within those groups (Xu et al., 2018). We found something similar in our work: the number of edges (i.e., peers one interacts with) increased during the early weeks of the semester, but then, after groups coalesced, the strength of interaction within the group increased even as the average number of interactors per student decreased (Fig. 3). Similar patterns have been observed in other studies of online and blended learning, wherein the depth of interactions among peers increases until the middle of the semester and then stabilizes (Shu & Gu, 2018). This kind of specialized group formation and intensification of interaction could be important to consider when structuring interactive activities in online courses. We also observed that the probability of two students being in the same group increased when the number of interactions between them was high, and this was true both before and after the group formation process. This might mean that initial interactions pave the way for the formation of early clusters of interactions. Eventually, these clusters coalesce into smaller groups (as seen in Xu et al., 2018), and after group formation, students tend to increase their interactions within those smaller groups.
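The two network quantities discussed above, the number of distinct peers a student interacts with (degree) and the total number of interactions (edge weight), can be computed from a simple list of interaction records. The sketch below is illustrative only and is not the pipeline used in this study. It assumes that each record is an undirected pair of student identifiers, that self-replies are ignored, and that averages are taken over students with at least one interaction. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def degree_and_weight(interactions):
    """Return per-student degree and total edge weight.

    `interactions` is an iterable of (student_a, student_b) pairs, one pair
    per recorded interaction; direction is ignored (an assumption here).
    """
    neighbours = defaultdict(set)
    weight = defaultdict(int)
    for a, b in interactions:
        if a == b:
            continue  # skip self-replies
        neighbours[a].add(b)
        neighbours[b].add(a)
        weight[a] += 1
        weight[b] += 1
    degree = {student: len(peers) for student, peers in neighbours.items()}
    return degree, dict(weight)


def average_degree_and_weight(interactions):
    """Average degree and average total weight over interacting students."""
    degree, weight = degree_and_weight(interactions)
    n = len(degree)
    if n == 0:
        return 0.0, 0.0
    return sum(degree.values()) / n, sum(weight.values()) / n


# Toy data: an early discussion with many partners, and a later one in
# which students interact repeatedly within pairs.
early = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
late = [("A", "B")] * 3 + [("C", "D")] * 3

print(average_degree_and_weight(early))  # (2.0, 2.0)
print(average_degree_and_weight(late))   # (1.0, 3.0)
```

In this toy example, the average degree falls from 2 to 1 while the average weight rises from 2 to 3, which is the qualitative pattern described for the period after groups coalesced.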

Peer-to-peer interactions also appear to influence students’ downstream choice of research topics: the number of interactions in a discussion was correlated with the number of group projects associated with that discussion (Fig. 5). This relationship should be explored further because, if it holds, it can be used to structure peer-to-peer interaction activities and assignments.

Freeman and colleagues (2017) argued that when students were allowed to self-select into groups, they tended to choose group members of the same gender and ethnicity. In our work, we did not see any such effect. This result might be due to limited data or to a number of other factors. Moreover, the demographic composition of groups had no effect on group performance (term paper). The course was online, and the only way students interacted within the course was via their posts and course content (although they could access certain demographic information from the self-introduction page, where students introduced themselves with a picture). We therefore speculate that demographics did not play as important a role in group formation as they do in in-person classes. We also did not see any effect of group size on normalized group performance (term paper), and we therefore found no indication of an optimal group size, in contrast to what certain previous studies have argued (Aggarwal & O’Brien, 2008; Heller & Hollabaugh, 1992; Lou et al., 2001).

Overall, we believe that understanding the complex relationships between peer-to-peer interactions, group formation, and interest creation is central to shaping vital skills in students. Such insights can help educators foster teamwork and guide students’ choice of research, both of which are important for training future scientists in the twenty-first century (see Lavi et al., 2021; Rayner & Papakonstantinou, 2015; Viskupic et al., 2021). Our study provides educators with a preliminary framework to disentangle some aspects of these associations.

Limitations and Future Research Suggestions

This study provides easy-to-use data collection and data analysis tools that instructors could use to shed light on the relationship between peer-to-peer interactions, students’ group formation, choice of research, and course performance. The current study had a few limitations that should be addressed in future studies. One limitation is that we had no information about interactions among students outside the course infrastructure page, especially those pertaining to research groups, which might have affected some of the interpersonal interactions within the course timeline. Students may have been communicating through other channels. Another limitation is that the course was taught only in medium-sized classes. Nevertheless, we found trends that suggest how interactions among individuals are associated with their performance in exams as well as with their decisions about term paper topics and groups.

Based on this study, we believe that the interaction structure plays an important role in the dynamic nature of the course. We recommend sharing this study’s results with instructors and administrators in higher education to promote institutional and departmental discussion about implementing in-class and out-of-class experiences that increase students’ twenty-first century skills and preparation for their future careers. Research shows that faculty are interested in exploring data in general, including data about students’ thoughts, values, and understandings, especially if these data were collected from their own students (Thompson et al., 2010; Marbach-Ad et al., 2019). We provide here data analysis techniques that others could use to further explore the importance of peer-to-peer instruction. These questions can be explored further in classes that span multiple semesters with a similar course structure, and we believe that this can help address the reproducibility problem that affects many network-science-based studies of online peer-to-peer interactions.