1 The Viability of Online Social Systems

Addressing the question of what makes an online social system “viable” requires some preliminary conceptual clarifications in order to define the scope of the present analysis. Section 4.1 of the present chapter is devoted to framing the problem conceptually: we first introduce the notion of a collaborative Web community by considering the properties that characterise it; we then discuss a number of ways in which the viability of these systems can be defined and the challenges faced by empirical research in identifying measurable indicators of viability. In Sect. 4.2 we present an empirical analysis of two paradigmatic cases of collaborative Web communities and discuss methodological issues emerging from the study of their dynamics from the point of view of viability. We conclude by presenting in Sect. 4.3 a simple model of viable online communities based on the empirical and conceptual considerations of the first two sections. Viability, we argue, is a notion that is hard to frame in the case of social systems. By discussing alternative characterisations of this notion and illustrating how to tackle it empirically, this chapter aims at offering methodological insights into the study of viable online social systems. These insights, we submit, are a precondition for the application of formal definitions of viability and resilience (see Chap. 2 of this volume) to realistic models of how collaborative systems function.

1.1 Definitions

1.1.1 Online Societies and Online Communities

In this chapter we will focus on a particular kind of online social system that we can characterise as content-based, collaborative Web communities.Footnote 1 Such communities are defined by the fact that users joining them participate in the collaborative production of content, whether in the form of peer production (such as in the case of wikis or open source development), collaborative annotation (as in the case of social bookmarking and collaborative filtering) or media sharing (as in the case of social media platforms).

Web-based platforms offer particularly appealing conditions to study the nature and dynamics of collaborative groups for two main reasons. On the one hand, such systems offer the possibility of empirically studying different aspects of user behaviour at a large scale through the extraction of online datasets via dedicated tools, such as programmable interfaces (or APIs).

On the other hand, these systems offer a particularly suitable ground for the purpose of the present discussion as they are often embedded in what we may call an “online user society.” Agents in an online society can be characterised as users with a unique online identity. The active user base of platforms such as Facebook, Flickr, or the Wikipedia can be taken as an example of an online society.Footnote 2 Users of such platforms can freely participate in discussion and content production, establish links to other members and create and maintain affiliations to the variety of communities that these platforms support.

Systems supporting an online society are ideally designed to allow the researcher to compare and assess the respective performance of communities that, although different in structure and organisation, tap into the same user base. At the microscopic level, one can observe how the social and affiliation network as well as the participative behaviour of individual users evolve over time. At a mesoscopic level, one can observe how communities evolve over time, how they recruit members and how their structure affects participation. At a macroscopic level, one can study the evolution of an online society as a whole. In this sense, communities targeting the same user base can be seen as competing with each other to recruit members, a condition that makes it possible to study aspects of group dynamics that are often inaccessible in offline communities.

1.1.2 A Taxonomy of User-Centred Relations

Before discussing the functioning of online communities, we define several distinct classes of relations involving users of these systems. First, members can entertain multiple group affiliations at the same time – given the ease of joining a group, affiliation should not be taken as a straightforward indicator of active participation. Second, members can create explicit relations to other members in the form of “contact” or “friend” links. Third, members contribute content to a shared pool of resources maintained by the community; contributed content can consist of text, code patches, media or even, at a more meta level, of annotations which provide information about the other types of items. These classes of user-centred relations involving content, users and groups, indicate the large range of ways in which a collaborative community can evolve and the multiple forms that member participation can take. In order to tackle the question of the viability of online social systems, it becomes particularly important to be able to compare the performance of different communities on the basis of their structure and internal functioning.

1.2 Defining and Securing Viability

Collaborative communities built on top of an online user society face a number of risks that potentially threaten their survival. Peer production systems, for one, typically die of inactivity or an insufficient number of valuable contributions or, conversely, whenever quality assessment becomes unmanageable due to content explosion or ineffective measures against vandalism. The governance of such communities has been based so far on best practices and recommendations, as empirical evidence on the impact of specific policies on how these communities evolve in time is still relatively scarce.Footnote 3

Addressing the problem of the viability of these systems and their governance therefore requires understanding (1) what characterises these communities as “viable” and (2) once a viable state is explicitly defined, what policies can be devised in order to achieve or maintain this state.

1.2.1 Dimensions of Viability

1.2.1.1 Viability as Membership Growth

The first typical way to assess the performance of a collaborative community consists in looking at the growth of the number of its members over time. Unsurprisingly, growth is one of the main aspects on which the quantitative literature on collaborative online communities has focussed so far (see Almeida et al. 2007; Voss 2005; Godfrey and Tu 2000, 2001; Lam and Riedl 2009; Mislove et al. 2007; Roth et al. 2008). Studying the viability of a collaborative community by taking into account its population growth (and the speed thereof) is a valuable approach as long as growth per se is a desirable feature for the sustainability of the system. In some cases, however, an uncontrolled growth of participants is likely to lead to the breakdown of collaboration. This is particularly sensitive when growth in population and the growth in the content that a community produces start to diverge: we will address this issue later by referring to the notion of the attentional span of the members of a community. Even if we take the absolute growth in members of a community as an indicator of its performance in securing a solid user base, we need to consider how its growth compares to that of other communities, how it relates to the turnover of its members and how it is affected by different processes through which new members are recruited.

  • In cases in which membership does not imply exclusive affiliation, the growth of a given community should not (at least not in principle) be directly affected by the comparative growth rate of other communities built on the same user base. However, this is unlikely to be the case as, even in an online society with a constantly growing number of users, communities are de facto competing for members. What is crucial is then to understand the nature and effectiveness of processes by which communities manage to secure their membership.

  • Two communities may perform equally well in growing their overall membership over time. However, they may differ significantly in how good they are at maintaining existing members, or controlling member turnover. The same net growth in population can be the result of (a) a slow, steady growth in members without a significant drop-off of existing members or (b) the result of a high turnover, whereby the number of new recruits outweighs the number of losses. Considering the role of turnover beside sheer population growth is reasonable if we assume that, while turnover is essential to secure renewal within a community, a high rate of turnover will disrupt the continuity within the community, i.e. a condition that might be needed in order to preserve the possibility of norm transmission within the community (see Forte et al. 2009).

  • A final important aspect related to membership growth is the variety of recruitment processes that may result in the same observable increase in the population of a given community. The way in which communities recruit members – whether by tapping into the individual social networks of their current members (social recruitment) or by focussing on the interests of the new recruits (e.g. recruitment by homophily) – has crucial consequences. As we will see in the second part of this chapter, social recruitment processes are likely to reach a point after which the community becomes too cohesive to allow a significant turnover in its membership.

In sum, different aspects in the growth of a community’s membership can be relevant to its viability: the rate of population growth over time, the relation between growth in content and growth in population, the rate of member turnover, the mechanisms underlying recruitment and population growth.
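The point about turnover can be made concrete with a small numerical illustration. The following is a minimal sketch, assuming hypothetical membership counts and one possible definition of turnover (the share of the initial membership lost over the period); neither the figures nor the class and attribute names come from the studies cited above.

```python
from dataclasses import dataclass


@dataclass
class MembershipPeriod:
    """Membership counts for one community over one observation period (hypothetical)."""
    initial_members: int
    joined: int  # new recruits during the period
    left: int    # existing members who dropped off during the period

    @property
    def net_growth(self) -> float:
        """Ratio of final to initial membership."""
        final_members = self.initial_members + self.joined - self.left
        return final_members / self.initial_members

    @property
    def turnover_rate(self) -> float:
        """Share of the initial membership lost during the period (assumed definition)."""
        return self.left / self.initial_members


# Case (a): slow, steady growth with few losses.
steady = MembershipPeriod(initial_members=1000, joined=120, left=20)
# Case (b): same net growth, obtained through heavy recruitment and heavy losses.
churning = MembershipPeriod(initial_members=1000, joined=600, left=500)

for label, community in (("steady", steady), ("churning", churning)):
    print(label, community.net_growth, community.turnover_rate)
```

Both communities end the period with the same net growth, yet the second has lost half of its initial members, which is exactly the kind of difference that a growth figure alone conceals.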

1.2.1.2 Viability as Participation

The second challenge in defining the viability of collaborative online communities consists in understanding what diversifies community membership in terms of individual participation and how this, in turn, affects how a community thrives at a macroscopic level. A community may be said to be viable if it manages to secure a minimum number of participants committed to performing the specific tasks within the community that are essential to the achievement of its goals. As self-allocation of effort is a typical feature of peer production systems (see Benkler 2002), individual incentives to participation play a central role in specifying the conditions that make a community viable. A lack of balance between regular members and administrators, a redundancy of effort by community members, and a lack of participants devoted to quality control and norm enforcement are examples of ways in which patterns of participation can be disruptive for a community striving to achieve a specific goal. A detailed analysis of the drivers of participation in peer production is beyond the scope of the present work, but we should mention three key aspects that are relevant to the problem of the viability of these systems:

  • The relation between individual motivation and participation. What actually drives users to participate in online peer production is an issue that has been addressed by several authors (Hars and Ou 2002; Benkler 2002, 2006). Status recognition and expertise within a community are key factors in strengthening membership, but to what extent these internal factors are effective in terms of recruitment and task allocation is an issue that has still to be explored. A community is viable if it manages to channel individual motivation in a way that is functional to achieving a proper division of labour.

  • The distinction between passive membership and participation. Several communities thrive despite a relatively low number of active members. This suggests that communities may have a high potential of recruitment even though only a small proportion of their membership is responsible for content production. This is a known property of peer production systems such as wiki-based communities where a majority of participants contributes only marginally to content production as opposed to quality control (Kittur et al. 2007; Roth 2007) and in all those systems where the existence of a community of “lurkers” is a vital condition for the performance of a community (Nonnecke et al. 2006). In this respect, the performance of a mature peer production system may depend on more subtle factors than the sheer proportion of actively contributing members: fighting vandalism, for one, seems to depend more on the number of passive watchers regularly monitoring content than on the small proportion of active contributors.

  • The relation between competitiveness and participation. One final issue that is key to defining the viability of collaborative systems is understanding to what extent these systems are in mutual competition. Depending on the underlying design, two communities competing on the same topical or social “niche” (McPherson 1983) may thrive independently without affecting each other’s performance; conversely, one may observe a migration of users from one community to the other (which will affect the viability of each community if measured by membership growth), or a change in participation rate not involving an actual migration or termination of membership (which will only affect the viability of a community as measured through participation metrics).

Understanding drivers of participation and types of participation is crucial to characterise those communities that may be effective at recruiting members but unable to secure a proper division of labour, such as content production versus content maintenance by active users. In this respect, both role diversification and proportion of active participants represent critical variables to measure the maturity of a community and its potential viability.

1.2.2 Achieving Viability

Once we have identified a specific standard of viability as a function of what aspects of an online community we wish to focus on, the next question is what policies are available to effectively achieve and maintain that standard. The question bears on the delicate issue of control policies in peer production systems, or how to devise an appropriate governance model for systems where individual effort is typically self-allocated and in which traditional organisational structure is not applicable (see Benkler 2002; Forte et al. 2009). We review in this section three classes of factors that are instrumental in controlling the dynamics of a community towards reaching a viable state.

1.2.2.1 Viability and Quality Control

The content dynamics of a collaborative community deserve as much attention as membership dynamics from the perspective of its viability. As we suggested, a divergence in the respective growth of content and population may easily lead to a breakdown of collaboration. This makes quality control policies one of the main factors behind the successful performance of collaborative communities, a problem that is particularly sensitive in communities where content is collectively curated in order to meet some shared quality standards (Forte et al. 2009). The most extensively studied case of collaborative quality control in online peer production is Wikipedia (see Halim et al. 2009; Stvilia et al. 2008; Suh et al. 2008; Wöhner and Peters 2009; Kittur et al. 2007, 2009), a case in which the effectiveness of quality control policies, the social processes involved in their enforcement and the general distribution of labour among contributors have been empirically analysed. Factors that may drive a community to achieve a viable state from this perspective include the balance between inclusiveness and quality control: overly strict quality control policies may drive away potentially valuable contributors, but the same effect can also result from a demographic explosion or from loose or ineffective quality control mechanisms.

1.2.2.2 Viability and Governance

Despite the fact that collaborative communities and peer production systems are often referred to as systems that accept unconstrained contributions from their members, they often implement forms of “soft governance” and hierarchical organisation that help maintain an active community and focussed content production. Governance-related factors include solutions controlling content production (i.e. what kind of content is allowed within the community’s product) as well as systems controlling member affiliation (i.e. who can join or who can perform specific tasks). In both cases governance solutions can be enforced a priori, by imposing limits on participation or content production, and/or a posteriori, by removing inappropriate content or removing existing members not complying with community norms.

1.2.2.3 Viability and Sociability

A third class of factors relates to social interactions, especially in collaborative communities built on top of online networking services. Even though governance measures can exert indirect control on the shape of social interactions, a number of properties of the social network of the members of a group have proved to play an important role in controlling the dynamics of collaborative communities, either as a driver of recruitment of new affiliates or, on the contrary, as an obstacle to further growth whenever high social cohesiveness hinders the affiliation of users not belonging to the social neighbourhood of a group. In terms of peer production, we may also expect that groups whose members simultaneously maintain too large a number of social ties will start showing symptoms of breakdown in collaboration or in the ability to effectively monitor content production, which in turn will threaten the viability of the system.

2 Two Case Studies

In this section we illustrate how the above conceptual distinctions can be put to work in the empirical study of the evolution of collaborative systems. We focus in particular on the relation between growth-related viability and control factors by looking at properties that spur or regulate growth in two paradigmatic cases of collaborative systems: peer production systems and communities in social media.

2.1 Peer Production Systems

Wikis are, in a broad sense, websites whose content can be contributed and modified by any user in a collective and collaborative fashion. As such, they represent one of the most prominent examples of Web-based peer production systems. The most famous and possibly the most successful of these websites, Wikipedia, has attracted substantial interest in the research community in recent years (Lih 2004; Anthony et al. 2005; Bryant et al. 2005). The Web has seen, however, several thousands of other wikis thrive and proliferate, with varying degrees of success: some recruit many users, achieving sustainability with established role distributions, frequent updates and efficient measures against vandalism, while others struggle to attract contributors. Wiki-based communities can have distinct policies or scope but be equally sustainable, or have identical policies but die for a variety of reasons, all endeavouring to survive within what may be called the “wikisphere”.

This first case study consists of an exploratory investigation of some factors likely to account for diverse wiki destiny and viability, in terms of technical, social and structural features. In this context, we understand “viability” as dynamic sustainability of both population and quality content: in other words, a viable wiki should be able to grow in terms of articles and users in such a way that the whole content can be maintained by a sufficient number of users. Our aim is however not to provide a formal definition of an appropriate notion of viability for wiki-based communities, but rather to present a detailed descriptive analysis of the demographic and structural dynamics of a large sample of wikis as an empirical basis for further research. In particular, we discuss these results in light of the role played by governance measures in affecting the viability of these communities, moving beyond the Wikipedia case.Footnote 4

2.1.1 Wiki Dynamics

Various governance systems and software parameters, i.e. technical and social constraints, define a landscape wherein each online community is settled, grows and lives. How can the growth and evolution of such communities be assessed? As content-based online communities, wikis mainly evolve in two dimensions: (a) contributors, who may or may not constitute an active community, as discussed e.g. by Bryant et al. (2005); and (b) pages, which may or may not amount to authoritative or useful content, as demonstrated for example by Giles (2005).

Users and pages are likely to obey a dual dynamic: while more users may contribute to more pages, content proliferation seems to require more attention from users. As a first approximation, it may thus seem judicious to assess the healthiness of a wiki through these variables, taken as demographic indicators for its actual growth and activity.Footnote 5

To our knowledge, the present case study represents an original longitudinal analysis of the content and population dynamics of a large set of wikis. Previous quantitative wiki research has focused almost exclusively on Wikipedia and has mainly examined the topological structure of underlying interaction or hyperlink networks (Capocci et al. 2006; Zlatic et al. 2006) or article-level features (Brandes and Lerner 2008; Wilkinson and Huberman 2007), with little interest in the specific dynamics of the demographic determinants themselves (with the exception of Kittur et al. 2007, who investigate Wikipedia’s demographics of casual vs. committed contributors).

2.1.1.1 Dataset

We constructed a dataset made of simple statistics gathered for a large number of MediaWiki-based wikis,Footnote 6 which enabled us to consider the same set of variables across all wikis and make sure these variables were generally available. The data was collected over the period August 2007–April 2008 from a publicly available database,Footnote 7 totalling more than 11,500 wikis. We applied further restrictions on this dataset, as described in Roth et al. (2008).

To sum up, the final, “clean” dataset that we considered for this study is made up of about 360 wikis, all of which have an initial population between 400 and 20,000 users, are not hosted at some specific ‘wiki farms’ that do not report useful data, and do not have major discontinuities in the daily change of their population or content. As such, we assume this subset to be representative of a homogeneous sample of wikis having a relatively sizeable yet not exceptionally large base of registered users – the latter being a hallmark of typical outliers (such as the English Wikipedia) in the wikisphere. Besides, the exclusion of discontinuously-growing wikis ensures that the observed dynamics are due to genuinely bottom-up user-driven behaviour rather than top-down administrative intervention or external attacks. In this sense, and from a viability theory perspective, we are thereby focusing on the autonomous dynamics of such systems.
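The selection criteria just listed can be expressed as a simple filter. The sketch below is only an illustration under stated assumptions: the actual restrictions are those described in Roth et al. (2008), whereas the record layout, the excluded host name and the 10% daily-jump threshold used here to detect a "major discontinuity" are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class WikiSeries:
    """Daily time series for one wiki (hypothetical record layout)."""
    name: str
    host: str
    users: List[int]  # registered users, one value per day
    pages: List[int]  # "good" pages, one value per day


# Hypothetical placeholder for wiki farms that do not report useful data.
EXCLUDED_HOSTS = {"farm.example"}
# Assumed threshold: a relative day-to-day change above 10% counts as a discontinuity.
MAX_DAILY_JUMP = 0.10


def has_discontinuity(series: Sequence[int], max_jump: float = MAX_DAILY_JUMP) -> bool:
    """Return True if any day-to-day relative change exceeds max_jump."""
    for previous, current in zip(series, series[1:]):
        if previous > 0 and abs(current - previous) / previous > max_jump:
            return True
    return False


def is_clean(wiki: WikiSeries) -> bool:
    """Apply the three selection criteria: initial size, host, continuity."""
    return (
        400 <= wiki.users[0] <= 20_000
        and wiki.host not in EXCLUDED_HOSTS
        and not has_discontinuity(wiki.users)
        and not has_discontinuity(wiki.pages)
    )


sample = [
    WikiSeries("steady", "wiki.example", [500, 505, 510], [100, 102, 104]),
    WikiSeries("tiny", "wiki.example", [100, 101, 102], [50, 50, 51]),
    WikiSeries("jumpy", "wiki.example", [500, 900, 910], [100, 101, 102]),
    WikiSeries("farmed", "farm.example", [500, 505, 510], [100, 102, 104]),
]
clean_names = [wiki.name for wiki in sample if is_clean(wiki)]
print(clean_names)
```

Only the first record passes all three criteria; the others fail on initial population, continuity and host respectively.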

2.1.1.2 Variables

We considered a set of four raw quantitative variables: population size (U), measured by the number of registered users; content size (P), measured by the number of so-called “good” pages (i.e. actual content pages excluding default pages created by the wiki engine), hereafter interchangeably called “pages”, “good pages” or “articles”; administrator population (A), the number of users who are granted “administrator status”, i.e. special rights to modify sensitive content and perform maintenance activity; and editing activity (E), measured by the total number of edits. We also included one qualitative variable indicating the presence of an access control mechanism: editing permission (R), i.e. the possibility of creating a page for unregistered/anonymous users. R is either 1, “anonymous editing allowed”, or 0, “registered users only”. However simplistic these variables may be, they provide key indicators of the global dynamics of a wiki, and shed light on diverse aspects of its structure and evolution. We collected the values of these variables for each wiki every day over a period of 250 days, i.e. approximately 8 months.

2.1.1.3 Structural Metrics

Wiki dynamics were studied as a function of a number of structural metrics based on the above variables, which we can categorise into two broad types of independent variables. On the one hand, descriptive indicators, i.e. variables on which wiki administrators have no direct control: user activity (i.e. the number of edits per user, $E/U$), user density (i.e. the number of users per page, $U/P$), and edit density (i.e. the number of edits per page, $E/P$). On the other hand, governance factors, i.e. variables that wiki administrators can directly control: administrator ratio (i.e. the proportion of users who are granted administrator status, $A/U$), administrator density (i.e. the number of administrators per page, $A/P$), and editing permission ($R$). See Table 4.1 for a summary.

Table 4.1 Wiki metrics used as independent variables
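For illustration, the six metrics can be derived from a single snapshot of the raw variables U, P, A, E and R. This is a minimal sketch: the class, field and key names are hypothetical, and the sample values are invented for the example rather than taken from the dataset.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class WikiSnapshot:
    """Raw variables for one wiki on one day (field names are hypothetical)."""
    users: int               # U: registered users
    pages: int               # P: "good" pages
    admins: int              # A: users with administrator status
    edits: int               # E: total number of edits
    anonymous_editing: bool  # R: True if anonymous editing is allowed


def structural_metrics(snapshot: WikiSnapshot) -> Dict[str, float]:
    """Compute the descriptive indicators and governance factors from the raw variables."""
    return {
        # Descriptive indicators (no direct administrator control)
        "user_activity": snapshot.edits / snapshot.users,   # E/U
        "user_density": snapshot.users / snapshot.pages,    # U/P
        "edit_density": snapshot.edits / snapshot.pages,    # E/P
        # Governance factors (under administrator control)
        "admin_ratio": snapshot.admins / snapshot.users,    # A/U
        "admin_density": snapshot.admins / snapshot.pages,  # A/P
        "editing_permission": 1.0 if snapshot.anonymous_editing else 0.0,  # R
    }


example = WikiSnapshot(users=1000, pages=500, admins=10, edits=20000, anonymous_editing=True)
metrics = structural_metrics(example)
print(metrics)
```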

We subsequently assessed wiki dynamics by comparing their diverse growth paths with respect to a set of independent variables. ‘Growth’ is defined in terms of population and content size variation (see Table 4.2): user growth $G_U$ (resp. page growth $G_P$) is the ratio between final and initial populations (resp. content sizes): $G_U = U_{\text{last}} / U_{\text{first}}$ (resp. $G_P = P_{\text{last}} / P_{\text{first}}$). For each continuous variable, rather than analysing raw clouds of points, we divided wikis into five quantiles, each including exactly 20% of all wikis in the clean dataset. We then computed and compared growth rate means over all wikis for each quantile. This representation was applied to all the above-mentioned variables, except for R where there are only two “quantiles” (0 or 1), enabling us to distinguish population quantiles on a single graph.

Table 4.2 Wiki growth indicators
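The growth ratio and the quantile-based comparison described above can be sketched as follows. This is an illustrative reading of the procedure, not the original analysis code: the function names are hypothetical, ties in the metric are broken arbitrarily by sort order, and the input values are invented.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def growth(first: float, last: float) -> float:
    """Growth ratio between final and initial size, e.g. G_U = U_last / U_first."""
    return last / first


def quintile_growth_means(records: Sequence[Tuple[float, float]]) -> List[float]:
    """Mean growth per quantile of the independent variable.

    Each record is a (metric_value, growth_ratio) pair for one wiki. Wikis are
    sorted by metric value and split into five groups of (near-)equal size;
    the mean growth ratio of each group is returned, lowest quantile first.
    Requires at least five records.
    """
    if len(records) < 5:
        raise ValueError("at least five wikis are needed to form five quantiles")
    ordered = sorted(records, key=lambda record: record[0])
    n = len(ordered)
    groups = [ordered[i * n // 5:(i + 1) * n // 5] for i in range(5)]
    return [mean(g for _, g in group) for group in groups]


# Invented example: ten wikis whose growth ratio rises with the metric.
toy_records = [(float(i), float(i)) for i in range(10)]
toy_means = quintile_growth_means(toy_records)
print(growth(400, 600), toy_means)
```

Comparing group means rather than individual points trades resolution for robustness, since each mean averages out wiki-level noise within a quantile.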

2.1.2 Determinants of Wiki Dynamics

The results suggest that different structural and governance-related factors have significant correlation with – and plausibly, in some cases, effect on – the content and population dynamics of a wiki:

  • Significant descriptive indicators. Figure 4.1 shows the effect of user activity (measured as the proportion of edits per user) on growth rates. The results suggest that user activity correlates very strongly with wiki growth, not only in terms of content production (which is to a certain extent unsurprising) but also new member recruitment. The effect becomes stronger with initially more populated wikis: the more users are actively editing, the more a wiki grows in content and population.

  • Significant governance factors. Turning to governance features, we first analysed the effects of the administrator density on wiki dynamics by examining the impact of the overall proportion of administrators per page. Figure 4.2 shows that having a relatively high number of administrators for a given content size is likely to reduce growth. There is a strong effect of the proportion of admins per page both on user and page growth. For instance, while the last quantile of admins/page ratio has near-zero growth rates over 8 months, the first quantile shows high overall rates (∼+50% for users, ∼+25% for pages). This effect may be interpreted as the impact of strong governance activity on the proliferation of content and users.

    We identified another significant effect when we considered editing permission. As a binary variable, the editing permission variable generates only two groups of wikis (wikis that allow anonymous editing versus wikis that restrict editing to registered users only). The growth landscape is consequently limited to a one-dimensional comparison over population quantiles. The results in Fig. 4.3 show that for both dimensions – population and content – having no access control is likely to favor growth. While a stronger page growth is quite unsurprising in wikis where no registration is required, the fact that this factor also fuels user registration is more puzzling. One might expect that if users can participate without the need of registration, few would be inclined to register. Our results suggest that, on the contrary, wikis with unrestricted editing trigger participation more easily than wikis that restrict access.

We also considered two indicators that showed a markedly milder correlation with wiki dynamics. On the one hand, we found that edit density (i.e. edits/page) correlates in a moderately negative way with user growth – with a relatively stronger effect depending on initial population size – while there is surprisingly no significant correlation with page growth. On the other hand, higher administrator ratios (i.e. admins/user) have no significant effect on content or population growth.

Fig. 4.1 Growth landscape with respect to user activity, i.e. the proportion of edits per user ($E/U$).

Fig. 4.2 Growth landscape with respect to administrator density, i.e. the proportion of administrators per page ($A/P$).

Fig. 4.3 Growth landscape with respect to editing permission ($R$): the red dashed line refers to anonymously editable wikis, the blue solid line to wikis editable by registered users only.

Figure 4.4 summarises the correlations found between growth rates and each of the variables we considered, by comparing the gain in the population and content sizes between the last and the first quantile for each variable (variables in Fig. 4.4 are ranked from the most positively to the most negatively correlated). If we focus on structural aspects of wikis, we note that the higher the ratio of edits per user the faster the wiki grows, both in terms of content and population. Wikis with very active user communities are not only likely to grow in content, but also to attract a large number of new contributors. This result contrasts with the opposite effect produced by high user density per page.

Fig. 4.4 Comparison of growth rates between last and first quantiles, for each variable.
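One way to read the comparison summarised in Fig. 4.4 is as a ranking of variables by the gap between the mean growth of the last and the first quantile. The sketch below assumes that the gap is a simple difference of mean growth ratios, which is an assumption made here for illustration; the function name and all numerical values are hypothetical and do not reproduce the reported results.

```python
from typing import Dict, List, Sequence, Tuple


def rank_by_quantile_gap(quantile_means: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Rank variables by (last quantile mean growth - first quantile mean growth).

    Variables are returned from the most positively to the most negatively
    associated with growth. The difference-based gap is an assumed definition.
    """
    gaps = {name: means[-1] - means[0] for name, means in quantile_means.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Invented mean growth ratios per quantile (lowest quantile first).
hypothetical_means = {
    "user_activity": [1.0, 1.125, 1.25, 1.375, 1.5],
    "admin_ratio": [1.25, 1.25, 1.25, 1.25, 1.25],
    "admin_density": [1.5, 1.375, 1.25, 1.125, 1.0],
}
ranking = rank_by_quantile_gap(hypothetical_means)
print(ranking)
```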

As far as governance factors are concerned, we observed the singular fact that population growth is on average more than 20% faster for anonymously editable wikis. This seems to support the intuition that fewer barriers favor population growth. Furthermore we observed that, while too many administrators per page may hinder the growth of a wiki (in terms of content size), the proportion of administrators per user does not appear to have a significant influence on growth. In all the above cases, we observed a striking correlation between content and population growth.

This approach broadly draws attention to the remarkable intertwinement of population and content growth in a relatively large sample of (wiki-based) online communities, and constitutes a first contribution towards more comprehensive research on factors behind sustainable wiki communities, beyond the dominant example of the Wikipedia. In particular, we endeavoured to connect simple quantitative features of these online groups to more qualitative characteristics mainly pertaining to simple organisational properties, including distribution of roles, modes of regulation and access control. From a more pragmatic perspective, it constitutes an overview of indicators that wiki communities should take into account in order to control their demographics, by paying specific attention to some variables and acting upon them when possible, while neglecting others.

From the perspective of viability theory, the dichotomy we propose between descriptive indicators and governance factors is meant to reflect the traditional distinction between autonomous dynamics and control features. In this respect, our results offer empirical grounds for the design of realistic models of the demographic evolution of these communities, the characterisation of their viable states and of factors that control their dynamics.

2.2 Social Media Communities

Flickr.com, one of the most popular photo and video sharing services, represents another ideal case for the study of online community viability, focusing here on the joint effects of content-based interaction, group affiliation and social network dynamics. The platform supports a dedicated infrastructure for the creation of communities of interest or “groups”, which represents an ideal testbed for studying group viability issues, as well as, more broadly, the effect of user-to-group affiliation links on user behaviour and social interaction among users. The user model of Flickr additionally allows the creation of (user-to-user) “contact” links that can provide a direct insight into user-centered social networks; it also allows interactions among users that are mediated by content (such as commenting on a picture or marking a picture as a “favorite”), hence offering the opportunity to study social behaviour mediated by user-to-content links; finally, thanks to a rich and extensively documented API,Footnote 8 Flickr enables the extraction of large datasets that can be used to study social dynamics at each of these levels of description (content, users, groups).

Flickr has attracted considerable attention in the research community. Most studies used Flickr as a large data source to study tagging behaviour and folksonomy (Marlow et al. 2006; Nov et al. 2008; Plangprasopchok and Lerman 2009; Sigurbjörnsson and van Zwol 2008). A smaller number of works, more relevant to the present analysis, focused on aspects of social interaction and group-driven behaviour (Lerman and Jones 2007; Mislove et al. 2008; Leskovec et al. 2008; Cha et al. 2008; van Zwol 2007; Valafar et al. 2009). Reviewing the contribution of this literature to the understanding of how communities function in online social media is beyond the scope of this chapter. In this section we focus on the relative performance of groups at attracting members and securing content, an issue that has been addressed only in a tangential way by the above studies.

2.2.1 Flickr Groups

A central social feature of Flickr, i.e. groups, has to date attracted only modest attention in the literature, even though it is estimated that a large part of content-mediated interactions and social interactions happens via groups. Flickr groups are of particular interest to the present analysis because, as opposed to purely user-centered social networks, they can be described as communities of interest driven by shared content. Flickr groups are specifically designed to enable collaborative content production and dissemination. In order to share content with the members of a group, a user is explicitly required to submit it to the group. Furthermore, groups have a governance structure consisting of at least one administrator (by default, the group creator) and an optional number of moderators. Group admins and moderators can control the rate and type of submitted content that is shared in the group, via moderation tools, post-submission pruning or throttling (i.e. limiting the number of posted items over a given period of time). These features make Flickr groups ideal candidates for research on content-based online social behaviour and on the coevolution of social and affiliation links. Previous research has already partly addressed the role of Flickr groups from this angle (Mislove et al. 2007; Prieur et al. 2008; Schifanella et al. 2010; Negoescu and Perez 2008; Backstrom et al. 2008; Zheleva et al. 2009).

All in all these various results suggest that group formation processes in content-based communities arise from the joint effect of a large number of factors that cannot entirely account for the evolution of a group when considered on an individual basis. The question that we wish to ask is how these different factors interact in affecting the evolution in content and population of a group.

2.2.2 Variety of Group Dynamics

There is a striking variety in growth patterns of Flickr groups when observed over time, even if we focus on macroscopic indicators such as population and content variations: some groups are characterised by a steady population growth accompanied by a null or negative content growth (which may prima facie suggest tight moderation or regular pruning of content); other groups rapidly grow in content but vary slowly or remain virtually constant in population (suggesting the use of groups as “dumps” of pictures with little recruitment of new members); other groups show fluctuations in both content and population (suggesting a significant portion of members who leave the group when no longer active); finally, groups may display sudden bursts of growth in content and population and remain subsequently inactive for long periods (which may be the case for groups about recurring or temporally discrete events).

Groups also vary substantially in member turnover, i.e. the portion of a group’s population that is replaced by new members joining the group over time while former members leave. Some groups have a relatively low turnover, suggesting that members tend to stay in the group and are reluctant to leave, while other groups have much faster member replacement rates.

One possibility to come to grips with this variety in global dynamics might then be to ask whether groups can be broadly categorised in a qualitative way into distinct typologies, considering for instance how content specificity or content policies affect the overall group evolution over time. The alternative approach that we take in the present study consists in assuming that similarity in temporal dynamics can be traced back to group similarity in terms of structural features.

Regardless of content, each group can be characterised as occupying at a given time a region in a multidimensional space of properties defining its demographic profile, its structure and its governance mode. These properties can pertain to a group as a whole or refer to aggregate properties of its members, such as their average degree or group affiliation spread.Footnote 9 The temporal dynamics of a group can then be studied as a trajectory across this space. Our study aims to find regularities in the observed temporal dynamics of a large set of groups by assuming that a number of initial properties of these groups can be explored as predictors of their macroscopic evolution – which, again, is taken as a preliminary description of some dimensions of their viability. The literature on group affiliation dynamics offers a number of suggestions as to how groups are generally expected to evolve over time as a function of their size, structure and properties of their membership:

  1. P1

    Larger groups tend to grow faster than smaller groups, in virtue of a preferential attachment principle.

  2. P2

    Cohesive groups tend to recruit fewer new members than weakly cohesive groups, because of a stronger social closure (or “cliquishness”), which also results in an increased membership inertia and less user turnover.

  3. P3

    Groups whose members are sociable tend to grow faster and attract more contributions than groups whose members have a relatively small number of friends.

  4. P4

    Highly curated groups tend to grow slower in content but faster in population because of the competitiveness produced by higher content selectivity.

  5. P5

    Groups whose members belong to many other groups grow less in content than groups with members that belong just to a few groups.

Each of these hypotheses can be empirically explored, by considering the observed growth rates over a specific time frame as a function of characteristic properties of a group.

2.2.3 Dataset

The data used for this study consists of a sample of 9,360 public Flickr groups whose variations were tracked on a daily basis for a period of 1 month between June and July 2009. The data was obtained via Flickr Group Trackr,Footnote 10 a public Web service that we developed in order to allow Flickr group members to track the daily evolution of their community. For each group registered to the service, Flickr Group Trackr pulls a series of statistics from the Flickr API on a daily basis, including: size of the group pool (or number of pictures uploaded to the group), population, privacy level, moderation properties, throttling type and level. Changes along any of these variables can hence be identified with a precision of 24 h. It should be noted that we did not consider group activity data related to discussions in group forums, as this data is not available via the Flickr API. The dataset thus obtained from Flickr Group Trackr was complemented with a static snapshot of the same set of groups providing data on user-to-group affiliation links and user-to-user contact links.

The dataset was filtered in a number of ways to obtain a more homogeneous sample. We limited our analysis to a set of medium-to-large groups with a population range of 100 to 100,000 members; this restriction was introduced to avoid biases in the analysis due to the presence of small groups (\(u_0 < 100\)), whose dynamics are too dependent on the behaviour of individual members to allow any useful generalisation.Footnote 11 To capture the natural dynamics of these groups we also introduced a capping on the maximum daily growth rate in content and population, as we did for wikis, excluding those groups displaying an instantaneous growth of more than 5% of their pool size or population (possibly resulting, again, from extrinsic events such as contests or administrator bulk decisions). Groups that switched to private access control mode during the tracking period were also excluded from the sample. As a result of these restrictions, the final dataset used here consists of 9,167 groups.
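The filtering rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `GroupSeries` record and its field names are hypothetical (they are not the Flickr Group Trackr schema), and the 5% cap is applied to day-to-day relative increases, which is one plausible reading of "instantaneous growth".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroupSeries:
    """Daily observations for one group (hypothetical field names)."""
    members: List[int]   # daily population counts, members[0] is u_0
    photos: List[int]    # daily pool sizes, photos[0] is p_0
    private: List[bool]  # daily access mode, True = private


MIN_MEMBERS, MAX_MEMBERS = 100, 100_000  # population range kept in the sample
MAX_DAILY_GROWTH = 0.05                  # 5% cap on day-to-day growth


def max_daily_growth(series: List[int]) -> float:
    """Largest relative day-to-day increase; days starting from zero are skipped."""
    rates = [(b - a) / a for a, b in zip(series, series[1:]) if a > 0]
    return max(rates, default=0.0)


def keep_group(g: GroupSeries) -> bool:
    """Apply the three restrictions: size range, no private switch, growth cap."""
    if not (MIN_MEMBERS <= g.members[0] <= MAX_MEMBERS):
        return False
    if any(g.private):
        return False
    if max_daily_growth(g.members) > MAX_DAILY_GROWTH:
        return False
    if max_daily_growth(g.photos) > MAX_DAILY_GROWTH:
        return False
    return True
```

A sample would then be reduced with something like `[g for g in groups if keep_group(g)]`.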

The dataset also contains a complete snapshot of the population of each of the tracked groups at \(t_0\) as well as the complete list of contacts and affiliations for each member of these groups. The union of members of the groups in the dataset spans a total population of 1,267,874 unique users. Group pool sizes and group populations in our dataset follow a log-normal distribution.Footnote 12

2.2.4 Variables

The metrics that we used as independent variables to study the drivers of group dynamics throughout the present study are described in Table 4.3. Among demographic metrics, ms (membership spread) indicates the number of other groups a group member is affiliated with, averaged over the whole group population. Among structural metrics or metrics related to topological properties of the group social network: k refers to the direct degree for group members calculated on contact links that are internal to the group social network; r measures the proportion of reciprocated or symmetrical contact links within the group and per group member, averaged over the group population. Among governance metrics: mod indicates the number of superusers other than administrators who can accept photos submitted to the group’s moderation queue; μ indicates the presence of a moderation queue, by which photos submitted to a group are reviewed by moderators before being published in the group pool; θ is a quantitative indicator of the maximum number of photos that can be contributed to the group per time period (day, week or month), also denoted as “throttling index”.

Table 4.3 Flickr group metrics used as independent variables
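As a rough illustration of how some of these metrics could be derived from the static snapshot, the sketch below computes k, r and ms for a single group. It rests on simplifying assumptions that are ours, not the chapter's: contact links are given as directed pairs, k is taken as the internal out-degree averaged over members, and r is approximated by the share of internal links that are reciprocated (rather than a per-member average). All names are hypothetical.

```python
from typing import Dict, Set, Tuple


def group_metrics(
    members: Set[str],
    contacts: Set[Tuple[str, str]],     # directed (source, target) contact links
    affiliations: Dict[str, Set[str]],  # user -> ids of all groups the user belongs to
    group_id: str,
) -> Dict[str, float]:
    """Compute k, r and ms for one non-empty group (simplified definitions)."""
    # Keep only links whose two endpoints are both members of the group.
    internal = {(a, b) for a, b in contacts
                if a in members and b in members and a != b}
    n = len(members)

    # k: number of internal links per member (average internal out-degree).
    k = len(internal) / n

    # r: share of internal links whose reverse link also exists.
    reciprocated = sum(1 for a, b in internal if (b, a) in internal)
    r = reciprocated / len(internal) if internal else 0.0

    # ms: number of *other* groups a member belongs to, averaged over members.
    ms = sum(len(affiliations.get(u, set()) - {group_id}) for u in members) / n

    return {"k": k, "r": r, "ms": ms}
```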

Group growth indicators can be defined in multiple ways. Growth can be assessed in absolute terms as the difference in the total number of members and photos between \(t_0\) and \(t_1\), i.e. \(u_1 - u_0\) and \(p_1 - p_0\), respectively. Alternatively, one may focus on relative growth or “growth rate” over the observation period, i.e. the variation in members and content normalised by the initial size of the group: \(\frac{{u}_{1}-{u}_{0}} {{u}_{0}} \) and \(\frac{{p}_{1}-{p}_{0}} {{p}_{0}} \). Finally, one may consider the actual turnover, i.e. the number of unique users who joined (\(u_+\)) and left (\(u_-\)) a group over the observation period. The turnover itself can be considered in absolute terms (\(u_+ - u_-\)) or in relative terms (\(\frac{{u}_{+}-{u}_{-}} {{u}_{0}} \)).
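A minimal sketch of these indicators, assuming that membership is available as two sets of user identifiers (one per snapshot) and that pool sizes are plain counts. The function name and the dictionary keys are illustrative. Note that comparing two snapshots cannot see users who both joined and left within the observation period.

```python
def growth_indicators(members_t0: set, members_t1: set, p0: int, p1: int) -> dict:
    """Absolute growth, relative growth and turnover between two snapshots."""
    u0, u1 = len(members_t0), len(members_t1)
    joined = len(members_t1 - members_t0)  # users present at t1 but not at t0
    left = len(members_t0 - members_t1)    # users present at t0 but not at t1
    return {
        "delta_u": u1 - u0,             # absolute population growth
        "delta_p": p1 - p0,             # absolute content growth
        "rate_u": (u1 - u0) / u0,       # relative population growth
        "rate_p": (p1 - p0) / p0,       # relative content growth
        "joined": joined,               # u_+
        "left": left,                   # u_-
    }
```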

For the sake of the present study and contrary to the wiki case, we decided to focus on absolute rather than relative growth indicators (see Table 4.4) for a number of reasons. First of all, we wanted to take all groups at face value as equally prone to recruit new members and measure size-dependent effects as only one among several possible assumptions on growth-driving factors. Although other studies showed that the size of a group population plays a central role in the recruitment of new members (Backstrom et al. 2008), this assumption can be challenged on the basis of the significant number of group members who do not appear to have any social connection with other members (\(k = 0\)). Evidence of the existence of such members, as pointed out by Zheleva et al. (2009), suggests that “social recruitment” is only one among possible mechanisms that attract new members to a group. Second, we wanted to study the specificity of member turnover as an indicator of a stable or volatile community, and for this reason we also decided to opt for absolute figures as opposed to relative growth rates. A final reason not to focus on relative growth rates was that results using these rates as dependent variables were not statistically significant in several cases, suggesting that for the timeframe that we considered absolute variations were the most appropriate to focus on.

Table 4.4 Flickr group growth indicators

2.2.5 Aggregate Analysis of Growth-Driving Factors

To investigate the joint contribution of demographic, structural and governance-related factors on the temporal dynamics of groups, we performed a regression analysis of absolute group growth over the whole observation period as a dependent variable. We used four different models aiming at measuring the respective effect of a series of independent variables on absolute user variation (\(\Delta u\)), content variation (\(\Delta p\)) as well as member turnover (\(u_+\) and \(u_-\) respectively). We used the initial population and content size as control variables in each of the models (see Table 4.5 for the detailed list of variables included in each model). The general regression equation underlying each model (barring specific variable exclusions) is:

$$ \begin{array}{rlrlrl} \log (y) =&\;{\lambda }_{0} + {\lambda }_{{u}_{0}}\log ({u}_{0}) + {\lambda }_{{p}_{0}}\log ({p}_{0}) \\ & + {\lambda }_{r}(r) + {\lambda }_{{c}_{3}}({c}_{3}) + {\lambda }_{k}\log (1 + k) + {\lambda }_{ms}\log (ms) \\ & + {\lambda }_{\mu }\mu + {\lambda }_{mod}\log (1+mod) + {\lambda }_{\theta }\log (\theta ) + {\lambda }_{adm/u0}\log (1 + (adm/{u}_{0})) \\ \end{array} $$
(4.2)
Table 4.5 Results of regression analysis

We thus considered a linear regression of the logs of each variable, when applicable and relevant: logs were essentially used for quantitative variables spanning one or several orders of magnitude (such as \(u_0\)) in order to make them comparable in the regression with variables evolving in e.g. [0, 1] (such as \(c_3\)). For each dependent variable \(y \in \{\Delta u, \Delta p, u_+, u_-\}\), we started with an equation specified by the full model of (4.2). Variables corresponding to non-significant p-values were then iteratively excluded, generally resulting in a change in \(R^2\) of less than 1%.
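The fitting procedure can be illustrated with a short NumPy sketch. It is not the code used for the study and makes several assumptions: regressors are passed already transformed (e.g. `log(u0)`, `log(1 + k)`), the dependent variable is already logged and positive, p-values use a normal approximation to the t distribution (reasonable only when the number of groups is much larger than the number of regressors), and the significance threshold `alpha` is a hypothetical parameter.

```python
import math
import numpy as np


def ols(X: np.ndarray, y: np.ndarray):
    """Ordinary least squares: coefficients, two-sided p-values and R^2."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)               # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # coefficient covariance
    t = beta / np.sqrt(np.diag(cov))
    # Two-sided p-value under a normal approximation (assumes n >> p).
    pvals = np.array([math.erfc(abs(ti) / math.sqrt(2)) for ti in t])
    centred = y - y.mean()
    r2 = 1 - resid @ resid / (centred @ centred)
    return beta, pvals, r2


def backward_eliminate(columns: dict, y: np.ndarray, alpha: float = 0.05):
    """Drop the least significant regressor until all remaining p-values < alpha."""
    names = list(columns)
    while True:
        X = np.column_stack([np.ones(len(y))] + [columns[n] for n in names])
        beta, pvals, r2 = ols(X, y)
        # The intercept (index 0) is never dropped.
        if not names or pvals[1:].max() < alpha:
            return dict(zip(["const"] + names, beta)), r2
        names.pop(int(np.argmax(pvals[1:])))
```

A call would look like `backward_eliminate({"log_u0": np.log(u0), "c3": c3, "log_ms": np.log(ms)}, np.log(y))`, where the arrays are hypothetical per-group measurements.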

The results of the regression analysis summarised in Table 4.5 indicate some salient effects of various initial properties of groups on their dynamics. For a given dependent variable, an empty cell indicates that the corresponding independent variable was eventually excluded from the regression. If we focus on population and content growth, we first notice a (somewhat unsurprising) correlation in the effect of different variables on population growth on the one hand and content growth on the other hand, which is consistent with the above findings in wiki-based communities. As to structural/demographic factors, we observe indeed that population (\(u_0\)) and pool size (\(p_0\)) are important drivers of absolute growth: the larger the population of a group, the stronger its absolute growth over the observation period (consistently with P1). The average spread of group affiliation for group members (ms) displays a negative correlation, suggesting that groups whose members also belong to many other groups tend to grow more slowly, and the effect is actually stronger on population growth than it is on content (P5): this is consistent with the idea that groups whose members are selective (i.e. choose to join a smaller number of groups) are likely to attract more members than groups that mostly function like content dumps for occasional members. In terms of topological properties of the group-centered social network, we observe that cohesiveness as measured by the average clustering coefficient of the group-based network (\(c_3\)) has a remarkable negative correlation with growth (P2) and is by far the variable displaying the strongest effect across all analyses. Conversely, a high rate of reciprocity (r) and a larger presence of (popular) high-degree nodes in a group (k) have the effect of boosting growth (P3). Possibly the most striking finding is the overall negligible effect of moderation properties on the observed growth. In many cases the effect of moderation factors (\(\mu\), mod, \(\theta\), \(adm/u_0\)) is not statistically significant; in those cases in which it is, the observed effects are considerably weaker than those related to other group properties, which is partly at odds with our expectations (P4).

The analysis of factors affecting member turnover provides further insights. The strongest effect on turnover is that of cohesiveness, which not only appears to hinder new recruits but also to work as a barrier against user drop-off, as indicated by its negative effect on both components of the turnover: as such, cohesiveness (or the “cliquishness” of a group) works as a factor measuring the social inertia of a group membership, suggesting a higher level of commitment by its members, who are more reluctant to leave than in less cohesive groups (P2). The level of engagement is also measured by the symmetric effect that affiliation spread has with respect to member recruitment and drop-off: a higher spread increases the probability that more members will leave the group and fewer new members will join.

2.2.6 Individual Drivers of Group Growth

Whereas the regression results can be used for a global assessment of the contribution of different factors to the dynamics of a group, we can also address each of the hypotheses presented in Sect. 4.2.2.2 individually, following the methodology adopted before for wiki communities. We tackled the implications of this regression model on each hypothesis through an analysis of the individual impact of each metric on the observed growth and turnover of a group. Two snapshots for each group were compared at the beginning (\(t_0\)) and at the end (\(t_1\)) of the tracking period and group growth was calculated as the absolute variation in population and pool size between these two snapshots (\(\Delta u\) and \(\Delta p\) respectively). We then ranked groups along each independent variable in 9 quantiles, each therefore containing 1/9 of the groups in our dataset. The first quantile represents groups with the lowest values for the considered variable, whereas the last quantile refers to groups with the highest values. The analysis of individual effects should be taken as evidence of how effective each factor would be under the assumption that all other factors had an equal effect on growth.
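The quantile breakdown described above can be sketched as follows. This is a simplified illustration, assuming that each quantile is summarised by the mean growth of its groups and that the last-versus-first comparison is a plain difference; the chapter does not specify either choice, and the function names are ours. At least as many groups as bins are required.

```python
from statistics import mean
from typing import List, Sequence


def quantile_means(x: Sequence[float], growth: Sequence[float],
                   n_bins: int = 9) -> List[float]:
    """Mean growth within each of n_bins equally populated bins of x."""
    # Rank groups by the independent variable x.
    order = sorted(range(len(x)), key=lambda i: x[i])
    n = len(order)
    means = []
    for q in range(n_bins):
        lo, hi = q * n // n_bins, (q + 1) * n // n_bins
        means.append(mean(growth[i] for i in order[lo:hi]))
    return means


def last_vs_first_gain(means: List[float]) -> float:
    """Difference in mean growth between the last and the first quantile."""
    return means[-1] - means[0]
```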

  • P1: Size matters. The breakdown of the effects of size on the observed growth (Fig. 4.5) shows indeed that the expected growth of a group in content and population follows monotonically from its size. This allows us to discard the null assumption that we made that all groups should in principle be considered at face value as having an equal probability of attracting new members and new content: size does matter, which can be explained by e.g. herding behaviour.

    Fig. 4.5 P1: Size matters. Effects of \(u_0\) and \(p_0\) on absolute variation of users (solid line) and photos (dashed line)

  • P2: Effects of cohesiveness on group growth. Figure 4.6 (left) shows the breakdown of the effects of cohesiveness on group growth. Consistently with the regression analyses, cohesiveness as measured by the average clustering coefficient for the group-centered network works as a growth-regulating factor. Groups where cohesiveness is high display a higher inertia.

    Fig. 4.6 (left) P2: Effects of cohesiveness (\(c_3\)). (right) P3: Effects of sociability (k). Plots represent absolute variation of users (solid line) and photos (dashed line)

  • P3: Are sociable users growth attractors? An effect conflicting with cohesiveness is related to individual sociability as measured by the average within-group degree of members in the group-centered contact social network (Fig. 4.6, right). Note that degree has only been measured within groups: a related hypothesis, assessing sociability through a degree computed over the whole network of the Flickr population would actually allow one to answer the question whether groups in which (global) high-degree nodes (or very social/popular users) are concentrated are more likely to attract members than groups where the degree is more uniformly distributed.

  • P4: The poor effects of governance. Possibly the most striking findings of the present study are the negligible effects of the moderation and governance structure on group growth. Figure 4.7 (left) exemplifies the virtually flat growth landscape that emerges as a function of θ. This is not to deny the effectiveness of curators’ strategies in actually enforcing norms about content and participation on group members. However, from a purely quantitative perspective, these results suggest that in social media sharing systems, social-network factors are likely to drive recruitment and participation to a much larger extent than the factors that group administrators and moderators can control with the help of governance tools. This result contrasts with the above findings on wikis, which raises the question of what differences in terms of user interaction modes and collaborative behaviour may explain this discrepancy.

    Fig. 4.7 (left) P4: Weak effect of governance, exemplified by throttling (\(\theta\)). (right) P5: Level of user engagement measured by affiliation spread (ms). Plots represent absolute variation of users (solid line) and photos (dashed line)

  • P5: User engagement and attention. The marginal role of governance-related factors suggests that the main drivers of group dynamics in social media sharing systems need to be found elsewhere. In addition to social ties, individual and collective attentional spans may influence group growth. We saw that affiliation spread (ms) has a globally significant effect on group growth; analysing growth as a function of different values of affiliation spread (Fig. 4.7, right) indicates that this effect is also robust on an individual basis: groups whose members tend to spread their contribution over many other groups are consistently slower in growth than groups whose members are more selective in their affiliations.

From a viability theory perspective, these results diverge from the relatively clear-cut distinction between descriptive factors and governance factors which we discussed in the case of wikis. Indeed, while it would not be hard to consider plain demographic metrics such as population and content size as characterising the autonomous dynamics of a model of social media groups, typical governance measures (such as throttling or moderation) appear to have little influence on the evolution of these groups. In this regard the topological structure of group-centred social networks exerts a more important effect, however indirect. As noted above, although these factors can be seen as defining the autonomous dynamics of the system (by assuming that social ties are the result of spontaneous social interaction among users) they can also be regarded as genuine control factors (insofar as the design of the system or the implementation of specific policies can favour the creation of some kind of links, e.g. transitive links, or increase the cohesiveness of the members’ social network).

3 A Simple Model of Viable Web Communities

In this section we propose to formulate a simple model in which the viability of collaborative communities could be assessed against constraints on group population and group content size. In other words, we sketch the initial steps needed to formally account for the intertwinement of these constraints, the autonomous demographic dynamics of such systems and the possible control actions which may be adopted to ensure and/or restore their viability. We leave issues specifically related to model solving and simulation-based approximation outside of this chapter. To this end, the interested reader may nonetheless apply the tools and techniques presented in Part 2 of the present book.

3.1 Model Variables

From the two examples discussed in the previous section we may abstract a list of variables that can be applied to describe the structure and governance of collaborative systems in general. Apart from the fundamental demographic metrics consisting of population and content sizes, such communities are generally characterised by the fact that some users (so-called “administrators”) have special privileges; besides, they often feature similar governance mechanisms that limit the amount of contributions (filtering processes, registration requirements before contribution, etc.).

Model variables are to be based upon the corresponding quantities, at least for those features that can be described in quantitative terms. Among these variables, we have to distinguish state variables from control variables (upon which, for instance, administrators may act in order to influence the dynamics of the community they manage). Additionally, we must be able to define a viability domain, wherein we expect state variables should remain in order to consider the system viable.

3.1.1 State Variables

The state space per se is essentially made of:

  • \(U\), denoting the population of members or participants in a community.

  • \(P\), representing the size of content contributed by members (pages, photos, etc.).

3.1.2 Control Variables

Those include:

  • \(A\), the number of administrators in a community. For convenience, we also call administrator ratio the proportion of administrators with respect to the whole population, denoted as \(a\); similarly, the administrator density is defined as the proportion of administrators with respect to content size, denoted as \(b\). We then have: \(aU = A\) and \(bP = A\).

  • \(m\), the moderation constraint. The moderation constraint represents a mechanism that filters contributions, e.g. editing permissions on wikis (\(m = R\)) or moderation queues in Flickr (\(m = \mu\)).

Finally, the status of:

  • cohesiveness (for instance measured through the clustering coefficient: \(c = c_3\) in Flickr groups)

as a control variable remains relatively unclear: on the one hand, it may indeed be possible for administrators to favour some kinds of interactions between users, in such a way that the group becomes more cohesive. Yet, it is likely that the evolution of this variable could also be dictated by the autonomous dynamics of the group – for instance and all other things being equal, an increasing population is likely to induce a weaker cohesiveness.

3.1.3 Viability domain

As discussed in the first section, the definition of a “desirable state” as a precondition to studying group viability is bound to have a large number of potential interpretations in social systems or, at least, be more debatable than in the case of e.g. physical systems. In the following preliminary model, we choose to adopt a rather simple approach to viability by stylising and extrapolating a plausible trend observed both in Flickr and wiki groups: that groups tend to roughly grow along a diagonal of constant ratio \(U/P\). In particular, we notice that the largest groups (both in terms of content and population) are concentrated along this diagonal. We suggest that these groups should have had a successful development, at least at some point, in order to reach this area of the \((U, P)\) state space (Table 4.6).

Table 4.6 Model variables and parameters

We thus define a group as viable if its \(U/P\) ratio remains within a given interval:

$$\rho = [{\rho }^{-},{\rho }^{+}]$$

3.2 Viable Dynamics

The model aims at stylising several effects initially observed in the empirical data. In its present version however, it values mathematical tractability over realism. From a generic point of view, the system of differential equations governing the evolution of state variables could be written as:Footnote 13

$$\left \{\begin{array}{ccc} \frac{{\mathit{dU}}_{t}} {\mathit{dt}} & =& \alpha ({U}_{t},{P}_{t},{A}_{t},{c}_{t},{m}_{t},{r}_{t}) \\ \frac{{\mathit{dP}}_{t}} {\mathit{dt}} & =&\beta ({U}_{t},{P}_{t},{A}_{t},{c}_{t},{m}_{t},{r}_{t}).\end{array} \right.$$
(4.3)

From this point, various equations and diverse models can be proposed depending on the collaborative model one wants to describe: peer production systems (such as wikis) vs. social media groups (such as Flickr communities). We choose to focus on wikis and, in particular, their user dynamics, \(dU_t/dt\). In the absence of empirical results based on social network properties in wikis, we ignore the role of cohesiveness in the present model. We can thus write:

$${\mathit{dU}}_{t}/\mathit{dt} = {\alpha }_{w}({U}_{t},{P}_{t},{A}_{t},{m}_{t})$$

In our previous analyses, we made the hypothesis that wiki growth over the whole observation period was directly influenced by a constant and permanent impact of the values of control and state variables measured at the beginning of the given period. The most straightforward way to account for this type of growth is to assume that groups experience an exponential growth depending on initial conditions: in other words, given \({U}_{{t}_{0}}\), \({P}_{{t}_{0}}\), \({A}_{{t}_{0}}\) and \({m}_{{t}_{0}}\), growth is assumed to be an exponential function of t.

The following equation can therefore be proposed, for a proper function ϕ:

$$\frac{d{U}_{t}} {{U}_{t}} = \phi ({U}_{{t}_{0}},{P}_{{t}_{0}},{A}_{{t}_{0}},{m}_{{t}_{0}})dt\quad \Leftrightarrow \quad {U}_{t} = {U}_{{t}_{0}}{e}^{(t-{t}_{0})\phi ({U}_{{t}_{0}},{P}_{{t}_{0}},{A}_{{t}_{0}},{m}_{{t}_{0}})}.$$
(4.4a)

where ϕ could be estimated for instance from Roth et al. (2008) as a function depending (1) linearly upon \({U}_{{t}_{0}}\), (2) affinely and in a monotonically decreasing manner upon \({A}_{{t}_{0}}/{P}_{{t}_{0}}\) (or, rather, quantiles thereof) and (3) upon a given step function of \({m}_{{t}_{0}}\). Put differently, ϕ can be schematically written as:

$$\begin{array}{rl} \phi ({U}_{{t}_{0}},{P}_{{t}_{0}},{A}_{{t}_{0}},{m}_{{t}_{0}}) & = {\lambda }_{a}\,{U}_{{t}_{0}}\left ( \frac{{b}_{{t}_{0}} - {b}_{\mathrm{max}}} {{b}_{\mathrm{min}} - {b}_{\mathrm{max}}}\right ){g}_{{m}_{{t}_{0}}} \\ & = \frac{{\lambda }_{a}} {{\nu }_{b}}\, {U}_{{t}_{0}}\left (1 - \frac{{A}_{{t}_{0}}} {{b}_{\mathrm{max}}{P}_{{t}_{0}}} \right ){g}_{{m}_{{t}_{0}}} \end{array}$$
(4.5)

where \(\lambda_a \in \mathbb{R}^{+}\) is a given constant, \(b_{\mathrm{min}}\) and \(b_{\mathrm{max}}\) are respectively the minimum and maximum administrator densities over all groups, \({\nu }_{b} = \frac{{b}_{\mathrm{max}}-{b}_{\mathrm{min}}} {{b}_{\mathrm{max}}} \) is the constant relative spread of administrator densities, and \(g_0\) and \(g_1\) are two given constants such that \(g_0 < g_1\) (in practice, from Fig. 4.3 it roughly seems that \(g_1/g_0 \in [2, 3]\)).
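To make the relationship between the two forms of (4.5) concrete, the following sketch evaluates both and plugs the result into the exponential growth law (4.4a). The two forms agree because the administrator density at the start of the period is \(A_{t_0}/P_{t_0}\). The numerical values, the tuple `g` holding the two step constants, and the assumption that the moderation constraint takes the values 0 or 1 are all placeholders chosen for illustration; they are not estimates from the data.

```python
import math


def phi_density_form(U0, P0, A0, m0, lam_a, b_min, b_max, g=(1.0, 2.5)):
    """First line of (4.5), written with the administrator density b = A/P."""
    b0 = A0 / P0
    return lam_a * U0 * ((b0 - b_max) / (b_min - b_max)) * g[m0]


def phi(U0, P0, A0, m0, lam_a, b_min, b_max, g=(1.0, 2.5)):
    """Second line of (4.5), using the relative spread nu_b."""
    nu_b = (b_max - b_min) / b_max
    return (lam_a / nu_b) * U0 * (1.0 - A0 / (b_max * P0)) * g[m0]


def population(t, t0, U0, rate):
    """Exponential growth law (4.4a) for a constant growth rate."""
    return U0 * math.exp((t - t0) * rate)


# Placeholder parameters (hypothetical, for illustration only).
params = dict(U0=500, P0=2000, A0=10, m0=1, lam_a=1e-5, b_min=0.001, b_max=0.02)
rate = phi(**params)
```

With these placeholder values, `population(30, 0, 500, rate)` gives the population predicted after 30 time units for a group starting with 500 members.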

3.2.1 Viable Group Dynamics: A Coevolutionary Sketch

By extrapolating on this empirically-based relationship, we propose a dynamic model featuring the following system of equations:

$$\left \{\begin{array}{rl} \frac{d{U}_{t}} {dt} & = \frac{{\lambda }_{a}} {{\nu }_{b}} {g}_{{m}_{t}}\left (1 - \frac{{A}_{t}} {{b}_{\mathrm{max}}{P}_{t}}\right ){U}_{t}^{2} \\ \frac{d{P}_{t}} {dt} & = \frac{{\lambda }_{b}} {{\nu }_{b}} {g}_{{m}_{t}}\left ({P}_{t}^{2} -\frac{{A}_{t}{P}_{t}} {{b}_{\mathrm{max}}} \right ) \end{array}\right.$$
(4.6)

while ∀t ≥ 0, the following double inequality should hold:

$${\rho }^{-} < {U}_{ t}/{P}_{t} < {\rho }^{+}.$$
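The first line of (4.6) can be recovered from (4.4a) and (4.5) in one step, under the assumption (made explicit here only to spell out the extrapolation) that the initial values \({U}_{{t}_{0}}\), \({P}_{{t}_{0}}\), \({A}_{{t}_{0}}\) and \({m}_{{t}_{0}}\) are replaced by the current values \({U}_{t}\), \({P}_{t}\), \({A}_{t}\) and \({m}_{t}\):

$$\frac{d{U}_{t}} {dt} = \phi ({U}_{t},{P}_{t},{A}_{t},{m}_{t})\,{U}_{t} = \frac{{\lambda }_{a}} {{\nu }_{b}} {g}_{{m}_{t}}\left (1 - \frac{{A}_{t}} {{b}_{\mathrm{max}}{P}_{t}}\right ){U}_{t}^{2}$$

The second line of (4.6) has the same structure once \({P}_{t}^{2}\) is factored out, with \({\lambda }_{b}\) in place of \({\lambda }_{a}\). As an idealised illustration, if the prefactor \(\kappa = \frac{{\lambda }_{a}} {{\nu }_{b}} {g}_{{m}_{t}}(1 - \frac{{A}_{t}} {{b}_{\mathrm{max}}{P}_{t}})\) were held constant and positive, the equation \(d{U}_{t}/dt = \kappa {U}_{t}^{2}\) would integrate to \({U}_{t} = {U}_{0}/(1 - \kappa {U}_{0}t)\), which grows faster than any exponential and diverges at \(t = 1/(\kappa {U}_{0})\).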

Possible refinements for specific cases and under specific subsets of assumptions, for instance connecting \({A}_{t}\) itself to \({U}_{t}\) and \({P}_{t}\), are left to the reader. The same goes for the viability domain: to account for the need for a group to keep receiving contributions, constraints should be put on \(d{P}_{t}/dt\), for instance by requiring that it remains above a certain level of activity and below a certain threshold of cognitive effort for group members (so that they can reasonably keep track of ongoing contributions). Similarly, exact solutions of this class of models are left to the interested reader. These kinds of dynamics can be explored approximately and through simulation with software such as Kaviar, which makes it possible to compute the corresponding viability kernels (see Chap. 7).
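To illustrate what a simulation-based exploration of system (4.6) might look like, the sketch below integrates the two equations with a plain forward-Euler scheme and reports the first time at which the constraint \({\rho }^{-} < {U}_{t}/{P}_{t} < {\rho }^{+}\) fails. It is not the algorithm used by Kaviar and it does not compute a viability kernel: it only follows a single trajectory. It assumes that \({A}_{t}\) and \({m}_{t}\) stay constant and that m is 0 or 1; the values of \({\lambda }_{a}\), \({\lambda }_{b}\), \({\rho }^{-}\), \({\rho }^{+}\) and all other parameters are hypothetical.

```python
def simulate(u0, p0, a, m, lam_a, lam_b, b_min, b_max, g,
             rho_minus, rho_plus, dt, steps):
    """Forward-Euler integration of system (4.6) with A_t and m_t held constant.

    Returns the visited states (t, U, P) and the first time at which
    rho_minus < U/P < rho_plus fails, or None if it holds for all steps.
    The initial state is assumed to lie inside the constraint domain.
    """
    nu_b = (b_max - b_min) / b_max
    u, p = u0, p0
    trajectory = [(0.0, u, p)]
    for k in range(1, steps + 1):
        # Right-hand sides of (4.6), evaluated at the current state.
        du = (lam_a / nu_b) * g[m] * (1.0 - a / (b_max * p)) * u ** 2
        dp = (lam_b / nu_b) * g[m] * (p ** 2 - a * p / b_max)
        u, p = u + dt * du, p + dt * dp
        if not (rho_minus < u / p < rho_plus):
            return trajectory, k * dt  # the trajectory leaves the domain here
        trajectory.append((k * dt, u, p))
    return trajectory, None


# Hypothetical parameter values, chosen only to produce a readable example.
trajectory, exit_time = simulate(
    u0=100.0, p0=2000.0, a=10.0, m=1,
    lam_a=1e-5, lam_b=1e-7, b_min=0.001, b_max=0.02, g=(1.0, 2.5),
    rho_minus=0.01, rho_plus=0.2, dt=0.1, steps=10000,
)
```

With these illustrative values \({U}_{t}\) grows faster than \({P}_{t}\), so the ratio eventually crosses the upper bound and `exit_time` is set; a state-dependent \({A}_{t}\) or \({m}_{t}\), acting as controls, could be substituted for the constants to explore whether the trajectory can be kept inside the domain.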

4 Conclusions

This chapter presented a methodology aimed at empirically appraising one possible dimension and understanding of “viability” (growth-related viability), in two paradigmatic cases of collaborative communities. In this respect, it constitutes a preliminary framework and a necessary step towards defining and modelling pattern resilience and viability in the context of social systems.

In particular, we assessed the interplay of demographic factors and governance structure and their role among the forces driving the macro-level growth of content-based Web communities; in the case of social media communities, we emphasised the role of social network properties as a class of indirect control factors. Despite a largely shared ontology (governance features, demographic factors) between the two case studies and the application of a similar analysis taking growth as a proxy for viability, we noticed several macroscopic discrepancies: most importantly, we observed that demographic and governance properties are good predictors of growth in the case of wikis, but are surprisingly poor at predicting growth rates in social media communities. This discrepancy may stem in part from the specific methodology that we adopted here (e.g. unnoticeable effects on relative growth rates as opposed to absolute growth). The observed inconsistency in the effectiveness of control factors may also be due to distinct underlying modes of peer production (genuinely collaborative in the case of wikis, as users jointly modify shared content, vs. atomistic in the case of media sharing, as users contribute individual content to a shared pool) or to distinct types of social interaction (suggesting that the underlying social network acts as a stronger recruitment factor in social media groups).

All in all, we characterised the relationships between structural, demographic and governance-related variables of these online groups, and interpreted these relationships from the perspective of the evolution, stability and sustainability of these Web communities. We essentially showed how data can be analysed and how, in principle, the results of such analysis could be used as input for highly aggregated mathematical models. Within the framework of viability theory, we argue that this approach makes it possible to define hypotheses pertaining to the autonomous dynamics of such systems and the factors that control them, and that it therefore serves as a first step towards the design and computation of the corresponding “capture basins”, or regions of the variable space where viability can be ensured. More broadly, we suggest that it constitutes a preliminary framework and a necessary step towards defining and modelling pattern resilience and viability in the context of social systems.