Keywords

1 Introduction

Social relational networks have rapidly become an important technology in our digital based information intense world [18]. Among the notable examples of social network sites are Facebook and LinkedIn. In addition to providing an ability for enabling people from all over the world to connect with each other each they provide a vast source of information about the individual participants in the network, the so called nodes. Each of these participants can be viewed as a kind of database containing information about themselves. We see here that a social network can be viewed as kind of network of databases. This leads us to understand that the development of appropriate technologies to manage social networks requires a combination of the use of network technologies, graph theory and database technologies [9]. Furthermore the development of intelligent social network management requires the extension of these two technologies by the introduction of ideas from soft and granular computing, computational intelligence [8]. In addition to the intelligent extension of these two technologies with soft computing an important task on the road map of developing intelligent social networks is the combining of network theory with database theory. Since many advances have been made toward the development of intelligent database technologies, especially by contributors to this volume, here we shall take some steps in the other two tasks, the intelligent extension of social network theory and the combining of social networks with databases. We note that one area that has received a considerable amount of research is the mining of social networks [4, 5]. We emphasize that our interest here is not on this problem but we shall be more interested in the issue of querying the social network with its nodes seen as individual databases. We shall refer to this as SONDAB-Q as an acronym for SOcial Network DAtaBase Querying.

The current social network technology can be extended and enriched to help in modeling these newly emerging applications by introducing ideas from fuzzy sets and related granular computing technologies [1016]. We can provide this extension and enrichment in two ways. The first is with the introduction of fuzzy graphs representing the networks [1719]. This allows a generalization of the types of connection between nodes in a network from simply connected or not to weighted or fuzzy connections. Here the idea of strength of connection becomes important. The second and perhaps more interesting extension is the use of Zadeh’s fuzzy set based paradigm of computing with words [1012] to provide a bridge between a human network analyst’s linguistic description of social network concepts and the formal model of the network. Fundamental to this capability is the realization that both formal network models and the paradigm of computing with words are built upon set based technologies. More specifically, the formal representation of a social network is in terms of a mathematical set object called a relationship [1] and computing with words uses a set object, fuzzy subsets, to formally represent the semantics of linguistic terms. This common underlying set based technology allows us to take human concepts and formally represent them in terms of network properties. This in turn allows an analyst to estimate the truth or falsity of observations about a network as well helps in the mining of social relation networks. In an attempt to help the reader get an understanding of the technology useful in this approach we provide few examples of how we would model some social network concepts.

Another useful idea we discuss is vector-valued nodes. Here we associate with each node a vector whose components are the attribute values of the node. Combining this with the machinery of Zadeh’s computing with words we are then able to intelligently query the network with questions that involve both attributes and connections. We see this as the basis of an emerging discipline of social network database querying, SONDAB-Q

2 Fuzzy Graphs

Since a social network can be formally viewed as graph whose nodes represent the members of the social network we begin by introducing some ideas from fuzzy graphs. Here we first describe the idea of a fuzzy relationship. The concept of a fuzzy relationship plays a fundamental role in modeling a type of weighted graph called a fuzzy graph [1719]. Let X be a set of elements. A fuzzy relationship on X is a mapping \(\mathrm{{R}}: \mathrm{{X}} \times \mathrm{{X}} \rightarrow \) [0, 1] where R(x, y) indicates the degree of relationship between x and y. We note that we can view a fuzzy relationship as a fuzzy subset on \(\mathrm{{X}} \times \mathrm{{X}}\). This allows us to use much of the formalism of fuzzy sets. For example we can say that \(\mathrm{{R}}_{1} \subseteq \mathrm{{R}}_{2}\) if \(\mathrm{{R}}_{1}\)(x, y) \(\le \mathrm{{R}}_{2}\)(x, y) for all (x, y). We note that \(\mathrm{{R}}_{1} \subseteq \mathrm{{R}}_{2}\) means that \(\mathrm{{R}}_{1}\) is a subset of \(\mathrm{{R}}_{2}\). We shall also use the terminology \(\mathrm{{R}}_{1} \le \mathrm{{R}}_{2}\).

Some notable properties that can be associated with fuzzy relationships are

  1. (1)

    Reflexivity: R(x, x) \(=\) 1 for all x

  2. (2)

    Symmetry: R(x, y) \(=\) R(y, x)

  3. (3)

    Transitivity: \( \mathrm{{R}}(\mathrm{{x}}, \mathrm{{z}}) \ge \mathrm{{Max}}_{\mathrm{{y}}} [\mathrm{{R}}(\mathrm{{x}}, \mathrm{{y}}) \wedge \mathrm{{R}}(\mathrm{{y}}, \mathrm{{z}})]\)

An important operation on fuzzy relations is composition. Assume \(\mathrm{{R}}_{1}\) and \(\mathrm{{R}}_{2}\) are two relations on X. The composition \(\mathrm{{R}} = \mathrm{{R}}_{1}\ \blacklozenge \ \mathrm{{R}}_{2}\,\) is also a relationship on X such that

$$\begin{aligned} \mathrm{{R}}(\mathrm{{x}}, \mathrm{{z}}) = \mathrm{{Max}}_{\mathrm{{y}}}[\mathrm{{R}}_{1}(\mathrm{{x}}, \mathrm{{y}}) \wedge \mathrm{{R}}_{2}(\mathrm{{y}}, \mathrm{{z}})] \end{aligned}$$

The composition operation can be shown to be associative

$$\begin{aligned} (\mathrm{{R}}_{1}\,\blacklozenge \,\mathrm{{R}}_{2}) \ \blacklozenge \ \mathrm{{R}}_{3} = \mathrm{{R}}_{1}\,\blacklozenge \,(\mathrm{{R}}_{2} \ \blacklozenge \ \mathrm{{R}}_{3}) \end{aligned}$$

The associativity property allows us to use the notation \(\mathrm{{R}}^{\mathrm{{k}}} = \mathrm{{R}}\ \blacklozenge \ \mathrm{{R}} \ \blacklozenge \ \,\ldots \ \blacklozenge \ \mathrm{{R}}\) for the composition of R with itself k times. In addition we shall define \(\mathrm{{R}}^{0}\) to be such that \(\mathrm{{R}}^{0}(\mathrm{{x}}, \mathrm{{y}}) = 0\) for all x and y.

If R is reflexive then \(\mathrm{{R}}^\mathrm{{{k_2}}} \supseteq \mathrm{{R}}^{\mathrm{{k_1}}}\) for \(\mathrm{{k}}_{2} > \mathrm{{k}}_{1}\). On the other hand if R is transitive, it can be shown that \(\mathrm{{R}}^\mathrm{{{k_2}}} \subseteq \mathrm{{R}}^{\mathrm{{k_1}}}\) if \(\mathrm{{k}}_{2}\,>\,\mathrm{{k}}_{1}\). From this we see that if R is reflexive and transitive then \(\mathrm{{R}}^{\mathrm{{k_2}}} = \mathrm{{R}}^{\mathrm{{k_1}}}\) for all \(\mathrm{{k}}_{1}\) and \(\mathrm{{k}}_{2}\,\ne \,0\).

We shall now define various types of fuzzy graphs. Let X be a set of elements, which we shall refer to using graph terminology as nodes or vertices. We shall further assume R is a reflexive fuzzy relationship on X. The pair \({<}\mathrm{{X}}, \mathrm{{R}}{>}\) can be seen as defining a fuzzy or weighted graph in which R(x, y) is the weight associated with the arc \(\mathrm{{x}} \rightarrow \mathrm{{y}}, (\mathrm{{x}}, \mathrm{{y}})\). More generally if F is a fuzzy subset of X we can also define a fuzzy graph as \({<}\mathrm{{X}}, \mathrm{{F}}, \mathrm{{R}}{>}\). Here in addition to having degrees of connection we also have a degree to which each of the nodes belongs to the network. In this case we let R.F be a relationship on X defined such that \(\mathrm{{R}}.\mathrm{{F}}(\mathrm{{x}}, \mathrm{{y}}) = \mathrm{{R}}(\mathrm{{x}}, \mathrm{{y}}) \,\wedge \, \mathrm{{F}}(\mathrm{{x}}) \,\wedge \, \mathrm{{F}}(\mathrm{{y}})\) and say that R.F is the relationship R on F. We note here that \(\mathrm{{R}}.\mathrm{{F}}(\mathrm{{x}}, \mathrm{{x}}) = \mathrm{{F}}(\mathrm{{x}})\). If \(\mathrm{{F}} = \mathrm{{X}}\) then \(\mathrm{{R}}.\mathrm{{F}}(\mathrm{{x}}, \mathrm{{y}}) = \mathrm{{R}}(\mathrm{{x}}, \mathrm{{y}})\). It can be shown here that \(\mathrm{{R}}.\mathrm{{F}}^{\mathrm{{k_1}}} \subseteq \mathrm{{R}}.\mathrm{{F}}^{\mathrm{{k_2}}} \,\mathrm{if}\, \mathrm{{k}}_{2}\,>\,\mathrm{{k}}_{1}\). We note that if \(\mathrm{F} = \mathrm{X}\) then \({<}\mathrm{{X}}, \mathrm{{F}}, \mathrm{{R}}{>} \,=\, {<}\mathrm{{X}}, \mathrm{{R}}{>}\).

If R is symmetric we shall say \({<}\mathrm{{X}}, \mathrm{{F}}, \mathrm{{R}}{>}\) is an undirected fuzzy graph. We note that if R is symmetric then R.F is also symmetric. If R is not symmetric we shall refer to \({<}\mathrm{{X}}, \mathrm{{F}}, \mathrm{{R}}{>}\) as a directed graph and we refer to a pair (x, y) as an arc. Here the weight on the arc (x, y) and arc (y, x) may be different. In the case of where R symmetric we shall refer to the pair (x, y) as an edge. Since we shall primarily be concerned with undirected graphs, we shall simply use the unmodified term graph or network to refer to this case where R is symmetric.

At times, especially when working with undirected graphs, we shall find it convenient to consider the space U which consists of all unordered pairs of distinct elements which we denote as {x, y}. In this case we can refer to \(\mathrm{{R}}_{\mathrm{{U}}}\) as the reflection of R on U. In this \(\mathrm{{R}}_{\mathrm{{U}}}(\{\mathrm{{x}}, \mathrm{{y\}}}) = \mathrm{{R}}(\mathrm{{x}}, \mathrm{{y}}) = \mathrm{{R}}(\mathrm{{y}}, \mathrm{{x}})\). We note that at a formal level we can also view U as consisting of all subsets of X consisting of two elements.

Assume \(\mathrm{{G}} = {<}\mathrm{{X}}, \mathrm{{F}}, \mathrm{{R}}{>}\) is a fuzzy graph. A path \(\uprho \) in \(\mathrm{{G}}\) is a sequence of distinct nodes \(\mathrm{{x}}_{0} \mathrm{{x}}_{1}\cdots \mathrm{{x}}_{\mathrm{{n}}}\). The number of links in the path is n. The strength of the path is defined as

$$\begin{aligned} \mathrm{{ST}}(\uprho )=\mathop {\mathbf{{Min}}}\limits _{\mathrm{{i}}=1\ \mathrm{{to}}\ \mathrm{{n}}} [\mathrm{{R.F}}(\mathrm{{x}}_{\mathrm{{i}}-1}, \mathrm{{x}}_\mathrm{{i}})]. \end{aligned}$$

If F \(=\) X then \(\mathrm{{ST}}(\uprho ) =\mathop {\mathbf{{Min}}}\limits _{\mathrm{{i}}=1\ \mathrm{{to}}\ \mathrm{{n}}} [\mathrm{{R}}(\mathrm{{x}}_\mathrm{{i-1}}, \mathrm{{x}}_\mathrm{{i}})]\).

Two nodes for which there is a path \(\uprho \) with \(\mathrm{{ST}}(\uprho )> 0\) between them are called connected. We call \(\uprho \) a cycle if \(\mathrm{{n}} \ge 2\) and \(\mathrm{{x}}_{0} = \mathrm{{x}}_{\mathrm{{n}}}\)

Consider the graph \(\mathrm{{G}} = {<}\mathrm{{X}}, \mathrm{{R}}{>}\) let us now consider \(\mathrm{{R}}^{\mathrm{{k}}}\). We can show that \(\mathrm{{R}}^\mathrm{{{k}}}(\mathrm{{x}}, \mathrm{{y}})\) is the strength of the strongest path from x to y containing at most k links. We see that if X has n nodes then \(\mathrm{{R}}^{\mathrm{{n}}-1}\) provides the strongest connection between two nodes using any number of links. If \(\mathrm{{R}}^\mathrm{{{k}}}(\mathrm{{x}}, \mathrm{{y}}) \ne \) 0 we can say x and y are connected at least of order k.

We note that if \(\mathrm{{G}} = {<}\mathrm{{X}}, \mathrm{{F}}, \mathrm{{R}}{>}\) we can make statements similar to the above about R.F.

Assume \(\mathrm{{G}} = {<}\mathrm{{X}}, \mathrm{{R}}{>}\) is a fuzzy graph. Let \(\uprho = \mathrm{{x}}_{0}~\mathrm{{x}}_{1} \cdots \mathrm{{x}}_{\mathrm{{n}}}\) be a path in X. A concept introduced by Rosenfeld [17] is the length of the path. He defined

$$\begin{aligned} \mathrm{{L}}(\uprho )=\sum _{\mathrm{{i}}=1}^\mathrm{{n}} {\frac{1}{\mathrm{{R}}(\mathrm{{x}}_{\mathrm{{i}}-1} ,\mathrm{{x}}_\mathrm{{i}})}} \end{aligned}$$

Clearly \(\mathrm{{L}}(\uprho ) \ge \) n. We note that if there exists one \(\mathrm{{R}}(\mathrm{{x}}_{\mathrm{{i}}-1}, \mathrm{{x}}_{\mathrm{{i}}}) = 0\) then \(\mathrm{{L}}(\uprho )=\infty \) and \(\mathrm{{St}}(\uprho ) = 0\). We note that if R is crisp and \(\mathrm{{St}}(\uprho ) \ne 0\) then \(\mathrm{{L}}(\uprho ) = \mathrm{{n}}\). Using this idea we can define the distance between two nodes x and y in the graph G as

$$\begin{aligned} \delta (\mathrm{{x}}, \mathrm{{y}})= {\mathop {\mathrm{{Min}}}\limits _{{\text {all paths x to y}}}} [\mathrm{{L}}(\uprho )] \end{aligned}$$

It is the length of the shortest path from x to y. It can be shown that \(\delta \) is a metric [17], \(\delta (\mathrm{{x}}, \mathrm{{x}}) = 0, \delta (\mathrm{{x}}, \mathrm{{y}}) = \delta (\mathrm{{y}}, \mathrm{{x}})\) and \(\delta (\mathrm{{x}}, \mathrm{{z}}) \le \quad \delta (\mathrm{{x}}, \mathrm{{y}}) + \delta (\mathrm{{y}}, \mathrm{{z}})\). Using the terminology common in social network analysis [1, 2] we can refer to the path \(\uprho \) such that \(\mathrm{{L}}(\uprho )=\delta (\mathrm{{x}}, \mathrm{{y}})\) as the geodesic between x and y and refer to \(\delta (\mathrm{{x}}, \mathrm{{y}})\) as the geodesic distance.

While there appears to be some inverse connection between strength of a path and its length as for example in the case where \(\mathrm{{ST}}(\uprho ) = 0\) implies \(\mathrm{{L}}(\uprho )=\infty \) this is not a strict correlation. Consider for example the two paths \(\uprho _{1}\) and \(\uprho _{2}\) shown in Fig. 1. We see that \(\mathrm{{ST}}(\uprho _{1}) = 0.75 > \mathrm{{Str}}(\uprho _{2}) = 0.5\). On the other hand \(\mathrm{{L}}(\uprho _{1}) = \frac{4}{3}+ \frac{4}{3} + \frac{4}{3} = 4 \ge \mathrm{{L}}(\uprho _{2}) = 1\)

Fig. 1
figure 1

Two paths

Let \({<}\mathrm{{X}}, \mathrm{{R}}{>}\) be a weighted graph. At times it can be useful to view this from the level set point of view [20]. This will allow us to make use of the representation theorem [20] to extend operations to these fuzzy relationships. We recall that if R is a fuzzy relationship on \(\mathrm{{X}} \times \mathrm{{X}}\) then \(\mathrm{{R}}_{\alpha }= \{(\mathrm{{x}}, \mathrm{{y}})/\mathrm{{R}}(\mathrm{{x}}, \mathrm{{y}}) \ge \upalpha \}\) is called the \(\alpha \) level set of R. We see that each \(\mathrm{{R}}_{\alpha }\) is a crisp relationship on X.

We note that if \(\mathrm{{R}}^{\mathrm{{k}}}\) is the k composition of R then \(\mathrm{R}_\alpha ^\mathrm{k}= \{(\mathrm{{x}}, \mathrm{{y}}) /\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{y}}) \ge \alpha \}\). It can also be shown that [21]

$$\begin{aligned} \mathrm{{R}}^{\mathrm{{k}}}_{\alpha }= \mathrm{{R}}_{\alpha } \ \blacklozenge \ \mathrm{{R}}_{\alpha } \quad \blacklozenge \quad \blacklozenge \ \mathrm{{R}}_{\alpha }, \end{aligned}$$

the k composition of \(\mathrm{{R}}_{\alpha }\) is also \(\mathrm{{R}}^{\mathrm{{k}}}_{\alpha }\).

The representation theorem allows us to represent fuzzy relationship in terms of the collection of its level sets. This can be used to extend operations that are well defined on crisp sets to be defined on fuzzy sets. Using this we can express \(\mathrm{{R}}{^{\mathrm{{k}}}} = \mathop {\smallsmile }\limits ^{1}_{\alpha = 0} \{ \frac{\alpha }{\mathrm{{R}}^\mathrm{{k}}_{\alpha }} \}\), here we are using the standard fuzzy set notation where \(\frac{\alpha }{\mathrm{{R}}^\mathrm{{k}}_{\alpha }}\) indicates that \(\alpha \) is the membership grade of \({\mathrm{{R}}^\mathrm{{k}}_{\alpha }}\) .

Undirected fuzzy graphs, which are also transitive, provide a very interesting class of graph. In these graphs if x is related to y and y is related to z then x is related to z. In social networks transitivity captures the property “friend of a friend is a friend”.

Many of the concepts introduced in the preceding are valid for both directed and undirected graphs. A fundamental difference is the following. Assume \(\mathrm{{x}}_{0} \mathrm{{x}}_{1} \cdots \mathrm{{x}}_{\mathrm{{n}}}\) is a sequence of points that constitute a path. In an undirected graph its transpose \(\mathrm{{x}}_{\mathrm{{n}}} \mathrm{{x}}_{\mathrm{{n}}-1} \cdots \mathrm{{x}}_{0}\) is a path having the same strength and length. In a directed graph we can’t say anything about the transpose sequence. Whenever possible in an undirected graph we shall refer a path as between \(\mathrm{{x}}_{0} \,\text {and}\, \mathrm{{x}}_{\mathrm{{n}}}\) while in a directed graph we shall refer to the path from \(\mathrm{{x}}_{0}\, {\text {to}}\, \mathrm{{x}}_{\mathrm{{n}}}\).

3 Computing with Words

Our goal here is to extend our capabilities for analyzing social relational networks by associating with these networks concepts with which human beings view social network relationships in such a way that they are comprehensible to both humans and machines. On one hand human beings predominantly use linguistic terms in which to communicate, reason and understand the world. Machines on the other hand require much more formal symbols. One of the most useful approaches to providing a bridge between man and machine comprehension is the general framework provided by granular computing [8] and more specifically Zadeh’s fuzzy set based paradigm of computing with words. This technology allows for a high level of man-machine cooperation by providing a framework in which concepts can be modeled in a manner amenable to both. The potential for applying fuzzy set based technologies within the domain of social network analysis is particularly promising given that the computer modeling of these networks is in terms of mathematical relationships which as we already noted are equivalent to fuzzy sets.

In the following we introduce some ideas from the fuzzy set based approach to computing with words. Let U be some attribute that takes its value in the space Y. An example of such an attribute is age, which, for human beings takes its value in the set \(\mathrm{Y} = \{0, \ldots , 100\}\). A fundamental concept in computing with words is the idea of linguistic value [22]. A linguistic value is some word used to express the value of U. In the case of age, some examples of linguistic values are “old”, “young”, “about 30”, A linguistic value can be seen as a granule, it is a collection of values from the space Y. As we noted it is with the aid of linguistic values that human beings best understand and reason about their environment.

By a vocabulary we shall mean a collection of commonly understood words that are used to express the linguistic values associated with an attribute. These are the words that people use to communicate with each other. They are also the words people use to reason with about the attribute. Fuzzy sets provide a useful tool for formalizing the idea of a vocabulary in a way that allows machine computation and understanding. If W is a word in the vocabulary associated with the variable U we can express W as a fuzzy subset W of the domain of U. Here then for any element \(\mathrm{{y}} \in \mathrm{{Y}}\) the membership grade of y in W, W(y), indicates the compatibility of the value y with the linguistic value W. Thus the fuzzy subset W provides a machine comprehensible interpretation of the word W.

We are now in a position to bridge the gap in man-machine communication with respect to the analysis of social relational network by allowing the human to build a vocabulary of linguistic terms associated with an attribute and then provide a representation for these terms by fuzzy sets (see Fig. 2). Thus here now we have a communal vocabulary coherent to both the human and the machine.

Fig. 2
figure 2

Paradigm of man-machine understanding

What must be emphasized here is that the choice of the vocabulary as well as the associated meaning of the words in terms of fuzzy sets is completely in the hands of the human partner. This greatly simplifies this task. The vocabulary that will be used is imposed, we are giving the computer meaning. Particularly noteworthy is the fact that learning algorithms need not necessarily be required, the computer is told these are the terms I will be using and this is what they mean in your language, fuzzy sets. This is not to say that the construction of the communal vocabulary is not an important and complex task to which future research must be dedicated so that it is thoughtfully and appropriately done but only to reflect the fact that it is within our power to impose what we have decided. This situation allows us in the following to assume the availability of a communal vocabulary associated with attributes of interest.

In analyzing weighted relational networks there are a number of attributes about which it will be useful to have a vocabulary of commonly accepted terms. One such attribute is strength of connection. This is an attribute whose domain is the unit interval, \(\mathrm{I} = [0, 1]\). Terms like strong, weak, none would be part of a vocabulary associated with this attribute. In this case we would define the word strong as a fuzzy subset S of [0, 1] such that for any y \(\in \) [0, 1] the value S(y) would indicate the degree to which y satisfies the working definition of the concept strong connection. We would assume that S would be such that \(\mathrm{S}(0) = 0, \mathrm{S}(1) = 1\) and S is monotonic, \(\mathrm{{S}}(\mathrm{{y}}_{1}) \ge \mathrm{{S}}(\mathrm{{y}}_{2})\, \mathrm{{if}}\,\mathrm{{y}}_{1} \ge \mathrm{{y}}_{2}\). A prototypical example of the definition of the term strong would be the piecewise linear fuzzy subset shown in Fig. 3.

Fig. 3
figure 3

Fuzzy subset of term strong

Another attribute for which it will be useful to have a communal vocabulary is the number of links in a path, path length. Some words in such a vocabulary would be short and long. In the case of this attribute we would provide a semantics for the words of the vocabulary, in terms of fuzzy subsets of the domain \(\mathrm{H} = \{0, 1, \ldots , \mathrm{n}\}\) when n is the number of vertices in the network.

Concepts, in addition to actual physical objects, can provide attributes of interest. One such concept that we shall find useful are proportions. Here U is an attribute that takes its value in the set \(\mathrm{I} = [0, 1]\) where r \(\in \) [0, 1] is a proportion. Examples of linguistic values that could be part of a communal vocabulary associated with this attribute are many, most and about half. As noted by Zadeh these terms provide a generalization of the quantifiers “all” and “none” that are used in logic. We can refer to these as linguistic quantifiers.

4 Clusters and Cliques

An important idea in classical graph theory is the concept of a cluster; here we want to extend this to weighted graphs. Let us first review the ideas from crisp graph theory. Let \({<}\mathrm{{X}}, \mathrm{{R}}{>}\) be a graph where R is a crisp relationship. We are implicitly assuming our graph is undirected so R is symmetric. One approach is to call a subset C of X a cluster of order k if

  1. (a)

    For all node pairs x and y in C we have \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{y}}) = 1\)

  2. (b)

    For all nodes z \(\notin \) C there is some x \(\in \) C such that \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{z}}) = 0\).

Note: When \(\mathrm{k} = 1\) we call C a clique.

Note: The order k of the cluster is reflected in the term \(\mathrm{{R}}^{\mathrm{{k}}}\).

In [17] Rosenfeld suggested extending these ideas to fuzzy graphs. Here we let \({<}\mathrm{{X}}, \mathrm{{R}}{>}\) be a fuzzy graph where R is a symmetric fuzzy relationship. A crisp subset C \(\subset \) X is called a fuzzy cluster of order k if

$$\begin{aligned} \mathrm{{Min}}_{\mathrm{{x}},\mathrm{{y}} \in \mathrm{{C}}}[\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{y}})] > \mathrm{{Sup}}_{\mathrm{{z}} \notin \mathrm{{C}}}[\mathrm{{Inf}}_{\mathrm{{w}}\in \mathrm{{C}}}\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{w}}, \mathrm{{z}})]. \end{aligned}$$

In the following we suggest an alternative, more human meaningful, definition of a clique and then using the paradigm of computing with words we can provide a procedure for evaluating how well a subset of nodes satisfies our definition.

Let A be a subset of elements from X. We shall define A to be a clique if

  1. C1:

    All elements in A are connected by a short strong path.

  2. C2:

    No element not in A is connected to an element in A by a strong path.

Here then we have two criteria that need to be satisfied for a subset A \(\subset \) X to be considered as a clique, \(\mathrm{{C}}_{1}\) and \(\mathrm{{C}}_{2}\). If we let \(\mathrm{{C}}_{1}(\mathrm{{A}}) \in [0, 1]\) be the degree to which it is true that A satisfies \(\mathrm{{C}}_{1}\) and \(\mathrm{{C}}_2(\mathrm{{A}}) \in [0. 1]\) be the degree to which it is true that A satisfies \(\mathrm{{C}}_{2}\) then C(A) \(= \mathrm{{Min}}[\mathrm{{C}}_{1}(\mathrm{{A}}), \mathrm{{C}}_{2}(\mathrm{{A}})]\) is the degree to which it is true that A is a clique.

We must now formulate the two criteria in terms of features from the network \({<}\mathrm{{X}}, \mathrm{{R}}{>}\). Here we shall make use of the communal vocabularies for the attributes strength of connection and path length. We shall assume the availability of the word strong for strength of connection and short for length of path in the vocabulary, that is we have expressions for the meaning, semantics, of these words as fuzzy subsets.

We first focus on \(\mathrm{{C}}_{1}\). Here we will make use of the term “strong connection” which we will represent as a fuzzy subset S of the unit interval I. In addition we need use the linguistic term “short path” which we represent as a fuzzy subset SH of the space N.

Assume \(\mathrm{{x}}_{i}\) and \(\mathrm{{x}}_{\mathrm{{j}}}\) are two nodes in the proposed clique A. Here \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\) indicates the strength of connection of \(\mathrm{{x}}_{\mathrm{{i}}}\) and \(\mathrm{{x}}_{\mathrm{{j}}}\) for a path of at most k links. For any \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\) the value \(\mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}))\) is the degree to which \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\) is a strong connection. We recall that \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\) is monotonic in k, that is if \(\mathrm{{k}}_{2} > \mathrm{{k}}_{1}\) then \(\mathrm{{R}}^{\mathrm{{k}}_{2}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}) \ge \mathrm{{R}}^{\mathrm{{k}}_{1}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\). In addition as we noted earlier S, strong connection, is also monotonic in strength, S(a) \(\ge \) S(b) if a \(>\) b. From this we can conclude that for \(\mathrm{{k}}_{2} > \mathrm{{k}}_{1}\) then \(\mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}_{2}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})) \ge \mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}_{1}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}))\).

The concept short path with respect to the number of links it contains can be defined as a fuzzy subset SH of \(\mathrm{N} = \{1, 2, \ldots , \mathrm{n}\}\). Here we can observe that generically SH should have the following properties; SH(1) \(=\) 1, SH(n) \(=\) 0 and \(\mathrm{{SH}}(\mathrm{{k}}_{1}) \ge \mathrm{{SH}}(\mathrm{{k}}_{2})\) if \(\mathrm{{k}}_{1} < \mathrm{{k}}_{2}\). Here it is monotonic decreasing in k, increasing k leads to decrease the satisfaction. A prototypical example of this concept is the piecewise linear fuzzy subset as shown in Fig. 4.

Fig. 4
figure 4

Fuzzy set representation of short path

Using these ideas we can determine the degree to which there exists a short-strong connection between \(\mathrm{{x}}_{i}\) and \(\mathrm{{x}}_{\mathrm{{j}}}\). We define this using a form of the Sugeno integral [23] as

$$\begin{aligned} \mathrm{{C}}_{1}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})={\mathop {\mathrm{{Max}}}\limits _{\mathrm{{{k}}}=1 \,\mathrm{{to}}\, \mathrm{{n}}}} [\mathrm{{SH}}(\mathrm{{k}}) \wedge {\mathrm{{S}}}(\mathrm{{R}}^\mathrm{{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}))] \end{aligned}$$

In the above \(\wedge \) is the Min operator. We see that as k increases SH(k) tends to get smaller while \(\mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}},~\mathrm{{x}}_{\mathrm{{j}}})\) tends to get bigger.

We now can use this to determine the degree to which all elements in A are connected by a short-strong path,

$$\begin{aligned} \mathrm{{C}}_{1}({\mathrm{{A}}}) = {\mathop {\mathrm{{Min}}}\limits _{{\mathop {\mathrm{{x}_{\mathrm{i}}}, {\mathrm{{x}_{\mathrm{j}}}} \in \mathrm{{A}}}\limits _{{\mathrm{{x}_{\mathrm{i}}} \ne \mathrm{{x}_{\mathrm{j}}}}}}}}[\mathrm{{C}}_1(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})] . \end{aligned}$$

It is very important to observe the marriage of different types of set objects used in making up the definition of \(\mathrm{{C}}_{1}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\). We first see that we have used \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}},\mathrm{{x}}_{\mathrm{{j}}})\), which is essentially the set–based definition of the network. In addition we have used the fuzzy sets SH and S, which are the fuzzy set definitions of the words short and strong. Here we have taken advantage of the fact that both the basic representation of a graph (network) and the meaning of words can be formulated using a set-based formulation to mix language and network representation. In [8] we referred to this approach as PISNA, the Paradigm for Intelligent Social Network Analysis.

We now consider the second criteria, \(\mathrm{{C}}_{2}\), no element not in A has a strong connection with an element in A. In the following we shall let x \(\in \) A and z \(\notin \) A. For these elements \(\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}, \mathrm{{z}})\) is the strength of the strongest path between x and z of any length. Here then \(\mathrm{{S}}(\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}, \mathrm{{z}}))\) is the degree to which there exists a strong path between x and z. If we calculate \({\mathop {\mathrm{{Max}}}\limits _{\mathrm{{x}} \in \mathrm{{A}}, \mathrm{{z}} \notin \mathrm{{A}}}} [\mathrm{{S}}(\mathrm{{R}}^\mathrm{{{n}}}(\mathrm{{x}},\mathrm{{z}}))]\) we obtain the degree to which there exists a strong path between an element in A and an element not in A. Using this can calculate \(\mathrm{{C}}_{2}\)(A) as

$$\begin{aligned} \mathrm{{C}}_{2}(\mathrm{{A}}) = 1 - {\mathop {\mathrm{{Max}}}\limits _{\mathrm{{x}} \in \mathrm{{A}}, \mathrm{{z}} \notin \mathrm{{A}}}} [\mathrm{{S}}(\mathrm{{R}}^\mathrm{{{n}}}(\mathrm{{x}},\mathrm{{z}}))] \end{aligned}$$

We now can use these formulations to determine the degree to which a subset A is a clique.

In the following we will make some observations regarding the preceding approach to defining the concept of clique.

In the preceding we defined the concept of short path in an absolute way as a fuzzy subset SH of the set of number of elements in a graph. It is possible to express short in a more universal way as a subset of the proportion of the number of elements in the graph. Thus here we can define “short path” as a fuzzy subset \(\mathrm{{SH}}_{\mathrm{{p}}}\) defined on the unit interval where for any r \(\in \) [0, 1] the value \(\mathrm{{SH}}_{\mathrm{{p}}}\)(r) indicates the degree to which the proportion r of elements in the network constitute a “short path.” Thus here we are defining “short path” as a proportion of number of vertices in the network. This allows us to have a universal definition of the concept of “short path” independent of the number of nodes in the network. In this case where we have \(\mathrm{{SH}}_{\mathrm{{p}}}\) we calculate

$$\begin{aligned} \mathrm{{C}}_{1}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}) = {\mathop {\mathrm{{Max}}}\limits _{\mathrm{{{k}}}={1} \,\mathrm{{to}} \, \mathrm{{n}}}} \left[ \mathrm{{SH}}_{\mathrm{{p}}}\left( \frac{\mathrm{{k}}}{\mathrm{{n}}}\right) \wedge \mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}))\right] \end{aligned}$$

We note that \(\mathrm{{SH}}_{p}\) will have the same form as SH, \(\mathrm{{SH}}_{p}\)(r) decreases as r increases.

We also note that it is possible to somewhat relax the second criteria. Instead of having

\(\mathrm{{C}}_{2} =\) no element, not in A, is connected to an element in A by a strong path.

We can say that

\(\mathrm{{C}}_{2} =\) no element not in A is connected to an element in A by a strong path that is not long.

Here we need obtain the word Long from our communal vocabulary. A typical example of a fuzzy subset representing such a definition is shown in Fig. 5.

If L is the fuzzy subset defined on N corresponding to Long then not Long is a fuzzy subset N.L defined such that \(\mathrm{{N.L(k)}} = 1 - \mathrm{{L(k)}}\). Using this we obtain

$$\begin{aligned} {\mathop {\mathrm{{Max}}}\limits _{{\mathrm{{k}}}= {1} \,\mathrm{{to}} \,\mathrm{{n}}}} [\mathrm{{N.L}}(\mathrm{{k}}) \wedge \mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{z}}))] \end{aligned}$$

as the degree to which there is a strong and not long path between x and z. Finally from this we obtain

$$\begin{aligned} \mathrm{{C}}_{2}(\mathrm{{A}}) = 1 - {\mathop {\mathrm{{Max}}}\limits _{{\mathrm{{x}}}{\in }{\mathrm{{A}}},\mathrm{{z}}\notin \mathrm{{A}}}} \left( {\mathop {\mathrm{{Max}}}\limits _{{\mathrm{{k}}}= {1}\, \mathrm{{to}}\, {\mathrm{{n}}}}} \left[ \mathrm{{N.L}}(\mathrm{{k}}) \wedge \mathrm{{S}}\left( \mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{z}})\right) \right] \right) \end{aligned}$$

We now consider the situation in what we allow the set A to be a fuzzy set. Thus we want to determine if A is a fuzzy clique. We shall consider the same two original conditions \(\mathrm{{C}}_{1}\) and \(\mathrm{{C}}_{2}\) as defining a clique. We shall use \(\tilde{\mathrm{{A}}}\) to indicate our fuzzy Clique.

Fig. 5
figure 5

Fuzzy set representing term Long

Here we first look at \(\mathrm{{C}}_{1}\). In this case we still get for any two nodes x and y \(\in \) X.

$$\begin{aligned} \mathrm{{C}}_{1}(\mathrm{{x}}, \mathrm{{y}}) = {\mathop {{\mathrm{{Max}}}}\limits _{{\mathrm{{k}}}= {1}\, \mathrm{{to}} \,\mathrm{{n}}}} [\mathrm{{SH}}(\mathrm{{k}}) \wedge \mathrm{{S}}(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}, \mathrm{{y}}))] \end{aligned}$$

as the degree to which there is a short and strong path between the nodes x and y. Using this we obtain

$$\begin{aligned} \mathrm{{C}}_{1}(\tilde{\mathrm{{A}}}) = {\mathop {\mathrm{{Min}}}\limits _{{\mathrm{{x}}}\ne {\mathrm{{y}}} \in \mathrm{{X}}}} [(1-\tilde{\mathrm{{A}}}(\mathrm{{x}})) \vee (1-\tilde{\mathrm{{A}}}(\mathrm{{y}})) \vee \mathrm{{C}}_{1}(\mathrm{{x}}, \mathrm{{y}}))] \end{aligned}$$

We note that in the case where \(\tilde{\mathrm{{A}}}\) is crisp this reduces to our original definition of \(\mathrm{{C}}_{1}\). For if x or y are in \(\tilde{\mathrm{{A}}}\) then the disjunction reduces to for \(\mathrm{{C}}_{1}(\mathrm{{x}}, \mathrm{{y}})\) while if either x or y is not in \(\tilde{\mathrm{{A}}}\) then the argument becomes 1.

In this case of \(\mathrm{{C}}_{2}\), for a fuzzy clique we get

$$\begin{aligned} \mathrm{{C}}_{2}(\tilde{\mathrm{{A}}}) = 1 - {\mathop {\mathrm{{Max}}}\limits _{{\mathop {{\mathrm{{x}}}, {\mathrm{{z}}} \in \mathrm{{X}}}\limits _{{\mathrm{{x}} \ne \mathrm{{z}}}}}}} [\tilde{\mathrm{{A}}}(\mathrm{{x}}) \wedge (1- \tilde{\mathrm{{A}}}(\mathrm{{z}})) \wedge \mathrm{{S}}(\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}},\mathrm{{z}})) ] \end{aligned}$$

We note that if \(\tilde{\mathrm{{A}}}\) is crisp this reduces to our previous definition. For if \(\mathrm{{x}} \in \tilde{\mathrm{{A}}}\) and if \(\mathrm{{z}} \notin \tilde{\mathrm{{A}}}\) then it becomes the original format. If either \(\mathrm{{x}} \notin \tilde{\mathrm{{A}}}\) or \(\mathrm{{z}} \in \tilde{\mathrm{{A}}}\) then \(\tilde{\mathrm{{A}}}(\mathrm{{x}}) \wedge (1-\tilde{\mathrm{{A}}}(\mathrm{{z}}))\wedge \mathrm{{S}}(\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}, \mathrm{{z}}))= 0\) and it doesn’t effect the calculation of \(\mathrm{{C}}_{2}(\tilde{\mathrm{{A}}})\).

5 Centrality

An important concept in social network analysis is centrality [1, 2]. The centrality of a node is closely related to its importance in the network. Assume \({<}\mathrm{{X}}, \mathrm{{R}}{>}\) is a relational network where R is a crisp relation. The measure of centrality of node \(\mathrm{{x}}_{i}\) is the number of nodes connected to it by at most k links. In this case

$$\begin{aligned} \mathrm{{C}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}})={\mathop {\sum }\limits _{{\mathop {\mathrm{{j}}=1}\limits _{\mathrm{{j}} \ne \mathrm{{i}}}}}^{\mathrm{{n}}}} {\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_\mathrm{{i}} ,\;\mathrm{{x}}_\mathrm{{j}} )} \end{aligned}$$

is the measure of the centrality of node \(\mathrm{{x}}_{\mathrm{{i}}}\).

If we have a network where R is a weighted graph a straightforward way to extend this is to calculate \(\mathrm{{C}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}})\) as in the above. One problem that can arise here is that a large number of weak connections, small values of \(\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\), can add up to appear as a strong connection. Here we can suggest some other alternative methods for obtaining this measure of centrality.

One method is to use the level set representation and obtain a fuzzy set representation for the centrality. Here we can express

$$\begin{aligned} \mathrm{{C}}^{\mathrm{{k}}}_{\alpha }(\mathrm{{x}}_{\mathrm{{i}}})={\mathop {\sum }\limits _{{\mathop {\mathrm{{j}}=1}\limits _{\mathrm{{j}} \ne \mathrm{{i}}}}}^{\mathrm{{n}}}} {\mathrm{{R}}^{\mathrm{{k}}}_{\alpha }(\mathrm{{x}}_\mathrm{{i}} ,\;\mathrm{{x}}_\mathrm{{j}} )} \end{aligned}$$

Thus here \(\mathrm{{C}}^{\mathrm{{k}}}_{\alpha }(\mathrm{{x}}_{\mathrm{{i}}})\) is the number of nodes connected to \(\mathrm{{x}}_{\mathrm{{i}}}\) with strength of at least \(\alpha \) using at most k links. Using this we can define

$$\begin{aligned} {\tilde{\mathrm{{C}}}}{^{\mathrm{{k}}}}(\mathrm{{x}}_{\mathrm{{i}}}) = {\mathop {{\small smile }}\limits _{{\alpha \in [0, 1]}}}\left\{ \frac{\alpha }{\mathrm{{C}}^\mathrm{{k}}_{\alpha }(\mathrm{{x}}_{\mathrm{{i}}})}\right\} , \end{aligned}$$

here we are using the standard fuzzy notation indicating that \(\alpha \) is the membership grade of \(\mathrm{{C}}^\mathrm{{k}}_{\alpha }(\mathrm{{x}}_{\mathrm{{i}}})\).

In this case \({\tilde{\mathrm{{C}}}}{^{\mathrm{{k}}}}(\mathrm{{x}}_{\mathrm{{i}}})\) is a fuzzy number. We should note that since for \(\alpha _{2} > \alpha _{1}\) we have \(\mathrm{{R}}^\mathrm{{k}}_{\mathrm{{\alpha }} 2} \subseteq \mathrm{{R}}^\mathrm{{k}}_{\mathrm{{\alpha }} 1}\) then \(\mathrm{{C}}^\mathrm{{k}}_{\mathrm{{\alpha }} 1} \mathrm{{C}}^\mathrm{{k}}_{\mathrm{{\alpha }} 2}\).

Let us consider another way to extend the concept of centrality to the case of a weighted graph. An alternative and perhaps more appropriate definition of centrality would be the “number of strong connections using at most k links.” Here we shall define the concept “strong” as a fuzzy subset of unit interval. For example see Fig. 3. Using this definition we can obtain

$$\begin{aligned} {\tilde{\mathrm{{C}}}}{^{\mathrm{{k}}}} (\mathrm{{x}}_\mathrm{{i}})={\mathop {\sum }\limits _{{\mathop {\mathrm{{j}}=1}\limits _{\mathrm{{j}} \ne \mathrm{{i}}}}}^{\mathrm{{n}}}} \;{\text {Strong}}\;{\mathrm{{R}}^{\mathrm{{k}}}(\mathrm{{x}}_\mathrm{{i}} ,\;\mathrm{{x}}_\mathrm{{j}})} \end{aligned}$$

Thus here we transform the scores via the concept strong.

6 Social Network Databases

In the following we shall consider a weighted network \({<}\mathrm{{X}}, \mathrm{{R}}{>}\) where each of the nodes has an associated vector of attribute (feature) values. In these types of networks each of the node objects have various attributes, properties and features. This structure can be viewed as the combination of a network and database.

In these networks we have a collection of q attributes \(\mathrm{{U}}_{1}, \ldots , \mathrm{{U}}_{\mathrm{{q}}}\). In the case where the nodes are people, examples of attributes could be nationality, age or income. Each of the attributes takes a value from a domain \(\mathrm{{Y}}_{\mathrm{{i}}}\). In this situation, each node has an associated q vector whose \(i\)th component is the value of the \(i\)th attribute for that node.

We shall use the notation \(\mathrm{{U}}_{\mathrm{{i}}}(\mathrm{{x}}_{\mathrm{{j}}})\) to indicate the variable corresponding to the attribute \(\mathrm{{U}}_{\mathrm{{i}}}\) in the case of node \(\mathrm{{x}}_{\mathrm{{j}}}\). For example with \(\mathrm{{U}}_{\mathrm{{i}}}\) being the attribute age then \(\mathrm{{U}}_{\mathrm{{i}}}(\mathrm{{x}}_{\mathrm{{j}}})\) would indicate the variable age of \(\mathrm{{x}}_{\mathrm{{j}}}\). If \(\mathrm{{U}}_{\mathrm{{i}}}\) is the attribute country of birth then \(\mathrm{{Y}}_{\mathrm{{i}}}\) would be a list of countries and \(\mathrm{{U}}_{\mathrm{{i}}}(\mathrm{{x}}_{\mathrm{{j}}})\) would be the variable corresponding the country of birth of \(\mathrm{{x}}_{\mathrm{{j}}}\). We shall let \(\mathrm{{v}}_{\mathrm{{ij}}}\) indicate the value of the variable \(\mathrm{{U}}_{\mathrm{{i}}}(\mathrm{{x}}_{\mathrm{{j}}})\), thus \(\mathrm{{U}}_{\mathrm{{i}}}(\mathrm{{x}}_{\mathrm{{j}}}) = \mathrm{{v}}_{\mathrm{{ij}}}\). Thus in this case any node in our network has an associated vector \(\mathrm{{V}}_{\mathrm{{j}}}\) whose i component \(\mathrm{{v}}_{\mathrm{{ij}}}\) corresponds to the value of attribute \(\mathrm{{U}}_{\mathrm{{i}}}\) for node \(\mathrm{{x}}_{\mathrm{{j}}}\). We should observe the above network could in some ways be viewed as a kind of database.

In the following we shall begin to describe techniques that can be used to analyze, investigate and question networks with vector-valued nodes. Here we shall be using flexible/fuzzy-querying techniques [9, 24].

In the following we shall assume that country of residence is one of the attributes, we shall denote this as \(\mathrm{{U}}_{1}\). Thus \(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{j}}})\) is the variable denoting the country of residence of \(\mathrm{{x}}_{\mathrm{{j}}}\). In this case the domain \(\mathrm{{Y}}_{1}\) associated with \(\mathrm{{U}}_{1}\) is the set of countries. In addition, a communal vocabulary associated with this attribute would consist of terms such as “Middle East”, “North America”, “South America” and “Southeast Asia”. Other terms such as “mountainous country”, “Spanish speaking”, and “Oil producing” can also be part of the vocabulary. Each of the terms in the vocabulary would be defined in terms of subsets of \(\mathrm{{Y}}_{1}\). Some of these terms can be defined using crisp subsets while others will require fuzzy subsets.

In addition we shall assume age is another of the attributes associated with the network nodes. We shall denote this as \(\mathrm{{U}}_{2}\) with its domain \(\mathrm{{Y}}_{2}\) being the set of non-negative integers. Here we shall also assume the availability of mutually understandable vocabulary of commonly used linguistic terms to describe a person’s age. These terms will be defined in terms of subsets of \(\mathrm{{Y}}_{2}\).

Assume \(\mathrm{{x}}_{\mathrm{{j}}}\) is some node in our network. We can ask “To what extent is \(\mathrm{{x}}_{\mathrm{{j}}}\) strongly connected to a person residing in South America?” In the following we shall let SA indicate the subset of \(\mathrm{{Y}}_{1}\) corresponding to South America. Using this we can obtain as the answer to our question

$$\begin{aligned} \mathop {\mathrm{{Max}}}\limits _{{\mathrm{{i}}, \mathrm{{i}} \ne \mathrm{{j}}}}[\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{X}}_{\mathrm{{i}}})) \wedge \mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}_\mathrm{{i}}, \mathrm{{x}}_\mathrm{{j}})] \end{aligned}$$

More specifically we can ask “To what extent is \(\mathrm{{x}}_{\mathrm{{j}}}\) strongly connected to a young person residing in South America?” In this case with \(\mathrm{{U}}_{2}\) being the attribute age and Young being the subset corresponding concept young person we get as the answer

$$\begin{aligned} \mathop {\mathrm{{Max}}}\limits _{{\mathrm{{i}}, \mathrm{{i}} \ne \mathrm{{j}}}}[\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{X}}_{\mathrm{{i}}})) \wedge \mathrm{{Young}}(\mathrm{{U}}_2(\mathrm{{X}}_\mathrm{{i}}))\wedge \mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}_\mathrm{{i}}, \mathrm{{x}}_\mathrm{{j}})] \end{aligned}$$

We note this value is also the truth of the question “Does \(\mathrm{{x}}_{\mathrm{{j}}}\) have a connection to a Young South American”.

We note that if we want to find out “does \(\mathrm{{x}}_{\mathrm{{j}}}\) have a strong connection to a Young South American” then we obtain the truth of this as

$$\begin{aligned} \mathop {\mathrm{{Max}}}\limits _{{\mathrm{{i}}, \mathrm{{i}} \ne \mathrm{{j}}}}[\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{X}}_{\mathrm{{i}}})) \wedge \mathrm{{Young}}(\mathrm{{U}}_2(\mathrm{{X}}_\mathrm{{i}}))\wedge \mathrm{{Strong}} (\mathrm{{R}}^\mathrm{{{n}}}(\mathrm{{x}}_\mathrm{{i}}, \mathrm{{x}}_\mathrm{{j}})))] \end{aligned}$$

Here we have replaced the predicate \(\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}_\mathrm{{i}}, \mathrm{{x}}_\mathrm{{j}})\) with Strong \((\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}_\mathrm{{i}}, \mathrm{{x}}_\mathrm{{j}}))\)

A related question is the following. Let B be some crisp subset of X. We now ask what is the strongest connection between an element in B and a Young South American not in B. The answer is then obtained from the following

$$\begin{aligned} \mathop {\mathrm{{Max}}}\limits _{{\mathrm{{X}}_\mathrm{{{i}}} \in \mathrm{{B}}}}\left[ \mathop {\mathrm{{Max}}}\limits _{{{\mathrm{{x}}}_{\mathrm{{j}}}}}[\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{X}}_{\mathrm{{i}}})) \wedge \mathrm{{Young}}(\mathrm{{U}}_2(\mathrm{{X}}_\mathrm{{j}}))\wedge \mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}_\mathrm{{i}}, \mathrm{{x}}_\mathrm{{j}}) \wedge \overline{\mathrm{{B}}}(\mathrm{{X}}_\mathrm{{j}})]\right] \end{aligned}$$

If in the above we are interested in only direct connections rather than any connection we replace \(\mathrm{{R}}^{\mathrm{{n}}}\) by R.

We now consider the question: Do all people residing in South America have a strong connection with each other? We shall denote the truth of this question Tr[Q] We calculate this truth-value as

$$\begin{aligned} \mathrm{{Tr}}(\mathrm{{Q}})=\mathop {\mathbf{{Min}}}\limits _{{{\mathrm{{x}}}_{\mathrm{{i}}},{\mathrm{{x}}}_{\mathrm{{j}}} \in \mathrm{{X}}}}[(1-\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{x}}_\mathrm{{i}})) \vee (1-\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{x}}_\mathrm{{j}})) \vee \mathrm{{Strong}} (\mathrm{{R}}^\mathrm{{n}}(\mathrm{{x}}_\mathrm{{i}},\mathrm{{x}}_\mathrm{{j}}))] \end{aligned}$$

Let us look at this for the special case where SA is a crisp set. We first see that in the case of a pair \((\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}})\) in which at least one of the elements do not reside in South America then either \(\mathrm{{SA}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{i}}})) = 0 \mathrm{{or}} \mathrm{{SA}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{j}}}))= 0\) and therefore

$$\begin{aligned} (1-\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{x}}_\mathrm{{i}})) \vee (1- \mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{x}}_\mathrm{{j}})) \vee \mathrm{{Strong}} (\mathrm{{R}}^\mathrm{{n}}(\mathrm{{x}}_\mathrm{{i}},\mathrm{{x}}_\mathrm{{j}}))=1. \end{aligned}$$

This case will not be the min. For the case in which both \(\mathrm{{x}}_{\mathrm{{i}}}\) and \(\mathrm{{x}}_{\mathrm{{j}}}\) reside in South America then \(\mathrm{{SA}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{i}}})) = \mathrm{{SA}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{j}}})) = 1\) and hence

$$\begin{aligned} (1-\mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{x}}_\mathrm{{i}})) \vee (1- \mathrm{{SA}}(\mathrm{{U}}_1(\mathrm{{x}}_\mathrm{{j}})) \vee \mathrm{{Strong}} (\mathrm{{R}}^\mathrm{{n}}(\mathrm{{x}}_\mathrm{{i}},\mathrm{{x}}_\mathrm{{j}}))= \mathrm{{Strong}} (\mathrm{{R}}^\mathrm{{n}}(\mathrm{{x}}_\mathrm{{i}},\mathrm{{x}}_\mathrm{{j}})) \end{aligned}$$

From this we get as expected Tr(Q) \(= \mathop {\mathrm{{Min}}}\limits _{{{\mathrm{{x}}}_{\mathrm{{i}}}{\mathrm{{x}}}_{\mathrm{{j}}} \in \mathrm{{SA}}}}[\mathrm{{Strong}} (\mathrm{{R}}^\mathrm{{n}}(\mathrm{{x}}_\mathrm{{i}},\mathrm{{x}}_\mathrm{{j}}))]\)

Now we shall consider the slightly more complicated question of whether “most of the people residing in western countries have strong connections with each other?” We shall here assume the term most is available in our common vocabulary where it is defined as a fuzzy subset over the unit interval. In addition we shall assume that the concept western country, is a concept that is defined by the fuzzy subset W over the domain \(\mathrm{{Y}}_{1}\). In this case for each \(\mathrm{{x}}_{\mathrm{{i}}} \in \) X we have W\((\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{i}}}))\) indicates the degree to which it is true that \(\mathrm{{x}}_{\mathrm{{i}}}\) is from a western country. In the following we shall set P to be the set of all unordered pairs of distinct elements from X. P is the set of all the subsets of X consisting of two elements. We see that the number of elements in P, \(\mathrm{{n}}_{\mathrm{{P}}} = \frac{(\mathrm{{n}})(\mathrm{{n}}-1)}{2}\) where n is the number of elements in X. We shall denote an element \(\{\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}\}\) in P as \(\mathrm{{p}}_{\mathrm{{k}}}\). Here then k goes from 1 to \(\mathrm{{n}}_{\mathrm{{P}}}\).

For each pair \(\mathrm{{p}}_{\mathrm{{k}}} = \{\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}} \}\) we obtain two values. The first \(\mathrm{{V}}_{\mathrm{{k}}} = \mathrm{{Min}}(\mathrm{{W}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{i}}})), \mathrm{{W}}(\mathrm{{U}}_{1} (\mathrm{{x}}_{\mathrm{{j}}}))\)) indicates the degree to which \(\mathrm{{p}}_{\mathrm{{k}}}\) consists of pair of elements both from a western country. The second value is \(\mathrm{{S}}_{\mathrm{{k}}} = \mathrm{{Strong}}(\mathrm{{R}}^{\mathrm{{n}}}(\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}}))\), indicates the degree to which there is a strong connection between the pair \(\{\mathrm{{x}}_{\mathrm{{i}}}, \mathrm{{x}}_{\mathrm{{j}}} \}\). We shall use the technology of OWA operators to answer our question [25]. We proceed to obtain the answer as follows:

  1. (1)

    Order the \(\mathrm{{S}}_{\mathrm{{k}}}\) and let ind(j) be the index of the \(j{\mathrm{{th}}}\) largest of the \(\mathrm{{S}}_{\mathrm{{k}}}\). Thus here \(\mathrm{{S}}_{\mathrm{{ind}}(\mathrm{{j}})}\) is the \(j\)th largest of the \(\mathrm{{S}}_{\mathrm{{k}}}\) and \(\mathrm{{V}}_{\mathrm{{ind}}(\mathrm{{j}})}\) is its associated V value.

  2. (2)

    We next calculate

    $$\begin{aligned} \mathrm{{R}}=\sum ^{{\mathrm{{n}}}_{\mathrm{{p}}}}_{\begin{array}{l} \mathrm{{j}}=1 \\ \end{array}} \mathrm{{V}}_\mathrm{{ind}}(\mathrm{{j}}) \end{aligned}$$
  3. (3)

    We next obtain a set of weight \(\mathrm{{w}}_{\mathrm{{j}}}\) for \(\mathrm{j} = 1\) to \(\mathrm{{n}}_{\mathrm{{p}}}\) where

    $$\begin{aligned} \mathrm{{w}}_{\mathrm{{j}}}= \mathrm{{Most}}(\frac{\mathrm{{R}}_\mathrm{{j}}}{\mathrm{{R}}}) - \mathrm{{Most}}(\frac{\mathrm{{R}}_\mathrm{{j}} - 1}{\mathrm{{R}}}) \end{aligned}$$

    here \(\mathrm{{R}}_{\mathrm{{j}}}=\sum \nolimits _{\mathrm{{i}}=1}^\mathrm{{j}} {\mathrm{{V}}_{\mathrm{{ind}}(\mathrm{{i}})} } \)

  4. (4)

    We finally calculate the truth of the question as

    $$\begin{aligned} \mathrm{{Tr}}(\mathrm{{q}}) = \sum ^{{\mathrm{{n}}}_{\mathrm{{p}}}}_{\begin{array}{l} \mathrm{{j}}=1 \\ \end{array}} {\mathrm{{w}}_\mathrm{{j}} \mathrm{{S}}_{\mathrm{{ind}}(\mathrm{{j}})}} \end{aligned}$$

An interesting special case of the preceding occurs if the subset W, Western Country, is a crisp set. In this case \(\mathrm{{V}}_{\mathrm{{k}}} = \mathrm{{Min}}(\mathrm{{W}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{i}}})), \mathrm{{W}}(\mathrm{{U}}_{1}(\mathrm{{x}}_{\mathrm{{j}}})))\) is a binary value, either one or zero. It is one if both \(\mathrm{{x}}_{\mathrm{{i}}}\) and \(\mathrm{{x}}_{\mathrm{{j}}}\) are from western countries and zero of either is not from a western country.

Another question we can ask is whether the young people form a clique. Since the young people provide a fuzzy subset over the space X and we have previously indicated a process for determining whether a fuzzy set is a clique we can answer this question.

7 Conclusion

We discussed the idea of fuzzy relationships and their role in modeling weighted social relational networks. The paradigm of computing with words was introduced and the role that fuzzy sets play in representing linguistic concepts was described. We discussed how these technologies can provide a bridge between a network analyst’s linguistic description of social network concepts and the formal model of the network. Some examples of taking an analyst’s network concepts and formally representing them in terms of network properties were provided. We applied this to the concept of clique and then to the idea of node centrality. Finally we introduced the idea of vector–valued nodes and began developing a technology of social network database theory. Clearly this newly introduced idea of social network database theory will provide many applications and will need a more formal mathematical framework, a task for future research.