Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

February 8, 2022

Concepts and Profunctor Nuclei

Posted by Simon Willerton

Guest post by Matthew Di Meglio and Owen Lewis

This article is part of the ongoing work at the 2021 Adjoint School. Thanks to Sophie Libkind, David Jaz Myers, Simon Willerton, Tai-Danae Bradley, Callum Reader, and participants of the 2021 Adjoint School for helpful discussions and feedback.

Previously on this blog, Simon Willerton introduced the notion of the nucleus of an enriched profunctor, and alluded to the fact that the formal concept lattice of a relation is a particular instance of this general categorical construction. Continuing from Simon’s posts, we will present, through the lens of enriched category theory, a generalization of Formal Concept Analysis called Quantitative Concept Analysis (QCA). The material in this blog post is based on Dusko Pavlovic’s Quantitative Concept Analysis and Jonathan Elliot’s On the Fuzzy Concept Complex.

Background and running example

QCA aims to extract latent concepts from a dataset which relates a collection of objects to a collection of attributes. For example, suppose that we work for a movie streaming service, and we have a dataset consisting of each user’s rating of each movie (for simplicity, we will assume that all users have rated all movies). Here, the users are the objects and the movies are the attributes. In this context, the word “genre” is a good synonym for the word “concept”, and genres will later be shown be encoded in the nucleus of a particular enriched profunctor. Intuitively speaking, associated to each genre is the collection of movies in the style of that genre and the collection of users who like that genre. Thus the notion of genre should simultaneously capture similarity in users’ tastes and similarity in movies’ styles. QCA is a processes for discovering genres in our dataset; these genres may then be used to decide which movies should be surfaced to which users.

To make our example concrete, our sets of users and movies will be respectively the sets X={Cara,Dan,Evie}andY={Se7en,Tarzan}.X = \{\mathtt{Cara}, \mathtt{Dan}, \mathtt{Evie}\} \qquad \text{and} \qquad Y = \{\mathtt{Se7en}, \mathtt{Tarzan}\}. Our dataset is expressed as the X×YX \times Y-matrix M=[ Se7en Tarzan Cara 5/5 4/5 Dan 3/5 1/5 Evie 2/5 4/5] M = \left[ \begin{array}{r|cc} & \mathtt{Se7en} & \mathtt{Tarzan} \\ \mathtt{Cara} & 5/5 & 4/5\\ \mathtt{Dan} & 3/5 & 1/5\\ \mathtt{Evie} & 2/5 & 4/5 \end{array}\right] The (x,y)(x, y)-entry of MM is a rating out of 55 indicating the degree to which user xx liked movie yy, from 0/50/5 meaning very little to 5/55/5 meaning a lot. We will think of these ratings as real numbers in the interval V=[0,1]V = [0,1].

Enriching category and fuzzy logic

There is a way to view our set VV of possible ratings as the object set of a symmetric monoidal closed category 𝒱\mathcal{V} such that certain constructions from 𝒱\mathcal{V}-enriched category theory, such as 𝒱\mathcal{V}-functors and 𝒱\mathcal{V}-profunctors, turn out to be relevant to QCA. Given objects xx and yy of 𝒱\mathcal{V},

  • there is exactly one morphism from xx to yy when xyx \leqslant y and no morphisms from xx to yy otherwise (𝒱\mathcal{V} is a thin category);
  • the monoidal product of xx and yy is their product xyx \cdot y; and
  • the internal hom from xx to yy is their truncated division x\y={y/x if y<x, 1 if yx..x \backslash y = \begin{cases} y/x &\text{if } y \lt x,\\ 1 &\text{if } y \geq x. \end{cases}. Note also that the category 𝒱\mathcal{V} is complete and cocomplete, with limits and colimits given respectively by infima and suprema.

Viewing VV as the truth values of a fuzzy logic, from 00, meaning completely false, to 11, meaning completely true, the above categorical notions have natural interpretations in fuzzy logic, summarised in the following table:

Category Theory Logic 𝒱\mathcal{V}
Initial object False 00
Terminal object/Monoidal unit True 11
Monoidal product operator Binary conjunction operator \cdot
Exists-a-morphism relation Entailment relation \leqslant
Internal hom operator Implication operator \\backslash
Limits Universal quantification inf\inf
Colimits Existential quantification sup\sup

Above, we said that the truncated division operator \\backslash may be interpreted as the logical implication operator. That is, given fuzzy truth values xx and yy, we claim that x\yx \backslash y should be interpreted as the fuzzy truth value of the proposition that xx implies yy. Referring back to the definition of truncated division, we see that

  • if yy is at least as true as xx, then the proposition xx implies yy is completely true, i.e. x\y=1x \backslash y = 1; and
  • if yy is less true than xx, then the truth value y/xy / x of the proposition xx implies yy depends on how much yy is less true than xx.

For the above interpretation of limits and colimits of diagrams as universal and existential quantification of predicates, we interpret diagrams in 𝒱\mathcal{V} as modelling predicates valued in VV.

Enriched presheaves and fuzzy subsets

As stated earlier, a genre (concept) will consist of a sub-collection of the set XX of all users and a sub-collection of the set YY of all movies. By a sub-collection of a set ZZ, we mean a function on ZZ valued in our set [0,1][0, 1] of truth values; we will refer to such functions on ZZ as predicates on ZZ. In binary-valued, non-fuzzy logic, predicates are in bijection with the subsets of ZZ; given a predicate P:Z{,}P \colon Z \to \{\top, \bot\}, the corresponding subset of ZZ is the preimage of \top in PP. In the QCA case, we may think of such predicates as fuzzy subsets of ZZ; given a predicate P:Z[0,1]P \colon Z \to [0, 1] and an element zZz \in Z, we may think of P(z)P(z) as the truth value of the proposition that zz belongs to the fuzzy subset represented by PP.

In binary-valued logic, the set of all predicates P:Z{,}P \colon Z \to \{\top, \bot\} has the structure of a complete and cocomplete preordered set (aka proset) (it is actually a complete and cocomplete partially ordered set, which is also known as a complete lattice or suplattice). Under the correspondence with subsets of ZZ, this is nothing other than the power set of ZZ with subset inclusion as the preorder. Limits are given by intersection of subsets, and colimits are given by union of subsets.

In the fuzzy QCA setting, the analogous structure on the set of all predicates P:Z[0,1]P \colon Z \to [0, 1] is that of a proxet. A proximity set or proxet is a set ZZ together with a function (,) Z:Z×Z[0,1](-,-)_Z \colon Z \times Z \to [0, 1], called the proximity map, which satisfies:

  • reflexivity: (z,z) Z=1(z, z)_Z = 1 for all zZz \in Z, and
  • transitivity: (z 1,z 2) Z(z 2,z 3) Z(z 1,z 3) Z(z_1, z_2)_Z \cdot (z_2, z_3)_Z \leq (z_1, z_3)_Z for all z 1,z 2,z 3Zz_1, z_2, z_3 \in Z.

Under the logical interpretation of multiplication and truncated division given earlier, these axioms look like the axioms of a preorder. Indeed, we may think of proxets as fuzzy preorders, where (z 1,z 2) Z(z_1, z_2)_Z is the fuzzy truth value of the proposition that z 1z 2z_1 \preccurlyeq z_2. Additionally, proxets are closely related to quasimetric spaces, which are metric spaces with the symmetry axiom omitted. Given a proximity function (,) Z(-,-)_Z, the function log((,) Z)-\operatorname{log}\big( (-,-)_Z \big) defines a quasimetric. Conversely, given a quasimetric d(,)d(-, -), the function e d(,)e^{-d(-, -)} is a proximity function. Quasimetric spaces viewed as enriched categories are called Lawvere metric spaces.

The proxet structure on the set [0,1] Z[0, 1]^Z of fuzzy subsets of ZZ is given by (P,Q) [0,1] Z=inf zZ{P(z)\Q(z)}. (P, Q)_{[0, 1]^Z} = \inf_{z \in Z} \big\{P(z) \backslash Q(z)\big\}. Again, under the logical interpretation of infima and truncated division, we can interpret (P,Q) [0,1] Z(P, Q)_{[0, 1]^Z} as the fuzzy truth value of the proposition that PQP \subseteq Q when regarded as fuzzy subsets of ZZ. Pointwise infima and suprema correspond to intersection and union of fuzzy subsets.

By now, those of you familiar with enriched category theory have surely noticed that proxets are precisely (small) 𝒱\mathcal{V}-categories. Indeed, the set of objects of a 𝒱\mathcal{V}-category Z\mathbf{Z} is the set underlying the preorder or proxet, the hom-object Z(z 1,z 2)\mathbf{Z}(z_1, z_2) is the truth value of z 1z 2z_1 \preccurlyeq z_2 in the binary case and the value of (z 1,z 2) Z(z_1, z_2)_Z in the QCA case, and the unit and associativity axioms of a 𝒱\mathcal{V}-category give us reflexivity and transitivity in both cases.

Recall that if 𝒱\mathcal{V} is complete and cocomplete, then for any 𝒱\mathcal{V}-category Z\mathbf{Z}, the category [Z op,𝒱][\mathbf{Z}^{op}, \mathcal{V}] of 𝒱\mathcal{V}-presheaves on Z\mathbf{Z} is cocomplete with respect to weighted limits and weighted colimits. Here, weighted limits and weighted colimits are computed pointwise. Similarly, the enriched category [Z,𝒱][\mathbf{Z}, \mathcal{V}] of 𝒱\mathcal{V}-copresheaves on Z\mathbf{Z} is also complete with respect to weighted limits and weighted colimits.

Regarding a set ZZ as a discrete 𝒱\mathcal{V}-category Z\mathbf{Z}, predicates Z[0,1]Z \to [0, 1] may be seen both as 𝒱\mathcal{V}-presheaves Z op𝒱\mathbf{Z}^{\mathrm{op}} \to \mathcal{V} and 𝒱\mathcal{V}-copresheaves Z𝒱\mathbf{Z} \to \mathcal{V}. The proxet structure on [0,1] Z[0, 1]^Z corresponds exactly to the usual 𝒱\mathcal{V}-category structure on the 𝒱\mathcal{V}-category of 𝒱\mathcal{V}-presheaves and 𝒱\mathcal{V}-copresheaves on Z\mathbf{Z}. Similarly, trivially weighted limits and colimits in these presheaf and copresheaf 𝒱\mathcal{V}-categories correspond respectively to to fuzzy intersections (pointwise infima) and fuzzy unions (pointwise suprema).

Lifting profunctors

Recall that a genre is a fuzzy set of movies paired with the fuzzy set of people who like those movies. We need a way to relate fuzzy sets of people and fuzzy sets of movies. Our ratings matrix MM relates individual people and individual movies; our goal now is to lift this relation to fuzzy sets.

The ratings matrix MM is a choice of rating in V=[0,1]V = [0, 1] for each (user, movie)-pair. In light of the previous section, this is the same thing as a predicate X×YVX \times Y \to V, and thus MM may be regarded as a 𝒱\mathcal{V}-functor X opY𝒱\mathbf{X}^{\mathrm{op}} \otimes \mathbf{Y} \to \mathcal{V}, where X\mathbf{X} and Y\mathbf{Y} are the discrete proxets on XX and YY. The `op’ is included so that MM is actually a 𝒱\mathcal{V}-profunctor from X\mathbf{X} to Y\mathbf{Y}.

The following construction for lifting MM works for a general profunctor. The first step is to curry MM in the first argument, yielding the 𝒱\mathcal{V}-functor: M :X op[Y,𝒱] M^\bullet \colon \mathbf{X}^{\mathrm{op}} \to [\mathbf{Y}, \mathcal{V}] As described in the previous section, the codomain of this functor is the category of fuzzy subsets of Y\mathbf{Y}.

As the 𝒱\mathcal{V}-category [X op,𝒱][\mathbf{X}^{\mathrm{op}}, \mathcal{V}] is the free cocompletion of the 𝒱\mathcal{V}-category X\mathbf{X}, and the unit of the corresponding adjunction is the 𝒱\mathcal{V}-Yoneda embedding H:X[X op,𝒱]H \colon \mathbf{X} \to [\mathbf{X}^{\mathrm{op}}, \mathcal{V}], there is a unique cocontinuous 𝒱\mathcal{V}-functor M :[X op,𝒱][Y,𝒱] opM^\star \colon [\mathbf{X}^{{\mathrm{op}}}, \mathcal{V}] \to [\mathbf{Y}, \mathcal{V}]^{{\mathrm{op}}} such that the diagram

commutes. M M^\star is the desired lifting of MM.

The density theorem says that each predicate A[X op,𝒱]A \in [\mathbf{X}^{\mathrm{op}}, \mathcal{V}] may be written as the weighted colimit A=colim AH= xXA(x)H(x),A = \operatorname{colim}^{A}H = \int^{x\in\mathbf{X}}A(x) \otimes H(x), and so, by cocontinuity of M M^\star, we have

(1)M (A)=M (colim AH)=colim A(M H)=lim AM = xX𝒱(A(x),M (x)). M^\star(A) = M^\star(\operatorname{colim}^{A}H) = \operatorname{colim}^A(M^\star \circ H) = \operatorname{lim}^A M^\bullet = \int_{x\in\mathbf{X}}\mathcal{V}\big(A(x), M^\bullet(x)\big).

Returning to QCA, recall that in 𝒱\mathcal{V}, limits are pointwise suprema, the tensor product is ordinary multiplication and the internal hom is truncated division. Therefore, we may rewrite (1) as M (A)=inf xX{A(x)\M(x,)} M^\star(A) = \inf_{x \in \mathbf{X}} \big\{A(x) \backslash M(x, -) \big\}

If we curry in the second argument rather than the first, we can derive a right adjoint, M :[Y,𝒱] op[X op,𝒱]M_\star \colon [\mathbf{Y}, \mathcal{V}]^{\mathrm{op}} \to [\mathbf{X}^{\mathrm{op}}, \mathcal{V}], of M M^\star defined by M (B)=inf yY{B(y)\M(,y)}. M_\star(B) = \inf_{y \in \mathbf{Y}}\big\{B(y) \backslash M(-, y)\big\}.

Concepts as fixed points

Our constructions of M M^\star and M M_\star have a close analogy in linear algebra, discussed in this earlier post by Simon. The upshot is that our functor M M^\star is a bit like multiplication by a matrix MM, and M M_\star like multiplication by M M^\intercal. A common technique for studying the latent information in a matrix is to compute its left and right singular vectors, which are the eignvectors of MM MM^\intercal and M MM^\intercal M. The extraction of our concepts will follow the same pattern, replacing eigenvectors with fixed points.

The set of concepts associated with the [0,1][0, 1]-valued matrix MM is given by Fix(M M )Fix(M M )\operatorname{Fix}(M^\star M_\star) \cong \operatorname{Fix}(M_\star M^\star) These fixed points are called the nucleus of the profunctor MM. Profunctor nuclei uncover interesting structure in many domains beyond QCA. As described in other nCafe posts, they give a categorical interpretation of the Legendre-Fenchel transform, and can be specialized to recover the Isbell completion of a category, which is related to the Dedekind MacNeille completion of a partially ordered set, and the tight span of a metric space.

We can now work out some examples of fuzzy concepts from our ratings matrix MM. Starting with the representable fuzzy subset H(Cara)=[Cara 1 Dan 0 Evie 0]H(\mathtt{Cara}) = \left[\begin{array}{r|cc} \mathtt{Cara} & 1\\ \mathtt{Dan} & 0\\ \mathtt{Evie} & 0 \end{array}\right] of the set XX of users, we have M H(Cara)=[Se7en inf{1\55,0\35,0\25} Tarzan inf{1\45,0\15,0\45}]=[Se7en inf{55,1,1} Tarzan inf{45,1,1}]=[Se7en 1 Tarzan 45] \begin{aligned} M^\star H(\mathtt{Cara}) = \left[\begin{array}{r|cc} \mathtt{Se7en} & \inf \big\{1 \backslash \tfrac{5}{5}, 0 \backslash \tfrac{3}{5}, 0 \backslash \tfrac{2}{5}\big\}\\ \mathtt{Tarzan} & \inf \big\{1 \backslash \tfrac{4}{5}, 0 \backslash \tfrac{1}{5}, 0 \backslash \tfrac{4}{5}\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Se7en} & \inf \big\{\tfrac{5}{5}, 1, 1\big\}\\ \mathtt{Tarzan} & \inf \big\{\tfrac{4}{5}, 1, 1\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Se7en} & 1\\ \mathtt{Tarzan} & \tfrac{4}{5} \end{array}\right] \end{aligned} and M M H(Cara) =[Cara inf{1\55,45\45} Dan inf{1\35,45\15} Evie inf{1\25,45\45}]=[Cara inf{1,1} Dan inf{35,14} Evie inf{25,1}]=[Cara 1 Dan 14 Evie 25] \begin{aligned} M_\star M^\star H(\mathtt{Cara}) &= \left[\begin{array}{r|cc} \mathtt{Cara} & \inf \big\{1 \backslash \tfrac{5}{5}, \tfrac{4}{5} \backslash \tfrac{4}{5}\big\}\\ \mathtt{Dan} & \inf \big\{1 \backslash \tfrac{3}{5}, \tfrac{4}{5} \backslash \tfrac{1}{5} \big\}\\ \mathtt{Evie} & \inf \big\{1 \backslash \tfrac{2}{5}, \tfrac{4}{5} \backslash \tfrac{4}{5}\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Cara} & \inf \big\{1, 1\big\}\\ \mathtt{Dan} & \inf \big\{\tfrac{3}{5}, \tfrac{1}{4}\big\}\\ \mathtt{Evie} & \inf \big\{\tfrac{2}{5}, 1\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Cara} & 1\\ \mathtt{Dan} & \tfrac{1}{4}\\ \mathtt{Evie} & \tfrac{2}{5} \end{array}\right]\\ \end{aligned} As M M M =M M^\star M_\star M^\star = M^\star in general, the pair (M M H(Cara),M H(Cara))=([Cara 1 Dan 14 Evie 25],[Se7en 1 Tarzan 45])\big(M_\star M^\star H(\mathtt{Cara}), M^\star H(\mathtt{Cara})\big) = \left( \left[\begin{array}{r|cc} \mathtt{Cara} & 1\\ \mathtt{Dan} & \tfrac{1}{4}\\ \mathtt{Evie} & \tfrac{2}{5} \end{array}\right], \left[\begin{array}{r|cc} \mathtt{Se7en} & 1\\ \mathtt{Tarzan} & \tfrac{4}{5} \end{array}\right] \right) is a fuzzy concept of MM, called the representable fuzzy concept for Cara\mathtt{Cara}. The first vector in the pair may be interpreted as the fuzzy set of people with similar taste to Cara\mathtt{Cara} and the second as the fuzzy set of movies that Cara\mathtt{Cara} likes.

Similarly, letting H:Y[Y,𝒱] opH' \colon \mathbf{Y} \to [\mathbf{Y}, \mathcal{V}]^{{\mathrm{op}}} be the opposite of the Yoneda embedding for Y op\mathbf{Y}^{{\mathrm{op}}}, starting with the representable fuzzy subset H(Tarzan)=[Se7en 0 Tarzan 1 ]H'(\mathtt{Tarzan}) = \left[\begin{array}{r|cc} \mathtt{Se7en} & 0\\ \mathtt{Tarzan} & 1\\ \end{array}\right] of the set YY of movies, we obtain the representable fuzzy concept (M H(Tarzan),M M H(Tarzan))=([Cara 45 Dan 15 Evie 45],[Se7en 12 Tarzan 1])\big(M_\star H'(\mathtt{Tarzan}), M^\star M_\star H'(\mathtt{Tarzan})\big) = \left(\left[\begin{array}{r|cc} \mathtt{Cara} & \tfrac{4}{5}\\ \mathtt{Dan} & \tfrac{1}{5}\\ \mathtt{Evie} & \tfrac{4}{5} \end{array}\right], \left[\begin{array}{r|cc} \mathtt{Se7en} & \tfrac{1}{2}\\ \mathtt{Tarzan} & 1 \end{array}\right]\right) for Tarzan\mathtt{Tarzan}.

The proximity 𝒱 Y(M M H(Tarzan),M H(Cara))=inf{12\1,1\45}=inf{1,45}=45\mathcal{V}^{\mathbf{Y}}\big(M^\star M_\star H'(\mathtt{Tarzan}), M^\star H(\mathtt{Cara}) \big) = \inf \big\{ \tfrac{1}{2} \backslash 1, 1 \backslash \tfrac{4}{5} \big\} = \inf \big\{1, \tfrac{4}{5} \big\} = \tfrac{4}{5} may be interpreted as the fuzzy truth value of the proposition that the fuzzy set of movies similar to Tarzan\mathtt{Tarzan} is contained in the fuzzy set of movies which Cara\mathtt{Cara} likes. Notice that it is equal to the proximity 𝒱 X(M M H(Cara),M H(Tarzan))=inf{1\45,14\15,25\45}=inf{45,1,1}=45.\mathcal{V}^{\mathbf{X}}\big(M_\star M^\star H(\mathtt{Cara}), M_\star H'(\mathtt{Tarzan})\big) = \inf \big\{ 1 \backslash \tfrac{4}{5}, \tfrac{1}{4} \backslash \tfrac{1}{5}, \tfrac{2}{5} \backslash \tfrac{4}{5}\big\} = \inf \big\{ \tfrac{4}{5}, 1, 1\big\} = \tfrac{4}{5}. This is because Fix(M M )\operatorname{Fix}(M^\star M_\star ) and Fix(M M )\operatorname{Fix}(M_\star M^\star ) are isomorphic proxets.

As explained in Jonathan Elliot’s thesis On the Fuzzy Concept Complex, profunctor nuclei can be visualized as certain convex sets in tropical spaces. We leave the image below as a teaser for the thesis.

A proxet nucleus

Posted at February 8, 2022 4:42 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3380

1 Comment & 0 Trackbacks

Re: Concepts and Profunctor Nuclei

Amazing! This flavor of ACT is one of my favorite.

I think in the sentence immediatly after (1) there’s a ‘typo’, it says ‘limits are pointwise suprema’ and then it proceeds to take infima (which is the correct thing afaiu).

Posted by: Matteo Capucci on February 10, 2022 12:09 PM | Permalink | Reply to this

Post a New Comment