Skip to main content
To KTH's start page

Fairness and Diversity-Aware Algorithms

Ranking, Streaming, and Graph Analysis

Time: Tue 2026-05-05 14.00

Location: F3 (Flodis), Lindstedtsvägen 26

Language: English

Subject area: Computer Science

Doctoral student: Honglian Wang , Teoretisk datalogi

Opponent: Professor Christina Lioma, University of Copenhagen, Copenhagen, Denmark

Supervisor: Professor Aristides Gionis, ; Professor Henrik Boström,

Export to calendar

QC 20260408

Abstract

As algorithmic systems increasingly shape human experiences, ensuring fairness and diversity has become a central challenge. This thesis studies fairness and diversity through the lens of algorithm design and optimization theory, providing formal frameworks and efficient algorithms across three domains: ranking-based recommendation, streaming recommendation, and graph analysis.

The first part of the thesis investigates diversity maximization in recommender systems with stochastic user engagement. We first study how to rank items in recommendation systems, where users engage with content sequentially and probabilistically. We introduce two novel diversity measures, sequential sum diversity and sequential coverage diversity, which account for uncertainty in user engagement. Our goal is to find a ranking of items that maximizes these sequential diversity measures. We show that sequential coverage diversity is ordered submodular, enabling a greedy 1/2-approximation. For sequential sum diversity, we provide polynomial-time constant-factor approximation algorithms. Separately, we study a streaming setting where items arrive continuously and users may visit the system multiple times at arbitrary moments. For this setting, we aim to design a streaming algorithm that maximizes a stochastic coverage diversity measure. We show that a classic greedy algorithm achieves a tight 1/2-competitive ratio but requires memory linear in the stream length. With sublinear memory and an upper bound T' on the number of user visits T, we propose STORM, which achieves a 1/4(T'-T+1)-competitive ratio. We further propose STORM++, improving the competitive ratio to 1/8delta, where the integer parameter delta controls the tradeoff between solution quality and computational cost.

The second part of the thesis studies diversity as a constraint in densest subgraph discovery and addresses the problem of finding dense communities in networks with heterogeneous relationship types. We model relationship types as edge colors and formulate the At Least h Colored Edges Densest Subgraph problem (ALHCEDGESDSP), which seeks subgraphs that are both dense and contain at least h_i edges of each color i. We prove that even the simplest variant of this problem is NP-hard and develop constant-factor approximation algorithms. Our key technical contribution links the edge-constrained and node-constrained versions of the densest subgraph problem. We first show that algorithms for the At Least k Nodes Densest Subgraph problem (DalkS) can approximate the At Least h Edges Densest Subgraph problem (ATLEASTHEDGESDSP), and then extend the algorithm for DalkS to handle colored edge constraints for solving ALHCEDGESDSP.

The third part of the thesis studies graph interventions for fairness in networks. We examine two fairness measures, PageRank fairness and hitting-time fairness, developing methods to balance influence and improve accessibility across groups. For each demographic group, the sum of PageRank scores within it quantifies the influence of that group. PageRank fairness measures how far the current group-wise influence deviates from a given target. We formulate the PageRank fairness problem as an optimization problem that adjusts edge weights such that the resulting graph achieves a group-wise influence distribution as close to the target as possible. The optimization problem involves a nonconvex objective over a convex feasible set under practical constraints, such as not adding new edges and limiting the magnitude of weight changes. We solve this PageRank fairness maximization problem using efficient projected gradient descent, proving convergence to a stationary point. For hitting-time fairness in bipartite graphs, we formulate two problems, minimizing the average (BMAH) and the maximum hitting time (BMMH) from one group to another via strategic edge additions. We provide a (2+epsilon)-approximation for BMAH by combining fast random walk simulation with greedy supermodular minimization. For the more challenging BMMH problem, we develop two approaches: the first leverages its connection to the BMAH problem, and the second employs a method based on the asymmetric k-center problem. Both approaches yield provable approximation guarantees for BMMH.

The algorithms and analysis techniques presented in this thesis contribute to the growing body of work on fairness and diversity in algorithmic systems. By formalizing new problem variants that capture realistic constraints in interactive and networked settings, and by providing approximation algorithms with provable guarantees, this work expands the toolkit available for addressing fairness and diversity challenges in computational systems.

Link to DiVA