The Libraries will be performing system maintenance to UWSpace on Thursday, March 13th from 12:30 to 5:30 pm (EDT). UWSpace will be unavailable during this time.
 

Perspectives of Graph Diffusion: Computation, Local Partitioning, Statistical Recovery, and Applications

dc.contributor.authorYang, Shenghao
dc.date.accessioned2025-03-06T14:20:15Z
dc.date.available2025-03-06T14:20:15Z
dc.date.issued2025-03-06
dc.date.submitted2025-03-03
dc.description.abstractDiffusion describes the process of mass moving from one region to another. In the con- text of graph, the diffusing mass spreads from nodes to nodes along the edges of the graph. Broadly speaking, this includes a number of stochastic and deterministic processes such as random walk, heat diffusion, network flow and electrical flow on graphs. Graph diffusion is a highly important primitive, and has been shown to have a variety of surprising properties both theoretically and practically. In this thesis, we present several new perspectives of graph diffusion, with an emphasis on how diffusion algorithms uncover the local clustering structure of the input data without necessarily exploring the entire graph. In the first two parts of the thesis, we introduce a new class of graph diffusion methods that are provably better at extracting the local clustering structure of a graph or a hy- pergraph. Here, diffusion is formulated as a pair of primal and dual convex optimization problems, based on the idea of spreading mass in the graph while minimizing a p-norm net- work flow cost. The primal solution of the diffusion problem provides an intuitive physical interpretation where paint (i.e. mass) spills from the source nodes, spreads over the graph, and there is a sink at each node where up to a certain amount of paint can settle. The dual solution embeds the nodes on the non-negative real line and is considered as the output of diffusion. We will show that the dual variables nicely encode the local clustering structure around a given set of seed nodes. In particular, assume the existence of a cluster C of low conductance Φ(C), the sweep cut procedure on the dual variables returns a cluster whose conductance is not too much larger than Φ(C). In the next two parts of the thesis, we introduce a weighted diffusion mechanism which allows any existing diffusion method to take into account additional node information such as node attributes and labels. The method weighs the edges of the graph based on the attributes or the labels of each node. Depending on the nature and availability of additional node information, two simple yet effective edge-weighting schemes are introduced and analyzed. Over contextual random graphs generated by a local variant of the stochastic block model with noisy node information, we will show that, if the additional information contains enough signal about the ground-truth cluster, then employing existing diffusion algorithms in the weighted graph can more accurately recover the ground-truth cluster than employing diffusion in the original graph without edge weights. In particular, statistical recovery guarantees in terms of precision and F1 score will be derived and compared. All of the results are supplemented with extensive experiments on both synthetic and real-world data to illustrate the technical results and the effectiveness of the new methods in practice. The code is open-source on GitHub.
dc.identifier.urihttps://hdl.handle.net/10012/21498
dc.language.isoen
dc.pendingfalse
dc.publisherUniversity of Waterlooen
dc.subjectgraph algorithms
dc.subjectgraph diffusion methods
dc.subjectlocal graph clustering
dc.subjecthypergraphs
dc.subjectsemi-supervised learning
dc.subjectcommunity detection
dc.titlePerspectives of Graph Diffusion: Computation, Local Partitioning, Statistical Recovery, and Applications
dc.typeDoctoral Thesis
uws-etd.degreeDoctor of Philosophy
uws-etd.degree.departmentDavid R. Cheriton School of Computer Science
uws-etd.degree.disciplineComputer Science
uws-etd.degree.grantorUniversity of Waterlooen
uws-etd.embargo.terms0
uws.contributor.advisorFountoulakis, Kimon
uws.contributor.affiliation1Faculty of Mathematics
uws.peerReviewStatusUnrevieweden
uws.published.cityWaterlooen
uws.published.countryCanadaen
uws.published.provinceOntarioen
uws.scholarLevelGraduateen
uws.typeOfResourceTexten

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Yang_Shenghao.pdf
Size:
3.07 MB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
6.4 KB
Format:
Item-specific license agreed upon to submission
Description: