SHORTEST PATH ANALYSIS IN REAL GRAPHS
Authors
Waqas Nawaz,
Kifayat Ullah Khan,
Young-Koo Lee
Department of Computer Engineering, Kyung
Hee University, South Korea
The 3rd International Conference on Convergence
and its Application (ICCA 2014), 25-27 June,
Seoul, Korea
MOTIVATION
“... shortest path problems are among the most
fundamental combinatorial optimization
problems with many applications, both direct and
as subroutines in other combinatorial optimization
algorithms. Algorithms for these problems have
been studied since the 1950’s and still
remain an active area of research.” [1]
[1] Camil Demetrescu, Andrew V. Goldberg, and David S. Johnson.
Implementation challenge for shortest paths. In Encyclopedia of Algorithms.
2008.
SHORTEST PATH APPLICATIONS
• Telephone routes
  - Which communication links to activate when a user makes a
    phone call, e.g. from HK to New York, USA
• Road systems design
  - Problem: how to determine the number of lanes in each road?
  - Given: expected traffic between each pair of locations
  - Method: estimate the total traffic on each road link, assuming each
    passenger will use the shortest path
• Many other applications, including:
  - Finance (arbitrage)
    In economics and finance, arbitrage is the practice of taking advantage of
    a price difference between two or more markets: striking a combination of
    matching deals that capitalize upon the imbalance, the profit being the
    difference between the market prices. (Wikipedia)
  - Assembly line inspection systems design
  - Graph median, traffic simulation, image segmentation,
    drug target identification, community detection,
    social search, social networking, message routing
SHORTEST PATH ANALYSIS: CONTRIBUTIONS
• Show empirically that a significant
  number of shortest paths overlap
• The behavior of the overlapped
  regions in diverse networks
  - E.g. scale-free networks
• The impact of hub-nodes on the
  shortest paths
  - E.g. what portion of the shortest
    paths pass through the hub
    nodes or across dense regions
• Analysis of the coverage of the
  entire graph by shortest paths
[Figure label: Hub-nodes]
PROBLEM STATEMENT
• Which portion of a graph is traversed
  through shortest paths?
OR
Validate:
• A significant number of shortest paths
  overlap
• Hub nodes are contained in shortest
  paths
To the best of our knowledge, no such
empirical analysis exists in the literature.
DEFINITION: SHORTEST PATH (SP)
• Definition: a sequence of edges, i.e.
  p_i = E_seq = {e_1 → e_2 → … → e_m} from source vertex v_s to
  destination vertex v_d, where dist(p_i) is minimum
  - m is the number of edges
  - e_i = {(v_{i-1}, v_i, cost) | v_{i-1}, v_i ∈ V}
  - v_s ≠ v_d
  - dist(…) is the distance function based on edge cost
• Example
  - Shortest path = p(v_0, v_7) = {e_1 → e_2 → e_8 → e_5 → e_6 → e_7}
  - where m = 6, v_s = v_0 and v_d = v_7
[Figure: example graph with vertices v0 to v7 and edges e1 to e9; v0 is the source and v7 is the destination]
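As a minimal sketch of the definition above, the following Python code computes a minimum-cost path and returns it as a sequence of edge names. It uses Dijkstra's algorithm, which the slides do not name, and assumes non-negative edge costs. The edge endpoints and costs in EXAMPLE_EDGES are hypothetical: the figure's exact topology is not recoverable from the text, so they were chosen only so that the result matches the edge sequence of the example.

```python
import heapq

# Hypothetical topology (assumption): a chain v0..v7 over e1..e7, plus two extra edges.
EXAMPLE_EDGES = {
    "e1": ("v0", "v1", 1), "e2": ("v1", "v2", 1), "e3": ("v2", "v3", 1),
    "e4": ("v3", "v4", 1), "e5": ("v4", "v5", 1), "e6": ("v5", "v6", 1),
    "e7": ("v6", "v7", 1), "e8": ("v2", "v4", 1), "e9": ("v5", "v7", 3),
}

def shortest_path_edges(edges, source, target):
    """Return (distance, [edge names]) of a minimum-cost path.

    edges maps an edge name to (u, v, cost); edges are treated as undirected.
    Returns (inf, []) if target cannot be reached from source.
    """
    adj = {}
    for name, (u, v, cost) in edges.items():
        adj.setdefault(u, []).append((v, cost, name))
        adj.setdefault(v, []).append((u, cost, name))

    dist = {source: 0}
    prev = {}  # vertex -> (previous vertex, name of the edge used to reach it)
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            break
        for v, cost, name in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = (u, name)
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return float("inf"), []
    path, node = [], target
    while node != source:
        node, name = prev[node]
        path.append(name)
    path.reverse()
    return dist[target], path

distance, path = shortest_path_edges(EXAMPLE_EDGES, "v0", "v7")
print(distance, path)  # 6 ['e1', 'e2', 'e8', 'e5', 'e6', 'e7']
```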
HOW TO ANALYZE SPS? BASIC IDEA (1/2)
• Straightforward approach (brute force)
  - Generate all-pair shortest paths → SP-DB
    (file on disk)
    If N is the number of vertices, there are N^2
    paths, which may not be appropriate for very
    large graphs
  - Manually scan SP-DB to identify the
    overlaps and frequently occurring vertices
    Efficiency depends on a careful choice of data
    structure or indexing method
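The brute-force idea can be sketched as follows, under some simplifying assumptions that are not in the slides: the graph is unweighted (so breadth-first search yields shortest paths), one shortest path is kept per ordered pair, and SP-DB is an in-memory list rather than a file on disk. The function names and the star-shaped example graph are illustrative only.

```python
from collections import Counter, deque

def bfs_paths_from(adj, source):
    """One shortest path (list of vertices) from source to every other reachable vertex."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    paths = []
    for target in parent:
        if target == source:
            continue
        path, node = [], target
        while node is not None:
            path.append(node)
            node = parent[node]
        paths.append(path[::-1])
    return paths

def all_pair_shortest_paths(adj):
    """SP-DB kept in memory (assumption): one path per ordered (source, target) pair."""
    sp_db = []
    for source in adj:
        sp_db.extend(bfs_paths_from(adj, source))
    return sp_db

def vertex_occurrences(sp_db):
    """Scan SP-DB and count how many stored paths contain each vertex."""
    counts = Counter()
    for path in sp_db:
        counts.update(path)
    return counts

# Hypothetical example: hub H connected to four leaves.
STAR = {"H": ["A", "B", "C", "D"], "A": ["H"], "B": ["H"], "C": ["H"], "D": ["H"]}
sp_db = all_pair_shortest_paths(STAR)
counts = vertex_occurrences(sp_db)
print(len(sp_db), counts["H"], counts["A"])  # 20 20 8
```

Even in this tiny graph the hub lies on every stored path, while the number of paths grows as N(N-1), which is the quadratic cost the slide warns about.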
HOW TO ANALYZE SPS? BASIC IDEA (2/2)
• Alternate approach (non-exhaustive)
  - Generate all-pair shortest paths (small
    graphs) OR k-source shortest paths (for
    large graphs, where k << N) into SP-DB
  - Estimate the occurrences of vertices
    using a data mining approach
    Frequent item-set mining with a given
    threshold to limit the search space
    We can easily prune the rarely
    occurring vertices
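A minimal sketch of the non-exhaustive variant follows. The slides do not say how the k sources are chosen, so uniform random sampling with a fixed seed is an assumption here, as are the unweighted graph, the function names, and the example graph. Vertices whose occurrence count falls below the threshold are pruned.

```python
import random
from collections import Counter, deque

def shortest_paths_from(adj, source):
    """One BFS shortest path (list of vertices) from source to each reachable vertex."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    paths = []
    for target in parent:
        if target == source:
            continue
        path, node = [], target
        while node is not None:
            path.append(node)
            node = parent[node]
        paths.append(path[::-1])
    return paths

def k_source_sp_db(adj, k, seed=0):
    """Build SP-DB from k sampled sources (sampling strategy is an assumption)."""
    rng = random.Random(seed)
    sources = rng.sample(sorted(adj), k)
    sp_db = []
    for source in sources:
        sp_db.extend(shortest_paths_from(adj, source))
    return sp_db

def frequent_vertices(sp_db, min_support):
    """Keep only vertices that occur in at least min_support paths."""
    counts = Counter(v for path in sp_db for v in set(path))
    return {v: c for v, c in counts.items() if c >= min_support}

# Hypothetical example: hub H connected to four leaves, k = 2 sources.
STAR = {"H": ["A", "B", "C", "D"], "A": ["H"], "B": ["H"], "C": ["H"], "D": ["H"]}
sp_db = k_source_sp_db(STAR, k=2)
print(len(sp_db), frequent_vertices(sp_db, min_support=6))  # 8 {'H': 8}
```

With k = 2 only 8 paths are generated instead of 20, and the threshold removes every leaf while the hub is retained, whichever two sources are sampled.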
NON-EXHAUSTIVE APPROACH: EXAMPLE
• Shortest path computation
• Frequent item-set mining towards finding SPOREs
  - FP-Growth approach
  - Each shortest path is treated as a transaction whose
    items are the nodes on the path
  - If sup = 2, then
    1-len SPOREs: C, D, E
    2-len SPOREs: CD, DE, CE
    3-len SPORE: CDE
[Figure: All Pair Shortest Paths; k-Source Shortest Paths, k = 2; vertex labels A, B, C, D, E, F and M, N, C, D, E, P]
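The example can be reproduced with a small sketch. Note that this is not FP-Growth: for brevity it uses a simple level-wise (Apriori-style) search, which finds the same frequent item-sets on small inputs. The two transactions are an assumption based on the vertex labels in the figure; the order of vertices within each path does not matter because each path is treated as a set of items.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search. Returns {frozenset of items: support}."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    while candidates:
        # Support of a candidate = number of transactions (paths) containing it.
        support = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: n for c, n in support.items() if n >= min_support}
        result.update(frequent)
        # Join frequent k-item-sets that differ in one item into (k+1)-candidates.
        keys = list(frequent)
        candidates = list(
            {a | b for a, b in combinations(keys, 2) if len(a | b) == len(a) + 1}
        )
    return result

# Assumed transactions: two shortest paths sharing the vertices C, D and E.
PATHS = [
    ["A", "B", "C", "D", "E", "F"],
    ["M", "N", "C", "D", "E", "P"],
]
found = frequent_itemsets(PATHS, min_support=2)
print(sorted("".join(sorted(s)) for s in found))
# ['C', 'CD', 'CDE', 'CE', 'D', 'DE', 'E']
```

Every vertex that appears in only one path is pruned at the first level, so no larger candidate containing it is ever generated.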
EXPERIMENTS
• Real dataset
  - Social circles from Facebook (anonymized)
  - Vertices (4,050), Edges (88,254), Diameter (8)
• Environment
  - Windows 7, 32-bit, Java implementation
[Figures: Original Facebook Graph; Shortest Path Traversals]
FACEBOOK STATS: DEGREE DISTRIBUTION VS. FREQUENCY OF OCCURRENCES
CONCLUSION
• The frequency distribution of shortest
  path overlaps is influenced by the network's
  node degree distribution
• The probability of a shortest path
  passing through hub-nodes is high
• A significant number of shortest paths
  overlap
