GraDiaS is a software tool to approximate the diameter of large graphs.
The software is used for the experiments in the following papers:
"A Practical Parallel Algorithm for Diameter Approximation of Massive Weighted Graphs" by Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, and Eli Upfal.
"Space and Time Efficient Parallel Graph Decomposition, Clustering, and Diameter Approximation" by Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, and Eli Upfal.
This project uses sbt to manage the build process. To generate a jar file suitable for deployment, issue the `assembly` command. This will generate a single jar file containing all the dependencies needed by the project, with the exclusion of the Spark libraries, which are provided by the cluster at runtime.
The program is meant to be run using the standard `spark-submit` script. The main class to pass to the script is `it.unipd.dei.gradias.Tool`. The documentation of the command line interface can be accessed from the command line itself, using the `--help` flag:
```
user@host> spark-submit --properties-file default.conf --class it.unipd.dei.gradias.Tool sg.jar --help
gradias version 0.11.0 - (c) 2013-2015 Matteo Ceccarello
Usage: spark-submit [spark-opts] --class it.unipd.dei.gradias.Tool gradias-0.11.0.jar <subcommand> [options]
Options:
  --help      Show help message
  --version   Show version of this program
Subcommands:
  delta-cluster
  delta-sssp
  cluster
  mpx13
  hadi
  bfs
  convert
  generate
```
Each subcommand has its own help message:
```
user@host> spark-submit --properties-file default.conf --class it.unipd.dei.gradias.Tool sg.jar cluster --help
gradias version 0.11.0 - (c) 2013-2014 Matteo Ceccarello
  -d, --diameter-algorithm <arg>   (default = Distributed)
  -i, --input <arg>                the input graph
  -o, --output <arg>               the output file. If not given, no output is written
  -s, --skip-diameter
      --noskip-diameter
  -t, --target-fraction <arg>
      --help                       Show help message
      --version                    Show version of this program
```
A note about the input format
All the algorithms implemented in this program accept the same graph input format, that is a textual adjacency list:
```
node-0 neighbour-0A neighbour-0B neighbour-0C neighbour-0D
node-1 neighbour-1A neighbour-1B neighbour-1C
...
```
Moreover, since the algorithms work on undirected graphs, the input must be symmetric: if node `1` is in the adjacency list of node `0`, then node `0` must be in the adjacency list of node `1` as well.
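To illustrate the symmetry requirement, here is a minimal sketch (not part of the tool itself; the parsing helpers are hypothetical) that checks whether a textual adjacency list is symmetric before feeding it to the program:

```python
def read_adjacency(lines):
    """Parse a textual adjacency list: each line is a node id followed by its neighbours."""
    adj = {}
    for line in lines:
        parts = line.split()
        if parts:
            adj[parts[0]] = set(parts[1:])
    return adj

def is_symmetric(adj):
    """Check that every edge u -> v has a matching v -> u entry."""
    return all(u in adj.get(v, set()) for u, neighs in adj.items() for v in neighs)

graph = ["0 1 2", "1 0", "2 0"]
print(is_symmetric(read_adjacency(graph)))   # True

broken = ["0 1", "1"]
print(is_symmetric(read_adjacency(broken)))  # False: 0 lists 1, but 1 does not list 0
```

An input that fails such a check should be symmetrized (or converted with the tools provided by the program) before use.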
Several conversion tools are provided by the program; for more information see the `convert` subcommand.
Change the distance datatype
Add support for disconnected graphs
Add the TEPS metric
Optimize graph data structures
Use bitmasks to store flags in place of booleans
This release mainly features optimizations for the weighted implementation of the clustering algorithm.
This release introduces the extension of the algorithm to weighted graphs.
FEATURE Support weighted graphs
REFACTOR Generic coder refactoring to reduce duplication
BUGFIX Sometimes computing the diameter of the quotient graph with Dijkstra resulted in errors like the following:

```
java.lang.IllegalArgumentException: Cannot decrease priority of an element not in the queue: 149
```
This was due to an integer overflow that is now fixed. Runs that were not crashing for the above error should not have been affected by this bug.
BUILD Skip git properties injection in manifest if there is no git repository, i.e. in source tarballs
**This is a breaking release.**
This is a cleanup release, with dead code removal.
REFACTOR Rename the project: it is now called GraDiaS, for "Graph Diameter on Spark"
FEATURE Upgrade to spark-1.2.0
FEATURE Colors! The output is now colored. Moreover, the logging configuration has been changed to be clearer.
REFACTOR Remove the old implementations of CLUSTER and MPX13. They are replaced by more efficient implementations based on a common cluster-growing procedure (see next point).
REFACTOR Add class `ClusterExpander`, which centralizes the implementation of the cluster-growing logic. Now all the algorithms using cluster-growing strategies (BFS, MPX13 and CLUSTER) share the same implementation, thus reducing code duplication. Being no longer necessary, class `VertexGraphFunctions` has been removed, along with the whole hierarchy of Vertex classes.
This is the release for SPAA15. It mainly features a leaner decomposition strategy, and a BFS implementation to set a baseline.
Improve performance of the last step of the algorithm, the weighted diameter computation. The user can now select the algorithm to use among:

- Floyd-Warshall: sequential, very fast on small (quotient) graphs
- Dijkstra: sequential, slightly slower than Floyd-Warshall
- Distributed: runs Dijkstra's algorithm distributed on the cluster. On big quotient graphs (~8000 nodes) it allows for a speedup of up to 40x.
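As a rough illustration of what the sequential options compute (this is a self-contained sketch, not the project's actual code), the Floyd-Warshall variant derives the weighted diameter as the largest entry of the all-pairs shortest-path matrix of the quotient graph:

```python
import math

def floyd_warshall(n, edges):
    """All-pairs shortest paths on an undirected weighted graph given as (u, v, w) triples."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Relax all pairs through each intermediate vertex k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def diameter(dist):
    """Weighted diameter: the largest finite shortest-path distance."""
    return max(max(row) for row in dist)

edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (0, 3, 5.0)]
print(diameter(floyd_warshall(4, edges)))  # → 4.0 (path 0-1-2-3 beats the direct 0-3 edge)
```

The cubic cost of this procedure is why it is only attractive on small quotient graphs, while the Distributed option pays off on larger ones.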
General performance improvements.
Upgrade to spark-1.0.0
Dead code cleanup
Reimplement graph shrinking procedure
Reduce communication in weighted edges creation
Report timing percentages
Featuring new fast decomposition
Implement Miller, Peng and Xu’s algorithm for graph decomposition.
Upgrade to spark-0.9.0
Dead code cleanup
New version of ball decomposition
Implement new contracting ball decomposition
Implement new decomposition strategy based on sample and prune
Make partitioning optional
Use Elias Fano codes to compress the color lists
Upgrade to spark 0.8.1
Add more information to the output
Complete rewrite of the tool
Ball decomposition strategy using limited-length lists of colors
Probability of selection depends on ball cardinality
Rewrite flood ball decomposition
Rewrite the HyperANF Spark implementation
Use codahale-metrics for timing
Print more information to standard output
Refactor flood ball decomposition
Add flood ball decomposition
Switch to Spark 0.8.0-incubating
Implement ball decomposition (deterministic and randomized) and the distributed version of HyperANF