GraphLab: Distributed Graph-Parallel API  2.2
GraphLab: Distributed Graph-Parallel API Documentation

The GraphLab project started in 2009 to develop a new parallel computation abstraction tailored to machine learning. GraphLab 1.0 was our first shared-memory design; with the addition of several matrix factorization toolkits, it began to grow a community of users.

In the last couple of years, we have focused our development effort on the distributed environment. Unfortunately, it took nearly a year to figure out that distributing the GraphLab 1 abstraction was excessively complicated and that it could not scale to the power-law graphs commonly seen in the real world.

In GraphLab 2.1, we completely redesigned the GraphLab 1 framework for the distributed environment. The implementation is distributed by design, and a "shared-memory" execution is essentially a distributed system running on a cluster of one machine.

In this new release, GraphLab 2.2, we introduce the Warp System. Through fine-grained user-mode threading, it provides a new API that substantially improves usability and will allow us to add new capabilities more easily in the future.

There are two starting points for using GraphLab:

  • Toolkits: If your computation task is already implemented by one of our toolkits, see the toolkit documentation.
  • GraphLab C++ Tutorial: If your computation task is not implemented by our toolkits, you can try implementing it yourself. For now, a certain degree of C++ knowledge is required.

The new GraphLab 2.2 Warp System is available for experimentation. A GraphLab Warp System Tutorial is provided, and we are looking for feedback to continue extending and improving the Warp System. Performance tuning is also underway.

Software Stack

software_stack.png
system_overview.png