Home > grapher

grapher

Grapher is a project mainly written in Ruby, it's free.

Grapher

This plugin lets you store a graph representation of your data in Redis. ("Graph" as in "Graph Theory", this is NOT charting sofrtware.)

A graph representation of your objects is a very powerful analytical tool for discovering inter-object relationships that cannot be discovered easily or nearly as efficiently using a relational database.

For more info on Redis see http://redis.io/

Introduction

Imagine the following model:

User has_many :orders

Order belongs_to :user has_many :items, :through => order_items

OrderItem belongs_to :order belongs_to :item, :polymorphic => true has_one :user, :through => :order

Imagine that you would like to keep track of a user buying an item. You can get a user's orders via user.orders, then for each of those collect the items. So it's possible, but not in one step.

Imagine that you wanted to compare what a user has purchased with what other users have purchased, so that you could make a recommendation such as "users who purchased X also purchased Y". Given the above database structure it would be a fairly complicated operation requiring numerous SQL calls.

(Bear with us, we're getting to the point.)

Now imagine that you stored a user's purchases in a graph, like this (numbers are users, letters are items):

+---> D <---+ | | 1 --> A <-- 2 --> B ^ | 3 --> C <-- 4

In the above graph:

user 1 purchased [A, D] user 2 purchased [A, B, D] user 3 purchased [A, C] user 4 purchased [C]

You can see that users 1 and 2 both purchased items A and D. Based on this you could say that users 1 and 2 have a similar purchasing pattern, and that item B may be of interest to user 1. You could also say that items A and D may be related because more than one user purchased them together.

You may have noticed that users 1 and 2 are two graph edges away from each other. Also that there are multiple paths connecting users 1 and 2 (via A and via D).

This means that to find users with a similar purchasing pattern, all we need to do is to enumerate all users exactly two "hops" away. And if we were to count the number of time we come across the same user, this count would become a rank of similarity - the higher the rank, the more similar these users are.

Same principle applies to items. Items two edges apart can be deemed similar, and the more common paths, the higher the degree of similarity.

If we were to store the above graph as a two hashes of sets (one for forward links from user to item, the second for reverse from item to user), such as (in pseudo-ruby code):

forward = { 1 => {A, D}, 2 => {A, B, D}, 3 => {A, C}, 4 => {C} }

reverse = { A => {1, 2, 3}, B => {2}, C => {3, 4}, D => {1, 2} }

we could quite easily accomplish the above operation. For example, all users similar to 1 are:

forward[1].each { |item| reverse[item] }

Similarly items similar to A would be:

reverse[A].each { |user| forward[user] }

There is only one remaining problem - if your database is any decent size by today's standards, building these hashes may take enough CPU/memory to make it impractical.

This is where Redis comes in. We can build this structure once and maintain it with proper callbacks, having it stored in Redis. Redis provides all the basic operations for the above calculations and is fast enough for these calculations to be performed real-time.

Example

class OrderItem < ActiveRecord::Base belongs_to :order belongs_to :item, :polymorphic => true has_one :user, :through => :order graph_edge_from :user, :to => :item, :on => :create, :verb => 'Purchased' ...

class User < ActiveRecord::Base has_many :orders graph_node :verbs => '>Purchased'

u = User.find(:first) u.graph_neighbors => #<Set: {"Item:234", "Item:345", "Item:456"}> u.graph_neighbors(:distance => 2) => #<Set: {"User:12", "User:23", "User:2345"}> u.graph_neighbors_ranked(:distance => 2) => [[9, "User:23"], [5, "User:23"], [1, "User:2345"]]

Copyright (c) 2011 Tournesol Ventures, LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Previous:janus