ghadoop-gfarm

Ghadoop-gfarm is a project written mainly in Java and C++ and released under the Apache-2.0 license.

Introduction

With this Hadoop-Gfarm plugin, you can use Hadoop, an open-source MapReduce framework, on the Gfarm file system.

Install

  1. Installing Gfarm: Please read INSTALL.en or INSTALL.ja contained in the Gfarm package.

  2. Installing Hadoop: Please read the documentation on the Hadoop site (http://hadoop.apache.org/core/).

  3. Installing Hadoop-Gfarm

3.1 Building Hadoop-Gfarm

To build Hadoop-Gfarm, edit JAVA_HOME, HADOOP_HOME, and GFARM_HOME in build.sh, then type the following command.

$ ./build.sh

3.2 Configuring Hadoop

The build creates hadoop-gfarm.jar and libGfarmFSNative.so and copies them into Hadoop's library directories:

hadoop-gfarm.jar is copied to ${HADOOP_HOME}/lib
libGfarmFSNative.so is copied to ${HADOOP_HOME}/lib/native/Linux-amd64-64 and ${HADOOP_HOME}/lib/native/Linux-i386-32

Finally, add the following property to core-site.xml.

<property>
  <name>fs.gfarm.impl</name>
  <value>org.apache.hadoop.fs.gfarmfs.GfarmFileSystem</value>
  <description>The FileSystem for gfarm: uris.</description>
</property>
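To check that this mapping works, the minimal Java sketch below (not part of the plugin; the class name ListGfarmRoot is just an example) lists the Gfarm root directory through Hadoop's standard FileSystem API. It assumes hadoop-gfarm.jar is on the classpath, libGfarmFSNative.so is on the JVM's native library path, and the core-site.xml above is visible to the Configuration.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListGfarmRoot {
    public static void main(String[] args) throws Exception {
        // new Configuration() loads core-site.xml from the classpath, so
        // fs.gfarm.impl resolves the gfarm: scheme to GfarmFileSystem.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("gfarm:///"), conf);
        for (FileStatus status : fs.listStatus(new Path("gfarm:///"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}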

3.3 Accessing Gfarm through Hadoop

Once installation is complete, you can access Gfarm's root directory through Hadoop as follows.

$ ${HADOOP_HOME}/bin/hadoop dfs -ls gfarm:///

You can also run the sample MapReduce programs, for example:

$ ${HADOOP_HOME}/bin/hadoop jar hadoop-${VERSION}-examples.jar wordcount gfarm:///input gfarm:///output
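If you prefer to read the job output from your own code rather than through the dfs commands, the following sketch (hypothetical; the part file name part-00000 is the conventional default and is an assumption here) prints the wordcount result stored on Gfarm.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrintWordcountOutput {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("gfarm:///"), new Configuration());
        // part-00000 is the usual name of the first reducer's output file;
        // adjust it to whatever files actually appear under gfarm:///output.
        Path part = new Path("gfarm:///output/part-00000");
        BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(part)));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
        fs.close();
    }
}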

3.4 Configuring Hadoop-Gfarm

You can configure your working directory with the following property in core-site.xml.

<property>
  <name>fs.gfarm.workingDir</name>
  <value>/home/${user.name}</value>
  <description>The working directory on the Gfarm file system.</description>
</property>
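As a rough illustration (a sketch, assuming the plugin uses fs.gfarm.workingDir as its initial working directory, which is what the description above implies), relative paths are resolved against that directory through the standard FileSystem calls:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowWorkingDir {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("gfarm:///"), new Configuration());
        // Expected to reflect fs.gfarm.workingDir, e.g. /home/${user.name}.
        System.out.println("working directory: " + fs.getWorkingDirectory());
        // A relative path such as "input" is qualified against that directory.
        System.out.println("'input' resolves to: " + fs.makeQualified(new Path("input")));
        fs.close();
    }
}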