Cassandra-summit-demo is a free project written mainly in Scala and Python.
Hadoop integration demo for the Cassandra Summit
Example ETL and analytics workflow using Cassandra and Hadoop.
Load unstructured data into HDFS
hadoop dfs -put taxobox*.yaml /user/hadoop
Use Hadoop Streaming to add structure and insert into Cassandra
bin/load -input /user/hadoop/taxobox*.yaml
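To illustrate what "adding structure" in a Hadoop Streaming step can look like, here is a minimal mapper sketch in Python. It is not the project's actual `bin/load` implementation. It assumes, purely for illustration, that each record is a run of flat `key: value` lines separated by `---`, and that a `name` field identifies the record; the real taxobox files may be laid out differently. The function names `parse_records` and `to_streaming_line` are hypothetical, and the Cassandra insert itself is left out.

```python
import json
import sys


def parse_records(lines):
    """Group flat 'key: value' lines into dicts.

    Assumption: a line containing only '---' separates records.
    """
    record = {}
    for raw in lines:
        line = raw.strip()
        if line == "---":
            if record:
                yield record
            record = {}
            continue
        # Skip blanks, comments, and lines that are not 'key: value'.
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip()
    if record:
        yield record


def to_streaming_line(record, key_field="name"):
    """Format a record as the tab-separated key/value line Streaming expects.

    Assumption: 'name' is the field used as the row key.
    """
    key = record.get(key_field, "")
    return key + "\t" + json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Hadoop Streaming feeds input on stdin and reads output from stdout.
    for rec in parse_records(sys.stdin):
        print(to_streaming_line(rec))
```

Hadoop Streaming splits input by line, so a record that spans several lines can be cut at a split boundary. A real job would need to handle that, for example by pre-processing the input to one record per line.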
Analyze the data with Pig and write the results to HDFS
cat bin/analyze.pig | bin/analyze
Use Java MapReduce to store summary results back into Cassandra
bin/summarize
Thank you to Infochimps for the carefully scraped Wikipedia dataset: http://infochimps.org/datasets/taxobox-wikipedia-infoboxes-with-taxonomic-information-on-animal