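Here are the steps to install Apache Mahout and run it on data stored in HDFS:

1. Download the latest package from http://mirror.nexcess.net/apache/mahout/

2. Extract the package.

3. Create a directory in HDFS and put your data file into it:
   <hadoop_directory>/bin/hdfs dfs -mkdir /mahout_data
   <hadoop_directory>/bin/hdfs dfs -put /home/Hadoop/data/mydata.txt /mahout_data/

4. Run Mahout on that directory. Note that seqdirectory does not cluster anything by itself: it converts the text files in the input directory into Hadoop SequenceFiles, which is the usual preparation step before running a clustering job:
   <mahout_directory>/bin/mahout seqdirectory -i hdfs://localhost:9000/mahout_data/ -o hdfs://localhost:9000/clustered_data/

5. The output files (SequenceFiles) will be in the clustered_data directory in HDFS.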
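
If you want to repeat the HDFS and Mahout steps, they can be wrapped in a small script. This is only a sketch under some assumptions: the HADOOP_HOME and MAHOUT_HOME defaults are hypothetical install locations (adjust them to wherever you extracted the packages), the DRY_RUN switch and the function names are my own, and the final listing step is added just to check the result. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of steps 3 to 5. Install locations below are assumptions, not
# something Hadoop or Mahout mandates; override them via the environment.
set -eu

HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"          # assumed location
MAHOUT_HOME="${MAHOUT_HOME:-/usr/local/mahout}"          # assumed location
NAMENODE="${NAMENODE:-hdfs://localhost:9000}"            # from the steps above
LOCAL_FILE="${LOCAL_FILE:-/home/Hadoop/data/mydata.txt}" # from the steps above
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print only

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

mahout_pipeline() {
  # Step 3: create the HDFS directory and upload the local data file.
  run "$HADOOP_HOME/bin/hdfs" dfs -mkdir -p /mahout_data
  run "$HADOOP_HOME/bin/hdfs" dfs -put "$LOCAL_FILE" /mahout_data/
  # Step 4: seqdirectory converts the text files into SequenceFiles.
  run "$MAHOUT_HOME/bin/mahout" seqdirectory \
    -i "$NAMENODE/mahout_data/" -o "$NAMENODE/clustered_data/"
  # Step 5: list the output directory to confirm something was written.
  run "$HADOOP_HOME/bin/hdfs" dfs -ls /clustered_data
}

mahout_pipeline
```

Run it once as-is to review the printed commands, then run it again with DRY_RUN=0 to actually execute them against your cluster.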