Saturday, December 1, 2012

Getting Started with Hadoop on Ubuntu

So, I'm trying to play with Hadoop again (I haven't done it in a while), and since Ubuntu is my current weapon of choice, I found a great tutorial at http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/. But I wanted something even simpler: a script, plus a sample program and instructions on how to compile and run it. So I created one (the differences from Noll's tutorial are at the very bottom). It is available at https://github.com/okaram/scripts/blob/master/hadoop/install.sh.

You just need to download it and make it executable. You probably want to look at it in your favorite editor first (it is not a good idea to just run a script from the internet; I trust myself, but you shouldn't trust me), and you may want to change the mirror while you're at it (I live in Atlanta, so I use Georgia Tech's). Once you're happy with it, run it as root, and you should be done with the installation!

The script creates a user for Hadoop, called hduser; switch to that user. Then, as hduser, set up your path and classpath (the classpath is needed for compiling) and start Hadoop.

Now download my sample program (it is the standard WordCount example from the tutorial, but without the package statement, so you can compile it directly from that folder), compile it, and create a jar file.

Next, we need to put some data into Hadoop. First create a folder and copy a file into it (our same WordCount.java, since we just need a text file), then copy that folder into Hadoop and list it to verify it's there. Now we can run our program in Hadoop.

When you want to stop Hadoop, just run the stop-all.sh command. If you want to copy the output to your file system, use the -copyToLocal option of hadoop's dfs.
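
The steps above can be sketched as a small shell script. This is a rough sketch, not the exact commands from the original post, and it makes several assumptions: the raw-file URL is derived from the GitHub link above, Hadoop is assumed to be installed under /usr/local/hadoop (check install.sh for the real location), the `hadoop dfs` command names are the Hadoop 1.x ones, and WordCount.java is assumed to be in the current directory already. The helper names (`run`, `install_hadoop`, `run_wordcount`) and the `output-local` folder are made up for illustration. By default it only prints each command; set DRY_RUN=0 to really execute them.

```shell
#!/bin/sh
# Sketch only: each command is printed, and executed only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Assumed values, not taken from the post; adjust them to match install.sh.
SCRIPT_URL="https://raw.githubusercontent.com/okaram/scripts/master/hadoop/install.sh"
HADOOP_DIR="${HADOOP_DIR:-/usr/local/hadoop}"

# Print a command, then run it unless we are in dry-run mode.
run() {
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1 (as your normal user): fetch the script, read it, run it as root.
install_hadoop() {
    run wget -O install.sh "$SCRIPT_URL" &&
    run chmod +x install.sh &&
    run less install.sh &&
    run sudo ./install.sh
}

# Step 2 (as hduser, e.g. after `sudo su - hduser`): build and run WordCount.
run_wordcount() {
    # Path for the hadoop commands; classpath so javac can find the Hadoop jars.
    export PATH="$PATH:$HADOOP_DIR/bin"
    export CLASSPATH="$HADOOP_DIR/*"

    run start-all.sh &&
    # Compile the sample and package it as a jar.
    run mkdir -p classes input &&
    run javac -classpath "$CLASSPATH" -d classes WordCount.java &&
    run jar cf wc.jar -C classes . &&
    # Use the source file itself as input text and copy it into HDFS.
    run cp WordCount.java input/ &&
    run hadoop dfs -copyFromLocal input input &&
    run hadoop dfs -ls input &&
    # Run the job, then copy its output back to the local file system.
    run hadoop jar wc.jar WordCount input output &&
    run hadoop dfs -copyToLocal output output-local &&
    run stop-all.sh
}
```

To try it, source the file and call `install_hadoop` as your normal user, then call `run_wordcount` once you are logged in as hduser. Leaving DRY_RUN at its default is a cheap way to read through the commands before trusting them.
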
The install script is completely automated, so you can even use it when starting an Amazon EC2 instance; for example, you can start a micro instance with an Ubuntu 12.04 daily build (mine was for Dec-1-2012; change the AMI id to get a different one :) and a key named mac-vm.
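
The original launch command is not preserved here, so the following is only a sketch using the AWS CLI, which may not be the tool the post used. It passes install.sh as user data, which Ubuntu cloud images run as root on first boot when the file starts with a `#!` line. The AMI id is deliberately left as a placeholder you must fill in, `t1.micro` is my guess at what "micro instance" meant in 2012, and the function name is made up. As before, it only prints the command unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: the command is printed, and executed only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Placeholder: replace with the id of the Ubuntu 12.04 image you want.
AMI_ID="${AMI_ID:-ami-REPLACE_ME}"
# Assumed instance type for a "micro instance"; the key name is from the post.
INSTANCE_TYPE="${INSTANCE_TYPE:-t1.micro}"
KEY_NAME="${KEY_NAME:-mac-vm}"

launch_hadoop_instance() {
    # Build the command once so the printed and executed forms are identical.
    set -- aws ec2 run-instances \
        --image-id "$AMI_ID" \
        --instance-type "$INSTANCE_TYPE" \
        --key-name "$KEY_NAME" \
        --user-data file://install.sh
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}
```

Call `launch_hadoop_instance` from the folder that contains install.sh, with AMI_ID set to a real image id.
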
