OpenSolaris

OpenSolaris Project: Hadoop Live CD

Endorsing communities

HPC Developer
Storage

OpenSolaris and Hadoop


Organizations routinely collect huge amounts of data, including web crawls, email messages, and scientific data. Processing these datasets with traditional relational database models or streaming algorithms no longer scales. A newer data processing model, MapReduce, addresses this challenge by leveraging large clusters of hundreds or thousands of heterogeneous servers.

Hadoop is a software platform for processing huge amounts of data. It includes the Hadoop Distributed FileSystem (HDFS), which is capable of storing petabytes of data across thousands of nodes. HDFS replicates data across nodes so that it remains available even if individual nodes become corrupted or fail. Hadoop also includes Map/Reduce, a programming model for breaking the data into smaller chunks of work and distributing that work across the nodes in the cluster.
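To make the Map/Reduce idea concrete, here is a minimal single-process sketch in Python of the classic word-count example. It is not Hadoop code and does not use the Hadoop API; the function names (`split_into_chunks`, `map_chunk`, `shuffle`, `reduce_counts`, `word_count`) are made up for illustration. It only shows the shape of the model: the input is split into chunks, each chunk is mapped to key/value pairs, the pairs are grouped by key, and each group is reduced to a result.

```python
from collections import defaultdict
from itertools import chain


def split_into_chunks(lines, n_chunks):
    """Divide the input lines into n_chunks pieces, one per (hypothetical) worker."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each word."""
    return {word: sum(values) for word, values in grouped.items()}


def word_count(lines, n_chunks=2):
    chunks = split_into_chunks(lines, n_chunks)
    # On a real cluster each chunk would be mapped on a different node;
    # here the chunks are processed one after another in a single process.
    mapped = chain.from_iterable(map_chunk(chunk) for chunk in chunks)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(sample))
```

Because each chunk is mapped independently, the map step can run on many nodes at once, and the result does not depend on how the input is split.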

Interest in Hadoop is growing. However, creating, deploying, and managing a Hadoop deployment is challenging, since the performance and throughput of the system are strongly influenced by the structure of the cluster that Hadoop runs within. Thus, supporting Hadoop requires a holistic approach to server provisioning, operating system configuration, network design, and datacenter deployment.

The goal of this project is to contribute to Hadoop's development by making it more scalable, robust, deployable, and easy to use. We will work closely with the open-source and research communities to develop these innovations and ensure that they are made widely available.

OpenSolaris Hadoop LiveCD


From bootup to Hadoop cluster in 15 minutes

This project's initial CD aims to provide users new to Hadoop with a fully functional Hadoop cluster that is easy to start up and use. We have built a bootable CD-ROM image that provides users with a three-node virtual Hadoop cluster using OpenSolaris Zones. The CD is "live", meaning that it does not modify the contents of the user's computer. This makes it ideal for those wishing to try out Hadoop without having to install any software.

A community project: Our hope is that by lowering the barrier to trying out Hadoop, more people will explore its features. If you have requests, ideas, or suggestions for improvements to this distribution of Hadoop, please post them to the discussion board here. We'd like to include additional features in this release, and in Hadoop in general, that make its adoption easier.

Features of the OpenSolaris Live Hadoop Bootable CD-ROM

    All-in-one bootable CD-ROM
  • The OpenSolaris operating system and the Hadoop 0.17 distribution are both included on the CD
  • "Live CD": Everything runs directly off the CD; no software is installed and no local files are modified
  • Ideal for classroom and lab environments where computers are shared with others
    Three-node virtual Hadoop cluster
  • Once OpenSolaris boots, two virtual servers are created using Zones
  • Zones are very lightweight, minimizing virtualization overhead and leaving more memory for your application
  • The "Global" zone hosts the NameNode and JobTracker, and two "Local" zones each host a DataNode and TaskTracker

How to use OpenSolaris Live Hadoop

Begin by downloading the CD-ROM image from the project download link. Next, burn this image onto a blank CD. The steps needed to do this vary depending on the type of computer you have. Once you have the finished CD, make sure it is in your computer's CD-ROM drive, and reboot your computer. Select the CD as the boot device, and wait for the OpenSolaris environment to appear. The way you specify a boot device also varies depending on the type of computer you have.

A short Quick Start Guide is available that shows you how to start Hadoop, interact with the virtual cluster, and run a demo Map/Reduce job.

Please read the Release Notes for installation information, known issues, and late-breaking news.

Hardware Requirements

  • An x86-based PC or Mac computer
  • A CD-ROM drive
  • At least 1 GB of memory

Announcements

02 Sep 2008 OpenSolaris Live Hadoop Release 2008.08.28