Big Data on a Shoestring

Big Data on a Shoestring by Nicholas Bessmer

3 - Our Big Data Analytics Example Using Pig Latin Sample Script
     
    For the purpose of this guide, we will work through setting up a Hadoop Big Data Analytics example and run a simple Pig Latin example script from the Pig tutorial. This script performs some analysis on query logs from the Excite search engine.
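To give a feel for the kind of analysis involved, here is a minimal Python sketch. It is not the Pig tutorial's actual script. It assumes each log line holds three tab-separated fields (user, timestamp, query); the sample lines and the function name top_queries are made up for illustration.

```python
from collections import Counter


def top_queries(lines, n=3):
    """Count how often each search query appears; return the n most common."""
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip malformed lines
        _user, _timestamp, query = parts
        query = query.strip().lower()  # treat "Cheap Flights" and "cheap flights" alike
        if query:
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical sample data, assumed layout: user <TAB> timestamp <TAB> query
sample_log = [
    "user1\t970916105432\tcheap flights",
    "user2\t970916105501\tweather boston",
    "user3\t970916110012\tCheap Flights",
    "user1\t970916110533\tpig latin",
    "bad line without tabs",
]

if __name__ == "__main__":
    for query, count in top_queries(sample_log):
        print(f"{count}\t{query}")
```

On a laptop this works for a small file. The point of Hadoop and Pig is to run the same kind of grouping and counting when the log no longer fits on one machine.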
     
    For future reference, you can find huge data sets to test Big Data with at the following site:
    http://aws.amazon.com/publicdatasets/
     
    Some examples that can be useful to businesses:
     
    »         US and foreign census data
    »         Labor statistics
    »         Federal Reserve data
    »         Federal contracts
     
    Here are examples that are useful to scientists:
     
    »         Daily global weather measurements
    »         Genome databases
     
    We may want to use census data from our local metropolitan area to identify trends such as disposable income or demographics like where elderly or young people reside. This type of marketing savvy requires not only computer power but also the framework that Hadoop provides. Think of Hadoop as a toolbox that allows people to approach managing huge volumes of unstructured and structured data.
     
    In Amazon’s example, these sample big data sets are accessible by signing up for their EC2 service. This is a metered service that allows businesses and institutions to run applications and services in the cloud. Amazon acts as a central utility, like the electric company, from which customers rent services: in this case, computing power and data storage. EC2 is Amazon’s Elastic Compute Cloud, and what follows are the steps to set up an EC2 account through Amazon.
     

     
    Here is the sign up screen to “rent” Amazon Web Services to run your application and database in the cloud. This is a metered service that fluctuates based on your demand. It will not break the bank.
     

     
    Better yet, let’s sign up for the micro version.
     
    Free Tier*
     
    As part of AWS’s Free Usage Tier, new AWS customers can get started with Amazon EC2 for free. Upon sign-up, new AWS customers receive the following EC2 services each month for one year:
    750 hours of EC2 running Linux/Unix Micro instance usage
    750 hours of EC2 running Microsoft Windows Server Micro instance usage
    750 hours of Elastic Load Balancing plus 15 GB data processing
    30 GB of Amazon EBS Standard volume storage plus 2 million IOs and 1 GB snapshot storage
    15 GB of bandwidth out aggregated across all AWS services
    1 GB of Regional Data Transfer
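As a quick sanity check on the 750-hour figure, the arithmetic can be sketched in Python. The function names here are illustrative only, and the sketch covers instance hours alone, not storage or bandwidth.

```python
import calendar

FREE_TIER_HOURS = 750  # monthly micro instance allowance quoted above


def hours_in_month(year, month):
    """Number of hours in the given calendar month."""
    days = calendar.monthrange(year, month)[1]
    return days * 24


def fits_free_tier(year, month, instances=1):
    """True if running `instances` servers around the clock stays within the allowance."""
    return hours_in_month(year, month) * instances <= FREE_TIER_HOURS


if __name__ == "__main__":
    # A 31-day month has 31 * 24 = 744 hours, just under the 750-hour allowance.
    print(hours_in_month(2013, 1), fits_free_tier(2013, 1))
    # Two instances running all month would need 1488 hours, which exceeds it.
    print(fits_free_tier(2013, 1, instances=2))
```

In other words, the allowance is enough to leave one micro instance running all month, but not two.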
     
    Not bad for a test drive of this service. You need to provide your credit card number and complete a phone validation (to make sure you are a real person). Remember: we want the micro version to start off with. You will receive a confirmation email, and you should select MANAGE YOUR ACCOUNT. Sign in with the credentials that you created and:
     

     
    Select 1 instance of the MICRO type, as shown in the example above. I do not want to swamp you with technical details, but you will see screens indicating your progress as follows:
     

     
    You will need to save a special file called a KEY PAIR to your local computer. Save this! Amazon will prompt you to remember this.
     
    Finally:
     
     

     
    Start your instance, your virtual server. You can connect to your new instance by downloading these tools. You can also right-click on your instance name and select “CONNECT”:
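If you prefer a command terminal to the console's CONNECT option, the saved key pair file is what an SSH client uses to log in. The following Python sketch only assembles the command; it assumes an OpenSSH client is installed, and the key file name, host name, and the login name ec2-user (typical for Amazon's own Linux images, but it varies by image) are placeholders rather than values from this guide.

```python
import os
import shlex
import stat


def restrict_key_permissions(key_path):
    """Make the key file readable by its owner only (like `chmod 400`).

    OpenSSH refuses to use a private key that other users can read.
    """
    os.chmod(key_path, stat.S_IRUSR)


def build_ssh_command(key_path, host, user="ec2-user"):
    """Return the ssh command, as an argument list, for logging in with a key file."""
    return ["ssh", "-i", key_path, f"{user}@{host}"]


if __name__ == "__main__":
    # Hypothetical values: substitute your own key file and your instance's public DNS name.
    command = build_ssh_command("my-key.pem", "ec2-host.example.com")
    print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch harmless. Paste the printed line into a terminal once the placeholders are replaced.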
     

     
    In about 20 minutes you can have a LINUX server running in The Cloud!

Getting Our Tools Running on Our New Big Data Server
     
    Remember that a lot of computer technology is a series of repetitive and often cookbook-like steps. It is the 80/20 rule: 20% of the work is challenging and imaginative, and the other 80% is very predictable. But predictable can be good, especially when trying to avoid techno-speak if you are, for example, a small business owner.
     
    Let’s get our tool installed! In your command terminal fired off when you CONNECT to your new EC2 LINUX
