Add this module to your Puppetfile:

mod 'cesnet-pig', '0.9.3'

Add this module to your Bolt project:

bolt module add cesnet-pig

Manually install this module globally with Puppet module tool:

puppet module install cesnet-pig --version 0.9.3

Direct download is not typically how you would use a Puppet module to manage your infrastructure, but you may want to download the module in order to inspect the code.

Tags: hadoop, pig

Documentation

cesnet/pig — version 0.9.3 Jun 4th 2015

####Table of Contents

  1. Overview
  2. Module Description - What the module does and why it is useful
  3. Setup - The basics of getting started with pig
  4. Usage - Configuration options and additional functionality
  5. Reference - An under-the-hood peek at what the module is doing and how
  6. Development - Guide for contributing to the module

##Overview

Installs Apache Pig, a platform for analyzing large data sets.

##Module Description

This module installs Apache Pig, a platform for analyzing large data sets. By default, Pig expects a Hadoop client to be set up locally.

Supported platforms:

  • Fedora 21: native packages (tested on Pig 0.13.0)
  • Debian 7/wheezy: Cloudera distribution (tested on CDH 5.3.0, Pig 0.12.0)
  • Ubuntu 14/trusty: Cloudera distribution (tested on CDH 5.3.0, Pig 0.12.0)
  • RHEL 6, CentOS 6, Scientific Linux 6: Cloudera distribution (tested on CDH 5.4.2, Pig 0.12.0)

##Setup

###What the cesnet-pig module affects

  • Packages: installs the Pig packages

###Setup Requirements

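Be aware of:

  • Hadoop client: by default, Pig expects a Hadoop client to be set up locally (see Module Description).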

###Beginning with pig

Example:

include pig

##Usage

By default, Pig runs its jobs on Hadoop, as if it had been launched with -x mapreduce:

pig -x mapreduce

To run Pig in local mode instead, without Hadoop:

pig -x local

To use Pig with HBase, add the following to your Pig scripts (replace <ZooKeeper_version> and <HBase_version> with the installed versions):

REGISTER /usr/lib/zookeeper/zookeeper-<ZooKeeper_version>.jar
REGISTER /usr/lib/hbase/hbase-<HBase_version>-security.jar

To use Pig with DataFu, add the following to your Pig scripts (replace <DataFu_version> with the installed version):

REGISTER /usr/lib/pig/datafu-<DataFu_version>.jar

###Classes

  • config
  • init
  • install
  • params

###Module Parameters

####datafu_enabled

Also installs DataFu, a collection of user-defined functions for Pig. Default: true.
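As a minimal sketch, the parameter can be overridden with a resource-like class declaration in place of `include pig`. This assumes the main class `pig` accepts `datafu_enabled` directly, as the parameter list above suggests; the value `false` is only an illustration.

```puppet
# Install Pig without the DataFu UDF collection.
# Assumption: the main 'pig' class takes datafu_enabled as a class parameter.
class { 'pig':
  datafu_enabled => false,
}
```

A class can be declared this way only once per catalog, so use either this form or `include pig` for a given node, not both with different settings.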

##Development