Thursday, 12 June 2014

How To Set Up a Solr Server


You can download the Solr server from the following link:

http://lucene.apache.org/solr/downloads.html


Once downloaded, extract the archive and place it in your virtual machine. Move into the Solr folder and you'll see some useful readme files; read them before you go any further.

To get a server up and running quickly, go into the example directory,
open a terminal and execute the command

java -jar start.jar

Now you should be able to see the Solr interface at
http://localhost:8983/solr/

To index a PDF document, use the post.jar utility that already ships with the Solr distribution. Move into the exampledocs folder inside the example directory.

The post.jar utility is not meant for production use, but as a convenience tool for experimenting with Solr.

Open a new terminal and execute the following command.
This command uses the ExtractingRequestHandler, also known as the Solr Cell project, to extract the text of the PDF before indexing it.

java -Durl=http://localhost:8983/solr/update/extract -Dparams=literal.id=doc5 -Dtype=text/pdf -jar post.jar SQA.pdf

Note that the file SQA.pdf must be in the exampledocs directory.
You should see output like this:
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update/extract?literal.id=doc5 using content-type text/pdf..
POSTing file SQA.pdf
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update/extract?literal.id=doc5..
Time spent: 0:00:06.259
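
If you would rather post the file from a script than from post.jar, the same request can be sketched in Python. This is only a rough sketch under a few assumptions: the function names extract_url and post_pdf are made up for illustration, the content type is application/pdf (the registered MIME type, where the command above sends text/pdf), and commit=true is added so the document becomes searchable right away.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def extract_url(base_url, doc_id, commit=True):
    """Build the /update/extract URL for one document (hypothetical helper)."""
    params = {"literal.id": doc_id}
    if commit:
        # Ask Solr to commit right after indexing this document.
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


def post_pdf(path, doc_id, base_url="http://localhost:8983/solr"):
    """Send the raw PDF bytes to Solr Cell and return the HTTP status code."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = Request(
        extract_url(base_url, doc_id),
        data=body,  # a request with a body is sent as POST
        headers={"Content-Type": "application/pdf"},
    )
    with urlopen(request) as response:
        return response.status
```

With the example server running, post_pdf("SQA.pdf", "doc5") should do roughly what the post.jar command does.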

You should continue to read here


To query the Solr server, go to this URL:
http://localhost:8983/solr/#/collection1/query
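
The query page in the admin UI sends its requests to the core's select handler, so you can also query from a script. Here is a minimal sketch, assuming the default collection1 core of the example server; select_url and search are hypothetical helper names, not part of Solr.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(query, core="collection1", rows=10,
               base_url="http://localhost:8983/solr"):
    """Build a URL for the core's /select handler that returns JSON."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return "{}/{}/select?{}".format(base_url.rstrip("/"), core, urlencode(params))


def search(query, **kwargs):
    """Run the query and return the list of matching documents."""
    with urlopen(select_url(query, **kwargs)) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # Solr wraps the hits in response -> docs.
    return payload["response"]["docs"]
```

For example, search("id:doc5") should return the PDF indexed above, provided the server is running and the document was committed.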


How To Run Oozie Coordinator Jobs

First export the Oozie server URL so that the oozie client can find it
export OOZIE_URL="http://localhost.localdomain:11000/oozie"

Then submit the job
oozie job -oozie http://localhost.localdomain:11000/oozie -config coord.properties -submit

To run a simple Oozie workflow job you submit a job.properties file, but for a time-triggered (coordinator) job you have to submit a coord.properties file instead.
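
To give an idea of what goes into coord.properties, here is a small Python sketch that renders one. Treat every value as an assumption: the host names, the ports 8020 and 8021, and the HDFS path are placeholders for your own cluster, and render_properties is a made-up helper. The key oozie.coord.application.path is the Oozie property that points at the coordinator application on HDFS.

```python
def render_properties(props):
    """Render key=value lines in the .properties style that Oozie reads."""
    lines = ["{}={}".format(key, value) for key, value in props.items()]
    return "\n".join(lines) + "\n"


coord_props = {
    # Assumed NameNode and JobTracker addresses; replace with your own.
    "nameNode": "hdfs://localhost.localdomain:8020",
    "jobTracker": "localhost.localdomain:8021",
    # Hypothetical HDFS directory that holds the coordinator XML file.
    "oozie.coord.application.path":
        "${nameNode}/user/${user.name}/ProjectFolder/CoordinatorFolder",
}

text = render_properties(coord_props)
print(text)
```

Save the printed text as coord.properties on the local file system and pass it to the -config option of the submit command.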


Directory Structure for using Oozie:
ProjectFolder
    lib
        Jars that you want to run. If you have made a MapReduce job, then make a jar from it and place it here
    DataFolder
        Input and output files
    CoordinatorFolder
        coord.xml
    job.properties (should be on the local file system, everything else should be on HDFS; the NameNode and JobTracker are specified here)
    workflow.xml (the actual workflow file, which specifies which job and which class to run; the parameters are also specified here)
Coordinator Job
1-      The workflow job is started after the predicate is satisfied. A predicate can reference data, time and/or external events.
2-      The outputs of the last 4 runs of a workflow that runs every 15 minutes become the input of another workflow that runs every 60 minutes. Chaining workflows together like this is referred to as a data application pipeline.
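
The arithmetic behind point 2 can be sketched in a few lines of Python. This is only an illustration of the time windows, not how Oozie itself is configured; input_runs is a hypothetical helper, and it assumes the 15-minute run that shares the consumer's nominal time has already finished.

```python
from datetime import datetime, timedelta


def input_runs(nominal_time, producer_minutes=15, consumer_minutes=60):
    """Return the nominal times of the producer runs that feed one consumer run."""
    count = consumer_minutes // producer_minutes  # 60 // 15 = 4 runs
    step = timedelta(minutes=producer_minutes)
    # Oldest run first, ending with the run at the consumer's nominal time.
    return [nominal_time - step * (count - 1 - i) for i in range(count)]


runs = input_runs(datetime(2014, 6, 12, 10, 0))
for run in runs:
    print(run.strftime("%H:%M"))
```

For the 60-minute workflow scheduled at 10:00 this lists the 15-minute runs at 09:15, 09:30, 09:45 and 10:00.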

Useful links:
http://hadooped.blogspot.com/2013/06/apache-oozie-part-1-workflow-with-hdfs.html

Wednesday, 11 June 2014

Installing Tcollector

The following tutorial will guide you through installing and using tcollector.

tcollector is a companion project of OpenTSDB: a client-side process that gathers data from local collectors and pushes it to OpenTSDB.

Once OpenTSDB is running it will start saving data points. To store useful information about system components, such as RAM and disk space, we use tcollector.

git clone https://github.com/OpenTSDB/tcollector.git

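Now go into the tcollector folder, edit the 'startstop' script and change this variable
TSD_HOST=something something
After the change it should look like
TSD_HOST=127.0.0.1

Here 127.0.0.1 is the address of my server (the loopback address, because the TSD runs on the same machine). Use the DNS name or IP address of your own TSD host.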

To start tcollector
sh startstop start
To stop tcollector
sh startstop stop

To query the stored data through the RESTful API, these links are useful

http://www.euphoriaaudio.com/opentsdb/querying_examples.html
http://www.euphoriaaudio.com/opentsdb/http-api-q.html

Open this URL in the browser
http://localhost:4242/api/query?start=1h-ago&m=sum:rate:proc.stat.cpu

You should see the result returned as JSON.
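
If you want to read that JSON from a script instead of the browser, here is a rough Python sketch. The helper names query_url, latest_points and fetch are made up for illustration. The sketch relies on /api/query returning a list of series, each with a "metric" name and a "dps" map from timestamp to value.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(metric, start="1h-ago", aggregator="sum", rate=True,
              base_url="http://localhost:4242"):
    """Build the same kind of /api/query URL as the one shown above."""
    m = aggregator + (":rate:" if rate else ":") + metric
    # Keep ':' unescaped so the URL reads exactly like the browser example.
    query = urlencode({"start": start, "m": m}, safe=":")
    return base_url.rstrip("/") + "/api/query?" + query


def latest_points(series_list):
    """Map each metric name to its most recent (timestamp, value) pair.

    Series that share a metric name overwrite each other, which is fine
    for an aggregated query that returns a single series.
    """
    latest = {}
    for series in series_list:
        dps = series.get("dps", {})
        if dps:
            ts = max(dps, key=int)  # timestamps arrive as string keys
            latest[series["metric"]] = (int(ts), dps[ts])
    return latest


def fetch(metric, **kwargs):
    """Query the TSD and return the decoded JSON response."""
    with urlopen(query_url(metric, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the TSD and tcollector running, latest_points(fetch("proc.stat.cpu")) should give you the newest CPU rate value.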

cheers

Set Up Your OpenTSDB in the Cloudera VM





The following tutorial will guide you through setting up OpenTSDB in the Cloudera VM.


First of all, we have to set the time zone of our machine.

This command sets the time zone of the VM to UTC, which is very important for OpenTSDB
sudo ln -sf /usr/share/zoneinfo/UTC /etc/localtime

If at any point you get a permission error, try switching to a root shell using
sudo -s

Install Git, Automake and Gnuplot
sudo yum install git automake gnuplot

Clone the OpenTSDB Git repo
git clone https://github.com/OpenTSDB/opentsdb.git

Then go into the opentsdb folder
cd opentsdb

Build OpenTSDB by running the shell script that is already in the repo
./build.sh

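Create the HBase tables that OpenTSDB needs (HBase must be running). The env prefix sets COMPRESSION and HBASE_HOME for this one command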
env COMPRESSION=none HBASE_HOME=/usr/lib/hbase ./src/create_table.sh

Then create the cache directory for the TSD
mkdir /tmp/tsd

Look in the opentsdb folder; it should now contain a newly created folder named "build".
We're good to go!!

Now run OpenTSDB
./build/tsdb tsd --port=4242 --staticroot=build/staticroot/ --cachedir=/tmp/tsd/ --auto-metric

*Some important details are skipped here; you should study the parameters of the ./build/tsdb tsd command in detail.

To check whether OpenTSDB is running, open this URL in a browser
http://127.0.0.1:4242

You should see an interface where you can play with OpenTSDB and plot some graphs.
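
If the VM has no browser handy, a quick way to see whether something is listening on the TSD port is a plain TCP connect. A minimal Python sketch follows; is_listening is a hypothetical helper, and it only tells you that the port is open, not that the process behind it is really OpenTSDB.

```python
import socket


def is_listening(host="127.0.0.1", port=4242, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Call is_listening() after starting the TSD; if it returns False, check the terminal where you ran ./build/tsdb for errors.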


OpenTSDB will now start saving data points. To store useful system information such as RAM and disk space usage, we have to use tcollector. We'll cover that in another tutorial.


Cheers

Sunday, 8 June 2014


How to Install & Configure Cloudera VM



Prerequisites and Installation:


1- Cloudera VM
      You can download this from this link (Download Cloudera VM)

2- Virtual Machine Player
      You will also need software that runs virtual machines.
      The following are links to popular virtual machine players.

      VirtualBox
         You can download this from this link (Download Virtual Box)

      VMware Player or VMware Workstation
         You can download this from (Download VMware Products)

      Note: In my opinion, VMware Workstation is the best.



Configuration: (VMware Workstation)

  1. Open VMware Workstation.
  2. Go to the File menu and Click on "Open" or press "Ctrl + O"
  3. Browse to the "Virtual Machine File" you downloaded. Hint: see the first prerequisite.
  4. VMware Workstation will start extracting the files.
  5. You have now successfully configured the Cloudera VM. Run and enjoy!