InfoSphere BigInsights is IBM’s big data offering that helps organizations discover and analyze business insights hidden in large volumes of diverse data: data that is often ignored or discarded because it is too large, impractical, or difficult to process by traditional means. Examples include log records, click streams, social media data, news feeds, emails, electronic sensor output, and even transactional data.
BigInsights brings the power of the open source Apache Hadoop project to the enterprise. In addition, a number of IBM value-add components make up this enterprise analytics platform. These value-adds are in the areas of analysis and discovery, security, enterprise software integration, and administrative and platform enhancements. For more details, please visit the URL below.
You can also download the no-charge Quick Start Edition of IBM InfoSphere BigInsights.
In this blog post we’ll walk through the steps involved in installing and configuring BigInsights on RHEL. There are three major parts to it.
1) Meet the pre-requisites (Hardware & Software)
2) Complete pre-installation activities
3) Install BigInsights 3.0
Meet the pre-requisites (Hardware & Software)
Let’s start with step 1. You can go through the standard supported environment specification on the IBM site (http://www-01.ibm.com/support/docview.wss?uid=swg27027565). Here I am going to install a single-node BigInsights 3.0 on a RHEL 6.4 system with the specification shown in the screenshot below.
We need to verify or install the Expect, Numactl, and Ksh Linux packages. One way to get these packages is to download them independently from various Linux websites and install them. The other, and probably better, way is to use your OS (RHEL 6.4 in this case) disk or .ISO image. I am going to use the second option here. First I copied the “RHEL6.4-20130130.0-Server-x86_64-DVD1.iso” file to the /data folder (newly created), then mounted it on /media and updated the repository.
mount -o loop RHEL6.4-20130130.0-Server-x86_64-DVD1.iso /media
rpm --import /media/*GPG*
yum clean all
Next step is to verify that the Expect, Numactl and Ksh Linux packages are installed.
rpm -qa | grep expect
rpm -qa | grep numactl
rpm -qa | grep ksh
If the packages are not installed, then run the following command to install them.
yum install expect
yum install numactl
yum install ksh
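The check-then-install steps above can be folded into one small helper. This is only a sketch: the function name missing_packages is my own, and it uses rpm -q (an exact package-name query) instead of rpm -qa | grep, which can also match unrelated packages whose names merely contain the search string.

```shell
# Sketch: print the names of required packages that are not installed.
# rpm -q exits non-zero when the named package is absent.
missing_packages() {
    local pkg
    for pkg in "$@"; do
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Example usage (as root on the RHEL host):
#   missing=$(missing_packages expect numactl ksh)
#   [ -n "$missing" ] && yum install -y $missing
```

If the helper prints nothing, all three packages are already present and no yum call is needed.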
Now we are ready for step-2.
Complete pre-installation activities
In addition to product prerequisites, there are tasks common to all InfoSphere BigInsights installation and upgrade paths. You must complete these common tasks before you start an installation or upgrade.
Task – 1) Ensure that adequate disk space exists for these directories - / (10GB), /tmp (5GB), /opt (15GB), /var (5GB) & /home (5GB).
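A quick way to check these numbers is to read the available space from df. The helper below is a minimal sketch (the names has_space and check_all are mine). It checks each directory on its own, so if several of these directories share one file system you need to add their requirements together yourself.

```shell
# has_space DIR MIN_GB: succeed if the file system holding DIR
# has at least MIN_GB gigabytes available.
has_space() {
    local dir=$1 min_gb=$2 avail_kb
    # df -Pk prints one POSIX-format line per file system;
    # column 4 is the available space in kilobytes.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR==2 {print $4}')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# check_all: report each directory from Task 1 against its minimum.
check_all() {
    local entry dir gb status=0
    for entry in /:10 /tmp:5 /opt:15 /var:5 /home:5; do
        dir=${entry%%:*}
        gb=${entry##*:}
        if has_space "$dir" "$gb"; then
            echo "OK     $dir"
        else
            echo "FAILED $dir (needs ${gb}GB)"
            status=1
        fi
    done
    return $status
}
```

Running check_all prints one line per directory and returns non-zero if any of them falls short.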
Task – 2) Check that all devices have a Universally Unique Identifier (UUID) and that the devices are mapped to the mount point
Before you edit /etc/fstab, save a copy of the original file.
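To see which entries still name a device path directly, you can scan the file with awk. This is a rough sketch: the function name non_uuid_entries and the backup file name /etc/fstab.orig are my own choices, and the output is only a list for you to review, not a verdict.

```shell
# non_uuid_entries [FSTAB]: print "device mountpoint" for every entry
# whose first field is a /dev/ path rather than UUID=... or LABEL=...
non_uuid_entries() {
    awk '$1 ~ /^\/dev\// {print $1, $2}' "${1:-/etc/fstab}"
}

# Example usage (as root):
#   cp -p /etc/fstab /etc/fstab.orig   # keep a copy before editing
#   blkid                              # list the UUID of each device
#   non_uuid_entries                   # entries to review
```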
Task – 3) Create the biadmin user and group.
# Add the biadmin group.
groupadd -g 123 biadmin
# Add the biadmin user to the biadmin group.
useradd -g biadmin -u 123 biadmin
# Set the password for the biadmin user.
passwd biadmin
# Add the biadmin user to the sudoers file.
sudo visudo -f /etc/sudoers
Find the line below and comment it out with ‘#’ if it is not already commented.
# Defaults requiretty
Also add these lines just below the “# %wheel ALL=(ALL) NOPASSWD: ALL” line.
biadmin ALL=(ALL) NOPASSWD:ALL
root ALL=(ALL) NOPASSWD:ALL
Open the /etc/security/limits.d/90-nproc.conf file and add the lines below.
@biadmin soft nofile 65536
@biadmin soft nproc 65536
@root soft nofile 65536
@root soft nproc unlimited
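The new limits apply only to new login sessions, so it is worth confirming them after logging in again. A small sketch, assuming bash (the function name show_limits is mine):

```shell
# show_limits: print the soft limits of the current shell for
# open files (nofile) and user processes (nproc).
show_limits() {
    echo "nofile=$(ulimit -Sn) nproc=$(ulimit -Su)"
}

# Example usage: check the limits biadmin gets in a fresh login shell.
#   su - biadmin -c 'ulimit -Sn; ulimit -Su'
```

For biadmin, both values should match the 65536 set in the lines above.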
Open the /etc/security/limits.conf file and add below lines.
Task – 4) Configure your network.
Edit the /etc/hosts file to include the IP address, fully qualified domain name, and short name of your server. The format is IP_address domain_name short_name. For example,
127.0.0.1 localhost.localdomain localhost
172.21.6.151 bda.iicbang.ibm.com bda
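To confirm that the entry has the expected three-field shape, a small awk check helps. This is a sketch with a function name of my own choosing (check_hosts_entry); it looks only at the layout of the line and does not test whether the address is reachable.

```shell
# check_hosts_entry FILE FQDN: succeed if FILE contains a non-comment
# line of the form "IP_address FQDN short_name".
check_hosts_entry() {
    local file=$1 fqdn=$2
    awk -v fqdn="$fqdn" '
        $1 !~ /^#/ && $2 == fqdn && NF >= 3 { found = 1 }
        END { exit !found }
    ' "$file"
}

# Example usage:
#   check_hosts_entry /etc/hosts bda.iicbang.ibm.com && echo "hosts entry OK"
```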
Edit the /etc/resolv.conf to include the nameservers
Save your changes and then restart your network.
service network restart
We need to configure passwordless SSH for the root and biadmin users. When prompted, accept the default file storage location and leave the passphrase blank.
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub email@example.com
Ensure that you can log in to the remote server without a password.
Repeat this SSH setting process for biadmin user also.
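One way to test the passwordless login without getting stuck at a prompt is to run ssh in batch mode. A minimal sketch (the function name check_passwordless_ssh is mine):

```shell
# check_passwordless_ssh USER@HOST: succeed only if key-based login works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example usage (run once as root and once as biadmin):
#   check_passwordless_ssh biadmin@bda.iicbang.ibm.com && echo "passwordless SSH OK"
```

The first connection to a new host may still fail until its host key has been accepted, so connect once interactively before relying on this check.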
Run the following commands in succession to disable the firewall.
service iptables save
service iptables stop
chkconfig iptables off
Now disable IPv6:
echo "install ipv6 /bin/true" >> /etc/modprobe.d/disable-ipv6.conf
Edit the /etc/sysconfig/network file and append the following lines.
Edit /etc/sysconfig/network-scripts/ifcfg-eth0 (assuming eth0 is used for networking) and add these lines –
Append the following lines to the end of the /etc/sysctl.conf file.
net.ipv6.conf.all.disable_ipv6 = 1
kernel.pid_max = 4194303
net.ipv4.ip_local_port_range = 1024 64000
Restart your machine.
Verify that IPv6 is disabled.
IPv6 is disabled if no lines containing inet6 appear in the output.
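The inet6 check can be scripted by piping an interface listing into grep. A minimal sketch (the function name ipv6_absent is my own; it reads the listing from standard input, so it works with either ifconfig or ip addr show):

```shell
# ipv6_absent: read an interface listing on stdin and succeed
# only if no line mentions an inet6 address.
ipv6_absent() {
    ! grep -q 'inet6'
}

# Example usage:
#   ifconfig | ipv6_absent && echo "IPv6 is disabled"
```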
Task – 5) Synchronize the clocks of all servers using a Network Time Protocol (NTP) source.
Add the line below to /etc/ntp.conf.
server 172.21.4.40 iburst
Update the NTPD service with the time servers that you specified.
chkconfig --add ntpd
Start the NTPD service.
service ntpd start
Verify that the clocks are synchronized with a time server.
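One common way to check synchronization is ntpq -p, which marks the peer currently selected as the synchronization source with a leading asterisk. The sketch below relies on that marker (the function name ntp_synced is mine). Right after ntpd starts it can take a few minutes before a peer is selected, so a failure at that point is not necessarily a problem.

```shell
# ntp_synced: read "ntpq -p" output on stdin and succeed if some peer
# line starts with '*', the marker for the selected sync source.
ntp_synced() {
    grep -q '^\*'
}

# Example usage:
#   ntpq -p | ntp_synced && echo "clock is synchronized"
```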
Task – 6) Run the pre-installation checker utility to verify the readiness of your Linux environment.
I have copied the BigInsights software package to the /data folder. Let’s extract it.
tar -xvf IS_BigInsights_EE_30_LNX64.tar.gz
We must run bi-prechecker.sh and pass all of its tests before starting the BigInsights installation. Before that, let’s create a file containing the host name.
echo "bda.iicbang.ibm.com" > hostlist.txt
./bi-prechecker.sh -m ENTERPRISE -f hostlist.txt -u biadmin
If all the checks are [ OK ] then we are ready for the next step. If there are [FAILED] entries, go through the log file that the utility creates in the same folder and correct the problems.
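To pull only the failing checks out of the log, a fixed-string grep is enough. This is a sketch: the function name failed_checks is mine, and the log file name in the usage line is a placeholder, so substitute the name of the file the checker actually wrote.

```shell
# failed_checks LOGFILE: print every line of LOGFILE containing the
# literal marker [FAILED]; exit status is non-zero when there are none.
failed_checks() {
    grep -F '[FAILED]' "$1"
}

# Example usage (prechecker.log is a placeholder name):
#   failed_checks prechecker.log || echo "no failed checks found"
```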
Install BigInsights 3.0
Let’s start the installation, which is pretty easy if the previous steps completed successfully.
Navigate to the directory where you extracted the BigInsights software.
Run the start.sh script.
The script starts WebSphere Application Server Community Edition on port 8300. The script provides you with a URL to the installation wizard. In my case I received -
Open it in the browser. On the License Agreement panel, accept the license agreement and then click Next.
On the Installation Type panel, select Cluster installation, select the check box to Create a response file and save your selections without completing an installation, and then click Next.
On the File System panel, enter a name for your cluster (BICluster is default), select Install Hadoop Distributed File System (HDFS), enter the mount point where you want to install HDFS, and then click Next. You can choose other file system also.
On the 'Nodes' panel, click your node to use for HDFS. I can see bda.ibm.com listed here.
Next, on the 'Components 1' screen, enter the passwords you want to use for ‘catalog’ and ‘bigsql’.
Click Next on the remaining panels until you reach the Summary panel. On the Summary panel, click Create response file. The installation program displays the location where your response file is saved. Take note of this location so that you can easily locate your response file after you install HDFS and are ready to install InfoSphere BigInsights.
Make sure you can see all the services running on your node on the ‘results’ panel.
Next it will take you to the BigInsights Console screen, which shows that your installation completed successfully. You can browse the information on the Welcome tab and decide your next action.
If you want to add more nodes to the cluster, prepare them and add them from the Cluster Status tab.
To stop all the services, run the command below -
Similarly, there is ./start-all.sh to start all the services.
We also need to install the “IBM InfoSphere BigInsights Eclipse tools” for developing and deploying applications to the BigInsights server and for writing programs using Java MapReduce, JAQL, Pig, Hive and BigSQL. First, download Eclipse 4.3 or later from www.eclipse.org. Then add the http://<server>:<port>/updatesite/ URL to your Eclipse Software Updater (Help Menu -> Install) as shown below. Select the location and all entries under the IBM InfoSphere BigInsights category. Then simply follow the steps to install the InfoSphere BigInsights plugins.
Planning to install InfoSphere BigInsights 3.0
Preparing to install InfoSphere BigInsights 3.0
Installing InfoSphere BigInsights 3.0
BigInsights 3.0 Tutorials