Tuesday, 17 July 2012
Friday, 13 July 2012
Thursday, 12 July 2012
AWS related - notes to self
Uploading files to AWS
A python interface to AWS
e.g. code using the interface
How to decrypt S3 data before running EMR
The discussion below shows how to decrypt S3 data in a bootstrap action on the EMR cluster: https://forums.aws.amazon.com/thread.jspa?threadID=50189
Another example is to use the S3 Java client side encryption in Map/Reduce jobs: http://aws.typepad.com/aws/2011/04/client-side-data-encryption-using-the-aws-sdk-for-java.html
AWS streaming job flow
See the link below on how to create a streaming job flow (note: gzip + password can be used as part of the streaming job)
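Note to self on what a streaming job actually runs: the mapper and reducer are ordinary executables that read lines on stdin and write tab-separated key/value lines on stdout. A minimal word-count sketch in Python; the function names are my own and nothing here comes from the linked page:

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word.

    Input must already be sorted by key, which Hadoop streaming
    does between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"
```

In a real job each function would sit in its own script, looping over sys.stdin and printing every yielded line.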
Wednesday, 11 July 2012
Installing HA - notes to self
c/o Graham H
Checking the HA
[root@dmmlw-r410-12 ~]# crm_mon
============
Last updated: Tue Jul 10 14:12:10 2012
Stack: openais
Current DC: myserver2 - partition with quorum
Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ myserver1 myserver2 ]
shared_ip_one (ocf::heartbeat:IPaddr): Started myserver1
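The same check can be scripted. A rough sketch that pulls the online nodes and resource states out of crm_mon text output (for example captured with crm_mon -1, which prints once and exits). The parser is my own and only handles the line shapes shown above:

```python
import re


def parse_crm_mon(text):
    """Extract online nodes and resource states from crm_mon text output."""
    online = []
    resources = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Online: \[(.*)\]", line)
        if m:
            online = m.group(1).split()
            continue
        # Resource lines look like: name (agent): State node
        m = re.match(r"(\S+)\s+\((\S+)\):\s+(\S+)(?:\s+(\S+))?", line)
        if m:
            name, agent, state, node = m.groups()
            resources[name] = {"agent": agent, "state": state, "node": node}
    return online, resources
```

A monitoring script could then alert if shared_ip_one is not "Started" or if fewer than two nodes are online.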
Configuration
Install these packages:
cifs-utils-4.8.1-2.el6.x86_64.rpm
cluster-glue-1.0.5-2.el6.x86_64.rpm
cluster-glue-libs-1.0.5-2.el6.x86_64.rpm
corosync-1.2.3-36.el6.x86_64.rpm
corosynclib-1.2.3-36.el6.x86_64.rpm
corosynclib-devel-1.2.3-36.el6.x86_64.rpm
heartbeat-3.0.4-1.el6.x86_64.rpm #from epel repo
heartbeat-libs-3.0.4-1.el6.x86_64.rpm #from epel repo
keyutils-1.4-1.el6.x86_64.rpm
libibverbs-1.1.4-2.el6.x86_64.rpm
libmlx4-1.0.1-7.el6.x86_64.rpm
librdmacm-1.0.10-2.el6.x86_64.rpm
libtalloc-2.0.1-1.1.el6.x86_64.rpm
libtool-ltdl-2.2.6-15.5.el6.x86_64.rpm
lm_sensors-libs-3.1.1-10.el6.x86_64.rpm
net-snmp-libs-5.5-31.el6.x86_64.rpm
pacemaker-1.1.5-5.el6.x86_64.rpm
pacemaker-cts-1.1.5-5.el6.x86_64.rpm
pacemaker-libs-1.1.5-5.el6.x86_64.rpm
perl-TimeDate-1.16-11.1.el6.noarch.rpm
PyXML-0.8.4-19.el6.x86_64.rpm
resource-agents-3.0.12-22.el6.x86_64.rpm
net-snmp-5.5-31.el6.x86_64.rpm
sudo rpm -i --nodeps \
  libvirt-0.8.7-18.el6.x86_64.rpm libvirt-client-0.8.7-18.el6.x86_64.rpm \
  numactl-2.0.3-9.el6.x86_64.rpm gnutls-utils-2.8.5-4.el6.x86_64.rpm \
  nc-1.84-22.el6.x86_64.rpm libxslt-1.1.26-2.el6.x86_64.rpm \
  netcf-libs-0.1.7-1.el6.x86_64.rpm augeas-libs-0.7.2-6.el6.x86_64.rpm \
  cyrus-sasl-md5-2.1.23-8.el6.x86_64.rpm qpid-cpp-client-0.10-3.el6.x86_64.rpm \
  boost-1.41.0-11.el6.x86_64.rpm boost-1.41.0-11.el6.x86_64.rpm \
  boost-date-time-1.41.0-11.el6.x86_64.rpm boost-python-1.41.0-11.el6.x86_64.rpm \
  boost-test-1.41.0-11.el6.x86_64.rpm \
  boost-regex-1.41.0-11.el6.x86_64.rpm \
  boost-graph-1.41.0-11.el6.x86_64.rpm boost-serialization-1.41.0-11.el6.x86_64.rpm \
  boost-wave-1.41.0-11.el6.x86_64.rpm \
  boost-iostreams-1.41.0-11.el6.x86_64.rpm \
  boost-signals-1.41.0-11.el6.x86_64.rpm ebtables-2.0.9-6.el6.x86_64.rpm \
  iscsi-initiator-utils-6.2.0.872-21.el6.x86_64.rpm libicu-4.2.1-9.el6.x86_64.rpm \
  dnsmasq-2.48-4.el6.x86_64.rpm radvd-1.6-1.el6.x86_64.rpm \
  qemu-img-0.12.1.2-2.160.el6.x86_64.rpm yajl-1.0.7-3.el6.x86_64.rpm \
  libcgroup-0.37-2.el6.x86_64.rpm libpciaccess-0.10.9-4.el6.x86_64.rpm
sudo rpm -i \
  fence-virtd-libvirt-0.2.1-8.el6.x86_64.rpm fence-virtd-0.2.1-8.el6.x86_64.rpm
sudo rpm -i libesmtp-1.0.4-15.el6.x86_64.rpm
sudo rpm -i clusterlib-3.0.12-41.el6.x86_64.rpm
sudo rpm -i openais-1.1.1-7.el6.x86_64.rpm \
  openaislib-1.1.1-7.el6.x86_64.rpm
sudo rpm -i pexpect-2.3-6.el6.noarch.rpm
sudo rpm -i perl-Net-Telnet-3.03-11.el6.noarch.rpm
sudo rpm -i cman-3.0.12-41.el6.x86_64.rpm \
  fence-virt-0.2.1-8.el6.x86_64.rpm fence-agents-3.0.12-23.el6.x86_64.rpm \
  net-snmp-utils-5.5-31.el6.x86_64.rpm ricci-0.16.2-35.el6.x86_64.rpm \
  sg3_utils-1.28-3.el6.x86_64.rpm sg3_utils-libs-1.28-3.el6.x86_64.rpm \
  oddjob-0.30-5.el6.x86_64.rpm nss-tools-3.12.9-9.el6.x86_64.rpm \
  nss-tools-3.12.9-9.el6.x86_64.rpm modcluster-0.16.2-10.el6.x86_64.rpm
sudo rpm -i pacemaker-1.1.5-5.el6.x86_64.rpm \
  pacemaker-cts-1.1.5-5.el6.x86_64.rpm pacemaker-libs-1.1.5-5.el6.x86_64.rpm
Create /etc/corosync/corosync.conf:
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
    version: 2
    secauth: off
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 10.x.x.x
        #mcastaddr: 226.94.1.1
        broadcast: yes
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: on
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
amf {
    mode: disabled
}
#end of file
############################
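On bindnetaddr: the corosync.conf(5) man page describes it as the network address of the interface to bind to, i.e. the interface IP masked with its netmask, not the host IP itself. A small sketch for working that out with Python's ipaddress module. The real addresses are redacted above, so the 10.0.1.23 used in the usage note is made up, and the helper names are mine:

```python
import ipaddress


def bind_net_addr(ip, netmask):
    """Return the network address for an interface IP and dotted netmask."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return str(iface.network.network_address)


def prefix_length(netmask):
    """Return a dotted netmask as a CIDR prefix length."""
    return ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen
```

For a hypothetical interface 10.0.1.23 with netmask 255.255.254.0, bind_net_addr gives 10.0.0.0 and prefix_length gives 23.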
run:
crm configure
paste the below into the new shell:
primitive shared_ip_one IPaddr params ip=10.x.x.0 cidr_netmask="255.255.254.0" nic="bond0"
property stonith-enabled="false"
location share_ip_one_master shared_ip_one 100: myserver1
monitor shared_ip_one 20s:10s
commit
exit
Tuesday, 10 July 2012
Monday, 9 July 2012
Pentaho PDI (Kettle) - notes to self
Starters
The Pentaho download page majors on the commercial versions.
(not sure whether the Community Edition (CE) comes with the commercial version)
Scroll down to Community Projects to find the open source version.
Spoon - the GUI where one designs, develops and tests ETL graphs.
So remember to run spoon.sh (spoon.bat) to fire up the environment rather than simply clicking on the "Data Integration 64-bit" application (doing that left the necessary libext JDBC libraries unavailable and caused a few of the errors below).
http://wiki.pentaho.com/display/EAI/02.+Spoon+Introduction
Problem with MySQL connection and PDI v.4.3
Error connecting to database [mytest_mysql] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database
Exception while loading class
org.gjt.mm.mysql.Driver
...
Caused by: java.lang.ClassNotFoundException: org.gjt.mm.mysql.Driver
To resolve this problem, read the issue log here; the fix requires you to download MySQL Connector/J from here
$ tar xvzf mysql-connector-java-5.1.21.tar.gz mysql-connector-java-5.1.21/mysql-connector-java-5.1.21-bin.jar
$ cp -ip /downloads/mysql-connector-java-5.1.21/mysql-connector-java-5.1.21-bin.jar /usr/local/pentaho/data-integration/libext/JDBC/
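The ClassNotFoundException just means that no JAR on PDI's classpath provides that class. Since a JAR is a zip archive, a quick way to confirm the copied connector really contains the driver is to look for the class entry. A sketch only; the helper name is mine:

```python
import zipfile


def jar_provides_class(jar_path, class_name):
    """Return True if the JAR contains the .class file for class_name."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, call it with the path of the JAR copied into libext/JDBC above and the class name "org.gjt.mm.mysql.Driver" from the error message.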
mysql misc - notes to self
installed mysql on mac
lazily running as root
needing to mkdir /var/run/mysqld and chmod 777 /var/run/mysqld
(missing something obviously)
Came across this useful document for installing mysql on mac after following my nose.
To connect to mysql using perl DBI/DBD
To load data into mysql
how to run an SQL command in a file from within mysql
mysql> source mysqlcmds.sql
how to run an SQL command from the command line
mysql < mysqlcmds.sql
(note: the database name can be left off the command line if the first line in the mysqlcmds.sql file is a use statement)
Self consuming mysql sql script in shell script
cat load_myfile.sh
#!/bin/bash
MYFILE=/mypath/myfile.dat
mysql --user=myuser --password=xyz <<EOF
use mytest;
load data local infile '${MYFILE}'
replace
into table mytest.mytable
character set UTF8
fields terminated by '|';
EOF
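The same idea sketched in Python, in case the wrapper needs more logic than bash is comfortable with: build the statements as a string and pipe them to the mysql client on stdin. Assumptions: credentials come from an option file such as ~/.my.cnf rather than the command line, and the function names and defaults are mine. There is no escaping of the file name or delimiter, so this is only for trusted values:

```python
import subprocess


def build_load_sql(data_file, database="mytest", table="mytable", delimiter="|"):
    """Build the same statements the heredoc feeds to the mysql client."""
    return (
        f"use {database};\n"
        f"load data local infile '{data_file}'\n"
        "replace\n"
        f"into table {database}.{table}\n"
        "character set UTF8\n"
        f"fields terminated by '{delimiter}';\n"
    )


def run_mysql(sql, user="myuser"):
    """Pipe SQL into the mysql command-line client on stdin."""
    subprocess.run(["mysql", f"--user={user}"], input=sql, text=True, check=True)
```

Usage would be run_mysql(build_load_sql("/mypath/myfile.dat")).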
If you are getting the following error, it could be that you are missing the "local" keyword (if you are providing a full path to the file):
$ ./load_myfile.sh
ERROR 13 (HY000) at line 3: Can't get stat of '/mypath/myfile.dat' (Errcode: 13)
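Why "local" matters here: Errcode 13 is the operating system's EACCES (permission denied). Without LOCAL the server process opens the file itself, so the account mysqld runs as needs read access to it; with LOCAL the client reads the file and sends it over the connection. A small sketch for decoding the error number and checking that the current user can read the file (helper names are mine):

```python
import errno
import os


def explain_errcode(code):
    """Translate an OS error number, as printed in 'Errcode: N', to text."""
    return f"{errno.errorcode.get(code, 'UNKNOWN')}: {os.strerror(code)}"


def readable_by_me(path):
    """True if the current user can read path (what LOCAL relies on)."""
    return os.access(path, os.R_OK)
```

explain_errcode(13) names the error, and readable_by_me("/mypath/myfile.dat") shows whether the client side would be able to read the data file.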
Thursday, 5 July 2012
Big Data articles
Big Data related articles
Datasift - Tweet related apps
Doug Cutting on Microsoft/Oracle's move in Big Data arena (Cloudera vs Hortonworks too)
http://www.theregister.co.uk/2012/06/27/doug_cutting_hadoop_interview/
Big Data Security
Securosis' Big Data Architectural Issues
Securosis' Securing Big Data recommendations white paper
Cassandra vs HBASE
http://bigdatanoob.blogspot.co.uk/2012/11/hbase-vs-cassandra.html
Interesting hackathon result
http://www.opencompute.org/blog/ocp-hackathon-winner-adaptive-storage/