Any faster alternative to #Hadoop HDFS?

I’d like an alternative to Hadoop HDFS: a faster filesystem that is not written in Java. Candidates: S3 (S3 Support in Apache Hadoop), if your servers are hosted on Amazon AWS; Ceph (Using Hadoop with Ceph); GlusterFS (Managing Hadoop Compatible Storage); Lustre (Running Hadoop with Lustre); OpenStack Swift (Hadoop OpenStack Support: Swift Object Store); XtreemFS (there is a Hadoop client). Which is better? Any suggestions? References: [1] https://en.wikipedia.org/wiki/Comparison_of_distributed_file_systems

November 17, 2016 · 1 min · 66 words · Matteo Redaelli

About Cayley, a scalable graph database

This is a quick tutorial on using the Cayley graph database (with MongoDB as the backend). Cayley is “not a Google project, but created and maintained by a Googler, with permission from and assignment to Google, under the Apache License, version 2.0”. Download and unzip a binary distribution, then edit cayley.cfg: { "database": "mongo", "db_path": "cayley.redaelli.org:27017", "read_only": false, "host": "0.0.0.0" } Then run ./cayley init -config=cayley.cfg followed by ./cayley http -config=cayley.cfg -host="0.0.0.0" & create a file demo....
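The cayley.cfg settings quoted in the excerpt can also be produced programmatically, which avoids hand-editing mistakes such as stray escape characters in key names. The following is a minimal Python sketch, assuming the config file is plain JSON as the excerpt suggests; the `render_config` helper is hypothetical and not part of Cayley, and the values are the ones from the post.

```python
import json

# Settings copied from the excerpt; adjust db_path to your own MongoDB host.
config = {
    "database": "mongo",
    "db_path": "cayley.redaelli.org:27017",
    "read_only": False,
    "host": "0.0.0.0",
}


def render_config(cfg: dict) -> str:
    """Serialize the settings as indented JSON text (hypothetical helper)."""
    return json.dumps(cfg, indent=2)


if __name__ == "__main__":
    # Print the JSON; redirect it to cayley.cfg to use it with -config=cayley.cfg.
    print(render_config(config))
```

Running the script and redirecting its output to cayley.cfg would give a file usable with the `-config=cayley.cfg` flag shown in the post.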

August 3, 2015 · 1 min · 123 words · Matteo Redaelli

Archlinux and Docker for my Raspberry PI2

What is the best Linux distribution for the Raspberry Pi 2? I started with Raspbian (Debian is my preferred Linux distribution for servers, desktops and laptops), but Docker didn’t work. With Arch Linux it works fine. To create Docker images with Arch Linux on the RPi 2, see http://linucc.github.io/docker-arch-rpi2/ (matteoredaelli/docker-karaf-rpi is the first Docker image I have created). Below is my docker info output: [root@raspi1 ~]# docker info Containers: 4 Images: 9 Storage Driver: aufs Root Dir: /var/lib/docker/aufs Backing Filesystem: extfs Dirs: 17 Execution Driver: native-0....

March 28, 2015 · 1 min · 154 words · Matteo Redaelli

A case study of adopting Bigdata technologies in your company

Big data projects can be very expensive and can easily fail: I suggest starting with a small, useful but non-critical project, ideally one about unstructured data collection and batch processing. That way you have time to get practice with the new technologies, and downtime of the Apache Hadoop system is not critical. At home I have the following system running on a small Raspberry Pi: for sure it is not fast ;-) At work I introduced Hadoop just a few months ago for collecting web data and generating daily reports....

March 13, 2015 · 1 min · 93 words · Matteo Redaelli

How to set up a multi-node Hadoop cluster

Read the interesting articles by Michael G. Noll: Running Hadoop on Ubuntu Linux (Single-Node Cluster) and Running Hadoop on Ubuntu Linux (Multi-Node Cluster).

March 7, 2013 · 1 min · 22 words · Matteo Redaelli