Apache Spark: how to import data from a JDBC database using Python

Using Apache Spark 2.0 and Python, I’ll show how to import a table from a relational database (using its JDBC driver) into a Python DataFrame and save it as a Parquet file. In this demo the database is an Oracle 12.x. File `jdbc-to-parquet.py`:

```python
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .getOrCreate()

# Read the table over JDBC into a DataFrame
df = spark.read.format("jdbc").options(
    url="jdbc:oracle:thin:ro/ro@mydboracle.redaelli.org:1521:MYSID",
    dbtable="myuser.dim_country",
    driver="oracle.jdbc.OracleDriver",
).load()

# Save the DataFrame as a Parquet file
df.write.parquet("country.parquet")
```

October 27, 2016 · 1 min · 66 words · Matteo Redaelli

How to quickly set up an interface among systems using Apache Camel / Karaf (OSGi)

In the article Building system integrations with Apache Camel I’ll show how to create, in 10 minutes, an integration between two databases (without writing a single line of Java or C# code): looking up users in the MOODLE database (MySQL) that are missing some attributes, retrieving those missing attributes from the UPMS database (MS SQL Server), and then adding them back to the MOODLE database. I’ll use...

July 25, 2012 · 1 min · 85 words · Matteo Redaelli