This month we have Tom Cooper (@tomncooper) to tell us about Apache Spark:
Apache Spark (http://spark.apache.org/) is rapidly becoming the go-to choice for large cluster processing. From log analysis to machine learning, Spark provides a fast and fault-tolerant infrastructure for batch and stream processing. Most importantly, it has a well-supported Python API, so all you need is a little Python knowledge (and a bit of functional programming) and you can quickly design programs for analysing terabytes of data over thousands of machines!
Tom is a PhD student at Newcastle University working on optimising event processing systems that use programs such as Spark. He will start with an introduction to what Spark is, talk about the innovations it uses and the powerful built-in libraries it provides, and give examples of how Python makes using Spark really straightforward.
The talk won’t be all academic: at the end there will be a code dojo where you will be able to try a few different Spark examples. However, to make this dojo work you will need to have a copy of Spark on your laptop. Please download the archive at this link: http://mirrors.ukfast.co.uk/sites/ftp.apache.org/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz
Pizza will be courtesy of our kind sponsors Sharpe Recruitment (@sharperecruit) and Pebble (@mypebble), so get there early to get a slice! Once again we will be hosted by the ace folks at Campus North (@campusnorthuk).
Campus North is down the hill from the Body Zone Gym and the Cornerstone Cafe on the corner of Carliol Square. Please ring the doorbell, or tweet @PythonNorthEast, to get let in.
Hope to see you there and please spread the word.
The meetup needs at least 15 people signed up in order to go ahead.
5 Carliol Square, Newcastle upon Tyne NE1, United Kingdom
An event page by Python North East