Functional programming in Python with Toolz and fn.py


In my brief experience people rarely take this [streaming] route. They use single-threaded in-memory Python until it breaks, and then seek out Big Data Infrastructure like Hadoop/Spark at relatively high productivity overhead. ~ Matt Rocklin

That quote succinctly summarises my computational life, right up until recent months.

In “traditional” programming, you load a dataset into memory, process it in some way, and output the result. This is simple to understand. But in streaming programs, a function processes some of the data, yields the processed chunk, then downstream functions deal with that chunk, then the original function receives a bit more, and so on… All these things are going on at the same time! How can one keep them straight?
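One way to keep them straight is to watch the order of events in a tiny pipeline. The following is a minimal sketch using only plain Python generators; the function names and the `log` list are hypothetical, introduced purely to record what happens when.

```python
def read_numbers(lines, log):
    """Lazily parse one integer per line, recording each read."""
    for line in lines:
        log.append(f"read {line.strip()}")
        yield int(line)


def square(numbers, log):
    """Lazily square each incoming number, recording each step."""
    for n in numbers:
        log.append(f"square {n}")
        yield n * n


def total(numbers):
    """Consume the stream and add it up; this is what drives the pipeline."""
    result = 0
    for n in numbers:
        result += n
    return result


log = []
lines = ["1\n", "2\n", "3\n"]  # stand-in for a file too big to load at once
result = total(square(read_numbers(lines, log), log))

print(result)  # 14
print(log)     # reads and squares alternate, one element at a time
```

Nothing runs until `total` starts asking for values, and only one element is in flight at any moment, so memory use stays constant however long the input is.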

This talk will introduce Matt Rocklin’s Toolz library, which makes functional programming easy in Python and provides a framework for writing elegant, concise code to analyse bigger-than-memory data, and fn.py, which offers even more FP constructs. I’ll present streaming data analysis using FP from the ground up, starting with a simple “hello-world” example and moving on to image illumination correction, streaming extensions to scikit-learn classifiers, and analysing a genome in a few minutes.
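To give a flavour of the style, here is a rough sketch of counting k-mers (overlapping substrings of length k) in FASTA-style sequence data with Toolz. It assumes the `toolz` package is installed. The helper names, the toy input, and the choice of k are illustrative assumptions, not the code from the talk.

```python
import toolz as tz
from toolz import curried as c


def is_sequence(line):
    """Return True for sequence lines, False for FASTA header lines."""
    return not line.startswith('>')


def count_kmers(lines, k=3):
    """Count k-mers across an iterable of lines without loading them all."""
    return tz.pipe(
        lines,
        c.filter(is_sequence),  # drop header lines
        c.map(str.rstrip),      # strip trailing newlines
        tz.concat,              # flatten lines into one stream of characters
        c.sliding_window(k),    # overlapping tuples of k characters
        c.map(''.join),         # turn each tuple back into a string
        tz.frequencies,         # consume the stream into a dict of counts
    )


# Toy input standing in for an open file handle (hypothetical data).
toy_lines = ['>toy_sequence\n', 'ACGT\n', 'ACG\n']
kmer_counts = count_kmers(toy_lines, k=3)
print(kmer_counts)
```

Every step except the final `tz.frequencies` is lazy, so passing an open file object in place of `toy_lines` would process the data one line at a time.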

Juan Nunez-Iglesias

Juan is a research scientist at the University of Melbourne’s Victorian Life Sciences Computation Initiative. His undergraduate degree was in biomedical science, but he slowly veered towards a career in computational biology and software engineering. He is a core developer of the scikit-image library, a teacher of scientific programming, and the author of an upcoming book about the SciPy library. He is also the author of the (decidedly non-functional!) gala image segmentation library.

GitHub: jni
Twitter: jnuneziglesias
Blog: I Love Symposia!


Geelong 2016

linux.conf.au