How To Build a Data Processing Pipeline Using Luigi in Python on Ubuntu 20.04
Luigi is a Python package that manages long-running _batch processing_, which is the automated running of data processing jobs on batches of items. Luigi allows you to define a data processing job as a set of dependent tasks. In this tutorial, you will use Luigi tasks, targets, dependencies, and parameters to build a data processing pipeline that analyzes the most common words in the most popular books on Project Gutenberg.