In this article I explain a scenario I have been working on for the past year: translating code from Pandas to PySpark to reduce run times and make scripts more efficient in a production environment. Let’s start by understanding these technologies...