SQL vs Pandas

Ramesh Ponnusamy
1 min read · Jun 28, 2024

Many new data engineers and ML engineers finish their Python/ML courses believing that SQL is unnecessary because Python can handle most tasks. SQL still has distinct strengths, though, especially for debugging, generating complex reports, and ensuring data accuracy.

I was given a task to convert 30 SQL queries, with a lot of complex logic, into pandas for a report. During the conversion I ran into these data integrity issues:

  1. Union of columns with different data types
  2. Union of inputs with different numbers of columns
  3. Row number partitioned by columns that contain nulls

I assumed these operations would behave the same in pandas as in SQL, but I was wrong. I had to handle null values manually and make sure the number of columns matched before each union.
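To make the difference concrete, here is a minimal sketch of the kind of guards this calls for. It is an illustration, not the code from the actual report: the helper names `strict_union_all` and `row_number` and the sample data are hypothetical. It relies on two pandas defaults: `pd.concat` aligns frames by column name and fills any missing columns with NaN instead of failing, and `groupby` drops rows whose key is null unless `dropna=False` is passed. In SQL, a union matches columns by position and typically rejects a column-count mismatch (type handling varies by engine), and `ROW_NUMBER()` treats nulls in the partition key as one partition.

```python
import pandas as pd


def strict_union_all(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: a UNION ALL that fails loudly, like a strict SQL engine."""
    # pd.concat would silently add NaN-filled columns here, so check first.
    if left.shape[1] != right.shape[1]:
        raise ValueError(
            f"column count mismatch: {left.shape[1]} vs {right.shape[1]}"
        )
    # Compare dtypes position by position, as SQL matches union columns.
    mismatched = [
        name
        for name, l_type, r_type in zip(left.columns, left.dtypes, right.dtypes)
        if l_type != r_type
    ]
    if mismatched:
        raise TypeError(f"dtype mismatch in columns: {mismatched}")
    # SQL takes the column names from the first query.
    right = right.copy()
    right.columns = left.columns
    return pd.concat([left, right], ignore_index=True)


def row_number(df: pd.DataFrame, partition_by: list, order_by: list) -> pd.Series:
    """Hypothetical helper: ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...).

    Assumes df has a unique index so the result can be aligned back to it.
    """
    ordered = df.sort_values(order_by, kind="stable")
    # dropna=False keeps null keys as their own partition instead of dropping them.
    numbers = ordered.groupby(partition_by, dropna=False, sort=False).cumcount() + 1
    return numbers.reindex(df.index)


# Sample data (made up for illustration).
orders = pd.DataFrame(
    {"region": ["east", None, "east", None], "amount": [10, 30, 20, 40]}
)
orders["rn"] = row_number(orders, ["region"], ["amount"])

ids_a = pd.DataFrame({"id": [1, 2]})
ids_b = pd.DataFrame({"key": [5]})
combined = strict_union_all(ids_a, ids_b)
```

With the sample data, the two rows whose `region` is null are numbered as their own partition rather than being left out, and `combined` keeps the single `id` column from the first frame.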

SQL has been a trusted part of data management for 50 years. Although there are many programming languages, SQL remains important for its ability to query and manage databases effectively.


Written by Ramesh Ponnusamy

Data Architect, SQL Master, Python, Django, Flask dev, AI prompting. LinkedIn: https://www.linkedin.com/in/ramesh-ponnusamy/ Mail: ramramesh1374@gmail.com