Welcome to the community and thanks for your detailed question!
It could be possible that the error is due to the way you’re merging multiple DataFrames in pandas. In pandas, you can’t merge more than two DataFrames at once using a single pd.merge() call. Instead, you should perform the joins step-by-step, similar to chaining SQL joins. According to this discussion, you should first merge employees and departments on emp_id, then merge the result with itemtab on item_id.
Here’s a concise example based on your data:
import pandas as pd
# Your DataFrames
employees = pd.DataFrame({
'emp_id': [1, 2, 3],
'name': ['Alice', 'Bob', 'Charlie']
})
departments = pd.DataFrame({
'dept_id': [101, 102, 103],
'emp_id': [2, 3, 4],
'item_id': [2, 3, 4],
'dept_name': ['HR', 'IT', 'Finance']
})
itemtab = pd.DataFrame({
'dept_id': [101, 102, 103, '', '', ''],
'item_id': [1, 2, 3, 4, 5, 6],
'item_name': ['computer', 'car', 'bus', 'home', 'flate', 'treain']
})
# Stepwise inner joins
result = pd.merge(employees, departments, on='emp_id', how='inner')
result = pd.merge(result, itemtab, on='item_id', how='inner')
print(result)
This approach should give you the equivalent of your Oracle SQL inner join. If you want more detail or a breakdown of each step, just let me know!
Sources: