Pandas borrows a core concept from SQL:
the Join
But there are so many different types of joining two DataFrames
Let’s make this easy and go through Left, Right, Inner and Outer joins
Knowing these is great in interviews in addition to your usual DataFrame shenanigans
Joins always happen on two DataFrames.
This explanation will use: ⬛
- : The left DataFrame
- : The right DataFrame
- ⬛: The result
Don’t get the word “join” wrong though, you can actually end up with a smaller DataFrame ⬛ than either or
⬅️ The Left Join is selfish
This one takes the complete left DataFrame and only checks for overlaps from the right
No from outside of the bounds of will make it into ⬛
️ The Right Join is almost the same as Left
Only take everything in and overlapping
⤵️ The Inner Join
This one is tricky.
Almost always ⬛ will be smaller than &.
For the Inner join, you only look at the parts of and that overlap.
Nothing is included in ⬛ that exists outside of this common area.
↔️ The Outer Join
Is possibly the simples one.
It is exactly what we would expect from a “join”.
Take all of and all of and combine it into ⬛.
All of the data is in our result.
TL;DR
- Pandas borrows from SQL using Joins
- Left and Right join maintain the original and whatever overlaps in the other
- Inner is only the common ground
- Outer uses all the data in both DataFrames
原文链接:Pandas borrows a core concept from SQL and these 3 emoji tell you exactly how to use it
暂无评论内容