daft.DataFrame.join
- DataFrame.join(other: DataFrame, on: Optional[Union[List[Union[Expression, str]], Expression, str]] = None, left_on: Optional[Union[List[Union[Expression, str]], Expression, str]] = None, right_on: Optional[Union[List[Union[Expression, str]], Expression, str]] = None, how: str = 'inner', strategy: Optional[str] = None) -> DataFrame
Column-wise join of the current DataFrame with another DataFrame (other), similar to a SQL JOIN.
Note
Although self joins are supported, we currently duplicate the logical plan for the right side and recompute the entire tree. Caching for this is on the roadmap.
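To make the inner-join semantics concrete, here is a minimal pure-Python sketch over column-oriented tables (dicts of equal-length lists). It is not Daft's implementation: the helper name inner_join and the sample tables are hypothetical, and the sketch assumes that non-key column names do not collide between the two sides. In Daft, the corresponding call would take the form left.join(right, on="id").

```python
from collections import defaultdict


def inner_join(left, right, left_on, right_on):
    """Inner-join two column-oriented tables (dicts of equal-length lists).

    Hypothetical helper for illustration; assumes non-key column names
    are distinct across the two tables.
    """
    # Build a hash index over the right side: key value -> row positions.
    index = defaultdict(list)
    for pos, key in enumerate(right[right_on]):
        index[key].append(pos)

    # The output keeps every left column plus the right non-key columns.
    right_cols = [c for c in right if c != right_on]
    out = {c: [] for c in list(left) + right_cols}

    # Probe the index with each left row; emit one output row per match.
    for lpos, key in enumerate(left[left_on]):
        for rpos in index.get(key, []):
            for c in left:
                out[c].append(left[c][lpos])
            for c in right_cols:
                out[c].append(right[c][rpos])
    return out


users = {"id": [1, 2, 3], "name": ["ann", "bob", "cy"]}
scores = {"id": [2, 3, 4], "score": [20, 30, 40]}

joined = inner_join(users, scores, left_on="id", right_on="id")
print(joined)  # {'id': [2, 3], 'name': ['bob', 'cy'], 'score': [20, 30]}
```

Only keys present on both sides (2 and 3) survive, which is the behavior of how="inner".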
- Parameters:
other (DataFrame) – the right DataFrame to join on.
on (Optional[Union[List[ColumnInputType], ColumnInputType]], optional) – key or keys to join on; use this when the keys on the left and right sides match. Defaults to None.
left_on (Optional[Union[List[ColumnInputType], ColumnInputType]], optional) – key or keys to join on in the left DataFrame. Defaults to None.
right_on (Optional[Union[List[ColumnInputType], ColumnInputType]], optional) – key or keys to join on in the right DataFrame. Defaults to None.
how (str, optional) – the type of join to perform; currently only inner is supported. Defaults to “inner”.
strategy (Optional[str]) – the join strategy (algorithm) to use; currently “hash”, “sort_merge”, “broadcast”, and None are supported, where None chooses the join strategy automatically during query optimization. Defaults to None.
- Raises:
ValueError – if on is passed in and left_on or right_on is not None.
ValueError – if on is None but both left_on and right_on are not defined.
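The two ValueError conditions can be sketched as a small key-resolution step. The helper resolve_join_keys below is hypothetical and is not part of Daft's API. It also assumes that the second condition fires whenever either left_on or right_on is missing, which is one reading of the rule above.

```python
def resolve_join_keys(on=None, left_on=None, right_on=None):
    """Return (left_keys, right_keys) or raise ValueError.

    Hypothetical helper mirroring the documented rules; not Daft's code.
    """
    if on is not None:
        # `on` is mutually exclusive with the per-side key arguments.
        if left_on is not None or right_on is not None:
            raise ValueError("pass either `on` or `left_on`/`right_on`, not both")
        return on, on
    # Without `on`, this sketch requires both sides to name their keys.
    if left_on is None or right_on is None:
        raise ValueError("when `on` is None, both `left_on` and `right_on` are required")
    return left_on, right_on


print(resolve_join_keys(on="id"))
print(resolve_join_keys(left_on="user_id", right_on="id"))

try:
    resolve_join_keys(on="id", left_on="user_id")
except ValueError as exc:
    print("rejected:", exc)
```

Resolving on into a matching pair of left and right keys up front means the rest of a join routine only has to handle the left_on/right_on form.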
- Returns:
Joined DataFrame.
- Return type: