The way to apply a function to pandas data structures is not always obvious: several methods exist (apply, applymap, map) and their scopes differ.
First, there are two main structures (fortunately I’m not talking about
Series: one-dimensional labeled array
DataFrame: 2-dimensional labeled data structure
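As a quick illustration of the two structures, here is a minimal sketch. The labels and values (a, b, c, x, y) are made up for the example.

```python
import pandas as pd

# A Series: one-dimensional values with an index of labels.
s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# A DataFrame: two-dimensional, with labeled rows and labeled columns.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]}, index=["a", "b", "c"])

# Selecting a single column of a DataFrame gives back a Series.
col = df["x"]

print(s.ndim, df.ndim)  # number of dimensions of each structure
```

The fact that a single column (or row) of a DataFrame is itself a Series matters for the rest of the discussion.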
The apply, applymap and map methods can work in different ways.
Element-wise
The function is called (mapped) for each individual element (value), so it takes the element (each distinct value) as its parameter.
Series: map can be used with a dict, a function, or another Series.
DataFrame: applymap is equivalent to calling map on all columns of the DataFrame.
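The element-wise behaviour can be sketched as follows. The data is invented for the example, and the version check is an assumption on my side: recent pandas releases (2.1 and later) offer DataFrame.map as the new name for applymap, so the sketch picks whichever one is available.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "cat"])

# map with a function: the function is called once per value.
lengths = s.map(len)

# map with a dict: each value is looked up as a key.
# Values missing from the dict would become NaN.
sounds = s.map({"cat": "meow", "dog": "woof"})

# map with a Series: each value is looked up in the other Series' index.
legs = s.map(pd.Series({"cat": 4, "dog": 4}))

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# Element-wise on a DataFrame. Use DataFrame.map where it exists,
# otherwise fall back to applymap (older pandas versions).
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(lambda v: v ** 2)

# The same result, built by calling Series.map on every column.
per_column = df.apply(lambda column: column.map(lambda v: v ** 2))
```

Both squared and per_column hold the squared values, which is what "equivalent to calling map on all columns" means in practice.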
By row / column
The function is called (applied) for an entire row or column, so it takes a row or a column as its parameter, in other words a Series.
apply is a method of DataFrame that can be called with an axis parameter indicating whether to apply the function to each column (axis=0) or to each row (axis=1).
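A small sketch of the axis parameter, using made-up data and a hypothetical helper named value_range:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

def value_range(values):
    # `values` is a Series: one whole column or one whole row.
    return values.max() - values.min()

# axis=0 (the default): the function receives each column in turn.
per_column = df.apply(value_range, axis=0)

# axis=1: the function receives each row in turn.
per_row = df.apply(value_range, axis=1)
```

Here per_column has one entry per column (indexed by x and y), while per_row has one entry per row.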
apply can also be used with a Series: when given a NumPy universal function (ufunc), it operates on the entire array at once, so it is not working element-wise. When given a standard Python function, however, it works element-wise.
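The difference between the two cases can be observed with a short sketch. The describe function and the seen_types list are hypothetical helpers, added only to record what the function receives on each call.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0])

seen_types = []

def describe(value):
    # Record the type of whatever the function is handed on each call.
    seen_types.append(type(value).__name__)
    return value * 2

# A NumPy ufunc is handed the whole Series in a single call.
roots = s.apply(np.sqrt)

# A plain Python function is called once per element.
doubled = s.apply(describe)

print(len(seen_types))  # one recorded call per element of the Series
```

After running this, seen_types contains one entry per element and none of them is a Series, which confirms that the plain function was applied element-wise.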
In short:
apply works on a row / column of a DataFrame
applymap works element-wise on a DataFrame
map (and apply in most cases) works element-wise on a Series
References / Further reading
- Wes McKinney, Python for Data Analysis (O’Reilly, 2012)
- Difference between map, applymap and apply methods in Pandas