Understanding Pandas Loc and Iloc Indexing
Pandas is a powerful library used for data manipulation and analysis in Python. Its data structures, such as Series and DataFrames, provide an efficient way to store and manipulate data. The loc and iloc indexing methods are commonly used to access specific rows and columns of a DataFrame.
In this article, we will explore the loc and iloc indexing methods in pandas, their differences, and how they can be used effectively.
Introduction
The loc indexer is label-based: it selects rows and columns by their index and column labels. The iloc indexer, on the other hand, is integer position-based: it selects rows and columns by their integer positions, counted from 0.
Let’s start with an example DataFrame:
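import numpy as np  # needed for np.nan
import pandas as pd

# shift_city_id holds known values; closest_city_id starts out empty (NaN)
d = {'shift_city_id': [1, 1, 2, 3, 3, 3, 1],
     'closest_city_id': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]}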
df = pd.DataFrame(data=d)
print(df)
Output:
   shift_city_id  closest_city_id
0              1              NaN
1              1              NaN
2              2              NaN
3              3              NaN
4              3              NaN
5              3              NaN
6              1              NaN
Loc Indexing
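The loc indexer selects by label. Besides single labels, lists of labels, and label slices, it also accepts a boolean mask aligned with the index, which is what the assignment example in this section relies on. The syntax for loc indexing is: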
df.loc[row_label, column_labels]
In our example DataFrame, we can use loc indexing to access specific rows and columns.
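As a minimal sketch of reading values with loc, the snippet below rebuilds the example DataFrame so that it runs on its own. The variable names single, subset, and masked are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'shift_city_id': [1, 1, 2, 3, 3, 3, 1],
    'closest_city_id': [np.nan] * 7,
})

# Single cell: row label 2, column label 'shift_city_id'
single = df.loc[2, 'shift_city_id']

# Label slice: both endpoints are included, so this selects rows 1, 2 and 3
subset = df.loc[1:3, ['shift_city_id']]

# Boolean mask: keeps only the rows where the condition is True
masked = df.loc[df['shift_city_id'] == 3, 'shift_city_id']

print(single)
print(subset)
print(masked)
```

Note that the label slice 1:3 includes the row labelled 3, which differs from ordinary Python slicing.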
Let’s assign values to closest_city_id in the rows where shift_city_id equals 3 using loc indexing:
# Assigning values using loc indexing
df.loc[(df['shift_city_id'] == 3), 'closest_city_id'] = [3,4,5]
print(df)
Output:
   shift_city_id  closest_city_id
0              1              NaN
1              1              NaN
2              2              NaN
3              3              3.0
4              3              4.0
5              3              5.0
6              1              NaN
As you can see, loc updated closest_city_id only in the three rows where shift_city_id equals 3 (labels 3, 4, and 5); every other row is still NaN.
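A related detail worth sketching: when assigning through a boolean mask, a scalar is broadcast to every selected row, whereas a list must contain exactly as many items as there are selected rows. The snippet rebuilds the example DataFrame so that it runs on its own; the names mask and length_error are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'shift_city_id': [1, 1, 2, 3, 3, 3, 1],
    'closest_city_id': [np.nan] * 7,
})
mask = df['shift_city_id'] == 3  # True for three rows

# A list with the wrong number of items is rejected with a ValueError
try:
    df.loc[mask, 'closest_city_id'] = [3, 4]
    length_error = False
except ValueError:
    length_error = True

# A scalar is broadcast to every selected row
df.loc[mask, 'closest_city_id'] = 9

print(length_error)
print(df)
```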
Iloc Indexing
The iloc indexer is integer position-based: it selects rows and columns by their integer positions, counted from 0, regardless of their labels. The syntax for iloc indexing is:
df.iloc[row_position, column_position]
In our example DataFrame, we can use iloc indexing to access specific rows and columns.
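As a minimal sketch of reading values with iloc, the snippet below rebuilds the example DataFrame so that it runs on its own. The variable names single, subset, and last_row are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'shift_city_id': [1, 1, 2, 3, 3, 3, 1],
    'closest_city_id': [np.nan] * 7,
})

# Single cell: third row, first column (positions start at 0)
single = df.iloc[2, 0]

# Position slice: the end is excluded, so this selects positions 1 and 2
subset = df.iloc[1:3, [0]]

# Negative positions count from the end: this is the last row
last_row = df.iloc[-1]

print(single)
print(subset)
print(last_row)
```

Unlike a loc label slice, the position slice 1:3 stops before position 3, as ordinary Python slicing does.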
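Let’s assign the same values using iloc indexing. Unlike loc, iloc does not accept a boolean Series as a row selector, so the rows have to be given by position. The rows where shift_city_id equals 3 sit at positions 3, 4, and 5, and closest_city_id is the column at position 1:
# Assigning values using iloc indexing
# Rows at positions 3 to 5 (the slice end, 6, is excluded), column at position 1
df.iloc[3:6, 1] = [3, 4, 5]
print(df)
Output:
   shift_city_id  closest_city_id
0              1              NaN
1              1              NaN
2              2              NaN
3              3              3.0
4              3              4.0
5              3              5.0
6              1              NaN
As you can see, iloc updated closest_city_id in the rows at positions 3 to 5. Because this DataFrame uses the default integer index, those positions coincide with the labels 3, 4, and 5, so the result matches the loc example.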
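When the target rows are known only through a condition rather than by position, one possible approach is to translate the condition into integer positions first. The sketch below assumes NumPy's flatnonzero for the row positions and Index.get_loc for the column position; the names row_positions and col_position are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'shift_city_id': [1, 1, 2, 3, 3, 3, 1],
    'closest_city_id': [np.nan] * 7,
})

# Integer positions of the rows where the condition holds
row_positions = np.flatnonzero(df['shift_city_id'] == 3)

# Integer position of the target column
col_position = df.columns.get_loc('closest_city_id')

# Purely positional assignment
df.iloc[row_positions, col_position] = [3, 4, 5]

print(row_positions)
print(df)
```

This keeps the selection positional throughout, which can matter when the index labels are not the default 0, 1, 2, and so on.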
Why Loc and Iloc Are Different
The main difference between loc and iloc indexing is how they handle row and column selection.
- The loc indexer uses label-based indexing: it selects the rows and columns whose labels match the ones you pass in, and label slices include both endpoints. This is useful when a DataFrame has meaningful or custom index labels.
- The iloc indexer uses integer position-based indexing: it selects rows and columns by their position, counted from 0, and position slices exclude the end. This is useful when you need specific rows or columns regardless of their labels.
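The difference becomes visible as soon as the index is not the default 0, 1, 2, and so on. The following sketch uses a hypothetical DataFrame named cities with the index labels 10, 20, and 30, which is not part of the original example.

```python
import pandas as pd

# Hypothetical frame with a non-default integer index
cities = pd.DataFrame({'shift_city_id': [1, 2, 3]}, index=[10, 20, 30])

by_label = cities.loc[10, 'shift_city_id']  # row whose label is 10
by_position = cities.iloc[0, 0]             # first row, first column

# Label 0 does not exist in the index, so loc raises KeyError
try:
    cities.loc[0, 'shift_city_id']
    missing_label = False
except KeyError:
    missing_label = True

# Position 10 is out of bounds for three rows, so iloc raises IndexError
try:
    cities.iloc[10, 0]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

print(by_label, by_position, missing_label, out_of_bounds)
```

Here loc[10] and iloc[0] refer to the same row, while loc[0] and iloc[10] both fail, each for its own reason.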
Assigning Values Using Loc and Iloc
When assigning values using loc and iloc, keep in mind the following:
- The loc indexer is label-based, so it updates only the rows and columns whose labels (or boolean mask entries) match the selection.
- The iloc indexer is integer position-based, so it updates only the rows and columns at the given positions, whatever their labels are.
Here’s an example of how you can use both loc and iloc to assign values:
# Assigning values using loc indexing
df.loc[(df['shift_city_id'] == 3), 'closest_city_id'] = [3,4,5]
print(df)
# Assigning values using iloc indexing
df.iloc[1:3, 1] = [4,5]
print(df)
Output:
   shift_city_id  closest_city_id
0              1              NaN
1              1              NaN
2              2              NaN
3              3              3.0
4              3              4.0
5              3              5.0
6              1              NaN
   shift_city_id  closest_city_id
0              1              NaN
1              1              4.0
2              2              5.0
3              3              3.0
4              3              4.0
5              3              5.0
6              1              NaN
As you can see, loc updated closest_city_id in the rows where shift_city_id equals 3. iloc then updated the rows at positions 1 and 2 (the slice 1:3 excludes position 3) in the column at position 1, leaving the values set by loc untouched.
Conclusion
In this article, we explored the loc and iloc indexing methods in pandas, their differences, and how they can be used effectively. We also discussed why loc and iloc are different and how to assign values using both methods. By understanding the nuances of loc and iloc indexing, you’ll be able to work more efficiently with DataFrames and manipulate your data with precision.
Last modified on 2025-04-19