Pandas is a Python library that provides data structures and functions for working with structured (tabular) data. It is built on top of NumPy and is well suited to data manipulation and analysis.
Installing Pandas
Before using Pandas, you need to install it. You can do this using pip. Open your command line or terminal and run:
pip install pandas
Importing Pandas
Once installed, you can import Pandas in your Python script or Jupyter Notebook:
import pandas as pd
Key Data Structures in Pandas
Pandas has two main data structures:
Series:
A one-dimensional labeled array capable of holding any data type.
DataFrame:
A two-dimensional labeled data structure with columns of potentially different types.
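To make the relationship between the two structures concrete, here is a minimal sketch. The column names and values are made up for illustration. It shows that selecting a single column from a DataFrame gives back a Series, and that each column keeps its own data type.

```python
import pandas as pd

# Hypothetical data, used only for illustration
frame = pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})

# Selecting one column by name returns a Series
column = frame['x']

print(type(frame).__name__)   # DataFrame
print(type(column).__name__)  # Series
print(frame.dtypes)           # one dtype per column
```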
Creating a Series
You can create a Pandas Series from a list, dictionary, or NumPy array.
import pandas as pd
# From a list
data_list = [1, 2, 3, 4]
series_from_list = pd.Series(data_list)
print(series_from_list)
# From a dictionary
data_dict = {'a': 1, 'b': 2, 'c': 3}
series_from_dict = pd.Series(data_dict)
print(series_from_dict)
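The text above also mentions NumPy arrays as a source. A small sketch of that case follows; the values and the index labels 'x', 'y', 'z' are assumptions chosen for the example. Passing index= attaches custom labels, which can then be used for lookups.

```python
import numpy as np
import pandas as pd

# From a NumPy array, with custom (illustrative) index labels
data_array = np.array([10.5, 20.5, 30.5])
series_from_array = pd.Series(data_array, index=['x', 'y', 'z'])

print(series_from_array)
print(series_from_array['y'])  # look up a value by its label
```

When a Series is built from a dictionary, the dictionary keys play the same role as the index= labels here.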
Creating a DataFrame
A DataFrame can be created from a variety of data structures.
From a dictionary of lists
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df)
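As one more example of the "variety of data structures" mentioned above, a DataFrame can also be built from a list of dictionaries, where each dictionary represents one row. The sketch below reuses the same sample people; the variable names records and df_from_records are only illustrative.

```python
import pandas as pd

# Each dictionary is one row; keys become column names
records = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'},
]
df_from_records = pd.DataFrame(records)
print(df_from_records)
```

This layout tends to be convenient when the data arrives row by row, for example from parsed JSON.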
Basic DataFrame Operations
Viewing Data
You can view the first few rows of a DataFrame (five by default) using the head() method:
print(df.head())
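Beyond the default call, a few related tools are often useful for a quick look at a DataFrame. The sketch below rebuilds the sample data so it runs on its own, then uses head() with an explicit row count, tail() for the last rows, and the shape attribute for the dimensions.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

first_two = df.head(2)  # first 2 rows instead of the default 5
last_one = df.tail(1)   # last row
print(first_two)
print(last_one)
print(df.shape)         # (number of rows, number of columns)
```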
Accessing Columns
You can access a column by its name:
print(df['Name'])
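Selecting several columns at once works in a similar way. As a short sketch with the same sample data: passing a list of column names inside the brackets returns a DataFrame, whereas a single name returns a Series.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

names = df['Name']                # single name -> Series
name_and_age = df[['Name', 'Age']]  # list of names -> DataFrame
print(name_and_age)
```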
Accessing Rows
You can access rows by integer position using the iloc indexer:
print(df.iloc[0]) # First row
print(df.iloc[1:3]) # Rows at positions 1 and 2 (the end of the slice is excluded)
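Pandas also offers loc for selecting rows by label rather than by position. The sketch below is one way to see the difference: it sets the Name column as the index (an assumption made for this example, not something the steps above require) and then looks rows up by name. Note that label slices with loc include the end label, unlike position slices with iloc.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Use the Name column as row labels for this example
by_name = df.set_index('Name')

bob_age = by_name.loc['Bob', 'Age']         # row label, column label
alice_to_bob = by_name.loc['Alice':'Bob']   # label slice, end included
first_only = by_name.iloc[0:1]              # position slice, end excluded
print(bob_age)
print(alice_to_bob)
print(first_only)
```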
Adding a New Column
You can add a new column by assigning a list whose length matches the number of rows:
df['Salary'] = [50000, 60000, 70000]
print(df)
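New columns can also be computed from existing ones. The following sketch is illustrative only: the column name Salary_After_Raise and the flat raise of 5000 are assumptions invented for the example. It shows direct assignment, which modifies the DataFrame in place, next to assign(), which returns a new DataFrame and leaves the original untouched.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Salary': [50000, 60000, 70000],
})

# assign() returns a copy with the extra column; df itself is unchanged
with_raise = df.assign(Salary_After_Raise=df['Salary'] + 5000)
print(list(df.columns))

# Direct assignment adds the column to df in place
df['Salary_After_Raise'] = df['Salary'] + 5000
print(df)
```

The arithmetic is applied to the whole column at once, so no explicit loop over rows is needed.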
Filtering Data
You can filter rows with a boolean condition; only the rows where the condition is True are kept:
# Filter rows where Age is greater than 28
filtered_df = df[df['Age'] > 28]
print(filtered_df)
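Conditions can be combined as well. In this sketch, which rebuilds the sample data so it runs on its own, & means "and" and | means "or". Each condition needs its own parentheses, because these operators bind more tightly than comparisons in Python. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Both conditions must hold
older_in_chicago = df[(df['Age'] > 28) & (df['City'] == 'Chicago')]

# At least one condition must hold
young_or_la = df[(df['Age'] < 28) | (df['City'] == 'Los Angeles')]

print(older_in_chicago)
print(young_or_la)
```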