Updated: 24 apr 2026
Imports¶
The Python libraries required for this notebook are imported below.
from typing import List
import pandas as pd
About¶
In this notebook, we will use a basic Extract-Transform-Load (ETL) workflow to load bikeshare system metadata for networks in Canada.
User Inputs¶
url = (
    "https://raw.githubusercontent.com/MobilityData/gbfs/refs/heads/master/systems.csv"
)
Below is a helper function to clean the column names in the raw data.
def clean_col_names(df: pd.DataFrame) -> pd.DataFrame:
    """Clean column names."""
    # Replace spaces with underscores, then lowercase every column name
    df = df.rename(columns=lambda x: x.replace(" ", "_")).rename(columns=str.lower)
    return df
The function below filters the data to the networks located in a given list of countries.
def filter_by_location(df: pd.DataFrame, countries: List[str]) -> pd.DataFrame:
    """Filter systems by country."""
    df = df.query("country_code.isin(@countries)")
    return df
Run ETL Pipeline¶
The raw metadata is now loaded, its column names are cleaned, and the result is filtered to bikeshare systems in Canada.
df = pd.read_csv(url).pipe(clean_col_names).pipe(filter_by_location, ["CA"])
df
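To check the two helpers without a network call, the same pipeline can be run on a small in-memory DataFrame. The sketch below is only illustrative: the rows are hypothetical placeholders rather than real entries from systems.csv, the column names are assumed to follow the raw file's "Title Case with spaces" style, and the filter uses plain boolean indexing in place of `query()`, which should select the same rows.

```python
from typing import List

import pandas as pd


def clean_col_names(df: pd.DataFrame) -> pd.DataFrame:
    """Clean column names (same logic as the helper defined earlier)."""
    return df.rename(columns=lambda x: x.replace(" ", "_")).rename(columns=str.lower)


def filter_by_location(df: pd.DataFrame, countries: List[str]) -> pd.DataFrame:
    """Filter systems by country, using boolean indexing instead of query()."""
    return df[df["country_code"].isin(countries)]


# Hypothetical rows that mimic the raw column naming; not real systems.csv data
raw = pd.DataFrame(
    {
        "Country Code": ["CA", "US", "CA"],
        "Name": ["System A", "System B", "System C"],
        "System ID": ["sys_a", "sys_b", "sys_c"],
    }
)

# Same chaining pattern as the real pipeline, minus the read_csv step
sample = raw.pipe(clean_col_names).pipe(filter_by_location, ["CA"])
```

Because each step takes a DataFrame and returns a DataFrame, `.pipe()` keeps the pipeline readable from left to right, and each helper can be tested on its own with a toy input like this one.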