Load Bikeshare Systems Data

Authors
Affiliations
McMaster University
Brown University
Updated: 24 Apr 2026

Imports

The Python libraries required for this notebook are imported below.

from typing import List

import pandas as pd

About

In this notebook, we use a basic Extract-Transform-Load (ETL) workflow to load bikeshare system metadata for networks in Canada. The metadata is read from the systems.csv file in the MobilityData gbfs repository on GitHub.

User Inputs

url = (
    "https://raw.githubusercontent.com/MobilityData/gbfs/refs/heads/master/systems.csv"
)

The helper function below cleans the column names in the raw data by replacing spaces with underscores and converting the names to lowercase.

def clean_col_names(df: pd.DataFrame) -> pd.DataFrame:
    """Clean column names."""
    df = df.rename(columns=lambda x: x.replace(" ", "_")).rename(columns=str.lower)
    return df
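
As a quick illustration of what this helper does, the minimal sketch below applies it to a small hand-made DataFrame. The sample rows and the column headers ("Country Code", "System ID") are assumptions made for illustration only and are not taken from the raw file. The function is repeated so that the snippet runs on its own.

```python
import pandas as pd


def clean_col_names(df: pd.DataFrame) -> pd.DataFrame:
    """Clean column names (repeated here so the snippet is standalone)."""
    df = df.rename(columns=lambda x: x.replace(" ", "_")).rename(columns=str.lower)
    return df


# Hypothetical sample data; headers and values are assumed for illustration
sample = pd.DataFrame(
    {
        "Country Code": ["CA", "US"],
        "System ID": ["example_ca", "example_us"],
    }
)

cleaned = clean_col_names(sample)

# Spaces become underscores and all letters are lowercased
print(list(cleaned.columns))  # ['country_code', 'system_id']
```

Note that only spaces are replaced, so a header containing other punctuation (a hyphen, for example) would keep that character after cleaning.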

The function below filters the data to keep only the networks in a given list of countries, matched on the country_code column.

def filter_by_location(df: pd.DataFrame, countries: List[str]) -> pd.DataFrame:
    """Filter systems by country."""
    df = df.query("country_code.isin(@countries)")
    return df
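
The sketch below shows the filter on a small hand-made DataFrame that already has cleaned column names. The sample rows and the system_id column are hypothetical and exist only to make the behavior visible. The function is repeated so that the snippet runs on its own.

```python
from typing import List

import pandas as pd


def filter_by_location(df: pd.DataFrame, countries: List[str]) -> pd.DataFrame:
    """Filter systems by country (repeated here so the snippet is standalone)."""
    # "@countries" refers to the local Python variable inside the query string
    df = df.query("country_code.isin(@countries)")
    return df


# Hypothetical sample data with already-cleaned column names
sample = pd.DataFrame(
    {
        "country_code": ["CA", "US", "FR", "CA"],
        "system_id": ["example_ca", "example_us", "example_fr", "demo_ca"],
    }
)

# Keep a single country
only_ca = filter_by_location(sample, ["CA"])
print(only_ca["system_id"].tolist())  # ['example_ca', 'demo_ca']

# Keep several countries at once
ca_and_fr = filter_by_location(sample, ["CA", "FR"])
print(ca_and_fr["system_id"].tolist())  # ['example_ca', 'example_fr', 'demo_ca']
```

Because the function accepts a list, the same pipeline can be pointed at several countries without changing the function itself.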

Run ETL Pipeline

The raw metadata is now loaded, its column names are cleaned, and the rows are filtered to the bikeshare systems in Canada (country code CA).

df = pd.read_csv(url).pipe(clean_col_names).pipe(filter_by_location, ["CA"])
df
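
To try the same chain of steps without network access, the sketch below runs it against a small in-memory CSV. The CSV text is a hypothetical stand-in for the remote file: its headers and rows are assumptions chosen for illustration and do not reproduce the real data. Both helper functions are repeated so that the snippet runs on its own.

```python
import io
from typing import List

import pandas as pd


def clean_col_names(df: pd.DataFrame) -> pd.DataFrame:
    """Clean column names (repeated here so the snippet is standalone)."""
    return df.rename(columns=lambda x: x.replace(" ", "_")).rename(columns=str.lower)


def filter_by_location(df: pd.DataFrame, countries: List[str]) -> pd.DataFrame:
    """Filter systems by country (repeated here so the snippet is standalone)."""
    return df.query("country_code.isin(@countries)")


# Hypothetical stand-in for the remote CSV; headers and rows are assumed
raw_csv = io.StringIO(
    "Country Code,Name,Location,System ID\n"
    "CA,Example Bikes,Example City,example_ca\n"
    "US,Sample Share,Sample Town,example_us\n"
    "CA,Demo Cycle,Demo Village,demo_ca\n"
)

# Extract (read_csv), transform (clean_col_names), then filter
df_offline = (
    pd.read_csv(raw_csv).pipe(clean_col_names).pipe(filter_by_location, ["CA"])
)

# Sanity check: every remaining row belongs to the requested country
assert (df_offline["country_code"] == "CA").all()
print(df_offline["system_id"].tolist())  # ['example_ca', 'demo_ca']
```

A check like the final assertion can also be run on the real result as a lightweight guard against an unexpected change in the upstream file.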