Updated: 24 apr 2026
Imports¶
All required Python libraries are imported below.
import altair as alt
import pandas as pd
The Altair settings below are required to support plotting large datasets and to customize the chart appearance.
_ = alt.data_transformers.enable("vegafusion")
_ = alt.renderers.set_embed_options(actions=False)
About¶
Perform exploratory data analysis (EDA) on a movies dataset.
User Inputs¶
url = "https://vegafusion-datasets.s3.amazonaws.com/vega/movies_201k.parquet"
Get Data¶
%%time
df = pd.read_parquet(url)
df.head()
CPU times: user 158 ms, sys: 47.8 ms, total: 206 ms
Wall time: 427 ms
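Before plotting, it can help to check the dtype, missing-value count, and range of the two rating columns used in the charts below. The following is a minimal sketch only: `sample` is a small hypothetical stand-in for `df` with made-up values, and `rating_cols` and `summary` are illustrative names that do not appear in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for df (made-up values); the real data is read from `url`
sample = pd.DataFrame(
    {
        "IMDB_Rating": [7.1, 6.4, None, 8.2],
        "Rotten_Tomatoes_Rating": [81.0, None, 45.0, 93.0],
    }
)

# Columns plotted in the EDA section
rating_cols = ["IMDB_Rating", "Rotten_Tomatoes_Rating"]

# One row per column: dtype, number of missing values, and observed range
summary = pd.DataFrame(
    {
        "dtype": sample[rating_cols].dtypes.astype(str),
        "n_missing": sample[rating_cols].isna().sum(),
        "min": sample[rating_cols].min(),
        "max": sample[rating_cols].max(),
    }
)
print(summary)
```

Replacing `sample` with `df` would run the same check on the full dataset.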
EDA¶
Distributions of Numerical Features¶
Show a histogram (binned bar chart) of IMDB ratings across all movies.
chart = (
    alt.Chart(df)
    .mark_bar()
    .encode(
        alt.X("IMDB_Rating:Q", bin=alt.Bin(maxbins=75)),
        y="count()",
    )
)
chart
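The counts behind a histogram like this can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not the binning Altair performs: `maxbins` only sets an upper limit and the actual bin edges are chosen by the charting library, whereas here the edges are fixed at a width of 1.0 over 0 to 10. The `ratings` values are made up, and missing ratings are dropped before counting.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings (made-up values) standing in for df["IMDB_Rating"]
ratings = pd.Series(
    [1.2, 5.5, 5.9, 6.1, 6.4, 7.0, 7.3, 8.8, None], name="IMDB_Rating"
)

# Assumed fixed-width bins: edges at 0, 1, ..., 10 (10 bins of width 1.0)
edges = np.arange(0, 11, 1.0)

# Count non-missing ratings per bin; each bin includes its left edge
counts, _ = np.histogram(ratings.dropna(), bins=edges)

# Index the counts by the left edge of each bin for readability
hist = pd.Series(counts, index=edges[:-1], name="count")
print(hist)
```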
Relationships of Numerical Features¶
%%time
chart = alt.Chart(df).mark_rect().encode(
    alt.X('IMDB_Rating:Q', bin=alt.Bin(maxbins=60)),
    alt.Y('Rotten_Tomatoes_Rating:Q', bin=alt.Bin(maxbins=40)),
    alt.Color('count():Q', scale=alt.Scale(scheme='greenblue'))
)
chart
CPU times: user 843 μs, sys: 0 ns, total: 843 μs
Wall time: 848 μs
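The heatmap above colors each rectangle by the number of movies falling into a pair of rating bins. A tabular version of that idea can be sketched with `numpy.histogram2d`. This is an illustration under assumptions: the `sample` values are made up, the 2 x 2 grid and the ranges of 0 to 10 and 0 to 100 are chosen for readability rather than taken from the chart, and rows with a missing rating in either column are dropped first.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df (made-up values)
sample = pd.DataFrame(
    {
        "IMDB_Rating": [6.5, 7.2, 3.1, 8.4, None],
        "Rotten_Tomatoes_Rating": [70.0, 85.0, 20.0, 95.0, 50.0],
    }
)

# Keep only rows where both ratings are present
pairs = sample.dropna(subset=["IMDB_Rating", "Rotten_Tomatoes_Rating"])

# Assumed coarse grid: 2 bins per axis over the stated ranges
counts, x_edges, y_edges = np.histogram2d(
    pairs["IMDB_Rating"],
    pairs["Rotten_Tomatoes_Rating"],
    bins=[2, 2],
    range=[[0, 10], [0, 100]],
)

# Rows index IMDB bins, columns index Rotten Tomatoes bins
print(counts)
```

In this toy data, three of the four complete pairs land in the high/high cell, which is the kind of concentration the color scale in the heatmap makes visible.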