Altair Javascript error preventing chart display with specific column encoding

Time:04-26

I'm working with a Formula 1 dataset. The pandas DataFrame has a column "constructorID" that holds the constructors in an all-lowercase, underscore format (like "red_bull"), while the "constructor" column has them properly formatted and capitalized ("Red Bull"). I'm trying to link the color encoding and legend to the "constructor" column rather than "constructorID" for a cleaner display, since this is for a final project. However, while the chart displays as expected with "constructorID", using "constructor" returns a JavaScript error.

The constructorID unique column values:

array(['williams', 'red_bull', 'toro_rosso', 'mclaren', 'alpine',
       'mercedes', 'sauber', 'alphatauri', 'alfa', 'haas', 'renault',
       'racing_point', 'ferrari', 'force_india', 'aston_martin'],
      dtype=object)

The constructor unique column values:

array(['Williams', 'Red Bull', 'Toro Rosso', 'McLaren', 'Alpine F1 Team',
       'Mercedes', 'Sauber', 'AlphaTauri', 'Alfa Romeo', 'Haas F1 Team',
       'Renault', 'Racing Point', 'Ferrari', 'Force India',
       'Aston Martin'], dtype=object)

I've tried to debug by creating the simplest possible chart:

alt.Chart(df_year_pts).mark_circle().encode(
    color='constructorID:N',
    x='driver_yr_pts:Q',
    y='constructor_yr_pts:Q'
)

The code above works perfectly.

alt.Chart(df_year_pts).mark_circle().encode(
    color='constructor:N',
    x='driver_yr_pts:Q',
    y='constructor_yr_pts:Q'
)

The code above returns the following error:

Javascript Error: Cannot read properties of undefined (reading 'params')
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.

Any suggestions or thoughts on what might be going wrong with using the "constructor" column as an encoding? I really don't know what the difference could be between that column and the "constructorID" column.

EDIT: Minimal dictionary-form sample of my data, first 5 rows:

{'constructor': {0: 'Williams',
  1: 'Red Bull',
  2: 'Toro Rosso',
  3: 'Red Bull',
  4: 'McLaren'},
 'constructorID': {0: 'williams',
  1: 'red_bull',
  2: 'toro_rosso',
  3: 'red_bull',
  4: 'mclaren'},
 'constructor_yr_pts': {0: 0.0, 1: 417.0, 2: 85.0, 3: 319.0, 4: 30.0},
 'driver': {0: 'Jack Aitken',
  1: 'Alexander Albon',
  2: 'Alexander Albon',
  3: 'Alexander Albon',
  4: 'Fernando Alonso'},
 'driverID': {0: 'aitken', 1: 'albon', 2: 'albon', 3: 'albon', 4: 'alonso'},
 'driver_yr_pts': {0: 0.0, 1: 76.0, 2: 16.0, 3: 105.0, 4: 17.0},
 'year': {0: 2020, 1: 2019, 2: 2019, 3: 2020, 4: 2017}}

Here's the basic code I'm using to read in data from the Ergast API:

import altair as alt
import pandas as pd
from pyergast import pyergast
import requests

rounds_17 = pyergast.get_schedule(2017)
df_2017 = pd.DataFrame()
for i in range(len(rounds_17)):
    i += 1  # Ergast round numbers start at 1
    temp = pyergast.get_race_result(2017, i)
    temp['year'] = 2017
    temp['round'] = i
    df_2017 = pd.concat([df_2017, temp], ignore_index=True)

rounds_18 = pyergast.get_schedule(2018)
df_2018 = pd.DataFrame()
for i in range(len(rounds_18)):
    i += 1
    temp = pyergast.get_race_result(2018, i)
    temp['year'] = 2018
    temp['round'] = i
    df_2018 = pd.concat([df_2018, temp], ignore_index=True)

rounds_19 = pyergast.get_schedule(2019)
df_2019 = pd.DataFrame()
for i in range(len(rounds_19)):
    i += 1
    temp = pyergast.get_race_result(2019, i)
    temp['year'] = 2019
    temp['round'] = i
    df_2019 = pd.concat([df_2019, temp], ignore_index=True)

rounds_20 = pyergast.get_schedule(2020)
df_2020 = pd.DataFrame()
for i in range(len(rounds_20)):
    i += 1
    temp = pyergast.get_race_result(2020, i)
    temp['year'] = 2020
    temp['round'] = i
    df_2020 = pd.concat([df_2020, temp], ignore_index=True)

rounds_21 = pyergast.get_schedule(2021)
df_2021 = pd.DataFrame()
for i in range(len(rounds_21)):
    i += 1
    temp = pyergast.get_race_result(2021, i)
    temp['year'] = 2021
    temp['round'] = i
    df_2021 = pd.concat([df_2021, temp], ignore_index=True)

df_total = pd.concat([df_2017,df_2018,df_2019,df_2020, df_2021], ignore_index=True)
df_total['points'] = pd.to_numeric(df_total['points'])

df_year_pts = df_total.groupby(['driverID','driver','year','constructorID','constructor'])['points'].sum().to_frame('year_pts').reset_index()
s_constructor_yr_pts = df_year_pts.groupby(['constructorID','year'])['year_pts'].sum()
df_year_pts = df_year_pts.merge(s_constructor_yr_pts, how='left',on=['constructorID','year']).rename(columns={'year_pts_x':'driver_yr_pts','year_pts_y':'constructor_yr_pts'})

So df_year_pts is the DataFrame I'm passing to alt.Chart.

Answer:

This looks like a bug in Vega or Vega-Lite. Here's a minimal reproduction in the Vega editor:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.17.0.json",
  "data": {"values": [{"constructor": "x"}]},
  "mark": "circle",
  "encoding": {"color": {"field": "constructor"}}
}

It seems to have something to do with having a field/column named "constructor". If I name the column anything else, it works as expected.

With this in mind, I think the best workaround for now would be to rename your column.

I've reported the Vega-Lite bug here: https://github.com/vega/vega-lite/issues/8125
