How to use countplot() in plotly with VAEX data frame? #174
Comments
This is basically |
df_train is a Vaex DataFrame.
fig = px.histogram(df_train, x='Census_ProcessorClass', color='HasDetections', barmode='relative')
I am getting this ValueError:
/opt/conda/lib/python3.7/site-packages/plotly/express/_chart_types.py in histogram(data_frame, x, y, color, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, hover_name, hover_data, animation_frame, animation_group, category_orders, labels, color_discrete_sequence, color_discrete_map, marginal, opacity, orientation, barmode, barnorm, histnorm, log_x, log_y, range_x, range_y, histfunc, cumulative, nbins, title, template, width, height)
/opt/conda/lib/python3.7/site-packages/plotly/express/_core.py in make_figure(args, constructor, trace_patch, layout_patch)
/opt/conda/lib/python3.7/site-packages/plotly/express/_core.py in build_dataframe(args, constructor)
/opt/conda/lib/python3.7/site-packages/plotly/express/_core.py in process_args_into_dataframe(args, wide_mode, var_name, value_name)
ValueError: Value of 'x' is not the name of a column in 'data_frame'. Expected one of [0] but received: Census_ProcessorClass |
Try converting your Vaex df to a Pandas one to see if that resolves things? |
Yeah Nic, I am pretty sure it will resolve the issue, but converting my data into a pandas DataFrame will take a lot of time and memory, and I think my system may crash. I am looking for a more efficient way. Is there any method to make a Vaex DataFrame acceptable to plotly? |
PX doesn't natively accept Vaex data frames at the moment, no. Part of the reason for that is that for plots like these histograms, it doesn't do Python-side aggregation: all the data is sent to the browser for aggregation, so there's a bit of an upper bound on the dataset size that |
See plotly/plotly.py#2649 for more details |
Thanks will check this |
Hey, after a lot of trial and error, I think I found a better way. Check this code, it worked: fig = px.histogram(x=df_train['Census_ProcessorClass'].tolist(), color=df_train['HasDetections'].tolist()) |
I found a much better method: df_train.select(df_train['Census_ProcessorClass'] ,'Census_ProcessorClass' != 'None' ) %%time CPU times: user 761 ms, sys: 33.1 ms, total: 794 ms |
Could someone please give me equivalent plotly code for this:
sns.countplot(x='Census_ProcessorClass', hue='HasDetections',data=df_train)
plt.show()
Both columns are int64.