maui.eda.histogram_analysis

maui.eda.histogram_analysis(df, x_axis, category_column, show_plot=True)

Generates a histogram of the data distribution along a specified axis, segmented by category.

This function creates a histogram of the values in the x_axis column of df, with bars colored by the categories in category_column. The plot is built with Plotly Express. Appearance settings such as opacity and bar gap are not exposed as parameters in the signature above, but they can be adjusted on the returned figure. If show_plot is True, the plot is displayed in the notebook or IDE.

Parameters:
df : pandas.DataFrame

The DataFrame containing the data to plot. Must include the columns specified by x_axis and category_column.

x_axis : str

The name of the column in df to be used for the x-axis of the histogram.

category_column : str

The name of the column in df that contains categorical data for segmenting the histogram. Each category will be represented with a different color.

show_plot : bool, optional

If True (default), the generated plot will be immediately displayed. If False, the plot will not be displayed but will still be returned by the function.

Returns:
plotly.graph_objs._figure.Figure

The Plotly figure object for the generated histogram. This object can be further customized or saved after the function returns.

Notes

This function is designed to offer a quick and convenient way to visualize the distribution of data in a DataFrame along a specified axis. It is particularly useful for exploratory data analysis and for identifying patterns or outliers in dataset segments.

Examples

>>> from maui import samples, eda
>>> df = samples.get_audio_sample(dataset="leec")
>>> fig = eda.histogram_analysis(df, 'landscape', 'environment')
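
Because df must contain both the x_axis and category_column columns, it can help to check for them before calling the function on your own data. The helper below is a minimal sketch; missing_columns is a hypothetical name, not part of maui, and the DataFrame is toy data.

```python
import pandas as pd


def missing_columns(df, *columns):
    """Return the requested column names that are absent from df (hypothetical helper)."""
    return [name for name in columns if name not in df.columns]


# Toy data (assumption): stands in for a DataFrame you plan to plot.
df = pd.DataFrame({
    "landscape": ["forest", "pasture"],
    "environment": ["A", "B"],
})

ok = missing_columns(df, "landscape", "environment")   # both columns present
bad = missing_columns(df, "landscape", "habitat")      # "habitat" is absent
```

An empty result means both names can be passed as x_axis and category_column.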