maui.eda.heatmap_analysis

maui.eda.heatmap_analysis(df, x_axis, y_axis, color_continuous_scale='Viridis', show_plot=True, **kwargs)

Generates a heatmap to analyze the relationship between two categorical variables in a DataFrame.

This function groups the data by the specified x_axis and y_axis categories, counts the occurrences of each group, and then creates a heatmap visualization of these counts using Plotly Express. The heatmap intensity is determined by the count of occurrences, with an option to customize the color scale.

Parameters:

df (pandas.DataFrame): The input DataFrame containing the data to be analyzed. Must include the columns specified by x_axis and y_axis, as well as a ‘file_path’ column used for counting occurrences.
x_axis (str): The name of the column in df to be used as the x-axis in the heatmap.
y_axis (str): The name of the column in df to be used as the y-axis in the heatmap.
color_continuous_scale (str, optional): The name of the color scale to use for the heatmap. Defaults to ‘Viridis’. For more options, refer to Plotly’s documentation on color scales.
show_plot (bool, optional): If True (default), displays the heatmap plot. If False, the plot is not displayed but is still returned.
**kwargs (dict): Additional keyword arguments for plot customization, such as height and width.

Returns:

tuple: A tuple containing:
- df_group (pandas.DataFrame): A DataFrame with the grouped counts for each combination of x_axis and y_axis values.
- fig (plotly.graph_objs._figure.Figure): A Plotly figure object containing the heatmap.

Notes

The ‘file_path’ column in the input DataFrame is used to count occurrences of each group formed by the specified x_axis and y_axis values. This function is useful for visualizing the distribution and relationship between two categorical variables.
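Because occurrences are counted through the ‘file_path’ column, it may be worth checking that column for missing values before calling the function. The sketch below uses plain pandas on made-up data to show how a count over one column can differ from the number of rows in a group. Whether the library counts non-null values or all rows is not stated here, so treat this as an assumption to verify against your own data.

```python
import pandas as pd

# Made-up data with one missing file path in the ("forest", "wet") group.
df = pd.DataFrame({
    "landscape": ["forest", "forest", "urban"],
    "environment": ["wet", "wet", "dry"],
    "file_path": ["a.wav", None, "c.wav"],
})

grouped = df.groupby(["landscape", "environment"])["file_path"]

non_null = grouped.count()  # counts only rows with a file_path value
all_rows = grouped.size()   # counts every row in the group

# Number of rows that a non-null count would leave out.
n_missing = int(df["file_path"].isna().sum())
```

If `n_missing` is greater than zero, the two tallies disagree for at least one group, and the heatmap cells may understate the number of rows.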

Examples

>>> from maui import samples, eda
>>> df = samples.get_audio_sample(dataset="leec")
>>> df_group, fig = eda.heatmap_analysis(df, 'landscape', 'environment')