maui.eda.daily_distribution_analysis

maui.eda.daily_distribution_analysis(df, date_column, category_column, show_plot=True)

Analyzes and visualizes the daily distribution of samples by categories.

This function generates a histogram that shows the distribution of samples over days, separated by a specified category. It provides insights into how the frequency of samples varies daily and according to the categories within the specified category column.
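The counts behind such a histogram can also be inspected as a table. The snippet below is a minimal sketch using pandas only; it is not maui's implementation. The sample values are invented for illustration, and the column names 'dt' and 'landscape' are borrowed from the example at the end of this page.

```python
import pandas as pd

# Made-up sample data; only the column names come from this page's example.
df = pd.DataFrame(
    {
        "dt": pd.to_datetime(
            [
                "2021-03-01 06:00:00",
                "2021-03-01 18:30:00",
                "2021-03-01 21:00:00",
                "2021-03-02 06:00:00",
            ]
        ),
        "landscape": ["forest", "pasture", "forest", "forest"],
    }
)

# Count samples per calendar day and per category, one column per category.
daily_counts = (
    df.groupby([df["dt"].dt.date, "landscape"])
    .size()
    .unstack(fill_value=0)
)
print(daily_counts)
```

Each row of daily_counts corresponds to one day on the histogram's x-axis, and each column to one colored group of bars.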

Parameters:
df : pandas.DataFrame

The DataFrame containing the data to be analyzed. It must include the specified date_column and category_column.

date_column : str

The name of the column in df that contains date information. The values in this column should be in a date or datetime format.

category_column : str

The name of the column in df that contains categorical data, which will be used to color the bars in the histogram.

show_plot : bool, optional

If True (default), the function will display the generated plot. If False, the plot will not be displayed but will still be returned.
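If the date column was loaded as plain strings (for example from a CSV file), it can be converted before calling the function. The following is a sketch under assumptions: the data values are hypothetical, and whether maui converts strings itself is not stated here, so the conversion is done explicitly.

```python
import pandas as pd

# Hypothetical data with dates stored as strings; the last one is invalid.
df = pd.DataFrame(
    {
        "dt": ["2021-03-01 06:00:00", "2021-03-02 18:30:00", "not a date"],
        "landscape": ["forest", "pasture", "forest"],
    }
)

# Parse the strings into datetime64 values; unparseable entries become NaT.
df["dt"] = pd.to_datetime(df["dt"], errors="coerce")

# Drop rows whose date could not be parsed, so only valid dates remain.
df = df.dropna(subset=["dt"])

print(df.dtypes)
```

After this step, 'dt' can be passed as date_column and 'landscape' as category_column.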

Returns:
plotly.graph_objs._figure.Figure

A Plotly figure object representing the histogram of daily sample distribution by the specified category. The histogram bars are colored based on the categories in the category_column.

Notes

The function uses Plotly for plotting, so the resulting plots are interactive and can be explored further in a web browser. It is particularly useful for time series data, where understanding how events or samples are distributed over time and across categories is important.

Examples

>>> from maui import samples, eda
>>> df = samples.get_audio_sample(dataset="leec")
>>> fig = eda.daily_distribution_analysis(df, 'dt', 'landscape')