---
jupyter:
  jupytext:
    notebook_metadata_filter: all
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.1'
      jupytext_version: 1.1.1
  kernelspec:
    display_name: Python 3
    language: python
    name: python3
  language_info:
    codemirror_mode:
      name: ipython
      version: 3
    file_extension: .py
    mimetype: text/x-python
    name: python
    nbconvert_exporter: python
    pygments_lexer: ipython3
    version: 3.6.7
  plotly:
    description: How to make interactive Distplots in Python with Plotly.
    display_as: statistical
    language: python
    layout: base
    name: Distplots
    order: 4
    page_type: example_index
    permalink: python/distplot/
    thumbnail: thumbnail/distplot.jpg
---

## Combined statistical representations with px.histogram

Several representations of statistical distributions are available in plotly, such as [histograms](https://plotly.com/python/histograms/), [violin plots](https://plotly.com/python/violin/), and [box plots](https://plotly.com/python/box-plots/) (see [the complete list here](https://plotly.com/python/statistical-charts/)). Several representations can also be combined in the same plot. For example, the `plotly.express` function `px.histogram` can add a marginal subplot that shows a different statistical representation of the same data; the `marginal` parameter selects which one (for example `"rug"`, `"box"`, or `"violin"`).

[Plotly Express](/python/plotly-express/) is the easy-to-use, high-level interface to Plotly, which [operates on a variety of types of data](/python/px-arguments/) and produces [easy-to-style figures](/python/styling-plotly-express/).

```python
import plotly.express as px
df = px.data.tips()
fig = px.histogram(df, x="total_bill", y="tip", color="sex", marginal="rug",
                   hover_data=df.columns)
fig.show()
```

```python
import plotly.express as px
df = px.data.tips()
fig = px.histogram(df, x="total_bill", y="tip", color="sex",
                   marginal="box", # or violin, rug
                   hover_data=df.columns)
fig.show()
```

## Combined statistical representations with distplot figure factory

The distplot [figure factory](/python/figure-factories/) combines several statistical representations of numerical data in one figure: a histogram, a kernel density estimate (KDE) or normal curve, and a rug plot.

#### Basic Distplot

A histogram, a KDE curve, and a rug plot are displayed.

```python
import plotly.figure_factory as ff
import numpy as np
np.random.seed(1)

x = np.random.randn(1000)
hist_data = [x]
group_labels = ['distplot'] # name of the dataset

fig = ff.create_distplot(hist_data, group_labels)
fig.show()
```

#### Plot Multiple Datasets

```python
import plotly.figure_factory as ff
import numpy as np

# Add histogram data
x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2
x4 = np.random.randn(200) + 4

# Group data together
hist_data = [x1, x2, x3, x4]

group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']

# Create distplot with custom bin_size
fig = ff.create_distplot(hist_data, group_labels, bin_size=.2)
fig.show()
```

#### Use Multiple Bin Sizes

Pass a list to the `bin_size` argument to use a different bin size for each dataset.

```python
import plotly.figure_factory as ff
import numpy as np

# Add histogram data
x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2
x4 = np.random.randn(200) + 4

# Group data together
hist_data = [x1, x2, x3, x4]

group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']

# Create distplot with a different bin_size for each dataset
fig = ff.create_distplot(hist_data, group_labels, bin_size=[.1, .25, .5, 1])
fig.show()
```

#### Customize Rug Text, Colors & Title

```python
import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(26)
x2 = np.random.randn(26) + .5

group_labels = ['2014', '2015']

rug_text_one = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
                'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

rug_text_two = ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii', 'jj', 'kk', 'll', 'mm',
                'nn', 'oo', 'pp', 'qq', 'rr', 'ss', 'tt', 'uu', 'vv', 'ww', 'xx', 'yy', 'zz']

rug_text = [rug_text_one, rug_text_two] # for hover in rug plot
colors = ['rgb(0, 0, 100)', 'rgb(0, 200, 200)']

# Create distplot with custom bin_size
fig = ff.create_distplot(
    [x1, x2], group_labels, bin_size=.2,
    rug_text=rug_text, colors=colors)

fig.update_layout(title_text='Customized Distplot')
fig.show()
```

#### Plot Normal Curve

```python
import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200)
x2 = np.random.randn(200) + 2

group_labels = ['Group 1', 'Group 2']

colors = ['slategray', 'magenta']

# Create distplot with curve_type set to 'normal'
fig = ff.create_distplot([x1, x2], group_labels, bin_size=.5,
                         curve_type='normal', # override default 'kde'
                         colors=colors)

# Add title
fig.update_layout(title_text='Distplot with Normal Distribution')
fig.show()
```

#### Plot Only Curve and Rug

```python
import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 1
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 1

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#333F44', '#37AA9C', '#94F3E4']

# Create distplot without the histogram (curve and rug only)
fig = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors)

# Add title
fig.update_layout(title_text='Curve and Rug Plot')
fig.show()
```

#### Plot Only Hist and Rug

```python
import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 1
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 1

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#835AF1', '#7FA6EE', '#B8F7D4']

# Create distplot without the curve (histogram and rug only)
fig = ff.create_distplot(hist_data, group_labels, colors=colors, bin_size=.25,
                         show_curve=False)

# Add title
fig.update_layout(title_text='Hist and Rug Plot')
fig.show()
```

#### Plot Hist and Rug with Different Bin Sizes

```python
import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#393E46', '#2BCDC1', '#F66095']

# Create distplot without the curve, using one bin size per dataset
fig = ff.create_distplot(hist_data, group_labels, colors=colors,
                         bin_size=[0.3, 0.2, 0.1], show_curve=False)

# Add title
fig.update(layout_title_text='Hist and Rug Plot')
fig.show()
```

#### Plot Only Hist and Curve

```python
import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#A56CC1', '#A6ACEC', '#63F5EF']

# Create distplot without the rug (histogram and curve only)
fig = ff.create_distplot(hist_data, group_labels, colors=colors,
                         bin_size=.2, show_rug=False)

# Add title
fig.update_layout(title_text='Hist and Curve Plot')
fig.show()
```

#### Distplot with Pandas

```python
import plotly.figure_factory as ff
import numpy as np
import pandas as pd

df = pd.DataFrame({'2012': np.random.randn(200),
                   '2013': np.random.randn(200)+1})

# Pass one column per dataset and use the column names as group labels
fig = ff.create_distplot([df[c] for c in df.columns], df.columns, bin_size=.25)
fig.show()
```

#### Reference

For more info on `ff.create_distplot()`, see the [full function reference](https://plotly.com/python-api-reference/generated/plotly.figure_factory.create_distplot.html).