Become a Data Visualization Whiz with this Comprehensive Guide to Seaborn in Pyt ...

Source: 分析大师 | 2019-10-08 | Published by: BOB体育娱乐平台之家

There is just something extraordinary about a well-designed visualization. The colors stand out, the layers blend nicely together, the contours flow throughout, and the overall package not only has a nice aesthetic quality but also provides meaningful insights.

This is quite important in data science, where we often work with a lot of messy data. Having the ability to visualize it is critical for a data scientist. Our stakeholders or clients will more often than not rely on visual cues rather than the intricacies of a machine learning model.

There are plenty of excellent Python visualization libraries available, including the widely used matplotlib. But seaborn stands out for me. It combines aesthetic appeal seamlessly with technical insights, as we’ll soon see.

In this article, we’ll learn what seaborn is and why you might choose it ahead of matplotlib. We’ll then use seaborn to generate all sorts of different data visualizations in Python. So put your creative hats on and let’s get rolling!

Seaborn is part of the comprehensive and popular Applied Machine Learning course. It’s your one-stop destination for learning all about machine learning and its different aspects.

Have you ever used the ggplot2 library in R? It’s one of the best visualization packages in any tool or language. Seaborn gives me the same overall feel.

Seaborn is an amazing Python visualization library built on top of matplotlib. It gives us the capability to create enhanced data visuals. This helps us understand the data by displaying it in a visual context, unearthing correlations between variables or trends that might not be obvious initially. Seaborn offers a high-level interface, whereas matplotlib’s interface is comparatively low-level.

I’ve been talking about how awesome seaborn is, so you might be wondering what all the fuss is about. I’ll answer that question comprehensively, and in a practical manner, when we generate plots using seaborn.
For now, let’s quickly talk about how seaborn feels like a step above matplotlib.

Seaborn makes our charts and plots look engaging and supports some common data visualization needs (like mapping color to a variable or using faceting). Basically, it makes data visualization and exploration easy to conquer. And trust me, that is no easy task in data science.

There are essentially a couple of (big) limitations in matplotlib that Seaborn fixes:

That second point stands out in data science since we work quite a lot with dataframes. Any other reason(s) you feel seaborn is superior to matplotlib? Let us know in the comments section below the article!

The seaborn library has four mandatory dependencies you need to have:

To install Seaborn and use it effectively, we first need to install the aforementioned dependencies. Once this step is done, we are all set to install Seaborn and enjoy its mesmerizing plots.

To install the latest release of seaborn, you can use pip by running pip install seaborn. You can also use conda to install the latest version by running conda install seaborn.

In your code, import the dependencies and seaborn itself before using them. That’s it! We are all set to explore seaborn in detail.

We’ll be working primarily with two datasets:

I’ve picked these two because they contain a multitude of variables, so we have plenty of options to play around with. Both datasets also mimic real-world scenarios, so you’ll get an idea of how data visualization and exploration work in the industry.

You can check out this and other high-quality datasets and hackathons on the DataHack platform. So go ahead and download the above two datasets before you proceed. We’ll be using them in tandem.

Let’s get started!
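The import step described above can be sketched as follows. This is a minimal example rather than the article's original listing: the aliases np, pd, plt and sns are common conventions assumed here, and the call to sns.set() is an optional extra that applies seaborn's default theme.

```python
# Install first from a shell (not from Python):
#   pip install seaborn
#   conda install seaborn

import numpy as np                 # numerical arrays
import pandas as pd                # dataframes
import matplotlib.pyplot as plt    # the plotting layer seaborn builds on
import seaborn as sns              # "sns" is the conventional alias for the seaborn module

sns.set()  # optional: apply seaborn's default theme to subsequent plots

# Print the installed version to confirm the import worked
print(sns.__version__)
```

Note that sns is simply a module alias, so every seaborn function used later is called as sns.function_name(...).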
I have divided this implementation section into two categories:

We’ll look at multiple examples of each category and how to plot it using seaborn.

Visualizing statistical relationships is the process of understanding how the variables in a dataset relate to one another and how those relationships affect or depend on other variables.

Here, we’ll be using seaborn to generate the below plots:

I have picked the ‘Predict the number of upvotes’ project for this. So, let’s start by importing the dataset into our working environment.

A scatterplot is perhaps the most common way of visualizing the relationship between two variables. Each point represents one observation in the dataset, and together the cloud of points shows the joint distribution of the two variables.

To draw the scatter plot, we’ll be using the relplot() function of the seaborn library. It is a figure-level function for visualizing statistical relationships. By default, relplot produces a scatter plot.

sns.relplot is the relplot function accessed through sns, the alias under which we imported the seaborn module above along with the other dependencies. The parameters x, y, and data represent the variable on the X-axis, the variable on the Y-axis, and the dataset we are plotting, respectively. Here, the plot shows the relationship between views and upvotes.

Next, if we want to see the tag associated with each data point, we can pass the tag column to the hue parameter. Hue adds another dimension to the plot: it colors the points, and each color carries a meaning.

In the above plot, the hue semantic is categorical. That’s why a qualitative color palette, with a distinct color per category, is applied. If th
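The relplot calls described above can be sketched as follows. The article's dataset is not reproduced here, so the DataFrame is a small made-up stand-in, and the column names Views, Upvotes and Tag are assumptions chosen to match the variables mentioned in the text.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical stand-in data; with the real dataset you would load it instead,
# e.g. df = pd.read_csv("train.csv")  (the filename is an assumption)
df = pd.DataFrame({
    "Views":   [120, 450, 900, 1500, 3000, 5200],
    "Upvotes": [3, 10, 25, 40, 95, 160],
    "Tag":     ["a", "c", "a", "j", "c", "j"],
})

# relplot is figure-level and draws a scatter plot by default (kind="scatter")
g = sns.relplot(x="Views", y="Upvotes", data=df)

# hue maps a third, categorical variable to the color of each point
g_hue = sns.relplot(x="Views", y="Upvotes", hue="Tag", data=df)

# Save the second figure to disk instead of showing it interactively
g_hue.savefig("views_vs_upvotes_by_tag.png")
```

Because relplot is figure-level, it returns a FacetGrid object rather than a matplotlib Axes, which is why the figure is saved through the returned object.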