By Abhilash Nelson, Senior IT Consultant

Python for Data Science

Language: English
All Levels

Course description

Welcome to my new course, Python Essentials with Pandas and NumPy for Data Science. In this course, we will learn the basics of Python data structures and the most important data science libraries, NumPy and Pandas, with step-by-step examples.
The first session is a theory session that introduces Python, its applications, and its libraries. In the next session, we will install Python on your computer. We will install and configure Anaconda, a platform for quick and easy installation of Python and its libraries. We will also get familiar with Jupyter Notebook, the IDE we use throughout this course for Python coding.
Then we will move on to the basic Python data types, strings and numbers, and their operations. For strings, we will cover different ways to assign and access them, along with slicing, replacement, concatenation, formatting, and f-strings. For numbers, we will discuss assignment, access, and operations on integers and floats, from the basic ones to more advanced ones like exponents. We will also look at the order of operations, increments and decrements, rounding, and type casting.
Next come the basic data structures in Python: lists, tuples, and sets. For lists, we will try different assignment, access, and slicing options. Along with the popular list methods, we will see list extension, removal, reversing, sorting, min and max, existence checks, looping, and conversion between lists and strings. For tuples, we will likewise cover assignment and access, and then proceed with the operations available on sets.
After that, we will deal with Python dictionaries: different assignment and access methods, methods for updating and deleting values, and looping through the values in a dictionary.
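
To give a flavour of the Python basics listed above, here is a minimal sketch in plain Python. It is not taken from the course materials; all variable names and values are made up for illustration.

```python
# Strings: slicing, replacement and f-strings
course = "Python Essentials"
first_word = course[:6]                       # slice the first six characters
replaced = course.replace("Essentials", "Basics")
greeting = f"Welcome to {first_word}!"

# Numbers: exponents, order of operations, rounding, type casting
power = 2 ** 10                               # exponent
mixed = 2 + 3 * 4                             # multiplication happens before addition
rounded = round(3.14159, 2)
as_int = int("42")                            # cast a string to an integer

# Lists: extension, removal, sorting, max, existence check, list <-> string
scores = [70, 95, 82]
scores.extend([60, 88])
scores.remove(60)
scores.sort()
highest = max(scores)
has_82 = 82 in scores
joined = ", ".join(str(s) for s in scores)    # list to string
words = "a b c".split()                       # string to list

# Tuples and sets
point = (3, 4)
x, y = point                                  # tuple unpacking
unique = set([1, 2, 2, 3])                    # duplicates are dropped

# Dictionaries: update, delete, add, loop through values
ages = {"alice": 30, "bob": 25}
ages["bob"] = 26
del ages["alice"]
ages["carol"] = 41
total_age = 0
for age in ages.values():
    total_age += age
```

Each statement can be run cell by cell in a notebook to inspect the intermediate values.
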
After learning these basic data types and data structures, it's time to proceed with the popular Python libraries for data science. We will start with NumPy. We will look at different ways to create a new NumPy array, reshaping, converting lists to arrays, arrays of zeros and ones, array operations, indexing, slicing, and copying. We will also create and reshape multi-dimensional NumPy arrays, transpose arrays, and compute statistics such as mean and variance with NumPy.
Later we will move on to the next popular Python library, Pandas. First we will deal with the Series, the one-dimensional labelled array in Pandas. We will create, assign, and access Series using different methods. Then we will go ahead with the Pandas DataFrame, a two-dimensional labelled data structure with columns of potentially different types. We will convert NumPy arrays and Pandas Series to DataFrames. We will try column-wise and row-wise access, dropping rows and columns, and summarizing DataFrames with methods like min and max. We will also convert a Python dictionary into a Pandas DataFrame.
In large datasets, it's common to have empty or missing data. We will see how to manage missing data within DataFrames, and we will cover sorting and indexing operations.
Most of the time, external data arrives as either a CSV file or a JSON file. We will see how to import CSV and JSON data as a DataFrame so that we can work on it, and later convert the DataFrame back to CSV or JSON and write it to the respective files.
We will also see how to concatenate, join, and merge two Pandas DataFrames. Then we will deal with stacking and pivoting data, and with finding duplicate values within a DataFrame and removing them selectively. We can group data within a DataFrame using the group-by methods Pandas provides.
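
As a rough illustration of the NumPy and Pandas topics above, the following sketch creates and reshapes an array, builds a Series and a DataFrame, fills a missing value, sorts, and groups. The data, column names, and variable names are assumptions invented for this example, not part of the course.

```python
import numpy as np
import pandas as pd

# NumPy: create, reshape, transpose, basic statistics
arr = np.array([1, 2, 3, 4, 5, 6])        # list converted to an array
grid = arr.reshape(2, 3)                  # 2 rows, 3 columns
zeros = np.zeros((2, 2))                  # array filled with zeros
transposed = grid.T                       # rows and columns swapped
mean_value = grid.mean()
variance = grid.var()                     # population variance by default

# Pandas Series: a one-dimensional labelled array
prices = pd.Series([10.0, 12.5, 9.0], index=["a", "b", "c"])
price_b = prices["b"]                     # access by label

# Pandas DataFrame built from a Python dictionary (hypothetical data)
df = pd.DataFrame({
    "city": ["Pune", "Kochi", "Pune", "Delhi"],
    "sales": [100.0, np.nan, 150.0, 80.0],
})

# Missing data: replace the missing sales figure with 0.0
filled = df.fillna({"sales": 0.0})

# Sorting: largest sales first
sorted_df = filled.sort_values("sales", ascending=False)

# Grouping: total sales per city
totals = filled.groupby("city")["sales"].sum()
```

Filling with zero is only one possible way to handle missing data; dropping the rows with `dropna()` is another common choice.
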
We will check the steps we need to follow for grouping. Similarly, we can aggregate data in a DataFrame using the built-in methods as well as custom functions. We will also see other grouping techniques, such as binning (bucketing) based on the data in the DataFrame.
At times we may need custom indexing for our DataFrame. We will see methods to re-index the rows and columns of a DataFrame and to rename row and column labels. We will also look at methods for replacing values collectively in a DataFrame and for counting all values or unique values. Then we will implement random permutation using both NumPy and Pandas, and walk through the steps involved.
Since an Excel sheet and a DataFrame are both two-dimensional tables, we will see how to load values into a DataFrame by parsing an Excel sheet. Then we will do condition-based selection of values in a DataFrame, including with lambda functions, and compute ranks based on columns. After that, we will go ahead with cross tabulation of our DataFrame using contingency tables, and the steps needed to create one.
After all these operations on our data, it's time to visualize it. We will do exercises in which we generate graphs and plots using another popular Python library, Matplotlib. We will tweak the graphs and plots by adjusting the plot type, parameters, labels, titles, and so on. Then we will use another visualization option, the histogram, which groups numbers into ranges, and try the different options Matplotlib provides for histograms.
Overall, this course is a perfect starter pack for your long journey ahead with big data and machine learning. So let's start with the lessons. See you soon in the classroom.
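
For readers curious about the later topics mentioned above (binning, condition-based selection with a lambda, ranking, cross tabulation, and histogram-style range counts), here is a small sketch. The table, its column names, and the bin edges are hypothetical. The range counts are computed with `np.histogram` rather than drawn, so the example runs without a plotting backend.

```python
import numpy as np
import pandas as pd

# Hypothetical staff table used only for this illustration
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dev", "Eva"],
    "dept": ["IT", "HR", "IT", "HR", "IT"],
    "age": [23, 37, 45, 29, 52],
})

# Binning: intervals (20, 30], (30, 40], (40, 60]; labels stored as plain strings
df["age_group"] = pd.cut(
    df["age"], bins=[20, 30, 40, 60], labels=["20s", "30s", "40+"]
).astype(str)

# Condition-based selection using a lambda function
over_30 = df[df["age"].apply(lambda a: a > 30)]

# Ranking by a column: the oldest person gets rank 1
df["age_rank"] = df["age"].rank(ascending=False)

# Cross tabulation: contingency table of department against age group
table = pd.crosstab(df["dept"], df["age_group"])

# Histogram-style counts: how many ages fall into each range
counts, edges = np.histogram(df["age"], bins=[20, 30, 40, 60])
```

With Matplotlib installed, `plt.hist(df["age"], bins=[20, 30, 40, 60])` would draw bars for the same ranges.
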

Course overview - 37

  • Introduction to Python

  • Preparing Computer - Installing Anaconda

  • Python Strings

  • Python Numbers and Operators

  • Python Lists

  • Python Sets

  • Python Tuples

  • Python Dictionary

  • NumPy Library - Introduction

  • NumPy Array Operations and Indexing

  • NumPy Multi-Dimensional Arrays

  • Introduction to Pandas Series

  • Introduction to Pandas Dataframes

  • Pandas Dataframe Conversion and Drop

  • Pandas Dataframe Summary Select

  • Pandas Missing Sort

  • Pandas Hierarchical Multi-Indexing

  • Pandas CSV File Read Write

  • Pandas JSON Read Write

  • Pandas Concatenation Merging and Joining

  • Pandas Stacking and Pivoting

  • Pandas Duplicate Data Management

  • Pandas Mapping

  • Pandas Grouping

  • Pandas Aggregation

  • Pandas Binning or Bucketing

  • Pandas Reindex Rename

  • Pandas Replace Values

  • Pandas Dataframe Metrics

  • Pandas Random Permutation

  • Pandas Excel Sheet Import

  • Pandas Condition Selection and Lambda Function

  • Pandas Ranks Min Max

  • Pandas Cross Tabulation

  • Graphs and Plots Using Matplotlib

  • Histograms Using Matplotlib

  • Source Code

Meet your instructor

Abhilash Nelson, Senior IT Consultant
I am a pioneering, security-oriented Android/iOS mobile and PHP/Python web application developer with more than eight years of overall IT experience, which involves designing, implementing, integrating, testing, and supporting impactful web and mobile applications. I hold a postgraduate master's degree in Computer Science and Engineering. My experience with PHP/Python programming is an added advantage for server-based Android and iOS client applications.