12/23/2023
Animal age converter panda

read_csv()
The data has been read from the data source into the Pandas DataFrame. You will have to change the path to that of the file you want to read. You can download the dataset used in the blog.

to_csv()
The to_csv() function works exactly opposite to read_csv(). It writes the data contained in a Pandas DataFrame or Series to a CSV file. read_csv() and to_csv() are among the most used functions in Pandas, because they are used whenever data is read from or written to a data source, and are very important to know.

head() and tail()
head(n) is used to return the first n rows of a dataset. By default, df.head() will return the first 5 rows of the DataFrame. If you want more or fewer rows, you can pass n as an integer. With n set to 6, the first 6 rows (indexed 0 to 5) are returned, as expected. tail() is similar to head(), and returns the bottom n rows of a dataset. head() and tail() help you get a quick glance at your dataset, and check whether the data has been read into the DataFrame properly.

describe()
describe() is used to generate descriptive statistics of the data in a Pandas DataFrame or Series. It summarizes the central tendency and dispersion of the dataset, and helps in getting a quick overview of it. More details about describe() can be found in the pandas documentation.

    data_1.describe()

describe() lists different descriptive statistical measures for all numerical columns in our dataset. By setting the include parameter to 'all', we can get the description to cover all columns, including those containing categorical information.

memory_usage()
memory_usage() returns a Pandas Series holding the memory usage of each column (in bytes) in a Pandas DataFrame. By setting the deep parameter to True, we can find out the actual space taken by each column, including the contents of object columns such as strings. The memory usage of each column is given as output in a Pandas Series. It is important to know the memory usage of a DataFrame, so that you can tackle errors like MemoryError in Python. More details on memory_usage() can be found in the pandas documentation.

astype()
astype() is used to cast a Pandas object to a particular data type. It can be a very helpful function when your data is not stored in the correct format (data type). For instance, if floating point numbers have somehow been read in as strings, you can convert them back to floating point numbers with astype(). Or, if you want to convert an object column to the category data type, you can use astype() as well:

    data_1['Gender'] = data_1['Gender'].astype('category')

Note that the result is assigned back to the Gender column. Assigning it to data_1 itself would replace the whole DataFrame with a single Series. You can verify the change in data type by looking at the data types of all columns in the dataset using the dtypes attribute. The documentation for astype() is available in the pandas documentation.

loc
loc helps to access a group of rows and columns in a dataset, that is, a slice of the dataset, as per our requirement. For instance, if we only want the last 2 rows and the first 3 columns of a dataset, we can select them with loc by passing their labels. loc works on row and column labels rather than on row and column numbers. In the example, loc returns the “Name”, “Age”, and “State” columns for the first 5 customer records. Keep in mind that the index starts from 0 in Python, and that loc is inclusive on both values mentioned. So 0:4 will mean indices 0 to 4, both included. loc is one of the most powerful functions in Pandas, and is a must-know for all Data Analysts and Data Scientists. You can find the documentation for loc in the pandas documentation.

iloc
iloc works in a similar manner, except that it selects by integer position and does not include the end value of a slice. So for 0:4, iloc would return the rows with index 0, 1, 2, and 3, while loc would return the rows with index 0, 1, 2, 3, and 4. The documentation for iloc can be found in the pandas documentation.

to_datetime()
to_datetime() converts a Python object to datetime format. It can take an integer, floating point number, list, Pandas Series, or Pandas DataFrame as argument. to_datetime() is very powerful when the dataset has time series values or dates. In the example, the DOB column has been changed to the Pandas datetime format, so all datetime functions can now be applied to this column. You can read more about to_datetime() in the pandas documentation.

value_counts()
value_counts() returns a Pandas Series containing the counts of unique values.
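As a rough sketch of the loc and iloc slicing rules, the astype() conversion, to_datetime(), value_counts(), and memory_usage() described in this post, the snippet below runs them on a small made-up DataFrame. The column names follow the ones mentioned in the post (Name, Age, State, Gender, DOB), but every value is invented for illustration, and the name data_1 is reused only for continuity. It is not the blog's actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the blog's dataset: all values are invented.
data_1 = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dana", "Eli", "Fay"],
    "Age": ["34", "27", "45", "27", "52", "34"],  # numbers stored as strings
    "State": ["NY", "CA", "NY", "TX", "CA", "NY"],
    "Gender": ["F", "M", "M", "F", "M", "F"],
    "DOB": ["1990-01-15", "1997-06-02", "1979-11-23",
            "1997-03-09", "1972-08-30", "1990-12-01"],
})

# loc is label-based and includes both ends of the slice: labels 0..4, 5 rows.
by_label = data_1.loc[0:4, ["Name", "Age", "State"]]

# iloc is position-based and excludes the end: positions 0..3, 4 rows.
by_position = data_1.iloc[0:4, 0:3]

# astype() casts a column; assign back to the column, not to the DataFrame.
data_1["Age"] = data_1["Age"].astype(int)
data_1["Gender"] = data_1["Gender"].astype("category")

# to_datetime() parses the date strings so the .dt accessor becomes available.
data_1["DOB"] = pd.to_datetime(data_1["DOB"])
birth_years = data_1["DOB"].dt.year

# value_counts() returns a Series of counts, most frequent value first.
state_counts = data_1["State"].value_counts()

# memory_usage(deep=True) also accounts for the contents of object columns.
mem = data_1.memory_usage(deep=True)

print(by_label.shape, by_position.shape)
print(data_1.dtypes)
print(state_counts)
print(mem)
```

Comparing the two shapes printed first makes the inclusive versus exclusive slicing difference visible: the same 0:4 slice yields five rows with loc and four rows with iloc.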