{"id":4717,"date":"2021-05-01T14:57:45","date_gmt":"2021-05-01T09:27:45","guid":{"rendered":"https:\/\/python-programs.com\/?p=4717"},"modified":"2021-11-22T18:49:07","modified_gmt":"2021-11-22T13:19:07","slug":"pandas-get-sum-of-column-values-in-a-dataframe","status":"publish","type":"post","link":"https:\/\/python-programs.com\/pandas-get-sum-of-column-values-in-a-dataframe\/","title":{"rendered":"Pandas: Get sum of column values in a Dataframe"},"content":{"rendered":"
In this article, we will discuss how to find the sum of column values in a dataframe. So, let’s start exploring the topic.<\/p>\n
This article covers four cases: summing a column selected by name, summing a column selected by position, summing only selected rows of a column, and summing the rows that match a condition. Each case is illustrated with an example.<\/p>\n Syntax with loc[ ]- dataFrame_Object_name.loc[:, 'column_name'].sum( )<\/em><\/p>\n If we know a column's position but not its name, we can find the sum of all values in that column using iloc[ ] together with sum( ).<\/p>\n If we need the sum of only specific entries of a column, we can select those rows along with the column before summing.<\/p>\n If we want the sum of all values that satisfy a condition, for example the scores for a particular city such as New York, we can filter the rows on that condition before summing.<\/p>\n To find the sum of values in a single column, we can use the sum( )<\/code> or the
loc[ ]<\/code> function.<\/p>\n
Using sum() :<\/h4>\n
sum( )<\/code> alone is enough here: we select a column from the dataframe by its name and add up the values in that column.<\/p>\n
Syntax- dataFrame_Object['column_name'].sum( )<\/pre>\n
#Program :\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n# Example data (np.nan marks a missing value)\r\nstudents = [('Jill', 16, 'Tokyo', 150),\r\n('Rachel', 38, 'Texas', 177),\r\n('Kirti', 39, 'New York', 97),\r\n('Veena', 40, 'Texas', np.nan),\r\n('Lucifer', np.nan, 'Texas', 130),\r\n('Pablo', 30, 'New York', 155),\r\n('Lionel', 45, 'Colombia', 121)]\r\ndfObj = pd.DataFrame(students, columns=['Name','Age','City','Score'])\r\n# Sum of all values in the 'Score' column; NaN entries are skipped by default\r\ntotalSum = dfObj['Score'].sum()\r\nprint(totalSum)<\/pre>\n
Output :\r\n830.0<\/pre>\n
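The total of 830.0 depends on how sum( ) treats the missing score. The following is a minimal sketch, assuming the same scores as the example data; the variable names (scores, total_default, total_strict, total_empty) are illustrative and not part of the original program.

```python
import numpy as np
import pandas as pd

# Same scores as the example data, including one missing value
scores = pd.Series([150, 177, 97, np.nan, 130, 155, 121], name='Score')

# Default behaviour: skipna=True, so the NaN entry is ignored
total_default = scores.sum()

# With skipna=False, a single NaN makes the whole result NaN
total_strict = scores.sum(skipna=False)

# min_count is the number of non-NaN values required for a numeric result
total_empty = pd.Series([np.nan, np.nan]).sum(min_count=1)

print(total_default)  # 830.0
print(total_strict)   # nan
print(total_empty)    # nan
```

If a missing value should be treated as an error rather than skipped, checking the result of sum(skipna=False) for NaN is one simple way to detect it.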
Using loc[ ] :<\/h4>\n
loc[ ]<\/code> and 
sum( )<\/code> work together here: loc[ ] selects the column by name, and sum( ) adds up the values in that column.<\/p>\n
#Program :\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n# Example data (np.nan marks a missing value)\r\nstudents = [('Jill', 16, 'Tokyo', 150),\r\n('Rachel', 38, 'Texas', 177),\r\n('Kirti', 39, 'New York', 97),\r\n('Veena', 40, 'Texas', np.nan),\r\n('Lucifer', np.nan, 'Texas', 130),\r\n('Pablo', 30, 'New York', 155),\r\n('Lionel', 45, 'Colombia', 121)]\r\ndfObj = pd.DataFrame(students, columns=['Name','Age','City','Score'])\r\n# Sum of all values in the 'Score' column, selected with loc[ ] (all rows, one column label)\r\ntotalSum = dfObj.loc[:, 'Score'].sum()\r\nprint(totalSum)<\/pre>\n
Output :\r\n830.0<\/pre>\n
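Both forms select the same column, so they give the same total. As a hedged sketch (the dataframe name df and the result names are illustrative, and only the numeric columns of the example data are reproduced), the comparison below also shows that passing a list of column names sums several columns at once.

```python
import numpy as np
import pandas as pd

# Numeric columns from the example data
df = pd.DataFrame({
    'Age':   [16, 38, 39, 40, np.nan, 30, 45],
    'Score': [150, 177, 97, np.nan, 130, 155, 121],
})

# Bracket selection and loc[ ] return the same Series
same_column = df['Score'].equals(df.loc[:, 'Score'])

# A list of names selects several columns; sum() then returns one total per column
totals = df[['Age', 'Score']].sum()

print(same_column)  # True
print(totals)
```

The result of the multi-column sum is a Series indexed by column name, so an individual total can be read with totals['Score'].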
Select the column by position and get the sum of all values in that column :<\/h3>\n
iloc[ ]<\/code> and
sum( )<\/code>. Because a column slice is passed, iloc[ ] returns a one-column dataframe rather than a series, so the result of
sum( )<\/code> is a series holding that column's total.<\/p>\n
#Program :\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Example data (np.nan marks a missing value)\r\nstudents = [('Jill', 16, 'Tokyo', 150),\r\n('Rachel', 38, 'Texas', 177),\r\n('Kirti', 39, 'New York', 97),\r\n('Veena', 40, 'Texas', np.nan),\r\n('Lucifer', np.nan, 'Texas', 130),\r\n('Pablo', 30, 'New York', 155),\r\n('Lionel', 45, 'Colombia', 121)]\r\ndfObj = pd.DataFrame(students, columns=['Name','Age','City','Score'])\r\ncolumn_number = 4\r\n# Total sum of values in the 4th column, i.e. 'Score'\r\n# The slice keeps a one-column dataframe, so sum() returns a Series\r\ntotalSum = dfObj.iloc[:, column_number-1:column_number].sum()\r\nprint(totalSum)<\/pre>\n
Output :\r\nScore    830.0\r\ndtype: float64<\/pre>\n
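The output above is a Series rather than a plain number because a slice was passed to iloc[ ]. The sketch below, which assumes a reduced two-column version of the example data (df and the other names are illustrative), contrasts the slice with a single integer position.

```python
import numpy as np
import pandas as pd

# Reduced version of the example data: one text column, one numeric column
df = pd.DataFrame({
    'Name':  ['Jill', 'Rachel', 'Kirti', 'Veena', 'Lucifer', 'Pablo', 'Lionel'],
    'Score': [150, 177, 97, np.nan, 130, 155, 121],
})

# A slice of positions keeps a dataframe with one column
as_frame = df.iloc[:, 1:2]
frame_total = as_frame.sum()      # Series indexed by column name

# A single integer position returns the column as a Series
as_series = df.iloc[:, 1]
scalar_total = as_series.sum()    # plain float

print(type(as_frame).__name__, type(as_series).__name__)  # DataFrame Series
print(frame_total['Score'], scalar_total)                 # 830.0 830.0
```

Using a single integer is usually the simpler choice when only one number is needed; the slice form is useful when the same code should handle a range of adjacent columns.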
Find the sum of column values for selected rows only in a Dataframe :<\/h3>\n
#Program :\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Example data (np.nan marks a missing value)\r\nstudents = [('Jill', 16, 'Tokyo', 150),\r\n('Rachel', 38, 'Texas', 177),\r\n('Kirti', 39, 'New York', 97),\r\n('Veena', 40, 'Texas', np.nan),\r\n('Lucifer', np.nan, 'Texas', 130),\r\n('Pablo', 30, 'New York', 155),\r\n('Lionel', 45, 'Colombia', 121)]\r\ndfObj = pd.DataFrame(students, columns=['Name','Age','City','Score'])\r\ncolumn_number = 4\r\nentries = 3\r\n# Sum of the first three values from the 4th column\r\n# Rows 0..entries-1 are selected by position; the column slice keeps a dataframe\r\ntotalSum = dfObj.iloc[0:entries, column_number-1:column_number].sum()\r\nprint(totalSum)<\/pre>\n
Output :\r\nScore    424.0\r\ndtype: float64<\/pre>\n
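Rows can be picked by position or by index label, and the two behave slightly differently. This is a minimal sketch, assuming the default integer index and only the Score column of the example data (df and the result names are illustrative).

```python
import numpy as np
import pandas as pd

# Score column from the example data, with the default 0..6 index
df = pd.DataFrame({'Score': [150, 177, 97, np.nan, 130, 155, 121]})

# Position-based slice: the end position is excluded, so rows 0, 1 and 2
first_three_by_position = df['Score'].iloc[0:3].sum()

# Label-based slice: the end label is included, so labels 0, 1 and 2
first_three_by_label = df.loc[0:2, 'Score'].sum()

# An explicit list of labels selects rows that are not adjacent
picked_rows = df.loc[[0, 2, 5], 'Score'].sum()

print(first_three_by_position)  # 424.0
print(first_three_by_label)     # 424.0
print(picked_rows)              # 402.0
```

Each of these returns a single number because the column is selected by one name rather than by a slice.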
Find the sum of column values in a dataframe based on condition :<\/strong><\/h3>\n
#Program :\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Example data (np.nan marks a missing value)\r\nstudents = [('Jill', 16, 'Tokyo', 150),\r\n('Rachel', 38, 'Texas', 177),\r\n('Kirti', 39, 'New York', 97),\r\n('Veena', 40, 'Texas', np.nan),\r\n('Lucifer', np.nan, 'Texas', 130),\r\n('Pablo', 30, 'New York', 155),\r\n('Lionel', 45, 'Colombia', 121)]\r\ndfObj = pd.DataFrame(students, columns=['Name','Age','City','Score'])\r\n# Sum of all the scores from New York city\r\n# The boolean mask keeps only rows where City is 'New York'\r\ntotalSum = dfObj.loc[dfObj['City'] == 'New York', 'Score'].sum()\r\nprint(totalSum)<\/pre>\n
Output :\r\n252.0<\/pre>\n","protected":false},"excerpt":{"rendered":"