The behavior of basic iteration (for i in object) over pandas objects depends on the type of the object. In short, basic iteration produces:
Series − values
DataFrame − column labels
Panel − item labels (Panel was removed in pandas 0.25)
Iterating over a DataFrame gives column names. Let's see the following example.
import pandas as pd
import numpy as np

N = 20
df = pd.DataFrame({
    'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
    'x': np.linspace(0, stop=N-1, num=N),
    'y': np.random.rand(N),
    'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
    'D': np.random.normal(100, 10, size=(N)).tolist()
})

# Iterating over a DataFrame yields its column labels
for col in df:
    print(col)
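The output is as follows (recent pandas versions keep the dictionary's insertion order; older versions sorted the column labels alphabetically, giving A C D x y):
A
x
y
C
D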
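To make the difference between the object types concrete, here is a minimal sketch that iterates over a Series and a DataFrame side by side. The data and the variable names (s, df, series_items, frame_items) are made up for illustration only.

```python
import pandas as pd

# Small, hypothetical data chosen only for illustration.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
df = pd.DataFrame({"x": [1, 2], "y": [3.0, 4.0]})

# Iterating over a Series yields its values (not its index labels).
series_items = [value for value in s]

# Iterating over a DataFrame yields its column labels (not its rows).
frame_items = [label for label in df]

print(series_items)
print(frame_items)
```

If the index labels of a Series are needed as well, s.items() yields (label, value) pairs.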
To iterate over the columns or rows of a DataFrame, we can use the following functions:
iteritems() − iterate over the columns as (key, value) pairs (named items() in current pandas; iteritems() was removed in pandas 2.0)
iterrows() − iterate over the rows as (index, Series) pairs
itertuples() − iterate over the rows as namedtuples
iteritems() iterates over each column as a (key, value) pair, with the column label as the key and the column values as a Series object.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

# items() replaces iteritems(), which was removed in pandas 2.0
for key, value in df.items():
    print(key, value)
The output is as follows:
col1 0    0.802390
1    0.324060
2    0.256811
3    0.839186
Name: col1, dtype: float64
col2 0    1.624313
1   -1.033582
2    1.796663
3    1.856277
Name: col2, dtype: float64
col3 0   -0.022142
1   -0.230820
2    1.160691
3   -0.830279
Name: col3, dtype: float64
Each column is iterated separately, as a pair made of the column label and a Series holding that column's values.
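As a small illustration of working with these (label, Series) pairs, the sketch below collects the mean of every column into a dictionary. The data and the name col_means are hypothetical, and the example assumes a pandas version that provides DataFrame.items().

```python
import pandas as pd

# Hypothetical data with known values so the result is easy to check.
df = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": [10.0, 20.0, 30.0]})

col_means = {}
for label, column in df.items():
    # label is the column name; column is a pandas Series
    col_means[label] = float(column.mean())

print(col_means)
```

For this particular calculation df.mean() gives the same numbers without a loop; the loop form is only useful when each column needs logic that has no vectorized equivalent.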
iterrows() returns an iterator that yields each index value together with a Series containing the data in that row.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

# Each iteration yields the row label and the row as a Series
for row_index, row in df.iterrows():
    print(row_index, row)
The output is as follows:
0 col1    1.529759
col2    0.762811
col3   -0.634691
Name: 0, dtype: float64
1 col1   -0.944087
col2    1.420919
col3   -0.507895
Name: 1, dtype: float64
2 col1   -0.077287
col2   -0.858556
col3   -0.663385
Name: 2, dtype: float64
3 col1   -1.638578
col2    0.059866
col3    0.493482
Name: 3, dtype: float64
Because iterrows() returns each row as a Series, the data types are not preserved across the row. Here 0, 1, 2, 3 are the row indices, and col1, col2, col3 are the column indices.
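The loss of data types is easier to see with mixed columns. The following sketch assumes a hypothetical frame with one integer column (qty) and one float column (ratio), and compares the value obtained from iterrows() with the one obtained from itertuples().

```python
import pandas as pd

# Hypothetical two-column frame: one integer column, one float column.
df = pd.DataFrame({"qty": [1, 2], "ratio": [0.5, 1.5]})

# iterrows() packs each row into a single Series, so the integer is upcast
# to float to share one dtype with the float column.
_, first_row = next(df.iterrows())
row_value = first_row["qty"]

# itertuples() reads each column separately, so the integer stays an integer.
first_tuple = next(df.itertuples(index=False))
tuple_value = first_tuple.qty

print(first_row.dtype)
print(type(row_value), type(tuple_value))
```

When the original types matter, itertuples() is therefore usually the safer choice of the two.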
The itertuples() method returns an iterator that yields a named tuple for each row in the DataFrame. The first element of the tuple is the row's index value, and the remaining elements are the row's values.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

# Each row is a namedtuple: Pandas(Index=..., col1=..., col2=..., col3=...)
for row in df.itertuples():
    print(row)
The output is as follows:
Pandas(Index=0, col1=1.5297586201375899, col2=0.76281127433814944, col3=-0.6346908238310438)
Pandas(Index=1, col1=-0.94408735763808649, col2=1.4209186418359423, col3=-0.50789517967096232)
Pandas(Index=2, col1=-0.07728664756791935, col2=-0.85855574139699076, col3=-0.6633852507207626)
Pandas(Index=3, col1=0.65734942534106289, col2=-0.95057710432604969, col3=0.80344487462316527)
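The fields of each named tuple can be read by attribute. The sketch below uses made-up data with string row labels (r0, r1) and also shows the index and name parameters of itertuples(), which drop the index field and return plain tuples.

```python
import pandas as pd

# Hypothetical data with string row labels to make the Index field visible.
df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [3.0, 4.0]}, index=["r0", "r1"])

sums = {}
for row in df.itertuples():
    # row.Index holds the row label; each column is available as an attribute.
    sums[row.Index] = row.col1 + row.col2

# index=False drops the index field; name=None yields regular tuples.
plain = list(df.itertuples(index=False, name=None))

print(sums)
print(plain)
```

Attribute access only works for column names that are valid Python identifiers; pandas falls back to positional names for the others.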
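Iteration is meant for reading. In the following example, a value is assigned to each row inside the loop:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

# Assigning to the row yielded by iterrows() changes only that row object
for index, row in df.iterrows():
    row['a'] = 10

print(df)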
The output is as follows:
       col1      col2      col3
0 -1.739815  0.735595 -0.295589
1  0.635485  0.106803  1.527922
2 -0.939064  0.547095  0.038585
3 -1.016509 -0.116580 -0.523158
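Note that no change is reflected in df. The row yielded by iterrows() is a separate Series built from the row's data, so assigning to it does not modify the original DataFrame.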
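If the goal is to change the DataFrame, one common approach is to assign to the DataFrame itself rather than to the objects produced by iteration. The following is a minimal sketch with hypothetical data; the column name a mirrors the example above, and the threshold 1.5 is arbitrary.

```python
import pandas as pd

# Hypothetical data with known values.
df = pd.DataFrame({"col1": [1.0, 2.0, 3.0]})

# Vectorized assignment: adds column 'a' with the value 10 in every row.
df["a"] = 10

# Conditional update through a boolean mask, with no explicit loop.
df.loc[df["col1"] > 1.5, "a"] = 20

print(df)
```

Both statements write directly into df, so the changes persist, unlike the assignment to row in the loop above.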