12.3.10.4.6. Grouping data

import pandas as pd
import numpy as np

Concat

Create a DataFrame of random values with 10 rows and 4 columns

dataFrame = pd.DataFrame(np.random.randn(10, 4))

Break it into pieces by row, then concatenate the pieces back together

pieces = [dataFrame[:3], dataFrame[3:7], dataFrame[7:]]
pd.concat(pieces)
          0         1         2         3
0  1.637018 -0.095788  2.347225 -0.154849
1 -1.465488 -0.216412 -0.347421  0.658839
2 -0.730726 -1.934136 -0.064249  0.292632
3 -0.306315  1.245162 -1.301430 -0.520355
4  0.110070  0.442971 -1.458418 -0.524914
5 -2.091738  0.100677  0.469029 -0.941721
6 -0.317765 -0.699475 -0.270448  0.623718
7  0.167101 -0.670248 -1.233412 -0.410496
8  1.568575  1.643292  0.807052 -0.520182
9  0.014365 -0.772688  0.237279 -0.283754
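As a minimal sketch of what the concatenation above does, the following checks that stacking contiguous row slices in order gives back the original frame. The seeded generator, the variable names, and the `keys` labels ("head", "middle", "tail") are assumptions made for illustration and are not part of the original script.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption; the
# original script uses np.random.randn without a seed).
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.standard_normal((10, 4)))

# Split by row position, then stack the pieces back in the same order.
pieces = [frame[:3], frame[3:7], frame[7:]]
rebuilt = pd.concat(pieces)

# keys= adds an outer index level recording which piece each row came from.
labelled = pd.concat(pieces, keys=["head", "middle", "tail"])
middle = labelled.loc["middle"]

print(rebuilt.equals(frame))
print(list(middle.index))
```

Passing `keys` is one way to keep track of the origin of each row; `ignore_index=True` is the alternative when the original row labels do not matter.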


Join

left = pd.DataFrame({"key": ["foo", "foo"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})
pd.merge(left, right, on="key")
   key  lval  rval
0  foo     1     4
1  foo     1     5
2  foo     2     4
3  foo     2     5
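The merge above returns four rows because the key "foo" appears twice on each side, so every left row is paired with every right row. The sketch below shows that row count and, as an assumed variation that is not in the original script, uses the `validate` argument of `pd.merge` on frames with unique keys (`unique_left` and `unique_right` are hypothetical names).

```python
import pandas as pd

left = pd.DataFrame({"key": ["foo", "foo"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})

# Duplicate keys on both sides: 2 left rows x 2 right rows = 4 result rows.
merged = pd.merge(left, right, on="key")
print(len(merged))

# Hypothetical frames with unique keys; validate="one_to_one" asks pandas
# to raise an error if either side turns out to have duplicate keys.
unique_left = pd.DataFrame({"key": ["foo", "bar"], "lval": [1, 2]})
unique_right = pd.DataFrame({"key": ["foo", "bar"], "rval": [4, 5]})
one_to_one = pd.merge(unique_left, unique_right, on="key", validate="one_to_one")
print(one_to_one)
```

Checking the relationship with `validate` can catch unintended row multiplication early, at the cost of an extra uniqueness check on the keys.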


Grouping

dataFrame = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
        "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
        "C": np.random.randn(8),
        "D": np.random.randn(8),
    }
)
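# Select the numeric columns explicitly: with pandas 2.0 or later, sum() on
# the whole group would also concatenate the strings in column B.
dataFrame.groupby("A")[["C", "D"]].sum()
            C         D
A
bar  3.415306  1.891978
foo  1.344929  2.036847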
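When different columns need different summaries, named aggregation is one option. This is a sketch on small fixed data rather than the random values used above; the frame contents and the output column names (`c_total`, `d_mean`, `rows`) are assumptions chosen for illustration.

```python
import pandas as pd

# Small fixed data so the result can be checked by hand.
frame = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar"],
        "C": [1.0, 2.0, 3.0, 4.0],
        "D": [10.0, 20.0, 30.0, 40.0],
    }
)

# Each keyword names an output column; the tuple gives the source column
# and the aggregation to apply to it.
summary = frame.groupby("A").agg(
    c_total=("C", "sum"),
    d_mean=("D", "mean"),
    rows=("C", "count"),
)
print(summary)
```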


dataFrame.groupby(["A", "B"]).sum()
                  C         D
A   B
bar one    1.162646  1.264296
    three  2.278214  0.676291
    two   -0.025555 -0.048609
foo one    3.349957  0.085027
    three -0.072058  0.578082
    two   -1.932971  1.373738

