12.3.10.4.7. Merge data

import pandas as pd
import numpy as np

Concat

Create a DataFrame of random values

dataFrame = pd.DataFrame(np.random.randn(10, 4))

Break it into pieces by row, then concatenate the pieces back together

pieces = [dataFrame[:3], dataFrame[3:7], dataFrame[7:]]
pd.concat(pieces)
          0         1         2         3
0 -0.594662  1.364867 -0.650156  0.670481
1  0.862960 -0.101338 -1.251675  0.085087
2 -0.654680  0.431838  0.757597 -0.921025
3 -0.428261  0.467367  0.634254 -0.398001
4  0.636669  1.327476 -0.892867  0.318646
5 -1.791799  0.323012 -1.138974  0.134096
6 -0.131071  0.728004  0.652262 -1.299391
7 -1.250320  0.284590 -0.769879 -0.794100
8 -0.188462 -1.323862 -1.481396  1.708920
9  1.075906 -1.282515 -0.251804 -0.949320
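
Concatenating the slices in their original order gives back the original rows, so the result above is hard to tell apart from the input. The sketch below is one way to make the behaviour of pd.concat more visible. It assumes a seeded generator (np.random.default_rng(0)) for reproducibility and uses illustrative names (frame, labelled, renumbered) that are not part of the original script.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption; the
# original script uses the unseeded np.random.randn).
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.standard_normal((10, 4)))
pieces = [frame[:3], frame[3:7], frame[7:]]

# Concatenating the slices in order reproduces the original frame.
rebuilt = pd.concat(pieces)
print(rebuilt.equals(frame))

# keys= adds an outer index level recording which piece each row came from.
labelled = pd.concat(pieces, keys=["head", "middle", "tail"])
print(labelled.loc["middle"])

# ignore_index=True discards the original row labels and renumbers from 0,
# which is useful when the pieces are stacked in a different order.
renumbered = pd.concat([pieces[2], pieces[0]], ignore_index=True)
print(renumbered.index.tolist())
```

Without ignore_index=True, the last call would keep the labels 7, 8, 9, 0, 1, 2 from the source rows.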


Join

left = pd.DataFrame({"key": ["foo", "foo"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})
pd.merge(left, right, on="key")
   key  lval  rval
0  foo     1     4
1  foo     1     5
2  foo     2     4
3  foo     2     5
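
The merge above returns four rows because "foo" appears twice in each frame, and every left row is paired with every matching right row (2 x 2 = 4). A minimal sketch of how to detect that situation, and how the result looks when keys are unique, might be the following. The frames left_u and right_u are hypothetical and were added only for illustration.

```python
import pandas as pd

left = pd.DataFrame({"key": ["foo", "foo"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})

# Duplicate keys on both sides: each left row pairs with each right row.
many_to_many = pd.merge(left, right, on="key")
print(len(many_to_many))

# validate= makes pandas raise MergeError when the keys are not unique
# in the way the caller expects.
try:
    pd.merge(left, right, on="key", validate="one_to_one")
    duplicate_keys_detected = False
except pd.errors.MergeError as exc:
    duplicate_keys_detected = True
    print("merge is not one-to-one:", exc)

# Hypothetical frames with unique keys: one output row per key.
left_u = pd.DataFrame({"key": ["foo", "bar"], "lval": [1, 2]})
right_u = pd.DataFrame({"key": ["foo", "bar"], "rval": [4, 5]})
one_to_one = pd.merge(left_u, right_u, on="key", validate="one_to_one")
print(one_to_one)
```

Passing validate= is optional, but it turns an unexpectedly large join result into an explicit error.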


Grouping

dataFrame = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
        "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
        "C": np.random.randn(8),
        "D": np.random.randn(8),
    }
)
# Sum only the numeric columns; "B" holds strings.
dataFrame.groupby("A")[["C", "D"]].sum()
            C         D
A
bar  1.092838 -1.140762
foo  0.121636 -1.941217


# Grouping by two columns produces a two-level (A, B) row index.
dataFrame.groupby(["A", "B"])[["C", "D"]].sum()
                  C         D
A   B
bar one    2.444171 -2.047571
    three  0.264714  0.762593
    two   -1.616048  0.144216
foo one    0.712266 -2.224192
    three -1.798597  0.724009
    two    1.207966 -0.441035
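
The two results above are related: summing the (A, B) table over B gives back the per-A totals. The sketch below checks this on fixed values, which are an assumption that replaces the random columns so the numbers are reproducible. It also shows named aggregation as one way to compute several statistics at once. The output column names c_total, d_mean and rows are illustrative.

```python
import pandas as pd

# Fixed values instead of random ones (an assumption made for reproducibility).
frame = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
        "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
        "C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        "D": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
    }
)

by_a = frame.groupby("A")[["C", "D"]].sum()
by_a_b = frame.groupby(["A", "B"])[["C", "D"]].sum()

# Collapsing the inner level "B" recovers the per-"A" totals.
collapsed = by_a_b.groupby(level="A").sum()
pd.testing.assert_frame_equal(collapsed, by_a)

# Named aggregation: each keyword is an output column, given as
# (input column, aggregation function).
summary = frame.groupby("A").agg(
    c_total=("C", "sum"),
    d_mean=("D", "mean"),
    rows=("C", "size"),
)
print(summary)
```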

