12.3.10.4.10. Reshaping data

import pandas as pd
import numpy as np


tuples = list(
    zip(
        *[
            ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
            ["one", "two", "one", "two", "one", "two", "one", "two"],
        ]
    )
)
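# Build a two-level row index ("first", "second") from the label pairs.
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
dataFrame = pd.DataFrame(np.random.randn(8, 2), index=index, columns=["A", "B"])
# Keep only the first four rows (the "bar" and "baz" groups).
dataFrame2 = dataFrame[:4]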

Stack

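# stack() moves the column labels (A, B) into a new innermost index level.
stacked = dataFrame2.stack()
# unstack() reverses this; by default it unstacks the last index level.
stacked.unstack()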
                     A         B
first second
bar   one     1.021953  2.347960
      two    -1.133448 -0.915521
baz   one     0.725976  2.472088
      two     0.749382 -0.296489


# Unstack index level 1 ("second") into the columns.
stacked.unstack(1)
second        one       two
first
bar   A  1.021953 -1.133448
      B  2.347960 -0.915521
baz   A  0.725976  0.749382
      B  2.472088 -0.296489


# Unstack index level 0 ("first") into the columns.
stacked.unstack(0)
first          bar       baz
second
one    A  1.021953  0.725976
       B  2.347960  2.472088
two    A -1.133448  0.749382
       B -0.915521 -0.296489
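
The random values above make the relationship between stack() and unstack() hard to verify by eye. The following is a minimal sketch with fixed values; the names small_index, small, stacked_small, round_trip, and by_first are illustrative and not part of the original example.

```python
import pandas as pd

# Hypothetical frame with fixed values so the result is reproducible.
small_index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two")],
    names=["first", "second"],
)
small = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0], "B": [5.0, 6.0, 7.0, 8.0]},
    index=small_index,
)

# stack() turns the columns into a third index level, giving a Series.
stacked_small = small.stack()

# unstack() with no argument undoes the stack and restores the original layout.
round_trip = stacked_small.unstack()

# A level can be named instead of numbered; "first" is level 0 here.
by_first = stacked_small.unstack("first")
```

Referring to a level by name, as in unstack("first"), should select the same level as unstack(0) in this frame, and it stays correct if the index levels are later reordered.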


Pivot tables

dataFrame = pd.DataFrame(
    {
        "A": ["one", "one", "two", "three"] * 3,
        "B": ["A", "B", "C"] * 4,
        "C": ["foo", "foo", "foo", "bar", "bar", "bar"] * 2,
        "D": np.random.randn(12),
        "E": np.random.randn(12),
    }
)
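# Rows are (A, B) pairs, columns are the values of C, and each cell holds the
# mean of D (the default aggregation). NaN marks combinations with no rows.
pd.pivot_table(dataFrame, values="D", index=["A", "B"], columns=["C"])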
C             bar       foo
A     B
one   A -0.065709 -1.049577
      B -0.217735 -0.544323
      C  0.411135 -1.737220
three A  1.037492       NaN
      B       NaN -1.840691
      C -0.516667       NaN
two   A       NaN  0.464021
      B  1.111117       NaN
      C       NaN  1.046043
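
The NaN cells in the table above come from index/column combinations that have no matching rows. As a rough sketch of how the aggregation and the empty cells can be controlled, the example below uses a small hypothetical frame (sales) with fixed values; aggfunc and fill_value are standard pivot_table parameters, but the data and variable names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: two rows share the ("one", "foo") combination.
sales = pd.DataFrame(
    {
        "A": ["one", "one", "two", "two"],
        "C": ["foo", "foo", "foo", "bar"],
        "D": [1.0, 3.0, 5.0, 7.0],
    }
)

# Default aggregation is the mean; ("one", "bar") has no rows, so it is NaN.
mean_table = pd.pivot_table(sales, values="D", index="A", columns="C")

# Sum instead of mean, and fill combinations without data with 0.
sum_table = pd.pivot_table(
    sales, values="D", index="A", columns="C", aggfunc="sum", fill_value=0
)
```

Here mean_table holds 2.0 for ("one", "foo"), the mean of 1.0 and 3.0, while sum_table holds 4.0 for the same cell.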


Time series

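# 100 timestamps at one-second intervals, starting 2022-01-05 00:00:00.
indexData = pd.date_range("1/5/2022", periods=100, freq="S")
timeStemps = pd.Series(np.random.randint(0, 500, len(indexData)), index=indexData)
# Downsample into 5-minute bins; all 100 seconds fall into a single bin.
timeStemps.resample("5Min").sum()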
2022-01-05    23303
Freq: 5T, dtype: int32
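
Because the series spans only 100 seconds, the 5-minute resample above collapses to a single row. The sketch below uses a deterministic series (values, an illustrative name) and a narrower 30-second bin so that several bins appear. It uses the lowercase aliases "s" and "min", on the assumption that the installed pandas version accepts them.

```python
import numpy as np
import pandas as pd

# 100 one-second timestamps carrying the values 0..99.
idx = pd.date_range("2022-01-05", periods=100, freq="s")
values = pd.Series(np.arange(100), index=idx)

# 30-second bins: 00:00:00, 00:00:30, 00:01:00, 00:01:30.
per_30s = values.resample("30s").sum()

# 5-minute bins: everything lands in the single bin starting at 00:00:00.
per_5min = values.resample("5min").sum()
```

The four 30-second sums add up to the single 5-minute total, which is a quick consistency check when changing bin widths.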
# Attach the UTC time zone to the naive timestamps, then convert to US Eastern time.
timeStempsUTC = timeStemps.tz_localize("UTC")
timeStempsUTC.tz_convert("US/Eastern")
2022-01-04 19:00:00-05:00    161
2022-01-04 19:00:01-05:00    425
2022-01-04 19:00:02-05:00    216
2022-01-04 19:00:03-05:00    356
2022-01-04 19:00:04-05:00    136
                            ...
2022-01-04 19:01:35-05:00    412
2022-01-04 19:01:36-05:00    154
2022-01-04 19:01:37-05:00     28
2022-01-04 19:01:38-05:00    113
2022-01-04 19:01:39-05:00    100
Freq: S, Length: 100, dtype: int32
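
The output above shows midnight UTC on 2022-01-05 displayed as 19:00 on 2022-01-04 in US/Eastern. A minimal sketch of the difference between the two calls follows, using a two-element series with illustrative names (naive, utc, eastern): tz_localize attaches a zone to naive timestamps, while tz_convert changes how the same instants are displayed.

```python
import pandas as pd

# Naive timestamps: no time zone attached yet.
naive = pd.Series([1, 2], index=pd.date_range("2022-01-05", periods=2, freq="s"))

# Declare that the wall-clock times are UTC; the clock readings do not change.
utc = naive.tz_localize("UTC")

# Re-express the same instants in US Eastern time (UTC-5 in January).
eastern = utc.tz_convert("US/Eastern")
```

Comparing eastern.index with utc.index element-wise gives True everywhere, since both describe the same moments in time.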
# Convert timestamps to periods (frequency inferred from the index) and back again.
ps = timeStemps.to_period()
ps.to_timestamp()
2022-01-05 00:00:00    161
2022-01-05 00:00:01    425
2022-01-05 00:00:02    216
2022-01-05 00:00:03    356
2022-01-05 00:00:04    136
                      ...
2022-01-05 00:01:35    412
2022-01-05 00:01:36    154
2022-01-05 00:01:37     28
2022-01-05 00:01:38    113
2022-01-05 00:01:39    100
Freq: S, Length: 100, dtype: int32
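
With a one-second frequency, the round trip through to_period() and to_timestamp() returns the original index unchanged, so the effect is easy to miss. The sketch below, on a hypothetical daily series, shows that converting to a coarser period (monthly) and back maps each entry to the start of its period. All variable names are illustrative.

```python
import pandas as pd

# Three consecutive days that straddle a month boundary.
daily = pd.Series(
    [10, 20, 30], index=pd.date_range("2022-01-30", periods=3, freq="D")
)

# Frequency inferred from the index (daily).
as_days = daily.to_period()

# Explicit coarser frequency: each day maps to the month that contains it.
as_months = daily.to_period("M")

# Back to timestamps; by default each period maps to its start.
back = as_days.to_timestamp()
month_start = as_months.to_timestamp()
```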
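# Quarterly periods whose fiscal year ends in November.
prng = pd.period_range("1990Q1", "2000Q4", freq="Q-NOV")
ts = pd.Series(np.random.randn(len(prng)), prng)
# Re-index each quarter to 9 a.m. on the first day of the month after the quarter ends.
ts.index = (prng.asfreq("M", "e") + 1).asfreq("H", "s") + 9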
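
The chained period arithmetic above is compact, so the following sketch walks through it one step at a time for a single quarter and then for a short index. It uses the lowercase hourly alias "h" and the keyword form how="end" / how="start", assuming the installed pandas version accepts both; the variable names are illustrative.

```python
import pandas as pd

# With a fiscal year ending in November, 1990Q1 covers Dec 1989 to Feb 1990.
q = pd.Period("1990Q1", freq="Q-NOV")

first_month = q.asfreq("M", how="start")   # first month of the quarter
month_end = q.asfreq("M", how="end")       # last month of the quarter
next_month = month_end + 1                 # the month after the quarter ends
first_hour = next_month.asfreq("h", how="start")  # midnight on the 1st
stamp = first_hour + 9                     # nine hours later: 09:00

# The same steps applied to a whole PeriodIndex.
quarters = pd.period_range("1990Q1", "1990Q4", freq="Q-NOV")
new_index = (quarters.asfreq("M", how="end") + 1).asfreq("h", how="start") + 9
```

Each intermediate value is itself a Period, so adding an integer moves forward by that many units of the current frequency: one month in the third step, nine hours in the last.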
