1. Generating a time range
In [1]: import pandas as pd
In [2]: import numpy as np

1) Generate a time range at hourly intervals
In [3]: rng = pd.date_range('1/1/2011', periods=72, freq='H')

2) Create a Series indexed by that range (only the first rows of each output are shown)
In [4]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [5]: ts
Out[5]:
2011-01-01 00:00:00   -0.204085
2011-01-01 01:00:00    1.101711
2011-01-01 02:00:00    1.840500
2011-01-01 03:00:00    0.112426
2011-01-01 04:00:00   -0.310413
2011-01-01 05:00:00    1.180762
2011-01-01 06:00:00    0.087775
2011-01-01 07:00:00    1.087877
2011-01-01 08:00:00   -0.950237
2011-01-01 09:00:00   -0.468453
Freq: H, dtype: float64

3) Change the time interval; method='pad' fills each new timestamp with the most recent earlier value
In [6]: converted = ts.asfreq('45Min', method='pad')
In [7]: converted
Out[7]:
2011-01-01 00:00:00   -0.204085
2011-01-01 00:45:00   -0.204085
2011-01-01 01:30:00    1.101711
2011-01-01 02:15:00    1.840500
2011-01-01 03:00:00    0.112426
2011-01-01 03:45:00    0.112426
2011-01-01 04:30:00   -0.310413
2011-01-01 05:15:00    1.180762
2011-01-01 06:00:00    0.087775
2011-01-01 06:45:00    0.087775
2011-01-01 07:30:00    1.087877
2011-01-01 08:15:00   -0.950237
2011-01-01 09:00:00   -0.468453
Freq: 45T, dtype: float64

2. Converting to date types
2.1 Creating a date from numbers
Note: the calls to datetime(...) here and below assume the import: from datetime import datetime
In [8]: pd.Timestamp(datetime(2012, 5, 1))
Out[8]: Timestamp('2012-05-01 00:00:00')

2.2 Creating a date from a string
In [9]: pd.Timestamp('2012-05-01')
Out[9]: Timestamp('2012-05-01 00:00:00')

2.3 Year and month only
In [10]: pd.Period('2011-01')
Out[10]: Period('2011-01', 'M')
In [11]: pd.Period('2012-05', freq='D')
Out[11]: Period('2012-05-01', 'D')

2.4 Converting to datetime
In [22]: pd.to_datetime(pd.Series(['Jul 31, 2009', '2010-01-10', None]))
Out[22]:
0   2009-07-31
1   2010-01-10
2          NaT
dtype: datetime64[ns]
In [23]: pd.to_datetime(['2005/11/23', '2010.12.31'])
Out[23]: DatetimeIndex(['2005-11-23', '2010-12-31'], dtype='datetime64[ns]', freq=None)

3. Generating time-range indexes
3.1 Ways to build an index
In [35]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
# Note the frequency information
In [36]: index = pd.DatetimeIndex(dates)
In [37]: index
Out[37]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)
# Automatically converted to DatetimeIndex
In [38]: index = pd.Index(dates)
In [39]: index
Out[39]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)
date_range generates calendar days by default; bdate_range generates business days:
In [40]: index = pd.date_range('2000-1-1', periods=1000, freq='M')
In [41]: index
Out[41]: DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30','2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31', '2000-09-30', '2000-10-31', ... '2082-07-31', '2082-08-31', '2082-09-30', '2082-10-31', '2082-11-30', '2082-12-31', '2083-01-31', '2083-02-28', '2083-03-31', '2083-04-30'], dtype='datetime64[ns]', length=1000, freq='M')
In [42]: index = pd.bdate_range('2012-1-1', periods=250)
In [43]: index
Out[43]: DatetimeIndex(['2012-01-02', '2012-01-03', '2012-01-04', '2012-01-05','2012-01-06', '2012-01-09', '2012-01-10', '2012-01-11', '2012-01-12', '2012-01-13', ... '2012-12-03', '2012-12-04', '2012-12-05', '2012-12-06', '2012-12-07', '2012-12-10', '2012-12-11', '2012-12-12', '2012-12-13', '2012-12-14'], dtype='datetime64[ns]', length=250, freq='B')
In [44]: start = datetime(2011, 1, 1)
In [45]: end = datetime(2012, 1, 1)
In [46]: rng = pd.date_range(start, end)
In [47]: rng
Out[47]: DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04','2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08', '2011-01-09', '2011-01-10', ... '2011-12-23', '2011-12-24', '2011-12-25', '2011-12-26', '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30', '2011-12-31', '2012-01-01'], dtype='datetime64[ns]', length=366, freq='D')
In [48]: rng = pd.bdate_range(start, end)
In [49]: rng
Out[49]: DatetimeIndex(['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-06','2011-01-07', '2011-01-10', '2011-01-11', '2011-01-12', '2011-01-13', '2011-01-14', ... '2011-12-19', '2011-12-20', '2011-12-21', '2011-12-22', '2011-12-23', '2011-12-26', '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30'], dtype='datetime64[ns]', length=260, freq='B')
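The difference between the two ranges above can be checked with a small self-contained sketch. The variable names calendar_days and business_days are illustrative only and do not appear in the session; the sketch assumes both functions accept start and end positionally, as they do above.

```python
from datetime import datetime

import pandas as pd

start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)

# date_range defaults to calendar-day frequency; both endpoints are included.
calendar_days = pd.date_range(start, end)

# bdate_range defaults to business-day frequency, skipping Saturdays and Sundays.
business_days = pd.bdate_range(start, end)

print(len(calendar_days), len(business_days))

# Every business day falls on Monday (0) through Friday (4).
print(business_days.dayofweek.max())
```

Both endpoints here fall on a weekend, so the business-day range starts on 2011-01-03 and stops on 2011-12-30.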
3.2 Last business day of each month, and weekly
In [50]: pd.date_range(start, end, freq='BM')
Out[50]: DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29','2011-05-31', '2011-06-30', '2011-07-29', '2011-08-31', '2011-09-30', '2011-10-31', '2011-11-30', '2011-12-30'], dtype='datetime64[ns]', freq='BM')
In [51]: pd.date_range(start, end, freq='W')
Out[51]: DatetimeIndex(['2011-01-02', '2011-01-09', '2011-01-16', '2011-01-23','2011-01-30', '2011-02-06', '2011-02-13', '2011-02-20', '2011-02-27', '2011-03-06', '2011-03-13', '2011-03-20', '2011-03-27', '2011-04-03', '2011-04-10', '2011-04-17', '2011-04-24', '2011-05-01', '2011-05-08', '2011-05-15', '2011-05-22', '2011-05-29', '2011-06-05', '2011-06-12', '2011-06-19', '2011-06-26', '2011-07-03', '2011-07-10', '2011-07-17', '2011-07-24', '2011-07-31', '2011-08-07', '2011-08-14', '2011-08-21', '2011-08-28', '2011-09-04', '2011-09-11', '2011-09-18', '2011-09-25', '2011-10-02', '2011-10-09', '2011-10-16', '2011-10-23', '2011-10-30', '2011-11-06', '2011-11-13', '2011-11-20', '2011-11-27', '2011-12-04', '2011-12-11', '2011-12-18', '2011-12-25', '2012-01-01'], dtype='datetime64[ns]', freq='W-SUN')
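As a hedged sketch of the two frequencies above, the same ranges can be built from offset objects instead of the string aliases 'BM' and 'W'. Using offset objects is a choice made here, not something the session does; it avoids depending on alias spellings, which have changed between pandas releases. The names month_ends and sundays are illustrative.

```python
from datetime import datetime

import pandas as pd

start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)

# Last business day of each month (what the 'BM' alias produces in the session above).
month_ends = pd.date_range(start, end, freq=pd.offsets.BMonthEnd())

# Weekly, anchored on Sunday (weekday=6), matching the 'W-SUN' frequency shown above.
sundays = pd.date_range(start, end, freq=pd.offsets.Week(weekday=6))

# 2011-04-30 is a Saturday, so the April entry is Friday the 29th.
print(month_ends[3])
print(len(month_ends), len(sundays))
```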
3.3 The 20 business days counting back from end, and the 20 business days counting forward from start
In [52]: pd.bdate_range(end=end, periods=20)
Out[52]: DatetimeIndex(['2011-12-05', '2011-12-06', '2011-12-07', '2011-12-08','2011-12-09', '2011-12-12', '2011-12-13', '2011-12-14', '2011-12-15', '2011-12-16', '2011-12-19', '2011-12-20', '2011-12-21', '2011-12-22', '2011-12-23', '2011-12-26', '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30'], dtype='datetime64[ns]', freq='B')
In [53]: pd.bdate_range(start=start, periods=20)
Out[53]: DatetimeIndex(['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-06','2011-01-07', '2011-01-10', '2011-01-11', '2011-01-12', '2011-01-13', '2011-01-14', '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20', '2011-01-21', '2011-01-24', '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28'], dtype='datetime64[ns]', freq='B')
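A minimal sketch of anchoring a range on one endpoint plus a count, mirroring the two calls above. The names last_20 and first_20 are illustrative. Neither anchor is itself a business day (2011-01-01 is a Saturday, 2012-01-01 a Sunday), so each range begins or ends at the nearest business day inside the interval.

```python
from datetime import datetime

import pandas as pd

start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)

# Anchor on `end` and count 20 business days backwards.
last_20 = pd.bdate_range(end=end, periods=20)

# Anchor on `start` and count 20 business days forwards.
first_20 = pd.bdate_range(start=start, periods=20)

print(last_20[0].date(), last_20[-1].date())
print(first_20[0].date(), first_20[-1].date())
```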
4. Selecting by partial index, and slicing
In [56]: rng = pd.date_range(start, end, freq='BM')
In [57]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [58]: ts.index
Out[58]: DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29','2011-05-31', '2011-06-30', '2011-07-29', '2011-08-31', '2011-09-30', '2011-10-31', '2011-11-30', '2011-12-30'], dtype='datetime64[ns]', freq='BM')
In [59]: ts[:5].index
Out[59]: DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29','2011-05-31'], dtype='datetime64[ns]', freq='BM')
In [60]: ts[::2].index
Out[60]: DatetimeIndex(['2011-01-31', '2011-03-31', '2011-05-31', '2011-07-29','2011-09-30', '2011-11-30'], dtype='datetime64[ns]', freq='2BM')
In [61]: ts['1/31/2011']
Out[61]: -1.2812473076599531
In [62]: ts[datetime(2011, 12, 25):]
Out[62]:
2011-12-30    0.687738
Freq: BM, dtype: float64
In [63]: ts['10/31/2011':'12/31/2011']
Out[63]:
2011-10-31    0.149748
2011-11-30   -0.732339
2011-12-30    0.687738
Freq: BM, dtype: float64
In [64]: ts['2011']
Out[64]:
2011-01-31   -1.281247
2011-02-28   -0.727707
2011-03-31   -0.121306
2011-04-29   -0.097883
2011-05-31    0.695775
2011-06-30    0.341734
2011-07-29    0.959726
2011-08-31   -1.110336
2011-09-30   -0.619976
2011-10-31    0.149748
2011-11-30   -0.732339
2011-12-30    0.687738
Freq: BM, dtype: float64
In [65]: ts['2011-6']
Out[65]:
2011-06-30    0.341734
Freq: BM, dtype: float64
When a DataFrame has a time index, subsets can be extracted by that index in the same way.
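The label-based selections above, together with the DataFrame case just mentioned, can be sketched in a self-contained form. This is an illustration under assumptions: the values are placeholders from np.arange rather than the random data of the session, and .loc is used throughout because it is the explicit label-based accessor.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Business month ends of 2011, as in the session; the values are placeholders.
idx = pd.date_range(datetime(2011, 1, 1), datetime(2012, 1, 1),
                    freq=pd.offsets.BMonthEnd())
ts = pd.Series(np.arange(12.0), index=idx)
df = pd.DataFrame({'A': np.arange(12.0)}, index=idx)

whole_year = ts.loc['2011']                  # every row in 2011
june = ts.loc['2011-06']                     # every row in June 2011
last_quarter = ts.loc['2011-10':'2011-12']   # string slice endpoints are inclusive
df_june = df.loc['2011-06']                  # the same idea on a DataFrame

print(len(whole_year), len(june), len(last_quarter), df_june.shape)
```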
In [66]: dft = pd.DataFrame(np.random.randn(100000, 1), columns=['A'], index=pd.date_range('20130101', periods=100000, freq='T'))
In [67]: dft
Out[67]:
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00  1.252450
2013-03-11 10:36:00  0.046611
2013-03-11 10:37:00  0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00  0.841669
[100000 rows x 1 columns]
In [68]: dft.loc['2013']
Out[68]:
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00  1.252450
2013-03-11 10:36:00  0.046611
2013-03-11 10:37:00  0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00  0.841669
[100000 rows x 1 columns]
In [69]: dft['2013-1':'2013-2']
Out[69]:
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-02-28 23:53:00  0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00  0.451943
2013-02-28 23:56:00  0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00  0.093915
2013-02-28 23:59:00 -1.087454
[84960 rows x 1 columns]
In [70]: dft['2013-1':'2013-2-28']
Out[70]:
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-02-28 23:53:00  0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00  0.451943
2013-02-28 23:56:00  0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00  0.093915
2013-02-28 23:59:00 -1.087454
[84960 rows x 1 columns]
In [71]: dft['2013-1':'2013-2-28 00:00:00']
Out[71]:
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-02-27 23:54:00  0.897051
2013-02-27 23:55:00 -0.309230
2013-02-27 23:56:00  1.944713
2013-02-27 23:57:00  0.369265
2013-02-27 23:58:00  0.053071
2013-02-27 23:59:00 -0.019734
2013-02-28 00:00:00  1.388189
[83521 rows x 1 columns]
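The three slices above differ only in how precise the right endpoint is, which is why the row counts differ. The sketch below reproduces the counts on a frame with the same minute index. It is an illustration under assumptions: the values are zeros rather than random numbers, 'min' is used as the minute alias instead of 'T', and .loc is used for the slices.

```python
import numpy as np
import pandas as pd

# One row per minute, with the same index as `dft` above.
idx = pd.date_range('2013-01-01', periods=100000, freq='min')
dft = pd.DataFrame({'A': np.zeros(len(idx))}, index=idx)

# A month-level endpoint covers the whole month, through 2013-02-28 23:59.
through_feb = dft.loc['2013-1':'2013-2']

# A day-level endpoint covers the whole day, so this selects the same rows.
through_feb_28 = dft.loc['2013-1':'2013-2-28']

# A full timestamp is an exact endpoint: the slice stops at midnight on Feb 28.
to_midnight = dft.loc['2013-1':'2013-2-28 00:00:00']

print(len(through_feb), len(through_feb_28), len(to_midnight))
```

The difference of 1439 rows is the rest of February 28 after 00:00.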
In [72]: dft['2013-1-15':'2013-1-15 12:30:00']
Out[72]:
                            A
2013-01-15 00:00:00  0.501288
2013-01-15 00:01:00 -0.605198
2013-01-15 00:02:00  0.215146
2013-01-15 00:03:00  0.924732
2013-01-15 00:04:00 -2.228519
2013-01-15 00:05:00  1.517331
2013-01-15 00:06:00 -1.188774
...                       ...
2013-01-15 12:24:00  1.358314
2013-01-15 12:25:00 -0.737727
2013-01-15 12:26:00  1.838323
2013-01-15 12:27:00 -0.774090
2013-01-15 12:28:00  0.622261
2013-01-15 12:29:00 -0.631649
2013-01-15 12:30:00  0.193284
[751 rows x 1 columns]
In [73]: dft.loc['2013-1-15 12:30:00']
Out[73]:
A    0.193284
Name: 2013-01-15 12:30:00, dtype: float64

5. Common time attributes
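Attribute         Description
year              Year
month             Month
day               Day
hour              Hour
minute            Minute
second            Second
microsecond       Microsecond
nanosecond        Nanosecond
date              Returns the date part
time              Returns the time part
dayofyear         Ordinal day of the year
weekofyear        Ordinal week of the year
week              Ordinal week of the year
dayofweek         Day of the week, Monday=0, Sunday=6
weekday           Day of the week, Monday=0, Sunday=6
weekday_name      Name of the day of the week, e.g. Friday (removed in pandas 1.0; day_name() is the replacement)
quarter           Quarter of the year
days_in_month     Number of days in the month
is_month_start    Whether it is the first day of the month
is_month_end      Whether it is the last day of the month
is_quarter_start  Whether it is the first day of the quarter
is_quarter_end    Whether it is the last day of the quarter
is_year_start     Whether it is the first day of the year
is_year_end       Whether it is the last day of the year

6. Adding or subtracting a span of time from a point in time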
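Offset              Description
BDay                Business day
CDay                Custom business day
Week                One week
WeekOfMonth         The x-th day of the y-th week of each month
LastWeekOfMonth     The x-th day of the last week of each month
MonthEnd            Calendar month end
MonthBegin          Calendar month begin
BMonthEnd           Business month end
BMonthBegin         Business month begin
CBMonthEnd          Custom business month end
CBMonthBegin        Custom business month begin
QuarterEnd          Calendar quarter end
QuarterBegin        Calendar quarter begin
BQuarterEnd         Business quarter end
BQuarterBegin       Business quarter begin
FY5253Quarter       Retail (aka 52-53 week) quarter
YearEnd             Calendar year end
YearBegin           Calendar year begin
BYearEnd            Business year end
BYearBegin          Business year begin
FY5253              Retail (aka 52-53 week) year
BusinessHour        Business hour
CustomBusinessHour  Custom business hour
Hour                One hour
Minute              One minute
Second              One second
In [84]: d = datetime(2008, 8, 18, 9, 0)
In [86]: from pandas.tseries.offsets import *
In [87]: d + DateOffset(months=4, days=5)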
Out[87]: Timestamp('2008-12-23 09:00:00')
In [88]: d - 5 * BDay()
Out[88]: Timestamp('2008-08-11 09:00:00')
Business month end:
In [89]: d + BMonthEnd()
Out[89]: Timestamp('2008-08-29 09:00:00')
In [90]: d
Out[90]: datetime.datetime(2008, 8, 18, 9, 0)
Roll forward to the next business month end:
In [91]: offset = BMonthEnd()
In [92]: offset.rollforward(d)
Out[92]: Timestamp('2008-08-29 09:00:00')
Roll back to the previous business month end:
In [93]: offset.rollback(d)
Out[93]: Timestamp('2008-07-31 09:00:00')
Time-related offsets:
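The session that follows calls offset.apply(...), which pandas 2.0 and later no longer provide (worth checking against your installed version). Adding the offset to a Timestamp is the spelling that works in both old and new releases, and is sketched here. The names stamp, next_day, next_hour and next_day_midnight are illustrative, and Timestamp.normalize() is swapped in here as one way to get a midnight result; it is not the normalize=True argument used in the session.

```python
import pandas as pd

stamp = pd.Timestamp('2014-01-01 22:00')

# Same clock time on the following day.
next_day = stamp + pd.offsets.Day()

# One hour later.
next_hour = stamp + pd.offsets.Hour()

# Timestamp.normalize() resets the time of day to midnight.
next_day_midnight = (stamp + pd.offsets.Day()).normalize()

print(next_day, next_hour, next_day_midnight)
```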
In [94]: day = Day()
In [95]: day.apply(pd.Timestamp('2014-01-01 09:00'))
Out[95]: Timestamp('2014-01-02 09:00:00')
In [96]: day = Day(normalize=True)
In [97]: day.apply(pd.Timestamp('2014-01-01 09:00'))
Out[97]: Timestamp('2014-01-02 00:00:00')
In [98]: hour = Hour()
In [99]: hour.apply(pd.Timestamp('2014-01-01 22:00'))
Out[99]: Timestamp('2014-01-01 23:00:00')
In [100]: hour = Hour(normalize=True)
In [101]: hour.apply(pd.Timestamp('2014-01-01 22:00'))
Out[101]: Timestamp('2014-01-01 00:00:00')
In [102]: hour.apply(pd.Timestamp('2014-01-01 23:00'))
Out[102]: Timestamp('2014-01-02 00:00:00')
Week-related offsets:
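A self-contained sketch of the week offsets in this part, with the weekday anchoring spelled out. The names plus_week, next_friday and minus_week are illustrative; the date is the same d used in the session.

```python
from datetime import datetime

import pandas as pd

d = datetime(2008, 8, 18, 9, 0)  # a Monday

# Exactly seven days later.
plus_week = d + pd.offsets.Week()

# With weekday=4 the offset is anchored on Friday: from a non-Friday date it
# moves forward to the next Friday rather than a full seven days.
next_friday = d + pd.offsets.Week(weekday=4)

# Exactly seven days earlier.
minus_week = d - pd.offsets.Week()

print(plus_week, next_friday, minus_week)
```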
In [103]: d
Out[103]: datetime.datetime(2008, 8, 18, 9, 0)
In [104]: d + Week()
Out[104]: Timestamp('2008-08-25 09:00:00')
In [105]: d + Week(weekday=4)
Out[105]: Timestamp('2008-08-22 09:00:00')
In [106]: (d + Week(weekday=4)).weekday()
Out[106]: 4
In [107]: d - Week()
Out[107]: Timestamp('2008-08-11 09:00:00')

7. Time handling for time series
In [213]: ts = ts[:5]
In [214]: ts.shift(1)
Out[214]:
2011-01-31         NaN
2011-02-28   -1.281247
2011-03-31   -0.727707
2011-04-29   -0.121306
2011-05-31   -0.097883
Freq: BM, dtype: float64
In [215]: ts.shift(5, freq=BDay())
Out[215]:
2011-02-07   -1.281247
2011-03-07   -0.727707
2011-04-07   -0.121306
2011-05-06   -0.097883
2011-06-07    0.695775
dtype: float64
In [216]: ts.shift(5, freq='BM')
Out[216]:
2011-06-30   -1.281247
2011-07-29   -0.727707
2011-08-31   -0.121306
2011-09-30   -0.097883
2011-10-31    0.695775
Freq: BM, dtype: float64
In [217]: ts.shift(5, freq='D')
Out[217]:
2011-02-05   -1.281247
2011-03-05   -0.727707
2011-04-05   -0.121306
2011-05-04   -0.097883
2011-06-05    0.695775
dtype: float64
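The outputs above show two different behaviours of shift: without freq the values move and the index stays, while with freq the index moves and the values stay. A minimal sketch, assuming placeholder values from np.arange instead of the random data in the session; the names moved_values and moved_index are illustrative.

```python
import numpy as np
import pandas as pd

# First five business month ends of 2011, as in the session.
idx = pd.date_range('2011-01-01', periods=5, freq=pd.offsets.BMonthEnd())
ts = pd.Series(np.arange(5.0), index=idx)

# Without freq: values move down one position, the index is unchanged,
# and the first slot becomes NaN.
moved_values = ts.shift(1)

# With freq: values stay attached to their rows and the index itself
# moves, here by five business days.
moved_index = ts.shift(5, freq=pd.offsets.BDay())

print(moved_values)
print(moved_index)
```

Because the second form only relabels the rows, no data is lost at the edges.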