博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Pandas中时间和日期处理
阅读量:6935 次
发布时间:2019-06-27

本文共 14010 字,大约阅读时间需要 46 分钟。

1、生成一个时间段

In [1]:import pandas as pd

In [2]:import numpy as np
1)生成一个时间区间段,间隔为小时
In [3]:rng = pd.date_range('1/1/2011', periods=72, freq='H')
2)生成一个Series,并制定索引为时间段
In [4]:ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [5]:ts
Out[5]:
2011-01-01 00:00:00 -0.204085
2011-01-01 01:00:00 1.101711
2011-01-01 02:00:00 1.840500
2011-01-01 03:00:00 0.112426
2011-01-01 04:00:00 -0.310413
2011-01-01 05:00:00 1.180762
2011-01-01 06:00:00 0.087775
2011-01-01 07:00:00 1.087877
2011-01-01 08:00:00 -0.950237
2011-01-01 09:00:00 -0.468453
Freq: H, dtype: float64

3)改变时间间隔

In [6]:converted = ts.asfreq('45Min', method='pad')
In [7]:converted
Out[7]:
2011-01-01 00:00:00 -0.204085
2011-01-01 00:45:00 -0.204085
2011-01-01 01:30:00 1.101711
2011-01-01 02:15:00 1.840500
2011-01-01 03:00:00 0.112426
2011-01-01 03:45:00 0.112426
2011-01-01 04:30:00 -0.310413
2011-01-01 05:15:00 1.180762
2011-01-01 06:00:00 0.087775
2011-01-01 06:45:00 0.087775
2011-01-01 07:30:00 1.087877
2011-01-01 08:15:00 -0.950237
2011-01-01 09:00:00 -0.468453
Freq: 45T, dtype: float64

2、转换为日期格式

2.1 数字生成日期格式

In [8]: pd.Timestamp(datetime(2012, 5, 1))

Out[8]: Timestamp('2012-05-01 00:00:00')

2.2 字符生成日期格式

In [9]: pd.Timestamp('2012-05-01')

Out[9]: Timestamp('2012-05-01 00:00:00')

2.3 只有年月

In [10]: pd.Period('2011-01')

Out[10]: Period('2011-01', 'M')

In [11]: pd.Period('2012-05', freq='D')

Out[11]: Period('2012-05-01', 'D')

2.4 转化为日期格式

In [22]: pd.to_datetime(pd.Series(['Jul 31, 2009', '2010-01-10', None]))

Out[22]:
0 2009-07-31
1 2010-01-10
2 NaT
dtype: datetime64[ns]

In [23]: pd.to_datetime(['2005/11/23', '2010.12.31'])

Out[23]: DatetimeIndex(['2005-11-23', '2010-12-31'], dtype='datetime64[ns]', freq=None)

3、生成一个时间段

3.1 生成索引的方法

In [35]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]

Note the frequency information

In [36]: index = pd.DatetimeIndex(dates)

In [37]: index

Out[37]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)

Automatically converted to DatetimeIndex

In [38]: index = pd.Index(dates)

In [39]: index

Out[39]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)

date_range日历,bdate_range工作日

In [40]: index = pd.date_range('2000-1-1', periods=1000, freq='M')

In [41]: index

Out[41]:
DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30',

'2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',           '2000-09-30', '2000-10-31',           ...           '2082-07-31', '2082-08-31', '2082-09-30', '2082-10-31',           '2082-11-30', '2082-12-31', '2083-01-31', '2083-02-28',           '2083-03-31', '2083-04-30'],          dtype='datetime64[ns]', length=1000, freq='M')

In [42]: index = pd.bdate_range('2012-1-1', periods=250)

In [43]: index

Out[43]:
DatetimeIndex(['2012-01-02', '2012-01-03', '2012-01-04', '2012-01-05',

'2012-01-06', '2012-01-09', '2012-01-10', '2012-01-11',           '2012-01-12', '2012-01-13',           ...           '2012-12-03', '2012-12-04', '2012-12-05', '2012-12-06',           '2012-12-07', '2012-12-10', '2012-12-11', '2012-12-12',           '2012-12-13', '2012-12-14'],          dtype='datetime64[ns]', length=250, freq='B')

In [44]: start = datetime(2011, 1, 1)

In [45]: end = datetime(2012, 1, 1)

In [46]: rng = pd.date_range(start, end)

In [47]: rng

Out[47]:
DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',

'2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',           '2011-01-09', '2011-01-10',           ...           '2011-12-23', '2011-12-24', '2011-12-25', '2011-12-26',           '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30',           '2011-12-31', '2012-01-01'],          dtype='datetime64[ns]', length=366, freq='D')

In [48]: rng = pd.bdate_range(start, end)

In [49]: rng

Out[49]:
DatetimeIndex(['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-06',

'2011-01-07', '2011-01-10', '2011-01-11', '2011-01-12',           '2011-01-13', '2011-01-14',           ...           '2011-12-19', '2011-12-20', '2011-12-21', '2011-12-22',           '2011-12-23', '2011-12-26', '2011-12-27', '2011-12-28',           '2011-12-29', '2011-12-30'],          dtype='datetime64[ns]', length=260, freq='B')

3.2 每个月末,每隔一周

In [50]: pd.date_range(start, end, freq='BM')

Out[50]:
DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29',

'2011-05-31', '2011-06-30', '2011-07-29', '2011-08-31',           '2011-09-30', '2011-10-31', '2011-11-30', '2011-12-30'],          dtype='datetime64[ns]', freq='BM')

In [51]: pd.date_range(start, end, freq='W')

Out[51]:
DatetimeIndex(['2011-01-02', '2011-01-09', '2011-01-16', '2011-01-23',

'2011-01-30', '2011-02-06', '2011-02-13', '2011-02-20',           '2011-02-27', '2011-03-06', '2011-03-13', '2011-03-20',           '2011-03-27', '2011-04-03', '2011-04-10', '2011-04-17',           '2011-04-24', '2011-05-01', '2011-05-08', '2011-05-15',           '2011-05-22', '2011-05-29', '2011-06-05', '2011-06-12',           '2011-06-19', '2011-06-26', '2011-07-03', '2011-07-10',           '2011-07-17', '2011-07-24', '2011-07-31', '2011-08-07',           '2011-08-14', '2011-08-21', '2011-08-28', '2011-09-04',           '2011-09-11', '2011-09-18', '2011-09-25', '2011-10-02',           '2011-10-09', '2011-10-16', '2011-10-23', '2011-10-30',           '2011-11-06', '2011-11-13', '2011-11-20', '2011-11-27',           '2011-12-04', '2011-12-11', '2011-12-18', '2011-12-25',           '2012-01-01'],          dtype='datetime64[ns]', freq='W-SUN')

3.3 从End往前数20个工作日,从start往后数20个工作日

In [52]: pd.bdate_range(end=end, periods=20)

Out[52]:
DatetimeIndex(['2011-12-05', '2011-12-06', '2011-12-07', '2011-12-08',

'2011-12-09', '2011-12-12', '2011-12-13', '2011-12-14',           '2011-12-15', '2011-12-16', '2011-12-19', '2011-12-20',           '2011-12-21', '2011-12-22', '2011-12-23', '2011-12-26',           '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30'],          dtype='datetime64[ns]', freq='B')

In [53]: pd.bdate_range(start=start, periods=20)

Out[53]:
DatetimeIndex(['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-06',

'2011-01-07', '2011-01-10', '2011-01-11', '2011-01-12',           '2011-01-13', '2011-01-14', '2011-01-17', '2011-01-18',           '2011-01-19', '2011-01-20', '2011-01-21', '2011-01-24',           '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28'],          dtype='datetime64[ns]', freq='B')

4、根据部分索引选择,切片

In [56]: rng = pd.date_range(start, end, freq='BM')

In [57]: ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [58]: ts.index

Out[58]:
DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29',

'2011-05-31', '2011-06-30', '2011-07-29', '2011-08-31',           '2011-09-30', '2011-10-31', '2011-11-30', '2011-12-30'],          dtype='datetime64[ns]', freq='BM')

In [59]: ts[:5].index

Out[59]:
DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29',

'2011-05-31'],          dtype='datetime64[ns]', freq='BM')

In [60]: ts[::2].index

Out[60]:
DatetimeIndex(['2011-01-31', '2011-03-31', '2011-05-31', '2011-07-29',

'2011-09-30', '2011-11-30'],          dtype='datetime64[ns]', freq='2BM')

In [61]: ts['1/31/2011']

Out[61]: -1.2812473076599531

In [62]: ts[pd.datetime(2011, 12, 25):]

Out[62]:
2011-12-30 0.687738
Freq: BM, dtype: float64

In [63]: ts['10/31/2011':'12/31/2011']

Out[63]:
2011-10-31 0.149748
2011-11-30 -0.732339
2011-12-30 0.687738
Freq: BM, dtype: float64

In [64]: ts['2011']

Out[64]:
2011-01-31 -1.281247
2011-02-28 -0.727707
2011-03-31 -0.121306
2011-04-29 -0.097883
2011-05-31 0.695775
2011-06-30 0.341734
2011-07-29 0.959726
2011-08-31 -1.110336
2011-09-30 -0.619976
2011-10-31 0.149748
2011-11-30 -0.732339
2011-12-30 0.687738
Freq: BM, dtype: float64

In [65]: ts['2011-6']

Out[65]:
2011-06-30 0.341734
Freq: BM, dtype: float64

DataFrame中指定了时间索引,可以根据时间索引提取子集

In [66]: dft = pd.DataFrame(np.random.randn(100000,1),columns=['A'],index=pd.date_range('20130101',periods=100000,freq='T'))

In [67]: dft

Out[67]:

A

2013-01-01 00:00:00 0.176444

2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00 1.252450
2013-03-11 10:36:00 0.046611
2013-03-11 10:37:00 0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00 0.841669

[100000 rows x 1 columns]

In [68]: dft['2013']

Out[68]:

A

2013-01-01 00:00:00 0.176444

2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00 1.252450
2013-03-11 10:36:00 0.046611
2013-03-11 10:37:00 0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00 0.841669

[100000 rows x 1 columns]

In [69]: dft['2013-1':'2013-2']

Out[69]:

A

2013-01-01 00:00:00 0.176444

2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-28 23:53:00 0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00 0.451943
2013-02-28 23:56:00 0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00 0.093915
2013-02-28 23:59:00 -1.087454

[84960 rows x 1 columns]

In [70]: dft['2013-1':'2013-2-28']

Out[70]:

A

2013-01-01 00:00:00 0.176444

2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-28 23:53:00 0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00 0.451943
2013-02-28 23:56:00 0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00 0.093915
2013-02-28 23:59:00 -1.087454

[84960 rows x 1 columns]

In [71]: dft['2013-1':'2013-2-28 00:00:00']

Out[71]:

A

2013-01-01 00:00:00 0.176444

2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-27 23:54:00 0.897051
2013-02-27 23:55:00 -0.309230
2013-02-27 23:56:00 1.944713
2013-02-27 23:57:00 0.369265
2013-02-27 23:58:00 0.053071
2013-02-27 23:59:00 -0.019734
2013-02-28 00:00:00 1.388189

[83521 rows x 1 columns]

In [72]: dft['2013-1-15':'2013-1-15 12:30:00']

Out[72]:

A

2013-01-15 00:00:00 0.501288

2013-01-15 00:01:00 -0.605198
2013-01-15 00:02:00 0.215146
2013-01-15 00:03:00 0.924732
2013-01-15 00:04:00 -2.228519
2013-01-15 00:05:00 1.517331
2013-01-15 00:06:00 -1.188774
... ...
2013-01-15 12:24:00 1.358314
2013-01-15 12:25:00 -0.737727
2013-01-15 12:26:00 1.838323
2013-01-15 12:27:00 -0.774090
2013-01-15 12:28:00 0.622261
2013-01-15 12:29:00 -0.631649
2013-01-15 12:30:00 0.193284

[751 rows x 1 columns]

In [73]: dft.loc['2013-1-15 12:30:00']

Out[73]:
A 0.193284
Name: 2013-01-15 12:30:00, dtype: float64

5、常用时间

类别 解释

year 年
month 月
day 日
hour 时
minute 分钟
second 秒
microsecond 微秒
nanosecond 纳秒
date 返回日期
time 返回时间
dayofyear 年序日
weekofyear 年序周
week 周
dayofweek 周中的第几天,Monday=0, Sunday=6
weekday 周中的第几天,Monday=0, Sunday=6
weekday_name 周中的星期几,ex: Friday
quarter 季度
days_in_month 一个月中有多少天
is_month_start 是否月初第一天
is_month_end 是否月末最后一天
is_quarter_start 是否季度的最开始
is_quarter_end 是否季度的最后一个
is_year_start 是否年初第一天
is_year_end 是否年末第一天

6、某一时间点,往前往后加一段时间

类别 解释

BDay 工作日
CDay 自定义日期
Week 周
WeekOfMonth 月中的第几周
LastWeekOfMonth 月中的最后一周
MonthEnd 日历上月末
MonthBegin 日历上月初
BMonthEnd 工作月初
BMonthBegin 月开始营业
CBMonthEnd 自定义月末
CBMonthBegin 自定义月初
QuarterEnd 日历季末
QuarterBegin 日历季初
BQuarterEnd 工作季末
BQuarterBegin 工作季初
FY5253Quarter retail (aka 52-53 week) quarter
YearEnd 日历年末
YearBegin 日历年初
BYearEnd 工作年末
BYearBegin 工作年初
FY5253 retail (aka 52-53 week) year
BusinessHour 工作小时
CustomBusinessHour 自定义小时
Hour 小时
Minute 分钟
Second 秒
In [84]: d = pd.datetime(2008, 8, 18, 9, 0)
In [86]: from pandas.tseries.offsets import *

In [87]: d + DateOffset(months=4, days=5)

Out[87]: Timestamp('2008-12-23 09:00:00')

In [88]: d - 5 * BDay()

Out[88]: Timestamp('2008-08-11 09:00:00')

月末

In [89]: d + BMonthEnd()
Out[89]: Timestamp('2008-08-29 09:00:00')

In [90]: d

Out[90]: datetime.datetime(2008, 8, 18, 9, 0)

往前数月末

In [91]: offset = BMonthEnd()

In [92]: offset.rollforward(d)

Out[92]: Timestamp('2008-08-29 09:00:00')

往后数月末

In [93]: offset.rollback(d)
Out[93]: Timestamp('2008-07-31 09:00:00')

时间方面的

In [94]: day = Day()

In [95]: day.apply(pd.Timestamp('2014-01-01 09:00'))

Out[95]: Timestamp('2014-01-02 09:00:00')

In [96]: day = Day(normalize=True)

In [97]: day.apply(pd.Timestamp('2014-01-01 09:00'))

Out[97]: Timestamp('2014-01-02 00:00:00')

In [98]: hour = Hour()

In [99]: hour.apply(pd.Timestamp('2014-01-01 22:00'))

Out[99]: Timestamp('2014-01-01 23:00:00')

In [100]: hour = Hour(normalize=True)

In [101]: hour.apply(pd.Timestamp('2014-01-01 22:00'))

Out[101]: Timestamp('2014-01-01 00:00:00')

In [102]: hour.apply(pd.Timestamp('2014-01-01 23:00'))

Out[102]: Timestamp('2014-01-02 00:00:00')

周相关的

In [103]: d
Out[103]: datetime.datetime(2008, 8, 18, 9, 0)

In [104]: d + Week()

Out[104]: Timestamp('2008-08-25 09:00:00')

In [105]: d + Week(weekday=4)

Out[105]: Timestamp('2008-08-22 09:00:00')

In [106]: (d + Week(weekday=4)).weekday()

Out[106]: 4

In [107]: d - Week()

Out[107]: Timestamp('2008-08-11 09:00:00')

7、时间序列相关的时间处理

In [213]: ts = ts[:5]

In [214]: ts.shift(1)

Out[214]:
2011-01-31 NaN
2011-02-28 -1.281247
2011-03-31 -0.727707
2011-04-29 -0.121306
2011-05-31 -0.097883
Freq: BM, dtype: float64

In [215]: ts.shift(5, freq=datetools.bday)

Out[215]:
2011-02-07 -1.281247
2011-03-07 -0.727707
2011-04-07 -0.121306
2011-05-06 -0.097883
2011-06-07 0.695775
dtype: float64

In [216]: ts.shift(5, freq='BM')

Out[216]:
2011-06-30 -1.281247
2011-07-29 -0.727707
2011-08-31 -0.121306
2011-09-30 -0.097883
2011-10-31 0.695775
Freq: BM, dtype: float64

In [217]: ts.tshift(5, freq='D')

Out[217]:
2011-02-05 -1.281247
2011-03-05 -0.727707
2011-04-05 -0.121306
2011-05-04 -0.097883
2011-06-05 0.695775
dtype: float64

转载地址:http://gswnl.baihongyu.com/

你可能感兴趣的文章
linux 手动释放buff/cache
查看>>
java操作elasticsearch实现query String
查看>>
iOS开发 - OC - block的详解 - 深入篇
查看>>
保利入驻宜昌 外来大咖谁更靠谱
查看>>
11.15日工作总结(补)
查看>>
(转)Spring读书笔记-----Spring的Bean之Bean的基本概念
查看>>
Python_面向对象_类1
查看>>
设计模式-简单工厂模式
查看>>
公开SNS社区即时找朋友链的源代码和部署方案(续四)
查看>>
[TJOI2019]甲苯先生的滚榜——非旋转treap
查看>>
String、Brush、Color 相互转换
查看>>
项目管理初探
查看>>
如何为ListView中的ImageView添加动画,谢谢!
查看>>
mvc
查看>>
ScribeFireBlog 发的一篇在Cnblogs的日志
查看>>
编译器中代码自动对齐快捷键
查看>>
选项卡jQuery(ele).each()
查看>>
栈原理演示
查看>>
使用MASM04 - Win32汇编语言012
查看>>
B/S,C/S架构
查看>>