data.table, a power tool for the R language
Package ‘data.table’
February 1, 2017
Version 1.10.4
Title Extension of `data.frame`
Depends R (>= 3.0.0)
Imports methods
Suggests bit64, knitr, nanotime, chron, ggplot2 (>= 0.9.0), plyr,
reshape, reshape2, testthat (>= 0.4), hexbin, fastmatch, nlme,
xts, gdata, GenomicRanges, caret, curl, zoo, plm, rmarkdown,
parallel
Description Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins,
fast add/modify/delete of columns by group using no copies at all, list columns,
a fast friendly file reader and parallel file writer. Offers a natural and
flexible syntax, for faster development.
License GPL-3 | file LICENSE
URL http://r-datatable.com
BugReports https://github.com/Rdatatable/data.table/issues
MailingList datatable-help@lists.r-forge.r-project.org
VignetteBuilder knitr
ByteCompile TRUE
NeedsCompilation yes
Author Matt Dowle [aut, cre],
Arun Srinivasan [aut],
Jan Gorecki [ctb],
Tom Short [ctb],
Steve Lianoglou [ctb],
Eduard Antonyan [ctb]
Maintainer Matt Dowle <mattjdowle@gmail.com>
Repository CRAN
Date/Publication 2017-02-01 15:52:19
R topics documented:
data.table-package
:=
address
all.equal
as.data.table
as.data.table.xts
as.xts.data.table
between
chmatch
copy
data.table-class
datatable.optimize
dcast.data.table
duplicated
first
foverlaps
frank
fread
fsort
fwrite
IDateTime
J
last
like
melt.data.table
merge
na.omit.data.table
patterns
print.data.table
rbindlist
rleid
rowid
setattr
setcolorder
setDF
setDT
setDTthreads
setkey
setNumericRounding
setops
setorder
shift
shouldPrint
special-symbols
split
subset.data.table
tables
test.data.table
timetaken
transform.data.table
transpose
truelength
tstrsplit
Index
data.table-package Enhanced data.frame
Description
data.table inherits from data.frame. It offers fast and memory-efficient file reading and writing,
aggregations, updates, and equi, non-equi, rolling, range and interval joins, in a short and flexible
syntax, for faster development.
It is inspired by A[B] syntax in R where A is a matrix and B is a 2-column matrix. Since a
data.table is a data.frame, it is compatible with R functions and packages that accept only
data.frames.
Type vignette(package="data.table") to get started. The Introduction to data.table vignette
introduces data.table's x[i, j, by] syntax and is a good place to start. If you have read the
vignettes and the help page below, please feel free to ask questions under the data.table tag on
Stack Overflow or on the datatable-help mailing list. To report a bug, please type:
bug.report(package = "data.table").
Please check the homepage for up-to-the-minute live NEWS.
Tip: one of the quickest ways to learn the features is to type example(data.table) and study the
output at the prompt.
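As a first orientation, the x[i, j, by] form described above can be sketched as follows. The table DT and its columns id and v are illustrative names, not taken from the package examples.

```r
library(data.table)

# Illustrative data: names DT, id and v are assumptions for this sketch
DT <- data.table(id = c("a", "b", "a", "c", "b"), v = 1:5)

# i: subset rows; the expression sees column names as variables
big <- DT[v > 2]

# j: compute on columns; a plain expression returns a vector
total <- DT[, sum(v)]

# by: evaluate j within each group; groups appear in order of first appearance
by_id <- DT[, .(total = sum(v)), by = id]
```

Here big keeps the three rows with v greater than 2, total is 15, and by_id has one row per id.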
Usage
data.table(..., keep.rownames=FALSE, check.names=FALSE, key=NULL, stringsAsFactors=FALSE)
## S3 method for class 'data.table'
x[i, j, by, keyby, with = TRUE,
nomatch = getOption("datatable.nomatch"), # default: NA_integer_
mult = "all",
roll = FALSE,
rollends = if (roll=="nearest") c(TRUE,TRUE)
else if (roll>=0) c(FALSE,TRUE)
else c(TRUE,FALSE),
which = FALSE,
.SDcols,
verbose = getOption("datatable.verbose"), # default: FALSE
allow.cartesian = getOption("datatable.allow.cartesian"), # default: FALSE
drop = NULL, on = NULL]
Arguments
... Just as ... in data.frame. Usual recycling rules are applied to vectors of
different lengths to create a list of equal length vectors.
keep.rownames If ... is a matrix or data.frame, TRUE will retain the rownames of that object
in a column named rn.
check.names Just as check.names in data.frame.
key Character vector of one or more column names which is passed to setkey. It
may be a single comma separated string such as key="x,y,z", or a vector of
names such as key=c("x","y","z").
stringsAsFactors
Logical (default is FALSE). Convert all character columns to factors?
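The constructor arguments above might be exercised like this. It is a minimal sketch; the object names (DT1 to DT4, m) and the data are made up for illustration.

```r
library(data.table)

# Recycling: 'grp' has length 2 and is repeated to match the length-4 'x'
DT1 <- data.table(x = 1:4, grp = c("a", "b"))

# key= given as a single comma-separated string or as a character vector;
# both forms sort the rows by the key columns
DT2 <- data.table(x = c(2L, 1L), y = c("b", "a"), key = "x,y")
DT3 <- data.table(x = c(2L, 1L), y = c("b", "a"), key = c("x", "y"))

# keep.rownames = TRUE keeps the row names of a matrix in a column named rn
m <- matrix(1:4, nrow = 2, dimnames = list(c("r1", "r2"), c("A", "B")))
DT4 <- data.table(m, keep.rownames = TRUE)
```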
x A data.table.
i Integer, logical or character vector, single column numeric matrix, expression
of column names, list, data.frame or data.table.
integer and logical vectors work the same way they do in [.data.frame
except logical NAs are treated as FALSE.
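The difference in NA handling can be seen in a small comparison. The objects DF and DT below are illustrative.

```r
library(data.table)

DF <- data.frame(v = c(1, NA, 3))
DT <- as.data.table(DF)

# data.frame: the NA in the logical index produces an all-NA row
n_df <- nrow(DF[DF$v > 0, , drop = FALSE])

# data.table: the NA is treated as FALSE, so that row is dropped
n_dt <- nrow(DT[v > 0])
```

With this data n_df is 3 and n_dt is 2.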
expression is evaluated within the frame of the data.table (i.e. it sees column
names as if they are variables) and can evaluate to any of the other types.
character, list and data.frame input to i is converted into a data.table
internally using as.data.table.
If i is a data.table, the columns in i to be matched against x can be specified
using one of these ways:
• The on argument (see below). It allows for both equi- and the newly
implemented non-equi joins.
• If not, x must be keyed. The key can be set using setkey. If i is also keyed,
then the first key column of i is matched against the first key column of x,
the second against the second, etc.
If i is not keyed, then the first column of i is matched against the first key
column of x, the second column of i against the second key column of x, etc.
This is summarised in code as min(length(key(x)), if (haskey(i)) length(key(i)) else ncol(i)).
Using on= is recommended (even during keyed joins) as it makes the code easier
to understand and also allows for non-equi joins.
When the binary operator == alone is used, an equi join is performed. In SQL
terms, x[i] then performs a right join by default. i prefixed with ! signals a
not-join or not-select.
Support for non-equi join was recently implemented, which allows for other
binary operators >=, >, <= and <.
See Keys and fast binary search based subset and Secondary indices and auto
indexing.
Advanced: When i is a single variable name, it is not considered an expression
of column names and is instead evaluated in calling scope.
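The join behaviours described for i can be sketched with on= as below. The tables X, Y and Z and their columns are hypothetical, chosen only to make each case visible.

```r
library(data.table)

X <- data.table(id = c("a", "b", "c"), val = 1:3)
Y <- data.table(id = c("b", "c", "d"), val = c(20L, 30L, 40L))

# Equi join: one result row per row of Y; the unmatched "d" gets NA for X's val
right_join <- X[Y, on = "id"]

# nomatch = 0L drops the unmatched rows, giving an inner join
inner_join <- X[Y, on = "id", nomatch = 0L]

# Not-join: rows of X that have no match in Y
anti_join <- X[!Y, on = "id"]

# Non-equi join: rows of X whose val is >= the threshold in Z
Z <- data.table(threshold = 2L)
non_equi <- X[Z, on = .(val >= threshold)]
```

Spelling out on= in each call keeps the matched columns visible at the call site, which is the readability benefit the text recommends it for.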
j When with=TRUE (default), j is evaluated within the frame of the data.table;
i.e., it sees column names as if they are variables. This allows you not just to
select columns in j, but also to compute on them, e.g., x[, a] and x[, sum(a)] return
x$a and sum(x$a) as a vector respectively. x[, .(a, b)] and x[, .(sa=sum(a), sb=sum(b))]
each return a two-column data.table, the first simply selecting columns a, b
and the second computing their sums.
The expression .() is a shorthand alias to list(); they both mean the same.
As long as j returns a list, each element of the list becomes a column in the
resulting data.table. This is the default enhanced mode.
When with=FALSE, j can only be a vector of column names or positions to
select (as in data.frame).
Advanced: j also allows the use of special read-only symbols: .SD, .N, .I,
.GRP, .BY.
Advanced: When i is a data.table, the columns of i can be referred to in j by
using the prefix i., e.g., X[Y, .(val, i.val)]. Here val refers to X’s column
and i.val Y’s.
Advanced: Columns of x can now be referred to using the prefix x., which is
particularly useful during joins to refer to x's join columns, as they are otherwise
masked by i's. For example, X[Y, .(x.a-i.a, b), on="a"].
See Introduction to data.table vignette and examples.
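The forms of j described above can be tried side by side. This is a sketch with invented tables X and Y; the result column name d is also just for illustration.

```r
library(data.table)

X <- data.table(a = 1:3, b = 4:6)

vec <- X[, a]                               # vector, same as X$a
s   <- X[, sum(a)]                          # single value, same as sum(X$a)
sel <- X[, .(a, b)]                         # two-column data.table
agg <- X[, list(sa = sum(a), sb = sum(b))]  # list() and .() are interchangeable

# with = FALSE: j is a vector of column names (or positions) to select
cols <- "b"
sub <- X[, cols, with = FALSE]

# Prefixes in a join: x.a is X's join column, i.a is Y's
Y <- data.table(a = 2:3, val = c(10L, 20L))
diffs <- X[Y, .(d = x.a - i.a, b), on = "a"]
```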
by Column names are seen as if they are variables (as in j when with=TRUE). The
data.table is then grouped by the by and j is evaluated within each group. The
order of the rows within each group is preserved, as is the order of the groups.
by accepts:
• A single unquoted column name: e.g., DT[, .(sa=sum(a)), by=x]
• a list() of expressions of column names: e.g., DT[, .(sa=sum(a)), by=.(x=x>0, y)]
• a single character string containing comma separated column names (where
spaces are significant since column names may contain spaces even at the
start or end): e.g., DT[, sum(a), by="x,y,z"]
• a character vector of column names: e.g., DT[, sum(a), by=c("x", "y")]
• or of the form startcol:endcol: e.g., DT[, sum(a), by=x:z]
Advanced: When i is a list (or data.frame or data.table), DT[i, j, by=.EACHI]
evaluates j for the groups in DT that each row in i joins to. That is, you can
join (in i) and aggregate (in j) simultaneously. We call this grouping by each i.
See this StackOverflow answer for a more detailed explanation until we roll out
vignettes.
Advanced: In the X[Y, j] form of grouping, the j expression sees variables in
X first, then Y. We call this join inherited scope. If the variable is not in X or Y
then the calling frame is searched, its calling frame, and so on in the usual way
up to and including the global environment.
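Each accepted form of by, plus by=.EACHI, is shown once in the sketch below. The table DT, its columns and the lookup table i_tab are assumptions made for this example.

```r
library(data.table)

DT <- data.table(x = c(1L, -1L, 1L, 2L), y = c("p", "p", "q", "q"),
                 z = c("u", "u", "u", "w"), a = 1:4)

g1 <- DT[, .(sa = sum(a)), by = y]                  # single unquoted column name
g2 <- DT[, .(sa = sum(a)), by = .(pos = x > 0, y)]  # list of expressions
g3 <- DT[, sum(a), by = "x,y,z"]                    # comma-separated string
g4 <- DT[, sum(a), by = c("x", "y")]                # character vector
g5 <- DT[, sum(a), by = x:z]                        # startcol:endcol

# by = .EACHI: j is evaluated once per row of i, over the rows of DT it joins to
i_tab <- data.table(y = c("q", "p"))
each <- DT[i_tab, .(sa = sum(a)), by = .EACHI, on = "y"]
```

Note that each follows the row order of i_tab ("q" then "p"), whereas g1 follows the order in which groups first appear in DT.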
keyby Same as by, but with an additional setkey() run on the by columns of the result,
for convenience. It is common practice to use keyby= routinely when you wish
the result to be sorted.
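A small comparison of by and keyby, using a made-up table DT with columns g and v:

```r
library(data.table)

DT <- data.table(g = c("b", "a", "b", "a"), v = 1:4)

# by: groups come out in order of first appearance ("b", then "a")
unsorted <- DT[, .(total = sum(v)), by = g]

# keyby: same aggregation, but the result is sorted by g and keyed on it
sorted <- DT[, .(total = sum(v)), keyby = g]
```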
with By default with=TRUE and j is evaluated within the frame of x; column names
can be used as variables.
When with=FALSE j is a character vector of column names, a numeric vector
of column positions to select or of the form startcol:endcol, and the value
returned is always a data.table. with=FALSE is often useful in data.table to