////
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
////
[[connectors]]
Notes for specific connectors
-----------------------------
MySQL JDBC Connector
~~~~~~~~~~~~~~~~~~~~
This section contains information specific to the MySQL JDBC Connector.
Upsert functionality
^^^^^^^^^^^^^^^^^^^^
The MySQL JDBC Connector supports upsert functionality via the argument
+\--update-mode allowinsert+. To achieve this, Sqoop uses the MySQL clause INSERT INTO
... ON DUPLICATE KEY UPDATE. This clause does not allow the user to specify which columns
should be used to decide whether to update an existing row or add a new one. Instead,
the clause relies on the table's unique keys (the primary key belongs to this set). MySQL
will try to insert the new row, and if the insertion fails with a duplicate unique key error,
it will update the appropriate row instead. As a result, Sqoop ignores the values specified
in the +\--update-key+ parameter; however, the user still needs to specify at least one valid
column to turn on update mode itself.
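For example, a sketch of an upsert-style export (the connection string, table, export directory, and the +id+ update column are illustrative):
----
$ sqoop export --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --export-dir /results/emp_data --update-key id --update-mode allowinsert
----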
MySQL Direct Connector
~~~~~~~~~~~~~~~~~~~~~~
The MySQL Direct Connector allows faster import and export to/from MySQL using the +mysqldump+ and +mysqlimport+ tools
instead of SQL selects and inserts.
To use the MySQL Direct Connector, specify the +\--direct+ argument for your import or export job.
Example:
----
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--direct
----
Passing additional parameters to mysqldump:
----
$ sqoop import --connect jdbc:mysql://server.foo.com/db --table bar \
--direct -- --default-character-set=latin1
----
Requirements
^^^^^^^^^^^^
The +mysqldump+ and +mysqlimport+ utilities should be present in the shell path of the user running the Sqoop command on
all nodes. To validate, SSH to all nodes as this user and execute these commands. If you get an error, so will Sqoop.
Limitations
^^^^^^^^^^^
* Currently the direct connector does not support import of large object columns (BLOB and CLOB).
* Importing to HBase and Accumulo is not supported
* Use of a staging table when exporting data is not supported
* Import of views is not supported
Direct-mode Transactions
^^^^^^^^^^^^^^^^^^^^^^^^
For performance, each writer will commit the current transaction
approximately every 32 MB of exported data. You can control this
by specifying the following argument _before_ any tool-specific arguments: +-D
sqoop.mysql.export.checkpoint.bytes=size+, where _size_ is a value in
bytes. Set _size_ to 0 to disable intermediate checkpoints,
but individual files being exported will continue to be committed
independently of one another.
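For instance, a sketch that raises the checkpoint size to 64 MB (the connection string, table, and export directory are illustrative):
----
$ sqoop export -D sqoop.mysql.export.checkpoint.bytes=67108864 \
    --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --export-dir /results/emp_data --direct
----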
Sometimes you need to export large data with Sqoop to a live MySQL cluster that
is under a high load serving random queries from the users of your application.
While data consistency issues during the export can be easily solved with a
staging table, there is still a problem with the performance impact caused by
the heavy export.
First off, the resources of MySQL dedicated to the import process can affect
the performance of the live product, both on the master and on the slaves.
Second, even if the servers can handle the import with no significant
performance impact (mysqlimport should be relatively "cheap"), importing big
tables can cause serious replication lag in the cluster risking data
inconsistency.
With +-D sqoop.mysql.export.sleep.ms=time+, where _time_ is a value in
milliseconds, you can let the server relax between checkpoints and the replicas
catch up by pausing the export process after transferring the number of bytes
specified in +sqoop.mysql.export.checkpoint.bytes+. Experiment with different
settings of these two parameters to achieve an export pace that doesn't
endanger the stability of your MySQL cluster.
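For instance, a sketch that pauses the export for one second after each 32 MB checkpoint (the connection string and paths are illustrative):
----
$ sqoop export -D sqoop.mysql.export.checkpoint.bytes=33554432 \
    -D sqoop.mysql.export.sleep.ms=1000 \
    --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --export-dir /results/emp_data --direct
----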
IMPORTANT: Note that any arguments to Sqoop that are of the form +-D
parameter=value+ are Hadoop _generic arguments_ and must appear before
any tool-specific arguments (for example, +\--connect+, +\--table+, etc).
Don't forget that these parameters are only supported with the +\--direct+
flag set.
Microsoft SQL Connector
~~~~~~~~~~~~~~~~~~~~~~~
Extra arguments
^^^^^^^^^^^^^^^
A list of all extra arguments supported by the Microsoft SQL Connector is shown below:
.Supported Microsoft SQL Connector extra arguments:
[grid="all"]
`----------------------------------------`---------------------------------------
Argument Description
---------------------------------------------------------------------------------
+\--identity-insert+                    Set IDENTITY_INSERT to ON before \
                                        export insert.
+\--non-resilient+                      Don't attempt to recover failed \
                                        export operations.
+\--schema <name>+                      Schema name that Sqoop should use. \
                                        Default is "dbo".
+\--table-hints <hints>+                Table hints that Sqoop should use for \
                                        data movement.
---------------------------------------------------------------------------------
Allow identity inserts
^^^^^^^^^^^^^^^^^^^^^^
You can allow inserts on identity columns. For example:
----
$ sqoop export ... --export-dir custom_dir --table custom_table -- --identity-insert
----
Non-resilient operations
^^^^^^^^^^^^^^^^^^^^^^^^
You can override the default and not use resilient operations during export.
This will avoid retrying failed operations. For example:
----
$ sqoop export ... --export-dir custom_dir --table custom_table -- --non-resilient
----
Schema support
^^^^^^^^^^^^^^
If you need to work with tables that are located in non-default schemas, you can
specify schema names via the +\--schema+ argument. Custom schemas are supported for
both import and export jobs. For example:
----
$ sqoop import ... --table custom_table -- --schema custom_schema
----
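A sketch of the corresponding export job (the table, directory, and schema names are illustrative):
----
$ sqoop export ... --export-dir custom_dir --table custom_table -- --schema custom_schema
----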
Table hints
^^^^^^^^^^^
Sqoop supports table hints in both import and export jobs. Table hints are used only
for queries that move data from/to Microsoft SQL Server, but they cannot be used for
metadata queries. You can specify a comma-separated list of table hints in the
+\--table-hints+ argument. For example:
----
$ sqoop import ... --table custom_table -- --table-hints NOLOCK
----
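To pass more than one hint, use a comma-separated list; a sketch (the hints shown are illustrative):
----
$ sqoop import ... --table custom_table -- --table-hints "NOLOCK,ROWLOCK"
----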
PostgreSQL Connector
~~~~~~~~~~~~~~~~~~~~
Extra arguments
^^^^^^^^^^^^^^^
A list of all extra arguments supported by the PostgreSQL Connector is shown below:
.Supported PostgreSQL extra arguments:
[grid="all"]
`----------------------------------------`---------------------------------------
Argument Description
---------------------------------------------------------------------------------
+\--schema <name>+                      Schema name that Sqoop should use. \
                                        Default is "public".
---------------------------------------------------------------------------------
Schema support
^^^^^^^^^^^^^^
If you need to work with a table that is located in a schema other than the default one,
you need to specify the extra argument +\--schema+. Custom schemas are supported for
both import and export jobs (the optional staging table, however, must be present in the
same schema as the target table).
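A minimal sketch (the table and schema names are illustrative):
----
$ sqoop import ... --table custom_table -- --schema custom_schema
----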