////
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
////
[[connectors]]
Notes for specific connectors
-----------------------------
MySQL JDBC Connector
~~~~~~~~~~~~~~~~~~~~
This section contains information specific to the MySQL JDBC Connector.
Upsert functionality
^^^^^^^^^^^^^^^^^^^^
The MySQL JDBC Connector supports upsert functionality via the argument
+\--update-mode allowinsert+. To achieve this, Sqoop uses the MySQL clause INSERT INTO
... ON DUPLICATE KEY UPDATE. This clause does not allow the user to specify which columns
should be used to decide whether to update an existing row or add a new one. Instead,
the clause relies on the table's unique keys (the primary key belongs to this set). MySQL
will try to insert the new row, and if the insertion fails with a duplicate unique key error,
it will update the appropriate row instead. As a result, Sqoop ignores the values specified
in the +\--update-key+ parameter; however, the user still needs to specify at least one valid
column to turn on update mode itself.
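For example, a sketch of an upsert-style export (the connection string, table, export directory, and the +id+ update column are illustrative):
----
$ sqoop export --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --export-dir /results/emp_data --update-key id --update-mode allowinsert
----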
MySQL Direct Connector
~~~~~~~~~~~~~~~~~~~~~~
The MySQL Direct Connector allows faster import and export to/from MySQL using the +mysqldump+ and +mysqlimport+ tools
instead of SQL selects and inserts.
To use the MySQL Direct Connector, specify the +\--direct+ argument for your import or export job.
Example:
----
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--direct
----
Passing additional parameters to mysqldump:
----
$ sqoop import --connect jdbc:mysql://server.foo.com/db --table bar \
--direct -- --default-character-set=latin1
----
Requirements
^^^^^^^^^^^^
The +mysqldump+ and +mysqlimport+ utilities should be present in the shell path of the user running the Sqoop command on
all nodes. To validate, SSH to all nodes as this user and execute these commands. If you get an error, so will Sqoop.
Limitations
^^^^^^^^^^^
* Currently the direct connector does not support import of large object columns (BLOB and CLOB).
* Importing to HBase and Accumulo is not supported
* Use of a staging table when exporting data is not supported
* Import of views is not supported
Direct-mode Transactions
^^^^^^^^^^^^^^^^^^^^^^^^
For performance, each writer will commit the current transaction
approximately every 32 MB of exported data. You can control this
by specifying the following argument _before_ any tool-specific arguments: +-D
sqoop.mysql.export.checkpoint.bytes=size+, where _size_ is a value in
bytes. Set _size_ to 0 to disable intermediate checkpoints,
but individual files being exported will continue to be committed
independently of one another.
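For instance, a sketch that raises the checkpoint size to 64 MB (the connection string, table, and export directory are illustrative):
----
$ sqoop export -D sqoop.mysql.export.checkpoint.bytes=67108864 \
    --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --export-dir /results/emp_data --direct
----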
Sometimes you need to export large data with Sqoop to a live MySQL cluster that
is under a high load serving random queries from the users of your application.
While data consistency issues during the export can be easily solved with a
staging table, there is still a problem with the performance impact caused by
the heavy export.
First off, the resources of MySQL dedicated to the import process can affect
the performance of the live product, both on the master and on the slaves.
Second, even if the servers can handle the import with no significant
performance impact (mysqlimport should be relatively "cheap"), importing big
tables can cause serious replication lag in the cluster risking data
inconsistency.
With +-D sqoop.mysql.export.sleep.ms=time+, where _time_ is a value in
milliseconds, you can let the server relax between checkpoints and the replicas
catch up by pausing the export process after transferring the number of bytes
specified in +sqoop.mysql.export.checkpoint.bytes+. Experiment with different
settings of these two parameters to achieve an export pace that doesn't
endanger the stability of your MySQL cluster.
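For instance, a sketch that pauses the export for one second after each 32 MB checkpoint (the connection string and paths are illustrative):
----
$ sqoop export -D sqoop.mysql.export.checkpoint.bytes=33554432 \
    -D sqoop.mysql.export.sleep.ms=1000 \
    --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --export-dir /results/emp_data --direct
----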
IMPORTANT: Note that any arguments to Sqoop that are of the form +-D
parameter=value+ are Hadoop _generic arguments_ and must appear before
any tool-specific arguments (for example, +\--connect+, +\--table+, etc).
Don't forget that these parameters are only supported with the +\--direct+
flag set.
Microsoft SQL Connector
~~~~~~~~~~~~~~~~~~~~~~~
Extra arguments
^^^^^^^^^^^^^^^
A list of all extra arguments supported by the Microsoft SQL Connector is shown below:
.Supported Microsoft SQL Connector extra arguments:
[grid="all"]
`----------------------------------------`---------------------------------------
Argument Description
---------------------------------------------------------------------------------
+\--identity-insert+                    Set IDENTITY_INSERT to ON before \
                                        export insert.
+\--non-resilient+                      Don't attempt to recover failed \
                                        export operations.
+\--schema <name>+                      Schema name that Sqoop should use. \
                                        Default is "dbo".
+\--table-hints <hints>+                Table hints that Sqoop should use for \
                                        data movement.
---------------------------------------------------------------------------------
Allow identity inserts
^^^^^^^^^^^^^^^^^^^^^^
You can allow inserts on identity columns. For example:
----
$ sqoop export ... --export-dir custom_dir --table custom_table -- --identity-insert
----
Non-resilient operations
^^^^^^^^^^^^^^^^^^^^^^^^
You can override the default and not use resilient operations during export.
This will avoid retrying failed operations. For example:
----
$ sqoop export ... --export-dir custom_dir --table custom_table -- --non-resilient
----
Schema support
^^^^^^^^^^^^^^
If you need to work with tables that are located in non-default schemas, you can
specify schema names via the +\--schema+ argument. Custom schemas are supported for
both import and export jobs. For example:
----
$ sqoop import ... --table custom_table -- --schema custom_schema
----
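A sketch of the corresponding export job (the table, directory, and schema names are illustrative):
----
$ sqoop export ... --export-dir custom_dir --table custom_table -- --schema custom_schema
----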
Table hints
^^^^^^^^^^^
Sqoop supports table hints in both import and export jobs. Table hints are used only
for queries that move data from/to Microsoft SQL Server, but they cannot be used for
metadata queries. You can specify a comma-separated list of table hints in the
+\--table-hints+ argument. For example:
----
$ sqoop import ... --table custom_table -- --table-hints NOLOCK
----
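To pass more than one hint, use a comma-separated list; a sketch (the hints shown are illustrative):
----
$ sqoop import ... --table custom_table -- --table-hints "NOLOCK,ROWLOCK"
----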
PostgreSQL Connector
~~~~~~~~~~~~~~~~~~~~
Extra arguments
^^^^^^^^^^^^^^^
A list of all extra arguments supported by the PostgreSQL Connector is shown below:
.Supported PostgreSQL extra arguments:
[grid="all"]
`----------------------------------------`---------------------------------------
Argument Description
---------------------------------------------------------------------------------
+\--schema <name>+                      Schema name that Sqoop should use. \
                                        Default is "public".
---------------------------------------------------------------------------------
Schema support
^^^^^^^^^^^^^^
If you need to work with a table that is located in a schema other than the default one,
you need to specify the extra argument +\--schema+. Custom schemas are supported for
both import and export jobs (the optional staging table, however, must be present in the
same schema as the target table).
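A minimal sketch (the table and schema names are illustrative):
----
$ sqoop import ... --table custom_table -- --schema custom_schema
----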