FTP Data Source
1. Overview
File Transfer Protocol (FTP) is a protocol for transferring files between hosts. It involves two parts:
- FTP server: stores files.
- FTP client: accesses resources on the FTP server over FTP.
After configuring the FTP data source, you can use the Data Fusion > Task Management function to import its data into Sensors data tables or entities, making it available in reports, analysis models, intelligent operations, and other modules.
Before configuring the data source, confirm that your FTP data source meets the following requirements:
Data source type | Data source name | Version/protocol requirements | User permission requirements | Other requirements |
---|---|---|---|---|
Object storage | FTP | FTP/SFTP | At least read access to the folder path | Data files must be in txt or csv format |
2. Add FTP data source
- Select Data Fusion > Universal Data Access > Data Source Management.
- Click the All Data Source tab.
- Click FTP data source.
- Click the Create button in the upper right corner.
- Fill in the FTP connection information:
  - Data Source Connection Name: A customizable name that serves as the unique identifier for the data source connection within the platform.
  - Protocol Type: Supports FTP and SFTP.
  - Server: The IP address of the data source connection; multiple entries are supported in a cluster environment.
  - Port Number: The port number for the data source connection.
  - Base Path: The absolute path to the root directory, for example: /home/sa_cluster.
  - File Type: The type of data files to read; txt and csv are currently supported. During data synchronization, only files of the specified type are read.
  - Username: A valid username for the data source connection.
  - Password: The password corresponding to the username.
- Click the Test Connection button.
- Click the Submit button.
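Before filling in the form, it can help to sanity-check the values locally. The following is a minimal sketch, not part of the platform: the class name `FtpSourceConfig`, its field names, and both helper methods are hypothetical and only mirror the form fields listed above. The `can_list_base_path` helper uses Python's standard `ftplib` and therefore covers plain FTP only; SFTP would need a separate SSH library.

```python
from dataclasses import dataclass
from ftplib import FTP, all_errors
from typing import List

SUPPORTED_PROTOCOLS = {"FTP", "SFTP"}   # from the Protocol Type field
SUPPORTED_FILE_TYPES = {"txt", "csv"}   # from the File Type field


@dataclass
class FtpSourceConfig:
    """Hypothetical container mirroring the connection form fields."""
    name: str
    protocol: str
    server: str
    port: int
    base_path: str
    file_type: str
    username: str
    password: str

    def problems(self) -> List[str]:
        """Return a list of problems; an empty list means the values look plausible."""
        found = []
        if self.protocol.upper() not in SUPPORTED_PROTOCOLS:
            found.append(f"unsupported protocol: {self.protocol}")
        if self.file_type.lower() not in SUPPORTED_FILE_TYPES:
            found.append(f"unsupported file type: {self.file_type}")
        if not self.base_path.startswith("/"):
            found.append("base path must be an absolute path")
        if not 1 <= self.port <= 65535:
            found.append(f"port out of range: {self.port}")
        return found


def can_list_base_path(cfg: FtpSourceConfig, timeout: float = 10.0) -> bool:
    """Plain FTP only: log in and list the base path to confirm read access."""
    ftp = FTP()
    try:
        ftp.connect(cfg.server, cfg.port, timeout=timeout)
        ftp.login(cfg.username, cfg.password)
        ftp.cwd(cfg.base_path)
        ftp.nlst()  # listing succeeds only if the account can read the folder
        return True
    except all_errors:
        return False
    finally:
        ftp.close()
```

Calling `problems()` first catches typos without touching the network; `can_list_base_path` is a rough local stand-in for what the Test Connection button checks, under the assumption that read access to the base path is the deciding factor.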
2.1. FTP Dataset Configuration Method
To ingest data through an FTP data source, configure paths, folders, and files as follows.
2.1.1. Path Rule Definition
When importing a dataset, configure the path according to the path rule, for example: /home/dataGroupFile/dataFile
- /home: base path
- dataGroupFile: dataset group folder
- dataFile: dataset folder
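The path rule can be illustrated with a short sketch that splits a full path into its group folder and dataset folder. The function name `split_dataset_path` is hypothetical, and the sketch assumes the layout is exactly base path, then one group folder, then one dataset folder.

```python
from pathlib import PurePosixPath
from typing import Tuple


def split_dataset_path(full_path: str, base_path: str) -> Tuple[str, str]:
    """Return (group_folder, dataset_folder) for a path laid out as base/group/dataset."""
    # relative_to raises ValueError when full_path is not under base_path
    relative = PurePosixPath(full_path).relative_to(PurePosixPath(base_path))
    parts = relative.parts
    if len(parts) != 2:
        raise ValueError(
            f"expected <base>/<group folder>/<dataset folder>, got {full_path!r}"
        )
    return parts[0], parts[1]


group, dataset = split_dataset_path("/home/dataGroupFile/dataFile", "/home")
print(group, dataset)  # dataGroupFile dataFile
```

Passing the base path explicitly keeps the sketch correct when the base path has several components, such as /home/sa_cluster.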
2.1.2. Contents of a single dataset
No. | Content and necessity | Purpose | Constraints | Sample file |
---|---|---|---|---|
01 | Dataset group folder (required) | Analogous to a database (DB) in a structured database; groups datasets | The name is unrestricted and can be customized | - |
02 | Dataset folder (required) | Each folder represents one dataset | The folder name can contain only letters, digits, and underscores (_), must start with a letter, and can be at most 100 characters long | - |
03 | Metadata file (required) | Describes the data structure of the current dataset. Only one metadata file can be stored in a dataset folder | File format: yml | |
04 | Data file (optional) | Stores the data of the current dataset. Multiple data files can be stored in one dataset folder | File format: txt or csv. After FTP data is added, only three update modes are supported: full overwrite, full add, and incremental add | |
05 | Ready file | Indicates that a data file is ready. There is no specific requirement on the contents of the file | File format: verf | |
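The constraints in the table can be checked before uploading. The sketch below is illustrative only: the function name `check_dataset_folder` is hypothetical, "letters" is assumed to mean ASCII letters, and the metadata file is assumed to be recognizable by a `.yml` extension.

```python
import re
from typing import Iterable, List

# Starts with a letter, then letters, digits, or underscores; at most 100 characters.
DATASET_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,99}")


def check_dataset_folder(folder_name: str, file_names: Iterable[str]) -> List[str]:
    """Return a list of rule violations for one dataset folder (empty means OK)."""
    problems = []
    if not DATASET_NAME.fullmatch(folder_name):
        problems.append(f"invalid dataset folder name: {folder_name!r}")
    # Assumption: the metadata file is identified by its .yml extension.
    metadata = [n for n in file_names if n.lower().endswith(".yml")]
    if len(metadata) != 1:
        problems.append(
            f"expected exactly one yml metadata file, found {len(metadata)}"
        )
    return problems


print(check_dataset_folder("orders_2024", ["meta.yml", "part1.csv", "part1.verf"]))  # []
```

Data files are optional according to the table, so the sketch does not require any txt or csv file to be present.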
3. Manage FTP data source
- Select Data Fusion > Universal Data Access > Data Source Management.
- Click the Added Data Source tab.
- Click FTP data source.
  - Edit: Modify any configuration parameter of the data connection.
  - Delete: Delete the current connection.
If the current data connection is used by a task, modifying the parameters or deleting the connection will cause the task to fail.
4. Mapping rules for field types
When you import data from an FTP data source into a Sensors data table, incorrect field type mappings can cause content conversion errors or task execution failures. Configure field mappings according to the following rules to ensure safe data conversion:
Original field type | Data table field type |
---|---|
tinyint | NUMBER / INT / BIGINT |
smallint | NUMBER / INT / BIGINT |
mediumint | NUMBER / INT / BIGINT |
int | NUMBER / INT / BIGINT |
bigint | NUMBER / BIGINT |
float | NUMBER |
double | NUMBER |
decimal | NUMBER |
char | STRING |
enum | STRING |
longtext | STRING |
mediumtext | STRING |
string | STRING |
text | STRING |
tinytext | STRING |
varchar | STRING |
year | STRING |
date | TIMESTAMP |
datetime | TIMESTAMP |
timestamp | TIMESTAMP |
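The mapping table can be expressed as a lookup for checking a planned field mapping before running a task. This is a sketch under the assumption that type names are compared case-insensitively; the names `ALLOWED_TARGETS` and `is_safe_mapping` are hypothetical and not part of the product.

```python
from typing import Dict, Set

# Build the lookup from the rows of the mapping table above.
ALLOWED_TARGETS: Dict[str, Set[str]] = {}
for source in ("tinyint", "smallint", "mediumint", "int"):
    ALLOWED_TARGETS[source] = {"NUMBER", "INT", "BIGINT"}
ALLOWED_TARGETS["bigint"] = {"NUMBER", "BIGINT"}
for source in ("float", "double", "decimal"):
    ALLOWED_TARGETS[source] = {"NUMBER"}
for source in ("char", "enum", "longtext", "mediumtext", "string",
               "text", "tinytext", "varchar", "year"):
    ALLOWED_TARGETS[source] = {"STRING"}
for source in ("date", "datetime", "timestamp"):
    ALLOWED_TARGETS[source] = {"TIMESTAMP"}


def is_safe_mapping(source_type: str, target_type: str) -> bool:
    """True if the table lists target_type as a valid target for source_type."""
    allowed = ALLOWED_TARGETS.get(source_type.lower(), set())
    return target_type.upper() in allowed


print(is_safe_mapping("int", "BIGINT"))   # True
print(is_safe_mapping("bigint", "INT"))   # False
```

Source types that do not appear in the table are treated as unsafe rather than guessed at.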
Note: This document is technical documentation describing how to use Sensors products and does not include sales terms; the specific products and technical services purchased by an enterprise are subject to the commercial procurement contract.