Data Discovery Connector SDK
Overview
The Data Discovery Connector SDK provides a client for the worker node services, enabling custom scan jobs to run against your own data sources. It wraps the Data Discovery Connector/Bridge endpoints used to poll for pending jobs, report job status, catalog metadata, and submit extracted data for classification.
Getting Access to the SDK
Please reach out to your OneTrust representative or the support team to obtain the Data Discovery Connector SDK files.
Initialize SDK - OTDataDiscoveryBridge Client
import java.net.URL;
import java.util.UUID;

UUID datasourceId; // UUID of the Custom Scanner data source
String bridgeBaseUrl; // Worker Node Ingress URL to Data Discovery Connector/Bridge
String tokenUri; // Cloud Environment base url
String clientId; // Client Id from Client Credential with Data Discovery scope
String clientSecret; // Client Secret from Client Credential with Data Discovery scope
IdentityServiceConfiguration identityServiceConfiguration =
IdentityServiceConfiguration.builder()
.url(new URL(tokenUri))
.build();
ClientCredentialAuthProvider authProvider =
new ClientCredentialAuthProvider(clientId, clientSecret, identityServiceConfiguration);
OTDataDiscoveryConfiguration otDataDiscoveryConfiguration =
OTDataDiscoveryConfiguration.builder()
.datasourceId(datasourceId)
.url(new URL(bridgeBaseUrl))
.build();
OTDataDiscoveryBridge bridge = new OTDataDiscoveryBridge(authProvider, otDataDiscoveryConfiguration);
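Note that new URL(...) throws the checked MalformedURLException, so the initialization typically lives in a method that declares or handles it. A minimal sketch, with a hypothetical createBridge helper:

import java.net.MalformedURLException;

private OTDataDiscoveryBridge createBridge() throws MalformedURLException {
    // Build authProvider and otDataDiscoveryConfiguration as shown above,
    // then construct the bridge client.
    return new OTDataDiscoveryBridge(authProvider, otDataDiscoveryConfiguration);
}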
Running Scans
Wait for the Job to be Initiated
Scan jobs are initiated from the OneTrust application for the configured data source. The custom scanner should run a polling thread that periodically looks up pending jobs.
List<JobDetails> pendingJobs = bridge.getJobs();
If there is a pending job, the list will be non-empty.
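A minimal polling sketch, assuming a scheduled executor; the 30-second interval and the handleJob method are illustrative assumptions, not part of the SDK:

import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();
poller.scheduleWithFixedDelay(() -> {
    try {
        List<JobDetails> pendingJobs = bridge.getJobs();
        for (JobDetails job : pendingJobs) {
            handleJob(job); // hypothetical: start and run the scan for this job
        }
    } catch (Exception e) {
        // Catch and log so one failed poll does not cancel the schedule.
        e.printStackTrace();
    }
}, 0, 30, TimeUnit.SECONDS);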
Start/Complete/Fail Job
public void startJob(JobDetails jobDetails) {
bridge.startJob(jobDetails);
}
public void completeJob(JobDetails jobDetails) {
bridge.completeJob(jobDetails);
}
public void failJob(JobDetails jobDetails, String errorMessage) {
bridge.failJob(jobDetails, errorMessage);
}
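Tying these together, a typical flow is to start the job before extraction and report success or failure afterwards. A sketch, where runScan is a hypothetical placeholder for the cataloging and classification work shown below:

bridge.startJob(jobDetails);
try {
    runScan(jobDetails); // hypothetical: catalog metadata and classify data
    bridge.completeJob(jobDetails);
} catch (Exception e) {
    // Report the failure so the job does not stay pending in OneTrust.
    bridge.failJob(jobDetails, e.getMessage());
}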
Catalog Data
DatabaseMetadata dbMetadata =
DatabaseMetadataRequestsBuilder.databaseMetadataBuilder(
ownerName,
productFamily,
productName,
productVersion,
databaseName,
childrenCount);
//Create MetadataRequest for Database
MetadataRequest metadataRequest =
DatabaseMetadataRequestsBuilder.builder(
jobId,
datasourceId,
dbMetadata);
CatalogOrSchemaMetadata catalogMetadata =
CatalogMetadataRequestsBuilder.catalogMetadataBuilder(
catalogName,
version);
//Create MetadataRequest for Catalog
MetadataRequest metadataRequest =
CatalogMetadataRequestsBuilder.builder(
jobId,
datasourceId,
databaseName,
List.of(databaseName),
EntityType.DATABASE,
List.of(catalogMetadata));
TableMetadata tableMetadata =
TableMetadataRequestBuilder.prepareTableMetadata(
remarks,
tableName,
numRows,
size);
//Create MetadataRequest for Tables
MetadataRequest metadataRequest =
TableMetadataRequestBuilder.builder(
jobId,
datasourceId,
databaseName,
catalogName,
EntityType.CATALOG,
List.of(tableMetadata));
//Create ColumnMetadata
ColumnMetadata columnMetadata =
ColumnMetadataRequestBuilder.columnMetadataBuilder(
columnName,
avgColumnSize,
columnIndex,
dataType);
//Create MetadataRequest for Columns
MetadataRequest metadataRequest =
ColumnMetadataRequestBuilder.builder(
jobId,
datasourceId,
tableName,
EntityType.TABLE, //Parent EntityType
List.of(databaseName, catalogName, tableName), //XPath for Parent
List.of(columnMetadata));
Build MetadataRequests and Send
bridge.catalog(jobDetails,
    MetadataRequests.builder().metadataRequestList(List.of(metadataRequest)).build());
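Because metadataRequestList accepts a list, several requests can be sent in one payload. A sketch, assuming the four MetadataRequest objects built above were given distinct names rather than reusing the metadataRequest variable:

bridge.catalog(jobDetails,
    MetadataRequests.builder()
        .metadataRequestList(List.of(
            databaseMetadataRequest,  // from the Database step
            catalogMetadataRequest,   // from the Catalog step
            tableMetadataRequest,     // from the Table step
            columnMetadataRequest))   // from the Column step
        .build());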
Classify Data
Extract data from the source.
Build ContentDetails for the extracted data. For structured data sources, create one ContentDetails per column, as the following example does.
// (SDK imports for ContentDetails, ClassificationRequest, ColumnMetadata, etc. omitted)
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

private void processBatch(List<List<String>> rowCache, // List of rows; each row is a List of Strings, one per column.
                          boolean hasMoreData,
                          List<ColumnMetadata> columnMetadata, // ColumnMetadata for each column, ordered by column index
                          BaseMetadata table, // TableMetadata for the extracted table
                          JobDetails jobDetails,
                          long count, // Count of all rows extracted so far.
                          List<String> parentXPath) {
List<ContentDetails> contentDetailsList = new ArrayList<>();
// For each column, build one ContentDetails object that collects that column's
// values across every row in the batch.
IntStream.range(0, columnMetadata.size())
.forEach(i -> contentDetailsList.add(ContentDetails.builder()
.data(rowCache.stream()
.map(row -> Objects.nonNull(row.get(i)) ? row.get(i) : "")
.collect(Collectors.toList()))
.endOfContent(!hasMoreData)
.entityName(columnMetadata.get(i).getEntityName())
.entityType(EntityType.COLUMN)
.parentEntityType(EntityType.TABLE)
.parentEntityName(table.getEntityName())
.parentXpath(parentXPath)
.build()));
// Build ClassificationRequest
var classificationRequest = ClassificationRequest.builder()
.jobId(jobDetails.getJobId())
.datasourceId(jobDetails.getDatasourceId().toString())
.structured(true)
.totalCount(count)
.content(contentDetailsList).build();
bridge.classify(jobDetails, classificationRequest);
}
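For context, a sketch of an extraction loop that feeds processBatch; the extractor API, batch size, and surrounding variable names (columnMetadataList, tableMetadata, parentXPath) are illustrative assumptions:

int batchSize = 1000; // illustrative batch size
List<List<String>> rowCache = new ArrayList<>();
long count = 0;
while (extractor.hasNext()) { // hypothetical row iterator for the table
    rowCache.add(extractor.nextRow());
    count++;
    if (rowCache.size() == batchSize || !extractor.hasNext()) {
        // hasMoreData=false on the last batch sets endOfContent on each ContentDetails
        processBatch(rowCache, extractor.hasNext(), columnMetadataList, tableMetadata,
            jobDetails, count, parentXPath);
        rowCache.clear();
    }
}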