Job Title : Lead Architect
Location: Hartford, CT.
CDF Cluster Build
- Expertise with CDF / NiFi Platform + Security
- Expertise with AWS deployments
- Expertise with NiFi development, custom processors and controllers
1. Perform pre-installation validation tasks
2. Install and deploy one (1) Cloudera Data Flow (HDF) 3.x cluster
3. Perform installation, configuration, and validation, as per the Cloudera checklist
4. Set up NiFi Registry and Schema Registry
5. Perform benchmark testing and make necessary tuning adjustments
6. Create an Ambari blueprint for the CDF installation
7. Configure security in the cluster as follows:
a. Integrate with the corporate AD server, create Kerberos principals, and propagate Kerberos keytabs
b. Configure Ranger with Hadoop Active Directory user groups
c. Test security as per configuration
8. Set up Monitoring and Alerting to harden the operation of the data flow
9. Perform an end-to-end NiFi flow test
10. Perform HDP configuration backup
11. Compose installation & configuration runbook, as per Cloudera standard format
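The blueprint deliverable in item 6 can be sketched programmatically. An Ambari blueprint is a JSON document registered with the Ambari REST API; the cluster name, stack version, host-group names, and component layout below are illustrative assumptions, not the actual cluster design:

```python
import json

# Minimal sketch of an Ambari blueprint for an HDF 3.x cluster (item 6).
# All names below (cluster, host groups, cardinalities) are hypothetical;
# a real blueprint is typically exported from a validated cluster via the
# Ambari REST API and re-registered for repeatable installs.
blueprint = {
    "Blueprints": {
        "blueprint_name": "cdf-cluster",  # hypothetical name
        "stack_name": "HDF",
        "stack_version": "3.4",           # assumed HDF 3.x version
    },
    "host_groups": [
        {
            "name": "nifi_nodes",
            "cardinality": "3",
            "components": [
                {"name": "NIFI_MASTER"},
                {"name": "ZOOKEEPER_SERVER"},
            ],
        },
        {
            "name": "registry_node",
            "cardinality": "1",
            "components": [
                {"name": "NIFI_REGISTRY_MASTER"},
                {"name": "REGISTRY_SERVER"},  # Schema Registry
            ],
        },
    ],
}

# Serialized form, ready to register under /api/v1/blueprints/<name>.
payload = json.dumps(blueprint, indent=2)
```

Keeping the blueprint in version control alongside the runbook (item 11) makes the installation reproducible across environments.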
Data Ingestion and Data Analytics
1. Develop a NiFi dataflow to monitor the S3 bucket(s), ingest files from a prioritized list of custom file format(s), perform the required transformations (i.e., record conversion), and land the files in ITE and HDP, ensuring complete data provenance throughout the process.
a. Customer's input files will arrive in an encrypted format. Use the C-Level encryption/decryption library provided by Customer to decrypt, decompress, and stitch the input files
b. Use parameter decoder rings to convert parameter names to Customer standard
c. Integrate Customer's Python scripts into the NiFi processors
d. Output files will be in ubin2 and CSV/ORC formats, sent to MFT (the data landing zone for the ITE platform) and HDFS (the data landing zone for the HDP platform), respectively
e. (Optional) Demo dataflow from one NiFi flow to another
2. Create a NiFi monitoring and alerting mechanism to acknowledge and validate that the files have been received
3. Configure NiFi Registry and Schema Registry to have a fully supported deployment life cycle, with version control, from Development to Production
4. Create Hive Schemas and Tables for the data landed in HDFS
5. Create an audit trail (status events) to log information about files that have been received
a. Leverage Elastic Search processors in NiFi to send the status events to Customer's existing Elastic Search infrastructure
6. Perform NiFi validation activities, as part of Cloudera standard checklist
7. Perform cluster performance testing and make necessary tuning adjustments and/or provide hardware recommendations on cost-optimal VMs
8. Perform issue-resolution activities as part of User Acceptance Testing (Functional and Non-Functional)
9. Review alternative approaches (developed by the Customer team and/or Customer's Vendors) in two (2) three-hour sessions.
10. Compose configuration runbook, as per Cloudera standard format
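The transformations in items 1a and 1b above can be sketched outside NiFi as plain Python. This is a local stand-in under stated assumptions: the customer's C-Level decryption library is replaced by a no-op, gzip stands in for the real compression, the input record layout (pipe-delimited with a header row) and the decoder-ring mapping are invented for illustration:

```python
import csv
import gzip
import io

# Hypothetical decoder ring (item 1b): raw parameter names -> Customer standard.
DECODER_RING = {"p01": "member_id", "p02": "claim_amount"}

def decrypt(blob: bytes) -> bytes:
    # Stand-in for the Customer-provided C-Level decryption call (item 1a).
    return blob

def stitch_and_decompress(parts: list[bytes]) -> str:
    # Decrypt each file part, stitch the parts back into one stream,
    # then decompress the whole (gzip assumed here for illustration).
    stitched = b"".join(decrypt(p) for p in parts)
    return gzip.decompress(stitched).decode("utf-8")

def convert_records(text: str) -> str:
    # Record conversion: pipe-delimited input -> CSV, with header names
    # rewritten to the Customer standard via the decoder ring.
    rows = [line.split("|") for line in text.splitlines() if line]
    header = [DECODER_RING.get(name, name) for name in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(header)
    writer.writerows(rows[1:])
    return out.getvalue()
```

In the actual deliverable these steps run inside NiFi processors (with the Python scripts from item 1c invoked from the flow), so that every step is captured in NiFi's data provenance.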
Central Business Solutions, Inc.,
37600 Central Ct.
Newark, CA 94560.