The Data Engineer is responsible for the maintenance, improvement, cleaning, and manipulation of data in the business's operational and analytics datastores. The Data Engineer works with the business software engineers, data analytics teams, data scientists, and data warehouse engineers to understand and aid in the implementation of data requirements, analyze performance, and troubleshoot issues.
The Data Engineer supports the Data and Analytics team in database preparation, data wrangling, and data flow and analysis activities. The Data Engineer supports the development and deployment of innovative tools, utilities, components, and services, ranging from simple ETL needs to complex data wrangling issues, to deliver data for advanced analytics and data processing.
The Data Engineer defines and builds the data pipelines that will enable faster, better, data-informed decision-making within the business.
- The successful candidate will be able to develop technical solutions to improve access to data and data usage
- Understand data needs and, with direction from senior resources, construct data pipelines for automating and accelerating data preparation and data wrangling tasks
- Standardize data processing modules to deliver modularity and enhance reusability
- Develop simple REST-based services for data delivery
- Create and maintain standards for data profiling and best practices around data quality
The candidate must have a successful track record in ETL job design, development, and automation activities with minimal supervision. The candidate will be expected to support a variety of structured, semi-structured, and unstructured data preparation and wrangling needs to meet business and technical data requirements. The candidate will troubleshoot, monitor, and coordinate defect resolution related to data processing and preparation, and will be responsible for supporting all existing data pipeline processes across various data assets.
Qualifications
- 8+ years of hands-on experience with COTS ETL tools such as Informatica, SAP DS, and Pentaho PDI
- 1+ year of experience with NiFi/HDF and Kafka
- 3+ years of hands-on experience with Python and various Python toolkits and libraries such as SciPy, NumPy, NLTK, etc.
- 3+ years of experience creating complex SQL queries and functions
- 2+ years of hands-on experience handling data APIs in various formats for ingest and output, such as XML and JSON
- Experience in Java is essential
- Experience in Linux scripting is essential
- 1+ year of experience working with CI/CD tools, including Git, for ETL and script repositories
- Experience with AWS Lambda functions is an advantage
- Working experience with NoSQL databases such as MongoDB, CouchDB, and Cassandra is advantageous
- Experience in working with the Hadoop ecosystem is advantageous
- Able to work independently with minimal supervision
- Bachelor's degree in Computer Science or related discipline
Employer is an Equal Employment Opportunity (EEO) employer. It is the policy of the Company to provide equal employment opportunities to all qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, protected veteran or disabled status, or genetic information.