Short Description
ABB is looking for a Data Developer who will develop and improve continuous integration and deployment processes and contribute to the ongoing development of the data warehouse ecosystem.
- Provide ETL solutions to populate and enhance our data stores; design and build optimized data structures and data pipelines to transform and validate data.
- Maintain high data quality and integrity across our systems by designing QA/QC processes for data pipelines.
- Assist in data governance processes, planning, security, and execution (develop and maintain data dictionaries for governance of published data sources).
- Integrate with diverse APIs supporting internal and external customers.
- Build self-monitoring, robust, scalable batch and streaming data pipelines for 24/7 global operations, with highly reusable code modules and packages that can be leveraged across pipelines.
- Bachelor's degree (B.S.) from a four-year college or university in computer science, information sciences, or a related field.
- 3+ years of experience building software and working with data in a hands-on, data-centric role involving data engineering, streaming, or warehousing.
- Demonstrated strength in data modeling, ETL development, and data warehousing, and the ability to understand and organize data from various sources.
- Extensive experience writing complex, highly optimized queries across large data sets using T-SQL, PL/SQL, Postgres, MySQL, or similar.
- Strong expertise in an object-oriented language (preferably Python or C#) and with data-centric cloud services in Azure, AWS, or Google Cloud to deliver data pipelines and manage data stores.
- Experience consuming and cleansing data from third-party APIs and other sources, and identifying and resolving performance and data quality issues.
- Experience with columnar relational data stores and NoSQL technologies.
- Experience with big data tools such as Hadoop, Hive, and Spark, as well as knowledge of more traditional data warehouses.
- Experience with modern data pipelines, data streaming, and real-time analytics using Apache Kafka, AWS Kinesis, Spark Streaming, Elasticsearch, or similar tools.
- Familiarity with machine learning tools and concepts.