Position

ETL (Hadoop & Hive) Data Engineer, Bangalore

Location


Bangalore

Office Address


Project Description


Data engineers are responsible for finding trends in data sets and developing algorithms to make raw data more useful to the enterprise. This IT role requires a significant set of technical skills, including deep knowledge of SQL database design and multiple programming languages. But data engineers also need communication skills to work across departments and understand what business leaders want to gain from the company's large datasets.
Data engineers are often responsible for building algorithms that give easier access to raw data, but to do this they need to understand the company's objectives. It is important to keep business goals in mind when working with data, especially for companies that handle large and complex datasets and databases.

Responsibilities


    • Design and develop ETL pipelines to ingest data into Hadoop from different data sources (files, mainframe, relational sources, NoSQL, etc.) using Informatica BDM.
    • Parse unstructured and semi-structured data, such as JSON and XML, using Informatica Data Processor.
    • Analyze existing Informatica PowerCenter jobs and redesign and redevelop them in BDM.
    • Design and develop efficient mappings and workflows to load data into data marts.
    • Perform gap analysis between various legacy applications to migrate them to newer platforms/data marts.
    • Write efficient queries in Hive or Impala and PostgreSQL to extract data on an ad hoc basis for data analysis.
    • Identify performance bottlenecks in ETL jobs and tune them by enhancing or redesigning the jobs.
    • Work with Hadoop administrators and PostgreSQL DBAs on partitioning Hive tables, refreshing metadata, and other activities that enhance the performance of data loading and extraction (see the sketch after this list).
    • Tune the performance of ETL mappings and queries.
    • Write simple to moderately complex shell scripts to preprocess files, schedule ETL jobs, and so on.
    • Identify manual processes and queries in the data and BI areas, then design and develop ETL jobs to automate them.
    • Participate in daily scrums; work with vendor partners, the QA team, and business users at various stages of the development cycle.
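
For illustration, here is a minimal PySpark sketch of the kind of work described above: loading raw files into a Hive table partitioned for faster reads, then running an ad hoc query over it. All table names, columns, and paths are hypothetical, and this is one possible approach under stated assumptions, not the project's actual code.

    from pyspark.sql import SparkSession

    # Hypothetical sketch: ingest raw CSV files into a Hive table
    # partitioned by load_date, then query it ad hoc via Spark SQL.
    spark = (SparkSession.builder
             .appName("policy-ingest")       # hypothetical job name
             .enableHiveSupport()            # required to write Hive tables
             .getOrCreate())

    raw = (spark.read
           .option("header", "true")
           .csv("/landing/policies/"))      # hypothetical landing path

    (raw.write
        .mode("append")
        .partitionBy("load_date")           # partition pruning speeds later queries
        .saveAsTable("mart.policies"))      # hypothetical data-mart table

    # Ad hoc analysis, the kind of query run daily in Hive or Impala
    spark.sql("""
        SELECT load_date, COUNT(*) AS row_cnt
        FROM mart.policies
        GROUP BY load_date
        ORDER BY load_date
    """).show()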

Skills


Must have

    • 7+ years of experience designing and developing ETL jobs (Informatica or any other ETL tool)
    • 3+ years of experience working on the Informatica BDM platform
    • Experience with the various execution modes in BDM, such as Blaze, Spark, Hive, and Native
    • 3+ years of experience working on the Hadoop platform, writing Hive or Impala queries
    • 5+ years of experience working on relational databases (Oracle, Teradata, PostgreSQL, etc.) and writing SQL queries
    • Deep knowledge of performance tuning for ETL jobs, Hadoop jobs, and SQL queries, including partitioning, indexing, and other techniques
    • Experience writing shell scripts
    • Experience with Spark jobs (Python or Scala) is an asset
    • 1+ years of experience working with AWS technologies for data pipelines and data warehouses
    • 5+ years of experience building ETL jobs to load data warehouses and data marts
    • Awareness of Kimball and Inmon data warehouse methodologies
    • Knowledge of the wider Informatica product line (IDQ, MDM, IDD, BDM, Data Catalog, PowerCenter, etc.) is nice to have
    • Experience working in Agile Scrum methodology; should have used Jira, Bitbucket, Git, and Jenkins to deploy code from one environment to another
    • Experience working in a diverse, multicultural environment with different vendors and onsite/offshore vendor teams
    • P&C insurance industry knowledge is an added asset
    • Developer certifications in the Informatica product suite are nice to have
    • 2+ years of experience with the AWS data stack (IAM, S3, Kinesis Streams, Kinesis Firehose, Lambda, Athena, Glue, Redshift, and EMR) is nice to have (a minimal example follows this list)
    • Exposure to other cloud platforms such as Azure and GCP is also acceptable
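
As a short, hedged illustration of the AWS items above, the sketch below launches an ad hoc Athena query from Python with boto3. The region, database, table, and S3 bucket are assumptions made for the example, not details from this posting.

    import boto3

    # Hypothetical sketch: run an ad hoc Athena query over data in S3.
    athena = boto3.client("athena", region_name="ap-south-1")  # assumed region

    response = athena.start_query_execution(
        QueryString="SELECT COUNT(*) FROM claims",              # hypothetical table
        QueryExecutionContext={"Database": "insurance_lake"},   # hypothetical database
        ResultConfiguration={
            # Athena writes its result files to an S3 location you own
            "OutputLocation": "s3://example-athena-results/",
        },
    )
    print(response["QueryExecutionId"])  # poll get_query_execution for status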

Nice to have

    • Experience working on Agile projects

Languages


English: C2 Proficient

Relocation package


If needed, we can help you with the relocation process.

Work Type


BigData (Hadoop etc.)

Ref Number


VR-57246
