Luxoft is a global IT service provider of innovative technology solutions that delivers measurable business outcomes to multinational companies. Its offerings encompass strategic consulting, custom software development services, and digital solution engineering. Luxoft enables companies to compete by leveraging its multi-industry expertise in the financial services, automotive, communications, and healthcare & life sciences sectors. For more information, please visit the website.
Senior Python Engineer with Hadoop
In the Media domain we strive to employ cutting-edge technologies and approaches wherever possible. Most of our Customers come from the USA, UK, Australia, Japan and Western Europe, so a good command of English is a must.
Our Customer is among the world's leading creators and distributors of award-winning still imagery, video and multimedia products, as well as other forms of premium digital content, available through its trusted house of brands, including Image.net© and iStock©. The project aims to enhance and further develop an advanced media management platform, helping our Customer serve business customers in more than 100 countries and remain the first place media professionals go to discover, purchase and manage digital content. You will have an opportunity to work alongside best-in-class photographers and imagery professionals (business trips to Seattle, USA are planned) who help customers produce inspiring work that appears every day in the world's most influential newspapers, magazines, advertising campaigns, films, television programs, books and online media.
Responsibilities:
- Develop and maintain high performing ETL/ELT processes, including data quality and testing
- Own the data infrastructure including provisioning, monitoring and automation of infrastructure and application deployments
- Instrument monitoring and alerting
- Design and build data models for Snowflake warehouse and Hadoop based enterprise data lake
- Create and maintain infrastructure and application documentation
- Develop dashboards, reports and visualizations
- Ensure scalability and high performance of the platform
- Design and enhance internally developed frameworks in Python
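The ETL and data-quality responsibilities above can be sketched in plain Python. This is a minimal illustrative example only; all names (`extract_records`, `REQUIRED_FIELDS`, the record fields) are assumptions for the sketch, not part of the Customer's actual platform.

```python
# Minimal ETL sketch with an embedded data-quality gate.
# All identifiers and sample data here are illustrative assumptions.

REQUIRED_FIELDS = {"asset_id", "title", "price_usd"}

def extract_records():
    # Stand-in for reading from a source system (e.g. S3, a queue, an API).
    return [
        {"asset_id": "img-001", "title": "Skyline", "price_usd": "49.00"},
        {"asset_id": "img-002", "title": "Harbor", "price_usd": "25.50"},
    ]

def transform(record):
    # Normalize types before loading into the warehouse.
    out = dict(record)
    out["price_usd"] = float(out["price_usd"])
    return out

def quality_check(records):
    # Fail fast if any record is missing required fields -- the kind of
    # data-quality check the responsibilities above call for.
    for r in records:
        missing = REQUIRED_FIELDS - r.keys()
        if missing:
            raise ValueError(f"record {r.get('asset_id')} is missing {missing}")
    return records

def run_pipeline():
    records = [transform(r) for r in extract_records()]
    quality_check(records)
    # A load step (e.g. writing to Snowflake or the data lake) would go here.
    return records
```

In a production setting a pipeline like this would typically be wrapped in an orchestration tool such as Airflow, with the load step writing to the warehouse.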
Must:
- MS/BS degree in computer science or related field
- 5+ years hands-on experience with designing and implementing data solutions that can handle terabytes of data.
- Strong knowledge of modern distributed architectures and compute / data analytics / storage technologies on AWS Cloud; good understanding of infrastructure choices, sizing and cost of cloud infrastructure/services
- Hands-on working experience with Amazon Redshift, Snowflake or Google BigQuery
- Hands-on experience administering, designing, developing and maintaining software solutions in Hadoop production clusters
- Solid understanding of architectural principles and design patterns/styles for parallel, large-scale distributed frameworks such as Hadoop and Spark
- Experience with Spark and Hive
- Solid experience with Python
- Experience with Terraform and Docker
- Experience with open-source job orchestration tools such as Airflow or Job Scheduler
- Experience with reporting and visualization tools such as Looker/Tableau is a plus
- Outstanding analytical skills, excellent teamwork and a delivery mindset
- Experience in performance troubleshooting, SQL optimization, and benchmarking.
- Experience in UNIX environments, such as creating shell scripts
Nice to have:
- Experience in agile methodologies
- English: Upper-intermediate