Hitachi Vantara, a wholly owned subsidiary of Hitachi, Ltd., guides our customers from what's now to what's next by solving their digital challenges. Working alongside each customer, we apply our unmatched industrial and digital capabilities to their data and applications to benefit both business and society. More than 80% of the Fortune 100 trust Hitachi Vantara to help them develop new revenue streams, unlock competitive advantages, lower costs, enhance customer experiences, and deliver social and environmental value.

The Role
Total Experience: 12-14 years

Must Have:
- Basics of distributed computing
- Distributed computing vs. RDBMS; scale-up vs. scale-out
- Hands-on experience in at least one programming language (Java, Python, Scala)
- Understanding of Linux and Bash scripting
- Knowledge of SQL
- Basics of the Hadoop framework and the problem patterns it can solve, such as filtering, aggregation, and joins
- Understanding of Spark concepts such as RDDs, DataFrames, and closures; has implemented at least one project using Spark and Scala
- Should have worked on at least 1-2 big data projects (e.g., ingestion or ETL processing) on the Cloudera platform
- Understanding of Hive/Pig concepts such as partitioning, bucketing, the metastore, schema-on-read vs. schema-on-write, and SerDes
- Solid programming fundamentals and design concepts
- In-depth understanding of different batch and stream processing technologies and NoSQL storage
- Demonstrated work experience in a Sr. Developer / Jr. Architect role on big data, cloud, and open-source technology stacks
- Able to recommend the right technology stack for different use cases, with clear reasoning
- Understanding of the Lambda and Kappa architectures
- Has participated in, or can advise on, hardware choices, platform components, distributions, etc.
Good to Have:

- Programming concepts
- Object-oriented vs. functional programming concepts
- Design patterns (Singleton, Immutable, Factory)
- MapReduce programming: Combiner, Partitioner, InputFormat/OutputFormat, serialization
- Distributed Computing
- Scale up vs Scale out
- Hands-on Scala, Spark SQL, DataFrames, etc.
- Understanding of different storage formats: Avro, RCFile, ORC, Parquet
- Has worked, or is working, on any one of the cloud platforms: AWS, Azure, GCP
- Has worked, or is working, on any one of the big data platforms: Hortonworks, Cloudera, DataStax, Databricks
- Aware of the latest trends in streaming, real-time, and batch processing frameworks (Storm, Apache Beam, Flink, Spark, Kafka Connect, etc.)
- Certified in any one of the big data distributions (Hortonworks/Cloudera/Databricks/DataStax)
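The MapReduce problem patterns named above (filtering, aggregation) can be reasoned about without any cluster framework. A minimal, framework-free Python sketch of the map and reduce phases of a word count, with filtering folded into the map step (illustrative only, not Hadoop's actual API):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: tokenize each line and emit (word, 1) pairs,
    # filtering out empty tokens.
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key, then aggregate the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["spark and hadoop", "spark scala"]
print(reduce_phase(map_phase(lines)))
# {'spark': 2, 'and': 1, 'hadoop': 1, 'scala': 1}
```

In a real Hadoop job the shuffle is performed by the framework between the Mapper and Reducer, and a Combiner applies the same aggregation map-side to cut network traffic.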
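For the design-pattern bullet, candidates are typically expected to produce small concrete examples of the three patterns named (Singleton, Immutable, Factory). A brief Python sketch; the class and sink names are purely illustrative:

```python
from dataclasses import dataclass

class Registry:
    """Singleton: every construction returns the same shared instance."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

@dataclass(frozen=True)
class Offset:
    """Immutable: a frozen dataclass rejects mutation after construction."""
    topic: str
    position: int

def make_sink(kind: str) -> str:
    """Factory: pick a concrete implementation from a config value."""
    sinks = {"hdfs": "HdfsSink", "s3": "S3Sink"}
    if kind not in sinks:
        raise ValueError(f"unknown sink: {kind}")
    return sinks[kind]

assert Registry() is Registry()                       # one shared instance
assert Offset("events", 42) == Offset("events", 42)   # value equality
```

The same three patterns map directly onto the Scala idioms used in Spark code: `object` for singletons, `case class` for immutable values, and companion-object `apply` methods as factories.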
We are an equal opportunity employer. All applicants will be considered for employment without attention to age, race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.