Data Engineer (3+ years)
Location: Mumbai - Hybrid
About the Role:
We are seeking a skilled Data Engineer to join our team focused on building an LLM-based
automated underwriting system for home loan applications. You will be responsible for
designing and maintaining data pipelines, optimizing our databases, and ensuring reliable,
clean, and well-structured data to support our AI/ML models and business logic.
Key Responsibilities:
1. Data Pipeline Development:
o Design, develop, and maintain ETL pipelines to ingest, transform, and load data from multiple sources (e.g., MySQL, MongoDB, Elastic) into a centralized data store for model training and inference.
o Implement workflows using tools like Apache Airflow, AWS Glue, or equivalent to ensure timely and reliable data availability.
2. Data Pre-processing & Feature Engineering:
o Clean, normalize, and enrich large datasets of home loan applications, both structured (e.g., financial data) and unstructured (e.g., customer narratives, supporting documents).
o Collaborate with Data Scientists and ML Engineers to define and implement feature extraction, data transformations, and schema optimizations to support LLM-based underwriting models.
3. Database Management & Optimization:
o Work closely with the Database Administrator to configure and optimize MySQL and MongoDB for high-performance query execution and efficient data retrieval.
o Ensure data integrity, consistency, and compliance with relevant security and privacy regulations.
4. Infrastructure & Scalability:
o Implement scalable data storage solutions on AWS (e.g., S3, Redshift, DynamoDB) and integrate these resources seamlessly with model training environments (e.g., SageMaker).
o Monitor and troubleshoot data pipelines, ensuring minimal downtime and prompt resolution of data quality issues.
5. Collaboration & Best Practices:
o Collaborate with Machine Learning Engineers, Data Scientists, and Backend Developers to understand data requirements and deliver solutions aligned with project goals.
o Follow best practices in version control (Git), documentation, and testing to maintain a robust and reliable data infrastructure.
6. Continuous Improvement:
o Stay up to date with the latest tools, technologies, and best practices in data engineering and cloud services.
o Propose and implement improvements to enhance data quality, reduce latency, and streamline data operations.
Qualifications & Experience:
Bachelor's or Master's degree in Computer Science, Information Systems, or a related
field.
3+ years of experience in data engineering or a related role.
Strong proficiency in SQL and experience with relational databases (MySQL) and
NoSQL databases (MongoDB).
Hands-on experience with ETL/ELT frameworks and data workflow orchestration
tools (e.g., Airflow, AWS Glue).
Proficiency in Python or a similar programming language for data processing.
Familiarity with AWS cloud services (e.g., EC2, S3, Redshift, Lambda, SageMaker)
and containerization (Docker).
Understanding of data modelling, data warehousing, and schema design best practices.
Strong problem-solving skills, attention to detail, and ability to work independently
and as part of a team.
Nice-to-Have Skills:
Exposure to NLP or ML projects, especially those involving large language models.
Experience with CI/CD for data pipelines and Infrastructure-as-Code (e.g., Terraform,
CloudFormation).
Knowledge of security and compliance standards relevant to financial data.
About Us
Easy Home Finance has revolutionized the industry with India's first online mortgage experience, helping millions achieve their dream of owning their first home, hassle-free.
We believe homeownership should be easy, transparent, and low-cost for all. We're using technology to make it faster, and humans to help make it friendly and enjoyable.