Data Engineer Intern

We are looking for a highly motivated Data Engineer Intern with a passion for data-driven solutions and a knack for building scalable data processing systems. The ideal candidate will have a strong foundation in data analysis and in developing NLP-based applications. You will be instrumental in creating semantic product search and basket analysis tools and frameworks for the retail industry.

Key Responsibilities

  • Design and implement scalable data processing and analysis pipelines.
  • Develop and integrate NLP models and LLMs for semantic product search and basket analysis.
  • Collaborate with data scientists and engineers to refine data for predictive modeling.
  • Assist in the collection, cleansing, and transformation of large data sets.
  • Evaluate and improve the performance of existing data systems and processes.
  • Participate in the development of algorithms for personalized customer recommendations.
  • Research emerging technologies and methodologies to enhance our data capabilities.


Qualifications

  • Currently pursuing a degree in Computer Science, Data Science, Engineering, or a related field.
  • Strong programming skills in Python, including experience with Pandas, NumPy, and Scikit-learn.
  • Familiarity with NLP tools and libraries (e.g., NLTK, spaCy, or Transformers).
  • Experience with data processing and analysis tools (e.g., SQL, Apache Spark).
  • Understanding of machine learning concepts and algorithms.
  • Excellent problem-solving and analytical skills.
  • Ability to work collaboratively in a team environment.
  • Strong communication skills, both verbal and written.

Preferred Qualifications

  • Prior experience or projects involving NLP or machine learning.
  • Knowledge of cloud computing services (e.g., AWS, Google Cloud Platform).
  • Familiarity with version control systems, preferably Git.

What We Offer

  • Competitive stipend based on your skill set.
  • Hands-on experience with real-world data engineering and NLP projects.
  • Databricks training.
  • Access to Azure/AWS/GCP data infrastructure.
  • Mentorship from experienced professionals in the field.
  • A collaborative, innovative, and inclusive work environment.
  • Opportunities for professional development and networking.