
Hi, I am Ashwath

Ashwath Shetty

Data Scientist at Elsevier


- I like Maths, Programming, Physics, open source & scalable systems.
- Currently working as a Data Scientist at Elsevier R&D, focusing on NLP & Computer Vision.
- Research interests include Large Language Models (LLMs), multi-modal LLMs, document understanding, GenAI, and democratizing AI.
- I write on Substack and LinkedIn; you can follow me via the links below. Subscribe to my Substack for early access to my new writing series discussing the latest research papers.
- If you’re interested in having me for a tech talk, reach out to me at ashwathkumarshettyr@gmail.com
- I offer free data science mentorship for underprivileged communities & high school students; please drop me an email if you are interested.

Problem Solving
Teamwork
Leadership
Storytelling

Skills

Experiences

Data Scientist
Elsevier

- Present, Remote

Elsevier is a leading Dutch publishing house.

Responsibilities:
  • I’m part of the R&D team based in Amsterdam. The team focuses on research and development of cutting-edge deep learning models in Natural Language Processing (NLP) and Computer Vision (CV) for products like Scopus, ScienceDirect & SciVal.
  • Spearheaded the creation of an advanced multi-modal LLM-based system that generates alt-text for nursing book images to enhance accessibility for visually impaired individuals.
  • Developed a multi-modal and multilingual deep learning model using LiLT to extract entities from PDF-formatted scientific research articles. The extracted entities are subsequently added to the Scopus database.
  • Designed and developed a common evaluation pipeline for assessing the performance of the Entity Extraction System. It is currently used by 3+ teams to evaluate diverse services, including those from products like ScienceDirect.
  • Managed stakeholders across customers, product managers & the engineering team to bring the product to market, collaborating closely with teams in Europe & North America.

Tata Elxsi

October 2019 - Dec 2022, Bengaluru

Tata Elxsi is a widely recognized company in various domains and constantly works on innovative projects.

Senior Data Scientist

October 2019 - Dec 2022

  • Engineered an NER-based recommendation system to identify confidential information in documents for redaction.
    - The system cut the time taken for redaction by 90%, a 10X increase in redaction speed.
    - Led the project from inception to deployment, reporting directly to the Co-founder and CTO from the client’s side. This effort resulted in the company securing $900k in pre-seed funding.
  • Automated the transaction verification process in the truck-based delivery system for a Mexican company to optimize human interaction and increase verification speed.
    - Independently managed the end-to-end NLP pipeline for the project, including the development of a document similarity-based algorithm for document classification.
    - The system saved the company $10M per year by optimizing the truck delivery process.
  • Designed and developed an automated performance review pipeline to analyze the data and generate reports, facilitating a comprehensive understanding of the game development team’s monthly performance.
  • Designed and developed a machine learning framework for structured data, resulting in a 2X increase in development speed and overall efficiency.
Data Scientist

October 2019 - October 2021

  • Started as a software engineer on a game development project for a short time, then moved to a data scientist role. I was promoted to a senior role within two years due to my exceptional performance.
  • Developed multiple educational games for Disney, collaborating with teams in the USA, UK, Australia, and Mexico. The games are used by 1M+ kids all over the world.

Omdena

- Present,

Omdena is an AI-for-good organization that collaborates with NGOs and engineers to bring AI for good to society.

Lead Machine Learning Engineer

- Present

  • I’m one of Omdena’s top volunteers in its AI for Good initiative. Check out my profile here: https://collaborator.omdena.com/collaborator-profile/56418
  • Actively participating as a volunteer in open-source contributions with Omdena ( https://omdena.com/ ), successfully executing many AI projects in collaboration with NGOs and startups. Also part of Omdena’s AI community promoting AI for good.
  • Developed a machine learning based methodology to predict the time to the next treatment curve for oncology immunotherapy. Additionally, created a dashboard to visualize the results. This project was carried out in collaboration with Omdena and Mango Science as part of Omdena’s AI for Good initiative. The project also involved continuous interaction with oncology specialists and patients.
Machine Learning Engineer

- Present

  • Identified the root causes of anomalies in data centers using state-of-the-art unsupervised machine learning techniques and causal inference mechanisms.
  • Automated the training pipeline and created filters to reduce the 800 features to a minimal set for modelling.
  • The system helped find anomalies in real time and optimized human interaction and the delay in mitigating issues, which in turn decreased server downtime and latency.
  • Tools and technologies used: FbProphet, Python.

Education

B.Tech in Computer Science and Engineering
CGPA: 8.8 out of 10
Courses Taken
  • Data Structures and Algorithms, Object Oriented Programming
  • Data Science, Probability and Statistics, Machine Learning
  • Database Management System, Parallel Computing
  • Operating Systems, Networks, Advanced Mathematics, etc.
Extracurricular Activities
  • As a member of NVIDIA - Boston Innovation Lab, I organized hackathons and technical events. Additionally, I led my final-year project team in several inter- and intra-college hackathons and seminars.

Projects

Alt-text generation using Multi Modal LLM.
Owner

Spearheaded the creation of an advanced multi-modal LLM-based system that generates alt-text for nursing book images to enhance accessibility for visually impaired individuals.

Entity Extraction using Multi Modal & Multi lingual Deep learning
Owner

Developed a multi-modal and multilingual deep learning model using LiLT to extract entities from PDF-formatted scientific research articles. The extracted entities are subsequently added to the Scopus database.

Oncology immunotherapy treatment.
Owner

Developed a machine learning based methodology to predict the time to the next treatment curve for oncology immunotherapy. Additionally, created a dashboard to visualize the results.

Automatic Transaction Verification using NLP.
Owner

Automated the transaction verification process in the truck-based delivery system for a Mexican company to optimize human interaction and increase verification speed.

Fraudulent Transaction detection
Owner

Developed an API that detects fraudulent transactions.

Car Price Prediction
Owner

This web app predicts the price of a car based on user-provided inputs. Click here to see the live demo [the demo may be offline if Heroku’s free cloud quota is exceeded]

Character Recognizer
Owner

This web app recognizes characters drawn on the canvas using deep learning and computer vision techniques. Click here to see the live demo [the demo may be offline if Heroku’s free cloud quota is exceeded]

Extracting the confidential information from the documents.
Contributor

Detects confidential information in documents using an NLP-based NER technique.

Using Machine Learning for Root Cause Analysis of Anomalies
Contributor

Anomaly detection and root cause analysis on data center time series data using unsupervised ML techniques.

Accomplishments

Machine Learning

This course, taught by Andrew Ng, provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include supervised learning, unsupervised learning, and best practices in machine learning. The course also includes numerous case studies and applications of machine learning.

Deep Learning Specialization

This is a specialization in deep learning offered and taught by Stanford lecturers. Click on the links to see the certification and grades. Courses include:

  • Neural Networks and Deep Learning
  • Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
  • Structuring Machine Learning Projects
  • Convolutional Neural Networks
  • Natural Language Processing.

Summer Analytics

This online data science certification course covered Python, machine learning, web scraping, data cleaning, and data analysis. The course ended with a Kaggle competition as a milestone project.

Complete Python Bootcamp

This course covered beginner to advanced topics in Python. Core concepts, data structures, and OOP in Python were covered thoroughly.

Advanced SQL

This course covered basic to advanced SQL with BigQuery.

Achievements

Global Rank 215 and Master Level

Tech Talk

Top 2 Percentage

Top 1 Percentage

Omdena Sensai contribution

Omdena oncology contribution

Omdena Redactable contribution

Recommendations