About

🔍 Turning data into strategic decisions
Welcome! I’m Ilyass El Fourati, a passionate Data Scientist and engineer dedicated to uncovering insights and delivering impactful solutions through data. With a strong academic background in Data Science (École Centrale Casablanca, Lille, and Marseille), I merge advanced technical expertise with strategic thinking to solve complex business challenges.

💡 What I do:

  • Build predictive models to optimize decision-making and forecast trends.
  • Automate data pipelines to ensure efficiency and reliability in Big Data workflows.
  • Leverage advanced technologies: Generative AI (LLM, RAG), NLP, Computer Vision, and Cloud Computing (Azure, GCP).
🌟 What sets me apart:
  • Crafting actionable insights through intuitive dashboards and clear data visualizations.
  • Streamlining processes by designing tailored ETL pipelines for seamless data integration.
  • Unlocking the value of unstructured data with semantic analysis (NLP) and visual recognition (Computer Vision).
  • Empowering businesses by automating repetitive and time-intensive tasks.
🎯 My mission? To collaborate with innovative teams and tackle challenges in big data management, applied AI, and process optimization.

🌱 What drives me: Innovation, continuous learning, and creating meaningful impact through data.

Experience -

Data Scientist Apr 2024 - Oct 2024
Santarelli Group
- Internship

  • Designed an application based on large language models (LLMs), specialized in intellectual property, to assist patent engineers.
  • Utilized retrieval-augmented generation (RAG) to leverage the company's internal documents, minimizing hallucinations and facilitating the processing of large documents.
  • Performed optical character recognition (OCR) on written opinions (official letters from patent offices) to prepare documents for analysis by the LLM.
  • Extracted patent or scientific article references from OCR documents and automated their download using web scraping.
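
For illustration only, the reference-extraction step above could be sketched as follows. This is not the code used in the project: the regular expressions, the supported country codes, the requirement of a kind code (A1, B1, ...), and the function name `extract_references` are all assumptions made for this example.

```python
import re

# Hypothetical pattern: country code, 6 to 11 digits possibly split by
# spaces, slashes or commas (common in OCR output), then a kind code.
PATENT_RE = re.compile(r"\b(EP|US|WO|FR)[\s-]?((?:\d[\s/,]?){6,11})\s?([AB][1-9])\b")

# Simplified DOI pattern (assumption: prefix "10." followed by a registrant
# code, a slash, and a suffix made of common DOI characters).
DOI_RE = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def extract_references(text):
    """Return normalized patent numbers and DOIs found in OCR text."""
    patents = []
    for match in PATENT_RE.finditer(text):
        country, number, kind = match.groups()
        # Drop every non-digit so "1 234 567" becomes "1234567".
        patents.append(country + re.sub(r"\D", "", number) + kind)
    # Strip sentence punctuation that the DOI pattern may have swallowed.
    dois = [m.group(0).rstrip(".,;)") for m in DOI_RE.finditer(text)]
    return {"patents": patents, "dois": dois}


sample = "D1: EP 1 234 567 A1; D2: US 2015/0123456 A1. See doi:10.1000/xyz123."
print(extract_references(sample))
```

The normalized identifiers could then be passed to a download step; the scraping itself depends on the target site and is left out here.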

Data Scientist & Engineer Mar 2023 - Sept 2023
Forvia
- Internship

  • Collected, analyzed, and visualized data from internal software (ETL).
  • Implemented data storage in a PostgreSQL database.
  • Designed dynamic dashboards on Foundry (Palantir) to compare data from various sources.
  • Optimized an internal search engine by integrating semantic search.
  • Implemented summarization models (NLP) to shorten texts in documents.
Libraries & techniques:
🛠️Python: Plotly, NLP, Pandas, Transformers, Gensim, Hugging Face, rank_bm25, NLTK, regex.
🛠️Foundry (Palantir), MS Azure
🛠️PostgreSQL

Consultant Data Scientist | Computer Vision Dec 2022 - Mar 2023
NGE
- Freelance

Reducing CO2 emissions from concrete use, using data from purchase invoices:

  • Collection and cleaning of invoices.
  • Extraction of precise data from scanned images using YOLOv7 and DocTR (OCR)
  • Detection and counting of people, enabling accurate footfall analysis and facilitating crowd control measures.
  • Development of a web application in Python with Dash, allowing users to upload invoices and automatically generate an Excel file containing the extracted data.
Libraries & techniques:
🛠️Python: NumPy, Dash, Pandas, Scikit-learn, YOLOv7, TensorFlow, OCR, NER, NLP, Streamlit

Consultant Data Engineer Oct 2022 - Dec 2022
Groupe ADF
- Freelance

The aim of this project is to clean and standardize maintenance data collected from various sources, structure and store it in a database, apply machine learning (NLP) to the textual information, and visualize and highlight the stored data in Power BI.

  • Creation of a generalized pipeline to clean, standardize, and store CMMS (SAP) data from various companies using Pandas (Python) and MySQL.
  • Extraction of keywords from textual data using TF-IDF, RAKE, and TextRank.
  • Detection of themes using Topic Modeling algorithms (BERTopic, LDA, and NMF).
  • Comparison of models used for keyword and theme extraction using specific Natural Language Processing (NLP) metrics such as coherence.
  • Enhancing data accessibility and readability by creating visualizations such as Pareto charts, NLP algorithm results, failure reports, etc., in Power BI.
Libraries & techniques:
🛠️ Python: NumPy, NLP, Pandas, Scikit-learn, Gensim.
🛠️Power BI: Power Query, Dashboards/Reports, DAX.
🛠️MySQL
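
As a rough illustration of the TF-IDF keyword extraction mentioned above, here is a minimal standard-library sketch. It is not the project's code: the tokenizer, the smoothed IDF formula, the function name `tfidf_keywords`, and the sample maintenance notes are assumptions chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    keywords = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        scores = {
            # Smoothed IDF (assumption): log((1 + N) / (1 + df)) + 1.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords.append(ranked[:top_k])
    return keywords


# Hypothetical maintenance notes, made up for the example.
notes = [
    "pump leak detected on pump seal",
    "motor overheating motor bearing worn",
    "pump motor vibration alarm",
]
for note, words in zip(notes, tfidf_keywords(notes, top_k=2)):
    print(note, "->", words)
```

In practice a library implementation such as Scikit-learn's `TfidfVectorizer` would usually replace this hand-rolled version; the sketch only makes the scoring explicit.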



Academic Project -

RAG for Access to Scientific Articles Feb 2024 - Mar 2024
Ecole Centrale Casablanca & AxIA

  • Objective: Facilitate LLMs' access to scientific articles for advanced interaction.
  • Benchmarking of LLMs (Gemini, GPT, T5) and vector databases (Chroma, Vectorstore Index).
  • PDF parsing and use of Langchain and LlamaIndex libraries.
  • Results: Improved integration of LLMs with scientific databases.
  • Directed by: Yan LeCun
Poster of the project

Rooftop Heat and Energy Analysis Sep 2023 - Feb 2024
Ecole Centrale Casablanca

  • Objective: Map flat roofs with maximum solar heat absorption potential in Casablanca and evaluate the energy and environmental implications.
  • Extracted high-resolution satellite images using QGIS.
  • Labeled roofs and their colors.
  • Customized YOLOv7 for roof and color detection.
  • Conducted energy analysis of roofs, including energy consumption for cooling and internal heat levels.
  • Compared results with solutions like Low-E glazing and reflective paint.
  • Results: Identified solutions to improve energy efficiency and reduce carbon emissions.
Technologies: QGIS, YOLOv7, Dash, HTML, CSS.
Poster of the project

Data Science Project Sept 2022 - Oct 2022
Ecole Centrale Marseille
- Academic Project

In-depth study of the impact of COVID on the global economy: