About
🔍 Turning data into strategic decisions
Welcome! I’m Ilyass El Fourati, a passionate Data Scientist and engineer dedicated to uncovering insights and delivering impactful solutions through data. With a strong academic background in Data Science (École Centrale Casablanca, Lille, and Marseille), I merge advanced technical expertise with strategic thinking to solve complex business challenges.
💡 What I do:
- Build predictive models to optimize decision-making and forecast trends.
- Automate data pipelines to ensure efficiency and reliability in Big Data workflows.
- Leverage advanced technologies: Generative AI (LLM, RAG), NLP, Computer Vision, and Cloud Computing (Azure, GCP).
- Craft actionable insights through intuitive dashboards and clear data visualizations.
- Streamline processes by designing tailored ETL pipelines for seamless data integration.
- Unlock the value of unstructured data with semantic analysis (NLP) and visual recognition (Computer Vision).
- Empower businesses by automating repetitive and time-intensive tasks.
🌱 What drives me: Innovation, continuous learning, and creating meaningful impact through data.
Experience -
Data Scientist Apr 2024 - Oct 2024
Santarelli Group - Internship
- Designed an application based on large language models (LLMs), specialized in intellectual property, to assist patent engineers.
- Utilized retrieval-augmented generation (RAG) to leverage the company's internal documents, minimizing hallucinations and facilitating the processing of large documents.
- Performed optical character recognition (OCR) on written opinions (official letters from patent offices) to prepare documents for analysis by the LLM.
- Extracted patent or scientific article references from OCR documents and automated their download using web scraping.
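As an illustration of the reference-extraction step, here is a minimal sketch. It assumes references follow an office code + number + optional kind code layout (for example `EP1234567A1`). The pattern, the list of office codes, and the function name `extract_patent_refs` are hypothetical and are not taken from the actual application.

```python
import re

# Hypothetical pattern: office code, then either a WO-style "YYYY/NNNNNN"
# number or 6 to 12 digits, then an optional kind code such as A1 or B2.
PATENT_REF = re.compile(
    r"\b(EP|US|WO|FR)\s?(\d{4}/\d{6}|\d{6,12})\s?([A-C]\d?)?\b"
)

def extract_patent_refs(text: str) -> list[str]:
    """Return de-duplicated patent references in order of first appearance."""
    refs = []
    for match in PATENT_REF.finditer(text):
        office, number, kind = match.group(1), match.group(2), match.group(3) or ""
        ref = f"{office}{number}{kind}"  # normalize by dropping OCR spacing
        if ref not in refs:
            refs.append(ref)
    return refs

# Invented sample text standing in for OCR output.
sample = "D1: EP 1234567 A1 and D2: US2019012345A1; see also EP1234567A1."
print(extract_patent_refs(sample))
```

Normalizing each match before de-duplication means the same document cited with different spacing is downloaded only once.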
Data Scientist & Engineer Mar 2023 - Sept 2023
Forvia - Internship
- Collect, analyze, and visualize data from internal software (ETL).
- Implement data storage in a PostgreSQL database.
- Design dynamic dashboards on Foundry (Palantir) to compare data from various sources.
- Optimize an internal search engine by integrating semantics.
- Implement summarization models (NLP) to shorten texts in documents.
🛠️ Python: Plotly, NLP, Pandas, Transformers, Gensim, HuggingFace, rank_bm25, NLTK, Regex.
🛠️ Foundry (Palantir), MS Azure
🛠️ PostgreSQL
Consultant Data Scientist | Computer Vision Dec 2022 - Mar 2023
NGE - Freelance
Reducing CO2 emissions linked to concrete use, based on purchase invoices:
- Collection and cleaning of invoices.
- Extraction of precise data from scanned images using YOLOv7 and DocTR (OCR)
- Detection and counting of people, enabling accurate footfall analysis and facilitating crowd control measures.
- Development of a web application in Python with Dash, allowing for the upload of invoices and the automated generation of an Excel file containing the extracted data.
🛠️ Python: NumPy, Dash, Pandas, Scikit-learn, YOLOv7, TensorFlow, OCR, NER, NLP, Streamlit
Consultant Data Engineer Oct 2022 - Dec 2022
Groupe ADF - Freelance
The aim of this project is to clean and standardize maintenance data collected from various sources, structure and store it in a database, apply machine learning (NLP) to the textual information, and visualize and highlight the stored data in Power BI.
- Creation of a generalized pipeline to clean, standardize, and store CMMS (SAP) data from various companies using Pandas (Python) and MySQL.
- Extraction of keywords from textual data using TF-IDF, RAKE, and TextRank.
- Detection of themes using Topic Modeling algorithms (BERTopic, LDA, and NMF).
- Comparison of models used for keyword and theme extraction using specific Natural Language Processing (NLP) metrics such as coherence.
- Improvement of data accessibility and readability through Power BI visualizations such as Pareto charts, NLP algorithm results, and failure reports.
🛠️ Python: NumPy, NLP, Pandas, Scikit-learn, Gensim.
🛠️ Power BI: Power Query, Dashboards/Reports, DAX.
🛠️ MySQL
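To illustrate the TF-IDF keyword-extraction idea mentioned above, here is a minimal standard-library sketch. It is not the project's code: the function name `tfidf_keywords`, the tokenization rule, and the sample maintenance records are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    # Simplistic tokenization (assumption): lowercase alphabetic words only.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_k])
    return results

# Invented maintenance records standing in for CMMS text fields.
records = ["pump leak pump seal", "motor bearing noise", "pump motor replaced"]
print(tfidf_keywords(records, top_k=2))
```

Terms that appear in many records (here "pump" and "motor") are down-weighted, so the terms specific to each record rise to the top.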
Academic Projects -
RAG for Access to Scientific Articles Feb 2024 - Mar 2024
Ecole Centrale Casablanca & AxIA
- Objective: Facilitate LLMs' access to scientific articles for advanced interaction.
- Benchmarking of LLMs (Gemini, GPT, T5) and vector databases (Chroma, Vectorstore Index).
- PDF parsing and use of the LangChain and LlamaIndex libraries.
- Results: Improved integration of LLMs with scientific databases.
- Directed by: Yan LeCun
Rooftop Heat and Energy Analysis Sep 2023 - Feb 2024
Ecole Centrale Casablanca
- Objective: Map flat roofs with maximum solar heat absorption potential in Casablanca and evaluate the energy and environmental implications.
- Extracted high-resolution satellite images using QGIS.
- Labeled roofs and their colors.
- Customized YOLOv7 for roof and color detection.
- Conducted energy analysis of roofs, including energy consumption for cooling and internal heat levels.
- Compared results with solutions like Low-E glazing and reflective paint.
- Results: Identified solutions to improve energy efficiency and reduce carbon emissions.
Poster of the project
Data Science Project Sept 2022 - Oct 2022
Ecole Centrale Marseille - Academic Project
In-depth study of the impact of COVID on the global economy:
- Data collection, cleaning, analysis and visualisation.
- Creation of a Dashboard (Dash/Plotly) to present the results.
Scientific Research Project Oct 2021 - Jan 2022
Ecole Centrale Casablanca - Academic Project
- Collection of data from different satellites.
- Evaluation of remote sensing data for predicting extreme rainfall over Morocco.
Education -
Data Science Sept 2022 - Sept 2023
DigitalLab, Ecole Centrale Marseille France, Nice
International exchange Feb 2022 - Jul 2022
Ecole Centrale Lille France, Lille
Data Science and digitalization Sept 2020 - Oct 2024
Ecole Centrale Casablanca Morocco, Casablanca
Mathematics & Physics Engineering Science Sept 2018 - Apr 2020
Preparatory Classes Ibn Timiya Morocco, Marrakech
Certificates -
Python for Data Science, AI & Development
Coursera
SQL & BigQuery, Pandas, Data Cleaning, Intermediate Machine Learning, and Intro to Deep Learning
Kaggle
Java Basics: Selection and Iteration
Coursera