Academic Research
Migrant & Women Representation in French Startups
Data-driven thesis examining demographic representation in the French startup ecosystem using machine learning and statistical analysis.
Financial models, research reports, and strategic analyses.
Development Economics
Cross-country econometric analysis of mobile broadband adoption and institutional determinants of infrastructure roll-out.
Blockchain Analytics
Econometric analysis of Bitcoin and Ethereum fees, identifying network congestion and DeFi activity as primary drivers.
Machine Learning
End-to-end ML pipeline to detect and segment railway infrastructure from high-resolution satellite imagery.
Systems Programming
High-performance chess engine in C++ using alpha-beta pruning, bitboards, and a custom opening book trained on 1M Lichess games.
Academic Research
Data-driven thesis examining demographic representation in the French startup ecosystem through machine learning and statistical analysis.
Conducted original research on the underrepresentation of women and migrant-origin groups in startup creation and innovation in France. Built a dataset by combining Crunchbase startup data, French census benchmarks, and external datasets, then developed an ensemble classification system that infers founder origins from names using Ethnea, NamePrism, and a custom LSTM-based neural network.
Applied Wilson confidence intervals, chi-squared tests, logistic regression models, and benchmark scenario analysis to measure representation gaps. Findings show substantial underrepresentation across multiple groups, suggesting that inequalities in entrepreneurial access contribute to talent misallocation and reduced economic innovation.
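As an illustration of the interval estimation mentioned above, here is a minimal sketch of a Wilson score interval for a proportion. The function name, the default z value of 1.96 (roughly a 95% level), and the counts in the usage lines are assumptions made for this example; they are not figures or code from the thesis.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of observations in the group of interest
    n:         total number of observations
    z:         normal quantile (1.96 is roughly a 95% level)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clip to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts, for illustration only.
low, high = wilson_interval(successes=120, n=1000)
print(f"share = {120 / 1000:.3f}, interval = [{low:.3f}, {high:.3f}]")
```

Unlike the simple normal approximation, this interval stays inside [0, 1] and behaves sensibly for small shares, which is one reason it is often preferred when comparing a small observed share against a census benchmark.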
Development Economics
Cross-country econometric analysis of mobile broadband adoption, examining institutional and economic determinants of infrastructure roll-out.
Investigated the effects of federalism, decentralization, market competition, GDP per capita, and rural population on mobile broadband subscription rates across countries. Regression analysis identified GDP per capita and decentralization as statistically significant determinants, while federalism showed little explanatory power once controls were included.
The project addressed econometric challenges including multicollinearity, omitted variable bias, skewed distributions, and missing data. Built and tested multiple regression specifications in R, using log transformations and hypothesis testing to evaluate the economic and statistical significance of institutional variables in infrastructure development.
Blockchain Analytics
Econometric analysis of Bitcoin and Ethereum transaction fees, identifying network congestion and DeFi activity as stronger drivers than token prices.
This project explored the determinants of transaction fees in decentralized finance by modelling Bitcoin and Ethereum fee dynamics using econometric methods. For Bitcoin, the USD price was statistically significant but had limited economic significance, while transactions per second emerged as the strongest determinant, highlighting congestion and competition for block space.
For Ethereum, ETH price had only a marginal impact, whereas the value of Uniswap tokens showed both statistical and economic significance, suggesting that DeFi activity contributes to higher fees. The analysis also revealed omitted variable bias in both models, emphasizing the importance of specification when interpreting results. Overall, the findings suggest that blockchain fees are driven more by usage intensity and application-layer demand than by token prices themselves.
Machine Learning
Built an end-to-end machine learning pipeline to detect and segment railway infrastructure from high-resolution satellite imagery.
Developed a custom geospatial dataset by combining railway vector coordinates from OpenStreetMap/Overpass API with high-resolution satellite imagery sourced via Google Earth Engine. Constructed 100 m × 100 m image patches and generated precise segmentation masks for supervised learning.
Designed and fine-tuned both a binary image classifier and a semantic segmentation model using convolutional neural networks, data augmentation, and custom loss functions including Dice Loss and Binary Cross-Entropy. Achieved over 95% classification accuracy and strong F1 performance, demonstrating the feasibility of scalable railway network mapping through satellite-based computer vision.
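The two loss functions named above can be sketched in a few lines. This is a dependency-free illustration on flattened masks, not the project's training code: the function names, the smoothing constants, and the equal weighting in `combined_loss` are assumptions, and a real pipeline would compute the same quantities on tensors in a deep learning framework.

```python
import math
from typing import Sequence


def dice_loss(probs: Sequence[float], targets: Sequence[float], eps: float = 1e-6) -> float:
    """1 - Dice coefficient between predicted probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    # eps keeps the ratio defined when both the prediction and the mask are empty.
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def bce_loss(probs: Sequence[float], targets: Sequence[float], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        losses.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


def combined_loss(probs: Sequence[float], targets: Sequence[float],
                  dice_weight: float = 0.5) -> float:
    """Weighted sum of Dice and BCE; the 0.5 weighting is an assumed default."""
    return dice_weight * dice_loss(probs, targets) + (1.0 - dice_weight) * bce_loss(probs, targets)


# Flattened 2x2 mask with two "railway" pixels, and a fairly confident prediction.
mask = [0.0, 1.0, 1.0, 0.0]
prediction = [0.1, 0.8, 0.9, 0.2]
print(combined_loss(prediction, mask))
```

Dice loss measures overlap and is less sensitive to class imbalance, which matters when railway pixels make up a small fraction of each patch, while cross-entropy gives a per-pixel signal; combining them is a common way to get both properties.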
Systems Programming
A high-performance chess engine built in C++ using modern search algorithms, optimized board representations, and custom evaluation heuristics.
Built collaboratively for CSE201, the engine, KEAN, uses bitboards for efficient board representation, Zobrist hashing for fast position lookup, and a position stack to detect threefold repetition. The search implements alpha-beta pruning with iterative deepening, aspiration windows, principal variation search, null move pruning, and quiescence search to improve both search speed and move quality.
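To show how Zobrist hashing and a position stack work together for repetition detection, here is a simplified sketch. It is written in Python for brevity, whereas the engine itself is C++, and every name in it is hypothetical. It also leaves out castling rights and en passant, which a complete Zobrist key would include.

```python
import random

PIECES = "PNBRQKpnbrqk"
_rng = random.Random(2024)  # fixed seed so keys are reproducible across runs

# One random 64-bit key per (piece, square), plus one key for the side to move.
ZOBRIST = {(piece, sq): _rng.getrandbits(64) for piece in PIECES for sq in range(64)}
SIDE_TO_MOVE = _rng.getrandbits(64)


def hash_position(board: dict[int, str], white_to_move: bool) -> int:
    """Hash a position from scratch. board maps square index (a1 = 0) to a piece letter."""
    h = 0
    for sq, piece in board.items():
        h ^= ZOBRIST[(piece, sq)]
    if not white_to_move:
        h ^= SIDE_TO_MOVE
    return h


def apply_quiet_move(h: int, piece: str, from_sq: int, to_sq: int) -> int:
    """Incrementally update the hash for a non-capturing move."""
    h ^= ZOBRIST[(piece, from_sq)]  # remove the piece from its old square
    h ^= ZOBRIST[(piece, to_sq)]    # place it on the new square
    h ^= SIDE_TO_MOVE               # flip the side to move
    return h


def is_threefold(history: list[int], h: int) -> bool:
    """True if hash h occurs at least three times in the stack of past positions."""
    return history.count(h) >= 3


# Kings on e1/e8 and a white knight on g1; the knight and black king shuffle back and forth.
start = hash_position({4: "K", 60: "k", 6: "N"}, white_to_move=True)
history = [start]
h = start
for _ in range(2):
    for piece, frm, to in [("N", 6, 21), ("k", 60, 59), ("N", 21, 6), ("k", 59, 60)]:
        h = apply_quiet_move(h, piece, frm, to)
        history.append(h)
print(is_threefold(history, h))
```

Because XOR is its own inverse, making or unmaking a move costs a handful of XORs rather than a full rehash, and the same key can index a transposition table.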
The evaluation function combines piece-square tables, king safety, pawn structure analysis (isolated, doubled, passed, and backward pawns), mobility scoring, and positional heuristics. Additional features include a custom opening book trained on ~1 million high-level Lichess games, FEN import/export, UCI-compatible move parsing, and a transposition table for efficient state caching.
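As a rough illustration of how pawn structure terms can be read off a bitboard, the sketch below finds isolated pawns and counts doubled pawns with file masks. It uses Python integers as 64-bit boards (bit 0 = a1, bit 63 = h8), and the function names are hypothetical. The engine is written in C++ and may compute these terms differently.

```python
FILE_A = 0x0101010101010101  # bits for a1, a2, ..., a8


def file_mask(file_index: int) -> int:
    """Bitboard of all squares on a file (0 = a-file, 7 = h-file)."""
    return FILE_A << file_index


def isolated_pawns(pawns: int) -> int:
    """Bitboard of pawns that have no friendly pawn on either adjacent file."""
    result = 0
    for f in range(8):
        on_file = pawns & file_mask(f)
        if not on_file:
            continue
        neighbours = 0
        if f > 0:
            neighbours |= file_mask(f - 1)
        if f < 7:
            neighbours |= file_mask(f + 1)
        if not pawns & neighbours:
            result |= on_file
    return result


def doubled_pawn_count(pawns: int) -> int:
    """Number of extra pawns stacked on files that already hold a pawn."""
    total = 0
    for f in range(8):
        count = bin(pawns & file_mask(f)).count("1")
        total += max(0, count - 1)
    return total


# White pawns on a2, a3, c2, d2: the a-pawns are doubled and isolated.
pawns = (1 << 8) | (1 << 16) | (1 << 10) | (1 << 11)
print(bin(isolated_pawns(pawns)), doubled_pawn_count(pawns))
```

An evaluation function would typically turn these counts into small penalties per pawn; the size of those penalties is a tuning choice and is not specified here.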