Jan Krepl
Verified Expert in Engineering
Machine Learning Engineer and Developer
Jan is a machine learning engineer passionate about software engineering, machine learning, leadership, and online education. He has extensive professional experience applying computer vision, natural language processing, and time series analysis in academic and business settings. Jan also dedicates much of his free time to contributing to open-source software and educational content creation.
Preferred Environment
Python, Machine Learning, Notion
The most amazing thing I've developed is a question-answering tool that extracts knowledge from scientific papers.
Work Experience
Machine Learning Section Manager | Blue Brain Project
The EPFL
- Designed a literature search system focused on semantic search, question answering, named entity recognition, and entity linking, built on top of recent large language models. The entire system was deployed at scale with Kubernetes and AWS.
- Managed a team of four experienced machine learning engineers.
- Acted as a lead developer enforcing best practices.
Machine Learning Engineer | Blue Brain Project
The EPFL
- Conceived and implemented a supervised algorithm for 2D brain slice image registration that became a part of internal workflows.
- Developed a knowledge extraction pipeline for scientific articles with main functionalities such as parsing, neural search, and named entity recognition.
- Engaged directly in various neuroscientific projects, including neuron-type classification with graph neural networks and morphology image synthesis with generative adversarial networks.
Data Scientist
Nectar Financial
- Enhanced internal portfolio optimization algorithms with return forecasting using supervised learning techniques. Added custom constraints and objective functions, making the tool more flexible.
- Applied text embedding algorithms, such as Doc2Vec and TF-IDF, on hedge fund fact sheets and reports. In turn, these embeddings were used for clustering, which allowed for better diversification.
- Developed a custom back-testing framework considering various hedge-fund-specific constraints like lock-ups.
Quantitative Risk Analyst
UBS
- Maintained the Lombard lending section's stress-testing codebase that used Visual Basic, SQL, and SAS.
- Generated regular risk reports used as inputs for other departments.
- Supported senior analysts in creating custom risk models.
Experience
Mildlyoverfitted | Educational Videos
http://www.youtube.com/@mildlyoverfitted/
DeepDow | Portfolio Optimization with Deep Learning
http://github.com/jankrepl/deepdow/
• Forecasting the market's future evolution with models such as long short-term memory (LSTM) networks and generalized autoregressive conditional heteroskedasticity (GARCH) models.
• Providing optimization problem designs and solutions, such as convex optimization.
DeepDow combines these two steps by constructing a pipeline of layers. The last layer performs the allocation, and all the previous ones serve as feature extractors. The overall network is fully differentiable, so its parameters can be optimized with gradient descent algorithms.
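The idea of a differentiable allocation pipeline can be sketched in a few lines. The example below is an illustration only and does not use DeepDow's actual API: every function name, the synthetic return data, the learning rate, and the choice of a softmax allocation layer are assumptions made for this sketch. It is written in pure Python with a hand-derived gradient standing in for the automatic differentiation a deep learning framework would provide.

```python
import math


def softmax(scores):
    """Allocation layer: map raw scores to long-only weights that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_mean(window):
    """Feature extractor: per-asset mean return over the lookback window."""
    n_assets = len(window[0])
    return [sum(row[i] for row in window) / len(window) for i in range(n_assets)]


def forward(params, window):
    """Full pipeline: features -> linear scores -> softmax allocation."""
    feats = window_mean(window)
    scores = [params["a"] * f + b for f, b in zip(feats, params["b"])]
    return feats, softmax(scores)


def loss_and_grads(params, samples):
    """Mean negative portfolio return and its gradient w.r.t. a and b."""
    n_assets = len(params["b"])
    loss, grad_a, grad_b = 0.0, 0.0, [0.0] * n_assets
    for window, target in samples:
        feats, w = forward(params, window)
        port = sum(wi * yi for wi, yi in zip(w, target))
        loss -= port
        for j in range(n_assets):
            # Derivative of -port w.r.t. score_j, propagated through the softmax.
            g = -w[j] * (target[j] - port)
            grad_a += g * feats[j]
            grad_b[j] += g
    n = len(samples)
    return loss / n, grad_a / n, [g / n for g in grad_b]


def train(samples, n_assets, lr=5.0, steps=300):
    """Plain full-batch gradient descent on the pipeline parameters."""
    params = {"a": 0.0, "b": [0.0] * n_assets}
    for _ in range(steps):
        _, grad_a, grad_b = loss_and_grads(params, samples)
        params["a"] -= lr * grad_a
        params["b"] = [b - lr * g for b, g in zip(params["b"], grad_b)]
    return params


def make_samples(lookback=3, n_periods=12):
    """Synthetic returns (hypothetical data): asset 0 is best on average."""
    base = [0.02, 0.0, -0.01]
    returns = [
        [base[i] + 0.005 * ((t + i) % 3 - 1) for i in range(3)]
        for t in range(n_periods)
    ]
    return [
        (returns[t:t + lookback], returns[t + lookback])
        for t in range(n_periods - lookback)
    ]
```

Calling `train(make_samples(), n_assets=3)` and then `forward(params, window)` on any lookback window returns the learned weights. Because the loss is computed on the allocation itself, the feature extractor and the allocation layer are tuned jointly rather than in two separate steps.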
MLtype | Command Line Tool
http://github.com/jankrepl/mltype/
Atlas Alignment | Multimodal Registration and Alignment
http://github.com/BlueBrain/atlas-alignment/
PyChubby | Automated Face-warping Tool
http://github.com/jankrepl/pychubby/
Skillset
Languages
Python, SQL, SAS, Excel VBA, JavaScript
Libraries/APIs
PyTorch, Scikit-learn, NumPy, Keras, SciPy, Pandas, Matplotlib, REST APIs, TensorFlow, Asyncio, Python Asyncio, JAX, OpenCV, SpaCy, React
Tools
Vim Text Editor, Git, GitLab CI/CD, Pytest, TensorBoard, GitLab, GitHub, ChatGPT, Notion, Amazon SageMaker, Cloud Dataflow, Google Compute Engine (GCE), AWS Glue, Terraform, Inkscape, Apache Airflow, Adobe Premiere Pro, Seaborn, Gensim, StatsModels, Scikit-image, Google Kubernetes Engine (GKE)
Paradigms
Unit Testing, Data Science, Test-driven Development (TDD), REST, Scrum, Agile Software Development
Platforms
Kubernetes, Docker, Amazon Web Services (AWS), Jupyter Notebook, Vertex AI, Amazon EC2, Google Cloud Platform (GCP), AWS Lambda
Storage
Elasticsearch, Google Cloud Storage, Amazon S3 (AWS S3), Redis, Redis Cache, NoSQL, Neo4j, MySQL, PostgreSQL
Other
Probability Theory, Mathematical Analysis, Linear Algebra, Statistics, Machine Learning, Portfolio Optimization, Orchestration, Machine Learning Operations (MLOps), Shell Scripting, Generative Pre-trained Transformers (GPT), BERT, Hugging Face Transformers, Sphinx, Natural Language Processing (NLP), Finance, Computer Vision, OpenAI GPT-4 API, Artificial Intelligence (AI), Hugging Face, APIs, Regular Expressions, Natural Language Understanding (NLU), GPT, Algorithms, Back-end Development, Language Models, Pub/Sub, Large Language Models (LLMs), Technical Leadership, Leadership, Retrieval-augmented Generation (RAG), OpenAI, Back-end, Containerization, Serverless, SDKs, Pinecone, Optimization, Microeconomics, Macroeconomics, Mathematical Finance, Quantitative Risk Analysis, Numerical Methods, MLflow, FastAPI, LangChain, Time Series Analysis, Product Consultant, Measure Theory, Econometrics, Private Company Valuation, Deep Learning, Scrum Master, CI/CD Pipelines, Online Course Design, Recurrent Neural Networks (RNNs), Open Source, Image Registration, Data Versioning, Google BigQuery, Full-stack Development
Frameworks
Apache Spark
Education
Master's Degree in Quantitative Finance
ETH Zurich - Zurich, Switzerland
Bachelor's Degree in Economics
Charles University - Prague, Czechia
Certifications
HashiCorp Certified: Terraform Associate (003)
HashiCorp
AWS Certified Solutions Architect - Associate
Amazon Web Services
Google Cloud Certified Professional Machine Learning Engineer
Google Cloud
AWS Certified Machine Learning - Specialty
Amazon Web Services
Databricks Certified Associate Developer for Apache Spark 3.0
Databricks Inc.
AWS Certified Cloud Practitioner
Amazon Web Services
CKAD: Certified Kubernetes Application Developer
The Linux Foundation
Professional Scrum Master (PSM I)
Scrum.org
CFA Level I (Passed)
CFA Institute